www.biorxiv.org www.biorxiv.org
-
Author Response:
We are proceeding without revisions as the first author has chosen to withdraw from the project and will not be contributing further.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Recommendations For The Authors):
I can find no problems with the experiments performed in this study, but there are several results that are not easily explained. I would like to see more consideration of possible explanations. For example, one of the major differences between the CESA structures from primary and secondary cell walls is the displacement of TM7 in the primary cell wall CESAs, which leads to the formation of a lipid-exposed channel. Why does this vary between primary and secondary cell wall CESA proteins? Could it explain differences in properties, such as crystallinity, between primary and secondary cell wall cellulose?
At this time, the different position of TM helix 7 observed in our GmCesA structures is just an observation. We have some emerging evidence that this helix is also flexible in POCesA8 under certain conditions; however, we do not know whether this affects catalytic activity or cellulose coalescence. We have revised the text to avoid the interpretation that TM 7 repositioning is a characteristic feature of primary cell wall CesAs only.
Similarly, regarding the formation of the larger structures from mixtures of different CESA trimers. Why do they not form rosettes? Particularly as these appear to be forming 2-dimensional structures.
We have included additional data on the interaction between different CesA isoform trimers (Figure 6). To answer the reviewer’s question, the most likely reasons for not observing closely packed rosette-like structures are (a) steric interferences between the micelles harboring the individual CesA trimers, and (b) the lack of a stabilizing cellulose fiber. This interpretation is supported by 2D class averages of dimers of CesA1 and CesA3 trimers (now shown in Fig. 6). The class averages show an ‘upside-down and side-by-side’ orientation of the two trimers, consistent with interferences between the solubilizing detergent micelles. The implications of this non-physiological arrangement are discussed in the revised manuscript. In a biological membrane, the CesA trimers are confined to the same plane in the same orientation, which is likely necessary to form ordered arrangements.
What role does the NTD play in trimer formation given its apparent very high class specificity?
We have no data suggesting any contribution of the NTD to trimer formation. Recent work on moss CesA5 and similar AlphaFold predictions suggest that, for some CesAs, an extreme N-terminal region can interact with the beta sheet of the catalytic domain via beta-strand augmentation. Whether this interaction can contribute to CesA-CesA interactions remains unknown.
Reviewer #2 (Recommendations For The Authors):
The authors provide PDB codes but not EMDB codes for the EM maps. I would also encourage the authors to upload the raw micrographs to the EMPIAR database.
The EMDB codes are shown in Table 1 and data transfer to EMPIAR is ongoing.
Page 6 line 144, the statement "All CesA isoforms show greatest catalytic activity at neutral pH" seems to contradict the data in Figure 1e and the subsequent statements. This sentence should be removed.
The text has been revised to indicate that CesA1 and CesA6 show highest activity under mild alkaline conditions.
Page 6, line 150, the authors state "The affinities for substrate binding range from 1.4 mM for CesA1 to 0.6 and 2.4 mM for CesA3 and CesA6, respectively." How were the affinities determined? Is this the affinities or the Michaelis constants? Is it known whether CesAs are rapid equilibrium enzymes? This should be clarified.
The text now states that we performed Michaelis-Menten kinetics using the ‘UDP-Glo’ glycosyltransferase assay kit. We are uncertain about whether CesAs can be classified as rapid equilibrium enzymes. The rate-limiting step of cellulose biosynthesis has been proposed to be glycosyl transfer, rather than cellulose translocation. To avoid any confusion, we changed the text from '…reveals Michaelis Menten constants for substrate binding of CesA1 and CesA3' to '…reveals Michaelis Menten constants for CesA1 and CesA3 with respect to UDP-Glc'.
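The distinction between a Michaelis constant and a binding affinity can be made concrete with a short sketch. It evaluates the standard Michaelis-Menten rate law using the constants quoted in the reviewer's comment (1.4, 0.6 and 2.4 mM); the function name `mm_rate` and the normalization of Vmax to 1 are assumptions made for illustration only and do not reproduce the authors' analysis.

```python
def mm_rate(s_mM: float, vmax: float, km_mM: float) -> float:
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s_mM / (km_mM + s_mM)


# Km values quoted in the review for UDP-Glc (mM); Vmax = 1 is an assumed normalization.
KM_MM = {"CesA1": 1.4, "CesA3": 0.6, "CesA6": 2.4}

for isoform, km in KM_MM.items():
    # By definition, the rate at [S] = Km is half of Vmax.
    print(isoform, mm_rate(km, 1.0, km))
```

Km is the substrate concentration at half-maximal rate. It equals a dissociation constant only under rapid-equilibrium conditions, which is why the revised wording avoids the term "affinity".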
Page 6, line 153, the authors state "CesA1's apparent Ki for UDP is roughly 0.8 mM, whereas this concentration is increased to about 1.2 to 1.5 mM for CesA6 and CesA3, respectively." From the Figure 1g legend, it appears that the authors performed additional experiments at different UDP-Glc concentrations in order to determine Ki that are not shown. This data should be included as a figure supplement as the data presented are insufficient to determine Ki (only IC50).
The UDP inhibition data show apparent IC50 values, and this has been corrected in the text. For each CesA isoform, the titration was done at one UDP-Glc concentration only.
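Why a titration at a single UDP-Glc concentration yields an IC50 rather than a Ki can be sketched with the Cheng-Prusoff relation. The sketch assumes purely competitive inhibition, which the response does not state; the function name and all numerical inputs are hypothetical.

```python
def cheng_prusoff_ki(ic50_mM: float, s_mM: float, km_mM: float) -> float:
    """Ki = IC50 / (1 + [S]/Km); valid only for competitive inhibition (assumed here)."""
    return ic50_mM / (1.0 + s_mM / km_mM)


# Hypothetical inputs: the same measured IC50 implies a different Ki
# depending on the substrate concentration used in the titration.
for s in (0.5, 1.0, 2.0):
    print(s, cheng_prusoff_ki(ic50_mM=1.0, s_mM=s, km_mM=1.0))
```

Because the conversion depends on both the substrate concentration and the inhibition mechanism, an IC50 measured at one UDP-Glc concentration is a condition-specific quantity.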
Page 8, line 202, the authors state that TM helix 7 of the primary cell wall CesAs is more flexible "as evidenced by weaker density." The density for the TM helix 7 should be shown. If the density shown in Supplementary Figure 3 corresponds to TM helices the number of the helices should be indicated as it is not immediately obvious from the amino acid residue numbers.
The densities for TM helix 7 of all CesA isoforms are shown in Supplemental Figure 3. The helices are now labeled to orient the reader.
Reviewer #2 (Public Review)
The authors demonstrate via truncation that the N-terminus of the CesA is not involved in the interactions between the isoforms and propose that the CSR hook-like extensions are the primary mediator of trimer-trimer interactions. This argument would be strengthened by equivalent truncation experiments in which the CSR region is removed.
We performed the suggested experiment. We replaced the CSR in N-terminally truncated GmCesA1 and GmCesA3 with a 20-residue long linker. The resulting constructs assemble into homotrimeric complexes as observed for the wild type and only N-terminally truncated versions. However, the CSR-truncated constructs of the different isoforms do not interact with each other in vitro. Further, CSR-deleted GmCesA3 also does not interact with full-length CesA1, suggesting that two CSR domains of different isoforms are necessary for homotrimer interaction. This data is now shown as Fig. 5.
Reviewer #3 (Recommendations For The Authors):
Major Points
(1) The authors state on Line 354 that they were unable to isolate heterotrimers, but they need to provide the data to support this claim; for example, it is important for readers to understand whether co-expression of all three CESAs leads to only homotrimers or only monomers. This information is essential to exclude model C in Figure 6.
We have revised the corresponding discussion and toned down the statement that heterotrimeric complexes did not form in our recombinant expression system. Co-expression of differently tagged secondary or primary cell wall CesAs in Sf9 cells has consistently resulted in negligible amounts of material that can be purified sequentially over different affinity matrices (corresponding to the tags on the recombinantly expressed CesAs – His, Strep, Flag). While this does not exclude the formation of a small fraction of hetero-oligomeric complexes (which could be trimers as observed in the structures or monomers interacting via their CSR regions), it demonstrates that CesAs favor the same isoform for trimer formation, rather than partnering with other isoforms. An example of such a purification is now shown as Supplemental Figure 8.
Determining whether heterotrimers are formed upon co-expression of different CesA isoforms requires high resolution structural analysis because co-purification of different isoforms can also be due to interactions between different homo-trimeric complexes, as demonstrated in this study.
While we cannot exclude that factors exist in planta that may prevent the formation of homotrimers and favor the formation of hetero-trimers, it is important to keep in mind that currently no experimental data supports the formation of hetero-trimeric complexes. Instead, our work demonstrates that existing data on CesA isoform interactions can be explained by the interaction of homotrimers of different isoforms.
(2) The evidence that the products of GmCESA1, GmCESA3, and GmCESA6 homotrimers are cellulose is that they consume UDP-glucose and produce a beta-glucanase-sensitive product. Other beta-glucans synthesized by similar GT2 family proteins (e.g. CSLDs, Yang et al., 2020 Plant Cell or CSLCs, Kim et al., 2020 PNAS) would be sensitive to this enzyme, and the product cannot truly be called cellulose unless it forms microfibrils. Previous reports of CESA activity in vitro have demonstrated that the products form genuine cellulose microfibrils rather than amorphous beta-glucan (via electron microscopy); extensively documented that the product is sensitive to beta-glucanase, but not other enzymes (e.g., callose or MLG degrading enzymes); provided linkage analysis of the product to conclusively demonstrate that it is a beta-1,4-linked glucan; and documented a loss of activity when key catalytic residues were mutated (Purushotham et al., 2016 PNAS; Cho et al., 2017 Plant Phys; Purushotham et al., 2020 Science).
Other GT2 characterization efforts have documented activity to similar standards (e.g. CSLDs, Yang et al., 2020 Plant Cell or CSLFs, Purushotham et al., 2022 Science Advances). At least one independent method should be provided, and the TEM of the product is necessary for readers to appreciate whether the product forms true cellulose microfibrils.
There may be some confusion regarding the nomenclature. Therefore, we revised the second sentence of the Introduction to define ‘cellulose’ as a beta-1,4 linked glucose polymer, in accordance with the ‘Essentials of Glycobiology’. This is also consistent with enzyme nomenclature as the primary product of cellulose synthase is a single glucose polymer, and not a fibril. For example, most bacterial cellulose synthases only produce amorphous (single chain) cellulose.
We show that the GmCesA products can be degraded with a beta-1,4 specific glucanase (cellulase), which demonstrates the formation of authentic cellulose. This study does not focus on the formation of fibrillar cellulose apart from suggesting a revised model for a microfibril-forming CSC.
(3) The position of isoxaben-resistant mutations implies that primary cell wall CESAs form heterotrimers (Shim et al., 2018 Frontiers in Plant Biology). Indeed, in their previous description of the POCESA8 structure (Purushotham et al., 2020 Science), the authors discussed the position of isoxaben-resistant mutations as a way to justify the way that TM7 of one CESA can contribute to forming the cellulose translocation pore in the neighbouring CESA within a heterotrimer. However, in this manuscript, the authors document a different location for TM7 in the GmCESA1, GmCESA3, and GmCESA6 homotrimers, which would change the position of these resistance mutations. Please discuss.
As stated in the manuscript, we do not know what the functional implication of the TM7 flexibility may be, but we speculate that it could affect the alignment of the synthesized cellulose polymers. Regarding the previously reported POCesA8 structure, the mapping of one of the reported isoxaben resistance mutants to the C-terminus of TM7 was not used to justify the structure; the structure with its position of TM7 stands on its own. Considering recent observations suggesting that isoxaben may affect cellulose biosynthesis via secondary effects, we prefer not to speculate on the mechanism by which these mutations cause the apparent resistance to isoxaben (PMID: 37823413).
(4) The authors present no evidence that GmCESA1/3/6 are involved in primary cell wall synthesis. Please include gene expression information (documenting widespread expression consistent with primary CESAs) and rigorous molecular phylogenetic analysis (or references to these published data) to clarify that these are indeed primary cell wall CESAs.
This has been addressed. We have included additional figures (Fig. 1 and S1B) that show the strong and wide distribution of the selected CesAs in soybean leaves, their co-expression with primary cell wall markers, and their phylogenetic clustering with Arabidopsis primary cell wall CesAs.
(5) Several small changes need to be made to the abstract to ensure that it aligns with the data: Line 28: add "in vitro" after "their assembly into homotrimeric complexes"; Line 28: change "stabilized by the PCR" to "presumably stabilized by the PCR".
We inserted ‘in vitro’ as requested. We did not insert the second modification as requested since CesA trimers are stabilized by the PCR. This is a fact arising from several experimentally determined CesA trimer structures.
(6) In all graphs in all figures it is unclear what the sample size is and what the bars represent. These must be stated in the figure legends. It is best practice to plot individual data points so that readers can easily interpret both the sample size and the variation.
The sample sizes and error bars are now defined in the relevant figure legends.
(7) The methods need to unambiguously define GmCESA1, GmCESA3, GmCESA6 protein identities using appropriate accession numbers.
The accession codes are now provided in the Methods.
Minor Points
(1) Does CESA1 have higher activity in Figure 1D because of the pH at which the assay was conducted (see Figure 1E)? Could this difference in activity or pH preference have also affected their capacity to resolve TM7 of CESA1?
We consistently observe higher in vitro catalytic activity of CesA1, compared to CesA3 and CesA6. Activity assays are performed at a pH of 7.5, roughly halfway between the activity maxima of CesA3 and CesA1/6. At this pH, we expect activity differences to arise from factors other than the buffer pH. As detailed above, we do not know whether the conformational flexibility of TM helix 7 affects catalytic activity.
(2) Line 55: The authors should cite additional papers that also provide insight into CESA structure (e.g. Qiao et al 2021 PNAS).
A recent publication on moss CesA5 has been included. Qiao et al unfortunately report on a dimeric assembly of a fragment of Arabidopsis thaliana’s CesA3 catalytic domain, which we consider non-physiological. We added a brief statement in the Discussion explaining that our GmCesA3 structure is inconsistent with the dimeric arrangement reported by Qiao et al.
(3) Line 95: these references are about secondary cell wall CESA isoforms, but there are more appropriate references for the primary CESAs that should be included in place of these papers.
Fagard et al report on growth defects in roots and dark-grown hypocotyls linked to Arabidopsis CesA1 and CesA6, which are primary cell wall CesAs. Nevertheless, we have included two additional recent publications from the Meyerowitz and Persson labs.
(4) Line 121-122: Please cite a specific figure that supports this claim, since the (Purushotham et al., 2020) reference refers to POCESA8 enrichment results, but the claims are about the GmCESA1/3/6 enrichment.
The POCesA8 reference has been removed. The classification into monomers and trimers arises from the data processing described in this manuscript and is consistent with similar results obtained for POCesA8.
(5) Line 314: It is more appropriate to use "enzyme activity" rather than "cellulose synthesis".
We prefer to use cellulose biosynthesis since the enzyme produces cellulose.
(6) Figure 1: please add colour to the graphs to clarify which trend lines belong to which data series (especially Figure 1G).
The figure (now Fig. 2) has been revised as suggested.
(7) Figure 2D: It's not clear which parts are GmCESA and which are POCESA8; please clarify the figure legend.
Thank you, the legend has been revised accordingly (now Fig. 3).
(8) In Figure 5, It's not clear that the one CESA is maintained at a steady concentration throughout the assay since there is only a bar for that CESA at the highest concentration (e.g. in Figure 5A, the blue bar for CESA1 only appears on the right-most assay, but there was CESA1 in all assays, so this should be indicated).
In the panel the reviewer is referring to, the blue bar corresponds to the activity measured for CesA1 alone at a concentration of 20 µM. The red columns (indicated as ‘Mix’) represent the activities measured in the presence of 20 µM of CesA1 plus increasing concentrations of CesA3. The purple columns represent activities obtained for CesA3 alone at the indicated concentrations. Numerical addition of the activities of CesA1 alone at 20 µM (blue column) and CesA3 alone (purple columns) gives rise to the gray columns, now indicated by a capital ‘sigma’ sign. We are unclear on how the figure could be improved, but we have revised the legend to avoid confusion.
(9) Figure 5 legend needs to be clarified to indicate whether monomers or homotrimers were used in the assays.
This is now shown as Fig. 7 and the legend has been revised as requested. The experiments were performed with the trimeric CesA fractions.
(10) There seem to be some random dots near the top of Figures 6B & 6C
Removed. Thank you.
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This paper investigates the effects of the explicit recognition of statistical structure and sleep consolidation on the transfer of learned structure to novel stimuli. The results show a striking dissociation in transfer ability between explicit and implicit learning of structure, finding that only explicit learners transfer structure immediately. Implicit learners, on the other hand, show an intriguing immediate structural interference effect (better learning of novel structure) followed by successful transfer only after a period of sleep.
Strengths:
This paper is very well written and motivated, and the data are presented clearly with a logical flow. There are several replications and control experiments and analyses that make the pattern of results very compelling. The results are novel and intriguing, providing important constraints on theories of consolidation. The discussion of relevant literature is thorough. In sum, this work makes an exciting and important contribution to the literature.
Weaknesses:
There have been several recent papers which have identified issues with alternative forced choice (AFC) tests as a method of assessing statistical learning (e.g. Isbilen et al. 2020, Cognitive Science). A key argument is that while statistical learning is typically implicit, AFC involves explicit deliberation and therefore does not match the learning process well. The use of AFC in this study thus leaves open the question of whether the AFC measure benefits the explicit learners in particular, given the congruence between knowledge and testing format, and whether, more generally, the results would have been different had the method of assessing generalization been implicit. Prior work has shown that explicit and implicit measures of statistical learning do not always produce the same results (e.g. Kiai & Melloni, 2021, bioRxiv; Liu et al. 2023, Cognition).
The authors argued in their response to this point that this issue could have quantitative but not qualitative impacts on the results, but we see no reason that the impact could not be qualitative. In other words, it should be acknowledged that an implicit test could potentially result in the implicit group exhibiting immediate structure transfer.
We thank the reviewer for their feedback and added a statement in our discussion section acknowledging the possible effects of alternative measures of learning.
Given that the explicit/implicit classification was based on an exit survey, it is unclear when participants who are labeled "explicit" gained that explicit knowledge. This might have occurred during or after either of the sessions, which could impact the interpretation of the effects and deserves discussion.
We agree with the mentioned shortcoming in principle, although there are good methodological reasons for this, as discussed in our previous response. We added a statement on this topic to our discussion to make the potential issues and our reasoning in the design decision more transparent for the reader.
Reviewer #2 (Public review):
Summary:
Sleep has not only been shown to support the strengthening of memory traces, but also their transformation. A special form of such transformation is the abstraction of general rules from the presentation of individual exemplars. The current work used large online experiments with hundreds of participants to shed further light on this question. In the training phase participants saw composite items (scenes) that were made up of pairs of spatially coupled (i.e., they were next to each other) abstract shapes. In the initial training, they saw scenes made up of six horizontally structured pairs and in the second training phase, which took place after a retention phase (2 min awake, 12 h incl. sleep, 12 h only wake, 24 h incl. sleep), they saw pairs that were horizontally or vertically coupled. After the second training phase, a two-alternative forced-choice (2-AFC) paradigm, where participants had to identify true pairs versus randomly assembled foils, was used to measure performance on all pairs. Finally, participants were asked five questions to identify whether they had insight into the pair structure, and post-hoc groups were assigned based on this. Mainly the authors find that participants in the 2 minute retention experiment without explicit knowledge of the task structure were at chance level performance for the same structure in the second training phase, but had above chance performance for the vertical structure. The opposite was true for both sleep conditions. In the 12 h wake condition these participants showed no ability to discriminate the pairs from the second training phase at all.
Strengths:
All in all, the study was performed to a high standard and the sample size in the implicit condition was large enough to draw robust conclusions. The authors make several important statistical comparisons and also report an interesting resampling approach. There is also a lot of supplemental data regarding robustness.
Weaknesses:
My main concern regards the small sample size in the explicit group and the lack of experimental control.
We thank the reviewer for the valuable feedback throughout the review process. The issues mentioned here have been addressed in our previous response.
Reviewer #3 (Public review):
In this project, Garber and Fiser examined how the structure of incidentally learned regularities influences subsequent learning of regularities, that either have the same structure or a different one. Over a series of six online experiments, it was found that the structure (spatial arrangement) of the first set of regularities affected learning of the second set, indicating that it has indeed been abstracted away from the specific items that have been learned. The effect was found to depend on the explicitness of the original learning: Participants who noticed regularities in the stimuli were better at learning subsequent regularities of the same structure than of a different one. On the other hand, participants whose learning was only implicit had an opposite pattern: they were better in learning regularities of a novel structure than of the same one. However, when an overnight sleep separated the first and second learning phases, this opposite effect was reversed and came to match the pattern of the explicit group, suggesting that the abstraction and transfer in the implicit case were aided by memory consolidation.
In their revision the authors addressed my major comments successfully and I commend them for that.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
We would encourage the authors to add text to the manuscript that acknowledges/discusses the two issues pointed out in our review.
We added relevant passages to the discussion section of the manuscript.
Reviewer #2 (Recommendations for the authors):
The authors have improved some sections of the manuscript and this is reflected in my assessment. The major weaknesses remain unchanged. Since my review is published alongside the paper, readers can make up their own mind regarding their severity.
My only hard ask would be to add that the study was not preregistered into the main manuscript as I asked before! I am surprised that the authors are so reluctant to honestly state this fact....
We have not stated this fact in our manuscript until now since our understanding is that papers that report preregistered studies state and cite their preregistration in their method section, while any omission of such a statement by default conveys that no preregistration occurred. In fact, we cannot recall encountering papers with statements of no-preregistration in the literature. Nevertheless, we have no issue stating that our study was not preregistered and per the reviewer's request, we have added such an explicit statement in our manuscript.
Reviewer #3 (Recommendations for the authors):
* I strongly urge the authors to remove the Results sub-sections from Methods.
We thank the reviewer for highlighting this issue arising from our previous layout, which we decided to handle the following way. We relabeled the subsections in question as “Additional Analyses” to avoid confusion, we removed any redundant findings already reported in Results of the main text, and we moved a small number of more substantial findings from the Methods Section to the main text Results as requested. We believe that this solution constitutes the most readable option, as we do not clutter the main results with extensive sanity checks and results of minor interest, while we also do not need to establish experiment-wise result sections in the Supplementary Materials, which would further disperse information interested readers might look for.
* Authors report that in Experiment 4 "Participants with explicit knowledge (n=23) show the same pattern of results as they did in Experiment 1", but that seems inaccurate, as they did learn novel pairs in Exp4 whereas they did not in Exp1. This can be seen in the figure and also in Methods-Results: "performing above chance for ... pairs of a novel structure (M=69.6, SE=5.9, d=0.69, t(22)=3.33 p=0.012, BF=13.6) in the second training phase"
We thank the reviewer for pointing out this error in our interpretation of the results and adjusted the section in question to better align with what our result actually shows.
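The statistics quoted by the reviewer can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes a one-sample t-test against a chance level of 50% (the 2-AFC chance level) and Cohen's d computed as t divided by the square root of n; both are assumptions about the analysis, and the function names are hypothetical.

```python
import math


def one_sample_t(mean: float, se: float, chance: float) -> float:
    """t statistic for a sample mean tested against a fixed chance level."""
    return (mean - chance) / se


def cohens_d_from_t(t: float, n: int) -> float:
    """One-sample Cohen's d recovered from t and the sample size."""
    return t / math.sqrt(n)


# Values quoted in the review: M = 69.6, SE = 5.9, n = 23 (so df = 22).
t_value = one_sample_t(69.6, 5.9, 50.0)
d_value = cohens_d_from_t(t_value, 23)
print(round(t_value, 2), round(d_value, 2))
```

Under these assumptions the recomputed values (t of about 3.32, d of about 0.69) agree with the reported t(22)=3.33 and d=0.69 up to rounding of the quoted mean and standard error.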
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
Multiple compounds that inhibit ATP-sensitive potassium (KATP) channels also chaperone channels to the surface membrane. The authors used an artificial intelligence (AI)-based virtual screening (AtomNet) to identify novel compounds that exhibit chaperoning effects on trafficking-deficient disease-causing mutant channels. One compound, which they named Aekatperone, acts as a low affinity, reversible inhibitor and effective chaperone. A cryoEM structure of KATP bound to Aekatperone showed that the molecule binds at the canonical inhibitory site.
Strengths and weaknesses:
The details of the AI screening itself are inevitably opaque, but appear to differ from classical virtual screening in not involving any physical docking of test compounds into the target site. The authors mention criteria that were used to limit the number of compounds, so that those with high similarity to known binders and 'sequence identity' (does this mean structural identity?) were excluded. The identified molecules contain sulfonylurea-like moieties. How different are they from other sulfonylureas?
We thank the reviewers for the questions. As part of the library preparation, molecules with greater than 0.5 Tanimoto similarity in ECFP4 space to any known binders of the target protein and its homologs within 70% sequence identity were excluded to increase the possibility of identifying novel hits. After scoring and ranking the molecules by the AtomNet® technology, a diversity clustering was performed using the Butina algorithm (Butina D. Unsupervised Data Base Clustering Based on Daylight’s Fingerprint and Tanimoto Similarity: A Fast and Automated Way To Cluster Small and Large Data Sets, J. Chem. Inf. Comput. Sci. 1999, 39, 747–750) with a Tanimoto similarity cutoff of 0.35 in ECFP4 space to minimize selection of structurally similar scaffolds for the final compound buy-list. We have revised the results and methods sections to make this clear.
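The two similarity-based steps described above (excluding molecules too similar to known binders, then diversity clustering) can be sketched in plain Python. This is a minimal illustration, not the AtomNet pipeline: fingerprints are represented as toy sets of on-bit indices rather than real ECFP4 fingerprints, the helper names are hypothetical, and the 0.35 cutoff is treated as a minimum similarity for two molecules to count as neighbors, which is one possible reading of the description.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def is_novel(fp: frozenset, known_binders, max_sim: float = 0.5) -> bool:
    """Keep a molecule only if no known binder exceeds max_sim similarity."""
    return all(tanimoto(fp, k) <= max_sim for k in known_binders)


def butina_clusters(fps, sim_cutoff: float = 0.35):
    """Butina-style clustering: the molecule with the most neighbors becomes a
    centroid and claims its still-unassigned neighbors; repeat until done."""
    n = len(fps)
    neighbors = [
        {j for j in range(n) if j != i and tanimoto(fps[i], fps[j]) >= sim_cutoff}
        for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: (-len(neighbors[i]), i))
    assigned, clusters = set(), []
    for i in order:
        if i in assigned:
            continue
        members = [i] + sorted(j for j in neighbors[i] if j not in assigned)
        assigned.update(members)
        clusters.append(members)
    return clusters


# Toy fingerprints (hypothetical bit indices, not real molecules).
fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({1, 2, 3, 6}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 13}),
]
clusters = butina_clusters(fps, sim_cutoff=0.35)
print(clusters)
```

Picking one representative per cluster is what limits structurally redundant scaffolds in a final selection. In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.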
Sulfonylureas are defined by their core structure comprising a sulfonyl group (–S(=O)<sub>2</sub>) and a urea moiety (–NH–CO–NH–). While some compounds identified in our study contain a sulfonamide group (R-S(=O)<sub>2</sub>-NR<sub>2</sub>), they differ structurally from sulfonylureas by lacking the key urea group and by incorporating unique R-group substitutions (we have now added this to Figure 1A legend). For example, compound C27 (Z2068224500) includes a sulfonamide group but not a urea moiety. Likewise, C45 (Aekatperone, Z1620764636) contains a sulfonamide group along with an aromatic, nitrogen-rich heterocyclic ring, but no urea group. Additionally, the R-groups in these compounds are more complex than the simple aromatic or alkyl chains typical of sulfonylureas. They include heterocyclic aromatic systems and nitrogen-rich structures, which likely influence their binding properties and lipophilicity. These structural differences suggest distinct functional and pharmacological profiles as supported by our biochemical and functional studies.
The experimental work confirming that Aekatperone acts to traffic mutant KATP channels to the surface and acts as a low affinity, reversible, inhibitor is comprehensive and clear, with very convincing cell biological and patch-clamp data, as is the cryoEM structural analysis, for which the group are leading experts. In addition to the three positive chaperone-effective molecules, the authors identified a large number of compounds that are predicted binders but apparently have no chaperoning effect. Did any of them have inhibitory action on channels? If so, does this give clues to separating chaperoning from inhibitory effects?
This is an interesting question. Evidence from cryo-EM, biochemical and electrophysiology studies reveal a critical role of Kir6.2 N-terminus in K<sub>ATP</sub> channel assembly and gating, and that pharmacological chaperones like glibenclamide, repaglinide, carbamazepine, and now aekatperone exert their chaperoning and inhibitory effects by stabilizing the interaction between Kir6.2 N-terminus and the SUR1-ABC core. This stabilization, while promoting the assembly of Kir6.2 and SUR1 to “chaperone” trafficking-impaired mutant channels to the cell surface, also inhibits the channel by restricting the Kir6.2 C-terminal domain from rotating to an open state. An additional mechanism by which these compounds inhibit channel activity is by preventing SUR1-NBD dimerization, which mediates physiological activation of the channel by MgADP (see review: Driggers CM, Shyng SL. Mechanistic insights on K<sub>ATP</sub> channel regulation from cryo-EM structures. J Gen Physiol. 2023 Jan 2;155(1): e202113046, PMID: 36441147). From our compound screening, we did find some compounds that showed mild inhibition of the channel by electrophysiology but no obvious chaperone effects by western blots. It is possible that small chaperoning effects of some compounds showing mild channel inhibition effects were missed due to the lower sensitivity of the western blot assay compared to electrophysiology. Alternatively, these compounds could inhibit channels by preventing SUR1NBD dimerization without stabilizing the Kir6.2 N-terminus, which is required for the chaperone effect based on our model. Unfortunately, we did not find any compounds that show chaperone effects but no channel inhibition effects, which is consistent with our understanding of how this type of K<sub>ATP</sub> chaperones work (i.e. by stabilizing Kir6.2 N-terminus interaction with SUR1’s ABC core).
The authors suggest that the novel compound may be a promising therapeutic for treatment of congenital hyperinsulinism due to trafficking defective KATP mutations, because it is a low-affinity, reversible inhibitor. This is a very interesting concept, and perhaps a pulsed dosing regimen would allow trafficking without constant channel inhibition (which otherwise defeats the therapeutic purpose), although it is unclear whether the new compound will offer advantages over earlier low-affinity sulfonylurea inhibitor chaperones. These include tolbutamide, which has very similar affinity and effect to Aekatperone. As the authors point out, tolbutamide (as well as other sulfonylureas) is currently out of favor because of potential adverse cardiovascular effects, but again, it is unclear why Aekatperone should not raise the same concerns.
We thank the reviewer for the comments. This is clearly an important question to address in the future. While we have not directly tested the effects of Aekatperone on cardiac functions, we did assess its inhibitory effect on cells expressing the cardiac K<sub>ATP</sub> channel isoform (SUR2A/Kir6.2). Our results indicate that Aekatperone exhibits higher sensitivity toward the pancreatic K<sub>ATP</sub> channel isoform (SUR1/Kir6.2) compared to the cardiac isoform. However, we acknowledge that Aekatperone could still have cardiotoxic effects through its potential action on other channels, such as the hERG channel.
It is worth noting that tolbutamide, despite its known cardiotoxic effects, does not exert these effects through cardiac K<sub>ATP</sub> channel inhibition. This has been demonstrated in studies showing no inhibitory effect of tolbutamide on SUR2A/Kir6.2 channels and on channels formed by Kir6.2 and SUR1 harboring the S1238Y mutation (also shown as S1237Y in some studies using a different SUR1 isoform), the amino acid substitution found in SUR2A at the corresponding position (Ashfield R, Gribble FM, Ashcroft SJ, Ashcroft FM. Identification of the high-affinity tolbutamide site on the SUR1 subunit of the K<sub>ATP</sub> channel. Diabetes. 1999 Jun;48(6):1341-7, PMID: 10342826). This suggests that tolbutamide’s cardiotoxic effects might involve other targets like the hERG channel. Interestingly, tolbutamide contains a hydrophobic tail and aromatic rings that align well with the structural features for hERG interaction (Garrido A, Lepailleur A, Mignani SM, Dallemagne P, Rochais C. hERG toxicity assessment: Useful guidelines for drug design. Eur J Med Chem. 2020 Jun 1;195:112290, PMID: 32283295). In contrast, high-affinity sulfonylureas such as glibenclamide and glimepiride, which have additional benzamide moieties, are associated with lower cardiovascular risks (Douros A, Yin H, Yu OHY, Filion KB, Azoulay L, Suissa S. Pharmacologic Differences of Sulfonylureas and the Risk of Adverse Cardiovascular and Hypoglycemic Events. Diabetes Care. 2017, 40:1506-1513, PMID: 28864502).
Given these considerations, a comprehensive assessment of Aekatperone’s potential cardiotoxicity is crucial. Future studies involving in silico modeling, in vitro, and in vivo experiments will be essential to evaluate Aekatperone’s interaction with hERG and other off-target effects. These efforts will help clarify its safety profile. This point has now been added to the Discussion.
Reviewer #2 (Public review):
Summary:
In their study 'AI-Based Discovery and CryoEM Structural Elucidation of a KATP Channel Pharmacochaperone', ElSheikh and colleagues undertake a computational screening approach to identify candidate drugs that may bind to an identified binding pocket in the SUR1 subunit of KATP channels. Other KATP channel inhibitors such as glibenclamide have been previously shown to bind in this pocket, and in addition to inhibiting KATP channel function, these inhibitors can very effectively rescue cell surface expression of trafficking deficient KATP mutations that cause excessive insulin secretion (Congenital Hyperinsulinism). However, a challenge for their utility for treatment of hyperinsulinism has been that they are powerful inhibitors of the channels that are rescued to the cell surface. In contrast, successful therapeutic pharmacochaperones (eg. CFTR chaperones) permit function of the channels rescued to the cell membrane. Thus, a key criterion for the authors' approach in this case was to identify relatively low affinity compounds that target the glibenclamide binding site (and can be washed off); these could potentially rescue KATP surface expression, but also permit KATP function.
Strengths:
The main findings of the manuscript include:
(1) Computational screening of a large virtual compound library, followed by functional screening of cell surface expression, which identified several potential candidate pharmacochaperones that target the glibenclamide binding site.
(2) Prioritization and functional characterization of Aekatperone as a low affinity KATP inhibitor which can be readily 'washed off' in patch clamp, and cell based efflux assays. Thus the drug clearly rescues cell surface expression, but can be manipulated experimentally to permit function of rescued channels.
(3) Determination of the binding site and dynamics of this candidate drug by cryo-EM, and functional validation of several residues involved in drug sensitivity using mutagenesis and patch clamp.
The experiments are well-conceived and executed, and the study is clearly described. The results of the experiments are very straightforward and clearly support the conclusions drawn by the authors. I found the study to provide important new information about KATP chaperone effects of certain drugs, with interesting considerations in terms of ion channel biology and human disease.
Weaknesses:
I don't have any major criticisms of the study as described, but I had some remaining questions that could be addressed in a revision.
(1) The chaperones can effectively rescue KATP trafficking mutants, but clearly not as strongly as the higher affinity inhibitor glibenclamide. Is this relationship between inhibitory potency, and efficacy of trafficking an intrinsic challenge of the approach? I suspect that it may be an intractable problem in the sense that the inhibitor bound conformation that underlies the chaperone effect cannot be uncoupled from the inhibited gating state. But this might not be true (many partial agonist drugs with low efficacy can be strongly potent, for example). In this case, the approach is really to find a 'happy medium' of a drug that is a weak enough inhibitor to be washed away, but still strong enough to exert some satisfactory chaperone effect. Could some additional clarity be added in the discussion on whether the chaperone and gating effects can be 'uncoupled'.
Thank you for the suggestion. A similar question was raised by Reviewer 1, which was addressed above (public review, point 2). We have now added more discussion to clarify this point.
(2) Based on the western blots in Figure 2B, the rescue of cell surface expression appears to require a higher concentration of AKP compared to the concentration response of channel inhibition (~9 microM in Figure 3, perhaps even more potent in patch clamp in Figure 2C). Could the authors clarify/quantify the concentration response for trafficking rescue?
Thank you for bringing up this observation. Indeed, the pharmacochaperone effects of Aekatperone as well as other previously published K<sub>ATP</sub> pharmacochaperones require higher concentrations compared to their inhibitory effects on surface-expressed channels. This difference likely stems from the necessity for these compounds to cross the cell membrane and interact with newly synthesized channels in the endoplasmic reticulum, where the trafficking rescue occurs. We estimate that effective pharmacochaperone activity for Aekatperone can be achieved at concentrations ranging from 50 to 100 µM in cells expressing trafficking-deficient K<sub>ATP</sub> channel mutants, higher than that required for inhibition of surface-expressed channels (~9 µM IC50). Future work could focus on medicinal chemistry modifications, for example esterification of Aekatperone (Zhou G. Exploring Ester Prodrugs: A Comprehensive Review of Approaches, Applications, and Methods. Pharmacology & Pharmacy, 2024, 15, 269-284). Once inside the cell, the esters would be cleaved by endogenous esterases to release the active compound, ensuring efficient intracellular delivery. This strategy could potentially improve membrane permeability and bioavailability of the compound, which would lower the required concentrations to achieve desired chaperoning effects.
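To make the gap between the two concentration ranges concrete, the following sketch evaluates a simple Hill-type inhibition curve at the IC50 and at the concentrations quoted for trafficking rescue. It assumes a one-site model with a Hill coefficient of 1 and uses the ~9 µM IC50 mentioned above; the model, the Hill coefficient and the function name are illustrative assumptions rather than a fit to the authors' data, and the curve describes inhibition of surface-expressed channels only, not the rescue process itself.

```python
def fraction_inhibited(conc_um, ic50_um=9.0, hill=1.0):
    """Fraction of current inhibited at a given concentration (Hill equation).

    ic50_um defaults to the ~9 uM value quoted in the response;
    hill=1.0 is an assumption made only for this illustration.
    """
    if conc_um <= 0:
        return 0.0
    return conc_um**hill / (conc_um**hill + ic50_um**hill)


# Compare the IC50 with the 50-100 uM range quoted for trafficking rescue.
for conc in (9.0, 50.0, 100.0):
    print(f"{conc:6.1f} uM -> fraction inhibited {fraction_inhibited(conc):.2f}")
```

Under these assumptions, the concentrations used for rescue sit well above the IC50, which is in line with the washout step the authors use before assessing the function of rescued channels.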
(3) A future challenge in the application of pharmacochaperones of this type in hyperinsulinism may be the manipulation of chaperone concentration in order to permit function. In experiments it is straightforward to wash off the chaperone, but this would not be the case in an organism. I wondered if the authors had attempted to rescue channel function with diazoxide in the presence of AKP, rather than after washing off (i.e., is AKP inhibition insurmountable, or can it be overcome by sufficient diazoxide?).
Thank you for raising this important point. We have previously shown (Martin GM et al. Pharmacological Correction of Trafficking Defects in ATP-sensitive Potassium Channels Caused by Sulfonylurea Receptor 1 Mutations. J Biol Chem. 2016, 291: 21971-21983, PMID: 27573238) that diazoxide, which stabilizes K<sub>ATP</sub> channels in an open conformation, also reduces physical association between Kir6.2 N-terminus and SUR1 as demonstrated by reduced crosslinking of engineered azido-phenylalanine (an unnatural amino acid) at Kir6.2 N-terminal amino acid 12 position to SUR1. Incubating cells with diazoxide did not rescue the trafficking mutants but actually further reduced the maturation efficiency of trafficking mutants. For this reason, we did not include diazoxide during Aekatperone incubation and instead added diazoxide after Aekatperone washout to potentiate the activity of mutant channels rescued to the cell surface. In vivo, we envision testing alternating Aekatperone and diazoxide dosing to maximize functional rescue of K<sub>ATP</sub> trafficking mutants.
(4) Do the authors have any information about the turnover time of KATP after washoff of the chaperone (how stable are the rescued channels at the cell surface)? This is a difficult question to probe when glibenclamide is used as a chaperone, but maybe much simpler to address with a lower affinity chaperone like AKP.
Thank you for your thoughtful comment. While we have not yet tested the duration of rescued K<sub>ATP</sub> channels at the cell surface following Aekatperone washout, we have conducted similar studies with carbamazepine (Chen PC et al. Carbamazepine as a novel small molecule corrector of trafficking-impaired ATP-sensitive potassium channels identified in congenital hyperinsulinism. J Biol Chem. 2013, 288: 20942-20954, PMID: 23744072), another compound exhibiting reversible inhibitory and chaperone effects (apparent affinity between glibenclamide and Aekatperone). Our previous findings with carbamazepine showed that in cultured cells its chaperone effects were detectable as early as 1 hour and peaked around 6 hours after treatment. Furthermore, when carbamazepine was removed following a 16-hour treatment, the rescue effect persisted for up to 6 hours post-drug removal. These results provide a potential duration of the surface expression rescue effects of reversible pharmacochaperones.
Reviewer #1 (Recommendations for the authors):
The paper is well-written and comprehensive with only very minor essentially copy-editing needed. That said, it would be good if the authors could answer the main points raised above:
(1) What is the relevant Tanimoto parameters and sequence identity (does this mean structural identity) for the identified compounds?
As we answered above in response to the overall assessment, to facilitate the identification of novel hits, molecules with greater than 0.5 Tanimoto similarity in ECFP4 space to any known binders of the target protein and its homologs within 70% amino acid sequence identity were excluded from the commercial library. Additionally, after scoring and ranking the molecules by the AtomNet® technology, a diversity clustering was performed on the top 30,000 molecules using the Butina algorithm with a Tanimoto similarity cutoff of 0.35 in ECFP4 space to minimize selection of structurally similar scaffolds for the final compound buy-list.
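The two similarity filters described above can be sketched as follows. This is a minimal pure-Python illustration, not the AtomNet pipeline: fingerprints are represented as hypothetical sets of on-bit indices (in practice ECFP4 bits would come from a cheminformatics toolkit), the clustering is a simplified version of the Butina procedure, and all function names and defaults other than the 0.5 and 0.35 thresholds are assumptions.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def drop_known_like(library, known_binders, max_similarity=0.5):
    """Keep molecules whose similarity to every known binder is <= max_similarity."""
    return {
        name: fp
        for name, fp in library.items()
        if all(tanimoto(fp, known) <= max_similarity for known in known_binders)
    }


def butina_like_clusters(fps, similarity_cutoff=0.35):
    """Simplified Butina clustering.

    The molecule with the most neighbours (similarity >= cutoff) seeds a
    cluster and claims all of its still-unassigned neighbours; the process
    repeats down the list sorted by neighbour count.
    """
    names = list(fps)
    neighbours = {
        n: {m for m in names if m != n and tanimoto(fps[n], fps[m]) >= similarity_cutoff}
        for n in names
    }
    unassigned = set(names)
    clusters = []
    # Visit candidates from most to fewest neighbours (ties broken by name).
    for seed in sorted(names, key=lambda n: (-len(neighbours[n]), n)):
        if seed not in unassigned:
            continue
        members = [seed] + sorted(neighbours[seed] & unassigned)
        unassigned -= set(members)
        clusters.append(members)
    return clusters
```

Picking one representative per cluster (for example, the highest-scoring member) would then give a structurally diverse buy-list, which is the stated purpose of the clustering step.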
(2) Did any of the identified putative binders have inhibitory action on channels? If so, does this give clues to separating chaperoning from inhibitory effects?
Please see response to the same question in the overall assessment above.
(3) Acknowledge that the identified compounds contain sulfonylurea-like moieties, and address why Aekatperone should (or perhaps does not) offer any advantage over low-affinity sulfonylureas such as tolbutamide.
Please see response to the same question in the overall assessment above.
Reviewer #2 (Recommendations for the authors):
Thank you for assembling the interesting study, which I felt was well designed and communicated. The diverse approaches used in the study, with consistent findings, were definitely a strength. The core findings are also well distilled in the main body of the text, and although there is quite a lot of supplementary information, I felt that it was presented appropriately and well selected in terms of what would be important for readers hoping to learn more. In addition to the questions described above, I only had a few minor editorial issues that could be fixed related to presentation.
(1) Figure 1B. The colours and resolution of the chemical structures are difficult to see clearly and could be improved.
We have revised the figure accordingly.
(2) This is a minor wording point... the first sentence of the discussion describes the drugs as pancreatic-selective, when it would be clearer to describe them as selective for the pancreatic isoform of KATP (Kir6.2/SUR1), or perhaps better as 'exhibiting ~4-5 fold selectivity for SUR1-containing KATP channels vs. SUR2A or SUR2B'.
We have changed the wording as suggested.
(3) As a curiosity (not necessary to do more experiments), but I am curious if the authors know whether there is any meaningful enhancement of trafficking of WT channels by AKP.
All pharmacochaperones we have identified to date including Aekatperone also slightly enhance WT channel surface expression (10-20%).
Reviewing editor recommendations:
(1) Given the modest resolution of the EM reconstruction, it is perhaps not entirely clear how AKP was assigned to the density observed. Specifically, it would be helpful to include a comparison of an AKP-free map and the current AKP map (filtered to a similar resolution) showing slice views of densities in the region around the inferred binding site. This would be very helpful in ascertaining whether the cryoEM reconstruction is an independent validation of the computational and functional experiments or whether the density inference depends on the additional knowledge.
We appreciate the editor’s suggestion. We have now added a Supplemental Figure (Supplementary Figure 7 in the revised manuscript) that compares our AKP-free cryoEM density deposited previously to the EMDB (EMD-26320) and the AKP-bound cryoEM density from this study, with cryoEM density (filtered to the same resolution) superimposed on the structural model.
(2) It could help to mention in brief what is a probable mechanism of AKP inhibition - that is how after binding of AKP, channel opening is restricted. Is it similar to that of other site A ligands?
Based on the strong Kir6.2 N-terminal cryoEM density observed in our AKP map, AKP most likely inhibits K<sub>ATP</sub> channels by trapping the Kir6.2 N-terminus in the central cavity of SUR1’s ABC core thus preventing Kir6.2-C-terminal domain from rotating to an open conformation, similar to other ligands that stabilize the Kir6.2 N-terminus-SUR1 interface by binding to site A (such as tolbutamide and AKP), site B (such as repaglinide), or both site A and site B (such as glibenclamide). We have now included this in the revised Results and Discussion sections.
(3) In the context of the MD simulations, do other site A ligands (which from my understanding bind at a similar site) also exhibit similar flexibility as AKP? If there is information available on the flexibility of ligands of varying affinities, bound to the same site, maybe some correlative inferences can be drawn? However, in MD simulation trajectories it is not entirely uncommon for a ligand to simply get trapped in a local energy well. Since the authors have performed significant analysis of their MD results it could be worth mentioning/discussing such phenomena.
Previously published MD data addressing ligand dynamics, such as glibenclamide in the SUR1 pocket (Walczewska-Szewc K, Nowak W. Photo-Switchable Sulfonylureas Binding to ATP-Sensitive Potassium Channel Reveal the Mechanism of Light-Controlled Insulin Release. J Phys Chem B. 2021, 125: 13111-13121, PMID: 34825567), indicate a certain degree of flexibility. Unfortunately, we cannot directly compare these results, as the simulations were performed without the KNtp domain in the SUR1 cavity, which partially contributes to ligand stabilization. This is an issue we plan to investigate in the future.
In this study, we ran five independent MD simulations, each 500 ns long, resulting in a total of 2.5 μs of simulation time. Across all replicates, the ligand stayed in the same position, with variations mainly in the dynamics of the blurred segment. Considering the length of the simulations and the consistency across the runs, we believe this binding pose is stable and represents a global (or at least highly stable) energy minimum, consistent with the cryo-EM data.
(4) In electrophysiological assays, 10 uM AKP seems to inhibit all currents (Figure 2), but in the Rb+ flux assay ~10 uM appears to be the IC50. The reason for this difference is not entirely clear and it would help to comment on this.
Thank you for noticing the difference. The initial electrophysiological experiments were conducted using the very small amount of AKP provided to us by Atomwise. We estimated the concentration of the reconstituted AKP as best we could, but the estimate was likely inaccurate because of the difficulty of handling such a small amount of AKP powder. Subsequent Rb<sup>+</sup> efflux experiments were conducted using a different, larger batch of AKP that we purchased from Enamine. We have now stated this in the Methods section.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
As reported above, this paper by Xu et al reports on a new method to combine the analysis of coevolutionary patterns with dynamic profiles to identify functionally important residues and reveal correlations between binding sites.
Strengths:
In general, coevolutionary analysis and MD analysis are carried out separately, and while there have been attempts to compare the information provided by the two, no unified framework exists. Here, the authors convincingly demonstrate that integrating signals from dynamics and coevolution gives information that substantially exceeds that provided by either method in isolation. While other methods are useful, they do not capture how dynamics is fundamental to defining function and thus sculpts coevolution, via the 3D structure of the protein. At the same time, the authors demonstrate how coevolution in turn also influences internal dynamics. The networks they rebuild unveil information at an even higher level: the model starts pairwise, but through network representation the authors arrive at community analysis, reporting on interaction patterns that are larger than simple couples.
Weaknesses:
The authors should
- Make an effort in suggesting/commenting the limits of applicability of their method;
We have added a sentence on Page 17, line 15 that describes the limitation of our method.
- Expand discussion on how DyNoPy compares to other methods;
A paragraph has been added to explain the comparison with other models (Page 3, line 18)
- Dynamic is not essential in all systems (structural proteins): The authors may want to comment on possible strategies they would use for other systems where their framework may not be suitable/applicable.
We agree with the reviewer that dynamics is not essential in all systems. In systems where dynamics plays a limited role in function, the analysis done with DyNoPy is equivalent to conventional coevolution analysis, which can be considered one limitation of our method. Conversely, for dynamic proteins, combining functional dynamics descriptors with coevolution analysis using DyNoPy helps denoise the information by deconvoluting communities. We have included this in the manuscript to highlight the suitability/applicability of the method.
Further, we have added a paragraph in the Introduction and conclusions highlighting the main difference between DyNoPy and existing computational tools like DCCM, KIN, and SPM and for your convenience it is provided below:
“Functional sites are often regulated by both, local and global interactions. Changes in these interactions are instrumental for functional events like substrate binding, catalysis, and conformational changes (18). The development of physical models of protein dynamics and the increase in available computational power has stimulated the adoption of computational techniques (19, 20) to investigate the conformational dynamics of proteins, an essential component of the many biological functions (21, 22). Different models have been proposed to describe the interactions between residues during simulations and network models have been particularly popular, including methods on single structures and MD simulations data built by analysing the response to external forces on residue networks (23), by estimating the prevalence of non-covalent energy interaction networks in homologous proteins (24), or by analysing linear or non-linear correlation in atomic fluctuations (25, 26). These techniques have demonstrated their usefulness in extracting allosteric networks from structural data with applications in enzyme design (26).”
Reviewer #2 (Public review):
Summary:
Authors introduced a computational framework, DyNoPy, that integrates residue coevolution analysis with molecular dynamics (MD) simulations to identify functionally important residues in proteins. DyNoPy identifies key residues and residue-residue coupling to generate an interaction graph and attempts to validate using two clinically relevant β-lactamases (SHV-1 and PDC-3).
Strengths:
DyNoPy could not only show clinically relevance of mutations but also predict new potential evolutionary mutations. Authors have provided biologically relevant insights into protein dynamics which can have potential applications in drug discovery and understanding molecular evolution.
Weaknesses:
Although DyNoPy could show the relevance of key residues in active and non-active site residues, no experiments have been performed to validate their predictions.
We thank the reviewer for highlighting this point. We acknowledge that direct experimental validation of our predictions for DyNoPy has not yet been performed. However, we have provided explanations and evidence from experiments conducted on closely related homologs to support the relevance of key residues. These homologs share significant structural and functional similarity, which strengthens the reliability of our predictions.
In addition, they should compare their method with conventional techniques and show how their method could be different.
We thank all the reviewers for highlighting this oversight on our behalf. In Introduction and conclusion, we have added the following paragraphs:
“Functional sites are often regulated by both, local and global interactions. Changes in these interactions are instrumental for functional events like substrate binding, catalysis, and conformational changes (18). The development of physical models of protein dynamics and the increase in available computational power has stimulated the adoption of computational techniques (19, 20) to investigate the conformational dynamics of proteins, an essential component of the many biological functions (21, 22). Different models have been proposed to describe the interactions between residues during simulations and network models have been particularly popular, including methods on single structures and MD simulations data built by analysing the response to external forces on residue networks (23), by estimating the prevalence of non-covalent energy interaction networks in homologous proteins (24), or by analysing linear or non-linear correlation in atomic fluctuations (25, 26). These techniques have demonstrated their usefulness in extracting allosteric networks from structural data with applications in enzyme design (26). ”
An explanation of "communities" divided in the work and how these communities are relevant to the article should be provided. In addition, choice of collective variables and their relevance in residue coupling movement is also not very well explained. Dynamics cross correlation map can also be a good method for understanding the residue movements and can explain the residue-residue coupling, it is not explained how DyNoPy is different from the conventional methods or can perform better.
The following sentences have been included in the manuscript to address the questions raised by the reviewer:
On Community Definition and relevance
DyNoPy identified coevolving residue pairs (scaled coevolution score >1) whose interactions are strongly correlated with protein functional motions (i.e., J values larger than zero). Applying network analysis to the combined dynamics-coevolution matrix helps us extract higher-order interactions beyond pairwise coupling and detect critical residues, which show multiple interactions with each other. Moreover, indirect long-range relationships, which would be hard to identify from numerical data, could be detected through community clustering. Community-based analysis offers a more comprehensive understanding of residue relationships and enables the visualization of residue couplings on the protein structure.
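As an illustration of the filtering and grouping described above, the sketch below keeps residue pairs with a scaled coevolution score above 1 and a J value above zero, links them in a graph, and reports groups of at least three residues. Connected components are used here as a simple stand-in for the community-detection step, whose actual algorithm is not specified in this response; the residue labels, scores and function names are made up for the example.

```python
from collections import defaultdict


def coupled_pairs(pairs, coevo_min=1.0, j_min=0.0):
    """Keep residue pairs that pass both thresholds.

    `pairs` maps (res_a, res_b) -> (scaled_coevolution, J).
    """
    return [pair for pair, (coevo, j) in pairs.items() if coevo > coevo_min and j > j_min]


def residue_groups(edges, min_size=3):
    """Connected components of the residue graph with at least `min_size` members.

    Connected components are a stand-in for community detection here.
    """
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collecting every residue reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        if len(component) >= min_size:
            groups.append(sorted(component))
    return groups


# Hypothetical scores: (scaled coevolution, J) per residue pair.
toy_pairs = {
    ("res1", "res2"): (1.5, 0.4),
    ("res2", "res3"): (2.0, 0.1),
    ("res3", "res4"): (0.8, 0.9),  # fails the coevolution threshold
    ("res5", "res6"): (1.2, 0.3),  # passes, but forms a group of only two
    ("res1", "res7"): (1.4, 0.0),  # fails the J > 0 threshold
}
print(residue_groups(coupled_pairs(toy_pairs)))
```

A dedicated community-detection method would additionally split densely connected components into several communities, which this stand-in does not attempt.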
On Choice of collective variables:
DyNoPy works on the assumption that time-dependent interactions between critical residues, whether or not they undergo significant structural change, will correlate with functional conformational motions. Since MD simulation data is high-dimensional, a time-dependent dynamic descriptor is required to extract the most relevant information for the process under study. A good collective variable (CV) should appropriately describe protein functional motions. Thus, a CV that detects the highest number of residue couplings is expected to be the most suitable descriptor (mentioned on Page 22, Line 14). In our study, we tested 12 CVs, focusing either on the entire protein or on selected regions, and the best-performing CV (the one that identified the most residue couplings) was selected for further analysis. In practical applications, users can decide whether to focus on the most relevant global or local dynamics descriptor depending on the dynamics of their specific system.
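The selection rule described above (pick the CV that yields the most residue couplings) can be sketched in a few lines. The CV names, the J values and the function name are hypothetical; a coupling is counted here when J is above zero, following the threshold given in the community definition.

```python
def best_cv(j_by_cv, j_min=0.0):
    """Return (cv_name, n_couplings) for the CV with the most pairs having J > j_min.

    `j_by_cv` maps a CV name to a dict of {(res_a, res_b): J}.
    Ties are resolved in favour of the alphabetically first CV name.
    """
    counts = {
        cv: sum(1 for j in scores.values() if j > j_min)
        for cv, scores in j_by_cv.items()
    }
    name = max(sorted(counts), key=counts.get)
    return name, counts[name]


# Hypothetical J values for two candidate CVs over the same residue pairs.
toy_scores = {
    "global_rmsd": {(1, 2): 0.3, (2, 3): 0.0, (3, 4): 0.2},
    "loop_distance": {(1, 2): 0.1, (2, 3): 0.4, (3, 4): 0.5},
}
print(best_cv(toy_scores))
```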
We have added a paragraph in the Introduction differentiating DyNoPy from other methods, including DCCM. DCCM differs from DyNoPy in two aspects: (1) it does not account for inter-residue coevolution; (2) its correlation matrix captures correlations of atomic/residue movements associated with the whole intrinsic dynamics of the system, without filtering for the contributions to the important motions involved in the biological function. Additionally, any residue pair contributing to functional motion without itself undergoing any structural change will not be visible in this approach.
In the sentence "DyNoPy identified eight significant communities of strongly coupled residues within SHV-1 (Supporting Fig. S4A)" I could not find a clear description of eight significant communities.
The following sentences have been included in the results, methods and figure legends that define ‘significant community’:
‘DyNoPy identified eight meaningful communities, each consisting of at least three strongly coupled residues within SHV-1 (Supplementary Fig. S4A). All crucial catalytic residues and critical substitution sites previously mentioned participating in one of these communities with the exceptions of R<sub>43</sub>, R<sub>202</sub>, and S<sub>130</sub>.’ (Page 8 Line 28)
‘A meaningful community should contain at least three residues.’ (Page 21 Line 2)
‘A reasonable residue community should contain at least three residues.’ (SI Page 11)
Again the description of communities is not clear to me in the following sentence "Detailed description of the other three communities is provided in the supporting information (Fig. S6)."
The following sentence has been rewritten.
‘Detailed description of communities with secondary importance for protein function (community 3, 8, and 9) is provided in the supplementary information (Supplementary Fig. S6).’ (Page 9, line 8)
In the sentence "N170 acts as an intermediary between N136 and E166". Kindly cite the reference figure to show N179 as intermediate residue.
This sentence has been rewritten to avoid any confusion.
‘Although DyNoPy did not detect this direct interaction between N136 and E166, the established relationship between N136 and N170 highlights the role of N136 in influencing E166.’ (Page 10 Line 8)
Please be careful with the numbers. In the sentence "These residues not only interact with each other directly but are also indirectly coupled via 21 other residues." I could count 22 other residues and not 21.
We thank the reviewer for spotting this error. This has now been corrected. All the communities are counted again.
‘These residues not only interact with each other directly but are also indirectly coupled via 22 other residues.’ (Page 12 Line 14)
In the sentence "Unlike other substitution sites that are adjacent to the active site, R<sub>205</sub> is situated more than 16 Å away from catalytic serine S<sub>70</sub>". Please add this label somewhere in the figure.
The figure legends have been updated to include this. Distances have been added to community 4 Fig. 3 and community 6 Fig. 4. Residue index in the legend of Fig.3 has been included as subscript. Distance in the main text has been changed to be more accurate.
‘G<sub>156</sub> and A<sub>146</sub> are two functional important residues distant from the active site. G<sub>156</sub> is 21.3Å away from the catalytic S<sub>70</sub>. A<sub>146</sub> is 16.8Å away from S<sub>70</sub>.’ (Page 12 Line 2)
‘R<sub>205</sub> is a functional important residue that is 20.6Å away from the active site S<sub>70</sub>.’ (Page 13 Line 10)
Please cite a reference in the sentence "This indicates that mutations on G238 would result in an alteration on protein catalytic function, as well as an increased flexibility of the protein, which strongly aligns with previous finding."
The citation has been added
‘This indicates that mutations on G238 would result in an alteration on protein catalytic function, as well as an increased flexibility of the protein, which strongly aligns with previous finding (62).’ (Page 15 Line 2)
Reviewer #3 (Public review):
Summary:
In this paper, Xu, Dantu and coworkers report a protocol for analyzing coevolutionary and dynamical information to identify a subset of communities that capture functionally relevant sites in beta-lactamases.
Strengths:
The combination of coevolutionary information and metrics from MD simulations is interesting for capturing functionally relevant sites, which can have implications in the fields of drug discovery but also in protein design.
Weaknesses:
The combination of coevolutionary information and metrics from MD simulations is not new as other protocols have been proposed along the years (the current version of the paper neglects some of them, see below), and there are a few parameters of the protocol that, in my opinion, should be better analyzed and discussed.
(1) As mentioned, the introduction of the paper lacks some important publications in the field of using graph theory to represent important interaction networks extracted from MD simulations (DOI: 10.1002/pro.4911), and also combining MD data with MSA to identify functionally relevant sites for enzyme design (doi: 10.1021/acscatal.4c04587, 10.1093/protein/gzae005).
We are very grateful for pointing us to these references. We have added a paragraph in the Introduction mentioning these and other computational tools similar to DyNoPy. Further, in conclusion we have highlighted the differences between DyNoPy and existing tools.
(2) The matrix used to apply graph theory (J_ij) is built from summing the scaled coevolution and degree of correlation values. The alpha and beta weights are defined, and the authors mention that alpha is set to 0.5, thus beta as well to fulfil with the alpha + beta = 1. Why a value of 0.5 has been selected? How this affects the overall results and conclusions extracted? The finding that many catalytically relevant residues are identified in the communities is not surprising given that such sites usually present a high conservation score.
This is an excellent question. Our present formulation allows the user to easily assess the influence of coevolution and dynamic couplings on the output. Setting alpha to 0.5 weights evolutionary and dynamics information equally and has shown promising results in SHV-1 and PDC-3. As presented in the manuscript, setting alpha to 1, i.e., using coevolution data alone, does not let us identify critical residues effectively, as all residues are included in the set (Supplementary Fig. S4 and S5). In future work, we would like to investigate the effect of scanning alpha from 0 to 1 on the final residue list, possibly on a larger set of proteins and protein families.
We would also like to point out that some of the residue pairs with coevolution scores in the top 1% have J-scores set to 0, as they lacked significant coupling to the functional dynamics.
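The weighting discussed above can be illustrated with a minimal sketch. It assumes that both matrices are min-max scaled before summation and that "lacking significant coupling" is decided by a simple cutoff on the dynamics score; the function names, the scaling, and the `dyn_cutoff` parameter are hypothetical and are not taken from DyNoPy itself.

```python
import numpy as np

def min_max_scale(m):
    """Scale a matrix to [0, 1]; a constant matrix maps to all zeros."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    return np.zeros_like(m) if span == 0 else (m - m.min()) / span

def combined_score(coevolution, dynamics, alpha=0.5, dyn_cutoff=0.0):
    """J_ij = alpha * scaled coevolution + beta * scaled dynamics, beta = 1 - alpha.

    Pairs whose raw dynamics score does not exceed `dyn_cutoff` (a hypothetical
    significance threshold) get J_ij = 0, however strongly they coevolve.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha
    coevolution = np.asarray(coevolution, dtype=float)
    dynamics = np.asarray(dynamics, dtype=float)
    j = alpha * min_max_scale(coevolution) + beta * min_max_scale(dynamics)
    j[dynamics <= dyn_cutoff] = 0.0  # drop pairs without dynamic coupling
    return j

# Toy 3-residue example with made-up scores.
coev = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.2], [0.5, 0.2, 0.0]]
dyn = [[0.0, 0.0, 0.8], [0.0, 0.0, 0.4], [0.8, 0.4, 0.0]]
J = combined_score(coev, dyn, alpha=0.5, dyn_cutoff=0.1)
```

In this toy input the pair (0, 1) has the highest coevolution score but no dynamic coupling, so its J-score is zero, mirroring the behaviour described in the response.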
(3) Another important point that needs further explanation is the selection of the relevant descriptor of protein dynamics. In this study two different strategies have been used (one more global the other more local), but more details should be provided regarding their choice. What is the best strategy according to the authors? Why not using the same strategy for both related systems? The obtained results using one methodology or the other will have a large impact on the dynamical score. Another related point is: what is the impact of the MD simulation length, how the MSA is generated and number of sequences used for MSA construction?
As in the case of many complex proteins, the flow of information occurs in β-lactamases via structural interactions (https://doi.org/10.7554/eLife.66567). These interactions occur on a local level, as in the case of binding site residues or residues immediately surrounding the binding site, but there are also interactions far away (>20 Å) from the binding site that can alter function. We have obtained this information from extensive surveys of clinical isolates and experimental data. To account for such interactions, a more global approach has to be taken. To answer the reviewer’s question: each system is unique and there is no single fixed strategy. In short, the method used should be able to denoise information, and the user is advised to fine-tune their findings by corroborating them with experimental and clinical information.
The length of MD simulations is also system specific. Some systems effectively sample the functional dynamics within a shorter simulation time, while others take a long timescale MD simulation to converge. The results won’t change as long as the simulation has effectively sampled the functional dynamics associated with biological function.
The MSA is generated by the HH-Suite package as mentioned on Page 19 Line 19. More specifically, the MSA is constructed based on the UniRef30 database, where sequences are clustered, and each cluster contains sequences with at least 30% sequence identity. This provides a non-redundant set of protein sequences. Our package allows the automatic generation of MSAs from the database. For SHV-1, the alignment contains 18,175 protein sequences and for PDC-3, the alignment consists of 27,892 protein sequences. Full details of this protocol are published in Bibik et al. (https://doi.org/10.1093/bioinformatics/btae166). We have revised the methods section to include these details.
Other Minor Alterations
(1) ‘Fig. S1 and S2’ has been changed to ‘Supplementary Fig. S1 and S2’ for consistency (Page 6 Line 12)
(2) ‘Figure 5B’ has been changed to ‘Fig. 5B’ for consistency (Page 16 Line 11)
(3) All instances of ‘Figure’ have been changed to ‘Fig.’ in the SI for consistency
(4) As suggested, an alteration has been made to Step 1 of Fig. 1.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Evidence, reproducibility and clarity):
Summary:
In this manuscript, Hammond et al. study robustness of the vertebrate segmentation clock against morphogenetic processes such as cell ingression, cell movement and cell division to ask whether the segmentation clock and morphogenesis are modular or not. The modularity of these two would be important for evolvability of the segmenting system. The authors adopt a previously proposed 3D model of the presomitic mesoderm (Uriu et al. 2021 eLife) and include new elements; different types of cell ingression, tissue compaction and cell cycles. Based on the results of numerical simulations that synchrony of the segmentation clock is robust, the authors conclude that there is a modularity in the segmentation clock and morphogenetic processes. The presented results support the conclusion. The manuscript is clearly written. I have several comments that could help the authors further strengthen their arguments.
Major comment:
[Optional] In both the current model and Uriu et al. 2021, coupling delay in the phase oscillator model is not considered. Given that several previous studies (e.g. Lewis 2003, Herrgen et al. 2010, Yoshioka-Kobayashi et al. 2020) suggested the presence of coupling delays in Delta-Notch signaling, could the authors analyze the effect of coupling delay on robustness of the segmentation clock against morphogenetic processes?
We thank the reviewer for the suggestion. Owing to the computational demands of including such a delay in the model, we cannot feasibly repeat every simulation analysed here in the presence of delay. We would like to note that the increased computational demand that delays put on the simulations is also the reason why Uriu et al. 2021 did not include it, as stated in their published exchange with reviewers. However, analogous to our analysis in figure 7, we can analyse how varying the position of progenitor cell ingression affects synchrony in the presence of the coupling delay measured in zebrafish by Herrgen et al. (2010). We show this analysis in a new figure 8 (8B, specifically), on page 21, and discuss its implications in the text on pages 20-22. Our analysis reveals that the model cannot recover synchrony using the default parameters used by Uriu et al. (2021) and reveals a much stronger dependence on the rate of cell mixing (vs) than shown in the instantaneous coupling case (cf. figure 7). However, by systematically varying the value of the delay we find that a relatively minor increase in the delay is sufficient to recover synchrony using the parameter set of Uriu et al. (see figure 8C). Repeating this across the three scenarios of cell ingression, we see that the combination of coupling strength and delay determines the robustness of synchrony to varying position of cell ingression. This suggests that the combination of these two parameters constrains the evolution of morphogenesis.
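To make the role of a coupling delay concrete, the following is a deliberately simplified sketch: identical phase oscillators with all-to-all delayed coupling, integrated with the Euler method. It is not the three-dimensional model of the manuscript (there is no cell movement, ingression, noise, or neighbour-limited coupling), and all function names and parameter values are illustrative assumptions.

```python
import numpy as np

def order_parameter(theta):
    """Kuramoto order parameter r in [0, 1]; r = 1 means perfect synchrony."""
    return float(np.abs(np.mean(np.exp(1j * np.asarray(theta)))))

def simulate_delayed_kuramoto(n=50, omega=1.0, kappa=0.5, tau=0.0,
                              dt=0.01, t_end=50.0, seed=0):
    """Euler integration of all-to-all phase oscillators with coupling delay tau.

    d(theta_i)/dt = omega + kappa * mean_j sin(theta_j(t - tau) - theta_i(t))
    Returns the phases at t = t_end.
    """
    rng = np.random.default_rng(seed)
    delay_steps = int(round(tau / dt))
    n_steps = int(round(t_end / dt))
    theta0 = rng.uniform(0.0, 2.0 * np.pi, size=n)
    history = np.empty((n_steps + delay_steps + 1, n))
    # History before t = 0: each oscillator runs freely at frequency omega.
    for k in range(delay_steps + 1):
        history[k] = theta0 + omega * (k - delay_steps) * dt
    for k in range(delay_steps, delay_steps + n_steps):
        now, past = history[k], history[k - delay_steps]
        # Mean over j of sin(theta_j(t - tau) - theta_i(t)) for every i.
        coupling = np.mean(np.sin(past[None, :] - now[:, None]), axis=1)
        history[k + 1] = now + dt * (omega + kappa * coupling)
    return history[-1]

r_instant = order_parameter(simulate_delayed_kuramoto(tau=0.0))
r_delayed = order_parameter(simulate_delayed_kuramoto(tau=0.5))
```

The delay requires storing a history of past phases, which is one reason delayed models are more expensive to simulate than instantaneous ones. Scanning `tau` and `kappa` in such a toy model gives a feel for how the two parameters jointly determine synchrony, but it says nothing quantitative about the published results.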
Minor comments:
- PSM radius and oscillation synchrony are both denoted by the same alphabet r. The authors should use different alphabets for these two to avoid confusion.
We thank the reviewer for spotting this. This has now been changed throughout to rT, as shorthand for ‘radius of tissue’.
- page 5 Figure 1 caption: (x-x_a/L) should be (x-x_a)/L.
We thank the reviewer for spotting this. This has now been corrected.
- Figure 3C: Description of black crosses in the panels is required in the figure legend.
Thank you for spotting this. The legend has now been corrected.
- Figure 3C another comment: In this panel, synchrony r at the anterior PSM is shown. It is true that synchrony at anterior PSM is most relevant for normal segment formation. However, in this case, the mobility profile is changed, so it may be appropriate to show how synchrony at mid and posterior PSM would depend on changes in mobility profile. Is synchrony improved by cell mobility at the region where cell ingression happens?
We thank the reviewer for the suggestion. We have now plotted the synchrony along the AP axis for varying motility profiles, and this can be seen in figure 3 supplement 1, and is briefly discussed in the text on page 11. We show that while the synchrony varies with x-position (as already expected, see figure 2), there is no trend associated with the shape of the motility profile.
- In page 12, the authors state that "the results for the DP and DP+LV cases are exactly equal for L = 185 um, as .... and the two ingression methods are numerically equivalent in the model". I understood that in this case two ingression methods are equivalent, but I do not understand why the results are "exactly" equal, given the presence of stochasticity in the model.
These results can be exactly equal despite the simulations being stochastic because they were both initialised using the same ‘seed’ in the source code. However, we now see that this might be confusing to the reader, and we have re-generated this figure but this time initialising the simulations for each ingression scenario using a different seed value. This is now reflected in the text on page 12 and in figure 4.
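The effect of a shared seed can be shown with a minimal sketch. It uses NumPy's random generator and a single noisy phase oscillator as a stand-in; the function name and parameter values are hypothetical and unrelated to the authors' source code.

```python
import numpy as np

def noisy_phase_trace(seed, n_steps=1000, omega=1.0, noise=0.1, dt=0.01):
    """Integrate one noisy phase oscillator; `seed` fixes the noise realisation."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    trace = np.empty(n_steps)
    for k in range(n_steps):
        # Euler-Maruyama step: deterministic drift plus Gaussian noise.
        theta += omega * dt + noise * np.sqrt(dt) * rng.standard_normal()
        trace[k] = theta
    return trace

same_a = noisy_phase_trace(seed=42)
same_b = noisy_phase_trace(seed=42)   # identical seed, identical trajectory
other = noisy_phase_trace(seed=43)    # different seed, different trajectory
```

Two stochastic runs started from the same seed draw the same sequence of random numbers, so if the rest of the simulation is numerically equivalent the outputs match exactly rather than only statistically.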
- The authors analyze the effect of cell density on oscillation synchrony in Fig. 4 and they mention that higher density increases robustness of the clock by increasing the average number of interacting neighbours. I think it would be helpful to plot the average number of neighbouring cells in simulations as a function of density to quantitatively support the claim.
We thank the reviewer for their suggestion. Distributions of neighbour numbers for exemplar simulations with varying density can now be found in figure 4 supplement 1 and are referred to in the text on page 11.
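The relationship between density and neighbour number can be sketched as follows. This toy calculation places cells uniformly at random in a cubic box and counts neighbours within a fixed interaction radius; it ignores volume exclusion and the tissue geometry of the model, and the box size, radius, and density values are arbitrary assumptions.

```python
import numpy as np

def mean_neighbour_count(density, box=8.0, radius=1.0, seed=0):
    """Mean number of cells within `radius` of each cell at a given number density.

    Cells are placed uniformly at random in a cubic box with open boundaries,
    so the result only illustrates the trend with density.
    """
    rng = np.random.default_rng(seed)
    n_cells = int(round(density * box ** 3))
    pos = rng.uniform(0.0, box, size=(n_cells, 3))
    # Pairwise squared distances between all cells.
    diff = pos[:, None, :] - pos[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    within = dist2 <= radius ** 2
    np.fill_diagonal(within, False)  # a cell is not its own neighbour
    return float(within.sum(axis=1).mean())

densities = [0.5, 1.0, 1.5]
counts = [mean_neighbour_count(d) for d in densities]
```

For uniformly placed points the expected count grows roughly linearly with density, which is the intuition behind the claim that denser tissue gives each cell more coupling partners.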
- The authors analyze the effect of PSM length on synchrony in Fig. 4. I think kymographs of synchrony r as shown in Fig. 2D would also be helpful to show that indeed cells get synchronized while advecting through a longer PSM.
We thank the reviewer for their suggestion and agree that visualising the data in this way is an excellent idea. We have generated the suggested kymographs and added them to figure 4 as supplements 2 and 4, and discussed these results in the text on page 12.
- I understand that cells in M phase can interact with neighboring cells with the same coupling strength kappa in the model, although their clocks are arrested. If so, this aspect should be also mentioned in the main text in page 16, as this coupling can be another noise source for synchrony.
We agree this is an important clarification. We explicitly state this, and briefly justify our choice, in the text on page 16.
- Figure 5-figure supplement 2: panel labels A, B, C are missing.
Thank you for bringing this to our attention. These have now been added.
- Figure 5-figure supplement 3: panel labels A, B, C are missing.
Thank you for bringing this to our attention. These have now been added.
Reviewer #1 (Significance):
Synchronization of the segmentation clock has been studied by mathematical modeling, but most previous studies considered cells in a static tissue without morphogenesis. In the previous study by Uriu et al. 2021, morphogenetic processes such as cell advection due to tissue elongation, tissue shortening, and cell mobility were considered in synchronization. The current manuscript provides methodological advances in this aspect by newly including cell ingression, tissue compaction and cell cycle. In addition, the authors bring a concept of modularity and evolvability to the field of the vertebrate segmentation clock, which is new. On the other hand, the manuscript confirms that the synchronization of the segmentation clock is robust by careful simulations, but it does not propose or reveal new mechanisms for making it robust or modular. The main targets of the manuscript will be researchers working on somitogenesis and evolutionary biologists who are interested in evolution of developmental systems. The manuscript will also be of interest to broader audiences, like developmental biologists, biophysicists, and physicists and computer scientists who are working on dynamical systems.
We thank the reviewer for their interest in our manuscript and for acknowledging us as one of the first to address the modularity and evolvability of somitogenesis. We hope that this work will encourage others to think about these concepts in this system too.
In the original submission, we identified a high enough coupling strength as the main mechanism underlying the identified modularity in somitogenesis. Since then, we have included an analysis of the coupling delay and find that it is the interplay between coupling strength and coupling delay that mediates the identified modularity, allowing PSM morphogenesis and the segmentation clock to evolve independently in regions of parameter space that are constrained and determined by the interplay between these two parameters. We have now added an extra figure (figure 8) where we explore this interplay and have discussed it at length in the last section of the results and in the discussion. We again thank the reviewer for encouraging us to include delays in our analysis.
Reviewer #2 (Evidence, reproducibility and clarity):
SUMMARY
The manuscript from Hammond et al., investigates the modularity of the segmentation clock and morphogenesis in early vertebrate development, focusing on how these processes might independently evolve to influence the diversity of segment numbers across vertebrates.
Methodology: The study uses a previously published computational model, parameterized for zebrafish, to simulate and analyse the interactions between the segmentation clock and the morphogenesis of the pre-somitic mesoderm (PSM). Their model integrates cell advection, motility, compaction, cell division, and the synchronization of the embryo clock. Three alternative scenarios of PSM morphogenesis were modeled to examine how these changes affect the segmentation clock.
Model System: The computational model system combines a representation of cell movements and the phase oscillator dynamics of the segmentation clock within a three-dimensional horseshoe-shaped domain mimicking the geometry of the vertebrate embryo PSM. The parameters used for the mathematical model are mostly estimated from previously published experimental findings.
Key Findings and Conclusions: (1) The segmentation clock was found to be broadly robust against variations in morphogenetic processes such as cell ingression and motility; (2) Changes in the length of the PSM and the strength of phase coupling within the clock significantly influenced the system's robustness; (3) The authors conclude that the segmentation clock and PSM morphogenesis exhibited developmental modularity (i.e. relative independence), allowing these two phenomena to evolve independently, and therefore possibly contributing to the diverse segment numbers observed in vertebrates.
MAJOR COMMENTS
(1) The key conclusion drawn by the authors (that there is robustness, and therefore modularity, between the morphogenetic cellular processes modeled and the embryo clock synchronization) stems directly from the modeling results appropriately presented and discussed in the manuscript. The model comprises some strong assumptions, however all have been clearly explained and the parameterization choices are supported by experimental findings, providing biological meaning to the model. Estimated parameters are well explained and seem reasonable assumptions (from the embryology perspective).
We thank the reviewer for their positive comments about our work.
(2) This study, as is, achieves its proposed goal of evaluating the potential robustness of the embryo clock to changes in (some) morphogenetic processes. The authors do not claim that the model used is complete, and they properly identify some limitations, including the lack of cell-cell interactions. Given the recognized importance of cellular physical interactions for successful embryo development, including them in the model would be a significant addition in future studies.
We would like to clarify that the model does include cell-cell interactions as cells interact with their neighbours’ clock phase to synchronise and to avoid occupying the same physical space.
(3) The authors have deposited all the code used for analysis in a public GitHub repository that is updated and available for the research community.
We support open source coding practices.
(4) In page 6, the authors justify their choice of clock parameters for cells ingressing the PSM: "As ingressing cells do not appear to express segmentation clock genes (Mara et al. (2007)), the position at which cells ingress into the PSM can create challenges for clock patterning, as only in the 'off' phase of the clock will ingressing cells be in-phase with their neighbours." However, there are several lines of evidence (in chick and mouse) that some oscillatory clock genes are already being expressed as early as in the gastrulation phase (so prior to PSM ingression) (Freitas et al., 2001 [10.1242/dev.128.24.5139]; Jouve et al., 2002 [10.1242/dev.129.5.1107]; Maia-Fernandes et al., 2024 [10.1371/journal.pone.0297853]). Question: Is this also true in zebrafish? (I.e. is there any recent experimental evidence that the clock genes are not expressed at ingression, since the paper cited to support this assumption is from 2007). If they are expressed in zebrafish (as they are in mouse and chick), then the cell addition should have random clock gene periods when they enter the PSM and not start all with a constant initial phase of zero. Probably this will not impact the results since the cells will also be out of phase with their neighbours when they "ingress", however, it will model more closely the biological scenario (and avoid such criticism).
We thank the reviewer for their comments. While it is known that in zebrafish the clock begins oscillating during epiboly and before the onset of segmentation (Riedel-Kruse et al., 2007), to our knowledge no-one has examined whether posteriorly or laterally ingressing progenitor cells express clock genes prior to their ingression into the PSM, which occurs later in development than the first oscillations which give rise to the first somites. We have not found any published evidence of her/hes gene expression in the dorsal donor tissues or lateral tissues surrounding the PSM, however we acknowledge that this has not been actively studied before and our assumption relies on an absence of evidence, rather than evidence of absence.
However, we agree with the reviewer that one should include such an analysis for completeness, and we have now generated additional simulations where progenitor cells ingress with a random clock phase. This data is presented in figure 2 supplement 1 and mentioned in the main text on page 9.
MINOR COMMENTS
(1) The citations are appropriate and cover the major labs that have published work related to this study (although with some overrepresentation of the lab that published the model used).
We have cited the vast literature on somitogenesis to the best of our ability and do recognise that the work of the Oates lab appears prominently, but this is probably because their experimental data were originally used to parametrise the model in Uriu et al. 2021.
(2) The text is clear, carefully written, and both the methods and the reasoning behind them are clearly explained and supported by proper citations.
We are very glad to see that the reviewer found that the manuscript was clearly presented.
(3) The figures are comprehensive, properly annotated, with explanatory self-contained legends. I have no comments regarding the presentation of the results.
Thank you.
(4) Minor suggestions:
a. Page 26: In the Cell addition sub-section of the Methods section, correct all instances where the word domain is used, but subdomain should be used (for clarity and coherence with the description of the model, stated as having a single domain comprising 3 subdomains).
We thank the reviewer for raising this, this is a good point. We have now corrected to ‘subdomain’ where appropriate.
b. Page 32: Table 1. Parameter values used in our work, unless otherwise stated -> Suggestion: Add a column with the individual citations used for each parameter (to facilitate the confirmation of each corresponding reference).
Thank you for the suggestion, we have now done this (see table 1 page 36).
Reviewer #2 (Significance):
GENERAL ASSESSMENT
This study uses a previously published model to simulate alternative scenarios of morphogenetic parameters to infer the potential independence (termed here modularity) between the segmentation clock and a set of morphogenetic processes, arguing that such modularity could allow the evolution of more flexible body plans, therefore partially explaining the variability in the number of segments observed in the vertebrates. This question is fundamental and relevant, yet still poorly researched. This work provides a comprehensive simulation with a model that tries to simplify the many morphogenetic processes described in the literature, reducing it to a few core fundamental processes that allow drawing the conclusions sought. It provides theoretical insight to support a conceptual advance in the field of evolutionary vertebrate embryology.
ADVANCE
This study builds on a model recently published by Uriu et al. (eLife, 2021) that incorporates quantitative experimental data within a modeling framework including cell and tissue-level parameters, allowing the study of multiscale phenomena active during zebrafish embryo segmentation. Uriu's publication reports many relevant and often non-intuitive insights uncovered by the model, most notably the description of phase vortices formed by the synchronizing genetic oscillators interfering with the traveling-wave front pattern. However, this model can be further explored to ask additional questions beyond those described in the original paper. A good example is the present study, which uses this mathematical framework to investigate the potential independence between two of the modeled processes, thereby extracting extra knowledge from it. Accordingly, the present study represents a step forward in the direction of using relevant theoretical frameworks to quantitatively explore the landscape of complex molecular hypotheses in silico, and with it shed some light on fundamental open questions or inform the design of future experiments in the lab.
The study incorporates a wide range of existing literature on the developmental biology of vertebrates. It comprehensively cites prior work, such as the foundational studies by Cooke and Zeeman on the segmentation clock and the role of FGF signaling in PSM development as discussed by Gomez et al. The literature properly covers the breadth of knowledge in this field.
AUDIENCE
Target audience | This study is relevant for fundamental research in developmental biology, specifically targeting researchers who focus on early embryo development and morphogenesis from both experimental and theoretical perspectives. It is also relevant for evolutionary biologists investigating the genetic factors that influence vertebrate evolution, as well as to computational biologists and bioinformatics researchers studying developmental processes and embryology.
Developmental researchers studying the segmentation clock in other vertebrate model organisms (namely mouse and chick), will find this publication especially valuable since it provides insights that can help them formulate new hypotheses to elucidate the molecular mechanisms of the clock (for example finding a set of evolutionarily divergent genes that might interfere with PSM length). Additionally, this study provides a set of cellular parameters that have yet to be measured in mouse and chick, therefore guiding the design of future experiments to measure them, allowing the simulation of the same model with sets of parameters from different vertebrate model organisms, therefore testing the robustness of the findings reported for zebrafish.
Reviewer #3 (Evidence, reproducibility and clarity):
In this manuscript, Verd and colleagues explored how various biologically relevant factors influence the robustness of clock dynamics synchronization among neighboring cells within the context of somitogenesis, adapting a mathematical model presented by Uriu et al. in 2021 in a similar context. Specifically they show that clock dynamics is robust to different biological mechanisms such as cell infusion, cellular motility, compaction-extension and cell-division. On the other hand, the length of Presomitic Mesoderm (PSM) and density of cells in it has a significant role in the robustness of clock dynamics. While the manuscript is well-written and provides clear descriptions of methods and technical details, it tends to be somewhat lengthy.
Below are the comments I would like the authors to address:
(1) The authors mention that "...the model is three dimensional and so can quantitatively recapture the rates of cell mixing that we observe in the PSM". I am not convinced with this justification of using a 3D model. None of the effects the authors explore in this manuscript requires a three dimensional model or full physical description of the cellular mechanics such as excluded volume interaction etc. A one-dimensional model characterized by cell position along the arclength of PSM and somatic region and segmentation clock phase θ can incorporate all the physics authors described in this manuscript as well as significantly computationally cheap allowing the authors to explore the effect of different parameters in greater detail.
One of the main objectives of the work we present in this manuscript is to assess how the evolution of PSM morphogenesis affects, or does not affect, segment patterning. The PSM is a three-dimensional tissue with differing cell rearrangement dynamics along its anterior-posterior axis. In addition, PSM dimension, density, the rearrangement rate, and patterns of cell ingression all vary across vertebrate species, and they are functional, especially cell mixing as it promotes synchronisation and drives elongation. In order to answer questions on the modularity of somitogenesis we therefore consider it absolutely necessary to include a three-dimensional representation of the PSM that captures single cells and their movements. In addition, this will allow us, as Reviewer #2 also pointed out, to reparametrize our model using species-specific data as it becomes available.
While the reviewer is right in that lower dimensional representations would be computationally more efficient, and are generally more tractable, it would not be possible to represent cell mixing in one dimension, as this happens in three dimensions. One could perhaps encode the synchrony-promoting effect of cell mixing via some coupling function κ(x) that increases towards the posterior, however it is unclear what existing biological data one could use to parameterise this function or determine its form. Cell mixing can be modelled in a two-dimensional framework, however this cannot quantitatively recapture the rate of cell mixing observed in vivo, which is an advantage of this model.
Furthermore, it is unclear how one would simulate processes such as compaction-extension using a one-dimensional model. The two different scenarios of cell ingression which we consider can also not be replicated in a one-dimensional model, as having a population of cells re-acquiring synchrony on the dorsal surface of the tissue while new material is added to the ventral side, creating asynchrony, is qualitatively different than a one-dimensional scenario where cells are introduced continuously along the spatial axis.
(2) I am not sure about the justification for limiting the quantification of phase synchrony in a very limited (one cell diameter wide) region at one end of the somatic part (Page 33 below Fig. 9). From my understanding of the manuscript, the segments appear in significant length anterior to this region. Wouldn't an ensemble average of multiple such one cell diameter wide regions in the somatic region be a more accurate metric for quantifying synchrony?
Indeed, such a metric (e.g. as that used by Uriu et al. to quantify synchrony along the x-axis) would be more accurate for determining synchrony within the PSM. However, as per the clock and wavefront model of somitogenesis, only synchrony at the very anterior of the PSM (or at the wavefront, equivalently) is functional for somitogenesis and thus evolution. Therefore, we restrict our analysis to the anterior-most region of the PSM. We now further justify this in the main text on page 9.
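A restricted synchrony measurement of this kind can be sketched as below. The sketch assumes that the anterior end of the PSM sits at x = x_a with the tissue extending towards larger x, and that the sampled region is one cell diameter wide; the function name and the toy data are hypothetical.

```python
import numpy as np

def anterior_synchrony(x, theta, x_a, cell_diameter):
    """Order parameter r over cells with x_a <= x < x_a + cell_diameter.

    Returns NaN when no cell lies inside the sampled slab.
    """
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    in_slab = (x >= x_a) & (x < x_a + cell_diameter)
    if not np.any(in_slab):
        return float("nan")
    return float(np.abs(np.mean(np.exp(1j * theta[in_slab]))))

# Toy data: two in-phase cells at the anterior end, two anti-phase cells elsewhere.
x_pos = [0.0, 0.5, 5.0, 6.0]
phases = [1.0, 1.0, 0.0, np.pi]
r_anterior = anterior_synchrony(x_pos, phases, x_a=0.0, cell_diameter=1.0)
r_posterior = anterior_synchrony(x_pos, phases, x_a=5.0, cell_diameter=2.0)
```

Averaging the same quantity over several adjacent slabs, as the reviewer proposes, would amount to calling the function with shifted `x_a` values and taking the mean.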
(3) While studying the effect of cellular ingression, the authors study three discrete modes- random, DP and DP+LV and show that in the DP+LV mode the clock synchrony becomes affected. I would like the authors to explore this in a continuous fashion from a pure DP ingression to Pure LV ingression and intermediates.
We thank the reviewer for this suggestion; this is a very interesting question. We are currently working on a related computational and experimental project to address the question of how PSM morphogenesis can change over evolutionary time to evolve the different modes that we see across species. As part of this work, we are running precisely the simulations suggested by the reviewer to find regions of parameter space in which all the relevant morphogenetic processes can freely evolve. While interesting, this work is however outside the scope of the current manuscript.
(4) While studying the effect of length and density of cells in PSM on cellular synchrony, the authors restrict to 3 values of density and 6 values of PSM length keeping the other parameter constant. I would be interested to see a phase diagram similar to Fig. 7 in the two-dimensional parameter space of L and ρ0. I am curious if a scaling relation exists for the parameter values that partition the parameter space with and without synchrony.
We thank the reviewer for their suggestion and agree that this would constitute an interesting addition to the manuscript. We have now generated these data, which are shown in figure 4 supplement 5 and mentioned on page 13. We see no clear relationship between these two variables when co-varying in the presence of random ingression.
(5) Both in the abstract and introduction, the authors discuss at a great length about the variability in the number of segments. I am curious how the number and width of the segments observed depend on different parameters related to cellular mechanics and the segmentation clock ?
We thank the reviewer for this question. It was not clear to us if this was something the reviewer wants us to address in the study’s background and introduction, or an analysis we should include in the results. Therefore, we have responded to both comprehensively below:
The prevailing conceptual framework for understanding this is the clock and wavefront model (Cooke and Zeeman, 1976), which posits that the somite length is inversely proportional to the frequency of the clock relative to the speed of the wavefront, and that the total number of segments is the relative frequency multiplied by the total duration of somitogenesis.
Experimentally we know that the frequency is determined in part by the coupling strength (Liao, Jorg, and Oates, 2016), and from comparative embryological studies (Gomez et al., 2008; Steventon et al., 2016) we know that changes in the elongation dynamics of the PSM correlate with changes in somite number, presumably by altering the total duration of somitogenesis (Gomez et al., 2009). These changes in elongation are thought to be driven by the changes in cell and tissue mechanics we test in our manuscript.
Within our model, we cannot in general predict how the number of segments responds to changes in either clock parameters or cell mechanical parameters, as we lack understanding of what causes somitogenesis to cease; this is thus not encoded in our model and segmentation can in principle proceed indefinitely. Therefore, we have not performed this analysis.
Similarly, we have not included an analysis of somite length. This is for two reasons: 1) as per the clock and wavefront model, the frequency at the PSM anterior (which we analyse) is equivalent to this measurement, as we assume (in general) the wavefront ($x = x_{a}$) is inertial. 2) the length of the nascent somite is not thought to be of much relevance to the adult phenotype, and by extension evolution. Somites undergo cell division and growth soon after their patterning by the segmentation clock, therefore their final size does not majorly depend on the dynamics of the segmentation clock. Rather, the main function of the clock is to control their number (and polarity).
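The clock and wavefront relations quoted in this response can be sketched numerically. This is an illustrative sketch only, not the model used in the manuscript: the function names and the numerical values are hypothetical, and the relations are the textbook ones stated above (somite length as wavefront speed divided by clock frequency, segment number as clock frequency multiplied by the total duration of somitogenesis).

```python
def somite_length(wavefront_speed, clock_frequency):
    """Length laid down per clock cycle: wavefront speed divided by clock frequency."""
    return wavefront_speed / clock_frequency


def somite_number(clock_frequency, duration):
    """Total number of segments: clock frequency multiplied by total duration."""
    return clock_frequency * duration


# Hypothetical values in arbitrary units, chosen only to illustrate the scaling.
length = somite_length(wavefront_speed=2.0, clock_frequency=0.5)
number = somite_number(clock_frequency=0.5, duration=60.0)
print(length, number)
```

Under these relations, doubling the clock frequency at a fixed wavefront speed halves the somite length and doubles the segment number for a fixed duration.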
(6) The authors assume that the phase dynamics of the chemical network may be described by an oscillator with constant frequency. For the completeness of the manuscript, the authors should discuss in detail for which chemical networks this is a good assumption.
We thank the reviewer for their suggestion and now justify this assumption in the methods on page 31.
Such an assumption is appropriate for the segmentation clock, as the clock in the posterior of the PSM is thought to oscillate with a constant frequency, at least for the majority of somitogenesis, although the frequency of somite formation slows towards the end of this process in zebrafish (Giudicelli et al., 2007, PLoS Biol.). In addition, PSM cells isolated and cultured in the presence of FGF (thus replicating the signalling environment of the posterior PSM) will continue to exhibit her1 oscillations with an apparently constant frequency (Webb et al., 2016).
We note that such formulations are widely used within the segmentation clock literature (e.g. Riedel-Kruse et al., 2007, Morelli et al., 2009).
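A constant-frequency phase oscillator formulation of this kind can be sketched as follows. This is a minimal sketch under stated assumptions, not the model in the manuscript: it uses all-to-all (mean-field) Kuramoto coupling and a simple Euler step, whereas the manuscript couples neighbouring cells in space, and all names and parameter values here are hypothetical. Synchrony is summarised by the Kuramoto order parameter, which equals 1 for identical phases.

```python
import cmath
import math


def order_parameter(phases):
    """Kuramoto order parameter r in [0, 1]; r = 1 means all phases coincide."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def step(phases, omega, coupling, dt):
    """One Euler step of d(theta_i)/dt = omega + coupling * mean_j sin(theta_j - theta_i)."""
    n = len(phases)
    updated = []
    for p in phases:
        interaction = sum(math.sin(q - p) for q in phases) / n
        updated.append(p + dt * (omega + coupling * interaction))
    return updated


def simulate(phases, omega=1.0, coupling=1.0, dt=0.01, steps=2000):
    """Integrate the oscillators with a shared, constant intrinsic frequency omega."""
    for _ in range(steps):
        phases = step(phases, omega, coupling, dt)
    return phases


initial = [0.0, 0.5, 1.0, 2.0]  # hypothetical initial phases (radians)
print(order_parameter(initial), order_parameter(simulate(initial)))
```

With the coupling set to zero, every oscillator advances at the same constant frequency and the order parameter stays at its initial value; with positive coupling, the phases converge.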
(7) Figure 3 and the associated text shows no effect of the cellular motility profile in the synchrony of the segmentation clock. This may be moved to the supplementary considering the length of this manuscript.
Thank you for the suggestion. However, we would argue that the lack of effect is a crucial result when discussing modularity. Reviewer #2 agrees with this assessment.
Reviewer #3 (Significance):
The manuscript answers some important questions in the synchrony of segmentation clock in the vertebrates utilizing a model published earlier. However, the presented result is incomplete in some aspects (points 2 to 5 of section A) and that could be overcome by a more detailed analysis using a simpler one dimensional (point 1 of section A). I believe this manuscript could be of interest to an intersecting audience of developmental biologists, systems biologists, and physicists/engineers interested in dynamical systems.
www.biorxiv.org
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
This work shows that a specific adenosine deaminase protein in Dictyostelium generates the ammonia that is required for tip formation during Dictyostelium development. Cells with an insertion in the ADGF gene aggregate but do not form tips. A remarkable result, shown in several different ways, is that the ADGF mutant can be rescued by exposing the mutant to ammonia gas. The authors also describe other phenotypes of the ADGF mutant such as increased mound size, altered cAMP signalling, and abnormal cell type differentiation. It appears that the ADGF mutant has defects in the expression of a large number of genes, resulting in not only the tip defect but also the mound size, cAMP signalling, and differentiation phenotypes.
Strengths:
The data and statistics are excellent.
Weaknesses:
(1) The key weakness is understanding why the cells bother to use a diffusible gas like ammonia as a signal to form a tip and continue development.
Ammonia serves as a crucial signalling molecule influencing both multicellular organization and differentiation in Dictyostelium (Francis, 1964; Bonner et al., 1989; Bradbury and Gross, 1989). By raising the pH of the intracellular acidic vesicles of prestalk cells (Poole and Ohkuma, 1981; Gross et al., 1983) and of the cytoplasm, ammonia is known to increase the speed of chemotaxing amoebae (Siegert and Weijer, 1989; Van Duijn and Inouye, 1991) and to trigger multicellular movement (Bonner et al., 1988, 1989), favoring tipped mound development. The slug tip is known to release ammonia, and the slime sheath at the back of the slug prevents its diffusion (Bonner et al., 1989); the resulting high ammonia levels at the back help prespore differentiation (Newell et al., 1969). Ammonia favors slug migration over fruiting (Schindler and Sussman, 1977), and thus ammonia coming from the tip may favor synchronized development of the entire colony. The tip exhibits negative chemotaxis towards ammonia, probably helping the slugs to move away from each other and ensuring equal spacing of the fruiting bodies (Feit and Sollitto, 1987).
Ammonia released in pulses acts as a long-distance signalling molecule between colonies of yeast cells indicating depletion of nutrient resources and promoting synchronous development (Palkova et al., 1997; Palkova and Forstova, 2000). Similarly, ammonia diffusion may influence neighboring Dictyostelium colonies. Ammonia produced in millimolar concentrations (Schindler and Sussman, 1977) could ward off other predators in soil. For instance, ammonia released by Streptomyces symbionts of leaf-cutting ants is known to inhibit fungal pathogens (Dhodary and Spiteller, 2021). Additionally, ammonia recycling into amino acids, as observed in breast cancer proliferation (Spinelli et al., 2017), may also occur in starving Dictyostelium cells, supporting survival and differentiation. These findings suggest that ammonia acts as both a local and long-range regulatory signal, integrating environmental and cellular cues to coordinate multicellular development.
(2) The rescue of the mutant by adding ammonia gas to the entire culture indicates that ammonia conveys no positional information within the mound.
Ammonia is known to influence rapid patterning of Dictyostelium cells confined in a restricted environment (Sawai et al., 2002). Both neutral red staining (a marker for prestalk and ALCs) (Fig. S2) and the expression of the prestalk markers ecmA/ecmB (Fig. 8C) in the adgf mutants suggest that the mounds have differentiated prestalk cells but are blocked in development. The mound arrest phenotype can be reversed by exposing the adgf mutant mounds to ammonia.
Based on cell cycle phases, there exists a dichotomy of cell types that biases cell fate towards prestalk or prespore (Weeks and Weijer, 1994; Jang and Gomer, 2011). Prestalk cells are enriched in acidic vesicles, and ammonia, by raising the pH of these vesicles and the cytoplasm (Davies et al., 1993; Van Duijn and Inouye, 1991), plays an active role in collective cell movement (Bonner et al., 1989). Thus, ammonia reinforces or maintains the positional information by elevating cAMP levels, favoring prespore differentiation (Bradbury and Gross, 1989; Riley and Barclay, 1990; Hopper et al., 1993).
(3) By the time the cells have formed a mound, the cells have been starving for several hours, and desperately need to form a fruiting body to disperse some of themselves as spores, and thus need to form a tip no matter what.
When the adgf mutants were exposed to ammonia just after tight mound formation, tips developed within 4 h (Fig. 6). In contrast, adgf mounds not exposed to ammonia remained at the mound stage for at least 30 h. This demonstrates that starvation alone is not sufficient to drive tip development and that ammonia serves as a cue that promotes the transition from mound to tipped mound.
Many mound arrest mutants are blocked in development and do not proceed to form fruiting bodies (Carrin et al., 1994). Further, not all the mound arrest mutants tested in this study were rescued by ADA enzyme (Fig. S3 A), and they continue to stay as mounds without dispersing as spores.
(4) One can envision that the local ammonia concentration is possibly informing the mound that some minimal number of cells are present (assuming that the ammonia concentration is proportional to the number of cells), but probably even a minuscule fruiting body would be preferable to the cells compared to a mound. This latter idea could be easily explored by examining the fate of the ADGF cells in the mound - do they all form spores? Do some form spores?
Or perhaps the ADGF is secreted by only one cell type, and the resulting ammonia tells the mound that for some reason that cell type is not present in the mound, allowing some of the cells to transdifferentiate into the needed cell type. Thus, elucidating if all or some cells produce ADGF would greatly strengthen this puzzling story.
A fraction of adgf mounds form bulkier spore heads by the end of 36 h as shown in Fig. 3. This late recovery may be due to the expression of other ADA isoforms. Mixing WT and adgf mutant cell lines results in a slug with the mutants occupying the prestalk region (Fig. 9) suggesting that WT ADGF favours prespore differentiation. However, it is not clear if ADGF is secreted by a particular cell type, as adenosine can be produced by both cell types, and the activity of three other intracellular ADAs may vary between the cell types. To address whether adgf expression is cell type-specific, prestalk and prespore cells will be isolated, and thereafter, adgf expression in each population will be examined.
ADGF activity is likely to be higher in the tip to remove excess adenosine, the tip-inhibiting molecule (Wang and Schaap, 1985), and our results also show that adgf⁻ cells with high adenosine preferentially migrate to the prestalk rather than the prespore region when mixed with WT cells. Ammonia generated from adenosine deamination could thus drive tip development and prespore differentiation.
Reviewer #2 (Public review):
Summary:
The paper describes new insights into the role of adenosine deaminase-related growth factor (ADGF), an enzyme that catalyses the breakdown of adenosine into ammonia and inosine, in tip formation during Dictyostelium development. The ADGF null mutant has a pre-tip mound arrest phenotype, which can be rescued by the external addition of ammonia. Analysis suggests that the phenotype involves changes in cAMP signalling possibly involving a histidine kinase dhkD, but details remain to be resolved.
Strengths:
The generation of an ADGF mutant showed a strong mound arrest phenotype and successful rescue by external ammonia. Characterization of significant changes in cAMP signalling components, suggesting low cAMP signalling in the mutant and identification of the histidine kinase dhkD as a possible component of the transduction pathway. Identification of a change in cell type differentiation towards prestalk fate.
Weaknesses:
(1) Lack of details on the developmental time course of ADGF activity and cell type-specific differences in ADGF expression.
ADGF expression was examined at 0, 8, 12, and 16 h (Fig. 1), and the total ADA activity was assayed at 12 and 16 h (Fig. 4). Previously, the 12 h data were not included, but they have now been added (Fig. 4A). The adgf expression was found to be highest at 16 h and hence, the ADA assay was carried out at that time point. Since the ADA assay will also report the activity of the other three isoforms, it will not exclusively reflect ADGF activity.
A fraction of adgf⁻ mounds form bulkier spore heads by the end of 36 h as shown in Fig. 3. This late recovery may be due to the expression of the other ADA isoforms. Mixing WT and adgf mutant cell lines results in a slug with the mutants occupying the prestalk region (Fig. 9), suggesting that WT adgf favours prespore differentiation. However, it is not clear if ADGF is secreted by a particular cell type, as adenosine can be produced by both cell types, and the activity of the other three intracellular ADAs may vary between the cell types. To address whether adgf expression is cell type-specific, prestalk and prespore cells will be isolated, and thereafter, adgf expression in each population will be examined.
ADGF activity is likely to be higher in the tip to remove excess adenosine, the tip-inhibiting molecule (Wang and Schaap, 1985), and our results also show that adgf⁻ cells with high adenosine preferentially migrate to the prestalk rather than the prespore region when mixed with WT cells.
(2) The absence of measurements to show that ammonia addition to the null mutant can rescue the proposed defects in cAMP signalling.
The cAMP levels were measured at two time points 8 h and 12 h in the mutant. The adgf mutant in comparison to WT has lower ammonia levels (Fig. 6), diminished acaA expression (Fig. 7) and reduced cAMP levels (Fig. 7) both at 12 and 16 h of development. Ammonia is known to increase cAMP levels (Riley and Barclay, 1990; Feit et al., 2001). Thus, ammonia addition to the mutant is likely to increase acaA expression and thereby cAMP levels, rescuing the defects in cAMP signalling.
(3) No direct measurements in the dhkD mutant to show that it acts upstream of adgf in the control of changes in cAMP signalling and tip formation.
The histidine kinases dhkD and dhkC are reported to modulate phosphodiesterase RegA activity, thereby maintaining cAMP levels (Singleton et al., 1998; Singleton and Xiong, 2013). By activating RegA, dhkD ensures proper cAMP distribution within the mound, which is essential for the patterning of prestalk and prespore cells, as well as for tip formation (Singleton and Xiong, 2013). Therefore, ammonia exposure to dhkD mutants is likely to regulate cAMP signalling and thereby tip formation.
www.biorxiv.org
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary
Farkas and colleagues conducted a comparative neuroimaging study with domestic dogs and humans to explore whether social perception in both species is underpinned by an analogous distinction between animate and inanimate entities, an established functional organizing principle in the primate and human brain. Presenting domestic dogs and humans with clips of three animate classes (dogs, humans, cats) and one inanimate control (cars), the authors also set out to compare how dogs and humans perceive their own vs other species. Both research questions have been previously studied in dogs, but the authors used novel dynamic stimuli and added animate and inanimate classes, which have not been investigated before (i.e., cats and cars). Combining univariate and multivariate analysis approaches, they identified functionally analogous areas in the dog and human occipitotemporal cortex involved in the perception of animate entities, largely replicating previous observations. This further emphasizes a potentially shared functional organizing principle of social perception in the two species. The authors also describe between-species divergences in the perception of the different animate classes, arguing for a less generalized perception of animate entities in dogs, but this conclusion is not convincingly supported by the applied analyses and reported findings.
Strengths
Domestic dogs represent a compelling model species to study the neural bases of social perception and potentially shared functional organizing principles with humans and primates. The field of comparative neuroimaging with dogs is still young, with a growing but still small number of studies, and the present study exemplifies the reproducibility of previous research. Using dynamic instead of static stimuli and adding new stimuli classes, Farkas and colleagues successfully replicated and expanded previous findings, adding to the growing body of evidence that social perception is underpinned by a shared functional organizing principle in the dog and human occipito-temporal cortex.
Weaknesses
The study design is imbalanced, with only one category of inanimate objects vs. three animate entities. Moreover, based on the example videos, it appears that the animate stimuli also differed in the complexity of the content from the car stimuli, with often multiple agents interacting or performing goal-directed actions. Moreover, while dogs are familiar with cars, they are definitely of lower relevance and interest to them than the animate stimuli. Thus, to a certain extent, the results might also reflect differences in attention towards/salience of the stimuli.
We agree with the Reviewer and were aware that using only one class of inanimate objects but three classes of animate entities, along with the differences in complexity and relevance between the animate and the inanimate stimuli potentially elicited more attention to the inanimate condition and may have thus introduced a confound. We are revising the related limitation in the discussion to acknowledge this and to emphasize why we believe these differences do not compromise our main findings.
The methods section and rationale behind the chosen approaches were often difficult to follow and lacked a lot of information, which makes it difficult to judge the evidence and the drawn conclusions, and it weakens the potential for reproducibility of this work. For example, for many preprocessing and analysis steps, parameters were missing or descriptions of the tools used, no information on anatomical masks and atlas used in humans was provided, and it is often not clear if the authors are referring to the univariate or multivariate analysis.
We acknowledge the concerns regarding the clarity and completeness of the methods section and are significantly revising the descriptions of the methods. Of note, in humans, the Harvard-Oxford Cortical Structural Atlas (Frazier et al., 2005; Makris et al., 2006; Desikan et al., 2006; Goldstein et al., 2007), implemented within the FSL software package, was used for anatomical masks, while the Automated Anatomical Labeling atlas (Tzourio-Mazoyer et al., 2002) was used for assigning labels.
In regard to the chosen approaches and rationale, the authors generally binarize a lot of rich information. Instead of directly testing potential differences in the neural representations of the different animate entities, they binarize dissimilarity maps for, e.g. animate entity > inanimate cars and then calculate the overlap between the maps.
We thank the Reviewer for these comments and ideas. We also appreciate the second Reviewer for their related concerns and suggestions about the overlap calculation. Since the neural processing of different animate entities in the dog brain is largely unexplored, in some of our analyses we aimed to provide a straightforward and directly comparable characterization of animacy perception in the two species. We believe that a measure of how overlapping the neural representations of different animate classes are in the dog vs. the human visual cortex is a simple but meaningful and insightful characterization of how animacy perception is structured in the two species, despite the lack of spatial detail. Our decision to use binarization was based on these considerations. In response to this Reviewer’s request for providing richer information, in our revised manuscript, we will present more details and additional non-binarized calculations. Specifically, we are going to use non-binarized data to present the response profiles of a broad, anatomically defined set of regions that have been related in other works to visual functions, to show where there are significant differences and overlaps between the neural responses for the three animate classes in each species.
The comparison of the overlap of these three maps between species is also problematic, considering that the human RSA was constricted to the occipital and temporal cortex (there is no information on how they defined it) vs. whole-brain in dogs.
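The idea of binarizing per-class maps and measuring their overlap can be sketched as follows. This is a hedged illustration only: the exact overlap measure used in the manuscript is not specified in this response, so the sketch assumes an intersection-over-union fraction, and the function names, threshold, and voxel values are hypothetical.

```python
def binarize(values, threshold):
    """Turn a per-voxel statistic map into a boolean mask (True = above threshold)."""
    return [v > threshold for v in values]


def overlap_fraction(*masks):
    """Voxels active in all masks divided by voxels active in at least one mask."""
    union = sum(any(voxel) for voxel in zip(*masks))
    intersection = sum(all(voxel) for voxel in zip(*masks))
    return intersection / union if union else 0.0


# Hypothetical statistic values for four voxels, one map per animate class vs. cars.
dog_map = binarize([3.2, 0.1, 2.5, 0.0], threshold=2.0)
human_map = binarize([2.9, 2.7, 0.3, 0.0], threshold=2.0)
cat_map = binarize([3.1, 0.2, 0.4, 0.0], threshold=2.0)

print(overlap_fraction(dog_map, human_map, cat_map))
```

Binarization discards response magnitude, which is the Reviewer's concern; a non-binarized alternative would compare the per-voxel values directly.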
We thank this Reviewer for raising yet another relevant point about overlap calculation. We note that the overlap calculation for univariate results used the visually responsive cortex in both dogs and humans. The decision to restrict the multivariate analysis to the occipital and temporal lobes in humans, where the visual areas are, was to reduce computational load. Since RSA in dogs yielded significant voxels almost exclusively in the occipital and temporal cortices, we believe this decision did not introduce major bias in our results. This concern will also be discussed in our revised submission.
Of note, in the category- and class-boundary test, as for the other multivariate tests, the occipital and temporal cortex of humans was delineated based on the MNI atlas.
Considering that the stimuli do differ based on low-level visual properties (just not significantly within a run), the RSA would also allow the authors to directly test if some of the (dis)similarities might be driven by low-level visual features like they, e.g. did with the early visual cortex model. I do think RSA is generally an excellent choice to investigate the neural representation of animate (and inanimate) stimuli, but the authors should apply it more appropriately and use its full potential.
We thank the Reviewer for this suggestion. While this study did not aim to investigate the correlation between low-level visual features and animacy, the data is available, and the suggested analysis can be conducted in the future. This issue will also be discussed in our revised submission.
The authors localized some of the "animate areas" also with the early visual cortex model (e.g. ectomarginal gyrus, mid suprasylvian); in humans, it only included the known early visual cortex - what does this mean for the animate areas in dogs?
We thank the Reviewer for raising this point. Although the labels are the same, both EMG and mSSG are relatively large gyri, and the clusters revealed by each of the two analyses hardly overlap, with peak coordinates more than 12 mm apart for R EMG, and in different hemispheres for mSSG (but more than 11 mm apart even if projected on the same hemisphere). We will detail the differences and the overlaps in the revised submission.
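The kind of peak-separation check described above can be sketched as a Euclidean distance between peak coordinates, optionally mirroring both peaks onto the same hemisphere first. This is a sketch under stated assumptions: it assumes coordinates in millimetres with x = 0 at the midline, and the function name and example coordinates are hypothetical rather than taken from the manuscript.

```python
import math


def peak_distance(peak_a, peak_b, same_hemisphere=False):
    """Euclidean distance (mm) between two peaks given as (x, y, z) tuples.

    If same_hemisphere is True, both peaks are mirrored onto the same side
    by taking the absolute value of x (assumes x = 0 is the midline).
    """
    if same_hemisphere:
        peak_a = (abs(peak_a[0]), peak_a[1], peak_a[2])
        peak_b = (abs(peak_b[0]), peak_b[1], peak_b[2])
    return math.dist(peak_a, peak_b)


# Hypothetical peaks in opposite hemispheres.
left_peak = (-3.0, 0.0, 0.0)
right_peak = (3.0, 4.0, 0.0)
print(peak_distance(left_peak, right_peak))
print(peak_distance(left_peak, right_peak, same_hemisphere=True))
```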
The results section also lacks information and statistical evidence; for example, for the univariate region-of-interest (ROI) analysis (called response profiles) comparing activation strength towards each stimulus type, it is not reported if comparisons were significant or not, but the authors state they conducted t-tests. The authors describe that they created spheres on all peaks reported for the contrast animate > inanimate, but they only report results for the mid suprasylvian and occipital gyrus (e.g. caudal suprasylvian gyrus is missing).
We thank this Reviewer for catching these errors. The missing statistics will be provided in the revised manuscript. Also, on the figure depicting the response profiles, we mistakenly labelled the peak in the caudal suprasylvian gyrus as occipital gyrus. This will also be corrected.
Furthermore, considering that the ROIs were chosen based on the contrast animate > inanimate stimuli, activation strength should only be compared between animate entities (i.e., dogs, humans, cats), while cars should not be reported (as this would be double dipping, after selecting voxels showing lower activation for that category).
We thank both Reviewers for raising this relevant point about potential double dipping. The aim of this analysis was to describe the relationship between the neural response elicited by the three animate stimulus classes, to show that the animacy-sensitive peaks are not the results of the standalone greater response to a single animate class. We conducted t-tests only to assess significant difference between these three animate conditions and no stats were performed or reported for any animate class vs. inanimate comparisons in these ROIs. In addition to providing the missing t-tests (comparing animate classes), we will present response profiles and corresponding statistics for a broad set of additional, independent ROIs, defined either anatomically or functionally by other studies in the revised version.
The descriptive data in Figure 3B (pending statistical evidence) suggests there were no strong differences in activation for the three species in dog and human animate areas. Thus, the ROI analysis appears to contradict findings from the binary analysis approach to investigate species preference, but the authors only discuss the results of the latter in support of their narrative for conspecific preference in dogs and do not discuss research from other labs investigating own-species preference.
Studying conspecific preference was not the primary aim of this study. We only used our data to characterize the animate-sensitive regions from this aspect. The species-preference test provides an overall characterization of the entire animate-sensitive region, revealing a higher number of voxels with a maximal response to conspecific than to other stimuli in dogs (and a similar tendency in humans), confirming previous evidence on neural conspecific preference in visual areas in both species. The response profiles presented so far describe only the ROIs around the main animate-sensitive peaks and, as the Reviewer points out, in most cases reveal no significant conspecific bias. We believe there is no contradiction here: the entire animate-sensitive region may be weakly, but still, conspecific-preferring, whereas the main animate-sensitive peaks are not; the centers of conspecific preference may be located elsewhere in the visual cortex and may be supported by mechanisms other than animacy-sensitivity. In the revised manuscript, we will elaborate more on this. Additionally, in response to other comments, and for a better and more coherent characterization of species preference (and animacy sensitivity) across the visual cortex, we will present response profiles for other, independently defined regions and explore conspecific-sensitivity in those additional regions as well. Furthermore, we will discuss related own-species preference literature in greater detail.
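A voxel-counting characterization of this kind can be sketched by assigning each voxel to the stimulus class that elicits its maximal response and counting voxels per class. This is an illustrative sketch only: the function name and response values are hypothetical, ties are resolved in favour of the first class listed, and no statistical test on the counts is included.

```python
from collections import Counter


def preferred_class_counts(responses):
    """Count, per stimulus class, the voxels whose maximal response is to that class.

    responses maps a class name to a list of per-voxel response values;
    all lists must have the same length.
    """
    names = list(responses)
    n_voxels = len(responses[names[0]])
    counts = Counter()
    for i in range(n_voxels):
        winner = max(names, key=lambda name: responses[name][i])
        counts[winner] += 1
    return counts


# Hypothetical responses of four voxels in a dog animate-sensitive region.
dog_region = {
    "dog": [1.2, 0.4, 0.9, 0.3],
    "human": [0.8, 0.6, 0.2, 0.1],
    "cat": [0.5, 0.1, 0.3, 0.7],
}
print(preferred_class_counts(dog_region))
```

A region can show a majority of conspecific-preferring voxels in such a count even when the mean response in a small ROI around a peak does not differ between classes, which is the distinction drawn in the response above.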
The authors also unnecessarily exaggerate novelty claims. Animate vs inanimate and own vs other species perceptions have both been investigated before in dogs (and humans), so any claims in that direction seem unsubstantiated - and also not needed, as novelty itself is not a sign of quality; what is novel, and a sign of theoretical advance besides the novelty, are as said the conceptual extension and replication of previous work.
We agree with this Reviewer regarding novelty claims in general, and we confirm that we had no intention to overstate the uniqueness of our results. We also did not mean to imply that this work would be the first one on animacy perception in dogs, which it obviously is not. But we understand that we could have been more explicit presenting our work as a conceptual extension and replication of previous works, and we are revising the wording of the discussion from this aspect.
Overall, more analyses and appropriate tests are needed to support the conclusions drawn by the authors, as well as a more comprehensive discussion of all findings.
We are thankful for all comments. We will revise the methods section to provide sufficient detail and ensure replicability; conduct additional analyses as detailed above; and provide a more comprehensive discussion of all findings.
Reviewer #2 (Public review):
Summary:
The manuscript reports an fMRI study looking at whether there is animacy organization in a non-primate mammal, the domestic dog, that is similar to that observed in humans and non-human primates (NHPs). A simple experiment was carried out with four kinds of stimulus videos (dogs, humans, cats, and cars), and univariate contrasts and RSA searchlight analysis were performed. Previous studies have looked at this question or closely associated questions (e.g. whether there is face selectivity in dogs). The import of the present study is that it looks at multiple types of animate objects, dogs, humans, and cats, and tests whether there was overlapping/similar topography (or magnitude) of responses when these stimuli were compared to the inanimate reference class of cars. The main finding was of some selectivity for animacy though this was primarily driven by the dog stimuli, which did overlap with the other animate stimulus types, but far less so than in humans.
Strengths:
I believe that this is an interesting study in so far as it builds on other recent work looking at category-selectivity in the domestic dog. Given the limited number of such studies, I think it is a natural step to consider a number of different animate stimuli and look at their overlap. While some of the results were not wholly surprising (e.g. dog brains respond more selectively for dogs than humans or cats), that does not take away from their novelty, such as it is. The findings of this study are useful as a point of comparison with other recent work on the organization of high-level visual function in the brain of the domestic dog.
Weaknesses:
(1) One challenge for all studies like this is a lack of clarity when we say there is organization for "animacy" in the human and NHP brains. The challenge is by no means unique to the present study, but I do think it brings up two more specific topics.
First, one property associated with animate things is "capable of self-movement". While cognitively we know that cars require a driver, and are otherwise inanimate, can we really assume that dogs think of cars in the same way? After all, just think of some dogs that chase cars. If dogs represent moving cars as another kind of self-moving thing, then it is not clear we can say from this study that we have a contrast between animate vs inanimate. This would not mean that there are no real differences in neural organization being found.
It was unclear whether all or some of the car videos showed them moving. But if many/most do, then I think this is a concern.
We thank this Reviewer for raising this relevant point about the potential animacy of cars for dogs and its implication for our results. Of note, two-thirds of our car stimuli showed a car moving (slow, accelerating, or fast). We acknowledge that these stimuli contained motion-based animacy cues, and in this regard, there was no clear difference between our animate and inanimate conditions, and possibly between some of the representations they elicited. However, our animate and inanimate stimuli differed in other key factors accounting for animacy organization, such as visual features including the presence of faces, bodies, body parts, postures, and certain aspects of biological motion. So we believe that this limitation does not compromise our main conclusions. We will elaborate on this point further in the revised discussion, also considering how dogs’ differential behavioral responses to cars and animate entities may provide additional insights in this regard.
Second, there is quite a lot of potential complexity in the human case that is worth considering when interpreting the results of this study. In the human case, some evidence suggests that animacy may be more of a continuum (Sha et al. 2015), which may reflect taxonomy (Connolly et al. 2012, 2016). However moving videos seem to be dominated more by signals relevant to threat or predation relative to taxonomy (Nastase et al. 2017). Some evidence suggests that this purported taxonomic organization might be driven by gradation in representing faces and bodies of animals based on their relative similarity to humans (Ritchie et al. 2021). Also, it may be that animacy organization reflects a number of (partially correlated) dimensions (Thorat et al. 2019, Jozwik et al. 2022). One may wonder whether the regions of (partial) overlap in animate responses in the dog brain might have some of these properties as well (or not).
We agree that it would be interesting to dissect which animacy-related factor(s) contribute to the observed animacy sensitivity in different regions, and although this was not the original aim of the study, we agree that we could have made better use of the variation in our stimuli to discuss this aspect. Specifically, some animacy features are shared by all three animate stimulus classes, namely the presence of biological motions, faces, and bodies. In contrast, animate classes differed in some other aspects, for example in how dogs perceived dogs, humans, and cats as social agents and in their potential behavioral goals towards them. It can therefore be argued that regions with two- and especially three-way overlapping activations are more probably involved in processing biological motion, face and body aspects, and non-overlapping ones the social agency- and behavioural goal-related aspects. In line with this, the shared animacy features are indeed ones that have been reported to be central in human animacy representation and that may have made the overlaps in human brain responses greater. We will provide a more detailed discussion of the results from this viewpoint in the revised manuscript.
(2) It is stated that previous studies provide evidence that the dog brain shows selectivity to "certain aspects of animacy". One of these already looked at selectivity for dog and human faces and bodies and identified similar regions of activity (Boch et al. 2023). An earlier study by Dilks et al. (2015), not cited in the present work (as far as I can tell), also used dynamic stimuli and did not suffer from the above limitations in choosing inanimate stimuli (e.g. using toy and scene objects for inanimate stimuli). But it only included human faces as the dynamic animate stimulus. So, as far as stimulus design, it seems the import of the present study is that it included a *third* animate stimulus (cats) and that the stimuli were dynamic.
We agree with this Reviewer that the findings of Dilks et al. (2015) are relevant to our study and have therefore cited them. However, the citation itself was imprecise and will be corrected in the revised manuscript.
(3) I am concerned that the univariate results, especially those depicted in Figure 3B, include double dipping (Kriegeskorte et al. 2009). The analysis uses the response peak for the A > iA contrast to then look at the magnitude of the D, H, C vs iA contrasts. This means the same data is being used for feature selection and then to estimate the responses. So, the estimates are going to be inflated. For example, the high magnitudes for the three animate stimuli above the inanimate stimuli are going to inherently be inflated by this analysis and cannot be taken at face value. I have the same concern with the selectivity preference results in Figure 3E.
I think the authors have two options here. Either they drop these analyses entirely (so that the total set of analyses really mirrors those in Figure 4), or they modify them to address this concern. I think this could be done in one of two ways. One would be to do a within-subject standard split-half analysis and use one-half of the data for feature selection and the other for magnitude estimation. The other would be to do a between-subject design of some kind, like using one subject for magnitude estimation based on an ROI defined using the data for the other subjects.
We thank both Reviewers again for raising this important point about potential double dipping. We also thank this Reviewer for specific suggestions for split-half analyses – we agree that, had our original analyses involved double dipping, such a modification would be necessary. But, as we explained in our response above, this was not the case. Indeed, whereas we do visualize all four conditions in Fig. 3B, we only conducted t-tests to assess differences between the three animate conditions (the corresponding stats were missing from the original manuscript but will be added during revision). So, importantly, we did not evaluate the magnitude of the D, H, C vs iA contrasts in any of the ROIs defined by animate-sensitive peaks; therefore, we believe that these analyses do not involve double dipping. This holds for the species preference results in Fig. 3E as well. We will clarify this in the revised manuscript. Of note, in response to a request by the other reviewer and to provide richer information about the univariate results, we will also provide response profiles and corresponding stats for a broad set of additional ROIs, defined either anatomically or functionally by other studies (e.g., Boch et al., 2023).
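For readers unfamiliar with the split-half remedy the Reviewer describes, a minimal sketch follows. It is not the authors' pipeline: the array layout `(n_runs, n_conditions, n_voxels)`, the odd/even run split, the top-k voxel ROI rule, and all function and variable names are assumptions made for illustration. One half of the runs selects the voxels and the other half estimates the responses, so selection noise cannot inflate the estimates.

```python
import numpy as np

def split_half_roi_estimates(betas, animate_idx, inanimate_idx, n_voxels_roi=10):
    """Cross-validated ROI response per condition.

    betas: array (n_runs, n_conditions, n_voxels) of per-run response estimates
           (hypothetical layout, assumed for this sketch).
    Returns an array (n_conditions,) averaged over both fold directions.
    """
    betas = np.asarray(betas, dtype=float)
    halves = [np.arange(0, betas.shape[0], 2), np.arange(1, betas.shape[0], 2)]
    estimates = []
    # Fold 1: select on even runs, estimate on odd runs; fold 2: the reverse.
    for select_runs, test_runs in (halves, halves[::-1]):
        select_mean = betas[select_runs].mean(axis=0)  # (n_conditions, n_voxels)
        contrast = select_mean[animate_idx].mean(axis=0) - select_mean[inanimate_idx]
        roi = np.argsort(contrast)[-n_voxels_roi:]     # top-k voxels for animate > inanimate
        test_mean = betas[test_runs].mean(axis=0)      # independent data
        estimates.append(test_mean[:, roi].mean(axis=1))
    return np.mean(estimates, axis=0)

# Toy data: pure noise, conditions ordered dog, human, cat, inanimate.
rng = np.random.default_rng(0)
betas = rng.normal(size=(8, 4, 500))
estimates = split_half_roi_estimates(betas, animate_idx=[0, 1, 2], inanimate_idx=3)
```

With pure noise as input, the animate-minus-inanimate difference in `estimates` should stay close to zero, whereas selecting and estimating on the same runs would yield a spuriously positive difference.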
(4) There are two concerns with how the overlap analyses were carried out. First, as typically carried out to look at overlap in humans, the proportion is of overlapping results of the contrasts of interest, e.g., for face and body selectivity overlap (Schwarzlose et al. 2006), hand and tool overlap (Bracci et al. 2012), or more recently, tool and food overlap (Ritchie et al. 2024). There are a number of ways of then calculating the overlap, with their own strengths and weaknesses (see Tarr et al. 2007). Of these, I think the Jaccard index is the most intuitive, which is just the intersection of two sets as a proportion of their union. So, for example, the N of overlapping D > iA and H > iA active voxels is divided by the total number of unique active voxels for the two contrasts. Such an overlap analysis is more standard and interpretable relative to previous findings. I would strongly encourage the authors to carry out such an analysis or use a similar metric of overlap, in place of what they have currently performed (to the extent the analysis makes sense to me).
We agree with this Reviewer that the Jaccard index is an intuitive and straightforward overlap measure. Importantly, for our overlap calculations we already use this measure (and a very similar one) – but we acknowledge that this was not clear from the original description. Specifically, for the multivariate overlap test, we used the Jaccard index exactly as described by this Reviewer. For the univariate overlap test, we use a very similar measure, with the only difference that there, to reference the search space, the intersection of specific animate-inanimate contrasts was divided by the total voxel number of animate-sensitive areas (which is highly similar to the union of the specific animate-inanimate contrasts). In the revised submission we will provide a more detailed explanation of the overlap calculations, making it explicit that we used the Jaccard index (and a variant of it).
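The two overlap measures described above can be sketched as follows. This is a minimal illustration on toy boolean voxel masks, not the authors' code; the function names and the example masks are hypothetical. The first function is the standard Jaccard index (intersection over union); the second is the described variant in which the intersection is divided by the voxel count of a reference region.

```python
import numpy as np

def jaccard_index(mask_a, mask_b):
    """Intersection over union of two boolean voxel masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return float("nan")  # neither contrast has active voxels
    return float(np.logical_and(a, b).sum() / union)

def overlap_relative_to_region(mask_a, mask_b, region_mask):
    """Variant: intersection (within the region) over the region's voxel count."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    region = np.asarray(region_mask, dtype=bool)
    n_region = region.sum()
    if n_region == 0:
        return float("nan")
    return float(np.logical_and(a, b)[region].sum() / n_region)

# Toy masks over six voxels (hypothetical values).
dog_gt_inanimate = np.array([1, 1, 1, 0, 0, 0], dtype=bool)
human_gt_inanimate = np.array([0, 1, 1, 1, 0, 0], dtype=bool)
animate_sensitive = np.array([1, 1, 1, 1, 1, 0], dtype=bool)

jaccard = jaccard_index(dog_gt_inanimate, human_gt_inanimate)  # 2 shared / 4 in union
variant = overlap_relative_to_region(
    dog_gt_inanimate, human_gt_inanimate, animate_sensitive
)  # 2 shared / 5 region voxels
```

The two measures coincide only when the reference region equals the union of the two contrast maps; otherwise the variant is smaller whenever the region contains voxels active in neither contrast.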
Second, the results summarized in Figure 3A suggest multiple distinct regions of animacy selectivity. Other studies have also identified similar networks of regions (e.g. Boch et al. 2023). These regions may serve different functions, but the overlap analysis does not tell us whether there is overlap in some of these portions of the cortex and not in others. The overlap is only looked at in a very general sense. There may be more overlap locally in some portions of the cortex and not in others.
We thank this Reviewer for this comment and agree that adding spatial specificity to these results will improve the manuscript. Therefore, during revision, we will assess the anatomical distribution of the overlap results, making use of a broad set of ROIs potentially relevant for animacy perception, defined either anatomically or functionally by other studies (e.g., Boch et al., 2023 for dogs).
(5) Two comments about the RSA analyses. First, I am not quite sure why the authors used HMAX rather than layers of a standardly trained ImageNet deep convolutional neural network. This strikes me also as a missed opportunity since many labs have looked at whether later layers of DNNs trained on object categorization show similar dissimilarity structures as category-selective regions in humans and NHPs. In so far as cross-species comparisons are the motivation here, it would be genuinely interesting to see what would happen if one did a correlation searchlight with the dog brain and layers of a DNN, a la Cichy et al. (2016).
We thank the Reviewer for this comment and suggestion. At the start of the project, HMAX was the most feasible model to implement given our time and expertise constraints. Additionally, the biologically motivated HMAX was also an appropriate choice, as it simulates the selective tuning of neurons in the primary visual cortex (V1) of primates, which is considered homologous with V1 in carnivores (Boch et al., 2024).
Although we agree that DNNs have recently been used extensively and successfully to explore object representations and could provide valuable additional insights into dogs’ visual perception as well, we believe that adding a large set of additional analyses would stretch the scope of this manuscript, disproportionately shifting its focus away from our original research question. Also, our experiment, designed with a different, more specific aim in mind, did not provide a large enough variety of animate stimuli for a general comparison of the cortical hierarchy underlying object representations in dog and human brains, and thus our data are not an optimal starting point for such extensive explorations. Having said that, we are thankful to this Reviewer for the idea and will consider using a DNN to uncover dogs’ visual cortical hierarchy in future studies with a better-suited stimulus set. Furthermore, in accordance with eLife’s data-sharing policies, we will make the current dataset publicly available so that further hypotheses and models can be tested.
Second, from the text it is hard to tell what the models for the class- and category-boundary effects were. Are there RDMs that can be depicted here? I am very familiar with RSA searchlight and I found the description of the methods to be rather opaque. The same point about overlap earlier regarding the univariate results also applies to the RSA results. Also, this is again a reason to potentially compare DNN RDMs to both the categorical models and the brains of both species.
In the revised manuscript we will provide a more detailed explanation of the methods used to determine class- and category-boundary effects. In short, the analysis we performed here followed Kriegeskorte et al. (2008), and the searchlight test looked for regions in which between-class/category differences were greater than within-class/category differences. We will also include RDMs. Additionally, we will provide anatomical details for the overlap results for RSA, just as for the univariate results, using the same independently defined broad set of ROIs, defined either anatomically or functionally by other studies (e.g., Boch et al., 2023 for dogs).
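As a rough illustration of the boundary test summarized above (between-class/category dissimilarities greater than within-class/category dissimilarities), the sketch below computes the statistic for a single searchlight. It is an assumption-laden toy, not the authors' implementation: the choice of 1 minus Pearson correlation as the dissimilarity measure, the function names, and the simulated patterns are all hypothetical.

```python
import numpy as np

def correlation_distance_rdm(patterns):
    """patterns: (n_stimuli, n_voxels) -> RDM of 1 - Pearson r between stimuli."""
    return 1.0 - np.corrcoef(np.asarray(patterns, dtype=float))

def boundary_effect(rdm, labels):
    """Mean between-label dissimilarity minus mean within-label dissimilarity."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)  # each pair once, no diagonal
    within = rdm[same & upper]
    between = rdm[~same & upper]
    return float(between.mean() - within.mean())

# Toy searchlight: two categories, four stimuli each, 50 voxels.
rng = np.random.default_rng(1)
prototypes = rng.normal(size=(2, 50))
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
patterns = prototypes[labels] + 0.3 * rng.normal(size=(8, 50))

rdm = correlation_distance_rdm(patterns)
effect = boundary_effect(rdm, labels)  # positive when a category boundary is present
```

In a searchlight analysis this statistic would be computed at every sphere location and then tested against zero across subjects; the statistical testing step is omitted here.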
(6) There has been emphasis of late on the role of face and body selective regions and social cognition (Pitcher and Ungerleider, 2021, Puce, 2024), and also on whether these regions are more specialized for representing whole bodies/persons (Hu et al. 2020, Taubert, et al. 2022). It may be that the supposed animacy organization is more about how we socialize and interact with other organisms than anything about animacy as such (see again the earlier comments about animacy, taxonomy, and threat/predation). The result, of a great deal of selectivity for dogs, some for humans, and little for cats, seems to readily make sense if we assume it is driven by the social value of the three animate objects that are presented. This might be something worth reflecting on in relation to the present findings.
We thank the Reviewer for this suggestion. The original manuscript already discussed how motion-related animacy cues involved in social cognition may explain why the animacy-sensitive regions reported in our study extend beyond those reported previously, and also the role of biological motion in the observed across-species differences. This discussion of the role of visual diagnostic features and features that are involved in perceiving social agents will be extended in the revised discussion, also in response to the first comment of this Reviewer, to reflect on how social cognition-related animacy cues may have affected our results in dogs.
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Dad et al. explored the roles of cytosolic carboxypeptidase 5 (CCP5) in the development of ependymal multicilia in the brain. CCP family are erasers of polyglutamylation of ciliary-axoneme microtubules. The authors generated a new mutant mouse of Agbl5 gene, which encodes CCP5, with deletion of its N-terminus and partial carboxypeptidase (CP) domain (named AGBL5M1/M1).
Strengths:
The mutant mice revealed lethal hydrocephalus due to degeneration of ependymal multicilia. Interestingly, this is in contrast with the phenotype of Agbl5 mutants with disruption solely in the CP domain of CCP5 (named AGBL5M2/M2) that did not develop hydrocephalus despite increased glutamylation levels in ependymal cilia as observed for AGBL5M1/M1 mutants. The study has been well-performed and the findings suggest a unique function of the N-domain of CCP5 in ependymal multicilia stability.
Weaknesses:
The content of this article is relatively descriptive and lacks molecular insights.
We thank the Reviewer for the positive comments. To provide molecular insights into the dysregulated planar cell polarity (PCP) in Agbl5<sup>M1/M1</sup> ependyma, we are planning to further assess the microtubule polarization and the expression/localization of PCP core proteins in ependymal cells. We also plan to quantify the intensity of actin networks around BB patches to better understand to what extent it is affected in the ependyma of the mutants and contributes to the impaired stability of BBs (please see below).
We will also assess whether Agbl5 commonly functions in multiciliated cells of other organs.
Reviewer #2 (Public review):
Summary:
This study analyzed the consequences of Agbl5 mutation on ependymal cell development and function. The authors first characterize their mutant mouse line, reporting a reduced lifespan and severe hydrocephalus. Next, they report a defect in ependymal cell cilia number and motility. They provide evidence for impaired basal body organisation and cilia glutamylation.
Strengths:
Description of a mutant mouse which implicates Cytosolic Carboxypeptidase 5 (the product of Agbl5 gene) for proper ependymal cells.
Weaknesses:
Description of phenotype is incomplete:
We thank the Reviewer for the constructive comments. We agree that more quantitative analysis of the phenotypes in Agbl5<sup>M1/M1</sup> will strengthen this study.
- Figure 3G - the sequence from the movie is not really informative. Providing beating frequencies as quantification of the data would be more informative.
We agree that quantification of the cilia beating frequencies and directions in these experiments will be more informative.
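One common way to quantify a beating frequency from a high-speed movie is to take the dominant peak in the power spectrum of a pixel-intensity trace. The sketch below illustrates that idea only; it is not taken from the manuscript, and the function name, frame rate, and simulated trace are assumptions.

```python
import numpy as np

def dominant_frequency_hz(intensity_trace, frame_rate_hz):
    """Frequency (Hz) of the largest non-DC peak in the power spectrum."""
    trace = np.asarray(intensity_trace, dtype=float)
    trace = trace - trace.mean()                      # remove the DC offset
    power = np.abs(np.fft.rfft(trace)) ** 2
    freqs = np.fft.rfftfreq(trace.size, d=1.0 / frame_rate_hz)
    return float(freqs[1:][np.argmax(power[1:])])     # skip the zero-frequency bin

# Simulated trace: 2 s recorded at 200 frames/s with a 12 Hz oscillation.
frame_rate_hz = 200.0
t = np.arange(400) / frame_rate_hz
trace = 100.0 + 5.0 * np.sin(2.0 * np.pi * 12.0 * t)
beat_hz = dominant_frequency_hz(trace, frame_rate_hz)
```

The frequency resolution is the frame rate divided by the number of frames (0.5 Hz in this toy case), so longer recordings give finer estimates.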
- Figure 3 - the quantification of actin network would strengthen the message.
We agree with the Reviewers. We will quantify the total intensity of actin around each BB patch and the total intensity of actin per BB to determine to what extent the actin networks are affected in Agbl5<sup>M1/M1</sup> ependymal cells.
- Lines 219-220 - the authors conclude “Taken together, in Agbl5<sup>M1/M1</sup> ependymal cells, the expression of genes promoting multiciliogenesis were not impaired but certain proteins associated with differentiated ependymal cells are not properly expressed”. However, they do not assess gene but protein expression (IF). In addition, their quantification shows differences in the number of FoxJ1 positive cells which indeed is an impaired expression.
We will clarify this statement.
- Microtubules are involved in the local organization of ciliary basal bodies (see Werner et al., Vladar et al., 2011; Boutin et al., 2014). It would be interesting for the authors to check whether the subapical network of microtubules is glutamylated or not during ependymal cell differentiation and how this network is affected in their mutants.
We thank the Reviewer for the suggestion. We agree this is an interesting point to look at. We will assess the glutamylation status of the subapical microtubule networks in differentiating ependymal cells and whether they are affected in the mutants.
- Showing the data mentioned in the discussion on Cep110 would be a nice addition to the paper.
These results will be provided.
- Line 354: "The latter serves as a component of tissue polarity that is required for asymmetric PCP protein localization in each cell (Boutin et al., 2014; Vladar et al., 2012)." The cited reference did not demonstrate that this microtubule network is required for asymmetric PCP localization.
We thank the Reviewer for the critical reading. We will correct the citation.
Reviewer #3 (Public review):
Summary:
The authors developed a new Agbl5 KO allele, extending the deletion to the N-terminus of CCP5 to explore its function in mouse ependymal cells.
Strengths:
They show that the KO mice exhibit severe hydrocephalus due to disorganized and mislocated basal bodies. Additionally, they present evidence of both impaired beating coordination and a reduction in ciliary beating.
Weaknesses:
The manuscript is well-written but lacks specific interpretations of the results presented. Further experiments are needed to be fully convincing.
We thank the Reviewer for the comments. We plan to conduct the following experiments to strengthen this study.
(1) Quantify the intensity of actin staining around BB patches and its intensity relative to the number of BBs to assess to what extent the actin networks in Agbl5<sup>M1/M1</sup> ependymal cells are affected (please refer to the above response to the comments of Reviewer #2).
(2) Co-stain tdTomato with cell specific markers to strengthen the spatial expression of tdTomato.
(3) Seek proper antibodies to determine the correlation between signals of GT335 and Ac-Tub in ependymal multicilia of Agbl5<sup>M1/M1</sup> mice.
(4) Quantitatively compare the size of ependymal cells in the wild-type and Agbl5<sup>M1/M1</sup> mice to address whether there is a consequence of possible dysfunction of primary cilia in the precursors of ependymal cells in the mutants. If so, we will further analyze how the primary cilia in the precursors of ependymal cells are affected in the mutants.
(5) Address whether the rotational polarity is affected in the Agbl5<sup>M1/M1</sup> mutant mice.
Author response:
To address Reviewer 1’s concerns, we will implement the following changes:
Comment 1: We will clarify that, even without direct comparisons within or across species, whether vertically transmitted microbes act as pioneering colonizers or integrate into an existing community is an important factor influencing their effect on community composition.
Comment 2: We will provide additional details on the biology of the surrogate frog Oophaga sylvatica, explain how tadpole manipulation might influence adhesion to the caregiver, and acknowledge that the lack of knowledge on the physiological mechanisms underlying tadpole attachment currently limits our discussion to speculation.
We will further clarify in the “Methods” section that SourceTracker’s ability to accurately estimate source proportions was assessed by evaluating how well it assigned training samples to their correct source environments. We will provide the predictions for the training set and describe how they informed our data preprocessing and analysis approach.
Comment 3: While we predicted that community distances between tadpoles and adults would be smaller in species with parental transport, we explicitly state that our results did not confirm this expectation. We thus see no contradiction in our discussion but will ensure that this point is more clearly communicated. In response to the reviewer’s suggestion, we will incorporate additional literature on how tadpoles’ skin microbial communities change over time and adapt to their environment. We will also expand on how the life history of L. longirostris—specifically, the frequent presence of adults in tadpole habitats—may facilitate horizontal microbiota transmission, potentially contributing to shorter community distances.
Comment 4: We will remove the network visualization to prevent any misinterpretation.
Additionally, following Reviewer 2’s suggestion, we will include data on the absolute abundance of ASVs shared between parent and offspring after one month of development to further support the manuscript.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Weaknesses:
INTRODUCTION & THEORY
(1) Can the authors please clarify why the first trial of extinction in a standard protocol does NOT produce the retrieval-extinction effect? Particularly as the results section states: "Importantly, such a short-term effect is also retrieval dependent, suggesting the labile state of memory is necessary for the short-term memory update to take effect (Fig. 1e)." The importance of this point comes through at several places in the paper:
1A. "In the current study, fear recovery was tested 30 minutes after extinction training, whereas the effect of memory reconsolidation was generally evident only several hours later and possibly with the help of sleep, leaving open the possibility of a different cognitive mechanism for the short-term fear dementia related to the retrieval-extinction procedure." ***What does this mean? The two groups in study 1 experienced a different interval between the first and second CS extinction trials; and the results varied with this interval: a longer interval (10 min) ultimately resulted in less reinstatement of fear than a shorter interval. Even if the different pattern of results in these two groups was shown/known to imply two different processes, there is absolutely no reason to reference any sort of cognitive mechanism or dementia - that is quite far removed from the details of the present study.
Indeed, the only difference between the standard extinction paradigm and the retrieval-extinction paradigm is the interval between the first and second CS extinction trials. It has been shown before that a second CS+ presented 1 hour after the initial retrieval CS+ resulted in the dephosphorylation of GluR1 in rats, which was indicative of memory destabilization. The second CS+ presented only 3 minutes after the initial retrieval CS+, as in the standard extinction training, did not cause the GluR1 dephosphorylation effect (Monfils et al., 2009). Therefore, an isolated presentation of the CS+ seems to be important in preventing the return of fear expression. Behaviorally, when the CSs were presented in a more temporally spaced (vs. massed presentation) or a more gradual manner in the extinction training, the fear amnesia effects were more salient (Cain et al., 2003, Gershman et al., 2013). It has also been suggested that only when the old memory and new experience (through extinction) can be inferred to have been generated from the same underlying latent cause, the old memory can be successfully modified (Gershman et al., 2017). On the other hand, if the new experiences are believed to be generated by a different latent cause, then the old memory is less likely to be subject to modification. Therefore, the way the first and second CS are temporally organized (retrieval-extinction or standard extinction) might affect how the latent cause is inferred and lead to different levels of fear expression from a theoretical perspective. These findings, together with studies in both fear and drug memories using the retrieval-extinction paradigm (Liu et al., 2014, Luo et al., 2015, Schiller et al., 2010, Xue et al., 2012), seem to suggest that the retrieval-extinction and the standard extinction procedures engage different cognitive and molecular mechanisms that lead to significantly different behavioral outcomes.
In our study, we focus on the short-term and long-term amnesia effects of the retrieval-extinction procedure but also point out the critical role of retrieval in eliciting the short-term effect.
1B. "Importantly, such a short-term effect is also retrieval dependent, suggesting the labile state of memory is necessary for the short-term memory update to take effect (Fig. 1e)." ***As above, what is "the short-term memory update"? At this point in the text, it would be appropriate for the authors to discuss why the retrieval-extinction procedure produces less recovery than a standard extinction procedure as the two protocols only differ in the interval between the first and second extinction trials. References to a "short-term memory update" process do not help the reader to understand what is happening in the protocol.
Sorry for the lack of clarity here. By short-term memory update we meant the short-term amnesia in fear expression.
(2) "Indeed, through a series of experiments, we identified a short-term fear amnesia effect following memory retrieval, in addition to the fear reconsolidation effect that appeared much later."
***The only reason for supposing two effects is because of the differences in responding to the CS2, which was subjected to STANDARD extinction, in the short- and long-term tests. More needs to be said about how and why the performance of CS2 is affected in the short-term test and recovers in the long-term test. That is, if the loss of performance to CS1 and CS2 is going to be attributed to some type of memory updating process across the retrieval-extinction procedure, one needs to explain the selective recovery of performance to CS2 when the extinction-to-testing interval extends to 24 hours. Instead of explaining this recovery, the authors note that performance to CS1 remains low when the extinction-to-testing interval is 24 hours and invoke something to do with memory reconsolidation as an explanation for their results: that is, they imply (I think) that reconsolidation of the CS1-US memory is disrupted across the 24-hour interval between extinction and testing even though CS1 evokes negligible responding just minutes after extinction.
In our results, we did not only focus on the fear expression related to CS2. In fact, we also demonstrated that the CS1 related fear expression diminished in the short-term memory test but re-appeared in the long-term memory after the CS1 retrieval-extinction training.
The “…recovery of performance to CS2 when the extinction-to-testing interval extends to 24 hours…” is a result that has been demonstrated in various previous studies (Kindt and Soeter, 2018, Kindt et al., 2009, Nader et al., 2000, Schiller et al., 2013, Schiller et al., 2010, Xue et al., 2012). That is, the reconsolidation framework stipulates that the pharmacological or behavioral intervention during the labile states of the reconsolidation window only modifies the fear memory linked to the reminded retrieval cue, but not for the non-reminded CS-US memory expression (but also see (Liu et al., 2014, Luo et al., 2015) for using the unconditioned stimulus as the reminder cue and the retrieval-extinction paradigm to prevent the return of fear memory associated with different CS). In fact, we hypothesized the temporal dynamics of CS1 and CS2 related fear expressions were due to the interplay between the short-term and long-term (reconsolidation) effects of the retrieval-extinction paradigm in the last figure (Fig. 6).
(3) The discussion of memory suppression is potentially interesting but, in its present form, raises more questions than it answers. That is, memory suppression is invoked to explain a particular pattern of results but I, as the reader, have no sense of why a fear memory would be better suppressed shortly after the retrieval-extinction protocol compared to the standard extinction protocol; and why this suppression is NOT specific to the cue that had been subjected to the retrieval-extinction protocol.
We discussed memory suppression as one of the potential mechanisms to account for the three characteristics of the short-term amnesia effects: cue-independence, temporal dynamics (short-term) and thought-control-ability relevance. According to the memory suppression theory, the memory suppression effect is NOT specific to the cue and this effect was demonstrated via the independent cue test in a variety of studies (Anderson and Floresco, 2022, Anderson and Green, 2001, Gagnepain et al., 2014, Zhu et al., 2022). Therefore, we suggest in the discussion that it might be possible the CS1 retrieval cue prompted an automatic suppression mechanism and yielded the short-term fear amnesia consistent with various predictions from the memory suppression theory:
“In our experiments, subjects were not explicitly instructed to suppress their fear expression, yet the retrieval-extinction training significantly decreased short-term fear expression. These results are consistent with the short-term amnesia induced with the more explicit suppression intervention (Anderson et al., 1994; Kindt and Soeter, 2018; Speer et al., 2021; Wang et al., 2021; Wells and Davies, 1994). It is worth noting that although consciously repelling unwanted memory is a standard approach in memory suppression paradigm, it is possible that the engagement of the suppression mechanism can be unconscious. For example, in the retrieval-induced forgetting (RIF) paradigm, recall of a stored memory impairs the retention of related target memory and this forgetting effect emerges as early as 20 minutes after the retrieval procedure, suggesting memory suppression or inhibition can occur in a more spontaneous and automatic manner (Imai et al., 2014). Moreover, subjects with trauma histories exhibited more suppression-induced forgetting for both negative and neutral memories than those with little or no trauma (Hulbert and Anderson, 2018). Similarly, people with higher self-reported thought-control capabilities showed more severe cue-independent memory recall deficit, suggesting that suppression mechanism is associated with individual differences in spontaneous control abilities over intrusive thoughts (Küpper et al., 2014). It has also been suggested that similar automatic mechanisms might be involved in organic retrograde amnesia of traumatic childhood memories (Schacter et al., 2012; Schacter et al., 1996).”
3A. Relatedly, how does the retrieval-induced forgetting (which is referred to at various points throughout the paper) relate to the retrieval-extinction effect? The appeal to retrieval-induced forgetting as an apparent justification for aspects of the present study reinforces points 2 and 3 above. It is not uninteresting but needs some clarification/elaboration.
We introduced the retrieval-induced forgetting (RIF) to make the point that RIF was believed to be related to the memory suppression mechanism and the RIF effect can appear relatively early, consistent with what we observed in the short-term amnesia effect. We have re-written the manuscript to make this point clearer:
“It is worth noting that although consciously repelling unwanted memory is a standard approach in memory suppression paradigm, it is possible that the engagement of the suppression mechanism can be unconscious. For example, in the retrieval-induced forgetting (RIF) paradigm, recall of a stored memory impairs the retention of related target memory and this forgetting effect emerges as early as 20 minutes after the retrieval procedure, suggesting memory suppression or inhibition can occur in a more spontaneous and automatic manner (Imai et al., 2014). Moreover, subjects with trauma histories exhibited more suppression-induced forgetting for both negative and neutral memories than those with little or no trauma (Hulbert and Anderson, 2018). Similarly, people with higher self-reported thought-control capabilities showed more severe cue-independent memory recall deficit, suggesting that suppression mechanism is associated with individual differences in spontaneous control abilities over intrusive thoughts (Küpper et al., 2014).”
(4) Given the reports by Chalkia, van Oudenhove & Beckers (2020) and Chalkia et al (2020), some qualification needs to be inserted in relation to reference 6. That is, reference 6 is used to support the statement that "during the reconsolidation window, old fear memory can be updated via extinction training following fear memory retrieval". This needs a qualifying statement like "[but see Chalkia et al (2020a and 2020b) for failures to reproduce the results of 6]."
We have incorporated the reviewer’s suggestion into the revised manuscript in both the introduction:
“Pharmacological blockade of protein synthesis and behavioral interventions can both eliminate the original fear memory expression in the long-term (24 hours later) memory test (Lee, 2008; Lee et al., 2017; Schiller et al., 2013; Schiller et al., 2010), resulting in the cue-specific fear memory deficit (Debiec et al., 2002; Lee, 2008; Nader, Schafe, & LeDoux, 2000). For example, during the reconsolidation window, retrieving a fear memory allows it to be updated through extinction training (i.e., the retrieval-extinction paradigm (Lee, 2008; Lee et al., 2017; Schiller et al., 2013; Schiller et al., 2010), but also see (Chalkia, Schroyens, et al., 2020; Chalkia, Van Oudenhove, et al., 2020; D. Schiller, LeDoux, & Phelps, 2020)).”
And in the discussion:
“It should be noted that while our long-term amnesia results were consistent with the fear memory reconsolidation literatures, there were also studies that failed to observe fear prevention (Chalkia, Schroyens, et al., 2020; Chalkia, Van Oudenhove, et al., 2020; Schroyens et al., 2023). Although the memory reconsolidation framework provides a viable explanation for the long-term amnesia, more evidence is required to validate the presence of reconsolidation, especially at the neurobiological level (Elsey et al., 2018). While it is beyond the scope of the current study to discuss the discrepancies between these studies, one possibility to reconcile these results concerns the procedure for the retrieval-extinction training. It has been shown that the eligibility for old memory to be updated is contingent on whether the old memory and new observations can be inferred to have been generated by the same latent cause (Gershman et al., 2017; Gershman and Niv, 2012). For example, prevention of the return of fear memory can be achieved through gradual extinction paradigm, which is thought to reduce the size of prediction errors to inhibit the formation of new latent causes (Gershman, Jones, et al., 2013). Therefore, the effectiveness of the retrieval-extinction paradigm might depend on the reliability of such paradigm in inferring the same underlying latent cause. Furthermore, other studies highlighted the importance of memory storage per se and suggested that memory retention was encoded in the memory engram cell ensemble connectivity whereas the engram cell synaptic plasticity is crucial for memory retrieval (Ryan et al., 2015; Tonegawa, Liu, et al., 2015; Tonegawa, Pignatelli, et al., 2015). It remains to be tested how the cue-independent short-term and cue-dependent long-term amnesia effects we observed could correspond to the engram cell synaptic plasticity and functional connectivity among engram cell ensembles (Figure 6). 
This is particularly important, since the cue-independent characteristic of the short-term amnesia suggest that either different memory cues fail to evoke engram cell activities, or the retrieval-extinction training transiently inhibits connectivity among engram cell ensembles. Finally, SCR is only one aspect of the fear expression, how the retrieval-extinction paradigm might affect subjects’ other emotional (such as the startle response) and cognitive fear expressions such as reported fear expectancy needs to be tested in future studies since they do not always align with each other (Kindt et al., 2009; Sevenster et al., 2012, 2013).”
5A. What does it mean to ask: "whether memory retrieval facilitates update mechanisms other than memory reconsolidation"? That is, in what sense could or would memory retrieval be thought to facilitate a memory update mechanism?
It is widely documented in the literature that memory retrieval renders the old memory labile and susceptible to the memory reconsolidation process. However, as we mentioned in the manuscript, studies have shown that memory reconsolidation requires de novo protein synthesis and usually takes hours to complete. What remains unknown is whether old memories are subject to modifications other than the reconsolidation process. Our task specifically tested the short-term effect of the retrieval-extinction paradigm and found that fear expression diminished 30 min after the retrieval-extinction training. Such an effect cannot be accounted for by memory reconsolidation.
5B. "First, we demonstrate that memory reactivation prevents the return of fear shortly after extinction training in contrast to the memory reconsolidation effect which takes several hours to emerge and such a short-term amnesia effect is cue independent (Study 1, N = 57 adults)."
***The phrasing here could be improved for clarity: "First, we demonstrate that the retrieval-extinction protocol prevents the return of fear shortly after extinction training (i.e., when testing occurs just min after the end of extinction)." Also, cue-dependence of the retrieval-extinction effect was assessed in study 2.
We thank the reviewer and have modified the phrasing of the sentence:
“First, we demonstrate that memory retrieval-extinction protocol prevents the return of fear expression shortly after extinction training and this short-term effect is memory reactivation dependent (Study 1, N = 57 adults).”
5C. "Furthermore, memory reactivation also triggers fear memory reconsolidation and produces cue-specific amnesia at a longer and separable timescale (Study 2, N = 79 adults)." ***In study 2, the retrieval-extinction protocol produced a cue-specific disruption in responding when testing occurred 24 hours after the end of extinction. This result is interesting but cannot be easily inferred from the statement that begins "Furthermore..." That is, the results should be described in terms of the combined effects of retrieval and extinction, not in terms of memory reactivation alone; and the statement about memory reconsolidation is unnecessary. One can simply state that the retrieval-extinction protocol produced a cue-specific disruption in responding when testing occurred 24 hours after the end of extinction.
We have revised the text according to the reviewer’s comment.
“Furthermore, across different timescales, the memory retrieval-extinction paradigm triggers distinct types of fear amnesia in terms of cue-specificity and cognitive control dependence, suggesting that the short-term fear amnesia might be caused by different mechanisms from the cue-specific amnesia at a longer and separable timescale (Study 2, N = 79 adults).”
5D. "...we directly manipulated brain activities in the dorsolateral prefrontal cortex and found that both memory retrieval and intact prefrontal cortex functions were necessary for the short-term fear amnesia."
***This could be edited to better describe what was shown: E.g., "...we directly manipulated brain activities in the dorsolateral prefrontal cortex and found that intact prefrontal cortex functions were necessary for the short-term fear amnesia after the retrieval-extinction protocol."
Edited:
“Finally, using continuous theta-burst stimulation (Study 3, N = 75 adults), we directly manipulated brain activity in the dorsolateral prefrontal cortex, and found that both memory reactivation and intact prefrontal cortex function were necessary for the short-term fear amnesia after the retrieval-extinction protocol.”
5E. "The temporal scale and cue-specificity results of the short-term fear amnesia are clearly dissociable from the amnesia related to memory reconsolidation, and suggest that memory retrieval and extinction training trigger distinct underlying memory update mechanisms."
***The pattern of results when testing occurred just minutes after the retrieval-extinction protocol was different from that obtained when testing occurred 24 hours after the protocol. Describing this in terms of temporal scale is unnecessary, and suggesting that memory retrieval and extinction trigger different memory update mechanisms is not obviously warranted. The results of interest are due to the combined effects of retrieval+extinction and there is no sense in which different memory update mechanisms should be identified with retrieval (mechanism 1) and extinction (mechanism 2).
We did not argue for different memory update mechanisms for the “retrieval (mechanism 1) and extinction (mechanism 2)” in our manuscript. Instead, we proposed that the retrieval-extinction procedure, which was mainly documented in the previous literatures for its association with the reconsolidation-related fear memory retention (the long-term effect), also had a much faster effect (the short-term effect). These two effects differed in many aspects, suggesting that different memory update mechanisms might be involved.
5F. "These findings raise the possibility of concerted memory modulation processes related to memory retrieval..."
***What does this mean?
As we mentioned in our response to the previous comment, we believe that the retrieval-extinction procedure triggers different types of memory update mechanisms working on different temporal scales.
(6) "...suggesting that the fear memory might be amenable to a more immediate effect, in addition to what the memory reconsolidation theory prescribes..."
***What does it mean to say that the fear memory might be amenable to a more immediate effect?
We intended to state that the retrieval-extinction procedure can produce a short-term amnesia effect and have thus revised the text.
(7) "Parallel to the behavioral manifestation of long- and short-term memory deficits, concurrent neural evidence supporting memory reconsolidation theory emphasizes the long-term effect of memory retrieval by hypothesizing that synapse degradation and de novo protein synthesis are required for reconsolidation."
***This sentence needs to be edited for clarity.
We have rewritten this sentence:
“Corresponding to the long-term behavioral manifestation, concurrent neural evidence supporting memory reconsolidation hypothesis emphasizes that synapse degradation and de novo protein synthesis are required for reconsolidation.”
(8) "previous behavioral manipulations engendering the short-term declarative memory effect..."
***What is the declarative memory effect? It should be defined.
We meant the amnesia effect reported in declarative memory research, such as the memory deficit caused by the think/no-think paradigm. The text has been modified for clarity:
“On the contrary, previous behavioral manipulations engendering the short-term amnesia on declarative memory, such as the think/no-think paradigm, hinges on the intact activities in brain areas such as dorsolateral prefrontal cortex (cognitive control) and its functional coupling with specific brain regions such as hippocampus (memory retrieval) (Anderson and Green, 2001; Wimber et al., 2015).”
(9) "The declarative amnesia effect emerges much earlier due to the online functional activity modulation..."
***Even if the declarative memory amnesia effect had been defined, the reference to online functional activity modulation is not clear.
We have rephrased the sentence:
“The declarative amnesia effect arises much earlier due to the more instant modulation of functional connectivity, rather than the slower processes of new protein synthesis in these brain regions.”
(10) "However, it remains unclear whether memory retrieval might also precipitate a short-term amnesia effect for the fear memory, in addition to the long-term prevention orchestrated by memory consolidation."
***I found this sentence difficult to understand on my first pass through the paper. I think it is because of the phrasing of memory retrieval. That is, memory retrieval does NOT precipitate any type of short-term amnesia for the fear memory: it is the retrieval-extinction protocol that produces something like short-term amnesia. Perhaps this sentence should also be edited for clarity.
We have changed “memory retrieval” to “retrieval-extinction” where applicable.
I will also note that the usage of "short-term" at this point in the paper is quite confusing: Does the retrieval-extinction protocol produce a short-term amnesia effect, which would be evidenced by some recovery of responding to the CS when tested after a sufficiently long delay? I don't believe that this is the intended meaning of "short-term" as used throughout the majority of the paper, right?
By “short-term”, we meant the lack of fear expression in the test phase (measured by skin conductance responses) shortly after the retrieval-extinction procedure (30 mins in studies 1 & 2 and 1 hour in study 3). It does not indicate that the effect is by itself “short-lived”.
(11) "To fully comprehend the temporal dynamics of the memory retrieval effect..." ***What memory retrieval effect? This needs some elaboration.
We’ve changed the phrase “memory retrieval effect” to “retrieval-extinction effect” to refer to the effect of retrieval-extinction on fear amnesia.
(12) "We hypothesize that the labile state triggered by the memory retrieval may facilitate different memory update mechanisms following extinction training, and these mechanisms can be further disentangled through the lens of temporal dynamics and cue-specificities."
***What does this mean? The first part of the sentence is confusing around the usage of the term "facilitate"; and the second part of the sentence that references a "lens of temporal dynamics and cue-specificities" is mysterious. Indeed, as all rats received the same retrieval-extinction exposures in Study 2, it is not clear how or why any differences between the groups are attributed to "different memory update mechanisms following extinction".
As the reviewer mentioned, if data were collected at only one time point, we could not differentiate whether different memory update mechanisms are involved. In study 2, however, the 3 groups differed only in the time at which the reinstatement test was conducted. Accordingly, our results showed that the fear amnesia effects for CS1 and CS2 cannot be simply explained by forgetting: different memory update mechanisms must be at work to explain the characteristics of the SCR related to both CS1 and CS2 at three different time scales (30min, 6h and 24h). It was based on these results, together with the results from the TMS study (study 3), that we proposed the involvement of a short-term memory update mechanism in addition to the reconsolidation related fear amnesia (which should become evident much later) induced by the retrieval-extinction protocol.
(13) "In the first study, we aimed to test whether there is a short-term amnesia effect of fear memory retrieval following the fear retrieval-extinction paradigm."
***Again, the language is confusing. The phrase, "a short-term amnesia effect" implies that the amnesia itself is temporary; but I don't think that this implication is intended. The problem is specifically in the use of the phrase "a short-term amnesia effect of fear memory retrieval." To the extent that short-term amnesia is evident in the data, it is not due to retrieval per se but, rather, the retrieval-extinction protocol.
We have changed the wordings and replaced “memory retrieval” with “retrieval-extinction” where applicable.
(14) The authors repeatedly describe the case where there was a 24-hour interval between extinction and testing as consistent with previous research on fear memory reconsolidation. Which research exactly? That is, in studies where a CS re-exposure was combined with a drug injection, responding to the CS was disrupted in a final test of retrieval from long-term memory which typically occurred 24 hours after the treatment. Is that what the authors are referring to as consistent? If so, which aspect of the results are consistent with those previous findings? Perhaps the authors mean to say that, in the case where there was a 24-hour interval between extinction and testing, the results obtained here are consistent with previous research that has used the retrieval-extinction protocol. This would clarify the intended meaning greatly.
Our 24-hour test results after the retrieval-extinction protocol were consistent with both pharmacological and behavioral intervention studies of fear memory reconsolidation (Kindt and Soeter, 2018, Kindt et al., 2009, Liu et al., 2014, Luo et al., 2015, Monfils et al., 2009, Nader et al., 2000, Schiller et al., 2013, Schiller et al., 2010, Xue et al., 2012), since the final test phase in those studies typically occurred 24 hours after the treatment. At the 24-hour interval, the memory reconsolidation effect would become evident either via drug administration or behavioral intervention (extinction training).
DATA
(15) Points about data:
15A. The eight participants who were discontinued after Day 1 in study 1 were all from the no-reminder group. Can the authors please comment on how participants were allocated to the two groups in this experiment so that the reader can better understand why the distribution of non-responders was non-random (as it appears to be)?
15B. Similarly, in study 2, of the 37 participants that were discontinued after Day 2, 19 were from Group 30 min, and 5 were from Group 6 hours. Can the authors comment on how likely these numbers are to have been by chance alone? I presume that they reflect something about the way that participants were allocated to groups, but I could be wrong.
We went back and checked our data. As we mentioned in the supplementary materials, we categorized subjects as non-responders if their SCR response to any CS was less than 0.02 in Day 1 (fear acquisition). Most of the discontinued participants (non-responders) in the no-reminder group (study 1) and in the 30min & 24h groups (study 2) were tested when the heating season had just ended or was yet to start, respectively. It has been documented that body thermal conditions are related to the quality of skin conductance response (SCR) measurements (Bauer et al., 2022, Vila, 2004). We suspect that the high number of non-responders in these groups might be related to body thermal conditions caused by the lack of central heating.
15C. "Post hoc t-tests showed that fear memories were resilient after regular extinction training, as demonstrated by the significant difference between fear recovery indexes of the CS+ and CS- for the no-reminder group (t26 = 7.441, P < 0.001; Fig. 1e), while subjects in the reminder group showed no difference of fear recovery between CS+ and CS- (t29 = 0.797, P = 0.432, Fig. 1e)."
***Is the fear recovery index shown in Figure 1E based on the results of the first test trial only? How can there have been a "significant difference between fear recovery indexes of the CS+ and CS- for the no-reminder group" when the difference in responding to the CS+ and CS- is used to calculate the fear recovery index shown in 1E? What are the t-tests comparing exactly, and what correction is used to account for the fact that they are applied post-hoc?
As we mentioned in the results section of the manuscript, the fear recovery index was defined as “the SCR difference between the first test trial and the last extinction trial of a specific CS”. We then calculated the “differential fear recovery index” (figure legends of Fig. 1e) between CS+ and CS- for both the reminder and no-reminder groups. The post-hoc t-tests were used to examine whether there was significant fear recovery (compared with 0) in the reminder (t<sub>29</sub> = 0.797, P = 0.432, Fig. 1e) and no-reminder (t<sub>26</sub> = 7.441, P < 0.001; Fig. 1e) groups. We realize that the Bonferroni correction was not described in the original manuscript and have added this description to the revision where applicable.
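The two definitions above can be sketched in a few lines. This is only an illustration under stated assumptions: the SCR values are made up, the function names are hypothetical, and the one-sample t statistic is the textbook formula rather than the authors' analysis code.

```python
from math import sqrt
from statistics import mean, stdev


def fear_recovery_index(first_test_scr, last_extinction_scr):
    """SCR on the first test trial minus SCR on the last extinction trial."""
    return first_test_scr - last_extinction_scr


def differential_fri(cs_plus_fri, cs_minus_fri):
    """Recovery index of a CS+ relative to the CS- baseline."""
    return cs_plus_fri - cs_minus_fri


def one_sample_t(values, mu=0.0):
    """Textbook one-sample t statistic against mu (df = n - 1)."""
    n = len(values)
    return (mean(values) - mu) / (stdev(values) / sqrt(n))


# Hypothetical data: (last extinction trial, first test trial) per participant.
cs_plus = [(0.10, 0.40), (0.05, 0.30), (0.20, 0.35), (0.15, 0.50)]
cs_minus = [(0.10, 0.20), (0.05, 0.10), (0.15, 0.20), (0.10, 0.15)]

diff_fri = [
    differential_fri(
        fear_recovery_index(test_p, ext_p),
        fear_recovery_index(test_m, ext_m),
    )
    for (ext_p, test_p), (ext_m, test_m) in zip(cs_plus, cs_minus)
]

# Tests whether the group's differential recovery differs from 0.
t_stat = one_sample_t(diff_fri)
```

Because the CS- index is subtracted within each participant, any non-specific change between extinction and test that affects both stimuli cancels out of the differential index.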
15D. "Finally, there is no statistical difference between the differential fear recovery indexes between CS+ in the reminder and no reminder groups (t55 = -2.022, P = 0.048; Fig. 1c, also see Supplemental Material for direct test for the test phase)."
***Is this statement correct - i.e., that there is no statistically significant difference in fear recovery to the CS+ in the reminder and no reminder groups? I'm sure that the authors would like to claim that there IS such a difference; but if such a difference is claimed, one would be concerned by the fact that it is coming through in an uncorrected t-test, which is the third one of its kind in this paragraph. What correction (for the Type 1 error rate) is used to account for the fact that the t-tests are applied post-hoc? And if no correction, why not?
We are sorry about the typo. The reviewer was correct that we meant to claim here that “… there is a significant difference between the differential fear recovery indexes between CS+ in the reminder and no-reminder groups (t<sub>55</sub> = -2.022, P = 0.048; Fig. 1e)”. Note that the t-test performed here was a confirmatory test following our two-way ANOVA with main effects of group (reminder vs. no-reminder) and time (last extinction trial vs. first test trial) on the differential CS SCR response (CS+ minus CS-), in which we found a significant group x time interaction effect (F<sub>1,55</sub> = 4.087, P = 0.048, η<sup>2</sup> = 0.069). The significant difference between the differential fear recovery indexes was simply a re-plot of the interaction effect mentioned above and therefore no correction for multiple comparisons is needed. We have reorganized the sequence of the sentences such that this t-test now directly follows the results of the ANOVA:
“The interaction effect was confirmed by the significant difference between the differential fear recovery indexes between CS1+ and CS2+ in the reminder and no-reminder groups (t<sub>55</sub> = -2.022, P = 0.048; Figure 1E, also see Supplemental Material for the direct test of the test phase).”
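The statement that this t-test re-expresses the interaction can be checked numerically. In a design with two groups and two time points, the interaction F (1 and N - 2 degrees of freedom) equals the square of the pooled-variance independent-samples t on the within-subject difference scores. The sketch below only plugs in the values reported above; the variable names and the rounding tolerance are assumptions for illustration.

```python
# Values reported in the response above.
t_reported = -2.022   # t with 55 degrees of freedom
f_reported = 4.087    # interaction F with (1, 55) degrees of freedom

# For a 2 (group) x 2 (time) mixed design, F_interaction = t ** 2.
f_implied = t_reported ** 2

# The reported statistics are rounded to three decimals, so allow a small gap.
rounding_tolerance = 0.005
is_consistent = abs(f_implied - f_reported) < rounding_tolerance
```

Since both statistics test the same contrast with the same degrees of freedom, they necessarily share one P value, which is why treating the t-test as an additional comparison would count the same test twice.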
15E. In study 2, why is responding to the CS- so high on the first test trial in Group 30 min? Is the change in responding to the CS- from the last extinction trial to the first test trial different across the three groups in this study? Inspection of the figure suggests that it is higher in Group 30 min relative to Groups 6 hours and 24 hours. If this is confirmed by the analysis, it has implications for the fear recovery index which is partly based on responses to the CS-. If not for differences in the CS- responses, Groups 30 minutes and 6 hours are otherwise identical.
Following the reviewer’s comments, we went back and calculated the mean SCR difference of CS- between the first test trial and the last extinction trial for all three studies (see Author response image 1 below). In study 1, there was no difference in the mean CS- SCR (between the first test trial and last extinction trial) between the reminder and no-reminder groups (Kruskal-Wallis test, panel a), though both groups showed significant fear recovery even in the CS- condition (Wilcoxon signed rank test, reminder: P = 0.0043, no-reminder: P = 0.0037). Next, we examined the mean SCR for CS- for the 30min, 6h and 24h groups in study 2 and found that there was indeed a group difference (one-way ANOVA, F<sub>2,76</sub> = 5.3462, P = 0.0067, panel b), suggesting that the CS- related SCR was influenced by the test time (30min, 6h or 24h). We also tested the CS- related SCR for the 4 groups in study 3 (where the test was conducted 1 hour after the retrieval-extinction training) and found that across TMS stimulation types (PFC vs. VER) and reminder types (reminder vs. no-reminder) the ANOVA yielded neither a main effect of TMS stimulation type (F<sub>1,71</sub> = 0.322, P = 0.572) nor a main effect of reminder type (F<sub>1,71</sub> = 0.0499, P = 0.824, panel c). We added the R-VER group results in study 3 (see panel c) to panel b and plotted the CS- SCR difference across 4 different test time points and found that CS- SCR decreased as the test-extinction delay increased (Jonckheere-Terpstra test, P = 0.00028). These results suggest a natural “forgetting” tendency for CS- related SCR and highlight the importance of having the CS- as a control condition against which the CS+ related SCR was compared.
Author response image 1.
15F. Was the 6-hour group tested at a different time of day compared to the 30-minute and 24-hour groups; and could this have influenced the SCRs in this group?
For the 30min and 24h groups, the test phase can be arranged in the morning, in the afternoon or at night. However, for the 6h group, the test phase was inevitably in the afternoon or at night since we wanted to exclude the potential influence of night sleep on the expression of fear memory (see Author response table 1 below). If we restricted the test time in the afternoon or at night for all three groups, then the timing of their extinction training was not matched.
Author response table 1.
Nevertheless, we also went back and examined the data for only those subjects tested in the afternoon or at night in the 30min and 24h groups, to match the 6h group, where all the subjects were tested either in the afternoon or at night. According to Author response table 1 above, we have 17 subjects for the 30min group (9 + 8), 18 subjects for the 24h group (9 + 9) and 26 subjects for the 6h group (12 + 14). As Author response image 2 shows, the SCR patterns in the fear acquisition, extinction and test phases were similar to the results presented in the original figure.
Author response image 2.
15G. Why is the range of scores in "thought control ability" different in the 30-minute group compared to the 6-hour and 24-hour groups? I am not just asking about the scale on the x-axis: I am asking why the actual distribution of the scores in thought control ability is wider for the 30-minute group?
We went back and tested whether the TCAQ score variance was the same across three groups. We found that there was a significant difference in the variance of the TCAQ score distribution across three groups (F<sub>2,155</sub> = 4.324, P = 0.015, Levene test). However, post-hoc analyses found that the variance of TCAQ was not significantly different between the 30min and 6h groups (F<sub>26,25</sub> = 0.4788, P = 0.0697), nor between the 30min and 24h groups (F<sub>26,25</sub> = 0.4692, P = 0.0625). To further validate our correlational results between the TCAQ score and the fear recovery index, we removed the TCAQ scores that were outside the TCAQ score range of the 6h & 24h groups from the 30min group (resulting in 4 “outlier” TCAQ scores in the 30min group, panel a in Author response image 3 below) and the Levene test confirmed that the variance of the TCAQ scores showed no difference across groups after removing the 4 “outlier” data points in the 30min group (F<sub>2,147</sub> = 0.74028, P = 0.4788). Even with the 4 “outliers” removed from the 30min group, the correlational analysis of the TCAQ scores and the fear recovery index still yielded a significant result in the 30min group (beta = -0.0148, t = -3.731, P = 0.0006, see panel b below), indicating that our results were not likely due to the inclusion of subjects with extreme TCAQ scores.
Author response image 3.
(16) During testing in each experiment, how were the various stimuli presented? That is, was the presentation order for the CS+ and CS- pseudorandom according to some constraint, as it had been in extinction? This information should be added to the method section.
We mentioned the order of the stimuli in the testing phase in the methods section “… For studies 2 & 3, …a pseudo-random stimulus order was generated for fear acquisition and extinction phases of three groups with the rule that no same trial-type (CS1+, CS2+ and CS-) repeated more than twice. In the test phase, to exclude the possibility that the difference between CS1+ and CS2+ was simply caused by the presentation sequence of CS1+ and CS2+, half of the participants completed the test phase using a pseudo-random stimuli sequence and the identities of CS1+ and CS2+ reversed in the other half of the participants.”
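One way to generate a sequence satisfying the rule quoted above (no trial type repeated more than twice in a row) is rejection sampling: shuffle the trial list and keep the first shuffle that meets the constraint. The sketch below is an assumption about how such a sequence could be produced, not the procedure used in the study; the trial counts, function names and seed are hypothetical.

```python
import random


def max_run_length(sequence):
    """Length of the longest run of identical consecutive items."""
    longest = run = 0
    previous = None
    for item in sequence:
        run = run + 1 if item == previous else 1
        longest = max(longest, run)
        previous = item
    return longest


def pseudo_random_order(trial_counts, max_repeat=2, seed=None, max_tries=10000):
    """Shuffle trials until no trial type occurs more than max_repeat times in a row."""
    rng = random.Random(seed)
    trials = [name for name, count in trial_counts.items() for _ in range(count)]
    for _ in range(max_tries):
        rng.shuffle(trials)
        if max_run_length(trials) <= max_repeat:
            return list(trials)
    raise RuntimeError("no valid order found; relax the constraint or raise max_tries")


# Hypothetical trial counts per stimulus type.
counts = {"CS1+": 10, "CS2+": 10, "CS-": 10}
order = pseudo_random_order(counts, seed=0)

# Counterbalanced version: swap the identities of CS1+ and CS2+.
swap = {"CS1+": "CS2+", "CS2+": "CS1+", "CS-": "CS-"}
reversed_order = [swap[name] for name in order]
```

Swapping the two CS+ labels leaves the run structure unchanged, so the counterbalanced sequence satisfies the same constraint without a second search.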
(17) "These results are consistent with previous research which suggested that people with better capability to resist intrusive thoughts also performed better in motivated dementia in both declarative and associative memories."
***Which parts of the present results are consistent with such prior results? It is not clear from the descriptions provided here why thought control ability should be related to the present findings or, indeed, past ones in other domains. This should be elaborated to make the connections clear.
In the 30min group, we found that subjects’ TCAQ scores were negatively correlated with their fear recovery indices. That is, people with a better capacity to resist intrusive thoughts were also less likely to experience the return of fear memory, which is consistent with previous results. Together with our brain stimulation results, this suggests that the short-term amnesia is related to subjects’ cognitive control ability and intact dlPFC function. It is because of these similarities that we propose that the short-term amnesia might be related to the automatic memory suppression mechanism described in declarative memory research. Since we had not yet provided all the evidence at this point of the results section, we only briefly listed the connections with previous declarative and associative memory research.
Reviewer #2 (Public Review):
The fear acquisition data is converted to a differential fear SCR and this is what is analysed (early vs late). However, the figure shows the raw SCR values for CS+ and CS- and therefore it is unclear whether the acquisition was successful (despite there being an "early" vs "late" effect - no descriptives are provided).
As the reviewer mentioned, the fear acquisition data was converted to a differential fear SCR and we conducted a two-way mixed ANOVA of group (reminder vs. no-reminder) x time (early vs. late part of fear acquisition) on the differential SCRs. We found a significant main effect of time (early vs. late; F<sub>1,55</sub> = 6.545, P = 0.013, η<sup>2</sup> = 0.106), suggesting successful fear acquisition in both groups. Fig. 1c also showed the mean differential SCR for the latter half of the acquisition phase in both the reminder and no-reminder groups and there was no significant difference in acquired SCRs between groups (early acquisition: t<sub>55</sub> = -0.063, P = 0.950; late acquisition: t<sub>55</sub> = -0.318, P = 0.751; Fig. 1c).
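For readers checking the effect size, the reported η<sup>2</sup> is consistent with partial eta squared recovered from the F statistic and its degrees of freedom. That it is the partial form is an assumption here, since the response does not say so; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic: F*df1 / (F*df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of time reported above: F(1, 55) = 6.545.
eta_time = partial_eta_squared(6.545, 1, 55)
eta_time_rounded = round(eta_time, 3)
```

The rounded value can be compared directly with the η<sup>2</sup> reported for the main effect of time.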
In Experiment 1 (Test results) it is unclear whether the main conclusion stems from a comparison of the test data relative to the last extinction trial ("we defined the fear recovery index as the SCR difference between the first test trial and the last extinction trial for a specific CS") or the difference relative to the CS- ("differential fear recovery index between CS+ and CS-"). It would help the reader assess the data if Figure 1e presents all the indexes (both CS+ and CS-). In addition, there is one sentence that I could not understand "there is no statistical difference between the differential fear recovery indexes between CS+ in the reminder and no reminder groups (P=0.048)". The p-value suggests that there is a difference, yet it is not clear what is being compared here. Critically, any index taken as a difference relative to the CS- can indicate recovery of fear to the CS+ or absence of discrimination relative to the CS-, so ideally the authors would want to directly compare responses to the CS+ in the reminder and no-reminder groups. The latter issue is particularly relevant in Experiment 2, in which the CS- seems to vary between groups during the test and this can obscure the interpretation of the result.
In all the experiments, the fear recovery index (FRI) was defined as the SCR difference between the first test trial and the last extinction trial for any CS. Subsequently, the differential FRI was defined as the difference between the FRI of a specific CS+ and the FRI of the CS-. The differential FRI effectively removes the non-specific time-related effect (using the CS- FRI as the baseline). We have revised the text accordingly.
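To make the two definitions concrete, the following is a minimal sketch of how the FRI and the differential FRI could be computed from trial-wise SCRs. The function names and the SCR values are hypothetical and purely illustrative; they are not taken from the study's data or analysis scripts.

```python
from typing import Sequence


def fear_recovery_index(extinction_scr: Sequence[float],
                        test_scr: Sequence[float]) -> float:
    """FRI for one CS: first test-trial SCR minus last extinction-trial SCR."""
    return test_scr[0] - extinction_scr[-1]


def differential_fri(cs_plus_fri: float, cs_minus_fri: float) -> float:
    """Differential FRI: CS+ FRI minus CS- FRI (CS- serves as the baseline)."""
    return cs_plus_fri - cs_minus_fri


# Made-up trial-wise SCRs (in microsiemens) for a single participant.
cs_plus_extinction = [0.30, 0.20, 0.10]
cs_plus_test = [0.40, 0.25]
cs_minus_extinction = [0.15, 0.10, 0.05]
cs_minus_test = [0.15, 0.10]

fri_plus = fear_recovery_index(cs_plus_extinction, cs_plus_test)
fri_minus = fear_recovery_index(cs_minus_extinction, cs_minus_test)
diff_fri = differential_fri(fri_plus, fri_minus)
```

Under this sketch, a differential FRI near zero means that the CS+ changed from the last extinction trial to the first test trial by about as much as the CS- did.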
As we responded to reviewer #1, the CS- fear recovery indices (FRI) for the reminder and no-reminder groups were not statistically different (Kruskal-Wallis test, panel a, Author response image 1), though both groups showed significant fear recovery even in the CS- condition (Wilcoxon signed rank test, reminder: P = 0.0043, no-reminder: P = 0.0037, panel a). Next, we examined the mean SCR for CS- for the 30min, 6h and 24h groups in study 2 and found that there was indeed a group difference (one-way ANOVA, F<sub>2,76</sub> = 5.3462, P = 0.0067, panel b), suggesting that the CS- SCR was influenced by the test time delay. We also tested the CS- SCR for the 4 groups in study 3 and found that, across TMS stimulation types (PFC vs. VER) and reminder types (reminder vs. no-reminder), the ANOVA yielded neither a main effect of TMS stimulation type (F<sub>1,71</sub> = 0.322, P = 0.572) nor a main effect of reminder type (F<sub>1,71</sub> = 0.0499, P = 0.824, panel c). We added the R-VER group results in study 3 (see panel c) to panel b, plotted the CS- SCR difference across the 4 different test time points, and found that the CS- SCR decreased as the test-extinction delay increased (Jonckheere-Terpstra test, P = 0.00028). These results suggest a natural “forgetting” tendency for the CS- fear recovery index and highlight the importance of having the CS- as a control condition against which to compare the CS+ recovery index (resulting in the differential recovery index). Parametric and non-parametric analyses were adopted based on whether the data met the assumptions for the parametric analyses.
In Experiment 1, the findings suggest that there is a benefit of retrieval followed by extinction in a short-term reinstatement test. In Experiment 2, the same effect is observed on a cue that did not undergo retrieval before extinction (CS2+), a result that is interpreted as resulting from cue-independence, rather than a failure to replicate in a within-subjects design the observations of Experiment 1 (between-subjects). Although retrieval-induced forgetting is cue-independent (the effect on items that are suppressed [Rp-] can be observed with an independent probe), it is not clear that the current findings are similar. Here, both cues have been extinguished and therefore been equally exposed during the critical stage.
We appreciate the reviewer’s insight on this issue. Although in the discussion we raised the possibility of memory suppression to account for the short-term amnesia effect, we did not intend to compare our paradigm side-by-side with retrieval-induced forgetting. In our previous work (Wang et al., 2021), we reported that the effect of actively suppressing CS+ related fear memory during standard extinction training generalized to other CS+, yielding a cue-independent effect. In the current experiments, we did not implement active suppression; instead, we used the CS+ retrieval-extinction paradigm. It is thus possible that the CS+ retrieval cue may function to facilitate automatic suppression. Indeed, in the no-reminder group (standard extinction) of study 1, we did observe the return of fear expression, suggesting a critical role of the CS+ reminder before the extinction training. Based on the results mentioned above, we believe our short-term amnesia results are consistent with the hypothesis that the retrieval CS+ (reminder) might prompt subjects to adopt an automatic suppression mechanism in the following extinction training, yielding cue-independent amnesia effects.
The findings in Experiment 2 suggest that the amnesia reported in Experiment 1 is transient, in that no effect is observed when the test is delayed by 6 hours. The phenomena whereby reactivated memories transition to extinguished memories as a function of the amount of exposure (or number of trials) is completely different from the phenomena observed here. In the former, the manipulation has to do with the number of trials (or the total amount of time) that the cues are exposed to. In the current study, the authors did not manipulate the number of trials but instead the retention interval between extinction and test. The finding reported here is closer to a "Kamin effect", that is the forgetting of learned information which is observed with intervals of intermediate length (Baum, 1968). Because the Kamin effect has been inferred to result from retrieval failure, it is unclear how this can be explained here. There needs to be much more clarity on the explanations to substantiate the conclusions.
Indeed, in our studies, we did not manipulate the amount of exposure (or number of trials) but only the retention interval between extinction and test. Our results demonstrated that the retrieval-extinction protocol yielded short-term amnesia for fear memory, qualitatively different from the reconsolidation-related amnesia proposed in the previous literature. After examining the temporal dynamics, cue-specificity and TCAQ association of the short-term amnesia, we speculated that the short-term effect might be related to an automatic suppression mechanism. Of course, further studies will be required to test such a hypothesis.
Our results might not be easily compared with the “Kamin effect”, a term coined to describe the “retention of a partially learned avoidance response over varying time intervals” using a learning-re-learning paradigm (Baum, 1968, Kamin, 1957). The retrieval-extinction procedure used in our studies was different from both the learning-re-learning paradigm in the original paper (Kamin, 1957) and the reversal-learning paradigm the reviewer mentioned (Baum, 1968).
There are many results (Ryan et al., 2015) that challenge the framework that the authors base their predictions on (consolidation and reconsolidation theory), therefore these need to be acknowledged. Similarly, there are reports that failed to observe the retrieval-extinction phenomenon (Chalkia et al., 2020), and the work presented here is written as if the phenomenon under consideration is robust and replicable. This needs to be acknowledged.
We thank the reviewer for pointing out the related literature and have added a separate paragraph about other results in the discussion (as well as citing relevant references in the introduction) to provide the audience with a full picture of the reconsolidation theory:
“It should be noted that while our long-term amnesia results were consistent with the fear memory reconsolidation literatures, there were also studies that failed to observe fear prevention (Chalkia, Schroyens, et al., 2020; Chalkia, Van Oudenhove, et al., 2020; Schroyens et al., 2023). Although the memory reconsolidation framework provides a viable explanation for the long-term amnesia, more evidence is required to validate the presence of reconsolidation, especially at the neurobiological level (Elsey et al., 2018). While it is beyond the scope of the current study to discuss the discrepancies between these studies, one possibility to reconcile these results concerns the procedure for the retrieval-extinction training. It has been shown that the eligibility for old memory to be updated is contingent on whether the old memory and new observations can be inferred to have been generated by the same latent cause (Gershman et al., 2017; Gershman and Niv, 2012). For example, prevention of the return of fear memory can be achieved through gradual extinction paradigm, which is thought to reduce the size of prediction errors to inhibit the formation of new latent causes (Gershman, Jones, et al., 2013). Therefore, the effectiveness of the retrieval-extinction paradigm might depend on the reliability of such paradigm in inferring the same underlying latent cause. Furthermore, other studies highlighted the importance of memory storage per se and suggested that memory retention was encoded in the memory engram cell ensemble connectivity whereas the engram cell synaptic plasticity is crucial for memory retrieval (Ryan et al., 2015; Tonegawa, Liu, et al., 2015; Tonegawa, Pignatelli, et al., 2015). It remains to be tested how the cue-independent short-term and cue-dependent long-term amnesia effects we observed could correspond to the engram cell synaptic plasticity and functional connectivity among engram cell ensembles (Figure 6). 
This is particularly important, since the cue-independent characteristic of the short-term amnesia suggest that either different memory cues fail to evoke engram cell activities, or the retrieval-extinction training transiently inhibits connectivity among engram cell ensembles. Finally, SCR is only one aspect of the fear expression, how the retrieval-extinction paradigm might affect subjects’ other emotional (such as the startle response) and cognitive fear expressions such as reported fear expectancy needs to be tested in future studies since they do not always align with each other (Kindt et al., 2009; Sevenster et al., 2012, 2013).”
The parallels between the current findings and the memory suppression literature are speculated in the general discussion, and there is the conclusion that "the retrieval-extinction procedure might facilitate a spontaneous memory suppression process". Because one of the basic tenets of the memory suppression literature is that it reflects an "active suppression" process, there is no reason to believe that in the current paradigm, the same phenomenon is in place, but instead, it is "automatic". In other words, the conclusions make strong parallels with the memory suppression (and cognitive control) literature, yet the phenomena that they observed are thought to be passive (or spontaneous/automatic).
Ultimately, it is unclear why 10 mins between the reminder and extinction learning will "automatically" suppress fear memories. Further down in the discussion, it is argued that "For example, in the well-known retrieval-induced forgetting (RIF) phenomenon, the recall of a stored memory can impair the retention of related long-term memory and this forgetting effect emerges as early as 20 minutes after the retrieval procedure, suggesting memory suppression or inhibition can occur in a more spontaneous and automatic manner". I did not follow with the time delay between manipulation and test (20 mins) would speak about whether the process is controlled or automatic.
In our previous research, we showed that the memory suppression instruction together with the extinction procedure successfully prevented the return of fear expression in the reinstatement test trials 30 min after the extinction training (Wang et al., 2021). In the current experiments, we replaced the suppression instruction with the retrieval cue before the extinction training (retrieval-extinction protocol) and observed similar short-term amnesia effects. These results prompted us to hypothesize in the discussion that the retrieval cue might facilitate an automatic suppression process. We made the analogy to the RIF phenomenon in the discussion to suggest that the suppression of (competing) memories could be unintentional and fast (20 min), both of which are consistent with our results. We agree with the reviewer that this hypothesis is more of a speculation (hence its place in the discussion), and more studies are required to test it further. However, what we want to emphasize in this paper is the report of the short-term amnesia effects, which clearly differed from the memory reconsolidation effect in a variety of aspects.
Among the many conclusions, one is that the current study uncovers the "mechanism" underlying the short-term effects of retrieval extinction. There is little in the current report that uncovers the mechanism, even in the most psychological sense of the mechanism, so this needs to be clarified. The same applies to the use of "adaptive".
Whilst I could access the data on the OSF site, I could not make sense of the Matlab files as there is no signposting indicating what data is being shown in the files. Thus, as it stands, there is no way of independently replicating the analyses reported.
We have re-organized the data on the OSF site, and they should be accessible now.
The supplemental material shows figures with all participants, but only some statistical analyses are provided, and sometimes these are different from those reported in the main manuscript. For example, the test data in Experiment 1 is analysed with a two-way ANOVA with the main effects of group (reminder vs no-reminder) and time (last trial of extinction vs first trial of the test) in the main report. The analyses with all participants in the sup mat used a mixed two-way ANOVA with a group (reminder vs no reminder) and CS (CS+ vs CS-). This makes it difficult to assess the robustness of the results when including all participants. In addition, in the supplementary materials, there are no figures and analyses for Experiment 3.
We are sorry for the lack of clarity in the supplementary materials. Supplementary Figs. S1 & S2 show the data re-analysis with all the responders (learners + non-learners). The statistical analyses performed on the responders in both figures yielded results similar to those in the main text. For other analyses reported in the supplementary materials, we deliberately provided different analyses to demonstrate the robustness of our results. For example, to rule out the possibility that the effects observed in the two-way ANOVA in the main text were driven by different SCR responses on the last extinction trial, we ran the two-way ANOVA on the first-trial SCR of the test phase only, and these analyses provided similar results. Please note that we did not include non-learners in these analyses (the text of the supplementary materials).
Since we did not exclude any non-learners in study 3, all the results were already reported in the main text.
One of the overarching conclusions is that the "mechanisms" underlying reconsolidation (long term) and memory suppression (short term) phenomena are distinct, but memory suppression phenomena can also be observed after a 7-day retention interval (Storm et al., 2012), which then questions the conclusions achieved by the current study.
As we stated before, the focus of the manuscript was to demonstrate a novel short-term fear amnesia effect following the retrieval-extinction procedure. We discussed memory suppression as one of the potential mechanisms for such a short-term effect. In fact, the durability of the memory suppression effect is still under debate. Although Storm et al. (2012) suggested that retrieval-induced forgetting can persist for as long as a week, other studies failed to observe long-term forgetting (after 24 hrs; Carroll et al., 2007; Chan, 2009). It is also worth noting that Storm et al. (2012) tested RIF one week later using half of the items, the other half having been tested 5 minutes after the retrieval practice. Therefore, it can be argued that the long-term RIF effect may be contaminated by the test/re-test process on the same set of (albeit different) items at different time points (5 min and 1 week).
Reviewer #3 (Public Review):
(1) The entire study hinges on the idea that there is memory 'suppression' if (1) the CS+ was reminded before extinction and (2) the reinstatement and memory test takes place 30 minutes later (in Studies 1 & 2). However, the evidence supporting this suppression idea is not very strong. In brief, in Study 1, the effect seems to only just reach significance, with a medium effect size at best, and, moreover, it is unclear if this is the correct analysis (which is a bit doubtful, when looking at Figure 1D and E). In Study 2, there was no optimal control condition without reminder and with the same 30-min interval (which is problematic, because we can assume generalization between CS1+ and CS2+, as pointed out by the authors, and because generalization effects are known to be time-dependent). Study 3 is more convincing, but entails additional changes in comparison with Studies 1 and 2, i.e., applications of cTBS and an interval of 1 hour instead of 30 minutes (the reason for this change was not explained). So, although the findings of the 3 studies do not contradict each other and are coherent, they do not all provide strong evidence for the effect of interest on their own.
Related to the comment above, I encourage the authors to double-check if this statement is correct: "Also, our results remain robust even with the "non-learners" included in the analysis (Fig. S1 in the Supplemental Material)". The critical analysis for Study 1 is a between-group comparison of the CS+ and CS- during the last extinction trial versus the first test trial. This result only just reached significance with the selected sample (p = .048), and Figures 1D and E even seem to suggest otherwise. I doubt that the analysis would reach significance when including the "non-learners" - assuming that this is what is shown in Supplemental Figure 1 (which shows the data from "all responded participants").
Our subjects were categorized based on the criteria specified in supplementary table S1. More specifically, we excluded the non-responders (mean CS SCR < 0.02 μS in the fear acquisition phase) and the non-learners, and focused our analyses on the learners. Non-responders were dismissed after day 1 (the day of fear acquisition), but both learners and non-learners finished the experiments. This gave us the opportunity to examine data for both the learners and the responders (learners + non-learners). What we showed in Fig. 1D and E were the differential SCRs (CS+ minus CS-) of the last extinction trials and the differential fear recovery indices (CS+ minus CS-), respectively. We have double-checked the figures, and both the learners (Fig. 1) and the responders (i.e. learners and non-learners, supplementary Fig. 1) showed significant differences between the reminder and no-reminder groups on the differential fear recovery index.
Also related to the comment above, I think that the statement "suggesting a cue-independent short-term amnesia effect" in Study 2 is not correct and should read: "suggesting extinction of fear to the CS1+ and CS2+", given that the response to the CS+'s is similar to the response to the CS-, as was the case at the end of extinction. Also the next statement "This result indicates that the short-term amnesia effect observed in Study 2 is not reminder-cue specific and can generalize to the non-reminded cues" is not fully supported by the data, given the lack of an appropriate control group in this study (a group without reinstatement). The comparison with the effect found in Study 1 is difficult because the effect found there was relatively small (and may have to be double-checked, see remarks above), and it was obtained with a different procedure using a single CS+. The comparison with the 6-h and 24-h groups of Study 2 is not helpful as a control condition for this specific question (i.e., is there reinstatement of fear for any of the CS+'s) because of the large procedural difference with regard to the intervals between extinction and reinstatement (test).
In Fig. 2e, we showed the differential fear recovery indices (FRI) for the CS+ in all three groups. Since the FRI was calculated as the SCR difference between the first test trial and the last extinction trial for any CS, a differential FRI (difference between CS+ FRI and CS- FRI) that is not significantly different from 0 should be interpreted as a lack of fear expression in the test phase. Since spontaneous recovery, reinstatement and renewal are considered canonical phenomena demonstrating that extinction training does not really “erase” the conditioned fear response, a no-reinstatement group added as a control condition would effectively work as a spontaneous recovery group, and the comparison between the reinstatement and no-reinstatement groups would turn into a test of the difference in fear recovery using different methods (reinstatement vs. spontaneous recovery).
(2) It is unclear which analysis is presented in Figure 3. According to the main text, it either shows the "differential fear recovery index between CS+ and CS-" or "the fear recovery index of both CS1+ and CS2+". The authors should clarify what they are analyzing and showing, and clarify to which analyses the ** and NS refer in the graphs. I would also prefer the X-axes and particularly the Y-axes of Fig. 3a-b-c to be the same. The image is a bit misleading now. The same remarks apply to Figure 5.
We are sorry about the lack of clarity here. Figures 3 & 5 show the correlational analyses between TCAQ and the differential fear recovery index (FRI) between CS+ and CS-, that is, the differential FRI of CS1+ (CS1+ FRI minus CS- FRI) and the differential FRI of CS2+ (CS2+ FRI minus CS- FRI).
We have rescaled both X and Y axes for figures 3 & 5 (please see the revised figures).
(3) In general, I think the paper would benefit from being more careful and nuanced in how the literature and findings are represented. First of all, the authors may be more careful when using the term 'reconsolidation'. In the current version, it is put forward as an established and clearly delineated concept, but that is not the case. It would be useful if the authors could change the text in order to make it clear that the reconsolidation framework is a theory, rather than something that is set in stone (see e.g., Elsey et al., 2018 (https://doi.org/10.1037/bul0000152), Schroyens et al., 2022 (https://doi.org/10.3758/s13423-022-02173-2)).
In addition, the authors may want to reconsider if they want to cite Schiller et al., 2010 (https://doi.org/10.1038/nature08637), given that the main findings of this paper, nor the analyses could be replicated (see, Chalkia et al., 2020 (https://doi.org/10.1016/j.cortex.2020.04.017; https://doi.org/10.1016/j.cortex.2020.03.031).
We thank the reviewer for these comments and have incorporated the mentioned papers into our revised manuscript by pointing out the ongoing debate surrounding the reconsolidation theory in the introduction:
“Pharmacological blockade of protein synthesis and behavioral interventions can both eliminate the original fear memory expression in the long-term (24 hours later) memory test ( Lee, 2008; Lee et al., 2017; Schiller et al., 2013; Schiller et al., 2010), resulting in the cue-specific fear memory deficit (Debiec et al., 2002; Lee, 2008; Nader, Schafe, & LeDoux, 2000). For example, during the reconsolidation window, retrieving a fear memory allows it to be updated through extinction training (i.e., the retrieval-extinction paradigm (Lee, 2008; Lee et al., 2017; Schiller et al., 2013; Schiller et al., 2010), but also see (Chalkia, Schroyens, et al., 2020; Chalkia, Van Oudenhove, et al., 2020; D. Schiller, LeDoux, & Phelps, 2020). ”
As well as in the discussion:
“It should be noted that while our long-term amnesia results were consistent with the fear memory reconsolidation literatures, there were also studies that failed to observe fear prevention (Chalkia, Schroyens, et al., 2020; Chalkia, Van Oudenhove, et al., 2020; Schroyens et al., 2023). Although the memory reconsolidation framework provides a viable explanation for the long-term amnesia, more evidence is required to validate the presence of reconsolidation, especially at the neurobiological level (Elsey et al., 2018). While it is beyond the scope of the current study to discuss the discrepancies between these studies, one possibility to reconcile these results concerns the procedure for the retrieval-extinction training. It has been shown that the eligibility for old memory to be updated is contingent on whether the old memory and new observations can be inferred to have been generated by the same latent cause (Gershman et al., 2017; Gershman and Niv, 2012). For example, prevention of the return of fear memory can be achieved through gradual extinction paradigm, which is thought to reduce the size of prediction errors to inhibit the formation of new latent causes (Gershman, Jones, et al., 2013). Therefore, the effectiveness of the retrieval-extinction paradigm might depend on the reliability of such paradigm in inferring the same underlying latent cause. Furthermore, other studies highlighted the importance of memory storage per se and suggested that memory retention was encoded in the memory engram cell ensemble connectivity whereas the engram cell synaptic plasticity is crucial for memory retrieval (Ryan et al., 2015; Tonegawa, Liu, et al., 2015; Tonegawa, Pignatelli, et al., 2015). It remains to be tested how the cue-independent short-term and cue-dependent long-term amnesia effects we observed could correspond to the engram cell synaptic plasticity and functional connectivity among engram cell ensembles (Figure 6). 
This is particularly important, since the cue-independent characteristic of the short-term amnesia suggest that either different memory cues fail to evoke engram cell activities, or the retrieval-extinction training transiently inhibits connectivity among engram cell ensembles. Finally, SCR is only one aspect of the fear expression, how the retrieval-extinction paradigm might affect subjects’ other emotional (such as the startle response) and cognitive fear expressions such as reported fear expectancy needs to be tested in future studies since they do not always align with each other (Kindt et al., 2009; Sevenster et al., 2012, 2013).”
Relatedly, it should be clarified that Figure 6 is largely speculative, rather than a proven model as it is currently presented. This is true for all panels, but particularly for panel c, given that the current study does not provide any evidence regarding the proposed reconsolidation mechanism.
We agree with the reviewer that Figure 6 is largely speculative. We realize that there are still debates regarding the retrieval-extinction procedure and the fear reconsolidation hypothesis. We have provided a more elaborated discussion and pointed out that figure 6 is only a working hypothesis and more work should be done to test such a hypothesis:
“Although mixed results have been reported regarding the durability of suppression effects in the declarative memory studies (Meier et al., 2011; Storm et al., 2012), future research will be needed to investigate whether the short-term effect we observed is specifically related to associative memory or the spontaneous nature of suppression (Figure 6C).”
Lastly, throughout the paper, the authors equate skin conductance responses (SCR) with fear memory. It should at least be acknowledged that SCR is just one aspect of a fear response, and that it is unclear whether any of this would translate to verbal or behavioral effects. Such effects would be particularly important for any clinical application, which the authors put forward as the ultimate goal of the research.
Again, we agree with the reviewer on this issue, and we have acknowledged that SCR is only one aspect of the fear response and caution should be exerted in clinical application:
“Finally, SCR is only one aspect of the fear expression, how the retrieval-extinction paradigm might affect subjects’ other emotional (such as the startle response) and cognitive fear expressions such as reported fear expectancy needs to be tested in future studies since they do not always align with each other (Kindt et al., 2009; Sevenster et al., 2012, 2013).”
(4) The Discussion quite narrowly focuses on a specific 'mechanism' that the authors have in mind. Although it is good that the Discussion is to the point, it may be worthwhile to entertain other options or (partial) explanations for the findings. For example, have the authors considered that there may be an important role for attention? When testing very soon after the extinction procedure (and thus after the reminder), attentional processes may play an important role (more so than with longer intervals). The retrieval procedure could perhaps induce heightened attention to the reminded CS+ (which could be further enhanced by dlPFC stimulation)?
We thank the reviewer for this suggestion and have added more discussion on the potential mechanisms involved. Unfortunately, the literature on attention and fear recovery is rather scarce, and an attentional account would be even more speculative given that our study design and results mainly concern subjects’ skin conductance responses (SCR).
(5) There is room for improvement in terms of language, clarity of the writing, and (presentation of the) statistical analyses, for all of which I have provided detailed feedback in the 'Recommendations for the authors' section. Idem for the data availability; they are currently not publicly available, in contrast with what is stated in the paper. In addition, it would be helpful if the authors would provide additional explanation or justification for some of the methodological choices (e.g., the 18-s interval and why stimulate 8 minutes after the reminder cue, the choice of stimulation parameters), and comment on reasons for (and implications of) the large amount of excluded participants (>25%).
We have addressed the data accessibility issue and added the justifications for the methodological choices as well as for the excluded participants. As we mentioned in the manuscript and the supplementary materials, adding the non-learners to the data analysis did not change the results. Since the non-responders discontinued after Day 1 due to their non-measurable spontaneous SCR signals towards the different CSs, it is hard to speculate whether or how the results might have changed. We note, however, that participant exclusion rates in SCR studies are relatively high (Hu et al., 2018, Liu et al., 2014, Raio et al., 2017, Schiller et al., 2010, Schiller et al., 2012, Wang et al., 2021). In our tasks, the non-responders were mostly participants tested in the winter. Cold weather and dry skin in the winter are likely to have made the SCR hard to measure (Bauer et al., 2022, Vila, 2004). Different intervals between the reinstating US (electric shock) and the test trials have been used in the previous literature, such as 10 min (Schiller et al., 2010, Schiller et al., 2013) and 18 or 19 s (Kindt and Soeter, 2018, Kindt et al., 2009, Wang et al., 2021). We kept the 18 s reinstatement interval in the current experiments. For the cTBS stimulation, since the stimulation itself lasted less than 2 min, we started the cTBS 8 min after the onset of the reminder cue to ensure that any effect caused by the cTBS stimulation occurred during the hypothesized time window in which the old fear memory becomes labile after memory retrieval. All the stimulation parameters were determined based on previous literature, which showed that transcranial magnetic stimulation (TMS) of the human dorsolateral prefrontal cortex could disrupt fear memory reconsolidation (Borgomaneri et al., 2020, Su et al., 2022).
Finally, I think several statements made in the paper are overly strong in light of the existing literature (or the evidence obtained here) or imply causal relationships that were not directly tested.
We have revised the texts accordingly.
Reviewer #2 (Recommendations For The Authors):
On numerous occasions there are typos and the autocorrect has changed "amnesia" for "dementia".
We are sorry about this mistake and have revised the text accordingly.
Reviewer #3 (Recommendations For The Authors):
*"Neither of the studies reported in this article was preregistered. The data for both studies are publicly accessible at https://osf.io/9agvk". This excerpt from the text suggests that there are 2 studies, but there are 3 in the paper. Also, the data are only accessible upon request, not publicly available. I haven't requested them, as this could de-anonymize me as a reviewer.
We apologize for the inaccessible link. The data should now be publicly available.
*Please refrain from causal interpretations when they are not supported by the data:
- Figure 3 "thought-control ability only affected fear recovery"; a correlation does not provide causal evidence.
- "establishing a causal link between the dlPFC activity and short-term fear amnesia." I feel this statement is too strong; to what extent do we know for sure what the applied stimulation of (or more correct: near) the dlPFC does exactly?
We thank the reviewer for the suggestion and have changed the wording related to Figure 3. However, we would argue that the causal relationship between dlPFC activity and short-term fear amnesia is supported by the results of study 3. Although the exact functional effect of TMS on the dlPFC can be debated, the fact that TMS over the dlPFC (compared with the vertex group) brought back the otherwise diminished fear memory expression can be viewed as causal evidence linking dlPFC activity to short-term fear amnesia.
*The text would benefit from language editing, as it contains spelling and grammar mistakes, as well as wording that is vague or inappropriate. I suggest the authors check the whole text, but below are already some excerpts that caught my eye:
"preludes memory reconsolidation"; "old fear memory can be updated"; "would cause short-term memory deficit"; "the its functional coupling"; "Subjects (...) yielded more severe amnesia in the memory suppression tasks"; "memory retrieval might also precipitate a short-term amnesia effect"; "more SEVERE amnesia in the memory suppression tasks"; "the effect size of reinstatement effect"; "the previous literatures"; "towards different CS"; "failed to show SCR response to the any stimuli"; "significant effect of age of TMS"; "each subject' left hand"; "latter half trials"; "Differntial fear recovery"; "fear dementia"; "the fear reinstatement effects at different time scale is related to"; "fear reocery index"; "thought-control abiliites"; "performed better in motivated dementia"; "we tested that in addition to the memory retrieval cue (reminder), whether the"; "during reconsolidation window"; "consisitent with the short-term dementia"; "low level of shock (5v)"
We thank the reviewer for the thorough reading and apologize for the typos in the manuscript. We have corrected all the typos and grammatical mistakes we could find.
*In line with the remark above, there are several places where the text could still be improved.
- The last sentence of the Abstract is rather vague and doesn't really add anything.
- Please reword or clarify: "the exact functional role played by the memory retrieval remains unclear".
- Please reword or clarify: "the unbinding of the old memory trace".
- "suggesting that the fear memory might be amenable to a more immediate effect, in addition to what the memory reconsolidation theory prescribes" shouldn't this rather read "in contrast with"?
We have modified the manuscript.
- In the Introduction, the authors state: "Specifically, memory reconsolidation effect will only be evident in the long-term (24h) memory test due to its requirement of new protein synthesis and is cue-dependent". They then continue about the more immediate memory update mechanisms that they want to study, but it is unclear from how the rationale is presented whether (and why (not)) they also expect this mechanism to be cue-dependent.
Most previous studies of fear memory reconsolidation that used the CS as the memory retrieval cue have demonstrated that the reconsolidation effect is cue-dependent (Kindt and Soeter, 2018, Kindt et al., 2009, Monfils et al., 2009, Nader et al., 2000, Schiller et al., 2013, Schiller et al., 2010, Xue et al., 2012). However, other studies using an unconditioned stimulus retrieval-extinction paradigm showed that this protocol was able to prevent the return of fear memory expression associated with different CSs (Liu et al., 2014, Luo et al., 2015). In our task, we used the CS+ as the memory retrieval cue, and our results were consistent with those of previous studies using similar paradigms.
- "The effects of cTBS over the right dlPFC after the memory reactivation were assessed using the similar mixed-effect four-way ANOVA". Please clarify what was analyzed here.
- "designing novel treatment of psychiatric disorders". Please make this more concrete or remove the statement.
This sentence came directly after a similar analysis described in the previous paragraph. While the previous paragraph focused on how the SCRs in the acquisition phase were modulated by factors such as CS+ (CS1+ and CS2+), reminder (reminder vs. no-reminder), cTBS site (right dlPFC vs. vertex) and trial number, this analysis focused instead on the SCRs in the extinction training phase. We have made the modifications as the reviewer suggested.
*I have several concerns related to the (presentation of the) statistical analyses/results:
- Some statistical analyses, as well as calculation of certain arbitrary indices (e.g., differential fear recovery index) are not mentioned nor explained in the Methods section, but only mentioned in the Results section.
We have added the explanation of the differential fear recovery index into the methods section:
“To measure the extent to which fear returns after the presentation of unconditioned stimuli (US, electric shock) in the test phase, we defined the fear recovery index as the SCR difference between the first test trial and the last extinction trial for a specific CS for each subject. Similarly, in studies 2 and 3, differential fear recovery index was defined as the difference between fear recovery indices of CS+ and CS- for both CS1+ and CS2+.”
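As an illustration of the two definitions quoted above, the following is a minimal sketch rather than the authors' analysis code. The function names and the SCR amplitudes are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def fear_recovery_index(last_extinction_scr: float, first_test_scr: float) -> float:
    """SCR on the first test trial minus SCR on the last extinction trial."""
    return first_test_scr - last_extinction_scr


def differential_fear_recovery_index(cs_plus: tuple, cs_minus: tuple) -> float:
    """Fear recovery index of a CS+ minus that of the CS-.

    Each argument is a (last_extinction_scr, first_test_scr) pair.
    """
    return fear_recovery_index(*cs_plus) - fear_recovery_index(*cs_minus)


# Hypothetical SCR amplitudes for one subject (arbitrary units, not study data).
scr = {
    "CS1+": (0.25, 0.75),
    "CS2+": (0.25, 0.5),
    "CS-": (0.25, 0.375),
}

# One differential index per CS+, each referenced to the same CS-.
diff_cs1 = differential_fear_recovery_index(scr["CS1+"], scr["CS-"])
diff_cs2 = differential_fear_recovery_index(scr["CS2+"], scr["CS-"])
print(diff_cs1, diff_cs2)
```

A positive differential index in this sketch means that the SCR to the CS+ recovered more after reinstatement than the SCR to the CS-.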
- Figure 1C-E: It is unclear what the triple *** mean. Do they have the same meaning in Figure 1C and Figure 1E? I am not sure that that makes sense. The meaning is not explained in the figure caption (I think it is different from the single asterisk*) and is not crystal clear from the main text either.
We explained the triple *** in the figure legend (Fig. 1): ***P < 0.001. The asterisk placed within each bar in Figure 1C-E indicates the statistical results of the post-hoc test of whether each bar was significant. For example, the *** placed inside bars in Figure 1E indicates that the differential fear recovery index is statistically significant in the no-reminder group (P < 0.001).
- Supplemental Figure 1: "with all responded participants" Please clarify how you define 'responded participants' and include the n's.
We presented the criteria for both the responder/non-responder and the learner/non-learner in the table of the supplementary materials and reported the number of subjects in each category (please see supplement Table 1).
- "the differential SCRs (difference between CS+ and CS-) for the CS+". Please clarify what this means and/or how it is calculated exactly.
Sorry, it means the difference between the SCRs invoked by CS+ and CS- for both CS1+ (CS1+ minus CS-) and CS2+ (CS2+ minus CS-).
*I suggest that the authors provide a bit more explanation about the thought-control ability questionnaire. For example, the type of items, etc, as this is not a very commonly used questionnaire in the fear conditioning field.
We provided a brief introduction to the thought-control ability questionnaire in the methods section:
“The control ability over intrusive thought was measured by the 25-item Thought-Control Ability Questionnaire (TCAQ) scale (30). Participants were asked to rate on a five-point Likert-type scale the extent to which they agreed with the statement from 1 (completely disagree) to 5 (completely agree). At the end of the experiments, all participants completed the TCAQ scale to assess their perceived control abilities over intrusive thoughts in daily life (17).”
We have added further description of the item types to the TCAQ scale.
*The authors excluded more than 25% of the participants. It would be interesting to hear reasons for this relatively large number and some reflection on whether they think this selection affects their results (e.g., could being a (non)responder in skin conductance influence the susceptibility to reactivation-extinction in some way?).
Participant exclusion rates in SCR studies have been relatively high (Hu et al., 2018, Liu et al., 2014, Raio et al., 2017, Schiller et al., 2010, Schiller et al., 2012, Wang et al., 2021). In our tasks, most non-responders were participants tested in the winter. Cold weather and dry skin in the winter are likely to have made the SCR hard to measure (Bauer et al., 2022, Vila, 2004).
*Minor comments that the authors may want to consider:
- Please explain abbreviations upon first use, e.g., TMS.
- In Figure 6, it is a bit counterintuitive that the right Y-axis goes from high to low.
We added the explanation of TMS:
“Continuous theta burst stimulation (cTBS), a specific form of repetitive transcranial magnetic stimulation (rTMS)…”
We apologize and agree that the right Y-axis is rather counterintuitive. However, since the fear recovery index (which is what we measured in the experiment) and the short/long-term amnesia effect run in opposite directions, plotting one index from low to high inevitably causes the other to go from high to low.
References:
Anderson, M. C. and Floresco, S. B. 2022. Prefrontal-hippocampal interactions supporting the extinction of emotional memories: The retrieval stopping model. Neuropsychopharmacology, 47, 180-195.
Anderson, M. C. and Green, C. 2001. Suppressing unwanted memories by executive control. Nature, 410, 366-9.
Bauer, E. A., Wilson, K. A. and Macnamara, A. 2022. 3.03 - cognitive and affective psychophysiology. In: ASMUNDSON, G. J. G. (ed.) Comprehensive clinical psychology (second edition). Oxford: Elsevier.
Baum, M. 1968. Reversal learning of an avoidance response and the kamin effect. J Comp Physiol Psychol, 66, 495-7.
Borgomaneri, S., Battaglia, S., Garofalo, S., Tortora, F., Avenanti, A. and Di Pellegrino, G. 2020. State-dependent tms over prefrontal cortex disrupts fear-memory reconsolidation and prevents the return of fear. Curr Biol, 30, 3672-3679.e4.
Cain, C. K., Blouin, A. M. and Barad, M. 2003. Temporally massed cs presentations generate more fear extinction than spaced presentations. J Exp Psychol Anim Behav Process, 29, 323-33.
Carroll, M., Campbell-Ratcliffe, J., Murnane, H. and Perfect, T. 2007. Retrieval-induced forgetting in educational contexts: Monitoring, expertise, text integration, and test format. European Journal of Cognitive Psychology, 19, 580-606.
Chan, J. C. K. 2009. When does retrieval induce forgetting and when does it induce facilitation? Implications for retrieval inhibition, testing effect, and text processing. Journal of Memory and Language, 61, 153-170.
Gagnepain, P., Henson, R. N. and Anderson, M. C. 2014. Suppressing unwanted memories reduces their unconscious influence via targeted cortical inhibition. Proc Natl Acad Sci U S A, 111, E1310-9.
Gershman, S. J., Jones, C. E., Norman, K. A., Monfils, M. H. and Niv, Y. 2013. Gradual extinction prevents the return of fear: Implications for the discovery of state. Front Behav Neurosci, 7, 164.
Gershman, S. J., Monfils, M. H., Norman, K. A. and Niv, Y. 2017. The computational nature of memory modification. Elife, 6.
Hu, J., Wang, W., Homan, P., Wang, P., Zheng, X. and Schiller, D. 2018. Reminder duration determines threat memory modification in humans. Sci Rep, 8, 8848.
Kamin, L. J. 1957. The retention of an incompletely learned avoidance response. J Comp Physiol Psychol, 50, 457-60.
Kindt, M. and Soeter, M. 2018. Pharmacologically induced amnesia for learned fear is time and sleep dependent. Nat Commun, 9, 1316.
Kindt, M., Soeter, M. and Vervliet, B. 2009. Beyond extinction: Erasing human fear responses and preventing the return of fear. Nat Neurosci, 12, 256-8.
Liu, J., Zhao, L., Xue, Y., Shi, J., Suo, L., Luo, Y., Chai, B., Yang, C., Fang, Q., Zhang, Y., Bao, Y., Pickens, C. L. and Lu, L. 2014. An unconditioned stimulus retrieval extinction procedure to prevent the return of fear memory. Biol Psychiatry, 76, 895-901.
Luo, Y.-X., Xue, Y.-X., Liu, J.-F., Shi, H.-S., Jian, M., Han, Y., Zhu, W.-L., Bao, Y.-P., Wu, P., Ding, Z.-B., Shen, H.-W., Shi, J., Shaham, Y. and Lu, L. 2015. A novel ucs memory retrieval-extinction procedure to inhibit relapse to drug seeking. Nature Communications, 6, 7675.
Monfils, M. H., Cowansage, K. K., Klann, E. and Ledoux, J. E. 2009. Extinction-reconsolidation boundaries: Key to persistent attenuation of fear memories. Science, 324, 951-5.
Nader, K., Schafe, G. E. and Le Doux, J. E. 2000. Fear memories require protein synthesis in the amygdala for reconsolidation after retrieval. Nature, 406, 722-6.
Raio, C. M., Hartley, C. A., Orederu, T. A., Li, J. and Phelps, E. A. 2017. Stress attenuates the flexible updating of aversive value. Proc Natl Acad Sci U S A, 114, 11241-11246.
Schiller, D., Kanen, J. W., Ledoux, J. E., Monfils, M. H. and Phelps, E. A. 2013. Extinction during reconsolidation of threat memory diminishes prefrontal cortex involvement. Proc Natl Acad Sci U S A, 110, 20040-5.
Schiller, D., Monfils, M. H., Raio, C. M., Johnson, D. C., Ledoux, J. E. and Phelps, E. A. 2010. Preventing the return of fear in humans using reconsolidation update mechanisms. Nature, 463, 49-53.
Schiller, D., Raio, C. M. and Phelps, E. A. 2012. Extinction training during the reconsolidation window prevents recovery of fear. J Vis Exp, e3893.
Su, S., Deng, J., Yuan, K., Gong, Y., Zhang, Y., Li, H., Cao, K., Huang, X., Lin, X., Wu, P., Xue, Y., Bao, Y., Shi, J., Shi, L. and Lu, L. 2022. Continuous theta-burst stimulation over the right dorsolateral prefrontal cortex disrupts fear memory reconsolidation in humans. iScience, 25, 103614.
Vila, J. 2004. Psychophysiological assessment. In: SPIELBERGER, C. D. (ed.) Encyclopedia of applied psychology. New York: Elsevier.
Wang, Y., Zhu, Z., Hu, J., Schiller, D. and Li, J. 2021. Active suppression prevents the return of threat memory in humans. Commun Biol, 4, 609.
Xue, Y. X., Luo, Y. X., Wu, P., Shi, H. S., Xue, L. F., Chen, C., Zhu, W. L., Ding, Z. B., Bao, Y. P., Shi, J., Epstein, D. H., Shaham, Y. and Lu, L. 2012. A memory retrieval-extinction procedure to prevent drug craving and relapse. Science, 336, 241-5.
Zhu, Z., Anderson, M. C. and Wang, Y. 2022. Inducing forgetting of unwanted memories through subliminal reactivation. Nature communications, 13, 6496-6496.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Recommendations for the authors:
Reviewer #1:
The authors have thoroughly changed the manuscript and addressed most of my concerns. I appreciate adding the activity assays of the C115/120S mutants, however, I suggest that the authors embed and also discuss these data more clearly. It also escaped my attention earlier that the positioning of the disulfide bond is 117-122 in the deposited PDBs instead of 115-120. The authors should carefully check which positioning is correct here.
We thank reviewer #1 for his or her careful assessment of our revised manuscript. As suggested, we expanded the results section “CrSBPase enzymatic activity” with additional numerical values, and discussed more clearly the comparison of the activity assay results for mutants C115S and C120S in the section “Oligomeric states of CrSBPase”. Residue numbering was carefully proofread throughout the manuscript for correctness and consistency. C115 and C120 are numbered according to the best consensus among databases, i.e. GenBank and UniProt; numbering may differ from one database to another (including the PDB) because of varying numbering rules. We clarified the chosen nomenclature in the methods section “Cloning and mutagenesis of CrSBPase expression plasmids”.
Line 246-250: I think it is evident that the two SBPase structures superpose well given the sequence identity of more than 70%. However, it would be great to include a superposition of the two structures in Figure 1, especially with regard to the region harboring C115 and C120.
We added a panel showing the superimposition of CrSBPase 7b2o and PpSBPase 5iz3, with a close-up view of the region around C115-C120, in supplementary figure 5. Given the density of information in figure 1, we prefer not to add further images to it. Supplementary figure 5 was initially intended to illustrate sequence conservation and variation among homologs, and therefore fits the objective of comparing past and present XRC results.
Line 255-266: I am again missing a panel in Figure 1 here, e.g. a side-by-side view of Xray vs AF2/3 structure.
We added another panel in supplementary figure 5 to visually compare side-by-side SBPase crystallographic structure 7b2o and our AF3 model. Again, for the sake of clarity we prefer not to overload figure 1 with additional panels. This will also enable thorough comparison of past XRC of PpSBPase, present XRC of CrSBPase, and various AF models (see below, oligomer comparisons).
Line 261-266: Did the authors predict dimers and tetramers using AF3? What are the confidence metrics in this case? Do the authors see differences to the monomer prediction in case a multimer is confidently predicted?
We modeled dimers and tetramers using AF3 and added them to supplementary figure 5, side by side with the protomer of XRC model 7b2o and with the monomer predicted by AF3. The color code for supplementary figure 5 panels F-H follows the standard AF representation of pLDDT. Per-residue confidence metrics correspond to very high reliability (navy blue) or, locally, confident prediction (cyan), and overall prediction scores range from pTM=0.85 to 0.91, indicating high-quality predictions. The interface prediction score is high for both the dimer (ipTM=0.9) and the tetramer (ipTM=0.82). We reported these data in supplementary figure 5 and the corresponding updated legend. XRC and AF models all align with RMSD<0.5 Å, indicating a globally unchanged structure of the protomer across the various methods and oligomeric states.
Line 441: How does the oligomeric equilibrium change in C115/120S mutants? This information should be added for the mutants. Besides, the mAU units in Fig. 6 could be normalized to allow an easier comparison between the chromatograms of wt and mutants.
The change in oligomeric equilibrium was assessed by size-exclusion chromatography of WT and mutants C115S and C120S, as reported in figure 6A. We estimated the equilibrium of WT and of the C115S and C120S mutants quantitatively by comparing maximal peak intensities, and added this information to the text. Briefly, the oligomer ratio on a scale of 100 is 9:48:43 for WT, 42:25:33 for mutant C115S, and 29:17:54 for mutant C120S (ratio expressed as tetramer:dimer:monomer). We prefer not to normalize the absorbance values, but rather to keep the actual measurement of absorbance at 280 nm on the chromatogram of figure 6, for consistency with the added text and for more transparent reporting of the experiment.
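The conversion from maximal peak intensities to a ratio on a scale of 100 can be sketched as follows. This is only an illustration of the arithmetic, not the authors' procedure: the function name is hypothetical, and the peak heights are invented values chosen so that the result matches the WT ratio quoted above.

```python
def oligomer_ratio(peak_heights: dict) -> dict:
    """Express each peak's maximal absorbance as a rounded share of 100."""
    total = sum(peak_heights.values())
    return {state: round(100 * height / total) for state, height in peak_heights.items()}


# Hypothetical maximal A280 peak heights (mAU), not measured values.
wt_peaks = {"tetramer": 18.0, "dimer": 96.0, "monomer": 86.0}

ratio = oligomer_ratio(wt_peaks)
# Report in the tetramer:dimer:monomer order used in the response.
print(":".join(str(ratio[state]) for state in ("tetramer", "dimer", "monomer")))
```

Because each share is rounded independently, the three numbers may not always sum to exactly 100 for other inputs.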
Line 447: WT activity is 12.15+-2.15 and both mutants have a higher activity. The authors should check if their values (96% and 107%) are correct. Besides, did the authors check if the increase in C120S is statistically significant? My impression is that both mutants have a higher activity than the wildtype, in both correlating with increased fractions of the tetramer. This would also make sense, as the corresponding region is part of the tetramer interface in the crystal packing.
The reported activity values were checked for correctness. The wild-type SBPase specific activity of 12.5 ± 2.15 µmol(NADPH) min<sup>-1</sup> mg(SBPase)<sup>-1</sup> was obtained by pre-incubating the enzyme with 1 µM CrTRXf2 supplemented with 1 mM DTT and 10 mM Mg<sup>2+</sup>, whereas the results in supplementary figure 14 comparing the activation of WT and mutants, with a variation of 107 or 96 %, were obtained with a slightly different protocol in which the enzyme was pre-incubated with 10 mM DTT and 10 mM Mg<sup>2+</sup>. Please note that whether the WT enzyme was assayed in 10 mM DTT and 10 mM Mg<sup>2+</sup> or in 1 µM TRX, 1 mM DTT and 10 mM Mg<sup>2+</sup>, its specific activity appears equal within experimental error. Both mutants have nearly the same activity as the WT in the assay reported in supplementary figure 14: we fully agree that the 107% (and 96%) variation is not significant considering the uncertainty of the measurement (see error bars representing standard deviations of the mean in supplementary figure 14). We added this important information to the text. Even though both mutations stabilize the most active tetramer in untreated recombinant protein, we think that after reducing treatment WT and mutants all reach the same maximal activity because they all form an equivalent proportion of the active tetramer versus alternative oligomeric states. We further interpret this result as a decoupling of reduction and catalysis: under physiological conditions, we assume that SBPase would initiate activation upon the reduction of disulfide bridges, including but not limited to C115-C120, which restricts entry into the fully active tetramer, at which point SBPase in its reduced form reaches maximal activity until another post-translational signal eventually changes its conformation and oligomerisation.
We thank again reviewer 1 for his or her assessment and valuable suggestions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
First, the authors confirm the up-regulation of the main genes involved in the three branches of the Unfolded Protein Response (UPR) system in diet-induced obese mice in AT, observations that have been extensively reported before. Not surprisingly, IRE1a inhibition with STF led to an amelioration of the obesity and insulin resistance of the animals. Moreover, non-alcoholic fatty liver disease was also improved by the treatment. More novel are their results in terms of thermogenesis and energy expenditure, where IRE1a seems to act via activation of brown AT. Finally, mice treated with STF exhibited significantly fewer metabolically active and M1-like macrophages in the AT compared to those under vehicle conditions. Overall, the authors conclude that targeting IRE1a has therapeutical potential for treating obesity and insulin resistance.
The study has some strengths, such as the detailed characterization of the effect of STF in different fat depots and a thorough analysis of macrophage populations. However, the lack of novelty in the findings somewhat limits the study's impact on the field.
We thank the reviewer for the appreciation of our findings. We would like to take this opportunity to highlight several novel aspects. First, we characterized the relationship between the newly discovered CD9<sup>+</sup> ATMs and the “M1-like” CD11c<sup>+</sup> ATMs. Second, we demonstrated that the M2 macrophage population was not reduced but instead increased in adipose tissue in obesity. Third, IRE1 inhibition does not improve thermogenesis by boosting the M2 population; instead, it suppresses pro-inflammatory macrophage populations, including the M1-like ATMs.
Reviewer #3 (Public review):
Summary:
The manuscript by Wu D. et al. explores an innovative approach in immunometabolism and obesity by investigating the potential of targeting macrophage Inositol-requiring enzyme 1α (IRE1α) in cases of overnutrition. Their findings suggest that pharmacological inhibition of IRE1α could influence key aspects such as adipose tissue inflammation, insulin resistance, and thermogenesis. Notable discoveries include the identification of High-Fat Diet (HFD)-induced CD9<sup>+</sup> Trem2+ macrophages and the reversal of metabolically active macrophages' activity with IRE1α inhibition using STF. These insights could significantly impact future obesity treatments.
Strengths:
The study's key strengths lie in its identification of specific macrophage subsets and the demonstration that inhibiting IRE1α can reverse the activity of these macrophages. This provides a potential new avenue for developing obesity treatments and contributes valuable knowledge to the field.
Weaknesses:
The research lacks an in-depth exploration of the broader metabolic mechanisms involved in controlling diet-induced obesity (DIO). Addressing this gap would strengthen the understanding of how targeting IRE1α might fit into the larger metabolic landscape.
We thank the reviewer for the appreciation of strengths in our manuscript. In particular, we appreciate the reviewer’s recommendation on the exploration of broader metabolic landscape, such as the effect of IRE1 inhibition on non-adipose tissue macrophages and metabolism. We agree that achieving these will certainly broaden the therapeutic potential of IRE1 inhibition to larger metabolic disorders and we will pursue these explorations in future studies.
Impact and Utility:
The findings have the potential to advance the field of obesity treatment by offering a novel target for intervention. However, further research is needed to fully elucidate the metabolic pathways involved and to confirm the long-term efficacy and safety of this approach. The methods and data presented are useful, but additional context and exploration are required for broader application and understanding.
Comments on revisions:
The author has revised the manuscript and addressed the most relevant comments raised by the reviewers. The paper is now significantly improved, though two minor issues remain.
(1) Studies were limited to male mice; this should be mentioned in the paper's Title.
Thank you for the comment. We have modified the title to indicate that only male mice were studied.
(2) Please include the sample size (n=) in all provided tables in the main manuscript and supplementary tables.
We have included the sample size in the main manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Bacterial effectors that interfere with the inner molecular workings of eukaryotic host cells are of great biological significance across disciplines. On the one hand they help us to understand the molecular strategies that bacteria use to manipulate host cells. On the other hand they can be used as research tools to reveal molecular details of the intricate workings of the host machinery that is relevant for the interaction/defence/symbiosis with bacteria. The authors investigate the function and biological impact of a rhizobial effector that interacts with and modifies, and curiously is modified by, legume receptors essential for symbiosis. The molecular analysis revealed a bacterial effector that cleaves a plant symbiosis signaling receptor to inhibit signaling and the host counterplay by phosphorylation via a receptor kinase. These findings have potential implications beyond bacterial interactions with plants.
Bao and colleagues investigated how rhizobial effector proteins can regulate the legume root nodule symbiosis. A rhizobial effector is described to directly modify symbiosis-related signaling proteins, altering the outcome of the symbiosis. Overall, the paper presents findings that will have a wide appeal beyond its primary field.
Out of 15 identified effectors from Sinorhizobium fredii, they focus on the effector NopT, which exhibits proteolytic activity and may therefore cleave specific target proteins of the host plant. They focus on two Nod factor receptors of the legume Lotus japonicus, NFR1 and NFR5, both of which were previously found to be essential for the perception of rhizobial nod factor, and the induction of symbiotic responses such as bacterial infection thread formation in root hairs and root nodule development (Madsen et al., 2003, Nature; Tirichine et al., 2003; Nature). The authors present evidence for an interaction of NopT with NFR1 and NFR5. The paper aims to characterize the biochemical and functional consequences of these interactions and the phenotype that arises when the effector is mutated.
Evidence is presented that in vitro NopT can cleave NFR5 at its juxtamembrane region. NFR5 appears also to be cleaved in vivo, and NFR1 appears to inhibit the proteolytic activity of NopT by phosphorylating NopT. When NFR5 and NFR1 are ectopically over-expressed in leaves of the non-legume Nicotiana benthamiana, they induce cell death (Madsen et al., 2011, Plant Journal). Bao et al. found that this cell death response is inhibited by the coexpression of nopT. Mutation of nopT alters the outcome of rhizobial infection in L. japonicus. These conclusions are well supported by the data.
The authors present evidence supporting the interaction of NopT with NFR1 and NFR5. In particular, there is solid support for cleavage of NFR5 by NopT (Figure 3) and the identification of NopT phosphorylation sites that inhibit its proteolytic activity (Figure 4C). Cleavage of NFR5 upon expression in N. benthamiana (Figure 3A) requires appropriate controls (inactive mutant versions) that have been provided, since Agrobacterium as a closely rhizobia-related bacterium might increase defense related proteolytic activity in the plant host cells.
We appreciate your recognition of the importance of appropriate controls in our experimental design. In response to your comments, we revised our manuscript to ensure that the figures and legends provide a clear description of the controls used. We also included a more detailed description of our experimental design in several places. In particular, we have highlighted the use of the protease-dead version of NopT as a control (NopT<sup>C93S</sup>). NFR5-GFP cleavage in N. benthamiana therefore clearly depended on the protease activity of NopT and not on Agrobacterium (Fig. 3A). In the revised text, we carefully revised the conclusion and do not conclude at this stage that NopT proteolyzes NFR5. However, our subsequent experiments, including in vitro experiments, clearly show that NopT is able to proteolyze NFR5.
Key results from N. benthamiana appear consistent with data from recombinant protein expression in bacteria. For the analysis in the host legume L. japonicus transgenic hairy roots were included. To demonstrate that the cleavage of NFR5 occurs during the interaction in plant cells the authors build largely on western blots. Regardless of whether Nicotiana leaf cells or Lotus root cells are used as the test platform, the Western blots indicate that only a small proportion of NFR5 is cleaved when co-expressed with nopT, and most of the NFR5 persists in its full-length form (Figures 3A-D). It is not quite clear how the authors explain the loss of NFR5 function (loss of cell death, impact on symbiosis), as a vast excess of the tested target remains intact. It is also not clear why a large proportion of NFR5 is unaffected by the proteolytic activity of NopT. This is particularly interesting in Nicotiana in the absence of Nod factor that could trigger NFR1 kinase activity.
Thank you for your comments regarding the cleavage of NFR5 by NopT and its functional implications. We acknowledge that our immunoblots indicate only a relatively small proportion of the NFR5 cleavage product. Possible explanations could be as follows:
(1) The presence of full-length NFR5 does not preclude a significant impact of NopT on the function of NFR5, as NopT is able to interact with NFR5. In other words, the NopT-NFR5 and NopT-NFR1 interactions at the plasma membrane might influence the function of the NFR1/NFR5 receptor without proteolytic cleavage of NFR5. In fact, protease-dead NopT<sup>C93S</sup> expressed in NGR234ΔnopT showed certain effects in L. japonicus (fewer infection foci were formed compared to NGR234ΔnopT; Fig. 5E). In this context, it is worth mentioning that the non-acylated NopT<sup>C93S</sup> (Fig. 1B) and NopT<sub>USDA257</sub> (Fig. 6B) proteins were unable to suppress NFR1/NFR5-induced cell death in N. benthamiana, but this could be explained by the lack of acylation and altered subcellular localization.
(2) In the cleavage assay, only small portion of NFR5 could be detected for cleavage by NopT. However, this cleavage might be sufficient to suppress signaling pathways, leading to the observed phenotypic changes (loss of cell death in N. benthamiana; altered infection in L. japonicus). We do believe this is a great point, therefore, we carefully revised the conclusion about this point. Throughout the paper, we stated that the cleavage of NFR5 suppresses symbiotic signaling but not disrupt the symbiotic signaling. We also removed the conclusion that cleavage of NFR5 by NopT results in the function loss of NFR5.
(3) N. benthamiana co-expressing NFR1/NFR5 leads to strong cell death, which suggest that the NFR1 kinase activity might be constitutively active even in the absence of Nod factors. But why co-expression of symbiotic receptor leads to cell death and how kinase activity is active in the absence of Nod factor are not clear, which is of great interest to be studied.
(4) The proteolytic activity of NopT may be reduced by the interaction of NopT with other proteins such as NFR1, which phosphorylates NopT and inactivates its protease activity.
In our revised manuscript version, we now provide quantitative data for the efficiency of NFR5 cleavage by NopT in the different expression systems used (Figure 3 and Supplemental Fig. 16). We have also improved our Discussion in this context.
Comments on latest version:
The presentation of the figures and the language has greatly improved and the specific mistakes pointed out in the last review have been corrected. I especially appreciate the new images used to illustrate the observed mutant phenotypes, which are much clearer and easier to understand. The pictures used to illustrate the mutant phenotypes seem to be of more comparable root regions than before. Overall, the requested changes have been implemented, with some exceptions described below.
• Figure 1: New representative images are shown for BAX1 and CERK1. These pictures are more consistent with the phenotype seen in other treatments, but since the data has not changed, I presume the data from leaf discs (where the leaf discs for these treatments looked very different) previously shown is still included. The criteria for what was considered cell death is in my opinion still not described in the legend. The cell death/total ratio has been added for all leaf discs, as requested.
Thank you so much for carefully pointing this out. Cell death in leaf discs results in the formation of necrotic plaques, which restrain pathogens within dead cells. These plaques commonly manifest as leaf dehydration, frequently accompanied by a translucent appearance. Brown and shriveled leaf discs serve as indicators of cell death. We have added these descriptions to the legend of Figure 1.
• Figure 2: the discussion of the figure now emphasizes direct protein interaction. There is still no size marker in 2D or a description of size in the figure legend, making it difficult to compare the result to Figure 3. If I understand the rebuttal comments correctly, there are other bands on the blot, including non-specific bands. This does not negate the need to include the full blot as a supplemental figure to show cleaved NFR5 as well as other bands. I do not see any other clarifications on this subject in the manuscript.
Thank you for your suggestion. In the revised manuscript, we have included the kDa range for all proteins detected in Figure 2D. The full blot of the Co-IP assay is shown in Fig. S2 (a new supplemental figure). Yes, we detected some smaller bands in the immunoblot, but we cannot draw a clear conclusion about what these bands are based on the current study. Interestingly, these smaller bands were immunoprecipitated by anti-FLAG beads, suggesting that they are truncated NFR5 peptides.
• Figure 5: From the pictures, it is now easier to understand what is meant by "infection foci". Although there is no description in the methods of how these were distinguished from infection threads, I believe the images are clear enough.
Thank you for your helpful comment. In the revised manuscript, we have added a description of this experiment to the Methods section and to the legend of Figure 5A.
• Figure 6: The changes in the discussion are appreciated, but panel E still misrepresents the evidence in the paper, as from the drawing it still seems that the cleaved NFR5 is somehow directly responsible for suppressing infection when this was not shown.
Thank you for your thoughtful comments. We appreciate your suggestion regarding the schematic model illustrating how cleavage of NFR5 suppresses rhizobial infection. In the revised manuscript, we have changed the model in Figure 6E.
Reviewer #2 (Public review):
Summary:
This manuscript presents data demonstrating NopT's interaction with Nod Factor Receptors NFR1 and NFR5 and its impact on cell death inhibition and rhizobial infection. The identification of a truncated NopT variant in certain Sinorhizobium species adds an interesting dimension to the study. These data try to bridge the gaps between classical Nod-factor-dependent nodulation and T3SS NopT effector-dependent nodulation in legume-rhizobium symbiosis. Overall, the research provides interesting insights into the molecular mechanisms underlying symbiotic interactions between rhizobia and legumes.
Strengths:
The manuscript nicely demonstrates NopT's proteolytic cleavage of NFR5, regulated by NFR1 phosphorylation, promoting rhizobial infection in L. japonicus. Intriguingly, authors also identify a truncated NopT variant in certain Sinorhizobium species, maintaining NFR5 cleavage but lacking NFR1 interaction. These findings bridge the T3SS effector with the classical Nod-factor-dependent nodulation pathway, offering novel insights into symbiotic interactions.
Weaknesses:
(1) In the previous study, when transiently expressed NopT alone in Nicotiana tobacco plants, proteolytically active NopT elicited a rapid hypersensitive reaction. However, this phenotype was not observed when expressing the same NopT in Nicotiana benthamiana (Figure 1A). Conversely, cell death and a hypersensitive reaction were observed in Figure S8. This raises questions about the suitability of the exogenous expression system for studying NopT proteolysis specificity.
We appreciate your attention to these plant-specific differences. Previous studies showed that NopT expressed in tobacco (N. tabacum) or in specific Arabidopsis ecotypes (with PBS1/RPS5 genes) causes rapid cell death (Dai et al. 2008; Khan et al. 2022). Khan et al. 2022 reported recently that cell death does not occur in N. benthamiana unless the leaves are transformed with PBS1/RPS5 constructs. Our data shown in Fig. S17 confirm these findings. As cell death is usually associated with induction of plant protease activities, we considered N. tabacum and A. thaliana plants unsuitable for testing NFR5 cleavage by NopT. In fact, no NopT/NFR5 experiments were performed with these plants in our study. In response to your comment, we now better describe the N. benthamiana expression system and cite the previous articles. Furthermore, we have revised the Discussion section to better emphasize effector-induced immunity in non-host plants and the negative effect of rhizobial effectors during symbiosis. Our revisions provide a clearer understanding of the advantages and limitations of the N. benthamiana expression system.
(2) NFR5 Loss-of-function mutants do not produce nodules in the presence of rhizobia in lotus roots, and overexpression of NFR1 and NFR5 produces spontaneous nodules. In this regard, if the direct proteolysis target of NopT is NFR5, one could expect the NGR234's infection will not be very successful because of the Native NopT's specific proteolysis function of NFR5 and NFR1. Conversely, in Figure 5, authors observed the different results.
Thank you for this comment, which points out that we did not address this aspect precisely enough in the original manuscript version. We improved our manuscript and now write that nfr1 and nfr5 mutants do not produce nodules (Madsen et al., 2003; Radutoiu et al., 2003) and that over-expression of either NFR1 or NFR5 can activate NF signaling, resulting in formation of spontaneous nodules in the absence of rhizobia (Ried et al., 2014). In fact, compared to the nopT knockout mutant NGR234ΔnopT, wildtype NGR234 (with NopT) is less successful in inducing infection foci in root hairs of L. japonicus (Fig. 5). With respect to formation of nodule primordia, we repeated our inoculation experiments with NGR234ΔnopT and wildtype NGR234 and also included a nopT over-expressing NGR234 strain into the analysis. Our data clearly showed that nodule primordium formation was negatively affected by NopT. The new data are shown in Fig. 5 of our revised version. Our data show that NGR234 infection is not really successful, especially when NopT is over-expressed. This is consistent with our observations that NopT targets Nod factor receptors in L. japonicus and inhibits NF signaling (NIN promoter-GUS experiments). Our findings indicate that NopT might be an “Avr effector” for L. japonicus. However, in other host plants of NGR234, NopT possesses a symbiosis-promoting role (Dai et al. 2008; Kambara et al. 2009). Such differences could be explained by different NopT targets in different plants (in addition to Nod factor receptors), which may influence the outcome of the infection process. Indeed, our work shows that NopT can interact with various kinase-dead LysM domain receptors, suggesting a role of NopT in suppression or activation of plant immunity responses depending on the host plant. 
We discuss such alternative mechanisms in our revised manuscript version and emphasize the need for further investigation to elucidate the precise mechanisms underlying the observed infection phenotype and the role of NopT in modulating symbiotic signaling pathways. In this context, we would also like to mention the new figures of our manuscript which are showing (i) the efficiency of NFR5 cleavage by NopT in different expression systems (Figure 3), (ii) the interaction between NopT<sup>C93S</sup> and His-SUMO-NFR5JM-GFP (Supplementary Fig. 5), and (iii) cleavage of His-SUMO-NFPJM-GFP by NopT (Supplementary Figs. S8 and S9).
(3) In Figure 6E, the model illustrates how NopT digests NFR5 to regulate rhizobia infection. However, it raises the question of whether it is reasonable for NGR234 to produce an effector that restricts its own colonization in host plants.
Thank you for mentioning this point. We are aware of the possible paradox that the broad-host-range strain NGR234 produces an effector that appears to restrict its infection of host plants. As mentioned in our answer to the previous comment, NopT could have additional functions beyond the regulation of Nod factor signaling. In our revised manuscript version, we have modified our text as follows:
(1) We mention the potential evolutionary aspects of NopT-mediated regulation of rhizobial infection and discuss the possibility that interactions between NopT and Nod factor receptors may have evolved to fine-tune Nod factor signaling to avoid rhizobial hyperinfection in certain host legumes.
(2) We also emphasize that the presence of NopT may confer selective advantages in other host plants than L. japonicus due to interactions with proteins related to plant immunity. Like other effectors, NopT could suppress activation of immune responses (suppression of PTI) or cause effector-triggered immunity (ETI) responses, thereby modulating rhizobial infection and nodule formation. Interactions between NopT and proteins related to the plant immune system may represent an important evolutionary driving force for host-specific nodulation and explain why the presence of NopT in NGR234 has a negative effect on symbiosis with L. japonicus but a positive one with other legumes.
(4) The failure to generate stable transgenic plants expressing NopT in Lotus japonicus is surprising, considering the manuscript's claim that NopT specifically proteolyzes NFR5, a major player in the response to nodule symbiosis, without being essential for plant development.
We also thank you for this comment. We have revised the Discussion section of our manuscript and now discuss our failure to generate stable transgenic L. japonicus plants expressing NopT. We observed that the protease activity of NopT in aerial parts of L. japonicus had a negative effect on plant development, whereas NopT expression in hairy roots was possible. Such differences may be explained by different NopT substrates in roots and aerial parts of the plant. In this context, we also discuss our finding that NopT not only cleaves NFR5 but is also able to proteolyze other proteins of L. japonicus such as LjLYS11, suggesting that NopT not only suppresses Nod factor signaling, but may also interfere with signal transduction pathways related to plant immunity. We speculate that, depending on the host legume species, NopT could suppress PTI or induce ETI, thereby modulating rhizobial infection and nodule formation.
Comments on revised version:
This version has effectively addressed most of my concerns. However, one key issue remains unresolved regarding the mechanism of NopT in regulating nodule symbiosis. Specifically, the explanation of how NopT catabolizes NFR5 to regulate symbiosis is still not convincing within the current framework of plant-microbe interaction, where plants are understood to genetically control rhizobial colonization.
While alternative regulatory mechanisms in plant-microbe interactions are plausible, the notion that the NGR234-secreted effector NopT could reduce its own infection by either suppressing plant immunity or degrading the symbiosis receptor remains unsubstantiated. I believe further revisions are needed in the discussion section to more clearly address and clarify these findings and any lingering uncertainties.
We appreciate your comments on how NopT-mediated cleavage of NFR5 regulates symbiosis. NopT belongs to the YopT family of pathogen effectors and also cleaves Arabidopsis AtLYK5 and L. japonicus LjLYS11, which trigger immune responses in plants. NFR5, AtLYK5, and LjLYS11 share a conserved amino acid motif in the juxtamembrane domain, which leads to cleavage of NFR5 by NopT during symbiosis. Moreover, in plant-microbe interactions, the effector HopB1 cleaves the immune co-receptor BAK1 at the kinase domain to inhibit plant defense. The effect of receptor cleavage may therefore be positive or negative. By suppressing symbiosis, NopT may prevent hyperinfection in specific interactions between rhizobia and legumes. In the Discussion of the revised manuscript, we explain more clearly why NopT could reduce its own infection by either suppressing plant immunity or cleaving the symbiosis receptor.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Evaluation of the author's responses to the reviewer comments during the first review round
Reviewer's Comment:
Regardless of whether Nicotiana leaf cells or Lotus root cells are used as the test platform, the Western blots indicate that only a small proportion of NFR5 is cleaved when co-expressed with NopT, and most of the NFR5 persists in its full-length form (Figures 3A-D). It is not quite clear how the authors explain the loss of NFR5 function (loss of cell death, impact on symbiosis), as a vast excess of the tested target remains intact. It is also not clear why a large proportion of NFR5 is unaffected by the proteolytic activity of NopT. This is particularly interesting in Nicotiana in the absence of Nod factor that could trigger NFR1 kinase activity.
Summary of response:
• NopT could be interfering with the NFR1/NFR5 complex without proteolytic cleavage
• The cleaved fraction may still be sufficient to disrupt signaling pathways
• Elevated abundance of NFR5 relative to WT levels
• Add quantitative data for efficiency of NFR5 cleavage in different systems
Evaluation of response:
• The quantification of NFR5 cleavage efficiency is welcome, and there is some discussion of the possible reasons for the large proportion of uncleaved NFR5. It is clear that there is a large difference in cleavage efficiency between L. japonicus roots and N. benthamiana.
• The data is shown as a bar plot. Given that only 3 biological replicates are used, the data points should be shown, and there is too little data to provide sensible error bars. It would be better to simply make a dot-plot and indicate the mean for each sample. However, the main aim of the comment is addressed.
Thank you for your constructive comments regarding Figure S16. In the revised manuscript, we now present these data as a dot plot.
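As an illustration of the presentation the reviewer asks for, the following minimal Python sketch computes a per-replicate cleavage fraction and the per-sample mean, which is the information a dot plot with a mean marker would show. It rests on assumptions: the band intensities are invented placeholder numbers rather than data from the manuscript, the sample names `system_A` and `system_B` are hypothetical, and defining cleavage efficiency as cleaved / (cleaved + full-length) signal is one common convention that may differ from the authors' quantification.

```python
from statistics import mean


def cleavage_fraction(cleaved, full_length):
    """Return the fraction of total NFR5 signal found in the cleaved band."""
    total = cleaved + full_length
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return cleaved / total


# Hypothetical band intensities (arbitrary units), NOT data from the study:
# three biological replicates per sample, each as (cleaved, full_length).
replicates = {
    "system_A": [(10.0, 90.0), (20.0, 80.0), (30.0, 70.0)],
    "system_B": [(40.0, 60.0), (50.0, 50.0), (60.0, 40.0)],
}


def summarize(data):
    """Keep every replicate value (the dots) plus the mean for each sample."""
    summary = {}
    for name, bands in data.items():
        points = [cleavage_fraction(c, f) for c, f in bands]
        summary[name] = {"points": points, "mean": mean(points)}
    return summary


summary = summarize(replicates)
for name, stats in summary.items():
    dots = ", ".join(f"{p:.2f}" for p in stats["points"])
    print(f"{name}: points = [{dots}]  mean = {stats['mean']:.2f}")
```

With only three replicates per sample, keeping the individual values next to the mean lets the reader judge the spread directly, without error bars estimated from very little data.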
Reviewer's Comment:
It is also difficult to evaluate how the ratios of cleaved and full-length protein change when different versions of NopT are present without a quantification of band strengths normalized to loading controls (Figure 3C, 3D, 3F). The same is true for the blots supporting NFR1 phosphorylation of NopT (Figure 4A).
Summary of response:
• Quantified proportion of cleaved and full length NFR5 in different systems (S14)
• Band strengths of immunoblots quantified (4B)
Evaluation of response:
• The quantification has been performed as requested and the data is shown as bar plots. This type of data is frequently displayed as part of the blot figure itself, printed under each respective lane, making it easier for the reader to connect the ratios to the band sizes. If data is shown in a plot, the data points should be shown on the plot, as described above.
Thank you for your constructive comments regarding Figure 3. In the revised manuscript, we have added the cleavage efficiency to Figures 3A-3D.
Reviewer's Comment:
Nodule primordia and infection threads are still formed when L. japonicus plants are inoculated with ∆nopT mutant bacteria, but it is not clear if these primordia are infected or develop into fully functional nodules (Figure 5). A quantification of the ratio of infected and non-infected nodules and primordia would reveal whether NopT is only active at the transition from infection focus to thread or perhaps also later in the bacterial infection process of the developing root nodule.
Summary of response:
• Additional experiments with NGR234 or NGR234ΔnopT mutants find no non-infected nodules (fig. 5)
Evaluation of response:
• The requested quantification has been done, although the support for the findings would be stronger if also mature nodules per plant were quantified and plotted. If non-infected nodules were neither present in NGR234 or NGR234ΔnopT, it would still be advisable to include images of cross-sections of the fully-developed nodules.
We appreciate your comments on the cross-sections of fully developed nodules. In the revised manuscript, we have added cross-section images of nodules in Figure S12.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors used a subset of a very large, previously generated 16S dataset to: (1) assess age-associated features; and (2) develop a fecal microbiome clock, based on an extensive longitudinal sampling of wild baboons for which near-exact chronological age is known. They further seek to understand deviation from age-expected patterns and uncover if and why some individuals have an older or younger microbiome than expected, and the health and longevity implications of such variation. Overall, the authors compellingly achieved their goals of discovering age-associated microbiome features and developing a fecal microbiome clock. They also showed clear and exciting evidence for sex and rank-associated variation in the pace of gut microbiome aging and impacts of seasonality on microbiome age in females. These data add to a growing understanding of modifiers of the pace of age in primates, and links among different biological indicators of age, with implications for understanding and contextualizing human variation. However, in the current version, there are gaps in the analyses with respect to the social environment, and in comparisons with other biological indicators of age. Despite this, I anticipate this work will be impactful, generate new areas of inquiry, and fuel additional comparative studies.
Thank you for the supportive comments and constructive reviews.
Strengths:
The major strengths of the paper are the size and sampling depth of the study population, including the ability to characterize the social and physical environments, and the application of recent and exciting methods to characterize the microbiome clock. An additional strength was the ability of the authors to compare and contrast the relative age-predictive power of the fecal microbiome clock to other biological methods of age estimation available for the study population (dental wear, blood cell parameters, methylation data). Furthermore, the writing and support materials are clear, informative and visually appealing.
Weaknesses:
It seems clear that more could be done in the area of drawing comparisons among the microbiome clock and other metrics of biological age, given the extensive data available for the study population. It was confusing to see this goal (i.e. "(i) to test whether microbiome age is correlated with other hallmarks of biological age in this population"), listed as a future direction, when the authors began this process here and have the data to do more; it would add to the impact of the paper to see this more extensively developed.
Comparing the microbiome clock to other metrics of biological age in our population is a high priority (these other metrics of biological age are in Table S5 and include epigenetic age measured in blood, the non-invasive physiology and behavior clock (NPB clock), dentine exposure, body mass index, and blood cell counts (Galbany et al. 2011; Altmann et al. 2010; Jayashankar et al. 2003; Weibel et al. 2024; Anderson et al. 2021)). However, we have opted to test these relationships in a separate manuscript. We made this decision because of the complexity of the analytical task: these metrics were not necessarily collected on the same subjects, and when they were, each metric was often measured at a different age for a given animal. Further, two of the metrics (microbiome clock and NPB clock) are measured longitudinally within subjects but on different time scales (the NPB clock is measured annually while microbiome age is measured in individual samples). The other metrics are cross-sectional. Testing the correlations between them will require exploration of how subject inclusion and time scale affect the relationships between metrics.
We now explain the complexity of this analysis in the discussion in lines 447-450. In addition, we have added the NPB clock (Weibel et al. 2024) to the text in lines 260-262 and to Table S5.
An additional weakness of the current set of analyses is that the authors did not explore the impact of current social network connectedness on microbiome parameters, despite the landmark finding from members of this authorship studying the same population that "Social networks predict gut microbiome composition in wild baboons" published here in eLife some years ago. While a mother's social connectedness is included as a parameter of early life adversity, overall the authors focus strongly on social dominance rank, without discussion of that parameter's impact on social network size or directly assessing it.
Thank you for raising this important point, which was not well explained in our manuscript. We find that the signatures of social group membership and social network proximity are only detectable in our population for samples collected close in time. All of the samples analyzed in Tung et al. 2015 (“Social networks predict gut microbiome composition in wild baboons”) were collected within six weeks of each other. By contrast, the data set analyzed here spans 14 years, with very few samples from close social partners collected close in time. Hence, the effects of social group membership and social proximity are weak or undetectable. We described these findings in Grieneisen et al. 2021 and Björk et al. 2022, and we now explain this logic on line 530, which states, “We did not model individual social network position because prior analyses of this data set find no evidence that close social partners have more similar gut microbiomes, probably because we lack samples from close social partners sampled close in time (Grieneisen et al. 2021; Björk et al. 2022).”
We do find small effects of social group membership, which is included as a random effect in our models of how each microbiome feature is associated with host age (line 529) and our models predicting microbiome Dage (line 606; Table S6).
Reviewer #2 (Public review):
Summary:
Dasari et al present an interesting study investigating the use of 'microbiota age' as an alternative to other measures of 'biological age'. The study provides several curious insights into biological aging. Although 'microbiota age' holds potential as a proxy of biological age, it comes with limitations considering the gut microbial community can be influenced by various non-age related factors, and various age-related stressors may not manifest in changes in the gut microbiota. The work would benefit from a more comprehensive discussion, that includes the limitations of the study and what these mean to the interpretation of the results.
We agree and have added text to the discussion that expands on the limitations of this study and what those limitations mean for the interpretation of the results. For instance, lines 395-400 read, “Despite the relative accuracy of the baboon microbiome clock compared to similar clocks in humans, our clock has several limitations. First, the clock’s ability to predict individual age is lower than for age clocks based on patterns of DNA methylation—both for humans and baboons (Horvath 2013; Marioni et al. 2015; Chen et al. 2016; Binder et al. 2018; Anderson et al. 2021). One reason for this difference may be that gut microbiomes can be influenced by several non-age-related factors, including social group membership, seasonal changes in resource use, and fluctuations in microbial communities in the environment”
In addition, lines 405-411 now read, “Third, the relationships between potential socio-environmental drivers of biological aging and the resulting biological age predictions were inconsistent. For instance, some sources of early life adversity were linked to old-for-age gut microbiomes (e.g., males born into large social groups), while others were linked to young-for-age microbiomes (e.g., males who experienced maternal social isolation or early life drought), or were unrelated to gut microbiome age (e.g., males who experienced maternal loss; any source of early life adversity in females).”
Strengths:
The dataset this study is based on is impressive, and can reveal various insights into biological ageing and beyond. The analysis implemented is extensive and high-level.
Weaknesses:
The key weakness is the use of microbiota age instead of e.g., DNA-methylation-based epigenetic age as a proxy of biological ageing, for reasons stated in the summary. DNA methylation levels can be measured from faecal samples, and as such epigenetic clocks too can be non-invasive. I will provide authors a list of minor edits to improve the read, to provide more details on Methods, and to make sure study limitations are discussed comprehensively.
Thank you for this point. In response, we have deleted the text from the discussion that stated that non-invasive sampling is an advantage of microbiome clocks. In addition, we now propose a non-invasive epigenetic clock from fecal samples as an important future direction for our population (see line 450).
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Abstract - The opening 2 sentences are not especially original or reflective of the potential value/ premise of the study. Members of this team have themselves measured variation in biological age in many different ways, and the implication that measuring a microbiome clock is easy or straightforward is not compelling. This paper is very interesting and provides unique insight, but I think overall there is a missed opportunity in the abstract to emphasize this, given the innovative science presented here. Furthermore, the last 2 sentences of the abstract are especially interesting - but missing a final statement on the broader significance of research outside of baboons.
We appreciate these comments and have revised the Abstract accordingly. The introductory sentences now read, “Mammalian gut microbiomes are highly dynamic communities that shape and are shaped by host aging, including age-related changes to host immunity, metabolism, and behavior. As such, gut microbial composition may provide valuable information on host biological age.” (lines 31-34). The last two sentences of the abstract now read, “Hence, in our host population, gut microbiome age largely reflects current, as opposed to past, social and environmental conditions, and does not predict the pace of host development or host mortality risk. We add to a growing understanding of how age is reflected in different host phenotypes and what forces modify biological age in primates.” (lines 40-43).
If possible, it would be highly useful to present some comments on concordance in patterns at different levels. Are all ASVs assessed at both the family and genus levels? Do they follow similar patterns when assessed at different levels? What can we learn about the system by looking at different levels of taxonomic assignment?
The section on relationships between host age and individual microbiome features is already lengthy, so we have not added an analysis of concordance between different taxonomic levels. However, we added a justification for why we tested for age signatures in different levels of taxa to line 171, which reads, “We tested these different taxonomic levels in order to learn whether the degree to which coarse and fine-grained designations categories were associated with host age.”
To calculate the delta age - please clarify if this was done at the level of years, as suggested in Figure 3C, or at the level of months or portion months, etc?
Delta age is measured in years. This is now clarified in lines 294, 295, and 578.
Spelling mistake in table S12, cell B4 (Octovber)
Thank you. This typo has been corrected.
Given the start intro with vertebrates, the second paragraph needs some tweaking to be appropriate. Perhaps, "At least among mammals, one valuable marker of biological aging may lie in the composition and dynamics of the mammalian gut microbiome (7-10)." Or simply remove "mammalian".
We have updated this sentence based on your suggestions in line 54. It reads, “In mammals, one valuable marker of biological aging may lie in the composition and dynamics of the gut microbiome (Claesson et al. 2012; Heintz and Mair 2014; O’Toole and Jeffery 2015; Sadoughi et al. 2022).”
A rewrite at the end of the introduction is needed to avoid the almost direct repetition in lines 115-118 and 129-131 (including lit cited). One potentially effective way to approach this is to keep the predictions in the earlier paragraph and then more clearly center the approach and the overarching results statement in the latter paragraph. (I.e., "we find that season and social rank have stronger effects on microbiome age than early life events. Further, microbiome age does not predict host development or mortality.").
Thank you for pointing this out. We have reorganized the predictions in the introduction based on your suggestion. The alternative “recency effects” model now appears in the paragraph that starts in line 110. The final paragraph then centers on the overall approach and the results statement (lines 128-140).
Be clear in each case where taxon-level trends are discussed if it's at Family, Genus, or other level. It's there most, but not all, of the time.
We have gone through the text and clarified what taxa or microbiome feature was the subject of our analyses in any places where this was not clear.
In the legend for Figure 2, add clarification for how values to right versus left of the centered value should be interpreted with respect to age (e.g. "values to x of the center are more abundant in older individuals").
We now clarify in Figure 2C and 2D that “Positive values are more abundant in older hosts”.
Figure 3 - Are Panels A, B, and C all needed - can the value for all individuals not also be overlaid in the panel showing sex differences and the same point showing individuals with "old" and "young" microbiomes be added in the same plot if it was slightly larger?
We agree and have simplified Figure 3. We reduced the number of panels from three to two, and we added the information about how to calculate delta age to Panel A. We also moved the equation from the top of Panel C to the bottom right of Panel A.
Reviewer #2 (Recommendations for the authors):
Dasari et al present an interesting study investigating the use of 'microbiota age' as an alternative to other measures of 'biological age'. The study provides several curious insights which in principle warrant publication. However, I do think the manuscript should be carefully revised. Below I list some minor revisions that should be implemented. Importantly, the authors should discuss in the Discussion the pros and cons of using 'microbiota age' as a proxy of 'biological age'. Further, the authors should provide more information on Methods, to make sure the study can be replicated.
Thank you for these important points. Based on your comments and those of the first reviewer, we have expanded our discussion of the limitations of using microbiota age as a proxy for biological age (see edits to the paragraph starting in line 395).
We have also expanded our methods around sample collection, DNA extraction, and sequencing to describe our sampling methods, strategies to mitigate and address possible contamination, and batch effects. See lines 483-490 and our citations to the original papers where these methods are described in detail.
(1) Lines 85-99: I think this paragraph could be revisited to make the assumptions clearer. For instance, the last sentence is currently a little confusing: are authors expecting males to exhibit old-for-age microbiomes already during the juvenile period?
This prediction has been clarified. Line 96 now reads, “Hence, we predicted that adult male baboons would exhibit gut microbiomes that are old-for-age, compared to adult females (by contrast, we expected no sex effects on microbiome age in juvenile baboons).”
(2) Lines 118-121: Could the authors discuss this assumption in relation to what has been observed e.g., in humans in terms of delays in gut microbiome development? Delayed/accelerated gut microbiome development has been studied before, so this assumption would be stronger if related to what we know from previous studies.
This comment refers to the sentence which originally stated, “However, we also expected that some sources of early life adversity might be linked to young-for-age gut microbiota. For instance, maternal social isolation might delay gut microbiome development due to less frequent microbial exposures from conspecifics.” We have slightly expanded the text here (line 117) to explain our logic. We now include citations for our predictions. We did not include a detailed discussion of prior literature on microbiome development in the interest of keeping the same level of detail across all sections on our predictions.
(3) As the authors discuss, various adversities can lead to old-for-age but also young-for-age microbiome composition. This should be discussed in the limitations.
We agree. This is now discussed in the sentence starting at line 371, which reads, “…deviations from microbiome age predictions are explained by socio-environmental conditions experienced by individual hosts, especially recent conditions, although the effect sizes are small and are not always directionally consistent.” In addition, the text starting at line 405 now reads, “Third, the relationships between potential socio-environmental drivers of biological aging and the resulting biological age predictions were inconsistent. For instance, some sources of early life adversity were linked to old-for-age gut microbiomes (e.g., males born into large social groups), while others were linked to young-for-age microbiomes (e.g., males who experienced maternal social isolation or early life drought), or were unrelated to gut microbiome age (e.g., males who experienced maternal loss; any source of early life adversity in females).”
(4) In various places, e.g., lines 129-131, it is a little unclear at what chronological age authors are expecting microbiota to appear young/old-for-age.
This sentence was removed while responding to the comments from the first reviewer.
(5) Lines 132-133: this statement could be backed by stating that this is because the gut microbiota can change rapidly e.g., when diet changes (or whatever the authors think could be behind this).
We have added an expository sentence at line 123, including new citations. This sentence reads, “Indeed, gut microbiomes are highly dynamic and can change rapidly in response to host diet or other aspects of host physiology, behavior, or environments”.
We now cite:
· Hicks, A.L., et al. (2018). Gut microbiomes of wild great apes fluctuate seasonally in response to diet. Nature Communications 9, 1786.
· Kolodny, O., et al. (2019). Coordinated change at the colony level in fruit bat fur microbiomes through time. Nature Ecology & Evolution 3, 116-124.
· Risely, A., et al. (2021) Diurnal oscillations in gut bacterial load and composition eclipse seasonal and lifetime dynamics in wild meerkats. Nat Commun 12, 6017.
(6) Lines 135-137: current or past season and social rank? This paragraph introduces the idea that it could be past rather than current socio-environmental factors that might predict microbiota age, so the authors should clarify this sentence.
We have clarified the information in this sentence. line 135 now reads, “In general, our results support the idea that a baboon’s current socio-environmental conditions, especially their current social rank and the season of sampling, have stronger effects on microbiome age than early life events—many of which occurred many years prior to sampling.”
(7) Lines 136-137: this sentence could include some kind of a conclusion of this finding. What might this mean?
We have added a sentence at line 138, which speculates that, “…the dynamism of the gut microbiome may often overwhelm and erase early life effects on gut microbiome age.”
(8) Use 'microbiota' or 'microbiome' across the manuscript; currently, the terms are used interchangeably. I don't have a strong opinion on this, although typically 'microbiota' is used when data comes from 16S rRNA.
We have updated the text to replace any instance of “microbiota” with “microbiome”. We use the term microbiome in the sense of this definition from the National Human Genome Research Institute, which defines a microbiome as “the community of microorganisms (such as fungi, bacteria and viruses) that exists in a particular environment”.
(9) Figure 1 legend: make sure to unify formatting; e.g., present sample sizes as N= or n=, rather than both, and either include or do not include commas in 4-digit values (sample sizes).
We have checked the formatting related to sample sizes and the use of commas in 4-digits in the main text and supplement. The formats are now consistent.
(10) Line 166: relative abundances surely?
Following Gloor et al. (2017), our analyses use centered log-ratio (CLR) transformations of read counts, which is the recommended approach for compositional data such as 16S rRNA amplicon read counts. CLR transformations are scale-invariant, so the same ratio is obtained in a sample with few reads versus many reads. We now cite Gloor et al. (2017) at line 169 and in the methods in line 517, which reads “centered log ratio (CLR) transformed abundances (i.e., read counts) of each microbial phyla (n=30), family (n=290), genus (n=747), and amplicon sequence variance (ASV) detected in >25% of samples (n=358). CLR transformations are a recommended approach for addressing the compositional nature of 16S rRNA amplicon read count data (Gloor et al. 2017).”
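The scale-invariance property mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `clr` and the example read counts are hypothetical, and the sketch assumes strictly positive counts (how zeros are handled in the actual analysis is not stated here).

```python
import math

def clr(counts):
    """Centered log-ratio transform of strictly positive counts.

    Each value is the log of a count minus the mean of the logs,
    i.e. the log of the count divided by the geometric mean.
    """
    logs = [math.log(c) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]

# Hypothetical samples with the same composition but 10x different depth.
shallow = [10, 20, 70]
deep = [100, 200, 700]

clr_shallow = clr(shallow)
clr_deep = clr(deep)
```

Under these assumptions the two transformed vectors agree to floating-point precision, because multiplying every count by the same factor shifts each log and the mean log by the same amount.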
(11) Lines 167-172: were technical factors, e.g., read depth or sequencing batch, included as random effects?
Thank you for catching this oversight in the text. We did model sequencing depth and batch effects. The sentence starting at line 173 now reads, “For each of these 1,440 features, we tested its association with host age by running linear mixed effects models that included linear and quadratic effects of host age and four other fixed effects: sequencing depth, the season of sample collection (wet or dry), the average maximum temperature for the month prior to sample collection, and the total rainfall in the month prior to sample collection (Grieneisen et al. 2021; Björk et al. 2022; Tung et al. 2015). Baboon identity, social group membership, hydrological year of sampling, and sequencing plate (as a batch effect) were modeled as random effects.”
(12) Lines 175-180: When discussing how these alpha diversity results relate to previous findings, the authors should be clear about whether they talk about weighted or non-weighted measures of alpha diversity. - also maybe this should be included in the discussion rather than the results? Please consider this when revisiting the manuscript (see how it reads after edits).
Richness is the only unweighted metric, which we now clarify in line 181. We opted to retain the interpretation in the text in its original location to maintain the emphasis in the discussion on the microbiome clock results.
(13) Table S1 is very hard to interpret in the provided PDF format as columns are not presented side-by-side. It is currently hard to check model output for e.g., specific families. This needs to be revisited.
We agree. We believe that eLife’s submission portal automatically generates a PDF for any supplementary item. However, we also include the supplementary tables as an Excel workbook which has the columns presented side-by-side.
(14) Line 184: taxa meaning what? Unclear what authors refer to with this sentence, taxa across taxonomic levels, or ASVs, or what does the 51.6% refer to?
We have edited line 191 to clarify that this sentence refers to taxa at all taxonomic levels (phyla to ASVs).
(15) Line 191: a punctuation mark missing after ref (81).
We have added the missing period at the end of this sentence.
(16) Lines 189-197: this should go into the discussion in my opinion.
We have opted to retain this interpretation, now at line 183.
(17) Lines 215-219: Not sure what this means; do the authors mean features were not restricted to age-associated taxa, ie also e.g., diversity and other taxa-independent patterns were included? If so, the rest of the highlighted lines should be revisited to make this clear, currently to me it is very unclear what 'These could include features that are not strongly age-correlated in isolation' means. Currently, that sounds like some features included were only age-associated in combination with other features, but unclear how this relates to taxa-dependency/taxa-independency.
We agree this was not clear. We have revised line 224 to read, “We included all 9,575 microbiome features in our age predictions, as opposed to just those that were statistically significantly associated with age because removing these non-significant features could exclude features that contribute to age prediction via interactions with other taxa.”
(18) Line 403-407: There is now a paper showing epigenetic clocks can be built with faecal samples, so this argument is not valid. Please revisit in light of this publication: https://onlinelibrary.wiley.com/doi/epdf/10.1111/mec.17330
Thank you for bringing this paper to our attention. We deleted the text that describes epigenetic clocks as invasive, and we now cite this paper in line 450, which reads, “We also hope to measure epigenetic age in fecal samples, leveraging methods developed in Hanski et al. 2024.”
(19) Line 427: a punctuation mark/semicolon missing before However.
We have corrected this typo.
(20) Lines 419-428: I don't quite understand this speculation. Why would the priority of access to food lead to an old-looking gut microbiome? This paragraph needs stronger arguments, currently unclear and also not super convincing.
We agree this was confusing. We have revised this text to clarify the explanation. The text starting at line 424 now reads, “This outcome points towards a shared driver of high social status in shaping gut microbiome age in both males and females. While it is difficult to identify a plausible shared driver, one benefit shared by both high-ranking males and females is priority of access to food. This access may result in fewer foraging disruptions and a higher quality, more stable diet. At the same time, prior research in Amboseli suggests that as animals age, their diets become more canalized and less variable (Grieneisen et al. 2021). Hence aging and priority of access to food might both be associated with dietary stability and old-for-age microbiomes. However, this explanation is speculative and more work is needed to understand the relationship between rank and microbiome age.”
(21) Line 434: remove 'be'.
We have corrected this typo.
(22) Line 478: add information on how samples were collected; e.g., were samples collected from the ground? How was cross-contamination with soil microbiota minimised? Were samples taken from the inner part of depositions? These factors can influence microbiota samples quite drastically so detailed info is needed. Also what does homogenisation mean in this context? How soon were samples freeze-dried after sample collection?
We have expanded our methods with respect to sample collection. This text starts in line 483 and reads, “Samples were collected from the ground within 15 minutes of defecation. For each sample, approximately 20 g of feces was collected into a paper cup, homogenized by stirring with a wooden tongue depressor, and a 5 g aliquot of the homogenized sample was transferred to a tube containing 95% ethanol. While a small amount of soil was typically present on the outside of the fecal sample, mammalian feces contains 1000 times the number of microbial cells in a typical soil sample (Sender, Fuchs, and Milo 2016; Raynaud and Nunan 2014), which overwhelms the signal of soil bacteria in our analyses (Grieneisen et al. 2021). Samples were transported from the field in Amboseli to a lab in Nairobi, freeze-dried, and then sifted to remove plant matter prior to long term storage at -80°C.”
(23) Line 480 onwards: were negative controls included in extraction batches? Were samples randomised into extraction batches?
Yes, we included extraction blanks. These are now described in lines 495-500. This text reads, “We included one extraction blank per batch, which had significantly lower DNA concentrations than sample wells (t-test; t=-50, p < 2.2x10<sup>-16</sup>; Grieneisen et al. 2021). We also included technical replicates, which were the same fecal sample sequenced across multiple extraction and library preparation batches. Technical replicates from different batches clustered with each other rather than with their batch, indicating that true biological differences between samples are larger than batch effects.”
(24) Were extraction, library prep, and sequencing negative controls included? Is data available?
We included extraction blanks (described above) and technical replicates, which were the same sample sequenced across multiple extraction and library preparation batches. Technical replicates from different batches clustered with each other rather than with their batch, indicating that true biological differences between samples are larger than batch effects.
We have updated the data availability statement to read, “All data for these analyses are available on Dryad at https://doi.org/10.5061/dryad.b2rbnzspv. The 16S rRNA gene sequencing data are deposited on EBI-ENA (project ERP119849) and Qiita (study 12949). Code is available at the following GitHub repository: https://github.com/maunadasari/Dasari_etal-GutMicrobiomeAge”.
(25) Line 562: how were corrected microbiome delta ages calculated? Currently, the authors state x, y and z factors were corrected for, but it is unclear how this was done.
The paragraph starting at line 577 describes how microbiome delta age was calculated. We have made only a few changes to this text because we were not sure which aspects of these methods confused the reviewer. However, briefly, we calculated sample-specific microbiome Δage in years as the difference between a sample’s microbial age estimate, age<sub>m</sub> from the microbiome clock, and the host’s chronological age in years at the time of sample collection, age<sub>c</sub>. Higher microbiome Δages indicate old-for-age microbiomes, as age<sub>m</sub> > age<sub>c</sub>, and lower values (which are often negative) indicate a young-for-age microbiome, where age<sub>c</sub> > age<sub>m</sub> (see Figure 3).
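As a minimal sketch of the calculation described above (not the authors' code), the delta age is the predicted microbiome age minus the chronological age, both in years. The function names, the example ages, and the "matches-age" label for a zero difference are illustrative assumptions.

```python
def microbiome_delta_age(age_m, age_c):
    """Delta age in years: microbiome clock estimate minus chronological age."""
    return age_m - age_c

def interpret_delta_age(delta):
    """Label a delta age; the 'matches-age' label is a hypothetical placeholder."""
    if delta > 0:
        return "old-for-age"
    if delta < 0:
        return "young-for-age"
    return "matches-age"

# Hypothetical example: clock predicts 12.5 years for a 10.0-year-old host.
example_delta = microbiome_delta_age(12.5, 10.0)
example_label = interpret_delta_age(example_delta)
```

Here the positive difference corresponds to an old-for-age microbiome, and a prediction below the chronological age would give a negative value and a young-for-age label.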
(26) Line 579: typo 'as'.
We have corrected this typo.
Works Cited
Altmann, Jeanne, Laurence Gesquiere, Jordi Galbany, Patrick O Onyango, and Susan C Alberts. 2010. “Life History Context of Reproductive Aging in a Wild Primate Model.” Annals of the New York Academy of Sciences 1204:127–38. https://doi.org/10.1111/j.1749-6632.2010.05531.x.
Anderson, Jordan A, Rachel A Johnston, Amanda J Lea, Fernando A Campos, Tawni N Voyles, Mercy Y Akinyi, Susan C Alberts, Elizabeth A Archie, and Jenny Tung. 2021. “High Social Status Males Experience Accelerated Epigenetic Aging in Wild Baboons.” Edited by George H Perry. eLife 10 (April):e66128. https://doi.org/10.7554/eLife.66128.
Binder, Alexandra M., Camila Corvalan, Verónica Mericq, Ana Pereira, José Luis Santos, Steve Horvath, John Shepherd, and Karin B. Michels. 2018. “Faster Ticking Rate of the Epigenetic Clock Is Associated with Faster Pubertal Development in Girls.” Epigenetics 13 (1): 85–94. https://doi.org/10.1080/15592294.2017.1414127.
Björk, Johannes R., Mauna R. Dasari, Kim Roche, Laura Grieneisen, Trevor J. Gould, Jean-Christophe Grenier, Vania Yotova, et al. 2022. “Synchrony and Idiosyncrasy in the Gut Microbiome of Wild Baboons.” Nature Ecology & Evolution, June, 1–10. https://doi.org/10.1038/s41559-022-01773-4.
Chen, Brian H., Riccardo E. Marioni, Elena Colicino, Marjolein J. Peters, Cavin K. Ward-Caviness, Pei-Chien Tsai, Nicholas S. Roetker, et al. 2016. “DNA Methylation-Based Measures of Biological Age: Meta-Analysis Predicting Time to Death.” Aging (Albany NY) 8 (9): 1844–59. https://doi.org/10.18632/aging.101020.
Claesson, Marcus J., Ian B. Jeffery, Susana Conde, Susan E. Power, Eibhlís M. O’Connor, Siobhán Cusack, Hugh M. B. Harris, et al. 2012. “Gut Microbiota Composition Correlates with Diet and Health in the Elderly.” Nature 488 (7410): 178–84. https://doi.org/10.1038/nature11319.
Galbany, Jordi, Jeanne Altmann, Alejandro Pérez-Pérez, and Susan C. Alberts. 2011. “Age and Individual Foraging Behavior Predict Tooth Wear in Amboseli Baboons.” American Journal of Physical Anthropology 144 (1): 51–59. https://doi.org/10.1002/ajpa.21368.
Gloor, Gregory B., Jean M. Macklaim, Vera Pawlowsky-Glahn, and Juan J. Egozcue. 2017. “Microbiome Datasets Are Compositional: And This Is Not Optional.” Frontiers in Microbiology 8. https://doi.org/10.3389/fmicb.2017.02224.
Grieneisen, Laura E., Mauna Dasari, Trevor J. Gould, Johannes R. Björk, Jean-Christophe Grenier, Vania Yotova, David Jansen, et al. 2021. “Gut Microbiome Heritability Is Nearly Universal but Environmentally Contingent.” Science 373 (6551): 181–86. https://doi.org/10.1126/science.aba5483.
Hanski, Eveliina, Susan Joseph, Aura Raulo, Klara M. Wanelik, Áine O’Toole, Sarah C. L. Knowles, and Tom J. Little. 2024. “Epigenetic Age Estimation of Wild Mice Using Faecal Samples.” Molecular Ecology 33 (8): e17330. https://doi.org/10.1111/mec.17330.
Heintz, Caroline, and William Mair. 2014. “You Are What You Host: Microbiome Modulation of the Aging Process.” Cell 156 (3): 408–11. http://dx.doi.org/10.1016/j.cell.2014.01.025.
Horvath, Steve. 2013. “DNA Methylation Age of Human Tissues and Cell Types.” Genome Biology 14 (10): R115. https://doi.org/10.1186/gb-2013-14-10-r115.
Jayashankar, Lakshmi, Kathleen M. Brasky, John A. Ward, and Roberta Attanasio. 2003. “Lymphocyte Modulation in a Baboon Model of Immunosenescence.” Clinical and Vaccine Immunology 10 (5): 870–75. https://doi.org/10.1128/CDLI.10.5.870-875.2003.
Marioni, Riccardo E., Sonia Shah, Allan F. McRae, Brian H. Chen, Elena Colicino, Sarah E. Harris, Jude Gibson, et al. 2015. “DNA Methylation Age of Blood Predicts All-Cause Mortality in Later Life.” Genome Biology 16 (1): 25. https://doi.org/10.1186/s13059-015-0584-6.
O’Toole, Paul W., and Ian B. Jeffery. 2015. “Gut Microbiota and Aging.” Science 350 (6265): 1214–15. https://doi.org/10.1126/science.aac8469.
Raynaud, Xavier, and Naoise Nunan. 2014. “Spatial Ecology of Bacteria at the Microscale in Soil.” PLOS ONE 9 (1): e87217. https://doi.org/10.1371/journal.pone.0087217.
Sadoughi, Baptiste, Dominik Schneider, Rolf Daniel, Oliver Schülke, and Julia Ostner. 2022. “Aging Gut Microbiota of Wild Macaques Are Equally Diverse, Less Stable, but Progressively Personalized.” Microbiome 10 (1): 95. https://doi.org/10.1186/s40168-022-01283-2.
Sender, Ron, Shai Fuchs, and Ron Milo. 2016. “Revised Estimates for the Number of Human and Bacteria Cells in the Body.” PLoS Biology 14 (8): e1002533. https://doi.org/10.1371/journal.pbio.1002533.
Tung, J, L B Barreiro, M B Burns, J C Grenier, J Lynch, L E Grieneisen, J Altmann, S C Alberts, R Blekhman, and E A Archie. 2015. “Social Networks Predict Gut Microbiome Composition in Wild Baboons.” Elife 4. https://doi.org/10.7554/eLife.05224.
Weibel, Chelsea J., Mauna R. Dasari, David A. Jansen, Laurence R. Gesquiere, Raphael S. Mututua, J. Kinyua Warutere, Long’ida I. Siodi, Susan C. Alberts, Jenny Tung, and Elizabeth A. Archie. 2024. “Using Non-Invasive Behavioral and Physiological Data to Measure Biological Age in Wild Baboons.” GeroScience 46 (5): 4059–74. https://doi.org/10.1007/s11357-024-01157-5.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank the reviewers for their thoughtful reading and review of our manuscript. These reviews make clear that, for this work to be complete, we must make progress on the following fronts:
(1) Expand the discussion to better incorporate alternate explanations of our data
(2) Improve data visualization and provide experimental support for, or an experimental refutation of, the following concepts:
a. Photoreceptor-derived lactate exported specifically from photoreceptors is utilized in the RPE TCA cycle
b. Photoreceptors can utilize lactate as a fuel source when starved of glucose
To address these concerns, we will focus our efforts on infusing <sup>13</sup>C<sub>6</sub>-glucose into rodΔglut1 mice. Lactate is not made without glucose, so this experiment should indicate whether glucose utilization in photoreceptors provides lactate to the RPE, and whether that lactate is used in the TCA cycle.
The reviewers also noted that changes in <sup>13</sup>C labeling of RPE TCA cycle intermediates downstream of lactate are not obvious (between C57BL6J mice and AIPL1<sup>-/-</sup>). We think that, at least in part, this is a consequence of the way we presented the data. We will improve how we display our data so that the differences in incorporation of <sup>13</sup>C into TCA cycle intermediates in control and AIPL1<sup>-/-</sup> RPE are clearer.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The issue of a control without blue light illumination was raised. Clearly without the light we will not obtain any signal in the fluorescence microscopy experiments, which would not be very informative. Instead, we changed the level of blue light illumination in the fluorescence microscopy experiments (figure 4A) and the response of the bacteria scales with dosage. It is very hard to find an alternative explanation, beyond that the blue light is stressing the bacteria and modulating their membrane potentials.
One of the referees refuses to see wavefronts in our microscopy data. We struggle to understand whether it is an issue with definitions (Waigh has published a tutorial on the subject in Chapter 5 of his book ‘The physics of bacteria: from cells to biofilms’, T.A.Waigh, CUP, 2024 – figure 5.1 shows a sketch) or something subtler on diffusion in excitable systems. We stand by our claim that we observe wavefronts, similar to those observed by Prindle et al<sup>1</sup> and Blee et al<sup>2</sup> for B. subtilis biofilms.
The referee is questioning our use of ThT to probe the membrane potential. We believe the Pilizota and Strahl groups are treating the E. coli as unexcitable cells, leading to their problems. Instead, we believe E. coli cells are excitable (containing the voltage-gated ion channel Kch) and we now clearly state this in the manuscript. Furthermore, we include a section here discussing some of the issues with ThT.
Use of ThT as a voltage sensor in cells
ThT is now used reasonably widely in the microbiology community as a voltage sensor in both bacterial [Prindle et al]<sup>1</sup> and fungal cells [Pena et al]<sup>12</sup>. ThT is a small cationic fluorophore that loads into the cells in proportion to their membrane potential, thus allowing the membrane potential to be measured from fluorescence microscopy measurements.
Previously ThT was widely used to quantify the growth of amyloids in molecular biology experiments (standardized protocols exist and dedicated software has been created)<sup>13</sup> and there is a long history of its use<sup>14</sup>. ThT fluorescence is bright, stable and slow to photobleach.
Author response figure 1 shows a schematic diagram of the ThT loading in E. coli in our experiments in response to illumination with blue light. Similar results were previously presented by Mancini et al<sup>15</sup>, but regimes 2 and 3 were mistakenly labelled as artefacts.
Author response figure 1. Schematic diagram of ThT loading during an experiment with E. coli cells under blue light illumination i.e. ThT fluorescence as a function of time. Three empirical regimes for the fluorescence are shown (1, 2 and 3).
The classic study of Prindle et al on bacterial biofilm electrophysiology established the use of ThT in B. subtilis biofilms by showing that similar results occurred with DiSc3, which is widely used as a Nernstian voltage sensor in cellular biology<sup>1</sup>, e.g. with mitochondrial membrane potentials in eukaryotic organisms, where there is a large literature. We repeated such a comparative calibration of ThT with DiSc3 in a previous publication with both B. subtilis and P. aeruginosa cells<sup>2</sup>. ThT thus functioned well in our previous publications with Gram positive and Gram negative cells.
However, to our knowledge, there are now two groups questioning the use of ThT and DiSc3 as voltage sensors with E. coli cells<sup>15-16</sup>. The first, by the Pilizota group, claims ThT only works as a voltage sensor in regime 1 of Author response figure 1, using a method based on the rate of rotation of flagellar motors. Another slightly contradictory study by the Strahl group claims DiSc3<sup>16</sup> only acts as a voltage sensor with the addition of an ionophore for potassium, which allows free movement of potassium through the E. coli membranes.
Our resolution to this contradiction is that ThT does indeed work reasonably well with E. coli. The Pilizota group’s model for rotating flagellar motors assumes the membrane voltage is not varying due to excitability of the membrane voltage (otherwise a non-linear Hodgkin Huxley type model would be needed to quantify their results) i.e. E. coli cells are unexcitable. We show clearly in our study that ThT loading in E. coli is a function of irradiation with blue light and is a stress response of the excitable cells. This is in contradiction to the Pilizota group’s model. The Pilizota group’s model also requires the awkward fiction that cells decide to unload and then reload ThT in regimes 2 and 3 of Author response figure 1 due to variable membrane partitioning of the ThT. Our simple explanation is that it is just due to the membrane voltage changing and no membrane permeability switch needs to be invoked. The Strahl group’s<sup>16</sup> results with DiSc3 are also explained by a neglect of the excitable nature of E. coli cells that are reacting to blue light irradiation. Adding ionophores to the E. coli membranes makes the cells unexcitable, reduces their response to blue light and thus leads to simple loading of DiSc3 (the physiological control of K+ in the cells by voltage-gated ion channels has been short circuited by the addition of the ionophore).
Further evidence for our model that ThT functions as a voltage sensor with E. coli includes:
1) The 3 regimes in Author response figure 1 from ThT correlate well with measurements of extracellular potassium ion concentration using TMRM i.e. all 3 regimes in Author response figure 1 are visible with this separate dye (figure 1d).
2) We are able to switch regime 3 in Author response figure 1 off and then on again by using knock downs of the potassium ion channel Kch in the membranes of the E. coli and then reinserting the gene back into the knock downs. This cannot be explained by the Pilizota model.
We conclude that ThT works reasonably well as a sensor of membrane voltage in E. coli and that the previous contradictory studies<sup>15-16</sup> arise because they neglect the excitable nature of the membrane voltage of E. coli cells in response to the light used to make the ThT fluoresce.
Three further criticisms of the Mancini et al method<sup>15</sup> for calibrating membrane voltages are:
1) E. coli cells have clutches that are not included in their models. Otherwise the rotation of the flagella would be entirely enslaved to the membrane voltage allowing the bacteria no freedom to modulate their speed of motility.
2) Ripping off the flagella may perturb the integrity of the cell membrane and lead to different loading of the ThT in the E. coli cells.
3) Most seriously, the method ignores the activity of many other ion channels (beyond H+) on the membrane voltage that are known to exist with E. coli cells e.g. Kch for K+ ions. The Pilizota group uses a simple Nernstian battery model developed for mitochondria in the 1960s. It is not adequate to explain our results.
An additional criticism of the Winkel et al study<sup>17</sup> from the Strahl group is that it indiscriminately switches between discussion of mitochondria and bacteria e.g. on page 8 ‘As a consequence the membrane potential is dominated by H+’. Mitochondria are slightly alkaline intracellular organelles with external ion concentrations in the cytoplasm that are carefully controlled by the eukaryotic cells. E. coli are not, i.e. they have neutral internal pHs, with widely varying extracellular ionic concentrations and have reinforced outer membranes to resist osmotic shocks (in contrast mitochondria can easily swell in response to moderate changes in osmotic pressure).
A quick calculation of the equilibrium membrane voltage of E. coli can be easily done using the Nernst equation dependent on the extracellular ion concentrations defined by the growth media (the intracellular ion concentrations in E. coli are 0.2 M K+ and 10<sup>-7</sup> M H+ i.e. there is a factor of a million fewer H+ ions). Thus in contradiction to the claims of the groups of Pilizota<sup>15</sup> and Strahl<sup>17</sup>, H+ is a minority determinant of the membrane voltage of E. coli. The main determinant is K+. For a textbook version of this point the authors can refer to Chapter 4 of D. White, et al’s ‘The physiology and biochemistry of prokaryotes’, OUP, 2012, 4th edition.
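The kind of quick calculation referred to above can be sketched as follows. This is an illustration only: the intracellular concentrations are the ones quoted in the text, while the extracellular K+ concentration, the temperature, and all names in the code are hypothetical assumptions, since the actual values depend on the growth medium.

```python
import math

R = 8.314462618      # gas constant, J / (mol K)
F = 96485.33212      # Faraday constant, C / mol

def nernst_potential_mV(c_out, c_in, z=1, temperature_K=298.15):
    """Nernst equilibrium potential (inside relative to outside) in millivolts."""
    return 1000.0 * (R * temperature_K / (z * F)) * math.log(c_out / c_in)

# Intracellular concentrations quoted in the text (mol/L).
K_IN = 0.2
H_IN = 1e-7

# Ratio of intracellular K+ to H+ ions: on the order of a million.
k_to_h_ratio = K_IN / H_IN

# Hypothetical extracellular K+ (mol/L); real values depend on the medium.
K_OUT_HYPOTHETICAL = 0.02
e_k_mV = nernst_potential_mV(K_OUT_HYPOTHETICAL, K_IN)
```

With the assumed tenfold K+ gradient at 25 °C, the potassium equilibrium potential comes out close to -59 mV; a different medium would shift this value.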
Even in mitochondria the assumption that H+ dominates the membrane potential and the cells are unexcitable can be questioned e.g. people have observed pulsatile depolarization phenomena with mitochondria18-19. A large number of K+ channels are now known to occur in mitochondrial membranes (not to mention Ca2+ channels; mitochondria have extensive stores of Ca2+) and they are implicated in mitochondrial membrane potentials. In this respect the seminal Nobel prize winning research of Peter Mitchell (1961) on mitochondria needs to be amended20. Furthermore, the mitochondrial work is clearly inapplicable to bacteria (the proton motive force, PMF, will instead subtly depend on non-linear Hodgkin-Huxley equations for the excitable membrane potential, similar to those presented in the current article). The mathematical biology community has developed a much more sophisticated framework to describe the activity of electrically excitable cells (e.g. neurons, sensory cells and cardiac cells), beyond Mitchell’s use of simple stationary equilibrium thermodynamics to define the Proton Motive Force via the electrochemical potential of a proton (the use of the word ‘force’ is unfortunate, since it is a potential). The tools developed in the field of mathematical electrophysiology8 should be more extensively applied to bacteria, fungi, mitochondria and chloroplasts if real progress is to be made.
Related to the previous point, we now cite articles from the Pilizota and Strahl groups in the main text (one from each group). Unfortunately, the space constraints of eLife mean we cannot make a more detailed discussion in the main article.
In terms of modelling the ion channels, the Hodgkin-Huxley type model proposes that the Kch ion channel can be modelled as a typical voltage-gated potassium ion channel i.e. with a 𝑛<sup>4</sup> term in its conductivity. The literature agrees that Kch is a voltage-gated potassium ion channel based on its primary sequence<sup>3</sup>. The protein has the typical 6 transmembrane helix motif for a voltage-gated ion channel. The agent-based model assumes little about the structure of ion channels in E. coli, other than that they release potassium in response to a threshold potassium concentration in their environment. The agent-based model is thus robust to the exact molecular details chosen and predicts the anomalous transport of the potassium wavefronts reasonably well (the modelling was extended in a recent Physical Review E article<sup>4</sup>). Such a description of reaction-anomalous diffusion phenomena has not to our knowledge been previously achieved in the literature<sup>5</sup> and in general could be used to describe other signaling molecules.
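The threshold-release idea behind a fire-diffuse-fire description can be sketched in a few lines of Python. This is a deliberately simplified toy, not the agent-based model used in the article: it assumes a one-dimensional row of cells, ordinary (not anomalous) diffusion, and each cell firing only once. All names and parameter values (`diffusion`, `threshold`, `release`) are hypothetical.

```python
def fire_diffuse_fire(n_cells=20, steps=100, diffusion=0.2,
                      threshold=0.1, release=1.0):
    """Return the time step at which each cell fired (None if it never did)."""
    conc = [0.0] * n_cells
    fired_at = [None] * n_cells

    # Seed the wave: the first cell fires at t = 0.
    conc[0] = release
    fired_at[0] = 0

    for t in range(1, steps + 1):
        # Diffuse: explicit finite-difference step with reflecting boundaries.
        new = conc[:]
        for i in range(n_cells):
            left = conc[i - 1] if i > 0 else conc[i]
            right = conc[i + 1] if i < n_cells - 1 else conc[i]
            new[i] = conc[i] + diffusion * (left - 2.0 * conc[i] + right)
        conc = new

        # Fire: any unfired cell above threshold releases its potassium once.
        for i in range(n_cells):
            if fired_at[i] is None and conc[i] >= threshold:
                conc[i] += release
                fired_at[i] = t

    return fired_at


firing_times = fire_diffuse_fire()
print(firing_times)
```

In this toy the firing times increase with distance from the seed cell, which is the wavefront behaviour the fire-diffuse-fire picture is meant to capture; the explicit diffusion step is only numerically stable for `diffusion` values up to 0.5.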
1) Prindle, A.; Liu, J.; Asally, M.; Ly, S.; Garcia-Ojalvo, J.; Süel, G. M., Ion channels enable electrical communication in bacterial communities. Nature 2015, 527, 59.
2) Blee, J. A.; Roberts, I. S.; Waigh, T. A., Membrane potentials, oxidative stress and the dispersal response of bacterial biofilms to 405 nm light. Physical Biology 2020, 17, 036001.
3) Milkman, R., An E. coli homologue of eukaryotic potassium channel proteins. PNAS 1994, 91, 3510-3514.
4) Martorelli, V.; Akabuogu, E. U.; Krasovec, R.; Roberts, I. S.; Waigh, T. A., Electrical signaling in three-dimensional bacterial biofilms using an agent-based fire-diffuse-fire model. Physical Review E 2024, 109, 054402.
5) Waigh, T. A.; Korabel, N., Heterogeneous anomalous transport in cellular and molecular biology. Reports on Progress in Physics 2023, 86, 126601.
6) Hodgkin, A. L.; Huxley, A. F., A quantitative description of membrane current and its application to conduction and excitation in nerve. Journal of Physiology 1952, 117, 500.
7) Dawson, S. P.; Keizer, J.; Pearson, J. E., Fire-diffuse-fire model of dynamics of intracellular calcium waves. PNAS 1999, 96, 606.
8) Keener, J.; Sneyd, J., Mathematical Physiology. Springer: 2009.
9) Coombes, S., The effect of ion pumps on the speed of travelling waves in the fire-diffuse-fire model of Ca2+ release. Bulletin of Mathematical Biology 2001, 63, 1.
10) Blee, J. A.; Roberts, I. S.; Waigh, T. A., Spatial propagation of electrical signals in circular biofilms. Physical Review E 2019, 100, 052401.
11) Gorochowski, T. E.; Matyjaszkiewicz, A.; Todd, T.; Oak, N.; Kowalska, K., BSim: an agent-based tool for modelling bacterial populations in systems and synthetic biology. PloS One 2012, 7, 1.
12) Pena, A.; Sanchez, N. S.; Padilla-Garfias, F.; Ramiro-Cortes, Y.; Araiza-Villanueva, M.; Calahorra, M., The use of thioflavin T for the estimation and measurement of the plasma membrane electric potential difference in different yeast strains. Journal of Fungi 2023, 9 (9), 948.
13) Xue, C.; Lin, T. Y.; Chang, D.; Guo, Z., Thioflavin T as an amyloid dye: fibril quantification, optimal concentration and effect on aggregation. Royal Society Open Science 2017, 4, 160696.
14) Meisl, G.; Kirkegaard, J. B.; Arosio, P.; Michaels, T. C. T.; Vendruscolo, M.; Dobson, C. M.; Linse, S.; Knowles, T. P. J., Molecular mechanisms of protein aggregation from global fitting of kinetic models. Nature Protocols 2016, 11 (2), 252-272.
15) Mancini, L.; Tian, T.; Guillaume, T.; Pu, Y.; Li, Y.; Lo, C. J.; Bai, F.; Pilizota, T., A general workflow for characterization of Nernstian dyes and their effects on bacterial physiology. Biophysical Journal 2020, 118 (1), 4-14.
16) Buttress, J. A.; Halte, M.; Winkel, J. D. t.; Erhardt, M.; Popp, P. F.; Strahl, H., A guide for membrane potential measurements in Gram-negative bacteria using voltage-sensitive dyes. Microbiology 2022, 168, 001227.
17) Derk te Winkel, J.; Gray, D. A.; Seistrup, K. H.; Hamoen, L. W.; Strahl, H., Analysis of antimicrobial-triggered membrane depolarization using voltage sensitive dyes. Frontiers in Cell and Developmental Biology 2016, 4, 29.
18) Schwarzländer, M.; Logan, D. C.; Johnston, I. G.; Jones, N. S.; Meyer, A. J.; Fricker, M. D.; Sweetlove, L. J., Pulsing of membrane potential in individual mitochondria. The Plant Cell 2012, 24, 1188-1201.
19) Huser, J.; Blatter, L. A., Fluctuations in mitochondrial membrane potential caused by repetitive gating of the permeability transition pore. Biochemical Journal 1999, 343, 311-317.
20) Mitchell, P., Coupling of phosphorylation to electron and hydrogen transfer by a chemi-osmotic type of mechanism. Nature 1961, 191 (4784), 144-148.
21) Baba, T.; Ara, M.; Hasegawa, Y.; Takai, Y.; Okumura, Y.; Baba, M.; Datsenko, K. A.; Tomita, M.; Wanner, B. L.; Mori, H., Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Molecular Systems Biology 2006, 2, 1.
22) Schindelin, J.; et al., Fiji: an open-source platform for biological-image analysis. Nature Methods 2012, 9, 676.
23) Hartmann, R.; et al., Quantitative image analysis of microbial communities with BiofilmQ. Nature Microbiology 2021, 6 (2), 151.
The following is the authors’ response to the original reviews.
Critical synopsis of the articles cited by referee 2:
(1) ‘Generalized workflow for characterization of Nernstian dyes and their effects on bacterial physiology’, L.Mancini et al, Biophysical Journal, 2020, 118, 1, 4-14.
This is the central article used by referee 2 to argue that there are issues with the calibration of ThT for the measurement of membrane potentials. The authors use a simple Nernstian battery (SNB) model and unfortunately it is wrong when voltage-gated ion channels occur. Huge oscillations occur in the membrane potentials of E. coli that cannot be described by the SNB model. Instead a Hodgkin-Huxley model is needed, as shown in our eLife manuscript and multiple other studies (see above). Arrhenius kinetics are assumed in the SNB model for pumping with no real evidence and the generalized workflow involves ripping the flagella off the bacteria! The authors construct an elaborate ‘work flow’ to ensure their ThT results can be interpreted using their erroneous SNB model over a limited range of parameters.
(2) ‘Non-equivalence of membrane voltage and ion-gradient as driving forces for the bacterial flagellar motor at low load’, C.J.Lo, et al, Biophysical Journal, 2007, 93, 1, 294.
An odd de novo chimeric species is developed using an E. coli chassis which uses Na+ instead of H+ for the motility of its flagellar motor. The relevance to wild type E. coli is not clear, due to the massive physiological perturbations involved. An SNB model is used to fit the data over a very limited parameter range with all the concomitant errors.
(3) ‘Single-cell bacterial electrophysiology reveals mechanisms of stress-induced damage’, E.Krasnopeeva, et al, Biophysical Journal, 2019, 116, 2390.
The abstract says ‘PMF defines the physiological state of the cell’. This statement is hyperbolic. An extremely wide range of molecules contribute to the physiological state of a cell. PMF does not even define the electrophysiology of the cell e.g. via the membrane potential. There are 0.2 M of K+ compared with 0.0000001 M of H+ in E. coli, so K+ is arguably a million times more important for the membrane potential than H+ and thus the electrophysiology!
Equation (1) in the manuscript assumes no other ions are exchanged during the experiments other than H+. This is a very bad approximation when voltage-gated potassium ion channels move the majority ion (K+) around!
In our model Figure 4A is better explained by depolarisation due to K+ channels closing than by direct irreversible photodamage. Why does the ThT fluorescence increase again for the second hyperpolarization event if the ThT is supposed to be damaged? It does not make sense.
(4) ‘The proton motive force determines E. coli robustness to extracellular pH’, G.Terradot et al, 2024, preprint.
This article expounds the SNB model once more. It still ignores the voltage-gated ion channels. Furthermore, it ignores the effect of the dominant ion in E. coli, K+. The manuscript is incorrect as a result and I would not recommend publication.
In general, an important problem is being researched i.e. how the membrane potential of E. coli is related to motility, but there are serious flaws in the SNB approach and the experimental methodology appears tenuous.
Answers to specific questions raised by the referees
Reviewer #1 (Public Review):
Summary:
Cell-to-cell communication is essential for higher functions in bacterial biofilms. Electrical signals have proven effective in transmitting signals across biofilms. These signals are then used to coordinate cellular metabolisms or to increase antibiotic tolerance. Here, the authors have reported for the first time coordinated oscillation of membrane potential in E. coli biofilms that may have a functional role in photoprotection.
Strengths:
- The authors report original data.
- For the first time, they showed that coordinated oscillations in membrane potential occur in E. coli biofilms.
- The authors revealed a complex two-phase dynamic involving distinct molecular response mechanisms.
- The authors developed two rigorous models inspired by 1) Hodgkin-Huxley model for the temporal dynamics of membrane potential and 2) Fire-Diffuse-Fire model for the propagation of the electric signal.
- Since its discovery by comparative genomics, the Kch ion channel has not been associated with any specific phenotype in E. coli. Here, the authors proposed a functional role for the putative K+ Kch channel: enhancing survival under photo-toxic conditions.
We thank the referee for their positive evaluations and agree with these statements.
Weaknesses:
- Since the flow of fresh medium is stopped at the beginning of the acquisition, environmental parameters such as pH and RedOx potential are likely to vary significantly during the experiment. It is therefore important to exclude the contributions of these variations to ensure that the electrical response is only induced by light stimulation. Unfortunately, no control experiments were carried out to address this issue.
The electrical responses occur almost instantaneously when the stimulation with blue light begins i.e. they are too fast to be caused by a build-up of pH changes. We are not sure what the referee means by Redox potential since it is an attribute of all chemicals that are able to donate/receive electrons. The electrical response to stress appears to be caused by ROS, since when ROS scavengers are added the electrical response is removed i.e. pH plays a very small minority role if any.
- Furthermore, the control parameter of the experiment (light stimulation) is the same as that used to measure the electrical response, i.e. through fluorescence excitation. The use of the PROPS system could solve this problem.
We were enthusiastic at the start of the project to use the PROPS system in E. coli as presented by J.M.Kralj et al, ‘Electrical spiking in E. coli probed with a fluorescent voltage-indicating protein’, Science, 2011, 333, 6040, 345. However, the people we contacted in the microbiology community said that it had some technical issues and there have been no subsequent studies using PROPS in bacteria after the initial promising study. The fluorescent protein system recently presented in PNAS seems more promising, ‘Sensitive bacterial Vm sensors revealed the excitability of bacterial Vm and its role in antibiotic tolerance’, X.Jin et al, PNAS, 120, 3, e2208348120.
- Electrical signal propagation is an important aspect of the manuscript. However, a detailed quantitative analysis of the spatial dynamics within the biofilm is lacking. In addition, it is unclear if the electrical signal propagates within the biofilm during the second peak regime, which is mediated by the Kch channel. This is an important question, given that the fire-diffuse-fire model is presented with emphasis on the role of K+ ions.
We have presented a more detailed account of the electrical wavefront modelling work and it is currently under review in a physical journal, ‘Electrical signalling in three dimensional bacterial biofilms using an agent based fire-diffuse-fire model’, V.Martorelli, et al, 2024 https://www.biorxiv.org/content/10.1101/2023.11.17.567515v1
- Since deletion of the kch gene inhibits the long-term electrical response to light stimulation (regime II), the authors concluded that K+ ions play a role in the habituation response. However, Kch is a putative K+ ion channel. The use of specific drugs could help to clarify the role of K+ ions.
Our recent electrical impedance spectroscopy publication provides further evidence that Kch is associated with large changes in conductivity as expected for a voltage-gated ion channel (https://pubs.acs.org/doi/10.1021/acs.nanolett.3c04446, 'Electrical impedance spectroscopy with bacterial biofilms: neuronal-like behavior', E.Akabuogu et al, ACS Nanoletters, 2024, in print).
- The manuscript as such does not allow us to properly conclude on the photo-protective role of the Kch ion channel.
That Kch has a photoprotective role is our current working hypothesis. The hypothesis fits with the data, but we are not saying we have proven it beyond all possible doubt.
- The link between membrane potential dynamics and mechanosensitivity is not captured in the equation for the Q-channel opening dynamics in the Hodgkin-Huxley model (Supp Eq 2).
Our model is agnostic with respect to the mechanosensitivity of the ion channels, although we deduce that mechanosensitive ion channels contribute to ion channel Q.
- Given the large number of parameters used in the models, it is hard to distinguish between prediction and fitting.
This is always an issue with electrophysiological modelling (compared with most heart and brain modelling studies we are very conservative in the choice of parameters for the bacteria). In terms of predicting the different phenomena observed, we believe the model is very successful.
Reviewer #2 (Public Review):
Summary of what the authors were trying to achieve:
The authors thought they studied membrane potential dynamics in E.coli biofilms. They thought so because they were unaware that the dye they used to report that membrane potential in E.coli, has been previously shown not to report it. Because of this, the interpretation of the authors' results is not accurate.
We believe the Pilizota work is scientifically flawed.
Major strengths and weaknesses of the methods and results:
The strength of this work is that all the data is presented clearly, and accurately, as far as I can tell.
The major critical weakness of this paper is the use of ThT dye as a membrane potential dye in E.coli. The work is unaware of a publication from 2020 https://www.sciencedirect.com/science/article/pii/S0006349519308793 that demonstrates that ThT is not a membrane potential dye in E. coli. Therefore I think the results of this paper are misinterpreted. The same publication I reference above presents a protocol on how to carefully calibrate any candidate membrane potential dye in any given condition.
We are aware of this study, but believe it to be scientifically flawed. We do not cite the article because we do not think it is a particularly useful contribution to the literature.
I now go over each results section in the manuscript.
Result section 1: Blue light triggers electrical spiking in single E. coli cells
I do not think the title of the result section is correct for the following reasons. The above-referenced work demonstrates the loading profile one should expect from a Nernstian dye (Figure 1). It also demonstrates that ThT does not show that profile and explains why this is so. ThT only permeates the membrane under light exposure (Figure 5). This finding is consistent with blue light peroxidising the membrane (see also Figure 4 of the following work, https://www.sciencedirect.com/science/article/pii/S0006349519303923, on light-induced damage to the electrochemical gradient of protons; I am sure there are more references for this).
The Pilizota group invokes some elaborate artefacts to explain the lack of agreement with a simple Nernstian battery model. The model is incorrect not the fluorophore.
Please note that the loading profile (only observed under light) in the current manuscript in Figure 1B as well as in the video S1 is identical to that in Figure 3 from the above-referenced paper (i.e. https://www.sciencedirect.com/science/article/pii/S0006349519308793), and corresponding videos S3 and S4. This kind of profile is exactly what one would expect theoretically if the light is simultaneously lowering the membrane potential as the ThT is equilibrating, see Figure S12 of that previous work. There, it is also demonstrated by means of monitoring the speed of the bacterial flagellar motor that the electrochemical gradient of protons is being lowered by the light. The authors state that applying the blue light for different time periods and over different time scales did not change the peak profile. This is expected if the light is lowering the electrochemical gradient of protons. But, in Figure S1, it is clear that it affected the timing of the peak, which is again expected, because the light affects the timing of the decay, and thus of the decay profile of the electrochemical gradient of protons (Figure 4 https://www.sciencedirect.com/science/article/pii/S0006349519303923).
We think the proton effect is a million times weaker than that due to potassium i.e. 0.2 M K+ versus 10⁻⁷ M H+. We can comfortably neglect the influx of H+ in our experiments.
I find Figure S1D interesting. There the authors load TMRM, which is a membrane voltage dye that has been used extensively (as far as I am aware this is the first reference for that and it has not been cited https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1914430/). As visible from the last TMRM reference I give, TMRM will only load the cells in Potassium Phosphate buffer with NaCl (and often we used EDTA to permeabilise the membrane). It is not fully clear (to me) whether here TMRM was prepared in rich media (it explicitly says so for ThT in Methods but not for TMRM), but it seems so. If this is the case, it likely also loads because of the damage to the membrane done with light, and therefore I am not surprised that the profiles are similar.
The vast majority of cells continue to be viable. We do not think membrane damage is dominating.
The authors then use CCCP. First, a small correction, as the authors state that it quenches membrane potential. CCCP is a protonophore (https://pubmed.ncbi.nlm.nih.gov/4962086/), so it collapses the electrochemical gradient of protons. This means that it is possible, and this will depend on the type of pumps present in the cell, that CCCP collapses the electrochemical gradient of protons, but the membrane potential is equal and opposite in sign to the ΔpH term. So using CCCP does not automatically mean membrane potential will collapse (e.g. in some mammalian cells it does not need to be the case, but in E.coli it is https://www.biorxiv.org/content/10.1101/2021.11.19.469321v2). CCCP has also been recently found to be a substrate for TolC (https://journals.asm.org/doi/10.1128/mbio.00676-21), but at the concentrations the authors are using CCCP (100 µM) that should not affect the results. However, the authors then state that because they observed, in Figure S1E, a fast efflux of ions in all cells and no spiking dynamics, this confirms that the observed dynamics are membrane potential related. I do not agree that it does. First, Figure S1E does not appear to show transients; instead, it is visible that after 50 min treatment with 100 µM CCCP, ThT dye shows no dynamics. The action of a Nernstian dye is defined. It is not sufficient that a charged molecule is affected in some way by electrical potential; this needs to be in a very specific way to be a Nernstian dye. Part of the profile of ThT loading observed in https://www.sciencedirect.com/science/article/pii/S0006349519308793 is membrane potential related, but not in a way that is characteristic of a Nernstian dye.
Our understanding of the literature is that CCCP poisons the whole metabolism of the bacterial cells. The ATP-driven K+ channels will stop functioning and this is the dominant contributor to membrane potential.
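For readers following the exchange above, the standard chemiosmotic decomposition of the electrochemical gradient of protons into an electrical term and a pH term can be written as a short Python sketch. This is textbook equilibrium thermodynamics, not a model taken from either the manuscript or the referee's papers; the function names, the sign convention (inside minus outside) and the numerical inputs are illustrative assumptions.

```python
import math

R = 8.314462618   # gas constant, J/(mol K)
F = 96485.33212   # Faraday constant, C/mol


def ph_factor_mv(temperature_k=310.15):
    """Conversion factor 2.303 RT/F in mV per pH unit."""
    return 1e3 * math.log(10) * R * temperature_k / F


def proton_motive_force_mv(delta_psi_mv, ph_in, ph_out, temperature_k=310.15):
    """PMF = delta_psi - (2.303 RT/F) * (pH_in - pH_out), all in mV.

    ASSUMPTION: delta_psi is inside minus outside (negative inside).
    """
    return delta_psi_mv - ph_factor_mv(temperature_k) * (ph_in - ph_out)


# Hypothetical example values, chosen only to show the two contributions.
pmf = proton_motive_force_mv(delta_psi_mv=-120.0, ph_in=7.6, ph_out=7.0)
print(f"PMF = {pmf:.1f} mV")

# A zero PMF does not require a zero membrane potential: the electrical
# term can be balanced by an equal and opposite pH term.
balanced = proton_motive_force_mv(-ph_factor_mv(), ph_in=7.0, ph_out=8.0)
print(f"Balanced PMF = {balanced:.1f} mV")
```

The second call shows the referee's point numerically: the PMF vanishes while the membrane potential stays at about -62 mV, because the pH term cancels it.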
Result section 2: Membrane potential dynamics depend on the intercellular distance
In this chapter, the authors report that the time to reach the first intensity peak during ThT loading is different when cells are in microclusters. They interpret this as electrical signalling in clusters because the peak is reached faster in microclusters (as opposed to slower because intuitively in these clusters cells could be shielded from light). However, shielding is one possibility. The other is that the membrane has changed in composition and/or the effective light power the cells can tolerate (with mechanisms to handle light-induced damage, some of which the authors mention later in the paper) is lower. Given that these cells were left in a microfluidic chamber for 2 hours to attach in growth media according to Methods, there is sufficient time for that to happen. In Figure S12 C and D of that same paper from my group (https://ars.els-cdn.com/content/image/1-s2.0-S0006349519308793-mmc6.pdf) one can see the effects of peak intensity and timing of the peak on the permeability of the membrane. Therefore I do not think the distance is the explanation for what the authors observe.
Shielding would provide the reverse effect, since hyperpolarization begins in the dense centres of the biofilms. For the initial 2 hours the cells receive negligible blue light. Neither of the referee’s comments thus seem tenable.
Result section 3: Emergence of synchronized global wavefronts in E. coli biofilms
In this section, the authors exposed a mature biofilm to blue light. They observe that the intensity peak is reached faster in the cells in the middle. They interpret this as the ion-channel-mediated wavefronts moved from the center of the biofilm. As above, cells in the middle can have different membrane permeability to those at the periphery, and probably even more importantly, there is no light profile shown anywhere in SI/Methods. I could be wrong, but the SI3 A profile is consistent with a potential Gaussian beam profile visible in the field of view. In Methods, I find the light source for the blue light and the type of microscope but no comments on how 'flat' the illumination is across their field of view. This is critical to assess what they are observing in this result section. I do find it interesting that the ThT intensity collapsed from the edges of the biofilms. In the publication I mentioned https://www.sciencedirect.com/science/article/pii/S0006349519308793#app2, the collapse of fluorescence was not understood (other than it is not membrane potential related). It was observed in Figure 5A, C, and F, that at the point of peak, electrochemical gradient of protons is already collapsed, and that at the point of peak cell expands and cytoplasmic content leaks out. This means that this part of the ThT curve is not membrane potential related. The authors see that after the first peak collapsed there is a period of time where ThT does not stain the cells and then it starts again. If after the first peak the cellular content leaks, as we have observed, then staining that occurs much later could be simply staining of cytoplasmic positively charged content, and the timing of that depends on the dynamics of cytoplasmic content leakage (we observed this to be happening over 2h in individual cells). ThT is also a non-specific amyloid dye, and in starving E. coli cells formation of protein clusters has been observed (https://pubmed.ncbi.nlm.nih.gov/30472191/), so such cytoplasmic staining seems possible.
It is very easy to see if the illumination is flat (Köhler illumination) by comparing the intensity of background pixels on the detector. It was flat in our case. Protons have little to do with our work for reasons highlighted before. Differential membrane permeability is a speculative phenomenon not well supported by any evidence and with no clear molecular mechanism.
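One simple way to make such a background-pixel comparison quantitative is sketched below in Python. It is an illustration only and not the procedure used in the manuscript: it assumes a background-only frame supplied as a 2D list of pixel intensities, and the tile count and the synthetic test images are hypothetical.

```python
from statistics import mean


def tile_means(image, tiles=3):
    """Mean intensity of each tile in a tiles x tiles grid over a 2D image."""
    rows, cols = len(image), len(image[0])
    means = []
    for ty in range(tiles):
        for tx in range(tiles):
            r0, r1 = ty * rows // tiles, (ty + 1) * rows // tiles
            c0, c1 = tx * cols // tiles, (tx + 1) * cols // tiles
            means.append(mean(image[r][c]
                              for r in range(r0, r1)
                              for c in range(c0, c1)))
    return means


def flatness(image, tiles=3):
    """Ratio of dimmest to brightest tile: 1.0 means perfectly flat."""
    m = tile_means(image, tiles)
    return min(m) / max(m)


# Synthetic background frames (assumed data, for illustration only).
flat_frame = [[100] * 9 for _ in range(9)]
vignetted_frame = [[100 - 5 * (abs(r - 4) + abs(c - 4)) for c in range(9)]
                   for r in range(9)]

print(flatness(flat_frame))
print(flatness(vignetted_frame))
```

A ratio close to 1 indicates even illumination across the field of view, while a Gaussian-like beam profile would show up as dimmer corner tiles and a ratio well below 1.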
Finally, I note that authors observe biofilms of different shapes and sizes and state that they observe similar intensity profiles, which could mean that my comment on 'flatness' of the field of view above is not a concern. However, the scale bar in Figure 2A is not legible, so I can't compare it to the variation of sizes of the biofilms in Figure 2C (67 to 280um). Based on this, I think that the illumination profile is still a concern.
The referee now contradicts themselves and wants a scale bar to be more visible. We have changed the scale bar.
Result section 4: Voltage-gated Kch potassium channels mediate ion-channel electrical oscillations in E. coli
First I note at this point, given that I disagree that the data presented thus 'suggest that E. coli biofilms use electrical signaling to coordinate long-range responses to light stress' as the authors state, it gets harder to comment on the rest of the results.
In this result section the authors look at the effect of Kch, a putative voltage-gated potassium channel, on ThT profile in E. coli cells. And they see a difference. It is worth noting that in the publication https://www.sciencedirect.com/science/article/pii/S0006349519308793 it is found that ThT is also likely a substrate for TolC (Figure 4), but that scenario could not be distinguished from the one where TolC mutant has a different membrane permeability (and there is a publication that suggests the latter is happening https://onlinelibrary.wiley.com/doi/10.1111/j.1365-2958.2010.07245.x). Given this, it is also possible that Kch deletion affects the membrane permeability. I do note that in video S4 I seem to see more of, what appear to be, plasmolysed cells. The authors do not see the ThT intensity with this mutant that appears long after the initial peak has disappeared, as they see in WT. It is not clear how long they waited for this, as from Figure S3C it could simply be that the dynamics of this is a lot slower, e.g. Kch deletion changes membrane permeability.
The work showing that TolC provides a possible passive pathway for ThT to leave cells seems slightly niche. It just demonstrates another mechanism for the cells to equilibrate the concentrations of ThT in a Nernstian manner i.e. driven by the membrane voltage.
The authors themselves state that the evidence for Kch being a voltage-gated channel is indirect (line 54). I do not think there is a need to claim function from a ThT profile of E. coli mutants (nor do I believe it's good practice), given how accurate single-channel recordings are currently. To know the exact dependency on the membrane potential, ion channel recordings on this protein are needed first.
We have good evidence from electrical impedance spectroscopy experiments that Kch increases the conductivity of biofilms (https://pubs.acs.org/doi/10.1021/acs.nanolett.3c04446, 'Electrical impedance spectroscopy with bacterial biofilms: neuronal-like behavior', E.Akabuogu et al, ACS Nanoletters, 2024, in print).
Result section 5: Blue light influences ion-channel mediated membrane potential events in E. coli
In this chapter the authors vary the light intensity and stain the cells with PI (this dye gets into the cells when the membrane becomes very permeable), and the extracellular environment with K+ dye (I have not yet worked carefully with this dye). They find that different amounts of light influence ThT dynamics. This is in line with previous literature (both papers I have been mentioning: Figure 4 https://www.sciencedirect.com/science/article/pii/S0006349519303923 and https://ars.els-cdn.com/content/image/1-s2.0-S0006349519308793-mmc6.pdf especially SI12), but does not add anything new. I think the results presented here can be explained with previously published theory and do not indicate that the ion-channel mediated membrane potential dynamics is a light stress relief process.
The simple Nernstian battery model proposed by Pilizota et al is erroneous in our opinion for reasons outlined above. We believe it will prove to be a dead end for bacterial electrophysiology studies.
Result section 6: Development of a Hodgkin-Huxley model for the observed membrane potential dynamics
This results section starts with the authors stating: 'our data provide evidence that E. coli manages light stress through well-controlled modulation of its membrane potential dynamics'. As stated above, I think they are instead observing the process of ThT loading while the light is damaging the membrane and thus simultaneously collapsing the electrochemical gradient of protons. As stated above, this has been modelled before. And then, they observe a ThT staining that is independent from membrane potential.
This is an erroneous niche opinion. Protons have little say in the membrane potential since there are so few of them. The membrane potential is mostly determined by K+.
I will briefly comment on the Hodgkin-Huxley (HH) based model. First, I think there is no evidence for two channels with different activation profiles as the authors propose. But also, the HH model has been developed for neurons. There, the leakage and the pumping fluxes are both described by a constant representing conductivity, times the difference between the membrane potential and Nernst potential for the given ion. The conductivity in the model is given as gK*n^4 for potassium, gNa*m^3*h for sodium, and gL for leakage, where gK, gNa and gL were measured experimentally for neurons. And n, m, and h are variables that describe the experimentally observed voltage-gated mechanism of neuronal sodium and potassium channels. (Please see Hodgkin AL, Huxley AF. 1952. Currents carried by sodium and potassium ions through the membrane of the giant axon of Loligo. J. Physiol. 116:449-72 and Hodgkin AL, Huxley AF. 1952. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117:500-44).
In the 70 years since Hodgkin and Huxley first presented their model, a huge number of similar models have been proposed to describe cellular electrophysiology. We are not being hyperbolic when we state that the HH models for excitable cells are like the Schrödinger equation for molecules. We carefully adapted our HH model to reflect the currently understood electrophysiology of E. coli.
Thus, in applying the model to describe bacterial electrophysiology one should ensure near equilibrium requirement holds (so that (V-VQ) etc terms in authors' equation Figure 5 B hold), and potassium and other channels in a given bacterium have similar gating properties to those found in neurons. I am not aware of such measurements in any bacteria, and therefore think the pump leak model of the electrophysiology of bacteria needs to start with fluxes that are more general (for example Keener JP, Sneyd J. 2009. Mathematical physiology: I: Cellular physiology. New York: Springer or https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0000144 [journals.plos.org])
The reference is to a slightly more modern version of a simple Nernstian battery model. The model will not oscillate and thus will not help modelling membrane potentials in bacteria. We are unsure where the equilibrium requirement comes from (inadequate modelling of the dynamics?)
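For readers unfamiliar with the conductance-based form under discussion, the classical Hodgkin-Huxley currents can be sketched as follows. This is a minimal illustration that assumes the textbook squid-axon parameters in the modern sign convention; these are not E. coli values and are not the parameters of the authors' adapted model. The function names are illustrative only.

```python
import math

# Textbook squid-axon parameters (Hodgkin & Huxley 1952, modern sign convention).
# ASSUMPTION: used only to illustrate the form of the model; not E. coli values.
G_NA, G_K, G_L = 120.0, 36.0, 0.3      # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387  # reversal (Nernst) potentials, mV


def hh_currents(v, m, h, n):
    """Return (I_Na, I_K, I_L) in uA/cm^2 for membrane potential v (mV).

    Each current is a conductance times the driving force (v - E_ion);
    m, h, n are gating variables between 0 and 1.
    """
    i_na = G_NA * m**3 * h * (v - E_NA)
    i_k = G_K * n**4 * (v - E_K)
    i_l = G_L * (v - E_L)
    return i_na, i_k, i_l


def n_inf(v):
    """Steady-state potassium activation, alpha_n / (alpha_n + beta_n).

    The alpha_n expression has a removable singularity at v = -55 mV,
    which this sketch does not handle.
    """
    alpha = 0.01 * (v + 55.0) / (1.0 - math.exp(-(v + 55.0) / 10.0))
    beta = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return alpha / (alpha + beta)
```

The driving-force terms vanish at each ion's reversal potential, which is the sense in which the (V - V_ion) form ties the model to Nernst potentials. Whether bacterial channels share the gating exponents shown here is exactly the point the reviewer and authors dispute.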
Result section 7: Mechanosensitive ion channels (MS) are vital for the first hyperpolarization event in E. coli.
The results that Msc channels affect the profile of ThT dye are interesting. It is again possible that the membrane permeability of these mutants has changed and therefore the dynamics have changed, so this needs to be checked first. I also note that our results show that the peak of ThT coincides with cell expansion. For this to be understood a model is needed that also takes into account the link between maintenance of electrochemical gradients of ions in the cell and osmotic pressure.
The evidence for permeability changes in the membranes seems to be tenuous.
A side note is that the authors state that the Msc responds to stress-related voltage changes. I think this is an overstatement. Mscs respond predominantly to membrane tension and are mostly nonspecific (see how their action recovers cellular volume in this publication https://www.pnas.org/doi/full/10.1073/pnas.1522185113 [pnas.org]). Authors cite references 35-39 to support this statement. These publications still state that these channels are predominantly membrane tension-gated. Some of the references state that the presence of external ions is important for tension-related gating but sometimes they gate spontaneously in the presence of certain ions. Other publications cited don't really look at gating with respect to ions (39 is on clustering). This is why I think the statement is somewhat misleading.
We have reworded the discussion of Mscs since the literature appears to be ambiguous. We will try to run some electrical impedance spectroscopy experiments on the Msc mutants in the future to attempt to remove the ambiguity.
Result section 8: Anomalous ion-channel-mediated wavefronts propagate light stress signals in 3D E. coli biofilms.
I am not commenting on this result section, as it would only be applicable if ThT was a membrane potential dye in E. coli.
Ok, but we disagree on the use of ThT.
Aims achieved/results support their conclusions:
The authors clearly present their data. I am convinced that they have accurately presented everything they observed. However, I think their interpretation of the data and conclusions is inaccurate in line with the discussion I provided above.
Likely impact of the work on the field, and the utility of the methods and data to the community:
I do not think this publication should be published in its current format. It should be revised in light of the previous literature as discussed in detail above. I believe presenting it in its current form on eLife pages would create unnecessary confusion.
We believe many of the Pilizota group articles are scientifically flawed and are causing the confusion in the literature.
Any other comments:
I note that while this work studies E. coli, it references papers in other bacteria using ThT. For example, in lines 35-36 authors state that bacteria (Bacillus subtilis in this case) in biofilms have been recently found to modulate membrane potential citing the relevant literature from 2015. It is worth noting that the most recent paper https://journals.asm.org/doi/10.1128/mbio.02220-23 [journals.asm.org] found that ThT binds to one or more proteins in the spore coat, suggesting that it does not act as a membrane potential dye in Bacillus spores. It is possible that it still reports membrane potential in Bacillus cells and the recent results are strictly spore-specific, but these should be kept in mind when using ThT with Bacillus.
ThT was used successfully in previous studies of normal B. subtilis cells (by our own group and A.Prindle, ‘Spatial propagation of electrical signal in circular biofilms’, J.A.Blee et al, Physical Review E, 2019, 100, 052401, J.A.Blee et al, ‘Membrane potentials, oxidative stress and the dispersal response of bacterial biofilms to 405 nm light’, Physical Biology, 2020, 17, 2, 036001, A.Prindle et al, ‘Ion channels enable electrical communication in bacterial communities’, Nature, 2015, 527, 59-63). The connection to low metabolism spore research seems speculative.
Reviewer #3 (Public Review):
It has recently been demonstrated that bacteria in biofilms show changes in membrane potential in response to changes in their environment, and that these can propagate signals through the biofilm to coordinate bacterial behavior. Akabuogu et al. contribute to this exciting research area with a study of blue light-induced membrane potential dynamics in E. coli biofilms. They demonstrate that Thioflavin-T (ThT) intensity (a proxy for membrane potential) displays multiphasic dynamics in response to blue light treatment. They additionally use genetic manipulations to implicate the potassium channel Kch in the latter part of these dynamics. Mechanosensitive ion channels may also be involved, although these channels seem to have blue light-independent effects on membrane potential as well. In addition, there are challenges to the quantitative interpretation of ThT microscopy data which require consideration. The authors then explore whether these dynamics are involved in signaling at the community level. The authors suggest that cell firing is both more coordinated when cells are clustered and happens in waves in larger, 3D biofilms; however, in both cases evidence for these claims is incomplete. The authors present two simulations to describe the ThT data. The first of these simulations, a Hodgkin-Huxley model, indicates that the data are consistent with the activity of two ion channels with different kinetics; the Kch channel mutant, which ablates a specific portion of the response curve, is consistent with this. The second model is a fire-diffuse-fire model to describe wavefront propagation of membrane potential changes in a 3D biofilm; because the wavefront data are not presented clearly, the results of this model are difficult to interpret. 
Finally, the authors discuss whether these membrane potential changes could be involved in generating a protective response to blue light exposure; increased death in a Kch ion channel mutant upon blue light exposure suggests that this may be the case, but a no-light control is needed to clarify this.
In a few instances, the paper is missing key control experiments that are important to the interpretation of the data. This makes it difficult to judge the meaning of some of the presented experiments.
(1) An additional control for the effects of autofluorescence is very important. The authors conduct an experiment where they treat cells with CCCP and see that Thioflavin-T (ThT) dynamics do not change over the course of the experiment. They suggest that this demonstrates that autofluorescence does not impact their measurements. However, cellular autofluorescence depends on the physiological state of the cell, which is impacted by CCCP treatment. A much simpler and more direct experiment would be to repeat the measurement in the absence of ThT or any other stain. This experiment should be performed both in the wild-type strain and in the ∆kch mutant.
ThT is a very bright fluorophore (much brighter than a GFP). It is clear from the images of non-stained samples that autofluorescence provides a negligible contribution to the fluorescence intensity in an image.
(2) The effects of photobleaching should be considered. Of course, the intensity varies a lot over the course of the experiment in a way that photobleaching alone cannot explain. However, photobleaching can still contribute to the kinetics observed. Photobleaching can be assessed by changing the intensity, duration, or frequency of exposure to excitation light during the experiment. Considerations about photobleaching become particularly important when considering the effect of catalase on ThT intensity. The authors find that the decrease in ThT signal after the initial "spike" is attenuated by the addition of catalase; this is what would be predicted by catalase protecting ThT from photobleaching (indeed, catalase can be used to reduce photobleaching in time lapse imaging).
Photobleaching was negligible over the course of the experiments. We employed techniques such as reducing sample exposure time and using the appropriate light intensity to minimize photobleaching.
(3) It would be helpful to have a baseline of membrane potential fluctuations in the absence of the proposed stimulus (in this case, blue light). Including traces of membrane potential recorded without light present would help support the claim that these changes in membrane potential represent a blue light-specific stress response, as the authors suggest. Of course, ThT is blue, so if the excitation light for ThT is problematic for this experiment the alternative dye tetramethylrhodamine methyl ester perchlorate (TMRM) can be used instead.
Unfortunately the fluorescent baseline is too weak to measure cleanly in this experiment. The collective response of all the bacteria hyperpolarizing at the same time appears to dominate the signal (measurements in the eLife article and new potentiometry measurements).
(4) The effects of ThT in combination with blue light should be more carefully considered. In mitochondria, a combination of high concentrations of blue light and ThT leads to disruption of the PMF (Skates et al. 2021 BioRXiv), and similarly, ThT treatment enhances the photodynamic effects of blue light in E. coli (Bondia et al. 2021 Chemical Communications). If present in this experiment, this effect could confound the interpretation of the PMF dynamics reported in the paper.
We think the PMF plays a minority role in determining the membrane potential in E. coli, for reasons outlined before (H+ is a minority ion in E. coli compared with K+).
(5) Figures 4D - E indicate that a ∆kch mutant has increased propidium iodide (PI) staining in the presence of blue light; this is interpreted to mean that Kch-mediated membrane potential dynamics help protect cells from blue light. However, Live/Dead staining results in these strains in the absence of blue light are not reported. This means that the possibility that the ∆kch mutant has a general decrease in survival (independent of any effects of blue light) cannot be ruled out.
Both bacterial strains had similar growth curves and both engaged in membrane potential dynamics for the duration of the experiment. We were interested in bacterial cells that exhibited membrane potential dynamics in the presence of the stress. Bacterial cells need to be alive to engage in membrane potential dynamics (hyperpolarize) under stress conditions. Cells that engaged in membrane potential dynamics and later stained red were only counted after the entire duration. We believe that the wildtype handles the light stress better than the ∆kch mutant as measured with the PI.
(6) Additionally in Figures 4D - E, the interpretation of this experiment can be confounded by the fact that PI uptake can sometimes be seen in bacterial cells with high membrane potential (Kirchhoff & Cypionka 2017 J Microbial Methods); the interpretation is that high membrane potential can lead to increased PI permeability. Because the membrane potential is largely higher throughout blue light treatment in the ∆kch mutant (Fig. 3AB), this complicates the interpretation of this experiment.
Kirchhoff & Cypionka 2017 J Microbial Methods, using fluorescence microscopy, suggested that changes in membrane potential dynamics can introduce experimental bias when propidium iodide is used to confirm the viability of the bacterial strains, B subtilis (DSM-10) and Dinoroseobacter shibae, that are starved of oxygen (via N2 gassing) for 2 hours. They attempted to support their findings by using CCCP to stop the membrane potential dynamics (but never showed any pictorial or plotted data for this confirmatory experiment). In our experimental methodology, cell death was not forced on the cells by introducing an extra burden or via anoxia. We believe that the accumulation of PI in the ∆kch mutant is not due to high membrane potential dynamics but reflects PI reporting damaged/dead cells without bias. We think that propidium iodide is good for this experiment. Propidium iodide is a dye that is extensively used in life sciences. PI has also been used in the study of bacterial electrophysiology (https://pubmed.ncbi.nlm.nih.gov/32343961/) and no membrane potential related bias was reported.
Throughout the paper, many ThT intensity traces are compared, and described as "similar" or "dissimilar", without detailed discussion or a clear standard for comparison. For example, the two membrane potential curves in Fig. S1C are described as "similar" although they have very different shapes, whereas the curves in Fig. 1B and 1D are discussed in terms of their differences although they are evidently much more similar to one another. Without metrics or statistics to compare these curves, it is hard to interpret these claims. These comparative interpretations are additionally challenging because many of the figures in which average trace data are presented do not indicate standard deviation.
Comparison of small changes in the absolute intensities is problematic in such fluorescence experiments. We mean the shape of the traces is similar and they can be modelled using a HH model with similar parameters.
The differences between the TMRM and ThT curves that the authors show in Fig. S1C warrant further consideration. Some of the key features of the response in the ThT curve (on which much of the modeling work in the paper relies) are not very apparent in the TMRM data. It is not obvious to me which of these traces will be more representative of the actual underlying membrane potential dynamics.
In our experiment, TMRM was used to confirm the dynamics observed using ThT. However, ThT appears to be more photostable than TMRM (especially towards the 2nd peak). The most interesting observation is that with both dyes, all phases of the membrane potential dynamics were conspicuous (the first peak, the quiescent period and the second peak). The time periods for these three episodes were also similar.
A key claim in this paper (that dynamics of firing differ depending on whether cells are alone or in a colony) is underpinned by "time-to-first peak" analysis, but there are some challenges in interpreting these results. The authors report an average time-to-first peak of 7.34 min for the data in Figure 1B, but the average curve in Figure 1B peaks earlier than this. In Figure 1E, it appears that there are a handful of outliers in the "sparse cell" condition that likely explain this discrepancy. Either an outlier analysis should be done and the mean recomputed accordingly, or a more outlier-robust method like the median should be used instead. Then, a statistical comparison of these results will indicate whether there is a significant difference between them.
The key point is the comparison of standard errors on the standard deviation.
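The reviewer's point about outlier-robust summaries can be illustrated with a short sketch. The values below are hypothetical time-to-first-peak measurements invented for illustration; they are not the authors' data. The outlier rule (more than three scaled median absolute deviations from the median) is one common convention, assumed here rather than taken from the manuscript.

```python
import statistics

# HYPOTHETICAL time-to-first-peak values in minutes (illustrative only).
times = [5.0, 5.0, 6.0, 6.0, 6.0, 7.0, 7.0, 20.0, 25.0]


def mad(values):
    """Median absolute deviation from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


mean_t = statistics.mean(times)      # pulled upward by the two late cells
median_t = statistics.median(times)  # insensitive to the two late cells

# Flag values more than 3 scaled MADs from the median.
# The factor 1.4826 makes the MAD comparable to a standard deviation
# for normally distributed data.
threshold = 3 * 1.4826 * mad(times)
outliers = [t for t in times if abs(t - median_t) > threshold]

print(f"mean = {mean_t:.2f} min, median = {median_t:.2f} min, outliers = {outliers}")
```

In this toy data set a few late-firing cells shift the mean well above the median, which is the kind of discrepancy the reviewer describes between the reported average and the peak of the averaged curve.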
In two different 3D biofilm experiments, the authors report the propagation of wavefronts of membrane potential; I am unable to discern these wavefronts in the imaging data, and they are not clearly demonstrated by analysis.
The first data set is presented in Figures 2A, 2B, and Video S3. The images and video are very difficult to interpret because of how the images have been scaled: the center of the biofilm is highly saturated, and the zero value has also been set too high to consistently observe the single cells surrounding the biofilm. With the images scaled this way, it is very difficult to assess dynamics. The time stamps in Video S3 and on the panels in Figure 2A also do not correspond to one another although the same biofilm is shown (and the time course in 2B is also different from what is indicated in 2B). In either case, it appears that the center of the biofilm is consistently brighter than the edges, and the intensity of all cells in the biofilm increases in tandem; by eye, propagating wavefronts (either directed toward the edge or the center) are not evident to me. Increased brightness at the center of the biofilm could be explained by increased cell thickness there (as is typical in this type of biofilm). From the image legend, it is not clear whether the image presented is a single confocal slice or a projection. Even if this is a single confocal slice, in both Video S3 and Figure 2A there are regions of "haze" from out-of-focus light evident, suggesting that light from other focal planes is nonetheless present. This seems to me to be a simpler explanation for the fluorescence dynamics observed in this experiment: cells are all following the same trajectory that corresponds to that seen for single cells, and the center is brighter because of increased biofilm thickness.
We thank the reviewer for this important observation. We have made changes to the figures to address this confusion. The cell cover has no influence on the observed membrane potential dynamics. The entire biofilm was exposed to the same blue light at each time. Therefore all parts of the biofilm received equal blue light intensity. The membrane potential dynamics were not influenced by cell density (see Fig 2C).
The second data set is presented in Video S6B; I am similarly unable to see any wave propagation in this video. I observe only a consistent decrease in fluorescence intensity throughout the experiment that is spatially uniform (except for the bright, dynamic cells near the top; these presumably represent cells that are floating in the microfluidic and have newly arrived to the imaging region).
A visual inspection of Video S6B shows a fast rise, a decrease in fluorescence and a second rise (supplementary figure 4B). The fluorescence data were carefully obtained using the Imaris software. We created a curved geometry on each slice of the confocal stack and analyzed the surfaces of this curved plane along the z-axis.
3D imaging data can be difficult to interpret by eye, so it would perhaps be more helpful to demonstrate these propagating wavefronts by analysis; however, such analysis is not presented in a clear way. The legend in Figure 2B mentions a "wavefront trace", but there is no position information included - this trace instead seems to represent the average intensity trace of all cells. To demonstrate the propagation of a wavefront, this analysis should be shown for different subpopulations of cells at different positions from the center of the biofilm. Data is shown in Figure 8 that reflects the velocity of the wavefront as a function of biofilm position; however, because the wavefronts themselves are not evident in the data, it is difficult to interpret this analysis. The methods section additionally does not contain sufficient information about what these velocities represent and how they are calculated. Because of this, it is difficult for me to evaluate the section of the paper pertaining to wave propagation and the predicted biofilm critical size.
The analysis is considered in more detail in a more expansive modelling article, currently under peer review in a physics journal, ‘Electrical signalling in three dimensional bacterial biofilms using an agent based fire-diffuse-fire model’, V.Martorelli, et al, 2024 https://www.biorxiv.org/content/10.1101/2023.11.17.567515v1
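The position-resolved analysis requested by the reviewer could, in principle, be sketched as below. This is a hypothetical illustration and not the authors' Imaris workflow: it assumes each cell has a known distance from the biofilm center and an intensity trace sampled at common time points, and the function name and bin width are invented for the example.

```python
from collections import defaultdict


def peak_time_by_radius(radii, traces, times, bin_width):
    """Group cells into radial bins and return, for each bin index,
    the time at which the bin-averaged intensity trace is maximal."""
    groups = defaultdict(list)
    for r, trace in zip(radii, traces):
        groups[int(r // bin_width)].append(trace)

    result = {}
    for bin_index, members in sorted(groups.items()):
        # Average the traces of all cells in this radial bin, time point by time point.
        mean_trace = [sum(vals) / len(vals) for vals in zip(*members)]
        peak_index = max(range(len(mean_trace)), key=mean_trace.__getitem__)
        result[bin_index] = times[peak_index]
    return result


# SYNTHETIC example: cells farther from the center peak later.
times = [0, 1, 2, 3, 4]                      # minutes
radii = [1.0, 2.0, 11.0, 12.0, 21.0]         # distance from center, micrometres
traces = [
    [0, 5, 1, 0, 0],
    [0, 4, 1, 0, 0],
    [0, 1, 5, 1, 0],
    [0, 1, 4, 1, 0],
    [0, 0, 1, 5, 1],
]
peaks = peak_time_by_radius(radii, traces, times, bin_width=10.0)
```

A peak time that shifts monotonically with radial bin would be consistent with a propagating wavefront, whereas a peak time that is the same in every bin would be consistent with the synchronous response the reviewer suggests.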
There are some instances in the paper where claims are made that do not have data shown or are not evident in the cited data:
(1) In the first results section, "When CCCP was added, we observed a fast efflux of ions in all cells"- the data figure pertaining to this experiment is in Fig. S1E, which does not show any ion efflux. The methods section does not mention how ion efflux was measured during CCCP treatment.
We have worded this differently to properly convey our results.
(2) In the discussion of voltage-gated calcium channels, the authors refer to "spiking events", but these are not obvious in Figure S3E. Although the fluorescence intensity changes over time, it's hard to distinguish these fluctuations from measurement noise; a no-light control could help clarify this.
The calcium transients observed were not due to noise or artefacts.
(3) The authors state that the membrane potential dynamics simulated in Figure 7B are similar to those observed in 3D biofilms in Fig. S4B; however, the second peak is not clearly evident in Fig. S4B and it looks very different for the mature biofilm data reported in Fig. 2. I have some additional confusion about this data specifically: in the intensity trace shown in Fig. S4B, the intensity in the second frame is much higher than the first; this is not evident in Video S6B, in which the highest intensity is in the first frame at time 0. Similarly, the graph indicates that the intensity at 60 minutes is higher than the intensity at 4 minutes, but this is not the case in Fig. S4A or Video S6B.
The confusion stated here has now been addressed. It should also be noted that while Fig 2.1 was obtained with an LED light source, Fig S4A was obtained using a laser light source. While obtaining the confocal images (for Fig S4A), the light intensity was controlled to further minimize photobleaching. Most importantly, there is evidence of a slow rise to the 2nd peak in Fig S4B. The first peak, quiescence and slow rise to the second peak are evident.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Scientific recommendations:
- Although Fig 4A clearly shows that light stimulation has an influence on the dynamics of cell membrane potential in the biofilm, it is important to rule out the contribution of variations in environmental parameters. I understand that for technical reasons, the flow of fresh medium must be stopped during image acquisition. Therefore, I suggest performing control experiments, where the flow is stopped before image acquisition (15min, 30min, 45min, and 1h before). If there is no significant contribution from environmental variations (pH, RedOx), the dynamics of the electrical response should be superimposed whatever the delay between stopping the flow and switching on the light.
In this research study, we focused on how E. coli cells and biofilms react to blue light stress via their membrane potential dynamics. This involved growing the cells and biofilms, stopping the media flow and obtaining data immediately. We believe that stopping the flow not only helped us to manage data acquisition, it also helped us reduce the effect of environmental factors. In a future study we will expand the work to include how the membrane potential dynamics evolve in the presence of changing environmental factors, for example those induced by stopping the flow at varied times.
- Since TMRM signal exhibits a linear increase after the first response peak (Supplementary Figure 1D), I recommend mitigating the statement at line 78.
- To improve the spatial analysis of the electrical response, I suggest plotting kymographs of the intensity profiles across the biofilm. I have plotted this kymograph for Video S3 and it appears that there is no electrical propagation for the second peak. In addition, the authors should provide technical details of how R^2(t) is measured in the first regime (Figure 7E).
See the dedicated simulation article for more details. https://www.biorxiv.org/content/10.1101/2023.11.17.567515v1
- Line 152: To assess the variability of the latency, the authors should consider measuring the variance divided by the mean instead of SD, which may depend on the average value.
We are happy with our current use of standard error on the standard deviation. It shows what we claim to be true.
- Line 154-155: To truly determine whether the amplitude of the "action potential" is independent of biofilm size, the authors should not normalise the signals.
Good point. We qualitatively compared both normalized and unnormalized data. Recent electrical impedance spectroscopy measurements (unpublished) indicate that the electrical activity is an extensive quantity i.e. it scales with the size of the biofilms.
- To clarify the role of K+ in the habituation response, I suggest using valinomycin at sub-inhibitory concentrations (10µM). Besides, the high concentration of CCCP used in this study completely inhibits cell activity. Not surprisingly, no electrical response to light stimulation was observed in the presence of CCCP. Finally, the Kch complementation experiment exhibits a "drop after the first peak" on a single point. It would be more convincing to increase the temporal resolution (1min->10s) to show that there is indeed a first and a second peak.
An interesting experiment for the future.
- Line 237-238: There are only two points suggesting that the dynamics of hyperpolarization are faster at higher irradiance (Fig 4A). The authors should consider adding a third intermediate point at 17µW/mm^2 to confirm the statement made in this sentence.
Multiple repeats were performed. We are confident of the robustness of our data.
- Line 249 + Fig 4E: It seems that the data reported on Fig 4E are extracted from Fig 4D. If this is indeed the case, the data should be normalised by the total population size to compare survival probabilities under the two conditions. It would also be great to measure these probabilities (for WT and ∆kch) in the presence of ROS scavengers.
- To distinguish between model fitting and model predictions, the authors should clearly state which parameters are taken from the literature and which parameters are adjusted to fit the experimental data.
- Supplementary Figure 4A: why can't we see any wavefront in this series of images?
For the experimental data, the wavefront was analyzed using the Imaris software. We systematically created a ROI with a curved geometry within the confocal stack (the biofilm). The ThT fluorescence traced along the surface of the curved geometry was analyzed along the z-axis.
- Fig 7B: Could the authors explain why the plateau is higher in the simulations than in the biofilm experiments? Could they add noise on the firing activities?
See the dedicated Martorelli modelling article. Adding noise would in general require stochastic Hodgkin-Huxley modelling, and the fluorescence data (and electrical impedance spectroscopy data) presented do not show extensive noise (due to collective averaging over many bacterial cells).
- Supplementary Figure 4B: Why can't we see the second peak in confocal images?
The second peak is present although not as robust as in Fig 2B. The confocal images were obtained with a laser source. Therefore we tried to create a balance between applying sufficient light stress on the bacterial cells and mitigating photobleaching.
Editing recommendations:
The editing recommendations below have been applied where appropriate.
- Many important technical details are missing (e.g. R^2, curvature, and 445nm irradiance measurements). Error bars are missing from most graphs. The captions should clearly indicate if these are single-cell or biofilm experiments, strain name, illumination conditions, number of experiments, SD, or SE. Please indicate on all panels of all figures in the main text and in the supplements, which are the conditions: single cell vs. biofilm, strains, medium, centrifugal vs centripetal etc..., where relevant. Please also draw error bars everywhere.
We have now made appropriate changes. We use "cells" when dealing with single cells and "biofilms" when dealing with biofilms. We decided to describe the strain name either on the panel or in the image description.
- Line 47-51: The way the paragraph is written suggests that no coordinated electrical oscillations have been observed in Gram-negative biofilms. However, Hennes et al (referenced as 57 in this manuscript) have shown that a wave of hyperpolarized cells propagates in a Neisseria gonorrhoeae colony, which is a Gram-negative bacterium.
We are now aware of this work. It was not published when we first submitted our work and the authors claim the waves of activity are due to ROS diffusion NOT propagating waves of ions (coordinated electrical wavefronts).
- Line 59: "stressor" -> "stress" or "perturbation".
The correction has been made.
- Line 153: Please indicate in the Material&Methods how the size of the biofilm is measured.
The biofilm size was obtained using BiofilmQ, and the step-by-step guide for using BiofilmQ was stated.
- Figure 2A: Please provide associated brightfield images to locate bacteria.
- Line 186: Please remove "wavefront" from the caption. Fig2B only shows the average signal as a function of time.
This correction has been implemented.
- Fig 3B,C: Please indicate single cell and biofilm on the panels and also WT and ∆kch.
- Line 289: I suggest adding "in single cell experiments" to the title of this section.
- Fig 5A: blue light is always present at regular time intervals during regime I and II. The presence of blue light only in regime I could be misleading.
- Fig 5C: The curve in Fig 5D seems to correspond to the biofilm case. The curve given by the model, should be compared with the average curve presented in Fig 1D.
- Fig 6A, B, and C: These figures could be moved to supplements.
- Line 392: Replace "turgidity" with "turgor pressure".
- Fig 7C,E: Please use a log-log scale to represent these data and indicate the line of slope 1.
- Fig 7E: The x-axis has been cropped.
- Please provide a supplementary movie for the data presented in Fig 7E.
- Line 455: E. coli biofilms do not express ThT.
- Line 466: "\gamma is the anomalous exponent". Please remove anomalous (\gamma can equal 1 at this stage).
- Line 475: Please replace "section" with "projection".
- Line 476: Please replace "spatiotemporal" with "temporal". There is no spatial dependency in either figure.
- Line 500: Please define Eikonal approximation.
- Fig 8 could be moved to supplements.
- Line 553: "predicted" -> "predict".
- Line 593: Could the authors explain why their model offers much better quantitative agreement?
- Line 669: What does "universal" mean in that context?
- Line 671: A volume can be pipetted but not a concentration.
- Line 676: Are triplicates technical or biological replicates?
- Sup Fig1: Please use minutes instead of seconds in panel A.
- Model for membrane dynamics: "The fraction of time the Q+ channel is open" -> "The dynamics of Q+ channel activity can be written". Ditto for K+ channel...
- Model for membrane dynamics: "the term ... is a threshold-linear". This function is not linear at all. Why is it called linear? Also, please describe what \sigma is.
- ABFDF model: "releasing a given concentration" -> "releasing a local concentration" or "a given number" but it's not \sigma anymore. Besides, this \sigma is unlikely related to the previous \sigma used in the model of membrane potential dynamics in single cells. Please consider renaming one or the other. Also, ions are referred to as C+ in the text and C in equation 8. Am I missing something?
Reviewer #2 (Recommendations For The Authors):
I have included all my comments as one review. I have done so, despite the fact that some minor comments could have gone into this section, because I decided to review each Result section. I thus felt that not writing it as one review might be harder to follow. I have however highlighted which comments are minor suggestions or where I felt corrections were needed.
However, while I am happy with all my comments being public, given their nature I think they should be shown to authors first. Perhaps the authors want to go over them and think about it before deciding if they are happy for their manuscript to be published along with these comments, or not. I will highlight this in an email to the editor. I question whether in this case, given that I am raising major issues, publishing both the manuscript and the comments is the way to go as I think it might just generate confusion among the audience.
Reviewer #3 (Recommendations For The Authors):
I was unable to find any legends for any of the supplemental videos in my review materials, and I could not open supplemental video 5.
I made some comments in the public review about the analysis and interpretation of the time-to-fire data. One of the other challenges in this data set is that the time resolution is limited- it seems that a large proportion of cells have already fired after a single acquisition frame. It would be ideal to increase the time resolution on this measurement to improve precision. This could be done by imaging more quickly, but that would perhaps necessitate more blue light exposure; an alternative is to do this experiment under lower blue light irradiance where the first spike time is increased (Figure 4A).
In the public review, I mentioned the possible impact of high membrane potential on PI permeability. To address this, the experiment could be repeated with other stains, or the viability of blue light-treated cells could be addressed more directly by outgrowth or colony-forming unit assays.
In the public review, I mentioned the possible combined toxicity of ThT and blue light. Live/dead experiments after blue light exposure with and without ThT could be used to test for such effects, and/or the growth curve experiment in Figure 1F could be repeated with blue light exposure at a comparable irradiance used in the experiment.
Throughout the paper and figure legends, it would help to have more methodological details in the main text, especially those that are critical for the interpretation of the experiment. The experimental details in the methods section are nicely described, but the data analysis section should be expanded significantly.
At the end of the results section, the authors suggest a critical biofilm size of only 4 µm for wavefront propagation (not much larger than a single cell!). The authors show responses for various biofilm sizes in Fig. 2C, but these are all substantially larger. Are there data for cell clusters above and below this size that could support this claim more directly?
The authors mention image registration as part of their analysis pipeline, but the 3D data sets in Video S6B and Fig. S4A do not appear to be registered- were these registered prior to the velocity analysis reported in Fig. 8?
One of the most challenging claims to demonstrate in this paper is that these membrane potential wavefronts are involved in coordinating a large, biofilm-scale response to blue light. One possible way to test this might be to repeat the Live/Dead experiment in planktonic culture or the single-cell condition. If the protection from blue light specifically emerges due to coordinated activity of the biofilm, the Kch mutant would not be expected to show a change in Live/Dead staining in non-biofilm conditions.
Line 140: How is "mature biofilm" defined? Also on this same line, what does "spontaneous" mean here?
Line 151: "much smaller": Given that the reported time for 3D biofilms is 2.73 ± 0.85 min and in microclusters is 3.27 ± 1.77 min, this seems overly strong.
Line 155: How is "biofilm density" characterized? Additionally, the data in Figure 2C are presented in distance units (µm), but the text refers to "areal coverage"- please define the meaning of these distance units in the legend and/or here in the text (is this the average radius?).
Lines 161-162: These claims seem strong given the data presented before, and the logic is not very explicit. For example, in the second sentence, the idea that this signaling is used to "coordinate long-range responses to light stress" does not seem strongly evidenced at this point in the paper. What is meant by a long-range response to light stress- are there processes to respond to light that occur at long-length scales (rather than on the single-cell scale)? If so, is there evidence that these membrane potential changes could induce these responses? Please clarify the logic behind these conclusions.
Lines 235-236: In the lower irradiance conditions, the responses are slower overall, and it looks like the ThT intensity is beginning to rise at the end of the measurement. Could a more prominent second peak be observed in these cases if the measurement time was extended?
Line 242-243: The overall trajectories of extracellular potassium are indeed similar, but the kinetics of the second peak of potassium are different than those observed by ThT (it rises some minutes earlier)- is this consistent with the idea that Kch is responsible for that peak? Additionally, the potassium dynamics also reflect the first peak- is this surprising given that the Kch channel has no effect on this peak?
Line 255-256: Again, this seems like a very strong claim. There are several possible interpretations of the catalase experiment (which should be discussed); this experiment perhaps suggests that ROS impacts membrane potential, but does not obviously indicate that these membrane potential fluctuations mitigate ROS levels or help the cells respond to ROS stress. The loss of viability in the ∆kch mutant might indicate a link between these membrane potential experiments and viability, but it is hard to interpret without the no-light control I mention in the public review.
Lines 313-315: "The model predicts... the external light stress". Please clarify this section. First, where does this prediction arise from in the modeling work? Second, I am not sure what is meant by "modulates the light stress" or "keeps the cell dynamics robust to the intensity of external light stress" (especially since the dynamics clearly vary with irradiance, as seen in Figure 4A).
Line 322: I am not sure what "handles the ROS by adjusting the profile of the membrane potential dynamics" means. What is meant by "handling" ROS? Is the hypothesis that membrane potential dynamics themselves are protective against ROS, or that they induce a ROS-protective response downstream, or something else? Later in lines 327-8 the authors write that changes in the response to ROS in the model agree with the hypothesis, but just showing that ROS impacts the membrane potential does not seem to demonstrate that this has a protective effect against ROS.
Line 365-366: This section title seems confusing: mechanosensitive ion channels totally ablate membrane potential dynamics, they don't have a specific effect on the first hyperpolarization event. The claim that mechanosensitive ion channels are specifically involved in the first event also appears in the abstract.
Also, the apparent membrane potential is much lower even at the start of the experiment in these mutants- is this expected? This seems to imply that these ion channels also have a blue light independent effect.
Lines 368, 371: Should be VGCCs rather than VGGCs.
Line 477: I believe the figure reference here should be to Figure 7B, not 6B.
Line 567-568: "The initial spike is key to registering the presence of the light stress." What is the evidence for this claim?
Line 592-594: "We have presented much better quantitative agreement..." This is a strong claim; it is not immediately evident to me that the agreement between model and prediction is "much better" in this work than in the cited work. The model in Figure 4 of reference 57 seems to capture the key features of their data. Clarification is needed about this claim.
Line 613: "...strains did not have any additional mutations." This seems to imply that whole genome sequencing was performed- is this the case?
Line 627: I believe this should refer to Figure S2A-B rather than S1.
Line 719: What percentage of cells did not hyperpolarize in these experiments?
Lines 751-754: As I mentioned above, significant detail is missing here about how these measurements were made. How is "radius" defined in 3D biofilms like the one shown in Video S6B, which looks very flat? What is meant by the distance from the substrate to the core, since usually in this biofilm geometry, the core is directly on the substrate? Most importantly, this only describes the process of sectioning the data- how were these sections used to compute the velocity of ThT signal propagation?
I also have some comments specifically on the figure presentation:
Normalization from 0 to 1 has been done in some of the ThT traces in the paper, but not all. The claims in the paper would be easiest to evaluate if the non-normalized data were shown- this is important for the interpretation of some of the claims.
Some indication of standard deviation (error bars or shading) should be added to all figures where mean traces are plotted.
Throughout the paper, I am a bit confused by the time axis; the data consistently starts at 1 minute. This is not intuitive to me, because it seems that the blue light being applied to the cells is also the excitation laser for ThT- in that case, shouldn't the first imaging frame be at time 0 (when the blue light is first applied)? Or is there an additional exposure of blue light 1 minute before imaging starts? This is consequential because it impacts the measured time to the first spike. (Additionally, all of the video time stamps start at 0).
Please increase the size of the scale bars and bar labels throughout, especially in Figure 2A and S4A.
In Figure 1B and D, it would help to decrease the opacity on the individual traces so that more of them can be discerned. It would also improve clarity to have data from the different experiments shown with different colored lines, so that variability between experiments can be clearly visualized.
Results in Figure 1E would be easier to interpret if the frequency were normalized to total N. It is hard to tell from this graph whether the edges and bin widths are the same between the data sets, but if not, they should be. Also, it would help to reduce the opacity of the sparse cell data set so that the full microcluster data set can be seen as well.
Biofilm images are shown in Figures 2A, S3A, and Video S3- these are all of the same biofilm. Why not take the opportunity to show different experimental replicates in these different figures? The same goes for Figure S4A and Video S6B, which again are of the same biofilm.
Figure 2C would be much easier to read if the curves were colored in order of their size; the same is true for Figure 4A and irradiance.
The complementation data in Figure S3D should be moved to the main text figure 3 alongside the data about the corresponding knockout to make it easier to compare the curves.
Figure S3E: Is the Y-axis in this graph mislabeled? It is labeled as ThT fluorescence, but it seems that it is reporting fluorescence from the calcium indicator?
Video S6B is very confusing - why does the video play first forwards and then backwards? Unless I am looking very carefully at the time stamps it is easy to misinterpret this as a rise in the intensity at the end of the experiment. Without a video legend, it's hard to understand this, but I think it would be much more straightforward to interpret if it only played forward. (Also, why is this video labeled 6B when there is no video 6A?)
-
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
This paper presents a comprehensive study of how neural tracking of speech is affected by background noise. Using five EEG experiments and Temporal response function (TRF), it investigates how minimal background noise can enhance speech tracking even when speech intelligibility remains very high. The results suggest that this enhancement is not attention-driven but could be explained by stochastic resonance. These findings generalize across different background noise types and listening conditions, offering insights into speech processing in real-world environments. I find this paper well-written, the experiments and results are clearly described. However, I have a few comments that may be useful to address.
I thank the reviewer for their positive feedback.
(1) The behavioral accuracy and EEG results for clear speech in Experiment 4 differ from those of Experiments 1-3. Could the author provide insights into the potential reasons for this discrepancy? Might it be due to linguistic/acoustic differences between the passages used in experiments? If so, what was the rationale behind using different passages across different experiments?
The slight differences in behavior and EEG magnitudes may be due to several factors. Different participants took part in the different experiments (with some overlap). Stories and questions were generated using ChatGPT using the same approach, but different research assistants have supported story and question generation, and ChatGPT advanced throughout the course of the study, such that different versions were used over time (better version control was only recently introduced by OpenAI). The same Google voice was used for all experiments, so this cannot be a factor. Most critically, within each experiment, assignment of speech-clarity conditions to different stories was randomized, such that statistical comparisons are unaffected by these minor differences between experiments. The noise-related enhancement generalizes across all experiments, showing that minor differences in experimental materials do not impact it.
(2) Regarding peak amplitude extraction, why were the exact peak amplitudes and latencies of the TRFs for each subject not extracted, and instead, an amplitude average within a 20 ms time window based on the group-averaged TRFs used? Did the latencies significantly differ across different SNR conditions?
Estimation of peak latency can be challenging if a deflection is not very pronounced in a participant. Especially the N1 was small for some conditions. Using the mean amplitude in a specific time window is very common practice in EEG research that mitigates this issue. Another, albeit less common, approach is to use a Jackknifing procedure to estimate each participant’s latencies (Smulders 2010 Psychophysiology; although this may sometimes not work well). For the revision, I used the Jackknifing approach to estimate peak latencies for each participant and condition, and extracted the mean amplitude around the peak latency. As expected, this approach provides very similar effects as reported in the main article, here exemplified for Experiments 1 and 2. The results are thus not affected by this data analysis choice. The estimated latencies differed across SNRs, e.g., the N1 increased with decreasing SNR (this is less surprising/novel and was thus not added to the manuscript to avoid increasing the amount of information).
Author response image 1.
P1-minus-N1 amplitude for Experiment 1 and 2, using amplitudes centered on individually estimated peak latencies. The asterisk indicates a significant difference from the clear speech condition (FDR-thresholded).
(3) How is neural tracking quantified in the current study? Does improved neural tracking correlate with EEG prediction accuracy or individual peak amplitudes? Given the differing trends between N1 and P2 peaks in babble and speech-matched noise in experiment 3, how is it that babble results in greater envelope tracking compared to speech-matched noise?
Neural tracking is generally used for responses resulting from TRF analyses, cross-correlations, or coherence, where the speech envelope is regressed against the brain signals (see review of Brodbeck & Simon 2020 Current Opinion in Physiology). Correlations between EEG prediction accuracy and individual peak amplitudes were not calculated because the data used for the analyses are not independent. The EEG prediction accuracy essentially integrates information over a longer time interval (here 0–0.4 s), whereas TRF amplitudes are more temporally resolved. If one were to shorten the time interval (e.g., 0.08–0.12 s), then EEG prediction accuracy would look more similar to the TRF results (because the TRF is convolved with the amplitude-onset envelope of the speech [predicted EEG] before calculating the EEG prediction accuracy). Regarding the enhancement difference between speech-matched noise and babble, I have discussed a possible interpretation in the discussion section. The result is indeed surprising, but it replicates across two experiments (Experiments 3 and 4), and is consistent with previous work using speech-matched noise that did not find the enhancement. I reproduce the part of the discussion here.
“Other work, using a noise masker that spectrally matches the target speech, have not reported tracking enhancements (Ding and Simon, 2013; Zou et al., 2019; Synigal et al., 2023). However, in these works, SNRs have been lower (<10 dB) to investigate neural tracking under challenging listening conditions. At low SNRs, neural speech tracking decreases (Ding and Simon, 2013; Zou et al., 2019; Yasmin et al., 2023; Figures 1 and 2), thus resulting in an inverted u-shape in relation to SNR for attentive and passive listening (Experiments 1 and 2).”
“The noise-related enhancement in the neural tracking of the speech envelope was greatest for 12-talker babble, but it was also present for speech-matched noise, pink noise, and, to some extent, white noise. The latter three noises bare no perceptional relation to speech, but resemble stationary, background buzzing from industrial noise, heavy rain, waterfalls, wind, or ventilation. Twelve-talker babble – which is also a stationary masker – is clearly recognizable as overlapping speech, but words or phonemes cannot be identified (Bilger, 1984; Bilger et al., 1984; Wilson, 2003; Wilson et al., 2012b). There may thus be something about the naturalistic, speech nature of the background babble that facilitates neural speech tracking.”
“Twelve-talker babble was associated with the greatest noise-related enhancement in neural tracking, possibly because the 12-talker babble facilitated neuronal activity in speech-relevant auditory regions, where the other, non-speech noises were less effective.”
(4) The paper discusses how speech envelope-onset tracking varies with different background noises. Does the author expect similar trends for speech envelope tracking as well? Additionally, could you explain why envelope onsets were prioritized over envelope tracking in this analysis?
The amplitude-onset envelope was selected because several previous works have used the amplitude-onset envelope, our previous work that first observed the enhancement also used the amplitude-onset envelope, and the amplitude-onset envelope has been suggested to work better for speech tracking. This was added to the manuscript. For the manuscript revision, analyses were calculated for the amplitude envelope, largely replicating the results for the amplitude-onset envelope. The results for the amplitude envelope are now presented in the Supplementary Materials and referred to in the main text.
“The amplitude-onset envelope was selected because a) several previous works have used it (Hertrich et al., 2012; Fiedler et al., 2017; Brodbeck et al., 2018a; Daube et al., 2019; Fiedler et al., 2019), b) our previous work first observing the enhancement also used the amplitude-onset envelope (Yasmin et al., 2023; Panela et al., 2024), and c) the amplitude-onset envelope has been suggested to elicit a strong speech tracking response (Hertrich et al., 2012). Results for analyses using the amplitude envelope instead of the amplitude-onset envelope show similar effects and are provided in the Supplementary Materials (Figure 1-figure supplement 1).”
Recommendations for the authors:
(1) Include all relevant parameters related to data analysis where applicable. For example, provide the filter parameters (Line 154, Line 177, Line 172), and the default parameters of the speech synthesizer (Line 131).
Additional filter information and parameter values are provided in the revised manuscript.
(2) Please share the data and codes or include a justification as to why the data cannot be shared.
Data and code are provided on OSF (https://osf.io/zs9u5/). A materials availability statement has been added to the manuscript.
Reviewer #2 (Public review):
The author investigates the role of background noise on EEG-assessed speech tracking in a series of five experiments. In the first experiment, the influence of different degrees of background noise is investigated and enhanced speech tracking for minimal noise levels is found. The following four experiments explore different potential influences on this effect, such as attentional allocation, different noise types, and presentation mode. The step-wise exploration of potential contributors to the effect of enhanced speech tracking for minimal background noise is compelling. The motivation and reasoning for the different studies are clear and logical and therefore easy to follow. The results are discussed in a concise and clear way. While I specifically like the conciseness, one inevitable consequence is that not all results are equally discussed in depth. Based on the results of the five experiments, the author concludes that the enhancement of speech tracking for minimal background noise is likely due to stochastic resonance. Given broad conceptualizations of stochastic resonance as a noise benefit this is a reasonable conclusion. This study will likely impact the field as it provides compelling support questioning the relationship between speech tracking and speech processing.
I thank the reviewer for the positive review and thoughtful feedback.
Recommendations for the authors:
As mentioned in the public review, I like the conciseness. However, some points might benefit from addressing them.
(1) The absence of comprehension effects is on the one hand surprising, as the decreased intelligibility should (theoretically) be visible in this data. On the other hand, from my own experience, the generation of "good" comprehension questions is quite difficult. While it is mentioned in the methods section, that comprehension accuracy and gist rating go hand in hand, this is not the case here. I am wondering if the data here should be rather understood as "there is no difference in intelligibility" or that comprehension assessment via comprehension questions is potentially not a valid measure.
I assume that the reviewer refers to Experiment 1, where SNRs approximately below 15 dB led to reduced gist ratings (used as a proxy for speech intelligibility; Davis and Johnsrude, 2003, J Neurosci; Ritz et al., 2022, J Neurosci). That story comprehension accuracy does not decrease could be due to the comprehension questions themselves (as indicated by the reviewer, “good” questions can be hard to generate, potentially having low sensitivity). On the other hand, speech for the most difficult SNR was still ‘reasonably’ intelligible (gist ratings suggest ~85% of words could be understood), and participants may still have been able to follow the thread of the story. I do not further discuss this point in the manuscript, since it is not directly related to the noise-related enhancement in the neural tracking response, because the enhancement was present for high SNRs for which gist ratings did not show a difference relative to clear speech (i.e., 20 dB and above).
(2) However, if I understood correctly, the "lower" manipulation (same RMS for the whole sound stimulus) of experiment 3 was, what was also used in experiment 1. In experiment 3, unlike 1, there are comprehension effects. I wondered if there are ideas about why that is.
Yes indeed, the ‘lower’ manipulation in Experiment 3 was also used in Experiments 1, 2, 4, and 5. The generation of the stimulus materials was similar across experiments. However, a new set of stories and comprehension questions was used for each experiment and the participants differed as well (with some overlap). These aspects may have contributed to the difference.
(3) Concerning the prediction accuracy, for a naive reader, some surrounding information would be helpful: What is the purpose/expectation of this measure? Is it to show that all models are above chance?
EEG prediction accuracy was included here, mainly because it is commonly used in studies using TRFs. A reader may wonder about EEG prediction accuracy if it were not reported. The hypotheses of the current study are related to the TRF weights/amplitude. This was added to the manuscript.
“EEG prediction accuracy was calculated because many previous studies report it (e.g., Decruy et al., 2019; Broderick et al., 2021; Gillis et al., 2021; Weineck et al., 2022; Karunathilake et al., 2023), but the main focus of the current study is on the TRF weights/amplitude.”
(4) Regarding the length of training and test data I got confused: It says per story 50 25-s snippets. As the maximum length of a story was 2:30 min, those snippets were mostly overlapping, right? It seems that depending on the length of the story and the "location within the time series" of the snippets, the number of remaining non-over-lapping snippets is variable. Also, within training, the snippets were overlapping, correct? Otherwise, the data for training would be too short. Again, as a naive reader, is this common, or can overlapping training data lead to overestimations?
The short stories made non-overlapping windows infeasible, but the overlap is unlikely to affect the current results. Using cross-correlation (Hertrich et al 2012 Psychophysiology; which is completely independent for different snippets) instead of TRFs shows the same results (now provided in the supplementary materials). In one of our previous studies where the enhancement was first observed (Yasmin et al. 2023 Neuropsychologia), non-overlapping data were used because the stories were longer. This makes any meaningful impact of the overlap very unlikely. Critically, speech-clarity levels were randomized and all analyses were conducted in the same way for all conditions, thus not confounding any of the results/conclusions. The methods section was extended to further explain the choice of overlapping data snippets.
“Speech-clarity levels were randomized across stories and all analyses were conducted similarly for all conditions. Hence, no impact of overlapping training data on the results is expected (consistent with noise-related enhancements observed previously when longer stories and non-overlapping data were used; Yasmin et al., 2023). Analyses using cross-correlation, for which data snippets are treated independently, show similar results compared to those reported here using TRFs (Figure 1-figure supplement 2).”
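To make the overlap concrete, the following is a minimal sketch, not the author's analysis code. It assumes evenly spaced snippet start times, which is an illustrative assumption (how start times were chosen is not stated here); the function names `max_nonoverlapping` and `neighbour_overlap` are hypothetical. For a story of 2:30 min (150 s), at most six 25-s snippets fit without overlap, so 50 snippets necessarily overlap heavily.

```python
def max_nonoverlapping(duration_s: float, snippet_s: float) -> int:
    """Largest number of snippets of length snippet_s that fit without overlap."""
    return int(duration_s // snippet_s)


def neighbour_overlap(duration_s: float, snippet_s: float, n_snippets: int) -> float:
    """Fraction of a snippet shared with the next one.

    Assumption (not from the source): snippet start times are evenly spaced
    between 0 and duration_s - snippet_s.
    """
    if n_snippets < 2:
        return 0.0
    # Distance between consecutive snippet start times.
    hop = (duration_s - snippet_s) / (n_snippets - 1)
    return max(0.0, 1.0 - hop / snippet_s)


if __name__ == "__main__":
    print(max_nonoverlapping(150, 25))              # non-overlapping snippets that fit
    print(round(neighbour_overlap(150, 25, 50), 3)) # overlap between neighbours
```

Under these assumptions, neighbouring snippets of a 150-s story share roughly 90% of their samples, which is why the cross-correlation control and the earlier non-overlapping analysis matter for the argument.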
(5) For experiment 1, three stories were clear, while the other 21 conditions were represented by one story each. Presumably, the ratio of 3:1 can affect TRFs?
TRFs were calculated for each story individually and then averaged across three stories: either three clear stories, or three stories in babble for neighboring SNRs. Hence, the same number of TRFs were averaged for clear and noise conditions, avoiding exactly this issue. This was described in the methods section and is reproduced here:
“Behavioral data (comprehension accuracy, gist ratings), EEG prediction accuracy, and TRFs for the three clear stories were averaged. For the stories in babble, a sliding average across SNR levels was calculated for behavioral data, EEG prediction accuracy, and TRFs, such that data for three neighboring SNR levels were averaged. Averaging across three stories was calculated to reduce noise in the data and match the averaging of three stories for the clear condition.”
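The sliding average described above can be sketched as follows. This is a minimal illustration, assuming the per-story values are ordered by SNR and that only complete three-level windows are averaged; how the edges of the SNR range were handled is not stated here, and `sliding_mean` is a hypothetical name.

```python
def sliding_mean(values, window=3):
    """Average each run of `window` neighbouring values (complete windows only).

    `values` is assumed to be ordered by SNR level, one value per story.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Toy values standing in for one measure (e.g., a TRF amplitude) per SNR level.
    print(sliding_mean([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Each output value averages three stories, matching the three clear stories that are averaged for the clear condition.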
(6) Was there an overlap in participants?
Some participants took part in several of the experiments in separate sessions on separate days. This was added to the manuscript.
“Several participants took part in more than one of the experiments, in separate sessions on separate days: 7, 7, 9, 9, and 14 (for Experiments 1-5, respectively) participated only in one experiment; 3 individuals participated in all 5 experiments; 68 unique participants took part across the 5 experiments.”
(7) Can stochastic resonance also explain inverted U-shape results with vocoded speech?
This is an interesting question. Distortions to the neural responses to noise-vocoding may reflect internal noise, but this would require additional research. For example, the Hauswald study (2022 EJN), showing enhancements due to noise-vocoding, used vocoding channels that also reduced speech intelligibility. The study would ideally be repeated with a greater number of vocoding channels to make sure the effects are not driven by increased attention due to reduced speech intelligibility. I did not further discuss this in detail in the manuscript as it would go too far away from the experiments of the current study.
(8) Typo in the abstract: box sexes is probably meant to say both sexes?
This text was removed, because more detailed gender identification is reported in the methods, and the abstract needed shortening to meet the eLife guidelines.
Reviewing Editor Comments:
Interesting series of experiments to assess the influence of noise on cortical tracking in different conditions, interpreting the results with the mechanism of stochastic resonance.
I thank the editor for their encouraging feedback.
For experiment 2, the author wishes to exclude the role of attention, by making participants perform a visual task. Data from low performers on the visual task was excluded, to avoid that participants attended the spoken speech. However, from the high performers on the visual task, how can you be sure that they did not pay attention to the auditory stimuli as well (as auditory attention is quite automatic, and these participants might be good at dividing their attention)? I understand that you can not ask participants about the auditory task during the experiment, but did you ask AFTER the experiment whether they were able to understand the stimuli? I think this is crucial for your interpretation.
Participants were not asked whether they were able to understand the stimuli. Participants would be unlikely to invest effort/attention in understanding the stories in babble without a speech-related task. Nevertheless, for follow-up analyses, I removed participants who performed above 0.9 in the visual task (i.e., the high performers), and the difference between clear speech and speech in babble replicates. In the plots, data from all babble conditions above 15 dB SNR (highly intelligible) were averaged, but the results look almost identical if all SNRs are averaged. Moreover, the correlation between visual task performance and the babble-related enhancement was not significant. These analyses were added to the Supplementary Materials (Figure 2-figure supplement 1).
Statistics: inconsistencies across experiments with a lot of simple tests (FDR corrected) and in addition sometimes rmANOVA added - if interactions in rmANOVA are not significant then all the simple tests might not be warranted. So a bit of double dipping and over-testing here, but on the whole the conclusions do not seem to be overstated.
The designs of the different experiments differed, thus requiring different statistical approaches. Moreover, the different tests assess different comparisons. For all experiments, contrasting the clear condition to all noise conditions was the main purpose of the experiments. To correct for multiple comparisons, the False Discovery Rate correction was used. Repeated-measures ANOVAs were conducted in addition to this (excluding the clear condition because it would not fit into a factorial structure, e.g., Experiment 3, or to avoid analyzing it twice, e.g., Experiment 5) to investigate differences between different noise conditions. There was thus no over-testing in the presented study.
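As an illustration of the multiple-comparison step described above, the following is a minimal sketch of a False Discovery Rate correction. It assumes the Benjamini-Hochberg step-up procedure, which the response does not name explicitly; the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if rejected at FDR level q.

    Assumption: Benjamini-Hochberg step-up procedure (the response only
    says "False Discovery Rate correction").
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Reject every hypothesis ranked at or below max_rank.
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for contrasts of the clear condition against
# each noise condition (illustrative values only).
example_p = [0.001, 0.012, 0.025, 0.045, 0.200]
print(benjamini_hochberg(example_p))
```

Because the procedure is step-up, a comparison can survive correction even when a smaller-ranked p-value misses its own threshold, which is why the largest qualifying rank is searched for rather than stopping at the first failure.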
Small points:
Question on methods: For each story, 50 25-s data snippets were extracted (Page 7, line 190). As you have stories with a duration of 1.5 to 2 minutes, does that mean there is a lot of overlap across data snippets? How does that influence the TRF/prediction accuracy?
The short stories made non-overlapping windows infeasible, but the overlap is unlikely to affect the current results. Using cross-correlation (Hertrich et al 2012 Psychophysiology; which is completely independent for different snippets) instead of TRFs shows the same results (newly added Figure 1-figure supplement 2). In one of our previous studies where the enhancement was first observed (Yasmin et al. 2023 Neuropsychologia), non-overlapping data were used because the stories were longer. This makes any meaningful impact of the overlap very unlikely. Critically, speech-clarity levels were randomized and all analyses were conducted in the same way for all conditions, thus not confounding any of the results/conclusions. The methods section was extended to further explain the choice of overlapping data snippets.
“Overlapping snippets in the training data were used to increase the amount of data in the training given the short duration of the stories. Speech-clarity levels were randomized across stories and all analyses were conducted similarly for all conditions. Hence, no impact of overlapping training data on the results is expected (consistent with noise-related enhancements observed previously when longer stories and non-overlapping data were used; Yasmin et al., 2023). Analyses using cross-correlation, for which data snippets are treated independently, show similar results compared to those reported here using TRFs (Figure 1-figure supplement 2).”
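To make the extent of the overlap concrete, the arithmetic can be sketched as follows. This assumes the 50 snippets of 25 s are spaced evenly across each story; the response does not state how snippet onsets were chosen, so the spacing rule and both function names are hypothetical.

```python
def snippet_starts(story_duration_s, n_snippets=50, snippet_len_s=25.0):
    """Onset times (s) of evenly spaced snippets covering the story.

    Assumption: even spacing from the story start to the last possible
    onset; the actual placement used in the study is not specified.
    """
    if story_duration_s < snippet_len_s:
        raise ValueError("story is shorter than one snippet")
    if n_snippets == 1:
        return [0.0]
    last_start = story_duration_s - snippet_len_s
    step = last_start / (n_snippets - 1)
    return [i * step for i in range(n_snippets)]


def adjacent_overlap_fraction(story_duration_s, n_snippets=50, snippet_len_s=25.0):
    """Fraction of a snippet shared with the next one under even spacing."""
    starts = snippet_starts(story_duration_s, n_snippets, snippet_len_s)
    if len(starts) < 2:
        return 0.0
    step = starts[1] - starts[0]
    return max(0.0, (snippet_len_s - step) / snippet_len_s)


for duration in (90.0, 120.0):  # story durations of 1.5 and 2 minutes
    print(duration, round(adjacent_overlap_fraction(duration), 3))
```

Under this assumption, neighbouring snippets of a 90-s story share roughly 95% of their samples, whereas only three non-overlapping 25-s windows would fit into the same story, which illustrates why overlapping snippets were needed to obtain enough training data.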
Results Experiment 3: page 17, line 417: no differences were found between clear speech and masked speech - is this a power issue (as it does look different in the figure, Figure 4b)?
I thank the editor for pointing this out. Indeed, I made a minor mistake. Two comparisons were significant after FDR-thresholding. This is now included in the revised Figure 4. I also checked the other analyses and confirmed that the mistake was not present there.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public review):
This paper examines the role of MLCK (myosin light chain kinase) and MLCP (myosin light chain phosphatase) in axon regeneration. Using loss-of-function approaches based on small molecule inhibitors and siRNA knockdown, the authors explore axon regeneration in cell culture and in animal models from central and peripheral nervous systems. Their evidence shows that MLCK activity facilitates axon extension/regeneration, while MLCP prevents it.
Major concerns:
(1) In the title, authors indicate that the observed effects from loss-of-function of MLCK/MLCP take place via F-actin redistribution in the growth cone. However, there are no experiments showing a causal effect between changes in axon growth mediated by MLCK/MLCP and F-actin redistribution.
Thank you for your comments. We revised the title of our manuscript to “MLCK/MLCP regulates mammalian axon regeneration and redistributes the growth cone F-actin”. (line 3)
(2) The author combines MLCK inhibitors with Bleb (Figure 6), trying to verify if both pairs of inhibitors act on the same target/pathway. MLCK may regulate axon growth independent of NMII activity. However, this has very important implications for the understanding not only of how NMII works and affects axon extension, but also in trying to understand what MLCP is doing. One wonders if MLCP actions, which are opposite of MLCK, are also independent of NMII activity? The authors, in the discussion section, try to find an explanation for this finding, but I consider it fails since the whole rationale of the manuscript is still around how MLCK and MLCP affect NMII phosphorylation.
Thank you for your comments. Although both MLCK and MLCP regulate the activity of NMII, it has been reported that they also govern domain-specific spatial control of actin-based motility in the growth cone. Specifically, MLCK activity is essential for arc translocation and retrograde flow within the P domain, while MLCP appears to specifically modulate arc movement and associated myosin II contractility in the T zone and C domain (Ref). Therefore, it is proposed that the regulatory mechanisms of MLCK and MLCP are highly complex during the process of axon growth.
[Ref]:Xiao-Feng Zhang, Andrew W Schaefer, Dylan T Burnette, Vincent T Schoonderwoert, Paul Forscher. Rho-dependent contractile responses in the neuronal growth cone are independent of classical peripheral retrograde actin flow. Neuron. 2003 Dec 4;40(5):931-44.
What follows is a discussion of the merits and limitations of different claims of the manuscript in light of the evidence presented.
(1) Using western blot and immunohistochemical analyses, authors first show that MLCK expression is increased in DRG sensory neurons following peripheral axotomy, concomitant to an increase in MLC phosphorylation, suggesting a causal effect (Figure 1). The authors claim that it is common that axon growth-promoting genes are upregulated. It would have been interesting at this point to study in this scenario the regulation of MLCP.
We thank the reviewer for the positive comment on our manuscript.
(2) Using DRG cultures and sciatic nerve crush in the context of MLCK inhibition (ML-7) and down-regulation, authors conclude that MLCK activity is required for mammalian peripheral axon regeneration both in vitro and in vivo (Figure 2). In parallel, the authors show that these treatments affect as expected the phosphorylation levels of MLC.
The in vitro evidence is of standard methods and convincing. However, here, as well as in all other experiments using siRNAs, no Control siRNAs were used. Authors do show that the target protein is downregulated, and they can follow transfected cells with GFP. Still, it should be noted that the standard control for these experiments has not been done.
Thank you for your comments. We utilized scrambled siRNA as a control. I sincerely apologize for the oversight in the manuscript; although we mentioned that scrambled siRNA was used as a control in the figure legends, we failed to clearly articulate this important information in the methods section. We have revised the manuscript accordingly. (line 87, line 549, line, line 562, line 568).
(3) The authors then examined the role of the phosphatase MLCP in axon growth during regeneration. The authors first use a known MLCP blocker, phorbol 12,13-dibutyrate (PDBu), to show that is able to increase the levels of p-MLC, with a concomitant increase in the extent of axon regrowth of DRG neurons, both in permissive as well as non-permissive substrates. The authors repeat the experiments using the knockdown of MYPT1, a key component of the MLC-phosphatase, and again can observe a growth-promoting effect (Figure 3).
The authors further show evidence for the growth-enhancing effect in vivo, in nerve crush experiments. The in vivo evidence deserves more support and experimental details (see comment 2). A key weakness of the data was mentioned previously: no control siRNA was used.
Thank you for your comments. As mentioned above, we used scrambled siRNA as a control in the in vivo experiments as well.
(4) In the next set of experiments (presented in Figure 4) authors extend the previous observations in primary cultures from the CNS. For that, they use cortical and hippocampal cultures, and pharmacological and genetic loss-of-function using the above-mentioned strategies. The expected results were obtained in both CNS neurons: inhibition or knockdown of the kinase decreases axon growth, whereas inhibition or knockdown of the phosphatase increases growth. A main weakness in this set is that drugs were used from the beginning of the experiment, and hence, they would also affect axon specification. As pointed out in Materials and Methods (lines 143-145), authors counted as "axons" neurites longer than twice the diameter of the cell soma, and hence would not affect the variable measured. In any case, to be sure one is only affecting axon extension in these cells, the drugs should have been used after axon specification and maturation, which occurs at least after 5 DIV.
Thank you for your comments. We acknowledge that the early administration of drugs can lead to unintended effects on neuronal polarization and axon formation. However, in line with our previous publications, we focused exclusively on measuring the length of the longest axon. To quantify axon length, we selected neurons exhibiting an axonal process exceeding twice the diameter of their cell body and measured the longest axon from 100 neurons for each condition (Ref 1, Ref 2). Consequently, we believe that drug administration at the onset of cell culture influences axon formation; however, it does not significantly affect the drug's impact on axon length.
[Ref 1]: Chang-Mei Liu, Rui-Ying Wang, Saijilafu, Zhong-Xian Jiao, Bo-Yin Zhang, Feng-Quan Zhou. MicroRNA-138 and SIRT1 form a mutual negative feedback loop to regulate mammalian axon regeneration. Genes Dev. 2013 Jul 1;27(13):1473-83.
[Ref 2]: Eun-Mi Hur, Saijilafu, Byoung Dae Lee, Seong-Jin Kim, Wen-Lin Xu, Feng-Quan Zhou. GSK3 controls axon growth via CLASP-mediated regulation of growth cone microtubules. Genes Dev. 2011 Sep 15;25(18):1968-81.
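The selection rule described in this response (keep neurons whose longest process exceeds twice the cell body diameter, then record the longest axon) can be sketched as below. This is only an illustration of the stated criterion; the data layout, field names, and measurements are hypothetical and do not come from the manuscript.

```python
def longest_axon_lengths(neurons, min_ratio=2.0):
    """Return the longest-process length for each neuron passing the criterion.

    Each neuron is a dict with hypothetical keys:
      'soma_diameter_um'   - cell body diameter in micrometres
      'neurite_lengths_um' - list of traced process lengths in micrometres
    A neuron is kept only if its longest process exceeds
    min_ratio * soma diameter (twice the diameter by default).
    """
    lengths = []
    for neuron in neurons:
        if not neuron["neurite_lengths_um"]:
            continue  # no processes traced for this cell
        longest = max(neuron["neurite_lengths_um"])
        if longest > min_ratio * neuron["soma_diameter_um"]:
            lengths.append(longest)
    return lengths


# Illustrative measurements only.
example_neurons = [
    {"soma_diameter_um": 20.0, "neurite_lengths_um": [120.0, 35.0]},
    {"soma_diameter_um": 25.0, "neurite_lengths_um": [40.0, 30.0]},  # excluded
    {"soma_diameter_um": 18.0, "neurite_lengths_um": [55.0]},
    {"soma_diameter_um": 22.0, "neurite_lengths_um": []},            # excluded
]
print(longest_axon_lengths(example_neurons))
```

Note that a criterion of this kind conditions the measurement on a process having already formed, which is the point at issue between the reviewer and the authors regarding effects on axon specification.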
(5) In Figure 7, the authors claim a local cytoskeletal action of the drug, but the evidence provided does not differentiate between a localized action of the drugs and a localized cell activity.
We appreciate the reviewer’s insightful comments and have revised our title to “MLCK/MLCP regulates mammalian axon regeneration and redistributes the growth cone F-actin.” Furthermore, we have made corresponding revisions to the manuscript (line 31, line 73).
References:
(1) Eun-Mi Hur, In Hong Yang, Deok-Ho Kim, Justin Byun, Saijilafu, Wen-Lin Xu, Philip R Nicovich, Raymond Cheong, Andre Levchenko, Nitish Thakor, Feng-Quan Zhou. 2011. Engineering neuronal growth cones to promote axon regeneration over inhibitory molecules. Proc Natl Acad Sci U S A. 2011 Mar 22;108(12):5057-62. doi: 10.1073/pnas.1011258108.
(2) Garrido-Casado M, Asensio-Juárez G, Talayero VC, Vicente-Manzanares M. 2024. Engines of change: Nonmuscle myosin II in mechanobiology. Curr Opin Cell Biol. 2024 Apr;87:102344. doi: 10.1016/j.ceb.2024.102344.
(3) Karen A Newell-Litwa, Rick Horwitz, Marcelo L Lamers. 2015. Non-muscle myosin II in disease: mechanisms and therapeutic opportunities. Dis Model Mech. 2015 Dec;8(12):1495-515. doi: 10.1242/dmm.022103.
Reviewer #2 (Public review):
Summary:
Saijilafu et al. demonstrate that MLCK/MLCP proteins promote axonal regeneration in both the central nervous system (CNS) and peripheral nervous system (PNS) using primary cultures of adult DRG neurons, hippocampal and cortical neurons, as well as in vivo experiments involving sciatic nerve injury, spinal cord injury, and optic nerve crush. The authors show that axon regrowth is possible across different contexts through genetic and pharmacological manipulation of these proteins. Additionally, they propose that MLCK/MLCP may regulate F-actin reorganization in the growth cone, which is significant as it suggests a novel strategy for promoting axonal regeneration.
Strengths:
This manuscript presents a wide range of experimental models to address its hypothesis and biological question. Notably, the use of multiple in vivo models significantly enhances the overall validity of the study.
We thank the reviewer for the positive comment on our manuscript.
Weaknesses:
- The authors previously published that blocking myosin II activity stimulates axonal growth and that MLCK activates myosin II. The present work shows that inhibiting MLCK blocks axonal regeneration while blocking MLCP (the protein that dephosphorylates myosin II) produces the opposite effect. Although this contradiction is discussed, no new evidence has been added to the manuscript to clarify this mechanism or address the remaining questions. Critical unresolved questions include: what happens to myosin II expression when both MLCK and MLCP are inhibited? If MLCK/MLCP are acting through an independent mechanism, what would that mechanism be?
- In the discussion, the authors mention the existence of two myosin II isoforms with opposing effects on axonal growth. Still, there is no evidence in the manuscript to support this point.
- It is also unclear how MLCK/MLCP acts on the actin cytoskeleton. The authors suggest that proteins such as ADF/cofilin, Arp 2/3, Eps8, Profilin, Myosin II, and Myosin V could regulate changes in F-actin dynamics. However, this study provides no experimental evidence to determine which proteins may be involved in the mechanism.
Thank you for your comments. Axon growth is an exceptionally intricate process, facilitated by the coordinated regulation of gene expression in the soma, axonal transport along the shaft, and the assembly of cytoskeletal elements and membrane proteins at the growth cone. In this paper, our results primarily demonstrate that MLCK/MLCP plays a crucial role in regulating mammalian axon regeneration and redistributing F-actin within the growth cone; however, we did not investigate which specific proteins act downstream of MLCK/MLCP during axon regeneration.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
- A title more suitable for the evidence shown can be: MLCK/MLCP regulates mammalian axon regeneration and redistributes the growth cone F-actin.
Thank you for your comments. We revised the title of our manuscript to “MLCK/MLCP regulates mammalian axon regeneration and redistributes the growth cone F-actin” (line 3).
- In Figure 3, it would be useful to indicate in the figure legend that the red arrow is pointing to a suture that was performed during surgery to clearly mark the injury site.
Thank you for your comments. We revised the Figure 3 legend to indicate that the red arrow points to a suture that was performed during surgery to clearly mark the injury site (line 571-572).
- The following is a concern raised in the previous round; the response by the authors was so complete and accurate that I consider it would be useful to include it in the discussion section.
Thank you for your comments. We included those contents in the discussion section of our revised manuscript (line 348-354, line 355-359).
The author combines MLCK inhibitors with Bleb (Figure 6), trying to verify if both pairs of inhibitors act on the same target/pathway. The rationale is wrong for at least two reasons.
a- Because both lines of evidence point to contrasting actions of NMII on axon growth, one approach could never "rescue" the other.
Reply by authors in R1: If MLCK regulates axon growth through the activation of Myosin, the inhibitory effect of ML-7 (an MLCK inhibitor) on axon growth might be influenced by Bleb, a NMII inhibitor. However, our findings reveal that the combination of Bleb and ML-7 does not alter the rate of axon outgrowth compared to ML-7 alone. This suggests that the roles of ML-7 and Bleb in axon growth are independent. It means MLCK may regulate axon growth independent of NMII activity.
b- Because the approaches target different steps on NMII activation, one could never "prevent" or rescue the other. For example, for Bleb to provide a phenotype, it should find any p-MLC, because it is only that form of MLC that is capable of inhibiting its ATPase site. In light of this, it is not surprising that Bleb is unable to exert any action in a situation where there is no p-MLC (ML-7, which by inhibiting the kinase drives the levels of p-MLC to zero, Figure 4A). Hence, the results are not possible to validate in the current general interpretation of the authors. (See 'major concern').
Reply by authors in R1: The reported mechanism of blebbistatin is not through competition with the ATP binding site of myosin. Instead, it selectively binds to the ATPase intermediate state associated with ADP and inorganic phosphate, which decelerates the phosphate release. Importantly, blebbistatin does not impede myosin's interaction with actin or the ATP-triggered disassociation of actomyosin. It rather inhibits the myosin head when it forms a product complex with a reduced affinity for actin. This indicates that blebbistatin functions by stabilizing a particular myosin intermediate state that is independent of the phosphorylation status of myosin light chain (MLC).
[Ref] Kovács M, Tóth J et al. Mechanism of blebbistatin inhibition of myosin II. J Biol Chem. 2004 Aug 20;279(34):35557-63.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Liu et al., present an immersion objective adapter design called RIM-Deep, which can be utilized for enhancing axial resolution and reducing spherical aberrations during inverted confocal microscopy of thick cleared tissue.
Strengths:
RI mismatches present a significant challenge to deep tissue imaging, and developing a robust immersion method is valuable in preventing losses in resolution. Liu et al., present data showing that RIM-Deep is suitable for tissue cleared with two different clearing techniques, demonstrating the adaptability and versatility of the approach.
Thank you for your feedback. In fact, we used three distinct clearing techniques (iDISCO, CUBIC, and MACS) to substantiate the adaptability and versatility of the RIM-Deep adapter.
Weaknesses:
Liu et al., claim to have developed a useful technique for deep tissue imaging, but in its current form, the paper does not provide sufficient evidence that their technique performs better than existing ones.
We are in complete agreement with your recommendation. In additional experiments, we will thoroughly compare the RIM-Deep adapter with the official adapter in fluorescence bead experiments, as well as their performance with the CUBIC and MACS tissue clearing techniques.
Reviewer #1 (Recommendations for the authors):
Suggestions for improvement:
Major revisions:
(1) For the bead experiment, the comparison was made to a 10X dry objective instead of an immersion objective, please make a comparison to the standard immersion objective.
Thank you for your suggestion. We fully agree that a comparison with the standard immersion objective should be made. We plan to conduct this comparison in future experiments and will thoroughly analyze the imaging differences between the official adapter and the RIM-Deep adapter.
(2) It is unclear if an accurate comparison of objectives (same NA etc) is being made in Fig 1G-J, since the official adapter image appears to be of lower resolution even at the surface. At the very least, progressive 2D slices of the reconstruction must be shown for both adapters instead of just the RIM-Deep adapter.
Thank you for your suggestion. We strictly controlled the numerical aperture (NA) of the objectives in Fig 1G-J to ensure the accuracy of the comparison, and the imaging resolution of the official adapter is consistent with that of the RIM-Deep adapter. We agree that showing progressive 2D slices of the reconstruction would provide a more comprehensive comparison of the two adapters.
(3) Similarly, since there already exists an official adapter, it would be useful to see that RIM-Deep performs better even in the mouse tissue, since the clearing method was different.
Thank you for your suggestion. We will investigate the imaging performance of the two additional tissue clearing protocols using both the official adapter and the RIM-Deep adapter.
(4) The movies need legends, as it is unclear if they even show 2-D slices very deep into the tissue.
Thank you for your suggestion. We will add figure legends to each movie.
(5) The purpose of Supplementary Figure 3 in its current form is unclear, as is the statement in the text related to it: "The effectiveness and utility of this adapter configuration have been substantiated through a comprehensive series of experimental validations".
Thank you for your suggestion. We will revise the statement to: "We validated the effectiveness and utility of this adapter configuration through a series of experiments."
(6) The system is variably referred to as RIM-Deep or DepthView Enhancer in the text and figures, it would be beneficial to the readers if the authors stuck to one name.
Thank you for your suggestion. We will choose RIM-Deep as the sole name.
Minor revisions
Figures
(1) "Confocal" is incorrectly spelled as "confocol" in Figure 1, "media" is misspelled in multiple places.
Thank you. We will correct these errors.
(2) The camera is misplaced in the Figure 1A drawing.
Thank you. We will fix this issue.
(3) It would be useful to have actual pictures of the immersion objective setup (both RIM-Deep and the pre-existing adapter) since the diagrams are not very clear.
Thank you. We will include actual pictures of both the RIM-Deep and the pre-existing adapter in the supplementary materials.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
This study by Popli et al. evaluated the function of Atg14, an autophagy protein, in reproductive function using a conditional knockout mouse model. The authors showed that female mice lacking Atg14 were infertile partly due to defective embryo transport function of the oviduct and faulty uterine receptivity and decidualization using PgrCre/+;Atg14f/f mice. The findings from this work are exciting and novel. The authors demonstrated that a loss of Atg14 led to an excessive pyroptosis in the oviductal epithelial cells that compromises cellular integrity and structure, impeding the transport function of the oviduct. In addition, the authors use both genetic and pharmacological approaches to test the hypothesis. Therefore, the findings from this study are high-impact and likely reproducible. However, there are multiple major concerns that need to be addressed to improve the quality of the work.
Thank you for the additional data that solidified the conclusion of this study. The authors addressed almost all of my previous concerns in this revised manuscript. However, the wording of some key points still needs to be addressed.
Comments on revisions:
In Fig. 2A, please ensure that these are 5.0 dpc samples since implantation has already occurred at this point. However, the embryo appeared free-floating adjacent to the luminal epithelial cells (LE), even in control.
We understand the reviewer’s concern. We have now replaced the previous H & E image with a clearer, higher-quality section that shows a fully attached embryo within a closed uterine lumen representing a typical implantation morphology at the D5 stage of pregnancy. (Revised Figure 2A)
Fig. 3A-B: "Approximately 80-90% of blastocysts" contradicts the quantification in Figure 3C, which showed a percentage of blastocysts below 50%. Please clarify and correct as needed.
In Fig. 3A-B, we meant to say approximately 80-90% of embryos. We have now corrected the statement in the revised manuscript (Line no: 349-351).
The authors showed that acetylated α-tubulin was present in the ampulla region of cKO (Fig. 4A). However, the revised manuscript still stated that (lines 397-399) ...there was a substantial loss of the ciliary epithelial cells (indicated by fewer α-tubulin and FOXJ1-positive cells) (Fig. 4B, left panel and Fig. S3)... So, the authors may want to tone down their conclusion regarding a "substantial loss" of ciliated epithelial cells if the quantification of ciliated cell number is not performed.
We thank the reviewer for this suggestion. To avoid redundancy and ambiguity, we have revised the statement as below (Line no: 391-395):
“As shown in Fig. 4A, normal ciliary structures were observed in the ampulla of both control and cKO oviducts. However, in the isthmus of cKO oviducts, we observed a reduction in both the FOXJ1- and PAX8-expressing cells (Fig. 4B, and Fig. S3).”
Fig. 4C - the areas with red inset boxes labeled for isthmus are not really isthmus (in both control and cKO). The zoomed-in images (Fig. 4C, far-right panels for both control and cKO) show the transitional zone from the ampulla to the isthmus. The isthmus areas should have a thick muscle layer with almost no ciliated cells (see Fig. 4B cKO; those are true isthmus areas).
We thank the reviewer for noting this. We have corrected the label accordingly. Since ciliary epithelial cells predominantly reside in the ampulla, we have included high-resolution images specifically for the ampulla regions.
• Fig. 3A and 3C, it appears that the images were taken at different magnifications, but the scale bars are the same at 200 µm. Authors, please double-check the scale bars.
We thank the reviewer for noting this. We have double-checked all the figures to ensure the scale bars are correctly displayed and aligned with the resolution.
• Fig. 6D - why did the polyphyllin-treated samples not sum to 100%? Please double-check.
Since approximately 50% of the embryos were retained in the oviduct following polyphyllin treatment (Figure 6C, upper bar), the bar in Figure 6D represents this percentage (50% retained) rather than 100%.
Reviewer #2 (Public review):
In this manuscript, Popli et al investigated the roles of autophagy-related gene, Atg14, in the female reproductive tract (FRT) using conditional knockout mouse models. By ablation of Atg14 in both oviduct and uterus with PR-Cre (Atg14 cKO), authors discovered that such females are completely infertile. They went on to show that Atg14 cKO females have impaired embryo implantation as well as embryo transport from oviduct to uterus. Further analysis showed that Atg14 cKO leads to increased pyroptosis in oviduct, which disrupts oviduct epithelial integrity and leads to obstructive oviduct lumen and impaired embryo transport. The authors concluded that Atg14 is critical for maintaining the oviduct homeostasis and keeping the inflammation under check to enable proper embryo transport.
The authors have barely addressed most of my concerns in this revised version with a few minor issues remaining to be addressed:
(1) The authors tried to address my first concern regarding the statement that "autophagy is critical for maintaining the oviduct homeostasis". The revised statement in Lines 53-54 "we report that Atg14-dependent autophagy plays a crucial role in maintaining..." is still not correct. It should be corrected as "we report that autophagy-related protein Atg14 plays a crucial role in maintaining...".
We thank the reviewer for this nice suggestion. We have now modified the statement as suggested (Line no: 54).
(2) Line 349-351 described 80-90% of blastocysts retrieved from oviducts of cKO mice, which is inconsistent with Figure 3B (showing more than 98%).
We thank the reviewer for noting this. We have now corrected the statement as: “Unexpectedly, oviduct flushing from cKO mice resulted in the retrieval of approximately 90% of embryos, suggesting their potential entrapment within the oviducts, impeding their transit to the uterus”. (Line No: 349-351).
(3) Line 447, "Fig. 5E" should be Fig. 6A. In addition, there is a grammatical error in the next sentence.
We have corrected the figure number and addressed the grammatical error.
(4) In Figure 6D, why does the composition of blastocysts in the chemical-treated group not add up to 100%?
As explained in our response to Reviewer 1, the bar in Figure 6D represents the approximately 50% of embryos retained in the oviduct (Figure 6C, upper bar) rather than the full count.
Reviewer #3 (Public review):
Summary:
The manuscript by Pooja Popli and co-authors tested the importance of Atg14 in the female reproductive tract by conditionally deleting Atg14 using PrCre and also Foxj1cre. The authors showed that loss of Atg14 leads to infertility due to the retention of embryos within the oviduct. The authors further concluded that the retention of embryos within the oviduct is due to pyroptosis in oviduct cells leading to defective cellular integrity. The revised manuscript has included new experimental data (Figs. S2B, 5B, 5C, and S3) that satisfied the concerns of this reviewer. The manuscript should provide important advancement to the field.
We sincerely thank the reviewer for the thoughtful evaluation of our manuscript and appreciate your constructive feedback.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We appreciate the reviewers' thoughtful consideration of our manuscript, and their recognition of the variety of experimental and computational approaches we have brought to bear in probing the very challenging question of uncoupled proton leak through EmrE.
We did record SSME measurements with MeTPP+, a small-molecule substrate, at two different protein:lipid ratios. These experiments report the rate of net flux when both proton-coupled substrate antiport and substrate-gated proton leak are possible. We will add this data to the revision, including data acquired at different lipid:protein ratios that confirm we are detecting transport rather than binding. In brief, this data shows that the net flux is highly dependent on both proton concentration (pH) and drug-substrate concentration, as predicted by our mechanistic model. This demonstrates that both types of transport contribute to net flux when small-molecule substrates are present.
In the absence of drug-substrate, proton leak is the only possible transport pathway. The pyranine assay directly assesses proton leak under these conditions and unambiguously shows faster proton entry into proteoliposomes through the ∆107-EmrE mutant than through WT EmrE, with the rate of proton entry into ∆107-EmrE proteoliposomes matching the rate of proton entry achieved by the protonophore CCCP. We have revised the text to more clearly emphasize how this directly measures proton leak independently of any other type of transport activity. The SSME experiments with a proton gradient only (no small molecule substrate present) provide additional data on shorter timescales that is consistent with the pyranine data. The consistency of the data across multiple LPRs and comparison of transport to proton leak in the SSME assays further strengthens the importance of the C-terminal tail in determining the rate of flux.
None of the current structural models have good resolution (crystallography, EM) or sufficient restraints (NMR) to define the loop and tail conformations sufficiently for comparison with this work. We are in the process of refining an experimental structure of EmrE with better resolution of the loop and tail regions implicated in proton-entry and leak. Direct assessment of structural interactions via mutagenesis is complicated because of the antiparallel homodimer structure of EmrE. Any point mutation necessarily affects both subunits of the dimer, and mutations designed to probe the hydrophobic gate on the more open face of the transporter also have the potential to disrupt closure on the opposite face, particularly in the absence of sufficient resolution in the available structures. Thus, mutagenesis to test specific predicted structural features is deferred until our structure is complete so that we can appropriately interpret the results.
In our simulation setup, the MD results can be considered representative and meaningful for two reasons. First, the C-terminal tail, not present in the prior structure and thus modeled by us, is only 4 residues long. We will show in the revision and detailed response that the system loses memory of its previous conformation very quickly, such that velocity initialization alone is enough to provide a diverse starting point. Second, our simulation is more like simulated annealing, starting from a high free energy state to show that, given such random initialization, the tail conformation we get in the end is consistent with what we reported. It is also difficult to sample back-and-forth tail motion within a realistic MD timescale. Therefore, causally inferring the allosteric motions from unbiased MD of the wild type alone can be inconclusive. The most viable approach is to look at the equilibrium statistics of the most stable states of WT- and ∆107-EmrE and compare the differences.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This descriptive manuscript builds on prior research showing that the elimination of Origin Recognition Complex (ORC) subunits does not halt DNA replication. The authors use various methods to genetically remove one or two ORC subunits from specific tissues and observe continued replication, though it may be incomplete. The replication appears to be primarily endoreduplication, indicating that ORC-independent replication may promote genome reduplication without mitosis. Despite similar findings in previous studies, the paper provides convincing genetic evidence in mice that liver cells can replicate and undergo endoreduplication even with severely depleted ORC levels. While the mechanism behind this ORC-independent replication remains unclear, the study lays the groundwork for future research to explore how cells compensate for the absence of ORC and to develop functional approaches to investigate this process. The reviewers agree that this valuable paper would be strengthened significantly if the authors could delve a bit deeper into the nature of replication initiation, potentially using an origin mapping experiment. Such an exciting contribution would help explain the nature of the proposed new type of Mcm loading, thereby increasing the impact of this study for the field at large.
We appreciate the reviewers’ suggestion. To date, we know of only one paper that mapped origins of replication in regenerating mouse liver, and that was published two months ago in Cell (PMID: 39293447). We want to adopt this method, but we do not need it to answer the question asked. We have mapped origins of replication in ORC-deleted cancer cell lines and compared them to wild-type cells in Shibata et al., bioRxiv (PMID: 39554186) (it is under review). We report the following: Mapping of origins in cancer cell lines that are wild type or engineered to delete one of three subunits, ORC1, ORC2 or ORC5, shows that specific origins are still used and are mostly at the same sites in the genome as in wild-type cells. Of the 30,197 origins in wild-type cells (with ORC), only 2,466 (8%) are not used in any of the three ORC-deleted cells and 18,319 (60%) are common between the four cell types. Despite the lack of ORC, excess MCM2-7 is still loaded at comparable rates in G1 phase to license reserve origins and is also repeatedly loaded in the same S phase to permit re-replication.
Citation: Specific origin selection and excess functional MCM2-7 loading in ORC-deficient cells. Yoshiyuki Shibata, Mihaela Peycheva, Etsuko Shibata, Daniel Malzl, Rushad Pavri, Anindya Dutta. bioRxiv 2024.10.30.621095; doi: https://doi.org/10.1101/2024.10.30.621095 (PMID: 39554186)
We have now included this in the discussion.
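As a quick arithmetic check on the origin counts quoted above, the following minimal Python sketch recomputes the two percentages from the three numbers given in the response. The variable and function names are illustrative only and are not taken from the cited preprint.

```python
# Origin counts as quoted in the response (from Shibata et al., bioRxiv).
total_wt_origins = 30_197        # origins mapped in wild-type cells
unused_in_all_deleted = 2_466    # not used in any of the three ORC-deleted lines
shared_by_all_four = 18_319      # common to wild type and all three deleted lines


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


pct_unused = percent(unused_in_all_deleted, total_wt_origins)
pct_shared = percent(shared_by_all_four, total_wt_origins)

print(f"Unused in all ORC-deleted lines: {pct_unused:.1f}%")
print(f"Shared by all four cell types:   {pct_shared:.1f}%")
```

Both values agree with the rounded figures in the response (roughly 8% and 60%).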
Public Reviews:
Reviewer #1 (Public review):
The origin recognition complex (ORC) is an essential loading factor for the replicative Mcm2-7 helicase complex. Despite ORC's critical role in DNA replication, there have been instances where the loss of specific ORC subunits has still seemingly supported DNA replication in cancer cells, endocycling hepatocytes, and Drosophila polyploid cells. Critically, all tested ORC subunits are essential for development and proliferation in normal cells. This presents a challenge, as conditional knockouts need to be generated, and a skeptic can always claim that there were limiting but sufficient ORC levels for helicase loading and replication in polyploid or transformed cells. That being said, the authors have consistently pushed the system to demonstrate replication in the absence or extreme depletion of ORC subunits.
Here, the authors generate conditional ORC2 mutants to counter a potential argument with prior conditional ORC1 mutants that Cdc6 may substitute for ORC1 function based on homology. They also generate a double ORC1 and ORC2 mutant, which is still capable of DNA replication in polyploid hepatocytes. While this manuscript provides significantly more support for the ability of select cells to replicate in the absence or near absence of select ORC subunits, it does not shed light on a potential mechanism.
The strengths of this manuscript are the mouse genetics and the generation of conditional alleles of ORC2 and the rigorous assessment of phenotypes resulting from limiting amounts of specific ORC subunits. It also builds on prior work with ORC1 to rule out Cdc6 complementing the loss of ORC1.
The weakness is that it is a very hard task to resolve the fundamental question of how much ORC is enough for replication in cancer cells or hepatocytes. Clearly, there is a marked reduction in specific ORC subunits that is sufficient to impact replication during development and in fibroblasts, but the devil's advocate can always claim minimal levels of ORC remaining in these specialized cells.
The significance of the work is that the authors keep improving their conditional alleles (and combining them), thus making it harder and harder (but not impossible) to invoke limiting but sufficient levels of ORC. This work lays the foundation for future functional screens to identify other factors that may modulate the response to the loss of ORC subunits.
This work will be of interest to the DNA replication, polyploidy, and genome stability communities.
Thank you.
Reviewer #2 (Public review):
This manuscript proposes that primary hepatocytes can replicate their DNA without the six-subunit ORC. This follows previous studies that examined mice that did not express ORC1 in the liver. In this study, the authors suppressed expression of ORC2 or ORC1 plus ORC2 in the liver.
Comments:
(1) I find the conclusion of the authors somewhat hard to accept. Biochemically, ORC without the ORC1 or ORC2 subunits cannot load the MCM helicase on DNA. The question arises whether the deletion in the ORC1 and ORC2 genes by Cre is not very tight, allowing some cells to replicate their DNA and allow the liver to develop, or whether the replication of DNA proceeds via non-canonical mechanisms, such as break-induced replication. The increase in the number of polyploid cells in the mice expressing Cre supports the first mechanism, because it is consistent with few cells retaining the capacity to replicate their DNA, at least for some time during development.
In our study, we used EYFP as a marker for Cre recombinase activity. ~98% of the hepatocytes in tissue sections and cells in culture express EYFP, suggesting that the majority of hepatocytes successfully expressed the Cre protein to delete the ORC1 or ORC2 genes. To assess deletion efficiency, we employed sensitive genotyping and Western blotting techniques to confirm the deletion of ORC1 and ORC2 in hepatocytes isolated from Alb-Cre mice. Results in Fig. 2C and Fig. 6D demonstrate the near-complete absence of ORC2 and ORC1 proteins, respectively, in these hepatocytes.
The mutant hepatocytes underwent at least 15–18 divisions during development. The inherited ORC1 or ORC2 protein present during the initial cell divisions would be diluted to roughly 1.6% of wild-type levels (1/2^6) within six divisions, making it highly unlikely to support DNA replication, and yet we observe hepatocyte numbers that suggest there was robust cell division even after that point.
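The dilution argument can be illustrated with a minimal Python sketch. It assumes, as a simplification that is not stated in the response, that inherited protein is split evenly between daughter cells at each division, with no new synthesis and no degradation; the function name is hypothetical.

```python
def residual_fraction(divisions: int) -> float:
    """Fraction of the inherited protein left per cell after `divisions`
    two-fold dilutions (assumes no new synthesis and no degradation)."""
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return 0.5 ** divisions


# Six divisions is the point discussed in the response; 15 and 18 bracket
# the estimated number of divisions during liver development.
for n in (1, 3, 6, 15, 18):
    print(f"{n:2d} divisions: {residual_fraction(n) * 100:.5f}% of starting level")
```

Under these assumptions, six halvings leave 1/64 of the starting amount per cell, and 15 or more halvings leave less than one part in 30,000. Any protein degradation would lower these values further.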
Furthermore, the EdU incorporation data confirm DNA synthesis in the absence of ORC1 and ORC2. Specifically, immunofluorescence showed that both in vitro and in vivo, EYFP-positive hepatocytes (indicating successful ORC1 and ORC2 deletion) incorporated EdU, demonstrating that DNA synthesis can occur without ORC1 and ORC2.
Finally, the Alb-ORC2f/f mice have 25-37.5% of the number of hepatocyte nuclei compared to WT mice (Table 2). If that many cells had an undeleted ORC2 gene, that would have shown up in the genotyping PCR and in the Western blots.
(2) Fig 1H shows that 5 days post infection, there is no visible expression of ORC2 in MEFs with the ORC2 flox allele. However, at 15 days post infection, some ORC2 is visible. The authors suggest that a small number of cells that retained expression of ORC2 were selected over the cells not expressing ORC2. Could a similar scenario also happen in vivo?
This would not explain the significant incorporation of EdU in hepatocytes that are EYFP positive and do not have detectable ORC by Western blots. Also note that for MEFs we are delivering the Cre by adenovirus infection in vitro, so there is a finite probability that a cell will not receive the virus, and therefore the Cre, and will not delete ORC2. However, in vivo, the Alb-Cre will be expressed in every cell that turns on albumin. There is no escaping the expression of Cre.
(3) Figs 2E-G shows decreased body weight, decreased liver weight and decreased liver to body weight in mice with recombination of the ORC2 flox allele. This means that DNA replication is compromised in the ALB-ORC2f/f mice.
It is possible that DNA replication is partially compromised or may slow down in the absence of ORC2. However, it is important to emphasize that livers with ORC2 deletion remain capable of DNA replication, so much so that liver function and life span are near normal. Therefore, some kind of DNA replication has to serve as a compensatory mechanism in the absence of ORC2 to maintain liver function and support regeneration.
(4) Figs 2I-K do not report the number of hepatocytes, but the percent of hepatocytes with different nuclear sizes. I suspect that the number of hepatocytes is lower in the ALB-ORC2f/f mice than in the ORC2f/f mice. Can the authors report the actual numbers?
We show in Table 2 that the Alb-Orc2f/f mice have about 25-37.5% of the hepatocytes compared to the WT mice.
(5) Figs 3B-G do not report the number of nuclei, but percentages, which are plotted separately for the ORC2-f/f and ALB-ORC2-f/f mice. Can the authors report the actual numbers?
In all the FACS experiments in Fig. 3B-G we collect data for a total of 10,000 nuclei (or cells). For Fig. 3E-G we divide the 10,000 nuclei into the bottom 40% on the EYFP axis (EYFP low, which is mostly EYFP negative) as the control group, and the top 20% on the EYFP axis (EYFP high) as the test group. We have described this in the Methods in the revision and relabeled EYFP negative and positive as EYFP low and high in the Figures and Figure legends.
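The gating described above (bottom 40% versus top 20% of events ranked by EYFP intensity) can be sketched in Python as follows. This is only an illustration under stated assumptions: the intensities are synthetic random numbers, the function name is hypothetical, and the actual analysis was presumably done in flow cytometry software rather than with this code.

```python
import random


def split_by_eyfp(intensities, low_fraction=0.40, high_fraction=0.20):
    """Return (low_idx, high_idx): indices of events in the bottom
    `low_fraction` and top `high_fraction` of the EYFP intensity ranking."""
    ranked = sorted(range(len(intensities)), key=lambda i: intensities[i])
    n = len(ranked)
    n_low = round(n * low_fraction)
    n_high = round(n * high_fraction)
    low_idx = ranked[:n_low]
    high_idx = ranked[n - n_high:]
    return low_idx, high_idx


# Synthetic stand-in for 10,000 recorded nuclei (NOT real data).
random.seed(0)
eyfp = [random.lognormvariate(0.0, 1.0) for _ in range(10_000)]

low_idx, high_idx = split_by_eyfp(eyfp)
print(len(low_idx), "EYFP-low events;", len(high_idx), "EYFP-high events")
```

The middle 40% of events is deliberately left out of both groups, which keeps the EYFP-low and EYFP-high populations well separated on the EYFP axis.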
(6) Fig 5 shows the response of ORC2f/f and ALB-ORC2f/f mice after partial hepatectomy. The percent of EdU+ nuclei in the ORC2-f/f (aka ALB-CRE-/-) mice in Fig 5H seems low. Based on other publications in the field it should be about 20-30%. Why is it so low here? The very low nuclear density in the ALB-ORC2-f/f mice (Fig 5F) and the large nuclei (Fig 5I) could indicate that cells fire too few origins, proceed through S phase very slowly and fail to divide.
The percentage of EdU+ nuclei in the ORC2f/f without Alb-Cre mice is 8%, while in PMID 10623657 ~10% of wild type nuclei incorporate EdU at 42 hr post partial hepatectomy (mid-point between the 36-48 hr post hepatectomy that was used in our study). The important result here is that in the ORC2f/f mice with Alb-Cre (+/-) we are seeing significant EdU incorporation. We have also corrected the X-axis labels in 5F, 5I, 7E and 7F to reflect that those measurements were not made at 36 hr post-resection but later (as was indicated in the schematic in Fig. 5A).
(7) Fig 6F shows that ALB-ORC1f/f-ORC2f/f mice have very severe phenotypes in terms of body weight and liver weight (about one third of wild-type!!). Fig 6H and 6I, the actual numbers should be presented, not percentages. The fact that there are EYFP negative cells, implies that CRE was not expressed in all hepatocytes.
The liver weight is very dependent on the body weight, and so we have to look at the liver to body weight ratio to determine if it is inordinately small, and the ratio is 70% of the WT. In females the liver and body weight are low (although in proportion to each other), which may be what the reviewer is referring to. However, the fact that liver weight and body weight are not as low in males suggests that this is a sex (hormone?)-specific effect and not a DNA replication defect. We had discussed this possibility. We have another paper, also on bioRxiv (Su et al. doi.org/10.1101/2024.12.18.629220), that suggests that ORC subunits have a significant effect on gene expression, so it is possible that this is what leads to the sexual dimorphism in phenotype. We have now added this to the discussion.
The bottom 40% of nuclei on the EYFP axis in the FACS profiles (what was labeled EYFP negative but will now be called EYFP low) contains mostly non-hepatocytes that are genuinely EYFP negative. Non-hepatocytes (bile duct cells, endothelial cells, Kupffer cells, blood cells) are a significant part of the cells in the dissociated liver (as can be seen in the single cell sequencing results in PMID: 32690901). Their presence does not mean that hepatocytes are not expressing Cre. Hepatocytes are nearly 100% EYFP positive, as can be seen in the tissue sections (where the hepatocytes take up most of the visual field) and in cells in culture. Also, if there are EYFP negative hepatocyte nuclei in the FACS, that still does not rule out EYFP presence in the cytoplasm. The important point from the FACS is that the EYFP high nuclei (which have expressed Cre for the longest period) are polyploid relative to the EYFP low nuclei.
(8) Comparing the EdU+ cells in Fig 7G versus 5G shows very different number of EdU+ cells in the control animals. This means that one of these images is not representative. The higher fraction of EdU+ cells in the double-knockout could mean that the hepatocytes in the double-knockout take longer to complete DNA replication than the control hepatocytes. The control hepatocytes may have already completed DNA replication, which can explain why the fraction of EdU+ cells is so low in the controls. The authors may need to study mice at earlier time points after partial hepatectomy, i.e. sacrifice the mice at 30-32 hours, instead of 40-52 hours.
The apparent difference that the reviewer comments on stems from differences in nuclear density in the images in Fig. 7G and 5G (also quantitated in Fig. 7F and 5F). The quantitation in Fig. 7H and 5H shows that the percentages of EdU+ cells are comparable (5-8%).
(9) Regarding the calculation of the number of cell divisions during development: the authors assume that all the hepatocytes in the adult liver are derived from hepatoblasts that express Alb. Is it possible to exclude the possibility that pre-hepatoblast cells that do not express Alb give rise to hepatocytes? For example the cells that give rise to hepatoblasts may proliferate more times than normal giving rise to a higher number of hepatoblasts than in wild-type mice.
Single cell sequencing of mouse liver at e11 shows hepatoblasts expressing hepatocyte-specific markers (PMID: 32690901). All the cells annotated from the single-cell seq analysis are differentiated cells, arguing against the possibility that undifferentiated endodermal cells (what the reviewer probably means by pre-hepatoblasts) exist at e11. We have added this citation to the paper.
A review (https://www.ncbi.nlm.nih.gov/books/NBK27068/) indicates that hepatoblasts expressing albumin are present before e13. It says: “The differentiation of bi-potential hepatoblasts into hepatocytes or BECs begins around e13 of mouse development. Initially hepatoblasts express genes associated with both adult hepatocytes (Hnf4α, Albumin) ...” Thus, we can be certain that hepatoblasts before e13 express albumin. Our calculation of the number of cell divisions in Table 2 begins from e12.
The reviewer may be suggesting that ORC deletion leads to the immediate demise of hepatoblasts (despite having inherited ORC protein from the endodermal cells) causing undifferentiated endodermal cells to persist and proliferate much longer than in normal development. We consider it unlikely, but if true it will be very unexpected, both by suggesting that deletion of ORC immediately leads to the death of the hepatoblasts (despite a healthy reserve of inherited ORC protein) and by suggesting that there is a novel feedback mechanism from the death/depletion of hepatoblasts leading to the persistence and proliferation of undifferentiated endodermal cells. We have added the reviewer’s suggestion to the discussion.
(10) My interpretation of the data is that not all hepatocytes have the ORC1 and ORC2 genes deleted (eg EYFP-negative cells) and that these cells allow some proliferation in the livers of these mice.
Please see the reply to comment #1. Particularly relevant: “Finally, the Alb-ORC2f/f mice have 25-37.5% of the number of hepatocyte nuclei compared to WT mice (Table 2). If that many cells had an undeleted ORC2 gene, that would have shown up in the genotyping PCR and in the Western blots.”
Reviewer #3 (Public review):
Summary:
The authors address the role of ORC in DNA replication and that this protein complex is not essential for DNA replication in hepatocytes. They provide evidence that ORC subunit levels are substantially reduced in cells that have been induced to delete multiple exons of the corresponding ORC gene(s) in hepatocytes. They evaluate replication both in purified isolated hepatocytes and in mice after hepatectomy. In both cases, there is clear evidence that DNA replication does not decrease at a level that corresponds with the decrease in detectable ORC subunit and that endoreduplication is the primary type of replication observed. It remains possible that small amounts of residual ORC are responsible for the replication observed, although the authors provide arguments against this possibility. The mechanisms responsible for DNA replication in the absence of ORC are not examined.
Strengths:
The authors clearly show that there are dramatic reductions in the amount of the targeted ORC subunits in the cells that have been targeted for deletion. They also provide clear evidence that there is replication in a subset of these cells and that it is likely due to endoreduplication. Although there is no replication in MEFs derived from cells with the deletion, there is clearly DNA replication occurring in hepatocytes (both isolated in culture and in the context of the liver). Interestingly, the cells undergoing replication exhibit enlarged cell sizes and elevated ploidy indicating endoreduplication of the genome. These findings raise the interesting possibility that endoreduplication does not require ORC while normal replication does.
Weaknesses:
There are two significant weaknesses in this manuscript. The first is that although there is clearly robust reduction of the targeted ORC subunit, the authors cannot confirm that it is deleted in all cells. For example, the analysis in Fig. 4B would suggest that a substantial number of cells have not lost the targeted region of ORC2. Although the western blots show stronger effects, this type of analysis is notorious for non-linear response curves and no standards are provided. The second weakness is that there is no evaluation of the molecular nature of the replication observed. Are there changes in the amount or location of Mcm2-7 loading that is usually mediated by ORC? Does an associated change in Mcm2-7 loading lead to the endoreduplication observed? After numerous papers from this lab and others claiming that ORC is not required for eukaryotic DNA replication in a subset of cells, we still have no information about an alternative pathway that could explain this observation.
We do not see a significant deficit in MCM2-7 loading (amount and rate) in cancer cell lines where we have deleted ORC1, ORC2 or ORC5 genes separately in Shibata et al. bioRxiv 2024.10.30.621095; doi: https://doi.org/10.1101/2024.10.30.621095 (PMID: 39554186). This is now cited in the discussion.
The authors frequently use the presence of a Cre-dependent eYFP expression as evidence that the ORC1 or ORC2 genes have been deleted. Although likely the best visual marker for this, it is not demonstrated that the presence of eYFP ensures that ORC2 has been targeted by Cre. For example, based on the data in Fig. 4B, there seems to be a substantial percentage of ORC2 genes that have not been targeted while the authors report that 100% of the cells express eYFP.
(1) The PCR reactions in Fig. 4B are still contaminated by DNA from non-hepatocyte cells: bile duct cells, endothelial cells, Kupffer cells and blood cells. Microscopy of cultured cells identifies the hepatocytes unequivocally from their morphology. <2% of the hepatocytes in culture in Fig. 4C are EYFP-.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The authors should present the data as suggested in the review and reformulate their conclusions. If possible, mice should be examined 30-32 hours after partial hepatectomy.
Based on the literature, we chose a time point that is consistent with our previous paper (Uchida et al., Genes & Dev).
Reviewer #3 (Recommendations for the authors):
(1) It would improve the paper to use single-cell methods (e.g. FISH) to assess the deletion of ORC subunits in the targeted cells.
This is something we will reserve for future studies.
(2) The importance of the paper would be increased dramatically by showing that the elimination of ORC changed the location of Mcm2-7 loading. This would be highly likely if the authors' hypothesis that ORC is not involved is true. On the other hand, given ORC's role in origin selection, an observation that the same sites are used but less frequently would support a hypothesis that residual intact ORC is responsible for the replication observed.
Shibata et al. (PMID: 39554186) have answered this question. The loss of ORC does not change the locations of origins or even the ability to specify origins. We argue that this is what is to be expected from our hypothesis, that although ORC is clearly important for MCM loading in yeast and in biochemical experiments, something unexpected is going on in human cells. Either a vanishingly small amount of ORC (undetectable by commonly used methods) can load the full complement of MCM2-7 at a rate that is comparable to wild-type cells, or there is an ORC-independent mechanism of MCM2-7 loading. This is now added to the discussion.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer 1:
Comments on revisions:
This manuscript is in some ways improved - mainly by toning down the conclusions - but a few major weaknesses have not been addressed. I do not agree that it is not justified to perform experiments to investigate the sterility of single CDK8 knockout mice since this could be important and given that the new data show that while there is some overlap in expression of the two paralogues, there are also significant differences in the testis. At the least, it would have been interesting and easy to do to show the expression of CDK8 and CDK19 in the single cell transcriptomics, since this might help to identify the different populations.
We did try to analyse Cdk8/Cdk19 in the single cell transcriptomics data. However, we were unable to draw a clear conclusion. Because of the limited sensitivity of single cell sequencing, especially for low-abundance transcripts such as transcription factors (for the 10x technology used in our study) (Chuang et al., 2024), it is challenging to establish with certainty CDK8/19-positive and -negative tissues from single cell data, because both transcripts are minor. Nevertheless, the majority of cell types showed some expression of CDK8/19, with maximum expression in pachytene/diplotene spermatocytes. We do not include these data in the manuscript, particularly as we were able to assess Cdk8/19 expression patterns using IF approaches.
Author response image 1.
The only definitive way of concluding a kinase-independent phenotype is to rescue with a kinase dead mutant. While I agree that the inhibitors have been well validated, since they did not have any effects, it is hard to be sure that they actually reached their targets in the tissue concerned. This could have been done by cell thermal shift assay. In the absence of any data on this, the conclusion of a kinase-independent effect is weak.
We totally agree with this point, but it takes several years to produce mice with inducible expression of KD CDK8 on the DKO background. These experiments are already underway in our lab; however, their results will be published in future work.
Figure 2 legend includes (G) between (B) and (C), and appears to, in fact, refer to Fig 1E, for which the legend is missing the description.
Thank you, we corrected this.
Finally, Figure S1C appears wrong. Goblet cells are not in the crypt but on the villi (so the graph axis label is wrong), and there are normally between 5 and 15 per villus, so the iDKO figure is normal, but there are a surprisingly high number of goblet cells in the controls. And normally there are 10-15 Paneth cells/crypt, so it looks like these have been underestimated everywhere. I wonder how the counting was done - if it is from images such as those shown here then I am not surprised as the quality is insufficient for quantification. How many crypts and villi were counted? Given the difficulty in counting and the variability per crypt/villus, with quantitative differences like this it is important to do quantifications blind. I personally wouldn't conclude anything from this data and I would recommend to either improve it or not include it. If these data are shown, then data showing efficient double knockout in this tissue should also accompany it, by IF, Western or PCR. Otherwise, given a potentially strong phenotype, repopulation of the intestine by unrecombined crypts might have occurred - this is quite common (see Ganuza et al, EMBO J. 2012).
We added fig. S1C with a Western blot showing the presence of CDK8 and CCNC in WT intestine and their absence in the DKO intestine. We also corrected the part of the intestine analyzed, which was the duodenum, not the ileum. We also replaced the photos of intestine sections with ones of better quality and higher magnification (200X) and corrected the Y axis legend. We apologize for the confusion, and thank the reviewer for the careful analysis of our data, which allowed us to make this correction. The numbers of cells were counted at 600x magnification, and the magnification given in the article is for presentation purposes only. Our number of goblet cells was indeed calculated per villus, not per crypt, and the resulting number is similar to those reported by Dannappel et al. (2022). As for Paneth cells, their numbers correspond to those in several articles that use the C57BL/6 strain (Brischetto et al., 2021; King et al., 2013), as the number of Paneth cells differs between different parts of the intestine and different mouse strains (Nakamura et al., 2020).
Reviewer 2:
This reviewer appreciated the authors' effort in improving the quality of this manuscript during their revision. While some concerns remain, the revision is a much improved work and the authors addressed most of my major concerns.
Figure 2E CDK8 and CDK19 immunofluorescent staining images seem to show that CDK8 and CDK19 locations are completely distinct and in different cells; the authors need to elaborate on these results and discuss what such distinct locations mean in light of their double knockout data.
We thank the reviewer for this suggestion. We have expanded the discussion at lines 518 and 529 and included a better-quality picture at 200x magnification. Our main line of reasoning is that, despite distinct expression in different cell types, high magnification shows a certain level of expression of both proteins in most cells, so single knockouts will not demonstrate more than a slight phenotype, while the double knockout will have the full effect. This is especially true if our hypothesis that CCNC stabilization is important here is correct, as both kinases can stabilize the protein.
Minor comments:
Supplemental figure 1(C) legend typo : (C) Periodic acid-Schiff stained sections of ilea of tamoxifen treated R26/Cre/ERT2 and DKO mice.
Thank you, we corrected this.
While the effort to identify and generate new antibodies is appreciated, the specificity of the antibodies used should be examined and presented if available.
The specificity of the antibodies for the western blot is confirmed in figure S1F. We added fig. S1G with IF staining of CDK19 KO testes proving our CDK19 antibody specificity.
References:
Brischetto C., Krieger K., Klotz C., et al. 2021. NF-κB determines Paneth versus goblet cell fate decision in the small intestine. Development 148. doi:10.1242/dev.199683
Chuang H.-C., Li R., Huang H., et al. 2024. Single-cell sequencing of full-length transcripts and T-cell receptors with automated high-throughput Smart-seq3. BMC Genomics 25:1127. doi:10.1186/s12864-024-11036-0
Dannappel M.V., Zhu D., Sun X., et al. 2022. CDK8 and CDK19 regulate intestinal differentiation and homeostasis via the chromatin remodeling complex SWI/SNF. J Clin Invest 132. doi:10.1172/JCI158593
King S.L., Mohiuddin J.J., Dekaney C.M. 2013. Paneth cells expand from newly created and preexisting cells during repair after doxorubicin-induced damage. Am J Physiol Gastrointest Liver Physiol 305:G151–62. doi:10.1152/ajpgi.00441.2012
Nakamura K., Yokoi Y., Fukaya R., et al. 2020. Expression and localization of Paneth cells and their α-defensins in the small intestine of adult mouse. Front Immunol 11:570296. doi:10.3389/fimmu.2020.570296
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Recommendations For The Authors):
Although the scripts are available at the github link that is shown, the Readme file is not available as a text file. Spreadsheets summarizing the RNA-seq data ought to be available for download, but these are not present. Likewise, are spreadsheets available for the data used to generate the plots in Fig. 10, so that the identities of particular, correlated genes can be viewed?
We have now included the Excel sheets with all the DEGs shown in Figures 8-9 (Figure 8 – Source data 1-8). The source data include DEGs that are up- and down-regulated in gWAT, iWAT, liver, and skeletal muscle. The source data files (Excel) are in the standard output format. We have also updated the GitHub repository (https://github.com/Leandromvelez/CTRP10-Manuscript-DEG-Sex-specific-connectivities-and-integration) to include a README file and updated the R scripts to annotate steps and processing considerations. In addition, the README file now contains drive links to the files used: the unfiltered kallisto TPM and counts at the transcript level, as well as the resulting differential expression results based on genotype. All criteria applied to the aligned transcripts, such as gene filtering and normalization, are included in the scripts provided.
Several items would strengthen the work:
(1) Is a CTRP10 antibody available, and does the protein abundance correlate with the mRNA abundances that were assessed in Fig. 1?
Unfortunately, no validated antibody currently exists for CTRP10. Consequently, we were not able to assess protein abundance of CTRP10 in our study.
(2) Were there compensatory changes in the abundance of other CTRP family members? This might be observed at the protein, but not mRNA, level. It might be reasonable to test for the effects of liver, gWAT, skeletal muscle, and iWAT.
We observed no compensatory changes in other CTRP family members based on our RNA-seq data. Unfortunately, we do not have protein data for other CTRP family members.
(3) The gene expression changes shown in Fig. 9 are ranked according to z-score, but it is not clear how this is calculated. It would be helpful to indicate the log2 change in each case.
The z-score is a very commonly used method to show DEGs in studies involving RNA-seq data. We calculate the z-score based on the gene transcript source data (Fig. 8 – Source data 1-8). The z-score is defined as z = (x-μ)/σ, where x is the raw score (gene transcript level), μ is the population mean (mean of gene expression across both WT and KO samples), and σ is the population standard deviation. In essence, the z-score is the raw score minus the population mean, divided by the population standard deviation. We have now included this information in the Fig. 9 legend.
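As an illustration of the definition above, the calculation for a single gene can be sketched in a few lines of Python. This is a minimal sketch and not the code used in the study (the study's scripts are in R); the function name `z_scores` and the example values are hypothetical, and the values stand in for one gene's transcript levels pooled across WT and KO samples.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return z = (x - mu) / sigma for each value.

    mu is the mean and sigma the *population* standard deviation,
    both taken over all supplied samples (WT and KO together).
    """
    mu = mean(values)
    sigma = pstdev(values)  # population SD, matching the definition above
    if sigma == 0:
        raise ValueError("all values are identical; z-score is undefined")
    return [(x - mu) / sigma for x in values]


# Hypothetical transcript levels for one gene across eight samples.
example_levels = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(z_scores(example_levels))
```

Because the mean and standard deviation are computed across both genotypes, the resulting values describe how far each sample sits from the pooled average rather than the log2 fold change between WT and KO.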
(4) In Fig. 6, female HFD-fed KO mice had increased glucose (and insulin) after an overnight fast, but increased glucose was not observed in the GTT data. Possibly, this is because the mice were fasted for only 6h for the GTT. This might be mentioned during the description of these data, on lines 221-224. However, this also raises the question of whether there is a difference in the rate of gluconeogenesis (or possibly glycogenolysis for the 6h data) in the KO compared to the controls. Understanding this would require the use of tracers, and is reasonably beyond the scope of this study, but might be mentioned in the discussion.
Per reviewer’s suggestion, we have included this in the “limitation section” of the discussion.
Reduced RER in the HFD-fed female mice might begin to suggest a mechanism since this suggests the mice might have decreased oxidation of carbohydrates and increased oxidation of fat compared to control animals. A glucose tracer might be used to test whether more glucose is stored and, if so, in what tissue this occurs. Possibly, this could be done ex vivo on isolated tissues or cells. Again, this is reasonably beyond the scope of the present study.
Per reviewer’s suggestion, we have included this in the “limitation section” of the discussion.
(5) The discussion includes a brief discussion of the role of estrogen and suggests that in CTRP10 KO mice there are differences in other factors that would be needed to explain the phenotype. Although it is agreed that this is likely the case, estrogen levels were not measured in the present study. It seems like this would be important to study, and might shed light on the female-specific phenotype.
We have now included serum estrogen data. No significant differences in estrogen levels were seen between WT and KO female mice fed either a low-fat diet (Fig. 4 – figure supplement 1) or a high-fat diet (Fig. 5 – figure supplement 2).
Reviewer #2 (Recommendations For The Authors):
While the concept is potentially exciting, there are major problems with the current manuscript. It lacks the mechanistic details behind MHO.
(1) There is a significant gap that was not addressed by the authors. How exactly does CTRP10 lead to the activation of proteins like Fgf1, Fgf21, Il22ra1, Ucp3, and Klf15 in Ctrp10 knockout female mice? Is it likely that CTRP10 regulates these proteins via indirect mechanisms?
We acknowledge that the lack of mechanistic understanding of how CTRP10 loss-of-function leads to changes in gene expression is a major limitation of the study. We have highlighted this limitation in the discussion section.
• The author notes that Ctrp10 knockout female mice, particularly those on a high-fat diet lack Nr1d1 and can sustain a relatively healthy metabolic state. This is supported by the demonstrated upregulation of Fgf1, Fgf21, Il22ra1, Ucp3, and Klf15 in Ctrp10 knockout female mice. However, the mechanisms through which Ctrp10 knockout influences the expression of these molecules are not elucidated.
We acknowledge that this is a major limitation of the study. We have highlighted this limitation in the discussion section.
• How do you substantiate the role of age and a high-nutrient diet in the development of obesity in knockout female mice? However, it is still unclear whether administering a high-fat diet in >20 week age of mice can develop insulin resistance where obesity is developing in LFD.
When fed a low-fat diet, Ctrp10-KO female mice developed obesity with age and yet showed little, if any, glucose intolerance or insulin resistance based on our glucose tolerance and insulin tolerance tests. For the HFD group, we compared WT and KO mice only on the same diet (not across diets). While WT mice on HFD gained a significant amount of weight over time, as expected, Ctrp10-KO female mice gained substantially more weight than their WT littermates. Despite this, we did not observe the worsening of glucose tolerance and insulin resistance (based on GTT and ITT) in the KO female mice relative to WT controls that we would expect, since greater adiposity in HFD-fed mice generally correlates with worse metabolic outcomes.
(2) The authors should add the NR1D1 dependency study in female mice if possible.
Addressing this would require generating a Ctrp10/Nr1d1 double-KO mouse model and carrying out the entire study again in these double-KO mice. Although the reviewer's suggestion is a good one, this is beyond the scope of the present study.
(3) NR1D1 represses the set of genes that promotes lipogenesis (the author should add some data that validates this statement).
The role of NR1D1 in regulating metabolic genes is extensively documented in the published literature. NR1D1 (also known as REV-ERBα) is a constitutive transcriptional repressor (PMID: 26044300; PMID: 27445394). Many metabolic genes that are normally repressed by NR1D1 are de-repressed in mice lacking NR1D1 globally or in a tissue-specific manner (PMID: 26044300; PMID: 34350828; PMID: 22562834). The many NR1D1 target genes involved in lipid metabolism include CD36, Plin2, Elovl5, and Acss3 (from PMID: 26044300), as well as Scd1, Scd2, Pnpla5, Acsl1, Fasn, Hadhb, and Oxsm (from PMID: 34350828). We have included this information in the discussion section.
(4) The authors should study the effect of Ctrp10 overexpression in HFD-fed female mice and also with KO of CTRP10 in adult mice if possible.
The suggestion by the reviewer is a good one. However, this is beyond the scope of the study. We do not have a Ctrp10 conditional KO mouse model; as such, we could not study the effect of knocking out CTRP10 in adult mice. Overexpression studies are often considered non-physiological these days since the level of the overexpressed protein is generally much higher than the normal physiological level. For this reason, we did not attempt any overexpression study.
Reviewer #3 (Recommendations For The Authors):
Line 114: Could you please provide definitions for "GluK2" and "GluK4" for readers unfamiliar with these terms?
We have now provided definitions for these terms.
Line 140: It's stated that skeletal muscle and the pancreas express similar levels of Ctrp10 as the brain. Please double-check and clarify this assertion for accuracy.
In mice, based on our own data (Fig. 1B), Ctrp10 expression in skeletal muscle and pancreas is comparable to that in the whole brain. In humans, based on publicly available data (e.g., the Genotype-Tissue Expression portal; GTEx), the brain expresses a much higher level of CTRP10 transcript than peripheral tissues.
Line 141: Have you investigated whether Ctrp10 levels in plasma change after refeeding? If not, consider exploring this aspect to enhance the comprehensiveness of the study.
No validated antibody currently exists for CTRP10. As such, we could not assess the plasma level of CTRP10 after refeeding. We have included this as a limitation of our study in the discussion section.
Lines 143-144: Clarify the age bracket of the animals used in the study. Additionally, have you observed similar responses, such as downregulation of Ctrp10 in response to refeeding, in both old and young mice in peripheral tissues?
We have now included the age of the mice (~10 weeks old) for the fasting refeeding study as shown in Fig. 1C in the result and method sections.
Lines 135-149: To complement the experiments shown in Fig 1B-D, provide data pertaining to females.
Ideally, we would like to have this data as well. However, to do this for females would involve 47 mice and the collection of 120 tissues (Fig. 1B; n = 10 per tissue), 390 tissues (Fig. 1C; n = 7-8 per tissue per fast or refed state), and 528 tissues (Fig. 1D; n = 11 per tissue per HFD or LFD). This would be a total of 1038 tissue samples. The main purpose of Fig. 1B-D is to demonstrate that Ctrp10 transcript is widely expressed and that its expression is modulated by nutritional (HFD vs. LFD) and metabolic (fast vs. refeed) states. These data provided a rationale to examine the metabolic phenotype in mice lacking CTRP10.
To address the reviewer’s point, we looked at the expression levels of CTRP10/C1QL2 in males and females in the Genotype-Tissue Expression (GTEx) database portal, and there do not appear to be sex differences in CTRP10 expression patterns in normal tissues.
Line 152: Can you provide evidence supporting the hypothesis that Ctrp10 is secreted into the circulation?
CTRP10 has a classic signal peptide sequence and the protein is secreted when expressed in HEK 293 cells (PMID: 18783346). We have shown previously that CTRP10 can be found in FPLC-fractionated mouse serum using a polyclonal rabbit anti-mouse CTRP10 antibody we generated (PMID: 18783346); this antibody, however, does not work on tissue lysates (many non-specific bands). Published studies also show that CTRP10/C1QL2 circulates in human plasma: human C1QL2/CTRP10 is detected in plasma from the UK Biobank (PMID: 37794186; C1QL2 is highlighted on page 335) and in serum samples from pregnant females (PMID: 39062451; C1QL2 is highlighted in Table 2). We have included this information in the Introduction section.
Line 178: In Fig 4 D and E (and other figures in the paper), it would be more accurate to express adipocyte size in "μm²" instead of "uM2."
We have double-checked and fixed this issue in Figures 4 and 7.
Line 259: Please specify the age of the animals used in the study.
In the method section, we state that LFD was provided for the duration of the study, beginning at 5 weeks of age, and that HFD was provided for 14 weeks, beginning at 6-7 weeks of age. The age of the mice is also indicated in Figure 2A and Figure 4A.
Lines 275-283 and 288-296: It would be more appropriate to move this content to the Discussion section for better contextualization.
We feel that the published information on NR1D1 and FGF21 should be mentioned in the Results section so that readers can immediately appreciate the significance of our data shown in Fig. 8 and 9. However, we have also included similar information concerning NR1D1 in the discussion section for better contextualization, as suggested.
Line 301: The section on DEG analysis requires additional details. How was the DEG analysis conducted? Were the DEGs from "wild type and KO mice" compared with "human DEGs regulated by sex"? Also, details about the phenotype of the human subjects and their association with obesity should be included. Additionally, discuss specific genes identified by the analysis and their relevance to the Ctrp10 story and human sex-specific gene connectivity analysis.
We have updated the section on DEG analysis and, related to reviewer comments above, significantly expanded the github repository, detailing an analytical walkthrough of all computational analyses performed. To clarify the human integration analysis, we have added the following to the methods:
“To investigate the degree of conservation of CTRP-engaged pathways, we mapped the differentially expressed genes (DEGs) identified from Ctrp10 knockout (KO) versus wild-type (WT) mice to their human orthologs, including human CTRP10, in the GTEx database for transcriptional correlations. Individuals were stratified by sex to examine sex-specific gene connectivity, consisting of 210 males and 100 females to compare gene expression across tissues. Gene-connectivity analyses were performed based on population correlation significances summarized by cumulative -log10(p-values) as previously described.”
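The summary statistic named in this passage, a cumulative -log10(p-value), can be illustrated with a short Python sketch. This is not the authors' pipeline (their R scripts are in the GitHub repository); the function name `cumulative_neg_log10` and the example p-values are hypothetical, and the sketch assumes the per-gene correlation p-values have already been computed elsewhere.

```python
import math


def cumulative_neg_log10(p_values):
    """Sum -log10(p) over a collection of correlation p-values.

    A larger total means stronger overall correlation significance,
    i.e. higher "connectivity" in the sense used in the passage above.
    """
    total = 0.0
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"p-value out of range (0, 1]: {p}")
        total += -math.log10(p)
    return total


# Hypothetical p-values for one gene's correlations with three other genes.
print(cumulative_neg_log10([0.1, 0.01, 0.001]))
```

Under this scheme, computing the score separately for the male and female subsets would give one connectivity value per gene per sex, which is the kind of comparison the passage describes.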
Line 330: In Fig 7L, increased oxidative stress in the liver of KO mice is shown. Please provide an explanation for the claim that Ctrp10-KO female mice resembled the WT controls.
In Fig. 7L, we did observe a modest, but significant, increase in oxidative stress in the liver based on the quantification of malondialdehyde (MDA) level, a marker of tissue oxidative stress. However, we did not see any significant differences in the expression of oxidative genes in the liver between WT and KO female mice (Fig. 7J); thus, the statement in line 330 (discussion section) that pertains to oxidative gene expression in fat and liver (Fig. 7E and 7J) is correct.
Line 375: Could you clarify the term "adipose tissue health" and further discuss or provide evidence demonstrating compromised adipose tissue health in female KO mice following HFD?
Adipose tissue health refers to the healthy functioning of adipose tissue (based on its functionality, immune cell population and profile, and metabolic gene expression profiles). Adipose tissue releases free fatty acids in response to fasting and takes up lipids in response to refeeding. Both of these functions are preserved in KO mice, as we did not observe any significant differences in free fatty acid (NEFA) and triglyceride levels in the fasted and refed states (Fig. 6AB). Also, we did not observe any significant differences in the expression of inflammatory and fibrotic genes in the adipose tissue of WT and KO female mice fed a high-fat diet (Fig. 7E). If anything, we observed a modest, but significant, reduction in the expression of some ER and oxidative stress genes in the KO female mice relative to WT controls (Fig. 7E).
Line 408: Please provide data regarding estrogen levels in wild-type and KO female mice for comparison.
We have now included serum estrogen data. No significant differences in estrogen levels were seen between WT and KO female mice fed either a low-fat diet (Fig. 4 – figure supplement 1) or a high-fat diet (Fig. 5 – figure supplement 2).
Line 587: The GitHub link provided seems to be inactive or incorrect. Please verify and provide the correct link.
We have also updated the github (https://github.com/Leandromvelez/CTRP10-Manuscript-DEG-Sex-specific-connectivities-and-integration) to include a README file and updated the R scripts to annotate steps and processing considerations.
Lines 590-599: Provide additional details about the analysis of human sex-specific genes. Including a table of the top DEGs and pathways differentially regulated by sex would be beneficial for readers' comprehension.
We have expanded the methods, results and associated github repositories to detail all reproducible parameters used in these analyses. The new table of DEGs is included in the manuscript and github repositories.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this article, Nedbalova et al. investigate the biochemical pathway that acts in circulating immune cells to generate adenosine, a systemic signal that directs nutrients toward the immune response, and S-adenosylmethionine (SAM), a methyl donor for lipid, DNA, RNA, and protein synthetic reactions. They find that SAM is largely generated through the uptake of extracellular methionine, but that recycling of adenosine to form ATP contributes a small but important quantity of SAM in immune cells during the immune response. The authors propose that adenosine serves as a sensor of cell activity and nutrient supply, with adenosine secretion dominating in response to increased cellular activity. Their findings of impaired immune action but rescued larval developmental delay when the enzyme Ahcy is knocked down in hemocytes are interpreted as due to effects on methylation processes in hemocytes and reduced production of adenosine to regulate systemic metabolism and development, respectively. Overall this is a strong paper that uses sophisticated metabolic techniques to map the biochemical regulation of an important systemic mediator, highlighting the importance of maintaining appropriate metabolite levels in driving immune cell biology.
Strengths:
The authors deploy metabolic tracing - no easy feat in Drosophila hemocytes - to assess flux into pools of the SAM cycle. This is complemented by mass spectrometry analysis of total levels of SAM cycle metabolites to provide a clear picture of this metabolic pathway in resting and activated immune cells.
The experiments show that the recycling of adenosine to ATP, and ultimately SAM, contributes meaningfully to the ability of immune cells to control infection with wasp eggs.
This is a well-written paper, with very nice figures showing metabolic pathways under investigation. In particular, the italicized annotations, for example, "must be kept low", in Figure 1 illustrate a key point in metabolism - that cells must control levels of various intermediates to keep metabolic pathways moving in a beneficial direction.
Experiments are conducted and controlled well, reagents are tested, and findings are robust and support most of the authors' claims.
Weaknesses:
The authors posit that adenosine acts as a sensor of cellular activity, with increased release indicating active cellular metabolism and insufficient nutrient supply. It is unclear how generalizable they think this may be across different cell types or organs.
In the final part of the Discussion, we elaborate slightly more on a possible generalization of our results. Given the limited space in this experimental paper, we intend to address this in more detail and more comprehensively in a subsequent perspective article.
The authors extrapolate the findings in Figure 3 of decreased extracellular adenosine in ex vivo cultures of hemocytes with knockdown of Ahcy (panel B) to the in vivo findings of a rescue of larval developmental delay in wasp egg-infected larvae with hemocyte-specific Ahcy RNAi (panel C). This conclusion (discussed in lines 545-547) should be somewhat tempered, as a number of additional metabolic abnormalities characterize Ahcy-knockdown hemocytes, and the in vivo situation may not mimic the ex vivo situation. If adenosine (or inosine) measurements were possible in hemolymph, this would help bolster this idea. However, adenosine at least has a very short half-life.
We agree with the reviewer, and in the 4th paragraph of the Discussion we now discuss more extensively the limitations of our study in relation to ex vivo adenosine measurements and the importance of the SAM pathway on adenosine production.
Reviewer #2 (Public review):
Summary:
In this work, the authors wish to explore the metabolic support mechanisms enabling lamellocyte encapsulation, a critical antiparasitic immune response of insects. They show that S-adenosylmethionine metabolism is specifically important in this process through a combination of measurements of metabolite levels and genetic manipulations of this metabolic process.
Strengths:
The metabolite measurements and the functional analyses are generally very strong and clearly show that the metabolic process under study is important in lamellocyte immune function.
Weaknesses:
The gene expression data are a potential weakness. Not enough is explained about how the RNAseq experiments in Figures 2 and 4 were done, and the representation of the data is unclear.
The RNAseq data have already been described in detail in our previous paper (doi.org/10.1371/journal.pbio.3002299), but we agree with the reviewer that we should describe the necessary details again here. The replicate numbers for the RNAseq data have been added to the figure legends, the TPM values for the selected genes shown in the figures are in S1_Data, and a new S4_Data file with the complete RNAseq data (TPM and DESeq2) has been added to this revised version.
The paper would also be strengthened by the inclusion of some measure of encapsulation effectiveness: the authors show that manipulation of the S-adenosylmethionine pathway in lamellocytes affects the ability of the host to survive infection, but they do not show direct effects on the ability of the host to encapsulate wasp eggs.
The reviewer is correct that wasp egg encapsulation and host survival may be different (the host can encapsulate and kill the wasp egg and still not survive) and we should also include encapsulation efficiency. This is now added to Figure 3D, which shows that encapsulation efficiency is reduced upon Ahcy-RNAi, which is consistent with the reduced number of lamellocytes.
Reviewer #3 (Public review):
Summary:
The authors of this study provide evidence that Drosophila immune cells show upregulated SAM transmethylation pathway and adenosine recycling upon wasp infection. Blocking this pathway compromises the lamellocyte formation, developmental delay, and host survival, suggesting its physiological relevance.
Strengths:
Snapshot quantification of the metabolite pool does not provide evidence that the metabolic pathway is active or not. The authors use an ex vivo isotope labelling to precisely monitor the SAM and adenosine metabolism. During infection, the methionine metabolism and adenosine recycling are upregulated, which is necessary to support the immune reaction. By combining the genetic experiment, they successfully show that the pathway is activated in immune cells.
Weaknesses:
The authors knocked down Ahcy to prove the importance of SAM methylation pathway. However, Ahcy-RNAi produces a massive accumulation of SAH, in addition to blocking adenosine production. To further validate the phenotypic causality, it is necessary to manipulate other enzymes in the pathway, such as Sam-S, Cbs, SamDC, etc.
We are aware of this weakness and have addressed it in a much more detailed discussion of the limitations of our study in the 6th paragraph of the Discussion.
The authors do not demonstrate how infection stimulates the metabolic pathway given the gene expression of metabolic enzymes is not upregulated by infection stimulus.
Although the goal of this work was to test by 13C tracing whether the SAM pathway activity is upregulated, not to analyze how its activity is regulated, we certainly agree with the reviewer that an explanation of possible regulation, especially in the context of the enzyme expression we show, should be included in our work. Therefore, we have supplemented the data with methyltransferase expression (Figure 2-figure supplement 3 and S3_Data) and better describe the changes in expression of some SAM pathway genes, which also support stimulation of this pathway by changes in expression. The enzymes of the SAM transmethylation pathway are highly expressed in hemocytes, and it is known that the activity of this pathway is primarily regulated by (1) increased methionine supply to the cell and (2) the actual utilization of SAM by methyltransferases. Therefore, a possible increase in the activity of the SAM transmethylation pathway in our work is suggested by (1) increased expression of 4 transporters capable of transporting methionine, (2) decreased expression of AhcyL2 (a dominant-negative regulator of Ahcy) and (3) increased expression of 43 out of 200 methyltransferases. This has now been added to the first section of Results.
Recommendations for the authors:
Reviewing Editor Comments:
In the discussion with the reviewers, two points were underlined as very important:
(1) Knocking down Ahcy and other enzymes in the SAM methylation pathway may give very distinct phenotypes. Generalising the importance of "SAM methylation" from Ahcy-RNAi alone warrants some caution. The authors should be aware of this issue and probably mention it in the Discussion part.
We are aware of this weakness and have addressed it in a much more detailed discussion of the limitations of our study in the 6th paragraph of the Discussion.
(2) Sample sizes should be indicated in the Figure Legends. Replicate numbers on the RNAseq are important - were these expression levels/changes seen more than once?
Sample sizes are shown as scatter plots with individual values wherever possible, and all graphs are supplemented with the S1_Data table containing the raw data. The RNAseq data have already been described in detail in our previous paper (doi.org/10.1371/journal.pbio.3002299), but we agree with the reviewers that we should describe the necessary details again here. The replicate numbers for the RNAseq data have been added to the figure legends, the TPM values for the selected genes shown in the figures are in S1_Data, and a new S4_Data file with the complete RNAseq data (TPM and DESeq2) has been added to this revised version.
Reviewer #1 (Recommendations for the authors):
Major points:
(1) Please provide sample sizes in the legends rather than in a supplementary table.
Sample sizes are either shown as scatter plots with individual values or have now been added to the figure legends.
(2) More details in the methods section are needed:
For hemocyte counting, are sessile and circulating hemocytes measured?
We counted circulating hemocytes (upon infection, most sessile hemocytes are released into the circulation). While for metabolomics all hemocyte types were included, for hemocyte counting we were mainly interested in lamellocytes. Therefore, we counted them 20 hours after infection, when most of the lamellocytes from the first wave are fully differentiated but still mostly in circulation, as they are just starting to adhere to the wasp egg. This was added to the Methods section.
How were levels of methionine and adenosine used in ex vivo cultures selected? This is alluded to in lines 158-159, but no references are provided.
The concentrations are based on measurements of actual hemolymph concentrations in wild-type larvae in the case of methionine, and in the case of adenosine, we used a slightly higher concentration than measured in the adgf-a mutant to have a sufficiently high concentration to allow adenosine to flow into the hemocytes. This is now added to the Methods section.
Minor points:
Response to all minor points: Thank you, the errors have now been fixed.
(1) Line 186 - spell out MTA - 5-methylthioadenosine.
(2) Lines 196-212 (and elsewhere) - spelling out cystathionine rather than using the abbreviation CTH is recommended because the gene cystathionine gamma-lyase (Cth) is also discussed in this paragraph. Using the full name of the metabolite will reduce confusion.
We instead spelled out cystathionine γ-lyase in full, since it is used only three times, whereas CTH is used many more times, including in the figures.
(3) Figure 2 - supplement 2: please include scale bars.
(4) Line 303 - spelling error: "trabsmethylation" should be "transmethylation".
(5) Line 373 - spelling error: "higer" should be "higher".
Reviewer #2 (Recommendations for the authors):
For the RNAseq data, it's unclear whether the gene expression data in Figures 2 and 4 include biological replicates, so it's unclear how much weight we should place on them.
The replicate numbers for the RNAseq data have been added to the figure legends, the TPM values for the selected genes shown in the figures are in S1_Data, and a new S4_Data file with the complete RNAseq data (TPM and DESeq2) has been added to this revised version.
The representation of these data is also a weakness: Figure 2 shows measurements of transcripts per million, but we don't know what would be high or low expression on this scale.
We have added the actual TPM values for each cell in the RNAseq heatmaps in Figure 2, Figure 2-figure supplement 3, and Figure 4 to make them more readable. Although it is debatable what counts as high or low expression, to provide at least some basis for comparison, we have added to the figure legends the information that only 20% of the genes in the presented RNAseq data show expression higher than 15 TPM.
Figure 4 is intended to show expression changes with treatment, but expression changes should be shown on a log scale (so that increases and decreases in expression are shown symmetrically) and should be normalized to some standard level (such as uninfected lamellocytes).
The bars in Figure 4C,D show the fold change (this is now stated in the y-axis label) compared to 0 h (=uninfected) Adk3 samples. The reason for this visualization is that we wanted to show (1) the differences in levels between Adk3 and Adk2 and between Ak1 and Ak2, respectively, and at the same time (2) the differences between uninfected and infected Adk3 and Ak1. In our opinion, these fold change differences are also much more visible on a linear scale than on a log scale.
Reviewer #3 (Recommendations for the authors):
(1) It might be interesting to test how general this finding would be. How about Bacterial or fungal infection? The authors may also try genetic activation of immune pathways, e.g. Toll, Imd, JAK/STAT.
Although we would also like to support our results in different systems, we believe that our results are already strong enough to propose the final hypothesis and publish it as soon as possible so that it can be tested by other researchers in different systems and contexts than the Drosophila immune response.
(2) How does the metabolic pathway get activated? Enzyme activity? Transporters? Please test or at least discuss the possible mechanism.
The response is already provided above in the Reviewer #3 (Public review) section.
(3) The authors might test overexpression or genetic activation of the SAM transmethylation pathway.
Although we agree that this would potentially strengthen our study, it may not be easy to increase the activity of the SAM transmethylation pathway. Simply overexpressing the enzymes may not be enough, because the regulation occurs primarily through the utilization of SAM by methyltransferases, of which there are hundreds, affecting numerous processes.
(4) Supplementation of adenosine to the Ahcy-RNAi larvae would also support their conclusion.
Again, this is not an easy experiment: dietary supplementation would not work, and adenosine injected directly into the hemolymph would not last long enough because it would be quickly removed.
(5) It is interesting to test genetically the requirement of some transporters, especially for gb, which is upregulated upon infection.
Although this would be an interesting experiment, it is beyond the scope of this study; we did not aim to study the role of the SAM transmethylation pathway itself or its regulation, only its overall activity and its role in adenosine production.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
Wang et al. created a series of specific FLIM-FRET sensors to measure the activity of different Rab proteins in small cellular compartments. They apply the new sensors to monitor Rab activity in dendritic spines during induction of LTP. They find sustained (30 min) inactivation of Rab10 and transient (5 min) activation of Rab4 after glutamate uncaging in zero Mg. NMDAR function and CaMKII activation are required for these effects. Knockdown of Rab4 reduced spine volume change while knockdown of Rab10 boosted it and enhanced functional LTP (in KO mice). To test Rab effects on AMPA receptor exocytosis, the authors performed FRAP of fluorescently labeled GluA1 subunits in the plasma membrane. Within 2-3 min, new AMPARs appear on the surface via exocytosis. This process is accelerated by Rab10 knock-down and slowed by Rab4 knock-down. The authors conclude that CaMKII promotes AMPAR exocytosis by i) activating Rab4, the exocytosis driver and ii) inhibiting Rab10, possibly involved in AMPAR degradation.
Strengths:
The work is a technical tour de force, adding fundamental insights to our understanding of the crucial functions of different Rab proteins in promoting/preventing synaptic plasticity. The complexity of compartmentalized Ras signaling is poorly understood and this study makes substantial inroads. The new sensors are thoroughly characterized, seem to work very well, and will be quite useful for the neuroscience community and beyond (e.g. cancer research). The use of FLIM for read-out is compelling for precise activity measurements in rapidly expanding compartments (i.e., spines during LTP).
Thank you for the evaluation.
Weaknesses:
The interpretation of the FRAP experiments (Figure 5, Ext. Data Figure 13) is not straightforward as spine volume and surface area greatly expand during uncaging. I appreciate the correction for the added spine membrane shown in Extended Data Figure 14i, but shouldn't this be a correction factor (multiplication) derived from the volume increase instead of a subtraction?
We thank the reviewer for this question. The correction should be a subtraction based on the change in surface area, because SEP-GluA1 is fluorescent only on the cell surface, unlike cytosolic mCherry, whose fluorescence intensity is proportional to spine volume. Therefore, the overall fluorescence change (ΔF) should be the sum of the contribution from AMPAR trafficking (ΔF<sub>t</sub>) and the change in surface area (ΔS) multiplied by the remaining SEP-GluA1 fluorescence per unit area (f):
ΔF = ΔF<sub>t</sub> + fΔS
Since fluorescence immediately after photobleaching (before AMPAR trafficking happens), F<sub>o</sub>, is given by fS (S is the surface area of the spine):
ΔF/F<sub>o</sub> = ΔF<sub>t</sub>/F<sub>o</sub> + fΔS/fS
= ΔF<sub>t</sub>/fS + ΔS/S
Assuming that the surface area change (ΔS/S) is the volume change (ΔV/V) to the power of 2/3, the contribution of the AMPAR trafficking can be calculated as:
ΔF<sub>t</sub>/F<sub>o</sub> = ΔF/F<sub>o</sub> – (ΔV/V)<sup>2/3</sup>
This is why we subtracted the contribution of the spine surface area. We have discussed this in the updated Methods section.
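As an illustration only, the subtraction described above can be sketched as a short Python function. It takes the response's stated assumption at face value, namely that the fractional surface-area change equals (ΔV/V) raised to the power 2/3. The function name `trafficking_component` and its argument names are hypothetical and are not taken from the authors' analysis code.

```python
def trafficking_component(df_over_f0: float, dv_over_v: float) -> float:
    """Estimate the AMPAR-trafficking part of a SEP-GluA1 recovery signal.

    df_over_f0: total fluorescence change relative to the post-bleach value.
    dv_over_v:  fractional spine volume change (must be >= 0 in this sketch).
    """
    if dv_over_v < 0:
        # A fractional power of a negative number is not real-valued.
        raise ValueError("dv_over_v must be non-negative")
    # Surface-area term as assumed in the response: (dV/V)^(2/3).
    surface_term = dv_over_v ** (2.0 / 3.0)
    # Trafficking contribution = total change minus the surface-area term.
    return df_over_f0 - surface_term
```

Under this assumption, a spine whose volume doubles (ΔV/V = 1) and whose ΔF/F<sub>o</sub> is also 1 would have a trafficking contribution of 0, meaning the whole signal would be attributed to added membrane.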
Also, experiments were not conducted or analyzed blind, risking bias in the selection/exclusion of experiments for analysis. This reduces my confidence in the results.
We acknowledge the reviewer's concern regarding the lack of blinding in our experiments. However, it is challenging to conduct blinded experiments for certain types of studies, such as sensor screening for a protein family, where we do not have expected results or a specific hypothesis prior to the experiments. In these cases, our primary readout is whether the sensor indicates any activity change upon stimulation.
To address this concern, after identifying that Rab10 is inactivated during structural LTP (sLTP) and is likely important for inhibiting spine structural LTP, we performed blinded electrophysiology experiments and obtained similar results (deletion of Rab10 from Camk2a-positive neurons leads to enhanced LTP; Fig. 4k, 4l).
Reviewer #2 (Public review):
Summary:
Wang et al. developed a set of optical sensors to monitor Rab protein activity. Their investigation into Rab activity in dendritic spines during structural long-term plasticity (sLTP) revealed sustained Rab10 inactivation (>30min) and transient Rab4 activation (~5 min). Through pharmacological and genetic manipulation to constitutively activate or inhibit Rab proteins, they found that Rab10 negatively regulates sLTP and AMPA receptor insertion, while Rab4 positively influences sLTP but only in the transient phase. The optical sensors provide new tools for studying Rab activity in cells and neurobiology. However, a full understanding of the timing of Rab activity will require a detailed characterization of sensor kinetics.
Strengths:
(1) Introduction of a series of novel sensors that can address numerous questions in Rab biology.
(2) Multiple methods to manipulate Rab proteins to reveal the roles of Rab10 and Rab4 in LTP.
(3) Discovery of Rab4 activation and Rab10 inhibition with different kinetics during sLTP, correlating with their functional roles in the transient (Rab4) and both transient and sustained (Rab10) phases of sLTP.
Thank you for the positive evaluation.
Weaknesses:
(1) Lack of characterization of sensor kinetics, making it difficult to determine if the observed Rab kinetics during sLTP were due to sensor behavior or actual Rab activity.
We estimate that the kinetics of the Rab4 and Rab10 sensors are within a few minutes. For Rab4, we observed a rapid increase and decrease in activation in response to glutamate uncaging; this time course sets the upper limit of the ON/OFF time constants of the Rab4 sensor. For Rab10, we observed rapid dissociation of the sensor within ~1 min of sLTP induction, which means that the donor and acceptor molecules dissociate quickly during the process. Thus, the off kinetics of the sensor are on the order of a minute. Meanwhile, the on kinetics, inferred from Rab10 activation (donor/acceptor association) in response to NMDA application, are again within a few minutes. Given these rapid sensor kinetics in neurons, our observation of sustained inactivation of Rab10 should reflect the true behavior of Rab10 rather than just the sensor’s response.
We revised the Discussion section of our manuscript as follows:
“Understanding the kinetics of Rab4 and Rab10 sensors is essential for interpreting their actual activity during sLTP. The Rab4 sensor exhibits a rapid rise and fall in activation (Fig. 3), indicating ON/OFF times of less than a few minutes. In contrast, the Rab10 sensor rapidly dissociates during sLTP induction (Fig. 2), with OFF kinetics occurring within one minute and fast ON kinetics in response to NMDA (Fig. 1j). Given these rapid kinetics, the observed sustained inactivation of Rab10 likely reflects its true behavior rather than sensor dynamics.”
(2) It is crucial to assess whether the overexpression of Rab proteins as reporters, affects Rab activity and cellular structure and physiology (e.g. spine number and size).
While we did not measure the effects of Rab sensor overexpression on Rab activity or cellular structure and physiology, we showed that sLTP is similar in neurons expressing sensors. This suggests that the overexpression of Rab sensors does not significantly disrupt signaling required for sLTP.
(3) The paper does not explain the apparently different results between NMDA receptor activation and glutamate uncaging. NMDA receptor activation increased Rab10 activity, while glutamate uncaging decreased it. NMDA receptor activation resulted in sustained Rab4 activation, whereas glutamate uncaging caused only brief activation of about 5 minutes. A potential explanation, ideally supported by data, is needed.
It is a long-standing question in the field why simple NMDA receptor activation by bath application of NMDA does not induce LTP but instead induces LTD. Rab proteins are regulated by many GEFs and GAPs, and identifying the different mechanisms would require completely different techniques, such as molecular screening. While our manuscript provides some insight into this question by showing that the two stimulation protocols provide opposing signals for Rab10, we believe that identifying the exact mechanisms is beyond the scope of this manuscript.
(4) There is a discrepancy between spine phenotype and sLTP potential with Rab10 perturbation. Rab10 perturbation affected spine density but not size, suggesting a role in spinogenesis rather than sLTP. However, glutamate uncaging affected sLTP, and spinogenesis was not examined. Explaining the discrepancy between spine size and sLTP potential is necessary. Exploring spinogenesis with glutamate uncaging would strengthen these results. Additionally, Figure 4j shows no change in synaptic transmission with Rab10 knockout, despite an increase in spine density. An explanation, ideally supported by data, is needed for the unchanged fEPSP slope despite an increase in spine density.
We thank the reviewer for raising these important questions. In our findings, shRNA-mediated knockdown of Rab10 did not alter spine size but did increase spine density in the basal state (Extended Data Fig. 11i). This suggests that Rab10 may restrict spinogenesis without affecting spine size. Conversely, sLTP induction via glutamate uncaging is an activity-dependent process that may involve different molecular mechanisms. The interplay between spinogenesis and sLTP, and the exact roles of Rab signaling in different modalities of plasticity, remain to be determined in future studies.
The lack of change in synaptic transmission with Rab10 knockout, despite the increase in spine density from Rab10 shRNA knockdown, may be due to different preparation and developmental stages: spine density measurements were conducted with shRNA knockdown in organotypic slices (sliced at P6-8, DIV 9-13), while electrophysiological recordings were performed in knockout mice in acute slices from adult animals (P30-60).
(5) Spine volume was imaged using acceptor fluorophores (mCherry, or mCherry/Venus) at 920nm, where the two-photon cross-section of mCherry is minimal. 920nm was also used to excite the donor fluorophore, hence the spine volume measurement based on total red channel fluorescence is the sum of minimal mCherry fluorescence from direct 920nm excitation, bleed-through from the green channel, and FRET. This confounded measurement requires correction and clarification.
We assumed that most of the fluorescence comes from direct excitation of mCherry at 920 nm. The contributions from bleed-through from mEGFP-Rab (~3%) and from FRET changes (~20%) may influence the volume measurements. However, since we observed similar fluorescence changes in the green and red channels, these factors would have only a minor impact on our results (Extended Data Fig. 6a, 6d). Also, please note that the volume change in neurons expressing sensors was measured only to check that the volume change is normal, and it is not a major point of this manuscript. We clarified this in the Methods section as:
“For the sensor experiments, we used mCherry as a volume indicator. We acknowledge that contributions from bleed-through from mEGFP-Rab (approximately 3%) and FRET changes (around 20%) could affect the volume measurements. However, since we observed similar fluorescence changes in both the green and red channels, we believe these factors have a minimal impact on our results (Extended Data Fig. 6a, 6d).”
Reviewer #3 (Public review):
Summary:
This study examines the roles of Rab10 and Rab4 proteins in structural long-term potentiation (sLTP) and AMPA receptor (AMPAR) trafficking in hippocampal dendritic spines using various different methods and organotypic slice cultures as the biological model.
The paper shows that Rab10 inactivation enhances AMPAR insertion and dendritic spine head volume increase during sLTP, while Rab4 supports the initial stages of these processes. The key contribution of this study is identifying Rab10 inactivation as a previously unknown facilitator of AMPAR insertion and spine growth, acting as a brake on sLTP when active. Rab4 and Rab10 seem to be playing opposing roles, suggesting a somewhat coordinated mechanism that precisely controls synaptic potentiation, with Rab4 facilitating early changes and Rab10 restricting the extent and timing of synaptic strengthening.
Strengths:
The study combines multiple techniques such as FRET/FLIM imaging, pharmacology, genetic manipulations, and electrophysiology to dissect the roles of Rab10 and Rab4 in sLTP. The authors developed highly sensitive FRET/FLIM-based sensors to monitor Rab protein activity in single dendritic spines. This allowed them to study the spatiotemporal dynamics of Rab10 and Rab4 activity during glutamate uncaging-induced sLTP. They also developed various controls to ensure the specificity of their observations. For example, they used a false acceptor sensor to verify the specificity of the Rab10 sensor response.
This study reveals previously unknown roles for Rab10 and Rab4 in synaptic plasticity, showing their opposing functions in regulating AMPAR trafficking and spine structural plasticity during LTP.
Thank you for the positive evaluation.
Weaknesses:
In sLTP, the initial volume of stimulated spines is an important determinant of induced plasticity. To address changes in initial volume and those induced by uncaging, the authors present Extended Data Figure 2. In my view, the methods of fitting, sample selection, or both may pose significant limitations for interpreting the overall results. While the initial spine size distribution for Rab10 experiments spans ~0.1-0.4 fL (with an unusually large single spine at the upper end), Rab4 spine distribution spans a broader range of ~0.1-0.9 fL. If the authors applied initial size-matched data selection or used polynomials rather than linear fitting, panels a, b, e, f, and g might display a different pattern. In that case, clustering analysis based on initial size may be necessary to enable a fair comparison between groups not only for this figure but also for main Figures 2 and 3.
We thank the reviewer for these questions. For sensor uncaging experiments, we usually uncaged glutamate at large mushroom spines because we need a good signal-to-noise ratio. We happened to choose spines with different initial sizes for the Rab4 and Rab10 sensor uncaging experiments.
Another limitation is the absence of in vivo validation, as the experiments were performed in organotypic hippocampal slices, which may not fully replicate the complexity of synaptic plasticity in an intact brain, where excitatory and inhibitory processes occur concurrently. High concentrations of MNI-glutamate (4 mM in this study) are known to block GABAergic responses due to its antagonistic effect on GABA-A receptors, thereby precluding the study of inhibitory network activity or connectivity [1], which is already known to be altered in organotypic slice cultures.
(1) https://www.frontiersin.org/journals/neural-circuits/articles/10.3389/neuro.04.002.2009/full
We appreciate the reviewer's comments and would like to clarify that we have conducted experiments in acute slices for LTP using conditional Rab10 knockout (Fig. 4k, 4l), and we obtained similar results. Additionally, we have recently published findings on the behavioral deficits observed in heterozygous Rab10 knockout mice (PubMed 37156612). These studies further support our conclusions and provide additional context for our findings.
Recommendations for the authors:
From the Senior/Reviewing Editor:
I apologize that this took longer than intended. As you will see from the reviews there was some disagreement on several points. There was some disagreement among reviewers as to the strength of the evidence with some characterizing it as "compelling," "convincing," or "solid" while others felt the characterization of the sensors was "incomplete" and that this could have affected some of the conclusions. After extensive discussion, reviewers agreed that there was a valid concern that the conclusion that Rab10 activation is sustained could reflect a feature of the sensor. If Rab10/RBD dissociation rate were very low, and the affinity of binding were very high, this could lead to an incorrect estimate of the sustained binding due to sensor kinetics, not Rab10 activation. It was noted that this has been seen in other sensors previously (e.g. first generation PKA activity sensors), which the developers altered in later generations to increase reversibility and off kinetics of the sensor.
There was also discussion of how this might be addressed and we would be interested in your comments on this issue. It was suggested that it might be helpful to revise Figure 2b to show binding fraction dynamics separately for each spine (to determine whether any actually return to baseline). Subsequently, clustering of these binding dynamics into two groups could be summarized in a version of Fig. 2e for each cluster. Differences in spine volume dynamics between these clusters would provide a measure of how strongly Rab10 binding correlates with spine volume. If they never go back to baseline, some extra experiments with longer post-plasticity induction (150mins instead of 35), might show if any reversible Rab10 binding exists post-LTP induction.
An alternative suggestion was to measure the time course in the presence of a GAP or GEF, which should alter the kinetics.
Thanks for the comments. It is important to note that the inactivation is observed as the dissociation of the donor and acceptor of the sensor. Thus, the fact that the sensor signal decreases rapidly in response to uncaging means that the sensor has rapid off kinetics. In addition, we provide evidence of a rapid increase in Rab10 activity in response to NMDA application, suggesting that the on kinetics are also rapid. We added a discussion of this to the revised manuscript as:
“Understanding the kinetics of Rab4 and Rab10 sensors is essential for interpreting their actual activity during sLTP. The Rab4 sensor exhibits a rapid rise and fall in activation (Fig. 3), indicating ON/OFF times of just a few minutes. In contrast, the Rab10 sensor rapidly dissociates during sLTP induction (Fig. 2), with OFF kinetics occurring within one minute and fast ON kinetics in response to NMDA (Fig. 1j). Given these rapid kinetics, the observed sustained inactivation of Rab10 likely reflects its true behavior rather than sensor dynamics.”
There was also further discussion of the nature of the "spine volume" signal, given the fact that the two-photon cross-section of mCherry is minimal at 920nm. It was suggested that this could be due to direct acceptor excitation rather than FRET, but there was agreement that further clarity on this issue would be valuable.
We assumed that most of the fluorescence comes from direct excitation of mCherry at 920 nm. The contributions from bleed-through from mEGFP-Rab (~3%) and from FRET changes (~20%) may influence the volume measurements. However, since we observed similar fluorescence changes in the green and red channels, these factors would have only a minor impact on our results (Extended Data Fig. 6a, 6d). Also, please note that the volume change in neurons expressing sensors was measured only to check that the volume change is normal, and it is not a major point of this manuscript. We clarified this in the Methods section as:
“For the sensor experiments, we used mCherry as a volume indicator. We acknowledge that contributions from bleed-through from mEGFP-Rab (approximately 3%) and FRET changes (around 20%) could affect the volume measurements. However, since we observed similar fluorescence changes in both the green and red channels, we believe these factors have a minimal impact on our results (Extended Data Fig. 6a, 6d).”
The equations in the methods section differ from other papers by the same lab (e.g. Laviv et al, Neuron 2020, Tu et al. Sci Adv. 2023, Jain et al. Nature 2024). Please clarify which equations are correct.
Thanks for pointing this out. Some of the equations in this manuscript were indeed wrong, and we have corrected them in the Methods section.
Reviewer #1 (Recommendations for the authors):
The effects of Rab knockdown affect both spine volume expansion and AMPAR recovery in a very similar fashion. To explain this tight coupling, the authors suggest that the availability of membrane could be a limiting factor for spine enlargement. However, some Rabs are known to affect actin dynamics, which could also explain the dual effects on AMPAR exocytosis and spine enlargement. It is not easy to come up with an experiment to differentiate between these alternative explanations, as blocking actin polymerization would likely affect exocytosis, too. The authors should consider/discuss the possibility that all of the observed Ras effects result from altered actin dynamics and that the lipid bilayer is sufficiently fluid to form a minimal surface around the expanding cytoskeleton.
Thanks for the suggestions. We included the discussion about the potential impact on the actin cytoskeleton by Rab10.
Typos: heterougenous, compartmantalization, chemaical, ballistically/biolistically (chose one).
Thanks for pointing out these typos. We have corrected them in the revised manuscript.
Reviewer #2 (Recommendations for the authors):
(1) Venus shows pH sensitivity, which can be significant at synapses due to pH changes. Characterizing the pH sensitivity of the sensors is essential.
Thanks for the suggestions. We did not measure pH dependence, but the pKa values of these fluorophores have already been published. The pKa values of EGFP and Venus are both 6.0, so it is unlikely that pH changes influenced our measurements.
(2) Presenting individual data points within all bar graphs (e.g. Fig. 2c, 2d) would enhance data transparency.
Thanks for the suggestions. We now provide individual data points in the revised main figures.
(3) In Figure 1f: Rab5 GAP expression increased the binding fraction against expectations. In addition, clarifying the color scheme in Figure 1 is needed. Are GAPs supposed to be blue/green, and GEFs red/orange? Figure 1f seems to contradict this color scheme.
Thanks for the suggestions. We clarified these issues.
(4) Quantification of the point spread function of the uncaging laser, response/settle time of the scan mirror during uncaging, and reason for changes in neighboring spines in many example images (e.g. Figure 2a, especially at 240 s; Figure 4a) would be important.
The laser is controlled by Pockels cells, which change the laser intensity with microsecond resolution. The laser is parked for milliseconds during uncaging, much longer than the settling time of the mirror (~0.1 milliseconds). The point spread function of the uncaging laser is diffraction-limited (~0.5 µm). The uncaging spot size is mostly limited by the diffusion of uncaged glutamate, but our calcium imaging and CaMKII imaging show that the signaling is induced mostly in the stimulated spines (Lee et al., 2009; Chang et al., 2017, 2019).
(5) Please include traces for "false" sensors in stimulated spines in Figures 2b, 2e, 3b, and 3e.
The traces for the false sensors have been presented in Extended Data Fig. 3 and Extended Data Fig. 8.
(6) The traces in Figure 4k (fEPSP slope in response to theta burst stimulation, where there is a decrease in fEPSP slope followed by a gradual increase) differ from prior publications (e.g. PMID: 1359925, 3967730, 19144965, 20016099). An investigation and explanation for these differences are necessary.
We appreciate the reviewer’s comments. We performed the experiments blindly and did not try to find a condition providing control data similar to previous publications. The variations in fEPSP responses compared to prior publications may be attributed to several factors, including differences in experimental conditions such as the genetic background of the animals used, the specific protocols for theta burst stimulation, and variations in the preparation of the hippocampal slices.
(7) The title and text state that Rab10 inactivation promotes AMPAR insertion. It is unclear if this is a direct effect on AMPAR insertion or an indirect effect through membrane remodeling. Providing data to distinguish these possibilities or adjusting the title/text to reflect alternative interpretations would be beneficial.
We appreciate the reviewer's feedback. To clarify, we have revised our terminology to use "AMPAR trafficking" instead of "AMPAR insertion", as it includes both insertion and other mechanisms of AMPAR movement within the cell.
(8) Please provide an explanation for the initial Rab10 inactivation observed in Figure 1j upon NMDA application.
The application of NMDA in Fig. 1j is similar to the commonly used chemical LTD induction protocol. We used this broad stimulation approach to test whether our sensors could report Rab activity changes in neurons upon strong stimulation. However, it is an entirely different stimulation approach from the sLTP induction protocol, thus resulting in different sensor activity changes. We describe the phenomenon in the revised manuscript, but we believe that detailed analyses of Rab10 activation in response to NMDA application are beyond the scope of this manuscript.
(9) Please explain why the study focuses on Rab4 and Rab10 instead of other Rab proteins.
During our initial screening of sensors for various Rab proteins, we observed significant activity changes in the sensors for Rab4 and Rab10 upon sLTP induction. This suggested their potential relevance in synaptic processes, leading us to focus on understanding their specific roles in structural long-term potentiation.
Reviewer #3 (Recommendations for the authors):
(1) Although it might seem trivial, the definition of adjacent spine has not been made in the text. It would be nice to have it in the Methods section.
We included it in the Methods section as follows:
"The adjacent spine refers to the first or second spine located next to the stimulated spine, typically positioned opposite the stimulated spine. Additionally, the size of the adjacent spine must be sufficiently large for imaging."
(2) The transfection method has been mentioned as "ballistic" and "biolistic" transfection. You might want to use only one term. Additionally, you can add the equipment used (Bio-rad?) and pressure (psi) in the Methods section.
We use “biolistic” throughout the manuscript now. We also added the equipment and conditions used.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Neuronal activity spatiotemporal fine-tuning of cerebral blood flow balances metabolic demands of changing neuronal activity with blood supply. Several 'feed-forward' mechanisms have been described that contribute to activity-dependent vasodilation as well as vasoconstriction leading to a reduction in perfusion. Involved messengers are ionic (K+), gaseous (NO), peptides (e.g., NPY, VIP), and other messengers (PGE2, GABA, glutamate, norepinephrine) that target endothelial cells, smooth muscle cells, or pericytes. Contributions of the respective signaling pathways likely vary across brain regions or even within specific brain regions (e.g., across the cortex) and are likely influenced by the brain's physiological state (resting, active, sleeping) or pathological departures from normal physiology.
The manuscript "Elevated pyramidal cell firing orchestrates arteriolar vasoconstriction through COX-2-derived prostaglandin E2 signaling" by B. Le Gac, et al. investigates mechanisms leading to activity-dependent arteriole constriction. Here, mainly working in brain slices from mice expressing channelrhodopsin 2 (ChR2) in all excitatory neurons (Emx1-Cre; Ai32 mice), the authors show that strong optogenetic stimulation of cortical pyramidal neurons leads to constriction that is mediated through the cyclooxygenase-2 / prostaglandin E2 / EP1 and EP3 receptor pathway with contribution of NPY-releasing interneurons and astrocytes releasing 20-HETE. Specifically, using a patch clamp, the authors show that 10-s optogenetic stimulation at 10 and 20 Hz leads to vasoconstriction (Figure 1), in line with a stimulation frequency-dependent increase in somatic calcium (Figure 2). The vascular effects were abolished in the presence of TTX and significantly reduced in the presence of glutamate receptor antagonists (Figure 3). The authors further show with RT-PCR on RNA isolated from patched cells that ~50% of analyzed cells express COX-1 or -2 and other enzymes required to produce PGE2 or PGF2a (Figure 4). Further, blockade of COX-1 and -2 (indomethacin), or COX-2 (NS-398) abolishes constriction. In animals with chronic cranial windows that were anesthetized with ketamine and medetomidine, 10-s long optogenetic stimulation at 10 Hz leads to considerable constriction, which is reduced in the presence of indomethacin. Blockade of EP1 and EP3 receptors leads to a significant reduction of the constriction in slices (Figure 5). Finally, the authors show that blockade of 20-HETE synthesis caused moderate and NPY Y1 receptor blockade a complete reduction of constriction.
The mechanistic analysis of neurovascular coupling mechanisms as exemplified here will guide further in-vivo studies and has important implications for human neuroimaging in health and disease. Most of the data in this manuscript uses brain slices as an experimental model which contrasts with neurovascular imaging studies performed in awake (head-fixed) animals. However, the slice preparation allows for patch clamp as well as easy drug application and removal. Further, the authors discuss their results in view of differences between brain slices and in vivo observations, including the absence of vascular tone as well as blood perfusion required for metabolite (e.g., PGE2) removal, and the presence of network effects in the intact brain. The manuscript and figures present the data clearly; regarding the presented mechanism, the data supports the authors' conclusions.
We thank the reviewer for his/her supportive comments as well as for pointing out pros and cons of the brain slice preparation.
Some of the data was generated in vivo in head-fixed animals under anesthesia; in this regard, the authors should revise the introduction and discussion to include the important distinction between studies performed in slices, or in acute or chronic in-vivo preparations under anesthesia (reduced network activity and reduced or blockade of neuromodulation, or in awake animals (virtually undisturbed network and neuromodulatory activity).
We have now added a paragraph in the introduction (lines 52-64) to highlight the distinction between ex vivo and in vivo models. We now also discuss that anesthetized animals exhibit slower NVC (lines 308-309).
Further, while discussed to some extent, the authors could improve their manuscript by more clearly stating if they expect the described mechanism to contribute to CBF regulation under 'resting state conditions' (i.e., in the absence of any stimulus), during short or sustained (e.g., visual, tactile) stimulation, or if this mechanism is mainly relevant under pathological conditions; especially in the context of the optogenetic stimulation paradigm being used (10-s long stimulation of many pyramidal neurons at moderate-high frequencies) and the fact that constriction leading to undersupply in response to strongly increased neuronal activity seems counterintuitive?
We now discuss more extensively the physiological relevance (lines 422-434 and 436-439) and the conditions where the described mechanisms of neurogenic vasoconstriction may occur.
We agree with the reviewer that vasoconstriction in response to a large increase in neuronal activity is counterintuitive as it leads to undersupply despite an increased energy demand. We now discuss its potential physio/pathological role in attenuating neuronal activity by reducing energy supply (lines 453-464).
Reviewer #2 (Public review):
Summary:
The present study by Le Gac et al. investigates the vasoconstriction of cerebral arteries during neurovascular coupling. It proposes that pyramidal neurons firing at high frequency lead to prostaglandin E2 (PGE2) release and activation of arteriolar EP1 and EP3 receptors, causing smooth muscle cell contraction. The authors further claim that interneurons and astrocytes also contribute to vasoconstriction via neuropeptide Y (NPY) and 20-hydroxyeicosatetraenoic acid (20-HETE) release, respectively. The study mainly uses brain slices and pharmacological tools in combination with Emx1-Cre; Ai32 transgenic mice expressing the H134R variant of channelrhodopsin-2 (ChR2) in the cortical glutamatergic neurons for precise photoactivation. Stimulation with 470 nm light using 10-second trains of 5-ms pulses at frequencies from 1-20 Hz revealed small constrictions at 10 Hz and robust constrictions at 20 Hz, which were abolished by TTX and partially inhibited by a cocktail of glutamate receptor antagonists. Inhibition of cyclooxygenase-1 (COX-1) or -2 (COX-2) by indomethacin blocked the constriction both ex vivo (slices) and in vivo (pial artery), and inhibition of EP1 and EP3 showed the same effect ex vivo. Single-cell RT-PCR from patched neurons confirmed the presence of the PGE2 synthesis pathway.
While the data are convincing, the overall experimental setting presents some limitations. How is the activation protocol comparable to physiological firing frequency?
As also suggested by Reviewer #1 we have now discussed more extensively the physiological relevance of our observations (lines 422-434 and 436-439).
The delay (minutes) between the stimulation and the constriction appears contradictory to the proposed pathway, which would be expected to occur rapidly. The experiments are conducted in the absence of vascular "tone," which further questions the significance of the findings.
The slow kinetics observed ex vivo are probably due to the low recording temperature and the absence of pharmacologically induced vascular tone, as already discussed (lines 312-317). Furthermore, as recommended by reviewer #1, we have presented the advantages and limitations of ex vivo and in vivo approaches (lines 52-64).
Some of the targets investigated are expressed by multiple cell types, which makes the interpretation difficult; for example, cyclooxygenases are also expressed by endothelial cells.
Under normal conditions, endothelial cells express COX-1 but barely any COX-2, whose expression is essentially observed in pyramidal cells (see Tasic et al. 2016, Zeisel et al. 2015, Lacroix et al., 2015). As pointed out by Reviewer #1, our ex vivo pharmacological data clearly indicate that vasoconstriction is mostly due to COX-2 activity, and to a much lesser extent to COX-1. Since it is well established that the previously described vascular effects of pyramidal cells are essentially mediated by COX-2 activity (Iadecola et al., 2000; Lecrux et al., 2011; Lacroix et al., 2015), we are quite confident that the vasoconstriction described here is mainly due to COX-2 activity of pyramidal cells.
Finally, how is the complete inhibition of the constriction by the NPY Y1 receptor antagonist BIBP3226 consistent with a direct effect of PGE2 and 20-HETE in arterioles?
We agree with both reviewers that the complete blockade of the constriction by the NPY Y1 receptor antagonist BIBP3226 needs to be more carefully discussed. We have now included in the discussion the possible involvement of Y1 receptors in pyramidal cells, which could promote glutamate release and possibly COX-2, thereby contributing to PGE2 and 20-HETE signaling (lines 402-409).
Overall, the manuscript is well-written with clear data, but the interpretation and physiological relevance have some limitations. However, vasoconstriction is a rather understudied phenomenon in neurovascular coupling, and the present findings may be of significance in the context of pathological brain hypoperfusion.
We thank the reviewer for his/her comment and suggestions, which have helped us to improve our manuscript.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Methods:
It is not clear if brain slices (or animals) underwent one, two, or several optogenetic stimulations - especially for experiments where 'control' is compared to 'treated' - does this data come from the same vessels (before and after treatment) or from two independent groups of vessels? If repeated stimulations are performed, do these repeated stimulations cause the same vascular response?
As indicated in the Materials and Methods section, line 543: “Only one arteriole was monitored per slice” implies that the comparisons between the ‘control’ and ‘treated’ groups were made from independent groups of vessels. To clarify this point, we have added “receiving a single optogenetic or pharmacological stimulation” to this sentence (lines 543-544).
For in vivo experiments, animals underwent 10-20 optogenetic stimulations with a 5-minute interstimulus interval during an experiment lasting 2 hours at most. Trials from the same vessel were averaged (with a 0.1 s interpolation) for analysis, and the mean per vessel is presented in the graphs.
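The per-vessel averaging described above can be pictured with the following minimal sketch. It assumes each trial is a pair of lists (sample times in seconds, vessel diameters) recorded on its own time base; every trial is resampled onto a common 0.1 s grid by linear interpolation and the trials are then averaged point by point. The function names, the linear interpolation and the 40 s default window (the monitoring period mentioned elsewhere in this response) are illustrative assumptions, not the authors' analysis code.

```python
from bisect import bisect_right


def interp_linear(t_new, t, y):
    """Linearly interpolate samples (t, y) at the times in t_new.

    t must be strictly increasing; values outside [t[0], t[-1]] are clamped
    to the first or last sample.
    """
    out = []
    for x in t_new:
        if x <= t[0]:
            out.append(y[0])
        elif x >= t[-1]:
            out.append(y[-1])
        else:
            i = bisect_right(t, x)  # t[i - 1] <= x < t[i]
            frac = (x - t[i - 1]) / (t[i] - t[i - 1])
            out.append(y[i - 1] + frac * (y[i] - y[i - 1]))
    return out


def average_trials(trials, duration_s=40.0, step_s=0.1):
    """Resample each (times, diameters) trial on a common grid and average.

    Returns the common time grid and the mean diameter trace for one vessel.
    """
    n_points = int(round(duration_s / step_s)) + 1
    grid = [k * step_s for k in range(n_points)]
    resampled = [interp_linear(grid, t, y) for t, y in trials]
    mean_trace = [sum(col) / len(col) for col in zip(*resampled)]
    return grid, mean_trace


# Hypothetical example: two trials from the same vessel, sampled differently.
trial_a = ([0.0, 40.0], [100.0, 100.0])            # flat trace
trial_b = ([0.0, 20.0, 40.0], [100.0, 80.0, 100.0])  # transient constriction
grid, mean_trace = average_trials([trial_a, trial_b])
```

Averaging on a shared grid keeps trials with slightly different sampling times comparable, so that one mean trace per vessel enters the group statistics.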
Figure 2:
Can the authors speculate about the cause for the slow increase in indicator fluorescence from minute 1.5 onward, which seems dependent on stimulation frequency? Is this increase also present when slices from a ChR2-negative animal undergo the same stimulation paradigm?
Rhod2 was delivered by the patch pipette as indicated in the Materials and Methods section (line 514). Although a period of “at least 15 min after passing in whole-cell configuration to allow for somatic diffusion of the dye” (lines 551-552) was observed, this single-wavelength Ca2+ indicator likely continued to diffuse into the cells during the optical recording, thereby inducing a slight increase in delta F/F0. This is consistent with the positive slopes of the mean fluorescence changes observed during the 30-s control baseline (Fig. 2b).
Figure 4: Why did the authors include panel a) here? Also, do the authors observe that cells with different COX-1 or -2 expression profiles show different (electrical, morphological) properties?
The purpose of panel a) in Fig. 4 was to confirm the regular spiking electrophysiological phenotype of the pyramidal neurons whose cytoplasm was harvested for subsequent RT-PCR analysis. Despite our efforts, we found no difference in the 32 electrophysiological features between COX-1 or COX-2 positive and negative cells. This is now clearly stated in the Results section (lines 210-212) and a supplementary table of electrophysiological features is now provided. Because it is difficult to determine the morphology of neurons analyzed by single-cell RT-PCR (Devienne et al. 2018), these cells were not processed for biocytin labeling.
Figure 5: (1) Maybe the authors could highlight panels b-f as in vivo experiments to emphasize that these are in-vivo observations while the other experiments (especially panels g, h) are made in slices?
We thank the reviewer for this suggestion. A black frame is now depicted in Figure 5 to emphasize in vivo experiments.
(2) What is the power of the optogenetic stimulus in this experiment?
The power of the optogenetic stimulus was 38 mW/mm<sup>2</sup> in ex vivo experiments (see Line 527). For in vivo experiments, 1 mW pulses of 5 ms were used, the intensity being measured at the fiber end. We now provide the information for in vivo experiments in the Methods lines 639-640.
(3) Experiments were performed with Fluorescein-Dextran at 920-nm excitation which would overlap with EYFP fluorescence from the ChR2-EYFP transgene. Did the authors encounter any issues with crosstalk between the two labels?
Crosstalk between EYFP and fluorescein fluorescence was indeed an issue. This is why arterioles were monitored at the pial level to avoid fluorescence contamination from the cortical parenchyma. Because of the perivascular space around pial arterioles, it was possible to measure vessel diameter without contamination from the parenchyma (see Author response image 1 below). To clarify this point we added the statement “which are not compromised by the fluorescence from the ChR2-EYFP transgene in the parenchyma (Madisen et al. 2012),” (lines 628-629). Note that line scan acquisitions without photoactivation stimulation did not trigger any progressive change in the vessel size or resting fluorescence.
Author response image 1.
Example of a pial arteriole filled with fluorescein dextran (cyan) in an Emx1-EYFP mouse (parenchyma labeled with YFP, in cyan). The red line represents a line scan to record the change in diameter. Due to the perivascular space surrounding the arterioles, the vessel walls are clearly identified and separated from the fluorescent parenchyma.
(4) Could the authors potentially extend the time course in panel e) to show the recovery of the preparation to the baseline?
Because arterioles were only monitored for a 40-s period during a session of optogenetic stimulation/imaging, we cannot extend panel e. Nonetheless, a 5-minute interstimulus interval was observed to allow the full recovery of the preparation to the baseline. This is now clarified (line 640). Of note, the arteriole shown in panel d before indomethacin treatment fully recovered to baseline after this treatment.
(5) Also, did the authors observe any 'abnormal' behavior of the vasculature after stimulation, such as large-amplitude oscillations?
We did not specifically investigate resting state oscillations, such as vasomotion, but the 10-s long baseline recording for each measurement indicates no long-lasting, abnormal and de novo behavior with a frequency higher than 0.1-0.2 Hz.
Can the authors show in vivo data from control experiments in EYFP-expressing or WT mice that underwent the same stimulation paradigm (Supplementary Figure 1 shows data from brain slices)?
The reviewer is correct to point out this important control, as optogenetic stimulation can induce a vascular response without channelrhodopsin activation at high power (see our study on the topic, Rungta et al, Nat Com 2017). We therefore tested this potential artefact in a WT mouse using our setup, with different intensities and durations of optogenetic stimulation.
Author response image 2A shows that stimulations of 10 seconds, 10 Hz, 1 mW, 5 ms pulses, i.e. the conditions we used for the experiments in Emx1 mice, did not induce dilation or constriction. Stimulation for 5 seconds with the same number of pulses, but with a higher power (4 mW), longer duration (20 ms pulses) and at a higher frequency elicited a small dilation in 1 of 2 pial arterioles (Author response image 2B). For this reason, we used only shorter (5 ms) and less intense (1 mW) optogenetic stimulation to ensure that the observed dilation was solely due to Emx1 activation and not to light-induced artefactual dilation.
Author response image 2.
Optogenetic stimulation in a wild-type mouse. A. No diameter changes upon stimulations of 10 seconds, 10 Hz, 1 mW, 5 ms pulses, i.e. the conditions we used for the experiments in Emx1 mice. B. Stimulation of higher power (4 mW), longer duration (20 ms pulses) and at a higher frequency elicited a small dilation in 1 (grey traces) of 2 pial arterioles.
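To make the difference between the two control protocols above concrete, the short sketch below works out pulse count, duty cycle and time-averaged light power for each. The 20 Hz value for the stronger protocol is not stated in the text; it is inferred here from "the same number of pulses" delivered in 5 s instead of 10 s, and should be read as an assumption. The helper names are hypothetical, and the average-power figure is simple arithmetic (peak power times duty cycle), not a measurement reported by the authors.

```python
def pulse_count(duration_s, freq_hz):
    """Number of light pulses in a train of given duration and frequency."""
    return round(duration_s * freq_hz)


def duty_cycle(freq_hz, pulse_ms):
    """Fraction of time the light is on during the train."""
    return freq_hz * pulse_ms / 1000.0


def mean_power_mw(peak_mw, freq_hz, pulse_ms):
    """Time-averaged power during the train (peak power x duty cycle)."""
    return peak_mw * duty_cycle(freq_hz, pulse_ms)


# Standard protocol used in Emx1 mice: 10 s, 10 Hz, 5 ms pulses, 1 mW.
n_pulses = pulse_count(10, 10)

# Stronger control protocol: same pulse count in 5 s (frequency inferred),
# 20 ms pulses, 4 mW.
strong_freq_hz = n_pulses / 5

standard_mean = mean_power_mw(1, 10, 5)
strong_mean = mean_power_mw(4, strong_freq_hz, 20)
ratio = strong_mean / standard_mean
```

Under these assumptions the stronger protocol delivers a much higher average light power during the train, which fits the observation that only this protocol produced a light-induced dilation in the wild-type mouse.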
Figures 6 and 7: It is surprising that blockade of NPY Y1 receptors leads to a complete loss of the constriction response. As shown in Figure 7, the authors suggest that pyramidal neuron-released PGE2 (and glutamate) initiate several cascades acting on smooth muscle directly (PGE2-EP1/EP3), through astrocytes (Glu/COX-1/PGE2 or 20-HETE), or through NPY interneurons (Glu/NPY/Y1 or PGE2/NPY/Y1). This would imply that COX-1/2 and NPY/Y1 pathways act in series (as discussed by the authors). Besides the potential effects on NPY release mentioned in the discussion, could the authors comment if both (NPY and PGE2) pathways need to be co-activated in smooth muscle cells to cause constriction?
We thank the reviewer for raising this surprising complete loss of vasoconstriction by Y1 antagonism, despite the contribution of other vasoconstrictive pathways. We now discuss (lines 402-409) the possibility that activation of the neuronal Y1 receptors in pyramidal cells may also have contributed to the vasoconstriction by promoting glutamate and possibly PGE2 release. The combined activation of vascular and neuronal Y1 receptors may explain the complete blockage of optogenetically induced vasoconstriction by BIBP3226.
Reviewer #2 (Recommendations for the authors):
The complete block of the constriction by BIBP3226 needs to be carefully considered.
We thank the reviewer for stressing this point also raised by Reviewer #1. As mentioned above we now discuss (lines 402-409) the possibility that activation of the neuronal Y1 receptors in pyramidal cells may also have contributed to the vasoconstriction by promoting glutamate and possibly PGE2 release. The combined activation of vascular and neuronal Y1 receptors may explain the complete blockage of optogenetically induced vasoconstriction by BIBP3226.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary of what the authors were trying to achieve:
In this manuscript, the authors investigated the role of β-CTF on synaptic function and memory. They report that β-CTF can trigger the loss of synapses in neurons that were transiently transfected in cultured hippocampal slices and that this synapse loss occurs independently of Aβ. They confirmed previous research (Kim et al, Molecular Psychiatry, 2016) that β-CTF-induced cellular toxicity occurs through a mechanism involving a hexapeptide domain (YENPTY) in β-CTF that induces endosomal dysfunction. Although the current study also explores the role of β-CTF in synaptic and memory function in the brain using mice chronically expressing β-CTF, the studies are inconclusive because potential effects of Aβ generated by γ-secretase cleavage of β-CTF were not considered. Based on their findings, the authors suggest developing therapies to treat Alzheimer's disease by targeting β-CTF, but did not address the lack of clinical improvement in trials of several different BACE1 inhibitors, which target β-CTF by preventing its formation.
We would like to thank the reviewer for his/her suggestions. We have addressed the specific comments in the following sections.
Major strengths and weaknesses of the methods and results:
The conclusions of the in vitro experiments using cultured hippocampal slices were well supported by the data, but aspects of the in vivo experiments and proteomic studies need additional clarification.
(1) In contrast to the in vitro experiments in which a γ-secretase inhibitor was used to exclude possible effects of Aβ, this possibility was not examined in in-vivo experiments assessing synapse loss and function (Figure 3) and cognitive function (Figure 4). The absence of plaque formation (Figure 4B) is not sufficient to exclude the possibility that Aβ is involved. The potential involvement of Aβ is an important consideration given the 4-month duration of protein expression in the in vivo studies.
We thank the reviewer for raising this question. While our current data do not exclude the potential involvement of Aβ-induced toxicity in the synaptic and cognitive dysfunction observed in mice overexpressing β-CTF, addressing this directly remains challenging. Treatment with γ-secretase inhibitors could potentially shed light on this issue. However, such treatment is itself known to cause brain dysfunction, likely because it blocks the γ-cleavage of other essential molecules, such as Notch[1, 2]. Therefore, this approach is unlikely to provide a clear answer, which prevents us from pursuing it further experimentally in vivo. We hope the reviewer understands this limitation. We have included additional discussion (page 14 of the revised manuscript) to highlight this question.
(2) The possibility that the results of the proteomic studies conducted in primary cultured hippocampal neurons depend in part on Aβ was also not taken into consideration.
We thank the reviewer for raising this question. In the revised manuscript, we examined the levels of synaptic proteins after treatment with γ-secretase inhibitors and found that the levels of certain synaptic proteins were further reduced in neurons expressing β-CTF (Supplementary figure 5A-B). These results do not support Aβ as a major contributor to the proteomic changes induced by β-CTF.
Likely impact of the work on the field, and the utility of the methods and data to the community:
The authors' use of sparse expression to examine the role of β-CTF on spine loss could be a useful general tool for examining synapses in brain tissue.
We thank the reviewer for these comments.
Additional context that might help readers interpret or understand the significance of the work:
The discovery of BACE1 stimulated an international effort to develop BACE1 inhibitors to treat Alzheimer's disease. BACE1 inhibitors block the formation of β-CTF which, in turn, prevents the formation of Aβ and other fragments. Unfortunately, BACE1 inhibitors not only did not improve cognition in patients with Alzheimer's disease, they appeared to worsen it, suggesting that producing β-CTF actually facilitates learning and memory. Therefore, it seems unlikely that the disruptive effects of β-CTF on endosomes plays a significant role in human disease. Insights from the authors that shed further light on this issue would be welcome.
Response: We would like to express our gratitude to the reviewer for raising this question. It remains puzzling why BACE1 inhibition has failed to yield benefits in AD patients, while amyloid clearance via Aβ antibodies is able to slow down disease progression. One possible explanation is that pharmacological inhibition of BACE1 may not be as effective as its genetic removal. Indeed, genetic depletion of BACE1 leads to the clearance of existing amyloid plaques[3], whereas its pharmacological inhibition prevents the formation of new plaques but does not deplete the existing ones[4]. We think the negative results of BACE1 inhibitors in clinical trials may not be sufficient to rule out the potential contribution of β-CTF to AD pathogenesis. Given that cognitive function continues to deteriorate rapidly in plaque-free patients after 1.5 years of treatment with Aβ antibodies in phase three clinical studies[5], it is important to consider the potential role of other Aβ-related fragments in AD pathogenesis, such as β-CTF. We have expanded the discussion of this question in the revised manuscript (page 15).
Reviewer #2 (Public Review):
Summary:
In this study, the authors investigate the potential role of other cleavage products of amyloid precursor protein (APP) in neurodegeneration. They combine in vitro and in vivo experiments, revealing that β-CTF, a product cleaved by BACE1, promotes synaptic loss independently of Aβ. Furthermore, they suggest that β-CTF may interact with Rab5, leading to endosomal dysfunction and contributing to the loss of synaptic proteins.
We would like to thank the reviewer for his/her suggestions. We have addressed the specific comments in the following sections.
Weaknesses:
Most experiments were conducted in vitro using overexpressed β-CTF. Additionally, the study does not elucidate the mechanisms by which β-CTF disrupts endosomal function and induces synaptic degeneration.
We would like to thank the reviewer for this comment. While a significant portion of our experiments were conducted in vitro, the main findings were also confirmed in vivo (Figures 3 and 4). Repeating all the experiments in vivo would be challenging and may not be possible because of technical difficulties. Regarding the use of overexpressed β-CTF, we acknowledge that this represents a common limitation in neurodegenerative disease studies. These diseases progress slowly over decades in patients. To model this progression in cell or mouse models within a time frame feasible for research, overexpression of certain proteins is often inevitable. Since β-CTF levels are elevated in AD patients[6], its overexpression is a relevant approach to investigate its potential effects.
We did not further investigate the mechanisms by which β-CTF disrupted endosomal function because our preliminary results align with previous findings that could explain its mechanism. Kim et al. demonstrated that β-CTF recruits APPL1 (a Rab5 effector) via the YENPTY motif to Rab5 endosomes, where it stabilizes active GTP-Rab5, leading to pathologically accelerated endocytosis, endosome swelling and selectively impaired transport of Rab5 endosomes[6]. However, that paper did not show whether this Rab5 overactivation-induced endosomal dysfunction leads to any damage to synapses. In our study, we observed that co-expression of Rab5<sub>S34N</sub> with β-CTF effectively mitigated β-CTF-induced spine loss in hippocampal slice cultures (Figures 6L-M), indicating that Rab5 overactivation-induced endosomal dysfunction contributed to β-CTF-induced spine loss. We included further discussion in the revised manuscript to clarify this (page 15 of the revised manuscript).
Reviewer #3 (Public Review):
Summary:
Most previous studies have focused on the contributions of Abeta and amyloid plaques in the neuronal degeneration associated with Alzheimer's disease, especially in the context of impaired synaptic transmission and plasticity which underlies the impaired cognitive functions, a hallmark in AD. But processes independent of Abeta and plaques are much less explored, and to some extent, the contributions of these processes are less well understood. Luo et al. addressed this important question with an array of approaches, and their findings generally support the contribution of beta-CTF-dependent but non-Abeta-dependent process to the impaired synaptic properties in the neurons. Interestingly, the above process appears to operate in a cell-autonomous manner. This cell-autonomous effect of beta-CTF as reported here may facilitate our understanding of some potentially important cellular processes related to neurodegeneration. Although these findings are valuable, it is key to understand the probability of this process occurring in a more natural condition, such as when this process occurs in many neurons at the same time. This will put the authors' findings into a context for a better understanding of their contribution to either physiological or pathological processes, such as Alzheimer's. The experiments and results using the cell system are quite solid, but the in vivo results are incomplete and hence less convincing (see below). The mechanistic analysis is interesting but primitive and does not add much more weight to the significance. Hence, further efforts from the authors are required to clarify and solidify their results, in order to provide a complete picture and support for the authors' conclusions.
We would like to thank the reviewer for the suggestions. We have addressed the specific comments in the following sections.
Strengths:
(1) The authors have addressed an interesting and potentially important question
(2) The analysis using the cell system is solid and provides strong support for the authors' major conclusions. This analysis has used various technical approaches to support the authors' conclusions from different aspects and most of these results are consistent with each other.
We would like to thank the reviewer for these comments.
Weaknesses:
(1) The relevance of the authors' major findings to the pathology, especially the Abeta-dependent processes is less clear, and hence the importance of these findings may be limited.
We would like to thank the reviewer for this question. Phase 3 clinical trial data from Aβ antibodies show that cognitive function continues to decline rapidly, even in plaque-free patients, after 1.5 years of treatment[5]. This suggests that plaque-independent mechanisms may drive AD progression. Therefore, it is crucial to consider the potential contributions of other Aβ species or related fragments, such as alternative forms of Aβ and β-CTF. While it is early to predict how much β-CTF contributes to AD progression, it is notable that β-CTF induced synaptic deficits in mice, which recapitulates a key pathological feature of AD. Ultimately, the contribution of β-CTF in AD pathogenesis can only be tested through clinical studies in the future.
(2) In vivo analysis is incomplete, with certain caveats in the experimental procedures and some of the results need to be further explored to confirm the findings.
We would like to thank the reviewer for this suggestion. We have corrected these caveats in the revised manuscript.
(3) The mechanistic analysis is rather primitive and does not add further significance.
We would like to thank the reviewer for this comment. We did not delve further into the underlying mechanisms because our analysis indicates that Rab5 overactivation-induced endosomal dysfunction underlies β-CTF-induced synaptic dysfunction, which is consistent with another study and has been addressed in our study[6]. We hope the reviewer could understand that our focus in this paper is on how β-CTF triggers synaptic deficits, which is why we did not investigate the mechanisms of β-CTF-induced endosomal dysfunction further.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Suggestions for improved or additional experiments, data, or analyses:
(1) In Figures 4H, 4J, 4K and Supplemental Figures 3C, 3E, and 3G, it was unclear whether a repeated measures 2-way ANOVA, rather than a 2-way ANOVA, followed by appropriate post-hoc analyses was used to strengthen the conclusion that there were significant effects in the behavioral tests.
We thank the reviewer for raising this point and apologize for the lack of a clear description in the manuscript. In the figures mentioned above, we used a repeated measures 2-way ANOVA, performed in GraphPad Prism, to analyze the data. In Figure 4H, fear conditioning tests were conducted. The same cohort of mice was used in the baseline, contextual and cued tests. First, baseline freezing was tested; these mice then underwent tone and foot shock training, followed by the contextual test and the cued test. A repeated measures 2-way ANOVA is therefore more appropriate for this experiment.
In the water T maze tests (Figure 4J and K), the same cohort of mice was trained and tested each day, so a repeated measures 2-way ANOVA is also appropriate.
In Supplementary figure 3C, 3E and 3G, the open field test (OFT) was conducted. In this experiment, the locomotion of the same cohort of mice was recorded, so a repeated measures 2-way ANOVA is again appropriate.
Clearer description for these experiments has been provided in the revised manuscript.
(2) Including gender analyses would be helpful.
The mice we used in this study were all males.
Minor corrections to text and figures:
(1) Quantitative analyses in Figures 5A-C, 5H, 6G, 6H, and Supplementary Figures 4 and 5C would be helpful.
We have provided quantitative analysis of these results (Figure 5D, 5J, 6K, Supplementary figure 4D, 5F) mentioned above in the revised manuscript.
(2) Percent correct (%) in Figures 4J and 4K should be labeled as 0, 50, and 100 instead of 0.0, 0.5, and 1.0.
We would like to thank the reviewer for pointing this out. We have made corrections in the revised manuscript.
Reviewer #2 (Recommendations For The Authors):
In the study conducted by Luo et al, it was observed that the fragment of amyloid precursor protein (APP) cleaved by beta-site amyloid precursor protein cleaving enzyme 1 (BACE1), known as β-CTF, plays a crucial role in synaptic damage. The study found increasing expression of β-CTF in neurons could induce synapse loss both in vitro and in vivo, independent of Aβ. Mechanistically, they explored how β-CTF could interfere with the endosome system by interacting with RAB5. While this study is intriguing, there are several points that warrant further investigation:
(1) The study involved overexpressing β-CTF in neurons. It would be valuable to know if the levels of β-CTF are similarly increased in Alzheimer's disease (AD) patients or AD mouse models.
We would like to thank the reviewer for the suggestion. β-CTF levels have been reported to be significantly elevated in the AD cerebral cortex[6]. Most AD mouse models are human APP transgenic mouse models with elevated β-CTF levels[7].
(2) The study noted that β-CTF in neurons is a membranal fragment, but the overexpressed β-CTF was not located in the membrane. It is important to ascertain whether the membranal β-CTF and cytoplasmic β-CTF lead to synapse loss in a similar manner.
We apologize for not clearly explaining the localization of β-CTF in the original manuscript. β-CTF is produced from APP through β-cleavage, a process that occurs in organelles such as endo-lysosomes[8]. The overexpressed β-CTF is also primarily localized in the endo-lysosomal systems (Figure 5C and Supplementary figure 4C), similar to those generated by APP cleavage.
(3) The study found a significant decrease in GluA1, a subunit of AMPA receptors, due to β-CTF. It would be beneficial to investigate whether there are systematic alterations in NMDA receptors, including GluN2A and GluN2B.
We would like to express our gratitude to the reviewer for bringing up this question. The protein levels of GluN2A and GluN2B are also reduced in neurons expressing β-CTF (Figure 6E-F).
(4) The study showed a significant decrease in the frequency of miniature excitatory postsynaptic currents (mEPSC), indicating disrupted presynaptic vesicle neurotransmitter release. It would be pertinent to test whether the expression level of the presynaptic SNARE complex, which is required for vesicle release, is altered by β-CTF.
We would like to express our gratitude to the reviewer for bringing up this question. The protein level of presynaptic SNARE complex components, such as VAMP2, is also reduced in neurons expressing β-CTF (Figure 6E, G).
(5) Since AMPA receptors are glutamate receptors, it is important to determine whether the ability of glutamate release is altered by β-CTF. In vivo studies using a glutamate sensor should be conducted to examine glutamate release.
We would like to express our gratitude to the reviewer for this suggestion. It will be interesting to use glutamate sensors to assess the ability of glutamate release in the future.
(6) The quality of immunostaining associated with Figures 4B and 4C was noted to be suboptimal.
We apologize for the suboptimal quality of these images. The immunostaining images in Figures 4B and 4C were captured using the stitching function of a confocal microscope to display larger areas, including the entire hemisphere and hippocampus. We have reprocessed the images to obtain higher-quality versions.
(7) It would be insightful to investigate whether treatment with a BACE1 inhibitor in the study could reverse synaptic deficits mediated by β-CTF.
We would like to thank the reviewer for this suggestion. In Figure 1I-M, we constructed an APP mutant (APP<sub>MV</sub>), which cannot be cleaved by BACE1 to produce β-CTF and Aβ but has no impact on β’-cleavage. When co-expressed with BACE1, APP<sub>MV</sub> failed to induce spine loss, supporting the effect of β-CTF. We think these results demonstrate that β-CTF underlies the synaptic deficits. It would be interesting to test the effects of BACE1 inhibition in the future.
(8) Considering the potential implications for therapeutics, it is worth exploring whether extremely low levels of β-CTF have beneficial effects in regulating synaptic function or promoting synaptogenesis at a physiological level.
We would like to thank the reviewer for raising this question. We found that when the plasmid amount was reduced to 1/8 of the original dose, β-CTF no longer induced a decrease in dendritic spine density (Supplementary figure 2E-F). It has been reported that the APP Swedish mutation in familial AD increased synapse numbers and synaptic transmission, whereas inhibition of BACE1 lowered synapse numbers and suppressed synaptic transmission in wild-type neurons, suggesting that at physiological levels β-CTF might be synaptogenic[9].
(9) The molecular mechanism through which β-CTF interferes with Rab5 function should be elucidated.
We would like to thank the reviewer for raising this question. Kim et al. have elucidated the mechanism through which β-CTF interferes with Rab5 function. β-CTF recruits APPL1 (a Rab5 effector) via the YENPTY motif to Rab5 endosomes, where it stabilizes active GTP-Rab5, leading to pathologically accelerated endocytosis, endosome swelling and selectively impaired transport of Rab5 endosomes[6]. We have included additional discussion of this question in the revised manuscript (page 15 of the revised manuscript).
(10) The study could compare the role of β-CTF and Aβ in neurodegeneration in AD mouse models.
We would like to thank the reviewer for raising this point. While it is easier to dissect the roles of Aβ and β-CTF in vitro, some of the critical tools are not applicable in vivo, such as γ-secretase inhibitors, which lead to severe side effects because they also inhibit the cleavage of other γ-secretase substrates[1, 2]. Therefore, it will be difficult to demonstrate their different roles in vivo. There are studies showing that β-CTF accumulation precedes Aβ deposition in model mice and mediates Aβ-independent intracellular pathologies[10, 11], consistent with our results.
(11) Based on the findings, it would be valuable to discuss possible explanations for the failure of most BACE1 inhibitors in recent clinical trials for humans.
Response: We would like to express our gratitude to the reviewer for raising this recommendation. It remains a puzzle why BACE1 inhibition failed to provide beneficial effects in AD patients, whereas clearance of amyloid by Aβ antibodies could slow AD progression. One potential answer is that pharmacological inhibition of BACE1 might not be as effective as its genetic removal. Indeed, genetic depletion of BACE1 leads to clearance of existing amyloid plaques[3], whereas pharmacological inhibition of BACE1 could not stop the growth of existing plaques, although it prevents the formation of new plaques[4]. The negative results of BACE1 inhibitors might not be sufficient to exclude the possibility that β-CTF also contributes to AD pathogenesis. We have included additional discussion of this question in the revised manuscript (page 15 of the revised manuscript).
Reviewer #3 (Recommendations For The Authors):
Major:
(1) The cell experiments were performed at DIV 9, do the authors know whether at this age, the neurons are still developing and spine density has not reached a plateau yet? If so, the observed effect may reflect the impact on development and/or maturation, rather than on the mature neurons. The authors should be more specific about this issue.
We would like to thank the reviewer for raising this question. These slice cultures were made from 1-week-old rats, so at DIV 9 the tissue corresponds to an age of about two weeks. These neurons are still developing and spine density has not reached a plateau yet[12]. In addition, we also investigated the effects of β-CTF on the synapses of mature neurons in two-month-old mice (Figure 3). We therefore think the observed effect reflects the impact on both immature and mature neurons.
(2) mEPSCs shown in Figure 3D were of small amplitudes, perhaps also indicating that these synapses are not yet mature.
In Figure 3D, the mEPSC results were obtained from pyramidal neurons in the CA1 region of two-month-old mice. At the age of two months, neurotransmitter levels and synaptic density have reached adult levels[13].
(3) There was no data on the spine density or mEPSCs in the mice OE b-CTF, hence it is unclear whether a primary impact of this manipulation (b-CTF effect) on the synaptic transmission still occurs in vivo.
In Figure 3, we examined the density of dendritic spines and mEPSCs from CA1 pyramidal neurons infected with lentivirus expressing β-CTF in mice and showed that neurons expressing additional β-CTF exhibited lower spine density and fewer mEPSCs, supporting that β-CTF also damaged synaptic transmission in vivo.
(4) OE of b-CTF should lead to the production of Abeta, although this may not lead to the formation of significant plaques. How do the authors know whether their findings on behavioral and cognitive impairments were not largely mediated by Abeta, which has been widely reported by previous studies?
We would like to thank the reviewer for raising this question. Indeed, our in vivo data could not exclude the potential involvement of Aβ in the pathology, despite the absence of amyloid plaque formation. It will be difficult to address this question in vivo because of the severe side effects of γ-secretase inhibition.
(5) Figure 4H, the freezing level in the cued fear conditioning was very high, likely saturated; this may mask a potential reduction in the b-CTF OE mice (there is a hint for that in the results). The authors should repeat the experiments using less strong footshock strength (hence resulting in less freezing, <70%).
We would like to express our gratitude to the reviewer for bringing up this question. The contextual fear conditioning test assesses hippocampal function, while the cued fear conditioning test assesses amygdala function. We hope the reviewer understands that our primary goal is to assess hippocampus-related functions in this experiment and we did see a significant difference between GFP and β-CTF groups. Therefore, we think the intensity of footshock we used was suitable to serve the primary purpose of this experiment.
(6) Why was the deficit in the Morris water maze in the b-CTF OE mice only significant in the training phase?
We would like to thank the reviewer for raising this question and apologize for not describing the test clearly. This is a water T maze test, not a Morris water maze test.
To make the behavioral paradigm of the water T maze test easier to understand, we have provided a more detailed description of the methods in the new version of the manuscript.
The acquisition phase of the Water T Maze (WTM) evaluates spatial learning and memory, where mice use spatial cues in the environment to navigate to a hidden platform and escape from water, while reversal learning measures cognitive flexibility, in which mice must learn a new location of the hidden platform[14]. In the reversal learning task (Figure 4J-K), the learning curves of the two groups of mice did not show any significant differences, indicating that the expression of β-CTF only damages spatial learning and memory but not cognitive flexibility. This is consistent with a previous report using APP/PS1 mice[15].
(7) Will the altered Rab5 in the b-CTF OE condition also affect the level of other proteins?
We would like to express our gratitude to the reviewer for raising this interesting question. Expression of Rab5<sub>S34N</sub> in β-CTF-expressing neurons did not alter the levels of synapse-related proteins that were reduced in these neurons (Supplementary figure 5G-H), suggesting Rab5 overactivation did not contribute to these protein expression changes induced by β-CTF.
(8) How do the authors reconcile their findings with the well-established findings that Abeta affects synaptic transmission and spine density? Do they think these two processes may occur simultaneously in the neurons, or, one process may dominate in the other?
APP, Aβ, and presenilins have been extensively studied in mouse models, providing convincing evidence that high Aβ concentrations are toxic to synapses[16]. Moreover, addition of Aβ to murine cultured neurons or brain slices is toxic to synapses[17]. However, Aβ-induced synaptotoxicity was not observed in our study. A major difference between our study and others is that our study used an isolated expression system that applies Aβ only to individual neurons surrounded by neurons without excessive amounts of Aβ, whereas the other studies generally apply Aβ to all the neurons. Therefore, we predict that Aβ does not lead to synaptic deficits in individual neurons in a cell-autonomous manner, whereas β-CTF does. Aβ and β-CTF represent two parallel pathways of action. Additional discussion of this question has been included in the revised manuscript (page 14 of the revised manuscript).
Minor:
Fig 2F-G, "prevent" rather than "reverse"?
We would like to thank the reviewer for pointing this out. We have made corrections in the revised manuscript.
Reference:
(1) GÜNER G, LICHTENTHALER S F. The substrate repertoire of γ-secretase/presenilin [J]. Seminars in cell & developmental biology, 2020, 105: 27-42.
(2) DOODY R S, RAMAN R, FARLOW M, et al. A phase 3 trial of semagacestat for treatment of Alzheimer's disease [J]. The New England journal of medicine, 2013, 369(4): 341-50.
(3) HU X, DAS B, HOU H, et al. BACE1 deletion in the adult mouse reverses preformed amyloid deposition and improves cognitive functions [J]. The Journal of experimental medicine, 2018, 215(3): 927-40.
(4) PETERS F, SALIHOGLU H, RODRIGUES E, et al. BACE1 inhibition more effectively suppresses initiation than progression of β-amyloid pathology [J]. Acta neuropathologica, 2018, 135(5): 695-710.
(5) SIMS J R, ZIMMER J A, EVANS C D, et al. Donanemab in Early Symptomatic Alzheimer Disease: The TRAILBLAZER-ALZ 2 Randomized Clinical Trial [J]. Jama, 2023, 330(6): 512-27.
(6) KIM S, SATO Y, MOHAN P S, et al. Evidence that the rab5 effector APPL1 mediates APP-βCTF-induced dysfunction of endosomes in Down syndrome and Alzheimer's disease [J]. Molecular psychiatry, 2016, 21(5): 707-16.
(7) MONDRAGÓN-RODRÍGUEZ S, GU N, MANSEAU F, et al. Alzheimer's Transgenic Model Is Characterized by Very Early Brain Network Alterations and β-CTF Fragment Accumulation: Reversal by β-Secretase Inhibition [J]. Frontiers in cellular neuroscience, 2018, 12: 121.
(8) ZHANG X, SONG W. The role of APP and BACE1 trafficking in APP processing and amyloid-β generation [J]. Alzheimer's research & therapy, 2013, 5(5): 46.
(9) ZHOU B, LU J G, SIDDU A, et al. Synaptogenic effect of APP-Swedish mutation in familial Alzheimer's disease [J]. Science translational medicine, 2022, 14(667): eabn9380.
(10) LAURITZEN I, PARDOSSI-PIQUARD R, BAUER C, et al. The β-secretase-derived C-terminal fragment of βAPP, C99, but not Aβ, is a key contributor to early intraneuronal lesions in triple-transgenic mouse hippocampus [J]. The Journal of neuroscience : the official journal of the Society for Neuroscience, 2012, 32(46): 16243-1655a.
(11) KAUR G, PAWLIK M, GANDY S E, et al. Lysosomal dysfunction in the brain of a mouse model with intraneuronal accumulation of carboxyl terminal fragments of the amyloid precursor protein [J]. Molecular psychiatry, 2017, 22(7): 981-9.
(12) HARRIS K M, JENSEN F E, TSAO B. Three-dimensional structure of dendritic spines and synapses in rat hippocampus (CA1) at postnatal day 15 and adult ages: implications for the maturation of synaptic physiology and long-term potentiation [J]. The Journal of neuroscience : the official journal of the Society for Neuroscience, 1992, 12(7): 2685-705.
(13) SEMPLE B D, BLOMGREN K, GIMLIN K, et al. Brain development in rodents and humans: Identifying benchmarks of maturation and vulnerability to injury across species [J]. Progress in neurobiology, 2013, 106-107: 1-16.
(14) GUARIGLIA S R, CHADMAN K K. Water T-maze: a useful assay for determination of repetitive behaviors in mice [J]. Journal of neuroscience methods, 2013, 220(1): 24-9.
(15) ZOU C, MIFFLIN L, HU Z, et al. Reduction of mNAT1/hNAT2 Contributes to Cerebral Endothelial Necroptosis and Aβ Accumulation in Alzheimer's Disease [J]. Cell reports, 2020, 33(10): 108447.
(16) CHAPMAN P F, WHITE G L, JONES M W, et al. Impaired synaptic plasticity and learning in aged amyloid precursor protein transgenic mice [J]. Nature neuroscience, 1999, 2(3): 271-6.
(17) WANG Z, JACKSON R J, HONG W, et al. Human Brain-Derived Aβ Oligomers Bind to Synapses and Disrupt Synaptic Activity in a Manner That Requires APP [J]. The Journal of neuroscience : the official journal of the Society for Neuroscience, 2017, 37(49): 11947-66.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
A number of modifications/additions have been made to the text which help to clarify the background and details of the study and I feel have improved the study.
NAD deficiency induced using the dietary/Haao null model showed a window of susceptibility at E7.5-10.5. Further, HAAO enzyme activity data has been added at E11.5 and the minimal HAAO activity in the embryo at E11.5 supports the hypothesis that the NAD synthesis pathway from kynurenine is not functional until the liver starts to develop.
The caveat to this is that absence of expression/activity in embryonic cells at E7.5-10.5 relies on previous scRNA-seq data. Both reviewers commented that analysis of RNA and/or protein expression at these stages (E7.5-10.5) would be necessary to rule this out, and would strongly support the conclusions regarding the necessity for yolk sac activity.
There are a number of antibodies for HAAO, KYNU etc so it is surprising if none of these are specific for the mouse proteins, while an alternative approach, in situ hybridisation, would also be possible.
We have tested 2 anti-HAAO antibodies, 2 anti-KYNU antibodies and 1 anti-QPRT antibody on adult liver and various embryonic tissues.
Given that all tested antibodies only detected a specific band in tissues with very high expression and abundant target protein levels (adult liver), they were determined to be unsuitable to conclusively prove that these proteins of the NAD de novo synthesis pathway are absent in embryos prior to the development of a functional liver. They were also unsuitable for IHC experiments to determine which cell types (if any) have these proteins.
The antibodies, tested assays and samples, and the results obtained were as follows:
Anti-HAAO antibody (ab106436, Abcam, UK)
- Was tested in western blots of liver, E11.5-E14.5 yolk sac, E14.5 placenta, and E14.5 and E16.5 embryonic liver lysates from wild-type (WT) and Haao-/- mice. The target band (32.5 kDa) was visible in the WT liver samples and absent in Haao-/- livers, and faintly visible in E11.5-E14.5 WT yolk sac, with intensity gradually increasing in E12.5 and E13.5 WT yolk sac. Multiple strong non-specific bands occurred in all samples, requiring cutting off the >50 kDa area of the blots.
- Was re-tested in western blots comparing WT, Haao-/-, and Kynu-/- E9.5-E11.5 embryo, E9.5 yolk sac, and adult liver tissues. It detected the target band faintly only in WT and Kynu-/- liver lysates. No target band could be resolved in E9.5 yolk sac or embryo lysates. Due to the low sensitivity of the antibody, it is unsuitable to conclusively determine whether HAAO is present or absent in E9.5 yolk sacs and E9.5-E11.5 embryos.
- Was tested in IHC with DAB and IF, producing non-specific staining on both WT and Haao-/- liver and kidney tissue.
Anti-HAAO antibody (NBP1-77361, Novus Biologicals, LLC, CO, USA)
- Was tested in western blots and detected a very faint target band in WT liver lysate that was absent in Haao-/- lysate, with stronger non-specific bands occurring in both genotypes.
- Was tested in IHC with DAB, producing non-specific staining on both WT and Haao-/- liver and kidney tissue.
Anti-L-Kynurenine Hydrolase antibody (11796-1-AP, Proteintech Group, IL, USA)
- Was tested in western blots and detected a faint target band (52 kDa) in E11.5, E12.5, E13.5, and E14.5 yolk sac lysates. Detected a weak band in E14.5 liver, a stronger band in E16.5 liver, but not in E14.5 placenta. The target band was only resolved with normal ECL substrate and extended exposure when the >75 kDa part of the blot was cut off.
- Was re-tested in western blots comparing WT, Haao-/-, and Kynu-/- E9.5-E11.5 embryo, E9.5 yolk sac, and adult liver tissues. It detected the target band only in WT and Haao-/- liver lysates, requiring Ultra Sensitive Substrate. No target band could be resolved in yolk sac or embryo lysates of any genotype.
Anti-L-Kynurenine Hydrolase antibody (ab236980, Abcam, UK)
- Was tested in western blots and detected a very faint target band (52 kDa) in WT liver lysates and no band in Kynu-/- liver lysates. Multiple non-specific bands occurred irrespective of the Kynu genotype of the lysate.
- Was tested in IHC with DAB and IF, producing non-specific staining on both WT and Kynu-/- liver and kidney tissue.
Anti-QPRT (orb317756, Biorbyt, NC, USA)
- Was tested in western blots and detected a faint target band (31 kDa) with multiple other bands between 25-75 kDa and an extremely strong band around 150 kDa on WT liver lysates.
The following is the authors’ response to the original reviews.
Reviewer 1 Public Review:
The current dietary study narrows the period when deficiency can cause malformations (analysed at E18.5), and altered metabolite profiles (eg, increased 3HAA, lower NAD) are detected in the yolk sac and embryo at E10.5. However, without analysis of embryos at later stages in this experiment it is not known how long is needed for NAD synthesis to be recovered - and therefore until when the period of exposure to insufficient NAD lasts. This information would inform the understanding of the developmental origin of the observed defects.
Our previous published work (Cuny et al 2023 https://doi.org/10.1242/dmm.049647) indicates that the timing of NAD de novo synthesis pathway precursor availability and consequently the timing of NAD deficiency during organogenesis drives which organs are affected in their development. Furthermore, experimental data of another project (manuscript submitted) shows that mouse embryos (from mothers on an NAD precursor restricted diet that induces CNDD) were NAD deficient at E9.5 and E11.5, but embryo NAD levels were fully recovered at E14.5 when compared to same-stage embryos from mothers on precursor-sufficient diet. This was observed irrespective of the embryos’ Haao genotype. In the current study, NAD precursor provision was only restricted until E10.5. Thus, we expect that our embryos phenotyped at E18.5 had recovered their NAD levels back to normal by E14.5 at the latest. More research, beyond the scope of the current manuscript, is required to spatio-temporally link embryonic NAD deficiency to the occurrence of specific defect types and elucidate the mechanistic origin of the defects. To acknowledge this, we updated the respective Discussion paragraph on page 7 and added the following statement: “This observation supports our hypothesis that the timing of NAD deficiency during organogenesis determines which organs/tissues are affected (Cuny et al., 2023), but more research is needed to fully characterise the onset and duration of embryonic NAD deficiency in dietary NAD precursor restriction mouse models.”
More importantly, there is still a question of whether in addition to the yolk sac, there is HAAO activity within the embryo itself prior to E12.5 (when it has first been assayed in the liver - Figure 1C). The prediction is that within the conceptus (embryo, chorioallantoic placenta, and visceral yolk sac) the embryo is unlikely to be the site of NAD synthesis prior to liver development. Reanalysis of scRNA-seq (Fig 1B) shows expression of all the enzymes of the kynurenine pathway from E9.5 onwards. However, the expression of another available dataset at E10.5 (Fig S3) suggested that expression is 'negligible'. While the expression in Figure 1B, Figure S1 is weak this creates a lack of clarity about the possible expression of HAAO in the hepatocyte lineage, or especially elsewhere in the embryo prior to E10.5 (corresponding to the period when the authors have demonstrated that de novo NAD synthesis in the conceptus is needed). Given these questions, a direct analysis of RNA and/or protein expression in the embryos at E7.5-10.5 would be helpful.
We have now included additional data showing that whole embryos at E11.5 and embryos with their livers removed at E14.5 have negligible HAAO enzyme activity. The observed lack of HAAO activity in the embryo at E11.5 is consistent with the absence of a functional embryonic liver at that stage. Thus, it confirms that the embryo is dependent on extraembryonic tissues (the yolk sac) for NAD de novo synthesis prior to E12.5. The additional datasets are now included in Supplementary Table S1 and as Supplementary Figure 2. The Results section on page 2 has been updated to refer to these datasets.
Reviewer #2 (Public Review):
Page 4 and Table S4. The descriptors for malformations of organs such as the kidney and vertebrae are quite vague and uninformative. More specific details are required to convey the type and range of anomalies observed as a consequence of NAD deficiency.
We now provide more information about the malformation types in the Results on page 4. Also, Table S4 now defines the missing vertebral, sternum, and kidney descriptors.
Can the authors define whether the role of the NAD pathway in a couple of tissue or organ systems is the same? By this I mean, is the molecular or cellular effect of NAD deficiency the same in the vertebrae and organs such as the kidney? What unifies the effects on these specific tissues and organs and are all tissues and organs affected? If some are not, can the authors explain why they escape the need for the NAD pathway?
This is a good comment, highlighting that further research, beyond the scope of this manuscript, is needed to better understand the underlying mechanisms of CNDD causation. We have expanded the Discussion paragraph “NAD deficiency in early organogenesis is sufficient to cause CNDD” to indicate that while the timing of NAD deficiency during embryogenesis explains variability in phenotypes among the CNDD spectrum, it is unknown why other organs/tissues are seemingly not affected by NAD deficiency.
To answer the reviewer’s questions and elucidate the underlying cellular and molecular processes in individual organs affected by NAD deficiency, a multiomic approach is required. This is because NAD is involved in hundreds of molecular and cellular processes affecting gene expression, protein levels, metabolism, etc. For details of NAD functions that have relevance to embryogenesis, the reviewer may refer to our recent review article (Dunwoodie et al 2023 https://doi.org/10.1089/ars.2023.0349).
Page 5 and Figure 6C. The expectation and conclusion for whether specific genes are expressed in particular cell types in scRNA-seq datasets depend on the number of cells sequenced, the technology (methodology) used, the depth of sequencing, and also the resolution of the analysis. It is therefore essential to perform secondary validation of the analysis of scRNA-seq data. At a minimum, the authors should perform in situ hybridization or immunostaining for Tdo2, Afmid, Kmo, Kynu, Haao, Qprt, and Nadsyn1 or some combination thereof at multiple time points during early mouse embryogenesis to truly understand the spatiotemporal dynamics of expression and NAD synthesis.
We have tested antibodies against HAAO, KYNU, and QPRT in adult mouse liver samples (the main site of NAD de novo synthesis) but these produced non-specific bands in western blotting experiments. Therefore, immunostaining studies on embryonic tissues were not feasible.
However, we agree that histological methods such as in situ hybridisation would provide secondary validation of the exact cell types that express these genes. To acknowledge this, we have updated a sentence on page 5 referring to the data shown in Figure 6C as follows: “While histological methods such as in situ hybridisation would be required to confirm the exact cell types expressing these genes, the available expression data indicates that the genes encoding those enzymes required to convert L-kynurenine to NAD (kynurenine pathway) are exclusively expressed in the yolk sac endoderm lineage from the onset of organogenesis (E8.0-8.5).”
Absolute functional proof of the yolk sac endoderm as being essential and required for NAD synthesis in the context of CNDD might require conditional deletion of Haao in the yolk sac versus embryo using appropriate Cre driver lines or, in the absence of a conditional allele, could be performed by tetraploid embryo-ES cell complementation approaches. But temporal dietary intervention can also approximate the same thing by perturbing NAD synthesis when the yolk sac is the primary source versus when the liver becomes the primary source in the embryo.
Reviewer 1 has made a similar comment about confirming that NAD de novo synthesis activity is indeed limited to extraembryonic tissues (the yolk sac) and absent in the embryo prior to development of an embryonic liver. We have now included additional data showing that whole embryos at E11.5 and embryos with their livers removed at E14.5 have negligible HAAO enzyme activity. The observed lack of HAAO activity in the embryo at E11.5 is consistent with the absence of a functional embryonic liver at that stage. We think this provides enough proof that the embryo is dependent on extraembryonic tissues (the yolk sac) for NAD de novo synthesis prior to E12.5. The additional datasets are now included in Supplementary Table S1 and as Supplementary Figure 2. The Results section on page 2 has been updated to refer to these data.
Reviewer #1 (Recommendations For The Authors):
(1) Introduction (page 1) introduces mouse models with defects in the kynurenine pathway "confirming that NAD de novo synthesis is required during embryogenesis ...". This requirement is revealed by the imposition of maternal dietary deficiency and more detail (or a more clear link to the following sentences) here would help the reader who is not familiar with the previous papers using the HAAO mice and dietary modulation.
We have updated this paragraph in the Introduction to better indicate that the requirement of NAD de novo synthesis for embryogenesis was confirmed in mouse models by modulating the maternal dietary NAD precursor provision during pregnancy.
(2) Discussion - throughout the introduction and results the authors refer to the NAD de novo synthesis pathway, with the study focussing on the effects of HAAO loss of function. Data implies that the kynurenine pathway is active in the yolk sac but whether de novo synthesis from L-tryptophan occurs has not been addressed. The first sub-heading of the discussion could be more accurate referring to the kynurenine pathway, or synthesis from kynurenine.
We agree that our manuscript needed to make a better distinction between NAD de novo synthesis starting from kynurenine and starting from tryptophan. We removed “from L-tryptophan” from the sub-heading in the Discussion and clarified in this paragraph which genes are required to convert tryptophan to kynurenine and which genes to convert kynurenine to NAD. We also updated two Results paragraphs (page 2, 2nd paragraph; page 5, 5th paragraph) to improve clarity.
It is worth noting that our statement in the Discussion “this is the first demonstration of NAD de novo synthesis occurring in a tissue outside of the liver and kidney.” is valid because vascular smooth muscle cells express Tdo2 and in combination with the other requisite genes expressed in endoderm cells, the yolk sac has the capability to synthesise NAD de novo from L-tryptophan.
(3) Outlook - While this section is designed to be looking ahead to the potential implications of the work, the last section on gene therapy of the yolk sac seems far removed from the paper content and highly speculative. I feel this could detract from the main points of the study and could be removed.
We have updated the Outlook paragraph and shortened the final part to “Further research is required to better understand the mechanisms of CNDD causation and of other causes of adverse pregnancy outcomes involving the yolk sac.”
(4) In Figure 2D it would be useful to label the clusters as the colours in the legend are difficult to match to the heatmap.
We now have labelled the clusters with lowercase letters above the heatmap to make it easier to match the clusters in Figure 2D to the colours used for designating tissues and genotypes. These labels are described in the figure’s key and the figure legend.
Reviewer #2 (Recommendations For The Authors):
Page 4 and Table S4. The descriptors for malformations of organs such as the kidney and vertebrae are quite vague and uninformative. More specific details are required to convey the type and range of anomalies observed as a consequence of NAD deficiency.
We now provide more information about the malformation types in the Results on page 4. Also, Table S4 now defines the missing vertebral, sternum, and kidney descriptors.
Can the authors define whether the role of the NAD pathway in a couple of tissue or organ systems is the same? By this I mean, is the molecular or cellular effect of NAD deficiency the same in the vertebrae and organs such as the kidney? What unifies the effects on these specific tissues and organs and are all tissues and organs affected? If some are not, can the authors explain why they escape the need for the NAD pathway?
This is a good comment, highlighting that further research, beyond the scope of this manuscript, is needed to better understand the underlying mechanisms of CNDD causation. We have expanded the Discussion paragraph “NAD deficiency in early organogenesis is sufficient to cause CNDD” to indicate that while the timing of NAD deficiency during embryogenesis explains variability in phenotypes among the CNDD spectrum, it is unknown why other organs/tissues are seemingly not affected by NAD deficiency.
To answer the reviewer’s questions and elucidate the underlying cellular and molecular processes in individual organs affected by NAD deficiency, a multiomic approach is required. This is because NAD is involved in hundreds of molecular and cellular processes affecting gene expression, protein levels, metabolism, etc. For details of NAD functions that have relevance to embryogenesis, the reviewer may refer to our recent review article (Dunwoodie et al 2023 https://doi.org/10.1089/ars.2023.0349).
Page 5 and Figure 6C. The expectation and conclusion for whether specific genes are expressed in particular cell types in scRNA-seq datasets depend on the number of cells sequenced, the technology (methodology) used, the depth of sequencing, and also the resolution of the analysis. It is therefore essential to perform secondary validation of the analysis of scRNA-seq data. At a minimum, the authors should perform in situ hybridization or immunostaining for Tdo2, Afmid, Kmo, Kynu, Haao, Qprt, and Nadsyn1 or some combination thereof at multiple time points during early mouse embryogenesis to truly understand the spatiotemporal dynamics of expression and NAD synthesis.
We have tested antibodies against HAAO, KYNU, and QPRT in adult mouse liver samples (the main site of NAD de novo synthesis) but these produced non-specific bands in western blotting experiments. Therefore, immunostaining studies on embryonic tissues were not feasible.
However, we agree that histological methods such as in situ hybridisation would provide secondary validation of the exact cell types that express these genes. To acknowledge this, we have updated a sentence on page 5 referring to the data shown in Figure 6C as follows: “While histological methods such as in situ hybridisation would be required to confirm the exact cell types expressing these genes, the available expression data indicates that the genes encoding those enzymes required to convert L-kynurenine to NAD (kynurenine pathway) are exclusively expressed in the yolk sac endoderm lineage from the onset of organogenesis (E8.0-8.5).”
-
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
In this paper by Brickwedde et al., the authors observe an increase in posterior alpha when anticipating auditory as opposed to visual targets. The authors also observe an enhancement in both visual and auditory steady-state sensory evoked potentials in anticipation of auditory targets, in correlation with enhanced occipital alpha. The authors conclude that alpha does not reflect inhibition of early sensory processing, but rather orchestrates signal transmission to later stages of the sensory processing stream. However, there are several major concerns that need to be addressed in order to draw this conclusion.
First, I am not convinced that the frequency tagging method and the associated analyses are adequate for dissociating visual vs auditory steady-state sensory evoked potentials.
Second, if the authors want to propose a general revision for the function of alpha, it would be important to show that alpha effects in the visual cortex for visual perception are analogous to alpha effects in the auditory cortex for auditory perception.
Third, the authors propose an alternative function for alpha - that alpha orchestrates signal transmission to later stages of the sensory processing stream. However, the supporting evidence for this alternative function is lacking. I will elaborate on these major concerns below.
(1) Potential bleed-over across frequencies in the spectral domain is a major concern for all of the results in this paper. The fact that alpha power, 36Hz and 40Hz frequency-tagged amplitude and 4Hz intermodulation frequency power are generally correlated with one another amplifies this concern. The authors are attaching specific meaning to each of these frequencies, but perhaps there is simply a broadband increase in neural activity when anticipating an auditory target compared to a visual target?
We appreciate the reviewer’s insightful comment regarding the potential bleed-over across frequencies in the spectral domain. We fully acknowledge that the trade-off between temporal and frequency resolution is a challenge, particularly given the proximity of the frequencies we are examining.
To address this concern, we performed additional analyses to investigate whether there is indeed a broadband increase in neural activity when anticipating an auditory target as compared to a visual target, as opposed to distinct frequency-specific effects. Our results show that the bleed-over between frequencies is minimal and does not significantly affect our findings. Specifically, we repeated the analyses using the same filter and processing steps for the 44 Hz frequency. At this frequency, we did not observe any significant differences between conditions.
These findings suggest that the effects we report are indeed specific to the 40 Hz frequency band and not due to a general broadband increase in neural activity. We hope this addresses the reviewer’s concern and strengthens the validity of our frequency-specific results.
Author response image 1.
Illustration of bleed-over effects over a span of 4 Hz. A, 40 Hz frequency-tagging data over the significant cluster differing between when expecting an auditory versus a visual target (identical to Fig. 9 in the manuscript). B, 44 Hz signal over the same cluster chosen for A. The analysis was identical to the analysis performed in A, apart from the frequency for the band-pass filter.
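The logic of this control can be sketched on synthetic data. The example below is only an illustration, not the pipeline used for the manuscript: the filter type, order, bandwidth (±1 Hz) and sampling rate are assumptions, and the function name tagged_amplitude is hypothetical. It band-pass filters a signal containing a 40 Hz component, takes the Hilbert amplitude envelope, and repeats the identical steps at 44 Hz.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def tagged_amplitude(signal, fs, centre_hz, half_width_hz=1.0, order=4):
    """Band-pass around centre_hz, then return the Hilbert amplitude envelope.

    Filter settings are illustrative assumptions, not the authors' parameters.
    """
    band = [centre_hz - half_width_hz, centre_hz + half_width_hz]
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, signal)  # zero-phase filtering
    return np.abs(hilbert(filtered))


fs = 1000.0  # assumed sampling rate in Hz
t = np.arange(0, 4.0, 1.0 / fs)
rng = np.random.default_rng(0)

# Synthetic data: a 40 Hz "tagged" response plus weak noise, no 44 Hz component.
x = np.sin(2 * np.pi * 40.0 * t) + 0.1 * rng.standard_normal(t.size)

edge = int(0.5 * fs)  # drop 0.5 s at each end to avoid filter edge effects
amp_40 = tagged_amplitude(x, fs, 40.0)[edge:-edge].mean()
amp_44 = tagged_amplitude(x, fs, 44.0)[edge:-edge].mean()

print(f"mean envelope at 40 Hz: {amp_40:.3f}")
print(f"mean envelope at 44 Hz: {amp_44:.3f}")
```

With a sufficiently narrow pass band, the 44 Hz envelope stays close to the noise floor even though a strong 40 Hz component is present, which is the property the 44 Hz control relies on.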
We do not, however, specifically argue against the possibility of a broadband increase when anticipating an auditory compared to a visual target. But even a broadband increase would directly contradict the alpha inhibition hypothesis, which posits that an increase in alpha completely disengages the whole cortex. We will clarify this point in the revised manuscript.
(2) Moreover, 36Hz visual and 40Hz auditory signals are expected to be filtered in the neocortex. Applying standard filters and Hilbert transform to estimate sensory evoked potentials appears to rely on huge assumptions that are not fully substantiated in this paper. In Figure 4, 36Hz "visual" and 40Hz "auditory" signals seem largely indistinguishable from one another, suggesting that the analysis failed to fully demix these signals.
We appreciate the reviewer’s insightful concern regarding the filtering and demixing of the 36 Hz visual and 40 Hz auditory signals, and we share the same reservations about the reliance on standard filters and the Hilbert transform method.
To address this, we would like to draw attention to Author response image 1, which demonstrates that a 4 Hz difference is sufficient to effectively demix the signals using our chosen filtering and Hilbert transform approach. We believe that the reason the 36 Hz visual and 40 Hz auditory signals show similar topographies lies not in incomplete demixing but rather in the possibility that this condition difference reflects sensory integration, rather than signal contamination.
This interpretation is further supported by our findings with the intermodulation frequency at 4 Hz, which also suggests cross-modal integration. Furthermore, source localization analysis revealed that the strongest condition differences were observed in the precuneus, an area frequently associated with sensory integration processes. We will expand on this in the discussion section to better clarify this point.
(3) The asymmetric results in the visual and auditory modalities preclude a modality-general conclusion about the function of alpha. However, much of the language seems to generalize across sensory modalities (e.g., use of the term 'sensory' rather than 'visual').
We thank the reviewer for pointing this out and agree that in some cases we have not made a good enough distinction between visual and sensory. We will make sure that when using ‘sensory’, we either describe overall theories, which are not visual-exclusive, or refer to the possibility of a broad sensory increase. However, when directly discussing our results and the interpretation thereof, we will now use ‘visual’ in the revised manuscript.
(4) In this vein, some of the conclusions would be far more convincing if there was at least a trend towards symmetry in source-localized analyses of MEG signals. For example, how does alpha power in the primary auditory cortex (A1) compare when anticipating auditory vs visual target? What do the frequency-tagged visual and auditory responses look like when just looking at the primary visual cortex (V1) or A1?
We thank the reviewer for this important suggestion and have added a virtual channel analysis. We were, however, not interested in alpha power in primary auditory cortex, as we were specifically interested in the posterior alpha, which is usually increased when expecting an auditory compared to a visual target (and used to be interpreted as a blanket inhibition of the visual cortex). We will improve the clarity of this point in the manuscript.
We have, however, followed the reviewer’s suggestion of a virtual channel analysis, showing that the condition differences are not observable in primary visual cortex for the 36 Hz visual signal and in primary auditory cortex for the 40 Hz auditory signal. Our data clearly show that there is an alpha condition difference in V1, while there is no condition difference for 36 Hz in V1 and for 40 Hz in Heschl’s gyrus (see Author response image 2).
Author response image 2.
Virtual channels for V1 and Heschl’s gyrus. A, alpha power for the virtual channel created in V1 (Calcarine_L and Calcarine_R from AAL atlas; Tzourio-Mazoyer et al., 2002, NeuroImage). A cluster permutation analysis over time (between -2 and 0 s) revealed a significant condition difference between ~ -2 and -1.7 s (p = 0.0449). B, 36 Hz frequency-tagging signal for the virtual channel created in V1 (equivalent to the procedure in A). The same cluster permutation as performed in A revealed no significant condition differences. C, 40 Hz frequency-tagging signal for the virtual channel created in Heschl’s gyrus (Heschl_L and Heschl_R from AAL atlas; Tzourio-Mazoyer et al., 2002, NeuroImage). The same cluster permutation as performed in A revealed no significant condition differences.
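For readers unfamiliar with cluster permutation analysis over time, the general idea can be sketched as follows. This is a generic, minimal version on synthetic data and is not the toolbox or parameter set used for the analyses above: the cluster-forming threshold, the number of permutations, the sign-flip scheme for paired data, and all function names are assumptions made for illustration.

```python
import numpy as np


def paired_t(diff):
    """Paired t-value per time point for a participants x time difference matrix."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))


def max_cluster_mass(t_vals, threshold):
    """Largest summed t over contiguous supra-threshold points, per sign."""
    best = 0.0
    for signed in (t_vals, -t_vals):  # positive and negative clusters separately
        current = 0.0
        for t in signed:
            current = current + t if t > threshold else 0.0
            best = max(best, current)
    return best


def cluster_permutation_test(cond_a, cond_b, threshold=2.0, n_perm=500, seed=0):
    """Paired cluster-mass test; the null is built by random sign flips."""
    rng = np.random.default_rng(seed)
    diff = cond_a - cond_b
    observed = max_cluster_mass(paired_t(diff), threshold)
    null = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        null[i] = max_cluster_mass(paired_t(diff * signs), threshold)
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value


# Synthetic example: 20 participants, 50 time points, effect in samples 10-20.
rng = np.random.default_rng(1)
cond_b = rng.standard_normal((20, 50))
cond_a = rng.standard_normal((20, 50))
cond_a[:, 10:21] += 1.5

observed_mass, p_value = cluster_permutation_test(cond_a, cond_b)
print(f"largest cluster mass: {observed_mass:.1f}, p = {p_value:.4f}")
```

Because only the largest cluster statistic of each permutation enters the null distribution, the test controls for multiple comparisons across time points without testing each time point separately.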
(5) Blinking would have a huge impact on the subject's ability to ignore the visual distractor. The best thing to do would be to exclude from analysis all trials where the subjects blinked during the cue-to-target interval. The authors mention that in the MEG experiment, "To remove blinks, trials with very large eye-movements (> 10 degrees of visual angle) were removed from the data (See supplement Fig. 5)." This sentence needs to be clarified since eye-movements cannot be measured during blinking. In addition, it seems possible to remove putative blink trials from EEG experiments as well, since blinks can be detected in the EEG signals.
We thank the reviewer for pointing out that our description of this point was confusing. From the MEG data, we removed eyeblinks using ICA. Only for the supplementary Fig. 5 analysis did we use the eye-tracking data, to confirm that participants were in fact fixating the centre of the screen. For this analysis, we removed trials with blinks (which can be seen in the eye-tracker as huge amplitude movements or as large eye-movements in degrees of visual angle; see Author response image 3 below, which shows a blink in the MEG data and the corresponding eye-tracker data in degrees of visual angle). We will clarify this in the methods section.
As for the concern that participants might close their eyes to ignore visual distractors, in both experiments we observe a highly significant distractor cost in accuracy for visual distractors, which we hope will convince the reviewer that our visual distractors were working as intended.
Author response image 3.
Illustration of eye-tracker data for a trial without and a trial with a blink. All data points recorded during this trial are plotted. A, ICA component 1, which reflects blinks, and its corresponding data trace in a trial. No blink is visible. B, eye-tracker data transformed into degrees of visual angle for the trial depicted in A. C, ICA component 1, which reflects blinks, and its corresponding data trace in a trial. A clear blink is visible. D, eye-tracker data transformed into degrees of visual angle for the trial depicted in C.
(6) It would be interesting to examine the neutral cue trials in this task. For example, comparing auditory vs visual vs neutral cue conditions would be indicative of whether alpha was actively recruited or actively suppressed. In addition, comparing spectral activity during cue-to-target period on neutral-cue auditory correct vs incorrect trials should mimic the comparison of auditory-cue vs visual-cue trials. Likewise, neutral-cue visual correct vs incorrect trials should mimic the attention-related differences in visual-cue vs auditory-cue trials.
We thank the reviewer for this suggestion. We have analysed the neutral cue trials in the EEG dataset (see suppl. Fig. 1) and will expand this figure to show all conditions. There were no significant differences to auditory or visual cues, but descriptively alpha power was higher for neutral cues compared to visual cues and lower for neutral cues compared to auditory cues. While this may suggest that for visual trials alpha is actively suppressed and for auditory trials actively recruited, we do not feel comfortable making this claim, as the neutral condition may not reflect a completely neutral state. The neutral task can still be difficult, especially because of the uncertainty of the target modality.
As for the analysis of incorrect versus correct trials, we love the idea, but unfortunately the accuracy rate was quite high so that the number of incorrect trials would not be sufficient to perform a reliable analysis.
(7) In the abstract, the authors state that "This implies that alpha modulation does not solely regulate 'gain control' in early sensory areas but rather orchestrates signal transmission to later stages of the processing stream." However, I don't see any supporting evidence for the latter claim, that alpha orchestrates signal transmission to later stages of the processing stream. If the authors are claiming an alternative function to alpha, this claim should be strongly substantiated.
We thank the reviewer for pointing out that we have not sufficiently explained our case. The first point refers to gain control akin to the alpha inhibition hypothesis, which claims that increases in alpha disengage a whole cortical area. Since we have confirmed through source analysis that the alpha increase in our data originates from primary visual cortex, this should lead to decreased visual processing. The increase in 36 Hz visual processing therefore directly contradicts the alpha inhibition hypothesis. We propose an alternative explanation for the functionality of alpha activity in this task. Through pulsed inhibition, information packages of relevant visual information could be transmitted down the processing stream, thereby enhancing relevant visual signal transmission. We believe our claim is supported by the fact that the enhanced visual 36 Hz signal correlated with visual alpha power on a trial-by-trial basis and originated not from primary visual cortex but from areas known for sensory integration.
We will make this point clearer in our revised manuscript.
Reviewer #2 (Public review):
Brickwedde et al. investigate the role of alpha oscillations in allocating intermodal attention. A first EEG study is followed up with a MEG study that largely replicates the pattern of results (with small to be expected differences). They conclude that a brief increase in the amplitude of auditory and visual stimulus-driven continuous (steady-state) brain responses prior to the presentation of an auditory - but not visual - target speaks to the modulating role of alpha that leads them to revise a prevalent model of gating-by-inhibition.
Overall, this is an interesting study on a timely question, conducted with methods and analysis that are state-of-the-art. I am particularly impressed by the author's decision to replicate the earlier EEG experiment in MEG following the reviewer's comments on the original submission. Evidently, great care was taken to accommodate the reviewer's suggestions.
We thank the reviewer for the positive feedback and expression of interest in the topic of our manuscript.
Nevertheless, I am struggling with the report for two main reasons: It is difficult to follow the rationale of the study, due to structural issues with the narrative and missing information or justifications for design and analysis decisions, and I am not convinced that the evidence is strong, or even relevant enough for revising the mentioned alpha inhibition theory. Both points are detailed further below.
We thank the reviewer for raising this important point. We will revise our introduction and results in line with the reviewer’s suggestions, hoping that our rationale will then be easier to follow and that our evidence will be more convincing.
Strength/relevance of evidence for model revision: The main argument rests on 1) a rather sustained alpha effect following the modality cue, 2) a rather transient effect on steady-state responses just before the expected presentation of a stimulus, and 3) a correlation between those two. Wouldn't the authors expect a sustained effect on sensory processing, as measured by steady-state amplitude irrespective of which of the scenarios described in Figure 1A (original vs revised alpha inhibition theory) applies? Also, doesn't this speak to the role of expectation effects due to consistent stimulus timing? An alternative explanation for the results may look like this: Modality-general increased steady-state responses prior to the expected audio stimulus onset are due to increased attention/vigilance. This effect may be exclusive (or more pronounced) in the attend-audio condition due to higher precision in temporal processing in the auditory sense or, vice versa, too smeared in time due to the inferior temporal resolution of visual processing for the attend-vision condition to be picked up consistently. As expectation effects will build up over the course of the experiment, i.e., while the participant is learning about the consistent stimulus timing, the correlation with alpha power may then be explained by a similar but potentially unrelated increase in alpha power over time.
We thank the reviewer for raising these insightful questions and suggestions.
It is true that our argument rests on a rather sustained alpha effect and a rather transient effect on steady-state responses and a correlation between the two. However, this connection would not be expected under the alpha inhibition hypothesis, which states that alpha activity would inhibit a whole cortical area (when irrelevant to the task), exerting “gain control”. This notion directly contradicts our results of the “irrelevant” visual information a) being transmitted at all and b) increasing.
However, it has been shown on many occasions that alpha activity exerts pulsed inhibition, so we proposed an alternative theory of an involvement in signal transmission. In this case, the cyclic inhibition would serve as an ordering system, which only allows for high-priority information to pass, resulting in higher signal-to-noise. We do not make a claim about how fast or when these signals are transmitted in relation to alpha power. For instance, it could be that alpha power increases as a preparatory state even before signal is actually transmitted. Zhigalov (2020, Hum. Brain Mapp.) has shown that in V1, frequency-tagging responses were up- and down-regulated with attention, independent of alpha activity.
But we do believe that the fact that visual alpha power correlates on a trial-by-trial level with visual 36 Hz frequency-tagging increases (a relationship which has not been found in V1, see Zhigalov 2020, Hum. Brain Mapp.) suggests a strong connection. Furthermore, the fact that the alpha modulation originates from early visual areas and occurs prior to any frequency-tagging changes, while the increase in frequency-tagging can be observed in areas which are later in the processing stream (such as the precuneus), is strongly indicative of an involvement of alpha power in the transmission of this signal. We cannot fully exclude alternative accounts and mechanisms which affect both alpha power and frequency-tagging responses.
We do believe that the alternative account described by the reviewer does not contradict our theory, as we do believe that the alpha power modulation may reflect an expectation effect (and the idea that it could be related to the resolution of auditory versus visual processing is very interesting!). It is also possible that this expectation is, as the reviewer suggests, related to attention/vigilance and might result in a modality-general signal increase. And indeed, we can observe an increase in the frequency-tagging response in sensory integration areas. Accordingly, we believe that the alternative explanation provided by the reviewer contradicts the alpha inhibition hypothesis, but not necessarily our alternative theory.
We will revise the discussion, which we hope will make our case stronger and easier to follow. Additionally, we will mention the possibility of alternative explanations as well as the possibility that alpha networks fulfil different roles in different locations/task environments.
Structural issues with the narrative and missing information: Here, I am mostly concerned with how this makes the research difficult to access for the reader. I list the major points below:
In the introduction the authors pit the original idea about alpha's role in gating against some recent contradictory results. If it's the aim of the study to provide evidence for either/or, predictions for the results from each perspective are missing. Also, it remains unclear how this relates to the distinction between original vs revised alpha inhibition theory (Fig. 1A). Relatedly if this revision is an outcome rather than a postulation for this study, it shouldn't be featured in the first figure.
We agree with the reviewer that we have not sufficiently clarified our goal as well as how different functionalities of alpha oscillations would lead to different outcomes. We will revise the introduction and restructure the results and hope that it will be easier to follow.
The analysis of the intermodulation frequency makes a surprise entrance at the end of the Results section without an introduction as to its relevance for the study. This is provided only in the discussion, but with reference to multisensory integration, whereas the main focus of the study is focussed attention on one sense. (Relatedly, the reference to "theta oscillations" in this sections seems unclear without a reference to the overlapping frequency range, and potentially more explanation.) Overall, if there's no immediate relevance to this analysis, I would suggest removing it.
We thank the reviewer for pointing this out and will add information about this frequency to the introduction. We believe that the intermodulation frequency analysis is important, as it potentially supports the notion that condition differences in the visual frequency-tagging response are related to downstream processing rather than overall visual information processing in V1. We would therefore prefer to leave this analysis in the manuscript.
Reviewer #3 (Public review):
Brickwedde et al. attempt to clarify the role of alpha in sensory gain modulation by exploring the relationship between attention-related changes in alpha and attention-related changes in sensory-evoked responses, which surprisingly few studies have examined given the prevalence of the alpha inhibition hypothesis. The authors use robust methods and provide novel evidence that alpha likely exhibits inhibitory control over later processing, as opposed to early sensory processing, by providing source-localization data in a cross-modal attention task.
This paper seems very strong, particularly given that the follow-up MEG study both (a) clarifies the task design and separates the effect of distractor stimuli into other experimental blocks, and (b) provides source-localization data to more concretely address whether alpha inhibition is occurring at or after the level of sensory processing, and (c) replicates most of the EEG study's key findings.
We are very grateful to the reviewer for their positive feedback and evaluation of our work.
There are some points that would be helpful to address to bolster the paper. First, the introduction would benefit from a somewhat deeper review of the literature, not just reviewing when the effects of alpha seem to occur, but also addressing how the effect can change depending on task and stimulus design (see review by Morrow, Elias & Samaha, 2023).
We thank the reviewer for this suggestion and agree. We will add a paragraph to the introduction which refers to missing correlation studies and the impact of task design.
Additionally, the discussion could benefit from more cautionary language around the revision of the alpha inhibition account. For example, it would be helpful to address some of the possible discrepancies between alpha and SSEP measures in terms of temporal specificity, SNR, etc. (see Peylo, Hilla, & Sauseng, 2021). The authors do a good job speculating as to why they found differing results from previous cross-modal attention studies, but I'm also curious whether the authors think that alpha inhibition/modulation of sensory signals would have been different had the distractors been within the same modality or whether the cues indicated target location, rather than just modality, as has been the case in so much prior work?
We thank the reviewer for suggesting these interesting discussion points and will include a paragraph in our discussion which goes deeper into these topics.
Overall, the analyses and discussion are quite comprehensive, and I believe this paper to be an excellent contribution to the alpha-inhibition literature.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
General Response to Public Reviews
We thank the three reviewers for their positive evaluation of our work, which presents the first molecular characterization of type-II NB lineages in an insect outside the fly Drosophila. They seem convinced of our finding of an additional type-II NB and increased proliferation during embryogenesis in the red flour beetle. The reviewers expressed hesitations on our interpretation that the observed quantitative differences of embryonic lineages can directly be linked to the embryonic development of the central complex in Tribolium. While we still believe that a connection of both observations is a valid and likely hypothesis, we acknowledge that due to the lack of functional experiments and lineage tracing a causal link has not directly been shown. We have therefore changed the manuscript to an even more careful wording that on one hand describes the correlation between increased embryonic proliferation with the earlier development of the Cx but on the other hand also stresses the need for additional functional and lineage tracing experiments to test this hypothesis. We have also strengthened the discussion on alternative explanations of the increased lineage size and emphasize the less disputed elements like presence and conservation of type-II NB lineages.
While our manuscript could in conclusion not directly show that the reason for the heterochronic shift lies in the progenitor behaviour, we still provide a first approach to answering the question of the developmental basis of this shift, and testable hypotheses directly emerge from our work. We agree with Reviewer #1 that functional work is best suited to test our hypothesis and we are planning to do so. However, we believe that the presented work is already rich in novel data and significantly advances our understanding of the conservation and divergence of type-II NBs in insects. We would also like to stress that most transgenic tools for which genome-wide collections exist for Drosophila have to be created for Tribolium and doing so can be quite time consuming. Conducting RNAi experiments is certainly possible in Tribolium but observing phenotypes in this defined cellular context will need laborious optimization. We have for example tried knocking down Tc-fez/erm but could not see any embryonic phenotype, which might be due to an escaper effect in which only mildly affected or wild type-like embryos survive while the others die in early embryogenesis. Due to pleiotropic functions of the involved genes a cell-specific knockdown might be necessary and we are working towards establishing a system to do that in the red flour beetle. For the stated reasons, we see our work as an important basis to inspire future functional studies that build on the framework that we introduced.
In response to these common points, we have made the following changes to the manuscript:
- The title has been changed from ‘being associated’ to ‘correlate’
- The conclusions part of the abstract has been changed
- We deleted the statement ‘…thus providing the material for the early central complex formation…’
- Rephrased to saying that the two observations just correlate
- The part of the discussion ‘Divergent timing of type-II NB activity and heterochronic development of the central complex’ has been extensively rewritten and now discusses several alternative explanations that were suggested by the reviewers. It also stresses the need for further functional work and lineage tracing (line 859-862 (608-611)).
In addition, we have made numerous changes to the manuscript to account for more specific comments of the reviewers and to the recommendations for the authors.
Our responses to the individual comments can be found in the following.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Insects inhabit diverse environments and have neuroanatomical structures appropriate to each habitat. Although the molecular mechanism of insect neural development has been mainly studied in Drosophila, the beetle, Tribolium castaneum has been introduced as another model to understand the differences and similarities in the process of insect neural development. In this manuscript, the authors focused on the origin of the central complex. In Drosophila, type II neuroblasts have been known as the origin of the central complex. Then, the authors tried to identify those cells in the beetle brain. They established a Tribolium fez enhancer trap line to visualize putative type II neuroblasts and successfully identified 9 of those cells. In addition, they also examined expression patterns of several genes that are known to be expressed in the type II neuroblasts or their lineage in Drosophila. They concluded that the putative type II neuroblasts they identified were type II neuroblasts because those cells showed characteristics of type II neuroblasts in terms of genetic codes, cell diameter, and cell lineage.
Strengths:
The authors established a useful enhancer trap line to visualize type II neuroblasts in Tribolium embryos. Using this tool, they have identified that there are 9 type II neuroblasts in the brain hemisphere during embryonic development. Since the enhancer trap line also visualized the lineage of those cells, the authors found that the lineage size of the type II neuroblasts in the beetle is larger than that in the fly. They also showed that several genetic markers are also expressed in the type II neuroblasts and their lineages as observed in Drosophila.
Weaknesses:
I recommend the authors reconstruct the manuscript because several parts of the present version are not logical. For example, the author should first examine the expression of dpn, a well-known marker of neuroblast. Without examining the expression of at least one neuroblast marker, no one can say confidently that it is a neuroblast. The purpose of this study is to understand what makes neuroanatomical differences between insects which is appropriate to their habitats. To obtain clues to the question, I think, functional analyses are necessary as well as descriptive analyses.
The expression of an exclusive type-II neuroblast marker would indeed have been the most convincing evidence. However, asense is absent from type-II NBs and deadpan is not specific enough as it is expressed in many other cells of the developing protocerebrum. The gene pointed, although also expressed elsewhere, emerged as the most specific marker. Therefore, we start with pointed and fez/erm to describe the first appearance and developmental progression of the cells and then add further evidence that these cells are indeed type-II neuroblasts. Further evidence is provided in the following chapters. We have discussed the need for functional work in the general response.
Reviewer #2 (Public Review):
The authors address the question of differences in the development of the central complex (Cx), a brain structure mainly controlling spatial orientation and locomotion in insects, which can be traced back to the neuroblast lineages that produce the Cx structure. The lineages are called type-II neuroblast (NB) lineages and are assumed to be conserved in insects. While Tribolium castaneum produces a functional larval Cx that only consists of one part of the adult Cx structure, the fan-shaped body, in Drosophila melanogaster a non-functional neuropile primordium is formed by neurons produced by the embryonic type-II NBs which then enter a dormant state and continue development in late larval and pupal stages.
The authors present a meticulous study demonstrating that type-II neuroblast (NB) lineages are indeed present in the developing brain of Tribolium castaneum. In contrast to type-I NB lineages, type-II NBs produce additional intermediate progenitors. The authors generate a fluorescent enhancer trap line called fez/earmuff which prominently labels the mushroom bodies but also the intermediate progenitors (INPs) of the type-II NB lineages. This is convincingly demonstrated by high-resolution images that show cellular staining next to large pointed labelled cells, a marker for type-II NBs in Drosophila melanogaster. Using these and other markers (e.g. deadpan, asense), the authors show that the cell type composition and embryonic development of the type-II NB lineages are similar to their counterparts in Drosophila melanogaster. Furthermore, the expression of the Drosophila type-II NB lineage markers six3 and six4 in subsets of the Tribolium type-II NB lineages (anterior 1-4 and 1-6 type-II NB lineages) and the expression of the Cx marker skh in the distal part of most of the lineages provide further evidence that the identified NB lineages are equivalent to the Drosophila lineages that establish the central complex. However, in contrast to Drosophila, there are 9 instead of 8 embryonic type-II NB lineages per brain hemisphere and the lineages contain more progenitor cells compared to the Drosophila lineages. The authors argue that the higher number of dividing progenitor cells supports the earlier development of a functional Cx in Tribolium.
While the manuscript clearly shows that type-II NB lineages similar to Drosophila exist in Tribolium, it does not considerably advance our understanding of the heterochronic development of the Cx in these insects. First of all, the contribution of these lineages to a functional larval Cx is not clear. For example, how do the described type-II NB lineages relate to the DM1-4 lineages that produce the columnar neurons of the Cx? What is the evidence that the embryonically produced type-II NB lineage neurons contribute to a functional larval Cx? The formation of functional circuits could rely on larval neurons (like in Drosophila) which would make a comparison of embryonic lineages less informative with respect to understanding the underlying variations of the developmental processes. Furthermore, the higher number of progenitors (and consequently neurons) in Tribolium could simply reflect the demand for a higher number of cells required to build the fan-shaped body compared to Drosophila. In addition, the larger lineages in Tribolium, including the higher number of INPs could be due to a greater number of NBs within the individual clusters, rather than a higher rate of proliferation of individual neuroblasts, as suggested. What is the evidence that there is only one NB per cluster? The presented schemes (Fig. 7/12) and description of the marker gene expression and classification of progenitor cells are inconsistent but indicate that NBs and immature INPs cannot be consistently distinguished.
We thank this reviewer for pointing out the inconsistency in our classification of cells within the lineages, which is a central part of our manuscript. The inconsistency was due to confusion between the terms used (young vs. immature). We have corrected this mistake and have changed the naming of the INP subtypes to immature-I and immature-II. We are confident that, based on the analysed markers, type-II NBs and immature INPs can be reliably distinguished.
We agree that a functional link between increased proliferation and heterochronic CX development is not shown, although we consider it likely. As stated in the general response, we have changed the manuscript to say that the two observations (higher number of progenitors and larger lineages/more INPs) correlate but that a causal link can only be hypothesized for the time being. At the same time, we have strengthened the discussion of alternative explanations.
We would like to stand by our statement of an increased number of embryonic progeny of Tribolium type-II NBs. We counted the total number of progenitor cells emerging from the anterior median cluster and divided this by the number of type-II NBs in that cluster. Hence, the reported increase in cell number represents an average per NB and is not influenced by the increased number of NBs. Along the same lines, we have never seen any indication of additional NBs within any cluster, whereas one type-II NB is what we regularly found. Hence, we are confident that we know the number of respective NBs. The fact that the fly data also included neurons and were counted at a later stage indicates that the observed differences are actually minimum estimates.
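As a minimal illustration of the normalisation described above (total progenitor count of the cluster divided by the number of type-II NBs in it), the following Python sketch uses made-up counts. The function name and all numbers are hypothetical and are not taken from the manuscript.

```python
def mean_progenitors_per_nb(total_progenitors: int, n_type2_nbs: int) -> float:
    """Average number of progenitor cells (INPs and GMCs) per type-II NB."""
    if n_type2_nbs <= 0:
        raise ValueError("cluster must contain at least one type-II NB")
    return total_progenitors / n_type2_nbs


# Hypothetical counts: the second cluster has one more NB and more cells in
# total, but the per-NB average is unchanged.
small = mean_progenitors_per_nb(total_progenitors=60, n_type2_nbs=6)
large = mean_progenitors_per_nb(total_progenitors=70, n_type2_nbs=7)
print(small, large)
```

In this toy case an extra NB raises the total cell count without changing the per-NB value, which is why a per-NB average is not inflated simply by a cluster containing more NBs.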
We have discussed that based on the position and comparison to the grasshopper we believe that Tribolium type-II NB 1-4 contribute to the x, y, z and w tracts. To confirm this, lineage tracing experiments would be necessary, for which tools remain to be developed.
We agree that the role of larvally born neurons and the fate of Tribolium neuroblasts through the transition from embryo to larva and pupa need to be further studied.
Available data suggests that the adult fan shaped body in Tribolium does not hugely differ in size from the Drosophila counterpart, although no data in terms of cell number is available. In the larva, however, no fan shaped body or protocerebral bridge can be distinguished in flies while in beetle larvae, these structures are clearly developed. Hence, we think that it is more likely that differences observed in the embryo reflect differences in the larval central complex. We discuss the need for further investigation of larval stages.
The main difference between Tribolium and Drosophila Cx development with regards to the larval functionality might be that Drosophila type-II NB lineage-derived neurons undergo quiescence at the end of embryogenesis so that the development of the Cx is halted, while a developmental arrest does not occur in Tribolium. However, this needs to be confirmed (as the authors rightly observe).
Indeed, there is evidence that cells contributing to the CX go into quiescence in flies – hence, this certainly is one of the mechanisms. However, based on our data we would suggest that in addition, the balance of embryonic versus larval proliferation of type-II lineages is different between the two insects: The increased embryonic proliferation and development leads to a functional larval CX in beetles while in flies, postembryonic proliferation may be increased in order to catch up.
Reviewer #3 (Public Review):
Summary:
In this paper, Rethemeier et al capitalize on their previous observation that the beetle central complex develops heterochronically compared to the fly and try to identify the developmental origin of this difference. For this reason, they use a fez enhancer trap line that they generated to study the neuronal stem cells (INPs) that give rise to the central complex. Using this line and staining against Drosophila type-II neuroblast markers, they elegantly dissect the number and developmental progression of the beetle type II neuroblasts. They show that the NBs, INPs, and GMCs have a conserved marker progression by comparing to Drosophila marker genes, although the expression of some of the lineage markers (otd, six3, and six4) is slightly different. Finally, they show that the beetle type II neuroblast lineages are likely longer than the equivalent ones in Drosophila and argue that this might be the underlying reason for the observed heterochrony.
Strengths:
- A very interesting study system that compares a conserved structure that, however, develops in a heterochronic manner.
- Identification of a conserved molecular signature of type-II neuroblasts between beetles and flies. At the same time, identification of differences in transcription factor expression in the neuroblasts, as well as identification of an extra neuroblast.
- Nice detailed experiments to describe the expression of conserved and divergent marker genes, including some lineaging looking into the co-expression of progenitor (fez) and neuronal (skh) markers.
Weaknesses:
- Comparing between different species is difficult as one doesn't know what the equivalent developmental stages are. How do the authors know when to compare the sizes of the lineages between Drosophila and Tribolium? Moreover, the fact that the authors recover more INPs and GMCs could also mean that the progenitors divide more slowly and, therefore, there is an accumulation of progenitors who have not undergone their programmed number of divisions.
We understand the difficulty of comparing stages between species, but we feel that our analysis is on the safe side. At stages comparable with respect to overall embryonic development (retracting or retracted germband), the fly numbers are clearly smaller. To account for potential heterochronic shifts in NB activity, we have selected the stages to compare based on the criteria given: In Drosophila the number of INPs goes down after stage 16, meaning that they reach a peak at the selected stages. In Tribolium the chosen stages also reflect the phase when lineage size is larger than in all previous stages. Therefore, we believe that the conclusion that Tribolium has larger lineages and more INPs is well founded. Lineage size in Tribolium might further increase just before hatching (stage 15), but for technical reasons we were not able to look at this. As lineage size and the number of INPs go down in the last stage of Drosophila embryogenesis and type-II NBs enter quiescence, we think it is highly unlikely that the ratio between Tribolium and Drosophila INPs reverses at this stage, but a study of the behaviour of type-II NBs in Tribolium, and of whether there is a stage of quiescence, is still needed.
- The main conclusion that the earlier central complex development in beetles is due to the enhanced activity of the neuroblasts is very handwavy and is not the only possible conclusion from their data.
As discussed in the general response we have made several changes to the manuscript to account for this criticism and discuss alternative explanations for the observations.
- The argument for conserved patterns of gene expression between Tribolium and Drosophila type-II NBs, INPs, and GMCs is a bit circular, as the authors use Drosophila markers to identify the Tribolium cells.
We tested the hypothesis that in Tribolium there are type-II NBs with a molecular signature similar to flies. Our results are in line with that hypothesis. If pointed had not clearly marked cells with NB-morphology or fez/erm had not marked dividing cells adjacent to these NBs, we would have concluded that no such cells/lineages exist in the Tribolium embryo, or that central complex producing lineages exist but express different markers. Therefore, we regard this as a valid scientific approach and do not find this point problematic.
An appraisal of whether the authors achieved their aims, and whether the results support their conclusions: Based on the above, I believe that the authors, despite advancing significantly, fall short of identifying the reasons for the divergent timing of central complex development between beetle and fly.
We agree that based on the available data, we cannot firmly make that link and we have changed the text accordingly.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
In addition to these descriptive analyses, functional analyses can be included. RNAi is highly effective in this beetle.
We agree that functional analyses of some of the studied genes and possible effects of gene knockdowns on the studied cell lineages and on central complex development could be highly informative. However, when studying specific cell types or organs, these experiments are less straightforward than they may seem, as knockdowns often lead to pleiotropic effects, sterility or lethality. All the genes involved are expressed in additional cells and may have essential functions there. Given the systemic RNAi of Tribolium, it is challenging to unequivocally assign phenotypes to one of the cell groups. Overcoming these challenges is often possible but needs extensive optimization. Our study, though descriptive, is already rich in data and is the first description of NB-II lineages in Tribolium central complex development. We see it as a basis for future studies on central complex development that will include functional experiments.
(1) Introduction
For these reasons the beetle...
Could you explain the differences in the habitats between Tribolium and Drosophila? or What is the biggest difference between these two species at the ecological aspect?
We have added a short characterisation of the main differences.
The insect central complex is an anterior...
The author should explain why they focus on the structure.
Added
It is however not known how these temporal...
If the authors want to get the answer to the question, they need to conduct functional analyses.
While we agree with the importance of functional work (see above), we believe that detailed descriptions that include molecular markers, as presented here, are very informative in themselves for understanding developmental processes and set the foundation for the analysis of mutant/RNAi phenotypes in future studies.
CX - Central complex?
We have opted to not use this abbreviation anymore for clarity.
“because intermediate cycling progenitors have also been...”
Is the sentence correct?
We have included ‘INPs’ in the sentence to make clear what the comparison refers to and added a comma
“However, molecular characterization of such lineage in another...”
The authors should explain why molecular characterization is necessary.
We have done so
(2) Results
a) Figure 8. Could you delineate the skh/eGFP expression region?
We have added brackets to figure 1 panel A to indicate the extent of skh and other gene expressions within the lineages.
b) This section should be reorganized for better logical flow.
There certainly are different ways to organize this part and we have considered different structures of the results part. We eventually subjectively concluded that the chosen one is the best fit for our data (also see comment below on dpn-expression).
c) For the tables. The authors should mention what statistical analysis they have conducted.
The tables themselves simply list the raw numbers. They are the basis for the graph in figure 9. Statistical tests (t-test) are mentioned in the legend of that figure and now also in the Methods section.
“We also found that the large Tc-pnt...”
The authors could examine the mitotic index using an anti-pH3 antibody.
We have used the anti-pH3 antibody to detect mitoses (figure 3C, table 1 and 3), but as data on mitoses based on this antibody are only a snapshot, it would require a lot of image data to reliably determine an index in these specific cells. While mitotic activity over time, possibly combined with live imaging, might be very interesting in this system, also with regard to the timing of development, for this basic study we are satisfied with the statement that the type-II NBs are indeed dividing at these stages.
“Based on their position by the end of embryogenesis...”
How can the authors conclude that they are neuroblasts without examining the expression of NB markers?
Type-II NBs do not express asense, the key marker for type-I neuroblasts. To corroborate our argument that the cells are neuroblasts we have used several criteria:
- We have used the same markers that are used in Drosophila to label type-II NBs (pnt, dpn, six4). We are not aware of any other marker that would be more specific.
- We have shown that these cells are larger and have larger nuclei than neighbouring cells and that they are dividing.
- We have shown that these cells, through their INP lineages, give rise to central complex neuropile.
We believe that these features taken together leave little doubt that the described cells are indeed neuroblasts.
“We found that the cells they had assigned as...”
How did the authors distinguish that they are really neuroblasts?
We see the difficulty that we first describe the position and development of these cells (e.g. fig 3) and then add further evidence (cell size, additional marker dpn) that these are neuroblasts (also see above). However, without previous knowledge on position (and on pnt expression as the most specific marker) the type-II NB could not have been distinguished from other NBs based on cell size or expression of other markers.
“Conserved patterns of gene expression...”
This must be the first (especially dpn).
Dpn is not specific to type-II NBs because it is also expressed in type-I NBs, mature INPs and possibly other neural cells. It is therefore impossible to identify type-II NBs based on this gene alone. We therefore first used the most specific marker, pnt, in addition to adjacent fez expression, to identify candidates for type-II lineages. Then we mapped expression of further genes on these lineages to support the interpretation (and show homology to the Drosophila lineages). Although the structure of a paper does not necessarily have to reflect the sequence in which experiments were done, we would find putting dpn expression first misleading, as it would not be clear why exactly a certain part of the expression should belong to type-II NBs. Also, our pnt-fez expression data show the position of the NB-II in the context of the whole head lobe, whereas the other gene expressions are shown at higher magnification, focussing on details. We therefore believe that the structure we chose best fits our data, and the other reviewers seemed to find it acceptable as well.
“As type-II NBs contribute to central...”
Before the sentence, the author could explain differences in the central complex structure between Tribolium and Drosophila in terms of cell number and tissue size.
We have added references on the comparisons of tissue sizes, but unfortunately there is no Tribolium data that can be directly compared to available Drosophila resources in terms of cell number.
“We conclude that the embryonic development of...”
How did the authors conclude? They must explain their logic.
Actually, before this sentence, I only found the description of the comparison between Tribolium NBs and Drosophila once.
We agree that this conclusion is not fully evident from the presented data. We have therefore changed this part to state that there is a correlation with the earlier central complex development described in Tribolium. See also our response to the general reviewer comments.
“Hence, we wondered...”
The authors need to do a functional assessment of the genes they mentioned.
We agree that the goals originally stated at the beginning of this paragraph can only be achieved with functional experiments. We have therefore rephrased this part.
(3) Discussion
“A beetle enhancer trap line...”
This part should be moved elsewhere (it does not seem to be a discussion)
In accordance with this comment and reviewer#2’s similar comment, we have removed this section. We have added a statement on the importance of testing the expression of an enhancer trap line to the results part and added the use of CRISPR-Cas9 for line generation to the introduction.
“We have identified a total...”
The authors emphasized that they discovered 9 type II NBs. The authors should clarify how important this is.
We have added some discussion on the importance of this finding.
Dpn is a neural marker - Is this correct?
According to Bier et al 1992 (now added as reference) dpn is a pan-neural marker. Reviewer#2 also recommended calling dpn a neural marker.
“Previous work described a heterochronic...” - reference?
References have been added.
“By contrast, we show that Tribolium...”
What about the number of neurons in the central complex in Tribolium and Drosophila?
Does the lineage size of type II NBs reflect the number?
Unfortunately, we do not have numbers for that.
Reviewer #2 (Recommendations For The Authors):
I recommend using page and line numbers to make reviewing and revising less time-consuming.
We apologize for this oversight. We have included line numbering in our resubmission.
(1) Abstract
"These neural stem cells are believed to be conserved among insects, but their molecular characteristics and their role in brain development in other insect neurogenetics models, such as the beetle Tribolium castaneum have so far not been studied."
I recommend explaining the importance of studying Tribolium with regard to the evolution of brain centres rather than just stating that data are lacking.
We have now emphasized the importance of Tribolium as model for the evolution of brain centres.
"Intriguingly, we found 9 type-II neuroblast lineages in the Tribolium embryo while Drosophila produces only 8 per brain hemisphere."
It should be made clear that the 9 lineages also refer to brain hemispheres.
We have added this information
(2) Introduction
I would remove the first paragraph of the introduction; the use of Tribolium as model representative for insects is too general. The authors should focus on the specific question, i.e. the introduction should start with paragraph 2.
While we can relate to the preference for short and concise writing, we feel that giving some background on Tribolium might be important as we expect that many of our readers might be primarily Drosophila researchers. Keeping this paragraph also seems in line with a recommendation of reviewer#1 to add some additional information on Tribolium ecology.
"Several NBs of the anterior-most part of the neuroectoderm contribute to the CX and compared…”
The abbreviation has not been introduced.
For clarity we have now opted to not use this abbreviation but to always spell out central complex.
"Several NBs of the anterior-most part of the neuroectoderm contribute to the CX and compared to the ventral ganglia produced by the trunk segments, it is of distinctively greater complexity..."
Puzzling statement. Why would you compare a brain center with ventral ganglia? I recommend removing this.
We have changed this statement to simply emphasize the complexity of the brain structure.
"The dramatically increased number of neural cells that are produced by individual type-II lineages, and the fact that one lineage can produce different types of neurons..." In my opinion, this statement is too vague and unprofessional in style. Instead of "dramatically increased" use numbers.
We have removed ‘dramatically increased’ and now give a numeric example.
"The dramatically increased number of neural cells that are produced by individual type-II lineages, and the fact that one lineage can produce different types of neurons, leads to the generation of increased neural complexity within the anterior insect brain when compared to the ventral nerve cord.."
I assume that this statement relates to the comparison of type I and II NB lineages. However, type I NB lineages also produce different types of neurons due to GMC temporal identity and neuronal hemi-lineage identity.
We have rephrased and tried to make clear that the second part of the statement is not specific to type-II NB only. In line with the comment above we have also removed the reference to the ventral nerve cord.
"In addition, in Drosophila brain tumours have been induced from type-II NBs lineages [34], opening up the possibility of modelling tumorigenesis in an invertebrate brain, thus making these lineages one of the most intriguing stem cell models in invertebrates [35,36]."
This statement is misplaced here; it should be mentioned at the start (if at all).
We have moved this statement up.
"However, molecular characterisation of such lineages in another insect but the fly and a thorough comparison of type-II NBs lineages and their sub-cell-types between fly and beetle are still lacking"
The background information should include what is known about type-II NB lineages in Tribolium, including marker gene expression, e.g. Farnworth et al.
We refer to He et al 2019, Farnworth et al 2020 and Garcia-Perez 2021. All these publications speculate about a contribution of type-II NBs to Tribolium central complex development but do not show evidence of it. As we emphasize throughout the manuscript, the present work is the first description of type-II NB in Tribolium.
"The ETS-transcription factor pointed (pnt) marks type-II NBs [40,41], which do not express the type-I NB marker asense (ase) but the pro-neural gene deadpan (dpn)" Deadpan is considered a pan-neural gene. To avoid confusion, I would remove "proneural" throughout.
We have done so throughout the manuscript.
"We further found that, like the type-II NBs itself, the youngest Tc-pnt-positive but fezmm-eGFP-negative INPs neither express Tc-ase (Fig. 5D, pink arrowheads)." What is the evidence that these are the youngest pnt positive cells? Position? This needs to be explained.
We have clarified that ‘youngest pnt-positive cells’ refers to the position of these cells close to the type-II NB.
"Therefore these neural markers can be used for a classification of type II NBs (Tc-pnt+, Tc-ase-), young INPs (Tc-pnt+, Tc-fez/erm-, Tc-ase-), immature INPs (Tc-pnt+, Tc-fez/erm+, Tc-ase+), mature INPs (Tc-dpn+, Tc-ase+, Tc-fez/erm+, Tc-pros+), and GMCs (Tc-ase+, Tc-fez/erm+, Tc-pros+, Tc-dpn). This classification is summarized in Fig. 7 A-B."
This is not the best classification and not in line with the schemes in Figure 7 - the young INPs are also immature. What is the difference? It needs to be explained what "mature" means (dividing?).
Thank you for pointing this out. We have corrected the error in this part that confused the two original groups (young and immature). To take the immaturity of both types of INPs into account, we have also changed our naming of the INP subtypes to immature-I and immature-II throughout the manuscript. Figure 7 and figure 12 were also changed accordingly. While our classification is primarily based on gene expression, the available data indicate that both types of immature INPs are not dividing, whereas mature INPs are. We have added a statement on this to this part.
"In beetles a single-unit functional central complex develops during embryogenesis while in flies the structure is postembryonic."
This statement is vague - the authors need to explain what is meant by "single-unit". The phrase "The structure is postembryonic" also needs more explanation. The Drosophila CX neuroblasts lineages originate in the embryo and the neurons form a commissural tract that becomes incorporated into the fan-shaped body of the Cx.
We have explained single-unit central complex and have improved our summary of known differences in central complex development between fly and beetle.
"To assess the size of the embryonic type-II NBs lineages in beetles we counted the Tc-fez/erm positive (fez-mm-eGFP) cells (INPs and GMCs) associated with a Tc-pnt-expressing type-II NBs of the anterior medial group (type-II NBs lineages 1-7)." It is not clear what is meant by "with a Tc-pnt-expressing type-II NBs". Is this a typo?
We have removed this bit.
(3) Discussion
I would remove the first paragraph "A beetle enhancer trap lines reflects Tc-fez/earmuff expression". This is a repetition of the methods rather than a discussion.
This part has been removed also in line with reviewer#1’s comment.
(4) Figures
Figure 2
To which developing structure do the strongly labelled areas in Figure 2D correspond?
We believe that these areas form the protocerebrum, including the central complex, mushroom bodies and optic lobe. We have added this to the text and to the figure legend.
Figure 7
What do A and B represent? Different stages?
A and B show the same lineage but map the expression of different additional markers for clarity. We have added an explanation of this.
The classification contradicts the description in the section "Conserved patterns of gene expression mark Tribolium type-II NBs, different stages of INPs and GMCs" (last sentence) where young INPs are first in the sequence and described as pnt+, erm-, ase- and immature INPs as pnt+ erm+ and ase+.
We have corrected this mistake and changed the names of the subtypes to immature-I and immature-II (see above).
"We conclude that the evolutionary ancient six3 territory gives rise to the neuropile of the z, y, x and w tracts."
Please clarify if six3 is also expressed in the corresponding grasshopper NB lineages or if your conclusion is based on the comparison of Drosophila and Tribolium and you assume that this is the ancestral condition.
Six3 expression has not been studied in grasshoppers. Owing to the highly conserved nature of an anterior median six3 domain in arthropods and bilaterian animals in general, we would expect it to be expressed anterior-medially in grasshoppers as well. In Drosophila the gene is expressed in the anterior-medial embryonic region where the type-II NBs are expected to develop, but to our knowledge it has not been specifically studied which type-II NB lineages are located within this domain. We have clarified in our text that we do not claim that the origin of anterior-medial type-II NB 1-4 and the X, Y, Z and W lineages from the six3 territory is highly conserved, but only the territory itself. As far as we know, our work is the first to analyse the relationship of type-II lineages and the conserved head patterning genes six3 and otd. We have added some clarification of this to this part of the discussion.
(5) Methods
The methods section should include the methods for cell counting, as well as cell and nuclei size measurements including statistics (e.g. how many embryos, how many NB lineages). The comparison of the Tribolium NB lineage cell numbers to published Drosophila data should include a brief description of the method used in Drosophila (in addition to the method used here in Tribolium) so that the reader can understand how the data compare.
We have added a separate section on this to the Methods part, which also includes the criteria used in Drosophila. We have also added some more information to the results part on the inclusion of neurons in the Drosophila counts, which may only be partially included in our numbers. This does not, however, change the results in terms of larger numbers of progenitor cells in Tribolium.
(6) Typos and minor errors
Abstract
“However, little is known on the developmental processes that create this diversity”
Change to ... little is known about
Changed.
NBs lineages
Change to NB lineages throughout.
We have used text search to find and replace all positions where this was used erroneously.
Results
"Schematic drawing of expression different markers in type-II NB lineages.."
Schematic drawing of expression of different markers
Corrected
Discussion
"However, the type-II NB 7, which is we assigned to the anterior medial group but which..."
.... which we assigned....
Corrected.
"......might be the one that does not have a homologue in the fly embryo The identification of more..." Full stop missing.
Added.
"Adult like x, y, and w tracts as well as protocerebral bridge are...."
Change to "The adult like x, y, and w tracts as well as the protocerebral bridge are...."
This part has been removed with the rewriting of this paragraph.
Reviewer #3 (Recommendations For The Authors):
(1) Suggestions for improved or additional experiments, data, or analyses:
a) The analysis of nuclear size is wrong. The authors compare the largest cell of a cluster of cells with a number of random cells from the same brain. It is obvious that the largest cell of a cluster will be larger than the average cell of the same brain. A better control would be to compare the largest cell of the pnt+ cluster with the largest cell of a random sample of cells, although this also comes with biases. Personally, I have no doubt that the authors are looking at neuroblasts, based on the markers they are using, so I would recommend completely eliminating Figure 4.
We agree that we produced a somewhat biased and expected result when we selected the largest cell of a cluster for size comparison. However, we found it important to show, based on a larger sample, that these cells are also statistically larger than the average cell of a brain, which we think our assessment shows. We do not claim that type-II NBs are the largest cells of a brain, or that they are larger than type-I NBs; therefore, in a random sample there might be cells that are equally big (see also distribution of the control sample shown in figure 4, and we have added a note on this to the text). We are happy to hear that this reviewer has no doubts we are looking at neural stem cells. However, reviewer#1 did express some hesitations, and therefore we think it is important to keep the information on cell size as part of our argument that we are indeed looking at type-II NBs (gene expression, cell size, dividing, part of a neural lineage).
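The selection effect discussed in this exchange can be illustrated with a small simulation. The Python sketch below draws all cell sizes from a single distribution, so that no class of cells is truly larger, and still finds that the largest cell per cluster exceeds the average cell. The cluster size, the distribution parameters and the function name are assumptions chosen for illustration, not values or methods from the manuscript.

```python
import random
import statistics


def simulate_selection_bias(n_clusters=1000, cells_per_cluster=10, seed=0):
    """Return (mean size of all cells, mean size of the largest cell per cluster)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    all_sizes, largest = [], []
    for _ in range(n_clusters):
        # Every cell is drawn from the same distribution (arbitrary units).
        cluster = [rng.gauss(5.0, 1.0) for _ in range(cells_per_cluster)]
        all_sizes.extend(cluster)
        largest.append(max(cluster))
    return statistics.mean(all_sizes), statistics.mean(largest)


mean_all, mean_largest = simulate_selection_bias()
print(f"mean of all cells: {mean_all:.2f}")
print(f"mean of largest cell per cluster: {mean_largest:.2f}")
```

Because a gap appears even when all cells share one distribution, a size difference of this kind is informative mainly in combination with the other criteria listed (gene expression, division, membership of a neural lineage).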
b) The comparison of NB, INP, and GMC numbers between Drosophila and Tribolium (section "The Tribolium embryonic lineages of type-II NBs are larger and contain more mature INPs than those of Drosophila") compares an experiment that the authors did with published data. I would suggest that the authors repeat the Drosophila stainings and make the comparison themselves to avoid cases of batch effects, inconsistent counting, etc.
None of the authors is a Drosophila expert or has any experience in working with this model, and reassessing the lineage size would require a number of combinatorial stainings. Therefore, we feel that using the published data, which were produced by experts and also include repeat experiments, is for us the more reliable approach.
c) In Figure 10, there are some otd+ GFP+ cells laterally. What are these?
We believe that these cells contribute to the eye anlagen. We have added this information to the legend.
(2) Minor corrections to the text and figures:
a) There are some typos in the text: e.g. "pattering" in the abstract.
We have carefully checked the text for typos and hope that we have found everything.
b) The referencing of figures in the text is inconsistent (eg "Figure 5 panel A" vs "Figure 5D" on page 12).
We have checked throughout the manuscript and made sure to always refer to a panel correctly.
c) In Figure 3C, the white staining (anti-PH3) is not indicated in the Figure.
The label has been added in the figure.
d) Moreover, in Figure 3, green is not very visible in the images.
We have improved the colour intensity where possible.
e) In the figures, it might be better to outline the cells with color-coded dashed circles instead of using arrows.
We think that this would obscure some details of the stainings and create a rather artificial representation. We also feel that doing this consistently in all our images would require an amount of work not justified by the expected degree of improvement to the figures.
NOTE: We are submitting a revised version of the supplementary material which only contains two minor changes: a headline was added to Table S4 (Antibodies and staining reagents) and a typo was corrected in line one of table S5 (TC to Tc).
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
First, we thank the reviewers for a thorough reading of our paper and some useful comments. A recurrent remark of the reviewers concerns the appearance of kRas-expressing cells (labelled by a nuclear blue fluorescent marker) which we attribute to the progeny of the initially induced cell. The reviewers suggest that these cells may have been obtained through activation of the Cre-recombinase in other cells by cyclofen released from light scattering, via diffusion, leakiness, etc. These remarks are perfectly reasonable from people not familiar with the cyclofen uncaging approach that we are using, but are unwarranted as we shall show below.
We have been using cyclofen uncaging with subsequent activation of a Cre-recombinase (or some other proteins) since 2010 (see ref. 34, Sinha et al., Zebrafish 7, 199-204 (2010), and our 2018 review, ref. 35, Zhang et al., ChemBioChem 19, 1-8 (2018)). In our experiments, the embryos are incubated in the dark in 6µM caged cyclofen (cCyc) and washed in E3 medium (and transferred to a new medium with no cCyc). In these conditions, over many years we never observed activation of the recombinase, i.e. the appearance of the associated fluorescent label in cells of embryos grown in E3 medium. Hence leakiness can be ruled out (in presence of cCyc or in its absence).
Following transfer of the embryos to new E3 medium we illuminate the embryos locally with light at 405nm. In these conditions, cCyc is only partially uncaged, which results in activation of Cre-recombinase in only a few cells (1, 2, 3, …) within the illuminated region only, namely in the appearance of the kRas-associated nuclear blue fluorescent label in usually one cell (and sometimes in a few more). Data and statistics are now incorporated in the revised manuscript, see Fig.2A and S7. In absence of activation of a reprogramming factor these fluorescently labelled cells disappear within a few days (either via shut-down of their promoter, apoptosis or some other mechanism). The crucial point here is that we see fewer and not more kRas-expressing cells (i.e. with nuclear blue fluorescence) in absence of VentX activation. This observation rules out activation of Cre-recombinase in other cells days after illumination due to leakiness, cyclofen released by light or diffusing from the illumination spot.
To observe many more fluorescent cells days after activation of the initial cell, one needs to transiently activate VentX-GR by overnight incubation in dexamethasone (DEX). Injecting the embryos at 1-cell stage with VentX-GR only or incubating them in DEX (without injection of VentX-GR) does not result in the appearance of more blue fluorescent cells. Following activation of VentX-GR, the fluorescent cells observed a couple of days after initiation are visualized in E3 medium (i.e. in absence of cyclofen) and are localized to the vicinity of the otic vesicle (the region where the initial cell was activated). In the revised manuscript we show images of these fluorescent cells taken a few days apart in the same embryo in which a single cell was initially activated (Fig.S8). Hence, we attribute these cells to the progeny of the activated cell. Obviously, single cell tracking via time-lapse microscopy would definitely nail down this issue and provide fascinating insight into the initial stages of tumor growth. Unfortunately, immobilization of embryos in the usual medium (e.g. MS222, tricaine) over 5-6 days to track the division and motion of single cells is not possible. We are considering some other possibilities (immobilization in bungarotoxin or via photo-activation of anionic channels), but these challenging experiments are for a future paper.
Reviewer #1 (Public Review):
The authors then performed allotransplantations of allegedly single fluorescent TICs in recipient larvae and found a large number of fluorescent cells in distant locations, claiming that these cells have all originated from the single transplanted TIC and migrated away. The number of fluorescent cells shown in the recipient larvae after just two days is not compatible with a normal cell cycle length and more likely represents the progeny of more than one transplanted cell.
As mentioned in the manuscript, we measure the density of cells/nl and inject in the yolk of 2dpf Nacre embryos a volume equivalent to about 1 cell, following published protocols (S.Nicoli and M.Presta, Nat.Prot. 2,2918 (2007)). We further image the injected cell(s) by fluorescence microscopy immediately following injection, as shown in Fig.4A and Fig.S8B. We might miss a few cells but not many. With a typical cell cycle of ~10h the images of tumors in larvae at 3dpt (and not 2dpt) correspond to ~100 cells. In any case the purpose of this experiment was to show that the progeny of the initial induced cell is capable of developing into a tumor in a naïve fish, which is the operational definition of cancer that we adopted here.
The ability to migrate from the injection site should be documented by time-lapse microscopy.
As stated above, our purpose here is not to study tumor formation from transplanted cell(s) but to use that assay as an operational test of cancer. Besides, as mentioned earlier, single cell tracking in larvae over 3-4dpt is not a trivial task.
Then, the authors conclude that "By allowing for specific and reproducible single cell malignant transformation in vivo, their optogenetic approach opens the way for a quantitative study of the initial stages of cancer at the single cell level". However, the evidence for these claims is weak and further characterization should be performed to:
(1) Show that they are actually activating the oncogene in a single cell (the magnification is too low and it is difficult to distinguish a single nucleus, labelling of the cell membrane may help to demonstrate that they are effectively activating the oncogene in, or transplanting, a single cell)
In the revised manuscript we provide larger magnification of the initial induced cell and show examples of oncogene activation in more than one cell.
(2) The expression of the genes used as markers of tumorigenesis is performed in whole larvae, with only a few transformed cells in them. Changes should be confirmed in FACS sorted fluorescent cells
When the oncogene is activated in a whole larva, all cells are fluorescent and thus FACS is of no use for cell sorting. Sorting could be done in larvae where single cells are activated, but then the efficiency of FACS is not good enough to isolate the few fluorescent cells among the many more non-fluorescent ones. We agree that the expression change of the genes used as markers of tumorigenesis is an underestimate of their true change, but our goal at this time is not to precisely measure the change in expression level, but to show that the pattern of change was different from the controls and corresponded to what is expected in tumorigenesis.
(3) The histology of the so called "tumor masses" is not showing malignant transformation, but at the most just hyperplasia.
The histology of the hyperplasic tissues shows cellular proliferation with a higher density of nuclear material, which is characteristic of tumors (Fig.S4C). Besides, the increased expression of pERK in these tissues (Fig.S4A,B) is also a hallmark of cancer.
In the brain, the sections are not perfectly symmetrical and the increase of cellularity on one side of the optic tectum is compatible with this asymmetry.
The expected T-shape formed by the sections of the tegmentum and hypothalamus is compatible with the symmetric sections shown in Fig.2D. The asymmetry in the optic tectum is a result of the hyperplasic growth.
(4) The number of fluorescent cells found dispersed in the larvae transplanted with one single TIC after 48 hours will require a very fast cell cycle to generate over 50 cells. Do we have an idea of the cell cycle features of the transplanted TICs?
As answered above, the transplanted larvae are shown at 3dpt. With a cell cycle of about 10h, a single cell can give rise to about 100 cells in that time lapse.
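The arithmetic behind this reply can be checked with a short sketch. It assumes idealised growth from a single cell: every cell divides once per cycle, with no cell death and no lag after transplantation. The 10 h cycle and the 2 dpt and 3 dpt time points are the values quoted in this exchange; the function names are hypothetical and chosen only for illustration.

```python
import math


def cells_after(hours, cycle_h=10.0, synchronous=False):
    """Clone size grown from one cell, assuming unchecked doubling.

    With synchronous=True only completed division rounds are counted.
    """
    doublings = hours / cycle_h
    if synchronous:
        doublings = math.floor(doublings)
    return 2 ** doublings


def max_cycle_for(cells, hours):
    """Longest cell cycle (in hours) that yields `cells` descendants within `hours`."""
    return hours / math.log2(cells)


at_3dpt = cells_after(72)             # continuous approximation at 3 dpt
at_3dpt_sync = cells_after(72, synchronous=True)
at_2dpt = cells_after(48)             # the 48 h case raised by the reviewer
cycle_needed = max_cycle_for(50, 48)  # cycle required for 50 cells in 48 h

print(f"3 dpt: ~{at_3dpt:.0f} cells ({at_3dpt_sync} if divisions are synchronous)")
print(f"2 dpt: ~{at_2dpt:.0f} cells")
print(f"50 cells in 48 h needs a cycle of at most ~{cycle_needed:.1f} h")
```

Under these assumptions a 10 h cycle gives on the order of 100 cells at 3 dpt but fewer than 30 at 2 dpt, which is why the distinction between 2 dpt and 3 dpt matters for the reviewer's objection.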
Reviewer #2 (Public Review):
Summary:
This paper describes a genetically tractable and modifiable system …which could be used to study an array of combinations and temporal relationships of these cancer drivers/modifiers.
We thank this referee for their positive comments. We would also like to point out that our approach provides the first quantitative means to estimate the probability of tumorigenesis from a single cell, an estimate which is crucial in any assessment of cancer malignancy and the effectiveness of prophylactics.
Weaknesses:
There is minimal quantitation of … the efficiency of activation of the Ras-TFP fusion (Fig 1) in, purportedly, a single cell. …, such information seems essential.
We have added more images of induction of a single cell (or a few cells) and a plot where the probability of RAS activation in one or a few cells is specified.
The authors indicate that a single cell is "initiated" (Fig 2) using the laser optogenetic technique, but without definitive genetic lineage tracing, it is not possible to conclude that cells expressing TFP distant from the target site near the ear are daughter cells of the claimed single "initiated" cell. A plausible alternative explanation is 1) that the optogenetic targeting is more diffuse (i.e. some of the light of the appropriate wavelength hits other cells nearby due to reflection/diffraction), so these adjacent cells are additional independent "initiated" cells or 2) that the uncaged tamoxifen analogue can diffuse to nearby cells and allow for CreER activation and recombination.
We have addressed this point in our general comments to the reviewers’ remarks. The possibilities mentioned by this reviewer would result in cells expressing TFP in absence of VentX activation, which is NOT the case. Cells expressing TFP away from the initial site are observed DAYS after activation of the oncogene (and TFP) in a single cell and ONLY upon activation of VentX.
In Fig 2B, the claim is made that "the activated cell has divided, giving rise to two cells" - unless continuously imaged or genetically traced, this is unproven.
We have addressed this remark previously. Tracking of larvae over many days is not possible with the usual protocol using tricaine to immobilize the larvae. Nonetheless, in the revised version we present images of an embryo imaged at various times post activation (1hpi, 3dpi, 7dpi) where proliferation and metastasis of the cells can be observed. We are pursuing other alternatives for time-lapse microscopy over many days, since besides convincing the sceptics, a single cell tracking experiment (possibly coupled with in-situ spatial transcriptomics) will shed a new and fascinating light on the initial stages of tumor growth.
In addition, it appears that Figures S3 and S4 are showing that hyperplasia can arise in many different tissues (including intestine, pancreas, and liver, S4C) with broad Ras + Ventx activation …. This should be clarified in the manuscript).
This is true and has been clarified in the new version.
In Fig S7 where single cell activation and potential metastasis is discussed, similar gut tissues have TFP+ cells that are called metastatic, but this seems consistent with the possibility that multiple independent sites of initiation are occurring even when focal activation is attempted.
As mentioned previously this is ruled out by the fact that these cells are observed days after cyclofen uncaging (and TFP activation) and IF AND ONLY IF VentX was activated during the first dpi.
Although the hyperplastic cells are transplantable (Fig 4), the use of the term "cells of origin of cancer" or metastatic cells should be viewed with care in the experiments showing TFP+ cells (Fig 1, 2, 3) in embryos with targeted activation for the reasons noted above.
The purpose of this transplantation experiment was to show that cells in which both kRas and VentX have been activated possess the capacity to metastasize and develop a tumor mass when transplanted in a naïve zebrafish. This - to the best of our knowledge - is the operational definition of a malignant tumor. Notice also that transplantation of kRAS-only activated cells (i.e. without subsequent activation of VentX) does NOT yield tumors; rather, the transplanted cell disappears after a few days, see Fig.S10.
Reviewer #3 (Public Review):
Summary:
This study employs an optogenetics approach … to examine tumorigenesis probabilities under altered tissue environments.
We thank this reviewer for this remark, since we believe that the ability to assess the probability of tumorigenesis from a single cell is probably the most significant contribution of this work.
Weaknesses:
Lack of Methodological Clarity: The manuscript lacks detailed descriptions of methodologies,
We have included additional detail of our methodology and statistical analyses in the revised manuscript.
Sub-optimal Data Presentation and Quality:
Lack of quantitative data and control condition data obtained from images of higher magnification limits the ability to robustly support the conclusions.
We have included more images at higher magnification and quantitative data to support the main report of targeted single cell induction.
Here are some details:
Authors might want to provide more evidence to support their claim on the single cell KRAS activation.
More images and data on activation of single or few cells in the illumination field are provided, as well as a statistical analysis of cell induction.
Stability of cCYC: The manuscript does not provide information on the half-life and stability of cCYC. Understanding these properties is crucial for evaluating the system's reliability and the likelihood of leakiness, which could significantly influence the study's outcomes.
We have been using the cCyc system for about 14 years. We refer the reader to our previous papers and reviews on this methodology. Briefly, cCyc is stable when not illuminated with light around 375nm. Typically, we incubate our embryos in the dark for about 1h before washing, transferring them into E3 medium and illuminating them. Assessing the leakiness of the system is easy as expression of a fluorescent marker is permanently turned on. We have observed none in the conditions of our experiment or in previous works.
Metastatic Dissemination claim: However, the absence of a supportive cellular compartment within the fin-fold tissue makes the presence of mTFP-positive metastatic cells there particularly puzzling. This distribution raises concerns about the spatial specificity of the optogenetic activation protocol … The unexpected locations of these signals suggest potential ectopic activation of the KRAS oncogene,
We have addressed this remark in the introduction and above. Specifically, metastatic and proliferative mTFP-positive cells are observed IF AND ONLY IF VentX is also activated concomitant with activation of kRAS in a single cell. No proliferative cells are observed in absence of VentX activation, or in presence of VentX or Dex alone, or if kRAS has not been activated by cyclofen uncaging.
Image Resolution Concerns: The cells depicted in Figure 3C β, which appear to be near the surface of the yolk sac and not within the digestive system as suggested in the MS, underscore the necessity for higher-resolution imaging. Without clearer images, it is challenging to ascertain the exact locations and states of these cells, thus complicating the assessment of experimental results.
Better images are provided in the revised version.
The cell transplantation experiment is lacking protocol details:
Details are provided. We have followed regular protocols for transplantation: S.Nicoli and M.Presta, Nat.Prot. 2,2918 (2007).
If the cells are obtained from whole larvae with induced RAS + VX expression, it is notable and somewhat surprising that the larvae survived up to six days post-induction (6dpi) before cells were harvested for transplantation. This survival rate and the subsequent ability to obtain single cell suspensions raise questions about the heterogeneity of the RAS + VX expressing cells that transplanted.
From Fig.S4D, about 50% of the embryos survive at 6dpi. Though an interesting question by itself we have not (yet) addressed the important issue of the heterogeneity of the outgrowth obtained from a single cell. Our purpose here was just to show that cells in which both kRAS and VentX have been activated possess the capacity to metastasize and develop a tumor mass when transplanted in a naïve zebrafish. This - to the best of our knowledge - is the operational definition of a malignant tumor.
Unclear Experimental Conditions in Figure S3B: …It is not specified whether the activation of KRAS was targeted to specific cells or involved whole-body exposure.
This was whole body (global) illumination and is specified in the revised version.
Contrasting Data in Figure S3C compared to literature: The graph in Figure S3C indicates that KRAS or KRAS + DEX induction did not result in any form of hyperplastic growth. Detailed descriptions of the conditions under which the experiments in Figure S3B were conducted, and a clarification of the reasons for the discrepancies observed in Figure S3C, are crucial. The authors should discuss potential reasons for the deviation from previous reports.
This discrepancy is discussed in the revised version. First, the previous reports consider the development of tumors within 3-4 weeks, which we have not studied in detail. Second, the expression of the oncogene in these reports might be stronger than in ours. Third, the stochastic and random appearance of tumors in these reports suggests that some other mechanism (transient stress-induced reprogramming?) might have activated the oncogene in the initial cell.
Further comments:
Throughout the study, KRAS-activated cell expansion and metastasis are two key phenotypes discussed that Ventx is promoting. However, the authors did not perform any experiments to directly show that KRAS+ cells proliferate only in Ventx-activated conditions.
Yes, we did. See Fig. S1 and compare with Fig.S3B, or Fig.S10A in comparison with Fig.2A,B.
The authors also did not show any morphological features or time-lapse videos demonstrating that KRAS+ cells are motile, even though zebrafish is an excellent model for in vivo live imaging. This seems to be a missed opportunity for providing convincing evidence to support the authors' conclusions.
Performing time-lapse microscopy on larvae over many (4-5) days is not possible with the regular tricaine protocol for immobilization. We are definitely planning such experiments, but they will require some other protocol, perhaps using bungarotoxin or some optogenetic inhibitory channels.
There were minimal experimental details provided for the qPCR data presented in the supplementary figures S5 and S6; therefore, it is hard to evaluate the results obtained.
More details are given in the revised version.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Abstract: what is the definition of tumors that they are using? I never heard of a full-blown tumor that develops in less than 6 days from a single cell!
This is indeed surprising! We are using an operational definition of a tumor: if cells from a hyperplasic tissue can metastasize and outgrow when transplanted in a naïve zebrafish, then it is a tumor.
Introduction: The claim that this is the first report of the induction of oncogene expression in a single cell in zebrafish is wrong as there are other reports (PMID: 27810924, PMID: 30061297)
These other approaches are invasive (electroporation and transplantation). We have added non-invasive in the revised version.
Figure 2: The quality of these images is too low to visualize the infiltration that they talk about, the sections are not perfectly coronal and the asymmetric distribution of cells may be confused with an infiltration.
We have addressed this question above.
Results, page 5: how do we know that these are metastatic cells? there could have been spurious activation in other locations, you need to prove that these cells moved from one place to the other and that they are of the same cell type as the primary tumor
We have addressed this question extensively in the introduction and in our answers to the reviewers. We have also added a figure showing cell proliferation in the same embryos at various times post induction. Time-lapse microscopy studies of tumor initiation and growth over many days are planned, but will be the subject of another paper.
Figure 3: not clear why they did not use anaesthetic or mounting media to take pictures of the transplanted fish
We tried to minimally stress the larvae that are already in a perilous condition…
Results, page 6: Not clear why the authors used KRAS v12 as an oncogene and uncaged its expression in the brain, as KRAS is not a common oncogene for brain tumors.
There are reports of kRASG12V tumors in zebrafish brain (doi: 10.1186/s12943-015-0288-2)
It is not clear what is the mechanism of Ventx -driven oncogenesis? What changes in gene expression, cell function etc are induced by Ventx in the cells that express KRASv12? The qPCR analysis performed is done on whole larvae and an analysis on single TICs and their progeny should be done following FACS sorting of fluorescent cells.
FACS sorting of a single TIC (and its progeny) among many thousand cells in the embryo is not possible. The analysis on whole larvae provides an underestimate of the changes in gene expression following activation of kRAS and VentX. We are looking for spatial transcriptomics as a better approach of the changes in gene expression induced in single TICs and their progeny, but that is beyond the scope of this paper.
Nuclear staining is necessary to make sure that only 1 cell was transplanted. How is it possible that we get more than 50 cells from a single transplanted cell in less than 48 hours? What is the length of the cell cycle of these transformed cells?
Nuclear staining is not necessary as the transplanted cell is fluorescent. Thus we can see how many cells are transplanted. With a cell cycle of about 10h, a single cell will have generated as many as 100 cells by 3dpt.
Reviewer #2 (Recommendations For The Authors):
Minor grammatical change - hyperplasic more commonly called hyperplastic.
Reviewer #3 (Recommendations For The Authors):
Provide Detailed Methodologies: Clearly describe all experimental protocols used, particularly those for cell transplantation and photo-activation techniques. Detailed protocols will aid in replicating your findings and enhancing the manuscript's credibility.
Done.
Provide High-Resolution Imaging data: To substantiate the claims about cell location and behaviour, provide high-resolution images where individual cells and their specific tissue contexts are clearly visible.
Greater magnification images provided.
Quantitative Data: Incorporate quantitative analyses to strengthen the findings, particularly in experiments where cell proliferation and activation are key outcomes.
Done.
Verify Single Cell Activation: Offer additional evidence or experimental validation to support the claim that KRASG12V activation is confined to single cells, considering the limitations mentioned about the photo-activation setup.
Discussion, figures and statistical analysis added in manuscript.
Discuss Stability and Leakage of cCYC: Provide data on the stability and half-life of cCYC to assess the likelihood of system leakiness, which could influence the interpretation of your results.
Reference to our previous papers and reviews added.
Clarify Metastatic Claims: Discuss the unexpected presence of mTFP-positive cells in nontraditional metastatic sites, like the fin fold, and consider additional experiments to verify whether these are cases of ectopic activation or true metastasis.
Discussion added in manuscript.
Utilize time-lapse live imaging to visually document the motility and behaviour of KRAS+ cells over time, leveraging the strengths of the zebrafish model.
Definitely interesting, but non-trivial to conduct over many days and a subject for a future paper.
Address Discrepancies in KRAS Activation Effects from literature: Specifically, discuss why your findings on KRAS-induced hyperplasia differ from existing literature. Consider whether experimental conditions or KRAS expression levels might have contributed to these differences.
Discussion added in revised version.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Reviewer #1 (Public review):
When different groups (populations, species) are presented with similar environmental pressures, how similar are the ultimate targets (genes, pathways)? This study sought to illuminate this broader question via experimental evolution in D. simulans and quantifying gene-expression changes, specifically in the context of standing genetic variation (and not de novo mutation). Ultimately, the authors showed pleiotropy and standing-genetic variation play a significant role in the "predictability" of evolution.
The results of this manuscript look at the interplay between pleiotropy, standing genetic variation and parallelism (i.e. predictability of evolution) in gene expression. Ultimately, their results suggest that (a) pleiotropic genes typically have a smaller range in variation/expression, and (b) adaptation to similar environments tends to favor changes in pleiotropic genes, which leads to parallelism in mechanisms (though not dramatically). However, it is still uncertain how much parallelism is directly due to pleiotropy, instead of a complex interplay between them and ancestral variation.
Yes, the reviewer is correct that our results for the direct effects of pleiotropy were not consistent for both measures of pleiotropy. We highlight this in the discussion: “Only tissue specificity had a significant direct effect, which was even larger than the indirect effect (Table 2). No significant direct effect was found for network connectivity. The discrepancy between the two measures of pleiotropy is particularly interesting given their significant correlation (Supplementary Figure 1). This suggests that both measures capture aspects of pleiotropy that differ in their biological implications.”
Reviewer #2 (Public review):
Summary:
Lai and collaborators use a previously published RNAseq dataset derived from an experimental evolution set up to compare the pleiotropic properties of genes whose expression evolved in response to fluctuating temperature for over 100 generations. The authors correlate gene pleiotropy with the degree of parallelism in the experimental evolution set up to ask: are genes that evolved in multiple replicates more or less pleiotropic?
They find that, maybe counter to expectation, highly pleiotropic genes show more replicated evolution. And such effect seems to be driven by direct effects (which the authors can only speculate on) and indirect effect through low variance in pleiotropic genes (which the authors indirectly link to genetic variation underlying gene expression variance).
Weaknesses:
The results offer new insights into the evolution of gene expression and into the parameters that constrain such evolution, i.e., pleiotropy. Although the conclusions are supported by the data, I find the interpretation of the results a little bit complicated.
We are very happy to read that the reviewer finds our conclusions to be supported by the data.
Major comment:
The major point I ask the authors to address is whether the connection between polygenic adaptation and parallelism can indeed be used to interpret gene expression parallelism. If the answer is no, please rephrase the introduction and discussion; if the answer is yes, please make it explicit in the text why it is so.
Yes, we think that gene expression parallelism can be explained by polygenic adaptation.
The authors' argument: parallelism in gene expression is the same as parallelism in SNP allele frequency (AFC) (see L389-383; here they don't mention that this explanation is derived from SNP parallelism and not trait parallelism, and see Fig1 b). In previous publications the authors have explained the low level of AFC parallelism using a polygenic argument. Polygenic traits can reach a new trait optimum via multiple SNPs and therefore, although the trait is parallel across replicates, the SNPs are not necessarily so.
In the current paper, they seem to be exchanging SNP AFC by gene expression, and to me, those are two levels that cannot be interchanged. Gene expression is a trait, not a SNP, and therefore the fact that a gene expression doesn't replicate cannot be explained by polygenic basis, because again the trait is gene expression itself. And, actually the results of the simulations show that high polygenicity = less trait parallelism (Fig4).
We agree with the reviewer that it is important to consider different hierarchies when talking about the implications of polygenic adaptation. The lowest hierarchical level is SNP variation and the highest level is fitness. In-between these extreme hierarchical levels is gene expression. While gene expression is a trait itself, as correctly pointed out by the reviewer, it is possible that selection is not favoring a specific trait value, because selection targets a trait on a higher hierarchical level. This implies that not only SNPs, but also intermediate traits such as gene expression can exhibit redundancy. Considering a simple example of one selected trait (e.g. body size), which is affected by the expression level of two genes A and B, each regulated by SNP A1, A2 and B1, B2. It is now possible to modulate the focal trait by allele frequency changes of A1, which in turn will only affect gene A. Alternatively, SNP B2 may change, modifying the expression of gene B, leading to the same change in body size. Hence, we could have redundancy both at the SNP level as well as on the gene expression level (although higher redundancy is expected on the SNP level). Most importantly, this redundancy at intermediate hierarchical levels is not pure theory, but it is supported by empirical evidence. We have shown that redundancy exists not only for gene expression (10.1111/mec.16274) but also for metabolite concentrations (10.1093/gbe/evad098).
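The two-gene example above can be written as a toy simulation. This is only an illustrative sketch, not the analysis used in the manuscript: it assumes a purely additive trait (body size = expression of gene A + expression of gene B), assumes that each replicate population reaches the same new optimum by changing exactly one of the two genes at random, and uses hypothetical function names.

```python
import random


def evolve_replicate(rng, target_shift=1.0):
    """Reach the same trait shift by changing either gene A or gene B."""
    changed_gene = rng.choice(["A", "B"])
    expr_change = {"A": 0.0, "B": 0.0}
    expr_change[changed_gene] = target_shift
    # Additive assumption: the selected trait is the sum of both expression levels.
    trait_change = expr_change["A"] + expr_change["B"]
    return expr_change, trait_change


def simulate(n_replicates=10, seed=0):
    """Count how many replicates used gene A versus gene B to shift the trait."""
    rng = random.Random(seed)
    replicates = [evolve_replicate(rng) for _ in range(n_replicates)]
    trait_changes = [trait for _, trait in replicates]
    used_a = sum(1 for expr, _ in replicates if expr["A"] > 0)
    used_b = sum(1 for expr, _ in replicates if expr["B"] > 0)
    return trait_changes, used_a, used_b


trait_changes, used_a, used_b = simulate()
print("trait change per replicate:", trait_changes)
print("replicates changing gene A:", used_a, "| gene B:", used_b)
```

In this sketch every replicate shows the same change in the selected trait, while the expression change of any single gene is shared by only a subset of replicates. That is the sense in which redundancy can appear at the gene expression level as well as at the SNP level.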
Now, if the authors focus on high parallel genes (present in e.g. 7 or more replicates) and they show that the eQTLs for those genes are many (highly polygenic) and the AFC of those eQTL are not parallel, then I would agree with the interpretation. But, given that here they just assess gene expression and not eQTL AFC, I do not think they can use the 'highly polygenic = low parallelism' explanation.
This is clearly an interesting proposed research project, but we doubt that it would result in the expected outcome. Since most of the adaptive gene expression changes do not have a simple genetic basis (10.1093/gbe/evae077) and most expression variation is determined by trans-regulatory effects (10.1038/s41576-020-00304-w), eQTL mapping will most likely not identify all contributing loci. Large effect loci are more easily identified, but they are also expected to be more parallel.
The interpretation of the results to me, should be limited to: genes with low variance and high pleiotropy tend to be more parallel, and the explanation might be synergistic pleiotropy.
We thank the reviewer for the suggestion, but prefer to stick to our interpretation of the data.
Comments on revisions: The authors didn't really address any of the comments made by any of the reviewers - basically nothing was changed in the main text. Therefore, I leave my original review unchanged.
We respectfully disagree: in our point-by-point reply, we responded to all of the reviewers’ comments. Since we did not identify any major problem in our manuscript, we only modified the wording in some parts where we felt that a clarification could resolve the misunderstanding of the reviewers. In response to the reviewers’ comments, we added a new paragraph in the discussion and generated a new figure.
Reviewer #3 (Public review):
The authors aim to understand how gene pleiotropy affects parallel evolutionary changes among independent replicates of adaptation to a new hot environment of a set of experimental lines of Drosophila simulans using experimental evolution. The flies were RNA-sequenced after more than 100 generations of lab adaptation and the changes in average gene expression were obtained relative to ancestral expression levels from reconstructed ancestral lines. Parallelism of gene expression change among lines is evaluated as variance in differential gene expression among lines relative to error variance. Similarly, the authors ask how the standing variation in gene expression estimated from a handful of flies from a reconstructed outbred line affects parallelism. The main findings are that parallelism in gene expression responses is positively associated with pleiotropy and negatively associated with expression variation. Those results are in contradiction with theoretical predictions and empirical findings. To explain those seemingly contradictory results the authors invoke the role of synergistic pleiotropy and correlated selection, although they do not attempt to measure either.
Strengths:
The study uses highly replicated outbred laboratory lines of Drosophila simulans evolved in the lab under constant hot regime for over 100 generations. This allows for robust comparisons of evolutionary responses among lines.
The manuscript is well written and the hypotheses are clearly delineated at the onset.
The authors have run a causal analysis to understand the causal dependencies between pleiotropy and expression variation on parallelism.
The use of whole-body RNA extraction to study gene expression variation is well justified.
Weaknesses:
The accuracy of the estimate of ancestral phenotypic variation in gene expression is likely low because it is estimated from a small sample of 20 males from a reconstructed outbred line. It might not constitute a robust estimate of the genetic variation of the evolved lines under study.
We agree with the reviewer that variation estimates based on 20 samples are not very precise. Nevertheless, we demonstrated that the estimated variance in gene expression was highly correlated between two independent samples from the same ancestral population. Furthermore, we identified a significant correlation of expression variance with evolutionary parallelism. In other words, the biological signal was sufficiently strong even though the variance estimate was noisy.
There are no estimates of the standing genetic variation of expression levels of the genes under study, only estimates of their phenotypic variation. I wished the authors had been clear about that limitation and had refrained from equating phenotypic variation in expression level with standing genetic variation.
The reviewer is right that we did not estimate genetic variation of gene expression, but used expression variation as a proxy for the standing genetic variation. There are two potential problems with this approach. First, a large expression variation could be caused by a single large-effect variant segregating at intermediate frequency. Such large-effect variants will exhibit a highly parallel selection response, contrary to our empirical results. Since we have shown previously (10.1093/gbe/evae077) that adaptive gene expression changes are mostly polygenic, we do not consider this extreme scenario to be very relevant in our study. Rather, we would like to emphasize that neither a SNP analysis of the 5’ region nor an eQTL study will provide an unbiased estimator of genetic variation of gene expression. The second problem arises if gene expression noise differs among genes; in that case, noisier genes will appear to have more standing genetic variation than genes with less noise. Since we average across many different cells and cell types, gene expression noise is expected to be levelled out; this aspect is discussed in detail in the manuscript.
In other words, despite these two potential limitations, we consider our approach superior to alternative approaches of estimating genetic variation in gene expression.
Moreover, since the phenotype studied is gene expression, its genetic basis extends beyond expressed sequences. The phenotypic variation of a gene's expression may thus likely misrepresent the genetic variation available for its evolution. The authors do not present evidence that sequence variation correlates with expression variation.
Gene expression is determined by the joint effects of cis-regulatory and trans-regulatory variation. Hence, recombination can create more extreme phenotypes than those of the parental lines (in quantitative genetics this is called transgressive segregation). It is unclear to what extent this constitutes a problem for our analyses. Nevertheless, we would like to point out that eQTL mapping will miss many trans-acting variants, and therefore we doubt that obtaining the requested empirical evidence for a correlation between genetic variation (estimated by eQTL mapping) and observed expression variation is as straightforward as suggested by the reviewer.
Nevertheless, we reference an empirical study, which showed a positive correlation between expression variation and cis-regulatory variation.
The authors have not attempted to estimate synergistic pleiotropy among genes, nor how selection acts on gene expression modules. It makes their conclusion regarding the role of synergistic pleiotropy rather speculative.
The reviewer is correct that we did not demonstrate synergistic pleiotropy, but we discuss this as a possible explanation for the observed direct effects of pleiotropy.
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
The results of this manuscript look at the interplay between pleiotropy, standing genetic variation, and parallelism (i.e. predictability of evolution) in gene expression. Ultimately, their results suggest that (a) pleiotropic genes typically have a smaller range in variation/expression, and (b) adaptation to similar environments tends to favor changes in pleiotropic genes, which leads to parallelism in mechanisms (though not dramatically). However, it is still uncertain how much parallelism is directly due to pleiotropy, instead of a complex interplay between them and ancestral variation.
I have a few things that I was uncertain about. It may be these things are easily answered but require more discussion or clarity in the manuscript.
(1) The variation being talked about in this manuscript is expression levels, and not SNPs within coding regions (or elsewhere). The cause of any specific gene having a change in expression can obviously be varied - transcription factors, repressors, promoter region variation, etc. Is this taken into account within the "network connectivity" measurement? I understand the network connectivity is a proxy for pleiotropy - what I'm asking is, conceptually, what can be said about how/why those highly pleiotropic genes have a change (or not) in expression. This might be a question for another project/paper, but it feels like a next step worth mentioning somewhere.
In the current study, we are only able to detect significant and repeatable expression changes but are unable to identify the underlying causal variants. An eQTL study in the founder population in combination with genomic resequencing for both evolved and ancestral populations would be required to address this question.
(2) The authors do have a passing statement in line 361 about cis-regulatory regions. Is the assumption that genetic variation in promoter regions is the ultimate "mechanism" driving any change in expression? In the same vein, the authors bring up a potential confounding factor, though they dismiss it based on a specific citation (lines 476-481; citation 65). I'm of the mindset that in order to more confidently disregard this "issue" based on previous evidence, it requires more than one citation. Especially since the one citation is a plant. That specific point jumps out to me as needing a more careful rebuttal.
It was not our intention to claim that the expression changes in our experiment are caused by cis-regulatory variation only. We believe that the observed expression variation has both cis- and trans-genetic components, whereas some studies tend to estimate much higher cis-variation for gene expression in Drosophila populations (e.g. [1, 2]). We mentioned the positive correlation between cis-regulatory polymorphism and expression variation to (1) highlight the genetic control of gene expression and (2) make the connection between polygenic adaptation and evolutionary parallelism of gene expression.
(3) I feel like there isn't enough exploration of tissue specificity versus network connectivity. Tissue specificity was best explained by a model in which pleiotropy had both direct and indirect effects on parallelism; while network connectivity was best explained (by a small margin) via the model which was mostly pleiotropy having a direct effect on ancestral variation, that then had a direct effect on parallelism. When the strengths of either direct/indirect effects were quantified, tissue specificity showed a stronger direct effect, while network connectivity had none (i.e. not significant). My confusion is with the last point - if network connectivity is explained by a direct effect in the best-supported model, how does this work, since the direct effect isn't significant? Perhaps I am misunderstanding something.
To clarify, for network connectivity, there is a significant “indirect” effect on parallelism (i.e. network connectivity affects ancestral gene expression and ancestral gene expression affects parallelism). Hence, in Table 2, the direct effect of network connectivity on parallelism is weak and not significant, while the indirect effect via ancestral variation is significant.
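The distinction between a direct and an indirect effect can be sketched with simulated data. This is a minimal illustration using the product-of-coefficients approach to mediation with ordinary least squares; it is not the causal analysis used in the manuscript, and the variable names, effect sizes, and the assumption of linear relationships with Gaussian noise are all hypothetical.

```python
import random

def cov(u, v):
    """Unbiased sample covariance (cov(u, u) is the variance)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

rng = random.Random(0)
n = 5000

# Hypothetical data: connectivity lowers ancestral variation, and lower
# ancestral variation raises parallelism. There is NO direct path from
# connectivity to parallelism in this simulation.
connectivity = [rng.gauss(0, 1) for _ in range(n)]
ancestral_var = [-0.8 * x + rng.gauss(0, 1) for x in connectivity]
parallelism = [-0.7 * m + rng.gauss(0, 1) for m in ancestral_var]

# Path a: connectivity -> ancestral variation (simple regression slope).
a = cov(connectivity, ancestral_var) / cov(connectivity, connectivity)

# Paths b and c': parallelism ~ ancestral variation + connectivity,
# solved from the 2x2 normal equations.
s_xx = cov(connectivity, connectivity)
s_mm = cov(ancestral_var, ancestral_var)
s_xm = cov(connectivity, ancestral_var)
s_xy = cov(connectivity, parallelism)
s_my = cov(ancestral_var, parallelism)
den = s_mm * s_xx - s_xm ** 2
b = (s_my * s_xx - s_xy * s_xm) / den       # ancestral variation -> parallelism
direct = (s_xy * s_mm - s_my * s_xm) / den  # connectivity -> parallelism (c')

indirect = a * b  # effect transmitted via ancestral variation

print(f"direct effect:   {direct:.3f}")
print(f"indirect effect: {indirect:.3f}")
```

In this sketch the direct coefficient is close to zero while the indirect effect is clearly positive, which mirrors the pattern described for network connectivity: a non-significant direct effect alongside a significant effect mediated by ancestral variation.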
Also, network connectivity might favor the most pleiotropic genes being transcription factor hubs (or master regulators for various homeostasis pathways); while the tissue specificity metric perhaps is a kind of a space/time element. I get that a gene having expression across multiple tissues does fit the definition of pleiotropy in the broad sense, but I'm wondering if some important details are getting lost - I'm just thinking about the relative importance of what tissue specificity measurements say versus the network connectivity measurement.
We examined the statistical relationship between the two measures and found a moderate positive correlation, on the basis of which we argued that the two measures may capture different aspects of pleiotropy. We appreciate the reviewer’s suggestions about the biological basis of the two estimates of pleiotropy, but we think that without further experimental insights, an extended discussion of this topic would be premature.
Reviewer #2 (Public review):
Summary:
Lai and collaborators use a previously published RNAseq dataset derived from an experimental evolution set up to compare the pleiotropic properties of genes whose expression evolved in response to fluctuating temperature for over 100 generations. The authors correlate gene pleiotropy with the degree of parallelisms in the experimental evolution set up to ask: are genes that evolved in multiple replicates more or less pleiotropic?
They find that, maybe counter to expectation, highly pleiotropic genes show more replicated evolution. Such an effect seems to be driven by direct effects (which the authors can only speculate on) and indirect effects through low variance in pleiotropic genes (which the authors indirectly link to genetic variation underlying gene expression variance).
Weaknesses:
The results offer new insights into the evolution of gene expression and into the parameters that constrain such evolution, i.e., pleiotropy. Although the conclusions are supported by the data, I find the interpretation of the results a little bit complicated.
Major comment:
The major point I ask the authors to address is whether the connection between polygenic adaptation and parallelism can indeed be used to interpret gene expression parallelism. If the answer is no, please rephrase the introduction and discussion; if the answer is yes, please make it explicit in the text why it is so.
Our answer is yes, we interpreted gene expression parallelism (high ancestral variance -> less parallelism) using the same framework that links polygenic adaptation and parallelism (high polygenicity = less trait parallelism). We believe that our response covers several of the reviewer’s concerns.
The authors' argument: parallelism in gene expression is the same as parallelism in SNP allele frequency (AFC) (see L389-383 here they don't mention that this explanation is derived from SNP parallelism and not trait parallelism, and see Figure 1 b). In previous publications, the authors have explained the low level of AFC parallelism using a polygenic argument. Polygenic traits can reach a new trait optimum via multiple SNPs and therefore although the trait is parallel across replicates, the SNPs are not necessarily so.
Importantly, our rationale is based on the idea that gene expression is rarely the direct target of selection, but rather an intermediate trait [3]. Recently, we specifically tested this assumption for gene expression and metabolite concentrations, and our analysis showed that both traits are redundant [4], as previously shown for DNA sequences [5]. The important implication for this manuscript is that gene expression is also redundant, so that adaptation can be achieved by distinct changes in gene expression in replicate populations adapting to the same selection pressure. This implies that we can use the same simulation framework for gene expression as for sequencing data. In our case, different SNP frequencies correspond to different expression levels (averaged across individuals from a population), which in turn increase fitness by modifying the selected trait. Importantly, the selected trait in our simulations is not gene expression, but an undefined high-level phenotype. A key insight from our simulations is that with increasing polygenicity the expression of a gene is more variable in the ancestral population.
In the current paper, they seem to be exchanging SNP AFC by gene expression, and to me, those are two levels that cannot be interchanged. Gene expression is a trait, not an SNP, and therefore the fact that a gene expression doesn't replicate cannot be explained by a polygenic basis, because again the trait is gene expression itself. And, actually, the results of the simulations show that high polygenicity = less trait parallelism (Figure 4).
As detailed above, because adaptation can be reached by changes in gene expression at different sets of genes, redundancy is also operating on the expression level not just on the level of SNPs. To clarify, the x-axis of Fig. 4 is the expression variation in the ancestral population.
Now, if the authors focus on high parallel genes (present in e.g. 7 or more replicates) and they show that the eQTLs for those genes are many (highly polygenic) and the AFC of those eQTLs are not parallel, then I would agree with the interpretation. But, given that here they just assess gene expression and not eQTL AFC, I do not think they can use the 'highly polygenic = low parallelism' explanation.
The interpretation of the results to me, should be limited to: genes with low variance and high pleiotropy tend to be more parallel, and the explanation might be synergistic pleiotropy.
While we understand the desire to model the full hierarchy from eQTLs to gene expression and adaptive traits, we caution that this would be a very challenging task. eQTLs very often underestimate the contribution of trans-acting factors; hence, an understanding of gene expression evolution based on eQTLs is very likely incomplete and cannot explain the redundancy of gene expression during adaptation. We therefore think that the focus on redundant gene expression is conceptually simpler and allows us to address the question of pleiotropy without incorporating allele frequency changes.
Reviewer #3 (Public review):
The authors aim to understand how gene pleiotropy affects parallel evolutionary changes among independent replicates of adaptation to a new hot environment of a set of experimental lines of Drosophila simulans using experimental evolution. The flies were RNA-sequenced after more than 100 generations of lab adaptation and the changes in average gene expression were obtained relative to ancestral expression levels from reconstructed ancestral lines. Parallelism of gene expression change among lines is evaluated as variance in differential gene expression among lines relative to error variance. Similarly, the authors ask how the standing variation in gene expression estimated from a handful of flies from a reconstructed outbred line affects parallelism. The main findings are that parallelism in gene expression responses is positively associated with pleiotropy and negatively associated with expression variation. Those results are in contradiction with theoretical predictions and empirical findings. To explain those seemingly contradictory results the authors invoke the role of synergistic pleiotropy and correlated selection, although they do not attempt to measure either.
Strengths:
(1) The study uses highly replicated outbred laboratory lines of Drosophila simulans evolved in the lab under a constant hot regime for over 100 generations. This allows for robust comparisons of evolutionary responses among lines.
(2) The manuscript is well written and the hypotheses are clearly delineated at the onset.
(3) The authors have run a causal analysis to understand the causal dependencies between pleiotropy and expression variation on parallelism.
(4) The use of whole-body RNA extraction to study gene expression variation is well justified.
Weaknesses:
(1) It is unclear how well phenotypic variation in gene expression of the evolved lines has been estimated by the sample of 20 males from a reconstructed outbred line not directly linked to the evolved lines under study. I see this as a general weakness of the experimental design.
Our intention was not to measure the phenotypic variance of the evolved lines, but rather to estimate the phenotypic variance at the beginning of the experiment. Hence, we measured and investigated the variation of gene expression in the ancestral population, since this was the starting point of the replicated experimental evolution. Furthermore, since the ancestral population represents the natural population in Florida, the gene expression variation reflects the history of selection acting on it.
(2) There are no estimates of standing genetic variation of expression levels of the genes under study, only phenotypic variation. I wished the authors had been clear about that limitation and had discussed the consequences of the analysis. This also constitutes a weakness of the study.
The reviewer is correct that we do not aim to estimate the standing genetic variation, which is responsible for differences in gene expression. While we agree that it could be an interesting research question to use eQTL mapping to identify the genetic basis of gene expression, we caution that trans-effects are difficult to estimate, and therefore an important component of gene expression evolution would be hard to capture. Hence, we consider that our focus on variation in gene expression without explicit information about the genetic basis is simpler and sufficient to address the question about the role of pleiotropy.
(3) Moreover, since the phenotype studied is gene expression, its genetic basis extends beyond expressed sequences. The phenotypic variation of a gene's expression may thus likely misrepresent the genetic variation available for its evolution. The genetic variation of gene expression phenotypes could be estimated from a cross or pedigree information but since individuals were pool-sequenced (by batches of 50 males), this type of analysis is not possible in this study.
We agree with the reviewer that gene expression variation may also have a non-genetic basis; we discuss this in depth in the discussion of the manuscript.
(4) The authors have not attempted to estimate synergistic pleiotropy among genes, nor how selection acts on gene expression modules. It makes any conclusion regarding the role of synergistic pleiotropy highly speculative.
We mentioned synergistic pleiotropy as a possible explanation for our results. A positive correlation between the fitness effects of gene expression variation would predict more replicable evolutionary changes. A similar argument has been made by [6].
I don't understand the reason why the analysis would be restricted to significantly differentially expressed genes only. It is then unclear whether pleiotropy, parallelism, and expression variation do play a role in adaptation because the two groups of adaptive and non-adaptive genes have not been compared. I recommend performing those comparisons to help us better understand how "adaptive" genes differentially contribute to adaptation relative to "non-adaptive" genes relative to their difference in population and genetic properties.
We agree with the reviewer that the comparison between the pleiotropy of adaptive and non-adaptive genes is interesting. We had performed the analysis but omitted it from the original manuscript for simplicity. Similar to the results in [6], non-adaptive genes are more pleiotropic than the adaptive genes. For adaptive genes we find a positive correlation between the level of pleiotropy and evolutionary parallelism. Thus, high pleiotropy limits the evolvability of a gene, but moderate and potentially synergistic pleiotropy increases the repeatability of adaptive evolution. We included this result in the revised manuscript and discuss it.
There is a lack of theoretical groundings on the role of so-called synergistic pleiotropy for parallel genetic evolution. The Discussion does not address this particular prediction. It could be removed from the Introduction.
We respectfully disagree with the reviewer: synergistic pleiotropy is covered by theory, and empirical results also support its importance.
References
(1) Genissel A, McIntyre LM, Wayne ML, Nuzhdin SV. Cis and trans regulatory effects contribute to natural variation in transcriptome of Drosophila melanogaster. Molecular biology and evolution. 2008;25(1):101-10. Epub 2007/11/12. doi: 10.1093/molbev/msm247. PubMed PMID: 17998255.
(2) Osada N, Miyagi R, Takahashi A. Cis- and Trans-regulatory Effects on Gene Expression in a Natural Population of Drosophila melanogaster. Genetics. 2017;206(4):2139-48. Epub 2017/06/14. doi: 10.1534/genetics.117.201459. PubMed PMID: 28615283; PubMed Central PMCID: PMC5560811.
(3) Barghi N, Hermisson J, Schlötterer C. Polygenic adaptation: a unifying framework to understand positive selection. Nature reviews Genetics. 2020;21(12):769-81. Epub 2020/07/01. doi: 10.1038/s41576-020-0250-z. PubMed PMID: 32601318.
(4) Lai WY, Otte KA, Schlötterer C. Evolution of Metabolome and Transcriptome Supports a Hierarchical Organization of Adaptive Traits. Genome biology and evolution. 2023;15(6). Epub 2023/05/26. doi: 10.1093/gbe/evad098. PubMed PMID: 37232360; PubMed Central PMCID: PMC10246829.
(5) Barghi N, Tobler R, Nolte V, Jaksic AM, Mallard F, Otte KA, et al. Genetic redundancy fuels polygenic adaptation in Drosophila. PLoS biology. 2019;17(2):e3000128. Epub 2019/02/05. doi: 10.1371/journal.pbio.3000128. PubMed PMID: 30716062.
(6) Rennison DJ, Peichel CL. Pleiotropy facilitates parallel adaptation in sticklebacks. Molecular ecology. 2022;31(5):1476-86. Epub 2022/01/09. doi: 10.1111/mec.16335. PubMed PMID: 34997980; PubMed Central PMCID: PMC9306781.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer 1:
Point 1 of public reviews and point 2 of recommendations to authors.
Temporal ambiguity in credit assignment: While the current design provides clear task conditions, future studies could explore more ambiguous scenarios to further reflect real-world complexity…. The role of ambiguity is very important for the credit assignment process. However, in the current task design, the instruction of the task design almost eliminates the ambiguity of which the trial's choice should be assigned credit to. The authors claim the real-world complexity of credit assignment in this task design. However, the real-world complexity of this type of temporal credit assignment involves this type of temporal ambiguity of responsibility as causal events. I am curious about the consequence of increasing the complexity of the credit assignment process, which is closer to the complexity in the real world.
We agree that the structure of causal relationships can be more ambiguous in real-world contexts. However, we also believe that there are multiple ways in which a task might approach “real-world complexity”. One way is by increasing the ambiguity in the relationships between choices and outcomes (as done by Jocham et al., 2016). Another is by adding interim decisions that must be completed between making a first choice and viewing its outcome, which mimics task structures such as the cooking tasks described in the introduction. In such tasks, the temporal structure of the actions may be irrelevant, but the relationship between choice identities and outcomes is critical to performing the task effectively (e.g., it doesn’t matter whether I add spice before or after the salt; all I need to know is that adding spice will result in spicy soup). While ambiguity about either form of causal relation is clearly an important part of real-world complexity, and would make credit assignment harder, our study focuses on how links between outcomes and specific past choice identities are created at the neural level when they are known to be causal.
We consequently felt it necessary to resolve temporal ambiguity for participants. Instructing participants on the structure of the task allowed us to make assumptions about how credit assignment for choice identities should proceed (assign credit to the choice made N trials back) and allowed us to make positive predictions about the content of representations in OFC when viewing an outcome. This gave the highest power to detect multivariate information about the causal choice and the highest interpretability of such findings.
In contrast, if we had not resolved this ambiguity, it would be difficult to tell if incorrect decoding from the classifier resulted from noise in the neural signal, or if on that trial participants were assigning credit to non-causal choices that they erroneously believed to have caused the outcome due to the perceived temporal structure. We believe this would have ultimately decreased our power to determine whether representations of the causal choice were present at the time of outcome because we would have to make assumptions about what counts as a “true” causal representation.
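The rule "assign credit to the choice made N trials back" can be made concrete with a short sketch. This is an illustration only, not the analysis or model used in the study: the delta-rule value update, the learning rate, and the function name `assign_credit` are assumptions introduced here.

```python
def assign_credit(choices, outcomes, n_back, learning_rate=0.5):
    """Update choice values, crediting each outcome to the choice made
    `n_back` trials earlier (n_back=0: same trial; n_back=1: previous trial).

    Assumption: a simple delta rule, value += lr * (outcome - value).
    """
    values = {}
    for trial, outcome in enumerate(outcomes):
        causal_trial = trial - n_back
        if causal_trial < 0:
            continue  # no causal choice exists yet for this outcome
        choice = choices[causal_trial]
        old = values.get(choice, 0.0)
        values[choice] = old + learning_rate * (outcome - old)
    return values

choices = ["A", "B", "A", "B"]
outcomes = [0, 1, 0, 1]

# Same data, different known causal structure, opposite conclusions:
print(assign_credit(choices, outcomes, n_back=0))  # {'A': 0.0, 'B': 0.75}
print(assign_credit(choices, outcomes, n_back=1))  # {'B': 0.0, 'A': 0.75}
```

The same sequence of choices and outcomes leads to opposite value estimates depending on the assumed causal structure, which is why knowing that structure matters when interpreting what a decoded representation at outcome time refers to.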
We have commented on this in the discussions (p.13):
“While our study was designed to focus on the complexity of assigning credit in tasks with different known causal structures, another important component of real-world credit assignment is temporal ambiguity. To isolate the mechanisms which create associations between specific choices and specific outcomes, we instructed participants on the causal structure of each task, removing temporal ambiguity about the causal choice. However, our results are largely congruent with previously reported results in tasks that dissolved the typical experimental trial structure, producing temporal ambiguity, and which observed more pronounced spreading of effect, in addition to appropriate credit assignment (Jocham et al, 2016). Namely, this study found that activation in the lOFC increased only when participants received rewards contingent on a previous action, an effect that was more pronounced in subjects whose behavior reflected more accurate credit assignment. This suggests a shared lOFC mechanism for credit assignment in different types of complex environments. Whether these mechanisms extend to situations where the temporal causal structure is completely unknown remains an important question.”
Point 2 of public reviews and point 1 of recommendations to authors
Role of task structure understanding: The difference in task comprehension between human subjects in this study and animal subjects in previous studies offers an interesting point of comparison…. The credit assignment involves the resolution of the ambiguity in which the causal responsibility of an outcome event is assigned to one of the preceding events. In the original study of Walton and his colleagues, the monkey subjects could not be instructed on the task structure defining the causal relationships of the events. Then, the authors of the original study observed the spreading of the credit assignments to the "irrelevant" events, which did not occur in the same trial of the outcome event but to the events (choices) in neighbouring trials. This aberrant pattern of the credit assignment can be due to the malfunctions of the credit assignment per se or the general confusion of the task structure on the part of the monkey subjects. In the current study design, the subjects are humans and they are not confused about the task structure. Consistently, it is well known that human subjects rarely show the same patterns of the "spreading of credit assignment". So the implicit mechanism of the credit assignment process involves the understanding of the task structure. In the current study, there are clearly demarked task conditions that almost resolve the ambiguity inherent in the credit assignment process. Yet, the focus of the current analysis stops short of elucidating the role of understanding the task structure. It would be great if the authors could comment on the general difference in the process between the conditions, whether it is behavioral or neural.
We would like to thank the reviewer for making this important point. We believe that understanding the structure of the credit-assignment problem above is quite important, at least for the type of credit assignment described here. That is, because participants know that the outcome viewed is caused by the choice they made, 0 or 1 trials into the past, they can flexibly link choice identities to the newly observed outcomes as the probabilities change. Note, however, that this is already very challenging in the 1-back condition because participants need to track the two independently changing probabilities. We believe this is critical to address the questions we aimed to answer with this experiment, as described above.
We agree that this might be quite different from previous studies done with non-human primates, which also included many more training trials and lesions to the lOFC. Both of these aspects could manifest as differences in task performance and processing at behavioural and neural levels, respectively. Consistent with this possibility, in our task, we found no differences in credit spreading between conditions, suggesting that humans were quite precise in both, despite causal relationships being harder to track in the “indirect transition condition”. This lack of credit spreading could be because humans understood the task structure better than macaques did, or it could be due to differences in the functioning of the OFC and other regions. Because all participants were trained to understand, and were cued with explicit knowledge of, the task structure, it is difficult to isolate its role, as we would need another condition in which they were not instructed about the task structure. This would also be an interesting study, and we leave it to future research to parse the contributions of task-structure ambiguity to credit assignment.
Point 3 of public reviews.
The authors used a sophisticated method of multivariate pattern analysis to find the neural correlate of the pending representation of the previous choice, which will be used for the credit assignment process in the later trials. The authors tend to use expressions that these representations are maintained throughout this intervening period. However, the analysis period is specifically at the feedback period, which is irrelevant to the credit assignment of the immediately preceding choice. This task period can interfere with the ongoing credit assignment process. Thus, rather than the passive process of maintaining the information of the previous choice, the activity of this specific period can mean the active process of protecting the information from interfering and irrelevant information. It would be great if the authors could comment on this important interpretational issue.
We agree that lFPC is likely actively protecting the pending choice representation from interference with the most recent choice for future credit assignment. This interpretation is largely congruent with the idea of “prospective memory” (e.g., Burgess, Gonen-Yaacovi, Volle, 2011), in which the lFPC can be thought of as protecting information that will be needed in the future but is not currently needed for ongoing behavior. That said, from our study alone it is difficult to make claims about whether the information maintained in frontal pole is actively protecting this information because of potentially interfering processes. Our “indirect transition condition” only contains trials where there is incoming, potentially interfering information about new outcomes, but no trials that might avoid interference (e.g., an interim choice made but there is nothing to be learned from it). We comment on this important future direction on page 14:
“One interpretation of these results is that the lFPC actively protects information about causal choices when potentially interfering information must be processed. Future studies will be needed to determine if the lFPC’s contributions are specific to these instances of potential interference, and whether this is a passive or active process.”
Point 3 of recommendation to authors
A slightly minor, but still important issue is the interpretation of the role of lOFC. The authors compared the observed patterns of the credit assignment to the ideal patterns of credit assignment. Then, the similarity between these two matrices is used to find the associated brain region. In the assumption that lOFC is involved in the optimal credit assignment, the result seems reasonable. But as mentioned above, the current design involves the heavy role of understanding the task structure, it is debatable whether the lOFC is just involved in the credit assignment process or a more general role of representing the task structure.
We agree that this is an important distinction to make, and it is very likely that multiple regions of the OFC carry information about the task structure, and the extent to which participants understood this structure may be reflected in behavioral estimates of credit assignment or the overall patterns of the matrices (though all participants verbalized the correct structure prior to the task). However, we believe that in our task the lOFC is specifically involved in credit-assignment because of the content of the information we decoded. We demonstrated that the lOFC and HPC carry information about the causal choice during the outcome. These results cannot be explained by differences in understanding of the task structure because that understanding would have been consistent across trials where participants choose either shape identity. Thus, a classifier could not use this to separate these types of trials and would reflect chance decoding.
One interpretation of the lOFC’s role in credit assignment is that it is particularly important when a model of the task structure has to be used to assign credit appropriately. Here, we show that the lOFC reinstates specific causal representations precisely at the time credit needs to be assigned, and that these representations are appropriate to participants’ knowledge of the task structure. These representations may exist alongside representations of the task structure, in the lOFC and other regions of the brain (Park et al., 2020; Boorman et al., 2021; Seo and Lee, 2010; Schuck et al., 2016). We have added the following sentences to clarify our perspective on this point in the discussion (p. 13):
“Our results from the “indirect transition” condition show that these patterns are not merely representations of the most recent choice but are representations of the causal choice given the current task structure, and may exist alongside representations of the task structure, in the lOFC and elsewhere (Boorman et al., 2021; Park et al., 2020; Schuck et al., 2016; Seo & Lee, 2010).”
Point 4 of public reviews and point 4 of recommendation to authors
Broader neural involvement: While the focus on specific regions of interest (ROIs) provided clear results, future studies could benefit from a whole-brain analysis approach to provide a more comprehensive understanding of the neural networks involved in credit assignment… Also, given the ROI constraint of the analysis, the other neural structure may be involved in representing the task structure but not detected in the current analysis
Given our strong a priori hypotheses about regions of interest (ROIs) in this study, we focused on these specific areas. This choice was based on theoretical and empirical grounds that guided our investigation. However, we thank the reviewer for pointing this out and agree that there could be other unexplored areas that are critical to credit-assignment which we did not examine.
We conducted the same searchlight decoding procedure on a whole brain map and corrected for multiple comparisons using TFCE. We found no significant regions of the brain in the “direct transition condition” but did find other significant regions in our information connectivity analysis of the “indirect transition condition”. In addition to replicating the effects in lOFC and HPC, we also found a region of mOFC which showed a strong correlation with pending choice in lFPC. It’s difficult to say whether this region is involved in credit assignment per se, because we did not see this region in the “direct transition condition” and so we cannot say that it is consistently related to this process. However, the mOFC is thought to be critical to representing the current task state (Schuck et al., 2016), and the task structure (Park et al., 2020). In our task, it could be a critical region for communicating how to assign credit given the more complex task structure of the “indirect transition condition” but more evidence would be needed to support this interpretation.
For now, we have added the results of this whole brain analysis to a new supplementary figure S7 (page 41), and all unthresholded maps have been deposited in a Neurovault repository, which is linked in the paper, for interested readers to assess.
Minor points:
There are some missing and confusing details in the Figure reference in the main text. For example, references to Figure 3 are almost missing in the section "Pending item representations in FPl during indirect transitions predict credit assignment in lOFC". For readability, the authors should improve this point in this section and other sections.
Thank you to the reviewer for pointing this out. We have now added references to Figure 3 on page 8:
“Our analysis revealed a cluster of voxels specifically within the right lFPC ([x,y,z] = [28, 54, 8], t(19) = 3.74, pTFCE <0.05 ROI-corrected; left hemisphere all pTFCE > 0.1, Fig. 3A)”
And on page 10:
“Specifically, we found significant correlations in decoding distance between lFPC and bilateral lOFC ([x,y,z] = [-32, 24, -22], t(19) = 3.81; [x,y,z] = [20, 38, -14], t(19) = 3.87; pTFCE <0.05 ROI-corrected) and bilateral HC ([x,y,z] = [-28, -10, -24], t(19) = 3.41; [x,y,z] = [22, -10, -24], t(19) = 4.21; pTFCE <0.05 ROI-corrected; Fig. 3C).”
Task instructions for the two conditions (direct and indirect) play important roles in the study. If possible, please include the following parts in the figures and descriptions in the introduction and/or results sections.
We have now included a short description of the condition instructions beginning on page 5:
“Participants were instructed about which condition they were in with a screen displaying “Your latest choice” in the direct transition condition, and “Your previous choice” in the indirect condition.”
And have modified Figure 1 to include the instructions in the title of each condition. We thought this to be the most parsimonious solution so that the choice options in the examples were not occluded.
The subject sample size might be slightly too small in the current standards. Please give some justifications.
We originally selected the sample size for this study to be commensurate with previous studies that looked for similar behavioral and neural effects (see Boorman et al., 2016; Howard et al., 2015; Jocham et al., 2016). This has been mentioned in the “methods” section on page 24.
However, to be thorough, we performed a power analysis of this sample size using simulations based on an independently collected, unpublished data set. In this data set, 28 participants completed an associative learning task similar to the task in the current manuscript. We trained a classifier to decode the causal choice option at the time of feedback, using the same searchlight and cross-validation procedures described in the current manuscript, for the same lateral OFC ROI. We calculated power for various sample sizes by drawing N participants with replacement 1000 times, for values of N ranging from 15 to 25. After sampling the participants, we tested for significant decoding of the causal choice within the subset of data, using small-volume TFCE correction to correct for multiple comparisons. Finally, we calculated the proportion of these samples that were significant at a level of pTFCE <.05.
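The resampling logic of this power analysis can be illustrated with a minimal sketch. It is not the pipeline used in the study: the searchlight decoding and small-volume TFCE correction are replaced here by a one-sided, one-sample t-test on a single decoding accuracy per participant, and the pilot values, function names, and chance level are hypothetical placeholders.

```python
import random
import statistics

# One-sided critical t values at alpha = 0.05, keyed by degrees of freedom (N - 1).
T_CRIT_05 = {14: 1.761, 15: 1.753, 16: 1.746, 17: 1.740, 18: 1.734, 19: 1.729,
             20: 1.725, 21: 1.721, 22: 1.717, 23: 1.714, 24: 1.711}


def is_significant(accuracies, chance=0.5):
    """One-sided, one-sample t-test of decoding accuracy against chance.

    Assumption: this simple test stands in for the TFCE-corrected test.
    """
    n = len(accuracies)
    mean = statistics.mean(accuracies)
    sd = statistics.stdev(accuracies)
    if sd == 0:  # degenerate resample: every drawn value is identical
        return mean > chance
    t_stat = (mean - chance) / (sd / n ** 0.5)
    return t_stat > T_CRIT_05[n - 1]


def bootstrap_power(pilot, n, n_resamples=1000, seed=0):
    """Proportion of resamples of size n (drawn with replacement) that are significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        resample = rng.choices(pilot, k=n)  # sample participants with replacement
        hits += is_significant(resample)
    return hits / n_resamples


# Hypothetical pilot data: one decoding accuracy for each of 28 participants.
_rng = random.Random(1)
pilot_accuracies = [_rng.gauss(0.53, 0.05) for _ in range(28)]

# Estimated power for each candidate sample size from 15 to 25.
power_by_n = {n: bootstrap_power(pilot_accuracies, n) for n in range(15, 26)}
```

Reading `power_by_n[20]` gives the estimated power for a sample of 20 under these assumed pilot values; the number it returns says nothing about the 84.2% reported for the real data.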
The results of this procedure show that an N of 20 would result in 84.2% power, which is slightly above the typically acceptable level of 80%. We have added the following sentences to the methods section on page 25:
“Using an independent, unpublished data set, we conducted a power analysis for the desired neural effect in lOFC. We found that this number of participants had 84% power to detect this effect (Fig. S8).”
We also added the following figure to the supplemental figures page (42):
Reviewer 2:
I have several concerns regarding the causality analyses in this study. While multivariate analyses of information connectivity between regions are interesting and appear rigorous, they make some assumptions about the nature of the input data. It is unclear if fMRI with its poor temporal resolution (in addition to possible region-specific heterogeneity in the readouts), can be coupled with these causal analysis methods to meaningfully study dynamics on a decision task where temporal dynamics is a core component (i.e., delay). It would be helpful to include more information/justification on the methods for inferring relationships across regions from fMRI data. Along this line, discussing the reported findings in light of these limitations would be essential.
We agree that fMRI is limited for capturing fast neural dynamics, and that it can be difficult to separate events that occur within a few seconds. However, we designed the information connectivity analysis to maximally separate the events in question – the representations of the causal choice being held in a pending state, and the representation of the causal choice during credit assignment. These events were separated by at least 10 seconds and by 15 seconds on average, which is commensurate with recommended intervals for disentangling information in such analysis (Mumford et al., 2012, 2014, also see van Loon et al., 2018, eLife; as example of fluctuations in decodability over time). This feature of our task design may not have been clear because information connectivity analyses are typically performed in the same task period. We clarify this point on page 32:
“Note that the decoding fidelity metric at each time point represents the decodability of the same choice at different phases of the task. These phases were separated by at least 10 seconds and 15 seconds on average, which can be sufficient for disentangling unique activity (Mumford et al., 2012, 2014).”
However, we agree with the reviewer that the limitations of fMRI make it difficult to precisely determine how the roles of the OFC and lFPC might change over time, and whether other regions may contribute to information transfer at timescales which cannot be detected by fMRI. Further, we do not wish to imply causality between lFPC and lOFC (something we believe we do not claim in the paper), only that information strength in lFPC predicts subsequent strength of the same information in the OFC and HC. We have clarified this limitation on page 14:
“Although we show evidence that lFPC is involved in maintaining specific content about causal choices during interim choices, the limited temporal resolution of fMRI makes it difficult to tell if other regions may be supporting the learning processes at timescales not detectable in the BOLD response. Thus, it is possible that the network of regions supporting credit assignment in complex tasks may be much larger. Our results provide a critical first step in discerning the nature of interactions between cognitive subsystems that make different contributions to the learning process in these complex tasks.”
Reviewer 3:
Point 1 of public reviews:
They do find (not surprisingly) that the one-back task is harder. It would be good to ensure that the reason that they had more trouble detecting direct HC & lOFC effects on the harder task was not because the task is harder and thus that there are more learning failures on the harder one-back task. (I suspect their explanation that it is mediated by FPl is likely to be correct. But it would be nice to do some subsampling of the zero-back task [matched to the success rate of the one-back task] to ensure that they still see the direct HC and lOFC there).
We would like to thank the reviewer for this comment and agree that the “indirect transition condition” is more difficult than the direct transition condition. However, in this task it is difficult to have an explicit measure of learning failures per se because the “correctness” of a choice is to some extent subjective (i.e., based on the gift card preference and the computational model). We could infer when learning failures occur through the computational model by looking at trials in which participants made choices that the model would consider improbable, (i.e., non-reward maximizing) while accounting for outcome preference. However, there are also a myriad of other possible explanations for these choices, such as exploratory/confirmatory strategies, lapses in attention etc. Thus, we could not guarantee that the two conditions would be uniquely matched in difficulty with specific regard to learning even if we subsampled these trials. We feel it would be better left to future experiments which can specifically compare learning failures to tackle this issue. We have now addressed this point when discussing the model on page 31:
“Note that learning failures are not trivial to identify in our paradigm and model, because every choice is based on a participant’s preference between gift card outcomes, and the ability of the computational model to accurately estimate participants’ beliefs in the stimulus-outcome transition probabilities.”
Point 2 of public reviews:
The evidence that they present in the main text (Figure 3) that the HC and lOFC are mediated by FPl is a correlation. I found the evidence presented in Supplemental Figure 7 to be much more convincing. As I understand it, what they are showing in SF7 is that when FPl decodes the cue, then (and only then) HC and lOFC decode the cue. If my understanding is correct, then this is a much cleaner explanation for what is going on than the secondary correlation analysis. If my understanding here is incorrect, then they should provide a better explanation of what is going on so as to not confuse the reader.
SF7 (now Figures 3C and 3D) does show that positive decoding in the HC and lOFC is more likely to occur when there is positive decoding in lFPC. However, the analyses shown in these figures are only meant to be control analyses to further characterise what is being captured, but not necessarily implied, by the information connectivity analysis. For example, in principle the classifier might never correctly decode a choice label in the lOFC or HC while still getting closer to the hyperplane when the lFPC patterns are correctly decoded. This would lead to a positive correlation, but a difficult-to-interpret result since patterns in lOFC and HPC are incorrect. Figure SF7A (now Fig. 3C) shows that this is not the case. Lateral OFC and HC have higher than chance positive decoding when lFPC has positive decoding. Figure SF7B (now Fig. 3D) shows that we can decode that information even if a new hyperplane is constructed. However, both cases have less information about the relationship between these regions because they do not include the trials where lOFC/HC and lFPC classifiers were incorrect at the same time. The correlation in Figure 3B includes these failures, giving a more holistic picture of the data. We therefore try to concisely clarify this point on page 10:
“These signed distances allow us to relate both success in decoding information, as well as failures, between regions.”
And here on page 10:
“Subsequent analyses confirmed that this effect was due to these regions showing a significant increase in positive (correct) decoding in trials where pending information could be positively (correctly) decoded in lFPC, and not simply due to a reduction in incorrect information fidelity (see Fig. 3C & 3D).”
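The role of signed distances in the information connectivity analysis can be illustrated with a minimal sketch, assuming a linear classifier with known weights and bias. The function names and the per-trial values below are hypothetical and are not taken from the study; the sketch only shows how a distance to the hyperplane becomes negative for incorrectly classified trials, so that a trial-by-trial correlation between regions reflects shared failures as well as shared successes.

```python
import math


def signed_distance(weights, bias, pattern, label):
    """Distance of a pattern from a linear hyperplane, signed by correctness.

    label is +1 or -1. The result is positive when the pattern lies on the
    side of the hyperplane matching its true label, and negative otherwise.
    """
    activation = sum(w * x for w, x in zip(weights, pattern)) + bias
    norm = math.sqrt(sum(w * w for w in weights))
    return label * activation / norm


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-trial signed distances in two regions (same trials, same order).
lfpc_distances = [0.8, -0.3, 1.2, 0.1, -0.6]
lofc_distances = [0.5, -0.2, 0.9, 0.3, -0.4]

# Trial-by-trial correlation of decoding fidelity between the two regions.
connectivity = pearson_r(lfpc_distances, lofc_distances)
```

Restricting the correlation to trials with positive distances in both regions would discard the jointly incorrect trials, which is the information the control analyses in Fig. 3C and 3D leave out.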
And have integrated these figures on page 9:
Point 3 of public reviews:
I like the idea of "credit spreading" across trials (Figure 1E). I think that credit spreading in each direction (into the past [lower left] and into the future [upper right]) is not equivalent. This can be seen in Figure 1D, where the two tasks show credit spreading differently. I think a lot more could be studied here. Does credit spreading in each of these directions decode in interesting ways in different places in the brain?
We agree that this is an interesting question because each component of the off diagonal (upper and lower triangles) may reflect qualitatively different processes of credit spreading. However, we believe this analysis is difficult to carry out with the current dataset for two reasons. First, we designed this study to ask specifically about the information represented in key credit assignment regions during precise credit assignment, meaning we did not optimize the task to induce credit spreading at any point. Indeed, our efforts to train participants on the task were to ensure they would correctly assign credit as much as possible. Figure 1F shows that the regression coefficients representing credit spreading in each condition are near zero (in the negative direction), with few individual differences compared to the credit assignment coefficients. Thus, any analysis aiming to test for credit spreading would unfortunately be poorly powered. Studies such as Jocham et al. (2016), with more variability in causal structures, or studies with ambiguity about the causal structure by dissolving the typical trial structure would be better suited to address this interesting question. The second reason why such an analysis would be challenging is that due to our design, it is difficult to intuitively determine what kind of information should be coded by neural regions when credit spreads to the upper diagonal, since these cells reflect current outcomes that are being linked to future choices.
Replace all the FPl with LFPC (lateral frontal polar cortex)
We have now replaced “FPl” with “LFPC” throughout the text and figures.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Comment of Review of Revised Version:
Although the authors have partly corrected the manuscript by removing the mislabeling in their Co-IP experiments, my primary concern on the actual functional connotations and direct interaction between PA28γ and C1QBP still remains unaddressed. As already mentioned in my previous review, since the core idea of the work is PA28γ's direct interaction with C1QBP, stabilizing it, the same should be demonstrated in a more convincing manner.
My other observation on the detection of C1QBP as a doublet has been addressed by usage of anti-C1QBP Monoclonal antibody against the polyclonal one used before. C1QBP doublets have not been observed in the present case.
The authors have also worked on the presentation of the background by suitably modifying the statements and incorporating appropriate citations.
However, the authors are requested to follow the recommendations provided to them by the reviewers to address the major concerns.
Thank you very much for your comments. We appreciate your concerns regarding the need for more direct evidence to support the stabilizing interaction between PA28γ and C1QBP. In response to your feedback, we have taken additional steps to provide more convincing evidence of this interaction.
To complement our existing pull-down and Co-IP experiments, we utilized AlphaFold 3 to predict the three-dimensional structure of the PA28γ-C1QBP complex. The predicted model reveals specific residues and interfaces that are likely involved in the direct interaction between PA28γ and C1QBP. Our analysis indicates that this interaction may depend on amino acids 1-167 and 1-213 of C1QBP (Revised Appendix Figure 1E-H). Furthermore, the aspartate at position 177 (Asp177) of PA28γ was predicted to interact with threonine 76 (Thr76) and glycine 78 (Gly78) of C1QBP (Revised Appendix Figure 1I). This structural insight was further validated by our immunoprecipitation experiments (Revised Figure 1J). These findings provide a molecular basis for the observed stabilizing effect and suggest potential mechanisms by which PA28γ influences C1QBP stability. Specifically, the identified interaction sites offer clues into how PA28γ may stabilize C1QBP at the molecular level.
Furthermore, we performed proximity ligation assays (PLA) to detect in situ interactions between PA28γ and C1QBP at the single-cell level. PLA results clearly demonstrate the presence of PA28γ-C1QBP complexes within cells, providing direct evidence of their physical interaction (Revised Figure 1D). This approach overcomes some of the limitations associated with traditional IP experiments and confirms the direct nature of the interaction.
In summary, the integration of AlphaFold 3 predictions, PLA data, and our previous Pull-down and Co-IP experiments provides robust and direct evidence for a stable interaction between PA28γ and C1QBP. We believe that these additional findings significantly reinforce our conclusions and effectively address the concerns raised by the reviewers. Once again, thank you for your valuable feedback, which has been instrumental in refining and enhancing our study.
Reviewer #2 (Public review):
Comment of Review of Revised Version:
Weaknesses:
Many data sets are shown in figures that cannot be understood without more descriptions either in the text or the legend, e.g., Fig. 1A. Similarly, many abbreviations are not defined.
The revision addressed these issues.
Some of the pull-down and coimmunoprecipitation data do not support the conclusion about the PA28g-C1QBP interaction. For example, in Appendix Fig. 1B the Flag-C1QBP was detected in the Myc beads pull-down when the protein was expressed in the 293T cells without the Myc-PA28g, suggesting that the pull-down was not due to the interaction of the C1QBP and PA28g proteins. In Appendix Fig. 1C, assume the SFB stands for a biotin tag, then the SFB-PA28g should be detected in the cells expressing this protein after pull-down by streptavidin; however, it was not. The Western blot data in Fig. 1E and many other figures must be quantified before any conclusions about the levels of proteins can be drawn.
The revision addressed these problems.
The immunoprecipitation method is flawed as it is described. The antigen (PA28g or C1QBP) should bind to the respective antibody that in turn should bind to Protein G beads. The resulting immunocomplex should end up in the pellet fraction after centrifugation, and be analyzed further by Western blot for coprecipitates. However, the method in the Appendix states that the supernatant was used for the Western blot.
The revision corrected this method.
To conclude that PA28g stabilizes C1QBP through their physical interaction in the cells, one must show whether a protease inhibitor can substitute PA28g and prevent C1QBP degradation, and also show whether a mutation that disrupts the PA28g-C1QBP interaction can reduce the stability of C1QBP. In Fig. 1F, all cells expressed Myc-PA28g. Therefore, the conclusion that PA28g prevented C1QBP degradation cannot be reached. Instead, since more Myc-PA28g was detected in the cells expressing Flag-C1QBP compared to the cells not expressing this protein, a conclusion would be that the C1QBP stabilized the PA28g. Fig. 1G is a quantification of a Western blot data that should be shown.
The binding site for PA28g in C1QBP was mapped to the N-terminal 167 residues using truncated proteins. One caveat would be that some truncated proteins did not fold correctly in the absence of the sequence that was removed. Thus, the C-terminal region of the C1QBP with residues 168-283 may still bind to the PA28g in the context of full-length protein. In Fig. 1I, more Flag-C1QBP 1-167 was pulled down by Myc-PA28g than the full-length protein or the Flag-C1QBP 1-213. Why?
The interaction site in PA28g for C1QBP was not mapped, which prevents further analysis of the interaction. Also, if the interaction domain can be determined, structural modeling of the complex would be feasible using AlphaFold2 or other programs. Then, it is possible to test point mutations that may disrupt the interaction and if so, the functional effect.
The revision added AlphaFold models for the protein interaction. However, the models were not analyzed and potential mutations that would disrupt the interaction were not predicted, made and tested. The revision did not address the request for the protease inhibitor.
Thank you for your insightful comments regarding the binding site of PA28γ in C1QBP. We appreciate your concern about the potential misfolding of truncated proteins and the possible interaction between the C-terminal region (residues 168-283) of C1QBP and PA28γ in the context of full-length protein.
To address these concerns, we have conducted additional analyses and experiments to provide a more comprehensive understanding of the interaction between PA28γ and C1QBP. Using AlphaFold 3, we predicted the three-dimensional structure of the PA28γ-C1QBP complex. The model reveals specific residues and interfaces that are likely involved in the direct interaction between PA28γ and C1QBP. Notably, our structural analysis indicates that the interaction may primarily depend on amino acids 1-167 and 1-213 of C1QBP (Revised Appendix Figure 1E-H). Furthermore, the aspartate at position 177 (Asp177) of PA28γ was predicted to interact with threonine 76 (Thr76) and glycine 78 (Gly78) of C1QBP (Revised Appendix Figure 1I). This prediction supports the idea that the N-terminal region of C1QBP is crucial for its interaction with PA28γ. Regarding the observation in the former Figure 1I (Revised Figure 1J), where more Flag-C1QBP 1-167 was pulled down by Myc-PA28γ compared to the full-length protein or Flag-C1QBP 1-213, we believe this can be explained by several factors:
A. The truncation of C1QBP to residues 1-167 may expose key interaction sites that are partially obscured in the full-length protein. This enhanced accessibility could lead to stronger binding affinity and higher pull-down efficiency.
B. While it is possible that some truncated proteins do not fold correctly, our data suggest that the N-terminal fragment (1-167) retains sufficient structural integrity to interact effectively with PA28γ. The increased pull-down of this fragment suggests that it captures the essential elements required for binding.
C. The C-terminal region (168-283) might exert steric hindrance or allosteric effects on the N-terminal binding site in the context of the full-length protein. This interference could reduce the overall binding efficiency, leading to less pull-down of full-length C1QBP compared to the truncated version.
Compared with the control group, the presence of Myc-PA28γ significantly increased the expression level of Flag-C1QBP (Revised Figure 1G). Gray value analysis showed that in cells transfected with Myc-PA28γ, the decay rate of Flag-C1QBP was significantly slower than that of the control group (Revised Figure 1H), suggesting that PA28γ can delay the protein degradation of C1QBP and stabilize its protein level. This indicates that an increase in the level of PA28γ protein can significantly enhance the expression level of C1QBP protein, while PA28γ can slow down the degradation rate of C1QBP and improve its stability. In addition, our western blot analysis showed that PA28γ could still prevent the degradation of C1QBP in the presence of the proteasome inhibitor MG-132 (Revised Appendix Figure 1D). Moreover, PA28γ could not stabilize a C1QBP mutant comprising only the C-terminal region (amino acids 94-282), which is not the PA28γ-C1QBP interaction domain (Revised Figure 1K).
-
-
www.medrxiv.org www.medrxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Barlow and coauthors utilized the high-parameter imaging platform of CODEX to characterize the cellular composition of immune cells in situ from tissues obtained from organ donors with type 1 diabetes, subjects presented with autoantibodies who are at elevated risk, or non-diabetic organ donor controls. The panels used in this important study were based on prior publications using this technology, as well as a priori and domain-specific knowledge of the field by the investigators. Thus, there was some bias in the markers selected for analysis. The authors acknowledge that these types of experiments may be complemented moving forward with the inclusion of unbiased tissue analysis platforms that are emerging that can conduct a more comprehensive analysis of pathological signatures employing emerging technologies for both high-parameter protein imaging and spatial transcriptomics.
Strengths:
In terms of major findings, the authors provide important confirmatory observations regarding a number of autoimmune-associated signatures reported previously. The high parameter staining now increases the resolution for linking these features with specific cellular subsets using machine learning algorithms. These signatures include a robust signature indicative of IFN-driven responses that would be expected to induce a cytotoxic T-cell-mediated immune response within the pancreas. Notable findings include the upregulation of indoleamine 2,3-dioxygenase-1 in the islet microvasculature. Furthermore, the authors provide key insights as to the cell:cell interactions within organ donors, again supporting a previously reported interaction between presumably autoreactive T and B cells.
Weaknesses:
These studies also highlight a number of molecular pathways that will require additional validation studies to more completely understand whether they are potentially causal for pathology, or rather, epiphenomena associated with increased innate inflammation within the pancreas of T1D subjects. Given the limitations noted above, the study does present a rich and integrated dataset for analysis of enriched immune markers that can be segmented and annotated within distinct cellular networks. This enabled the authors to analyze distinct cellular subsets and phenotypes in situ, including within islets that exhibit peri-islet infiltration and/or intra-islet insulitis.
Despite the many technical challenges and unique organ donor cohort utilized, the data are still limited in terms of subject numbers - a challenge in a disease characterized by extensive heterogeneity in terms of age of onset and clinical and histopathological presentation. Therefore, these studies cannot adequately account for all of the potential covariates that may drive variability and alterations in the histopathologies observed (such as age of onset, background genetics, and organ donor conditions). In this study, the manuscript and figures could be improved in terms of clarifying how variable the observed signatures were across each individual donor, with the clear notion that non-diabetic donors will present with some similar challenges and variability.
Thank you to all reviewers and editors for their thoughtful and constructive engagement with our manuscript. We agree that patient heterogeneity and the sample size limited the impact of this study. In the future, more cases with insulitis will become available and spatial technologies will become more scalable.
Given these constraints, we have made a significant effort to illustrate the individual heterogeneity of the disease by using the same color for each nPOD case ID throughout the manuscript and showing individual donors whenever feasible (e.g. Figures 1D-E, 2C, 2I, 3E, 3G, 4B-C, 5C, and 5F). For figures related to insulitis, we do not typically include non-T1D controls since they did not have any insulitis (Figure 2C). We also explicitly discuss the differences in the two autoantibody-positive, non-T1D cases: one closely resembled the T1D cases with respect to multiple features and the other more closely resembled the non-T1D, autoantibody-negative controls.
Reviewer #2 (Public review):
Summary:
The authors aimed to characterize the cellular phenotype and spatial relationship of cell types infiltrating the islets of Langerhans in human T1D using CODEX, a multiplexed examination of cellular markers.
Strengths:
Major strengths of this study are the use of pancreas tissue from well-characterized tissue donors, and the use of CODEX, a state-of-the-art detection technique of extensive characterization and spatial characterization of cell types and cellular interactions. The authors have achieved their aims with the identification of the heterogeneity of the CD8+ T cell populations in insulitis, the identification of a vasculature phenotype and other markers that may mark insulitis-prone islets, and the characterization of tertiary lymphoid structures in the acinar tissue of the pancreas. These findings are very likely to have a positive impact on our understanding (conceptual advance) of the cellular factors involved in T1D pathogenesis which the field requires to make progress in therapeutics.
Weaknesses:
A major limitation of the study is the cohort size, which the authors directly state. However, this study provides avenues of inquiry for researchers to gain further understanding of the pathological process in human T1D.
Thank you for your analysis. We point the reader to our above description of our efforts to faithfully report the patient variability despite the small sample size.
Reviewer #3 (Public review):
Summary:
The authors applied an innovative approach (CO-Detection by indEXing - CODEX) together with sophisticated computational analyses to image pancreas tissues from rare organ donors with type 1 diabetes. They aimed to assess key features of inflammation in both islet and extra-islet tissue areas; they reported that the extra-islet space of lobules with extensive islet infiltration differs from the extra-islet space of less infiltrated areas within the same tissue section. The study also identifies four sub-states of inflamed islets characterized by the activation profiles of CD8+T cells enriched in islets relative to the surrounding tissue. Lymphoid structures are identified in the pancreas tissue away from islets, and these were enriched in CD45RA+ T cells - a population also enriched in one of the inflamed islet sub-states. Together, these data help define the coordination between islets and the extra-islet pancreas in the pathogenesis of human T1D.
Strengths:
The analysis of tissue from well-characterized organ donors, provided by the Network for the Pancreatic Organ Donor with Diabetes, adds strength to the validity of the findings.
By using their innovative imaging/computation approaches, key known features of islet autoimmunity were confirmed, providing validation of the methodology.
The detection of IDO+ vasculature in inflamed islets - but not in normal islets or islets that have lost insulin expression - links this expression to the islet inflammation, and it is a novel observation. IDO expression in the vasculature may be induced by inflammation and may be lost as disease progresses, and it may provide a potential therapeutic avenue.
The high-dimensional spatial phenotyping of CD8+T cells in T1D islets confirmed that most T cells were antigen-experienced. Some additional subsets were noted: a small population of T cells expressing CD45RA and CD69, possibly naive or TEMRA cells, and cells expressing Lag-3, Granzyme-B, and ICOS.
While much attention has been devoted to the study of the insulitis lesion in T1D, our current knowledge is quite limited; the description of four sub-clusters characterized by the activation profile of the islet-infiltrating CD8+T cells is novel. Their presence in all T1D donors indicates that the disease process is asynchronous and is not at the same stage across all islets. Although this concept is not novel, this appears to be the most advanced characterization of insulitis stages.
When examining together both the exocrine and islet areas, which is rarely done, authors report that pancreatic lobules affected by insulitis are characterized by distinct tissue markers. Their data support the concept that disease progression may require crosstalk between cells in the islet and extra-islet compartments. Lobules enriched in β-cell-depleted islets were also enriched in nerves, vasculature, and Granzyme-B+/CD3- cells, which may be natural killer cells.
Lastly, authors report that immature tertiary lymphoid structures (TLS) exist both near and away from islets, where CD45RA+ CD8+T cells aggregate, and also observed an inflamed islet-subcluster characterized by an abundance of CD45RA+/CD8+ T cells. These TLS may represent a point of entry for T cells and this study further supports their role in islet autoimmunity.
Weaknesses:
As the authors themselves acknowledge, the major limitation is that the number of donors examined is limited as those satisfying study criteria are rare. Thus, it is not possible to examine disease heterogeneity and the impact of age at diagnosis. Of 8 T1D donors examined, 4 would be considered newly diagnosed (less than 3 months from onset) and 4 had longer disease durations (2, 2, 5, and 6 years). It was unclear if disease duration impacted the results in this small cohort. In the introduction, the authors discuss that most of the pancreata from nPOD donors with T1D lack insulitis. This is correct, yet it is a function of time from diagnosis. Donors with shorter duration will be more likely to have insulitis. A related point is that the proportion of islets with insulitis is low even near diagnosis. Finally, only one donor was examined that, while not diagnosed with T1D, was likely in the preclinical disease stage and had autoantibodies and insulitis. This is a critically important disease stage where the methodology developed by the investigators could be applied in future efforts.
While this was not the focus of this investigation, it appears that the approach was very much immune-focused and there could be value in examining islet cells in greater depth using the methodology the authors developed.
Additional comments:
Overall, the authors were able to study pancreas tissues from T1D donors and perform sophisticated imaging and computational analysis that reproduce and importantly extend our understanding of inflammation in T1D. Despite the limitations associated with the small sample size, the results appear robust, and the claims well-supported.
The study expands the conceptual framework of inflammation and islet autoimmunity, especially by the definition of different clusters (stages) of insulitis and by the characterization of immune cells in and outside the islets.
Thank you for your feedback. We agree that it would be very informative to expand on our analysis of autoantibody-positive cases and look at additional non-immune features.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Do any of the observed cellular or structural features correlate with age of onset or disease duration? While numbers of subjects are low, considering these as continuous variables may clarify some of the findings.
Thank you for the suggestion. In Supplemental Figure 5B-C, we plotted the key immune signatures from the manuscript against the diabetes duration and age of onset.
(2) The IDO is an interesting observation and has prior support in the literature. The authors speculate this may be induced as a feature of IFNg expressed by lymphocytes in the local microenvironment. Can any of these concepts be further validated by staining for transcription factors or surrogate downstream markers associated with Th1 skewing (e.g., Tbet, CXCR3, etc)?
The only other interferon-stimulated gene in our panel is HLA-ABC. We updated Supplemental Figure 2F to include HLA-ABC expression in IDO- and IDO+ islets (within the “Inflamed” group). Consistent with the hypothesis that IDO is stimulated by interferon, HLA-ABC is also significantly higher in IDO+ islets than IDO- islets. PDL1, another interferon-stimulated gene, was included in the panel but we did not detect any signal. This antibody was very weak during testing in the tonsil, so we couldn’t confidently claim that PDL1 was not expressed.
(3) The authors discuss the potential that CD45RA may be expressed in Temra populations. This could use additional clarification and a distinction from Tscm if possible.
Unfortunately, we did not have the appropriate markers to distinguish naïve, TEMRA, or Tscm cells from each other. We updated the text in the discussion to include this consideration (Line 432).
(4) Supplemental Figure 5 is not informative in the current display.
Thank you, we replotted these data.
(5) Supplemental Table 1 could be expanded with additional metadata of interest, including the genetic features of the donors (e.g, class II diplotype and GRS2 values) that are published and available in the nPOD program.
Some genetic data are only available to nPOD investigators. We think it is more appropriate to request the data directly from them.
Reviewer #2 (Recommendations for the authors):
(1) I had only a few specific comments. I think the statement in Lines 317 and 318 is too strong. It implies that each lobe is always homogeneous for having all islets with insulitis or not having insulitis. Some lobes are certainly enriched for islets with insulitis but insulin+ islets without insulitis in some lobes in some T1D donors are seen. Please soften that statement.
We apologize for our lack of clarity. We have edited the text (lines 305-309) to better articulate that organ donors fall on a spectrum. Thank you for raising this point, as we think the motivation for our analysis is much clearer after these revisions.
(2) Please cite and discuss In't Veld Diabetes 2010 PMID: 20413508. While the main point of the paper is that there is beta cell replication after prolonged life support, another observation is that there is a correlation between prolonged life support and CD45+ cells in the pancreas parenchyma. This might indicate that not all immune cells in the parenchyma are T1D associated in donors with T1D.
Thank you, we have added this citation to our discussion of the importance of duration of stay in the ICU (Line 471).
(3) Can you rule out that CD45RA+/CD69+ CD8+ T cells in the islets are TSCM?
(See above)
Reviewer #3 (Recommendations for the authors):
Similar studies in experimental models may afford increased opportunity to evaluate the significance of these findings and model their potential relevance for disease staging and therapeutic targeting.
We agree that the lack of experimental data limits the ability to interpret and validate the significance of our findings. We hope that our study motivates and helps inform such experiments.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public Review):
For the colony analysis, it is unclear from the methods and main text whether the initial individual sorted colonies were split and subject to different conditions to support the claim of bi-potency. The finding that 40% of colonies displayed tenogenic differentiation, may instead suggest heterogeneity of the sorted progenitor population. The methods as currently described, suggest that two different plates were subject to different induction conditions. It is therefore difficult to assess the strength of the claim of bi-potency.
Thanks for your valuable comment. We are sorry for the confusing description of the colony assay. We first obtained CD29+/CD56+ myogenic progenitors by FACS. These freshly isolated cells were then randomly seeded into 96-well plates at a density of 1 cell/well, and the single cell in each well was cultured in growth medium for ten days to form a colony. Myogenic induction was then performed in three 96-well plates and tenogenic induction in another three 96-well plates for subsequent analyses. We agree with your point that the sorted cell population could be heterogeneous myogenic progenitors. The results showed that over 95% of colonies successfully differentiated into myotubes, while 40% of colonies displayed tenogenic differentiation (Fig. 2g). Since the freshly obtained CD29+/CD56+ myogenic progenitors were randomly seeded for tenogenic or myogenic induction, the undifferentiated cells in each group were considered to be the same sample. Furthermore, the optimal tenogenic differentiation conditions for these cells remain to be determined. Thus, we believe the colony analysis, combined with the data in Figure 1 and Figure 2, indicates the bi-potency of human CD29+/CD56+ myogenic progenitors.
This group uses the well-established CD56+/CD29+ sorting strategy to isolate muscle progenitor cells, however recent work has identified transcriptional heterogeneity within these human satellite cells (ie Barruet et al, eLife 2020). Given that they identify a tenocyte population in their human muscle biopsy in Figure 1a, it is critical to understand the heterogeneity contained within the population of human progenitors captured by the authors' FACS strategy and whether tenocytes contained within the muscle biopsy are also CD56+/CD29+.
Thanks for your constructive suggestion. We have included more samples to perform scRNA-seq and reanalyzed the data. The scRNA-seq data revealed that all the CD29+/CD56+ cells were myogenic progenitors, accounting for 19.3% of all the myogenic progenitors (Fig. 1e). No tenocytes were CD29+/CD56+ (Fig. 1d), and tenocytes made up only a small percentage (0.06%) of all the mononuclear cells. Thus, human CD29+/CD56+ cells are myogenic progenitors, and tenocytes contained within the muscle biopsy are not CD56+/CD29+. In addition, both published research and our results indicate heterogeneity among CD29+/CD56+ myogenic progenitors. Since the main purpose of the current study was to investigate the tenogenic differentiation potential of CD29+/CD56+ myogenic progenitors, this heterogeneity should be investigated in future studies.
The bulk RNA sequencing data presented in Figure 3 to contrast the expression of progenitor cells under different differentiation conditions are not sufficiently convincing. In particular, it is unclear whether more than one sample was used for the RNAseq analyses shown in Figure 3. The volcano plots have many genes aligned on distinct curves suggesting that there are few replicates or low expression. There is also a concern that the sorted cells may contain tenocytes as tendon genes SCX, MKX, and THBS4 were among the genes upregulated in the myogenic differentiation conditions (shown in Figure 3b).
Thanks for your comment. Each group consisted of three samples for the RNAseq analyses. We are sorry that there was a minor analysis mistake in Fig. 3b and Fig. 3c, which have been reanalyzed in the revised version. There was no significant difference in tendon-related marker genes after myogenic differentiation (Fig. 3b), while these tenogenic genes were significantly up-regulated after tenogenic induction (Fig. 3c). As for contamination by tenocytes, the scRNA-seq data showed that no tenocytes were positive for both CD29 and CD56 (please see the response to Comment 2). Almost all the obtained cells highly expressed the myogenic progenitor markers PAX7/MYOD1/MYF5 (Fig. 1f-g), and only low expression levels of tendon markers were identified in these cells (Fig. 2a-c). Furthermore, although tendon genes were slightly upregulated under myogenic differentiation conditions, these markers were dramatically upregulated under tenogenic differentiation conditions (Fig. 2c). Thus, we believe the bulk RNA sequencing data add evidence for the tenogenic differentiation ability of human CD29+/CD56+ myogenic progenitors.
Reviewer #2 (Public Review):
The scRNAseq assay using the total mononuclear cell population did not provide meaningful insight that enriched knowledge on the CD56+/CD29+ cell population. CD56+/CD29+ cell information may have been lost due to the minority identity of these cells in the total skeletal muscle mononuclear population, especially given that the total cell number used for scRNAseq was very low and no information is provided on the participant number and repeat sample number used for this assay. Using these data to claim a stem cell lineage relationship for MuSCs and tenocytes may not be convincing, as seeing both cell types in the total muscle mononuclear population does not establish a lineage connection between them.
Thanks for your constructive suggestion. We have included more samples to perform scRNA-seq and reanalyzed the data. Three samples with a total of 57,193 cells were included for analysis. As you can see in Fig. 1d and 1e, the joint expression analysis revealed that all the CD29+/CD56+ cells were myogenic progenitors, accounting for 19.3% of all the myogenic progenitors. In addition, we agree with your comment that the pseudotime analysis could be somewhat misleading, given the computational nature of pseudotime plots, so we deleted this assay.
The TGF-b pathway assay uses a small molecular inhibitor of TGF-b to probe Smad2/3. The assay conclusion regarding Smad2/3 pathway responsible for tenocyte differentiation may be overinterpretation without Smad2/3 specific inhibitors being applied in the experiments.
Thanks for your comment. We agree and have revised this in the revised version (Figure 7, Lines 306-326).
Reviewer #3 (Public Review):
This dual differentiation capability was not observed in mouse muscle stem cells.
Thanks for your comment. We have explored the tenogenic differentiation potential of mouse MuSCs both in vivo and in vitro. However, only low tenogenic differentiation ability was revealed (Figure 4), which might be due to species differences. It may be more demanding for humans to maintain the homeostasis of the locomotor system and whole-organism locomotion ability over a much longer life span and with a larger body size. Thus, the current study also indicates that animal studies may not be clinically relevant when investigating human diseases.
Recommendations For The Authors:
Reviewer #1 (Recommendations For The Authors):
The methods section contained insufficient details for sample tissue for many methods, including the single cell analysis, RNA FISH, and for in vivo cardiotoxin treatment. ie. how were the samples subclustered for the monocle pseudotime analysis; how many cells were counted in the FISH shown in Fig 1e/f, does the n=5 refer to tissue sections or biological replicates?; for the double injury, what was the cardiotoxin dose?
Thanks for your comment. Three samples and a total of 57,193 cells were analyzed in the single cell analysis (Line 464). We deleted the RNA FISH assay data because they provided limited information to prove the bipotential ability of human CD29+/CD56+ myogenic progenitors. In addition, since the pseudotime analysis could be somewhat misleading, given the computational nature of pseudotime plots, we also deleted this assay. For the double injury, 15 μl of 10 μM cardiotoxin was used for lineage tracing (Line 533).
Additionally, the RNA sequencing datasets are not currently publicly available under the accession numbers provided.
The raw RNA sequencing data have been uploaded to NCBI (accession numbers: PRJNA1178160, PRJNA1012476 and PRJNA1012828), and these data will be released immediately after publication.
The poor resolution of 1d makes it impossible to read any of the gene names or interpret the expression profiles of their proposed trajectories.
Since the pseudotime analysis could be somewhat misleading, given the computational nature of pseudotime plots, we deleted this assay.
What does the color key for 3a refer to? It is not indicated in the figure or legend.
Thanks for your comment. The color key for 3a refers to “Scaled expression values”, which has been added in the revised version.
scRNAseq of the sorted CD29/56+ population could help uncover possible cell heterogeneity within these muscle progenitors and which sub-populations of myogenic progenitor cells have tenogenic potential.
Thanks for your valuable suggestion. We included more cells from three biological replicates to perform scRNA-seq and found that all CD29/CD56+ cells were myogenic progenitors (Fig. 1d and 1e). We agree with you that additional scRNAseq will be helpful to clarify the possible cell heterogeneity within these muscle progenitors. Since the main scope of the current study is to investigate the bipotency of CD29/CD56+ myogenic progenitors, scRNAseq analysis of the sorted CD29/56+ population will be performed in future studies.
Typos: Line 459 sored cells... preparasion with Chromium Single Cell 3' Reagent Kits (10X genomics, cat# 1000121-1000157). Figure 4E - typo in the word tamoxifen.
Thanks for your valuable suggestion. We are sorry for the typos and have corrected them (Line 459 and Fig. 4e).
Reviewer #2 (Recommendations For The Authors):
(1) scRNAseq is performed in total mononuclear cells isolated from human skeletal muscle. The cell number (around 15000 cells) seems very low for this assay, given the CD56+/CD29+ cells are a minority population in this sequencing, the data does not seem to provide meaningful insight into the MuSC cell identities. No information on sample numbers and number of patient participants can be found in the paper.
Thanks for your comment. We added more cells to reanalyze the data in the revised manuscript. Three samples with a total of 57,193 cells were analyzed (Line 464). The joint expression analysis revealed that all the CD29+/CD56+ cells were myogenic progenitors, accounting for 19.3% of all the myogenic progenitors (Fig. 1d and 1e). These scRNA-seq data, combined with functional experiments, confirmed the MuSC identity of CD29+/CD56+ cells among mononuclear cells.
In this regard, the paragraph starts with "To confirm the single cell analysis results, we first isolated myogenic progenitor cells from human muscle biopsy using FACS as described previously" which is misleading as the scRNAseq is not the result of the sorted cells. Please reword this paragraph to clarify.
The related paragraph has been reworded (Lines 84-95).
Similarly, the existence of myocytes and tenocytes in scRNAseq does not necessarily prove a stem cell and mature cell lineage relationship. Please edit the wording to avoid overinterpretation.
Thanks for the reminder. Since the pseudotime analysis could be somewhat misleading, given the computational nature of pseudotime plots, we deleted this assay.
(2) The in vitro differentiation assays are well performed, which included bulk culture and clonal culture. The efficiencies of those two assays seem to have discrepancies which may need clarification. Again, no sample numbers and repeats have been provided.
Since the tenogenic differentiation period for the bulk culture was 12 days, myotubes formed by the fusion of CD29+/CD56+ myogenic progenitors with only myogenic differentiation potential would no longer be alive. Thus, the efficiency in bulk culture appeared higher than that in clonal culture. As stated in the statistical analysis section, at least three biological replicates and technical repeats were performed in each experimental group (Line 577).
In these paragraphs, terminologies including MuSCs, myogenic progenitors, CD56+/CD29+, and Pax7+ are interchangeably used, which generates confusion while reading. It is probably best to consistently use the cell sorting markers to address this cell population, throughout the paper.
Thanks for your constructive suggestion. The cell population is now consistently referred to as CD29+/CD56+ myogenic progenitors throughout the paper.
Information on the proliferation rate and expansion of the MuSCs would be useful but not provided.
Thanks for your comment. The analysis of cell proliferation was added in Figure 1 (Fig. 1h).
The murine cell differentiation assays are not as convincing as the human study. The assay regarding "mouse muscle CD29+/CD56+ cells were isolated for tenogenic induction. However, very few mouse muscle CD29+/CD56+ cells expressed myogenic progenitor cell marker Pax7, MyoD1 and Vcam1" does not add any value to the work as those markers are not mouse MuSC markers to start with.
Thanks for your comment. The experiments concerning mouse muscle CD29+/CD56+ cells have been deleted to avoid confusion.
The Pax7-cre-TdTomato assay was also not convincing, as a negative finding may not be the best proof of absence.
Thanks for your comment. Pax7-positive cells consistently express TdTomato for lineage tracing. In the current study, a large number of TdTomato+ myofibers were observed after muscle injury (SFig. 2c-d), suggesting that the tracing system works well. However, less than 0.2% of tendon cells originated from TdTomato+ MuSCs, even four months after tendon removal (Fig. 4f-g). Comparing the in vivo data from murine MuSCs and human CD29+/CD56+ myogenic progenitors, we believe these data indicate the poor tendon differentiation ability of murine MuSCs.
(5) TGFb as a pathway of smad2/3 mediated tenocyte differentiation assays were well done albeit not novel. Using TGFb universal inhibitor may not accurately state the pathways were due to SMAD2/3 inhibition either.
We agree with your comment and the conclusion concerning SMAD2/3 has been deleted throughout the manuscript.
The paper also needs thorough proofreading. Currently, typographic, grammatical, and logical sequences of writing do not lend the paper to easy reading.
(1) Figure 1K and 1I have similar legends but presumably K is referring to MuSC and I is referring to differentiated cells.
(2) Tenogenic and myogenic induction should be changed to tenogenic/myogenic differentiation as they are the cells at the end of differentiation.
(3) Figure 6, it is not clear how the "human cells" are calculated in this assay.
Thanks for your constructive comment. (1) The figure legends in Figure 1 have been revised (Lines 797-804). (2) Tenogenic and myogenic induction have been changed to tenogenic/myogenic differentiation throughout the manuscript where they refer to cells at the end of differentiation (Fig. 1, Fig. 2, Fig. 3, Fig. 4, Fig. 7 and SFig. 1). (3) In Figure 6, “human cells” refers to injured tendons transplanted with human CD29+/CD56+ myogenic progenitors. To evaluate the function of human CD29+/CD56+ myogenic progenitors, the PBS group was set as the negative control and the uninjured group as the normal control.
Reviewer #3 (Recommendations For The Authors):
(1) The full extent of the differentiation potential of CD29+/CD56+ stem/progenitor cells has not been thoroughly evaluated. There can also exist heterotopic ossification in injured tendon sites. Thus, it remains unclear whether these cells are truly bipotent as the authors claim, or can they differentiate into chondrocytes and osteoblasts.
Thanks for your comment. The current study focused on the tenogenic differentiation potential of CD29+/CD56+ myogenic progenitors, so the research priority was the bipotential ability of these cells. We agree with you that the chondrogenic and osteogenic ability of CD29+/CD56+ myogenic progenitors is also important and will investigate it in future studies.
(2) In Figure 3, the GO analysis also shows increased enrichment of muscle-related terms including muscle contraction and filament. Please clarify it.
The tenogenic differentiation efficiency of CD29+/CD56+ myogenic progenitors was about 40% in the clonal assay. Some cells would differentiate myogenically under this tenogenic induction system. Thus, the GO analysis could also show enrichment of muscle-related terms, including muscle contraction and filament.
(3) The authors use TNC staining to evaluate cell transplantation. My concern is whether the TNC expression is specific to the tendon site, or do engrafted human cells also express TNC in other sites such as muscle?
TNC is a well-known tendon-related marker. As you can see in Figure 6b and Figure 6c, although some human cells (labeled by Lamin A/C) were engrafted in the muscle tissue area (labeled by MyHC), these engrafted human cells didn’t express TNC in muscle. In addition, we also used the tendon-related markers SCX and TNMD to confirm the tenogenic differentiation ability of engrafted human cells in vivo (SFig. 3a and 3b).
(4) The authors demonstrate that CD29+/CD56+ human stem/progenitor cells could efficiently transplant and contribute to myofiber regeneration in vivo. However, why were only a few transplanted human cells differentiating into myofiber (labeled by MyHC) in the tendon injury model even with CTX injection?
Thanks for your comment. Since skeletal muscle is able to regenerate from in situ muscle progenitor cells, regeneration of muscle injured by CTX injection depended not only on CD29+/CD56+ myogenic progenitors, but also on native murine MuSCs. Thus, it is reasonable that only a few transplanted human cells differentiated into myofibers (labeled by MyHC) in the tendon injury model even with CTX injection.
(5) Figure 7 shows the crucial role of TGFB/SMAD signaling for the tenogenesis of human CD29+/CD56+ stem/progenitor cells. However, can TGFB/SMAD signaling activation facilitate the tenogenic differentiation of mouse MuSCs? This point is crucial to clarify the difference of MuSCs between different species.
Thanks for your valuable suggestion. We did a series of pilot assays to investigate whether TGFβ signaling activation facilitates tenogenic differentiation of mouse MuSCs (Author response image 1). As you can see, activating TGFβ signaling with SRI-011381 could slightly increase the expression of tenogenic markers in murine MuSCs. It’s an interesting topic and we will investigate it in future studies.
Author response image 1.
TGFβ signaling pathway activation slightly elevated the tenogenic differentiation ability of murine MuSCs. (a) Immunofluorescence staining of the tendon markers Scx and Tnc in murine MuSCs induced for tenogenic differentiation with or without the TGFβ signaling pathway agonist SRI-011381. Scale bars, 50 µm. (b) Quantification of Scx and Tnc fluorescence intensity in murine MuSCs that had undergone tenogenic induction with or without the TGFβ signaling pathway agonist SRI-011381. Error bars indicate standard deviation (n=5). (c) Protein levels of Tnc and Scx. Murine MuSCs were induced towards tenogenic differentiation with or without the TGFβ signaling pathway agonist SRI-011381. Total protein was extracted from cells before and after differentiation and subjected to Tnc and Scx immunoblotting. GAPDH served as the loading control.
(6) Please quantify the WB blot data throughout the manuscript.
Thanks for your comment. The WB blot data has been quantified throughout the manuscript.
(7) The RT-qPCR data should indicate what the fold changes are relative to throughout the manuscript.
Thanks for your comment. The sentence “GAPDH was served as reference gene” was added in the figure legends to illustrate RT-qPCR results.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
eLife Assessment
…. While intuitive, the model's underlying issue is grouping many factors under "variance in reproductive success" without explicitly modeling the molecular processes. This limitation, …, provides incomplete support for the authors' claim that the observed paradoxical patterns in rRNA genes can largely be explained by homogenizing processes, such as gene conversion, unequal crossover and replication slippage.
This second paper addresses the genetic drift in multi-copy gene systems using rRNA genes as an example. Note that genetic drift happens in two stages here – within individuals and between individuals while the drift mechanisms are very different between the two stages. We now reply to the editors’ decision that it would be more rigorous to model each molecular process, than to lump all stochastic forces into V(K). We respond to this criticism on three fronts.
First, for molecular evolutionists, there is NO NEED to model the detailed molecular processes. This is because we are only interested in knowing the totality of the stochastic variations. Interesting biological forces such as selection and meiotic drive are masked by such random forces. Our objective is precisely to lump all the noise into a quantity that can be estimated.
Second, the homogenization process is the bulk, if not the totality, of the within-individual random forces (i.e., genetic drift). The criticism of incomplete support for drift as a sufficient account of the observations is curious because we did conclude that genetic drift is an insufficient explanation of the human data. Drift only influences fixation time, which can have a significant effect in short-term evolution (as shown in Fig 2), but it does not affect the fixation rate itself. In contrast, selection influences both. Thus, we can define the limitation of drift in the evolutionary process. Even if the speed of drift-driven fixation is only a few generations, it is still too little for the human-chimpanzee divergence comparisons. In contrast, the speed of genetic drift in mice, as extrapolated from the polymorphism data, is sufficient to drive the divergence between M. m. domesticus and Mus spretus. The criticism appears to be that unbiased gene conversion, unequal crossover and replication slippage together may be insufficient to account for the observations. Since the contribution of each of these three forces is not central to our goal of filtering out the total contributions, we only conclude that the totality of within-individual drift in mice is sufficient to explain the data.
Third, even if we really want to dissect the molecular processes, previous attempts by prominent theorists like Tom Nagylaki and Tomoko Ohta could only model a small subset of such processes. In fact, Ohta often lumps a few of these forces into one process. More importantly, if we want to tackle other systems like viruses and mitochondria, we will have to develop a new set of theories for each molecular process. V(K) can take care of all such diverse systems. In short, genetic drift is just noise and our goal is to quantify it in total across diverse systems. By filtering out the noise, we will be able to move on to something more important.
We now briefly comment on the WF models in relation to multi-gene systems. For example, in the case of SARS-CoV-2, there are millions of virions in each patient among millions of patients. It is not possible to know what Ne actually means in the WF models. Also, the rDNA population in each individual is not one of the sub-populations of the WF models. After all, the mechanisms of genetic drift within individuals by the homogenization processes are entirely different from the genetic drift between individuals. For a comparison, we published several papers (cited in #2) using the Haldane model to estimate the strength of genetic drift. It is also important to note that the parameters and assumptions of the WF model cannot fully capture the evolutionary dynamics of multi-copy genes.
… ., along with insufficient consideration of technical challenges in alignment and variants calling, provides incomplete support for the authors' claim …
Before delving into the technical details, we would like to summarize our defense. First, all rRNA gene copies belong to a pseudo-population, due to the homogenization process. The concept of a specific locus with specific variants does not apply. Second, the levels of within-individual and within-species variation are so low that sequence alignment is not a problem at all. Third, thanks to the large number of sequence reads, occasional sequence errors (rarely encountered) should have minimal effects on the analyses. Now the technical details:
Regarding the concerns about the alignment and variant calling, we would like to clarify our methodology. While we acknowledge the technical challenges inherent in alignment and variant calling, particularly with respect to orthologous alignments to distinguish different copies, it is important to note that rDNA copies are subject to homogenization processes, meaning that there is no orthology among rDNA copies. Due to the high sequence similarity and frequent genetic exchange among rDNA units within species, we used the species-specific rDNA reference sequence for variant calling. We directly utilized the raw read depth from all rDNA copies within individuals to calculate the site frequency. For each site, we focused on the frequency of the major allele to calculate nucleotide diversity using 2p(1-p), where p represents the frequency of the major allele. This approach helps capture genetic variation while minimizing the impact of alignment or variant calling errors, which primarily affect low-frequency variants (e.g., 0.800A, 0.199T, 0.001C, with A being the major allele). As for the divergence sites between species, we defined FST = 0.8 as a cutoff (roughly, when a mutant is > 0.95 in frequency in one species and < 0.05 in the other, FST would be > 0.80), which is less likely to be influenced by low-frequency polymorphic sites within species. We believe this method is more appropriate for estimating genetic diversity at rDNA than traditional variant calling pipelines designed to detect homozygotes and heterozygotes.
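To make the two quantities in this response concrete, the following is a minimal sketch and not the pipeline used in the study. It computes the per-site diversity 2p(1-p) from raw allele read depths, and a two-species FST from major-allele frequencies. The function names are hypothetical, and the FST estimator, (HT - HS)/HT with equal weighting of the two species, is an assumption, since the response does not state which estimator was used.

```python
def site_diversity(depths):
    """Per-site diversity 2p(1-p), where p is the major-allele frequency.

    `depths` maps each allele to its raw read depth at one site
    (hypothetical input format).
    """
    total = sum(depths.values())
    if total == 0:
        raise ValueError("site has no reads")
    p = max(depths.values()) / total  # frequency of the major allele
    return 2 * p * (1 - p)


def fst_two_species(p1, p2):
    """FST = (HT - HS) / HT for one biallelic site in two species.

    Assumption: both species are weighted equally; the response does
    not specify the estimator.
    """
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-species heterozygosity
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # heterozygosity of the pooled frequency
    if h_t == 0:
        return 0.0  # site is fixed for the same allele in both species
    return (h_t - h_s) / h_t


# The example from the text: 0.800 A, 0.199 T, 0.001 C
print(site_diversity({"A": 800, "T": 199, "C": 1}))
# Frequencies at the boundary quoted in the text (0.95 vs 0.05)
print(fst_two_species(0.95, 0.05))
```

Under this assumed estimator, frequencies of 0.95 and 0.05 give an FST of about 0.81, which agrees with the rough cutoff described above. Because only the major-allele frequency enters the diversity calculation, a rare miscalled allele such as the 0.001 C changes the result very little.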
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
eLife Assessment (divided into 3 parts)
This study presents a useful modification of a standard model of genetic drift by incorporating variance in reproductive success, claiming to address several paradoxes in molecular evolution. ……
It is crucial to emphasize that our model is NOT a modification of the standard model. The Haldane model, which is generalized here for population regulation, is based on the branching process. The Haldane model and the WF model which is based on population sampling are fundamentally different. We referred to our model as the integrated WF-H model because the results obtained from the WF model over the last 90 years are often (but not always) good approximations for the Haldane model. The analogy would be the comparisons between the Diffusion model and the Coalescence model. Obviously, the results from one model are often good approximations for the other. But it is not right to say that one is a useful modification of the other.
We realize that it was a mistake to call our model the integrated WF-H model, thus causing confusion over two entirely different models. Clearly, the word “integrated” did not help. We have now revised the paper by using the more accurate name for the model – the Generalized Haldane (GH) model. The text explains clearly that the original Haldane model is a special case of the GH model.
Furthermore, we present the paradoxes and resolve them by the GH model. We indeed overreached by claiming that WF models could not resolve them. Whether the WF models have done enough to resolve the paradoxes or at least will be able to resolve them should not be a central point of our study. Here is what we state at the end of this study:
“We understand that further modifications of the WF models may account for some or all of these paradoxes. However, such modifications have to be biologically feasible and, if possible, intuitively straightforward. Such possible elaborations of WF models are beyond the scope of this study. We are only suggesting that the Haldane model can be extensively generalized to be an alternative approach to genetic drift. The GH model attempts to integrate population genetics and ecology and, thus, can be applied to genetic systems far more complex than those studied before. The companion study is one such example.”
….. However, some of the claimed "paradoxes" seem to be overstatements, as previous literature has pointed out the limitations of the standard model and proposed more advanced models to address those limitations….
As stated in the last paragraph of the paper, it is outside of the scope of our study to comment on whether the earlier WF models can resolve these paradoxes. So, all such statements have been removed or at least drastically toned down in the formal presentation. That said, editors and reviewers may ask whether we are reinventing the wheel. The answers are as follows:
First, two entirely different models reaching the same conclusion are NOT a reinvention of the wheel. The coalescence theory does not merely rediscover the results obtained by the diffusion models. The process of obtaining the results is itself a new invention. This would lead to the next question: is the new process more rigorous and more efficient? We think the Haldane model is indeed more efficient in comparison with the very complex modifications of the WF models.
Second, we are not sure that the paradoxes have been resolved, or even can be resolved. Note that this skepticism has been purged from the formal presentation. Therefore, we are presenting the arguments outside of the paper for a purely intellectual discourse. Below, please allow us to address the assertions that the WF models can resolve the paradoxes.
The first paradox is that the drift strength in relation to N is often opposite to the WF model predictions. Since the WF models (standard or modified) do not generate N from within the model, how can they resolve the paradox? In contrast, the Generalized Haldane model generates N within the model. It is the regulation of N near the carrying capacity that creates the paradox: when N increases, drift also increases.
The second paradox that the same locus experiences different drifts in males and females is accepted by the reviewers. Nevertheless, we would like to point out that this second paradox echoed the first one as newly stated in the Discussion section “The second paradox of sex-dependent drift is about different V(K)’s between sexes (generally Vm > Vf) but the same E(K) between them. In the conventional models of sampling, it is not clear what sort of biological sampling scheme could yield V(K) ≠ E(K), let alone two separate V(K)’s with one single E(K). Mathematically, given separate K distributions for males and females, it is unlikely that E(K) for the whole population could be 1, hence, the population would either explode in size or decline to zero. In short, N regulation has to be built into the genetic drift model as the GH model does to avoid this paradox.”
The third paradox stems from the fact that drift is operating even for genes under selection. But then the drift strength, 2s/V(K) for an advantage of s, is independent of N or Ne. Since the determinant of drift strength in the WF model is ALWAYS Ne, how is Paradox 3 not a paradox for the WF model?
The 4th paradox about multi-copy gene systems is the subject of the companion paper (Wang et al.). Note that the WF model cannot handle systems of evolution that experience totally different sorts of drift within vs. between hosts (viruses, rDNAs etc). This paradox can be understood by the GH model and will be addressed in the next paper.
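The quantity 2s/V(K) can be illustrated with a small branching-process calculation. This is a sketch under stated assumptions, not the GH model itself. In a branching process, the probability that a new mutant lineage with mean offspring number 1 + s escapes extinction is 1 - q, where q is the smallest root of q = G(q) and G is the offspring probability generating function. The Poisson and negative binomial offspring distributions, the function names, and the value s = 0.01 are all illustrative choices that do not appear in the response.

```python
import math


def survival_probability(pgf, tol=1e-14, max_iter=1_000_000):
    """Return 1 - q, where q is the smallest fixed point of q = pgf(q).

    Iterating from q = 0 converges to the extinction probability of a
    lineage founded by a single mutant copy.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(q)
        if abs(q_next - q) < tol:
            q = q_next
            break
        q = q_next
    return 1.0 - q


def poisson_pgf(mean):
    """PGF of a Poisson offspring number; V(K) equals the mean."""
    return lambda z: math.exp(mean * (z - 1.0))


def negbin_pgf(mean, r):
    """PGF of a negative binomial offspring number; V(K) = mean + mean**2 / r."""
    return lambda z: (1.0 + mean * (1.0 - z) / r) ** (-r)


s = 0.01            # illustrative selective advantage
mean_k = 1.0 + s    # E(K) of the advantageous mutant

cases = [
    ("Poisson", poisson_pgf(mean_k), mean_k),
    ("negative binomial, r = 1", negbin_pgf(mean_k, 1.0), mean_k + mean_k ** 2),
]
for label, pgf, var_k in cases:
    exact = survival_probability(pgf)
    print(f"{label}: survival = {exact:.5f}, 2s/V(K) = {2 * s / var_k:.5f}")
```

For small s, the computed survival probability stays close to 2s/V(K) in both cases, and no population size enters the calculation. This matches the point made above that the quantity depends on s and V(K) but not on N or Ne.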
While the modified model presented in this paper yields some intriguing theoretical predictions, the analysis and simulations presented are incomplete to support the authors' strong claims, and it is unclear how much the model helps explain empirical observations.
The objection appears to be that our claims of “paradox resolution” are too strong. We interpret this objection as being based on the view (with which we agree) that these paradoxes are intrinsically difficult to resolve by the WF models. Since our model has been perceived to be a modified WF model, the claim of resolution is clearly too strong. However, the GH model is conceptually and operationally entirely different from the WF models, as we have emphasized above. In case our reading of the editorial comments is incorrect, would it be possible to receive some clarification on the nature of “incomplete support”? We would be grateful for the help.
-
-
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this manuscript, Janssens et al. addressed the challenge of mapping the location of transcriptionally unique cell types identified by single nuclei sequencing (snRNA-seq) data available through the Fly Cell Atlas. They identified 100 transcripts for head samples and 50 transcripts for fly body samples allowing the identification of every unique cell type discovered through the Fly Cell Atlas. To map all of these cell types, the authors divided the fly body into head and body samples and used the Molecular Cartography (Resolve Biosciences) method to visualize these transcripts. This approach allowed them to build spatial tissue atlases of the fly head and body, to identify the location of previously unknown cell types and the subcellular localization of different transcripts. By combining snRNA-seq data from the Fly Cell Atlas with their spatially resolved transcriptomics (SRT) data, they demonstrated an automated cell type annotation strategy to identify unknown clusters and infer their location in the fly body. This manuscript constitutes a proof-of-principle study to map the location of the cells identified by ever-growing single-cell transcriptomic datasets generated by others.
Strengths:
The authors used the Molecular Cartography (Resolve Biosciences) method to visualize 100 transcripts for head samples and 50 transcripts for fly body samples in high resolution. This method achieves high resolution by multiplexing a large number of transcript visualization steps and allows the authors to map the location of unique cell types identified by the Fly Cell Atlas.
We thank this reviewer for appreciating the quality of our spatial data. We do not know what caused the technical problem (grayscale version of PDF) for this reviewer (the PDF figures are in color on the eLife website and on bioRxiv). We are surprised that the eLife discussion session did not resolve this issue.
Weaknesses:
Combining single-nuclei sequencing (snRNA-seq) data with spatially resolved transcriptomics (SRT) data is challenging, and the methods used by the authors in this study cannot reliably distinguish between cells, especially in brain regions where the processes of different neurons are clustered, such as in neuropils. This means that a grid that the authors mark as a unique cell may actually be composed of processes from multiple cells.
The small size of an individual fly is one of the most challenging aspects of performing spatial transcriptomics. While the resolution of Molecular Cartography is rather high (< 200 nm), in the brain challenges remain as noted by the reviewer. Drosophila neuronal nuclei are notoriously small and cannot be easily resolved with the current imaging techniques. We agree that for a full atlas either expansion microscopy, 3D techniques or other super-resolution techniques will be required.
Reviewer #2 (Public Review):
Summary:
The landmark publication of the "Fly Atlas" in 2022 provided a single cell/nuclear transcriptomic dataset from 15 individually dissected tissues, the entire head, and the body of male and female flies. These data led to the annotation of more than 250 cell types. While certainly a powerful and data-rich approach, a significant step forward relies on mapping these data back to the organism in time and space. The goal of this manuscript is to map 150 transcripts defined by the Fly Atlas by FISH and in doing so, provide, for the first time, a spatial transcriptomic dataset of the adult fly. Using this approach (Molecular Cartography with Resolve Biosciences), the authors, furthermore, distinguish different RNA localizations within a cell type. In addition, they seek to use this approach to define previously unannotated clusters found in the Fly Atlas. As a resource for the community at large interested in the computational aspects of their pipeline, the authors compare the strengths and weaknesses of their approach to others currently being performed in the field.
Strengths:
(1) The authors use Resolve Biosciences and a novel bioinformatics approach to generate a FISH-based spatial transcriptomics map. To achieve this map, they selected 150 genes (50 body; 100 head) that were highly expressed in the single nuclear RNA sequencing dataset and were used in the 2022 paper to annotate specific cell types; moreover, the authors chose several highly expressed genes characteristic of unannotated cell types. Together, the approach and generated data are important next steps in translating the transcriptomic data to spatial data in the organism.
We thank the reviewer for this comment, as it reminded us that we need to be clearer in the text about how we chose the genes to investigate. The statement that we selected “150 genes (50 body; 100 head) that were highly expressed in the single nuclear RNA sequencing dataset” is not correct. We have chosen genes with widely differing expression levels (log-scale range of 3.95 in body, 5.76 in head; we now show this in the new Figure 1 – figure supplement 1B, D). Many of the chosen genes are also transcription factors. In fact, the method introduced here is more sensitive than the single cell atlas: the tinman positive cells were readily located (even non-heart cells were found to express tinman), whereas in the single cell FCA data tinman expression is often not detected in the cardiomyocytes (tinman is detected in 273 cells in the entire FCA (mean expression of 1.44 UMI in positive cells), and in 71 cells out of 273 cardiac cells (26%)).
(2) Working with Resolve, the authors developed a relatively high throughput approach to analyze the location of transcripts in Drosophila adults. This approach confirmed the identification of particular cell types suggested by the FlyAtlas as well as revealed interesting subcellular locations of the transcripts within the cell/tissue type. In addition, the authors used co-expression of different RNAs to unbiasedly identify "new cell types". This pipeline and data provide a roadmap for additional analyses of other time points, female flies, specific mutants, etc.
(3) The authors show that their approach reveals interesting patterns of mRNA distribution (e.g alpha- and beta-Trypsin in apical and basal regions of gut enterocytes or striped patterns of different sarcomeric proteins in body muscle). These observations are novel and reveal unexpected patterns. Likewise, the authors use their more extensive head database to identify the location of cells in the brain. They report the resolution of 23 clusters suggested by the single-cell sequencing data, given their unsupervised clustering approach. This identification supports the use of spatial cell transcriptomics to characterize cell types (or cell states).
(4) Lastly, the authors compare three different approaches --- their own described in this manuscript, Tangram, and SpaGE - which allow integration of single cell/nuclear RNA-seq data with spatial localization FISH. This was a very helpful section as the authors compared the advantages and disadvantages (including practical issues, like computational time).
Weaknesses:
(1) Experimental setup. It is not clear how many and, for some of the data, the sex of the flies that were analyzed. It appears that for the body data, only one male was analyzed. For the heads, methods say male and female heads, but nothing is annotated in the figures. As such, it remains unclear how robust these data are, given such a limited sample from one sex. As such, the claims of a spatial atlas of the entire fly body and its head ("a rosetta stone") are overstated. Also, the authors should clearly state in the main text and figure legends the sex, the age, how many flies, and how many replicates contributed to the data presented (not just the methods). What also adds to the confusion is the use of "n" in para 2 of the results. " ... we performed coronal sections at different depths in the head (n=13)..." 13 sections in total from 1 head or sections from 13 heads? Based on the body and what is shown in the figure, one assumes 13 sections from one head. Please clarify.
While we agree that sex differences are indeed an interesting subject to study with spatial transcriptomics, our goal was not to define male/female differences but rather to establish the technology so that this level of detail can be addressed in the future. In the revised version, we have provided an additional supplementary table with a more detailed description of the head sections (Table S3). We have added the number of animals (12 for the head sections, mixed sex; and 1 male for the body sections) to the main text. We would like to point out that we verified the specificity of our MC method on all 5 body sections (Figure 2A, TpnC4 & Act88F and text) and not only on one. Furthermore, we would also like to state that the idea of “a Rosetta stone” was mentioned as a future prospect that clearly goes beyond our presented work. We have rewritten the discussion to make this clearer and to avoid any overstatements.
(2) Probes selected: Information from the methods section should be put into the main text so that it is clear what and why the gene lists were selected. The current main text is confusing. If the authors want others to use their approach, then some testing or, at the very least, some discussion of lower expressed genes should be added. How useful will this approach be if only highly expressed genes can be resolved? In addition, while it is understood that the company has a proprietary design algorithm for the probes, the authors should comment on whether the probes for individual genes detect all isoforms or subsets (exons and introns?), given the high level of splicing in tissues such as muscle.
As stated above, while there is a slight bias towards more highly expressed genes (as expected for marker genes), we have also used lowly expressed genes like salm, CG32121, tinman (body) or sens (head). This is now shown in the new Figure 1 – figure supplement 1B, D. This shows that our method is more sensitive than single-cell data, as all cardiomyocytes can be identified by tinman expression, whereas only some are positive in the FCA data. In fact, the method cannot resolve genes that are expressed too highly, because optical crowding of the signal leads to worse quantification. For this reason, ninaE was removed from the analysis (as mentioned in the section "Spatial transcriptomics allows the localization of cell types in the head and brain" and in Methods).
As mentioned in the Methods, the probes are designed on gene level targeting all isoforms, but favoring principal isoforms (weighted by APPRIS level). The high level of splicing is indeed interesting and we expect that in the future spatial transcriptomics can help to generate more insight into this by designing isoform-specific probes.
(3) Imaging: it isn't clear from the text whether the repeated rounds of imaging impacted data collection. In many of what appear to be "stitched" images, there are gradients of signal (e.g., figure 2F); please comment. Also, since this is a new technique, could a before and after comparison of the original images and the segmented images be shown in the supplemental data so that the reader can better appreciate how the authors assessed/chose/thresholded their data? More discussion of the accuracy of spot detection would be helpful.
High-resolution imaging (pixel size = 138 nm) of a large field of view (>1 mm) for spatial transcriptomics uses a stitching method that combines several individual images to reconstruct the large field of view. This does not generate signal gradients, apart from lower signal at the extreme edges of each of the individual images, as seen in our images, too. The spot detection algorithm was written and used by Resolve Biosciences and benchmarked for human (HeLa) and mouse (NIH-3T3) cell lines in Groiss et al. 2021 (Highly resolved spatial transcriptomics for detection of rare events in cells, bioRxiv). The specificity of the decoded probes was found to lie between 99.45 and 99.9% in that study, matching the results we found for specific detection of TpnC4 and Act88F (99.4 and 99.8%).
(4) The authors comment on how many RNAs they detected (first paragraph of results). How do these numbers compare to the total mRNA present as detected by single-cell or single-nuclear sequencing?
We can compare the numbers, but the different methodologies make the interpretation of such a comparison difficult. FCA used single nucleus sequencing, so only nuclear pre-mRNAs are detected. The total amount of counts per single cell sample strongly depends on how many cells were sequenced in an experiment. MC detects all mRNAs present in the section. Here, the size of the sample and hence the size or the number of cells analyzed determines how many mRNAs are detected. In Author response image 1, we have compared our MC results versus FCA data, comparing the genes investigated here in MC per section vs per sequencing experiment. Numbers for MC are slightly lower for the brain (not all cell types are on all sections) and much higher for the larger body samples. However, we feel a direct comparison is questionable, so we prefer to not include this figure in our manuscript.
Author response image 1:
Barplots showing total number of mRNA molecules detected in Molecular Cartography (MC, Resolve, spatial spots) and in snRNA-seq data from the Fly Cell Atlas (10x Genomics, UMIs). Individual black dots show individual experiments, counts are only shown for the chosen gene panel for each sample. Bar shows the mean, with error bars representing the standard error.
(5) Using this higher throughput method of spatial transcriptomics, the authors discern different cell types and different localization patterns within a tissue/cell type.
a. The authors should comment on the resolution provided by this approach, in terms of the detection of populations of mRNAs detected by low throughput methods, for example, in glia, motor neuron axons, and trachea that populate muscle tissue. Are these found in the images? Please show.
We did not add any markers for trachea in our gene panel, but we do detect sparse spots of repo (glia) and elav/VGlut in the muscle tissues (Gad1/VAChT are hardly detected in the muscle tissue). This is consistent with the glutamatergic nature of motor neurons in Drosophila as described previously (Schuster CM (2006), Glutamatergic synapses of Drosophila neuromuscular junctions: a high-resolution model for the analysis of experience-dependent potentiation. Cell Tissue Res 326: 287–299). We now present these new data in the new Figure 2 – figure supplement 1.
b. The authors show interesting localization patterns in muscle tissue for different sarcomere protein-coding mRNAs, including enrichment of sls in muscle nuclei located near the muscle-tendon attachment sites. As this high throughput approach is newly being applied to the adult fly, it would increase confidence in these data if the authors would confirm these data using a low throughput FISH technique. For example, do the authors detect such alternating "stripes" (Act88F, TpnC4, and Mhc) or enriched localization (sls) using FISH that doesn't rely on the repeated colorization, imaging, decolorization of the probes?
We thank the reviewer for the interest in the localization patterns in muscle tissue. We show that Act88F and TpnC4 are not detected outside of flight muscle cells (99.4% and 99.8% of the single-molecule signal lies within flight muscles), giving us confidence in the specificity of the MC method. Following the suggestion of the reviewer, we have adapted an HCR-FISH method to Drosophila adult body sections for the revised version of the manuscript. Using this method, we were able to confirm the higher expression/localization of sls transcripts in and around the adult flight muscle nuclei, with an enrichment in nuclei close to the muscle-tendon attachment sites (new Figure 4D-F and new Figure 4 – figure supplement 1). We have also been able to confirm some complementarity in the localization patterns of Act88F and TpnC4 in longitudinal stripes in adult flight muscles; however, for Mhc we could not confirm this pattern with HCR-FISH (new Figure 5C-F and new Figure 5 – figure supplement 1). While we could confirm most of the patterns seen, we do not know the exact reason for the slight discrepancies. Thus, we now recommend that insights found with SRT should be confirmed with more classical FISH methods.
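One simple way to express the tissue specificity quoted above is the fraction of detected spots that fall inside an annotated region. The following is a toy sketch under stated assumptions and is not the analysis code used for the manuscript. The spot coordinates, the boolean mask, and the function name are all hypothetical.

```python
def fraction_in_region(spots, mask):
    """Fraction of spots whose pixel falls inside a boolean region mask.

    `spots` is a list of (row, col) integer pixel coordinates for one gene;
    `mask` is a 2D list of booleans, True where the pixel belongs to the
    region of interest (for example, flight muscle). Both are hypothetical
    inputs chosen for illustration.
    """
    spots = list(spots)
    if not spots:
        raise ValueError("no spots detected for this gene")
    inside = sum(1 for row, col in spots if mask[row][col])
    return inside / len(spots)


# Toy data: a 2 x 3 image whose left two columns are annotated as the region.
toy_mask = [
    [True, True, False],
    [True, True, False],
]
toy_spots = [(0, 0), (0, 1), (1, 1), (1, 2)]  # the last spot lies outside the region
print(fraction_in_region(toy_spots, toy_mask))  # 3 of 4 spots are inside
```

A value close to 1 for a gene such as Act88F or TpnC4 would correspond to the 99.4% and 99.8% figures reported above, assuming the region annotation itself is reliable.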
(6) The authors developed an unbiased method to identify "new cell types" which relies on coexpression of different transcripts. Are these new cell types or a cell state? While expression is a helpful first step, without any functional data, the significance of what the authors found is diminished. The authors need to soften their statements.
The term “new cell types” only appeared in the old title. We agree that with the current spatial map we cannot be sure to have found “new cell types”, instead we show where unannotated/uncharacterized clusters from the scRNA-seq atlas are located, based on their gene expression. Therefore, we have updated the title in the revised version (Spatial transcriptomics in the adult Drosophila brain and body) and thank the reviewer for this valuable suggestion.
Appraisal:
The authors' goal is to map single cell/nuclear RNAseq data described in the 2022 Fly Atlas paper spatially within an organism to achieve a spatial transcriptomic map of the adult fly; no doubt, this is a critical next step in our use of 'omics approaches. While this manuscript does the hard work of trying to take this next step, including developing and testing a new pipeline for high throughput FISH and its analysis, it falls short, in its present form, in achieving this goal. The authors discuss creating a robust spatial map, based on one male fly. Moreover, they do not reveal principles of mRNA localization, as stated in the abstract; they show us patterns, but nothing about the logic or function of these patterns. This same criticism can be said of the identification of "new cell types", just based on RNA colocalization. In both cases (mRNA subcellular localization or cell type identification), further data in the form of validation with traditional low throughput FISH and genetic manipulations to assess the relation to cell function are required for the authors to make such claims.
We have indeed used one male fly for the adult male body data. This is mainly due to the cost of the sample processing. We used 12 individuals for the head samples (from 1 individual we acquired 2 sections, a total of 13 sections). We show that the body samples show a high correlation with each other, while the head samples cover multiple depths of the head. Still, even in the head, we find that sections at similar depths show a high similarity to each other in terms of gene-gene coexpression and expression patterns. Although obtaining sections from more animals would be valuable, we do not believe it to be necessary for our current goals. Additional replicates beyond the ones we already provide would require significant amounts of extra time and budget, while they would very likely produce similar results as we already show. Following the reviewer’s suggestion, we have tested several genes with HCR-FISH and could readily confirm the localization pattern of sls mRNA close to the terminal nuclei of the flight muscles. This pattern is likely due to a higher expression of sls in these nuclei, as a large amount of sls mRNA signal is detected within the nuclei (Figure 4). A detailed dissection of the mechanism that establishes this pattern is beyond the scope of this manuscript, which is the first one on applying spatial transcriptomics to adult Drosophila.
The usage of the term “new cell types” was indeed ambiguous and we removed it from the revised version. We have now clarified that we map the spatial location of unannotated clusters in the brain. This may or may not include uncharacterized cell types. We now further specify that we have only inferred the location of the nuclei; thus, neuronal function and the location of the axonal processes remain unknown. As such, our data provide a starting point to identify uncharacterized cell types, since their marker genes and nuclear location are now determined. The next step to identify “new cell types” would indeed be to acquire genetic access to these cell types and characterize them in more detail. This is beyond the scope of this manuscript, and therefore we have toned down the title in the revised version and thank the reviewer for this valuable suggestion.
Discussion of likely impact:
If revised, these data, and importantly the approach, would impact those working on Drosophila adults as well as those working in other model systems where single cell/nuclear sequencing is being translated to the spatial localization within the organism. The subcellular localization data - for example, the size of transcripts and how that relates to localization or the patterns of sarcomeric protein localization in muscle - are intriguing, and would likely impact our thinking on RNA localization, transport, etc if confirmed. Lastly, the authors compare their computational approaches to those available in the field; this is valuable as this is a rapidly evolving field and such considerations are critical for those wishing to use this type of approach.
We thank this reviewer for appreciating the impact of our findings and approach to the Drosophila field and beyond. We here provide the groundwork for a full Drosophila adult spatial atlas, similar to how early scRNA-seq datasets provided a framework for the Fly Cell Atlas. In the manuscript we provide both experimental information on how to successfully perform spatial transcriptomics (treating slides for optimal attachment) and the data serves as a benchmark for future experiments to improve upon (similar to how early Drop-seq datasets were compared to later 10x datasets in single-cell transcriptomics). In addition, it also provides proof of principle methods on how to integrate the FCA data with these spatial data and it identifies localized mRNA species in large adult muscle cells, showing the complementarity of spatial techniques with single-cell RNA-seq. For a small number of genes, we have confirmed the mRNA patterns using HCR-FISH in the revised version of this manuscript. To conclude, this is the first spatial adult Drosophila transcriptomics paper, locating 150 mRNA species with easy data access in our user portal (https://spatialfly.aertslab.org/).
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) All figures in the manuscript were in grayscale, which made it difficult to interpret the results because the data could only be interpreted by distinguishing different colors to visualize different transcripts. This is likely a technical problem. The manuscript must contain colored images.
We apologize to the reviewer for this technical issue. The manuscript was uploaded in color to bioRxiv and to eLife. We therefore do not understand the reason for this problem. We are surprised that this issue was not resolved in the reviewers’ discussion since color is obviously essential to appreciate the beauty of this manuscript.
(2) In Figure 2a, the authors comment on the subcellular localization of trypsin isoforms, but the figure does not indicate the cell borders or the apical and basal regions of the cell. These must be indicated in the figure to help readers understand the results.
We thank the reviewer for pointing this out; we have now indicated the outlines of the single-cell layer epithelium on the figure. While we have no marker for cell borders, we have a nuclear marker showing that it is a single cell layer. We hope this allows the reader to appreciate the subcellular localization of the trypsin isoforms.
(3) All figures (including the data on the authors' website) contain background staining, which I assume is labeling nuclei. This is not indicated in the manuscript, and should be clarified.
We again thank the reviewer for pointing this out; the background staining indeed labels nuclei (using DAPI). We have indicated this better in the revised version.
(4) In Figure 5c, the authors claim that neuronal and muscular genes are grouped into the same cluster, but they do not indicate which transcripts are neuronal and which ones are muscular. This must be indicated in the figure.
We thank the reviewer for this comment. Indeed, there was only one gene, acj6, present in the muscle cluster. So, we decided to delete this statement in the revised version.
(5) The authors utilized and compared three different approaches to integrate single nuclei sequencing data from the Fly Cell Atlas to their spatially resolved transcriptomics (SRT) data. I was wondering if it is possible to generate a virtual expression explorer using this integrated data, similar to the dataset published in the 2017 Science article by Karaiskos et al., where they combined publicly available in situ hybridization data of fly embryos and their single-cell sequencing data. This virtual expression explorer would be useful to visualize the expression pattern of transcripts that the authors of this paper did not use for their SRT.
We thank the reviewer for this interesting comment. Using Tangram, we indeed infer gene expression for all genes from the Fly Cell Atlas. To make this visible we have created a Scope session (https://scope.aertslab.org/#/Spatial_Fly/*/welcome), with which users can browse inferred gene expression levels (note that this is on a segmented cell level). We do note that the inferred gene expression levels contain many false positives and should therefore be used with caution. The spatial data themselves can be browsed through the spatial portal at https://spatialfly.aertslab.org/.
Reviewer #2 (Recommendations For The Authors):
Suggestions for improved or additional experiments, data, or analyses:
The authors have used a new high throughput approach to examine the location of 150 RNAs in adult Drosophila heads or one body. It is unclear whether the fixation/repeated imaging etc is accurately reflecting the patterns of expression in vivo. The authors should confirm these data using low throughput established techniques for the RNA patterns in muscle for example.
The authors should clarify their experimental approaches and include additional samples if they indeed want to establish the rosetta stone of fly adults. These data are from only a male fly (and as such is not a complete analysis of the adult fly). To be a map of the adult fly, data from both sexes need to be included.
Unless functional data that complement the descriptive data shown here are included, the authors have to soften their conclusions. For example, while spatial transcriptomics has mapped RNA expression to a location, without some functional data, it is difficult to conclude that these are indeed "new cell types". Same with the RNA localization principles.
Recommendations for improving the writing and presentation:
(1) The manuscript should be heavily revised: in many places, important details are left out or should be moved from the methods to the main text. In addition, the authors often overstate their findings throughout the manuscript. As an example, it appears that the data presented is only from 1 fly, so this doesn't increase the reader's confidence in the data or the applicability of the approach. Also, it isn't clear how many flies were analyzed for the heads (one male fly too?) nor what variability is present from fly to fly. For the approach and data to be used by others, this is important to know.
We moved some text from the methods section to the main text to be clearer. We now also state how many animals were used for the MC method. While the data for the body were generated from 1 male only, the data for the head were generated from 12 flies; in both cases, similar slices show very similar gene expression patterns. Furthermore, in the body we used widely known and published marker genes that all showed expected expression patterns, indicating robustness. We agree that this is not a full spatial atlas of the fly; this was also not our goal, and we have removed such general statements from the revised version. We aimed to generate a spatial transcriptomics dataset, covering the entire fly (head and body) as a proof-of-principle, tackling data generation and analysis, and highlighting challenges in both.
(2) The grammar and word choice throughout are challenging often making the text difficult to follow. This reads like an early draft of the paper.
We apologize to the reviewer for any difficulties. We have revised the text and hope it is now easier to read, while still being accurate on the technical details of the various methods used in our manuscript.
Minor corrections to the text and figures.
See the weaknesses mentioned above. Also:
Figure S1 is unreadable.
There is no simple way to describe the expression values of 100 genes in 100 cell types on a single page. The resolution of the PDF is high enough that after zooming in, all the information can be read easily.
Figure S2, in a, please include the axes so that the reader can better understand the sections shown.
In b, it is unclear what the pink boxes mean. In c, the labels are barely legible.
In Figure 1 – figure supplement 2 (head sections), we have ordered the head sections from anterior to posterior. The boxes in (B) represent boxplots. We have updated this plot for clarity to better display the number of mRNA molecules detected for each gene. We have increased the font size in (C).
Figure S3, in a, please include axes. In b, the meaning of the pink box is unclear.
In Figure 1 – figure supplement 3 (the body sections) we have added the anterior to posterior and dorso-ventral axis, and ordered the sections that stem from the same animal. The boxes in (B) represent boxplots. We have updated this plot for clarity to better display the number of mRNA molecules detected for each gene. We have added an explanation to the figure legend.
Figure S4, the text in the axes of the heatmap should have a darker typeface
We have changed it to black, thanks.
Figure S5c, are the colors in the dendrogram supposed to match the spatial location on the right?
The purple of the muscles is barely visible.
Yes, they do match. Colors were modified in the revised version for better visibility.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
The significance of Notch in liver cancer has been inconsistently described to date. The authors conduct a PDX screen using JAG1 ab and identify 2 sensitive tumor models. Further characterization with bulk RNA seq, scRNA seq, and ATAC seq of these tumors was performed.
Strengths:
The reliance on an extensive panel of PDXs makes this study more definitive than prior studies.
Gene expression analyses seem robust.
Identification of a JAG1-dependent signature associated with hepatocyte differentiation is interesting.
Weaknesses:
The introduction is rather lengthy and not entirely accurate. HCC is a single cancer type/histology. There may be variants of histology (allusion to "mixed-lineage" is inaccurate as combined HCC-CCa are not conventionally considered HCC and are not treated as HCC in clinical practice as they are even excluded from HCC trials), but any cancer type can have differences in differentiation. Just state there are multiple molecular subtypes of this disease.
We will shorten the Introduction, in part by eliminating the discussion of histological variation in HCC and focusing on the molecular classifications.
There is minimal data on the PDXs, despite this being highlighted throughout the text. Clinical and possibly some molecular characterization of these cancers should be provided. It is also odd that the authors include only 35 HCC and then a varied sort of cancer histologies, which is peculiar given their prior statements regarding the heterogeneity of HCC.
We agree that clinical and molecular characterizations of the PDX models would be helpful and will follow up with the relevant contract research organization to determine what characterization is available.
Regarding the liver cancer PDX panel, we suggest that a major strength of the manuscript is the large number of HCC models that were tested (the reviewer also notes the importance of the “extensive” panel); thus, we are a bit confused by the reference to “only 35 HCC”. To clarify the choice of models in the PDX screen, it may help to put the screen in historical perspective as the project unfolded. In retrospect, our preliminary efficacy studies using only two HCC models were fortunate to identify the highly sensitive model, LIV78. To go beyond the simple diagnostic hypothesis that focused on Jag1, Notch2 and Hes1 expression, we took an unbiased approach to discover features linked to Notch dependence. This approach meant running an efficacy screen in all liver cancer models that were up and running at our chosen research organization, without biased selection criteria. That set of models is what is represented in the “pre-clinical screen” in Fig. 1B.
"super-responder" is not a meaningful term, I would eliminate this use as it has no clinical or scientific convention that I am aware of.
We were aware of the interchangeable terms “exceptional-” or “super-responder” and prefer to leave this language in the text. Some references are as follows:
● Prasad et al., Characteristics of exceptional or super responders to cancer drugs. Mayo Clinic Proceedings, 2015.
● NCI Press Release 2020: https://www.cancer.gov/news-events/press-releases/2020/cancer-exceptional-responders-study-genetic-alterations-may-contribute
● NIH Info: https://www.nih.gov/news-events/nih-research-matters/understanding-exceptional-responders-cancer-treatment
● “What is a Super Responder?” Bradley Jones, Cancer Today, June 26, 2020.
● “What is a Super Responder?” AACR. https://www.aacr.org/patients-caregivers/progress-against-cancer/what-is-a-super-responder/
The "expansion" of the PDX screen is poorly described. Why weren't these PDXs included in the first screen? This is quite odd as the responses in the initial screen were underwhelming. What was the denominator number of all PDXs that were assessed for JAG1 and NOTCH2 expression? This is important as it clarifies how relevant JAG1 inhibition would be to an unselected HCC population.
We will revise the writing here to clarify as requested. For now, we can hopefully clarify by building on the historical context described above. As the reviewer notes and as we describe in the text, the in vivo screen revealed only a modest JAG1 dependence. The screen also highlighted that LIV78 was exceptional, and we wanted to understand why. Hypothesizing that the expression of progenitor markers in LIV78 was important for understanding its JAG1 dependence, we identified four additional models at other contract research organizations. It is this set of four that comprises the “expansion” cohort.
Was there some kind of determination of the optimal dose or dose dependency for the JAG1 ab? The original description of the JAG1 ab was in mouse lungs, not malignant or liver cells. In addition, supplementary Figure 2D is missing. There needs to be data provided on the specificity of the human-specific JAG1 ab and the anti-NOTCH2 ab. I'm not familiar with these ab, and if they are not publicly accessible reagents, more transparency on this is needed. In addition, given the reliance of the entire paper on these antibodies, I would recommend orthogonal approaches (either chemical or genetic) to confirm the sensitivity and insensitivity of select PDXs to Notch inhibition.
First, we note that the anti-human/mouse Jagged1 and Notch2 blocking antibodies used in our study have been extensively characterized as potent and selective and have been widely used outside of our group by the Notch research community (for the human/mouse cross-reactive antibodies, see Wu et al., Nature, 2010 for anti-NOTCH2 and Lafkas et al., Nature 2015 for anti-JAG1). As noted, the antibodies have been used in studies of normal mouse lungs (Lafkas et al.). Please note that the characterization also includes mouse models of primary liver cancer that formed the foundation for the current work (please refer to Huntzicker et al, 2015).
While we show dose responses in Figures 1A and 1D, we have not optimized dosing, for example by determining the minimal drug exposures needed for pharmacodynamic changes (pathway inhibition) and efficacy. For the purposes of this study, we erred on the side of dosing at high concentrations to minimize the risk of false negative responses.
Regarding the specificity of the human-specific anti-JAG1 antibody, which is revealed here for the first time, we apologize that we incorrectly provided a text reference to Supplementary Figure 2D instead of Supplementary Figure 1D. We will revise accordingly. Supplementary Fig. 1D shows results from a reporter assay demonstrating that the antibody blocks signaling induced by human but not mouse JAG1.
We appreciate the value of orthogonal methods in establishing the credibility of a novel finding. We note that genetic approaches are technically highly challenging in PDX models. Chemically, we could have tested γ-secretase inhibitors (GSIs). Our position is that such inhibitors are poor substitutes for the selective antibodies that we employed, at least for addressing the questions that are relevant in this study. Although commonly used to perturb Notch signaling, GSIs target numerous proteins and signaling cascades independent of Notch. Moreover, their use in vivo leads to intestinal and other toxicities, limiting exposure.
scRNA-seq data seems to add little to the paper and there is no follow-up of the findings. Are the low-expressing JAG1 cells eventually enriched in treated tumors contributing to disease recurrence?
We respectfully disagree with this sentiment. The single-cell RNA sequencing dataset revealed the enrichment of hepatocyte-like tumor cells following Notch inhibition. Importantly, this dataset also allowed us to identify transcription factor activities regulating different cell states, which we could not have done otherwise. This understanding in turn was fundamental to developing our hypothesis that Notch inhibition, through derepressing CEBPA expression, allows chromatin engagement of HNF4A and CEBPA and thereby promotes a hepatocyte differentiation program that is not compatible with tumor maintenance.
The discussion should be tempered. The finding of only 2 PDXs that are sensitive out of 45+ tumors treated or selected for indicates that JAG1/NOTCH2 inhibition is likely only effective in rare HCC.
We agree that strong responses to Notch inhibition in the PDX models are rare (~5%) and state as much in both the Results and Discussion sections. We maintain that it is important to put this PDX response frequency into a larger context. First, establishing PDX models (human tumor samples that grow on the flanks of immunocompromised mice) represents a strong selective pressure. In other words, we don’t know precisely how the frequency of responses in this selected set of PDX models may compare to the frequency that would be observed in human patient populations. Second, the magnitude of the response points to important and hitherto unappreciated biology, with blocking JAG1 or NOTCH2 reproducibly inducing regressions in the most sensitive models. Our hope is that the field can build from this study to generate diagnostic tools that identify sensitive patient tumors, define the true frequency of this patient group within the larger HCC population (even though likely rare), and direct the relevant Notch-based therapeutics to these patients. Within this context, and while noting the rarity of PDX responses, we hope that we have not overstated the case.
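As an illustrative aside that is not part of the authors' analysis, the uncertainty around a response frequency of roughly 5% can be sketched with a Wilson score interval. The counts below are assumptions: the reviewer mentions 2 sensitive PDXs out of "45+" tumors, so 45 is used only as a placeholder denominator, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds approximately to a 95% confidence level.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


# Assumed, illustrative counts (the exact denominator is not stated in the text).
SENSITIVE_MODELS = 2
TESTED_MODELS = 45

low, high = wilson_interval(SENSITIVE_MODELS, TESTED_MODELS)
print(f"observed: {SENSITIVE_MODELS / TESTED_MODELS:.1%}")
print(f"approx. 95% interval: {low:.1%} to {high:.1%}")
```

Under these assumed counts the interval spans roughly 1% to 15%, which only describes sampling uncertainty within the screened PDX panel; it does not address the selection pressure of PDX establishment discussed above.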
Reviewer #2 (Public review):
Summary:
The authors used a large panel of hepatocellular carcinoma patient-derived xenograft models to test the hypothesis that the developmental dependence of the liver on Jagged1-Notch2 signaling is retained in at least a subset of hepatocellular carcinomas. This led to the identification of two models that were extraordinarily sensitive to well-characterized, specific inhibitory antibodies against Jagged1 or Notch2. Based on additional analyses in these in vivo models, the authors provide compelling evidence that the response is due to the inhibition of human Notch2 and human Jagged1 on tumor cells and that this inhibition leads to a change in gene expression from a progenitor-like state to a hepatocyte-like state accompanied by cell cycle arrest. This change in cell state is associated with up-regulation of HNF4a and CEBPB and increased accessibility of predicted HNF4a and CEBPB genomic binding sites, accompanied by loss of accessibility to sequences predicted to bind TFs linked to multipotent liver progenitors. The authors put forth a plausible model in which inhibition of Notch2 downregulates transcriptional repressors of the Hairy/Enhancer of Split family, leading to increased expression of CEBPB and changes in gene expression that drive hepatocyte differentiation.
Strengths:
The strengths of the paper include the breadth of the preclinical screen in PDX models (which may be of an unprecedented size as preclinical trials go), the high quality of the well-characterized antibodies used as therapeutics and as biological perturbagens, the quality of the data and data analysis, and the authors' balanced discussion of the strengths and weaknesses of their findings.
Weaknesses:
The principal weakness is the inability to clearly demonstrate the "translatability" of the PDX findings to primary human hepatocellular carcinoma.
We agree that translatability has not been fully addressed. As noted in our response to Reviewer 1, our hope is that the field can build from this study to generate diagnostic tools that identify sensitive patient tumors, define the true frequency of this patient group within the larger HCC population, and direct the relevant Notch-based therapeutics to these patients. We remain encouraged by the strength of the response in the sensitive models.
Additional Comments:
Hepatocellular carcinoma is increasing in frequency and is difficult to treat; cure is only possible through early diagnosis and surgery, often in the form of liver transplantation. It is also a common cancer, and so even if only 5% of tumors (a value based on the frequency of super-responders in this preclinical trial) fall into the Jagged1-Notch2 group defined by Seidel et al., the development of an effective therapy for this subgroup would be a very important advance. The chief limitation of their work is that it stops short of identifying primary human hepatocellular carcinomas that correspond to the super-responder PDX models. It can be hoped that their intriguing observations will spur work aimed at filling this gap.
There are several other loose ends. An unusual feature of this model is that both Jagged 1 and Notch2 are expressed in the same cells, and even in the same individual cells. In developmental systems, the expression of ligands and receptors in the same cell generally produces receptor inhibition rather than activation, a phenomenon described as cis inhibition. Their super-responder tumor models appear to break this rule, and how and why this is so remains to be understood. A follow-up question is what explains the observed heterogeneity in tumor cells, both at the level of Notch2 activation and scRNAseq clustering, and whether these different cell states are static or interchangeable.
We enthusiastically agree that these are fascinating questions, worthy of further study. As noted, the majority of tumor cells express both ligand and receptor and seem to be “on” for Notch signaling. We have not been able to determine whether the signal is induced in a cell autonomous or non-autonomous manner (or both). As the reviewer notes, the HCC features we observe are inconsistent with the dogma that has arisen from studies on Notch signaling in developmental contexts.
We do not yet have the experimental data to fully address the second question of what causes the heterogeneity of Notch2 activation and scRNAseq clustering. We speculate that the cell states may be dynamic, which would be consistent with the changes in cell populations observed after antibody treatment.
Another unanswered issue pertains to the nature of the tumor response to Notch signaling blockade, which appears to be mainly cell cycle arrest. There are a number of human tumors with cell autonomous Notch signaling due to gain of function Notch receptor mutations that also respond to Notch blockade with cell cycle arrest, such as T cell acute lymphoblastic leukemia (T-ALL). In general, clinical trials of pan-Notch inhibitors such as gamma-secretase inhibitors have been disappointing in such tumors, perhaps reflecting a limitation of treatments with significant toxicity that do not kill tumor cells directly. It could be argued that this limitation will be mitigated by the apparently excellent safety profile of Notch2 blocking antibody, which perhaps could be administered for a sustained period, akin to the use of tyrosine kinase inhibitors in chronic myeloid leukemia, but this remains to be determined.
We agree that a full understanding of the tumor response warrants further investigation. Like the reviewer, we speculate that the improved safety profile of selective antibodies relative to pan-Notch inhibitors may enable greater and sustained therapeutic coverage of Notch inhibition than has been feasible in T-ALL trials. Given that in the sensitive PDX models we observe rapid tumor regressions, not just stasis, it would seem to follow that the mechanism underpinning the tumor response involves more than just cell cycle blockade. Whether tumor shrinkage reflects additional cell death mechanisms or simply tumor cell turnover after cell cycle arrest remains to be determined.
A minor comment is reserved for the statement in the discussion that "In chronic myelomonocytic leukemia, which results from an inactivating mutation in the γ-secretase complex component nicastrin, Notch signaling has a tumor suppressive function, that is mediated through direct repression of CEBPA and PU.1 by HES1 (Klinakis et al., 2011)". Thousands of cases of CMML and related myeloid tumors have been subjected to whole exome and even whole genome sequencing without the identification of Notch signaling pathway mutations. Thus, an important tumor suppressive role for Notch, mediated through HES1, in myeloid tumors is not proven.
We agree that our sentence about Notch and CMML does not fit well with the prevalent paradigm established by genome wide sequencing and other methods. We will edit this paragraph accordingly, focusing on Hes1 negative regulation of CEBPA in myeloid fate control and how that shapes our thinking on molecular mechanisms in the Notch-dependent HCCs.
Reviewer #3 (Public review):
Summary:
Notch is active in HCC, but generally not mutated. The authors use a JAG1-selective blocking antibody in a large panel of liver cancer patient-derived xenograft models. They find JAG-dependent HCCs, and these are aggressive and proliferative. Notch inhibition induces cycle arrest and promotes hepatocyte differentiation, through upregulation of CEBPA expression and activation of existing HNF4A, mimicking normal developmental programs.
The authors use aJ1.b70, a potent and selective therapeutic antibody that inhibits JAG1 against PDX models. They tested over 40 PDX models and found a handful of super-responders to single-agent inhibition. In LIV78 and Li1035 cancer cells, NOTCH2 was expressed and required, in contrast to NOTCH1. RNA-seq showed that the responsive HCCs resembled the S2 transcriptional class of HCCs, which were enriched for Notch-dependent models. They conclude that these dependent tumors have transcriptomes that resemble a hybrid progenitor cell expressing FGF9 and GAS7. Inhibition was able to induce hepatocyte differentiation away from a NOTCH-driven progenitor program. scRNA-seq analysis showed a large population of NOTCH-JAG expressing cells but also showed that there are cells that did not. Not surprisingly, NOTCH2 inhibition leads to increased CEBPA and HNF4A transcriptional activity, which are standard TFs in hepatocytes.
Strengths:
The paper provides useful information about the frequency of HCCs and CCA that respond to NOTCH inhibition and could allow us to anticipate the super-responder rate if these antibodies were actually used in the clinic. The inhibitor tools are highly specific, and provide useful information about NOTCH activities in liver cancers. The large number of PDXs and the careful transcriptomic analyses were positives about the study.
Weaknesses:
The paper is mostly descriptive.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
This article investigates the phenotype of macrophages with a pathogenic role in arthritis, particularly focusing on arthritis induced by immune checkpoint inhibitor (ICI) therapy.
Building on prior data from monocyte-macrophage coculture with fibroblasts, the authors hypothesized a unique role for the combined actions of prostaglandin PGE2 and TNF. The authors studied this combined state using an in vitro model with macrophages derived from monocytes of healthy donors. They complemented this with single-cell transcriptomic and epigenetic data from patients with ICI-RA, specifically, macrophages sorted out of synovial fluid and tissue samples. The study addressed critical questions regarding the regulation of PGE2 and TNF: Are their actions co-regulated or antagonistic? How do they interact with IFN-γ in shaping macrophage responses?
This study is the first to specifically investigate a macrophage subset responsive to the PGE2 and TNF combination in the context of ICI-RA, describes a new and easily reproducible in vitro model, and studies the role of IFN-γ regulation of this particular Mφ subset.
Strengths:
Methodological quality: The authors employed a robust combination of approaches, including validation of bulk RNA-seq findings through complementary methods. The methods description is excellent and allows for reproducible research. Importantly, the authors compared their in vitro model with ex vivo single-cell data, demonstrating that their model accurately reflects the molecular mechanisms driving the pathogenicity of this macrophage subset.
Weaknesses:
Introduction: The introduction lacks a paragraph providing an overview of ICI-induced arthritis pathogenesis and a comparison with other types of arthritis. Including this would help contextualize the study for a broader audience.
Thank you for this suggestion. We will add a paragraph on ICI-arthritis to the Introduction.
Results Section: At the beginning of the results section, the experimental setup should be described in greater detail to make an easier transition into the results for the reader, rather than relying just on references to Figure 1 captions.
We will clarify the experimental setup.
There is insufficient comparison between single-cell RNA-seq data from ICI-induced arthritis and previously published single-cell RA datasets. Such a comparison may include DEGs and GSEA, pathway analysis comparison for similar subsets of cells. Ideally, an integration with previous datasets with RA-tissue-derived primary monocytes would allow for a direct comparison of subsets and their transcriptomic features.
This is a great idea. We will integrate the data sets and, if batch correction is successful, present this analysis.
While it's understandable that arthritis samples are limited in numbers and myeloid cell numbers, it would still be interesting to see the results of PGE2+TNF in vitro stimulation on the primary RA or ICI-RA macrophages. It would be valuable to see RNA-Seq signatures of patient cell reactivation in comparison to primary stimulation of healthy donor-derived monocytes.
We agree that this would be interesting but given limited samples and distribution of samples amongst many studies and investigators this is beyond the scope of the current study.
Discussion: Prior single-cell studies of RA and RA macrophage subpopulations from 2019, 2020, 2023 publications deserve more discussion. A thorough comparison with these datasets would place the study in a broader scientific context.
Creating an integrated RA myeloid cell atlas that combines ICI-RA data into the RA landscape would be ideal to add value to the field.
As one of the next research goals, TNF blockade data in RA and ICI-RA patients would be interesting to add to such an integrated atlas. Combining responders and non-responders to TNF blockade would help to understand patient stratification with the myeloid pathogenic phenotypes. It would be great to read the authors' opinion on this in the Discussion section.
We will be happy to improve the discussion by including these topics.
Conclusion: The authors demonstrated that while PGE2 maintains the inflammatory profile of macrophages, it also induces a distinct phenotype in simultaneous PGE2 and TNF treatment. The study of this specific subset in single-cell data from ICI-RA patients sheds light on the pathogenic mechanisms underlying this condition, however, how it compares with conventional RA is not clear from the manuscript.
Given the substantial incidence of ICI-induced autoimmune arthritis, understanding the unique macrophage subsets involved for future targeting them therapeutically is an important challenge. The findings are significant for immunologists, cancer researchers, and specialists in autoimmune diseases, making the study relevant to a broad scientific audience.
Reviewer #2 (Public review):
Summary/Significance of the findings:
The authors have done a great job by extensively carrying out transcriptomic and epigenomic analyses in the primary human/mouse monocytes/macrophages to investigate TNF-PGE2 (TP) crosstalk and their regulation by IFN-γ in the Rheumatoid arthritis (RA) synovial macrophages. They proposed that TP induces inflammatory genes via a novel regulatory axis whereby IFN-γ and PGE2 oppose each other to determine the balance between two distinct TNF-induced inflammatory gene expression programs relevant to RA and ICI-arthritis.
Strengths:
The authors have done a great job on RT-qPCR analysis of gene expression in primary human monocytes stimulated with TNF, and in showing the effects of selective agonists of the PGE2 receptors EP2 and EP4, which signal predominantly via cAMP. They have beautifully shown that IFN-γ opposes the effects of PGE2 on TNF-induced gene expression. They found that TP signature genes are activated by cooperation of PGE2-induced AP-1, CEBP, and NR4A with TNF-induced NF-κB activity. On the other hand, they found that IFN-γ suppressed induction of AP-1, CEBP, and NR4A activity to ablate induction of IL-1, Notch, and neutrophil chemokine genes, but promoted expression of distinct inflammatory genes such as TNF and T cell chemokines like CXCL10, indicating that TP induces inflammatory genes via IFN-γ in RA and ICI-arthritis.
Weaknesses:
(1) The authors carried out most of the assays in monocytes/macrophages. How do antigen-presenting cells such as dendritic cells behave under this TP treatment at similar dosing?
We agree that this is an interesting topic especially as TNF + PGE2 is one of the standard methods of maturing in vitro generated human DCs. As DC maturation is quite different from monocyte activation this would represent an entire new study and is beyond the scope of the current manuscript. We will instead describe and cite the literature on DC maturation by TNF + PGE2 including one of our older papers (PMID: 18678606; 2008)
(2) The authors studied the transcriptome and epigenome at 3h and 24h post-treatment. What happens to TP-induced inflammatory genes at 12h, 36h, 48h, and 72h post-treatment? It is critical to see whether the upregulated/downregulated genes normalise or stay the same throughout the innate immune response.
We will clarify that the gene response is mostly subsiding at the 24 hour time point, which is in line with in vitro stimulation of primary monocytes in other systems.
(3) The authors showed IL1-axis in response to the TP-treatment. Do other cytokine axes get modulated? If yes, then how do they cooperate to reduce/induce inflammatory responses along this proposed axis?
We will analyze the data for other pathways that are modulated.
Overall, the data looks good and acceptable but I need to confirm the above-mentioned criticisms.
-
-
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
This manuscript presents evidence of ‘vocal style’ in sperm whale vocal clans. Vocal style was defined as specific patterns in the way that rhythmic codas were produced, providing a fine-scale means of comparing coda variations. Vocal style effectively distinguished clans similar to the way in which vocal repertoires are typically employed. For non-identity codas, vocal style was found to be more similar among clans with more geographic overlap. This suggests the presence of social transmission across sympatric clans while maintaining clan vocal identity.
Strengths:
This is a well-executed study that contributes exciting new insights into cultural vocal learning in sperm whales. The methodology is sound and appropriate for the research question, building on previous work and ground-truthing much of their theories. The use of the Dominica dataset to validate their method lends strength to the concept of vocal style and its application more broadly to the Pacific dataset. The results are framed well in the context of previous works and clearly explain what novel insights the results provide to the current understanding of sperm whale vocal clans. The discussion does an overall great job of outlining why horizontal social learning is the best explanation for the results found.
Weaknesses:
The primary issues with the manuscript are in the technical nature of the writing and a lack of clarity at times with certain terminology. For example, several tree figures are presented and ‘distance’ between trees is key to the results, yet ‘distance’ is not clearly defined in a way for someone unfamiliar with Markov chains to understand. However, these are issues that can easily be dealt with through minor revisions with a view towards making the manuscript more accessible to a general audience.
I also feel that the discussion could focus a bit more on the broader implications - specifically what the developed methods and results might imply about cultural transmission in other species. This is specifically mentioned in the abstract but not really delved into in detail during the discussion.
We are grateful for the Reviewer’s recognition of the study’s contributions to understanding cultural vocal learning in sperm whales. In response to the concerns regarding clarity and accessibility, we have revised the manuscript to improve the definition of key concepts, such as the notion of “distance” between subcoda trees. This adjustment ensures clarity for readers unfamiliar with the technical details of Markov chains. Additionally, we have expanded the discussion to highlight broader implications of our findings, particularly their relevance to understanding cultural transmission in other species, as suggested.
Reviewer #2 (Public review):
Summary:
The current article presents a new type of analytical approach to the sequential organisation of whale coda units.
Strengths:
The detailed description of the internal temporal structure of whale codas is something that has been thus far lacking.
Weaknesses:
It is unclear how the insight gained from these analyses differs or adds to the voluminous available literature on how codas varies between whale groups and populations. It provides new details, but what new aspects have been learned, or what features of variation seem to be only revealed by this new approach? The theoretical basis and concepts of the paper are problematical and indeed, hamper potentially the insights into whale communication that the methods could offer. Some aspects of the results are also overstated.
We appreciate the Reviewer’s acknowledgment of the novelty in describing the internal temporal structure of whale codas. Regarding the concern about the unique contributions of this approach, we have further emphasized in the revised manuscript how our methodology reveals previously uncharacterized dimensions of coda structure. Specifically, our work highlights how non-identity codas, which have received limited attention, play a significant role in inter-clan acoustic interactions. By leveraging Variable Length Markov Chains, we provide a nuanced understanding of coda subunits that complements existing studies and demonstrates the value of this analytical approach.
Reviewer #3 (Public review):
Summary:
The study presented by Leitao et al. represents an important advancement in comprehending the social learning processes of sperm whales across various communicative and socio-cultural contexts. The authors introduce the concept of “vocal style” as an addition to the previously established notion of “vocal repertoire,” thereby enhancing our understanding of sperm whale vocal identity.
Strengths:
A key finding of this research is the correlation between the similarity of clan vocal styles for non-ID codas and spatial overlap (while no change occurs for ID codas), suggesting that social learning plays a crucial role in shaping symbolic cultural boundaries among sperm whale populations. This work holds great appeal for researchers interested in animal cultures and communication. It is poised to attract a broad audience, including scholars studying animal communication and social learning processes across diverse species, particularly cetaceans.
Weaknesses:
In terms of terminology, while the authors use the term ”saying” to describe whale vocalizations, it may be more conservative to employ terms like ”vocalize” or ”whale speech” throughout the manuscript. This approach aligns with the distinction between human speech and other forms of animal communication, as outlined in prior research (Hockett, 1960; Cheney & Seyfarth, 1998; Hauser et al., 2002; Pinker & Jackendoff, 2005; Tomasello, 2010).
We thank the Reviewer for recognizing the importance of our findings and their appeal to broader audiences interested in animal cultures and communication. In response to the suggestion regarding terminology, we have adopted a more conservative language to align with distinctions between human and non-human communication systems. For example, terms like “vocalize” and “vocal repertoire” are used in place of anthropomorphic terms such as “saying”. This ensures consistency with established conventions while maintaining clarity for a broad readership.
Reviewer #1 (Recommendations):
Comment 1
Lines 11-13: As mentioned above, the implications for comparing communication systems and cultural transmission in other species isn’t really discussed much and I think it’s a really interesting component of the study’s broader implications.
Thank you for the comment.
Action - We added a few more sentences to the discussion regarding this.
Comment 2
Figure 1: More information on the figure of these trees would help. What do the connecting lines represent? What do the plain black dots and the black dot with the white dot represent? Especially since the ”distance between trees” is a key result, it’s important that someone unfamiliar with Markov chains can understand the basics of how this is calculated and what it represents. It is explained in the methods, but a brief explanation here would make the results and the figure a lot clearer since the methods are the last section of the manuscript.
These were omitted as we believed that attempting to introduce the mathematical structure and the methodology to compare two instances, in a figure caption, would have caused more ambiguity than necessary.
Action - Added an informal introduction to these concepts on the figure caption. Also added a pointer to the Supplementary Materials.
Comment 3
Table 1: A definition of dICIs should be included here.
Added the definition of discrete ICI to the table.
Comment 4
Figure 2: The placement of the figures is a bit confusing because they are quite far from the text that references them.
We thank the reviewer for pointing this out. We tried to edit the manuscript to improve this issue, but this part of the editing is more within the journal’s powers than our own.
Action - Moved images closer to the corresponding text in the manuscript.
Comment 5
Line 117: Probabilistic distance needs to be briefly explained earlier when you first mention distance (see Lines 11-13 comments).
Action - Clarifications added in the caption of Figure 1, as per the comment on Lines 11-13.
Comment 6
Figure 4: Is order considered in these pairwise comparisons? It looks like there are two dots for each pairwise comparison. Additionally, why is the overlap different in these two comparisons? For example, short:four-plus has an overlap of 0.6, while four-plus:short has an overlap of 0.95.
The x-axis of the plots in Figure 4 is geographical clan overlap. This is calculated as per (Hersh et al., 2022) and is described in our Methods (see “Measuring clan overlap” section). Given two clans—for example, the Four-Plus and the Short clan—spatial overlap is calculated twice: as the proportion of the Four-Plus clan’s repertoires that were recorded within 1,000 km of at least one of the Short clan’s repertoires, and as the proportion of the Short clan’s repertoires that were recorded within 1,000 km of at least one of the Four-Plus clan’s repertoires.
Order is important in these pairwise comparisons and generates an asymmetric matrix because the clans have different spatial extents. A clan found in only one small region might overlap completely with a clan that spans the Pacific Ocean, while the opposite is not true. For example, the Short clan spans the Pacific Ocean while the Four-Plus clan has been documented over a smaller area (but that smaller area overlaps extensively with the Short clan range). That is why the value is smaller (0.6) when considering how much of the Short clan’s range is shared with the Four-Plus clan, and larger (0.95) when considering how much of the Four-Plus clan’s range is shared with the Short clan.
Action - We have now added a reference to that section of the Methods in our Figure 4 caption and include the clan spatial overlap matrix as a supplemental table (Table S5).
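The directional overlap described above can be illustrated with a minimal Python sketch. The recording coordinates, the clan names `wide_clan` and `local_clan`, and the use of a haversine great-circle distance are assumptions made for illustration only; the actual procedure is the one given in Hersh et al. (2022) and in the Methods.

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def directional_overlap(clan_a, clan_b, radius_km=1000.0):
    """Proportion of clan_a's repertoires recorded within radius_km
    of at least one of clan_b's repertoires."""
    near = sum(
        any(haversine_km(a, b) <= radius_km for b in clan_b)
        for a in clan_a
    )
    return near / len(clan_a)

# Hypothetical repertoire locations (lat, lon): one clan spread along
# the equator, one clan recorded in a single small region.
wide_clan = [(0.0, -90.0), (0.0, -120.0), (0.0, -150.0), (0.0, 180.0)]
local_clan = [(1.0, -91.0), (-1.0, -89.0)]

print(directional_overlap(local_clan, wide_clan))  # share of the local clan near the wide clan
print(directional_overlap(wide_clan, local_clan))  # share of the wide clan near the local clan
```

Because the denominator is the first clan's own number of repertoires, swapping the arguments generally changes the result, which is why the overlap matrix is asymmetric.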
Comment 7
Figure 4: I think the reference should be Hersh et al. [11].
Thank you for catching this.
Action - Reference corrected
Comment 8
Line 227: What aspect of your analysis looked at how often codas were produced? You mention coda frequency, but it is unclear how this was incorporated into your analysis. If this is included in the methods, the language is a bit too technical to easily parse it out.
Indeed here we are referencing the results of the paper mentioned in the previous line. We do not look at coda production frequency.
Action - Added citation to paper that actually performs this analysis.
Comment 9
Lines 253-255: I think you could dig into this a little more, as ”there is currently no evidence” is not the most convincing argument that something is not a driver. Perhaps expanding on the latter sentence that clans are recognizable across oceans basins would be helpful. Does this suggest that clans with similar geographic overlap experience diverse environmental conditions across ocean basins? If so, this might better strengthen your argument against environmental drivers.
Thank you for pointing this out. We feel that the next sentence highlights that clans are recognizable across environmental variation from one side to the other of the ocean basin, which supports the inductive reasoning that codas do not vary systematically with environment. However, we have edited these sentences for clarity.
Comment 10
Lines 311-314: It would also be interesting to look at vocal style across non-ID coda types. Are some more similar to each other across clans than others? Perhaps vocal style can further distinguish types of non-ID codas.
In Supplementary Materials 3.4.2 and 3.5 we highlight our results when the codas are separated by coda type, summarized in Table S4. We do compare vocal style across non-ID coda types, both across clans and within the same clan. The results, however, are aggregated to highlight the differences in style between the clans, and a coda-type-only comparison is not shown.
Comment 11
Lines 390-392: I’m assuming this is why pairwise comparisons were directional (i.e., there was both an A:B and a B:A comparison)? Can you speak to why A:B and B:A comparisons can have such different overlap values?
Given two clans—for example, the Four-Plus and the Short clan—spatial overlap is calculated twice: as the proportion of the Four-Plus clan’s repertoires that were recorded within 1,000 km of at least one of the Short clan’s repertoires, and as the proportion of the Short clan’s repertoires that were recorded within 1,000 km of at least one of the Four-Plus clan’s repertoires.
Order is important in these pairwise comparisons and generates an asymmetric matrix because the clans have different spatial extents. A clan found in only one small region might overlap completely with a clan that spans the Pacific Ocean, while the opposite is not true. For example, the Short clan spans the Pacific Ocean while the Four-Plus clan has been documented over a smaller area (but that smaller area overlaps extensively with the Short clan range). That is why the value is smaller (0.6) when considering how much of the Short clan’s range is shared with the Four-Plus clan, and larger (0.95) when considering how much of the Four-Plus clan’s range is shared with the Short clan.
Action - We now include the clan spatial overlap matrix as a supplemental table (Table S5).
Comment 13
Line 56: Can you briefly explain what memory means in the context of Markov chains?
We provide an explanation of the meaning of memory in the Methods section on “Variable length Markov Chains”. Briefly, the memory in this case means how many states in the past of the Markov chain’s current state are required to predict the next transition of the chain itself. Standard Markov chains “look” back only one time step, while k-th order Markov chains look back k steps. In our case, there was no reason to assume that the memory required to predict different sequences of states (interclick intervals) should be the same across all sequences, and thus we adopted the formalism of variable length Markov chains, which allow for different levels of memory across the system.
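To make the notion of variable memory concrete, the following Python sketch predicts the next state from the longest context that has been retained. It is an illustration under stated assumptions, not the method used in the manuscript: the function names `fit_contexts` and `next_distribution`, the toy symbols, and the simple count threshold used to decide which contexts to keep are all hypothetical stand-ins for the pruning criterion of a real variable length Markov chain.

```python
from collections import Counter, defaultdict

def fit_contexts(sequences, max_order=2, min_count=2):
    """Count next-symbol occurrences for every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, nxt in enumerate(seq):
            for k in range(max_order + 1):
                if i - k < 0:
                    break
                counts[tuple(seq[i - k:i])][nxt] += 1
    # Assumption: keep a context only if it was seen often enough.
    # The empty context (no memory) is always kept as a fallback.
    return {ctx: c for ctx, c in counts.items()
            if ctx == () or sum(c.values()) >= min_count}

def next_distribution(contexts, history):
    """Return (context used, next-symbol probabilities) for the longest
    retained suffix of history, so memory length varies with the history."""
    for k in range(len(history), -1, -1):
        ctx = tuple(history[len(history) - k:])
        if ctx in contexts:
            total = sum(contexts[ctx].values())
            return ctx, {s: n / total for s, n in contexts[ctx].items()}
    return (), {}

# Toy sequence of discretised states (hypothetical symbols).
contexts = fit_contexts([["a", "b", "a", "b", "a", "b"]])

print(next_distribution(contexts, ["a", "b"]))  # two states of memory are available
print(next_distribution(contexts, ["c", "a"]))  # falls back to one state of memory
print(next_distribution(contexts, ["c"]))       # falls back to no memory
```

A fixed k-th order chain would always condition on exactly k past states; here the amount of history used depends on which contexts were retained.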
Comment 14
Supplementary Figure S3: Like in the main manuscript, briefly explain or remind us what the blank nodes and the yellow nodes are.
Action - Clarified that the orange node represents the root of the tree in the figures.
Comment 15
Supplementary Figure S7: Put the letters before the dataset name.
Action - Done.
Comment 16
Supplementary Figure S10: Unclear what ’inner vs outer’ means.
One specifies comparisons across clans (outer) and the other within the same clan (inner)
Action - Added clarification on the caption of Figure S10
Comment 17
Supplementary Figure S14: Include a-c labels in the figure itself.
Action - Labels added to figure
Comment 18
Supplementary Figure S14: The information about the nodes is what needs to be included earlier and in the main body when discussing the trees.
Action - Added the explanation earlier in the text and in the main body
Reviewer #2 (Recommendations):
Comment 19
Line 22: ”Symbolic” and ”Arbitrary” are not synonyms. Please see the comment above.
We agree. Here, we make the point that the evolution of symbolic markers of group identity can be explained from what are initially arbitrary, and meaningless, signals (see [L1, L2]). Our point is that any vocalization, any coda, could have been selected for as an identity coda, become symbolic, and evolved to play a key role in cultural group formation and in-group favoritism, because such markers enable a community of individuals to solve the problem of with whom to collaborate. The specific coda itself does not affect collaborative payoffs, but group-specific differences in behavior can. As such, the coda is arguably symbolic: it is observable and recognizable, and can serve as a means for social assortment even when the behavioural differences are not. This can explain the social segregation that is observed among behaviorally distinct clans of sperm whales. However, in this manuscript, we do not extend this discussion of the existing literature and have attempted to describe it concisely in a couple of lines, which clearly do a disservice to the large body of literature on the evolution of symbolic markers and human ethnic groups. We have added some citations to this section so that the reader may follow up should they disagree with our brief introductory statements.
Action - Added citations and pointers to the literature.
Comment 20
Line 24: The authors’ terminology around “markers”, “arbitrary”, “symbolic” is unnecessarily confusing and mystifying, giving the impression these terms are interchangeable. They are not. These terms are an integral and long-established part of key definitions in signal theory. Term use should be followed accordingly. The observation that whale vocal signals vary per population does not necessarily mean that they function as a social tag. The word “dog” varies per population but its use relates to an animal, not the population that utters the word. “Dog” is not “symbolic” of England, English-speaking populations or the English language. Furthermore, the function of whale vocal signals is extremely challenging to determine. In the best conditions, researchers can pin down the signal’s context; this is distinct from the signal’s function and further still from the signal’s meaning. How exactly the authors determine that whale vocal signals are arbitrary is, thus, perplexing, given that this would require a detailed description and understanding of who is producing the song, when, towards whom, and how the receivers react, none of which the authors have and without which no claim on the signals’ function can be made. This terminological laxness and the sensu lato in extremis applied to various terms is unjustified, unnecessary and unhelpful.
We use these terms as established in Hersh et al. 2022 and the works leading up to it over the last 20 years in the study of sperm whales. These are often derived from definitions in Boyd and Richerson’s work on culture in humans and animals, along with the evolution of symbolic markers both in theory and in humans. We agree with the reviewer that these are difficult to establish in non-humans, whales or otherwise, but feel strongly that the accumulating evidence provides strong support for the function of these signals as symbolic markers of cultural groups, and that they likely evolved from initially arbitrary calls which were a part of the vocal repertoire (similar to the process and selective environment in Efferson et al. [L1] and McElreath et al. [L2]). We feel that we do not use these terms interchangeably here, and have inherited their use from definitions from anthropology. The work presented here uses terminology built across two decades of work on cetacean, and sperm whale, culture, and we do not feel that these terms should be omitted here.
Comment 21
Lines 21-27: Overly broad and hazy paragraph.
We hope the replies above and our changes satisfy this comment and clarify the text.
Comment 22
Figure 1 legend: What are ”memory structures”? Unjustified descriptor.
The phrase was chosen to give some intuition for the variation of context length in variable length Markov models.
Action - Re-worded from memory structures to statistical properties
Comment 23
Line 30: Omit ”finite”.
Action - Omitted.
Comment 24
Line 31: Please define and distinguish ”rhythm” and ”tempo”. Also see comment above, rhythm and tempo definitions require the use of IOIs.
We disagree with the reviewer’s claims here. In our research specifically, and for sperm whale research generally, coda inter-click intervals (ICIs) are calculated as the time between the start of the first click and the start of the subsequent click. This makes ICIs identical to inter-onset intervals (IOIs) under all definitions we are aware of. For example, Burchardt and Knornschild [L3] define IOIs as such: “In a sequence of acoustic signals, the time span between the start of an element and the next element, comprising the element duration and the following gap duration”. We now include a sentence making this point.
Regardless, we disagree on a more fundamental level with the statement that unless researchers quantify inter-onset intervals (IOIs), they cannot make any claims about rhythm. There are many studies that investigate rhythmic aspects of human and animal vocalizations without using IOIs [L4–L7]. If the duration of sound elements of interest is relatively constant (as is the case for sperm whale clicks), then rhythm analyses can still be meaningfully conducted on inter-call intervals (the silent intervals between calls).
For sperm whales, coda rhythm is defined by the relative ICIs standardized by their total duration. These can be clustered into discrete, defined rhythm types based on characteristic ICI patterns. Coda tempo is relative to the total duration of the coda itself. This can also be clustered into discrete tempo types across all coda durations as well (see [L8]).
Action - We added a sentence specifying that in this case we can use both ICIs and IOIs because of the standardized length of a single click.
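As a small illustration of the definitions above, the following Python sketch computes ICIs from click onset times, standardizes them by total coda duration to obtain a rhythm profile, and uses the total duration as the tempo measure. The function name `coda_features` and the onset times are hypothetical, and the clustering of rhythm and tempo into discrete types described in [L8] is not shown.

```python
def coda_features(click_onsets):
    """Return (icis, rhythm, duration) for one coda.

    click_onsets: click start times in seconds, in increasing order.
    icis:     time from the start of each click to the start of the next,
              which is the same quantity as an inter-onset interval (IOI).
    rhythm:   ICIs divided by total coda duration (relative timing).
    duration: first click onset to last click onset (tempo measure).
    """
    icis = [t2 - t1 for t1, t2 in zip(click_onsets, click_onsets[1:])]
    duration = click_onsets[-1] - click_onsets[0]
    rhythm = [ici / duration for ici in icis]
    return icis, rhythm, duration

# Two hypothetical codas with the same relative timing, one twice as slow.
fast = coda_features([0.0, 0.25, 0.5, 1.0])
slow = coda_features([0.0, 0.5, 1.0, 2.0])

print(fast[1] == slow[1])  # same rhythm profile
print(fast[2], slow[2])    # different total durations
```

Under these definitions, two codas can share a rhythm while differing in tempo, which is why the two features are treated separately.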
Comment 25
Line 36: Are there non-vocalized codas to require the disambiguation here?
No, we have omitted for clarity.
Comment 26
Line 44: ”Higher” than which other social group class?
Sperm whales live in a multi-level social organization. Clans are a “higher” level of social organization than the social “units” which we define in line 40. Clans are made up of all units which share similar production repertoire of codas.
Action - We have added ’above social units’ on line 44 to make this clear.
Comment 27
Line 47: The use of “symbolic” continues to be enigmatic, even if authors are taking in this classification from other researchers. In signal theory (semiotics), not all biomarkers are necessarily symbols. I advise the authors to avoid the use of the term colloquially and instead adopt the definition used in the research field within which the study falls in.
There are ample examples of the use of “symbolic” when referring to markers of in-group membership in both human and non-human cultures. Our choice to use the term “symbolic” is based on a previous study [L9] that found quantitative evidence that sperm whale identity codas function as symbolic markers of cultural identity, at least for Pacific Ocean clans. The full reasoning behind why the authors used the term “symbolic markers” is given in that paper, but briefly, they found evidence that identity coda usage becomes more distinct as clan overlap increases, while non-identity coda usage does not change. This matches theoretical and empirical work on human symbolic markers [L1, L2, L10, L11].
Action - We retain the use of the term here, as defined in the works cited, and based on its prior usage in the study of both human and non-human cultures.
Comment 28
Line 50: This statement is not technically accurate. The use of a signal as a marker by individuals can only be determined by how individuals ”interpret” and react to that signal - e.g., via playback experiments - it cannot be determined by how different populations use and produce the signals.
We respectfully disagree. While we agree that the optimal situation would be that of playback, the contextual use can provide insight into the functional use of signals; as can expected patterns of use and variation, as was tested in the papers we cite. However, this argument is not the scope nor the synthesis of this paper. These statements are supported by existing published works, as cited, and we encourage the reviewer to take exception with those papers.
Comment 29
Line 69: ”Meaningful speech characteristics”??? These terms do not logically or technically follow the previous statement. Why not stay faithful to the results and state that the method used seems to be valid and reliable because it confirms former studies and methods?
Action - Reworded to better underline the method’s results with previous studies
Comment 30
Lines 72-74: This statement doesn’t seem to accurately capture/explain/resume the difference between ID and non-ID codas.
We are not sure what the reviewer is referring to in this case. The sentence in this case was meant to explain the different relations that ID/non-ID codas have with clan sympatry.
Comment 31
Line 75: The information provided in the few previous sentences does not allow the reader to understand why these results support the notion that cultural transmission and social learning occurs between clans.
We conclude our introduction with a brief summary of our overall findings, which the rest of the manuscript then supports.
Comment 32
Table 1: So far, the authors refer to their analyses as capturing the “rhythm” of whale clicks. Consequently, it is not readily clear at this point why the authors rely on “ICIs” (inter click intervals) instead of the “universal” measure used across taxa to capture the rhythm of signal sequences - IOIs (inter onset intervals). If ICIs are the same measure as IOIs, why not use the common term, instead of creating a new term name? Alternatively, if ICIs are not equivalent to IOIs, then arguably the analyses do not capture the “rhythm” of whale clicks, as claimed by the authors. Any rhythmic claim will need to be based on IOI measures. In animal behaviour, stereotyped is primarily used to describe pathological, dysfunctional behaviour. I suggest the use of another adjective, such as “regular”, “repetitive”, “recurring”, “predictable”. Another deviation from typical terminology: “usage frequency” -> “production rate”. Why is a clan a “higher-order” level of social organization? This requires explanation, at least a mention, of what are the “lower-order” levels. To the non-expert reader, there is a logical circularity/gap here: Clans are said to produce clan-specific codas, and then, it is said that codas are used to delineate clans. Either one deduces, or one infers, but not both. This raises the question, are clans confirmed by any other means than codas?
We are not creating a “new term name”: inter-click interval (ICI) is the standard terminology used in odontocete (toothed whale) research. We take the reviewer’s point that some readers will not be coming to our paper with that background, however, and now explicitly point out that ICI is synonymous with IOI for sperm whales. Please see our response to your earlier comment for more on this point.
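As a minimal sketch of why the two terms coincide here, the snippet below computes intervals from click onset times. It assumes each click is treated as a point event, so the interval between consecutive clicks is also the interval between consecutive onsets. The function name and the click times are hypothetical and serve only as illustration.

```python
def inter_click_intervals(click_times):
    """Return the gaps (in seconds) between consecutive click onsets."""
    return [later - earlier
            for earlier, later in zip(click_times, click_times[1:])]


# Hypothetical onset times (s) of a five-click coda.
clicks = [0.00, 0.25, 0.50, 0.75, 1.25]
icis = inter_click_intervals(clicks)
print(icis)
```

A coda with n clicks yields n - 1 intervals, which is the representation that later rhythm analyses operate on.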
Comment 33
Line 92: Unclear term, ”sub-sequence”. Fig. 1B doesn’t seem to readily help disambiguate the meaning of the term.
In fact, the reference to Fig. 1B was misplaced, as the figure does not relate to this part of the text. A sub-sequence is simply a contiguous subset of a coda.
Action - Removed ambiguous reference to Fig. 1B
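The notion of a sub-sequence as a contiguous part of a coda can be sketched as follows. The symbols standing in for discretized intervals and the helper name are hypothetical; this is an illustration of the definition, not the code used in the manuscript.

```python
def contiguous_subsequences(coda, length):
    """List every run of `length` adjacent elements of `coda`, in order."""
    return [tuple(coda[i:i + length])
            for i in range(len(coda) - length + 1)]


# Hypothetical coda written as a sequence of discretized interval symbols.
coda = ["B", "C", "C"]
pairs = contiguous_subsequences(coda, 2)
print(pairs)
```

Note that ("B", "C") with the second "C" skipped over would not count, because the elements must be adjacent in the original coda.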
Comment 34
Line 94: How does the use of ”sequence” compare here with ”sub-sequence” above?
In fact, it is the same situation, although the previous comment highlighted a source of ambiguity.
Action - Reworded the sentence to be less confusing.
Comment 35
Line 95: Signal sequences don’t ”contain” memory, they require memory for processing.
Action - Rephrased from “sequences contain memory” to “states depend on previous sequences of varying length”.
Comment 36
Lines 95-97: The analogy with human language seems forced, combinatorics in any given species are expected to entail different transitions between unit/unit-sequences.
Thank you for the comment. Indeed, the purpose of the analogy is to illustrate how variable length Markov Chains work (which have been shown to be good at discerning even accents of the same language). We used human language as an analogy to provide readers with a more intuitive understanding of the results.
Action - Revised paragraph to read: “Although we do not have direct evidence of unitary blocks in sperm whale communication, one can imagine this effect similarly to what happens with words (e.g., a word beginning with “re” can continue in more ways than one starting with “zy”).”
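The word analogy can be made concrete with a small sketch that counts how many different letters follow a given prefix. The word list is a hypothetical toy vocabulary chosen for illustration, and the function name is ours.

```python
from collections import defaultdict


def continuations(words, prefix_len):
    """Map each prefix of length `prefix_len` to the set of letters that follow it."""
    following = defaultdict(set)
    for word in words:
        if len(word) > prefix_len:
            following[word[:prefix_len]].add(word[prefix_len])
    return dict(following)


# Hypothetical toy vocabulary.
words = ["read", "rest", "reuse", "rewind", "zygote", "zygoma"]
after = continuations(words, 2)
print(len(after["re"]), len(after["zy"]))
```

In this toy list, "re" is followed by four distinct letters and "zy" by only one, which is the kind of context-dependent branching that a variable length Markov chain captures.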
Comment 37
Line 97: Unclear which possibility is this.
Action - Made the wording clearer.
Comment 38
Line 99: Invocation of memory, although common in the use of Markov chains, is inadequate here given that the research did not study how individuals perceived or processed click sequences, only how individuals produced click sequences. If the authors are referring to the cognitive load imposed by producing click sequences, terms such as ”sequence planning” will be more accurate.
Here, we use the term “fixed-memory” in relation to the definition of a variable length Markov model. We feel that, in this section of the manuscript, the context is clear that it is a mathematical definition and in no way invokes the biological idea of memory or cognition. It is rather standard to use memory to describe the order of Markov chains. Swapping words in the definition of mathematical objects when the context is clear seems to cause unnecessary ambiguity.
Action - We clarified this in the manuscript (see comments above).
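To illustrate the purely mathematical sense of "memory", the sketch below estimates a fixed-memory model in which the next symbol depends only on the previous k symbols; k is the memory (order) of the chain. A variable length model lets that context length differ from state to state. The training sequences and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def fit_fixed_memory(sequences, k):
    """Estimate P(next symbol | previous k symbols) from example sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k, len(seq)):
            context = tuple(seq[i - k:i])  # the k symbols the chain "remembers"
            counts[context][seq[i]] += 1
    return {
        context: {symbol: n / sum(c.values()) for symbol, n in c.items()}
        for context, c in counts.items()
    }


# Hypothetical symbol sequences.
sequences = ["BCC", "BCC", "CBB", "BCB"]
memory_1 = fit_fixed_memory(sequences, 1)
memory_2 = fit_fixed_memory(sequences, 2)
print(memory_1[("B",)])
print(memory_2[("B", "C")])
```

Nothing in this definition refers to perception or cognition: the memory is simply how far back the conditioning context extends.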
Reviewer #3 (Recommendations):
Comment 39
Line 16: Add ”broadly defined” as there are many other more restricted definitions (see for example Tomasello 1999; 2009). Tomasello M (1999) The cultural origins of human cognition. Harvard University Press, Cambridge Tomasello M (2009) The question of chimpanzee culture, plus postscript (chimpanzee culture 2009). In: Laland KN, Galef BG (eds) The question of animal culture. Harvard University Press, Cambridge, pp 198-221.
Thanks for the clarification.
Action - We added the term “broadly” and added the last reference.
Comment 40
Line 22: Is all stable social learned behavior that becomes idiosyncratic and ”distinguishable” considered symbolic markers? If not, consider adding ”potentially.”
No, but the evolution of cultural groups with differing behavior can reorganize the selective environment in such a way that it can favour an in-group bias that was not initially advantageous to individuals and lead to a preference towards others who share an overt symbolic marker that initially had no meaning and a random frequency in both populations. That is to say, even randomly assigned trivial groups can evolve arbitrary symbolic markers through in-group favouritism once behavioural differences exist even in the absence of any history of rivalry, conflict, or competition between groups. See for example [L1, L2].
Comment 41
Table 1: Identity codas are defined as a ”Subset of coda types most frequently used by a sperm whale clan; canonically used to define vocal clans.” Therefore, I infer that an identity coda is not exclusively used by a specific clan and may be utilized by other clans, albeit less frequently. If this is the case, what criteria determine the frequency of usage for a coda to be categorized as an identity or non-identity coda? Does the criteria used to differentiate between ID and non-ID codas reflect the observed differences in micro changes between the two and within clans?
The methods for this categorization are defined, discussed, and justified in previous work [L9, L12]. We feel it is outside the scope of this paper to review these details here. However, the differences between vocal styles discussed here and the frequency production repertoires which allow for the definition of identity codas are on different scales. The differences between identity and non-identity codas are not the observed differences in vocal style reported here.
Comment 42
Table 1: The definition of vocal style states that it ”Encodes the rhythmic variations within codas.” However, if rhythm changes, does the type of coda change as well? Typically, in musical terms, the component that maintains the structure of a rhythm is ”tempo,” not ”rhythm.” How much microvariation is acceptable to maintain the same rhythm, and when do these variations constitute a new rhythm?
Thank you for raising this important point about the relationship between rhythmic variations and coda categorization. In our definition, ”vocal style” refers to subtle, micro-level variations in the rhythmic structure of codas that do not alter their overarching categorical identity. These microvariations are akin to ”tempo” changes in musical terms, which can modify the expression of a rhythm without fundamentally altering its structure.
The threshold at which microvariations constitute a new rhythm, and thus a new coda type, remains an open question and is a limitation of current analytical approaches. In our study, we used established classification methods to group codas into types, treating variations within these groups as part of the same rhythm. Future work could refine these thresholds to better distinguish between meaningful rhythmic variation and the emergence of new coda types.
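One way to picture the distinction between tempo and rhythm is sketched below. It assumes a simple convention that is not taken from the manuscript: tempo is the total duration of the coda, and rhythm is the pattern of intervals after dividing by that duration. The interval values and the function name are hypothetical.

```python
def split_tempo_and_rhythm(icis):
    """Return (duration, rhythm), where rhythm is the duration-normalized intervals."""
    duration = sum(icis)
    rhythm = [ici / duration for ici in icis]
    return duration, rhythm


# Two hypothetical codas: the second is the first played twice as fast.
slow = [0.5, 0.5, 1.0]
fast = [0.25, 0.25, 0.5]

slow_duration, slow_rhythm = split_tempo_and_rhythm(slow)
fast_duration, fast_rhythm = split_tempo_and_rhythm(fast)
print(slow_duration, fast_duration)
print(slow_rhythm == fast_rhythm)
```

Under this convention the two codas differ in tempo but share a rhythm; how far the normalized pattern may drift before it counts as a new coda type is exactly the open threshold question noted above.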
Comment 43
Table 1: Change ”say” to ”vocalize” (similarly as used in line 273 for humpback whales ”vocalizations”).
Thanks.
Action - Done.
Comment 44
Lines 33-35 and Figure 1-C: Can a lay listener discern the microvariations within each coda type by ear? Consider including sound samples of individual rhythmic microvariations for the same coda type pattern (e.g., Four plus, Palindrome, Plus One, Regular) to provide readers/listeners with an impression of their detectability. If the authors consider this too much or redundant supplemental material, at least give a sound sample for each of the 4 modeled subcoda structure examples of 4R2 coda variations depicted in Figure 1-C so the reader can have an acoustic impression of them.
We do not think that human listeners would be able to hear all of the variation detected here. However, this does not mean that it is not important variation for the whales. Whether human observers can classify call variation aurally shouldn’t be seen as a bar for biologically important variation in non-human species, given that their hearing and vocal production systems have evolved independently. Importantly, ’Four Plus’, ’Palindrome’, etc. are names of clans: sympatric, but socially segregated, communities of whale families that share a distinct vocal dialect of coda types. These clans each have distinguishable coda dialects made up of dozens of coda types (and delineated based on identity codas); they are not names of categorical coda types themselves.
Action - We now provide audio samples of all coda types listed in Figure 1B in the paper’s Github repository.
Comment 45
Line 69: As stated above, it may be confusing to refer to it as ”speech.” I suggest adding something like: ”Our method does capture one essential characteristic of human speech: phonology.”
Thank you for drawing our attention to this.
Action - We removed the word “speech” from the manuscript, using “communication” and/or “vocalization” depending on the context.
Comment 46
Line 111-112: Consider adding a sound sample of the variation of the 4R2 coda type that can be vocalized as BCC but also as CBB as supplementary data.
What the reviewer has correctly observed is that the traditional categorical coda type ’names’ do not capture the variation within a type by rhythm nor by tempo.
Action - We have added samples of all coda types listed in Figure 1B in the paper’s Github repo.
Comment 47
Figure 3: Include a sound sample for each of the 7 coda types in Figure 1B (”specific vocal repertoires”) to illustrate the set of coda types used and their associated usage frequencies, or at least for each of the 7 coda types in Figure 3 and tables S1 and S2.
Sperm whales in the Eastern Caribbean produce dozens of rhythm types across at least five categorical tempo types [L8, L13]. The coda types represented in Figure 1B do not demonstrate all the variability inherent in the sperm whales’ vocal dialect. Importantly, Figure 3, as well as Tables S1 and S2, refer to clan-level dialects, not specific individual coda types.
Action - We added sound samples for each coda rhythm type listed in Figure 1B to the Github repository.
Comment 48
Lines 184-190: It is unclear what human analogy term is used for ID codas. This needs clarification.
We are not making an analogy in humans for the role of ID vs non-ID codas, but only providing the example of accents as changes in vocalization (style) without a change in the actual words used (repertoire).
Action - We tried to make it clearer in the manuscript.
Comment 49
Line 190: Change ”whale speech” to ”whale vocalizations.”
Thanks.
Action - Done.
Comment 50
Figure 4: Correct citation number Hersh ”10” to Hersh ”11.”
Thanks.
Action - Fixed the reference.
Comment 51
Lines 224-232: Clarify whether the reference to how spatial overlap affects the frequency of ID codas refers to shared ID codas between clans or the production frequency of each coda within the total repertoire of codas.
The similarity between ID coda repertoires we are referring to there is based on the ID codas of both clans.
More details on the comparison can be found in [L9].
Action - We added a sentence explaining the comparison is made using the joint set of ID codas.
Comment 52
Lines 240-241: What are non-ID codas vocal cues for?
Non-ID codas likely serve as flexible, context-dependent signals that facilitate group coordination, convey environmental or social context, and promote social learning, especially in mixed-clan or overlapping habitats. Their variability suggests multifunctional roles shaped by ecological and social pressures.
Comment 53
Lines 267-268: It’s unclear whether non-ID coda vocal styles are genetically inherited or not, as argued in lines 257-258.
We did not intend to argue that non-ID coda vocal styles are genetically inherited. Instead, we aimed to present a hypothetical consideration: if non-ID coda vocal styles were genetically inherited, one would expect a direct correlation between vocal style similarity and genetic relatedness. This hypothetical framework was introduced to strengthen our argument that the observed patterns are unlikely to be explained by genetic inheritance, as such correlations have not been observed. While we acknowledge that we lack definitive proof to rule out genetic influences entirely, the evidence available strongly suggests that social learning, rather than genetic transmission, is the more plausible mechanism.
Action - Clarified in manuscript.
Comment 54
Line 277: Can males mate with females from different clans?
Yes, genetic evidence shows that males may even switch ocean basins.
Action - We have clarified that we mean the female members of units from different clans have only rarely been observed to interact at sea between clans.
Comment 55
Lines 287-292: Consider discussing the difference between controlled/voluntary and automatic/involuntary imitation and their implications for cultural selection and social learning (see Heyes 2011; 2012). Heyes, C. (2011). Automatic imitation. Psychological bulletin, 137(3), 463. Heyes, C. (2012). What’s social about social learning?. Journal of comparative psychology, 126(2), 193.
Thank you for your insightful comment regarding this. The distinction between controlled/voluntary and automatic/involuntary imitation, as highlighted by Heyes [L14, L15], provides a potentially valuable framework for interpreting social learning mechanisms in sperm whales. Automatic imitation refers to reflexive, often unconscious mimicry driven by perceptual or motor coupling, while controlled imitation involves deliberate and goal-directed efforts to replicate behaviors. Both forms likely play complementary roles in the cultural transmission observed in sperm whales.
This dual-process perspective highlights the potential for cultural selection to act at different levels. Automatic imitation may drive convergence in shared environments, promoting acoustic homogeneity and facilitating inter-clan communication. In contrast, controlled imitation ensures the preservation of clan-specific vocal traditions, maintaining cultural diversity. This interplay between automatic and controlled processes could reflect a balancing act between cultural assimilation and differentiation, underscoring the adaptive value of these mechanisms in dynamic social and ecological contexts.
Action - We have incorporated a short discussion of this distinction and its implications for our findings in the Discussion. Additionally, we have cited [L14, L15] to provide theoretical grounding for this interpretation.
Comment 56
Methods: Consider integrating the paragraph from lines 319-321 into lines 28-35 and eliminate redundant information.
Thanks.
Action - We implemented the suggestion, removing the first paragraph of the Dataset description and integrating the information when we introduce the concepts of codas and clicks.
[L1] C. Efferson, R. Lalive, and E. Fehr, Science 321, 1844 (2008).
[L2] R. McElreath, R. Boyd, and P. Richerson, Curr. Anthropol. 44, 122 (2003).
[L3] L. S. Burchardt and M. Knörnschild, PLoS Computational Biology 16, e1007755 (2020).
[L4] A. Ravignani and K. de Reus, Evolutionary Bioinformatics 15, 1176934318823558 (2019).
[L5] C. T. Kello, S. D. Bella, B. Medé, and R. Balasubramaniam, Journal of the Royal Society Interface 14, 20170231 (2017).
[L6] D. Gerhard, Canadian Acoustics 31, 22 (2003).
[L7] N. Mathevon, C. Casey, C. Reichmuth, and I. Charrier, Current Biology 27, 2352 (2017).
[L8] P. Sharma, S. Gero, R. Payne, D. F. Gruber, D. Rus, A. Torralba, and J. Andreas, Nature Communications 15, 3617 (2024).
[L9] T. A. Hersh, S. Gero, L. Rendell, M. Cantor, L. Weilgart, M. Amano, S. M. Dawson, E. Slooten, C. M. Johnson, I. Kerr, et al., Proc. Natl. Acad. Sci. 119, e2201692119 (2022).
[L10] R. Boyd and P. J. Richerson, Cult Anthropol 2, 65 (1987).
[L11] E. Cohen, Curr. Anthropol. 53, 588 (2012).
[L12] T. A. Hersh, S. Gero, L. Rendell, and H. Whitehead, Methods Ecol. Evol. 12, 1668 (2021).
[L13] S. Gero, A. Bøttcher, H. Whitehead, and P. T. Madsen, R. Soc. Open Sci. 3, 160061 (2016).
[L14] C. Heyes, Psychological Bulletin 137, 463 (2011).
[L15] C. Heyes, Journal of Comparative Psychology 126, 193 (2012).
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Weiler, Teichert, and Margrie systematically analyzed long-range cortical connectivity, using a retrograde viral tracing strategy to identify layer and region-specific cortical projections onto the primary visual, primary somatosensory, and primary motor cortices. Their analysis revealed several hundred thousand inputs into each region, with inputs originating from almost all cortical regions but dominated in number by connections within cortical sub-networks (e.g. anatomical modules). Generally, the relative areal distribution of contralateral inputs followed the distribution of corresponding ipsilateral inputs. The largest proportion of inputs originated from layer 6a cells, and this layer 6 dominance was more pronounced for contralateral than ipsilateral inputs, which suggests that these connections provide predominantly feedback inputs. The hierarchical organization of input regions was similar between ipsi- and contralateral regions, except for within-module connections, where ipsilateral connections were much more feed-forward than contralateral. These results contrast earlier studies which suggested that contralateral inputs only come from the same region (e.g. V1 to V1) and from L2/3 neurons. Thus, these results provide valuable data supporting a view of interhemispheric connectivity in which layer 6 neurons play an important role in providing modulatory feedback.
The conclusions of this paper are mostly well-supported by the data and analysis, but additional consideration of possible experimental biases is needed.
We thank the reviewer for their positive feedback on our manuscript.
Further discussion or analysis is needed about possible biases in uptake efficiency for different cell types. Is it possible that the nuclear retro-AAV has a tropism for layer 6 axons? Quantitative comparisons with results obtained with alternative methods such as rabies virus (Yao et al., 2023) or anterograde tracing (Harris et al., 2019) may be helpful for this.
We appreciate this technical comment. For the reasons indicated below, we are confident that our AAV approach successfully and rather comprehensively labels inputs to the three target areas. Firstly, in the brains in which we injected our retrograde nuclear-AAV tracer into VISp, SSp-bfd or MOp, we found several instances where layer 5 and/or layer 2/3 was the dominant cortical projection layer (please see e.g. Figure 3 heatmaps). This was true for both ipsilateral and contralateral projections.
Secondly, by way of comparison, Yao et al., 2023 injected rabies virus into VISp (but not in SSp-bfd or MOp) and their results show notable similarities to ours: 1) They show that contralateral inputs to VISp (and higher visual areas) were mainly located in Layers 5 and 6. 2) Retrogradely labelled neurons in higher visual areas revealed an anatomical hierarchy that reflects the known functional hierarchy of the mouse cortical visual system and that shown by our retro-AAV approach. Thus, as AAV- and rabies-based tracing lead to similar results, this is further evidence against bias via tropism of our AAV tracer. That said, direct comparisons of the results between our study and the Yao et al., 2023 study should be viewed with some caution since Yao et al. injected rabies virus into specific Cre-driver lines in which the rabies virus targets individual genetically defined cell types in specific layers. Importantly, because of the lack of a specific Cre-driver line, L6 cortico-cortical (L6 CC) cells could not be targeted by their approach. Thus, the dataset in Yao et al. overlooks the contribution of L6 CCs due to the lack of available Cre-lines.
Thirdly, in a recent study (Weiler et al., 2024) we found that in a specific pathway (SSp-bfd→ VISp) both retro-AAV and the more traditional non-viral tracer cholera toxin subunit B (CTB) identified neurons in Layer 6 as the main source of projection neurons. The same result for the same pathway was shown by Bieler et al., 2019 (Bieler et al., 2017) using Fluorogold for retrograde tracing. Thus, the described dominance of Layer 6 projection neurons in specific pathways is likely not the result of a tropism of retro-AAV tracers.
Please also see that we have now further extended the summary of these points in our revised manuscript in the discussion section (e.g. lines 457-463):
Quantitative analysis of the injection sites should be included to account for possible biases. For example, L6 neurons are known to be the main target of contralateral inputs into the visual cortex (Yao et al., 2023). Thus, if the injections are biased towards or against layer 6 neurons, this may change the layer distribution of retrogradely labeled input cells. Comparison across biological replicates may help reveal sensitivity to particular characteristics of the injections.
In response to the reviewer's feedback, we have now quantified the injection volume per cortical layer, as shown in the revised Fig. S3D. Our results indicate that the injections were not biased toward Layer 6. Instead, the injected tracer volumes in Layers 1, 4, 5, and 6 were similar across all animals and injected areas. However, we observed that the injected tracer volume in Layer 2/3 tended to be higher than in other layers. Although the tracer volumes in Layer 2/3 appeared to be higher, the proportion of input neurons located in Layer 2/3 for most of the cortical projection areas was consistently lower than that from Layer 6. These findings provide strong evidence against injection bias towards L6 inputs.
The possibility of labelling axons of passage within the white matter should be addressed. This could potentially lead to false positive connections, contributing to the broad connectivity from most cortical regions that were observed.
For clarification, please see Fig. S2B in our revised manuscript. In this panel we plot the average percentage volume of the viral boli in the target areas and in all other nearby structures including the white matter. The percentage of virus injected into the white matter (WM) was 0.0824 ± 0.0759% for VISp and 0.0650 ± 0.0481% for SSp-bfd injections. Notably, injections into MOp showed no leakage into white matter (0%). These minimal volumes of virus in the white matter are unlikely to significantly influence the observed profile of widespread connectivity. We have added a sentence to the Results section (lines 84-86) stating that we only used brains in which transduction of the white matter was below 0.1%.
Reviewer #2 (Public review):
Summary:
Weiler et al use retrograde tracers, two-photon tomography, and automatic cell detection to provide a detailed quantitative description of the laminar and area sources of ipsi- and contralateral cortico-cortical inputs to two primary sensory areas and a primary motor area. They found considerable bilateral symmetry in the areas providing cortico-cortical inputs. However, although the same regions in both hemispheres tended to supply inputs, a larger proportion of inputs from contralateral areas originated from deeper layers (L5 and L6).
Strengths:
The study applies state-of-the-art anatomical methods, and the data is very effectively presented and carefully analyzed. The results provide many novel insights into the similarities and differences of inputs from the two hemispheres. While over the past decade there have been many studies quantitatively and comprehensively describing cortico-cortical connections, by directly comparing inputs from the ipsi and contralateral hemispheres, this study fills in an important gap in the field. It should be of great utility and an important reference for future studies on inter-hemispheric interactions.
We thank the reviewer for this encouraging feedback on our manuscript.
Weaknesses:
Overall, I do not find any major weakness in the analyses or their interpretation. However, one must keep in mind that the study only analyses inputs projecting to three areas. This is not an inherent flaw of the study; however, it warrants caution when extrapolating the results to callosal projections terminating in other areas. As inputs to two primary sensory areas and one primary motor area are studied, some of the conclusions could potentially be different for inputs terminating in high-order sensory and motor areas. Given that primary areas were injected, there are few instances of feedforward connections sampled in the ipsilateral hemisphere. The study finds that while ipsi-projections from the visual cortex to the barrel cortex are feedforward given their fILN values, those from the contralateral visual cortex are feedback instead. One is left to wonder whether this is due to the cross-modal nature of these particular inputs and whether the same rule (that contralateral inputs consistently exhibit feedback characteristics regardless of the hierarchical relationship of their ipsilateral counterparts with the target area) would also apply to feedforward inputs within the same sensory cortices.
We acknowledge that what we find for primary sensory and motor target areas may not hold for other functionally different areas such as anterior cingulate cortex, retrosplenial cortex or frontal lobe that might be expected to receive strong feedforward cortical input. To begin to understand the organization of the global cortical input we have, however, started with primary sensory and motor areas. Please see that we have now added a sentence to the Discussion section of our manuscript that highlights the importance of investigating the hierarchical organization of intra- and interhemispheric input onto higher cortical areas or within subregions of a given sensory area.
Another issue that is left unexplored is that, in the current analyses, the barrel and primary visual cortex are analyzed as uniform structures. It is well established that both the laminar sources of callosal inputs and their terminations differ in the monocular and binocular areas of the visual cortex (border with V2L). Similarly, callosal projections differ when terminating at the border of S1 (a row of whiskers) compared with other parts of S1. Thus, some of the conclusions regarding the laminar sources of callosal inputs might depend on whether one is analyzing inputs terminating or originating in these border regions.
The aim of the present study was to analyse the global projectome to the VISp, SSp-bfd and MOp, irrespective of which subregions were included. Importantly, we purposely injected rather large bolus volumes to achieve large sample sizes of target neurons in each cortical layer. For SSp-bfd, we utilised our previously reconstructed barrel map (Weiler et al., 2024) to precisely map our viral injection sites onto the barrels (Author response image 1). Analysis revealed that the six injection sites consistently encompassed 7–13 barrels (Author response image 1, three exemplary injection sites). Additionally, we determined the centres of mass for each injection site and mapped them onto the barrel map. Four of the injection sites were located in the lateral part of SSp-bfd, two in the central region, and none in the medial part. Notably, the injection sites within SSp-bfd exhibited significant overlap. As a result, a selective analysis of callosal projections targeting these injection sites would likely not yield distinct projection patterns, as the projectomes would inevitably include projections to surrounding barrels, leading to contamination.
Author response image 1.
Left: Exemplary injection sites mapped onto the 3D barrel map of SSp-bfd within the Mouse Allen Brain Atlas. Barrels were reconstructed using specialized software as described previously (Weiler et al., 2024). Right: Centres of mass of all SSp-bfd injection sites mapped onto the 3D barrel map.
Because we covered a significant proportion of the respective target primary sensory area, any further subdivision of these data is not possible and requires more tailored injections into clearly defined subareas. Investigating the separate projectomes onto these subregions (e.g. onto V1M and V1B) remains an important research question that we, at least in part, will address in a future study.
Finally, while the paper emphasizes that projections from L6 "dominate" intra and contralateral cortico-cortical inputs, the data shows a more nuanced scenario. While it is true that the areas for which L6 neurons are the most common source of cortico-cortical projections are the most abundant, the picture becomes less clear when considering the number of neurons sending these connections. In fact, inputs from L2/3 and L5 combined are more abundant than those from L6 (Figure 3B), challenging the view that projections from L6 dominate ipsi- and contralateral projecting cortico-cortical inputs.
We agree that, in the case of the barrel cortex, layer 5 contributes significantly in terms of the number of brain areas projecting from within the ipsilateral and contralateral hemispheres. We have replaced the term “dominates” in the title, abstract and in the manuscript where relevant.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
The sections analyzing the role of L6 towards feedback (pg. 11-13, Figure 6) were a bit verbose and confusing to me. Three possible models are proposed:
(1) a decrease in L23 projections, (2) an increase in L56 projections, or (3) both.
However, what is being quantified appears to be the fraction of inputs, with L23, L5, and L6 summing to 1. Thus, a decrease in L23 would necessarily result in an increase in L56 projections. It seems like it would make more sense to quantify the percent change in the total number of inputs (rather than fractional) from each layer so that the 3 models are actually independent possibilities.
The issue with the suggested analysis is that, with one exception (one area projecting to MOp), the number of projection neurons in contralateral areas is always ~60-80% lower compared to their ipsilateral counterparts. Consequently, this is also true for the number of projection neurons in the different cortical layers. Thus, quantifying the percentage change from the ipsilateral to the contralateral hemisphere in the total number of inputs from each layer will always result in negative values.
Nevertheless, we addressed the reviewer’s issue by calculating the preservation index (1 - (ipsi - contra)/(ipsi + contra)) for the sensory-motor areas independently for the absolute number of neurons within L2/3, L5 and L6 for the cortical areas projecting to VISp, SSp-bfd and MOp (see Author response image 2). When analysing the shift from the ipsilateral to the contralateral hemisphere, we observed that significantly more projection neurons were preserved in L6 compared to L2/3 for VISp and SSp-bfd. This shows that the number of L6 projection neurons declines less from the ipsilateral to the contralateral hemisphere compared to L2/3. However, our focus was on the fraction of projection neurons within each layer relative to the other layers per hemisphere (see Fig. 6 of our manuscript). This measure is critical for distinguishing between feedforward and feedback connectivity. Calculating the change for each layer independently unfortunately does not provide insights into this comparison, as it does not capture the relative distribution of projection neurons across layers, which is central to our analysis. Therefore, we chose to present the data as layer fractions normalised within each hemisphere separately, enabling a comparison of relative changes between hemispheres, as shown in Fig. 6 in the manuscript. We agree that with our approach a decrease in the fraction of L2/3 neurons would necessarily lead to an increase in the fraction of L5+6 neurons. However, as we analysed the fractional change for L5 and L6 separately, we found that the fraction of projection neurons in L5 generally showed only minor changes, while the fraction of L6 projection neurons increased substantially (Fig. 6C). In addition, excluding L5 from the ipsi- or contralateral default network had significant effects on the fILN in only a relatively small number of projection areas. Excluding L6 resulted in significant changes in many more projection areas than excluding L5.
Author response image 2.
Preservation index for L2/3, L5 and L6 of the 24 sensory-motor areas projecting onto the three target areas VISp, SSp-bfd and MOp.
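As an illustration of the two measures contrasted in this response, the following minimal sketch computes a per-layer preservation index and the layer fractions normalised within each hemisphere. It assumes the index is read as 1 - (ipsi - contra)/(ipsi + contra); the function names and the neuron counts are hypothetical and are not taken from the study.

```python
def preservation_index(ipsi: float, contra: float) -> float:
    """Return 1 - (ipsi - contra) / (ipsi + contra) for one layer of one area.

    Assumed reading of the index: 1 means equal counts in both hemispheres,
    0 means no contralateral neurons at all.
    """
    total = ipsi + contra
    if total <= 0:
        raise ValueError("need at least one labelled neuron across hemispheres")
    return 1.0 - (ipsi - contra) / total


def layer_fractions(counts: dict) -> dict:
    """Normalise per-layer counts within one hemisphere so they sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("need at least one labelled neuron in the hemisphere")
    return {layer: n / total for layer, n in counts.items()}


# Hypothetical counts for one source area (illustrative only).
ipsi = {"L2/3": 400, "L5": 300, "L6": 300}
contra = {"L2/3": 40, "L5": 60, "L6": 100}

for layer in ipsi:
    print(layer, round(preservation_index(ipsi[layer], contra[layer]), 3))
print(layer_fractions(ipsi))
print(layer_fractions(contra))
```

In this made-up example every layer loses neurons contralaterally, yet the L6 index is higher than the L2/3 index and the L6 fraction rises within the contralateral hemisphere, which is the distinction between absolute and relative measures drawn above.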
Reviewer #2 (Recommendations for the authors):
I feel that there are a few conclusions that could be strengthened in the paper:
(1) The laminar sources of callosal inputs and their terminations differ in the monocular and binocular areas of the visual cortex (border with V2L). Similarly, callosal inputs are different close to the border of S1 with S2 than in the rest of the barrel cortex. From the methods sections and Figure S2, it seems that some injections targeted the V1 binocular zone while others were aimed at the monocular zone. Thus, it would be of interest to compare the laminar distribution and fILN of the contra inputs to the binocular and monocular zones (and S1 border vs the rest, if possible within this dataset).
Please see the answer for the reviewer’s second point in the public review (above).
(2) The results are currently a bit unclear on whether the contra inputs reflect the cortical hierarchy. Figure 4E-F makes it clear that the ipsi and contra fILNs do not always match. However, it seems from the plots in Figure 4D and Figure S6 that, while the contra fILN values are always higher, there might be a correlation between the ipsi and contra fILN. This could be addressed by directly plotting contra vs ipsi fILN.
Similarly, it would be useful to directly address if there is any hint of the visual hierarchy, as calculated in Figure S5 for the contra inputs.
Regarding the first point of the reviewer: We appreciate this comment. We do indeed find a positive correlation between the ipsilateral and contralateral fILN across the individual cortical areas for all three targets (please see Author response image 3 below). This interesting observation indicates that the rank order of the anatomical hierarchy is largely preserved in the input arising from both hemispheres. We have included this new Figure 4F in the manuscript and added a sentence in the Results (lines 282-288):
Regarding the second point of the reviewer: For visual hierarchy, although weaker, we find that the hierarchical ranking of the higher cortical visual areas is preserved for the contralateral hemisphere (see Author response image 3 below).
Author response image 3.
Rank ordered average fILN values (± sem) of higher visual cortical areas of the ventral and dorsal visual stream for the ipsilateral and contralateral hemisphere.
(3) I find the emphasis in the title and other parts of the paper on Layer 6 corticocortical cells dominating the anatomical organization of intra and interhemispheric feedback a bit of an overstatement. While it is true that the areas for which L6 is the most abundant source of cortico-cortical projections are the most abundant (Figure 3C), when just focusing on the number of neurons sending corticocortical connections (Figure 3B), this is less clear. Ipsi connections are roughly divided 1/3, 1/3, 1/3 between L2/3, L5 and L6. In the contra, while projections from L6 neurons are the most abundant, they are not a majority and are fewer than those of L2/3 and L5 together. I suggest revising the statement about L6 cells dominating cortico-cortical connections to more accurately reflect these nuances.
(4) The observations from Figure 3 discussed above suggest that L6 inputs dominate in areas with less abundant projections to the injected areas. Is this the case? Is the fraction of L6 inputs inversely correlated with the number of inputs from that area?
Please see the following correlation plots for the total number of inputs versus the fraction of L6 inputs per area for both the ipsilateral and contralateral hemisphere. We do find on the ipsilateral hemisphere a negative correlation between the total number of inputs and the L6 input fraction for VISp and to a lesser degree for SSp-bfd. Interestingly, we find the opposite correlation for the ipsilateral MOp, contralateral VISp, SSp-bfd and MOp (Author response image 4, Author response table 1). While this is an interesting finding, the correlations often appeared to be weak and often absent within the individual animals and across the three target areas (Author response table 1). Thus, these correlations are seemingly not a general feature of cortical connectivity.
Author response image 4.
Total number of cells versus fraction of cells within L6 per cortical brain area (average across animals) for the ipsilateral (top) and contralateral (bottom) hemisphere for the three target areas VISp, SSp-bfd and MOp.
Author response table 1: Respective correlations between total numbers of cells and fraction of cells within L6 per cortical brain area for the ipsilateral and contralateral hemisphere for the three target areas (significant correlations highlighted with green).
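The kind of per-area correlation summarised in this table can be sketched as follows. The response does not state which correlation coefficient was used, so this example assumes a Pearson correlation; the function name and the numbers are hypothetical and purely illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-area values for one target and one hemisphere:
# total labelled cells per source area and the fraction of them in L6.
total_inputs = [1200, 800, 450, 300, 120]
l6_fraction = [0.25, 0.30, 0.38, 0.41, 0.55]

print(round(pearson_r(total_inputs, l6_fraction), 3))
```

A negative coefficient in such a calculation would correspond to areas with fewer inputs having a larger L6 fraction; whether it is significant would need a separate test across areas and animals.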
Minor issues:
(5) Where was the mouse in Figure 3A injected?
In this exemplary mouse the retrograde tracer was injected into VISp. We added this information in the Figure legend of Figure 3A1.
(6) Clarify in panel 4F that the position of the circle corresponds to the area location.
Done as suggested.
References
Bieler M, Sieben K, Cichon N, Schildt S, Röder B, Hanganu-Opatz IL. 2017. Rate and Temporal Coding Convey Multisensory Information in Primary Sensory Cortices. eNeuro 4. doi:10.1523/ENEURO.0037-17.2017
Weiler S, Rahmati V, Isstas M, Wutke J, Stark AW, Franke C, Graf J, Geis C, Witte OW, Hübener M, Bolz J, Margrie TW, Holthoff K, Teichert M. 2024. A primary sensory cortical interareal feedforward inhibitory circuit for tacto-visual integration. Nat Commun 15:3081. doi:10.1038/s41467-024-47459-2
Yao S, Wang Q, Hirokawa KE, Ouellette B, Ahmed R, Bomben J, Brouner K, Casal L, Caldejon S, Cho A, Dotson NI, Daigle TL, Egdorf T, Enstrom R, Gary A, Gelfand E, Gorham M, Griffin F, Gu H, Hancock N, Howard R, Kuan L, Lambert S, Lee EK, Luviano J, Mace K, Maxwell M, Mortrud MT, Naeemi M, Nayan C, Ngo N-K, Nguyen T, North K, Ransford S, Ruiz A, Seid S, Swapp J, Taormina MJ, Wakeman W, Zhou T, Nicovich PR, Williford A, Potekhina L, McGraw M, Ng L, Groblewski PA, Tasic B, Mihalas S, Harris JA, Cetin A, Zeng H. 2023. A whole-brain monosynaptic input connectome to neuron classes in mouse visual cortex. Nat Neurosci 26:350–364. doi:10.1038/s41593-022-01219-x
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
Lejeune et al. demonstrated sex-dependent differences in the susceptibility to MRSA infection. The authors demonstrated the role of the microbiota and sex hormones as potential determinants of susceptibility. Moreover, the authors showed that Th17 cells and neutrophils contribute to sex hormone-dependent protection in female mice.
Strengths:
The role of microbiota was examined in various models (gnotobiotic, co-housing, microbiota transplantation). The identification of responsible immune cells was achieved using several genetic knockouts and cell-specific depletion models. The involvement of sex hormones was clarified using ovariectomy and the FCG model.
Weaknesses:
The mechanisms by which specific microbiota confer female-specific protection remain unclear.
We thank the reviewer for highlighting the strengths of the manuscript including the models and techniques we employ. We agree that the relationship between the microbiota and sex-dependent protection is less developed compared with other aspects of the study. As detailed below, we are attempting to identify specific microbes that confer female-specific protection and links with sex hormones. We have promising but preliminary results. Thus, in our revised manuscript, we added new data on the host response as suggested by the detailed comments from the Reviewers. We also elaborate on the potential role of the microbiota in the discussion section.
Reviewer #1 (Recommendations for the authors):
(1) The authors nicely showed that the transfer of the protective phenotype by FMT requires the female sex in recipients (Figure 2E). However, it remains unclear whether the female sex is required to develop protective microbiota in donor mice, as only the female NYU donor-male Jax recipient combination was tested. What happens if the microbiota from male NYU mice is transplanted into female Jax mice? If sex hormones act only on the downstream of the microbiota, such mice would show the protective phenotype. However, if sex hormones are required to establish a protective microbiota, the transplantation of microbiota from male NYU mice will not confer protection in recipient female Jax mice.
The Reviewer’s comment is well taken. We have not conducted the suggested experiment of FMT from male NYU mice to JAX female mice yet because we are pursuing an in vitro approach that we hope will eventually provide a more definitive answer. We observed that stool from female NYU mice and not JAX mice inhibits MRSA when cultured under anaerobic conditions, and this inhibitory activity is eliminated by filtration (Author response image 1A). We also observed that stool from male NYU mice inhibits MRSA growth to a similar extent as stool from female NYU mice (Author response image 1B). This result suggests that the protective role of sex hormones is downstream of the microbiota. We are in the process of identifying the specific microbiota member to support this conclusion.
Author response image 1.
Stool from NYU mice inhibits MRSA growth in vitro. (A) MRSA CFU/mL in media (TSB) following culture with unfiltered or filtered stool homogenate from female NYU or JAX mice. Stool homogenate or TSB alone was added in a 1:1 ratio to 1x10<sup>6</sup> CFU/mL MRSA and cultured anaerobically for up to 24 hours. (B) MRSA CFU/mL in TSB following culture with unfiltered stool homogenate from NYU male or female mice. Stool homogenate or TSB alone was added in a 1:1 ratio to 1x10<sup>6</sup> CFU/mL MRSA. 3 experimental replicates performed; stool taken from 6 individual mice per condition. Mean MRSA burden ± SEM. Area under the curve analysis + One way ANOVA with Sidak’s multiple comparisons test. ns: not significant.
(2) The results clearly showed the involvement of the specific microbiota in NYU mice in the sex-dependent bias in susceptibility to MRSA. However, the mechanisms by which specific microbiota promote female sex-mediated protection need to be better described. Is this simply attributed to the different Th17 cell numbers in NYU and Jax mice (i.e., increased commensal-specific Th17 cells in NYU like Taconic mice)? Or is it possible that NYU microbiota impacts the regulation of sex hormones or their downstream signaling? What about the level of sex hormones in NYU and Jax mice? Are these levels equivalent or different? Do NYU and Jax microbiotas regulate the expression of sex hormone receptors in immune cells differently?
These are great questions. We do not observe baseline differences in Th17 cells like JAX versus Taconic mice (Figure 5B), suggesting that the mechanism is different. However, it is quite possible that antigen-specific T cells, or Th17 cells specifically, are present at low levels and expand rapidly upon MRSA colonization. We have added this possibility to the discussion in the revised manuscript. To address the Reviewer’s question about the effect of the microbiota on sex hormones, we first sought to determine which sex hormone is necessary. Using estrogen receptor knockouts (Esr1<sup>-/-</sup>), we were able to implicate estrogen and have added this important finding to the manuscript (Fig. 6C). Then, we measured levels of estradiol in stool samples but did not observe a difference between NYU and JAX female mice (Author response image 2). We provide the results below but did not add them to the revised manuscript because we found it difficult to draw a conclusion without more extensive profiling as well as quantification of the receptor on specific immune cell subsets and cell-type specific knockouts. Also, see our response to Reviewer #3 regarding receptor expression. Although we have yet to explain the role of the microbiota, we hope the Reviewer agrees that we have promising yet preliminary results and that the new experiments we added to the manuscript have further strengthened the mechanism on the host side.
Author response image 2.
Estradiol levels in stool samples prior to MRSA inoculation. (A) Estradiol levels in stool samples collected prior to MRSA inoculation in male and female mice bred at NYU or purchased from Jackson Labs. Frozen stool samples were normalized by weight and processed using the DetectX® Estradiol ELISA Kit (Arbor Assays).
(3) The authors claimed that Th17-mediated recruitment of neutrophils likely promotes the clearance of MRSA in female NYU mice. However, the experimental evidence supporting this claim could be stronger. The authors should show the neutrophil recruitment in the gut mucosa in female and male NYU mice. Also, the levels of neutrophils between NYU and Jax female mice should be examined. To further strengthen the link between Th17 and neutrophils, it would be ideal to analyze neutrophil recruitment in mice lacking Th17 cells (i.e., Rag2-/-, anti-CD4 treated, Rorgt-/- mice).
We agree and now include a more detailed analysis of neutrophils. We found that the number of neutrophils in the intestine was not higher in NYU female mice compared with NYU male mice, with or without MRSA. Instead, we show that neutrophils in NYU female mice display higher levels of surface CD11b, a sign of activation, compared to males following inoculation with MRSA. We have added these findings to the revised manuscript (Fig. 5H and I). IL-17 can activate neutrophils and increase their antimicrobial activity. Consistent with this possibility, we now show that female mice lacking the IL-17 receptor lose the enhanced colonization resistance. Based on these findings, we have modified this aspect of the conclusion, and thank the reviewer for the helpful suggestion.
Reviewer #2 (Public review):
The current study by Lejeune et al. investigates factors that allow for persistent MRSA infection in the GI tract. They developed an intriguing model of intestinal MRSA infection that does not use the traditional antibiotic approach, thereby allowing for a more natural infection that includes the normal intestinal microbiota. This model is more akin to what might be expected to be observed in a healthy human host. They find that biological sex plays a clear role in bacterial persistence during infection but only in mice bred at an NYU Facility and not those acquired from Jackson Labs. This clearly indicates a role for the intestinal microbiome in affecting female bacterial persistence but not male persistence which was unaffected by the origin of the mice and thus the microbiome. Through a series of clever microbiome-specific transfer experiments, they determine that the NYU-specific microbiome plays a role in this sexual dimorphism but is not solely responsible. Additional experiments indicate that Th17 cells, estrogen, and neutrophils also participate in the resistance to persistent infection. Notably, they assess the role of sex chromosomes (X/Y) using the established four core genotype model and find that these chromosomes appear to play little role in bacterial persistence.
Overall, the paper nicely adds to the growing body of literature investigating how biological sex impacts the immune system and the burden of infectious disease. The conclusions are mostly supported by the data although there are some aspects of the data that could be better addressed and clarified.
We thank the Reviewer for appreciating our contribution and these supportive comments. We have added several experiments to fill-in gaps and text revisions to increase clarity and acknowledge limitations.
(1) There is something of a disconnect between the initial microbiome data and the later data that analyzes sex hormones and chromosomes. While there are clearly differences in microbial species across the two sites (NYU and JAX) how these bacterial species might directly interact with immune cells to induce female-specific responses is left unexplored. At the very least it would help to try and link these two distinct pieces of data to try and inform the reader how the microbiome is regulating the sex-specific response. Indeed, the reader is left with no clear exploration of the microbiota's role in the persistence of the infection and thus is left wanting.
We agree. This comment is similar to Reviewer #1’s feedback. As mentioned above, we are attempting to clarify the association between sex differences and the microbiota and have included preliminary results for the Reviewers. However, addressing this disconnect will require substantially more investigation. Instead, we have added insightful new data that elaborate on aspects of the host response. We hope the Reviewer agrees that the revised manuscript is stronger and that further delineation of the microbiota can be addressed by future studies.
(2) While the authors make a reasonable case that Th17 T cells are important for controlling infection (using RORgt knockout mice that cannot produce Th17 cells), it is not clear how these cells even arise during infection since the authors make most of the observations 2 days post-infection, which is long before a normal adaptive immune response would be expected to arise. The authors acknowledge this, but their explanation is incomplete. The increase in Th17 cells they observe is predicated on mitogenic stimulation, so they are not specific (at least in this study) for MRSA. It would be helpful to see a specific restimulation of these cells with MRSA antigens to determine if there are pre-existing, cross-reactive Th17 cells specific for MRSA and microbiota species which could then link these two as mentioned above.
We acknowledge that this is a limitation of our study. Although an experiment demonstrating pre-existing, cross-reactive T cells would help support our conclusion, aspects of MRSA biology may make the results of this experiment difficult to interpret. We have consulted with an expert on MRSA virulence factors, co-lead author Dr. Victor Torres, about the feasibility of this experiment. MRSA possesses superantigens, such as Staphylococcal enterotoxin B, which bind directly to specific Vβ regions of T-cell receptors (TCR) and major histocompatibility complex (MHC) class II on antigen-presenting cells, resulting in hyperactivation of T lymphocytes and monocytes/macrophages. Additionally, other MRSA virulence factors, such as α-hemolysin and LukED, induce cell death of lymphocytes. MRSA’s enterotoxins are heat stable, so heat-inactivation of the bacterium may not help in this matter. For these reasons, it is unlikely that we can perform a simple restimulation of lymphocytes with MRSA antigens.
A study by Shao et al. provides an example of a host commensal species inducing Th17 cells with cross-reactivity against MRSA. Upon intestinal colonization, the intestinal fungus Candida albicans influences T cell polarization towards a Th17 phenotype in the spleen and peripheral lymph nodes which provided protection to the host against systemic candidemia. Interestingly, this induction of protective Th17 cells, increased IL-17 and responsiveness in circulating Ly6G+ neutrophils also protected mice from intravenous infection with MRSA, indicating that T cell activation and polarization by intestinal C. albicans leads to non-specific protective responses against extracellular pathogens.
Shao TY, Ang WXG, Jiang TT, Huang FS, Andersen H, Kinder JM, Pham G, Burg AR, Ruff B, Gonzalez T, Khurana Hershey GK, Haslam DB, Way SS. Commensal Candida albicans Positively Calibrates Systemic Th17 Immunological Responses. Cell Host & Microbe. 2019 Mar 13;25(3):404-417.e6. doi: 10.1016/j.chom.2019.02.004. PMID: 30870622; PMCID: PMC6419754.
We have added a brief version of the above discussion in the revised manuscript. Also, as mentioned earlier, we have added new data strengthening the axis between Th17 and neutrophils, including showing that IL-17 receptor is necessary and that neutrophils display signs of heightened activation in female mice during MRSA colonization.
(3) The ovariectomy experiment demonstrates a role for ovarian hormones; however, it lacks a control of adding back ovarian hormones (or at least estrogen) so it is not entirely obvious what is causing the persistence in this experiment. This is especially important considering the experiments demonstrating no role for sex chromosomes thus demonstrating that hormonal effects are highly important. Here it leaves the reader without a conclusive outcome as to the exact hormonal mechanism.
This is a great suggestion. Rather than adding back ovarian hormones, we performed the more direct experiment and tested whether the estrogen receptor (ERα, encoded by Esr1) is necessary for the enhanced colonization resistance. Indeed, we observed that Esr1<sup>-/-</sup> female mice have increased MRSA burden compared to Esr1<sup>+/-</sup> littermates. We have added this new result (Figure 6C) and thank the Reviewer for their guidance.
(4) The discussion is underdeveloped and is mostly a rehash of the results. It would greatly enhance the manuscript if the authors would more carefully place the results in the context of the current state of the field including a more enhanced discussion of the role of estrogen, microbiome, and T cells and how the field might predict these all interact and how they might be interacting in the current study as well.
Author response: We thank the Reviewer for their feedback on improving the scholarship of the manuscript. We have expanded on the literature and the mechanistic model in both the discussion section and other parts to provide better context for our findings.
Reviewer #3 (Public review):
Summary:
Using a mouse model of Staphylococcus aureus gut colonization, Lejeune et al. demonstrate that the microbiome, immune system, and sex are important contributing factors for whether this important human pathogen persists in the gut. The work begins by describing differential gut clearance of S. aureus in female B6 mice bred at NYU compared to those from Jackson Laboratories (JAX). NYU female mice cleared S. aureus from the gut but NYU male mice and mice of both sexes from JAX exhibited persistent gut colonization. Further experimentation demonstrated that differences between staphylococcal gut clearance in NYU and JAX female mice were attributed to the microbiome. However, NYU male and female mice harbor similar microbiomes, supporting the conclusion that the microbiome cannot account for the observed sex-dependent clearance of S. aureus gut colonization. To identify factors responsible for female clearance of S. aureus, the authors performed RNAseq on intestinal epithelial cells and cells enriched within the lamina propria. This analysis revealed sex-dependent transcriptional responses in both tissues. Genes associated with immune cell function and migration were distinctly expressed between the sexes. To determine which immune cell types contribute to S. aureus clearance, Lejeune et al. employed genetic and antibody-mediated immune cell depletion. This experiment demonstrated that CD4+ IL17+ cells and neutrophils promote the elimination of S. aureus from the gut. Subsequent experiments, including the use of the 'four core genotype model', were conducted to discern between the roles of sex chromosomes and sex hormones. This work demonstrated that sex-chromosome-linked genes are not responsible for clearance, increasing the likelihood that hormones play a dominant role in controlling S. aureus gut colonization.
Strengths:
A strength of the work is the rigorous experimental design. Appropriate controls were executed and, in most cases, multiple approaches were conducted to strengthen the authors' conclusions. The conclusions are supported by the data.
The following suggestions are offered to improve an already strong piece of scholarship.
Weaknesses:
The correlation between female sex hormones and the elimination of S. aureus from the gut could be further validated by quantifying sex hormones produced in the four core genotype mice in response to colonization. Additionally, and this may not be feasible, but according to the proposed model administering female sex hormones to male mice should decrease colonization. Finally, knowing whether the quantity of IL-17a CD4+ cells change in the OVX mice has the potential to discern whether abundance/migration of the cells or their activation is promoted by female sex hormones.
In the Discussion, the authors highlight previous work establishing a link between immune cells and sex hormone receptors, but whether the estrogen (and progesterone) receptor is differentially expressed in response to S. aureus colonization could be assessed in the RNAseq dataset. Differential expression of known X and Y chromosome-linked genes were discussed but specific sex hormones or sex hormone receptors, like the estrogen receptor, were not. This potential result could be highlighted.
We appreciate the comment on the scholarship and thank the Reviewer for the insightful suggestions to improve this manuscript. We apologize for not including references that address some of the Reviewer’s questions. Other research groups have compared the levels of hormones between XX and XY males and females in the four core genotypes model and have found similar levels of circulating testosterone in adult XX and XY males. No difference was found in circulating estradiol levels in XX vs XY- females when tested at 4-6 or 7-9 months of age.
Karen M. Palaszynski, Deborah L. Smith, Shana Kamrava, Paul S. Burgoyne, Arthur P. Arnold, Rhonda R. Voskuhl, A Yin-Yang Effect between Sex Chromosome Complement and Sex Hormones on the Immune Response. Endocrinology, Volume 146, Issue 8, 1 August 2005, Pages 3280–3285, https://doi.org/10.1210/en.2005-0284
Sasidhar MV, Itoh N, Gold SM, Lawson GW, Voskuhl RR. The XX sex chromosome complement in mice is associated with increased spontaneous lupus compared with XY. Ann Rheum Dis. 2012 Aug;71(8):1418-22. doi: 10.1136/annrheumdis-2011-201246. Epub 2012 May 12. PMID: 22580585; PMCID: PMC4452281.
Administering female sex hormones to males is a good idea. We did not observe an effect of injecting males with estrogen on MRSA colonization (data not shown), perhaps due to the dose or timing, or because it is not sufficient (i.e., additional hormones and factors may be required). Therefore, we analyzed the necessity of estrogen signaling and found that Esr1<sup>-/-</sup> female mice have impaired colonization resistance to MRSA. We have added this new experiment to the revised manuscript (Fig. 6C).
Examination of the levels of estrogen, progesterone, and androgen receptors in our cecal-colonic lamina propria RNA-seq dataset is an excellent idea. We observed a significant increase in the G-protein coupled estrogen receptor 1 (Gper1) and a non-significant increase in Estrogen receptor alpha (Esr1) following MRSA inoculation in the immune cell compartment. This analysis has been added to the revised manuscript (Supplemental Fig. 6).
Reviewer #3 (Recommendations for the authors)
Minor editing issues:
The topic sentence of the last paragraph in the Results section states - 'male sex defining gene sex determining region Y (Sry) has been moved from the Y chromosome to an autosome'. 'Sex defining gene' and sex-determining region seems redundant in this context. A sex-defining gene would presumably be located within a sex-determining region.
Bold the letter 'F' in the Figure 5 legend.
It's not clear from the Figure 6E legend when the IL-17A+ CD4+ cells were quantified, 2 dpi?
In the third sentence of the second paragraph of the Discussion, the two references are merged together.
We thank the Reviewer for pointing out these editing issues. They have been addressed in the revised manuscript.
-
-
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Overall I found the approach taken by the authors to be clear and convincing. It is striking that the conclusions are similar to those obtained in a recent study using a different computational approach (finite state controllers), and lend confidence to the conclusions about the existence of an optimal memory duration. There are a few points or questions that could be addressed in greater detail in a revision:
(1) Discussion of spatial encoding
The manuscript contrasts the approach taken here (reinforcement learning in a grid world) with strategies that involve a "spatial map" such as infotaxis. The authors note that their algorithm contains "no spatial information." However, I wonder if further degrees of spatial encoding might be delineated to better facilitate comparisons with biological navigation algorithms. For example, the gridworld navigation algorithm seems to have an implicit allocentric representation, since movement can be in one of four allocentric directions (up, down, left, right). I assume this is how the agent learns to move upwind in the absence of an explicit wind direction signal. However, not all biological organisms likely have this allocentric representation. Can the agent learn the strategy without wind direction if it can only go left/right/forward/back/turn (in egocentric coordinates)? In discussing possible algorithms, and the features of this one, it might be helpful to distinguish:
(1) those that rely only on egocentric computations (run and tumble),
(2) those that rely on a single direction cue such as wind direction,
(3) those that rely on allocentric representations of direction, and
(4) those that rely on a full spatial map of the environment.
As Referee 1 points out, even if the algorithm does not require a map of space, the agent is still required to tell apart directions relative to the wind direction, which is assumed known. Indeed, although in the manuscript we labeled actions allocentrically as “up, down, left and right”, the source is always placed in the same location, hence “left” corresponds to upwind, “right” to downwind, and “up” and “down” to crosswind right and left. Thus directions are in fact relative to the mean wind, which is therefore assumed known. We have clarified the spatial encoding required to implement these strategies, and re-labeled the directions as upwind, downwind, crosswind-right and crosswind-left.
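The relabeling described above amounts to a fixed lookup from grid actions to wind-relative directions. The sketch below is only an illustration: the dictionary and function names are hypothetical, and the pairing of "up" with crosswind-right and "down" with crosswind-left is assumed from the order in which the directions are listed.

```python
# Assumed mapping from the original allocentric grid labels to
# wind-relative labels (source always upwind, i.e. to the "left").
WIND_RELATIVE = {
    "left": "upwind",
    "right": "downwind",
    "up": "crosswind-right",   # assumed pairing
    "down": "crosswind-left",  # assumed pairing
}


def relabel(actions):
    """Translate a sequence of grid actions into wind-relative labels."""
    return [WIND_RELATIVE[a] for a in actions]


print(relabel(["left", "left", "up", "down", "left"]))
```

Because the mapping is one-to-one, the learned policy itself is unchanged; only the names attached to the four actions differ.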
In reality, animals cannot measure the mean flow, but rather the local flow speed e.g. with antennas for insects, with whiskers for rodents and with the lateral line for marine organisms. Further work is needed to address how local flow measures enable navigation using Q learning.
(2) Recovery strategy on losing the plume
While the approach to encoding odor dynamics seems highly principled and reaches appealingly intuitive conclusions, the approach to modeling the recovery strategy seems to be more ad hoc. Early in the paper, the recovery strategy is defined to be path integration back to the point at which odor was lost, while later in the paper, the authors explore Brownian motion and a learned recovery based on multiple "void" states. Since the learned strategy works best, why not first consider learned strategies, and explore how lack of odor must be encoded or whether there is an optimal division of void states that leads to the best recovery strategies? Also, although the authors state that the learned recovery strategies resemble casting, only minimal data are shown to support this. A deeper statistical analysis of the learned recovery strategies would facilitate comparison to those observed in biology.
We thank Referee 1 for their remarks and suggestion to give the learned recovery a more prominent role and better characterize it. We agree that what is done in the void state is definitely key to turbulent navigation. In the revised manuscript, we have further substantiated the statistics of the learned recovery by repeating training 20 times and comparing the trajectories in the void (Figure 3 figure supplement 3, new Table 1). We believe, however, that starting with the heuristic recovery is clearer because it allows us to introduce the concept of recovery more simply. Indeed, the learned “recovery” is so flexible that it ends up mixing recovery (crosswind motion) with aspects of exploitation (surge): we defer a more in-depth analysis that disentangles these two aspects to future work. Also, we added a whole new comparison with other biologically inspired recoveries both in the native environment and for generalization (Figures 3 and 5).
(3) Is there a minimal representation of odor for efficient navigation?
The authors suggest (line 280) that the number of olfactory states could potentially be reduced to reduce computational cost. This raises the question of whether there is a maximally efficient representation of odors and blanks sufficient for effective navigation. The authors choose to represent odor by 15 states that allow the agent to discriminate different spatial regimes of the stimulus, and later introduce additional void states that allow the agent to learn a recovery strategy. Can the number of states be reduced or does this lead to loss of performance? Does the optimal number of odor and void states depend on the spatial structure of the turbulence as explored in Figure 5?
We thank the referee for their comment. Q learning defines the olfactory states prior to training and does not allow a systematic optimization of the odor representation for the task. We can, however, compare different definitions of the olfactory states, for example based on the same features but with different discretizations. We added a comparison in which the number of non-empty olfactory states is drastically reduced to just 1, i.e. if the odor is above threshold at any time within the memory, the agent is in the non-void olfactory state; otherwise it is in the void state. This drastic reduction in the number of olfactory states results in less positional information and degrades performance (Figure 5 figure supplement 5).
The number of void states is already minimal: we chose 50 void states because this matches the time agents typically remain in the void (fewer than 50 void states results in no convergence, and more than 50 introduces states that are rarely visited).
One may instead resort to deep Q-learning or to recurrent neural networks, which however do not reveal which features or olfactory states drive behavior (see discussion in manuscript and questions below).
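As a rough illustration of the reduced representation described in this answer, the following Python sketch maps the odor samples held in memory to either the single non-void state or one of a fixed number of void states. The function name, the indexing of void states by the number of steps spent in the void, and the sample values are assumptions made for illustration; they are not taken from the published code.

```python
def reduced_olfactory_state(memory_trace, threshold, steps_in_void, n_void_states=50):
    """Return a label for the drastically reduced state space.

    memory_trace  : odor samples within the agent's current memory window
    threshold     : detection threshold
    steps_in_void : time steps elapsed since the memory last held a detection
                    (assumed indexing of the void states)
    """
    # Any detection within the memory puts the agent in the one non-void state.
    if any(c > threshold for c in memory_trace):
        return "odor"
    # Otherwise use a void state; cap the index so long blanks share the last one.
    return ("void", min(steps_in_void, n_void_states - 1))


print(reduced_olfactory_state([0.0, 0.3, 0.0], threshold=0.1, steps_in_void=0))
print(reduced_olfactory_state([0.0, 0.0, 0.0], threshold=0.1, steps_in_void=7))
print(reduced_olfactory_state([0.0, 0.0, 0.0], threshold=0.1, steps_in_void=120))
```

With this encoding a tabular Q function would need only 1 + 50 rows, which makes the loss of positional information easy to see: every above-threshold trace collapses onto the same row.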
Reviewer #2 (Public review):
Summary:
The authors investigate the problem of olfactory search in turbulent environments using artificial agents trained using tabular Q-learning, a simple and interpretable reinforcement learning (RL) algorithm. The agents are trained solely on odor stimuli, without access to spatial information or prior knowledge about the odor plume's shape. This approach makes the emergent control strategy more biologically plausible for animals navigating exclusively using olfactory signals. The learned strategies show parallels to observed animal behaviors, such as upwind surging and crosswind casting. The approach generalizes well to different environments and effectively handles the intermittency of turbulent odors.
Strengths:
(1) The use of numerical simulations to generate realistic turbulent fluid dynamics sets this paper apart from studies that rely on idealized or static plumes.
(2) A key innovation is the introduction of a small set of interpretable olfactory states based on moving averages of odor intensity and sparsity, coupled with an adaptive temporal memory.
(3) The paper provides a thorough analysis of different recovery strategies when an agent loses the odor trail, offering insights into the trade-offs between various approaches.
(4) The authors provide a comprehensive performance analysis of their algorithm across a range of environments and recovery strategies, demonstrating the versatility of the approach.
(5) Finally, the authors list an interesting set of real-world experiments based on their findings, that might invite interest from experimentalists across multiple species.
Weaknesses:
(1) The inclusion of Brownian motion as a recovery strategy, seems odd since it doesn't closely match natural animal behavior, where circling (e.g. flies) or zigzagging (ants' "sector search") could have been more realistic.
We agree that Brownian motion may not be biologically plausible; we used it as a simple benchmark. We clarified this point and re-trained our algorithm with adaptive memory using circling and zigzagging (cast and surge) recoveries. The learned recovery outperforms all heuristic recoveries (Figure 3D, metrics G). Circling ranks second, and achieves these good results by further decreasing the probability of failure while paying slightly in speed. When tested in the non-native environments 2 to 6, the learned recovery performs best in environments 2, 5 and 6, i.e. at long range, which is more relevant to flying insects, whereas circling generalizes best in the odor-rich environments 3 and 4, representative of closer range and proximity to the substrate (Figure 5B, metrics G). In the new environments, as in the native environment, circling favors convergence (Figure 5B, metrics f<sup>+</sup>) over speed (Figure 5B, metrics g<sup>+</sup> and τ<sub>min</sub>/τ), which is particularly deleterious at large distance.
(2) Using tabular Q-learning is both a strength and a limitation. It's simple and interpretable, making it easier to analyze the learned strategies, but the discrete action space seems somewhat unnatural. In real-world biological systems, actions (like movement) are continuous rather than discrete. Additionally, the ground-frame actions may not map naturally to how animals navigate odor plumes (e.g. insects often navigate based on their own egocentric frame).
We agree with the reviewer that animal locomotion does not look like a series of discrete displacements on a checkerboard. However, to overcome this limitation, one has to first focus on a specific system to define actions in a way that best adheres to a species’ motor controls. Moreover, these actions are likely continuous, which makes reinforcement learning notoriously more complex. While we agree that more realistic models are definitely needed for a comparison with real systems, this remains outside the scope of the current work. We have added a remark to clarify this limitation.
(3) The lack of accompanying code is a major drawback since nowadays open access to data and code is becoming a standard in computational research. Given that the turbulent fluid simulation is a key element that differentiates this paper, the absence of simulation and analysis code limits the study's reproducibility.
We have published the code and the datasets at
- code: https://github.com/Akatsuki96/qNav
- datasets: https://zenodo.org/records/14655992
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Line 59-69: In comparing the results here to other approaches (especially the Verano and Singh papers), it would also be helpful to clarify which of these include an explicit representation of the wind direction. My understanding is that both the Singh and Verano approaches include an explicit representation of wind direction. In Singh wind direction is one of the observations that inputs to the agent, while in Verano, the actions are defined relative to the wind direction. In the current paper, my understanding is that there is no explicitly defined wind direction, but because movement directions are encoded allocentrically, the agent is able to learn the upwind direction from the structure of the plume- is this correct? I think this information would be helpful to spell out and also to address whether an agent without any allocentric direction sense can learn the task.
Thank you for the comment. In our algorithm the directions are defined relative to the mean wind, which is assumed known, as in Verano et al. As far as we understand, Singh et al provide the instantaneous, egocentric wind velocities as part of the input.
(1) Line 105: "several properties of odor stimuli depend on the distance from the source" might cite Boie...Victor 2018, Ackles...Schaefer, 2021, Nag...van Breugel 2024.
Thank you for the suggestions. We have added these references.
(2) Line 130: "we first define a finite set of olfactory states" might be helpful to the reader to state what you chose in this paragraph rather than further down.
We have slightly modified the incipit of the paragraph. We first declare we are setting out to craft the olfactory states, then define the challenges, finally we define the olfactory states.
(3) Line 267: "Note that the learned recovery strategy resembles casting behavior observed in flying insects" Might note that insects seem to deploy a range of recovery strategies depending on locomotor mode and environment. For example, flying flies circle and sink when odor is lost in windless environments (Stupski and van Breugel 2024).
Thank you for your comment. We have included the reference and we now added comparisons to results using circling and cast & surge recovery strategies.
(4) Line 289: "from positions beyond the source, the learned strategy is unable to recover the plume as it mostly casts sideways, with little to no downwind action" This is curious as many insects show a downwind bias in the absence of odor that helps them locate the plumes in the first place (e.g. Wolf and Wehner, 2000, Alvarez-Salvado et al. 2018). Is it possible that the agent could learn a downwind bias in the absence of odor if given larger environments or a longer time to learn?
The reviewer is absolutely correct: downwind motion is not observed in the recovery simply because the agent rarely overshoots the source, so the optimization for that condition is washed out by the statistics. We believe downwind motion will emerge if an agent needs to avoid overshooting the source. We do not have conclusive results yet but plan to introduce such flexibility in future work. We added this remark and the references.
(5) Line 377-391: testing these ideas in living systems. Interestingly, Kathman..Nagel 2024 (bioRxiv) shows exactly the property predicted here and in Verano in fruit flies- an odor memory that outlasts the stimulus by a duration of several seconds, appropriate for filling in "blanks." Relatedly, Alvarez-Salvado et al. 2018 showed that fly upwind running reflected a temporal integration of odor information over ~10s, sufficient to avoid responding to blanks as loss of odor.
Indeed, we believe this is the most direct connection between algorithms and experiments. We are excited to discuss with our colleagues and pursue a more direct comparison with animal behavior. We were aware of the references and forgot to cite them; thank you for your careful reading of our work!
Reviewer #2 (Recommendations for the authors):
Suggestions
(1) The paper does not clearly specify which type of animals (e.g., flying insects, terrestrial mammals) the model is meant to approximate or not approximate. The authors should consider clarifying how these simulations are suited to be a general model across varied olfactory navigators. Further, it isn't clear how low/high the intermittency studied in this model is compared to what different animals actually encounter. (Minor: The Figure 4 occupancy circles visualization could be simplified).
Environment 1 represents the lower layers of a moderately turbulent boundary layer. Search occurs on a horizontal plane about half a meter from the ground. The agent is trained at distances of about 10 meters and is also tested at longer distances of ~17 meters (environment 6), lower heights of ~1 cm from the ground (environments 3-4), a lower Reynolds number (environment 5) and a higher detection threshold (environments 2 and 4). Thus, Environments 1, 2, 5 and 6 are representative of conditions encountered by flying organisms (or pelagic ones in water), and Environments 3 and 4 of searches near the substrate, potentially involved in terrestrial navigation (benthic in water). Even near the substrate, we use odor dispersed in the fluid, and not odor attached to the substrate (relevant to trail tracking).
Also note that we pick Schmidt number Sc = 1, which is appropriate for odors in air but not in water. However, we expect a weak dependence on the Schmidt number, as the Batchelor and Kolmogorov scales are below the size of the source and we are interested in the large-scale statistics (Falkovich et al., 2001; Celani et al., 2014; Duplat et al., 2010).
Intermittency contours are shown in Fig 1C. Intermittency is highest along the centerline and decays away from it, so that even within the plume detecting odor is relatively rare. Only a thin region near the centerline has intermittency larger than 66%; the outer and most critical bin of the plume has intermittency under 33%; at the furthest point on the centerline intermittency is <10%. For reference, experimental values in the atmospheric boundary layer report intermittency 25% to 20% at 2 to 15m from the source along the centerline (Murlis and Jones, 1981).
We have labeled the contours in Fig 1C more clearly, included these remarks, and added a table matching the different environments to real conditions.
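To make the intermittency values quoted in this answer concrete, here is a minimal Python sketch that assumes intermittency is measured as the fraction of samples in an odor time series that exceed the detection threshold. The function name and the toy trace are hypothetical and are not taken from the manuscript or its code.

```python
def intermittency(trace, threshold):
    """Fraction of samples in `trace` with odor above `threshold`."""
    if not trace:
        raise ValueError("trace must contain at least one sample")
    detections = sum(1 for c in trace if c > threshold)
    return detections / len(trace)


# Hypothetical odor time series at one location: mostly blanks, two whiffs.
trace = [0.0, 0.0, 0.4, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0]
print(intermittency(trace, threshold=0.1))
```

Raising the threshold can only lower this fraction, which is one way to read the higher-threshold environments as sparser signals.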
(2) Could some biological examples and references be added to support that backtracking is a biologically plausible mechanism?
Backtracking was observed e.g. in ants displaced in unfamiliar environments (Wystrach et al, P Roy Soc B, 280, 2013), in tsetse flies executing reverse turns uncorrelated to wind, which bring them back towards the location where they last detected odor (Torr, Phys Entom, 13, 1988, Gibson & Brady Phys Entom 10, 1985) and in cockroaches upon loss of contact with the plume (Willis et al, J. Exp. Biol. 211, 2008). It is also used in computational models of olfactory navigation (Park et al, Plos Comput Biol, 12:e1004682, 2016).
(3) Hand-crafted features can be both a strength and a limitation. On the one hand, they offer interpretability, which is crucial when trying to model biological systems. On the other hand, they may limit the generality of the model. A more thorough discussion of this paper's limitations should address this.
(4) The authors mention the possibility of feature engineering or using recurrent neural networks, but a more concrete discussion of these alternatives and their potential advantages/disadvantages would be beneficial. It should be noted that the hand-engineered features in this manuscript are quite similar to what the model of Singh et al suggests emerges in their trained RNNs.
Merged answer to points 3 and 4.
We agree with the reviewer that hand-crafted features are both a strength and a limitation in terms of performance and generality. This was a deliberate choice aimed at stripping the algorithm bare of implicit components, both in terms of features and in terms of memory. Even with these simple features, our model performs well in navigating across different signals, consistent with our previous results showing that these features are a “good” surrogate for positional information.
To search for the most effective temporal features, one may consider a more systematic hand crafting, scaling up our approach. In this case one would first define many features of the odor trace; rank groups of features by their accuracy in regression against distance; train Q learning with the most promising group of features; and rank again. Note however that this approach will be cumbersome because multiple factors will have to be systematically varied: the regression algorithm, the discretization of the features, and the memory.
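The ranking step of this procedure could be sketched as follows. This is a simplified illustration under stated assumptions: it scores single features rather than groups, uses a one-variable least-squares fit as the regression, and omits the Q-learning retraining step. The feature names and numbers are hypothetical toy data.

```python
from statistics import mean


def r_squared(x, y):
    """Coefficient of determination of a one-variable least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature or target carries no information
    return sxy ** 2 / (sxx * syy)


def rank_features(features, distance):
    """Sort feature names by how well each predicts distance (best first)."""
    scores = {name: r_squared(values, distance) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: one feature value per odor trace, with known distance.
distance = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "mean_intensity": [5.1, 3.9, 3.2, 1.8, 1.0],  # nearly linear in distance
    "intermittency": [0.9, 0.2, 0.7, 0.3, 0.5],   # only weakly related
}
print(rank_features(features, distance))
```

In a fuller version each top-ranked group would be discretized into olfactory states and used to train an agent, which is where the cost noted above (regression algorithm, discretization and memory all varying) would arise.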
Alternatively, to eliminate hand crafting altogether and seek better performance or generalization, one may consider replacing these hand-crafted features and the tabular Q-learning approach with recurrent neural networks or with finite state controllers. On the flip side, neither of these algorithms will directly provide the most effective features or the best memory, because these properties are hidden within the parameters that are optimized. Extra work is therefore needed to interrogate the algorithms and extract this information. For example, in Singh et al, the principal components of the hidden states in trained agents correlate with head direction, odor concentration and time since last odor encounter. More work is needed to move beyond correlations and establish more systematically which features drive behavior in the RNN.
We have added these points to the discussion.
(5) Minor: the title of the paper doesn't immediately signal its focus on recovery strategies and their interplay with memory in the context of olfactory navigation. Given the many other papers using a similar RL approach, this might help the authors position this paper better.
We agree with the referee and have modified the title to reflect this.
(6) Minor: L 331: "because turbulent odor plumes constantly switch on and off" -- the signal received rather than the plume itself is switching on and off.
Thank you for the suggestion, we implemented it.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
The manuscript by Cao et al. examines an important but understudied question of how chronic exposure to heat drives changes in affective and social behaviors. It has long been known that temperature can be a potent driver of behaviors and can lead to anxiety and aggression. However, the neural circuitry that mediates these changes is not known. Cao et al. take on this question by integrating optical tools of systems neuroscience to record and manipulate bulk activity in neural circuits, in combination with a creative battery of behavior assays. They demonstrate that chronic daily exposure to heat leads to changes in anxiety, locomotion, social approach, and aggression. They identify a circuit from the preoptic area (POA) to the posterior paraventricular thalamus (pPVT) in mediating these behavior changes. The POA-PVT circuit increases activity during heat exposure. Further, manipulation of this circuit can drive affective and social behavioral phenotypes even in the absence of heat exposure. Moreover, silencing this circuit during heat exposure prevents the development of negative phenotypes. Overall the manuscript makes an important contribution to the understudied area of how ambient temperature shapes motivated behaviors.
Strengths:
The use of state-of-the-art systems neuroscience tools (in vivo optogenetics and fiber photometry, slice electrophysiology), chronic temperature-controlled experiments, and a rigorous battery of behavioral assays to determine affective phenotypes. The optogenetic gain of function of affective phenotypes in the absence of heat, and loss of function in the presence of heat are very convincing manipulation data. Overall a significant contribution to the circuit-level instantiation of temperature-induced changes in motivated behavior, and creative experiments.
Weaknesses:
(1) There is no quantification of cFos/rabies overlap shown in Figure 2, and no report of whether the POA-PVT circuit has a higher percentage of Fos+ cells than the general POA population. Similarly, there is no quantification of cFos in POA recipient PVT cells for Figure 2 Supplement 2.
Thanks for the comment. The quantification results of c-Fos signal have been provided in the main text and figures.
(2) The authors do not address whether stimulation of POA-PVT also increases core body temperature in Figure 3 or its relevant supplements. This seems like an important phenotype to make note of and could be addressed with a thermal camera or telemetry.
Thanks for raising this point. We did indeed monitor the core body temperature during stimulation of POA-PVT pathway, but we did not observe any significant changes. We have included this finding in the revised manuscript.
(3) In Figure 3G: is Day 1 vs Day 22 "pre-heat" significant? The statistics are not shown, but this would be the most conclusive comparison to show that POA-PVT cells develop persistent activity after chronic heat exposure, which is one of the main claims the authors make in the text. This analysis is necessary in order to make the claim of persistent circuit activity after chronic heat exposure.
Figure 3G does compare the Day 1 pre-heat condition with the Day 22 pre-heat condition, and the difference was significant. The wording has been corrected to avoid confusion. We have also modified Figures 3D to 3H in our revised manuscript to improve the clarity of these plots.
(4) In Figure 4, the control virus (AAV1-EYFP) is a different serotype and reporter than the ChR2 virus (AAV9-ChR2-mCherry). This discrepancy could lead to somewhat different baseline behaviors.
Thanks for raising this issue. We acknowledge that using AAV1-EYFP (a different serotype and reporter from AAV9-ChR2-mCherry) as our control virus is not ideal. However, in our own prior experiments, we observed no significant differences in baseline behaviors between animals injected with AAV1 or AAV9 EYFP and control mice without virus injection. We therefore believe that the baseline behaviors of the animals were unaffected.
(5) In Figure 5G, N for the photometry data: the authors assess the maximum z-score as a measure of the strength of calcium response, however the area under the curve (AUC) is a more robust and useful readout than the maximum z score for this. Maximum z-score can simply identify brief peaks in amplitude, but the overall area under the curve seems quite similar, especially for Figure 5N.
Thanks for the comment. We agree with the reviewer that the area under the curve (AUC) is an alternative readout of the strength of the calcium response. We chose the maximum z-score because, after chronic heat treatment, POA-recipient pPVT neurons exhibited a higher calcium peak during certain behaviors than under pre-heat conditions. We therefore used the maximum z-score as a representative measure of the change in neuronal activity during those behaviors before and after chronic heat treatment. We also wanted to convey that, after chronic heat exposure, POA-recipient pPVT neurons become more sensitive and more easily activated than those of control mice under the same stressful situations. For these reasons, we consider the peak (maximum z-score) aligned to particular behaviors more suitable for highlighting the findings of this study.
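The difference between the two readouts discussed here can be seen in a small Python sketch. The traces, the sampling interval and the function names are hypothetical; the AUC is computed with the trapezoidal rule on uniformly sampled z-scores, which is one common choice and not necessarily the one used in the study.

```python
def max_z(trace):
    """Peak of a z-scored photometry trace."""
    return max(trace)


def auc(trace, dt):
    """Trapezoidal area under a uniformly sampled trace with spacing dt."""
    return sum((a + b) / 2 * dt for a, b in zip(trace, trace[1:]))


dt = 0.5  # hypothetical sampling interval in seconds
brief_peak = [0.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0]  # one sharp transient
sustained = [0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]   # lower but longer response

for name, trace in [("brief peak", brief_peak), ("sustained", sustained)]:
    print(name, max_z(trace), auc(trace, dt))
```

The two toy traces have the same AUC but a fourfold difference in peak, so the choice of readout decides whether they count as equal or different responses.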
(6) For Fig 5V: the authors run the statistics on behavior bouts pooled from many animals, but it is better to do this analysis as an animal average, not by compiling bouts. Compiling bouts over-inflates the power and can yield significant p values that would not exist if the analysis were carried out with each animal as an n of 1.
Thanks for the comment and suggestion. We had tried both methods and the statistical results were similar. As suggested, we have updated Fig. 5V, as well as Fig. 5H and 5O, by comparing animal averages in our revised manuscript.
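The reviewer's point about pooling bouts can be illustrated with a short Python sketch that compares the sample size and standard error obtained from pooled bouts with those obtained from per-animal averages. The animal labels and values are hypothetical, and the helper names are introduced here for illustration only.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def per_animal_means(bouts):
    """bouts: iterable of (animal_id, value). Returns {animal_id: mean value}."""
    grouped = defaultdict(list)
    for animal_id, value in bouts:
        grouped[animal_id].append(value)
    return {animal: mean(values) for animal, values in grouped.items()}


def sem(values):
    """Standard error of the mean, treating every entry as independent."""
    return stdev(values) / sqrt(len(values))


# Hypothetical data: 3 animals with 4 behavior bouts each.
bouts = [("m1", 1.0), ("m1", 1.2), ("m1", 0.8), ("m1", 1.0),
         ("m2", 2.0), ("m2", 2.2), ("m2", 1.8), ("m2", 2.0),
         ("m3", 3.0), ("m3", 3.2), ("m3", 2.8), ("m3", 3.0)]

pooled = [value for _, value in bouts]
animal_avgs = list(per_animal_means(bouts).values())

print("pooled bouts:   n =", len(pooled), " SEM =", round(sem(pooled), 3))
print("animal average: n =", len(animal_avgs), " SEM =", round(sem(animal_avgs), 3))
```

Because bouts from the same animal are correlated, treating all 12 as independent shrinks the standard error well below the value obtained with n = 3 animals, which is how pooled analyses can produce significant p values that per-animal analyses would not.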
(7) In general this is an excellent analysis of circuit function but leaves out the question of whether there may be other inputs to pPVT that also mediate the same behavioral effect. Future experiments that use activity-dependent Fos-TRAP labeling in combination with rabies can identify other inputs to heat-sensitive pPVT cells, which may have convergent or divergent functions compared to the POA inputs.
Thanks for the valuable suggestion, which would enhance the conclusion. We will consider adopting this approach in future investigations into this question.
Reviewer #2 (Public review):
Summary
The study by Cao et al. highlights an interesting and important aspect of heat- and thermal biology: the effect of repetitive, long-term heat exposure and its impact on brain function.
Even though peripheral, sensory temperature sensors and afferent neuronal pathways conveying acute temperature information to the CNS have been well established, it is largely unknown how persistent, long-term temperature stimuli interact with and shape CNS function, and how these thermally-induced CNS alterations modulate efferent pathways to change physiology and behavior. This study is therefore not only novel but, given global warming, also timely.
The authors provide compelling evidence that neurons of the paraventricular thalamus change plastically over three weeks of episodic heat stimulation and they convincingly show that these changes affect behavioral outputs such as social interactions, and anxiety-related behaviors.
Strengths
(1) It is impressive that the assessed behaviors can be (i) recruited by optogenetic fiber activation and (ii) inhibited by optogenetic fiber inhibition when mice are exposed to heat. Technically, when/how long is the fiber inhibition performed? It says in the text "3 min on and 3 min off". Is this only during the 20-minute heat stimulation or also at other times?
Thanks for pointing out the need for clarification. Optogenetic inhibition was applied to each mouse during the daily heat exposure period (90 min) for 21 days. To avoid light-induced heating, we used a cyclical mode of 3 min light on and 3 min light off, applied only during heat exposure and not at other times. A detailed description has been added to the Methods section of our revised manuscript.
(2) It is interesting that the frequency of activity in pPVT neurons, as assessed by fiber photometry, stays increased after long-term heat exposure (day 22) when mice are back at normal room temperature. This appears similar to a previous study that found long-term heat exposure to transform POA neurons plastically to become tonically active (https://www.biorxiv.org/content/10.1101/2024.08.06.606929v1). Interestingly, the POA neurons that become tonically active by persistent heat exposure described in the above study are largely excitatory, and thus these could drive the activity of the pPVT neurons analyzed in this study.
Thanks for pointing out this study that suggests similar plasticity of POA neurons under long-term heat exposure serving a different purpose. We have included this information in our discussion as well.
(3) How can it be reconciled that the majority of the inputs from the POA are found to be largely inhibitory (Fig. 2H)? Is it possible that this result stems from the fact that non-selective POA-to-pPVT projections are labelled by the approach used in this study and not only those pathways activated by heat? These points would be nice to discuss.
Thanks for raising these important questions. Although it is not our primary focus, we are aware of the substantial inhibitory input from the POA to the pPVT, which suggests an important function. However, we do not think that this pathway, which would exert the opposite effect on POA-recipient pPVT neurons compared to the excitatory input, contributes to the long-term effect of chronic heat exposure, because the excitability of these neurons increased rather than decreased. It is possible that this inhibitory input serves as a short-term inhibitory control for other purposes. Further work is needed to fully address this question.
(4) It is very interesting that no LTP can be induced after chronic heat exposure (Figures K-M); the authors suggest that "the pathway in these mice were already saturated" (line 375). Could this hypothesis be tested in slices by employing a protocol to extinguish pre-existing (chronic heat exposure-induced) LTP? This would provide further strength to the findings/suggestion that an important synaptic plasticity mechanism is at play that conveys behavioral changes upon chronic heat stimulation.
We agree with the reviewer that the results of the suggested experiment would further strengthen our hypothesis. We will try to confirm this in future studies.
(5) It is interesting that long-term heat does not increase parameters associated with depression (Figure 1N-Q), how is it with acute heat stress, are those depression parameters increased acutely? It would be interesting to learn if "depression indicators" increase acutely but then adapt (as a consequence of heat acclimation) or if they are not changed at all and are also low during acute heat exposure.
Based on our observations, we did not find increased depression parameters after acute heat stress in our experiments (data not shown), which is consistent with two previous studies (Beas et al., 2018; Zhang et al., 2021). It appears that acute heat stress is more closely associated with anxiety-like behavior and may not be sufficient to induce depression-like phenotypes in rodents.
Beas BS, Wright BJ, Skirzewski M, Leng Y, Hyun JH, Koita O, Ringelberg N, Kwon HB, Buonanno A, Penzo MA (2018) The locus coeruleus drives disinhibition in the midline thalamus via a dopaminergic mechanism. Nat Neurosci 21:963-973.
Zhang GW, Shen L, Tao C, Jung AH, Peng B, Li Z, Zhang LI, Tao HW (2021) Medial preoptic area antagonistically mediates stress-induced anxiety and parental behavior. Nat Neurosci 24:516-528.
Weaknesses/suggestions for improvement.
(1) The introduction and general tenet of the study is, to us, a bit too one-sided/biased: generally, repetitive heat exposure --heat acclimation-- paradigms are known to not only be detrimental to animals and humans but also convey beneficial effects in allowing the animals and humans to gain heat tolerance (by strengthening the cardiovascular system, reducing energy metabolism and weight, etc.).
Thanks for the suggestion. We have modified the introduction in our revised manuscript to make it more balanced.
(2) The point is well taken that these authors here want to correlate their model (90 minutes of heat exposure per day) to heat waves. Nevertheless, and to more fully appreciate the entire biology of repetitive/chronic/persistent heat exposure (heat acclimation), it would be helpful to the general readership if the authors would also include these other aspects in their introduction (and/or discussion) and compare their 90-minute heat exposure paradigm to other heat acclimation paradigms. For example, many past studies (using mice or rats) have used more subtle temperatures but permanently (and not only for 90 minutes) stimulated them over several days and weeks (for example see PMID: 35413138). This can have several beneficial effects related to cardiovascular fitness, energy metabolism, and other aspects. In this regard: 38°C used in this study is a very high temperature for mice, in particular when they are placed there without acclimating slowly to this temperature but are directly placed there from normal ambient temperatures (22°C-24°C) which is cold/coolish for mice. Since the accuracy of temperature measurement is given as +/- 2°C, it could also be 40°C -- this temperature, 40°C, non-heat acclimated C57bl/6 mice will not survive for long.
The authors could consider discussing that this very strong, short episodic heat-stress model used here in this study may emphasize detrimental effects of heat, while more subtle long-term persistent exposure may be able to make animals adapt to heat, become more tolerant, and perhaps even prevent the detrimental cognitive effects observed in this study (which would be interesting to assess in a follow-up study).
Thanks for pointing out the important aspect regarding the different heat exposure paradigms and their potential impacts. We have incorporated these points into both the Introduction and Discussion sections of the revised manuscript.
(3) Line 140: It would help to be clear in the text that the behaviors are measured 1 day after the acute heat exposure - this is mentioned in the legend to the figure, but we believe it is important to stress this point also in the text. Similarly, this is also relevant for chronic heat stimulation: it needs to be made very clear that the behavior is measured 1 day after the last heat stimulus. If the behaviors had been measured during the heat stimulus, the results would likely be very different.
Thanks for the suggestion, and we have clarified the procedure in the revised manuscript.
(4) Figure 2 D and Figure 2- Figure Supplement 1: since there is quite some baseline cFos activity in the pPVT region we believe it is important to include some control (room temperature) mice with anterograde labelling; in our view, it is difficult/not possible to conclude, based on Fig 2 supplement 2C, that nearly 100% of the cfos positive cells are contacted by POA fibre terminals (line 168). By eye there are several green cells that don't have any red label on (or next to) them; additionally, even if there is a little bit of red signal next to a green cell: this is not definitive proof that this is a synaptic contact. It is therefore advisable to revisit the quantification and also revisit the interpretation/wording about synaptic contacts.
In relation to the above: Figure 2h suggests that all neurons are connected (the majority receiving inhibitory inputs), is this really the case, is there not a single neuron out of the 63 recorded pPVT neurons that does not receive direct synaptic input from the POA?
Thanks for the comments. For Figure 2-figure supplement 1, the baseline c-Fos activity in the pPVT was indeed measured from mice at room temperature. The observed activity may be attributed to the diverse functions of the pPVT. Relative to this baseline, the heat-exposed group showed significantly increased c-Fos signals, suggesting an effect of heat exposure.
For Figure 2-figure supplement 2, through targeted injection of AAV1-Cre into the POA, we achieved selective expression of Cre-dependent ChR2-mCherry in pPVT neurons receiving POA inputs. Following heat exposure, we observed substantial colocalization between heat-induced c-Fos expression (green signal) and ChR2-mCherry-labeled neurons (red signal) in the pPVT. This extensive overlap indicates that POA-recipient pPVT neurons are predominantly heat-responsive and likely mediate the behavioral alterations induced by chronic heat exposure. We have validated these signals and included updated quantification in our revised manuscript.
For Fig 2H, we specifically patched those neurons that were surrounded by red fluorescence under the microscope, ensuring that the patched neurons had a high likelihood of being innervated from POA. This is why all 63 recorded pPVT neurons were found to receive direct synaptic input from the POA.
(5) It would be nice to characterize the POA population that connects to the pPVT, it is possible/likely that not only warm-responsive POA neurons connect to that region but also others. The current POA-to-pPVT optogenetic fibre stimulations (Figure 4) are not selective for preoptic warm responsive neurons; since the POA subserves many different functions, this optogenetic strategy will likely activate other pathways. The referees acknowledge that molecular analysis of the POA population would be a major undertaking. Instead, this could be acknowledged in the discussion, for example in a section like "limitation of this study".
Thanks for the suggestion. We have supplemented this part in our revised manuscript.
(6) Figure 3a the strategy to express Gcamp in a Cre-dependent manner: it seems that the Gcamp8f signal would be polluted by EGFP (coming from the Cre virus injected into the POA): The excitation peak for both is close to 490nm and emission spectra/peaks of GCaMP8f (510-520 nm) and EGFP (507-510 nm) are also highly overlapping. We presume that the high background (EGFP) fluorescence signal would preclude sensitive calcium detection via Gcamp8f, how did the authors tackle this problem?
Thank you for pointing out this issue. We acknowledge that we included AAV1-EGFP when recording the GCaMP8F signal to assist in the post hoc verification of the injection site. However, we also collected recording data from mice with AAV1-Cre without EGFP injected into the POA and Cre-dependent GCaMP8F in the pPVT, albeit in a smaller number. We did not observe any obvious differences in the change in calcium signal between these two virus strategies, suggesting that the sensitivity of the GCaMP signals was not significantly affected by the increased baseline fluorescence due to EGFP.
(7) How did the authors perform the social interaction test (Figures 1F, G)? Was the intruder mouse male or female? If it was a male mouse would the interaction with the female mouse be a form of mating behavior? If so, the interpretation of the results (Figures 1F, G) could be "episodic heat exposure over the course of 3 weeks reduces mating behavior".
Thanks for the comment. For this female encounter test, we strictly followed the protocol of Ago et al. (2015). During this test, both the stranger male and female mice were placed in wire cups (made of metal wire mesh, with each opening measuring 0.5 cm [L] x 0.5 cm [W]), which prevented extensive body contact and mating behavior and permitted only innate sex-motivated movement around the cup. We have added these details to the Methods section of our revised manuscript.
Ago Y, Hasebe S, Nishiyama S, Oka S, Onaka Y, Hashimoto H, Takuma K, Matsuda T (2015) The Female Encounter Test: A Novel Method for Evaluating Reward-Seeking Behavior or Motivation in Mice Int J Neuropsychopharmacol 18: pyv062.
Reviewer #3 (Public review):
In this study, Cao et al. explore the neural mechanisms by which chronic heat exposure induces negative valence and hyperarousal in mice, focusing on the role of the posterior paraventricular nucleus (pPVT) neurons that receive projections from the preoptic area (POA). The authors show that chronic heat exposure leads to heightened activity of the POA projection-receiving pPVT neurons, potentially contributing to behavioral changes such as increased anxiety level and reduced sociability, along with heightened startle responses. In addition, using electrophysiological methods, the authors suggest that increased membrane excitability of pPVT neurons may underlie these behavioral changes. The use of a variety of behavioral assays enhances the robustness of their claim. Moreover, while previous research on thermoregulation has predominantly focused on physiological responses to thermal stress, this study adds a unique and valuable perspective by exploring how thermal stress impacts affective states and behaviors, thereby broadening the field of thermoregulation. However, a few points warrant further consideration to enhance the clarity and impact of the findings.
(1) The authors claim that behavior changes induced by chronic heat exposure are mediated by the POA-pPVT circuit. However, it remains unclear whether these changes are unique to heat exposure or if this circuit represents a more general response to chronic stress. It would be valuable to include control experiments with other forms of chronic stress, such as chronic pain, social defeat, or restraint stress, to determine if the observed changes in the POA-pPVT circuit are indeed specific to thermal stress or indicative of a more universal stress response mechanism.
We also share similar considerations as the reviewer and indeed have conducted experiments to explore this possibility. Our findings suggest that the POA-pPVT pathway may also mediate behavioral changes induced by other chronic stress, e.g. chronic restraint stress. Nevertheless, given the well-known prominent role of POA neurons in heat perception, we do believe that the POA-pPVT has a specialized role in mediating chronic heat induced changes. The role of this pathway in other stress-related responses will need a more comprehensive study in the future.
(2) The authors use the term "negative emotion and hyperarousal" to interpret behavioral changes induced by chronic heat (consistently throughout the manuscript, including the title and lines 33-34). However, the term "emotion" is broad and inherently difficult to quantify, as it encompasses various factors, including both valence and arousal (Tye, 2018; Barrett, L. F. 1999; Schachter, S. 1962). Therefore, the reviewer suggests the authors use a more precise term to describe these behaviors, such as valence. Additionally, in lines 117 and 137-139, replacing "emotion" with "stress responses," a term that aligns more closely with the physiological observations, would provide greater specificity and clarity in interpreting the findings.
Thanks for the suggestion. We have modified the description of “emotion” to “emotional valence” in various places throughout the revised manuscript.
(3) Related to the role of POA input to pPVT,
a) The authors showed increased activity in pPVT neurons that receive projections from the POA (Figure 3), and these neurons are necessary for heat-induced behavioral changes (Figures 4N-W). However, is the POA input to the pPVT circuit truly critical? Since recipient pPVT neurons can receive inputs from various brain regions, the reviewer suggests that experiments directly inhibiting the POA-to-pPVT projection itself are needed to confirm the role of POA input. Alternatively, the authors could show that the increased activity of pPVT neurons due to chronic heat exposure is not observed when the POA is blocked. If these experiments are not feasible, the reviewer suggests that the authors consider toning down the emphasis on the role of the POA throughout the manuscript and discuss this as a limitation.
b) In the electrophysiology experiments shown in Figures 6A-I, the authors conducted in vitro slice recordings on pPVT neurons. However, the interpretation of these results (e.g., "The increase in presynaptic excitability of the POA to pPVT excitatory pathway suggested plastic changes induced by the chronic heat treatment.", lines 349-350) appears to be an overclaim. It is difficult to conclude that the increased excitability of pPVT neurons due to heat exposure is specifically caused by inputs from the POA. To clarify this, the reviewer suggests the authors conduct experiments targeting recipient neurons in the pPVT, with anterograde labeling from the POA to validate the source of excitatory inputs.
For point (a), we acknowledge that pPVT neurons receiving POA inputs may also receive projections from other brain regions. While these additional inputs warrant investigation, they fall beyond the scope of our current study and represent promising directions for future research. Notably, compared to other well-characterized regions such as the amygdala and ventral hippocampus, the pPVT receives particularly robust projections from hypothalamic nuclei (Beas et al., 2018). Our optogenetic inhibition of POA-recipient pPVT neurons during chronic heat exposure effectively prevented the influence of POA excitatory projections on pPVT neurons. Furthermore, selective optogenetic activation of POA excitatory terminals within the pPVT was sufficient to induce similar behavioral abnormalities in mice, strongly supporting the causal role of POA inputs in mediating chronic heat exposure-induced behavioral alterations.
Beas BS, Wright BJ, Skirzewski M, Leng Y, Hyun JH, Koita O, Ringelberg N, Kwon HB, Buonanno A, Penzo MA (2018) The locus coeruleus drives disinhibition in the midline thalamus via a dopaminergic mechanism Nat Neurosci 21:963-973.
Regarding point (b), we acknowledge certain limitations in our in vitro patch-clamp recordings when attributing increased pPVT neuronal excitability to enhanced presynaptic POA inputs. Nevertheless, our brain slice recordings clearly demonstrated heightened excitability of pPVT neurons following chronic heat exposure. This finding was further corroborated by our in vivo fiber photometry recordings specifically targeting POA-recipient pPVT neurons, which confirmed that the increased pPVT neuronal activity was indeed modulated by POA inputs. The causal relationship was strengthened by our observation that optogenetic activation of POA excitatory terminals within the pPVT reproduced behavioral abnormalities similar to those observed in chronic heat-exposed mice. Additionally, our inability to induce circuit-specific LTP in the POA-pPVT pathway suggests that these synapses were already potentiated and saturated, reflecting enhanced excitatory inputs from the POA to pPVT. Collectively, these findings support our conclusion that increased excitatory projections from the POA to pPVT likely represent a key mechanism underlying chronic heat exposure-induced behavioral alterations in mice.
(4) The authors focus on the excitatory connection between the POA and pPVT (e.g., "Together, our results indicate that most of the pPVT-projecting POA neurons responded to heat treatment, which would then recruit their downstream neurons in the pPVT by exerting a net excitatory influence.", lines 169-171). However, are the POA neurons projecting to the pPVT indeed excitatory? This is surprising, considering i) the electrophysiological data shown in Figures 2E-K that inhibitory current was recorded in 52.4% of pPVT neurons by stimulation of POA terminal, and ii) POA projection neurons involved in modulating thermoregulatory responses to other brain regions are primarily GABAergic (Tan et al., 2016; Morrison and Nakamura, 2019). The reviewer suggests showing whether the heat-responsive POA neurons projecting to the pPVT are indeed excitatory (This could be achieved by retrogradely labeling POA neurons that project to the pPVT and conducting fluorescence in situ hybridization (FISH) assays against Slc32a1, Slc17a6, and Fos to label neurons activated by warmth). Alternatively, demonstrate, at least, that pPVT-projecting POA neurons are a distinct population from the GABAergic POA neurons that project to thermoregulatory regions such as DMH or rRPa. This would clarify how the POA-pPVT circuit integrates with the previously established thermoregulatory pathways.
Thanks for the comment and suggestion. We acknowledge that there are both excitatory and inhibitory projections from the POA to the pPVT. Although it is not our primary focus, we are aware of the substantial inhibitory input from the POA to the pPVT, which suggests an important function. However, we do not think that this pathway, which would exert an opposite effect on POA-recipient pPVT neurons compared to the excitatory input, contributes to the long-term effect of chronic heat exposure, because the excitability of these neurons increased rather than decreased. It is possible that this inhibitory input serves as a short-term inhibitory control for other purposes. Further work is needed to fully address this question.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
I have a number of suggested minor edits that would improve the readability and interpretation of figures for the reader. In many figures, there are places where it is unclear what is being tested, and making minor changes would make the manuscript flow more easily for the reader:
(1) The authors could add additional details about the behavior paradigms in the Figures, especially Figure 1. How long was the chronic heat exposure for? At what temperature? What is the length of time between the end of heat exposure and the start of behaviors? What was the schedule of testing for EPM and social behaviors? Was it all on the same day or on different days? These details will make it easier for the reader to understand the behavior tests.
We have revised our experimental scheme, especially Figure 1, and added more detailed descriptions in the method section. The modifications have also been applied to the other figures.
(2) In Figures 1J and 1K, it is a bit unclear what is being shown in the right panel, since there are no axes or labels to interpret what is being plotted.
We have added body kinetics (purple dot) in the left panel of Figure 1J and 1K to align with the right panels, and we have updated our descriptions in the figure legend.
(3) In general, Figure 1 would benefit from more headers/labels or schematics to demonstrate what is being tested (for example, it's unclear that forced swim, tail suspension, open field, aggression, sucrose preference, or acoustic startle are being studied unless the reader looks at the figure legend in depth. Simple schematics or titles for each panel would help.
We have added the abbreviated titles for each panel of Figure 1 to help readers to better understand what was being tested.
(4) Figure 2A would benefit from edits to the schematic so that it is clear that heat exposure is being done before the animal is sacrificed and cFos is stained.
We have revised the text to clarify that heat exposure occurred before the animal was sacrificed and c-Fos was stained.
(5) Figure 2D: would help if the quantification of overlap of cFos and rabies was shown in the figure in addition to reporting it in the text (84%).
We have added quantification in Figure 2D.
(6) The supplemental data in Figure 2 - Supplemental Figure 1 showing increased Fos in PVT and POA after heat exposure would actually help if it was in main Figure 2 so that the reader can more clearly see the rationale for choosing the POA-PVT circuit. But this is a matter of preference and up to the author where they want to show this data.
Thanks for the suggestion. However, considering the layout and space, we prefer to retain this part in Figure 2-figure supplement 1.
(7) Figure 3 would benefit from a behavior schematic illustrating the time course of the experiment and what the heat exposure protocol is for each day (how many minutes heat 'on' vs 'off', the temperature of heat, etc). Also, what is different about day 22 that makes it chronic heat vs day 21? Currently, it is a bit hard to understand the protocol.
We have added the temperature and duration of chronic heat exposure to the schematic of Figure 3. “Day 22” represents the time point after chronic heat exposure. We measured the calcium activity of POA-recipient pPVT neurons on day 22 and compared it with day 1 to demonstrate the activity changes of these neurons after chronic heat exposure.
(8) Figure 3D, it is unclear what the difference is between the Day 1 data on the left and Day 1 data on the right. Same with Figure 3H, unclear what the difference is between the left and the right.
The left and right panels in Figures 3D-3H show different parameters: frequency (/min; left) and amplitude (ΔF/F; right). We present both to reflect the dynamic activity changes of POA-recipient pPVT neurons throughout the chronic heat exposure process. Panels 3D to 3H have now been revised to make their meaning clearer.
(9) Figure 4A would benefit from schematics showing the stimulation protocol for chronic optogenetics (how many days? Frequency? Duration of time? Etc)
We have added detailed schematics in our Figure 4A.
Reviewer #2 (Recommendations for the authors)
(1) It is interesting that social behavior appears to be reduced upon long-term heat exposure but not after acute heat exposure. Interaction of animals, such as huddling, can be used by animals as a form of behavioral thermoregulation in cold environments and heat may drive animals apart to allow for better heat dissipation. The social interaction measured here is not huddling (because, I assume, the animals are separated by a divider?) but is this form of behavior measured here related to huddling/"social thermoregulation"? This could be discussed.
Our behavioral tests were performed at room temperature. Although huddling is a type of social behavior, in our observations the tested mouse actively moved around the metal cup, suggesting that the behavior measured here is not related to huddling or social thermoregulation.
(2) Line 113: The statement "Chronic treatment did not change body temperature" should be clarified/rephrased because 90 minutes of 38 degrees centigrade exposure to heat will increase the body temperature of mice. It would be helpful if the authors made clear that they measure body temperature before the heat stimulus (and not during the heat stimulus), which is now only obvious if one digs into the methods section.
We have revised the text and clarified that body temperature was measured before the heat stimulus in the revised manuscript.
(3) Figure 1J and K: for the non-experts, these graphs are difficult to interpret, some more explanation is needed (what exactly is measured ?). We believe that the term "arousal" may not be justified in this context because the authors have not measured sleep patterns (EEG and EMG) to show that the mice arouse from a sleep (or sleep-like) stage; the authors may consider changing the terminology, e.g. something along the lines of "agitation" or "activity".
We have further elaborated on the meaning of Figure 1J and K in our revised manuscript. The acoustic startle response is a well-recognized behavioral parameter reflecting arousal levels in rodent models. The greater the agitation in response to the stimulus, the higher the arousal level in mice. We have used the term “agitation” to describe the mice’s performance in the acoustic startle response test.
Reviewer #3 (Recommendations for the authors):
(1) The authors suggest in the introduction of the manuscript that the HPA axis and other multifaceted factors may influence emotional changes caused by heat stress (lines 63-78). However, there are no experiments or discussions on how the POA-pPVT circuit interacts with these factors. In line with the study's proposed direction in the introduction section, it would be valuable to explore, or at least discuss, whether and how the POA-pPVT circuit interacts with the HPA axis or other neural circuits known to regulate emotional and stress responses. Alternatively, the reviewer suggests revising the content of the introduction to align with the focus of the study.
Although the POA is known to possibly interact with the HPA axis via its connection with the paraventricular nucleus of the hypothalamus, there is hardly any such evidence for the pPVT. Thus, we prefer not to speculate on this question, which remains open, in our current manuscript.
(2) In Figure 5, the authors report that pPVT neurons that receive projections from the POA exhibited increased responses to stressful situations following chronic heat exposure. However, considering the long pre- and post-recording time gap of approximately three weeks, the additional expression of GCaMP protein over time could potentially account for the increased signal. Therefore, the reviewer recommends including a control group without heat exposure to rule out this possibility.
We have included Figure 3-figure supplement 1 in our manuscript to exclude the effect of expression of GCaMP protein over time on the recording of calcium signal.
(3) Related to Figure 2, a) Please include quantification data of the overlap between retrogradely labeled and c-Fos-expressing POA neurons, which can be presented as a bar graph in Figure 2. This would be beneficial for readers to estimate how many warm-activated POA neurons connected to the pPVT are actively engaged under these conditions.
In the revised manuscript, we have included the quantification analysis in Figure 2.
b) The images in Figure 2 - Figure Supplement 1 seem to degrade in quality when magnified, making it difficult to discern finer details. Higher-resolution images would greatly improve the clarity and help in accurately visualizing the c-Fos expression patterns in the POA and pPVT regions.
We have replaced the images in Figure 2-figure supplement 1 with higher-resolution versions in the revised manuscript.
c) The c-Fos images in Figure 2D and Figure 2 - Figure Supplement 2C appear unusual in that the c-Fos signal seems to fill the entire cell, whereas c-Fos protein is localized to the nucleus. Could the authors clarify whether this image accurately represents c-Fos staining or if there might be an issue with the staining or imaging process?
We are confident that the green signals in both Figure 2D and Figure 2-figure supplement 2C, which did not occupy the whole cell body, accurately reflect c-Fos and represent nuclear staining. We have updated the magnified image in Figure 2D.
d) In Supplemental Figure 2B, the square marking the region of interest should be clearly explained in the figure legend to ensure that readers can fully understand the context and focus of the image.
We have further modified our figure legend in Figure 2-figure supplement 1 in our revised manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Satoshi Yamashita et al., investigate the physical mechanisms driving tissue bending using the cellular Potts Model, starting from a planar cellular monolayer. They argue that apical length-independent tension control alone cannot explain bending phenomena in the cellular Potts Model, contrasting with previous works, particularly Vertex Models. They conclude that an apical elastic term, with zero rest value (due to endocytosis/exocytosis), is necessary to achieve apical constriction, and that tissue bending can be enhanced by adding a supracellular myosin cable. Additionally, a very high apical elastic constant promotes planar tissue configurations, opposing bending.
Strengths:
- The finding of the required mechanisms for tissue bending in the cellular Potts Model provides a natural alternative for studying bending processes in situations with highly curved cells.
- Despite viewing cellular delamination as an undesired outcome in this particular manuscript, the model's capability to naturally allow T1 events might prove useful for studying cell mechanics during out-of-plane extrusion.
We thank the reviewer for the careful comments and suggestions.
Weaknesses:
- The authors claim that the cellular Potts Model (CPM) is unable to achieve the results of the vertex model (VM) simulations due to naturally non-straight cellular junctions in the CPM versus the VM. The lack of a substantial comparison undermines this assertion. None of the references mentioned in the manuscript are from a work using a vertex model with straight cellular junctions, simulating apical constriction purely by enhancing a length-independent apical tension. Sherrard et al. and Pérez-González et al. use 2D and 3D Vertex Models, respectively, with a "contractility" force driving apical constriction. However, their models allow cell curvature. Both references suggest that the cell side flexibility of the CPM shouldn't be the main issue of the "contractility model" for apical constriction.
We appreciate the comment.
For the reports by Sherrard et al. and Pérez-González et al., the lack of cell rearrangement (T1 transition) might have caused the difference. Other than these, Muñoz et al. (doi:10.1016/j.jbiomech.2006.05.006), Polyakov et al. (doi:10.1016/j.bpj.2014.07.013), Inoue et al. (doi:10.1007/s10237-016-0794-1), Sui et al. (doi:10.1038/s41467-018-06497-3), and Guo et al. (doi:10.7554/eLife.69082) used simulation models with straight lateral surfaces.
We updated an explanation about the difference between the vertex model and the cellular Potts model in the discussion.
P12L318 “An edge in the vertex model can be bent by interpolating vertices or can be represented with an arc of circle (Brakke, 1992). Even in cases where vertex models were extended to allow bent lateral surfaces, the model still limited cell rearrangement and neighbor changes (Pérez-González et al., 2021), limiting the cell delamination. Thus the difference in simulation results between the models could be due to whether the cell rearrangement was included or not. However, it is not clear how the absence of the cell rearrangement affected cell behaviors in the simulation, and it shall be studied in future. In contrast to the vertex model, the cellular Potts model included the curved cell surface and the cell rearrangement innately, it elucidated the importance of those factors.”
- The myosin cable is assumed to encircle the invaginated cells. Therefore, it is not clear why the force acts over the entire system (even when decreasing towards the center), and not locally in the contour of the group of cells under constriction. The specific form of the associated potential is missing. It is unclear how dependent the results of the manuscript are on these not-well-motivated and model-specific rules for the myosin cable.
A circle radius decreases when the circle perimeter shrinks, and this was simulated with the myosin cable moving toward the midline in the cross section.
We added an explanation in the introduction and the results.
P2L74 “In the same way with the contracting circumferential myosin belt in a cell decreasing the cell apical surface, the circular supracellular myosin cable contraction decreases the perimeter, the radius of the circle, and an area inside the circle.”
P6L197 “In the cross section, the shrinkage of the circular supracellular myosin cable was simulated with a move of adherens junction under the myosin cable toward the midline.”
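The geometric relation invoked in this response can be written out explicitly. The following is a minimal sketch, assuming the supracellular cable is idealized as a perfect circle; the symbols $P$ (perimeter), $r$ (radius), and $A$ (enclosed area) are introduced here for illustration and are not taken from the manuscript.

```latex
\[
r = \frac{P}{2\pi}, \qquad
A = \pi r^{2} = \frac{P^{2}}{4\pi},
\qquad\text{so}\qquad
\frac{dr}{dP} = \frac{1}{2\pi} > 0, \quad
\frac{dA}{dP} = \frac{P}{2\pi} > 0 .
\]
```

Both derivatives are positive, so shortening the perimeter reduces the radius and the enclosed area together. In a cross section through the center of the circle, this appears as the cable position moving toward the midline.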
- The authors are using different names than the conventional ones for the energy terms. Their current attempt to clarify what is usually done in other works might lead to further confusion.
The reviewer is correct. However, we named the energy terms differently because the conventional naming would be misleading in our simulation model.
We added an explanation in the results.
P4L140 “Note that the naming for the energy terms differs from preceding studies. For example, Farhadifar et al. (2007) named a surface energy term expressed by a proportional function "line tensions" and a term expressed by a quadratic function "contractility of the cell perimeter". In this study, however, calling the quadratic term "contractility" would be misleading since it prevents the contraction when < _0. Therefore we renamed the terms accordingly.”
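The naming issue can be illustrated with a worked comparison of the two kinds of term. This is a sketch under stated assumptions: $P$ denotes a cell surface length, $P_0$ a reference value, and $\sigma, k > 0$ are coefficients. These symbols are placeholders chosen here and may not match the manuscript's notation.

```latex
\[
E_{\mathrm{lin}} = \sigma P
\;\Rightarrow\;
-\frac{\partial E_{\mathrm{lin}}}{\partial P} = -\sigma < 0 ,
\]
\[
E_{\mathrm{quad}} = k\,(P - P_0)^{2}
\;\Rightarrow\;
-\frac{\partial E_{\mathrm{quad}}}{\partial P} = -2k\,(P - P_0)
\;\begin{cases}
< 0 & \text{if } P > P_0, \\
> 0 & \text{if } P < P_0 .
\end{cases}
\]
```

In this sketch the linear term always favors shortening, whereas the quadratic term favors shortening only above the reference value and opposes it below. With $P_0 = 0$ the quadratic term also always favors shortening, which is presumably why the conventional name "contractility" fits that special case but not the general one.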
Reviewer #2 (Public Review):
Summary:
In their work, the Authors study local mechanics in an invaginating epithelial tissue. The work, which is mostly computational, relies on the Cellular Potts model. The main result shows that an increased apical "contractility" is not sufficient to properly drive apical constriction and subsequent tissue invagination. The Authors propose an alternative model, where they consider an alternative driver, namely the "apical surface elasticity".
Strengths:
It is surprising that despite the fact that apical constriction and tissue invagination are probably most studied processes in tissue morphogenesis, the underlying physical mechanisms are still not entirely understood. This work supports this notion by showing that simply increasing apical tension is perhaps not sufficient to locally constrict and invaginate a tissue.
We thank the reviewer for the careful comments.
Weaknesses:
Although the Authors have improved and clarified certain aspects of their results as suggested by the Reviewers, the presentation still mostly relies on showing simulation snapshots. Snapshots can be useful, but when there are too many, the results are hard to read. The manuscript would benefit from more quantitative plots like phase diagrams etc.
We agree with the comment.
However, we could not devise a quantitative measurement for a phase diagram, since 1) the measurement must be applicable to all simulation results, and 2) the measured values must match the interpretation of the results. To do so, the measurement must distinguish a bent tissue, delaminated cells, a tissue with a curved basal surface and a flat apical surface, and a tissue with a closed invagination. Such a measurement is difficult to design.
Recommendations for the authors:
Reviewing Editor (Recommendations For The Authors):
I see that the authors have worked on improving their paper in the revision. However, I agree with both reviewer #1 and reviewer #2 that the presentation and discussion of findings could be clearer.
Concrete recommendations for improvement:
(1) I find the observation by reviewer #1 on cell rearrangement very illuminating: It is indeed another key difference between the Cellular Potts Model that the authors use compared to typical Vertex Models, and could very well explain the different model outcomes. The authors could expand on the discussion of this point.
We updated the explanation of the difference between the vertex model and the cellular Potts model in the discussion.
P12L318 “An edge in the vertex model can be bent by interpolating vertices or can be represented with an arc of circle (Brakke, 1992). Even in cases where vertex models were extended to allow bent lateral surfaces, the model still limited cell rearrangement and neighbor changes (Pérez-González et al., 2021), limiting the cell delamination. Thus the difference in simulation results between the models could be due to whether the cell rearrangement was included or not. However, it is not clear how the absence of the cell rearrangement affected cell behaviors in the simulation, and it shall be studied in future. In contrast to the vertex model, the cellular Potts model included the curved cell surface and the cell rearrangement innately, it elucidated the importance of those factors.”
(2) In lines 161-164, the authors write "Some preceding studies assumed that the apical myosin generated the contractile force (Sherrard et al, 2010: Conte et al., 2012; Perez-Mockus et al., 2017; Perez-Gonzalez et al., 2021), while others assumed the elastic force (Polyakov et al., 2014; Inoue et al. 2016; Nematbakhsh et al., 2020)."
Similarly, in lines 316-319 the authors write "In the preceding studies, the apically localized myosin was assumed to generate either the contractile force (Sherrard et al, 2010: Conte et al., 2012; Perez-Mockus et al., 2017; Perez-Gonzalez et al., 2021), or the elastic force (Polyakov et al., 2014; Inoue et al. 2016; Nematbakhsh et al., 2020)."
The phrasing here is poor, as it suggests that the latter three studies (Polyakov et al., 2014; Inoue et al. 2016; Nematbakhsh et al., 2020) do not use the assumption that apical myosin generated contractile forces. This is wrong. All three of these studies do in fact assume apical surface contractility mediated by myosin. In addition, they also include other factors such as elastic restoring forces from the cell membrane (but not mediated by myosin as far as I understand).
These statements should be corrected.
We named the energy term expressed with the proportional function “contractility” and the energy term expressed with the quadratic function “elasticity”. Here we did not define what biological molecules correspond with the contractility or the elasticity.
For the three studies, the effect of myosin was expressed by the quadratic function, and Polyakov et al. (2014) named it “springlike elastic properties”, Inoue et al. (2016) named it “Apical circumference elasticity”, and Nematbakhsh et al. (2020) named it “Actomyosin contractility”. To explain that the force generated by myosin was expressed with the quadratic function in these studies, we wrote that they “assumed the elastic force”.
We assumed the myosin activity to be approximated with the proportional function in later parts and proposed that the membrane might be expressed with the quadratic function and responsible for the apical constriction based on other studies.
To clarify this, we added it to the results.
P4L175 “Some preceding studies assumed that the apical myosin generated the contractile force (Sherrard et al., 2010; Conte et al., 2012; Perez-Mockus et al., 2017; Pérez-González et al., 2021), while the others assumed the myosin to generate the elastic force (Polyakov et al., 2014; Inoue et al., 2016; Nematbakhsh et al., 2020).”
(3) Lines 294-296: The phrasing suggests that the "alternative driving mechanism" consists of apical surface elasticity remodelling alone. This is not true, it's an additional mechanism, not an alternative. The authors' model works by the combined action of increased apical surface contractility and apical surface elasticity remodelling (and the effect can be strengthened by including a supracellular actomyosin cable).
We agree with the comment that the surface remodeling does not drive the apical constriction alone but together with the myosin activity. However, if we described it as an additional mechanism, it might suggest that either the myosin activity alone or the surface remodeling alone could drive the apical constriction, and that they would simply drive it better when combined. We therefore replaced “mechanism” with “model”.
P12L311 “In this study, we demonstrated that the increased apical surface contractility could not drive the apical constriction, and proposed the alternative driving model with the apical surface elasticity remodeling.”
(4) In general, the part of the results section encompassing equations 1-5 should more explicitly state which equations were used in all simulations (Eqs1+5), and which ones were used only for certain conditions (Eqs2+3+4).
We added it as follows.
P4L153 “While the terms Equation 1 and Equation 5 were included in all simulations since they were fundamental and designed in the original cellular Potts model (Graner and Glazier, 1992), the other terms Equation 2-Equation 4 were optional and employed only for certain conditions.”
(5) Lines 150-152: Please state which parameters were examined. I assume Equation 4 was also left out of this initial simulation, as it is the potential energy of the actomyosin cable that was only included in some simulations.
We added it as follows.
P4L163 “The term Equation 4 was not included either. For a cell, its compression was determined by a balance between the pressure and the surface tension, i.e., the higher surface tension would compress the cell more. The bulk modulus 𝜆 was set to 1, the lateral cell-cell junction contractility 𝐽_𝑙 was varied for different cell compressions, and the apical and basal surface contractilities 𝐽_𝑎 and 𝐽_𝑏 were varied proportional to 𝐽_𝑙.”
(6) Lines 118-122: The sentence is very long and hard to parse. I suggest the following rephrasing:
“In this study, we assumed that the cell surface tension consisted of contractility and elasticity. We modelled the contractility as constant to decrease the surface, but not dependent on surface width or strain. We modelled the elasticity as proportional to the surface strain, working to return the surface to its original width."
We updated the explanation as follows.
P3L121 “In this study, we assumed that the cell surface tension consisted of contractility and elasticity. We modeled the contractility as a constant force to decrease the surface, but not dependent on surface width or strain. We modeled the elasticity as a force proportional to the surface strain, working to return the surface to its original width.”
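The distinction drawn above can be illustrated with a minimal sketch. It assumes a linear energy J·w for contractility and a quadratic energy (k/2)(w − w0)² for elasticity, where w is the surface width. The symbols J, k and w0 and all function names are illustrative assumptions, not the manuscript's equations.

```python
def contractile_energy(width, J):
    """Hypothetical contractility term: energy proportional to surface width."""
    return J * width


def elastic_energy(width, k, rest_width):
    """Hypothetical elasticity term: energy quadratic in the deviation from rest width."""
    return 0.5 * k * (width - rest_width) ** 2


def tension(energy, width, h=1e-6, **params):
    """Tension as dE/dw, estimated with a central finite difference."""
    return (energy(width + h, **params) - energy(width - h, **params)) / (2 * h)


if __name__ == "__main__":
    for w in (0.5, 1.0, 2.0):
        t_c = tension(contractile_energy, w, J=1.0)
        t_e = tension(elastic_energy, w, k=1.0, rest_width=1.0)
        print(f"width={w:.1f}  contractile tension={t_c:.3f}  elastic tension={t_e:.3f}")
```

Under these assumptions the contractile tension is the same at every width, whereas the elastic tension vanishes at the rest width and grows with the deviation from it.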
(7) Lines 270-274: Another long sentence that is difficult to understand.
Suggested rephrasing:
"Note that the supracellular myosin cable alone could not reproduce the apical constriction (Figure 2c), and cell surface elasticity in isolation caused the tissue to stay almost flat. However, combining both the supracellular myosin cable and the cell surface elasticity was sufficient to bend the tissue when a high enough pulling force acted on the adherens junctions."
We updated the sentence as follows.
P9L287 “Note that the supracellular myosin cable alone could not reproduce the apical constriction (Figure 2c), and that with some parameters the modified cell surface elasticity kept the tissue almost flat (Figure 4). However, combining both the supracellular myosin cable and the cell surface elasticity made a sharp bending when the pulling force acting on the adherens junction was sufficiently high.”
(8) Lines 434-435: Unclear what is meant with sentence starting with "Rest of sites"
We updated the sentence as follows.
P17L456 “At the initial configuration and during the simulation, sites adjacent to medium and not marked as apical are marked as basal.”
(9) Fixing typos and other minor grammar and wording changes would improve readability. Following is a list in order of appearance in the text with suggestions for improvement.
We greatly appreciate the careful editing, and corrected the manuscript accordingly.
Line 14: "a" is not needed in the phrase "increased a pressure"
Line 15: "cell into not the wedge shape" --"cell not into the wedge shape" In fact it might be better to flip the sentence around to say, e.g. "making the cells adopt a drop shape instead of the expected wedge shape".
Line 24: "cells decrease its apical surface" --"cells decrease their apical surface"
Line 25: instead of "turn into wedge shape", a more natural-sounding expression could be "adopt a wedge shape"
Line 28: "which crosslink and contract" --because the subject is the singular "motor protein", the verb tense needs to be changed to "crosslinks and contracts"
Line 29: I suggest to use the definite article "the" before "actin filament network" as this is expected to be a known concept to the reader.
Line 31: "adherens junction and tight junction" --use the plural, because there are many per cell: "adherens junctions and tight junctions"
Line 42: "In vertebrate" --"In vertebrates"
Line 46: "Since the interruption to" --"Since the interruption of"
Line 56: "the surface tension of the invaginated cells were" --since the subject is "the surface tension", the verb "were" needs to be changed to "was"
Line 63: "extra cellular matrix" --generally written as "extracellular matrix" without the first space
Line 66: "many epithelial tissues" --"in many epithelial tissues"
Line 70: "This supracellular cables" --"These supracellular cables"
Line 72: "encircling salivary gland" --either "encircling the salivary gland" or "encircling salivary glands"
Lines 76-77: "investigated a cell physical property required" --"investigated what cell physical properties were required"
Line 78: "was another framework" --"is another framework" (it is a generally and currently valid true statement, so use the present tense)
Line 79: "simulated an effect of the apically localized myosin" --for clarity, I suggest rephrasing as "simulated the effect of increased apical contractility mediated by apically localized myosin"
Similarly, in Line 80: "did not reproduce the apical constriction" --"did not reproduce tissue invagination by apical constriction", as technically the cells in the model do reduce their apical area, but fail to invaginate as a tissue.
Line 82: "we found that a force" --"we found that the force"
Line 101: "apico-basaly" --"apico-basally"
Lines 107-108: "in order to save a computational cost" --"in order to save on computational cost"
Line 114: "Therefore an area of the cell" --"Therefore the interior area of the cell"
Line 139: "formed along adherens junction" --"formed along adherens junctions"
Line 166: "we ignored an effect" --"we ignored the effect"
Line 167: "and discussed it later" --"and discuss it later"
Lines 167-168: "an experiment with a cell cultured on a micro pattern showed that the myosin activity was well corresponded by the contractility" --"an experiment with cells cultured on a micro pattern showed that the myosin activity corresponded well to the contractility"
Line 172: "success of failure" --"success or failure"
Figure 1 caption: "none-polar" --"non-polarized"; "reg" --"red"
Line 179: "To prevented the surface" --"To prevent the surface"
Line 180: "It kept the cells surface" --"It kept the cells' surface" (apostrophe missing)
Line 181: "cells were delaminated and resulted in similar shapes" --"cells were delaminated and adopted similar shapes"
Line 190: "To investigate what made the difference" --"To investigate the origin of the difference"
Line 203: For clarity, I would suggest to add more specific wording. "the pressure, and a difference in the pressure between the cells resulted in" --"the internal pressure due to cell volume conservation, and a difference in the pressure between the contracting and non-contracting cells resulted in"
Line 206: "by analyzing the energy with respect to a cell shape" --"by analyzing the energy with respect to cell shape"
Line 220: "indicating that cell could shrink" --"indicating that a cell could shrink"
Line 224: For clarity, I would suggest more specific wording "lateral surface, while it seems not natural for the epithelial cells" --"lateral surface imposed on the vertex model, a restriction that seems not natural for epithelial cells"
Line 244: "succeeded in invaginating" --"succeeding in invaginating"
Line 247: "were checked whether the cells" --"were checked to assess whether the cells"
Line 250: "cells became the wedge shape" --"cells adopted the wedge shape"
Line 286: "there were no obvious change in a distribution pattern" --"there was no obvious change in the distribution pattern"
Lines 296-297: "When the cells were assigned the high apical surface contractility, the cells were rounded" --"When the cells were assigned a high apical surface contractility, the cells became rounded"
Line 298: "This simulation results" --"These simulation results"
Lines 301-302: I suggest to increase clarity by somewhat rephrasing. "Even when the vertex model allowed the curved lateral surface, the model did not assume the cells to be rearranged and change neighbors" --"Even in cases where vertex models were extended to allow curved lateral surfaces, the model still limited cell rearrangement and neighbor changes"
Line 326: "high surface tension tried to keep" --"high surface tension will keep"
Line 334: "In many tissue" --"In many tissues"
Line 345: "turned back to its original shape" --"turned back to their original shape" (subject is the plural "cells")
Lines 348-349: "resembles the result of simulation" --"resembles the result of simulations"
Line 352: "how the myosin" --"how do the myosin"
Line 356: "it bears the surface tension when extended and its magnitude" What does the last "its" refer to? The surface tension?
Line 365: "the endocytosis decrease" --"the endocytosis decreases"
Line 371: "activatoin" --"activation"
Line 374 "the cells undergoes" --"the cells undergo"
Line 378: "entier" --"entire"
Line 389: "individual tissue accomplish" --"individual tissues accomplish"
Line 423: "is determined" --"are determined" (subject is the plural "labels")
Line 430: "phyisical" --"physical"
Table 6 caption: "cell-ECN" --cell-ECM
Line 557: "do not confused" --"should not be confused"
Reviewer #1 (Recommendations For The Authors):
- The phrase "In addition, the encircling supracellular myosin cable largely promoted the invagination by the apical constriction, suggesting that too high apical surface tension may keep the epithelium apical surface flat." is not clear to me. It sounds contradictory.
This finding was unexpected and surprising for us too. However, it is actually not contradictory since stronger surface tension will make the surface flatter in general. Figure 4 shows the flat apical surface with the wedge shape cells for the too strong apical surface tension. On the other hand, the supracellular myosin cable promoted the cell shape changes without raising the surface tension, and thus it could make a sharp bending (Figure 5).
We updated the explanation for the effect of the supracellular myosin cable as follows.
P2L74 “In the same way as the contracting circumferential myosin belt in a cell decreasing the cell apical surface, the circular supracellular myosin cable contraction decreases the perimeter, the radius of the circle, and an area inside the circle.”
P6L197 “In the cross section, the shrinkage of the circular supracellular myosin cable was simulated with a move of adherens junction under the myosin cable toward the midline.”
- Even though the authors now avoid saying "in contrast to vertex model simulations" on pg. 4, the next section still intends to compare the VM to the CPM. The same holds for the Discussion section. The conclusion in that section is that the difference between the results arising with the VM (achieving the constriction) and the CPM (not achieving the constriction, and leading to cell delamination) is due to the straight lateral surfaces. However, Sherrard et al. could achieve the constriction with an enhanced apical surface contractility using a 2D VM that allows curvatures. Therefore, I don't think the main difference is given by the deformability of the lateral surfaces. Instead, it might be due to the ease with which the CPM drives cellular rearrangements, coupled to specific modeling rules such as the permanent loss of the "apical side" once a delamination occurs and the boundary conditions. A clear example is the observation of loss of cell-cell adherence when all the tensions are set the same. In contrast, in a VM cells conserve their lateral neighbors in the uniform tension regime (Sherrard et al.). It is noteworthy that the two mentioned works using vertex models to achieve apical constriction (Sherrard et al. (2D) and Pérez-González et al. (3D)) seem to neglect T1 transitions. I specifically think the added discussion on the impact of the T1 events (fundamental for cell delamination) is quite poor. A more detailed description would help justify the differences between model outcomes.
We updated the explanation of the difference between the vertex model and the cellular Potts model in the discussion.
P12L318 “An edge in the vertex model can be bent by interpolating vertices or can be represented with an arc of circle (Brakke, 1992). Even in cases where vertex models were extended to allow bent lateral surfaces, the model still limited cell rearrangement and neighbor changes (Pérez-González et al., 2021), limiting the cell delamination. Thus the difference in simulation results between the models could be due to whether the cell rearrangement was included or not. However, it is not clear how the absence of the cell rearrangement affected cell behaviors in the simulation, and it shall be studied in future. In contrast to the vertex model, the cellular Potts model included the curved cell surface and the cell rearrangement innately, it elucidated the importance of those factors.”
- Fig6c: cell boundary colors are quite difficult to see.
The images were drawn by custom scripts, and those scripts do not implement a method to draw wide lines.
- Title Table 1: "epitherila".
We corrected the typo.
Reviewer #2 (Recommendations For The Authors):
The Authors have addressed most of my initial comments. In my opinion, the results could be better represented. Overall, the manuscript contains too many snapshots that are hard to read. I am sure the Authors could come up with a parameter that would tell the overall shape of the tissue and distinguish between a proper invagination and delamination. Then they could plot this parameter in a phase diagram using color plots to show how varying values of model parameters affects the shape. Presentation aside, I believe the manuscript will be a valuable piece of work that will be very useful for the community of computational tissue mechanics.
We agree with the comment.
However, we could not devise a suitable quantitative measurement. For the phase diagrams, a single measurement must be applicable to all simulation results; otherwise each figure would introduce a new measurement, and the color representation would merely redraw the snapshots without allowing comparison between figures. Different measurements would thus make the figures more difficult to read.
The single measurement must distinguish the cell delamination by the increased surface contractility from the invagination by the modified surface elasticity and the supracellular contractile ring, even though the center cells were covered by the surrounding cells and lost contact with the apical-side extracellular medium in both cases.
A measurement based on the center of mass would return large values for the delaminated cells because they moved basally. A measurement based on the curvature of the tissue basal surface would not capture whether the tissue apical surface was also curved or kept flat. If the phase diagram and the interpretation of the simulation results did not match each other, the diagram would be misleading.
We were not able to design a measurement meeting all these conditions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Weaknesses:
(1) Important details about the nature of DEG comparisons between the wild type and the Lrrk2 G2019S model are missing.
Please see the recommendations section below for specific responses to individual comments from Reviewer #1.
(2) Some aspects of the integration between snRNA-seq and MERFISH data are not clear, and many MERFISH-identified cells do not appear to have a high-confidence cluster transfer into the snRNA-seq data space. Imputation is used to overcome some issues with the MERFISH dataset, but it is not clear that this is appropriate.
Please see the recommendations section below for specific responses to individual comments from Reviewer #1.
Reviewer #2 (Public review):
(1) In the GO pathway analyses (both GSEA and DEG GO), I did not see a correction applied to the gene background considered. The study focusses on dopaminergic neurons and thus the gene background should be restricted to genes expressed in dopaminergic neurons, rather than all genes in the mouse genome. The problem arises that if we randomly sample genes from dopaminergic neurons instead of the whole genome, we are predisposed to sampling genes enriched in relevant cell-type-specific roles (and their relevant GO terms) and correspondingly depleted in genes enriched in functions not associated with this cell type. Thus, I am unsure whether the results presented in Figures 8 and 9 may be more likely to be obtained just by randomly sampling genes from a dopaminergic neuron. The background should be limited and these functional analyses rerun.
Thank you for pointing out this important concern. We agree that overrepresentation analyses (ORAs) are vulnerable to selecting cell-type specific markers as significantly differentially expressed and thus inflating detection of cell-type associated gene sets rather than those truly altered as a function of experimental condition. We have thus re-run the GO analyses in our study with the genetic background being adjusted for each individual comparison. For dataset-level GO in Fig 8, genetic background was defined as genes with expression detected in at least 5% of all cells (to approximate the inclusion of cluster-specific genes). For comparisons of subsets within the dataset (i.e. a family or cluster) across conditions, a minimum detection level of 10% of cells was used to define the genetic background. These same thresholds were applied to filter the DEG lists used as input for GO. Interestingly, this correction appears to have filtered out or lowered the significance of some of the more generic brain-associated pathways that we initially presented, such as axonogenesis or learning and memory, and we feel even more confident in our original interpretation.
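The effect of restricting the gene background on an overrepresentation test can be sketched as follows. This is a minimal illustration using a one-sided hypergeometric test and only the Python standard library. The function names, gene names and toy detection fractions are hypothetical; only the 5% detection threshold comes from the response above.

```python
from math import comb


def expressed_background(detection_fraction, min_fraction):
    """Return the genes detected in at least `min_fraction` of cells."""
    return {g for g, f in detection_fraction.items() if f >= min_fraction}


def ora_p_value(degs, gene_set, background):
    """One-sided hypergeometric P(X >= k) for the overlap of DEGs with a gene set."""
    degs = set(degs) & background
    gene_set = set(gene_set) & background
    N = len(background)        # genes that could have been drawn
    K = len(gene_set)          # annotated genes within the background
    n = len(degs)              # DEGs within the background
    k = len(degs & gene_set)   # observed overlap
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


if __name__ == "__main__":
    # Toy data: 10 well-detected genes and 90 genes detected in 1% of cells.
    detection = {f"g{i}": (0.5 if i < 10 else 0.01) for i in range(100)}
    degs = {"g0", "g1"}
    pathway = {"g0", "g1", "g2"}
    restricted = expressed_background(detection, 0.05)
    print("all genes as background:", ora_p_value(degs, pathway, set(detection)))
    print("expressed genes only:   ", ora_p_value(degs, pathway, restricted))
```

In this toy case the same overlap is less significant against the restricted background, which matches the direction of the change the authors report after the correction.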
Functional class scoring methods like GSEA, however, are unlike ORAs in that they do not utilize a hypergeometric test to calculate overrepresentation, as no distinction is made between significant and non-significant differential gene expression (nor is a genetic background provided as input to this tool). GSEA takes as input the full DE results, ranking genes according to their association with either group. Thus, genes simply enriched in DA neurons should be present towards both extremes of the rank list, rather than uniformly skewed toward one extreme. Per the GSEA authors’ user manual and original source paper, the entirety of DE testing should be provided as input for GSEA (barring genes with detection levels so low that their differential expression and/or ranking is likely to be artifactual):
“The GSEA algorithm does not filter the expression dataset and generally does not benefit from your filtering of the expression dataset. During the analysis, genes that are poorly expressed or that have low variance across the dataset populate the middle of the ranked gene list and the use of a weighted statistic ensures that they do not contribute to a positive enrichment score. By removing such genes from your dataset, you may actually reduce the power of the statistic and processing time is rarely a factor as GSEA can easily analyze 22,000 genes with even modest processing power. However, an exception exists for RNA-seq datasets where GSEA may benefit from the removal of extremely low count genes (i.e., genes with artifactual levels of expression such that they are likely not actually expressed in any of the samples in the dataset).” [https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html]
In our study, this filtering of very low expression genes (to account for artifactually inflated fold changes or a large number of ties in the rank list that are subsequently ordered at random) occurred at the level of DE testing using the Seurat FindMarkers command, in which differential expression calculations were only performed for genes that were detected in a minimum of 10% of cells in the dataset.
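The ranking-based logic described above can be sketched with a simplified running-sum enrichment score. This is only an illustration of the weighted running-sum idea behind GSEA, not the GSEA software itself: it omits permutation testing and score normalization, and the function name and toy gene list are hypothetical.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score for a ranked list of (gene, metric) pairs.

    `ranked` must already be sorted by metric in descending order.
    """
    gene_set = set(gene_set)
    hits = [abs(m) ** weight if g in gene_set else 0.0 for g, m in ranked]
    n_hits = sum(1 for g, _ in ranked if g in gene_set)
    n_miss = len(ranked) - n_hits
    hit_total = sum(hits)
    if n_hits == 0 or n_miss == 0 or hit_total == 0:
        raise ValueError("gene set must partly overlap the ranked list with non-zero metrics")
    running, best = 0.0, 0.0
    for (g, _), h in zip(ranked, hits):
        # Step up (weighted by the metric) on a hit, step down uniformly on a miss.
        running += h / hit_total if g in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    ranked = [("a", 3.0), ("b", 2.0), ("c", 1.0), ("d", -1.0), ("e", -2.0), ("f", -3.0)]
    print("set at the top:       ", enrichment_score(ranked, {"a", "b"}))
    print("set at the bottom:    ", enrichment_score(ranked, {"e", "f"}))
    print("set at both extremes: ", enrichment_score(ranked, {"a", "f"}))
```

A gene set split between both ends of the ranking yields a weaker score than one concentrated at a single end, which is the behavior the response relies on.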
(2) In the scDRS results, I am unsure what is significant and what isn't. The authors refer to relative measures in the text ("highest") but I do not know whether these differences are significant nor whether any associations are significantly unexpected. Can the x-axis of scDRS results presented in Figure 9 H and I be replaced with a corrected p-value instead of the scDRS score?
An important distinction should be made here between scDRS and similar approaches that utilize overrepresentation analyses to assess for associations of DEGs with putative risk genes, similar to the GO analyses performed in our paper. The scDRS score represents the relative association for each individual cell’s expression profile (among all other cells in the dataset) with PD risk loci by utilizing the underlying SNPs and associations described in GWAS summary statistics (see Methods or Zhang et al., Nat Genetics 2022 for more details). While scDRS can be used to generate a p value for each individual cell in the dataset, scDRS does not have a native method for defining group-level p values, nor have we attempted to calculate group-level p values here. In order to compare cluster-level mean scDRS scores and determine their significance, we created bootstrapped 95% confidence intervals for the mean scDRS score of each cluster or family (shown by the error bars in forest plots 9G, 9H). A score of 0 represents the null hypothesis of no association between gene expression and PD risk loci, and thus if the 95% confidence interval does not overlap 0, the mean scDRS score for a given group can be regarded as significant as there is a less than 5% chance of the true group mean containing the null. Similarly, groups can be compared to each other in the same way to determine if the group-level mean scDRS score is significantly different across a given pair. However, this overlap of confidence intervals should be interpreted cautiously, as there are a large number of potential comparisons that can be made, creating the potential for Type I error. We have added language to clarify what the scDRS score represents, and to ensure it is not conflated with approaches such as GO or GSEA.
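The confidence-interval approach described above can be sketched as a percentile bootstrap of a group mean. This is a minimal sketch under stated assumptions: the number of resamples, the seed, the percentile method and the function names are illustrative choices, and the scores in the usage example are made up rather than taken from the study.

```python
import random


def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def excludes_zero(ci):
    """True if the interval lies entirely above or entirely below zero."""
    lower, upper = ci
    return lower > 0 or upper < 0


if __name__ == "__main__":
    # Hypothetical per-cell scores for one cluster.
    scores = [0.8, 1.1, 0.4, 0.9, 1.3, 0.7, 1.0, 0.6, 1.2, 0.5]
    ci = bootstrap_mean_ci(scores)
    print("95% CI for the cluster mean:", ci, "excludes zero:", excludes_zero(ci))
```

Because many clusters can be compared this way, the multiple-comparison caveat raised in the response applies equally to intervals computed like this.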
(3) The results discussed at the bottom of page 13 [page 14 of new version] state that 48.82% of the proteins encoded by the Calb1 DEGs have pre-synaptic localisations as opposed to 45.83% of the SOX6 DEGs, which does not support the statement that "greater proportions of DEGs are associated with presynaptic locations in cells from vulnerable DA neurons (Sox6 family, [and in particular, Sox6^tafa1]), compared to less vulnerable ones (Calb1 family)".
Thank you for pointing this out; the error here lies in the wording of the results. The percentages mentioned above describe the percentages within the synaptic localized genes rather than the total DEG lists. We have rephrased this section for clarity to include both the percentages within this category as well as the total (the results of which are in line with our original statement).
(4) While an interest in the Sox6^tafa1 subtype is explained through their expression of Anxa1 denoting a previously identified subtype associated with locomotory behaviours, it was unclear to me how to interpret the functional associations made to DEGs in this subtype taken out of context of other subtypes. Given all the other subtypes, it is not possible to ascertain how specific and thus how interesting these results are unless other subtypes are analysed in the same way and this Sox6^tafa1 subtype is demonstrated as unusual given results from other subtypes.
In our study, we chose to specifically focus on this population given its unique acceleration-locked functional activity pattern observed in Azcorra & Gaertner et al, Nat Neuro 2023, as there are technical limitations that warrant cautious application of the above approach. We agree that the associations of this population to the described DEGs cannot be interpreted as unique to this population given the data presented and have added language to this effect within the text. There are two major challenges to analyzing all other subtypes to provide a comparison. Firstly, given the number of subtypes involved and number of downstream analyses, it is computationally intensive to carry out this analysis. More importantly, however, the results cannot be easily compared across different populations due to the variability in both cluster size and internal heterogeneity of each cluster, as the statistical power in calculating DEGs will be inherently different across these populations (i.e. smaller or more heterogeneous clusters would be expected to show a lower number of DEGs reaching significance). While pseudobulk testing is effective for mitigating these factors, our limited sample number (n=2 independently generated datasets per group) dramatically underpowers differential expression testing using pseudobulk analysis. One solution is to uniformly limit each cluster size to the smallest observed cluster size through random down-sampling. While this allows the ‘n’ in DE calculations to be uniform, this potentially worsens the problem of internal heterogeneity, which would remain roughly constant but in the setting of a lower ‘n’, increasing the variability in results for larger clusters.
To provide a comparator for the population of interest we focused on, we have performed this down sampling approach in order to compare Sox6^Tafa1 to another cluster within the VTA, Calb1^Stac, that also expresses high levels of Anxa1 and Aldh1a1 given the broad interest in these markers as proxies for vulnerability. The results of this comparison are now shown in Figure S10.
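The down-sampling step described above can be sketched as follows. This is a minimal illustration, assuming cells are represented by identifiers grouped per cluster. The function name, the seed and the cell identifiers are hypothetical; the cluster names are those mentioned in the response.

```python
import random


def downsample_clusters(cells_by_cluster, seed=0):
    """Randomly reduce every cluster to the size of the smallest cluster."""
    rng = random.Random(seed)
    n_min = min(len(cells) for cells in cells_by_cluster.values())
    return {
        name: rng.sample(list(cells), n_min)  # sampling without replacement
        for name, cells in cells_by_cluster.items()
    }


if __name__ == "__main__":
    clusters = {
        "Sox6^Tafa1": [f"cellA{i}" for i in range(300)],
        "Calb1^Stac": [f"cellB{i}" for i in range(120)],
    }
    for name, cells in downsample_clusters(clusters).items():
        print(name, len(cells))
```

Fixing the seed keeps the sampled subsets reproducible. Repeating the procedure with several seeds would give a sense of the sampling variability the authors mention for larger clusters.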
(5) On p12, the authors highlight Mir124a-1hg that encodes miR-124. This is upregulated in Figure 8D but the authors note this has been shown to be downregulated in PD patients and some PD mouse models. Can the authors comment on the directional difference?
We have adjusted the text to reflect this discrepancy and speculate on why this may be observed. In short, one hypothesis is that miR-124, given its proposed neuroprotective effects, is increased in DA neurons facing toxic metabolic insults as a compensatory response. In our prodromal model without observable degeneration, this could represent an early sign of cell stress. While speculative, in PD patients or overtly degenerative models, lack of compensatory miR-124 or fulminant cell death among vulnerable cells could result in an observed decrease in miR-124 expression.
(6) Lastly, can the authors comment on the selection of a LogFC cut-off of 0.15 for their DEG selection? I couldn't see this explained (apologies if I missed it).
The 0.15 cutoff was selected arbitrarily based on the observed range of fold changes seen among our differentially expressed genes. However, importantly, this cutoff was not used for defining DEGs for downstream analyses such as GSEA or GO, nor for defining significance of differential expression, which was done purely based on FDR-adjusted p values <0.05. The selection of 0.15 affects only the coloring seen in the volcano plot, which we have decided to move to supplemental figures given the uniformly small effect size seen in individual genes and a separate reviewer comment regarding concern in the field over differential expression testing methods in single-cell datasets. Instead, this figure now focuses on highlighting pathway- and gene-set level comparisons that can provide easier interpretation of small, but concordant changes across swaths of genes.
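The separation between significance calling and the fold-change cutoff can be sketched as follows. The response states only that significance was based on FDR-adjusted p values below 0.05, so the Benjamini-Hochberg procedure used here is an assumption about the adjustment method. The gene names and values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(results, alpha=0.05):
    """Genes with adjusted p < alpha; `results` maps gene -> (log_fc, p_value)."""
    genes = list(results)
    adjusted = benjamini_hochberg([results[g][1] for g in genes])
    # The log fold change is deliberately not used for the significance call.
    return [g for g, q in zip(genes, adjusted) if q < alpha]


if __name__ == "__main__":
    results = {
        "geneA": (0.10, 0.01),
        "geneB": (0.30, 0.04),
        "geneC": (-0.20, 0.03),
        "geneD": (0.05, 0.50),
    }
    print(significant_genes(results))
```

In this sketch a gene with a log fold change below 0.15 can still be called significant, consistent with the statement that the cutoff affects only the volcano plot coloring.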
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) In the MERFISH dataset, only around half of the DAergic cells (2,297 of 4,532) were successfully projected into the snRNA-seq UMAP space, based on a similarity score > 0.5. Additionally, key transcripts that were used to define the snRNA-seq clusters (such as Sox6) were not identified at all in the MERFISH dataset. This raises some questions about the ability to integrate and compare these datasets directly, which are not fully considered in the manuscript. These discrepancies are smoothed over using imputation, which allows specific class-defining genes such as Sox6 to be plotted on spatial coordinates in Figure 4D. However, imputation is not without caveats, and the appropriateness of the imputation is not well considered in the text.
We fully agree with the reviewer that the use of an imputation approach needs to be clarified and justified thoroughly. We added a sentence to better clarify the process of imputation on Page 9: “The imputed gene expression is extrapolated from anchors established from pairwise correspondences of cell expression levels between MERFISH and snRNA-Seq datasets.” This pair-wise cell correspondence, as defined by anchors, can be assessed using the Seurat confidence score. We acknowledge the fact that only about 50% of cells could confidently be transferred onto the snRNA-Seq data. This is the result of using a stringent confidence level of 0.5 (similar to previous publications, PMID: 38092916 & 38092912). We preferred mapping fewer high-confidence cells over potentially misrepresenting the spatial location of some of these clusters.
It is also important to demonstrate the reliability of gene imputation. Indeed, as pointed out by the reviewer, some probes such as Sox6 were not detected in the MERFISH dataset. To strengthen our data integration, and as already mentioned in the manuscript, we excluded 219 genes based on the deviation of average counts per cell between the datasets. The fact that the imputed expression of Sox6 perfectly reflects its well-characterized distribution (PMIDs: 25127144, 30104732, 25437550, 34758317) strengthened our confidence in our imputation pipeline. We also looked at the correlation of imputed gene expression with the detected transcripts in our MERFISH experiments. We added a new supplemental figure (S7) highlighting the correlations between MERFISH and imputed gene expression of 8 genes (4 each for the Sox6 and Calb1 families). Together, Fig S6 and S7 show the range of correlations between imputed and actual MERFISH transcripts. Altogether, we observe relatively high correlation between the number of detected transcripts per gene in the snRNA-Seq and MERFISH datasets.
In addition, we added a paragraph discussing limitations of gene expression imputation on page 17: “A strength of our study is that it utilizes advantages of each transcriptomic approach, the deep molecular profiling of individual cells using snRNA-Seq and the spatial resolution of MERFISH. For instance, we relied on gene expression imputation to ascribe expression level to genes not covered/detected in our MERFISH probe panel. Gene imputation as described by Stuart et al.(92) has been used in several recent studies integrating spatial and transcriptomic data(46, 47). It relies on identifying anchors that enable projection of MERFISH data onto the UMAP space of a snRNA-Seq dataset and then uses neighboring cells to extrapolate the expression of genes not included in our probe panel. This approach was used to impute Sox6 expression, which accurately reflects what has been reported in prior immunofluorescence and in situ hybridization studies(11, 27, 38, 43, 55). Moreover, imputed gene expression levels correlated strongly with MERFISH detected transcript for most genes further supporting our approach (Fig S6 and S7). Nevertheless, dataset integration has limitations that should be considered. First, imputed gene expression relies on the ability to identify reliable anchors linking the snRNA-Seq and MERFISH datasets. These anchors are determined in part by the choice of genes included on probe panels and thus could indirectly influence the reliability of imputed gene expression. Secondly, gene counts per cell in MERFISH are determined via segmentation of images, which is susceptible to artifacts and bias from centrally versus peripherally localized gene transcripts. In summary, although limitations are present in multi-modal transcriptomic analyses, merging these two approaches provided a molecular and spatial map of the DA system that could not have been resolved by either method alone.”
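The check described above, comparing imputed expression with the transcripts actually detected by MERFISH for the same cells, amounts to a per-gene correlation. The sketch below assumes a Pearson correlation (the response does not say which coefficient was used), and the function name and inputs are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers.

    Hypothetical use: x holds imputed expression and y holds MERFISH-detected
    counts for one gene across the same set of cells.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

A gene whose probe is absent from the panel, such as Sox6 here, cannot be validated this way, which is why the authors also compare its imputed pattern with prior histological reports.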
(2) In the discussion, the authors argue that the cellular classifications identified here for DA neurons are more likely to reflect discrete cell types than cell states. The rationale for this conclusion is largely based on the absence of subtype differences between wild-type and LRRK2 G2019S transgenic mice. I do not find this argument to be convincing, because it is still possible that certain subdivisions simply reflect dynamic cell states that are also not grossly altered in the mutant mouse. A stronger argument for this claim would be to include trajectory-based analyses that do not show predicted transition points between nearby or related clusters.
We thank the reviewer for pointing out this particular limitation, as differentiating “cell types” from “cell states” has been debated in the field for years with no consensus emerging on how to address the issue. As suggested, we performed a trajectory analysis using Monocle3 on both control and Lrrk2 samples. We built the trajectory map, taking cluster 20 as the starting node. To avoid potentially biased trajectories induced by different cell coverage, we downsampled the Lrrk2 condition to match the number of cells in the wildtype condition. As expected, since most of the DA clusters are not segregated in the UMAP space, the trajectory analysis showed predicted transitions between clusters (see Author response image 1A and 1B). Even though some clusters’ pseudotime scores were statistically different between the wildtype and Lrrk2 samples, they remained similar overall (Author response image 1C). This analysis suggests that the LRRK2^G2019S mutation induces a mild transcriptional perturbation but does not result in a major cell state drift. Indeed, we believe changes in the observed trajectory path would disappear as the number of cells analyzed increases. Because of this bias introduced by cell coverage, we prefer not to include this trajectory analysis in the manuscript to avoid misleading readers. Thus, as suggested by the reviewer, we softened our claim to “This suggests that our taxonomic scheme is agnostic to a mild perturbation such as LRRK2^G2019S, suggesting that our clusters are reflective of cell types, rather than cell states. It is possible that with more severe perturbations, such as a toxin lesion, more substantial alterations of taxonomic schemes are observed(86, 93). However, we expect that for mild insults, day to day behavioral changes, or pharmacological paradigms, our clusters will be resistant to changes, although individual gene levels may vary. Nonetheless, we cannot definitively confirm that a given DA neuron cannot convert from one subtype to another.
Ultimately, alternative approaches such as detailed fate mapping of clusters or RNAseq-based trajectory analyses with greater numbers of sampled cells could be used to resolve this question.”
Author response image 1.
A) Trajectory analysis of wildtype and B) LRRK2^G2019S samples. C) Pseudotime scores for each cluster across wildtype and Lrrk2 conditions. Error bars represent the confidence of error for false positives discovery rate of 5%.
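The downsampling step mentioned in this response, matching the number of cells between conditions before building trajectories, can be illustrated with a short sketch. It assumes simple random sampling without replacement down to the size of the smallest condition; the function name, the seed, and the input shape are hypothetical and not taken from the authors' code.

```python
import random


def downsample_to_match(cells_by_condition, seed=0):
    """Randomly subsample each condition to the size of the smallest one.

    cells_by_condition: dict mapping a condition label to a list of cell IDs
    (hypothetical input shape). A fixed seed keeps the draw reproducible.
    """
    rng = random.Random(seed)
    target = min(len(cells) for cells in cells_by_condition.values())
    return {
        condition: rng.sample(cells, target)
        for condition, cells in cells_by_condition.items()
    }
```

Because the draw is random, repeating it with several seeds is one way to check whether a trajectory difference is stable or merely reflects which cells happened to be retained.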
(3) The relationship between individual samples, GEMwell, and sequenced library should be clarified. If independent samples were combined into one GEMwell, this should be explicitly stated for clarity.
We have revised the text to better clarify the methodology. In brief, each of our 4 independent samples (2 control, 2 mutant; equal sexes per sample) was isolated from n=2 pooled mice (for a total of n=8 mice across the 4 samples). Each sample was processed in its own GEM well to produce 4 distinct libraries that were subsequently sequenced and analyzed as described.
(4) Please include more details on DEG testing in the manuscript, this is key for interpreting the robustness of certain findings. Ideally, pseudobulked comparisons would be used here (given concerns in the field that DEG testing where N = number of cells artificially inflates the statistical power, violates assumptions of independence, and results in false positive DEGs).
While we agree that pseudobulk analysis would be ideal for reducing false positives, our study, while exceptionally large in total numbers of DA cells profiled, was generated from 4 total 10X libraries as described above, without any mechanism to definitively demultiplex to the original n=8 source mice. Thus, pseudobulk comparisons would be performed using only n=2 per group, which is below the recommended sample size for these methods. Given this concern, we have moved the volcano plot from Figure 8D to the supplementals and added language to the methods and relevant figure legend acknowledging the limitation in Seurat’s default differential expression analysis methodology.
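To make the sample-size argument concrete, the sketch below shows what pseudobulking would do in this design: counts are summed per library, so the unit of replication becomes the library rather than the cell. The library labels and function names are hypothetical, and this is an illustration of the general technique, not the authors' analysis.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-cell counts into one profile per library.

    cells: iterable of (library_id, {gene: count}) pairs (hypothetical shape).
    Returns {library_id: {gene: summed_count}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for library_id, counts in cells:
        for gene, count in counts.items():
            totals[library_id][gene] += count
    return {library: dict(genes) for library, genes in totals.items()}


def replicates_per_group(library_to_group):
    """Count how many pseudobulk profiles (libraries) each group contributes."""
    counts = defaultdict(int)
    for group in library_to_group.values():
        counts[group] += 1
    return dict(counts)
```

With 4 libraries split evenly between genotypes, `replicates_per_group` returns 2 per group regardless of how many thousands of cells each library contains, which is the limitation the response describes.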
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
[…] Overall, this is an important paper that demonstrates that one model for transgenerational inheritance in C. elegans is not reproducible. This is important because it is not clear how many of the models of transgenerational inheritance reported in C. elegans are reproducible. The authors do demonstrate a memory for F1 embryos that could be a maternal effect, and the authors confirm that this is mediated by a systemic small RNA response. There are several points in the manuscript where a more positive tone might be helpful.
We would like to correct the statement made in the second to last sentence. The demonstration of an F1 response to PA14 was first reported by Moore et al., (2019) and then by Pereira et al., (2020) using a different behavioral assay. We merely confirmed these results in our hands, and confirmed the observation, first reported by Kaletsky et al., (2020), that sid-1 and sid-2 are required for this F1 response; although we did find that sid-1 and sid-2 are not required for the PA14-induced increase in daf-7p::gfp expression in ASI neurons in the F1 progeny of trained adults, which had not been addressed in the published work.
Yes, the intergenerational F1 response could be a maternal effect, but the in utero F1 embryos and their precursor germ cells were directly exposed to PA14 metabolites and toxins (non-maternal effect) as well as any parental response, whether mediated by small RNAs, prions, hormones, or other unknown information carriers. While the F1 aversion response does require sid-1 and sid-2, we would not presume that the substrate is therefore an RNA molecule, particularly because the systemic RNAi response supported by sid-1 and sid-2 is via long double-stranded RNA. To date, no evidence suggests that either protein transports small RNAs, particularly single-stranded RNAs.
Strengths:
The authors note that the high copy number daf-7::GFP transgene used by the Murphy group displayed variable expression and evidence for somatic silencing or transgene breakdown in the Hunter lab, as confirmed by the Murphy group. The authors nicely use single copy daf-7::GFP to show that neuronal daf-7::GFP is elevated in F1 but not F2 progeny with regards to the memory of PA14 avoidance, speaking to an intergenerational phenotype.
The authors nicely confirm that sid-1 and sid-2 are generally required for intergenerational avoidance of F1 embryos of moms exposed to PA14. However, these small RNA proteins did not affect daf-7::GFP elevation in the F1 progeny. This result is unexpected given previous reports that single copy daf-7::GFP is not elevated in F1 progeny of sid mutants. Because the Murphy group reported that daf-7 mutation abolishes avoidance for F1 progeny, this means that the sid genes function downstream of daf-7 or in parallel, rather than upstream as previously suggested.
The published report (Moore et al., 2019) shows only multicopy daf-7p::gfp results and does not address the daf-7p::gfp response in sid-1 or sid-2 mutants. Thus, our discovery that systemic RNAi, exogenous RNAi, and heritable RNAi mutants don’t disrupt elevated daf-7p::gfp in ASI neurons in the F1 progeny of PA14 trained P0’s is only unexpected with respect to the published models (Moore et al., 2019, Kaletsky et al., 2020).
The authors studied antisense small RNAs that change in Murphy data sets, identifying 116 mRNAs that might be regulated by sRNAs in response to PA14. Importantly, the authors show that the maco-1 gene, putatively targeted by piRNAs according to the Kaletsky 2020 paper, displays few siRNAs that change in response to PA14. The authors conclude that the P11 ncRNA of PA14, which was proposed to promote interkingdom RNA communication by the Murphy group, is unlikely to affect maco-1 expression by generating sRNAs that target maco-1 in C. elegans. The authors define 8 genes based on their analysis of sRNAs and mRNAs that might promote resistance to PA14, but they do not further characterize these genes' role in pathogen avoidance. The Murphy group might wish to consider following up on these genes and their possible relationship with P11.
Weaknesses:
This very thorough and interesting manuscript is at times pugnacious.
We reiterate that we never claimed that Moore et al., (2019) did not obtain their reported results. We simply stated that we could not replicate their results using the published methods and then failed in our search to identify variable(s) that might account for our results. In revising the manuscript, we have striven to make clear, unmuddied statements of facts and state that future investigations may provide independent evidence that supports the original claims and explains our divergent results.
Please explain more clearly what is High Growth media for E. coli in the text and methods, conveying why it was used by the Murphy lab, and if Normal Growth or High Growth is better for intergenerational heritability assays.
We added the standard recipes and the following explanations in the methods section to the revised text.
“NG plates minimally support OP50 growth, resulting in a thin lawn that facilitates visualization of larvae and embryos. HG plates (8X more peptone) support much higher OP50 growth, resulting in a thick bacterial lawn that supports larger worm populations.”
We have also included the following text in our presentation and discussion of the effects of growth conditions on worm choice in PA14 vs OP50 choice assays.
“Furthermore, because OP50 pathogenicity is enhanced by increased E. coli nutritive conditions (Garsin et al., 2003, Shi et al., 2006), the growth of F1-F4 progeny on High Growth (HG) plates (Moore et al., 2019; 2021b), which contain 8X more peptone than NG plates and therefore support much higher OP50 growth levels, immediately prior to the F1-F4 choice assays may further contribute to OP50 aversion among the control animals.”
We don’t know enough to claim that HG or NG media is better than the other for intergenerational assays, but they are different. Thus, switching between the two in a multigenerational experiment likely introduces unknown variability.
Reviewer #2 (Public Review):
This paper examines the reproducibility of results reported by the Murphy lab regarding transgenerational inheritance of a learned avoidance behavior in C. elegans. It has been well established by multiple labs that worms can learn to avoid the pathogen Pseudomonas aeruginosa (PA14) after a single exposure. The Murphy lab has reported that learned avoidance is transmittable to 4 generations and dependent on a small RNA expressed by PA14 that elicits the transgenerational silencing of a gene in C. elegans. The Hunter lab now reports that although they can reproduce inheritance of the learned behavior by the first generation (F1), they cannot reproduce inheritance in subsequent generations.
This is an important study that will be useful for the community. Although they fail to identify a "smoking gun", the study examines several possible sources for the discrepancy, and their findings will be useful to others interested in using these assays. The preference assay appears to work in their hands inasmuch as they are able to detect the learned behavior in the P0 and F1 generations, suggesting that the failure to reproduce the transgenerational effect is not due to trivial mistakes in the protocol. An obvious reason, however, to account for the differing results is that the culture conditions used by the authors are not permissive for the expression of the small RNA by PA14 that the Murphy lab identified as required for transgenerational inheritance. It would seem prudent for the authors to determine whether this small RNA is present in their cultures, or at least acknowledge this possibility.
We thank the reviewer for raising this issue and have added the following statement to this effect in the revised manuscript.
“We note that previous bacterial RNA sequence analysis identified a small non-coding RNA called P11 whose expression correlates with bacterial growth conditions that induce heritable avoidance (Kaletsky et al., 2020). Critically, C. elegans trained on a PA14 ΔP11 strain (which lacks this small RNA) still learn to avoid PA14, but their F1 and F2-F4 progeny fail to show an intergenerational or transgenerational response (Figure 3L in Kaletsky et al., 2020). The fact that we observed an intergenerational (F1) avoidance response is evidence that our PA14 growth conditions induce P11 expression.”
We believe that this addresses the concern raised here.
The authors should also note that their protocol was significantly different from the Murphy protocol (see comments below) and therefore it remains possible that protocol differences cumulatively account for the different results.
As suggested below, we have added to the supplemental documents the protocol we followed for the aversion assay. In our view, this document shows that our adjustments to the core protocol were minor. Furthermore, where possible, these adjustments were explicitly tested in side-by-side experiments for both the aversion assay and the daf-7p::gfp expression assay and presented in the manuscript.
To discover the source(s) of discrepancy between our results and the published results, we subsequently introduced variations to this core protocol to exclude likely variables (worm and bacteria growth temperatures, assay conditions, worm handling methods, bacterial culture and storage conditions, and some minor developmental timing issues). Again, where possible, the effects of these variations were tested in side-by-side experiments for both the aversion assay and the daf-7p::gfp expression assay, and the results were presented in or have now been added to the manuscript.
It remains possible that we misunderstood the published Murphy lab protocols, but we were highly motivated to replicate the results so that we could use these assays to investigate the reported RNAi-pathway-dependent steps; we therefore read every published version with extreme care.
Reviewer #3 (Public Review):
[…] Strengths:
(1) The authors provide a thorough description of their methods, and a marked-up version of a published protocol that describes how they adapted the protocol to their lab conditions. It should be easy to replicate the experiments.
As noted above in response to a suggestion by reviewer #2, we have replaced the annotated published protocol with the protocol that we followed. This will aid other groups' attempts to replicate our experimental conditions.
(2) The authors test the source of bacteria, growth temperature (of both C. elegans and bacteria), and light/dark husbandry conditions. They also supply all their raw data, so that the sample size for each testing plate can be easily seen (in the supplementary data). None of these variations appears to have a measurable effect on pathogen avoidance in the F2 generation, with all but one of the experiments failing to exhibit learned pathogen avoidance.
We note that the parallel analysis of daf-7p::gfp expression in ASI neurons was also tested for several of these conditions and also failed to replicate the published findings.
(3) The small RNA seq and mRNA seq analysis is well performed and extends the results shown in the original paper. The original paper did not give many details of the small RNA analysis, which was an oversight. Although not a major focus of this paper, it is a worthwhile extension of the previous work.
(4) It is rare that negative results such as these are accessible. Although the authors were unable to determine the reason that their results differ from those previously published, it is important to document these attempts in detail, as has been done here. Behavioral assays are notoriously difficult to perform and public discourse around these attempts may give clarity to the difficulties faced by a controversial field.
Thank you for your support. Choosing to pursue publication of these negative results was not an easy decision, and we thank members of the community for their support and encouragement.
Weaknesses:
(1) Although the "standard" conditions have been tested over multiple biological replicates, many of the potential confounders that may have altered the results have been tested only once or twice. For example, changing the incubation temperature to 25°C was tested in only two biological replicates (Exp 5.1 and 5.2) - and one of these experiments actually resulted in apparent pathogen avoidance inheritance in the F2 generation (but not in the F1). An alternative pathogen source was tested in only one biological replicate (Exp 3). Given the variability observed in the F2 generation, increasing biological replicates would have added to the strengths of the report.
We agree that our study was not exhaustive in our exploration of variables that might be interfering with our ability to detect F2 avoidance. We also note that some of these variables also failed (with many more independent experiments) to induce elevated daf-7p::gfp expression in ASI neurons in F2 progeny. Our goal was not to show that variation in some growth or assay condition would generate reproducible negative results; rather, the exploration was designed to tweak conditions to enable detection of a robust F2 response. Given the strength of the data presented in Moore et al., (2019), we expected that adjustment of the problematic variable would produce positive results apparent in a single replicate, which could then be followed up. If we had succeeded, then we would have documented the conditions that enabled robust F2 inheritance and would have explored molecular mechanisms that support this important but mysterious process.
(2) A key difference between the methods used here and those published previously, is an increase in the age of the animals used for training - from mostly L4 to mostly young adults. I was unable to find a clear example of an experiment when these two conditions were compared, although the authors state that it made no difference to their results.
We can state firmly that the apparent time delay did not affect P0 learned avoidance (new Figure S1) or, as documented in Table S1, daf-7p::gfp expression in ASI neurons. In our experience, training mostly L4’s on PA14 frequently failed to produce sufficient F1 embryos for both F1 avoidance assays or daf-7p::gfp measurements in ASI neurons and collection of F2 progeny. Indeed, in early attempts to detect heritable PA14 aversion, trained P0 and F1 progeny were not assayed in order to obtain sufficient F2’s for a choice assay. These animals failed to display aversion, but without evidence of successful P0 training or an F1 intergenerational response this was deemed a non-fruitful trouble-shooting approach. We have added supplemental Figure S1 which presents P0 choice assay results from experiments using younger trained animals that failed to produce sufficient F1’s to continue the inheritance experiments.
The different timing at the start of training between the two protocols may reflect the age of the recovered bleached P0 embryos. It is reasonable to assume that bleaching day 1 adults vs day 2 or 3 adults from the P-1 population could shift the average age of recovered P0 embryos by several hours. The Murphy protocol only states that P0 embryos were obtained by bleaching healthy adults. Regardless, if the hypothesis entertained here is true, that a several hour difference in larval/adult age during 24 hours of training affects F2 inheritance of learned aversion but does not affect P0 learned avoidance, then we would argue that this paradigm for heritable learned avoidance, as described in Moore et al., (2019, 2021), is not sufficiently robust for mechanistic investigations.
(3) The original paper reports a transgenerational avoidance effect up to the F5 generation. Although in this work the authors failed to see avoidance in the F2 generation, it would have been prudent to extend their tests for more generations in at least a couple of their experiments to ensure that the F2 generation was not an aberration (although this reviewer acknowledges that this seems unlikely to be the case).
We would point out that we also failed to robustly replicate the F2 response in the daf-7p::gfp expression assays. An F2-specific aberration that affects two different assays seems quite unlikely, and it remains unclear how we would interpret a positive result in F3 and F4 generations without a positive result in the F2 generation. Were we to further extend these investigations, we believe that exploration of additional culture conditions would warrant higher priority than extension of our results to the F3 and F4 generations.
Reviewing Editor Comments:
The reviewers' suggestions for improving the manuscript were mostly minor, to change the wording in some places and to add some more explanation regarding the methods.
What should be highlighted in the section on OP50 growth conditions is that the initial preference for PA14 in the Murphy lab has also been observed by multiple other labs (Bargmann, Kim, Zhang, Aballay). The fact that this preference was not observed by the Hunter lab is one of several indicators of subtle differences in the environment that might add up to explain the differences in results.
We agree that subtle known and unknown differences in OP50 and PA14 culture conditions can have measurable effects on the detection of PA14 attraction/aversion relative to OP50 attraction/aversion that could obscure or create the appearance of heritable effects between generations. We have added (see below) to the text a fuller description of the variability in the initial or naive preference observed in different laboratories using similar or variant 2-choice assays and culture conditions. It is worth emphasizing that direct comparison of the OP50 growth conditions specified in Moore et al., (2021) frequently revealed a much larger effect on the naïve choice index than is reported between labs (Figure 4).
“Naïve (OP50 grown) worms often show a bias towards PA14 in choice assays (Zhang et al., 2005; Ha et al., 2010; Moore et al., 2019; Pereira et al., 2020; Lalsiamthara and Aballay, 2022). This response, rather than representing an innate attraction to PA14, likely reflects the context of the worm's recent growth on OP50, a mild C. elegans pathogen (Garigan et al., 2002; Garsin et al., 2003; Shi et al., 2006). Thus, the naïve worms presented with a choice between a recently experienced mild pathogen (OP50) and a novel food choice (PA14) initially choose the novel food instead of the known mild pathogen (OP50 aversion).
In line with our results, some other groups have also reported higher naïve choice index scores (Lee et al., 2017). This variability in naïve choice may reflect differences in growth conditions of either the OP50 or PA14 bacteria. In addition, we note that among the studies that show naïve worm attraction to Pseudomonas (OP50 aversion) there are extensive methodological differences from the methods in Moore et al., (2019; 2021b), including differences in bacterial growth temperature, incubation time, whether the bacteria is diluted or concentrated prior to placement on the choice plates, the concentration of peptone in the choice plates, the length of the choice assay, and the inclusion of sodium azide in the choice assays (Zhang et al., 2005; Ha et al., 2010; Moore et al., 2019; Pereira et al 2020; Lalsiamthara and Aballay, 2022). Thus, the cause of the variability across published reports is not clear.”
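For readers unfamiliar with the metric, the naïve choice index discussed here can be sketched as a simple ratio of worm counts on the two bacterial spots. The formula below, (PA14 minus OP50) divided by the total on both spots, is an assumption based on the common convention for two-choice assays; the text in this response does not restate the formula, and the function name and arguments are illustrative.

```python
def choice_index(n_pa14, n_op50):
    """Two-choice index under the assumed convention (PA14 - OP50) / total.

    Positive values indicate a bias towards PA14 (OP50 aversion); negative
    values indicate PA14 avoidance. Worms found on neither spot are assumed
    to be excluded from the counts.
    """
    if n_pa14 < 0 or n_op50 < 0:
        raise ValueError("worm counts cannot be negative")
    total = n_pa14 + n_op50
    if total == 0:
        raise ValueError("no worms were scored on either spot")
    return (n_pa14 - n_op50) / total
```

Under this convention, a plate with 30 worms on PA14 and 70 on OP50 gives an index of -0.4, and any factor that shifts the naïve baseline changes how large a trained or inherited response must be to be detected.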
Overall, an emphasis on the absence of robustness of the reported results, rather than failure to reproduce them (which can always have many reasons), is appropriate.
We agree that an emphasis on robustness is appropriate and have modified the text throughout the manuscript to shift the emphasis to absence of robustness. This includes a change to the manuscript title, which is now “Reported transgenerational responses to Pseudomonas aeruginosa in C. elegans are not robust.”
A significant experimental addition would be some attempts to determine whether the bacterial PA14 pathogen in the authors' lab produces the P11 small RNA, which has been proposed to have a causal role in initiating the previously reported transgenerational inheritance.
We acknowledge in the revised manuscript that a subsequent publication (Kaletsky et al., 2020) identified a correlation between PA14 training conditions that induced transgenerational memory and the expression of P11, a P. aeruginosa small non-coding RNA (see our response above to Reviewer #2’s similar query). While testing for the presence of P11 in Harvard culture conditions would be an important assay in any study whose purpose was to investigate the proposed P11-mediated mechanism underlying the transgenerational responses reported by the Murphy Lab, our goal was rather to replicate the robust transgenerational (F2) responses to PA14 training and then to investigate in more detail how sid-1 and sid-2 contribute to transgenerational epigenetic inheritance. Neither sid-1 nor sid-2 is predicted to transport small RNAs or single-stranded RNAs; thus, testing for the presence of P11 is less relevant to our goals. Regardless, we note that Figure 3L in Kaletsky et al., (2020) showed that PA14 ΔP11 bacteria failed to induce an F1 avoidance response. Thus, the fact that we observed F1 avoidance implies that our culture conditions successfully induced P11 expression.
Reviewer #1 (Recommendations For The Authors):
The abstract could be more positive by concluding that 'We conclude that this example of transgenerational inheritance lacks robustness but instead reflects an example of small RNA-mediated intergenerational inheritance.'
As recommended, we have added additional clarifying information to the abstract and moderated the conclusion sentence.
“We did confirm that the dsRNA transport proteins SID-1 and SID-2 are required for the intergenerational (F1) inheritance of pathogen avoidance, but not for the F1 inheritance of elevated daf-7 expression. Furthermore, our reanalysis of RNA seq data provides additional evidence that this intergenerational inherited PA14 response may be mediated by small RNAs.”
“We conclude that this example of transgenerational inheritance lacks robustness, confirm that the intergenerational avoidance response, but not the elevated daf-7p::gfp expression in F1 progeny, requires sid-1 and sid-2, and identify candidate siRNAs and target genes that may mediate this intergenerational response.”
Differential expression of sRNAs or mRNAs might be better understood quantitatively by presenting data in scatterplots (Reed and Montgomery 2020) rather than in volcano plots.
We agree and have modified Figure 6A and 6B.
This statement in the main text might be unnecessary, as it affects the tenor of the conclusion of this significant manuscript. 'We note that none of the raw data for the published figures and unpublished replicate experiments . . . this hampered our ability to fully compare'.
We have rewritten this paragraph to focus on our goal: to identify the source of the discrepancy between our results and the published results. We considered discarding this statement but ultimately decided that our inability to directly compare our data to that of previously published work is a shortcoming of our study that deserves to be acknowledged and explained.
“Ideally, we would have compared our results with the published results (Moore et al., 2019), to possibly identify additional experimental parameters for further investigation; for example, a quantitative comparison of naïve choice in the P0 and F1 generations could help to determine the role of bacterial growth in the choice assay response. However, none of the raw data for the published figures and unpublished replicate experiments (Moore et al., 2019) were available on the publisher’s website or provided upon request to the corresponding author. In the absence of a quantitative comparison, it remains possible that an explanation for the discrepancies between our results and those of Moore et al., (2019) has been overlooked.”
The final sentence of the Discussion could be tempered and more positive by stating 'Thus independent reproducibility is of paramount concern, and we have tried to be completely transparent as a model for how heritability research should be conducted within the C. elegans community'.
Thank you. The suggested sentence nicely captures our intention. We now use it, almost verbatim, as our final sentence.
“Thus, independent reproducibility is of paramount concern, and we have tried to be completely transparent as a model for how heritability research should be presented within the C. elegans community.”
Reviewer #2 (Recommendations For The Authors):
Specific comments:
(1) Protocol: It is difficult to assess from the Methods the exact protocol used by the authors to assay food preference. The annotated Murphy protocol is not sufficient. The authors should provide their own protocol - a detailed lab-ready protocol where every step is outlined, and any steps that deviate from the Murphy lab protocol are called out.
Thank you for this excellent suggestion. We now include a protocol that documents the precise steps, timings, and controls that we followed (S1_aversion_protocol). We also include footnotes to both explain the reasons behind particular steps and to document known differences to the published protocol. Given the thoroughness of this suggested approach, we have thus removed the annotated version of Moore et al., (2021) from the revised submission.
(2) The authors imply in the methods that, unlike the Murphy lab, they did NOT use azide in the assay, and instead used 4°C to "freeze" the worms in place - It is not clear whether this method was used throughout all their assays and whether this could be a source of the difference. This change is NOT indicated in the annotated Murphy lab STAR Protocol they provide in the supplement.
We apologize for the lack of clarity. Concerned that azide might be interfering with our ability to detect heritable silencing, we tested and then used cold-induced rigor to preserve worm choice in some choice assays. This was not a change to the core protocol, but a variation used in some assays to determine whether azide could reduce our ability to detect heritable behavioral responses to PA14 exposure. As Moore et al., (2021) show, too much azide can affect measurement of worm choice. Too little or ineffective azide can also affect measurement of worm choice. Azide also affects bacteria (both OP50 and PA14), which could affect the production of molecules that attract or repel worms, much like performing the assay in light vs dark conditions can influence the measured choice index.
In our hands, cold-induced rigor worked well and within biological replicates was indistinguishable from azide (Figure S10). Thus, we include those results in our analysis and now indicate in Tables 2 and S2 and in Figures 1 and 3 which experiments used which method. As suggested, we now provide a detailed protocol that includes a note describing our precise method for cold-induced rigor.
Also, the number of worms used in each assay needs to be specified (same or different from Murphy protocol?), and whether any worms were "censored" as in the Murphy protocol, and if so on what basis.
While we published the exact number of worms scored in each assay (on each plate), it is unknown how this might compare to the results published in Moore et al., (2019), as the number of animals in the presented choice assays (either per plate or per choice) was not reported. Details on censoring, when to exclude data, and additional criteria to abandon an in-progress experiment are now provided in the protocol (S1_aversion_protocol).
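For readers unfamiliar with the choice index mentioned in this response, the following is a minimal sketch, not the scoring code used in the study. It assumes the commonly used definition, (worms on OP50 − worms on PA14) / (total worms scored on both spots); the function name and the counts are hypothetical and serve only to illustrate why per-plate and per-choice worm numbers are needed for a quantitative comparison.

```python
def choice_index(n_op50: int, n_pa14: int) -> float:
    """Choice index for one plate, under the assumed definition
    (OP50 - PA14) / (OP50 + PA14). Positive values mean PA14 avoidance."""
    total = n_op50 + n_pa14
    if total == 0:
        # A plate with no scored worms carries no information.
        raise ValueError("no worms scored on either spot")
    return (n_op50 - n_pa14) / total


# Hypothetical counts for a single plate (not data from the study).
example = choice_index(n_op50=75, n_pa14=25)
print(example)
```

Because the index is a ratio, plates with few scored worms yield noisier values than plates with many, which is one reason the number of animals per plate matters when comparing studies.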
(3) Several instances in the text cite changes in the protocol as producing "no meaningful differences" without referring to a specific experiment that supports that statement (for example, line 399 regarding azide).
We now include data and methods comparing azide and cold-induced rigor (Supplemental document S1_aversion_protocol, Supplemental Figure S10), and data showing the P0 choice index for 48-52 hour post-bleach L4/young adults (Supplemental Figure S1), in addition to the previously noted absence of effects due to differences in embryo bleaching protocols (Figures 2, 3 and Tables 1, 2, S1, and S2).
(4) If the authors want to claim the irreproducibility of the Murphy lab results, they should use the exact protocol used by the Murphy lab in its entirety. It is not sufficient to show that individual changes do not affect the outcome, since the protocol they use appears to include SEVERAL changes which could cumulatively affect the results. If the authors do not want to do this, they should at least acknowledge and summarize in their discussion ALL their protocol changes.
We acknowledge these minor differences between the protocols we followed and the published methods but disagree that they invalidate our results. We transparently present the effect of known minimal protocol changes. We also present analysis of possible invalidating variations (number of animals in a choice assay). We emphasize that in our hands both measures of TEI, the choice assay and measurement of daf-7p::gfp in ASI neurons, failed to replicate the published transgenerational results.
If the protocol is sensitive to how animals are counted, whether bleached embryos are mixed gently or vigorously or a few hours difference in age at training, then in our view this TEI paradigm is not robust.
See also our response to reviewer #3’s public reviews above.
(5) The authors acknowledge that "non-obvious growth culture differences" could account for the different results. In this respect, the Murphy lab has proposed that the transgenerational effect requires a small RNA expressed in PA14. The authors should check that this RNA is expressed in the cultures they grow in their lab and use for their experiments. This could potentially identify where the two protocols diverge.
The bacterial culture conditions and worm training procedures described in Moore et al., (2019) successfully produced trained P0 animals that transmitted a PA14 aversion response to their F1 progeny. In a subsequent publication (Kaletsky et al., 2020), the Murphy lab showed a correlation between the culture conditions that induce heritable avoidance and the expression of P11, a P. aeruginosa small non-coding RNA. As mentioned above in response to Reviewer #2’s public review and the Reviewing Editor’s comments to authors, the Murphy lab showed that PA14 ΔP11 bacteria fail to induce an F1 avoidance response (Figure 3L in Kaletsky et al., (2020)). Thus, the fact that we observed F1 avoidance implies that our culture conditions successfully induced P11 expression. We believe that this addresses the concern raised here. Furthermore, if P11 is not reliably expressed in pathogenic PA14, then the published model is unlikely to be relevant in a natural environment. Again, we thank the reviewer for raising this issue and have added this information to the revised manuscript (see above response to Reviewer #2’s Public Reviews).
(6) Legend to Figure 1: please clarify which experiments were done with which PA14 isolates especially for A-C. What is the origin of the N2 strain used here?
These details from Tables 2 and S2 have been added to Figure 1 panels A-C and Figure 3. Bristol N2, obtained from the CGC (reference 257), was used for aversion experiments.
(7) Growth conditions: "These young adults produced comparable P0 and F1 results (Figure 1, Figure 2, and Figure 3)." It is not clear from the text what specific figure panels need to be compared to examine the effect of the variables described in the text. Please indicate which figure panels should be compared (lines 70-95).
The information for the daf-7p::gfp expression experiments displayed in Figure 1 and Figure 2 is presented in Table 1 and Table S1. The data for P0 aversion training using younger animals is now presented in Figure S1.
Reviewer #3 (Recommendations For The Authors):
While overall I found this easy to follow and well-written, I think the clarity of the figures could be improved by incorporating some of the information from S2 into Figure 3. Besides the figure label listing the experiment (Exp1, Exp2, etc) it would be helpful to add pertinent information about the experiment. For example Exp 1.1 (light, 20°C), Exp1.2 (dark, 20°C), Exp 5 (25°C, light), etc.
Thank you for the suggestion. These details from Tables 2 and S2 have been added to Figures 1 A-C, and 3.
Citations
- Moore, R.S., Kaletsky, R., and Murphy, C.T. (2019). Piwi/PRG-1 Argonaute and TGF-beta Mediate Transgenerational Learned Pathogenic Avoidance. Cell 177, 1827-1841 e1812.
- Moore, R.S., Kaletsky, R., and Murphy, C.T. (2021). Protocol for transgenerational learned pathogen avoidance behavior assays in Caenorhabditis elegans. STAR Protoc 2, 100384.
- Kaletsky, R., Moore, R.S., Vrla, G.D., Parsons, L.R., Gitai, Z., and Murphy, C.T. (2020). C. elegans interprets bacterial non-coding RNAs to learn pathogenic avoidance. Nature 586, 445-451.
- Pereira, A.G., Gracida, X., Kagias, K., and Zhang, Y. (2020). C. elegans aversive olfactory learning generates diverse intergenerational effects. J Neurogenet 34, 378-388.
-
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This important study proposes a framework to understand and predict generalization in visual perceptual learning in humans based on form invariants. Using behavioral experiments in humans and by training deep networks, the authors offer evidence that the presence of stable invariants in a task leads to faster learning. However, this interpretation is promising but incomplete. It can be strengthened through clearer theoretical justification, additional experiments, and by rejecting alternate explanations.
We sincerely thank the editors and reviewers for their thoughtful feedback and constructive comments on our study. We have taken significant steps to address the points raised, particularly the concern regarding the incomplete interpretation of our findings.
In response to Reviewer #1, we have included long-term learning curves from the human experiments to provide a clearer demonstration of the differences in learning rates across invariants, and have incorporated a new experiment to investigate location generalization within each invariant stability level. These new findings have shifted the focus of our interpretation from learning rates to the generalization patterns both within and across invariants, which, alongside the observed weight changes across DNN layers, support our proposed framework based on the Klein hierarchy of geometries and the Reverse Hierarchy Theory (RHT).
We have also worked to clarify the conceptual foundation of our study and strengthen the theoretical interpretation of our results in light of the concerns raised by Reviewers #1 and #2. We have further expanded the discussion linking our findings to previous work on VPL generalization, and addressed the alternative explanations raised by Reviewer #1.
Reviewer #1 (Public Review):
Summary:
Visual Perceptual Learning (VPL) results in varying degrees of generalization to tasks or stimuli not seen during training. The question of which stimulus or task features predict whether learning will transfer to a different perceptual task has long been central in the field of perceptual learning, with numerous theories proposed to address it. This paper introduces a novel framework for understanding generalization in VPL, focusing on the form invariants of the training stimulus. Contrary to a previously proposed theory that task difficulty predicts the extent of generalization - suggesting that more challenging tasks yield less transfer to other tasks or stimuli - this paper offers an alternative perspective. It introduces the concept of task invariants and investigates how the structural stability of these invariants affects VPL and its generalization. The study finds that tasks with high-stability invariants are learned more quickly. However, training with low-stability invariants leads to greater generalization to tasks with higher stability, but not the reverse. This indicates that, at least based on the experiments in this paper, an easier training task results in less generalization, challenging previous theories that focus on task difficulty (or precision). Instead, this paper posits that the structural stability of stimulus or task invariants is the key factor in explaining VPL generalization across different tasks.
Strengths:
- The paper effectively demonstrates that the difficulty of a perceptual task does not necessarily correlate with its learning generalization to other tasks, challenging previous theories in the field of Visual Perceptual Learning. Instead, it proposes a significant and novel approach, suggesting that the form invariants of training stimuli are more reliable predictors of learning generalization. The results consistently bolster this theory, underlining the role of invariant stability in forecasting the extent of VPL generalization across different tasks.
- The experiments conducted in the study are thoughtfully designed and provide robust support for the central claim about the significance of form invariants in VPL generalization.
Weaknesses:
- The paper assumes a considerable familiarity with the Erlangen program and the definitions of invariants and their structural stability, potentially alienating readers who are not versed in these concepts. This assumption may hinder the understanding of the paper's theoretical rationale and the selection of stimuli for the experiments, particularly for those unfamiliar with the Erlangen program's application in psychophysics. A brief introduction to these key concepts would greatly enhance the paper's accessibility. The justification for the chosen stimuli and the design of the three experiments could be more thoroughly articulated.
We appreciate your feedback regarding the accessibility of our paper, particularly concerning the Erlangen Program and its associated concepts. We have revised the manuscript to include a more detailed introduction to Klein’s Erlangen Program in the second paragraph of the Introduction section. It provides clear descriptions and illustrative examples of the three invariants within the Klein hierarchy of geometries, as well as the nested relationships among them (see revised Figure 1). We believe this addition will enhance the accessibility of the theoretical framework for readers who may not be familiar with these concepts.
In the revised manuscript, we have also expanded the descriptions of the stimuli and experimental design for psychophysics experiments. These additions aim to clarify the rationale behind our choices, ensuring that readers can fully understand the connection between our theoretical framework and experimental approach.
- The paper does not clearly articulate how its proposed theory can be integrated with existing observations in the field of VPL. While it acknowledges previous theories on VPL generalization, the paper falls short in explaining how its framework might apply to classical tasks and stimuli that have been widely used in the VPL literature, such as orientation or motion discrimination with Gabors, vernier acuity, etc. It also does not provide insight into the application of this framework to more naturalistic tasks or stimuli. If the stability of invariants is a key factor in predicting a task's generalization potential, the paper should elucidate how to define the stability of new stimuli or tasks. This issue ties back to the earlier mentioned weakness: namely, the absence of a clear explanation of the Erlangen program and its relevant concepts.
We thank you for highlighting the need to integrate our proposed framework with existing observations in VPL research.
Prior VPL studies have not concurrently examined multiple geometrical invariants with varying stability levels, making direct comparisons challenging. However, we have identified tasks from the literature that align with specific invariants. For example, orientation discrimination with Gabors (e.g., Dosher & Lu, 2005) and texture discrimination tasks (e.g., Wang et al., 2016) involve Euclidean invariants, and circle versus square discrimination (e.g., Kraft et al., 2010) involves affine invariants. On the other hand, our framework does not apply to studies using stimuli that are unrelated to geometric transformations, such as motion discrimination with Gabors or random dots, depth discrimination, vernier acuity, spatial frequency discrimination, and contrast detection or discrimination.
By focusing on geometrical properties of stimuli, our work addresses a gap in the field and introduces a novel approach to studying VPL through the lens of invariant extraction, echoing Gibson’s ecological approach to perceptual learning.
In the revised manuscript, we have added a clearer explanation of Klein’s Erlangen Program, including the definition of geometrical invariants and their stability (the second paragraph of the Introduction section). Additionally, we have expanded the Discussion section to draw more explicit comparisons between our results and previous studies on VPL generalization, highlighting both similarities and differences, as well as potential shared mechanisms.
- The paper does not convincingly establish the necessity of its introduced concept of invariant stability for interpreting the presented data. For instance, consider an alternative explanation: performing in the collinearity task requires orientation invariance. Therefore, it's straightforward that learning the collinearity task doesn't aid in performing the other two tasks (parallelism and orientation), which do require orientation estimation. Interestingly, orientation invariance is more characteristic of higher visual areas, which, consistent with the Reverse Hierarchy Theory, are engaged more rapidly in learning compared to lower visual areas. This simpler explanation, grounded in established concepts of VPL and the tuning properties of neurons across the visual cortex, can account for the observed effects, at least in one scenario. This approach has previously been used/proposed to explain VPL generalization, as seen in (Chowdhury and DeAngelis, Neuron, 2008), (Liu and Pack, Neuron, 2017), and (Bakhtiari et al., JoV, 2020). The question then is: how does the concept of invariant stability provide additional insights beyond this simpler explanation?
We appreciate your thoughtful alternative explanation. While this explanation accounts for why learning the collinearity task does not transfer to the orientation task—which requires orientation estimation—it does not explain why learning the collinearity task fails to transfer to the parallelism task, which requires orientation invariance rather than orientation estimation. Instead, the asymmetric transfer observed in our study could be perfectly explained by incorporating the framework of the Klein hierarchy of geometries.
According to the Klein hierarchy, invariants with higher stability are more perceptually salient and detectable, and they are nested hierarchically, with higher-stability invariants encompassing lower-stability invariants (as clarified in the revised Introduction). In our invariant discrimination tasks, participants need only extract and utilize the most stable invariant to differentiate stimuli, optimizing their ability to discriminate that invariant while leaving the less stable invariants unoptimized.
For example:
- In the collinearity task, participants extract the most stable invariant, collinearity, to perform the task. Although the stimuli also contain differences in parallelism and orientation, these lower-stability invariants are not utilized or optimized during the task.
- In the parallelism task, participants optimize their sensitivity to parallelism, the highest-stability invariant available in this task, while orientation, a lower-stability invariant, remains irrelevant and unoptimized.
- In the orientation task, participants can only rely on differences in orientation to complete the task. Thus, the least stable invariant, orientation, is extracted and optimized.
This hierarchical process explains why training on a higher-stability invariant (e.g., collinearity) does not transfer to tasks involving lower-stability invariants (e.g., parallelism or orientation). Conversely, tasks involving lower-stability invariants (e.g., orientation) can aid in tasks requiring higher-stability invariants, as these higher-stability invariants inherently encompass the lower ones, resulting in a low-to-high-stability transfer effect.
This unique perspective underscores the importance of invariant stability in understanding generalization in VPL, complementing and extending existing theories such as the Reverse Hierarchy Theory. To help the reader understand our proposed theory, we have revised the Introduction and Discussion sections.
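The asymmetric transfer pattern described in this response can be sketched in a few lines. This is only an illustration of the stated prediction (training on a lower-stability invariant transfers to higher-stability invariants, but not the reverse), not a model from the manuscript; the numeric ranks and the function name are hypothetical and encode only the ordering orientation < parallelism < collinearity.

```python
# Hypothetical stability ranks; only their ordering matters.
STABILITY = {"orientation": 1, "parallelism": 2, "collinearity": 3}


def predicts_transfer(trained: str, tested: str) -> bool:
    """Return True if learning on `trained` is predicted to transfer to a
    different task `tested`, i.e. if `tested` has a more stable invariant."""
    return STABILITY[trained] < STABILITY[tested]


# Print the predicted transfer for every ordered pair of distinct tasks.
for trained in STABILITY:
    for tested in STABILITY:
        if trained != tested:
            outcome = "transfer" if predicts_transfer(trained, tested) else "no transfer"
            print(f"train {trained:>12} -> test {tested:<12}: {outcome}")
```

Under these assumptions, orientation training is predicted to transfer to both other tasks, parallelism training only to collinearity, and collinearity training to neither.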
- While the paper discusses the transfer of learning between tasks with varying levels of invariant stability, the mechanism of this transfer within each invariant condition remains unclear. A more detailed analysis would involve keeping the invariant's stability constant while altering a feature of the stimulus in the test condition. For example, in the VPL literature, one of the primary methods for testing generalization is examining transfer to a new stimulus location. The paper does not address the expected outcomes of location transfer in relation to the stability of the invariant. Moreover, in the affine and Euclidean conditions one could maintain consistent orientations for the distractors and targets during training, then switch them in the testing phase to assess transfer within the same level of invariant structural stability.
We thank you for this good suggestion. Using one of the primary methods for testing generalization, we performed a new psychophysics experiment to specifically examine how VPL generalizes to a new test location within a single invariant stability level (see Experiment 3 in the revised manuscript). The results show that the collinearity task exhibits greater location generalization compared to the parallelism task. This finding suggests the involvement of higher-order visual areas during high-stability invariant training, aligning with our theoretical framework based on the Reverse Hierarchy Theory (RHT). We attribute the unexpected location generalization observed in the orientation task to an additional requirement for spatial integration in its specific experimental design (as explained in the revised Results section “Location generalization within each invariant”). Moreover, based on previous VPL studies that have reported location specificity in orientation discrimination (Fiorentini and Berardi, 1980; Schoups et al., 1995; Shiu and Pashler, 1992), along with the substantial weight changes observed in lower layers of DNNs trained on the orientation task (Figure 9B, C), we infer that under a more controlled experimental design—such as the two-interval, two-alternative forced choice (2I2AFC) task employed in DNN simulations, where spatial integration is not required for any of the three invariants—the plasticity for orientation tasks would more likely occur in lower-order areas.
In the revised manuscript, we have discussed how these findings, together with the observed asymmetric transfer across invariants and the distribution of learning across DNN layers, collectively reveal the neural mechanisms underlying VPL of geometrical invariants.
- In the section detailing the modeling experiment using deep neural networks (DNN), the takeaway was unclear. While it was interesting to observe that the DNN exhibited a generalization pattern across conditions similar to that seen in the human experiments, the claim made in the abstract and introduction that the model provides a 'mechanistic' explanation for the phenomenon seems overstated. The pattern of weight changes across layers, as depicted in Figure 7, does not conclusively explain the observed variability in generalizations. Furthermore, the substantial weight change observed in the first two layers during the orientation discrimination task is somewhat counterintuitive. Given that neurons in early layers typically have smaller receptive fields and narrower tunings, one would expect this to result in less transfer, not more.
We appreciate your suggestion regarding the clarity of DNN modeling. While the DNN employed in our study recapitulates several known behavioral and physiological VPL effects (Manenti et al., 2023; Wenliang and Seitz, 2018), we acknowledge that the claim in the abstract and introduction suggesting the model provides a ‘mechanistic’ explanation for the phenomenon may have been overstated. The DNN serves primarily as a tool to generate important predictions about the underlying neural substrates and provides a promising testbed for investigating learning-related plasticity in the visual hierarchy.
In the revised manuscript, we have made significant improvements in explaining the weight change across DNN layers and its implication for understanding “when” and “where” learning occurs in the visual hierarchy. Specifically, in the Results ("Distribution of learning across layers") and Discussion sections, we have provided a more explicit explanation of the weight change across layers, emphasizing its implications for understanding the observed variability in generalizations and the underlying neural mechanisms.
Regarding the substantial weight change observed in the first two layers during the orientation discrimination task, we interpret this as evidence that VPL of this least stable invariant relies more on the plasticity of lower-level brain areas, which may explain the poorer generalization performance to new locations or features observed in the previous literature (Fiorentini and Berardi, 1980; Schoups et al., 1995; Shiu and Pashler, 1992). However, this does not imply that learning effects of this least stable invariant cannot transfer to more stable invariants. From the perspective of Klein’s Erlangen program, the extraction of more stable invariants is implicitly required when processing less stable ones, which leads to their automatic learning. Additionally, within the framework of the Reverse Hierarchy Theory (RHT), plasticity in lower-level visual areas affects higher-level areas that receive the same low-level input, due to the feedforward anatomical hierarchy of the visual system (Ahissar and Hochstein, 2004, 1997; Markov et al., 2013; McGovern et al., 2012). Therefore, the improved signal from lower-level plasticity resulting from training on less stable invariants can enhance higher-level representations of more stable invariants, facilitating the transfer effect from low- to high-stability invariants.
Reviewer #2 (Public Review):
The strengths of this paper are clear: The authors are asking a novel question about geometric representation that would be relevant to a broad audience. Their question has a clear grounding in pre-existing mathematical concepts, that, to my knowledge, have been only minimally explored in cognitive science. Moreover, the data themselves are quite striking, such that my only concern would be that the data seem almost *too* clean. It is hard to know what to make of that, however. From one perspective, this is even more reason the results should be publicly available. Yet I am of the (perhaps unorthodox) opinion that reviewers should voice these gut reactions, even if it does not influence the evaluation otherwise. Below I offer some more concrete comments:
(1) The justification for the designs is not well explained. The authors simply tell the audience in a single sentence that they test projective, affine, and Euclidean geometry. But despite my familiarity with these terms -- familiarity that many readers may not have -- I still had to pause for a very long time to make sense of how these considerations led to the stimuli that were created. I think the authors must, for a point that is so central to the paper, thoroughly explain exactly why the stimuli were designed the way that they were and how these designs map onto the theoretical constructs being tested.
We thank you for reminding us to better justify our experimental designs. In response, we have provided a detailed introduction to Klein’s Erlangen Program, describing projective, affine, and Euclidean geometries, their associated invariants, and the hierarchical relationships among them (see revised Introduction and Figure 1).
All experiments in our study employed stimuli with varying structural stability (collinearity, parallelism, orientation; see revised Figures 2 and 4), enabling us to investigate the impact of invariant stability on visual perceptual learning. Experiment 1 was adapted from paradigms studying the "configural superiority effect," commonly used to assess the salience of geometric invariants. This paradigm was chosen to align with and build upon related research, thereby enhancing comparability across studies. To address the limitations of Experiment 1 (as detailed in our Results section), Experiments 2, 3, and 4 employed a 2AFC (two-alternative forced choice)-like paradigm, which is more common in visual perceptual learning research. Additionally, we have expanded the descriptions of our stimuli and designs, aiming to ensure clarity and accessibility for all readers.
(2) I wondered if the design in Experiment 1 was flawed in one small but critical way. The goal of the parallelism stimuli, I gathered, was to have a set of items that is not parallel to the other set of items. But in doing that, isn't the manipulation effectively the same as the manipulation in the orientation stimuli? Both functionally involve just rotating one set by a fixed amount. (Note: This does not seem to be a problem in Experiment 2, in which the conditions are more clearly delineated.)
We appreciate your insightful observation regarding the design of Experiment 1 and the potential similarity between the manipulations of the parallelism and orientation stimuli.
The parallelism and orientation stimuli in Experiment 1 were originally introduced by Olson and Attneave (1970) to support line-based models of shape coding and were later adapted by Chen (1986) to measure the relative salience of different geometric properties. In the parallelism stimuli, the odd quadrant differs from the others in line slope, while in the orientation stimuli, the odd quadrant contains identical line segments but differs in the direction pointed by their angles. The faster detection of the odd quadrant in the parallelism stimuli compared to the orientation stimuli has traditionally been interpreted as evidence supporting line-based models of shape coding. However, as Chen (1986, 2005) proposed, the concept of invariants over transformations offers a different interpretation: in the parallelism stimuli, the fact that line segments share the same slope essentially implies that they are parallel, and the discrimination may actually be based on parallelism. This reinterpretation suggests that the superior performance with parallelism stimuli reflects the relative perceptual salience of parallelism (an affine invariant property) compared to the orientation of angles (a Euclidean invariant property).
In the collinearity and orientation tasks, the odd quadrant and the other quadrants differ in their corresponding geometries, such as being collinear versus non-collinear. However, in the parallelism task, participants could rely either on the non-parallel relationship between the odd quadrant and the other quadrants or on the difference in line slope to complete the task, which can be seen as effectively similar to the manipulation in the orientation stimuli, as you pointed out. Nonetheless, this set of stimuli and the associated paradigm have been used in prior studies to address questions about Klein’s hierarchy of geometries (Chen, 2005; Wang et al., 2007; Meng et al., 2019). Given its historical significance and the importance of ensuring comparability with previous research, we adopted this set of stimuli despite its imperfections. Other limitations of this paradigm are discussed in the Results section (“The paradigm of ‘configural superiority effects’ with reaction time measures”), and optimized experimental designs were implemented in Experiments 2, 3, and 4 to produce more reliable results.
(3) I wondered if the results would hold up for stimuli that were more diverse. It seems that a determined experimenter could easily design an "adversarial" version of these experiments for which the results would be unlikely to replicate. For instance: In the orientation group in Experiment 1, what if the odd-one-out was rotated 90 degrees instead of 180 degrees? Intuitively, it seems like this trial type would now be much easier, and the pattern observed here would not hold up. If it did hold up, that would provide stronger support for the authors' theory.
It is not enough, in my opinion, to simply have some confirmatory evidence of this theory. One would have to have thoroughly tested many possible ways that theory could fail. I'm unsure that enough has been done here to convince me that these ideas would hold up across a more diverse set of stimuli.
Thanks for your nice suggestion to validate our results using more diverse stimuli. However, the limitations of Experiment 1 make it less suitable for rigorous testing of diverse or "adversarial" stimuli. In addition to the limitation discussed in response to (2), another issue is that participants may rely on grouping effects among shapes in the quadrants, rather than solely extracting the geometrical invariants that are the focus of our study. As a result, the reaction times measured in this paradigm may not exclusively reflect the extraction time of geometrical invariants but could also be influenced by these grouping effects.
Therefore, we have shifted our focus to the improved design used in Experiment 2 to provide stronger evidence for our theory. Building on this more robust design, we have extended our investigations to study location generalization (revised Experiment 3) and long-term learning effects (revised Figure 6—figure supplement 2). These enhancements allow us to provide stronger evidence for our theory while addressing potential confounds present in Experiment 1.
While we did not explicitly test the 90-degree rotation scenario in Experiment 1, future studies could employ a more diverse set of stimuli within the Experiment 2 framework to better understand the limits and applicability of our theoretical predictions. We appreciate this suggestion, as it offers a valuable direction for further research.
Reviewer #1 (Recommendations For The Authors):
Major comments:
- A concise introduction to the Erlangen program, geometric invariants, and their structural stability would greatly enhance the paper. This would not only clarify these concepts for readers unfamiliar with them but also provide a more intuitive explanation for the choice of tasks and stimuli used in the study.
- I recommend adding a section that discusses how this new framework aligns with previous observations in VPL, especially those involving more classical stimuli like Gabors, random dot kinematograms, etc. This would help in contextualizing the framework within the broader spectrum of VPL research.
- Exploring how each level of invariant stability transfers within itself would be an intriguing addition. Previous theories often consider transfer within a condition. For instance, in an orientation discrimination task, a challenging training condition might transfer less to a new stimulus test location (e.g., a different visual quadrant). Applying a similar approach to examine how VPL generalizes to a new test location within a single invariant stability level could provide insightful contrasts between the proposed theory and existing ones. This would be particularly relevant in the context of Experiment 2, which could be adapted for such a test.
- I suggest including some example learning curves from the human experiment for a more clear demonstration of the differences in the learning rates across conditions. Easier conditions are expected to be learned faster (i.e. plateau faster to a higher accuracy level). The learning speed is reported for the DNN but not for the human subjects.
- In the modeling section, it would be beneficial to focus on offering an explanation for the observed generalization as a function of the stability of the invariants. As it stands, the neural network model primarily demonstrates that DNNs replicate the same generalization pattern observed in human experiments. While this finding is indeed interesting, the model currently falls short of providing deeper insights or explanations. A more detailed analysis of how the DNN model contributes to our understanding of the relationship between invariant stability and generalization would significantly enhance this section of the paper.
Minor comments:
- Line 46: "it is remains" --> "it remains"
- Larger font sizes for the vertical axis in Figure 6B would be helpful.
We thank you for your detailed and constructive comments, which have significantly helped us improve the clarity and rigor of our manuscript. Below, we provide a response to each point raised.
Major Comments
(1) A concise introduction to the Erlangen program, geometric invariants, and their structural stability:
We appreciate your suggestion to provide a clearer introduction to these foundational concepts. In the revised manuscript, we have added a dedicated section in the Introduction that offers a concise explanation of Klein’s Erlangen Program, including the concept of geometric invariants and their structural stability. This addition aims to make the theoretical framework more accessible to readers unfamiliar with these concepts and to better justify the choice of tasks and stimuli used in the study.
(2) Contextualizing the framework within the broader spectrum of VPL research:
We have expanded the Discussion section to better integrate our framework with previous VPL studies that reported generalization, including those using classical stimuli such as Gabors (Dosher and Lu, 2005; Hung and Seitz, 2014; Jeter et al., 2009; Liu and Pack, 2017; Manenti et al., 2023) and random dot kinematograms (Chang et al., 2013; Chen et al., 2016; Huang et al., 2007; Liu and Pack, 2017). In particular, we now discuss the similarities and differences between our findings and these earlier studies, exploring potential shared mechanisms underlying VPL generalization across different types of stimuli. These additions aim to contextualize our framework within the broader field of VPL research and highlight its relevance to existing literature.
(3) Exploring transfer within each invariant stability level:
In response to this insightful suggestion, we have added a new psychophysics experiment in the revised manuscript (Experiment 3) to examine how VPL generalizes to a new test location within the same invariant stability level. This experiment provides an opportunity to further explore the neural substrates underlying VPL of geometrical invariants, offering a contrast to existing theories and strengthening the connection between our framework and location generalization findings in the VPL literature.
(4) Including example learning curves from the human experiments:
We appreciate your suggestion to include learning curves for human subjects. In the revised manuscript, we have added learning curves of long-term VPL (see revised Figure 6—figure supplement 2) to track the temporal learning processes across invariant conditions. Interestingly, and in contrast to the results reported in the DNN simulations, these curves show that less stable invariants are learned faster and exhibit greater magnitudes of learning. We interpret this discrepancy as a result of differences in initial performance levels between humans and DNNs, as discussed in the revised Discussion section.
(5) Offering a deeper explanation of the DNN model's findings:
We acknowledge your concern that the modeling section primarily demonstrates that DNNs replicate human generalization patterns without offering deeper mechanistic insights. To address this, we have expanded the Results and Discussion sections to more explicitly interpret the weight change patterns observed across DNN layers in relation to invariant stability and generalization. We discuss how the model contributes to understanding the observed generalization within and across invariants with different stability, focusing on the neural network's role in generating predictions about the neural mechanisms underlying these effects.
Minor Comments
(1) Line 46: Correction of “it is remains” to “it remains”:
We have corrected this typo in the revised manuscript.
(2) Vertical axis font size in Figure 6B:
We have increased the font size of the vertical axis labels in revised Figure 8B for improved readability.
Reviewer #2 (Recommendations For The Authors):
(1) There are many details throughout the paper that are confusing, such as the caption for Figure 4, which does not appear to correspond to what is shown (and is perhaps a copy-paste of the caption for Experiment 1?). Similarly, I wasn't sure about many methodological details, like: How participants made their second response in Experiment 2? It says somewhere that they pressed the corresponding key to indicate which one was the target, but I didn't see anything explaining what that meant. Also, I couldn't tell if the items in the figures were representative of all trials; the stimuli were described minimally in the paper.
(2) The language in the paper felt slightly off at times, in minor but noticeable ways. Consider the abstract. The word "could" in the first sentence is confusing, and, more generally, that first sentence is actually quite vague (i.e., it just states something that would appear to be true of any perceptual system). In the following sentence, I wasn't sure what was meant by "prior to be perceived in the visual system". Though I was able to discern what the authors were intending to say most times, I was required to "read between the lines" a bit. This is not to fault the authors. But these issues need to be addressed, I think.
(1) We sincerely apologize for the oversight regarding the caption for (original) Figure 4, and thank you for pointing out this error. In the revised manuscript, we have corrected the caption for Figure 4 (revised Figure 5) and ensured it accurately describes the content of the figure. Additionally, we have strengthened the descriptions of the stimuli and tasks in both the Materials and Methods section and the captions for (revised) Figures 4 and 5 to provide a clearer and more comprehensive explanation of Experiment 2. These revisions aim to help readers fully understand the experimental design and methodology.
(2) We appreciate your feedback regarding the clarity and precision of the language in the manuscript. We acknowledge that some expressions, particularly in the abstract, were unclear or imprecise. In the revised manuscript, we have rewritten the abstract to improve clarity and ensure that the statements are concise and accurately convey our intended meaning. Additionally, we have thoroughly reviewed the entire manuscript to address any other instances of ambiguous language, aiming to eliminate the need for readers to "read between the lines." We are grateful for your suggestions, which have helped us enhance the overall readability of the paper.
-
-
-
www.researchsquare.com www.researchsquare.com
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This study focuses on the role of GABA in semantic memory and its neuroplasticity. The researchers stimulated the left ATL and control site (vertex) using cTBS, measured changes in GABA before and after stimulation using MRS, and measured changes in BOLD signals during semantic and control tasks using fMRI. They analyzed the effects of stimulation on GABA, BOLD, and behavioral data, as well as the correlation between GABA changes and BOLD changes caused by the stimulation. The authors also analyzed the relationship between individual differences in GABA levels and behavioral performance in the semantic task. They found that cTBS stimulation led to increased GABA levels and decreased BOLD activity in the ATL, and these two changes were highly correlated. However, cTBS stimulation did not significantly change participants' behavioral performance on the semantic task, although behavioral changes in the control task were found after stimulation. Individual levels of GABA were significantly correlated with individuals' accuracy on the semantic task, and the inverted U-shaped (quadratic) function provides a better fit than the linear relationship. The authors argued that the results support the view that GABAergic inhibition can sharpen activated distributed semantic representations. They also claimed that the results revealed, for the first time, a non-linear, inverted-U-shape relationship between GABA levels in the ATL and semantic function, by explaining individual differences in semantic task performance and cTBS responsiveness.
Strengths:
The findings of the research regarding the increase of GABA and decrease of BOLD caused by cTBS, as well as the correlation between the two, appear to be reliable. This should be valuable for understanding the biological effects of cTBS.
We appreciated R1’s positive evaluation of our manuscript.
Weaknesses:
Regarding the behavioral effects of GABA on semantic tasks, especially its impact on neuroplasticity, the results presented in the article are inadequate to support the claims made by the authors. There are three aspects of results related to this: 1) the effects of cTBS stimulation on behavior, 2) the positive correlation between GABA levels and semantic task accuracy, and 3) the nonlinear relationship between GABA levels and semantic task accuracy. Among these three pieces of evidence, the clearest one is the positive correlation between GABA levels and semantic task accuracy. However, it is important to note that this correlation already exists before the stimulation, and there are no results supporting that it can be modulated by the stimulation. In fact, cTBS significantly increases GABA levels but does not significantly improve performance on semantic tasks. According to the authors' interpretation of the results in Table 1, cTBS stimulation may have masked the practice effects that were supposed to occur. In other words, the stimulation decreased rather than enhanced participants' behavioral performance on the semantic task.
The stimulation effect on behavioral performance could potentially be explained by the nonlinear relationship between GABA and performance on semantic tasks proposed by the authors. However, the current results are also insufficient to support the authors' hypothesis of an inverted U-shaped curve. Firstly, in Figure 3C and Figure 3D, the last one-third of the inverted U-shaped curve does not have any data points. In other words, as the GABA level increases the accuracy of the behavior first rises and then remains at a high level. This pattern of results may be due to the ceiling effect of the behavioral task's accuracy, rather than an inverted U-shaped ATL GABA function in semantic memory. Second, the article does not provide sufficient evidence to support the existence of an optimal level of GABA in the ATL. Fortunately, this can be tested with additional data analysis. The authors can estimate, based on pre-stimulus data from individuals, the optimal level of GABA for semantic functioning. They can then examine two expectations: first, participants with pre-stimulus GABA levels below the optimal level should show improved behavioral performance after stimulation-induced GABA elevation; second, participants with pre-stimulus GABA levels above the optimal level should exhibit a decline in behavioral performance after stimulation-induced GABA elevation. Alternatively, the authors can categorize participants into groups based on whether their behavioral performance improves or declines after stimulation, and compare the pre- and post-stimulus GABA levels between the two groups. If the improvement group shows significantly lower pre-stimulus GABA levels compared to the decline group, and both groups exhibit an increase in GABA levels after stimulation, this would also provide some support for the authors' hypothesis.
Another issue in this study is the confounding of stimulation effects and practice effects. According to the results, there is a significant improvement in performance after the stimulation, at least in the control task, which the authors suggest may reflect a practice effect. The authors argue that the results in Table 1 suggest a similar practice effect in the semantic task, but it is masked by the stimulation of the ATL. However, since no significant effects were found in the ANOVA of the semantic task, it is actually difficult to draw a conclusion. This potential confound increases the risk in data analysis and interpretation. Specifically, for Figure 3D, if practice effects are taken into account, the data before and after the stimulation should not be analyzed together.
We thank R1 for these thoughtful comments. Due to the limited dataset, it is challenging to determine the optimal level of ATL GABA. Instead, we re-grouped the participants into responders and non-responders to address the issues R1 raised. It is important to note that we applied cTBS, an inhibitory protocol, over the ATL; this protocol decreases cortical excitability within the target region and semantic task performance (Chiou et al., 2014; Jung and Lambon Ralph, 2016). Therefore, responders and non-responders were classified according to their semantic performance changes after the ATL stimulation: subjects showing a decrease in task performance after ATL cTBS compared to baseline were defined as responders, whereas subjects showing no change or an increase in task performance after ATL cTBS were defined as non-responders. To combine accuracy and RT, we used the inverse efficiency (IE) score (RT / (1 − proportion of errors)) as the measure of individual semantic task performance. Accordingly, we had 7 responders and 10 non-responders.
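As a minimal sketch of the scoring and grouping rule described above, the following Python snippet computes an inverse efficiency score and labels a participant as a responder or non-responder. The function names, the millisecond unit, and the example values are illustrative assumptions; this is not the analysis code used in the study.

```python
def inverse_efficiency(mean_rt, error_rate):
    """Inverse efficiency (IE): mean RT divided by the proportion of correct responses.

    Higher IE means poorer performance. `mean_rt` can be in any consistent
    time unit (milliseconds assumed here); `error_rate` is a proportion in [0, 1).
    """
    if not 0.0 <= error_rate < 1.0:
        raise ValueError("error_rate must be in [0, 1)")
    return mean_rt / (1.0 - error_rate)


def classify_participant(pre_ie, post_ie):
    """Label a participant from the change in semantic-task IE after ATL cTBS.

    An increase in IE (poorer performance) marks a responder; no change or a
    decrease in IE marks a non-responder, following the rule stated in the text.
    """
    return "responder" if post_ie > pre_ie else "non-responder"


# Hypothetical participant: these values are made up for illustration only.
pre = inverse_efficiency(mean_rt=800.0, error_rate=0.20)
post = inverse_efficiency(mean_rt=900.0, error_rate=0.25)
label = classify_participant(pre, post)
```

Because IE grows with both slower responses and more errors, a single comparison of pre- and post-stimulation IE captures a performance drop on either dimension.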
Recently, we demonstrated that the pre-stimulation neurochemical profile of the ATL was associated with cTBS responsiveness on semantic processing (Jung et al., 2022). Specifically, the baseline GABA and Glx levels in the ATL predicted cTBS-induced changes in semantic task performance: individuals with higher GABA and lower Glx in the ATL showed larger inhibitory effects and were more likely to be responders, that is, participants whose semantic task performance decreased after ATL stimulation. Importantly, the baseline semantic task performance was significantly better in responders compared to non-responders. Thus, we expected that responders would show better semantic task performance along with higher ATL GABA levels in their pre-stimulation session relative to non-responders. We performed planned t-tests to examine the differences in task performance and ATL GABA levels in the pre-stimulation session. The results revealed that responders had lower IE (better task performance, t = -1.756, p = 0.050) and higher ATL GABA levels (t = 2.779, p = 0.006) in the pre-stimulation session (Figure 3).
In addition, we performed planned paired t-tests to investigate the cTBS effects on semantic task performance and regional ATL GABA levels within each group (responders and non-responders). Responders showed a significant increase in IE (poorer performance, t = -1.937, p = 0.050) and in ATL GABA levels (t = -2.203, p = 0.035) after ATL cTBS. Non-responders showed decreased IE (better performance, t = 2.872, p = 0.009) and increased GABA levels in the ATL (t = -3.912, p = 0.001) after the ATL stimulation. The results are summarised in Figure 3.
It should be noted that there was no difference between the responders and non-responders in the control task performance at the pre-stimulation session. Both groups showed better performance after the ATL stimulation, consistent with practice effects (Author response image 1 below).
Author response image 1.
As we expected, our results replicated the previous findings (Jung et al., 2022) that responders who showed the inhibitory effects on semantic task performance after the ATL stimulation had higher GABA levels in the ATL than non-responders at their baseline, the pre-stimulation session. Importantly, cTBS increased ATL GABA levels in both responders and non-responders. These findings support our hypothesis – the inverted U-shaped ATL GABA function for cTBS response (Figure 4B). cTBS over the ATL resulted in the inhibition of semantic task performance among individuals initially characterized by higher concentrations of GABA in the ATL, indicative of better baseline semantic capacity. Conversely, the impact of cTBS on individuals with lower semantic ability and relatively lower GABA levels in the ATL was either negligible or exhibited a facilitatory effect. This study posits that individuals with elevated GABA levels in the ATL tend to be more responsive to cTBS, displaying inhibitory effects on semantic task performance (responders). On the contrary, those with lower GABA concentrations and reduced semantic ability were less likely to respond or even demonstrated facilitatory effects following ATL cTBS (non-responders). Moreover, our findings suggest the critical role of the baseline neurochemical profile in individual responsiveness to cTBS in the context of semantic memory. This highlights substantial variability among individuals in terms of semantic memory and its plasticity induced by cTBS.
Our analyses with responders and non-responders have highlighted significant inter-individual variability in both pre- and post-ATL stimulation sessions, including behavioural outcomes and ATL GABA levels. Responders showed distinctive neurochemical profiles in the ATL, associated with their task performance and responsiveness to cTBS in semantic memory. Our findings suggest that responders may possess an optimal level of ATL GABA conducive to efficient semantic processing. This results in enhanced semantic task performance and increased responsiveness to cTBS, leading to inhibitory effects on semantic processing following an inverted U-shaped function. In contrast, non-responders, characterized by relatively lower ATL GABA levels, exhibited poorer semantic task performance compared to responders at the baseline. The cTBS-induced increase in GABA may contribute to their subsequent improvement in semantic performance. These results substantiate our hypothesis regarding the inverted U-shape function of ATL GABA and its relationship with semantic behaviour.
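To make the inverted U-shape argument concrete, the sketch below fits linear and quadratic least-squares models to GABA and accuracy values, locates the peak of the quadratic, and checks whether that peak lies inside the sampled GABA range (the concern R1 raised about the unsampled part of the curve). The data and all function names are hypothetical, and the solver is a plain normal-equations implementation chosen to keep the example dependency-free; it is not the regression procedure used in the study.

```python
def fit_polynomial(x, y, degree):
    """Least-squares coefficients [c0, c1, ...] for y ~ c0 + c1*x + ... + cd*x**d."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y for a design matrix with columns
    # x**0 .. x**d, stored as an augmented matrix.
    m = [
        [sum(xi ** (r + c) for xi in x) for c in range(n)]
        + [sum((xi ** r) * yi for xi, yi in zip(x, y))]
        for r in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def residual_sum_of_squares(x, y, coef):
    """Sum of squared residuals of the fitted polynomial."""
    def predict(xi):
        return sum(c * xi ** k for k, c in enumerate(coef))
    return sum((yi - predict(xi)) ** 2 for xi, yi in zip(x, y))


def quadratic_peak(coef):
    """x-position of the vertex of c0 + c1*x + c2*x**2 (a maximum when c2 < 0)."""
    return -coef[1] / (2.0 * coef[2])


# Hypothetical data (made up): ATL GABA level vs. semantic task accuracy.
gaba = [1.0, 1.5, 2.0, 2.5, 3.0]
accuracy = [0.70, 0.85, 0.90, 0.85, 0.70]

linear = fit_polynomial(gaba, accuracy, 1)
quadratic = fit_polynomial(gaba, accuracy, 2)
peak = quadratic_peak(quadratic)
# An inverted U is only supported if data exist on both sides of the peak.
peak_is_sampled = min(gaba) < peak < max(gaba)
```

A quadratic never fits worse than a line in terms of residual sum of squares, so a formal comparison would still need a penalised criterion or an F-test; the range check is a simple guard against reading a ceiling effect as the descending arm of the curve.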
To address the confounding of stimulation effects and practice effects in behavioural data, we used the IE and computed cTBS-induced performance changes (POST-PRE). Employing a 2 x 2 ANOVA with stimulation (ATL vs. Vertex) and task (Semantic vs. Control) as within-subject factors, we found a significant task effect (F(1, 15) = 6.656, p = 0.021) and a marginally significant interaction between stimulation and task (F(1, 15) = 4.064, p = 0.061). Post hoc paired t-tests demonstrated that ATL stimulation significantly decreased semantic task performance (positive IE) compared to both vertex stimulation (t = 1.905, p = 0.038) and control task (t = 2.814, p = 0.006). Facilitatory effects (negative IE) were observed in the control stimulation and control task. Please see Author response image 2 below. Thus, we believe that ATL cTBS induced task-specific inhibitory effects in semantic processing.
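The change-score analysis above can be sketched as follows. For each participant, POST-PRE changes in IE are computed per condition and combined into a difference-of-differences contrast. In a 2 x 2 fully within-subject design, the interaction F statistic equals the square of a one-sample t statistic on this contrast, with (1, n − 1) degrees of freedom. All values and function names below are hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev


def change_scores(pre, post):
    """Per-participant POST-PRE change (positive IE change = poorer performance)."""
    return [b - a for a, b in zip(pre, post)]


def interaction_contrast(atl_sem, vtx_sem, atl_ctl, vtx_ctl):
    """Difference of differences: (ATL - Vertex) for semantic minus (ATL - Vertex) for control."""
    return [
        (a_s - v_s) - (a_c - v_c)
        for a_s, v_s, a_c, v_c in zip(atl_sem, vtx_sem, atl_ctl, vtx_ctl)
    ]


def one_sample_t(values):
    """t statistic testing whether the mean of `values` differs from zero."""
    return mean(values) / (stdev(values) / sqrt(len(values)))


# Hypothetical POST-PRE IE changes for four participants (made-up numbers).
atl_sem = [30.0, 10.0, 20.0, 40.0]    # ATL stimulation, semantic task
vtx_sem = [0.0, 5.0, -5.0, 10.0]      # vertex stimulation, semantic task
atl_ctl = [-10.0, -20.0, 0.0, -5.0]   # ATL stimulation, control task
vtx_ctl = [-15.0, -10.0, -5.0, 0.0]   # vertex stimulation, control task

contrast = interaction_contrast(atl_sem, vtx_sem, atl_ctl, vtx_ctl)
t_interaction = one_sample_t(contrast)
f_interaction = t_interaction ** 2    # interaction F with (1, n - 1) degrees of freedom
```

Working on change scores removes each participant's baseline, so any practice effect common to both stimulation sites cancels in the ATL minus vertex difference.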
Author response image 2.
Accordingly, we have revised the Methods and Materials (p 25, line 589), Results (p 8, line 188; p 9-11, lines 202-248), Discussion (p 19, line 441) and Figures (Fig. 2-3 & all Supplementary Figures).
Reviewer #2 (Public Review):
Summary:
The authors combined inhibitory neurostimulation (continuous theta-burst stimulation, cTBS) with subsequent MRI measurements to investigate the impact of inhibition of the left anterior temporal lobe (ATL) on task-related activity and performance during a semantic task and link stimulation-induced changes to the neurochemical level by including MR spectroscopy (MRS). cTBS effects in the ATL were compared with a control site in the vertex. The authors found that relative to stimulation of the vertex, cTBS significantly increased the local GABA concentration in the ATL. cTBS also decreased task-related semantic activity in the ATL and potentially delayed semantic task performance by hindering a practice effect from pre to post. Finally, pooled data from their previous MRS study suggest an inverted U-shape between GABA concentration and behavioral performance. These results help to better understand the neuromodulatory effects of non-invasive brain stimulation on task performance.
Strengths:
Multimodal assessment of neurostimulation effects on the behavioral, neurochemical, and neural levels. In particular, the link between GABA modulation and behavior is timely and potentially interesting.
We appreciated R2’s positive evaluation of our manuscript.
Weaknesses:
The analyses are not sound. Some of the effects are very weak and not all conclusions are supported by the data since some of the comparisons are not justified. There is some redundancy with a previous paper by the same authors, so the novelty and contribution to the field are overall limited. A network approach might help here.
Thank you for your thoughtful critique. We have taken your comments into careful consideration and have made efforts to address them.
We acknowledge the limitations regarding the strength of some effects and the potential lack of justification for certain conclusions drawn from the data. In response, we have reviewed our analyses and performed new analyses to address the behavioural discrepancies and strengthened the justifications for our conclusions.
Regarding the redundancy with a previous paper by the same authors, we understand your concern about the novelty and contribution to the field. We aim to clarify the unique contributions of our current study compared to our previous work. The main novelty lies in uncovering the neurochemical mechanisms behind cTBS-induced neuroplasticity in semantic representation and establishing a non-linear relationship between ATL GABA levels and semantic representation. Our previous work primarily demonstrated the linear relationship between ATL GABA levels and semantic processing. In the current study, we aimed to address two key objectives: 1) investigate the role of GABA in the ATL in short-term neuroplasticity in semantic representation, and 2) explore a biologically more plausible function between ATL GABA levels and semantic function using a larger sample size by combining data from two studies.
Additionally, we appreciate your suggestion regarding a network approach. We have explored the relationship between ATL GABA and cTBS-induced functional connectivity changes in our new analysis. However, there was no significant relationship between them. In the current study, our decision to focus on the mechanistic link between ATL GABA, task-induced activity, and individual semantic task performance reflects our intention to provide a detailed exploration of the role of GABA in the ATL and semantic neuroplasticity.
We have addressed the specific weaknesses raised by Reviewer #2 in detail in our response to 'Reviewer #2 Recommendations For The Authors'.
Reviewer #3 (Public Review):
Summary:
The authors used cTBS TMS, magnetic resonance spectroscopy (MRS), and functional magnetic resonance imaging (fMRI) as the main methods of investigation. Their data show that cTBS modulates GABA concentration and task-dependent BOLD in the ATL, whereby greater GABA increase following ATL cTBS showed greater reductions in BOLD changes in ATL. This effect was also reflected in the performance of the behavioural task response times, which did not subsume to practice effects after ATL cTBS as opposed to the associated control site and control task. This is in line with their first hypothesis. The data further indicates that regional GABA concentrations in the ATL play a crucial role in semantic memory because individuals with higher (but not excessive) GABA concentrations in the ATLs performed better on the semantic task. This is in line with their second prediction. Finally, the authors conducted additional analyses to explore the mechanistic link between ATL inhibitory GABAergic action and semantic task performance. They show that this link is best captured by an inverted U-shaped function as a result of a quadratic linear regression model. Fitting this model to their data indicates that increasing GABA levels led to better task performance as long as they were not excessively low or excessively high. This was first tested as a relationship between GABA levels in the ATL and semantic task performance; then the same analyses were performed on the pre and post-cTBS TMS stimulation data, showing the same pattern. These results are in line with the conclusions of the authors.
Strengths:
I thoroughly enjoyed reading the manuscript and appreciate its contribution to the field of the role of the ATL in semantic processing, especially given the efforts to overcome the immense challenges of investigating ATL function by neuroscientific methods such as MRS, fMRI & TMS. The main strengths are summarised as follows:
• The work is methodologically rigorous and dwells on complex and complementary multimethod approaches implemented to inform about ATL function in semantic memory as reflected in changes in regional GABA concentrations. Although the authors previously demonstrated a negative relationship between increased GABA levels and BOLD signal changes during semantic processing, the unique contribution of this work lies within evidence on the effects of cTBS TMS over the ATL given by direct observations of GABA concentration changes and further exploring inter-individual variability in ATL neuroplasticity and consequent semantic task performance.
• Another major asset of the present study is implementing a quadratic regression model to provide insights into the non-linear relationship between inhibitory GABAergic activity within the ATLs and semantic cognition, which improves with increasing GABA levels but only as long as GABA levels are not extremely high or low. Based on this finding, the authors further pinpoint the role of inter-individual differences in GABA levels and cTBS TMS responsiveness, which is a novel explanation not previously considered (according to my best knowledge) in research investigating the effect of TMS on ATLs.
• There are also many examples of good research practice throughout the manuscript, such as the explicitly stated exploratory analyses, calculation of TMS electric fields, use of ATL-optimised dual-echo fMRI, links to open-source resources, and the fact that part of the data replicates a previous study by Jung et al. (2017).
We appreciated R3’s very positive evaluation of our manuscript.
Weaknesses:
• Research on the role of neurotransmitters in semantic memory is still very rare and therefore the manuscript would benefit from more context on how GABA contributes to individual differences in cognition/behaviour and more justification on why the focus is on semantic memory. A recommendation to the authors is to highlight and explain in more depth the particular gaps in evidence in this regard.
This is an excellent suggestion. Accordingly, we have revised our introduction, highlighting the role of GABA in individual differences in cognition and behaviour and the research gap in this field.
Introduction p3, line 77
“Research has revealed a link between variability in the levels of GABA in the human brain and individual differences in cognitive behaviour (for a review, see 5). Specifically, GABA levels in the sensorimotor cortex were found to predict individual performance in the related tasks: higher GABA levels were correlated with a slower reaction time in simple motor tasks (12) as well as improved motor control (13) and sensory discrimination (14, 15). Visual cortex GABA concentrations were positively correlated with a stronger orientation illusion (16), a prolonged binocular rivalry (17), while displaying a negative correlation with motion suppression (17). Individuals with greater frontal GABA concentrations demonstrated enhanced working memory capacity (18, 19). Studies on learning have reported the importance of GABAergic changes in the motor cortex for motor and perceptual learning: individuals showing bigger decreases in local GABA concentration can facilitate this plasticity more effectively (12, 20-22). However, the relationship between GABAergic inhibition and higher cognition in humans remains unclear. The aim of the study was to investigate the role of GABA in relation to human higher cognition – semantic memory and its neuroplasticity at individual level.”
• The focus across the experiments is on the left ATL; how do the authors justify this decision? Highlighting the justification for this methodological decision will be important, especially given that a substantial body of evidence suggests that the ATL should be involved in semantics bilaterally (e.g. Hoffman & Lambon Ralph, 2018; Lambon Ralph et al., 2009; Rice et al., 2017; Rice, Hoffman, et al., 2015; Rice, Ralph, et al., 2015; Visser et al., 2010).
This is an important point, for which we thank R3. Supporting the bilateral ATL system in semantic representation, previous rTMS studies delivered inhibitory rTMS to the left and right ATL, and stimulation of either ATL significantly decreased semantic task performance (Pobric et al., 2007 PNAS; 2010 Neuropsychologia; Lambon Ralph et al., 2009 Cerebral Cortex). Importantly, there was no significant difference in rTMS effects between left and right ATL stimulation. Therefore, we assume that either left or right ATL stimulation could produce similar, intended rTMS effects on semantic processing. In the current study, we combined cTBS with multimodal imaging to examine the cTBS effects in the ATL. Due to the design of the study (having a control site, control task, and control stimulation) and the limitation of scanning time, we could have only one target region for the stimulation and chose the left ATL, which was the same MRS VOI as in our previous study (Jung et al., 2017). This enabled us to combine the datasets to explore GABAergic function in the ATL.
• When describing the results, (Pg. 11; lines 233-243), the authors first show that the higher the BOLD signal intensity in ATL as a response to the semantic task, the lower the GABA concentration. Then, they state that individuals with higher GABA concentrations in the ATL perform the semantic task better. Although it becomes clearer with the exploratory analysis described later, at this point, the results seem rather contradictory and make the reader question the following: if increased GABA leads to less task-induced ATL activation, why at this point increased GABA also leads to facilitating and not inhibiting semantic task performance? It would be beneficial to acknowledge this contradiction and explain how the following analyses will address this discrepancy.
We apologise that our description was not clear. As R1 also commented on this issue, we re-analysed the behavioural results and demonstrated inter-individual variability in response to cTBS (please see the reply to R1 above).
• There is an inconsistency in reporting behavioural outcomes from the performance on the semantic task. While experiment 1 (cTBS modulates regional GABA concentrations and task-related BOLD signal changes in the ATL) reports the effects of cTBS TMS on response times, experiment 2 (Regional GABA concentrations in the ATL play a crucial role in semantic memory) and experiment 3 (The inverted U-shaped function of ATL GABA concentration in semantic processing) report results on accuracy. For full transparency, the manuscript would benefit from reporting all results (either in the main text or supplementary materials) and providing further explanations on why only one or the other outcome is sensitive to the experimental manipulations across the three experiments.
Regarding the inconsistency of behavioural outcomes, first, there were inter-individual differences in our behavioural data (see the Figure below). Our new analyses revealed that there were responders and non-responders in terms of cTBS responsiveness (please see the reply to R1 above; it should be noted that the classification of responders and non-responders was identical when we used semantic task accuracy). In addition, RT was confounded by practice effects (faster in the post-stimulation sessions), except for the ATL-post session. Second, we only found a significant relationship between semantic task accuracy and ATL GABA concentrations in both the previous (Jung et al., 2017) and the current study. ATL GABA levels were not correlated with semantic RT (Jung et al., 2017: r = 0.34, p = 0.14, current study: r = 0.26, p = 0.14). It should be noted that there were no significant correlations between ATL GABA levels and semantic inverse efficiency (IE) in either study (Jung et al., 2017: r = 0.13, p = 0.62, current study: r = 0.22, p = 0.44). As a result, we found no significant linear or non-linear relationship between ATL GABA levels and RT (linear function: R<sup>2</sup> = 0.21, p = 0.45, quadratic function: R<sup>2</sup> = 0.17, p = 0.21) or between ATL GABA levels and IE (linear function: R<sup>2</sup> = 0.24, p = 0.07, quadratic function: R<sup>2</sup> = 2.24, p = 0.12). Thus, our data suggest that GABAergic action in the ATL may sharpen activated distributed semantic representations through lateral inhibition, leading to more accurate semantic performance (Isaacson & Scanziani, 2011; Jung et al., 2017).
We agreed with R3’s suggestion to report all results. The results of control task and control stimulation were included in Supplementary information (Figure S1, S4-5).
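The comparison between a linear and a quadratic (inverted U-shaped) fit discussed above can be sketched as follows. This is a minimal illustration only: the values are synthetic, the variable names (`gaba_z`, `accuracy`) are hypothetical, and the sketch omits the covariates and significance tests used in the study. It fits both models by ordinary least squares and compares R<sup>2</sup>.

```python
# Illustrative sketch with synthetic numbers; NOT the study's data or pipeline.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def polyfit_r2(x, y, degree):
    """Least-squares polynomial fit; returns (coefficients, lowest order first, and R^2)."""
    cols = [[xi ** p for xi in x] for p in range(degree + 1)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * yi for u, yi in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    pred = [sum(b * xi ** p for p, b in enumerate(beta)) for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

# Hypothetical standardised GABA levels and accuracies built to follow an inverted U.
gaba_z = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
wobble = [0.01, -0.01, 0.0, 0.01, -0.01, 0.0, 0.01, -0.01, 0.0]
accuracy = [0.90 - 0.05 * g * g + w for g, w in zip(gaba_z, wobble)]

linear_beta, linear_r2 = polyfit_r2(gaba_z, accuracy, 1)
quad_beta, quad_r2 = polyfit_r2(gaba_z, accuracy, 2)

# A negative quadratic coefficient indicates an inverted U; its peak is at -b1 / (2 * b2).
peak = -quad_beta[1] / (2.0 * quad_beta[2])
print(f"linear R^2 = {linear_r2:.3f}, quadratic R^2 = {quad_r2:.3f}, peak at z = {peak:.2f}")
```

With data of this shape, the linear model explains almost none of the variance while the quadratic model captures the curve; real data would of course be noisier, and a formal comparison would also test whether the quadratic term is significant.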
Overall, the most notable impact of this work is the contribution to a better understanding of individual differences in semantic behaviour and the potential to guide therapeutic interventions to restore semantic abilities in neurological populations. While I appreciate that this is certainly the case, I would be curious to read more about how this could be achieved.
Thank you once again to R3 for the positive evaluation of our study. We acknowledge your interest in understanding the practical implications of our findings. It is crucial to highlight the substantial variability in the effectiveness of rTMS and TBS protocols among individuals. Previous studies in healthy subjects have reported response rates ranging from 40% to 70% in the motor cortex, and in patients, the remission rate for rTMS treatment in treatment-resistant depression is around 29%. Presently, the common practice in rTMS treatment is to apply the same protocol uniformly to all patients.
Our study demonstrated that 40% of individuals in our sample were classified as responders to ATL cTBS. Notably, we observed differences in ATL GABA levels before stimulation between responders and non-responders. Responders exhibited higher baseline ATL GABA levels, along with better semantic performance at the baseline (as mentioned in our response to R1). This suggests that establishing the optimal level of ATL GABA by assessing baseline GABA levels before stimulation could enable the tailoring of an ideal protocol for each individual, thereby enhancing their semantic capability. To achieve this, more data is needed to delineate the proposed inverted U-shaped function of ATL GABA in semantic memory.
Our ongoing efforts involve collecting additional data from both healthy aging and dementia cohorts using the same protocol. Additionally, future pharmacological studies aim to modulate GABA, providing a deeper understanding of the individual variations in semantic function. These initiatives contribute to the potential development of personalized therapeutic interventions for individuals with semantic impairments.
Reviewer #1 (Recommendations For The Authors):
My major suggestion is to include an analysis regarding the "existence of an optimal GABA level". This would be the most direct test for the authors' hypothesis on the relationship between GABA and semantic memory and its neuroplasticity. Please refer to the public review section for details.
Here are some other suggestions and questions.
(1) The sample size of this study is relatively small. Although the sample size was estimated, a small sample size can bring risks to the generalizability of the results to the population. How did the author consider this risk? Is it necessary to increase the sample size?
We agreed with R1’s comments. However, the average of sample size in healthy individuals was 17.5 in TMS studies on language function (number of studies = 26, for a review, see Qu et al, 2022 Frontiers in Human Neuroscience), 18.3 in the studies employing rTMS and fMRI on language domain (number of studies = 8, for a review, see Hartwigsen & Volz., 2021 NeuroImage), and 20.8 in TMS combined MRS studies (number of studies = 11, for a review, see Cuypers & Marsman., 2021 NeuroImage). Notably, only two studies utilizing rTMS, fMRI, and MRS had sample sizes of N = 7 (Grohn et al., 2019 Frontiers in Neuroscience) and N = 16 (Rafique & Steeves. 2020 Brain and Behavior). Despite having 19 participants in our current study, it is noteworthy that our sample size aligns closely with studies employing similar approaches and surpasses those employing the same methodology.
As a result of the changes in a scanner and the relocation of the authors to different institutes, it is impossible to increase the sample size for this study.
(2) How did the authors control practice effects? How many practice trials were arranged before the experiment? Did you avoid the repetition of stimuli in tasks before and after the stimuli?
At the beginning of the experiment, participants performed a practice session (20 trials) for each task outside of the scanner. Stimuli were not repeated between the pre- and post-stimulation sessions.
(3) In Figures 2D and E, does the vertical axis of the BOLD signal refer to the semantic task itself or the difference between the semantic and control tasks? Could you provide the respective patterns of the BOLD signal before and after the stimuli in the semantic and control tasks in a figure?
We apologise that the axis labels of Figure 2 were not clear. In Fig. 2D-E, the BOLD signal changes refer to the semantic task itself. Accordingly, we have revised Fig. 2.
(4) Figure 1A shows that MRS ATL always comes before MRS Vertex. Was the order of them counterbalanced across participants?
The order of MRS acquisition was not counterbalanced across participants.
(5) I am confused by the statement "Our results provide strong evidence that regional GABA levels increase following inhibitory cTBS in the human associative cortex, specifically in the ATL, a representational semantic hub. Notably, the observed increase was specific to the ATL and semantic processing, as it was not observed in the control region (vertex) and not associated with control processing (visuospatial processing)". GABA levels are obtained in the MRS, and this stage does not involve any behavioral tasks. Why do the authors state that the increase in GABA levels was specific to semantic processing and was not associated with control processing?
Following R1’s suggestion, we have re-analysed behavioural data and showed cTBS-induced suppression in semantic task performance after ATL stimulation only (please, see the reply above). There were no cTBS effects in the control task performance, control site (vertex) and no correlations between the ATL GABA levels and control task performance. The Table was added to the Supplementary Information as Table S3.
(6) In Figure 3, the relationship between GABA levels in the ATL and performance on semantic tasks is presented. What is the relationship between GABA levels at the control site and performance on semantic tasks? Should a graph be provided to illustrate this?
As the vertex was not involved in semantic processing (no activation during semantic processing), we did not originally perform the analysis between vertex GABA levels and semantic task performance. Following R3’s suggestion, we performed a linear regression between vertex GABA levels and semantic task performance in the pre-stimulation session, accounting for GM volume, age, and sex. As expected, there was no significant relationship between them (R<sup>2</sup> = 0.279, p = 0.962).
(7) The author claims that GABA can sharpen distributed semantic representations. However, even though there is a positive correlation between GABA levels and semantic performance, there is no direct evidence supporting the inference that this correlation is achieved through sharpening distributed semantic representations. How did the author come to this conclusion? Are there any other possibilities?
We showed that pre-stimulation ATL GABA concentrations were ‘negatively’ correlated with task-induced regional activity in the ATL and ‘positively’ correlated with semantic task performance. In our semantic task, such as recognizing a camel (Fig. 1), all related information in the semantic representation is activated (e.g., mammal, desert, oasis, nomad, humps, etc.). To respond accurately to the task (a cactus), it becomes essential to suppress irrelevant meanings through an inhibitory mechanism. Therefore, the inhibitory processing linked to ATL GABA levels may contribute to more efficient processing in this task.
Animal studies have proposed a related hypothesis in the context of the close interplay between activation and inhibition in sensorimotor cortices (Isaacson & Scanziani, 2011). Liu et al. (2011, Neuron) demonstrated that the rise of excitatory glutamate in the visual cortex is followed by an increase in inhibitory GABA in response to visual stimuli. Tight coupling of these paired excitatory-inhibitory functions results in a sharpening of the activated representation (for a review, see Isaacson & Scanziani, 2011, Neuron, How Inhibition Shapes Cortical Activity). In humans, Kolasinski et al. (2017, Current Biology) revealed that higher sensorimotor GABA levels are associated with more selective cortical tuning measured with fMRI, which in turn is associated with enhanced perception (better tactile discrimination). They claimed that the relationship between inhibition and cortical tuning could result from GABAergic signalling, shaping the selective response profiles of neurons in the primary sensory regions of the brain. This process is crucial for the topographic organization (task-induced fMRI activation in the sensorimotor cortex) vital to sensory perception.
Building on these findings, we suggest a similar mechanism may operate in higher-order association cortices, including the ATL semantic hub. This suggests a process that leads to more sharply defined semantic representations associated with more selective task-induced activation in the ATL and, consequently, more accurate semantic performance (Jung et al., 2017).
Reviewer #2 (Recommendations For The Authors):
Major issues:
(1) It wasn't completely clear what the novel aspect of this study relative to their previous one on GABAergic modulation in semantic memory issue, this should be clarified. If I understand correctly, the main difference from the previous study is that this study considers the TMS-induced modulation of GABA?
We apologise that the novelty of the study was not clear. The main novelty lies in uncovering the neurochemical mechanisms behind cTBS-induced neuroplasticity in semantic representation and establishing a non-linear relationship between ATL GABA levels and semantic representation. Our previous work first demonstrated the linear relationship between ATL GABA levels and semantic processing. In the current study, we aimed to address two key objectives: 1) investigate the role of GABA in the ATL in short-term neuroplasticity in semantic representation, and 2) explore a biologically more plausible function between ATL GABA levels and semantic function using a larger sample size by combining data from two studies.
The first part of the experiment in this study mirrored our previous work, involving multimodal imaging during the pre-stimulation session. We conducted the same analysis as in our previous study to replicate the findings in a different cohort. Subsequently, we combined the data from both studies to examine the potential inverted U-shape function between ATL GABA levels and semantic function/neuroplasticity.
Accordingly, we have revised the Introduction by adding the following sentences.
“The study aimed to investigate the neural mechanisms underlying cTBS-induced neuroplasticity in semantic memory by linking cortical neurochemical profiles, task-induced regional activity, and variability in semantic memory capability within the ATL.”
“Furthermore, to address and explore the relationship between regional GABA levels in the ATL and semantic memory function, we combined data from our previous study (Jung et al., 2017) with the current study’s data.”
(2) I found the scope of the study very narrow. I guess everyone agrees that TMS induces network effects, but the authors selectively focus on the modulation in the ATL. This is unfortunate since semantic memory requires the interaction between several brain regions and a network perspective might add some novel aspect to this study which has a strong overlap with their previous one. I am aware that MRS can only measure pre-defined voxels but even these changes could be related to stimulation-induced effects on task-related activity at the whole brain level.
We appreciate R2's thoughtful comments and acknowledge the concern about the perceived narrow scope of the study. We agreed with the notion that cTBS induces network-level changes. In our investigation, we did observe cTBS over the ATL influencing task-induced regional activity in other semantic regions and functional connectivity within the semantic system. Specifically, ATL cTBS increased activation in the right ATL after ATL stimulation compared to pre-stimulation, along with increased functional connectivity between the left and right ATL, between the left ATL and right semantic control regions (IFG and pMTG), and between the left ATL and right angular gyrus. These results were the replication of Jung & Lambon Ralph (2016) Cerebral Cortex.
However, it is important to note that we did not find any significant correlations between ATL GABA changes and cTBS-induced changes in the functional connectivity. Consequently, we are currently preparing another paper that specifically addresses the network-level changes induced by ATL cTBS. In the current study, our decision to focus on the mechanistic link between ATL GABA, task-induced activity, and individual semantic task performance reflects our intention to provide a detailed exploration of the role of GABA in the ATL and semantic neuroplasticity.
(3) On a related note, I think the provided link between GABAergic modulation and behavioral changes after TMS is somehow incomplete because it ignores the stimulation effects on task-related activity. Could these be linked in a regression analysis with two predictors (with behavior or GABA level as a criterion and the other two variables as predictors)?
In response to R2’s suggestion, we performed a multiple regression analysis modelling cTBS-induced ATL GABA changes (POST-PRE), task-related BOLD signal changes (POST-PRE), and semantic task performance (IE) changes (POST-PRE). The model with GABA changes (POST-PRE) as a criterion was significant (F<sub>2, 14</sub> = 8.77, p = 0.003), explaining 56% of cTBS-induced ATL GABA changes (adjusted R<sup>2</sup>) with cTBS-related ATL BOLD signal changes and semantic task performance changes. However, the model with semantic task performance change (POST-PRE) as a criterion was not significant (F = 0.26, p = 0.775). Therefore, cTBS-induced changes in ATL BOLD signals and semantic task performance together significantly predicted the cTBS-induced ATL GABA changes. Of the two predictors, only cTBS-induced ATL BOLD signal changes significantly predicted cTBS-induced GABA changes in the ATL (β = -4.184, p = 0.001), aligning with the results of our partial correlation analysis.
Author response table 1.
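A regression with one criterion and two predictors, as described in this response, can be sketched as below. This is a hedged illustration using synthetic POST-PRE change scores (17 hypothetical participants, chosen only so that the degrees of freedom come out as 2 and 14); the variable names are assumptions and none of the numbers are the study's data. It computes the coefficients, R<sup>2</sup>, adjusted R<sup>2</sup>, and the overall F statistic by ordinary least squares.

```python
# Illustrative sketch with synthetic change scores; NOT the study's data.

def ols_two_predictors(y, x1, x2):
    """OLS of y on x1 and x2 (plus intercept); returns coefficients and fit statistics."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    yc = [v - my for v in y]
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    s11 = sum(u * u for u in a)
    s22 = sum(v * v for v in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * w for u, w in zip(a, yc))
    s2y = sum(v * w for v, w in zip(b, yc))
    det = s11 * s22 - s12 * s12          # zero only if predictors are collinear
    b1 = (s1y * s22 - s2y * s12) / det   # Cramer's rule on the centred normal equations
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    ss_tot = sum(w * w for w in yc)
    ss_res = sum((w - b1 * u - b2 * v) ** 2 for w, u, v in zip(yc, a, b))
    df_model, df_resid = 2, n - 3
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / df_resid
    f = ((ss_tot - ss_res) / df_model) / (ss_res / df_resid)
    return {"b0": b0, "b1": b1, "b2": b2, "r2": r2, "adj_r2": adj_r2,
            "f": f, "df_model": df_model, "df_resid": df_resid}

# Hypothetical POST-PRE changes for 17 participants.
bold_change = [(i - 8) * 0.1 for i in range(17)]
behaviour_change = [((i * 7) % 17 - 8) * 0.1 for i in range(17)]
noise = [((i * 5) % 17 - 8) * 0.005 for i in range(17)]
gaba_change = [0.1 - 0.8 * bo + 0.1 * be + e
               for bo, be, e in zip(bold_change, behaviour_change, noise)]

result = ols_two_predictors(gaba_change, bold_change, behaviour_change)
print(f"F({result['df_model']}, {result['df_resid']}) = {result['f']:.2f}, "
      f"adjusted R^2 = {result['adj_r2']:.2f}, b_BOLD = {result['b1']:.2f}")
```

Swapping which variable is passed as `y` gives the second model mentioned above, with behavioural change as the criterion. The p-values for F and for the individual coefficients require the F and t distributions and are omitted from this sketch.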
(4) Several statements in the intro and discussion need to be rephrased or toned down. For example, I would not agree that TBS "made healthy individuals mimic semantic dementia patients". This is clearly overstated. TMS protocols slightly modulate brain functions, but this is not similar to lesions or brain damage. Please rephrase. In the discussion, it is stated that the results provide "strong evidence". I disagree based on the overall low values for most comparisons.
Hence, we have revised both the Introduction and the Discussion.
“Perturbing the ATL with inhibitory repetitive transcranial magnetic stimulation (rTMS) and theta burst stimulation (TBS) resulted in healthy individuals exhibiting slower reaction times during semantic processing.”
“Our results demonstrated an increase in regional GABA levels following inhibitory cTBS in human associative cortex, specifically in the ATL, a representational semantic hub.”
(5) Changes in the BOLD signal in the ATL: There is a weak interaction between stimulation and VOI and post hoc comparisons with very low values reported. Are these corrected for multiple comparisons? I think that selectively reporting weak values with small-volume corrections (if they were performed) does not provide strong evidence. What about whole-brain effects and proper corrections for multiple comparisons?
There was no significant interaction between stimulation (ATL vs. Vertex) and session (pre vs. post) in the ATL BOLD signal changes (p = 0.29). Our previous work combining rTMS with fMRI (Binney et al., 2015; Jung & Lambon Ralph, 2016) demonstrated that there were no significant rTMS effects in the whole-brain analysis and that only ROI analyses revealed the subtle but significant rTMS effects at the target site (reduction of task-induced ATL activity). In the current study, we focused our hypothesis on the anticipated decrease in task-induced regional activity in the ATL during semantic processing following inhibitory cTBS. Accordingly, we conducted planned paired t-tests specifically within the ATL for BOLD signal changes without applying multiple comparison corrections. Note that these results were derived from regions of interest (ROIs) and not from small-volume corrections. Furthermore, no significant findings emerged from the comparison of the ATL post-session vs. Vertex post-session or the ATL pre-session vs. ATL post-session in the whole-brain analysis (see Supplementary Figure 2).
Accordingly, we have added the Figure S2 in the Supplementary Information.
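For readers less familiar with the planned comparison named above, a paired t-test on pre- versus post-stimulation ROI values can be sketched as follows. The numbers are invented for illustration and the function name `paired_t` is hypothetical; the sketch returns only the t statistic and degrees of freedom, since the p-value requires the t distribution.

```python
# Illustrative sketch with invented ROI values; NOT the study's data.
import math
import statistics

def paired_t(pre, post):
    """Paired t statistic for post minus pre, with its degrees of freedom."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)           # sample SD of the differences
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# Hypothetical task-induced BOLD signal change in an ROI, per participant.
pre_bold = [0.50, 0.42, 0.61, 0.38, 0.55, 0.47]
post_bold = [0.41, 0.40, 0.50, 0.36, 0.49, 0.45]

t_stat, df = paired_t(pre_bold, post_bold)
print(f"t({df}) = {t_stat:.2f}")  # negative t means lower values after stimulation
```

In practice a library routine such as `scipy.stats.ttest_rel` would return the p-value as well; the manual version is shown only to make the computation explicit.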
(6) Differences between selected VOIs: Numerically, the activity (BOLD signal effect) is higher in the vertex than the ATL, even in the pre-TMS session (Figure 2D). What does that mean? Does that indicate that the vertex also plays a role in semantic memory?
We apologise that the figure was not clear. Fig. 2D displays the BOLD signal changes in the ATL VOI for the ATL and Vertex stimulation. As there was no activation in the vertex during semantic processing, we did not present the fMRI results of vertex VOI (please, see Author response image 3 below). Accordingly, we have revised the label of Y axis of the Figure 2D – ATL BOLD signal change.
Author response image 3.
The cTBS effects within the Vertex VOI during semantic processing
(7) Could you provide the e-field for the vertex condition?
We have added it in the Supplementary Information as Supplementary Figure 6.
(8) Stimulation effects on performance (RTs): There is a main effect of the session in the control task. Post-hoc tests show that control performance is faster in the post-pre comparison, while the semantic task is not faster after ATL TMS (as it might be delayed). I think you need to perform a 3-way ANOVA here including the factor task if you want to show task specificity (e.g., differences for the control but not semantic task) and then a step-down ANOVA or t-tests.
Thanks for R2’s suggestion. We have addressed this issue in reply to R1. Please, see the reply to R1 for semantic task performance analysis.
Minor issue:
In the visualization of the design, it would be helpful to have the timing/duration of the different measures to directly understand how long the experiment took.
We have added the duration of the experiment design in the Figure 1.
Reviewer #3 (Recommendations For The Authors):
Further Recommendations:
• Pg. 6; lines 138-147: There is a sense of uncertainty about the hypothesis conveyed by expressions such as 'may' or 'could be'. A more confident tone would be beneficial.
Thanks for R3’s thoughtful suggestion. We have revised the Introduction.
• Pg. 6; line 155: left or bilateral ATL, please specify.
We have added ‘left’ in the manuscript.
• Pg. 8; line 188: Can the authors provide a table with peak activations to complement the figure?
We have added the Table for the fMRI results in the Supplementary Information (Table S1).
• Pg 9; Figure 2C: The ATL activation elicited by the semantic task seems rather medial. What are the exact peak coordinates for this cluster, and how can the authors demonstrate that the electric fields induced by TMS, which seem rather lateral (Figure 2A), also impacted this area? Please explain.
We apologise that the Figure was not clear. cTBS was delivered to the peak coordinate of the left ventral ATL [-36, -15, -30] determined by previous fMRI studies (Binney et al., 2010; Visser et al., 2012). To confirm the cTBS effects at the target region, we conducted an ROI analysis centred on the ventral ATL [-36, -15, -30], and the results demonstrated reduced ATL activity after ATL stimulation during semantic processing (t = -2.43, p = 0.014) (please see Author response image 4 below). Thus, cTBS successfully modulated ATL activity at the target coordinate.
Author response image 4.
• Pg.23; line 547: What was the centre coordinate of the ROI (VOI), and was it consistent across all participants? Please specify.
We used the ATL MRS VOI (a hexahedron with 4cm x 2cm x 2cm) for our regions of interest analysis and the central coordinate was around -45, -12, -20 (see Author response image 5). As we showed in Fig. 1C, the location of ATL VOI was consistent across all participants.
Author response image 5.
• Pg. 24; line 556-570: What software was used for performing the statistical analyses? Please specify.
We have added the following sentence.
“Statistical analyses were undertaken using Statistics Package for the Social Sciences (SPSS, Version 25, IBM Cary, NC, USA) and RStudio (2023).”
• Pg. 21; line 472-480: It is not clear if and how neuronavigation was used (e.g. were T1scans or an average MNI template used, what was the exact coordinate of stimulation and how was it decided upon). Please specify.
We apologise that the description was not clear. We have added a paragraph describing the procedure.
“The target site in the left ATL was delineated based on the peak coordinate (MNI -36 -15 -30), which represents maximal peak activation observed during semantic processing in previous distortion-corrected fMRI studies (38, 41). This coordinate was transformed to each individual’s native space using Statistical Parametric Mapping software (SPM8, Wellcome Trust Centre for Neuroimaging, London, UK). T1 images were normalised to the MNI template and then the resulting transformations were inverted to convert the target MNI coordinate back to the individual's untransformed native space coordinate. These native-space ATL coordinates were subsequently utilized for frameless stereotaxy, employing the Brainsight TMS-MRI co-registration system (Rogue Research, Montreal, Canada). The vertex (Cz) was designated as a control site following the international 10–20 system.”
• Miscellaneous
- line 57: insert 'about' to the following sentence: '....little is known the mechanisms linking'
- line 329: 'Previous, we demonstrated'....should be Previously we demonstrated....
We thank R3 for the thorough evaluation of our manuscript. We have corrected both sentences.
Furthermore, it would be an advantage to make the data freely available for the benefit of the broader scientific community.
We appreciate Reviewer 3’s suggestion. Currently, this data is being used in other unpublished work. However, upon acceptance of this manuscript, we will make the data freely available for the benefit of the broader scientific community.
Chiou R, Sowman PF, Etchell AC, Rich AN (2014) A conceptual lemon: theta burst stimulation to the left anterior temporal lobe untangles object representation and its canonical color. J Cogn Neurosci 26:1066-1074.
Jung J, Lambon Ralph MA (2016) Mapping the Dynamic Network Interactions Underpinning Cognition: A cTBS-fMRI Study of the Flexible Adaptive Neural System for Semantics. Cereb Cortex 26:3580-3590.
Jung J, Williams SR, Sanaei Nezhad F, Lambon Ralph MA (2017) GABA concentrations in the anterior temporal lobe predict human semantic processing. Sci Rep 7:15748.
Jung J, Williams SR, Nezhad FS, Lambon Ralph MA (2022) Neurochemical profiles of the anterior temporal lobe predict response of repetitive transcranial magnetic stimulation on semantic processing. Neuroimage 258:119386.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1(Public review):
Strengths:
Utilization of both human placental samples and multiple mouse models to explore the mechanisms linking inflammatory macrophages and T cells to preeclampsia (PE).
Incorporation of advanced techniques such as CyTOF, scRNA-seq, bulk RNA-seq, and flow cytometry.
Identification of specific immune cell populations and their roles in PE, including the IGF1-IGF1R ligand-receptor pair in macrophage-mediated Th17 cell differentiation.
Demonstration of the adverse effects of pro-inflammatory macrophages and T cells on pregnancy outcomes through transfer experiments.
Weaknesses:
Comment 1. Inconsistent use of uterine and placental cells, which are distinct tissues with different macrophage populations, potentially confounding results.
Response 1: We thank the reviewer for this comment. We performed an animal experiment with green fluorescent protein (GFP) pregnant mice, which was not shown in the original manuscript. Wild-type (WT) female mice were mated with either transgenic male mice genetically modified to express GFP or WT male mice, in order to generate either GFP-expressing pups (GFP-pups) or their genetically unmodified counterparts (WT-pups), respectively. Mice were euthanized on day 18.5 of gestation, and the uteri of the pregnant females and the placentas of the offspring were analyzed using flow cytometry. The majority of macrophages in the uterus and placenta are of maternal origin, defined as GFP-negative. In contrast, fetal-derived macrophages, distinguished by their expression of GFP, represent only a small fraction of the total macrophage population. We have added the GFP pregnant mice data for uterine and placental cells (Lines 204-212).
Comment 2. Missing observational data for the initial experiment transferring RUPP-derived macrophages to normal pregnant mice.
Response 2: We thank the reviewer for the comments. We have added the observational data (Figure 4-figure supplement 1D, 1E) and a corresponding description of the data (Line 198-203).
Comment 3. Unclear mechanisms of anti-macrophage compounds and their effects on placental/fetal macrophages.
Response 3: We thank the reviewer for the comments. PLX3397 is an inhibitor of CSF1R, which is needed for macrophage development (Nature. 2023, PMID: 36890231; Cell Mol Immunol. 2022, PMID: 36220994); we have stated this on Line 227-230. However, PLX3397 is a small-molecule compound that has the potential to cross the placental barrier and affect fetal macrophages. We have discussed the impact of this factor on the experiment in the Discussion section (Line 457-459).
Comment 4. Difficulty in distinguishing donor cells from recipient cells in murine single-cell data complicates interpretation.
Response 4: We thank the reviewer for the comments. Upon analysis, we observed a notable elevation in the frequency of total macrophages within the CD45<sup>+</sup> cell population. We then performed macrophage clustering and uncovered a marked increase in the frequency of Cluster 0, implying a potential correlation between Cluster 0 and donor-derived cells. RNA sequencing revealed that the F480<sup>+</sup>CD206<sup>-</sup> pro-inflammatory donor macrophages exhibited a Folr2<sup>+</sup>Ccl7<sup>+</sup>Ccl8<sup>+</sup>C1qa<sup>+</sup>C1qb<sup>+</sup>C1qc<sup>+</sup> phenotype, which is consistent with the phenotype of cluster 0 in macrophages observed in single-cell RNA sequencing (Figure 4D and Figure 5E). Therefore, we believe that the donor cells should be in cluster 0 of the macrophages.
Comment 5. Limitation of using the LPS model in the final experiments, as it more closely resembles systemic inflammation seen in endotoxemia rather than the specific pathology of PE.
Response 5: We thank the reviewer for the comments. First, the other animal experiments in this manuscript used the Reduction in Uterine Perfusion Pressure (RUPP) mouse model to simulate the pathology of PE. However, the RUPP model requires ligation of the uterine arteries in pregnant mice on day 12.5 of gestation, which hinders T cells reinfused through the tail vein from reaching the maternal-fetal interface. In addition, this experiment aims to prove that CD4<sup>+</sup> T cells differentiate into memory-like Th17 cells through IGF-1R signaling and thereby affect pregnancy; to this end, CD4<sup>+</sup> T cells were cleared in vivo with an anti-CD4 antibody, followed by injection of IGF-1R inhibitor-treated CD4<sup>+</sup> T cells. We also proved that injection of RUPP-derived memory-like CD4<sup>+</sup> T cells into pregnant mice induces PE-like symptoms (Figure 6F-6H). In summary, the application of the LPS model in the final experiments does not affect the conclusions.
Reviewer #2 (Public review):
Strengths:
(1) This study combines human and mouse analyses and allows for some amount of mechanistic insight into the role of pro-inflammatory and anti-inflammatory macrophages in the pathogenesis of pre-eclampsia (PE), and their interaction with Th17 cells.
(2) Importantly, they do this using matched cohorts across normal pregnancy and common PE comorbidities like gestational diabetes (GDM).
(3) The authors have developed clear translational opportunities from these "big data" studies by moving to pursue potential IGF1-based interventions.
Weaknesses:
(1) Clearly the authors generated vast amounts of multi-omic data using CyTOF and single-cell RNA-seq (scRNA-seq), but their central message becomes muddled very quickly. The reader has to do a lot of work to follow the authors' multiple lines of inquiry rather than smoothly following along with their unified rationale. The title description tells fairly little about the substance of the study. The manuscript is very challenging to follow. The paper would benefit from substantial reorganizations and editing for grammatical and spelling errors. For example, RUPP is introduced in Figure 4 but in the text not defined or even talked about what it is until Figure 6. (The figure comparing pro- and anti-inflammatory macrophages does not add much to the manuscript as this is an expected finding).
Response 1: We thank the reviewer for the comments. Following the reviewer's suggestions, we have made the necessary revisions. First, the title of the article has been modified to be more specific. Second, we now introduce the RUPP mouse model when interpreting Figure 4-figure supplement 1. Third, we have moved the images of Figure 7 to Figure 6-figure supplement 2 to make them easier to follow. Finally, we have corrected the grammatical and spelling errors in the article. As for the figure comparing pro- and anti-inflammatory macrophages, the Editor requested a more comprehensive description of the macrophage phenotype during the initial submission. As a result, we performed transcriptome RNA-seq of both uterine-derived pro-inflammatory and anti-inflammatory macrophages and conducted a detailed analysis of macrophages in scRNA-seq.
Comment 2. The methods lack critical detail about how human placenta samples were processed. The maternal-fetal interface is a highly heterogeneous tissue environment and care must be taken to ensure proper focus on maternal or fetal cells of origin. Lacking this detail in the present manuscript, there are many unanswered questions about the nature of the immune cells analyzed. It is impossible to figure out which part of the placental unit is analyzed for the human or mouse data. Is this the decidua, the placental villi, or the fetal membranes? This is of key importance to the central findings of the manuscript as the immune makeup of these compartments is very different. Or is this analyzed as the entirety of the placenta, which would be a mix of these compartments and significantly less exciting?
Response 2: We thank the reviewer for the comments. Placental villi, rather than fetal membranes or decidua, were used for CyTOF in this study. Details of how the human placenta samples were processed have been added to the Materials and Methods section (Line 564-576).
Comment 3. Similarly, methods lack any detail about the analysis of the CyTOF and scRNAseq data, much more detail needs to be added here. How were these clustered, what was the QC for scRNAseq data, etc? The two small paragraphs lack any detail.
Response 3: We thank the reviewer for the comments. Details about the analysis of the CyTOF (Line 577-586) and scRNA-seq (Line 600-615) data have been added to the Materials and Methods section.
Comment 4. There is also insufficient detail presented about the quantities or proportions of various cell populations. For example, gdT cells represent very small proportions of the CyTOF plots shown in Figures 1B, 1C, & 1E, yet in Figures 2I, 2K, & 2K there are many gdT cells shown in subcluster analysis without a description of how many cells are actually represented, and where they came from. How were biological replicates normalized for fair statistical comparison between groups?
Response 4: We thank the reviewer for the comments. In our study, approximately 8×10<sup>5</sup> cells were collected per group for analysis using CyTOF. Of these, about 10% (8×10<sup>4</sup> cells per group) were used to generate Figure 1B. As depicted in Figure 1B, gdT cells constitute roughly 1% of each group, with specific percentages as follows: NP group (1.23%), PE group (0.97%), GDM group (0.94%), and GDM&PE group (1.26%), which equates to approximately 800 cells per group. For the subsequent gdT cell analysis presented in Figure 2I, we used data from all cells within each group to construct the tSNE maps, comprising approximately 8000 gdT cells per group. Consequently, it may initially appear that the number of gdT cells is significantly higher than what is shown in Figure 1B. To clarify this, we have included pertinent explanations in the figure legend. Given the relatively low proportions of gdT cells, we did not pursue further investigations of these cells in subsequent experiments. Following your suggestion, we have relocated this result to the supplementary materials, where it is now presented as Figure 2-figure supplement 1D-E.
The number of biological replicates (samples) is consistent with Figure 1, and this information has been added to the figure legend.
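The arithmetic in Response 4 can be checked with a short sketch. The totals and percentages below are the figures quoted in the response; the helper name `gdt_counts`, the rounding to whole cells, and the assumption that the gdT proportion is the same in the subsample and in the full data are illustrative choices, not part of the authors' analysis pipeline.

```python
# Sketch of the cell-count arithmetic quoted in Response 4.
# `gdt_counts` is a hypothetical helper; rounding to whole cells is an assumption,
# as is applying the same gdT percentage to the subsample and to all cells.

TOTAL_CELLS_PER_GROUP = 800_000   # ~8x10^5 cells collected per group (from the response)
SUBSAMPLE_FRACTION = 0.10         # ~10% of cells used for Figure 1B (from the response)

# gdT cell percentages per group, as quoted in the response.
GDT_PERCENT = {"NP": 1.23, "PE": 0.97, "GDM": 0.94, "GDM&PE": 1.26}


def gdt_counts(total_cells, subsample_fraction, percent):
    """Return (gdT cells in the subsample, gdT cells among all cells)."""
    subsample_cells = round(total_cells * subsample_fraction)
    in_subsample = round(subsample_cells * percent / 100)
    in_all_cells = round(total_cells * percent / 100)
    return in_subsample, in_all_cells


if __name__ == "__main__":
    for group, percent in GDT_PERCENT.items():
        sub, full = gdt_counts(TOTAL_CELLS_PER_GROUP, SUBSAMPLE_FRACTION, percent)
        print(f"{group}: ~{sub} gdT cells in the subsample, ~{full} among all cells")
```

Under these assumptions each group contributes roughly 750 to 1,000 gdT cells to the subsample and ten times as many to the full data, in line with the "approximately 800" and "approximately 8000" quoted in the response.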
Comment 5. The figures themselves are very tricky to follow. The clusters are numbered rather than identified by what the authors think they are, the numbers are so small, that they are challenging to read. The paper would be significantly improved if the clusters were clearly labeled and identified. All the heatmaps and the abundance of clusters should be in separate supplementary figures.
Response 5: We thank the reviewer for the comments. Based on your suggestions, we have labeled and defined the clusters (Figure 2A, 2F, Figure 3A, Figure 5C and Figure 6A). Additionally, we have moved most of the heatmaps to the supplementary materials.
Comment 6. The authors should take additional care when constructing figures that their biological replicates (and all replicates) are accurately represented. Figure 2H-2K shows N=10 data points for the normal pregnant (NP) samples when clearly their Table 1 and text denote they only studied N=9 normal subjects.
Response 6: We thank the reviewer for the careful checking. During our verification, we found that one sample in the NP group had pregnancy complications other than PE and GDM, and the data in Figure 2H-2K had not been updated accordingly. We have now updated these data and reanalyzed them.
Comment 7. There is little to no evaluation of regulatory T cells (Tregs) which are well known to undergird maternal tolerance of the fetus, and which are well known to have overlapping developmental trajectory with RORgt+ Th17 cells. We recommend the authors evaluate whether the loss of Treg function, quantity, or quality leaves CD4+ effector T cells more unrestrained in their effect on PE phenotypes. References should include, accordingly: PMCID: PMC6448013 / DOI: 10.3389/fimmu.2019.00478; PMC4700932 / DOI: 10.1126/science.aaa9420.
Response 7: We thank the reviewer for the comments. We performed a Treg-related animal experiment that was not shown in the original manuscript, and we have added the Treg-related data in Figure 6F-6H. The injection of CD4<sup>+</sup>CD44<sup>+</sup> T cells derived from RUPP mice, characterized by a reduced frequency of Tregs, could induce PE-like symptoms in pregnant mice (Line 297-304). Additionally, we have added the necessary discussion of Tregs and cited the literature you mentioned (Line 433-439).
Comment 8. In discussing gMDSCs in Figure 3, the authors have missed key opportunities to evaluate bona fide Neutrophils. We recommend they conduct FACS or CyTOF staining including CD66b if they have additional tissues or cells available. Please refer to this helpful review article that highlights key points of distinguishing human MDSC from neutrophils: https://doi.org/10.1038/s41577-024-01062-0. This will both help the evaluation of potentially regulatory myeloid cells that may suppress effector T cells as well as aid in understanding at the end of the study if IL-17 produced by CD4+ Th17 cells might recruit neutrophils to the placenta and cause ROS immunopathology and fetal resorption.
Response 8: We thank the reviewer for the comments. Although we do not have additional tissues or cells available for FACS or CyTOF staining that includes CD66b, we used CD15 and CD66b antibodies for immunofluorescence staining of placental tissue. Our findings revealed a pronounced increase in the proportion of neutrophils among PE patients, supporting the hypothesis that IL-17A produced by Th17 cells might orchestrate the migration of neutrophils towards the placental milieu (Figure 6-figure supplement 2F; Line 325-328). We have cited these references and discussed them in the Discussion section (Line 459-465).
Comment 9. Depletion of macrophages using several different methodologies (PLX3397, or clodronate liposomes) should be accompanied by supplementary data showing the efficiency of depletion, especially within tissue compartments of interest (uterine horns, placenta). The clodronate piece is not at all discussed in the main text. Both should be addressed in much more detail.
Response 9: We thank the reviewer for the comments. We had additional data on the efficiency of macrophage depletion by PLX3397 and clodronate liposomes that were not presented in the original manuscript, and we have added them to Figure 4-figure supplement 2A, 2B. The clodronate experiment is mentioned in the main text (Line 236-239) but only briefly described, because the results obtained with clodronate were similar to those obtained with PLX3397.
Comment 10. There are many heatmaps and tSNE / UMAP plots with unhelpful labels and no statistical tests applied. Many of these plots (e.g. Figure 7) could be moved to supplemental figures or pared down and combined with existing main figures to help the authors streamline and unify their message.
Response 10: We thank the reviewer for the comments. We have moved the images of Figure 7 to Figure 6-figure supplement 2. We have also moved most of the heatmaps to the supplementary materials.
Comment 11. There are claims that this study fills a gap that "only one report has provided an overall analysis of immune cells in the human placental villi in the presence and absence of spontaneous labor at term by scRNA-seq (Miller 2022)" (lines 362-364), yet this study itself does not exhaustively study all immune cell subsets...that's a monumental task, even with the two multi-omic methods used in this paper. There are several other datasets that have performed similar analyses and should be referenced.
Response 11: We thank the reviewer for the comments. We have searched for more literature and referenced additional studies that have conducted similar analyses (Line 382-393).
Comment 12. Inappropriate statistical tests are used in many of the analyses. Figures 1-2 use the Shapiro-Wilk test, which is a test of "goodness of fit", to compare unpaired groups. A Kruskal-Wallis or other nonparametric t-test is much more appropriate. In other instances, there is no mention of statistical tests (Figures 6-7) at all. Appropriate tests should be added throughout.
Response 12: We thank the reviewer for the comments. As stated in the Statistical Analysis section (lines 672-676), the Kruskal-Wallis test was used to compare the results of experiments with multiple groups. Comparisons between the two groups in Figure 5 were conducted using Student's t-test. These statistical methods have been included in the figure legends.
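To illustrate the distinction raised in Comment 12: the Shapiro-Wilk test assesses whether a single sample is consistent with a normal distribution, whereas the Kruskal-Wallis test compares several unpaired groups by ranks. The following is a minimal pure-Python sketch of the Kruskal-Wallis H statistic with the usual tie correction and chi-square approximation. It is not the authors' analysis code; the function names `chi2_sf` and `kruskal_wallis` are hypothetical, and the sample values are invented placeholders. In practice a vetted statistics package would be preferable.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (closed form, integer df)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{(df-1)/2} (x/2)^(j-1/2) / Gamma(j+1/2)
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += half ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def kruskal_wallis(groups):
    """Return (H, p) for unpaired samples; p uses the chi-square approximation."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    # Assign each distinct value the mean of the ranks it occupies (tie handling).
    rank_of, tie_term = {}, 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    if tie_term == n_total ** 3 - n_total:
        raise ValueError("all values are identical")
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3.0 * (n_total + 1)
    h /= 1.0 - tie_term / (n_total ** 3 - n_total)  # tie correction
    return h, chi2_sf(h, len(groups) - 1)


if __name__ == "__main__":
    # Placeholder cluster frequencies (%) for four groups, invented for illustration.
    np_group = [1.1, 0.9, 1.3, 1.0, 1.2]
    pe_group = [2.4, 2.9, 2.1, 3.0, 2.6]
    gdm_group = [1.2, 1.4, 1.0, 1.5, 1.1]
    gdm_pe_group = [2.2, 2.7, 2.5, 1.9, 2.8]
    h, p = kruskal_wallis([np_group, pe_group, gdm_group, gdm_pe_group])
    print(f"H = {h:.3f}, p = {p:.4f}")
```

The chi-square approximation is only reliable when each group has a reasonable number of observations; with very small groups an exact or permutation-based p-value would be a safer choice.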
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Overall, the study has several strengths, including the use of human samples and animal models, as well as the incorporation of multiple cutting-edge techniques. However, there are some significant issues with the murine model experiments that need to be addressed:
Comment 1. The authors are not consistent in their use of or focus on uterine and placental cells. These are distinct tissues, and numerous prior reports have indicated differences in the macrophage populations of these tissues, due in part to the predominantly maternal origin of macrophages in the uterus and the largely fetal origin of those in the placenta. The rationale for switching between uterine and placental cells in different experiments is not clear, and the inclusion of cells from both (such as in the bulk RNAseq experiments) could be potentially confounding.
Response 1: We thank the reviewer for the comments. We performed an experiment with green fluorescent protein (GFP) pregnant mice that was not shown in the original manuscript. Wild-type (WT) female mice were mated with either transgenic male mice expressing GFP or WT male mice, generating GFP-expressing pups (GFP-pups) or genetically unmodified pups (WT-pups), respectively. Mice were euthanized on day 18.5 of gestation, and the uteri of the pregnant females and the placentas of the offspring were analyzed by flow cytometry. The majority of macrophages in the uterus and placenta were of maternal origin, defined as GFP-negative. In contrast, fetal-derived macrophages, identified by their expression of GFP, represented only a small fraction of the total macrophage population. We have added the GFP pregnant mice-related data in Figure 4-figure supplement 1D-1E to explain the different macrophage populations in the uterine and placental cells.
Comment 2. The observational data for the initial experiment transferring RUPP-derived macrophages to normal pregnant mice (without any other manipulations) seems to be missing. They do not seem to be presented in Figure 4 where they are expected based on the results text.
Response 2: We thank the reviewer for the comments. We have added the observational data (Figure 4-figure supplement 1D, 1E) and a corresponding description of the data (Line 198-203).
Comment 3. The action of the anti-macrophage compounds is not well explained, nor are their mechanisms validated as affecting or not affecting the placental/fetal macrophage populations. It is important to clarify whether the macrophages are depleted or merely inhibited by these treatments, and it is absolutely critical to determine whether these treatments are affecting placental/fetal macrophage populations (the latter indicative of placental transfer), given the focus on placental macrophages.
Response 3: We thank the reviewer for the comments. PLX3397 is an inhibitor of CSF1R, which is needed for macrophage development (Nature. 2023, PMID: 36890231; Cell Mol Immunol. 2022, PMID: 36220994); we have stated this on Line 227-230. However, PLX3397 is a small-molecule compound that has the potential to cross the placental barrier and affect fetal macrophages. We have discussed the impact of this factor on the experiment in the Discussion section (Line 457-459).
Comment 4. The interpretation of the murine single-cell data is hampered by the lack of means for distinguishing donor cells from recipient cells, which is important when seeking to identify the influence of the donor cells.
Response 4: We thank the reviewer for the comments. Upon analysis, we observed a notable elevation in the frequency of total macrophages within the CD45<sup>+</sup> cell population. We then performed macrophage clustering and uncovered a marked increase in the frequency of Cluster 0, implying a potential correlation between Cluster 0 and donor-derived cells. RNA sequencing revealed that the F480<sup>+</sup>CD206<sup>-</sup> pro-inflammatory donor macrophages exhibited a Folr2<sup>+</sup>Ccl7<sup>+</sup>Ccl8<sup>+</sup>C1qa<sup>+</sup>C1qb<sup>+</sup>C1qc<sup>+</sup> phenotype, which is consistent with the phenotype of cluster 0 in macrophages observed in single-cell RNA sequencing (Figure 4D and Figure 5E). Therefore, the donor cells should be in cluster 0 of the macrophages.
Comment 5. The switch to the LPS model in the final experiments is a limitation, as this model more closely resembles the systemic inflammation seen in endotoxemia rather than the specific pathology of preeclampsia (PE). While this is not an exhaustive list, the number of weaknesses in the experimental design makes it difficult to evaluate the findings comprehensively.
Response 5: We thank the reviewer for the comments. First, the other animal experiments in this manuscript used the RUPP mouse model to simulate the pathology of PE. However, the RUPP model requires ligation of the uterine arteries in pregnant mice on day 12.5 of gestation, which hinders T cells reinfused through the tail vein from reaching the maternal-fetal interface. In addition, this experiment aims to prove that CD4<sup>+</sup> T cells differentiate into memory-like Th17 cells through IGF-1R signaling and thereby affect pregnancy; to this end, CD4<sup>+</sup> T cells were cleared in vivo with an anti-CD4 antibody, followed by injection of IGF-1R inhibitor-treated CD4<sup>+</sup> T cells. We proved that injection of RUPP-derived memory-like CD4<sup>+</sup> T cells into pregnant mice induces PE-like symptoms (Figure 6F-6H). In summary, applying the LPS model in the final experiments does not affect the conclusions.
Minor comments:
Comment 1. Introduction, Lines 67-74: The phrasing here is unclear as to the roles that each mentioned immune cell subset is playing in preeclampsia. Given the statement "Elevated levels of maternal inflammation...", does this imply that the numbers of all mentioned immune cell subsets are increased in the maternal circulation? If not, please consider rewording this.
Response 1: We thank the reviewer for the comments. We have revised the manuscript as follows: Currently, the pivotal mechanism underpinning the pathogenesis of preeclampsia is widely acknowledged to involve an increased frequency of pro-inflammatory M1-like maternal macrophages, along with an elevation in granulocytes capable of superoxide generation, CD56<sup>+</sup> CD94<sup>+</sup> natural killer (NK) cells, CD19<sup>+</sup>CD5<sup>+</sup> B1 lymphocytes, and activated γδ T cells. Conversely, this pathological process is accompanied by a notable decrease in the frequency of anti-inflammatory M2-like macrophages and NKp46<sup>+</sup> NK cells (Line 67-77).
Comment 2. Introduction, Lines 67-80: Is the involvement of the described immune cell subsets largely ubiquitous to preeclampsia? Recent multi-omic studies suggest that preeclampsia is a heterogeneous condition with different subsets, some more biased towards systemic immune activation than others. Thus, it is important to clarify whether the involvement of specific immune subsets is generally observed or more specific.
Response 2: We thank the reviewer for the comments. We have added a new paragraph as follows: Moreover, PE can be subdivided into early-onset and late-onset PE, diagnosed before 34 weeks or from 34 weeks of gestation, respectively. Research has revealed that, among the many cellular alterations in PE, pro-inflammatory M1-like macrophages and intrauterine B1 cells display an augmented presence at the maternal-fetal interface of both early-onset and late-onset PE patients. Decidual natural killer (dNK) cells and neutrophils emerge as key contributors, playing a more crucial role in the pathogenesis of early-onset PE than of late-onset PE (Front Immunol. 2020. PMID: 33013837) (Line 83-89).
Comment 3. Introduction, Lines 81-86: The point of this short paragraph is not clear; the authors mention two very specific cellular interactions without explaining why.
Response 3: In the previous paragraph, we described a heightened inflammatory response among multiple immune cells in patients with PE, yet the interplay between these individual immune cells has seldom been elucidated in PE patients. This is why we discuss specific immune cellular interactions in relation to other pregnancy complications in this paragraph (Line 91-98).
Comment 4. Methods: What placental tissues (e.g., villous tree, chorionic plate, extraplacental membranes) were included for CyTOF analysis? Was any decidual tissue (e.g., basal plate) included? Please clarify.
Response 4: Placental villi, rather than chorionic plate or extraplacental membranes, were used for CyTOF in this study. The relevant content has been incorporated into the "Materials and Methods" section (Line 564-576).
Comment 5. Results, Table 1: The authors should clarify that all PE samples were not full term (i.e., were less than 37 weeks of gestation), which is to be expected. In addition, were the PE cases all late-onset PE?
Response 5: All PE samples listed in Table 1 are late-onset preeclampsia cases, with placental specimens obtained from patients at more than 35 weeks and less than 38 weeks of gestation. The relevant content has been incorporated into the "Materials and Methods" section (Line 574-576).
Comment 6. Results, Figure 1: Are the authors considering the identified Macrophage cluster as being largely fetal (e.g., Hofbauer cells)? This also depends on whether any decidual tissue was included in the placental samples for CyTOF.
Response 6: First, the specimens subjected to CyTOF analysis were devoid of decidual tissue and consisted exclusively of placental villi. Second, the Macrophage cluster in Figure 1 undeniably encompasses Hofbauer cells, and we consider that fetal-derived macrophages likely constitute a substantial proportion of this population. However, a limitation of the CyTOF technique is its inability to discern between maternal and fetal origins of these cells, which precludes a definitive distinction.
Comment 7. Results, Figure 2C: Did the authors validate other T-cell subset markers (e.g., Th1, Th2, Th9, etc.)?
Response 7: In this study, we did not validate additional T-cell subset markers presented in Figure 2C, recognizing the potential for deeper insights. As we embark on our subsequent research endeavors, we aim to meticulously explore and characterize the intricate changes in diverse T-cell populations at the maternal-fetal interface, with a particular focus on preeclampsia patients, thereby advancing our understanding of this complex condition.
Comment 8. Results, Figure 2D: Where were the detected memory-like T cells located in the placenta? Did they cluster in certain areas or were they widely distributed?
Response 8: Upon a thorough re-evaluation of the immunofluorescence images specific to the placenta, we observed a notable preponderance of memory-like T cells residing within the placental sinusoids (Line135-139).
Comment 9. Results, Figure 2E: I would suggest separating the two plots so that the Y-axis can be expanded for TIM3, as it is impossible to view the medians currently.
Response 9: We thank the reviewer for the comments. We have adjusted Figure 2E according to the reviewer's suggestions.
Comment 10. Results, Lines 138-140: Do the authors consider that the altered T-cells are largely resident cells of the placenta or newly invading/recruited cells? The clarification of distribution within the placental tissues as mentioned above would help answer this.
Response 10: Our analysis revealed the presence of memory-like T cells within the placental sinusoids, as evident from the immunofluorescence examination of placental tissues. Consequently, these T cells may represent recently recruited cellular entities, traversing the placental vasculature and integrating into this unique maternal-fetal microenvironment (Line135-139).
Comment 11. Results, Figure 3C: Has a reduction of gMDSCs (or MDSCs in general) been previously reported in PE?
Response 11: Myeloid-derived suppressor cells (MDSCs) constitute a diverse population of myeloid-derived cells that exhibit immunosuppressive functions under various conditions. Previous reports have documented a decrease in the levels of gMDSCs in peripheral blood or umbilical cord blood among patients with preeclampsia (Am J Reprod Immunol. 2020, PMID: 32418253; J Reprod Immunol. 2018, PMID: 29763854; Biol Reprod. 2023, PMID: 36504233). Nevertheless, there have been no documented reports thus far on the alterations and specific characteristics of gMDSCs within the placenta of PE patients.
Comment 12. Results, Figure 3D-E: It is not clear what new information is added by the correlations, as the increase of both cluster 23 in CD11b+ cells and cluster 8 in CD4+ T cells in PE cases was already apparent. Are these simply to confirm what was shown from the quantification data?
Response 12: Despite the evident increase in both cluster 23 within CD11b<sup>+</sup> cells and cluster 8 within CD4<sup>+</sup> T cells in PE cases, the existence of a potential correlation between these two clusters remains elusive. To gain insight into this question, we conducted a Pearson correlation analysis, which is presented in Figure 3D-E, revealing a positive correlation between the two clusters.
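A Pearson correlation of the kind described in Response 12 can be sketched as follows. This is an illustration only: the function name `pearson_r` is hypothetical, and the per-sample cluster frequencies are invented placeholder values, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Placeholder per-sample frequencies (%), invented for illustration.
    cluster23_in_cd11b = [2.1, 3.4, 1.8, 4.0, 2.9]
    cluster8_in_cd4 = [1.0, 1.9, 0.7, 2.2, 1.5]
    print(f"r = {pearson_r(cluster23_in_cd11b, cluster8_in_cd4):.3f}")
```

A coefficient close to 1 would indicate that the two cluster frequencies rise together across samples; it does not by itself establish a causal relationship between the two populations.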
Comment 13. Results, Figure 4A: Please clarify in the results text that the RNA-seq of macrophages from RUPP mice was performed prior to their injection into normal pregnant mice.
Response 13: We thank the reviewer for the comments. We have updated Figure 4A according to the reviewer's suggestions.
Comment 14. Results / Methods, Figure 4: For the transfer of macrophages from RUPP mice into normal mice, why were the uterine tissues included to isolate cells? The uterine macrophages will be almost completely maternal, as opposed to the largely fetal placental macrophages, and despite the sorting for specific markers these are likely distinct subsets that have been combined for injection. This could potentially impact the differential gene expression analysis and should be accounted for. In addition, did murine placental samples include decidua? This should be clarified.
Response 14: We thank the reviewer for the comments. For our experimental design involving human samples, we selected placental tissue as the primary focus. Initially, we aimed for uniformity by considering the use of mouse placenta. However, the GFP pregnant mice-related data in Figure 4-figure supplement 1D, 1E showed that the uterus and placenta of mice are predominantly populated by maternal macrophages, with fetal macrophages virtually absent, marking a notable divergence from the human scenario. Furthermore, macrophages account for more than 20% of total cells in the uterus, whereas in the placenta this proportion falls to less than 5%, underscoring a distinct distribution pattern. Given these discrepancies, we incorporated mouse uterine tissues into our protocol to isolate cells, ensuring a more comprehensive and informative exploration that acknowledges the inherent differences between human and mouse placental biology.
Comment 15. Results, Lines 186-187: I think the figure citation should be Figure 4D here.
Response 15: We thank the reviewer for the careful checking. We have revised and updated Figure 4 accordingly.
Comment 16. Results, Figure 4: Where are the results of the injection of anti-inflammatory and pro-inflammatory macrophages into normal mice? This experiment is mentioned in Figure 4A, but the only results shown in Figure 4 are with the PLX3397 depletion.
Response 16: The aim of the experiment in Figure 4 is to ascertain the influence of pro-inflammatory and anti-inflammatory macrophages on the other immune cells within the maternal-fetal interface, as well as their implications for pregnancy outcomes. To achieve this, we administered PLX3397, a compound capable of eliminating the preexisting macrophages in mice. Subsequently, anti-inflammatory or pro-inflammatory macrophages were injected into these mice, thereby eliminating the confounding influence of the native macrophage population. This approach allows a clearer observation of the specific effects these two types of macrophages exert on the immune landscape at the maternal-fetal interface and their ultimate impact on pregnancy outcomes.
Comment 17. Results, Lines 189-190: Does PLX3397 inhibit macrophage development/signaling/etc. or result in macrophage depletion? This is an important distinction. If depletion is induced, does this affect placental/fetal macrophages or just maternal macrophages?
Response 17: We thank the reviewer for the comments. We have added data on the efficiency of macrophage depletion by PLX3397 in Figure 4-figure supplement 2A. PLX3397 is a small-molecule compound that has the potential to cross the placental barrier and affect fetal macrophages. We have discussed the impact of this factor on the experiment in the Discussion section (Line 457-459).
Comment 18. Results, Lines 197-198: Similarly, does clodronate liposome administration affect only maternal macrophages, or also placental/fetal macrophages?
Response 18: We thank the reviewers for their comments. We have added data on the efficiency of macrophage depletion by clodronate liposomes in Figure 4-figure supplement 2B. Clodronate liposomes are vesicles that encapsulate diverse substances, whereas only small-molecule compounds have the potential to cross the placental barrier. Consequently, we consider that the influence of these liposomes is likely confined to maternal macrophages (Artif Cells Nanomed Biotechnol. 2023. PMID: 37594208).
Comment 19. Results, Line 206: A minor point, but consider continuing to refer to the preeclampsia model mice as RUPP mice rather than PE mice.
Response 19: We thank the reviewers for their comments. We have revised and updated this section accordingly.
Comment 20. Results / Methods, Figure 5: For these experiments, why did the authors focus on the mouse uterus?
Response 20: We have previously addressed this query in our Response 14. We incorporated mouse uterine tissues for cell isolation due to the profound differences in placental biology between humans and mice.
Comment 21. Results, Figure 5: Did the authors have a means of distinguishing the transferred donor cells from the recipient cells for their single-cell analysis? If the goal is to separate the effects of the macrophage transfer on other uterine immune cells, then it would be important to identify and separate the donor cells.
Response 21: We thank the reviewers for their comments. Upon analysis, we observed a notable elevation in the frequency of total macrophages within the CD45<sup>+</sup> cell population. We then performed macrophage clustering and found a marked increase in the frequency of Cluster 0, implying a potential correlation between Cluster 0 and donor-derived cells. RNA sequencing revealed that the F480<sup>+</sup>CD206<sup>-</sup> pro-inflammatory donor macrophages exhibited a Folr2<sup>+</sup>Ccl7<sup>+</sup>Ccl8<sup>+</sup>C1qa<sup>+</sup>C1qb<sup>+</sup>C1qc<sup>+</sup> phenotype, which is consistent with the phenotype of macrophage Cluster 0 observed in single-cell RNA sequencing (Figure 4D and Figure 5E). Therefore, the donor cells most likely fall within macrophage Cluster 0.
Comment 22. Results, Lines 247-248: While the authors have prudently noted that the observed T-cell phenotypes are merely suggestive of immunosuppression, any claims regarding changes in the immunosuppressive function after macrophage transfer would require functional studies of the T cells.
Response 22: We thank the reviewers for their comments. After reviewing the pertinent literature, we have changed our terminology from 'immunosuppression' to 'immunomodulation' to improve the accuracy of our Results (Lines 285-287).
Comment 23. Results, Figure 6G: The observation of worsened outcomes and PE-like symptoms after T-cell transfer is interesting, but other models of PE induced by the administration of Th1-like cells have already been reported. Are the authors' findings consistent with these reports? These findings are strengthened by the evaluation of second-pregnancy outcomes following the transfer of T cells in the first pregnancy.
Response 23: We thank the reviewers for their comments. As we verified in Figure 6F-6H, the injection of CD4<sup>+</sup>CD44<sup>+</sup> T cells derived from RUPP mice, characterized by a reduced frequency of Tregs and an increased frequency of Th17 cells, could induce PE-like symptoms in pregnant mice. This is in line with other studies that have implicated Th1-like cells in PE-like symptoms, and we further propose that, beyond Th1 cells, Th17 cells also have the potential to induce PE-like symptoms.
Comment 24. Results, Lines 327-337: The disease model implied by the authors here is not clear. Given that the authors' human findings are in the placental macrophages, are the authors proposing that placental macrophages are induced to an M1 phenotype by placenta-derived EVs? Please elaborate on and clarify the proposed model.
Response 24: In our previous article, "Trophoblast-Derived Extracellular Vesicles Promote Preeclampsia by Regulating Macrophage Polarization" (Hypertension. 2022, PMID: 35993233), we used trophoblast-derived extracellular vesicles isolated from PE patients to induce an M1-like phenotype in macrophages from human peripheral blood in vitro. In the present study, we therefore applied this established method directly to induce pro-inflammatory macrophages.
Comment 25. Results / Methods, Figure 8E-H: What is the reasoning for switching to an LPS model in this experiment? LPS is less specific to PE than the RUPP model.
Response 25: We thank the reviewers for their comments. First, the other animal experiments in this manuscript used the RUPP mouse model to simulate the pathology of PE. However, the RUPP model requires ligation of the uterine arteries in pregnant mice on day 12.5 of gestation, which hinders T cells injected via the tail vein from reaching the maternal-fetal interface. Second, this experiment aims to show that CD4<sup>+</sup> T cells differentiate into memory-like Th17 cells through IGF-1R signaling and thereby affect pregnancy; to test this, we depleted CD4<sup>+</sup> T cells in vivo with an anti-CD4 antibody and then injected IGF-1R inhibitor-treated CD4<sup>+</sup> T cells. We had also already shown that injection of RUPP-derived memory-like CD4<sup>+</sup> T cells into pregnant mice induces PE-like symptoms (Figure 6). In summary, the use of the LPS model in the final experiments does not affect the conclusions.
Comment 26. Discussion: What do the authors consider to be the origins of the inflammatory cells associated with PE onset? Are these maternal cells invading the placental tissues, or are these placental resident (likely fetal) cells?
Response 26: We thank the reviewers for their comments. Numerous reports have observed inflammatory cells and factors in the maternal peripheral blood and placental tissues of PE patients, fostering the prevailing notion that the progression of PE is linked to the maternal immune system's inflammatory response towards the fetus. Nevertheless, findings from single-cell RNA sequencing have challenged this perspective (Elife. 2019. PMID: 31829938; Proc Natl Acad Sci U S A. 2017. PMID: 28830992). These studies reveal that the placenta harbors not just immune cells of maternal origin but also those of fetal origin, raising the question of whether the inflammatory cells are maternal cells infiltrating placental tissues or resident (possibly fetal) placental cells. Further investigation is needed to resolve this question.
Comment 27. Discussion: Given the observed lack of changes in the GDM or GDM+PE groups, do the authors consider that GDM represents a distinct pathology that can lead to secondary PE, and thus is different from primary PE without GDM?
Response 27: It is possible. Although previous studies reported that GDM is associated with aberrant maternal immune cell adaptation, the findings remain controversial. In our study, GDM did not induce significant alterations in the placental immune cell profile, which led us to focus on the immune mechanism in PE. However, it remains unclear why individuals with GDM&PE were protected from the immune alterations at the maternal-fetal interface. The limited number of placental samples in the GDM&PE group may partly explain this, as it is hard to collect clean samples free of confounding factors. One study reported that macrophages in human placenta maintained anti-inflammatory properties despite GDM (Front Immunol, 2017, PMID: 28824621). Barke et al. also found more CD163<sup>+</sup> cells in GDM placentas than in normal controls (PLoS One, 2014, PMID: 24983948). Thus, GDM is likely to have a protective effect on the placental immune environment when individuals also have PE.
Reviewer #2 (Recommendations for the authors):
Comment 1. IF images need to be quantified.
Response 1: We thank the reviewers for their comments. We have quantified the fluorescence intensity and added it to Figure 2D.
Comment 2. Cluster 12 in Figure 3 is labeled as granulocytes but listed under macrophages.
Response 2: We thank the reviewers for their careful checking. We have revised and updated Figure 3A.
Comment 3. Figure 4 labels in the text and figure do not match, no 4G in the figure.
Response 3: We thank the reviewers for their careful checking. The figure labels of Figure 4 have been revised and updated.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank the reviewers for their thorough reading and thoughtful feedback. Below, we provisionally address each of the concerns raised in the public reviews, and outline our planned revision that aims to further clarify and strengthen the manuscript.
In our response, we clarify our conceptualization of elasticity as a dimension of controllability, formalizing it within an information-theoretic framework, and demonstrating that controllability and its elasticity are partially dissociable. Furthermore, we provide clarifications and additional modeling results showing that our experimental design and modeling approach are well-suited to dissociating elasticity inference from more general learning processes, and are not inherently biased to find overestimates of elasticity. Finally, we clarify the advantages and disadvantages of our canonical correlation analysis (CCA) approach for identifying latent relationships between multidimensional data sets, and provide additional analyses that strengthen the link between elasticity estimation biases and a specific psychopathology profile.
Reviewer 1:
This research takes a novel theoretical and methodological approach to understanding how people estimate the level of control they have over their environment, and how they adjust their actions accordingly. The task is innovative and both it and the findings are well-described (with excellent visuals). They also offer thorough validation for the particular model they develop. The research has the potential to theoretically inform the understanding of control across domains, which is a topic of great importance.
We thank the reviewer for their favorable appraisal and valuable suggestions, which have helped clarify and strengthen the study’s conclusion.
An overarching concern is that this paper is framed as addressing resource investments across domains that include time, money, and effort, and the introductory examples focus heavily on effort-based resources (e.g., exercising, studying, practicing). The experiments, though, focus entirely on the equivalent of monetary resources - participants make discrete actions based on the number of points they want to use on a given turn. While the same ideas might generalize to decisions about other kinds of resources (e.g., if participants were having to invest the effort to reach a goal), this seems like the kind of speculation that would be better reserved for the Discussion section rather than using effort investment as a means of introducing a new concept (elasticity of control) that the paper will go on to test.
We thank the reviewer for pointing out a lack of clarity regarding the kinds of resources tested in the present experiment. Investing additional resources in the form of extra tickets did not only require participants to pay more money. It also required them to invest additional time – since each additional ticket meant making another attempt to board the vehicle, extending the duration of the trial, and attentional effort – since every attempt required precisely timing a spacebar press as the vehicle crossed the screen. Given this involvement of money, time, and effort resources, we believe it would be imprecise to present the study as concerning monetary resources in particular. That said, we agree with the Reviewer that results might differ depending on the resource type that the experiment or the participant considers most. Thus, in our revision of the manuscript, we will make sure to clarify the kinds of resources the experiment involved, and highlight the open question of whether inferences concerning the elasticity of control generalize across different resource domains.
Setting aside the framing of the core concepts, my understanding of the task is that it effectively captures people's estimates of the likelihood of achieving their goal (Pr(success)) conditional on a given investment of resources. The ground truth across the different environments varies such that this function is sometimes flat (low controllability), sometimes increases linearly (elastic controllability), and sometimes increases as a step function (inelastic controllability). If this is accurate, then it raises two questions.
First, on the modeling front, I wonder if a suitable alternative to the current model would be to assume that the participants are simply considering different continuous functions like these and, within a Bayesian framework, evaluating the probabilistic evidence for each function based on each trial's outcome. This would give participants an estimate of the marginal increase in Pr(success) for each ticket, and they could then weigh the expected value of that ticket choice (Pr(success)*150 points) against the marginal increase in point cost for each ticket. This should yield similar predictions for optimal performance (e.g., opt-out for lower controllability environments, i.e., flatter functions), and the continuous nature of this form of function approximation also has the benefit of enabling tests of generalization to predict changes in behavior if there was, for instance, changes in available tickets for purchase (e.g., up to 4 or 5) or changes in ticket prices. Such a model would of course also maintain a critical role for priors based on one's experience within the task as well as over longer timescales, and could be meaningfully interpreted as such (e.g., priors related to the likelihood of success/failure and whether one's actions influence these). It could also potentially reduce the complexity of the model by replacing controllability-specific parameters with multiple candidate functions (presumably learned through past experience, and/or tuned by experience in this task environment), each of which is being updated simultaneously.
Second, if the reframing above is apt (regardless of the best model for implementing it), it seems like the taxonomy being offered by the authors risks a form of "jangle fallacy," in particular by positing distinct constructs (controllability and elasticity) for processes that ultimately comprise aspects of the same process (estimation of the relationship between investment and outcome likelihood). Which of these two frames is used doesn't bear on the rigor of the approach or the strength of the findings, but it does bear on how readers will digest and draw inferences from this work. It is ultimately up to the authors which of these they choose to favor, but I think the paper would benefit from some discussion of a common-process alternative, at least to prevent too strong of inferences about separate processes/modes that may not exist. I personally think the approach and findings in this paper would also be easier to digest under a common-construct approach rather than forcing new terminology but, again, I defer to the authors on this.
We thank the reviewer for suggesting this interesting alternative modeling approach. We agree that a Bayesian framework evaluating different continuous functions could offer advantages, particularly in its ability to generalize to other ticket quantities and prices. We will attempt to implement this as an alternative model and compare it with the current model.
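A minimal sketch of the kind of model the Reviewer describes is given below. It is an illustration under stated assumptions, not the model reported in the manuscript: the three candidate curves, their numeric values, the ticket price `TICKET_COST`, and the function names are all hypothetical. Only the 150-point reward is taken from the Reviewer's comment. Each candidate curve gives Pr(success) as a function of tickets purchased, the posterior over curves is updated after every outcome, and the ticket count is chosen by weighing expected reward against cost.

```python
# Candidate curves: Pr(success) for 0, 1, 2, or 3 tickets.
# Shapes and values are illustrative assumptions, not fitted quantities.
CANDIDATES = {
    "flat":   [0.2, 0.2, 0.2, 0.2],   # low controllability
    "graded": [0.2, 0.2, 0.6, 1.0],   # success grows with investment (elastic)
    "step":   [0.2, 1.0, 1.0, 1.0],   # one ticket suffices (inelastic)
}
REWARD = 150        # points for reaching the goal (from the Reviewer's comment)
TICKET_COST = 20    # hypothetical price per ticket


def update(posterior, tickets, success):
    """One Bayes step: reweight each candidate by the likelihood of the outcome."""
    unnormalized = {}
    for name, curve in CANDIDATES.items():
        p = curve[tickets]
        unnormalized[name] = posterior[name] * (p if success else 1.0 - p)
    total = sum(unnormalized.values())
    return {name: w / total for name, w in unnormalized.items()}


def best_ticket_count(posterior):
    """Return the ticket count with the highest expected net points."""
    def net_value(c):
        p_success = sum(posterior[n] * CANDIDATES[n][c] for n in CANDIDATES)
        return p_success * REWARD - TICKET_COST * c
    return max(range(4), key=net_value)


# Start from a uniform prior over the candidate curves.
posterior = {name: 1 / 3 for name in CANDIDATES}
# Succeeding with a single ticket shifts belief towards the step-shaped curve.
posterior = update(posterior, tickets=1, success=True)
print(posterior, best_ticket_count(posterior))
```

Because the candidate curves are defined for every ticket count, the same structure could in principle be extended to other ticket quantities or prices, which is the generalization advantage the Reviewer points to.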
We also acknowledge the importance of avoiding a potential "jangle fallacy". We entirely agree with the Reviewer that elasticity and controllability inferences are not distinct processes. Specifically, we view resource elasticity as a dimension of controllability, hence the name of our ‘elastic controllability’ model. In response to this and other Reviewers’ comments, we now offer a formal definition of elasticity as the reduction in uncertainty about controllability due to knowing the amount of resources the agent is able and willing to invest (see further details in response to Reviewer 3 below).
With respect to how this conceptualization is expressed in the modelling, we note that the representation in our model of maximum controllability and its elasticity via different variables is analogous to how a distribution may be represented by separate mean and variance parameters. Ultimately, even in the model suggested by the Reviewer, there would need to be a dedicated variable representing elasticity, such as the probability of sloped controllability functions. A single-process account thus allows that different aspects of this process would be differently biased (e.g., one can have an accurate estimate of the mean of a distribution but overestimate its variance). Therefore, our characterization of distinct elasticity and controllability biases (or to put it more accurately, ‘elasticity of controllability bias’ and ‘maximum controllability bias’) is consistent with a common construct account.
That said, given the Reviewer’s comments, we believe that some of the terminology we used may have been misleading. In our planned revision, we will modify the text to clarify that we view elasticity as a dimension of controllability that can only be estimated in conjunction with controllability.
Reviewer 2:
This research investigates how people might value different factors that contribute to controllability in a creative and thorough way. The authors use computational modeling to try to dissociate "elasticity" from "overall controllability," and find some differential associations with psychopathology. This was a convincing justification for using modeling above and beyond behavioral output and yielded interesting results. Interestingly, the authors conclude that these findings suggest that biased elasticity could distort agency beliefs via maladaptive resource allocation. Overall, this paper reveals some important findings about how people consider components of controllability.
We appreciate the Reviewer's positive assessment of our findings and computational approach to dissociating elasticity and overall controllability.
The primary weakness of this research is that it is not entirely clear what is meant by "elastic" and "inelastic" and how these constructs differ from existing considerations of various factors/calculations that contribute to perceptions of and decisions about controllability. I think this weakness is primarily an issue of framing, where it's not clear whether elasticity is, in fact, theoretically dissociable from controllability. Instead, it seems that the elements that make up "elasticity" are simply some of the many calculations that contribute to controllability. In other words, an "elastic" environment is inherently more controllable than an "inelastic" one, since both environments might have the same level of predictability, but in an "elastic" environment, one can also partake in additional actions to have additional control over achieving the goal (i.e., expend effort, money, time).
We thank the reviewer for highlighting the lack of clarity in our concept of elasticity. We first clarify that elasticity cannot be entirely dissociated from controllability because it is a dimension of controllability. If no controllability is afforded, then there cannot be elasticity or inelasticity. This is why in describing the experimental environments, we only label high-controllability, but not low-controllability, environments as ‘elastic’ or ‘inelastic’. For further details on this conceptualization of elasticity, and a planned revision of the text, see our response above to Reviewer 1.
Second, we now clarify that controllability can also be computed without knowing the amount of resources the agent is able and willing to invest, for instance by assuming infinite resources available or a particular distribution of resource availabilities. However, knowing the agent’s available resources often reduces uncertainty concerning controllability. This reduction in uncertainty is what we define as elasticity. Since any action requires some resources, this means that no controllable environment is entirely inelastic if we also consider agents that do not have enough resources to commit any action. However, even in this case environments can differ in the degree to which they are elastic. For further details on this formal definition, see our response to Reviewer 3 below. We will make these necessary clarifications in the revised manuscript.
Importantly, whether an environment is more or less elastic does not determine whether it is more or less controllable. In particular, environments can be more controllable yet less elastic. This is true even if we allow that investing different levels of resources (i.e., purchasing 0, 1, 2, or 3 tickets) constitutes different actions, in conjunction with participants’ vehicle choices. Below, we show this using two existing definitions of controllability.
Definition 1, reward-based controllability<sup>1</sup>: If control is defined as the fraction of available reward that is controllably achievable, and we assume all participants are in principle willing and able to invest 3 tickets, controllability can be computed in the present task as:
where P(S' = goal ∣ 𝑆, 𝐴, 𝐶) is the probability of reaching the treasure from present state 𝑆 when taking action 𝐴 and investing 𝐶 resources in executing the action. In any of the task environments, the probability of reaching the goal is maximized by purchasing 3 tickets (𝐶 = 3) and choosing the vehicle that leads to the goal (𝐴 = correct vehicle). Conversely, the probability of reaching the goal is minimized by purchasing 3 tickets (𝐶 = 3) and choosing the vehicle that does not lead to the goal (𝐴 = wrong vehicle). This calculation is thus entirely independent of elasticity, since it only considers what would be achieved by maximal resource investment, whereas elasticity consists of the reduction in controllability that would arise if the maximal available 𝐶 is reduced. Consequently, any environment where the maximum available control is higher yet varies less with resource investment would be more controllable and less elastic.
Note that if we also account for ticket costs in calculating reward, this will only reduce the fraction of achievable reward and thus the calculated control in elastic environments.
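One form consistent with the description of Definition 1 is sketched below. It is a reconstruction from the surrounding text rather than a quoted formula: it assumes that reward-based controllability, written here as 𝜒, is the gap between the highest and the lowest probability of reaching the goal that any combination of action 𝐴 and investment 𝐶 can produce.

```latex
\chi \;=\; \max_{A,\,C} P(S' = \text{goal} \mid S, A, C) \;-\; \min_{A,\,C} P(S' = \text{goal} \mid S, A, C)
```

Under this reading, an environment in which 3 tickets and the correct vehicle always reach the goal, while 3 tickets and the wrong vehicle never do, gives 𝜒 = 1 − 0 = 1 regardless of how success varies at lower investment levels.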
Definition 2, information-theoretic controllability<sup>2</sup>: Here controllability is defined as the reduction in outcome entropy due to knowing which action is taken:
I(S'; A, C | S) = H(S'|S) - H(S'|S, A, C)
where H(S'|S) is the conditional entropy of the distribution of outcomes S' given the present state 𝑆, and H(S'|S, A, C) is the conditional entropy of the outcome given the present state, action, and resource investment.
To compare controllability, we consider two environments with the same maximum control:
• Inelastic environment: If the correct vehicle is chosen, there is a 100% chance of reaching the goal state with 1, 2, or 3 tickets. Thus, out of 7 possible action-resource investment combinations, three deterministically lead to the goal state (≥1 tickets and correct vehicle choice), three never lead to it (≥1 tickets and wrong vehicle choice), and one (0 tickets) leads to it 20% of the time (since walking leads to the treasure on 20% of trials).
• Elastic environment: If the correct vehicle is chosen, the probability of boarding it is 0% with 1 ticket, 50% with 2 tickets, and 100% with 3 tickets. Thus, out of 7 possible action-resource investment combinations, one deterministically leads to the goal state (3 tickets and correct vehicle choice), one never leads to it (3 tickets and wrong vehicle choice), one leads to it 60% of the time (2 tickets and correct vehicle choice: 50% boarding + 50% × 20% when failing to board), one leads to it 10% of the time (2 tickets and wrong vehicle choice), and three lead to it 20% of the time (0-1 tickets).
Here we assume a uniform prior over actions, which renders the information-theoretic definition of controllability equal to another definition termed ‘instrumental divergence’<sup>3,4</sup>. We note that changing the uniform prior assumption would change the results for the two environments, but that would not change the general conclusion that there can be environments that are more controllable yet less elastic.
Step 1: Calculating H(S'|S)
For the inelastic environment:
P(goal) = (3 × 100% + 3 × 0% + 1 × 20%)/7 = .46, P(non-goal) = .54
H(S'|S) = – [.46 × log<sub>2</sub>(.46) + .54 × log<sub>2</sub>(.54)] = 1 bit
For the elastic environment:
P(goal) = (1 × 100% + 1 × 0% + 1 × 60% + 1 × 10% + 3 × 20%)/7 = .33, P(non-goal) = .67
H(S'|S) = – [.33 × log<sub>2</sub>(.33) + .67 × log<sub>2</sub>(.67)] = .91 bits
Step 2: Calculating H(S'|S, A, C)
Inelastic environment: Six action-resource investment combinations have deterministic outcomes entailing zero entropy, whereas investing 0 tickets has a probabilistic outcome (20%). The entropy for 0 tickets is: H(S'|C = 0) = – [.2 × log<sub>2</sub>(.2) + .8 × log<sub>2</sub>(.8)] = .72 bits. Since this action-resource investment combination is chosen with probability 1/7, the total conditional entropy is approximately .10 bits.
Elastic environment: 2 actions have deterministic outcomes (3 tickets with correct/wrong vehicle), whereas the other 5 actions have probabilistic outcomes:
2 tickets and correct vehicle (60% success):
H(S'|A = correct, C = 2) = – [.6 × log<sub>2</sub>(.6) + .4 × log<sub>2</sub>(.4)] = .97 bits
2 tickets and wrong vehicle (10% success):
H(S'|A = wrong, C = 2) = – [.1 × log<sub>2</sub>(.1) + .9 × log<sub>2</sub>(.9)] = .47 bits
0-1 tickets (20% success):
H(S'|C = 0-1) = – [.2 × log<sub>2</sub>(.2) + .8 × log<sub>2</sub>(.8)] = .72 bits
Thus the total conditional entropy of the elastic environment is: H(S'|S, A, C) = (1/7) × .97 + (1/7) × .47 + (3/7) × .72 = .52 bits
Step 3: Calculating I(S'; A, C | S)
Inelastic environment: I(S'; A, C | S) = H(S'|S) – H(S'|S, A, C) = 1 – 0.1 = .9 bits
Elastic environment: I(S'; A, C | S) = H(S'|S) – H(S'|S, A, C) = .91 – .52 = .39 bits
Thus, the inelastic environment offers higher information-theoretic controllability (.9 bits) compared to the elastic environment (.39 bits).
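The three steps above can be reproduced with a short script. The sketch below is illustrative only: the function names `binary_entropy` and `controllability` are hypothetical, the outcome is treated as binary (goal or non-goal), and a uniform prior over the 7 action-resource investment combinations is assumed, as in the text. Because the script does not round intermediate values, its results can differ from the hand calculation in the second decimal place.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a goal / non-goal outcome with P(goal) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def controllability(goal_probs):
    """I(S'; A, C | S) under a uniform prior over the listed combinations."""
    n = len(goal_probs)
    # Step 1: H(S'|S), using P(goal) averaged over all combinations.
    h_outcome = binary_entropy(sum(goal_probs) / n)
    # Step 2: H(S'|S, A, C), the average entropy within each combination.
    h_given_choice = sum(binary_entropy(p) for p in goal_probs) / n
    # Step 3: the difference is the information-theoretic controllability.
    return h_outcome - h_given_choice


# P(goal) for each of the 7 combinations, as listed for the two environments.
inelastic = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.2]
elastic = [1.0, 0.0, 0.6, 0.1, 0.2, 0.2, 0.2]

for name, probs in [("inelastic", inelastic), ("elastic", elastic)]:
    print(f"{name}: {controllability(probs):.2f} bits")
```

Replacing the averages with weighted sums would allow a non-uniform prior over combinations, which changes the values but, as noted above, not the general conclusion.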
Of note, even if each combination of cost and goal reaching is defined as a distinct outcome, then information-theoretic controllability is higher for the inelastic (2.81 bits) than for the elastic (2.30 bits) environment.
In sum, for both definitions of controllability, we see that environments can be more elastic yet less controllable. We will amend the manuscript to clarify this distinction between controllability and its elasticity.
Reviewer 3:
A bias in how people infer the amount of control they have over their environment is widely believed to be a key component of several mental illnesses including depression, anxiety, and addiction. Accordingly, this bias has been a major focus in computational models of those disorders. However, all of these models treat control as a unidimensional property, roughly, how strongly outcomes depend on action. This paper proposes, correctly I think, that the intuitive notion of "control" captures multiple dimensions in the relationship between action and outcome. In particular, the authors highlight the degree to which outcome depends on how much *effort* we exert, calling this dimension the "elasticity of control". They additionally propose that this dimension (rather than the more holistic notion of controllability) may be specifically impaired in certain types of psychopathology. This idea thus has the potential to change how we think about mental disorders in a substantial way, and could even help us better understand how healthy people navigate challenging decision-making problems.
Unfortunately, my view is that neither the theoretical nor empirical aspects of the paper really deliver on that promise. In particular, most (perhaps all) of the interesting claims in the paper have weak empirical support.
We appreciate the Reviewer's thoughtful engagement with our research and recognition of the potential significance of distinguishing between different dimensions of control in understanding psychopathology. We believe that all the Reviewer’s comments can be addressed with clarifications or additional analyses, as detailed below.
Starting with theory, the elasticity idea does not truly "extend" the standard control model in the way the authors suggest. The reason is that effort is simply one dimension of action. Thus, the proposed model ultimately grounds out in how strongly our outcomes depend on our actions (as in the standard model). Contrary to the authors' claims, the elasticity of control is still a fixed property of the environment. Consistent with this, the computational model proposed here is a learning model of this fixed environmental property. The idea is still valuable, however, because it identifies a key dimension of action (namely, effort) that is particularly relevant to the notion of perceived control. Expressing the elasticity idea in this way might support a more general theoretical formulation of the idea that could be applied in other contexts. See Huys & Dayan (2009), Zorowitz, Momennejad, & Daw (2018), and Gagne & Dayan (2022) for examples of generalizable formulations of perceived control.
We thank the Reviewer for the suggestion that we formalize our concept of elasticity to resource investment, which we agree is a dimension of action. We first note that we have not argued against the claim that elasticity is a fixed property of the environment. We surmise the Reviewer might have misread our statement that “controllability is not a fixed property of the environment”. The latter statement is motivated by the observation that controllability is often higher for agents that can invest more resources (e.g., a richer person can buy more things). We will clarify this in our revision of the manuscript.
To formalize elasticity, we build on Huys & Dayan’s definition of controllability<sup>1</sup> as the fraction of reward that is controllably achievable, 𝜒 (though using information-theoretic definitions<sup>2,3</sup> would work as well). To the extent that this fraction depends on the amount of resources the agent is able and willing to invest (max 𝐶), this formulation can be probabilistically computed without information about the particular agent involved, specifically, by assuming a certain distribution of agents with different amounts of available resources. This would result in a probability distribution over 𝜒. Elasticity can thus be defined as the amount of information obtained about controllability due to knowing the amount of resources available to the agent: I(𝜒; max 𝐶). We will add this formal definition to the manuscript.
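As a rough illustration of this definition, the sketch below computes I(𝜒; max 𝐶) for the two example environments described in the response to Reviewer 2. It rests on assumptions that are not stated in the text: 𝜒 is taken as the gap between the best and worst achievable probability of reaching the goal, max 𝐶 is uniformly distributed over 0 to 3 tickets, and the names `chi` and `elasticity` are hypothetical.

```python
from collections import Counter
from math import log2


def chi(goal_prob, max_c):
    """Assumed reward-based controllability for an agent limited to max_c tickets."""
    reachable = [p for (_vehicle, c), p in goal_prob.items() if c <= max_c]
    return max(reachable) - min(reachable)


def elasticity(goal_prob, budgets=(0, 1, 2, 3)):
    """I(chi; max C) in bits, with max C uniform over the given budgets.

    chi is a deterministic function of max C here, so H(chi | max C) = 0
    and the mutual information reduces to the entropy H(chi).
    """
    counts = Counter(chi(goal_prob, b) for b in budgets)
    n = len(budgets)
    return -sum((k / n) * log2(k / n) for k in counts.values())


# P(goal) keyed by (vehicle choice, tickets); 0 tickets means walking.
inelastic = {
    ("none", 0): 0.2,
    ("correct", 1): 1.0, ("correct", 2): 1.0, ("correct", 3): 1.0,
    ("wrong", 1): 0.0, ("wrong", 2): 0.0, ("wrong", 3): 0.0,
}
elastic = {
    ("none", 0): 0.2,
    ("correct", 1): 0.2, ("correct", 2): 0.6, ("correct", 3): 1.0,
    ("wrong", 1): 0.2, ("wrong", 2): 0.1, ("wrong", 3): 0.0,
}

print(f"elastic: {elasticity(elastic):.2f} bits")
print(f"inelastic: {elasticity(inelastic):.2f} bits")
```

Under these assumptions, knowing the agent's budget is more informative about 𝜒 in the elastic environment than in the inelastic one, and the inelastic environment still yields a nonzero value because an agent with no tickets has no control.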
Turning to experiment, the authors make two key claims: (1) people infer the elasticity of control, and (2) individual differences in how people make this inference are importantly related to psychopathology. Starting with claim 1, there are three sub-claims here; implicitly, the authors make all three. (1A) People's behavior is sensitive to differences in elasticity, (1B) people actually represent/track something like elasticity, and (1C) people do so naturally as they go about their daily lives. The results clearly support 1A. However, 1B and 1C are not supported. Starting with 1B, the experiment cannot support the claim that people represent or track elasticity because the effort is the only dimension over which participants can engage in any meaningful decision-making (the other dimension, selecting which destination to visit, simply amounts to selecting the location where you were just told the treasure lies). Thus, any adaptive behavior will necessarily come out in a sensitivity to how outcomes depend on effort. More concretely, any model that captures the fact that you are more likely to succeed in two attempts than one will produce the observed behavior. The null models do not make this basic assumption and thus do not provide a useful comparison.
We appreciate the reviewer's critical analysis of our claims regarding elasticity inference, which, as detailed below, has led to an important new analysis that strengthens the study’s conclusions. However, we respectfully disagree with two of the Reviewer’s arguments. First, resource investment was not the only meaningful decision dimension in our task, since participants also needed to choose the correct vehicle to get to the right destination. That this was not trivial is evidenced by our exclusion of over 8% of participants who made incorrect vehicle choices more than 10% of the time. Included participants also occasionally erred in this choice (mean error rate = 3%, range [0-10%]).
Second, the experimental task cannot be solved well by a model that simply tracks how outcomes depend on effort because 20% of the time participants reached the treasure despite failing to board their vehicle of choice. In such cases, reward outcomes and control were decoupled. Participants could identify when this was the case by observing the starting location, which was revealed together with the outcome (since depending on the starting location, the treasure location was automatically reached by walking). To determine whether participants distinguished between control-related and non-control-related reward, we have now fitted a variant of our model to the data that allows learning from each of these kinds of outcomes by means of a different free parameter. The results show that participants learned considerably more from control-related outcomes. They were thus not merely tracking outcomes, but specifically inferred when outcomes can be attributed to control. We will include this new analysis in the revised manuscript.
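The idea of learning from the two kinds of outcomes through separate free parameters can be illustrated with a minimal sketch. This is not the authors' model: the pseudo-count belief, the `update` function, and the weights `w_control` and `w_other` are hypothetical names chosen only to show how outcomes attributable to control can be weighted differently from outcomes reached by walking.

```python
from dataclasses import dataclass


@dataclass
class ControlBelief:
    # Hypothetical pseudo-counts summarizing evidence for and against control.
    successes: float = 1.0
    failures: float = 1.0

    def estimate(self) -> float:
        # Current estimate of how likely reward is under the agent's control.
        return self.successes / (self.successes + self.failures)


def update(belief: ControlBelief, rewarded: bool, control_related: bool,
           w_control: float, w_other: float) -> ControlBelief:
    # Assumption: each outcome adds evidence scaled by a weight that depends
    # on whether the outcome could be attributed to the agent's own action.
    weight = w_control if control_related else w_other
    if rewarded:
        belief.successes += weight
    else:
        belief.failures += weight
    return belief


# A rewarded trip reached by vehicle versus one reached by walking.
by_vehicle = update(ControlBelief(), True, True, w_control=1.0, w_other=0.2)
by_walking = update(ControlBelief(), True, False, w_control=1.0, w_other=0.2)
print(by_vehicle.estimate(), by_walking.estimate())
```

In a fit of this kind, a larger fitted `w_control` than `w_other` would correspond to the reported pattern of learning more from control-related outcomes.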
Controllability inference by itself, however, still does not suffice to explain the observed behavior. This is shown by our ‘controllability’ model, which learns to invest more resources to improve control, yet still fails to capture key features of participants’ behavior, as detailed in the manuscript. This means that explaining participants’ behavior requires a model that not only infers controllability (beyond merely outcome probability) but also assumes a priori that increased effort could enhance control. Building this a priori assumption into the model amounts to embedding within it an understanding of elasticity: the idea that control over the environment may be increased by greater resource investment.
That being said, we acknowledge the value in considering alternative computational formulations of adaptation to elasticity. Thus, in our revision of the manuscript, we will add a discussion concerning possible alternative models.
For 1C, the claim that people infer elasticity outside of the experimental task cannot be supported because the authors explicitly tell people about the two notions of control as part of the training phase: "To reinforce participants' understanding of how elasticity and controllability were manifested in each planet, [participants] were informed of the planet type they had visited after every 15 trips." (line 384).
We thank the reviewer for highlighting this point. We agree that our experimental design does not test whether people infer elasticity spontaneously. Our research question was whether people can distinguish between elastic and inelastic controllability. The results strongly support that they can, and this does have potential implications for behavior outside of the experimental task. Specifically, to the extent that people are aware that in some contexts additional resource investment improves control, whereas in other contexts it does not, our results indicate that they would be able to distinguish between these two kinds of contexts through trial-and-error learning. That said, we agree that investigating whether and how people spontaneously infer elasticity is an interesting direction for future work. We will clarify the scope of the present conclusions in the revised manuscript.
Finally, I turn to claim 2, that individual differences in how people infer elasticity are importantly related to psychopathology. There is much to say about the decision to treat psychopathology as a unidimensional construct. However, I will keep it concrete and simply note that CCA (by design) obscures the relationship between any two variables. Thus, as suggestive as Figure 6B is, we cannot conclude that there is a strong relationship between Sense of Agency and the elasticity bias---this result is consistent with any possible relationship (even a negative one). The fact that the direct relationship between these two variables is not shown or reported leads me to infer that they do not have a significant or strong relationship in the data.
We agree that CCA is not designed to reveal the relationship between any two variables. However, the advantage of this analysis is that it pulls together information from multiple variables. Doing so does not treat psychopathology as unidimensional. Rather, it seeks a particular dimension that most strongly correlates with different aspects of task performance. This is especially useful for multidimensional psychopathology data because such data are often dominated by strong correlations between dimensions, whereas the research seeks to explain the distinctions between the dimensions. Similar considerations hold for the multidimensional task parameters, which although less correlated, may still jointly predict the relevant psychopathological profile better than each parameter does in isolation. Thus, the CCA enabled us to identify a general relationship between task performance and psychopathology that accounts for different symptom measures and aspects of controllability inference.
Using CCA can thus reveal relationships that do not readily show up in two-variable analyses. Indeed, the direct correlation between Sense of Agency (SOA) and elasticity bias was not significant, a result that, for completeness, we will now report in the supplementary materials along with all other direct correlations. We note, however, that the CCA was preregistered and its results were replicated. Furthermore, an auxiliary analysis specifically confirmed the contributions of both elasticity bias (Figure 6D, bottom plot) and, although not reported in the original paper, of the Sense of Agency score (SOA; p=.03, permutation test) to the observed canonical correlation. Participants scoring higher on the psychopathology profile also overinvested resources in inelastic environments but did not futilely invest in uncontrollable environments (Figure 6A), providing external validation to the conclusion that the CCA captured meaningful variance specific to elasticity inference. The results thus enable us to safely conclude that differences in elasticity inferences are significantly associated with a profile of control-related psychopathology to which SOA contributed significantly.
Finally, whereas interpretation of individual CCA loadings that were not specifically tested remains speculative, we note that the pattern of loadings largely replicated across the initial and replication studies (see Figure 6B), and aligns with prior findings. For instance, the positive loadings of SOA and OCD match prior suggestions that a lower sense of control leads to greater compensatory effort(7), whereas the negative loading for depression scores matches prior work showing reduced resource investment in depression(5-6).
We will revise the text to better clarify the advantages and disadvantages of our analytical approach, and the conclusions that can and cannot be drawn from it.
There is also a feature of the task that limits our ability to draw strong conclusions about individual differences in elasticity inference. As the authors clearly acknowledge, the task was designed "to be especially sensitive to overestimation of elasticity" (line 287). A straightforward consequence of this is that the resulting *empirical* estimate of estimation bias (i.e., the gamma_elasticity parameter) is itself biased. This immediately undermines any claim that references the directionality of the elasticity bias (e.g. in the abstract). Concretely, an undirected deficit such as slower learning of elasticity would appear as a directed overestimation bias. When we further consider that elasticity inference is the only meaningful learning/decision-making problem in the task (argued above), the situation becomes much worse. Many general deficits in learning or decision-making would be captured by the elasticity bias parameter. Thus, a conservative interpretation of the results is simply that psychopathology is associated with impaired learning and decision-making.
We apologize for our imprecise statement that the task was ‘especially sensitive to overestimation of elasticity’, which justifiably led to the Reviewer’s concern that slower elasticity learning can be mistaken for elasticity bias. To make sure this was not the case, we made use of the fact that our computational model explicitly separates bias direction (λ) from the rate of learning through two distinct parameters, which set the mean and concentration of the model’s initial beliefs concerning elasticity (see Methods pg. 22). The higher the concentration of the initial beliefs (𝜖), the slower the learning. Parameter recovery tests confirmed that our task enables acceptable recovery of both the bias λ<sub>elasticity</sub> (r=.81) and the concentration 𝝐<sub>elasticity</sub> (r=.59) parameters. Importantly, the level of confusion between the parameters was low (confusion of 0.15 for 𝝐<sub>elasticity</sub>→ λ<sub>elasticity</sub> and 0.04 for λ<sub>elasticity</sub>→ 𝝐<sub>elasticity</sub>). This result confirms that our task enables dissociating elasticity biases from the rate of elasticity learning.
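The separation between a prior mean (bias) and a prior concentration (rate of learning) can be illustrated with a generic Beta prior in its mean/concentration parameterization. This is a sketch under stated assumptions, not the authors' model: the response does not give the functional form of the prior, and `beta_prior` and `posterior_mean` are hypothetical helper names.

```python
def beta_prior(mean: float, concentration: float) -> tuple[float, float]:
    # Assumed parameterization: a = mean * concentration,
    # b = (1 - mean) * concentration, so a + b equals the concentration.
    return mean * concentration, (1.0 - mean) * concentration


def posterior_mean(mean: float, concentration: float,
                   successes: int, failures: int) -> float:
    # Conjugate Beta-Bernoulli update: observed counts are added to a and b.
    a, b = beta_prior(mean, concentration)
    return (a + successes) / (a + b + successes + failures)


# Same optimistic prior mean (0.8), same contradicting evidence (10 failures),
# but different prior concentrations.
weak_prior = posterior_mean(0.8, 2.0, successes=0, failures=10)
strong_prior = posterior_mean(0.8, 20.0, successes=0, failures=10)
print(weak_prior, strong_prior)
```

With identical evidence, the more concentrated prior stays closer to its initial mean. This is the sense in which a higher concentration corresponds to slower learning, while the prior mean alone sets the direction of the bias.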
Moreover, to validate that the minimal level of confusion existing between bias and the rate of learning did not drive our psychopathology results, we re-ran the CCA while separating concentration from bias parameters. The results (Author response image 1) demonstrate that differences in learning rate (𝜖) had virtually no contribution to our CCA results, whereas the contribution of the pure bias (𝜆) was preserved.
We will incorporate these clarifications and additional analysis in our revised manuscript.
Author response image 1.
Showing that a model parameter correlates with the data it was fit to does not provide any new information, and cannot support claims like "a prior assumption that control is likely available was reflected in a futile investment of resources in uncontrollable environments." To make that claim, one must collect independent measures of the assumption and the investment.
We apologize if this and related statements seemed to be describing independent findings. They were merely meant to describe the relationship between model parameters and model-independent measures of task performance. It is inaccurate, though, to say that they provide no new information, since results could have been otherwise. For instance, instead of a higher controllability bias primarily associating with futile investment of resources in uncontrollable environments, it could have been primarily associated with more proper investment of resources in high-controllability environments. Additionally, we believe these analyses are of value to readers who seek to understand the role of different parameters in the model. In our planned revision, we will clarify that the relevant analyses are merely descriptive.
Did participants always make two attempts when purchasing tickets? This seems to violate the intuitive model, in which you would sometimes succeed on the first jump. If so, why was this choice made? Relatedly, it is not clear to me after a close reading how the outcome of each trial was actually determined.
We thank the reviewer for highlighting the need to clarify these aspects of the task in the revised manuscript.
When participants purchased two extra tickets, they attempted both jumps, and were never informed about whether either of them succeeded. Instead, after choosing a vehicle and attempting both jumps, participants were notified of where they arrived. This outcome was determined based on the cumulative probability of either of the two jumps succeeding. Success meant that participants arrived at where their chosen vehicle goes, whereas failure meant they walked to the nearest location (as determined by where they started from).
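The outcome rule described above can be sketched as follows. The sketch assumes that jumps are independent and share a single per-jump success probability, which the response does not state explicitly; the function names, argument names, and destination labels are hypothetical.

```python
import random


def p_any_success(p_single: float, n_attempts: int) -> float:
    # Probability that at least one of n independent attempts succeeds.
    return 1.0 - (1.0 - p_single) ** n_attempts


def trial_outcome(p_single: float, n_attempts: int, vehicle_destination: str,
                  walking_destination: str, rng: random.Random) -> str:
    # One draw against the cumulative probability decides the trial, so the
    # participant never learns which individual jump succeeded.
    boarded = rng.random() < p_any_success(p_single, n_attempts)
    return vehicle_destination if boarded else walking_destination


rng = random.Random(0)
print(p_any_success(0.5, 2))
print(trial_outcome(0.5, 2, "vehicle stop", "nearest location", rng))
```

Resolving the trial with a single draw mirrors the design in which only the final arrival location, and not the result of each jump, is revealed.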
Though it is unintuitive to attempt a second jump before seeing whether the first succeeded, this design choice served two key objectives. First, it ensured that participants would consistently need to invest not only more money but also more effort and time in planets with high elastic controllability. Second, it allowed the task to potentially generalize to the many real-world situations where the amount of invested effort has to be determined prior to seeing any outcome, for instance, preparing for an exam or a job interview.
It should be noted that the model is heuristically defined and does not reflect Bayesian updating. In particular, it overestimates control by not using losses with less than 3 tickets (intuitively, the inference here depends on your beliefs about elasticity). I wonder if the forced three-ticket trials in the task might be historically related to this modeling choice.
We apologize for not making this clear, but in fact losing with fewer than 3 tickets does reduce the model’s estimate of available control. It does so by increasing the elasticity estimates (a<sub>elastic≥1</sub>, a<sub>elastic2</sub> parameters), signifying that more tickets are needed to obtain the maximum available level of control, thereby reducing the average controllability estimate across ticket investment options.
It would be interesting to further develop the model such that losing with less than 3 tickets would also impact inferences concerning the maximum available control, depending on present beliefs concerning elasticity, but the forced three-ticket purchases already expose participants to the maximum available control, and thus, the present data may not be best suited to test such a model. These trials were implemented to minimize individual differences concerning inferences of maximum available control, thereby focusing differences on elasticity inferences. We will discuss the Reviewer’s suggestion for a potentially more accurate model in the revised manuscript.
References
(1) Huys, Q. J. M., & Dayan, P. (2009). A Bayesian formulation of behavioral control. Cognition, 113(3), 314–328.
(2) Ligneul, R. (2021). Prediction or causation? Towards a redefinition of task controllability. Trends in Cognitive Sciences, 25(6), 431–433.
(3) Mistry, P., & Liljeholm, M. (2016). Instrumental divergence and the value of control. Scientific Reports, 6, 36295.
(4) Lin, J. (1991). Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37(1), 145–151.
(5) Cohen RM, Weingartner H, Smallberg SA, Pickar D, Murphy DL. Effort and cognition in depression. Arch Gen Psychiatry. 1982 May;39(5):593-7. doi: 10.1001/archpsyc.1982.04290050061012. PMID: 7092490.
(6) Bi R, Dong W, Zheng Z, Li S, Zhang D. Altered motivation of effortful decision-making for self and others in subthreshold depression. Depress Anxiety. 2022 Aug;39(8-9):633-645. doi: 10.1002/da.23267. Epub 2022 Jun 3. PMID: 35657301; PMCID: PMC9543190.
(7) Tapal, A., Oren, E., Dar, R., & Eitam, B. (2017). The Sense of Agency Scale: A measure of consciously perceived control over one's mind, body, and the immediate environment. Frontiers in Psychology, 8, 1552.
-
-
www.medrxiv.org www.medrxiv.org
-
Author response:
Reviewer #1 (Public review):
Summary:
This study identified three independent components of glucose dynamics-"value," "variability," and "autocorrelation", and reported important findings indicating that they play an important role in predicting coronary plaque vulnerability. Although the generalizability of the results needs further investigation due to the limited sample size and validation cohort limitations, this study makes several notable contributions: validation of autocorrelation as a new clinical indicator, theoretical support through mathematical modeling, and development of a web application for practical implementation. These contributions are likely to attract broad interest from researchers in both diabetology and cardiology and may suggest the potential for a new approach to glucose monitoring that goes beyond conventional glycemic control indicators in clinical practice.
Strengths:
The most notable strength of this study is the identification of three independent elements in glycemic dynamics: value, variability, and autocorrelation. In particular, the metric of autocorrelation, which has not been captured by conventional glycemic control indices, may bring a new perspective for understanding glycemic dynamics. In terms of methodological aspects, the study uses an analytical approach combining various statistical methods such as factor analysis, LASSO, and PLS regression, and enhances the reliability of results through theoretical validation using mathematical models and validation in other cohorts. In addition, the practical aspect of the research results, such as the development of a Web application, is also an important contribution to clinical implementation.
We appreciate reviewer #1 for the positive assessment and for the valuable and constructive comments on our manuscript.
Weaknesses:
The most significant weakness of this study is the relatively small sample size of 53 study subjects. This sample size limitation leads to a lack of statistical power, especially in subgroup analyses, and to limitations in the assessment of rare events.
We appreciate the reviewer’s concern regarding the sample size. We acknowledge that a larger sample size would increase statistical power, especially for subgroup analyses and the assessment of rare events.
We would like to clarify several points regarding the statistical power and validation of our findings. Our sample size determination followed established methodological frameworks, including the guidelines outlined by Muyembe Asenahabi, Bostely, and Peters Anselemo Ikoha. “Scientific research sample size determination.” (2023). These guidelines balance the risks of inadequate sample size with the challenges of unnecessarily large samples. For our primary analysis examining the correlation between CGM-derived measures and %NC, power calculations (a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4) indicated that a minimum of 47 participants was required. Our sample size of 53 exceeded this threshold and allowed us to detect statistically significant correlations, as described in the Methods section. Moreover, to provide transparency about the precision of our estimates, we have included confidence intervals for all coefficients.
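For reference, the stated minimum of 47 participants can be reproduced with one common approximation for correlation tests, based on the Fisher z-transformation with a two-sided test. The response does not say which method or software was used, so the formula and the function name `n_for_correlation` below are assumptions made for illustration.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.8) -> int:
    # Fisher z approximation: n = ((z_{1-alpha/2} + z_{power}) / atanh(r))^2 + 3
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


# Type I error 0.05, power 0.8, expected correlation 0.4, as in the response.
print(n_for_correlation(0.4))  # -> 47
```

Under these assumptions the approximation agrees with the threshold quoted above. Smaller expected correlations raise the required sample size steeply, because the transformed effect size enters the formula squared.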
Furthermore, our sample size aligns with previous studies investigating the associations between glucose profiles and clinical parameters, including Torimoto, Keiichi, et al. “Relationship between fluctuations in glucose levels measured by continuous glucose monitoring and vascular endothelial dysfunction in type 2 diabetes mellitus.” Cardiovascular Diabetology 12 (2013): 1-7. (n=57), Hall, Heather, et al. “Glucotypes reveal new patterns of glucose dysregulation.” PLoS biology 16.7 (2018): e2005143. (n=57), and Metwally, Ahmed A., et al. “Prediction of metabolic subphenotypes of type 2 diabetes via continuous glucose monitoring and machine learning.” Nature Biomedical Engineering (2024): 1-18. (n=32).
In addition, the primary objective of our study was not to assess rare events, but rather to demonstrate that glucose dynamics can be decomposed into three main factors (mean, variance, and autocorrelation), whereas traditional measures have primarily captured mean and variance without adequately reflecting autocorrelation. We believe that our current sample size effectively addresses this objective.
Regarding the classification of glucose dynamics components, we have conducted additional validation across diverse populations including 64 Japanese, 53 American, and 100 Chinese individuals. These validation efforts have consistently supported our identification of three independent glucose dynamics components.
However, we acknowledge the importance of further validation on a larger scale. To address this, we conducted a large follow-up study of over 8,000 individuals (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.
To address the sample size considerations, we will add the following sentences in the Discussion section:
Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed to improve the predictive utility and generalizability of our findings.
We appreciate the reviewer’s feedback and believe that these clarifications will strengthen the manuscript.
In terms of validation, several challenges exist, including geographical and ethnic biases in the validation cohorts, lack of long-term follow-up data, and insufficient validation across different clinical settings. In terms of data representativeness, limiting factors include the inclusion of only subjects with well-controlled serum cholesterol and blood pressure and the use of only short-term measurement data.
We appreciate the reviewer’s comment regarding the challenges associated with validation. In terms of geographic and ethnic diversity, our study includes validation cohorts from diverse populations, including 64 Japanese, 53 American and 100 Chinese individuals. These cohorts include a wide range of metabolic states, from healthy individuals to those with diabetes, ensuring validation across different clinical conditions. In addition, we recognize that few public datasets have sample sizes sufficient for factor decomposition and include both healthy individuals and those with type 2 diabetes (Zhao, Qinpei, et al. “Chinese diabetes datasets for data-driven machine learning.” Scientific Data 10.1 (2023): 35.). The main publicly available datasets with relevant clinical characteristics have already been analyzed in this study using unbiased approaches.
However, we fully agree with the reviewer that expanding the geographic and ethnic scope, including long-term follow-up data, and validation in different clinical settings would further strengthen the robustness and generalizability of our findings. To address this, we conducted a large follow-up study of over 8,000 individuals with two years of follow-up (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.
Regarding the validation considerations, we will add the following sentences to the Discussion section:
Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed to improve the predictive utility and generalizability of our findings.
Although our LASSO and factor analysis indicated that CGM-derived measures were strong predictors of %NC, this does not mean that other clinical parameters, such as lipids and blood pressure, are irrelevant in T2DM complications. Our study specifically focused on characterizing glucose dynamics, and we analyzed individuals with well-controlled serum cholesterol and blood pressure to reduce confounding effects. While we anticipate that inclusion of a more diverse population would not alter our primary findings regarding glucose dynamics, it is likely that a broader data set would reveal additional predictive contributions from lipid and blood pressure parameters.
In terms of elucidation of physical mechanisms, the study is not sufficient to elucidate the mechanisms linking autocorrelation and clinical outcomes or to verify them at the cellular or molecular level.
We appreciate the reviewer’s point regarding the need for further elucidation of the physical mechanisms linking glucose autocorrelation to clinical outcomes. We fully agree with the reviewer that the detailed molecular and cellular mechanisms underlying this relationship are not yet fully understood, as noted in our Discussion section.
However, we would like to emphasize the theoretical basis that supports the clinical relevance of autocorrelation. Our results show that glucose profiles with identical mean and variability can exhibit different autocorrelation patterns, highlighting that conventional measures such as mean or variance alone may not fully capture inter-individual metabolic differences. Incorporating autocorrelation analysis provides a more comprehensive characterization of metabolic states. Consequently, incorporating autocorrelation measures alongside traditional diabetes diagnostic criteria - such as fasting glucose, HbA1c and PG120, which primarily reflect only the “mean” component - can improve predictive accuracy for various clinical outcomes. While further research at the cellular and molecular level is needed to fully validate these findings, it is important to note that the primary goal of this study was to analyze the characteristics of glucose dynamics and gain new insights into metabolism, rather than to perform molecular biology experiments.
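The point that profiles with identical mean and variability can differ in autocorrelation can be seen in a toy example. The two series below are invented for illustration and are not study data; they contain the same values in a different order, so their mean and variance agree while their lag-1 autocorrelation differs. `lag1_autocorr` is a hypothetical helper, not one of the CGM-derived indices used in the study.

```python
from statistics import fmean, pvariance


def lag1_autocorr(x: list[float]) -> float:
    # Sample lag-1 autocorrelation: covariance of neighbouring points
    # divided by the total sum of squared deviations.
    m = fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


# Hypothetical glucose-like values: a slow drift versus rapid swings.
smooth = [90, 100, 110, 120, 130, 140, 150, 160]
jagged = [90, 160, 100, 150, 110, 140, 120, 130]

for name, series in (("smooth", smooth), ("jagged", jagged)):
    print(name, fmean(series), pvariance(series), lag1_autocorr(series))
```

Because reordering leaves the mean and variance untouched, any index built only from those two quantities cannot tell the two series apart, whereas the autocorrelation is positive for the slow drift and negative for the rapid swings.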
Furthermore, our previous research has shown that glucose autocorrelation reflects changes in insulin clearance (Sugimoto, Hikaru, et al. “Improved Detection of Decreased Glucose Handling Capacities via Novel Continuous Glucose Monitoring-Derived Indices: AC_Mean and AC_Var.” medRxiv (2023): 2023-09.). The relationship between insulin clearance and cardiovascular disease has been well documented (Randrianarisoa, Elko, et al. “Reduced insulin clearance is linked to subclinical atherosclerosis in individuals at risk for type 2 diabetes mellitus.” Scientific reports 10.1 (2020): 22453.), and the mechanisms described in this prior work may potentially explain the association between glucose autocorrelation and clinical outcomes observed in the present study.
Rather than a limitation, we view these currently unexplored associations as an opportunity for further research. The identification of autocorrelation as a key glycemic feature introduces a new dimension to metabolic regulation that could serve as the basis for future investigations exploring the molecular mechanisms underlying these patterns.
While we agree that further research at the cellular and molecular level is needed to fully validate these findings, we believe that our study provides a strong theoretical framework supporting the clinical utility of autocorrelation analysis in glucose monitoring. This framework could serve as the basis for future investigations of the molecular mechanisms underlying these autocorrelation patterns, which adds to the broad interest of this study. Regarding the physical mechanisms linking autocorrelation and clinical outcomes, we will add the following sentences in the Discussion section:
This study also provided evidence that autocorrelation can vary independently from the mean and variance components using simulated data. In addition, simulated glucose dynamics indicated that even individuals with high AC_Var did not necessarily have high maximum and minimum blood glucose levels. This study also indicated that these three components qualitatively corresponded to the four distinct glucose patterns observed after glucose administration, which were identified in a previous study (Hulman et al., 2018). Thus, the inclusion of autocorrelation in addition to mean and variance may improve the characterization of inter-individual differences in glucose regulation and improve the predictive accuracy of various clinical outcomes.
Despite increasing evidence linking glycemic variability to oxidative stress and endothelial dysfunction in T2DM complications (Ceriello et al., 2008; Monnier et al., 2008), the biological mechanisms underlying the independent predictive value of autocorrelation remain to be elucidated. Our previous work has shown that glucose autocorrelation is influenced by insulin clearance (Sugimoto et al., 2023), a process known to be associated with cardiovascular disease risk (Randrianarisoa et al., 2020). Therefore, the molecular pathways linking glucose autocorrelation to cardiovascular disease may share common mechanisms with those linking insulin clearance to cardiovascular disease. Although previous studies have primarily focused on investigating the molecular mechanisms associated with mean glucose levels and glycemic variability, our findings open new avenues for exploring the molecular basis of glucose autocorrelation, potentially revealing novel therapeutic targets for preventing diabetic complications.
Reviewer #2 (Public review):
Sugimoto et al. explore the relationship between glucose dynamics - specifically value, variability, and autocorrelation - and coronary plaque vulnerability in patients with varying glucose tolerance levels. The study identifies three independent predictive factors for %NC and emphasizes the use of continuous glucose monitoring (CGM)-derived indices for coronary artery disease (CAD) risk assessment. By employing robust statistical methods and validating findings across datasets from Japan, America, and China, the authors highlight the limitations of conventional markers while proposing CGM as a novel approach for risk prediction. The study has the potential to reshape CAD risk assessment by emphasizing CGM-derived indices, aligning well with personalized medicine trends.
Strengths:
(1) The introduction of autocorrelation as a predictive factor for plaque vulnerability adds a novel dimension to glucose dynamic analysis.
(2) Inclusion of datasets from diverse regions enhances generalizability.
(3) The use of a well-characterized cohort with controlled cholesterol and blood pressure levels strengthens the findings.
(4) The focus on CGM-derived indices aligns with personalized medicine trends, showcasing the potential for CAD risk stratification.
We appreciate reviewer #2 for the positive assessment and for the valuable and constructive comments on our manuscript.
Weaknesses:
(1) The link between autocorrelation and plaque vulnerability remains speculative without a proposed biological explanation.
We appreciate the reviewer’s point about the need for a clearer biological explanation linking glucose autocorrelation to plaque vulnerability. We fully agree with the reviewer that the detailed biological mechanisms underlying this relationship are not yet fully understood, as noted in our Discussion section.
However, we would like to emphasize the theoretical basis that supports the clinical relevance of autocorrelation. Our results show that glucose profiles with identical mean and variability can exhibit different autocorrelation patterns, highlighting that conventional measures such as mean or variance alone may not fully capture inter-individual metabolic differences. Incorporating autocorrelation analysis provides a more comprehensive characterization of metabolic states. Consequently, incorporating autocorrelation measures alongside traditional diabetes diagnostic criteria - such as fasting glucose, HbA1c and PG120, which primarily reflect only the “mean” component - can improve predictive accuracy for various clinical outcomes.
Furthermore, our previous research has shown that glucose autocorrelation reflects changes in insulin clearance (Sugimoto, Hikaru, et al. “Improved Detection of Decreased Glucose Handling Capacities via Novel Continuous Glucose Monitoring-Derived Indices: AC_Mean and AC_Var.” medRxiv (2023): 2023-09.). The relationship between insulin clearance and cardiovascular disease has been well documented (Randrianarisoa, Elko, et al. “Reduced insulin clearance is linked to subclinical atherosclerosis in individuals at risk for type 2 diabetes mellitus.” Scientific reports 10.1 (2020): 22453.), and the mechanisms described in this prior work may potentially explain the association between glucose autocorrelation and clinical outcomes observed in the present study.
Rather than a limitation, we view these currently unexplored associations as an opportunity for further research. The identification of autocorrelation as a key glycemic feature introduces a new dimension to metabolic regulation that could serve as the basis for future investigations exploring the molecular mechanisms underlying these patterns.
While we agree that further research at the cellular and molecular level is needed to fully validate these findings, we believe that our study provides a strong theoretical framework to support the clinical utility of autocorrelation analysis in glucose monitoring, which adds to the broad interest of this study. Regarding the mechanisms linking autocorrelation and clinical outcomes, we will add the following sentences in the Discussion section:
This study also provided evidence that autocorrelation can vary independently from the mean and variance components using simulated data. In addition, simulated glucose dynamics indicated that even individuals with high AC_Var did not necessarily have high maximum and minimum blood glucose levels. This study also indicated that these three components qualitatively corresponded to the four distinct glucose patterns observed after glucose administration, which were identified in a previous study (Hulman et al., 2018). Thus, the inclusion of autocorrelation in addition to mean and variance may improve the characterization of inter-individual differences in glucose regulation and improve the predictive accuracy of various clinical outcomes.
Despite increasing evidence linking glycemic variability to oxidative stress and endothelial dysfunction in T2DM complications (Ceriello et al., 2008; Monnier et al., 2008), the biological mechanisms underlying the independent predictive value of autocorrelation remain to be elucidated. Our previous work has shown that glucose autocorrelation is influenced by insulin clearance (Sugimoto et al., 2023), a process known to be associated with cardiovascular disease risk (Randrianarisoa et al., 2020). Therefore, the molecular pathways linking glucose autocorrelation to cardiovascular disease may share common mechanisms with those linking insulin clearance to cardiovascular disease. Although previous studies have primarily focused on investigating the molecular mechanisms associated with mean glucose levels and glycemic variability, our findings open new avenues for exploring the molecular basis of glucose autocorrelation, potentially revealing novel therapeutic targets for preventing diabetic complications.
(2) The relatively small sample size (n=270) limits statistical power, especially when stratified by glucose tolerance levels.
We appreciate the reviewer’s concern regarding sample size and its potential impact on statistical power, especially when stratified by glucose tolerance level. We fully agree that a larger sample size would increase statistical power, especially for subgroup analyses.
We would like to clarify several points regarding the statistical power and validation of our findings. Our sample size determination followed established methodological frameworks, including the guidelines outlined by Muyembe Asenahabi, Bostely, and Peters Anselemo Ikoha. “Scientific research sample size determination.” (2023). These guidelines balance the risks of inadequate sample size with the challenges of unnecessarily large samples. For our primary analysis examining the correlation between CGM-derived measures and %NC, power calculations (a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4) indicated that a minimum of 47 participants was required. Our sample size of 53 exceeded this threshold and allowed us to detect statistically significant correlations, as described in the Methods section. Moreover, to provide transparency about the precision of our estimates, we have included confidence intervals for all coefficients.
Furthermore, our sample size aligns with previous studies investigating the associations between glucose profiles and clinical parameters, including Torimoto, Keiichi, et al. “Relationship between fluctuations in glucose levels measured by continuous glucose monitoring and vascular endothelial dysfunction in type 2 diabetes mellitus.” Cardiovascular Diabetology 12 (2013): 1-7. (n=57), Hall, Heather, et al. “Glucotypes reveal new patterns of glucose dysregulation.” PLoS biology 16.7 (2018): e2005143. (n=57), and Metwally, Ahmed A., et al. “Prediction of metabolic subphenotypes of type 2 diabetes via continuous glucose monitoring and machine learning.” Nature Biomedical Engineering (2024): 1-18. (n=32).
Regarding the classification of glucose dynamics components, we have conducted additional validation across diverse populations including 64 Japanese, 53 American, and 100 Chinese individuals. These validation efforts have consistently supported our identification of three independent glucose dynamics components.
However, we acknowledge the importance of further validation on a larger scale. To address this, we conducted a large follow-up study of over 8,000 individuals with two years of follow-up (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.
To address the sample size considerations, we will add the following sentences in the Discussion section:
Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed to improve the predictive utility and generalizability of our findings.
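The power calculation quoted above can be sketched as follows. This is a minimal illustration that assumes a two-sided test based on the Fisher z-transformation of the correlation coefficient; the response does not state which formula or software was used, and the function name `required_n_for_correlation` is hypothetical.

```python
from math import atanh, ceil
from statistics import NormalDist


def required_n_for_correlation(r, alpha=0.05, power=0.8):
    """Approximate sample size needed to detect a correlation r.

    Assumption: two-sided test using the Fisher z-transformation,
    n = ((z_{1-alpha/2} + z_{power}) / atanh(r))^2 + 3.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for type I error
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(((z_alpha + z_beta) / atanh(r)) ** 2 + 3)


# Settings quoted in the response: alpha = 0.05, power = 0.8, expected r = 0.4
n_required = required_n_for_correlation(0.4)
print(n_required)
```

Under these assumptions the result agrees with the minimum of 47 participants stated in the response, and smaller expected correlations would require considerably more participants.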
(3) Strict participant selection criteria may reduce applicability to broader populations.
We appreciate the reviewer’s comment regarding the potential impact of strict participant selection criteria on the broader applicability of our findings. We acknowledge that extending validation to more diverse populations would improve the generalizability of our findings.
Our study includes validation cohorts from diverse populations, including 64 Japanese, 53 American and 100 Chinese individuals. These cohorts include a wide range of metabolic states, from healthy individuals to those with diabetes, ensuring validation across different clinical conditions. However, we acknowledge that further validation in additional populations and clinical settings would strengthen our conclusions. To address this, we conducted a large follow-up study of over 8,000 individuals (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.
We will add the following text to the Discussion section to address these considerations:
Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed to improve the predictive utility and generalizability of our findings.
Although our LASSO and factor analysis indicated that CGM-derived measures were strong predictors of %NC, this does not mean that other clinical parameters, such as lipids and blood pressure, are irrelevant in T2DM complications. Our study specifically focused on characterizing glucose dynamics, and we analyzed individuals with well-controlled serum cholesterol and blood pressure to reduce confounding effects. While we anticipate that inclusion of a more diverse population would not alter our primary findings regarding glucose dynamics, it is likely that a broader dataset would reveal additional predictive contributions from lipid and blood pressure parameters.
(4) CGM-derived indices like AC_Var and ADRR may be too complex for routine clinical use without simplified models or guidelines.
We appreciate the reviewer’s concern about the complexity of CGM-derived indices such as AC_Var and ADRR for routine clinical use. We acknowledge that for these indices to be of practical use, they must be both interpretable and easily accessible to healthcare providers.
To address this concern, we have developed an easy-to-use web application that automatically calculates these measures, including AC_Var, mean glucose levels, and glucose variability. This tool eliminates the need for manual calculations, making these indices more practical for clinical implementation.
Regarding interpretability, we acknowledge that establishing specific clinical guidelines would enhance the practical utility of these measures. For example, defining a cut-off value for AC_Var above which the risk of diabetes complications increases significantly would provide clearer clinical guidance. However, given our current sample size limitations and our predefined objective of investigating correlations among indices, we have taken a conservative approach by focusing on the correlation between AC_Var and %NC rather than establishing definitive cutoffs. This approach intentionally avoids problematic statistical practices like p-hacking. It is not realistic to expect a single study to accomplish everything from proposing a new concept to conducting large-scale clinical trials to establishing clinical guidelines. Establishing clinical guidelines typically requires the accumulation of multiple studies over many years. Recognizing this reality, we have been careful in our manuscript to make modest claims about the discovery of new “correlations” rather than exaggerated claims about immediate routine clinical use.
To address this limitation, we conducted a large follow-up study of over 8,000 individuals (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which proposed clinically relevant cutoffs and reference ranges for AC_Var and other CGM-derived indices. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, by integrating automated calculation tools with clear clinical thresholds, we expect to make these measures more accessible for clinical use.
We will add the following text to the Discussion section to address these considerations:
While CGM-derived indices such as AC_Var and ADRR hold promise for CAD risk assessment, their complexity may present challenges for routine clinical implementation. To improve usability, we have developed a web-based calculator that automates these calculations. However, the definition of clinically relevant thresholds and reference ranges requires further validation in larger cohorts.
(5) The study does not compare CGM-derived indices to existing advanced CAD risk models, limiting the ability to assess their true predictive superiority.
We appreciate the reviewer’s comment regarding the comparison of CGM-derived indices with existing CAD risk models. Given that our study population consisted of individuals with well-controlled total cholesterol and blood pressure levels, a direct comparison with the Framingham Risk Score for Hard Coronary Heart Disease (Wilson, Peter WF, et al. “Prediction of coronary heart disease using risk factor categories.” Circulation 97.18 (1998): 1837-1847.) may introduce inherent bias, as these factors are key components of the score.
Nevertheless, to further assess the predictive value of the CGM-derived indices, we performed additional analyses using linear regression to predict %NC. Using the Framingham Risk Score, we obtained an R² of 0.04 and an Akaike Information Criterion (AIC) of 330. In contrast, our proposed model incorporating the three glycemic parameters - CGM_Mean, CGM_Std, and AC_Var - achieved a substantially higher R² of 0.36 and a lower AIC of 321, indicating superior predictive accuracy.
We will add the following text to the Results section:
The regression model including CGM_Mean, CGM_Std and AC_Var to predict %NC achieved an R² of 0.36 and an Akaike Information Criterion (AIC) of 321. Each of these indices showed statistically significant independent positive correlations with %NC. In contrast, the model using conventional glycemic markers (FBG, HbA1c, and PG120) yielded an R² of only 0.05 and an AIC of 340. Similarly, the model using the Framingham Risk Score for Hard Coronary Heart Disease (Wilson et al., 1998) showed limited predictive value, with an R² of 0.04 and an AIC of 330.
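To make the model comparison above concrete, the following sketch shows how R² and a Gaussian AIC can be computed for two competing linear models. It is illustrative only: the data are hypothetical, each model has a single predictor, and the AIC is computed as n·ln(RSS/n) + 2k with additive constants dropped, so its absolute values are not comparable to the figures reported in the response. The function name `r_squared_and_aic` is an assumption.

```python
from math import log
from statistics import fmean


def r_squared_and_aic(x, y):
    """Fit y = a + b*x by ordinary least squares; return (R^2, AIC)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - my) ** 2 for yi in y)
    n, k = len(y), 2  # k counts the intercept and the slope
    return 1 - rss / tss, n * log(rss / n) + 2 * k


# Hypothetical outcome and two hypothetical candidate predictors
outcome = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]
informative = [1, 2, 3, 4, 5, 6, 7, 8]
weak = [3, 1, 4, 1, 5, 9, 2, 6]

r2_good, aic_good = r_squared_and_aic(informative, outcome)
r2_weak, aic_weak = r_squared_and_aic(weak, outcome)
print(r2_good > r2_weak, aic_good < aic_weak)
```

A model that explains more of the outcome has both a higher R² and a lower AIC, which is the pattern the response reports for the CGM-derived model relative to the Framingham Risk Score.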
(6) Varying CGM sampling intervals (5-minute vs. 15-minute) were not thoroughly analyzed for impact on results.
We appreciate the reviewer’s comment regarding the potential impact of different CGM sampling intervals on our results. To assess the robustness of our findings across different sampling frequencies, we performed a downsampling analysis by converting our 5-minute interval data to 15-minute intervals. The AC_Var value calculated from 15-minute intervals was significantly correlated with that calculated from 5-minute intervals (R = 0.99, 95% CI: 0.97-1.00). Furthermore, the regression model using CGM_Mean, CGM_Std, and AC_Var from 15-minute intervals to predict %NC achieved an R² of 0.36 and an AIC of 321, identical to the model using 5-minute intervals. These findings indicate that our results are robust to variations in CGM sampling frequency.
We will add this analysis to the Results section:
The AC_Var value calculated from 15-minute intervals was significantly correlated with that calculated from 5-minute intervals (R = 0.99, 95% CI: 0.97-1.00). Consequently, the regression model including CGM_Mean, CGM_Std and AC_Var from 15-minute intervals to predict %NC achieved an R² of 0.36 and an AIC of 321.
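The downsampling step described above can be sketched as follows. This is an assumption-laden illustration: the glucose trace is synthetic, downsampling is done by keeping every third reading, and a plain lag autocorrelation stands in for AC_Var, whose exact definition is given in the authors' Methods and is not reproduced here.

```python
import math
from statistics import fmean


def downsample(series, factor=3):
    """Keep every `factor`-th reading (5-minute -> 15-minute when factor=3)."""
    return series[::factor]


def lag_autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    m = fmean(series)
    denominator = sum((v - m) ** 2 for v in series)
    numerator = sum(
        (series[t] - m) * (series[t + lag] - m) for t in range(len(series) - lag)
    )
    return numerator / denominator


# Synthetic one-day trace at 5-minute intervals (288 readings, hypothetical values)
five_min = [100 + 20 * math.sin(2 * math.pi * 4 * t / 288) for t in range(288)]
fifteen_min = downsample(five_min)

# Compare the autocorrelation at the same 15-minute time lag in both series
ac_5min = lag_autocorrelation(five_min, lag=3)
ac_15min = lag_autocorrelation(fifteen_min, lag=1)
print(len(fifteen_min), abs(ac_5min - ac_15min) < 0.05)
```

For a smooth trace, the autocorrelation at a fixed time lag changes little after downsampling, which is consistent with the robustness reported in the response; real CGM data with sensor noise may behave differently.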
Reviewer #3 (Public review):
Summary:
This is a retrospective analysis of 53 individuals over 26 features (12 clinical phenotypes, 12 CGM features, and 2 autocorrelation features) to examine which features were most informative in predicting percent necrotic core (%NC) as a parameter for coronary plaque vulnerability. Multiple regression analysis demonstrated a better ability to predict %NC from 3 selected CGM-derived features than 3 selected clinical phenotypes. LASSO regularization and partial least squares (PLS) with VIP scores were used to identify 4 CGM features that most contribute to the precision of %NC. Using factor analysis they identify 3 components that have CGM-related features: value (relating to the value of blood glucose), variability (relating to glucose variability), and autocorrelation (composed of the two autocorrelation features). These three groupings appeared in the 3 validation cohorts and when performing hierarchical clustering. To demonstrate how these three features change, a simulation was created to allow the user to examine these features under different conditions.
We appreciate reviewer #3 for the valuable and constructive comments on our manuscript.
Review:
The goal of this study was to identify CGM features that relate to %NC. Through multiple feature selection methods, they arrive at 3 components: value, variability, and autocorrelation. While the feature list is highly correlated, the authors take steps to ensure feature selection is robust. It is unclear what each component (value, variability, and autocorrelation) includes: while similar CGM indices fall within each component, some indices appear as relevant to value in one dataset and to variability in the validation.
We appreciate the reviewer’s comment regarding the classification of CGM-derived measures into the three components: value, variability, and autocorrelation. As the reviewer correctly points out, some measures may load differently between the value and variability components in different datasets. However, we believe that this variability reflects the inherent mathematical properties of these measures rather than a limitation of our study.
For example, the HBGI clusters differently across datasets due to its dependence on the number of glucose readings above a threshold. In populations where mean glucose levels are predominantly below this threshold, the HBGI is more sensitive to glucose variability (Fig. S7A). Conversely, in populations with a wider range of mean glucose levels, HBGI correlates more strongly with mean glucose levels (Fig. 3A). This context-dependent behavior is expected given the mathematical properties of these measures and does not indicate an inconsistency in our classification approach.
Importantly, our main findings remain robust: CGM-derived measures systematically fall into three components: value, variability, and autocorrelation. Traditional CGM-derived measures primarily reflect either value or variability, and this categorization is consistently observed across datasets. While specific indices such as HBGI may shift classification depending on population characteristics, the overall structure of CGM data remains stable.
To address these considerations, we will add the following text to the Discussion section:
Some indices, such as HBGI, showed variation in classification across datasets, with some populations showing higher factor loadings in the “value” component and others in the “variability” component. This variation occurs because HBGI calculations depend on the number of glucose readings above a threshold. In populations where mean glucose levels are predominantly below this threshold, the HBGI is more sensitive to glucose variability (Fig. S7A). Conversely, in populations with a wider range of mean glucose levels, the HBGI correlates more strongly with mean glucose levels (Fig. 3A). Despite these differences, our validation analyses confirm that CGM-derived indices consistently cluster into three components: value, variability, and autocorrelation.
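The threshold behaviour of HBGI described above can be illustrated with a short sketch. It assumes the commonly published Kovatchev risk transformation for glucose in mg/dL, f(BG) = 1.509 × ((ln BG)^1.084 − 5.381), which crosses zero near 112.5 mg/dL; the authors' own implementation may differ, and the glucose values below are hypothetical.

```python
from math import log
from statistics import fmean


def hbgi(glucose_mg_dl):
    """High Blood Glucose Index: mean high-side risk over all readings."""
    risks = []
    for bg in glucose_mg_dl:
        f = 1.509 * (log(bg) ** 1.084 - 5.381)  # symmetrized glucose scale
        risks.append(10 * f * f if f > 0 else 0.0)  # only readings above the zero point count
    return fmean(risks)


# Hypothetical profiles: one entirely below the zero point, one with excursions above it
all_low = [90, 100, 110]
with_excursions = [90, 100, 110, 160, 180]
higher_mean = [150, 160, 170, 180, 190]

print(hbgi(all_low), hbgi(with_excursions) < hbgi(higher_mean))
```

When every reading lies below the zero point, HBGI is zero and only excursions move it, so it tracks variability; when most readings lie above the zero point, it rises with the overall glucose level, which matches the context dependence the authors describe.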
We are sceptical about statements of significance without documentation of p-values.
We appreciate the reviewer’s concern regarding statistical significance and the documentation of p values.
First, given the multiple comparisons in our study, we used q values rather than p values, as shown in Figure S1. Q values provide a more rigorous statistical framework for controlling the false discovery rate in multiple testing scenarios, thereby reducing the likelihood of false positives.
Second, our statistical reporting follows established guidelines, including those of the New England Journal of Medicine (Harrington, David, et al. “New guidelines for statistical reporting in the journal.” New England Journal of Medicine 381.3 (2019): 285-286.), which recommend that “reporting of exploratory end points should be limited to point estimates of effects with 95% confidence intervals” and advise authors to “replace p values with estimates of effects or association and 95% confidence intervals”. According to these guidelines, p values should not be reported in this type of study. We determined significance based on whether these 95% confidence intervals excluded zero - a statistical method for determining whether an association is significantly different from zero (Tan, Sze Huey, and Say Beng Tan. “The correct interpretation of confidence intervals.” Proceedings of Singapore Healthcare 19.3 (2010): 276-278.).
For the sake of transparency, we provide p values for readers who may be interested, although we emphasize that they should not be the basis for interpretation, as discussed in the referenced guidelines. Specifically, in Figure 1, the p values for CGM_Mean, CGM_Std, and AC_Var were 0.02, 0.02, and <0.01, respectively, while those for FBG, HbA1c, and PG120 were 0.83, 0.91, and 0.25, respectively. In Figure 3C, the p values for factors 1–5 were 0.03, 0.03, 0.03, 0.24, and 0.87, respectively, and in Figure S10B, the p values for factors 1–3 were <0.01, <0.01, and 0.20, respectively.
We appreciate the opportunity to clarify our statistical methodology and are happy to provide additional details if needed.
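For readers unfamiliar with q values, the sketch below shows one common way to obtain them from a set of p values. It assumes the Benjamini-Hochberg step-up adjustment; the response does not state which false discovery rate procedure was used, and the p values in the example are hypothetical rather than taken from the study.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of the q values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p values from four tests
example_q = bh_q_values([0.01, 0.04, 0.03, 0.5])
print(example_q)
```

Each q value is at least as large as its p value, so declaring significance at q < 0.05 is more conservative than using unadjusted p values when many comparisons are made.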
While hesitations remain, the ability of these authors to find groupings of these many CGM metrics in relation to %NC is of interest. The believability of the associations is impeded by an obtuse presentation of the results with core data (i.e. correlation plots between CGM metrics and %NC) buried in the supplement while main figures contain plots of numerical estimates from models which would be more usefully presented in supplementary tables.
We appreciate the reviewer’s comment regarding the presentation of our results and recognize the importance of ensuring clarity and accessibility of the core data.
The central finding of our study is twofold: first, that the numerous CGM-derived measures can be systematically classified into three distinct components (mean, variance, and autocorrelation), and second, that each of these components is independently associated with %NC. This insight cannot be derived simply from examining scatter plots of individual correlations, which are provided in the Supplementary Figures. Instead, it emerges from our statistical analyses in the main figures, including multiple regression models that reveal the independent contributions of these components to %NC.
However, we acknowledge the reviewer’s concern regarding the accessibility of key data. To improve clarity, we will move several scatter plots from the Supplementary Figures to the main figures to allow readers to more directly visualize the relationships between CGM-derived measures and %NC. We believe this revision will improve the transparency and readability of our results while maintaining the rigor of our analytical approach.
Given the small sample size in the primary analysis, there is a lot of modeling done with parameters estimated where simpler measures would serve and be more convincing as they require less data manipulation. A major example of this is that the pairwise correlation/covariance between CGM_mean, CGM_std, and AC_var is not shown and would be much more compelling in the claim that these are independent factors.
We appreciate the reviewer’s feedback on our statistical analysis and data presentation. The correlations between CGM_Mean, CGM_Std, and AC_Var are documented in Figure S1B. However, to improve accessibility and clarity, we will move these correlation analyses to the main figures. Regarding our modeling approach, we chose LASSO and PLS methods because they are well-established techniques that are particularly suited to scenarios with many input variables and a relatively small sample size. These methods have been extensively validated in the literature as robust approaches for variable selection under such conditions (Tibshirani R. 1996. Regression shrinkage and selection via the lasso. J R Stat Soc 58:267–288. Wold S, Sjöström M, Eriksson L. 2001. PLS-regression: a basic tool of chemometrics. Chemometrics Intellig Lab Syst 58:109–130. Pei X, Qi D, Liu J, Si H, Huang S, Zou S, Lu D, Li Z. 2023. Screening marker genes of type 2 diabetes mellitus in mouse lacrimal gland by LASSO regression. Sci Rep 13:6862. Wang C, Kong H, Guan Y, Yang J, Gu J, Yang S, Xu G. 2005. Plasma phospholipid metabolic profiling and biomarkers of type 2 diabetes mellitus based on high-performance liquid chromatography/electrospray mass spectrometry and multivariate statistical analysis. Anal Chem 77:4108–4116.).
Lack of methodological detail is another challenge. For example, the time period of CGM metrics or CGM placement in the primary study in relation to the IVUS-derived measurements of coronary plaques is unclear. Are they temporally distant or proximal/ concurrent with the PCI?
We appreciate the reviewer’s important question regarding the temporal relationship between CGM measurements and IVUS-derived plaque assessments. As described in our previous work (Otowa‐Suematsu, Natsu, et al. “Comparison of the relationship between multiple parameters of glycemic variability and coronary plaque vulnerability assessed by virtual histology–intravascular ultrasound.” Journal of Diabetes Investigation 9.3 (2018): 610-615.), all individuals underwent continuous glucose monitoring for at least three consecutive days within the seven-day period prior to the PCI procedure. To improve clarity for readers, we will include this methodological detail in the revised manuscript.
A patient undergoing PCI for coronary intervention would be expected to have physiological and iatrogenic glycemic disturbances that do not reflect their baseline state. This is not considered or discussed.
We appreciate the reviewer’s concern regarding potential glycemic disturbances associated with PCI. As described in our previous work (Otowa‐Suematsu, Natsu, et al. “Comparison of the relationship between multiple parameters of glycemic variability and coronary plaque vulnerability assessed by virtual histology–intravascular ultrasound.” Journal of Diabetes Investigation 9.3 (2018): 610-615.), all CGM measurements were performed before the PCI procedure. This temporal separation ensures that the glycemic patterns analyzed in our study reflect the baseline metabolic state of the patients, rather than any physiological or iatrogenic effects of PCI. To avoid any misunderstanding, we will clarify this temporal relationship in the revised manuscript.
The attempts at validation in external cohorts, Japanese, American, and Chinese are very poorly detailed. We could only find even an attempt to examine cardiovascular parameters in the Chinese data set but the outcome variables are unspecified with regard to what macrovascular events are included, their temporal relation to the CGM metrics, etc. Notably macrovascular event diagnoses are very different from the coronary plaque necrosis quantification. This could be a source of strength in the findings if carefully investigated and detailed but due to the lack of detail seems like an apples-to-oranges comparison.
We appreciate the reviewer’s comment regarding the validation cohorts and the need for greater clarity, particularly in the Chinese dataset. We acknowledge that our initial description lacked sufficient methodological detail, and we will expand the Methods section to provide a more comprehensive explanation.
For the Chinese dataset, the data collection protocol was previously documented (Zhao, Qinpei, et al. “Chinese diabetes datasets for data-driven machine learning.” Scientific Data 10.1 (2023): 35.). Briefly, trained research staff used standardized questionnaires to collect demographic and clinical information, including diabetes diagnosis, treatment history, comorbidities, and medication use. Physical examinations included anthropometric measurements, and body mass index was calculated using standard protocols. CGM monitoring was performed using the FreeStyle Libre H device (Abbott Diabetes Care, UK), which records interstitial glucose levels at 15-minute intervals for up to 14 days. Laboratory measurements, including metabolic panels, lipid profiles, and renal function tests, were obtained within six months of CGM placement. While previous studies have linked necrotic core to macrovascular events (Xie, Yong, et al. “Clinical outcome of nonculprit plaque ruptures in patients with acute coronary syndrome in the PROSPECT study.” JACC: Cardiovascular Imaging 7.4 (2014): 397-405.), we acknowledge the limitations of the cardiovascular outcomes in the Chinese dataset. These outcomes were extracted from medical records rather than standardized diagnostic procedures or imaging studies. To address these concerns, we will expand the Discussion section to clarify the differences in outcome definitions and methodological approaches between the datasets.
Finally, the simulations at the end are not relevant to the main claims of the paper and we would recommend removing them for the coherence of this manuscript.
We appreciate the reviewer’s feedback regarding the relevance of the simulation component of our manuscript. The primary contribution of our study goes beyond demonstrating correlations between CGM-derived measures and %NC; it highlights three fundamental components of glycemic patterns (mean, variability, and autocorrelation) and their independent relationships with coronary plaque characteristics.
The simulations are included to illustrate how glycemic patterns with identical means and variability can have different autocorrelation structures. Because temporal autocorrelation can be conceptually difficult to interpret, these visualizations were intended to provide intuitive examples for the readers.
However, we recognize the reviewer’s concern about the coherence of the manuscript. In response, we will streamline the simulation section by removing technical simulations that do not directly support our primary conclusions, while retaining only those that enhance understanding of the three glycemic components.
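The point that glycemic patterns with identical mean and variability can differ in autocorrelation can be shown with a small sketch. It is not the authors' simulation: the trace is a synthetic sine wave, the comparison series is simply a random reordering of the same readings, and lag-1 autocorrelation is used as a simple stand-in for the autocorrelation indices in the manuscript.

```python
import math
import random
from statistics import fmean, pvariance


def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag."""
    m = fmean(series)
    denominator = sum((v - m) ** 2 for v in series)
    numerator = sum(
        (series[t] - m) * (series[t + lag] - m) for t in range(len(series) - lag)
    )
    return numerator / denominator


# Smooth synthetic trace: 288 readings (one day at 5-minute intervals)
smooth = [100 + 20 * math.sin(2 * math.pi * 4 * t / 288) for t in range(288)]

# Same readings in random order: identical mean and variance by construction
shuffled = smooth[:]
random.Random(0).shuffle(shuffled)

print(fmean(smooth), fmean(shuffled))
print(pvariance(smooth), pvariance(shuffled))
print(lag_autocorrelation(smooth), lag_autocorrelation(shuffled))
```

Because the two series contain exactly the same values, any index based only on the distribution of readings cannot tell them apart, while the autocorrelation is close to 1 for the smooth trace and close to 0 for the reordered one.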
-
- Mar 2025
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this study, Bu et al examined the dynamics of TRPV4 channel in cell overcrowding in carcinoma conditions. They investigated how cell crowding (or high cell confluence) triggers a mechano-transduction pathway involving TRPV4 channels in high-grade ductal carcinoma in situ (DCIS) cells that leads to large cell volume reduction (or cell volume plasticity) and proinvasive phenotype.
In vitro, this pathway is highly selective for highly malignant invasive cell lines derived from a normal breast epithelial cell line (MCF10CA) compared to the parent cell line, but not present in another triple-negative invasive breast epithelial cell line (MDA-MB-231). The authors convincingly showed that enhanced TRPV4 plasma membrane localization correlates with high-grade DCIS cells in patient tissue samples. Specifically, in invasive MCF10DCIS.com cells they showed that overcrowding or over-confluence leads to a decrease in cell volume and intracellular calcium levels. This condition also triggers the trafficking of TRPV4 channels from intracellular stores (nucleus and potentially endosomes) to the plasma membrane (PM). When these over-confluent cells are incubated with a TRPV4 activator, there is an acute and substantial influx of calcium, attesting to the fact that a high number of TRPV4 channels are present on the PM. Long-term incubation of these over-confluent cells with the TRPV4 activator results in the internalization of the PM-localized TRPV4 channels.
In contrast, cells plated at lower confluence primarily have TRPV4 channels localized in the nucleus and cytosol. Long-term incubation of these cells at lower confluence with a TRPV4 inhibitor leads to the relocation of TRPV4 channels to the plasma membrane from intracellular stores and a subsequent reduction in cell volume. Similarly, incubation of these cells at low confluence with PEG 3000 (a hyperosmotic agent) promotes the trafficking of TRPV4 channels from intracellular stores to the plasma membrane.
Strengths:
The study is elegantly designed and the findings are novel. Their findings on this mechanotransduction pathway involving TRPV4 channels, calcium homeostasis, cell volume plasticity, motility and invasiveness will have a great impact in the cancer field and are potentially applicable to other fields as well. Experiments are well-planned and executed, and the data are convincing. The authors investigated TRPV4 dynamics using multiple different strategies (overcrowding, hyperosmotic stress, pharmacological and genetic means) and showed a good correlation between different phenomena.
All of my previous concerns have been addressed. The quality of the manuscript has improved significantly.
We are deeply grateful to the reviewer for their thoughtful assessment and invaluable suggestions, including crucial additional experiments and more effective presentation and description of our findings, which have greatly enhanced the quality of our manuscript.
Reviewer #2 (Public review):
Summary:
Metastasis poses a significant challenge in cancer treatment. During the transition from non-invasive cells to invasive metastatic cells, cancer cells usually experience mechanical stress due to a crowded cellular environment. The molecular mechanisms underlying mechanical signaling during this transition remain largely elusive. In this work, the authors utilize an in vitro cell culture system and advanced imaging techniques to investigate how non-invasive and invasive cells respond to cell crowding, respectively.
The results clearly show that pre-malignant cells exhibit a more pronounced reduction in cell volume and are more prone to spreading compared to non-invasive cells. Furthermore, the study identifies that TRPV4, a calcium channel, relocates to the plasma membrane both in vitro and in vivo (patient samples). Activation and inhibition of the TRPV4 channel can modulate cell volume and cell mobility. These results unveil a novel mechanism of mechanical sensing in cancer cells, potentially offering new avenues for therapeutic intervention targeting cancer metastasis by modulating TRPV4 activity. This is a very comprehensive study, and the data presented in the paper are clear and convincing. The study represents a very important advance in our understanding of the mechanical biology of cancer.
We sincerely appreciate the reviewer’s insightful evaluation and invaluable recommendations for key additional experiments, which have significantly strengthened our manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The study by Jena et al. addresses important questions on the fundamental mechanisms of genetic adaptation, specifically, does adaptation proceed via changes of copy number (gene duplication and amplification "GDA") or by point mutation. While this question has been worked on (for example by Tomanek and Guet) the authors add several important aspects relating to resistance against antibiotics and they clarify the ability of Lon protease to reduce duplication formation (previous work was more indirect).
A key finding Jena et al. present is that point mutations after significant competition displace GDA. A second one is that alternative GDA constantly arise and displace each other (see work on GDA-2 in Figure 3). Finally, the authors found epistasis between resistance alleles that was contingent on lon. Together this shows an intricate interplay of lon proteolysis for the evolution and maintenance of antibiotic resistance by gene duplication.
Strengths:
The study has several important strengths: (i) the work on GDA stability and competition of GDA with point mutations is a very promising area of research and the authors contribute new aspects to it, (ii) rigorous experimentation, (iii) very clearly written introduction and discussion sections. To me, the best part of the data is that deletion of lon stimulates GDA, which has not been shown with such clarity until now.
Weaknesses:
The minor weaknesses of the manuscript are a lack of clarity in parts of the results section (Point 1) and the methods (Point 2).
We thank the reviewer for their comments and suggestions on our manuscript. We also appreciate the succinct summary of primary findings that the Reviewer has taken cognisance of in their assessment, in particular the association of the Lon protease with the propensity for GDAs as well as its impact on their eventual fate. We have now revised the manuscript for greater clarity as suggested by Reviewer #1.
Reviewer #2 (Public review):
Summary:
In this strong study, the authors provide robust evidence for the role of proteostasis genes in the evolution of antimicrobial resistance, and moreover, for stabilizing the proteome in light of gene duplication events.
Strengths:
This strong study offers an important interaction between findings involving GDA, proteostasis, experimental evolution, protein evolution, and antimicrobial resistance. Overall, I found the study to be relatively well-grounded in each of these literatures, with experiments that spoke to potential concerns from each arena. For example, the literature on proteostasis and evolution is a growing one that includes organisms (even micro-organisms) of various sorts. One of my initial concerns involved whether the authors properly tested the mechanistic bases for the role of Lon in promoting duplication events. The authors assuaged my concern with a set of assays (Figure 8).
More broadly, the study does a nice job of demonstrating the agility of molecular evolution, with responsible explanations for the findings: gene duplications are a quick-fix, but can be out-competed relative to their mutational counterparts. Without Lon protease to keep the proteome stable, the cell allows for less stable solutions to the problem of antibiotic resistance.
The study does what any bold and ambitious study should: it contains large claims and uses multiple sorts of evidence to test those claims.
Weaknesses:
While the general argument and conclusion are clear, this paper is written for a bacterial genetics audience that is familiar with the manner of bacterial experimental evolution. From the language to the visuals, the paper is written in a boutique fashion. The figures are even difficult for me - someone very familiar with proteostasis - to understand. I don't know if this is the fault of the authors or the modern culture of publishing (where figures are increasingly packed with information and hard to decipher), but I found the figures hard to follow with the captions. But let me also consider that the problem might be mine, and so I do not want to unfairly criticize the authors.
For a generalist journal, more could be done to make this study clear, and in particular, to connect to the greater community of proteostasis researchers. I think this study needs a schematic diagram that outlines exactly what was accomplished here, at the beginning. Diagrams like this are especially important for studies like this one that offer a clear and direct set of findings, but conduct many different sorts of tests to get there. I recommend developing a visual abstract that would orient the readers to the work that has been done.
The reviewer’s comments regarding data presentation are well-taken. Since we already had a diagrammatic model that sums up the chief findings of our study (Figure 9), we have now provided schematics in Figures 1, 3, 5 and 8 to clarify the workflow of smaller sections of the study. We hope that these diagrams provide greater clarity with regards to the experiments we have conducted.
Next, I will make some more specific suggestions. In general, this study is well done and rigorous, but doesn't adequately address a growing literature that examines how proteostasis machinery influences molecular evolution in bacteria.
While this paper might properly test the authors' claims about protein quality control and evolution, the paper does not engage a growing literature in this arena and is generally not very strong on the use of evolutionary theory. I recognize that this is not the aim of the paper, however, and I do not question the authors' authority on the topic. My thoughts here are less about the invocation of theory in evolution (which can be verbose and not relevant), and more about engagement with a growing literature in this very area.
The authors mention Rodrigues 2016, but there are many other studies that should be engaged when discussing the interaction between protein quality control and evolution.
A 2015 study demonstrated how proteostasis machinery can act as a barrier to the usage of novel genes: Bershtein, S., Serohijos, A. W., Bhattacharyya, S., Manhart, M., Choi, J. M., Mu, W., ... & Shakhnovich, E. I. (2015). Protein homeostasis imposes a barrier to functional integration of horizontally transferred genes in bacteria. PLoS genetics, 11(10), e1005612
A 2019 study examined how Lon deletion influenced resistance mutations in DHFR specifically: Guerrero RF, Scarpino SV, Rodrigues JV, Hartl DL, Ogbunugafor CB. The proteostasis environment shapes higher-order epistasis operating on antibiotic resistance. Genetics. 2019 Jun 1;212(2):565-75.
A 2020 study did something similar: Thompson, Samuel, et al. "Altered expression of a quality control protease in E. coli reshapes the in vivo mutational landscape of a model enzyme." Elife 9 (2020): e53476.
And there's a new review (preprint) on this very topic that speaks directly to the various ways proteostasis shapes molecular evolution:
Arenas, Carolina Diaz, Maristella Alvarez, Robert H. Wilson, Eugene I. Shakhnovich, and C. Brandon Ogbunugafor. "Proteostasis is a master modulator of molecular evolution in bacteria."
I am not simply attempting to list studies that should be cited, but rather, this study needs to be better situated in the contemporary discussion on how protein quality control is shaping evolution. This study adds to this list and is a unique and important contribution. However, the findings can be better summarized within the context of the current state of the field. This should be relatively easy to implement.
We thank the reviewer for their encouraging assessment of our manuscript as well as this important critique regarding the context of other published work that relates proteostasis and molecular evolution. Indeed, this was a particularly difficult aspect for us given the different kinds of literature that were needed to make sense of our study. We have now added the references suggested by the reviewer, as well as others, to the manuscript. We have also added a paragraph in the discussion section (Lines 463-476) that addresses this aspect and hopefully fills the lacuna that the reviewer points out in this comment.
Reviewer #3 (Public review):
Summary:
This paper investigates the relationship between the proteolytic stability of an antibiotic target enzyme and the evolution of antibiotic resistance via increased gene copy number. The target of the antibiotic trimethoprim is dihydrofolate reductase (DHFR). In Escherichia coli, DHFR is encoded by folA and the major proteolysis housekeeping protease is Lon (lon). In this manuscript, the authors report the results of the experimental evolution of a lon mutant strain of E. coli in response to sub-inhibitory concentrations of the antibiotic trimethoprim and then investigate the relationship between proteolytic stability of DHFR mutants and the evolution of folA gene duplication. After 25 generations of serial passaging in a fixed concentration of trimethoprim, the authors found that folA duplication events were more common during the evolution of the lon strain, than the wt strain. However, with continued passaging, some folA duplications were replaced by a single copy of folA containing a trimethoprim resistance-conferring point mutation. Interestingly, the evolution of the lon strain in the setting of increasing concentrations of trimethoprim resulted in evolved strains with different levels of DHFR expression. In particular, some strains maintained two copies of a mutant folA that encoded an unstable DHFR. In a lon+ background, this mutant folA did not express well and did not confer trimethoprim resistance. However, in the lon- background, it displayed higher expression and conferred high-level trimethoprim resistance. The authors concluded that maintenance of the gene duplication event (and the absence of Lon) compensated for the proteolytic instability of this mutant DHFR. In summary, they provide evidence that the proteolytic stability of an antibiotic target protein is an important determinant of the evolution of target gene copy number in the setting of antibiotic selection.
Strengths:
The major strength of this paper is identifying an example of antibiotic resistance evolution that illustrates the interplay between the proteolytic stability and copy number of an antibiotic target in the setting of antibiotic selection. If the weaknesses are addressed, then this paper will be of interest to microbiologists who study the evolution of antibiotic resistance.
Weaknesses:
Although the proposed mechanism is highly plausible and consistent with the data presented, the analysis of the experiments supporting the claim is incomplete and requires more rigor and reproducibility. The impact of this finding is somewhat limited given that it is a single example that occurred in a lon strain and compensatory mutations for evolved antibiotic resistance mechanisms are described. In this case, it is not clear that there is a functional difference between the evolution of copy number versus any other mechanism that meets a requirement for increased "expression demand" (e.g. promoter mutations that increase expression and protein stabilizing mutations).
We thank the reviewer for their in-depth assessment of our work and appreciate their concerns regarding reproducibility and rigor in analysis of our data. We have now incorporated this feedback and provided necessary clarifications/corrections in the revised version of our manuscript.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Major Points:
(1) The authors show that a deletion of lon increases the ability for GDA and they argue that this is adaptive during TMP treatment because it increases the dosage of folA (L. 129). However, the highest frequency of GDA occurred in drug-free conditions (see Figure 1C). This indicates that GDA is selected in drug-free media and potentially selected against by certain antibiotics. It would help for the authors to discuss this possibility more clearly.
We thank the reviewer for this astute observation. It is indeed striking that the GDA mutation (i.e. the GDA-2 mutation) selected in a lon-deficient background does not come up in the presence of antibiotics. To probe this further, we have now measured the relative fitness of a representative lon-knockout population from short-term evolution in drug-free LB (population #3) that harbours GDA-2 against its ancestor (marked with ΔlacZ). These competition experiments were performed in LB (in which GDA-2 emerged spontaneously), as well as in LB supplemented with antibiotics at the concentrations used during the short-term evolution.
Values of relative fitness, w (mean ± SD from 3 measurements), are provided below:
LB: 1.4 ± 0.2
LB + Trimethoprim: 1.6 ± 0.2
LB + Spectinomycin: 0.9 ± 0.2
LB + Erythromycin: 1.3 ± 0.3
LB + Nalidixic acid: 1.5 ± 0.2
LB + Rifampicin: 1.4 ± 0.2
These data show an increase in relative fitness in drug-free LB, as would be expected. Interestingly, we also observe an increase in relative fitness in LB supplemented with antibiotics, except spectinomycin. This result supports the idea that GDA-2 is a “media adaptation” and provides a general fitness advantage to the lon knockout. However, as the reviewer pointed out, we should expect to see GDA-2 emerge spontaneously in antibiotic-supplemented media as well. We think that this does not happen because the fitness advantage of drug-specific mutations (GDAs or point mutations) far exceeds the advantage of a media-adaptation GDA. As a result, we only see the specific mutations that provide high benefit against the antibiotic, at least over the relatively short duration of 20-25 generations. It is noteworthy that the GDA-2 mutation does come up in LTMPR1 when it is passaged over >200 generations in drug-free media, but shows fluctuating frequency over time. We expect, therefore, that given enough time we may detect the GDA-2 mutations even in antibiotic-supplemented media.
We note, however, that a major caveat in the above fitness calculations is that we cannot be sure that the competing ancestor has no GDA-2 mutations during the course of the experiment. Thus, the above fitness values are only indicative and not definitive. We have therefore not included these data in the revised manuscript.
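For readers unfamiliar with competition assays, the following is a minimal sketch of one common way to compute relative fitness, namely as the ratio of the competitors' Malthusian parameters. The formula actually used for the values above is not stated here, so this convention is an assumption, and the colony counts below are hypothetical numbers chosen only to make the example run.

```python
import math
import statistics

def relative_fitness(evolved_start, evolved_end, ancestor_start, ancestor_end):
    """Relative fitness w as the ratio of Malthusian parameters.

    Each Malthusian parameter is ln(final density / initial density)
    for one competitor over the course of the competition.
    """
    m_evolved = math.log(evolved_end / evolved_start)
    m_ancestor = math.log(ancestor_end / ancestor_start)
    return m_evolved / m_ancestor

# Hypothetical CFU/mL counts (evolved start, evolved end, ancestor start,
# ancestor end) for three replicate competitions; not data from the study.
replicates = [
    (1e6, 4e8, 1e6, 8e7),
    (1e6, 3e8, 1e6, 9e7),
    (1e6, 5e8, 1e6, 7e7),
]

w_values = [relative_fitness(*counts) for counts in replicates]
w_mean = statistics.fmean(w_values)
w_sd = statistics.stdev(w_values)  # sample SD across the three replicates
print(f"w = {w_mean:.2f} ± {w_sd:.2f}")
```

Under this convention, w > 1 means the evolved population outgrew the marked ancestor, and w = 1 means the two grew equally. If the ancestor acquires the same mutation during the assay, as the caveat above notes, the estimate is biased towards 1.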
(2) It is unclear if the isolates WTMPR1 - 5 and LTMPR1 - 5 were pure clones. The authors write in L.488 "Colonies were randomly picked, cultured overnight in drug-free LB and frozen in 50% glycerol at -80C until further use." And in L. 492 "For long-term evolution, trimethoprim-resistant isolates LTMPR1, WTMPR4 and WTMPR5 were first revived from frozen stocks in drug-free LB overnight." From these descriptions, it is possible that the isolates contained a fraction of cells of other genotypes since colonies are often formed by more than one cell and thus, unless pure-streaked, a subpopulation is present and would in drug-free media be maintained. The possibility of pre-existing subpopulations is important for all statements relating to "reversal".
This is indeed a valid concern. As far as we can tell, all our initial isolates (i.e. WTMPR1-5 and LTMPR1-5) are pure clones, at least as far as SNPs are concerned. This is based on whole genome sequencing data that we have reported earlier in Patel and Matange, eLife (2021), where we described the evolution and isolation of WTMPR1-5, and in the present study for LTMPR1-5. All SNPs detected were present at a frequency of 100%. For clones with GDAs, however, there is no way to eliminate a sub-population that has a lower or higher gene copy number than average from an isolate. This is because of the inherent instability of GDAs, which will inevitably result in heterogeneous gene copy number during standard growth. In this sense, there is most certainly a possibility of a pre-existing subpopulation within each of the clones that may have reversed the GDA. Indeed, we believe that it is this inherent instability that contributes to their rapid loss during growth in drug-free media.
Minor Points:
(1) L. 406. "allowing accumulation of IS transposases in E. coli" Please specify that it is the accumulation of transposase proteins (and not genes).
We have made this change.
(2) L. 221 typo. Known "to" stabilize.
We have made this change.
Reviewer #2 (Recommendations for the authors):
Most of my suggestions are found in the public review. I believe this to be a strong study, and some slight fixes can solidify its presence in the literature.
We have attempted to address the two main critiques by Reviewer 2. To simplify the understanding of our data, we have provided small schematics at various points in the paper to clarify the experimental pipelines used by us. We have also provided additional discussion situating our study in the emerging area of proteostasis and molecular evolution. We hope that our revisions have addressed these lacunae in our manuscript.
Reviewer #3 (Recommendations for the authors):
Major Points:
(1) The manuscript is generally a bit difficult to follow. The writing is overly complicated and lacks clarity at times. It should be simplified and improved.
We have made several revisions to the text, as well as provided schematics in some of our figures which hopefully make our paper easier to understand.
(2) I cannot find the raw variant summary data for the lon strain evolution experiment in trimethoprim (after 25 generations). Were there any other mutations identified? If not, this should be explicitly stated in the text and the variant output summary from sequencing included as supplemental data.
We apologise for this oversight. We have now provided these data as Table 1.
(3) What is the trimethoprim IC50 of the starting (pre-evolution) strains (i.e. wt and lon)? I can't find this information, but it is critical to interpretation.
We had reported these values earlier in Matange N., J Bact (2020). Wild type and lon-knockout have similar MIC values for trimethoprim, though the lon mutant shows a higher IC50 value. We have now mentioned this in the results section (Lines 100-101) and also provided the reference for these data.
(4) What was the average depth of coverage for WGS? This information is necessary to assess the quality of the variant calling, especially for the population WGS.
All genome sequencing data have a coverage of at least 100x. We have added this detail to the methods section (Lines 580-581).
(5) Five replicate evolution experiments (25 generations, or 7x 10% daily batch transfers) were performed in trimethoprim for the wt and lon strains. Duplication of the folA locus occurred in 1/5 and 4/5 experiments, respectively. It is not entirely clear what type of sampling was actually done to arrive at these numbers (this needs to be stated more clearly), but presumably 1 random colony was chosen at the end of the passaging protocol for each replicate. Based on this result, the authors conclude that folA duplication occurred more frequently in the lon strain, however, this is not rigorously supported by a statistical evaluation. With N=5, one cannot rigorously conclude that a 20% frequency and 80% frequency are significantly different. Furthermore, it's not entirely clear what the mechanism of resistance is for these strains. For example, in one colony sequenced (LTMPR5), it appears no known resistance mechanism (or mutations?) were identified, and yet the IC50 = 900 nM, which is also similar to other strains.
Indeed, we agree with the reviewer that we don’t have the statistical power to rigorously make this claim. However, since the lon-knockout showed a greater frequency of GDA across 3 different environments, we are fairly confident that loss of lon enhances the overall frequency of GDA mutations. This idea is also supported by a number of previous papers that related GDAs and IS-element transpositions with Lon, viz. Nicoloff et al, Antimicrob Agent Chemother (2007), Derbyshire et al. PNAS (1990), Derbyshire and Grindley, Mol Microbiol (1996). We have therefore not provided further justification in the revised manuscript.
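The statistical-power issue raised for the 1/5 versus 4/5 comparison can be made concrete with a two-sided Fisher's exact test. The sketch below is an illustrative calculation, not an analysis reported in the manuscript. The choice of test and the function name are assumptions, and only the counts come from the reviewer's comment.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Rows: wild type, lon-knockout. Columns: folA duplication, no duplication.
p_value = fisher_exact_two_sided(1, 4, 4, 1)
print(f"p = {p_value:.3f}")  # 52/252, approximately 0.206
```

With five replicates per strain, a split of 1/5 versus 4/5 gives p ≈ 0.21, so this comparison alone would not reach conventional significance. That matches the reviewer's point and the acknowledgement above.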
We had indeed sampled a random isolate from each of the 5 populations and have added a schematic to Figure 1 that provides greater clarity.
Having relooked at the sequencing data for LTMPR1-5 isolates (Table 1), we realised that both LTMPR4 and LTMPR5 harbour mutations in the pitA gene. We had missed this locus during the previous iteration of this manuscript and misidentified an mgrB mutation in LTMPR4. pitA codes for a metal-phosphate symporter. We have observed mutations in pitA in earlier evolution experiments with trimethoprim as well (Vinchhi and Yelpure et al. mBio 2023). Interestingly, in LTMPR5 there was a deletion of pitA, along with 17 other contiguous genes, mediated by IS5. To test if loss of pitA is beneficial in trimethoprim, we tested the ability of a pitA knockout to grow on trimethoprim-supplemented plates. Indeed, loss of pitA conferred a growth advantage to E. coli on trimethoprim, comparable to loss of mgrB, indicating that the resistance of LTMPR5 may be due to loss of pitA. We have added these data to Supplementary Figure 1 of the revised manuscript and provided a brief description in Lines 103-108. How pitA deficiency confers trimethoprim resistance is yet to be investigated. The mechanism is likely to involve activation of some intrinsic resistance mechanism, as loss of pitA also conferred a fitness benefit against other antibiotics. This work is currently underway in our lab and hence we do not provide any further mechanism in the present manuscript.
(6) Although measurement error/variance is reported, statistical tests were not performed for any of the experiments. This is critical to support the rigor and reproducibility of the conclusions.
We have added statistical testing wherever appropriate to the revised manuscript.
(7) Lines 150-155 and Figure 2E: Putting a wt copy of mgrB back into the WTMPR4 and LTMPR1 strains would be a better experiment to dissect out the role of mgrB versus the other gene duplications in these strains on fitness. Without this experiment, you cannot confidently attribute the fitness costs of these strains to the inactivation of mgrB alone.
We agree with the reviewer that our claim was based on a correlation alone. We have now added some new data to confirm our model (Figure 2 E, F). The costs of mgrB mutations come from hyperactivation of PhoQP. In earlier work we have shown that the costs (and benefit) of mgrB mutations can be abrogated in media supplemented with Mg²⁺, which turns off the PhoQ receptor (Vinchhi and Yelpure et al. mBio, 2023). We use this strategy to show that like the mgrB-knockout, the costs of WTMPR4, WTMPR5 and LTMPR1 can be almost completely alleviated by adding Mg²⁺ to growth media. These results confirm that the source of fitness cost of TMP-resistant bacteria was not linked to GDA mutations, but to hyperactivation of PhoQP.
(8) Figure 3F and G: Does the top symbol refer to the starting strain for the 'long-term' evolution? If so, why does WTMPR4 not have the mgrB mutation (it does in Figure 1)? Based on your prior findings, it seems odd that this strain would evolve an mgrB loss of function mutation in the absence of trimethoprim exposure.
We thank the reviewer for pointing this error out. We have made the correction in the revised manuscript.
(9) Figure 6A: If the marker is neutral, it should be maintained at 0.1% throughout the 'neutrality' experiment. In both plots, the proportion of some marked strains goes up and then down. This suggests either ongoing evolution (these competitions take place over 105 generations), or noisy data. I suspect these data are just inherently noisy. I don't see error bars in the plots. Were these experiments ever replicated? It seems that replicating the experiments might be able to separate out noise from signal and perhaps clarify this point and better confirm the hypothesis that the point mutants are more fit.
These experiments were indeed noisy and the apparent enrichment is most likely a measurement error rather than a real change in frequency of competing genotypes. We have now provided individual traces for each of the competing pairs with mean and SD from triplicate observations at each time point.
(10) Figure 6A: Please indicate which plotted line refers to which 'point mutant' using different colors. These mutants have different trimethoprim IC50s and doubling times, so it would be nice to be able to connect each mutant to its specific data plot.
We thank the reviewer for this suggestion. We have now colour coded the different strain combinations as suggested.
(11) Lines 284-285: I disagree that the IC50s are similar. The C-35T mutant has IC50 that is 2x that of LTMPR1. Perhaps more telling is that, compared to the folA duplication strain from the same time-point (which also carries the rpoS mutation), all of the point mutants have greater IC50s (~2x greater). 2-fold changes in IC50 are significant. It would seem that the point-mutants were likely not competing against LTMPR1 at the time they arose, so LTMPR1 might not be the best comparator if it was extinguished from the population early. I'm assuming this is why you chose a contemporary isolate (and, also, rpoS mutant) for the competition experiments. This should be explained more clearly.
We thank the reviewer for this comment. Indeed, the reviewer is correct about the rationale behind the use of a contemporary isolate and we have provided this clarification in the revised manuscript (Lines 287-289). Also, the reviewer is correct in pointing out that a two-fold difference in IC50 cannot be ignored. However, the key point here would be in assessing the differences in growth rates at the antibiotic concentration used during competition (i.e. 300 ng/mL). We are unable to see a direct correlation between the growth rates and enrichment in culture, indicating that the observed trends are unlikely to be driven by ‘level of resistance’ alone. We have added these clarifications to the modified manuscript (Lines 299-301).
Minor Points:
(1) Line 13: Add a comma before 'Escherichia'
We have made this change.
(2) Line 14: Consider changing "mutations...were beneficial in trimethoprim" to "mutations...were beneficial under trimethoprim exposure"
We have made this change.
(3) Line 32: Is gene dosage really only "relative to the genome"? Is it not simply its relative copy number generally? Consider changing to "The dosage of a gene, or its relative copy number, can impact its level of expression..."
We have made this change.
(4) Line 38: The idea that GDAs are 1000x more frequent than point mutations seems an overgeneralization.
We agree with the reviewer and have softened our claim.
(5) Line 50: The term "hard-wired" is confusing. Please be more specific.
We have modified this statement to “…GDAs are less stable than point mutations….”.
(6) Line 52-53: What do you mean by "there is also evidence to suggest that...more common in bacteria than appreciated"? Are you implying the field is naïve to this fact? If there is "evidence" of this, then a reference should be included. However, it's not clear why this is important to state in the article. I would consider simply removing this sentence. Less is more in this case.
We have removed this statement.
(7) Lines 59-60: Enzymes catalyze reactions. Please also state the substrates for DHFR. Consider, "It catalyzes the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate, and important co-factor for..."
We have made this change.
(8) Line 72: Please change to, "In E. coli, DHFR is encoded by folA." You do not need to state this is a gene, as it is implicit with lowercase italics.
We have made this change.
(9) Lines 72-86: This paragraph is a bit confusing to read, as it has several different ideas in it. Consider breaking it into two paragraphs at Line 80, "In this study,...". The first paragraph could just review the trimethoprim resistance mechanisms in E. coli and so would change the first sentence (Line 72) to reflect this topic: "In E. coli, DHFR is encoded by folA and several different resistance mechanisms have been characterized." Then, just describe each mechanism in turn. Also, by "hot spots" it would seem you are referring to "point mutations" in the gene that alter the protein sequence and cluster onto the 3D protein structure when mapped? Please be more specific with this sentence for clarity.
We have made these changes.
(10) Lines 92-93: Please also state the MIC value of the strain to specifically define "sub-MIC". Alternatively, you could also state the fraction MIC (e.g. 0.1 x MIC).
We have modified this statement to “…in 300 ng/mL of trimethoprim (corresponding to ~0.3 x MIC) for 25 generations.”
(11) Lines 95-96. Remove, "These sequencing have been reported earlier, ...(2021)". You just need to cite the reference.
We have made this change.
(12) Line 96: Remove the word "gene".
We have made this change.
(13) Figure 1 and Figure 4C: The color scheme is tough for those with the most common type of color blindness. Red/green color deficiency causes a lot of difficulty with Red/gray, red/green, green/gray. Consider changing.
We thank the reviewer for bringing this to our notice. We have modified the colour scheme throughout the manuscript.
(14) Figure 1: Was there a trimethoprim resistance mechanism identified for LTMPR5?
As stated by us in response to major comment #7, LTMPR5’s resistance seems to come from a novel mechanism involving loss of the pitA gene.
(15) Line 349-351: Please briefly define "lower proteolytic stability" as a relative susceptibility to proteolytic degradation and make sure it is clear to the reader that this causes less DHFR. This needs to be clarified because it is confusing how a mutation that causes DHFR proteolytic instability would lead to an increase in trimethoprim IC50. So, you also need to mention that some mutations can cause both increased trimethoprim inhibition and lower proteolytic stability simultaneously. It seems the Trp30Arg mutation is an example of this, as this mutation is associated with a net increase in trimethoprim resistance despite the competing effects of the mutation on enzyme inhibition and DHFR levels.
We thank the reviewer for this comment and agree that the text in the original manuscript did not fully convey the message. We have made modifications to this section (Lines 359-363) in the revised manuscript in agreement with the reviewer’s suggestions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We would like to sincerely thank the editors and reviewers for their thoughtful comments, which provide valuable insights, and will help us enhance the overall quality of our manuscript. We will address all comments comprehensively in our revised submission.
It appears to us that two major concerns were raised by the reviewers and highlighted by the editor, regarding statistical methodology and manuscript readability.
As a provisional response, we would like to summarize our approach for addressing them in our revised manuscript:
(1) Statistical Methodology
Two specific concerns were raised regarding the statistical methods:
First, regarding FDR versus FWE correction in our voxelwise (searchlight) analyses: we recognize that our methods section might have created some confusion on this point. While we stated that "all analyses are FDR-corrected unless noted otherwise", this was meant to refer only to ROI-based analyses. For all voxel-wise analyses, including searchlight RSA analyses, we actually employed FWE correction. This was briefly mentioned in the section on univariate analyses, but we did not emphasize it in the searchlight section of the methods, which we understand may have caused confusion.
To clarify: we used (1) FWE correction for all voxel-based analyses and (2) FDR correction for ROI-based analyses (which could thus be considered exploratory). However, to fully address the concerns raised by the reviewers and to avoid potential confusion for future readers, we will use FWE correction exclusively in the revised version of the manuscript. If some category of ROI-based analysis yields only non-significant results under FWE correction, we plan to report the uncorrected p-values and to highlight the exploratory nature of these results.
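As an illustration of the distinction drawn above, the following minimal Python sketch contrasts a family-wise error (FWE) procedure with a false discovery rate (FDR) procedure on a toy list of p-values. It assumes Bonferroni as the FWE method and Benjamini-Hochberg as the FDR method. The response does not state which specific procedures were used, voxel-wise FWE correction in neuroimaging is often based on other approaches such as random field theory or permutation tests, and the p-values below are invented for illustration only.

```python
def bonferroni(p_values, alpha=0.05):
    """FWE control (assumed method): reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """FDR control (assumed method): reject the k smallest p-values, where k
    is the largest rank with p_(k) <= (k / m) * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Toy p-values, invented for illustration (not data from the study).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
fwe_reject = bonferroni(toy_p)
fdr_reject = benjamini_hochberg(toy_p)
print("Bonferroni rejections:", sum(fwe_reject))
print("Benjamini-Hochberg rejections:", sum(fdr_reject))
```

On these toy values Bonferroni rejects one hypothesis and Benjamini-Hochberg rejects two, which illustrates why FDR-corrected results are generally the more lenient of the two.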
Second, regarding the alpha threshold adjustment for searchlight analyses involving multiple comparisons within the same experimental phase: We acknowledge this concern and will address it thoroughly in our revision.
(2) Manuscript Readability
We agree that readability should be improved despite the paradigm's inherent complexity. In our revision, we will:
- Replace non-essential technical terminology with clearer descriptions
- Improve writing quality in particularly dense or conceptually complex sections
- Enhance the overall structure to better guide readers through our methods and findings
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
This manuscript presents a pipeline incorporating a deep generative model and peptide property predictors for the de novo design of peptide sequences with dual antimicrobial/antiviral functions. The authors synthesized and experimentally validated three peptides designed by the pipeline, demonstrating antimicrobial and antiviral activities, with one leading peptide exhibiting antimicrobial efficacy in animal models. However, the manuscript, as it stands, has several major limitations on the computational side.
Thanks for your comments.
Major issues:
(1) The choice of GAN as the generative model. There are multiple deep generative frameworks (e.g., language models, VAEs, and diffusion models), and GANs are known for their training difficulty and mode collapse. Could the authors elaborate on the specific rationale behind choosing GANs for this task?
We thank the reviewer for raising this concern about GAN models. We agree that GANs have limitations, such as training difficulty, but they have shown clear potential for generating biological sequences, especially AMPs. GANs and VAEs are the two most commonly used generative models in the field of AMP design (Curr Opin Struct Biol 2023, 83:102733). AMPGAN (J Chem Inf Model, 2021, 61, 2198-2207.), Multi-CGAN (J Chem Inf Model 2024, 64, 1, 316–326), PepGAN (ACS Omega, 2020, 5, 22847-22851) and others have demonstrated the applicability of GANs to peptide design. Moreover, PandoraGAN (Sn Comput Sci 2023, 4, 607) is one of the few works on AVP generation, and it is also based on a GAN architecture. A GAN updates the generator weights directly through backpropagation from the discriminator rather than through a complicated, manually defined loss function, which alleviates the reliance on input data. Our current results demonstrate that the trained GAN generator can produce novel sequences with high antimicrobial activity, validated both in silico and in vitro.
(2) The pipeline is supposed to generate peptides showing dual properties. Why were antiviral peptides not used to train the GAN? Would adding antiviral peptides into the training lead to a higher chance of getting antiviral generations?
A major mechanism of antimicrobial peptides is disruption of cell membranes. Accordingly, some antimicrobial peptides are reported to have broad-spectrum antibacterial and antiviral activities, since viruses, especially enveloped viruses, share a membrane structure with bacteria. In the APD3 database, 244 of 3940 AMPs are labeled with antiviral activities. In contrast, most reported antiviral peptides inhibit viruses by binding to specific targets (proteins and nucleic acids) related to viral proliferation, so they may not have antibacterial effects. Therefore, we trained the GAN with the AMP dataset. We chose this AMP dataset mainly for AMPredictor (with detailed logMIC labels against E. coli) and then used the same dataset to train the GAN for simplicity.
In the revised manuscript, we also tested adding available antiviral peptides from AVPdb to train the GAN model. The number of AVPs is 1,788 after removing overlaps with the AMP dataset used. The GAN architecture and hyperparameters remain the same. After generating a batch of sequences with this trained generator, we scored them with AMPredictor and filtered them with five AVP classifiers. As expected, the predicted MIC values shifted to higher performance with 17 sequences < 5 μM and 39 sequences < 10 μM; the previous numbers in the manuscript are 26 and 42. Among the 39 sequences < 10 μM, 13 passed all five AVP classifiers and 17 passed four (33.3% and 43.6%, respectively). The previous ratios are 40.5% and 35.7% (17 and 15 out of 42). The two generators perform roughly the same for generating AVPs (76.9% vs. 76.1%) as evaluated by our rule (four or more positives), but the generator trained solely with AMPs provided more AVPs with higher probability (five positives).
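The pass rates quoted above can be recomputed from the stated counts with a few lines of Python. This is only an arithmetic check: the dictionary keys and the helper name `pct` are labels introduced here for illustration, and the counts are taken directly from the paragraph above.

```python
def pct(part, total):
    """Percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


# (passed five classifiers, passed four classifiers, sequences with predicted MIC < 10 uM)
# Counts as quoted in the response; the key names are illustrative labels.
counts = {
    "trained on AMPs only": (17, 15, 42),
    "trained on AMPs + AVPs": (13, 17, 39),
}

rates = {}
for name, (five, four, total) in counts.items():
    rates[name] = (pct(five, total), pct(four, total), pct(five + four, total))
    print(name, rates[name])
```

With these counts the combined fraction for the AMP-only generator (32 of 42) computes to 76.2%, within rounding of the 76.1% quoted above, and the fraction for the generator trained with AVPs (30 of 39) computes to 76.9%.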
We also experimentally tested dozens of generated peptides from two versions of generators (v1 for training solely on AMPs, v2 for training with AVPs, Figure 2 in revised manuscript). The ‘antiviral’ feature of a peptide was checked when significant inhibition was observed in immunofluorescence assays against HSV-1 at the concentration of 10 µM. Six and seven antiviral peptides were found out of 12 tested peptides from generators v1 and v2, respectively. Therefore, the success rates for two versions of generators are about 60% (including three reported peptides in the original manuscript) and show no significant difference.
(3) For the antimicrobial peptide predictor, where were the contact maps of peptides sourced from?
The contact maps of the AMPs were predicted by ESM and were obtained together with the ESM embeddings (Methods section, Page 24, Line 538: Pretrained language model esm1b_t33_650M_UR50S was used to provide the embeddings and the contact maps.).
(4) Morgan fingerprint can be used to generate amino acid features. Would it be better to concatenate ESM features with amino acid-level fingerprints and use them as node features of GNN?
We thank the reviewer for this suggestion. We tested using ESM and fingerprint (FP) features on the graph nodes, and the results are shown in Author response table 1. AMPredictor (ESM on nodes, FP after GNN) still performed slightly better on four regression metrics than the variant that concatenates FP onto the node features.
Author response table 1.
Results of AMPredictor with fingerprint on nodes
(5) Although the number of labeled antiviral peptides may be limited, the input features (ESM embeddings) should be predictive enough when coupled with shallow neural networks. Have the authors tried simple GNNs on antiviral prediction and compared the prediction performance to those of existing tools?
We thank the reviewer for his/her suggestion on AVP predictions. We have not tried this, mainly because we focused on developing regressors rather than binary classifiers. The currently available AVP data with numerical labels do not support training a reliable regressor, because they are limited in amount and heterogeneous in virus target and experimental assay. Therefore, we decided to use reported AVP classifiers as an additional filter after AMPredictor. Since relying on a single classifier may introduce bias, we used five AVP classifiers as an ensemble vote.
(6) Instead of using global alignment to get match scores, the authors should use local alignment.
We calculated the match scores by global alignment, following the methods of AMPGAN v2 (J Chem Inf Model 2021, 61, 2198−2207), CLaSS (Nat Biomed Eng 2021 5, 613–623), and AMPTrans-lstm (Comput Struct Biotechnol J 2022, 21, 463-471), to check the similarity between the generated sequences and any sequence in the training set. In addition, we also used local alignment to check the novelty of the peptides (see the next question).
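To make the difference between the two alignment modes concrete, here is a minimal Python sketch of global (Needleman-Wunsch) and local (Smith-Waterman) alignment scoring. The scoring scheme (match +1, mismatch -1, linear gap -1) and the two toy strings are assumptions chosen for illustration; they are not the substitution matrix, gap penalties, or sequences used in the manuscript or in the cited tools.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Best pairwise alignment score by dynamic programming.

    local=False: Needleman-Wunsch (global, end gaps are penalised).
    local=True:  Smith-Waterman (local, cell scores are floored at zero).
    The scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for leading gaps in either sequence.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


# Toy strings (not real peptides): the short one is embedded in the long one.
short_seq = "ACDEFG"
long_seq = "WWWACDEFGWWW"
global_score = alignment_score(short_seq, long_seq)
local_score = alignment_score(short_seq, long_seq, local=True)
print("global:", global_score, "local:", local_score)
```

In this toy case the local score reflects the fully shared segment, whereas the global score is pulled down by the gaps needed to span the longer sequence, which is the reason a reviewer might prefer local alignment for detecting shared motifs.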
(7) How novel are the validated peptides? The authors should run a sequence alignment to get the most similar known AMP for each validated peptide, and analyze whether they are similar.
We have listed the AMP segments most similar to our generated peptides from the training set and the DRAMP database (28,233 sequences after filtering out those containing irregular characters). BLAST parameters for short peptides were set as in CLaSS (Nat Biomed Eng 2021 5, 613–623). The lowest E-value of P001 aligned with the training set is 1.2, and no hits were found for P001 with DRAMP. The two E-values of P002 are 1.4 and 0.46. P076 had no hits in the training set and got a high E-value of 7.0 with DRAMP. Detailed alignments are shown below. This result indicates that our three validated AMPs are novel.
Since we generated more sequences using two versions of generator for validation, we also checked the BLAST E-value of these validated peptides. The results are listed in Table S3. All sequences obtained E-values > 0.1 and some of them had no hits when aligned with the training set or the DRAMP database.
Author response image 1.
Alignments of three validated peptides.
(8) Only three peptides were synthesized and experimentally validated. This is too few and unacceptable in this field currently. The standard is to synthesize and characterize several dozens of peptides at the very least to have a robust study.
We thank the reviewer for the suggestion and have used our models to generate >10 times more peptides in the revised manuscript. We have synthesized and tested more peptides in vitro and added these results to the revised manuscript (Figure 2). From the two versions of the generator (trained with or without AVPs), we selected 24 peptides in total for antibacterial and antiviral validation. All 24 peptides showed antibacterial activity towards at least one bacterial strain, and 13 peptides were identified as antiviral through the quick antiviral test. This result indicates the capability of our design method for bifunctional AMPs with a notable success rate (60%).
Reviewer #2 (Public Review):
Summary:
This study marks a noteworthy advance in the targeted design of AMPs, leveraging a pioneering deep-learning framework to generate potent bifunctional peptides with specificity against both bacteria and viruses. The introduction of a GAN for generation and a GCN-based AMPredictor for MIC predictions is methodologically robust and a major stride in computational biology. Experimental validation in vitro and in animal models, notably with the highly potent P076 against a multidrug-resistant bacterium and P002's broad-spectrum viral inhibition, underpins the strength of their evidence. The findings are significant, showcasing not just promising therapeutic candidates, but also demonstrating a replicable means to rapidly develop new antimicrobials against the threat of drug-resistant pathogens.
Strengths:
The de novo AMP design framework combines a generative adversarial network (GAN) with an AMP predictor (AMPredictor), which is a novel approach in the field. The integration of deep generative models and graph-encoding activity regressors for discovering bifunctional AMPs is cutting-edge and addresses the need for new antimicrobial agents against drug-resistant pathogens. The in vitro and in vivo experimental validations of the AMPs provide strong evidence to support the computational predictions. The successful inhibition of a spectrum of pathogens in vitro and in animal models gives credibility to the claims. The discovery of effective peptides, such as P076, which demonstrates potent bactericidal activity against multidrug-resistant A. baumannii with low cytotoxicity, is noteworthy. This could have far-reaching implications for addressing antibiotic resistance. The demonstrated activity of the peptides against both bacterial and viral pathogens suggests that the discovered AMPs have a wide therapeutic potential and could be effective against a range of pathogens.
We thank the reviewer for the comments.
Reviewer #3 (Public Review):
Summary:
Dong et al. described a deep learning-based framework of antimicrobial (AMP) generator and regressor to design and rank de novo antimicrobial peptides (AMPs). For generated AMPs, they predicted their minimum inhibitory concentration (MIC) using a model that combines the Morgan fingerprint, contact map, and ESM language model. For their selected AMPs based on predicted MIC, they also use a combination of antiviral peptide (AVP) prediction models to select AMPs with potential antiviral activity. They experimentally validated 3 candidates for antimicrobial activity against S. aureus, A. baumannii, E. coli, and P. aeruginosa, and their toxicity on mouse blood and three human cell lines. The authors select their most promising AMP (P076) for in vivo experiments in A. baumannii-infected mice. They finally test the antiviral activity of their 3 AMPs against viruses.
Strengths:
-The development of de novo antimicrobial peptides (AMPs) with the novelty of being bifunctional (antimicrobial and antiviral activity).
-Novel, combined approach to AMP activity prediction from their amino acid sequence.
Weaknesses:
(1) I missed justification on why training AMPs without information of their antiviral activity would generate AMPs that could also have antiviral activity with such high frequency (32 out of 104).
Thanks for your inquiry. A major mechanism of antimicrobial peptides is disruption of cell membranes. Accordingly, some antimicrobial peptides are reported to have broad-spectrum antibacterial and antiviral activities, since viruses, especially enveloped viruses, share a membrane structure with bacteria. In the APD3 database, 244 of 3940 AMPs are labeled with antiviral activities. However, several reported antiviral peptides inhibit viruses by binding to specific targets (proteins and nucleic acids) related to viral proliferation, so they may not have antibacterial effects. Therefore, we trained the GAN with the AMP dataset. We chose this AMP dataset mainly for AMPredictor (with detailed logMIC labels against E. coli) and then used the same dataset to train the GAN for simplicity. In addition, the ratio is not 32 antiviral candidates out of 104 but 32 out of the 42 peptides with predicted MIC < 10 µM, because we performed the filtering stepwise.
In the revision, we also tested adding available antiviral peptides from AVPdb to train the GAN model (generator v2). The number of AVPs is 1,788 after removing overlaps with the AMP dataset used. The GAN architecture and hyperparameters remain the same. We used generator v2 to obtain a batch of sequences and screened for bifunctional candidates following the same procedure. 30 out of 39 peptides with predicted MIC < 10 µM passed four or five AVP predictors. Therefore, the two generators perform roughly the same for generating AVP candidates (76.9% vs. 76.1%).
(2) The justification for AMP predictor advantages over previous tools lacks rationale, comparison with previous tools (e.g., with the very successful AMP prediction approach described by Ma et al. 10.1038/s41587-022-01226-0), and proper referencing.
Thanks for your suggestion. Ma et al. proposed ensemble binary classification models to mine AMPs from metagenomes successfully. However, we concentrated on the development of regression models. As a regressor, AMPredictor predicts the specific logMIC value of the input sequences instead of giving a yes/no answer. Since the training settings and evaluation metrics are different for the classification and regression tasks, we could not compare AMPredictor with Ma et al. directly. Instead, we compared the performance of AMPredictor with some regression baseline models (Figure S2a) and our model outperformed them.
(3) Experimental validation of three de novo AMPs is a very low number compared to recent similar studies.
Thanks for pointing out this shortcoming. We have synthesized and tested more peptides in vitro and added these results to the revised manuscript (Figure 2). From the two versions of the generator (trained with or without AVPs), we selected 24 peptides in total for antibacterial and antiviral validation. All 24 peptides showed antibacterial activity towards at least one bacterial strain, and 13 peptides were identified as antiviral through the quick antiviral test. This result indicates the capability of our design method for bifunctional AMPs with a notable success rate (60%).
(4) I have concerns regarding the in vivo experiments including i) the short period of reported survival compared to recent studies (10.1038/s41587-022-01226-0, 10.1016/j.chom.2023.07.001, 10.1038/s41551-022-00991-2) and ii) although statistics have been provided in Figure 2 f and g, a log-scale y-axis would provide a better comparative representation of the different conditions.
Thank you for your suggestions.
i) In the current study, we monitored the survival of mice with peritoneal bacterial infection for 48 h. Because abdominal bacterial infection can induce severe sepsis and cause mouse death within 40 h (Sci Adv 2019, 5(7), eaax1946), 48 h is sufficient to evaluate the therapeutic efficacy of antimicrobial peptides (Nat Biotechnol 2019, 37(10), 1186-1197).
ii) In Figure 2f and 2g (3f and 3g in the revised manuscript), the y-axis has already been in log-scale and tick labels are marked in scientific notation.
(5) I had difficulty reading the story because of the use of acronyms without referring to their full name for the first time, and incomplete annotation in figures and captions.
Thank you for pointing this out. We have checked the manuscript carefully and modified the figure captions during revision.
Reviewer #2 (Recommendations For The Authors):
(1) To validate the generalizability of the model, it would be prudent to include data on AMPs targeting a broader range of bacteria and viruses. This could help ensure that the peptides designed are not narrowly focused on E. coli but are effective against a more extensive set of pathogens.
Thanks for your suggestions. We incorporated only AMPs with E. coli activity labels, since E. coli is the most common strain in the available AMP databases. For a regression model such as AMPredictor, the fitting target must be defined precisely, which means restricting the set of target bacteria. Other articles have also focused on E. coli labels (Nat Commun 2023, 14, 7197; mSystems 2023, 8, e0034523).
We used the same processed dataset to train the GAN generator for simplification. Most reported AMPs have the potential to target various microbes. We have counted the antimicrobial labels of these peptides in our dataset, shown in Figure S1b. In addition to E. coli, some of the peptides target Gram-positive S. aureus, fungus C. albicans, and other bacterial species as well. Our experimental validation also reveals the wide spectrum of designed peptides inhibiting Gram-negative, Gram-positive, drug-resistant bacteria, and enveloped viruses. With the expansion of well-curated AMP databases, we expect to update the model with larger scale datasets in the near future.
(2) Conduct sensitivity analyses to understand how minor changes in the peptide sequences impact the model’s predictions. This will reduce the chances of overlooking potential AMP candidates due to the model’s inability to capture subtle changes.
Thank you for this valuable suggestion. We kept similar known peptide sequences in the training sets because a single mutation may affect antimicrobial performance. We took P001 as an example and performed the sensitivity analysis by in silico site saturation mutagenesis. Author response image 2 shows the change in antimicrobial activity scores predicted by AMPredictor. Since the predicted MIC of P001 is already low at 0.949 µM (the experimentally measured value is 0.80 µM), most single mutations lead to higher scores (i.e., worse performance), especially substitutions to the negatively charged residues Asp (D) and Glu (E). The largest change for a single amino acid replacement is 25.51 (W6D). Although this value may not reflect the actual change, it is large enough to be distinguished when screening and ranking candidate sequences.
Author response image 2.
Site saturation mutagenesis of P001. Color shows the change in predicted MIC against E. coli as predicted by AMPredictor (a lower score is better).
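The in silico scan described in this response can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `toy_predict` is a hypothetical stand-in that simply counts acidic residues and is not AMPredictor, and the peptide string is an invented example, not the sequence of P001. Mutation labels follow the W6D convention used above (original residue, 1-based position, new residue).

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_mutants(sequence):
    """Yield (label, mutant) for every single-residue substitution."""
    for pos, original in enumerate(sequence, start=1):
        for new in AMINO_ACIDS:
            if new != original:
                yield f"{original}{pos}{new}", sequence[:pos - 1] + new + sequence[pos:]


def saturation_scan(sequence, predict):
    """Return {label: predict(mutant) - predict(wild type)} for all single mutants."""
    baseline = predict(sequence)
    return {label: predict(mutant) - baseline
            for label, mutant in single_mutants(sequence)}


def toy_predict(seq):
    """Hypothetical scorer (NOT AMPredictor): one penalty point per acidic residue."""
    return float(seq.count("D") + seq.count("E"))


peptide = "GKWKLLKKAW"  # invented 10-residue example, not P001
deltas = saturation_scan(peptide, toy_predict)
worst_label = max(deltas, key=deltas.get)
print(len(deltas), "single mutants; largest increase:", worst_label, deltas[worst_label])
```

A peptide of length L yields 19 x L single mutants, so the scan stays cheap for short peptides; in practice the stand-in scorer would be replaced by the trained regressor.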
(3) Given the relatively short length of the peptides, typically ranging from 10 to 20 residues, the authors might consider employing a fully-connected graph in the peptide’s graphical representation. This approach could potentially simplify the model without sacrificing the descriptive power due to the limited size of the peptides.
Thanks for your suggestions. We tested fully-connected graph edge encodings, and the results on the test set are shown in Author response table 2 below. AMPredictor with the peptide contact map still performed better on Pearson correlation coefficient and CI, while the fully-connected graphs reached slightly better RMSE and MSE. Nonetheless, using fully-connected graphs demands about 10 times the memory and higher computational cost because of the more complicated message passing. Therefore, incorporating structural information is still the preferred choice.
Author response table 2.
Results of AMPredictor with different graph edge encodings
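The two edge encodings compared here can be illustrated with a short Python sketch that builds directed edge lists from a contact-probability matrix and from a fully-connected graph. The 0.5 probability cutoff, the toy matrix, and the function names are assumptions made for illustration; the manuscript's actual contact threshold and graph construction are not stated in this response.

```python
def contact_map_edges(contact_prob, threshold=0.5):
    """Directed edges (i, j) between residues whose predicted contact
    probability reaches the threshold (threshold value is an assumption)."""
    n = len(contact_prob)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and contact_prob[i][j] >= threshold]


def fully_connected_edges(n):
    """Directed edges between every pair of distinct residues."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]


# Toy contact matrix: probability decays with sequence separation (invented values).
n_residues = 15
toy_contacts = [[1.0 / (1 + abs(i - j)) for j in range(n_residues)]
                for i in range(n_residues)]

sparse_edges = contact_map_edges(toy_contacts)
dense_edges = fully_connected_edges(n_residues)
print(len(sparse_edges), "contact-map edges vs", len(dense_edges), "fully-connected edges")
```

A fully-connected graph on n residues always has n(n - 1) directed edges, while the contact-map graph keeps only the pairs above the cutoff, so the number of messages passed per layer grows much faster in the fully-connected case. The exact ratio depends on the peptide and the threshold.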
(4) Upon reviewing Table S1, it is apparent that the application of ESM embeddings alone achieves commendable prediction accuracy. It would be intriguing to investigate whether the adoption of the more recent ESM models, specifically the second-generation ESM2 t36_3B, t48_15B, and t33_650M, could enhance model performance beyond that observed using the esm1b_t33_650M_UR50S model described in the manuscript.
Thanks for your suggestions. Here, we included various ESM2 models’ outputs as our node features and presented the results in Author response table 3. Notably, the dimensions of esm2_t36_3B and esm2_t48_15B are 2560 and 5120, respectively, while both esm2_t33_650M and esm1b_t33_650M are 1280 dimensions.
Interestingly, we found that larger models do not lead to improved performance. The ESM-1b version still holds the best metrics in RMSE, MSE, and Pearson correlation coefficient. This indicates that the choice of pretrained model version depends on the specific downstream task.
Author response table 3.
Results of AMPredictor with different ESM versions
(5) It may be pertinent to reevaluate the use of the MM-PBSA approach within the scope of this study. Typically, MM-PBSA is utilized to estimate the free energy differences between the bound and unbound states of solvated molecules. The application of MM-PBSA to calculate binding energies between proteins and membranes is unconventional and infrequently documented in the literature. Therefore, it is recommended that the authors consider omitting this portion of the manuscript, or provide a robust justification for its inclusion and application in this context.
Thanks for your comments on MM/PBSA methods. Several published studies have used this approach to calculate peptide-membrane binding free energies (Langmuir 2016, 32, 1782-1790; J Cell Biochem 2018, 119, 9205-9216; J Chem Inf Model 2019, 59, 3262-3276; Molecular Therapy Oncolytics 2019, 16, 7-19; Microbiology Spectrum 2023, 11, e0320622; J Chem Inf Model 2023, 63, 5823-5833), and we followed their settings, such as the dielectric constant. All of these works built similar all-atom systems including cationic antimicrobial peptides and membrane bilayers, and utilized the MM/PBSA method to describe the absorption process of the peptide from an unbound initial state. The order of magnitude of our calculated results is consistent with these reported works. Additionally, the computational results may provide supporting evidence, and we discussed that this quantitative energy calculation should be considered along with the other observed metrics.
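For reference, the standard MM/PBSA estimate of a binding free energy can be written as below. This is the generic textbook form, with the membrane assumed here to play the role of the receptor and the peptide that of the ligand; the specific terms retained in the manuscript (for example, whether the entropy term is included) are not stated in this response.

```latex
\Delta G_{\mathrm{bind}}
  = \langle G_{\mathrm{complex}} \rangle
  - \langle G_{\mathrm{membrane}} \rangle
  - \langle G_{\mathrm{peptide}} \rangle ,
\qquad
G_{x} = E_{\mathrm{MM}} + G_{\mathrm{PB}} + G_{\mathrm{SA}} - T S
```

Here $E_{\mathrm{MM}}$ is the molecular-mechanics energy, $G_{\mathrm{PB}}$ the polar solvation term from the Poisson-Boltzmann equation, $G_{\mathrm{SA}}$ the non-polar term estimated from the solvent-accessible surface area, and $-TS$ the entropic contribution, which is frequently omitted when only relative values are compared.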
Reviewer #3 (Recommendations For The Authors):
The weaknesses I mentioned in the Public Review may be addressed by improving the writing and presentation and corrections to the text and figures.
Thanks for your suggestion. We have carefully checked and improved the presentation of text and figures in the revised manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The paper is well-organized, with clearly defined sections. The systematic review methodology is thorough, with clear eligibility criteria, search strategy, and data collection methods. The risk of bias assessment is also detailed and useful for evaluating the strength of evidence. The involvement of a patient panel is noticeable and positive, ensuring the research addresses real-world concerns and aligning scientific inquiry with patient perspectives. The statistical approach used for analyzing seems appropriate.
The authors are encouraged to take into account the following points:
As the authors have acknowledged, there is a high risk of bias across all included studies, particularly in randomization, selective outcome reporting, and incomplete data, which could be highlighted more explicitly in the paper's discussion section, particularly the potential implications for the generalizability of the results. The authors can also suggest mitigation strategies for future studies (e.g., better randomization, blinding, reporting standards, etc.).
We agree that it is important to highlight mitigation strategies that will allow preclinical researchers to more transparently report future studies. We have directed readers to ensure reporting in alignment with the ARRIVE 2.0 guidelines for further details on reporting of preclinical studies, as follows in paragraph two of the Discussion, “Future studies should carefully incorporate all elements of the ARRIVE 2.0 guidelines to help ensure that all results are transparently reported and improve confidence in the findings.(41)”
None of the studies include female animals, and the use of young adult animals (instead of aged models) limits the applicability of the findings to the human stroke population, where stroke incidence is higher in older adults and perhaps the gender issue must be included to reflect the translational aspects. The authors can add to the paper's discussion section that perhaps future preclinical studies should include both sexes and aged animals to align better with the clinical population and improve the translation of findings. Another point is the comorbidity. Comorbidities such as diabetes and hypertension are prevalent in stroke patients. How can these be considered in preclinical designs? The authors should emphasize the importance of future research incorporating such comorbid models to enhance clinical relevance. None of the studies had independent replication of their findings, which is a key limitation, especially for a field with high translational expectations. This should be highlighted as a critical next step for validating the efficacy of CCR5 antagonists.
We agree that these are important evidence gaps to address. Although we highlighted these gaps in paragraph 3 of the Discussion, we have now added a more explicit call to action for researchers to address these gaps at the end of the relevant paragraph as follows, “Future preclinical research should aim to address these evidence gaps to further increase the clinical relevance and comprehensiveness of evidence for CCR5 antagonists in stroke.”
The studies assessed limited cognitive outcomes (only one reported a cognitive outcome). Given the importance of cognitive recovery post-stroke, this is a gap to highlight in the discussion. Future studies should include more diverse and comprehensive behavioral assessments, including cognitive and emotional domains, to fully evaluate the therapeutic potential.
We have expanded on this important point in paragraph four of the Discussion, which explores the alignment of the preclinical literature to the CAMAROS trial, as follows, “Finally, clinically relevant secondary outcomes in the CAMAROS trial, such as cognitive and emotional domains as measured by the Montreal Cognitive Assessment (MoCA) and Stroke Aphasia Depression Questionnaire (SADQ) were not modelled in the preclinical literature. Although one study included a cognitive outcome, the other treatment parameters of this study were not aligned to the CAMAROS trial. Future preclinical studies should assess a more diverse and comprehensive battery of clinically relevant behavioural tasks, which could be based on the range of outcomes employed in the CAMAROS trial, or those found in the SRRR recommendations.(9)”
This addition highlights the lack of supporting preclinical evidence for cognitive recovery post-stroke. We also offer recommendations on discrete ways to address this gap in future preclinical studies by taking inspiration from the outcomes used in CAMAROS as well as the SRRR guidelines used throughout our assessment of the CCR5 literature.
The timing of CCR5 administration across studies varies widely (from pre-stroke to several days post-stroke) complicating the interpretation and comparison of results. The authors are encouraged to add that future preclinical studies could focus on narrowing the therapeutic window to more clinically relevant time points.
We agree with the reviewer and feel that this recommendation is currently captured in paragraph three of our Discussion: “However, demonstration of efficacy under a wider range of conditions, such as in aged animals, females, animals with stroke-related comorbidities, more clinically relevant timing of dose administrations, or in conjunction with rehabilitative therapies are necessary to provide further confidence in these findings.” As mentioned above, we added a new sentence to the end of this paragraph to make it more explicit that these are gaps that should be addressed by future preclinical research: “Future preclinical research should aim to address these evidence gaps to further increase the clinical relevance and comprehensiveness of evidence for CCR5 antagonists in stroke.” We also added the word “clinically” to the original sentence mentioned above to more explicitly align with the reviewer’s recommendation.
The paper identifies some alignment with clinical trials, but there are several gaps, too, particularly in the types of behavioral tests used in preclinical studies versus those in clinical trials. If this systematic review and meta-analysis aim to formulate a set of recommendations for future studies, it is important that the authors also propose specific preclinical behavioral tasks that could better align with clinical measures used in trials, like functional assessments related to human stroke outcomes.
As mentioned above, we added a sentence to Discussion paragraph four, the comparison to the CAMAROS trial, that provides recommendations as to the behavioural tasks that would be useful to employ in future studies. Namely, “Future preclinical studies should assess a more diverse and comprehensive battery of clinically relevant behavioural tasks, which could be modelled after the range of outcomes employed in the CAMAROS trial, or those found in the SRRR recommendations.(9)” The SRRR recommendations that we reference here provide discrete consensus recommendations for interested readers on behavioural task selection, as well as priority rankings based on rodent species, to better align with clinical measures used in trials.
The discussion needs some revisions. It could benefit from an expanded explanation of CCR5's mechanistic role in neuroplasticity and stroke recovery. For instance, linking CCR5 antagonism more closely with molecular pathways related to synaptic repair and remyelination would enhance the quality of the discussion and understanding of the drugs' potential.
We have provided a synthesis of CCR5’s proposed mechanistic roles in the Supplementary Materials, Figure S1 (for a summary pathway diagram), and Table S3 (for a list of potential mechanistic pathways and supporting evidence presented in each paper). Given our focus on study quality and alignment with translational recommendations, we felt that it was more appropriate to not focus on mechanistic elements in the Discussion. Indeed, the appraisal of the quality of support for each potential mechanism was beyond the scope of our present analysis.
While the tool is used to assess the risk of bias, it might be helpful to integrate a broader framework for evaluating the quality of included studies. This could include sample size justifications, statistical power analysis, or the use of pre-registration in animal studies. These elements can also introduce bias or minimize those if in place.
We agree these are important and the SYRCLE risk of bias tool we used addresses many major domains of bias mentioned by the reviewer (e.g., selection bias, performance bias, detection bias, attrition bias, reporting bias). For example, the SYRCLE “selective outcome reporting” domain addresses pre-registration by asking “Was the study protocol available and were all of the study’s pre-specified primary and secondary outcomes reported in the current manuscript?”. The SYRCLE Risk of Bias tool represents the current state of the art for risk of bias assessment in preclinical systematic reviews and aligns well with similar tools used clinically, such as the Cochrane Risk of Bias tool. Although the tool does not assess statistical power, we would note that this is considered to be a separate issue from internal validity, which is why it is also not assessed by the Cochrane risk of bias tool used in clinical systematic reviews.
Please also highlight confounding factors that might have influenced the results in the included studies, such as variation in stroke models, dosing regimens, or behavioral assessment methods.
We agree that exploring potential confounding factors is an important element of the assessment. We highlight potential confounding factors in several parts of the Results and Discussion, such as in our Synthesis of Behavioural Outcomes section, “…equivalent infarct volumes were not demonstrated between the treated and control groups in this cohort, which could potentially lead to confounding effects.” and Comprehensiveness of Preclinical Evidence section, “All studies tested both behavioral and histological outcomes and demonstrated neuroprotective effects, but most studies failed to measure and control post-stroke temperature, which could potentially confound the observed neuroprotection (Table S4).(32) Most histological measurements were also assessed at <72 hours, which could confound the observed neuroprotective effects if cell death was merely delayed.(32) For CCR5 antagonists as a post-stroke recovery-inducing treatment, one experiment assessed the effects of initiating CCR5 administration in a similar post-stroke phase as the CAMAROS trial. This experiment (Joy et al.)(6) did not demonstrate that each treatment group had equivalent baseline stroke volumes, which may potentially confound observed behavioral effects.”
Although there are many factors that could potentially confound the observed results, we believe that we have addressed some of the most prominent examples that are known in the preclinical stroke literature. We expanded our statement in the final sentence of the Results to highlight this, “Overall, our assessments highlight a variety of knowledge gaps, potential confounding factors, and areas of misalignment between the preclinical evidence and clinical trial parameters that could be improved with further preclinical experimentation.”
There is some discussion of the meta-analysis' limitations due to the few studies, but this point could be more thoroughly addressed. Please consider including a more critical discussion of the limitations of pooling data from heterogeneous study designs, stroke models, and outcome measures. What can this lead to? Is it reliable to do so, or does it lack scientific rigor? The authors are encouraged to formulate a balanced discussion adding, positive and negative aspects.
We appreciate the reviewer’s insightful comment regarding the limitations related to pooling data from heterogeneous study designs, stroke models, and outcome measures. We have added to the original limitations described in the first paragraph of our Discussion with additional text to provide a better balance about the potential risks and benefits of the meta-analysis strategy that we undertook in the present study.
“Pooling data across heterogenous experimental designs, animal/stroke models, and treatment parameters, as we have done with the infarct volume analysis in the present study, can introduce variability that increases the risk of overestimating or underestimating the true effect of the intervention.(38) Treatment effects observed across model systems and therapeutic compounds may represent different biological mechanisms. Despite this potential limitation, meta-analysis can provide valuable insights, especially in preclinical settings where the sample sizes of individual studies may be too small to detect significant effects on their own. In these cases, pooling data across studies can help identify overarching estimates of benefits and harm, highlight subgroups of interest, and help guide areas of future research. As described in the results above, we attempted to mitigate the risks of inappropriate data pooling through careful investigation of heterogeneity, subgroup analyses, and differentiation between outcomes where we felt that meta-analytic pooling was (infarct volume) and was not (behavioural outcomes) appropriate. Overall, we believe that our results indicate that further investigation is warranted to determine the optimal timing of administration and behavioral domains under which CCR5 antagonists exhibit the strongest post-stroke neuroprotective and recovery-inducing effects.”
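To illustrate the kind of pooling and heterogeneity assessment discussed in this passage, the following is a minimal sketch of a DerSimonian-Laird random-effects estimate with the Q and I² heterogeneity statistics. It is a generic textbook procedure and not necessarily the one used in the meta-analysis; the function name `random_effects_pool` and the input numbers are hypothetical and chosen only for illustration.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling (illustrative sketch).

    effects   : per-experiment effect sizes (hypothetical inputs)
    variances : their within-experiment sampling variances
    Returns (pooled_effect, standard_error, tau2, i2_percent).
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]           # inverse-variance weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = k - 1

    # Between-experiment variance tau^2, truncated at zero
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Re-weight with tau^2 added to every within-experiment variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    se = math.sqrt(1.0 / sum_re)
    return pooled, se, tau2, i2


# Hypothetical, made-up inputs: three experiments with very different effects
pooled, se, tau2, i2 = random_effects_pool([0.1, 1.0, 2.0], [0.01, 0.01, 0.01])
print(f"pooled={pooled:.3f} se={se:.3f} tau2={tau2:.3f} I2={i2:.1f}%")
```

A large I² in such a calculation signals that a single pooled estimate may hide distinct underlying effects, which is one reason to accompany it with subgroup analyses.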
The conclusion should more explicitly acknowledge that while CCR5 antagonists show potential, the findings are still preliminary due to the limitations in the preclinical studies (high bias risk, lack of diverse animal models). Overall, the conclusion can end with a call for rigorous, well-controlled, and replicated studies with improved alignment to clinical populations and trials to show that the conclusion remains inconclusive, considering what has been analyzed here.
We modified our concluding paragraph to highlight that the current evidence should be considered preliminary, as follows, “In conclusion, CCR5 antagonists show promise in preclinical studies for stroke neuroprotection, corresponding reduction in impairment, as well as improved functional recovery related to neural repair in the late sub-acute/early chronic phase. However, high risk of bias and the limited (or no) evidence in clinically relevant domains underscore the need for more rigorous and transparent preclinical research to further strengthen the current preliminary evidence available in the literature.”
Reviewer #2 (Public review):
Summary:
This is an interesting, timely, and high-quality study on the potential neuroprotective capabilities of C-C chemokine receptor type 5 (CCR5) antagonists in ischemic stroke. The focus is on preclinical investigations.
Strengths:
The results are timely and interesting. An outstanding feature is that stroke patient representatives have directly participated in the work. Although this is often called for, it is hardly realized in research practice, so the work goes beyond established standards.
The included studies were assessed regarding the therapeutic impact and their adherence to current quality assurance guidelines such as STAIR and SRRR, another important feature of this work. While overall results were promising, there were some shortcomings regarding guideline adherence.
The paper is very well written and concise yet provides much highly useful information. It also has very good illustrations and extremely detailed and transparent supplements.
Weaknesses:
Although the paper is of very high quality, a couple of items may require the authors' attention to increase the impact of this exciting work further. Specifically:
Major aspects:
(1) I hope I did not miss that (apologies if I did), but when exactly was the search conducted? Is it possible to screen the recent literature (maybe up to 12/2024) to see whether any additional studies were published?
We added the following statements to the “Information sources and search strategy” section of Materials and Methods to clarify the timing and intention of our search strategy, “The search was conducted October 25, 2022, to align with the listed launch date of the CAMAROS trial (September 15, 2022). Our intention in doing so was to collate and assess all preclinical evidence that could have feasibly informed the clinical trial. We sought to assess the comprehensiveness of evidence and readiness for translation of CCR5 antagonist drugs at the time of their actual translation into human clinical trials, as well as the alignment of the CAMAROS trial design to the existing preclinical evidence base.”
Although we agree that an update of the search provides valuable information for the field, we believe that the studies entering the literature after the launch of the CAMAROS trial fill a different conceptual niche than those prior to trial launch (since newer preclinical studies explicitly did not inform decisions to move to clinical trials or clinical trial design). It is our view that newer studies should be assessed from a lens of how effectively they close knowledge gaps that were present at trial launch and emulate the conditions of clinical trial populations and design parameters (which represent the de facto most “clinically relevant” conditions). Such an analysis would require a different approach that is outside the scope and aims of the present study. The present study provides an assessment of the preclinical literature up to the date of the translation of CCR5 antagonist drugs into human clinical trials (via the CAMAROS trial), which we believe will serve as a valuable prospective benchmark for evaluating the predictiveness of preclinical evidence after the results of the CAMAROS trial emerge.
(2) Please clearly define the difference between "study" and "experiment," as this is not entirely clear. Is an "experiment" a distinct investigation within a particular publication (=study) that can describe more than one such "experiment"? Thanks for clarifying.
We have now added definitions for “studies” and “experiments” immediately after the first time they are mentioned in paragraph one of the Study Selection section of Results, as follows: “Herein, “studies” refer to the published articles as a unit, while “experiments” refer to distinct investigations within each published article used to test various hypotheses (i.e., a subunit of “studies” comprised of a select cohort of animals).”
(3) Is there an opportunity to conduct a correlation analysis between the quality of a study (for instance, after transforming the ROB assessment into a kind of score) and reported effect sizes for particular experiments or studies? This might be highly interesting.
This is an interesting suggestion, which under different circumstances could provide insights into potential associations between study quality and effect size, as have been observed in the literature (e.g., Macleod et al., 2008; PMID:18635842). However, we are unable to assess this relationship in the present dataset as all studies were scored as “high risk of bias”, meaning that there was no variability in terms of observed study quality.
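The point about missing variability can be made concrete with a small sketch: a Pearson correlation is undefined when one variable is constant, because its standard deviation appears in the denominator. The helper `pearson_r`, the numeric coding of risk of bias, and the effect sizes below are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation; returns None when either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        return None  # correlation is undefined for a constant variable
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical coding: every study rated "high risk of bias" gets the same score
rob_scores = [1, 1, 1, 1, 1]
effect_sizes = [0.4, 0.9, 1.3, 0.7, 1.1]  # made-up values for illustration
print(pearson_r(rob_scores, effect_sizes))
```

With identical quality scores the function returns `None`, which mirrors why the requested analysis cannot be run on a dataset with a single risk-of-bias category.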
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
Minor aspects:
(1) The scope of the work is perfectly in line with very recent STAIR recommendations, which strongly suggest assessing potential interventions that may augment impact and improve outcomes in recanalization procedures (Wechsler et al., doi: 10.1161/STROKEAHA.123.044279; PMID 37886850). The authors may wish to discuss their work in light of these recent recommendations.
We thank the reviewer for highlighting the more recent STAIR recommendation document, as well as its focus on assessing interventions in conjunction with recanalization procedures. An item related to the importance of combining novel interventions with established recanalization procedures was included as part of Table S4 but was not highlighted in the main text. We have added to the final paragraph of the Results section “Comprehensiveness of preclinical evidence” to highlight that no studies tested CCR5 antagonist drugs in conjunction with recanalization procedures as follows, “…no studies assessed behavioural effects on upper extremity skilled reaching / grasping or potential interactions of CCR5 antagonists with rehabilitative therapies or established recanalization procedures (Table S4).(35–38)” The Wechsler reference provided by the reviewer has now been cited as well.
(2) The authors may wish to consider the term "cerebroprotective" rather than "neuroprotective" unless neurons are the only cells to which a respective statement applies.
We agree that “cerebroprotective” is the more appropriate term and have thus substituted it wherever we previously used “neuroprotective”.
(3) The paper features a mixture between American (e.g.," hemorrhagic") and British English (e.g., "favours"). Although this is not untypical for Canadian English, deciding on one or the other may be an option.
Given eLife’s basis in the UK, we have modified the language used throughout to be consistent with British English style.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The manuscript entitled "Phosphodiesterase 1A Physically Interacts with YTHDF2 and Reinforces the Progression of Non-Small Cell Lung Cancer" explores the role of PDE1A in promoting NSCLC progression by binding to the m6A reader YTHDF2 and regulating the mRNA stability of several novel target genes, consequently activating the STAT3 pathway and leading to metastasis and drug resistance.
Strengths:
The study addresses a novel mechanism involving PDE1A and YTHDF2 interaction in NSCLC, contributing to our understanding of cancer progression.
Weaknesses:
The following issues should be addressed:
(1) The body weight changes and/or survival times of each group in the in vivo metastasis studies should be provided.
Thank you for your suggestion! We have already provided the body weight of each group in the in vivo metastasis studies in Figure S4D and Figure S5D (see below).
(2) In Figure 7, the direct binding between YTHDF2 and the potential target genes should be further validated by silencing YTHDF2 to observe the half-life of the mRNA levels of target genes, in addition to silencing PDE1A.
Thank you for your suggestion! We have found that siYTHDF2 does not significantly affect expression of SOCS2 in NSCLC cells (see author response image 1 below). We hypothesize that YTHDF2 functions as an m6A reader that recognizes the target mRNA; thus, even when YTHDF2 is silenced by siRNA, some residual expression remains in the cells, allowing it to continue recognizing its targets and exerting its function. Therefore, SOCS2 mRNA expression was not significantly affected. However, PDE1A functions as a degrader of mRNA; thus, when it is disrupted, the effect on mRNA degradation could be strong.
Author response image 1.
SOCS2 mRNA expression after siYTHDF2 in NSCLC cells
(3) In Figure 7, the potential methylation sites of "A" on the target genes such as SOCS2 should be verified by mutation analysis, followed by m6A IP or reporter assays.
Thank you for your suggestion! m6A IP or reporter assays may be carried out in the future to detect the potential methylation sites. We have added this point to the manuscript: “Meanwhile, YTHDF2 might act as an m6A RNA “reader” by interacting with PDE1A, but the mechanism might need further investigation”.
(4) In Figure 6G, the correlation between the mRNA levels of STAT3 and YTHDF2 needs clarification. According to the authors' mechanism, the STAT3 pathway is activated, rather than upregulation of mRNA levels (or protein levels, as shown in Figure 6F). Figure 7 does not provide evidence that STAT3 is a bona fide target gene regulated by YTHDF2.
Thank you for your suggestion! The reviewer is right: the STAT3 pathway is activated, rather than its mRNA levels being upregulated by YTHDF2, so the relationship between YTHDF2 mRNA and STAT3 mRNA is not suitable for this study. Moreover, the correlation between YTHDF2 mRNA and STAT3 mRNA is weaker than we expected, with a Pearson correlation coefficient of 0.37. Thus, we have deleted Figure 6G in the revised version.
(5) The final figure, which discusses sensitization to cisplatin by PDE1A suppression, does not appear to be closely related to the interaction or regulation of PDE1A/YTHDF2. If the authors claim this is an m6A-associated event, additional evidence is needed. Otherwise, this part could be removed from the manuscript.
Thank you for your suggestion! We have already deleted Figure 8 just as the reviewer suggested.
Reviewer #2 (Public review):
This manuscript aims to investigate the biological impact and mechanisms of phosphodiesterase 1A (PDE1A) in promoting non-small cell lung cancer (NSCLC) progression. They first analyzed several databases and used three established NSCLC cell lines and a normal cell line to demonstrate that PDE1A is overexpressed in lung cancer and its expression negatively correlated with the outcomes of patients. Based on this data, they suggested PDE1A could be considered as a novel prognostic predictor in lung cancer treatment and progression. To study the biological function of PDE1A in NSCLC, they focused on testing the effect of inhibition of PDE1A genetically and pharmacologically on cell proliferation, migration, and invasion in vitro. They also used an experimental metastasis model via tail vein injection of H1299 cells to test if PDE1A promoted metastasis. By database analysis, they also decided to investigate if PDE1A promoted angiogenesis by co-culturing NSCLC cells with HUVECs as well as assessing the tumors from the subcutaneous xenograft model. However, in this model, whether PDE1A modulation impacted tumor metastasis was not examined. To address the mechanism of how PDE1A promotes metastasis, the authors again performed a bioinformatic and GSEA enrichment analysis and confirmed PDE1A indeed activated STAT3 signaling to promote migration. In combination with IP followed by mass spectrometry, they found PDE1A is a partner of YTHDF2, the cooperation of PDE1A and YTHDF2 negatively regulated SOCS2 mRNA as demonstrated by RIP assay, and ultimately activated STAT3 signaling. Finally, the authors shifted the direction from metastasis to chemoresistance; specifically, they found that PDE1A inhibition sensitized NSCLC cells to cisplatin through MET and NRF2 signaling.
Strength:
Overall, the manuscript was well-written and the majority of the data supported the conclusions. The authors used a series of methods including cell lines, animal models, and database analysis to demonstrate the novel roles and mechanism of how PDE1A promotes NSCLC invasion and metastasis as well as cisplatin sensitivity. Given that PDE1A inhibitors have been pursued for clinical use, this study provided valuable findings that have translational potential for NSCLC treatment.
Weaknesses:
The role of YTHDF2 in PDE1A-promoted tumor metastasis was not investigated. To make the findings more clinical and physiologically relevant, it would be interesting to test if inhibition of PDE1A impacts metastasis using lung cancer orthotopic and patient-derived xenograft models. It is also important to use a cisplatin-resistant NSCLC cell line to test if a PDE1A inhibitor has the potential to sensitize cisplatin in vitro and in vivo.
Thank you for your suggestion! The role of YTHDF2 in PDE1A-promoted tumor metastasis may need in vivo analysis. Therefore, we discussed the point in the discussion section “In addition, it is worth testing if PDE1A inhibition affects metastasis in lung cancer orthotopic and patient-derived xenograft models. The role of YTHDF2 in PDE1A-driven tumor metastasis should be elucidated in future studies”.
The reviewer is absolutely right: it is very important to use a cisplatin-resistant NSCLC cell line to test the potential effect of PDE1A in sensitization to cisplatin. The current data cannot support the conclusion, and more data are needed to reach a final conclusion. As suggested by reviewer 1, we have deleted these data in this version.
Furthermore, this study relied heavily on different database analyses, although providing novel and compelling data that was followed up and confirmed in the paper, it is critical to have detailed statistical description section on data acquisition throughout the manuscript.
Thank you for your suggestion! We have already added the detailed statistical description section in Figure legends.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Scale Bar Display: Scale bars should be included in Figures 4F, 5F, and 6E to ensure clarity and accuracy in the presented microscopic images.
Thank you for your suggestion! We have already added the scale bars on Figures 4F, 5F, and 6E.
(2) HE Staining Images: The authors are suggested to provide more images for HE staining of lungs to offer a comprehensive visual representation and to substantiate the findings.
Thank you for your suggestion! We have already provided more images for HE staining of lungs in Figure S4E and Figure S5E.
Reviewer #2 (Recommendations for the authors):
It would be helpful to clarify several points in the manuscript for better understanding.
(1) The HELF cells were described both as an epithelial cell line (page 7, line 118) and as fibroblasts (page 12, line 288), which needs to be clarified. It is not clear if the cells used in this study were periodically authenticated.
Thank you for your suggestion! We have revised the description of HELF cells; they are human lung fibroblasts.
(2) More details could be added to the methods such as the amount of Matrigel coated for invasion assay and the components for the lysis buffer and IP buffer.
Thank you for your suggestion! We have already added more details in the Methods section.
(3) Providing the rationale for using 20% FBS instead of using some chemoattractants such as EGF, LPA, or HGF or a low level of FBS for migration will be helpful.
Thank you for your suggestion! Although chemoattractants are suitable for cell migration experiments, 20% FBS is also suitable. We list below, as examples, publications that used this system.
(1) Xiaolin Peng, Zhengming Wang, Yang Liu. et al. Oxyfadichalcone C inhibits melanoma A375 cell proliferation and metastasis via suppressing PI3K/Akt and MAPK/ERK pathways, Life Sciences, 2018, 206, 35-44. https://doi.org/10.1016/j.lfs.2018.05.032
(2) Rong, S., Dai, B., Yang, C. et al. HNRNPC modulates PKM alternative splicing via m6A methylation, upregulating PKM2 expression to promote aerobic glycolysis in papillary thyroid carcinoma and drive malignant progression. J Transl Med, 2024, 22, 914. https://doi.org/10.1186/s12967-024-05668-9
(4) For HPA analysis in Figure 1, it would be great to assess how many lung cancer cases are NSCLC and define IDO/area for the y-axis.
Thank you for your suggestion! Nineteen samples were analyzed, all of which are NSCLC samples, and we have revised our manuscript accordingly. We also made a mistake: the label should be IOD/area, which stands for integral optical density/area. We have revised the Figures and Figure legends.
(5) On page 23, line 480, "Therefore, this study reveals the effect and mechanism of PDEA1 in promoting HCC metastasis...", should HCC be NSCLC?
Thank you for your suggestion! We have already revised the manuscript accordingly.
(6) Specific scramble siRNAs should be clearly shown in their respective figures. In Figure 7F, it is not clear why DMSO, rather than scramble siRNA, was used as the control.
Thank you for your suggestion! It was our mistake to show DMSO in Figure 5F; DMSO is the negative control for Figure 5G, and we have revised Figures 5F and 5G accordingly.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Common comments
(1) Significance of zero mutation rate
Reviewers asked why we included mutation rate even though setting mutation rate to zero doesn’t change results. We think that including non-zero mutation rate makes our results more generalisable, and thus is a strength rather than weakness. To better motivate this choice, we have added a sentence to the beginning of Results:
(2) Writing the mu=0 case first
Reviewers suggested that we should first focus on the mu=0 case, and then generalize the result. The suggestions are certainly good. However, given the large amount of work involved in a re-organization, we have decided to adhere to our current narrative. That said, we now only include equations where mu=0 in the main text, and have moved the case of nonzero mutation rate to Supplementary Information.
Making equations more accessible
We have taken three steps to make equations more readable.
● Equations in the main text correspond to the case of zero-mutation rate.
● The original section on equation derivation is now in a box in the main text so that readers have the choice of skipping it but interested readers can still get a gist of where equations came from.
● We have provided a much more detailed interpretation of the equation:
(3) Validity of the Gaussian approximation
Reviewers raised concerns about the validity of Gaussian approximation on F frequency. The fact that our calculations closely match simulations suggests that this approximation is reasonable. Still, we added a discussion about the validity of this approximation in Box 1.
We also added a figure to SI with various cases of initial S and F sizes. This figure shows that when either initial S or initial F is small, the distribution of 𝑓(𝜏) is not normal. However, if initial S and F are both on the order of hundreds, then the distribution of 𝑓(𝜏) is approximately Gaussian.
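The dependence of the shape of the 𝑓(𝜏) distribution on initial population sizes can be explored with a small simulation. The sketch below is not the authors' model or code: it assumes that S and F each grow as independent pure-birth (Yule) processes with no mutation, and all names and parameter values (`r_s`, `r_f`, `tau`, the initial sizes, the number of replicates) are hypothetical. Sample skewness is used as a rough indicator of departure from a Gaussian shape.

```python
import math
import random


def yule_final_size(n0, rate, t, rng):
    """Size at time t of a pure-birth process started from n0 cells.

    Each founder lineage adds a geometric number of extra cells with
    success probability p = exp(-rate * t); requires rate * t > 0.
    """
    p = math.exp(-rate * t)
    log_q = math.log(1.0 - p)
    extra = 0
    for _ in range(n0):
        u = 1.0 - rng.random()                # uniform on (0, 1]
        extra += int(math.log(u) / log_q)     # geometric draw by inversion
    return n0 + extra


def simulate_f(s0, f0, r_s, r_f, tau, n_rep, seed):
    """Simulate F frequency f(tau) = F / (S + F) over n_rep replicates."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_rep):
        s = yule_final_size(s0, r_s, tau, rng)
        f = yule_final_size(f0, r_f, tau, rng)
        samples.append(f / (s + f))
    return samples


def skewness(xs):
    """Sample skewness; close to 0 for a symmetric, Gaussian-like sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Hypothetical parameters: F grows slightly faster than S
params = dict(r_s=0.5, r_f=0.6, tau=4.0, n_rep=1000, seed=1)
small_f0 = simulate_f(s0=200, f0=2, **params)     # very few initial F cells
large_both = simulate_f(s0=200, f0=200, **params) # hundreds of both types

small_skew = skewness(small_f0)
large_skew = skewness(large_both)
print("skewness with 2 initial F cells:  ", round(small_skew, 2))
print("skewness with 200 initial F cells:", round(large_skew, 2))
```

Under these assumptions the case with only a handful of initial F cells is expected to be clearly right-skewed, while the case with hundreds of both types should be close to symmetric.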
Public Reviews:
Summary:
The authors demonstrate with a simple stochastic model that the initial composition of the community is important in achieving a target frequency during the artificial selection of a community.
Strengths:
To my knowledge, the intra-collective selection during artificial selection has not been seriously theoretically considered. However, in many cases, the species dynamics during the incubation of each selection cycle are important and relevant to the outcome of the artificial selection experiment. Stochasticity from birth and death (demographic stochasticity) plays a big role in these species' abundance dynamics. This work uses a simple framework to tackle this idea meticulously.
This work may or may not be hysteresis (path dependency). If this is true, maybe it would be nice to have a discussion paragraph talking about how this may be the case. Then, this work would even attract the interest of people studying dynamic systems.
We have added this clarification in the main text:
“Note that here, selection outcome is path-dependent in the sense of being sensitive to initial conditions. This phenomenon is distinct from hysteresis where path-dependence results from whether a tuning parameter is increased or decreased.”
Weaknesses:
(1) Connecting structure and function
In typical artificial selection literature, most of them select the community based on collective function. Here in this paper, the authors are selecting a target composition. Although there is a schematic cartoon illustrating the relationship between collective function (y-axis) and the community composition in the main Figure 1, there is no explicit explanation or justification of what may be the origin of this relationship. I think giving the readers a naïve idea about how this structure-function relationship arises in the introduction section would help. This is because the conclusion of this paper is that the intra-collective selection makes it hard to artificially select a community that has an intermediate frequency of f (or s). If there is really evidence or theoretical derivation from this framework that indeed the highest function comes from the intermediate frequency of f, then the impact of this paper would increase because the conclusions of this stochastic model could allude to the reasons for the prevalent failures of artificial selection in literature.
We have added this to introduction: “This is a common quest: whenever a collective function depends on both populations, collective function is maximised, by definition, at an intermediate frequency (e.g. too little of either population will hamper function [23]).”
(2) Explain intra-collective and inter-collective selection better for readers.
The abstract, the introduction, and the result section use the terms intra-collective and inter-collective selection without much explanation. For the wide readership of eLife, a clear definition in the beginning would help the audience grasp the importance of this paper, because these concepts are at the core of this work.
This is a great point. We have added in Abstract:
“Such collective selection is dictated by two opposing forces: during collective maturation, intra-collective selection acts like a waterfall, relentlessly driving the S-frequency to lower values, while during collective reproduction, inter-collective selection resembles a rafter striving to reach the target frequency. Due to this model structure, maintaining a target frequency requires the continued action of inter-collective selection.”
and in Introduction
“A selection cycle consists of three stages (Fig. 1). During collective maturation, intra-collective selection favors fast-growing individuals within a collective. At the end of maturation, inter-collective selection acts on collectives and favors those achieving the target composition. Finally during collective reproduction, offspring collectives sample stochastically from the parents, a process dominated by genetic drift.”
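The three stages can be sketched in code as follows. This is an illustrative sketch rather than the authors' implementation: it assumes zero mutation, pure-birth growth, and that the single Adult closest to the target reproduces all Newborns. The parameter values and the names `mature` and `selection_cycle` are hypothetical.

```python
import numpy as np

def mature(s0, f0, r, omega, tau, rng):
    """Maturation: stochastic pure-birth growth of S and F for time tau."""
    def grow(n0, rate):
        if n0 == 0:
            return 0
        # n0 founders plus a negative binomial number of offspring
        return n0 + rng.negative_binomial(n0, np.exp(-rate * tau))
    return grow(s0, r), grow(f0, r + omega)

def selection_cycle(newborn_f, n0, target, r, omega, tau, rng):
    """One cycle: maturation, then selection, then reproduction.

    newborn_f holds the number of F cells in each Newborn of size n0.
    Returns the F-cell counts of the next Newborns and the chosen Adult's
    F frequency.
    """
    adult_freqs = []
    for f0 in newborn_f:
        s, f = mature(n0 - f0, f0, r, omega, tau, rng)
        adult_freqs.append(f / (s + f))
    adult_freqs = np.array(adult_freqs)
    # Inter-collective selection: keep the Adult closest to the target.
    chosen = adult_freqs[np.argmin(np.abs(adult_freqs - target))]
    # Reproduction: each Newborn samples n0 cells from the chosen Adult.
    next_f = rng.binomial(n0, chosen, size=len(newborn_f))
    return next_f, chosen

rng = np.random.default_rng(0)
n0, g = 1000, 10                            # hypothetical Newborn size and collective number
newborn_f = rng.binomial(n0, 0.1, size=g)   # F-cell counts in each Newborn
for cycle in range(5):
    newborn_f, chosen = selection_cycle(newborn_f, n0, target=0.05,
                                        r=0.5, omega=0.03, tau=4.8, rng=rng)
    print(cycle, round(float(chosen), 4))
```

Intra-collective selection enters only through the faster growth rate of F in `mature`; the binomial draw in the last step is the drift that accompanies reproduction.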
(3) Achievable target frequency strongly depending on the degree of demographic stochasticity.
I would expect that the experimentalists would find these results interesting and would want to consider these results during their artificial selection experiments. The main Figure 4 indicates that the Newborn size N0 is a very important factor to consider during the artificial selection experiment. This would be equivalent to how much bottleneck is imposed on the artificial selection process in every iteration step (i.e., the ratio of serial dilution experiment). However, with a low population size, all target frequencies can be achieved, and therefore in these regimes, the initial frequency now does not matter much. It would be great for the authors to provide what the N0 parameter actually means during the artificial selection experiments. Maybe relative to some other parameter in the model. I know this could be very hard. But without this, the main result of this paper (initial frequency matters) cannot be taken advantage of by the experimentalists.
We have added to the SI an analytical approximation for N̆<sub>0</sub>, the Newborn size below which all target frequencies can be achieved.
Also, we have added lines indicating N̆<sub>0</sub> in Fig. 4a.
(4) Consideration of environmental stochasticity.
The success (gold area of Figure 2d) in this framework mainly depends on the size of the demographic stochasticity (birth-only model) during the intra-collective selection. However, during experiments, a lot of environmental stochasticity appears to be occurring during artificial selection. This may be out of the scope of this study. But it would definitely be exciting to see how much environmental stochasticity relative to the demographic stochasticity (variation in the Gaussian distribution of F and S) matters in succeeding in achieving the target composition from artificial selection.
You are correct that our work considers only demographic stochasticity.
Indeed, considering other types of stochasticity will be an exciting future research direction. We added in the main text:
“Overall our model considers mutational stochasticity, as well as demographic stochasticity in terms of stochastic birth and stochastic sampling of a parent collective by offspring collectives. Other types of stochasticity, such as environmental stochasticity and measurement noise, are not considered and require future research.”
(5) Assumption about mutation rates
If setting the mutation rates to zero does not change the result of the simulations and the conclusion, what is the purpose of having the mutation rates \mu? Also, is the unidirectional (S -> F -> FF) mutation realistic? I didn't quite understand how the mutations could fit into the story of this paper.
This is a great point. We have added this to the beginning of Results to better motivate our study:
“We will start with a complete model where S mutates to F at a nonzero mutation rate µ. We made this choice because it is more challenging to attain or maintain the target frequency when the abundance of fast-growing F is further increased via mutations. This scenario is encountered in biotechnology: an engineered pathway will slow down growth, and breaking the pathway (and thus faster growth) is much easier than the other way around. When the mutation rate is set to zero, the same model can be used to capture collectives of two species with different growth rates.”
See answer on common question 1.
(6) Minor points
In Figure 3b, it is not clear to me how the frequency difference for the Intra-collective and the Inter-collective selection is computed.
We added a description in caption 3b.
In Figure 5b, the gold region (success) near the FF is not visible. Maybe increase the size of the figure or have an inset for zoom-in. Why is the region not as big as the bottom gold region?
We increased the resolution of Fig 5b so that the gold region near FF is more visible.
We have added Fig 5c and the following explanation to the main text:
“From numerical simulations, we identified two accessible regions: a small region near FF and a band region spanning from S to F (gold in Fig. 5b i). Intuitively, the rate at which FF grows faster than S+F is greater than the rate at which F grows faster than S (see section VIII in Supplementary Information). Thus, the problem can initially be reduced to a two-population problem (i.e. FF versus F+S; Fig. 5c left), and then expanded to a three-population problem (Fig. 5c right).”
Recommendations For The Authors
Since the conclusion of the model greatly depends on the noise (variation) of F and S in the Gaussian distribution, it would be nice to have a plot where the y-axis is the variation in terms of frequency and the x-axis is the s_0 or f_0 (frequency). In the plot, I would love to see how the variation in the frequency depends on the initial frequency of S and F. Maybe this is just trivial.
In the SI, we added Fig6a, as per your request. Previous Fig6 became Fig6b.
Reviewer #2 (Public review):
The authors provide an analytical framework to model the artificial selection of the composition of communities composed of strains growing at different rates. Their approach takes into account the competition between the targeted selection at the level of the meta-community and the selection that automatically favors fast-growing cells within each replicate community. Their main finding is a tipping point or path-dependence effect, whereby compositions dominated by slow-growing types can only be reached by community-level selection if the community does not start and never crosses into a range of compositions dominated by fast growers during the dynamics.
These results seem to us both technically correct and interesting. We commend the authors on their efforts to make their work reproducible even when it comes to calculations via extensive appendices, though perhaps a table of contents and a short description of these appendices at the start of SI would help navigate them.
Thank you for the suggestion. We have added a paragraph at the beginning of SI.
The main limitation in the current form of the article is that it could clarify how its assumptions and findings differ from and improve upon the rest of the literature:
- Many studies discuss the interplay between community-level evolution and species- or strain-level evolution. But "evolution" can be a mix of various forces, including selection, drift/randomness, and mutation/innovation.
- This work's specificity is that it focuses strictly on constant community-level selection versus constant strain-level selection, all other forces being negligible (neither stochasticity nor innovation/mutation matter at either level, as we try to clarify now).
Note that intra-collective selection is not strictly “constant” in the sense that selection favoring F is the strongest at intermediate F frequency (Fig 3). However, we think that you mean that intra- and inter-collective selection are present in every cycle, and this is correct for our case, and for community selection in general.
- Regarding constant community-level selection, it is only briefly noted that "once a target frequency is achieved, inter-collective selection is always required to maintain that frequency due to the fitness difference between the two types" [pg. 3 §2]. In other words, action from the selector is required indefinitely to maintain the community in the desired state. This assumption is found in a fraction of the literature, but is still worth clarifying from the start as it can inform the practical applicability of the results.
This is a good point. We have added to abstract:
“Such collective selection is dictated by two opposing forces: during collective maturation, intra-collective selection acts like a waterfall, relentlessly driving the S-frequency to lower values, while during collective reproduction, inter-collective selection resembles a rafter striving to reach the target frequency. Due to this model structure, maintaining a target frequency requires the continued action of inter-collective selection.”
- More importantly, strain-level evolution also boils down here to pure selection with a constant target, which is less usual in the relevant literature. Here, (1) drift from limited population sizes is very small, with no meaningful counterbalancing of selection, (2) pure exponential regime with constant fitness, no interactions, no density- or frequency-dependence, (3) there is no innovation in the sense that available types are unchanging through time (no evolution of traits such as growth rate or interactions) and (4) all the results presented seem unchanged when mutation rate mu = 0 (as noted in Appendix III), meaning that the conclusions are not "about" mutation in any meaningful way.
With regard to point (1), Figure 4a (reproduced below) shows how Newborn size affects the region of achievable targets. Indeed at large Newborn size (e.g. 5000 and above), no target frequency is achievable (since drift is too small to generate sufficient inter-community variation and consequently all communities are dominated by fast-growing F). However at Newborn size of for example 1000, there are two regions of accessible target frequencies. At smaller Newborn size, all target frequencies become achievable due to drift becoming sufficiently strong.
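A rough way to see why Newborn size matters is to compare the spread created by sampling with the deterministic rise in F frequency during maturation. The sketch below is a heuristic only, not the criterion used in the paper: it assumes zero mutation, ignores stochastic growth and the number of collectives, and uses hypothetical values for `omega`, `tau`, and `f`.

```python
import math

def maturation_shift(f, omega, tau):
    """Deterministic rise in F frequency over one maturation (no mutation)."""
    w = math.exp(omega * tau)  # fold-change in F/S during maturation
    return f * w / (1.0 - f + f * w) - f

def sampling_sd(f, n0):
    """Standard deviation of Newborn F frequency under binomial sampling."""
    return math.sqrt(f * (1.0 - f) / n0)

omega, tau, f = 0.03, 4.8, 0.3  # hypothetical values
for n0 in (100, 1000, 5000):
    ratio = sampling_sd(f, n0) / maturation_shift(f, omega, tau)
    print(f"N0={n0}: sampling SD / maturation shift = {ratio:.2f}")
```

The ratio shrinks as the square root of Newborn size grows, which matches the qualitative picture above: with large Newborns, drift generates too little variation among collectives for inter-collective selection to offset the growth advantage of F.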
With regard to points (2) and (3), we have added to Introduction
“To enable the derivation of an analytical expression, we have made the following simplifications.
First, growth is always exponential, without complications such as resource limitation, ecological interactions between the two populations, or density-dependent growth. Thus, the exponential growth equation can be used. Second, we consider only two populations (genotypes or species): the fast-growing F population with size F and the slow-growing S population with size S. We do not consider a spectrum of mutants or species, since with more than two populations, an analytical solution becomes very difficult.”
With regard to point (4), we view this as a strength rather than a weakness. We have added the following to the beginning of Results and Discussions:
“We will start with a complete model where S mutates to F at a nonzero mutation rate µ. We made this choice because it is more challenging to attain or maintain the target frequency when the abundance of fast-growing F is further increased via mutations.”
“When the mutation rate is set to zero, the same model can be used to capture collectives of two species with different growth rates.”
See Point 1 of Common comments.
- Furthermore, the choice of mutation mechanism is peculiar, as it happens only from slow to fast grower: more commonly, one assumes random non-directional mutations, rather than purely directional ones from less fit to fitter (which is more of a "Lamarckian" idea). Given that mutation does not seem to matter here, this choice might create unnecessary opposition from some readers or could be considered as just one possibility among others.
We have added the following justification:
“This scenario is encountered in biotechnology: an engineered pathway will slow down growth, and breaking the pathway (and thus faster growth) is much easier than the other way around.”
It would be helpful to have all these points stated clearly so that it becomes easy to see where this article stands in an abundant literature and contributes to our understanding of multi-level evolution, and why it may have different conclusions or focus than others tackling very similar questions.
Finally, a microbial context is given to the study, but the assumptions and results are in no way truly tied to that context, so it should be clear that this is just for flavor.
We have deleted “microbial” from the title, and revised our abstract:
Recommendations For The Authors
(1) More details concerning our main remark above:
- The paragraph discussing refs [24, 33] is not very clear in how they most importantly differ from this study. Our impression is that the resource aspect is not very important for instance, and the main difference is that these other works assume that strains can change in their traits.
We are fairly sure that resource depletion is important in Rainey group’s study, as the attractor only evolved after both strains grew fast enough to deplete resources by the end of maturation. Indeed, evolution occurred in interaction coefficients which dictate the competition between strains for resources.
Regardless, you raised an excellent point. As discussed earlier, we have added the following:
“To enable the derivation of an analytical expression, we have made the following simplifications.
First, growth is always exponential, without complications such as resource limitation, ecological interactions between the two populations, or density-dependent growth. Thus, the exponential growth equation can be used. Second, we consider only two populations (genotypes or species): the fast-growing F population with size F and the slow-growing S population with size S. We do not consider a spectrum of mutants or species, since with more than two populations, an analytical solution becomes very difficult.”
- We would advise the main text to focus on mu = 0, and only say in discussion that results can be generalized.
Your suggestion is certainly good. However, given the large amount of work involved in a reorganisation, we have decided to adhere to our current narrative. Nevertheless, as discussed earlier, we have added this at the beginning of Results to help orient readers:
“We will start with a complete model where S mutates to F at a nonzero mutation rate µ. We made this choice because it is more challenging to attain or maintain the target frequency when the abundance of fast-growing F is further increased via mutations.”
“When the mutation rate is set to zero, the same model can be used to capture collectives of two species with different growth rates.”
(2) We think the material on pg. 5 "Intra-collective evolution is the fastest at intermediate F frequencies, creating the "waterfall" phenomenon", although interesting, could be presented in a different way. The mathematical details on how to find the probability distribution of the maximum of independent random variables (including Equation 1) will probably be skipped by most of the readers (for experienced theoreticians, it is standard content; for experimentalists, it is not the most relevant), as such I would recommend displacing them to SM and report only the important results.
This is an excellent suggestion. We have put a sketch of our calculations in a box in the main text to help orient interested readers. As before, details are in SI.
Similarly, Equations 2, 3, and 4 are hard to read given the large amount of parameters and the low amount of simplification. Although exploring the effect of the different parameters through Figures 3 and 4 is useful, I think the role of the equations should be reconsidered:
i. Is it possible to rewrite them in terms of effective variables in a more concise way?
See Point 3 of Common comments.
ii. Is it possible to present extreme/particular cases in which they are easier to interpret?
We have focused on the case where the mutation rate is zero. This makes the mathematical expressions much simpler (see above).
(3) Is it possible to explain more in detail why the distribution of f_k+1 conditional to f_k^* is well approximated by a Gaussian? Also, have you explored to what extent the results would change if this were not true (in light of the few universal classes for the maximum of independent variables)?
Despite the appeal to the CLT and the histograms in the Appendix suggesting that the distribution looks a bit like a Gaussian at a certain scale, fluctuations on that scale are not necessarily what is relevant for the results - a rapid (and maybe wrong) attempt at a characteristic function calculation suggests that in your case, one does not obtain convergence to Gaussians unless we renormalize by S(t=0) and F(t=0), so it seems there is a justification missing in the text as is for the validity of this approximation (or that it is simply assumed).
See point 4 of Common comments.
Reviewer #3 (Public Reviews):
The authors address the process of community evolution under collective-level selection for a prescribed community composition. They mostly consider communities composed of two types that reproduce at different rates, and that can mutate one into the other. Due to such differences in 'fitness' and to the absence of density dependence, within-collective selection is expected to always favour the fastest grower, but the collective-level selection can oppose this tendency, to a certain extent at least. By approximating the stochastic within-generation dynamics and solving it analytically, the authors show that not only high frequencies of fast growers can be reproducibly achieved, aligned with their fitness advantage. Small target frequencies can also be maintained, provided that the initial proportion of fast growers is sufficiently small. In this regime, similar to the 'stochastic corrector' model, variation upon which selection acts is maintained by a combination of demographic stochasticity and of sampling at reproduction. These two regions of achievable target compositions are separated by a gap, encompassing intermediate frequencies that are only achievable when the bottleneck size is small enough or the number of communities is (disproportionately) larger.
A similar conclusion, that stochastic fluctuations can maintain the system over evolutionary time far from the prevalence of the faster-growing type, is then confirmed by analyzing a three-species community, suggesting that the qualitative conclusions of this study are generalizable to more complex communities.
I expect that these results will be of broad interest to the community of researchers who strive to improve community-level selection, but are often limited to numerical explorations, with prohibitive costs for a full characterization of the parameter space of such embedded populations. The realization that not all target collective functions can be as easily achieved and that they should be adapted to the initial conditions and the selection protocol is also a sobering message for designing concrete applications.
A major strength of this work is that the qualitative behaviour of the system is captured by an analytically solvable approximation so that the extent of the 'forbidden region' can be directly and generically related to the parameters of the selection protocol.
Thanks so much for these positive comments.
I however found the description of the results too succinct and I think that more could be done to unpack the mathematical results in a way that is understandable to a broader audience. Moreover, the phenomenon the authors characterize is of purely ecological nature. Here, mutations of the growth rate are, in my understanding, neither necessary (non-trivial equilibria can be maintained also when \mu =0) nor sufficient (community-level selection is necessary to keep the system far from the absorbing state) for the phenomenon described. Calling this dynamics community evolution reflects a widespread ambiguity, and is not ascribable just to this work. I find that here the authors have the opportunity to make their message clearer by focusing on the case where the 'mutation' rate \mu vanishes (Equations 39 & 40 of the SI) - which is more easily interpretable, at least in some limits - while they may leave the more general equations 3 & 4 in the SI.
See points 1-4 of Common comments.
Combined with an analysis of the deterministic equations, that capture the possibility of maintaining high frequencies of fast growers, the authors could elucidate the dynamics that are induced by the presence of a second level of selection, and speculate on what would be the result of real open-ended evolution (not encompassed by the simple 'switch mutations' generally considered in evolutionary game theory), for instance discussing the invasibility (or not) of mutant types with slightly different growth rates.
Indeed, evolution is not restricted to two types. However, our main goal here is to derive an analytical expression, and it was difficult for even two types. For three-type collectives, we had to resort to simulations. Investigating the case where fitness effects of mutations are continuously distributed is beyond the scope of this study.
The single most important model hypothesis that I would have liked to be discussed further is that the two types do not interact. Species interactions are not only essential to achieve inheritance of composition in the course of evolution but are generally expected to play a key role even on ecological time scales. I hope the authors plan to look at this in future work.
In our system, S and F do interact in a competitive fashion: even though S and F are not competing for nutrients (which are always in excess), they are competing for space. This is because a fixed number of cells are transferred to the next cycle. Thus, the presence of F will, for example, reduce the chance of S being propagated. We have added this clarification to our main text:
“Note that even though S and F do not compete for nutrients, they compete for space: because the total number of cells transferred to the next cycle is fixed, an overabundance of one population will reduce the likelihood of the other being propagated.”
Recommendations For The Authors
I felt the authors could put some additional effort into making their theoretical results meaningful for a population of readers who, though not as highly mathematically educated as they are, can nonetheless appreciate the implications of simple relations or scaling. Below, you find some suggestions:
(1) In order to make it clear that there is a 'natural' high-frequency equilibrium that can be reached even in the absence of selection, the authors could examine first the dynamics of the deterministic system in the absence of mutations, and use its equilibria to elucidate the combined role of the 'fitness' difference \omega and of the generation duration \tau in setting its value. The fact that these parameters always occur in combination (when there are no mutations) is a general and notable feature of the stochastic model as well. Moreover, this model would justify why you only focus on decreasing the frequency in the new generation.
Note that the ‘natural’ high-frequency equilibrium in the absence of collective selection is when fast grower F becomes fixed in the population. Following your suggestion, we have introduced two parameters 𝑅τ and 𝑊τ to reflect the coupling between ‘fitness’ and ‘generation duration’:
(2) Since the phenomenon described in the paper is essentially ecological in nature (as the author states, it does not change significantly if the 'mutation rate' \mu is set to zero), I would put in the main text Equations 39 & 40 of the SI in order to improve intelligibility.
See Point 2 at the beginning of this letter.
These equations can be discussed in some detail, especially in the limit of small f^*_k, where I think it is worth discussing the different dependence of the mean and the variance of the frequency distribution on the system's parameters.
This is a great suggestion. We have added the following:
“In the limit of small f, Equation (3) becomes […] while Equation (4) becomes […]. Thus, both Newborn size (N<sub>0</sub>) and fold-change in F/S during maturation (W<sub>τ</sub>) are important determinants of selection progress.”
(3) I would have appreciated an explanation in words of what are the main conceptual steps involved in attaining Equation 2, the underlying hypotheses (notably on community size and distributions), and the expected limits of validity of the approximation.
See points 3 and 4 at the beginning of this letter.
(4) I think that some care needs to be put into explaining where extreme value statistics is used, and why is the median of the conditional distribution the most appropriate statistics to look at for characterizing the evolutionary trajectory (which seems to me mostly reliant on extreme values).
Great point! We added an explanation of using the median value in Box 1, and also added Figure 7 to the SI to explain it.
Showing in a figure the different distributions you are considering (for instance, plotting the conditional distribution for one generation in the trajectories displayed in Figure 2) would be useful to understand what information \bar f provides on a sequence of collective generations, where in principle there may be memory effects.
Thanks for this suggestion. We have added to Fig 2d panel to illustrate the shape and position of F frequency distributions in each step in the first two selection cycles.
(5) Similarly, I do not understand why selecting the 5% best communities should push the system's evolution towards the high-frequency solution, instead of just slowing down the improvement (unless you are considering the average composition of the top best communities - which should be justified). I think that such sensitivity to the selection intensity should be appropriately referenced and discussed in the main text, as it is a parameter that experimenters are naturally led to manipulate.
In the main text, we have added this explanation:
“In contrast with findings from an earlier study [23], choosing top 1 is more effective than the less stringent “choosing top 5%”. In the earlier study, variation in the collective trait is partly due to nonheritable factors such as random fluctuations in Newborn biomass. In that context, a less stringent selection criterion proved more effective, as it helped retain collectives with favorable genotypes that might have exhibited suboptimal collective traits due to unfavorable nonheritable factors. However, since this study excludes nonheritable variations in collective traits, selecting the top 1 collective is more effective than selecting the top 5% (see Fig. 11 in Supplementary Information).”
(6) Equation 1 could be explained in simpler terms as the product between the probability that one collective reaches the transmitted value times the probability that all others do worse than that. The current formulation is unclear, perhaps just a matter of English formulation.
We have revised our description to state:
“Equation (1) can be described as the product between two terms related to probability: (i) […] describes the probability density that any one of the g Adult collectives achieves f given […], and (ii) […] describes the probability that all other g – 1 collectives achieve frequencies above f and are thus not selected.”
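The two-term structure described above can be checked numerically. The sketch below assumes, for illustration only, that Adult F frequencies are independent Gaussian draws and that the target lies below all of them, so that the selected collective is the one with the lowest F frequency. The mean, standard deviation, number of collectives, and function names are hypothetical.

```python
import math
import numpy as np

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def selected_density(x, mu, sigma, g):
    """Density of the lowest of g i.i.d. Gaussian frequencies.

    Term (i): g * pdf, the density that any one collective achieves x.
    Term (ii): (1 - cdf)^(g - 1), the probability that the others lie above x.
    """
    return g * normal_pdf(x, mu, sigma) * (1.0 - normal_cdf(x, mu, sigma)) ** (g - 1)

mu, sigma, g = 0.5, 0.02, 10  # hypothetical values
xs = np.linspace(mu - 8 * sigma, mu + 8 * sigma, 4001)
dx = xs[1] - xs[0]
dens = np.array([selected_density(x, mu, sigma, g) for x in xs])

total = float(np.sum(dens) * dx)             # should be close to 1
mean_formula = float(np.sum(xs * dens) * dx)

rng = np.random.default_rng(0)
mean_sim = float(rng.normal(mu, sigma, size=(100_000, g)).min(axis=1).mean())
print(total, mean_formula, mean_sim)
```

If the product form is right, the density integrates to one and its mean agrees with the simulated mean of the lowest of g draws.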
(7) I think that the discussion of the dependence of the boundaries of the 'waterfall' region with the difference in growth rate \omega is important and missing, especially if one wants to consider open-ended evolution of the growth rate - which can occur at steps of different magnitude.
We added a new chapter and figure in supplementary information on the threshold values when \omega varies. As expected, smaller \omega enlarges the success area.
We have also added a new figure panel to show how maturation time affects selection efficacy.
(8) Notations are a bit confusing and could be improved. First of all, in most equations in the main text and SI, what is initially introduced as \omega appears as s. This is confusing because the letter s is also used for the frequency of the slow type.
The letter S is used to denote an attribute of cells (S cells), the type of cells (Equations 1-3 of the SI) and the number of these cells in the population, sometimes with different meanings in the same sentence. This is confusing, and I suggest referring to slow cells or fast cells instead (or at least to S-cells and F-cells), and keeping S and F as variables for the number of cells of the two types.
All typos related to the notation have been fixed. We use S and F as types, and S and F (italic) as population numbers.
(9) On page 3, when introducing the sampling of newborns as ruled by a binomial distribution, the information that you are just transmitting one collective is needed, while it is conveyed later.
We have added this emphasis:
“At the end of a cycle, a single Adult with the highest function (with F frequency f closest to the target frequency) is chosen to reproduce g Newborn collectives each with N<sub>0</sub> cells (‘Selection’ and ‘Reproduction’ in Fig. 1).”
(10) I found that the abstract talks too early about the 'waterfall' phenomenon. As this is a concept introduced here, I suggest the authors first explain what it is, then use the term. It is a useful metaphor, but it should not obscure the more formal achievements of the paper.
We feel that the “waterfall” analogy offers a gentle helping hand to orient those who have not thought much about the phenomenon. We view the abstract as an opportunity to attract readership, and thus the more accessible the better.
(11) In the SI there are numerous typos and English language issues. I suggest the authors read carefully through it, and add line numbers to the next version so that more detailed feedback is possible.
Thank you for going through the SI. We have reviewed it carefully and fixed the problems we found.
-
- Feb 2025
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public Review):
Summary:
This paper presents a compelling and comprehensive study of decision-making under uncertainty. It addresses a fundamental distinction between belief-based (cognitive neuroscience) formulations of choice behavior and reward-based (behavioral psychology) accounts. Specifically, it asks whether active inference provides a better account of planning and decision making, relative to reinforcement learning. To do this, the authors use a simple but elegant paradigm that includes choices about whether to seek both information and rewards. They then assess the evidence for active inference and reinforcement learning models of choice behavior, respectively. After demonstrating that active inference provides a better explanation of behavioral responses, the neuronal correlates of epistemic and instrumental value (under an optimized active inference model) are characterized using EEG. Significant neuronal correlates of both kinds of value were found in sensor and source space. The source space correlates are then discussed sensibly, in relation to the existing literature on the functional anatomy of perceptual and instrumental decision-making under uncertainty.
We are deeply grateful for your careful review of our work and your suggestions. Your insights have helped us identify areas where we can strengthen the arguments and clarify the methodology. We hope to apply the idea of active inference to our future work, emphasizing the integrity of perception and action.
Reviewer #1 (Recommendations For The Authors):
Many thanks for attending to my previous suggestions. I think your presentation is now much clearer and nicely aligned with the active inference literature.
There is one outstanding issue. I think you have overinterpreted the two components of epistemic value in Equation 8. The two components that you have called the value of reducing risk and the value of reducing ambiguity are not consistent with the normal interpretation. These two components are KL divergences that measure the expected information gain about parameters and states respectively.
If you read the Schwartenbeck et al paper carefully, you will see that the first (expected information gain about parameters) is usually called novelty, while the second (expected information gain about states) is usually called salience.
This means you can replace "the value of reducing ambiguity" with "novelty" and "the value of reducing risk" with "salience".
For your interest, "risk" and "ambiguity" are alternative ways of decomposing expected free energy. In other words, you can decompose expected free energy into (negative) expected information gain and expected value (as you have done). Alternatively, you can rearrange the terms and express expected free energy as risk and ambiguity. Look at the top panel of Figure 4 in:
https://www.sciencedirect.com/science/article/pii/S0022249620300857
I hope that this helps.
We sincerely thank you for your recommendations on the interpretation of the epistemic value in Equation 8. We have now corrected the terms to Novelty and Salience.
In addition, in order to avoid terminology conflicts with active inference and to describe these two different uncertainties, we replaced Ambiguity in the article with Novelty, referring to the uncertainty that can be reduced by sampling, and replaced Risk with Variability, referring to the uncertainty inherent in the environment (variance).
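For reference, the two decompositions of expected free energy that Reviewer #1 describes can be sketched in standard active inference notation. The symbols here are assumptions and are not taken from the manuscript's Equation 8: Q(s|π) denotes predicted states under policy π, P(o|s) the likelihood, and P(o) the prior preferences over outcomes. The parameter-related (novelty) term is omitted for brevity.

```latex
\begin{aligned}
G(\pi) &= \mathbb{E}_{Q(o,s\mid\pi)}\big[\ln Q(s\mid\pi) - \ln Q(s\mid o,\pi) - \ln P(o)\big] \\
&= -\underbrace{\mathbb{E}_{Q(o\mid\pi)}\big[D_{\mathrm{KL}}[\,Q(s\mid o,\pi)\,\|\,Q(s\mid\pi)\,]\big]}_{\text{expected information gain about states (salience)}}
   \;-\; \underbrace{\mathbb{E}_{Q(o\mid\pi)}\big[\ln P(o)\big]}_{\text{expected value}} \\
&= \underbrace{D_{\mathrm{KL}}[\,Q(o\mid\pi)\,\|\,P(o)\,]}_{\text{risk}}
   \;+\; \underbrace{\mathbb{E}_{Q(s\mid\pi)}\big[\mathrm{H}[P(o\mid s)]\big]}_{\text{ambiguity}}
\end{aligned}
```

The second and third lines are rearrangements of the same quantity; the third follows from Bayes' rule under the assumption that Q(o,s|π) = P(o|s) Q(s|π). In this sense "risk" and "ambiguity" are an alternative grouping of terms rather than names for the two information-gain components.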
Reviewer #2 (Public Review):
Summary:
Zhang and colleagues use a combination of behavioral, neural, and computational analyses to test an active inference model of exploration in a novel reinforcement learning task.
Strengths:
The paper addresses an important question (validation of active inference models of exploration). The combination of behavior, neuroimaging, and modeling is potentially powerful for answering this question.
I appreciate the addition of details about model fitting, comparison, and recovery, as well as the change in some of the methods.
We are deeply grateful for your careful review of our work and your suggestions. We also apologize that our previous responses did not adequately address several of your suggestions in the manuscript. We hope to respond to them properly in this revision. Thank you for your contribution to ensuring the scientific rigor and reproducibility of the work.
The authors do not cite what is probably the most relevant contextual bandit study, by Collins & Frank (2018, PNAS), which uses EEG.
The authors cite Collins & Molinaro as a form of contextual bandit, but that's not the case (what they call "context" is just the choice set). They should look at the earlier work from Collins, starting with Collins & Frank (2012, EJN).
We deeply thank you for your comments. We have now added the relevant citations to the manuscript (line 46):
“These studies utilized different forms of multi-armed bandit tasks, e.g., the restless multi-armed bandit tasks (Daw et al., 2006; Guha et al., 2010), risky/safe bandit tasks (Tomov et al., 2020; Fan et al., 2022; Payzan et al., 2013), contextual multi-armed bandit tasks (Collins & Frank, 2018; Schulz et al., 2015; Collins & Frank, 2012)”
Daw, N. D., O'Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876-879.
Guha, S., Munagala, K., & Shi, P. (2010). Approximation algorithms for restless bandit problems. Journal of the ACM (JACM), 58(1), 1-50.
Tomov, M. S., Truong, V. Q., Hundia, R. A., & Gershman, S. J. (2020). Dissociable neural correlates of uncertainty underlie different exploration strategies. Nature Communications, 11(1), 2371.
Fan, H., Gershman, S. J., & Phelps, E. A. (2023). Trait somatic anxiety is associated with reduced directed exploration and underestimation of uncertainty. Nature Human Behaviour, 7(1), 102-113.
Payzan-LeNestour, E., Dunne, S., Bossaerts, P., & O’Doherty, J. P. (2013). The neural representation of unexpected uncertainty during value-based decision making. Neuron, 79(1), 191-201.
Collins, A. G., & Frank, M. J. (2018). Within- and across-trial dynamics of human EEG reveal cooperative interplay between reinforcement learning and working memory. Proceedings of the National Academy of Sciences, 115(10), 2502-2507.
Schulz, E., Konstantinidis, E., & Speekenbrink, M. (2015, April). Exploration-exploitation in a contextual multi-armed bandit task. In International conference on cognitive modeling (pp. 118-123).
Collins, A. G., & Frank, M. J. (2012). How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. European Journal of Neuroscience, 35(7), 1024-1035.
Placing statistical information in a GitHub repository is not appropriate. This needs to be in the main text of the paper. I don't understand why the authors refer to space limitations; there are none for eLife, as far as I'm aware.
We deeply thank you for your comments. We calculated the average t-value of the brain regions with significant results over the significant time window, and added the t-value results to the main text and supplementary materials.
In answer to my question about multiple comparisons, the authors have added the following: "Note that we did not attempt to correct for multiple comparisons; largely, because the correlations observed were sustained over considerable time periods, which would be almost impossible under the null hypothesis of no correlations." I'm sorry, but this does not make sense. Either the authors are doing multiple comparisons, in which case multiple comparison correction is relevant, or they are doing a single test on the extended timeseries, in which case they need to report that. There exist tools for this kind of analysis (e.g., Gershman et al., 2014, NeuroImage). I'm not suggesting that the authors should necessarily do this, only that their statistical approach should be coherent. As a reference point, the authors might look at the aforementioned Collins & Frank (2018) study.
We deeply thank you for your comments. We have now replaced all our results with the results after false discovery rate correction and added relevant descriptions (lines 357-358):
“The significant results after false discovery rate (FDR) (Benjamini & Hochberg, 1995; Gershman et al., 2014) correction are shown in shaded regions. Additional regression results can be found in Supplementary Materials.”
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289-300.
Gershman, S. J., Blei, D. M., Norman, K. A., & Sederberg, P. B. (2014). Decomposing spatiotemporal brain patterns into topographic latent sources. NeuroImage, 98, 91-102.
After FDR correction, our results have changed slightly. We have updated our Results and Discussion section.
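As an illustration of the kind of correction referred to here, the Benjamini-Hochberg step-up procedure can be sketched in a few lines. This is a generic textbook sketch, not the authors' analysis pipeline; the function name, the level `q`, and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null is rejected at FDR level q.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * q, and reject every hypothesis with rank <= k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected

# Hypothetical p-values, e.g. one per time point or source region.
example = benjamini_hochberg([0.02, 0.024, 0.9, 0.95], q=0.05)
```

Note the step-up behavior in the example: the smallest p-value (0.02) exceeds its own threshold (0.0125), but it is still rejected because the second-ranked p-value (0.024) falls below its threshold (0.025).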
It should be acknowledged that the changes in these results may reflect a degree of error in our data (perhaps because the EEG data are too noisy or because of the average template we used, ‘fsaverage’). We therefore added a relevant passage to the Discussion section (lines 527-529):
“It should be acknowledged that our EEG-based regression results are somewhat unstable, and the brain regions with significant regression are inconsistent before and after FDR correction. In future work, we should collect more precise neural data to reduce this instability.”
I asked the authors to show more descriptive comparison between the model and the data. Their response was that this is not possible, which I find odd given that they are able to use the model to define a probability distribution on choices. All I'm asking about here is to show predictive checks which build confidence in the model fit. The additional simulations do not address this. The authors refer to figures 3 and 4, but these do not show any direct comparison between human data and the model beyond model comparison metrics.
We deeply thank you for your comments. We now compare the participants’ behavioral data and the model’s predictions trial by trial (Figure 5). The figure shows the participants’ behavioral strategies in different states and trials, together with the model’s prediction accuracy. We have added the discussion related to Figure 5 (lines 309-318):
“Figure 5 shows the comparison between the active inference model and the behavioral data, where we can see that the model can fit the participants’ behavioral strategies well. In the “Stay-Cue” choice, participants always tend to choose to ask the ranger and rarely choose not to ask. When the context was unknown, participants chose the “Safe” option or the “Risky” option very randomly, and they did not show any aversion to variability. When given “Context 1”, where the “Risky” option gave participants a high average reward, participants almost exclusively chose the “Risky” option, which provided more information in the early trials and was found to provide more rewards in the later rounds. When given “Context 2”, where the “Risky” option gave participants a low average reward, participants initially chose the “Risky” option and then tended to choose the “Safe” option. We can see that participants still occasionally chose the “Risky” option in the later trials of the experiment, which the model does not capture. This may be due to the influence of forgetting. Participants chose the “Risky” option again to establish an estimate of the reward distribution.”
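A trial-by-trial comparison of this kind can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: choices are coded as 0/1 (for example, 1 = “Risky”), `model_probs` holds the fitted model's probability of choice 1 for each participant and trial, and all names and numbers are hypothetical.

```python
def predictive_check(choices, model_probs):
    """Compare observed choices with model predictions, trial by trial.

    choices[p][t]     : observed choice (0 or 1) of participant p on trial t
    model_probs[p][t] : model probability that participant p picks 1 on trial t
    Returns per-trial observed choice rates, per-trial mean predicted
    probabilities, and the overall accuracy of the model's most likely choice.
    """
    n_participants = len(choices)
    n_trials = len(choices[0])
    observed, predicted = [], []
    for t in range(n_trials):
        observed.append(sum(choices[p][t] for p in range(n_participants)) / n_participants)
        predicted.append(sum(model_probs[p][t] for p in range(n_participants)) / n_participants)
    hits = 0
    for p in range(n_participants):
        for t in range(n_trials):
            # The model "predicts" choice 1 when its probability is at least 0.5.
            hits += int((model_probs[p][t] >= 0.5) == (choices[p][t] == 1))
    accuracy = hits / (n_participants * n_trials)
    return observed, predicted, accuracy

# Hypothetical data: two participants, three trials.
choices = [[1, 1, 0], [1, 0, 0]]
model_probs = [[0.9, 0.6, 0.2], [0.8, 0.7, 0.1]]
observed, predicted, accuracy = predictive_check(choices, model_probs)
```

Plotting `observed` against `predicted` per trial and per condition gives the descriptive check the reviewer asks for, and systematic gaps (such as late “Risky” choices) show where the model misses the behavior.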
Reviewer #2 (Recommendations For The Authors):
In the supplement, there are missing references ("[?]").
Thank you very much for pointing this out. We have now fixed this error.
Reviewer #3 (Public review):
Summary:
This paper aims to investigate how the human brain represents different forms of value and uncertainty that participate in active inference within a free-energy framework, in a two-stage decision task involving contextual information sampling, and choices between safe and risky rewards, which promotes shifting between exploration and exploitation. They examine neural correlates by recording EEG and comparing activity in the first vs second half of trials and between trials in which subjects did and did not sample contextual information, and perform a regression with free-energy-related regressors against data "mapped to source space."
Strengths:
This two-stage paradigm is cleverly designed to incorporate several important processes of learning, exploration/exploitation and information sampling that pertain to active inference. Although scalp/brain regions showing sensitivity to the active-inference related quantities do not necessarily suggest what role they play, they are illuminating and useful as candidate regions for further investigation. The aims are ambitious, and the methodologies impressive. The paper lays out an extensive introduction to the free energy principle and active inference to make the findings accessible to a broad readership.
Weaknesses:
In its revised form the paper is complete in providing the important details. Though not a serious weakness, it is important to note that the high lower-cutoff of 1 Hz in the bandpass filter, included to reduce the impact of EEG noise, would remove from the EEG any sustained, iteratively updated representation that evolves with learning across trials, or choice-related processes that unfold slowly over the course of the 2-second task windows.
We are deeply grateful for your careful review of our work and your suggestions, and thank you for pointing this out. We acknowledge the shortcoming of the high lower cutoff of 1 Hz in the bandpass filter. We are sorry that we did not change the filter frequency in this revision, as doing so would require substantial reanalysis. We will carefully consider the filter frequency when preprocessing data in future work.
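To give a rough sense of the reviewer's point, the attenuation of slow components by a 1 Hz high-pass cutoff can be estimated with an idealized first-order filter. This is only a back-of-the-envelope sketch: the first-order response is an assumption (the filter actually used in the study is not specified here and is likely steeper), and the function name and example frequencies are hypothetical.

```python
import math

def highpass_gain(freq_hz, cutoff_hz):
    """Amplitude gain of an ideal first-order high-pass filter.

    |H(f)| = (f / fc) / sqrt(1 + (f / fc)^2), which is about 0.707 at the cutoff.
    """
    ratio = freq_hz / cutoff_hz
    return ratio / math.sqrt(1.0 + ratio ** 2)

# A process completing one cycle over a 2-second task window is about 0.5 Hz.
gain_within_trial = highpass_gain(0.5, 1.0)
# A representation drifting slowly across trials, e.g. around 0.01 Hz.
gain_across_trials = highpass_gain(0.01, 1.0)
```

Even under this gentle first-order assumption, a 0.5 Hz component is reduced to less than half its amplitude and a 0.01 Hz drift to about 1%, which is why slowly evolving signals would be largely absent from the filtered EEG.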
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
[…] Weaknesses:
Unfortunately, the revised manuscript does not show significant improvement. While the identification of the receptors is highly convincing, important issues about the biological relevance remain unaddressed. First, the main point I raised about the first version of this article is that the redundancy and/or specificity of the two receptors should be clarified, even though I understand that it cannot be deeply investigated here. I believe that this point, shared by all reviewers, is highly relevant for the scope of this work. In this revised version, it is still unclear how to reconcile gain and loss-of-function experiments and the different expression profiles of the receptors. Second, the newly added explanations and pieces of discussion provided about the mild in vivo phenotypes of early pupation upon Cad96ca or Fgfr1 knock-out do not clarify the issue but instead put emphasis on methodological issues. Indeed, it is not clear whether the mild phenotypes reflect the biological role of Cad96ca and Fgfr1, or the redundancy of these two RTKs (and/or others), or some issue with the knock-out strategy (partial efficiency, mosaicism...). Finally, parts of the updated discussion and the modifications to the figures are confusing.
Thank you for raising these questions. We performed additional experiments using CRISPR/Cas9, including editing Met1 individually (single knockout), Cad96ca and Fgfr1 together (double knockout), and Met1, Cad96ca and Fgfr1 together (triple knockout). The results showed that single mutation of either Cad96ca or Fgfr1 caused precocious pupation. The double mutation of Cad96ca and Fgfr1 caused earlier pupation and death compared with either single mutation. The triple mutation of Met1, Cad96ca and Fgfr1 had the most serious effect on pupation time and death. These data suggest that CAD96CA and FGFR1 can each transmit the JH signal to prevent pupation, both independently and cooperatively, and that JH exerts its complete regulatory role through both the cell membrane receptors and the intracellular receptor. We added the results in Lines 242-263 and the discussion in Lines 328-375.
CAD96CA and FGFR1 have similar functions in JH signaling, including transmitting the JH signal for Kr-h1 expression, maintaining larval status, inducing a rapid increase in intracellular calcium, phosphorylating the transcription factors MET1 and TAI, and binding JH III with high affinity. CAD96CA and FGFR1 are essential in the JH signal pathway, and the loss of function of each is sufficient to trigger strong effects on pupation, suggesting they can transmit the JH signal individually. The difference is that CAD96CA expression has no tissue specificity, whereas the Fgfr1 gene is highly expressed in the midgut. One possibility is that CAD96CA and FGFR1 act by forming homodimers or heterodimers with each other or with other RTKs in tissues, which needs to be addressed in future studies. CAD96CA and FGFR1 transmit JH III signals in three different insect cell lines, suggesting their conserved roles in other insects.
The mild phenotypes shown in the previous picture, Fig 4E, were counted from all the surviving individuals injected with gRNA, including mutated and non-mutated individuals. In fact, none of the mutants pupated on time. Following the first-round reviewers' comments, we realized that it was inappropriate to count all the surviving individuals injected with gRNA, so in the second version we replaced the picture with one that counts the phenotypes of only the successfully mutated individuals, to avoid confusion about the phenotypes.
Reviewer #2 (Public review):
[…] Weaknesses:
Results of their in vivo experiments, particularly those of their loss-of-function analyses using CRISPR mutants are still preliminary, and the results rather indicate that these membrane receptors do not have any physiologically significant roles in vivo. More specifically, previous studies in lepidopteran species have clearly and repeatedly shown that precocious metamorphosis is the hallmark phenotype for all JH signaling-deficient larvae. In contrast, the present study showed that Cad96ca and Fgfr1 G0 mutants only showed slight acceleration in their pupation timing, which is not a typical phenotype one would expect from JH signaling deficiency. This is inconsistent with their working model provided in Figure 6, which indicates that these cell membrane JH receptors promote the canonical JH signaling by phosphorylating Met/Tai. If the authors argue that this slight acceleration of pupation is indeed a major JH signaling-deficient phenotype in Helicoverpa, they need to provide more data to support their claim by analyzing CRISPR mutants of other genes involved in JH signaling, such as Jhamt and Met. An alternative explanation is that there is functional redundancy between CAD96CA and FGFR1 in mediating phosphorylation of Met/Tai. This possibility can be tested by analyzing double knockouts of these two receptors. Currently, the validity of their calcium imaging analysis in Figure 5 is also questionable. When performing calcium imaging in cultured cells, it is critically important to treat all the cells at the end of each experiment with a hormone or other chemical reagents that universally induce calcium increase in each particular cell line. Without such positive control, the validity of calcium imaging data remains unknown, and readers cannot properly evaluate their results.
Thank you for the comments. We took your suggestions and performed additional experiments, editing Met1 individually (single knockout), Cad96ca and Fgfr1 together (double knockout), and Met1, Cad96ca and Fgfr1 together (triple knockout) using CRISPR/Cas9. We added the results in Lines 242-263 and the discussion in Lines 328-375.
Regarding the calcium imaging in cultured cells (now Fig 6), our goal is to examine the roles of CAD96CA and FGFR1 in JH-triggered cellular responses. The experiment was well designed and controlled, and the results were validated. For example, JH III induced intracellular Ca<sup>2+</sup> release and extracellular Ca<sup>2+</sup> influx in Sf9 and S2 cells, whereas DMSO did not. Moreover, knockdown of Cad96ca and Fgfr1 significantly decreased JH III-induced intracellular Ca<sup>2+</sup> release and extracellular Ca<sup>2+</sup> influx (Figure 6A, B), and Kr-h1 expression (Figure 6—figure supplement 1A and B), suggesting that CAD96CA and FGFR1 had a general function to transmit the JH signal in S. frugiperda and D. melanogaster.
Wild-type mammalian HEK-293T cells showed no significant change in calcium ion levels under JH III induction, because mammalian cells have no CAD96CA or FGFR1 (Figure 6C). However, when insect CAD96CA or FGFR1 was overexpressed in HEK-293T cells, JH III triggered rapid cytosolic Ca<sup>2+</sup> release and influx (Figure 6D).
An increase in Ca<sup>2+</sup> was not detected in the CAD96CA-M3 and CAD96CA-M4 mutants under JH III induction (Figure 6E), nor in FGFR1-M4 (Figure 6F). These results confirmed that CAD96CA and FGFR1 play roles in transmitting the JH III signal.
We carefully revised the description of the results and methods to help readers understand the study.
Reviewer #3 (Public review):
[…] Weaknesses:
The authors have provided evidence that the Cad96Ca and FGFR1 RTK receptors contribute to JH signaling through CRISPR/Cas9, inducing precocious metamorphosis, although not to the same extent as absence of JH. Therefore, it still remains unclear whether these RTKs are completely required for pathway activation or only necessary for high activation levels during the last larval stage. While the authors have included some additional data, the mechanism by which different RTKs function in transducing JH signaling in a tissue specific manner is still unclear. As the authors note in the discussion, it is possible that other RTKs may also play a role in facilitating the transduction of JH signaling. Lastly, the study does not yet explain how RTKs with known ligands could also bind JH and contribute to JH signaling activation. Although receptor promiscuity has been suggested as a possible mechanism, future studies could explore whether activation of RTK pathways by their known ligands induces certain levels of JH transducer phosphorylation, which, in the presence of JH, could contribute to full pathway activation without the need for direct JH-RTK binding.
Thank you for your comments. To address your questions, we carried out additional experiments. The relevant results have been incorporated into Lines 242-263, and the corresponding discussion has been added to Lines 328-375.
We agree that future studies should resolve questions such as how different RTKs function in transducing JH signaling in a tissue-specific manner; whether other RTKs can transduce the JH signal; how RTKs with known ligands could also bind JH and contribute to JH signaling activation; and how the RTK pathways are activated by their ligands.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) First, some of the new paragraphs, repeatedly used in the point-by-point answer to the reviewers, are highly confusing and need proofreading (i.e. 225-230; 320-340)
Thank you for your advice. We have carefully revised the manuscript and the point-by-point answer to avoid repetition.
(2) While the double knock-down or knock-out of Cad96ca and Fgfr1 is expected to provide valuable information regarding their respective functions, the authors indicated that they wouldn't provide experiments in that direction. It is not clear to me if they have tried or not. The Crispr/Cas9 approach might be difficult to put in place to test this interaction. However, couldn't the authors try the double knock-down compared to single knock-downs using dsRNA? This method gave convincing results to test the role of the putative receptors in mediating JH-induced developmental delay in vivo (Figure 1).
Thank you for your suggestion. We added experiments using CRISPR/Cas9, editing Met1 individually (single knockout), Cad96ca and Fgfr1 together (double knockout), and Met1, Cad96ca and Fgfr1 together (triple knockout). The new evidence defines the physiological roles of these receptors in JH signaling in vivo. We added the results in Lines 242-263 and the discussion in Lines 328-375.
(3) Concerning the effect of Crispr knock-out on pupation timing, this paragraph was added: "The low death rate after Cad96ca and Fgfr1 knockout might be because of following reasons, including the editing efficiency (67% and 61% for Cad96ca mutant and Fgfr1 mutant, respectively), the chimera of the gene knockout at the G0 generation, and the redundant RTKs that play similar roles in JH signaling". A similar explanation applies to the pupation phenotype itself... I am therefore wondering whether the Crispr/Cas9 approach (at the G0 generation) is the best strategy. Since the dsRNA knock-down gave efficient (and probably more reproducible) results according to Figure 1B-C, why not using the same approach for analyzing loss-of-function phenotypes?
(4) Similarly, this new paragraph regarding the knock-out strategy by Crispr is problematic: "However, in the Cad96ca mutant, 86% of the larvae (an editing efficiency of 67% by TA clone analysis) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 24 h earlier. In the Fgfr1 mutant, 91% of the larvae (an editing efficiency of 61%) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 23 h earlier" (lines 225-230).
- How does the editing efficiency relate to the mutation efficiency few lines earlier (not clearly explained in the methods)? Were the animals homozygous or heterozygous for the mutations? - A shortened feeding stage can only be invoked if previous developmental transitions are unaffected. Such statement should be supported by a better description of the developmental timing phenotype (as suggested already by reviewer 2).
Thank you for your questions in (3) and (4). The editing rates of 67% and 61% for Cad96ca and Fgfr1 in individuals were calculated from the PCR products, indicating that the mutants produced by CRISPR/Cas9 editing are mosaics. We moved this content to the Methods section and added detailed methods (Lines 705-717).
We added the following to the discussion (Lines 367-380): "The phenotypes of gene mutation in H. armigera are somehow different from those obtained by homozygous mutation in other animals, due to the mosaic mutation by CRISPR/Cas9. In addition, RNAi of Cad96ca and Fgfr1 was observed precocious pupation as was the case in CRISPR/Cas9, suggesting the RNAi can be used for the study of gene function in insect, especially when the gene editing is embryonic lethal".
We removed the improper description of the phenotypes in the results, such as that of the feeding stage. We also added experiments editing Met1 individually (single knockout), Cad96ca and Fgfr1 together (double knockout), and Met1, Cad96ca and Fgfr1 together (triple knockout) to define the physiological roles of these receptors in JH signaling in vivo.
(5) Importantly, I don't understand where the new version of the figure 4E stems from. The « pupation on time » (blue) category present in the first version of the figure has now disappeared for mutant animals. Why? In the first, my understanding was that, among the mutant animals, around 50% had precocious pupation. In the new version of the figure 4E, the "pupation on time" category is missing, and the percentages of early pupation are therefore strongly increased... The explanations provided in the text are not clear regarding the reanalysis of the mutant phenotypes. In the first version of the manuscript, the following explanation was given: "In 61 survivors of Cas9 protein and Cad96ca-gRNA injection, 30 mutants were identified by the earlier pupation and sequencing (an editing efficiency of 49.2%)". Were all animals sequenced, or only the 30 displaying earlier pupation? Were the 31 others not sequenced or did they have no mutation? Could it be, as suggested by the first version of the figure, that some mutant animals did not display early pupation? It was indeed stated in the text that: "CRISPR/Cas9 editing by Cad96ca-gRNA or Fgfr1-gRNA injection resulted in earlier pupation (Figure 4D) for about 23-24 h by comparison with normal pupation in 46% and 54% of larvae, respectively". This new version of the figure should be explained.
Thank you for your reminder. The “pupation on time” phenotype appeared in the first version because we counted the phenotypes of all the surviving individuals injected with gRNA, that is, the survivors in Figure 4C, which include both mutated and non-mutated individuals. Following the comments from the first round of review, we realized that it was inappropriate to count all the surviving individuals injected with gRNA, since none of the mutants pupated on time. Therefore, in the second version, we replaced the picture with one that counts the phenotypes of only the successfully mutated individuals, namely the mutants in Figure 4C.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
Juvenile Hormone (JH) plays a key role in insect development and physiology. Although the intracellular receptor for JH was identified long ago, a number of studies have shown that part of JH functions should be fulfilled through binding to an unknown membrane receptor, which was proposed to belong to the RTK family. In this study, the authors screened all RTKs from the H. armigera genome for their ability to mediate responses to JH III treatment both in cultured cells and in developing animals. They also present convincing evidence that CAD96CA and FGFR1 directly bind JH III, and that their role might be conserved in other insect species.
Strengths:
Altogether, the experimental approach is very complete and elegant, providing evidence for the role of CAD96CA and FGFR1 in JH signalling using different techniques and in different contexts. I believe that this work will open new perspectives to study the role of JH and better understand what is the contribution of signalling through membrane receptors for JH-dependent developmental processes.
Weaknesses:
I don't see major weaknesses in this study. However, I think that the manuscript would benefit from further information or discussion regarding the relationship between the two newly identified receptors. Experiments (especially in HEK-293T cells) suggest that CAD96CA and FGFR1 are sufficient on their own to transduce JH signalling. However, they are also necessary since loss-of-function conditions for each of them are sufficient to trigger strong effects (while the other is supposed to be still present).
Thank you for the suggestion. We have added the following discussion to the text: "CAD96CA and FGFR1 have similar functions in JH signaling, including transmitting JH signal for Kr-h1 expression, larval status maintaining, rapid intracellular calcium increase, phosphorylation of transcription factors MET1 and TAI, and high affinity to JH III. CAD96CA and FGFR1 are essential in the JH signal pathway, and loss-of-function for each is sufficient to trigger strong effects on pupation. The difference is that CAD96CA expression has no tissue specificity, and the Fgfr1 gene is highly expressed in the midgut; possibly, it plays a significant role in the midgut. Other possibility is that they play roles by forming heterodimer with each other or other RTKs, which needs to be addressed in future study. CAD96CA and FGFR1 transmit JH III signals in three different insect cell lines, suggesting their conserved roles in other insects."
In addition, despite showing different expression patterns, the two receptors seem to display similar developmental functions according to loss-of-function phenotypes. It is therefore unclear how to draw a model for membrane receptor-mediated JH signalling that includes both CAD96CA and FGFR1.
Thank you for your question. We have modified the figure and its legend to make the concept clearer.
Reviewer #2 (Public Review):
Summary:
Juvenile hormone (JH) is a pleiotropic terpenoid hormone in insects that mainly regulates their development and reproduction. In particular, its developmental functions are described as the "status quo" action, as its presence in the hemolymph (the insect blood) prevents metamorphosis-initiating effects of ecdysone, another important hormone in insect development, and maintains the juvenile status of insects. While such canonical functions of JH are known to be mediated by its intracellular receptor complex composed of Met and Tai, there have been multiple reports suggesting the presence of cell membrane receptor(s) for JH, which mediate non-genomic effects of this terpenoid hormone. In particular, the presence of receptor tyrosine kinase(s) that phosphorylate Met/Tai in response to JH and thus indirectly affect the canonical JH signaling pathway has been strongly suggested. Given the importance of JH in insect physiology and the fact that the JH signaling pathway is a major target of insect growth regulators, elucidating the identification and functions of putative JH membrane receptors is of great significance from both basic and applied perspectives. In the present study, the authors identified candidate receptors for such cell membrane JH receptors, CAD96CA and FGFR1, in the cotton bollworm Helicoverpa armigera.
Strengths:
Their in vitro analyses are conducted thoroughly using multiple methods, which overall supports their claim that these receptors can bind to JH and mediate their non-genomic effects.
Weaknesses:
Results of their in vivo experiments, particularly those of their loss-of-function analyses using CRISPR mutants are still preliminary, and the results rather indicate that these membrane receptors do not have any physiologically significant roles in vivo. More specifically, previous studies in lepidopteran species have clearly and repeatedly shown that precocious metamorphosis is the hallmark phenotype for all JH signaling-deficient larvae. In contrast, the present study showed that Cad96ca and Fgfr1 G0 mutants only showed a slight acceleration in their pupation timing, which is not a typical phenotype one would expect from JH signaling deficiency. This is inconsistent with their working model provided in Figure 6, which indicates that these cell membrane JH receptors promote the canonical JH signaling by phosphorylating Met/Tai.
If the authors argue that this slight acceleration of pupation is indeed a major JH signaling-deficient phenotype in Helicoverpa, they need to provide more data to support their claim by analyzing CRISPR mutants of other genes involved in JH signaling, such as Jhamt and Met. An alternative explanation is that there is functional redundancy between CAD96CA and FGFR1 in mediating phosphorylation of Met/Tai. This possibility can be tested by analyzing double knockouts of these two receptors.
Thank you for your question and suggestion. Cadherin 96ca (CAD96CA) and fibroblast growth factor receptor 1 (FGFR1) were determined to be JH cell membrane receptors on the basis of their roles in JH-regulated gene expression, maintenance of larval status, the JH-induced rapid increase in intracellular calcium levels, JH-induced phosphorylation of MET and TAI, and their JH-binding affinity. Their roles as JH cell membrane receptors were further confirmed by knockdown and knockout in vivo and in cell lines, and by heterologous overexpression in mammalian HEK-293T cells. Figure 6 was drawn on the basis of this solid evidence.
The slight acceleration of pupation in Cad96ca and Fgfr1 G0 mutants is one type of evidence of deficient JH signaling. Other evidence includes the expression of a set of genes and the block of the JH-induced rapid increase in intracellular calcium.
Kr-h1 is a typical indicator gene downstream of Jhamt in JH signaling, so we used it as an indicator of JH signaling. Jhamt, Met, or other genes might be affected in Cad96ca and Fgfr1 G0 mutants, which can be examined in a future study.
We have discussed the question that Cad96ca and Fgfr1 G0 mutants only showed a slight acceleration in their pupation timing: "Homozygous Cad96ca null Drosophila die at late pupal stages (Wang et al., 2009). However, we found that 86% of the larvae of the Cad96ca mutant successfully pupated in G0 generation, although earlier than the control. Similarly, null mutation of Fgfr1 or Fgfr2 in mouse is embryonic lethal (Arman et al., 1998; Deng et al., 1994; Yamaguchi et al., 1994). In D. melanogaster, homozygous Htl (Fgfr) mutant embryos die during late embryogenesis, too (Beati et al., 2020; Beiman et al., 1996; Gisselbrecht et al., 1996). However, in H. armigera, 91% of larvae successfully pupated in G0 generation after Fgfr1 knockout. The low death rate after Cad96ca and Fgfr1 knockout might be because of following reasons, including the editing efficiency (67% and 61% for Cad96ca mutant and Fgfr1 mutant, respectively), the chimera of the gene knockout at the G0 generation, and the redundant RTKs that play similar roles in JH signaling, similar to the redundant roles of MET and Germ-cell expressed bHLH-PAS (GCE) in JH signaling (Liu et al., 2009), which needs to obtain alive G1 homozygote mutants and double knockout of these two receptors in future study. We indeed observed that the eggs did not hatch successfully after mixed-mating of G0 Cad96ca mutant or Fgfr1 mutant, respectively, but the reason was not addressed further due to the embryonic death. By the similar reasons, most of the Cad96ca and Fgfr1 mutants showed a slight acceleration of pupation (about one day) without the typical precocious metamorphosis (at least one instar earlier) phenotype caused by JH signaling defects (Daimon et al., 2012; Fukuda, 1944; Riddiford et al., 2010) and JH pathway gene deletions (Abdou et al., 2011; Liu et al., 2009). 
On other side, JH can regulate gene transcription by diffusing into cells and binding to the intracellular receptor MET to conduct JH signal, which might affect the results of gene knockdown and knockout.".
Currently, the validity of their calcium imaging analysis in Figure 5 is also questionable. When performing calcium imaging in cultured cells, it is critically important to treat all the cells at the end of each experiment with a hormone or other chemical reagents that universally induce calcium increase in each particular cell line. Without such positive control, the validity of calcium imaging data remains unknown, and readers cannot properly evaluate their results.
Thank you for your question. For Figure 5, our goal was to demonstrate that JH can induce calcium mobilization through CAD96CA and FGFR1. Controls were established between different experimental groups within the same cell, as well as between different cells. Adding a positive-control group would make the results more complex.
Reviewer #3 (Public Review):
Summary:
In this study, Li et al. identified CAD96CA and FGF1 among 20 receptor tyrosine kinase receptors as mediators of JH signaling. By performing a screen in HaEpi cells with overactivated JH signaling, the authors pinpointed two main RTKs that contribute to the transduction of JH. Using the CRISPR/Cas9 system to generate mutants, the authors confirmed that these RTKs are required for normal JH activation, as precocious pupariation was observed in their absence. Additionally, the authors demonstrated that both CAD96CA and FGF1 exhibit a high affinity for JH, and their activation is necessary for the proper phosphorylation of Tai and Met, transcription factors that promote the transcriptional response. Finally, the authors provided evidence suggesting that the function of CAD96CA and FGF1 as JH receptors is conserved across insects.
Strengths:
The data provided by the authors are convincing and support the main conclusions of the study, providing ample evidence to demonstrate that phosphorylation of the transducers Met and Tai mainly depends on the activity of two RTKs. Additionally, the binding assays conducted by the authors support the function of CAD96CA and FGF1 as membrane receptors of JH. The study's results validate, at least in H. armigera, the predicted existence of membrane receptors for JH.
Weaknesses:
The study has several weaknesses that need to be addressed. Firstly, it is not clear what criteria were used by the authors to discard several other RTKs that were identified as repressors of JH signaling. For example, while NRK and Wsck may not fulfill all the requirements to become JH receptors, other evidence, such as depletion analysis and target gene expression, suggests they are involved in proper JH signaling activation.
Thank you for your question. We screened the RTKs sequentially: we first examined the roles of the 20 RTKs identified in the H. armigera genome in JH-regulated gene expression to obtain primary candidates, and then screened these candidates by their roles in maintaining larval status, the JH-induced rapid increase in intracellular calcium levels, JH-induced phosphorylation of MET and TAI, and affinity to JH. WSCK was not involved in the phosphorylation of MET and TAI and was discarded during subsequent screening. NRK did not bind JH III, so it did not meet the screening criteria and was discarded.
We added the following information to the Introduction: "We screened the RTKs sequentially, including examining the roles of 20 RTKs identified in the H. armigera genome in JH regulated-gene expression to obtain primary candidates, followed by screening of the candidates by their roles in maintaining larval status, JH induced-rapid increase of intracellular calcium levels, JH induced-phosphorylation of MET and TAI, and affinity to JH. The cadherin 96ca (CAD96CA) and fibroblast growth factor receptor 1 (FGFR1) were finally determined as JH cell membrane receptors by their roles in JH regulated-gene expression, maintaining larval status, JH induced-rapid increase of intracellular calcium levels, JH induced-phosphorylation of MET and TAI, and their JH-binding affinity. Their roles as JH cell membrane receptors were further determined by knockdown and knockout of them in vivo and cell lines, and overexpression of them in mammal HEK-293T heterogeneously."
We also expanded the discussion: "This study found six RTKs that respond to JH induction by participating in JH induced-gene expression and intracellular calcium increase, however; they exert different functions in JH signaling, and finally CAD96CA and FGFR1 are determined as JH cell membrane receptors by their roles in JH induced-phosphorylation of MET and TAI and binding to JH III. We screen the RTKs transmitting JH signal primarily by examining some of JH induced-gene expression. By examining other genes or by other strategies to screen the RTKs might find new RTKs functioning as JH cell membrane receptors; however, the key evaluation indicators, such as the binding affinity of the RTKs to JH and the function in transmitting JH signal to maintain larval status are essential."
Secondly, the expression of the six RTKs, which, when knocked down, were able to revert JH signaling activation, was mainly detected in the last larval stage of H. armigera. However, since JH signaling is active throughout larval development, it is unclear whether these RTKs are completely required for pathway activation or only needed for high activation levels at the last larval stage.
Thank you for the question. We knocked down the genes at the last larval stage to observe pupation, which is a relatively simple and easily observed endpoint for examining the role of a gene in JH-maintained larval status. The results from CRISPR/Cas9 experiments showed: "Most wild-type larvae showed a phenotype of pupation on time. However, in the Cad96ca mutant, 86% of the larvae (an editing efficiency of 67% by TA clone analysis) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 24 h earlier. In the Fgfr1 mutant, 91% of the larvae (an editing efficiency of 61%) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 23 h earlier (Figure 4D and E). The data suggested that CAD96CA and FGFR1 support larval growth and prevent pupation in vivo."
Additionally, the mechanism by which different RTKs exert their functions in a specific manner is not clear. According to the expression profile of the different RTKs, one might expect some redundant role of those receptors. In fact, the lack of reversion of Tai and Met phosphorylation upon depletion of Wsck in cells with overactivated JH signalling seems to support this idea.
Nevertheless, and despite the overlapping expression of the different receptors, all RTKs seem to be required for proper pathway activation, even in the case of FGF1 which seems to be only expressed in the midgut. This is an intriguing point unresolved in the study.
Thank you for your comments. Yes, our study indicates that different RTKs exert their functions in a specific manner. We have expanded the discussion: "This study found six RTKs that respond to JH induction by participating in JH induced-gene expression and intracellular calcium increase, however; they exert different functions in JH signaling, and finally CAD96CA and FGFR1 are determined as JH cell membrane receptors by their roles in JH induced-phosphorylation of MET and TAI and binding to JH III. We screen the RTKs transmitting JH signal primarily by examining some of JH induced-gene expression. By examining other genes or by other strategies to screen the RTKs might find new RTKs functioning as JH cell membrane receptors; however, the key evaluation indicators, such as the binding affinity of the RTKs to JH and the function in transmitting JH signal to maintain larval status are essential."
Finally, the study does not explain how RTKs with known ligands could also bind JH and contribute to JH signaling activation. In Drosophila, FGF1 is activated by pyramus and thisbe for mesoderm development, while CAD96CA is activated by collagen during wound healing. Now the authors claim that in addition to these ligands, the receptors also bind to JH. However, it is unclear whether these RTKs are activated by JH independently of their known ligands, suggesting a specific binding site for JH, or if they are only induced by JH activation when those ligands are present in a synergistic manner. Alternatively, activation of the RTK pathways by their known ligands may induce certain levels of JH transducer phosphorylation, which, in the presence of JH, contributes to the full pathway activation without JH-RTK binding being necessary.
Thank you for your professional questions. It is exciting and challenging to explore the molecular mechanism by which multiple ligands transmit signals through the same receptor. It requires a long-term research plan and in-depth studies. We added the following discussion to the text: "CAD96CA (also known as Stitcher, Ret-like receptor tyrosine kinase) activates upon epidermal wounding in Drosophila embryos (Tsarouhas et al., 2014) and promotes growth and suppresses autophagy in the Drosophila epithelial imaginal wing discs (O'Farrell et al., 2013). There is a CAD96CA in the genome of the H. armigera, which is without function study. Here, we reported that CAD96CA prevents pupation by transmitting JH signal as a JH cell membrane receptor. We also showed that CAD96CA of other insects has a universal function of transmitting JH signal to trigger Ca2+ mobilization, as demonstrated by the study in Sf9 cell lines of S. frugiperda and S2 cell lines of D. melanogaster.
FGFRs control cell migration and differentiation in the developing embryo of D. melanogaster (Muha and Muller, 2013). The ligand of FGFR is FGF in D. melanogaster (Du et al., 2018). FGF binds FGFR and triggers cell proliferation, differentiation, migration, and survival (Beenken and Mohammadi, 2009; Lemmon and Schlessinger, 2010). Three FGF ligands and two FGF receptors (FGFRs) are identified in Drosophila (Huang and Stern, 2005). The Drosophila FGF-FGFR interaction is specific. Different ligands have different functions. The activation of FGFRs by specific ligands can affect specific biological processes (Kadam et al., 2009). The FGFR in the membrane of Sf9 cells can bind to Vip3Aa (Jiang et al., 2018). One FGF and one FGFR are in the H. armigera genome, which has yet to be studied functionally. The study found that FGFR prevents insect pupation by transmitting JH signal as a JH cell membrane receptor. Exploring the molecular mechanism and output by which multiple ligands transmit signals through the same receptor is exciting and challenging."
Reviewer #1 (Recommendations For The Authors):
As an experimental suggestion, I will only propose that authors test the double knock-down/knock-out or overexpression of CAD96CA and FGFR1 to give some hints into how redundant/independent the two receptors are.
Thank you very much for your professional advice. We agree that a double knockout of CAD96CA and FGFR1 is very important for resolving the redundancy or independence of the two receptors, and would make our research more complete. Unfortunately, due to experimental difficulty and time constraints, we could not provide these supplementary experiments. In this study, we aimed to screen for the cell membrane receptors of JH, and therefore focused on which RTKs can function as receptors. This article is a preliminary study to identify the cell membrane receptors of JH. To further understand the relationship between the two membrane receptors, we will conduct in-depth research in future work.
Apart from that, here are some minor points about the manuscript:
Figure 2A: changing the scale on the y-axis would help to better see the different genotypes (similar to the way it is presented in Figure 5).
Thank you for the reminder. We have changed the scale in Figure 2A.
Figure 4J: image settings could be improved to better highlight the green fluorescence.
Thank you for your advice. We have improved the image in Figure 4J.
In general, the manuscript would benefit from some proofreading since a number of sentences are incorrect.
Thank you for the reminder. We have carefully revised the manuscript.
Reviewer #2 (Recommendations For The Authors):
(1) Although the authors note that there are 21 RTK genes in Drosophila (line 55), I can only see 16 Drosophila RTKs in Figure 1 - Figure Supplement 1. Some important Drosophila RTKs such as breathless are missing. The authors need to redraw the phylogenetic tree.
Thank you for the reminder. We have presented the new phylogenetic tree in Figure 1-figure supplement 1.
(2) The accelerated pupation phenotype in Cad96ca and Fgfr1 G0 mutants needs to be better described. In particular, it is critical to examine which developmental stage(s) are shortened in these mutant larvae. Refer to a similar study on a JH biosynthetic enzyme in Bombyx (PMID: 22412378) regarding how to describe the developmental timing phenotype.
Thank you for your advice. We have re-shown Figure 4E and added the explanation in the text: "In 61 survivors of Cas9 protein plus Cad96ca-gRNA injection, 30 mutants were sequenced, and a mutation efficiency was 49.2%. Similarly, in the 65 survivors of Cas9 protein plus Fgfr1-gRNA injection, 35 mutants were sequenced, and a mutation efficiency was 53.8% (Figure 4C). The DNA sequences, deduced amino acids and off–target were analyzed (Figure 4—figure supplement 1). Most wild-type larvae showed a phenotype of pupation on time. However, in the Cad96ca mutant, 86% of the larvae (an editing efficiency of 67% by TA clone analysis) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 24 h earlier. In the Fgfr1 mutant, 91% of the larvae (an editing efficiency of 61%) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 23 h earlier (Figure 4D and E). The data suggested that CAD96CA and FGFR1 support larval growth and prevent pupation in vivo.".
(3) The editing efficiency described in lines 211-213 is obscure. Does this indicate the percentage of animals with noisy sequencing spectra or the percentage of mutation rates analyzed by TA cloning?
Thanks for your reminder. We have revised the description in the text: "In 61 survivors of Cas9 protein plus Cad96ca-gRNA injection, 30 mutants were sequenced, and a mutation efficiency was 49.2%. Similarly, in the 65 survivors of Cas9 protein plus Fgfr1-gRNA injection, 35 mutants were sequenced, and a mutation efficiency was 53.8% (Figure 4C). The DNA sequences, deduced amino acids and off–target were analyzed (Figure 4—figure supplement 1). Most wild-type larvae showed a phenotype of pupation on time. However, in the Cad96ca mutant, 86% of the larvae (an editing efficiency of 67% by TA clone analysis) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 24 h earlier. In the Fgfr1 mutant, 91% of the larvae (an editing efficiency of 61%) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 23 h earlier (Figure 4D and E). The data suggested that CAD96CA and FGFR1 support larval growth and prevent pupation in vivo.".
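As a quick arithmetic check of the figures quoted above, the per-animal mutation efficiencies follow directly from the reported counts (mutants divided by survivors). The sketch below is illustrative only: the function name `mutation_efficiency` is hypothetical, and the 67% and 61% editing efficiencies are a separate TA-cloning measurement that cannot be derived from these counts.

```python
def mutation_efficiency(mutants: int, survivors: int) -> float:
    """Percentage of surviving injected animals that carry a mutation."""
    if survivors <= 0:
        raise ValueError("survivors must be positive")
    return round(100 * mutants / survivors, 1)


# Counts as quoted in the revised text (Figure 4C).
cad96ca = mutation_efficiency(30, 61)  # Cas9 protein plus Cad96ca-gRNA
fgfr1 = mutation_efficiency(35, 65)    # Cas9 protein plus Fgfr1-gRNA

print(f"Cad96ca: {cad96ca}%  Fgfr1: {fgfr1}%")
```

Both values agree with the 49.2% and 53.8% stated in the quoted passage.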
(4) In Figures 4F and G, the authors examined expression levels of some JH/ecdysone responsive genes only at 0 hr-old 6th instar larvae. This single developmental stage is not enough for this analysis. In particular, the expression level of Fgfr1 only goes up in the mid-6th instar according to their own data (Figure 1-Figure Supplement 4), so it is critical to examine expression levels of these genes at least throughout the 6th larval instar.
Thank you for your advice. Indeed, it is essential to measure the expression levels of JH/ecdysone response genes throughout the sixth instar. Because we observed that the mutants have a shorter feeding stage in the sixth instar, we examined the expression levels of the JH/ecdysone response genes at the early sixth instar. Because the number of mutants obtained was small and non-destructive sampling could not be performed during the sixth instar, there were not enough samples to test. In the future, we will generate Cad96ca Fgfr1 double mutants and measure the expression levels of JH/ecdysone response genes throughout the sixth instar.
(5) As mentioned above, some important Drosophila RTKs such as breathless are missing in their analyses. As breathless is a close paralog of heartless (Htl), I am sure that Drosophila breathless is also orthologous to Helicoverpa FGFR1. The authors therefore need to analyze breathless in Figure 5B in addition to Htl.
Thank you for your advice. We added experiments and the results are shown in Figure 5B and Figure 5—figure supplement 1.
(6) More discussion about the reason why dsNrk and dsWsck can provide resistance to JHIII in Figure 1 is required.
Thank you for your advice. We added an explanation to the discussion: "It is generally believed that the primary role of JH is to antagonize 20E during larval molting (Riddiford, 2008). The knockdown of Cad96ca, Nrk, Fgfr1, and Wsck showed phenotypes resistant to JH III induction and the decrease of Kr-h1 and increase of Br-z7 expression, but knockdown of Vegfr and Drl only decrease Kr-h1, without increase of Br-z7. Br-z7 is involved in 20E-induced metamorphosis in H. armigera (Cai et al., 2014), whereas, Kr-h1 is a JH early response gene that mediates JH action (Minakuchi et al., 2009) and represses Br expression (Riddiford et al., 2010). The high expression of Br-z7 is possible due to the down-regulation of Kr-h1 in Cad96ca, Nrk, Fgfr1 and Wsck knockdown larvae. The different expression profiles of Br-z7 in Vegfr and Drl knockdown larvae suggest other roles of Vegfr and Drl in JH signaling, which need further study."
Reviewer #3 (Recommendations For The Authors):
(1) The authors should consider optimizing their experimental approach by depleting the six candidate RTKs in an early larval stage rather than using a sensitized background with JH application in the last larval stage.
Thank you for your valuable suggestion. We knocked down the genes at the last larval stage to observe pupation, which is a relatively simple and easily observed endpoint for examining the role of a gene in JH-maintained larval status. The results from CRISPR/Cas9 experiments showed: "Most wild-type larvae showed a phenotype of pupation on time. However, in the Cad96ca mutant, 86% of the larvae (an editing efficiency of 67% by TA clone analysis) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 24 h earlier. In the Fgfr1 mutant, 91% of the larvae (an editing efficiency of 61%) had a shortened feeding stage in the sixth instar and entered the metamorphic molting stage earlier, showing early pupation, with the pupation time being 23 h earlier (Figure 4D and E). The data suggested that CAD96CA and FGFR1 support larval growth and prevent pupation in vivo." Determining the roles of the other RTKs throughout larval development will require future work, since many experiments are needed.
(2) Including a positive control for JH signaling, such as met or tai, would strengthen the assays and provide a benchmark for evaluating the downregulation of target genes and phenotype reversion upon JH application. This addition, especially in Figure 1, would enhance the interpretability of the results.
Thank you for your suggestion. We agree that adding Met or Tai as a positive control would strengthen the assays. Our laboratory has reported in previous studies that knockdown of Met leads to decreased expression of genes in the JH signaling pathway and precocious pupation (PMID: 24872508), so we did not repeat this experiment in this study. In the future, when performing Cad96ca and Fgfr1 double-mutant experiments, a Met mutant can be generated as a control to provide further reference for interpreting the results.
(3) I recommend revising the manuscript to improve readability, particularly in the Results section, where descriptions of the binding part are particularly dense.
Thank you for your advice. We have carefully revised the manuscript.
(4) In line 122, please add the reference Wang et al., 2016.
Thank you for the reminder. We have added the reference in line 125 of the new manuscript.
(5) The authors should clarify why they chose to test the possible binding to JH of only Cad96CA, FGFR1, and NRK after conducting various assays while including OTK in the study as a negative control. This explanation should be included in the text.
Thank you for the suggestion. We added the explanation, as described in the text: "We screened the RTKs sequentially, including examining the roles of 20 RTKs identified in the H. armigera genome in JH regulated-gene expression to obtain primary candidates, followed by screening of the candidates by their roles in maintaining larval status, JH induced-rapid increase of intracellular calcium levels, JH induced-phosphorylation of MET and TAI, and affinity to JH. The cadherin 96ca (CAD96CA) and fibroblast growth factor receptor 1 (FGFR1) were finally determined as JH cell membrane receptors by their roles in JH regulated-gene expression, maintaining larval status, JH induced-rapid increase of intracellular calcium levels, JH induced-phosphorylation of MET and TAI, and their JH-binding affinity. Their roles as JH cell membrane receptors were further determined by knockdown and knockout of them in vivo and cell lines, and overexpression of them in mammal HEK-293T heterogeneously.".
"Since Cad96CA, FGFR1, and NRK were not only involved in JH-regulated Kr-h1 expression, JH III-induced delayed pupation, and calcium levels increase, but also involved in MET and TAI phosphorylation, we further analyzed their binding affinity to JH III. OTK did not respond to JH III, so we used it as a control protein on the cell membrane to exclude the possibility of nonspecific binding.".
(6) The observed embryonic lethality of cad96ca and FGF1 mutants in Drosophila contrasts with the ability of the respective mutants in H. armigera to reach the pupal stage. The authors should discuss this significant difference.
Thank you for the suggestion. We added the explanation in the discussion, as described in the text: "Homozygous Cad96ca null Drosophila die at late pupal stages (Wang et al., 2009). However, we found that 86% of the larvae of the Cad96ca mutant successfully pupated in G0 generation, although earlier than the control. Similarly, null mutation of Fgfr1 or Fgfr2 in mouse is embryonic lethal (Arman et al., 1998; Deng et al., 1994; Yamaguchi et al., 1994). In D. melanogaster, homozygous Htl (Fgfr) mutant embryos die during late embryogenesis, too (Beati et al., 2020; Beiman et al., 1996; Gisselbrecht et al., 1996). However, in H. armigera, 91% of larvae successfully pupated in G0 generation after Fgfr1 knockout. The low death rate after Cad96ca and Fgfr1 knockout might be because of following reasons, including the editing efficiency (67% and 61% for Cad96ca mutant and Fgfr1 mutant, respectively), the chimera of the gene knockout at the G0 generation, and the redundant RTKs that play similar roles in JH signaling, similar to the redundant roles of MET and Germ-cell expressed bHLH-PAS (GCE) in JH signaling (Liu et al., 2009), which needs to obtain alive G1 homozygote mutants and double knockout of these two receptors in future study. We indeed observed that the eggs did not hatch successfully after mixed-mating of G0 Cad96ca mutant or Fgfr1 mutant, respectively, but the reason was not addressed further due to the embryonic death. By the similar reasons, most of the Cad96ca and Fgfr1 mutants showed a slight acceleration of pupation (about one day) without the typical precocious metamorphosis (at least one instar earlier) phenotype caused by JH signaling defects (Daimon et al., 2012; Fukuda, 1944; Riddiford et al., 2010) and JH pathway gene deletions (Abdou et al., 2011; Liu et al., 2009). 
On other side, JH can regulate gene transcription by diffusing into cells and binding to the intracellular receptor MET to conduct JH signal, which might affect the results of gene knockdown and knockout.".
(7) Building upon the previous point, it is noteworthy that the cad96ca and FGF1 mutants exhibit only a 24-hour early pupation phenotype, contrasting with the 48-hour early pupation induced by Kr-h1 depletion. This discrepancy suggests that while the function of these RTKs is necessary, it may not be sufficient to fully activate JH signaling. The expression profile of these receptors, primarily observed in the last larval stage, supports this hypothesis.
Thank you for your suggestion. We added the explanation in the discussion, as described in the text: "Homozygous Cad96ca null Drosophila die at late pupal stages (Wang et al., 2009). However, we found that 86% of the larvae of the Cad96ca mutant successfully pupated in the G0 generation, although earlier than the control. Similarly, null mutation of Fgfr1 or Fgfr2 in mouse is embryonic lethal (Arman et al., 1998; Deng et al., 1994; Yamaguchi et al., 1994). In D. melanogaster, homozygous Htl (Fgfr) mutant embryos also die during late embryogenesis (Beati et al., 2020; Beiman et al., 1996; Gisselbrecht et al., 1996). However, in H. armigera, 91% of larvae successfully pupated in the G0 generation after Fgfr1 knockout. The low death rate after Cad96ca and Fgfr1 knockout might have several causes: the editing efficiency (67% and 61% for the Cad96ca mutant and the Fgfr1 mutant, respectively), chimerism of the gene knockout in the G0 generation, and redundant RTKs that play similar roles in JH signaling, similar to the redundant roles of MET and Germ-cell expressed bHLH-PAS (GCE) in JH signaling (Liu et al., 2009). Resolving this will require obtaining live G1 homozygous mutants and a double knockout of these two receptors in a future study. We did observe that the eggs did not hatch successfully after mixed-mating of G0 Cad96ca mutants or Fgfr1 mutants, respectively, but the reason was not addressed further due to the embryonic death. For similar reasons, most of the Cad96ca and Fgfr1 mutants showed a slight acceleration of pupation (about one day) without the typical precocious metamorphosis (at least one instar earlier) phenotype caused by JH signaling defects (Daimon et al., 2012; Fukuda, 1944; Riddiford et al., 2010) and JH pathway gene deletions (Abdou et al., 2011; Liu et al., 2009).
On the other hand, JH can regulate gene transcription by diffusing into cells and binding to the intracellular receptor MET to transduce the JH signal, which might affect the results of gene knockdown and knockout."
(8) The expression profile of the RTK hits described in Supplementary Figure 4A appears to be limited to the last larval stage until pupation. The authors should clarify whether these receptors are expressed earlier, and the meaning of the letters in the plot should be described in the figure legend.
Thank you for the suggestion. We added the explanation in the Figure 1—figure supplement 4 legend, as described in the text: "The expression profiles of Vegfr1, Drl, Cad96ca, Nrk, Fgfr1, and Wsck during development. 5F: fifth instar feeding larvae; 5M: fifth instar molting larvae; 6th-6 h to 6th-120 h: sixth instar larvae at 6 h to 120 h; P0 d to P8 d: pupae at day 0 to day 8; F: feeding stage; M: molting stage; MM: metamorphic molting stage; P: pupae."
We are very sorry, but due to time limitations, we will investigate the expression profile of these RTKs throughout the larval stages in future work.
(9) In Figure 4, panels F and G, the levels of Kr-h1 are shown in cad96ca and FGF1 mutants in the last larval stage. The authors should indicate whether Kr-h1 levels are also low in earlier larval stages or only detected in the last larval stage, as this would imply that these RTKs are only required at this stage.
Thank you for your suggestion. In this study, the feeding stage of the Cad96ca and Fgfr1 mutants was shortened in the sixth instar, and they entered the metamorphic molting stage earlier. We therefore measured Kr-h1 expression in the sixth instar. It is an excellent idea to measure Kr-h1 expression at various larval stages to determine the stages at which CAD96CA and FGFR1 act and to study the relationship between CAD96CA and FGFR1 in the future.
(10) While Figure 5 demonstrates JH-triggered calcium ion mobilization in Sf9 cells and S2 cells, the authors should also include data on JH signaling target genes, such as Kr-h1, for a more comprehensive analysis.
Thank you for your advice. We added experiments, as described in the text: "To demonstrate the universality of CAD96CA and FGFR1 in JH signaling in different insect cells, we investigated JH-triggered calcium ion mobilization and Kr-h1 expression in Sf9 cells derived from S. frugiperda and S2 cells derived from D. melanogaster. Knockdown of Cad96ca and Fgfr1 (named Htl or Btl in D. melanogaster), respectively, significantly decreased JH III-induced intracellular Ca2+ release and extracellular Ca2+ influx, and Kr-h1 expression (Figure 5A, B, Figure 5—figure supplement 1A and B). The efficacy of RNAi of Cad96ca and Fgfr1 was confirmed in the cells (Figure 5—figure supplement 1C and D), suggesting that CAD96CA and FGFR1 have a general function in transmitting the JH signal in S. frugiperda and D. melanogaster."
(11) The authors should consider improving the quality of images and some plots, particularly enlarging panels showing larval and pupal phenotypes, such as Figure 1B and Supplementary Figure C. Additionally, adding a plot showing the statistical analysis of the phenotype in Supplementary Figure C would enhance clarity. Some plots are overly busy and difficult to read due to small size, such as Figure 1C, Figure 2A, and all the plots in Figure 3. Figure 4E also requires improvement for better readability.
Thank you for your suggestion. We have adjusted Figure 1B, Figure 1C, Figure 1—figure supplement 1C, Figure 2A and Figure 4E. However, for Figure 3, we have not found a better way to arrange and adapt them, considering the overall arrangement of the results and the page space, so we keep them in their original state.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This work is meant to help create a foundation for future studies of the Central Complex, which is a critical integrative center in the fly brain. The authors present a systematic description of cellular elements, cell type classifications, behavioral evaluations and genetic resources available to the Drosophila neuroscience community.
Strengths:
The work contributes new, useful and systematic technical information in compelling fashion to support future studies of the fly brain. It also continues to set a high and transparent standard by which large-scale resources can be defined and shared.
Weaknesses:
manuscript p. 1
"The central complex (CX) of the adult Drosophila melanogaster brain consists of approximately 2,800 cells that have been divided into 257 cell types based on morphology and connectivity (Scheffer et al., 2020; Hulse et al. 2021; Wolff et al., 2015)."
The 257 accumulated cell types have informational names (e.g., PBG2‐9.s‐FBl2.b‐NO3A.b) in addition to their associations with specific Gal4 lines and specific EM Body IDs. All this is very useful. I have one suggestion to help a reader trying to get a "bird's eye view" of such a large amount of detailed and multi-layered information. Give each of the 257 CX cell types an arbitrary number: 1 to 257. In fact, Supplemental File 2 lists ~277 cell types each with a number in sequence, so perhaps in principle, it is there. This could expedite the search function when a reader is trying to cross-reference CX cell type information from the text, to the Figures and/or to the Supplemental Figures. Also, the use of (arbitrary) cell type numbers could expedite the explanation of which cell types are included in any compilation of information (e.g., which ones were tested for specific NT expression).
In this report we adhered to the nomenclature introduced in Hulse et al. 2021. We agree that the nomenclature of cell types in the CX is imperfect. There are inherent limitations to what can be done with present data. Even between the hemibrain and FAFB/Flywire EM datasets, it was not possible to derive a one-to-one correspondence in many cases, largely because we do not yet have enough information to distinguish between natural variation within a cell type and distinct cell types (see Schlegel et al. 2024). Moreover, many cell type distinctions depend on connectivity differences that are observable only in EM datasets but not in LM images. Several research groups are currently engaged in a comprehensive and collaborative effort to update the CX nomenclature that will extend over the next few months as additional connectomes become available. This work will require hundreds of hours of effort from anatomical and computational experts in multiple laboratories who have a strong interest in the CX. Since the correspondence between the established Hulse et al nomenclature we use and this new nomenclature will be made clear, it will be easy to transfer our data to that new nomenclature. For all these reasons, we believe we should not unilaterally introduce any new naming systems at this time.
manuscript p 2
"Figure 2 and Figure 2-figure supplements 1-4 show the expression of 52 new split-GAL4 lines with strong GAL4 expression that is largely limited to the cell type of interest. .... We also generated lines of lesser quality for other cell types that in total bring overall coverage to more than three quarters of CX cell types."
This section describes the generation and identification of specific split Gal4 lines, and the presentation is generally excellent. It represents an outstanding compendium of information. My reading of the text suggests ~200 cell types have Gal4 lines that are of immediate use (having high specificity or very close to it). Use of an arbitrary number system (mentioned above) could augment that description for the reasons stated. For example, which of the 257 cell types are represented by split Gal4 lines that constitute the ~1/3 representing "high-quality lines"? A second comment relates to this study's functional analysis of the contributions of CX cell types to sleep physiology. The recent literature contains renewed interest in the specific expression patterns of Gal4 lines that can promote sleep-like behaviors. In particular, Gal4 line expression outside the brain (in the VNC and outside the CNS) has been raised as an important element that needs to be included in the interpretation of sleep regulation. This present study offers useful information about a large number of expression patterns, as well as a basis with which to seek additional information, including mention of VNC expression in many cases. However, perhaps I missed it, but I could not find a short description of the overall strategy used to describe the expression patterns and feel that could be helpful. Were all Gal4 lines studied for expression in the VNC? And in the peripheral NS? It is probably published elsewhere, but even a short reprise would still be useful.
We added a couple of sentences to clarify that the lines were imaged in the adult female brain and VNC and many were also imaged in males. These data, including the ability to download the original confocal stacks, are contained in an on-line web source cited in the text. We also make clear that we did not assay expression outside of the brain, optic lobes and VNC. Therefore, we cannot rule out expression in the peripheral nervous system (other than detected in the axons of sensory neurons in the CNS) or in muscle or other non-neuronal cell types.
manuscript p 9
Neurotransmitter expression in CX cell types
"To determine what neurotransmitters are used by the CX cell types, we carried out fluorescent in situ hybridization using EASI-FISH (Eddison and Ihrke, 2022; Close et al., 2024) on brains that also expressed GFP driven from a cell-type-specific split GAL4 line. In this way, we could determine what neurotransmitters were expressed in over 100 different CX cell types based on ...."
Reading this description, I was uncertain whether the >100 cell types mentioned were tested with all the NT markers by EASI-FISH? Also, assigning arbitrary numbers to the cell types (same suggestion as above) could help the reader more readily ascertain which were the ~100 cell types classified in this context.
The specific probes used for each cell type are indicated in Figure 9 and in Supplemental File 1.
manuscript p 10
"Our full results are summarized below, together with our analysis of neuropeptide expression in the same cell types."
I recommend specifying which Figures and Tables contain the "full results" indicated.
We changed the wording to read:
“Our full results are summarized, together with our analysis of neuropeptide expression in the same cell types, in Figures 5-9 and in Supplemental File 1.”
NP expression in CX cell types
Similar to the comments regarding studies of NT expression: were all ~100 cell types tested with each of the 17 selected NPs? Arbitrary numerical identifiers could be useful for the reader to determine which cell types/lines were tested and which were not yet tested.
We expanded the description in Methods to now read:
“For neurotransmitters, the specific probes used for each cell type are indicated in Figure 9 and in Supplemental File 1. For neuropeptides, each of the 17 selected NP probes shown in Figure 5—figure supplement 1 was used on all cell types in Figure 9 except those marked by “—” in the neuropeptide column.”
manuscript p. 11
"The neuropeptide expression patterns we observed fell into two broad categories."
This section presents information that is extensive and extremely useful. It supports consideration of peptidergic cell signaling at a circuit level and in a systematic fashion that will promote future progress in this field. I have two comments. First, regarding the categorization of two NP expression patterns, discernible by differences in cell number: this idea mirrors one present in prior literature. Recently the classification of the transcription factor DIMM summarizes this same two-way categorization (e.g., doi: 10.1371/journal.pone.0001896). That included the fact that a single NP can be utilized by cells of either category.
We inserted a sentence to acknowledge this earlier work:
“Such large neurosecretory cells often express the transcription factor DIMM (Park et al. 2008).”
Second, regarding this comment:
"In contrast, neuropeptides like those shown in Figure 6 appear to be expressed in dozens to hundreds of cells and appear poised to function by local volume transmission in multiple distinct circuits."
Signaling by NPs in this second category (many small cells) suggests more local diffusion, a smaller geographic expanse compared to "volume" signaling by the sparser larger peptidergic cells. Given this, I suggest re-consideration in using the term "volume" in this instance, perhaps in favor of "local" or "paracrine". This is only a suggestion and in fact rests almost entirely on speculation/ interpretation, as the field lacks a strong empirical basis to say how far NPs diffuse and act. A recent study in the fly brain of peptide co-transmitters (doi: 10.1016/j.cub.2020.04.025) provides an instructive example in which differences between the spatial extents of long-range (peptide 1) versus short-range (peptide 2) NP signaling may be inferred in vivo.
We have modified the text to now read:
“those shown in Figure 6 are expressed in dozens to hundreds of cells and appear poised to function by transmission to nearby cells in multiple distinct circuits.”
Spab was mentioned (Figure 6 legend) but discarded as a candidate NP to include based on a personal communication, as was Nplp1. The manuscript did not include reasons to do so, nor include a reference to spab peptide. I suggest including explicit reasons to discard candidate NPs.
While there is strong supportive evidence for many NPs in Drosophila, the evidence that other transcripts encode NPs is more circumstantial, often relying simply on sequence analysis and lacking convincing evidence for a specific cognate receptor. We note that Spab is not listed as a neuropeptide in the current release of FlyBase. In these cases, we relied on the opinion of individuals with extensive experience in studying Drosophila NPs. The results obtained with the probes for Spab and Nplp1 are still available in Supplemental File 1.
In Fig 9-supplement 1, neurotransmitter biosynthetic enzymes were measured by RNA-seq for given CX cell types to augment the cell type classification. The same methods could be used to support cell type classification regarding putative peptidergic character (in Figure 9 supplement 2) by measuring expression levels of critical, canonical neuropeptide biosynthetic enzymes. These include the proprotein convertase dPC2 (amon); the carboxypeptidase dCPD/E (silver); and the amidating enzymes dPHM; dPal1; dPal2. PHM is most related to DBM (dopamine beta monooxygenase), the rate limiting enzyme for DA production, and greater than 90% of Drosophila neuropeptides are amidated. If the authors are correct in surmising widespread use of NPs by CX cell types (and I expect they are), there could be diagnostic value to report expression levels of this enzyme set across many/most CX cell types.
In our admittedly limited experience, most cells express these enzymes and the level we observed in confirmed NP expressing cell types was not reproducibly higher. (The complete data for all genes for the cell types we assayed are available from our deposition in the NCBI Gene Expression Omnibus with accession number GSE271123.) Given our small sample size we chose not to comment on this in the paper.
Comment #6
Screen of effects on Sleep behavior
This work is large in scope and as suggested likely presents excellent starting points for many follow-up studies. I again suggest assigning stable number identities to the elements described. In this case, not cell types, but split Gal4 lines. This would expedite the cross-referencing of results across the four Supplemental Files 3-6. For example, line SS00273 is entry line #27 in S Files 3 and 4, but line entry #18 in S Files 5 and 6.
We believe the interested reader can make this correspondence by searching the supplemental files, which are Excel spreadsheets. We note that both driver lines and cell types have stable identifiers that are used across Figures and Tables: the line numbers (for example, SS00273) for driver lines and the Hulse et al cell type names for cell types.
manuscript p 26
Clock to CX
"Not surprisingly, the connectome reveals that many of the intrinsic CX cell types with sleep phenotypes are connected by wired pathways (Figure 12 and Figure 12-figure supplement 1)."
Do intrinsic CX cells with sleep phenotypes also connect by wired pathways to CX cells that do not have sleep phenotypes?
Yes, but we do not have high confidence that negative sleep phenotypes in our assays indicate no role in sleep.
"The connectome also suggested pathways from the circadian clock to the CX. Links between clock output DN1 neurons to the ExR1 have been described in Lamaze et al. (2018) and Guo et al. (2018), and Liang et al. (2019) described a connection from the clock to ExR2 (PPM3) dopaminergic neurons."
The introduction to this section indicates a focus on connectome-defined synaptic contacts. Whereas the first two studies cited featured both physiological and anatomic evidence to support connectivity from clock cells to CX, the third did not describe any anatomical connections, and that connection may in fact be due to diffuse, not synaptic, signaling.
I could not easily discern the difference between Figs 12 and 12-S1. These appear to be highly related circuit models, wherein the second features more elements. Perhaps spell out the basis for the differences between the two models to avoid ambiguity.
We clarified that the supplemental diagram differs from the one in the main text by the inclusion of additional connections:
“The strongest of these connections are diagrammed in Figure 12, with Figure 12—figure supplement 1 also showing additional weaker connections.”
"...the cellular targets of Dh31 released from ER5 are unknown, however previous work (Goda et al., 2017; Mertens et al., 2005; Shafer et al., 2008) has shown that Dh31 can activate the PDF receptor raising the possibility of autocrine signaling."
Regarding pharmacological evidence for Dh31 activation of Pdfr: strong in vivo evidence was developed in doi: 10.1016/j.neuron.2008.02.018: a strong pdfr mutation greatly reduces the response to synthetic Dh31 in neurons that normally express Pdfr.
We added the Shafer et al., 2008 reference.
manuscript p 30
"Unexpectedly, we found that all neuropeptide-expressing cell types also expressed a small neurotransmitter."
Did this conclusion apply only to CX cell types? - or was it also true for large peptidergic neurons? Prior evidence suggests the latter may not express small transmitters (doi: 10.1016/j.cub.2009.11.065). The question pertains to the broader biology of peptidergic neurons, and is therefore outside the strict scope of the main focus area - the CX. However, the text did initially consider peptidergic neurons outside the CX, so the information may be pertinent to many readers.
We did not look at other cell types in the current study and so cannot provide an answer.
Reviewer #2 (Public review):
Summary:
In this paper, Wolff et al. describe an impressive collection of newly created split-GAL4 lines targeting specific cell types within the central complex (CX) of Drosophila. The CX is an important area in the brain that has been involved in the regulation of many behaviors including navigation and sleep/wake. The authors advocate that to fully understand how the CX functions, cell-specific driver lines need to be created. In that respect, this manuscript will be of very important value to all neuroscientists trying to elucidate complex behaviors using the fly model. In addition, and providing a further very important finding, the authors went on to assess neurotransmitter/neuropeptides and their receptors expression in different cells of the CX. These findings will also be of great interest to many and will help further studies aimed at understanding the CX circuitries. The authors then investigated how different CX cell types influence sleep and wake. While the description of the new lines and their neurochemical identity is excellent, the behavioral screen seems to be limited.
Strengths:
(1) The description of dozens of cell-specific split-GAL4 lines is extremely valuable to the fly community. The strength of the fly system relies on the ability to manipulate specific neurons to investigate their involvement in a specific behavior. Recently, the need to use extremely specific tools has been highlighted by the identification of sleep-promoting neurons located in the VNC of the fly as part of the expression pattern of the most widely used dorsal-Fan Shaped Body (dFB) GAL4 driver. These findings should serve as a warning to every neurobiologist: make sure that your tool is clean. In that respect, the novel lines described in this manuscript are fantastic tools that will help the fly community.
(2) The description of neurotransmitter/neuropeptides expression pattern in the CX is of remarkable importance and will help design experiments aimed at understanding how the CX functions.
Weaknesses:
(1) I find the behavioral (sleep) screen of this manuscript to be limited. It appears to me that this part of the paper is not as developed as it could be. The authors have performed neuronal activation using thermogenetic and/or optogenetic approaches. For some cell types, only thermogenetic activation is shown. There is no silencing data and/or assessment of sleep homeostasis or arousal threshold. The authors find that many CX cell types modulate sleep and wake but it's difficult to understand how these findings fit one with the other. It seems that each CX cell type is worthy of its own independent study and paper. I am fully aware that a thorough investigation of every CX neuronal type in sleep and wake regulation is a herculean task. So, altogether I think that this manuscript will pave the way for further studies on the role of CX neurons in sleep regulation.
(2) Linked to point 1, it is possible that the activation protocols used in this study are insufficient for some neuronal types. The authors have used 29° for thermogenetic activation (instead of the most widely used 31°) and a 2Hz optogenetic activation protocol. The authors should comment on the fact that they may have missed some phenotypes by using these mild activation protocols.
Our primary goal was to test the feasibility of using these tools in assessing sleep and wake function of neurons within the CX. In the process we uncovered several new neurons within the DFB-EB network that control sleep and make connections with previously identified sleep regulating neurons. For all single cell type lines and lines with sparse patterns and no VNC expression we present both optogenetic and thermogenetic data. The lines for which we only have thermogenetic but no optogenetic data are those which have multiple cell types or VNC expression. We felt that optogenetic data for these non-specific or contaminated lines would not reliably indicate a role for individual cell types in sleep regulation.
Many previous studies that have used 31 degrees have done so for shorter durations and often using different times of the day for manipulations. The lack of consistency between studies using this temperature may be due in part to the fact that 31 degrees alters behaviors of flies (including controls) and, for this reason, is usually not used for 24-hour activation durations.
To keep the screen consistent and ensure we capture changes in both daytime and nighttime sleep we used 29 degrees. The behavior of control flies is not as disrupted or altered at this temperature, and 29 degrees for activation is routinely used in behavioral experiments.
We similarly selected an optogenetic stimulation protocol that minimizes the response of flies to the red-light pulses. We chose this protocol because we found, in earlier experiments in a different project, that this level of stimulation was able to elicit activation phenotypes across a range of cell types (including several known clock neurons). However, we cannot rule out false negatives in either the TrpA or the optogenetic experiments and agree that we might have missed some phenotypes.
Finally, as the reviewer rightfully points out, a thorough, detailed investigation of each cell type is a herculean task. We screened in both genders with very sparse, and often cell-type-specific, driver lines while using two distinct modes of activation and different methods for assessing sleep. For these reasons, we believe the GAL4 lines we identified provide excellent starting points for the additional investigations that will be required to better understand the roles of specific cell types.
(3) There are multiple spelling errors in the manuscript that need to be addressed.
Reviewer #3 (Public review):
Summary:
The authors created and characterized genetic tools that allow for precise manipulation of individual or small subsets of central complex (CX) cell types in the Drosophila brain. They developed split-GAL4 driver lines and integrated this with a detailed survey of neurotransmitter and neuropeptide expression and receptor localization in the central brain. The manuscript also explores the functional relevance of CX cell types by evaluating their roles in sleep regulation and linking circadian clock signals to the CX. This work represents an ambitious and comprehensive effort to provide both molecular and functional insights into the CX, offering tools and data that will serve as a critical resource for researchers.
Strengths:
(1) The extensive collection of split-GAL4 lines targeting specific CX cell types fills a critical gap in the genetic toolkit for the Drosophila neuroscience community.
(2) By combining anatomical, molecular, and functional analyses, the authors provide a holistic view of CX cell types that is both informative and immediately useful for researchers across diverse disciplines.
(3) The identification of CX cell types involved in sleep regulation and their connection to circadian clock mechanisms highlights the functional importance of the CX and its integrative role in regulating behavior and physiological states.
(4) The authors' decision to present this work as a single, comprehensive manuscript rather than fragmenting it into smaller publications each focusing on separate central complex components is commendable. This decision prioritizes accessibility and utility for the broader neuroscience community, which will enable researchers to approach CX-related questions with a ready-made toolkit.
Weaknesses:
While the manuscript is an outstanding resource, it leaves room for more detailed mechanistic exploration in some areas. Nonetheless, this does not diminish the immediate value of the tools and data provided.
Appraisal:
The authors have succeeded in achieving their aims of creating well-characterized genetic tools and providing a detailed survey of neurochemical and functional properties in the CX. The results strongly support their conclusions and open numerous avenues for future research. The work effectively bridges the gap between genetic manipulation, molecular characterization, and functional assessment, enabling a deeper understanding of the CX's diverse roles.
Impact and Utility
This manuscript will have a significant and lasting impact on the field, providing tools and data that facilitate new discoveries in the study of the CX, sleep regulation, circadian biology, and beyond. The genetic tools developed here are likely to become a standard resource for Drosophila researchers, and the comprehensive dataset on neurotransmitter and neuropeptide expression will inspire investigations into the interplay between neuromodulation and classical neurotransmission.
Additional Context
The breadth and depth of the resources presented in this manuscript justify its publication without further modification. By delivering an integrated dataset that spans anatomy, molecular properties, and functional relevance, the authors have created a resource that will serve the neuroscience community for years to come.
Recommendations for the authors:
Reviewing Editor:
The reviewers suggest that a nomenclature, perhaps a numbering system, be adopted for different cell types and Gal4 drivers in order to facilitate reading of the manuscript and cross-referencing.
We agree that a comprehensive reanalysis of the CX nomenclature is in order, but it is premature for us to attempt that as part of this study. This is best done after additional connectomes are generated to help resolve the degree of variation in morphology and connectivity between the same cell in multiple animals.
Reviewer #3 (Recommendations for the authors):
The authors have characterized a large number of split-GAL4 drivers targeting individual or small subsets of CX cell types. This manuscript delivers a detailed anatomical, molecular, and functional mapping of the CX.
By integrating data on neurotransmitters, neuropeptides, and their receptors, the authors provide a holistic view of CX cell types that will undoubtedly serve as a foundation for future studies.
The use of these genetic tools to identify CX cell types affecting sleep, as well as those linking the circadian clock to the CX, represents a significant advance. These findings hint at the diverse and integrative roles of the CX in regulating both behavior and physiological states.
The authors' decision to present this work as a single, comprehensive manuscript rather than fragmenting it into smaller publications each focusing on separate central complex components is commendable. This decision prioritizes accessibility and utility for the broader neuroscience community, which will enable researchers to approach CX-related questions with a ready-made toolkit.
While the manuscript leaves room for further exploration and mechanistic studies, the breadth and depth of the resources presented are more than sufficient to justify publication in their current form.
The data on neuropeptide and receptor expression patterns, especially the observation that all examined CX cell types co-express a small neurotransmitter, opens intriguing new avenues of inquiry into the interplay between classical neurotransmission and neuromodulation in this region.
This manuscript has provided a much-needed resource for the Drosophila neuroscience community and beyond. This work will facilitate important discoveries in CX function, sleep regulation, circadian biology, and more.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #4 (Public review):
We would like to thank the reviewer for their careful consideration of our manuscript. The suggestions have been useful in improving our manuscript. Please see our responses to the specific comments below.
Summary:
This is an important study that underscores that reproduction-survival trade-offs are not manifested (contrary to what generally accepted theory predicts) across a range of studies on birds. This has been studied by a meta-analytical approach, gathering data from a set of 46 papers (30 bird species). The overall conclusion is that there are no trade-offs apparent unless experimental manipulations push the natural variability to extreme values. In the wild, the general pattern for within-species variation is that birds with (naturally) larger clutches survive better.
Strengths:
I agree this study highlights important issues and provides good evidence of what it claims, using appropriate methods.
Weaknesses:
I also think, however, that it would benefit from broadening its horizon beyond bird studies. The conclusions can be reinforced through insights from other taxa. General reasoning is that there is positive pleiotropy, i.e. individuals vary in quality and therefore some are more fit (perform better) than others. Of course, this is within their current environment (biotic, abiotic, social, ...), with consequences for maintaining genetic variation across generations, as outlined in Maklakov et al. 2015 (https://doi.org/10.1002/bies.201500025). This explains the outcomes of this study very well and would make them less controversial and surprising for a more general audience.
I have two fish examples in my mind where this trade-off is also discounted. Of course, given that it is beyond brood-caring birds, the wording in those studies is slightly different, but the evolutionary insight is the same. First, within species but across populations, Reznick et al. (2004, DOI: 10.1038/nature02936) demonstrated a positive correlation between reproduction and parental survival in guppies. Second, an annual killifish study (2021, DOI: 10.1111/1365-2656.13382) showed, within a population, a positive association between reproduction and (reproductive) aging.
In fruit flies, there is also a strong experimental study demonstrating the absence of reproduction-lifespan trade-offs (DOI: 10.1016/j.cub.2013.09.049).
I suggest that incorporating insights from those studies would broaden the scope and reach of the current manuscript.
We would like to thank the reviewer for this useful insight and for highlighting these studies. We have added detail in our discussion around positive correlations observed in the wild, and how positive pleiotropy has been presented as an explanation. We have also added the suggested studies as references to demonstrate the reproduction-lifespan trade-off has been shown to be absent. See lines 257-260.
Likely impact:
I think this is an important contribution to a slow shift in how we perceive the importance of trade-offs in ecology and evolution in general. The current view still is that an individual excelling in one measure of its life history (i.e. receiving benefits) must struggle (i.e. pay costs) in another. However, a positive correlation between all aspects of life history traits is possible within an individual (such as due to developmental conditions or fitting to a particular environment). Simply, some individuals can perform generally better (be of better quality) than others.
We would like to thank the reviewer for highlighting the importance of our study. We hope our study will help the research community reflect on the importance of trade-offs between life-history traits and consider other possible explanations as to why variation in life-history traits is maintained within species.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
The authors have performed extensive work generating reporter mice and performing single-cell analysis combined with in situ hybridization to arrive at 14 clusters of enterochromaffin (EC) cells. Then, they focus on Piezo channel expression in distal EC cells and find that these channels might play a role in regulating colonic motility. Overall, this is an informative study that comprehensively classifies EC cells in different regions of the small and large intestine. From a functional point of view, however, the authors seem to ignore the fact that the expression of Piezo-2-IRES-Cre is broad, which would raise concerns regarding their physiological conclusions.
The authors may wish to consider the following specific points:
It is surprising that the number of ileal EC cells is less than that of the distal colon, and it would be interesting to know whether the authors can comment about ileal EC cells. It is unclear why ileal ECs were not included in the study, even though they are mentioned in the diagram (Fig. 2c).
We have discussed the rationale for excluding ileal ECs in the methods section under “Elimination of ileal GFP+ cells”. In our initial scRNA-seq experiment, our yield of epithelial cells and GFP positive cells was low, and a large proportion of these cells appeared to not have fully committed to the EC lineage. Also to note, we have previously seen fewer ECs in the distal ileum than upper small intestine and colon (PMID: 26803512). Given the low yield, and some uncertainty regarding the nature of the ileal EC population sorted by our methods, we considered that data from ileal ECs may not be an accurate representation of ileal EC cell diversity. Thus, we did not use ileal ECs in our second scRNA-seq experiment.
Based on their analysis, there are 10 EC cell clusters in SI while there are only 4 clusters in the colon. The authors should comment on whether this is reflective of lesser diversity among colonic ECs or due to the smaller number of colonic ECs collected.
The 4 clusters identified in the colon are consistent with a previous publication (Glass et al., Mol. Metab. 2017, PMID: 29031728), supporting the idea that these clusters are representative of the major clusters of colonic ECs. Nonetheless, we anticipate that with greater sample sizes (in any region) further subtypes could be resolved.
The authors previously described that distal colonic EC cells exhibit various morphologies (Kuramoto et al., 2021). Do Ascl1(+) EC cells particularly co-localize with EC cells with long basal processes? Also, to validate the RNA seq data, the authors might show co-localization between Piezo2/Ascl1/Tph1 in distal EC cells. It would be interesting to see whether Ascl1-CreER (which is available in Jax) specifically labels distal colonic EC cells as this could provide a good genetic tool to specifically manipulate distal colonic EC cells.
We have shown co-localization between Piezo2/Ascl1/Tph1 in Supplementary Figure 6a. Unfortunately we did not study cell morphology in the Ascl1 smRNA-FISH experiments as these used thin cryosections, whereas morphological assessment of EC processes is best performed with thick (>60 µm) sections. It would be interesting if neuronal-like expression profiles correlate with neuronal-like morphology, which could be addressed in future studies with spatial transcriptomics.
The authors used Piezo2-IRES-Cre mice, whose expression is rather broad. They might examine the distribution of Chrm3-mCitrine in the intestine (IF/IHC would be straightforward). And if the expression is in other cell types (which is most likely the case), they should justify that the observed phenotype derives from Piezo2-expressing EC cells. Alternatively, they could use Piezo2-Cre;ePetFlp (or Vil-Flp);Chrm3 to specifically express DREADD receptors in distal colonic EC cells. Also, what does 5HT release look like in jejunal EC cells in Piezo-CHRM3 mice?
Unfortunately we no longer have access to the animals to do these experiments.
For the same reasons as above, DTR experiments may also be non-specific. For example, based on the IF staining (Fig. 6b,d), there seems to be a loss of Tph1+ cells in the proximal colon of Piezo2-DTR mice, so the effects of the Piezo2-DTR likely extend beyond the distal colon.
Figures 6b and d show distal colon, not proximal colon. Our Tph1+ cell counts indicate there was no loss of Tph1 cells in the proximal colon following intraluminal administrations of DT.
It is unclear why the localized loss of Piezo2 in Piezo2-DTR mice alters small intestinal transit (Fig. 6g,h). The authors should discuss the functional differences observed between Piezo2-DTR (intraluminal app) and Vil1Piezo2 KO mice i.e., small intestinal transit, 5HT release, etc. Are these differences due to the residual Piezo2 expression in Piezo2 KO mice? In this context, the authors may want to discuss their findings in the context of recent papers, such as those from the Patapoutian and Ginty groups.
We have made the following amendment to speculate on the reason for delayed small intestinal transit in the DTR experiments:
“There are several possible explanations for this. Some Piezo2+ cells in the small intestine could have been depleted. Alternatively, 5-HT released from Piezo2+Tph1+ cells in the distal colon may provide feedback to the small intestine to accelerate motility, and thus depletion of these cells would result in slower intestinal transit.”
We have also added a comment speculating on why we did not see similar slowing of small intestinal transit in the Villin-Cre Piezo2 KO:
“No difference was observed in small intestine transit… in contrast to the DTR experiments, in which small intestinal transit was delayed. This could be due to the depletion of EC cells in the DTR experiments, whereas they are retained in the Villin-Cre Piezo2 KO mice. 5-HT secretion from ECs can be induced by other stimulants (even when Piezo2 is knocked out), and thus colonic 5-HT could be providing feedback to the small intestine to accelerate motility in the Villin-Cre Piezo2 KO mice. Residual Piezo2 expression in these mice could also be contributing to this effect.”
We have added a comment on neural Piezo2 in the discussion:
“However, in contrast to Piezo2 signalling in ECs which results in accelerated gut transit, Piezo2 signalling in DRG neurons appears to slow transit (refs: Wolfson et al., Cell 2023; PMID: 37541195; Servin-Vences et al., Cell 2023, PMID: 37541196).”
Reviewer #2 (Public Review):
Summary:
The authors investigated the expression profile of enterochromaffin (EC) cells after creating a new tryptophan hydroxylase 1 (Tph1) GFP-reporter mouse using scRNAseq and confirmative RNAscope analysis. They distinguish 14 clusters of Tph1+ cells found along the gut axis. The manuscript focuses on two of these, (i) a multihormonal cell type shown to express markers of pathogen/toxin and nutrient detection in the proximal small intestine, and (ii) an EC cluster in the distal colon, which expresses Piezo2, rendering these cells mechanosensitive. In vivo and ex vivo data explore the role of the mechanosensitive EC population for intestinal/colonic transit, using chemogenetic activation, diphtheria-toxin receptor dependent cell ablation and conditional gut epithelial specific Piezo2 knock-out. Whilst some of these data are confirmative of previous reports - Piezo2 has been implicated in mechanosensitive serotonin release previously, as referred to by the authors - the data are solid and emphasize the importance of mechanosensitive serotonin release for colonic propulsion. The transcriptomic data will guide future research.
Strengths:
The transcriptomic data, whilst confirmative, are more granular than previous data sets. The study employs new tools to establish a role of mechanosensitive EC cells for colonic and thus total intestinal transit.
Weaknesses:
(1) The proposed villus/crypt distribution of the 14 cell types is not verified adequately. The RNAscope and immunohistochemistry samples presented do not allow assessment of whether this interpretation is correct - spatial transcriptomics, now approaching single-cell resolution, would be likely to help verify this claim.
Spatial transcriptomics would be excellent in validating the spatial distribution of the EC cell types in future studies. In our work, although the villus/crypt cluster annotations are assumptions (based on the differential expression of Neurog3, Tac1, and Sct, which is well supported by the literature), we have validated the spatial segregation of key markers. We quantified the crypt/villus location of Cartpt, Ucn3, and Trpm2 overlap with Tph1 (Figure 2d), Oc3, Cck, and Tph1 (Figure 3d), and TK/5-HT (Supplementary Fig 2d). This work supports our predictions on the spatial distribution of these clusters.
(2) The physiological function and/or functionality of most of the transcriptomically enriched gene products has not been assessed. Whilst a role for Piezo2 expressing cells for colonic transit is convincingly demonstrated, the nature of the mechanical stimulus or the stimulus-secretion coupling downstream of Piezo2 activation is not clear.
While we have not investigated the mechanical forces involved in activating Piezo2, we can at least say that physiological mechanical stimulation activates Piezo2, as we measured fecal pellet output in the DTR experiments.
Reviewer #2 (Recommendations For The Authors):
(1) Please state (even more) clearly if/that the apparently GFP+/Tph1+ cells which clustered with the GFP- cells (Suppl. Fig1d/e) were excluded from the subsequent analysis. The detectable Chg-a/b expression in the GFP- cells in Suppl. Fig1f seems to suggest that these (if they have been included in the GFP- group here) are genuine ECs. How do these cells relate to the non-EC cells in Fig1d, which seem to lack Tph1 expression? And given the information in the methods, what percentage of these cells derived from the ileum?
To clarify, data shown in Suppl. Fig 1d/e/f was from our first single cell profiling experiment whereas our subsequent clustering analysis utilizes data from a second (independent) single cell profiling experiment (e.g. Fig1d).
In the first profiling experiment, 23% of GFP+ cells clustered with GFP- cells, and for the purposes of Suppl. Figures 1d/e/f, we called these “non-ECs”. In the second profiling experiment (e.g. shown in Fig 1d) we performed a more detailed cluster analysis focusing on only GFP+ cells. In this second experiment, 19% of GFP+ cells were identified as “non-EC cells” based on the presence of markers for stem cells, transit amplifying cells (TACs), immature enterocytes, mature enterocytes, colonocytes, T lymphocytes and mucosal mast cells (see Fig 1d and Suppl. Fig 1g). Similar to the first profiling dataset, many of the GFP+ “non-EC cells” in the second dataset express Tph1, Chga, and Chgb, generally at lower levels than the “EC cells” (Suppl. Fig1i). It is possible that the stem cell and transit amplifying cell clusters are cells that are differentiating into EC cells. However, given that they have not fully committed to the lineage yet, we do not consider it appropriate to classify them as “EC cells”. With regards to the other “non-EC” clusters, we do not think that the expression of EC cell marker genes (Tph1, Chga, and Chgb) is evidence enough to call them genuine “EC cells” given the concurrent expression of markers of other lineages (e.g. enterocyte and mast cell markers Suppl. Fig 1g). The expression of Tph1 in murine mast cells is known, however the expression in enterocytes is unexpected and could be a result of imperfect/incomplete differentiation. Since the ileum was not included in the second profiling experiment we do not think the GFP+ “non-EC cells” are an artifact from the ileum.
We have made some adjustments in the first section of the results to clarify some thoughts on this matter:
“It is possible that some GFP is expressed in cells that have not yet fully committed to the EC lineage, or that there is some expression in cells outside this lineage, for example, in mast cells. Given the small sample size, we did not further investigate these cells in this dataset. In Supplementary Figures 1 d and f we refer to the GFP+ cells that clustered with the GFP- cells as “non-EC cells”.”
“It is possible that the stem cell and transit amplifying cell clusters include cells that are in the process of differentiating into EC cells. However, given that they have not fully committed to the lineage, we do not consider it appropriate to classify them as “EC cells” for the purposes of analyzing EC cell types in this study.”
(2) The authors state: "Notably, OSR2 and HOXB13 were restricted to the ileum and rectum respectively in humans (Fig. 1f)." - the statement regarding OSR2 seems too strong, given that only the ileal part of the human small intestine was examined and that there is a small signal in the proximal colon in Figure 1f.
Thanks, we have made the following amendment:
"Notably, OSR2 and HOXB13 were preferentially enriched in the ileum and rectum respectively in these human samples (Fig. 1f)."
(3) Please clarify Suppl Fig2g/h labelling as villus and crypt enrichment ("...enrichment in villus clusters (g) or crypt clusters (h)."), when enrichment for some genes in cluster 4 is shown in both g and h. Why was duodenal cluster 6 excluded from this subset of data?
We suspect (although have not proven) that cluster 4 is at a later stage in maturation/migration than cluster, as indicated by a somewhat ‘middle ground’ level of Sct expression, and generally being ‘in between’ the villus clusters and cluster 5 in expression levels of differentially expressed genes shown in Suppl Fig 2g/h. We have added the following comment to the figure legend to clarify this. We have not included cluster 6 as it is transcriptionally quite distinct from the other clusters:
“Note that cluster 4 shares some features in common with crypt and villus clusters and may represent cells at an intermediate stage of development.”
(4) "Using smRNA-FISH, we further mapped Olfr558 and Il12a transcripts to a separate subset of EC cells expressing Cpb2 (Fig. 4b,c), confirming the presence of two subpopulations of EC cells associated with different physiological roles in the proximal colon." - Claiming populations with different physiological functionality seems a strong statement given the relatively weak Cpb2 signals observed and that mRNA detection necessarily is a transcriptomic time limited snap-shot. Please reformulate.
We have made the following revision:
“Using smRNA-FISH, we further mapped Olfr558 and Il12a transcripts to a separate subset of EC cells expressing Cpb2 (Fig. 4b,c), supporting the idea that there are subpopulations of EC cells in the proximal colon with gene transcripts associated with different physiological roles.”
(5) What are the white signals in the overlay in Fig5a, given that the Piezo1 probe (white) apparently did not give any staining by itself? Please consider a positive control for the Piezo1 probe.
The white signals in the overlay are Piezo1 staining that we do observe at what we consider background levels (also visible in the single-channel image).
(6) "Systematic administration of DT led to lethality in the Piezo2-DTR mice within 12 hours, but not in the Rosa26LSL-DTR or Piezo2-cre mice (data not shown), likely due to the essential function of Piezo2 in respiration" - presumably this should be corrected to "Systemic administration ...".
Thanks, this has been corrected to "Systemic administration ...".
(7) "Although gastric emptying (GE) was not affected in the Piezo2-DTR animals after DT treatment, small intestine transit (SIT) time, a measurement to assess the motility of small intestine, presented a small but statistically significant slowdown in the former group (Fig. 6g,h), suggesting that some Piezo2+ cells in the small intestine were depleted." - alternatively there could, of course, be a slowing of SIT in response to slower colonic transit independent of small intestinal epithelial Piezo2 or 5HT - to me this seems more likely given that even proximal colonic cells are spared in Fig6c and this should be discussed.
Thanks, that is a good point. We have made an amendment, which is shown in response to reviewer 1.
(8) In the context of the Villin-Cre experiments it should be discussed that other colonic EECs also express Piezo2, which might contribute to the observed phenotypes.
In our study, 97.7% of Piezo2+ cells in the distal colon had detectable Tph1 expression, suggesting that there is not a significant degree of overlap with other EEC types.
(9) MC4R is several times referred to as a nutrient-sensing moiety (e.g. in the discussion: "...and receptors associated with nutrient sensing (Casr and Mc4r), ...") - whilst the melanocortin system is important for nutrient homeostasis, MC4R is itself not a "nutrient sensor", a term usually reserved for the detection of macronutrients, such as amino acids, fatty acids, and monosaccharides; please reformulate.
We have amended this to “nutrient sensing and homeostasis”.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
The objective of this study was to infer the population dynamics (rates of differentiation, division, and loss) and lineage relationships of clonally expanding NK cell subsets during an acute immune response.
Strengths:
A rich dataset and thorough analysis of a particular class of stochastic models.
Weaknesses:
The stochastic models used are quite simple; each population is considered homogeneous with first-order rates of division, death, and differentiation. In Markov process models such as these, there is no dependence of cellular behavior on its history of divisions. In recent years models of clonal expansion and diversification, in the settings of T and B cells, have progressed beyond this picture. So I was a little surprised that there was no mention of the literature exploring the role of replicative history in differentiation (e.g. Bresser Nat Imm 2022), nor of the notion of family 'division destinies' (either in division number or the time spent proliferating, as described by the Cyton and Cyton2 models developed by Hodgkin and collaborators; e.g. Heinzel Nat Imm 2017). The emerging view is that variability in clone (family) size may arise predominantly from the signals delivered at activation, which dictate each precursor's subsequent degree of expansion, rather than from the fluctuations deriving from division and death modeled as Poisson processes.
As you pointed out, the Gerlach and Buchholz Science papers showed evidence for highly skewed distributions of family sizes and correlations between family size and phenotypic composition. Is it possible that your observed correlations could arise if the propensity for immature CD27+ cells to differentiate into mature CD27- cells increases with division number? The relative frequency of the two populations would then also be impacted by differences in the division rates of each subset - one would need to explore this. But depending on the dependence of the differentiation rate on division number, there may be parameter regimes (and time points) at which the more differentiated cells can predominate within large clones even if they divide more slowly than their immature precursors. One might not then be able to rule out the two-state model. I would like to see a discussion or rebuttal of these issues.
We thank the reviewer for the insightful comment. We are currently in the process of developing alternate models based on the above comment and the references (Bresser Nat Imm 2022 and Heinzel Nat Imm 2017). We plan to include the results from the analysis in the revised version.
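For orientation, the class of model under discussion (each population homogeneous, with first-order rates of division, differentiation, and death) can be sketched as a small Gillespie simulation. The sketch below follows single clones founded by one immature cell and reports the correlation between clone size and the fraction of immature cells. The function names, rate values, and time horizon are illustrative assumptions only; they are not the authors' code or fitted parameters, and the sketch includes no dependence on division history.

```python
import random


def simulate_clone(t_end, div_imm, diff, div_mat, death_mat, rng):
    """Gillespie simulation of one clone founded by a single immature cell.

    All rates are per-cell, first-order (hypothetical values supplied by the caller).
    Returns (n_immature, n_mature) at time t_end.
    """
    n_imm, n_mat = 1, 0
    t = 0.0
    while True:
        rates = [div_imm * n_imm, diff * n_imm, div_mat * n_mat, death_mat * n_mat]
        total = sum(rates)
        if total == 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            break
        r = rng.random() * total  # pick which event fires
        if r < rates[0]:
            n_imm += 1  # immature cell divides
        elif r < rates[0] + rates[1]:
            n_imm -= 1  # immature cell differentiates
            n_mat += 1
        elif r < rates[0] + rates[1] + rates[2]:
            n_mat += 1  # mature cell divides
        else:
            n_mat = max(n_mat - 1, 0)  # mature cell dies (guarded against rounding)
    return n_imm, n_mat


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical rates, chosen only so the demo runs quickly.
    clones = [simulate_clone(6.0, 1.0, 0.3, 0.5, 0.2, rng) for _ in range(500)]
    clones = [(i, m) for i, m in clones if i + m > 0]  # drop extinct clones
    sizes = [i + m for i, m in clones]
    frac_imm = [i / (i + m) for i, m in clones]
    print("correlation(size, fraction immature):", pearson(sizes, frac_imm))
```

Making the differentiation rate depend on the number of divisions a cell has undergone, as the reviewer suggests, would require tracking per-cell division counts rather than two population totals.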
Reviewer #2 (Public review):
Summary:
Wethington et al. investigated the mechanistic principles underlying antigen-specific proliferation and memory formation in mouse natural killer (NK) cells following exposure to mouse cytomegalovirus (MCMV), a phenomenon predominantly associated with CD8+ T cells. Using a rigorous stochastic modeling approach, the authors aimed to develop a quantitative model of NK cell clonal dynamics during MCMV infection.
Initially, they proposed a two-state linear model to explain the composition of NK cell clones originating from a single immature Ly49+CD27+ NK cell at 8 days post-infection (dpi). Through stochastic simulations and analytical investigations, they demonstrated that a variant of the two-state model incorporating NK cell death could explain the observed negative correlation between NK clone sizes at 8 dpi and the percentage of immature (CD27+) NK cells (Page 8, Figure 1e, Supplementary Text 1). However, this two-state model failed to accurately reproduce the first (mean) and second (variance and covariance) moments of the measured CD27+ and CD27- NK cell populations within clones at 8 dpi (Figure 1g).
To address this limitation, the authors increased the model's complexity by introducing an intermediate maturation state, resulting in a three-stage model with the transition scheme: CD27+Ly6C- → CD27-Ly6C- → CD27-Ly6C+. This three-stage model quantitatively fits the first and second moments under two key constraints: (i) immature CD27+ NK cells exhibit faster proliferation than CD27- NK cells, and (ii) there is a negative correlation (upper bound: -0.2) between clone size and the fraction of CD27+ cells. The model predicted a high proliferation rate for the intermediate stage and a high death rate for the mature CD27-Ly6C+ cells.
Using NK cell reporter mice data from Adams et al. (2021), which tracked CD27+/- cell population dynamics following tamoxifen treatment, the authors validated the three-stage model. This dataset allowed discrimination between NK cells originating from the bone marrow and those pre-existing in peripheral blood at the onset of infection. To test the prediction that mature CD27- NK cells have a higher death rate, the authors measured Ly49H+ NK cell viability in the mouse spleen at different time points post-MCMV infection. Experimental data confirmed that mature (CD27-) NK cells exhibited lower viability compared to immature (CD27+) NK cells during the expansion phase (days 4-8 post-infection).
Further mathematical analyses using a variant of the three-stage model supported the hypothesis that the higher death rate of mature CD27- cells contributes to a larger proportion of CD27- cells in the dead cell compartment, as introduced in the new variant model.
Altogether, the authors proposed a three-stage quantitative model of antigen-specific expansion and maturation of naïve Ly49H+ NK cells in mice. This model delineates a maturation trajectory: (i) CD27+Ly6C- (immature) → (ii) CD27-Ly6C- (mature I) → (iii) CD27-Ly6C+ (mature II). The findings highlight the highly proliferative nature of the mature I (CD27-Ly6C-) phenotype and the increased cell death rate characteristic of the mature II (CD27-Ly6C+) phenotype.
Strengths:
By designing models capable of explaining correlations, first and second moments, and employing analytical investigations, stochastic simulations, and model selection, the authors identified the key processes underlying antigen-specific expansion and maturation of NK cells. This model distinguishes the processes of antigen-specific expansion, contraction, and memory formation in NK cells from those observed in CD8+ T cells. Understanding these differences is crucial not only for elucidating the distinct biology of NK cells compared to CD8+ T cells but also for advancing the development of NK cell therapies currently under investigation.
Weaknesses:
The conclusions of this paper are largely supported by the available data. However, a comparative analysis of model predictions with more recent works in the field would be desirable. Moreover, certain aspects of the simulations, parameter inference, and modeling require further clarification and expansion, as outlined below:
(1) Initial Conditions and Grassmann Data: The Grassmann data is used solely as a constraint, while the simulated values of CD27+/CD27- cells could have been directly fitted to the Grassmann data, which assumes a 1:1 ratio of CD27+/CD27- at t = 0. This approach would allow for an alternative initial condition rather than starting from a single CD27+ cell, potentially improving model applicability.
We thank the reviewer for this comment. We are working on performing the above analysis and plan to include results from the analysis in the revised manuscript.
(2) Correlation Coefficients in the Three-State Model: Although the parameter scan of the three-state model (Figure 2) demonstrates the potential for achieving negative correlations between colony size and the fraction of CD27+ cells, the authors did not present the calculated correlation coefficients using the estimated parameter values from fitting the three-state model to the data. Including these simulations would provide additional insight into the parameter space that supports negative correlations and further validate the model.
We will include the above calculation in the revised manuscript.
(3) Viability Dynamics and Adaptive Response: The authors measured the time evolution of CD27+/- dynamics and viability over 30 days post-infection (Figure 4). It would be valuable to test whether the three-state model can reproduce the adaptive response of CD27- cells to MCMV infection, particularly the observed drop in CD27- viability at 5 dpi (prior to the 8 dpi used in the study) and its subsequent rebound at 8 dpi. Reproducing this aspect of the experiment is critical to determine whether the model can simultaneously explain viability dynamics and moment dynamics. Furthermore, this analysis could enable sensitivity analysis of CD27- viability with respect to various model parameters.
We will include some discussion of potential mechanisms of cell viability in this experiment.
-
-
www.medrxiv.org www.medrxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This study introduces a novel therapeutic strategy for patients with high-risk HER2-positive breast cancer and demonstrates that the incorporation of pyrotinib into adjuvant trastuzumab therapy can improve invasive disease-free survival.
Strengths:
The study features robust logic and high-quality data. Data from 141 patients across 23 centers were analyzed, thereby effectively mitigating regional biases and endowing the research findings with high applicability.
Weaknesses:
(1) Introduction and Discussion: Update the literature regarding the efficacy of pyrotinib combined with trastuzumab in treating HER2-positive advanced breast cancer.
Thank you for this helpful suggestion. The literature regarding the efficacy of pyrotinib combined with trastuzumab in treating HER2-positive advanced breast cancer referenced in our manuscript was the PHILA study, but we mistakenly cited its corrections (reference 14). We revised this reference as suggested.
Changes in the text: Page 6, line 347-353.
(2) Did all the data have a normal distribution? Expand the description of statistical analysis.
As the sample size increases, the sampling distribution of the mean follows a normal distribution even when the underlying distribution of the original variable is non-normal, allowing the use of a normal distribution to calculate their confidence interval. We believe it is unnecessary to specifically describe whether the data followed a normal distribution in this study. Therefore, we did not revise the statistical section.
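The central-limit argument invoked in this response can be illustrated numerically. The sketch below draws repeated samples from a strongly right-skewed (exponential) distribution and measures the skewness of the resulting sample means, which shrinks as the sample size n grows. The distribution, sample sizes, and function names are assumptions chosen for illustration and have no connection to the trial data.

```python
import random
import statistics


def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def sample_mean_skewness(n, n_repeats, rng):
    """Skewness of the distribution of means of n exponential(1) draws."""
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_repeats)
    ]
    return skewness(means)


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (1, 10, 100):
        print(n, round(sample_mean_skewness(n, 2000, rng), 2))
```

For exponential data the skewness of the sample mean is 2/sqrt(n) in theory, so the approach to normality is gradual; how large n must be for a normal-based interval to be adequate depends on how skewed the underlying variable is.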
(3) The novelty and innovative potential of your manuscript compared to the published literature should be described in more detail in the abstract and discussion section.
Thank you for your suggestion. The word count for abstracts recommended by eLife is around 250 words. Therefore, we did not compare the present study with published literature in detail in the abstract, as this might exceed the recommended word limit. We revised the discussion section to provide a more detailed comparison between published literature and our study, and to analyze the novelty of our findings accordingly.
Changes in the text: Page 11, line 177-180.
(4) Figure legend should provide a bit more detail about what readers should focus on.
Thank you for this suggestion. We did not revise the figure legend of Figure 1, as it provides a common description. For the figure legend of Figure 2, we added the method used to estimate the invasive disease-free survival curve. For the figure legend of Figure 3, we added more details regarding methods and numbers of patients in different subgroups.
Changes in the text: Page 7, line 463-472.
(5) P-values should be clarified for the analysis.
Thank you for this comment. All subgroup analyses were post-hoc and lacked predefined hypotheses. Kaplan-Meier curves were used to present the subgroup results with the aim of performing descriptive statistics rather than inferential statistics. Therefore, we did not calculate their p-values.
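For readers less familiar with the descriptive method mentioned here, the Kaplan-Meier estimate is the running product of (1 - d/n) over the observed event times, where d is the number of events at that time and n is the number of patients still at risk. A minimal sketch follows; the function name and the follow-up data are made up for illustration and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event occurred at that time, 0 if the patient was censored
    Returns a list of (event_time, survival_probability) steps.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events observed exactly at time t
        n_leaving = 0  # everyone leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up in months; 0 marks a censored observation.
    months = [6, 12, 12, 18, 24, 24, 24]
    status = [1, 0, 1, 0, 1, 0, 0]
    for t, s in kaplan_meier(months, status):
        print(t, round(s, 3))
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate can be read descriptively even when follow-up is incomplete.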
(6) The order (A, B, and C) in Figure 3 should be labeled in the upper left corner of the Figure.
Thanks for this comment. We revised Figure 3 accordingly.
Changes in the text: Figure 3.
Reviewer #2 (Public review):
In this manuscript, Cao et al. evaluated the efficacy and safety of 12 months of pyrotinib after trastuzumab-based adjuvant therapy in patients with high-risk, HER2-positive early or locally advanced breast cancer. Notably, the 2-year iDFS rate reached 94.59% (95% CI: 88.97-97.38) in all patients, and 94.90% (95% CI: 86.97-98.06) in patients who completed 1-year treatment of pyrotinib. These are interesting and encouraging results, given that in the ExteNET study the 2-year iDFS rate was 93.9% (95% CI: 92.4-95.2) in the 1-year neratinib group, the 5-year iDFS rate was 90.2%, and 1-year treatment with neratinib did not translate into an OS benefit after 8-year follow-up. In this case, readers will be eagerly anticipating the long-term follow-up results of the current PERSIST study, as well as the results of the phase III clinical trial (NCT03980054).
I have the following comments:
(1) An introduction to the differences between pyrotinib and neratinib in terms of mechanism, efficacy, resistance, etc. should be included in the text so that the authors can better highlight the clinical significance of the current trial.
Thanks for this comment.
In terms of mechanism, pyrotinib and neratinib are both irreversible pan-HER tyrosine kinase inhibitors that target HER1, HER2 and HER4 by covalently binding to ATP binding sites. Overall, the similarities between them far outweigh the differences. This is the reason why we referenced the ExteNET study, which used neratinib as extended adjuvant therapy, for the sample size calculation.
Regarding efficacy, no head-to-head studies comparing the efficacy of pyrotinib and neratinib have been reported to date, and comparisons of their efficacy using historical data from different studies are inevitably biased by differences in treatment regimens, study populations, assessment criteria, etc.
Regarding resistance, only a few small-sample studies and case reports have investigated their mechanisms of resistance, and the underlying mechanisms are not yet fully understood.
Collectively, we believe that the similarities in the mechanisms of these two drugs far outweigh their differences, and their efficacy and resistance cannot be reasonably compared. Moreover, the sample size calculation was conducted based on the premise that the two drugs are similar. After careful consideration, we believe that overanalyzing the differences between neratinib and pyrotinib would shift the focus of this manuscript. Therefore, we did not discuss their differences in the article.
(2) Please make sure that a total of 141 patients were enrolled in the study, 38 patients had a treatment duration of less than or equal to 6 months, and a total of 92 and 31 patients completed 1-year and 6-month treatment of extended adjuvant pyrotinib, respectively, which means 7 patients had a treatment duration of fewer than 6 months.
Thank you for raising this relevant question. A total of 141 patients were enrolled in the study and received study treatment, of whom 92 and 31 completed 1-year and 6-month treatment of extended adjuvant pyrotinib, respectively. Of the remaining 18 patients, 16 had a treatment duration of fewer than 6 months, and 2 had a treatment duration longer than 6 months but less than 1 year.
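The patient accounting above can be verified with simple arithmetic. The sketch below is only an illustrative consistency check; the variable names are hypothetical, and the counts are those stated in the response.

```python
# Treatment-duration accounting, using the counts stated in the response.
enrolled = 141
completed_1_year = 92
completed_6_months = 31
under_6_months = 16
between_6_and_12_months = 2

# Patients who completed neither the 1-year nor the 6-month course.
remaining = enrolled - completed_1_year - completed_6_months

# The two smaller groups should account for every remaining patient.
assert remaining == under_6_months + between_6_and_12_months
print(f"Remaining patients: {remaining}")
```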
(3) The previous surgical history should be provided, including how many patients underwent lumpectomy and how many underwent mastectomy.
Thank you for your suggestion. All patients in the present study underwent breast cancer surgery. Unfortunately, we did not collect data on the specific types of surgeries performed.
Recommendations for the authors:
Reviewing Editor:
I have carefully reviewed the content and findings of your study, and while I recognize the potential impact of your research, there are several critical aspects that need to be addressed to fully appreciate the contribution of your work.
Significance of Findings:
Your study provides valuable insights into the efficacy and safety of pyrotinib as an extended adjuvant therapy following trastuzumab-based treatment in patients with high-risk HER2-positive breast cancer. The 2-year invasive disease-free survival (iDFS) rate of 94.59% is notably high and suggests that pyrotinib could be a promising option for patients who have completed trastuzumab therapy. This is particularly significant given the unmet need for effective therapies that can extend disease-free survival in this patient population.
Strength of Evidence:
The strength of the evidence presented is supported by the multicenter phase II trial design, which included a substantial number of patients across 23 centers in China. The rigorous methodology, including the use of the Kaplan-Meier method for estimating iDFS and the application of the Brookmeyer-Crowley method for confidence intervals, adds to the credibility of your findings. However, the single-arm study design without a control group limits the ability to draw definitive conclusions about the comparative effectiveness of pyrotinib.
In conclusion, your study presents intriguing findings that contribute to the field of breast cancer therapy. However, the current evidence, while suggestive of pyrotinib's potential, requires further validation in controlled trials to confirm its efficacy and optimal use in clinical practice. I encourage you to address the issues raised and consider resubmitting a revised version of your work.
Thank you for your comments. We acknowledge the limitation of our single-arm study design without a control group and agree that it restricts definitive conclusions about the comparative effectiveness of pyrotinib. This limitation was noted in our manuscript. Furthermore, we have revised our manuscript in response to the issues raised by the reviewers.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors aim to assess the effect of salt stress on root:shoot ratio, identify the underlying genetic mechanisms, and evaluate their contribution to salt tolerance. To this end, the authors systematically quantified natural variations in salt-induced changes in root:shoot ratio. This innovative approach considers the coordination of root and shoot growth rather than exploring biomass and the development of each organ separately. Using this approach, the authors identified a gene cluster encoding eight paralog genes with a domain-of-unknown-function 247 (DUF247), with the majority of SNPs clustering into SR3G (At3g50160). In the manuscript, the authors utilized an integrative approach that includes genomic, genetic, evolutionary, histological, and physiological assays to functionally assess the contribution of their genes of interest to salt tolerance and root development.
Comments on revisions:
As the authors correctly noted, variations across samples, genotypes, or experiments make achieving statistical significance challenging. Should the authors choose to emphasize trends across experiments to draw biological conclusions, careful revisions of the text, including titles and figure legends, will be necessary to address some of the inconsistencies between figures (see examples below). However, I would caution that this approach may dilute the overall impact of the work on SR3G function and regulation. Therefore, I strongly recommend pursuing additional experimental evidence wherever possible to strengthen the conclusions.
(1) Given the phenotypic differences shown in Figures S17A-B, 10A-C, and 6A, the statement that "SR3G does not play a role in plant development under non-stress conditions" (lines 680-681) requires revision to better reflect the observed data.
Thank you to the reviewer for the comment. We appreciate the acknowledgment that variations among experiments are inherent to biological studies. Figures 6A and S17 represent the same experiment, which initially indicated a phenotype for the sr3g mutant under salt stress. To ensure that growth changes were specifically normalized for stress conditions, we calculated the Stress Tolerance Index (Fig. 6B). In Figure 10, we repeated the experiment including all five genotypes, which supported our original observation that the sr3g mutant exhibited a trend toward reduced lateral root number under 75 mM NaCl compared to Col-0, although this difference was not significant (Fig. 10B). Additionally, we confirmed that the wrky75 mutant showed a significant reduction in main root growth under salt stress compared to Col-0, consistent with findings reported in The Plant Cell by Lu et al. 2023. For both main root length and lateral root number, we demonstrated that the double mutants of wrky75/sr3g displayed growth comparable to wild-type Col-0. This result suggests that the sr3g mutation compensates for the salt sensitivity of the wrky75 mutant.
We completely agree with the reviewer that there is a variation in our results regarding the sr3g phenotype under control conditions, as presented in Fig. 6A/Fig. S17 and Fig. 10A-C. In Fig. 6A/Fig. S17, we did not observe any consistent trends in main root or lateral root length for the sr3g mutant compared to Col-0 under control conditions. However, in Fig. 10A-C, we observed a significant reduction in main root length, lateral root number, and lateral root length for the sr3g mutant under control conditions. We believe this may align with SR3G’s role as a negative regulator of salt stress responses. While loss of this gene benefits plants in coping with salt stress, it might negatively impact overall plant growth under non-stress conditions. This interpretation is further supported by our findings on the root suberization pattern in sr3g mutants under control conditions (Fig. 8B), where increased suberization in root sections 1 to 3, compared to Col-0, could inhibit root growth. While SR3G's role in overall plant fitness is intriguing, it is beyond the scope of this study. We cannot rule out the possibility that SR3G contributes positively to plant growth, particularly root growth. That said, we observed no differences in shoot growth between Col-0 and the sr3g mutant under control conditions (Fig. 7). Additionally, we calculated the Stress Tolerance Index for all aspects of root growth shown in Fig. 10 and presented it in Fig. S25.
Regarding the reviewer’s request to rephrase the statement "SR3G does not play a role in plant development under non-stress conditions" (lines 680-681): this statement is now found in lines 652-653 and corresponds to Fig. 7, where we evaluated rosette growth in the WT and sr3g mutant under both control and salt stress conditions. We did not observe any significant differences, or even trends, between the two genotypes under control conditions, confirming the accuracy of the statement. To clarify further, we have revised it to read “SR3G does not play a role in rosette growth and development under non-stress conditions”.
(2) I agree with the authors that detecting expression differences in lowly expressed genes can be challenging. However, as demonstrated in the reference provided (Lu et al., 2023), a significant reduction in WRKY75 expression is observed in T-DNA insertion mutant alleles of WRKY75. In contrast, Fig. 9B in the current manuscript shows no reduction in WRKY75 expression in the two mutant alleles selected by the authors, which suggests that these alleles cannot be classified as loss-of-function mutants (line 745). Additionally, the authors note that the wrky75 mutant exhibits reduced main root length under salt stress, consistent with the phenotype reported by Lu et al. (2023). However, other phenotypic discrepancies exist between the two studies. For example, 1) Lu et al. (2023) report that wrky75 root length is comparable to WT under control conditions, whereas the current manuscript shows that wrky75 root growth is significantly lower than WT; 2) under salt stress, Lu et al. (2023) show that wrky75 accumulates higher levels of Na+, whereas the current study finds Na+ levels in wrky75 indistinguishable from WT. To confirm the loss of WRKY75 function in these T-DNA insertion alleles the authors should provide additional evidence (e.g., Western blot analysis).
We sincerely appreciate the reviewer acknowledging the challenge of detecting expression differences in lowly expressed genes, such as transcription factors. Transcription factors are typically expressed at lower levels compared to structural or enzymatic proteins, as they function as regulators where small quantities can have substantial effects on downstream gene expression.
That said, we respectfully disagree with the reviewer’s interpretation that there is no reduction in WRKY75 expression in the two mutant lines tested in Fig. 9C. Among the two independent alleles examined, wrky75-3 showed a clear reduction in expression compared to WT Col-0 under both control and salt stress conditions. Using the Tukey test to compare all groups, we observed distinct changes in the assigned significance letters for each case:
Col/root/control (cd) vs wrky75-3/root/control (cd): Although the same significance letter was assigned, we still observed a clear reduction in WRKY75 transcript abundance. More importantly, the variation in expression is notably lower compared to Col-0.
Col/shoot/control (bcd) vs wrky75-3/shoot/control (a): This is a significant reduction compared to Col-0.
Col/root/salt (cd) vs wrky75-3/root/salt (bcd): Once again, the reduction in WRKY75 transcript levels corresponds to changes in the assigned significance letters.
Col/shoot/salt (bc) vs wrky75-3/shoot/salt (ab): Once again, the reduction in WRKY75 transcript levels corresponds to changes in the assigned significance letters.
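For readers less familiar with Tukey letter groupings, the sketch below assumes the usual compact-letter-display convention, in which two groups differ significantly only when they share no letter. The helper name `share_letter` is hypothetical; the letters are those listed above for Col-0 and wrky75-3.

```python
def share_letter(label_a: str, label_b: str) -> bool:
    """Return True if two compact-letter-display labels have a letter in common."""
    return bool(set(label_a) & set(label_b))


# (Col-0 label, wrky75-3 label) for each tissue/condition, as listed above.
comparisons = {
    "root/control": ("cd", "cd"),
    "shoot/control": ("bcd", "a"),
    "root/salt": ("cd", "bcd"),
    "shoot/salt": ("bc", "ab"),
}

# Under the assumed convention, no shared letter means a significant difference.
significant = {
    name: not share_letter(col, mutant)
    for name, (col, mutant) in comparisons.items()
}

for name, is_significant in significant.items():
    verdict = "significantly different" if is_significant else "not significantly different"
    print(f"{name}: {verdict}")
```

A change in the assigned letters therefore indicates a shift in grouping, while statistical significance between two specific groups requires that their letter sets do not overlap.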
To address the reviewer’s comment regarding the significant reduction in WRKY75 expression observed in T-DNA insertion mutant alleles of WRKY75 in the reference by Lu et al., 2023, we would like to draw the reviewer’s attention to the following points:
a) Different alleles: The authors in The Plant Cell used different alleles than those used in our study, with one of their alleles targeting regions upstream of the WRKY75 gene. While we identified one of their described alleles (wrky75-1, SALK_101367) on the T-DNA Express website, which targets upstream of WRKY75, the other allele (wrky75-25) appears to have been generated through a different mechanism (possibly an RNAi line) that is not defined in the Plant Cell paper and does not appear on the T-DNA Express website. The authors mentioned in the acknowledgements that they received these seeds as gifts from other labs: “We thank Prof. Hongwei Guo (Southern University of Science and Technology, China) and Prof. Diqiu Yu (Yunnan University, China) for kindly providing the WRKY75pro:GUS, 35Spro:WRKY75-GFP, wrky75-1, and wrky75-25 seeds. We thank Man-cang Zhang (Electrophysiology platform, Henan University) for performing the NMT experiment.”
However, in our study, we selected two different T-DNAs that target the coding regions. While this may explain slight differences in the observed responses, both studies independently link WRKY75 to salt stress, regardless of the alleles used. For your reference, we have included a screenshot of the different alleles used.
Author response image 1.
b) Different developmental stages: They measured WRKY75 expression in 5-day-old seedlings. In our experiment, we used seedlings grown on 1/2x MS for 4 days, followed by transfer to treatment plates with or without 75 mM NaCl for one week. As a result, we analyzed older plants (12 days old) for gene expression analysis. Despite the difference in developmental stage, we were still able to observe a reduction in gene expression.
c) Different tissues: The authors of The Plant Cell used whole seedlings for gene expression analysis, whereas we separated the roots and shoots and measured gene expression in each tissue type individually. This approach is logical, as WRKY75 is a root cell-specific transcription factor with higher expression in the roots compared to the shoots, as demonstrated in our analysis (Fig. 9C).
Based on the reasoning above, we did work with loss-of-function mutants of WRKY75, particularly wrky75-3. To more accurately reflect the nature of the mutation, we have changed the term "loss-of-function" to "knock-down" in line 717.
The reviewer mentioned phenotypic discrepancies between the two studies. We agree that there are some differences, particularly in the magnitude of responses or expression levels. However, despite variations in the alleles used, developmental stages, and tissue types, both studies reached the same conclusion: WRKY75 is involved in the salt stress response and acts as a positive regulator. We have discussed the differences between our study and The Plant Cell in the section above, summarizing them into three main points: different alleles, different developmental stages, and different tissue types.
To address the reviewer’s comment regarding "Lu et al. (2023) report that wrky75 root length is comparable to WT under control conditions, whereas the current manuscript shows that wrky75 root growth is significantly lower than WT": We evaluated root growth differently than The Plant Cell study. In The Plant Cell (Fig. 5, H-J), root elongation was measured in 10-day-old plants with a single time point measurement. They transferred five-day-old wild-type, wrky75-1, wrky75-25, and WRKY75-OE plants to 1/2× MS medium supplemented with 0 mM or 125 mM NaCl for further growth and photographed them 5 days after transfer. In contrast, our study used 4-day-old seedlings, which were transferred to 1/2× MS supplemented with 0, 75, or 125 mM salt for additional growth (9 days). Rather than measuring root growth only at the end, we scanned the roots every other day, up to five times, to assess root growth rates. Essentially, the precision of our method is higher, as we captured growth changes throughout the developmental process rather than at a single endpoint, as in The Plant Cell. We do not underestimate the significance of the work conducted by other colleagues in the field, but we also recognize that each laboratory has its own approach and specific practices. This variation in experimental setup is intrinsic to biology, and we believe it is important to study biological phenomena in different ways. This is especially valuable because the common or contrasting conclusions reached by different labs using different experimental setups shed more light on reproducibility and on gene contribution across different conditions, which is intrinsic to phenotypic plasticity and GxE interactions.
The Plant Cell used a very high salt concentration, starting at 125 mM, while we were more cautious in our approach, as such a high concentration can inhibit and obscure more subtle phenotypic changes.
To address the reviewer’s comment on "Lu et al. (2023) show that wrky75 accumulates higher levels of Na+, whereas the current study finds Na+ levels in wrky75 indistinguishable from WT," we would like to highlight the differences in the methodologies used in both studies. The Plant Cell measured Na+ accumulation in the wrky75 mutant using xylem sap (Supplemental Figure S10), which appears to be a convenient and practical approach in their laboratory. In their experiment, wild-type and wrky75 mutant plants were grown in soil for 3 weeks, watered with either a mock solution or 100 mM NaCl solution for 1 day, and then xylem sap was collected for Na+ content analysis. In contrast, our study employed a different method to measure Na+ and K+ ion content, using Inductively Coupled Plasma Atomic Emission Spectroscopy (ICP-AES) for root and shoot Na+ and K+ measurements. Additionally, we collected samples after two weeks on treatment plates and focused on the Na+/K+ ratio, which we consider more relevant than net Na+ or K+ levels, as the ratio of these ions is a critical determinant of plant salt tolerance. With this in mind, we observed a considerable non-significant increase in the Na+/K+ ratio in the shoots of the wrky75-3 mutant (assigned Tukey’s letter c) compared to the Col-0 WT (assigned Tukey’s letters abc) under 125 mM salt, suggesting that this mutant is salt-sensitive. Importantly, the Na+/K+ ratio in the double wrky75/sr3g mutants was reduced to the WT level under the same salt conditions, further indicating that the salt sensitivity of wrky75 is mitigated by the sr3g mutation.
Based on the reasons mentioned above, we believe that conducting additional experiments, such as Western blot analysis, is unnecessary and would not contribute new insights or alter the context of our findings.
Reviewer #2 (Public review):
Summary:
Salt stress is a significant and growing concern for agriculture in some parts of the world. While the effects of sodium excess have been studied in Arabidopsis and (many) crop species, most studies have focused on Na uptake, toxicity and overall effects on yield, rather than on developmental responses to excess Na, per se. The work by Ishka and colleagues aims to fill this gap.
Working from an existing dataset that exposed a diverse panel of A. thaliana accessions to control, moderate, and severe salt stress, the authors identify candidate loci associated with altering the root:shoot ratio under salt stress. Following a series of molecular assays, they characterize a DUF247 protein which they dub SR3G, which appears to be a negative regulator of root growth under salt stress.
Overall, this is a well-executed study which demonstrates the functional role played by a single gene in plant response to salt stress in Arabidopsis.
Review of revised manuscript:
The authors have addressed my point-by-point comments to my satisfaction. In the cases where they have changed their manuscript language, clarified figures, or added analyses I have no further comment. In some cases, there is a fruitful back-and-forth discussion of methodology which I think will be of interest to readers.
I have nothing to add during this round of review. I think that the paper and associated discussion will make a nice contribution to the field.
We sincerely appreciate the reviewer’s recognition of the significance of our work to the field.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Lines 518-519: The statement that other DUF247s exhibit similar expression patterns to SR3G, suggesting their responsiveness to salt stress, is not fully supported by Fig. S14. Please clarify the specific similarities (and differences) in the expression patterns of the DUF247s shown in Fig. S14, as their expression appears to be spatially and temporally diverse. Additionally, the scale is missing in Fig. S14.
We thank the reviewer. We fixed the text and added expression scales to Figure S14.
Line 684, Fig. 6A should be 7A.
Thanks. It is fixed.
Line 686, Fig. 7A should be 7B.
Thanks. It is fixed.
Lines 721-723: The signal quantification in Fig. 8B does not support the claim that "in section one,..., sr3g-5 showed more suberization compared to Col-0." Given the variability and noise often associated with histological dyes such as Fluorol Yellow staining, conclusions should be cautiously grounded in robust signal quantification. Additionally, please specify the number of biological replicates used in both Fig. 8B and C.
We thank the reviewer for their comments. We believe the statement in the text accurately reflects our results presented in Figure 8B, where we stated “non-significant, but substantially higher levels of root suberization in sr3g-5 compared to Col-0 in sections one to three of the root under control condition (Fig. 8B).” Therefore, we kept the statement and have included the number of biological replicates in the figure legend.
Lines 731-732: Please provide a more detailed explanation of how the significant changes in suberin monomer levels align with the Fluorol Yellow staining results, and clarify how these findings support the proposed negative role of SR3G in root suberization.
Fluorol Yellow is a lipophilic dye widely used to label suberin in plant tissues, specifically in roots in this study. Given the inherent variability in histological assays, we confirmed the increase in suberization using an alternative method, Gas Chromatography–Mass Spectrometry (GC-MS). Both approaches revealed elevated suberin levels in the sr3g mutant compared to Col-0. Since the overall suberin content was higher in the mutant under both control and salt stress conditions, we proposed that SR3G acts as a negative regulator of root suberization.
Lines 686-688 and Figure S24: The authors calculated water mass as FW-DW. A more standard approach for calculating water content is (FW-DW)/FW x 100. Please update the text or adjust the calculation accordingly. Additionally, if the goal is to test differences between WT and the mutant within each condition, a t-test would be a more appropriate statistical method.
We thank the reviewer. We added water content % to the figure S24. We kept the statistical test as it is as we wanted to be able to observe changes across conditions and genotypes.
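The two quantities discussed above, water mass (FW-DW) and water content as a percentage of fresh weight, can be sketched as follows. This is a minimal illustration of the formulas only; the function names and the example weights are hypothetical and are not measurements from the study.

```python
def water_mass(fresh_weight: float, dry_weight: float) -> float:
    """Water mass as originally calculated: FW - DW (same units as the inputs)."""
    return fresh_weight - dry_weight


def water_content_percent(fresh_weight: float, dry_weight: float) -> float:
    """Water content relative to fresh weight: (FW - DW) / FW x 100."""
    if fresh_weight <= 0:
        raise ValueError("fresh weight must be positive")
    if not 0 <= dry_weight <= fresh_weight:
        raise ValueError("dry weight must lie between 0 and the fresh weight")
    return (fresh_weight - dry_weight) / fresh_weight * 100


# Hypothetical example: a rosette weighing 500 mg fresh and 50 mg dry.
example_mass = water_mass(500, 50)
example_percent = water_content_percent(500, 50)
print(f"water mass = {example_mass} mg, water content = {example_percent:.1f}%")
```

Normalizing by fresh weight makes plants of different sizes comparable, which the raw difference FW-DW does not.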
Lines 633-635 states that "No significant difference was observed between sr3g-4 and Col-0 (Fig. S18), except for the Stress Tolerance Index (STI) calculated using growth rates of lateral root length and number." However, based on the Figure S18 legend and statistical analysis (i.e., ns), it appears that the sr3g-4 mutant shows no alterations in root system architecture compared to Col-0. Please revise the text to accurately reflect the results of the statistical analysis.
We thank the reviewer. We now fixed the text to reflect the result.
Lines 698-707: The statistical analysis does not support the reported differences in the Na+/K+ ratio for the single and double mutants of sr3g-5 and wrky75-3 (Fig. 10D, where levels connected by the same letters indicate they are not significantly different). Furthermore, the conclusion that "the SR3G mutation indeed compensated for the increased Na+ accumulation observed in the wrky75 mutant under salt stress" is also based on non-significant differences (Fig. S25B). Please revise the text to accurately reflect the results of the statistical analysis. Additionally, since each mutant is compared to the WT, I recommend using Dunnett's test for statistical analysis.
We thank the reviewer for their feedback. We have carefully revised the text to better support our findings. As previously mentioned, variations among samples are evident and are well-reflected across all our datasets. We have presented all data and focused on identifying trends within our samples to guide interpretation.
We observed that the SR3G mutation effectively compensated for the increased Na+ accumulation observed in the wrky75 mutant under salt stress. A closer examination of the shoot Na+/K+ ratio under 125 mM salt shows that the wrky75 single mutant has a higher Na+/K+ ratio (indicated by the letter "c") compared to Col-0 (indicated by "abc") and the two double mutants (also indicated by "abc"). Therefore, we have retained the statistical analysis as originally conducted, and maintain our conclusions as is.
Figure 6: data in panel C present the Na/K ratio, not Na+ content. Based on the statistical analysis of root Na+ levels presented in Fig. S17C, there is no significant difference between sr3g-5 and WT. Please update the title of Fig. 6. In addition, in panel A, the title of the Y-axis and figure legend should be "Lateral root growth rate" without the word length, and in panel C, the statistical analysis is missing.
We thank the reviewer. We updated Fig. 6 title and fixed the Y-axis in panel A, and added statistical letters to panel C. Legend was updated to reflect the changes.
Figure 7: Please clearly label the time points where significant differences between genotypes are observed for both early and late salt treatments. Was there a significant difference recorded between WT and sr3g-5 on day 0 under early salt stress? Such differences may arise from initial variations in plant size within this experiment, as indicated by Fig. 7B, where significant differences in rosette area are evident starting from day 0. Additionally, please indicate the statistical analysis in panel E.
We thank the reviewer for this suggestion. We updated the figure with a statistical test added to panel E. Although the difference in growth rate between the sr3g mutant and Col-0 is indeed significant at day 0, we would like to draw the reviewer’s attention to the fact that this growth rate was calculated over the 24 hours after adding salt stress. Therefore, this difference in growth rate is related to exposure to salt stress. Moreover, the growth rate of Col-0 and the sr3g mutant does not differ in the two other treatments (Control and Late Salt Stress), further supporting the conclusion that sr3g affects rosette size and growth rate only under early salt stress conditions.
We have also added the Salt Tolerance Index calculation to Figure S24 as additional evidence, controlling for potential differences in size between Col-0 and sr3g mutant.
Figure S17: statistical analysis is not indicated in panels A, B, and D.
We thank the reviewer for spotting that. We updated the figure with a statistical test.
Figures S21-23: The quality of these figures is insufficient, hindering the ability to effectively interpret the authors' results and main message. Furthermore, a Dunnett's test, rather than a t-test, is the appropriate statistical method for this analysis.
We thank the reviewer for this observation. We have now provided high-resolution versions of all supplemental figures. As we are comparing each genotype to Col-0 one by one, the results of individual t-tests are sufficient for this analysis.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
(1) The mechanism by which STAMBPL1 mediates GRHL3 transcription through its interaction with FOXO1 is not sufficiently discussed, especially in relation to how STAMBPL1 regulates FOXO1. Some reported effects are modest.
We appreciate the reviewer’s comments. In response, we have added a discussion of the potential mechanisms by which STAMBPL1 regulates FOXO1 transcriptional activity to the Discussion, highlighted in red on page 18, lines 342 to 352. The added text is as follows: “The transcriptional activity of FOXO1 is primarily regulated by its nucleocytoplasmic shuttling process (Van Der Heide, Hoekman et al. 2004). The PI3K/AKT pathway promotes the phosphorylation of FOXO1, resulting in the formation of a complex with members of the 14-3-3 family (including 14-3-3σ, 14-3-3ε, and 14-3-3ζ), which facilitates its export from the nucleus and inhibits its transcriptional activity (Huang and Tindall 2007, Tzivion, Dobson et al. 2011). It’s reported that TDAG51 prevents the binding of 14-3-3ζ to FOXO1 in the nucleus by interacting with FOXO1, thereby enhancing its transcriptional activity through increased accumulation within the nucleus (Park, Jeon et al. 2023). Our results indicate that the overexpression of STAMBPL1 and STAMBPL1-E292A did not affect the protein levels of FOXO1 (Fig.7E and Fig.S5E), but STAMBPL1 co-localizes with FOXO1 in the nucleus (Fig.7M) and interacts with it (Fig.7N and Fig.S5I-J). This suggests that STAMBPL1 enhances the transcriptional activity of FOXO1 on GRHL3 by interacting with nuclear FOXO1.” The result was added to Supplementary Figure 5 as Fig.S5E.
Reviewer #2 (Public review):
(1) A potential limitation of the study is the reliance on specific cellular and animal models, which may constrain the extrapolation of these findings to the broader spectrum of human TNBC biology. Furthermore, while the study provides evidence for a novel regulatory axis involving STAMBPL1, FOXO1, and GRHL3, the multifaceted nature of angiogenesis may implicate additional regulatory factors not exhaustively addressed in this research.
We appreciate the valuable suggestions provided by the reviewer. In the Discussion, we have added an in-depth discussion of the limitations of the study, as well as an analysis of the regulatory factors related to tumor angiogenesis, which are highlighted in red on pages 20 to 21, lines 396 to 412. The relevant content added is as follows: “In this study, we utilized two triple-negative breast cancer cell lines, HCC1806 and HCC1937, along with human primary umbilical vein endothelial cells (HUVECs) and a nude mouse breast orthotopic transplantation tumor model to investigate the regulatory mechanism by which STAMBPL1 activates the GRHL3/HIF1α/VEGFA signaling pathway through its interaction with FOXO1, thereby promoting angiogenesis in TNBC. The results of this study have certain limitations regarding their applicability to human TNBC biology. Furthermore, in addition to the HIF1α/VEGFA signaling pathway emphasized in this study, tumor cells can continuously release or upregulate various pro-angiogenic factors, such as Angiopoietin and FGF, which activate endothelial cells, pericytes (PCs), cancer-associated fibroblasts (CAFs), endothelial progenitor cells (EPCs), and immune cells (ICs). This leads to capillary dilation, basement membrane disruption, extracellular matrix remodeling, pericyte detachment, and endothelial cell differentiation, thereby sustaining a highly active state of angiogenesis (Liu, Chen et al. 2023). It is important to collect clinical TNBC tissue samples in the future to analyze the expression of the STAMBPL1/FOXO1/GRHL3/HIF1α/VEGFA signaling axis. Furthermore, patient-derived organoid and xenograft models are useful to elucidate the regulatory relationship of this axis in TNBC angiogenesis.”
Reviewer #3 (Public review):
The main weaknesses of this work are that the relevance of this molecular axis to the pathogenesis of TNBC is not clear, and it is not clearly established whether this is a regulatory pathway that occurs in hypoxic conditions or independently of oxygen levels.
(1) With respect to the first point, both FOXO1 and GRHL3 have been previously described as tumor suppressors, with reports of FOXO1 inhibiting tumor angiogenesis. Therefore, this work describes an apparently contradictory function of these proteins in TNBC. While it is not surprising that the same genes perform divergent functions in different tumor contexts, stronger evidence in support of the oncogenic function of these two genes should be provided to make the data more convincing. As an example, the data in support of high STAMBPL1, FOXO1 and GRHL3 gene expression in TNBC TCGA specimens provided in Figure 8 is not very strong and it is not clear what the non-TNBC specimens are (whether other breast cancers or other tumors, perhaps those tumors where these genes perform tumor suppressive functions). To strengthen the notion that STAMBPL1, FOXO1 and GRHL3 are overexpressed in TNBC, the authors could provide a comparison with normal tissue, as well as the analysis of other publicly available datasets (like the NCI Clinical Proteomic Tumor Analysis Consortium as an example). Finally, it is not clear what the basal protein expression levels of STAMBPL1 are in the cell lines used in this study, as based on the data presented in Figures 2D and F it appears that the protein is not expressed if not exogenously overexpressed. It would be helpful if the authors addressed this issue and provided further evidence of STAMBPL1 expression in TNBC cell lines.
We appreciate the suggestions. In this study, we utilized the BCIP online tool to analyze the Metabric database, incorporating adjacent normal tissues as controls. Although the expression levels of STAMBPL1, FOXO1, and GRHL3 in breast cancer tissues are not uniformly higher than those in adjacent tissues, their expression levels in triple-negative breast cancer (TNBC) are significantly elevated compared to non-TNBC. The results of this re-analysis have been added in Supplementary Figure 6 as Fig.S6A-C.
Regarding the basal protein expression levels of STAMBPL1 in the cell lines used in this study, Fig. 2A shows the endogenous level of STAMBPL1 in HCC1806 and HCC1937. For Fig. 2D and 2F, the overexpressed STAMBPL1 was fused with a 3xFlag tag, resulting in a higher molecular weight compared to the endogenous STAMBPL1. In the revised Figure 2, we have indicated the positions of the endogenous (Endo.) and exogenous (OE.) STAMBPL1 bands with arrows.
(2) Linked to these considerations is the second major criticism, namely that it is not made clear if this new regulatory axis is proposed to act in normoxic or hypoxic conditions. The experiments presented in this paper are performed in both conditions but a clear explanation as to why cells are exposed to hypoxia is not given and would be necessary being that HIF-1a transcription and not protein stability is being analyzed. Also, different hypoxic conditions are sometimes used, resulting in different mRNA levels of HIF-1a and its downstream targets and quite significant fluctuations within the same cell line from one experimental setting to the next. The authors should provide an explanation as to why experimental conditions are changed and, more importantly, the experiments presented in Figure 2 should be performed also in normoxia.
Thanks for the comments. Under normoxic conditions, HIF1α is recognized by pVHL due to hydroxylation and is rapidly degraded via the proteasomal pathway. In contrast, under hypoxic conditions, HIF1α protein accumulates. To investigate the effect of STAMBPL1 knockdown on HIF1A gene transcription levels, we conducted experiments under hypoxic conditions to avoid interference from the rapid degradation of HIF1α at the protein level, as shown in Figures 2B-C. Furthermore, under normoxic conditions, the overexpression of STAMBPL1 had been demonstrated to significantly enhance the protein levels of HIF1α and upregulate the transcription of VEGFA through HIF1α. To avoid the potential impact of excessive accumulation of HIF1α protein under hypoxic conditions on its protein level detection and the transcription of downstream VEGFA, the related experiments shown in Figure 2D-G were performed under normoxic conditions. We have explained the corresponding experimental conditions in the “Results” and “Figure legends” sections according to the reviewer's comments, highlighted in red.
(3) Another critical point is that necessary experimental controls are sometimes missing, and this is reducing the strength of some of the conclusions enunciated by the authors. As examples, experiments where overexpression of STAMBPL1 is coupled to silencing of FOXO1 to demonstrate dependency lack FOXO1 silencing in the absence of STAMBPL1 overexpression. Because diminishing FOXO1 expression affects HIF-1a/VEGF transcription even in the absence of STAMBPL1 (shown in Figure 7C, D), it is not clear if the data presented in Figure 7G are significant. The difference between HIF-1a expression upon FOXO1 silencing should be compared in the presence or absence of STAMBPL1 overexpression to understand if FOXO1 impacts HIF-1a transcription dependently or independently of STAMBPL1.
Thank you for this comment. For Fig.7G-H, our experimental objective was to determine whether STAMBPL1 activates HIF1A/VEGFA transcription via FOXO1. Therefore, under STAMBPL1 overexpression, we knocked down FOXO1 to investigate whether FOXO1 silencing could reverse the upregulation of HIF1A/VEGFA transcription induced by STAMBPL1 overexpression.
(4) In addition, some minor comments to improve the quality of this manuscript are provided.
(4.1) As a general statement, the manuscript is extremely synthetic. While this is not necessarily a negative feature, sometimes results are discussed in the figure legends and not in the main text (as an example, western blots showing HIF-1a expression) and this makes it hard to read through the data in an easy and enjoyable manner.
Thank you for this suggestion. We have revised the figure legends to make them clearer and more concise, highlighted in red.
(4.2) The effect of STAMBPL1 overexpression on HIF-1a transcription is minor (Figure 2). The authors should explain why they think this is the case and whether hypoxia may provide a molecular environment that is more permissive to this type of regulation.
Thank you for the comment. Under normoxic conditions, we conducted WB to examine the protein expression of HIF1α after the overexpression of STAMBPL1 and the knockdown of HIF1α. To visually illustrate the impact of STAMBPL1 overexpression on HIF1α protein levels, as well as the effectiveness of HIF1α knockdown, we annotated the grayscale analysis results of the bands in Figures 2D and 2F. As the reviewer pointed out, under normoxic conditions, HIF1α is rapidly degraded, which may explain why the upregulation of HIF1α protein levels by STAMBPL1 overexpression is not very pronounced.
(4.3) HIF-1a does not appear upregulated at the protein level by STAMBPL1 or GRHL3 overexpression, even though this is stated in the legends of Figures 2 and 6. The authors should show unsaturated western blot images and provide quantitative data of independent experiments to make this point.
Thank you for this comment. We have added the unsaturated image of HIF1α into Fig.2D, and performed a grayscale analysis of the HIF1α bands in Fig.2D and Fig.6A to indicate the relative protein level of HIF1α.
Reviewer #1 (Recommendations for the authors):
(1) The authors previously reported that STAMBPL1 stabilizes MKP1 in TNBC. However, in this study, they focus on HIF1a. Given that STAMBPL1 affects HIF1a expression, it would be valuable to examine the levels of ROS in TNBC cells with or without STAMBPL1, as ROS is known to influence HIF1a stability.
Thank you for your comments. It’s known that STAMBPL1 functions as a deubiquitinating enzyme. However, our study reveals that the upregulation of HIF1α by STAMBPL1 is independent of its deubiquitinating activity. This conclusion is supported by the observation that overexpression of the deubiquitinase active site mutant, STAMBPL1-E292A, also upregulated HIF1α expression (Figure 1F). Moreover, STAMBPL1 overexpression enhanced HIF1α transcription (Figures 4E and S3E), while STAMBPL1 knockdown was able to inhibit the transcription of HIF1α (Figures 2B-C). These results indicate that STAMBPL1 mediates the transcription of HIF1α but does not affect the stability of HIF1α. For these reasons, we think that it is unnecessary to examine the ROS levels.
(2) Figure 1A: The regulation of HIF1a mRNA by STAMBPL1, but not its protein levels, could be better addressed by using MG132 to rule out the impact of protein degradation.
Thanks for this comment. Under normoxic conditions, the oxygen-sensitive prolyl hydroxylases PHD1-3 act on HIF1α, specifically inducing hydroxylation at the proline 402 and 564 residues. These hydroxylated residues are recognized by the pVHL/E3 ubiquitin ligase complex, leading to ubiquitination and subsequent degradation via the proteasome pathway. Conversely, under hypoxic conditions, PHD1-3 are inactivated, and non-hydroxylated HIF1α is not recognized by the pVHL/E3 ubiquitin ligase complex, thereby avoiding ubiquitination and proteasomal degradation (DOI: 10.1073/pnas.95.14.7987, DOI: 10.1515/BC.2004.016, and DOI: 10.1042/BJ20040620). The mechanism of HIF1α accumulation under hypoxia is analogous to the action of the proteasome inhibitor MG132. When we treated cells with hypoxia, the ubiquitination and proteasomal degradation pathway of HIF1α was blocked. At this time, STAMBPL1 knockdown could downregulate the expression of HIF1α (Fig.1A). Meanwhile, since the knockdown of STAMBPL1 significantly downregulated the mRNA level of HIF1α under hypoxia (Fig.2B-C), we concluded that STAMBPL1 affects the expression of HIF1α by mediating its transcription. In addition, MG132 will block all proteasomal substrate degradation and may affect HIF1α mRNA levels indirectly.
(3) Figure 2D and 2F: The effect of STAMBPL1 in promoting HIF1a expression is quite mild, and the effect of HIF1a knockdown is also modest. Given the high levels of STAMBPL1 in TNBC cell lines (Figure 2A), it would be better to repeat these experiments in a STAMBPL1-knockdown setting for clearer insights.
We appreciate this insightful suggestion. Considering that the regulation of HIF1α expression by STAMBPL1 occurs at the transcriptional level, and to prevent excessive accumulation of HIF1a during hypoxia that could confound the effect of STAMBPL1 overexpression on HIF1α regulation, we opted to overexpress STAMBPL1 under normoxic conditions and subsequently knock down HIF1α, as shown in Fig.2D and Fig.2F. This approach allowed us to observe that STAMBPL1 overexpression can upregulate HIF1a expression to some extent. Additionally, in response to the reviewer's suggestion to knock down STAMBPL1, we have conducted the corresponding experiments, with results presented in Fig.1A-E and Fig.2B-C.
(4) Figure 4A: Why does the RNA-seq pattern differ significantly between the two siRNAs? Additionally, the authors should clarify why they focus primarily on transcription factors, as other mechanisms, such as mRNA stability and RNA modification, could also influence gene transcription.
Thank you for this comment. Two siRNAs for STAMBPL1 were designed and synthesized by a biotechnology company. Although both siRNAs target STAMBPL1, they target different sequences. While both siRNAs effectively knocked down STAMBPL1 (Fig. 1A and Fig. 2A), the possibility of off-target effects cannot be completely ruled out. Therefore, we needed to use two siRNAs simultaneously for RNA-seq, ensuring that the gene expression changes observed are due to the knockdown of STAMBPL1 by focusing on genes downregulated by both siRNAs. Additionally, among the 27 genes downregulated by both siRNAs, only 18 genes were annotated. Of these 18 genes, except for GRHL3, which is a transcription factor reported to be involved in gene transcription regulation, the remaining 17 genes have no documented association with RNA transcription, stability, or modification. Therefore, we focused on the GRHL3 gene.
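The filtering logic described above (keeping only genes downregulated by both siRNAs) can be sketched as a set intersection. This is a minimal illustration, not the analysis pipeline used in the study: the gene names other than GRHL3, the fold-change values, and the `-1.0` log2 threshold are all hypothetical placeholders.

```python
# Hypothetical log2 fold changes (knockdown vs. control) for two independent siRNAs.
# Gene names other than GRHL3, and all numeric values, are illustrative placeholders.
LOG2FC_SIRNA_1 = {"GRHL3": -1.8, "GENE_A": -1.2, "GENE_B": -0.1, "GENE_C": -2.0}
LOG2FC_SIRNA_2 = {"GRHL3": -1.5, "GENE_A": -1.1, "GENE_B": -1.9, "GENE_C": 0.3}


def downregulated(log2fc, threshold=-1.0):
    """Return the set of genes whose log2 fold change is at or below the threshold."""
    return {gene for gene, value in log2fc.items() if value <= threshold}


def shared_downregulated(first, second, threshold=-1.0):
    """Genes downregulated by both siRNAs; hits seen with only one siRNA are dropped."""
    return downregulated(first, threshold) & downregulated(second, threshold)


shared = sorted(shared_downregulated(LOG2FC_SIRNA_1, LOG2FC_SIRNA_2))
print(shared)  # ['GENE_A', 'GRHL3']
```

In this toy input, GENE_B and GENE_C each respond to only one siRNA and are therefore excluded, which is the behavior expected of candidate off-target hits.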
(5) Figure 5G: To investigate whether STAMBPL1 and GRHL3 function epistatically in the pathway, a double knockdown of STAMBPL1 and GRHL3 should be examined. Additionally, a double knockdown of STAMBPL1 and FOXO1 should be assessed.
Thank you for your comment. In Figure 5G, we aimed to assess the knockdown efficiency of GRHL3 using siRNAs. To determine whether STAMBPL1 upregulates the HIF1a/VEGFA axis via GRHL3, we overexpressed STAMBPL1 and subsequently knocked down GRHL3. Our findings indicated that STAMBPL1 overexpression indeed enhanced the HIF1a/VEGFA axis, which was rescued by the knockdown of GRHL3, as shown in Figures 4E-F and S3E-F. Similarly, upon overexpressing STAMBPL1 and knocking down FOXO1, we observed that STAMBPL1 overexpression increased the GRHL3/HIF1a/VEGFA axis, which could also be rescued by knocking down FOXO1, as shown in Figures 7F-H. These results suggest that STAMBPL1 upregulates the GRHL3/HIF1a/VEGFA axis through FOXO1. We therefore do not think that a double knockdown of STAMBPL1 together with FOXO1 or GRHL3 is the appropriate approach.
(6) Figure 7: It remains unclear how STAMBPL1 regulates FOXO1. The authors show that STAMBPL1 increases the transcriptional activation of FOXO1 at the GRHL3 promoter, but it is not clear if STAMBPL1 is required for FOXO1 binding to the GRHL3 promoter. To address this, STAMBPL1-knockdown should be included to examine its effect on FOXO1 binding to the GRHL3 promoter. Furthermore, it would be important to determine whether the STAMBPL1-FOXO1 interaction is essential for GRHL3 transcription. Since the interaction sites of STAMBPL1-FOXO1 have been mapped, a mutant disrupting the interaction would provide better insight into how STAMBPL1 promotes GRHL3 transcription by interacting with FOXO1.
Thank you for this comment. It has been reported that FOXO1 promotes the transcription of the GRHL3 gene by interacting with its promoter (DOI: 10.1093/nar/gkw1276). We also verified through ChIP assay that FOXO1 can bind to the promoter of the GRHL3 gene (Fig.7I) and mediate its transcription. Specifically, knocking down FOXO1 significantly down-regulated the mRNA level of GRHL3 (Fig.7B), and the GRHL3 promoter lacking the FOXO1 binding site almost completely lost transcriptional activity (Fig.7J), indicating that FOXO1 is crucial for the transcriptional activity of the GRHL3 promoter. Overexpression of STAMBPL1 enhances the activating effect of FOXO1 on the transcriptional activity of the GRHL3 promoter (Fig.7K). However, the up-regulation of GRHL3 transcription by overexpression of STAMBPL1 is completely blocked by FOXO1 knockdown (Fig.7F), and the knockdown of FOXO1 essentially blocks the binding of STAMBPL1 to the GRHL3 promoter (Fig.7L), suggesting that STAMBPL1 affects the transcriptional expression of GRHL3 in a FOXO1-dependent manner. As we added in the Discussion, the transcription factor activity of FOXO1 is mainly regulated by its nucleocytoplasmic shuttling, and the accumulation of FOXO1 in the nucleus can enhance its transcription factor activity (DOI: 10.1042/BJ20040167; DOI: 10.15252/embj.2022111867). In our research, neither STAMBPL1 nor its deubiquitinase active-site mutant affected the expression of FOXO1 (Fig.S5E), but STAMBPL1 and FOXO1 co-localized in the nucleus (Fig.7M), and they interacted with each other (Fig.7N, Fig.S5I-J). Therefore, we speculate that STAMBPL1 interacts with FOXO1 in the nucleus, obstructs the binding of FOXO1 to members of the 14-3-3 family, and inhibits the nuclear export of FOXO1, thereby enhancing its transcriptional activity. This interaction between STAMBPL1 and FOXO1 does not necessarily affect the binding of FOXO1 to DNA, including the GRHL3 promoter.
(7) Figure 8 A-C: What is the correlation among the expressions of STAMBPL1, FOXO1, and GRHL3 in TNBC tumors compared to non-TNBC tumors?
Thank you for your comment. In Figure 8A-C, we analyzed the expression levels of STAMBPL1, FOXO1, and GRHL3 in both TNBC and non-TNBC samples using the BCIP. The results indicate that the expression levels of these three genes are significantly higher in TNBC compared to non-TNBC samples. To investigate the correlation among the expressions of STAMBPL1, FOXO1, and GRHL3 in TNBC versus non-TNBC, we further utilized the Metabric data. Apart from a positive correlation trend between STAMBPL1 and GRHL3 expression in TNBC clinical samples (Pearson R = 0.27), no significant correlation was observed among the expression levels of STAMBPL1, FOXO1, and GRHL3 in TNBC and non-TNBC clinical samples (as shown in Author response image 1 below). Since STAMBPL1 and FOXO1 act as proteins in the transcriptional regulation of the GRHL3 gene, whereas the data obtained from the Metabric database are the transcript levels of these three genes, this might be the reason why no correlation between their expressions was observed.
Author response image 1.
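For readers unfamiliar with the statistic quoted above, the Pearson correlation coefficient between two expression vectors can be computed as in the following minimal sketch. It is an illustration only: the function name `pearson_r` and the example values are hypothetical and are not taken from the Metabric data or from the tool used for the analysis.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample expression values for two genes (placeholders only).
gene_1 = [5.1, 6.3, 4.8, 7.0, 5.9]
gene_2 = [2.2, 2.9, 2.5, 3.1, 2.4]
r = pearson_r(gene_1, gene_2)
print(round(r, 2))
```

A value near 0 indicates no linear relationship between the two transcript levels, while values toward +1 or -1 indicate increasingly strong positive or negative linear relationships.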
Reviewer #2 (Recommendations for the authors):
The authors have thoroughly elucidated the role of STAMBPL1 in TNBC. However, it would be beneficial to discuss the potential clinical implications of these findings, such as how targeting STAMBPL1 or FOXO1 might impact current treatment strategies for TNBC. However, several issues need to be addressed.
Major:
(1) While the study provides an exhaustive analysis of the molecular mechanisms, a comparison with other subtypes of breast cancer could enhance our understanding of the specificity of the STAMBPL1/FOXO1/GRHL3/HIF1α/VEGFA axis in TNBC.
Thank you for your comment. According to a previous report, STAMBPL1 is significantly associated with the mesenchymal characteristics of breast cancer (DOI: 10.1038/s41416-020-0972-x). We utilized cBioPortal (http://www.cbioportal.org/) to analyze the expression of STAMBPL1 across various clinical subtypes of breast cancer. The results indicated that STAMBPL1 is highly expressed in invasive breast cancer, which has been added to Supplementary Figure 6 as Fig.S6D. Given that TNBC is an aggressive type of invasive breast cancer, we further examined the expression of STAMBPL1 in TNBC compared to non-TNBC using BCIP (http://omicsnet.org/bcancer/database). Our findings revealed that the expression level of STAMBPL1 in TNBC was elevated relative to its levels in non-TNBC (Fig.8A). Additionally, since tumor angiogenesis is a critical factor influencing the metastasis of cancer cells, our study focused specifically on the pro-angiogenic effects of STAMBPL1 in TNBC.
(2) The authors might consider discussing any potential off-target effects of the siRNA and shRNA used in the study to bolster the conclusions drawn from the knockdown experiments.
We appreciate the reviewer's suggestion. It is well known that siRNAs and shRNAs can have off-target effects. To address this concern, we employed two siRNAs for each gene knockdown in our study. Specifically, we knocked down STAMBPL1, FOXO1, GRHL3, and HIF1A in two TNBC cell lines, HCC1806 and HCC1937, using two siRNAs per gene. Except for siRNA#1 targeting HIF1A, which did not show a significant knockdown effect in HCC1806 cells (Fig.2D and Fig.6A), the other siRNAs effectively knocked down their respective genes, and the resulting phenotypes were consistent. As shown in Fig.2F and Fig.S4H, siRNA#1 targeting HIF1A had a significant knockdown effect in HCC1937 cells. The lower knockdown efficiency of this siRNA in the HCC1806 cell line might be attributed to cell-specific factors.
(3) It would be advantageous if the authors could provide further details on the patient demographics and tumor characteristics in the TCGA database analysis to better comprehend the clinical relevance of their findings.
Thanks for the reviewer's suggestions. We have now indicated the number of clinical samples in each group in the legend of Fig.8A-C. Since we utilized the BCIP online database to analyze and compare the expression levels of the three genes STAMBPL1, FOXO1, and GRHL3 in TNBC and non-TNBC, we are unable to obtain more specific information regarding the tumor characteristics of each sample. However, our analysis clearly shows that the expression levels of these three genes are significantly higher in TNBC compared to non-TNBC.
(4) The authors should consider discussing any limitations regarding the generalizability of their findings, such as potential variations among different TNBC subtypes or the specificity of their observations to certain stages of the disease.
We appreciate the reviewer's comment. Accordingly, we have added a discussion on the limitation of this study in Discussion, highlighted in red font on pages 20 to 21, lines 396 to 412. In addition, we utilized the bc-GenExMiner online database to conduct a comparative analysis of STAMBPL1 expression in different subtypes of non-TNBC and TNBC. The result indicates that STAMBPL1 is highly expressed in mesenchymal-like and basal-like TNBC, which has been added into Supplementary Figure 6 as Fig.S6E. Since these two subtypes of TNBC are highly invasive and metastatic, it suggests that targeting the signaling pathway of STAMBPL1/FOXO1/GRHL3/HIF1α/VEGFA may offer clinical benefits for patients with invasive TNBC.
Minor:
The paper is generally well-written, but it's crucial to maintain vigilance for subject-verb agreement, proper use of tense, and consistent terminology.
Thank you for this suggestion. We have thoroughly revised the article for issues such as grammar, including tense, subject-verb agreement, and terminology.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Recommendations for the authors:
Reviewing Editor Note:
The two reviewers have provided thoughtful and constructive feedback that we hope will be of use to the authors to improve their manuscript.
Reviewer #1 (Recommendations For The Authors):
The section on "Circuit evolution by duplication and divergence" (starting on line 622) should cite:
Chakraborty, Mukta, and Erich D. Jarvis. "Brain evolution by brain pathway duplication." Philosophical Transactions of the Royal Society B: Biological Sciences 370, no. 1684 (2015): 20150056.
and
Roberts, Ruairí JV, Sinziana Pop, and Lucia L. Prieto-Godino. "Evolution of central neural circuits: state of the art and perspectives." Nature Reviews Neuroscience 23, no. 12 (2022): 725-743.
It should also reference that the concept originated from genetics:
Ohno, Susumu. Evolution by gene duplication. Springer Science & Business Media, 1970
These papers have now been cited: “Duplication and divergence of circuits was also proposed as a possible mechanism for the evolution of brain pathways for vocal learning in song-learning birds, spoken language in humans [@chakraborty2015brain] and other circuits [@roberts2022evolution].”
and: “Our reconstructions identified a potential case for circuit evolution by duplication and divergence [@tosches2017developmental; @roberts2022evolution], a concept that originated from genetics [@ohno1970evolution].”
The terms outgoing and incoming synapses were confusing. The more common terminology is pre and postsynaptic elements. For example, in Fig 1, the label Sensory neuron outgoing and incoming was confusing because I mistakenly thought it was referring to the neurons and I could not figure out what an outgoing sensory neuron was.
We have now changed ‘incoming’ to ‘postsynaptic’ and ‘outgoing’ to ‘presynaptic’.
In L-O, there should be an indicator on the figures that they refer to the locations of synaptic sites, as it does in F.
We have now replaced the labels ‘incoming’ and ‘outgoing’ with ‘postsyn’ and ‘presyn’, respectively, for Figure 1 panels L-O to make it clear that these are synaptic sites.
Figure 2. - last panel of muscle motor - it would be helpful to have names of muscles instead of just having 5 'muscle motor' of different colors
Each muscle-motor module contains a large number of muscles and motor neurons of several types. Labelling them by the name of individual muscle types is therefore not practical at this resolution. The three-day-old Platynereis larva has 53 different muscle cell types. Their anatomy and classification, together with the details of motoneuron innervation, have been described in detail elsewhere (Jasek et al 2022 https://doi.org/10.7554/eLife.71231).
Figure 3. D and E are hard to understand from the figure; The shading is the number of neurons; that scale should be shown somewhere.
We are not sure we understand the comment. These plots are histograms that show the distribution of the number of cells across categories. The y axis is the number of neuronal or non-neuronal cell types in each bin.
PageRank is an algorithm that Google uses. In Figure 4, it seems to be used to indicate centrality. A brief explanation in the text would be useful.
We have now added an explanation of the centrality measures used. “PageRank is an algorithm used by Google to rank webpages and scores the number and quality of the incoming links of a node [@page1999pagerank], betweenness centrality measures the number of shortest paths that pass through a node in a graph [@freeman1977set], and authority measures the extent of inputs to a node by hubs in a network [@kleinberg1999authoritative].”
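To make the PageRank definition above concrete, the following is a minimal power-iteration sketch on a toy directed graph. It is an illustration under stated assumptions, not the implementation used for the connectome analysis: the node names, the edge list, the damping factor of 0.85, and the fixed iteration count are all hypothetical choices.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is redistributed evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        # Each node passes its rank on, split equally among its outgoing edges.
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Toy network with hypothetical node names (not cell types from the connectome).
toy_edges = [
    ("sensory", "inter"),
    ("sensory", "motor"),
    ("inter", "motor"),
    ("motor", "muscle"),
]
ranks = pagerank(toy_edges)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(node, round(score, 3))
```

In this toy graph the node with no incoming links receives the lowest score, and nodes that collect input from well-connected nodes score higher, which is the sense in which PageRank reflects both the number and the quality of incoming links.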
Figure 5. The labels on some images are not clear. They are on top of each other and of elements of the figure.
We have now moved the position of the labels to minimise overlap. We have also added an interactive html file with the network shown in Figure 5 panel A to help the exploration of the network. Added: “Figure 5—source data 1. Interactive html file with the network shown in panel A.”
There are differences in line thickness in several figures, such as Figure 9 (A and B) and Figure 12 (D and I and N), that presumably indicate numbers of synaptic contacts. It would be useful to know what the scale is.
We have now added labels of line thickness to the networks in Figure 4, Figure 5 – figure supplement 2, Figure 9, Figure 12, Figure 7 – figure supplement 1, Figure 15 and Figure 16.
Reviewer #2 (Recommendations For The Authors):
(1) Suggestions for improved or additional experiments, data, or analyses.
(2) Recommendations for improving the writing and presentation.
Perhaps we require a comprehensive inventory detailing all the innovations compared to previous, more limited publications, particularly in relation to the 2017 publication and 2020 preprint.
We have provided this detail in Supplementary table 1 that lists all cell types. We included the reference for previously published cell types in the ‘reference’ column except for those that were also described in the 2020 preprint. The current manuscript is a greatly revised and extended version of the original 2020 preprint. In addition, in the online connectome database (https://catmaid.jekelylab.ex.ac.uk), all cell types that were previously published are annotated with the notation ‘FirstAuthor_et_al_year’.
It is a bit frustrating given the huge amount of graphs, analyses, tables, and networks that are presented in the manuscript, we do not see much of the original EM pictures except for a few examples of cell type blow-ups. It would be useful for future workers in the field to have eventually a sort of compendium of how the authors actually recognized each cell type, without having to connect to the original CATMAID annotation.
Most neuronal cell types (with the exception of some characteristic sensory neurons such as photoreceptor cells and mechanosensory cells) were not classified based on ultrastructural features, but on features of neurite morphology, body position and synaptic connectivity. It would therefore not be possible to represent most of the cell types with a single layer of an original EM picture. However, in order to make the morphological skeleton characteristics more accessible to the reader, we have now added a comprehensive website (https://jekelylab.github.io/Platynereis_connectome/) including all cell types together with their interactive 3D rendering.
“Interactive 3D morphological renderings of each cell type together with their main annotations can also be explored on a webpage (https://jekelylab.github.io/Platynereis_celltype_compendium.html).”
The Platynereis 3-day larva is obviously only one transient stage in the developmental cycle of the animal, and it is a very specialized stage (called metatrochophore in annelid jargon), during which the animal does not yet feed, relying instead on its copious yolk. Moreover, it is a stage whose purpose is limited to dispersion, with no complex behavior or social interaction that later stages are going to display. While this work represents a substantial leap forward in understanding neural integration in a whole animal, it must be kept in mind that compared to an adult or growing juvenile, there are likely a considerable number of cells, cell types, and neural modules missing in this larva. This is clearly not a weakness of this study per se, but readers may find it interesting to be presented with this perspective and therefore more biological details about the Platynereis life cycle and associated behaviors.
Obviously, understanding how the constantly developing nervous system of a worm-like Platynereis gets reshuffled in time will be a great subject to investigate. The authors mention that the 3-day larva displays more than 4000 neuronal cells not yet differentiated. Readers may be interested in their location. Are there niches of neural stem cells? A description of what may be missing from the larva in terms of cell types compared to the adult may be useful.
We have now added further explanation into the Introduction about the early nectochaete larval stage: “The early nectochaete larva represents a transient dispersing stage in the life cycle of Platynereis. During this stage the larvae do not feed yet but rely on maternally provided yolk. Compared to the juvenile and adult stages it is expected that a considerable number of cell types will be only developing or completely missing at this stage. Three-day-old larvae do not yet have sensory palps and other sensory appendages (cirri), they do not crawl or feed and lack visceral muscles and an enteric nervous system.”
The location of developing neurons is shown in Figure 3—figure supplement 1 panel I.
Juvenile or adult cell types have not yet been described in any detail that is close to the level of detail we now provide for the nectochaete larva, therefore a meaningful comparison of cell-type complements across stages is not yet feasible.
(3) Minor corrections to the text and figures.
Figure 1: "outgoing" not "outgoung" in panels M, O, Q.
Corrected
Line 128: We may need a precise definition of "cable length".
We have included a definition of cable length in the Methods section under a new subheading ‘Quantitative analysis of neuron morphologies’.
In all Figures: information on the orientation of the worm's view is sometimes missing in figures, which could make interpretation difficult for the reader, especially for anterior views with no D/V indication. The authors should indicate the orientation for each panel or provide a general orientation in the figure if all panels are oriented the same.
We have now added D/V or A/P indication to all figures.
Figure 23: "right view, left side" is confusing.
We have changed this to “Each panel shows a ventral (left panel) and a left-side view (right panel).”
Line 406 : the first mention of the Platynereis cryptic segment, as far as I know, is Saudemont et al, 2008.
Thank you for pointing this out. We added the citation.
Figure 45: descending and decussating, 2nd and 3rd line of the legend.
Corrected
The format of the data source tables is not homogenized, with some files in Excel format and others in plain comma-separated format.
We have homogenized the file formats of the supplements and source data. We now provide .csv files, or .rds (R data format) files for the more complex data, such as tibble graphs that cannot be represented in a simple .csv format.
-
-
www.biorxiv.org www.biorxiv.org
-
Author Response:
Reviewer #1 (Public review):
[…] Strengths:
The strategies used for increasing PCR sensitivity offer the potential for enhancing treatment monitoring and understanding the dynamics of parasite-host interactions in chronic Chagas disease.
Weaknesses:
While the study offers valuable insights for research in T. cruzi infection dynamics and monitoring of trypanocidal drug efficacy, its broader adoption depends on the development of cost-effective and scalable alternatives to labor-intensive techniques such as sonication, currently required for DNA fragmentation. Additionally, the reliance on blood cell pellets and the DNA fragmentation protocol introduces extra processing steps, which may not be feasible for many clinical laboratories, particularly in resource-limited endemic areas that require simpler and more streamlined procedures.
We agree that this methodology is likely to be used primarily as a research tool and for selective use in the field (e.g. drug trials) and is unlikely to be standard in many clinical labs, irrespective of resources. We note the protocol does not require cell pellets (although that fraction provides the highest sensitivity) and that the fragmentation step is not at all labor-intensive. But to achieve consistent detection across the range of parasite burden known to occur in chronic T. cruzi infection, appropriately processed DNA from higher volumes of blood than are now routinely used for detection of T. cruzi will be required.
Reviewer #2 (Public review):
[…] Strengths:
The primary strength of this study lies in its methodological novelty, particularly the combination of multiple parallel PCR reactions and DNA fragmentation to enhance sensitivity. It is a sort of brute-force method for detecting the parasite. This approach promises the detection of parasitic DNA at levels significantly lower than those achievable with standard qPCR methods. Additionally, the authors demonstrate the utility of this method in tracking parasitemia dynamics and post-treatment responses in macaques and dogs, providing valuable insights for both research and clinical applications.
Weaknesses:
(1) Methodological Concerns on detection and quantification limits
Some methodological inconsistencies and limitations were observed that merit consideration. In Figure 1, there is a clear lack of consistency with theoretical expectations and with the trends observed in Figure 4A. Based on approximate calculations, having 10^-7 parasite equivalents with 100,000 target copies per parasite implies an average of 0.01 target copies per reaction. This would suggest an amplification rate of approximately 1 in 100 reactions, yet the observed 30% amplification appears disproportionately high. In addition, Figure 4A (not fragmented) shows lower values of positivity than Figure 1 for 10^-5 and 10^-6 dilutions showing this inconsistency among experiments. Some possible explanations could account for this inconsistency: (1) an inaccurate quantification of the starting number of parasites used for serial dilutions, or (2) random contamination not detected by negative controls, potentially due to a low number of template molecules.
Similarly, Figure 5B presents another inconsistency in theoretical expectations for amplification. The authors report detecting amplification in reactions containing 10^-9 parasites after DNA fragmentation. Based on the figure, at least 3 positives (as I can see because raw data is not available) out of 388 PCRs are observed at this dilution. Assuming 100,000 copies of satellite DNA per parasite, the probability of a single copy being present in a 10^-9 dilution is approximately 1/10,000. If we assume this as the probability of amplification of a PCR (an approximation), by using a simple binomial calculation, the probability of at least 3 positive reactions out of 388 is approximately 9.39 x 10^-6 (in ideal conditions, likely lower in real-world scenarios). This translates to a probability of about 1 in 100,000 to observe such frequency of positives, which is highly improbable and suggests either inaccuracies in the initial parasite quantification or issues with contamination. In addition, at 10^-6 PE/reaction (the proposed limit of quantification) it is observed that 40% of repetitions are amplified. The number of repetitions is not specified but probably more than 50 according to the graph. Such dilution implies 0.1 targets per reaction (assuming 100,000 copies divided by 10^6), which means a total of 5 target molecules to distribute among the reactions (0.1 targets multiplied by 50 reactions). It seems highly improbable that 40% of the reactions (20/50) would amplify under the described conditions. Even considering 200,000 target copies per parasite implies 0.2 targets per reaction and an average of 10 molecules to distribute among 50 reactions.
The approximate probability of the observation of at least 20/50 positives can be calculated by determining the probability of a reaction to receive targets by assuming a random distribution of the targets among the tubes, p= 1 - (1 - 1/50)^10, and then by using a binomial distribution to determine the probability that at least 20 reactions receive at least one target copy. The probability of at least 20/50 positive reactions in a dilution of 10^-6 parasites (200,000 target copies per parasite) is 0.00028. Consequently, the observed result is highly unlikely.
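The two binomial estimates in this comment can be reproduced with a short calculation. The following is a minimal Python sketch, assuming (as the reviewer does) that reactions are independent and that any reaction receiving at least one target copy amplifies. The function name binom_tail is illustrative and does not come from the manuscript or the review.

```python
from math import comb


def binom_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p) by summing the upper tail."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Case 1 (reviewer's numbers): 388 PCRs, each positive with probability 1/10,000.
p_case1 = binom_tail(n=388, k=3, p=1e-4)

# Case 2 (reviewer's numbers): 10 target copies spread at random over 50 reactions,
# so a given reaction receives at least one copy with probability 1 - (1 - 1/50)**10.
p_reaction = 1.0 - (1.0 - 1.0 / 50) ** 10
p_case2 = binom_tail(n=50, k=20, p=p_reaction)

print(f"P(at least 3 of 388 positive) = {p_case1:.3g}")
print(f"P(at least 20 of 50 positive) = {p_case2:.3g}")
```

Under these independence assumptions the two printed values should land close to the figures quoted in the comment (about 9.39 x 10^-6 and 0.00028). The sketch says nothing about whether independence holds for the actual templates.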
We disagree with the reviewer on both of these points.
First, the mean (S.D.) Cq values of the 10^-3 PE unfragmented dataset in Figure 1 (40 replicates) and Figure 4a (88 replicates) are nearly identical at 30.02 (0.5813) and 30.21 (1.071), respectively, demonstrating a highly accurate initial quantification of parasites to make these 2 separate dilution series (reviewer’s point 1.1). At this concentration of parasites in blood, and with unfragmented DNA, each aliquot for PCR has an equal chance of receiving some parasite DNA (hence all reactions are positive) and a reasonably good chance of receiving similar amounts of parasite DNA (the Cq values cluster with relatively low S.D.). However, further dilutions from this parasite input result in some aliquots that receive no parasite DNA and a much wider variation in the amount of parasite DNA/aliquot in samples that are positive (Cq mean (S.D.) of 34.47 (2.732) for 10^-4 in Figure 1). This result demonstrates that these dilution series do not follow a binomial distribution as suggested by the reviewer. This is likely because the templates for amplification are not independently distributed. Instead, they are known to be clustered (on individual chromosomes or chromosome fragments) in the DNA. Indeed, this observation of widely varying Cq values in dilutions below 10^-3 strongly suggested this clustering and was the impetus for fragmenting the DNA (see manuscript line 209). The impact of declustering achieved by DNA fragmentation supports this conclusion: when the DNA is fragmented, 100% of aliquots are positive at 10^-4 PE, 10X less than in unfragmented samples, and the Cq values are tightly grouped (mean 33.47, S.D. 0.3358). This indicates that the unequal distribution of targets upon dilution, rather than counting or pipetting errors or contamination, is responsible for the lack of a binomial distribution of targets with increasing dilution.
Thus, when entities are clustered and can’t be fully declustered, a simple binomial (or Poisson) distribution of counts cannot be assumed in the serial dilutions. Clustering results in more complicated distribution patterns, and it becomes difficult to predict precisely how these clusters will distribute from one dilution to the next (and thus differences in proportions of positives in different dilution series, as observed herein).
This clustering and unequal distribution of amplification targets also addresses the reviewer’s second comment with respect to the unlikelihood of detecting at least one positive at a high dilution. If we accept the reviewer’s estimate of 100,000 copies of target per parasite, then at 10^-4 PE/aliquot (a dilution at which all aliquots are PCR positive in the fragmented samples; Figures 4a and 5b) each aliquot would be expected to have on average 10 target sequences, and the chances of detecting at least one positive reaction from 400 aliquots would be 98% for the 10^-7 dilution, 33% for 10^-8 and 4% for 10^-9 PE per aliquot, respectively. These percentages would change (increase) with a higher copy number of targets per genome, and if the targets are still clustered to some degree (which we would expect they would be even in the fragmented DNA). Thus, the chance of detecting positive PCRs at 10^-9 PE is low, but it is not “highly improbable”.
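The response does not state which model produced these percentages. One way to arrive at figures of this size is a Poisson estimate of the chance that at least one target copy lands anywhere among the 400 aliquots. The Python sketch below makes that assumption explicit; the function name and its defaults are illustrative only.

```python
from math import exp


def p_any_positive(pe_per_aliquot: float,
                   copies_per_parasite: float = 100_000,
                   n_aliquots: int = 400) -> float:
    """Chance that at least one of n_aliquots holds >= 1 target copy.

    Assumes Poisson-distributed targets (our assumption, not stated in the
    response) and that any aliquot with a target copy is PCR positive.
    """
    expected_copies_total = pe_per_aliquot * copies_per_parasite * n_aliquots
    return 1.0 - exp(-expected_copies_total)


for pe in (1e-7, 1e-8, 1e-9):
    print(f"{pe:.0e} PE/aliquot: {100 * p_any_positive(pe):.0f}%")
```

With 100,000 copies per parasite and 400 aliquots, the expected total number of target copies is 4, 0.4 and 0.04 for the three dilutions, which is what drives the steep drop in detection probability.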
Taking the reviewer’s second example of the frequency of positive reactions at 10^-6 PE and the assumption of 200,000 target copies per genome (referring to Fig 5B, we believe), the mean template copies per aliquot would be 0.2 at this dilution. Assuming a negative binomial distribution of the still-clustered templates (although mechanically fragmented, it would be highly unlikely that they would be completely declustered), the probability of an aliquot being positive at the 10^-6 PE dilution would be 16.7%. Our results in Figure 4A (26%) and Figure 5B (37.5%) are slightly higher but not “highly unlikely” as suggested.
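The negative binomial estimate can be sketched as follows. The response does not give the dispersion parameter, so the value k = 1 used here is an assumption chosen because it reproduces the quoted 16.7% for a mean of 0.2 copies per aliquot. The function name is likewise hypothetical.

```python
def p_positive_negbin(mean_copies: float, k: float) -> float:
    """P(aliquot holds >= 1 template) for negative binomial template counts.

    mean_copies: mean template copies per aliquot.
    k: dispersion parameter (smaller k means stronger clustering).
       The value of k is NOT given in the source; it is an assumption here.
    """
    if mean_copies < 0 or k <= 0:
        raise ValueError("mean_copies must be >= 0 and k must be > 0")
    p_zero = (1.0 + mean_copies / k) ** (-k)  # probability of zero templates
    return 1.0 - p_zero


# Mean of 0.2 copies per aliquot (10^-6 PE x 200,000 copies), assumed k = 1.
print(f"{100 * p_positive_negbin(0.2, k=1.0):.1f}%")
```

As k grows large, this expression approaches the Poisson value 1 - exp(-mean_copies), so k controls how far the estimate departs from the independent-template case.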
We do not know the target copy number in the parasites used to make these serial dilution profiles herein, but it is certainly different from the copy number in the parasites infecting each of the hosts from which we have analyzed blood. Thus, we do not propose that this assay can quantify the absolute parasite burden in a host nor do we see a benefit in trying to do so (see paragraph beginning line 384). Such quantification requires assumptions about not only the target copy number in the parasites in a host, but also that fragmentation is 100% efficient, and particularly, that a single or multiple blood samples accurately reflect the whole host parasite burden (clearly shown not to be the case with the data from serial bleeds presented in Figures 3 and 5). But we stand by the conclusion that deep-sampling PCR, when employed as presented herein, gives an accurate assessment of the presence of infection and relative parasite burden differences between hosts, and in the same hosts over time or under treatment, and that the results presented are not compromised by inaccuracies in quantifying parasites for spiked samples or by sample contamination.
(2) Lack of details on contamination detection
Additionally, the manuscript does not provide enough details on how cross-contamination was detected or managed. It is unclear how the negative controls (NTCs) and no-template controls were distributed across plates, in terms of both quantity and placement. This omission is critical, as the low detection thresholds targeted in this study increase the risk of false positives by contamination. To ensure reliability and reproducibility, future uses of the technique would benefit from more standardized and clearly documented protocols for control placement and handling.
We present a section in the Materials and Methods on preventing contamination, along with a case example in which these precautions failed during preparation of the dilution standards containing very high numbers of parasites. Directly responding to the reviewer, sixteen no-template controls were included in every 384-well assay plate and we never obtained amplification products from those reactions. Additionally, as noted in the manuscript, uninfected macaques were negative in a collective >15,000 PCR reactions.
We understand the concern about contamination but we believe that we have taken the appropriate precautions and our data fully support that the positives we detect are real positives, not contaminations. It would be reckless to depend on a single positive PCR reaction out of hundreds to conclude that a host is infected; multiple samples must be obtained and analyzed to be certain in such cases, as we show exhaustively with the NHP samples here.
Rather than adding additional technical protocols such as plate layouts to this manuscript, we believe publishing a STAR Protocol or a similar detailed, step-by-step method paper would be more useful and that is our plan.
(3) Unclear relevance for treatment monitoring in Humans
In Figure 7A, the results suggest that the deep-sampling PCR method does not provide a clearly significant improvement over conventional qPCR in humans. Of the 9 samples tested, 6 (56%) were consistently amplified in all or nearly all reactions, indicating these samples could also be reliably detected with standard PCR protocols. Two additional samples were detected only with the deep-sampling approach, increasing sensitivity to 78%; however, these detections might be attributable to random chance given the limited sample size. While the authors acknowledge the small sample size in the discussion, they do not address the fact that a similar increase in sensitivity was reported in citation 5, where only 3 samples were tested with 3 replicates each. This raises an important question: how many PCR reactions are needed in human samples to reach a plateau in detection rates? This issue should be further discussed to contextualize the results and their implications.
We disagree with the reviewer’s conclusion here. First, it is not known how the “conventional” PCR would have performed in the human samples used herein as this was not done. However, it is very likely that it would have performed significantly worse for the following reasons. “Conventional” PCR for T. cruzi has a number of variations, but the most common approach is to mix whole blood 1:1 with a guanidine:EDTA solution, and then extract DNA for PCR from 100-300 µl of this mix. Thus, at best, one has the equivalent of 150 µl of blood that is being analyzed for the presence of T. cruzi DNA. In contrast, in the protocol described herein, we extract DNA from ~5 ml of blood and use aliquots from that DNA for PCR. Thus, even before fragmenting or deep-sampling, the approach described herein is sampling 33X more blood than the conventional protocol, thus likely increasing by over 30-fold the chances of detecting parasite DNA in blood from an infected subject. The volume of blood sampled and the number of samples obtained greatly affect the ability to detect T. cruzi infection in some hosts. This is clearly demonstrated in the extensive screening done in NHPs in this study and there is no reason to believe that the situation will be different in humans and dogs. So the relevance of these enhancements is clear for any host with T. cruzi infection; humans are not unique in this regard.
We don’t believe there will be a “plateau in detection rates”; individuals are either infected or not and the ability to detect that infection (whether with T. cruzi or any other pathogen) depends on the sensitivity of the test and the quantity of the sample available to be screened. Perhaps what is being asked is ‘how many PCR reactions have to be performed to be sure that someone is NOT infected?’. There is not a discrete answer to this and related questions, but by making some assumptions, one can make some estimates. The approach described herein is approaching single-copy target detection and if this is true then one would need to PCR amplify ALL of the DNA from a blood sample to assure detection of that single template copy (so for the 200 µg of DNA one might obtain from 5-10 ml of blood, this means 1600 PCR reactions of 125 ng each; 95% and 99% confidence could be obtained with 1520 and 1584 PCRs, respectively). But any conclusion from this testing applies only to that individual blood sample and we show clearly in the NHP studies that multiple samples have to be analyzed to detect parasite DNA in hosts with very low parasite burden: some samples contain parasite DNA and others do not. Thus, hundreds of negative PCRs from a single or even multiple samples are unfortunately not definitive.
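The reaction counts in the parenthetical follow from simple arithmetic. The Python sketch below assumes that exactly one template copy is present and is equally likely to sit in any one aliquot, so that testing n of N aliquots finds it with probability n/N. The function names are illustrative.

```python
from math import ceil


def total_reactions(total_dna_ng: int, dna_per_reaction_ng: int) -> int:
    """Number of PCR aliquots needed to use up all of the extracted DNA."""
    return total_dna_ng // dna_per_reaction_ng


def reactions_needed(n_total: int, confidence_percent: int) -> int:
    """Aliquots to test so a single copy is found with the given confidence.

    Assumes one template copy, equally likely to be in any of n_total aliquots.
    """
    return ceil(n_total * confidence_percent / 100)


n_total = total_reactions(total_dna_ng=200_000, dna_per_reaction_ng=125)
print(n_total)                        # aliquots in 200 µg at 125 ng each
print(reactions_needed(n_total, 95))  # aliquots for 95% confidence
print(reactions_needed(n_total, 99))  # aliquots for 99% confidence
```

The estimate applies to a single DNA preparation only; it does not account for variation between blood samples from the same host.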
Such limitations exist for detection of any pathogen. A more important question for the future may be ‘is there a level of infection below which the risk of disease development is sufficiently low as to not be of concern clinically?’. Such is the standard in drug-controlled HIV infections, for example. The improvements we document in this work provide the means to answer such questions and additional improvements may be possible as well. But to be absolutely certain that a host is not infected by T. cruzi, one would have to sample some subjects (likely a small minority of the entire pool) multiple times and perform thousands of PCR reactions, as we have done for the most difficult-to-detect macaques in this study.
Despite these limitations, this work represents a promising step forward in the development of highly sensitive diagnostic tools for T. cruzi. It offers a novel foundation for advancing the detection and monitoring of parasitemia, which could significantly benefit the Chagas disease research community and clinicians focused on neglected tropical diseases. While addressing the methodological inconsistencies and improving robustness will be critical, this study provides valuable insights and data that could lead to future innovations in parasitological research and diagnostics.
As discussed in detail above, we do not agree that this study has any methodological inconsistencies nor that it lacks robustness.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1:
Weaknesses:
(1) The authors themselves propose in their Introduction that the "ECM-associated changes are increasingly perceived as causative, rather than consequential"; however, they have not conducted mechanistic (gain of function/loss of function) studies either in vitro or in vivo from any of their identified targets to truly prove causality. This remains one of the limitations of this study. Thus, future studies should investigate this point in detail. For instance, it would have been intriguing to dissect if knocking out specific genes involved in one specific model or genes common to both would yield distinct phenotypic outcomes.
We agree with the reviewer that our study does not provide mechanistic verification of the function of identified targets with a suggested role in the development and/or resolution of fibrosis. The current study was primarily conducted in order to identify these possible targets, with a focus on the identification of differences in extracellular matrix deposited in two selected models of liver fibrosis with different modes of action. To conduct further studies using knock-out/in models for verification of causality of proposed targets was at this point well beyond our intention. However, we are fully aware of the potential of the identified molecules, and further studies to dissect their roles in liver diseases are part of our future plans.
(2) The majority of the conclusions are derived primarily from the proteomic analyses. Although well conducted, it would strengthen the study to corroborate some of the major findings by other means such as IHC/IF with the corresponding quantifications and not only representative images.
We have now provided additional IF images and their quantifications for our major MS findings, in accordance with the Reviewer’s suggestions, to strengthen the significance of the MS data (see detailed answer below).
Reviewer #2:
Weaknesses:
(1) As it currently stands, the data, whilst extensive, is primarily focussed on the proteomic data which is fairly descriptive and I am not clear on the additional insight gained in their approach that is not already detailed from the extensive transcriptomic studies. The manuscript overall would benefit from some mechanistic functional insight to provide new additional modes of action relevant to fibrosis progression.
We agree with the reviewer that our study could initially appear descriptive. However, this characteristic is inherent to most omics studies, which tend to provide hypothesis-free testing of a large number of analytes in order to find a multitude of candidate biomarkers(1). Importantly, we believe our study provides insights that go beyond the scope of previously published transcriptomic analyses.
Specifically, our work focuses on compartment-specific changes in the liver proteome, with an emphasis on the extracellular matrix (ECM) composition and alterations in protein solubility—features that cannot be captured by transcriptomic studies. The matrisome is more than a structural scaffold; it functions as a reservoir for secreted factors, including growth factors and cytokines, which modulate the local cellular microenvironment. Transition dynamics between the insoluble matrisome and soluble protein pools influence the signaling capabilities and bioavailability of these factors. Moreover, fibrous ECM assemblies directly impact tissue mechanics, providing cells embedded within the matrix with spatially distinct biochemical and biomechanical contexts. The current understanding of matrisome composition in the context of specific liver disease etiologies is limited. Dr. Friedman, in his 2022 review on hepatic fibrosis, highlights the unmet need to elucidate etiology-specific protein signatures of the cirrhotic liver matrisome, which could serve as disease staging or prognostic biomarkers(2). Our study addresses this gap by characterizing the distinct matrisome profiles associated with hepatotoxic- versus cholestasis-driven liver injury. We believe our findings lay the groundwork for identifying etiology-specific biomarkers and potential therapeutic targets for antifibrotic interventions, offering a novel layer of insight beyond what transcriptomic data alone can provide.
(2) Whilst there is some human data presented it is a minimal analysis without quantification that would imply relevance to disease state. Although studying disease progression in animals is a fundamental aspect of understanding the full physiological response of fibrotic disease, without more human insight makes any analysis difficult to fulfil their suggestion that these targets identified will be of use to treat human disease.
We thank the reviewer for this comment. Our study primarily focuses on utilizing animal models to explore the fundamental physiological processes underlying the development and resolution of fibrotic liver disease. To address the translational relevance of our findings, we concentrated on clusterin, one of the key target proteins identified during our analysis of the insoluble proteome. Specifically, we investigated its localization in human liver samples, focusing on its association with collagen deposits (Figure 6F). To this end, we analyzed human liver samples of diverse etiologies and varying degrees of fibrotic damage, including samples representing four distinct stages of HCV-induced fibrosis (Figure 6F, lower panel). While this analysis highlights the presence and localization of clusterin in fibrotic deposits, we acknowledge that our study does not include extensive quantification or mechanistic insight into clusterin's role in human liver fibrosis. We believe that the data presented in this manuscript provide a valuable foundation for future investigations into clusterin’s involvement in liver fibrosis across different etiologies. Recognizing the translational importance of this work, we have already initiated a prospective study involving human patients, which aims to conduct a more comprehensive analysis of clusterin's function and its potential as a therapeutic target.
To further support our findings on clusterin's role in fibrosis development and resolution and to address the reviewer's concern, we quantified clusterin deposits in the available human samples representing four distinct stages of HCV-induced fibrotic disease. Using immunofluorescence (IF) images at a 20x field of view, we measured both clusterin and collagen deposits to illustrate changes in clusterin abundance during fibrosis progression (stages F1–F4) in relation to collagen deposition dynamics. The quantified data have been included for the reviewer's consideration (Figure 1). However, it is important to emphasize that this quantification was conducted on a single human sample per fibrotic stage, which limits the statistical robustness of the analysis. A more comprehensive evaluation involving additional patient samples would be necessary for a more definitive conclusion. For this reason, we propose to include these results solely in our rebuttal letter and to incorporate a more extensive analysis in our intended follow-up study, where larger cohorts will allow for a thorough investigation of clusterin's role in human liver fibrosis.
Author response image 1.
Dynamics of clusterin abundance with the development of HCV-induced fibrotic disease in comparison to the changes in collagen deposits. IF images of human liver sections from different stages of chronic HCV infection were immunolabeled for clusterin and collagen 1. Clusterin- and collagen-positive (+) areas (as %) from three to eight fields of view (20x objective) were evaluated for each fibrosis stage (F1-F4).
(3) Some of the terminology is incorrect while discussing these models of injury used and care should be taken. For example - both models are toxin-induced and I do not think these data have any support that the DDC model has a higher carcinogenic risk. An investigation into the tumour-induced risk would require significant additional models. These types of statements are incorrect and not supported by this study.
We are grateful to the reviewer for drawing our attention to the incorrect use of the term "toxin-induced". In two instances, where the wording was incorrect, we have corrected the term to hepatotoxin-induced as it was originally intended. While we believe that our proteomic signature data and identified signaling pathways suggest a potential carcinogenic risk associated with the cholestatic, but not the hepatotoxic model, we have toned down the statements on this issue in the article to respect the reviewer's perspective. These changes, which are highlighted in the track changes mode of the article, aim to make the conclusions of the study more precise and thus improve the clarity of our conclusions.
Reviewer #1 (Recommendations for the authors):
(1) In the Discussion, the authors could consider pointing out that one limitation of the study is a lack of mechanistic (gain of function/loss of function) studies either in vitro or in vivo from any of their identified targets to truly prove causality.
As noted earlier, we fully agree with both reviewers that a limitation of this study is its descriptive nature, which is an inherent characteristic of omics-based research. In our manuscript, we aimed to "determine compartment-specific proteomic landscapes of liver fibrosis and delineate etiology-specific ECM components," with the overarching goal of providing a foundation for future antifibrotic therapies.
The insights gained from our study will indeed serve as a critical basis for subsequent research, where we will prioritize mechanistic investigations to elucidate the roles of the identified targets. While we acknowledge the importance of gain- or loss-of-function studies to establish causality, we believe this falls outside the primary scope of the current manuscript. Instead, we envision these mechanistic approaches as key elements of our future research efforts. For this reason, we feel it is not necessary to further expand on this limitation in the current discussion.
(2) The majority of the conclusions are derived primarily from the proteomic analyses. Although well conducted, it would strengthen the study to corroborate some of the major findings by other means such as IHC/IF with the corresponding quantifications and not only representative images. For example, the IF stainings for ECM1 should also be quantified - ECM1.
To strengthen our MS findings on ECM1 expression and to address the reviewer's concern, we have now included quantification of ECM1 using IF staining at selected time points in Figure S7E and we refer to these data in the Results section (p. 12 of the current manuscript). The IF quantification data correspond well to the MS data showing increase in ECM1 expression with fibrosis development and decline with partial fibrosis resolution.
(3) S1 - it would be important to show Sirius Red images over the time course, especially for CCl4 T4 where fibrosis resolution is occurring. Proteomics data also show this group clusters more closely with control mice and seeing a representative image would add further credibility to this point.
The requested Sirius Red images are now part of Figure S1B, documenting partial fibrosis resolution and overall parenchyma healing in T4 in both models.
(4) How comparable are the periods of the two models? 2 weeks in one model may not be the same as 2 weeks in the other depending on the severity of the pathogenesis.
We appreciate the reviewer’s comment regarding the comparability of time points between the two models. Indeed, the temporal dynamics of fibrosis development differ between the models employed in our study, and we have carefully considered this aspect to ensure the validity of our comparative analysis. To address this, we started our comparisons at a stage corresponding to the onset of fibrosis in each model. Specifically, quantification of Sirius Red-positive areas, indicative of collagen deposition (Figure S1B), revealed that 2 weeks of DDC treatment produced a comparable extent of fibrosis to that observed after 3 weeks of CCl₄ treatment. This point was designated as the initial fibrosis time point (T1, Figure S1B), from which further treatment was applied to induce more advanced fibrosis. This approach allowed us to standardize the comparison of fibrosis progression between the two models.
(5) Figure 4A-D - cell-type-specific signatures should be corroborated by actual IHC or IF stainings if possible. HNF4a (hepatocytes), CK19 (cholangiocytes), aSMA (activated fibrogenic HSCs), immune cells (B220, F4/80, Cd11b, CD11c etc).
We thank the reviewer for this valuable suggestion. To strengthen our analysis, we have now complemented the box plots of cell type-specific signatures derived from the MS data (Figure 4A-D) with immunofluorescence (IF) staining, which has been included in the Supplemental Data (Figure S6). Specifically, we provide representative IF images from control and T1-T4 time points for each model, documenting the changes in abundance with treatment in:
A) Hepatocytes (HNF4α), activated hepatic stellate cells (αSMA), and cholangiocytes (CK19).
B) Immune cell populations, including B cells (B220) and macrophages/monocytes/Kupffer cells (F4/80), as these immune cell groups were not only identified in our MS analysis but also have established roles in the selected models(3, 4, 5).
The representative images shown in Figure S6 show the dynamics of the cellular populations in each of the models, which correspond well with the MS data (compare Figures 4A-D and S5). These additional data further validate our findings and enhance the robustness of our conclusions.
References:
(1) Thiele M, Villesen IF, Niu L, et al. Opportunities and barriers in omics-based biomarker discovery for steatotic liver diseases. J Hepatol 2024;81:345-359.
(2) Friedman SL, Pinzani M. Hepatic fibrosis 2022: Unmet needs and a blueprint for the future. Hepatology 2022;75:473-488.
(3) Best J, Verhulst S, Syn WK, et al. Macrophage Depletion Attenuates Extracellular Matrix Deposition and Ductular Reaction in a Mouse Model of Chronic Cholangiopathies. PLoS One 2016;11:e0162286.
(4) Aoyama T, Inokuchi S, Brenner DA, et al. CX3CL1-CX3CR1 interaction prevents carbon tetrachloride-induced liver inflammation and fibrosis in mice. Hepatology 2010;52:1390-400.
(5) Yang W, Chen L, Zhang J, et al. In-Depth Proteomic Analysis Reveals Phenotypic Diversity of Macrophages in Liver Fibrosis. J Proteome Res 2024;23:5166-5176.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Major concerns:
For studies investigating capsaicin binding to KEAP1, the authors used capsaicin concentrations that are toxic to cells (Figures S1D and 4F, G). In vivo studies were performed only in 3 rats per group. The T-test was used for the comparison of more than two groups. Given the well-known issues with the specificity of the NRF2 antibody, the authors should provide appropriate controls, especially for IF and IHC staining.
We sincerely appreciate your valuable comments. We repeated the CCK8 assay (Figure S1d) and the pull-down experiment (Figure 4g) and updated the results. In September 2022, GES-1 cells were more sensitive to capsaicin (CAP) because Gibco serum from North America was used. In 2024, we switched to serum from Australia (Gibco: 10099-141) and found that GES-1 cells grew better, so we re-ran the test. The IC50 was 304.8 μM, so the concentrations used in this paper have no obvious toxicity to cells. In addition, we repeated the pull-down experiment with more reasonable concentrations of 32 μM and 100 μM, and the results were still in line with expectations. In summary, we conclude that the effect of CAP on GES-1 cells is closely related to the cell state, and that CAP at 32 to 100 μM can hinder the interaction between NRF2 and the Kelch domain of KEAP1. Moreover, at the cellular level, the experimental concentration of CAP did not exceed 32 μM, which is a relatively safe concentration for cells.
Thank you very much for your comments. We agree that more replicates increase the reliability of animal experiments. We therefore added an experiment in Nfe2l2-knockout mice (Figure 9; 6 mice per group). We also thank you for the comment on the use of the t-test: we reviewed the statistics and replaced the t-test with one-way ANOVA.
Finally, regarding your concern about the specificity of the NRF2 antibody, we used a commercial NRF2 antibody that has been KO/KD validated (Cat No. 16396-1-AP, Proteintech) and is suitable for IF and IHC staining. Each fluorescence result was accompanied by Western blotting of NRF2 in its active form at 105-110 kDa for statistical analysis. The trend was consistent with the IF and IHC results, which supports the correctness of the results presented (Figure 2c and Figure S8j).
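On the statistics point above, the reason a single omnibus test is preferred over repeated t-tests when more than two groups are compared can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis (which was run in GraphPad Prism): the function names and the three toy groups are hypothetical, and the family-wise error formula assumes independent comparisons.

```python
from statistics import mean


def familywise_error(alpha, n_comparisons):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_comparisons


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy data: three groups of three measurements each
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
print(f"Family-wise error for 3 pairwise t-tests at alpha 0.05: "
      f"{familywise_error(0.05, 3):.3f}")
```

With three groups there are three pairwise comparisons, so uncorrected t-tests at a nominal 0.05 level carry a noticeably higher chance of at least one false positive than a single ANOVA followed by a corrected post hoc test.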
Reviewer #2 (Public Review):
Weaknesses:
One major weakness of the study is that plausibility is taken as proof for causality. The finding that capsaicin directly binds to Keap1 and releases Nrf2 from its fate of degradation (in vitro) is taken for granted as the sole explanation for the observed improved gastric health upon alcohol exposure (in vivo). There is no consideration or exclusion of any potential unrelated off-target effect of capsaicin, or proteins other than Nrf2 that are also controlled by Keap1.
Another point that hampers full appreciation of the capsaicin effect in cells is that capsaicin is not investigated alone, but mostly in combination with alcohol only.
Thank you very much for this comment. In the introduction, we clarified as follows: “Currently, experiments conducted in rats have demonstrated that red pepper/capsaicin (CAP) had significant protective effects on ethanol-induced gastric mucosal damage, and the mechanism may be related to the promotion of vasodilation(6,7), increased mucus secretion(8) and the release of calcitonin gene-related peptide (CGRP)(9,10). However, it is noteworthy that whether the antioxidant activity of CAP works has not been fully investigated.” We therefore also recognize that CAP does not exert its effects through the KEAP1-NRF2 pathway alone. Your advice is very useful. We further explored TRPV1 and DPP3 to assess potential off-target effects of CAP. Capsazepine (CAPZ), a TRPV1 receptor antagonist, did not affect the protection of GES-1 cells by CAP (Fig S4f and S4g), which may indicate that CAP activation of NRF2 does not depend on TRPV1. The binding of CAP to DPP3, which contains an ETGE motif and can bind to KEAP1, was measured by BLI. The K<sub>D</sub> between CAP and DPP3 was 1.653 mM (>100 μM), whereas CAP bound KEAP1 much more strongly (K<sub>D</sub> about 31.45 μM), which may indicate that the potential off-target effect of CAP is low (Fig S4h and S4i).
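To make the K<sub>D</sub> comparison above concrete, the following minimal sketch estimates how much of each protein would be CAP-bound at the concentrations mentioned in this response. It assumes simple 1:1 equilibrium binding with free CAP approximately equal to total CAP, which is an illustrative assumption and not something stated by the authors; the function and constant names are hypothetical.

```python
def fraction_bound(ligand_um, kd_um):
    """Fraction of protein occupied under simple 1:1 equilibrium binding."""
    return ligand_um / (ligand_um + kd_um)


# K_D values quoted in the response, converted to micromolar
KD_KEAP1_UM = 31.45
KD_DPP3_UM = 1653.0  # 1.653 mM

for cap_um in (8.0, 32.0, 100.0):
    keap1 = fraction_bound(cap_um, KD_KEAP1_UM)
    dpp3 = fraction_bound(cap_um, KD_DPP3_UM)
    print(f"CAP {cap_um:>5.1f} uM: KEAP1 {keap1:.1%} bound, DPP3 {dpp3:.1%} bound")

print(f"K_D ratio (DPP3 / KEAP1): {KD_DPP3_UM / KD_KEAP1_UM:.1f}")
```

Under these assumptions, roughly half of KEAP1 would be occupied at 32 μM CAP while only a few percent of DPP3 would be, which is the quantitative sense in which the roughly 50-fold K<sub>D</sub> difference suggests a low off-target effect.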
Thank you very much for raising the second point. Multiple experiments have shown that CAP significantly up-regulates NRF2 in the presence of additional stimuli such as EtOH (Figure 1i), H<sub>2</sub>O<sub>2</sub> (Figure 1l), PS-341 (Figure 2e) and DTT (Figure 4d). This pattern is consistent with our understanding of allosteric regulation. In the PS-341 and DTT experiments in particular, we included a CAP-only group, and CAP alone did not significantly up-regulate NRF2, which is completely different from traditional NRF2 activators (especially artificially designed covalent binding peptides, which have serious side effects).
Reviewer #3 (Public Review):
Weaknesses:
While the study provides valuable insights into the molecular mechanisms and in vivo effects of CAP, further clinical studies are needed to validate its efficacy and safety in human subjects. The study primarily focuses on the acute effects of CAP on ethanol-induced gastric mucosa damage. Long-term studies are necessary to assess the sustained therapeutic effects and potential side effects of CAP treatment.
Furthermore, the study primarily focuses on the interaction between CAP and the KEAP1-NRF2 axis in the context of ethanol-induced gastric mucosa damage. It may be beneficial to explore the broader effects of CAP on other pathways or conditions related to oxidative stress. CAP has been known for its interaction with the Transient Receptor Potential Vanilloid type 1 (TRPV1) channel and subsequent NRF2 signaling pathway activation. Those receptors are also expressed within the gastric mucosa and could potentially cross-react with CAP leading to the observed outcome. Including experiments to investigate this route of activation could strengthen the present study.
While the design of CAP nanoparticles is innovative, further research is needed to optimize the nanoparticle formulation for enhanced efficacy and targeted delivery to specific tissues.
Addressing these weaknesses through additional research and clinical trials can strengthen the validity and applicability of CAP as a therapeutic agent for oxidative stress-related conditions.
Thank you very much for these suggestions. We also believe that CAP is valuable and promising for protecting against EtOH-induced gastric mucosal injury. We are actively pursuing patent applications, and if conditions permit, longer-term studies of drug safety will be essential. Because the binding of CAP to KEAP1 is a new discovery, and NRF2 plays an important role in various oxidative stress-related diseases, we used human umbilical cord mesenchymal stem cells (HUC-MSCs) and H<sub>2</sub>O<sub>2</sub> to explore the potential broader effects of CAP related to oxidative stress in cells (Figure 1l and 1m). We also performed TRPV1-related experiments and were surprised to find that inhibiting TRPV1 did not affect the effect of CAP (Supplementary Figure 4f and 4g). We hope that more people will read this article and pursue further research together.
Recommendations for the authors:
Reviewing Editor (Recommendations For The Authors):
Although this study has been conducted in rats, a direct proof that albumin-coated capsaicin nanoparticles act through activation of Nrf2 in protecting gastric mucosa against alcohol toxicity could be well conducted in commercially available Nrf2-deficient mice.
Thank you very much for your suggestion; the comment is very constructive for improving this paper. We purchased Nrf2-deficient mice (Cat. NO. NM-KO-190433) and performed experiments. The results showed that Nrf2-knockout mice were more sensitive to EtOH and that the effects of CAP were partially eliminated (Figure 9), which further validated the role of the Nrf2-related signaling pathway in EtOH-induced gastric mucosal injury and in the therapeutic effect of CAP.
Reviewer #1 (Recommendations For The Authors):
Minor concerns include proofreading the paper. Actinomycin is not an inhibitor of translation.
Thank you for your comment. We have revised “Actinomycin” to “Cycloheximide”.
Reviewer #2 (Recommendations For The Authors):
- Please have a careful look at your conclusions: just because two effects happen at the same time and may be plausible explanations for each other, it does not mean that they are really in a causative relationship in your given test system (unless unambiguously proven by additional experiments).
Your suggestions are very constructive for us to improve this paper. We further examined the role of capsaicin using TRPV1, DPP3 and Nrf2-deficient mice, hoping to make our conclusions more credible.
- You may want to frankly discuss other targets of capsaicin (e.g. the TrpV1 receptor) that possibly could also account for your observations, and that binding to Keap1 not only releases Nrf2 from proteasomal degradation.
Thank you for your comment. We further explored TRPV1 and DPP3 to assess potential off-target effects of CAP. Capsazepine (CAPZ), a TRPV1 receptor antagonist, did not affect the protection of GES-1 cells by CAP (Fig S4f and S4g). The binding of CAP to DPP3, which contains an ETGE motif, was measured by BLI, and the K<sub>D</sub> between CAP and DPP3 was 1.653 mM, which may indicate that the potential off-target effect of CAP is low (Fig S4h and S4i). At the same time, the activation of NRF2 by non-classical pathways, such as CAP regulation of DPP3 or other proteins, also deserves more discussion and experimental verification.
- For Figure 1G it does not become entirely clear what has been done (and thus deduction of conclusions is hampered).
Thank you for your comment. Network target analysis (Figure 1g) was performed to identify the potential mechanism of the effects of CAP on ROS. The biological effect profile of CAP was predicted using our previous network-based algorithm, drugCIPHER. Enrichment analysis was conducted with the R package clusterProfiler v4.9.1, and pathways or biological processes enriched with a P value less than 0.05 (Benjamini-Hochberg adjustment) were retained for further study. Pathways or biological processes that were significantly enriched and related to ROS were then filtered and classified into three modules: ROS, inflammation and immune expression. Network targets of CAP against ROS were constructed from these analyses, and finally we combined them with proteomics to determine the research direction of this paper.
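The Benjamini-Hochberg adjustment mentioned above can be sketched as follows. This is a minimal standalone illustration with hypothetical P values, not the authors' pipeline; in R, the same adjustment is available through `p.adjust(p, method = "BH")`.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four enriched terms
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
kept = [i for i, p in enumerate(adj) if p < 0.05]
print(adj, kept)
```

Terms are retained only if their adjusted P value stays below the chosen cutoff (0.05 in the response), which controls the expected proportion of false discoveries among the retained terms.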
- Figure 1L: is there a reason/explanation why UC.MSC needs a comparably very high concentration of capsaicin.
Thank you for your comment. We used 8 μM and 32 μM because the experimental results at these concentrations were more stable in this cell type, and the activation of NRF2 downstream targets was more obvious.
- Figure 2C: it is surprising that naïve (unstressed /untreated cells) already show a rather high nuclear abundance of Nrf2 (shouldn´t Nrf2 be continuously tagged for degradation by Keap1).
Thank you for your comment. This is a real experimental result, and we have found in many experiments that NRF2 can also be detected by immunoblotting in the untreated group. We think that this phenomenon may be related to the cell state at that time.
- Figure 2E: the claim of synergy between CAP and the proteasome inhibitor is not justified with this single figure.
Thank you for your comment. Multiple experiments have shown that CAP significantly up-regulates NRF2 in the presence of additional stimuli such as EtOH (Figure 1i), H<sub>2</sub>O<sub>2</sub> (Figure 1l), PS-341 (Figure 2e) and DTT (Figure 4d). This pattern is consistent with our understanding of allosteric regulation. However, this synergy does warrant more research.
- CHX is cycloheximide (in the main text it is referred to as actinomycin).
Thank you very much for your comment. We have revised “Actinomycin” to “Cycloheximide”.
- Figures 2G-H: why switch to rather high concentrations? Is it due to the overexpression of Keap1?
Thank you for your comment. At the time of this part of the experiment, we had obtained in vitro data on the interaction of CAP and the Kelch domain of KEAP1 (about 32 μM). To keep the results uniform and valid, we chose a relatively higher concentration.
- Figure 2I: in the pics of mitochondria the control mitochondria look way more punctuated (likely fissed) than the ones treated with EtOH or EtOH + CAP. Wouldn´t one expect that EtOH leads to mitochondrial fission and CAP can prevent it?
Thank you for your comment. MitoTracker® Red CMXRos (M9940, Solarbio, China) is a cell-permeable X-rosamine derivative containing weakly sulfhydryl-reactive chloromethyl functional groups that label mitochondria. This product is an oxidized red fluorescent stain (Ex=579 nm, Em=599 nm): when incubated with cells, it is passively transported across the cell membrane and accumulates directly in active mitochondria. Therefore, red signal does not represent broken mitochondria, but active mitochondria. The mean branch length of mitochondria was quantified using the MiNA tool (https://github.com/ScienceToolkit/MiNA) for ImageJ.
- Figure 3C: figure legend is somewhat poor.
Thank you for your comment. We have revised: “KEAP1-NRF2 interaction was detected with Surface plasmon resonance (SPR) in vitro.”
- Figure 3E: given that CAP disrupts Nrf2/Keap1- PPI, why is there no Nrf2 stabilization seen in the fourth lane (input/lysate)?
Thank you for your comment. In the fourth lane, overexpression of KEAP1 may promote the degradation of NRF2.
- Figure 3H: high basal Nrf2 levels in unstressed/untreated HEK WT cells, why?
Thank you for your comment. This is a real experimental result, and we have found in many experiments that NRF2 can also be detected by immunoblotting in the untreated group in 293T cells. We think that this phenomenon may be related to the cell state at that time.
- Figure 3G/I: this data suggests to me that the alcohol-mediated toxicity is Keap1-dependent (rather than the protection by CAP), doesn´t it?
Thank you for your comment. We can see that KEAP1-KO cells had a high expression of NRF2, which was also in line with our expectations, and EtOH-induced GES-1 damage may be closely related to oxidative stress.
- Figure 4a: the inclusion of an additional Keap1 binding protein (one with an ETGE motif) would have been desirable (to get information on specificity/risks of off-target (unwanted) effects of CAP).
Thank you for your comment. DPP3 with an ETGE motif was detected by BLI, and we found that the K<sub>D</sub> between CAP and DPP3 was 1.653 mM, which may indicate the potential off-target effect of CAP is low (Fig S4h and S4i).
- Figure 4D: why is there no stabilization of Nrf2 by CAP in lane 2 ? How can the DTT-mediated boost on Nrf2 levels be explained?
Thank you for your comment. Multiple experiments have shown that CAP significantly up-regulates NRF2 in the presence of additional stimuli such as EtOH (Figure 1i), H<sub>2</sub>O<sub>2</sub> (Figure 1l), PS-341 (Figure 2e) and DTT (Figure 4d). This pattern is consistent with our understanding of allosteric regulation. However, this synergy does warrant more research.
- Figure 4f: 5% DMSO is a rather high solvent concentration, why so high (the solvent alone seems to have quite marked effects).
Thank you for your comment. The DMSO concentration was high because our maximum CAP concentration was set relatively high. We recognize this problem and have repeated the more critical pull-down experiment (Figure 4g). The current DMSO concentration of 0.2% had no effect on the experimental results.
- Figure 5: it should be described in the figure legend which mutant is used. Based on the previous data, I would expect an investigation of mutants carrying amino acid exchanges at the newly identified allosteric site.
Thank you for your comment. The mutant carried the substitutions Y334A, R380A, N382A, N414A, R415A, Y572A, and S602A (the orthosteric site), at residues reported to engage NRF2 and classic Keap1 inhibitors. The exploration of the newly discovered allosteric site is worthy of further study.
- Figure 6/7: I am not expert enough to judge formulations and histology scores. However, the benefit of the encapsulated capsaicin does not become entirely clear to me, as CAP and IRHSA@CAP mostly do not significantly differ in their elicited response.
Thank you for your comment. On the one hand, the nanomedicine improves the safety of administration by reducing the intense spicy irritation caused by CAP itself in the stomach. On the other hand, the drug dosage is reduced to some extent while a better therapeutic effect is achieved.
- Figure 7: rebamipide was introduced as positive control in the text with an activating effect on Nrf2, but there is no induction of hmox and nqo in Figure 7f, why?
Thank you for your comment. The effect of addition of positive control drug (Rebamipide) on NRF2 activation is not the focus of this paper. We speculate that the transcription and translation of related genes may not be completely synchronized when Rebamipide was taken at the same time.
- Figure 8: the CAP effect on inflammation is visible, however, a clear causal connection between ROS/Nrf2/KEap1 is not given in the presented experiments.
Thank you for your comment. The basic mechanism proposed in this paper is illustrated in the graphical diagram. The activation of NRF2 exerts both anti-inflammatory and antioxidant functions, which has been reported in many articles, but the causal relationship is still open to exploration.
Points related to presentation:
- The data with the encapsulated CAP appear a little as a sidearm that does not bolster your main message (maybe take out and elaborate on this topic more extensively in another manuscript).
- Revise the introduction on the Nrf2 signaling pathway as it is written at the moment, someone outside the Nrf2 field might have trouble understanding it.
- The use of language requires proofreading and revision.
Thank you for your comment. We rearranged and proofread it.
Reviewer #3 (Recommendations For The Authors):
Overall, the manuscript is well-written and the results are presented in a concise and comprehensible manner.
Some recommendations on the experimental evidence and further suggestions:
• The authors should state how they assessed the distribution of the data. Description of data with mean and standard deviation as well as comparisons between different groups with t-test assumes that the underlying data is normally distributed.
Your suggestions are very constructive for us to improve the paper. Differences in mean values between two groups were analyzed using Student’s t-test, while differences among multiple groups were analyzed using one-way ANOVA in GraphPad Prism. We have accordingly checked and corrected the statistical analysis.
• Additional experiments further characterising and validating the activation of CAP via direct KELCH1-binding could include parallel experiments with similar agonists like dimethyl fumarate. It would be interesting to know how CAP activation compares to DMF activation.
Thank you very much for your comment. We believe that the activation of NRF2 by DMF has been widely reported and well-studied, so we did not purchase this drug for comparative study here. If it can be promoted clinically in the future, we may consider comparing with DMF.
• Also, the knock-down of NRF2 would be a suggested experiment to do because it rules out that the benefit of CAP is independent of KEAP1-NRF2 binding and activation.
Thank you very much for your suggestions. We purchased Nrf2-deficient mice and performed experiments. The results showed that Nrf2-knockout mice were more sensitive to ethanol and that the effects of CAP were partially eliminated (Figure 9), which further validated the role of the Nrf2-related signaling pathway in alcohol-induced gastric mucosal injury and in the therapeutic effect of CAP.
Some corrections on text and figures:
• Figure 1b: incorrect spelling of DNA stain. Should be Hoechst 33342.
Thank you very much for your comment. We have revised.
• Figure 1c: don't put the label inside the plot.
Thank you very much for your comment. We have revised.
• Figure 1d: choose less verbose axes titles (this also applies to other figures).
Thank you very much for your comment. We have revised.
• Figures 1e and 1f: please state the units.
Thank you very much for your comment. The enzyme activity of SOD and the content of MDA are expressed relative to the control group.
• Heading 2.2: NRF2-ARE instead of NRF-ARE.
Thank you very much for your comment. We have revised.
• Line 118: missing expression after immune.
Thank you very much for your comment. We have revised.
• Figure 1g: names of proteins are not readable.
Thank you very much for your comment. We have revised.
• Line 120: You performed transcriptomic analyses to identify differentially expressed GENES not proteomic.
Thank you very much for your comment. This part of our work is proteomics.
• Line 122: Fold change should be stated in both directions, i.e. absolute FC like |FC| > 1. Or did you select only upregulated DEGs? Is it not log2 FC?
Thank you very much for your comment. We have revised.
• Figure 1h (and Supplementary Figure 1a): Missing heatmap legend for FC. What do the colors show? Sample (column) description missing.
Thank you very much for your comment. Red indicates up-regulation and blue indicates down-regulation. The vertical axis on the right side lists antioxidant genes such as GSS and SOD1, and the ratio between the treatment group and the model group (CAP + EtOH/EtOH) has been calculated and labeled.
• Line 145: A Western blot is not a proteomic analysis.
Thank you very much for your comment. We have revised: “Concurrently, the elevated expression levels of GSS and Trx proteins, which were also downstream targets of NRF2, were further validated by western blotting (Figure 1j).”
• Supplementary Figure 2e-j: expression fold change is not the right quantity. The signal of the actual protein was quantified. And what are you comparing to with the statistics? The stars on one bar are not clear.
Thank you very much for your comment. The expression level in this part was normalized to that of the control group. Statistical significance was assessed relative to the model group.
• What was the concentration of H<sub>2</sub>O<sub>2</sub> used?
Thank you very much for your comment. 200 μM H<sub>2</sub>O<sub>2</sub> was used.
• Figure 2d: use a more precise y-axis label.
Thank you very much for your comment. We do want to compare the amount of NRF2 entering the nucleus, so the relative expression is shown relative to the internal reference.
• Figure 2g: missing molecular weight markers.
Thank you very much for your comment. Because the ubiquitination signal spans the whole membrane, marking only the sizes of HA and GAPDH would not look clean here.
• Line 221: lactate is the endproduct of the anaerobic glycolytic pathway.
Thank you very much for your comment. We have revised.
• Supplementary Figure 3d: should it be PKM2 (instead of PKM) and LDHA (instead of LDH). Should fit with the text in the manuscript.
Thank you very much for your comment. We have revised.
• Supplementary Figures 3 e-f: brackets in y-axis labels are too bold.
Thank you very much for your comment. We have revised.
• Figures 3a and b. Brackets should only be used if two conditions are being compared statistically. Remove the one line with ns as it could imply that you have compared the first with the last condition only.
Thank you very much for your comment. We have revised.
• Consistent labeling of kDa in figures (no capital K in KDa).
Thank you very much for your comment. We have revised.
• Figure 4a. Move kDa on top of 70.
Thank you very much for your comment. We have revised.
• Figure 3 g-h: Why 2% EtOH. Used 5% previously?
Thank you very much for your comment. Here we switched to the 293T cell line, and a 5% EtOH concentration is too high for these cells.
• Supplementary Figure b-e: correct typo in y-axis label: expression.
Thank you very much for your comment. We have revised.
• Figure 4a: correct x-axis label for temperature unit. Too bold. Not readable. Add a clear label and unit for y-axis.
Thank you very much for your comment. We have revised.
• Figure 4 b-c: should have a legend explaining colors.
Thank you very much for your comment. Our Figure legend already contains the meaning of colors: “(b) Computational docking of CAP molecule to KEAP1 surface pockets. The Keap1 protein is represented in gray, while the CAP molecule is shown in yellow. The seven key amino acids predicted to be crucial for the interaction are highlighted in blue. (c) Partial overlap of CAP-binding pocket with KEAP1-NRF2 interface. The KEAP1-NRF2 interaction interface is represented in purple.”
• Supplementary Figure 5a. Add axis units.
Thank you very much for your comment. We have revised.
• Figure 4e: Missing b ions value for number 19.
Thank you very much for your comment. This part is not missing, but corresponds to 19 of y ions.
• Figure 7f: adjust brackets - they are too bold.
Thank you very much for your comment. We have revised.
• Supplementary Figure 8b-i: labels not readable. c should be spleen.
Thank you very much for your comment. We have revised.
• Line 787: specify BH adjustment to Benjamini-Hochberg.
Thank you very much for your comment. We have revised.
• Check spelling of µl throughout the Methods section e.g. line 854 - shouldn't be "ul".
Thank you very much for your comment. We have revised.
• Line 974: correct spelling of species names: E. coli should be in italics.
Thank you very much for your comment. We have made all of these corrections to the text and figures. We will be more rigorous and careful in writing papers in the future.
-
-
www.researchsquare.com www.researchsquare.com
-
Author response:
We sincerely thank the reviewers for their thorough and constructive evaluation of our manuscript. We particularly appreciate their recognition of our comprehensive characterization approach, which integrates immunohistochemistry, transcriptomics, morphological assessments, and electrophysiology to understand psilocin's effects on human neurons. The reviewers highlighted that our findings closely align with and validate prior work on rat cortical neurons, while importantly extending these insights to human cells. We are encouraged by their acknowledgment that our study demonstrates the value of using iPSC-derived human cortical neurons for testing potentially translatable effects of psychedelic compounds. Their positive assessment of our work's implications for psychedelic drug development is particularly valuable, as it supports our goal of advancing the understanding of these compounds' therapeutic potential and their possible application in treating neuropsychiatric disorders.
We are also very grateful for the reviewers' constructive criticism which will help strengthen our manuscript significantly. Based on their detailed feedback, we plan to perform several additional experiments for inclusion in the revised manuscript.
The most important concern raised by both reviewers is about the specificity of the antibody used to detect the expression pattern and abundance of 5-HT2A receptors at the cells' surface. We acknowledge that GPCR antibodies, including those targeting 5-HT2A receptors, can be challenging in terms of specificity and reliability, particularly given the structural similarities within this receptor family. To address these concerns comprehensively, we propose the following systematic validation strategy:
(1) Cell-Type Specific Expression Analysis: We will systematically evaluate the antibody across different developmental stages and cell lines. The results from the stainings will be correlated with RNA sequencing data to provide quantitative validation of expression patterns. Cell types to be included will be:
· iPSCs (expected negative)
· Neural progenitors (expected positive)
· Mature neurons (expected positive)
· HEK cells (expected negative)
This multi-stage analysis will allow us to track receptor expression through development and verify antibody specificity across distinct cellular contexts.
(2) Peptide Competition Study: We will perform blocking experiments using the specific peptide sequence against which the antibody was raised. By pre-incubating the antibody with its cognate peptide at the established working concentration, and then documenting in detail the signal reduction in the peptide-blocked condition versus standard staining, we can demonstrate binding specificity. This approach will provide direct evidence of antibody selectivity for its intended target.
(3) Sequence Analysis and Specificity: We will perform a comprehensive protein BLAST analysis of the antigenic peptide sequence, assess potential cross-reactivity with related receptors, and evaluate species conservation and specificity. This in silico approach will complement our experimental validation and help identify any potential off-target binding sites.
(4) Additional Validation: While technically challenging, we will attempt knockdown studies using siRNA/shRNA approaches to provide additional validation of antibody specificity. This molecular intervention will offer another layer of validation through targeted reduction of the receptor.
We plan to present these results in a new supplementary figure that will provide a comprehensive overview of our validation efforts. Should we not be able to convincingly demonstrate the specificity of the antibody, we will discuss with the editors and reviewers to modify Figure 1 and exclude critical parts from the manuscript. While we find the results interesting and important to communicate, an omission would not critically impact the key message of the manuscript, which is the structural and molecular changes elicited by psilocin on human neurons. The strength of our multi-modal approach means that our core findings are supported by several independent lines of evidence beyond antibody-based detection.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors aimed to quantify feral pig interactions in eastern Australia to inform disease transmission networks. They used GPS tracking data from 146 feral pigs across multiple locations to construct proximity-based social networks and analyse contact rates within and between pig social units.
Strengths:
(1) Addresses a critical knowledge gap in feral pig social dynamics in Australia.
(2) Uses robust methodology combining GPS tracking and network analysis.
(3) Provides valuable insights into sex-based and seasonal variations in contact rates.
(4) Effectively contextualizes findings for disease transmission modeling and management.
(5) Includes comprehensive ethical approval for animal research.
(6) Utilizes data from multiple locations across eastern Australia, enhancing generalizability.
Weaknesses:
(1) Limited discussion of potential biases from varying sample sizes across populations.
This is a really good comment, and we will address this in the discussion as one of the limitations of the study.
(2) Some key figures are in supplementary materials rather than the main text.
We will move some of our supplementary material to the main text as suggested.
(3) Economic impact figures are from the US rather than Australia-specific data.
We included the impact figures that are available for Australia (for FMD), and we will include the estimated impact of ASF in Australia in the introduction.
(4) Rationale for spatial and temporal thresholds for defining contacts could be clearer.
We will improve the explanation of why we chose the spatial and temporal thresholds based on literature, the size of animals and GPS errors.
(5) Limited discussion of ethical considerations beyond basic animal ethics approval.
This research was conducted under an ethics committee's approval for collaring the feral pigs. This research is part of an ongoing pest management activity, and all the ethics approvals have been highlighted in the main manuscript.
The authors largely achieved their aims, with the results supporting their conclusions about the importance of sex and seasonality in feral pig contact networks. This work is likely to have a significant impact on feral pig management and disease control strategies in Australia, providing crucial data for refining disease transmission models.
Reviewer #2 (Public review):
Summary:
The paper attempts to elucidate how feral (wild) pigs cause distortion of the environment in over 54 countries of the world, particularly Australia.
The paper displays proof that over $120 billion worth of facilities were destroyed annually in the United States of America.
The authors have tried to infer that the findings of their work were important and possess a convincing strength of evidence.
Strengths:
(1) Clearly stating feral (wild) pigs as a problem in the environment.
(2) Stating how 54 countries were affected by the feral pigs.
(3) Mentioning how $120 billion was lost in the US, annually, as a result of the activities of the feral pigs.
(4) Amplifying the fact that 14 species of animals were being driven into extinction by the feral pigs.
(5) Feral pigs possessing zoonotic abilities.
(6) Feral pigs acting as reservoirs for endemic diseases like brucellosis and leptospirosis.
(7) Understanding disease patterns by the social dynamics of feral pig interactions.
(8) The use of 146 GPS-monitored feral pigs to establish their social interaction among themselves.
Weaknesses:
(1) Unclear explanation of the association of either the female or male feral pigs with each other, seasonally.
This will be better explained in the methods.
(2) The "abstract paragraph" was not justified.
We have justified the abstract paragraph as requested by the reviewer.
(3) Typographical errors in the abstract.
Typographical errors have been corrected in the Abstract.
Reviewer #3 (Public review):
Summary:
The authors sought to understand social interactions both within and between groups of feral pigs, with the intent of applying their findings to models of disease transmission. The authors analyzed GPS tracking data from across various populations to determine patterns of contact that could support the transmission of a range of zoonotic and livestock diseases. The analysis then focused on the effects of sex, group dynamics, and seasonal changes on contact rates that could be used to base targeted disease control strategies that would prioritize the removal of adult males for reducing intergroup disease transmission.
Strengths:
It utilized GPS tracking data from 146 feral pigs over several years, effectively capturing seasonal and spatial variation in the social behaviors of interest. Using proximity-based social network analysis, this work provides a highly resolved snapshot of contact rates and interactions both within and between groups, substantially improving research in wildlife disease transmission. Results were highly useful and provided practical guidance for disease management, showing that control targeted at adult males could reduce intergroup disease transmission, hence providing an approach for the control of zoonotic and livestock diseases.
Weaknesses:
Despite their reliability, populations can be skewed by small sample sizes and limited generalizability due to specific environmental and demographic characteristics. Further validation is needed to account for additional environmental factors influencing social dynamics and contact rates.
This is a really good point, and we thank the reviewer for pointing out this issue. We will discuss the potential biases due to sample size in our discussion. We agree that environmental factors need to be incorporated and tested for their influence on social dynamics. This will be added to the discussion, as we plan to expand this research and analyse whether environmental factors influence social dynamics.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Consider moving some key figures from supplementary materials to the main text to strengthen the presentation of results.
We included a new figure to strengthen the presentation of results (Figure 3a-b), which shows the node level measures by sex and for direct and indirect networks.
(2) Expand discussion of limitations, particularly addressing potential biases from varying sample sizes across populations.
We added more detail and clarity about this potential bias to the limitations section of the discussion: “Different populations in our study had varying numbers of collared individuals, with some populations having only two individuals at certain times. This variability in sample size across populations is a limitation when interpreting the results. Small populations are often the result of a few individuals being trapped and collared, and this does not necessarily reflect the actual number of individuals in those groups.” Moreover, while reviewing the effect of the potential bias, we found that a General Linear Mixed Effect Model (Table 1) was not optimal for analysing the effect of sex on the network measures. We therefore repeated this analysis using a non-parametric test (Wilcoxon rank-sum test) for direct and indirect networks based on a 5-metre threshold (Table 1).
(3) If available, include Australia-specific economic impact data in the introduction.
We included the impact figures that are available for Australia (for FMD) in the introduction.
(4) Clarify the rationale for chosen spatial and temporal thresholds for defining contacts.
This has been added in the methodology: “Direct contact was defined when two individuals interacted either at 2, 5, or 350-metre buffers within a five-minute interval [36]. A previous study used 350 metres as a spatial threshold [16], while others use the approximate average body length of an individual [36]”
(5) Consider adding a brief discussion of ethical considerations beyond basic animal ethics approval, addressing aspects like animal welfare during collaring and potential environmental impacts.
Feral pigs are an invasive species in Australia, and managing their population is crucial to protecting native ecosystems. The trapping and collaring of these animals have been conducted following the stringent animal welfare requirements necessary to obtain animal ethics approval in Australia. However, it is important to consider the broader ethical implications. Animal welfare during collaring is a critical aspect and involves minimising stress and physical harm to the animals. The collars used are lightweight and properly fitted, and only adults are collared because of the welfare issues associated with collaring juveniles.
(6) Add a statement about data availability/accessibility.
The GPS data cannot be shared; however, the R code will be deposited on GitHub (https://github.com/Tatianaproboste/Feral-Pig-Interactions) and the link has been added in the final version.
(7) Expand on the implications of seasonal variation in contact rates for disease management strategies in the discussion.
We have added this information in the discussion: “For example, controlling an outbreak during summer would potentially require more resources than an outbreak in other seasons due to the higher number of contacts between individuals during summer.”
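For readers unfamiliar with the Wilcoxon rank-sum test mentioned in the response to point (2), the following is a minimal, illustrative sketch of how a node-level network measure could be compared between sexes. It is not the deposited R code. The function name `rank_sum_test`, the example degree values, and the use of a normal approximation without continuity correction are all assumptions made for illustration.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Uses the normal approximation with a tie correction and no
    continuity correction. Returns (U statistic for x, z score, p value).
    Assumes the pooled values are not all identical (variance > 0).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the observations, remembering which group each came from.
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)

    # Assign 1-based ranks, giving tied values their average rank.
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value
    return u_x, z, p


# Hypothetical node degrees (number of contacts) for collared pigs.
female_degree = [1, 2, 3]
male_degree = [4, 5, 6]
u, z, p = rank_sum_test(female_degree, male_degree)
print(f"U = {u}, z = {z:.3f}, p = {p:.4f}")
```

With very small groups, such as the populations with only two collared individuals noted above, an exact test would usually be preferred over the normal approximation used in this sketch.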
Reviewer #2 (Recommendations for the authors):
The typographical errors in the abstract to be corrected are:
(1) Line 22: Remove the "are" before "threaten".
This has been corrected.
(2) Line 24: Replace the "to" before "extinction" with "into".
This has been corrected.
(3) Line 28: Rephrase the sentence.
‘Yet social dynamics are known to vary enormously from place to place, so knowledge generated for example in USA and Europe might not easily transfer to locations such as Australia.’
(3) Line 29: Insert a "comma" after "Here".
This has been corrected.
(4) Lines 33 -34: Explain, clearly, the contact rates; is it between females to females or females to males?
We have improved this phrase and now it reads: “…. with females demonstrating higher group cohesion (female-female) and males acting as crucial connectors between independent groups.”
(5) Line 36: Make yourselves clear about what you mean by "targeting adult male".
We believe “targeting adult males” is correct in this context.
Reviewer #3 (Recommendations for the authors):
(1) Line 22 and 44, I think are threaten "are" should be removed for better clarity.
This has been corrected.
(2) Line 71, the source and not "force" of infection.
The force of infection is correct here.
(3) Line 72, population "of".
This has been corrected.
(4) Under statistical analysis, the software version should be included.
R has changed to multiple versions since we started this analysis.
(5) Terminological consistency: as far as possible try to be consistent with the terms used in the text, such as using "contact rate" instead of "interaction rate" in order not to puzzle the readers.
We have changed most of the “interactions” to “contact” instead as suggested.
(6) Correct Typos: Identify typos and grammatical inconsistencies of any kind, especially in those complex sentences that may be hard to follow.
The typos have been checked.
(7) Under the methodology, briefly describe why specific thresholds were chosen and any limitations.
We added the following into the method: “Direct contact was defined when two individuals interacted either at 2, 5, or 350-metre buffers within a five-minute interval [36]. A previous study used 350 metres as a spatial threshold [16], while others use the approximate average body length of an individual [36]”
(8) The discussion should be strengthened by drawing clear links between the findings and actionable management strategies.
We have strengthened the discussion by adding more specific actionable management strategies. For example, controlling an outbreak during summer would potentially require more resources than an outbreak in other seasons due to the higher number of contacts between individuals during summer.
(9) Did you consider additional environmental factors, such as rainfall, food availability, or habitat features, to better understand how these influence seasonal variations in pig interactions and contact rates?
This is something that we have in mind and will explore in future research. This has been partially explored but is based on how environmental factors and seasons affect the home range (Wilson et al 2023).
(10) Figure Legends: Add more detailed descriptions in figure legends, especially for those figures showing network metrics or contact rates.
More information has been added to the figure legends.
(11) The paper includes too many figures, and thus, it is recommended to simplify or merge some figures where appropriate. In particular, this is recommended for those figures that plot more network measures across thresholds. Adding clear, summarized captions with interpretation on threshold and measure significance would be a great help in interpreting complicated visualizations.
The figure that shows the comparison between global network measures, including average local transitivity, edge density, global transitivity, mean distance and number of edges for direct and indirect networks has been moved to supplementary material (Figure S3). We also included direct and indirect model-level measures by sex as in Figure 3 and improved the captions of the figures presented in the main document.
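The contact definition quoted in the responses above (two individuals within a 2-, 5-, or 350-metre buffer within a five-minute interval) can be sketched as follows. This is a simplified illustration rather than the authors' pipeline. It assumes that GPS fixes are already projected to planar coordinates in metres, that "within a five-minute interval" means timestamps at most 300 seconds apart, and that the function name `direct_contacts` and the example fixes are hypothetical.

```python
from itertools import combinations
from math import hypot


def direct_contacts(fixes, max_dist_m=5.0, max_gap_s=300):
    """Return the set of animal pairs with at least one direct contact.

    fixes: iterable of (animal_id, time_in_seconds, x_metres, y_metres).
    A contact is recorded when fixes from two different animals are at most
    max_gap_s apart in time and at most max_dist_m apart in space.
    """
    contacts = set()
    for a, b in combinations(list(fixes), 2):
        if a[0] == b[0]:
            continue  # same animal, not a contact
        close_in_time = abs(a[1] - b[1]) <= max_gap_s
        close_in_space = hypot(a[2] - b[2], a[3] - b[3]) <= max_dist_m
        if close_in_time and close_in_space:
            contacts.add(tuple(sorted((a[0], b[0]))))
    return contacts


# Hypothetical fixes: (animal, seconds since start, x in m, y in m).
fixes = [
    ("P1", 0, 0.0, 0.0),
    ("P2", 120, 3.0, 4.0),
    ("P3", 60, 200.0, 0.0),
    ("P1", 4000, 0.0, 0.0),
    ("P3", 4100, 1.0, 1.0),
]

for threshold in (2.0, 5.0, 350.0):
    pairs = sorted(direct_contacts(fixes, max_dist_m=threshold))
    print(threshold, pairs)
```

The pairwise loop is quadratic in the number of fixes, which is adequate for a small example. A real data set of this size would normally need time-window indexing or a spatial index.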
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
We thank the reviewers for their comments and provide answers, clarifications and new data. There were three important recurrent points, which we address here:
(a) The reviewers were concerned about whether the observed motor defects (measured by startle-induced negative geotaxis, “SING”) were a reasonable behavioral measure of DAN function.
Previously, Riemensperger et al., 2013 (PMID: 24239353) already linked synaptic loss of the dopaminergic PAM neurons to SING impairments. Furthermore, in a separate paper that we recently posted on BioRxiv, we show that the SING defects in PD mutants are rescued when the flies are fed L-DOPA (Kaempf et al 2024; BioRxiv). In this same paper we also show a very strong correlation between SING defects and defects in dopaminergic synaptic innervation of PAM DAN onto Mushroom body neurons. Both experiments suggest that the motor defects are the result of defects in dopamine release. Altogether, these data suggest that the combination of the SING assay and a quantification of the synaptic region of PAM DAN onto Mushroom body neurons is a suitable measure for DAN function.
(b) The reviewers asked if the OPN dysfunction in young animals is connected to dopaminergic neuron (DAN) dysfunction in later life.
We have conducted additional experiments and have included the results (new Figure 6): Our young PD mutants (we included Aux<sup>R927G</sup>, Synj<sup>R258Q</sup> and LRRK2<sup>G2019S</sup>) show olfactory defects, but normal DAN function (measured by assessing the TH-labeled synaptic area onto the Mushroom body neurons and by SING). Aged PD mutants show both olfactory defects and DAN dysfunction. When we express the wildtype PD gene in (among others) the OPN of PD mutants using GH146-Gal4 (which does not drive expression in DAN), we are able to rescue the DAN defects (synaptic area and SING) that occur later in life. This indeed suggests there is a cell non-autonomous positive effect on DAN dysfunction that occurs at later stages in the life of our PD mutants (new Figure 6a).
In a set of independent experiments, we also fed one of our mutants (LRRK2<sup>G2019S</sup>) nicotine, activating Nicotinic acetylcholine receptors (that are also activated by the release of acetylcholine from cholinergic neurons such as OPN). While nicotine does not rescue the olfactory preference defect, the OPN synapse morphology defect or the OPN-associated defects in Ca<sup>2+</sup>-imaging in LRRK2<sup>G2019S</sup> mutants (Figure 6b), it does rescue the DAN-associated defects, including SING, synapse loss and defects in Ca<sup>2+</sup>-imaging (Figure 6c).
Finally, we generated human induced dopaminergic neurons derived from iPSC with a LRRK2<sup>G2019S</sup> mutation and incubated these neurons with nicotine. Again, this induced a rescue of a LRRK2-mutant-induced defect in neuronal activity measured by Ca<sup>2+</sup>-imaging. This is specific to nicotine since the rescue was absent when cells were also incubated with mecamylamine, a non-competitive antagonist of nicotinic acetylcholine receptors, trumping the effects of nicotine (Figure 6d-e").
(c) The reviewers indicated that the GH146-Gal4 driver is expressed in cells other than OPN, and thus noted that the defects we observe may not only be the result of OPN dysfunction.
It is correct that GH146-dependent Gal4 expression includes OPNs (that are cholinergic) and one pair of inhibitory APL neurons (that are GABAergic) (Li et al., 2017 (PMID: 29149607), Liu et al., 2009 (PMID: 19043409)). We have adapted the text to explicitly state this. There are only 2 APL neurons per fly brain and our single cell sequencing experiment does not have the resolution to allow us to test if these neurons had a significant number of DEG. However, as indicated above (in (b)), we are able to rescue DAN dysfunction by mimicking cholinergic output (application of nicotine). These data do not exclude that APL-neuron problems contribute to the defects we observe in our PD mutants, but they do suggest that cholinergic output is critical to maintain normal DAN function.
Public Reviews:
Reviewer #1 (Public Review):
This is a fantastic, comprehensive, timely, and landmark pan-species work that demonstrates the convergence of multiple familial PD mutations onto a synaptic program. It is extremely well written and I have only a few comments that do not require additional data collection.
Thank you for this enthusiastic endorsement.
Major Comments:
(1) In the functional experiments performing calcium imaging on projection neurons I could not find a count of cell bodies across conditions. Since the loss of OPNs could explain the reduced calcium signal, this is a critical control to perform. A differential abundance test on the single-cell data would also suffice here and be easy for the authors to perform with their existing data.
This is indeed an important number, and we had included this in the Supplemental figure 2a.
Also, the number of DAN and Visual projection neurons were not significantly different between the genotypes (Supplemental Figure 2a in the manuscript).
(2) One of the authors' conclusions is that cholinergic neurons and the olfactory system are acutely impacted by these PD mutations. However, I wonder if this is the case:
a. Most Drosophila excitatory neurons are cholinergic and only a subpopulation appear to be dysregulated by these mutations. The authors point out that visual neurons also have many DEGs, couldn't the visual system also be dysregulated in these flies? Is there something special about these cholinergic neurons versus other cholinergic neurons in the fly brain? I wonder if they can leverage their nice dataset to say something about vulnerability.
Yes, the reviewer is right, and we have changed our wording to be more specific. The reviewer also noted correctly that neurons in the visual system rank high in terms of number of DEGs, but we did not conduct elaborate experiments to assess if these visual system neurons are functional. Of note, several of our mutants show (subtle) electroretinogram defects, that are a measure of visual system integrity, but further work is needed to determine the origin of these defects.
The question about the nature of the underlying vulnerability pathways is interesting. In preliminary work we have selected a number of DEGs common to vulnerable cells in several PD mutants, and conducted a screen where we manipulated the expression of these DEGs and looked for rescue of the olfactory preference defects in our PD mutants. The strongest genetic interaction was with genes encoding proteins involved in proteostasis (Atg8/LC3, Lamp1 and Hsc70-4) (Author response image 1). While interesting, these results require further work to understand the underlying molecular mechanisms. We present these preliminary data here but have not included them in the main manuscript.
b. As far as I can tell, the cross-species analysis of DEGs (Figure 3) is agnostic to neuronal cell type, although the conclusion seems to suggest only cholinergic neurons were contrasted. Is this correct? Could you please clarify this in the text as it's an important detail. If not, have the authors tried comparing only cholinergic neuron DEGs across species? That would lend strength to their specificity argument. The results for the NBM are impressive. Could the authors add more detail about other regions to the main text?
The reviewer is correct that we compiled the DEG of all affected cells, the majority of which are cholinergic neurons.
For the human data we focused on the NBM samples, because it contained the highest fraction of cholinergic neurons (as compared to the other 2 regions), but even so, it was not possible to analyze the cholinergic neurons alone because the fraction of cholinergic neurons in the human material was too low to be statistically analyzed independently. Note that both wildtype and PD samples contained a low number of cholinergic neurons (i.e. the DEG differences we detected were not the result of sequencing different types of cells - see also Supplemental Figure 3b and d). We have indicated this more clearly in the text.
c. Uniquely within the human data, are cholinergic neurons more dysregulated than others? I understand this is not an early timepoint but would still be useful to discuss.
As indicated in the previous point, unfortunately the fraction of cholinergic neurons in the human material was low and we were not able to analyze these cells on their own.
Author response image 1.
Upregulation of protein homeostasis rescues hyposmia across familial models of PD. Results of a behavioral screen for cell-specific rescue of olfactory preference defects of young PD fly models using up- and downregulation of deregulated genes in affected cell types. Genes implicated in the indicated pathways are overexpressed or knocked down using GH146-Gal4 (OPN>) and UAS-constructs (overexpression or RNAi). UAS-only (-) and OPN>UAS (+) were scored in parallel and are compared to each other. n.d.: not determined; bars represent mean ± s.e.m.; grey zone indicates the variance of controls; n≥5 independent experiments per genotype, with ~50 flies each; red bars: p<0.05 in ANOVA and Bonferroni-corrected comparison to UAS-only control.
d. In the discussion, the authors say that olfactory neurons are uniquely poised to be dysregulated as they are large and have high activity. Is this really true compared to other circuits? I didn't find the references convincing and I am not sure this has been borne out in electron microscopy reconstructions for anatomy.
We agree and have toned down this statement.
Reviewer #2 (Public Review):
Summary:
Pech et al selected 5 Parkinson's disease-causing genes, and generated multiple Drosophila lines by replacing the Drosophila lrrk, rab39, auxilin (aux), synaptojanin (synj), and Pink1 genes with wild-type and pathogenic mutant human or Drosophila cDNA sequences. First, the authors performed a panel of assays to characterize the phenotypes of the models mentioned above. Next, by using single-cell RNA-seq and comparing fly data with human postmortem tissue data, the authors identified multiple cell clusters being commonly dysregulated in these models, highlighting the olfactory projection neurons. Next, by using selective expression of Ca<sup>2+</sup>-sensor GCaMP3 in the OPN, the authors confirmed the synaptic impairment in these models, which was further strengthened by olfactory performance defects.
Strengths:
The authors overall investigated the functionality of PD-related mutations at endogenous levels and found a very interesting shared pathway through single-cell analysis. More importantly, they performed nice follow-up work using multiple assays.
Weaknesses:
While the authors state this is a new collection of five familial PD knock-in models, the Aux<sup>R927G</sup> model has been published and carefully characterized in Jacquemyn et al., 2023. ERG has been performed for Aux R927G in Jacquemyn et al., 2023, but the findings are different from what's shown in Figure 1b and Supplementary Figure 1d, which the authors should try to explain.
We should have explained this better: the ERG assay in Jacquemyn et al., and here, in Pech et al., are different. While the ERGs in our previous publication were recorded under normal endogenous conditions, the flies in our current study were exposed to constant light for 7 days. This is often done to accelerate the degeneration phenotype. We have now indicated this in the text (and also refer to the different experimental set up compared to Jacquemyn et al).
Moreover, according to the authors, the hPINK1 control was the expression of human PINK1 with UAS-hPINK1 and nsyb-Gal4 due to technical obstacles. Having the PINK1 WT as an overexpression model makes it difficult to explain PINK1 mutant phenotypes. The work would be strengthened if the authors used UAS-hPINK1 and nsyb-Gal4 (or maybe a ubiquitous Gal4) to rescue the hPink1L347P and hPink1P399L phenotypes.
The UAS-hPink1 was originally created by the Lu lab (Yang et al., 2003, PMID: 12670421) and has been amply used before in Pink1 loss-of-function backgrounds (e.g. in Yang et al., 2006, PMID: 16818890). In our work, the control we refer to was UAS-hPink1 expression (driven by nSyb-gal4) in a Pink1 knock-out background. For unknown reasons we were unable to replace the fly Pink1 with a human pink1 cDNA; we explained this in the methods section and added a remark in the new manuscript.
In addition, although the authors picked these models to target different biology/pathways, Aux and Synj both act in related steps of clathrin-mediated endocytosis, with LRRK2 being an accessory regulatory protein. Therefore, is the data set more favorable for identifying synaptic-related defects?
We picked these particular mutants, as they were the first we created in the context of a much larger collection of “PD flies” (see also Kaempf et al 2024, BioRxiv). We have made adaptations to the text to tone down the statement on the broad selection of mutants.
GH146-GAL4+ PNs are derived from three neuroblast lineages, producing both cholinergic and GABAergic inhibitory PNs (Li et al, 2017). Therefore, OPN neurons comprise more than "cholinergic projection neurons". How do we know from single-cell data that cholinergic neurons were more vulnerable across 5 models?
The reviewer is correct that GH146 drives expression in other cells than OPN and we now clearly state this in the text. We do present additional arguments that substantiate our conclusion that cholinergic neurons are affected: (1) our single cell sequencing identifies the most DEGs in cholinergic neurons. (2) nicotine (a compound activating cholinergic receptors) rescues dopamine-related problems in old PD-mutant flies. (3) Likewise, nicotine also alleviates problems we observed in LRRK2 mutant human induced dopaminergic neurons and this is blocked by mecamylamine, a non-competitive antagonist of nicotinic acetylcholine receptors.
In Figure 1b, the authors assumed that locomotion defects were caused by dopaminergic neuron dysfunction. However, to better support it, the author should perform rescue experiments using dopaminergic neuron-specific Gal4 drivers. Otherwise, the authors may consider staining DA neurons and performing cell counting. Furthermore, the authors stated in the discussion, that "We now place cholinergic failure firmly ahead of dopaminergic system failure in flies", which feels rushed and insufficient to draw such a conclusion, especially given no experimental evidence was provided, particularly related to DA neuron dysfunction, in this manuscript.
Previously, Riemensperger et al., 2013 (PMID: 24239353) already linked synaptic loss of the dopaminergic PAM neurons to locomotion impairments (measured by SING). Furthermore, in a separate paper we show that the motor defects (SING) observed in PD mutants are rescued when the flies are fed L-DOPA, but not D-DOPA (Kaempf et al 2024; BioRxiv). In this same paper, we also show a significant correlation between SING defects and defects in dopaminergic synaptic innervation of PAM DAN onto Mushroom body neurons. We have referred to both articles in the revised manuscript.
The statement on cholinergic failure ahead of dopaminergic failure was made in the context of the sequence of events: young flies did not show DAN defects, but they did display olfactory defects. The statement was indeed not meant to imply causality. However, we have now conducted new experiments where we express wild type PD genes using GH146-Gal4 (that does not express in DAN) in the PD mutants and assess dopaminergic-relevant phenotypes later in life (see also new Figure 6 in the manuscript). This shows that GH146-Gal4-specific rescue is sufficient to alleviate the DAN-dependent SING defects in old flies. Likewise, as indicated above, application of nicotine is also sufficient to rescue the DAN-associated defects (in PD mutant flies and human induced mutant dopaminergic neurons).
It is interesting to see that different familial PD mutations converge onto synapses. The authors have suggested that different mechanisms may be involved directly through regulating synaptic functions, or indirectly through mitochondria or transport. It will be improved if the authors extend their analysis on Figure 3, and better utilize their single-cell data to dissect the mechanisms. For example, for all the candidates listed in Figure 3C, are they all altered in the same direction across 5 models?
This is indeed the case: the criteria for "commonly deregulated" included that the DEGs are changed in the same direction across several mutants. We ranked genes according to their mean gene expression across the mutants as compared to the wildtype control: i.e. only if the DEGs are all up- or all down-regulated do they end up at the top or bottom of our list. We added a remark in the revised manuscript. In preliminary work we also selected a number of the DEGs and conducted a screen where we manipulated the expression of these genes looking for rescue of the olfactory preference defects in our PD mutants. The strongest genetic interaction was with genes encoding proteins involved in proteostasis (Atg8/LC3, Lamp1 and Hsc70-4; and we also show a genetic interaction between EndoA and Lrrk in this work and in Matta et al., 2012) (Author response image 1 above). While interesting, these results require further work to understand the underlying molecular mechanisms. We present these preliminary data here, but have not included them in the main manuscript.
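The ranking logic described in this response can be illustrated with a small sketch. It is a toy example rather than the analysis code used for the manuscript. It assumes that the change relative to the wildtype control is expressed as a log2 fold change per mutant, and the gene names, values, and function names are hypothetical. Genes changed in the same direction in every mutant end up at the top or bottom of the ranking, whereas genes with mixed directions average out towards the middle.

```python
from statistics import mean


def rank_by_mean_change(log_fc):
    """Rank genes from most up-regulated to most down-regulated.

    log_fc maps a gene name to a list of log2 fold changes
    (mutant versus wildtype control), one value per mutant.
    """
    return sorted(log_fc, key=lambda gene: mean(log_fc[gene]), reverse=True)


def same_direction(changes):
    """True if every mutant shows a change with the same sign."""
    return all(c > 0 for c in changes) or all(c < 0 for c in changes)


# Hypothetical log2 fold changes for three genes across five mutants.
log_fc = {
    "geneA": [1.0, 0.8, 1.2, 0.9, 1.1],       # up in all mutants
    "geneB": [-1.0, -0.7, -1.3, -0.9, -1.1],  # down in all mutants
    "geneC": [1.5, -1.4, 1.0, -1.2, 0.1],     # mixed directions
}

ranked = rank_by_mean_change(log_fc)
for gene in ranked:
    print(gene, round(mean(log_fc[gene]), 2), same_direction(log_fc[gene]))
```

In this toy data set, the consistently up-regulated gene ranks first and the consistently down-regulated gene ranks last, even though the mixed gene has the largest change in a single mutant.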
While this approach is carefully performed, the authors should state in the discussions the strengths and the caveats of the current strategy. For example, what kind of knowledge have we gained by introducing these mutations at an endogenous locus? Are there any caveats of having scRNAseq at day 5 only but being compared with postmortem human disease tissue?
We have included a “strengths and caveats section” in the discussion addressing these points.
Reviewer #3 (Public Review):
Summary:
This study investigates the cellular and molecular events leading to hyposmia, an early dysfunction in Parkinson's disease (PD), which develops up to 10 years prior to motor symptoms. The authors use five Drosophila knock-in models of familial PD genes (LRRK2, RAB39B, PINK1, DNAJC6 (Aux), and SYNJ1 (Synj)), three expressing human genes and two Drosophila genes with equivalent mutations.
The authors carry out single-cell RNA sequencing of young fly brains and single-nucleus RNA sequencing of human brain samples. The authors found that cholinergic olfactory projection neurons (OPN) were consistently affected across the fly models, showing synaptic dysfunction before the onset of motor deficits, known to be associated with dopaminergic neuron (DAN) dysfunction.
Single-cell RNA sequencing revealed significant transcriptional deregulation of synaptic genes in OPNs across all five fly PD models. This synaptic dysfunction was confirmed by impaired calcium signalling and morphological changes in synaptic OPN terminals. Furthermore, these young PD flies exhibited olfactory behavioural deficits that were rescued by selective expression of wild-type genes in OPNs.
Single-nucleus RNA sequencing of post-mortem brain samples from PD patients with LRRK2 risk mutations revealed similar synaptic gene deregulation in cholinergic neurons, particularly in the nucleus basalis of Meynert (NBM). Gene ontology analysis highlighted enrichment for processes related to presynaptic function, protein homeostasis, RNA regulation, and mitochondrial function.
This study provides compelling evidence for the early and primary involvement of cholinergic dysfunction in PD pathogenesis, preceding the canonical DAN degeneration. The convergence of familial PD mutations on synaptic dysfunction in cholinergic projection neurons suggests a common mechanism contributing to early non-motor symptoms like hyposmia. The authors also emphasise the potential of targeting cholinergic neurons for early diagnosis and intervention in PD.
Strengths:
This study presents a novel approach, combining multiple mutants to identify salient disease mechanisms. The quality of the data and analysis is of a high standard, providing compelling evidence for the role of OPN neurons in olfactory dysfunction in PD. The comprehensive single-cell RNA sequencing data from both flies and humans is a valuable resource for the research community. The identification of consistent impairments in cholinergic olfactory neurons, at early disease stages, is a powerful finding that highlights the convergent nature of PD progression. The comparison between fly models and human patients' brains provides strong evidence of the conservation of molecular mechanisms of disease, which can be built upon in further studies using flies to prove causal relationships between the defects described here and neurodegeneration.
The identification of specific neurons involved in olfactory dysfunction opens up potential avenues for diagnostic and therapeutic interventions.
Weaknesses:
The causal relationship between early olfactory dysfunction and later motor symptoms in PD remains unclear. It is also uncertain whether this early defect contributes to neurodegeneration or is simply a reflection of the sensitivity of olfactory neurons to cellular impairments. The study does not investigate whether the observed early olfactory impairment in flies leads to later DAN deficits. Additionally, the single-cell RNA sequencing analysis reveals several affected neuronal populations that are not further explored. The main weakness of the paper is the lack of conclusive evidence linking early olfactory dysfunction to later disease progression.
We agree that this is an interesting avenue to pursue and, as indicated above in Figure 6 and in the reworked manuscript, we have now included data that strengthen the connection between early OPN defects and the later DAN-dependent problems. Additional future work will be needed to elucidate the mechanisms of this non-cell-autonomous effect.
The rationale behind the selection of specific mutants and neuronal populations for further analysis could be better qualified.
We have added further explanation in the reworked text.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Minor Comments:
(1) Questions about the sequencing methods and analysis approaches. From reading the methods and main text, I was confused about aspects of the Drosophila single-cell profiling. Firstly, did the authors multiplex their fly samples?
No, we did not. Genotypes were separately prepared and sequenced, but they were all processed in parallel to avoid batch effects.
Secondly, it seems like there are two rounds of dataset integration performed, Harmony and Seurat's CCA-based method. This seems unorthodox. Could the authors comment on why they perform two integrations?
Thanks for pointing this out; this was a mistake in the methods section (copied from a much older version of the manuscript). In this manuscript, we only used Harmony for dataset integration and have removed the methods on Seurat-CCA.
Finally, for all dataset integrations please state in the main text how datasets were integrated (by age, genotype, etc).
Datasets were integrated by sample id, corresponding to individual libraries.
(2) The authors focus on OPNs with a really nice set of experiments. I noticed however that Kenyon cells were also dysregulated. What about Olfactory sensory neurons? Could the authors provide comments on this?
Olfactory sensory neurons are located in the antennae of the fly and were not captured by our analysis. However, the GH146-Gal4-specific rescue experiments indicate these sensory neurons are likely not severely functionally impaired. Kenyon cells are an interesting affected cell type to look at in future experiments, as they are directly connected to DANs.
(3) There are several citations of Jenett et al 2012 that seem wrong (related to single-cell datasets).
We are sorry for this and have corrected this in the text.
Reviewer #2 (Recommendations For The Authors):
(1) In the key resources table, a line called CG5010k.o. (chchd2k.o.) was mentioned, but was not used in the paper. The authors should remove it.
Sorry, this was from a previous older version of the manuscript. We fixed this.
(2) Why did the authors use human CDS for LRRK2, Rab39B, and PINK1, but fly CDS for Aux and Synj1? Is it based on the conservation of amino acid residues? Although the authors cited a review (Kalia & Lang, 2015) to justify the selection of the mutations, for the interest of a broad audience, it is recommended that the authors expand their introduction for the rationale of their selection, including the pathogenicity of each selected mutation, original human genetics evidence, conservation between fly and human.
(a) We used Drosophila cDNA for rescue experiments with aux and synj since knock-in of the human homologues at the locus of these genes did not rescue their loss-of-function (lethality).
(b) We expanded the introduction to provide further explanation on the selection of the mutants we analyzed in this work. We picked these particular mutants, as they were the first we created in the context of a much larger collection of “PD flies” (see also Kaempf et al 2024, BioRxiv). We have made adaptations to the text to tone down the statement on the broad selection of mutants.
(3) Supplemental Figure 1a, is mRNA level normalized to an internal control? If not, it is not appropriate to compare the results directly from two primer sets, since each primer set may have different amplification efficiency.
We are sorry for the lack of information. Indeed, mRNA levels were determined using the Δ-Δ-CT method, where Ct values were first normalized to the housekeeping gene Rp49, and next expressed as a percent of endogenous Drosophila gene expression. We expanded the methods section and now also list the primers for Rp49 along with the other qPCR primers in Supplemental File 1.
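The normalization described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name and the Ct values are hypothetical, and the calculation assumes perfect (two-fold per cycle) amplification efficiency for every primer set, which is exactly the assumption the reviewer asks about.

```python
def relative_expression_percent(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Express a target transcript as a percentage of the control sample.

    Assumes 100% amplification efficiency (a factor of 2 per cycle).
    """
    # Step 1: normalize each sample to its housekeeping gene (e.g. Rp49).
    delta_sample = ct_target - ct_ref
    delta_control = ct_target_ctrl - ct_ref_ctrl
    # Step 2: compare the normalized sample with the normalized control.
    delta_delta = delta_sample - delta_control
    # Step 3: convert the cycle difference into a fold change, then a percentage.
    return 100.0 * 2.0 ** (-delta_delta)


# Hypothetical Ct values: the knock-in transcript crosses threshold one cycle
# later than the endogenous gene after normalization, i.e. half the expression.
percent = relative_expression_percent(25.0, 18.0, 24.0, 18.0)
```

If the primer sets differ in efficiency, the base of the exponent would need to be replaced by the measured efficiency of each primer pair.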
(4) For Figure 2, it may be helpful to have a supplemental table or figure showcasing the clusters with significant changes (based on cell number-adjusted DEGs) for each model, i.e., what are those black cell clusters in Figure 2? "Thus, cellular identity and cellular composition are preserved in young PD fly models." In Figure S2A, the authors only show cell composition percentages for 3 cell clusters, are the bars 95% standard error?
The error bars in Supplemental Figure 2a represent the 95 % CI. We have included a new supplemental table with the number of cells per cell cluster for each mutant (Supplemental File 3).
What about the remaining 183 cell clusters? Are there any KI-model cell clusters that are statistically different than controls? What about the annotated cell types (e.g., the 81 with cell identities)? Please consider at least providing or pointing to a table to state how many have significant differences, or if there are truly none.
As mentioned above, we have included a new supplemental table with the number of cells per cell cluster for each mutant (Supplemental File 3).
(5) What are the rows in the sunburst plot in Figure 3a? Please be more descriptive in the figure legend or label the figure.
We have expanded on this in the figure legend and now also include a summary of the SynGO analysis in Supplemental File 7. In Figure 3a, a summary sunburst plot is presented, reflecting the GO terms (inner rings, indicated in a) with their subdivided levels (the complete list is provided in Supplemental File 7). In Figure 3a’ and a” the DEG data acquired from the different datasets (human vs fly) are applied to the sunburst plot where rings are color-coded according to enrichment Q-value.
(6) In Table S4, which clusters (in the table) have normalized residuals that are outside of the 95% confidence interval of the regression model displayed in Figure S2e? They use this analysis to adjust for cell number bias and point out the "most significant cell clusters" affected in each model. This may be helpful for readers who want to grab a full list of responsive clusters.
We have included this information in Supplemental File 5 (Tab “Cell types outside of CIs”) in the supplemental data of the manuscript.
(7) The human samples used all have different LRRK2 variants: for the cross-species comparisons, do Lrrk flies have greater similarity to the human PD cases compared to the other fly models?
No, comparing the vulnerable gene signatures from each of the fly mutants to the DEGs from the human samples does not show any greater similarity between the LRRK mutants compared to the other mutants.
Reviewer #3 (Recommendations For The Authors):
Clarifications required:
Some of the mutations used are not common PD-associated genes, the authors should explain the rationale behind using these particular mutants, and not using well-established fly models of PD (like for example GBA flies) or SNCA overexpression.
We opted to use knock-ins of mutations that are causal to Parkinsonism. Given flies do not express an alpha-synuclein homologue we were not able to add this ‘as such’ to our collection. Future work can indeed also include expression models or risk factor models (like GBA). As also requested by another reviewer, we did add further rationale and explanation to the genes we chose to analyze in this work.
Why starvation rather than lifespan for PD models? For the lifespan data shown there are no error bars, if the stats test is a log-rank or Cox proportional hazards (usually used in survival analysis, this should be stated), it would also be good to have the survival plots for all the survival during starvation, not just PINK1.
While starvation assays can provide valuable insights into acute metabolic and physiological stress responses, we acknowledge that lifespan is a critical parameter and would provide a more comprehensive understanding of the PD models in our study. Based on this consideration and the reviewer’s feedback we have removed the starvation data from the manuscript. Unfortunately, we did not perform lifespan experiments, which is why these data were not included in the manuscript. However, based on our observations (though not detailed analysis), all genotypes tested—except for the PINK1 mutants—appeared to have a normal lifespan. For PINK1 mutants, most flies died by 25 days of age. Therefore, we conducted our assays using 15-day-old PINK1 mutant flies.
Do the fly models used have different lifespans, and how close to death was the SING assay performed? Different mutations show different effects, most phenotypes are really mild (hRab39BG192R has no phenotype), and PINK1 has the strongest, are these simply reflections of how strong the model is?
The ages of the flies we analyzed are indicated in the legend. As mentioned before, all mutants except PINK1 had a normal lifespan, i.e. we did not detect abnormally low numbers of flies or premature death at 50 days of age, except for the PINK1 mutants tested in this manuscript, where most flies died by 25 days of age. Therefore, we conducted our assays using 15-day-old PINK1 mutant flies.
Rab39G192R has no phenotype in the tests presented, suggesting no degeneration, why use RabG192R for scRNA seq? Seems an odd choice, the authors should explain.
Single-cell sequencing was initiated before the full phenotypic characterization of all mutants was completed. Although basic characterization of the Rab39<sup>G192R</sup> mutant PD flies revealed either no significant phenotypes or only mild effects in the assays performed (Figure 1), the sequencing data provided additional insights into potential cellular and molecular alterations. Furthermore, all PD-mutant knock-ins, including Rab39<sup>G192R</sup> mutant PD flies, show dysfunctional synaptic terminals of their OPN neurons as they had significantly weaker Ca<sup>2+</sup>-responses, even though their synaptic area was increased (Figure 4 g-h). Furthermore, all mutants also had olfactory behavior defects (Figure 5 a).
When the authors state that “For example, in the NBM, an area associated with PD (Arendt et al., 1983), 20% of the DEG that has an orthologous gene in the fly are also found among the most deregulated genes across PD fly models" a test should be performed to confirm this is a significant overlap (such as a hypergeometric test).
We have performed this test: of the 2486 significantly differentially expressed human genes, 1149 have a fly orthologue, and of these, 28.46 % overlap with the deregulated fly genes (top and bottom 5 % of genes, as shown in Supplemental Table 7). A hypergeometric test confirms that this overlap is significant, with a p-value of 9.06e<sup>-76</sup>. We have included this in the text.
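For readers unfamiliar with the test, a hypergeometric overlap test can be sketched as below. The function name and the toy numbers are hypothetical; the gene universe and the size of the deregulated fly gene set are not stated in this response, so the sketch does not attempt to reproduce the reported p-value.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """Return P(X >= observed) for draws made without replacement.

    population: size of the gene universe
    successes:  genes in the universe that belong to the reference set
    draws:      size of the query gene list
    observed:   overlap between query list and reference set
    """
    max_overlap = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, max_overlap + 1)
    )
    return tail / total


# Toy example: universe of 20 genes, 5 in the reference set,
# a query list of 4 genes, of which 2 overlap with the reference set.
p_value = hypergeom_upper_tail(20, 5, 4, 2)
```

The upper tail is used because the question is whether the overlap is at least as large as the one observed.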
The authors speak of deregulation when speaking of the overlap between human and fly DE genes, but do the over-expressed genes in flies overlap with overexpressed genes in humans, or is the direction of transcription deregulation not concordant? If it is mostly not concordant, can the authors please comment as to why they might think that is the case?
In our fly experiments, we identified DEG in affected cell types and then defined common DEG by looking at the average change across the fly mutants. Genes that show a consistent change (all or mostly up, or all or mostly down) in the different mutants will end up at the top of our list, while genes that are up in some mutants and down in others will average out and not end up in our commonly deregulated gene list. For comparison to the human data, we only looked for the presence of the human homologue, but did not assess whether the change occurred in the same direction. More work will be needed to define the most relevant changes, but in a mini-screen we did select a number of DEG present in fly and human datasets from different functional categories and tested whether they genetically interact with our PD mutants. As shown in Reviewer Figure 3, we find that modulating genes encoding proteostasis pathway components rescues the olfactory preference defect across many PD mutants.
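The averaging logic described above can be illustrated with a short sketch. The gene names, fold-change values and function name are hypothetical and only show why genes with mixed directions drop out of the ranking; this is not the authors' pipeline.

```python
def rank_by_mean_change(log_fold_changes):
    """Sort genes by mean log2 fold change across mutants, most upregulated first."""
    means = {gene: sum(values) / len(values) for gene, values in log_fold_changes.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log2 fold changes (mutant vs. control) in five mutants.
example = {
    "geneA": [1.2, 0.9, 1.1, 1.0, 0.8],       # consistently up
    "geneB": [-1.0, -1.2, -0.9, -1.1, -0.8],  # consistently down
    "geneC": [1.5, -1.4, 1.2, -1.3, 0.0],     # mixed directions cancel out
}
ranked = rank_by_mean_change(example)
```

Consistently changed genes land at the two ends of `ranked`, whereas `geneC` ends up in the middle because its opposite changes average to roughly zero.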
Can the authors explain why only the NBM region was used for comparison with the fly data?
We used the NBM because this region has the highest number of cholinergic neurons, allowing us to compare the deregulation in those neurons to the deregulation in the cholinergic OPN of mutant PD flies.
In Figure 4, can the genotypes please be stated in full and why is the hPINK1 fly giving no detectable signal?
Despite several attempts, we failed to knock in wild-type hPink1 at the fly pink1 locus. Therefore, the hPink1 control used throughout the manuscript was nSyb-Gal4>UAS-hPink1 in a Pink1 knock-out background, except for Figure 4. For the experiments in this figure, we could not use UAS-hPink1 with nSyb-Gal4, since we needed OPN-specific expression of Gal4 to drive UAS-GCaMP expression.
Therefore, this was labeled as “not determined” (“n.d.”), as indicated in the figure and the legend. We explained this better in the methods section, added a remark in the new manuscript and expanded the legend of Figure 4.
The paper states that "These findings imply that factors affecting the function of cholinergic neurons might, by the absence of insufficient innervation, lead to DAN problems and degeneration, warranting further exploration of the underlying molecular mechanisms". This should be less strong: the paper never looks at DAN, only at OPN neurons. Fly neurons are mostly cholinergic, and human neurons are mostly glutamatergic, so jumping from one system to the other might not be as straightforward; the authors should comment on this.
We now included a new exciting experiment where we assessed DAN function in aged PD mutants where the wildtype gene was expressed in OPN using GH146-Gal4. We find this manipulation rescued DAN defects (measured by SING) in older flies. We further corroborated our observation by “replacing” cholinergic innervation with nicotine feeding in PD mutants. Also, this rescues the SING defect as well as the defects in neuronal activity in PAM DAN (based on live synaptic calcium imaging). Finally, we also show that incubating LRRK2<sup>G2019S</sup> mutant human induced dopaminergic neurons with nicotine is sufficient to rescue functional defects in these neurons (measured using calcium imaging). We included this data in the new manuscript and show them also in Figure 6 above (new Figure 6 in the revised manuscript).
Experiments that would improve the manuscript:
Does rescue of OPN function also rescue later progressive symptoms (geotaxis response)?
It does, as indicated in the previous point and shown in Figure 6.
Do the fly PD models used show DAN degeneration? This could be assessed by stains with anti-TH stains.
We quantified DAN cell bodies using anti-TH, but see very little or no loss. There is, however, loss of synaptic innervation of the PAM onto the mushroom bodies. We included the data in a new Figure 6 (see also Figure 6). Furthermore, we have quantified this across the genetic space of familial Parkinsonism in Kaempf et al., 2024, BioRxiv. Note that this phenotype is also rescued by expressing wildtype CDS in their OPN using GH146-Gal4.
Minor issues:
The final sentence on page 5 is repetitive with the introduction.
Indeed, we removed the redundant sentence.
First line of the new section on page 6, the authors probably mean cholinergic olfactory projection neurons, not just cholinergic neurons.
Yes, and corrected.
At the top of page 7 the authors state: "Additionally, we also found enrichment of genes involved in RNA regulation and mitochondrial function that are also important for the functioning of synaptic terminals", where is the data showing this? The authors should point to the supplemental file showing this.
We now included a reference to Supplemental File 7 that includes a summary of those data. Additionally, we also included references to back this claim.
Just before the discussion, Rab39BG193R should be Rab39BG192R.
Sorry for this, it is now corrected.
Stating "fifth row" in Fig 5c and d is confusing, can the figure be labelled more clearly?
We modified the figure (including extra marks and colors) and expanded the legend and the main text to differentiate better between expression of the rescues in OPN versus T1 neurons revealing that only expression in OPN neurons rescues the olfactory defects while expression in T1 neurons does not.
In the methods, the authors describe clustering done both in Scanpy and Seurat, why were both run? Which clustering was used for further analysis?
We only used Scanpy with Harmony and removed the methods on Seurat-CCA. Thanks for pointing this out, this was a mistake in the methods section (copied from a previous version of the manuscript).
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Vision is a highly active process. Humans move their eyes 3-4 times per second to sample information with high visual acuity from our environment, and where eye movements are directed is critical to our understanding of active vision. Here, the authors propose that the cost of making a saccade contributes critically to saccade selection (i.e., whether and where to move the eyes). The authors build on their own recent work that the effort (as measured by pupil size) that comes with planning and generating an eye movement varies with saccade direction. To do this, the authors first measured pupil size for different saccade directions for each participant. They then correlated the variations in pupil size obtained in the mapping task with the saccade decision in a free-choice task. The authors observed a striking correlation: pupil size in the mapping task predicted the decision of where to move the eyes in the free choice task. In this study, the authors provide a number of additional insightful analyses (e.g., based on saccade curvature, and saccade latency) and experiments that further support their claim that the decision to move the eyes is influenced by the effort to move the eyes in a particular direction. One experiment showed that the same influence of assumed saccade costs on saccade selection is observed during visual search in natural scenes. Moreover, increasing the cognitive load by adding an auditory counting task reduced the number of saccades, and in particular reduced the costly saccades. In sum, these experiments form a nice package that convincingly establishes the association between pupil size and saccade selection.
We thank the reviewer for highlighting the novelty and cogency of our findings.
In my opinion, the causal structure underlying the observed results is not so clear. While the relationship between pupil size and saccade selection is compelling, it is not clear that saccade-related effort (i.e., the cost of a saccade) really drives saccade selection. Given the correlational nature of this relationship, there are other alternatives that could explain the finding. For example, saccade latency and the variance in landing positions also vary across saccade directions. This can be interpreted for instance that there are variations in oculomotor noise across saccade directions, and maybe the oculomotor system seeks to minimize that noise in a free-choice task. In fact, given such a correlational result, many other alternative mechanisms are possible. While I think the authors' approach of systematically exploring what we can learn about saccade selection using pupil size is interesting, it would be important to know what exactly pupil size can add that was not previously known by simply analyzing saccade latency. For example, saccade latency anisotropies across saccade directions are well known, and the authors also show here that saccade costs are related to saccade latency. An important question would be to compare how pupil size and saccade latency uniquely contribute to saccade selection. That is, the authors could apply the exact same logic to their analysis by first determining how saccade latencies (or variations in saccade landing positions; see Greenwood et al., 2017 PNAS) vary across saccade directions and how this saccade latency map explains saccade selection in subsequent tasks. Is it more advantageous to use one or the other saccade metric, and how well does a saccade latency map correlate with a pupil size map?
We thank the reviewer for the detailed comment. 1) The reviewer first points out the correlational nature of many of our results. Thereafter, 2), the reviewer asks whether saccade latencies and landing precision also predict saccade selection, and whether these potential predictors could be considered alternative explanations to the idea of effort driving saccade selection. Moreover, what can pupil size add to what can be learned from saccade latency?
In brief, although we report a combination of correlational and causal findings, we do not know of a more parsimonious explanation for our findings than “effort drives saccade selection”. Moreover, we demonstrate that oculomotor noise cannot be construed as an alternative explanation for our findings.
(1) Correlational nature of many findings.
We acknowledge that many of our findings are predominantly correlational in nature. In our first tasks, we correlated pupil size during saccade planning to saccade preferences in a subsequent task. Although the link across tasks was correlational, the observed relationship clearly followed our previously specified directed hypothesis. Moreover, experiments 1 and 2 of the visual search data replicated and extended this relationship. We also directly manipulated cognitive demand in the second visual search experiment. In line with the hypothesis that effort affects saccade selection, participants executed fewer saccades overall when performing a (primary) auditory dual task, and even cut the costly saccades most, which actually constitutes causal evidence for our hypothesis. A minimal oculomotor noise account would not directly predict a reduction in saccade rate under higher cognitive demand. To summarize, we have a combination of correlational and causal findings, although mediators cannot be ruled out fully for the latter. That said, we do not know of a more fitting and parsimonious explanation for our findings than effort predicting saccade selection (see following points for saccade latencies). We now address causality in the discussion for transparency and point more explicitly to the second visual search experiment for causal evidence.
“We report a combination of correlational and causal findings. Despite the correlational nature of some of our results, they consistently support the hypothesis that saccade costs predict saccade selection [which we predicted previously, 33]. Causal evidence was provided by the dual-task experiment, as saccade frequencies, and especially costly saccades, were reduced under additional cognitive demand. Only a cost account predicts 1) a link between pupil size and saccade preferences, 2) a cardinal saccade bias, 3) reduced saccade frequency under additional cognitive demand, and 4) disproportional cutting of especially those directions associated with more pupil dilation. Together, our findings converge upon the conclusion that effort drives saccade selection.”
(2) Do anisotropies in saccade latencies constitute an alternative explanation?
First of all, we would like to stress that differences in saccade latencies are indeed thought to reflect oculomotor effort (Shadmehr et al., 2019; TINS). For example, saccades with larger amplitudes and saccades where distractors need to be ignored are associated with longer latencies. Therefore, even if saccade latencies would predict saccade selection, this would not contradict the idea that effort drives saccade selection. Instead, this would provide convergent evidence for our main novel conclusion: effort drives saccade selection. There are several reasons why pupil size can be used as a more general marker of effort (see responses to R2), but ultimately, our conclusions do not hinge on the employed measure of effort per se. As stressed above in 1), we see no equally parsimonious explanation besides the cost account. Moreover, we predicted this relationship in our previous publication before running the currently reported experiments and analyses (Koevoet et al., 2023). That said, we are open to discussing further alternative options and look forward to testing these accounts against each other in future work; we welcome the reviewers’ (and the readers’) suggestions.
We now discuss this in the manuscript as follows:
“We here measured cost as the degree of effort-linked pupil dilation. In addition to pupil size, other markers may also indicate saccade costs. For example, saccade latency has been proposed to index oculomotor effort [100], whereby saccades with longer latencies are associated with more oculomotor effort. This makes saccade latency a possible complementary marker of saccade costs (also see Supplementary Materials). Although relatively sluggish, pupil size is a valuable measure of attentional costs for (at least) two reasons. First, pupil size is a highly established marker of effort, and is sensitive to effort more broadly than only in the context of saccades [36–45, 48]. Pupil size therefore allows to capture not only the costs of saccades, but also of covert attentional shifts [33], or shifts with other effectors such as head or arm movements [54, 101]. Second, as we have demonstrated, pupil size can measure saccade costs even when searching in natural scenes (Figure 4). During natural viewing, it is difficult to disentangle fixation duration from saccade latencies, complicating the use of saccade latency as a measure of saccade cost.
Together, pupil size, saccade latency, and potential other markers of saccade cost could fulfill complementary roles in studying the role of cost in saccade selection.”
Second, we followed the reviewer’s recommendation in testing whether other oculomotor metrics would predict saccade selection. To this end, we conducted a linear regression across directions. We calculated pupil size, saccade latency, landing precision and peak velocity maps from the saccade planning task. We then used AIC-based backward model selection to determine which factors would best predict saccade selection. The best model included pupil size, latency and landing precision as predictors (Wilkinson notation: saccade preferences ~ pupil size + saccade latency + landing precision). Pupil size (b = -42.853, t = 4.791, p < .001) and saccade latency (b = -.377, t = 2.106, p = .043; see Author response image 1) predicted saccade preferences significantly. In contrast, landing precision did not reach significance (b = 23.631, t = 1.675, p = .104). This analysis shows that although saccade latency also predicts saccade preferences, pupil size remains a robust predictor of saccade selection. These findings demonstrate that minimizing oculomotor noise cannot fully explain the pattern of results.
Author response image 1.
The relationship between saccade latency (from the saccade planning task) and saccade preferences averaged across participants. Individual points reflect directions and shading represents bootstrapped 95% confidence intervals.
We have added this argument into the manuscript, and discuss the analysis in the discussion. Details of the analysis have been added to the Supporting Information for transparency and further detail.
“A control analysis ruled out that the correlation between pupil size and saccade preferences was driven by other oculomotor metrics such as saccade latency and landing precision (see Supporting Information).”
“To ascertain whether pupil size or other oculomotor metrics predict saccade preferences, we conducted a multiple regression analysis. We calculated average pupil size, saccade latency, landing precision and peak velocity maps across all 36 directions. The model, determined using AIC-based backward selection, included pupil size, latency and landing precision as predictors (Wilkinson notation: saccade preferences ~ pupil size + saccade latency + landing precision). The analysis revealed that pupil size (β = -42.853, t = 4.791, p < .001) and saccade latency (β = -.377, t = 2.106, p = .043) predicted saccade preferences. Landing precision did not reach significance (β = 23.631, t = 1.675, p = .104). Together, this demonstrates that although other oculomotor metrics such as saccade latency contribute to saccade selection, pupil size remains a robust marker of saccade selection.”
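AIC-based backward selection, as referred to above, can be sketched generically as follows. This is an illustration under stated assumptions rather than the authors' code: the statistical software they used is not named here, the helper names and the toy data are hypothetical, and the AIC is the Gaussian form up to an additive constant, n·ln(RSS/n) + 2k.

```python
import math


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit (X includes an intercept column)."""
    n, p = len(X), len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))


def aic(X, y):
    """Gaussian AIC up to a constant; k counts coefficients plus the error variance."""
    n = len(y)
    return n * math.log(ols_rss(X, y) / n) + 2 * (len(X[0]) + 1)


def backward_select(predictors, y):
    """Drop one predictor at a time for as long as doing so lowers the AIC."""
    def design(names):
        return [[1.0] + [predictors[name][i] for name in names] for i in range(len(y))]
    kept = list(predictors)
    best = aic(design(kept), y)
    while kept:
        trials = [(aic(design([k for k in kept if k != d]), y), d) for d in kept]
        score, drop = min(trials)
        if score >= best:
            break
        best, kept = score, [k for k in kept if k != drop]
    return kept


# Hypothetical data: y depends on x1 only; x2 is uninformative by construction.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
y = [2.5, 3.5, 6.5, 7.5, 10.5, 11.5, 14.5, 15.5]
selected = backward_select({"x1": x1, "x2": x2}, y)
```

Note that AIC-based selection can retain a predictor whose individual coefficient is not significant, which is consistent with landing precision remaining in the reported model.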
In addition to eye-movement-related anisotropies across the visual field, there are of course many studies reporting visual field anisotropies (see Himmelberg, Winawer & Carrasco, 2023, Trends in Neuroscience for a review). It would be interesting to understand how the authors think about visual field anisotropies in the context of their own study. Do they think that their results are (in)dependent on such visual field variations (see Greenwood et al., 2017, PNAS; Ohl, Kroell, & Rolfs, 2024, JEP:Gen for a similar discussion)?
We agree that established visual field anisotropies are worth discussing in the context of our own results. At the reviewer’s suggestion, we have now expanded this discussion.
The observed anisotropies in terms of saccade costs are likely related to established anisotropies in perception and early visual cortex. However, the exact way that these anisotropies may be linked remains elusive (i.e. what is cause, what is effect, are links causal?), and more research is necessary to understand how these are related.
“The observed differences in saccade costs across directions could be linked to established anisotropies in perception [80–86], attention [87–92], saccade characteristics [87, 88, 92, 93], and (early) visual cortex [94–98] [also see 99]. For example, downward saccades are more costly than upward saccades, which mimics a similar asymmetry in early visual areas wherein the upper visual field is relatively underrepresented [94–98]; similarly stronger presaccadic benefits are found for down- compared with upward saccades [87, 88]. Moreover, upward saccades are more precise than downward saccades [93]. Future work should elucidate where saccade cost or the aforementioned anisotropies originate from and how they are related - something that pupil size alone cannot address.”
We also added that the finding that more precise saccades are coupled with worse performance in a crowding task might be attributed to the increased effort associated with more precise saccades (Greenwood et al., 2017).
“Adaptive resource allocation from, and to the oculomotor system parsimoniously explains a number of empirical observations. For example, higher cognitive demand is accompanied by smooth pursuits deviating more from to-be tracked targets [137], reduced (micro)saccade frequencies [Figure 4; 63, 64, 138, 139], and slower peak saccade velocities [140–142]. Relatedly, more precise saccades are accompanied with worse performance in a crowding task [93].”
Finally, the authors conclude that their results "suggests that the eye-movement system and other cognitive operations consume similar resources that are flexibly allocated among each other as cognitive demand changes." The authors should speculate what these similar resources could mean. What are the specific operations of the auditory task that overlap in terms of resources with the eye movement system?
We agree that the nature of joint resources is an interesting question. Our previous discussion was likely too simplistic here (see also responses to R3). We here specifically refer to the cognitive resources that one can flexibly distribute between tasks.
Our data do not directly speak to the question of what the shared resources between the auditory and oculomotor tasks are. Nevertheless, both tasks tax working memory, as saccade targets are mandatorily encoded into working memory prior to saccade onset (Van der Stigchel & Hollingworth, 2018), and the counting task clearly engages working memory. This may indicate some domain-generality between visual and auditory working memory during natural viewing (see Nozari & Martin, 2024 for a recent review), but this remains speculative. Another possibility is that it is not the working memory encoding associated with saccades per se, but rather the execution of overt motor actions itself, that requires cognitive processing, as suggested by Beatty (1982): “the organization of an overt motor act places additional demands on information-processing resources that are reflected in the task-evoked pupillary response”.
We have elaborated on this in more detail in the results and discussion sections.
“Besides the costs of increased neural activity when exerting more effort, effort should be considered costly for a second reason: Cognitive resources are limited. Therefore, any unnecessary resource expenditure reduces cognitive and behavioral flexibility [22, 31, 36, 116]. As a result, the brain needs to distribute resources between cognitive operations and the oculomotor system. We found evidence for the idea that such resource distribution is adaptive to the general level of cognitive demand and available resources: Increasing cognitive demand through an additional primary auditory dual task led to a lower saccade frequency, and especially costly saccades were cut. In this case, it is important to consider that the auditory task was the primary task, which should cause participants to distribute resources from the oculomotor system to the counting task. In other situations, more resources could be distributed to the oculomotor system instead, for example to discover new sources of reward [22, 136]. Adaptive resource allocation from, and to the oculomotor system parsimoniously explains a number of empirical observations. For example, higher cognitive demand is accompanied by smooth pursuits deviating more from to-be tracked targets [137], reduced (micro)saccade frequencies [Figure 4; 63, 64, 138, 139], and slower peak saccade velocities [140–142]. Relatedly, more precise saccades are accompanied with worse performance in a crowding task [93]. Furthermore, it has been proposed that saccade costs are weighed against other cognitive operations such as using working memory [33, 143–146]. How would the resources between the oculomotor system and cognitive tasks (like the auditory counting task) be related? One possibility is that both consume from limited working memory resources [147, 148]. Saccades are thought to encode target objects in a mandatory fashion into (visual) working memory [79], and the counting task requires participants to keep track of the auditory stream and maintain count of the instructed digit in working memory. However, the exact nature of which resources overlap between tasks remain open for future investigation [also see 149]. Together, we propose that cognitive resources are flexibly (dis)allocated to and from the oculomotor system based on the current demands to establish an optimal balance between performance and cost minimization.”
Reviewer #2 (Public Review):
The authors attempt to establish presaccadic pupil size as an index of 'saccade effort' and propose this index as one new predictor of saccade target selection. They only partially achieved their aim: When choosing between two saccade directions, the less costly direction, according to preceding pupil size, is preferred. However, the claim that with increased cognitive demand participants would especially cut costly directions is not supported by the data. I would have expected to see a negative correlation between saccade effort and saccade direction 'change' under increased load. Yet participants mostly cut upwards saccades, but not other directions that, according to pupil size, are equally or even more costly (e.g. oblique saccades).
Strengths:
The paper is well-written, easy to understand, and nicely illustrated.
The sample size seems appropriate, and the data were collected and analyzed using solid and validated methodology.
Overall, I find the topic of investigating factors that drive saccade choices highly interesting and relevant.
We thank the reviewer for pointing out the strengths of our paper.
Weaknesses:
The authors obtain pupil size and saccade preference measures in two separate tasks. Relating these two measures is problematic because the computations that underlie saccade preparation differ. In Experiment 1, the saccade is cued centrally, and has to be delayed until a "go-signal" is presented; In Experiment 2, an immediate saccade is executed to an exogenously cued peripheral target. The 'costs' in Experiment 1 (computing the saccade target location from a central cue; withholding the saccade) do not relate to Experiment 2. It is unfortunate that measuring presaccadic pupil size directly in the comparatively more 'natural' Experiment 2 (where saccades did not have to be artificially withheld) does not seem to be possible. This questions the practical application of pupil size as an index of saccade effort.
This is an important point raised by the reviewer and we agree that a discussion on these points improves the manuscript. We reply in two parts: 1) Although the underlying computations during saccade preparation might differ, and are therefore unlikely to be fully similar (we agree), we can still predict saccade selection between (Saccade planning to Saccade preference) and within tasks (Visual search). 2) Pupil size is a sluggish physiological signal, but this is outweighed by the advantages of using pupil size as a general marker of effort, also in the context of visual selection compared with saccade latencies.
(1) Are delayed saccades (cost task) and the much faster saccades (preference task) linked?
As the reviewer notes the underlying ‘type’ of oculomotor program may differ between voluntarily delayed-saccades and those in the saccade preference task. There are, however, also considerable overlaps between the oculomotor programs as the directions and amplitudes are identical. Moreover, the different types of saccades have considerable overlap in their underlying neural circuitry. Nevertheless, the underlying oculomotor programs likely still differ in some regard. Even despite these differences, we were able to measure differences across directions in both tasks, and costs and preferences were negatively and highly correlated between tasks. The finding itself therefore indicates that the costs of saccades measured during the saccade planning task generalize to those in the saccade preference task. Note also that we predicted this finding and idea already in a previous publication before starting the present study (Koevoet et al., 2023).
We now address this interesting point in the discussion as follows:
“We observed that affordable saccades were preferred over costly ones. This is especially remarkable given that the delayed saccades in the planning task likely differ in their oculomotor program from the immediate saccades in the preference task in some regard.”
(2) Is pupil size a sensible measure of saccade effort?
As the reviewer points out, the pupillary signal is indeed relatively sluggish, and therefore relatively slow and more artificial tasks are preferred to quantify saccade costs. This does not preclude pupil size from being applied in more natural settings, as we demonstrate in the search experiments – but a lot of care has to be taken to control for many possible confounding factors and many trials will be needed.
That said, as saccade latencies may also capture differences in oculomotor effort (Shadmehr et al., 2019) they are a possible alternative option to assess effort in some oculomotor tasks (see below on why saccade latencies do not provide evidence for an alternative to effort driving saccade selection, but converging evidence). Whilst we do maintain that pupil size is an established and versatile physiological marker of effort, saccade latencies provide converging evidence for our conclusion that effort drives saccade selection.
As for the saccade preference task, we are not able to analyze the data in a similar manner as in the visual search task for two reasons. First, the number of saccades is much lower than in the natural search experiments. Second, in the saccade preference task, there were always two possible saccade targets. Therefore, even if we were able to isolate an effort signal, this signal could index a multitude of factors such as deciding between two possible saccade targets. Even simple binary decisions go hand in hand with reliable pupil dilations as they require effort (e.g. de Gee et al., 2014).
There are three major reasons why pupil size is a more versatile marker of saccade costs than saccade latencies (although, as mentioned, latencies may constitute another valuable tool to study oculomotor effort). First, pupil size is able to quantify the cost of attentional shifts more generally, including covert attention as well as other effector systems such as head and hand movements. This circumvents the issue of different latencies of different effector systems and also makes it possible to study attentional processes that are not associated with overt motor movements. Second, saccade latencies are difficult to interpret in natural viewing data, as fixation duration and saccade latencies are inherently confounded by one another. This makes it very difficult to separate oculomotor processes and the extraction of perceptual information from a fixated target. Thus, pupil size is a versatile marker of attentional costs in a variety of settings, and can measure costs that saccade latencies cannot (i.e. covert attention). Lastly, pupil size is highly established as a marker of effort, which has been demonstrated across a wide range of cognitive tasks, and is therefore not bound to eye movements alone (Bumke, 1911; Koevoet et al., 2024; Laeng et al., 2012; Loewenfeld, 1958; Mathôt, 2018; Robison & Unsworth, 2019; Sirois & Brisson, 2014; Strauch et al., 2022; van der Wel & van Steenbergen, 2018).
We now discuss this as follows:
“We here measured cost as the degree of effort-linked pupil dilation. In addition to pupil size, other markers may also indicate saccade costs. For example, saccade latency has been proposed to index oculomotor effort [100], whereby saccades with longer latencies are associated with more oculomotor effort. This makes saccade latency a possible complementary marker of saccade costs (also see Supplementary Materials). Although relatively sluggish, pupil size is a valuable measure of attentional costs for (at least) two reasons. First, pupil size is highly established as a marker of effort, and is sensitive to effort more broadly than only in the context of saccades [36–45, 48]. Pupil size therefore allows to capture not only the costs of saccades, but also of covert attentional shifts [33], or shifts with other effectors such as head or arm movements [54, 101]. Second, as we have demonstrated, pupil size can measure saccade costs even when searching in natural scenes (Figure 4). During natural viewing, it is difficult to disentangle fixation duration from saccade latencies, complicating the use of saccade latency as a measure of saccade cost. Together, pupil size, saccade latency, and potential other markers of saccade cost could fulfill complementary roles in studying the role of cost in saccade selection.”
The authors claim that the observed direction-specific 'saccade costs' obtained in Experiment 1 "were not mediated by differences in saccade properties, such as duration, amplitude, peak velocity, and landing precision (Figure 1e,f)". Saccade latency, however, was not taken into account here but is discussed for Experiment 2.
The final model that was used to test for the observed anisotropies in pupil size across directions indeed did not include saccade latencies as a predictor. However, we did originally consider saccade latency as a potential predictor. Because we performed AIC-based backward model selection, this predictor was removed due to its marginal predictive contribution beyond the other predictors explaining pupil size.
For completeness, we here report the outcome of a linear mixed-effects model that does include saccade latency as a predictor. Here, saccade latencies did not predict pupil size (b = 1.859e-03, t = .138, p = .889). The asymmetry effects remained qualitatively unchanged: preparing oblique compared with cardinal saccades resulted in a larger pupil size (b = 7.635, t = 3.969, p < .001), and preparing downward compared with upward saccades also led to a larger pupil size (b = 3.344, t = 3.334, p = .003).
The apparent similarity of saccade latencies and pupil size, however, is striking. Previous work shows shorter latencies for cardinal than oblique saccades, and shorter latencies for horizontal and upward saccades than downward saccades - directly reflecting the pupil sizes obtained in Experiment 1 as well as in the authors' previous study (Koevoet et al., 2023, PsychScience).
As the reviewer notes, there are substantial asymmetries across the visual field in saccade latencies. These asymmetries in saccade latency could also predict saccade preferences. We will reply to this in three points: 1) even if saccade latency is a predictor of saccade preferences, this would not constitute an alternative explanation to the conclusion of effort driving saccade selection, 2) saccade latencies show an up-down asymmetry but oblique-cardinal effects in latency may not be generalizable across saccade tasks, 3) pupil size remains a robust predictor of saccade preferences even when saccade latencies are considered as a predictor of saccade preferences.
(1) We want to first stress that saccade latencies are thought to reflect oculomotor effort (Shadmehr et al., 2019). For example, saccades with larger amplitudes and saccades where distractors need to be ignored are associated with longer latencies. Therefore, even if saccade latencies predict saccade selection, this would not contradict the idea that effort drives saccade selection. Instead, this would provide convergent evidence for our main conclusion – effort predicting saccade selection (rather than pupil size predicting saccade selection per se).
“We here measured cost as the degree of effort-linked pupil dilation. In addition to pupil size, other markers may also indicate saccade costs. For example, saccade latency has been proposed to index oculomotor effort [100], whereby saccades with longer latencies are associated with more oculomotor effort. This makes saccade latency a possible complementary marker of saccade costs (also see Supplementary Materials). Although relatively sluggish, pupil size is a valuable measure of attentional costs for (at least) two reasons. First, pupil size is highly established as a marker of effort, and is sensitive to effort more broadly than only in the context of saccades [36–45, 48]. Pupil size therefore allows to capture not only the costs of saccades, but also of covert attentional shifts [33], or shifts with other effectors such as head or arm movements [54, 101]. Second, as we have demonstrated, pupil size can measure saccade costs even when searching in natural scenes (Figure 4). During natural viewing, it is difficult to disentangle fixation duration from saccade latencies, complicating the use of saccade latency as a measure of saccade cost. Together, pupil size, saccade latency, and potential other markers of saccade cost could fulfill complementary roles in studying the role of cost in saccade selection.”
(2) We first tested anisotropies in saccade latency in the saccade planning task (Wilkinson notation: latency ~ obliqueness + updownness + leftrightness + saccade duration + saccade amplitude + saccade velocity + landing error + (1 + obliqueness + updownness | participant)). We found upward latencies to be shorter than downward saccade latencies (b = -.535, t = 3.421, p = .003). In addition, oblique saccades showed shorter latencies than cardinal saccades (b = -1.083, t = 3.096, p = .002) – the opposite of what previous work has demonstrated.
We then also tested these latency anisotropies in another dataset wherein participants (n = 20) saccaded toward a single peripheral target as fast as possible (Koevoet et al., submitted; same amplitude and eccentricity as in the present manuscript). There we did not find a difference in saccade latency between cardinal and oblique targets, but we did observe shorter latencies for up- compared with downward saccades. We are therefore not sure in which situations oblique saccades do, or do not differ from cardinal saccades in terms of latency, and even in which direction the effect occurs.
In contrast, we have now demonstrated a larger pupil size prior to oblique compared with cardinal saccades in two experiments. This indicates that pupil size may be a more reliable and generalizable marker of saccade costs than saccade latency. However, this remains to be investigated further.
(3) To gain further insights into which oculomotor metrics would predict saccade selection, we conducted a linear regression across directions. We created pupil size, saccade latency, landing precision and peak velocity maps from the saccade planning task. We then used AIC-based model selection to identify the model that best predicted saccade selection. The selected model included pupil size, latency and landing precision as predictors (Wilkinson notation: saccade preferences ~ pupil size + saccade latency + landing precision). Pupil size (b = -42.853, t = 4.791, p < .001) and saccade latency (b = -.377, t = 2.106, p = .043) predicted saccade preferences significantly. In contrast, landing precision did not reach significance (b = 23.631, t = 1.675, p = .104). This analysis shows that although saccade latency predicts saccade preferences, pupil size remains a robust predictor of saccade selection.
“To ascertain whether pupil size or other oculomotor metrics predict saccade preferences, we conducted a multiple regression analysis. We calculated average pupil size, saccade latency, landing precision and peak velocity maps across all 36 directions. The model, determined using AIC-based backward selection, included pupil size, latency and landing precision as predictors (Wilkinson notation: saccade preferences ~ pupil size + saccade latency + landing precision). The analysis revealed that pupil size (β = -42.853, t = 4.791, p < .001) and saccade latency (β = -.377, t = 2.106, p = .043) predicted saccade preferences. Landing precision did not reach significance (β = 23.631, t = 1.675, p = .104). Together, this demonstrates that although other oculomotor metrics such as saccade latency contribute to saccade selection, pupil size remains a robust marker of saccade selection.”
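To make the selection procedure concrete, the following is a minimal sketch of AIC-based backward selection for an ordinary least-squares regression across 36 directions. It is not the authors' analysis code: the helper names (`ols_rss`, `aic`, `backward_select`), the Gaussian AIC formula up to an additive constant, and the simulated per-direction maps are assumptions made purely for illustration.

```python
import math
import random

def ols_rss(rows, y):
    """Residual sum of squares of an OLS fit; an intercept column is added."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    m = [[sum(d[i] * d[j] for d in design) for j in range(p)]
         + [sum(d[i] * yi for d, yi in zip(design, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(p):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    beta = [m[i][p] / m[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, yi in zip(design, y))

def aic(data, y, predictors):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    n = len(y)
    rows = [[data[name][i] for name in predictors] for i in range(n)]
    k = len(predictors) + 1  # slopes plus intercept
    return n * math.log(ols_rss(rows, y) / n) + 2 * k

def backward_select(data, y, predictors):
    """Drop one predictor at a time for as long as doing so lowers the AIC."""
    current = list(predictors)
    best = aic(data, y, current)
    while current:
        score, drop = min((aic(data, y, [q for q in current if q != d]), d)
                          for d in current)
        if score >= best:
            break
        best = score
        current.remove(drop)
    return current, best

# Simulated maps across 36 directions (hypothetical values, NOT the study's data).
rng = random.Random(0)
names = ["pupil_size", "latency", "landing_precision", "peak_velocity"]
data = {name: [rng.random() for _ in range(36)] for name in names}
preference = [-40 * ps - 5 * lat + rng.gauss(0, 1)
              for ps, lat in zip(data["pupil_size"], data["latency"])]

full_aic = aic(data, preference, names)
selected, selected_aic = backward_select(data, preference, names)
print("selected predictors:", selected)
```

Backward selection starts from the full model, so a predictor such as peak velocity is removed only if the model without it has a lower AIC; predictors retained this way are not necessarily individually significant, as with landing precision in the reported model.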
The authors state that "from a costs-perspective, it should be efficient to not only adjust the number of saccades (non-specific), but also by cutting especially expensive directions the most (specific)". However, saccade targets should be selected based on the maximum expected information gain. If cognitive load increases (due to an additional task) an effective strategy seems to be to perform less - but still meaningful - saccades. How would it help natural orienting to selectively cut saccades in certain (effortful) directions? Choosing saccade targets based on comfort, over information gain, would result in overall more saccades to be made - which is non-optimal, also from a cost perspective.
We thank the reviewer for this comment. Although we do not fully agree, the logic is quite close to our rationale and it is worth adding a point of discussion here. A vital part of the current interpretation is the instruction given to participants. In our second natural visual search task, participants were performing a dual task, where the auditory task was the primary task, whilst the search task was secondary. Therefore, participants are likely to adjust their resources to optimize performance on the primary task – at the expense of the secondary task. As a result, fewer resources are made available for and used in searching in the dual than in the single task, because these resources are needed for the auditory task. Cutting expensive directions does not help search in terms of search performance, but it does reduce the cost of search, so that more resources are available for the prioritized auditory task. Also note that the search task was rather difficult – participants did it, but it was tough (see the original description of the dataset for more details), which provides another reason to go all in on the auditory task at the expense of the visual task. This, however, opens up a nice point of discussion: If one were to emphasize the importance of search (maybe with punishment or reward), we would indeed expect participants to perform whichever eye movements get them to their goal fastest – thus reducing the relative influence of costs on saccade behavior. This remains to be tested, however - we are working on this and are looking forward to discussing such findings in the future.
Together, we propose that there is a trade-off between distributing resources either towards cognitive tasks or the oculomotor system (also see Ballard et al., 1995; Van der Stigchel, 2020). How these resources are distributed depends highly on the current task demands (also see Sahakian et al., 2023). This allows for adaptive behavior in a wide range of contexts.
We now added these considerations to the manuscript as follows (also see our previous replies):
“Do cognitive operations and eye movements consume from a similar pool of resources [44]? If so, increasing cognitive demand for non-oculomotor processes should result in decreasing available resources for the oculomotor system. In line with this idea, previous work indeed shows altered eye-movement behavior under effort as induced by dual tasks, for example by making less saccades under increased cognitive demand [62–64]. We therefore investigated whether less saccades were made as soon as participants had to count the occurrence of a specific digit in the auditory number stream in comparison to ignoring the stream (in Exp. 2; Figure 4a). Participants were instructed to prioritize the auditory digit-counting task over finding the visual search target. Therefore, resources should be shifted from the oculomotor system to the primary auditory counting task. The additional cognitive demand of the dual task indeed led to a decreased saccade frequency (t(24) = 7.224, p < .001, Cohen’s d = 1.445; Figure 4h).”
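The reported t statistic and effect size can be related with a short worked calculation. The sketch below assumes a paired-samples design in which Cohen's d is computed as d = t / √n with n = df + 1; this convention is an assumption (the text does not state which variant of d was used), although it is numerically consistent with the reported values. The function name is hypothetical.

```python
import math

def cohens_d_from_paired_t(t_value: float, df: int) -> float:
    """Cohen's d for paired samples, assuming d = t / sqrt(n) with n = df + 1."""
    n = df + 1  # number of participants in a paired t-test
    return t_value / math.sqrt(n)

# Values reported for the dual-task effect on saccade frequency: t(24) = 7.224.
d = cohens_d_from_paired_t(7.224, 24)
print(round(d, 3))
```

With 25 participants, √n = 5, so d = 7.224 / 5 = 1.4448, which rounds to the reported 1.445.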
I would have expected to see a negative correlation between saccade effort and saccade direction 'change' under increased load. Yet participants mostly cut upwards saccades, but not other directions that, according to pupil size, are equally or even more costly (e.g. oblique saccades).
The reviewer’s point is taken from the initial comment, which we will address here. First, we’d like to point out that it is not established that saccade costs in different directions are always the same. Instead, it is possible that saccade costs could be different in natural viewing compared with our delayed-saccade task. Therefore, we used pupil size during natural viewing for the search experiments. Second, the reviewer correctly notes that oblique saccades are hardly cut when under additional cognitive demand. However, participants already hardly execute oblique saccades when not confronted with the additional auditory task (Figure 4b, d), making it difficult to reduce those further (i.e. floor effect). Participants chose to cut vertical saccades, possibly because these are more costly than horizontal saccades.
We incorporated these points in our manuscript as follows:
“To test this, we analyzed data from two existing datasets [63] wherein participants (total n = 41) searched for small targets (’Z’ or ’H’) in natural scenes (Figure 4a; [64]). Again, we tested whether pupil size prior to saccades negatively linked with saccade preferences across directions. Because saccade costs and preferences across directions could differ for different situations (i.e. natural viewing vs. saccade preference task), but should always be negatively linked, we established both cost and preferences independently in each dataset.”
“We calculated a saccade-adjustment map (Figure 4g) by subtracting the saccade preference map in the single task (Figure 4f) from the dual task map (Figure 4d). Participants seemingly cut vertical saccades in particular, and made more saccades to the top right direction. This pattern may have emerged as vertical saccades are more costly than horizontal saccades (also see Figure 1d). Oblique saccades may not have been cut because there were very little oblique saccades in the single condition to begin with (Figure 4d), making it difficult to observe a further reduction of such saccades under additional cognitive demand (i.e. a floor effect).”
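The subtraction described in the quoted passage can be sketched as follows. This is an illustrative example only: the direction bins, the saccade counts, and the normalization of each preference map to proportions are assumptions, and the function names are hypothetical. The only step taken from the text is that the single-task map is subtracted from the dual-task map.

```python
def preference_map(counts):
    """Convert saccade counts per direction into proportions of all saccades."""
    total = sum(counts.values())
    return {direction: c / total for direction, c in counts.items()}

def adjustment_map(single_counts, dual_counts):
    """Dual-task preferences minus single-task preferences, per direction."""
    single = preference_map(single_counts)
    dual = preference_map(dual_counts)
    return {direction: dual[direction] - single[direction] for direction in single}

# Hypothetical saccade counts per direction in degrees (0 = right, 90 = up).
single_task = {0: 300, 45: 40, 90: 150, 135: 40, 180: 300, 225: 30, 270: 110, 315: 30}
dual_task = {0: 280, 45: 45, 90: 90, 135: 30, 180: 270, 225: 20, 270: 70, 315: 25}

adjustment = adjustment_map(single_task, dual_task)
for direction, change in sorted(adjustment.items()):
    print(f"{direction:3d} deg: {change:+.3f}")
```

Negative values mark directions that were cut under the additional auditory task. Because both maps are normalized here, the adjustments sum to zero, so a direction with few saccades in the single task has little room to decrease further, which is the floor effect mentioned for oblique saccades.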
Overall, I am not sure what practical relevance the relation between pupil size (measured in a separate experiment) and saccade decisions has for eye movement research/vision science. Pupil size does not seem to be a straightforward measure of saccade effort. Saccade latency, instead, can be easily extracted in any eye movement experiment (no need to conduct a separate, delayed saccade task to measure pupil dilation), and seems to be an equally good index.
There are two points here.
(1) What is the practical relevance of a link between effort and saccade selection for eye-movement research and vision science?
We see plenty – think of changing eye movement patterns under effort (be it smooth pursuits, saccade rates, distributions of gaze positions to images etc.) which have substantial implications for human factors research, but also neuropsychology. With a cost account, one may predict (rather than just observe) how eye movement changes as soon as resources are reduced/ non-visual demand increases. With a cost account, we can explain such effects (e.g. lower saccade rates under effort, cardinal bias, perhaps also central bias) parsimoniously that cannot be explained by what is so far referred to as the three core drivers of eye movement behavior (saliency, selection history, goals, e.g., Awh et al., 2012). Conversely, one must wonder why eye-movement research/vision science simply accepts/dismisses these phenomena as such, without seeking overarching explanations.
(2) What is the usefulness of using pupil size to measure effort?
We hope that our replies to the comments above illustrate why pupil size is a sensible, robust and versatile marker of attentional costs. We briefly summarize our most important points here.
- Pupil size is an established measure of effort irrespective of context, as demonstrated by hundreds of original works (e.g. working memory load, multiple object tracking, individual differences in cognitive ability). This allows pupil size to be a versatile marker of the effort, and therefore costs, of non-saccadic attentional shifts such as covert attention or those realized by other effector systems (i.e. head or hand movements).
- Our new analysis indicates that pupil size remains a strong and robust predictor of saccade preference, even when considering saccade latency.
- Pupil size allows saccade costs to be studied in natural viewing. In contrast, saccade latencies are difficult to assess in natural viewing as fixation durations and saccade latencies are intrinsically linked and very difficult to disentangle.
- Note, however, that we think it is interesting and useful to study effects of effort/cost on eye movement behavior. Whichever index is used to do so, we see plenty of potential in this line of research, and this paper is a starting point.
Reviewer #3 (Public Review):
This manuscript extends previous research by this group by relating variation in pupil size to the endpoints of saccades produced by human participants under various conditions including trial-based choices between pairs of spots and search for small items in natural scenes. Based on the premise that pupil size is a reliable proxy of "effort", the authors conclude that less costly saccade targets are preferred. Finding that this preference was influenced by the performance of a non-visual, attention-demanding task, the authors conclude that a common source of effort animates gaze behavior and other cognitive tasks.
Strengths:
Strengths of the manuscript include the novelty of the approach, the clarity of the findings, and the community interest in the problem.
We thank the reviewer for pointing out the strengths of our paper.
Weaknesses:
Enthusiasm for this manuscript is reduced by the following weaknesses:
(1) A relationship between pupil size and saccade production seems clear based on the authors' previous and current work. What is at issue is the interpretation. The authors test one, preferred hypothesis, and the narrative of the manuscript treats the hypothesis that pupil size is a proxy of effort as beyond dispute or question. The stated elements of their argument seem to go like this:
PROPOSITION 1: Pupil size varies systematically across task conditions, being larger when tasks are more demanding.
PROPOSITION 2: Pupil size is related to the locus coeruleus.
PROPOSITION 3: The locus coeruleus NE system modulates neural activity and interactions.
CONCLUSION: Therefore, pupil size indexes the resource demand or "effort" associated with task conditions.
How the conclusion follows from the propositions is not self-evident. Proposition 3, in particular, fails to establish the link that is supposed to lead to the conclusion.
We inadvertently laid out this rationale as described above, and we thank the reviewer for pointing out this initial suboptimal structure of argumentation. The notion that the link between pupil size and effort is established in the literature because of its neural underpinnings is inaccurate. Instead, the tight link between effort and pupil size is established based on covariations of pupil diameter and cognition across a wide variety of tasks and domains. In line with this, we now introduce this tight link predominantly based on the relationships between pupil size and cognition instead of focusing on putative neural correlates of this relationship.
As reviewed previously (Beatty, 1982; Bumke, 1911; Kahneman, 1973; Kahneman & Beatty, 1966; Koevoet et al., 2024; Laeng et al., 2012; Mathôt, 2018; Sirois & Brisson, 2014; Strauch et al., 2022; van der Wel & van Steenbergen, 2018), any increase in effort is consistently associated with an increase in pupil size. For instance, the pupil dilates when increasing load in working memory or multiple object tracking tasks, and such pupillary effects robustly explain individual differences in cognitive ability and fluctuations in performance across trials (Alnæs et al., 2014; Koevoet et al., 2024; Robison & Brewer, 2020; Robison & Unsworth, 2019; Unsworth & Miller, 2021). This extends to the planning of movements as pupil dilations are observed prior to the execution of (eye) movements (Koevoet et al., 2023; Richer & Beatty, 1985). The link between pupil size and effort has thus been firmly established for a long time, irrespective of the neural correlates of these effort-linked pupil size changes.
We again thank the reviewer for spotting this logical mistake, and have now revised the paragraph where we introduce pupil size as an established marker of effort as follows:
“We recently demonstrated that the effort of saccade planning can be measured with pupil size, which allows for a physiological quantification of saccade costs as long as low-level visual factors are controlled for [33]. Pupil size is an established marker of effort [36–44]. For instance, loading more in working memory or tracking more objects results in stronger pupil dilation [44–52]. Pupil size not only reflects cognitive (or mental) effort but also the effort of planning and executing movements [37, 53, 54]. We leveraged this to demonstrate that saccade costs can be captured with pupil size, and are higher for oblique compared with cardinal directions [33]. Here, we addressed whether saccade costs predict where to saccade.”
We now mention the neural correlates of pupil size only in the discussion, where we took care to also mention roles for other neurotransmitter systems:
“Throughout this paper, we have used cost in the limited context of saccades. However, cost-based decision-making may be a more general property of the brain [31, 36, 114–116]. Every action, be it physical or cognitive, is associated with an intrinsic cost, and pupil size is likely a general marker of this [44]. Note, however, that pupil dilation does not always reflect cost, as the pupil dilates in response to many sensory and cognitive factors which should be controlled for, or at least considered, when interpreting pupillometric data [e.g., see 39, 40, 42, 117]. Effort-linked pupil dilations are thought to be, at least in part, driven by activity in the brainstem locus coeruleus (LC) [40, 118–120] [but other neurotransmitters also affect pupil size, e.g. 121, 122]. Activity in LC with its widespread connections throughout the brain [120, 123–127] is considered to be crucial for the communication within and between neural populations and modulates global neural gain [128–132]. Neural firing is costly [22, 133], and therefore LC activity and pupil size are (neuro)physiologically plausible markers of cost [40]. Tentative evidence even suggests that continued exertion of effort (accompanied by altered pupil dilation) is linked to the accumulation of glutamate in the lateral prefrontal cortex [134], which may be a metabolic marker of cost [also see 116, 134, 135].”
(2) The authors test one, preferred hypothesis and do not consider plausible alternatives. Is "cost" the only conceivable hypothesis? The hypothesis is framed in very narrow terms. For example, the cholinergic and dopamine systems that have been featured in other researchers' consideration of pupil size modulation are missing here. Thus, because the authors do not rule out plausible alternative hypotheses, the logical structure of this manuscript can be criticized as committing the fallacy of affirming the consequent.
As we have noted in the response to the reviewer’s first point, we did not motivate our use of pupil size as an index of effort clearly enough. For the current purpose, the neural correlates of pupil size are less relevant than the cognitive correlates (see previous point). We reiterate that the neuromodulatory underpinnings of the observed pupil size effects (which indeed possibly include effects of the cholinergic, dopaminergic and serotonergic systems), while interesting for the discussion on the neural origin of effects, are not crucial to our conclusion. We hope the new rationale (without focusing too much on the (irrelevant) exact neural underpinnings) convinces the reviewer and reader.
Our changes to the manuscript are shown in our reply to the previous comment.
The reviewer notes that other plausible alternative hypotheses could explain the currently reported results. However, we did not find a more parsimonious explanation for our data than ‘Effort Drives Saccade Selection’. Effort explains why participants prefer saccading toward specific directions in (1) highly controlled and (2) more natural settings. Note that we also predicted this effect previously (Koevoet et al., 2023). Moreover, this account explains (3) why participants make fewer saccades under additional cognitive demand, and (4) why especially costly saccades are reduced under additional cognitive demand. We are very open to the reviewer presenting other possible interpretations of our data so that these can be discussed and put to the test in future work.
(3) The authors cite particular publications in support of the claim that saccade selection is influenced by an assessment of effort. Given the extensive work by others on this general topic, the skeptic could regard the theoretical perspective of this manuscript as too impoverished. Their work may be enhanced by consideration of other work on this general topic, e.g, (i) Shenhav A, Botvinick MM, Cohen JD. (2013) The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron. 2013 Jul 24;79(2):217-40. (ii) Müller T, Husain M, Apps MAJ. (2022) Preferences for seeking effort or reward information bias the willingness to work. Sci Rep. 2022 Nov 14;12(1):19486. (iii) Bustamante LA, Oshinowo T, Lee JR, Tong E, Burton AR, Shenhav A, Cohen JD, Daw ND. (2023) Effort Foraging Task reveals a positive correlation between individual differences in the cost of cognitive and physical effort in humans. Proc Natl Acad Sci U S A. 2023 Dec 12;120(50):e2221510120.
We thank the reviewer for pointing us toward this literature. These papers are indeed relevant for our manuscript, and we have now incorporated them. Specifically, we now discuss how the costs of effort are weighed in relation to possible rewards during decision-making. We have also incorporated work that has investigated how the biomechanical costs of arm movements contribute to action selection.
“Our findings are in line with established effort-based models that assume costs to be weighed against rewards during decision-making [102–107]. In such studies, reward and cognitive/physical effort are often parametrically manipulated to assess how much effort participants are willing to exert to acquire a given (monetary) reward [e.g. 108, 109]. Whereas this line of work manipulated the extrinsic costs and/or rewards of decision options (e.g. perceptual consequences of saccades [110, 111] or consequences associated with decision options), we here focus on the intrinsic costs of the movement itself (in terms of cognitive and physical effort). Relatedly, the intrinsic costs of arm movements are also considered during decision-making: biomechanically affordable movements are generally preferred over more costly ones [26–28]. We here extend these findings in two important ways. First, until now, the intrinsic costs of saccades and other movements have been inferred from gaze behavior itself or by using computational modelling [23, 25–28, 34, 35, 112]. In contrast, we directly measured cost physiologically using pupil size. Secondly, we show that physiologically measured saccade costs predict where saccades are directed in a controlled binary preference task, and even during natural viewing. Our findings could unite state-of-the-art computational models [e.g. 23, 25, 34, 35, 113] with physiological data, to directly test the role of saccade costs and ultimately further our understanding of saccade selection.”
(4) What is the source of cost in saccade production? What is the currency of that cost? The authors state (page 13), "... oblique saccades require more complex oculomotor programs than horizontal eye movements because more neuronal populations in the superior colliculus (SC) and frontal eye fields (FEF) [76-79], and more muscles are necessary to plan and execute the saccade [76, 80, 81]." This statement raises questions and concerns. First, the basis of the claim that more neurons in FEF and SC are needed for oblique versus cardinal saccades is not established in any of the publications cited. Second, the authors may be referring to the fact that oblique saccades require coordination between pontine and midbrain circuits. This must be clarified. Second, the cost is unlikely to originate in extraocular muscle fatigue because the muscle fibers are so different from skeletal muscles, being fundamentally less fatigable. Third, if net muscle contraction is the cost, then why are upward saccades, which require the eyelid, not more expensive than downward? Thus, just how some saccades are more effortful than others is not clear.
Unfortunately, our current data do not allow us to specify the source of cost differences in saccade production, nor the currency of that cost. We want to explicitly state that while pupil size is a sensitive measure of saccade costs, pupil size cannot directly inform what underlying mechanisms are causing differences in saccade costs across conditions (e.g. directions). Nevertheless, we do speculate about these issues because they are important to consider. We thank the reviewer for pointing out the shortcomings in our initial speculations.
Broadly, we agree with the reviewer that a neural source of differences in costs between different types of saccades is more likely than a purely muscular account (also see Koevoet et al., 2023). Furthermore, we think that the observed differences in saccade costs for oblique vs. cardinal and up vs. down could be due to different underlying mechanisms. While we caution against overinterpreting single directions, tentative evidence for this may also be drawn from the different time courses of the effects for up/down versus cardinal/oblique saccades (Figure 1c).
Below we speculate about why some specific saccade directions may be more costly than others:
Why would oblique saccades be more costly than cardinal saccades? We thank the reviewer for pointing out that oblique saccades additionally require coordination between pontine and midbrain circuits (Curthoys et al., 1984; King & Fuchs, 1979; Sparks, 2002). This point warranted a more thorough discussion than in our initial version. We have incorporated this as follows:
“The complexity of an oculomotor program is arguably shaped by its neural underpinnings. For example, oblique but not cardinal saccades require communication between pontine and midbrain circuits [73–75]. Such differences in neural complexity may underlie the additional costs of oblique compared with cardinal saccades. Besides saccade direction, other properties of the ensuing saccade such as its speed, distance, curvature, and accuracy may contribute to a saccade’s total cost [22, 33, 53, 76, 77] but this remains to be investigated directly.”
Why would downward saccades be more costly than upward saccades? As the reviewer points out: from a net muscular contraction account of cost, one would expect the opposite pattern due to the movement of the eyelid. Instead, we speculate that our findings may be associated with the well-established anisotropy in early visual cortex along the vertical meridian. Specifically, the upper vertical meridian is represented in substantially less detail than the lower vertical meridian (Himmelberg et al., 2023; Silva et al., 2018). Prior to a saccade, attention is deployed towards the intended saccadic endpoint (Deubel & Schneider, 1996; Kowler et al., 1995). Attention tunes neurons to preferentially process the attended location over non-attended locations. Because the lower visual field is represented in higher detail than the upper visual field, attention may tune neuronal responses differently when preparing up- compared with downward saccades (Hanning et al., 2024; Himmelberg et al., 2023). Thus, it may be more costly to prepare down- compared with upward saccades. This proposition, however, does not account for the lower costs associated with horizontal compared with up- and downward saccades, as the horizontal meridian is represented at a higher acuity than the vertical meridian. This makes it unlikely that this explains the pattern of results completely. Again, at this point we can only speculate why costs differ, yet we demonstrate that these differences in cost are decisive for oculomotor behavior. We now explicitly state the speculative nature of these ideas, which would all need to be tested directly.
We have updated our discussion of this issue as follows:
“The observed differences in saccade costs across directions could be linked to established anisotropies in perception [80–86], attention [87–92], saccade characteristics [87, 88, 92, 93], and (early) visual cortex [94–98] [also see 99]. For example, downward saccades are more costly than upward saccades, which mimics a similar asymmetry in early visual areas wherein the upper visual field is relatively underrepresented [94–98]; similarly stronger presaccadic benefits are found for down- compared with upward saccades [87, 88]. Moreover, upward saccades are more precise than downward saccades [93]. Future work should elucidate where saccade cost or the aforementioned anisotropies originate from and how they are related - something that pupil size alone cannot address.”
(5) The authors do not consider observations about variation in pupil size that seem to be incompatible with the preferred hypothesis. For example, at least two studies have described systematically larger pupil dilation associated with faster relative to accurate performance in manual and saccade tasks (e.g., Naber M, Murphy P. Pupillometric investigation into the speed-accuracy trade-off in a visuo-motor aiming task. Psychophysiology. 2020 Mar;57(3):e13499; Reppert TR, Heitz RP, Schall JD. Neural mechanisms for executive control of speed-accuracy trade-off. Cell Rep. 2023 Nov 28;42(11):113422). Is the fast relative to the accurate option necessarily more costly?
We thank the reviewer for this interesting point that we will answer in two ways. First, we discuss the main point: the link between pupil size, effort, and cost. Second, we discuss the findings described specifically in these two papers and how we interpret these from a pupillometric account.
First, one may generally ask 1) whether any effort results in pupil dilation, 2) whether any effort is costly, and 3) whether this means that pupil dilation always reflects effort and cost, respectively. Indeed, it has been argued repeatedly, prominently, and independently (e.g., Bumke, 1911; Mathôt, 2018) that any change in effort (no matter the specific origin) is associated with an evoked pupil dilation. Effort, in turn, is consistently and widely experienced as aversive, both across tasks and cultures (David et al., 2024). Effort minimization may therefore be seen as a universal law of human cognition and behavior, with effort as a to-be-minimized cost (Hull, 1943; Shadmehr et al., 2019; Tsai, 1932). However, this does not imply that any pupil dilation necessarily reflects effort or that, as a consequence thereof, any pupil dilation is always signaling cost. For instance, the pupil dark response, the pupil far response and changes in baseline pupil size are not associated with effort. Baseline and task-evoked pupil dilation responses have to be interpreted differently (see below). Moreover, the pupil also changes (and dilates) due to other factors (see Bumke, 1911; Loewenfeld, 1999; Mathôt, 2018; Strauch et al., 2022 for reviews).
Second, as for Naber & Murphy (2020) and Reppert et al. (2023) specifically: both Reppert et al. (2023) and Naber & Murphy (2020) indeed demonstrate a larger baseline pupil size when participants made faster, less accurate responses. However, baseline pupil size is not an index of effort per se, but task-evoked pupil dilation responses are (as studied in the present manuscript) (Strauch et al., 2022). For work on differences between baseline pupil diameter and task-evoked pupil responses, and their respective links with exploration and exploitation, please see Jepma & Nieuwenhuis (2011). Indeed, the link between effort and larger pupil size holds for task-evoked responses, but not baseline pupil size per se (also see Koevoet et al., 2023).
Still, Naber (third author of the current paper) & Murphy (2020) also demonstrated larger task-evoked pupil dilation responses when participants were instructed to make faster, less accurate responses compared with making accurate and relatively slow responses. However, this difference in task-evoked response gains significance only after the onset of the movement itself, and peaks substantially later than response offset. Whilst pupil dilation may be sluggish, it isn’t extremely sluggish either. As feedback to the performance of the participant was displayed 1.25s after performing the movement and clicking (taking about 630ms), we deem it possible that this effect may in part result from appraising the feedback to the participant rather than the speed of the response itself (in fact, Naber and Murphy also discuss this option). In addition to not measuring saccades but mouse movements, it is therefore possible that the observed evoked pupil effects in Naber & Murphy (2020) are not purely linked to motor preparation and execution per se. Therefore, future work that aims to investigate the costs of movements should isolate the effects of feedback and other potential factors that may drive changes in pupil size. This will help clarify whether fast or more accurate movements could be linked to the underlying costs of the movements.
Relatedly, we do not find evidence that pupil size during saccade planning predicts the onset latency of the ensuing saccade (please refer to our second response to Reviewer 2 for a detailed discussion).
Together, we therefore do not see the results from Reppert et al. (2023) and Naber & Murphy (2020) to be at odds with our interpretation of evoked pupil size reflecting effort and cost in the context of planning saccades.
We think that these are considerations important to the reader, which is why we now added them to the discussion as follows:
“Throughout this paper, we have used cost in the limited context of saccades. However, cost-based decision-making may be a more general property of the brain [31, 36, 114–116]. Every action, be it physical or cognitive, is associated with an intrinsic cost, and pupil size is likely a general marker of this [44]. Note, however, that pupil dilation does not always reflect cost, as the pupil dilates in response to many sensory and cognitive factors which should be controlled for, or at least considered, when interpreting pupillometric data [e.g., see 39, 40, 42, 117].”
(6) The authors draw conclusions based on trends across participants, but they should be more transparent about variation that contradicts these trends. In Figures 3 and 4 we see many participants producing behavior unlike most others. Who are they? Why do they look so different? Is it just noise, or do different participants adopt different policies?
We disagree with the transparency point of the reviewer. Note that we deviated from the norm here by being more transparent than common: we added individual data points and relationships rather than showing pooled effects across participants with error bars alone (see Figures 2c, 3b,c, 4c,e,f).
Moreover, our effects are consistent and stable across participants and are highly significant. To illustrate, for the classification analysis based on cost (Figure 2E) 16/20 participants showed an effect. As for the natural viewing experiments (total > 250,000 fixations), we also find that a majority of participants show the observed effects: Experiment 1: 15/16 participants; Experiment 2: 16/25 participants; Experiment 2 – adjustment: 22/25 participants.
We fully agree that it’s interesting to understand where interindividual variation may originate from. We currently have too little data to allow robust analyses across individuals and zooming in on individual differences in cost maps, preference maps, or potential personalized strategies of saccade selection. That said, future work could study this further. We would recommend reducing the number of directions in such work to obtain more pupil size data per direction, and therefore cleaner signals that may be more informative at the individual level. With such stronger signals, studying (differences in) links at an individual level may be feasible and would be interesting to consider, and this will be a future direction in our own work too. Nonetheless, we again stress that the reported effects are robust and consistent across participants, and that interindividual differences are therefore not extensive. Moreover, our results from four experiments consistently support our conclusion that effort drives saccade selection.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
- Based on the public review, I would recommend that the authors carefully review and correct the manuscript with regard to the causal conclusions. The study is largely correlational (i.e. the pupil was only observed, not manipulated) and therefore does not allow causal conclusions to be drawn about the relationship between pupil size and saccade selection. These causal conclusions become even more confusing when pupil size is equated with effort and saccade cost. As a consequence, an actual correlation between pupil size and saccade selection has led to the title that effort drives saccade selection. It would also be helpful for the reader to summarize in an additional section of the discussion what they consider to be a causal or correlational link based on their results.
We agree with the reviewer, and we now state explicitly and in detail which findings are correlational and which are causal. As outlined before, we do not see a more parsimonious explanation for our findings than the one expressed in our title, but we fully agree that the paper benefits from making the correlational/causal nature of the evidence for this idea explicit.
“We report a combination of correlational and causal findings. Despite the correlational nature of some of our results, they consistently support the hypothesis that saccade costs predict saccade selection [which we predicted previously, 33]. Causal evidence was provided by the dual-task experiment, as saccade frequencies, and especially costly saccades, were reduced under additional cognitive demand. Only a cost account predicts 1) a link between pupil size and saccade preferences, 2) a cardinal saccade bias, 3) reduced saccade frequency under additional cognitive demand, and 4) disproportional cutting of especially those directions associated with more pupil dilation. Together, our findings converge upon the conclusion that effort drives saccade selection.”
- Can the authors please elaborate in more detail on how they transformed the predictors of their linear mixed model for the visualization in Figure 1f? It is difficult to see how the coefficients in the table and the figure match.
We used the ‘effectsize’ package to compute effect sizes for each predictor of the linear mixed-effects model (https://cran.r-project.org/web/packages/effectsize/index.html). We report absolute effect sizes to make it visually easier to compare different predictors. These details have now been included in the Methods section to be more transparent about how these effect sizes were computed.
“Absolute effect sizes (i.e. r) and their corresponding 95% confidence intervals for the linear mixed-effects models were calculated using t and df values with the ’effectsize’ package (v0.8.8) in R.”
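As an illustration only, the conversion from a t statistic and its degrees of freedom to an absolute effect size r can be sketched in Python. This assumes the standard relation r = t / sqrt(t² + df); the function name `t_to_abs_r` and the example values are hypothetical and not taken from the paper's analysis code. The confidence intervals mentioned above are not reproduced here.

```python
import math


def t_to_abs_r(t: float, df: float) -> float:
    """Absolute effect size r from a t statistic and its degrees of freedom.

    Assumes the standard conversion r = t / sqrt(t^2 + df).
    """
    if df <= 0:
        raise ValueError("df must be positive")
    return abs(t) / math.sqrt(t * t + df)


# Hypothetical values for illustration only (not taken from the paper).
example_r = t_to_abs_r(t=-3.0, df=16.0)
```

Because the absolute value is taken, predictors with negative and positive slopes end up on the same scale, which is what makes them easy to compare visually.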
- Could the authors please explain in more detail why they think that a trial-by-trial analysis in the free choice task adds something new to their conclusions? In fact, a trial-by-trial analysis somehow suggests that the pupil size data would enter the analysis at a single trial level. If I understand correctly, the pupil size data come from their initial mapping task. So there is only one mean pupil size for a given participant and direction that goes into their analysis to predict free choice in a single trial. If this is the case, I don't see the point of doing this additional analysis given the results shown in Figure 2c.
The reviewer understands correctly that pupil size data is taken from the initial mapping task. We then used these mean values to predict which saccade target would be selected on a trial-by-trial basis. While showing the same conceptual result as the correlation analysis, we opted to include this analysis to show the robustness of the results across individuals. Therefore we have chosen to keep the analysis in the manuscript but now write more clearly that this shows the same conceptual finding as the correlation analysis.
“As another test of the robustness of the effect, we analyzed whether saccade costs predicted saccade selection on a trial-by-trial basis. To this end, we first determined the more affordable option for each trial using the established saccade cost map (Figure 1d). We predicted that participants would select the more affordable option. Complementing the above analyses, the more affordable option was chosen above chance level across participants (M = 56.64%, 95%-CI = [52.75%-60.52%], one-sample t-test against 50%: t(19) = 3.26, p = .004, Cohen’s d = .729; Figure 2e). Together, these analyses established that saccade costs robustly predict saccade preferences.”
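The logic of this trial-by-trial analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the cost map is assumed to be a dictionary from saccade direction to mean pupil size, ties are skipped, and the names `affordable_choice_rate` and `one_sample_t` are hypothetical.

```python
import math
from statistics import mean, stdev


def affordable_choice_rate(trials, cost_map):
    """Fraction of trials in which the chosen direction had the lower mapped cost.

    trials: iterable of (chosen_direction, other_direction) pairs.
    cost_map: dict mapping direction -> cost (e.g. mean pupil size).
    Trials with equal costs are skipped (an assumption of this sketch).
    """
    hits = 0
    counted = 0
    for chosen, other in trials:
        if cost_map[chosen] == cost_map[other]:
            continue
        counted += 1
        hits += cost_map[chosen] < cost_map[other]
    return hits / counted


def one_sample_t(values, mu):
    """Return (t, df, Cohen's d) for a one-sample t-test of values against mu."""
    n = len(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    d = (mean(values) - mu) / sd
    return d * math.sqrt(n), n - 1, d
```

For a one-sample test, Cohen's d equals t / sqrt(n), so the reported values are mutually consistent: 3.26 / sqrt(20) ≈ .729.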
Reviewer #2 (Recommendations For The Authors):
The authors report that "Whenever the difference in pupil size between the two options was larger, saccades curved away more from the non-selected option (β = .004, SE = .001, t = 4.448, p < .001; Figure 3b), and their latencies slowed (β = .050, SE = .013, t = 4.323, p < .001; Figure 3c)". I suspect this effect might not be driven by the difference but by a correlation between pupil size and latency.
The authors correlate differences in pupil size (Exp1) with saccade latencies (Exp2), I recommend correlating pupil size with the latency directly, in either task. This would show if it is actually the difference between choices or simply the pupil size of the respective individual option that is linked to latency/effort. Same for curvature.
The reviewer raises a good point. Please see the previous analyses concerning the possible correlations between pupil size and saccade latency, and how they jointly predict saccade selection.
Our data show that saccade curvature and latencies are linked with the difference in pupil size between the selected and non-selected options. Are these effects driven by a difference in pupil size or by the pupil size associated with the chosen option?
To assess this, we fitted two linear mixed-effects models. We predicted saccade curvature and latency using pupil size (from the planning task) of the selected and non-selected options while controlling for the chosen direction (Wilkinson notation: saccade curvature/latency ~ selected pupil size + non-selected pupil size + obliqueness + vertical + horizontal + (1 + selected pupil size + non-selected pupil size | participant)). We found that saccades curved away more from costlier non-selected targets (β = 1.534, t = 8.151, p < .001), and saccades curved away from the non-selected target less when the selected target was cheaper (β = -2.571, t = -6.602, p < .001). As the costs of the selected and non-selected options show opposite effects on saccade curvature, this indicates that the difference between the two options drives oculomotor conflict.
As for saccade latencies, we found saccade onsets to slow when the cost of the selected target was higher (β = .068, t = 2.844, p = .004). In contrast, saccade latencies were not significantly affected by the cost of the non-selected target (β = -.018, t = 1.457, p = .145), although numerically the effect was in the opposite direction. This shows that latencies were primarily driven by the cost of the selected target, but a difference account cannot be fully ruled out.
Together, these analyses demonstrate that the difference in costs between two alternatives reliably affects oculomotor conflict as indicated by the curvature analysis. However, saccade latencies are predominantly affected by the cost of the selected target, even when controlling for the obliqueness, updownness and leftrightness of the ensuing saccade. We have added these analyses here for completeness, but because the findings seem inconclusive for saccade latency, we have chosen not to include them in the current paper, for conciseness and to keep the paper focused. We are open to including these analyses in the supplementary materials if the reviewer and/or editor would like us to.
I was wondering why the authors haven't analyzed the pupil size in Experiment 2. If the pupil size can be assessed during a free viewing task (Experiment 3), shouldn't it be possible to also evaluate it in the saccade choice task?
We did not analyze the pupil size data from the saccade preference task for two reasons. First, the number of saccades is much lower than in the natural search experiments (~14,000 vs. ~250,000). Second, in the saccade preference task, there were always two possible saccade targets. Therefore, even if we were able to isolate an effort signal, this signal could index a multitude of factors such as deciding between two possible saccade targets (de Gee et al., 2014), and could reflect two oculomotor programs being realized instead of only a single one (Van der Stigchel, 2010).
Discussion: "due to stronger presaccadic benefits for upward compared with downward saccades [93,94]". I think this should be the other way around.
We thank the reviewer for pointing this out. We have corrected our mistake in the revised manuscript.
Saccade latencies differ around the visual field; to account for that, results / pupil size should be (additionally) evaluated relative to saccade onset (rather than cue offset). It is interesting that latencies were not accounted for here (Exp1), since they are considered for Exp2 (where they correlate with a pupil size difference). I suspect that latencies not only correlate with the difference in pupil size, but directly with pupil size itself.
We agree with the reviewer that locking the pupil size signal to saccade onset instead of cue offset may be informative. We included an analysis in the supporting information that investigates this (see Figure S1). The results of the analysis were conceptually identical.
The reviewer writes that latencies were not accounted for in Experiment 1. Although saccade latency was not included in the final model reported in the paper, it was considered during AIC-based backward model selection. As saccade latency did not predict meaningful variance in pupil size, it was ultimately not included in the analysis as a predictor. For completeness, we here report the outcome of a linear mixed-effects model that does include saccade latency as a predictor. Here, saccade latencies did not predict pupil size (β = 1.859e-03, t = .138, p = .889). The asymmetry effects remained qualitatively unchanged: preparing oblique compared with cardinal saccades resulted in a larger pupil size (β = 7.635, t = 3.969, p < .001), and preparing downward compared with upward saccades also led to a larger pupil size (β = 3.344, t = 3.334, p = .003).
In addition, we have included a new analysis in the supporting information that directly addresses this issue. We will reiterate the main results here:
“To ascertain whether pupil size or other oculomotor metrics predict saccade preferences, we conducted a multiple regression analysis. We calculated average pupil size, saccade latency, landing precision and peak velocity maps across all 36 directions. The model, determined using AIC-based backward selection, included pupil size, latency and landing precision as predictors (Wilkinson notation: saccade preferences ~ pupil size + saccade latency + landing precision). The analysis revealed that pupil size (β = -42.853, t = 4.791, p < .001) and saccade latency (β = -.377, t = 2.106, p = .043) predicted saccade preferences. Landing precision did not reach significance (β = 23.631, t = 1.675, p = .104). Together, this demonstrates that although other oculomotor metrics such as saccade latency contribute to saccade selection, pupil size remains a robust marker of saccade selection.”
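As an illustration of what AIC-based backward selection does, the following is a minimal sketch and not the analysis code used in the paper. It fits ordinary least-squares models to toy data over 36 directions and drops predictors as long as doing so lowers the AIC. The predictor values, the effect sizes, and the helper names (`ols_rss`, `aic`, `backward_select`) are all hypothetical; the "residual" is a deterministic stand-in for noise, chosen to be orthogonal to every predictor so the outcome is easy to reason about.

```python
import math

def ols_rss(columns, y):
    """Residual sum of squares of an OLS fit with intercept (normal equations)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix of the normal equations (X^T X) b = X^T y
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2
               for r in range(n))

def aic(columns, y):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    n, k = len(y), len(columns) + 1
    return n * math.log(ols_rss(columns, y) / n) + 2 * k

def backward_select(predictors, y):
    """Drop one predictor at a time while that strictly lowers the AIC."""
    selected = dict(predictors)
    best = aic(list(selected.values()), y)
    while selected:
        trials = {name: aic([v for m, v in selected.items() if m != name], y)
                  for name in selected}
        drop = min(trials, key=trials.get)
        if trials[drop] >= best:
            break
        best = trials[drop]
        del selected[drop]
    return sorted(selected), best

# Hypothetical toy maps over 36 saccade directions (not the study's data)
theta = [2 * math.pi * i / 36 for i in range(36)]
predictors = {
    "pupil": [math.cos(2 * t) for t in theta],
    "latency": [math.sin(t) for t in theta],
    "velocity": [math.cos(3 * t) for t in theta],  # unrelated to the outcome
}
residual = [0.1 * math.sin(5 * t + 1.0) for t in theta]
preference = [-2.0 * p - 0.5 * l + e for p, l, e in
              zip(predictors["pupil"], predictors["latency"], residual)]

selected, best_aic = backward_select(predictors, preference)
print(selected, round(best_aic, 2))
```

In this toy setup the uninformative `velocity` predictor is removed because dropping it leaves the residual error unchanged while saving one parameter, whereas removing either of the other two predictors would raise the AIC sharply.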
We have also added this point in our discussion:
“We here measured cost as the degree of effort-linked pupil dilation. In addition to pupil size, other markers may also indicate saccade costs. For example, saccade latency has been proposed to index oculomotor effort [100], whereby saccades with longer latencies are associated with more oculomotor effort. This makes saccade latency a possible complementary marker of saccade costs (also see Supplementary Materials). Although relatively sluggish, pupil size is a valuable measure of attentional costs for (at least) two reasons. First, pupil size is a highly established marker of effort, and is sensitive to effort more broadly than only in the context of saccades [36–45, 48]. Pupil size therefore makes it possible to capture not only the costs of saccades, but also of covert attentional shifts [33], or shifts with other effectors such as head or arm movements [54, 101]. Second, as we have demonstrated, pupil size can measure saccade costs even when searching in natural scenes (Figure 4). During natural viewing, it is difficult to disentangle fixation duration from saccade latencies, complicating the use of saccade latency as a measure of saccade cost. Together, pupil size, saccade latency, and potential other markers of saccade cost could fulfill complementary roles in studying the role of cost in saccade selection.”
References
Alnæs, D., Sneve, M. H., Espeseth, T., Endestad, T., van de Pavert, S. H. P., & Laeng, B. (2014). Pupil size signals mental effort deployed during multiple object tracking and predicts brain activity in the dorsal attention network and the locus coeruleus. Journal of Vision, 14(4), 1. https://doi.org/10.1167/14.4.1
Awh, E., Belopolsky, A. V., & Theeuwes, J. (2012). Top-down versus bottom-up attentional control: A failed theoretical dichotomy. Trends in Cognitive Sciences, 16(8), 437–443. https://doi.org/10.1016/j.tics.2012.06.010
Ballard, D. H., Hayhoe, M. M., & Pelz, J. B. (1995). Memory Representations in Natural Tasks. Journal of Cognitive Neuroscience, 7(1), 66–80. https://doi.org/10.1162/jocn.1995.7.1.66
Beatty, J. (1982). Task-evoked pupillary responses, processing load, and the structure of processing resources. Psychological Bulletin, 91(2), 276–292. https://doi.org/10.1037/0033-2909.91.2.276
Bumke, O. (1911). Die Pupillenstörungen bei Geistes-und Nervenkrankheiten (2nd ed.). Fischer.
Curthoys, I. S., Markham, C. H., & Furuya, N. (1984). Direct projection of pause neurons to nystagmus-related excitatory burst neurons in the cat pontine reticular formation. Experimental Neurology, 83(2), 414–422. https://doi.org/10.1016/S0014-4886(84)90109-2
David, L., Vassena, E., & Bijleveld, E. (2024). The unpleasantness of thinking: A meta-analytic review of the association between mental effort and negative affect. Psychological Bulletin, 150(9), 1070–1093. https://doi.org/10.1037/bul0000443
de Gee, J. W., Knapen, T., & Donner, T. H. (2014). Decision-related pupil dilation reflects upcoming choice and individual bias. Proceedings of the National Academy of Sciences, 111(5), E618–E625. https://doi.org/10.1073/pnas.1317557111
Deubel, H., & Schneider, W. X. (1996). Saccade target selection and object recognition: Evidence for a common attentional mechanism. Vision Research, 36(12), 1827–1837. https://doi.org/10.1016/0042-6989(95)00294-4
Greenwood, J. A., Szinte, M., Sayim, B., & Cavanagh, P. (2017). Variations in crowding, saccadic precision, and spatial localization reveal the shared topology of spatial vision. Proceedings of the National Academy of Sciences, 114(17), E3573–E3582. https://doi.org/10.1073/pnas.1615504114
Hanning, N. M., Himmelberg, M. M., & Carrasco, M. (2024). Presaccadic Attention Depends on Eye Movement Direction and Is Related to V1 Cortical Magnification. Journal of Neuroscience, 44(12). https://doi.org/10.1523/JNEUROSCI.1023-23.2023
Himmelberg, M. M., Winawer, J., & Carrasco, M. (2023). Polar angle asymmetries in visual perception and neural architecture. Trends in Neurosciences, 46(6), 445–458. https://doi.org/10.1016/j.tins.2023.03.006
Jepma, M., & Nieuwenhuis, S. (2011). Pupil Diameter Predicts Changes in the Exploration–Exploitation Trade-off: Evidence for the Adaptive Gain Theory. Journal of Cognitive Neuroscience, 23(7), 1587–1596. https://doi.org/10.1162/jocn.2010.21548
Kahneman, D. (1973). Attention and Effort. Prentice-Hall.
Kahneman, D., & Beatty, J. (1966). Pupil diameter and load on memory. Science (New York, N.Y.), 154(3756), 1583–1585. https://doi.org/10.1126/science.154.3756.1583
King, W. M., & Fuchs, A. F. (1979). Reticular control of vertical saccadic eye movements by mesencephalic burst neurons. Journal of Neurophysiology, 42(3), 861–876. https://doi.org/10.1152/jn.1979.42.3.861
Koevoet, D., Strauch, C., Naber, M., & Van der Stigchel, S. (2023). The Costs of Paying Overt and Covert Attention Assessed With Pupillometry. Psychological Science, 34(8), 887–898. https://doi.org/10.1177/09567976231179378
Koevoet, D., Strauch, C., Van der Stigchel, S., Mathôt, S., & Naber, M. (2024). Revealing visual working memory operations with pupillometry: Encoding, maintenance, and prioritization. WIREs Cognitive Science, e1668. https://doi.org/10.1002/wcs.1668
Kowler, E., Anderson, E., Dosher, B., & Blaser, E. (1995). The role of attention in the programming of saccades. Vision Research, 35(13), 1897–1916. https://doi.org/10.1016/0042-6989(94)00279-U
Laeng, B., Sirois, S., & Gredebäck, G. (2012). Pupillometry: A Window to the Preconscious? Perspectives on Psychological Science, 7(1), 18–27. https://doi.org/10.1177/1745691611427305
Loewenfeld, I. E. (1958). Mechanisms of reflex dilatation of the pupil. Documenta Ophthalmologica, 12(1), 185–448. https://doi.org/10.1007/BF00913471
Mathôt, S. (2018). Pupillometry: Psychology, Physiology, and Function. Journal of Cognition, 1(1), 16. https://doi.org/10.5334/joc.18
Naber, M., & Murphy, P. (2020). Pupillometric investigation into the speed-accuracy trade-off in a visuomotor aiming task. Psychophysiology, 57(3), e13499. https://doi.org/10.1111/psyp.13499
Nozari, N., & Martin, R. C. (2024). Is working memory domain-general or domain-specific? Trends in Cognitive Sciences, 0(0). https://doi.org/10.1016/j.tics.2024.06.006
Reppert, T. R., Heitz, R. P., & Schall, J. D. (2023). Neural mechanisms for executive control of speed-accuracy trade-off. Cell Reports, 42(11). https://doi.org/10.1016/j.celrep.2023.113422
Richer, F., & Beatty, J. (1985). Pupillary Dilations in Movement Preparation and Execution. Psychophysiology, 22(2), 204–207. https://doi.org/10.1111/j.1469-8986.1985.tb01587.x
Robison, M. K., & Brewer, G. A. (2020). Individual differences in working memory capacity and the regulation of arousal. Attention, Perception, & Psychophysics, 82(7), 3273–3290. https://doi.org/10.3758/s13414-020-02077-0
Robison, M. K., & Unsworth, N. (2019). Pupillometry tracks fluctuations in working memory performance. Attention, Perception, & Psychophysics, 81(2), 407–419. https://doi.org/10.3758/s13414-018-1618-4
Sahakian, A., Gayet, S., Paffen, C. L. E., & Van der Stigchel, S. (2023). Mountains of memory in a sea of uncertainty: Sampling the external world despite useful information in visual working memory. Cognition, 234, 105381. https://doi.org/10.1016/j.cognition.2023.105381
Shadmehr, R., Reppert, T. R., Summerside, E. M., Yoon, T., & Ahmed, A. A. (2019). Movement Vigor as a Reflection of Subjective Economic Utility. Trends in Neurosciences, 42(5), 323–336. https://doi.org/10.1016/j.tins.2019.02.003
Silva, M. F., Brascamp, J. W., Ferreira, S., Castelo-Branco, M., Dumoulin, S. O., & Harvey, B. M. (2018). Radial asymmetries in population receptive field size and cortical magnification factor in early visual cortex. NeuroImage, 167, 41–52. https://doi.org/10.1016/j.neuroimage.2017.11.021
Sirois, S., & Brisson, J. (2014). Pupillometry. WIREs Cognitive Science, 5(6), 679–692. https://doi.org/10.1002/wcs.1323
Sparks, D. L. (2002). The brainstem control of saccadic eye movements. Nature Reviews Neuroscience, 3(12), Article 12. https://doi.org/10.1038/nrn986
Strauch, C., Wang, C.-A., Einhäuser, W., Van der Stigchel, S., & Naber, M. (2022). Pupillometry as an integrated readout of distinct attentional networks. Trends in Neurosciences, 45(8), 635–647. https://doi.org/10.1016/j.tins.2022.05.003
Unsworth, N., & Miller, A. L. (2021). Individual Differences in the Intensity and Consistency of Attention. Current Directions in Psychological Science, 30(5), 391–400. https://doi.org/10.1177/09637214211030266
Van der Stigchel, S. (2010). Recent advances in the study of saccade trajectory deviations. Vision Research, 50(17), 1619–1627. https://doi.org/10.1016/j.visres.2010.05.028
Van der Stigchel, S. (2020). An embodied account of visual working memory. Visual Cognition, 28(5–8), 414–419. https://doi.org/10.1080/13506285.2020.1742827
Van der Stigchel, S., & Hollingworth, A. (2018). Visuospatial Working Memory as a Fundamental Component of the Eye Movement System. Current Directions in Psychological Science, 27(2), 136–143. https://doi.org/10.1177/0963721417741710
van der Wel, P., & van Steenbergen, H. (2018). Pupil dilation as an index of effort in cognitive control tasks: A review. Psychonomic Bulletin & Review, 25(6), 2005–2015. https://doi.org/10.3758/s13423-018-1432-y
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this valuable study, the authors found that the macrolide drug rapamycin, which is an important pharmacological tool in the clinic and the research lab, is less specific than previously thought. They provide solid functional evidence that rapamycin activates TRPM8 and develop an NMR method to measure the specific binding of a ligand to a membrane protein.
Strengths:
The authors use a variety of complementary experimental techniques in several different systems, and their results support the conclusions drawn.
Weaknesses:
Controls are not shown in all cases, and a lack of unity across the figures makes the flow of the paper disjointed. The proposed location of the rapamycin binding pocket within the membrane means that molecular docking approaches designed for soluble proteins alone do not provide solid evidence for a rapamycin binding pocket location in TRPM8, but the authors are appropriately careful in stating that the model is consistent with their functional experiments.
Impact:
This work provides still more evidence for the polymodality of TRP channels, reminding both TRP channel researchers and those who use rapamycin in other contexts that the adjective "specific" is only meaningful in the context of what else has been explicitly tested.
Reviewer #2 (Public Review):
Summary:
Tóth and Bazeli et al. find that rapamycin activates heterologously expressed TRPM8 and dissociated sensory neurons in a TRPM8-dependent way, as shown with Ca2+ imaging. With electrophysiology and STTD-NMR, they confirmed that the activation occurs through direct interaction with TRPM8. Using mutants and computational modeling, the authors localized the binding site to the groove between S4 and S5, distinct from the binding pocket of cooling agents such as menthol. The hydroxyl group on carbon 40 within the cyclohexane ring in rapamycin is indispensable for activation, while other rapalogs in which it is replaced, such as everolimus, still bind but cannot activate TRPM8. Overall, the findings provide new insights into TRPM8 functions and may indicate previously unknown physiological effects or therapeutic mechanisms of rapamycin.
Strengths:
The authors spent extensive effort on demonstrating that the interaction between TRPM8 and rapamycin is direct. The evidence is solid. In probing the binding site and the structure-function relationship, the authors combined computational simulation and functional experiments. It is very impressive to see that "within" a rapamycin molecule, the portion shared with everolimus is for "binding", while the hydroxyl group in the cyclohexane ring is for activation. Such detailed dissection represents a successful trial in the computational biology-facilitated, functional experiment-validated study of TRP channel structure-activity relationships. The research draws the attention of scientists, including those outside the TRP channel field, to previously neglected effects of rapamycin, and therefore the manuscript deserves broad readership.
Weaknesses:
The significance of the research could be improved by showing or discussing whether a similar binding pocket is present in other TRP channels, and hence rapalogs might bind to or activate these TRP channels. Additionally, while the finding on TRPM8 is novel, it is worthwhile to perform more comprehensive pharmacological characterization, including single-channel recording and a few more mutant studies to offer further insight into the mechanism of rapamycin binding to S4~S5 pocket driving channel opening. It is also necessary to know if rapalogs have independent or synergistic effects on top of other activators, including cooling agents and lower temperature, and their dependence on regulators such as PIP2.
Additional discussion that might be helpful:
The authors did confirm that rapamycin does not activate TRPV1, TRPA1 and TRPM3. But other TRP channels, particularly other structurally similar TRPM channels, should be discussed or tested. Alignment of the amino acid sequences or structures at the predicted binding pocket might predict some possible outcomes. In particular, rapamycin is known to activate TRPML1 in a PI(3,5)P2-dependent manner, which should be highlighted in comparison among TRP channels (PMID: 35131932, 31112550).
Reviewer #3 (Public Review):
Summary:
Rapamycin is a macrolide of immunologic therapeutic importance, proposed as a ligand of mTOR. It is also employed in assays to probe protein-protein interactions.
The authors serendipitously found that the drug rapamycin and some related compounds potently activate the cationic channel TRPM8, which is the main mediator of cold sensation in mammals. The authors show that rapamycin might bind to a novel binding site that is different from the binding site for menthol, the prototypical activator of TRPM8. These solid results are important to a wide audience since rapamycin is a widely used drug and is also employed in assays to probe protein-protein interactions, which could be affected by potential specific interactions of rapamycin with other membrane proteins, as illustrated herein.
Strengths:
The authors employ several experimental approaches to convincingly show that rapamycin activates directly the TRPM8 cation channel and not an accessory protein or the surrounding membrane. In general, the electrophysiological, mutational and fluorescence imaging experiments are adequately carried out and cautiously interpreted, presenting a clear picture of the direct interaction with TRPM8. In particular, the authors convincingly show that the interactions of rapamycin with TRPM8 are distinct from interactions of menthol with the same ion channel.
Weaknesses:
The main weakness of the manuscript is the NMR method employed to show that rapamycin binds to TRPM8. The authors developed and deployed a novel signal processing approach based on subtraction of several independent NMR spectra to show that rapamycin binds to the TRPM8 protein and not to the surrounding membrane or other proteins. While interesting and potentially useful, the method is not well developed (several positive controls are missing) and is not presented in a clear manner, such that the quality of data can be assessed and the reliability and pertinence of the subtraction procedure evaluated.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Major points
(1) Given the novelty of the STTD NMR approach, please provide more details and data for the reader.
• I would like to see all of the collected spectra so that readers can see and judge the effect sizes for themselves, perhaps as an additional supplementary figure.
We agree with the reviewer that the data transparency of the NMR measurements should be improved. We changed panel C of Figure 2 in the main text and provided all the STD and the computed STDD and STTD spectra recorded on one set of experiments. We carried out additional experimental replicas on new samples and addressed the variability of cell samples by rescaling the STD effects based on reference ¹H measurements. We provided supplementary spectra of the reference experiments without saturation (Figure S5) and the obtained STTD spectra from the three parallel NMR sessions (Figure S6).
• I appreciate the labels for STDD-1, STDD-2, and STTD on the lower two spectra of Figure 2C. Is the top spectrum from STD-1 or is it prior to saturation? In Figure 2C, what do the x1 and x2 notations on the right-hand side of the spectra indicate?
We showed the top spectrum as an overview and a demonstration of the spectral complexity of the samples. ¹H experiments were run before the STD measurements to assess the sample quality and stability. The demonstrated spectrum on sample 1 (TRPM8 with rapamycin in HEK cells) was recorded with more transients than the corresponding STDs, thus it is only visually comparable with the difference spectra after scaling (2x). Figure 2 was changed and all the spectra were replaced as mentioned before. All the recorded ¹H experiments without saturation, including the one removed, are now available in the supplementary information (Figure S5).
• The STTD NMR results with WT TRPM8 are consistent with rapamycin binding directly to the channel. Testing whether rapamycin binding observed with STTD NMR is disrupted by one of the most compelling mutations (D796A, D802A, G805A, or Q861A) would be a further test of this direct interaction.
We thank the reviewer for the suggestion and agree that testing the most compelling mutants would be a promising next step. These mutations were generated in plasmid vectors and only transiently transfected into HEK cells. For NMR analysis we would need a large number of cells stably overexpressing the mutant channels, which were not available for experimentation.
• Given that this is not a methods paper, it is probably outside the scope to further validate the STTD NMR measurements by performing parallel ITC, SPR, MST, or radiolabeled ligand experiments. Nevertheless, I would be excited to see such a comparison since STTD NMR appears to have promise as an experimental technique for assessing ligand binding to membrane proteins that does not require large amounts of purified protein or radioactive isotopes.
We agree with the reviewer that additional independent biophysical measurements on the interactions are necessary to further validate the STTD methodology. This paper is a preliminary demonstration of the STTD concept and our group is currently working on the challenges of on-cell NMR (e.g., sample and spectral complexity) and the standardization of the proposed workflow.
(2) Please clarify the methods used to model rapamycin binding. Docking can be imprecise in TRP channels, even with a sophisticated docking scheme (Hughes et al., 2019, doi: https://doi.org/10.7554/eLife.49572.001).
Thank you for mentioning this point and providing the reference. We have further clarified our methods and included the reference in our discussion, indicating the limitations of our approach.
• As a positive control, does the docking strategy accurately predict binding of known compounds (menthol, icilin, etc.) to TRPM8 consistent with cryo-EM structures?
Yes, the binding site for menthol, based on a similar docking strategy as for rapamycin, is also presented, and matches with predictions from other publications. This is now clarified in the revised manuscript.
• Why was homology modeling to the human sequence used with the mouse structure but not the avian structure?
At the onset of the project, only the avian structure was available, and it was used in the primary docking. Later, to obtain more precise docking relevant for human TRPM8 pharmacology, we switched to the then-available structure of the mouse ortholog.
• How many rapamycin structural clusters were built, and how many structures were there in each cluster? How many were used? "most populated" is unspecific.
Thank you for your comment. We have added the following highlighted information to the methods section to address your comment:
“Representative conformations of rapamycin were identified by clustering of the 1000-membered pools, having the macrocycle backbone atoms compared with 1.0 Å RMSD cut-off. Middle structures of the ten most populated clusters, accounting for more than 90% of the total conformational ensemble generated by simulated annealing, were used for further docking studies. To refine initial docking results and to identify plausible binding sites, the above selected rapamycin structures were docked again, following the same protocol as above, except for the grid spacing which was set to 0.375 Å in the second pass. The resultant rapamycin-TRPM8 complexes were, again, clustered and ranked according to the corresponding binding free energies. Selected binding poses were subjected to further refinement. The three most populated and plausible binding poses were further refined by a third pass of docking, where amino acid side chains of TRPM8, identified in the previous pass to be in close contact with rapamycin (< 4 Å), were kept flexible. Grid volumes were reduced to these putative binding sites including all flexible amino acid side chains (21.0-26.2 Å x 26.2-31.5 Å x 24.8-29.2 Å).”
However, it is important to clarify that the clusters are not built and their number is not specified by the user. The number of clusters found depends on how similar the structures are in the structural ensemble analyzed by clustering. A high number of clusters indicates a diverse, whereas a low number suggests a uniform structural ensemble. Furthermore, it is arbitrarily controlled by the similarity cutoff specified by the user. If the cutoff is selected well, then the number of structures is different in each cluster. There are some highly populated clusters and a few which only have one structure. The selection of how many cluster representatives are used is usually based on the decision of whether or not the sum of the population of selected clusters sufficiently covers the mapped conformational space.
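The dependence of the cluster count on the similarity cut-off, and the coverage-based choice of representatives, can be illustrated with a small sketch. This is not the clustering tool used in the study: it is a greedy leader-style clustering written for illustration, the `rmsd` helper skips the structural superposition a real workflow would perform, and the ten toy "conformations" (three coordinates each) are hypothetical.

```python
import math

def rmsd(a, b):
    """RMSD between two equally sized coordinate lists (no superposition)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

def cluster(structures, cutoff):
    """Greedy leader clustering: join the first cluster whose leader is
    within the cut-off, otherwise start a new cluster."""
    clusters = []  # lists of indices; the first index is the leader
    for i, s in enumerate(structures):
        for members in clusters:
            if rmsd(s, structures[members[0]]) <= cutoff:
                members.append(i)
                break
        else:
            clusters.append([i])
    return sorted(clusters, key=len, reverse=True)

def representatives(clusters, total, coverage=0.9):
    """Take the most populated clusters until they cover the given fraction."""
    chosen, covered = [], 0
    for members in clusters:
        if covered / total >= coverage:
            break
        chosen.append(members)
        covered += len(members)
    return chosen

# Hypothetical ensemble: two tight groups plus one outlier
ensemble = [
    [0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0],
    [0.0, 0.1, 0.0], [0.0, 0.2, 0.0], [0.1, 0.1, 0.0],
    [5.0, 5.0, 5.0], [5.1, 5.0, 5.0], [5.0, 5.2, 5.0],
    [10.0, 0.0, 0.0],
]

loose = cluster(ensemble, cutoff=1.0)    # few, well-populated clusters
tight = cluster(ensemble, cutoff=0.05)   # every structure is its own cluster
picked = representatives(loose, total=len(ensemble))
print(len(loose), len(tight), [len(c) for c in picked])
```

With the loose cut-off the toy ensemble falls into a few clusters of unequal size, and the most populated ones already cover most of the ensemble; with the tight cut-off the same ensemble fragments into singletons, which is the sense in which the cluster count is controlled by the cut-off rather than chosen directly.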
• Additionally, the rapamycin poses were generated using a continuum solvent model that is unlikely to replicate the conditions existing in the lipid bilayer or in a lipid-exposed binding pocket as is predicted here. It is therefore possible that the rapamycin poses chosen for docking do not represent the physiological rapamycin binding pose, hampering the ability of the docking algorithm to find an appropriate docking pocket.
• Furthermore, accurately docking ligands that may bind to membrane-exposed pockets is a challenging problem, particularly because many scoring algorithms, including those employed by Autodock, do not distinguish between solvent-exposed and membrane-exposed faces of the protein. This affects the predicted binding energies.
We appreciate the reviewer's insightful comments. We have added a note to the Discussion mentioning these important limitations.
• In Figure 4, it appears that the proposed rapamycin binding pocket is located at the interface between two subunits, but only one is shown. Is there any contact with residues in the neighboring subunit? Based on Figure S4, I assume not, but am unsure.
Based on the estimated distances, we do not think that there are any relevant interactions with residues from neighboring subunits. This is now indicated in the results section.
• Consider uploading the rapamycin-docked model to a public repository such as Zenodo for readers to examine and manipulate themselves
As suggested, the model will be uploaded in a public repository. A link to the file on Zenodo is now included.
(3) Please discuss the spatial location of the proposed rapamycin binding pocket relative to the vanilloid binding pocket in TRPV1.
• The mutagenesis indicates that D745, D802, G805, and Q861 are most important for rapamycin sensitivity in TRPM8. Interestingly, the proposed rapamycin binding pocket appears to overlap spatially with the vanilloid binding pocket in TRPV1. Consistent with this, Q861 aligns with E570 in TRPV1, which is a critical residue for resiniferatoxin sensitivity. Indeed, similar to Q861's modeled proximity to the cyclohexyl ring, the hydroxyl group of the vanillyl moiety of capsaicin (4DY in 7LR0, for example) is in proximity to E750 in TRPV1. Additionally, searching PubChem by structural similarity suggests that the vanillyl head groups of the TRP channel modulators capsaicin and eugenol are structurally similar to the trans-2-methoxycyclohexan-1-ol ring. Without overlaying the two structures myself, it is difficult to say more than that, but I encourage the authors to comment on any similarities and differences they observe.
• If the proposed rapamycin pocket is indeed similar to the location of the vanilloid binding site, the authors may wish to discuss other TRPM channel structures that show ligands and lipids bound to this pocket because this provides evidence that this pocket influences TRPM channel function. For example, how does the proposed rapamycin binding pocket compare to TRPM8 bound to agonist AITC (PDBID 8e4l), TRPM5 bound to inhibitor NDNA (7mbv), and TRPM2 bound to phosphatidylcholine (6co7)?
• Other TRP channel structures with ligands or lipids modeled in this region include TRPV1 bound to resiniferatoxin, capsaicin, or phosphatidylinositol (7l2j, 7l24, 7l2s, 7l2t, 7l2u, 7lp9, 7lpc, 7lqy, 7mz6, 7mz9, 7mza); TRPV3 bound to phosphatidylcholine (7mij, 7mik, 7mim, 7min, 7ugg); TRPV5 bound to econazole (6b5v) or ZINC9155 (6pbf); TRPV6 bound to piperazine (7d2k, 7k4b, 7k4c, 7k4d, 7k4e, 7k4f) or cholesterol hemisuccinate (7s8c); TRPC6 bound to BTDM (7dxf) or phosphatidylcholine (6uza); and TRP1 bound to PIP2 (6pw5).
We thank the reviewer for these valuable insights. We have included some additional discussion highlighting the similarities between the proposed rapamycin binding site and some of the other ligand-channel interactions in the TRP superfamily, in particular the well-known vanilloid binding site in TRPV1. However, to preserve the clarity and scope of the manuscript, we have not discussed all of the indicated interactions in full.
(4) I would like to see negative control calcium imaging and electrophysiology data with untransfected HEK cells to confirm that the observed activation is mediated by TRPM8 to parallel the TRPM8 KO sensory neuron experiments.
This important information is now included in the revised manuscript (Figure S2).
(5) The DM-nitrophen Ca uncaging experiments are an interesting method to test Ca sensitivity of rapamycin, but the results make these experiments more complex to interpret. Ca has been shown to be an obligate cofactor for icilin sensitivity in TRPM8 under conditions where both the internal and external Ca concentrations are tightly controlled (Kuhn et al., 2009, doi: https://doi.org/10.1074/jbc.M806651200), which is necessary because TRPM8 allows Ca permeation through the pore when open. The large icilin-evoked currents in Figure 5A and 5B indicate that the effective intracellular calcium concentration is not zero prior to calcium uncaging, which may be high enough to mask any Ca-dependence of rapamycin that occurs at low Ca concentrations. Given this ambiguity, the inside-out patch clamp configuration would provide more control over the internal and external Ca concentration than is achieved in the Ca uncaging experiments. Because the authors have already demonstrated their ability to perform such experiments (Figure 2 panel B), it would be nice to see tests of Ca dependence using inside-out patch clamp.
As was already shown in Figure 2, rapamycin activates TRPM8 in inside-out patches, and these experiments were performed using calcium-free cytosolic and extracellular solutions. Note that earlier studies have already shown that icilin activates outward TRPM8 currents in the full absence of calcium (see e.g. Janssens et al., eLife, 2016; Chuang et al., 2004). In the case of icilin, increased calcium further potentiates the current, which is more prominent for the inward current.
In the Ca uncaging experiments, considering the Kd of DM-nitrophen of 5 nM, we expect that the intracellular calcium concentration before the UV flash would be approximately 15 nM. Taken together, both the inside-out experiments and the flash uncaging experiments confirm that rapamycin responses are not directly regulated by intracellular calcium, contrary to icilin.
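The relation between the buffer Kd and the resting free calcium can be made explicit with a small worked calculation. This is a sketch under stated assumptions, not the authors' calculation: it assumes simple 1:1 binding with the buffer in large excess, so that free [Ca2+] = Kd × (bound buffer / free buffer). The 75% loading fraction is a hypothetical value, chosen only because it is the one that yields roughly 15 nM for a Kd of 5 nM; the actual loading is not stated in the response.

```python
def free_calcium_nM(kd_nM, loaded_fraction):
    """Free [Ca2+] for a 1:1 buffer in large excess: Kd * bound / free."""
    if not 0.0 <= loaded_fraction < 1.0:
        raise ValueError("loaded_fraction must be in [0, 1)")
    return kd_nM * loaded_fraction / (1.0 - loaded_fraction)

KD_DM_NITROPHEN_NM = 5.0  # Kd quoted in the response
ASSUMED_LOADING = 0.75    # hypothetical: not stated in the response

resting_ca = free_calcium_nM(KD_DM_NITROPHEN_NM, ASSUMED_LOADING)
print(resting_ca)
```

Under these assumptions the resting free calcium scales with the bound-to-free ratio of the cage, so a modest change in loading shifts the estimate considerably; the figure should be read as an order-of-magnitude estimate.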
(6) Sequence conservation within TRPM channels could be used in combination with the binding pocket model and mutagenesis to predict rapamycin selectivity for TRPM8 over other TRPMs. For example, some important residues, specifically G805 and Q861, are not conserved in TRPM3, which agrees with the lack of rapamycin sensitivity observed in TRPM3 (Figure S1). Further sequence comparison would provide testable hypotheses for future exploration of rapamycin sensitivity in other TRPMs that could validate the proposed binding pocket.
Thank you for the suggestion. We now indicate in the discussion that only some of the key residues are conserved and make suggestions for future studies.
(7) Please unify the color scheme across the figures to improve clarity.
• The authors frequently use the colors blue, red, and green to represent menthol and rapamycin in the figures, but they are inconsistent in which one represents menthol and which represents rapamycin. It would be clearer for the audience if, for example, rapamycin is always represented with red and menthol is always represented with blue.
Thank you for pointing this out. We have made the coloring schemes more uniform.
• In Figure 1, panel E, the coloring for Menthol and Pregnenolone Sulfate changes between the TRPM8+/+ and TRPM8-/- panels.
Thank you for pointing this out. We have updated the coloring schemes to ensure consistency between the TRPM8+/+ and TRPM8-/- panels.
• Figure 3 B and E, perhaps color the plot background as a 3-color gradient (blue to white to red) rather than yellow and aqua. Center the white at the WT ratio, keeping the dashed line, with diverging gradients to, for example, blue for mutations that selectively affect menthol sensitivity and red for rapamycin.
Thank you for the suggestion – we have changed the figure accordingly.
• Figure 4 panels A and B use the same color (green) to show two different things (menthol molecule and mutated residues that affect rapamycin sensitivity). It would be clearer for readers to change these colors to agree with a unified color scheme such that, for example, the menthol molecule is colored blue and the rapamycin-neighboring residues are colored red.
Thank you for the suggestion. We have updated the figure to use a unified color scheme, with the menthol molecule now colored green and the rapamycin-neighboring residues colored cyan, to enhance clarity for readers.
• I recommend adding a figure or panel that shows side chains for all mutations, colored by menthol/rapamycin selectivity, as indicated by the functional data in Figure 3B and 3E. This will highlight spatial patterns of the selective residues that are discussed in the text.
Thank you for your suggestion. We have added the side chains of all mutated residues in Figure S10.
Minor points
(1) It would be nice to have one more concentration data point in the middle of the dose response curve shown in Figure 1 panel B. The response is not saturating at the top or foot of the curve in Figure 1 panel D, precluding a confident fit to a two-state Boltzmann function.
Instead of adding a single data point to this figure, we performed independent measurements on a plate reader system, comparing concentration responses at room temperature and 37 degrees. These data are now included as Figure S1.
(2) The cartoon in Figure 2 panel B should be made more accurate. For example, only the transmembrane helices should be depicted embedded in the membrane, not the whole protein including the intracellular domain. Because the experiment was performed with cells, change the orientation of TRPM8 in the cartoon to show the intracellular domain of the protein facing away from the extracellular side of the membrane where the rapamycin is applied.
Thank you for this comment. We have corrected the cartoon accordingly.
(3) Perhaps put the yellow circles under or around the carbon atoms to which the identified hydrogen atoms belong in Figure 2 panel E and Figure 4 panel C. I found it difficult to visualize and compare the STTD NMR results with the predicted binding pocket.
Thank you for the feedback. We have added yellow circles around the carbon atoms corresponding to the identified hydrogen atoms in Figure S9.
(4) Regarding the sentence on p. 12 beginning "In agreement with this notion..."
• Include icilin, Cooling Agent-10, and WS-3 as other cooling agents whose sensitivity has been modulated by mutation of Y745
• Cryosim-3 responses were not tested in either of the two papers cited; please add citation to Yin et al., 2022, doi: https://doi.org/10.1126/science.add1268 .
• Other relevant papers include:
– Malkia et al., 2009, doi: https://doi.org/10.1186/1744-8069-5-62 which includes molecular docking showing the hydroxyl group of menthol interacting with Y745
– Beccari et al., 2017, doi: https://doi.org/10.1038/s41598-017-11194-0 Figure 5 shows disruption of icilin and Cooling Agent-10 sensitivity by Y745A
– Palchevskyi et al., 2023, doi: https://doi.org/10.1038/s42003-023-05425-6 Figure 3 shows disruption of icilin, cooling agent-10, WS-3, and menthol sensitivity by Y745A
– Plaza-Cayon et al., 2022, https://doi.org/10.1002%2Fmed.21920 Review of TRPM8 mutations
• typo: Y754H should be Y745H
Thank you for these suggestions. We have added the above references to the text and corrected the typo.
(5) The authors use the competitive action of everolimus on rapamycin activation as evidence that the different macrolides are binding to the same binding pocket. In addition, prior work showed that Y745H and N799A mutations (which render TRPM8 insensitive to menthol and icilin, respectively) do not affect TRPM8 sensitivity to the structurally-related compound tacrolimus (Arcas et al., 2019). This is consistent with the docking and mutagenesis results presented here.
Thank you for this valuable suggestion. We discuss these data in the revised version.
(6) Rapamycin sensitivity has also been observed in TRPML1 (Zhang et al. 2019, doi: https://doi.org/10.1371/journal.pbio.3000252).
We added a short reference to this interesting finding in the discussion.
(7) The whole-cell currents are very large in several of the electrophysiology experiments (for example Figure 3 panel D and Figure S1), which could lead to artifacts of voltage errors as well as ion accumulation/depletion. However, because this paper is not relying on reversal potential measurements or trying to quantify V1/2, these errors are unlikely to affect the qualitative conclusions drawn.
This is a fair point, but indeed unlikely to affect our main conclusions. Note that we compensated between 70 and 90% of the series resistance, so we don’t expect voltage errors exceeding ~10 mV.
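The size of the residual voltage error can be estimated as the product of the current and the uncompensated part of the series resistance. The following is a minimal sketch with hypothetical values for current amplitude and series resistance (neither is stated in the response); only the 70 to 90% compensation range comes from the text.

```python
def residual_voltage_error_mV(current_nA: float,
                              series_resistance_MOhm: float,
                              compensation: float) -> float:
    """Voltage error left after series-resistance compensation.

    nA * MOhm gives mV directly. `compensation` is a fraction in [0, 1].
    """
    if not 0.0 <= compensation <= 1.0:
        raise ValueError("compensation must be between 0 and 1")
    return current_nA * series_resistance_MOhm * (1.0 - compensation)


# Hypothetical recording: 10 nA whole-cell current, 3 MOhm series resistance.
CURRENT_NA = 10.0
RS_MOHM = 3.0

error_at_70 = residual_voltage_error_mV(CURRENT_NA, RS_MOHM, 0.70)
error_at_90 = residual_voltage_error_mV(CURRENT_NA, RS_MOHM, 0.90)
print(f"70% compensation: {error_at_70:.1f} mV; 90%: {error_at_90:.1f} mV")
```

With these assumed values the error stays below 10 mV across the stated compensation range; larger currents or access resistances would scale the error proportionally.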
(8) Ligand sensitivity is frequently species-dependent in TRP channels, so it is interesting that multiple species were used here and that both human and mouse isoforms exhibit rapamycin sensitivity. It should be emphasized that human TRPM8 was used in the calcium imaging and electrophysiology experiments, as well as some docking models, while the mouse isoform was used in the sensory neuron experiments and a mutated avian isoform was used for some docking models.
This information is available in the Methods and we believe it is clear for the readers.
(9) Perhaps discuss the unclear mechanism of G805A action in icilin (but not menthol, cold, or praziquantel) sensitivity because it is not in direct contact with the ligand. For example, Yin et al., 2019 propose flexibility allowing Ca binding site and larger binding site for icilin.
Yin et al. (2019) suggest that the G805A mutation impacts icilin sensitivity by influencing the flexibility of the binding site and possibly affecting calcium binding. In our study, we found that G805A significantly reduces rapamycin sensitivity, likely due to its direct role in the rapamycin binding pocket rather than affecting calcium binding. This is now briefly mentioned in the results section.
(10) The Figure S1 legend indicates that n=5 for all panels, so please show normalized population IV curves rather than individual examples. Additionally, it would be interesting to see what happens when each agonist is co-applied with rapamycin. Does rapamycin potentiate or inhibit agonist activation in these channels and/or TRPM8?
We believe that normalized population IVs are not ideal for representing whole-cell currents, considering the substantial variation in current densities. We therefore prefer to show example traces in Figure S3 of the revised version but include mean values of current densities for all tested cells in the text.
While the effects of co-application of rapamycin with activating ligands could be of interest, we consider this somewhat outside the scope of the present manuscript. The combination of HEK293 cell experiments, along with results obtained in WT and TRPM8-deficient mice does, in our opinion, sufficiently describe the selectivity of rapamycin towards TRPM8 compared to other sensory TRP channels.
(11) Figure S1 panel A does not contain units for Rapamycin or AITC concentrations.
Thank you for pointing this out. The units were added to the figure.
(12) It would be nice if the authors characterized the different mutations as predicted to contribute to site 1 (D796, H845, Q861, based on Figure S4), site 2 (D796, M801, F847, and R851), and/or site 3 (F847, V849, and R851).
The indicated mutants were all tested, as shown in Figure 3.
(13) The numbering scheme in Figure S4 does not appear to match the residue numbers in the rest of the paper for certain residues (HIS-844 rather than H845, PHE-846 rather than F847, VAL-848 rather than V849, ARG-850 rather than R851, and GLN-860 rather than Q861), and labels are often overlapping and difficult to see. I also find the transparent spheres very difficult to distinguish from the transparent background, which makes it difficult to appreciate the STTD NMR data overlay.
We apologize for the confusing numbering scheme. The lower numbers refer to the initial docking that was done using the avian TRPM8 ortholog. We have made a newer, clearer version of Figure S4 and inserted it as Figure S9.
(14) Please superpose the Ligplots in Figure S5 panels E and F as described in the LigPlus manual (https://www.ebi.ac.uk/thornton-srv/software/LigPlus/manual/manual.html) to facilitate easier comparison.
Thank you for the suggestion. We followed the suggestion to superpose the Ligplots as described but found that the result was visually cluttered and difficult to interpret. To avoid confusion, we instead decided to remove panels E and F from Figure S5, as we believe that the visualization in panels A-D is clear and informative.
(15) Some n values are missing in figure legends.
We checked all legends and added n numbers where they were missing.
(16) There is an inconsistent specification of error bars as SEM in the figure legends, though it is specified in methods.
A question for my own edification: Here, you have looked at ligand interactions with the protein by saturating the protein resonances and observing transfer to the ligand. Would it be possible to instead saturate lipid or solute resonances and observe transfer to a ligand? I am curious whether this would be one way to measure equilibrium partitioning of ligand into a membrane and/or determine the effective concentration of a ligand in the membrane. Additionally, could one determine whether the compound is fully partitioned into the center of the membrane or just sitting on the surface?
The reviewer highlights an interesting aspect. The widely used WaterLOGSY NMR experiment (doi: 10.1023/a:1013302231549) saturates water molecules, and the magnetization is then transferred to the ligand of interest. Characteristic changes in ligand resonances are observed in the case of a binding event with proteins. On the other hand, the selective saturation of lipids is, while theoretically possible, technically challenging, mainly because of the inherently low signal dispersion of lipids and peak overlap with ligand resonances. Additionally, lipid systems are more dynamic than proteins, and ligand-lipid interactions could be weaker and less specific, significantly affecting the sensitivity of STD experiments.
Reviewer #2 (Recommendations For The Authors):
Major:
• Is it feasible to test rapamycin on TRPM8 with single-channel recording? This will allow us to better probe the mechanism of rapamycin activation and compare it with menthol, with parameters of single-channel conductance and maximal open probability.
In our experience, it is very difficult to obtain single-channel recordings from TRPM8. The channel expresses at high densities, typically leading to patches that contain multiple channels, making a proper analysis of mean open and closed times very difficult. Therefore, we have decided not to include such measurements in the manuscript.
• The authors classified rapamycin as a type I agonist, the type that stabilizes the open conformation, same as menthol but more prominent. Does that indicate that rapamycin works synergistically (rather than independently) with menthol, because co-application of them can allow them to add to each other in stabilizing the open conformation? I wonder if the authors agree that this could be tested with experiments as in Figure S3, by showing a much more prolonged deactivation with co-application of menthol and rapamycin than applying each alone.
Thank you for the insightful suggestion. We conducted co-application experiments, and our results show that the deactivation time is indeed significantly prolonged when both compounds are applied together compared to each alone. In fact, very little deactivation is seen when both compounds are co-applied, which made it virtually impossible to perform reliable fits to the deactivation time course for the Menthol+Rapamycin condition. Instead, we have now included summary results showing the percentage of deactivation after 100 ms. We included these findings in Figure S8.
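A fit-free summary such as the percentage of deactivation after 100 ms can be computed directly from a tail-current trace. The sketch below is illustrative only: the function name, the synthetic traces, and the time constants (50 ms and 2000 ms) are hypothetical and are not taken from the manuscript.

```python
import math


def percent_deactivation(times_ms, currents, t_eval_ms=100.0):
    """Percentage of the initial tail current that has decayed at t_eval_ms.

    Uses the sample closest to t_eval_ms and assumes the trace starts at the
    beginning of the tail (times_ms[0] is the repolarization time).
    """
    initial = currents[0]
    idx = min(range(len(times_ms)), key=lambda k: abs(times_ms[k] - t_eval_ms))
    return 100.0 * (1.0 - currents[idx] / initial)


# Synthetic inward tail currents sampled every 1 ms (hypothetical values).
times_ms = [float(t) for t in range(0, 201)]
fast_tail = [-1.0 * math.exp(-t / 50.0) for t in times_ms]    # fast closure
slow_tail = [-1.0 * math.exp(-t / 2000.0) for t in times_ms]  # barely closes

fast_pct = percent_deactivation(times_ms, fast_tail)
slow_pct = percent_deactivation(times_ms, slow_tail)
print(f"fast: {fast_pct:.1f}% deactivated, slow: {slow_pct:.1f}% deactivated")
```

This measure remains well defined even when the decay is too slow or incomplete for a reliable exponential fit.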
• It could be tested whether rapamycin activation of TRPM8 requires or overrides the requirement of PIP2 with inside-out patch by briefly exposing the patch to poly-lysine to sequester PIP2.
This is certainly a good suggestion for further follow-up studies. However, we considered that examination of the (potential) interaction between ligands and PIP2 was outside the scope of the current manuscript.
• Figure 1C suggests that the authors test rapamycin when there is a relatively high baseline TRPM8 activation (prior to rapamycin application). This raises the possibility that rapamycin is more a potentiator than an activator. I wonder if the following two experiments could address it: (1) perfuse rapamycin while holding at different membrane potentials, wash off rapamycin in the solution and quickly (in a few seconds) test the activated current magnitude (before rapamycin dissociation), to compare whether a more depolarized membrane potential (high baseline open probability) allows rapamycin to potentiate more. (2) Perform the experiment at a higher temperature (low baseline open probability) and test whether rapamycin EC50 shifts to the right.
Thank you for the thoughtful suggestion. Overall, we are not really in favor of making a distinction between a potentiator and an activator, since it is not really feasible to create a situation where TRPM8 activity is zero. As suggested, we performed the dose-response experiment at a higher temperature (37 °C) and observed that rapamycin’s EC<sub>50</sub> shifts to the right (Figure S2). This is similar to what has been observed for menthol on TRPM8 and for many other ligands on other temperature-sensitive TRP channels.
Minor:
(1) The authors should report the Hill coefficient together with EC50 when showing dose-responses.
We have added Hill coefficients for all the fits.
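For reference, the Hill equation that relates EC50 and the Hill coefficient to the response can be written in a few lines. This is a generic sketch, not the fitting code used for the manuscript; the concentrations, EC50 values, and Hill coefficient below are hypothetical.

```python
def hill_response(conc: float, ec50: float, hill_n: float,
                  r_max: float = 1.0) -> float:
    """Hill equation: R = r_max * c^n / (EC50^n + c^n)."""
    if conc < 0 or ec50 <= 0:
        raise ValueError("conc must be >= 0 and ec50 must be > 0")
    return r_max * conc ** hill_n / (ec50 ** hill_n + conc ** hill_n)


# Hypothetical parameters: same Hill coefficient, EC50 shifted to the right.
HILL_N = 1.5
EC50_LOW_TEMP = 10.0   # arbitrary concentration units
EC50_HIGH_TEMP = 30.0  # rightward-shifted EC50

at_ec50 = hill_response(10.0, EC50_LOW_TEMP, HILL_N)
after_shift = hill_response(10.0, EC50_HIGH_TEMP, HILL_N)
print(f"response at EC50: {at_ec50:.2f}; same dose after shift: {after_shift:.2f}")
```

By construction the response is half-maximal at the EC50, so a rightward shift lowers the response to any fixed sub-saturating dose, while the Hill coefficient sets the steepness of the curve.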
(2) In Figure 1 (E, F), it might be clearer to use Venn-diagram to show whether there is overlapping among rapamycin-, menthol-, and cinnamaldehyde-responsive neurons. According to the authors' explanation, we can predict that rapamycin-insensitive, menthol-sensitive neurons should predominantly be cinnamaldehyde-responsive.
Thank you for your suggestion. In these experiments, we applied several agonists, and combining them would result in a visually crowded Venn diagram that is difficult to interpret. However, we agree with the reviewer’s suggestion and now discuss the percentage of cinnamaldehyde+ neurons in the rapa- menthol+ population in Trpm8<sup>-/-</sup> neurons.
(3) In Figure 3(C), since F847 does not respond to either menthol or rapamycin, it should be excluded from (B). Otherwise it is misleading.
Thank you for pointing this out. To clarify, we have included a calcium imaging trace for the F847 mutant, demonstrating a clear response to rapamycin, in Figure S9. These additional data highlight that F847 does respond to rapamycin, albeit with a more modest response amplitude. This is now also clarified in the results section.
(4) The word "potency" in pharmacology usually refers to a smaller EC50 number in dose-dependent experiments. In the "Effect of rapamycin analogs on TRPM8" section, the authors use "potency" to refer to the response to a single dose of different compounds. This experiment does not measure potency.
Thank you for pointing out this mistake. We have corrected the text and replaced “potency” with “efficacy”.
(5) "2-methoxyl-" is misspelled in the text body.
We have corrected the typo.
(6) It would be nice to include "vehicle" in Figure 6B, or alternatively normalize all individual traces to vehicle. In Figure 6C and D, everolimus has almost no effect compared with vehicle, and should not be shown as if it had ~8% in Figure 6B.
We have added the vehicle values to Figure 6B from the same experiments.
Reviewer #3 (Recommendations For The Authors):
(1) The NMR method presented here as novel and employed to identify a proposed molecule bound to a membrane protein (TRPM8 in this case) is not well explained and presented. Since several spectra need to be subtracted, the authors should present the raw data and the results of the subtractions step by step. Also, it seems that the height of the peaks in each spectrum will be highly variable, and thus a reliable criterion should be employed to scale spectra before subtraction. None of these problems are discussed or described.
The reviewer is right that data transparency should be improved and that, owing to the high molecular complexity of the samples, the size of the STD effects should be carefully scaled. We carried out additional experimental replicates on new samples and addressed the inherent sample/peak height variability by rescaling the STD effects based on reference <sup>1</sup>H measurements. We provided supplementary spectra of the reference experiments without saturation (Figure S5) and the computed STTD spectra from three parallel NMR sessions (Figure S6). We changed panel C of Figure 2 in the main text and provided all the STD and the computed STDD and STTD spectra recorded on one set of NMR experiments. We added the following paragraph to the main text: “To address the effect of the inherent variability of cellular samples on peak heights, STD effects were normalized based on the comparison of independent <sup>1</sup>H experiments (Figure S5). Three STTD replicates were computed, unambiguously confirming direct binding to TRPM8 in two datasets (Figure S6 A,B)”.
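The general idea of rescaling before subtraction can be illustrated with a short sketch. It assumes that each STD spectrum is accompanied by a single reference <sup>1</sup>H intensity and that one spectrum is scaled by the ratio of these intensities before the difference is taken. The function name, the scalar normalization, and all numbers are hypothetical; the actual processing workflow may differ.

```python
def rescale_and_subtract(std_a, std_b, ref_intensity_a, ref_intensity_b):
    """Return std_a - k * std_b, where k = ref_intensity_a / ref_intensity_b.

    std_a and std_b are intensity lists on the same chemical-shift axis.
    The scalar reference intensities are an assumed normalization criterion.
    """
    if len(std_a) != len(std_b):
        raise ValueError("spectra must share the same chemical-shift axis")
    scale = ref_intensity_a / ref_intensity_b
    return [a - scale * b for a, b in zip(std_a, std_b)]


# Hypothetical intensities at four chemical-shift points.
std_with_target = [0.0, 4.0, 10.0, 2.0]  # sample containing the target protein
std_control = [0.0, 2.0, 2.0, 1.0]       # control sample, half the signal
difference = rescale_and_subtract(std_with_target, std_control, 200.0, 100.0)
print(difference)
```

In this toy case the control is scaled up twofold before subtraction, so only the third point, where the target sample shows an excess STD effect, survives in the difference spectrum.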
Importantly since this signal subtraction method is proposed as a new development, control experiments employing well-established pairs of ligand and membrane protein receptor should be performed to demonstrate the reliability of the method.
We agree with the reviewer that the STTD experiment, as a new development, needs further validation. However, this paper is a preliminary demonstration of a new strategy building on the well-established STD and STDD NMR methodologies. Our group is actively engaged in studying additional biological samples to enhance our understanding of the applicability of STTD NMR. These efforts also aim to address challenges such as sample and spectral complexity by refining and standardizing the proposed workflow.
(2) The tail currents shown in supplementary figure 3 are clearly not monoexponential. The fit to a single exponential can be seen to be inadequate and thus the comparison of kinetics of control, rapamycin and menthol is incorrect. At least two exponentials should be fitted and their values compared.
We agree that the decay in the (combined) presence of agonists deviates from a simple monoexponential behavior. While we agree that fitting with two (or more) exponentials would provide a better fit, this also comes with greater variations/uncertainties in the fit parameters. This is particularly the case when inactivation is very slow and incomplete, or when the difference between slow and fast exponential time constants is <5, as seen with rapamycin and rapamycin +menthol. Therefore, we decided to provide monoexponential time constants as a proxy to describe the clear slowing down of activation and deactivation time courses in the presence of Type I agonists.
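The use of a single time constant as a proxy can be illustrated with a log-linear least-squares fit. The sketch below assumes currents of one sign that decay toward a zero baseline, and it is not the fitting routine used for the manuscript. The amplitudes and time constants are hypothetical; the biexponential example uses components whose time constants differ by a factor of 4.

```python
import math


def fit_monoexponential(times, currents):
    """Least-squares fit of ln|I| = ln(A) - t / tau. Returns (A, tau).

    Assumes all currents share one sign and decay toward zero.
    """
    logs = [math.log(abs(i)) for i in currents]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    amplitude = math.exp(mean_y - slope * mean_t)
    return amplitude, -1.0 / slope


times_ms = [float(t) for t in range(0, 101)]

# Pure monoexponential decay (tau = 20 ms): the fit recovers tau.
mono = [2.0 * math.exp(-t / 20.0) for t in times_ms]
_, tau_mono = fit_monoexponential(times_ms, mono)

# Biexponential decay (tau = 10 ms and 40 ms): the single tau is a proxy
# that falls between the two underlying time constants.
biexp = [math.exp(-t / 10.0) + math.exp(-t / 40.0) for t in times_ms]
_, tau_proxy = fit_monoexponential(times_ms, biexp)
print(f"mono tau: {tau_mono:.1f} ms, proxy tau: {tau_proxy:.1f} ms")
```

The proxy value depends on the fitting window, so it is meaningful mainly for comparing conditions recorded and fitted over the same interval.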
Also related to this aspect, recordings of TRPM8 currents cannot be leak subtracted with a p/n protocol, thus a large fraction of the initial tail current must be the capacitive transient. There is no indication in the methods of how this was dealt with for the fitting of tail currents.
As explained in the methods, capacitive transients and series resistance were maximally compensated. Therefore, we do not agree that a large fraction of the initial tail current must be capacitive. This can also be clearly seen in experiments such as Figure 1C, where the inward tail current is fully abolished in the presence of a TRPM8 antagonist. Likewise, very small and rapidly inactivating tail currents can be seen during voltage steps under control conditions (e.g. Figure S7 and S8 in the revised version).
(3) The docking procedure employed, as the authors show, is not appropriate for membrane proteins since it does not include a lipid membrane. It is not clear in the methods section if the MD minimization described applies only to the rapamycin molecule or to rapamycin bound to TRPM8.
It is also not clear if the important residue Q861 (and other residues that are identified as interacting with rapamycin) were identified from dockings or proposed based on other evidence.
(4) Identifying amino acid residues that diminish the response to a ligand, does not uniquely imply that they form a binding site or even interact with said ligand. It is entirely possible that they can be involved in the allosteric networks involved in the activating conformational change. This caveat should be clearly posited by the authors when discussing their results.
In our study, we identified several residues that significantly reduce the response to rapamycin when mutated, while retaining robust responses to menthol, which indicates that these mutations do not affect crucial conformational changes leading to channel gating. While our cumulative data suggest that these residues may be involved in direct interaction with rapamycin, we recognize the alternative possibility that they allosterically affect rapamycin-induced channel gating. This is now clearly stated in the first paragraph of the discussion.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer 1 (Public Review):
• While the title is fair with respect to the data shown, in the summary and the rest of the paper, the comparison between anesthetized and awake conditions is systematically stated, while more caution should be used.
First, isoflurane is one of the (many) anesthetics commonly used in pre-clinical research, and its effect on the brain vasculature cannot be generalized to all the anesthetics. Indeed, other anesthesia approaches do not produce evident vasodilation; see ketamine + medetomidine mixtures. Second, the imaged awake state is head-fixed and body-constrained in mice. A condition that can generate substantial stress in the animals. In this study, there is no evaluation of the stress level of the mice. In addition, the awake imaging sessions were performed a few minutes after the mouse woke up from isoflurane induction, which is necessary to inject the MB bolus. It is known that the vasodilator effects of isoflurane last a long time after its withdrawal. This aspect would have influenced the results, eventually underestimating the difference with respect to the awake state.
These limitations should be clearly described in the Discussion.
Looking at Figure 2e, it takes more than 5' to reach the 5 million MB count useful for good imaging. However, the MB count per pixel drops to a few % at that time. This information tells me that (i) repeated measurements are feasible but with limited brain coverage since a single 'wake up' is needed to acquire a single brain section and (ii) this approach cannot fit the requirements of functional ULM that requires merging the responses to multiple stimuli to get a complete functional image. Of course, a chronic i.v. catheter would fix the issue, but this configuration is not trivial to test in the experimental setup proposed by the authors, hindering the extension of the approach to fULM.
Thank you for highlighting these limitations, as they address aspects that were not fully considered during the experimental design and manuscript writing. In response, we have added the following paragraphs to the discussion section, addressing these limitations of our study:
(Line 310) “Although isoflurane is widely used in ultrasound imaging because it provides long-lasting and stable anesthetic effects, it is important to note that the vasodilation observed with isoflurane is not representative of all anesthetics. Some anesthesia protocols, such as ketamine combined with medetomidine, do not produce significant vasodilation and are therefore preferred in experiments where vascular stability is essential, such as functional ultrasound imaging(47). Therefore, in future studies, it would be valuable to design more rigorous control experiments with larger sample sizes to systematically compare the effects of isoflurane anesthesia, awake states, and other anesthetics that do not induce vasodilation on cerebral blood flow.
Our proposed method enabled repeatable longitudinal brain imaging over a three-week period, addressing a key limitation of conventional ULM imaging and offering potential for various preclinical applications. However, there are still some limitations in this study.
One of the limitations is the lack of objective measures to assess the effectiveness of head-fix habituation in reducing anxiety. This may introduce variability in stress levels among mice. Recent studies suggest that tracking physiological parameters such as heart rate, respiratory rate, and corticosterone levels during habituation can confirm that mice reach a low stress state prior to imaging(48). This approach would be highly beneficial for future awake imaging studies. Furthermore, alternative head-fixation setups, such as air-floated balls or treadmills, which allow the free movement of limbs, have been shown to reduce anxiety and facilitate natural behaviors during imaging(30). Adopting these approaches in future studies could enhance the reliability of awake imaging data by minimizing stress-related confounds.
Another limitation of this study is the potential residual vasodilatory effect of isoflurane anesthesia on awake imaging sessions. The awake imaging sessions were conducted shortly after the mice had emerged from isoflurane anesthesia, required for the MB bolus injections. The lasting vasodilatory effects of isoflurane may have influenced vascular responses, potentially contributing to an underestimation of differences in vascular dynamics between anesthetized and awake states. In future applications of awake ULM to functional imaging, an indwelling jugular vein catheter presents a promising alternative to enable more accurate functional imaging in awake animals, addressing current limitations associated with anesthesia-induced vascular effects.”
• Statistics are often poor or not properly described.
The legend and the text referring to Figure 2 do not report any indication of the number of animals analyzed. I assume it is only one, which makes the findings strongly dependent on the imaging quality of THAT mouse in THAT experiment. Three mice have been displayed in Figure 3, as reported in the text, but it is not clear whether it is a mouse for each shown brain section. Figure 5 reports quantitative data on blood vessels in awake VS isoflurane states but: no indication about the number of tested mice is provided, nor the number of measured blood vessels per type and if statistics have been done on mice or with a multivariate method.
Also, a T-test is inappropriate when the goal is to compare different brain regions and blood vessel types.
Similar issues partially apply to Figure 6, too.
Thank you for bringing this to our attention.
We acknowledge that the statistical analyses were not clearly explained in the original version. In the revised manuscript, we have ensured that the statistical methods are clearly described.
(Fig.4 caption) “b,c, Comparisons of vessel diameter (b) and flow velocity (c) for the selected arterial and venous segments. Statistical analysis was conducted using t-test at each measurement point along the segments.”
(Fig.6 caption) “b,c, Comparisons of vessel diameter (b) and flow velocity (c) for the selected arterial and venous segments. Statistical analysis was conducted using the two one-sided test (TOST) procedure, which evaluates the null hypothesis that the difference between the two weeks is larger than three times the standard deviation of one week.”
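The logic of the two one-sided tests (TOST) procedure can be sketched as follows. This is a simplified illustration that uses a normal (z) approximation from the Python standard library rather than the t-distribution that would normally be used for small samples, and all measurement values are hypothetical. Only the equivalence margin of three standard deviations of one week follows the caption.

```python
import math
from statistics import NormalDist, mean, stdev


def tost_p_value(x, y, margin):
    """TOST with a z approximation. Returns the larger one-sided p-value.

    H0: |mean(x) - mean(y)| >= margin. A small p-value supports equivalence.
    """
    diff = mean(x) - mean(y)
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    z = NormalDist()
    p_lower = 1.0 - z.cdf((diff + margin) / se)  # H0: diff <= -margin
    p_upper = z.cdf((diff - margin) / se)        # H0: diff >= +margin
    return max(p_lower, p_upper)


# Hypothetical vessel diameters (micrometers) measured in two weeks.
week1 = [20.1, 19.8, 20.4, 20.0, 19.9, 20.3]
week2 = [20.2, 19.9, 20.1, 20.3, 19.7, 20.2]
week2_shifted = [v + 2.0 for v in week2]  # clearly different measurements

margin = 3.0 * stdev(week1)  # three standard deviations of one week
p_similar = tost_p_value(week1, week2, margin)
p_shifted = tost_p_value(week1, week2_shifted, margin)
print(f"similar weeks: p = {p_similar:.2g}; shifted week: p = {p_shifted:.2g}")
```

Unlike a conventional t-test, a small p-value here is evidence that the two weeks agree within the margin, which suits a claim of longitudinal stability.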
Additionally, we corrected an error in the previous comparison of the violin plots on flow velocities, where a t-test was incorrectly applied; this has now been removed.
We acknowledge that the original version did not clearly indicate the numbers of animals in the statistical analysis. In the revised manuscript, we have added Supplementary Figure 1 to specify the mice used, and we have labeled each mouse accordingly in the figures or captions. In the revised Figures 4 and 6, we have ensured that each quantitative analysis figure or its caption clearly indicates the specific mice.
For original Figures 1 and 2, these are presented as case studies to illustrate the methodology. Since the anesthesia time required for tail vein injection varies slightly between animals, it is challenging to obtain a consistent recovery time across all mice. For instance, in Figure 1, the mouse took nearly 500 seconds to recover from anesthesia, but this duration is not consistent across all animals, which is a limitation of the bolus injection technique. We have noted this point in the discussion (discussion on the limitation of bolus injection), and we have also clarified in the results section and figure captions that these figures represent a case study of a single mouse rather than a standardized recovery time for all animals.
We further clarified this point at the end of the Figure 2 caption:
(Fig.2 caption) “This figure presents a case study based on the same mouse shown in Fig 1. The x-axis for d-f begins at 500 seconds because, at this point, the mouse’s pupil size stabilized, indicating it had recovered to an awake state. Consequently, ULM images were accumulated starting from this time. It is important to note that not every mouse requires 500 seconds to fully awaken; the time to reach a stable awake state varies across individual mice.”
We added the following statement before introducing Figure 1e:
(Line 93) “Due to differences in tail vein injection timing and anesthesia depth, the time required for each mouse to fully awaken varied. Although pupil size did not stabilize at exactly 500 seconds in every animal, ULM reconstruction used only the data acquired after the animal reached full pupillary dilation, to ensure that ULM accurately captures the cerebrovascular characteristics in the awake state.”
We added the following statement before introducing Figure 2d:
(Line 139) “To further verify that the proposed MB bolus injection method can help to achieve ULM image saturation shortly after mice awaken from anesthesia, an analysis on the change in MB concentration over time was conducted once pupil size had stabilized (T = 500s).”
For Figures 3, 4, and 5 (in the revised version, Figures 4 and 5 have been combined into a single Figure 4), the data represents results from three individual mice, with each coronal plane corresponding to a different mouse. In the revised version, we have added labels to indicate the specific mouse in each image to improve clarity. We also recognize that some analyses in the original submission (original Figure 5) may have lacked sufficient statistical power due to the small sample size. Therefore, in the revised version, we have focused only on findings that were consistently observed across the three mice to ensure robust conclusions.
Reviewer 1 (Recommendations For the Authors):
• If the study's main goal is to compare awake vs anesthetized ULM, the authors should test at least another anesthetic with no evident vasodilator effect.
Thank you for this valuable suggestion. We would like to clarify that the primary aim of our study is not to comprehensively compare the effects of anesthesia versus the awake state, as a rigorous comparison would indeed require a more controlled experimental design, including additional anesthetics, a larger cohort of mice, and broader controls to ensure sufficient statistical power. We have also added the following statement in the Discussion to clarify this point:
(Line 314) “Therefore, in future studies, it would be valuable to design more rigorous control experiments with larger sample sizes to systematically compare the effects of isoflurane anesthesia, awake states, and other anesthetics that do not induce vasodilation on cerebral blood flow.”
We acknowledge that the initial organization of Figures 3–5 placed excessive emphasis on comparisons between the awake and anesthetized states, but without yielding consistently significant findings. Meanwhile, our longitudinal observations in original Figure 6 were underrepresented, despite their potential importance.
In the revised version, we shifted our focus toward the main goal of awake longitudinal imaging. By consolidating the previous Figures 4 and 5 into the new Figure 4, we emphasize conclusions that are both more consistent and broadly applicable, avoiding areas that may lack sufficient rigor or consensus. Additionally, we expanded the quantitative analysis related to longitudinal imaging, highlighting its role as the ultimate objective of this study. The awake vs. anesthetized ULM comparison was intended to demonstrate the value of awake imaging and introduce the importance of awake longitudinal imaging. In the revised text, we have reframed this comparison to emphasize the specific response to isoflurane rather than a general response to anesthesia. For example, in Figures 3 and 4, we have replaced the original term "Anesthetized" with "Isoflurane". We have also added a discussion noting that isoflurane may induce more vasodilation than other anesthetic agents.
(Line 310) “Although isoflurane is widely used in ultrasound imaging because it provides long-lasting and stable anesthetic effects, it is important to note that the vasodilation observed with isoflurane is not representative of all anesthetics. Some anesthesia protocols, such as ketamine combined with medetomidine, do not produce significant vasodilation and are therefore preferred in experiments where vascular stability is essential, such as functional ultrasound imaging(47).”
• The claims made about the proposed experimental protocol to be suitable for the "long-term" (line 255) are not supported by the data and should be modified according to the presented evidence.
Thank you for your valuable feedback. We agree that our current three-week experimental results do not yet fulfill the requirements for extended longitudinal imaging that may span several months. We have revised the relevant text accordingly. For instance, the phrase “Our proposed method enabled long-term, repeatable longitudinal brain imaging” has been modified to “Our proposed method enabled repeatable longitudinal brain imaging over a three-week period.” (Similar changes were also made in Line 67, Line 318, and Line 337.) Additionally, we have added the following paragraph in the discussion section to indicate that extending the monitoring period to several months is a meaningful direction for future exploration:
(Line 337) “In our longitudinal study, consistent imaging results were obtained over a three-week period, demonstrating the feasibility of awake ULM imaging for this duration. However, for certain research applications, a monitoring period of several months would be valuable. Extending the duration of longitudinal awake ULM imaging to enable such long-term studies is a potential direction for future development.”
Recommendations for improving the writing and presentation:
• Reporting the number of mice and blood vessels and statistics for each quantitative figure.
Thank you for highlighting this issue. We acknowledge that the quantitative figures in the previous version lacked clarity in specifying the number of mice, vessels, and associated statistics. In the revised version, we have ensured that each quantitative figure or its caption clearly indicates the specific mice, vessels, and statistical methods used. To further minimize any potential confusion, we have also added Supplementary Figure 1 to clearly label and reference each individual mouse included in the study.
Minor corrections to the text and figures.
• Line 22: "vascularity reduction from anesthesia" is not clear, nor it is a codified property of brain vasculature. Explain or rephrase.
Thank you for your comment. We apologize for any confusion caused by the phrase “vascularity reduction from anesthesia” in the abstract. We agree that this phrasing was unclear without context. To improve clarity, we have revised this statement in the abstract to make it more straightforward and easier to understand.
(Line 24) “Vasodilation induced by isoflurane was observed by ULM. Upon recovery to the awake state, reductions in vessel density and flow velocity were observed across different brain regions.”
Additionally, we have added a section in the Methods titled Quantitative Analysis of ULM Images to provide a clear definition of vascularity. This section outlines how vascularity is quantified in our study, ensuring that our terminology is well-defined.
The following sentence shows the definition of vascularity:
(Line 547) “Vascularity was defined as the proportion of the pixel count occupied by blood vessels within each ROI, obtained by binarizing the ULM vessel density maps and calculating the percentage of the pixels with MB signal.”
We have also added a brief in-line definition where the term is first used in the Results section:
(Line 161) “When comparing vessel density maps, ULM images that are acquired in the awake state demonstrate a global reduction of vascularity, which refers to percentage of pixels that occupied by blood vessels.”
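As an illustration only, the vascularity definition quoted above (binarize the ULM vessel density map, then take the percentage of ROI pixels that carry MB signal) could be sketched as follows. The function name `vascularity`, the `threshold` parameter, and the toy arrays are hypothetical and are not taken from the manuscript; the binarization threshold actually used in the study is not stated in this response.

```python
import numpy as np

def vascularity(density_map, roi_mask, threshold=0.0):
    """Percentage of ROI pixels that contain microbubble (MB) signal.

    density_map: 2D array of MB counts per pixel (ULM vessel density map).
    roi_mask:    2D boolean array, True inside the region of interest.
    threshold:   assumed binarization level; pixels above it count as vessel.
    """
    density_map = np.asarray(density_map, dtype=float)
    roi_mask = np.asarray(roi_mask, dtype=bool)
    if not roi_mask.any():
        raise ValueError("ROI mask is empty")
    vessel = density_map > threshold  # binarize the density map
    return 100.0 * np.count_nonzero(vessel & roi_mask) / np.count_nonzero(roi_mask)

# Toy example: a 4x4 density map with an ROI covering the top three rows.
density = np.array([[0, 2, 0, 0],
                    [0, 3, 1, 0],
                    [0, 0, 4, 0],
                    [5, 0, 0, 0]])
roi = np.zeros((4, 4), dtype=bool)
roi[:3, :] = True
print(f"vascularity = {vascularity(density, roi):.1f}%")
```

In this toy case, 4 of the 12 ROI pixels carry signal, so the vascularity is one third of the ROI.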
• Line 76: putting the mice in a tube is also intended "To further reduce animal anxiety and minimize tissue motion" I agree with tissue motion, not with animal anxiety, which, indeed, I expect to be higher than if it could, for example, run on a ball or a treadmill.
Thank you for pointing this out. We acknowledge the limitations of our setup regarding reducing animal anxiety. We have replaced the original phrase “to further reduce animal anxiety and minimize tissue motion” with “to further minimize tissue motion.” (Line 78) Additionally, we have added the following paragraph in the Discussion section to address the limitations of our setup in reducing anxiety.
(Line 321) “One of the limitations is the lack of objective measures to assess the effectiveness of head-fix habituation in reducing anxiety. This may introduce variability in stress levels among mice. Recent studies suggest that tracking physiological parameters such as heart rate, respiratory rate, and corticosterone levels during habituation can confirm that mice reach a low stress state prior to imaging(48). This approach would be highly beneficial for future awake imaging studies. Furthermore, alternative head-fixation setups, such as air-floated balls or treadmills, which allow the free movement of limbs, have been shown to reduce anxiety and facilitate natural behaviors during imaging(30). Adopting these approaches in future studies could enhance the reliability of awake imaging data by minimizing stress-related confounds.”
• Line 79: PMP has been used by Sieu et al., Nat Methods, 2015; it should be acknowledged.
Thank you for highlighting this. We have now included the reference to Sieu et al. Nat Methods, 2015 to appropriately acknowledge their use of PMP. (Line 81)
• Figure: is there a reason why the plots start at 500 sec? What happened before that time?
Thank you for your question regarding the starting time in the plots. Figures 1 and 2 are case studies using a single mouse to demonstrate the feasibility of our method. The “zero” timepoint was defined as the moment when anesthesia was stopped, and the microbubble injection began. However, the mouse does not fully recover immediately after anesthesia is stopped. As shown in Figure 1e, there is a period of approximately 500 seconds during which the pupil gradually dilates, indicating recovery. Only after this period does the mouse reach a relatively stable physiological state suitable for ULM imaging, which is why the plots in Figure 2 begin at T = 500 seconds.
We recognize that this was not sufficiently explained in the main text and figure captions. In the revised manuscript, we have clarified this timing rationale in both the Results section and the figure captions. We added the following sentence to the Results section to introduce Fig.2d:
(Line 139) “To further verify that the proposed MB bolus injection method can help to achieve ULM image saturation shortly after mice awaken from anesthesia, an analysis on the change in MB concentration over time was conducted once pupil size had stabilized (T = 500s).”
We also added the following statement to note that this recovery time varies across individual mice:
(Line 154, Fig.2 caption) “This figure presents a case study based on the same mouse shown in Fig 1. The x-axis for d-f begins at 500 seconds because, at this point, the mouse’s pupil size stabilized, indicating it had recovered to an awake state. Consequently, ULM images were accumulated starting from this time. It is important to note that not every mouse requires 500 seconds to fully awaken; the time to reach a stable awake state varies across individual mice.”
Reviewer 2 (Public Review):
• The only major comment (calling for further work) I would like to make is the relative weakness of the manuscript regarding longitudinal imaging (mostly Figure 6), compared to the exhaustive review of the effect of isoflurane on the vasculature (3 rats, 3 imaging planes, quantification on a large number of vessels, in 9 different brain regions). The 6 cortical vessels evaluated in Figure 6 feel really disappointing. As longitudinal imaging is supposed to be the salient element of this manuscript (first word appearing in the title), it should be as good and trustworthy as the first part of the paper. Figure 6c. is of major importance, and should be supported by a more extensive vessel analysis, including various brain areas, and validated on several animals to validate the robustness of longitudinal positioning with several instances of the surgical procedure. Figure 6d estimates the reliability of flow measurements on 3 vessels only. Therefore I recommend showing something similar to what is done in Figures 4 and 5: 3 animals, and more extensive quantification in different brain regions.
We thank the reviewer for pointing out this issue. We acknowledge that the first version of the manuscript lacked in-depth quantitative analysis in the section on the longitudinal study, which should have been a focal point. It also did not provide a sufficient number of animals to demonstrate the reproducibility of the technique. In this revised version, we have included results from more animals and conducted a more comprehensive quantitative analysis, with the corresponding text updated accordingly. Specifically, we combined the previous Figures 4 and 5 into the current Figure 4 (corresponding revised text from Line 169 to Line 207). The revised Figures 5 and 6 compare the results of the longitudinal study, presenting data from three mice (corresponding revised text from Line 224 to Line 258). Detailed information about the mice used has been added to Supplementary Figure 1, and Supplementary Figure 4 further provides a detailed display of the results for the three mice in the longitudinal study. We hope that these adjustments will provide a more thorough validation of the longitudinal imaging.
Reviewer 2 (Recommendations For The Authors):
Minor comments:
• The statistical analyses are not always explained: could they be stated briefly in the legends of each figure, or gathered in a statistical methods section with details for each figure? Be sure to use the appropriate test (e.g. student t-test is used in Fig 5 k whereas normality of distribution is not guaranteed.)
Thank you for pointing this out. We acknowledge that the statistical analyses were not clearly explained in the original version. In the revised manuscript, we have ensured that the statistical methods are clearly described.
(Fig.4 caption) “b,c, Comparisons of vessel diameter (b) and flow velocity (c) for the selected arterial and venous segments. Statistical analysis was conducted using t-test at each measurement point along the segments.”
(Fig.6 caption) “b,c, Comparisons of vessel diameter (b) and flow velocity (c) for the selected arterial and venous segments. Statistical analysis was conducted using the two one-sided test (TOST) procedure, which evaluates the null hypothesis that the difference between the two weeks is larger than three times the standard deviation of one week.”
Additionally, we corrected an error in the previous comparison of the violin plots on flow velocities, where a t-test was incorrectly applied; this has now been removed.
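For readers unfamiliar with the TOST procedure named in the Fig. 6 caption, the following is a minimal sketch of an equivalence test with a margin of three times the standard deviation of one week. It assumes two independent samples and Welch-type one-sided t-tests; whether the study used a paired or unpaired variant is not stated here, and the function name `tost_equivalence` and the sample values are hypothetical.

```python
import numpy as np
from scipy import stats

def tost_equivalence(week_a, week_b, margin):
    """Two one-sided Welch t-tests for equivalence within +/- margin.

    Null hypothesis: |mean(week_a) - mean(week_b)| >= margin.
    Returns the TOST p-value (the larger of the two one-sided p-values);
    a small value supports equivalence within the margin.
    """
    a = np.asarray(week_a, dtype=float)
    b = np.asarray(week_b, dtype=float)
    diff = a.mean() - b.mean()
    va, vb = a.var(ddof=1) / a.size, b.var(ddof=1) / b.size
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    p_lower = stats.t.sf((diff + margin) / se, df)   # H0: diff <= -margin
    p_upper = stats.t.cdf((diff - margin) / se, df)  # H0: diff >= +margin
    return max(p_lower, p_upper)

# Hypothetical vessel diameters (um) measured along one segment in two weeks.
week1 = [20.1, 19.8, 20.4, 20.0, 19.9, 20.2]
week2 = [20.0, 20.3, 19.7, 20.1, 20.2, 19.9]
margin = 3 * np.std(week1, ddof=1)  # three times the SD of one week
p_value = tost_equivalence(week1, week2, margin)
print(f"TOST p-value: {p_value:.4g}")
```

Note that the logic is reversed relative to an ordinary t-test: rejecting the null hypothesis here is evidence that the two weeks agree to within the chosen margin.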
• The authors use early in the manuscript the term vascularity, e.g. in "vascularity reduction", it is not exactly clear what they mean by vascularity, and would require a proper definition at that moment. If I am correct, a quantification of that "vascularity reduction" (page 5 line 132), is then done in Figures 5 d e f and j.
Thank you for highlighting this issue. We acknowledge that our initial use of the term “vascularity” may have been unclear and potentially confusing. In the revised manuscript, we have included a clear definition of “vascularity” in the Methods section under Quantitative Analysis of ULM Images (Line 534).
The following sentence shows the definition of vascularity:
(Line 547) “Vascularity was defined as the proportion of the pixel count occupied by blood vessels within each ROI, obtained by binarizing the ULM vessel density maps and calculating the percentage of the pixels with MB signal.”
We have also added a brief in-line definition where the term is first used in the Results section:
(Line 161) “When comparing vessel density maps, ULM images that are acquired in the awake state demonstrate a global reduction of vascularity, which refers to percentage of pixels that occupied by blood vessels.”
• There is very little motion in the images presented, except for the awake "Bregma -4.2 mm" (Figure 3, directional maps), especially in the area including colliculi and mesencephalon, while the cortical vessels do not move. Can you comment on that?
Thank you for highlighting this important aspect of motion in awake animal imaging. Motion correction is indeed a critical factor in such studies. In the original version of our discussion, we briefly addressed this issue (from Line 342 to Line 346), but we agree that a more detailed discussion is needed.
To minimize motion artifacts, we conducted habituation to acclimate the animals to the head-fixation setup, which helps reduce anxiety during imaging. With thorough head-fixed habituation, the imaging quality is generally well-preserved. We also applied correlation-based motion correction techniques based on ULM images, which can partially correct for overall brain motion, as stated in the previous version. However, this ULM-images-based correction is limited to addressing only rigid motion.
In the revised discussion, we have expanded on the limitations of our current motion correction approach and referenced recent work about more advanced motion correction methods:
(Line 346) “While rigid motion correction is often effective in anesthetized animals, awake animal imaging presents greater challenges due to the more prominent non-rigid motion, particularly in deeper brain regions. This is evidenced in Supplementary Fig. 1 (Mouse 7), where cortical vessels remain relatively stable, but regions around the colliculi and mesencephalon exhibit more noticeable motion artifacts, indicating that displacement is more pronounced in deeper areas. To address these deeper, non-rigid motions, recent studies suggest estimating nonrigid transformations from unfiltered tissue signals before applying corrections to ULM vascular images(16,50). Such advanced motion correction strategies may be more effective for awake ULM imaging, which experiences higher motion variability. The development of more robust and effective motion correction techniques will be crucial to reduce motion artifacts in future awake ULM applications.”
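To make the limitation concrete, a correlation-based rigid correction of the kind described above can be sketched as an FFT cross-correlation that estimates a single integer translation per image. This is a generic illustration, not the authors' implementation; the function name `estimate_rigid_shift` and the synthetic test image are hypothetical, and circular (wrap-around) shifts are assumed for simplicity.

```python
import numpy as np

def estimate_rigid_shift(reference, moving):
    """Integer (row, col) shift that aligns `moving` to `reference`.

    Uses the peak of the circular cross-correlation computed via FFT.
    A single translation is estimated for the whole image, so local
    (non-rigid) displacements cannot be captured.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    xcorr = np.fft.ifft2(f_ref * np.conj(f_mov)).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Map peak indices to signed shifts (wrap large indices to negatives).
    return tuple(int(p - n) if p > n // 2 else int(p)
                 for p, n in zip(peak, xcorr.shape))

# Synthetic check: displace an image by a known amount and recover it.
rng = np.random.default_rng(0)
reference = rng.random((32, 32))
moving = np.roll(reference, shift=(3, -5), axis=(0, 1))
shift = estimate_rigid_shift(reference, moving)
corrected = np.roll(moving, shift=shift, axis=(0, 1))
print("estimated correction:", shift)
```

Because one translation is applied to every pixel, a correction of this type can leave residual artifacts when deeper regions move differently from the cortex, which is the situation the revised discussion describes.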
• Figure 1f maybe flip the color bar to have an upward up and downward down.
Thank you for your suggestion. This display method indeed makes the images more intuitive. In the revised manuscript, all directional flow color bars have been flipped to ensure that upward flow is displayed as ‘up’ and downward flow as ‘down.’
• Figure 2b the figure is a bit confusing in what is displayed between dashed lines, solid lines, dots... maybe it would be easier to read with
- bigger dots and dashed lines in color for each of the 4 series
- and so in the legend, thin solid lines in the corresponding color for the fit, but no solid line in the legend (to distinguish data/fit)
- no lines for FWHM as they are not very visible, and the FWHM values are not mentioned for these examples.
Thank you for your detailed suggestions. We agree that the original Fig. 2b appeared messy and confusing. Based on this feedback and other comments, we decided to replace the FWHM-based vessel diameter measurement with a more stable binarization-based approach. In the revised version, we selected a specific segment of each vessel and measured the diameter by calculating the distance from the vessel’s centerline to both sides after binarization. Each point on the centerline of this segment provides a diameter measurement, which can be further used to calculate the mean and standard error. This updated method is more stable and reproducible, providing reliable measurements even for vessels that are not fully saturated. It also facilitates comparison across more vessels, helping to further demonstrate the generalizability of our saturation standard. We believe these adjustments make the revised Fig. 2b clearer and more readable.
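A simplified sketch of such a binarization-based measurement is given below. It assumes the segment runs roughly vertically (as for a penetrating cortical vessel), so the width is taken along each image row; a general implementation would measure perpendicular to the local centerline direction. The function name `segment_diameter`, the 5 µm pixel size, and the toy map are assumptions for illustration.

```python
import numpy as np

def segment_diameter(binary_map, centerline, pixel_size_um):
    """Mean and standard error of vessel width along a centerline.

    binary_map:    2D boolean array (binarized ULM map).
    centerline:    iterable of (row, col) points on the vessel centerline.
    pixel_size_um: assumed pixel size in micrometers.
    Width at each point = contiguous vessel pixels to the left and right
    of the centerline pixel in the same row (vertical vessel assumed).
    """
    binary_map = np.asarray(binary_map, dtype=bool)
    n_cols = binary_map.shape[1]
    widths = []
    for row, col in centerline:
        if not binary_map[row, col]:
            continue  # skip centerline points without signal
        left = col
        while left > 0 and binary_map[row, left - 1]:
            left -= 1
        right = col
        while right < n_cols - 1 and binary_map[row, right + 1]:
            right += 1
        widths.append((right - left + 1) * pixel_size_um)
    if not widths:
        raise ValueError("no centerline point lies on the vessel")
    widths = np.asarray(widths, dtype=float)
    sem = widths.std(ddof=1) / np.sqrt(widths.size) if widths.size > 1 else 0.0
    return widths.mean(), sem

# Toy vessel: 3 pixels wide in rows 0-3, 4 pixels wide in row 4.
vessel = np.zeros((5, 7), dtype=bool)
vessel[:4, 2:5] = True
vessel[4, 2:6] = True
points = [(r, 3) for r in range(5)]
mean_um, sem_um = segment_diameter(vessel, points, pixel_size_um=5.0)
print(f"diameter = {mean_um:.1f} +/- {sem_um:.1f} um")
```

Each centerline point contributes one width sample, which is why the approach yields a mean and standard error per segment rather than a single profile fit.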
• Page 7, lines 144-147. This passage is not really clear when linking going up or down and going from the stem to the branches that it is specific to Figure 4a (and therefore to this particular location).
Thank you for your insightful comments on our vessel classification method. We recognize the limitations of the previous approach and, in order to enhance the rigor of the study, we have opted not to continue using this method in the revised manuscript. We have removed all content related to vessel classification based on branch-in and branch-out criteria. This includes the original Classification of Cerebral Vessels section in the Methods, the relevant descriptions in the Results section under “ULM reveals detailed cerebral vascular changes from anesthetized to awake for the full depth of the brain”, the limitation of this classification method in the Discussion section, as well as related content in the original Figures 4 and 5.
In the revised analysis, for the comparison between arteries and veins, we focus solely on penetrating vessels in the cortex. For these vessels, it is generally accepted that downward-flowing vessels are arterioles, while upward-flowing vessels are venules. Accordingly, in the revised Figures 4 and 6, we analyze arterioles and venules exclusively in the cortex, without relying on the previous classification method that could be considered controversial.
• Page 11 line 222 "higher vascular density" seems unprecise.
Thank you for pointing this out. We have revised the sentence to more precisely convey our observations regarding changes in vascular diameter and vascularity within the ROI. We present these findings as evidence of the vasodilation effect under isoflurane, in alignment with existing research. The revised statement is as follows:
(Line 275) “Statistical analysis from Fig. 4 shows that certain vessels exhibit a larger diameter under isoflurane anesthesia, and the vascularity, calculated as the percentage of vascular area within selected brain region ROIs, is also higher in the anesthetized state. These findings suggest a vasodilation effect induced by isoflurane, consistent with existing research(20,40,41,43,44).”
• Discussion: page 12, lines 257-267: it is not exactly clear how 3D imaging will help for the differentiation of veins/arteries. However, some methods have already been proposed to discriminate between arteries and veins using pulsatility (Bourquin et al., 2022) or 3D positioning when vessels are overlapped (Renaudin et al., 2023). The latter can also help estimate the out-of-plane positioning during longitudinal imaging.
Bourquin, C., Poree, J., Lesage, F., Provost, J., 2022. In Vivo Pulsatility Measurement of Cerebral Microcirculation in Rodents Using Dynamic Ultrasound Localization Microscopy. IEEE Trans. Med. Imaging 41, 782-792. https://doi.org/10.1109/TMI.2021.3123912
Renaudin, N., Pezet, S., Ialy-Radio, N., Demene, C., Tanter, M., 2023. Backscattering amplitude in ultrasound localization microscopy. Sci. Rep. 13, 11477. https://doi.org/10.1038/s41598-023-38531-w
Thank you for pointing this out. We have revised the relevant paragraph in the discussion to clarify the potential advantages of advances in ULM imaging methods, such as those based on pulsatility (as described by Bourquin et al., 2022) or backscattering amplitude (as demonstrated by Renaudin et al., 2023). These established methods could be helpful for longitudinal imaging. Below is the revised text in the discussion section:
(Line 370) “Advances in ULM imaging methods can benefit longitudinal awake imaging. For instance, dynamic ULM can differentiate between arteries and veins by leveraging pulsatility features(51). 3D ULM, with volumetric imaging array(52,53), enables the reconstruction of whole-brain vascular network, providing a more comprehensive understanding of vessel branching patterns. Meanwhile, 3D ULM also helps to mitigate the challenge of aligning the identical coronal plane for longitudinal imaging, a process that requires precise manual alignment in 2D ULM to ensure consistency. Additionally, this alignment issue can also be alleviated in 2D imaging using backscattering amplitude method, which may assist in estimating out-of-plane positioning during longitudinal imaging(54).”
Reviewer 3 (Public Review):
• It is unclear whether multiple animals were used in the statistical analysis.
Thank you for bringing this to our attention. We acknowledge that the original version did not clearly indicate which animals were used in the statistical analysis. In the revised manuscript, we have added Supplementary Figure 1 to specify the mice used, and we have labeled each mouse accordingly in the figures or captions. In the revised Figures 4 and 6, we have ensured that each quantitative analysis figure or its caption clearly indicates the specific mice.
• Generalizations are sometimes drawn from what seems to be the analysis of a single vessel.
Thank you for pointing this out. To enhance the generalizability of our conclusions, we have expanded our analysis beyond single vessels in several parts of the study. For instance, in Figure 2, we analyzed three vessels at different depths within the same brain region of a single mouse, and we have included additional results in the Supplementary Figure 2 to further support these findings. Additionally, we have revised the language in the manuscript to ensure that conclusions are appropriately qualified and avoid overgeneralization.
In Figures 4 and 6, we extended the analysis from single vessels to larger region-of-interest (ROI) analyses across entire brain regions. Unlike single-vessel measurements, which are susceptible to bias based on specific measurement locations, ROI-based analyses are less influenced by the operator and provide more objective, generalizable insights.
• The description of the statistical analysis is mostly qualitative.
We recognize that some aspects of the original statistical analysis (Figures 4 and 5 in the previous version) lacked rigor and that the description was largely qualitative. The revised statistical analysis (Figure 4 and Figure 6) presents our findings at multiple scales, ranging from individual vessels to individual cortical ROIs of arteries and veins, and ultimately to broader brain regions. For instance, as illustrated in the revised Figure 4f, the average cortical arterial flow speed decreases by approximately 20% from anesthesia to wakefulness, while venous flow speed decreases by an average of 40%, with the reduction in venous flow speed being significantly greater than that of arterial flow. We believe that this kind of description makes the analysis more quantitative.
For more examples, please refer to the Results section where Figure 4 (Line 169 to Line 207) and Figure 6 (Line 224 to Line 258) are described. These sections have been extensively rewritten to emphasize quantitative interpretation of the data. Each part of the analysis now focuses more heavily on quantitative analyses that consistently show similar trends across all animals.
• Some terms used are insufficiently defined.
• Additional limitations should be included in the discussion.
• Some technical details are lacking.
Thank you for highlighting these issues. In response, we have made several improvements in the revised manuscript to address these issues. We have clarified terms such as “vascularity” (Line 547) and “saturation point” (Line 112) to ensure precision and prevent ambiguity. We have expanded the discussion (Line 310 to Line 377) to include limitations such as motion correction challenges and advances in ULM imaging methods, including dynamic ULM and backscattering amplitude techniques. We have added further details on interleaved sampling (Line 494 to Line 497), ULM tracking (Line 517 to Line 529), and quantitative analysis (Line 535 to Line 551) in the Methods section to provide a clearer understanding of our approach.
Please refer to our other responses for more specific adjustments.
• Without information about whether the results obtained come from multiple animals, it is difficult to conclude that the authors generally achieved their aim. They do achieve it in a single animal. The results that are shown are interesting and could have an impact on the ULM community and beyond. In particular, the experimental setup they used along with the high reproducibility they report could become very important for the use of ULM in larger animal cohorts.
We thank the reviewer for recognizing the impact of our work. We also acknowledge that there were some issues; specifically, we did not provide sufficient proof of reproducibility. In the revised version, we have included additional animal experiment results to ensure that the conclusions were not drawn from a single animal but are generally representative of our aim. (See Supplementary Figure 1 for details on the animals used.)
Reviewer 3 (Recommendations For The Authors):
• The manuscript would be more convincing by removing some of the superlatives used in the text. For instance, shouldn't "super-resolution ultrasound localization microscopy" simply be "ultrasound localization microscopy"? Expressions such as "first study", "essential", and "invaluable", etc could be replaced by more factual terms. The word "significant" is also used sometimes with statistics to back it up and sometimes without.
Thank you for highlighting this issue. We have removed the superlatives throughout the manuscript to make the language more precise. For instance, we have simplified “super-resolution ultrasound localization microscopy” to “ultrasound localization microscopy” throughout the main text and removed expressions such as “first study” and “invaluable”. We also reviewed all uses of “essential” and “significant,” replacing “essential” with more modest alternatives where it does not indicate a strict requirement. Similarly, where “significant” does not refer to statistical significance, we have used other terms to avoid any ambiguity.
• The section "Microbubble count serves as a quantitative metric for awake ULM image reconstruction" had several issues that I think should be addressed. Mainly, the authors make the case that after detecting 5 million microbubbles, there is no clear gain in detecting more. The argument is not very convincing as we know many vessels will not have had a microbubble circulate in them within that timeframe, which will be especially true in smaller vessels. While the analysis in Figure 2 shows nicely that the diameter estimate for vessels in the 20-30 um range is stable at 5 million microbubbles, it is not necessarily the case for smaller vessels. A better approach here might be to select, e.g., a total of 5 million detected microbubbles for practical reasons and then to determine which vessel parameters estimation (e.g., diameter, flow velocity) remain stable. In addition:
a. Terms such as 'complete ULM reconstruction', 'no obvious change', 'ULM image saturation' are not well defined within the manuscript.
Thank you for pointing out these issues and for offering a more rigorous approach. We completely agree with your suggestion. While our analysis demonstrated stable diameter estimates for vessels with diameter around 20 µm at 5 million microbubbles, this does not necessarily ensure stability for smaller vessels. Therefore, the choice of 5 million microbubbles was primarily for practical reasons. In the revised version, we have provided a more objective description and clarification of this limitation. We also recognize that terms such as “complete ULM reconstruction,” “no obvious change,” and “ULM image saturation” were not well defined and may have caused confusion, reducing the rigor of this manuscript. Based on your feedback, we have clearly defined “ULM image saturation” within the context of our study, removed absolute and ambiguous terms like “complete ULM reconstruction” and “no obvious change”. We revised the entire section accordingly:
(Line 109) “To facilitate equitable comparison of brain perfusion at different states, a practical saturation point enabling stable quantification of most vessels needs to be established. Our observations indicated that when the cumulative MB count reached 5 million, ULM images achieved a relatively stable state. Accordingly, in this study, the saturation point was defined as a cumulative MB count of 5 million. There are also possible alternatives for ULM image normalization. For example, different ULM images can be normalized to have the same saturation rate. However, the proposed method of using the same number of cumulative MB count for normalization enables the analysis of blood flow distribution across different brain regions from a probabilistic perspective. The following analysis substantiates this criterion.
Fig. 2a compares ULM directional vessel density maps and flow speed maps generated with 1, 3, 5, and 6 million MBs, using the same animal as shown in Fig. 1. To quantitatively confirm saturation, multiple vessel segments were selected for further analysis. Fig. 2b presents the measured vessel diameter for a specific segment at various MB counts. After binarizing the ULM map, the vessel diameter was measured by calculating the distance from the vessel centerline to the edge. Each point along the centerline of the segment provided a diameter measurement, enabling calculation of the mean and standard error. At low MB counts, vessels appeared incompletely filled, leading to inaccurate estimation of vessel diameter due to incomplete profiles. For example, at 1–2 million MBs, the binarized ULM map displayed a width of only one or two pixels along the segment. As a result, the measurements always yielded the same diameter values (two pixels, ~10 µm) with a consistently low standard error of the mean across the entire segment. With increased MB counts, the measured vessel diameter gradually rose, ultimately reaching saturation. The plots in Fig. 2b show that vessel diameter stabilized at 5 million MB count. Additionally, Fig. 2c illustrates the changes in flow velocity measured at different cumulative MB counts. The violin plots display the distribution of flow speed estimates for all valid centerline pixels within the selected segment. At low MB counts (1–3 million), flow velocity estimates fluctuated, but they stabilized as the MB count increased (4–6 million MBs). At 5 million MBs, flow velocity estimates were nearly identical to those at 6 million MBs, corroborating previous findings that vessel velocity measurements stabilize as MB count grows(39). To assess the generalizability of the 5 million MB saturation condition, vessel segments from three different mice across various brain regions were examined. The results, shown in Supplementary Fig. 2, confirm that this saturation criterion applies broadly. Although the 5 million MB threshold may not ensure absolute saturation for all vessels, it is generally effective for vessels larger than 15 μm. This MB count threshold was therefore adopted as a practical criterion.”
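For readers who want to reproduce the per-segment summary described in the quoted passage, the following is a minimal sketch rather than the authors' analysis code. It assumes that the diameter at each centerline point is twice the centerline-to-edge distance and that one pixel corresponds to about 5 µm (inferred from "two pixels, ~10 µm"); the function name and arguments are hypothetical.

```python
import math
import statistics

PIXEL_UM = 5.0  # assumed pixel size, inferred from "two pixels, ~10 um"


def diameter_stats(half_widths_px, pixel_um=PIXEL_UM):
    """Return (mean diameter, standard error of the mean), both in micrometres.

    half_widths_px holds one centerline-to-edge distance (in pixels) per
    centerline point of the segment. Doubling it to obtain a diameter is an
    assumption of this sketch.
    """
    diameters = [2.0 * r * pixel_um for r in half_widths_px]
    mean = statistics.mean(diameters)
    if len(diameters) > 1:
        sem = statistics.stdev(diameters) / math.sqrt(len(diameters))
    else:
        sem = 0.0
    return mean, sem


# A sparsely filled segment: every point reports the same narrow width, so the
# standard error collapses to zero even though the estimate is not yet reliable.
print(diameter_stats([1, 1, 1, 1]))
```

This also illustrates the caveat raised in the passage: a low standard error at low MB counts reflects the pixel floor of the binarized map, not a converged measurement.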
b. The choice of 10 consecutive tracking frames is arbitrary and should be described as such unless a quantitative optimization study was conducted. Was there a gap-filling parameter? What was the maximum linking distance and what is its impact on velocity estimation?
Thank you for your comment. We acknowledge that the choice of 10 consecutive tracking frames was based on our common practice rather than a specific quantitative optimization. Additionally, with the uTrack algorithm, we set both the gap-filling parameter and maximum linking distance to 10 pixels. Setting these parameters too high could potentially overestimate velocity. These details have now been added to the Methods section for clarity:
(Line 517) “The choice of 10 consecutive frames (10 ms) was based on established practice but can be adjusted as needed. For the uTrack algorithm, two additional key parameters were specified: the maximum linking distance and the gap-filling distance, both set to 10 pixels (~50 microns). This configuration means that only bubble centroids within 10 pixels of each other across consecutive frames are considered part of the same bubble trajectory. Additionally, when the start and end points of two tracks fall within this threshold, the gap-filling parameter merges them into a single, continuous track. It is important to select these parameters carefully, as overly large values could lead to an overestimation of flow velocity. By setting the maximum linking distance to 10 pixels, we effectively limited the measurable velocity to 50 mm/s, under the assumption that no bubble would exceed a 50-micron displacement within the 1 ms interval between frames. After determining bubble tracks with the specified parameters for uTrack algorithm, accumulating the MB tracks resulted in the flow intensity map. Considering the velocity distribution across the mouse brain, this 50 mm/s limit ensures that the vast majority of blood flow is captured accurately.”
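The 50 mm/s ceiling quoted above follows directly from the linking distance and the frame interval. A small sketch of that arithmetic is given here; the function name is hypothetical, and the 5 µm pixel size is an assumption inferred from "10 pixels (~50 microns)".

```python
def max_trackable_speed_mm_s(max_link_px, pixel_um, frame_interval_ms):
    """Largest speed a tracker can follow when linking across consecutive frames.

    A bubble that moves further than the maximum linking distance between two
    frames cannot be linked, so that distance divided by the frame interval
    sets the ceiling.
    """
    displacement_um = max_link_px * pixel_um
    # Micrometres per millisecond are numerically equal to millimetres per second.
    return displacement_um / frame_interval_ms


# 10 pixels at an assumed 5 um per pixel, with 1 ms between frames
print(max_trackable_speed_mm_s(10, 5.0, 1.0))  # 50.0
```

Halving the linking distance would halve this ceiling, which is the trade-off the response alludes to when it warns against overly large values.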
c. 'The plots (Figure 2b) clearly indicate that the vessel diameter stabilized beyond 5 million MB count.' This is true for one vessel. To generalize that claim, the analysis should be performed quantitatively on a larger sample of vessels in various areas of the brain, across multiple animals.
Thank you for pointing out this limitation. We agree that conclusions drawn from a single vessel cannot be generalized across all regions. Following your suggestion, we have added Supplementary Figure 2, where we analyzed multiple vessels from different brain regions across three mice. This expanded analysis further confirms that a 5 million MB count is sufficient to stabilize vessel diameter measurements across various samples.
(Line 133) “To assess the generalizability of the 5 million MB saturation condition, vessel segments from three different mice across various brain regions were examined. The results, shown in Supplementary Fig. 2, confirm that this saturation criterion applies broadly. Although the 5 million MB threshold may not ensure absolute saturation for all vessels, it is generally effective for vessels larger than 15 μm. This MB count threshold was therefore adopted as a practical criterion.”
• "Statistical analysis validates the increase in blood flow induced by anesthesia" is a very interesting section but even though a quantitative analysis was conducted in Figure 5, the language used remains mostly qualitative. I think this section should include quantitative conclusions from the statistical analysis to increase the impact of this work.
Thank you for your valuable feedback. We recognize that some aspects of the original quantitative analysis (Figures 4 and 5 in the previous version) lacked rigor, such as the classification of arteries, veins, and capillaries, and that the data presented in each row of Figure 5 represented only one mouse per coronal section, limiting the generalizability of statistical conclusions.
In response to the reviewers’ feedback, the revised version incorporates a new approach by merging the previous Figure 4 and Figure 5 into a single, consolidated figure (now Figure 4). This updated figure aims to present our findings from multiple dimensions, ranging from individual vessels to individual cortical ROI of arteries and veins, and ultimately to broader brain regions. We have focused on quantitative analyses that consistently show similar trends across all animals. For instance, as illustrated in the revised Figure 4f, the average cortical arterial flow speed decreases by approximately 20% from anesthesia to wakefulness, while venous flow speed decreases by an average of 40%, with the reduction in venous flow speed being significantly greater than that of arterial flow. We believe that this approach offers more insightful analysis and enhances the overall impact of the study.
For more examples, please refer to the revised Results section where Figure 4 is described (Lines 169 to 212). These sections have been extensively rewritten to emphasize quantitative interpretation of the data. Each part of the analysis now focuses more heavily on quantitative analyses that consistently show similar trends across all animals.
• In the methods, it is claimed that 6 healthy female C57 mice were used in the study, but it is hard to tell whether more than one animal is shown in the figures. It is also unclear whether the statistics were performed within or across animals. Since one of the major strengths of the manuscript is that it shows the feasibility of performing reproducible measurements using ULM, most figures should be repeated for each individual animal and provided in supplementary data and statistics should be performed across animals.
Thank you for bringing this to our attention. We acknowledge that the original version did not clearly indicate the use of individual animals. In the revised manuscript, we have added Supplementary Figure 1 to specify the mice used, and we have labeled each mouse accordingly in the figures or captions. Additionally, we included statistics across animals in the revised Figures 4 and 6, and detailed data for each individual mouse are now provided in Supplementary Figures 3 and 4.
• The effect of aliasing should be discussed given that 1) a high-frequency probe is used along with a correspondingly relatively low frame rate (1000 fps) and 2) Doppler filtering is used to separate upward from downward-moving microbubbles. There will be microbubbles that circulate faster than the Nyquist limit, which will thus appear as moving in the opposite direction in the Doppler spectrum. It would be important to double-check that the effect is not too important and to report this as a limitation in the discussion.
Thank you for highlighting this important point. Aliasing is indeed a relevant issue to consider, especially for higher flow velocities in large vessels. We have added a discussion on this limitation in the revised manuscript:
(Line 359) “Based on the maximum linking distance and gap closing parameters outlined in the Methods section, blood flow with velocities below 50 mm/s can be detected. However, the use of a directional filter to estimate flow direction may introduce aliasing. MBs moving at higher velocities may be subject to incorrect flow direction estimation due to aliasing effects. Given that the compounded frame rate is 1000 Hz, with an ultrasound center frequency of 20 MHz and a sound speed of 1540 m/s, the relationship between Doppler frequency and the axial blood flow velocity(12) indicates that aliasing will not occur for axial flow velocities below 19.25 mm/s. In all flow velocity maps presented in this study, the range is limited to a maximum of 15 mm/s, remaining below the critical threshold for aliasing. Additionally, all vessels analyzed in the violin plots for arteriovenous flow comparisons fall within this range. While cortical arterioles and venules generally exhibit moderate flow speeds, aliasing remains a factor to consider when combining directional filtering with velocity analysis.”
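The 19.25 mm/s threshold can be checked with the standard pulsed-Doppler relation between axial speed and Doppler shift, f_d = 2·v·f0/c, together with the Nyquist condition |f_d| < PRF/2. The sketch below uses only the values stated in the quoted passage; the function name is hypothetical.

```python
def doppler_nyquist_speed_mm_s(frame_rate_hz, center_freq_hz, sound_speed_m_s=1540.0):
    """Axial speed above which the Doppler shift aliases, in mm/s.

    Doppler shift for axial speed v:  f_d = 2 * v * f0 / c
    No aliasing while |f_d| < frame_rate / 2, hence
        v_max = c * frame_rate / (4 * f0)
    """
    v_max_m_s = sound_speed_m_s * frame_rate_hz / (4.0 * center_freq_hz)
    return v_max_m_s * 1000.0


# 1000 Hz compounded frame rate, 20 MHz center frequency, 1540 m/s sound speed
print(round(doppler_nyquist_speed_mm_s(1000.0, 20e6), 2))  # 19.25
```

Note that this limit applies to the axial velocity component only, so a vessel running obliquely to the beam can carry faster total flow before its direction estimate aliases.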
• The method used to classify vessels may be incorrect and may not be needed. I would recommend the authors not use it and describe the vessels as vessels that branch in or out, etc. Applying an arbitrary threshold of 2 to detect capillaries is also not very convincing. I understand that the authors might decide to maintain this nomenclature, in which case I would recommend clearly explaining it at the beginning of the manuscript along with some of the caveats that are already reported in the discussion.
Thank you for your comments on our vessel classification method. We recognize the limitations of the previous approach and, in order to enhance the rigor of the study, we have opted not to continue using this method in the revised manuscript.
In the revised analysis of arteries and veins, we focus solely on penetrating vessels in the cortex. For these vessels, it is generally accepted that downward-flowing vessels are arterioles, while upward-flowing vessels are venules. Accordingly, in the revised Figures 4 and 6, we analyze arterioles and venules exclusively in the cortex, without relying on the previous classification method that could be considered controversial.
Additionally, we agree that classifying vessels with values below 2 as capillaries was not a robust approach. Thus, we have removed all related analyses from the revised manuscript.
Minor comments:
• Line 16: "resolves capillary-scale ..."; it is not clear that the resolution that is achieved in this work is at the capillary scale.
Thank you for your valuable feedback. We understand that “capillary-scale” may overstate the achieved resolution in our work. To clarify, we have revised the sentence as follows:
(Line 18) “Ultrasound localization microscopy (ULM) is an emerging imaging modality that resolves microvasculature in deep tissues with high spatial resolution.”
This adjustment more accurately reflects the resolution capabilities of ULM as used in our study.
• Line 22: 'vascularity' is not well defined in the manuscript. Consider defining or using another term.
Thank you for pointing out the need for clarification on vascularity. We acknowledge that our initial use of the term “vascularity” may have been unclear and potentially confusing. In the revised manuscript, we have included a clear definition of “vascularity” in the Methods section under Quantitative Analysis of ULM Images (Line 534).
The following sentence shows the definition of vascularity:
(Line 547) “Vascularity was defined as the proportion of the pixel count occupied by blood vessels within each ROI, obtained by binarizing the ULM vessel density maps and calculating the percentage of the pixels with MB signal.”
We have also added a brief definition where the term is first used in the Results section:
(Line 161) “When comparing vessel density maps, ULM images that are acquired in the awake state demonstrate a global reduction of vascularity, which refers to percentage of pixels that occupied by blood vessels.”
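The vascularity definition above can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: the density map is taken to be a 2D grid of MB counts, the ROI a boolean mask of the same shape, and any pixel with a count above a hypothetical `threshold` of zero is treated as having MB signal.

```python
def vascularity_percent(density_map, roi_mask, threshold=0):
    """Percentage of ROI pixels whose MB count exceeds the threshold.

    density_map: rows of per-pixel MB counts (the ULM vessel density map).
    roi_mask:    rows of booleans of the same shape; True marks ROI pixels.
    threshold:   binarization level (assumed here; the source does not give one).
    """
    roi_pixels = 0
    vessel_pixels = 0
    for density_row, mask_row in zip(density_map, roi_mask):
        for count, in_roi in zip(density_row, mask_row):
            if in_roi:
                roi_pixels += 1
                if count > threshold:
                    vessel_pixels += 1
    if roi_pixels == 0:
        raise ValueError("ROI mask selects no pixels")
    return 100.0 * vessel_pixels / roi_pixels


density = [[0, 3, 0],
           [1, 0, 0]]
roi = [[True, True, False],
       [True, True, False]]
print(vascularity_percent(density, roi))  # 50.0
```

Because the measure counts occupied pixels rather than summing MB counts, it depends on the cumulative MB count used for reconstruction, which is why comparing states at a matched count matters.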
• Line 30: I'm not convinced the first two sentences are useful.
Thank you for pointing out this issue. The opening sentence of the article lacked focus and was too broad. We have rewritten the sentence as follows:
(Line 34) “Sensitive imaging of correlates of activity in the awake brain is fundamental for advancing our understanding of neural function and neurological diseases.”
• Line 37: 'micron-scale capillaries': this expression is unclear. Capillaries are typically micron-scaled, so it gives the impression that ULM can image ULM at the one-micron scale, which is not the case.
Thank you for your helpful comment. We agree that “micron-scale capillaries” could be misleading, as it might imply a resolution at the single-micron level. To clarify, we have revised the sentence as follows:
(Line 40) “ULM is uniquely capable of imaging microvasculature situated in deep tissue (e.g., at a depth of several centimeters).”
This revised wording more accurately describes ULM’s capability without implying single-micron level resolution.
• Line 74: I don't think motion-free imaging is possible in the context of awake animals. Consider 'limiting motion' instead.
Thank you for pointing out the potential issue with the term “motion-free”. We agree that achieving entirely motion-free imaging is challenging, especially in the context of awake animals. In response to your suggestion, we have revised the sentence to better reflect this limitation:
(Line 76) “To achieve consistent ULM brain imaging while allowing limited movement in awake animals, a head-fixed imaging platform with a chronic cranial window was used in this study.”
This revised wording more accurately conveys our approach to minimizing motion without implying that motion is completely eliminated.
• Line 134:'clearly reveals decreased vessel diameter' How was that demonstrated?
• Line 153: 'significant' according to which statistical test?
• Line 167: 'slight increase', by how much, is it significant?
• Line 183: 'smaller vessels' the center of the distribution is not at 10mm/s, and velocity is not necessarily correlated with diameter.
• Line 184: 'more large vessels', see above. What is a large vessel, and how was this measured?
• Line 205: 'significantly lower', according to which statistical test?
We acknowledge that the original version did not use statistical terminology properly. In the revised manuscript, we have deleted the related statements and rewritten the statistical analysis section to ensure that the terms are used correctly. Please refer to the revised section “ULM reveals an increase in blood flow induced by isoflurane anesthesia” (Lines 169 to 209). In the revised Figures 4 and 6, we have also ensured that each quantitative analysis figure or its caption is clearly explained.
• Line 398: the interleaved sampling scheme should be described in more detail.
Thank you for pointing out this issue. The previous version did not clearly explain the details of interleaved sampling. We have now added the following paragraph to the Ultrasound imaging sequence section in Methods:
(Line 494) “Interleaved sampling is employed to capture high-frequency echoes more effectively. With the system’s sampling rate limited to 62.5 MHz, the upper limit of the center frequency of the transducer passband is 15.625 MHz. To mitigate aliasing, two transmissions are sent per angle, staggered in time. This approach effectively doubles the sampling rate, ensuring more accurate image reconstruction.”
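The figures in this paragraph are consistent with a rule of four samples per cycle of the center frequency (62.5 MHz / 4 = 15.625 MHz). The sketch below makes that rule explicit; the four-samples-per-cycle value is inferred from the quoted numbers rather than stated in the text, and the function name is hypothetical.

```python
def max_center_freq_mhz(adc_rate_mhz, interleave_factor=1, samples_per_cycle=4):
    """Highest center frequency supported at a given effective sampling rate.

    interleave_factor=2 models two time-staggered transmissions per angle,
    which doubles the effective sampling rate. samples_per_cycle=4 is an
    assumption inferred from 62.5 MHz / 15.625 MHz.
    """
    effective_rate_mhz = adc_rate_mhz * interleave_factor
    return effective_rate_mhz / samples_per_cycle


print(max_center_freq_mhz(62.5))                       # 15.625
print(max_center_freq_mhz(62.5, interleave_factor=2))  # 31.25
```

Under this assumption, interleaving raises the limit to 31.25 MHz, which would accommodate the 20 MHz center frequency mentioned elsewhere in the response.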
• Figure 1: Which mouse is it? Are these results consistent across all animals?
• Figure 2: Which mouse is it? Are these results consistent across all animals?
• Figure 3: Which mouse is it? Are these results consistent across all animals?
• Figure 4: Which mouse is it? Are these results consistent across all animals?
• Figure 5: Is it a single mouse or multiple mice? Are these results consistent across all animals?
We acknowledge that the original version did not clearly indicate the number of animals in the statistical analysis. In the revised manuscript, we have added Supplementary Figure 1 to specify the mice used, and we have labeled each mouse accordingly in the figures or captions. In the revised Figures 4 and 6, we have ensured that each quantitative analysis figure or its caption clearly indicates the specific mice.
The original Figures 1 and 2 are presented as case studies to illustrate the methodology. Because the anesthesia time required for tail vein injection varies slightly between animals, the time each mouse takes to recover from anesthesia cannot be kept consistent across all mice. For instance, in Figure 1, the mouse took nearly 500 seconds to recover from anesthesia, but this duration is not consistent across all animals, which is a limitation of the bolus injection technique. We have noted this point in the discussion (discussion on the limitation of bolus injection), and we have also clarified in the results section and figure captions that these figures represent a case study of a single mouse rather than a standardized recovery time for all animals.
We further clarified this point at the end of the Figure 2 caption:
(Fig. 2 caption) “This figure presents a case study based on the same mouse shown in Fig 1. The x-axis for d-f begins at 500 seconds because, at this point, the mouse’s pupil size stabilized, indicating it had recovered to an awake state. Consequently, ULM images were accumulated starting from this time. It is important to note that not every mouse requires 500 seconds to fully awaken; the time to reach a stable awake state varies across individual mice.”
We added the following statement before introducing Figure 1e:
(Line 93) “Due to differences in tail vein injection timing and anesthesia depth, the time required for each mouse to fully awaken varied. Although it was not feasible to get pupil size stabilized just after 500 seconds for each animal, ULM reconstruction only used the data that acquired after the animal reached full pupillary dilation, to ensure that ULM accurately captures the cerebrovascular characteristics in the awake state.”
We added the following statement before introducing Figure 2d:
(Line 139) “To further verify that the proposed MB bolus injection method can help to achieve ULM image saturation shortly after mice awaken from anesthesia, an analysis on the change in MB concentration over time was conducted once pupil size had stabilized (T = 500s).”
For Figures 3, 4, and 5 (in the revised version, Figures 4 and 5 have been combined into a single Figure 4), the data represents results from three individual mice, with each coronal plane corresponding to a different mouse. In the revised version, we have added labels to indicate the specific mouse in each image to improve clarity. We also recognize that some analyses in the original submission (original Figure 5) may have lacked sufficient statistical power due to the small sample size. Therefore, in the revised version, we have focused only on findings that were consistently observed across the three mice to ensure robust conclusions.
Minor corrections and typos from all reviewers:
We would like to sincerely thank the reviewers for their careful reading of our manuscript. We appreciate the time and effort taken to point out the minor typographical errors. We have carefully addressed and corrected all the identified typos, as listed below:
From Reviewer #1:
• Line 316: "insensate": correct, please.
(Line 409) “After confirming that the mouse was anesthetized, the head of the animal was fixed in the stereotaxic frame.”
From Reviewer #3:
• Line 15: Super-resolution ultrasound localization microscopy -- consider removing super-resolution as it gives the impression that it is different from standard ULM.
(Line 18) “Ultrasound localization microscopy (ULM) is an emerging imaging modality that resolves microvasculature in deep tissues with high spatial resolution.”
• Line 39: typo: activities should be activity.
(Line 41) “ULM can also be combined with the principles of functional ultrasound (fUS) to image whole-brain neural activity at a microscopic scale.”
• Line 47: typo: over under.
(Line 50) “Therefore, in neuroscience research, brain imaging in the awake state is often preferred over imaging under anesthesia.”
Once again, we are grateful for the reviewers’ thorough review and valuable input, which have helped us improve the clarity and precision of the manuscript.
-
-
www.researchsquare.com www.researchsquare.com
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This paper investigates the neural mechanisms underlying the change in perception when viewing ambiguous figures. Each possible percept is related to an attractor-like brain state and a perceptual switch corresponds to a transition between these states. The hypothesis is that these switches are promoted by bursts of noradrenaline that change the gain of neural circuits. The authors present several lines of evidence consistent with this view: pupil diameter changes during the time point of the perceptual change; a gain change in neural network models promotes a state transition; and large-scale fMRI dynamics in a different experiment suggests a lower barrier between brain states at the change point. However, some assumptions of the computational model seem not well justified and the theoretical analysis is incomplete. The paper would also benefit from a more in-depth analysis of the experimental data.
Strengths:
The main strength of the paper is that it attempts to combine experimental measurements - from psychophysics, pupil measurements, and fMRI dynamics - and computational modeling to provide an emerging picture of how a perceptual switch emerges. This integrative approach is highly useful because the model has the potential to make the underlying mechanisms explicit and to make concrete predictions.
Weaknesses:
A general weakness is that the link between the three parts of the paper is not very strong. Pupil and fMRI measurements come from different experiments and additional analysis showing that the two experiments are comparable should be included. Crucially, the assumptions underlying the RNN modeling are unclear and the conclusions drawn from the simulation may depend on those assumptions.
With this comment in mind, we have made a substantial effort to better integrate the three aspects of our paper. On the pupillometry side, we now show that the dynamic uncertainty associated with perceptual categorisation shares a similar waveform with the observed fluctuations in pupil diameter around the switch point (Fig 2B). To better link the modelling to the behaviour, we have also made the gain of the activation function of each sigmoidal unit change dynamically as a function of the uncertainty (i.e. the entropy) of the network’s classification. This generates phasic changes in gain that mimic the observed phasic changes in pupil dilation, explicitly linking the dynamics of gain in the RNN to the observed dynamics of pupil diameter (our non-invasive proxy for neuromodulatory tone). We also note that the predictions of the RNN (a flattened egocentric landscape and peaks in low-dimensional brain state velocity at the time point of the perceptual switch) were tested directly in the whole-brain BOLD data, which links the modelling and the BOLD analysis. Finally, whilst we agree that an experiment in which pupillometry and BOLD data were collected simultaneously would be ideal, these data were not available to us at the time of this study.
Main points:
Perceptual tasks in pupil and fMRI experiments: how comparable are these two tasks? It seems that the timing is very different, with long stimulus presentations and breaks in the fMRI task and a rapid sequence in the pupil task. Detailed information about the task timing in the pupil task is missing. What evidence is there that the same mechanisms underlie perceptual switches at these different timescales? Quantification of the distributions of switching times/switching points in both tasks is missing. Do the subjects in the fMRI task show the same overall behavior as in the pupil task? More information is needed to clarify these points.
We recognize the need for a more detailed and comparative analysis of the perceptual tasks used in our pupil and fMRI experiments, particularly regarding differences in timing, task structure, and instructions. The fMRI task incorporates jittered inter-trial intervals (ITIs) of 2, 4, 6, and 8 seconds, designed to enable effective deconvolution of the BOLD response (Stottinger et al., 2018). In contrast, the pupil task presents a more rapid sequence of stimuli without ITIs. These timing differences are reflected in the mean perceptual switch points: the 8th image in the fMRI task and the 9th image in the pupil task. This small yet consistent difference suggests subtle influences of task design on behavior.
Despite these structural and instructional differences, our analyses indicate that overall behavioral patterns remain consistent across the two modalities. The distributions of switching times align closely, and no significant behavioral deviations were observed that might suggest a fundamental difference in the underlying mechanisms driving perceptual switches. These findings suggest that the additional time and structural differences in the fMRI task do not significantly alter the behavioral outcomes compared to the pupil task.
To address these issues, we have added paragraphs in the Results, Methods, and Limitations sections of the manuscript. In the Results section, we provide a detailed comparison of switching point distributions across the two tasks, emphasizing behavioral consistencies and any observed variations. In the Methods section, we include an expanded description of task timing, instructions, and the presence or absence of catch trials to ensure clarity regarding the experimental setups. Finally, in the Limitations section, we acknowledge the structural differences between the tasks, particularly the lack of catch trials and rapid stimulus presentation in the pupil task, and discuss how these differences may influence perceptual dynamics.
These additions aim to clarify how task-specific factors, such as timing, instructions, and catch trials, influence perceptual dynamics while highlighting the consistency in behavioral outcomes across both experimental setups. We believe these revisions address the concerns raised and enhance the manuscript’s transparency and rigor.
Computational model:
(1) Modeling noradrenaline effects in the RNN: The pupil data suggests phasic bursts of NA would promote perceptual switches. But as I understand, in the RNN neuromodulation is modeled as different levels of gain throughout the trial. Making the neural gain time-dependent would allow investigation of whether a phasic gain change can explain the experimentally observed distribution of switching times.
We thank the reviewer for this very helpful suggestion. We updated the RNN so that, post-training, gain changes dynamically as a function of the network's classification uncertainty (i.e. the entropy of the network's output). Specifically, the gain dynamics of each unit in the neural network are governed by a linear ODE with a forcing function given by the entropy of the network’s classification. This explicitly tests the hypothesis that uncertainty-driven increases in gain near the perceptual switch (when the input is maximally ambiguous) speed perceptual switches, and it allows us to distinguish between tonic and phasic increases in gain (in the absence of uncertainty forcing, gain decays exponentially to a tonic value of 1). Importantly, in line with our hypothesis, we found that switch times decreased as we increased the magnitude of the uncertainty forcing on gain. Finally, we wish to note that although making gain dynamical is conceptually simple, implementing it and then analysing the dynamics turned out to be highly non-trivial. To our knowledge, our model is the first RNN of reasonable size to implement dynamical gain, which required us to push the RNN modelling beyond the current state of the art (see Figs 2-4).
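To make the description concrete, one possible form of such a gain equation is dg/dt = -(g - 1)/τ + κ·H(t), where H is the entropy of the network's output probabilities. The sketch below integrates this with forward Euler. It is an illustration only: the exact ODE, the time constant `tau`, the forcing strength `kappa`, and the use of precomputed output logits in place of a trained RNN are all assumptions, not details taken from the manuscript.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of output logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def simulate_gain(logit_series, kappa, tau=10.0, dt=1.0, g0=1.0):
    """Forward-Euler integration of dg/dt = -(g - 1)/tau + kappa * H(t).

    logit_series: one list of output logits per time step (a stand-in for
                  the output of a trained network).
    kappa:        strength of the uncertainty forcing (hypothetical name).
    With kappa = 0 the gain relaxes exponentially to the tonic value 1.
    """
    g = g0
    trace = []
    for logits in logit_series:
        h = entropy(softmax(logits))
        g += dt * (-(g - 1.0) / tau + kappa * h)
        trace.append(g)
    return trace


# Confident output early and late, maximally ambiguous output in the middle
series = [[4.0, -4.0]] * 5 + [[0.0, 0.0]] * 3 + [[-4.0, 4.0]] * 5
for step, g in enumerate(simulate_gain(series, kappa=0.5)):
    print(step, round(g, 3))
```

In this toy run the gain rises transiently while the output is ambiguous and then relaxes back toward 1, which is the phasic profile the response describes; setting `kappa` to zero removes the transient entirely.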
(2) Modeling perceptual switches: in the results, it is described that the networks were trained to output a categorical response, but the firing rates in Fig 2B do not seem categorical but rather seem to follow the input stimulus. The output signals of the network are not shown. If I understand correctly, a trivial network that would just represent the two input signals without any internal computation and relay them to the output would do the task correctly (because "the network's choice at each time point was the maximum of the two-dimensional output", p. 22). This seems like cheating: the very operation that the model should perform is to signal the change, in a categorical manner, not to represent the gradually changing input signals.
The output of the network was indeed trained to be categorical via a cross-entropy loss function, with the output defined by the max of the projection of the excitatory hidden units onto the output weights, which is standard RNN modelling practice. As requested, we now show the output in Fig 2B. On the broader question of whether a trivially small network could solve the task, we are in total agreement that, with the right set of hand-crafted weights, a two-neuron sigmoidal network with winner-take-all readout could solve the task. We disagree, however, that using an RNN is cheating in any way. Many tasks in neuroscience can be trivially solved with a very small number of recurrent units (e.g. basically all 2AF tasks). The question we were interested in is how the brain might solve the task and, more specifically, how neuromodulatory control of gain changes the dynamics of our admittedly very simple task. We could have done this by hand-crafting a small network to solve the task, but we wanted to use the RNN modelling as a means of both hypothesis testing and hypothesis generation. We now expand on and justify this modelling choice in the second paragraph of the discussion:
“We chose to use an RNN, instead of a simpler (more transparent) model as we wanted to use the RNN as a means of both hypothesis generation and hypothesis testing. Specifically, unlike more standard neuronal models which are handcrafted to reproduce a specific effect, when building an RNN the modeller only specifies the network inputs, labels, and the parameter constraints (e.g. Dale’s law) in advance. The dynamics of the RNN are entirely determined by optimisation. Post-training manipulations of the RNN are not built in, or in any way guaranteed to work, making them more analogous to experimental manipulations of an approximately task-optimal brain-like system. Confirmatory results are arguably, therefore, a first step towards an in vitro experimental test.”
(3) The mechanism of how increased gain leads to faster switches remains unclear to me. My first intuition was that increasing the gain of excitatory populations (the situation shown in Fig. 2E) in discrete attractor models would lead to deeper attractor wells and this would make it more difficult to switch. That is, a higher gain should lead to slower decisions in this case. However, here the switching time remains constant for a gain between 1 and 1.5. Lowering the gain, on the other hand, leads to slower switching. It is, of course, possible that the RNN behaves differently than classical point attractor models or that my intuition is incorrect (though I believe it is consistent with previous literature, e.g. Niyogi & Wong-Lin 2013 (doi:10.1371/journal.pcbi.1003099) who show higher firing rates - more stable attractors - for increased excitatory gain).
We thank the reviewer for the astute observation, which we entirely agree with. The energy landscape analysis is a method still under active development within our group and we are still learning how to best explain it and its relationship to more traditional ways of quantifying potential-like energy functions of dynamical systems which we think the reviewer has in mind. We have now included a second type of energy landscape analysis which gives a complementary perspective on the RNN dynamics and is more straightforwardly comparable to typical potential functions. We describe the new analysis in the section “Large-scale neural predictions of recurrent neural network model” as follows:
“Crucially, there are two complementary viewpoints from which we can construct an energy landscape; the first allocentric (i.e., third-person view) perspective quantifies the energy associated with each position in state space, whereas the second egocentric (i.e., first-person view) perspective quantifies the energy associated with relative changes independent of the direction of movement or the location in state space. The allocentric perspective is straightforwardly comparable to the potential function of a dynamical system but can only be applied to low-dimensional data in settings where a position-like quantity is meaningfully defined. The egocentric perspective is analogous to taking the point of view of a single particle in a physical setting and quantifying the energy associated with movement relative to the particle’s initial location. An egocentric framework is thus more applicable when signal magnitude is relative rather than absolute. See materials and methods, and Fig S4 for an intuitive explanation of the allocentric and egocentric energy landscape analysis on a toy dynamical system.”
From the allocentric perspective it is entirely true that increasing gain increases the depth of the landscape, equivalent to increasing the depth of the attractor. However, because the input to the network changes dynamically the location of the approximate fixed-point attractor changes and the network state “chases” this attractor over the course of the trial. Importantly, the location of the energy minima changes more rapidly as gain increases, effectively forcing the network to rapidly change course at the point of the perceptual switch (see Fig 4). To quantify this effect we constructed a new measure - neural work - which describes the amount of “force” exerted on the low-dimensional neural trajectory by the vector field quantified by the allocentric landscape. Specifically we treat the allocentric landscape as analogous to a potential function and then leverage the fact that force is equal to the negative gradient of potential energy to calculate the work (force x displacement) done on the low dimensional trajectory at each time point. This showed that as gain increases the amount of work done on the neuronal trajectory at turning points increases analogous to the application of an external force transiently increasing the kinetic energy of an object. From the perspective of the egocentric landscape this results in a flattening of the landscape as there is a lower energy (i.e. higher probability) assigned to large deviations in the neuronal trajectory around the perceptual switch.
Because of the novelty of the analyses we went to great lengths to carefully explain the methods in the updated manuscript. In addition we wrote a short tutorial style MATLAB script implementing both the allocentric and egocentric landscape analysis on a toy dynamical system with a known potential function (a supercritical pitchfork bifurcation).
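The neural work calculation described above (force as the negative gradient of a potential, multiplied by displacement) can be sketched in Python on the normal form of a supercritical pitchfork bifurcation. This is a minimal illustration under stated assumptions, not the authors' MATLAB tutorial: the potential is written down analytically rather than estimated from data, and the function names and parameter values are hypothetical.

```python
import numpy as np

def potential(x, r=1.0):
    """Potential of the pitchfork normal form dx/dt = r*x - x**3."""
    return -0.5 * r * x**2 + 0.25 * x**4

def force(x, r=1.0):
    """Force as the negative gradient of the potential: F = -dV/dx."""
    return r * x - x**3

def simulate(x0=0.1, r=1.0, dt=0.01, n_steps=2000):
    """Noise-free Euler integration of dx/dt = F(x)."""
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        x[t] = x[t - 1] + dt * force(x[t - 1], r)
    return x

def neural_work(x, r=1.0):
    """Work per time step: force at the current state times the displacement to the next state."""
    displacement = np.diff(x)
    return force(x[:-1], r) * displacement

trajectory = simulate()
work = neural_work(trajectory)
total_work = work.sum()
```

Because the toy trajectory simply follows the force field, the summed work approximates the drop in potential between the start and end of the trajectory, which gives a quick sanity check on the calculation.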
(4) From the RNN model it is not clear how changes in excitatory and inhibitory gain lead to slower/faster switching. In order to better understand the role of inhibitory and excitatory gain on switching, I would suggest studying a simple discrete attractor model (a rate model, for example as in Wong and Wang 2006 or Roxin and Ledberg, Plos Comp. Bio 2008) which will allow to study these effects in terms of a very few model parameters. The Roxin paper also shows how to map rate models onto simplified one-dimensional systems such as the one in Fig S3. Setting up the model using this framework would allow for making much stronger, principled statements about how gain changes affect the energy landscape, and under which conditions increased inhibitory gain leads to faster switching.
One possibility is that increasing the excitatory gain in the RNN leads to saturated firing rates. If this is the reason for the different effects of excitatory and inhibitory gain changes, it should be properly explained. Moreover, the biological relevance of this effect should be discussed (assuming that saturation is indeed the explanation).
We thank the reviewer for this excellent suggestion. After some consideration we decided that studying a reduced model would likely not do justice to the dynamical mechanisms of the RNN, especially after making gain dynamical rather than stationary. Still, we very much share the reviewer’s concern that we need a stronger link between the (now dynamical) gain alterations and energy landscape dynamics. To this end we now describe and interrogate the dynamics of the RNN at a circuit level through selectivity and lesion-based analyses, at a population level through analysis of the dynamical regime traversed by the network, and finally, through an extended energy landscape framework which has far stronger links to traditional potential-based descriptions of low-dimensional dynamical systems (see also our response to comment 3 above).
At a circuit level the speeding of perceptual switches is mediated by inhibition of the initially dominant population, which we describe in paragraphs 7 and 8 of the section “Computational evidence for neuromodulatory-mediated perceptual switches in a recurrent neural network” as follows:
“Having confirmed our hypothesis that increasing gain as a function of the network uncertainty increased the speed of perceptual switches, we next sought to understand the mechanisms governing this effect, starting with the circuit level and working our way up to the population level (c.f. Sherringtonian and Hopfieldian modes of analysis (66)). Because of the constraint that the input and output weights are strictly positive, we could use their (normalised) value as a measure of stimulus selectivity. Inspection of the firing rates sorted by input weights revealed that the networks had learned to complete the task by segregating both excitatory and inhibitory units into two stimulus-selective clusters (Fig 2C). As the inhibitory units could not contribute to the network’s readout, we hypothesised that they likely played an indirect role in perceptual switching by inhibiting the population of excitatory neurons selective for the currently dominant stimulus, allowing the competing population to take over and a perceptual switch to occur.
To test this hypothesis, we sorted the inhibitory units by the selectivity of the excitatory units they inhibit (i.e. by the normalised value of the readout weights). Inspecting the histogram of this selectivity metric revealed a bimodal distribution with peaks at each extreme strongly inhibiting a stimulus selective excitatory population at the exclusion of the other (Fig S2). Based on the fact that leading up to the perceptual switch point both the input and firing rate of the dominant population are higher than the competing population, we hypothesized that gain likely speeds perceptual switches by actively inhibiting the currently dominant population rather than exciting/disinhibiting the competing population. We predicted, therefore, that lesioning the inhibitory units selective for the stimulus that is initially dominant would dramatically slow perceptual switches, whilst lesioning the inhibitory units selective for the stimulus the input is morphing into would have a comparatively minor slowing effect on switch times since the population is not receiving sufficient input to take over until approximately half way through the trial irrespective of the inhibition it receives. As selectivity is not entirely one-to-one, we expect both lesions to slow perceptual switches but differ in magnitude. In line with our prediction, lesioning the inhibitory units strongly selective for the initially dominant population greatly slowed perceptual switches (Fig 3F upper), whereas lesioning the population selective for the stimulus the input morphs into removed the speeding effect of gain but had a comparatively small slowing effect on perceptual switches (Fig 3F lower).”
At the population level we characterised the dynamics of the 2D parameter space (defined by gain and the difference between the input dimensions) traversed by the network over the course of a trial as input and gain dynamically change. We describe this in paragraphs 9-14 of the section “Computational evidence for neuromodulatory-mediated perceptual switches in a recurrent neural network”, which we reprint below for the reviewer’s convenience:
“Based on the selectivity of the network firing rates we hypothesised that the dynamics were shaped by a fixed-point attractor whose location and existence were determined by gain and the difference between the input dimensions, and thus changed dynamically over the course of a single trial (67-70). Because of the large size of the network, we could not solve for the fixed points or study their stability analytically. Instead we opted for a numerical approach and characterised the dynamical regime (i.e. the location and existence of approximate fixed-point attractors) across all combinations of gain and input difference visited by the network. Specifically, for each combination of elements in the parameter space we ran 100 simulations with initial conditions (firing rates) drawn from a uniform distribution on [0,1], and let the dynamics run for 10 seconds of simulation time (10 times the length of the task - longer simulation times did not qualitatively change the results) without noise. As we were interested in the existence of fixed-point attractors rather than their precise location, at each time point we computed the difference in firing rate between successive time points across the network. For each simulation we computed both the proportion of trials that converged to a value below 10^-2, giving us a proxy for the presence of fixed points, and the time to convergence, giving us a measure of the “strength” of the attractor.
Across gain values when input had unambiguous values, the network rapidly converged across all initialisations (Fig 3A & 3C-H). When input became ambiguous, however, the dynamics acquired a decaying oscillation and did not converge within the time frame of the simulation. As gain increased, the range of values characterised by oscillatory dynamics broadened. Crucially, for sufficiently high values of gain, ambiguous values transitioned the network into a regime characterised by high amplitude inhibition-driven oscillations (Fig 3D & 3G). Each trial can, therefore, be characterised by a trajectory through this 2-dimensional parameter space, with dynamics shaped by the dynamical regimes of each location visited (Fig 3A-B).
When uncertainty has a small impact on gain the network has a trajectory through an initial regime characterised by the rapid convergence to a fixed point where the population representing the initial stimulus dominated whilst the other was silent (Fig 3C), an uncertain regime characterised by oscillations with all neurons partially activated (Fig 3D), and after passing through the oscillatory regime, the network once again enters a new fixed-point regime where the population representing the initial stimulus is now silent and the other is dominant (Fig 3E).
For high gain trials, the network again started and finished in states characterised by a rapid convergence to a fixed point representing the dominant input dimension (Fig 3F-H), but differed in how it transitioned between these states. Uncertain inputs now generated high amplitude oscillations with the network flip-flopping between active and silent states (Fig 3G). We hypothesised that, within the task, this has the effect of silencing the initially dominant population, and boosting the competing population. To test this we initialised each network with parameter values well inside the oscillatory regime (u = [0.5, 0.5], gain = 1.5) with initial conditions determined by the selectivity of each unit. Excitatory units selective for input dimension 1, as well as the associated inhibitory units projecting to this population, were fully activated, whilst the excitatory units selective for input dimension 2 and the associated inhibitory units were silenced. As we predicted, when initialised in this state the network dynamics displayed an out-of-phase oscillation where the initially dominant population was rapidly silenced and the competing population was boosted after a brief delay (219 ms, +/-114; Fig S3).”
From this we concluded that at a population level, heightened gain leading up to the perceptual switch speeds the switch by transiently pushing the dynamics into an unstable dynamical regime replacing the fixed-point attractor representing the input with an oscillatory regime that actively inhibits the currently dominant population and boosts the competing population before transitioning back into a regime with a stable (approximate) fixed-point attractor representing the new stimulus (Fig 3F-H & Fig S3).
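The numerical convergence procedure described in the reprinted passage (100 random initialisations per parameter combination, noise-free simulation for 10 seconds, a threshold of 10^-2 on the change in firing rate between successive time points) can be sketched as follows. The two-unit mutual-inhibition network, its weights, the time constant and the function name `convergence_stats` are hypothetical stand-ins for illustration only; they are not the trained RNN.

```python
import numpy as np

def convergence_stats(u, gain, W, n_inits=100, t_max=10.0, dt=0.01,
                      tau=0.1, tol=1e-2, seed=0):
    """Return (proportion of initialisations that converged, mean time to convergence)."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(t_max / dt))
    # Initial firing rates drawn uniformly from [0, 1], one row per initialisation.
    rates = rng.uniform(0.0, 1.0, size=(n_inits, W.shape[0]))
    last_big = np.zeros(n_inits, dtype=int)  # last step whose change was >= tol
    for k in range(1, n_steps + 1):
        # Toy rate dynamics: tau * dr/dt = -r + sigmoid(gain * (W r + u)), no noise.
        target = 1.0 / (1.0 + np.exp(-gain * (rates @ W.T + u)))
        new_rates = rates + (dt / tau) * (target - rates)
        change = np.max(np.abs(new_rates - rates), axis=1)
        last_big[change >= tol] = k
        rates = new_rates
    converged = last_big < n_steps  # change was below tol at the final step
    prop = converged.mean()
    mean_time = (last_big[converged] * dt).mean() if converged.any() else np.nan
    return prop, mean_time

# Hypothetical two-unit network with mutual inhibition and an unambiguous input.
W = np.array([[0.0, -1.0],
              [-1.0, 0.0]])
prop, mean_time = convergence_stats(u=np.array([1.0, 0.0]), gain=1.0, W=W)
```

Sweeping `gain` and the input difference over a grid and storing `prop` and `mean_time` for each cell would give a map of the kind described in the passage, with the caveat that this toy network need not show the oscillatory regime reported for the trained RNN.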
As we describe in our response to comment 3 above, our extended energy-landscape analysis framework now includes an explicit link between the potential of the dynamical system and the allocentric landscape, whilst also explaining how a transient deepening of the allocentric landscape (which can essentially be thought of as analogous to a traditional potential function) relates to the flattening of the egocentric landscape.
Finally, whilst we appreciate the interest in further characterising the effect of inhibitory gain compared with excitatory gain, the topic is largely orthogonal to the aims of our paper, so we have removed the discussion of inhibitory vs excitatory gain. Still, we understand that we need to do our due diligence and check that our results do not break down when we manipulate either inhibitory or excitatory gain in isolation. To this end we checked that dynamical gain still speeded perceptual switches when the effect was restricted to either inhibitory or excitatory cells. We show the behavioural plots below for the reviewer’s interest.
Author response image 1.
Switch time as a function of uncertainty forcing
Alternative mechanisms:
It is mentioned in the introduction that changes in attention could drive perceptual switches. A priori, attention signals originating in the frontal cortex may be plausible mechanisms for perceptual switches, as an alternative to LC-controlled gain modulation. Does the observed fMRI dynamics allow us to distinguish these two hypotheses? In any case, I would suggest including alternative scenarios that may be compatible with the observed findings in the discussion.
We agree with the reviewer, in that attention is itself a confound and a process that is challenging to disentangle from the perceptual switching process in the current task. Importantly, we were not arguing for exclusivity in our manuscript, but merely testing the veracity of the hypothesis that the ascending arousal system may play a causal role in mediating and/or speeding perceptual switches. Future work with experiments that more specifically aim to dissociate these different features will be required to tease apart these different possibilities.
Reviewer #2 (Public Review):
Strengths
- the study combines different methods (pupillometry, RNNs, fMRI).
- the study combines different viewpoints and fields of the scientific literature, including neuroscience, psychology, physics, dynamical systems.
- This combination of methods and viewpoints is rarely done, it is thus very useful.
- Overall well-written.
Weaknesses
- The study relies on a report paradigm: participants report when they identify a switch in the item category. The sequence corresponds to the drawing of an object being gradually morphed into another object. Perceptual switches are therefore behaviorally relevant, and it is not clear whether the effect reported corresponds to the perceptual switch per se, or the detection of an event that should change behavior (participants press a button indicating the perceived category, and thus switch buttons when they identify a perceptual change). The text mentions that motor actions are controlled for, but this fact only indicates that a motor action is performed on each trial (not only on the switch trial); there is still a motor change confounded with the switch. As a result, it is not clear whether the effect reported in pupil size, brain dynamics, and brain states is related to a perceptual change, or a decision process (to report this change).
We agree with the reviewer that the coupling of the motor change with the perceptual switch is confounded to some degree, but since motor preparation occurs on every trial we suspect that it is more accurate to describe it as confounded with task-relevance more than motor preparation per se. While it is possible that pupil diameter, network topology and energy landscape features are all related to motor change rather than the perceptual switch, we note that the weight of evidence is against this interpretation, given the simple mechanistic explanation created by the coupling of perceptual uncertainty to network gain.
- The study presents events that co-occur (perceptual switch, change in pupil size, energy landscape of brain dynamics) but we cannot identify the causes and consequences. Yet, the paper makes several claims about causality (e.g. in the abstract "neuromodulatory tone ... causally mediates perceptual switches", in the results "the system flattening the energy landscape ... facilitated an updating of the content of perception").
We have made an effort to soften the causal language, where appropriate. In addition, we note that we have changed the title to “Gain neuromodulation mediates task-relevant perceptual switches: evidence from pupillometry, fMRI, and RNN Modelling” to reflect the fact that our claims do not extend to cases of perceptual switches where the stimulus is only passively observed.
- Some effects may reflect the expectation of a perceptual switch, rather than the perceptual switch per se. Given the structure of the task, participants know that there will be a perceptual switch occurring once during a sequence of morphed drawings. This change is expected to occur roughly in the middle of the sequence, making early switches more surprising, and later switches less surprising. Differences in pupil response to early, medium, and late switches could reflect this expectation. The authors interpret this effect very differently ("the speed of a perceptual switch should be dependent on LC activity").
The task includes catch trials designed to reduce the expectation of a perceptual switch. In these trials, a perceptual switch occurs either earlier or later than usual. While these trials are valuable for mitigating predictability, we did not focus extensively on them, as they were thoroughly discussed in the original paper. Additionally, due to the limited number of catch trials, it is difficult—if not impossible—to calculate a reliable mean surprise per image set.
It is also worth noting that the pupil study does not include catch trials, which could contribute to differences in how perceptual switches are processed and interpreted between the fMRI and pupil experiments.
- The RNN is far more complex than needed for the task. It has two input units that indicate the level of evidence for the two categories being morphed, and it is trained to output the dominant category. A (non-recurrent) network with only these two units and an output unit whose activity is a sigmoid transform of the difference in the inputs can solve the task perfectly. The RNN activity is almost 1-dimensional probably for this reason. In addition, the difficult part of the computation done by the human brain in this task is already solved in the input that is provided to the network (the brain is not provided with the evidence level for each category, and in fact, it does not know in advance what the second category will be).
We agree that a simpler model could perform the task. We opted to use an RNN rather than hand craft a simpler model as we wanted to use the model as both a method of hypothesis testing and hypothesis generation. We now expand on and justify this modelling choice in the second paragraph of the discussion (also see our response to Reviewer 1 comment 4):
“We chose to use an RNN, instead of a simpler (more transparent) model, as we wanted to use the RNN as a means of both hypothesis generation and hypothesis testing. Specifically, unlike more standard neuronal models, which are handcrafted to reproduce a specific effect, when building an RNN the modeller only specifies the network inputs, labels, and the parameter constraints (e.g. Dale’s law) in advance. The dynamics of the RNN are entirely determined by optimisation. Post-training manipulations of the RNN are not built in, or in any way guaranteed to work, making them more analogous to experimental manipulations of an approximately task-optimal brain-like system. Confirmatory results are arguably, therefore, a first step towards an in vitro experimental test.”
In other words, a simpler model would not have been appropriate to the aims. In addition we note that low dimensional dynamics are extremely common in the RNN literature and are in no way unique to our model.
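For concreteness, the kind of hand-crafted solution that both the reviewer and the response agree would solve the task can be sketched as below. It assumes the input scheme described in the review (u2 = 1 - u1, with u1 decreasing from 1 to 0); the function `trivial_choice`, its gain value and the 101-step morph are hypothetical choices made only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def trivial_choice(u1, u2, gain=10.0):
    """Two sigmoidal units driven by the signed input difference, with winner-take-all readout."""
    diff = u1 - u2
    units = np.array([sigmoid(gain * diff), sigmoid(-gain * diff)])
    return int(np.argmax(units))

# Morph trial: u1 falls linearly from 1 to 0 while u2 = 1 - u1 rises.
u1 = np.linspace(1.0, 0.0, 101)
choices = np.array([trivial_choice(a, 1.0 - a) for a in u1])
switch_index = int(np.argmax(choices == 1))  # first time point reporting category 2
```

This network has no internal dynamics, so its switch point is fixed by the inputs alone and changing its gain leaves the switch time unchanged; that is why it cannot be used to ask how gain modulation alters switching.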
- Basic fMRI results are missing and would be useful, before using elaborate analyses. For instance, what are the regions that are more active when a switch is detected?
We explicitly chose to not run a standard voxelwise statistical parametric approach on these data, as the results were reported extensively in the original study (Stottinger et al., 2018).
- The use of methods from physics may obscure some simple facts and simpler explanations. For instance, does the flatter energy landscape in the higher gain condition reflect a smaller number of states visited in the state space of the RNN because the activity of each unit gets in the saturation range? If correct, then it may be a more straightforward way of explaining the results.
We appreciate the reviewer's concern as this would indeed be a problem. However, this is not the case for our network. At the time point of the perceptual switch, where the egocentric landscape dynamics are at their flattest, the RNN firing rates are approximately 50% activated, nowhere near the saturation point. In addition, a flatter landscape in the egocentric and allocentric landscape analyses only occurs - mathematically speaking - when more states are visited, not fewer.
In addition, we note that we are very sympathetic to the complexity of our physics based analyses and have gone to great lengths to describe them in an accessible manner in both the main text and methods. We have also included tutorial style code demonstrating how the analysis can be used on a toy dynamical system in the supplementary material.
- Some results are not as expected as the authors claim, at least in the current form of the paper. For instance, they show that, when trained to identify which of two inputs u1 and u2 is the largest (with u2=1-u1, starting with u1=1 and gradually decreasing u1), a higher gain results in the RNN reporting a switch in dominance before the true switch (e.g. when u1=0.6 and u2=0.4), and vice versa with a lower gain. In other words, it seems to correspond to a change in criterion or bias in the RNN's decision. The authors should discuss more specifically how this result is related to previous studies and models on gain modulation. An alternative finding could have been that the network output is a more (or less) deterministic function of its inputs, but this aspect is not reported.
We appreciate this comment but it is simply not applicable to our network. There is no criterion in the RNN. We could certainly add one, but this would be a significant departure from how decisions are typically modelled in RNNs. The (deterministic) readout is the max of the projection of the (instantaneous) excitatory firing rate onto the readout weights. A shift in criterion would imply that the dynamics are unaffected and that the effect can be explained by a shift in the readout weights; this cannot be the case because the readout weights are stationary, and the change occurs at the level of the activation function.
We are aware that there is a large literature in decision making and psychophysics that uses the term gain in a slightly different way. Here we are strictly referring to the gain of the activation function. Although we agree that it would be interesting and important to discuss the differing uses of the term gain, this is beyond the scope of the present paper.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
This study reports that spatial frequency representation can predict category coding in the inferior temporal cortex.
Thank you for taking the time to review our manuscript. We greatly appreciate your valuable feedback and constructive comments, which have been instrumental in improving the quality and clarity of our work.
The original conclusion was based on likely problematic stimulus timing (33 ms which was too brief). Now the authors claim that they also have a different set of data on the basis of longer stimulus duration (200 ms).
One big issue in the original report was that the experiments used a stimulus duration that was too brief and could have weakened the effects of high spatial frequencies and confounded the conclusions. Now the authors provided a new set of data on the basis of a longer stimulus duration and made the claim that the conclusions are unchanged. These new data and the data in the original report were collected at the same time as the authors report.
The authors may provide an explanation why they performed the same experiments using two stimulus durations and only reported one data set with the brief duration. They may also explain why they opted not to mention in the original report the existence of another data set with a different stimulus duration, which would otherwise have certainly strengthened their main conclusions.
Thank you for your comments regarding the stimulus duration used in our experiments. We appreciate the opportunity to clarify and provide further details on our methodology and decisions.
In our original report, we focused on the early phase of the neuronal response, which is less affected by the duration of the stimulus. Observations from our data showed that certain neurons exhibited high firing rates even with the brief 33 ms stimulus duration, and the results we obtained were consistent across different durations. To avoid redundancy, we initially chose not to include the results from the 200 ms stimulus duration, as they reiterated the findings of the 33 ms duration.
However, we acknowledge that the brief stimulus duration could raise concerns regarding the robustness of our conclusions, particularly concerning the effects of high spatial frequencies. Upon reflecting on the reviewer’s comments during the first revision, we recognized the importance of addressing these potential concerns directly. Therefore, we have included the data from the 200 ms stimulus duration in our revised manuscript.
Furthermore, our team is actively investigating the differences between fast (33 ms) and slow (200 ms) presentations in terms of SF processing. Our preliminary observations suggest similar processing of HSF in the early phase of the response for both fast and slow presentations, but different processing of HSF in the late phase. This was another reason we initially opted to publish the results from the brief stimulus duration separately, as we intended to explore the different aspects of SF processing in fast and slow presentations in subsequent studies.
I suggest the authors upload both data sets and analyzing codes, so that the claim could be easily examined by interested readers.
Thank you for your suggestion to make both data sets and the analyzing codes available for examination by interested readers.
We have created a repository that includes a sample of the dataset along with the necessary codes to output the main results. While we cannot provide the entire dataset at this time due to ongoing investigations by our team, we are committed to ensuring transparency and reproducibility. The data and code samples we have provided should enable interested readers to verify our claims and understand our analysis process.
Repository: https://github.com/ramintoosi/spatial-frequency-selectivity
Reviewer #2 (Public Review):
Summary:
This paper aimed to examine the spatial frequency selectivity of macaque inferotemporal (IT) neurons and its relation to category selectivity. The authors suggest in the present study that some IT neurons show a sensitivity for the spatial frequency of scrambled images. Their report suggests a shift in preferred spatial frequency during the response, from low to high spatial frequencies. This agrees with a coarse-to-fine processing strategy, which is in line with multiple studies in the early visual cortex. In addition, they report that the selectivity for faces and objects, relative to scrambled stimuli, depends on the spatial frequency tuning of the neurons.
Strengths:
Previous studies using human fMRI and psychophysics studied the contribution of different spatial frequency bands to object recognition, but as pointed out by the authors little is known about the spatial frequency selectivity of single IT neurons. This study addresses this gap and shows spatial frequency selectivity in IT for scrambled stimuli that drive the neurons poorly. They related this weak spatial frequency selectivity to category selectivity, but these findings are premature given the low number of stimuli they employed to assess category selectivity.
Thank you for your thorough review and insightful feedback on our manuscript. We greatly appreciate your time and effort in providing valuable comments and suggestions, which have significantly contributed to enhancing the quality of our work.
The authors revised their manuscript and provided some clarifications regarding their experimental design and data analysis. They responded to most of my comments but I find that some issues were not fully or poorly addressed. The new data they provided confirmed my concern about low responses to their scrambled stimuli. Thus, this paper shows spatial frequency selectivity in IT for scrambled stimuli that drive the neurons poorly (see main comments below). They related this (weak) spatial frequency selectivity to category selectivity, but these findings are premature given the low number of stimuli to assess category selectivity.
While we acknowledge that the number of instances per condition is relatively low, the overall dataset is substantial. Specifically, our study includes a total of 180 stimuli (6 spatial frequencies × 2 scrambled/non-scrambled conditions × 15 instances, including 9 fixed and 6 non-fixed) and 5400 trials (180 stimuli × 2 durations × 15 repetitions). Conducting these trials requires approximately one hour of experimental time per session.
Extending the number of stimuli, while potentially addressing this limitation, would significantly compromise the quality of the experiment by increasing the duration and introducing potential fatigue effects in the subjects. Despite this limitation, our findings lay important groundwork by offering novel insights into object recognition through the lens of spatial frequency. We believe this work can serve as a foundation for future experiments designed to further explore and validate these theories with expanded stimulus sets.
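As a quick check of the counts quoted above, the arithmetic can be reproduced in a few lines. This is only an illustrative sketch; the variable names are ours and are not taken from the manuscript.

```python
# Stimulus and trial counts as stated in the response above.
n_spatial_frequencies = 6
n_conditions = 2      # scrambled / non-scrambled
n_instances = 15      # 9 fixed + 6 non-fixed
n_durations = 2
n_repetitions = 15

n_stimuli = n_spatial_frequencies * n_conditions * n_instances
n_trials = n_stimuli * n_durations * n_repetitions

print(n_stimuli, n_trials)  # prints "180 5400"
```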
Main points.
(1) They have provided now the responses of their neurons in spikes/s and present a distribution of the raw responses in a new Figure. These data suggest that their scrambled stimuli were driving the neurons rather poorly and thus it is unclear how well their findings will generalize to more effective stimuli. Indeed, the mean net firing rate to their scrambled stimuli was very low: about 3 spikes/s. How much can one conclude when the stimuli are driving the recorded neurons that poorly? Also, the new Figure 2- Appendix 1 shows that the mean modulation by spatial frequency is about 2 spikes/s, which is a rather small modulation. Thus, the spatial frequency selectivity the authors describe in this paper is rather small compared to the stimulus selectivity one typically observes in IT (stimulus-driven modulations can be at least 20 spikes/s).
To address the concerns regarding the firing rates and the modulation of neuronal responses by spatial frequency (SF), we emphasize several key points:
(1) Significance of Firing Rate Differences: While it is true that the mean net firing rate to our scrambled stimuli was relatively low, the firing rate differences observed were statistically significant, with p-values approximately at 1e-5. This indicates that despite the low firing rates, the observed differences are reliable and unlikely to have occurred by chance.
(2) Classification Rate and Modulation by SF: Our analysis showed that the differences among SF responses yielded a classification rate of 44.68%, which is 24.68 percentage points above the chance level. This substantial increase over chance demonstrates that SF significantly modulates IT responses, even if the overall firing rates are modest.
(3) Effect Size and SF Modulation: While the effect size in terms of firing rate differences may be small, it is significant. The significant modulation of IT responses by SF, as evidenced by our statistical analyses and classification rate, supports our conclusions regarding the role of SF in driving IT responses.
(4) Expectations for Noise-like Pure SF Stimuli: We acknowledge that IT responses are typically higher for various object stimuli. Given the nature of our pure SF stimuli, which resemble noise-like patterns, we did not anticipate high responses in terms of spikes per second. The low firing rates are consistent with the expectation for such stimuli and do not undermine the significance of the observed modulation by SF.
We believe that these points collectively support the validity of our findings and the significance of SF modulation in IT responses, despite the low firing rates. We appreciate your insights and hope this clarifies our stance on the data and its implications.
We added the following description to the Appendix 1 - “Strength of SF selectivity” section:
“While the firing rates and net responses to scrambled stimuli were modest (e.g., 2.9 Hz in T1), the differences across spatial frequency (SF) bands were statistically significant (p ≈ 1e-5) and led to a classification accuracy 24.68% above chance. This demonstrates the robustness of SF modulation in IT neurons despite low firing rates. The modest responses align with expectations for noise-like stimuli, which are less effective in driving IT neurons, yet the observed SF selectivity highlights a fundamental property of IT encoding.”
(2) Their new Figure 2-Appendix 1 does not show net firing rates (baseline-subtracted; as I requested) and thus is not very informative. Please provide distributions of net responses so that the readers can evaluate the responses to the stimuli of the recorded neurons.
We understand the reviewer’s concern about the presentation of net firing rates. In T2 (the late time interval), the average response rate falls below the baseline, resulting in negative net firing rates, which might confuse readers. To address this, we have added the net responses to the text for clarity. Additionally, we have included the average baseline response in the figure to provide a more comprehensive view of the data.
“To check the SF response strength, the histogram of IT neuron responses to scrambled, face, and non-face stimuli is illustrated in this figure. A Gamma distribution is also fitted to each histogram. To calculate the histogram, the neuron response to each unique stimulus is calculated for each neuron in spikes/second (Hz). In the early phase, T1, the average firing rate to scrambled stimuli is 26.3 Hz, which is significantly higher than the response in the −50 to 50 ms window, which is 23.4 Hz. In comparison, the mean response to intact face stimuli is 30.5 Hz, while non-face stimuli elicit an average response of 28.8 Hz. The average net responses to the scrambled, face, and non-face stimuli are 2.9 Hz, 7.1 Hz, and 5.4 Hz, respectively. Moving to the late phase, T2, the responses to scrambled, face, and object stimuli are 19.5 Hz, 19.4 Hz, and 22.4 Hz, respectively. The corresponding average net responses are 3.9 Hz, 4.0 Hz, and 1.0 Hz below the baseline response.”
(3) The poor responses might be due to the short stimulus duration. The authors report now new data using a 200 ms duration which supported their classification and latency data obtained with their brief duration. It would be very informative if the authors could also provide the mean net responses for the 200 ms durations to their stimuli. Were these responses as low as those for the brief duration? If so, the concern of generalization to effective stimuli that drive IT neurons well remains.
The firing rates for the 200 ms stimulus duration are as follows: 27.7 Hz, 30.7 Hz, and 30.4 Hz for scrambled, face, and object stimuli in T1, respectively; and 26.2 Hz, 29.1 Hz, and 33.9 Hz in T2. The average baseline firing rate (−50 to 50 ms) is 23.4 Hz. Therefore, the net responses are 4.3 Hz, 7.3 Hz, and 7.0 Hz for T1; and 2.8 Hz, 5.7 Hz, and 10.5 Hz for T2 for scrambled, face, and object stimuli, respectively.
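The baseline subtraction behind these net responses can be written out explicitly. The sketch below simply recomputes the reported values from the reported firing rates and baseline; the function and variable names are illustrative and do not come from the authors' analysis code.

```python
BASELINE_HZ = 23.4  # mean firing rate in the -50 to 50 ms window, as reported

# Firing rates (Hz) for the 200 ms stimulus duration, as reported.
firing_rates_hz = {
    "T1": {"scrambled": 27.7, "face": 30.7, "object": 30.4},
    "T2": {"scrambled": 26.2, "face": 29.1, "object": 33.9},
}


def net_responses(rates, baseline=BASELINE_HZ):
    """Subtract the baseline from each rate and round to 0.1 Hz."""
    return {
        interval: {stim: round(rate - baseline, 1) for stim, rate in by_stim.items()}
        for interval, by_stim in rates.items()
    }


net = net_responses(firing_rates_hz)
print(net["T1"])  # {'scrambled': 4.3, 'face': 7.3, 'object': 7.0}
print(net["T2"])  # {'scrambled': 2.8, 'face': 5.7, 'object': 10.5}
```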
Notably, the impact of stimulus duration is more pronounced in T2, consistent with T2 being the later time interval, whereas the firing rates in T1 do not change substantially with the longer duration. As we discussed in our response to the first comment, it is important to note that high net responses are not typically expected for scrambled or noise-like stimuli in IT neurons. Instead, the key findings of this study lie in the statistical significance of these responses and their meaningful relationship to category selectivity. These results highlight the broader implications for understanding the role of spatial frequency in object recognition.
We added the firing rates to the “Extended stimulus duration supports LSF-preferred tuning” part of Appendix 1 as follows.
“For the 200 ms stimulus duration, the firing rates were 27.7 Hz, 30.7 Hz, and 30.4 Hz for scrambled, face, and object stimuli in T1, respectively, and 26.2 Hz, 29.1 Hz, and 33.9 Hz in T2. The corresponding net responses were 4.3 Hz, 7.3 Hz, and 7.0 Hz in T1, and 2.8 Hz, 5.7 Hz, and 10.5 Hz in T2. While the longer stimulus duration did not substantially increase firing rates in T1, its impact was more pronounced in T2.”
(4) I still do not understand why the analyses of Figures 3 and 4 provide different outcomes on the relationship between spatial frequency and category selectivity. I believe they refer to this finding in the Discussion: "Our results show a direct relationship between the population's category coding capability and the SF coding capability of individual neurons. While we observed a relation between SF and category coding, we have found uncorrelated representations. Unlike category coding, SF relies more on sparse, individual neuron representations.". I believe more clarification is necessary regarding the analyses of Figures 3 and 4, and why they can show different outcomes.
Figure 3 explores the relationship between SF coding and category coding at both the single-neuron and population levels.
● Figures 3(a) and 3(b) examine the relationship between a single neuron’s response pattern and object decoding in the population.
● Figure 3(c) investigates the relationship between a single neuron’s SF decoding capabilities and object decoding in the population.
● Figure 3(d) assesses the relationship between a single neuron’s object decoding capabilities and SF decoding in the population.
In summary, Figure 3 demonstrates a relation between SF coding/response pattern at the single-neuron level and category coding at the population level.
Figure 4, on the other hand, addresses the uncorrelated nature of SF and category coding.
● Figure 4(a) shows the uncorrelated relation between a single neuron’s SF decoding capability and its object decoding capability. This suggests that a neuron's ability to decode SF does not predict its ability to decode object categories.
● Figure 4(b) illustrates that the contribution of a neuron to the population decoding of SF is uncorrelated with its contribution to the population decoding of object categories. This further supports the idea that the mechanisms behind SF coding and object coding are uncorrelated.
In summary, Figure 4 suggests that while there is a relation between SF coding and category coding as illustrated in Figure 3, the mechanisms underlying SF coding and object coding operate independently (in terms of correlation), highlighting the distinct nature of these processes.
We hope this explanation clarifies why the analyses in Figures 3 and 4 present different outcomes. Figure 3 provides insight into the relationship between SF and category coding, while Figure 4 emphasizes the uncorrelated nature of these processes.
Following your comment, and to clarify the presentation of the work, we added the following description to the “Uncorrelated mechanisms for SF and category coding” section:
“Figures 3 and 4 examine different aspects of the relationship between SF and category coding. Figure 3 highlights a relationship between SF coding at the single-neuron level and category coding at the population level. Conversely, Figure 4 demonstrates the uncorrelated mechanisms underlying SF and category coding, showing that a neuron’s ability to decode SF is not predictive of its ability to decode object categories. This distinction underscores that while SF and category coding are related at broader levels, their underlying mechanisms are independent, emphasizing the distinct processes driving each form of coding.”
(5) The authors found a higher separability for faces (versus scrambled patterns) for neurons preferring high spatial frequencies. This is consistent for the two monkeys but we are dealing here with a small amount of neurons. Only 6% of their neurons (16 neurons) belonged to this high spatial frequency group when pooling the two monkeys. Thus, although both monkeys show this effect I wonder how robust it is given the small number of neurons per monkey that belong to this spatial frequency profile. Furthermore, the higher separability for faces for the low-frequency profiles is not consistent across monkeys which should be pointed out.
We appreciate the reviewer’s concern regarding the relatively small number of neurons in the high spatial frequency group (16 neurons, 6% of the total sample across the two monkeys) and the consistency of the results. While we acknowledge this limitation, it is important to note that findings involving sparse subsets of neurons can still be meaningful. For example, Dalgleish et al. (2020) demonstrated that perception can arise from the activity of as few as ~14 neurons in the mouse cortex, supporting the sparse coding hypothesis. This underscores the potential robustness of results derived from small neuronal populations when the activity is statistically significant and functionally relevant.
Regarding the higher separability for faces among neurons preferring high spatial frequencies, the consistency of this finding across both monkeys suggests that this effect is robust within this subgroup. For neurons preferring low spatial frequencies, we agree that the lack of consistency across monkeys should be explicitly noted. These differences may reflect individual variability or differences in sampling across subjects and merit further investigation in future studies.
To address this concern, we have updated the text to explicitly discuss the small size of the high spatial frequency group, its implications, and the observed inconsistency in the low spatial frequency profiles between monkeys. We have added the following description to the discussion.
“Next, according to Figure 3(a), 6% of the neurons are HSF-preferred and their firing rate in HSF is comparable to the LSF firing rate in the LSF-preferred group. This analysis is carried out in the early phase of the response (70-170ms). While most of the neurons prefer LSF, this observation shows that there is an HSF input that excites a small group of neurons. Importantly, findings involving small neuronal populations can still be meaningful, as studies like Dalgleish et al. (2020) have demonstrated that perception can arise from the activity of as few as ~14 neurons in the mouse cortex, emphasizing the robustness of sparse coding.”
Regarding the separability of faces for the low-frequency profiles, we added the following to the appendix:
“For neurons preferring LSF, LP profile, it is important to note the lack of consistency in responses across monkeys. This variability may reflect individual differences in neural processing or variations in sampling between subjects.”
And in the discussion:
“Our results are based on grouping the neurons of the two monkeys; however, the results remain consistent when looking at the data from individual monkeys as illustrated in Appendix 2. However, for neurons preferring LSF, we observed inconsistency across monkeys, which may reflect individual differences or sampling variability. These findings highlight the complexity of SF processing in the IT cortex and suggest the need for further research to explore these variations.”
* Henry WP Dalgleish, Lloyd E Russell, Adam M Packer, Arnd Roth, Oliver M Gauld, Francesca Greenstreet, Emmett J Thompson, Michael Häusser (2020) How many neurons are sufficient for perception of cortical activity? eLife 9:e58889.
(6) I agree that CNNs are useful models for ventral stream processing but that is not relevant to the point I was making before regarding the comparison of the classification scores between neurons and the model. Because the number of features and trial-to-trial variability differs between neural nets and neurons, the classification scores are difficult to compare. One can compare the trends but not the raw classification scores between CNN and neurons without equating these variables.
We appreciate the reviewer’s follow-up comment and agree that differences in the number of features and trial-to-trial variability between IT neurons and CNN units make direct comparisons of raw classification scores challenging. As the reviewer suggests, it is more appropriate to focus on comparing trends rather than absolute scores when analyzing the similarities and differences between these systems. In light of this, we have revised the text to clarify that our intention was not to equate raw classification scores but to highlight the qualitative patterns and trends observed in spatial frequency encoding between IT and CNN units.
“SF representation in the artificial neural networks
We conducted a thorough analysis to compare our findings with CNNs. To assess the SF coding capabilities and trends of CNNs, we utilized popular architectures, including ResNet18, ResNet34, VGG11, VGG16, InceptionV3, EfficientNetb0, CORNet-S, CORNet-RT, and CORNet-Z, with both ImageNet pre-trained and randomly initialized weights. Employing feature maps from the last four layers of each CNN, we trained an LDA model to classify the SF content of input images. Figure 5(a) shows the SF decoding accuracy of the CNNs on our dataset (SF decoding accuracy with random (R) and pre-trained (P) weights, ResNet18: P=0.96±0.01 / R=0.94±0.01, ResNet34: P=0.95±0.01 / R=0.86±0.01, VGG11: P=0.94±0.01 / R=0.93±0.01, VGG16: P=0.92±0.02 / R=0.90±0.02, InceptionV3: P=0.89±0.01 / R=0.67±0.03, EfficientNetb0: P=0.94±0.01 / R=0.30±0.01, CORNet-S: P=0.77±0.02 / R=0.36±0.02, CORNet-RT: P=0.31±0.02 / R=0.33±0.02, and CORNet-Z: P=0.94±0.01 / R=0.97±0.01). Except for CORNet-Z, object recognition training increases the network's capacity for SF coding, with an improvement as large as 64% in EfficientNetb0. Furthermore, except for the CORNet family, LSF content exhibits higher recall values than HSF content, as observed in the IT cortex (p-value with random (R) and pre-trained (P) weights, ResNet18: P=0.39 / R=0.06, ResNet34: P=0.01 / R=0.01, VGG11: P=0.13 / R=0.07, VGG16: P=0.03 / R=0.05, InceptionV3: P<0.001 / R=0.05, EfficientNetb0: P=0.07 / R=0.01). The recall values of CORNet-Z and ResNet18 are illustrated in Figure 5(b). However, while the CNNs exhibited some similarities in SF representation with the IT cortex, they did not replicate the SF-based profiles that predict neuron category selectivity. As depicted in Figure 5(c), although neurons formed similar profiles, these profiles were not associated with the category decoding performances of the neurons sharing the same profile.”
Discussion:
“Finally, we compared SF's representation trends and findings within the IT cortex and the current state-of-the-art networks in deep neural networks.”
Recommendations for the authors:
Reviewer #2 (Recommendations For The Authors):
The mean baseline firing rate of their neurons (23.4 Hz) was rather high for single IT neurons (typically around 10 spikes/s or lower). Were these well-isolated units or mainly multiunit activity?
We confirm that the recordings in our study comprised both well-isolated single units and multi-unit activity (the activity remaining after single-neuron isolation), sorted with our spike-sorting toolbox. The higher baseline firing rate is likely due to the experimental design, particularly the inclusion of the responsive neurons from the selectivity phase. We added the following statement to the methods section.
“In our analysis, we utilized both well-isolated single units and multi-unit activities (which represent neural activities that could not be further sorted into single units), ensuring a comprehensive representation of neural responses across the recorded population.”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
We thank the reviewers for their thoughtful comments.
Based on their suggestions we will:
(1) Use more accurate language to describe the hypothalamus regions under investigation in this study. While we aimed to primarily investigate the medial preoptic area (MPOA), our dissections and sequencing data in fact capture several regions of the anterior hypothalamus, including the anteroventral periventricular (AVPV), paraventricular (PVN), supraoptic (SON), and suprachiasmatic (SCN) nuclei, and more. We will revise the language in our manuscript to reflect that our study investigates the cellular evolution of the anterior hypothalamus across behaviorally divergent deer mice.
(2) Revise our language to clarify that while our study provides a rich dataset for generating hypotheses about which cell types may contribute to behavioral differences, it does not provide any evidence of causal relationships. We hope to investigate this further in future work.
(3) Clarify specific methodological choices for which reviewers had questions, especially about the hypothalamic regions for which we did histology to validate cell abundance differences and methodological choices related to mapping our cell clusters to Mus cell types.
Our responses to each reviewer’s specific comments are below.
Reviewer #1:
The major limitation of the study is the absence of causal experiments linking the observed changes in MPOA cell types to species-specific social behaviors. While the study provides valuable correlational data, it lacks functional experiments that would demonstrate a direct relationship between the neuronal differences and behavior. For instance, manipulating these cell types or gene expressions in vivo and observing their effects on behavior would have strengthened the conclusions, although I certainly appreciate the difficulty in this, especially in non-musculus mice. Without such experiments, the study remains speculative about how these neuronal differences contribute to the evolution of social behaviors.
Yes, we agree the study lacks functional experiments. We hope that the dataset is of value for generating hypotheses about how hypothalamic neuronal cell types may govern species-specific social behaviors, and for these hypotheses to be functionally tested by us and others in future work.
Reviewer #2:
Some methodology could be further explained, like the decision of a 15% cutoff value for cell type assignment per cluster, or the necessity of a multi-step analysis pipeline for gene enrichment studies.
A 15% cutoff value for cell type assignment was chosen to include all known homology correspondences between our dataset and the Mus atlas. For example, i14:Avp/Cck cells from the Mus atlas represent Avp cells from the suprachiasmatic nuclei (SCN). Though only 17.3% of cluster 15 maps to i14:Avp/Cck, we know these two clusters correspond based on the expression of Avp and additional SCN marker genes in cluster 15 (Supp Fig 6). We will further explain this cutoff in the revised manuscript.
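To make the cutoff rule concrete, here is a minimal sketch of how such an assignment could work. It assumes each cluster is assigned to its single best-matching Mus type when that type receives at least 15% of the cluster's cells; the function name, the data layout, and every number other than cluster 15's 17.3% are hypothetical and are not taken from our pipeline.

```python
CUTOFF = 0.15  # minimum fraction of a cluster's cells mapping to one Mus type


def assign_homology(mapping_fractions, cutoff=CUTOFF):
    """Return the best-matching Mus type per cluster, or None below the cutoff."""
    assignments = {}
    for cluster, fractions in mapping_fractions.items():
        best_type = max(fractions, key=fractions.get)
        assignments[cluster] = best_type if fractions[best_type] >= cutoff else None
    return assignments


# Cluster 15's 17.3% is from the response; all other entries are made up.
example = {
    "cluster_15": {"i14:Avp/Cck": 0.173, "hypothetical_type_a": 0.09},
    "cluster_x": {"hypothetical_type_a": 0.08, "hypothetical_type_b": 0.06},
}
assignments = assign_homology(example)
print(assignments)
```

Under this rule cluster 15 is retained, whereas a stricter cutoff such as 20% would have dropped a correspondence that the marker genes support.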
Our gene enrichment study includes a multi-step analysis pipeline because we wanted to control for confounders that may be introduced because of gene expression level. Genes that are more highly expressed are more accurately quantified and thus more likely to be identified as differentially expressed. Therefore, we wanted to test for gene enrichments in our set of DE genes against a background of genes with similar expression levels. We will clarify this motivation in the revised manuscript.
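One way to build such an expression-matched background is sketched below. This is an assumption-laden illustration rather than our actual pipeline: it bins genes into expression quantiles and draws, for each DE gene, one non-DE gene from the same bin. The function names, the choice of ten bins, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def expression_bins(expression, n_bins=10):
    """Assign each gene to a quantile bin by its mean expression."""
    ranked = sorted(expression, key=expression.get)
    return {gene: i * n_bins // len(ranked) for i, gene in enumerate(ranked)}


def matched_background(de_genes, expression, n_bins=10, seed=0):
    """Sample one non-DE gene from the same expression bin for each DE gene."""
    rng = random.Random(seed)
    bins = expression_bins(expression, n_bins)
    pool = defaultdict(list)
    for gene, b in bins.items():
        if gene not in de_genes:
            pool[b].append(gene)
    background = []
    for gene in sorted(de_genes):
        candidates = pool[bins[gene]]
        if candidates:
            pick = rng.choice(candidates)
            candidates.remove(pick)  # sample without replacement
            background.append(pick)
    return background


# Toy data: 100 genes with increasing expression; five highly expressed DE genes.
expression = {f"g{i}": float(i) for i in range(100)}
de_genes = {"g91", "g93", "g95", "g97", "g99"}
background = matched_background(de_genes, expression)
print(sorted(background))
```

Enrichment of a gene category would then be tested among the DE genes relative to this background rather than relative to all expressed genes, so that highly expressed categories are not flagged merely because they are easier to quantify.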
The authors should exercise strong caution in making inferences about these differences being the basis of parental behavior. It is possible, given connections to relevant research, but without direct intervention, direct claims should be avoided. There should be clear distinctions of what to conclude and what to propose as possibilities for future research.
Yes, we agree that we are unable to make direct claims about neuronal differences being the basis of parental behavior. We will revise our language to be clearer about which relationships we are hypothesizing and what we propose as possibilities for future research.
Histology is not performed on all regions included in the sequencing analysis.
We apologize that our language describing the hypothalamic regions included in the sequencing analysis and those included in the histology is unclear. We aimed to dissect the medial preoptic region for the sequencing analysis, but additionally captured parts of the anterior hypothalamus, including the paraventricular (PVN), supraoptic (SON), and suprachiasmatic (SCN) nuclei, and more. Our histology was performed across the entire hypothalamus and covers all regions included in the sequencing data. We will revise the manuscript to more accurately describe the hypothalamic regions we investigated.
Reviewer #3:
My primary concern is that the dataset is limited: 52,121 neuronal nuclei across 24 samples, which does not provide many cells per cluster to analyze comparatively across sex and species, particularly given the heterogeneity of the region dissected. The Supplementary table reports lower UMIs/genes per cell than is typically seen as well. Perhaps additional information could be obtained from the data by not restricting the analyses to cells that can be assigned to Mus types. A direct comparison of the two Peromyscus species could be valuable as would a more complete Peromyscus POA atlas.
Our dataset reports ~1,500 genes and ~1,000 UMIs per nucleus, which is indeed lower than is typically reported in other single-nucleus datasets. Some of this discrepancy is due to a lower quality genome and annotated transcriptome available for Peromyscus compared to Mus musculus, which results in a lower mapping rate than is typically reported in Mus studies. However, our dataset was sufficient to identify known peptidergic cell types (Supp Fig 6) and to map homology to Mus cell types for 34 (64%) of our 53 clusters. Additionally, although some of our clusters contain small numbers of cells, our differential abundance analysis accounts for the variance in cell numbers observed across samples and should be robust against any increase in variance due to small numbers. In fact, even differential abundance of very small cell clusters such as oxytocin neurons (cell type 40) was validated by histology.
We would like to clarify that all analyses were performed on all cell clusters, regardless of whether or not they could be assigned homology to a Mus cell type. All the cell types that we identified as differentially abundant or as showing significant sex differences happened to be cell types for which homology to a Mus cell type could be defined. This may arise for a relatively uninteresting reason: cell types that have more distinct transcriptional signatures will be more accurately clustered, leading to more accurate identification of homology as well as more accurate measurements of differential abundance / expression. We will revise our language to make this clearer in our manuscript.
In Supplement 7, it appears that most neurons can be assigned as excitatory or inhibitory, but then so many of these cells remain in the unassigned "gray blob" seen in panel 1E. Clustering of excitatory and inhibitory neurons separately, as in prior cited work in Mus POA (refs 31 and 57) may boost statistical power to detect sex and species differences in cell types. Perhaps the cells that cannot be assigned to Mus contain too few reads to be useful, in which case they should be filtered out in the QC. The technical challenges of a comparative single-cell approach are considerable, so it benefits the scientific community to provide transparency about them.
We are not certain why we are unable to cluster and assign homology to many of our cells (i.e. cells in the unassigned “gray blob”). However, we note that even in the Mus atlas, many cells did not belong to obvious clusters by UMAP visualization and that several clusters lacked notable marker genes and were designated simply as “Gaba” and “Glut” clusters. Therefore, it is unsurprising that our own dataset also contains cells that lack the transcriptional signatures needed to be clustered and/or mapped to Mus cell types. We do know, however, that the median number of reads per nucleus is uniform across cell clusters and does not explain why some clusters could not be assigned to Mus. We will add this information to our revised manuscript.
We do not think that a two-stage clustering (i.e. clustering first by excitatory vs. inhibitory neurons) is expected to gain power to resolve cell types in this case. Excitatory vs. inhibitory neurons are clearly separable on our UMAP (Supp Fig 7) so that information is already being used by our clustering procedure. However, we will explore this further in our revised manuscript to see if doing so will boost statistical power.
The Calb1 dimorphism as observed by immunostaining, appears much more extensive in P. maniculatus compared to P. polionotus (Figures 3 E and F). This finding is not reflected in the counts of the i20:Gal/Moxd1 cluster. The use of Calb1 staining as a proxy for the Gal/Moxd1 cluster would be strengthened if the number of POA Calb1+ neurons that are found in each cluster was apparent. There may be additional Calb+ neurons in the cells that are not annotated to a Mus cluster. This clarification would add support to the overall conclusion that there is reduced sexual dimorphism in P. polionotus.
From the Mus MPOA atlas (which includes both single-cell sequencing data and imaging-based spatial information), it is known that the i20:Gal/Moxd1 cluster comprises sexually dimorphic cells that make up both the BNST and the SDN-POA. These sexually dimorphic cells are well-studied and known to be marked by Calb1, which we used in immunostaining as a proxy for i20:Gal/Moxd1.
However, we would like to clarify that in our study, the immunostaining of Calb1+ neurons and the sequencing counts of the i20:Gal/Moxd1 cluster are not completely reflective of each other because our sequencing dataset only captured the ventral portion of the BNST. Therefore, our i20:Gal/Moxd1 counts contain a combination of some Calb1+ BNST cells and likely all Calb1+ SDN-POA cells and are difficult to interpret on their own. Our histology, however, covers the entire hypothalamus and is more reliable for identifying sex and species differences in each region. We will clarify this in the revised manuscript.
The relationship between the sex steroid receptor expression and the sex bias in gene expression would be improved if the sex bias in sex steroid receptor expression was included in Supplementary Figure 10.
We will include this in the revised manuscript.
There is no explanation for the finding that there is a female bias in gene expression across all cell types in P. polionotus.
We also find this observation interesting but do not yet have a good explanation for it. We plan to follow up on this in future work.
-
-
www.biorxiv.org www.biorxiv.org
-
Author Response:
We appreciate the reviewers' detailed feedback, which has highlighted several areas where our study could be strengthened. Although we acknowledge the relatively limited scope of our CRISPR-based gene-deletion screen, we successfully demonstrated the immunogenic role of Pccb in our syngeneic pancreatic cancer mouse model. Specifically, loss of PCCB in our mutant KRAS/p53 PIK3CA-null (αKO) cells blocked host T cell killing of tumor cells.
Furthermore, blocking the PD1/PD-L1 interaction reverses this anti-tumor immunogenic effect. We agree with the reviewers regarding the limitations of our study, such as the sample size in our scTCR sequencing and the lack of direct cytotoxicity assays to confirm tumor-specific T cell clones. However, our results are consistent across multiple experimental approaches that strongly suggest meaningful differences in host T cell response to the three implanted tumor types, KPC, αKO and p-αKO. We agree that future mechanistic studies will be important to determine how PCCB is involved in this immunogenic response. We also agree with the reviewers that additional future studies with other KPC cell lines will strengthen our conclusion regarding PCCB. Finally, we acknowledge the inherent limitations of IHC techniques to assess the involvement of other T cell checkpoints that might also be involved in this anti-tumor immunogenic effect. In summary, despite these limitations, our findings provide novel insight into the role of PCCB in pancreatic tumor immunogenicity and contribute to the ongoing discussion of how to improve therapeutic strategies for this deadly cancer.
Reviewer 1:
Weaknesses:
(1) Clonal expansion of cytotoxic T cells infiltrating the pancreatic αKO tumors
a. Only two tumor-bearing hosts were evaluated by single-cell TCR sequencing, thus limiting conclusions that may be drawn regarding repertoire diversity and expansion.
We agree with the reviewer that possible repertoire diversity and expansion could be observed by sequencing more tumor-bearing hosts. However, our current data reveal a marked consistency in the transcriptional expression within the two tumors analyzed per group. Importantly, these features are significantly divergent between the αKO and p-αKO groups. While recognizing the limited sample size, the observed within-group consistency and the clear distinction between groups strongly support the validity of the reported trends.
b. High abundance clones in the TME do not necessarily have tumor specificity, nor are they necessarily clonally expanded. They may be clones which are tissue-resident or highly chemokine-responsive and accumulate in larger numbers independent of clonal expansion. Please consider softening language to clonal enrichment or refer to clone size as clonal abundance throughout the paper.
We agree with the reviewer that it’s possible that the high abundance clones are not necessarily tumor specific. Our previous work (N. Sivaram 2019) demonstrated the critical role of increased pancreatic CD8+ T cells in αKO tumor regression within B6 mice. Therefore, antigen-specific CD8+ T cell clonal expansion within the pancreas is an anticipated observation. However, as the reviewer pointed out, a portion of this expansion may be attributable to factors independent of tumor antigens. While the low T cell infiltration observed in KPC-implanted mice argues against a purely tissue-resident explanation, further investigation is required to definitively establish the tumor specificity of individual clones. We have revised the manuscript to reflect this nuance, replacing "clonal expansion" with "clonal enrichment".
c. The whole story would be greatly strengthened by cytotoxicity assays of abundant TCR clones to show tumor antigen specificity.
As mentioned above, we agree with the reviewer that future studies are needed to investigate each of the specific clones. Due to the extended timeframe required, it’s beyond the scope of the present study.
(2) A genome-wide CRISPR gene-deletion screen to identify molecules contributing to Pik3ca-mediated pancreatic tumor immune evasion
a. CRISPR mutagenesis yielded outgrowth of only 2/8 tumors. A more complete screen with an increased total number of tumors would yield much stronger gene candidates with better statistical power. It is unsurprising that candidates were observed in only one of the two tumors. Nevertheless, the authors moved forward successfully with Pccb.
We agree that by including more mice in the CRISPR screen, it’s possible that we could have identified more candidates. Regardless, we have successfully demonstrated PCCB’s role in pancreatic tumorigenicity with our mouse model.
(3) T cells infiltrate p-αKO tumors with increased expression of immune checkpoint
a. In Figure 4D, cell counts are not normalized to total CD8+ T cell counts, making it difficult to directly compare aKO to p-aKO tumors. Based on quantifications from Figure 4D, I suspect normalization will strengthen the conclusion that CD8+ infiltrate is more exhausted in p-aKO tumors.
Due to the use of distinct tumor sections for quantifying CD8+ cells and T cell checkpoint inhibitory receptor expression, direct normalization of these counts is challenging. However, we observed comparable CD8+ cell numbers between αKO and p-αKO tumors, with p-αKO tumors exhibiting nearly double the expression of immune checkpoint receptors. Therefore, even accounting for potential normalization discrepancies, we anticipate that p-αKO tumors would still demonstrate a significantly higher percentage of immune checkpoint receptor-positive cells compared to αKO tumors.
b. Flow cytometric analysis to further characterize the myeloid compartment is incomplete (single replicate) and does not strengthen the argument that p-aKO TME is more immunosuppressive. It could, however, strengthen the argument that TIL has less anti-tumor potential if effector molecule expression in CD8+ infiltrating cells were quantified.
We agree that including more tumor samples will strengthen the argument that p-αKO TME is more immunosuppressive. Future studies need to be done to characterize CD8+ T cells.
(4) Inhibition of PD1/PD-L1 checkpoint leads to elimination of most p-αKO tumors
a. It is reasonable to conclude that p-aKO tumors are responsive to immune checkpoint blockade. However, there is no data presented to support the statement that checkpoint blockade reactivates an existing anti-tumor CD8+ T cell response and does not induce a de novo response.
We agree that future studies exploring the clonotypes of T cells infiltrating tumors in PD-1-treated mice are necessary to determine whether the observed T cell response represents reactivation of existing clones, a de novo response, or a combination of both.
b. The discussion of these data implies that anti-PD-1 would not improve aKO tumor control, but these data are not included. As such, it is difficult to compare the therapeutic response in aKO versus p-aKO. Further, these data are at best an indirect comparison of the T cell responsiveness against tumor, as the only direct comparison is infiltrating cell count in Figure 4 and there are no public TCR clones with confirmed anti-tumor specificity to follow in the aKO versus p-aKO response.
Since αKO tumors completely regress with 100% animal survival, we deemed anti-PD1 treatment in this group unnecessary. While we did assess anti-PD1 treatment in KPC-implanted mice, no survival benefit was observed (data not shown). The p-αKO tumor model was the only one in which anti-PD1 treatment improved survival. The complexity of the in vivo tumor microenvironment likely contributes to the lack of shared TCR clones between αKO and p-αKO tumors, even within the same tumor group. Future studies aimed at identifying tumor-specific clones may involve transferring in vivo models to in vitro assays or the generation of novel mouse strains expressing identified TCRs. However, these approaches require substantial time and resources and are beyond the scope of the present study.
Reviewer 2:
Weaknesses:
(1) A major issue is that it seems these data are based on the use of a single tumor cell clone with PIK3CA deleted. Therefore, there could be other changes in this clone in addition to the deletion of PIK3CA that could contribute to the phenotype.
We have previously tested a different KPC cell line (DT10022) with genetically downregulated PIK3CA and found mice implanted with αKO cells also showed tumor regression. However, we have not tested if deletion of Pccb in the DT10022-aKO cell line will have the same effect.
(2) The conclusion that the change in the PCCB-deficient tumor cell line is unrelated to mitochondrial metabolic changes may be incorrect based on the data provided. While it is true that in the experiments performed, there was no statistically significant change in the oxygen consumption rate or metabolite levels, this could be due to experimental error. There is a trend in the OCR being higher in the PCCB-deficient cells, although due to a high standard deviation, the change is not statistically significant. There is also a trend for there being more aKG in this cell line, but because there were only 3 samples per cell line, there is no statistically significant difference.
Although PCCB is known to cause metabolic changes, in the context of this study, we are comparing PCCB-deficient to PCCB & PIK3CA double-deficient cells. We did not address if PCCB loss alone would cause metabolic alteration. We suspect that is the case.
(3) More data are required to support the authors' conclusion that there are myeloid changes in the PCCB-deficient tumor cells. There is only flow data shown from one tumor of each type.
We agree that including more tumor samples will strengthen the argument that p-αKO TME is more immunosuppressive.
(4) The previous published study demonstrated increased MHC and CD80 expression in the PIK3CA-deficient tumors and these differences were suggested to be the reason the tumors were rejected. However, no data concerning the levels of these proteins were provided in the current manuscript.
Our previous hypothesis for altered MHC and CD80 levels is based on the observation that there is a dramatic increase in the number of infiltrating T cells upon Pik3ca deletion. In this study, similar levels of infiltrating T cells were observed when Pccb was deleted in αKO cells; therefore, we do not expect any changes in MHC and CD80 levels, since these tumors appear to still be recognized by the T cells. Indeed, we are able to detect clonal enrichment in p-αKO tumors.
Reviewer 3:
Weaknesses:
The IHC technique that was used to stain and characterize the exhaustion status of the tumor-infiltrating T cells.
We agree with the reviewer that incorporating multi-color IHC or flow cytometry to characterize the exhaustion status of specific T cell subtypes would provide more comprehensive information. Unfortunately, we do not have the resources to perform these studies currently.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Public review):
Summary:
This manuscript by Guo and Uusisaari describes a series of experiments that employ a novel approach to address long-standing questions on the inferior olive in general and the role of the nucleoolivary projection specifically. For the first time, they optimized the ventral approach to the inferior olive to facilitate imaging in this area that is notoriously difficult to reach. Using this approach, they are able to compare activity in two olivary regions, the PO and DAO, during different types of stimulation. They demonstrate the difference between the two regions, linked to Aldoc-identities of downstream Purkinje cells, and that there is co-activation resulting in larger events when they are clustered. Periocular stimulation also drives larger events, related to co-activation. Using optogenetic stimulation they activate the nucleoolivary (N-O) tract and observe a wide range of responses, from excitation to inhibition. Zooming in on inhibition they test the assumption that N-O activation can be responsible for suppression of sensory-evoked events. Instead, they suggest that the N-O input can function to suppress background activity while preserving the sensory-driven responses.
Strengths:
This is an important study, tackling the long-standing issue of the impossibility to do imaging in the inferior olive and using that novel method to address the most relevant questions. The experiments are technically very challenging, the results are presented clearly and the analysis is quite rigorous. There is quite a lot of room for interpretation, see weaknesses, but the authors make an effort to cover many options.
Weaknesses:
The heavy anesthesia that is required during the experiment could severely impact the findings. Because of the anesthesia, the firing rate of IO neurons is found to be 0.1 Hz, significantly lower than the 1 Hz found in non-anesthetized mice. This is mentioned and discussed, but the possible consequences should not be understated and should be addressed more. Although the methods and results are described in sufficient detail, there are a few points that, when addressed, would improve the manuscript.
We sincerely thank the reviewer for their encouraging comments and recognition of our study’s significance. We fully acknowledge the confounding effects of the deep anesthesia used in our experiments, which was necessary to ensure the animals’ welfare while establishing this technically demanding methodology. We elaborate on these effects below and will further clarify them in the revised manuscript.
Ultimately, the full resolution of this issue will require recordings in awake animals, as we consider our approach an advancement from acute slice preparations but not yet a complete representation of in vivo IO function. However, key findings from our study—such as amplitude modulation with co-activation and the potential role of IO refractoriness in complex spike generation—could be further explored in existing cerebellar cortical recordings from awake, behaving animals. We hope our work will motivate re-examination of such datasets to assess whether these mechanisms contribute to overall cerebellar function.
Reviewer #1 (Recommendations for the authors):
On page 10 the authors indicate that 2084 events were included for DAO and 1176 for PO. Is that the total number of events? What was the average and the range per neuron and the average recording duration?
Thank you for pointing out the lack of clarity. The sentence should say "in total, 2084 and 1176 detected events from DAO and PO were included in the study". We will add the averages and ranges of events detected per neuron in different categories, as well as the durations of the recordings (ranging from 120 s to 270 s), to the tables.
On page 10 it is also stated that: "events in PO reached larger values than those in DAO even though the average values did not differ". Please clarify that statement. Which parameter + p-value in the table indicates this difference?
Apologies for the omission. Currently the observation is only visible in the longer tail to the right in the PO data in Figure 2B2. We will add the range of values (3.0-75.2 vs 3.1-39.6 for PO and DAO amplitudes, respectively) in the text and the tables in the revision.
Abbreviating airpuff to AP is confusing, I would suggest not abbreviating it.
Understood. We will change AP to airpuff in the text. In figure labels, at least in some panels, the abbreviation will be necessary due to space constraints.
What type of pulse was used to drive ChrimsonR? Could it be that the pulse caused a rebound-like phenomenon with the pulse duration that drove the excitation?
As described on line 229 and in the Methods, we used 5-second trains of 5-ms LED light pulses. Importantly, these stimulation parameters were informed by our extensive in vitro examination of various stimulation patterns (Lefler et al., 2014), which consistently produced stable postsynaptic responses without inducing depolarization or rebound effects. Additionally, Loyola et al. (2024) reported no evidence of rebound activity in IO cells following optogenetic activation of N-O axons in the absence of direct neuronal depolarization. We will incorporate these considerations into the discussion, while also acknowledging that unequivocal confirmation of “direct” rebound excitation would require intracellular recordings, such as patch clamp experiments.
The authors indicate that the excitatory activity was indistinguishable in shape from other calcium activity, but can anything be said about the timing (the scale bar in Figure 4A2 has no value, is it the same 2s pulse)?
Apologies for the oversight in labeling the scale bar in Figure 4A2 (it is 2 s). While we deliberately refrain from making strong claims regarding the origin of the N-O-evoked spikes, their timing can be examined in more detail in Figure 4 - Supplement 1, panels C and D. We will make sure this is clearly stated in the revised text.
Did the authors check for accidental sparse transfection with ChrimsonR of olivary neurons in the post-mortem analysis?
Good point! However, we have never seen this AAV9-based viral construct drive trans-synaptic expression in the IO, nor is this version of AAV known to have the capacity for trans-synaptic expression in general.
No sign of retrograde labeling (via the CF collaterals in the cerebellar nuclei) was seen either. Notably, the hSyn promoter used to drive ChrimsonR expression is extremely ineffective in the IO. Thus, we doubt that such accidental labeling could underlie the excitatory events seen upon N-O stimulation. We will add these mentions with relevant references to the discussion of the revised manuscript.
On page 18 the authors state that: "The lower SS rate was attributed to intrinsic factors of PNs, while the reduced frequency of CSs was speculated to result from increased inhibition of the IO via the nucleo-olivary (N-O) pathway targeting the same microzone." I think I understand what you mean to say, but this is a bit confusing.
Agreed. We will rephrase this sentence to clarify that a lower SS rate in a given microzone may lead to increased activation of inhibitory N-O axons that target the region of IO that sends CF to the same microzone.
Is airpuff stimulation not more likely to activate PO than DAO because of the related modalities (more face vs. more trunk/limbs), and thereby also more likely to drive event co-activation (as it is stated in the abstract)?
We agree that the specific innervation patterns of different IO regions likely explain the discrepancy between previous reports of airpuff-evoked complex spikes in cerebellar cortical regions targeted by DAO and the absence of airpuff responses in the particular region of DAO accessible via our surgical approach. As in the present dataset virtually no airpuff-evoked events were seen in DAO regions, we are unable to directly compare airpuff-evoked event co-activation between PO and DAO. The higher co-activation for PO was observed for "spontaneous" activity.
The Discussion addresses the question of why N-O pathway activation does not remove the airpuff response.
Given the potentially profound effect, I would propose to expand the discussion on the role of anesthesia, including longer refractory periods but also potential disruption of normal network interactions (even though individually the stimulations work). Briefly indicating what is known about alpha-chloralose would help interpret the results as well.
We fully agree that the anesthetic state introduces confounding factors that must be considered when interpreting our results. We will expand the discussion to address how anesthesia, particularly alpha-chloralose, as well as tissue cooling, may contribute to prolonged refractory periods and potential disruptions in normal network interactions. However, we recognize that certain aspects cannot be fully resolved without recordings in awake animals. For this reason, we characterize our preparation as an "upgraded" in vitro approach rather than a fully representative in vivo model.
Please clearly indicate that the age range of P35-45 is for the moment of virus injection and specify the age range for the imaging experiment.
Apologies for the oversight. We will indicate these age ranges in the results (as they are currently only specified in Methods). The P35-45 range refers to the moment of virus injection.
The methods indicate that a low-pass filter of 1Hz was used. I am sure this helps with smoothing, but does it not remove a lot of potentially interesting information. How would a higher low-pass filter affect the analysis and results?
We acknowledge that applying a 1 Hz low-pass filter inevitably removes high-frequency components, including potential IO oscillations and fine details such as spike "doublets." However, given the temporal resolution constraints of our recording approach, we prioritized capturing robust, interpretable events over attempting to extract finer features that might be obscured by both the indicator kinetics and imaging speed.
While a higher cut-off frequency could, in principle, allow more precise measurement of rise times and peak timings, it would also amplify high-frequency noise, complicating automated event detection and reducing confidence in distinguishing genuine neural signals from artifacts. Given these trade-offs, we opted for a conservative filtering approach to ensure stable event detection. Future work, particularly with faster imaging rates and improved sensors (GCaMP8s), will explore the finer temporal structure of IO activity. We will deliberate on these matters more extensively in the revised discussion.
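The filtering trade-off described above can be illustrated with a minimal sketch. It assumes a single-pole IIR low-pass filter, which is not necessarily the filter used in the study, and takes the 20 fps sampling rate mentioned elsewhere in this response. The function names and the test frequencies (0.2 Hz for a slow component, 5 Hz for a fast one) and the alternative 4 Hz cut-off are hypothetical choices made only for illustration.

```python
import math

def lowpass_first_order(samples, cutoff_hz, fs_hz):
    """Single-pole IIR low-pass (assumed filter type, for illustration only)."""
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor between 0 and 1
    out = []
    y = samples[0]
    for x in samples:
        y += alpha * (x - y)  # move the output a fraction alpha toward the input
        out.append(y)
    return out

def steady_state_gain(freq_hz, cutoff_hz, fs_hz, duration_s=20.0):
    """Peak output amplitude for a unit sine, measured after the transient."""
    n = int(duration_s * fs_hz)
    sine = [math.sin(2.0 * math.pi * freq_hz * i / fs_hz) for i in range(n)]
    filtered = lowpass_first_order(sine, cutoff_hz, fs_hz)
    return max(abs(v) for v in filtered[n // 2:])  # ignore the first half

FS_HZ = 20.0  # frame rate stated in the response; everything else is hypothetical
gain_slow = steady_state_gain(0.2, 1.0, FS_HZ)          # slow component, 1 Hz cut-off
gain_fast = steady_state_gain(5.0, 1.0, FS_HZ)          # fast component, 1 Hz cut-off
gain_fast_relaxed = steady_state_gain(5.0, 4.0, FS_HZ)  # fast component, 4 Hz cut-off

print(f"0.2 Hz through 1 Hz cut-off: {gain_slow:.2f}")
print(f"5 Hz through 1 Hz cut-off:   {gain_fast:.2f}")
print(f"5 Hz through 4 Hz cut-off:   {gain_fast_relaxed:.2f}")
```

Under these assumptions the slow component passes nearly unchanged while the fast component is strongly attenuated at a 1 Hz cut-off. Raising the cut-off lets more of the fast component through, and by the same token more high-frequency noise.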
Reviewer #2 (Public review):
The authors developed a strategy to image inferior olive somata via viral GCaMP6s expression, an implanted GRIN lens, and a one-photon head-mounted microscope, providing the first in vivo somatic recordings from these neurons. The main new findings relate to the activation of the nucleoolivary pathway, specifically that: this manipulation does not produce a spiking rebound in the IO; it exerts a larger effect on spontaneous IO spiking than stimulus (airpuff)-evoked spiking. In addition, several findings previously demonstrated in vivo in Purkinje cell complex spikes or inferior olivary axons are confirmed here in olivary somata: differences in event sizes from single cells versus co-activated cells; reduced coactivation when activating the NO pathway; more coactivation within a single zebrin compartment.
The study presents some interesting findings, and for the most part, the analyses are appropriate. My two principal critiques are that the study does not acknowledge major technical limitations and their impact on the claims; and the study does not accurately represent prior work with respect to the current findings.
We thank the reviewer for recognising the value of the findings in our "reduced" in vivo preparation, and apologize for omissions in the work that led to critique. We will elaborate on these matters below and prepare a revised manuscript.
The authors use GCaMP6s, which has a tau1/2 of >1 s for a normal spike, and probably closer to 2 s (10.1038/nature12354) for the unique and long type of olivary spikes that give rise to axonal bursts (10.1016/j.neuron.2009.03.023). Indeed, the authors demonstrate as much (Fig. 2B1). This affects at least several claims:
a. The authors report spontaneous spike rates of 0.1 Hz. They attribute this to anesthesia, yet other studies under anesthesia recording Purkinje complex spikes via either imaging or electrophysiology report spike rates as high as 1.5 Hz (10.1523/JNEUROSCI.2525-10.2011). This discrepancy is not acknowledged and a plausible explanation is not given. Citations are not provided that demonstrate such low anesthetized spike rates, nor are citations provided for the claim that spike rates drop increasingly with increasing levels of anesthesia when compared to awake resting conditions.
We fully acknowledge that anesthesia is a major confounding factor in our study. Given the unusually invasive nature of our surgical preparation, we prioritized deep anesthesia to ensure the animals’ welfare. This, along with potential cooling effects from tissue removal and GRIN lens contact, likely contributed to the observed suppression of IO activity.
We recognize that reported complex spike rates under anesthesia vary considerably across studies, and we will expand our discussion to provide a more comprehensive comparison with prior literature. Notably, different anesthetic protocols, levels of anesthesia, and recording methodologies can lead to widely different estimates of firing rates. While we cannot resolve this issue without recordings in awake animals, we will clarify that our observed rates likely reflect both the effects of anesthesia and specific methodological constraints. We will also incorporate additional references to studies examining cerebellar activity under different anesthetic conditions.
More likely, this discrepancy reflects spikes that are missed due to a combination of the indicator kinetics and low imaging sensitivity (see (2)), neither of which are presented as possible plausible alternative explanations.
We acknowledge that the combination of slow indicator kinetics and limited optical power in our miniature microscope setup constrains the temporal resolution of our recordings. However, we are confident that we can reliably detect events occurring at intervals of 1 second or longer. This confidence is based on data from another preparation using the same viral vector and optical system, where we observed spike rates an order of magnitude higher.
That said, we do not make claims regarding the presence or absence of somatic events occurring at very short intervals (e.g., 100-ms "doublets," as described by Titley et al., 2019), as these would likely fall below our temporal resolution. We will clarify this limitation in the revised manuscript to ensure that the constraints of our approach are fully acknowledged.
While GCaMP6s is not as sensitive as more recent variants (Zhang et al., 2023, PMID 36922596), our previous work (Dorgans et al., 2022) demonstrated that its dynamic range and sensitivity are sufficient to detect both spikes and subthreshold activity in vitro. Although the experimental conditions differ in the current miniscope experiments, we took measures to optimize signal quality, including excluding recordings with a low signal-to-noise ratio (see Methods). This need for high signal fidelity also informed our decision to limit the sampling rate to 20 fps. In future work, we plan to adopt newer GCaMP variants that were not available at the start of this project, which should further improve sensitivity and temporal resolution.
Many claims are made throughout about co-activation ("clustering"), but with the GCaMP6s rise time to peak (0.5 s), there is little technical possibility to resolve co-activation. This limitation is not acknowledged as a caveat and the implications for the claims are not engaged with in the text.
As noted in the manuscript (L492-), "interpreting fluorescence signals relative to underlying voltage changes is challenging, particularly in IO neurons with unusual calcium dynamics." We acknowledge that the slow rise time of GCaMP6s (~0.5 s) limits our ability to precisely resolve the timing of co-activation at very short intervals. However, given the relatively slow timescales of IO event clustering and the inherent synchrony in olivary network dynamics, we believe that the observed co-activation patterns remain meaningful, even if finer temporal details cannot be fully resolved.
To ensure clarity, we will expand this section to explicitly acknowledge the temporal resolution limitations of our approach and discuss their implications for interpreting co-activation. While the precise timing of individual spikes within a cluster may not be resolvable, the observed increase in event magnitude with coarse co-activation suggests that clustering effects remain functionally relevant even when exact spike synchrony is not detectable at millisecond resolution.
This finding is consistent with the idea that co-activation enhances calcium influx, leading to larger amplitude events — a relationship that does not require perfect temporal resolution to be observed. The fact that this effect persists across a broad range of clustering windows (as shown in Figure 2 Supplement 2) further supports its robustness. While we cannot make strong claims about precise spike timing within these clusters nor about the mechanism underlying enhanced calcium signal, our results demonstrate that co-activation may influence IO activity in a quantifiable way. We will clarify these points in the revised manuscript to ensure that our findings are appropriately framed given the temporal constraints of our imaging approach.
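The idea of testing co-activation across a range of clustering windows can be sketched as follows. This is a hypothetical illustration, not the analysis used in the manuscript: the function name, the definition of co-activation (an event in another cell within plus or minus the window), and the toy event times are all assumptions.

```python
def coactive_counts(events_by_cell, window_s):
    """For each event, count how many OTHER cells have an event within +/- window_s.

    events_by_cell maps a cell label to a list of event times in seconds.
    Returns a dict keyed by (cell, event_time).
    """
    counts = {}
    for cell, times in events_by_cell.items():
        for t in times:
            counts[(cell, t)] = sum(
                1
                for other, other_times in events_by_cell.items()
                if other != cell and any(abs(t - u) <= window_s for u in other_times)
            )
    return counts

# Toy event times (seconds); purely illustrative, not data from the study.
events = {
    "cell_a": [1.0, 10.0],
    "cell_b": [1.2, 20.0],
    "cell_c": [1.8, 10.4],
}

for window in (0.25, 0.5, 1.0):
    counts = coactive_counts(events, window)
    clustered = sum(1 for n in counts.values() if n > 0)
    print(f"window {window:.2f} s: {clustered} of {len(counts)} events co-active")
```

Grouping event amplitudes by these counts at each window would then show whether the amplitude difference between isolated and co-active events depends on the chosen window.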
The study reports an ultralong "refractory period" (L422-etc) in the IO, but this again must be tempered by the possibility that spikes are simply being missed due to very slow indicator kinetics and limited sensitivity. Indeed, the headline numeric estimate of 1.5 s (L445) is suspiciously close to the underlying indicator kinetic limitation of 1-2 s.
Our findings suggest a potential refractory period limiting the frequency of events in the inferior olive under our recording conditions. This interpretation is supported by the observed inter-event interval distribution, the inability of N-O stimulation to suppress airpuff-evoked events, and lower bounds reported in earlier literature on complex spike intervals recorded in awake animals under various behavioral contexts. Taking into account the likely cooling of tissue, a refractory period of 1.5 s is not unreasonable. Of course, we recognize that the slow decay kinetics of GCaMP6s may cause overlapping fluorescence signals, potentially obscuring closely spaced events. This is in line with data presented in the Chen et al. (2013) manuscript describing GCaMP6s (PMID: 36922596; Figure 3b showing events detected with intervals less than 500 ms).
The consideration of refractoriness only arose late in the project while we were investigating the explanations for lack of inhibition of airpuff-evoked spikes. Future experiments, particularly in awake animals, will be instrumental in validating this interpretation. To ensure that the refractory period is understood as one possible mechanism rather than a definitive explanation, we will rephrase the discussion to clarify that while our data are compatible with a refractory period, they do not establish it conclusively.
The study uses endoscopic one-photon miniaturized microscope imaging. Realistically, this is expected to permit an axial point spread function (z-PSF) on the order of 40 µm, which must substantially reduce resolution and sensitivity. This means that if there *is* local coactivation, the data in this study will very likely have individual ROIs that integrate signals from multiple neighboring cells. The study reports relationships between event magnitude and clustering, etc; but a fluorescence signal that contains photons contributed by multiple neighboring neurons will be larger than a single neuron, regardless of the underlying physiology - the text does not acknowledge this possibility or limitation.
We acknowledge that the use of one-photon endoscopic imaging imposes limitations on axial resolution, potentially leading to signal contributions from neighboring neurons. To mitigate this, we applied CNMFe processing, which allows for the deconvolution of overlapping signals and the differentiation of multiple neuronal sources within shared pixels. However, as the reviewer points out, if two neurons are perfectly overlapping in space, they may be treated as a single unit.
To clarify this limitation, we will expand the discussion to explicitly acknowledge the impact of one-photon imaging on signal separation and to emphasize that, while CNMFe helps resolve some overlaps, perfect separation is not always possible. As already noted in the manuscript (L495-), "the absence of optical sectioning in the whole-field imaging method can lead to confounding artifacts in densely labeled structures such as the IO’s tortuous neuropil." We will further elaborate on how this factor was considered in our analysis and interpretation.
Second, the text makes several claims for the first multicellular in vivo olivary recordings. (L11; L324, etc).
I am aware of at least two studies that have recorded populations of single olivary axons using two-photon Ca2+ imaging up to 6 years ago (10.1016/j.neuron.2019.03.010; 10.7554/eLife.61593). This technique is not acknowledged or discussed, and one of these studies is not cited. No argument is presented for why axonal imaging should not "count" as multicellular in vivo olivary recording: axonal Ca2+ reflects somatic spiking.
We appreciate the reviewer’s point and acknowledge the important prior work using two-photon imaging to record olivary axonal activity in the cerebellar cortex. However, while axonal calcium signals do reflect somatic spiking, these recordings inherently lack information about the local network interactions within the inferior olive itself.
A key motivation for our study was to observe neuronal activity within the IO at the level of its gap-junction-coupled local circuits, rather than at the level of its divergent axonal outputs. The fan-like spread of climbing fibers across rostrocaudal microzones in the cerebellar cortex makes them relatively easy to record in vivo, but it also means that individual imaging fields contain axons from neurons that may be distributed across different IO microdomains. As a result, while previous work has provided valuable insight into olivary output patterns, it has not allowed for the examination of coordinated somatic activity within localized IO neuron clusters.
With apologies, we recognize that this distinction was not sufficiently emphasized in our introduction. We will clarify this key point and ensure that the important climbing fiber imaging studies are properly cited and contextualized in the revised manuscript.
Reviewer #2 (Recommendations for the authors):
The authors state: "we found no reports that examined coactivation levels between Z+ and Z- microzones in cerebellar complex spike recordings" (L359). Multiple papers (that are not cited) using AldolaseC-tdTomato mice with two-photon Purkinje dendritic calcium imaging showed synchronization (at similar levels) within but not across Z+/Z- bands (2015, 10.1523/JNEUROSCI.2170-14.2015; 2023, https://doi.org/10.7554/eLife.86340).
We apologize for the misleading phrasing. We will rephrase this statement to: "While complex spike coactivation within individual zebrin zones has been extensively studied (references), we found no reports directly comparing the levels of intra-zone co-activation between Z+ and Z- microzones."
Additionally, we will ensure that the relevant studies demonstrating synchronization within zebrin zones, as well as (lack of) interactions between neighboring zones, are properly cited and discussed in the revised manuscript.
The figures could use more proofreading, and several decisions should be reconsidered:
Normalizing the amplitude to maximum is not a good strategy, as it can overemphasize noise or extremely small-magnitude signals, and should instead follow standard convention and present in fixed units (3A2, 4B2, and even 2C).
As noted earlier, we have excluded recordings and cells with high noise or a low signal-to-noise ratio for event amplitudes, ensuring that such data do not influence the color-coded panels. Importantly, all quantitative analyses and traces presented in the manuscript are normalized to baseline noise level, not to maximal amplitude, ensuring that noise or low-magnitude signals do not skew the analysis.
The decision to use max-amplitude normalization in color-coded panels was made specifically to aid visualization of temporal structure across recordings. This approach allows for clearer comparisons without the distraction of inter-cell variability in absolute signal strength. However, we recognize the potential for confusion and will revise the Results text to explicitly clarify that the color-coded visualizations use a different scaling method than the quantitative analyses.
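The difference between the two scalings can be illustrated with a minimal Python sketch on toy traces. The noise estimate used here (standard deviation of a designated baseline window) and the function names are illustrative assumptions, not the procedure used in the manuscript.

```python
from statistics import mean, pstdev

def normalize_to_baseline_noise(trace, baseline_len):
    """Express a trace in units of baseline noise.

    Assumption: noise is estimated as the SD of the first `baseline_len` samples.
    """
    baseline = trace[:baseline_len]
    offset = mean(baseline)
    noise_sd = pstdev(baseline)
    if noise_sd == 0:
        raise ValueError("baseline has zero variance; noise level is undefined")
    return [(x - offset) / noise_sd for x in trace]

def normalize_to_max(trace):
    """Scale a trace so that its largest absolute value equals 1."""
    peak = max(abs(x) for x in trace)
    if peak == 0:
        raise ValueError("trace is all zeros; cannot normalize to maximum")
    return [x / peak for x in trace]

# Two toy cells with identical baseline noise but a 10-fold difference in event size.
baseline = [0.0, 1.0, -1.0, 1.0, -1.0, 0.0]
strong_cell = baseline + [20.0]
weak_cell = baseline + [2.0]

# Noise-based scaling preserves the 10-fold difference between cells.
strong_snr = normalize_to_baseline_noise(strong_cell, len(baseline))
weak_snr = normalize_to_baseline_noise(weak_cell, len(baseline))

# Max-based scaling maps both events to 1 and inflates the weak cell's noise.
strong_max = normalize_to_max(strong_cell)
weak_max = normalize_to_max(weak_cell)
```

In this toy case the max-normalized weak cell shows baseline fluctuations of 0.5 next to an event of 1, which is the visual effect the reviewer cautions against, whereas the noise-normalized traces keep events comparable across cells.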
x axes with no units: Figures 2B2, 2E1, 3B2, 3C2, 5B2, 5C2, 5D2.
No colorbar units: 5A3 (and should be shown in real not normalized units).
No y axis units: 5D1.
No x axis label or units: 5E1.
5E3 says "stim/baseline" for the y-axis units and then the first-panel title says "absolute frequencies" meaning it’s *not* normalized and needs a separate (accurate) y-axis with units.
Illegibly tiny fonts: 2E1, 3E1, etc.
We will correct all of these in the revised manuscript. Thank you for the careful reading.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
This study provides a thorough analysis of Nup107's role in Drosophila metamorphosis, demonstrating that its depletion leads to developmental arrest at the third larval instar stage due to disruptions in ecdysone biosynthesis and EcR signaling. Importantly, the authors establish a novel connection between Nup107 and Torso receptor expression, linking it to the hormonal cascade regulating pupariation.
However, some contradictory results weaken the conclusions of the study. The authors claim that Nup107 is involved in the translocation of EcR from the cytoplasm to the nucleus. However, the evidence provided in the paper suggests it more likely regulates EcR expression positively, as EcR is undetectable in Nup107-depleted animals, even below background levels.
We appreciate the concern raised in this public review. However, we must clarify that we do not claim that Nup107 regulates the translocation of EcR from the cytoplasm. Rather, we posed as a hypothesis the question of whether Nup107 regulates EcR nuclear translocation (9<sup>th</sup> line of 2<sup>nd</sup> paragraph on page 6). We have spelled this out more clearly in the title of the 3<sup>rd</sup> sub-section of the Results section, and in the discussion (8<sup>th</sup> line of 2<sup>nd</sup> paragraph on page 11). Overall, we have expressed surprise that Nup107 is not directly involved in the nuclear translocation of EcR.
Ecdysone acts through EcR to induce the transcription of EcR itself, creating a positive autoregulatory loop that enhances EcR levels through ecdysone signaling (1). Since Nup107 depletion leads to a reduction in ecdysone levels, it disrupts this transcriptional autoregulatory loop of EcR expression. This can contribute to the reduced EcR levels seen in Nup107-depleted animals.
Additionally, the link between Nup107 and Torso is not fully substantiated. While overexpression of Torso appears to rescue the lack of 20E production in the prothoracic gland, the distinct phenotypes of Torso and Nup107 depletion (developmental delay in the former versus complete larval arrest in the latter) complicate understanding of Nup107's precise role.
We understand that the developmental phenotypes differ when Torso and Nup107 depletion are compared. However, the two molecules being compared here are very different, and the extent of Torso depletion is not evident in other studies (2). Even if the extent of depletion of Torso and Nup107 is similar, we believe that Nup107, being a more widely expressed protein, induces stronger defects owing to its importance in cellular physiology. We think that RNAi-mediated depletion of Nup107 causes a defect in 20E biosynthesis through the Halloween genes, inducing a developmental arrest.
To clarify these discrepancies, further investigation into whether Nup107 interacts with other critical signaling pathways related to the regulation of ecdysone biosynthesis, such as EGFR or TGF-β, would be beneficial and could strengthen the findings.
In summary, although the study presents some intriguing observations, several conclusions are not well-supported by the experimental data.
We agree with the reviewer’s suggestion. As noted in the literature, five RTKs (torso, InR, EGFR, Alk, and Pvr) stimulate the PI3K/Akt pathway, which plays a crucial role in PG function and in controlling pupariation and body size (3). We have checked torso and EGFR signaling. Torso overexpression rescued the Nup107 defects; however, constitutively active EGFR (BL-59843) did not rescue the phenotype (data not shown). Nonetheless, we plan to examine EGFR pathway activation by measuring pERK levels in Nup107-depleted PGs.
Reviewer #2 (Public review):
Summary:
The manuscript by Kawadkar et al investigates the role of Nup107 in developmental progression via the regulation of ecdysone signaling. The authors identify an interesting phenotype of Nup107 whole-body RNAi depletion in Drosophila development - developmental arrest at the late larval stage. Nup107-depleted larvae exhibit mis-localization of the Ecdysone receptor (EcR) from the nucleus to the cytoplasm and reduced expression of EcR target genes in salivary glands, indicative of compromised ecdysone signaling. This mis-localization of EcR in salivary glands was phenocopied when Nup107 was depleted only in the prothoracic gland (PG), suggesting that it is not nuclear transport of EcR but the presence of ecdysone (normally secreted from PG) that is affected. Consistently, whole-body levels of ecdysone were shown to be reduced in Nup107 KD, particularly at the late third instar stage when a spike in ecdysone normally occurs. Importantly, the authors could rescue the developmental arrest and EcR mislocalization phenotypes of Nup107 KD by adding exogenous ecdysone, supporting the notion that Nup107 depletion disrupts biosynthesis of ecdysone, which arrests normal development. Additionally, they found that rescue of the Nup107 KD phenotype can also be achieved by over-expression of the receptor tyrosine kinase torso, which is thought to be the upstream regulator of ecdysone synthesis in the PG. Transcript levels of the torso are also shown to be downregulated in the Nup107KD, as are transcript levels of multiple ecdysone biosynthesis genes. Together, these experiments reveal a new role of Nup107 or nuclear pore levels in hormone-driven developmental progression, likely via regulation of levels of torso and torso-stimulated ecdysone biosynthesis.
Strengths:
The developmental phenotypes of an NPC component presented in the manuscript are striking and novel, and the data appears to be of high quality. The rescue experiments are particularly significant, providing strong evidence that Nup107 functions upstream of torso and ecdysone levels in the regulation of developmental timing and progression.
Weaknesses:
The underlying mechanism is however not clear, and any insight into how Nup107 may regulate these pathways would greatly strengthen the manuscript. Some suggestions to address this are detailed below.
Major questions:
(1) Determining how specific this phenotype is to Nup107 vs. to reduced NPC levels overall would give some mechanistic insight. Does knocking down other components of the Nup107 subcomplex (the Y-complex) lead to similar phenotypes? Given the published gene regulatory function of Nup107, do other gene regulatory Nups such as Nup98 or Nup153 produce these phenotypes?
We thank the reviewer for raising this concern. When working with a Nup complex such as the Nup107 complex, this concern is anticipated but difficult to address, as many Nups function beyond their complex identity. Our observations with all other members of the Nup107-complex, including dELYS, suggest that none of them, except Nup107, could induce larval developmental arrest.
In this study, we primarily focused on the Nup107 complex (outer ring complex) of the NPC. We have not examined other nucleoporins outside of this complex, such as Nup98 and Nup153. However, previous studies have reported that Nup98 and Nup153 interact with chromatin, with these investigations conducted in Drosophila S2 cells (4, 5, 6). In the future, we may check whether Nup98 and Nup153 depletion can produce the arrest phenotype.
(2) In a related issue, does this level of Nup107 KD produce lower NPC levels? It is expected to, but actual quantification of nuclear pores in Nup107-depleted tissues should be added. These and the above experiments would help address a key mechanistic question - is this phenotype the result of lower numbers of nuclear pores or specifically of Nup107?
We agree with the concern raised here, and we plan to assess nucleoporin intensity using the mAb414 antibody (which exclusively recognizes FG-repeat Nups) in the Nup107 depletion background. Our past observations suggest that Nup107 depletion does not affect overall nuclear pore complex assembly in Drosophila salivary glands (data not shown).
(3) Additional experiments on how Nup107 regulates the torso would provide further insight. Does Nup107 regulate transcription of the torso or perhaps its mRNA export? Looking at nascent levels of the torso transcript and the localization of its mRNA can help answer this question. Or alternatively, does Nup107 physically bind the torso?
While the concern regarding torso transcript level is genuine, we have already reported in the manuscript that Nup107 levels directly regulate torso expression. When Nup107 is depleted, torso levels go down, which in turn controls ecdysone production and subsequent EcR signaling (Figure 6B of the manuscript). However, the exact nature of Nup107 regulation of torso expression is still unclear. Since Nup107 is known to interact with chromatin (7), it may affect torso transcription. A physiologically relevant interaction between Nup107 and the torso in a cellular context is unlikely due to their distinct sub-cellular localizations. Investigating this further would require a significant amount of time to obtain reagents and perform the experiments, and is currently beyond the scope of this manuscript.
(4) The depletion level of Nup107 RNAi specifically in the salivary gland vs. the prothoracic gland should be compared by RT-qPCR or western blotting.
Although we know that the Nup107 protein signal is reduced in SG upon knockdown (Figure 3B), we have not compared the Nup107 transcript level in these two tissues (SG and PG). As suggested here, we will knock down Nup107 using SG and PG-specific drivers and quantify the Nup107 depletion level by RT-qPCR.
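As a worked illustration of how such an RT-qPCR comparison is commonly quantified, the sketch below applies the standard 2^(-ΔΔCt) calculation. The Ct values are invented, the reference gene is unspecified, and roughly 100% amplification efficiency is assumed; none of these details come from the manuscript.

```python
def relative_expression(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Relative transcript level (knockdown vs. control) by the 2^(-ddCt) method.

    Assumes ~100% amplification efficiency for both the target and the reference gene.
    """
    delta_ct_kd = ct_target_kd - ct_ref_kd        # target normalized to reference, knockdown
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl  # target normalized to reference, control
    delta_delta_ct = delta_ct_kd - delta_ct_ctrl
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values for one tissue (e.g. SG or PG); not measured data.
remaining_fraction = relative_expression(
    ct_target_kd=26.0, ct_ref_kd=18.0,
    ct_target_ctrl=24.0, ct_ref_ctrl=18.0,
)
```

With these made-up numbers, ΔΔCt is 2 cycles, so the transcript is at one quarter of the control level. Computing the same quantity separately for each tissue-specific driver would make the two depletion levels directly comparable.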
(5) The UAS-torso rescue experiment should also include the control of an additional UAS construct - so Nup107; UAS-control vs Nup107; UAS-torso should be compared in the context of rescue to make sure the Gal4 driver is functioning at similar levels in the rescue experiment.
This is a very valid point, and we took this into account while planning the experiment. To maintain the GAL4 function, we used the Nup107<sup>KK</sup>;UAS-GFP as control alongside the Nup107<sup>KK</sup>;UAS-torso. This approach ensures that GAL4 dilution does not affect observations made in the experiments. As seen in Figure S7, the GFP signal in the prothoracic glands and their reduced size indicate that genes downstream of both UAS sequences are transcribed, and that GAL4 dilution does not play a role here.
Minor:
(6) Figures and figure legends can stand to be more explicit and detailed, respectively.
We will revisit all figures and their corresponding legends to ensure appropriate and explicit details are provided.
Reviewer #3 (Public review):
Summary:
In this study by Kawadkar et al, the authors investigate the developmental role of Nup107, a nucleoporin, in regulating the larval-to-pupal transition in Drosophila through RNAi knockdown and CRISPR-Cas9-mediated gene editing. They demonstrate that Nup107, an essential component of the nuclear pore complex (NPC), is crucial for regulating ecdysone signaling during developmental transitions. The authors show that the depletion of Nup107 disrupts these processes, offering valuable insights into its role in development.
Specifically, they find that:
(1) Nup107 depletion impairs pupariation during the larval-to-pupal transition.
(2) RNAi knockdown of Nup107 results in defects in EcR nuclear translocation, a key regulator of ecdysone signaling.
(3) Exogenous 20-hydroxyecdysone (20E) rescues pupariation blocks, but rescued pupae fail to close.
(4) Nup107 RNAi-induced defects can be rescued by activation of the MAP kinase pathway.
Strengths:
The manuscript provides strong evidence that Nup107, a component of the nuclear pore complex (NPC), plays a crucial role in regulating the larval-to-pupal transition in Drosophila, particularly in ecdysone signaling.
The authors employ a combination of RNAi knockdown, CRISPR-Cas9 gene editing, and rescue experiments, offering a comprehensive approach to studying Nup107's developmental function.
The study effectively connects Nup107 to ecdysone signaling, a key regulator of developmental transitions, offering novel insights into the molecular mechanisms controlling metamorphosis.
The use of exogenous 20-hydroxyecdysone (20E) and activation of the MAP kinase pathway provides a strong mechanistic perspective, suggesting that Nup107 may influence EcR signaling and ecdysone biosynthesis.
Weaknesses:
The authors do not sufficiently address the potential off-target effects of RNAi, which could impact the validity of their findings. Alternative approaches, such as heterozygous or clonal studies, could help confirm the specificity of the observed phenotypes.
This is a very valid point, and we are aware of the consequences of the off-target effects of RNAi. To confirm that the phenotypes reflect on-target RNAi and to minimize off-target effects, we have used two RNAi lines (Nup107<sup>GD</sup> and Nup107<sup>KK</sup>) against Nup107. Both RNAi lines induced comparable levels of Nup107 reduction, and using these lines, ubiquitous and PG-specific knockdown produced similar phenotypes. Although the Nup107<sup>GD</sup> line exhibited a relatively stronger knockdown compared to the Nup107<sup>KK</sup> line, we preferentially used the Nup107<sup>KK</sup> line because the Nup107<sup>GD</sup> line is based on a P-element insertion, and the exact landing site is unknown. Furthermore, there is an off-target predicted for the Nup107<sup>GD</sup> line, where a 19 bp sequence aligns with the bifocal (bif) sequence. The bif-encoded protein is involved in axon guidance and regulation of axon extension. However, the Nup107<sup>KK</sup> line does not have a predicted off-target molecule, and we know its precise landing site on the second chromosome. Thus, the Nup107<sup>KK</sup> line was ultimately used in experimentation for its clearer and more reliable genetic background.
We are also investigating Nup107 knockdown in the prothoracic gland, which exhibits polyteny. Additionally, the number of cells in the prothoracic gland is quite limited, approximately 50-60 cells (8). Given this, there is a possibility that a clonal study may not yield the phenotype. However, we will also consider this approach.
NPC Complex Specificity: While the authors focus on Nup107, it remains unclear whether the observed defects are specific to this nucleoporin or if other NPC components also contribute to similar defects. Demonstrating similar results with other NPC components would strengthen their claims.
We thank the reviewer for raising this concern. When working with a Nup complex such as the Nup107 complex, this concern is anticipated but difficult to address, as many Nups function beyond their complex identity. Our observations with all other members of the Nup107-complex, including dELYS, suggest that none of them, except Nup107, could induce larval developmental arrest. Since the study is primarily focused on the Nup107 complex (outer ring complex) of the NPC, we have not examined other nucleoporins outside of this complex.
Although the authors show that Nup107 depletion disrupts EcR signaling, the precise molecular mechanism by which Nup107 influences this process is not fully explored. Further investigation into how Nup107 regulates EcR nuclear translocation or ecdysone biosynthesis would improve the clarity of the findings.
We appreciate the concern raised. Based on our observations, we have proposed an upstream effect of Nup107 on the PTTH-torso-20E-EcR axis regulating developmental transitions. We know that Nup107 regulates torso levels, but we do not know if Nup107 directly interacts with torso. We would also like to address whether Nup107 controls PTTH levels.
We must emphasize that Nup107 does not directly regulate the translocation of EcR. On the contrary, we have demonstrated that EcR translocation is 20E dependent and Nup107 independent. Through our observations, we have argued that Nup107 regulates the expression of Halloween genes required for ecdysone biosynthesis. We are interested in identifying if Nup107 associates directly or through some protein to chromatin to bring about the changes in gene expression required for normal development.
There are some typographical errors and overly strong phrases, such as "unequivocally demonstrate," which could be softened. Additionally, the presentation of redundant data in different tissues could be streamlined to enhance clarity and flow.
We thank the reviewer for this observation. We will remove all typographical errors and make reasonable statements based on our conclusions.
References:
(1) Varghese, Jishy, and Stephen M Cohen. “microRNA miR-14 acts to modulate a positive autoregulatory loop controlling steroid hormone signaling in Drosophila.” Genes & development vol. 21,18 (2007): 2277-82. doi:10.1101/gad.439807
(2) Rewitz, Kim F et al. “The insect neuropeptide PTTH activates receptor tyrosine kinase torso to initiate metamorphosis.” Science (New York, N.Y.) vol. 326,5958 (2009): 1403-5. doi:10.1126/science.1176450
(3) Pan, Xueyang, and Michael B O'Connor. “Coordination among multiple receptor tyrosine kinase signals controls Drosophila developmental timing and body size.” Cell reports vol. 36,9 (2021): 109644. doi:10.1016/j.celrep.2021.109644
(4) Pascual-Garcia, Pau et al. “Metazoan Nuclear Pores Provide a Scaffold for Poised Genes and Mediate Induced Enhancer-Promoter Contacts.” Molecular cell vol. 66,1 (2017): 63-76.e6. doi:10.1016/j.molcel.2017.02.020
(5) Pascual-Garcia, Pau et al. “Nup98-dependent transcriptional memory is established independently of transcription.” eLife vol. 11 e63404. 15 Mar. 2022, doi:10.7554/eLife.63404
(6) Kadota, Shinichi et al. “Nucleoporin 153 links nuclear pore complex to chromatin architecture by mediating CTCF and cohesin binding.” Nature communications vol. 11,1 2606. 25 May. 2020, doi:10.1038/s41467-020-16394-3
(7) Gozalo, Alejandro et al. “Core Components of the Nuclear Pore Bind Distinct States of Chromatin and Contribute to Polycomb Repression.” Molecular cell vol. 77,1 (2020): 67-81.e7. doi:10.1016/j.molcel.2019.10.017
(8) Shimell, MaryJane, and Michael B O'Connor. “Endoreplication in the Drosophila melanogaster prothoracic gland is dispensable for the critical weight checkpoint.” microPublication biology vol. 2023 10.17912/micropub.biology.000741. 21 Feb. 2023, doi:10.17912/micropub.biology.000741
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1:
Summary:
In this study, the authors propose a "unifying method to evaluate inter-areal interactions in different types of neuronal recordings, timescales, and species". The method consists of computing the variance explained by a linear decoder that attempts to predict individual neural responses (firing rates) in one area based on neural responses in another area.
The authors apply the method to previously published calcium imaging data from layer 4 and layers 2/3 of 4 mice over 7 days, and simultaneously recorded Utah array spiking data from areas V1 and V4 of 1 monkey over 5 days of recording. They report distributions over "variance explained" numbers for several combinations: from mouse V1 L4 to mouse V1 L2/3, from L2/3 to L4, from monkey V1 to monkey V4, and from V4 to V1. For their monkey data, they also report the corresponding results for different temporal shifts. Overall, they find the expected results: responses in each of the two neural populations are predictive of responses in the other, more so when the stimulus is not controlled than when it is, and with sometimes different results for different stimulus classes (e.g., gratings vs. natural images).
Strengths:
(1) Use of existing data.
(2) Addresses an interesting question.
Unfortunately, the method falls short of the state of the art: both generalized linear models (GLMs), which have been used in similar contexts for at least 20 years (see the many papers, both theoretical and applied to neural population data, by e.g. Simoncelli, Paninski, Pillow, Schwartz, and many colleagues dating back to 2004), and the extension of Granger causality to point processes (e.g. Kim et al. PLoS CB 2011). Both approaches are substantially superior to what is proposed in the manuscript, since they enforce non-negativity for spike rates (the importance of which can be seen in Figure 2AB), and do not require unnecessary coarse-graining of the data by binning spikes (the 200 ms time bins are very long compared to the time scale on which communication between closely connected neuronal populations within an area, or between related areas, takes place).
We thank the reviewer for this suggestion. Our goal was to use a simple and unified linear ridge regression framework that can be applied to both calcium imaging (mouse) and MUAe (monkey) data.
We will perform a GLM-based analysis enforcing non-negativity as suggested, including in the GLM any additional available variables that may contribute to the neuronal responses.
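For readers unfamiliar with the linear ridge framework referred to above, the following is a minimal, dependency-free Python sketch of the idea: fit ridge weights from a source population to one target neuron on training trials, then score held-out trials. The toy data, the penalty value, the absence of an intercept, and the definition of explained variance as 1 - SS_res/SS_tot are all assumptions made for illustration; they are not taken from the manuscript, and a real analysis would typically rely on an established numerical library.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_ridge(x_rows, y, lam):
    """Weights minimizing ||Xw - y||^2 + lam * ||w||^2 (data assumed centered)."""
    p = len(x_rows[0])
    gram = [[sum(r[i] * r[j] for r in x_rows) + (lam if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(x_rows, y)) for i in range(p)]
    return solve_linear(gram, xty)

def predict(x_rows, weights):
    return [sum(w * v for w, v in zip(weights, row)) for row in x_rows]

def explained_variance(y_true, y_pred):
    """1 - SS_res / SS_tot on held-out trials (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy data: two source "neurons"; the target equals 2*x1 - x2 on every trial.
train_x = [[1, 0], [0, 1], [1, 1], [-1, 0], [0, -1], [-1, -1]]
train_y = [2 * a - b for a, b in train_x]
test_x = [[2, 1], [-1, 2], [1, -2]]
test_y = [2 * a - b for a, b in test_x]

weights = fit_ridge(train_x, train_y, lam=0.1)
ev = explained_variance(test_y, predict(test_x, weights))
```

The penalty shrinks the weights slightly below the generating values, so held-out explained variance is close to, but not exactly, 1. Note that nothing in this linear model prevents negative predictions, which is the limitation that a GLM with a non-negative output nonlinearity is meant to address.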
We also would like to note that:
● Macaque data: Our MUAe data are binned at 25 ms, not 200 ms. We used the envelope of multi-unit activity as reported in the original study [1]. We did not perform spike sorting on these data and therefore, strictly speaking, this is not a point process and methods developed for point processes are not directly applicable.
● Mouse data: The Stringer et al. dataset [2,3] uses two-photon calcium imaging sampled at 2.5 or 3 Hz. Additionally, responses were computed by averaging two frames per stimulus (yielding an effective bin size of 800 ms or 666 ms, respectively), dictated by acquisition constraints. We will emphasize the low temporal resolution of these signals as a limitation in the discussion section, but we cannot improve the temporal resolution with our analyses. These signals are not point processes either (although there is a correlation between two-photon calcium signals and spike rates).
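The effective bin sizes quoted in the two points above follow directly from the frame rate and the number of frames averaged. A small Python check of that arithmetic (the function name is illustrative):

```python
def effective_bin_ms(frame_rate_hz, frames_averaged):
    """Duration in milliseconds spanned by `frames_averaged` consecutive frames."""
    return 1000.0 * frames_averaged / frame_rate_hz

bin_at_2_5_hz = effective_bin_ms(2.5, 2)  # two frames at 2.5 Hz
bin_at_3_hz = effective_bin_ms(3.0, 2)    # two frames at 3 Hz

# Ratio to the 25 ms bins used for the macaque MUAe data.
ratio_to_muae = bin_at_2_5_hz / 25.0
```

Two frames at 2.5 Hz span 800 ms and two frames at 3 Hz span roughly 667 ms, so the mouse bins are more than 25 times longer than the 25 ms macaque bins.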
Regardless of these considerations, the reviewer’s points are well taken, and we will conduct additional analyses as described above.
In terms of analysis results, the work in the manuscript presents some expected and some less expected results. However, because the monkey data are based on only one monkey (misleadingly, the manuscript consistently uses the plural ‘monkeys’), none of the results specific to that monkey, nor the comparison of that one monkey to mice, are supported by robust data.
We will add data from at least two more monkeys, as suggested by the reviewer:
● First, we will include a second monkey from the same dataset [1]. The reason this monkey was not included in the original submission is that the dataset for this second monkey consisted of much less data than the original. For example, for the lights-off condition, the number of V4 channels with signal-to-noise ratio greater than 2 (recommended electrodes to use by dataset authors) is 9-12 in this second monkey, compared to 68-74 in the first monkey [1]. However, we will still add results for this second monkey.
● Additionally, we will include data from a new monkey by collaborating with the Ponce lab who will collect new data for this study.
One of the main results for mice (bimodality of explained variance values, mentioned in the abstract) does not appear to be quantified or supported by a statistical test.
We appreciate this point. We will conduct statistical tests to quantify the degree of bimodality and clarify these findings in the results.
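One simple way to put a number on bimodality is the moment-based bimodality coefficient, sketched below in Python. This is offered only as an illustration of what such a quantification can look like; it is a descriptive heuristic rather than a formal hypothesis test (Hartigan's dip test is one formal alternative), and the response does not state which test will be used. The sample values are invented.

```python
def bimodality_coefficient(values):
    """Moment-based bimodality coefficient: (skewness^2 + 1) / kurtosis.

    Uses population moments and non-excess kurtosis. Values above 5/9, the
    value for a uniform distribution, are commonly read as suggestive of
    bimodality.
    """
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; coefficient is undefined")
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return (skewness ** 2 + 1.0) / kurtosis

# Invented examples: two well-separated clusters vs. a single bell-shaped cluster.
two_clusters = [0.0] * 50 + [1.0] * 50
one_cluster = [0.0] + [1.0] * 4 + [2.0] * 6 + [3.0] * 4 + [4.0]

bc_two = bimodality_coefficient(two_clusters)
bc_one = bimodality_coefficient(one_cluster)
threshold = 5.0 / 9.0
```

Here the two-cluster sample gives a coefficient of 1.0 and the single-cluster sample gives 0.4, falling on opposite sides of the 5/9 benchmark.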
Moreover, the two data sets differ in too many aspects to allow for any conclusions about whether the comparisons reflect differences in species (mouse vs. monkey), anatomy (L2/3-L4 vs. V1-V4), or recording technique (calcium imaging vs. extracellular spiking).
We agree that the methodological and anatomical differences between the mouse and monkey datasets make any direct cross-species comparisons hard to interpret. We explicitly discuss this point in the Discussion section, and we will add a section within the Discussion entitled “Limitations of this study”. We will further emphasize that our goal is not to attempt a direct quantitative comparison across species, and that the two experiments differ in terms of: (i) recording modalities (calcium vs. electrophysiology) and the associated differences in temporal resolution, neuronal types, and SNR, (ii) cortical targets (layers vs. areas), (iii) sample size, (iv) stimuli, and (v) task conditions. In the revised manuscript, we will further highlight that our primary aim is to investigate inter-areal interactions within each species rather than to draw comparisons across species.
Reviewer #2:
Summary:
In this work, the authors investigated the extent of shared variability in cortical population activity in the visual cortex in mice and macaques under conditions of spontaneous activity and visual stimulation. They argue that by studying the average response to repeated presentations of sensory stimuli, investigators are discounting the contribution of variable population responses that can have a significant impact at the single trial level. They hypothesized that, because these fluctuations are to some degree shared across cortical populations depending on the sources of these fluctuations and the relative connectivity between cortical populations within a network, one should be able to predict the response in one cortical population given the response of another cortical population on a single trial, and the degree of predictability should vary with factors such as retinotopic overlap, visual stimulation, and the directionality of canonical cortical circuits.
To test this, the authors analyzed previously collected and publicly available datasets. These include calcium imaging of the primary visual cortex in mice and electrophysiology recordings in V1 and V4 of macaques under different conditions of visual stimulation. The strength of this data is that it includes simultaneous recordings of hundreds of neurons across cortical layers or areas. However, the weaknesses of calcium dynamics (which has lower temporal resolution and misses some non-linear dynamics in cortical activity) and multi-unit envelope activity (which reflects fluctuations in population activity rather than the variance in individual unit spike trains), underestimate the variability of individual neurons. The authors deploy a regression model that is appropriate for addressing their hypothesis, and their analytic approach appears rigorous and well-controlled.
We agree that both calcium imaging and multi-unit envelope recordings have inherent limitations in capturing the variability of individual neuron spiking. Among other factors, the slower temporal resolution of calcium signals can blur fast spiking events, and multi-unit envelopes can mask single-unit heterogeneity. In the Discussion, we will explicitly mention these modality-specific caveats and note that our approach is meant to capture shared variability at the population level rather than the fine temporal structure of individual neurons and individual spikes.
From their analysis, they found that there was significant predictability of activity between layer II/III and layer IV responses in mice and V1 and V4 activity in macaques, although the specific degree of predictability varied somewhat with the condition of the comparison with some minor differences between the datasets. The authors deployed a variety of analytic controls and explored a variety of comparisons that are both appropriate and convincing that there is a significant degree of predictability in population responses at the single trial level consistent with their hypothesis. This demonstrates that a significant fraction of cortical responses to stimuli is not due solely to the feedforward response to sensory input, and if we are to understand the computations that take place in the cortex, we must also understand how sensory responses interact with other sources of activity in cortical networks. However, the source of these predictive signals and their impact on function is only explored in a limited fashion, largely due to limitations in the datasets. Overall, this work highlights that, beyond the traditionally studied average evoked responses considered in systems neuroscience, there is a significant contribution of shared variability in cortical populations that may contextualize sensory representations depending on a host of factors that may be independent of the sensory signals being studied.
We will include a section within the Discussion to emphasize the limitations in the datasets used in this study. We also agree and appreciate the reviewer’s description and will borrow some of the reviewer’s terminology to provide context in the Discussion section.
The different recording modalities and comparisons (within vs. across cortical areas) limit the interpretability of the inter-species comparisons.
We agree that the methodological and anatomical differences between the mouse and monkey datasets make any direct cross-species comparisons hard to interpret. We explicitly discuss this point in the Discussion section and will add a section within the Discussion entitled “Limitations of this study”. We will further emphasize that our goal is not to attempt a direct quantitative comparison across species, and that the two experiments differ in terms of: (i) recording modalities (calcium vs. electrophysiology) and associated differences in temporal resolution, neuronal types, and SNR, (ii) cortical targets (layers vs. areas), (iii) sample size, (iv) stimuli, and (v) task conditions. In the revised manuscript, we will also highlight that our primary aim is to investigate inter-areal interactions within each species rather than to draw comparisons across species.
Strengths:
This work considers a variety of conditions that may influence the relative predictability between cortical populations, including receptive field overlap, latency that may reflect feed-forward or feedback delays, and stimulus type and sensory condition. Their analytic approach is well-designed and statistically rigorous. They acknowledge the limitations of the data and do not over-interpret their findings.
Weaknesses:
The different recording modalities and comparisons (within vs. across cortical areas) limit the interpretability of the inter-species comparisons. The mechanistic contributions of known sources or correlates of shared variability (eye movements, pupil fluctuations, locomotion, whisking behaviors) were not considered, and these could be driving or a reflection of much of the predictability observed and explain differences in spontaneous and visual activity predictions.
We also appreciate this important point. We agree that multiple behavioral factors may significantly contribute to shared variability. In our analyses of the mouse data, we addressed non-visual influences by projecting out “non-visual ongoing neuronal activity” (as shown in Figure 6C, following the approach in Stringer et al. 2019). Additionally, we will further evaluate the contribution of the behavioral measures available in the open dataset (such as running speed, whisking, pupil area, and “eigenface” components) to the predictivity of neuronal responses.
For the macaque data, the head-fixed and eye-fixation conditions help minimize some of these other potential behavioral contributions. Moreover, we have performed comparisons of eyes-open versus eyes-closed conditions (see Figure 5D). We will also analyze pupil size specifically for the lights-off condition. We do not have access to any other behavioral data from monkeys.
Previous work has explored correlations in activity between areas on various timescales, but this work only considered a narrow scope of timescales.
We appreciate this suggestion. We will perform additional analyses to evaluate predictivity at different temporal scales, as suggested.
The observation that there is some degree of predictability is not surprising, and it is unclear whether changes in observed predictability with analysis conditions are informative of a particular mechanism or just due to differences in the variance of activity under those conditions. Some of these issues could be addressed with further analysis, but some may be due to limitations in the experimental scope of the datasets and would require new experiments to resolve.
Our initial analyses in Fig.6A examined the effect of variance in activity and predictability in mice. As the reviewer intuited, there is a correlation between variance and predictability, at least when presenting a stimulus. Importantly, however, this is not the case when predicting activity in the absence of any stimulus. In the macaque, we cannot compute the variance across stimuli in the checkerboard case (single stimulus), but we will compute it for the conditions of the 4 moving bars. In addition, inspired by the reviewer’s question, we will perform an analysis where we further normalize the variance in activity.
We would like to note that our key contribution is not to merely show that some degree of predictability is possible (which we agree is not surprising) but rather: (i) to use a simple approach to quantify this predictability, (ii) to assess directional differences in predictability, (iii) to evaluate how this predictability depends on neuronal properties and receptive field overlap, (iv) how it depends on the stimuli, and, importantly, (v) to compare predictability during visual stimulation versus absence of visual input.
We agree with the limitations in the datasets. We will include a section within the Discussion to emphasize these limitations.
Reviewer #3:
Neural activity in the visual cortex has primarily been studied in terms of responses to external visual stimuli. While the noisiness of inputs to a visual area is known to also influence visual responses, the contribution of this noisy component to overall visual responses has not been well characterized.
In this study, the authors reanalyze two previously published datasets - a Ca++ imaging study from mouse V1 and a large-scale electrophysiological study from monkey V1-V4. Using regression models, they examine how neural activity in one layer (in mice) or one cortical area (in monkeys) predicts activity in another layer or area. Their main finding is that significant predictions are possible even in the absence of visual input, highlighting the influence of non-stimulus-related downstream activity on neural responses. These findings can inform future modeling work of neural responses in the visual cortex to account for such non-visual influences.
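The kind of inter-population regression summarized above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the choice of ridge regression, the regularization strength `alpha`, the five-fold split, and the simulated data from the hypothetical helper `simulate` are all assumptions made here for the example. It fits a linear map from a predictor population to a target population and reports held-out explained variance per target neuron.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    """Closed-form ridge weights mapping X (trials x predictors) to Y (trials x targets)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def cross_validated_ev(X, Y, alpha=1.0, n_folds=5, seed=0):
    """Held-out explained variance for each target neuron, averaged over folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), n_folds)
    scores = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Center with training-set means only, so no test information leaks in.
        mu_x, mu_y = X[train].mean(0), Y[train].mean(0)
        W = ridge_fit(X[train] - mu_x, Y[train] - mu_y, alpha)
        pred = (X[test] - mu_x) @ W + mu_y
        resid = ((Y[test] - pred) ** 2).sum(0)
        total = ((Y[test] - Y[test].mean(0)) ** 2).sum(0)
        scores.append(1.0 - resid / total)
    return np.mean(scores, axis=0)

def simulate(n_trials=400, n_pred=30, n_targ=10, seed=1):
    """Hypothetical data: two populations sharing a 3-dimensional latent signal plus private noise."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_trials, 3))
    X = latent @ rng.normal(size=(3, n_pred)) + rng.normal(size=(n_trials, n_pred))
    Y = latent @ rng.normal(size=(3, n_targ)) + rng.normal(size=(n_trials, n_targ))
    return X, Y

X, Y = simulate()
ev = cross_validated_ev(X, Y)
# Shuffling trials in the predictor population breaks trial-by-trial covariation
# and gives a chance-level baseline for comparison.
ev_shuffled = cross_validated_ev(np.random.default_rng(2).permutation(X), Y)
print("mean EV:", ev.mean(), "shuffled:", ev_shuffled.mean())
```

Comparing against a trial-shuffled baseline is one simple way to check that the measured predictability reflects shared single-trial variability rather than a property of the fitting procedure.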
A major weakness of the study is that the analysis includes data from only a single monkey. This makes it hard to interpret the data as the results could be due to experimental conditions specific to this monkey, such as the relative placement of electrode arrays in V1 and V4.
We will add data from at least two more monkeys, as suggested by the reviewer:
● First, we will include a second monkey from the same dataset [1]. The reason this monkey was not included in the original submission is that the dataset for this second monkey consisted of much less data than the original. For example, for the lights-off condition, the number of V4 channels with signal-to-noise ratio greater than 2 (recommended electrodes to use by dataset authors) is 9-12 in this second monkey, compared to 68-74 in the first monkey [1]. However, we will still add results for this second monkey.
● Additionally, we will include data from a new monkey by collaborating with the Ponce lab who will collect new data for this study.
The authors perform a thorough analysis comparing regression-based predictions for a wide variety of combinations of stimulus conditions and directions of influence. However, the comparison of stimulus types (Figure 4) raises a potential concern. It is not clear if the differences reported reflect an actual change in predictive influence across the two conditions or if they stem from fundamental differences in the responses of the predictor population, which could in turn affect the ability to measure predictive relationships. The authors do control for some potential confounds such as the number of neurons and self-consistency of the predictor population. However, the predictability seems to closely track the responsiveness of neurons to a particular stimulus. For instance, in the monkey data, the V1 neuronal population will likely be more responsive to checkerboards than to single bars. Moreover, neurons that don't have the bars in their RFs may remain largely silent. Could the difference in predictability be just due to this? Controlling for overall neuronal responsiveness across the two conditions would make this comparison more interpretable.
This is also a valid concern. As the reviewer noted, we controlled for the number of neurons and degree of self-consistency (Fig. 3A, 3C), and this was always done within their respective stimulus type.
As the reviewer intuits, in Fig. 6A in mice, we show that predictability correlates with neuronal responsiveness. This observation only held during the stimulus condition and not during the gray screen condition. We also showed correlations with self-consistency metrics as a proxy for responsiveness in Fig. 6A and 6C. However, we will directly assess the impact of responsiveness in two ways: (i) by correlating predictability directly with neuronal responsiveness and (ii) by following the same subsampling approach in Fig. 3 to normalize the degree of responsiveness and recompute the predictability metrics.
REFERENCES
(1) Chen, X., Morales-Gregorio, A., Sprenger, J., Kleinjohann, A., Sridhar, S., van Albada, S.J., Grün, S., and Roelfsema, P.R. (2022). 1024-channel electrophysiological recordings in macaque V1 and V4 during resting state. Sci Data 9, 77. https://doi.org/10.1038/s41597-022-01180-1.
(2) Stringer, C., Pachitariu, M., Steinmetz, N., Carandini, M., and Harris, K.D. (2019). High-dimensional geometry of population responses in visual cortex. Nature 571, 361–365. https://doi.org/10.1038/s41586-019-1346-5.
(3) Stringer, C., Pachitariu, M., Carandini, M., and Harris, K. (2018). Recordings of 10,000 neurons in visual cortex in response to 2,800 natural images. (Janelia Research Campus). https://doi.org/10.25378/janelia.6845348.v4.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Although the reviewers found our work interesting, they raised several important concerns about our study. To address these concerns, we mainly performed new experiments. The most important changes are highlighted in the summary paragraphs.
First, in response to Reviewer 1’s suggestions, we conducted the SFN experiments more systematically. We further confirmed the mechanism of SFN-activated TFEB in HeLa NPC1 cells with new experiments, including the effects of BAPTA-AM (a calcium chelator), FK506+CsA (calcineurin inhibitors), and NAC (a ROS scavenger) on SFN-induced TFEB nuclear translocation in HeLa NPC1 cells (New Fig. S3), and the effect of SFN on NPC1 expression (New Fig. S5). In particular, we examined the colocalization of DiO (a PM marker) staining and surface LAMP1 staining in HeLa NPC1 cells under SFN treatment to confirm PM exocytosis. We also thoroughly checked the accuracy of the wording in the main text and figure legends. Hence, we have significantly improved the presentation and clarity in the revision.
Second, in response to Reviewer 2’s suggestions, we performed additional experiments using TFEB-KO cells to demonstrate the role of TFEB in SFN-evoked lysosomal exocytosis (New Fig. S7B). In TFEB-KO cells, the increase in surface LAMP1 signal upon SFN treatment was significantly reduced, suggesting that SFN induces exocytosis in a TFEB-dependent manner. We also investigated the effect of U18666A on CF555-dextran endocytosis. By examining the localization of CF-dex and LAMP1, we found that CF555 is present in the lysosome under U18666A treatment (Fig. for reviewers only, A, B), suggesting that NPC1 deficiency/U18666A treatment has no effect on CF-dex endocytosis.
Third, in response to Reviewer 3’s suggestions, and in addition to the experiments performed in response to the other reviewers, we measured the cytotoxicity of SFN at the concentrations used in this study in various cell lines (New Fig. S10).
In addition, according to the reviewers’ suggestions, we made clarifications and corrections wherever appropriate in the manuscript.
Reviewer #1 (Public review):
Summary:
The authors are trying to determine if SFN treatment results in dephosphorylation of TFEB, subsequent activation of autophagy-related genes, exocytosis of lysosomes, and reduction in lysosomal cholesterol levels in models of NPC disease.
Strengths:
(1) Clear evidence that SFN results in translocation of TFEB to the nucleus.
(2) In vivo data demonstrating that SFN can rescue Purkinje neuron number and weight in NPC1<sup>-/-</sup> animals.
Thank you for the support!
Weaknesses:
(1) Lack of molecular details regarding how SFN results in dephosphorylation of TFEB leading to activation of the aforementioned pathways. Currently, datasets represent correlations.
Thank you for raising this critical point! The reviewer is right that this manuscript did not say much about the molecular mechanism of SFN-evoked TFEB activation, because we explored that mechanism in our previous study (Li, Shao et al. 2021). There, we showed that SFN activates TFEB via a ROS-Ca<sup>2+</sup>-calcineurin-dependent but MTOR-independent pathway (Li, Shao et al. 2021). In the current manuscript, we cited this paper but did not describe the mechanism in detail, which evidently confused the reviewers. Therefore, in the revised manuscript we have added more details on the molecular mechanism of SFN-activated TFEB. We also further confirmed this mechanism in HeLa NPC1 cells with new experiments, including the effects of BAPTA-AM (a calcium chelator), FK506+CsA (calcineurin inhibitors), and NAC (a ROS scavenger) on SFN-induced TFEB nuclear translocation in NPC cells (New Fig. S3).
(2) Based on the manuscript narrative, discussion, and data it is unclear exactly how steady-state cholesterol would change in models of NPC disease following SFN treatment. Yes, there is good evidence that lysosomal flux to (and presumably across) the plasma membrane increases with SFN. However, lysosomal biogenesis genes also seem to be increasing. Given that NPC inhibition, NPC1 knockout, or NPC1 disease mutations are constitutively present and the cell models of NPC disease contain lysosomes (even with SFN) how could a simple increase in lysosomal flux decrease cholesterol levels? It would seem important to quantify the number of lysosomes per cell in each condition to begin to disentangle differences in steady state number of lysosomes, number of new lysosomes, and number of lysosomes being exocytosed.
Thank you for this constructive comment. According to our data, SFN reduced cholesterol levels in NPC1 cells by inducing lysosomal exocytosis and increasing lysosomal biogenesis. We understand the reviewer’s point that it would be very helpful to distinguish the three pools: the original lysosomes, the newly formed lysosomes, and the lysosomes being exocytosed. Unfortunately, owing to technical limitations, there is so far no appropriate method, to our knowledge, that can clearly determine which pool a given lysosome comes from. We hope that future techniques will allow us to explore this mechanism.
(3) Lack of evidence supporting the authors' premise that "SFN could be a good therapeutic candidate for neuropathology in NPC disease".
Suggestion was taken! We removed this sentence. Thanks!
Reviewer #2 (Public review):
(4) The in vivo experiments demonstrate the therapeutic potential of SFN for NPC. A clear dose response analysis would further strengthen the proposed therapeutic mechanism of SFN.
Thank you for this constructive suggestion. We examined the effect of two doses of SFN, 30 and 50 mg/kg, on NPC mice. As shown in Fig. 6, SFN at 50 mg/kg, but not at 30 mg/kg, prevented a degree of Purkinje cell loss in lobule IV/V of the cerebellum, suggesting a dose-correlated preventive effect of SFN. In future studies, we will continue optimizing the dosage form and amount of SFN and perform a dose-response analysis.
(5) Additional data supporting the activation of TFEB by SFN for cholesterol clearance in vivo would strengthen the overall impact of the study.
We thank the reviewer for this constructive comment. We detected a significant decrease in pS211-TFEB protein in brain tissues of NPC mice upon SFN treatment compared with vehicle, suggesting, for the first time, that SFN activates TFEB in brain tissue. It would be worthwhile to further examine lysosomal cholesterol levels in brain tissue to show the direct effect of SFN. However, in our hands and in the literature, filipin does not seem suitable for detecting lysosomal cholesterol accumulation in brain tissue. So far, there is no good method to directly measure lysosomal cholesterol in tissue.
(6) In Figure 4, the authors demonstrate increased lysosomal exocytosis and biogenesis by SFN in NPC cells. Including a TFEB-KO/KD in this assay would provide additional validation of whether these effects are TFEB-dependent.
Great suggestion! We investigated the role of TFEB in SFN-evoked lysosomal exocytosis using TFEB-KO cells. As shown in New Suppl. Fig. 7B, in TFEB-KO cells the increase in surface LAMP1 signal upon SFN (15 μM, 12 h) treatment was significantly reduced, suggesting that SFN induces exocytosis in a TFEB-dependent manner.
(7) For lysosomal pH measurement, the combination of pHrodo-dex and CF-dex enables ratiometric pH measurement. However, the pKa of pHrodo red-dex (according to Invitrogen) is ~6.8, while lysosomal pH is typically around 4.7. This discrepancy may account for the lack of observed lysosomal pH changes between WT and U18666A-treated cells. Notably, previous studies (PMID: 28742019) have reported an increase in lysosomal pH in U18666A-treated cells.
We understand the reviewer’s point. However, as stated in the Methods and main text, we used pHrodo™ Green-Dextran (P35368, Invitrogen) rather than pHrodo Red-dextran. According to the product information from Invitrogen, pHrodo Green-dex conjugates are non-fluorescent at neutral pH but fluoresce bright green at acidic pH around 4, such as that in endosomes and lysosomes. Therefore, pHrodo Green-dex is suitable for monitoring the acidity of lysosomes (Hu, Li et al. 2022). We also used LysoTracker Red DND-99 (Thermo Scientific, L7528) to measure lysosomal pH (Fig. 4G, H), and the results are consistent with those from the pHrodo Green/CF measurement.
The reviewer mentioned that previous studies have reported an increase in lysosomal pH in U18666A-treated cells. We understand this concern. However, in our hands, using two lysosomal pH sensors, we did not detect a change in lysosomal pH in U18666A-treated NPC1 cell models.
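The reviewer's concern in point (7), that a probe whose pKa lies far from lysosomal pH reports pH changes poorly, can be illustrated with a short calculation. This is a sketch under a strong simplifying assumption: the probe is treated as a single-site acid obeying the Henderson-Hasselbalch relation, which real pHrodo conjugates need not follow. The pKa of 6.8 and the lysosomal pH of 4.7 are the values quoted by the reviewer; the pKa of 5.0 and the shifted pH of 5.5 are hypothetical values chosen only for comparison.

```python
def protonated_fraction(ph, pka):
    """Fraction of a single-site probe in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def signal_change(ph_before, ph_after, pka):
    """Change in protonated fraction when pH moves from ph_before to ph_after."""
    return protonated_fraction(ph_before, pka) - protonated_fraction(ph_after, pka)

# Reviewer-quoted values: pKa ~6.8, lysosomal pH ~4.7.
# Hypothetical values for comparison: pKa 5.0, alkalinized pH 5.5.
for pka in (6.8, 5.0):
    change = signal_change(4.7, 5.5, pka)
    print(f"pKa {pka}: protonated fraction changes by {change:.3f}")
```

Under this simplified model, a probe with a pKa near 6.8 is almost fully protonated across the whole 4.7 to 5.5 range, so its signal barely moves, whereas a probe with a pKa closer to lysosomal pH responds strongly to the same shift.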
(8) The authors are also encouraged to perform colocalization studies between CF-dex and a lysosomal marker, as some researchers may be concerned that NPC1 deficiency could reduce or block the trafficking of dextran along endocytosis.
Thank you for raising this important point; the suggestion was taken! We investigated the effect of NPC1 deficiency on CF555-dextran trafficking into lysosomes by examining the localization of CF-dex and LAMP1. To clearly define whether CF555-dex is present in the lysosome, we first used apilimod to enlarge lysosomes and then examined the relative position of CF555-dex and LAMP1. As shown in Author response image 1A, B, in HeLa cells treated with U18666A, CF555 signals (red) are clearly present inside lysosomes (LAMP1-labelled lysosomal membrane, green signal), suggesting that CF555-dex endocytosis is not affected by NPC1 deficiency (U18666A treatment).
Author response image 1.
The effect of NPC1 deficiency on CF555 endocytosis. HeLa cells were transiently transfected with LAMP1-GFP plasmid for 24 h. Cells were then treated with apilimod (100 nM) for 2 h to enlarge the lysosomes, followed by co-treatment with U18666A (2.5 μM, 24 h) and CF555 (12 h). (A) Each panel shows fluorescence images taken by confocal microscopy. (B) Each panel shows the fluorescence intensity of a line scan (white line) through the double-labeled object indicated by the white arrow. Scale bar, 20 μm or 2 μm (for zoom-in images).
(9) In vivo data supporting the activation of TFEB by SFN for cholesterol clearance would significantly enhance the impact of the study. For example, measuring whole-animal or brain cholesterol levels would provide stronger evidence of SFN's therapeutic potential.
We really appreciate the reviewer’s comments. Please see response to point #5.
Reviewer #3 (Public review):
(10) The manuscript is extremely hard to read due to the writing; it needs careful editing for grammar and English.
We apologize for the defects in the writing and grammar. We have thoroughly checked the grammar and polished the English to improve the manuscript.
(11) There are a number of important technical issues that need to be addressed.
We will address the technical issues mentioned in the following questions.
(12) The TFEB influence on filipin staining in Figure 1A is somewhat subtle. In the mCherry alone panels there is a transfected cell with no filipin staining and the mCherry-TFEBS211A cells still show some filipin staining.
Thank you for raising this point. The reviewer is right that not all mCherry-alone cells show the same level of filipin signal, and not all mCherry-TFEBS211A-transfected cells completely lack filipin signal. The statistical results were from randomly selected cells from 3 independent experiments. To avoid confusion, we have included more cells in the statistical analysis to cover all the conditions, as shown in the new Fig. 1B. We hope this clarifies the point.
(13) Figure 1C is impressive for the upregulation of filipin with U18666A treatment. However, SFN is used at 15 microM. This must be hitting multiple pathways. Vauzour et al (PMID: 20166144) use SFN at 10 nM to 1microM. Other manuscripts use it in the low microM range. The authors should repeat at least some key experiments using SFN at a range of concentrations from perhaps 100 nM to 5 microM. The use of 15 microM throughout is an overall concern.
We chose this concentration of SFN based on our previous study (Li, Shao et al. 2021), in which we showed that SFN (10–15 μM, 2–9 h) induces robust TFEB nuclear translocation in a dose- and time-dependent manner in HeLa cells as well as in other human cell lines without cytotoxicity. Also, tissue concentrations of SFN can reach 3–30 μM upon broccoli consumption (Hu, Khor et al. 2006), so we used a low micromolar concentration of SFN (15 μM) in our study. Moreover, we further confirmed that SFN (15 μM) induces TFEB nuclear translocation in HeLa NPC1 cells (Fig. 1F, G, Fig. 2B, G) and that this concentration of SFN has no cytotoxicity (New Fig. S10).
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
The following comments are designed to improve and focus the authors' work.
(14) Related to data in Figure 1. The mechanism through which TFEB can reduce Filipin in U18 conditions is unclear. Inhibition of NPC1 results in hyperactivation of mTOR through cholesterol transport at ER-Lysosome contacts (see Zoncu group publications). If mTORC is hyperactive in NPC disease models, TFEB would be expected to remain cytoplasmic and not enter the nucleus as the representative image in Figure 1A demonstrates.
In our previous study, we showed that SFN induces TFEB nuclear translocation in an mTOR-independent manner (Li, Shao et al. 2021). Consistent with this result, in this study we confirmed that SFN-induced TFEB nuclear translocation is mTOR-independent in NPC1 cells (Now Fig. S4A, B). Thus, SFN induced TFEB nuclear translocation in various NPC cells (Fig. 1F, G, Fig. 2B, G). Please also see the discussion of the mechanism of SFN in response to point #1.
(15) Therefore, how does overexpression of TFEB, which remains in the cytoplasm, result in a decreased filipin signal? Similar questions relate to Figure 1C-H.
Medina et al. (Medina, Fraldi et al. 2011) show that TFEB overexpression (not activation, so the overexpressed TFEB is in the cytoplasm) increases the pool of lysosomes in the proximity of the plasma membrane and promotes their fusion with the PM by raising intracellular Ca<sup>2+</sup> levels through the lysosomal Ca<sup>2+</sup> channel MCOLN1, leading to increased lysosomal exocytosis. Hence, TFEB overexpression alone (without TFEB activation) could reduce the filipin signal by increasing lysosomal exocytosis, and treatment with a TFEB agonist such as SFN could further boost this increase.
(16) It would seem appropriate to measure the NPC1 and NPC2 proteins using western blot to ensure that SFN-dependent clearance of cholesterol is not due to enhanced expression of the native protein in U18-treated cells or enhanced folding of the protein in patient fibroblasts.
Thank you for this constructive comment! NPC1 gene mutations account for about 95% of NPC cases and NPC2 mutations for about 5%, and in this study we focused on NPC1 deficiency. We therefore measured the effect of SFN on the expression of NPC1 in human NPC1-patient fibroblasts. Western blot analysis showed that SFN (15 μM, 24 h) treatment did not affect NPC1 expression in human NPC1-patient fibroblasts (new Fig. S5).
(17) Related to data in Figures 1C-E. Controls are missing related to the effect SFN has on steady-state cholesterol levels. This may be insightful in providing information on the mode of action of this compound.
Suggestion was taken! We have added the SFN-only control in new Fig. 1C-E.
(18) The mechanism that links SFN to TFEB-dependent translocation is suggested to involve calcineurin-dependent dephosphorylation of TFEB. However, no data is provided. It would seem important to identify the mechanism(s) through which SFN positively regulates TFEB location. This would shift the manuscript and its model from correlations to causation. Experiments involving calcineurin inhibitors, or agonists of TRPML1 that have been reported as being a key source of Ca<sup>2+</sup> for calcineurin activation, may provide molecular insight.
Please see the paragraph in response to point #1.
(19) Related to Figure 4. Using a plasma membrane counterstain to quantify plasma membrane LAMP1 would increase the rigor of the analysis.
Great idea! We examined the colocalization of DiO (a PM marker) staining and LAMP1 staining in HeLa NPC1 cells under SFN treatment. As shown in new Fig. 4A, the surface LAMP1 signal (red) colocalized with DiO (green), a PM marker.
(20) Related to Figure 5. How do the authors explain the kinetic disparity between SFN treatment for 24 vs 72 hrs? IF TFEB is activated and promoting lysosomal biogenesis and increased lysosomal flux across the PM, why does cholesterol accumulation lag? Perhaps related to this point. Are other cholesterol metabolizing enzymes that may have altered activity in NPC sensitive to SFN? A similar comment applies to the Sterol regulatory element binding protein pathway, which has been shown to be activated in models of NPC disease.
We understand the reviewer’s point. As shown in Fig. 5C, D, in NPC1<sup>-/-</sup> MEF cells, SFN treatment for 24 h showed relatively weaker cholesterol clearance compared with its effects in human cells (Fig. 1C, D, Fig. 2E, I). We therefore explored a longer SFN treatment of 72 h (fresh SFN in medium was added every 24 h), and the 72 h treatment produced a substantial cholesterol reduction (Fig. 5C, D). This difference could be attributed to the continuous action of SFN, which could prolong exocytosis and lead to more effective cholesterol clearance. In the DMSO-treated MEF cells, the cholesterol levels are similar at 24 and 72 h; thus, 24 h of U18666A treatment already reaches the upper limit of accumulated cholesterol, and a longer treatment time would not change the cholesterol levels. Thus, there is no lag in cholesterol accumulation.
We did not investigate whether SFN regulates other cholesterol-metabolizing enzymes or sterol regulatory element binding proteins, although we cannot rule out this possibility. In this study we focus mainly on cholesterol clearance by SFN via TFEB-mediated pathways. According to our data, TFEB KO significantly diminished SFN-evoked cholesterol clearance. Hence, the effects of other cholesterol-metabolizing enzymes or sterol regulatory element binding proteins may not be as important as that of TFEB and are therefore outside the scope of this study. In the future, we may explore the possible involvement of other pathways in SFN’s effects.
(21) Related to Figure 7. The western blots for pS211-TFEB are poor. It's suggested that whole blots are shown to increase rigor.
Thank you for the comments. We now present the blots with more of the surrounding area to increase the rigor.
(22) Data demonstrating the ability of SFN to improve Purkinje cell survival are exciting and pair well with the weight analysis, however, to address the overall goal of determining if "SFN could be a good therapeutic candidate for neuropathology in NPC disease" survival analysis should be tested as well.
Please see the paragraph in response to point #3.
Minor
(23) Throughout the manuscript many different Fonts and font sizes are used. This is very jarring to readers. It is suggested that a more uniform approach is taken to presenting these nice datasets.
We apologize for these oversights. We have thoroughly checked the whole manuscript to make sure that the fonts and font sizes are consistent.
(24) Related to data presentation. In general, there is a lack of alignment and organization of the figures.
So sorry about this. We have reorganized the figures to get them better aligned.
(25) Line 149, SFN is missing.
Corrected!
Reviewer #3 (Recommendations for the authors):
(26) In Figure 3 the authors should use multiple single siRNAs or perform a functional rescue to determine specificity.
We understand the reviewer’s point. We designed several siRNAs and validated their efficiency. We then chose the siRNA with the best knockdown efficiency for this study, and the knockdown by this siTFEB was validated by Western blot, as shown in Fig. 3A. Furthermore, we used TFEB knockout cells constructed by CRISPR/Cas9 to further examine the role of TFEB in SFN-induced cholesterol clearance (Fig. 3D). Consistent with the results in the siTFEB-transfected HeLa NPC1 cells (Fig. 3B, C), SFN failed to diminish cholesterol in HeLa TFEB KO cells. The result from the TFEB KO cells is even more convincing than the siRNA experiment. We also performed a functional rescue by re-expressing TFEB in TFEB KO cells, in which SFN-induced cholesterol clearance was restored (Fig. 3E, F). Collectively, these data indicate that TFEB is required for lysosomal cholesterol reduction upon SFN treatment. Thus, we did not repeat this rescue experiment in the siTFEB-transfected HeLa NPC1 cells.
(27) The label for 3D is missing.
Corrected! Thanks!
(28) Figure 4, although the authors use an antibody against the luminal domain of LAMP1 there could still be some permeabilization. A marker of the plasma membrane would be helpful.
Please see the response to point #19.
(29) Figure 4, cholesterol in the media because of lysosome exocytosis. This is where the high concentration of SFN is of concern. Is there any cell death that could explain the result? The authors should test for cell death with the SFN treatment.
Thank you for raising this important point! We have measured the cytotoxicity of SFN at the concentrations used in this study in various cell lines (New Fig. S10). Please also see the paragraph in response to point #13.
(30) The blot in Figure 6A is unclear. It is very hard to see any change in pS211-TFEB levels, and, given the blurry signal, the detection of phospho-TFEB is uncertain.
Please see the summary paragraph in response to point #21.
References:
Hu, M. Q., P. Li, C. Wang, X. H. Feng, Q. Geng, W. Chen, M. Marthi, W. L. Zhang, C. L. Gao, W. Reid, J. Swanson, W. L. Du, R. Hume and H. X. Xu (2022). "Parkinson's disease-risk protein TMEM175 is a proton-activated proton channel in lysosomes." Cell 185(13): 2292-+.
Hu, R., T. O. Khor, G. Shen, W. S. Jeong, V. Hebbar, C. Chen, C. Xu, B. Reddy, K. Chada and A. N. Kong (2006). "Cancer chemoprevention of intestinal polyposis in ApcMin/+ mice by sulforaphane, a natural product derived from cruciferous vegetable." Carcinogenesis 27(10): 2038-2046.
Li, D., R. Shao, N. Wang, N. Zhou, K. Du, J. Shi, Y. Wang, Z. Zhao, X. Ye, X. Zhang and H. Xu (2021). "Sulforaphane Activates a lysosome-dependent transcriptional program to mitigate oxidative stress." Autophagy 17(4): 872-887.
Medina, D. L., A. Fraldi, V. Bouche, F. Annunziata, G. Mansueto, C. Spampanato, C. Puri, A. Pignata, J. A. Martina, M. Sardiello, M. Palmieri, R. Polishchuk, R. Puertollano and A. Ballabio (2011). "Transcriptional activation of lysosomal exocytosis promotes cellular clearance." Dev Cell 21(3): 421-430.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Fuchs describes a novel method of enzymatic protein-protein conjugation using the enzyme Connectase. The author is able to make this process irreversible by screening different Connectase recognition sites to find an alternative sequence that is also accepted by the enzyme. They are then able to selectively render the byproduct of the reaction inactive, preventing the reverse reaction, and add the desired conjugate with the alternative recognition sequence to achieve near-complete conversion. I agree with the authors that this novel enzymatic protein fusion method has several applications in the field of bioconjugation, ranging from biophysical assay conduction to therapeutic development. Previously the author has published on the discovery of the Connectase enzymes and has shown its utility in tagging proteins and detecting them by in-gel fluorescence. They now extend their work to include the application of Connectase in creating protein-protein fusions, antibody-protein conjugates, and cyclic/polymerized proteins. As mentioned by the author, enzymatic protein conjugation methods can provide several benefits over other non-specific and click chemistry labeling methods. Connectase specifically can provide some benefits over the more widely used Sortase, depending on the nature of the species that is desired to be conjugated. However, due to a similar lengthy sequence between conjugation partners, the method described in this paper does not provide clear benefits over the existing SpyTag-SpyCatcher conjugation system. Additionally, specific disadvantages of the method described are not thoroughly investigated, such as difficulty in purifying and separating the desired product from the multiple proteins used. Overall, this method provides a novel, reproducible way to enzymatically create protein-protein conjugates.
The manuscript is well-written and will be of interest to those who are specifically working on chemical protein modifications and bioconjugation.
I'd like to comment on two points.
(1) The benefits over the SpyTag-SpyCatcher system. Here, the conjugation partners are fused via the 12.3 kDa SpyCatcher protein, which is considerably larger than the Connectase fusion sequence (19 aa). This is mentioned in the introduction (p. 1 ln 24-26). Furthermore, SpyTag-SpyCatcher fusions are truly irreversible, while Connectase/BcPAP fusions may be reversed (p. 8, ln 265-273). For example, target proteins (e.g., AGAFDADPLVVEI-Protein) may be covalently fused to functionalized magnetic beads (e.g., Bead-ELASKDPGAFDADPLVVEI) in order to perform a pulldown assay. After the assay, the target protein and any bound interactors could be released from the beads by the addition of a Connectase / peptide (AGAFDAPLVVEI) mixture.
In a related technology, the SpyTag-SpyCatcher system was split into three components, SpyLigase, SpyTag and KTag (Fierer et al., PNAS 2014). The resulting method introduces a sequence between the fusion partners (SpyTag (13aa) + KTag (10aa)), which is similar in length to the Connectase fusion sequence (p. 8, ln 297 - 298). Compared to the original method, however, this approach seems to require longer incubation times, while yielding less fusion product (Fierer et al., Figure 2).
(2) Purification of the fusion product. The method is actually advantageous in this respect, as described in the discussion (p. 8, ln 258-264). Examples are now provided in Figure 6.
Reviewer #2 (Public review):
Summary:
Unlike previous traditional protein fusion protocols, the author claims their proposed new method is fast, simple, specific, reversible, and results in a complete 1:1 fusion. A multi-disciplinary approach from cloning and purification, biochemical analyses, and proteomic mass spec confirmation revealed fusion products were achieved.
Strengths:
The author provides convincing evidence that an alternative to traditional protein fusion synthesis is more efficient with 100% yields using connectase. The author optimized the protocol's efficiency with assays replacing a single amino acid and identification of a proline aminopeptidase from Bacillus coagulans (BcPAP) as a usable enzyme in the fusion reaction. Multiple examples including Ubiquitin, GST, and antibody fusion/conjugations reveal how this method can be applied to a diverse range of biological processes.
Weaknesses:
Though the ~100% ligation efficiency is an advancement, the long recognition linker may be the biggest drawback. For large native proteins that are challenging/cannot be synthesized and require multiple connectase ligation reactions to yield a complete continuous product, the multiple interruptions with long linkers will likely interfere with protein folding, resulting in non-native protein structures. This method will be a good alternative to traditional approaches as the author mentioned but limited to generating epitope/peptide/protein tagged proteins, and not for synthetic protein biology aimed at examining native/endogenous protein function in vitro.
The assessment is fair, and I have no further comments to add.
Reviewer #1 (Recommendations for the authors):
Major/Experimental Suggestions:
(1) Throughout the paper only one reaction shown via gels had 100% conversion to desired product (Figure 3C). It is misleading to title a paper with absolutes such as "100% product yield", when the majority of reactions show >95% product yield, without any purification. Please change the title of the manuscript to something along the lines of "Novel Irreversible Enzymatic Protein Fusions with Near-Complete Product Yield".
The conjugation reaction is thermodynamically favored. It is driven by the hydrolysis of a peptide bond (P|GAFDADPLVVEI), which typically releases 8 - 16 kJ/mol of energy. This should result in a >99.99% complete reaction (ΔG° = -RT ln (Product/Educt)). In line with this, 99% - 100% of the less abundant educts (LysS, Figure 3A; MBP, Figure 3B; Ub-Strep, Figure 3C) are converted in the time courses (Figure 3D-F show different reaction conditions, which slow down conjugate formation). 100% conversion is also shown in Figure 5, Figure 6, and Figure S4. Likewise, an LC-MS analysis (Figure S2) showed 99.6% relative fusion product signal intensity after 4 h reaction time (0.13% and 0.25% educts). In this experiment, the proline had been removed from 99.8% of the peptide byproducts (P|GAFDADPLVVEI). It is clear that this reaction is still ongoing and that >99.99% of the prolines will be removed from the peptides in time. These findings suggest that the conjugation reaction gradually slows down as less educt becomes available, but eventually reaches completion.
For some experiments, lower product yields (e.g. 97% in Figure 3B) are reported in the paper. These were calculated with Yield = 100% x Product / (Educt 1 + Educt 2 + Product). With this formula, 100% conjugation can only be achieved with exactly equimolar educt quantities, because both educt 1 and educt 2 need to be converted entirely. If one educt is available in excess, for example because of protein concentration measurement inaccuracies or pipetting errors, some of it will be left without a fusion partner. In the case of Figure 3B, the mixture seems to have contained 3% more GST. These are methodological inaccuracies.
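The effect of a small educt excess on this yield formula can be illustrated with a short sketch. It assumes a strict 1:1 fusion that runs until the limiting educt is used up, and it treats band intensities as proportional to molar amounts; the function names and the arbitrary units are illustrative only and do not come from the manuscript.

```python
def reported_yield(educt1, educt2, product):
    """Yield = 100% x Product / (Educt 1 + Educt 2 + Product)."""
    return 100.0 * product / (educt1 + educt2 + product)


def max_yield(educt1_start, educt2_start):
    """Reported yield once the limiting educt is fully consumed (1:1 fusion assumed)."""
    product = min(educt1_start, educt2_start)
    return reported_yield(educt1_start - product, educt2_start - product, product)


# Hypothetical starting amounts in arbitrary units.
equimolar = max_yield(1.00, 1.00)  # both educts fully converted
excess = max_yield(1.03, 1.00)     # 3% excess of one educt remains unfused

print(round(equimolar, 1), round(excess, 1))
```

Under these assumptions, a 3% excess of one educt caps the reported yield at about 97% even when the limiting educt is converted completely.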
(2) Please provide at least one example of a purified desired product, and mention the difficulties involved as a disadvantage to this particular method. Separating BcPAP, Connectase, and the desired protein-protein conjugate may prove to be quite difficult, especially when Connectase cleaves off affinity tags.
Examples are now provided in Figure 6. As described in the discussion (p. 8, ln 258-264), the simple product purification is one of the advantages of the method.
(3) For the antibody conjugate, please provide an example of conjugating an edduct that would prove to be more useful in the context of antibodies. For example, as you mention in the introduction, conjugation of fluorophores, immobilization tags such as biotin, and small molecule linker/drugs are useful bioconjugates to antibodies.
Antibody-biotinylation is now shown in Figure S6; Antibody-fluorophore conjugates are part of Figures S5 and S7.
(4) Please assess the stability of these protein-protein conjugates under various conditions (temperature, pH, time) to ensure that the ligation via Connectase is stable over a broad array of conditions. In particular, a relevant antibody-conjugate stability assay should be done over the period of 1-week in both buffer and plasma to show applicability for potential therapeutics.
The stability of an antibody-biotin conjugate in blood plasma over 7 days at different temperatures is now shown in Figure S7.
Generally, Connectase introduces a regular peptide bond (Asp-Ala) with a high chemical and physical stability (e.g. 10 min incubation at 95°C in SDS-PAGE loading buffer; H2O-formic acid / acetonitrile gradients for LC-MS). The sequence may be susceptible to proteases, although this is not the case in HEK293 cells (antibody expression), E. coli, or blood plasma (Figure S7).
(5) Please conduct functional assays with the antibody-protein/peptide conjugates to show that the antibody retains binding capabilities to the HER-2 antigen and the modification was site-selective, not interfering with the binding paratope or binding ability of the antibody in any way. This can be done through bio-layer interferometry, surface plasmon resonance, ELISA, etc.
We plan to immobilize the HER2 antibody on microplates and use it in an ELISA. However, this experiment requires significant testing and optimization. It will be part of a future paper on the use of Connectase for protein immobilization.
For now, the mass spectrometry data provide clear evidence of a single site-selective conjugation, as the C-terminal ELASKDPGAFDADPLVVEI-Strep sequence is replaced by ELASKDAGAFDADPLVVEI(-Ub). Given that the conjugation sites at the C-termini are far from the antigen binding sites, and have already been used in a number of other approaches (e.g., SpyTag, SnapTag, Sortase), it appears unlikely that these conjugations interfere with antigen binding.
(6) Please include gels of all proteins used in ligation reactions after purification steps in the SI to show that each species was pure.
The pure proteins are now shown in Figure S9.
(7) Please provide the figures (not just tables) of LC/MS deconvoluted mass spectra graphs for all conjugates, either in the main text or the SI.
Please specify which spectra you are missing. I believe all relevant spectra are shown in Figures 4, 5, and S3. The primary data can be found in Dataset S2.
(8) Please provide more information in the methods section on exactly how the densitometry quantification of gel bands was performed with ImageJ.
Details on the quantification with Image Studio Lite 5.2 were added in the method section (p. 17, ln 461-463).
Minor Suggestions:
(1) Page 1, line 19: can include one sentence on what assays these particular bioconjugations are useful for (e.g. internalization cell studies, binding assays, etc.)
I prefer not to provide additional details here to keep the text concise and focused.
(2) Page 1, line 22: "three to ten equivalents" instead of 3x-10x.
Done.
(3) Page 1, line 23: While NHS labeling is widely considered non-specific, maleimide conjugation to free cysteines is generally considered specific for engineered free cysteine residues, since native proteins often do not have free cysteine residues available for conjugation. If you are referring to the potential of maleimides to label lysines as well, that should be specifically stated.
I modified the sentence, which now states that these methods "can be" unspecific.
As pointed out, it is possible to achieve specificity by eliminating all other free cysteines and/or engineering a cysteine in an appropriate position. In many other cases, however (e.g., natural antibodies), several cysteines are available, or the sample contains other proteins/peptides. I did not want to go into more detail here and refer to the cited review.
(4) Page 1, line 31: "and an oligoglycine G(1-5)-B"
Done.
(5) Page 1, line 34: It is not clear where in the source these specific Km values are coming from, considering these are variable based on specific conditions/substrates and tend to be reaction-specific.
I cited another review, which lists the same values, along with a few other measurements (Jacobitz et al., Adv Protein Chem Struct Biol 2017, Table 2). It is clear that each of these measurements differs somewhat, but they are generally comparable (K<sub>M</sub>(LPETG) = 5500 - 8760 µM; K<sub>M</sub>(GGGGG) = 140 - 196 µM). I chose the cited study (Frankel et al., Biochemistry 2005), because it also investigated hydrolysis rates. In this study, the measurements are derived from the plots in Figure 2.
(6) Page 1, line 47: the comparison to western blots feels a little like apples to oranges, even though this comparison was made in previous literature. Engineering an expressed protein to have this tag and then using the tag to detect and quantify it, feels more akin to a tagging/pull down assay than a western blot in which unmodified proteins are easily detected.
It is akin to a frequently used type of western blot with tag-specific antibodies, e.g. Anti-His<sub>6</sub>, -Streptavidin, -HA, -cMyc, -Flag. I modified the sentence to clarify this.
(7) Page 2, line 51: "Connectase cleaves between the first D and P amino acids in the recognition sequence, resulting in an N-terminal A-ELASKD-Connectase intermediate and a C-terminal PGAFDADPLVVEI peptide."
I prefer the current sentence, because we assume that a bond between the aspartate and Connectase is formed before PGAFDADPLVVEI is cleaved off.
(8) Page 3, line 94: "Exact determination is not possible due to reversibility of the reaction", the way it is stated now sounds like it is a flaw in the methods. Also, update Figure 2 to read "Estimated relative ligation rate".
Done.
(9) Page 3, lines 101-107: This is worded in a confusing way. It can either be X<sub>1</sub> or X<sub>2</sub> that is inactivated depending on if the altered amino acid is on the original protein sequence or on the desired edduct to conjugate. You first give examples of how to render other amino acids inactive, but then ultimately state that proline made inactive, so separate the two distinct possibilities a bit more clearly.
The reaction requires the inactivation of X<sub>1</sub>, without affecting X<sub>2</sub> (ln 100 - 102). This is true, no matter whether it is X<sub>1</sub> = A, C, S, or P that is inactivated. I added a sentence to clarify this (ln 102 – 103).
(10) Page 4, line 118: Give a one-sentence justification for why these proteins were chosen to work with (easy to express, stable, etc).
Done.
(11) Page 5, line 167: "payload molecules".
Done.
(12) Page 5, lines 170-173: Word this more clearly- "full conversion with many of these methods is difficult on antibodies due to each heavy and light chain being modified separately, resulting in only a total yield of 66% DAR4 even when 90% of each chain is conjugated."
I rephrased the section.
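The arithmetic behind the 66% figure quoted in this point can be checked with a minimal sketch. It assumes four conjugation sites per antibody (two heavy and two light chains) that are modified independently with the same efficiency; the function name is illustrative.

```python
def fraction_fully_conjugated(per_site_efficiency, sites=4):
    """Fraction of antibodies with every site modified, assuming independent sites."""
    return per_site_efficiency ** sites


# 90% conjugation per chain, four chains per antibody (assumption stated above).
dar4_fraction = fraction_fully_conjugated(0.90)
print(f"{dar4_fraction:.1%}")
```

With 90% efficiency per chain, only about two thirds of the antibodies carry all four payloads, which is why near-complete conversion per site matters for homogeneous conjugates.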
(13) Page 8, line 290: Discuss other disadvantages of this method including difficulties purifying and in incorporating such a long sequence into proteins of interest.
Product purification is shown in the new Figure 6. As stated above, I consider the simple purification process an advantage of the method. The genetic incorporation of the sequence into proteins is a routine process and should not pose any difficulties. The disadvantages of long linker sequences between fusion partners are now discussed (p. 8 – 9, ln 300-302).
(14) Page 10, line 341: "The experiment is described and discussed in detail in a previously published paper.31"
Done.
Reviewer #2 (Recommendations for the authors):
Minor Points:
(1) It's unclear how the author derived 100% ligation rate with X = Proline in Figure 2 when there is still residual unligated UB-Strep at 96h. Please provide an expanded explanation for those not familiar with the protocol. Is the assumption made that there will be no UB-Strep if the assay was carried out beyond 96h?
I clarified the figure legend. The assay shows the formation of an equilibrium between educts and products. Therefore, only ~50% Ub-Strep is used with X = Proline (see p. 2, ln 79 - 81). The "relative ligation rate" refers to the relative speed with which this equilibrium is established. The highest rate is seen with X = Proline, and it is set to 100%. The other rates are given relative to the product formation with X = Proline.
(2) Though the qualitative depiction of the data in Figure 3 is appreciated, an accompanying graphical representation of the data in the same figure will greatly enhance reception and better comprehension of several of the author's conclusions.
Graphs are now shown in Figure S1.
(3) Figure 3 panel E is misaligned. Please align it with panel B above it.
Done, thank you.
(4) The author refers to 'The resulting circular assemblies (37% UB2...)' in the text but identifies it as UB-C2 in Figure 5B. Is this a mistake or does UB2 refer to another assembly not mentioned in the Figures? Please check for inconsistencies.
All circular assemblies are now labeled Ub-C<sub>1-6</sub>.
(5) Finishing with a graphical schematic that depicts the entire protocol in a simple image would be much appreciated and well-received by readers. Including the scheme with A and B proteins, the recognition linkers, the addition of connectase and BcPAP, etc. to the final resulting protein with connected linker.
A graphical summary of the reaction is now included in Figure 6.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this manuscript, Fuchsberger et al. demonstrate a set of experiments that ultimately identifies the de novo synthesis of GluA1-, but not GluA2-containing Ca2+ permeable AMPA receptors as a key driver of dopamine-dependent LTP (DA-LTP) during conventional post-before-pre spike-timing dependent (t-LTD) induction. The authors further identify adenylate cyclase 1/8, cAMP, and PKA as the crucial mitigators of these actions. While some comments have been identified below, the experiments presented are thorough and address the aims of the manuscript, figures are presented clearly (with minor comments), and experimental sample sizes and statistical analyses are suitable. Suitable controls have been utilized to confirm the role of Ca2+ permeable AMPAR. This work provides a valuable step forward built on convincing data toward understanding the underlying mechanisms of spike-timing-dependent plasticity and dopamine.
Strengths:
Appropriate controls were used.
The flow of data presented is logical and easy to follow.
The quality of the data, except for a few minor issues, is solid.
Weaknesses:
The drug treatment duration of anisomycin is longer than the standard 30-45 minute duration (as is the 500 µM vs 40 µM concentration) typically used in the field. Given the long-term toxicity of these kinds of drugs, it's unclear why the authors used such a long and intense drug treatment.
In an initial set of control experiments (Figure S1C-D) we wanted to ensure that protein synthesis was definitely blocked and therefore used a relatively high concentration of anisomycin and a relatively long pre-incubation period. We agree with the Reviewer that we cannot exclude the possibility that this treatment could compromise cell health in addition to the protein synthesis block. Therefore, we carried out an additional experiment with an alternative protein synthesis inhibitor, cycloheximide, at a lower standard concentration (10 µM), which confirmed a significant reduction in the puromycin signal (Figure S1A-B). Together these results support the conclusion that the puromycin signal is specific to protein synthesis in our labelling assay.
Furthermore, in the electrophysiology experiments, we used 500 μM anisomycin in the patch pipette solution. Under these conditions, we recorded a stable EPSP baseline for 60 minutes, indicating that the treatment did not cause toxic effects to the cell (Figure S1F). This high concentration would ensure an effective block of local translation at dendritic sites. Nevertheless, we also carried out this experiment with cycloheximide at a lower standard concentration (10 µM) and observed a similar result with both protein synthesis inhibitors (Figure 1F).
With some of the normalizations (such as those in S1) there are dramatic differences in the baseline "untreated" puromycin intensities - raising some questions about the overall health of slices used in the experiments.
We agree with the Reviewer that there is a large variability in the normalised puromycin signal which might be due to variability in the health of slices. However, we assume that the same variability would be present in the treated slices, which showed, despite the variability, a significant inhibition of protein synthesis. To avoid any bias by excluding slices with low puromycin signal in the control condition, we present the full dataset.
The large set of electrophysiology experiments carried out in our study (all recorded cells were evaluated for healthy resting membrane potential, action potential firing, and synaptic responses) confirmed that, generally, the vast majority of our slices were indeed healthy.
Reviewer #2 (Public Review):
Summary:
The aim was to identify the mechanisms that underlie a form of long-term potentiation (LTP) that requires the activation of dopamine (DA).
Strengths:
The authors have provided multiple lines of evidence that support their conclusions; namely that this pathway involves the activation of a cAMP / PKA pathway that leads to the insertion of calcium-permeable AMPA receptors.
Weaknesses:
Some of the experiments could have been conducted in a more convincing manner.
We carried out additional control experiments and analyses to address the specific points that were raised.
Reviewer #3 (Public Review):
The manuscript of Fuchsberger et al. investigates the cellular mechanisms underlying dopamine-dependent long-term potentiation (DA-LTP) in mouse hippocampal CA1 neurons. The authors conducted a series of experiments to measure the effect of dopamine on the protein synthesis rate in hippocampal neurons and its role in enabling DA-LTP. The key results indicate that protein synthesis is increased in response to dopamine and neuronal activity in the pyramidal neurons of the CA1 hippocampal area, mediated via the activation of adenylate cyclases subtypes 1 and 8 (AC1/8) and the cAMP-dependent protein kinase (PKA) pathway. Additionally, the authors show that postsynaptic DA-induced increases in protein synthesis are required to express DA-LTP, while not required for conventional t-LTP.
The increased expression of the newly synthesized GluA1 receptor subunit in response to DA supports the formation of homomeric calcium-permeable AMPA receptors (CP-AMPARs). This evidence aligns well with data showing that DA-LTP expression requires the GluA1 AMPA subunit and CP-AMPARs, as DA-LTP is absent in the hippocampus of a GluA1 genetic knock-out mouse model. Overall, the study is solid, and the evidence provided is compelling. The authors clearly and concisely explain the research objectives, methodologies, and findings. The study is scientifically robust, and the writing is engaging. The authors' conclusions and interpretation of the results are insightful and align well with the literature. The discussion effectively places the findings in a meaningful context, highlighting a possible mechanism for dopamine's role in the modulation of protein-synthesis-dependent hippocampal synaptic plasticity and its implications for the field. Although the study expands on previous works from the same laboratory, the findings are novel and provide valuable insights into the dynamics governing hippocampal synaptic plasticity.
The claim that GluA1 homomeric CP-AMPA receptors mediate the expression of DA-LTP is fascinating, and although the electrophysiology data on GluA1 knock-out mice are convincing, more evidence is needed to support this hypothesis. Western blotting provides useful information on the expression level of GluA1, which is not necessarily associated with cell surface expression of GluA1 and therefore CP-AMPARs. Validating this hypothesis by localizing the protein using immunofluorescence and confocal microscopy detection could strengthen the claim. The authors should briefly discuss the limitations of the study.
Although it would be possible to quantify the surface expression of GluA1 using immunofluorescence, it would not be possible to distinguish between GluA1 homomers and GluA1-containing heteromers. It would therefore not be informative as to whether these are indeed CP-AMPARs. This is an interesting problem, which we have briefly discussed in the Discussion section.
Additional comments to address:
(1) In Figure 2A, the representative image with PMY alone shows a very weak PMY signal. Consequently, the image with TTX alone seems to potentiate the PMY signal, suggesting a counterintuitive increase in protein synthesis.
We agree with the Reviewer that the original image was not representative and have replaced it with a more representative image.
(2) In Figures 3A-B, the Western blotting representative images have poor quality, especially regarding GluA1 and α-actin in Figure 3A. The quantification graph (Figure 3B) raises some concerns about a potential outlier in both the DA alone and DA+CHX groups. The authors should consider running a statistical test to detect outlier data. Full blot images, including ladder lines, should be added to the supplementary data.
We have replaced the western blot image in Figure 3A and have also presented full blot images including ladder lines in supplementary Figure S3.
Using the ROUT method (Q=1%) we identified one outlier in the DA+CHX group of the western blot quantification. The quantification for this blot was then removed from the dataset and the experiment was repeated to ensure a sufficient number of repeats.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) As the authors perform these experiments with puromycin, these are puromycylation experiments, not SUnSET. The SUnSET protocol (surface sensing of translation) specifically refers to the detection of newly synthesized proteins externally at the plasma membrane. I'd advise updating the terminology used.
We thank the Reviewer for pointing this out. We have updated this to ‘puromycin-based labelling assay’.
(2) The legend presented in Figure 2F suggests WT is green and ACKO is orange, however, in Figure 2G the WT LTP trace is orange, consider changing this to green for consistency.
We thank the Reviewer for this suggestion and agree that a matching colour scheme makes the Figure clearer. This has been updated.
(3) In the results section, it is recommended to include units for the values presented at the first instance and only again when the units change thereafter.
The units of the electrophysiology data were [%], this is included in the Results section. Results of western blots and IHC images were presented as [a.u.]. While we included this in the Figures, we have not specifically added this to the text of individual results.
(4) Two hours pre-treatment with anisomycin vs 30 minutes pretreatment with cycloheximide seems hard to directly compare - as the pharmacokinetics of translational inhibition should be similar for both drugs. What was the rationale for the extremely long anisomycin pretreatment? What controls were taken to assess slice health either prior to or following fixation? This is relevant to the below point (5).
In an initial set of control experiments (Figure S1C-D) we wanted to ensure that protein synthesis was definitely blocked and therefore used a relatively high concentration of anisomycin and a relatively long pre-incubation period. We agree with the Reviewer that we cannot exclude the possibility that this treatment could compromise cell health in addition to the protein synthesis block. Therefore, we carried out an additional experiment with an alternative protein synthesis inhibitor, cycloheximide, at a lower standard concentration (10 µM), which confirmed a significant reduction in the puromycin signal (Figure S1A-B). Together these results support the conclusion that the puromycin signal is specific to protein synthesis in our labelling assay.
IHC slices were visually assessed for health. The large set of electrophysiology experiments carried out in our study (all recorded cells were evaluated for healthy resting membrane potential, action potential firing, and synaptic responses) also confirmed that, generally, the vast majority of our slices were indeed healthy.
(5) In Supplementary Figure 1, there is a dramatic difference in the a.u. intensities across CHX (B) and AM (D), please explain the reason for this. It is understood these are normalised values to nuclear staining, please clarify if this is a nuclear area.
We agree with the Reviewer that there is a large variability in normalised puromycin signal which may be due to variability in the health of the slices. However, we assume that the same variability would be present in the treated slices, which showed, despite the variability, a significant effect of protein synthesis inhibition. To prevent introducing bias by excluding slices with low puromycin signal in the control condition, we present the full dataset.
The CA1 region of the hippocampus contains a dense layer of neuronal somata (pyramidal cell layer). We normalized against the nuclear area as it provides a reliable estimate of the number of neurons present in the image. This approach minimizes bias by accounting for variation in the number of neurons within the visual field, ensuring consistency and accuracy in our analysis.
(6) Please clarify the decision to average both the last 5 minutes of baseline recordings and the last 5 minutes of the recording for the normalisation of EPSP slopes.
The baseline usually stabilises after a few minutes of recording, thus the last 5 minutes were used for baseline measurement, which are the most relevant datapoints to compare synaptic weight change to. After induction of STDP, potentiation or depression of synaptic weights develops gradually. Based on previous results, evaluating the EPSP slopes at 30-40 minutes after the induction protocol gives a reliable estimate of the amount of plasticity.
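The normalisation described here can be sketched as follows. This is a minimal illustration with hypothetical data: one EPSP slope per minute, pairing at t = 0, a baseline window covering the last 5 minutes before pairing, and a test window covering minutes 35 to 40 after pairing. The window boundaries, function name, and values are assumptions for illustration, not the analysis code used in the study.

```python
from statistics import mean


def percent_of_baseline(times_min, slopes, baseline_window, test_window):
    """Mean EPSP slope in test_window as a percentage of the mean in baseline_window."""
    def window_mean(window):
        start, end = window
        return mean(s for t, s in zip(times_min, slopes) if start <= t < end)

    return 100.0 * window_mean(test_window) / window_mean(baseline_window)


# Hypothetical recording: stable baseline, then a gradual rise that plateaus at 30 min.
times = list(range(-10, 40))
slopes = [1.0] * 10 + [1.0 + 0.5 * min(t, 30) / 30 for t in range(0, 40)]

result = percent_of_baseline(times, slopes, (-5, 0), (35, 40))
print(round(result, 1))
```

Averaging over a window at the end of each period reduces the influence of single noisy sweeps and samples the response after the gradual change in synaptic weight has developed.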
Reviewer #2 (Recommendations For The Authors):
The concentration of anisomycin used (0.5 mM) is very high.
As described above, in an initial set of control experiments (Figure S1C-D) we wanted to ensure that protein synthesis was definitely blocked and therefore used a relatively high concentration of anisomycin and a relatively long pre-incubation period. We agree with the Reviewer that this is higher than the standard concentration used for this drug and we cannot exclude the possibility that this treatment could compromise cell health in addition to the protein synthesis block. Therefore, we carried out an additional experiment with an alternative protein synthesis inhibitor, cycloheximide, at a lower standard concentration (10 µM), which confirmed a significant reduction in the puromycin signal (Figure S1A-B). Together these results support the conclusion that the puromycin signal is specific to protein synthesis in our labelling assay.
Furthermore, in the electrophysiology experiments, we also used 500 µM anisomycin in the patch pipette solution. Under these conditions, we recorded a stable EPSP baseline for 60 minutes, indicating that the treatment did not cause toxic effects to the cell (Figure S1F). This high concentration would ensure an effective block of local translation at dendritic sites. Nevertheless, we also carried out this experiment with cycloheximide at a lower standard concentration (10 µM) and observed a similar result with both protein synthesis inhibitors (Figure 1F).
The authors conclude that the effect of DA is mediated via D1/5 receptors, which based on previous work seems likely. But they cannot conclude this from their current study which used a combination of a D1/D5 and a D2 antagonist.
We thank the Reviewer for pointing this out. We agree and have updated this in the Discussion section to ‘dopamine receptors’, without specifying subtypes.
There is no mention that I can see that the KO experiments were conducted in a blinded manner (which I believe should be standard practice). Did they verify the KOs using Westerns?
Only a subset of the experiments was conducted in a blinded manner. However, the results were collected by two independent experimenters (TF and ZB), who both observed significant effects in KO mice compared to WTs.
We received the DKO mice from a former collaborator, who verified expression levels of the KO mice (Wang et al., 2003). We verified DKO upon arrival in our facility using genotyping.
Maybe I'm misunderstanding but it appears to me that in Figure 1F there is LTP prior to the addition of DA. (The first point after pairing is already elevated). I think the control of pairing without DA should be added.
We thank the Reviewer for pointing this out. Based on previous results (Brzosko et al., 2015), we would expect potentiation to develop over time once DA is added after pairing; however, in the Figure it indeed appears as if there was an immediate increase in synaptic weights after pairing. It should be noted that when comparing the first 5 minutes after pairing to the baseline, this increase was not significant (t(9) = 1.810, p = 0.1037). Nevertheless, we rechecked our data and noticed that this initial potentiation was biased by one cell with an increasing baseline, which had both the test and control pathway strongly elevated. We had mistakenly included this cell in the dataset, despite the unstable conditions (as stated in the Methods section, the unpaired control pathway served as a stability control). We apologise for the error, which has now been corrected (Figure 1F). In addition, we present the control pathway in Figure S1G and I.
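For readers who wish to check how a reported t statistic maps onto its p value, the sketch below computes a two-tailed p value from Student's t distribution using only the standard library. It assumes the reported test was two-tailed; the function names and the choice of Simpson's rule are illustrative and are not taken from the manuscript.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, n=2000):
    """Two-tailed p value: 1 minus the area between -|t| and +|t|.
    The area is integrated with Simpson's rule (n must be even)."""
    t_abs = abs(t_stat)
    h = t_abs / n
    total = t_pdf(0.0, df) + t_pdf(t_abs, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # the density is symmetric about zero
    return 1 - central_area

p = two_tailed_p(1.810, 9)  # should land close to the reported p = 0.1037
```

With 9 degrees of freedom, the two-tailed 5% threshold is a t value of roughly 2.26, so a statistic of 1.810 falls short of significance, in line with the statement above.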
We have also now included the control for post-before-pre pairing (Δt = -20 ms) without dopamine in a supplemental figure (Figure S1E and F).
The Westerns (Figure 3A) are fairly messy. Also, it is better to quantify with total protein. Surface biotinylation of GluA1 and GluA2 would be more informative.
We carried out more repeats of Western blots and have exchanged blots in Figure 3A.
We observed that DA increases protein synthesis; we therefore cannot exclude the possibility that application of DA could also affect total protein levels. Thus, quantifying with total protein may not be the best choice here. Quantification with actin is standard practice.
While we agree with the Reviewer that surface biotinylation of GluA1 and GluA2 could in principle be more informative, we do not think it would work well in our experimental setup using acute slice preparation, as it strictly requires intact cells. Slicing generates damaged cells, which would take up the surface biotin reagents. This would cause unspecific biotinylation of the damaged cells, leading to a strong background signal in the assay.
In Figure 4 panels D and E the baselines are increasing substantially prior to induction. I appreciate that long stable baselines with timing-dependent plasticity may not be possible but it's hard to conclude what happened tens of minutes later when the baseline only appears stable for a minute or two. Panels A and B show that relatively stable baselines are achievable.
We agree with the Reviewer that the baselines are increasing. However, when looking at the baseline for the 5 minutes prior to induction (the last 5 datapoints of the baseline), which is what we used for quantification, the baselines appeared stable. Unfortunately, longer baselines are not suitable for timing-dependent plasticity. In addition, all experiments were carried out with a control pathway, which showed stable conditions throughout the recording.
In general, the discussion could be better integrated with the current literature. Their experiments are in line with a substantial body of literature that has identified two forms of LTP, based on these signalling cascades, using more conventional induction patterns.
We thank the Reviewer for this suggestion and have added more discussion of the two forms of LTP in the Discussion section.
It would be helpful to include the drug concentrations when first described in the results.
Drug concentrations have now been included in the Results section.
It is now more common to include absolute t values (not just <0.05 etc).
While we indicate significance in Figures using asterisks when p values are below the indicated significance levels, we report absolute values of p and t values in the Results section.
Similarly full blots should be added to an appendix / made available.
We have now included full blot images in Supplementary Figure S3.
A 30% tolerance for series resistance seems generous to me. (10-20% would be more typical).
We thank the Reviewer for their suggestion, and will keep this in mind for future studies. However, the error introduced by the higher tolerance level is likely to be small and would not influence any of the qualitative conclusions of the manuscript.
Whereas series resistance is of course extremely important in voltage-clamp experiments, changes in series resistance would be less of a concern in current-clamp recordings of synaptic events. We use the amplifier as a voltage follower, and there are two problems with changes in the electrode, or access, resistance. First, there is the voltage drop across the electrode resistance. Clearly this error is zero if no current is injected and is also negligible for the currents we use in our experiments to maintain the membrane voltage at -70 mV. For example, the voltage drop would be 0.2 mV for 20 pA current through a typical 10 MOhm electrode resistance, and a change in resistance of 30% would give less than 0.1 mV voltage change even if the resistance were not compensated. The second problem is distortion of the EPSP shape due to the low-pass filtering properties of the electrode set up by the pipette capacitance and series resistance (RC). This can be a significant problem for fast events, such as action potentials, but less of a problem for the relatively slow EPSPs recorded in pyramidal cells. Nevertheless, we take on board the advice provided by the Reviewer and will use the conventional tolerance of 20% in future experiments.
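The arithmetic in the paragraph above can be reproduced with a short sketch. The 20 pA holding current and 10 MOhm electrode resistance come from the text; the function names are illustrative, and the 5 pF pipette capacitance used for the filter estimate is a hypothetical value chosen only to show the order of magnitude.

```python
import math

def voltage_drop_mV(current_pA, resistance_MOhm):
    """Ohmic drop across the electrode: 1 pA x 1 MOhm = 1e-6 V = 1e-3 mV."""
    return current_pA * resistance_MOhm * 1e-3

def rc_cutoff_kHz(resistance_MOhm, capacitance_pF):
    """Cut-off frequency of a first-order RC low-pass filter.
    1 MOhm x 1 pF gives a time constant of 1 microsecond."""
    tau_us = resistance_MOhm * capacitance_pF
    return 1e3 / (2 * math.pi * tau_us)

drop = voltage_drop_mV(20, 10)               # 0.2 mV, as quoted in the text
drop_plus_30 = voltage_drop_mV(20, 10 * 1.3)  # resistance increased by 30%
extra_error = drop_plus_30 - drop             # about 0.06 mV, below 0.1 mV

# Hypothetical 5 pF pipette capacitance (an assumption, not from the text).
cutoff_before = rc_cutoff_kHz(10, 5)
cutoff_after = rc_cutoff_kHz(13, 5)
```

Under these assumed values the filter cut-off stays in the kilohertz range before and after a 30% resistance change, well above the frequency content of slow EPSPs, which is consistent with the argument that EPSP shape is little affected.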
Reviewer #3 (Recommendations For The Authors):
In the references, the entry for Burnashev N et al. has a different font size. Please ensure that all references are formatted consistently.
We thank the Reviewer for spotting this and have updated the font size of this reference.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
eLife Assessment
Birdsong production depends on precise neural sequences in a vocal motor nucleus HVC. In this useful biophysical model, Daou and colleagues identify specific biophysical parameters that result in sparse neural sequences observed in vivo. While the model is presently incomplete because it is overfit to produce sequences and therefore not robust to real biological variation, the model has the potential to address some outstanding issues in HVC function.
We are grateful for the extensive supportive comments from the reviewers, including broad, strong appreciation of the novel aspects of our manuscript. We believe these will be only strengthened in the next submission.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The paper presents a model for sequence generation in the zebra finch HVC, which adheres to cellular properties measured experimentally. However, the model is fine-tuned and exhibits limited robustness to noise inherent in the inhibitory interneurons within the HVC, as well as to fluctuations in connectivity between neurons. Although the proposed microcircuits are introduced as units for sub-syllabic segments (SSS), the backbone of the network remains a feedforward chain of HVC_RA neurons, similar to previous models.
Strengths:
The model incorporates all three of the major types of HVC neurons. The ion channels used and their kinetics are based on experimental measurements. The connection patterns of the neurons are also constrained by the experiments.
Weaknesses:
The model is described as consisting of micro-circuits corresponding to SSS. This presentation gives the impression that the model's structure is distinct from previous models, which connected HVC_RA neurons in feedforward chain networks (Jin et al 2007, Li & Greenside, 2006; Long et al 2010; Egger et al 2020). However, the authors implement single HVC_RA neurons into chain networks within each micro-circuit and then connect the end of the chain to the start of the chain in the subsequent micro-circuit. Thus, the HVC_RA neuron in their model forms a single-neuron chain. This structure is essentially a simplified version of earlier models.
In the model of the paper, the chain network drives the HVC_I and HVC_X neurons. The role of the micro-circuits is more significant in organizing the connections: specifically, from HVC_RA neurons to HVC_I neurons, and from HVC_I neurons to both HVC_X and HVC_RA neurons.
We thank Reviewer 1 for their thoughtful comments.
While the reviewer is correct that the propagation of sequential activity in this model is primarily carried by HVC<sub>RA</sub> neurons in a feed-forward manner, we need to emphasize that this is true only if there is no intrinsic or synaptic perturbation to the HVC network. For example, we showed in Figures 10 and 12 how altering the intrinsic properties of HVC<sub>X</sub> neurons or interneurons disrupts sequence propagation. In other words, while HVC<sub>RA</sub> neurons are the key forces that carry the chain forward, the interplay between excitation and inhibition in our network as well as the intrinsic parameters for all classes of HVC neurons are equally important forces in carrying the chain of activity forward. Thus, the stability of activity propagation necessary for song production depends on a finely balanced network of HVC neurons, with all classes contributing to the overall dynamics. Moreover, all existing models that describe premotor sequence generation in the HVC either assume a distributed model (Elmaleh et al., 2021), in which local HVC circuitry is not sufficient to advance the sequence but rather depends upon moment-to-moment feedback through Uva (Hamaguchi et al., 2016), or assume models that rely on intrinsic connections within HVC to propagate sequential activity. In the latter case, some models assume that HVC is composed of multiple discrete subnetworks that encode individual song elements (Glaze & Troyer, 2013; Long & Fee, 2008; Wang et al., 2008) but lack the local connectivity to link the subnetworks, while other models assume that HVC may have sufficient information in its intrinsic connections to form a single continuous network sequence (Long et al. 2010). The HVC model we present extends the concept of a feedforward network by incorporating additional neuronal classes that influence the propagation of activity (interneurons and HVC<sub>X</sub> neurons).
We have shown that any disturbance of the intrinsic or synaptic conductances of these latter neurons will disrupt activity in the circuit even when the properties of HVC<sub>RA</sub> neurons are maintained.
In regard to the similarities between our model and earlier models, several aspects of our model distinguish it from prior work. In short, while several models of how sequence is generated within HVC have been proposed (Cannon et al., 2015; Drew & Abbott, 2003; Egger et al., 2020; Elmaleh et al., 2021; Galvis et al., 2018; Gibb et al., 2009a, 2009b; Hamaguchi et al., 2016; Jin, 2009; Long & Fee, 2008; Markowitz et al., 2015), all the models proposed either rely on intrinsic HVC circuitry to propagate sequential activity, rely on extrinsic feedback to advance the sequence, or rely on both. These models do not capture the complex details of spike morphology, do not include the right ionic currents, do not incorporate all classes of HVC neurons, or do not generate realistic firing patterns as seen in vivo. Our model is the first biophysically realistic model that incorporates all classes of HVC neurons and their intrinsic properties. We tuned the intrinsic and the synaptic properties based on the traces collected by Daou et al. (2013) and Mooney and Prather (2005), as shown in Figure 3. The three classes of model neurons incorporated into our network, as well as the synaptic currents that connect them, are based on Hodgkin-Huxley formalisms that contain ion channels and synaptic currents which have been pharmacologically identified. This is an advancement over prior models that primarily focused on the role of synaptic interactions or external inputs. The model is based on a feedforward chain of microcircuits that encode the different sub-syllabic segments and that interact with each other through structured feedback inhibition, defining an ordered sequence of cell firing.
Moreover, while several models highlight the critical role of inhibitory interneurons in shaping the timing and propagation of bursts of activity in HVC<sub>RA</sub> neurons, our work offers a detailed and comprehensive model that helps explain the critical role played by inhibition in shaping song dynamics and ensuring sequence propagation.
How useful is this concept of micro-circuits? HVC neurons fire continuously even during the silent gaps. There are no SSS during these silent gaps.
Regarding the concern about the usefulness of the 'microcircuit' concept in our study, we appreciate the comment and we are glad to clarify its relevance in our network. While we acknowledge that HVC<sub>RA</sub> neurons interconnect microcircuits, our model's dynamics are still best described within the framework of microcircuitry particularly due to the firing behavior of HVC<sub>X</sub> neurons and interneurons. Here, we are referring to microcircuits in a more functional sense, rather than rigid, isolated spatial divisions (Cannon et al. 2015). A microcircuit in our model reflects the local rules that govern the interaction between all HVC neuron classes within the broader network, and that are essential for proper activity propagation. For example, HVC<sub>INT</sub> neurons belonging to any microcircuit burst densely and at times other than the moments when the corresponding encoded SSS is being “sung”. What makes a particular interneuron belong to this microcircuit or the other is merely the fact that it cannot inhibit HVC<sub>RA</sub> neurons that are housed in the microcircuit it belongs to. In particular, if HVC<sub>INT</sub> inhibits HVC<sub>RA</sub> in the same microcircuit, some of the HVC<sub>RA</sub> bursts in the microcircuit might be silenced by the dense and strong HVC<sub>INT</sub> inhibition breaking the chain of activity again. Similarly, HVC<sub>X</sub> neurons were selected to be housed within microcircuits due to the following reason: if an HVC<sub>X</sub> neuron belonging to microcircuit i sends excitatory input to an HVC<sub>INT</sub> neuron in microcircuit j, and that interneuron happens to select an HVC<sub>RA</sub> neuron from microcircuit i, then the propagation of sequential activity will halt, and we’ll be in a scenario similar to what was described earlier for HVC<sub>INT</sub> neurons inhibiting HVC<sub>RA</sub> neurons in the same microcircuit.
We agree that there are no sub-syllabic segments described during the silent gaps and we thank the reviewer for pointing this out. Although silent gaps are integral to the overall process of song production, we have not elaborated on them in this model due to the lack of a clear, biophysically grounded representation for the gaps themselves at the level of HVC. Our primary focus has been on modeling the active, syllable-producing phases of the song, where the HVC network’s sequential dynamics are critical for song. However, one can think of the silent gaps as being encoded via mechanisms similar to those that encode SSSs, where each gap is encoded by similar microcircuits composed of the three classes of HVC neurons (let’s call them GAPs rather than SSSs) that are active only during the silent gaps. In this case, the propagation of sequential activity is carried through the GAPs from the last SSS of the previous syllable to the first SSS of the subsequent syllable. We’ll make sure to emphasize this mechanism more in the revised version of the manuscript.
A significant issue of the current model is that the HVC_RA to HVC_RA connections require fine-tuning, with the network functioning only within a narrow range of g_AMPA (Figure 2B). Similarly, the connections from HVC_I neurons to HVC_RA neurons also require fine-tuning. This sensitivity arises because the somatic properties of HVC_RA neurons are insufficient to produce the stereotypical bursts of spikes observed in recordings from singing birds, as demonstrated in previous studies (Jin et al 2007; Long et al 2010). In these previous works, to address this limitation, a dendritic spike mechanism was introduced to generate an intrinsic bursting capability, which is absent in the somatic compartment of HVC_RA neurons. This dendritic mechanism significantly enhances the robustness of the chain network, eliminating the need to fine-tune any synaptic conductances, including those from HVC_I neurons (Long et al 2010).
Why is it important that the model should NOT be sensitive to the connection strengths?
We thank the reviewer for the comment. While mathematical models of highly complex nonlinear biological processes can only approximate biological realism, the current network is the first sufficiently biologically realistic network model of HVC that explains sequence propagation. We do not include dendritic processes in our network, although doing so would increase its realism, for several reasons. 1) The ion channels we integrated into the somatic compartment are known pharmacologically (Daou et al. 2013), but we do not know the intrinsic properties of the dendritic compartments of HVC neurons or the cocktail of ion channels that are expressed there. 2) We are able to generate realistic bursting in HVC<sub>RA</sub> neurons despite the single compartment, and the main emphasis in this network is on the interactions between excitation and inhibition, the effects of ion channels in modulating sequence propagation, etc. 3) The network model already incorporates thousands of ODEs that govern the dynamics of each of the HVC neurons, so we did not want to add more complexity to the network, especially since we don’t know the biophysical properties of the dendritic compartments.
Therefore, our present focus is on somatic dynamics and the interaction between HVC<sub>RA</sub> and HVC<sub>INT</sub> neurons, but we acknowledge the importance of these processes in enhancing network resiliency. Although we agree that adding dendritic processes improves robustness, we still think that somatic processes alone can offer insightful information on the sequential dynamics of the HVC network. While the network should be robust across a wide range of parameters, it is also essential that certain parameters are designed to filter out weaker signals, ensuring that only reliable, precise patterns of activity propagate. Hence, we specifically chose to make the HVC<sub>RA</sub>-to-HVC<sub>RA</sub> excitatory connections more sensitive (narrow range of values) such that only strong, precise and meaningful stimuli can propagate through the network representing the high stereotypy and precision seen in song production.
First, the firing of HVC_I neurons is highly noisy and unreliable. HVC_I neurons fire spontaneous, random spikes under baseline conditions. During singing, their spike timing is imprecise and can vary significantly from trial to trial, with spikes appearing or disappearing across different trials. As a result, their inputs to HVC_RA neurons are inherently noisy. If the model relies on precisely tuned inputs from HVC_I neurons, the natural fluctuations in HVC_I firing would render the model non-functional. The authors should incorporate noisy HVC_I neurons into their model to evaluate whether this noise would render the model non-functional.
We acknowledge that under baseline and singing conditions, interneurons fire in an extremely noisy and imprecise manner, although they exhibit time-locked episodes in their activity (Hahnloser et al 2002, Kozhevnikov and Fee 2007). In order to mimic the biological variability of these neurons, our model does, in fact, include a stochastic current to reflect the intrinsic noise and random variations in interneuron firing seen in vivo (and we highlight this in the Methods). If necessary, and to make sure the network is resilient to this randomness in interneuron firing, we will investigate different approaches to enhance the noise representation even further and check its effect on sequence propagation.
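As an illustration of what a stochastic current can look like in a conductance-based simulation, the sketch below generates an Ornstein-Uhlenbeck (temporally correlated Gaussian) noise trace. This is one common choice and is an assumption made here for illustration; the form of the stochastic current actually used in the manuscript is given in its Methods and may differ. All names and parameter values are hypothetical.

```python
import math
import random

def ou_noise_current(n_steps, dt_ms, tau_ms, mean_pA, sigma_pA, seed=0):
    """Ornstein-Uhlenbeck noise current sampled every dt_ms.
    Uses the exact discrete-time update, so the stationary standard
    deviation equals sigma_pA regardless of the step size."""
    rng = random.Random(seed)                      # seeded for reproducibility
    decay = math.exp(-dt_ms / tau_ms)              # relaxation toward the mean
    kick = sigma_pA * math.sqrt(1.0 - decay * decay)
    current = mean_pA
    trace = []
    for _ in range(n_steps):
        current = mean_pA + (current - mean_pA) * decay + kick * rng.gauss(0.0, 1.0)
        trace.append(current)
    return trace

# Hypothetical parameters: 100 ms of noise at 0.1 ms resolution.
noise = ou_noise_current(n_steps=1000, dt_ms=0.1, tau_ms=5.0,
                         mean_pA=0.0, sigma_pA=20.0, seed=1)
```

Adding such a trace to the applied current of each model interneuron, with a different seed per neuron and per trial, is one way to test whether sequence propagation survives trial-to-trial variability in interneuron spiking.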
Second, Kosche et al. (2015) demonstrated that reducing inhibition by suppressing HVC_I neuron activity makes HVC_RA firing less sparse but does not compromise the temporal precision of the bursts. In this experiment, the local application of gabazine should have severely disrupted HVC_I activity. However, it did not affect the timing precision of HVC_RA neuron firing, emphasizing the robustness of the HVC timing circuit. This robustness is inconsistent with the predictions of the current model, which depends on finely tuned inputs and should, therefore, be vulnerable to such disruptions.
We thank the reviewer for the comment. The differences between the Kosche et al. (2015) findings and the predictions of our model arise from differences in the aspect of HVC function we are modeling. Our model is more sensitive to inhibition, which is a designed mechanism for achieving precise song patterning. This is a modeling simplification we adopted to capture specific characteristics of HVC function. Hence, the Kosche et al. (2015) findings do not invalidate the approach of our model, but highlight that HVC likely operates with several redundant mechanisms that together ensure temporal precision. Nevertheless, we will investigate further the effects of the degree of inhibition on song patterning.
Third, the reliance on fine-tuning of HVC_RA connections becomes problematic if the model is scaled up to include groups of HVC_RA neurons forming a chain network, rather than the single HVC_RA neurons used in the current work. With groups of HVC_RA neurons, the summation of presynaptic inputs to each HVC_RA neuron would need to be precisely maintained for the model to function. However, experimental evidence shows that the HVC circuit remains functional despite perturbations, such as a few degrees of cooling, micro-lesions, or turnover of HVC_RA neurons. Such robustness cannot be accounted for by a model that depends on finely tuned connections, as seen in the current implementation.
Our model of individual HVC<sub>RA</sub> neurons is, as stated previously, a reductive model that focuses on understanding the mechanisms that govern sequential neural activity. We agree that scaling the model to include many HVC<sub>RA</sub> neurons poses challenges, specifically concerning the summation of presynaptic inputs. However, our model can still be adapted to a larger network without requiring the level of fine-tuning currently needed. In fact, the current fine-tuning of synaptic connections in the model is a reflection of fundamental network mechanisms rather than a limitation when scaling to a larger network. In addition, one important feature of this neural network is redundancy. Even if some neurons or synaptic connections are impaired, other neurons or pathways can compensate for these changes, allowing the activity propagation to remain intact.
The authors examined how altering the channel properties of neurons affects the activity in their model. While this approach is valid, many of the observed effects may stem from the delicate balancing required in their model for proper function.
In the current model, HVC_X neurons burst as a result of rebound activity driven by the I_H current. Rebound bursts mediated by the I_H current typically require a highly hyperpolarized membrane potential. However, this mechanism would fail if the reversal potential of inhibition is higher than the required level of hyperpolarization. Furthermore, Mooney (2000) demonstrated that depolarizing the membrane potential of HVC_X neurons did not prevent bursts of these neurons during forward playback of the bird's own song, suggesting that these bursts (at least under anesthesia, which may be a different state altogether) are not necessarily caused by rebound activity. This discrepancy should be addressed or considered in the model.
In our HVC network model, one goal with HVC<sub>X</sub> neurons is to generate bursts in their underlying neuron population. Since HVC<sub>X</sub> neurons in our model receive only inhibitory inputs from interneurons, we rely on inhibition followed by rebound bursts orchestrated by the I<sub>H</sub> and I<sub>CaT</sub> currents to achieve this goal. The interplay between the T-type Ca<sup>++</sup> current and the H current in our model is fundamental to generating these bursts, as the two currents are sufficient for producing the desired behavior in the network. Due to this interplay, we do not need significant inhibition to generate rebound bursts, because the conductance of the T-type Ca<sup>++</sup> current can be made stronger, leading to robust rebound bursting even when the degree of inhibition is not very strong. We will highlight this with more clarity in the revised version.
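For orientation, the rebound mechanism described above can be written in the generic form of a single-compartment conductance-based model. The equations below are a textbook-style sketch, not the manuscript's equations: the symbols $r$, $m_T$, $h_T$ and $s$ are illustrative gating variables, the exponents on the gating variables are omitted, and the full set of currents is abbreviated by the dots.

```latex
\begin{align}
C_m \frac{dV}{dt} &= -\bigl(I_L + I_H + I_{CaT} + \dots\bigr) - I_{syn} + I_{app}, \\
I_H &= g_H \, r \, (V - E_H), \\
I_{CaT} &= g_{CaT} \, m_T \, h_T \, (V - E_{Ca}), \\
I_{syn} &= g_{GABA} \, s \, (V - E_{GABA}).
\end{align}
```

In this form, inhibition ($I_{syn}$) hyperpolarizes the cell, which slowly activates $r$ and removes inactivation of the T-type current ($h_T$ increases). When inhibition ends, the inward $I_H$ and the de-inactivated $I_{CaT}$ together depolarize the membrane and can trigger a rebound burst. A larger $g_{CaT}$ amplifies the depolarization produced by a given amount of de-inactivation, which is why weaker inhibition can still suffice.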
Some figures contain direct copies of figures from published papers. It is perhaps a better practice to replace them with schematics if possible.
We will replace the relevant figures with schematic representations where possible.
Reviewer #2 (Public review):
Summary:
In this paper, the authors use numerical simulations to try to understand better a major experimental discovery in songbird neuroscience from 2002 by Richard Hahnloser and collaborators. The 2002 paper found that a certain class of projection neurons in the premotor nucleus HVC of adult male zebra finch songbirds, the neurons that project to another premotor nucleus RA, fired sparsely (once per song motif) and precisely (to about 1 ms accuracy) during singing.
The experimental discovery is important to understand since it initially suggested that the sparsely firing RA-projecting neurons acted as a simple clock that was localized to HVC and that controlled all details of the temporal hierarchy of singing: notes, syllables, gaps, and motifs. Later experiments suggested that the initial interpretation might be incomplete: that the temporal structure of adult male zebra finch songs instead emerged in a more complicated and distributed way, still not well understood, from the interaction of HVC with multiple other nuclei, including auditory and brainstem areas. So at least two major questions remain unanswered more than two decades after the 2002 experiment: What is the neurobiological mechanism that produces the sparse precise bursting: is it a local circuit in HVC or is it some combination of external input to HVC and local circuitry?
And how is the sparse precise bursting in HVC related to a songbird's vocalizations?
The authors only investigate part of the first question, whether the mechanism for sparse precise bursts is local to HVC. They do so indirectly, by using conductance-based Hodgkin-Huxley-like equations to simulate the spiking dynamics of a simplified network that includes three known major classes of HVC neurons and such that all neurons within a class are assumed to be identical. A strength of the calculations is that the authors include known biophysically deduced details of the different conductances of the three major classes of HVC neurons, and they take into account what is known, based on sparse paired recordings in slices, about how the three classes connect to one another. One weakness of the paper is that the authors make arbitrary and not well-motivated assumptions about the network geometry, and they do not use the flexibility of their simulations to study how their results depend on their network assumptions. A second weakness is that they ignore many known experimental details such as projections into HVC from other nuclei, dendritic computations (the somas and dendrites are treated by the authors as point-like isopotential objects), the role of neuromodulators, and known heterogeneity of the interneurons. These weaknesses make it difficult for readers to know the relevance of the simulations for experiments and for advancing theoretical understanding.
Strengths:
The authors use conductance-based Hodgkin-Huxley-like equations to simulate spiking activity in a network of neurons intended to model more accurately songbird nucleus HVC of adult male zebra finches. Spiking models are much closer to experiments than models based on firing rates or on 2-state neurons.
The authors include information deduced from modeling experimental current-clamp data such as the types and properties of conductances. They also take into account how neurons in one class connect to neurons in other classes via excitatory or inhibitory synapses, based on sparse paired recordings in slices by other researchers.
The authors obtain some new results of modest interest such as how changes in the maximum conductances of four key channels (e.g., A-type K<sup>+</sup> currents or Ca-dependent K<sup>+</sup> currents) influence the structure and propagation of bursts, while simultaneously being able to mimic accurately current-clamp voltage measurements.
Weaknesses:
One weakness of this paper is the lack of a clearly stated, interesting, and relevant scientific question to try to answer. In the introduction, the authors do not discuss adequately which questions recent experimental and theoretical work have failed to explain adequately, concerning HVC neural dynamics and its role in producing vocalizations. The authors do not discuss adequately why they chose the approach of their paper and how their results address some of these questions.
For example, the authors need to explain in more detail how their calculations relate to the works of Daou et al, J. Neurophys. 2013 (which already fitted spiking models to neuronal data and identified certain conductances), to Jin et al J. Comput. Neurosci. 2007 (which already discussed how to get bursts using some experimental details), and to the rather similar paper by E. Armstrong and H. Abarbanel, J. Neurophys 2016, which already postulated and studied sequences of microcircuits in HVC. This last paper is not even cited by the authors.
We thank the reviewer for this valuable comment, and we agree that we did not clarify enough throughout the paper the utility of our model or how it advanced our understanding of the HVC dynamics and circuitry. To that end, we will revise several places of the manuscript and make sure to cite and highlight the relevance and relatedness of the mentioned papers.
In short, and as mentioned to Reviewer 1, while several models of how sequence is generated within HVC have been proposed (Cannon et al., 2015; Drew & Abbott, 2003; Egger et al., 2020; Elmaleh et al., 2021; Galvis et al., 2018; Gibb et al., 2009a, 2009b; Hamaguchi et al., 2016; Jin, 2009; Long & Fee, 2008; Markowitz et al., 2015; Jin et al., 2007), all the models proposed either rely on intrinsic HVC circuitry to propagate sequential activity, rely on extrinsic feedback to advance the sequence or rely on both. These models do not capture the complex details of spike morphology, do not include the right ionic currents, do not incorporate all classes of HVC neurons, or do not generate realistic firing patterns as seen in vivo. Our model is the first biophysically realistic model that incorporates all classes of HVC neurons and their intrinsic properties.
No existing hypothesis was challenged by our model; rather, our model is a distillation of the various models that have been proposed for the HVC network. We go over this in detail in the Discussion. We believe that the network model we developed provides a step forward in describing the biophysics of HVC circuitry, and may throw new light on certain dynamics in the mammalian brain, particularly the motor cortex and the hippocampus regions where precisely-timed sequential activity is crucial. We suggest that temporally-precise sequential activity may be a manifestation of neural networks composed of a chain of microcircuits, each containing pools of excitatory and inhibitory neurons, with local interplay among neurons of the same microcircuit and global interplay across the various microcircuits, and with structured inhibition as well as intrinsic properties synchronizing the neuronal pools and stabilizing timing within a firing sequence.
The authors' main achievement is to show that simulations of a certain simplified and idealized network of spiking neurons, which includes some experimental details but ignores many others, match some experimental results like current-clamp-derived voltage time series for the three classes of HVC neurons (although this was already reported in earlier work by Daou and collaborators in 2013), and simultaneously the robust propagation of bursts with properties similar to those observed in experiments. The authors also present results about how certain neuronal details and burst propagation change when certain key maximum conductances are varied.
However, these are weak conclusions for two reasons. First, the authors did not do enough calculations to allow the reader to understand how many parameters were needed to obtain these fits and whether simpler circuits, say with fewer parameters and simpler network topology, could do just as well. Second, many previous researchers have demonstrated robust burst propagation in a variety of feed-forward models. So what is new and important about the authors' results compared to the previous computational papers?
A major novelty of our work is the incorporation of experimental data into detailed network models. While earlier works have established robust burst propagation, our model uses realistic ion channel kinetics and feedback inhibition not only to reproduce experimental neural activity patterns but also to suggest prospective mechanisms for song sequence production in the most biophysical way possible. This aspect distinguishes our work from other feed-forward models. We go over this in detail in the Discussion. However, the reviewer is right regarding the details of the calculations conducted for the fits, and we will make sure to describe these in more detail in the Methods and throughout the manuscript.
We believe that the network model we developed provides a step forward in describing the biophysics of HVC circuitry, and may throw new light on certain dynamics in the mammalian brain, particularly the motor cortex and the hippocampus regions where precisely-timed sequential activity is crucial. We suggest that temporally-precise sequential activity may be a manifestation of neural networks composed of a chain of microcircuits, each containing pools of excitatory and inhibitory neurons, with local interplay among neurons of the same microcircuit and global interplay across the various microcircuits, and with structured inhibition as well as intrinsic properties synchronizing the neuronal pools and stabilizing timing within a firing sequence.
Also missing is a discussion, or at least an acknowledgment, of the fact that not all of the fine experimental details of undershoots, latencies, spike structure, spike accommodation, etc may be relevant for understanding vocalization. While it is nice to know that some models can match these experimental details and produce realistic bursts, that does not mean that all of these details are relevant for the function of producing precise vocalizations. Scientific insights in biology often require exploring which of the many observed details can be ignored and especially identifying the few that are essential for answering some questions. As one example, if HVC-X neurons are completely removed from the authors' model, does one still get robust and reasonable burst propagation of HVC-RA neurons? While part of the nucleus HVC acts as a premotor circuit that drives the nucleus RA, part of HVC is also related to learning. It is not clear that HVC-X neurons, which carry out some unknown calculation and transmit information to area X in a learning pathway, are relevant for burst production and propagation of HVC<sub>RA</sub> neurons, and so relevant for vocalization. Simulations provide a convenient and direct way to explore questions of this kind.
One key question to answer is whether the bursting of HVC-RA projection neurons is based on a mechanism local to HVC or is some combination of external driving (say from auditory nuclei) and local circuitry. The authors do not contribute to answering this question because they ignore external driving and assume that the mechanism is some kind of intrinsic feed-forward circuit, which they put in by hand in a rather arbitrary and poorly justified way, by assuming the existence of small microcircuits consisting of a few HVC-RA, HVC-X, and HVC-I neurons that somehow correspond to "sub-syllabic segments". To my knowledge, experiments do not suggest the existence of such microcircuits nor does theory suggest the need for such microcircuits.
Recent results showed a tight correlation between the intrinsic properties of neurons and features of song (Daou and Margoliash 2020, Medina and Margoliash 2024), where adult birds that exhibit similar songs tend to have similar intrinsic properties. While this is relevant, we acknowledge that not all details may be necessary for every aspect of vocalization, and future models could simply concentrate on core dynamics and exclude certain features while still providing insights into the primary mechanisms.
On the question of whether HVC<sub>X</sub> neurons are relevant for burst propagation, given that our model includes these neurons as part of the network for completeness, the reviewer is correct: the propagation of sequential activity in this model is primarily carried by HVC<sub>RA</sub> neurons in a feed-forward manner, but only if there is no perturbation to the HVC network. For example, we have shown how altering the intrinsic properties of HVC<sub>X</sub> neurons or of interneurons disrupts sequence propagation. In other words, while HVC<sub>RA</sub> neurons are the key forces carrying the chain forward, the interplay between excitation and inhibition in our network as well as the intrinsic parameters of all classes of HVC neurons are equally important forces in carrying the chain of activity forward. Thus, the stability of activity propagation necessary for song production depends on a finely balanced network of HVC neurons, with all classes contributing to the overall dynamics.
We agree with the reviewer, however, that a potential drawback of our model is that its sole focus is on local excitatory connectivity within the HVC (Kornfeld et al., 2017; Long et al., 2010), while HVC neurons receive afferent excitatory connections (Akutagawa & Konishi, 2010; Nottebohm et al., 1982) that play significant roles in their local dynamics. For example, the excitatory inputs that HVC neurons receive from Uvaeformis may be crucial in initiating (Andalman et al., 2011; Danish et al., 2017; Galvis et al., 2018) or sustaining (Hamaguchi et al., 2016) the sequential activity. While we acknowledge this limitation, our main contribution in this work is the biophysical insight into how the patterning activity in HVC is largely shaped by the intrinsic properties of the individual neurons as well as the synaptic properties, where excitation and inhibition play a major role in enabling neurons to generate their characteristic bursts during singing. This holds irrespective of whether an external drive is injected into the microcircuits or not. We will, however, elaborate on and investigate this further in the next submission.
Another weakness of this paper is an unsatisfactory discussion of how the model was obtained, validated, and simulated. The authors should state as clearly as possible, in one location such as an appendix, what is the total number of independent parameters for the entire network and how parameter values were deduced from data or assigned by hand. With enough parameters and variables, many details can be fit arbitrarily accurately so researchers have to be careful to avoid overfitting. If parameter values were obtained by fitting to data, the authors should state clearly what the fitting algorithm was (some iterative nonlinear method, whose results can depend on the initial choice of parameters), what the error function used for fitting (sum of least squares?) was, and what data were used for the fitting.
The authors should also state clearly the dynamical state of the network, the vector of quantities that evolve over time. (What is the dimension of that vector, which is also the number of ordinary differential equations that have to be integrated?) The authors do not mention what initial state was used to start the numerical integrations, whether transient dynamics were observed and what were their properties, or how the results depended on the choice of the initial state. The authors do not discuss how they determined that their model was programmed correctly (it is difficult to avoid typing errors when writing several pages or more of a code in any language) or how they determined the accuracy of the numerical integration method beyond fitting to experimental data, say by varying the time step size over some range or by comparing two different integration algorithms.
We thank the reviewer again. The fitting process in our model occurred only at the first stage, where the synaptic parameters were fit to the Mooney and Prather as well as the Kosche results. No data were shared with us; we merely looked at the figures in those papers, checked the amplitude of the elicited currents, the magnitudes of DC-evoked excitations, etc., and replicated those in our model. While this is suboptimal, it was better for us to start with it rather than simply taking equations for synaptic currents from the literature on other types of neurons (which are not even HVC neurons, or from the songbird) and integrating them into our network model. However, we will certainly highlight the details of this fitting process in the new submission. We will also provide more technical details in the Methods regarding the exact number of ODEs, the initial conditions used to run them, etc.
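The reviewer's suggested accuracy check (varying the time step over some range and comparing two integration algorithms) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the passive-membrane equation, its parameter values, and the function names are hypothetical stand-ins, not the authors' HVC network or code. The toy equation is chosen only because it has a closed-form solution against which both integrators can be scored.

```python
import math

# Hypothetical toy model (NOT the authors' HVC network): a passive membrane
#   dV/dt = (E_L - V + R*I) / tau
# chosen because its exact solution is known.
E_L, R, I, TAU, V0 = -65.0, 10.0, 1.5, 20.0, -70.0  # mV, MOhm, nA, ms, mV


def dvdt(v):
    """Right-hand side of the toy membrane equation."""
    return (E_L - v + R * I) / TAU


def euler(dt, t_end):
    """Integrate with forward Euler and return V(t_end)."""
    v = V0
    for _ in range(round(t_end / dt)):
        v += dt * dvdt(v)
    return v


def rk4(dt, t_end):
    """Integrate with classical fourth-order Runge-Kutta and return V(t_end)."""
    v = V0
    for _ in range(round(t_end / dt)):
        k1 = dvdt(v)
        k2 = dvdt(v + 0.5 * dt * k1)
        k3 = dvdt(v + 0.5 * dt * k2)
        k4 = dvdt(v + dt * k3)
        v += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return v


def exact(t):
    """Closed-form solution of the toy equation."""
    v_inf = E_L + R * I
    return v_inf + (V0 - v_inf) * math.exp(-t / TAU)


def convergence_table(method, t_end=50.0, steps=(1.0, 0.5, 0.25, 0.125)):
    """Absolute error at t_end for each time step."""
    return [(dt, abs(method(dt, t_end) - exact(t_end))) for dt in steps]


if __name__ == "__main__":
    for name, method in (("Euler", euler), ("RK4", rk4)):
        for dt, err in convergence_table(method):
            print(f"{name:5s} dt={dt:<6} |error|={err:.3e}")
```

For a first-order method such as forward Euler, halving the time step should roughly halve the error; a result that does not shrink in this way, or two integrators that disagree at small time steps, would point to a programming or stability problem. For a full network without a closed-form solution, the same comparison would be made against a reference run at a much smaller time step.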
Also disappointing is that the authors do not make any predictions to test, except rather weak ones such as that varying a maximum conductance sufficiently (which might be possible by using dynamic clamps) might cause burst propagation to stop or change its properties. Based on their results, the authors do not make suggestions for further experiments or calculations, but they should.
We agree that making experimentally testable predictions is crucial for the advancement of the model. Our predictions include testing whether eradication of a class of neurons such as HVC<sub>X</sub> neurons disrupts activity propagation, which can be done through targeted neuron elimination. This can also be done by preventing rebound bursting in HVC<sub>X</sub> through pharmacological blockade of the I<sub>h</sub> channels. Others include downregulation of certain ion channels (done pharmacologically through ion blockers) and testing which current is fundamental for song production (and there are plenty of tests based on our results, such as the SK current, the T-type Ca<sup>++</sup> current, the A-type K<sup>+</sup> current, etc). We will incorporate these into the revised manuscript to better demonstrate the model's applicability and to guide future research directions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Structural colors (SC) are based on nanostructures reflecting and scattering light and producing optical wave interference. All kinds of living organisms exhibit SC. However, understanding the molecular mechanisms and genes involved may be complicated due to the complexity of these organisms. Hence, bacteria that exhibit SC in colonies, such as Flavobacterium IR1, can be good models.
Based on previous genomic mining and co-occurrence with SC in flavobacterial strains, this article focuses on the role of a specific gene, moeA, in SC of Flavobacterium IR1 strain colonies on an agar plate. moeA is involved in the synthesis of the molybdenum cofactor, which is necessary for the activity of key metabolic enzymes in diverse pathways.
The authors clearly showed that the absence of moeA shifts SC properties in a way that depends on the nutritional conditions. They further bring evidence that this effect was related to several properties of the colony, all impacted by the moeA mutant: cell-cell organization, cell motility and colony spreading, and metabolism of complex carbohydrates. Hence, by linking SC to a single gene in appearance, this work points to cellular organization (as a result of cell-cell arrangement and motility) and metabolism of polysaccharides as key factors for SC in a gliding bacterium. This may prove useful for designing molecular strategies to control SC in bacterial-based biomaterials.
Strengths:
The topic is very interesting from a fundamental viewpoint and has great potential in the field of biomaterials.
Thank you for your comments.
The article is easy to read. It builds on previous studies with already established tools to characterize SC at the level of the flavobacterial colony. Experiments are well described and well executed. In addition, the SIBR-Cas method for chromosome engineering in Flavobacteria is the most recent and is a leap forward for future studies in this model, even beyond SC.
We appreciate these comments.
Weaknesses:
The paper appears a bit too descriptive and could be better organized. Some of the results, in particular the proteomic comparison, are not well exploited (not explored experimentally). In my opinion, the problem originates from the difficulty in explaining the link between the absence of moeA and the alterations observed at the level of colony spreading and polysaccharide utilization, and the variation in proteomic content.
We will look carefully at the organisation of the manuscript in the coming detailed revision, as suggested. In terms of the proteomics, there are clearly a large number of proteins affected by the moeA deletion. In terms of experimental exploration, we chose spreading, structural colour formation and starch degradation to test phenotypically, as the most relevant. For example, in L615-617, we discuss the downregulation of GldL (which is known to be involved in Flavobacterial gliding motility [Shrivastava et al., 2013]) in the _moe_A KO as a possible explanation for the reduced colony spreading of the moeA mutant. Changes in polysaccharide (starch) utilization were seen on solid medium, as well as in the proteomic profile, where we observed the upregulation of carbohydrate metabolism proteins linked to PUL (polysaccharide utilisation locus) operons (Terrapon et al., 2015), such as PAM95095-90 (Figure 8), and other carbohydrate metabolism-related proteins, including a pectate lyase (Table S7) which is involved in starch degradation (Aspeborg et al., 2012). As noted in L555-566 and Figure 9, starch metabolism was tested experimentally.
First, the effect of moeA deletion on molybdenum cofactor synthesis should be addressed.
MoeA is the last enzyme in the MoCo synthesis pathway; thus, if only MoeA is absent, the cell would accumulate MPT-AMP (molybdopterin-adenosine monophosphate) (Iobbi-Nivol & Leimkühler, 2013), and the expressed molybdoenzymes would not be functional. In L582-585, we commented on how the lack of molybdenum cofactor may affect the synthesis of molybdoenzymes. However, if you meant to analyse the presence of the small molecules, the cofactors, involved in these pathways, that was an assay we were not able to perform. Moreover, in L585-587, we addressed how the deletion of _moe_A affected the proteins encoded by the rest of the genes in the operon.
Second, as I was reading the entire manuscript, I kept asking myself if moeA (and by extension molybdenum cofactor) was really involved in SC or it was an indirect effect. For example, what if the absence of moeA alters the cell envelope because the synthesis of its building blocks is perturbed, then subsequently perturbates all related processes, including gliding motility and protein secretion? It would help to know if the effects on colony spreading and polysaccharide metabolism can be uncoupled. I don't think the authors discussed that clearly.
The message of the paper is that the moeA gene, as predicted from a previous genomics analysis, is important in SC. This is based on the representation of the _moe_A gene in genomes of bacteria that display SC. This analysis does not predict the mechanism. When the gene was knocked out, a significant change in structural colour occurred, supporting this hypothesis. Whether this effect is direct or indirect is difficult to assess, as this referee rightly suggests. In order to follow up this central result, we performed proteomics (both intra- and extracellular). As we observed, the deletion of a single gene generated many changes in the proteomic profile, and thus in the biological processes. Based on the known functions of molybdenum cofactor, we could only hypothesize that pterin metabolism is important for SC, not exactly how.
We intend to discuss the links between gliding/spreading and polysaccharide metabolism more clearly, with reference to the literature, as quite a bit is known here including possible links to SC.
Reviewer #2 (Public review):
Summary:
The authors constructed an in-frame deletion of moeA gene, which is involved in molybdopterin cofactor (MoCo) biosynthesis, and investigated its role in structural colors in Flavobacterium IR1. The deletion of moeA shifted colony color from green to blue, reduced colony spreading, and increased starch degradation, which was attributed to the upregulation of various proteins in polysaccharide utilization loci. This study lays the ground for developing new colorants by modifying genes involved in structural colors.
Major strengths and weaknesses:
The authors conducted well-designed experiments with appropriate controls and the results in the paper are presented in a logical manner, which supports their conclusions.
We appreciate your comment.
Using statistical tests to compare the differences between the wild type and moeA mutant, and adding a significance bar in Figure 4B, would strengthen their claims regarding differences in cell motility.
Thank you. Figure 4B contains the significance bars that represent the standard deviation of the mean value of the three replicates, but we will modify it to make them clearer.
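As a minimal sketch of the kind of two-group comparison the reviewer asks for, the snippet below computes Welch's t statistic and its approximate degrees of freedom. This is an assumption made for illustration only: the source does not state which test would be used, the function name `welch_t` is hypothetical, and the numbers are placeholders, not measurements from the manuscript.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical placeholder values, NOT data from the manuscript.
wild_type = [10.0, 12.0, 11.0]
mutant = [6.0, 7.0, 5.0]

t_stat, dof = welch_t(wild_type, mutant)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Turning the statistic into a p-value requires the t distribution, which the Python standard library does not provide; a statistics package would normally be used for that step. With only three replicates per group, any such test has limited power, so error bars and a significance test answer different questions and would both need to be labelled in the figure.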
Additionally, in the result section (Figure 6), the authors suggest that the shift in blue color is "caused by cells which are still highly ordered but narrower", which to my knowledge is not backed up by any experimental evidence.
Thanks. We mentioned that the mutant cells are narrower than the wild type based on the estimated periodicity resulting from the goniometry analysis (L427-430). We will now say “likely to be narrower based on the estimated periodicity from the optical analysis” rather than just “narrower” in the revision.
Overall, this is a well-written paper in which the authors effectively address their research questions through proper experimentation. This work will help us understand the genetic basis of structural colors in Flavobacterium and open new avenues to study the roles of additional genes and proteins in structural colors.
Much appreciated.
REFERENCES
Aspeborg, Henrik, Pedro M. Coutinho, Yang Wang, Harry Brumer, and Bernard Henrissat. "Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5)." BMC evolutionary biology 12 (2012): 1-16.
Iobbi-Nivol, Chantal, and Silke Leimkühler. "Molybdenum enzymes, their maturation and molybdenum cofactor biosynthesis in Escherichia coli." Biochimica et Biophysica Acta (BBA)-Bioenergetics 1827, no. 8-9 (2013): 1086-1101.
Shrivastava, Abhishek, Joseph J. Johnston, Jessica M. Van Baaren, and Mark J. McBride. "Flavobacterium johnsoniae GldK, GldL, GldM, and SprA are required for secretion of the cell surface gliding motility adhesins SprB and RemA." Journal of bacteriology 195, no. 14 (2013): 3201-3212.
Terrapon, Nicolas, Vincent Lombard, Harry J. Gilbert, and Bernard Henrissat. "Automatic prediction of polysaccharide utilization loci in Bacteroidetes species." Bioinformatics 31, no. 5 (2015): 647-655.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife assessment
This important study explored a molecular comparison of smooth muscle and neighboring fibroblast cells found in lung blood vessels afflicted by a disease called pulmonary arterial hypertension. In doing so, the authors described distinct disease-associated states of each of these cell types with further insights into the cellular communication and crosstalk between them. The strength of evidence was convincing through the use of complementary and sophisticated tools, accompanied by rare isolation of human diseased lung blood vessel cells that were source-matched to the same donor for direct comparison.
We thank the editors and reviewers for their highly positive and encouraging assessment of our manuscript detailing the cell state changes of arterial smooth muscle cells and fibroblasts in the pulmonary bed. We addressed the reviewers’ major comments in the revised manuscript by providing validation of key in vitro findings, such as preserved marker localization and increased GAG deposition in IPAH pulmonary arteries. We additionally provide a comparison of transcriptomic profiles spanning fresh, very early and late passage cells. Finally, we present expanded experimental data in support of cellular crosstalk, including testing of additional PAAF ligands on donor PASMC and the influence of PTX3/HGF on IPAH PASMC.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The authors isolated and cultured pulmonary artery smooth muscle cells (PASMC) and pulmonary artery adventitial fibroblasts (PAAF) of the lung samples derived from the patients with idiopathic pulmonary arterial hypertension (PAH) and the healthy volunteers. They performed RNA-seq and proteomics analyses to detail the cellular communication between PASMC and PAAF, which are the main target cells of pulmonary vascular remodeling during the pathogenesis of PAH. The authors revealed that PASMC and PAAF retained their original cellular identity and acquired different states associated with the pathogenesis of PAH, respectively.
Strengths:
Although previous studies have shown that PASMC and PAAF cells each have an important role in the pathogenesis of PAH, there have been scarce reports focusing on the interactions between PASMC and PAAF. These findings may provide valuable information for elucidating the pathogenesis of pulmonary arterial hypertension.
We appreciate the reviewer’s positive view of our study.
Weaknesses:
The results of proteome analysis using primary culture cells in this paper seem a bit insufficient to draw conclusions. In particular, the authors described "We elucidated the involvement of cellular crosstalk in regulating cell state dynamics and identified pentraxin-3 and hepatocyte growth factor as modulators of PASMC phenotypic transition orchestrated by PAAF." However, the presented data are considered limited and insufficient.
We thank the reviewer for drawing our attention to this point and have accordingly modified the conclusion section to read: “We investigated the involvement of cellular crosstalk….” Moreover, we provide further experimental evidence demonstrating the effect of both PTX3 and HGF on cell state marker expression in IPAH-PASMC cells (Figure 7H). In addition, we clarify the selection strategy applied to investigate particular PAAF-secreted ligands and test three additional ligands on donor PASMC (Figure S8), supporting the original focus on PTX3 and HGF.
Reviewer #2 (Public Review):
Summary:
Utilizing a combination of transcriptomic and proteomic profiling as well as cellular phenotyping from source-matched PASMC and PAAFs in IPAH, this study sought to explore a molecular comparison of these cells in order to track distinct cell fate trajectories and acquisition of their IPAH-associated cellular states. The authors also aimed to identify cell-cell communication axes in order to infer mechanisms by which these two cells interact and depend upon external cues. This study will be of interest to the scientific and clinical communities of those interested in pulmonary vascular biology and disease. It also will appeal to those interested in lung and vascular development as well as multi-omic analytic procedures.
We thank the reviewer for the overall highly positive assessment of our study.
Strengths:
(1) This is one of the first studies using orthogonal sequencing and phenotyping for the characterization of source-matched neighboring mesenchymal PASMC and PAAF cells in healthy and diseased IPAH patients. This is a major strength that allows for direct comparison of neighboring cell types and the ability to address an unanswered question regarding the nature of these mesenchymal "mural" cells at a precise molecular level.
We value the reviewer’s kind and objective summary of our study.
(2) Unlike a number of multi-omic sequencing papers that read more as an atlas of findings without structure, the inherent comparative organization of the study and presentation of the data were valuable in aiding the reader in understanding how to discern the distinct IPAH-associated cell states. As a result, the reader not only gleans greater insight into these two interacting cell types in disease but also now can leverage these datasets more easily for future research questions in this space.
We thank the reviewer for this highly positive comment.
(3) There are interesting and surprising findings in the cellular characterizations, including the low proliferative state of IPAH-PASMCs as compared to the hyperproliferative state in IPAH-PAAFs. Furthermore, the cell-cell communication axes involving ECM components and soluble ligands provided by PAAFs that direct cell state dynamics of PASMCs offer some of the first and foundational descriptions of what are likely complex cellular interactions that await discovery.
We agree with the reviewer’s assessment that some of the novel data in our study help to formulate testable hypotheses that can be followed through with more focused follow-up research.
(4) Technical rigor is quite high in the -omics methodology and in vitro phenotyping tools used.
We are grateful for the reviewer’s assessment of our work and positive recognition.
Weaknesses:
There are some weaknesses in the methodology that should temper the conclusions:
(1) The number of donors sampled for PAAF/PASMCs was small for both healthy controls and IPAH patients. Thus, while the level of detail of -omics profiling was quite deep, the generalizability of their findings to all IPAH patients or Group 1 PAH patients is limited.
We appreciate the reviewer’s concerns regarding the generalizability of the findings and have acknowledged this as a study limitation in the discussion: “A low case number and end-stage disease samples used for omics characterization represents a study limitation that has to be taken into account before assuming similar findings would be evident in the entire PAH patient population over the course of the disease development and progression”. We have addressed this issue by performing validation of key in vitro findings using fresh cells or assessment of FFPE lung material from additional independent samples in the revised manuscript (Figures 2D, 3D, 3H, 4H). For transparency, we provide the number of biological samples in the Results section of the modified manuscript.
(2) While the study utilized early passage cells, these cells nonetheless were still cultured outside the in vivo milieu prior to analysis. Thus, while there is an assumption that these cells do not change fundamental behavior outside the body, that is not entirely proven for all transcriptional and proteomic signatures. As such, the major alterations that are noted would be more compelling if validated from tissue or cells derived directly from in vivo sources. Without such validation, the major limitation of the impact and conclusions of the paper is that the full extent of the relevance of these findings to human disease is not known.
We thank the reviewer for this constructive and excellent suggestion. The comparison of fresh and cultured cells revealed a strong and early divergence of differentially regulated pathways for PAAF, but a more gradual transition for PASMC. The results of this analysis are included in the new Figures 2D, 3D, 3H, and 4H. Implications are discussed in the revised manuscript: “However, the same mechanism renders cells susceptible to phenotypic change induced simply by extended vitro culturing, testified by broad expression profile differences between fresh and cultured cells. This common caveat in cell biology research and represents a technical and practical tradeoff that requires cross validation of key findings. Using a combination of archived lung tissue and available single cell RNA sequencing dataset of human pulmonary arteries, we show that some of the key defining phenotypic features of diseased cells, such as altered proliferation rate and ECM production, are preserved and gradually lost upon prolonged culturing”.
(3) While the presentation of most of the manuscript was quite clear and convincing, the terminology and conclusions regarding "cell fate trajectories" throughout the manuscript did not seem to be fully justified. That is, all of the analyses were derived from cells originating from end-stage IPAH, and otherwise, the authors were not lineage tracing across disease initiation or development (which would be impossible currently in humans). So, while the description of distinct "IPAH-associated states" makes sense, any true cell fate trajectory was not clearly defined.
In accordance with the reviewer’s comment, we have decided to modify the wording to exclude the “cell fate trajectory” phrase and replace it with “acquisition of disease cell state”.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Major comments:
(1) In Figure 1, PASMC and PAAF were collected from the lungs of healthy donors and analyzed for transcriptomics and proteomics; in Figure 1A, it can be taken as if both cells from IPAH patients were also analyzed, but this is not reflected in the results. In Figure 1D, immunostaining of normal lungs confirms the localization of PASMC and PAAF markers found by transcriptomics. The authors describe a strong, but not perfect, correlation between the transcriptomics and proteomics data from Figure S1, but the gene names of each cellular marker they found should also be listed. In addition, the authors have observed the expression of markers characteristic of PASMC and PAAF in pulmonary vessels of healthy subjects by IH, but is there any novelty in these markers? Furthermore, are the expression sites of these markers altered in IPAH patients?
In the revised manuscript we have adjusted the schematic to reflect the fact that only donor cells are compared in Figure 1. We additionally provide a correlation of cell type markers between proteomic and transcriptomic data sets for those molecules that are detected in both datasets (Figure S1B).
We provide clarification on the novelty aspect in the result section: “Some of the molecules were previously associated with predominant SMC, such as RGS5 and CSPR1 (Crnkovic et al., 2022; Snider et al., 2008), or adventitial fibroblast, such as SCARA5, CFD and MGST1 (Crnkovic et al., 2022; Sikkema et al., 2023) expression”. Except for RGS5, expression and localization of other markers in IPAH was previously unknown.
The conservation of expression sites for reported markers was validated in IPAH in the revised manuscript (Figure 2D), with IGFBP5 showing dual localization in both cell types. Moreover, results in Figure 1D, 1E and 2D support the validity of omics findings and preservation of key markers during passaging.
(2) In Figure 2, the authors compare PASMC and PAAF derived from IPAH patients and donors. The results show that transcriptomics and proteomics changes are clearly differentiated by cell type and not by pathological state. In the pathological state, transcriptional changes are more pronounced. The GO analysis of the factors that showed significant changes in each cell type is shown in Figure 2E, but the differences between the GO analysis of the transcriptomics and proteomics results are not clearly shown. The reviewer believes that the advantages of a combined analysis of both should be indicated. Also, in Figure 2G, the GAG content in PA appears to be elevated in only 3 cases, while the other 5 cases appear to be at the same level as the donor; is there a characteristic change in these 3 cases? Figure 2I shows that the phenotype of PAAF changes with cell passages. Since this phenomenon would be interesting and useful to the reader, additional discussion regarding the mechanism would be desired.
We integrated both data sets to achieve a stronger and more meaningful analysis, given the incomplete correlation between the transcriptomic and proteomic datasets, as indicated in the results section: “Comparative analysis of transcriptomic and proteomic data sets revealed a strong, but not complete level of linear correlation between the gene and protein expression profiles (Figure S1B, C). We therefore decided to use an integrative dataset and analyzed all significantly enriched genes and proteins (-log10(P)>1.3) between both cell types to achieve stronger and more robust analysis”. In general, the proteomic profile showed fewer significant differences and a smaller extent of change than the transcriptomic profile, likely due to technical limitations and sensitivity of the method, as evidenced by the complete absence of the top transcriptomic molecules (RGS5, ADH1C, IGFBP5, CFD, SCARA5) from the protein dataset.
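For orientation, the -log10(P)>1.3 cutoff quoted above corresponds to roughly P<0.05. The following is a minimal Python sketch of that conversion only; the function name `passes_cutoff` is hypothetical and is not taken from the manuscript or its analysis pipeline.

```python
import math

def passes_cutoff(p_value, cutoff=1.3):
    """Return True when -log10(p_value) exceeds the cutoff (1.3 in the quoted text)."""
    return -math.log10(p_value) > cutoff

# The P value that sits exactly on the cutoff: 10 ** -1.3 is about 0.0501.
p_at_cutoff = 10 ** -1.3
```

Under this reading, a molecule with P = 0.01 passes the cutoff and one with P = 0.06 does not.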
To strengthen the findings of increased GAG in IPAH pulmonary arteries, we have performed compartment-specific, quantitative image analysis of Alcian blue staining on additional donor and patient samples (n=10 for each condition). The new analysis totaling around 40 PA confirmed significantly increased deposition of GAG in IPAH pulmonary arteries.
We have addressed the issue of phenotypic change with prolonged cell culture in the revised manuscript by systematically comparing enrichment for biological processes between fresh (Crnkovic et al., 2022: GSE210248), very early (this study: GSE255669) and later passage cells (Chelladurai et al., 2022: GSE144932; Gorr et al., 2020: GSE144274). We observed cell type differences in the rate of change of phenotypic features, with PAAF showing faster shift early on during culturing that could for some of the features be due to isolation from immunomodulatory environment or presence of hydrocortisone supplement in the PAAF cell media. These points have been described in the revised results section and mentioned in the discussion.
(3) The authors claim that one feature of this paper is the use of "very early passage (p1)" of pulmonary artery smooth muscle cells (PASMC). Since there are other existing (previously reported) data that are publicly available, such as RNA-seq data using cells with 2-4 cell passages, it may be possible to show that fewer passages are better in primary culture by comparing the data presented in this paper.
Following the reviewers’ comments, we have performed a systematic comparison of fresh (Crnkovic et al., 2022: GSE210248), very early (this study: GSE255669) and later passage cells (Chelladurai et al., 2022: GSE144932; Gorr et al., 2020: GSE144274) in the revised manuscript in order to comprehensively address the issue and define changes occurring as a result of prolonged in vitro conditions (Figure 3H). The results showed that the expression profile of early passage cells retains some of the key phenotypic features displayed by cells in their native environment, with PASMC displaying a more gradual loss of phenotypic characteristics compared to PAAF. Interestingly, PAAF displayed a striking inverse enrichment for inflammatory/NF-kB signaling between fresh and cultured PAAF, which could potentially be caused by the hydrocortisone supplement in the PAAF cell media or by isolation from their highly immunomodulatory environment. These points have been described in the revised results section and mentioned in the discussion.
(4) The authors describe a study characterized by decreased expression of "cytoskeletal contractile elements" in pulmonary artery smooth muscle cells (PASMC) derived from patients with IPAH. What are the implications of this result, and does it arise from the use of smooth muscle in patients resistant to pulmonary artery smooth muscle dilating agents? A discussion on this issue needs to be made in a way that is easy for the reader to understand.
The reviewer raises an interesting point regarding the loss of contractile markers and the response to vasodilating therapy. We would speculate that an isolated decrease in contractile machinery, without concomitant change in ECM and other PASMC features, would dampen both the contraction and relaxation properties of the single PASMC, affecting not only its response to dilating agents, but also to vasoconstrictors. Clinical consequences and responsiveness to dilating agents are more difficult to predict, since the vasoactive response would additionally depend on mechanical properties of the pulmonary artery defined by cellular and ECM composition. Nevertheless, we believe that decreased expression of contractile machinery reflects an intrinsic, “programmed” response of SMC to remodeling, rather than vasodilator therapy-induced selection pressure, since a similar phenotypic change is observed in SMC from the systemic circulation and in various animal models without exposure to PAH medication. These considerations have been included in the revised discussion section.
(5) There are a lot of secreted proteins that increase or decrease in Figure 6G, but there is scant reason to focus on PTX3 and HGF among them. The authors need to elaborate on the above issue.
We regret the lack of clarity and provide an improved explanation of the ligand selection strategy in the revised manuscript. To prioritize the potential hits, we first used hierarchical clustering to group co-regulated ligands into a smaller number of groups. We then prioritized ligands that lacked or had limited information with respect to IPAH. Based on these results, we analyzed the effect of three additional ligands on PASMC cell state marker expression (Figure S8). These additional data supported the initial focus on PTX3 and HGF.
Minor comments:
(1) Regarding the number of specimens used in the Result, it would be more helpful to the reader if the number of samples were also mentioned in the text.
We have included the number of used samples in manuscript text.
(2) There is no explanation of what R2Y represents in Figure 2B. This reviewer is not able to understand the statistical analysis of Figure 2H. The detailed results should be explained.
We apologize for the oversight in labeling of Figure 2B and modify the figure legend: “Orthogonal projection to latent structures-discriminant analysis (OPLS-DA) T score plots separating predictive variability (x-axis), attributed to biological grouping, and non-predictive variability (technical/inter-individual, y-axis). Monofactorial OPLS-DA model for separation according to cell type or disease. C) Bifactorial OPLS-DA model considering cell type and disease simultaneously. Ellipse depicting the 95% confidence region, Q2 denoting model’s predictive power (significance: Q2>50%) and R2Y representing proportion of variance in the response variable explained by the model (higher values indicating better fit)”.
We also modified figure legend wording for the analysis in Figure 2H (new Figure 3E) to clarify the independent factors whose interaction was investigated using 3-way ANOVA: “Interaction effects of stimulation, cell type, and disease state on cellular proliferation were analyzed by 3-way ANOVA. Significant interaction effects are indicated as follows: * for stimulation × cell type interactions and # for cell type × disease state interactions (both *, # p<0.05)”.
(3) In Figure 3, the authors examined whether there were molecular abnormalities common to IPAH-PASMC and IPAH-PAAF and found that the number of commonly regulated genes and proteins was limited to 47. Further analysis of these regulators by STRING analysis revealed that factors related to the regulation of apoptosis are commonly altered in both cells. On the other hand, the authors focused on mitochondria, as SOD2 is downregulated, and found an increase in ROS production specific to PASMC, indicating that mitochondrial dysfunction is common to PASMC and PAAF in IPAH, but downstream phenomena are different between cell types. Factors associated with apoptosis regulation have been found to be both upward and downward regulated, but the actual occurrence of apoptosis in both cell types has not been addressed.
We have performed TUNEL staining on FFPE lung tissue from donors and IPAH patients that revealed apoptosis as a rare event in both conditions in PASMC and PAAF. Therefore, no meaningful quantification could be conducted. An example of pulmonary artery where rare positive signal in either PAAF or PASMC could be found is provided in Figure 4H.
Unfortunately, association of a particular gene with a pathway is by default arbitrary and potentially ambiguous. In particular, factors identified as associated with apoptosis are also involved in regulation of inflammatory signaling (BIRC3, DDIT3) and amino acid metabolism (SHMT1). Nevertheless, mitochondria represent a crucial cellular hub for apoptosis regulation and, as shown in the current study, display significant functional alterations in IPAH in both cell types, aligning with reduced mitochondrial superoxide dismutase (SOD2) expression.
(4) The meaning of the gray circle in Figure 3C should be clarified. Similarly, the meaning of the color in Fig. 3D should be clearly explained. In Figure 3E-G, each cell is significantly different from 18-61 cells, and the number of each cell and the reason should be described.
We regret the confusion and provide better explanation of the figure legend: “gray nodes representing their putative upstream regulators”, “with color coding reflecting the IPAH dependent regulation”. In the revised Figure panels 4E-G (old 3E-G) we provide the exact number of cells measured in each condition. Although we tried to have comparable cell confluency at the time of measurement, different proliferation rates between cells from different cell type and condition led to different number of measured cells per donor/patient.
(5) In Figure 4, the authors focus on factors that vary in different directions between cells, revealing fingerprints of molecular changes that differ between cell types, particularly IPAH-PASMC, which acquires a synthetic phenotype with enhanced regulation of chemotaxis elements, whereas IPAH-PAAF, a fast cycling cell characteristics. Next, focusing on the ECM components that were specifically altered in IPAH-PASMC, Nichenet analysis in Figure 5 suggested that ligands from PAAF may act on PASMC, and the authors focused on integrin signaling to examine ECM contact and changes in cell function. The results indicate that adhesion to laminin is poor in PASMC. Although no difference was observed between donor and IPAH PASMCs, a discussion of the reasons for this would be desired and helpful to the readers.
Both donor and IPAH PASMCs respond similarly to laminin. However, our key finding is the downregulation of laminin in IPAH PAAF, which likely leads to a skewed laminin-to-collagen ratio and altered ECM composition in remodeled arteries. This shift in the ECM class results in altered PASMC behavior, affecting both donor and IPAH cells similarly. In the revised manuscript, we demonstrate that PASMC largely retain the expression pattern of integrin subunits that serve as high-affinity collagen and laminin receptors, with higher levels compared to PAAF (Figure 6F, G). Furthermore, we speculate that the distinct cellular phenotypic responses to collagen versus laminin coatings may arise from different downstream signaling pathways activated by the various integrin subunits (Nguyen et al., 2000). These considerations have been included in the revised discussion: “The comparable responses of donor and IPAH PASMC likely result from their shared integrin receptor expression profiles. Meanwhile, ECM class switching engages different high-affinity integrin receptors, which activate alternative downstream signaling pathways (Nguyen et al., 2000) and lead to differential responses to collagen and laminin matrices. We thus propose a model in which laminins and collagens act as PAAF-secreted ligands, regulating PASMC behavior through their ECM-sensing integrin receptors.”
(6) Since Figure 3B and Figure 4A seem to show the same results, why not combine them into one?
Indeed, these figure panels show the same results, but the focus of the investigations in each Figure is different. We therefore opted to keep the panels separate for better clarity and a logical link to the other panels in the same figure.
(7) In Figure 6, the interaction analysis of scRNAseq data with respect to signaling between PASMC and PAAF was performed using Nichenet and CellChat, showing that signaling from PAAF to PASMC is biased toward secreted ligands and that a functionally relevant set of soluble ligands is impaired in the IPAH state. From there, they proceeded with co-culture experiments and showed that co-culture healthy PASMC with PAAF of IPAH patients abolished PASMC markers in the healthy state. Furthermore, the authors attempted to identify ligands that induce functional changes in PASMCs produced from IPAH PAAFs and found that HGF is a factor that downregulates the expression of contractile markers in PASMCs. Further insights may be gained by co-culturing IPAH-derived cells in co-culture experiments. Also, no beneficial effect of pentraxin3 was found in Figure 6H. The authors should examine the effect of pentraxin3 on PASMC cells derived from IPAH patients, rather than healthy donors.
We tested the influence of IPAH-PASMC on donor-PAAF and found no effect on the expression of the selected markers. We thank the reviewer for the suggestion to conduct the experiments on IPAH-PASMC. The new data show that both PTX3 and HGF have a significant but differential effect on IPAH-PASMC as compared to donor-PASMC. Whereas PTX3 lacks an effect on donor PASMC, it leads to downregulation of some of the contractile markers in IPAH PASMC, while HGF upregulates the synthetic marker VCAN in IPAH PASMC. These results are now included in Figure 7H.
Reviewer #2 (Recommendations For The Authors):
The authors should double-check for grammar and typos in the manuscript. I caught a few such as "therefor" and others, but there could be more.
We thank the reviewer for the effort and time in reading and evaluating the manuscript. To the best of our knowledge, we have corrected the grammatical errors in the revised manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
(1) Although there are many citations acknowledging relevant previous work, there often isn't a very granular attribution of individual previous findings to their sources. In the results section, it's sometimes ambiguous when the paper is recapping established background and when it is breaking new ground. For example, around equation 8 in the results (sv = r - rho*t), it would be good to refer to previous places where versions of this equation have been presented. Offhand, McNamara 1982 (Theoretical Population Biology) is one early instance and Fawcett et al. 2012 (Behavioural Processes) is a later one. Line 922 of the discussion seems to imply this formulation is novel here.
We would like to clarify that equation 8 of the original manuscript (sv = r - rho*t, as the reviewer writes it), as we derive it, is not new: it is similarly expressed in prior foundational work by McNamara (1982). We thank the reviewer for drawing our attention to the extension of this form by Fawcett, McNamara, Houston (2012).
We now properly acknowledge this foundational work and its extension in the results section…
“This global reward-rate equivalent immediate reward (see Figure 4) is the subjective value of a pursuit, svPursuit (or simply, sv, when the referenced pursuit can be inferred), as similarly expressed in prior foundational work (McNamara 1982), and subsequent extensions (see Fawcett, McNamara, Houston 2012).”
…and in the Discussion section at the location referenced by the reviewer:
“From it, we re-expressed the pursuit’s worth in terms of its global reward rate-equivalent immediate reward, i.e., its ‘subjective value’, reprising McNamara’s foundational formulation (McNamara 1982).”
(2) The choice environments that are considered in detail in the paper are very simple. The simplicity facilitates concrete examples and visualizations, but it would be worth further consideration of whether and how the conclusions generalize to more complex environments. The paper considers "forgo" scenario in which the agent can choose between sequences of pursuits like A-B-A-B (engaging with option B at all opportunities, which are interleaved with a default pursuit A) and A-A-A-A (forgoing option B). It considers "choice" scenarios where the agent can choose between sequences like A-B-A-B and A-C-A-C (where B and C are larger-later and smaller-sooner rewards, either of which can be interleaved with the default pursuit). Several forms of additional complexity would be valuable to consider. [A] One would be a greater number of unique pursuits, not repeated identically in a predictable sequence, akin to a prey-selection paradigm. It seems to me this would cause t_out and r_out (the time and reward outside of the focal prospect) to be policy-dependent, making the 'apportionment cost' more challenging to ascertain. Another relevant form of complexity would be if there were [B] variance or uncertainty in reward magnitudes or temporal durations or if [C] the agent had the ability to discontinue a pursuit such as in patch-departure scenarios.
A) We would like to note that the section “Deriving Optimal Policy from Forgo Decision-making worlds” addresses the reviewer’s scenario of an n-number of pursuits, each occurring at its own frequency, as in prey selection, and not repeating identically in a predictable sequence. Within our subsection “Parceling the world…”, we introduce the concept of dividing a world (such as that) into the considered pursuit type, and everything outside of it. ‘Outside’ would include any number of other pursuits currently part of any policy, as the reviewer intuits, thus making t<sup>out</sup> and r<sup>out</sup> policy dependent. Nonetheless, a process of excluding (forgoing) pursuits by comparing the ‘in’ to the ‘out’ reward rate (section “Reward-rate optimizing forgo policy…”) or its equivalent sv (section “The forgo decision can also be made from subjective value”), would iteratively lead to the global reward rate maximizing policy. This manner of parceling into ‘in’ and ‘out’ thus simplifies visualization of what can be complex worlds. Simpler cases that resemble common experimental designs are given in the manuscript to enhance intuition.
We thank the reviewer for this keen suggestion. We now include example figures (Supplemental 1 & 2) for multi-pursuit worlds which have the same (Supplemental 1) and different pursuit frequencies (Supplemental 2), which illustrate how this evaluation leads to reward-rate optimization. This addition demonstrates how an iterative policy would lead to reward rate maximization and emphasizes how parcellating a world into ‘in’ and ‘out’ of the pursuit type applies and is a useful device for understanding the worth of any given pursuit in more complex worlds. The policy achieving the greatest global reward rate can be realized through an iterative process where pursuits with lower reward rates than the reward rate obtained from everything other than the considered pursuit type are sequentially removed from the policy.
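The iterative process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to produce the supplemental figures: each pursuit type is summarized by an average reward, an average time and an occurrence frequency per traversal of the world; the default pursuit cannot be forgone; and the names `Pursuit`, `reward_rate` and `optimize_forgo_policy`, as well as all numbers, are hypothetical.

```python
from typing import NamedTuple

class Pursuit(NamedTuple):
    name: str
    reward: float  # average reward per occurrence
    time: float    # average duration per occurrence
    freq: float    # occurrences per traversal of the world

def reward_rate(pursuits):
    """Reward per unit time over one traversal of the given pursuits."""
    total_reward = sum(p.freq * p.reward for p in pursuits)
    total_time = sum(p.freq * p.time for p in pursuits)
    return total_reward / total_time

def optimize_forgo_policy(default, options):
    """Iteratively forgo the pursuit whose 'in' rate falls furthest below its 'out' rate."""
    policy = list(options)
    while policy:
        worst, worst_gap = None, 0.0
        for p in policy:
            # 'Outside' is everything in the current policy except pursuit p.
            outside = [default] + [q for q in policy if q is not p]
            gap = p.reward / p.time - reward_rate(outside)
            if gap < worst_gap:
                worst, worst_gap = p, gap
        if worst is None:
            break  # every remaining pursuit beats its outside reward rate
        policy.remove(worst)
    return policy

# Illustrative world: an unrewarded default pursuit plus three optional pursuits.
default = Pursuit("default", 0.0, 10.0, 1.0)
options = [
    Pursuit("A", 4.0, 2.0, 1.0),
    Pursuit("B", 1.0, 4.0, 1.0),
    Pursuit("C", 3.0, 3.0, 1.0),
]
best = optimize_forgo_policy(default, options)
names = [p.name for p in best]
```

In this toy world, pursuit B has a lower reward rate (0.25) than everything outside of it (7/15), so it is forgone; the remaining policy of A and C raises the global reward rate from 8/19 to 7/15.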
B) We would also emphasize that the formulation here contends with variance or uncertainty in the reward magnitudes or temporal durations. The ‘in’ pursuit is characterized by the average reward and the average time of the considered pursuit type, just as the ‘out’ is characterized by the average reward and average time outside of the considered pursuit type.
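One consequence of working with averages is worth spelling out with numbers: the reward rate of a variable pursuit is the average reward divided by the average time, which generally differs from the average of the per-occurrence rates. The snippet below is a minimal illustration with made-up outcomes, assuming the listed outcomes occur equally often; the function name is hypothetical.

```python
def reward_rate_from_outcomes(outcomes):
    """Long-run reward rate of a pursuit type whose (reward, time) outcomes
    occur equally often: total reward divided by total time."""
    total_reward = sum(r for r, _ in outcomes)
    total_time = sum(t for _, t in outcomes)
    return total_reward / total_time

# Two equally likely outcomes of the same pursuit type: (reward, time).
outcomes = [(1.0, 1.0), (9.0, 3.0)]

rate = reward_rate_from_outcomes(outcomes)                        # 10 / 4 = 2.5
mean_of_ratios = sum(r / t for r, t in outcomes) / len(outcomes)  # (1 + 3) / 2 = 2.0
```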
C) In this work, we consider the worth of initiating one-or-another pursuit (from having completed a prior one), and not the issue of continuing within a pursuit (having already engaged it), as in patch/give-up. Handling worlds in which the agent may depart from within a pursuit, which is to say ‘give-up’ (as in patch foraging), is outside the scope of this work.
(3) I had a hard time arriving at a solid conceptual understanding of the 'apportionment cost' around Figure 5. I understand the arithmetic, but it would help if it were possible to formulate a more succinct verbal description of what makes the apportionment cost a useful and meaningful quality to focus on.
We thank the reviewer for pressing for a succinct and intuitive verbal description.
We added the following succinct verbal description of apportionment cost… “Apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration.” This definition appears in new paragraphs (as below) describing apportionment cost in the results section “Time’s cost: opportunity & apportionment costs determine a pursuit’s subjective value”, and is accompanied by equations for apportionment cost, and a figure giving its geometric depiction (Figure 5). We also expanded original figure 5 and its legend (so as to illustrate the apportionment scaling factor and the apportionment cost), and its accompanying main text, to further illustrate and clarify apportionment cost, and its relationship to opportunity cost, and time’s cost.
“What, then, is the amount of reward by which the opportunity cost-subtracted reward is scaled down to equal the sv of the pursuit? This amount is the apportionment cost of time. The apportionment cost of time (height of the brown vertical bar, Figure 5F) is the global reward rate after taking into account the opportunity cost (slope of the magenta-gold dashed line in Figure 5F) times the time of the considered pursuit. Equally, the difference between the inside and outside reward rates, times the time of the pursuit, is the apportionment cost when scaled by the pursuit’s weight, i.e., the fraction that the considered pursuit is to the total time to traverse the world (Equation 9, right hand side). From the perspective of decision-making policies, apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration (Equation 9 center, Figure 5F).
Equation 9. Apportionment Cost.
While this difference is the apportionment cost of time, the opportunity cost of time is the amount that would be expected from a policy of not taking the considered pursuit over a time equal to the considered pursuit’s duration. Together, they sum to Time’s Cost (Figure 5G). Expressing a pursuit’s worth in terms of the global reward rate obtained under a policy of accepting the pursuit type (Figure 5 left column), or from the perspective of the outside reward and time (Figure 5 right column), are equivalent. However, the latter expresses sv in terms that are independent of one another, conveys the constituents giving rise to global reward rate, and provides the added insight that time’s cost comprises an apportionment as well as an opportunity cost.”
The above definition of apportionment cost adds to other stated relationships of apportionment cost found throughout the paper (original lines 434,435,447,450).
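As a numerical check on the verbal definitions quoted above, the sketch below computes opportunity cost, apportionment cost and time's cost for one illustrative world. The symbols follow the response (r and t for the reward and time inside and outside the considered pursuit); the function name, dictionary keys and numbers are assumptions made for illustration, and the formulas are written from the verbal descriptions rather than copied from Equation 9.

```python
def time_costs(r_in, t_in, r_out, t_out):
    """Decompose time's cost for a pursuit with reward r_in and duration t_in,
    given the reward r_out and time t_out spent outside of it."""
    rho_in = r_in / t_in                          # reward rate inside the pursuit
    rho_out = r_out / t_out                       # reward rate outside the pursuit
    rho_global = (r_in + r_out) / (t_in + t_out)  # rate under a policy of taking it

    opportunity = rho_out * t_in                   # expected from not taking, over t_in
    apportionment = (rho_global - rho_out) * t_in  # taking minus not taking, over t_in
    # Same quantity via the pursuit's weight, t_in / (t_in + t_out).
    weight_in = t_in / (t_in + t_out)
    apportionment_weighted = (rho_in - rho_out) * t_in * weight_in

    times_cost = opportunity + apportionment
    return {
        "opportunity": opportunity,
        "apportionment": apportionment,
        "apportionment_weighted": apportionment_weighted,
        "times_cost": times_cost,
        "sv": r_in - times_cost,
    }

costs = time_costs(r_in=4.0, t_in=2.0, r_out=3.0, t_out=6.0)
```

With these numbers the opportunity cost is 1.0, the apportionment cost is 0.75 by either route, time's cost is 1.75, and the subjective value is 2.25.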
I think Figure 6C relates to this, but I had difficulty relating the axis labels to the points, lines, and patterned regions in the plot.
We thank the reviewer for pointing out that this figure can be made to be more easily understood.
We have done so by breaking its key features over a greater number of plots so that no single panel is overloaded. We have also changed text in the legend to clarify how apportionment and opportunity costs add to constitute time’s cost, and also correspondingly in the main text.
I also was a bit confused by how the mathematical formulation was presented. As I understood it, the apportionment cost essentially involves scaling the rest of the SV expression by t<sup>out</sup>/(t<sup>in</sup> + t<sup>out</sup>).
The reviewer’s understanding is correct: the amount of reward of the pursuit that remains after subtracting the opportunity cost, when so scaled, is equivalent to the subjective value of that pursuit. The amount by which that scaling decreases the rest of the SV expression is equal to the apportionment cost of time.
The way this scaling factor is written in Figure 5C, as 1/(1 + (1/t<sup>out</sup>) t<sup>in</sup>), seems less clear than it could be.
To be sure, we present the formula in original Figure 5C in this manner to emphasize the opportunity cost subtraction as separable from the apportionment rescaling, expressing the opportunity cost subtraction and the apportionment scaling component of the equation as their own terms in parentheses.
But we understand the reviewer to be referring to the manner by which we chose to express the scaling term. We presented it in this way in the original manuscript, (rather than its more elegant form recognized by the reviewer) to make direct connection to temporal discounting literature. In this literature, discounting commonly takes the same mathematical form as our apportionment cost scaling, but whereas the steepness of discounting in this literature is controlled by a free fit parameter, k, we show how for a reward rate maximizing agent, the equivalent k term isn’t a free fit parameter, but rather is the reciprocal of the time spent outside the considered pursuit type.
We take the reviewer’s advice to heart, and now first express subjective value in the format that emphasizes opportunity cost subtraction followed by an apportionment downscaling, identifying the apportionment scaling term, t<sup>out</sup>/(t<sup>out</sup> + t<sup>in</sup>), i.e., the outside weight. Figure 5 now shows the geometric representation of apportionment scaling and apportionment cost. Only subsequently, in the discounting function section of the revised manuscript, do we rearrange this subjective value expression to resemble the standard discounting function form.
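For readers following the algebra, the equivalence between the forms discussed in this exchange can be written out as below. This is a sketch using the symbols of the response; the global reward rate symbol \rho^{g} is introduced here for convenience and may not match the manuscript's notation.

```latex
\begin{align}
sv &= r^{in} - \rho^{g}\, t^{in},
      \qquad \rho^{g} = \frac{r^{in} + r^{out}}{t^{in} + t^{out}} \\
   &= \frac{r^{in}\, t^{out} - r^{out}\, t^{in}}{t^{in} + t^{out}} \\
   &= \left( r^{in} - \rho^{out}\, t^{in} \right) \frac{t^{out}}{t^{out} + t^{in}},
      \qquad \rho^{out} = \frac{r^{out}}{t^{out}} \\
   &= \frac{r^{in} - \rho^{out}\, t^{in}}{1 + \frac{1}{t^{out}}\, t^{in}}
\end{align}
```

The third line is the opportunity-cost subtraction followed by apportionment scaling; the last line has the hyperbolic shape familiar from the discounting literature, with 1/t<sup>out</sup> in the place usually taken by the free parameter k.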
Also, the apportionment cost is described in the text as being subtracted from sv rather than as a multiplicative scaling factor.
What we describe in the original text is how apportionment cost is a component of time’s cost, and how sv is the reward less time’s cost. It would be correct to say that apportionment cost and opportunity cost are subtracted from the pursuit’s reward to yield the subjective value of the pursuit. This is what we show in the original Figure 5D graphically. Original Figure 5 and accompanying formulas at its bottom show the equivalence of expressing sv in terms of subtracting time’s cost as calculated from the global reward rate under a policy of accepting the considered pursuit, or, of subtracting opportunity cost and then scaling the opportunity cost subtracted reward by the apportionment scaling term, thereby accounting for the apportionment cost of time.
The revision of original figure 5, its figure legend, and accompanying text now make clear the meaning of apportionment cost, how it can be considered a subtraction from the reward of a pursuit, or, equivalently, how it can be thought of as the result of scaling down of opportunity cost subtracted reward.
It could be written as a subtraction, by subtracting a second copy of the rest of the SV expression scaled by t_in/(t_in + t_out). But that shows the apportionment cost to depend on the opportunity cost, which is odd because the original motivation on line 404 was to resolve the lack of independence between terms in the SV expression.
On line 404 of the original manuscript, we point out that the simple equation, which is a reprise of McNamara’s insight, is problematic in that its terms on the RHS are not independent: the global reward rate is dependent on the considered pursuit’s reward (see Fig. 5B). The alternative expression for subjective value that we derive expresses sv in terms that are all independent of one another. We may have unintentionally obscured that fact by having already defined rho<sup>in</sup> as r<sup>in</sup>/t<sup>in</sup> and rho<sup>out</sup> as r<sup>out</sup>/t<sup>out</sup> on lines 306 and 307.
Therefore, in the revision, Ap 8 is expressed so as to keep clear that it uses terms that are all independent of one another, and only subsequently is this formula expressed with the simplifying substitution, rho<sup>out</sup>.
That all said, we understand the reviewer’s point to be that the parenthetical terms relating the opportunity cost and the apportionment rescaling both contain within them the parameter t<sup>out</sup>, and in this way these concepts we put forward to understand the alternative equation are non-independent. That is correct, but it isn’t at odds with our objective to express SV in terms that are independent of one another (which we do). Our motivation in introducing these concepts is to provide insight and intuition into the cost of time (especially now with a clear and simple definition of apportionment cost stated). We go to lengths to demonstrate their relationship to each other.
(4) In the analysis of discounting functions (line 664 and beyond), the paper doesn't say much about the fact that many discounting studies take specific measures to distinguish true time preferences from opportunity costs and reward-rate maximization.
We understand the reviewer’s comment to suggest that temporal decision-making worlds in which delay time does not preclude reward from outside the current pursuit are a means to distinguish time preference from the impact of opportunity cost. One contribution of this work is to demonstrate that, from a reward-rate maximization framework, an accounting of opportunity cost is not sufficient to understand apparent time preferences as distinguishable from reward-rate maximization. The apportionment cost of time must also be considered to have a full appreciation of the cost of time. For instance, let us consider a temporal decision-making world in which there is no reward received outside the considered pursuit. In such a world, there is no opportunity cost of time, so apparent temporal discounting functions would appear as if purely hyperbolic as a consequence of the apportionment cost of time alone. Time preference, as revealed experimentally by the choices made between a smaller-sooner (SS) and a larger-later (LL) reward, then, seems confounding, as preference can reverse from an SS to an LL option as the displacement of those options (maintaining their difference in time) increases (Green, Fristoe, and Myerson 1994; Kirby and Herrnstein 1995). While this shift, the so-called “Delay effect”, could potentially arise as a consequence of some inherent time preference bias of an agent, we demonstrate that a reward-rate maximal agent exhibits hyperbolic discounting, and therefore it would also exhibit the Delay effect, even though it has no time preference.
In the revision we now make reference to the Delay Effect (in abstract, results new section “The Delay Effect” with new figure 14, and in the discussion), which is taken as evidence of time preference in human and animal literature, and note explicitly how a reward-rate maximizing agent would also exhibit this behavior as a consequence of apparent hyperbolic discounting.
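The preference reversal described above can be illustrated with a minimal sketch. It assumes subjective value takes the form sv = (r_in - rho_out * t_in) * t_out / (t_out + t_in), which is our reading of the opportunity-cost-subtracted reward scaled by time's apportionment; the function names and the numerical values for the SS and LL options are hypothetical and chosen only for illustration.

```python
def subjective_value(r_in, t_in, t_out, rho_out=0.0):
    """Assumed form: opportunity-cost-subtracted reward, scaled by t_out / (t_out + t_in)."""
    return (r_in - rho_out * t_in) * t_out / (t_out + t_in)


def preferred(ss, ll, added_delay, t_out, rho_out=0.0):
    """Return which option a reward-rate-maximizing agent picks.

    ss and ll are (reward, time) pairs; added_delay displaces both options
    equally, so their difference in time is maintained.
    """
    sv_ss = subjective_value(ss[0], ss[1] + added_delay, t_out, rho_out)
    sv_ll = subjective_value(ll[0], ll[1] + added_delay, t_out, rho_out)
    return "SS" if sv_ss > sv_ll else "LL"


# Hypothetical options in a world with no outside reward (rho_out = 0).
ss = (2.0, 1.0)  # smaller-sooner: reward 2 after 1 time unit
ll = (3.0, 4.0)  # larger-later: reward 3 after 4 time units

for d in (0.0, 2.0, 6.0, 10.0):
    print(d, preferred(ss, ll, d, t_out=1.0))
```

With these illustrative values the agent picks SS at small displacements and LL at large ones, despite having no time preference built into it.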
In many of the human studies, delay time doesn't preclude other activities.
Our framework is generalizable to worlds in which being in pursuit does not preclude an agent from receiving reward during that time at the outside reward rate. Original Ap 13 solves for such a condition, and shows that in this context, the opportunity cost of time drops out of the SV equation, leaving only the consequences of the apportionment cost of time. We made reference to this case on lines 1032-1034 of the original manuscript: “In this way, such hyperbolic discounting models [models that do not make an accounting of opportunity cost] are only appropriate in worlds with no “outside” reward, or, where being in a pursuit does not exclude the agent from receiving rewards at the rate that occurs outside of it (Ap. 13).”
The note and reference are fleeting in the original work. We take the reviewer’s suggestion and now add paragraphs in the discussion on the difference between humans and animals in apparent discounting, making specific note of human studies in which delay time doesn’t preclude receiving outside reward while engaged in a pursuit. Relatedly, hyperbolic discounting is often considered to be less steep in humans than in animals. As the reviewer points out, these assessments are frequently made under conditions in which being in a pursuit does not preclude receiving reward from outside the pursuit. When humans are tested under conditions in which outside rewards are precluded, they exhibit far steeper discounting. We now include citation to that observation (Jimura et al. 2009). We handle such conditions in original Ap 13, and show how, in such worlds, the opportunity cost of time drops out of the equation. The consequence of this is that the apparent discounting function would become less steep (the agent would appear as if more patient), consistent with reports.
“Relating to the treatment of opportunity cost, we also note that many investigations into temporal discounting do not make an explicit distinction between situations in which 1) subjects continue to receive the usual rewards from the environment during the delay to a chosen pursuit, and 2) situations in which during a chosen pursuit’s delay no other rewards or opportunities will occur (Kable & Glimcher, 2007; Kirby & Maraković, 1996; McClure, Laibson, Loewenstein, & Cohen, 2004). Commonly, human subjects are asked to answer questions about their preferences between options for amounts they will not actually earn after delays they will not actually have to wait, during which it is unclear whether they are really investing time away from other options or not (Rosati et al., 2007). In contrast, in most animal experiments, subjects actually receive reward after different delays during which they do not receive new options or rewards. By our formulation, when a pursuit does not exclude the agent from receiving rewards at the rate that occurs outside, the opportunity cost of time drops out of the subjective value equation (Ap 12).
Equation 10. The value of initiating a pursuit when pursuit does not exclude receiving rewards at the outside rate (Ap 12)
Therefore, the reward-rate maximizing discounting function in these worlds is functionally equivalent to the situation in which the outside reward rate is zero, and will, lacking an opportunity cost, be less steep. This rationalizes why human discounting functions are often reported to be longer (gentler) than animal discounting functions: they are typically tested in conditions that negate opportunity cost, whereas animals are typically tested in conditions that enforce opportunity costs. Indeed, when humans are made to wait for actually received reward, their observed discounting functions are much steeper (Jimura et al. 2009).”
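A brief numerical sketch of this contrast follows. It assumes the magnitude-normalized subjective value can be written as (1 - rho_out * t_in / r_in) / (1 + t_in / t_out), and it models a world in which the pursuit does not exclude outside reward by setting rho_out to zero, following the equivalence stated above. The function name and the parameter values are hypothetical.

```python
def discount(t_in, r_in, t_out, rho_out):
    """Assumed magnitude-normalized subjective value (apparent discounting function)."""
    return (1.0 - rho_out * t_in / r_in) / (1.0 + t_in / t_out)


r_in, t_out = 4.0, 5.0  # illustrative reward magnitude and outside time

for t_in in (0.0, 1.0, 2.0, 5.0):
    enforced = discount(t_in, r_in, t_out, rho_out=0.2)  # opportunity cost enforced
    negated = discount(t_in, r_in, t_out, rho_out=0.0)   # opportunity cost negated
    print(f"t_in={t_in}: enforced={enforced:.3f}, negated={negated:.3f}")
```

For every positive delay the curve with the opportunity cost enforced lies below the one with it negated, which is the sense in which the latter is gentler.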
In animal studies, rate maximization can serve as a baseline against which to measure additional effects of temporal discounting. This is an important caveat to claims about discounting anomalies being rational under rate maximization (e.g., line 1024).
We agree that the purpose of this reward-rate maximizing framework is to serve as a point of comparison against which the effects of the temporal intervals and rewards that define the environment can be analyzed, to better understand the manner in which animals and humans deviate from this ideal behavior. Our interest in this work is in part motivated by a desire to have a deeper understanding of what “true” time preference means. Using the reward-rate maximizing framework here provides a means to speak about time preferences (i.e., biases) in terms of deviation from optimality. From this perspective, a reward-rate maximal agent doesn’t exhibit time preference: its actions are guided solely by reward-rate optimizing valuation. Therefore, one contribution of this work is to show that purported signs of time preference (hyperbolic discounting and the magnitude, sign, and (now) delay effects) can be explained without invoking time preference. Whatever deviations from optimality remain following a proper accounting of reward-rate maximizing behavior should then, and only then, be considered through the lens of time preference (bias).
(5) The paper doesn't feature any very concrete engagement with empirical data sets. This is ok for a theoretical paper, but some of the characterizations of empirical results that the model aims to match seem oversimplified. An example is the contention that real decision-makers are optimal in accept/reject decisions (line 816 and elsewhere). This isn't always true; sometimes there is evidence of overharvesting, for example.
We would like to note that the scope of this paper is limited to examining the value of initiating a pursuit, rather than the value of continuing within a pursuit. The issue of continuing within a pursuit constitutes a third fundamental topology, which could be called give-up or patch-foraging, and is complex and warrants its own paper. In Give-up topologies, which are distinct from Forgo and Choice topologies, the reviewer is correct in pointing out that the preponderance of evidence demonstrates that animals and humans behave as if overpatient, adopting a policy of investing more time within a pursuit than is warranted. In Forgo instances, however, the evidence supports near optimality.
(6) Related to the point above, it would be helpful to discuss more concretely how some of this paper's theoretical proposals could be empirically evaluated in the future. Regarding the magnitude and sign effects of discounting, there is not a very thorough overview of the several other explanations that have been proposed in the literature. It would be helpful to engage more deeply with previous proposals and consider how the present hypothesis might make unique predictions and could be evaluated against them.
We appreciate the reviewer’s point that there are many existing explanations for these various ‘anomalous’ effects. We hold that the point of this work is to demonstrate that these effects are consistent with a reward-rate maximizing framework and so do not require additional assumptions, like separate processes for small and large rewards, or the inclusion of a utility function.
Nonetheless, there is a diversity of explanations for the sign and magnitude effects and (now with its explicit inclusion in the revision) the delay effect. Therefore, we now also include reference to additional work which proffers alternative explanations for the sign and magnitude effects (as reviewed by Kalenscher and Pennartz 2008; Frederick et al. 2002), as well as a scalar timing account of non-stationary time preference (Gibbon, 1977).
With respect to making predictions, this framework makes the following in regards to the magnitude, sign, and (now in the revision) delay effect: in Discussion, Magnitude effect subsection: “The Magnitude Effect should be observed, experimentally, to diminish when 1) increasing the outside time while holding the outside reward constant, (thus decreasing the outside reward rate), or when 2) decreasing the outside reward while holding the outside time constant (thus decreasing the outside reward rate). However, 3) the Magnitude Effect would exaggerate as the outside time increased while holding the outside reward rate constant.”, in Sign effect subsection: “…we then also predict that the size of the Sign effect would diminish as the outside reward rate decreases (and as the outside time increases), and in fact would invert should the outside reward rate turn negative (become net punishing), such that punishments would appear to discount more steeply than rewards.” Delay effect subsection: “...a sign of irrationality is that a preference reversal occurs at delays greater than what a reward-rate-maximizing agent would exhibit.”
A similar point applies to the 'malapportionment hypothesis' although in this case there is a very helpful section on comparisons to prior models (line 1163). The idea being proposed here seems to have a lot in common conceptually with Blanchard et al. 2013, so it would be worth saying more about how data could be used to test or reconcile these proposals.
We thank the reviewer for finding the section on model comparisons very helpful. We believe the text previously dedicated to this issue to be sufficient in this regard. We have, however, added substantively to the Malapportionment Hypothesis section (Discussion) and its accompanying figure, to make explicit a number of predictions from the Malapportionment Hypothesis as it relates to Hyperbolic discounting, the Delay Effect, and the Sign and Magnitude Effects.
Reviewer #1 Recommendations
(1) As a general note about the figures, it would be helpful to specify, either graphically or in the caption, what fixed values of reward sizes and time intervals are being assumed for each illustration.
Thank you for the suggestion. We attempted to keep graphs as uncluttered as possible, but agree that for original figures 4, 5, 16, and 17, which didn’t have numbered axes, we should provide the amounts in the captions of the revised figures (4, 5, and now 17, 18). These figures did not have numerics because their purpose is to illustrate the form of the relationship between vectors, which is general to the values they may take.
We now include in the captions for these figures the parameter amounts used.
(2) Should Equation 2 have t in the denominator instead of r?
Indeed. We thank the reviewer for catching this typographical error.
We have corrected it in the revision.
(3) General recommendation:
My view is that in order for the paper's eLife assessment to improve, it would be necessary to resolve points 1 through 4 listed under "weaknesses" in my public review, which pertain to clarity and acknowledgement of prior work. I think a lot hinges on whether the authors can respond to point #3 by making a more compelling case for the usefulness and generality of the 'apportionment cost' concept, since that idea is central to the paper's contribution.
We believe these critical points (1-4) to improve the paper will now have been addressed to the reviewer’s satisfaction.
Reviewer #2 (Public review):
While the details of the paper are compelling, the authors' presentation of their results is often unclear or incomplete:
(1) The mathematical details of the paper are correct but contain numerous notation errors and are presented as a solid block of subtle equation manipulations. This makes the details of the authors' approach (the main contribution of the paper to the field) highly difficult to understand.
We thank the reviewers for having detected typographical errors regarding three equations. They have been corrected. The first typographical error in the original main text (Line 277) regards equation 2, which has been corrected so that it appears as
The second typo regards the definition of the considered pursuit’s reward rate, which appears in the original main text (line 306), and has been corrected to appear as
The third typographical error occurred in conversion from Google Sheets to Microsoft Word appearing in the original main text (line 703) and regards the subjective value expression when no reward is received in an intertrial interval (ITI). It has been corrected to appear as
(2) One of the main contributions of the paper is the notion that time’s cost in decision-making contains an apportionment cost that reflects the allocation of decision time relative to the world. The authors use this cost to pose a hypothesis as to why subjects exhibit sub-optimal behavior in choice decisions. However, the equation for the apportionment cost is never clearly defined in the paper, which is a significant oversight that hampers the effectiveness of the authors' claims.
We thank the reviewer for pressing on this critical point. Reviewers commonly identified a need to provide a concise and intuitive definition of apportionment cost, and to explicitly solve and provide for its mathematical expression.
We added the following succinct verbal description of apportionment cost… “Apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration.” This definition appears in new paragraphs (as below) describing apportionment cost in the results section “Time’s cost: opportunity & apportionment costs determine a pursuit’s subjective value”, and is accompanied by equations for apportionment cost, and a figure giving its geometric depiction (Figure 5). We also expanded original figure 5 and its legend (so as to illustrate the apportionment scaling factor and the apportionment cost), and its accompanying main text, to further illustrate and clarify apportionment cost, and its relationship to opportunity cost, and time’s cost.
“What, then, is the amount of reward by which the opportunity cost-subtracted reward is scaled down to equal the sv of the pursuit? This amount is the apportionment cost of time. The apportionment cost of time (height of the brown vertical bar, Figure 5F) is the global reward rate after taking into account the opportunity cost (slope of the magenta-gold dashed line in Figure 5F) times the time of the considered pursuit. Equally, the difference between the inside and outside reward rates, times the time of the pursuit, is the apportionment cost when scaled by the pursuit’s weight, i.e., the fraction that the considered pursuit is to the total time to traverse the world (Equation 9, right hand side). From the perspective of decision-making policies, apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration (Equation 9 center, Figure 5F).
Equation 9. Apportionment Cost.
While this difference is the apportionment cost of time, the opportunity cost of time is the amount that would be expected from a policy of not taking the considered pursuit over a time equal to the considered pursuit’s duration. Together, they sum to Time’s Cost (Figure 5G). Expressing a pursuit’s worth in terms of the global reward rate obtained under a policy of accepting the pursuit type (Figure 5 left column), or from the perspective of the outside reward and time (Figure 5 right column), are equivalent. However, the latter expresses sv in terms that are independent of one another, conveys the constituents giving rise to global reward rate, and provides the added insight that time’s cost comprises an apportionment as well as an opportunity cost.”
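The decomposition described in these paragraphs can be checked numerically with a short sketch. The expressions below are our reconstruction from the verbal definitions (opportunity cost as the outside reward rate times the pursuit's time; apportionment cost as the difference between the take and not-take policies over the pursuit's duration); the function name, dictionary keys, and example values are hypothetical.

```python
def times_cost(r_in, t_in, r_out, t_out):
    """Decompose time's cost into opportunity and apportionment costs (assumed forms)."""
    rho_out = r_out / t_out                        # outside reward rate
    rho_global = (r_in + r_out) / (t_in + t_out)   # global rate under a policy of taking
    opportunity = rho_out * t_in                   # expected from not taking, over t_in
    apportionment = (rho_global - rho_out) * t_in  # take minus not-take, over t_in
    # Equivalent form: rate difference times pursuit time, scaled by the pursuit's weight.
    weight = t_in / (t_in + t_out)
    apportionment_weighted = (r_in / t_in - rho_out) * t_in * weight
    sv = r_in - opportunity - apportionment        # reward minus time's cost
    return {
        "opportunity": opportunity,
        "apportionment": apportionment,
        "apportionment_weighted": apportionment_weighted,
        "times_cost": opportunity + apportionment,
        "sv": sv,
        "sv_from_global_rate": r_in - rho_global * t_in,
    }


result = times_cost(r_in=4.0, t_in=2.0, r_out=1.0, t_out=6.0)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the two expressions for apportionment cost agree, and the subjective value obtained by subtracting time's cost matches the value computed from the global reward rate.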
(3) Many of the paper's figures are visually busy and not clearly detailed in the captions (for example, Figures 6-8). Because of the geometric nature of the authors' approach, the figures should be as clean and intuitive as possible, as in their current state, they undercut the utility of a geometric argument.
We endeavored to make our figures as simple as possible. In the revision, we have made changes to figures that we believe improve their clarity. These include: 1) breaking some figures into more panels when more than one concept was being introduced (such as in revised Figures 5, 6, 7, and 8), 2) using the left hand y-axis for the outside reward, and the right hand y-axis for the inside reward when plotting the “in” and “outside” reward, and indicating their respective numerics (which run in opposite directions), 3) adding a legend to the figures themselves where needed (revised figures 10, 11, 12, 14), 4) adding the values used to the figure captions, where needed, and 5) ensuring all symbols are indicated in legends.
(4) The authors motivate their work by focusing on previously-observed behavior in decision experiments and tell the reader that their model is able to qualitatively replicate this data. This claim would be significantly strengthened by the inclusion of experimental data to directly compare to their model's behavior. Given the computational focus of the paper, I do not believe the authors need to conduct their own experiments to obtain this data; reproducing previously accepted data from the papers the authors' reference would be sufficient.
Our objective was not to fit experimentally observed data, as is commonly the goal of implementation/computational models. Rather, as a theory, our objective is to rationalize the broad, curious, and well-established pattern of temporal decision-making behaviors under a deeper understanding of reward-rate maximization, and from that understanding, identify the nature of the error being committed by whatever learning algorithm and representational architecture is actually being used by humans and animals. In doing so, we make a number of important contributions. By identifying and analyzing reward-rate-maximizing equations, we 1) provide insight into what composes time’s cost and how the temporal structure of the world in which it is embedded (its ‘context’) impacts the value of a pursuit, 2) rationalize a diverse assortment of temporal decision-making behaviors (e.g., Hyperbolic discounting, the Magnitude Effect, the Sign Effect, and the Delay effect), explaining them with no assumed free-fit parameter, and then, by analyzing error in parameters enabling reward-rate maximization, 3) identify the likely source of error and propose the Malapportionment Hypothesis. The Malapportionment Hypothesis identifies the underweighting of a considered pursuit’s “outside”, and not error in pursuit’s reward rates, as the source of error committed by humans and animals. It explains why animals and humans can present as suboptimally ‘impatient’ in Choice, but as optimal in Forgo. At the same time, it concords with numerous and diverse observations in decision making regarding whether to initiate a pursuit. The nature of this error also, then, makes numerous predictions. These insights inform future computational and experimental work by providing strong constraints on the nature of the algorithm and representational architecture used to learn and represent the values of pursuits. Rigorous test of the Malapportionment Hypothesis will require wholly new experiments.
In the revision, we also now emphasize and add predictions of the Malapportionment Hypothesis, and have updated its figure (Figure 21), its legend, and its paragraphs in the discussion.
“We term this reckoning of the source of error committed by animals and humans the Malapportionment Hypothesis, which identifies the underweighting of the time spent outside versus inside a considered pursuit but not the misestimation of pursuit rates, as the source of error committed by animals and humans (Figure 21). This hypothesis therefore captures previously published behavioral observations (Figure 21A) showing that animals can make decisions to take or forgo reward options that optimize reward accumulation (Krebs et al., 1977; Stephens and Krebs, 1986; Blanchard and Hayden, 2014), but make suboptimal decisions when presented with simultaneous and mutually exclusive choices between rewards of different delays (Logue et al., 1985; Blanchard and Hayden, 2015; Carter and Redish, 2016; Kane et al., 2019). The Malapportionment Hypothesis further predicts that apparent discounting functions will present with greater curvature than what a reward-rate-maximizing agent would exhibit (Figure 21B). While experimentally observed temporal discounting would have greater curvature, the Malapportionment Hypothesis also predicts that the Magnitude (Figure 21C) and Sign effect (Figure 21D) would be less pronounced than what a reward-rate-maximizing agent would exhibit, with these effects becoming less pronounced the greater the underweighting. Finally, with regards to the Delay Effect (Figure 21E), the Malapportionment Hypothesis predicts that preference reversal would occur at delays greater than that exhibited by a reward-rate-maximizing agent, with the delay becoming more pronounced the greater the underweighting outside versus inside the considered pursuit by the agent.”
(5) While the authors reference a good portion of the decision-making literature in their paper, they largely ignore the evidence-accumulation portion of the literature, which has been discussing time-based discounting functions for some years. Several papers that are both experimentally-(Cisek et al. 2009, Thurs et al. 2012, Holmes et al. 2016) and theoretically-(Drugowitsch et al. 2012, Tajima et al. 2019, Barendregt et al. 22) driven exist, and I would encourage the authors to discuss how their results relate to those in different areas of the field.
In this manuscript, we consider the worth of initiating one or another pursuit having completed a prior one, and not the issue of continuing within a pursuit having already engaged in it. The worth of continuing a pursuit, as in patch-foraging/give-up tasks, constitutes a third fundamental time decision-making topology which is outside the scope of the current work. It engages a large and important literature, encompassing evidence accumulation, and requires a paper on the value of continuing a pursuit in temporal decision making, in its own right, that can use the concepts and framework developed here. The excellent works suggested by the reviewer will be most relevant to that future work concerning patch-foraging/give-up topologies.
Reviewer #2 Recommendations:
(1) In Equation 1, the term rho_d is referred to as the reward rate of the default pursuit, when it should be the reward of the default pursuit.
Regarding Equation 1, it is formulated to calculate the average reward received and average time spent per unit time spent in the default pursuit. So, f<sub>i</sub> is the encounter rate of pursuit i for one unit of time spent in the default pursuit (lines 259-262). Added to the summation in the numerator, we have the average reward obtained in the default pursuit per unit time (ρ<sub>d</sub>) and in the denominator we have the time spent in the default pursuit per unit time (1).
We have added clarifying text to assist in meaning of the equation in Ap 1, and thank the reviewer for pointing out this need.
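The bookkeeping described for Equation 1 can be sketched as follows. This assumes the global reward rate is the ratio of reward to time, each accumulated per unit of time spent in the default pursuit; the function name and the example encounter rates, rewards, and times are hypothetical.

```python
def global_reward_rate(pursuits, rho_default):
    """Global reward rate per the verbal description of Equation 1 (assumed form).

    pursuits: iterable of (f_i, r_i, t_i), where f_i is the encounter rate of
    pursuit i per unit time spent in the default pursuit, r_i its reward, and
    t_i its time.
    rho_default: average reward obtained in the default pursuit per unit time.
    """
    reward = rho_default + sum(f * r for f, r, _ in pursuits)  # numerator
    time = 1.0 + sum(f * t for f, _, t in pursuits)            # denominator; 1 = unit default time
    return reward / time


# Two hypothetical pursuits encountered from an unrewarded default pursuit.
example = [(0.5, 4.0, 2.0), (0.25, 8.0, 4.0)]
print(global_reward_rate(example, rho_default=0.0))
```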
(2) The notation for "in" and "out" of a considered pursuit type begins as being used to describe the contribution from a single pursuit (without inter-trial interval) towards global reward rate and the contribution of all other factors (other possible pursuits and inter-trial interval) towards global reward rate, respectively, but is then used to describe the pursuit's contribution and the inter-trial interval's contribution, respectively, to the global reward rate. This should be cleaned up to be consistent throughout, or at the very least, it should be addressed when this special case is considered the default.
As understood by the reviewer, “in” and “out” of the considered pursuit type describes the general form by which a world can be cleaved into these two parts: the average time and reward received outside of the considered pursuit type for the average time and reward received within that pursuit type. A specific, simple, and common experimental instance would be a world composed of one or another pursuit and an intertrial interval.
We now make clear how such a world composed of a considered pursuit and an intertrial interval would be but one special case. In example cases where t<sup>out</sup> represents the special case of an intertrial interval, this is now stated clearly. For instance, we do so when discussing how a purely hyperbolic discounting function would apply in worlds in which no reward is received in t<sup>out</sup>, stating that this is often the case in experimental designs where t<sup>out</sup> represents an intertrial interval with no reward. Importantly, by the new inclusion in the revision of illustrated worlds that have n pursuits that could occur from a default pursuit at 1) equal frequency (Supplemental 1) and 2) differing frequencies (Supplemental 2), we make more clear the generalizability and utility of this t<sup>out</sup>/t<sup>in</sup> concept.
(3) Figure 5 should make clear the decomposition of time's cost both graphically and functionally. As it stands, the figure does not define the apportionment cost.
In the revision of original Figure 5, we now further decompose the figure to effectively convey 1) what the opportunity cost is, 2) (especially) what the apportionment cost is, both graphically and mathematically, 3) how time’s cost is composed of them, 4) how the apportionment scaling term scales the opportunity-cost-subtracted reward by time’s allocation to equal the subjective value, and 5) the equivalence between the expression of time’s cost using terms that are not independent of one another and the expression of time’s cost using terms that are independent of one another.
(4) Figures 6-8 do not clearly define the dots and annuli used in panels B and C.
We have further decomposed figures 6-8 so that the functional form of opportunity, apportionment, and time’s cost can be more clearly appreciated, and what their interrelationship is with respect to changing outside reward and outside time, and clearly identify symbols used in the corresponding legends.
(5) The meaning of a negative subjective value should be specifically stated. Is it the amount a subject would pay to avoid taking the considered pursuit?
As the reviewer intuits, negative subjective value can be considered the amount an agent ought be willing to pay to avoid taking the considered pursuit.
We now include the following lines in “The forgo decision can also be made from subjective value” section in reference to negative subjective value…
“A negative subjective value thus indicates that a policy of taking the considered pursuit would result in a global reward rate that is less than a policy of forgoing the considered pursuit. Equivalently, a negative subjective value can be considered the amount an agent ought be willing to pay to avoid having to take the considered pursuit.”
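The equivalence stated here can be illustrated with a minimal sketch, assuming subjective value is the pursuit's reward minus the global reward rate (under a policy of taking) times the pursuit's time. The function name and the numerical values are hypothetical.

```python
def forgo_comparison(r_in, t_in, r_out, t_out):
    """Return (sv, rate if taking, rate if forgoing) under the assumed sv form."""
    rho_forgo = r_out / t_out                    # global rate under a policy of forgoing
    rho_take = (r_in + r_out) / (t_in + t_out)   # global rate under a policy of taking
    sv = r_in - rho_take * t_in
    return sv, rho_take, rho_forgo


# A poor pursuit: taking it lowers the global reward rate, and sv is negative.
print(forgo_comparison(r_in=1.0, t_in=4.0, r_out=3.0, t_out=6.0))

# A good pursuit: taking it raises the global reward rate, and sv is positive.
print(forgo_comparison(r_in=4.0, t_in=2.0, r_out=3.0, t_out=6.0))
```

In both cases the sign of the subjective value agrees with the comparison between the two policies' reward rates.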
(6) Why do you define the discounting function as the normalized subjective value? This choice should be justified, via literature citations or a well-described logical argument.
The reward-magnitude-normalized subjective value-time function is commonly referred to as the temporal discounting function, as it permits comparison of the discount rate isolated from a difference in reward magnitude and/or sign, and this usage is deeply rooted in historical precedent. As the reviewer points out, the term is overloaded, however, as investigations in which comparisons between the forms of subjective value-time functions are not needed tend to refer to these functions as temporal discounting functions as well.
We make clear in the revised text in the introduction our meaning and use of the term, the justification in doing so, and its historical roots.
“Historically, temporal decision-making has been examined using a temporal discounting function to describe how delays in rewards influence their valuation. Temporal discounting functions describe the subjective value of an offered reward as a function of when the offered reward is realized. To isolate the form of discount rate from any difference in reward magnitude and sign, subjective value is commonly normalized by the reward magnitude when comparing subjective value-time functions (Strotz, 1956, Jimura, 2009). Therefore, we use the convention that temporal discounting functions are the magnitude-normalized subjective value-time function (Strotz, 1956).”
Special addition. In investigating the historical roots of the discounting function prompted by the reviewer, we learned (Grüne-Yanoff 2015) that it was Mazur that simply added the “1+k” in the denominator of the hyperbolic discounting function. Our derivation for the reward-rate optimal agent makes clear why apparent temporal discounting functions ought have this general form.
Therefore, we add the following to the “Hyperbolic Temporal Discounting Function” section in the discussion…
“It was Ainslie (Ainslie, 1975) who first understood that the empirically observed “preference reversals” between SS and LL pursuits could be explained if temporal discounting took on a hyperbolic form, which he initially conjectured to arise simply from the ratio of reward to delay (Grüne-Yanoff 2015). This was problematic, however, on two fronts: 1) as the time nears zero, the value curve goes to infinity, and 2) there is no accommodation of differences observed within and between subjects regarding the steepness of discounting. Mazur (Mazur, 1987) addressed these issues by introducing 1 + k into the denominator, providing for the now standard hyperbolic discounting function, sv = r/(1 + kt). Introduction of “1” solved the first issue, though “it never became fully clear how to interpret this 1” (Grüne-Yanoff 2015; interviewing Ainslie). Introduction of the free-fit parameter, k, accommodated the variability observed across and within subjects by controlling the curvature of temporal discounting, and has become widely interpreted as a psychological trait, such as patience, or willingness to delay gratification (Frederick et al., 2002).”
…continuing later in that section to explain why the reward-rate optimal agent would exhibit this general form…
“Regarding form, our analysis reveals that the apparent discounting function of a reward-rate-maximizing agent is a hyperbolic function…
…which resembles the standard hyperbolic discounting function, sv = r/(1 + kt), in the denominator, where k = 1/t<sub>out</sub>. Whereas Mazur introduced 1 + k to t in the denominator to 1) force the function to behave as t approaches zero, and 2) provide a means to accommodate differences observed within and between subjects, our derivation gives cause to the terms 1 and k, their relationship to one another, and to t in the denominator. First, from our derivation, “1” actually signifies taking t<sub>out</sub> amount of time expressed in units of t<sub>out</sub> (t<sub>out</sub>/t<sub>out</sub>=1) and adding it to t<sub>in</sub> amount of time expressed in units of t<sub>out</sub> (i.e., the total time to make a full pass through the world expressed in terms of how the agent apportions its time under a policy of accepting the considered pursuit).”
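The reading of “1” and k given in this passage can be traced in a short worked derivation. It is a sketch that assumes subjective value takes the form of the opportunity-cost-subtracted reward scaled by the fraction of time spent outside the pursuit; the symbols r<sub>in</sub> and ρ<sub>out</sub> (the pursuit's reward and the outside reward rate) follow our reading of the surrounding text rather than a quoted equation.

```latex
\begin{align}
sv &= \left(r_{in} - \rho_{out}\, t_{in}\right)\frac{t_{out}}{t_{out} + t_{in}} \\[4pt]
\frac{sv}{r_{in}} &= \frac{1 - \dfrac{\rho_{out}}{r_{in}}\, t_{in}}{1 + \dfrac{1}{t_{out}}\, t_{in}} \\[4pt]
\left.\frac{sv}{r_{in}}\right|_{\rho_{out} = 0} &= \frac{1}{1 + k\, t_{in}}, \qquad k = \frac{1}{t_{out}}
\end{align}
```

Dividing numerator and denominator of the scaling factor by t<sub>out</sub> is what produces the “1” (t<sub>out</sub>/t<sub>out</sub>) and the term t<sub>in</sub>/t<sub>out</sub>, so that under these assumptions k is the reciprocal of the outside time rather than a free-fit parameter.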
Additional Correction. In revising the section, “Hyperbolic Temporal Discounting Functions” in the discussion, we also detected an error in our description of the meaning of suboptimal bias for SS. In the revision, the sentence now reads…
“More precisely, what is meant by this suboptimal bias for SS is that the switch in preference from LL to SS occurs at an outside reward rate that is lower—and/or an outside time that is greater —than what an optimal agent would exhibit.”
(7) Figure 15B should have negative axes defined for the pursuit's now negative reward.
Yes, excellent point.
To remove ambiguity regarding the valence of inside and outside reward magnitudes, we have changed all such figures so that the left hand y-axis is used to signify the outside reward magnitude and sign, and so that the right hand y-axis is used to signify the inside reward magnitude and sign.
With respect to the revision of original 15B, this change now makes clear that the inside reward label and numerics on the right hand side of the graph run from positive (top) to negative (bottom) values so that it can now be understood that the magnitude of the inside reward is negative in this figure (ie, a punishment). The left hand y-axis labeling the outside reward magnitude has numerics that run in the opposite direction, from negative (top) to positive (bottom). In this figure, the outside reward rate is positive whereas the inside reward rate is negative.
(8) When comparing your discounting function to the TIMERR and Heuristic models, it would be useful to include a schematic plot illustrating the different obtainable behaviors from all models rather than just telling the reader the differences.
We hold that the descriptions and references are sufficient to address these comparisons.
(9) I would strongly suggest cleaning up all appendices for notation…
The typographical errors that have been noted in these reviews have all been corrected. We believe the reviewer is referring here to the manner in which we had cross-referenced equations in the appendices and main text, which could lead to confusion as to whether a referenced equation number refers to its occurrence in the main text or in the appendices.
In the revision, we eliminate numbering of equations in the appendices except where an equation occurs in an appendix that is referenced within the main text. In the main text, important equations are numbered sequentially and note the appendix from which they derive. If an equation in an appendix is referenced in the main text, it is noted within the appendix from which it derives.
…and replacing some of the small equation manipulations with written text describing the goal of each derivation.
To increase clarity, we have taken the reviewer’s helpful suggestion, adding helper text in the appendices where needed, and have bolded the equations of importance within the appendices (rather than removing the equation manipulations that make clear the steps of derivation).
(10) I would suggest moving the table in Appendix 11 to the main text where misestimation is referenced.
So moved. This appendix now appears in the main text as Table 1, “Definitions of misestimating global reward rate-enabling parameters”.
Reviewer #3 (Public review):
One broad issue with the paper is readability. Admittedly, this is a complicated analysis involving many equations that are important to grasp to follow the analyses that subsequently build on top of previous analyses.
But, what's missing is intuitive interpretations behind some of the terms introduced, especially the apportionment cost without referencing the equations in the definition so the reader gets a sense of how the decision-maker thinks of this time cost in contrast with the opportunity cost of time.
We thank the reviewer for encouraging us to formulate a succinct and intuitive verbal description of the nature of apportionment cost.
We added the following succinct verbal description of apportionment cost… “Apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration.” This definition appears in a new paragraph (as below) describing apportionment cost in the results section “Time’s cost: opportunity & apportionment costs determine a pursuit’s subjective value”, and is accompanied by equations for apportionment cost, and a figure giving its geometric depiction (Figure 5). We also expanded original figure 5 and its legend (so as to illustrate the apportionment scaling factor and the apportionment cost), and its accompanying main text, to further illustrate and clarify apportionment cost, and its relationship to opportunity cost, and time’s cost.
“What, then, is the amount of reward by which the opportunity cost-subtracted reward is scaled down to equal the sv of the pursuit? This amount is the apportionment cost of time. The apportionment cost of time (height of the brown vertical bar, Figure 5F) is the global reward rate after taking into account the opportunity cost (slope of the magenta-gold dashed line in Figure 5F) times the time of the considered pursuit. Equally, the difference between the inside and outside reward rates, times the time of the pursuit, is the apportionment cost when scaled by the pursuit’s weight, i.e., the fraction that the considered pursuit is to the total time to traverse the world (Equation 9, right hand side). From the perspective of decision-making policies, apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration (Equation 9 center, Figure 5F).
Equation 9. Apportionment Cost.
While this difference is the apportionment cost of time, the opportunity cost of time is the amount that would be expected from a policy of not taking the considered pursuit over a time equal to the considered pursuit’s duration. Together, they sum to Time’s Cost (Figure 5G). Expressing a pursuit’s worth in terms of the global reward rate obtained under a policy of accepting the pursuit type (Figure 5 left column), or from the perspective of the outside reward and time (Figure 5 right column), are equivalent. However, the latter expresses sv in terms that are independent of one another, conveys the constituents giving rise to global reward rate, and provides the added insight that time’s cost comprises an apportionment as well as an opportunity cost.”
The above definition of apportionment cost adds to other stated relationships of apportionment cost found throughout the paper (original lines 434,435,447,450).
Re-analysis of some existing empirical data through the lens of their presented objective functions, especially later when they describe sources of error in behavior.
Our objective was not to fit experimentally observed data, as is commonly the goal of implementation/computational models. Rather, as a theory, our objective is to rationalize the broad, curious, and well-established pattern of temporal decision-making behaviors under a deeper understanding of reward-rate maximization, and from that understanding, identify the nature of the error being committed by whatever learning algorithm and representational architecture is actually being used by humans and animals. In doing so, we make a number of important contributions. By identifying and analyzing reward-rate-maximizing equations, we 1) provide insight into what composes time’s cost and how the temporal structure of the world in which it is embedded (its ‘context’) impacts the value of a pursuit, 2) rationalize a diverse assortment of temporal decision-making behaviors (e.g., Hyperbolic discounting, the Magnitude Effect, the Sign Effect, and the Delay effect), explaining them with no assumed free-fit parameter, and then, by analyzing error in parameters enabling reward-rate maximization, 3) identify the likely source of error and propose the Malapportionment Hypothesis. The Malapportionment Hypothesis identifies the underweighting of a considered pursuit’s “outside”, and not error in pursuit’s reward rates, as the source of error committed by humans and animals. It explains why animals and humans can present as suboptimally ‘impatient’ in Choice, but as optimal in Forgo. At the same time, it concords with numerous and diverse observations in decision making regarding whether to initiate a pursuit. The nature of this error also, then, makes numerous predictions. These insights inform future computational and experimental work by providing strong constraints on the nature of the algorithm and representational architecture used to learn and represent the values of pursuits. Rigorous test of the Malapportionment Hypothesis will require wholly new experiments.
In the revision, we also now emphasize and add predictions of the Malapportionment Hypothesis, augmenting its figure (Figure 21), its legend, and its paragraphs in the discussion.
“We term this reckoning of the source of error committed by animals and humans the Malapportionment Hypothesis, which identifies the underweighting of the time spent outside versus inside a considered pursuit but not the misestimation of pursuit rates, as the source of error committed by animals and humans (Figure 21). This hypothesis therefore captures previously published behavioral observations (Figure 21A) showing that animals can make decisions to take or forgo reward options that optimize reward accumulation (Krebs et al., 1977; Stephens and Krebs, 1986; Blanchard and Hayden, 2014), but make suboptimal decisions when presented with simultaneous and mutually exclusive choices between rewards of different delays (Logue et al., 1985; Blanchard and Hayden, 2015; Carter and Redish, 2016; Kane et al., 2019). The Malapportionment Hypothesis further predicts that apparent discounting functions will present with greater curvature than what a reward-rate-maximizing agent would exhibit (Figure 21B). While experimentally observed temporal discounting would have greater curvature, the Malapportionment Hypothesis also predicts that the Magnitude (Figure 21C) and Sign effect (Figure 21D) would be less pronounced than what a reward-rate-maximizing agent would exhibit, with these effects becoming less pronounced the greater the underweighting. Finally, with regards to the Delay Effect (Figure 21E), the Malapportionment Hypothesis predicts that preference reversal would occur at delays greater than that exhibited by a reward-rate-maximizing agent, with the delay becoming more pronounced the greater the underweighting outside versus inside the considered pursuit by the agent.”
Reviewer #3 Recommendations:
As mentioned above, the readability of this paper should be improved so that the readers can follow the derivations and your analyses better. To this end, careful numbering of equations, following consistent equation numbering formats, and differentiating between appendix referencing and equation numbering would have gone a long way in improving the readability of this paper. Some specific questions are noted below.
To increase clarity, in the revision we eliminated numbering of equations in the appendices except where an equation occurs in an appendix that is referenced within the main text. In the main text, important equations are thus numbered sequentially as they appear and note the appendix from which they derive. If an equation in an appendix is referenced in the main text, it is noted within the appendix from which it derives.
(1) In general, it is unclear what the default pursuit is. From the schematic on the left (forgo decision), it appears to be the time spent in between reward-giving pursuits. However, this schematic also allows for smaller rewards to be attained during the default pursuit as do subsequent equations that reference a default reward rate. Here is where an example would have really benefited the authors in getting their point across as to what the default pursuit is in practice in the forgo decisions and how the default reward rate could be modulated.
(1) The description of the default pursuit has been modified in section “Forgo and Choice decision topologies” to now read… “After either the conclusion of the pursuit, if accepted, or immediately after rejection, the agent returns to a pursuit by default (the “default” pursuit). This default pursuit effectively can be a waiting period over which reward could be received, and reoccurs until the next pursuit opportunity becomes available.” (2) Additionally, helper text has been added to Ap1 regarding the meaning of time and reward spent in the default pursuit. Finally, (3) new figures concerning n-pursuits occurring at the same (Supplement 1) or different (Supplement 2) frequencies from a default pursuit are now added, providing examples as suggested by the reviewer.
(2) I want to clarify my understanding of the topologies in Figure 1. In the forgo, do they roam in the "gold" pursuit indefinitely before they are faced with the purple pursuit? In general, comparing the 2 topologies, it seems like in the forgo decision, they can roam indefinitely in the gold topology or choose the purple but must return to the gold.
The reviewer’s understanding of the topology is correct. The agent loops across one unit time in the default gold pursuit indefinitely, though the purple pursuit (or any pursuit that might exist in that world) occurs on exit from gold at its frequency per unit time. The default gold pursuit will then itself have an average duration in units of time spent in gold. As the reviewer states, the agent can re-enter into gold from having exited gold, and can enter gold from having exited purple, but cannot re-enter purple from having exited purple; rather, it must enter into the default pursuit.
…Another point here is that this topology is highly simplified (only one considered pursuit). So it may be helpful to either add a schematic for the full topology with multiple pursuits or alternatively, provide the corresponding equations (at least in appendix 1 and 2) for the simplified topology so you can drive home the intuition behind derived expressions in these equations.
We understand the reviewer to be noting that, while the illustrated example is of the simple topology, the mathematical formulation handles the case of n pursuits, and that illustrating a world in which there are a greater number of pursuits, corresponding to original appendices 1&2, would assist readers in understanding the generality of these equations.
An excellent suggestion. We have now added to the manuscript n-pursuit world illustrations in which each pursuit occurs at the same frequency (Supplemental Figure 1) or at different frequencies (Supplemental Figure 2), and have added text, in the main text and in the appendices, to assist in understanding the form of the equation and its relationship to unit time in the default pursuit.
(3) In Equation and Appendix 1, there are a few things that are unclear. Particularly, why is the expected time of the default option E(t_default )= 1/(∑_(i=1)^n f_i )? Similarly, why is the E(r_default )= ρ_d/(∑_(i=1)^n f_i )? Looking at the expression for E(r_default ), it implies that across all pursuits 1 through n, the default option is encountered only once. Ultimately, in Equation 1.4, (and Equation 1), the units of the two terms in the numerator don't seem to match. One is a reward rate (ρ_d) and the other is a reward value. This is the most important equation of the paper since the next several equations build upon this. Therefore, the lack of clarity here makes the reader less likely to follow along with the analysis in rigorous detail. Better explanations of the terms and better formatting will help alleviate some of these issues.
The equation is formulated to calculate the average reward received and average time spent per unit time spent in the default pursuit. So, f<sub>i</sub> is the encounter rate of pursuit i for one unit of time spent in the default pursuit. Added to the summation in the numerator we have the average reward obtained in the default pursuit per unit time (ρ<sub>d</sub>) and in the denominator we have the time spent in the default pursuit per unit time (1).
Text explaining the above equation has been added to Ap 1.
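As a minimal sketch of the calculation described above (an illustration, not the appendix derivation itself), the global reward rate can be computed per unit time spent in the default pursuit: the summed reward from encountered pursuits plus the default reward rate, divided by the summed time in encountered pursuits plus the one unit of default time. The function name, argument names, and numeric values are hypothetical.

```python
def global_reward_rate(f, r, t, rho_d):
    """Reward per unit time in a world of n pursuits plus a default pursuit.

    f[i]  : encounter rate of pursuit i per unit time spent in the default pursuit
    r[i]  : reward obtained from pursuit i
    t[i]  : time spent in pursuit i
    rho_d : reward obtained per unit time spent in the default pursuit
    """
    # Numerator: reward accrued per unit of default time (pursuits + default).
    reward = rho_d + sum(fi * ri for fi, ri in zip(f, r))
    # Denominator: time elapsed per unit of default time (pursuits + the unit itself).
    time = 1.0 + sum(fi * ti for fi, ti in zip(f, t))
    return reward / time


if __name__ == "__main__":
    # Two pursuits with arbitrary illustrative values.
    print(global_reward_rate(f=[0.5, 0.25], r=[4.0, 8.0], t=[2.0, 4.0], rho_d=0.5))
```

Written this way, both terms in the numerator are rewards per unit of default time, which is one way to see why ρ<sub>d</sub> can be added to the summation.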
(4) In equation and appendix 2, I'm trying to relate the expressions for t_out and r_out to the definitions "average time spent outside the considered pursuit". If I understand the expression in Equation 2.4 on the right-hand side, the numerator is the total time spent in all of the pursuits in the environment and the denominator refers to the number of times the considered pursuit is encountered. It is unclear as to why this is the average time spent outside the considered pursuit. In my mind, the expression for average time spent outside the considered pursuit would look something like t_out=1+ ∑_(i≠in)〖p_i t_i 〗= 1+ ∑_(i≠in)〖f_i/(∑_(j=1)^n f_j ) * t_i 〗. It is unclear how these expressions are then equivalent.
Regarding the following equation,
t<sub>out</sub> = (1 + ∑<sub>i≠in</sub> f<sub>i</sub> t<sub>i</sub>) / f<sub>in</sub>,
f<sub>i</sub> is the probability that pursuit i will be encountered during a single unit of time spent in the default pursuit. The numerator of the expression is the average amount of time spent across all pursuits, excepting the considered pursuit, per unit time spent in the default pursuit. Note that the + 1 in the numerator is accounting for the unit of time spent in the default pursuit and is added outside of the sum. Since f<sub>in</sub> is the probability that the considered pursuit will be encountered per unit of time spent in the default pursuit, 1/f<sub>in</sub>
is the average amount of time spent in the default pursuit between encounters of the considered pursuit. By multiplying the average time spent across all outside pursuits per unit of time in the default pursuit by the average amount of time spent in the default pursuit between encounters of the considered pursuit, we get the average amount of time spent outside the considered pursuit per encounter of the considered pursuit. This is calculated as if the pursuit encounters are mutually exclusive within a single unit of time spent within the default pursuit, as this is the case as the length of our unit time (delta t) approaches zero.
The above text explaining the equation has been added to Ap 2.
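The explanation above can be sketched numerically. This is an illustration under stated assumptions: the outside time follows the description given here (time across all other pursuits plus one unit of default time, divided by f_in), and the outside reward is assumed to follow the analogous form with ρ_d in place of the 1. All names and values are hypothetical. The check at the end shows that partitioning the world into "inside" and "outside" in this way leaves the global reward rate unchanged.

```python
def outside_time_and_reward(f, r, t, rho_d, k):
    """Average time and reward outside pursuit k, per encounter of pursuit k."""
    others = [i for i in range(len(f)) if i != k]
    # Per unit of default time: time/reward in all other pursuits plus the default itself.
    time_per_unit = 1.0 + sum(f[i] * t[i] for i in others)
    reward_per_unit = rho_d + sum(f[i] * r[i] for i in others)  # assumed analogous form
    # 1/f[k] units of default time elapse, on average, between encounters of pursuit k.
    return time_per_unit / f[k], reward_per_unit / f[k]


def global_reward_rate(f, r, t, rho_d):
    """Global reward rate computed directly over all pursuits plus the default."""
    reward = rho_d + sum(fi * ri for fi, ri in zip(f, r))
    time = 1.0 + sum(fi * ti for fi, ti in zip(f, t))
    return reward / time


if __name__ == "__main__":
    f, r, t, rho_d = [0.5, 0.25], [4.0, 8.0], [2.0, 4.0], 0.5  # arbitrary values
    t_out, r_out = outside_time_and_reward(f, r, t, rho_d, k=0)
    via_partition = (r[0] + r_out) / (t[0] + t_out)
    print(t_out, r_out, via_partition, global_reward_rate(f, r, t, rho_d))
```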
(5) In Figure 3, one huge advantage of this separation into in-pursuit and out-of-pursuit patches is that the optimal reward rate maximizing rule becomes one that compares ρ_in and ρ_out. This contrasts with an optimal foraging rule which requires comparing to the global reward rate and therefore a circularity in solution. In practice, however, it is unclear how ρ_out will be estimated by the agent.
How, in practice, a human or animal estimates the reward rates―be they the outside and/or global reward rate under a policy of accepting a pursuit―is the crux of the matter. This work identifies equations that would enable a reward-rate maximizing agent to calculate and execute optimal policies and emphasizes that the effective reward rates and weights of pursuits must be accurately appreciated for global reward rate optimization. In so doing, it makes a reckoning of behaviors commonly but erroneously treated as suboptimal. Then, by examining the consequences of misestimation of these enabling parameters, it identifies mis-weighting pursuits as the nature of the error committed by whatever algorithm and representational architecture is being used by humans and animals (the Malapportionment Hypothesis). This curious pattern identified and analyzed in this work thus provides a clue into the nature of the learning algorithm and means of representing the temporal structure of the environment that is used by humans and animals―the subject of future work.
We note, however, that we do discuss existing models that grapple with how, in practice, a human or animal may estimate the outside reward rate. Of particular importance is the TIMERR model, which estimates the outside reward rate from its past experience, and can make an accounting of many qualitative features widely observed. However, while appealing, it would mix prior ‘in’ and ‘outside’ experiences within that estimate, and so would fail to perform forgo tasks optimally. Something is still amiss, as this work demonstrates.
(6) The apportionment time cost needs to be explained a little bit more intuitively. For instance, it is clear that the opportunity cost of time is the cost of not spending time in the rest of the environment relative to the current pursuit. But given the definition of apportionment cost here in lines 447- 448 "The apportionment cost relates to time's allocation in the world: the time spent within a pursuit type relative to the time spent outside that pursuit type, appearing in the denominator." The reference to the equation (setting aside the confusion regarding which equation) within the definition makes it a bit harder to form an intuitive interpretation of this cost. Please reference the equation being referred to in lines 447-448, and again, an example may help the authors communicate their point much better
We thank the reviewer for pressing on this critical point.
Action: We added the following succinct verbal description of apportionment cost… “Apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration.” This definition appears in a new paragraph (as below) describing apportionment cost in the results section “Time’s cost: opportunity & apportionment costs determine a pursuit’s subjective value”, and is accompanied by equations for apportionment cost, and a figure giving its geometric depiction (Figure 5).
“What, then, is the amount of reward by which the opportunity cost-subtracted reward is scaled down to equal the sv of the pursuit? This amount is the apportionment cost of time. The apportionment cost of time (height of the brown vertical bar, Figure 5F) is the global reward rate after taking into account the opportunity cost (slope of the magenta-gold dashed line in Figure 5F) times the time of the considered pursuit. Equally, the difference between the inside and outside reward rates, times the time of the pursuit, is the apportionment cost when scaled by the pursuit’s weight, i.e., the fraction that the considered pursuit is to the total time to traverse the world (Equation 9, right hand side). From the perspective of decision-making policies, apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration (Equation 9 center, Figure 5F).
Equation 9. Apportionment Cost.
While this difference is the apportionment cost of time, the opportunity cost of time is the amount that would be expected from a policy of not taking the considered pursuit over a time equal to the considered pursuit’s duration. Together, they sum to Time’s Cost (Figure 5G). Expressing a pursuit’s worth in terms of the global reward rate obtained under a policy of accepting the pursuit type (Figure 5 left column), or from the perspective of the outside reward and time (Figure 5 right column), are equivalent. However, the latter expresses sv in terms that are independent of one another, conveys the constituents giving rise to global reward rate, and provides the added insight that time’s cost comprises an apportionment as well as an opportunity cost.”
(7) The analyses in Figures 6 and 7 give a nice visual representation of how the time costs are distributed as a function of outside reward and time spent. However, without an expression for apportionment cost it is hard to intuitively understand these visualizations. This also relates to the previous point of requiring a more intuitive explanation of apportionment costs in relation to the opportunity cost of time. Based on my quick math, it seems that an expression for apportionment cost would be as follows: (r_in- ρ_out*t_in)*(t_in⁄t_out )/(t_in⁄t_out +1 ). The condition described in Figure 7 seems like the perfect place to compute the value of just apportionment cost when the opportunity cost is zero. It would be helpful to introduce the equation here.
We designed original figure 7, as the reviewer appreciates, to emphasize that time has a cost even when there is no opportunity cost, being due entirely to the apportionment cost of time.
We now provide the mathematical expression of apportionment cost and apportionment scaling in Figure 5, the point in the main text of its first occurrence.
…and have expanded original figure 5, its legend (so as to illustrate the apportionment scaling factor and the apportionment cost), and its accompanying main text, to further illustrate and clarify apportionment cost, and its relationship to opportunity cost, and time’s cost.
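As a hedged numerical sketch of these relationships (not the manuscript's Equation 9, which is not reproduced in this response), the code below takes the reviewer's expression for apportionment cost from point (7) and checks that it agrees with the verbal descriptions quoted above: the rate difference times pursuit time scaled by the pursuit's weight, and the difference between the take and not-take policies over the pursuit's duration. The function name, dictionary keys, and numeric values are hypothetical.

```python
def time_costs(r_in, t_in, r_out, t_out):
    """Decompose a pursuit's reward into opportunity cost, apportionment cost, and sv."""
    rho_out = r_out / t_out                      # outside reward rate
    opportunity = rho_out * t_in                 # reward forgone outside over t_in
    weight = t_in / (t_in + t_out)               # pursuit's share of a pass through the world
    apportionment = (r_in - opportunity) * weight  # reviewer's expression, rearranged
    sv = r_in - opportunity - apportionment      # reward minus time's cost
    return {"opportunity": opportunity, "apportionment": apportionment, "sv": sv}


if __name__ == "__main__":
    r_in, t_in, r_out, t_out = 4.0, 2.0, 5.0, 4.0  # arbitrary illustrative values
    costs = time_costs(r_in, t_in, r_out, t_out)

    rho_in, rho_out = r_in / t_in, r_out / t_out
    rho_g = (r_in + r_out) / (t_in + t_out)      # global rate under a policy of accepting
    by_rates = (rho_in - rho_out) * t_in * t_in / (t_in + t_out)
    by_policy = (rho_g - rho_out) * t_in

    print(costs, by_rates, by_policy)
    # With no outside reward, opportunity cost is zero and time's cost is all apportionment.
    print(time_costs(r_in, t_in, 0.0, t_out))
```

Under these assumptions the subjective value returned equals (r_in - ρ_out·t_in)/(1 + t_in/t_out), so the decomposition and the hyperbolic form are two views of the same quantity.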
(8) The analysis regarding choice decisions is relatively straightforward, pending the concerns for the main equations listed above for the forgo decisions. Legends certainly would have helped me grasp Figures 10-12 better.
We believe the reviewer is referring to missing labels for the Sooner Smaller pursuit, and the Larger Later Pursuit in these figures? We used the same conventions as in Figure 9, but we see now that adding these labels to these figures would be helpful, and add them in the revision.
We have now added, to the figures themselves, legends indicating the Sooner Smaller Pursuit and the Larger Later Pursuit. We have also added to the main text to emphasize the points made in these figures regarding the impact of opportunity cost and apportionment cost.
(9) The derivation of the temporal discounting function from subjective reward rate is much appreciated as it provides further evidence for potential equivalence between reward rate optimization and hyperbolic discounting, which is known to explain a slew of decision-making behaviors in the economics literature.
We thank the reviewer and greatly appreciate this recognition.
In response to the reviewer’s comment, we have added text that further relates reward rate optimization to hyperbolic discounting…
(1) We add discussion of how our normative derivation gives explanation to Mazur’s ad hoc addition of 1 + k to Ainslie’s reward/time hyperbolic discounting conception. See new first paragraph under “Hyperbolic Temporal Discounting Functions” for the historical origins of the standard hyperbolic equation (which are decidedly not normatively derived). And then see our discussion (new second paragraph in sections “The apparent discounting function of global….”) of how our normative derivation gives explanation to “1”, “k”, and their relationship to each other.
(2) We add explicit treatment of the Delay Effect in a new “The Delay Effect” section of the results along with a figure, and in its corresponding Discussion section.
Minor comments:
(1) Typo in equation 2, should be t_i in the denominator within the summation, not r_i .
We thank the reviewer for catching this typo, and have corrected it in the revision.
(2) Before equation 6, typo when defining ρ_in= r_in/(t_in.). Should be t_in in the denominator, not r_out.
We thank the reviewer for catching this typo, and have corrected it in the revision.
(3) Please be consistent with equation numbers, placement of equation references, and the reason for placing appendix numbers. This will improve readability immensely.
To increase clarity, in the revision we eliminated numbering of equations in the appendices except where an equation occurs in an appendix that is referenced within the main text. In the main text, important equations are thus numbered sequentially and note the appendix from which they derive. If an equation in an appendix is referenced in the main text, it is noted within the appendix from which it derives.
(4) Line 505 - "dominants" should be dominates.
Typo fixed as indicated.
(5) Figures 10-12: add legends to the figures.
Now so included.
(6) Lines 701-703: please rewrite the equation separately. It is highly unclear what rt is here.
We thank the reviewer for bringing attention to this error. The error arose in converting from Google Sheets to Microsoft Word.
The equation has now been corrected.
Additional citations noted in reply and appearing in Main text
Ainslie, George. 1975. “Specious Reward: A Behavioral Theory of Impulsiveness and Impulse Control.” Psychological Bulletin 59: 257–72.
Frederick, Shane, George Loewenstein, and Ted O’Donoghue. 2002. “Time Discounting and Time Preference: A Critical Review.” Journal of Economic Literature 40: 351–401.
Gibbon, John. 1977. “Scalar Expectancy Theory and Weber’s Law in Animal Timing.” Psychological Review 84: 279–325.
Green, Leonard, Nathanael Fristoe, and Joel Myerson. 1994. “Temporal Discounting and Preference Reversals in Choice between Delayed Outcomes.” Psychonomic Bulletin & Review 1: 383–89.
Grüne-Yanoff, Till. 2015. “Models of Temporal Discounting 1937-2000: An Interdisciplinary Exchange between Economics and Psychology.” Science in Context 28 (4): 675–713.
Jimura, Koji, Joel Myerson, Joseph Hilgard, Todd S. Braver, and Leonard Green. 2009. “Are People Really More Patient than Other Animals? Evidence from Human Discounting of Real Liquid Rewards.” Psychonomic Bulletin & Review 16: 1071–75.
Kalenscher, Tobias, and Cyriel M. A. Pennartz. 2008. “Is a Bird in the Hand Worth Two in the Future? The Neuroeconomics of Intertemporal Decision-Making.” Progress in Neurobiology 84 (3): 284–315.
Kirby, Kris N., and R. J. Herrnstein. 1995. “Preference Reversals Due to Myopic Discounting of Delayed Reward.” Psychological Science 6 (2): 83–89.
Mazur, James E. 1987. “An Adjusting Procedure for Studying Delayed Reinforcement.” In The Effect of Delay and of Intervening Events on Reinforcement Value., 55–73. Quantitative Analyses of Behavior, Vol. 5. Hillsdale, NJ, US: Lawrence Erlbaum Associates, Inc.
McNamara, John. 1982. “Optimal Patch Use in a Stochastic Environment.” Theoretical Population Biology 21 (2): 269–88.
Rosati, Alexandra G., Jeffrey R. Stevens, Brian Hare, and Marc D. Hauser. 2007. “The Evolutionary Origins of Human Patience: Temporal Preferences in Chimpanzees, Bonobos, and Human Adults.” Current Biology: CB 17: 1663–68.
Strotz, R. H. 1956. “Myopia and Inconsistency in Dynamic Utility Maximization.” The Review of Economic Studies 23: 165–80.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors investigate the role of HSPA2 during mouse preimplantation development. Knocking down HSPA2 in zygotes, the authors describe lower chances of developing into blastocysts, which show a reduced number of inner cell mass cells. They find that HSPA2 mRNA and protein levels show some heterogeneity among blastomeres at the 4-cell stage and propose that HSPA2 could contribute to skewing their relative contribution to embryonic lineages. To test this, the authors try to reduce HSPA2 expression in one of the 2-cell stage blastomeres and propose that it biases their contribution towards extra-embryonic lineages. To explain this, the authors propose that HSPA2 would interact with CARM1, which controls chromatin accessibility around genes regulating differentiation into embryonic lineage.
Strengths:
(1) The study offers simple and straightforward experiments with large sample sizes.
Thanks for your kind recognition.
(2) Unlike most studies in the field, this research often relies on both mRNA and protein levels to analyse gene expression and differentiation.
Thanks for your kind recognition.
Weaknesses:
(1) Image and statistical analyses are not well described.
Thanks for your helpful comment. We have revised the description of the image and statistical analyses in our revised version (lines 255-257).
(2) The functionality of the overexpression construct is not validated.
Thanks for your kind suggestion. We have validated the functionality of the overexpression construct in our revised version (Figure S3).
(3) Tracking of KD cells in embryos injected at the 2-cell stage with GFP is unclear.
Thanks for your kind suggestion. We randomly co-injected green fluorescent protein (Gfp) mRNA as a lineage tracer with either Hspa2-siRNA or NC-FAM into one blastomere at the 2-cell stage, and then monitored embryo development to the blastocyst stage (lines 342-344).
(4) A key rationale of the study relies on measuring small differences in the levels of mRNA and proteins using semi-quantitative methods to compare blastomeres. As such, it is not possible to know whether those subtle differences are biologically meaningful. For example, the lowest HSPA2 level of the embryo with the highest level is much higher than the top cell from the embryo with the lowest level. What does this level mean then? Does this mean that some blastomeres grafted from strong embryos would systematically outcompete all other blastomeres from weaker embryos? That would be very surprising. I think the authors should be more careful and consider the lack of quantitative power of their approach before reaching firm conclusions. Although to be fair, the authors only follow a long trend of studies with the same intrinsic flaw of this approach.
Thanks for your advisable comment. Indeed, although the approach drew on previous research (Zhou Cell 2018), we were clearly aware that it can only reflect relative comparisons. This means that the relative differences among the blastomeres from the same embryo were detected and compared. We did not compare the absolute levels of mRNA between different embryos. We also offered simple and straightforward experiments with large sample sizes to confirm this conclusion.
(5) Some of the analyses on immunostaining do not take into account that this technique only allows for semi-quantitative measurements and comparisons.
a) Some of the microscopy images are shown with an incorrect look-up table.
b) Some of the schematics are incorrect and misleading.
Thanks for your advisable comment. We revised microscopy images and schematics in our revised version.
Reviewer #2 (Public review):
Summary:
In this study, Gao et al. use RNA-seq to identify Hspa2 as one of the earliest transcripts heterogeneously distributed between blastomeres. Functional studies are performed using siRNA knockdown showing Hspa2 may bias cells toward the ICM lineage via interaction with the known methyltransferase CARM1.
Strengths:
This study tackles an important question regarding the origins of the first cell fate decision in the preimplantation embryo. It provides novelty in its identification of Hspa2 as a heterogeneous transcript in the early embryo and proposes a plausible mechanism showing interactions with Carm1. Multiple approaches are used to validate their functional studies (FISH, WB, development rates, proteomics). Given only 4 other transcripts/RNA have been identified at or before the 4-cell stage (LincGET, CARM1, PRDM14, HMGA1), this would be an important addition to our understanding of how TE vs ICM fate is established.
Thanks for your kind recognition.
The RNA-seq results leading the authors to focus on Hspa2 are not included in the manuscript. This dataset would serve as an important resource but is neither included nor discussed. Nor is it mentioned whether Hspa2 was identified in prior RNA-seq embryo studies (for example Deng Science 2014).
Thanks for your advisable comment. To identify genes that show significantly high variability across blastomeres in the same embryo, we regressed out the embryo effect by establishing a new method, which will be published and uploaded to the database in the future. Thus, the RNA-seq results that led us to focus on Hspa2 are not included in the manuscript.
In addition, the functional studies are centered on Hspa2 knockdown at the zygote (1-cell) stage, which would largely target maternal transcript. Given the proposed mechanism relies on Hspa2 heterogeneity post-ZGA (late 2-cell stage), the knockdown studies don't necessarily test this and thus don't provide direct support to the authors' conclusions. The relevance of the study would be improved if the authors could show that zygotic knockdown leads to symmetric Hspa2 levels at the late 2-cell and/or 4-cell stage. It may be possible that zygotic knockdown leads to lower global Hspa2 levels, but that asymmetry is still generated at the 4-cell stage.
Thanks for your advisable comment. We show the Hspa2 levels at the late 2-cell and 4-cell stages after zygotic knockdown in our revised version (Figure S1 G-H, line 450-452).
Furthermore, the authors show that Hspa2 knockdown at the 1-cell stage lowers total Carm1 levels at the 4-cell stage. However, it is unclear how total abundance within the embryo alters lineage specification within blastomeres. The authors go on to propose a plausible mechanism involving Hspa2 and Carm1 interaction, but do not discuss how expression levels may be involved.
Thanks for your advisable comment. Previous research suggests that heterogeneous activity of the methyltransferase CARM1 results in differential methylation of histone H3R26 to modulate establishment of lineage specification (Zernicka-Goetz Cell 2018). Thus, we did not discuss how total abundance within the embryo alters lineage specification.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Major issue with analyses:
Image analysis needs to be much better explained than simply saying that ImageJ was used. Where are cells measured (at their equatorial plane? What is the size of the ROI?)? Ideally, the ROI and/or raw measurements should be provided.
Thanks for your advisable comment. We redescribe the Image analysis in our revised version (line 187-194).
What are the objective criteria determining whether a cell is counted as GFP positive, CDX2 positive, or OCT4 positive? This is very unclear and key to the interpretation of many experiments.
Thanks for your advisable comment. Cells containing fluorescence signals above background noise were counted as positive.
Statistical analyses mention ANOVA in the methods but the student's t-test in the figure legend. Which is which? Most data are heavily normalized, which would unlikely fit the description for Student's t-test analyses.
Thanks for your advisable comment. We redescribe the statistical analyses in our materials and methods (line 253-260).
Figure 5H describes a relative fluorescence intensity with control at 1. The legend describes a normalization to "DNA" (I guess the authors meant DAPI), which is unlikely to give 1. This suggests that additional normalization was done and is not described. Is that the case? Also, since the authors propose that HSPA2 would control Histone modification and chromatin packing, I do not think that using DAPI is an appropriate way of normalizing the fluorescence signal.
Thanks for your advisable comment. We replaced DNA with DAPI in our revised version. Based on previous studies, we used DAPI to normalize the fluorescence signal (Zhou Cell 2018, Zernicka-Goetz Cell 2018).
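The reviewer's point that a ratio to DAPI alone is unlikely to give a control value of 1 can be illustrated with a minimal Python sketch. It assumes, without confirmation from the manuscript, a two-step normalization: each nucleus's signal is divided by its DAPI intensity, and every ratio is then divided by the mean ratio of the control group. The function name and all intensity values are hypothetical.

```python
from statistics import mean

def relative_intensity(signal, dapi, control_signal, control_dapi):
    """Two-step normalization (assumed): ratio to DAPI, then scaling to the control mean."""
    # Step 1: per-nucleus ratio of the signal of interest to DAPI.
    ratios = [s / d for s, d in zip(signal, dapi)]
    control_ratios = [s / d for s, d in zip(control_signal, control_dapi)]
    # Step 2: divide by the mean control ratio so that the control group averages 1.
    control_mean = mean(control_ratios)
    return [r / control_mean for r in ratios]

# Hypothetical raw intensities (arbitrary units), for illustration only.
control_signal = [120.0, 100.0, 110.0]
control_dapi = [200.0, 200.0, 200.0]
kd_signal = [60.0, 55.0, 50.0]
kd_dapi = [200.0, 220.0, 200.0]

control_rel = relative_intensity(control_signal, control_dapi, control_signal, control_dapi)
kd_rel = relative_intensity(kd_signal, kd_dapi, control_signal, control_dapi)
print(mean(control_rel), mean(kd_rel))
```

Only the second step forces the control mean to 1, so a legend that mentions normalization to DAPI alone leaves that step undescribed.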
Figure 1E shows data normalized to the lowest level while Figure 1H is normalized to the highest level. A consistent representation would be welcome.
Thanks for your advisable comment. We revised the Figure 1H in our revised version.
Is Figure 1C showing a t-test between correlations?
Yes, Figure 1C shows the t-test between correlations.
(2) Major issue with the interpretation of semi-quantitative methods and measurements:
qPCR, WB, immunostaining are all semi-quantitative methods that require some kind of normalization due to non-linear bias in the way the molecules are picked up. Such normalization makes it difficult to know whether a detectable difference is meaningful biologically speaking i.e. if a difference of 1 CT between blastomeres can be detected after qPCR, is it meaningful? If that were the case, then embryos with lower CT than others (Figure 1D) would not be able to develop into blastocyst, like siRNA injected embryos, or grafting a blastomere with a high CT onto an embryo with low CT would lead to the systematic differentiation of these strong blastomeres into ICM.
Thanks for your advisable comment. The CT values represent the relative mRNA levels of Hspa2 between blastomeres, and a higher CT value represents lower expression of Hspa2 at the mRNA level. Figure 1D shows the Hspa2 mRNA levels between blastomeres. A blastomere with low-level expression of Hspa2 mRNA is not biased toward an ICM fate.
The same goes for fluorescence analyses (Figure 1F). Can the authors also provide the measurements for DAPI as they did for HSPA2? I am sure that with enough measurements, DAPI is variable enough to give a statistical difference among blastomeres with questionable biological meaning.
I think the reasoning used here (unfortunately following the reasoning that has been used in a series of studies by other groups) of ranking blastomeres after semi-quantitative measurement is fundamentally flawed.
Thanks for your advisable comment. The DAPI signal was determined from the maximal area using a custom Python script. Based on previous studies, we used DAPI to normalize the fluorescence signal (Zhou Cell 2018). This approach normalizes embryo-to-embryo variance arising from technical factors.
(3) Major issue with overexpression experiment:
While the siRNA experiment is partially validated by qPCR and WB measurements of HSPA2 after KD, the overexpression experiment is not. Do the authors have any evidence that the construct they use is produced into protein and functional? Can the authors check by WB? Can the authors rescue the siRNA with their overexpression?
Thanks for your advisable comment. We verified the overexpression experiment by WB in our revised version (Figure S3, line 360-361). Considering that siRNA degrades mRNA and prevents the mRNA translation process, we did not co-inject the siRNA with the overexpression construct.
The lack of effect of HSPA2 overexpression on blastocyst formation is difficult to reconcile with the interpretation from the authors that levels of HSPA2 bias lineages.
Have the authors tried lower concentrations? Have the authors tried FISH on their half-injected 2-cell embryos? Of course, if the antibody against HSPA2 would work with immunostaining, that would be ideal.
Thanks for your advisable comment. We chose the concentrations for our study based on previous research (Zernicka-Goetz Cell 2016). To verify successful injection into one blastomere at the 2-cell stage, we observed green fluorescence after co-injecting GFP mRNA with either siRNA or NC-FAM into one blastomere of two-cell embryos. Thus, we did not try FISH on half-injected 2-cell embryos. We tried to perform immunostaining experiments with various HSPA2 antibodies (Proteintech: 12797-1-AP, Abcam: ab108416), but none gave satisfactory results.
Author response image 1.
(4) Major issue with tracking of injected cells:
It is unclear what counts as a GFP-positive cell. In Figure 3D, most cells appear to have the same level of GFP.
Thanks for your advisable comment. Cells containing green fluorescence signals above background noise were counted as GFP-positive in Figure 3D. Most cells seem to have the same level of GFP because they are daughter cells of the blastomeres injected with GFP.
In the images of GFP-expressing cells used to track the control of KD cells shown in Figure 3A, it seems that the control embryos have mostly GFP cells in the ICM. Is that the case, or just a bad example?
Thanks for your advisable comment. The green fluorescent signals in Figure 3A represent OCT4 protein, an ICM marker.
Can the authors do FISH against HSPA2 and visualize their GFP cells to validate the heterogeneous expression in situ?
Thanks for your advisable comment. We have verified the heterogeneous expression of HSPA2 in Figure 1.
(5) Issue with fluorescent images:
Many images are shown with inappropriate look-up tables with saturated DAPI, OCT4, CDX2, and FISH. This raises the doubt that analyses were made on saturated images, which would be incorrect.
The LUT of Figure 5H should be adjusted similarly between the control and siRNA.
Thanks for your advisable comment. We revised the images that were shown with inappropriate look-up tables in our revised version. The LUT of Figure 5H has been adjusted similarly between the control and siRNA.
(6) Issue with schematics:
Schematics of blastomere isolation grown into blastocyst-like structures are misleading since the final blastocyst-like structure should not have a zona pellucida and should have fewer cells than regular blastocysts.
Thanks for your advisable comment. We revised schematics of blastomere grown into morula in our revised version (Figure 1A and Figure S1A).
The summary schematics in the final figure should not state HSPA2 -/- since experiments in the study did not use KO but KD.
Thanks for your advisable comment. We revised the summary schematics in our revised version.
The blastocysts are the same sizes as the cleavage stage or morula embryos which implies that cells lose volume to the lumen, which is not the case.
Thanks for your advisable comment. We revised the schematics in our revised version.
(7) Issue with data presentation:
In the tables within the figures, the number of decimals given should be the same for the mean and SE (one decimal should be more than enough).
Thanks for your advisable comment. We revised the figure 2H in our revised version.
The comparison of cell number and distribution within embryos (e.g. Figure 2B) would be best represented by a correlation analysis of TE vs ICM cells.
Thanks for your advisable comment. We added a figure showing a correlation analysis of TE vs ICM cells in our revised version (Figure 3B).
The docking simulations are described in the main text as "experiments".
Thanks for your advisable comment. We redescribed the docking simulations in our revised version.
(8) Issue with data interpretation:
The reduced number of ICM cells is interpreted as a slowed-down cell cycle. This could also be explained by failed cytokinesis and the generation of binucleated or polyploid cells. Have the authors checked for that? For example, by looking at their DAPI staining.
Thanks for your advisable comment. Our RNA-seq results revealed that the differentially expressed genes (DEGs) at the blastocyst stage after HSPA2 knockdown are closely related to negative regulation of cell cycle, G1/S transition of mitotic cell cycle, mitotic cell cycle phase transition and regulation of mitotic cell cycle phase transition. Additionally, a previous study demonstrated that knockdown of HSPA2 reduced cell proliferation and led to G1/S phase cell cycle arrest (Hu Ann Transl Med 2019). The lower cell number in the ICM may also be associated with failed cytokinesis and the generation of binucleated or polyploid cells. Thus, we speculate that HSPA2 has a role in ICM lineage establishment, although half of the ICM cells were able to survive with HSPA2 deficiency (line 463-472).
It is unclear to me why reduced ICM should lead to fewer blastocysts. Blastocysts should be able to form as long as their TE is fine. In Figure 2G, embryos seem to be cultured in close proximity, which is fine if they are healthy but not if some of the embryos start dying and releasing toxic compounds (e.g. ROS). Have the authors tried removing the dying KD embryos to see if the development of the remaining embryos would improve?
Thanks for your advisable comment. We think HSPA2 may affect blastocyst development by affecting other signaling pathways. The enriched GO terms were closely related to blastocyst development (Figure 2E). There was no significant difference in morula formation rate between the Hspa2-KD group and the NC group, so the assumption that toxic compounds released by some of the embryos reduce the blastocyst rate may not be correct. Indeed, the rate of blastocyst formation in Hspa2-KD embryos was still significantly lower when a few embryos were cultured separately. In addition, we discussed the possibility that the lower cell number in the ICM may also be associated with failed cytokinesis and the generation of binucleated or polyploid cells.
Author response image 2.
Reviewer #2 (Recommendations for the authors):
One of the significant findings in the paper is the discovery portion where Hspa2 is identified as a heterogeneous transcript. To improve the logic and impact of the manuscript, it may benefit from reorganizing some of the figures and text. For example:
(1) The paragraph in the introduction (Lines 56-68) should be moved to the discussion as the Hspa2 reveal should be in section 3.1, not prior to the RNA-seq results presented in Figure 1.
Thanks for your advisable comment. We think it is more logical that HSPA2 needs to be introduced in the introduction.
(2) Add text at the beginning of Section 3.1 to describe the rationale and results for the RNA-seq. It would help the readers if the authors clearly stated why they chose the 4-cell stage.
Thanks for your advisable comment. We explain why we chose the 4-cell stage in our revised version (line 272-273).
(3) As this is the first time Hspa2 is identified, consider moving Figure S1C to the main figure to show expression throughout development.
Thanks for your advisable comment. We moved Figure S1C to the main figure in our revised version (line 286-291).
(4) Figure 1C: the correlation between Hspa2 and ICM markers would be strengthened if additional transcripts were used (Oct4, Sox2, Sox21). The graph in 1C would also be more informative if represented as a scatter plot with correlation coefficients (Nanog log2TPM vs Hspa2 log2TPM), rather than bar graphs.
Thanks for your advisable comment. We chose Nanog, an ICM marker, because it showed the strongest correlation with Hspa2 in our results. Figure 1C shows that the positive correlation between Nanog and Hspa2 expression is stronger than that of random gene pairs (n = 100, where n is the number of random gene pairs). We therefore consider the bar graph in Figure 1C easier to understand.
(5) Figure 1D: how were individual blastomeres grouped into B1-4? Individually run and then pooled based on relative expression?
Thanks for your advisable comment. Blastomeres are named B1 to B4 in order of increasing Hspa2 concentration in Figure 1E.
(6) Figures 1F, 1I, 5H: the DAPI channel appears to be saturated, but is used to normalize fluorescence intensity and may incorrectly account for light scattering within the embryo. Please clarify by adding more details regarding image analysis. Were partial stacks through the nucleus used for analysis, or max projections? Graph axes should be "relative fluorescence intensity."
Thanks for your advisable comment. We added the details of the fluorescence image analysis. The graph axes have been revised in our revised version.
(7) Line 278: the results in Figure S1C would benefit from more text regarding expression patterns throughout development. The maternal transcript appears to have a sharp downregulation by the early 2-cell stage, and is then upregulated coinciding with ZGA.
Thanks for your advisable comment. We added more description of the figure to the main text (line 285-290).
(8) For the analyses in Figure 2 I-J and 2K-L, were arrested embryos excluded from analysis? This is an important detail as including arrested embryos would significantly bias the RNA-seq results.
Thanks for your advisable comment. The arrested embryos were excluded in Figure 2 I-J and 2K-L.
(9) Figures 2G-H would be aided by converting the table in 2H to a bar graph and adding development rates for all stages (2-, 4-, 8-, morula, and blast). This would also show when an arrest occurs.
Thanks for your advisable comment. We converted the table in 2H to a bar graph.
(10) Blast rates are represented with too many significant digits (Figures 2H, 4B). They should only be reported to the closest ones given the unit of measure (number of blasts divided by number of zygotes). For instance, a blast rate of 81.63 ± 2.000 reflects excessive precision that is not measured in the data; it should rather read 82 ± 2%. This is also true for % cells (Figures 3E, 4H).
Thanks for your advisable comment. Values were rounded down to one decimal place.
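The reporting convention under discussion can be sketched in a few lines of Python. The helper below is hypothetical and simply prints a mean and its standard error with the same number of decimals. Note that it rounds to the nearest value, whereas the authors state that they rounded down, which would require `math.floor` instead.

```python
def format_rate(mean_pct, se_pct, decimals=0):
    """Format a percentage as 'mean ± SE%' with the same number of decimals for both."""
    return f"{mean_pct:.{decimals}f} ± {se_pct:.{decimals}f}%"

print(format_rate(81.63, 2.000))     # whole percent, as the reviewer suggests
print(format_rate(81.63, 2.000, 1))  # one decimal place
```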
(11) The clarity and impact of Figure 3A and 3D would benefit from 2D slices through the ICM.
Thanks for your advisable comment. To give a more comprehensive view of the 3D structure of the blastocysts in Figures 3A and 3D, we did not use 2D slices.
(12) To improve clarity and logic, separate the 1-cell and 2-cell knockdown experiments in the text and figures:
a) 1-cell knockdown with RNA-seq results (Fig 2A-F).
b) 1-cell knockdown showing less ICM/pluripotency markers in (combine Figures 2G-M and Figures 3A-B; "new Fig 3").
c) 2-cell knockdown tracing lineage (Figures 2D-E; "new Fig 4").
The new Figures 3 and 4 should mirror one another (i.e. for each knockdown experiment, development rates and cell counts should be included). For the 2-cell knockdown (Figures 2 D-E), what were the developmental rates (8-cell, morula, blast)?
Thanks for your advisable comment. However, to preserve the overall logic of the article, we did not separate the 1-cell and 2-cell knockdown experiments in the text and figures. We added the developmental rates (8-cell, morula, blast) of the 2-cell knockdown group in our revised version (Figure S2).
For the overexpression experiment (Figure 4), why were injections performed at the zygote stage versus the 2-cell stage? Given the significant downregulation of maternal transcript demonstrated in Figure S1C, it seems plausible that the injected RNA was also downregulated.
Thanks for your advisable comment. For the overexpression experiment, we first chose to inject Hspa2 mRNA at the zygote stage and found that overexpression of Hspa2 does not bias blastomeres toward an ICM fate. The qRT-PCR results indicated that the expression level of Hspa2 in the overexpression group was significantly increased compared with the normal group at the 4-cell and blastocyst stages (Figure 4C, 4D). In addition, there is no guarantee that an equal amount of Hspa2 mRNA would be injected into each blastomere at the 2-cell stage. Thus, we did not microinject Hspa2 mRNA at the 2-cell stage.
The 3.5 subheading overstates the results as the Hspa2-Carm1 interaction is not linked to lineage segregation. For example, a more specific subtitle might be, "Hspa2 interacts with Carm1 and alters H3R26me2 levels."
Thanks for your advisable comment. We revised the subtitle in our revised version (line 376).
Figures 5B-C and 5D-E. The qRT-PCR and WB analysis of knockdown blasts shows a correlation between Hspa2 downregulation and Carm1 downregulation. However, if the proposed mechanism is Hspa2 binding to Carm1 to mediate downstream methylation, why would it be expected to alter transcript levels at the 4-cell or blast stage? Please add further details and discussion in the results and discussion sections.
Thanks for your advisable comment. We chose to work at the 4-cell stage because previous studies on CARM1 have focused on the 4-cell stage (Zernicka-Goetz Cell 2018, 2016).
In the discussion, the statement in Lines 430-431 is an overinterpretation: "the heterogeneity of HSPA2... acts as an upstream factor to drive [the] first cell-fate decision." The knockdown experiments don't alter heterogeneity per se, but total abundance. Furthermore, the results do not show that heterogeneity drives heterogeneity in H3R26me2 patterns, for example.
Thanks for your advisable comment. We redescribe the relevant statement in the discussion.
More needs to be said regarding the ICM cells that persisted in the 1-cell KD experiment (Fig 3B). Lines 449-450 point out this result, but do not propose any plausible explanations. For instance, ICM cells may still form due to the incomplete knockdown achieved or the possibility that redundant pathways exist.
Thanks for your advisable comment. We redescribe the relevant statement in our revised version (line 468-473).
The 5th paragraph of the discussion seems incomplete. The authors point out a possible link between Hspa2 and Hippo and Wnt signaling pathways, but need to expand their discussion on how this may act as an additional mechanism incorporating Hspa2 with lineage segregation.
Thanks for your advisable comment. We redescribe the 5th paragraph of the discussion (line 483-494).
Statistics: all comparisons with greater than 2 groups should be performed with a one-way ANOVA and multiple comparisons, rather than Student's t-test (Figures 1B, 1D, 1E, 1F).
All figure legends lack statistical test details.
Thanks for your advisable comment. Statistical test details were added to all figure legends (see also the statistical analysis section).
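For readers unfamiliar with the test the reviewer requests, the F statistic of a one-way ANOVA can be computed as in the following minimal Python sketch. The group values are hypothetical and are not taken from the manuscript. The p-value and the post hoc multiple comparisons are not shown here; in practice a statistics library such as SciPy (`scipy.stats.f_oneway`) would normally be used.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        g_mean = sum(g) / len(g)
        # Variation of the group mean around the grand mean, weighted by group size.
        ss_between += len(g) * (g_mean - grand_mean) ** 2
        # Variation of individual values around their own group mean.
        ss_within += sum((x - g_mean) ** 2 for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical relative expression values for three groups of blastomeres.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)
```

A single F test across all groups avoids the inflated false-positive rate that comes from running a separate t-test for every pair of groups.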
Minor comments:
In all graphs, individual blastomere expression levels should be represented as box-whisker/bar/scatter/violin plots since the comparison is groups rather than time points (i.e. symbols should not be connected with a line in Figures 1B, 1D, 1F-G, 1I, S1D, S1F).
Thanks for your advisable comment. Each colored line represents a single cell, and the dots of the same color represent the blastomeres of the same cell. Thus, we used lines to represent individual blastomeres.
For all fluorescent images, having two representative images may be confusing for the reader. Figures may be improved by just including one representative image for each stage/treatment (Figures 1F, 1I, S1F, 3A, 3D, 4E, 4G).
Thanks for your advisable comment. The figures now include just one representative image for each stage in our revised version. In addition, two representative images from each group are shown for each treatment (Figures 3A, 3D, 4E, 4G).
The manuscript would be improved with thorough grammar and typo editing.
For example:
(1) Lines 18, 73, the wording is confusing, consider: "knockdown of Hspa2 in one of the two-cell blastomeres biased its progeny towards the trophectoderm lineage.".
(2) Line 23, overstatement. Consider: "we demonstrated that HSPA2 levels correlate with ICM-associated genes and that it interacts with the CARM1.".
(3) Line 25 confusing wording, "via the execution of commitment and differentiation phases.".
(4) Line 37, replace "that" with "of;" replace "cell-fate decisions" with "cell-fate decision".
(5) Line 40: needs space before (CARM1).
(6) Line 43: the wording is confusing, consider "can result in higher expression levels of".
(7) Line 45: wording, consider "Recent [studies have] further suggested".
(8) Line 70: plurality, consider "analyzed gene expression pattern".
(9) Line 73 typo: "prevents its".
(10) Line 76-77 wording, consider "Hspa2 expression patterns can bias cell fate in the mouse embryo".
(11) Line 276: remove "in whole embryos," since MII eggs are not embryos.
(12) Line 617 "There" should be "Three".
(13) Axis label in Fig 3b "Totle" should be "Total".
(14) Lines 417, 419 missing spaces.
(15) Line 448 missing word, "interfering [with] the cell cycle".
(16) Line 462 incorrect word, "[a]polar cells being specified as ICM".
(17) Line 469 incorrect plural, "cell differentiation".
Thanks for your advisable comment. We revised the whole manuscript carefully according to the reviewers' suggestions.
-
-
osf.io osf.io
-
Author response:
The following is the authors’ response to the current reviews.
Reviewer #1:
(1) To improve the clarity of the work, I suggest a final note to the authors to say more explicitly that objective accuracy has a finer resolution *due to the number of "special circles" per trial* in their task. This task detail got lost in my read of the manuscript, and confused me with respect to the resolution of each accuracy measure.
We agree with the reviewer that this would be a useful clarification and have therefore added the following statement to the Methods section on p. 20:
“It should be noted that the OIP has a slightly finer resolution due to the number of special circles per trial.”
(2) Similarly for clarification, they could point out that their exclusion criteria remove subjects that have lower OIP than their AIP analysis allows (which is good for comparison between OIP and AIP). Thus, it removes the possibility that very poorly performing subjects (OIP) are forced to have a higher than actual AIP due to the range.
We agree this would be a useful statement to add and have included the following sentence in the Supplement on p. 8:
“Such a restriction of the threshold parameter was intended to increase the comparability between AIP and OIP, and hence improved the calculation of the reminder bias.”
The following is the authors’ response to the previous reviews.
Reviewer #1:
(1) Upon reading their response to the question I had regarding AIP and OIP, a few more questions came up regarding OIP, AIP, how their calculations differ, and how the latter was computed in R. I hope these help readers to clarify how to interpret these key measures, and the hypotheses that rely upon them.
Regarding fitting, and in relation to power, is 16 queries adequate to estimate an AIP using R's quickpsy? That is, assuming some noise in the choice process, how recoverable is a true indifference point from 16 trials? A parameter recovery analysis (i.e., generating choices via the fitted parameters, which will have built-in stochasticity, and seeing how well you recover the parameters) would be helpful. It may help to characterize why the present study might differ from prior studies (maybe a power issue here).
The reviewer is absolutely correct that we should have provided more detail when describing our fitting procedure for the psychometric curves. We have now addressed this by adding the following statements to the Methods section and Supplement:
Page 20 in the main manuscript: “Fitting was done using the quickpsy package in R and more detail is given in the Supplement.”
Pages 8 and 9 in the Supplement:
“Psychometric curve fitting
We used the quickpsy package in R to fit psychometric curves to each participant’s choice data to derive their actual indifference point (AIP), which was operationalised as the threshold parameter when predicting reminder choices from target values. We restricted the possible parameter ranges from 2 to 9 for the threshold parameter and from 1 to 500 for the slope parameter, based on the task’s properties and pilot data. Apart from those parameter ranges, we used only default settings of the quickpsy() function.
Each participant has only 16 trials (2 for each target value) contribute to the curve fitting. To understand the robustness of the AIP based on such limited data, we conducted a parameter recovery analysis. We simulated 16 trials based on each psychometric function and re-ran the curve fitting based on those simulated choices. There was close correspondence between the actual and recovered threshold parameters (or AIPs) with a correlation of r = 0.97, p < 0.001 (see also Figure S1). In contrast, the slope parameter—which was not central to any of our analyses—exhibited greater variability during the initial fitting. This increased uncertainty likely contributed to its poor recovery in the simulation, as evidenced by a near-zero correlation (r = −0.01, p = 0.82).”
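The logic of such a parameter recovery analysis can be sketched in Python. This is not the authors' procedure, which used the quickpsy package in R. The sketch assumes a logistic psychometric function, a simple grid-search maximum-likelihood fit, a fixed true slope of 2, and 100 simulated participants; all of these choices, and every function name, are illustrative assumptions.

```python
import math
import random

TARGET_VALUES = [2, 3, 4, 5, 6, 7, 8, 9]  # target values used in the task
TRIALS_PER_VALUE = 2                      # 8 values x 2 = 16 trials

def p_reminder(value, threshold, slope):
    """Assumed logistic psychometric function; equals 0.5 at the threshold."""
    return 1.0 / (1.0 + math.exp(-slope * (value - threshold)))

def simulate(threshold, slope, rng):
    """Simulate 16 binary choices (1 = reminder chosen) as (value, choice) pairs."""
    return [(v, 1 if rng.random() < p_reminder(v, threshold, slope) else 0)
            for v in TARGET_VALUES for _ in range(TRIALS_PER_VALUE)]

def log_likelihood(trials, threshold, slope):
    """Bernoulli log-likelihood of the observed choices under the model."""
    total = 0.0
    for value, choice in trials:
        # Clamp probabilities to avoid log(0).
        p = min(max(p_reminder(value, threshold, slope), 1e-9), 1 - 1e-9)
        total += math.log(p if choice else 1.0 - p)
    return total

THRESHOLD_GRID = [2 + i / 10 for i in range(71)]  # 2.0 to 9.0, the restricted range
SLOPE_GRID = [0.5, 1.0, 2.0, 4.0, 8.0]            # illustrative slope candidates

def fit(trials):
    """Grid-search maximum likelihood; returns (threshold, slope)."""
    return max(((t, s) for t in THRESHOLD_GRID for s in SLOPE_GRID),
               key=lambda ts: log_likelihood(trials, ts[0], ts[1]))

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed for reproducibility
true_thresholds = [rng.uniform(3.0, 8.0) for _ in range(100)]
recovered = [fit(simulate(t, 2.0, rng))[0] for t in true_thresholds]
r = pearson_r(true_thresholds, recovered)
print(f"threshold recovery: r = {r:.2f}")
```

A high correlation between generating and recovered thresholds indicates that 16 trials constrain the indifference point reasonably well, even when the slope itself is poorly identified.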
(2) Along these lines, it would be helpful for the reader to actually see the individual psychometric curves, and to know how quickpsy was used (did you fit left and right asymptotes?), etc., to understand how that fitting procedure works and how the assumptions of the fitting procedure compare to what can be gleaned through seeing the choice curves plotted.
As stated above, we used default settings of the quickpsy() function and hence assumed symmetric asymptotes at 0 and 1. However, the reviewer mentions “left and right asymptotes”, so maybe this question is about restricting the possible parameter range for the threshold, which we restricted to values from 2 to 9, as described above.
Regarding the individual curves, we have now included the following statement on page 9 in the Supplement: “Figures S2 to S31 show the individual psychometric curves that were estimated for each participant.” Please refer to the Supplement for the added figures.
(3) A more full explanation of quickpsy, its parameters, and how choice curves look might also generate interesting further questions to think about with respect to biases and compulsivity. Two individuals might have similar indifference points, but an asymptote might reflect a bias to always have some percent chance of, for example, taking the reminders even at the lowest offer available to them.
We agree that this is an interesting focus which we will keep in mind for future studies.
(4) Regarding comparing OIP to AIP:
For OIP, as far as I can understand, the resolution of it is decreased compared to AIP. Accuracies for OIP can only be 0/4, 1/4, 2/4, 3/4, or 4/4. Yet, the resolution for AIP is the full range of offers (2 to 9) with respect to the parameter of interest (the indifference point). Could this bias the estimation of OIP (for instance, someone who scored 25% might actually be much closer to either 50 or 0, but we can't tell due to resolution)?
As mentioned in response to comment (1), we restricted the parameter range for the thresholds to 2 to 9 to increase comparability. The reviewer is right to point out that the OIP still has lower resolution than the AIP, which is one of the downsides of having a shortened paradigm (cf. the longer version in Gilbert et al., 2020), which is optimised for online testing, especially if used in combination with additional questionnaires. We have no reason to believe though that this could have led to any bias, especially none that would contribute to the individual differences which are the main focus of our study.
Gilbert, S. J., Bird, A., Carpenter, J. M., Fleming, S. M., Sachdeva, C., & Tsai, P.-C. (2020). Optimal use of reminders: Metacognition, effort, and cognitive offloading. Journal of Experimental Psychology: General, 149(3), 501–517. https://doi.org/10.1037/xge0000652
(5) Additionally, it seems like the upper and lower bounds of OIP (0 and 10) differ from AIP (2 and 9). Could this also introduce bias (for example, if someone had terrible performance, the mean would artificially be higher under AIP than OIP because the smallest indifference point is 2 under AIP, but could be 0 under OIP)?
See our response to comment (1): we fixed the range to 2 to 9 (which was the range of target values used in our study).
(6) Finally seeing how CIT actually corresponds to accuracy overall (not a relative measure like AIP compared to OIP) I think would also be helpful as this is related to most points noted above.
We included the suggested test as an exploratory analysis on pages 42-43 in the Supplement: “Third, we were interested in how the transdiagnostic phenotypes would correspond to performance. We therefore fitted a model which predicted internal accuracy (that is, unaided task performance on trials where no reminders could be used) from AD, CIT, and the other covariates (age, education and gender). We found that neither AD, β = -0.02, SE = 0.05, t = 0.44, p = 0.658, nor CIT, β = -0.03, SE = 0.05, t = -0.66, p = 0.510, predicted internal accuracy.
The full results can be found in Table S13 as well as in Figure S32.”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We genuinely appreciate the reviewers' interest and recognition of our work. The comments and suggestions on the results presentation and interpretation are well taken. We plan to revise the manuscript based on the reviewers' recommendations in the following aspects.
(1) We fully agree with the reviewer that the aged environment indeed would affect the myeloid and megakaryocyte differentiation behaviors of HSC. As a result, the clonal behaviors of HSCs presented in the current manuscript could be different from how HSCs differentiate in young mice. This point will be discussed in the revised manuscript.
(2) We agree with the reviewer that the manuscript was not as easy to follow as many other papers in experimental hematology, primarily because the analyses presented in the current manuscript were not frequently used in previous studies. To address this, we will try to revise the manuscript using plain language to describe the results and conclusions. We will also provide graphical summary schematics where appropriate to present the findings better. We will further discuss our results in the context of previous findings to better illustrate the novelty of the current work.
(3) We will provide more technical details of our analysis in the revised manuscript for readers to better understand how results are obtained and data analyses are performed in the current manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank the reviewers for their thoughtful and constructive assessment of our manuscript. We agree that additional clarity on some key points in the manuscript will be a valuable addition to this work. Both reviewers expressed a related concern regarding the basis for design and interpretation of our pyrazinamide ROS synergy experiments.
Reviewer 1:
The in vitro experiments performed in this manuscript mainly report that PZA pre-treatment increases H2O2-mediated killing or inhibition. There is no direct evidence that clearly shows that oxidative stress drives the potent bactericidal activity of PZA. In these settings the oxidative stress is always applied after PZA pre-treatment and is therefore likely displaying the major lethal effect.
Reviewer 2:
The manuscript would benefit from a clear statement of the rationale for the protocols used to examine the synergy of PZA with ROS, the possible models their protocols could be testing, and then how their data supports or disproves the models being tested. The manuscript appears to propose, as stated in the title, that "Oxidative stress drives potent bactericidal activity of pyrazinamide...". However, their experimental design more likely tests the effect of PZA on ROS sensitivity. Indeed, by the last figure, the authors begin to present their data as PZA sensitizing the bacteria to ROS. More clarity on these possible models and the different interpretations of the data should be considered.
We agree that the data presented in the current version of the manuscript is incomplete in supporting our assertion that oxidative stress drives bactericidal activity of pyrazinamide. As both reviewers note, pretreatment of bacilli with pyrazinamide followed by challenge with ROS indicates that pyrazinamide enhances susceptibility to oxidative stress but does not address whether oxidative stress enhances susceptibility to pyrazinamide. Further, we neglected to provide information regarding why we chose to pretreat bacilli with pyrazinamide before ROS exposure. Over the course of our work, we had found that pyrazinoic acid, the active form of pyrazinamide, showed potent synergy with hydrogen peroxide. In contrast to the time-dependent synergy that we observed between pyrazinamide and peroxide, synergy between pyrazinoic acid and peroxide did not require pretreatment. We will revise our manuscript to include results that address these key issues and we will carefully consider revising our interpretations accordingly.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Public review):
Summary:
The question of how central nervous system (CNS) lamination defects affect functional integrity is an interesting topic, though it remains a subject of debate. The authors focused on the retina, which is a relatively simple yet well-laminated tissue, to investigate the impact of afadin, a key component of adherens junctions, on retinal structure and function. Their findings show that the loss of afadin leads to significant disruptions in outer retinal lamination, affecting the morphology and localization of photoreceptors and their synapses, as illustrated by high-quality images. Despite these severe changes, the study found that some functions of the retinal circuits, such as the ability to process light stimuli, could still be partially preserved. This research offers new insights into the relationship between retinal lamination and neural circuit function, suggesting that altered retinal morphology does not completely eliminate the capacity for visual information processing.
Strengths:
The retina serves as an excellent model for investigating lamination defects and functional integrity due to its relatively simple yet well-organized structure, along with the ease of analyzing visual function. The images depicting outer retinal lamination, as well as the morphology and localization of photoreceptors and their synapses, are clear and well-described. The paper is logically organized, progressing from structural defects to functional analysis. Additionally, the manuscript includes a comprehensive discussion of the findings and their implications.
Weaknesses:
While this work presents a wealth of descriptive data, it lacks quantification, which would help readers fully understand the findings and compare results with those from other studies. Furthermore, the molecular mechanisms underlying the defects caused by afadin deletion were not explored, leaving the role of afadin and its intracellular signaling pathways in retinal cells unclear. Finally, the study relied solely on electrophysiological recordings to demonstrate RGC function, which may not be robust enough to support the conclusions. Incorporating additional experiments, such as visual behavior tests, would strengthen the overall conclusions.
Thank you very much for taking the time to provide thoughtful and valuable comments. Following your suggestions, we will quantify some of the histological data and explore the mechanisms underlying the defects of lamination and cell fate determination observed in the afadin cKO retina. We will also try to examine the vision of afadin cKO mice by visual behavior tests.
Reviewer #2 (Public review):
Summary:
Ueno et al. described substantial changes in the afadin knockout retina. These changes include decreased numbers of rods and cones, an increased number of bipolar cells, and disrupted somatic and synaptic organization of the outer limiting membrane, outer nuclear layer, and outer plexiform layer. In contrast, the number and organization of amacrine cells and retinal ganglion cells remain relatively intact. They also observed changes in ERG responses and RGC receptive fields and functions using MEA recordings.
Strengths:
The morphological characterization of retinal cell types and laminations is detailed and relatively comprehensive.
Weaknesses:
(1) The major weakness of this study, perhaps, is that its findings are predominantly descriptive and lack any mechanistic explanation. As afadin is a key component of adherens junctions, its role in mediating retinal lamination has been reported previously (see PMCID: PMC6284407). Thus, a more detailed dissection of afadin's role in processes such as progenitor generation, cell migration, or the formation of retinal lamination would provide greater insight into the defects caused by knocking out afadin.
Thank you for taking the time to provide valuable comments. Following your suggestions, we will perform experiments to evaluate the mechanisms of the retinal lamination and cell fate determination defects observed in the afadin cKO retina. However, we would like to note that the paper cited in the comment (PMCID: PMC6284407) analyzed the function of afadin in the formation of dendrites of direction-selective RGCs in the IPL, and that the word "lamination" there refers to the layering of RGC dendrites in the IPL. Here, we analyzed the function of afadin in laminar construction of the retina.
(2) The authors observed striking changes in the numbers of rods, cones, and BCs, but not in ACs or RGCs. The causes of these distinct changes in specific cell classes remain unclear. Detailed characterizations, such as the expression of afadin in early developing retina, tracing cell numbers across various early developmental time points, and staining of apoptotic markers in developing retinal cells, could help to distinguish between defects in cell generation and survival, providing a better understanding of the underlying causes of these phenotypes.
Following your suggestion, we will perform the experiments to characterize the causes of distinct changes in the afadin cKO retina.
(3) Although the total number of ACs or RGCs remains unchanged, their localizations are somewhat altered (Figures 2E and 4E). Again, the cause of the altered somatic localization in ACs and RGCs is unclear.
To address the reviewer’s point, we will analyze the positions of progenitors and of these cells at developmental stages of the afadin cKO retina.
(4) One conclusion that the authors emphasise is that the function of RGCs remains detectable despite a severely disrupted outer plexiform layer. However, the organization of the inner plexiform layer remains largely intact, and the axonal innervation of BCs remains unchanged. This could explain the functional integrity of RGCs. In addition, the resolution of detecting RGCs by MEA is low, as they only detected 5 clusters in heterozygous animals. This represents an incomplete clustering of RGC functional types and does not provide a full picture of how functional RGC types are altered in the afadin knockout.
We appreciate the reviewer’s insightful comments. Although our clustering of RGC subtypes in afadin cHet retinas resulted in only five clusters, the key finding of our study is the preservation of RGC receptive fields in afadin cKO retinas, despite severe photoreceptor loss (reduced to about one-third of normal) and disruption of photoreceptor-bipolar cell synapses in the OPL. This suggests that even with crucial damage to the OPL, the primary photoreceptor-bipolar-RGC pathway can still function as long as the IPL remains intact. Moreover, the presence of rod-driven responses in RGCs indicates that the AII amacrine cell-mediated rod pathway may also continue to function. We agree that our functional clustering in afadin cHet retinas was incomplete. However, we speculate that the absence of RGCs with fast temporal responses in afadin cKO retinas may not simply be due to the loss of specific RGC subtypes but rather to disrupted synaptic connections between photoreceptors and fast-responding bipolar cells. Furthermore, the structural abnormalities in retinal lamination in afadin cKO retinas may alter RGC response properties, making strict functional classification less meaningful. We would like to emphasize the finding that disruption of the retinal lamination in afadin cKO retinas leads to the absence of RGCs with fast temporal response properties, rather than focusing solely on the classification of RGC subtypes.
Minor Comments:
(1) Line 56-67: "Overall, these findings provide the first evidence that retinal circuit function can be partially preserved even when there are significant disruptions in retinal lamination and photoreceptor synapses" There is existing evidence showing substantial adaption in retinal function when retinal lamination or photoreceptor synapses are disrupted, such as PMCID: PMC10133175.
Thank you for your comment. The paper you mentioned is crucial for discussing and considering the results of our study. We will cite this paper and discuss it in the Discussion.
(2) Line 114-115: "we focused on afadin, which is a scaffolding protein for nectin and has no ortholog in mice." The term "Ortholog" is misused here, as the mouse has an afadin gene. Should the intended meaning be that afadin has no other isoforms in mouse?
Thank you for pointing this out. We mistakenly wrote "ortholog" where we meant "paralog", and we will revise it.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
eLife Assessment
This useful study integrates experimental methods from materials science with psychophysical methods to investigate how frictional stabilities influence tactile surface discrimination. The authors argue that force fluctuations arising from transitions between frictional sliding conditions facilitate the discrimination of surfaces with similar friction coefficients. However, the reliance on friction data obtained from an artificial finger, together with the ambiguous correlative analyses relating these measurements to human psychophysics, renders the findings incomplete.
Our main goal with this paper was to show that the most common metric, i.e., the average friction coefficient, which is widely used in tactile perception and device design, is fundamentally unsound, and to offer a secondary parameter that is compatible with the fact that human motion is unconstrained, leading to dynamic interfacial mechanics. In contrast with the summary assessment, we also note that the average friction coefficients in our study were not particularly similar, with differences ranging from 0.4 to 1, a typical range seen in most studies. We believe some of the comments originate from a misinterpretation of our statistically significant but negative correlation between human results and friction coefficients, which leads to the spurious conclusion that nearly identical objects should be very easy to tell apart, thus supporting our central argument for the need of an alternative. We understand the Reviewers wanting to see us demonstrate that humans use instabilities in situ. This is seemingly reasonable, but we explain the significant challenges and fundamental unknowns of those experiments. However, we modified our title to reflect our focus on offering an alternative to the average coefficient of friction.
We do not think it was feasible, at this stage, to demonstrate that humans use friction instabilities through direct manipulation and observation in human participants. In short, there are still several fundamental unknowns: (1) A decision-making model would need to be created, but it is unknown if tactile decision making follows other models. (2) It is further unknown what constitutes “tactile evidence”, though at our manuscript’s conclusion, we propose that friction instabilities are better suited to be tactile evidence than the averaging of friction coefficients from a narrow range of human exploration. (3) In the design of samples, from a friction mechanics and materials perspective, it is not possible at this point to pre-program surfaces a priori to deliver friction instabilities; these must instead be experimentally determined, especially when attempting to achieve this in controlled surfaces that do not create other overriding tactile cues, like macroscopic bumps or large differences in surface roughness. (4) Given that the basis for tactile percepts, like which object feels “rougher” or “smoother”, is not sufficiently established and, as we have seen, leads to confusion, it is necessary to use a 3-alternative forced choice task, which avoids asking about objects along a preset perceptual dimension, a challenge recognized by Reviewer 3. However, this would bring in issues of memory in the decision-making model. (5) The prior points are compounded by the fact that, we believe, tactile exploration must be performed in an unconstrained manner, i.e., without an apparatus generating motion onto a stationary finger. Work by Liu et al. (IEEE ToH, 2024) showed that recreating friction obtained during free exploration onto a stationary finger was uninterpretable by the participants, hinting at the importance of efference copies(1).
We believe that resolving each of the above-mentioned issues would constitute a significant advance in knowledge and would require discussion and dissemination with the community. In addition, one of our overarching goals is to create a consistent method to characterize surfaces, and given individual variability in human fingers and motion, a machine-based method that can rapidly, consistently, and sufficiently replicate tactile exploration is needed.
Finally, we also justify our use of a mock finger to provide a method to characterize surfaces in tactile studies that other researchers could reasonably recreate, without creating a standard around individual humans, considering the variability in finger shape and motion during exploration. We do not believe this is an “either-or” argument, but rather that standardized methods to characterize surfaces and devices are greatly needed in the field. The current metrics derived from standardized methods, such as surface roughness, tabulated values of friction coefficient, or surface energy, are largely incapable of capturing the dynamic changes in forces expected during human tactile exploration.
Our changes to the manuscript (Page 1 & SI Page 1, Title)
“Alternatives to Friction Coefficient: Role of Frictional Instabilities for Fine Touch Perception”
Reviewer 1 (Public review):
Summary:
In this paper, Derkaloustian et al. look at the important topic of what affects fine touch perception. The observations that there may be some level of correlation with instabilities are intriguing. They attempted to characterize different materials by counting the frequency (occurrence #, not of vibration) of instabilities at various speeds and forces of a PDMS slab pulled lengthwise over the material. They then had humans make the same vertical motion to discriminate between these samples. They correlated the % correct in discrimination with differences in frequency of steady sliding over the design space as well as other traditional parameters such as friction coefficient and roughness. The authors pose an interesting hypothesis and make an interesting observation about the occurrences of instability regimes in different materials while in contact with PDMS, which is interesting for the community to see in the publication. It should be noted that the finger is complex, however, and there are many factors that may be quite oversimplified with the use of the PDMS finger, and the consideration and discounting of other parameters are not fully discussed in the main text or SI. Most importantly, however, the conclusions as stated do not align with the primary summary of the data in Figure 2.
Strengths:
The strength of this paper is in its intriguing hypothesis and important observation that instabilities may contribute to what humans are detecting as differences in these apparently similar samples.
We thank Reviewer 1 for their time on the manuscript, recognizing the approach we took, and offering constructive feedback. We believe that our conclusions are, in fact, supported by the primary summary of the data in Figure 2, but that our use of R² could have led to misinterpretation. The trend with friction coefficient and percent correct was indeed statistically significant but was spurious because the slope was negative. In the revision, we add clarifying comments throughout, change from R² to r so as to highlight the negative trend, and adjust the figures to better focus on friction coefficient.
Finally, we added a new section to discuss the tradeoffs between using a real human finger versus a mock finger, and which situations may warrant the use of one or the other. In short, for our goal of characterizing surfaces to be used in tactile experiments, we believe a mock finger is more sustainable and practical than using real humans because human fingers are unique per participant, humans move their fingers at constantly changing pressures and velocities, and friction generated during free exploration by a human cannot be satisfactorily replicated by moving a sample onto a stationary finger. But, we do not disagree that for other types of experiments, characterizing a human participant directly may be more advantageous.
Weaknesses:
Comment 1 - The most important weakness is that the findings do not support the statements of findings made in the abstract. Of specific note in this regard is the primary correlation in Figure 2B between SS (steady sliding) and percent correct discrimination. While the statistical test shows significance (and is interesting!), the R-squared value is 0.38, while the R-squared value for the "Friction Coefficient vs. Percent Correct" plot has an R-squared of 0.6 and a p-value of < 0.01 (including Figure 2B). This suggests that the results do not support the claim in the abstract: "We found that participant accuracy in tactile discrimination was most strongly correlated with formations of steady sliding, and response times were negatively correlated with stiction spikes. Conversely, traditional metrics like surface roughness or average friction coefficient did not predict tactile discriminability."
We disagree that the trend with friction coefficient suggests the results do not support the claim because the correlation was found to be negative. However, we could have made the comparison more apparent and expanded on this point, given its novelty.
While the R² value corresponding to the “Friction Coefficient vs. Percent Correct” plot is notably higher, our results show that the slope is negative, which would be statistically spurious. This is because a negative correlation between percent correct (accuracy in discriminating surfaces) and difference in friction coefficient means that the more similar two surfaces are (by friction coefficient), the easier it would be for people to tell them apart. That is, it incorrectly concludes that two identical surfaces would be much easier to tell apart than two surfaces with greatly different friction coefficients.
This is counterintuitive to nearly all existing results, but we believe our samples were well-positioned to uncover this trend by minimizing variability, by controlling multiple physical parameters in the samples, and that the friction coefficient — typically calculated in the field as an average friction coefficient — ignores all the dynamic changes in forces present in elastic systems undergoing mesoscale friction, i.e., human touch, as seen in Fig. 1 in a mock finger and Fig. 3 in a real finger. By demonstrating this statistically spurious trend, we believe this strongly supports our premise that an alternative to friction coefficient is needed in the design of tactile psychophysics and haptic interfaces.
We believe that this could have been misinterpreted, so we took several steps to improve clarity, given the importance of this finding: we separated the panel on friction coefficient to its own panel, we changed from R² to r throughout, and we added clarifying text. We also added a small section focusing on this spurious trend.
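The reason for reporting r can be seen in a small sketch: squaring a correlation discards its sign, so a strong negative trend and a strong positive trend give the same R². The example below uses a plain Pearson correlation on hypothetical values chosen only for illustration; it is not the GLMM fit used in the manuscript, and the numbers are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pairs: difference in friction coefficient vs. percent correct
diff_friction = [0.4, 0.5, 0.6, 0.8, 0.9, 1.0]
percent_correct = [80, 74, 71, 62, 60, 55]

r = pearson_r(diff_friction, percent_correct)
r_squared = r ** 2

# r keeps the direction of the trend; r_squared is non-negative by construction
print("r =", round(r, 2), " R^2 =", round(r_squared, 2))
```

With decreasing accuracy as the friction difference grows, r is negative while R² alone would read as a strong fit with no indication of direction.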
Our changes to the manuscript (Page 10)
“To compare the value of looking at frictional instabilities, we also performed GLMM fits on common approaches in the field, like a friction coefficient or material property typically used in tactile discrimination, shown in Fig. 2D-E. Interestingly, in Fig. 2D, we observed a spurious, negative correlation between friction coefficient (typically and often problematically simplified as
across all tested conditions) and accuracy (r = -0.64, p < 0.01); that is, the more different the surfaces are by friction coefficient, the less people can tell them apart. This spurious correlation would be the opposite of intuition, and further calls into question the common practice of using friction coefficients in touch-related studies. The alternative, two-term model which includes adhesive contact area for friction coefficient(29) was even less predictive (see Fig. S6A of SI). We believe such a correlation could not have been uncovered previously as our samples are minimal in their physical variations. Yet, the dynamic changes in force even within a single sample are not considered, despite being a key feature of mesoscale friction during human touch.
We investigate different material properties in Fig. 2E. Differences in average roughness R_a (or other parameters, like root mean square roughness R_rms; Fig. S6A of SI) did not show a statistically significant correlation to accuracy. Though roughness is a popular parameter, correlating any roughness parameter to human performance here could be moot: the limit of detecting roughness differences has previously been defined as 13 nm on structured surfaces(33) and much higher for randomly rough surfaces,(46) all of which are orders of magnitude larger than the roughness differences between our surfaces. The differences in contact angle hysteresis – as an approximation of the adhesion contributions(47) – do not present any statistically significant effects on performance.”
Comment 2, Part 1
Along the same lines, other parameters that were considered such as the "Percent Correct vs. Difference in Sp" and "Percent Correct vs. Difference in SFW" were not plotted for consideration in the SI. It would be helpful to compare these results with the other three metrics in order to fully understand the relationships.
We have added these plots to the SI. We note that we had checked these relationships and discussed them briefly, but did not include the plot. The plots show that the type of instability was not as helpful as its presence or absence.
Our changes to the manuscript (Page 9)
“Furthermore, a model accounting for slow frictional waves alone specifically shows a significant, negative effect on performance (p < 0.01, Fig. S5 of SI), suggesting that in these samples and task, the type of instability was not as important.”
Added (SI Page 4)
“and no correlation between accuracy and stiction spikes (Fig. S5).”
Comment 2, Part 2
Other parameters such as stiction magnitude and differences in friction coefficient over the test space could also be important and interesting.
We agree these are interesting and have thought about them. We are aware that others, like Gueorguiev et al., have studied stiction magnitudes, and though there was a correlation, the physical differences in surface roughness (glass versus PMMA) investigated made it unclear if these could be generalized further(2). We are unsure how to proceed here with a satisfactory analysis of stiction magnitude, given that stiction spikes are not always generated. In fact, Fig. 1 shows that for many velocities and pressures, they do not form. However, we offer some speculation on why stiction spikes may be overrepresented in the literature:
(1) They are prone to being created if the finger was loaded for a long time onto a surface prior to movement, thus creating adhesion by contact aging, which is unlike active human exploration. We avoid this by discarding the first pull in our measurements, which is standard practice in mechanical characterization if contact aging needs to be avoided.
(2) The ranges of velocities and pressures explored were small.
(3) In an effort to generate strong tactile stimuli, highly adhesive or rough surfaces are used.
(4) They are visually distinctive on a plot, but we are unaware of any mechanistic reason that mechanoreceptors would be extremely sensitive to this low frequency event over other signals.
In ongoing work, however, we are always cognizant that if stiction spikes are a dominant factor, then a secondary analysis on their magnitude would be important.
We interpret “difference in friction coefficient over the test space” to be, for a single surface, like C4, to find the highest average friction for a condition of single velocity and mass and subtract that from the lowest average friction for a condition of single velocity and mass. We calculated the difference in friction coefficient in the typical manner of the field, by averaging all data collected at all velocities and masses and assigning a single value for all of a surface, like C4. We had performed this, and have the data, but we are wary of overinterpreting secondary and tertiary metrics because they do not have any fundamental basis in traditional tribology, and this value, if used by humans, would suggest that they rapidly explore a large parameter space to find a “maximum” and “minimum” friction. Furthermore, the range in friction across the test space, after averaging, may in fact, be smaller than the range of friction in a single measurement. For example, in Fig. 1B, the friction coefficient can be calculated by dividing the data by the normal force ([applied mass + 6 g finger] × gravity). The friction coefficient in a single run varies widely, as expected.
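The calculation described above (friction force divided by the normal force, [applied mass + 6 g finger] × gravity) can be sketched as follows. The 6 g finger mass comes from the text; the force trace, the 94 g applied mass, and the function names are hypothetical and chosen only to illustrate how a single run can span a wide range of friction coefficients that an average hides.

```python
G = 9.81             # gravitational acceleration in m/s^2
FINGER_MASS_G = 6.0  # mass of the mock finger in grams, as stated in the text


def normal_force_n(applied_mass_g):
    """Normal force in newtons: (applied mass + finger mass) x gravity."""
    return (applied_mass_g + FINGER_MASS_G) / 1000.0 * G


def friction_coefficient_trace(friction_force_n, applied_mass_g):
    """Instantaneous friction coefficient for each force sample in a run."""
    fn = normal_force_n(applied_mass_g)
    return [f / fn for f in friction_force_n]


# Hypothetical friction-force samples (N) from one sliding run
trace_n = [0.8, 1.3, 0.9, 1.6, 1.0, 1.4]
mu = friction_coefficient_trace(trace_n, applied_mass_g=94.0)

mean_mu = sum(mu) / len(mu)     # the single averaged value typically reported
range_mu = max(mu) - min(mu)    # spread within this one run

print("mean:", round(mean_mu, 2), " range within run:", round(range_mu, 2))
```

Comparing `range_mu` for one run against the spread of `mean_mu` across conditions is one way to see why averaging can hide more variation than it reports.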
Fig. 2D shows a GLMM fit between percent correct responses across our pairs and the differences in friction coefficient for each pair, where we see a spurious negative correlation. As we had the data of all average friction coefficients for each condition for a given material, we also looked at the difference in maximum and minimum friction coefficients. For our tested pairs, these differences also lined up on a statistically significant, negative GLMM fit (r = -0.86, p < 0.005). However, the values for a given surface can vary drastically, with an interquartile range of 1.20 to 2.09 on a single surface. We fit participant accuracy to the differences in these IQRs across pairs. This also led to a negative GLMM fit (r = -0.65, p < 0.05). However, we are hesitant to add this to the manuscript for the reasons stated previously.
Comment 3, Part 1
Beyond this fundamental concern, there is a weakness in the representativeness of the PDMS finger, the vertical motion, and the speed of sliding to real human exploration.
Overall, this is a continuous debate that we think offers two solutions. There is always a tradeoff between using a synthetic model of a finger versus a real human finger, and there is a place for both models. That is, while our mock finger will be more successful the closer it is to a human finger, it is not our goal to fully replace a human finger, rather our goal is to provide a method of characterizing surfaces that is indeed relevant on the length scale of human touch.
The usefulness of the mock finger is in isolating the features of each surface that are independent of human variability, i.e., instabilities that form without changing loading conditions between sliding motions or even within one sliding motion. Of course, with this method, we still need to confirm that these features also form during human exploration, which we show in Fig. 3.
We believe that this method of characterizing surfaces at the mesoscale will ultimately lead to more successful human studies on tactile perception. Currently, and as shown in the paper, characterizing surfaces through traditional techniques, such as a commercial tribometer (friction coefficient, using a steel or hard metal ball), roughness (via atomic force microscopy or some other metrology), and surface energy, is less predictive. Thus, we believe this mock finger is stronger than the current state of the art in characterizing surfaces (we are also aware of a commercial mock finger company, but we were unable to purchase or obtain an evaluation model).
One of the main, and severe, limitations of using a human finger is that all fingers are different, meaning any study focusing on a particular user may not apply to others or be recreated easily by other researchers. We cannot set a standard for replication around a real human finger as that participant may no longer be available, or willing to travel the world as a “standard”. Furthermore, the manner in which each person changes their pressures and velocities is different. We note that this is a challenge unique to touch perception: how an object is touched changes the friction generated, and thus the tactile stimulus generated, whereas a standardized stimulus is more straightforward for light or sound.
However, we do emphasize that we have strongly considered the balance between feasibility and ecological validity in the design of a mock finger. We have a mock finger, with the three components of stiffness of a human finger (more below). Furthermore, we have also successfully used this mock finger in correlations with human psychophysics in previous work, where findings from our mechanical experiments were predictive of human performance(3-6).
Our changes to the manuscript: Added (Pages 2-3)
“Mock finger as a characterization tool
In this work, we use a mechanical setup with a PDMS mock finger to derive tactile predictors from controlled friction traces alternative to average friction coefficients. While there is a tradeoff in selecting a synthetic finger over a more accurate, real human finger in modeling touch, our aim to design a method of mesoscale surface characterization for more successful studies on tactile perception cannot be fulfilled using one human participant as a standard. We believe that with sufficient replication of surface and bulk properties as well as contact geometry, and controlled friction measurements collected at loading conditions observed during a tactile discrimination task, we can isolate unique frictional features of a set of surfaces that do not arise from human-to-human variability.
The major component of a human finger, by volume, is soft tissue (~56%)(22), resulting in an effective modulus close to 100 kPa(23,24). In order to achieve this same softness, we crosslink PDMS in a 1×1×5 cm mold at a 30:1 elastomer:crosslinker ratio. However, two more features impart increased stiffness in a human finger. Most of this added rigidity is derived from the bone at the fingertip, the distal phalanx(23–25), which we mimic with an acrylic bone within our PDMS network. The stratum corneum, the stiffer, glassier outer layer of skin(26), is replicated with the surface of the mock finger glassified, or further crosslinked, after 8 hours of UV-Ozone treatment(27). This treatment also modifies the surface properties of the native PDMS to align with those of a human finger more closely. It minimizes the viscoelastic tack at the surface, resulting in a comparable non-sticky surface. At least one day after treatment, the finger surface returns to moderate hydrophilicity (~60º), as is typically observed for a real finger(28).
The initial contact area formed before a friction trace is collected is a rectangle of 1×1 cm. While this shape is not entirely representative of a human finger with curves and ridges, human fingers flatten out enough to reduce the effects of curvature with even very light pressures(28–30). This implies that regardless of finger pressure, the contact area is largely load-independent, which is more accurately replicated with a rectangular mock finger. It is still a challenge to control pressure distribution with this planar interface, but non-uniform pressures are also expected during human exploration.
Lastly, we consider fingerprints vs. flat fingers. A key finding of our previous work is that while fingerprints enhanced frictional dynamics at certain conditions, key features were still maintained with a flat finger(7). Furthermore, for some loading conditions, the more amplified signals could also result in more similar friction traces for different surfaces. We have continued to use flat fingers in our mechanical experiments, and have observed good agreement between these friction traces and human experiments(7,8,21,31).”
(Page 3-4, Materials and Methods)
“Mock Finger Preparation
Friction forces across all six surfaces were measured using a custom apparatus with a polydimethylsiloxane (PDMS, Dow Sylgard 184) mock finger that mimics a human finger’s mechanical properties and contact mechanics while exploring a surface relatively closely(7,8). PDMS and crosslinker were combined in a 30:1 ratio to achieve a stiffness of 100 kPa comparable to a real finger, then degassed in a vacuum desiccator for 30 minutes. We are aware that the manufacturer recommended crosslinking ratio for Sylgard 184 is 10:1 due to potential uncrosslinked liquid residues(32), but further crosslinking concentrated at the surface prevents this. The prepared PDMS was then poured into a 1×1×5 cm mold also containing an acrylic 3D-printed “bone” to attach applied masses on top of the “fingertip” area contacting a surface during friction testing. After crosslinking in the mold at 60ºC for 1 hour, the finger was treated with UV-Ozone for 8 hours out of the mold to minimize viscoelastic tack.
Mechanical Testing
A custom device using our PDMS mock finger was used to collect macroscopic friction force traces replicating human exploration(7,8). After placing a sample surface on a stage, the finger was lowered at a slight angle such that an initial 1×1 cm rectangle of “fingertip” contact area could be established. We considered a broad range of applied masses (M = 0, 25, 75, and 100 g) added onto the deadweight of the finger (6 g) observed during a tactile discrimination task. The other side of the sensor was connected to a motorized stage (V-508 PIMag Precision Linear Stage, Physikinstrumente) to control both displacement (4 mm across all conditions) and sliding velocity (v = 5, 10, 25, and 45 mm s⁻¹). Forces were measured at all 16 combinations of mass and velocity via a 250 g Futek force sensor (k = 13.9 kN m⁻¹) threaded to the bone, and recorded at an average sampling rate of 550 Hz with a Keithley 7510 DMM digitized multimeter. Force traces were collected in sets of 4 slides, discarding the first due to contact aging. Because some mass-velocity combinations were near the boundaries of instability phase transitions, not all force traces at these given conditions exhibited similar profiles. Thus, three sets were collected on fresh spots for each condition to observe enough occurrences of multiple instabilities, at a total of nine traces per combination for each surface.”
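For readers who want to check the bookkeeping in the protocol quoted above, the following minimal sketch enumerates the mass-velocity test matrix and counts the retained traces. The masses, velocities, set size, and number of sets come from the quoted Methods text; the variable names are hypothetical, and this is not the acquisition code used in the study.

```python
from itertools import product

APPLIED_MASSES_G = [0, 25, 75, 100]  # added on top of the 6 g finger deadweight
VELOCITIES_MM_S = [5, 10, 25, 45]    # sliding velocities
SETS_PER_CONDITION = 3               # each set collected on a fresh spot
SLIDES_PER_SET = 4
DISCARDED_PER_SET = 1                # first slide dropped due to contact aging

# Every (mass, velocity) pair is one loading condition.
conditions = list(product(APPLIED_MASSES_G, VELOCITIES_MM_S))

kept_per_condition = SETS_PER_CONDITION * (SLIDES_PER_SET - DISCARDED_PER_SET)
total_traces_per_surface = len(conditions) * kept_per_condition

print(len(conditions), kept_per_condition, total_traces_per_surface)
# prints: 16 9 144
```

The total of 144 retained traces per surface is consistent with the “144 pulls” used as the denominator for instability counts elsewhere in this response.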
Added References (Page 13)
M. Murai, H.-K. Lau, B. P. Pereira and R. W. H. Pho, J. Hand Surg., 1997, 22, 935–941.
A. Abdouni, M. Djaghloul, C. Thieulin, R. Vargiolu, C. Pailler-Mattei and H. Zahouani, R. Soc. Open Sci., DOI:10.1098/rsos.170321.
P.-H. Cornuault, L. Carpentier, M.-A. Bueno, J.-M. Cote and G. Monteil, J. R. Soc. Interface, DOI:10.1098/rsif.2015.0495.
K. Qian, K. Traylor, S. W. Lee, B. Ellis, J. Weiss and D. Kamper, J. Biomech., 2014, 47, 3094–3099.
Y. Yuan and R. Verma, Colloids Surf. B Biointerfaces, 2006, 48, 6–12.
Y.-J. Fu, H. Qui, K.-S. Liao, S. J. Lue, C.-C. Hu, K.-R. Lee and J.-Y. Lai, Langmuir, 2010, 26, 4392–4399.
Comment 3, Part 2
“The real finger has multiple layers with different moduli. In fact, the stratum corneum cells, which are the outer layer at the interface and determine the friction, have a much higher modulus than PDMS.”
We have approximated the softness of the finger with 100 kPa crosslinked PDMS, which is close to what has been reported for the bulk of a human fingertip(8,9). However, as mentioned in the Materials and Methods, there are two additional features of the mock finger that impart greater strength. The PDMS surrounds a rigid, acrylic bone comparable to the distal phalanx, which provides an additional layer of higher modulus(10). Additionally, the 8-hour UV-Ozone treatment decreases the viscoelastic tack of the pristine PDMS by glassifying, or further crosslinking the surface of the finger(11), therefore imparting greater stiffness at the surface similar to the contributions of the stratum corneum, along with a similar surface energy(12). This technique is widely used in wearables(13), soft robotics(14), and microfluidics(15) to induce both these material changes. Additionally, the finger is used at least a day after UV-Ozone treatment is completed in order for the surface to return to moderate hydrophilicity, similar to the outermost layer of human skin(16).
Comment 3, Part 3
In addition, the slanted position of the finger can cause non-uniform pressures across the finger. Both can contribute to making the PDMS finger have much more stick-slip than a real finger.
To ensure that there is minimal contribution from the slanted position of the finger, an initial contact area of 1×1 cm is established before sliding and recording friction measurements. As the PDMS finger is a soft object, the portion in contact with a surface flattens and the contact area remains largely unchanged during sliding. Any additional stick-slip after this alignment step is caused by contact aging at the interface, but the first trace we collect is always discarded to only consider stick-slip events caused by surface chemistry. We recognize that it is difficult to completely control the pressure distribution due to the planar interface, but this is also expected when humans freely explore a surface.
Comment 3, Part 4
In fact, if you look at the regime maps, there is very little space that has steady sliding. This does not represent well human exploration of surfaces. We do not tend to use a force and velocity that will cause extensive stick-slip (frequent regions of 100% stick-slip) and, in fact, the speeds used in the study are on the slow side, which also contributes to more stick-slip. At higher speeds and lower forces, all of the materials had steady sliding regions.
We are not aware of published studies that extensively show that humans avoid stick-slip regimes. In fact, we are familiar with literature where stiction spike formation is suppressed: a recent paper by AliAbbasi, Basdogan et al. investigates electroadhesion and friction with NaCl solution-infused interfaces, resulting in significantly steadier forces(17). We also directly showed evidence of instability formation that we observed during human exploration in Fig. 3B-C. These dynamic events are common, despite the lack of control of normal forces and sliding velocities. We also note that Reviewer 1, Comment 2, was suggesting that we further explore possible trends from parameterizing the stiction spike.
We note that many studies have not reached the velocities and masses required for stiction spikes, even though these masses and velocities would be routinely seen in free exploration; this is usually due to equipment constraints(18). Sliding events during human free exploration of surfaces can exceed 100 mm/s for rapid touches. However, for the surfaces investigated here, we observe that large regions of stick-slip can emerge at velocities as low as 5 mm/s depending on the applied load. The incidence of steady sliding appears more dependent on the applied mass, with almost no steady sliding observed at or above 75 g. Indeed, the force categorization along our transition zones is the main point of the paper.
Comment 3, Part 5
Further, on these very smooth surfaces, the friction and stiction are more complex and cannot dismiss considerations such as finger material property change with sweat pore occlusion and sweat capillary forces. Also, the vertical motion of both the PDMS finger and the instructed human subjects is not the motion that humans typically use to discriminate between surfaces.
We did not describe the task sufficiently. Humans were only given the instruction to slide their finger along a single axis from top to bottom of a sample, not vertical as in azimuthal to gravity. We have updated our wording in the manuscript to reflect this.
Our changes to the manuscript (Page 4)
“Participants could touch for as long as they wanted, but were asked to only use their dominant index fingers along a single axis to better mimic the conditions for instability formation during mechanical testing with the mock finger.”
(Page 11)
“The participant was then asked to explore each sample simultaneously, and ran over each surface in strokes along a single axis until the participant could decide which of the two had “more friction”.”
Comment 3, Part 6
Finally, fingerprints may not affect the shape and size of the contact area, but they certainly do affect the dynamic response and detection of vibrations.
We are aware of the nuance. Our previous work on the role of fingerprints on friction experienced by a PDMS mock finger showed enhanced signals with the incorporation of ridges on the finger and used a rate-and-state model of a heterogeneous, elastic body to find corresponding trends (though there is no existing model of friction that can accurately model experiments on mesoscale friction)(7). The key conclusion was that a flat finger still preserved key dynamic features, and the presence of stronger or more vibrations could result in more similar forces for different surfaces depending on the sliding conditions.
This is also in the context that we are seeking to provide a reasonable and experimentally accessible method to characterize surfaces, which will always improve as we get closer to replicating a true human finger. But our goal here was to replicate the finger sufficiently for use in human studies. We believe the more appropriate metric of success is whether the mock finger is more successful than traditional characterization experiments, like friction coefficient, roughness, surface energy, etc.
Comment 4
This all leads to the critical question, why are friction, normal force, and velocity not measured during the measured human exploration and in a systematic study using the real human finger? The authors posed an extremely interesting hypothesis that humans may alter their speed to feel the instability transition regions. This is something that could be measured with a real finger but is not likely to be correlated accurately enough to match regime boundaries with such a simplified artificial finger.
We are excited that our manuscript offers a tractable manner to test the hypothesis that tactile decision-making models use friction instabilities as evidence. However, we lay out the challenges and barriers, and how the scope of this paper will lead us in that direction. We also clarify that our goals are to provide a method to characterize samples to better design tactile interfaces in haptics or in psychophysical experiments, and to raise awareness that the common methods of sample characterization in touch, by an average friction coefficient or roughness, are fundamentally unsound.
In short, in our view, to further support our findings on instabilities would require answering:
(1) Which one, or which combination, of the multiple swipes that people make is responsible for a tactile decision? (The need for a decision-making model)
(2) Establish what is, or may be, tactile evidence.
(3) Establish whether tactile decision-making models are similar to or different from existing decision-making models.
(4) Test the hypothesis, in these models, that friction instabilities are evidence, and not some other unknown metric. This requires designing samples that vary in the amount of evidence generated, but this evidence cannot be controlled directly. Rather, the samples indirectly vary evidence by how likely it is for a human to generate different types of friction instabilities during standard exploration.
(5) Design a task that does not require the use of subjective tactile descriptors, like “which one feels rougher”, which we see cause confusion in participants, which will likely require accounting for memory effects.
We elaborate on these points below:
To successfully perform this experiment, we note that freely exploring humans make multiple strokes on a surface. Therefore, we would need to construct a decision-making model. It has not yet been demonstrated whether tactile decision making follows visual decision making, but perhaps to start, we can assume it does. Then, in the design of our decision-making paradigm, we immediately run into the problem: What is tactile evidence?
From Fig. 3C, we already can see that identifying evidence is challenging. Prior to this manuscript, people may have chosen the average force, or the highest force. Or we may choose the average friction force. Then, after deciding on the evidence, we need to find a method to manipulate the evidence, i.e., create samples or a machine that causes high friction, etc. We show that during the course of human touch, due to the dynamic nature of friction, the average can change a large amount and sample design becomes a central barrier to experiments. Others may suggest immobilizing the finger and applying a known force, but given how much friction changes with human exploration, there is no known method to make a machine recreate temporally and spatially varying friction forces during sliding onto a stationary finger. Perhaps most importantly, in addition to mechanical challenges, a study by Liu, Colgate et al. showed that even if they recorded the friction (2D) of a finger exploring a surface and then replicated the same friction forces onto a finger, the participant could not determine which surface the replayed friction force was supposed to represent(1). This supports that the efference copy is important, that is, that the forces in response to expected motion are important to determine friction. Finally, there is no known method to design instabilities a priori; they must be found through experiments, especially since introducing, say, a bump or a trough would bring in confounding variables to how participants tell surfaces apart.
Furthermore, even if we had some consistent method to create tactile “evidence”, the paradigm also deserves some consideration. In our experience, the 3-AFC task we perform is important because the vocabulary for touch has not been established. That is, in 3-AFC, by asking to determine which one sample is unlike the others, we do not have to ask the participant questions like “which one is rougher” or “which one has less friction”. In contrast, 2-AFC, which is better for decision-making models because it does not include memory, requires the asking of a perceptual question like: “which one is rougher?”. In our ongoing work, taking two silane coatings, we found that participants could easily identify which surface is unlike the others above chance in a 3-AFC, but participants, even within their own trials, could not consistently identify one silane as perceptually “rougher” by 2-AFC. To us, this calls into question the validity of tactile descriptors, but is beyond the scope of this manuscript.
This is not our only goal, but in the context of human exploration, in this manuscript, we believed it was important to identify a mechanical parameter that was consistent with how humans explore surfaces, but was also a parameter that could characterize some consistent property of a surface, irrespective of whether a human was touching it. We thought that designing human decision-making models and paradigms around the friction coefficient would not be successful.
Given the scope of these challenges, we do not think it would be possible to establish these conceptual sequences in a single manuscript.
Reviewer 2 (Public review):
Summary:
In this paper, the authors want to test the hypothesis that frictional instabilities rather than friction are the main drivers for discriminating flat surfaces of different sub-nanometric roughness profiles.
They first produced flat surfaces with 6 different coatings giving them unique and various properties in terms of roughness (picometer scale), contact angles (from hydrophilic to hydrophobic), friction coefficient (as measured against a mock finger), and Hurst exponent.
Then, they used those surfaces in two different experiments. In the first experiment, they used a mock finger (PDMS of 100kPA molded into a fingertip shape) and slid it over the surfaces at different normal forces and speeds. They categorized the sliding behavior as steady sliding, sticking spikes, and slow frictional waves by visual inspection, and show that the surfaces have different behaviors depending on normal force and speed. In a second experiment, participants (10) were asked to discriminate pairs of those surfaces. It is found that each of those pairs could be reliably discriminated by most participants.
Finally, the participant's discrimination performance is correlated with differences in the physical attributes observed against the mock finger. The authors found a positive correlation between participants' performances and differences in the count of steady sliding against the mock finger and a negative correlation between participants' reaction time and differences in the count of stiction spikes against the mock finger. They interpret those correlations as evidence that participants use those differences to discriminate the surfaces.
Strengths:
The created surfaces are very interesting as they are flat at the nanometer scale, yet have different physical attributes and can be reliably discriminated.”
We thank Reviewer 2 for their notes on our manuscript. The responses below address the reviewer’s comments and recommendations for revised work.
Weaknesses:
Comment 1
In my opinion, the data presented in the paper do not support the conclusions. The conclusions are based on a correlation between results obtained on the mock finger and results obtained with human participants but there is no evidence that the human participants' fingertips will behave similarly to the mock finger during the experiment. Figure 3 gives a hint that the 3 sliding behaviors can be observed in a real finger, but does not prove that the human finger will behave as the mock finger, i.e., there is no evidence that the phase maps in Figure 1C are similar for human fingers and across different people that can have very different stiffness and moisture levels.
The mechanical characterization conducted with the mock finger seeks to extract significant features of friction traces of a set of surfaces to use as predictors of tactile discriminability. The goal is to find a consistent method to characterize surfaces for use in tactile experiments that can be replicated by others and used prior to any human experiments. However, in the overall response and in a response to a similar comment by Reviewer 1, we also explain why we believe experiments on humans to establish this fact are not yet reasonable.
Comment 2
I believe that the authors collected the contact forces during the psychophysics experiments, so this shortcoming could be solved if the authors use the actual data, and show that the participant responses can be better predicted by the occurrence of frictional instabilities than by the usual metrics on a trial by trial basis, or at least on a subject by subject basis. I.e. Poor performers should show fewer signs of differences in the sliding behaviors than good performers.
To fully implement this, a decision-making model is necessary because, as a counter example, a participant could have generated 10 swipes of SFW and 1 swipe of a Sp, but the Sp may have been the most important event for making a tactile decision. This type of scenario is not compatible with the analysis suggested — and similar counterpoints can be made for other types of seemingly straightforward analysis.
While we are interested and actively working on this, the study here is critical to establish types of evidence for a future decision-making model. We know humans change their friction constantly during real exploration, so it is unclear which of these constantly changing values we should input into the decision making model, and the future challenges we anticipate are explained in Comment 1.
Comment 3
The sample size (10) is very small.
We recognize that, with all factors being equal, this sample size is on the smaller end. However, we emphasize the degree of control of samples is far above typical, with minimal variations in sample properties such as surface roughness, and every sample for every trial was pristine. Furthermore, the sample preparation (> 300 individual wafers were used) and cost became a factor. Although not typically appropriate, and thus not included in the manuscript, a post-hoc power analysis for our 100 trials of our pair that was closest to chance, P4, (53%, closest to chance at 33%) showed a power of 98.2%, suggesting that the study was appropriately powered.
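To make the logic of such a post-hoc power calculation concrete, the following is a minimal sketch using a one-sided exact binomial test. The choice of test and the significance level of 0.05 are assumptions made here for illustration; the response does not state which procedure produced the 98.2% figure, so this sketch is not expected to reproduce it exactly. The trial count (100), chance level (1/3), and observed accuracy (53%) are taken from the text.

```python
from math import comb


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def exact_power(n, p_null, p_alt, alpha=0.05):
    """Power of a one-sided exact binomial test of p > p_null (assumed test)."""
    # Smallest number of correct responses that rejects the null at level alpha.
    k_crit = next(k for k in range(n + 2) if binom_sf(k, n, p_null) <= alpha)
    # Probability of reaching that count if the true accuracy is p_alt.
    return k_crit, binom_sf(k_crit, n, p_alt)


k_crit, power = exact_power(n=100, p_null=1 / 3, p_alt=0.53)
print(k_crit, round(power, 3))
```

Under these assumptions, the power is the probability that 100 trials at a true accuracy of 53% would yield at least `k_crit` correct responses, the threshold needed to reject chance performance.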
Reviewer 2 (Recommendations for the authors):
Comment 1
Differences in SS and Sp (Table 2) are NOT physical or mechanical differences but are obtained by counting differences in the number of occurrences of each sliding behavior. It is rather a weird choice.
We disagree that differences in SS and Sp are not physical or mechanical, as these are well-established phenomena in the soft matter and tribology literature(19-21). These are known as “mechanical instabilities” and generated due to the effects of two physical phenomena: the elasticity of the finger (which is constant in our mechanical testing) and the friction forces present (which change per sample type). The motivation behind using these different shapes is that the instabilities, in some conditions, can be invariant to external factors like velocity. This would be quite advantageous for human exploration because, unlike friction coefficient, which changes with nearly any factor, including velocity and mass, the instabilities being invariant to velocity would mean that we are accurately characterizing a unique identifier of the surface even though velocity may be variable.
This “weird choice” is the central innovation of this paper. This choice was necessary because we demonstrated that the common usage of friction coefficient is fundamentally flawed: we see that friction coefficient suggests that surfaces that are more different would feel more similar. Indeed, the most distinctive surfaces would be two surfaces that are identical, which is clearly spurious. One potential explanation for why we were able to see this effect is that our surfaces have similar (< 0.6 nm variability) roughness, removing potential confounding factors, and this type of low roughness control has not been used in tactile studies to the best of our knowledge.
Comment 2
Figures 2B-C: why are the x-data different than Table 2?
The x-data in Fig. 2B-C are the absolute differences in the number of occurrences measured for a given instability type or material property out of 144 pulls. Modeling the human participant results in our GLMMs required the independent variables to be in this form rather than percentages. We initially chose to list percent differences in Table 2 to highlight the ranges of differences instead of an absolute value, but have added both for clarity.
Our changes to the manuscript (Page 7)
“To determine if humans can detect these three different instabilities, we selected six pairs of surfaces to create a broad range of potential instabilities present across all three types. These are summarized in Table 2, where the first column for each instability is the difference in occurrence of that instability formed between each pair, and the second is the percent difference.”
Comment 3
"We constructed a set of coated surfaces with physical differences which were imperceptible by touch but created different types of instabilities based on how quickly a finger is slid and how hard a human finger is pressed during sliding." Yet, in your experiment, participants could discriminate them, so this is incoherent.
To clarify the point, macroscopic objects can differ in physical shape and in chemical composition. What we meant was that the physical differences, i.e., roughness, were below a limit (Skedung et al.) that participants, without a coating, would not be able to tell these apart(22). Therefore, the reason people could tell our surfaces apart was due to the chemical composition of the surface, and not any differences in roughness or physical effects like film stiffness (due to the molecular-scale thinness of the surface coatings, they are mechanically negligible). However, we concede that at the molecular scale, the traditional macroscopic distinction between physical and chemical is blurred.
We have made minor revisions to the wording in the abstract. We clarify that the surface coatings had physical differences in roughness that were smaller than 0.6 nm, which based purely on roughness, would not be expected to be distinguishable to participants. Therefore, the reason participants can tell these surfaces apart is due to differences in friction generated by chemical composition, and we were able to minimize contributions from physical differences in the samples in our study.
Our changes to the manuscript (Page 1, Abstract)
“We constructed a set of coated surfaces with minimal physical differences that by themselves, are not perceptible to people, but instead, due to modification in surface chemistry, the surfaces created different types of instabilities based on how quickly a finger is slid and how hard a human finger is pressed during sliding.”
Reviewer 3 (Public review):
Strengths:
The paper describes a new perspective on friction perception, with the hypothesis that humans are sensitive to the instabilities of the surface rather than the coefficient of friction. The paper is very well written and with a comprehensive literature survey.
One of the central tools used by the author to characterize the frictional behavior is the frictional instabilities maps. With these maps, it becomes clear that two different surfaces can have both similar and different behavior depending on the normal force and the speed of exploration. It puts forward that friction is a complicated phenomenon, especially for soft materials.
The psychophysics study is centered around an odd-one-out protocol, which has the advantage of avoiding any external reference to what would mean friction or texture for example. The comparisons are made only based on the texture being similar or not.
The results show a significant relationship between the distance between frictional maps and the success rate in discriminating two kinds of surface.
We thank Reviewer 3 for their notes and interesting discussion points on our manuscript. Below, we address the reviewer’s feedback and comments on related works.
Weaknesses:
Comment 1
The main weakness of the paper comes from the fact that the frictional maps and the extensive psychophysics study are not made at the same time, nor with the same finger. The frictional maps are produced with an artificial finger made out of PDMS which is a poor substitute for the complex tribological properties of skin.
A similar comment was made by Reviewers 1 and 2 and parts are replicated below. We are not claiming that our PDMS fingers are superior to real fingers, but rather, we cannot establish standards in the field by using real human fingers that vary between subjects and researchers. We believe the mock finger we designed is a reasonable mimic of the human finger by matching surface energy, heterogeneous mechanical structure, and the ability to test multiple physiologically relevant pressures and sliding velocities.
We achieve a heterogeneous mechanical structure with the 3 primary components of stiffness of a human finger. The effective modulus of ~100 kPa, from soft tissue,8,9 is obtained with a 30:1 ratio of PDMS to crosslinker. The PDMS also surrounds a rigid, acrylic bone comparable to the distal phalanx, which provides an additional layer of higher modulus.10 Additionally, the 8-hour UV-Ozone treatment decreases the viscoelastic tack of the pristine PDMS by glassifying, or further crosslinking, the surface of the finger,11 therefore imparting greater stiffness at the surface similar to the contributions of the stratum corneum, along with a similar surface energy.12 The finger is used at least a day after UV-Ozone treatment is completed in order for the surface to return to moderate hydrophilicity, similar to the outermost layer of human skin.16 We also discuss the shape of the contact formed. To ensure that there is minimal contribution from the slanted position of the finger, an initial contact area of 1×1 cm is established before sliding and recording friction measurements. As the PDMS finger is a soft object, the portion in contact with a surface flattens and the contact area remains largely unchanged during sliding. We recognize that it is difficult to completely control the pressure distribution due to the planar interface, but this variation is also expected when humans freely explore a surface. Finally, we consider flat vs. fingerprinted fingers. Our previous work on the role of fingerprints on friction experienced by a PDMS mock finger showed enhanced signals with the incorporation of ridges on the finger and used a rate-and-state model of a heterogeneous, elastic body to find corresponding trends.7 The key conclusion was that a flat finger still preserved key dynamic features, and the presence of stronger or more vibrations could result in more similar forces for different surfaces depending on the sliding conditions.
We note that we have used the controlled mechanical data collected with this flat mock finger in correlations with human psychophysics in previous work, where findings from our mechanical experiments were predictive of human performance.3–6 Ultimately, we see from our prior work and here that, despite the drawbacks of our mock finger, it outperforms other standard characterization techniques in providing information about the mesoscale that correlates to tactile perception. We have added these details to the manuscript.
We also note that an intermediate option, replicating real fingers, even in a mold, may also inadvertently limit trends from characterization to a specific finger. One of the main – and severe – limitations of using a human finger is that all fingers are different, meaning any study focusing on a particular user may not apply to others or be recreated easily by other researchers. We cannot set a standard for replication around a real human finger as that participant may no longer be available, or willing to travel the world as a “standard”. Furthermore, the way in which a single person changes their pressures and velocities as they touch a surface is highly variable. As noted in the Summary Response, a study by Colgate et al. (IEEE ToH 2024) demonstrated that efference copies may be important, and thus constraining a human finger and replaying the forces recorded during free exploration will not lead to the participant identifying a surface with any consistency. Thus, it is important to allow humans to freely explore surfaces, but doing so creates nearly limitless variability in friction forces.
This is also against the backdrop that we are seeking to provide a method to characterize surfaces, which will be aided as we get closer to replicating a true human finger. Indeed, the more features we replicate, the more successful the mechanical data will be in correlating to tactile distinguishability. But reasonably, our success would be in replacing traditional characterization experiments, not in recreating the forces of an arbitrary human finger.
Our changes to the manuscript Added (Page 2-3)
“Mock finger as a characterization tool
In this work, we use a mechanical setup with a PDMS mock finger to derive tactile predictors from controlled friction traces alternative to average friction coefficients. While there is a tradeoff in selecting a synthetic finger over a more accurate, real human finger in modeling touch, our aim to design a method of mesoscale surface characterization for more successful studies on tactile perception cannot be fulfilled using one human participant as a standard. We believe that with sufficient replication of surface and bulk properties as well as contact geometry, and controlled friction measurements collected at loading conditions observed during a tactile discrimination task, we can isolate unique frictional features of a set of surfaces that do not arise from human-to-human variability.
The major component of a human finger, by volume, is soft tissue (~56%)(22), resulting in an effective modulus close to 100 kPa(23,24). In order to achieve this same softness, we crosslink PDMS in a 1×1×5 cm mold at a 30:1 elastomer:crosslinker ratio. However, two more features impart increased stiffness in a human finger. Most of this added rigidity is derived from the bone at the fingertip, the distal phalanx(23-25), which we mimic with an acrylic bone within our PDMS network. The stratum corneum, the stiffer, glassier outer layer of skin(26), is replicated with the surface of the mock finger glassified, or further crosslinked, after 8 hours of UV-Ozone treatment(27). This treatment also modifies the surface properties of the native PDMS to align with those of a human finger more closely. It minimizes the viscoelastic tack at the surface, resulting in a comparable non-sticky surface. At least one day after treatment, the finger surface returns to moderate hydrophilicity (~60º), as is typically observed for a real finger(28).
The initial contact area formed before a friction trace is collected is a rectangle of 1×1 cm. While this shape is not entirely representative of a human finger with curves and ridges, human fingers flatten out enough to reduce the effects of curvature with even very light pressures(28-30). This implies that regardless of finger pressure, the contact area is largely load-independent, which is more accurately replicated with a rectangular mock finger. It is still a challenge to control pressure distribution with this planar interface, but non-uniform pressures are also expected during human exploration.
Lastly, we consider fingerprints vs. flat fingers. A key finding of our previous work is that while fingerprints enhanced frictional dynamics at certain conditions, key features were still maintained with a flat finger(7). Furthermore, for some loading conditions, the more amplified signals could also result in more similar friction traces for different surfaces. We have continued to use flat fingers in our mechanical experiments, and have observed good agreement between these friction traces and human experiments(7,8,21,31).”
(Page 3-4, Materials and Methods)
“Mock Finger Preparation
Friction forces across all six surfaces were measured using a custom apparatus with a polydimethylsiloxane (PDMS, Dow Sylgard 184) mock finger that mimics a human finger’s mechanical properties and contact mechanics while exploring a surface relatively closely(7,8). PDMS and crosslinker were combined in a 30:1 ratio to achieve a stiffness of 100 kPa comparable to a real finger, then degassed in a vacuum desiccator for 30 minutes. We are aware that the manufacturer recommended crosslinking ratio for Sylgard 184 is 10:1 due to potential uncrosslinked liquid residues(32), but further crosslinking concentrated at the surface prevents this. The prepared PDMS was then poured into a 1×1×5 cm mold also containing an acrylic 3D-printed “bone” to attach applied masses on top of the “fingertip” area contacting a surface during friction testing. After crosslinking in the mold at 60ºC for 1 hour, the finger was treated with UV-Ozone for 8 hours out of the mold to minimize viscoelastic tack.
Mechanical Testing
A custom device using our PDMS mock finger was used to collect macroscopic friction force traces replicating human exploration(7,8). After placing a sample surface on a stage, the finger was lowered at a slight angle such that an initial 1×1 cm rectangle of “fingertip” contact area could be established. We considered a broad range of applied masses (M = 0, 25, 75, and 100 g) added onto the deadweight of the finger (6 g) observed during a tactile discrimination task. The other side of the sensor was connected to a motorized stage (V-508 PIMag Precision Linear Stage, Physikinstrumente) to control both displacement (4 mm across all conditions) and sliding velocity (v = 5, 10, 25, and 45 mm s<sup>-1</sup>). Forces were measured at all 16 combinations of mass and velocity via a 250 g Futek force sensor (k = 13.9 kN m<sup>-1</sup>) threaded to the bone, and recorded at an average sampling rate of 550 Hz with a Keithley 7510 DMM digitized multimeter. Force traces were collected in sets of 4 slides, discarding the first due to contact aging. Because some mass-velocity combinations were near the boundaries of instability phase transitions, not all force traces at these given conditions exhibited similar profiles. Thus, three sets were collected on fresh spots for each condition to observe enough occurrences of multiple instabilities, at a total of nine traces per combination for each surface.”
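As a quick check on the counts quoted in the Mechanical Testing passage, the test matrix can be enumerated in a few lines. This is only an illustrative sketch: the variable names are ours, and the numbers are taken directly from the passage (four applied masses, four velocities, sets of four slides with the first discarded, three sets per condition).

```python
from itertools import product

# Values quoted in the Mechanical Testing passage.
applied_masses_g = [0, 25, 75, 100]   # masses added on top of the finger
finger_deadweight_g = 6               # deadweight of the mock finger
velocities_mm_s = [5, 10, 25, 45]     # sliding velocities

slides_per_set = 4                    # slides collected in each set
discarded_per_set = 1                 # first slide dropped (contact aging)
sets_per_condition = 3                # each set taken on a fresh spot

# Every mass-velocity pairing tested on one surface.
conditions = list(product(applied_masses_g, velocities_mm_s))

# Traces retained for analysis at each mass-velocity condition.
traces_per_condition = sets_per_condition * (slides_per_set - discarded_per_set)

# Total mass resting on the surface for each applied mass.
total_masses_g = [m + finger_deadweight_g for m in applied_masses_g]

if __name__ == "__main__":
    print(len(conditions))        # 16
    print(traces_per_condition)   # 9
    print(total_masses_g)         # [6, 31, 81, 106]
```

The result agrees with the 16 combinations and nine traces per combination stated in the text.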
Added References (Page 13)
M. Murai, H.-K. Lau, B. P. Pereira and R. W. H. Pho, J. Hand Surg., 1997, 22, 935–941.
A. Abdouni, M. Djaghloul, C. Thieulin, R. Vargiolu, C. Pailler-Mattei and H. Zahouani, R. Soc. Open Sci., DOI:10.1098/rsos.170321.
P.-H. Cornuault, L. Carpentier, M.-A. Bueno, J.-M. Cote and G. Monteil, J. R. Soc. Interface, DOI:10.1098/rsif.2015.0495.
K. Qian, K. Traylor, S. W. Lee, B. Ellis, J. Weiss and D. Kamper, J. Biomech., 2014, 47, 3094– 3099.
Y. Yuan and R. Verma, Colloids Surf. B Biointerfaces, 2006, 48, 6–12.
Y.-J. Fu, H. Qui, K.-S. Liao, S. J. Lue, C.-C. Hu, K.-R. Lee and J.-Y. Lai, Langmuir, 2010, 26, 4392–4399.
Comment 2
The evidence would have been much stronger if the measurement of the interaction was done during the psychophysical experiment. In addition, because of the protocol, the correlation is based on aggregates rather than on individual interactions.
Our Response: We agree that this would have helped further establish our argument, but in the overall statement and in other reviewer responses, we describe the significant challenges to establishing this.
To fully implement this, a decision-making model is necessary because, as a counter example, a participant could have generated 10 swipes of SFW and 1 swipe of a Sp, but the Sp may have been the most important event for making a tactile decision. We also clarify that our goals are to provide a method to characterize samples to better design tactile interfaces in haptics or in psychophysical experiments.
In short, in our view, to develop a decision-making model, the challenges are as follows:
(1) Which one, or which combination, of the multiple swipes that people make is responsible for a tactile decision?
(2) Establish what is, or may be, tactile evidence.
(3) Establish whether tactile decision-making models are similar to or different from existing decision-making models.
(4) Test the hypothesis, in these models, that friction instabilities are evidence, and not some other unknown metric.
(5) Design a task that does not require the use of subjective tactile descriptors, like “which one feels rougher”, which we see cause confusion in participants, which will likely require accounting for memory effects.
(6) Design samples that vary in the amount of evidence generated, but this evidence cannot be controlled directly. Rather, the samples indirectly vary evidence by how likely it is for a human to generate different types of friction instabilities during standard exploration.
We elaborate these points below:
To successfully perform this experiment, we note that freely exploring humans make multiple strokes on a surface. Therefore, we would need to construct a decision-making model. It has not yet been demonstrated whether tactile decision making follows visual decision making, but perhaps to start, we can assume it does. Then, in the design of our decision-making paradigm, we immediately run into the problem: What is tactile evidence?
From Fig. 3C, we already can see that identifying evidence is challenging. Prior to this manuscript, people may have chosen the average force, or the highest force. Or we may choose the average friction force. Then, after deciding on the evidence, we need to find a method to manipulate the evidence, i.e., create samples or a machine that causes high friction, etc. We show that during the course of human touch, due to the dynamic nature of friction, the average can change a large amount and sample design becomes a central barrier to experiments. Others may suggest immobilizing the finger and applying a known force, but given how much friction changes with human exploration, there is no known method to make a machine recreate temporally and spatially varying friction forces during sliding onto a stationary finger. Perhaps most importantly, in addition to mechanical challenges, a study by Liu, Colgate et al. showed that even if they recorded the friction (2D) of a finger exploring a surface and then replicated the same friction forces onto a finger, the participant could not determine which surface the replayed friction force was supposed to represent.1 This supports that the efference copy is important, that the forces in response to expected motion are important to determine friction. Finally, there is no known method to design instabilities a priori. They must be found through experiments, especially since if we were to introduce, say, a bump or a trough, then we bring in confounding variables to how participants tell surfaces apart.
Furthermore, even if we had some consistent method to create tactile “evidence”, the paradigm also deserves some consideration. In our experience, the 3-AFC task we perform is important because the vocabulary for touch has not been established. That is, in 3-AFC, by asking to determine which one sample is unlike the others, we do not have to ask the participant questions like “which one is rougher” or “which one has less friction”. In contrast, 2-AFC, which is better for decision-making models because it does not include memory, requires the asking of a perceptual question like: “which one is rougher?”. In our ongoing work, taking two silane coatings, we found that participants could easily identify which surface is unlike the others above chance in a 3-AFC, but participants, even within their own trials, could not consistently identify one silane as perceptually “rougher” by 2-AFC. To us, this calls into question the validity of tactile descriptors, but is beyond the scope of the current manuscript.
This is not our only goal, but in the context of human exploration, in this manuscript, we believed it was important to identify a mechanical parameter that was consistent with how humans explore surfaces, but was also a parameter that could characterize some consistent property of a surface – irrespective of whether a human was touching it. We thought that designing human decision-making models and paradigms around the friction coefficient would not be successful.
Given the scope of these challenges, we do not think it would be possible to establish this conceptual sequence in a single manuscript.
Comment 3
The authors compensate with a third experiment where they used a 2AFC protocol and an online force measurement. But the results of this third study, fail to convince the relation.
With this experiment, our central goal was to demonstrate that the instabilities we have identified with the PDMS finger also occur with a human finger. Several instances of SS, Sp, and SFW were recorded with this setup as a participant touched surfaces in real time.
Comment 4
No map of the real finger interaction is shown, bringing doubt to the validity of the frictional map for something as variable as human fingers.
Real fingers change constantly during exploration, and friction is state-dependent, meaning that the friction will depend on how the person was moving the moment prior. Therefore, a map is only valid for a single human movement – even if participants all were instructed to take a single swipe and start from zero motion, humans are unable to maintain constant velocities and pressures. Clearly, this is not sustainable for any analysis, and these drawbacks apply to any measured parameter, whether instabilities suggested here, or friction coefficients used throughout. We believe the difficulty of this approach emphasizes why a standard map of characterization of a surface by a mock finger, even with its drawbacks, is a viable path forward.
Reviewer 3 (Recommendations for the authors):
Comment 1
It would be interesting to comment on a potential connection between the frictional instability maps and Schalamack waves
Schallamach waves are a subset of slow frictional waves (SFW). Schallamach waves are very specifically defined. They are pockets of air that form between a soft sliding object and a rigid surface, and propagate rear-to-front (retrograde waves) as a soft object is slid and buckles due to adhesive pinning. Wrinkles form at the detached portion of the soft material, until the interface reattaches and the process repeats.23 There is typically a high burden of proof to establish a Schallamach wave over a more general slow frictional wave. We note that it would be exceedingly difficult to design samples that can reliably create subsets of SFW, but we are aware that this may be an interesting question at a future point in our work.
Comment 2
The force sensors look very compliant, and given the dynamic nature of the signal, it is important to characterize the frequency response of the system to make sure that the fluctuations are not amplified.
Our Response: Thank you for noticing. We mistyped the sensor spring constant as 13.9 N m<sup>-1</sup> instead of kN m<sup>-1</sup>. However, below we show how the instabilities are derived from the mechanics at the interface due to the compliance of the finger. The “springs” of the force sensor and PDMS finger are connected in parallel. Since k<sub>sensor</sub> = 13.9 kN m<sup>-1</sup>, the spring constant of the system overall reflects the compliance of the finger, and highlights the oscillations arising solely from stick-slip. A sample calculation is shown below.
Author response image 1.
Fitting a line to the initial slope of the force trace for C6 gives the equation y = 25.679x – 0.2149. The slope here represents force data over time data, and is divided by the velocity (25 mm/s) to determine the spring constant of the system. This value is lower than k<sub>sensor</sub> = 13.9 kN/m, indicating that the “springs” representing the force sensor and PDMS finger are connected in parallel. The finger is the compliant component of the system, with k<sub>finger</sub> = 0.902 N/m, and of course, real human fingers are also compliant so this matches our goals with the design of the mock finger.
Our changes to the manuscript (Page 4)
(k = 13.9 kN m<sup>-1</sup>)
Comment 3
The authors should discuss about the stochastic nature of friction:
Wiertlewski, Hudin, Hayward, IEEE WHC 2011
Greenspon, McLellan, Lieber, Bensmaia, JRSI 2020
We believe that, given the references, this comment on “stochastic” refers to the macroscopically-observable fluctuations (i.e., the mechanical “noise” which is not due to instrument noise) in friction arising from the discordant network of stick-slip phenomena occurring throughout the contact zone, and not the stochastic nature of nanoscale friction that arises from thermal fluctuations or from statistical distributions in bond breaking associated with soft contact.
We first note that our small-scale fluctuations do not arise from a periodic surface texture that dominates in the frequency regime. However, even on our comparatively smooth surfaces, we do expect fluctuations due to nanoscale variation in contact, generation of stick-slip at microscale length scales that occurs either concurrently or discordantly across the contact zone, and the nonlinear dependence of friction on nearly any variation in state and composition(7).
Perhaps the most relevant to the manuscript is that a major advantage of analysis by friction is that it sidesteps these ever-present microscale fluctuations, leading to more clearly defined classifiers or categories during analysis. Wiertlewski et al. showed that repeated measurements in their systems ultimately gave rise to consistent frequencies(24) (we think their system was in a steady sliding regime and the patterning gave rise to underlying macroscopic waves). These consistent frequencies, at least in soft systems and absent obvious macroscopic patterned features, would be expected to arise from the instability categories and we see them throughout.
Comment 4
It is stated that "we observed a spurious, negative correlation between friction coefficient and accuracy”.
What makes you qualify that correlation as spurious?
We mean this as in the statistical definition of “spurious”.
This correlation would indicate that by the metric of friction coefficient, more different surfaces are perceived more similarly. Thus, two very different surfaces, like Teflon and sandpaper, by friction coefficient would be expected to feel very similar. Two nearly identical surfaces would be expected to feel very different – but of course, humans cannot consistently distinguish two identical surfaces. This finding is counterintuitive and refutes that friction coefficient is a reliable classifier of surfaces by touch. We do not think it is productive to determine a mechanism for a spurious correlation, but perhaps one reason we were able to observe this is because our study, to the best of our knowledge, is unique for having samples that are controlled in their physical differences in roughness and surface features.
Our changes to the manuscript (Page 10)
“To compare the value of looking at frictional instabilities, we also performed GLMM fits on common approaches in the field, like a friction coefficient or material property typically used in tactile discrimination, shown in Fig. 2D-E. Interestingly, in Fig. 2D, we observed a spurious, negative correlation between friction coefficient (typically and often problematically simplified as
across all tested conditions) and accuracy (r = -0.64, p < 0.01); that is, the more different the surfaces are by friction coefficient, the less people can tell them apart. This spurious correlation would be the opposite of intuition, and further calls into question the common practice of using friction coefficients in touch-related studies. The alternative, two-term model which includes adhesive contact area for friction coefficient(29) was even less predictive (see Fig. S6A of SI). We believe such a correlation could not have been uncovered previously as our samples are minimal in their physical variations. Yet, the dynamic changes in force even within a single sample are not considered, despite being a key feature of mesoscale friction during human touch.
We investigate different material properties in Fig. 2E. Differences in average roughness R<sub>a</sub> (or other parameters, like root mean square roughness R<sub>rms</sub> (Fig. S6A of SI)) did not show a statistically significant correlation to accuracy. Though roughness is a popular parameter, correlating any roughness parameter to human performance here could be moot: the limit of detecting roughness differences has previously been defined as 13 nm on structured surfaces(33) and much higher for randomly rough surfaces(46), all of which are magnitudes larger than the roughness differences between our surfaces. The differences in contact angle hysteresis – as an approximation of the adhesion contributions(47) – do not present any statistically significant effects on performance.”
Comment 5
The authors should comment on the influence of friction on perceptual invariance. Despite inducing radially different frictional behavior for various conditions, these surfaces are stably perceived. Maybe this is a sign that humans extract a different metric?
We agree – we are excited that frictional instabilities may offer a more stable perceptual cue because they are not prone to fluctuations (Recommendations for the authors, Comment 3) and instability formation, in many conditions, is invariant to applied pressures and velocities – thus forming large zones where a human may reasonably encounter a given instability.
Raw friction is highly prone to variation during human exploration (in alignment with Recommendations for the authors, Comment 3), but ongoing work seeks to explain tactile constancy, or the ability to identify objects despite these large changes in force. Very recently published work by Fehlberg et al. identified the role of modulating finger speed and normal force in amplifying the differences in friction coefficient between materials in order to identify them(25), and we postulate that their work may be streamlined and consistent with the idea of friction instabilities, though we have not had a chance to discuss this in-depth with the authors yet.
We think that the instability maps show a viable path forward to how surfaces are stably perceived, and instabilities themselves show a potential mechanism: mathematically, instabilities for given conditions can be invariant to velocity or mass, creating zones where a certain instability is encountered. This reduces the immense variability of friction to a smaller, more stable classification of surfaces (e.g., a 30% SS surface or a 60% SS surface). A given surface will typically produce the same instability at a specific condition (we found some boundaries are extremely condition sensitive, but many conditions are not), whereas a single friction trace which is highly prone to variation is not a stable metric.
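To make the idea of reducing a surface to a label such as "a 30% SS surface" concrete, the sketch below summarizes an instability map as the fraction of tested conditions that produce each instability type. The map entries here are invented purely for illustration and are not measured data; the function name and the grid layout are our own assumptions, and only the labels SS, Sp and SFW and the mass and velocity values come from the text.

```python
from collections import Counter

velocities_mm_s = [5, 10, 25, 45]

# HYPOTHETICAL instability map for one surface (illustration only).
# Keys are applied masses in g; each row lists the instability label
# observed at the four velocities above, in order.
hypothetical_grid = {
    0:   ["SFW", "SFW", "SS", "SS"],
    25:  ["SFW", "Sp",  "SS", "SS"],
    75:  ["Sp",  "Sp",  "Sp", "SS"],
    100: ["Sp",  "Sp",  "Sp", "SFW"],
}

# Flatten the grid into {(mass, velocity): label}.
instability_map = {
    (mass, velocity): label
    for mass, row in hypothetical_grid.items()
    for velocity, label in zip(velocities_mm_s, row)
}


def instability_fractions(condition_labels):
    """Return the fraction of conditions assigned to each instability label."""
    counts = Counter(condition_labels.values())
    total = len(condition_labels)
    return {label: n / total for label, n in counts.items()}


fractions = instability_fractions(instability_map)

if __name__ == "__main__":
    for label, frac in sorted(fractions.items()):
        print(f"{label}: {frac:.1%}")
```

Because each condition contributes one categorical label rather than a raw force value, the summary stays the same whenever the same instability recurs at a given condition, even if individual traces fluctuate.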
Added References (Page 14)
53 M. Fehlberg, E. Monfort, S. Saikumar, K. Drewing and R. Bennewitz, IEEE Trans. Haptics, 2024, 17, 957–963.
References
Z. Liu, J.-T. Kim, J. A. Rogers, R. L. Klatzky and J. E. Colgate, IEEE Trans. Haptics, 2024, 17, 441– 450.
D. Gueorguiev, S. Bochereau, A. Mouraux, V. Hayward and J.-L. Thonnard, Sci Rep, 2016, 6, 25553.
C. W. Carpenter, C. Dhong, N. B. Root, D. Rodriquez, E. E. Abdo, K. Skelil, M. A. Alkhadra, J. Ramírez, V. S. Ramachandran and D. J. Lipomi, Mater. Horiz., 2018, 5, 70–77.
A. Nolin, A. Licht, K. Pierson, C.-Y. Lo, L. V. Kayser and C. Dhong, Soft Matter, 2021, 17, 5050– 5060.
A. Nolin, K. Pierson, R. Hlibok, C.-Y. Lo, L. V. Kayser and C. Dhong, Soft Matter, 2022, 18, 3928– 3940.
Z. Swain, M. Derkaloustian, K. A. Hepler, A. Nolin, V. S. Damani, P. Bhattacharyya, T. Shrestha, J. Medina, L. Kayser and C. Dhong, J. Mater. Chem. B, DOI:10.1039/D4TB01646G.
C. Dhong, L. V. Kayser, R. Arroyo, A. Shin, M. Finn, A. T. Kleinschmidt and D. J. Lipomi, Soft Matter, 2018, 14, 7483–7491.
A. Abdouni, M. Djaghloul, C. Thieulin, R. Vargiolu, C. Pailler-Mattei and H. Zahouani, Royal Society Open Science, DOI:10.1098/rsos.170321.
P.-H. Cornuault, L. Carpentier, M.-A. Bueno, J.-M. Cote and G. Monteil, Journal of The Royal Society Interface, DOI:10.1098/rsif.2015.0495.
K. Qian, K. Traylor, S. W. Lee, B. Ellis, J. Weiss and D. Kamper, J Biomech, 2014, 47, 3094–3099.
Y.-J. Fu, H. Qui, K.-S. Liao, S. J. Lue, C.-C. Hu, K.-R. Lee and J.-Y. Lai, Langmuir, 2010, 26, 4392– 4399.
Y. Yuan and R. Verma, Colloids Surf B Biointerfaces, 2006, 48, 6–12.
G. Yu, J. Hu, J. Tan, Y. Gao, Y. Lu and F. Xuan, Nanotechnology, 2018, 29, 115502.
L. Zheng, S. Dong, J. Nie, S. Li, Z. Ren, X. Ma, X. Chen, H. Li and Z. L. Wang, ACS Appl. Mater. Interfaces, 2019, 11, 42504–42511.
K. Ma, J. Rivera, G. J. Hirasaki and S. L. Biswal, Journal of Colloid and Interface Science, 2011, 363, 371–378.
A. Mavon, H. Zahouani, D. Redoules, P. Agache, Y. Gall and Ph. Humbert, Colloids and Surfaces B: Biointerfaces, 1997, 8, 147–155.
E. AliAbbasi, M. Muzammil, O. Sirin, P. Lefèvre, Ø. G. Martinsen and C. Basdogan, IEEE Trans. Haptics, 2024, 17, 841–849.
G. Corniani, Z. S. Lee, M. J. Carré, R. Lewis, B. P. Delhaye and H. P. Saal, eLife, DOI:10.7554/eLife.93554.1.
J. N. Israelachvili, Intermolecular and Surface Forces, Academic Press, 2011.
S. Das, N. Cadirov, S. Chary, Y. Kaufman, J. Hogan, K. L. Turner and J. N. Israelachvili, J R Soc Interface, 2015, 12, 20141346.
B. N. J. Persson, O. Albohr, C. Creton and V. Peveri, The Journal of Chemical Physics, 2004, 120, 8779–8793.
L. Skedung, M. Arvidsson, J. Y. Chung, C. M. Stafford, B. Berglund and M. W. Rutland, Sci Rep, 2013, 3, 2617.
K. Viswanathan, N. K. Sundaram and S. Chandrasekar, Soft Matter, 2016, 12, 5265–5275.
M. Wiertlewski, C. Hudin and V. Hayward, in 2011 IEEE World Haptics Conference, 2011, pp. 25– 30.
M. Fehlberg, E. Monfort, S. Saikumar, K. Drewing and R. Bennewitz, IEEE Transactions on Haptics, 2024, 17, 957–963.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Response to Reviewer 1
Thank you for your recognition of our revised work.
Response to Reviewer 2
It would be useful to have a demonstration of where this model outperforms SaProt systematically, and a discussion about what the success of this model teaches us given there is a similar, previously successful model, SaProt.
As two concurrent works, ProtSSN and SaProt employ different methods to incorporate the structure information of proteins. Generally speaking, for two deep learning models that are developed during a close period, it is challenging to conclude that one model is systematically superior to another. Nonetheless, on DTm and DDG (the two low-throughput datasets that we constructed), ProtSSN demonstrates better empirical performance than SaProt.
Moreover, ProtSSN is more efficient in both training and inference compared to SaProt. In terms of training cost, SaProt uses 40 million protein structures for pretraining (requiring 64 A100 GPUs for three months), whereas ProtSSN requires only about 30,000 crystal structures from the CATH database (trained on a single 3090 GPU for two days). Despite SaProt’s significantly higher training cost, its pretrained version does not exhibit superior performance on low-throughput datasets such as DTm, DDG, and Clinvar. Furthermore, the high training cost limits many users from retraining or fine-tuning the model for specific needs or datasets.
Regarding the inference cost, ProtSSN requires only one embedding computation for a wild-type protein, regardless of the number of mutants (n). In contrast, SaProt computes a separate embedding and score for each mutant. For instance, when evaluating the scoring performance on ProteinGym, ProtSSN only needs 217 inferences, while SaProt needs more than 2M inferences. This difference in inference cost is important in practical settings such as high-throughput design and screening.
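The inference-cost argument above can be illustrated with a minimal counting sketch. The function names and the assay sizes below are hypothetical and chosen only for illustration; they are not ProteinGym's actual mutant counts, and the sketch counts embedding computations rather than measuring wall-clock time.

```python
from typing import Sequence


def passes_wildtype_only(num_assays: int) -> int:
    """One embedding computation per wild-type protein, reused for all of its mutants."""
    return num_assays


def passes_per_mutant(mutants_per_assay: Sequence[int]) -> int:
    """One embedding computation for every mutant sequence in every assay."""
    return sum(mutants_per_assay)


# Hypothetical assay sizes (illustrative only, not real benchmark counts).
mutants_per_assay = [1_000, 5_000, 20_000]

wt_only = passes_wildtype_only(len(mutants_per_assay))
per_mutant = passes_per_mutant(mutants_per_assay)
print(f"wild-type-only scoring: {wt_only} passes; per-mutant scoring: {per_mutant} passes")
```

Under this counting, the wild-type-only cost grows with the number of assays, while the per-mutant cost grows with the total number of variants scored.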
Please remove the reference to previous methods as "few shot". This typically refers to their being trained on experimental data, not their using MSAs. A "few shot" model would be ProteinNPT.
The definition of "few-shot" used here follows ESM-1v [1]. This concept originates from providing a certain number of examples as input to GPT-3 [2]. In the context of protein deep learning models, the MSA serves as the set of examples for the wild-type protein.
Also, Reviewer 1 uses the concept in the same way.
“Readers should note that methods labelled as "few-shot" in comparisons do not make use of experimental labels, but rather use sequences inferred as homologous; these sequences are also often available even if the protein has never been experimentally tested.”
In the main text, we also include this definition together with a reference to ESM-1v (lines 457-458).
“We extend the evaluation on ProteinGym v0 to include a comparison of our zero-shot ProtSSN with few-shot learning methods that leverage MSA information of proteins (Meier et al., 2021).”
(1) Meier J, Rao R, Verkuil R, et al. Language models enable zero-shot prediction of the effects of mutations on protein function. Advances in Neural Information Processing Systems, 2021.
(2) Brown T, Mann B, Ryder N, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 2020.
Furthermore, I don't think it is fair to state that your method is not comparable to these models -- one can run an MSA just as one can predict a structure. A fairer comparison would be to highlight particular assays for which getting an MSA could be challenging -- Tranception did this by showing that they outperform EVE when MSAs are shallow.
We recognize that there are often differences in the definitions and classifications of various methodologies. Here, we follow the definitions provided by ProteinGym. Because ProteinGym is the most comprehensive large-scale open benchmark in the community, we believe its classification scheme should be widely accepted. All classifications are available on the official website of ProteinGym (https://proteingym.org/benchmarks), which categorizes methods into PLMs, Structure-based models, and Alignment-based models. For example, GEMME is classified as an alignment-based model, and MSA Transformer is considered a hybrid model combining alignment and PLM features.
We believe that directly comparing methodologies with different inputs and architectures can be inherently unfair. Also, it is generally believed that models that include evolutionary relationships tend to outperform end-to-end models because of the extra information and effort involved during the training phase. Some empirical evidence and discussion can be found in the ablation studies of retrieval factors in Tranception [3]. Moreover, the choice of MSA search parameters can introduce uncertainty, which could have positive or negative impacts.
We showcase the impact of MSA depth on model performance with an additional analysis below. Author response image 1 plots the Spearman’s correlation score of each model against the number of MSA sequences for the 217 ProteinGym assays, with each point representing one assay. The summary correlation of each model across all assays is reported in Author response table 1. These results demonstrate no clear correlation between MSA depth and model performance, even for MSA-based models.
Author response image 1.
Scatter plots of the number of MSA sequences and Spearman’s correlation.
Author response table 1.
Spearman’s correlation between the number of MSA sequences and the model’s performance.
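A summary statistic of this kind, the rank correlation between MSA depth and per-assay performance, could be computed along the following lines. This is a minimal sketch under stated assumptions: the helper names and the five data points are hypothetical and do not come from the benchmark, and the rank correlation is implemented directly (Pearson correlation on average ranks) rather than with any particular library.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-assay values (illustrative only).
msa_depth = [120, 4500, 800, 30000, 65]          # number of MSA sequences
performance = [0.41, 0.38, 0.52, 0.45, 0.40]     # per-assay model score

rho = spearman(msa_depth, performance)
print(f"rank correlation between MSA depth and performance: {rho:.2f}")
```

A value near zero would be consistent with the statement that performance does not track MSA depth; the function is undefined when either input is constant.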
(3) Notin P, Dias M, Frazer J, et al. Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval. International Conference on Machine Learning, 2022.
The authors state that DTm and DDG are conceptually appealing because they come from low-throughput assays with lower experimental noise and are also mutations that are particularly chosen to represent the most interesting regions of the protein. I agree with the conceptual appeal but I don't think these claims have been demonstrated in practice. The cited comparison with Frazer as a particularly noisy source of data I think is particularly unconvincing: ClinVar labels are not only rigorously determined from multiple sources of evidence, Frazer et al demonstrates that these labels are actually more reliable than experiment in some cases. They also state that ProteinGym data doesn't come with environmental conditions, but these can be retrieved from the papers the assays came from. The paper would be strengthened by a demonstration of the conceptual benefit of these new datasets, say a comparison of mutations and signal for a protein that may be in one of these datasets vs ProteinGym.
In the work by Frazer et al. [4], they mentioned that
"However, these technologies do not easily scale to thousands of proteins, especially not to combinations of variants, and depend critically on the availability of assays that are relevant to or at least associated with human disease phenotypes."
This passage points out that the results of high-throughput experiments are usually based on the design of specific genes (such as BRCA1 and TP53) and cannot be easily extended to thousands of other genes. At the same time, due to the complexity of the experiments, there may be problems with reproducibility or deviations from clinical relevance.
This statement aligns with our perspective that high-throughput experiments inherently involve a significant amount of noise and error. It is important to clarify that the noise we discuss here arises from the limitations of high-throughput experiments themselves, instead of from the reliability of the data sources, such as systematic errors in experimental measurements. This latter issue is a complex problem common to all wetlab experiments and falls outside the scope of our study.
Under this premise, low-throughput datasets like DTm and DDG can be considered to have less noise than high-throughput datasets, as they have undergone manual curation. As for your suggestion, while valuable, unfortunately, we were unable to identify datasets in DTm and DDG that align with those in ProteinGym after a careful search. Thus, we are unable to conduct this comparative experiment at this stage.
(4) Frazer J, Notin P, Dias M, et al. Disease variant prediction with deep generative models of evolutionary data. Nature, 2021.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
eLife Assessment
This study presents a valuable theoretical exploration on the electrophysiological mechanisms of ionic currents via gap junctions in hippocampal CA1 pyramidal-cell models, and their potential contribution to local field potentials (LFPs) that is different from the contribution of chemical synapses. The biophysical argument regarding electric dipoles appears solid, but the evidence can be more convincing if their predictions are tested against experiments. A shortage of model validation and strictly comparable parameters used in the comparisons between chemical vs. junctional inputs makes the modeling approach incomplete; once strengthened, the finding can be of broad interest to electrophysiologists, who often make recordings from regions of neurons interconnected with gap junctions.
We gratefully thank the editors and the reviewers for the time and effort in rigorously assessing our manuscript, for the constructive review process, for their enthusiastic responses to our study, and for the encouraging and thoughtful comments. We especially thank you for deeming our study to be a valuable exploration on the differential contributions of active dendritic gap junctions vs. chemical synapses to local field potentials. We thank you for your appreciation of the quantitative biophysical demonstration on the differences in electric dipoles that appear in extracellular potentials with gap junctions vs. chemical synapses.
However, we are surprised by aspects of the assessment that resulted in deeming the approach incomplete, especially given the following with specific reference to the points raised:
(1) Testing against experiments: With specific reference to gap junctions, quantitative experimental verification becomes extremely difficult because of the well-established nonspecificities associated with gap junctional modulators (Behrens et al., 2011; Rouach et al., 2003). The non-specific actions of gap junctions are tabulated in Table 2 of (Szarka et al., 2021), reproduced below. In addition, genetic knockouts of gap junctional proteins are either lethal or involve functional compensation (Bedner et al., 2012; Lo, 1999), together making causal links to specific gap junctional contributions with currently available techniques infeasible.
In addition, the complex interactions between co-existing chemical synaptic, gap junctional, and active dendritic contributions from several cell-types make the delineation of the contributions of specific components infeasible with experimental approaches. A computational approach is the only quantitative route to specifically delineate the contributions of individual components to extracellular potentials, as seen from studies that have addressed the question of active dendritic contributions to field potentials (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to local field potentials (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). The biophysically and morphologically realistic computational modeling route is therefore invaluable in assessing the impact of individual components to extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).
Together, we emphasize that the computational modeling route is currently the only quantitative methodology to delineate the contributions of gap junctions vs. chemical synapses to extracellular potentials.
(2) Model validation: The model used in this study was adopted from a physiologically validated model from our laboratory (Roy & Narayanan, 2021). Please note that the original model was validated against several physiological measurements along the somatodendritic axis. We sincerely regret our oversight in not mentioning clearly that we have used an existing, thoroughly physiologically-validated model from our laboratory in this study.
(3) Comparisons between chemical vs. junctional inputs: We had taken elaborate precautions in our experimental design to match the intracellular electrophysiological signatures with reference to synchronous as well as oscillatory inputs, irrespective of whether inputs arrived through gap junctions or chemical synapses.
In a revised manuscript, we will address all the concerns raised by the reviewers in detail. We have provided point-by-point responses to reviewers’ helpful and constructive comments below. We thank the editors and the reviewers for this constructive review process, which we believe will help us in improving our manuscript with specific reference to emphasizing the novelty of our approach and conclusions.
Reviewer #1 (Public review):
This manuscript makes a significant contribution to the field by exploring the dichotomy between chemical synaptic and gap junctional contributions to extracellular potentials. While the study is comprehensive in its computational approach, adding experimental validation, network-level simulations, and expanded discussion on implications would elevate its impact further.
We gratefully thank you for your time and effort in rigorously assessing our manuscript, for the enthusiastic response, and the encouraging and thoughtful comments on our study. In what follows, we have provided point-by-point responses to the specific comments.
Strengths
Novelty and Scope
The manuscript provides a detailed investigation into the contrasting extracellular field potential (EFP) signatures arising from chemical synapses and gap junctions, an underexplored area in neuroscience. It highlights the critical role of active dendritic processes in shaping EFPs, pushing forward our understanding of how electrical and chemical synapses contribute differently to extracellular signals.
We thank you for the positive comments on the novelty of our approach and how our study addresses an underexplored area in neuroscience. The assumptions about the passive nature of dendritic structures had indeed resulted in an underestimation of the contributions of gap junctions to extracellular potentials. Once the realities of active structures are accounted for, the contributions of gap junctions increase by several orders of magnitude compared to passive structures (Fig. 1D).
Methodological Rigor
The use of morphologically and biophysically realistic computational models for CA1 pyramidal neurons ensures that the findings are grounded in physiological relevance. Systematic analysis of various factors, including the presence of sodium, leak, and HCN channels, offers a clear dissection of how transmembrane currents shape EFPs.
We thank you for your encouraging comments on the experimental design and methodological rigor of our approach.
Biological Relevance
The findings emphasize the importance of incorporating gap junctional inputs in analyses of extracellular signals, which have traditionally focused on chemical synapses. The observed polarity differences and spectral characteristics provide novel insights into how neural computations may differ based on the mode of synaptic input.
We thank you for your positive comments on the biological relevance of our approach. We also gratefully thank you for emphasizing the two striking novelties unveiling the dichotomy between gap junctions and chemical synapses in their contributions to field potentials: polarity differences and spectral characteristics.
Clarity and Depth
The manuscript is well-structured, with a logical progression from synchronous input analyses to asynchronous and rhythmic inputs, ensuring comprehensive coverage of the topic.
We sincerely thank you for the positive comments on the structure and comprehensive coverage of our manuscript encompassing different types of inputs that neurons typically receive.
Weaknesses and Areas for Improvement
Generality and Validation
The study focuses exclusively on CA1 pyramidal neurons. Expanding the analysis to other cell types, such as interneurons or glial cells, would enhance the generalizability of the findings. Experimental validation of the computational predictions is entirely absent. Empirical data correlating the modeled EFPs with actual recordings would strengthen the claims.
We thank you for raising this important point. The prime novelty and the principal conclusion of this study is that gap junctional contributions to extracellular field potentials are orders of magnitude higher when the active nature of cellular compartments is accounted for. The lacuna in the literature has been consequent to the assumption that cellular compartments are passive, resulting in the dogma that gap junctional contributions to field potentials are negligible. Despite knowledge about active dendritic structures for decades now, this assumption has kept studies from understanding or even exploring the contributions of gap junctions to field potentials. The rationale behind the choice of a computational approach to address the lacuna was as follows:
(1) The complex interactions between co-existing chemical synaptic, gap junctional, and active dendritic contributions from several cell-types make the delineation of the contributions of specific components infeasible with experimental approaches. A computational approach is the only quantitative route to specifically delineate the contributions of individual components to extracellular potentials, as seen from studies that have addressed the question of active dendritic contributions to field potentials (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to local field potentials (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). The biophysically and morphologically realistic computational modeling route is therefore invaluable in assessing the impact of individual components to extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).
(2) With specific reference to gap junctions, quantitative experimental verification becomes extremely difficult because of the well-established non-specificities associated with gap junctional modulators (Behrens et al., 2011; Rouach et al., 2003). The non-specific actions of gap junctions are tabulated in Table 2 of (Szarka et al., 2021). In addition, genetic knockouts of gap junctional proteins are either lethal or involve functional compensation (Bedner et al., 2012; Lo, 1999), together making causal links to specific gap junctional contributions with currently available techniques infeasible.
We highlight the novelty of our approach and of the conclusions about differences in extracellular signatures associated with active-dendritic chemical synapses and gap junctions, against these experimental difficulties. We emphasize that the computational modeling route is currently the only quantitative methodology to delineate the contributions of gap junctions vs. chemical synapses to extracellular potentials. Our analyses clearly demonstrate that gap junctions do contribute to extracellular potentials if the active nature of the cellular compartments is explicitly accounted for (Fig. 1D). We also show theoretically well-grounded and mechanistically elucidated differences in polarity (Figs. 1–3) as well as in spectral signatures (Figs. 5–8) of extracellular potentials associated with gap junctional vs. chemical synaptic inputs. Together, our fundamental demonstration in this study is the critical need to account for the active nature of cellular compartments in studying gap junctional contributions to extracellular potentials, with CA1 pyramidal neuronal dendrites used as an exemplar.
In a revised version of the manuscript, we will emphasize the motivations for the approach we took, highlighting the specific novelties both in methodological and conceptual aspects, finally emphasizing the need to account for other cell types and gap junctional contributions therein. Importantly, we will emphasize the non-specificities associated with gap-junctional blockers as the reason why experimental delineation of gap junctional vs. chemical synaptic contributions to LFP becomes tedious. We hope that these points will underscore the need for the computational approach that we took to address this important question, apart from the novelties of the manuscript.
Role of Active Dendritic Currents
The paper emphasizes active dendritic currents, particularly the role of HCN channels in generating outward currents under certain conditions. However, further discussion of how this mechanism integrates into broader network dynamics is warranted.
We thank you for this constructive suggestion. We agree that it is important to consider the implications for broader network dynamics of the outward HCN currents that are observed with synchronous inputs. In a revised manuscript, we will elaborate on the implications of the outward HCN current to network dynamics in detail.
Analysis of Plasticity
While the manuscript mentions plasticity in the discussion, there are no simulations that account for activity-dependent changes in synaptic or gap junctional properties. Including such analyses could significantly enhance the relevance of the findings.
We thank you for this constructive suggestion. Please note that we have presented consistent results for both fewer and more gap junctions in our analyses (Figure 1 with 217 gap junctions and Supplementary Figure 1 with 99 gap junctions). Our fundamentally novel result that gap junctions onto active dendrites differentially shape LFPs therefore holds true irrespective of the relative density of gap junctions onto the neuron, demonstrating that the conclusions about their contributions to LFP are invariant to plasticity in the number of gap junctions.
We had only briefly mentioned plasticity in the Introduction to highlight the different modes of synaptic transmission and to emphasize that plasticity has been studied in both chemical synapses and gap junctions, playing a role in learning and adaptation. However, if this wording inadvertently suggests that our study includes plasticity simulations, we will remove it from the Introduction in the updated manuscript to ensure clarity.
In the ‘Limitations of analyses and future studies’ section in Discussion, we suggested investigating the impact of plasticity mechanisms—specifically, activity-dependent plasticity of ion channels—on synaptic receptors vs. gap junctions and their effects on extracellular field potentials under various input conditions and plasticity combinations across different structures. We fully agree with the reviewer that such studies would offer valuable insights and further enhance the broader relevance of our findings. However, while our study implies this direction, it was not the primary focus of our investigation.
In the revised manuscript, we will expand on intrinsic/synaptic plasticity and how they could contribute to LFPs (Sinha & Narayanan, 2015, 2022), while also pointing to simulations with different numbers of gap junctions in this context.
Frequency-Dependent Effects
The study demonstrates that gap junctional inputs suppress high-frequency EFP power due to membrane filtering. However, it could delve deeper into the implications of this for different brain rhythms, such as gamma or ripple oscillations.
We sincerely thank you for these insightful comments that we totally agree with. As it so happens, this manuscript forms the first part of a broader study where we explore the implications of gap junctions to ripple frequency oscillations. The ripple oscillations part of the work was presented as a poster in the Society for Neuroscience (SfN) annual meeting 2024 (Sirmaur & Narayanan, 2024). There, we simulate a neuropil made of hundreds of morphologically realistic neurons to assess the role of different synaptic inputs — excitatory, inhibitory, and gap junctional — and active dendrites to ripple frequency oscillations. We demonstrate there that the conclusions from single-neuron simulations in this current manuscript extend to a neuropil with several neurons, each receiving excitatory, inhibitory and gap-junctional inputs, especially with reference to high-frequency oscillations. Our network-based analyses unveiled a dominant mediatory role of patterned inhibition in ripple generation, with recurrent excitations through chemical synapses and gap junctions in conjunction with return-current contributions from active dendrites playing regulatory roles in determining ripple characteristics (Sirmaur & Narayanan, 2024).
Our principal goal in this study, therefore, was to lay the single-neuron foundation for network analyses of the impact of gap junctions on LFPs. We are preparing the network part of the study, with a strong focus on ripple-frequency oscillations, for submission for peer review separately.
In a revised manuscript, we will mention the results from our SfN abstract with reference to network simulations and high-frequency oscillations, while also presenting discussions from other studies on the role of gap junctions in synchrony and LFP oscillations.
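The membrane-filtering effect raised in the comment above can be illustrated with a textbook single-compartment passive RC membrane, whose impedance magnitude falls with frequency as 1/sqrt(1 + (2πfτ)²). This is only a sketch under strong assumptions: the 20 ms time constant and the chosen frequencies are hypothetical, and the passive RC model is a simplification that is not the active, morphologically realistic model used in the study.

```python
import math


def membrane_gain(freq_hz: float, tau_s: float) -> float:
    """Relative impedance magnitude |Z(f)| / |Z(0)| of a passive RC membrane."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * freq_hz * tau_s) ** 2)


TAU = 0.020  # assumed membrane time constant (20 ms), illustrative only

# Example frequencies loosely in the theta, gamma, and ripple ranges.
for freq in (8.0, 40.0, 150.0):
    print(f"{freq:6.1f} Hz -> relative gain {membrane_gain(freq, TAU):.3f}")
```

In this simplified picture, a current injected intracellularly produces progressively smaller voltage deflections at higher frequencies, which is the sense in which the membrane acts as a low-pass filter.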
Visualization
Figures are dense and could benefit from more intuitive labeling and focused presentations. For example, isolating key differences between chemical and gap junctional inputs in distinct panels would improve clarity.
We thank you for this constructive suggestion. In the revised manuscript, we will enhance the visualization of the figures to ensure a clearer and more intuitive distinction between chemical synapses and gap junctions.
Contextual Relevance
The manuscript touches on how these findings relate to known physiological roles of gap junctions (e.g., in gamma rhythms) but does not explore this in depth. Stronger integration of the results into known neural network dynamics would enhance its impact.
We sincerely appreciate your valuable suggestion and acknowledge the importance of integrating our results into established neural network dynamics, particularly their implications for gamma rhythms. We will address this aspect more comprehensively in the revised version of our manuscript.
Reviewer #2 (Public review):
This computational work examines whether the inputs that neurons receive through electrical synapses (gap junctions) have different signatures in the extracellular local field potential (LFP) compared to inputs via chemical synapses. The authors present the results of a series of model simulations where either electric or chemical synapses targeting a single hippocampal pyramidal neuron are activated in various spatio-temporal patterns, and the resulting LFP in the vicinity of the cell is calculated and analyzed. The authors find several notable qualitative differences between the LFP patterns evoked by gap junctions vs. chemical synapses. For some of these findings, the authors demonstrate convincingly that the observed differences are explained by the electric vs. chemical nature of the input, and these results likely generalize to other cell types. However, in other cases, it remains plausible (or even likely) that the differences are caused, at least partly, by other factors (such as different intracellular voltage responses due to, e.g., the unequal strengths of the inputs). Furthermore, it was not immediately clear to me how the results could be applied to analyze more realistic situations where neurons receive partially synchronized excitatory and inhibitory inputs via chemical and electric synapses.
We gratefully thank you for your time and effort in rigorously assessing our manuscript, for the enthusiastic response, and the encouraging and thoughtful comments on our study. In what follows, we have provided point-by-point responses to the specific comments.
Strengths
The main strength of the paper is that it draws attention to the fact that inputs to a neuron via gap junctions are expected to give rise to a different extracellular electric field compared to inputs via chemical synapses, even if the intracellular effects of the two types of input are similar. This is because, unlike chemical synaptic inputs, inputs via gap junctions are not directly associated with transmembrane currents. This is a general result that holds independent of many details such as the cell types or neurotransmitters involved.
We gratefully thank you for the positive comments and the encouraging words about the novel contributions of our study. We are particularly thankful to you for your comment on the generality of our conclusions that hold for different cell types and neurotransmitters involved.
Another strength of the article is that the authors attempt to provide intuitive, non-technical explanations of most of their findings, which should make the paper readable also for non-expert audiences (including experimentalists).
We sincerely thank you for the positive comments about the readability of the paper.
Weaknesses
The most problematic aspect of the paper relates to the methodology for comparing the effects of electric vs. chemical synaptic inputs on the LFP. The authors seem to suggest that the primary cause of all the differences seen in the various simulation experiments is the different nature of the input, and particularly the difference between the transmembrane current evoked by chemical synapses and the gap junctional current that does not involve the extracellular space. However, this is clearly an oversimplification: since no real attempt is made to quantitatively match the two conditions that are compared (e.g., regarding the strength and temporal profile of the inputs), the differences seen can be due to factors other than the electric vs. chemical nature of synapses. In fact, if inputs were identical in all parameters other than the transmembrane vs. directly injected nature of the current, the intracellular voltage responses and, consequently, the currents through voltage-gated and leak currents would also be the same, and the LFPs would differ exactly by the contribution of the transmembrane current evoked by the chemical synapse. This is evidently not the case for any of the simulated comparisons presented, and the differences in the membrane potential response are rather striking in several cases (e.g., in the case of random inputs, there is only one action potential with gap junctions, but multiple action potentials with chemical synapses). Consequently, it remains unclear which observed differences are fundamental in the sense that they are directly related to the electric vs. chemical nature of the input, and which differences can be attributed to other factors such as differences in the strength and pattern of the inputs (and the resulting difference in the neuronal electric response).
We thank you for raising this important point. We would like to emphasize that our experimental design and analyses quantitatively account for the spatial distribution and temporal pattern of specific kinds of inputs that arrive through gap junctions and chemical synapses. We submit that our analyses quantitatively demonstrate that the fundamental difference between the gap junctional and chemical synaptic contributions to extracellular potentials is the absence of the direct transmembrane component from gap junctional inputs. We elucidate these points below:
(1) Spatial distribution: The inputs were distributed randomly across the basal dendrites, irrespective of whether they were through gap junctions or chemical synapses. For both chemical synapses and gap junctions, the inputs were of the same nature: excitatory.
(2) Different numbers of inputs: We have presented consistent results for both fewer and more gap junctions or chemical synapses in our analyses (see Figure 1 with 217 gap junctions or 245 chemical synapses and Supplementary Figure 2 with 99 gap junctions or 30 chemical synapses). Our fundamentally novel result that gap junctions onto active dendrites shape LFPs holds true irrespective of the relative density of gap junctions onto the neuron.
(3) Synchronous inputs (Figs. 1–3): For chemical synapses, the waveforms are in the shape of postsynaptic potentials. For gap junctional inputs, the waveforms are in the shape of postsynaptic potentials or dendritic spikes (to respect the active nature of inputs from the other cell). Here, the electrical response of the postsynaptic cell is identical irrespective of whether inputs arrive through gap junctions or chemical synapses: an action potential. We quantitatively matched the strengths such that the model generated a single action potential in response to synchronous inputs, irrespective of whether they arrived through chemical synapses or gap junctions. We mechanistically analyze the contributions of different cellular components and show that the direct transmembrane current in chemical synapses is the distinguishing factor that determines the dichotomy between the contributions of gap junctions vs. chemical synapses to extracellular potentials (Figs. 2–3). In a revised manuscript, we will show the intracellular responses to demonstrate that they are electrically matched.
(4) Random inputs (Fig. 4): For random inputs, we did not account for the number of action potentials that arrived, as the only observation we made here was with reference to the biphasic nature of the extracellular potentials with gap junctional inputs in the “No Sodium” scenario. We note that in the “No Sodium” scenario, the time-domain amplitudes were comparable for the field potentials (Fig. 4B, Fig. 4D).
(5) Rhythmic inputs (Figs. 5–8): For rhythmic inputs, please note that the intracellular and extracellular waveforms for every frequency are provided in Supplementary Figures S5–S11. It may be noted that the intracellular responses are comparable. In simulations for assessing spike-LFP comparison, we tuned the strengths to produce a single spike per cycle, ensuring fair comparison of LFPs with gap junctions vs. chemical synapses.
Taken together, we demonstrate through explicit sets of simulations and analyses that the differences in LFPs were not driven by the strength or patterns of the inputs but rather by the differences in direct transmembrane currents, which are subsequently reflected in the LFPs. In a revised manuscript, we will add a section to emphasize these points apart from providing intracellular traces for cases where they are not provided.
Some of the explanations offered for the effects of cellular manipulations on the LFP appear to be incomplete. More specifically, the authors observed that blocking leak channels significantly changed the shape of the LFP response to synchronous synaptic inputs - but only when electric inputs were used, and when sodium channels were intact. The authors seemed to attribute this phenomenon to a direct effect of leak currents on the extracellular potential - however, this appears unlikely both because it does not explain why blocking the leak conductance had no effect in the other cases, and because the leak current is several orders of magnitude smaller than the spike-generating currents that make the largest contributions to the LFP. An indirect effect mediated by interactions of the leak current with some voltage-gated currents appears to be the most likely explanation, but identifying the exact mechanism would require further simulation experiments and/or a detailed analysis of intracellular currents and the membrane potential in time and space.
We thank you for raising this important question. Leak channels were among the several contributors to the positive deflection observed in LFPs associated with gap junctions. This effect was present not only in gap junctional models with intact sodium conductance but also in the no-sodium model, where the amplitude of the positive deflection was reduced across other models as well (Fig. 2F, I). Furthermore, even in the absence of leak conductance, a small positive deflection was still observed (Fig. 2F), leading us to further investigate other transmembrane currents over time and across spatial locations, from the proximal to the distal dendritic ends relative to the soma (Fig. 3D). We had observed that the dominant contributor in the case of chemical synapses was the inward synaptic current (Fig. 3A), whereas for gap junctions, the primary contributors were leak conductance along with other outward currents, such as potassium and HCN currents (Fig. 3D). Together, the direct transmembrane component of chemical synapses provides a dominant contribution to extracellular potentials. This dominance translates to differences in the relative contributions of indirect currents (including leak currents) to extracellular potentials associated with chemical synaptic vs. gap junctional inputs. Our analyses of the exact ionic mechanisms (Fig. 3) demonstrate the involvement of several ion channels contributing to the indirect component in either scenario.
In every simulation experiment in this study, inputs through electric synapses are modeled as intracellular current injections of pre-determined amplitude and time course based on the sampled dendritic voltage of potential synaptic partners. This is a major simplification that may have a significant impact on the results. First, the current through gap junctions depends on the voltage difference between the two connected cellular compartments and is thus sensitive to the membrane potential of the cell that is treated as the neuron "receiving" the input in this study (although, strictly speaking, there is no pre- or postsynaptic neuron in interactions mediated by gap junctions). This dependence on the membrane potential of the target neuron is completely missing here. A related second point is that gap junctions also change the apparent membrane resistance of the neurons they connect, effectively acting as additional shunting (or leak) conductance in the relevant compartments. This effect is completely missed by treating gap junctions as pure current sources.
We thank you for raising this important point. We agree with the analyses presented by the reviewer on the importance of network simulations and bidirectional gap junctions that respect the voltages in both neurons. However, the complexities of LFP modeling preclude modeling of networks of morphologically realistic models with patterns of stimulations occurring across the dendritic tree. LFP modeling studies predominantly use “post-synaptic” currents to analyze the impact of different patterns of inputs arriving onto a neuron, even when chemical synaptic inputs are considered. Explicitly, individual neurons are separately simulated with different patterns of synaptic inputs, the transmembrane current at different locations is recorded, and the extracellular potential is then computed using the line source approximation (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). Even in scenarios where a network is analyzed, a hybrid approach involving the outputs of a point-neuron-based network being coupled to an independent morphologically realistic neuronal model is employed (Hagen et al., 2016; Martinez-Canada et al., 2021; Mazzoni et al., 2015). Given the complexities associated with the computation of electrode potentials arising as a distance-weighted summation of several transmembrane currents, these simplifications become essential.
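The distance-weighted summation described above can be illustrated with a minimal sketch of the line source approximation. This is not the authors' code: the function names, the SI units, and the conductivity value of 0.3 S/m are assumptions chosen for illustration. Each segment's transmembrane current is treated as uniformly spread along a straight line, and the integral of 1/distance along that line is evaluated in closed form with the inverse hyperbolic sine.

```python
import math


def line_source_potential(current, seg_start, seg_end, electrode, sigma=0.3):
    """Potential (V) at `electrode` from one straight segment.

    Assumed units: current in A, coordinates in m, sigma (extracellular
    conductivity, an illustrative value) in S/m. The transmembrane current
    is assumed to be uniformly distributed along the segment.
    """
    axis = [e - s for s, e in zip(seg_start, seg_end)]
    length = math.sqrt(sum(a * a for a in axis))
    unit = [a / length for a in axis]
    to_electrode = [p - s for s, p in zip(seg_start, electrode)]
    # Axial position of the electrode's projection, measured from seg_start.
    s0 = sum(w * u for w, u in zip(to_electrode, unit))
    # Perpendicular distance from the electrode to the segment axis.
    rho_sq = sum(w * w for w in to_electrode) - s0 * s0
    rho = max(math.sqrt(max(rho_sq, 0.0)), 1e-9)  # guard against the singularity
    # Closed form of the integral of ds / sqrt(rho^2 + (s - s0)^2) over [0, length].
    integral = math.asinh((length - s0) / rho) + math.asinh(s0 / rho)
    return current * integral / (4.0 * math.pi * sigma * length)


def total_potential(currents, segments, electrode, sigma=0.3):
    """Sum the contributions of all segments at one electrode location."""
    return sum(
        line_source_potential(i, start, end, electrode, sigma)
        for i, (start, end) in zip(currents, segments)
    )
```

Far from a short segment, this expression approaches the point-source value I / (4 * pi * sigma * r), which is a convenient sanity check for any implementation.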
Our approach models gap junctional currents in the same way that other models incorporate synaptic currents in LFP modeling (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). As gap junctions are typically implemented as resistors from the other neuronal compartment, we accounted for gap-junctional variability in our model by randomizing the scaling factors and the exact waveforms that arrive through individual gap junctions at specific locations. Thus, the inputs were not pre-determined by “pre” neurons. Instead, the recorded voltages from potential synaptic partner neurons were randomized across locations and scaled using factors at the dendrites before being injected into the target neuron (Supplementary Fig. S1). While incorporating a network of interconnected neurons is indeed important, we utilized a biophysical, morphologically realistic CA1 neuron model with different sets of input patterns to model LFPs, which were derived from the total transmembrane currents across all compartments of the multi-compartmental neuron model. Given the complexity of this approach, adding further network-level interactions or pre-post connections would have been computationally demanding.
In a revised manuscript, we will introduce the general methodology used in LFP modeling studies to introduce synaptic currents. We will emphasize that our study extends this approach to modeling gap junctional inputs, while also highlighting randomization of locations and the scaling process in assigning gap junctional synaptic strengths.
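The distinction drawn in this exchange, between a resistive gap junction whose current follows the voltage difference across it and a current injection of fixed amplitude, can be sketched as follows. The function names, units (mV, nS, pA) and numerical values are hypothetical and serve only to illustrate the two points raised by the reviewer; they are not taken from the manuscript.

```python
def gap_junction_current(v_partner_mv, v_target_mv, g_gj_ns):
    """Ohmic junctional current into the target compartment, in pA.

    The current depends on the voltage of BOTH coupled compartments.
    """
    return g_gj_ns * (v_partner_mv - v_target_mv)


def input_conductance(g_leak_ns, g_gj_ns=0.0):
    """Apparent input conductance (nS) of a single passive compartment.

    A resistive gap junction adds to the leak, acting as a shunt;
    a pure current source adds nothing here.
    """
    return g_leak_ns + g_gj_ns


# With the partner held at -40 mV, the junctional current shrinks and then
# reverses as the target depolarizes, whereas a fixed injection would not.
for v_target in (-65.0, -50.0, -30.0):
    print(v_target, gap_junction_current(-40.0, v_target, g_gj_ns=1.0))
```

A fixed current injection corresponds to evaluating the partner-minus-target difference once and ignoring subsequent changes in the target voltage, which is the simplification discussed above.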
One prominent claim of the article that is emphasized even in the abstract is that HCN channels mediate an outward current in certain cases. Although this statement is technically correct, there are two reasons why I do not consider this a major finding of the paper. First, as the authors acknowledge, this is a trivial consequence of the relatively slow kinetics of HCN channels: when at least some of the channels are open, any input that is sufficiently fast and strong to take the membrane potential across the reversal potential of the channel will lead to the reversal of the polarity of the current. This effect is quite generic and well-known and is by no means specific to gap junctional inputs or even HCN channels. Second, and perhaps more importantly, the functional consequence of this reversed current through HCN channels is likely to be negligible. As clearly shown in Supplementary Figure S3, the HCN current becomes outward only for an extremely short time period during the action potential, which is also a period when several other currents are also active and likely dominant due to their much higher conductances. I also note that several of these relevant facts remain hidden in Figure 3, both because of its focus on peak values, and because of the radically different units on the vertical axes of the current plots.
We thank you for raising this point and agree with you on every point. Please note that we do not assert that the outward HCN currents are exclusively associated with gap junctional inputs. Rather, our results show that synchronous inputs generate outward HCN currents in both chemical synapses (Fig. 3B; positive/outward HCN currents, except in the no sodium or leak model) and gap junctions (Fig. 3D; positive/outward HCN currents). We emphasized this in the case of gap junctions because, in the absence of inward synaptic currents, HCN (acting as outward currents with synchronous inputs) contributed to the positive deflection observed in the LFPs. While HCN would also contribute in the case of chemical synapses, its effect was negligible due to the presence of large inward synaptic currents. Since LFPs reflect the collective total transmembrane currents, the dominant contributors differ between these two scenarios, which we aimed to highlight. Since HCN exhibited outward currents in our synchronous input simulations, we have elaborated on this mechanism in the supplementary figure (Fig. S3). Our intention was not to emphasize this effect for only one synaptic mode but rather to highlight HCN's contribution to the positive deflection as one of the contributing factors.
We agree that HCN currents are relatively small in magnitude; therefore, our conclusions were based on HCN being one of the several contributing factors. Leak conductance and other outward conductances, including HCN currents (Fig. 3D), collectively contribute to the positive deflections observed in the case of gap junctional synchronous inputs.
We will ensure that all of these points are appropriately addressed in a revised manuscript.
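The polarity reversal of the HCN current discussed in this exchange follows directly from the ohmic driving force. The sketch below assumes that the open fraction stays fixed during a brief depolarization (reflecting the slow channel kinetics mentioned by the reviewer) and uses an illustrative reversal potential of -30 mV; the function name, conductance and open fraction are hypothetical values, not parameters from the model.

```python
def hcn_current(v_mv, open_fraction, g_max_ns=1.0, e_h_mv=-30.0):
    """HCN current in pA using the membrane-current sign convention:
    negative values are inward, positive values are outward.

    `open_fraction` is held constant, an assumption standing in for
    gating that is too slow to follow a fast voltage transient.
    """
    return g_max_ns * open_fraction * (v_mv - e_h_mv)


# Near rest the driving force is negative (inward current); during a spike
# the membrane potential crosses the reversal potential and the same open
# channels carry outward current.
print(hcn_current(-65.0, open_fraction=0.2))  # inward
print(hcn_current(20.0, open_fraction=0.2))   # outward
```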
Finally, I missed an appropriate validation of the neuronal model used, and also the characterization of the effects of the in silico manipulations used on the basic behavior of the model. As far as I understand, the model in its current form has not been used in other studies. If this is the case, it would be important to demonstrate convincingly through (preferably quantitative) comparisons with experimental data using different protocols that the model captures the physiological behavior of at least the relevant compartments (in this case, the dendrites and the soma) of hippocampal pyramidal neurons sufficiently well that the results of the modeling study are relevant to the real biological system. In addition, the correct interpretation of various manipulations of the model would be strongly facilitated by investigating and discussing how the physiological properties of the model neuron are affected by these alterations.
We thank you for raising this important point. The CA1 pyramidal neuronal model used in this study is built with ion-channel models derived from biophysical and electrophysiological recordings from these cells. As mentioned in the Methods section “Dynamics and distribution of active channels” and Supplementary Table S1, models for individual channels, their gating kinetics, and channel distributions across the somatodendritic arbor (wherever known) are all derived from their physiological equivalents. Importantly, these values were derived from previously validated models from the laboratory, which contain these very ion channel models and the exact same morphology (Roy & Narayanan, 2021). Please compare Supplementary Table S1 with Table 1 of Roy & Narayanan (2021). Please note that this model was validated against several physiological measurements along the somatodendritic axis (Fig. 1 of Roy & Narayanan, 2021).
In a revised manuscript, we will explicitly mention this while also mentioning the different physiological properties that were used for the validation process in Roy & Narayanan (2021). We sincerely regret not mentioning these details in the current version of our manuscript.
We will fix these in a revised version of the manuscript.
References
Bedner, P., Steinhauser, C., & Theis, M. (2012). Functional redundancy and compensation among members of gap junction protein families? Biochim Biophys Acta, 1818(8), 1971-1984. https://doi.org/10.1016/j.bbamem.2011.10.016
Behrens, C. J., Ul Haq, R., Liotta, A., Anderson, M. L., & Heinemann, U. (2011). Nonspecific effects of the gap junction blocker mefloquine on fast hippocampal network oscillations in the adult rat in vitro. Neuroscience, 192, 11-19. https://doi.org/10.1016/j.neuroscience.2011.07.015
Buzsaki, G., Anastassiou, C. A., & Koch, C. (2012). The origin of extracellular fields and currents--EEG, ECoG, LFP and spikes. Nat Rev Neurosci, 13(6), 407-420. https://doi.org/10.1038/nrn3241
Einevoll, G. T., Destexhe, A., Diesmann, M., Grun, S., Jirsa, V., de Kamps, M., Migliore, M., Ness, T. V., Plesser, H. E., & Schurmann, F. (2019). The Scientific Case for Brain Simulations. Neuron, 102(4), 735-744. https://doi.org/10.1016/j.neuron.2019.03.027
Gold, C., Henze, D. A., Koch, C., & Buzsaki, G. (2006). On the origin of the extracellular action potential waveform: A modeling study. J Neurophysiol, 95(5), 3113-3128. https://doi.org/10.1152/jn.00979.2005
Hagen, E., Dahmen, D., Stavrinou, M. L., Linden, H., Tetzlaff, T., van Albada, S. J., Grun, S., Diesmann, M., & Einevoll, G. T. (2016). Hybrid Scheme for Modeling Local Field Potentials from Point-Neuron Networks. Cereb Cortex, 26(12), 4461-4496. https://doi.org/10.1093/cercor/bhw237
Halnes, G., Ness, T. V., Næss, S., Hagen, E., Pettersen, K. H., & Einevoll, G. T. (2024). Electric Brain Signals: Foundations and Applications of Biophysical Modeling. Cambridge University Press. https://doi.org/10.1017/9781009039826
Lo, C. W. (1999). Genes, gene knockouts, and mutations in the analysis of gap junctions. Dev Genet, 24(1-2), 1-4. https://doi.org/10.1002/(SICI)1520-6408(1999)24:1/2<1::AID-DVG1>3.0.CO;2-U
Martinez-Canada, P., Ness, T. V., Einevoll, G. T., Fellin, T., & Panzeri, S. (2021). Computation of the electroencephalogram (EEG) from network models of point neurons. PLoS Comput Biol, 17(4), e1008893. https://doi.org/10.1371/journal.pcbi.1008893
Mazzoni, A., Linden, H., Cuntz, H., Lansner, A., Panzeri, S., & Einevoll, G. T. (2015). Computing the Local Field Potential (LFP) from Integrate-and-Fire Network Models. PLoS Comput Biol, 11(12), e1004584. https://doi.org/10.1371/journal.pcbi.1004584
Ness, T. V., Remme, M. W. H., & Einevoll, G. T. (2018). h-Type Membrane Current Shapes the Local Field Potential from Populations of Pyramidal Neurons. J Neurosci, 38(26), 6011-6024. https://doi.org/10.1523/jneurosci.3278-17.2018
Reimann, M. W., Anastassiou, C. A., Perin, R., Hill, S. L., Markram, H., & Koch, C. (2013). A biophysically detailed model of neocortical local field potentials predicts the critical role of active membrane currents. Neuron, 79(2), 375-390. https://doi.org/10.1016/j.neuron.2013.05.023
Rouach, N., Segal, M., Koulakoff, A., Giaume, C., & Avignone, E. (2003). Carbenoxolone blockade of neuronal network activity in culture is not mediated by an action on gap junctions. Journal of Physiology, 553(Pt 3), 729-745. https://doi.org/10.1113/jphysiol.2003.053439
Roy, A., & Narayanan, R. (2021). Spatial information transfer in hippocampal place cells depends on trial-to-trial variability, symmetry of place-field firing, and biophysical heterogeneities. Neural Netw, 142, 636-660. https://doi.org/10.1016/j.neunet.2021.07.026
Schomburg, E. W., Anastassiou, C. A., Buzsaki, G., & Koch, C. (2012). The spiking component of oscillatory extracellular potentials in the rat hippocampus. J Neurosci, 32(34), 11798-11811. https://doi.org/10.1523/JNEUROSCI.0656-12.2012
Sinha, M., & Narayanan, R. (2015). HCN channels enhance spike phase coherence and regulate the phase of spikes and LFPs in the theta-frequency range. Proc Natl Acad Sci U S A, 112(17), E2207-2216. https://doi.org/10.1073/pnas.1419017112
Sinha, M., & Narayanan, R. (2022). Active Dendrites and Local Field Potentials: Biophysical Mechanisms and Computational Explorations. Neuroscience, 489, 111-142. https://doi.org/10.1016/j.neuroscience.2021.08.035
Sirmaur, R., & Narayanan, R. (2024). Distinct extracellular signatures of chemical and electrical synapses impinging on active dendrites differentially contribute to ripple-frequency oscillations. Society for Neuroscience annual meeting (https://www.abstractsonline.com/pp8/?_gl=1*1bxo7m*_gcl_au*MTc5MTQ0NjE0NC4xNzI3MDcwOTMw*_ga*MTMxMTE5OTcyMy4xNzI3MDcwOTMx*_ga_T09K3Q2WDN*MTcyNzA3MDkzMS4xLjEuMTcyNzA3MDkzNy41NC4wLjA.#!/20433/presentation/13949), Chicago, USA.
Szarka, G., Balogh, M., Tengolics, A. J., Ganczer, A., Volgyi, B., & Kovacs-Oller, T. (2021). The role of gap junctions in cell death and neuromodulation in the retina. Neural Regen Res, 16(10), 1911-1920. https://doi.org/10.4103/1673-5374.308069
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #2:
The authors indicated that they had added coefficients of variation for within-lineage heterogeneity (line 93), but I can't seem to find this.
The coefficients of variation were indeed included as suggested, and can be found in lines 94-96 of the current revised version of the manuscript. The sentence states: “Nevertheless, substantial intra-lineage heterogeneity could be observed, particularly within L1 and L2 (coefficients of variation 84.4% [L1] and 66.0% [L2] vs. 32.6% [L3], 34.6% [L4] and 31.9% [L5]).”
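For reference, a coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below is a generic illustration rather than the authors' analysis: the function name is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the response does not state which estimator was used.

```python
import statistics


def coefficient_of_variation(values):
    """Return the coefficient of variation in percent.

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption, not something stated in the response.
    """
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical measurements for isolates within one lineage.
print(coefficient_of_variation([1.0, 2.0, 3.0]))
```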
They were unable to address my question on the impact of T-cell depletion from PBMC on bacterial growth? Their discussion should include that this experimental limitation means that they are unable to test cause and effect for the relationship between T cell proliferation and bacterial growth.
As recommended, this experimental limitation is now included in the discussion in lines 344-346.
Reviewer #3:
EM:
Based on the authors' lack of resources, I don't believe that electron microscopy experiments should be required for this publication. However, it should be noted that EM is performed on fixed samples such that implementation of those protocols as it relates to bio-safety is no more demanding than the preparation of samples for other common assays performed outside of the BSL3.
We appreciate your understanding regarding our lack of resources to carry out the EM experiments, although we recognize the possibility of them being performed on BSL3 samples.
Granuloma score:
From the author comments and the manuscript's text, it appears that the "granuloma score" is an attempt at quantitation of PBMC organization. Where every component of the metric [(mean area / mean aspect ratio) / mean n ] is a visual facet of the relative integration of PBMCs into a more organized aggregate. The area and number (n) of aggregates both address regional coalescence of the total number of PBMCs added into the matrix. Whereas the aspect ratio component is an indicator of uniformity of the PBMCs that have been assigned to an individual aggregate. Perhaps another roundness estimation would have been more precise, but aspect ratio seems fine for their assay. Considering these factors and the author's contention that the aggregates making up (n) are granulomas, the name "granuloma score" is inaccurate and a more appropriate title would be "aggregate organization score" or "aggregate organization index".
Thank you for the suggested alternative terminology. The term “granuloma score” has been replaced with “aggregate formation score” throughout the manuscript.
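The metric quoted by the reviewer, (mean area / mean aspect ratio) / mean n, can be written out as a short sketch. The function name, the input layout, and the reading of "mean n" as the mean number of aggregates per imaged field are assumptions made for illustration; the manuscript's own definition takes precedence.

```python
from statistics import mean


def aggregate_formation_score(areas, aspect_ratios, aggregates_per_field):
    """Compute (mean area / mean aspect ratio) / mean n.

    areas and aspect_ratios hold per-aggregate measurements;
    aggregates_per_field holds aggregate counts per imaged field
    (an assumed interpretation of "mean n").
    """
    return (mean(areas) / mean(aspect_ratios)) / mean(aggregates_per_field)


# Hypothetical values: fewer, larger and rounder aggregates score higher.
print(aggregate_formation_score([100, 300], [1.0, 1.5], [2, 6]))
```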
Dormancy:
In the manuscript, the authors should explicitly reference the validation studies which demonstrate induction of the DosR regulon in the model, lest their previously generated and conducted studies go unappreciated by a broader audience. In the title of that previous work (PMID: 32069329) this group used the designation "dormant-like" to describe the state observed in bacteria within their in vitro granuloma model system, as they also do in LINE 124. This term or a variation of it should be exchanged for dormant/dormancy throughout the manuscript when referring to observations in the model bacteria. It is a more precise description. Further, "dormant-like" allows the latitude to refer to actively growing bacteria in the context of dormancy without running the risk of putting forth confusing or potentially erroneous assertions.
As recommended, the suffix “-like” has been added to the designation “dormant” when referring to the bacterial phenotype induced in the model. In addition, the induction of the DosR regulon in the model is now mentioned in line 116, and the reference to Kapoor’s work that originally demonstrated it by qPCR has been included.
PBMC aggregation:
I would like to make the authors aware that in well vetted models, cell aggregation as a function of infection does not typically occur in PBMCs on tissue culture plates until day 6 post infection (PMID: 25691598, Fig 2). Further, this group's own published protocol for the model under consideration in this manuscript (PMID: 33659472, Fig1) explicitly states that "Formation of granuloma like structures can be observed after 7-8 days", the implication being that prior to 7 days granuloma like structures cannot be observed reliably. Regardless, it seems evident that the authors will not be conducting additional experiments for this publication, which I find acceptable. However a proper negative control would certainly strengthen evidence for the association of strain specific bacterial and host responses with the granulomatous response in this model.
We had interpreted the reviewer’s previous comment regarding PBMC aggregation as referring to a different experimental model rather than a matter of timing. Since many other studies have previously assessed the impact of strain/lineage variability in macrophage responses, in this work we decided to focus on later time points, and we did include uninfected samples as a negative control. Nonetheless, we agree that it would be very interesting to additionally evaluate early monocyte/macrophage responses, and we will take this into account in future studies.
Use of antiquated terminology:
I can appreciate the desire to establish continuity between publications by using the same abbreviation for TNF but it will come at a cost. Using outdated terms in general makes people more dismissive of the work. Perhaps something to consider.
Since this seems to be an important issue for the reviewer, we have replaced the term TNF-α with TNF throughout the manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
Chen and Phillips describe the dynamic appearance of cytoplasmic granules during embryogenesis analogous to SIMR germ granules, and distinct from CSR-1-containing granules, in the C. elegans germline. They show that the nuclear Argonaute NRDE-3, when mutated to abrogate small RNA binding, or in specific genetic mutants, partially colocalizes to these granules along with other RNAi factors, such as SIMR-1, ENRI-2, RDE-3, and RRF-1. Furthermore, NRDE-3 RIP-seq analysis in early vs. late embryos is used to conclude that NRDE-3 binds CSR-1-dependent 22G RNAs in early embryos and ERGO-1-dependent 22G RNAs in late embryos. These data lead to their model that NRDE-3 undergoes small RNA substrate "switching" that occurs in these embryonic SIMR granules and functions to silence two distinct sets of target transcripts - maternal, CSR-1 targeted mRNAs in early embryos and duplicated genes and repeat elements in late embryos.
Strengths:
The identification and function of small RNA-related granules during embryogenesis is a poorly understood area and this study will provide the impetus for future studies on the identification and potential functional compartmentalization of small RNA pathways and machinery during embryogenesis.
Weaknesses:
(1) While the authors acknowledge the following issue, their finding that loss of SIMR granules has no apparent impact on NRDE-3 small RNA loading puts the functional relevance of these structures into question. As they note in their Discussion, it is entirely possible that these embryonic granules may be "incidental condensates." It would be very welcomed if the authors could include some evidence that these SIMR granules have some function; for example, does the loss of these SIMR granules have an effect on CSR-1 targets in early embryos and ERGO-1-dependent targets in late embryos?
We appreciate reviewer 1’s concern that we do not provide enough evidence for the function of the SIMR granules. As suggested, we examined the NRDE-3 bound small RNAs more deeply, and we do observe a slight but significant increase in CSR-class 22G-RNA binding to NRDE-3 in late embryos of simr-1 and enri-2 mutants (see below, right). We hypothesize that this result could be due to a slower switch from CSR to ERGO 22G-RNAs in the absence of SIMR granules. We added these data to Figure 6G.
(2) The analysis of small RNA class "switching" requires some clarification. The authors re-define ERGO-1-dependent targets in this study to arrive at a very limited set of genes and their justification for doing this is not convincing. What happens if the published set of ERGO-1 targets is used?
As we mentioned in the manuscript, we initially attempted to use the previously defined ERGO targets. However, the major concern is that fewer than half of the genes classified as ERGO targets by Manage et al. and Fischer et al. overlap with one another (Figure 6—figure supplement 1D and below). We reason that this might be because the gene sets were defined as genes that lose small RNAs in various ERGO pathway mutants and because different criteria were used to define the lists, as discussed in the manuscript (lines 471-476). As a result, some of the previously defined ERGO target genes may actually be indirect targets of the pathway. Here we focus on genes targeted by small RNAs enriched in an ERGO pathway Argonaute IP, which should be more specific.
In this manuscript, we are interested specifically in the ERGO targets bound by NRDE-3; thus, we utilized the IP-small RNA sequencing data from young adult animals (Seroussi et al, 2023) to define a new ERGO list. We are confident about this list because 1) most of our new ERGO genes fall within the overlap between the ERGO-Manage and ERGO-Fischer lists (see Figure 6—figure supplement 1D in our manuscript and below), and 2) we observed the most significant decrease in small RNA levels and increase in mRNA levels in the nrde-3 mutants using our newly defined list (see Figure 6—figure supplement 1E-F in our manuscript).
To further address reviewer 1’s concern about whether the data would look significantly different when using the ERGO-Manage and ERGO-Fischer lists, we made new scatter plots shown in Author response image 1 panels A-C below (ERGO-Manage – purple, ERGO-Fischer – yellow, and the overlap – yellow with purple ring). We found that the small RNA switching pattern of NRDE-3 is consistent with that obtained using our newly defined list, particularly if we look at the overlap of the ERGO-Manage and ERGO-Fischer lists (Author response image 1 panels D-F below, red).
Author response image 1.
Further, the NRDE-3 RIP-seq data is used to conclude that NRDE-3 predominantly binds CSR-1 class 22G RNAs in early embryos, while ERGO-1-dependent 22G RNAs are enriched in late embryos. a) The relative ratios of each class of small RNAs are given in terms of unique targets. What is the total abundance of sequenced reads of each class in the NRDE-3 IPs?
To address the reviewer’s question about the total abundance of sequenced reads of each class in the NRDE-3 IPs: Author response image 2 panels A-B below show the total RPM of CSR and ERGO class sRNAs in inputs and IPs at different stages. Focusing on late embryos, the total abundance of ERGO-dependent sRNAs is similar to that of CSR-class sRNAs in input, while much higher in IP, indicating an enrichment of ERGO-dependent 22G-RNAs in NRDE-3, consistent with our log2FC (IP vs input) in Figure 6B. These data support our conclusion that NRDE-3 preferentially binds to ERGO targets in late embryos.
Author response image 2.
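The two quantities referred to above, reads per million (RPM) and the log2 fold change of IP over input, can be sketched generically as follows. The function names and the pseudocount of 1 are assumptions for illustration; the authors' actual normalization and enrichment pipeline is not described in this response and may differ.

```python
import math


def rpm(class_reads, total_mapped_reads):
    """Reads per million: reads in a small RNA class scaled to library size."""
    return class_reads * 1e6 / total_mapped_reads


def log2_fold_change(ip_rpm, input_rpm, pseudocount=1.0):
    """log2(IP / input) with a pseudocount (assumed) to avoid division by zero."""
    return math.log2((ip_rpm + pseudocount) / (input_rpm + pseudocount))


# Hypothetical counts: a class that is rarer in input than in the IP
# yields a positive log2 fold change, indicating enrichment.
input_rpm = rpm(1_000, 10_000_000)
ip_rpm = rpm(40_000, 10_000_000)
print(log2_fold_change(ip_rpm, input_rpm))
```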
b) The "switching" model is problematic given that even in late embryos, the majority of 22G RNAs bound by NRDE-3 is the CSR-1 class (Figure 5D).
It is important to keep in mind the difference in the total number of CSR target genes (3834) and ERGO target genes (119). The pie charts shown in Figure 6D show the proportion of the genes enriched in the NRDE-3 IP that are CSR or ERGO targets. For the NRDE-3 IP in late embryos, 70/119 (58.8%) of ERGO targets are enriched, whereas 172/3834 (4.5%) of CSR targets are enriched. These data are also supported by the RPM graphs shown in Author response image 2 panels A-B above, which show that the majority of the small RNAs bound by NRDE-3 in late embryos are ERGO targets. Nonetheless, NRDE-3 still binds to some CSR targets, as shown in Figure 6D and panel B, which may be because the amount of CSR-class 22G-RNAs is reduced gradually across embryonic development as the maternally deposited NRDE-3 loaded with CSR-class 22G-RNAs is diluted by newly transcribed NRDE-3 loaded with ERGO-dependent 22G-RNAs (lines 857-862).
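The contrast between raw counts and class-normalized proportions in the response above can be checked with a few lines of arithmetic. The gene counts are those quoted in the response; the function name is illustrative.

```python
def percent_enriched(enriched_genes, total_genes_in_class):
    """Percentage of a target class that is enriched in the IP."""
    return 100.0 * enriched_genes / total_genes_in_class


# Counts quoted for the NRDE-3 IP in late embryos.
ergo = percent_enriched(70, 119)
csr = percent_enriched(172, 3834)

# More CSR genes are enriched in absolute terms (172 vs 70), yet they make
# up a far smaller share of their class because the CSR class is much larger.
print(round(ergo, 1), round(csr, 1))
```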
c) A major difference between NRDE-3 small RNA binding in eri-1 and simr-1 mutants appears to be that NRDE-3 robustly binds CSR-1 22G RNAs in eri-1 but not in simr-1 in late embryos. This result should be better discussed.
In the eri-1 mutant, we hypothesize that NRDE-3 robustly binds CSR-class 22G-RNAs because ERGO-class 22G-RNAs are not synthesized during mid-embryogenesis, so either NRDE-3 is unloaded (in granules at the 100-cell stage in Figure 2A) or mis-loaded with CSR-class 22G-RNAs (in the nucleus at the 100-cell stage in Figure 2A). We don’t have a robust method to address the proportion of loaded vs. unloaded NRDE-3, so it is difficult to address the degree to which NRDE-3 is misloaded in the eri-1 mutant. In the simr-1 mutant, both classes of small RNAs are present and NRDE-3 is still preferentially loaded with ERGO-dependent 22G-RNAs, though we do see a subtle increase in association with CSR-class 22G-RNAs. These data could suggest a less efficient loading of NRDE-3 with ERGO-dependent 22G-RNAs, but we would need more precise methods to address the loading dynamics in the simr-1 mutant.
(3) Ultimately, if the switching is functionally important, then its impact should be observed in the expression of their targets. RNA-seq or RT-qPCR of select CSR-1 and ERGO-1 targets should be assessed in nrde-3 mutants during early vs late embryogenesis.
The function of NRDE-3 at ERGO targets has been well studied (Guang et al, 2008) and is also assessed in our H3K9me3 ChIP-seq analysis in Figure 7E where, in mixed staged embryos, H3K9me3 level on ERGO targets (labeled as ‘NRDE-3 targets in young adults’) is reduced significantly in the nrde-3 mutant.
To understand the function of NRDE-3 binding on CSR targets in early embryos, we attempted to do RT-qPCR, smFISH, and anti-H3K9me3 CUT&Tag-seq on early embryos, and we either failed to obtain enough signal or failed to detect any significant difference (data not shown). We additionally tested the possibility that NRDE-3 functions with CSR-class 22G-RNAs in oocytes. We present new data showing that NRDE-3 represses RNA Pol II in oocytes to promote global transcriptional repression at the oocyte-to-embryo transition; we have now included these data in Figure 8.
Reviewer #2 (Public review):
Summary:
NRDE-3 is a nuclear WAGO-clade Argonaute that, in somatic cells, binds small RNAs amplified in response to the ERGO-class 26G RNAs that target repetitive sequences. This manuscript reports that, in the germline and early embryos, NRDE-3 interacts with a different set of small RNAs that target mRNAs. This class of small RNAs was previously shown to bind to a different WAGO-clade Argonaute called CSR-1, which is cytoplasmic, unlike nuclear NRDE-3. The switch in NRDE-3 specificity parallels recent findings in Ascaris where the Ascaris NRDE homolog was shown to switch from sRNAs that target repetitive sequences to CSR-class sRNAs that target mRNAs.
The manuscript also correlates the change in NRDE-3 specificity with the appearance in embryos of cytoplasmic condensates that accumulate SIMR-1, a scaffolding protein that the authors previously implicated in sRNA loading for a different nuclear Argonaute HRDE-1. By analogy, and through a set of correlative evidence, the authors argue that SIMR foci arise in embryogenesis to facilitate the change in NRDE-3 small RNA repertoire. The paper presents lots of data that beautifully documents the appearance and composition of the embryonic SIMR-1 foci, including evidence that a mutated NRDE-3 that cannot bind sRNAs accumulates in SIMR-1 foci in a SIMR-1-dependent fashion.
Weaknesses:
The genetic evidence, however, does not support a requirement for SIMR-1 foci: the authors detected no defect in NRDE-3 sRNA loading in simr-1 mutants. Although the authors acknowledge this negative result in the discussion, they still argue for a model (Figure 7) that is not supported by genetic data. My main suggestion is that the authors give equal consideration to other models - see below for specifics.
We appreciate reviewer 2’s comments on the genetic evidence for the function of SIMR foci. A similar concern was also brought up by reviewer 1. By re-examining our sequencing data, we found that there is a modest but significant increase in NRDE-3 association with CSR-class sRNAs in simr-1 and enri-2 mutants in late embryos. We believe that these data support our model that SIMR-1 and ENRI-2 are required for an efficient switch of NRDE-3-bound small RNAs. Please refer to our response to reviewer 1, point (1), and Figure 6G in the updated manuscript.
Reviewer #3 (Public review):
Summary:
Chen and Phillips present intriguing work that extends our view on the C. elegans small RNA network significantly. While the precise findings are rather C. elegans specific there are also messages for the broader field, most notably the switching of small RNA populations bound to an argonaute, and RNA granules behavior depending on developmental stage. The work also starts to shed more light on the still poorly understood role of the CSR-1 argonaute protein and supports its role in the decay of maternal transcripts. Overall, the work is of excellent quality, and the messages have a significant impact.
Strengths:
Compelling evidence for major shift in activities of an argonaute protein during development, and implications for how small RNAs affect early development. Very balanced and thoughtful discussion.
Weaknesses:
Claims on co-localization of specific 'granules' are not well supported by quantitative data.
We have now included zoomed images of individual granules to better show the colocalization in Figure 4 and Figure 4—figure supplement 1, and performed Pearson’s colocalization analysis between different sets of proteins in Figure 4B.
Reviewer #2 (Recommendations for the authors):
- The manuscript is very dense and the gene names are not helpful. For example, the authors mention ERGO-1 without clarifying the type of protein, etc. I suggest the authors include a figure to go with the introduction that describes the different classes of primary and secondary sRNAs, associated Argonautes, and other accessory proteins. Also include a table listing relevant gene names, protein classes, main localizations, and proposed functions for easy reference by the readers.
We agree that the gene names in different small RNA pathways are easily confused. We added a diagram and table in Figure 1—figure supplement 1 depicting the ERGO/NRDE and CSR pathways and added clarification about the ERGO/NRDE-3 pathway in the text in lines 126-128.
- Line 424 - the wording here and elsewhere seems to imply that SIMR-1 and ENRI-2, although not essential, contribute to NRDE-3 sRNA loading. The sequencing data, however, do not support this - the authors should be clearer on this. If the authors believe there are subtle but significant differences, they should show them perhaps by adding a panel in Figure 5 that directly compares the NRDE-3 IPs in wildtype versus simr-1 mutants. Figure 5H however does not support such a requirement.
As brought up by reviewer 1, we do not see a difference in binding of ERGO-dependent sRNAs in the simr-1 mutant in late embryos. We do, however, see a modest, but significant, increase of CSR-sRNAs bound by NRDE-3 in simr-1 and enri-2 mutants, which we hypothesize could be due to a less efficient loading of ERGO-dependent 22G-RNAs by NRDE-3. The updated data are now in Figure 6G. We have also edited the text and model figure to soften these conclusions.
- Condensates of PGL proteins appear at a similar time and place (somatic cells of early embryos) as the embryonic SIMR-1 foci. The PGL foci correspond to autophagy bodies that degrade PGL proteins. Is it possible that SIMR-1 foci also correspond to degradative structures? The possibility that SIMR-1 foci are targeted for autophagy and not functional would fit with the finding that simr-1 mutants do not affect NRDE-3 loading in embryos.
We appreciate reviewer 2’s comments on the possibility of SIMR granules acting as sites for degradation of SIMR-1 and NRDE-3. We think this is not the case for the following reasons: 1) if SIMR granules are sites of autophagic degradation, then we would expect that embryonic SIMR granules in somatic cells, like PGL granules, should only be observed in autophagy mutants; however, we see them in wild-type embryos; 2) we would not expect a functional Tudor domain to be required for granule localization; however, in Figure 1—figure supplement 2B, we show that a point mutation in the Tudor domain of SIMR-1 abrogates SIMR granule formation; and 3) if NRDE-3(HK-AA) is recruited to SIMR granules for degradation while wild-type NRDE-3 is cytoplasmic, then NRDE-3(HK-AA) should show a significantly reduced protein level compared to wild-type NRDE-3. In the western blot in Figure 2—figure supplement 1B, NRDE-3 and NRDE-3(HK-AA) protein levels are similar, indicating that NRDE-3(HK-AA) is not degraded despite being unloaded. This is in contrast to what we have observed previously for HRDE-1, which is degraded in its unloaded state. If SIMR-1 played a role directly in promoting degradation of NRDE-3(HK-AA), we would similarly expect to see a change in NRDE-3 or NRDE-3(HK-AA) expression in a simr-1 mutant. We performed a western blot and did not observe a significant change in protein expression for NRDE-3 (Figure 3—figure supplement 1A).
Although under wild-type conditions, SIMR granules do not appear to be sites of autophagic degradation, upon treatment with lgg-1 (an autophagy protein) RNAi, we found that SIMR-1, as well as many other germ granule and embryonic granule-localized proteins, increase in abundance in late embryos. These data demonstrate that ZNFX-1, CSR-1, SIMR-1, MUT-2/RDE-3, RRF-1, and unloaded NRDE-3 are removed by autophagic degradation, similar to what has been shown previously for PGL-1 proteins (Zhang et al, 2009, Cell). We added these data to Figure 5. It is important to emphasize, however, that the timing of degradation differs for each granule assayed (Lines 447-450), indicating that there must be multiple waves of autophagy to selectively degrade subsets of proteins when they are no longer needed by the embryo.
- The observation that an NRDE-3 mutant that cannot load sRNAs localizes to SIMR-1 foci does not necessarily imply that wild-type unloaded NRDE-3 would also localize there. Unless the authors have additional data to support this idea, the authors should acknowledge that this hypothesis is speculative. In fact, why does cytoplasmic NRDE-3 not localize to granules in the rde-3;ego-1degron strain shown in Figure 6B?? Is it possible that the NRDE-3 mutant accumulates in SIMR-1 foci because it is unfolded and needs to be degraded?
We believe that wild-type NRDE-3 also localizes to SIMR foci when unloaded. This is supported by the localization of wild-type NRDE-3 in eri-1 and rde-3 mutants, where a subset of small RNAs is depleted. Wild-type NRDE-3 localizes to both somatic SIMR-1 granules and the nucleus, depending on embryo stage (Figure 2A, Figure 2—figure supplement 1C). There are fewer granules in eri-1 and rde-3 mutants than in the nrde-3(HK-AA) mutant, consistent with the imaging data showing that NRDE-3 only partially localizes to somatic granules (Figure 2A – 100-cell stage).
In the rde-3; ego-1 double mutant, the embryos have severe developmental defects: they cannot divide properly after the 4-8-cell stage and exhibit morphology defects after that stage. In wild-type, SIMR foci do not appear until around the 8-28-cell stage (shown in Figure 1C), so we believe that cytoplasmic NRDE-3 fails to localize to foci in the double mutant because of timing.
- The authors propose that NRDE-3 functions in nuclei to target mRNAs also targeted in the cytoplasm by CSR-1. If so, how do they propose that NRDE-3 might do this since little transcription occurs in oocytes/early embryos?? Are the authors suggesting that NRDE-3 targets germline genes for silencing specifically at the times that zygotic transcription comes back on, or already in maturing oocytes? Is the transcription of most CSR-1 targets silenced in early embryos??
We appreciate the suggestion to check the function of NRDE-3 in oocytes. We tested this possibility and found it to be correct. NRDE-3 functions in oocytes for transcriptional repression by inhibiting RNA Pol II elongation. We added these data to Figure 8. We also attempted to do RT-qPCR, smFISH, and anti-H3K9me3 CUT&Tag-seq on early embryos to further test the hypothesis that NRDE-3 acts with CSR-class 22G-RNAs in early embryos, but we either failed to obtain enough signal or failed to detect any significant difference (data not shown). Therefore, we think that the primary role for NRDE-3 bound to CSR-class 22G-RNAs may be for global transcriptional repression of oocytes prior to fertilization.
- Line 684-686: "In summary, this work investigating the role of SIMR granules in embryos, together with our previous study of SIMR foci in the germline (Chen and Phillips 2024), has identified a new mechanism for small RNA loading of nuclear Argonaute proteins in C. elegans". This statement appears overstated/incorrect since there is no evidence that SIMR-1 foci are required for sRNA loading of NRDE-3. The authors should emphasize other models, as suggested above.
We have revised the text on lines 869-871 to emphasize that SIMR granules regulate the localization of nuclear Argonaute proteins, rather than suggesting a direct role in controlling small RNA loading. We also edited the title, text, and legend for our model in Figure 9.
Reviewer #3 (Recommendations for the authors):
Issues to be addressed:
- The authors show a switch in 22G RNA binding by NRDE-3 during embryogenesis. While the data is convincing, it would be great if it could be tested if the preferred NRDE-3 replacement model is indeed correct. This could be done relatively easily by giving NRDE-3 a Dendra tag, allowing one to colour-switch the maternal WAGO-3 pool before the zygotic pool comes up. Such data would significantly enhance the manuscript, as this would allow the authors to follow the fate of maternal NRDE-3 more precisely, perhaps identifying a period of sharp decline of maternal NRDE-3.
We think the NRDE-3 Dendra tag experiment suggested by the reviewer is a clever approach and we will consider generating this strain in the future. However, we feel that optimization of the color-switching tag between the maternal germline and the developing embryos is beyond the scope of this manuscript. To partially address the question about NRDE-3 fate during embryogenesis, we examined the single-cell sequencing data of C. elegans embryos from the 1-cell to the 16-cell stage (Tintori et al, 2016, Dev Cell; visualization tool from the John I Murray lab). As shown in Author response image 3 Panel A below, the NRDE-3 transcript level increases as the embryo develops, indicating that zygotic NRDE-3 is being actively expressed starting very early in development. We hypothesize that maternal NRDE-3 will either be diluted as the embryo develops or actively degraded during early embryogenesis.
Author response image 3.
- Figure 3A: * should mark PGCs, but this seems incorrect. At the 8-cell stage there still is only one PGC (P4), not two, and at 100 cells there are only two, not three germ cells. Also, the identification of PGCs with a marker (PGL for instance) would be much more convincing.
We apologize for the confusion in Figure 3A. We changed the figure legend to clarify that the * indicates nuclear NRDE-3 localization in somatic cells for 8- and 100-cell stage embryos rather than the germ cells.
- Overall, the authors should address colocalization more robustly. In the current manuscript, just one image is provided, and often rather zoomed-out. How robust are the claims on colocalization, or lack thereof? With the current data, this cannot be assessed. Pearson correlation, combined with line-scans through a multitude of granules in different embryos will be required to make strong claims on colocalization. This applies to all figures (main and supplement) where claims on different granules are derived from.
We thank reviewer 3 for this important suggestion. To better address the colocalization, we included insets of individual granules in Figure 2D and Figure 4. We also performed colocalization analysis by calculating the Pearson’s R value between different groups of proteins in Figure 4B, to highlight that SIMR-1 colocalizes with ENRI-2, NRDE-3(HK-AA), RDE-3, and RRF-1, while CSR-1 colocalizes with EGO-1.
For the proteins that lack colocalization in Figure 4—figure supplement 1, we also added insets of individual granules. Additionally, we included a new set of panels showing SIMR-1 localization compared to tubulin::GFP (Figure 4—figure supplement 1I) in response to a recent preprint (Jin et al, 2024, BioRxiv), which finds NRDE-3 (expressed under a mex-5 promoter) associating with pericentrosomal foci and the spindle in early embryos. We do not see SIMR-1 (or NRDE-3, data not shown) at centrosomes or spindles in wild-type conditions but made a similar observation for SIMR-1 in a mut-16 mutant (Figure 4E). Each localization pattern was examined in at least 5 individual 100-cell-stage embryos, all of which showed the same pattern.
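For readers unfamiliar with the Pearson's R colocalization metric mentioned above, the following is a minimal sketch of the calculation on paired pixel intensities from two channels. The intensity values are hypothetical and chosen only for illustration, and the function name is ours; the software and settings the authors actually used are not stated here.

```python
import math


def pearson_r(channel_a, channel_b):
    """Pearson correlation between paired pixel intensities of two channels."""
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical flattened pixel intensities along a line scan through one granule.
protein_x = [10, 12, 80, 200, 190, 60, 11, 9]
overlapping = [14, 15, 70, 180, 210, 75, 13, 12]   # peaks in the same pixels
offset = [150, 190, 40, 12, 10, 11, 90, 160]       # peaks in different pixels

print(f"overlapping signal: R = {pearson_r(protein_x, overlapping):.2f}")
print(f"offset signal:      R = {pearson_r(protein_x, offset):.2f}")
```

Values near 1 indicate that the two signals rise and fall together, while values near 0 or below indicate little or no colocalization.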
- Figure 7: Its title is: Function of cytoplasmic granules. This is a much stronger statement than provided in the nicely balanced discussion. The role of the granules remains unclear, and they may well be just a reflection of activity, not a driver. While this is nicely discussed in the text, figure 7 misses this nuance. For instance, the title suggests function, and also the legend uses phrases like 'recruited to granule X'. If granules are the results of activity, 'recruitment' is really not the right way to express the findings. The nuance that is so nicely worded in the discussion should come out fully in this figure and its legend as well.
We have changed the title of Figure 7 (now Figure 9) to “Model for temporally- and developmentally-regulated NRDE-3 function” to deemphasize the role of the granules and to highlight the different functions of NRDE-3. Similarly, we have rephrased the text in the figure and legend and added some details about our new results.
Minor:
Typo: line 663 Acaris
We corrected the typo.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
(1) Adding microscopy of the untreated group to compare Figure 2A with would further strengthen the findings here.
First of all, we would like to thank Reviewer #1 for their comments and efforts on our manuscript. We have carefully revised it. We used a time-lapse method to capture images at 0 minutes, before any drugs were added. We will change '0 min' to 'untreated,' which will further strengthen the findings.
(2) Quantification of immune infiltration and histological scoring of kidney, liver, and spleen in the various treatment groups would increase the impact of Figure 4.
Thank you very much to Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. We conducted quantitative analysis of immune infiltration in the kidney, liver, and spleen across different treatment groups. However, due to the extremely low number of abnormal cells in the negative control, treatment, and prophylactic groups, neither the instrument nor manual methods could reliably gate the cells. Consequently, quantification of immune infiltration and histological scoring were not performed.
(3) The data in Figure 6 I is not sufficiently convincing as being significant.
We thank Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Previous research has shown that antibiotics and other drugs can cause alterations in gut microbiota. Therefore, we plan to study the effects of antibiotics on gut microbiota. To conduct this research, we need to isolate these microbes from the gut. Although this process is challenging, we still aim to explore the gut microbiota. If possible, we will continue to delve into interesting aspects of how antibiotics affect gut microbiota in future studies.
(4) Comparisons of the global transcriptomic analysis of the untreated group to the PC, LP, and LT groups would strengthen the author's claims about the immunological and transcriptomic changes caused by linalool and provide a true baseline.
We thank Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Due to the initial research design and data analysis strategy, we focused on comparisons among the PC, LP, and LT groups to more directly explore the differences under various treatment conditions. Specifically, while the transcriptomic data from the untreated group could provide a basic reference, it has limited relevance to the core hypotheses of our study. Our research aimed to investigate the immunological and transcriptomic changes among the treatment groups rather than comparing treated and untreated states. We believe that the current experimental design and data analysis have effectively revealed the mechanisms of linalool and that the additional comparisons among the treatment groups have further supported our conclusions. We hope the reviewer understands the rationale behind our experimental design. If there are additional suggestions, we are more than willing to further optimize the content of our manuscript.
Reviewer #2 (Public review):
(1) The authors have taken for granted that the readers already know the experiments/assays used in the manuscript. There was not enough explanation for the figures as well as figure legends.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We will provide more detailed explanations of the experiments and assays used in the manuscript, as well as enhance the descriptions in the figure legends, to ensure that readers have a clear understanding of the figures and their context.
(2) The authors missed adding the serial numbers to the references.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We will add serial numbers to the references to ensure proper citation and improve the clarity of our manuscript.
(3) The introduction section does not provide adequate rationale for their work, rather it is focused more on the assays done.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We will add a section to the introduction that provides a rationale for our work, specifically focusing on the impact of plant extract on immunoregulation.
(4) Full forms are missing in many places (both in the text and figure legends), also the resolution of the figures is not good. In some figures, the font size is too small.
We thank Reviewer #2 for their comments and efforts on our manuscript. We will ensure that all abbreviations are expanded where necessary, both in the text and figure legends. Additionally, we will improve the resolution of the figures and increase the font size where needed to enhance clarity.
(5) There is much mislabeling of the figure panels in the main text. A detailed explanation of why and how they did the experiments and how the results were interpreted is missing.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We will improve the labeling of the figure panels, provide detailed explanations of the experimental methods, including their rationale and interpretation, and clarify the connections between the methods.
(6) There is not enough experimental data to support their hypothesis on the mechanism of action of linalool. Most of the data comes from pathway analysis, and experimental validation is missing.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. In our manuscript, the transcriptomic data do not stand alone: we carried out many experiments to substantiate the changes inferred from the transcriptomic data, such as SEM, TEM, CLSM, molecular docking, RT-qPCR, and histopathological examinations. The detailed information is listed below.
As shown in Figure 2, we combined the transcriptomic data related to membranes and organelles with SEM, TEM, and CLSM images. Analyzing these data together with the microscopy observations, we found that the cell membrane may be a potential target for linalool.
As shown in Figure 3, we carried out molecular docking to explore the specific ribosomal proteins bound by linalool, since the transcriptomic data had identified the ribosome as a potential target of linalool.
As shown in Figure 5, the transcriptomic data indicated that linalool enhanced the host complement and coagulation system. To substantiate these changes, we carried out RT-qPCR to measure the expression of key immune-related genes, and found that the RT-qPCR results were consistent with the expression trends seen in the transcriptome analysis.
As shown in Figures 4 and 5, the transcriptomic data revealed that linalool promoted wound healing, tissue repair, and phagocytosis (Figure 5E). To confirm this, we carried out histopathological examinations, and found that linalool alleviated tissue damage caused by S. parasitica infection on the dorsal surface of grass carp and enhanced the healing capacity (Figure 4G).
Overall, we will conduct additional experiments to verify the mechanism of action of linalool in the future.
Reviewer #1 (Recommendations for the authors):
(1) Figure 1 Panel G is not referenced in the legend, this should be fixed
We thank Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1. The order of Panels F and G in Figure 1 was wrong. We have corrected the panel order in Figure 1.
(2) Statistical comparisons between groups in Figure 4 Panels C-F are lacking and should be added.
We thank Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 4C-F. We have added statistical comparisons between groups in Figure 4 Panels C-F.
(3) Capitalize Kidney label in Figure 4G.
We thank Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 4G. We have capitalized the K of kidney.
Reviewer #2 (Recommendations for the authors):
(1) The authors missed adding the serial numbers to the references. I could not go through the references to cross-check if they cited the right ones because it's extremely difficult to figure out which one corresponds to which reference number.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check the references. We have added the serial numbers to the references.
(2) In the last paragraph of the introduction section, most of the techniques in the paper were summarized which does not go with the flow of the paper. The introduction should not be focused on the different techniques used; the focus should be more on the rationale of the work. It would be nice if the last paragraph could be rewritten.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 85-94. We have added a section to the introduction that provides a rationale for our work, specifically focusing on the impact of plant extract on immunoregulation.
(3) The resolution of the figures is not good.
Thank you for your suggestion. We have revised it carefully. Please check all the figures. We have increased the resolution and size of all the figures.
(4) Mostly, the figure legends sound like results, with not enough explanation. Full forms are missing in many places which would make the readers go back to the text/other figures each time.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check the manuscript text and all the figure legends. We have added full names and abbreviations to both the manuscript and all the figure legends so that readers do not need to go back to the text or other figures each time.
(5) Figure 1:
Figure 1A: there is not enough explanation for this panel. It's not clear from the text which other EOs than Linalool are referred to here. Which EOs were extracted from daidai flowers?
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1A. Figure 1A is now divided into “Essential oils (EOs)” and “The main compounds of EOs” to make them easier to distinguish.
Figure 1B: do the three different wells of each set represent three replicates? If so, are they biological/technical replicates? Also, I'm not sure how the MFC was determined from this figure (line 116) because clearly this panel only corresponds to the determination of MICs, not MFCs.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 126-130. The three different wells of each set represent three biological replicates. After adding 5 μL of resazurin dye, when the color of the wells turned to pink, the linalool concentration in the first non-pink well corresponded to the MIC. The culture liquid in the wells where no mycelium growth was seen was marked onto the plate and incubated at 25°C for 7 days. The well with the lowest linalool concentration and no mycelium growth was identified as the MFC.
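The MIC and MFC readout described above can be expressed as a small selection rule. The sketch below is only an illustration: the dilution series and the pink/growth readouts are hypothetical, not the reported data, and it interprets "the first non-pink well" as the lowest concentration whose well stays non-pink.

```python
def lowest_clear_concentration(readouts):
    """Return the lowest concentration whose well shows no growth signal.

    `readouts` maps linalool concentration (% v/v) to True when a growth
    signal was seen (pink resazurin well, or mycelium after subculture).
    Returns None when every well shows growth.
    """
    clear = [conc for conc, grew in readouts.items() if not grew]
    return min(clear) if clear else None


# Hypothetical two-fold dilution series (% linalool); values are illustrative.
# Step 1: resazurin colour after incubation (True = well turned pink).
turned_pink = {0.4: False, 0.2: False, 0.1: False, 0.05: False, 0.025: True, 0.0125: True}
# Step 2: mycelium growth after plating the non-pink wells at 25 C for 7 days.
regrew_on_plate = {0.4: False, 0.2: False, 0.1: False, 0.05: True}

mic = lowest_clear_concentration(turned_pink)      # lowest non-pink well
mfc = lowest_clear_concentration(regrew_on_plate)  # lowest well with no regrowth

print(f"MIC = {mic}% linalool, MFC = {mfc}% linalool")
```

Because the MFC is read only from wells that already passed the MIC step, it can never be lower than the MIC.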
Figure 1C: the figure legend says that the effect of linalool on mycelium growth inhibition was done over a 6hr timepoint but according to the figure the timepoint was 60hr. I am also confused about the concentrations of linalool used. Although a range of concentration from 0 to 0.4% is mentioned, I only see the time vs diameter curves for 7 concentrations.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Line 983 and Figure 1C. We have changed 6 h to 60 h in the figure legend. Only 7 time vs diameter curves are visible in Figure 1C because 0.4%, 0.2%, and 0.1% linalool inhibit mycelial growth to the same extent, so their curves coincide. We now show the time vs diameter curves for the 0.4%, 0.2%, and 0.1% concentrations as three dotted lines of different colors and sizes in Figure 1C.
Figure 1D: mislabeled as 1G in the figure panel.
Figures 1E and 1G: Figure 1E is missing and I do not see any figure legend for Figure 1G.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1. The order of Panels F and G in Figure 1 was wrong. We changed the panel order of Figure 1 to A, B, C, D, E, F; there is no Figure 1G.
Overall, Figure 1 is very confusing and needs rewriting. Also, there is a need to add more explanation of the figure panels in the results section.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1. We have corrected all the problems in Figure 1. We have also added more explanation of the figure panels in the results section and clarified the connections between methods, in order to show how the experiments were carried out logically and how the results were interpreted. Please check Lines 126-130, 144-147, 174-179, 213-217, 343-345, and 677-682.
(6) Figure 2:
The authors could justify the reason for doing the experiments before moving into the results they got.
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check the methods and results in Lines 126-130, 144-147, 174-179, 213-217, 343-345, and 677-682. We have added more explanation of the figure panels in the results section and clarified the connections between methods, in order to show how the experiments were carried out logically and how the results were interpreted.
What concentration of linalool was used?
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 992-996. The mycelium treated with 6×MIC (0.3%) linalool was observed by confocal laser scanning microscopy (CLSM), and the mycelium treated with 1×MIC (0.05%) linalool was observed by scanning electron microscopy (SEM) and transmission electron microscopy (TEM).
The full form of DEGs has been mentioned later, but it should be mentioned in the figure legend of Figure 2 as this is the first time the term was used. Also, what is the full form of DEPs?
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 168, 175, 182, 631, 998, and 1001. The word DEPs in Figure 2I was incorrect, and we have changed DEPs to DEGs.
Is there a particular reason for looking into the cellular component rather than molecular function and biological processes in the GO analysis? (what I see is that Figure 2H indicates the prevalence of catalytic activity, binding, cellular, and metabolic processes as well). Also, there is not enough explanation of the observation from Figure 2I (both in the results section and figure legend).
We thank Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 174-179 and 998-1002 (Figure 2I). We looked at cellular components rather than molecular functions and biological processes in the GO analysis because we focused on effects on cell membranes and cell walls. These results are closely related to, and support, our scanning electron microscopy (SEM) and transmission electron microscopy (TEM) results. We have added further explanation of the observations from Figure 2I to the results section and the figure legend.
(7) Figure 3:
Figures 3A and 3B: The adjusted p value is already indicated in the figures, so there is no need to add statistical significance markers (asterisks) to each bar. The resolution for these panels is not good and the font is too small.
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Figures 3A and 3B. We have removed the statistical significance markers (asterisks) from Figures 3A and 3B. If possible, we will upload the clearest figures when the manuscript is published.
Figure 3C: the figure legend is missing (wrongly added as KEGG analysis, which should be network analysis). The numbering for the figure legends is wrong. What do the node sizes (5, 22, 40, 58) mentioned in the figure represent? Also, I wonder why ribosome biogenesis in eukaryotes has been indicated as the most enriched pathway despite its weaker connection to the other nodes.
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Figure 3C. Figure 3C is a KEGG analysis generated by software, not a network analysis. For the convenience of readers, we have made a new figure of the KEGG analysis.
Figure 3D: KEGG enrichment and GO analysis: global/local search? Which database was used as a reference?
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Lines 633-635. Functional enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. KEGG pathway analysis was conducted using Goatools.
Figure 3E: why were the RNA pol structures compared? The authors did not mention anything about this panel in their results.
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Line 207. We found that many DEGs related to ribosome biogenesis (Figure 3D) and RNA polymerase (Figure 3E) were downregulated. Because RNA polymerase is closely tied to ribosome biogenesis, its downregulation directly affects the synthesis of ribosome-related RNAs, including rRNA, mRNA, and tRNA, thereby inhibiting ribosome production. This relationship is particularly significant in cell growth, division, and the response to external environmental changes.
Figures 3F and 3G: please mention which model is illustrated (ribbon/sphere model).
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Lines 1010-1015. The tertiary structure of NOP1 is displayed using a cartoon representation. For the molecular docking of linalool with NOP1, the regions binding to the NOP1 activation pocket are enlarged to show the amino acid structures in detail, which are presented using a surface model, while the small molecule is displayed with a ball-and-stick representation.
Figure 3H: this panel needs more explanation. Why were some of the ABC transporters upregulated while some were downregulated?
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully. Microorganisms commonly adjust the expression of genes related to substance transport in response to different environmental stimuli to optimize their survival strategies. The expression of ATP-binding cassette (ABC) transporters can be upregulated or downregulated by various factors, such as environmental stimuli, metabolic demands, energy consumption, species specificity, and signaling molecules. This explains why some ABC transporters are upregulated while others are downregulated.
(8) Figure 4:
There was no statistical significance shown in the figures (D-F) which makes me wonder how they worked out that there was any significant increase/decrease, as mentioned in the text. What are the p values? What is the number of replicates? What concentration of linalool was used?
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Figure 4D-F. In this study, four groups were established: (1) Positive control (PC) group (10 fish infected with S. parasitica). (2) Linalool therapeutic (LT) group (10 fish infected with S. parasitica, soaked in 0.00039% linalool in a 20 L tank for 7 days). (3) Linalool prophylactic (LP) group (10 uninfected fish soaked in 0.00039% linalool in a 20 L tank for 2 days, followed by the addition of 1×10⁶ spores/mL secondary zoospores). (4) Negative control (NC) group (10 uninfected fish without linalool treatment). Each group had 3 replicate tanks. In each group, 8 fish were used for immunological assays; on day 7, blood samples were collected from the tail veins using heparinized syringes and left to coagulate overnight at 4°C. Kits from Nanjing Jiancheng Institute (Nanjing, China) were used to measure lysozyme (LZY), superoxide dismutase (SOD), and alkaline phosphatase (AKP) activity.
(9) Figure 5:
Again, the resolution and font size are off. Please mention the full forms of the terms used in the figure legend. The interpretation of the in vivo protective mechanism of linalool is completely based on GO enrichment and KEGG pathway analysis (also some transcriptional analysis). The only wet lab validation done was by checking the mRNA level of some cytokines but that does not necessarily validate what the authors claim.
Thank you for your suggestion. We have revised it carefully; please see all the figures and figure legends. We have increased the resolution and size of all the figures and used the full forms of the terms in the figure legends. If possible, we will upload the clearest figures when the manuscript is published. In aquaculture research, validation beyond mRNA quantification currently faces numerous challenges compared with model organisms such as mice and zebrafish, primarily because of the lack of available antibodies. For instance, antibodies against grass carp proteins have not yet been commercialized, which makes protein-level studies and validation significantly more difficult. However, we hope to design more experiments and validation tests in the future to gradually overcome these technical bottlenecks and provide stronger support for this research.
(10) Figure 6:
There is not enough explanation on why and how the experiments were done. It seems like the authors already presumed that the readers know the experiments. The interpretation of the PCA plot is not clear. Why are the quadrant sizes different? How was the heat map plotted? Also, the claim of linalool regulating the gut microbiota is only dependent on the correlation analysis and there is no wet lab validation for this. The data represented in this figure is not enough to prove their hypothesis and needs further investigation.
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Figure 6. We will improve the labeling of the figure panels, provide detailed explanations of the experimental methods, including their rationale and interpretation, and clarify the connections between the methods.
The goal of PCoA is to preserve the distance relationships between samples as much as possible through the principal coordinates, thereby revealing the differences or patterns in microbial composition among different groups. For example, in our study, PCoA analysis demonstrated that the microbial compositions of the positive control (PC), linalool prophylactic (LP), and linalool therapeutic (LT) groups showed significant differences in the reduced dimensional space, possibly indicating that these treatments had a notable impact on the microbial community.
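PCoA takes a matrix of pairwise distances between samples as its input. As a minimal sketch of that input only, the example below computes Bray-Curtis dissimilarities between samples. The choice of Bray-Curtis, the function names, and the abundance values are all assumptions made for illustration; the response does not state which distance metric or software settings were used.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Pairwise dissimilarities for a dict of {sample name: abundance vector}."""
    names = list(samples)
    return {(p, q): bray_curtis(samples[p], samples[q]) for p in names for q in names}


# Hypothetical abundance counts for three samples across three taxa.
samples = {
    "sample_a": [10, 0, 5],
    "sample_b": [6, 4, 5],
    "sample_c": [10, 0, 5],
}
distances = distance_matrix(samples)
print(distances[("sample_a", "sample_b")])
print(distances[("sample_a", "sample_c")])
```

A PCoA implementation would then place the samples in a low-dimensional space so that these pairwise values are preserved as closely as possible.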
In our study, the heatmap was generated using the Majorbio Cloud Platform. This platform visualized the preprocessed microbial community data, providing an intuitive representation of the differences in microbial composition and relative abundance among samples. The platform automatically performed steps such as data normalization, color mapping, and clustering analysis, offering convenience for data analysis and interpretation.
Previous research has shown that antibiotics and other drugs can alter the gut microbiota. Therefore, we plan to study the effects of antibiotics on gut microbiota. To conduct this research, we need to isolate these microbes from the gut. Although this process is challenging, we still aim to explore the gut microbiota, and if possible we will continue to investigate how antibiotics affect it in future studies.
(11) Figure 7:
This figure does not clarify how they did the interpretation. The in vivo study does not phenocopy their in vitro studies.
We thank Reviewer #2 for the comments and effort on our manuscript. We have carefully reviewed and confirmed the current experimental design and data analysis. Although we have not made any changes to Figure 7, we have further clarified the interpretation of the results in the revised manuscript, especially concerning the discrepancies between the in vivo and in vitro studies. We have added more experimental background information to help readers better understand the possible reasons for these differences. We hope the reviewer will accept our explanation, and we look forward to further feedback.
(12) Minor comments:
Line 61: what's meant by "et al"?
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Line 61. We have removed "et al".
Line 87-88: please add a citation referring to the earlier studies.
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Line 109.
Line 151-152: the term "related to" has been used a couple of times. Mentioning it once in the beginning and avoiding repeating the same word might be better.
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Lines 168-171. We have rewritten this paragraph to avoid repeating the phrase “related to”.
How did they reconstitute the EO compounds?
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully. Some of the EO compounds used in our experiments were extracted from essential oils in the laboratory, and the others were purchased from ThermoFisher (USA).
Line 544: needs explanation of how there was a 2-fold dilution in the concentrations shown in the figure compared to the concentrations mentioned here.
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully. The concentrations for the mycelium MIC assay were 0.8%, 0.4%, 0.2%, 0.1%, 0.05%, 0.025%, 0.0125%, and 0.00625%, and the concentrations for the spore MIC assay were 0.4%, 0.2%, 0.1%, 0.05%, 0.025%, 0.0125%, and 0.00625%. Figure 1B shows the MIC determination of linalool on spores, while the MIC determination on mycelium is not shown.
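The two concentration lists above are two-fold serial dilutions, each value being half of the previous one. A minimal sketch that reproduces them follows; the function name and the step counts passed to it are illustrative choices, not taken from the manuscript.

```python
def twofold_series(start_percent, steps):
    """Return `steps` concentrations (in %), each half of the previous one."""
    return [start_percent / (2 ** i) for i in range(steps)]


# Starting concentrations and lengths match the lists quoted in the response.
mycelium_series = twofold_series(0.8, 8)
spore_series = twofold_series(0.4, 7)

print(mycelium_series)
print(spore_series)
```

The spore series is the mycelium series without its highest concentration, which is consistent with the reviewer's observation of a two-fold offset between the two sets of values.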
Line 546: remove "were".
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Line 573. We have removed "were".
Line 555: what concentration of malachite green and tween 20 was used?
We thank Reviewer #2 for the comments and effort on our manuscript. We have revised it carefully; please see Lines 579-580. We used 2.5 mg/mL malachite green and 1% Tween 20.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer 1:
(1) Some conclusions are not completely supported by the present data, and at times the manuscript is disjoint and hard to follow. While the work has some interesting observations, additional experiments and controls are warranted to support the claims of the manuscript.
Thank you for the comments. We have revised some of the claims and conclusions to be more objective and better supported by the results.
(2) While the authors present compelling data that is relevant to the development of anti-bacterial vaccinations, the data does not completely match their assertions and there are places where some further investigation would further the impact of their interesting study.
We do not fully agree with the reviewer's comments. We have demonstrated that changes in CPS levels during infection are associated with pathogenesis, which will guide future studies on the underlying mechanisms. Studying those mechanisms requires a significant amount of effort and is beyond the scope of this research. We concur with the reviewer that assertions should be made cautiously until further studies are conducted. We have revised these assertions to align with the data and to avoid extrapolating the results (page 7, lines 126, 133-136; page 11, lines 216-218; page 13, line 264; and page 18, lines 378-383).
(3) The difference in the pathogenesis of a log phase vs. stationary phase intranasal infection would be interesting. Especially because the bacterium is a part of the natural microbial community of swine tonsils, it is curious if the change in growth phase and therefore CPS levels may be a causative reason for pathogenic invasion in some pigs.
S. suis is a part of the natural microbial community of swine tonsils but not of mouse NALT. It would be interesting to know whether CPS levels are low in pig tonsils, since CPS is hydrophilic and not conducive to bacterial adhesion. In this study, mice were infected i.n. with a high dose of the bacteria, which could increase opportunities for dissemination (acetic acid may not be a contributor, since results with or without it are similar). Entry of S. suis into other body compartments from pig tonsils might be triggered by other conditions, such as viral coinfection, nasal cavity inflammation, cold weather, and decreased immunity.
Experiments with pig blood and phagocytes have shown that genes involved in the synthesis of CPS are upregulated in pig blood. In contrast, these genes are downregulated [1]. In addition, the absence of CPS correlated with increased hydrophobicity and phagocytosis, suggesting that S. suis undergoes CPS phase variation, which could play a role in the different steps of S. suis infection [2]. We showed direct evidence of encapsulation modulation associated with S. suis pathogenesis in mice. A pig infection model is required to confirm these findings.
(4) The authors should consider taking the bacteria from NALT/CSF and blood and compare the lag times bacteria from different organs take to enter a log growth phase to show whether the difference in CPS is because S. suis in each location is in a different growth phase. If log phase bacteria were intranasally delivered, would it adapt a stationary phase life strategy? How long would that take?
What causes CPS regulation in vivo is not known. CPS levels change across culture stages, indicating that stress, such as nutrient levels, is one of the signals triggering CPS regulation. The microenvironment in the body compartments is far more complex than in vitro conditions: host cells, immune factors, and other factors may affect CPS regulation, individually or collectively. The reviewer's question is important, but the suggested experiment is impracticable because few bacteria can be recovered from organs, and culturing the bacteria in vitro would obliterate their in vivo status.
(5) Authors should be cautious about claims about S. suis downregulating CPS in the NALT for increased invasion and upregulating CPS to survive phagocytosis in blood. While it is true that the data shows that there are different levels of CPS in these locations, the regulation and mechanism of the recorded and observed cell wall difference are not investigated past the correlation to the growth phase.
We have toned down the claim, which now reads: “suggest a correlation between lower CPS in the NALT and a greater capacity for cellular association, whereas elevated CPS levels in the blood are linked to improved resistance against bactericidal activity. However, the mechanisms behind these associations remain unknown.” (page 7, lines 133-136).
(6) The mouse model used in this manuscript is useful but cannot reproduce the nasal environment of the natural pig host. It is not clear if the NALTs of pigs and mice have similar microbial communities and how this may affect the pathogenesis of S. suis in the mouse. Because the authors show a higher infection rate in the mouse with acetic acid, they may want to consider investigating what the mouse NALT microenvironment is naturally doing to exclude more bacterial invasion. Is it simply a host mismatch or is there something about the microbiome or steady-state immune system in the nose of mice that is different from pigs?
This is a very interesting comment. The mice used are specific-pathogen-free (SPF). The microenvironment in SPF mouse NALT should differ significantly from that of conventional pig tonsils. Although NALT in mice resembles pig tonsils in function, many factors may contribute to the sensitivity to S. suis colonization in the pig nasal cavity, such as the microbiome and the local steady-state immune system. The more complex microbiota in tonsils could be one of these factors. Comparing S. suis colonization of the tonsils in SPF and conventional pigs would be an ideal experiment to answer this question.
(7) I have some concerns regarding the images shown for neuroinvasion because I think the authors mistake several compartments of the mouse nasal cavity as well as the olfactory bulb. These issues are critical because neuroinvasion is one of the major conclusions of this work.
Thank you for your comments. The olfactory epithelium (OE) is located directly underneath the olfactory bulb in the olfactory mucosa area and lines approximately half of the nasal cavity. The remaining surface of the nasal cavity is lined by respiratory epithelium, which lacks neurons. The olfactory receptor neurons in the OE are stained green in the images by β-tubulin III, a neuron-specific marker. The respiratory epithelium is colorless due to the absence of nerve cells. Similarly, the green β-tubulin III staining identifies the olfactory bulb. The accuracy of the anatomic compartments of the mouse nasal cavity has been checked and confirmed by referring to the related literature [3, 4].
References
(1) Wu Z, Wu C, Shao J, Zhu Z, Wang W, Zhang W, Tang M, Pei N, Fan H, Li J, Yao H, Gu H, Xu X, Lu C. The Streptococcus suis transcriptional landscape reveals adaptation mechanisms in pig blood and cerebrospinal fluid. RNA. 2014 Jun;20(6):882-98.
(2) Charland N, Harel J, Kobisch M, Lacasse S, Gottschalk M. Streptococcus suis serotype 2 mutants deficient in capsular expression. Microbiology (Reading). 1998 Feb;144(Pt 2):325-332.
(3) Pägelow D, Chhatbar C, Beineke A, Liu X, Nerlich A, van Vorst K, Rohde M, Kalinke U, Förster R, Halle S, Valentin-Weigand P, Hornef MW, Fulde M. The olfactory epithelium as a port of entry in neonatal neurolisteriosis. Nat Commun. 2018;9(1):4269.
(4) Sjölinder H, Jonsson AB. Olfactory nerve--a novel invasion route of Neisseria meningitidis to reach the meninges. PLoS One. 2010 Nov 18;5(11):e14034.
Reviewer 2:
(1) However, there are serious concerns about data collection and interpretation that require further data to provide an accurate conclusion. Some of these concerns are highlighted below:
Both reviewers were concerned about some of the interpretations of the results. We modified the interpretations in related lines throughout the manuscript (Please see the related responses to Reviewer 1).
(2) In figure 2, the authors conclude that high levels of CPS confer resistance to phagocytic killing in blood exposed S. suis. However, it seems equally likely that this is resistance against complement mediated killing. It would be important to compare S. suis killing in animals depleted of complement components (C3 and C5-9).
We thank the reviewer for the comment. The experiment should be described as a bactericidal assay rather than an anti-phagocytosis killing assay. CPS is a main inhibitor of C3b deposition [1]. It interferes with complement-mediated and receptor-mediated phagocytosis as well as direct killing. For clarity, the data in Figure 2C are now expressed as “% of bacterial survival in whole blood” (page 8, Fig. 2C and page 23, lines 489-490).
(3) Intranasal administration of non-CPS antisera provides a nice contrast to intravenous administration, especially in light of the recently identified "blood-olfactory barrier". Can the authors provide any insight into how long and where this antibody would be located after intranasal administration? Would this be antibody-mediated cellular resistance, or something akin to simple antibody "neutralization"?
Anti-V5 may not stay long locally following intranasal administration. Efficient reduction of S. suis colonization in NALT supports that anti-V5 could recognize and neutralize the bacteria in NALT quickly, thereby reducing further dissemination in the body. Antibody-mediated phagocytosis may not play a major role because neutrophils are mainly present in the blood but not in the tissues.
(4) The micrographs in Figure 7 depict anatomy from the respiratory mucosa. While there is no histochemical identification of neurons, the tissues labeled OE are almost certainly not olfactory and in fact respiratory. However, more troubling is that in figures 7A,a,b,e, and f, the lateral nasal organ has been labeled as the olfactory bulb. This undermines the conclusion of CNS invasion, and also draws into question other experiments in which the brain and CSF are measured.
We understand the significance of your concerns and appreciate your careful review of Figure 7. The olfactory epithelium (OE) is situated directly beneath the olfactory bulb in the olfactory mucosa area and covers about half of the nasal cavity. This positioning allows information transduction between the olfactory bulb and the olfactory epithelium. The remaining surface of the nasal cavity is lined with respiratory epithelium, which does not contain neurons and primarily serves as a protective barrier. In contrast, the olfactory epithelium consists of basal cells, sustentacular cells, and olfactory receptor neurons. The olfactory receptor neurons are specifically stained green in the images using β-tubulin III, a marker that is unique to neurons. The respiratory epithelium appears colorless due to the lack of nerve cells. Similarly, the green staining with β-tubulin III also highlights the olfactory bulb. The anatomical structures indicated in the images are consistent with those described in the literature [2, 3], confirming that the anatomy of the nasal cavity has been accurately identified.
(5) Micrographs of brain tissue in 7B are taken from distal parts of the brain, whereas if olfactory neuroinvasion were occurring, the bacteria would be expected to arrive in the olfactory bulb. It's also difficult to understand how an inflammatory process would be developed to this point in the brain, even if we were looking at the appropriate region of the brain, within an hour of inoculation (is there a control for acetic acid induced brain inflammation?). Some explanations about the speed of the immune responses recorded are warranted.
Thank you for highlighting this issue. Cerebrospinal fluid (CSF) flows into the subarachnoid space surrounding the spinal cord and the brain. There are direct connections from this subarachnoid space to lymphatic vessels that wrap around the olfactory nerves as they cross the cribriform plate towards the nasal submucosa. This connection allows for the drainage of CSF into the nasal submucosal lymphatics in mice [4, 5]. Bacteria may utilize this CSF outflow channel in the opposite direction, which explains the development of brain inflammation in the distal areas of brain tissue adjacent to the subarachnoid space. We have included additional relevant information in the revised manuscript (page 16, lines 323-325).
(6) The detected presence of S. suis in the CSF 0.5hr following intranasal inoculation is difficult to understand from an anatomical perspective. This is especially true when the amount of S. suis is nearly the same as that found within the NALT. Even motile pathogens would need far longer than 0.5hr to get into the brain, so it's exceedingly difficult to understand how this could occur so extensively in under an hour. The authors are quantifying CSF as anything that comes out of the brain after mincing. Firstly, this should more accurately be referred to as "brain", not CSF. Secondly, is it possible that the lateral nasal organ, which is mistakenly identified as olfactory bulb in figure 7, is being included in the CNS processing? This would explain the equivalent amounts of S. suis in NALT and "CSF".
The high dose of inoculation used in the experiment may explain the rapid presence of S. suis in the CSF. Mice exhibit low sensitivity to S. suis infection, and the range for the effective intranasal infectious dose is quite narrow. Higher doses lead to the quick death of the mice, while lower doses do not initiate an infection at all. The dose used in this study is empirical and is intended to facilitate the observation of the progression of S. suis infection in mice.
The NALT tissue and CSF samples were collected separately. After the NALT tissue was obtained, the nasal portion was carefully separated from the rest of the head along the line of the eyeballs. The brain tissue was then extracted from the remaining part of the head to collect the CSF; it was lacerated to expose the subarachnoid space but was not minced. This procedure aims to preserve the integrity of the brain tissue as much as possible. Further details about the CSF collection process can be found in the Materials and Methods section (page 24, lines 508-512).
(7) To support their conclusions about neuroinvasion along the olfactory route and /CSF titer the authors should provide more compelling images to support this conclusion: sections stained for neurons and S. suis, images of the actual olfactory bulb (neurons, glomerular structure etc).
Thank you. We respectfully disagree with the reviewer. We stained neurons using a neuron-specific marker to identify the anatomical structures of the olfactory bulb and olfactory epithelium (in green). We used an S. suis-specific antibody to highlight the bacteria present in these areas (in orange and red). The images, along with the bacteria found in the cerebrospinal fluid (CSF) and the brain inflammation observed early in the infection, strongly support our conclusion regarding brain invasion through the olfactory pathway. Please see the response to question 4 for further clarification.
References
(1) Seitz M, Beineke A, Singpiel A, Willenborg J, Dutow P, Goethe R, Valentin-Weigand P, Klos A, Baums CG. Role of capsule and suilysin in mucosal infection of complement-deficient mice with Streptococcus suis. Infect Immun. 2014 Jun;82(6):2460-71.
(2) Sjölinder H, Jonsson AB. Olfactory nerve--a novel invasion route of Neisseria meningitidis to reach the meninges. PLoS One. 2010 Nov 18;5(11):e14034.
(3) Pägelow D, Chhatbar C, Beineke A, Liu X, Nerlich A, van Vorst K, Rohde M, Kalinke U, Förster R, Halle S, Valentin-Weigand P, Hornef MW, Fulde M. The olfactory epithelium as a port of entry in neonatal neurolisteriosis. Nat Commun. 2018;9(1):4269.
(4) Yoon JH, Jin H, Kim HJ, Hong SP, Yang MJ, Ahn JH, Kim YC, Seo J, Lee Y, McDonald DM, Davis MJ, Koh GY. Nasopharyngeal lymphatic plexus is a hub for cerebrospinal fluid drainage. Nature. 2024 Jan;625(7996):768-777.
(5) Spera I, Cousin N, Ries M, Kedracka A, Castillo A, Aleandri S, Vladymyrov M, Mapunda JA, Engelhardt B, Luciani P, Detmar M, Proulx ST. Open pathways for cerebrospinal fluid outflow at the cribriform plate along the olfactory nerves. EBioMedicine. 2023 May;91:104558.
Response to Recommendations for the authors:
Reviewer 1:
Minor concerns for the manuscript:
(1) In the introduction, please consider giving a little more background about the bacteria itself and how it causes pathogenesis.
We appreciate your suggestion. We have included additional background on the virulence factors and the pathogenesis of the bacteria in the introduction to enhance understanding of the results (page 4, lines 63-69).
(2) Figure 2C would be more correct to say percent survival as the CFUs before and after are what are being compared and not if the bacteria is being phagocytosed or not. Flow cytometry of the leukocytes and a fluorescent S. suis would show phagocytosis. Unless that experiment is performed, the authors cannot claim that there is a resistance to phagocytosis.
Thank you for your feedback. We agree with the reviewer that the experiment should be described as a bactericidal assay rather than an anti-phagocytosis killing assay. CPS interferes with complement-mediated and receptor-mediated phagocytosis as well as direct killing. To enhance clarity, the data in Fig. 2C are now presented as “% of bacterial survival in whole blood” (page 8).
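As the reviewer notes, the quantity in question compares CFUs before and after exposure to whole blood. A minimal sketch of that calculation follows; the function name and the CFU values are hypothetical and are not data from the study.

```python
def percent_survival(cfu_before, cfu_after):
    """Percentage of bacteria recovered after incubation, relative to the input CFUs."""
    if cfu_before <= 0:
        raise ValueError("cfu_before must be positive")
    return 100.0 * cfu_after / cfu_before


# Hypothetical counts for illustration only.
print(percent_survival(1_000_000, 250_000))  # 25.0
```

A value computed this way describes overall survival in blood and does not by itself distinguish phagocytic killing from complement-mediated or other direct killing.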
(3) There are two different legends present for Figure 1. Please resolve.
We apologize for the oversight. The redundant figure legend has been removed (page 6).
(4) There are places such as in lines 194-195, that there are assertions and interpretations about the data that are not directly drawn from the data. These hypotheses are valuable, but please move them to the discussion.
Thank you for your suggestion. The hypothesis has been moved to the Discussion section (page 19, lines 402 - 405).
(5) In Figure 4B, higher resolution images would strengthen the ability of non-microbiologists to see the differences in CPS levels in the cell wall.
We achieved the highest resolution possible for clearer distinctions in CPS levels. To enhance the visualization of the different CPS levels in the images, we revised the description of the CPS changes in Figure 4B within the results section (page 11, lines 208-213).
(6) In Figure 5 there is no D. Further, the schematics throughout would be easier to parse with the text if the challenge occurred at time 0. Consider revising them for clarity.
Thank you for highlighting the error. We have removed "i.v + i.n (Fig. 5)" from Figure 5A and made adjustments to the schematic illustrations in Figures 5 and 6 as recommended by the reviewer (page 14).
(7) What is the control for the serum? The findings for figures 5 and 6 would be much stronger if a non-S. suis isotype control serum was also infused.
We used a naive serum as a control to avoid interference from a non-S. suis isotype control that targets other surface molecules of S. suis serotypes.
(8) Figure 6 legend does not include the anti-CPS treatment.
Thank you. We have added anti-CPS serum in the legend (page 15, line 249).
(9) Figure 7 legend does not include the time point for panel 7A.
Thank you. The time point is shown in Fig. 7A (page 17).
(10) Figure 7 should show OB micrographs or entire brain including the OB.
The neuron-specific marker, β-tubulin III, identifies the neuronal cells in the olfactory bulb (OB) as shown in Fig. 7A. Unfortunately, we were unable to provide an image of the entire brain that includes the OB due to limitations in our section preparation. We apologize for the mislabeled structure in Fig. 7A, which may have caused confusion. We have corrected the labeling for consistency (see page 15, lines 257-260). Additionally, we included a drawing of the sagittal plane of the rodent's nose, depicting the compartments of the OB, olfactory epithelium (OE), nasal cavity (NC), and brain. This illustration, presented in Fig. 7B on page 17, aims to clarify the structural and functional connections between the nasopharynx and the CNS.
(11) Some conclusions may be better drawn if figures were to be consolidated. As noted above, the data at times feels disjointed and the importance is more difficult for readers to follow because data are presented further apart. Particularly figures 5 and 6 which are similar with different time points and controls of antisera administrative routes; placing these figures together would be an example of increasing continuity throughout the paper.
Thank you for the valuable suggestion. Figures 5 and 6, along with their related descriptions in the results section, have been combined for better cohesiveness (pages 14-15).
Reviewer #2:
To support their conclusions about neuroinvasion along the olfactory route and /CSF titer the authors should provide more compelling images to support this conclusion: sections stained for neurons and S. suis, images of the actual olfactory bulb (neurons, glomerular structure etc).
Please refer to our responses to Reviewer 1's Question 7, Reviewer 2's Questions 4 and 7 in the public reviews, and Reviewer 1's Question 10 in the authors' recommendations.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Reviewer #1 (Public review):
The authors have strengthened their conclusions by providing additional information about the specificity of their antibodies, but at the same time the authors have revealed concerning information about the source of their antibodies.
It appears that many of the antibodies used in this study have been discontinued because the supplier company was involved in a scandal of animal cruelty and all their goats and rabbits Ab products were sacrificed. The authors acknowledge that this is unfortunate but they also claim that the issue is out of their hands.
The authors' statement is false; the authors ought to not use these antibodies, just as the providing company chose to discontinue them, as those antibodies are tied to animal cruelty. The issue that the authors feel OK with using them is of concern. In short, please remove any results from unethical antibodies.
Removal of such results also best serves science. That is, any of their results using the discontinued antibodies means that the authors' results are non-reproducible and we should be striving to publish good, reproducible science.
For the antibodies that do not have unethical origins the authors claim that their antibodies have been appropriately validated, by "testing in positive control tissue and/or Western blot or in situ hybridization". This is good but needs to be expanded upon. It is a strong selling point that the Abs are validated and I want to see additional information in their Supplementary Table 2 stating for each Ab specifically:
(1) What +ve control tissue was used in the validation of each Ab and which species that +ve control came from. Likewise, if competition assays to confirm validity were used, please also specify.
(2) Which assay was the Ab validated for (WB, IHC, ELISA, all etc)
(3) For Antibodies that were validated for, or using WBs please let the reader know if there were additional bands showing.
(4) Include references to the literature that supports these validations. That is, please make it easy for the reader to appreciate the hard work that went into the validation of the Antibodies.
Finally, for the Abs, when the authors write that "All antibodies used have been validated by testing in positive control tissue and/or Western blot or in situ hybridization" I fail to understand what in situ hybridisation means in this context. I am under the impression that in situ hybridisation is some nucleic acid -hybridising-to-organ or tissue slice. Not polypeptide binding.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Remove results that have been obtained by unethically-sourced antibody reagents.
Strengthen the readers' confidence about the appropriateness & validity of your antibodies.
First, we want to stress that reviewer 1 has raised his critique related to the use of antibodies from Santa Cruz Biotechnology not only through the journal. The head of our department and two others were contacted by reviewer 1 directly, without going through the journal or informing or approaching the corresponding or first author. It is our opinion that this debate and critique should be handled through the journal and editorial office, not with people who have no actual involvement in the project.
It is correct that we have purchased mouse, rabbit and goat antibodies from Santa Cruz Biotechnology, as stated in the correspondence with the reviewer.
As stated in our previous rebuttal – the goat antibodies from Santa Cruz were discontinued due to inadequate treatment of goats after settling with the authorities in 2016.
https://www.nature.com/articles/nature.2016.19411
https://www.science.org/content/blog-post/trouble-santa-cruz-biotechnology
We have used 11 mouse, rabbit or goat antibodies from Santa Cruz Biotechnology in the manuscript, as listed in Supplementary Table 2. All of them have been carefully validated in other control tissues, supported by ISH and/or WB, and many have already been used in several publications by our group (https://pubmed.ncbi.nlm.nih.gov/34612843/, https://pubmed.ncbi.nlm.nih.gov/33893301/, https://pubmed.ncbi.nlm.nih.gov/32931047/, https://pubmed.ncbi.nlm.nih.gov/32729975/, https://pubmed.ncbi.nlm.nih.gov/30965119/, https://pubmed.ncbi.nlm.nih.gov/29029242/, https://pubmed.ncbi.nlm.nih.gov/23850520/, https://pubmed.ncbi.nlm.nih.gov/23097629/, https://pubmed.ncbi.nlm.nih.gov/22404291/, https://pubmed.ncbi.nlm.nih.gov/20362668/, https://pubmed.ncbi.nlm.nih.gov/20172873/) and by other research groups. All antibodies used in this manuscript were purchased before the mistreatment of goats became publicly known, which happened several years later.
We do not support animal cruelty in any way, but the antibodies were purchased from Santa Cruz Biotechnology long before the mistreatment was reported. Moreover, antibodies from Santa Cruz Biotechnology are used in thousands of publications annually. The company has been punished for its misconduct and has subsequently been granted permission by the relevant authorities to produce antibodies again.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Despite the study being a collation of important results likely to have an overall positive effect on the field, methodological weaknesses and suboptimal use of statistics make it difficult to give confidence to the study's message.
Strengths:
Relevant human and mouse models approached with in vivo and in vitro techniques.
Weaknesses:
The methodology, statistics, reagents, analyses, and manuscripts' language all lack rigour.
(1) The authors used statistics to generate P-values and Rsquare values to evaluate the strength of their findings.
However, it is unclear how stats were used and/or whether stats were used correctly. For instance, the authors write: "Gaussian distribution of all numerical variables was evaluated by QQ plots". But why? For statistical tests that fall under the umbrella of General Linear Models (like ANOVA, t-tests, and correlations (Pearson's)), there are several assumptions that ought to be checked, including typically:
(a) Gaussian distribution of residuals.
(b) Homoskedasticity of the residuals.
(c) Independence of Y, but that's assumed to be valid due to experimental design.
So what is the point of evaluating the Gaussian distribution of the data themselves? It is not necessary. In this reviewer's opinion, it is irrelevant, not a good use of statistics, and we ought to be leading by example here.
Additionally, it is not clear whether the homoscedasticity of the residuals was checked. Many of the data appear to have particularly heteroskedastic residuals. In many respects, homoscedasticity matters more than the normal distribution of the residuals. In GraphPad analyses, if ANOVA is used and equal variances are assumed when the variances among groups are in fact unequal, then the standard deviations assigned to each group will be wrong and thus incorrect P values are calculated.
Based on the incomplete and/or wrong statistical analyses it is difficult to evaluate the study in greater depth.
We agree with the reviewer that we should lead by example and improve clarity on the use of the different statistical tests and their application. In response to the reviewer’s suggestion, we have extended the statistical section, focusing on the analyses used, and have specified the statistical test used in the legend of each figure. We also checked for Gaussian distribution and homoskedasticity of residuals before conducting a general linear model test, and this has now been specified in the revised manuscript. Where the assumptions were not met, we have specified which non-parametric test was used.
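The residual checks discussed in this exchange can be sketched as follows. This is a minimal illustration, assuming a one-way ANOVA design with two groups; the data values and the function names `anova_residuals` and `brown_forsythe_f` are hypothetical and are not taken from the manuscript or from GraphPad. The residuals (value minus group mean) are what a Q-Q plot should inspect, and the Brown-Forsythe statistic is one common way to probe unequal variances.

```python
from statistics import mean, median

# Hypothetical measurements for two groups (illustrative values, not study data).
control = [1.0, 1.1, 0.9, 1.0, 1.05]
treated = [0.2, 2.5, 4.0, 0.5, 3.1]


def anova_residuals(groups):
    """Return one-way ANOVA residuals: each value minus its own group mean."""
    return [[y - mean(g) for y in g] for g in groups]


def brown_forsythe_f(groups):
    """Brown-Forsythe statistic: a one-way ANOVA F ratio computed on the
    absolute deviations of each value from its group median."""
    devs = [[abs(y - median(g)) for y in g] for g in groups]
    k = len(devs)
    n_total = sum(len(d) for d in devs)
    grand_mean = mean([z for d in devs for z in d])
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in devs) / (k - 1)
    within = sum((z - mean(d)) ** 2 for d in devs for z in d) / (n_total - k)
    return between / within


# Residuals are the quantity to check for normality (for example with a Q-Q plot).
residuals = anova_residuals([control, treated])
# Compare this value against an F(k - 1, N - k) table; large values suggest
# that the equal-variance assumption does not hold.
f_stat = brown_forsythe_f([control, treated])
print(residuals)
print(f_stat)
```

If the statistic is large, a test that does not assume equal variances, or a non-parametric alternative, is usually the safer choice.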
While on the subject of stats, it is worth mentioning this misuse of statistics in Figure 3D, where the authors added the Slc34a1 transcript levels from controls in the correlation analyses, thereby driving the intercept down. Without the Control data there does not appear to be a correlation between the Slc34a1 levels and tumor size.
We agree with the reviewer that a correlation analysis is inappropriate here and have removed this part of the figure.
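The reviewer's concern, that pooling a separate control cluster with the tumor samples can create a correlation that is absent within the tumors, can be illustrated with a small sketch. All numbers below are hypothetical and chosen only to make the effect visible; `pearson_r` is a plain textbook implementation, not the authors' analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical tumor samples: transcript level (x) versus tumor size (y),
# constructed so that there is no linear relationship within the group.
tumor_x = [1.0, 2.0, 3.0, 4.0, 5.0]
tumor_y = [3.0, 5.0, 4.0, 5.0, 3.0]

# Hypothetical controls: high transcript level and, by definition, no tumor.
control_x = [10.0, 11.0, 12.0]
control_y = [0.0, 0.0, 0.0]

r_tumor_only = pearson_r(tumor_x, tumor_y)
r_pooled = pearson_r(tumor_x + control_x, tumor_y + control_y)
print(r_tumor_only, r_pooled)
```

In this construction the pooled coefficient is strongly negative even though the tumor-only coefficient is zero, so the apparent correlation reflects the difference between groups rather than a trend within the tumors.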
There is more. The authors make statements (e.g. in the figure legends) such as: "Correlations indicated by R2." What does that mean? In a simple correlation, the P value is used to evaluate the strength of the slope being different from zero. The authors also give R2 values for the correlations but they do not provide R2 values for the other stats (like ANOVAs). Why not?
We agree with the reviewer and have replaced the R2 values with the Pearson correlation coefficient in combination with the P value.
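For readers comparing the two quantities: in a simple linear regression with one predictor, R2 equals the square of the Pearson coefficient r, so r carries the same information plus the sign of the association. The sketch below checks this numerically on hypothetical data; the function names are illustrative only.

```python
from math import sqrt

# Hypothetical paired observations (not study data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


def r_squared_of_fit(xs, ys):
    """R2 of an ordinary least-squares line: 1 - SS_residual / SS_total."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum(
        (a - mx) ** 2 for a in xs
    )
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(xs, ys))
    ss_tot = sum((b - my) ** 2 for b in ys)
    return 1 - ss_res / ss_tot


r = pearson_r(x, y)
r2 = r_squared_of_fit(x, y)
print(r, r2, r ** 2)  # the last two values agree up to rounding
```

Flipping the sign of y leaves R2 unchanged but makes r negative, which is one reason to report r together with the P value.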
(2) The authors used antibodies for immunos and WBs. I checked those antibodies online and it was concerning:
(a) Many are discontinued.
Many of the antibodies we have used were from the major antibody provider Santa Cruz Biotechnology (SCBT). SCBT was involved in a scandal of animal cruelty and all their goats and rabbits were sacrificed, which explains why several antibodies were discontinued, while the mice antibodies were allowed to continue. This is unfortunate but out of our hands.
(b) Many are not validated.
We agree with the reviewer that antibody validation is essential. All antibodies used in this manuscript have been validated. The minimal validation has been to evaluate cellular expression in positive control tissue for instance bone, kidney, or mamma. Moreover, many of the antibodies have been used and validated in previous publications (doi: 10.1593/neo.121164, doi:10.1096/fj.202000061RR, doi: 10.1093/cvr/cvv187) including knockout models. Moreover, many antibodies but not all have been validated by western blot or in situ hybridization. We have included the following in the Materials and Methods section: “All antibodies used have been validated by testing in positive control tissue and/or Western blot or in situ hybridization”.
(c) Many performed poorly in the Immunos, e.g. FGF23, FGFR1, and Klotho are not really convincing. PO5F1 (gene: OCT4) is the one that looks convincing as it is expressed at the correct cell types.
We fail to understand the criticism raised by the reviewer regarding the specificity of these specific antibodies. We believe the FGF23 and Klotho antibodies are performing exceptionally well, and FGFR1 is abundantly expressed in many cell types in the testis. As illustrated in Figure 2E, the expression of Klotho, FGF23, and FGFR1 is very clear, specific, and convincing. FGF23 is not expressed in normal testis – which is in accordance with no RNA present there either. However, it is abundantly expressed in GCNIS where RNA is present. On the other hand, Klotho is abundantly expressed in germ cells from normal testis but not expressed in GCNIS.
(d) Others like NPT2A (product of gene SLC34A1) are equally unconvincing. Shouldn't the immuno show them to be in the plasma membrane?
If there is some brown staining, this does not mean the antibodies are working. If your antibodies are not validated then you ought to omit the immunos from the manuscript.
We acknowledge your concerns regarding the NPT2A, NPT2B, and NPT2C staining. While the NPT2A antibody is performing well, we understand your reservations about the other antibodies. It is worth noting that NPT2A is not expressed in normal testis (no RNA either) but is expressed in GCNIS, where the RNA is also present. Although it is typically present in the plasma membrane, cytoplasmic expression can be acceptable, as membrane availability is crucial for regulating NPT2A function, particularly in the kidney, where FGF23 controls membrane availability. We are currently involved in a comprehensive study exploring these phosphate transporters in the organs lining the male reproductive tract. In functional animal models, we have observed very specific staining with this NPT2A antibody following exposure to high phosphate or FGF23. Additionally, we are conducting Western blot analyses with this antibody, which reinforces our belief that the antibody binds specifically.
Reviewer #2 (Public Review):
Summary:
This study set out to examine microlithiasis associated with an increased risk of testicular germ cell tumors (TGCT). This reviewer considers this to be an excellent study. It raises questions regarding exactly how aberrant Sertoli cell function could induce osteogenic-like differentiation of germ cells but then all research should raise more questions than it answers.
Strengths:
Data showing the link between a disruption in testicular mineral (phosphate) homeostasis, FGF23 expression, and Sertoli cell dysfunction, are compelling.
Weaknesses:
Not sure I see any weaknesses here, as this study advances this area of inquiry and ends with a hypothesis for future testing.
We thank the reviewer for the acknowledgment and highlighting that this is an important message that addresses several ways to develop testicular microlithiasis, which indicates that it is not only due to malignant disease but also frequent in benign conditions.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
I applaud the authors' approach to nomenclature for rodent and human genes and proteins (italicised for genes, all caps for humans, capitalised only for rodents, etc), but the authors frequently got it wrong when referring to genes or proteins. A couple of examples include:
(1) SLC34A1 (italics) refers to the gene (correct use by the authors) but then again the authors use e.g. SLC34A1 (not italics) to refer to the protein product of the SLC34A1 (italics) gene. In fact, the protein product of the SLC34A1 (italics) gene is called NPT2A (non-italics).
(2) OCT4 (italics) refers to the gene (correct use by the authors) but then again the authors use e.g. OCT4 (not italics) to refer to the protein product of the OCT4 (italics) gene. In fact, the protein product of the OCT4 (italics) gene is called PO5F1 (non-italics).
The problem with their incorrect and inconsistent nomenclature is widespread in the manuscript making further evaluation difficult.
Please consult a reliable protein-based database like Uniprot to derive the correct protein names for the genes. You got NANOG correct though.
We thank the reviewer for addressing this important point. We have corrected the nomenclature throughout the manuscript as suggested.
(3) The authors use the word "may" too many times. Also often in conjunction with words like "indicates", and "suggests". Examples of phrases that reflect that the authors lack confidence in their own results, conclusions, and understanding of the literature are:
"...which could indicate that the bone-specific RUNX2 isoform may also be expressed... "
"...which indicates that the mature bone may have been..."
Are we shielding ourselves from being wrong in the future because "may" also means "may not"? It is far more engaging to read statements that have a bit more tooth to them, and some assertion too. How about turning the above statements around, to :
"...which shows that the bone-specific RUNX2 isoform is also expressed... "
"...which reveals that the mature bone were..."
...then revisit ambiguous language ("may", "might" "possibly", "could", "indicate" etc.) throughout the manuscript?
It's OK to make a statement and be found wrong in the future. Being wrong is integral to Science.
Thank you for addressing this. We agree with the reviewer that it is fair to be more direct and have revised many of these vague phrases throughout the manuscript.
(4) The authors use the word "transporter" which in itself is confusing. For instance, is SLC34A1 an importer or an exporter of phosphate? Or both? Do SLC34As move phosphate in or out of the cells or cellular compartments? "Transporter" sounds too vague a word.
We understand that it might be easier for the reader with the term "importer". However, we should use the specific nomenclature or "wording" that applies to these transporters. The exact terminology is a co-transporter or sodium-dependent phosphate cotransporter as reported here (doi: 10.1152/physrev.00008.2019). Thus, we will use the terms “co-transporter” and “transporter” throughout the revised manuscript.
-
-
www.medrxiv.org www.medrxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
We would like to remind the editors and reviewers that the present project is a pilot study that does not claim to produce definitive results. Pilot studies are exploratory preliminary studies to test the validity of hypotheses, the feasibility of a study as well as the research methods and the study design. From our point of view, our hypotheses and the feasibility of the pilot study have been confirmed to such an extent that the implementation of a larger study is justified. At the same time, it became clear during the pilot that the methods and design need to be adapted in some areas in order to increase the reliability of the results - a finding that pilot studies are usually conducted to obtain. We discussed these limitations in detail in order to explain the planned changes in the follow-up study. What the reviewers and editors interpret as incompleteness is therefore due to the nature of a pilot study. We consider it necessary that appropriate standards are taken into account in the evaluation of the present work.
In addition, we would like to make a counterstatement as to what our main claims, which should be used to assess the strength of evidence, are - and what they are not:
In the introduction, we describe the background that led to the formation of our hypotheses: Previous animal and human studies show that food, along with light, serves as the main Zeitgeber for circadian clocks. It has also been shown that chrononutrition can lead to weight loss and improved well-being. Based on this, we hypothesized that individualized meal timing can enhance these positive effects. This hypothesis has been validated on the basis of the available results. Contrary to what the editors and reviewers stated, the assumption that the observed beneficial effects are indeed related to an alteration or resetting of endogenous circadian rhythms was not intended to be investigated in this study and is not one of our main claims. This has already been sufficiently demonstrated and, in our view, need not and should not be repeated in every study on chrononutrition. Accordingly, this assumption was not formulated as a working hypothesis or main claim. It is described in the paper as a potential mechanism, the assumption of which is justified on the basis of previous studies. The lack of a corresponding examination and the erroneous insinuation that corresponding results were nevertheless listed by us in the paper as a main claim should therefore not be used as a criterion for downgrading the assessment of the strength of evidence.
The main criticism of our study is the collection of data using self-reported food and food quantities. This form of data collection is indeed prone to error, as there is little control over the accuracy of the reported data. However, we believe that this problem is limited in scope.
(1) Contrary to what the editors and reviewers claim, at no point do we write that we are convinced that food intake has not changed. On the contrary, in Figure 2 we explicitly show that there was a change in what some participants reported to us regarding their food intake. We make it clear throughout the text that we could not find any correlation between weight change and the changes in the reports of food quantities/meals. These statements are correct and only what are actual and formulated main claims should be included in the evaluation of the study.
(2) As previously stated, we conducted analyses that suggest that an unreported reduction in food intake is unlikely to be the cause of weight loss. For the most part, participants did not change their reporting behavior during the exploration and intervention phases. That is, participants who underreported food intake reported similar amounts in both phases of the study, but lost weight only in the intervention phase. To explain their weight loss with imprecise reporting, it would have to be assumed that these participants began to eat less in the intervention phase and at the same time report more in order to achieve similar calorie counts and food composition in the evaluation. We consider such behavior to be very unlikely, especially since it would apply to numerous participants.
(3) The editors and reviewers reduce the results to the absence of a correlation between weight loss and reported food quantity and composition. In their assessment of the significance of the findings, however, they ignore the fact that we did find a significant correlation in our analyses, namely between weight loss and an increase in the regularity of food intake. There is no correlation between an increase in regularity and a reduction in reported calories (R² = 0.01472). This is credible in our view, as it is unlikely that the more regularly participants ate, the more pronounced the error in their reports was (while in reality they ate less than before).
(4) We also had the requirement for the study design that the participants could carry out the intervention in their normal everyday life and environment in order to test and ensure implementation in real life. We consider it unrealistic to be able to monitor food intake continuously and without interruption over a period of several weeks under these conditions. We therefore see no alternative to self-reporting. As the reviewers and editors did not suggest any alternative methods of data collection that would fulfil the requirements of our study, we assume that, despite criticism and reservations, they generally agree with our assessment and take this into account in their evaluation.
It is still criticized that some confounding factors are present. The reviewer makes no reference to the fact that we either eliminated these in the last version submitted (age range), identified them as unproblematic (unmatched cohorts, menstrual cycle, shift work) or even deliberately used them in order to be able to test our hypothesis more validly (inclusion of individuals with normal weight, overweight, and obesity).
Besides, the use of actimeters to determine circadian rhythms as proposed by the editors and reviewers is not valid for this study and the requirement to use them to determine a circadian reset in the eLife assessment is misleading and inappropriate. This instrument only measures physical activity, but not the physiological parameters that are relevant for an investigation in this field of research.
For the assessment of chronotype alone, the MCTQ questionnaire is a valid instrument that has been validated several times against actimetry (e.g., DOIs: 10.1080/07420528.2022.2025821, 10.1080/07420528.2023.2202246, 10.1016/j.ijpsycho.2016.07.433, 10.1155/2018/5646848). The reviewer's statement that the MCTQ questionnaire is unreliable for determining chronotype is unsupported and incorrect.
Equally unproven is the statement that any form of imposed diet appears to lead to weight loss over a period of several months.
Nevertheless, in order to prevent further misunderstandings, we have revised our text in a number of places and clarified that our statements are not irrefutable assertions, but potential interpretations of the results obtained in the pilot study, which are to be analyzed in more detail with regard to the planned more comprehensive study.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors found that IL-1b signaling is pivotal for hypoxemia development and can modulate NETs formation in LPS+HVV ALI model.
Strengths:
They used IL1R1 ko mice and proved that IL1R1 is involved in ALI model proving that IL1b signalling leads towards ARDS. In addition, hypothermia reduces this effect, suggesting a therapeutic option.
We thank the Reviewer for recognizing the strengths of our study and their positive feedback.
Weaknesses:
(1) IL1R1 binds IL1a and IL1b. What would be the role of IL1a in this scenario?
Thank you for asking this question. We addressed this in our previous paper (Nosaka et al. Front Immunol 2020;11; 207), where we used anti-IL-1a treatment and IL-1a KO mice in our model and found that neither anti-IL-1a-treated mice nor IL-1a KO mice were protected. Thus, IL-1b, but not IL-1a, plays a role in inducing hypoxemia during LPS+HVV. We will now add this point to the discussion of our revised manuscript.
(2) The authors depleted neutrophils using anti-Ly6G. What about MDSCs? Are these latter cells involved in ARDS and VILI?
Anti-Ly6G neutrophil depletion may potentially affect G-MDSCs as well (Blood Adv 2022 Jul 29;7(1):73–86); however, we have not looked directly at G-MDSCs. If these cells had been depleted, we would have expected to see an increase in inflammation, which we did not. Instead, anti-Ly6G-treated mice were protected. Thus, we cannot comment on any presumed role of G-MDSCs in the LPS+HVV-induced severe ALI model that we used.
(3) The authors found that TH inhibited IL-1β release from macrophages, which led to less NET formation and albumin leakage in the alveolar space in their lung injury model. A graphical abstract could be included suggesting a cellular mechanism.
Thanks for summarizing our findings and for the suggestion. Unfortunately, eLife does not publish graphical abstracts. We have tried to describe this mechanism in the discussion.
(4) If Macrophages are responsible for IL1b release that via IL1R1 induces NETosis, what happens if you deplete macrophages? what is the role of epithelial cells?
Previous studies have found that macrophage depletion is protective in several models of ALI (Eyal. Intensive Care Med. 2007;33:1212–1218., Lindauer. J Immunol. 2009;183:1419–1426.), and other researchers have found that airway epithelial cells did not contribute to IL-1β secretion (Tang. PLoS ONE. 2012;7:e37689.). We have previously reported that epithelial cells produce IL-18 without an LPS priming signal during LPS+HVV (Nosaka et al. Front Immunol 2020;11; 207). However, IL-18 is not sufficient to induce hypoxemia, as Saline+HVV-treated mice do not develop hypoxemia (Nosaka et al. Front Immunol 2020;11; 207). We will now add this point to the revised discussion of the manuscript.
Reviewer #2 (Public review):
Summary:
The manuscript by Nosaka et al is a comprehensive study exploring the involvement of IL1beta signaling in a 2-hit model of lung injury + ventilation, with a focus on modulation by hypothermia.
Strengths:
The authors demonstrate quite convincingly that interleukin 1 beta plays a role in the development of ventilator-induced lung injury in this model, and that this role includes the regulation of neutrophil extracellular trap formation. The authors use a variety of in vivo animal-based and in vitro cell culture work, and interventions including global gene knockout, cell-targeted knockout and pharmacological inhibition, which greatly strengthen the ability to make clear biological interpretations.
We thank the Reviewer for their positive feedback.
Weaknesses:
A primary point for open discussion is the translatability of the findings to patients. The main model used, one of intratracheal LPS plus mechanical ventilation is well accepted for research exploring the pathogenesis and potential treatments for acute respiratory distress syndrome (ARDS). However, the interpretation may still be open to question - in the model here, animals were exposed to LPS to induce inflammation for only 2 hours, and seemingly displayed no signs of sickness, before the start of ventilation. This would not be typical for the majority of ARDS patients, and whether hypothermia could be effective once substantial injury is already present remains an open question. The interaction between LPS/infection and temperature is also complicated - in humans, LPS (or infection) induces a febrile, hyperthermic response, whereas in mice LPS induces hypothermia (eg. Ganeshan K, Chawla A. Nat Rev Endocrinol. 2017;13:458-465). Given this difference in physiological response, it is therefore unclear whether hypothermia in mice and hypothermia in humans are easily comparable. Finally, the use of only young, male animals such as in the current study has been typical but may be criticised as limiting translatability to people.
Therefore while the conclusions of the paper are well supported by the data, and the biological pathways have been impressively explored, questions still remain regarding the ultimate interpretations.
We agree with the reviewer that at two hours post LPS there is only minimal pulmonary inflammation (Dagvadorj et al Immunity 42, 640–653). This is a limitation of the experimental model we used in our study. Additionally, as the reviewer pointed out, LPS induces hyperthermia in humans, but it is also well established that physiological hypothermia occurs in humans with severe infections and sepsis (Baisse. Am J Emerg Med. 2023 Sep: 71: 134-138., Werner. Am J Emerg Med. 2025 Feb;88:64-78.). Therefore, the difference between human and mouse responses to sepsis or infections may be more nuanced. Furthermore, it is important to distinguish between physiological hypothermia (just <36°C) and therapeutic hypothermia (typically 32-34°C). We will add to the discussion whether hypothermia serves as a protective response and whether the transition from normothermia to hyperthermia could have detrimental effects. As the Reviewer points out, we only used young male mice in our study; we will also add this point to the revised discussion as a limitation of our study.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
DiPeso et al. develop two tools to (i) classify micronucleated (MN) cells, which they call VCS MN, and (ii) segment micronuclei and nuclei with MNFinder. They then use these tools to identify transcriptional changes in MN cells.
The strengths of this study are:
(1) Developing highly specialized tools to speed up the analysis of specific cellular phenomena such as MN formation and rupture is likely valuable to the community and neglected by developers of more generalist methods.
(2) A lot of work and ideas have gone into this manuscript. It is clearly a valuable contribution.
(3) Combining automated analysis, single-cell labeling, and cell sorting is an exciting approach to enrich phenotypes of interest, which the authors demonstrate here.
Weaknesses:
(1) Images and ground truth labels are not shared for others to develop potentially better analysis methods.
We regret this omission and thank the reviewer for pointing it out. Both the images and ground truth labels for VCS MN and MNFinder are now available on the lab’s github page and described in the README.txt files. VCS MN: https://github.com/hatch-lab/fast-mn. MNFinder: https://github.com/hatch-lab/mnfinder.
(2) Evaluations of the methods are often not fully explained in the text.
The text has been extensively updated to include a full description of the methods and choices made to develop the VCS MN and MNFinder image segmentation modules.
(3) To my mind, the various metrics used to evaluate VCS MN reveal it not to be terribly reliable. Recall and PPV hover in the 70-80% range except for the PPV for MN+. It is what it is - but do the authors think one has to spend time manually correcting the output or do they suggest one uses it as is?
VCS MN attempts to balance precision and recall with speed to reduce the fraction of MN changing state from intact to ruptured during a single cell cycle during a live-cell isolation experiment. In addition, we chose to prioritize inclusion of small MN adjacent to the nucleus in our positive calls. This meant that there were more false positives (lower PPV) than obtained by other methods but allowed us to include this highly biologically relevant class of MN in our MN+ population. Thus, for a comprehensive understanding of the consequences of MN formation and rupture, we recommend using the finder as is. However, for other visual cell sorting applications where a small number of highly pure MN positive and negative cells is preferred, such as clonal outgrowth or metastasis assays, we would recommend using the slower, but more precise, MNFinder to get a higher precision at a cost of temporal resolution. In addition, MNFinder, with its higher flexibility and object coverage, is recommended for all fixed cell analyses.
Reviewer #2 (Public review):
Summary:
Micronuclei are aberrant nuclear structures frequently seen following the missegregation of chromosomes. The authors present two image analysis methods, one robust and another rapid, to identify micronuclei (MN) bearing cells. The authors induce chromosome missegregation using an MPS1 inhibitor to check their software outcomes. In missegregation-induced cells, the authors do not distinguish cells that have MN from those that have MN with additional segregation defects. The authors use RNAseq to assess the outcomes of their MN-identifying methods: they do not observe a transcriptomic signature specific to MN but find changes that correlate with aneuploidy status. Overall, this work offers new tools to identify MN-presenting cells, and it sets the stage with clear benchmarks for further software development.
Strengths:
Currently, there are no robust MN classifiers with a clear quantification of their efficiency across cell lines (mIoU score). The software presented here tries to address this gap. GitHub material (tools, protocols, etc) provided is a great asset to naive and experienced computational biologists. The method has been tested in more than one cell line. This method can help integrate cell biology and 'omics' studies.
Weaknesses:
Although the classifier outperforms available tools for MN segmentation by providing mIOU, it's not yet at a point where it can be reliably applied to functional genomics assays where we expect a range of phenotypic penetrance.
We agree that the MNFinder module has limitations with regard to the degree of nuclear atypia and cell density that can be tolerated. Based on the recall and PPV values and their consistency across the majority of conditions analyzed, we believe that MNFinder can provide reliable results for MN frequency, integrity, shape, and label characteristics in a functional genomics assay in many commonly used adherent cell lines. We also added a discussion of caveats for these analyses, including the facts that highly lobulated nuclei will have higher false positive rates and that high cell confluency may require additional markers to ensure highly accurate assignment of MN to nuclei.
Spindle checkpoint loss (e.g., MPS1 inhibition) is expected to cause a variety of nuclear atypia: misshapen, multinucleated, and micronucleated cells. It may be difficult to obtain a pure MN population following MPS1 inhibitor treatment, as many cells are likely to present MN among multinucleated or misshapen nuclear compartments. Given this situation, the transcriptomic impact of MN is unlikely to be retrieved using this experimental design, but this does not negate the significance of the work. The discussion will have to consider the nature, origin, and proportion of MN/rupture-only states - for example, lagging chromatids and unaligned chromosomes can result in different states of micronuclei and also distinct cell fates.
We appreciate the reviewer’s comments and now quantify the frequency of other nuclear atypias and MN chromosome content in RPE1 cells after 24 h Mps1 inhibition (Fig. S1). In summary, we find only small increases in nuclear atypia, including multinucleate cells, misshapen nuclei, and chromatin bridges, compared to the large increase in MN formation. This contrasts with what is observed when mitosis is delayed using nocodazole or CENPE inhibitors where nuclear atypia is much more frequent. Importantly, after Mps1 inhibition, RPE1 cells with MN were only slightly more likely to have a misshapen nucleus compared to cells without MN (Fig. S1C).
Interestingly, this analysis showed that the VCS MN pipeline, which uses the Deep Retina segmenter to identify nuclei, has a strong bias against lobulated nuclei and frequently fails to find them (Fig. S2B). Therefore, the cell populations analyzed by RNAseq were largely depleted of highly misshapen nuclei and differences in nuclear atypia frequency between MN+ and MN- cells in the starting population were lost (Fig. S9A, compare to Fig. S1C). This strongly suggests that the transcript changes we observed reflect differences in MN frequency and aneuploidy rather than differences in nuclear morphology.
We agree with the reviewer that MN rupture frequency and formation, and downstream effects on cell proliferation and DNA damage, are sensitive to the source of the missegregated chromatin. In the revised manuscript we make clear that we chose Mps1 inhibition because it is strongly biased towards whole chromosome MN (Fig. S1E), limiting signal from DNA damage products, including chromosome fragments and chromatin bridges. This provides a baseline to disambiguate the consequences of micronucleation and DNA damage in more complex chromosome missegregation processes, such as DNA replication disruption and irradiation.
Reviewer #3 (Public review):
Summary:
The authors develop a method to visually analyze micronuclei using automated methods. The authors then use these methods to isolate MN post-photoactivation and analyze transcriptional changes in cells with and without micronuclei of RPE-1 cells. The authors observe in RPE-1 cells that MN-containing cells show similar transcriptomic changes as aneuploidy, and that MN rupture does not lead to vast changes in the transcriptome.
Strengths:
The authors develop a method that allows for automating measurements and analysis of micronuclei. This has been something that the field has been missing for a long time. Using such a method has the potential to advance micronuclei biology. The authors also develop a method to identify cells with micronuclei in real time and mark them using photoconversion and then isolate them via FACS. The authors use this method to study the transcriptome. This method is very powerful as it allows for the sorting of a heterogenous population and subsequent analysis with a much higher sample number than could be previously done.
Weaknesses:
The major weakness of this paper is that the results from the RNA-seq analysis are difficult to interpret as very few changes are found to begin with between cells with MN and cells without. The authors have to use a 1.5-fold cut-off to detect any changes in general. This is most likely due to the sequencing read depth used by the authors. Moreover, there are large variances between replicates in experiments looking at cells with ruptured versus intact micronuclei. This limits our ability to assess if the lack of changes is due to truly not having changes between these populations or experimental limitations. Moreover, the authors use RPE-1 cells which lack cGAS, which may contribute to the lack of changes observed. Thus, it is possible that these results are not consistent with what would occur in primary tissues or just in general in cells with a proficient cGAS/STING pathway.
We agree with the reviewer’s assessment of the limitations of our RNA-Seq analysis. After additional analysis, we propose an alternative explanation for the lower expression changes we observe in the MN+ and Mps1 inhibitor RNA-Seq experiments. In summary, we find that VCS MN has a strong bias against highly lobulated nuclei that depletes this class of cells from both the bulk analysis and the micronucleated cell populations (Fig. S9A). Based on this result, we propose that our analysis reduces the contribution of nuclear atypia to these transcriptional changes and that nuclear morphology changes are likely a signaling trigger associated with aneuploidy.
We believe that this finding strengthens our overall conclusion that MN formation and rupture do not cause transcriptional changes, as suppressing the signaling associated with nuclear atypia should increase sensitivity to changes from the MN. However, we cannot completely rule out that MN formation or rupture causes a broad low-level change in transcription that is obscured by other signals in the dataset.
As to cGAS signaling, several follow-up papers and even the initial studies from the Greenberg lab show that MN rupture does not activate cGAS and does not cause cGAS/STING-dependent signaling in the first cell cycle (see citations and discussion in text). Therefore, we expect the absence of cGAS in RPE1 cells will have no effect in the first cell cycle, but could alter the transcriptional profile after mitosis. Although analysis of RPE1 cGAS+ cells or primary cells in these experiments will be required to definitively address this point, we believe that our interpretation of our RNAseq results is sufficiently backed up by the literature to warrant our conclusion that MN formation and rupture do not induce a transcriptional response in the first cell cycle.
Reviewer #1 (Recommendations for the authors):
I do not recommend additional experimental or computational work. Instead, I just recommend adapting the claims of the manuscript to what has been done. I am just asking for further clarification and minor rewriting.
(1) The manuscript is written like a molecular biology paper with sparse explanations of the authors' reasoning, especially in the development of their algorithms. I was often lost as to why they did things in one way or another.
The revised manuscript has thorough explanations and additional data and graphics defining how and why the VCS MN and MNFinder modules were developed. We hope that this clears up many of the questions the reviewer had and appreciate their guidance on making it more readable for scientists from different backgrounds.
(2) Evaluations of their method are often not fully explained, for example:
"On average, 75% of nuclei per field were correctly segmented and cropped."
"MN segments were then assigned to 'parent' nuclei by proximity, which correctly associated 97% of MN."
Were there ground truth images and labels created? How many? For example, I don't know how the authors could even establish a ground-truth for associating MNs to nuclei if MNs happened to be almost equidistant between two nuclei in their images.
I suggest a separate subsection early in the Results section where the underlying imaging data + labels are presented.
We added new sections to the text and figures at the beginning of the VCS MN and MNFinder subsections (Fig. S2 and Fig. S5) with specific information about how ground truth images and labels were generated for both modules and how these were broken up for training, validation, and testing.
We also added information and images to explain how ground truth MN/nucleus associations were derived. In summary, we took advantage of the fact that 2xDendra-NLS is present at low levels in the cytoplasm to identify cell boundaries. This, combined with a subconfluent cell population, allowed us to unambiguously group MN and nuclei for an estimated 98% of MN. These identifications were used to generate ground truth labels and analyze how well proximity defines MN/nuclei groups (Figs. S1 and S2).
(3) Overall, I find the sections long and more subtitles would help me better navigate the manuscript.
Where possible, we have added subtitles.
(4) Everything following "To train the model, H2B channel images were passed to a Deep Retina neural net ..." is fully automated, it seems to me. Thus, there seems to be no human intervention to correct the output before it is used to train the neural network. Therefore, I do not understand why a neural network was trained at all if the pipeline for creating ground truth labels worked fully automatically. At least, the explanations are insufficient.
We apologize for the initial lack of clarity in the text and have included additional details in the revision. We used the Deep Retina segmenter to crop the raw images to areas around individual nuclei to accelerate ground truth labeling of MN. A trained user went through each nucleus crop and manually labeled pixels belonging to MN to generate the ground truth dataset for training, validation, and imaging in VCS MN (Fig. S2A).
(5) To my mind, the various metrics used to evaluate VCS MN reveal it not to be terribly reliable. Recall and PPV hover in the 70-80% range except for the PPV for MN+. It is what it is - but do the authors think one has to spend time manually correcting the output or do they suggest one uses it as is? I understand that for bulk transcriptomics, enrichment may be sufficient but for many other questions, where the wrong cell type could contaminate the population, it is not.
Remarks in the Results section on what the various accuracies mean for different applications would be good (so one does not need to wait for the Discussion section).
One of the strengths of the visual cell sorting system is that any image analysis pipeline can be used with it. We used VCS MN for the transcriptomics experiment, but for other applications a user could run visual cell sorting in conjunction with MNFinder for increased purity while maintaining a reasonable recall or use a pre-existing MN segmentation program that gives 100% purity but captures only a specific subgroup of micronucleated cells (e.g. PIQUE).
To maintain readability, especially with the expansion of the results sections, we kept the discussion of how we envision using visual cell sorting for other MN-based applications in the discussion section.
(6) I am confused about what "cell" is referring to in much of the manuscript. Is it the nucleus + MNs only? Is it the whole cell, which one would ordinarily think it is? If so, are there additional widefield images, where one can discern cell boundaries? I found the section "MNFinder accurately ..." very hard to read and digest for this reason and other ambiguous wording. I suggest the authors take a fresh look at their manuscript and see whether the text can be improved for clarity. I did not find it an easy read overall, especially the computational part.
After re-examining how “cell” was used, we updated the text to limit its use to the MNFinder arm tasked with identifying MN-nucleus associations where the convex hull defined by these objects is used to determine the “cell” boundary. In all other cases we have replaced cell with “nucleus” because, as the reviewer points out, that is what is being analyzed and converted. We hope this is clearer.
(7) Post-FACS PPVs are not that great (Figure 3c). It depends on the question one wants to answer whether ~70% PPV is good enough. Again, would be good to comment on.
We added discussion of this result to the revision. In summary, a likely reason for the reduced PPV is that, although we maintain the cells in buffer with a Cdk1 inhibitor, we know that some proportion of the cells go through mitosis post-sorting. Since MN are frequently reincorporated into the nucleus after mitosis (Hatch et al., 2013; Zhang et al., 2015), we expect this to reduce the MN+ population. Thus, we expect that the PPV in the RNAseq population is higher than what we can measure by analyzing post-sorted cells that have been plated and analyzed later.
(8) I am thoroughly confused as to why the authors claim that their system works in the "absence of genetic perturbations" and why they emphasize the fact that their cells are non-transformed: They still needed a fluorescent label and they induce MNs with a chemical Mps1 inhibitor. (The latter is not a genetic manipulation, of course, but they still need to enrich MNs somehow. That is, their method has not been tested on a cell population in which MNs occur naturally, presumably at a very low rate, unless I missed something.) A more careful description of the benefits of their method would be good.
We apologize for the confusion on these points and hope this is clarified in the revision. We were comparing our system, which can be made using transient transfection, if desired, to current tools that disambiguate aneuploidy and MN formation by deleting parts of chromosomes or engineering double strand breaks with CRISPR to generate single chromosome-specific missegregation events. Most of these systems require transformed cancer cells to obtain high levels of recombination. In contrast, visual cell sorting can isolate micronucleated cells from any cell line that can exogenously express a protein, including primary cells and non-transformed cells like RPE1s.
Other minor points:
(1) The authors should not refer to "H2B channels" but to "H2B-emiRFP703 channels". It may seem obvious to the authors but for someone reading the manuscript for the very first time, it was not. I was not sure whether there were additional imaging modalities used for H2B/nucleus/chromatin detection before I went back and read that only fluorescence images of H2B-emiRFP703 were used. To put it another way, the authors are detecting fluorescence, not histones -- unless I misunderstood something.
To address this point, we altered the text to read “H2B-emiRFP703” when discussing images of this construct. For MNFinder some images were of cells expressing H2B-GFP, which has also been clarified.
(2) If the level of zoom on my screen is such that I can comfortably read the text, I cannot see much in the figure panels. The features that I should be able to see are the size of a title. The image panels should be magnified.
In the revision, the images are appended to the end at full resolution to overcome this difficulty. Thank you for your forbearance.
Reviewer #2 (Recommendations for the authors):
The methods are adequately explained. The Results text narrating experiments and data analysis is clear. Interpretation of a few results could be clarified and strengthened as explained below.
(1) RNAseq experiments are a good proof of principle. To strengthen their interpretation in Figures 4 and 6, I would recommend the authors cite published work on checkpoint/MPS1 loss-induced chromosome missegregation (PMID: 18545697, PMID: 33837239, PMC9559752) and consider in their discussion the 'origin' and 'proportion' of micronucleated cells and irregularly shaped nuclei expected in RPE1 lines. This will help interpret Figure 6 findings on aneuploidy signature accurately. Not being able to see an MN-specific signature could be due to the way the biological specimen is presented with a mixture of cells with 'MN only' or 'rupture' or 'MN along with misshapen nuclei'. These features may all link to aneuploidy rather than 'MN' specifically.
We appreciate the reviewer’s suggestion and added a new analysis of nuclear atypia after Mps1 inhibition in RPE1 cells to Fig. S1. Overall, we found that Mps1 inhibition significantly, but modestly, increased the proportion of misshapen nuclei and chromatin bridges. Multinucleate cells were so rare that instead of giving them their own category we included them in “misshapen nuclei.” These results are consistent with images of Mps1i-treated RPE1 cells from He et al. 2019 and Santaguida et al. 2017 and distinct from the stronger changes in nuclear morphology observed after delaying mitosis by nocodazole or CENPE inhibition.
We also found that the Deep Retina segmenter used to identify nuclei in VCS MN had a significant bias against highly lobulated nuclei (Fig. S2B) that led to misshapen nuclei being largely excluded from the RNAseq analyses. As a result we found no enrichment of misshapen nuclei, chromatin bridges, or dead/mitotic nuclear morphologies in MN+ compared to MN- nuclei in our RNASeq experiments (Fig. S9A).
(2) As the authors clarify in the response letter, one round of ML is unlikely to result in fully robust software; additional rounds of ML with other markers will make the work robust. It will be useful to indicate other ML image analysis tools that have improved through such reiterations. They could use reviews on challenges and opportunities using ML approaches to support their statement. Also in the introduction, I would recommend labelling as 'rapid' instead of 'rapid and precise' method.
We updated the text to reference review articles that discuss the benefit of additional training for increasing ML accuracy and changed the text to “rapid.”
(3) The lack of live-cell studies does not allow the authors to distinguish the origin of MN (lagging chromatids or unaligned chromosomes). As explained in 1, considering these aspects in discussion would strengthen their interpretation. Live-cell studies can help reduce the dependencies on proximity maps (Figure S2).
The revised text includes new references and data (Fig. S1E) demonstrating that Mps1 inhibition strongly biases towards whole chromosome missegregation and that MN are most likely to contain a single centromere positive chromosome rather than chromatin fragments or multiple chromosomes.
(4) Mean Intersection over Union (mIOU) is a good measure to compare outcomes against ground truth. However, the mIOU is relatively low (Figure 2D) for HeLa-based functional genomics applications. It will help to discuss mIOU for other classifiers (non-MN classifiers) so that they can be used as a benchmark (this is important since the authors state in their response that they are the first to benchmark an MN classifier). There are publications for mitochondria, cell cortex, spindle, nuclei, etc. where IOU has been discussed.
We added references to classifiers for other small cellular structures. We also evaluated major sources of error in MNFinder and found that false negatives are enriched in very small MN (3 to 9 pixels, or about 0.4 µm² – 3 µm², Fig. S6B). A similar result was obtained for VCS MN (Fig. S3B). Because small changes in the number of pixels identified in small objects can have outsized effects on mIoU scores, we suspect that this is exerting downward pressure on the mIoU value. Based on the PPV and recall values we identified, we believe that MNFinder is robust enough to use for functional genomics and screening applications with reasonable sample sizes.
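The effect of object size on IoU can be illustrated with a small worked example. This is a minimal sketch, not the MNFinder evaluation code: the square masks, the `drop_pixels` helper, and the per-object averaging are all assumptions made for illustration. It shows that missing two pixels costs a 9-pixel object far more IoU than a 900-pixel object.

```python
def iou(truth, pred):
    """Intersection over union of two sets of (row, col) pixel coordinates."""
    union = truth | pred
    if not union:
        return 1.0  # both masks empty: treat as perfect agreement
    return len(truth & pred) / len(union)


def square_mask(side):
    """Hypothetical square object with side * side pixels."""
    return {(r, c) for r in range(side) for c in range(side)}


def drop_pixels(mask, n):
    """Copy of mask with n pixels removed, mimicking slight under-segmentation."""
    return set(sorted(mask)[n:])


small_truth = square_mask(3)   # 9 px, the upper end of the "very small" MN range
large_truth = square_mask(30)  # 900 px, a stand-in for a large object

# The same absolute error (2 missed pixels) in both cases.
small_iou = iou(small_truth, drop_pixels(small_truth, 2))  # 7 / 9
large_iou = iou(large_truth, drop_pixels(large_truth, 2))  # 898 / 900

# Assumed averaging scheme: unweighted mean over objects.
mean_iou = (small_iou + large_iou) / 2
print(f"small: {small_iou:.3f}  large: {large_iou:.3f}  mean: {mean_iou:.3f}")
```

Under these assumptions the small object alone pulls the mean well below the score of the large object, even though the pixel-level error is identical.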
(5) Figure 5 figure legend title is an overinterpretation. MN and rupture-initiated transcriptional changes could not be isolated with this technique where several other missegregation phenotypes are buried (see point 1 above).
We decided to keep the figure legend title based on our analysis of known missegregation phenotypes in Fig. S1 and S9 showing that there is no difference in major classes of nuclear atypia between MN+ and MN- populations in this analysis. Although we cannot rule out that other correlated changes exist, we believe that the title represents the most parsimonious interpretation.
Minor comments
(1) The sentence in the introduction needs clarification and reference. "However, these interventions cause diverse "off-target" nuclear and cellular changes, including chromatin bridges, aneuploidy, and DNA damage." Off-target may not be the correct description since inhibiting MPS1 is expected to cause a variety of problems based on its role as a master kinase in multiple steps of the chromosome segregation process. Consider one of the references in point 1 for a detailed live-cell view of MPS1 inhibitor outcomes.
We have changed “off-target” to “additional” for clarity.
(2) In Figure 3 or S3, did the authors notice any association between the cell cycle phase and MN or rupture presence? Is this possible to consider based on FACS outcomes or nuclear shapes?
Previous work by our lab and others has shown that MN rupture frequency increases during the cell cycle (Hatch et al., 2013; Joo et al., 2023). Whether this is stochastic or regulated by the cell cycle may depend on what chromosome is in the MN (Mammel et al., 2021) and likely the cell line. Unfortunately, the H2B-emiRFP703 fluorescence in our population is too variable to identify cell cycle stage from FACS or nuclear fluorescence analysis.
(3) Figure 5 - Please explain "MA plot".
An MA plot, or log fold-change (M) versus average (A) gene expression, is a way to visualize differentially expressed genes between two conditions in an RNASeq experiment and is used as an alternative to volcano plots. We chose them for our paper because most of the expression changes we observed were small and of similar significance, and the MA plot spreads out the data compared to a volcano plot and allowed a better visualization of trends across the population.
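For readers unfamiliar with the two axes, the following is a minimal sketch of how M and A can be computed for a single gene. The function name, the pseudocount, and the example expression values are assumptions for illustration; analysis packages differ in exactly how the A axis is defined and normalized, so this is not the pipeline used for Figure 5.

```python
import math


def ma_values(mean_control, mean_treated, pseudocount=1.0):
    """Return (M, A) for one gene from its mean expression in two conditions.

    M is the log2 fold change (treated versus control) and A is the average
    log2 expression. The pseudocount avoids log2(0) for unexpressed genes.
    """
    log_control = math.log2(mean_control + pseudocount)
    log_treated = math.log2(mean_treated + pseudocount)
    m = log_treated - log_control
    a = (log_treated + log_control) / 2
    return m, a


# Hypothetical gene: 99 normalized counts in control, 199 after treatment.
m, a = ma_values(99, 199)
# M is 1.0 up to floating-point rounding, i.e. a two-fold increase.
print(f"M = {m:.2f}, A = {a:.2f}")
```

Plotting M against A for every gene places unchanged genes along M = 0, so many small changes of similar significance spread out along the A axis instead of stacking at one height as they would in a volcano plot.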
(4) Page 7: "our results strongly suggest that protein expression changes in MN+ and rupture+ cells are driven mainly by increased aneuploidy rather than cellular sensing of MN formation and rupture.". This is an overstatement considering the mIOU limits of the software tool and the non-exclusive nature of MN in their samples.
We agree that we cannot rule out that an unknown masking effect is inhibiting our ability to observe small broad changes in transcription after MN formation or rupture. However, we believe we have minimized the most likely sources of masking effects, including nuclear atypia and large scale aneuploidy differences, and thus our interpretation is the most likely one.
Reviewer #3 (Recommendations for the authors):
Overall, the authors need to explain their methods better, define some technical terms used, and more thoroughly explain the parameters and rationale used when implementing these two protocols for identifying micronuclei; primarily as this is geared toward a more general audience that does not necessarily work with machine learning algorithms.
(1) A clearer description in the methods as to how accuracy was calculated. Were micronuclei counted by hand or another method to assess accuracy?
We significantly expanded the section on how the machine learning models were trained and tested, including how sensitivity and specificity metrics were calculated, in both the results and the methods sections. The code used to compare ground truth labels to computed masks is also now included in the MNFinder module available on the lab github page.
(2) Define positive predictive value.
The text now says “the positive predictive value (PPV, the proportion of true positives, i.e. specificity) and recall (the proportion of MN found by the classifier, i.e. sensitivity)…”.
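In standard confusion-matrix terms, PPV is TP / (TP + FP), the quantity usually called precision, and recall is TP / (TP + FN), the quantity usually called sensitivity. Specificity in the strict sense, TN / (TN + FP), additionally requires a count of true negatives. The sketch below computes the first two from object counts; the function names and the example counts are hypothetical and are not taken from the manuscript.

```python
def ppv(true_pos, false_pos):
    """Positive predictive value: fraction of MN calls that are real MN."""
    called = true_pos + false_pos
    return true_pos / called if called else float("nan")


def recall(true_pos, false_neg):
    """Recall: fraction of real MN that the classifier found."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else float("nan")


# Hypothetical counts: 70 MN found correctly, 30 spurious calls, 20 MN missed.
example_ppv = ppv(70, 30)        # 70 / 100
example_recall = recall(70, 20)  # 70 / 90
print(f"PPV = {example_ppv:.2f}, recall = {example_recall:.2f}")
```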
(3) Why is it a problem to use the VCS MN at higher magnifications where undersegmentation occurs? What do the authors mean by diminished performance (what metrics are they using for this?).
We have included a representative image and calculated mIoU and recall for 40x magnification images analyzed by MNFinder after rescaling in Fig. 2A. In summary, VCS MN only correctly labeled a few pixels in the MN, which was sufficient to call the adjacent nucleus “MN+” but not sufficient for other applications, such as quantifying MN area. In addition, VCS MN did much worse at identifying all the MN in 40x images with a recall, or sensitivity, metric of 0.36. We are not sure why. Developing MNFinder provided a module that was well suited to quantify MN characteristics in fixed cell images, an important use case in MN biology.
(4) The authors should compare MN that are analyzed and not analyzed using these methods and define parameters. Is there a size limitation? Closeness to the main nucleus?
We added two new figures defining what contributes to module error for both VCS MN (Fig. S3) and MNFinder (Fig. S6). For VCS MN, false negatives are enriched in very large or very small MN and tend to be dimmer and farther from the nucleus than true positives. False positives are largely misclassification of small dim objects in the image as MN. For MNFinder, the most missed class of MN are very small ones (3-9 px in area) and the majority of false positives are misclassifications of elongated nuclear blebs as MN.
(5) Are there parameters in how confluent an image must be to correctly define that the micronucleus belongs to the correct cell? The authors discussed that this was calculated based on predicted distance. However, many factors might affect proper calling on MN. And the authors should test this by staining for a cytosolic marker and calculating accuracy.
We updated the text with more information about how the cytoplasm was defined using leaky 2x-Dendra2-NLS signal to analyze the accuracy of MN/nucleus associations (Fig. S2G-H). In addition, we quantified cell confluency and distance to the first and second nearest neighbor for each MN in our training and testing image datasets. We found that, as anticipated, cells were imaged at subconfluent concentrations with most fields having a confluency around 30% cell coverage (Fig. S2E) and that the average difference in distance between the closest nucleus to an MN and the next closest nucleus was 3.3 fold (Fig. S2F). We edited the discussion section to state that the ability of MN/nuclear proximity to predict associations at high cell confluencies would have to be experimentally validated.
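The proximity rule and the nearest-neighbor comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the MNFinder implementation: it uses centroid coordinates and Euclidean distance, and the function name and example coordinates are hypothetical. The actual modules may measure distance differently, for example between object edges.

```python
import math


def assign_parent(mn_xy, nuclei_xy):
    """Assign an MN to its nearest nucleus.

    Returns (index of the nearest nucleus, ratio of the second-nearest
    distance to the nearest distance). A large ratio means the assignment
    is unambiguous; a ratio near 1 means two nuclei are almost equidistant.
    """
    if len(nuclei_xy) < 2:
        raise ValueError("need at least two nuclei to compare distances")
    ranked = sorted((math.dist(mn_xy, xy), i) for i, xy in enumerate(nuclei_xy))
    (nearest_d, nearest_i), (second_d, _) = ranked[0], ranked[1]
    ratio = second_d / nearest_d if nearest_d > 0 else math.inf
    return nearest_i, ratio


# Hypothetical centroids in pixels: the MN sits 5 px from nucleus 0
# and 15 px from nucleus 1.
parent, fold = assign_parent((0, 0), [(3, 4), (9, 12), (30, 40)])
print(parent, fold)
```

A per-MN ratio like this could also be used to flag ambiguous assignments in denser fields, which is where the response notes that proximity would need experimental validation.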
(6) The authors measure the ratio of Dendra2(Red) v. Dendra2 (Green) in Figure 3B to demonstrate that photoconversion is stable. This measurement, to me, is confusing, as in the end, the authors need to show that they have a robust conversion signal and are able to isolate these data. The authors should directly demonstrate that the Red signal remains by analyzing the percent of the Red signal compared to time point 0 for individual cells.
We found a bulk analysis to be more powerful than trying to reidentify individual cells due to how much RPE1 cells move during the 4 and 8 hours between image acquisitions. In addition, we sort on the ratio between red and green fluorescence per cell, rather than the absolute fluorescence, to compensate for variation in 2xDendra-NLS protein expression between cells. Therefore, demonstrating that distinct ratios remained present throughout the time course is the most relevant to the downstream analysis.
To address the reviewer’s concern, we replotted the data in Fig. 3B to highlight changes over time in the raw levels of red and green Dendra fluorescence (Fig. S7D). As expected, we see an overall decrease in red fluorescence intensity, and complementary increase in green fluorescence intensity, over 8 hours, likely due to protein turnover. We also observe an increase in the number of nuclei lacking red fluorescence. This is expected since the well was only partially converted and we expect significant numbers of unconverted cells to move into the field between the first image and the 8 hour image.
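The rationale for sorting on the red:green ratio rather than on absolute red fluorescence can be shown with a toy model. The sketch assumes, purely for illustration, that red and green signals scale linearly with total 2xDendra-NLS expression and with the converted fraction; the function names and numbers are hypothetical and do not describe measured data.

```python
def fluorescence(expression, converted_fraction):
    """Toy model: split total reporter signal into red and green components."""
    red = expression * converted_fraction
    green = expression * (1 - converted_fraction)
    return red, green


def red_green_ratio(red, green):
    """Ratio used for gating; independent of total expression in this model."""
    return red / green


# Two hypothetical cells with the same converted fraction but a four-fold
# difference in reporter expression.
low_red, low_green = fluorescence(100, 0.6)
high_red, high_green = fluorescence(400, 0.6)

low_ratio = red_green_ratio(low_red, low_green)
high_ratio = red_green_ratio(high_red, high_green)
print(low_red, high_red, low_ratio, high_ratio)
```

In this model the absolute red signal differs four-fold between the two cells while the ratio is the same, which is why a gate on the ratio is less sensitive to cell-to-cell variation in expression.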
(7) The authors isolate and subsequently use RNA-sequencing to identify changes between Mps1i and DMSO-treated cells. One concern is that even with the less stringent cut-off of 1.5 fold there is a very small change between DMSO and MPS1i treated cells, with only 63 genes changing, none of which were affected above a 2-fold change. The authors should carefully address this, including why their dataset sees changes in many more pathways than in the He et al. and Santaguida et al. studies. Is this due to just having a decreased cut-off?
The reviewer correctly points out that we observed an overall reduction in the strength of gene expression changes between our dataset of DMSO versus Mps1i treated RPE1 cells compared to similar studies. We suggest a couple of reasons for this. One is that the log<sub>2</sub> fold changes observed in the other studies are not huge and vary between 2.5 and -3.8 for He et al., 3.3 and -2.3 for Santaguida et al., and -0.8 and 1.6 for our study. This variability is within a reasonable range for different experimental conditions and library prep protocols. A second is that our protocol minimizes a potential source of transcriptional change – nuclear lobulation – that is present in the other datasets.
For the pathway analysis we did not use a fold-change cut-off for any data set, instead opting to include all the genes found to be significantly different between control and Mps1i treated cells for all three studies. Our read-depth was higher than that of the two published experiments, which could contribute to an increased DEG number. However, we hypothesize that our identification of a broader number of altered pathways most likely arises from increased sensitivity due to the loss of covering signal from transcriptional changes associated with increased nuclear atypia. Additional visual cell sorting experiments sorting on misshapen nuclei instead of MN would allow us to determine the accuracy of this hypothesis.
(8) Moreover, clustering (in Figure 5E) of the replicates is a bit worrisome as the variances are large and therefore it is unclear if, with such large variance and low screening depth, one can really make such a strong conclusion that there are no changes. The authors should prove that their conclusion that rupture does not lead to large transcriptional changes, is not due to the limitations of their experimental design.
We agree with the reviewers that additional rounds of RNAseq would improve the accuracy of our transcriptomic analysis and could uncover additional DEGs. However, we believe the overall conclusion to be correct based on the results of our attempt to validate changes in gene expression by immunofluorescence. We analyzed two of the most highly upregulated genes in the ruptured MN dataset, ATF3 and EGR1. Although we saw a statistically significant increase in ATF3 intensity between cells without MN and those with ruptured MN, the fold change was so small compared to our positive control (100x less) that we believe it is more consistent with a small increase in the probability of aneuploidy than with a specific signature of MN rupture.
(9) The authors also need to address the fact that they are using RPE-1 cells more clearly and that the lack of effect in transcriptional changes may be simply due to the loss of cGAS-STING pathway (Mackenzie et al., 2017; Harding et al., 2017; etc.).
As we discuss above in the public comments section, the literature is clear that MN do not activate cGAS in the first cell cycle after their formation, even upon rupture. Therefore, we do not expect any changes in our results when applied to cGAS-competent cells. However, this expectation needs to be experimentally validated, which we plan to address in upcoming work.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #2:
(1) The use of two m<sup>5</sup>C reader proteins is likely a reason for the high number of edits introduced by the DRAM-Seq method. Both ALYREF and YBX1 are ubiquitous proteins with multiple roles in RNA metabolism including splicing and mRNA export. It is reasonable to assume that both ALYREF and YBX1 bind to many mRNAs that do not contain m<sup>5</sup>C.
To substantiate the author's claim that ALYREF or YBX1 binds m<sup>5</sup>C-modified RNAs to an extent that would allow distinguishing its binding to non-modified RNAs from binding to m<sup>5</sup>C-modified RNAs, it would be recommended to provide data on the affinity of these, supposedly proven, m<sup>5</sup>C readers to non-modified versus m<sup>5</sup>C-modified RNAs. To do so, this reviewer suggests performing experiments as described in Slama et al., 2020 (doi: 10.1016/j.ymeth.2018.10.020). However, using dot blots like in so many published studies to show modification of a specific antibody or protein binding, is insufficient as an argument because no antibody, nor protein, encounters nanograms to micrograms of a specific RNA identity in a cell. This issue remains a major caveat in all studies using so-called RNA modification reader proteins as bait for detecting RNA modifications in epitranscriptomics research. It becomes a pertinent problem if used as a platform for base editing similar to the work presented in this manuscript.
The authors have tried to address the point made by this reviewer. However, rather than performing an experiment with recombinant ALYREF-fusions and m<sup>5</sup>C-modified to unmodified RNA oligos for testing the enrichment factor of ALYREF in vitro, the authors resorted to citing two manuscripts. One manuscript is cited by everybody when it comes to ALYREF as m<sup>5</sup>C reader, however none of the experiments have been repeated by another laboratory. The other manuscript is reporting on YBX1 binding to m<sup>5</sup>C-containing RNA and mentions PAR-CLiP experiments with ALYREF, the details of which are nowhere to be found in doi: 10.1038/s41556-019-0361-y.
Furthermore, the authors have added RNA pull-down assays that should substitute for the requested experiments. Interestingly, Figure S1E shows that ALYREF binds equally well to unmodified and m<sup>5</sup>C-modified RNA oligos, which contradicts doi:10.1038/cr.2017.55, and supports the conclusion that wild-type ALYREF is not a specific m<sup>5</sup>C binder. The necessity of always including an overexpression of ALYREF-mut in parallel DRAM experiments makes the developed method better controlled but not easy to handle (expression differences of the plasmid-driven proteins etc.).
Thank you for pointing this out. First, we would like to correct our previous response: the binding ability of ALYREF to m<sup>5</sup>C-modified RNA was initially reported in doi: 10.1038/cr.2017.55 (and not in doi: 10.1038/s41556-019-0361-y), where it was observed through PAR-CLIP analysis that the K171 mutation weakens its binding affinity to m<sup>5</sup>C-modified RNA.
Our previous experimental approach was not optimal: the protein concentration in the INPUT group was too high, leading to overexposure in the experimental group. Additionally, we did not conduct a quantitative analysis of the results at that time. In response to your suggestion, we performed RNA pull-down experiments with YBX1 and ALYREF, rather than with the pan-DRAM protein, to better validate and reproduce the previously reported findings. Our quantitative analysis revealed that both ALYREF and YBX1 exhibit a stronger affinity for m<sup>5</sup>C-modified RNAs. Furthermore, mutating the key amino acids involved in m<sup>5</sup>C recognition significantly reduced the binding affinity of both readers. These results align with previous studies (doi: 10.1038/cr.2017.55 and doi: 10.1038/s41556-019-0361-y), confirming that ALYREF and YBX1 are specific readers of m<sup>5</sup>C-modified RNAs. However, our detection system has certain limitations. Despite mutation of the critical amino acids, both readers retained a weak binding affinity for m<sup>5</sup>C, suggesting that while the mutation helps reduce false positives, it is still challenging to precisely map the distribution of m<sup>5</sup>C modifications. To address this, we plan to further investigate the protein structure and function to obtain more accurate m<sup>5</sup>C sequencing of the transcriptome in future studies. Accordingly, we have updated our results and conclusions in lines 294-299 and discuss these limitations in lines 109-114.
In addition, while the m<sup>5</sup>C assay can be performed using the DRAM system alone, comparing it with the DRAM<sup>mut</sup>C control enhances the accuracy of m<sup>5</sup>C region detection. To minimize variation in transfection efficiency across experimental groups, we recommend using the same batch of transfections. This approach not only ensures more consistent results but also improves the standardization of the DRAM assay, as discussed in the section added on lines 308-312.
(2) Using sodium arsenite treatment of cells as a means to change the m<sup>5</sup>C status of transcripts through the downregulation of the two major m<sup>5</sup>C writer proteins NSUN2 and NSUN6 is problematic and the conclusions from these experiments are not warranted. Sodium arsenite is a chemical that poisons every protein containing thiol groups. Not only do NSUN proteins contain cysteines but also the base editor fusion proteins. Arsenite will inactivate these proteins, hence the editing frequency will drop, as observed in the experiments shown in Figure 5, which the authors explain with fewer m<sup>5</sup>C sites to be detected by the fusion proteins.
The authors have not addressed the point made by this reviewer. Instead the authors state that they have not addressed that possibility. They claim that they have revised the results section, but this reviewer can only see the point raised in the conclusions. An experiment would have been to purify base editors via the HA tag and then perform some kind of binding/editing assay in vitro before and after arsenite treatment of cells.
We appreciate the reviewer’s insightful comment. We fully agree with the concern raised. In the original manuscript, our intention was to use sodium arsenite treatment to downregulate NSUN-mediated m<sup>5</sup>C levels and subsequently decrease DRAM editing efficiency, with the aim of monitoring m<sup>5</sup>C dynamics through the DRAM system. However, as the reviewer pointed out, sodium arsenite may inactivate both NSUN proteins and the base editor fusion proteins, and any such inactivation would likely result in reduced DRAM editing. This confounds the interpretation of our experimental data.
As demonstrated in Appendix A, western blot analysis confirmed that sodium arsenite indeed decreased the expression of fusion proteins. In addition, we attempted in vitro fusion protein purification using multiple fusion tags (HIS, GST, HA, MBP) for DRAM fusion protein expression, but unfortunately, we were unable to obtain purified proteins. However, using the Promega TNT T7 Rapid Coupled In Vitro Transcription/Translation Kit, we successfully purified the DRAM protein (Appendix B). Despite this success, subsequent in vitro deamination experiments did not yield the expected mutation results (Appendix C), indicating that further optimization is required. This issue is further discussed in lines 314-315.
Taken together, the above evidence indicates that the results of the sodium arsenite experiment cannot be interpreted unambiguously, and we have decided to remove the corresponding results from the main text of the revised manuscript.
Author response image 1.
(3) The authors should move high-confidence editing site data contained in Supplementary Tables 2 and 3 into one of the main Figures to substantiate what is discussed in Figure 4A. However, the data needs to be visualized in another way than Excel format. Furthermore, Supplementary Table 2 does not contain a description of the columns, while Supplementary Table 3 contains a single row with letters and numbers.
The authors have not addressed the point made by this reviewer. Figure 3F shows the screening process for DRAM-seq assays and principles for screening high-confidence genes rather than the data contained in Supplementary Tables 2 and 3 of the former version of this manuscript.
Thank you for your valuable suggestion. We have visualized the data from Supplementary Tables 2 and 3 in Figure 4A as a circlize diagram (described in lines 213-216), illustrating the distribution of mutation sites detected by the DRAM system across each chromosome. Additionally, to improve the presentation and clarity of the data, we have revised Supplementary Tables 2 and 3 by adding column descriptions, merging the DRAM-ABE and DRAM-CBE sites, and including overlapping m<sup>5</sup>C genes from previous datasets.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public Review):
Summary:
The paper begins with phenotyping the DGRP for post-diapause fecundity, which is used to map genes and variants associated with fecundity. There are overlaps with genes mapped in other studies and also functional enrichment of pathways including most surprisingly neuronal pathways. This somewhat explains the strong overlap with traits such as olfactory behaviors and circadian rhythm. The authors then go on to test genes by knocking them down effectively at 10 degrees. Two genes, Dip-gamma and sbb, are identified as significantly associated with post-diapause fecundity, and they also find the effects to be specific to neurons. They further show that the neurons in the antenna but not the arista are required for the effects of Dip-gamma and sbb. They show that removing the antenna has a diapause-specific lifespan-extending effect, which is quite interesting. Finally, ionotropic receptor neurons are shown to be required for the diapause-associated effects.
Strengths and Weaknesses:
Overall I find the experiments rigorously done and interpretations sound. I have no further suggestions except an ANOVA to estimate the heritability of the post-diapause fecundity trait, which is routinely done in the DGRP and offers a global parameter regarding how reliable phenotyping is.
We added to the Methods: “We performed a one-way ANOVA to get the mean squares for between-group and within-group variances and calculated broad-sense heritability using the formula: H<sup>2</sup> = (MS<sub>G</sub> - MS<sub>E</sub>) / (MS<sub>G</sub> + (k-1) MS<sub>E</sub>), where MS<sub>G</sub> is the mean square between groups, MS<sub>E</sub> is the mean square within groups, and k is the number of individuals per group. Using this formula, the broad-sense heritability for normalized post-diapause fecundity was found to be 0.51.”
We added to the Results: “The broad-sense heritability for normalized post-diapause fecundity was found to be 0.51 (see Methods).”
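For readers who want to see the calculation spelled out, the following is a minimal sketch of the heritability estimate described in the Methods text above, assuming a balanced design (the same number of individuals, k, in every line). It is not the authors' code; the function name and the toy data are hypothetical.

```python
from statistics import mean


def broad_sense_heritability(groups):
    """Estimate H^2 = (MS_G - MS_E) / (MS_G + (k - 1) * MS_E).

    groups: list of equal-length lists, one list of measurements per line.
    Assumes a balanced one-way design (illustrative sketch only).
    """
    k = len(groups[0])
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("balanced design with k >= 2 per group is assumed")
    n_groups = len(groups)
    if n_groups < 2:
        raise ValueError("at least two groups are required")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Between-group and within-group sums of squares
    ss_between = k * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_g = ss_between / (n_groups - 1)       # mean square between groups
    ms_e = ss_within / (n_groups * (k - 1))  # mean square within groups
    return (ms_g - ms_e) / (ms_g + (k - 1) * ms_e)


# Toy data: three hypothetical lines with three individuals each
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h2 = broad_sense_heritability(toy)
```

With the toy data, MS<sub>G</sub> = 27 and MS<sub>E</sub> = 1, so the estimate is 26/29. Unbalanced designs would need a different effective k and are not handled by this sketch.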
A minor point is I cannot find how many DGRP lines are used.
Response: We screened 193 lines and have added that to the Results.
Reviewer #2 (Public Review):
Summary
In this study, Easwaran and Montell investigated the molecular, cellular, and genetic basis of adult reproductive diapause in Drosophila using the Drosophila Genetic Reference Panel (DGRP). Their GWAS revealed genes associated with variation in post-diapause fecundity across the DGRP and performed RNAi screens on these candidate genes. They also analyzed the functional implications of these genes, highlighting the role of genes involved in neural and germline development. In addition, in conjunction with other GWAS results, they noted the importance of the olfactory system within the nervous system, which was supported by genetic experiments. Overall, their solid research uncovered new aspects of adult diapause regulation and provided a useful reference for future studies in this field.
Strengths:
The authors used whole-genome sequenced DGRP to identify genes and regulatory mechanisms involved in adult diapause. The first Drosophila GWAS of diapause successfully uncovered many QTL underlying post-diapause fecundity variations across DGRP lines. Gene network analysis and comparative GWAS led them to reveal a key role for the olfactory system in diapause lifespan extension and post-diapause fecundity.
Comments on revised version:
While the authors have addressed many of the minor concerns raised by the reviewers, they have not fully resolved some of the key criticisms. Notably, two reviewers highlighted significant concerns regarding the phenotype and assay of post-diapause fecundity, which are critical to the study. The authors acknowledged that this assay could be confounded by the 'cold temperature endurance phenotype,' potentially altering the interpretation of their results.
However, they responded by stating that it is not obvious how to separate these effects experimentally. This leaves the analysis in this research ambiguous, as also noted by Reviewer #3.
We should have clarified earlier that we actually chose to measure post-diapause fecundity in order to minimize any impact of ‘cold temperature endurance.’ In fact, we chose post-diapause fecundity as the appropriate measure of successful diapause for both technical and conceptual reasons. Conceptually, the benefit of diapause is to perpetuate the species. It seems obvious to us that post-diapause fecundity is more relevant to species propagation than other measures of diapause such as how many egg chambers contain yolk or how many eggs are laid. Technically, we chose 5-week diapause and recovery based on pilot studies that showed that nearly all DGRP lines showed excellent survival at 5 weeks in diapause conditions. Therefore, our experimental design minimized as much as possible any effect of cold temperature endurance - in the sense of the ability to survive at 10°C - on our phenotype.
We apologize for not clarifying that point earlier and have added this text to the Results: “We chose 5 weeks based on pilot studies that showed that nearly all DGRP lines showed excellent survival at 5 weeks in diapause conditions while exhibiting sufficient variation in post-diapause fecundity to carry out GWAS. Beyond 5 weeks, fecundity was low and there was insufficient variation to conduct a GWAS.”
Additionally, I raised concerns about the validity of prioritizing genes with multiple associated variants. Although the authors agreed with this point, they did not revise the manuscript accordingly. The statement that 'Genes with multiple SNPs are good candidates for influencing diapause traits' is not a valid argument within the context of population and quantitative genetics.
We apologize for neglecting to revise the manuscript accordingly. We have revised Supplemental Table S4 and ranked the genes by p-value.
-
-
www.medrxiv.org www.medrxiv.org
-
Author Response:
Reviewer #1 (Public Review):
[...] Strengths: This study utilized multiple in vitro approaches, such as proteomics, siRNA, and overexpression, to demonstrate that PCBP2 is an intrinsic factor of BMSC aging.
Weaknesses:
This study did not perform in vivo experiments.
Response: We will continue to conduct animal experiments in subsequent studies.
Reviewer #2 (Public Review):
[...] Weaknesses: It is unclear if PCBP2 can also function as an intrinsic factor for BMSC cells in female individuals. More work may be needed to further dissect the mechanism of how PCBP2 impacts FGF2 expression. Could PCBP2 impact the FGF2 expression independent of ROS?
Response: Thank you very much for your valuable comments. These questions are also the focus of our follow-up work. We will sort out the data and publish the relevant research results as soon as possible.
Additional context that would help readers interpret or understand the significance of the work: In the current work, the authors studied the aging process of BMSC cells, which are related to osteoporosis. Aging processes also impact many other cell types and their function, such as in muscle, skin, and the brain.
Response: Thank you very much for your valuable comments. We will continue to improve the logic of the writing to make the article easier to understand.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife assessment
This useful manuscript reports mechanisms behind the increase in fecundity in response to sub-lethal doses of pesticides in the crop pest, the brown plant hopper. The authors hypothesize that the pesticide works by inducing the JH titer, which through the JH signaling pathway induces egg development. Evidence for this is, however, inadequate.
We greatly appreciate your valuable comments and constructive suggestions for our work. The manuscript has been carefully edited and improved following your suggestions. We also provide more evidence to support our statements by conducting new experiments. First, we found that EB treatment of adult females can also stimulate egg-laying. Second, EB treatment in female adults increases the number of mature eggs in the ovary and ovarioles. Third, EB treatment in females enhances the expression of the kr-h1 gene in the whole body of BPH. Finally, EB treatment in female adults increases the JHIII titer, but has no impact on the 20E titer.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Gao et al. have demonstrated that the pesticide emamectin benzoate (EB) treatment of brown planthopper (BPH) leads to increased egg-laying in the insect, which is a common agricultural pest. The authors hypothesize that EB upregulates JH titer resulting in increased fecundity.
Strengths:
The finding that a class of pesticide increases the fecundity of brown planthopper is interesting.
We greatly appreciate your positive comments on our work.
Weaknesses:
(1) EB is an allosteric modulator of GluCl. That means EB physically interacts with GluCl initiating a structural change in the channel protein. Yet the authors' central hypothesis here is about how EB can upregulate the mRNA of GluCl. I do not know whether there is any evidence that an allosteric modulator can function as a transcriptional activator for the same receptor protein. The basic premise of the paper sounds counterintuitive. This is a structural problem and should be addressed by the authors by giving sufficient evidence about such demonstrated mechanisms before.
Thank you for your question. As the reviewer points out, EB physically interacts with its target protein GluCl and thus affects its downstream signaling pathway. In the manuscript, we reported that EB-treated brown planthoppers display increased expression of GluCl in the adult stage (Fig. 5A). In fact, many studies show that insecticide treatment can increase the expression of the insecticide's target gene in insects. For example, the relative expression level of the ryanodine receptor gene of the rice stem borer, Chilo suppressalis, was increased 10-fold after treatment with chlorantraniliprole, an insecticide that targets the ryanodine receptor (Peng et al., 2017). In addition, in Drosophila, starvation (and low insulin) elevates the transcription level of the sNPF and tachykinin receptors (Ko et al., 2015; Root et al., 2011). In brown planthoppers, a reduction in mRNA and protein expression of a nicotinic acetylcholine receptor α8 subunit is associated with resistance to imidacloprid (Zhang et al., 2015). RNA interference knockdown of the α8 gene decreased the sensitivity of N. lugens to imidacloprid (Zhang et al., 2015). Hence, expression of receptor genes can be regulated by diverse factors, including insecticide treatment. In our case, we found that EB can upregulate its target gene GluCl. However, we did not claim that EB functions as a transcriptional activator for GluCl, and we still do not know why EB treatment changes the expression of GluCl in the brown planthopper. Considering that our experiments last several days, it might be an indirect (or secondary) effect caused by other factors that change the expression of the GluCl gene following EB action on the channel. One possible explanation is that the allosteric interaction of EB with GluCl makes the channel dysfunctional and the cell responds by upregulating the channel/receptor to compensate. We have inserted text on lines 738 - 757 to explain these possibilities.
(2) I am surprised to see a 4th instar larval application or treatment with EB results in the upregulation of JH in the adult stages. Complicating the results further is the observation that a 4th instar EB application results in an immediate decrease in JH titer. There is a high possibility that this late JH titer increase is an indirect effect.
Thank you for your question. Treatment with low or sublethal doses of insecticides might have a strong and complex impact on insects (Gandara et al., 2024; Gong et al., 2022; Li et al., 2023; Martelli et al., 2022). We kept 4th instar brown planthoppers feeding on EB for four days. After four days of treatment they develop into the 5th instar, which is the final nymphal stage of BPH. Since the brown planthopper is a hemimetabolous insect, we cannot rule out the possibility that an indirect effect of treatment with EB results in the upregulation of JH in the adult stages. In the revised manuscript, we investigated the impact of EB treatment in the adult stage. We found that female adults treated with EB also laid more eggs than controls (Figure 1-figure supplement 1A). The following experiments were performed in adults to address how EB treatment stimulates egg-laying in the adult brown planthopper.
(1) We found that EB treatment in adults increases the number of mature eggs in the ovary (new Figure 2-figure supplement 1). We have added these results on lines 234 – 238 and 281-285.
(2) We measured the JH titer after the female adults had been treated with EB. We found that EB can also increase the JH titer but has no impact on the 20E titer in the female adult (Figure 3-S3A and B). We have added these results on lines 351 – 356 and 281-285.
(3) EB treatment in adults increases the gene expression of JHAMT and Kr-h1 (Figure 3-S3C and D). We have added these results on lines 378 – 379, lines 387-390 and lines 457-462.
(3) The writing quality of the paper needs improvement. Particularly with respect to describing processes and abbreviations. In several instances the authors have not adequately described the processes they have introduced, thus confusing readers.
Thank you for your suggestion. We have thoroughly revised the paper to improve clarity.
(4) In the section 'EB promotes ovarian development' the authors have shown that EB treatment results in increased detention of eggs which contradicts their own results which show that EB promotes egg laying. Again, this is a serious contradiction that nullifies their hypothesis.
Thank you for pointing this out. We revised Figure 2B to show the number of mature eggs in the ovary. The number of mature eggs in the ovaries of females that fed on EB was higher than in control females. We also show that BPH fed with EB laid more eggs than controls (Figure 1 and Table S1). Thus, our results suggest that EB treatment increases both egg production (ovary maturation) and egg laying. We have added these results on lines 234 – 238.
(5) Furthermore, the results suggest that oogenesis is not affected by EB application. The authors should devote a section to discussing how they are observing increased egg numbers in EB-treated insects while not impacting Oogenesis.
Thank you for your suggestions, and apologies for the lack of clarity in our initial explanation. First, we found that EB treatment led to an increase in the number of eggs laid by female brown planthoppers (Figure 1). Through dissection experiments, we observed that EB-treated females had more mature eggs in their ovaries (Figure 2A and B), indicating that the increased egg-laying was due to a larger production of mature eggs in the ovaries after EB treatment. This is now explained on lines 229-238.
Additionally, since there is no systematic description of oogenesis in the brown planthopper, we were the first to observe the oogenesis process in this species using immunohistochemistry and laser confocal microscopy. Based on the developmental characteristics, we defined the different stages of oogenesis (Figure 2C, Figure 2-figure supplement 2). We did not observe any significant effect of EB treatment on the various stages of oogenesis, indicating that EB treatment does not impair normal egg development (Figure 2D). Instead, the increase in vitellogenin accelerates the production of mature eggs. This is now explained on lines 243-262.
During the maturation process, eggs require uptake of vitellogenin, and an increase in vitellogenin (Vg) content can accelerate egg maturation, producing more mature eggs. Our molecular data suggest that EB treatment leads to an upregulation of vg expression. Based on these findings, we conclude that the increase in egg-laying caused by EB treatment is due to the upregulation of vg (Figure 3I), which raises vitellogenin content, promoting the uptake of vitellogenin by maturing eggs and resulting in the production of more mature eggs. We have revised the text on lines 389-395 to clarify this point.
(6) Met is the receptor of JH and to my understanding, remains mostly constant in terms of its mRNA or protein levels throughout various developmental periods in many different insects. Therefore, the presence of JH becomes the major driving factor for physiological events and not the presence of the receptor Met. Here the authors have demonstrated an increase in Met mRNA as a result of EB treatment. Their central hypothesis is that EB increases JH titer to result in enhanced fecundity. JH action will not result in the activation of Met. Although not contradictory to the hypothesis, the increase in mRNA content of Met is contrary to the findings of the JH field thus far.
Thank you for your comment. Our results showed that EB treatment can mildly increase (about 2-fold) expression of the Met gene in brown planthoppers (Figure 3G). Our data also indicated that Met and FAMeT expression levels were less strongly influenced by EB than those of kr-h1 and vg (Figure 3H and I). We agree that JH action will not result in the increase of Met. However, we cannot rule out the possibility that other factors (indirect effects) induced by EB treatment increase the mRNA expression level of Met. One recent paper reported that downregulation of the transcription factor CncC increases met expression in beetles (see Figure 6A in this reference) (Jiang et al., 2023). Many studies have reported that insecticide treatment activates the CncC gene signaling pathway, which regulates detoxification gene expression (Amezian et al., 2023; Fu et al., 2024; Hu et al., 2021). Hence, it is possible that EB might influence the CncC gene pathway, which then induces met expression. This EB effect on met upregulation may be similar to the upregulation of GluCl and some other secondary effects. We have discussed this on lines 725-738.
(7) As pointed out before, it is hard to rationalize how a 4th instar exposure to EB can result in the upregulation of key genes involved in JH synthesis at the adult stage. The authors must consider providing a plausible explanation and discussion in this regard.
Thank you for your comments. Although we first exposed the BPH to EB at the 4th instar, the insects fed on the EB-treated rice plants for four days. During that time they develop into the 5<sup>th</sup> instar, the final nymphal stage of the brown planthopper. Since brown planthoppers do not have a pupal stage, this might cause the effect of the EB presented to the insects to last longer, even into the adult stage. In addition, we found that EB treatment increases the weight of adult females (Figure 1-figure supplement 3E and F), which indicates that EB might increase food intake in BPHs, which in turn might lead to the production of more insulin peptide. Insulin might increase JH synthesis at the adult stage. In our revised study we also investigated the effect of EB in adult BPHs. We found that, similar to the nymphal stage, EB treatment in adult BPHs also increases egg laying. Furthermore, the JH titer was increased after treatment of adult BPHs with EB. In addition, the GluCl and kr-h1 genes were also up-regulated after EB treatment in the adult stage. We have discussed this on lines 739-746.
(8) I have strong reservations against such an irrational hypothesis that Met (the receptor for JH) and JH-Met target gene Kr-h1 regulate JH titer (Line 311, Fig 3 supplemental 2D). This would be the first report of such an event on the JH field and therefore must be analysed in depth. I strongly suggest the authors remove such claims from the manuscript without substantiating it.
Thank you for your suggestions and comments. We have changed our claims in this revised MS. We found that EB treatment can enhance Kr-h1 expression. We have no evidence to support that JH can induce met expression. We have rewritten the manuscript to avoid confusion (see text on lines 725-735).
(9) Kr-h1 is JH/Met target gene. The authors demonstrate that silencing of Kr-h1 results in inhibition of FAMeT, which is a gene involved in JH synthesis. A feedback loop in JH synthesis is unreported. It is the view of this reviewer that the authors must go ahead with a mechanistic detail of Kr-h1 mediated JH upregulation before this can be concluded. Mere qPCR experiments are not sufficient to substantiate a claim that is completely contrary to the current understanding of the JH signalling pathway.
Thank you for your suggestions and comments. We agree that qPCR experiments alone are not enough to support this kind of claim, and that more evidence is needed. We have revised the MS to avoid confusion (see text on lines 725-735).
(10) The authors have performed knockdowns of JHAMT, Met, and Kr-h1 to demonstrate the effect of these factors on fecundity in BPH. Additionally, they have performed rescue experiments with EB application on these knockdown insects (Figure 3K-M). This, I believe, is a very flawed experiment. The authors demonstrate EB works through JHAMT in upregulating JH titer. In the absence of JHAMT, EB application is not expected to rescue the phenotype. But the authors have reported a complete rescue here. In the absence of Met, the receptor of JH, either EB or JH is not expected to rescue the phenotype. But a complete rescue has been reported. These two experimental results contradict their own hypothesis.
Thank you for your comments. We think this rescue is possible because gene knockdown by dsRNA injection is incomplete, and residual gene expression allows for EB action. It is not a total knockout; these genes are still expressed at a low level in the dsRNA-injected insects. Since EB can upregulate the expression of JHAMT, Met, and Kr-h1, it is reasonable that EB treatment can counteract the down-regulation of these three genes and completely rescue fecundity. We have clarified this on lines 411-413.
(11) A significant section of the paper deals with how EB upregulates JH titer. JH is a hormone synthesized in the Corpora Allata. Yet the authors have chosen to use the whole body for all of their experiment. Changes in the whole body for mRNA of those enzymes involved in JH synthesis may not reflect the situation in Corpora Allata. Although working with Corpora Allata is challenging, discarding the abdomen and thorax region and working with the head and neck region of the insect is easily doable. Results from such sampling are always more convincing when it comes to JH synthesis studies.
Thank you for your suggestions. The head is very difficult to separate from the thorax region in brown planthoppers, as you can see in Author response image 1. We are now trying to answer how EB regulates JH synthesis using Drosophila as a model.
Author response image 1.
The brown planthopper
(12) The phenomenon reported was specific to BPH and not found in other insects. This limits the implications of the study.
Thank you for your comments. The brown planthopper is a serious insect pest of rice in Asia. Our findings can guide the use of this insecticide in the field. Besides this, our findings indicate that EB, which targets GluCl, can alter the JH titer. Our findings have new implications for how a neuronal system influences the JH signaling pathway. We will further investigate how EB influences JH in the future and will use Drosophila as a model to study the molecular mechanisms.
(13) Overall, the molecular experiments are very poorly designed and can at best be termed superficial. There are several contradictions within the paper and no discussion or explanation has been provided for that.
Thank you for your comments. We have revised the paper according to your suggestions and added further explanation of our results in the discussion parts and hope the conclusions are better supported in the new version. We have discussed this on lines 725-746 and 778-799.
Reviewer #2 (Public Review):
The brown plant hopper (BPH) is a notorious crop pest and pesticides are the most widespread means of controlling its population. This manuscript shows that in response to sublethal doses of the pesticide (EB), BPH females show enhanced fecundity. This is in keeping with field reports of population resurgence post-pesticide treatment. The authors work out the mechanism behind this increase in fecundity. They show that in response to EB exposure, the expression of its target receptor, GluCl, increases. This, they show, results in an increase in the expression of genes that regulate the synthesis of juvenile hormone (JH) and JH itself, which, in turn, results in enhanced egg-production and egg-laying. Interestingly, these effects of EB exposure are species-specific, as the authors report that other species of plant hoppers either don't show enhanced fecundity or show reduced fecundity. As the authors point out, it is unclear how an increase in GluCl levels could result in increased JH regulatory genes.
We greatly appreciate your valuable comments and constructive suggestions on our work. In future work, we will try to determine how EB interacts with its molecular target GluCl and then increases the expression of JH regulatory genes, using Drosophila as a model.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Overall, the molecular experiments are very poorly designed and can at best be termed superficial. There are several contradictions within the paper and no discussion or explanation has been provided for that.
The authors should consider a thorough revision.
Thank you for your comments. We have thoroughly revised the paper according to your suggestions and added further experiments and explanations of our results in the discussion parts.
Reviewer #2 (Recommendations For The Authors):
It would help the reader to have more schematics along with the figures. The final figure is helpful, but knowing the JH pathway, and where it acts would help with the interpretations as one reads the manuscript and the figures. The pathways represented in 4N or 5J are helpful but could be improved upon for better presentation.
It would be nice to have some discussion on how the authors think EB exposure results in an increase in GluCl expression, and how that in turn affects the expression of so many genes.
Thank you for your comments. We have thoroughly revised the paper according to your suggestions and added further experiments and explanations of how we think EB exposure results in an increase in JH titer and other genes in the discussion parts. We have added the text on lines 753-761.
References
Amezian, D., Fricaux, T., de Sousa, G., Maiwald, F., Huditz, H.-I., Nauen, R., Le Goff, G., 2023. Investigating the role of the ROS/CncC signaling pathway in the response to xenobiotics in Spodoptera frugiperda using Sf9 cells. Pesticide Biochemistry and Physiology 195, 105563.
Fu, B., Liang, J., Hu, J., Du, T., Tan, Q., He, C., Wei, X., Gong, P., Yang, J., Liu, S., Huang, M., Gui, L., Liu, K., Zhou, X., Nauen, R., Bass, C., Yang, X., Zhang, Y., 2024. GPCR–MAPK signaling pathways underpin fitness trade-offs in whitefly. Proceedings of the National Academy of Sciences 121, e2402407121.
Gandara, L., Jacoby, R., Laurent, F., Spatuzzi, M., Vlachopoulos, N., Borst, N.O., Ekmen, G., Potel, C.M., Garrido-Rodriguez, M., Böhmert, A.L., Misunou, N., Bartmanski, B.J., Li, X.C., Kutra, D., Hériché, J.-K., Tischer, C., Zimmermann-Kogadeeva, M., Ingham, V.A., Savitski, M.M., Masson, J.-B., Zimmermann, M., Crocker, J., 2024. Pervasive sublethal effects of agrochemicals on insects at environmentally relevant concentrations. Science 386, 446-453.
Gong, Y., Cheng, S., Desneux, N., Gao, X., Xiu, X., Wang, F., Hou, M., 2022. Transgenerational hormesis effects of nitenpyram on fitness and insecticide tolerance/resistance of Nilaparvata lugens. Journal of Pest Science.
Hu, B., Huang, H., Hu, S., Ren, M., Wei, Q., Tian, X., Esmail Abdalla Elzaki, M., Bass, C., Su, J., Reddy Palli, S., 2021. Changes in both trans- and cis-regulatory elements mediate insecticide resistance in a lepidopteron pest, Spodoptera exigua. PLOS Genetics 17, e1009403.
Jiang, H., Meng, X., Zhang, N., Ge, H., Wei, J., Qian, K., Zheng, Y., Park, Y., Reddy Palli, S., Wang, J., 2023. The pleiotropic AMPK–CncC signaling pathway regulates the trade-off between detoxification and reproduction. Proceedings of the National Academy of Sciences 120, e2214038120.
Ko, K.I., Root, C.M., Lindsay, S.A., Zaninovich, O.A., Shepherd, A.K., Wasserman, S.A., Kim, S.M., Wang, J.W., 2015. Starvation promotes concerted modulation of appetitive olfactory behavior via parallel neuromodulatory circuits. eLife 4, e08298.
Li, Z., Wang, Y., Qin, Q., Chen, L., Dang, X., Ma, Z., Zhou, Z., 2023. Imidacloprid disrupts larval molting regulation and nutrient energy metabolism, causing developmental delay in honey bee Apis mellifera. eLife
Martelli, F., Hernandes, N.H., Zuo, Z., Wang, J., Wong, C.-O., Karagas, N.E., Roessner, U., Rupasinghe, T., Robin, C., Venkatachalam, K., Perry, T., Batterham, P., Bellen, H.J., 2022. Low doses of the organic insecticide spinosad trigger lysosomal defects, elevated ROS, lipid dysregulation, and neurodegeneration in flies. eLife 11, e73812.
Peng, Y.C., Sheng, C.W., Casida, J.E., Zhao, C.Q., Han, Z.J., 2017. Ryanodine receptor genes of the rice stem borer, Chilo suppressalis: Molecular cloning, alternative splicing and expression profiling. Pestic. Biochem. Physiol. 135, 69-77.
Root, Cory M., Ko, Kang I., Jafari, A., Wang, Jing W., 2011. Presynaptic facilitation by neuropeptide signaling mediates odor-driven food search. Cell 145, 133-144.
Zhang, Y., Wang, X., Yang, B., Hu, Y., Huang, L., Bass, C., Liu, Z., 2015. Reduction in mRNA and protein expression of a nicotinic acetylcholine receptor α8 subunit is associated with resistance to imidacloprid in the brown planthopper, Nilaparvata lugens. Journal of Neurochemistry 135, 686-694.
-
-
www.medrxiv.org www.medrxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
The authors aimed to confirm the association between the human leukocyte antigen (HLA)-II region and tuberculosis (TB) susceptibility within admixed African populations. Building upon previous findings from the International Tuberculosis Host Genetics Consortium (ITHGC), this study sought to address the limitations of small sample size and the inclusion of admixed samples by employing the Local Ancestry Allelic Adjusted (LAAA) model, as well as identify TB susceptibility loci in an admixed South African cohort.
Strengths:
The major strengths of this study include the use of six TB case-control datasets collected over 30 years from diverse South African populations and ADMIXTURE for global ancestry inference. The former provides a comprehensive dataset and the latter ensures accurate determination of ancestral contributions. In addition, the identified association in the HLA-DPB1 gene shows near-genome-wide significance, enhancing the credibility of the findings.
Weaknesses:
The major weaknesses of this study include insufficient significant discoveries and reliance on cross-validation. This study only identified one variant significantly associated with TB status, located in an intergenic region with an unclear link to TB susceptibility. Despite identifying multiple lead SNPs, no other variants reached the genome-wide significance threshold, limiting the overall impact of the findings. The absence of an independent validation cohort, with the study relying solely on cross-validation, is also a major limitation. This approach restricts the ability to independently confirm the findings and evaluate their robustness across different population samples.
Appraisal:
The authors successfully achieved their aims of confirming the association between the HLA-II region and TB susceptibility in admixed African populations. However, the limited number of significant discoveries, reliance on cross-validation, and insufficient discussion of model performance and SNP significance weaken the overall strength of the findings. Despite these limitations, the results support the conclusion that considering local ancestry is crucial in genetic studies of admixed populations.
Impact:
The innovative use of the LAAA model and the comprehensive dataset in this study make substantial contributions to the field of genetic epidemiology.
Reviewer #2 (Public review):
Summary:
This manuscript is about using different analytical approaches to allow ancestry adjustments to GWAS analyses amongst admixed populations. This work is a follow-on from the recently published ITHGC multi-population GWAS (https://doi.org/10.7554/eLife.84394), with a focus on the admixed South African populations. Ancestry adjustment models detected a peak of SNPs in the class II HLA DPB1, distinct from the class II HLA DQA1 loci significant in the ITHGC analysis.
Strengths:
Excellent demonstration of GWAS analytical pipelines in highly admixed populations. Further confirmation of the importance of the HLA class II locus in genetic susceptibility to TB.
Weaknesses:
Limited novelty compared to the group's previous publications and the body of work linking HLA class II alleles with TB susceptibility in South Africa or other African populations. This work includes only ~100 new cases and controls beyond what has already been published. High-resolution HLA typing has detected significant signals in both the DQA1 and DPB1 regions identified by the larger ITHGC and in this GWAS analysis respectively (Chihab L et al. HLA. 2023 Feb; 101(2): 124-137). Despite the availability of strong methods for imputing HLA from GWAS data (Karnes J et al. PLoS One 2017), the authors did not confirm with HLA typing the importance of their SNP peak in the class II region. This would have supported the importance of this ancestry adjustment versus prior ITHGC analysis.
The populations consider active TB and healthy controls (from high-burden presumed exposed communities) and do not provide QFT or other data to identify latent TB infection.
Important methodological points for clarification and for readers to be aware of when reading this paper:
(1) One of the reasons cited for the lack of African ancestry-specific associations or suggestive peaks in the ITHGC study was the small African sample size. The current association test includes a larger African cohort and yields a near-genome-wide significant threshold in the HLA-DPB1 gene originating from the KhoeSan ancestry. The investigation is needed as to whether the increase in power is due to increased African samples and not necessarily the use of the LAAA model as stated on lines 295 and 296?
Thank you for your comment. The Manhattan plot in Figure 3 includes the results for all four models: the traditional GWAS model (GAO), the admixture mapping model (LAO), the ancestry plus allelic (APA) model and the LAAA model. In this figure, it is evident that only the LAAA model identified the association peak on chromosome 6, which lends support to the argument that the increase in power is due to the use of the LAAA model and not solely due to the increase in sample size.
(2) In line 256, the number of SNPs included in the LAAA analysis was 784,557 autosomal markers; the number of SNPs after quality control of the imputed dataset was 7,510,051 SNPs (line 142). It is not clear how or why ~90% of the SNPs were removed. This needs clarification.
Thank you for your recommendation. In our manuscript (line 194), we mention that “…variants with minor allele frequency (MAF) < 1% were removed to improve the stability of the association tests.” A large proportion of imputed variants fell below this MAF threshold, and were subsequently excluded from this analysis. Below, we show the number of imputed variants across MAF bins for one of our datasets [RSA(A)] to substantiate this claim:
Author response image 1.
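The MAF filter described above can be illustrated with a minimal Python sketch. The genotype counts and site names below are hypothetical toy values, not data from the study, and the helper names are introduced only for illustration. The final helper simply relates the two SNP counts quoted in the reviewer's comment (7,510,051 before and 784,557 in the LAAA analysis); it makes no claim about which filtering step removed which variants.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Minor allele frequency at a biallelic site, from genotype counts."""
    n_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if n_alleles == 0:
        raise ValueError("no genotyped individuals at this site")
    alt_freq = (n_het + 2 * n_alt_hom) / n_alleles
    return min(alt_freq, 1.0 - alt_freq)


def passes_maf_filter(counts, threshold=0.01):
    """Keep a site unless its MAF is below the threshold (MAF < 1% removed)."""
    return minor_allele_frequency(*counts) >= threshold


def fraction_removed(n_before, n_after):
    """Share of variants dropped between two stages of a pipeline."""
    return 1.0 - n_after / n_before


# Hypothetical toy sites: (homozygous reference, heterozygous, homozygous alternate)
toy_sites = {
    "site_common": (700, 250, 50),  # alternate allele frequency 0.175
    "site_rare": (990, 10, 0),      # alternate allele frequency 0.005
}
kept = [name for name, counts in toy_sites.items() if passes_maf_filter(counts)]

# Counts quoted in the reviewer's comment (lines 142 and 256 of the manuscript).
overall_removed = fraction_removed(7_510_051, 784_557)
```

With the quoted counts, `overall_removed` rounds to 0.896, in line with the reviewer's "~90%".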
(3) The authors have used the significance threshold estimated by STEAM (p-value < 2.5x10<sup>-6</sup>) in the LAAA analysis. Grinde et al. (2019) implemented their significance threshold estimation approach tailored to admixture mapping (local ancestry (LA) model), where there is a reduction in testing burden. The authors should justify why this threshold would apply to the LAAA model (a joint genotype and ancestry approach).
Thank you for your recommendation. We describe in the methods (line 189 onwards) that the LAAA model is an extension of the APA model. Since the APA model itself simultaneously performs the null global ancestry only model and the local ancestry model (utilised in admixture mapping), we thus considered the use of a threshold tailored to admixture mapping appropriate for the LAAA model.
(4) Batch effect screening and correction (line 174) is a quality control check. This section is discussed after global and local ancestry inferences in the methods. Was this QC step conducted after the inferencing? If so, the authors should justify how the removed SNPs due to the batch effect did not affect the global and local ancestry inferences or should order the methods section correctly to avoid confusion.
Thank you for your comments. The batch effect correction method utilised a pseudo-case-control comparison which included global ancestry proportions. Thus, batch effect correction was conducted after ancestry inference. We excluded 36 627 SNPs that were believed to have been affected by the batch effect. We have amended line 186 to include the exact number of SNPs excluded due to batch effect.
The ancestry inference by RFMix utilised the entire merged dataset of 7 510 051 SNPs. Thus, the SNPs removed due to the batch effect make up a very small proportion of the SNPs used to conduct global and local ancestry inferences (less than 0.5%). As a result, we do not believe that the removed SNPs would have significantly affected the global and local ancestry inferences. However, we did conduct global ancestry inference with RFMix on each separate dataset as a sanity check. In the tables below, we show the average global ancestry proportions inferred for each separate dataset, the average global ancestry proportions across all datasets and the average global ancestry proportions inferred using the merged dataset. The SAC and Xhosa cohorts are shown in two separate tables due to the different number of contributing ancestral populations to each cohort. The combined average global ancestry proportions across the separate cohorts do not differ significantly from the global ancestry proportions inferred using the merged dataset.
Author response table 1.
Comparison of global ancestry proportions across the separate SAC datasets and the merged cohort.
Author response table 2.
Comparison of global ancestry proportions in the Xhosa dataset and the merged cohort.
Reviewer #1 (Recommendations for the authors):
Suggestions for Improved or Additional Experiments, Data, or Analyses:
(1) It might be beneficial to consider splitting the data into separate discovery and validation cohorts rather than relying solely on cross-validation. This approach could provide a stronger basis for independently confirming the findings.
Thank you for your suggestion. However, we are hesitant to divide our already modest dataset (n=1544) into separate discovery and validation cohorts, as this would reduce the statistical power to detect significant associations.
(2) Clearly stating the process of cross-validation in the methods section and reporting relevant validation statistics, such as accuracy, sensitivity, specificity, and area under the curve (AUC), would provide a more comprehensive assessment of the model's performance.
Thank you for your recommendation. We would like to highlight this article, “GWAS in the southern African context” (1), which evaluated the performance of the LAAA model compared to other models in three- and five-way admixed populations. Given the thorough evaluation of the model’s performance in that study, we did not find it necessary to reassess its performance in this manuscript.
(3) Analysing racial cohorts separately to see if you can replicate previous results and find significant markers in combined non-African populations that are not evident in African-only samples might be useful.
Thank you for your suggestion. We would like to respectfully note that race is a social construct, and its use as a proxy for genetic ancestry can be problematic (2). In our study, we instead rely on genetic ancestry inferred using ancestry inference software to provide a more accurate representation of our cohort's genetic diversity. Additionally, our cohort consists mostly of a highly admixed population group, with some individuals exhibiting ancestral contributions from up to five different global populations. Therefore, it is not possible to categorize our samples into distinct “African-only” or “non-African” groups.
(4) It might be worthwhile to consider using polygenic risk scores (PRS) to combine multiple genetic influences. This approach could help in identifying cumulative genetic effects that are not apparent when examining individual SNPs.
Thank you for your recommendation. Constructing a polygenic risk score (PRS) is beyond the scope of the current study, although it is an ongoing interest in our group; we recognize its potential value and will consider incorporating this approach in future research or a separate publication. A recent publication by Majara et al. showed that PRS accuracy is low for all traits and varies across ancestrally and ethnically diverse South African groups (3).
Recommendations for Improving the Writing and Presentation:
Including a more thorough discussion of the methodological limitations, such as the challenges of studying admixed populations and the potential limitations of the LAAA model, would provide a more balanced perspective.
Thank you for your suggestion. To provide a more balanced perspective, we included the limitations of our study in the discussion, from line 429 to line 451.
Minor Corrections to the Text and Figures:
Including all relevant statistics would improve clarity. For example, providing confidence intervals for the odds ratios and discussing any observed trends or outliers would be beneficial.
Thank you for your recommendation. We have added 95% confidence intervals to all odds ratios reported in Table 3. However, beyond the association peak identified in the HLA-II region associated with the phenotype, we do not observe any other trends or outliers in our LAAA analysis.
Reviewer #2 (Recommendations for the authors):
Points for improvement:
(1) Related to the different datasets and inclusions in previous publications, it would also be good to better understand the different numbers of cases and controls included across the previous and current analyses, or discussion thereof. For instance, the RSA(M) dataset includes 555/440 cases/controls for this analysis and only 410/405 cases/controls in the ITHGC analysis. Other discrepancies are noted across the other published datasets compared to those included in this analysis, and these always need to be detailed in a supplement or similar to better understand if this could have introduced bias or was in fact correct based on the additional ancestry-related restriction applied.
Thank you for your comments. Table 1 of our manuscript lists the number of individuals in the RSA(M) dataset, including related individuals. As described in line 131, related individuals were subsequently excluded during quality control: “Individual datasets were screened for relatedness using KING software (Manichaikul et al., 2010) and individuals up to second degree relatedness were removed.” The ITHGC only reported the number of unrelated individuals included in their analyses, which would account for the discrepancies in the reported number of cases and controls.
(2) The imbalance between cases and controls in this analysis is quite striking, and it is unusual to have the imbalance favour cases over controls. This contrasts with the ITHGC, where there are substantially more controls. There is no comment on how this could potentially impact this analysis.
Thank you for your comment. We have included a note on our case-control imbalance in the discussion:
“While many studies discuss methods for addressing case-control imbalances with more controls than cases (which can inflate type 1 error rates; Zhou et al. 2018; Dai et al. 2021; Öztornaci et al. 2023), few address the implications of a large case-to-control ratio like ours (952 cases to 592 controls). To assess the impact of this imbalance, we used the Michigan genetic association study (GAS) power calculator (Skol et al. 2006). Under an additive disease model with an estimated prevalence of 0.15, a disease allele frequency of 0.3, a genotype relative risk of 1.5, and a default significance level of 7 × 10<sup>-6</sup>, we achieved an expected power of approximately 75%. With a balanced sample size of 950 cases and 950 controls, power would exceed 90%, but it would drop significantly with a smaller balanced cohort of 590 cases and 590 controls. Given these results, we proceeded with our analysis to maximize statistical power despite the case-control imbalance.”
Author response image 2.
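The qualitative comparison in the quoted passage (unbalanced versus balanced designs) can be explored with a rough Python sketch. This is not the GAS power calculator and is not expected to reproduce its exact figures. It assumes a simple two-sided allelic z-test with a normal approximation, Hardy-Weinberg genotype proportions in the population, and an additive model in which the genotype relative risks are 1, GRR, and 2·GRR − 1. All function names are hypothetical; the parameter defaults are the values quoted above.

```python
from statistics import NormalDist


def allele_frequencies(prevalence, risk_allele_freq, grr):
    """Risk-allele frequency in cases and in controls.

    Assumption: additive model with genotype relative risks
    1, grr and 2*grr - 1 for 0, 1 and 2 copies of the risk allele.
    """
    p, q = risk_allele_freq, 1.0 - risk_allele_freq
    rel_risk = (1.0, grr, 2.0 * grr - 1.0)
    geno_freq = (q * q, 2.0 * p * q, p * p)  # Hardy-Weinberg proportions
    baseline = prevalence / sum(r * g for r, g in zip(rel_risk, geno_freq))
    penetrance = [baseline * r for r in rel_risk]
    dose = (0.0, 0.5, 1.0)  # fraction of risk alleles carried by each genotype
    case_freq = sum(
        d * f * g for d, f, g in zip(dose, penetrance, geno_freq)
    ) / prevalence
    control_freq = sum(
        d * (1.0 - f) * g for d, f, g in zip(dose, penetrance, geno_freq)
    ) / (1.0 - prevalence)
    return case_freq, control_freq


def allelic_test_power(n_cases, n_controls, prevalence=0.15,
                       risk_allele_freq=0.3, grr=1.5, alpha=7e-6):
    """Approximate power of a two-sided allelic z-test (normal approximation)."""
    case_freq, control_freq = allele_frequencies(prevalence, risk_allele_freq, grr)
    # Each individual contributes two alleles to the comparison.
    variance = (case_freq * (1.0 - case_freq) / (2 * n_cases)
                + control_freq * (1.0 - control_freq) / (2 * n_controls))
    shift = (case_freq - control_freq) / variance ** 0.5
    z = NormalDist()
    critical = z.inv_cdf(1.0 - alpha / 2.0)
    return z.cdf(shift - critical) + z.cdf(-shift - critical)


power_observed = allelic_test_power(952, 592)        # design used in the study
power_large_balanced = allelic_test_power(950, 950)  # hypothetical balanced design
power_small_balanced = allelic_test_power(590, 590)  # balanced by discarding cases
```

Under these assumptions the ordering matches the authors' argument: the 952/592 design sits between the two balanced alternatives, so discarding cases to balance the cohort would cost power.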
Minor comments
(1) Referencing around key points of TB epidemiology and disease states seems out of date, given recent epidemiology reviews and seminal Nature or Lancet review articles. Please update.
Thank you for your suggestion. We have included the following recent publications in the introductory paragraph:
Zaidi, S. M. A., Coussens, A. K., Seddon, J. A., Kredo, T., Warner, D., Houben, R. M. G. J., & Esmail, H. (2023). Beyond latent and active tuberculosis: a scoping review of conceptual frameworks. EClinicalMedicine, 66, 102332. https://doi.org/10.1016/j.eclinm.2023.102332
Menzies, N. A., Swartwood, N., Testa, C., Malyuta, Y., Hill, A. N., Marks, S. M., Cohen, T., & Salomon, J. A. (2021). Time Since Infection and Risks of Future Disease for Individuals with Mycobacterium tuberculosis Infection in the United States. Epidemiology, 32(1), 70–78. https://doi.org/10.1097/EDE.0000000000001271
Cudahy, P. G. T., Wilson, D., & Cohen, T. (2020). Risk factors for recurrent tuberculosis after successful treatment in a high burden setting: a cohort study. BMC Infectious Diseases, 20(1), 789. https://doi.org/10.1186/s12879-020-05515-4
Escombe, A. R., Ticona, E., Chávez-Pérez, V., Espinoza, M., & Moore, D. A. J. (2019). Improving natural ventilation in hospital waiting and consulting rooms to reduce nosocomial tuberculosis transmission risk in a low resource setting. BMC Infectious Diseases, 19(1), 88. https://doi.org/10.1186/s12879-019-3717-9
Laghari, M., Sulaiman, S. A. S., Khan, A. H., Talpur, B. A., Bhatti, Z., & Memon, N. (2019). Contact screening and risk factors for TB among the household contact of children with active TB: a way to find source case and new TB cases. BMC Public Health, 19(1), 1274. https://doi.org/10.1186/s12889-019-7597-0
Matose, M., Poluta, M., & Douglas, T. S. (2019). Natural ventilation as a means of airborne tuberculosis infection control in minibus taxis. South African Journal of Science, 115(9/10). https://doi.org/10.17159/sajs.2019/5737
Smith, M. H., Myrick, J. W., Oyageshio, O., Uren, C., Saayman, J., Boolay, S., van der Westhuizen, L., Werely, C., Möller, M., Henn, B. M., & Reynolds, A. W. (2023). Epidemiological correlates of overweight and obesity in the Northern Cape Province, South Africa. PeerJ, 11, e14723. https://doi.org/10.7717/peerj.14723
(2) Lines 46 to 48 appear to have two contradictory statements next to each other. The first says there are numerous GWAS investigating TB susceptibility; the second says there are sparse. Please clarify.
Thank you for bringing this to our attention. We have amended the lines as follows:
“Numerous genome-wide association studies (GWASs) investigating TB susceptibility have been conducted across different population groups. However, findings from these studies often do not replicate across population groups (Möller & Kinnear, 2020; Möller et al., 2018; Uren et al., 2017).”
(3) Add ref in line 69 for two SAC populations.
Thank you for your recommendation. We have included the citation for the ITHGC meta-analysis paper here:
“The authors described possible reasons for the lack of associations, including the smaller sample size compared to the other ancestry-specific meta-analyses, increased genetic diversity within African individuals and population stratification produced by two admixed cohorts from the South African Coloured (SAC) population (Schurz et al. 2024).”
(4) Write out abbreviations the first time they appear (Line 121).
Thank you for your recommendation. We have corrected the sentence as follows:
“Monomorphic sites were removed. Individuals were screened for deviations in Hardy-Weinberg Equilibrium (HWE) for each SNP and sites deviating from the HWE threshold of 10<sup>-5</sup> were removed.”
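As an illustration of this kind of filter, the Python sketch below tests a single site for deviation from HWE and applies a p-value cut-off of 1e-5. It uses a one-degree-of-freedom chi-square goodness-of-fit test for simplicity, which is an assumption made here; the quoted text does not say which test was used, and exact tests are common in practice. The function names and genotype counts are hypothetical.

```python
import math


def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals at this site")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        return 1.0  # monomorphic site: no deviation can be tested
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_hwe_filter(counts, threshold=1e-5):
    """Keep a site unless its HWE p-value falls below the threshold."""
    return hwe_chi_square_p(*counts) >= threshold


in_equilibrium = passes_hwe_filter((250, 500, 250))  # matches HWE expectations
no_heterozygotes = passes_hwe_filter((500, 0, 500))  # extreme heterozygote deficit
```

The first toy site is retained and the second is removed, since a complete absence of heterozygotes at allele frequency 0.5 is a strong departure from equilibrium.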
(5) It would be good in the supplement to see if there is a SNP peak in chromosome 20 with a hit that reached significance in the Bantu-speaking African ancestry.
Thank you for your recommendation. We have included a regional plot for the lead variant identified on chromosome 20 originating from Bantu-speaking African ancestry in the supplementary material (Supplementary Figure 3).
(6) It would be good to mention the p-values of rs28383206 from the ITHGC paper in this cohort for KhoeSan and Bantu-speaking African ancestries.
Thank you for your suggestion. We have included the following paragraph from line 352:
“The lead variant identified in the ITHGC meta-analysis, rs28383206, was not present in our genotype or imputed datasets. The ITHGC imputed genotypes using the 1000 Genomes (1000G) reference panel (4). Variant rs28383206 has an alternate allele frequency of 11.26% in the African population subgroup within the 1000G dataset (https://www.ncbi.nlm.nih.gov/snp/rs28383206). However, rs28383206 is absent from our in-house whole-genome sequencing (WGS) datasets, which include Bantu-speaking African and KhoeSan individuals. This absence suggests that rs28383206 might not have been imputed in our datasets using the AGR reference panel, potentially due to its low alternate allele frequency in southern African populations. Our merged dataset contained two variants located within 800 base pairs of rs28383206: rs482205 (6:32576009) and rs482162 (6:32576019). However, these variants were not significantly associated with TB status in our cohort (Supplementary Table 1).” Supplementary Table 1 can be found in the supplementary material:
(7) It would improve the readability of the ancestry proportions listed on lines 236 and 237 if these population groups were linked with the corresponding specific population used in Figure 1, as has been done in Table 2.
Thank you for your suggestion. We have amended Figure 1 to include the corresponding population labels mentioned in Table 2.
(8) In line 209, it is not clear why the number of alleles of a specific ancestry at a locus is referred to as a covariate in admixture mapping when the corresponding marginal effect is the parameter of interest.
Thank you for bringing this to our attention. We have amended the description as follows:
“(2) Local ancestry (LA) model:
This model is used in admixture mapping to identify ancestry-specific variants associated with a specific phenotype. The LA model evaluates the number of alleles of a specific ancestry at a locus and includes the corresponding marginal effect as a covariate in association analyses.”
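As an illustration only (not the authors' pipeline), the quantity the LA model evaluates can be sketched as a per-individual count of haplotypes carrying a given ancestry at one locus. The function name, the input layout (one pair of haplotype ancestry labels per individual) and the labels themselves are assumptions made for this sketch.

```python
from typing import List, Tuple


def local_ancestry_dosage(
    hap_ancestries: List[Tuple[str, str]], ancestry: str
) -> List[int]:
    """Count, per individual, how many of the two haplotypes at a single
    locus are assigned to `ancestry` (0, 1 or 2).

    `hap_ancestries` is a hypothetical input: one (haplotype 1, haplotype 2)
    pair of ancestry labels per individual, as a local ancestry inference
    tool might report for that locus.
    """
    return [int(h1 == ancestry) + int(h2 == ancestry) for h1, h2 in hap_ancestries]


# Hypothetical labels for three individuals at one locus.
calls = [("KhoeSan", "KhoeSan"), ("KhoeSan", "Bantu"), ("Bantu", "Bantu")]
khoesan_dosage = local_ancestry_dosage(calls, "KhoeSan")
```

The resulting 0/1/2 count is the term whose marginal effect on the phenotype would then be estimated in the association model.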
(9) Table 3 would benefit from a column on whether the SNP was genotyped or imputed.
Thank you for your suggestion. We have included a column indicating whether the SNP was genotyped or imputed, as well as an additional column with the INFO score for imputed genotypes.
(10) The authors should remove the print and download icons in Figure 1 on lines 240 and 241.
Thank you for your suggestion. We have amended the figure as requested.
(11) In the quality control, the authors use a more relaxed threshold for missingness in individuals (90%) and genotypes (5%) and have strayed away from the conventional 97%-98%. An explanation of the choice of these thresholds will be helpful to the reader.
Thank you for your suggestion. We aimed to use genotype and individual missingness thresholds similar to those outlined by the ITHGC meta-analysis (which utilised a threshold of 10% for both genotype and individual missingness) and by the previous LAAA analysis performed by Swart et al. in 2021. We have amended line 116 for clarity:
“Individuals with genotype call rates less than 90% and SNPs with more than 5% missingness were removed as described previously (5).”
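The two thresholds quoted above can be sketched as a simple filter. This is a minimal illustration, not the software used in the study; the function name and data layout are hypothetical. It assumes that individuals are filtered first and that SNP missingness is then computed over the retained individuals only, an order the quoted text does not specify.

```python
from typing import List, Optional, Tuple

Genotype = Optional[int]  # 0, 1 or 2 alternate-allele copies; None if missing


def filter_by_missingness(
    genotypes: List[List[Genotype]],
    min_call_rate: float = 0.90,    # individuals below this call rate are removed
    max_snp_missing: float = 0.05,  # SNPs above this missingness are removed
) -> Tuple[List[int], List[int]]:
    """Return (kept individual indices, kept SNP indices).

    `genotypes` holds one row per individual and one column per SNP.
    Assumption: individuals are filtered first, then SNPs.
    """
    n_snps = len(genotypes[0]) if genotypes else 0

    # Step 1: keep individuals whose genotype call rate is at least 90%.
    kept_individuals = [
        i
        for i, row in enumerate(genotypes)
        if n_snps and sum(g is not None for g in row) / n_snps >= min_call_rate
    ]

    # Step 2: keep SNPs with at most 5% missing calls among kept individuals.
    kept_snps = []
    for j in range(n_snps):
        missing = sum(genotypes[i][j] is None for i in kept_individuals)
        if kept_individuals and missing / len(kept_individuals) <= max_snp_missing:
            kept_snps.append(j)

    return kept_individuals, kept_snps
```

Running the steps in the opposite order can retain a slightly different set of samples and variants, so the order is worth stating whenever such thresholds are reported.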
References
(1) Swart Y, van Eeden G, Uren C, van der Spuy G, Tromp G, Moller M. GWAS in the southern African context. Cold Spring Harbor Laboratory. 2022;
(2) Byeon YJJ, Islamaj R, Yeganova L, Wilbur WJ, Lu Z, Brody LC, et al. Evolving use of ancestry, ethnicity, and race in genetics research-A survey spanning seven decades. Am J Hum Genet. 2021 Dec 2;108(12):2215–23.
(3) Majara L, Kalungi A, Koen N, Tsuo K, Wang Y, Gupta R, et al. Low and differential polygenic score generalizability among African populations due largely to genetic diversity. HGG Adv. 2023 Apr 13;4(2):100184.
(4) Schurz H, Naranbhai V, Yates TA, Gilchrist JJ, Parks T, Dodd PJ, et al. Multi-ancestry metaanalysis of host genetic susceptibility to tuberculosis identifies shared genetic architecture. eLife. 2024 Jan 15;13.
(5) Swart Y, Uren C, van Helden PD, Hoal EG, Möller M. Local ancestry adjusted allelic association analysis robustly captures tuberculosis susceptibility loci. Front Genet. 2021 Oct 15;12:716558.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary: The authors of this study sought to define a role for IgM in responses to house dust mites in the lung.
Strengths:
Unexpected observation about IgM biology
Combination of experiments to elucidate function
Weaknesses:
Would love more connection to human disease
We thank the reviewer for these comments. At the time of this publication, we have not made a concrete link with human disease. There is some anecdotal evidence of diseases such as autoimmune glomerulonephritis, Hashimoto’s thyroiditis, bronchial polyp, SLE, coeliac disease and other diseases in people with low IgM. Allergic disorders are also common in people with IgM deficiency; other studies have reported rates as high as 33-47%. The mechanisms for the high incidence of allergic diseases are unclear as, generally, these patients have normal IgG and IgE levels. IgM deficiency may represent a heterogeneous spectrum of genetic defects, which might explain the heterogeneous nature of disease presentations.
Reviewer #2 (Public Review):
Summary:
The manuscript by Hadebe and colleagues describes a striking reduction in airway hyperresponsiveness in Igm-deficient mice in response to HDM, OVA and papain across the B6 and BALB-c backgrounds. The authors suggest that the deficit is not due to improper type 2 immune responses, nor an aberrant B cell response, despite a lack of class switching in these mice. Through RNA-Seq approaches, the authors identify few differences between the lungs of WT and Igm-deficient mice, but see that two genes involved in actin regulation are greatly reduced in IgM-deficient mice. The authors target these genes by CRISPR-Cas9 in in vitro assays of smooth muscle cells to show that these may regulate cell contraction. While the study is conceptually interesting, there are a number of limitations, which stop us from drawing meaningful conclusions.
Strengths:
Fig. 1. The authors clearly show that IgMKO mice have striking reduced AHR in the HDM model, despite the presence of a good cellular B cell response.
Weaknesses:
Fig. 2. The authors characterize the CD4 T cell response to HDM in IgMKO mice. They have restimulated medLN cells with antiCD3 for 5 days to look for IL-4 and IL-13, and find no discernible difference between WT and KO mice. The absence of PBS-treated WT and KO mice in this analysis means it is unclear if HDM-challenged mice are showing IL-4 or IL-13 levels above that seen at baseline in this assay.
We thank the Reviewer for this comment. We would like to mention that only a very minimal level of IL-4 and IL-13 was detected in PBS mice. We have added a dotted line to the Figure to indicate cytokine levels in unstimulated or naïve mice. Please see Author response image 1 below, which shows cytokine ELISA data from anti-CD3-stimulated cells. The levels of these cytokines are very low and do not differ between WT and IgM<sup>-/-</sup> mice; this is also true for PMA/ionomycin-stimulated cells.
Author response image 1.
The choice of 5 days is strange, given that the response the authors want to see is in already primed cells. A 1-2 day assay would have been better.
We agree with the reviewer that a shorter stimulation period would work. Over the years we have settled for 5-day re-stimulation for both anti-CD3 and HDM. We have tried other time points, but we consistently get better secretion of cytokines after 5 days.
It is concerning that the authors state that HDM restimulation did not induce cytokine production from medLN cells, since countless studies have shown that restimulation of medLN would induce IL-13, IL-5 and IL-10 production from medLN. This indicates that the sensitization and challenge model used by the authors is not working as it should.
We thank the reviewer for this observation. In our recent paper showing how antigen load affects B cell function, we used very low levels of HDM to sensitise and challenge mice (1 µg and 3 µg respectively). See the article below, Hadebe et al., 2021 JACI. This is because labs that have used these low HDM levels also suggested that antigen load impacts B cell function, especially in their role in germinal centres. We believe the reason we see low or undetectable levels of cytokines is this low antigen load for sensitisation and challenge. In other manuscripts we have published or are about to publish, we have shown that normal HDM sensitisation load (1 µg or 100 µg) and challenge (10 µg) do induce cytokine release upon restimulation with HDM. See the article below by Khumalo et al., 2020 JCI Insight (Figure 4A).
Sabelo Hadebe, Jermaine Khumalo, Sandisiwe Mangali, Nontobeko Mthembu, Hlumani Ndlovu, Amkele Ngomti, Martyna Scibiorek, Frank Kirstein, Frank Brombacher. Deletion of IL-4Ra signalling on B cells limits hyperresponsiveness depending on antigen load. doi.org/10.1016/j.jaci.2020.12.635
Jermaine Khumalo, Frank Kirstein, Sabelo Hadebe, Frank Brombacher. IL-4Rα signalling in regulatory T cells is required for dampening allergic airway inflammation through inhibition of IL-33 by type 2 innate lymphoid cells. JCI Insight. 2020 Oct 15;5(20):e136206. doi: 10.1172/jci.insight.136206
The IL-13 staining shown in panel c is also not definitive. One should be able to optimize their assays to achieve a better level of staining, to my mind.
We agree with the reviewer that a much higher frequency of IL-13-producing CD4 T cells should be observed. We don’t think this is a technical glitch or a non-optimal set-up, as we see much higher levels of IL-13-producing CD4 T cells, between 7-20% in WT mice, when using higher doses of HDM to sensitise and challenge (see Author response image 2, lung stimulated with PMA/ionomycin+Monensin; please note this is for illustration purposes only and is not linked to the current manuscript; it is merely to demonstrate a point from other experiments we have conducted in the lab).
Author response image 2.
In d-f, the authors perform a serum transfer, but they only do this once. The half life of IgM is quite short. The authors should perform multiple naïve serum transfers to see if this is enough to induce FULL AHR.
We thank the reviewer for this comment. We apologise if this was not clear enough in the Figure legend and methods. We did transfer serum 3x: a day before sensitisation, on the day of sensitisation and a day before the challenge, to circumvent the short half-life of IgM. In our subsequent experiments, we have now used busulfan to deplete all bone marrow in IgM-deficient mice and replace it with WT bone marrow, and this method restores AHR (Figure 3).
This now appears in line 165 to 169 and reads
“Adoptive transfer of naïve serum
Naïve wild-type mice were euthanised and blood was collected via cardiac puncture before being spun down (5500rpm, 10min, RT) to collect serum. Serum (200mL) was injected intraperitoneally into IgM-deficient mice. Serum was injected intraperitoneally at day -1, 0, and a day before the challenge with HDM (day 10).”
The presence of negative values of total IgE in panel F would indicate some errors in calculation of serum IgE concentrations.
We thank the reviewer for this observation. For better clarity, we have now indicated these values as undetected in Figure , as they were below our detection limit.
Overall, it is hard to be convinced that IgM-deficiency does not lead to a reduction in Th2 inflammation, since the assays appear suboptimal.
We disagree with the reviewer in this instance, because we have shown in 3 different models, in 2 different strains and at 2 doses of HDM (high and low) that the Th2 response remains intact. Our reason for choosing low dose HDM was based on our previous work and that of others, which showed that depending on antigen load, B cells can either be redundant or have functional roles. Since our interest was to tease out the role of B cells and specifically IgM, it was important that we look at a scenario where B cells are known to have a function (low antigen load). We did observe similar findings at high dose HDM: the effects on AHR were not as strong, but Th2 was not changed, and in some instances Th2 was in fact higher in IgM-deficient mice.
Fig. 3. Gene expression differences between WT and KO mice in PBS and HDM challenged settings are shown. PCA analysis does not show clear differences between all four groups, but genes are certainly up and downregulated, in particular when comparing PBS to HDM challenged mice. In both PBS and HDM challenged settings, three genes stand out as being upregulated in WT v KO mice. these are Baiap2l1, erdr1 and Chil1.
Noted
Fig. 4. The authors attempt to quantify BAIAP2L1 in mouse lungs. It is difficult to know if the antibody used really detects the correct protein. A BAIAP2L1-KO is not used as a control for staining, and I am not sure if competitive assays for BAIAP2L1 can be set up. The flow data is not convincing. The immunohistochemistry shows BAIAP2L1 (in red) in many, many cells, essentially throughout the section. There is also no discernible difference between WT and KO mice, which one might have expected based on the RNA-Seq data. So, from my perspective, it is hard to say if/where this protein is located, and whether there truly exists a difference in expression between wt and ko mice.
We thank the reviewer for this comment. We are certain that the antibody does detect BAIAP2L1. We have used it in 3 assays, which we admit may show varying specificities since it is a polyclonal antibody. However, in our western blot, the antibody detects 1 band at 56.7 kDa and no other bands, apart from what we think are isoforms. We agree that BAIAP2L1 is expressed by many cell types, including CD45+ cells and alpha-smooth muscle-negative cells, and we show this in our Supplementary Figure 9. Where we think there is a difference in expression between WT and IgM-deficient mice is in alpha-smooth muscle-positive cells. We have tested antibodies from different companies and obtained similar findings. We do not have access to BAIAP2L1 KO mice. To test specificity, we have also used single-stain controls with or without secondary antibody and an isotype control, which show no binding in western blot and immunofluorescence assays, and fluorescence-minus-one controls in flow cytometry. We are therefore convinced that the signal we are seeing is specific to BAIAP2L1.
Fig. 5 and 6. The authors use a single cell contractility assay to measure whether BAIAP2L1 and ERDR1 impact on bronchial smooth muscle cell contractility. I am not familiar with the assay, but it looks like an interesting way of analysing contractility at the single cell level.
The authors state that targeting these two genes with Cas9gRNA reduces smooth muscle cell contractility, and the data presented for contractility supports this observation. However, the efficiency of Cas9-mediated deletion is very unclear. The authors present a PCR in supp fig 9c as evidence of gene deletion, but it is entirely unclear with what efficiency the gene has been deleted. One should use sequencing to confirm deletion. Moreover, if the antibody was truly working, one should be able to use the antibody used in Fig 4 to detect BAIAP2L1 levels in these cells. The authors do not appear to have tried this.
We thank the reviewer for these observations. We are in the process of optimising this using new polyclonal BAIAP2L1 antibodies from other companies, since the one we have tried does not seem to work well on human cells by western blot. We hope that in our new version we will be able to demonstrate this by immunofluorescence or western blot.
Other impressions:
The paper is lacking a link between the deficiency of IgM and the effects on smooth muscle cell contraction.
The levels of IL-13 and TNF in lavage of WT and IGMKO mice could be analysed.
We have measured the Th2 cytokine IL-13 in BAL fluid and found no differences between IgM-deficient mice and WT mice challenged with HDM (Author response image 3). We could not detect TNF-alpha in the BAL fluid; it was below the detection limit.
Author response image 3.
IL-13 levels are not changed in IgM-deficient mice in the lung. Bronchoalveolar lavage fluid in WT or IgM-deficient mice sensitised and challenged with HDM. TNF-a levels were below the detection limit.
Moreover, what is the impact of IgM itself on smooth muscle cells? In the Fig. 7 schematic, are the authors proposing a direct role for IgM on smooth muscle cells? Does IgM in cell culture media induce contraction of SMC? This could be tested and would be interesting, to my mind.
We thank the Reviewer for these comments. We are still trying to test this. Unfortunately, we have experienced delays in getting reagents such as human IgM to South Africa. We hope that we will be able to add this in subsequent versions of the article. We agree it is an interesting experiment to do, if not for this manuscript then for our general understanding of this interaction, at least in an in vitro system.
Reviewer #3 (Public Review):
Summary:
This paper by Sabelo et al. describes a new pathway by which lack of IgM in the mouse lowers bronchial hyperresponsiveness (BHR) in response to methacholine in several mouse models of allergic airway inflammation in Balb/c mice and C57/Bl6 mice. Strikingly, loss of IgM does not lead to less eosinophilic airway inflammation, Th2 cytokine production or mucus metaplasia, but to a selective loss of BHR. This occurs irrespective of the dose of allergen used. This was important to address since several prior models of HDM allergy have shown that the contribution of B cells to airway inflammation and BHR is dose dependent.
After a description of the phenotype, the authors try to elucidate the mechanisms. There is no loss of B cells in these mice. However, there is a lack of class switching to IgE and IgG1, with a concomitant increase in IgD. Restoring immunoglobulins with transfer of naïve serum in IgM deficient mice leads to restoration of allergen-specific IgE and IgG1 responses, which is not really explained in the paper how this might work. There is also no restoration of IgM responses, and concomitantly, the phenotype of reduced BHR still holds when serum is given, leading authors to conclude that the mechanism is IgE and IgG1 independent. Wild type B cell transfer also does not restore IgM responses, due to lack of engraftment of the B cells. Next authors do whole lung RNA sequencing and pinpoint reduced BAIAP2L1 mRNA as the culprit of the phenotype of IgM<sup>-/-</sup> mice. However, this cannot be validated fully on protein levels and immunohistology since differences between WT and IgM KO are not statistically significant, and B cell and IgM restoration are impossible. The histology and flow cytometry seems to suggest that expression is mainly found in alpha smooth muscle positive cells, which could still be smooth muscle cells or myofibroblasts. Next therefore, the authors move to CRISPR knock down of BAIAP2L1 in a human smooth muscle cell line, and show that loss leads to less contraction of these cells in vitro in a microscopic FLECS assay, in which smooth muscle cells bind to elastomeric contractible surfaces.
Strengths:
(1) There is a strong reduction in BHR in IgM-deficient mice, without alterations in B cell number, disconnected from effects on eosinophilia or Th2 cytokine production
(2) BAIAP2L1 has never been linked to asthma in mice or humans
Weaknesses:
(1) While the observations of reduced BHR in IgM deficient mice are strong, there is insufficient mechanistic underpinning on how loss of IgM could lead to reduced expression of BAIAP2L1. Since it is impossible to restore IgM levels by either serum or B cell transfer and since protein levels of BAIAP2L1 are not significantly reduced, there is a lack of a causal relationship that this is the explanation for the lack of BHR in IgM-deficient mice. The reader is unclear if there is a fundamental (maybe developmental) difference in non-hematopoietic cells in these IgM-deficient mice (which might have accumulated another genetic mutation over the years). In this regard, it would be important to know if littermates were newly generated, or historically bred along with the KO line.
We thank the reviewer for asking this question and getting us to think of this in a different way. This prompted us to use a different method to try to restore IgM function and, since our animal facility no longer allows irradiation, we opted for busulfan. We present this as new data in Figure 3. We had to go back and breed this strain and then generate bone marrow chimeras. What we have now shown with chimeras is that we can deplete bone marrow from IgM-deficient mice and replace it with congenic WT bone marrow when we allow these mice to rest for 2 months before challenge with HDM (new Supplementary Figure 6 a-c). We also show that AHR (resistance and elastance) is partially restored in this way (Figure 3 a and b), as mice that receive congenic WT bone marrow after chemical irradiation can mount AHR, whereas those that receive IgM-deficient bone marrow cannot mount AHR upon challenge with HDM. If the mice had accumulated an unknown genetic mutation in non-hematopoietic cells, the transfer of WT bone marrow would not make a difference. So, we do not believe the colony could have gained a mutation that we are unaware of. We have also shipped these mice to other groups and in their hands this strain still behaves only as an IgM knockout mouse. See their publication below.
Mark Noviski, James L Mueller, Anne Satterthwaite, Lee Ann Garrett-Sinha, Frank Brombacher, Julie Zikherman 2018. IgM and IgD B cell receptors differentially respond to endogenous antigens and control B cell fate. eLife 2018;7:e35074. DOI: https://doi.org/10.7554/eLife.35074
We have also added methods for bone marrow chimaeras, as well as results sections and new Figures related to these methods.
Methods (line 171-182).
“Busulfan Bone marrow chimeras
WT (CD45.2) and IgM<sup>-/-</sup> (CD45.2) congenic mice were treated with 25 mg/kg busulfan (Sigma-Aldrich, Aston Manor, South Africa) per day for 3 consecutive days (75 mg/kg in total) dissolved in 10% DMSO and Phosphate buffered saline (0.2mL, intraperitoneally) to ablate bone marrow cells. Twenty-four hours after last administration of busulfan, mice were injected intravenously with fresh bone marrow (10x10<sup>6</sup> cells, 100mL) isolated from hind leg femurs of either WT (CD45.1) or IgM<sup>-/-</sup> mice(33). Animals were then allowed to complement their haematopoietic cells for 8 weeks. In some experiments the level of bone marrow ablation was assessed 4 days post-busulfan treatment in mice that did not receive donor cells. At the end of experiment level of complemented cells were also assessed in WT and IgM<sup>-/-</sup> mice that received WT (CD45.1) bone marrow.”
Results (line 491-521)
“Replacement of IgM-deficient mice with functional hematopoietic cells in busulfan chimeric mice restores airway hyperresponsiveness.
We then generated bone marrow chimeras by chemical radiation using busulfan(33). We treated mice three times with busulfan for 3 consecutive days and after 24 hrs transferred naïve bone marrow from congenic CD45.1 WT mice or CD45.2 IgM<sup>-/-</sup> mice (Fig. 3a and Supplementary Fig. 5a). We showed that recipient mice that did not receive donor bone marrow after 4 days post-treatment have significantly reduced lineage markers (CD45+Sca-1+) or lineage negative (Lin-) cells in the bone marrow when compared to untreated or vehicle (10% DMSO) treated mice (Supplementary Figure 5b-c). We allowed mice to reconstitute bone marrow for 8 weeks before sensitisation and challenge with low dose HDM (Figure 3a). We showed that WT (CD45.2) recipient mice that received WT (CD45.1) donor bone marrow had higher airway resistance and elastance and this was comparable to IgM<sup>-/-</sup> (CD45.2) recipient mice that received donor WT (CD45.1) bone marrow (Figure 3b). As expected, IgM<sup>-/-</sup> (CD45.2) recipient mice that received donor IgM<sup>-/-</sup> (CD45.2) bone marrow had significantly lower AHR compared to WT (CD45.2) or IgM<sup>-/-</sup> (CD45.2) recipient mice that received WT (CD45.1) bone marrow (Figure 3b). We confirmed that the differences observed were not due to differences in bone marrow reconstitution as we saw similar frequencies of CD45.1 cells within the lymphocyte populations in the lungs and other tissues (Supplementary Fig. 5d). We observed no significant changes in the lung neutrophils, eosinophils, inflammatory macrophages, CD4 T cells or B cells in WT or IgM<sup>-/-</sup> (CD45.2) recipient mice that received donor WT (CD45.1/CD45.2) or IgM<sup>-/-</sup> (CD45.2) bone marrow when sensitised and challenged with low dose HDM (Fig. 3c)
Restoring IgM function through adoptive reconstitution with congenic CD45.1 bone marrow in non-chemically irradiated recipient mice or sorted B cells into IgM<sup>-/-</sup> mice (Supplementary Fig. 6a) did not replenish IgM B cells to levels observed in WT mice and as a result did not restore AHR, total IgE and IgM in these mice (Supplementary Fig. 6b-c).”
The 2 new figures are Figure 3, which moved the rest of the Figures down, and Supplementary Figure 5, which also moved the rest of the supplementary figures down.
Discussion appears in line 757-766 of the untracked version of the article.
To resolve other endogenous factors that could have potentially influenced reduced AHR in IgM-deficient mice, we resorted to busulfan chemical irradiation to deplete bone marrow cells in IgM-deficient mice and replace bone marrow with WT bone marrow. While it is well accepted that busulfan chemical irradiation partially depletes bone marrow cells, in our case it was not possible to pursue other irradiation methods due to changes in ethical regulations and the fact that mice are slow to recover after gamma-ray irradiation. Busulfan chemical irradiation allowed us to show that we could mostly restore AHR in IgM-deficient recipient mice that received donor WT bone marrow when challenged with low dose HDM.
(2) There is no mention of the potential role of complement in activation of AHR, which might be altered in IgM-deficient mice
We thank the reviewer for this comment. We have not directly looked at complement in this instance. However, in our previous work, C3-/- mice showed AHR comparable to that of WT mice under HDM challenge.
(3) What is the contribution of elevated IgD in the phenotype of the IgM-deficient mice. It has been described by this group that IgD levels are clearly elevated
We thank the reviewer for this question. We believe that IgD is essentially what drives partial class switching to IgG. We have shown, in the case of VSV virus, Trypanosoma congolense and Trypanosoma brucei brucei, that elevated IgD drives delayed but effective IgG responses in the absence of IgM (Lutz et al, 2001, Nature). This is also supported by the studies of Noviski et al., which show that IgM and IgD share some endogenous antigens, so it is likely that external antigens can activate IgD in a similar manner to prompt class switching.
(4) How can transfer of naïve serum in class switching deficient IgM KO mice lead to restoration of allergen specific IgE and IgG1?
We thank the Reviewer for these comments. We believe that IgM in naïve sera transferred to IgM-deficient mice is able to bind to the surface of B cells via IgM receptors (FcμR / Fcα/μR), which are still present on B cells, and that this is sufficient to facilitate class switching. Our IgM<sup>-/-</sup> mouse lacks both membrane-bound and secreted IgM, and transferred serum contains at least secreted IgM, which can bind to surfaces via its Fc portion. We measured HDM-specific IgE and found very low levels, but these were not different between WT mice and IgM<sup>-/-</sup> mice adoptively transferred with WT serum. We also detected HDM-specific IgG1 in IgM<sup>-/-</sup> mice transferred with WT sera at the same level as in WT mice, consistent with possible class switching. Of course, we cannot rule out that transferred sera also contain some IgG1. We also cannot rule out that elevated IgD levels are partially responsible for class-switched IgG1, as discussed above.
In the discussion line 804-812, we also added the following
“We speculate that IgM can directly activate smooth muscle cells by binding a number of its surface receptors including FcμR, Fcα/μR and pIgR(52-54). IgM binds to FcμR strictly, but shares Fcα/μR and pIgR with IgA(5,52,54). Both Fcα/μR and pIgR can be expressed by non-structural cells at mucosal sites(54,55). We would not rule out that the mechanisms of muscle contraction might be through one of these IgM receptors, especially the ones expressed on smooth muscle cells(54,55). Certainly, our future studies will be directed towards characterizing the mechanism by which IgM potentially activates the smooth muscle.”
We have discussed this section under Discussion section, line 731 to 757. In addition, since we have now performed bone marrow chimaeras we have further added the following in our discussion in line 757-766.
To resolve other endogenous factors that could have potentially influenced reduced AHR in IgM-deficient mice, we resorted to busulfan chemical irradiation to deplete bone marrow cells in IgM-deficient mice and replace bone marrow with WT bone marrow. While it is well accepted that busulfan chemical irradiation partially depletes bone marrow cells, in our case it was not possible to pursue other irradiation methods due to changes in ethical regulations and the fact that mice are slow to recover after gamma-ray irradiation. Busulfan chemical irradiation allowed us to show that we could mostly restore AHR in IgM-deficient recipient mice that received donor WT bone marrow when challenged with low dose HDM.
We removed the following lines, after performing bone marrow chimaeras since this changed some aspects.
Our efforts to adoptively transfer wild-type bone marrow or sorted B cells into IgM-deficient mice were also largely unsuccessful partly due to poor engraftment of wild-type B cells into secondary lymphoid tissues. Natural secreted IgM is mainly produced by B1 cells in the peritoneal cavity, and it is likely that any transfer of B cells via bone marrow transfer would not be sufficient to restore soluble levels of IgM(3,10).
(5) Alpha smooth muscle antigen is also expressed by myofibroblasts. This is insufficiently worked out. The histology mentions "expression in cells in close contact with smooth muscle". This needs more detail since it is a very vague term. Is it in smooth muscle or in myofibroblasts.
Response: We appreciate that alpha-smooth muscle actin-positive cells are a small fraction in the lung and even within CD45 negative cells, but their contribution to airway hyperresponsiveness is major. We also concede that by immunofluorescence BAIAP2L1 seems to be expressed by cells adjacent to alpha-smooth muscle actin (Fig. 5b), however, we know that cells close to smooth muscle (such as extracellular matrix and myofibroblasts) contribute to its hypertrophy in allergic asthma.
James AL, Elliot JG, Jones RL, Carroll ML, Mauad T, Bai TR, et al. Airway Smooth Muscle Hypertrophy and Hyperplasia in Asthma. Am J Respir Crit Care Med [Internet]. 2012;185:1058–64. Available from: https://doi.org/10.1164/rccm.201110-1849OC
(6) Have polymorphisms in BAIAP2L1 ever been linked to human asthma?
No. We have looked at asthma GWAS studies, at least the summary statistics, and we have not seen any SNPs that could be associated with human asthma.
(7) IgM deficient patients are at increased risk for asthma. This paper suggests the opposite. So the translational potential is unclear
We thank the reviewer for these comments. At the time of this publication, we have not made a concrete link with human disease. There is some anecdotal evidence of diseases such as autoimmune glomerulonephritis, Hashimoto’s thyroiditis, bronchial polyp, SLE, coeliac disease and other diseases in people with low IgM. Allergic disorders are also common in people with IgM deficiency, as the reviewer correctly points out; other studies have reported rates as high as 33-47%. The mechanisms for the high incidence of allergic diseases are unclear as, generally, these patients have normal or higher IgG and IgE levels. IgM deficiency may represent a heterogeneous spectrum of genetic defects, which might explain the heterogeneous nature of disease presentations.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The manuscript performs a comprehensive biochemical, structural, and bioinformatic analysis of TseP, a type 6 secretion system effector from Aeromonas dhakensis that includes the identification of a domain required for secretion and residues conferring target organism specificity. Through targeted mutations, they have expanded the target range of a T6SS effector to include a gram-positive species, which is not typically susceptible to T6SS attack.
Strengths:
All of the experiments presented in the study are well-motivated and the conclusions are generally sound.
Thank you.
Weaknesses:
There are some issues with the clarity of figures. For example, the microscopy figures could have been more clearly presented as cell counts/quantification rather than representative images. Similarly, loading controls for the secreted proteins for the westerns probably should be shown.
Also, some of the minor/secondary conclusions reached regarding the "independence" of the N and C term domains of the TseP are a bit overreaching.
We thank the reviewer for pointing out the issues and have carefully revised the manuscript accordingly. We acknowledge the reviewer’s concern regarding the independence of the N- and C-terminal domains, and have toned down the relevant claims.
Reviewer #2 (Public review):
Summary:
Wang et al. investigate the role of TseP, a Type VI secretion system (T6SS) effector molecule, revealing its dual enzymatic activities as both an amidase and a lysozyme. This discovery significantly enhances the understanding of T6SS effectors, which are known for their roles in interbacterial competition and survival in polymicrobial environments. TseP's dual function is proposed to play a crucial role in bacterial survival strategies, particularly in hostile environments where competition between bacterial species is prevalent.
Strengths:
(1) The dual enzymatic function of TseP is a significant contribution, expanding the understanding of T6SS effectors.
(2) The study provides important insights into bacterial survival strategies, particularly in interbacterial competition.
(3) The findings have implications for antimicrobial research and understanding bacterial interactions in complex environments.
Thank you.
Weaknesses:
(1) The manuscript assumes familiarity with previous work, making it difficult to follow. Mutants and strains need clearer definitions and references.
Thank you for raising the issue. We have revised the manuscript to improve clarity by including more detailed descriptions of the mutants and strains, along with references to prior work where relevant.
(2) Figures lack proper controls, quantification, and clarity in some areas, notably in Figures 1A and 1C.
We have now added the controls as requested by reviewers.
(3) The Materials and Methods section is poorly organized, hindering reproducibility. Biophysical validation of Zn<sup>2+</sup> interaction and structural integrity of proteins need to be addressed.
We have now included more details in the Materials and Methods section. While we recognize the importance of biophysical validation of the Zn<sup>2+</sup> interaction, this analysis lies beyond the primary scope of the current study. We plan to investigate the role of the Zn<sup>2+</sup> interaction and the EF-hand domain in greater depth as part of our follow-up studies. Thank you for this suggestion.
(4) Discrepancies in protein degradation patterns and activities across different figures raise concerns about data reliability.
We acknowledge the concern about discrepancies in protein degradation patterns. TseP exhibits inherent instability, which might explain the observed variations. We have added an explanation in the detailed response letter and the manuscript.
Reviewer #3 (Public review):
Summary:
Type VI secretion systems (T6SS) are employed by bacteria to inject competitor cells with numerous effector proteins. These effectors can kill injected cells via an array of enzymatic activities. A common class of T6SS effectors is peptidoglycan (PG) lysing enzymes. In this manuscript, the authors characterize a PG-lysing effector, TseP, from the pathogen Aeromonas dhakensis. While the C-terminal domain of TseP was known to have lysozyme activity, the N-terminal domain was uncharacterized. Here, the authors functionally characterize TsePN as a zinc-dependent amidase. This discovery is somewhat novel because it is rare for PG-lysing effectors to have amidase and lysozyme activity.
In the second half of the manuscript, the authors utilize a crystal structure of the lysozyme TsePC domain to inform the engineering of this domain to lyse gram-positive peptidoglycan.
Strengths:
The two halves of the manuscript considered together provide a nice characterization of a unique T6SS effector and reveal potentially general principles for lysozyme engineering.
Thank you.
Weaknesses:
The advantage of fusing amidase and lysozyme domains in a single effector is not discussed but would appear to be a pertinent question. Labeling of the figures could be improved to help readers understand the data.
Thank you for the suggestions. We have revised the manuscript and figures to improve clarity.
The advantage of having dual-domain functions relative to having just one of the two functions is likely an increase in competitive fitness. Although such dual-functional cell-wall-targeting effectors have not been characterized prior to this study, there are some examples in which dual functions are encoded by the same secretion module, for example the VgrG1-TseL pair in Vibrio cholerae. The C-terminus of VgrG1 not only catalyzes actin crosslinking but also recognizes and delivers the downstream-encoded lipase effector TseL through direct interaction. In this context, the VgrG1-TseL pair also represents a dual-functional module. Therefore, it is likely that fusing effector domains and coupling effector functions are parallel strategies for the evolution of T6SS effectors.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
The paper explored cross-species variance in albumin glycation and blood glucose levels as a function of various life-history traits. Their results show that
(1) blood glucose levels predict albumin glycation rates
(2) larger species have lower blood glucose levels
(3) lifespan positively correlates with blood glucose levels and
(4) diet predicts albumin glycation rates.
The data presented is interesting, especially due to the relevance of glycation to the ageing process and the interesting life-history and physiological traits of birds. Most importantly, the results suggest that some mechanisms might exist that limit the level of glycation in species with the highest blood glucose levels.
While the questions raised are interesting and the amount of data the authors collected is impressive, I have some major concerns about this study:
(1) The authors combine many databases and samples of various sources. This is understandable when access to data is limited, but I expected more caution when combining these. For example, glucose is measured in all samples without any description of how handling stress was controlled for. Glucose levels can easily double in a few minutes in birds, potentially introducing variation in the data generated. The authors report no caution about this effect, nor any statistical approaches aiming to check whether handling stress had an effect here, either on glucose or on glycation levels.
(2) The database with the predictors is similarly problematic. There is information pulled from captivity and wild (e.g. on lifespan) without any confirmation that the different databases are comparable (and here I'm not just referring to the correlation between the databases, but also to a potential systematic bias, e.g. captivity-based sources likely consistently report longer lifespans). This is even more surprising, given that the authors raise the possibility of captivity effects in the discussion, and exploring this question would be extremely easy in their statistical models (a simple covariate in the MCMCglmms).
(3) The authors state that one of the primary response variables (glycation) was measured without any replicability test or reference to the replicability of the measurement technique.
(4) The methods and results are very poorly presented. For instance, new model types and variables are popping up throughout the manuscript, already reporting results, before explaining what these are e.g. results are presented on "species average models" and "model with individuals", but it's not described what these are and why we need to see both. Variables, like "centered log body mass", or "mass-adjusted lifespan" are not explained. The results section is extremely long, describing general patterns that have little relevance to the questions raised in the introduction and would be much more efficiently communicated visually or in a table.
Reviewer #2 (Public review):
Summary
In this extensive comparative study, Moreno-Borrallo and colleagues examine the relationships between plasma glucose levels, albumin glycation levels, diet, and life-history traits across birds. Their results confirmed the expected positive relationship between plasma blood glucose level and albumin glycation rate but also provided findings that are somewhat surprising or contradicting findings of some previous studies (relationships with lifespan, clutch mass, or diet). This is the first extensive comparative analysis of glycation rates and their relationships to plasma glucose levels and life history traits in birds that are based on data collected in a single study and measured using unified analytical methods.
Strengths
This is an emerging topic gaining momentum in evolutionary physiology, which makes this study a timely, novel, and very important contribution. The study is based on a novel data set collected by the authors from 88 bird species (67 in captivity, 21 in the wild) of 22 orders, which itself greatly contributes to the pool of available data on avian glycemia, as previous comparative studies either extracted data from various studies or a database of veterinary records of zoo animals (therefore potentially containing much more noise due to different methodologies or other unstandardised factors), or only collected data from a single order, namely Passeriformes. The data further represents the first comparative avian data set on albumin glycation obtained using a unified methodology. The authors used LC-MS to determine glycation levels, which does not have problems with specificity and sensitivity that may occur with assays used in previous studies. The data analysis is thorough, and the conclusions are mostly well-supported (but see my comments below). Overall, this is a very important study representing a substantial contribution to the emerging field of evolutionary physiology focused on the ecology and evolution of blood/plasma glucose levels and resistance to glycation.
Weaknesses
My main concern is about the interpretation of the coefficient of the relationship between glycation rate and plasma glucose, which reads as follows: "Given that plasma glucose is logarithm transformed and the estimated slope of their relationship is lower than one, this implies that birds with higher glucose levels have relatively lower albumin glycation rates for their glucose, fact that we would be referring as higher glycation resistance" (lines 318-321) and "the logarithmic nature of the relationship, suggests that species with higher plasma glucose levels exhibit relatively greater resistance to glycation" (lines 386-388). First, only plasma glucose (predictor) but not glycation level (response) is logarithm transformed, and this semi-logarithmic relationship assumed by the model means that an increase in glycation always slows down when blood glucose goes up, irrespective of the coefficient. The coefficient thus does not carry information that could be interpreted as higher (when <1) or lower (when >1) resistance to glycation (this only can be done in a log-log model, see below) because the semi-log relationship means that glycation increases by a constant amount (expressed by the coefficient of plasma glucose) for every tenfold increase in plasma glucose (for example, with glucose values 10 and 100, the model would predict glycation values 2 and 4 if the coefficient is 2, or 0.5 and 1 if the coefficient is 0.5). Second, the semi-logarithmic relationship could indeed be interpreted such that glycation rates are relatively lower in species with high plasma glucose levels. However, the semi-log relationship is assumed here a priori and forced to the model by log-transforming only glucose level, while not being tested against alternative models, such as: (i) a model with a simple linear relationship (glycation ~ glucose); or (ii) a log-log model (log(glycation) ~ log(glucose)) assuming power function relationship (glycation = a * glucose^b).
The latter model would allow for the interpretation of the coefficient (b) as higher (when <1) or lower (when >1) resistance in glycation in species with high glucose levels as suggested by the authors.
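The distinction drawn above between the semi-logarithmic and the log-log (power function) forms can be illustrated with a minimal sketch. The function names, the zero intercept and the value a = 1 are assumptions chosen for illustration only; they are not taken from the manuscript or its models. The first loop reproduces the numerical example given in the review (glucose 10 and 100 with coefficients 2 and 0.5), and the second shows that glycation per unit of glucose falls with glucose only when the exponent b is below one.

```python
import math


def semi_log(glucose, intercept, slope):
    """Semi-log form: glycation = intercept + slope * log10(glucose)."""
    return intercept + slope * math.log10(glucose)


def power_law(glucose, a, b):
    """Log-log form: glycation = a * glucose**b."""
    return a * glucose ** b


# Semi-log: a tenfold rise in glucose always adds exactly `slope`,
# so the size of the slope says nothing about "resistance".
for slope in (2.0, 0.5):
    low = semi_log(10, 0.0, slope)
    high = semi_log(100, 0.0, slope)
    print(f"slope={slope}: glycation {low} -> {high} (gain {high - low})")

# Log-log: glycation per unit glucose decreases with glucose when b < 1,
# is constant when b == 1 and increases when b > 1.
for b in (0.5, 1.0, 1.5):
    per_unit_low = power_law(10, 1.0, b) / 10
    per_unit_high = power_law(100, 1.0, b) / 100
    print(f"b={b}: glycation/glucose {per_unit_low:.3f} -> {per_unit_high:.3f}")
```

Under these assumptions, only the exponent of the power function can be read as relatively lower (b < 1) or higher (b > 1) glycation at high glucose.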
Besides, a clear explanation of why glucose is log-transformed when included as a predictor, but not when included as a response variable, is missing.
We apologize for missing an answer to this part before. Indeed, glucose is always log transformed and this is explained in the text.
The models in the study do not control for the sampling time (i.e., time latency between capture and blood sampling), which may be an important source of noise because blood glucose increases because of stress following the capture. Although the authors claim that "this change in glucose levels with stress is mostly driven by an increase in variation instead of an increase in average values" (ESM6, line 46), their analysis of Tomasek et al.'s (2022) data set in ESM1 using Kruskal-Wallis rank sum test shows that, compared to baseline glucose levels, stress-induced glucose levels have higher median values, not only higher variation.
Although the authors calculated the variance inflation factor (VIF) for each model, it is not clear how these were interpreted and considered. In some models, GVIF^(1/(2*Df)) is higher than 1.6, which indicates potentially important collinearity (see, for example, https://www.bookdown.org/rwnahhas/RMPH/mlr-collinearity.html). This is often the case for body mass or clutch mass (e.g. models of glucose or glycation based on individual measurements).
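As a rough illustration of the quantity mentioned above, the sketch below computes a VIF for the simplest case of two continuous predictors, where VIF = 1 / (1 - r^2), and then the scaled value GVIF^(1/(2*Df)). For a predictor with one degree of freedom the GVIF equals the ordinary VIF, so the scaled value is its square root. The data are invented for illustration (hypothetical log body mass and log clutch mass values), the helper names are assumptions, and the 1.6 cut-off is the one quoted in the review rather than a universal rule.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF of either predictor in a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


def scaled_gvif(gvif, df):
    """GVIF^(1/(2*Df)); with df == 1 this is sqrt(VIF)."""
    return gvif ** (1.0 / (2.0 * df))


# Hypothetical, strongly correlated predictors (illustration only).
log_body_mass = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
log_clutch_mass = [0.2, 0.6, 0.7, 1.2, 1.3, 1.9]

vif = vif_two_predictors(log_body_mass, log_clutch_mass)
scaled = scaled_gvif(vif, df=1)
print(f"VIF = {vif:.2f}, GVIF^(1/(2*Df)) = {scaled:.2f}, above 1.6: {scaled > 1.6}")
```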
It seems that the differences between diet groups other than omnivores (the reference category in the models) were not tested and only inferred using the credible intervals from the models. However, these credible intervals relate to the comparison of each group with the reference group (Omnivore) and cannot be used for pairwise comparisons between other groups. Statistics for these contrasts should be provided instead. Based on the plot in Figure 4B, it seems possible that terrestrial carnivores differed in glycation level not only from omnivores but also from herbivores and frugivores/nectarivores.
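The point about pairwise contrasts can be sketched as follows. With treatment coding, each diet coefficient is a difference from the reference level (Omnivore), so the contrast between two non-reference groups is the draw-by-draw difference of their coefficients, and its credible interval is taken from those differences. The posterior draws below are simulated placeholders, not results from the study, and the equal-tailed interval is an assumption (packages such as MCMCglmm typically report highest posterior density intervals instead).

```python
import random


def contrast(draws_a, draws_b):
    """Draw-by-draw difference between two coefficients from the same chain."""
    return [a - b for a, b in zip(draws_a, draws_b)]


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from posterior draws."""
    ordered = sorted(draws)
    last = len(ordered) - 1
    lower = ordered[int((1 - level) / 2 * last)]
    upper = ordered[int((1 + level) / 2 * last)]
    return lower, upper


# Simulated placeholder draws for two diet coefficients, each expressed
# relative to the omnivore reference level (hypothetical values).
random.seed(1)
n_draws = 4000
terrestrial_carnivore = [random.gauss(3.0, 1.0) for _ in range(n_draws)]
herbivore = [random.gauss(0.5, 1.0) for _ in range(n_draws)]

difference = contrast(terrestrial_carnivore, herbivore)
lower, upper = credible_interval(difference)
print(f"carnivore - herbivore: 95% interval [{lower:.2f}, {upper:.2f}]")
```

In a real analysis the two sets of draws come from the same MCMC iterations and are usually correlated, which is why the subtraction is done per draw rather than by comparing the two separately reported intervals.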
Given that blood glucose is related to maximum lifespan, it would be interesting to also see the results of the model from Table 2 while excluding blood glucose from the predictors. This would allow for assessing if the maximum lifespan is completely independent of glycation levels. Alternatively, there might be a positive correlation mediated by blood glucose levels (based on its positive correlations with both lifespan and glycation), which would be a very interesting finding suggesting that high glycation levels do not preclude the evolution of long lifespans.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Line 84: "glycation scavengers" such as polyamines - can you specify what these polyamines do exactly?
A clarification of what we mean with "glycation scavengers" is added.
(2) Line 87-89: specify that the work of Wein et al. and this sentence is about birds.
This is now clarified.
(3) Line 95: "88 species" add "OF BIRDS". Also, I think it would be nice if you specified here that you are relying on primary data.
This is now clarified (line 96).
(4) Line 90-119: I find this paragraph very long and complex, with too many details on the methodology. For instance, I agree with listing your hypothesis, e.g. that with POL, but then what variables you use to measure the pace of life can go in the materials and methods section (so all lines between 112-119).
This is explained here as a previous reviewer considered this presentation was indeed needed in the introduction.
(5) Line 122-124: The first sentence should state that you collected blood samples from various sources, and list some examples: zoos? collaborators? designated wild captures? Stating the sample size before saying what you did to get them is a bit weird. Besides, you skipped a very important detail about how these samples were collected, when, where, and using what protocols. We know very well, that glucose levels can increase quickly with handling stress. Was this considered during the captures? Moreover, you state that you had 484 individuals, but how many samples in total? One per individual or more?
We kindly ask the reviewer to read the multiple supplementary materials provided, in which the questions of the source of the samples, potential stress effects and sample sizes for each model are addressed. Each individual contributed one sample. More details about the general sources employed are now given in lines 125-127.
(6) Line 135-36: numbers below 10 should be spelled out.
Ok. Now that is changed.
(7) Line 136: the first time I saw that you had both wild and captive samples. This should be among the first things to be described in the methods, as mentioned above.
As stated above, details on this are included in the supplementary materials, but further clarifications have now been included in the main text (question 5).
(8) Line 137-138: not clear. So you had 46 samples and 9 species. But what does the 3-3-3 sample mean? or for each species you chose 9 samples (no, cause that would be 81 samples in total)?
This has now been clarified (lines 139-140).
(9) Line 139-141: what methodological constraints? Too high glucose levels? Too little plasma?
There were cases in which the device (glucometer) produced a non-specific error. This did not correspond to glucose levels that were too high or too low, as these errors are signalled differently. Neither the manual nor customer service provided useful information to discern the cause. This may be related to the plasma composition of certain species interfering with the measurement. Some clarifications have been added (lines 143-146).
(10) Line 143: should be ZIMS.
Corrected.
(11) Line 120-148: you generally talk about individuals here, but I feel it would be more precise to use 'samples'.
The two terms are fully interchangeable, as we never measured more than one sample for a given individual within this study. Besides, in some cases, saying “sample” could be less informative.
(12) Line 150: missing the final number of measurements for glucose and glycation.
Please, read the ESM6 (Table ESM6.1), where this information is given.
(13) Line 154-155: so you took multiple samples from the same individual? It's the first time the text indicates so. Or do you mean technical replicates were not performed on the same samples?
As previously indicated, each individual contributed only one sample. Replicates were done only for some individuals to validate the technique, as it would be unfeasible to perform replicates of all of them. This part of the text refers to the fact that not all samples were analysed at the same time, as the analysis takes a considerable amount of time and the mass spectrometry devices are shared with other teams and projects. Clarifications to this effect have now been added (lines 160-163).
(14) Line 171-172: "After realizing that diet classifications from AVONET were not always suitable for our purpose" - too informal. Try rephrasing, like "After determining that AVONET diet classifications did not align with our research needs...", but you still need to specify what was wrong with it and what was changed, based on what argument?
The new formulation suggested by the reviewer has now been applied (lines 181-183). The details are given in the ESM6, as indicated in the text.
(15) Line 174-176: You start a new paragraph, talking about missing values, but you do not specify what variable are you talking about. you talk about calculating means, but the last variable you mentioned was diet, so it's even more strange.
We refer to life history traits. It has now been clarified in the text (line 185).
(16) Line 177: what longevity records? Coming from where? How did you measure longevity? Maximum lifespan ever recorded? 80-90% longevity, life expectancy???
We refer to maximum lifespan, as indicated in the introduction and in every other case throughout the manuscript. Clarifications have now been introduced (lines 188-190).
(17) Line 180-183: using ZIMS can be problematic, especially for maximum longevity. There are often individuals who had a wrong date of birth entered or individuals that failed to be registered as dead. The extremes in this database are often way off. If you want to combine though, you can check the correlation of lifespans obtained from different sources for the overlapping species. If it's a strong correlation it can be ok, but intuitively this is problematic.
The species for which we used ZIMS were those for which no other databases reported any values. We could try correlations for other species, but this issue is not necessarily restricted to ZIMS, as the primary origin of the data in other databases is often difficult to trace. Also, ZIMS is potentially more up to date than some of the other databases, mainly the Amniote database, on which we rely the most, as it includes the highest number of species in the most easily accessible format.
(18) Line 181-186: in ZIMS you calculate the average of the competing records, otherwise you choose the max. Why use different preferences for the same data?
This constitutes a misunderstanding, for which we include clarifications now (line 196). We were referring here to the fact that for maximum lifespan the maximum is always chosen, while for other variables an average is calculated.
(19) Line 198: Burn-in and thinning interval is quite low compared to your number of iterations. How were model convergences checked?
Please, check ESM1.
(20) Line 201-203: What's the argument using these priors? Why not use noninformative ones? Do you have some a priori expectations? If so, it should be explained.
Models have now been rerun with no expectations on the variance partitions, so the priors are less informative given the lack of firm expectations, and the results are similar. Smaller nu values were also tried.
(21) Line 217: "carried" OUT.
Corrected (now in line 229).
(22) Line 233-234: "species average model" - what is this? it was not described in the methods.
Please, read the ESM6.
(23) Line 232-246: (a) all this would be better described by a table or plot. You can highlight some interesting patterns, but describing it all in the text is not very useful I think, (b) statistically comparing orders represented by a single species is a bit odd.
(a) Figure 1 shows this graphically, but this part was found to be quite short without descriptions by previous reviewers. (b) We recognise this limitation, but this part is not presented as one of the main results of the article, and just constitutes an attempt to illustrate very general patterns, in order to guide future research, as in most groups glycation has never been measured, so this still constitutes the best illustration of such patterns in the literature.
(24) Line 281: the first time I saw "mass-adjusted maximum lifespan" - what is this, and how was it calculated? It should be described in the methods. But in any case, neither ratios, nor residuals should be used, but preferably the two variables should be entered side by side in the model.
Please, see ESM6 for the explanations and justifications for all of this.
(25) Line 281: there was also no mention of quadratic terms so far. How were polynomial effects tested/introduced in the models? Orthogonal polynomials? or x+ x^2?
Please, read ESM6.
(26) Table 1. What is 'Centred Log10Body mass', should be added in the methods.
Please, read ESM6.
(27) Table 1: what's the argument behind separating terrestrial and aquatic carnivores?
This was mostly based on the a priori separation made in AVONET, but it is also used in a similar way by Szarka and Lendvai 2024 (comparative study on glucose in birds), where differences in glucose levels between piscivorous and carnivorous species are reported. We had some reasons to think that certain differences in dietary nutrient composition, as discussed later, can make this difference relevant.
(28) Table 1: The variable "Maximum lifespan" is discussed and plotted as 'mass-adjusted maximum lifespan' and 'residual maximum lifespan'. First, this is confusing, the same name should be used throughout and it should be defined in the methods section. Second, it seems that non-linear effects were tested by using x + x^2. This is problematic statistically, orthogonal polynomials should be used instead (check the poly function in R). Also, how did you decide to test for non-linear effects in the case of lifespan but not the other continuous predictors? Should be described in the methods again.
Please, read ESM6. Data exploration was performed prior to carrying out these models. Orthogonal polynomials were considered to make the estimates, and therefore the patterns predicted by the models, harder to interpret, so raw polynomials were used. Clarifications have now been included in line 297.
(29) Figure 2. From the figure label, now I see that relative lifespan is in fact residual. This is problematic, see Freckleton, R. P. (2009). The seven deadly sins of comparative analysis. Journal of evolutionary biology, 22(7), 1367-1375. Using body mass and lifespan side by side is preferred. This would also avoid forcing more emphasis on body mass over lifespan meaning that you subjectively introduce body mass as a key predictor, but lifespan and body size are highly correlated, so by this, you remove a large portion of variance that might in fact be better explained by lifespan.
Please, read ESM6 for justifications on the use of residuals.
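The trade-off raised in point (28) between raw (x + x^2) and orthogonal polynomial terms can be illustrated with a small sketch. Raw linear and quadratic terms of a positive predictor are strongly correlated, whereas an orthogonalised quadratic term (here obtained by centring and one Gram-Schmidt step, similar in spirit to what R's `poly()` returns up to scaling) is uncorrelated with the linear term. The predictor values and helper names are hypothetical and serve only as an illustration; they are not the study's data or code.

```python
import math


def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    """Dot product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def correlation(u, v):
    """Pearson correlation computed from centred vectors."""
    uc, vc = center(u), center(v)
    return dot(uc, vc) / math.sqrt(dot(uc, uc) * dot(vc, vc))


def orthogonal_quadratic(x):
    """Linear and quadratic terms made orthogonal to each other and to the intercept."""
    linear = center(x)
    quadratic = center([v * v for v in x])
    # Gram-Schmidt step: remove the part of x^2 explained by the linear term.
    coef = dot(quadratic, linear) / dot(linear, linear)
    quadratic = [q - coef * l for q, l in zip(quadratic, linear)]
    return linear, quadratic


# Hypothetical predictor values (illustration only).
x = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
raw_quadratic = [v * v for v in x]
linear, quadratic = orthogonal_quadratic(x)

print(f"correlation of x with x^2: {correlation(x, raw_quadratic):.3f}")
print(f"correlation of orthogonal terms: {correlation(linear, quadratic):.3f}")
```

Both parameterisations span the same quadratic curve, so fitted values are unchanged; what differs is the collinearity between the terms and how directly each coefficient can be read.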
Reviewer #2 (Recommendations for the authors):
(1) If the semi-logarithmic relationship (glycation ~ log10(glucose)) is to be used to support the hypothesis about higher glycation resistance in species with high blood glucose (lines 318-321 and 386-388), it should be tested whether it is significantly better than the model assuming a simple linear relationship (i.e., glycation ~ glucose). Alternatively, if the coefficient is to be used to determine whether glycation rate slows down or accelerates with increasing glucose levels, log-log model (log10(glycation) ~ log10(glucose)) assuming power function relationship (glycation = a * glucose^b) should be used (as is for example in the literature about relationships between metabolic rates and body size). Probably the best approach would be to compare all three models (linear, semi-logarithmic, and log-log) and test if one performs significantly better. If none of them, then the linear model should be selected as the most parsimonious.
Different options (linear, both semi-logarithmic combinations and log-log) have now been tested, with similar results. All of the models confirm the pattern of a significant positive relationship between glucose and glycation. Moreover, when standardizing the variables (both glucose and glycation, either log transformed or not), the estimate of the slope is almost equal for all the models. It is also lower than one, which in the case of both the linear and log-log confirms the stated prediction. The log-log model, showing a much lower DIC than the linear version, is now shown as the final model.
(2) ESM6, line 46: Please note that Kruskal-Wallis rank sum test in ESM1 shows that, compared to baseline glucose levels, stress-induced glucose levels have higher median values (not only higher variation). With this in mind, what is the argument here about increased variation being the main driver of stress-induced change in glucose levels based on? It seems that both the median values and variation differ between baseline and stress-induced levels, and this should be acknowledged here.
As discussed in the public answers, the Kruskal-Wallis test does not allow differences in means to be determined; it only indicates that the groups are “different” (implicitly, in their rank sums, which does not necessarily mean in their means), while the Levene test performed signals heteroskedasticity. This makes this feature of the data analytically better grounded. Of course, when looking at the data, a higher mean can be perceived, but nothing can be said about its statistical significance. Still, some subtle changes have been introduced in the corresponding section of ESM6.
(3) Have you recorded the sampling times? If yes, why not control them in the models? It is at least highly advisable to include the sampling times in the data (ESM5).
As indicated in ESM6 lines 42-43, we do not have sampling times for most of the individuals (only zebra finches and swifts), so this cannot be accounted for in the models.
(4) If sampling times will remain uncontrolled statistically, I recommend mentioning this fact and its potential consequences (i.e., rather conservative results) in the Methods section of the main text, not only in ESM6.
A brief description of this has now been included in the main text (lines 129-132), referencing the more detailed discussion in the supplementary materials. Some subtle changes have also been included in the “Possible effects of stress” section of ESM6.
(5) ESM6, lines 52-53: The lower repeatability in Tomasek et al.' study compared to your study is irrelevant to the argument about the conservative nature of your results (the difference in repeatability between both studies is most probably due to the broader taxonomic coverage of the current study). The important result in this context is that repeatability is lower when sampling time is not considered within Tomasek et al's data set (ESM1). Therefore, I suggest rewording "showing a lower species repeatability than that from our data" to "showing lower species repeatability when sampling time is not considered" to avoid confusion. Please also note that you refer here to species repeatability but, in ESM1, you calculate individual repeatability. Nevertheless, both individual and species repeatabilities are lower when not controlling for sampling time because the main driver, in that case, is an increased residual variance.
We recognize that the explanation as currently presented is confusing, and have substantially reworded the section. However, we would like to indicate that ESM1 shows both species and individual repeatability (for the Tomasek et al. 2022 data; for ours only species repeatability, as we do not have repeated individual values). Changes have now been made to make this more evident.
(6) I recommend providing brief guidelines for the interpretation of VIFs to the readers, as well as a brief discussion of the obtained values and their potential importance.
Thank you for the recommendation. We included a brief description in lines 230-231. Also in the results section (lines 389-393).
(7) Line 264: Please note that the variance explained by phylogeny obtained from the models with other (fixed) predictors does not relate to the traits (glucose or glycation) per se but to model residuals.
We appreciate the indication, and this has been rephrased accordingly (lines 280-286).
(8) Change the term "confidence intervals" to "credible intervals" throughout the paper, since confidence interval is a frequentist term and its interpretations are different from Bayesian credible interval.
Thank you for the remark, this has now been changed.
(9) Besides lifespan, have you also considered quadratic terms for body mass? The plot in Figure 2A suggests there might be a non-linear relationship too.
A quadratic component of body mass has not shown any significant effect on glucose in an alternative model. Also, a model with linear instead of log glucose (as performed in other studies) did not perform better by comparing the DICs, despite both showing a significant relationship between glucose and body mass. Therefore, this model remains the best option considered as presented in the manuscript.
(10) ESM6, lines 115-116: It is usually recommended that only factors with at least 6 or 8 levels are included as random effects because a lower number of levels is insufficient for a good estimation of variance.
In a Bayesian approach this does not apply, as random and fixed factors are estimated similarly.
(11) Typos and other minor issues:
a) Line 66: Delete "related".
b) Figure 2: "B" label is missing in the plot.
c) Reference 9: Delete "Author".
d) References 15 and 83 are duplicated. Keep only ref. 83, which has the correct citation details.
e) ESM6, line 49: Change "GLLM" to "GLMM".
Thank you for indicating this. These have now been corrected.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Response to Reviewer 2’s comments:
I am concerned that the results in Figure 8D may not be correct, or that the authors may be mis-interpreting them. From my reading of the paper they cite (Lammers & Flamholz 2023), the equilibrium sharpness limit for the network they consider in Figure 8 should be 0.25. But both solutions shown in Figure 8D fall below this limit, which means that they have sharpness levels that could have been achieved with no energy expenditure. If this is the case, then it would imply that while both systems do dissipate energy, they are not doing so productively; meaning that the same results could be achieved while holding Phi=0.
I acknowledge that this could be due to a difference in how they measure sharpness, but wanted to raise it here in case it is, in fact, a genuine issue with the analysis. There should be an easy fix for this: just set the sharper "desired response" curve in 8b to be such that it demands non-equilibrium sharpness levels (0.25<S<0.5).
Thank you for raising this point regarding the interpretation of our results in Figure 8D. We agree that if the equilibrium sharpness limit for this particular network is around 0.25 (as shown by Lammers & Flamholz 2023), then achieving a sharpness below this threshold could, in principle, be accomplished without any energy expenditure. However, in our current design approach, the loss function is solely designed to enforce agreement with a target mean mRNA level at different input concentrations; it does not explicitly constrain energy dissipation, noise, or other metrics. Consequently, the DGA has no built-in incentive to minimize or optimize energy consumption, which means the resulting solutions may dissipate energy without exceeding the equilibrium sharpness limit.
In other words, the same input–output relationship could theoretically be achieved with \Phi = 0 if an explicit constraint or regularization term penalizing energy usage had been included. As noted, adding such a term (e.g., penalizing \Phi^2) is conceptually straightforward but falls outside the scope of this study. Our primary goal is to demonstrate the flexibility of the DGA in designing a desired response, rather than to delve into energy–sharpness trade-offs or other biological considerations.
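As a rough illustration of what such a regularization term could look like, the Python sketch below adds a penalty proportional to \Phi^2 to a mean-squared fit term. This is a hypothetical sketch, not the loss used in the manuscript: the function name, the argument names, and the weight `lam` are assumptions introduced here for illustration.

```python
def regularized_loss(predicted, target, phi, lam):
    """Mean-squared fit to the target response plus an energy penalty.

    predicted, target: mean mRNA levels at each input concentration.
    phi: energy dissipation of the candidate network (assumed scalar).
    lam: hypothetical penalty weight; lam = 0 recovers the pure fit term.
    """
    fit = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
    return fit + lam * phi ** 2


# Two hypothetical candidates with identical fits but different dissipation.
equilibrium_like = regularized_loss([1.0, 2.0], [1.0, 2.5], phi=0.0, lam=0.1)
dissipative = regularized_loss([1.0, 2.0], [1.0, 2.5], phi=2.0, lam=0.1)
```

With `lam > 0`, two solutions that reproduce the target curve equally well are no longer equivalent, and the optimizer is pushed towards the one that dissipates less.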
While we appreciate the suggestion to set a higher target sharpness that exceeds the equilibrium limit, we believe the current example effectively demonstrates the DGA’s ability to design circuits with desired input-output relationships, which is the primary focus of this study. Researchers interested in optimizing energy efficiency, burst size, burst frequency, noise, response time, mutual information, or other system properties can easily extend our approach by incorporating additional terms into the loss function to target these specific objectives.
We hope this explanation addresses your concern and clarifies that the manuscript provides sufficient context for readers to interpret the results in Figure 8D correctly.
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
We thank Reviewer #1 for their thoughtful feedback and appreciation of the manuscript's clarity. Our primary goal is to introduce the DGA as a foundational tool for integrating stochastic simulations with gradient-based optimization. While we recognize the value of providing detailed comparisons with existing methods and a deeper analysis of the DGA’s limitations (such as rare event handling), these topics are beyond the scope of this initial work. Our focus is on presenting the core concept and demonstrating its potential, leaving more extensive evaluations for future research.
Reviewer #2 (Public review):
We thank Reviewer #2 for their detailed and constructive feedback. We appreciate the recognition of the DGA as a significant conceptual advancement for stochastic biochemical network analysis and design.
Weaknesses:
(1) Validation of DGA robustness in complex systems:
Our primary goal is to introduce the DGA framework and demonstrate its feasibility. While validation on high-dimensional and non-steady-state systems is important, it is beyond the scope of this initial work. Future studies may improve scalability by employing techniques such as dynamically adjusting the smoothness of the DGA's approximations during simulation or using surrogate models that remain differentiable but more accurately capture discrete behaviors in critical regions, thus preserving gradient computation while improving accuracy.
(2) Inference accuracy and optimization:
We acknowledge that the non-convex loss landscape in the DGA can hinder parameter inference and convergence to global minima, as seen in Figure 5A. While techniques like multi-start optimization or second-order methods (e.g., L-BFGS) could improve performance, our focus here is on establishing the DGA framework. We plan to explore better optimization methods in future work to improve the accuracy of parameter inference in complex systems.
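The multi-start idea mentioned above can be sketched in a few lines of Python. The example is a toy under stated assumptions: the one-dimensional double-well `loss` stands in for the DGA loss landscape, plain gradient descent stands in for the optimizer, and all names, learning rates, and ranges are hypothetical.

```python
import random


def loss(x):
    """Toy non-convex loss with a local minimum near x = 0.96 and a
    lower (global) minimum near x = -1.04."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def grad(x):
    """Analytic derivative of the toy loss."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain gradient descent from a single starting point."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def multi_start(n_starts=25, seed=0):
    """Run gradient descent from several random starts and keep the
    end point with the lowest loss."""
    rng = random.Random(seed)
    candidates = [gradient_descent(rng.uniform(-2.0, 2.0)) for _ in range(n_starts)]
    return min(candidates, key=loss)


best_x = multi_start()
```

A single run started at, say, x = 1.5 settles in the shallower minimum, whereas the best of several runs recovers the deeper one. The cost grows linearly with the number of starts.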
(3) Use of simple models for demonstration:
We selected well-understood systems to clearly illustrate the capabilities of the DGA. These examples were intended to demonstrate how the DGA can be applied, rather than to solve problems better addressed by analytical methods. Applying DGA to more complex, analytically intractable systems is an exciting avenue for future work, but introducing the method was our main objective in this study.
Reviewer #3 (Public review):
We thank the reviewer for their detailed and insightful feedback. We appreciate the recognition of the DGA as a significant advancement for enabling gradient-based optimization in stochastic systems.
Weaknesses:
(1) Application beyond steady-state analysis
We acknowledge the limitation of focusing solely on steady-state properties. To extend the DGA for analyzing transient dynamics, time-dependent loss functions can be incorporated to capture system evolution over time. This could involve aligning simulated trajectories with experimental time-series data or using moment-matching across multiple time points.
(2) Numerical instability in gradient computation
The reviewer correctly highlights that large sharpness parameters (a and b) in the sigmoid and Gaussian approximations can induce numerical instability due to vanishing or exploding gradients. To address this, adaptive tuning of a and b during optimization could balance smoothness and accuracy. Additionally, alternative smoothing functions (e.g., softmax-based reaction selection) and gradient regularization techniques (such as gradient clipping and trust-region methods) could improve stability and convergence.
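To make the vanishing and exploding gradients concrete, the following Python sketch assumes a sigmoid surrogate of the form 1 / (1 + exp(-a x)), in which a larger a gives a sharper step. This functional form, the function names, and the clipping threshold are illustrative assumptions and not necessarily the parameterization used in the manuscript.

```python
import math


def smooth_step(x, a):
    """Sigmoid surrogate for a Heaviside step (assumed form: sigma(a * x)).

    The two branches avoid overflow in exp() for large |a * x|.
    """
    z = a * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smooth_step_grad(x, a):
    """Derivative of the surrogate with respect to x: a * s * (1 - s)."""
    s = smooth_step(x, a)
    return a * s * (1.0 - s)


def clip(g, max_abs):
    """Gradient clipping: limit the magnitude of g to max_abs."""
    return max(-max_abs, min(max_abs, g))


at_threshold = smooth_step_grad(0.0, 200.0)      # grows like a / 4
away_from_threshold = smooth_step_grad(0.5, 200.0)  # underflows towards zero
clipped = clip(at_threshold, 10.0)
```

At the threshold the gradient scales as a / 4, while a short distance away it underflows to zero in double precision. This is the trade-off that adaptive tuning of a and b, or clipping, would aim to manage.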
Reviewer #1 (recommendations):
We thank the reviewer for their thoughtful and constructive feedback on our manuscript. Below, we address each of the comments and suggestions raised.
Main points:
(1) It would have been useful to have a brief discussion, based on a concrete example, of what can be achieved with the DGA and is totally beyond the reach of the Gillespie algorithm and the numerous existing stochastic simulation methods.
Thank you for your comment. We would like to clarify that the primary aim of this work is to introduce the DGA and demonstrate its feasibility for tasks such as parameter estimation and network design. Unlike traditional stochastic simulation methods, the DGA’s differentiable nature enables gradient-based optimization, which is not possible with the classical Gillespie algorithm or its variants.
(2) As often with machine learning techniques, there is a sense of black box, with a lack of mathematical details of the proposed method: as opposite to the exact Gillespie algorithm, whose foundations lie on solid mathematical results (exponentially-distributed waiting times of continuous-time Markov processes), the DGA involves uncontrolled approximations, that are only briefly mentioned in the paper. For instance, it is currently simply noted that "the approximations introduced by the DGA may be pronounced in more complex settings such as the calculation of rare events", without specifying how limiting these errors are. It would be useful to include a clearer and more comprehensive discussion of the limitations of the DGA: When does it work accurately? What are the approximations/errors and can they be controlled? When is it worth paying the price for those approximations/errors, and when is it better to stick to the Gillespie algorithm? Is this notably the case for problems involving rare events? Clearly, these are difficult questions, and the answers are problem specific. However, it would be important to draw the readers' attention on the issues, especially if the DGA is presented as a potentially significant tool in computational and synthetic biology.
We acknowledge the importance of discussing the limitations of the DGA in more detail. While we have noted that the approximations introduced by the DGA may impact its accuracy in certain scenarios, such as rare-event problems, a deeper exploration of these trade-offs is outside the scope of this work. Instead, we provide sufficient context in the manuscript to guide readers on when the DGA is appropriate.
(3) The DGA is here introduced and discussed in the context of non-spatial problems (simple gene regulatory networks). However, numerous problems in the life sciences and computational/synthetic biology, involve stochasticity and spatial degrees of freedom (e.g. for problems involving diffusion, migration, etc). It is notoriously challenging to use the Gillespie algorithm to efficiently simulate stochastic spatial systems, especially in the context of rare events (e.g., extinction or fixation problems). It would be useful to comment on whether, and possibly how, the DGA can be used to efficiently simulate stochastic spatial systems, and if it would be better suited than the Gillespie algorithm for this purpose.
Thank you for pointing this out. Although our current work centers on non-spatial systems, we agree that many biological contexts incorporate both stochasticity and spatial degrees of freedom. Extending the DGA to efficiently simulate such systems would indeed require substantial modifications—for instance, coupling it with reaction-diffusion frameworks or spatial master equations. We believe this is an exciting direction for future research and mention it briefly in the discussion as a potential extension.
Minor suggestions:
(1) After Eq.(10): it would be useful to explain and motivate the choice of the ratio JSD/H.
Done.
(2) On page 6, just below the caption of Fig.4: it would be useful to clarify what is actually meant by "... convergence towards the steady-state distribution of the exact Gillespie simulation, which is obtained at a simulation time of 10^4".
Done.
(3) At the end of Section B on page 7: please clarify what is meant here by "soft directions".
Done.
Reviewer #2 (recommendations):
We thank the reviewer for their thoughtful comments and constructive feedback. Below, we address each of the comments/suggestions.
Main points:
(1) Enumerate the conditions under which DGA assumptions hold (and when they do not). There is currently not enough information for the interested reader to know whether DGA would work for their system of interest. Without this information, it is difficult to assess what the true scope of DGA's impact will be. One simple idea would be to test DGA performance along two axes: (i) increasing number of model states and (ii) presence/absence of non-steady state dynamics. I acknowledge that these are very open-ended directions, but looking at even a single instance of each would greatly strengthen this work. Alternatively, if this is not feasible, then the authors should provide more discussion of the attendant difficulties in the main text.
We agree that a detailed exploration of the conditions under which the DGA assumptions hold would be a valuable addition to the field. However, this paper primarily aims to introduce the DGA methodology and demonstrate its proof-of-concept applications. A comprehensive analysis along axes such as increasing model states or non-steady-state dynamics, while important, would require significant additional simulations and is beyond the scope of this work. In Appendix A, we have discussed the trade-off between accuracy and numerical stability. Additionally, we encourage future users to tune the hyperparameters a and b for their specific systems.
(2) Demonstrate DGA performance in a more complex biochemical system. Clearly the authors were aware that analytic solutions exist for the 2-state system in Figure 7, but this is actually also the case (I think) for the mean mRNA production rate of the non-equilibrium system in Figure 8. To really demonstrate that DGA is practically viable, I encourage the authors to seek out an interesting application that is not analytically tractable.
We appreciate the suggestion to validate DGA on a more complex biochemical system. However, the goal of this study is not to provide an exhaustive demonstration of all possible applications but to introduce the DGA and validate it in systems where ground-truth comparisons are available. While the non-equilibrium system in Figure 8 might be analytically tractable, its complexity already provides a meaningful demonstration of DGA’s ability to optimize parameters and design systems. Extending this work to analytically intractable systems is an exciting direction for future studies, and we hope this paper will inspire others to explore these applications.
(3) Take steps to improve the robustness of parameter optimization and error bar calculations. (3a) When the loss landscape is degenerate, shallow, or otherwise "difficult," a common solution is to perform multiple (e.g. 25-100) inference runs starting from different random positions in parameter space. Doing this, and then taking the parameter set that minimizes the loss should, in theory, lead to a more robust recovery of the optimal parameter set.
(3b) It seems clear that the Hessian approximation is underestimating the true error in your inference results. One alternative is to use a "brute force" approach like bootstrap resampling to get a better estimate for the statistical dispersion in parameter estimates. But I recognize that this is only viable if the inference is relatively fast. Simply recovering the true minimum will, of course, also help.
(3a) We acknowledge the challenge posed by degenerate or shallow loss landscapes during parameter optimization. While performing multiple inference runs from different initializations is a common strategy, this approach is computationally intensive. Instead, we rely on standard optimization techniques (e.g., ADAM) to find a robust local minimum.
(3b) Thank you for your comment. We agree that Hessian-based error bars can underestimate uncertainty, particularly in degenerate or poorly conditioned loss landscapes. While methods like bootstrap and Monte Carlo can provide more robust estimates, they can be computationally prohibitive for larger-scale simulations. A simpler reason for not using them is the high resource demand from repeated simulations, which quickly becomes infeasible for complex or high-dimensional models. We note these trade-offs between robust estimation and practicality as an important area for further exploration.
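For reference, the bootstrap procedure discussed above can be sketched as follows in Python. This is a generic illustration, not the authors' pipeline: the estimator here is simply the sample mean, whereas in the DGA setting each resample would require a full inference run, which is where the cost arises. The function name, the data, and the number of resamples are hypothetical.

```python
import random
import statistics


def bootstrap_std(data, estimator, n_resamples=1000, seed=0):
    """Bootstrap estimate of the standard error of `estimator`.

    Each resample draws len(data) points with replacement and re-applies
    the estimator; the spread of the results approximates its uncertainty.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)


# Hypothetical measurements (illustration only).
toy_data = list(range(1, 21))
standard_error = bootstrap_std(toy_data, statistics.mean)
```

The total cost is `n_resamples` times the cost of one estimate, which makes the approach inexpensive for a sample mean but potentially prohibitive when each estimate is itself an optimization over stochastic simulations.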
Moderate comments:
(1) Figure 7: is it possible to also show the inferred kon values? Specifically, it would be of interest to see how kon varies with repressor concentration.
Thank you for the suggestion. We have updated Figure 7 to include the inferred kon values, showing their variation with the mean mRNA copy number. However, we could not plot them against repressor concentration due to the lack of available data.
(2) Figure 8B & D: the authors claim that the sharper system dissipates more energy, but doesn't 8D show the opposite of this? More importantly, it does not look like either network drives sharpness levels that exceed the upper equilibrium limit cited in [36]. So it is not clear that it is appropriate to look at energy dissipation here. In fact, it is likely that equilibrium networks could produce the curves in 8B, and might be worth checking.
Thank you for pointing this out. We realized that the plotted values in Figure 8D were incorrect, as we had mistakenly plotted noise instead of energy dissipation. The plot has now been corrected.
(3) Figure 8: I really like this idea of using DGA to "design" networks with desired input-output properties, but I wonder if you could explore a more biologically compelling use case. Specifically, what about some kind of switch-like logic where, as the activator concentration increases, you have first 0 genes on, then 1 promoter on, then 2 promoters on. This would achieve interesting regulatory logic, and having DGA try to produce step functions would ensure that you force the networks to be maximally sharp (i.e. about double what you're currently achieving).
Thank you for this intriguing suggestion. While the proposed switch-like logic use case is indeed compelling, implementing such a system would require significant work. This goes beyond the scope of the current study, which focuses on demonstrating the feasibility of DGA for network design with simple input-output properties.
Minor comments:
(1) Figure 4B & C: the bar plots do not do a good job conveying the points made by the authors. Consider alternatives, such as scatter plots or box plots that could convey inference uncertainty.
Done.
(2) Figure 4B: consider using a log y-axis.
The y-axis in Figure 4B is already plotted on a log scale.
(3) Figure 4D is mentioned prior to 4C in the text. Consider reordering.
Done.
(4) Figure 5B: it is difficult to assess from this plot whether or not the landscape is truly "flat," as the authors claim. Flat relative to what? Consider alternative ways to convey your point.
Thank you for highlighting this ambiguity. By describing the loss landscape as “flat,” we intend to convey its relative insensitivity to parameter variations in certain regions, rather than implying a completely level surface. While we believe Figure 5B still provides a useful qualitative depiction of this behavior, we acknowledge that it does not quantitatively establish “flatness.” In future work, we plan to incorporate more rigorous measures—such as gradient magnitudes or Hessian eigenvalues—to more accurately characterize and communicate the geometry of the loss landscape.
Reviewer #3 (recommendations):
We sincerely thank the reviewer for their thoughtful feedback and constructive suggestions, which have helped us improve the clarity and rigor of our manuscript. Below, we address each of the comments.
(1) Precision is lacking in the introduction section. Do the authors mean the Direct SSA, sorted SSA, which is usually faster, and how about rejection sampling methods?
Thank you for pointing this out. We have updated the introduction to explicitly mention the Direct SSA.
(2) When mentioning PyTorch and Jax, would be good to also talk about Julia, as they have fast stochastic simulators.
We have now mentioned Julia alongside PyTorch and Jax.
(3) Mentioned references 22-27. Reference 26 is an odd choice; a better reference is from the same author the Automatic Differentiation of Programs with Discrete Randomness, G Arya, M Schauer, F Schäfer, C Rackauckas, Advances in Neural Information Processing Systems, NeurIPS 2022
We have now cited the suggested reference.
(4) Page 1, Section: 'To circumnavigate these difficulties, the DGA modifies....' Have you thought about how you would deal with the bias that will be introduced by doing this?
Thank you for your insightful comment. We acknowledge the potential for bias due to the differentiable approximations in the DGA; however, our analysis has not revealed any systematic bias compared to the exact Gillespie algorithm. Instead, we observe irregular deviations from the exact results as the smoothness of the approximations increases.
(5) Page 2, first sentence '... traditional Gillespie...' be more precise here - the direct algorithm.
Thank you for your comment. We believe that the context of the paper, particularly the schematic in Figure 1, makes it clear that we are focusing on the Direct SSA.
(6) Page 2, second paragraph: 'In order to simulate such a system...' This doesn't fit here as this section is about tau-leaping. As this approach approximates discrete operations, it is unclear if it would work for large models or larger-scale snapshot data, and if it would be possible to extend it to time-lapse data.
Thank you for your comment. We respectfully disagree that this paragraph is misplaced. The purpose of this paragraph is to explain why the standard Gillespie algorithm does not use fixed time intervals for simulating stochastic processes. By highlighting the inefficiency of discretizing time into small intervals where reactions rarely occur, the paragraph provides necessary context for the Gillespie algorithm’s event-driven approach, which avoids this inefficiency.
Regarding the applicability of the DGA to larger models, snapshot data, or time-lapse data, we acknowledge these are important directions and have noted them as potential extensions in the discussion section.
(7) Page 2 Section B: 'In order to make use of modern deep-learning techniques...' It doesn't appear from the paper that any modern deep learning is used.
Thank you for your comment. Although the DGA does not utilize deep learning architectures such as neural networks, it employs automatic differentiation techniques provided by frameworks like PyTorch and Jax. These tools allow efficient gradient computations, making the DGA compatible with modern optimization workflows.
(8) Page 3, Fig 1(a). S matrix last row, B and C should swap places: B should be 1 and C is -1.
Corrected the typo.
(9) Fig1 needs a more detailed caption.
Expanded the caption slightly for clarity.
(10) Page 3 last paragraph: 'The hyperparameter b...' Consequences of this are relevant, for example can we now go below zero. Also, we lose more efficient algorithms here. It would be good to discuss in more detail that this is an approximate algorithm that is good for our case study, but that more tests are needed for others to use it.
Thank you for the comment. Appendix A discusses the trade-offs related to a and b, but we agree that more detailed analysis is needed. The hyperparameters are tailored to our case study and must be tuned for specific systems.
(11) Page 4, Section C, first paragraph, 'The goal of making...' This is snapshot data. Would the framework also translate to time-lapse data? Also, it would be better to make it clearer earlier which type of data are the target of this study.
Thank you for your suggestion. While the current study focuses on snapshot data and steady-state properties, we believe the DGA could be extended to handle time-lapse data by incorporating multiple recorded time points into its inference objective. Specifically, one could modify the loss function to penalize discrepancies across observed transitions between these time points, effectively capturing dynamic trajectories. We consider this an exciting area for future development, but it lies beyond our present scope.
(12) Page 4 Section C, sentence '...experimentally measured moments'. Should later be mentioned as error, as moments are imperfect
Thank you for your comment. We agree that experimentally measured moments are inherently noisy and may not perfectly represent the true system. However, within the context of the DGA, these moments serve as target quantities, and the discrepancy between simulated and measured moments is already accounted for in the loss function.
(13) Page 4 Section C, last sentence '...second-order...such as ADAM'. Another formulation would be better as second order can be confusing, especially in the context of parameter estimation
We have revised the language to avoid confusion regarding “second-order” methods.
(14) Fig 4(a) a density plot would fit better here
Fig. 4(a) has been updated to a scatter density plot as suggested.
(15) Fig 4(c) It would be interesting to see a closer analysis of the trade-off between gradient and accuracy when changing the a and b parameters.
Thank you for this suggestion. We acknowledge that an in-depth exploration of these trade-offs could provide deeper insights into the method’s performance. However, for now, we believe the current analysis suffices to highlight the utility of the DGA in the contexts examined.
(16) Page 6 Section III, first sentence: This fits more to intro. Further the reference list is severely lacking here, with no comparison to other methods for actually fitting stochastic models.
Thank you for the suggestion. We have added a few references there.
(17) Page 6, Section A, sentence: '....experimental measured mean...' Why is it a good measure here (moment matching is not perfect), also do you have distribution data, would that not be better? How about accounting for measurement error?
Thank you for the comment. While we do not have full distribution data, we acknowledge that incorporating experimental measurement error could enhance the framework. A weighted loss function could model uncertainty explicitly, but this is beyond the scope of the current study.
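One simple form such a weighted loss could take is an inverse-variance weighting of the squared residuals, sketched below in Python. This is a hypothetical example rather than part of the manuscript; the function name, the argument names, and the numbers are assumptions for illustration.

```python
def weighted_moment_loss(simulated, measured, sigma):
    """Mean of squared residuals, each scaled by its measurement error.

    simulated: moments predicted by the simulation.
    measured:  experimentally measured moments.
    sigma:     assumed standard error of each measurement.
    """
    terms = [((s - m) / e) ** 2 for s, m, e in zip(simulated, measured, sigma)]
    return sum(terms) / len(terms)


# Same absolute mismatch at both points, but the second measurement is
# twice as uncertain and therefore contributes a quarter as much.
example_loss = weighted_moment_loss([11.0, 21.0], [10.0, 20.0], [1.0, 2.0])
```

Data points with large experimental uncertainty then constrain the fit less than precisely measured ones.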
(18) Page 7, section B, first paragraph: 'Motivated by this, we defined the...' Why use Fisher Information when profile likelihood has proven to be better, especially for systems with few parameters like this?
Thank you for the suggestion. While profile-likelihood is indeed a powerful tool for parameter uncertainty analysis, we chose Fisher Information due to its computational efficiency and compatibility with the differentiable nature of the DGA framework.
(19) Page 7, section C, sentence '...set kR/off=1..'. In this case, we cannot infer this parameter.
Thank you for the comment. You are correct that setting kR/off = 1 effectively normalizes the rates, making this parameter unidentifiable. In steady-state analyses, not all parameters can be independently inferred because observable quantities depend on relative—rather than absolute—rate values (as evident when setting the time derivative to zero in the master equation). To infer all parameters, one would need additional information, such as time-series data or moments at finite time.
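The point can be illustrated with a simpler system than the one in the manuscript. The Python sketch below uses a constitutive birth-death process (production at rate k_prod, degradation at rate k_deg per molecule, starting from zero molecules), whose mean copy number is (k_prod / k_deg)(1 - exp(-k_deg t)). This toy model and all names in the code are assumptions chosen for illustration; it is not the promoter model analysed in the paper.

```python
import math


def mean_copy_number(k_prod, k_deg, t):
    """Mean of a birth-death process at time t, starting from zero molecules."""
    return (k_prod / k_deg) * (1.0 - math.exp(-k_deg * t))


# Two hypothetical parameter sets that differ by a common factor of two.
slow = (10.0, 1.0)
fast = (20.0, 2.0)

# Steady-state means depend only on the ratio of the two rates.
steady_slow = slow[0] / slow[1]
steady_fast = fast[0] / fast[1]

# At a finite time the two parameter sets are distinguishable.
early_slow = mean_copy_number(*slow, t=0.5)
early_fast = mean_copy_number(*fast, t=0.5)
```

Both parameter sets give a steady-state mean of 10, so steady-state data cannot separate them, whereas the means at t = 0.5 differ (about 3.9 versus 6.3).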
(20) Page 7 Section 2. Estimating parameters .... Sentence: '....as can be seen, there is very good agreement..' How many times the true value falls within the CI (because corr 0.68 is not great).
Thank you for your comment. While a correlation coefficient of 0.68 indicates moderate agreement, the primary goal was to demonstrate the feasibility of parameter estimation using the DGA rather than achieving perfect accuracy. The coverage of the CI was not explicitly calculated, as the focus was on the overall trends and relative agreement.
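For completeness, the coverage the reviewer asks about is straightforward to compute once true values and interval bounds are available. The Python sketch below uses hypothetical numbers, not results from the manuscript, and the function name is an assumption.

```python
def interval_coverage(true_values, lower, upper):
    """Fraction of true values that fall inside their estimated interval."""
    hits = sum(lo <= t <= hi for t, lo, hi in zip(true_values, lower, upper))
    return hits / len(true_values)


# Hypothetical true parameters and interval bounds (illustration only).
coverage = interval_coverage(
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 2.1, 2.5, 3.0],
    [1.5, 2.5, 3.5, 5.0],
)
```

For well-calibrated 95% intervals, this fraction should be close to 0.95 over many parameters; a markedly lower value would suggest that the error bars are too narrow.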
(21) Page 7 Section 2. Estimating parameters .... Sentence: 'Fig5(c) shows....' Is this when using exact simulator?
Thank you for your question. Yes, the exact values on the x-axis of Fig. 5(c) are obtained using the exact Gillespie simulation.
(22) Page 7 Section 3 Estimating parameters for the... Sentence: 'Fig6(a) shows...' Why are CIs not shown?
Thank you for your comment. CIs are not shown in Fig. 6(a) because this particular case is degenerate, making the calculation and meaningful representation of CIs challenging.
(23) Page 10, Sentence: 'As can be seen in Fig 7(b)...' Can you show uncertainty in measured value? It would be good to see something of a comparison against an exact method, at least on simulated synthetic data
Thank you for the comment. Fig. 7(a) already includes error bars for the experimental data, which account for measurement uncertainty. However, in Fig. 7(b), we do not include error bars for the experimental values due to limitations in the available data.
(24) Page 12, Section B Loss function '...n=600...' This is on a lower range. Have you tested with n=1000?
Yes, we have tested with n=1000 and observed no significant difference in the results. This indicates that n=600 is sufficient for the purposes of this study.
(25) Fig 8(c) Why are no CIs shown?
Thank you for your comment. CIs were not included in Fig. 8(c) due to degeneracy, which makes meaningful confidence intervals difficult to compute.
(26) Page 12 Conclusion, sentence: '..gradients via backpropagation...' Actually, by making the function continuous, both forward and reverse mode might be used. And in this case, forward-mode would actually be the fastest by quite a margin
Thank you for your insightful comment. You are correct that by making the function continuous, both forward-mode and reverse-mode automatic differentiation can be used. We have now mentioned this point in the discussion.
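As a brief illustration of forward-mode differentiation through a smooth function, the Python sketch below implements dual numbers by hand. It is a toy and not how PyTorch or JAX are implemented; the class, the helper names, and the function `f` are hypothetical. Forward mode propagates one derivative per input parameter alongside the value, so its cost grows with the number of parameters, whereas reverse mode grows with the number of outputs.

```python
import math


class Dual:
    """Number carrying a value and its derivative with respect to one input."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule applied to the derivative part.
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sigmoid(x):
    """Sigmoid of a dual number, using the chain rule for the derivative."""
    s = 1.0 / (1.0 + math.exp(-x.val))
    return Dual(s, s * (1.0 - s) * x.dot)


def f(k):
    """Toy smooth function of one rate parameter: k * sigmoid(3k - 1.5)."""
    return sigmoid(3.0 * k + (-1.5)) * k


# Seed the derivative with 1.0 to obtain df/dk in a single forward pass.
result = f(Dual(0.5, 1.0))
```

Here `result.val` is the function value and `result.dot` its derivative at k = 0.5, obtained without building a computation graph.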
(27) Overall comment for the Conclusion section: It would be good to discuss how this framework compares to other model-fitting frameworks for models with stochastic dynamics. The authors mention dynamic data and more discussion on this would be very welcome. Why use ADAM and not something established like BFGS for model fitting? It would be interesting to discuss how this can fit with other SSA algorithms (e.g. in practice sorting SSA is used when models get larger). Also, inference comparison against exact approaches would be very nice. As it is now, the authors truly only check the accuracy of the SSA on 1 model; it would be interesting to see this for other models.
Thank you for your detailed comments. While this study focuses on introducing the DGA and demonstrating its feasibility, we agree that comparisons with other model-fitting frameworks, testing on additional models, and integrating with other SSA variants like sorted SSA are important directions for future work. Similarly, extending the DGA to handle transient dynamics and exploring alternatives to ADAM, such as BFGS, are promising areas to investigate further.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
We are grateful for the positive evaluation of the work and the critical points raised by the reviewers. We thank all reviewers for their excellent comments. We believe that these revisions have significantly improved the quality of our study.
In response to the 2nd reviewer, we apologise for the missing data: we failed to provide a P-value for the RM ANOVA post-hoc test, and we are very grateful that this was brought to our attention. We have revised the RM ANOVA by using the Tukey HSD post-hoc test, which is generally recommended for pairwise comparisons as it is more robust to unequal sample sizes. The controversial statistical analysis of the overall comparison of speed differences was deleted, as were three supplementary figures (Fig. S4, Fig. S9 and Fig. S10), which were less informative in support of the manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This study is useful as it provides further analysis of previously published data to address which specific genes are part of the masculinizing actions of E2 on female zebra finches, and where these key genes are expressed in the brain. However the data supporting the conclusion of masculinizing the song system are incomplete as the current manuscript is a re-analysis of differential gene expression modulated by E2 treatment between male/female zebra finches without manipulation of gene expression. The conclusions (and title) regarding song learning are also incompletely supported with no gene manipulation or song analysis. Importantly, the use of WGCNA for a question of sex-chromosome expression in species without dosage compensation is considered inadequate. As the experimental design did not include groups to directly test for song learning, and there was also no analysis of song performance, these data were also considered inadequate in that regard.
We are sorry the editor found the manuscript so incomplete and inadequate. Though the tone of this assessment seems more severe than the reviewer comments below, we are also happy to see that the editor has considered our paper further for a revised publication, based on the reviewers’ comments. We address the editor’s comments as follows:
While we agree that manipulating some of the genes we discovered, whose expression levels are E2-sensitive in the song system, would take the study further by validating some of the hypotheses proposed in the Discussion, we don’t think the outcome of gene manipulations would change the major conclusions drawn from the results. In this study we performed estrogen hormone manipulations, with causal consequences on gene expression in song nuclei and associated song behavior. In a way this is analogous to gene manipulation, but directly manipulating the action of estrogen. The categories of genes impacted, and the differences among the sex chromosomes, wouldn’t change.
Regarding the comment that WGCNA is inadequate for addressing questions of sex chromosome expression in species without dosage compensation, we think the evidence in our data does not bear that out. One main result of this paper is the separation of Z chromosome transcripts whose expression is most strongly regulated by chromosomal dosage across regions (WGCNA module E) from those subject to additional sources of regulation in song nuclei (other modules). It seems to us that rather than being confounded by the lack of dosage compensation, WGCNA allowed us to better resolve the effects of dosage on different genes within the sex chromosomes. We have added a new figure more directly examining sex chromosome transcript abundance within different modules. Briefly, we found that Z chromosome genes assigned to module E exhibited almost exactly the male-biased expression ratio expected from no dosage compensation, while the Z chromosome genes in song nuclei assigned to other modules were expressed below the dosage-predicted value, consistent with module E containing those genes whose expression is most strongly regulated by dose across all brain regions sampled.
At its core, WGCNA finds sets of correlated genes. The biological reality of the zebra finch transcriptome is that Z chromosome expression is largely anti-correlated with W chromosome expression due to dosage. However, this dosage effect is not felt equally by all genes, and WGCNA provides an unbiased computational framework that can be used to separate dose from other potential sources of gene regulation. This is why roughly ⅓ of Z chromosome genes are not assigned to module E; for example, the growth hormone receptor is assigned to module G based on its correlation with genes upregulated within HVC.
“As the experimental design did not include groups to directly test for song learning, and there was also no analysis of song performance, these data were also considered inadequate in that regard.”
Concerning the comment on the lack of song performance analysis in the paper, all such analyses were conducted in our previous study on the same animals (Choe et al. 2021, Hormones & Behavior). The birds considered here were sacrificed at PHD30, prior to the onset of learned song behavior. However, females treated with E2 in the same way at the same time, and allowed to mature into adulthood, went on to develop rudimentary song. Further, induction of rudimentary song learning in females following E2 treatment has been well established since the early ’80s. We have added the following text toward the end of the intro to make this clearer:
“While the birds for this study were sacrificed prior to the developmental presentation of song behavior, we have previously shown that female finches treated in exactly the same way with E2 go on to produce rudimentary imitative songs as adults (Choe et al 2021), consistent with the known induction of vocal learning in females by E2 (REF).”
Reviewer #1 (Recommendations For The Authors):
Overall, this is a wonderfully designed and executed study that takes full advantage of new resources, such as the most complete zebra finch genome assembly yet, as well as the latest methods. I have very few suggestions as to the improvement of the manuscript. They are as follows:
Results Section:
In the paragraph "Identification of gene expression modules in song nuclei":
"The E2-treated females in this study had similarly sized song system nuclei as males, indicating that E2 treatment prevented atrophy."
Clarify if this comparison is to treated and/or untreated males.
We thank the reviewer for their comment. The relative differences in the song nuclei sizes between the E2-treated females and the other groups are more complex than our original sentence implied. We have revised the main text as follows:
“In our previous study, we found that estradiol treatment in PHD30 females caused HVC to enlarge and Area X to appear when it normally does not develop in females, but both at sizes less than in untreated or treated males. The sizes of PHD30 female LMAN and RA were already the sizes seen in males, as the latter has not atrophied yet at this age (25).”
In the paragraph "Sex- and micro-chromosome gene expression across the telencephalon": "These animal and chromosome specific shifts in the transcriptomes could represent the systemic effects of allelic chromosomal structural variation..."
The authors should clarify the meaning of "allelic chromosomal structural variation" in this context, as it is an unusual phrase. Major chromosomal structural variation seems unlikely to produce these effects. Is it also possible that animal-specific modules with brain-wide higher expression could result from laboratory contamination between all samples from one animal? This is not too likely but perhaps should be acknowledged or ruled out.
We have removed the word “allelic”, which was unnecessary. We can’t envision how laboratory contamination could occur such that all of one animal’s samples would be affected to produce the observed result, which is module and chromosome specific. An animal-wide effect could emerge during sacrifice, but we can think of no reason that it would affect these modules and not others. Rather, the most likely explanation is natural biological differences between animals. We have added this consideration of alternative explanations.
In the section "Candidate gene drivers of HVC specialization in E2-treated females":
When discussing GHR's role in cell growth and proliferation, the authors' argument could be expanded by including the documented role of GH signaling in anti-apoptotic protection of neurons from rounds of neural pruning during development as documented in the chicken, e.g. • Harvey S, Baudet M-L, Sanders EJ. 2009. Growth Hormone-induced Neuroprotection in the Neural Retina during Chick Embryogenesis. Annals of the New York Academy of Sciences, 1163: 414-416. https://doi.org/10.1111/j.1749-6632.2008.03641.x
We thank the reviewer for sharing this publication with us. We have added the following sentence to our discussion with the above citation. “Further, our results are consistent with growth hormone’s known role in avian anti-apoptotic protection, with elevated signaling associated with the survival of chicken neurons during rounds of pruning in the developing retina.”
The authors' argument of the relevance of the passerine GH duplication would be strengthened by citing:
• Rasband SA, Bolton PE, Fang Q, Johnson PLF, Braun MJ. 2023. Evolution of the Growth Hormone Gene Duplication in Passerine Birds, Genome Biol Evol, 15(3) https://doi.org/10.1093/gbe/evad033. Greatly expands on the Yuri et al. paper cited by characterizing the molecular evolution of these genes across hundreds of avian species, supporting positive selection on multiple amino acid sites identified in both ancestral and duplicate (passerine) growth hormone.
• Xie F, London SE, Southey BR et al. 2010. The zebra finch neuropeptidome: prediction, detection and expression. BMC Biol 8, 28. https://doi.org/10.1186/1741-7007-8-28 The authors report significantly different expression of the ancestral GH gene in the adult male zebra finch auditory forebrain after different song exposure experiences.
We have amended the results section sentence and added all suggested citations. The sentence now reads: “The gene which encodes growth hormone receptor’s ligand, growth hormone, is interestingly duplicated and undergoing accelerated evolution in the genomes of songbirds (Rasband et al 2023); the GH ligand has been found to be upregulated in the zebra finch auditory forebrain following the presentation of familiar song (Xie et al 2010).”
Figures:
- Figure 1B. "Duration of sex typing" being a shorter bar compared to the others is not fully explained in the experimental design. Presumably at the end of this time period, the sex is non-invasively, phenotypically evident. I suggest an arrow pointing to the PHD/PHD range when sex is apparent in plumage/anatomy.
- Figure 4. Caption appears to be truncated; "across all... genes"?
Fixed
- Figure 5. For 5E, 5F, 5G, 5H, consider enlarging the plots so overlapping gene symbols are readable. Alternately, smaller numbers or symbols could be used with a key in areas where overlapping symbols are hard to prevent.
We agree that these are not the easiest to read; we originally offset the symbols in R to minimize overlaps, but it can only do so much for the more crammed panels. We have now added a supplemental .xlsx file with the underlying data from each of the 4 tests for readers that want to examine the data in more detail.
Reviewer #2 (Recommendations For The Authors):
Since WGCNA methods will inherently draw together sex-chromosome genes into the same module in systems without dosage compensation, I suggest the authors rerun the WGCNA using only female samples and only male samples. Then identify the composition of modules that differ between E2 and vehicle-treated females and compare these genes to males. Then from male WGCNA identify the composition of modules that differ between E2 and vehicle-treated males and compare to female modules.
We thank the reviewer for their suggestions. However, we believe the proposed approach is not as strong as the one we used, which groups data from both sexes in the WGCNA analyses in a study that is looking for sex differences. The reviewer's proposed approach amounts to computing modules twice (once per sex), determining song system specialized modules and E2 responsive modules in both settings, then intersecting the two sets to find corresponding modules, all done to prevent the non-dose compensated sex chromosome genes from being drawn into the same module.
While WGCNA does group the majority of sex chromosome genes into module E, it does not categorize them all this way (Fig 3). The module classification instead differentiates those sex chromosome genes whose expression is best explained by chromosome dosage / sex across regions (modE) from those whose expression is controlled by other sources of regulation; as an example of the latter, the growth hormone receptor (GHR) is one of several Z chromosome genes classified into modG, as its expression correlates better with the genes specialized to HVC than with the majority of dosage-dependent Z chromosome genes found in modE. Further, removing biological sex as a variable in a WGCNA analysis that is focused on sex differences seems counterintuitive.
Instead, to quantitatively address the reviewer’s concern, we conducted additional analyses that led to a new figure, associated text, and tables, which better describe sex/chromosome dosage effects on the abundance (FPKM) and expression ratios of sex chromosome transcripts by module irrespective of brain region (Fig. 5). We find that the Z chromosome genes in modE were expressed at the expected chromosome dosage in the non-vocal surrounding regions (65.06% observed vs 66.6% expected), while in other modules, Z chromosome genes were expressed at intermediate levels between equal expression and the expected chromosomal dosage. For example, the Z chromosome content of modules D and H exhibited near equal expression between sexes. Within the song system, the Z chromosome gene content of modG was highly expressed in males beyond what is expected from chromosome dosage, consistent with modG’s male-specific upregulation in song nuclei relative to surrounds in the absence of E2. These results better demonstrate that in our WGCNA on the combined dataset we are able to separate those Z chromosome genes whose expression is predominantly dosage controlled from those subject to additional regulation such as song system specialization.
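The expected value quoted above follows from copy number alone: males carry two Z copies (ZZ) and females one (ZW), so without dosage compensation the male share of combined expression is 2/(2+1), about two thirds. A minimal sketch of that arithmetic is given below. It assumes the expression ratio is computed as male / (male + female); the function names and the FPKM inputs in the usage note are hypothetical and not taken from the study.

```python
def expected_male_fraction(male_copies: int = 2, female_copies: int = 1) -> float:
    """Male share of expression if expression scales linearly with copy number.

    Assumption: no dosage compensation, ZZ males and ZW females.
    """
    return male_copies / (male_copies + female_copies)


def observed_male_fraction(male_fpkm: float, female_fpkm: float) -> float:
    """Male share of combined expression (assumed definition of the ratio)."""
    return male_fpkm / (male_fpkm + female_fpkm)
```

For illustration, `observed_male_fraction(20.0, 10.0)` matches the dosage expectation, whereas `observed_male_fraction(10.0, 10.0)` gives 0.5, the value for equal expression between sexes.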
Fig. S3 Legend: 'Black arrow' -> 'Red arrow'
Change made.
Fig. S5 - What part of the figure shows the 'human convergent signature'? Also, simply listing the number of genes mapped to a chromosome is misleading to readers unfamiliar with the zebra finch genome; you should either provide the number of genes on each chromosome or present the counts corrected by that number.
Fig. S5 showed the same type of analyses as Fig. 3 but with an older zebra finch genome assembly, where we had not included panel a for enrichments with genes convergent in expression between songbird song regions and human speech brain regions. However, we see that Fig. S5 was not adding any new important information to the paper, so we removed it.
For the chromosome analyses in Fig. 3b, we provide both the total raw number of module-assigned genes broken down by chromosome (the black bar plots on the right) and a statistical fold-enrichment value of modules per chromosome. Given the number of genes per chromosome and genes per module in our data, we computed the fold-enrichment for each intersection (observed intersection size / expected intersection size). To test for the significance of these enrichments, we bootstrapped FDR-corrected p values for the enrichment of each chromosome-module pairing by randomizing the mapping of genes to modules to construct a null distribution of fold enrichments for each intersection. Our intent was not to describe the size of the chromosomes themselves, information readily available elsewhere, but to show the disproportionate chromosomal origins of the gene sets considered by this study. Performing this enrichment test using all annotated genes per chromosome would artificially increase enrichment values and make the analysis less conservative by confounding the results with the inherent enrichment for “brain function” in the assigned genes relative to all genes.
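The enrichment procedure described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it assumes two dictionaries with identical gene keys (hypothetical names `gene_chrom` and `gene_module`), uses a label-shuffling permutation null, and omits the FDR correction step.

```python
import random
from collections import Counter


def fold_enrichment(gene_chrom, gene_module):
    """Observed / expected intersection size for each (chromosome, module) pair.

    Expected size assumes independence: |chromosome| * |module| / N,
    where N is the number of module-assigned genes only.
    """
    n = len(gene_chrom)
    chrom_sizes = Counter(gene_chrom.values())
    module_sizes = Counter(gene_module.values())
    observed = Counter((gene_chrom[g], gene_module[g]) for g in gene_chrom)
    return {
        (c, m): observed[(c, m)] / (chrom_sizes[c] * module_sizes[m] / n)
        for c in chrom_sizes
        for m in module_sizes
    }


def permutation_p_values(gene_chrom, gene_module, n_perm=1000, seed=0):
    """One-sided p values from a null built by shuffling gene-to-module labels."""
    rng = random.Random(seed)
    genes = list(gene_chrom)
    labels = [gene_module[g] for g in genes]
    observed = fold_enrichment(gene_chrom, gene_module)
    exceed = Counter()
    for _ in range(n_perm):
        rng.shuffle(labels)  # randomize the mapping, keeping module sizes fixed
        null = fold_enrichment(gene_chrom, dict(zip(genes, labels)))
        for pair, fe in observed.items():
            if null[pair] >= fe:
                exceed[pair] += 1
    # Add-one correction so that no p value is exactly zero.
    return {pair: (exceed[pair] + 1) / (n_perm + 1) for pair in observed}
```

Because only module-assigned genes enter the dictionaries, the expected intersection sizes are computed against the assigned gene set rather than against all annotated genes, which mirrors the more conservative choice the response argues for.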
At several places you say "we correlated expression of each sex chromosome transcript with sexual dimorphism within each region, such that expressed W genes would be positively correlated and depleted Z chromosome genes would be anticorrelated." What was the sexual dimorphism that was being correlated with? Is this the eigengene?
We thank you for this comment. Our language was less clear than it could be. We tested for correlations of both the eigengene and the individual gene expression profiles with the biological sex of the animals. We have changed the text to:
“To do this, we tested for a correlation between the expression of each sex chromosome transcript and the animals’ sex within each brain region. We found that female-enriched transcripts were positively correlated with sex and male-enriched transcripts were anticorrelated (Fig. 4f,g).”
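A correlation between expression and a binary sex label can be illustrated with a plain Pearson coefficient (equivalent to a point-biserial correlation). The sketch below assumes sex is coded 1 for female and 0 for male, so that female-enriched transcripts come out positive; this coding, the function name, and the expression values are hypothetical and for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: three females then three males from one brain region.
sex = [1, 1, 1, 0, 0, 0]                         # assumed coding: 1 = female, 0 = male
w_gene = [5.0, 6.0, 5.5, 0.0, 0.1, 0.0]          # made-up W-linked transcript
z_gene = [10.0, 11.0, 9.5, 20.0, 21.0, 19.0]     # made-up Z-linked transcript

r_w = pearson(sex, w_gene)   # positive: female-enriched
r_z = pearson(sex, z_gene)   # negative: male-enriched
```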
Fig. 4A: The 'true/false' boxes and animal A-L is confusing and unnecessary. I'd suggest just using M and F (or sex symbols) with a horizontal line below each set of 3 for respective E2 and Veh.
Change made.
Reviewer #3 (Recommendations For The Authors):
General comments:
After the initial characterization of the datasets and module identification, it is quite hard to follow the logic of the data presentation in the various other Results sections or to clearly understand how they relate to the main stated goal to identify factors related to sex differences in vocal learning. The most relevant findings relate to the presumed actions of hormone treatment and sex chromosome gene dosage in song nuclei, whereas analyses of other brain areas, other chromosomes, or speech-related genes serve more as controls and/or appear as distractions from the main theme. A suggestion to increase the clarity of the presentation and potential impact of the study is to change the order of the presentation, focusing first on the specific analyses and comparisons that most directly speak to the main goals of the study, and then secondarily and more briefly presenting the controls or less related comparisons.
The reviewer’s suggestion for the results section organization is exactly what we had tried to do. We opened with the first paragraph on identification of modules, then presented the song nuclei specific modules, followed by E2-induced changes to those modules, and then by other specific results for the remainder of the paper, including module enrichments to specific chromosomes. The reviewer mentioned that our analyses of “other brain areas” (which we assume to mean the non-vocal surround regions), other chromosomes (which we assume means autosomes) and speech-related genes as controls were a distraction in the paper; but within our analysis, these other brain regions are essential controls needed to assess the song-system specificity of any sex differences observed, from the very first paragraphs of the results; the autosomes were not controls for sex chromosome results, but primary results in and of themselves; the overlap with speech-related genes was also not a control, but a novel discovery. We have revised these points in the paper to make them clearer, and revised some of the section titles and transitions between sections to help increase the clarity of the main storyline of the paper.
A related comment is that many of the inferences drawn from the WGCNA analysis were quite complex, thus independent verification of some predictions would be quite valuable. For example, consider the passage: "In non-vocal learning juvenile females, interestingly LMAN was specialized relative to the AN by the same gene modules as in males (B, F, and I) as well as an additional module G (Fig. 2b); RA was specialized by module A as in males, but not module L and by additional modules A and G. In contrast, neither juvenile female HVC nor Area X exhibited significant gene module expression specializations relative to their surrounds." Providing in situ hybridization verification of these regional gene expression predictions with a few representative genes seems quite feasible given the group's expertise and would considerably strengthen confidence in the module-based inferences.
We performed independent in situ validation of 36 candidate genes in our first study with this dataset (Choe et al 2021). We now mention this validation in the revised paper. The reviewer’s selection of one of our sentences, though, made us realize that the grammar we used to explain the results was not as clear as it needs to be. We have thus cleaned up the grammar of our module descriptions so that they are communicated with less complexity, the main issue noted by the reviewer.
Because this is a re-analysis of a previously published dataset, the authors should more explicitly describe somewhere in the Discussion how the present analysis advances the understanding of sex differences in songbird neuroanatomy and behavior beyond the previous analysis.
We have added an additional sentence into the discussion more clearly separating the results of the current study from our previous work.
Specific comments:
Abstract:
There is evidence (from Frank Johnson's lab) that RA does not completely atrophy in female zebra finches, but is still present with more preserved connectivity than previously thought, possibly related to non-singing function(s). A term like 'marked reduction' of female RA may more accurately reflect the current state of knowledge.
We have changed the text to “partial atrophy”.
The term "driver" is undefined and unclear at this point of the paper; a clear definition for "driver" is also lacking in the Intro.
We now define “driver” or “genetic driver” as understood to mean “a genetic locus whose expression and/or inheritance strongly regulates the trait of interest”.
When citing the literature on studies that identified "specific genes with specialized up- or down-regulated expression in song and speech circuits relative to the surrounding motor control circuits", the authors should also cite studies from other labs (e.g. Li et al., PNAS, 2007; Lovell et al, Plos One 2008; Lovell et al, BMC Genomics 2018; Nevue et al, Sci Rep. 2020), to be accurate and fair.
Citations added
For clarity, the authors should explicitly formulate the hypothesis they are proposing at the end of the Summary.
We thank the reviewer for this comment. We have replaced the final sentence of the summary with: “We present a hypothesis where reduced dosage and expression of these Z chromosome genes changes the developmental trajectory of female HVC, partially preventable by estrogen treatment, contributing to the loss of song learning behavior.”
Introduction:
Vocal learning is arguably the ability to imitate 'vocal' sounds, this could be clarified here.
We have amended the sentence to “Vocal learning is the ability to imitate heard sounds using a vocal organ…”
Given they are currently considered sister taxa, can the author briefly explain what is the basis for assuming that songbirds and parrots independently evolved vocal learning?
Although songbirds and parrots belong to a monophyletic clade, they are not sister taxa. There are two clades separating them that are vocal non-learners. We have cited the reference that demonstrated this (e.g. Jarvis et al 2014 Science).
Why use Taeniopygia castanotis rather than the more broadly used Taeniopygia guttata?
Zebra finches were recently reclassified and T. castanotis is now more accurate. The Indonesian Timor zebra finch retained T. guttata while the Australian finch, used here, was classified as T. castanotis.
The authors state: "...vocal learning is strongly sexually dimorphic in zebra finches and many other vocal learning species" and cite Nottebohm and Arnold, Science, 1978. That landmark paper only shows dimorphism in song nuclei (not learning) in two songbird species. The authors should provide citations for other species and behavior, or modify the statement.
We have added an additional citation (Odom et al.) to this sentence which covers the phylogeny more broadly.
The authors refer to the nucleus RA as being located in the lateral intermediate arcopallium (LAI). Other labs have described this domain as the dorsal part of the intermediate arcopallium, thus AId or AID (Mello et al., JCN, 2019; Yuan and Bottjer, J Neurophys 2019; Yuan and Bottjer, eNeuro, 2020; Nevue et al., BMC Genomics, 2020). The authors should acknowledge this discrepancy in nomenclature so that data and conclusions can be more readily compared across studies.
We thank the reviewer and agree that this is helpful. We have added a note at the first mention of LAI.
The authors state that data from the gynandromorph bird described by Agate et al implicates "sex chromosome gene expression within the song system" as involved in the song system sexual dimorphism. That study, however, only rules out circulating gonadal steroids, and while suggesting a cell-autonomous mechanism like sex chromosome genes, it does not necessarily exclude other brain-autonomous factors like sex differences in local production of sex steroids.
We say that this study “implicated” sex chromosome gene expression, which is accurate per the results and discussion of that study. We are unsure what “brain autonomous factors like sex differences in local production of sex steroids” means. “Brain autonomous” and “local production” in the brain seem contradictory in this context.
Results:
The authors state that "the E2-treated females in this study had similarly sized song system nuclei as males, indicating that E2 treatment prevented atrophy". Can they clarify whether the VEH-treated females actually had smaller RAs than E2-treated females or VEH-treated males at this age? This is still quite early in development and it is unclear to what extent RA's marked sexual dimorphism in adults or later developmental ages has already taken place in untreated (or VEH-treated) birds. A related comment is that the authors state later on: "We interpret these findings to indicate that: LMAN and RA atrophy later in juvenile female development..." Does this mean these nuclei actually did not show the marked decreases predicted earlier in the text? Clarifying this point would be helpful.
We thank the reviewer for pointing out this discrepancy, on which reviewer #1 also asked for clarification. RA size at this age is similar in males and females. However, HVC and Area X are smaller and absent, respectively, in females, and E2 treatment partially prevents this atrophy. The text now reads:
“In our previous study, we found that estradiol treatment in PHD30 females caused HVC to enlarge and Area X to appear when it normally does not develop in females, but both at sizes less than in untreated or treated males. The sizes of PHD30 female LMAN and RA were already the sizes seen in males, as the latter has not atrophied yet at this age (25).”
The authors acknowledge that area X is absent in untreated and VEH-treated females. Could they please clarify how area X and the surrounding striatal tissue that excludes area X were identified for laser capture dissections in juvenile females?
We have added the following statement to the main text portion discussing the dissections.
“In the case of vehicle-treated females which lack Area X, a piece of striatum from the same location where Area X is found in males was taken.”
Some passages in Results discussing the authors' interpretation of the modules seem quite speculative and possibly belong instead in the Discussion. For example: "... that module A and G genes could be associated with the start of this atrophy; HVC and Area X are likely the first to atrophy or not develop; and lack of any gene module specialization in them at this age could mean that they would be more sensitive to estrogen prevention of vocal learning loss."
As suggested, we have removed this text from the results; these ideas were already presented in the Discussion. We have merged the resulting small paragraph with the preceding paragraph.
The authors state: "To assess the effects of chronic exogenous estrogen on the developing song system, we first performed a control analysis of modules in the E2-treated juvenile males." How can an assessment of estrogen effects be a "control" analysis? Does this refer to a contrast with females? Please clarify the language here.
The reviewer is correct that E2 treatment in males should not be considered a control experiment. We removed the word “control”.
When discussing the GO-enriched terms for module G, it is unclear how the authors reached the conclusion about "proliferative", as the enriched terms do not refer to processes more directly indicative of proliferation like "cell division" or "cell cycle regulation". Rather, these terms seem more related to differentiation and growth, which do not necessarily imply proliferation. The authors also refer to "HVC proliferation" later on in the Discussion. However, there is conclusive evidence from several labs that proliferative events associated with postnatal neuronal addition and/or replacement in song nuclei occur in the subventricular zone, not in song nuclei like HVC itself, and that the growth of song nuclei largely reflects cell survival, as well as growth in size and complexity under the regulation of sex steroids.
We agree that “proliferative” may have been a poor word choice here. We did not mean to indicate that cell division was occurring in HVC itself. Instead, we meant to indicate that HVC is able to accommodate the newborn neurons from the SVZ. We have replaced the word “proliferative” throughout. In the instance the reviewer mentions specifically, we replaced it with “...potentially act to integrate and differentiate late born neurons.”
With regard to module E, referring to a telencephalon-wide sexually dimorphic gene expression program seems quite a stretch, given that only a few regions were sampled and compared between sexes. These related statements should be toned down.
We have replaced “telencephalon-wide” with “more distributed across the finch telencephalon” and other similar language in each instance.
The following passage is very speculative and should be shortened and/or moved to the Discussion: "Based on the findings in these gene sets, we hypothesize that without excess estrogen in females, HVC expansion is prevented by not specializing the growth and neuronal migration promoting genes in module G to the HVC lineage by late development. This is potentially enacted by depleting necessary gene products from the Z sex chromosome, such as GHR, which are already present in only one copy."
We have deleted this portion of the text, as the idea is already present in the discussion.
Figure 5: To this reviewer, the comparisons of sex differences and of female response to E2 are the most relevant and informative ones, whereas the regional differences between song nuclei and surrounds refer to different cell populations and cell types where other processes may be occurring, independently of what occurs in song nuclei. It thus seems like the intersection analysis in panel 5i may be subtracting out important "core genes" in terms of E2 effects and/or sex differences in the most relevant cell populations, i.e. in this case within song nucleus HVC.
Song learning is a specialized behavior, and the associated vocal learning brain nuclei have a set of hundreds of specialized genes compared to the surrounds. Our previous findings show that E2 drives the appearance of these specializations in female zebra finches. Thus, we considered this the most interesting question to focus on, which we have further highlighted. Nevertheless, in response to the reviewer’s suggestion, we have added a .xlsx supplemental file containing the results from each of the individual tests so readers may examine any single comparison, or set of comparisons, in more detail.
Discussion:
It is unclear what the term "critical period" refers to in: "during the critical period of atrophy for the female vocal circuit"; please clarify.
We agree that our language was nebulous. We have replaced it with “as several male song control nuclei begin to expand and female nuclei partially atrophy”.
In: "HVC appeared unspecialized at the level of gene module expression in control females", does "unspecialized" refer to a lack of difference in gene expression when compared to surroundings? Please clarify. The same comment applies to other uses of "unspecialized" in this paragraph.
Yes, unspecialized means lack of difference in gene expression in the song nucleus. To clarify this point, we have reworked that and the following sentence as follows:
“HVC appeared unspecialized compared to the surrounding nidopallium at the level of gene module expression in control females, with no significantly differentially expressed MEGs. However, in E2-treated females, HVC exhibited a subset of the observed male HVC gene expression specializations. Similarly, the vehicle-treated female striatum located where Area X would be also lacked any specialized gene module expression, but the E2-treated female Area X exhibited a subset of the male Area X specializations, consistent with the known absence of Area X in vehicle-treated females and presence in E2-treated females.”
The authors state: "...we surprisingly found that the most specialized genes were disproportionately from the Z chromosome", when discussing module G in HVC. Why is this so surprising? In a sense, this could be taken as consistent with the findings of Friedrich et al, 2022, where sex differences in the RA transcriptome were predominantly Z related on 20 dph. Arguably 20 dph is still quite close to 30 dph in the present study, when compared to 50 dph in Friedrich et al, when autosomes predominate.
Our bioRxiv preprint was originally posted in July 2021, prior to the publication of Friedrich et al., 2022; however, we had previously added to our discussion that several of our results are consistent with the observations of Friedrich et al.
We have a different interpretation of the Z chromosome gene results in Friedrich et al. While the percentage of specialized genes from the Z chromosome decreased, the absolute number of specialized Z chromosome genes actually increased over this interval. In Fig. 3a from Friedrich et al. it appears that ~28% of Z chromosome genes were sexually dimorphic in their expression in RA at PHD20 but that ~39% of Z chromosome genes were similarly dimorphic at PHD50. We interpret this result as the Z chromosome genes being among the earliest genes differentially expressed between the sexes, not that their differential expression or role ever subsequently decreased. We have reworked this portion of the discussion to make our point clearer:
“This model of sex chromosome influenced song system development is consistent with recent observations comparing male and female zebra finch transcriptomes from RA at young juvenile (PHD20) and young adult (PHD50) ages in un-manipulated birds (Friedrich et al. 2022). While that study proposes that the role of the sex chromosome in maintaining transcriptomic sex differences diminishes across development, as the proportion of specialized genes that originate on the sex chromosomes diminishes, this effect was driven by large increases in differentially expressed autosomal genes rather than by any reduction in sex chromosome dimorphism; the percentage of differentially expressed Z chromosome genes increased from PHD20 (28%) to PHD50 (39%) (Friedrich et al.). This leads us to conclude that sexually dimorphic Z chromosome expression at juvenile ages precedes the sexually dimorphic expression of the autosomes seen in adults. This is consistent with our hypothesis that sufficient expression of select Z chromosome gene products (GHR, etc.) is necessary for subsequent autosomal song system specializations (modG).”
Further, when we write “When examining the module G HVC specialization induced by E2-treatment in female HVC, we surprisingly found that the most specialized genes were disproportionately from the Z chromosome” we are referring to the upregulation of module G by E2 in female HVC, not the sex difference described in RA by Friedrich et al., which only utilized untreated RA samples and thus is more likely related to our observations of module E.
The term "sexual dimorphism" has been more traditionally used for sex differences that are very marked, like features that are highly regressed or absent in one sex, most often in females. Quantitative differences in gene expression, including dosage differences like those related to module E, are more appropriately described as sex differences rather than dimorphisms. That usage would be more consistent with most of the literature, and thus preferable.
We did a Google search for common definitions and found rather the opposite: sexual dimorphism is more often used for differences of degree (with the zebra finch example as one of the top hits), and sex differences is often used for more absolute differences (like presence vs absence of the Y chromosome). Further, as in the reviewer’s first sentence, the definition of sexual dimorphism is a sex difference. That is, the two phrases can be interchangeable. Thus, we prefer to keep sexual dimorphism.
Several references are incomplete or seem truncated, like 9 and 10.
Fixed
Table S2: Please examine and take into account the W gene curation presented in Table S3 of Friedrich et al., 2022.
We have added additional supplementals (supplemetal_w_chrom_express.csv and supplemetal_z_chrom_express.csv) of the data provided in new Fig 5 incorporating the curation information from Table S3 from Friedrich et al.
Data availability:
Genes for all the main modules identified should be presented in a Supplemental Table, or through a link to a stable data repository.
We have added an additional Supplemental Table supplemental_gene_module_assignment.csv with this information.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews
Reviewer #1 (Public Review):
Summary:
The authors have created a system for designing and running experimental pipelines to control and coordinate different programs and devices during an experiment, called Heron. Heron is based around a graphical tool for creating a Knowledge Graph made up of nodes connected by edges, with each node representing a separate Python script, and each edge being a communication pathway connecting a specific output from one node to an input on another. Each node also has parameters that can be set by the user during setup and runtime, and all of this behavior is concisely specified in the code that defines each node. This tool tries to marry the ease of use, clarity, and self-documentation of a purely graphical system like Bonsai with the flexibility and power of a purely code-based system like Robot Operating System (ROS).
Strengths:
The underlying idea behind Heron, of combining a graphical design and execution tool with nodes that are made as straightforward Python scripts, seems like a great way to get the relative strengths of each approach. The graphical design side is clear, self-explanatory, and self-documenting, as described in the paper. The underlying code for each node tends to also be relatively simple and straightforward, with a lot of the complex communication architecture successfully abstracted away from the user. This makes it easy to develop new nodes, without needing to understand the underlying communications between them. The authors also provide useful and well-documented templates for each type of node to further facilitate this process. Overall this seems like it could be a great tool for designing and running a wide variety of experiments, without requiring too much advanced technical knowledge from the users.
The system was relatively easy to download and get running, following the directions and already has a significant amount of documentation available to explain how to use it and expand its capabilities. Heron has also been built from the ground up to easily incorporate nodes stored in separate Git repositories and to thus become a large community-driven platform, with different nodes written and shared by different groups. This gives Heron a wide scope for future utility and usefulness, as more groups use it, write new nodes, and share them with the community. With any system of this sort, the overall strength of the system is thus somewhat dependent on how widely it is used and contributed to, but the authors did a good job of making this easy and accessible for people who are interested. I could certainly see Heron growing into a versatile and popular system for designing and running many types of experiments.
Weaknesses:
(1) The number one thing that was missing from the paper was any kind of quantification of the performance of Heron in different circumstances. Several useful and illustrative examples were discussed in depth to show the strengths and flexibility of Heron, but there was no discussion or quantification of performance, timing, or latency for any of these examples. These seem like very important metrics to measure and discuss when creating a new experimental system.
Heron is practically a thin layer of obfuscation of signal passing across processes. Given its design approach it is up to the code of each Node to deal with issues of timing, synching and latency and thus up to each user to make sure the Nodes they author fulfil their experimental requirements. Having said that, Heron provides a large number of tools to allow users to optimise the generated Knowledge Graphs for their use cases. To showcase these tools, we have expanded on the third experimental example in the paper with three extra sections, two of which relate to Heron’s performance and synching capabilities. One is focusing on Heron’s CPU load requirements (and existing Heron tools to keep those at acceptable limits) and another focusing on post experiment synchronisation of all the different data sets a multi Node experiment generates.
(2) After downloading and running Heron with some basic test Nodes, I noticed that many of the nodes were each using a full CPU core on their own. Given that this basic test experiment was just waiting for a keypress, triggering a random number generator, and displaying the result, I was quite surprised to see over 50% of my 8-core CPU fully utilized. I don’t think that Heron needs to be perfectly efficient to accomplish its intended purpose, but I do think that some level of efficiency is required. Some optimization of the codebase should be done so that basic tests like this can run with minimal CPU utilization. This would then inspire confidence that Heron could deal with a real experiment that was significantly more complex without running out of CPU power and thus slowing down.
The original Heron allowed the OS to choose how to manage resources over the required processes. We were aware that this could lead to significant use of CPU time, as well as occasionally significant dropping of packets (which was dependent on the OS and its configuration). This dropping happened mainly when the Node was running a secondary process (like the Unity game process in the 3rd example). To mitigate these problems, we have now implemented a feature allowing the user to choose the CPU that each Node’s worker function runs on as well as any extra processes the worker process initialises. This is accessible from the Saving secondary window of the node. This stops the OS from swapping processes between CPUs and eliminates the dropping of packets due to the OS behaviour. It also significantly reduces the utilised CPU time. To showcase this, we initially ran the simple example mentioned by the reviewer. The computer running only background services was using 8% of CPU (8 cores). With Heron GUI running but with no active Graph, the CPU usage went to 15%. With the Graph running and Heron’s processes running on OS attributed CPU cores, the total CPU was at 65% (so very close to the reviewer’s 50%). By choosing a different CPU core for each of the three worker processes the CPU went down to 47% and finally when all processes were forced to run on the same CPU core the CPU load dropped to 30%. So, Heron in its current implementation running its GUI and 3 Nodes takes 22% of CPU load (30% minus the 8% background). This is still not ideal but is a consequence of the overhead of running multiple processes vs multiple threads. We believe that, given Heron’s latest optimisation, offering more control of system management to the user, the benefits of multi process applications outweigh this hit in system resources.
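As a rough illustration of the kind of core pinning described above, the sketch below restricts the calling process to chosen CPU cores using only Python's standard library. It is not Heron's implementation: the helper name `pin_current_process` is hypothetical, and `os.sched_setaffinity` exists only on some platforms (Linux, for example), so the function returns `None` elsewhere.

```python
import os
from typing import Iterable, Optional, Set


def pin_current_process(cores: Iterable[int]) -> Optional[Set[int]]:
    """Restrict the calling process to the given CPU cores.

    Hypothetical helper, not part of Heron. Returns the resulting set of
    allowed cores, or None on platforms without os.sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. Windows or macOS: no stdlib affinity call
    allowed = os.sched_getaffinity(0)    # cores this process may currently use
    target = set(cores) & allowed        # ignore cores we are not allowed to use
    if target:
        os.sched_setaffinity(0, target)  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)
```

Calling `pin_current_process([0])` at the start of every worker process would place them all on one core, which corresponds to the configuration the authors report as giving the lowest total load (30%).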
We have also increased the scope of the third example we provide in the paper and there we describe in detail how a full-scale experiment with 15 Nodes (which is the upper limit of number of Nodes usually required in most experiments) impacts CPU load.
Finally, we have added to Heron’s roadmap extra tasks focusing only on optimisation (profiling and using Numba for the time critical parts of the Heron code).
(3) I was also surprised to see that, despite being meant specifically to run on and connect diverse types of computer operating systems and being written purely in Python, the Heron Editor and GUI must be run on Windows. This seems like an unfortunate and unnecessary restriction, and it would be great to see the codebase adjusted to make it fully cross-platform compatible.
This point was also mentioned by reviewer 2. This was a mistake on our part and has now been corrected in the paper. Heron (GUI and underlying communication functionality) can run on any machine on which the underlying Python libraries run, which is Windows, Linux (both for x86 and Arm architectures) and MacOS. We have tested it on Windows (10 and 11, both x64), Linux PC (Ubuntu 20.04.6, x64) and Raspberry Pi 4 (Debian GNU/Linux 12 (bookworm), aarch64). The Windows and Linux versions of Heron have undergone extensive debugging and all of the available Nodes (that are not OS specific) run on those two systems. We are in the process of debugging the Nodes’ functionality for RasPi. The MacOS version, although functional, requires further work to make sure all of the basic Nodes are functional (which is not the case at the moment). We have also updated our manuscript (Multiple machines, operating systems and environments) to include the above information.
(4) Lastly, when I was running test experiments, sometimes one of the nodes, or part of the Heron editor itself would throw an exception or otherwise crash. Sometimes this left the Heron editor in a zombie state where some aspects of the GUI were responsive and others were not. It would be good to see a more graceful full shutdown of the program when part of it crashes or throws an exception, especially as this is likely to be common as people learn to use it. More problematically, in some of these cases, after closing or force quitting Heron, the TCP ports were not properly relinquished, and thus restarting Heron would run into an "address in use" error. Finding and killing the processes that were still using the ports is not something that is obvious, especially to a beginner, and it would be great to see Heron deal with this better. Ideally, code would be introduced to carefully avoid leaving ports occupied during a hard shutdown, and furthermore, when the address in use error comes up, it would be great to give the user some idea of what to do about it.
A lot of effort has been put into Heron to achieve graceful shut down of processes, especially when these run on different machines that do not know when the GUI process has closed. The code that is being suggested to avoid leaving ports open has been implemented and this works properly when processes do not crash (Heron is terminated by the user) and almost always when there is a bug in a process that forces it to crash. In the version of Heron available during the reviewing process there were bugs that caused the above behaviour (Node code hanging and leaving zombie processes) on MacOS systems. These have now been fixed. There are, though, rare instances, especially during Node development, in which crashing processes will hang and need to be terminated manually. We have taken on board the reviewer’s comments that users should be made more aware of these issues and have also described this situation in the Debugging part of Heron’s documentation. There we explain the logging and other tools Heron provides to help users debug their own Nodes and how to deal with hanging processes.
Heron is still in alpha (usable but with bugs) and the best way to debug it and iron out all the bugs in all use cases is through usage from multiple users and error reporting (we would be grateful if the errors the reviewer mentions could be reported in Heron’s GitHub Issues page). We are always addressing and closing any reported errors, since this is the only way for Heron to transition from alpha to beta and eventually to production code quality.
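To make the "address in use" situation concrete, here is a minimal sketch of how a program could test whether a TCP port is still held before trying to start. It uses plain standard-library sockets rather than ZeroMQ, and the function name `port_is_free` is hypothetical; it is not taken from Heron's code base.

```python
import errno
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port).

    Hypothetical helper: a leftover process that still holds the port makes
    bind() fail with EADDRINUSE, which is reported here as False.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other failure (e.g. permissions) is a different problem
        return True
```

A start-up routine could call such a check for each configured port and, when it returns `False`, print a message telling the user that a previous process probably needs to be terminated manually, instead of surfacing the raw exception.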
Overall I think that, with these improvements, this could be the beginning of a powerful and versatile new system that would enable flexible experiment design with a relatively low technical barrier to entry. I could see this system being useful to many different labs and fields.
We thank the reviewer for the positive and supportive words and for the constructive feedback. We believe we have now addressed all the raised concerns.
Reviewer #2 (Public Review):
Summary:
The authors provide an open-source graphic user interface (GUI) called Heron, implemented in Python, that is designed to help experimentalists to
(1) design experimental pipelines and implement them in a way that is closely aligned with their mental schemata of the experiments,
(2) execute and control the experimental pipelines with numerous interconnected hardware and software on a network.
The former is achieved by representing an experimental pipeline using a Knowledge Graph and visually representing this graph in the GUI. The latter is accomplished by using an actor model to govern the interaction among interconnected nodes through messaging, implemented using ZeroMQ. The nodes themselves execute user-supplied code in, but not limited to, Python.
Using three showcases of behavioral experiments on rats, the authors highlighted three benefits of their software design:
(1) the knowledge graph serves as a self-documentation of the logic of the experiment, enhancing the readability and reproducibility of the experiment,
(2) the experiment can be executed in a distributed fashion across multiple machines that each has a different operating system or computing environment, such that the experiment can take advantage of hardware that sometimes can only work on a specific computer/OS, a commonly seen issue nowadays,
(3) the users supply their own Python code for node execution that is supposed to be more friendly to those who do not have a strong programming background.
Strengths:
(1) The software is light-weight and open-source, and provides a clean and easy-to-use GUI.
(2) The software answers the need of experimentalists, particularly in the field of behavioral science, to deal with the diversity of hardware that becomes restricted to run on dedicated systems.
(3) The software has a solid design that seems to be functionally reliable and useful under many conditions, demonstrated by a number of sophisticated experimental setups.
(4) The software is well documented. The authors pay special attention to documenting the usage of the software and setting up experiments using this software.
Weaknesses:
(1) While the software implementation is solid and has proven effective in designing the experiment showcased in the paper, the novelty of the design is not made clear in the manuscript. Conceptually, both the use of graphs and visual experimental flow design have been key features in many widely used software packages, as suggested in the background section of the manuscript. In particular, contrary to the authors’ claim that only pre-defined elements can be used in Simulink or LabView, Simulink introduced MATLAB Function Block back in 2011, and Python code can be used in LabView since 2018. Such customization of nodes is akin to what the authors presented.
In the Heron manuscript we have provided an extensive literature review of existing systems from which Heron has borrowed ideas. We never wished to say that graphs and visual code is what sets Heron apart since these are technologies predating Heron by many years and implemented by a large number of software packages. We also do not believe that we have mentioned that LabView or Simulink can utilise only predefined nodes. What we have said is that in such systems (like LabView, Simulink and Bonsai) the focus of the architecture is on prespecified low level elements while the ability for users to author their own is there but only as an afterthought. The difference with Heron is that in the latter the focus is on the users developing their own elements. One could think of LabView style software as node-based languages (with low level visual elements like loops and variables) that also allow extra scripting while Heron is a graphical wrapper around Python where nodes are graphical representations of whole processes. To our knowledge there is no other software that allows the very fast generation of graphical elements representing whole processes whose communication can also be defined graphically. Apart from this distinction, Heron also allows a graphical approach to writing code for processes that span different machines which again to our knowledge is a novelty of our approach and one of its strongest points towards ease of experimental pipeline creation (without sacrificing expressivity).
(2) The authors claim that the knowledge graph can be considered as a self-documentation of an experiment. I found it to be true to some extent. Conceptually it’s a welcome feature and the fact that the same visualization of the knowledge graph can be used to run and control experiments is highly desirable (but see point 1 about novelty). However, I found it largely inadequate for a person to understand an experiment from the knowledge graph as visualized in the GUI alone. While the information flow is clear, and it seems easier to navigate a codebase for an experiment using this method, the design of the GUI does not make it a one-stop place to understand the experiment. Take the Knowledge Graph in Supplementary Figure 2B as an example, it is associated with the first showcase in the result section highlighting this self-documentation capability. I can see what the basic flow is through the disjoint graph where 1) one needs to press a key to start a trial, and 2) camera frames are saved into an avi file presumably using FFMPEG. Unfortunately, it is not clear what the parameters are and what each block is trying to accomplish without the explanation from the authors in the main text. Neither is it clear what the experiment protocol is without the help of Supplementary Figure 2A.
In my opinion, text/figures are still key to documenting an experiment, including its goals and protocols, but the authors could take advantage of the fact that they are designing a GUI where this information, with properly designed API, could be easily displayed, perhaps through user interaction. For example, in Local Network -> Edit IPs/ports in the GUI configuration, there is a good tooltip displaying additional information for the "password" entry. The GUI for the knowledge graph nodes can very well utilize these tooltips to show additional information about the meaning of the parameters, what a node does, etc, if the API also enforces users to provide this information in the form of, e.g., Python docstrings in their node template. Similarly, this can be applied to edges to make it clear what messages/data are communicated between the nodes. This could greatly enhance the representation of the experiment from the Knowledge graph.
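The docstring-to-tooltip idea in the paragraph above can be sketched in a few lines. This is only an illustration of the reviewer's suggestion, not Heron's API: `tooltip_from_docstring` and the example worker `save_frames` are hypothetical names, and a real GUI would pass the resulting string to its own tooltip widget.

```python
import inspect
from typing import Callable


def tooltip_from_docstring(func: Callable) -> str:
    """Build tooltip text from a function's signature and docstring."""
    doc = inspect.getdoc(func) or "No description provided."
    params = ", ".join(inspect.signature(func).parameters)
    return f"{func.__name__}({params})\n{doc}"


def save_frames(frame, file_name):
    """Append a camera frame to the video file given by file_name."""


# Text a GUI could show when the user hovers over the node.
print(tooltip_from_docstring(save_frames))
```

Because the text is read from the code itself, the description shown in the GUI cannot drift away from the function it documents, provided node authors actually write the docstrings.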
In the first showcase example in the paper, “Probabilistic reversal learning. Implementation as self-documentation”, we go through the steps that one would follow in order to understand the functionality of an experiment through Heron’s Knowledge Graph. The Graph is not just the visual representation of the Nodes in the GUI but also their corresponding code bases. We mention that the way Heron’s API limits the way a Node’s code is constructed (through an Actor based paradigm) allows for experimenters to easily go to the code base of a specific Node and understand its 2 functions (initialisation and worker) without getting bogged down in the code base of the whole Graph (since these two functions never call code from any other Nodes). Newer versions of Heron facilitate this easy access to the appropriate code by also allowing users to attach to Heron their favourite IDE and open in it any Node’s two scripts (worker and com) when they double click on the Node in Heron’s GUI. On top of this, Heron now (in the versions developed as answers to the reviewers’ comments) allows Node creators to add extensive comments on a Node but also separate comments on the Node’s parameters and input and output ports. Those can be seen as tooltips when one hovers over the Node (a feature that can be turned off or on by the Info button on every Node).
As Heron stands at the moment we have not made the claim that the Heron GUI is the full picture in the self-documentation of a Graph. We do, though, take note of the reviewer’s desire to have the GUI be the only tool a user would need to use to understand an experimental implementation. The solution to this is the same as the one described by the reviewer of using the GUI to show the user the parts of the code relevant to a specific Node without the user having to go to a separate IDE or code editor. The reason this has not been implemented yet is the lack of a text editor widget in the underlying GUI library (DearPyGUI). This is in their roadmap for their next large release and when this exists we will use it to implement exactly the idea the reviewer is suggesting, but also with the capability to not only read comments and code but also directly edit a Node’s code (see Heron’s roadmap). Heron’s API at the moment is ideal for providing such a text editor straight from the GUI.
(3) The design of Heron was primarily with behavioral experiments in mind, in which highly accurate timing is not a strong requirement. Experiments in some other areas that this software is also hoping to expand to, for example, electrophysiology, may need very strong synchronization between apparatus, for example, the record timing and stimulus delivery should be synced. The communication mechanism implemented in Heron is asynchronous, as I understand it, and the code for each node is executed once upon receiving an event at one or more of its inputs. The paper, however, does not include a discussion, or example, about how Heron could be used to address issues that could arise in this type of communication. There is also a lack of information about, for example, how nodes handle inputs when their ability to execute their work function cannot keep up with the frequency of input events. Does the publication/subscription handle the queue intrinsically? Will it create problems in real-time experiments that make multiple nodes run out of sync? The reader could benefit from a discussion about this if they already exist, and if not, the software could benefit from implementing additional mechanisms such that it can meet the requirements from more types of experiments.
In order to address the above lack of explanation (that also the first reviewer pointed out) we expanded the third experimental example in the paper with three more sections. One focuses solely on explaining how in this example (which acquires and saves large amounts of data from separate Nodes running on different machines) one would be able to time align the different data packets generated in different Nodes to each other. The techniques described there are directly implementable on experiments where the requirements of synching are more stringent than the behavioural experiment we showcase (like in ephys experiments).
Regarding what happens to packets when the worker function of a Node is too slow to handle its traffic, this is mentioned in the paper (Code architecture paragraph): “Heron is designed to have no message buffering, thus automatically dropping any messages that come into a Node’s inputs while the Node’s worker function is still running.” This is also explained in more detail in Heron’s documentation. The reasoning for a no buffer system (as described in the documentation) is that for the use cases Heron is designed to handle we believe there is no situation where a Node would receive large amounts of data in bursts while very little data during the rest of the time (in which case a buffer would make sense). Nodes in most experiments will either be data intensive but with a constant or near constant data receiving speed (e.g. input from a camera or ephys system) or will have variable data load reception but always with small data loads (e.g. buttons). The second case is not an issue and the first case cannot be dealt with by a buffer but only by appropriate code design, since buffering data coming into a Node that is too slow for its input will just postpone the inevitable crash. Heron’s architecture principle in this case is to allow these ‘mistakes’ (i.e. packet dropping) to happen so that the pipeline continues to run and transfer the responsibility of making Nodes fast enough to the author of each Node. At the same time Heron provides tools (see the Debugging section of the documentation and the time alignment paragraph of the “Rats playing computer games” example in the manuscript) that make it easy to detect packet drops and either correct them or allow them but also allow time alignment between incoming and outgoing packets. In the very rare case where a buffer is required Heron’s do-it-yourself logic makes it easy for a Node developer to implement their own Node specific buffer.
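The drop-while-busy behaviour quoted above can be illustrated with a small deterministic simulation. This is a sketch under simplifying assumptions, not Heron code: the function name `simulate_no_buffer` is hypothetical, arrival times are given in arbitrary units, and every call of the worker is assumed to take the same fixed time.

```python
from typing import List, Sequence, Tuple


def simulate_no_buffer(arrival_times: Sequence[float],
                       work_duration: float) -> Tuple[List[int], List[int]]:
    """Return (processed, dropped) message indices for an unbuffered worker.

    A message is processed only if the worker is idle when it arrives;
    messages arriving while the worker is still running are discarded.
    """
    processed: List[int] = []
    dropped: List[int] = []
    busy_until = float("-inf")  # time at which the worker becomes idle again
    for index, arrival in enumerate(arrival_times):
        if arrival >= busy_until:
            processed.append(index)
            busy_until = arrival + work_duration
        else:
            dropped.append(index)
    return processed, dropped


# Messages every 1 time unit, worker needs 1.5 units: every second one is lost.
print(simulate_no_buffer([0, 1, 2, 3, 4], work_duration=1.5))
# Worker faster than the input rate: nothing is lost.
print(simulate_no_buffer([0, 1, 2, 3, 4], work_duration=0.5))
```

The first call shows why a buffer would not help with a steady input that is faster than the worker: the backlog would grow without bound, whereas dropping keeps the pipeline running and leaves gaps in the message indices that can be detected afterwards.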
(4) The authors mentioned in "Heron GUI’s multiple uses" that the GUI can be used as an experimental control panel where the user can update the parameters of the different Nodes on the fly. This is a very useful feature, but it was not demonstrated in the three showcases. A demonstration could greatly help to support this claim.
As the reviewer mentions, we have found Heron’s GUI double role also as an experimental on-line controller a very useful capability during our experiments. We have expanded the last experimental example to also showcase this by showing how on the “Rats playing computer games” experiment we used the parameters of two Nodes to change the arena’s behaviour while the experiment was running, depending on how the subject was behaving at the time (thus exploring a much larger set of parameter combinations, faster during exploratory periods of our shaping protocols construction).
(5) The API for node scripts can benefit from having a better structure as well as having additional utilities to help users navigate the requirements, and provide more guidance to users in creating new nodes. A more standard practice in the field is to create three abstract Python classes, Source, Sink, and Transform that dictate the requirements for initialisation, work_function, and on_end_of_life, and provide additional utility methods to help users connect between their code and the communication mechanism. They can be properly docstringed, along with templates. In this way, the com and worker scripts can be merged into a single unified API. A simple example that can cause confusion in the worker script is the "worker_object", which is passed into the initialise function. It is unclear what this object this variable should be, and what attributes are available without looking into the source code. As the software is also targeting those who are less experienced in programming, setting up more guidance in the API can be really helpful. In addition, the self-documentation aspect of the GUI can also benefit from a better structured API as discussed in point 2 above.
The reviewer is right that using abstract classes to expose to users the required API would be a more standard practice. The reason we did not choose to do this was to keep Heron easily accessible to entry level Python programmers who do not have familiarity yet with object oriented programming ideas. So instead of providing abstract classes we expose only the implementation of three functions which are part of the worker classes but the classes themselves are not seen by the users of the API. The point about the users’ accessibility to more information regarding a few objects used in the API (the worker object for example) has been taken on board and we have now addressed this by type hinting all these objects both in the templates and more importantly in the automatically generated code that Heron now creates when a user chooses to create a Node graphically (a feature of Heron not present in the version available in the initial submission of this manuscript).
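For readers unfamiliar with the alternative the reviewer describes, an abstract-class API could look roughly like the sketch below. It is a hypothetical design shown only for comparison with the function-based approach the authors chose: the class names `Transform` and `Scale` and the method signatures are assumptions, and only the three function names (initialise, work_function, on_end_of_life) come from the review.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class Transform(ABC):
    """Hypothetical base class: receives data and returns transformed data."""

    @abstractmethod
    def initialise(self, parameters: Dict[str, Any]) -> bool:
        """Prepare the node; return True when it is ready to work."""

    @abstractmethod
    def work_function(self, data: Any) -> Any:
        """Process one incoming message and return the result."""

    def on_end_of_life(self) -> None:
        """Release resources; optional, so the default does nothing."""


class Scale(Transform):
    """Example node that multiplies every incoming value by a factor."""

    def initialise(self, parameters: Dict[str, Any]) -> bool:
        self.factor = parameters.get("factor", 1.0)
        return True

    def work_function(self, data: Any) -> Any:
        return [value * self.factor for value in data]
```

The trade-off is visible even in this small case: the base class documents and enforces the required methods, but a node author has to understand inheritance and `self` before writing a first node, which is the barrier the authors say they wanted to avoid.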
(6) The authors should provide more pre-defined elements. Even though the ability for users to run arbitrary code is the main feature, the initial adoption of a codebase by a community, in which many members are not so experienced with programming, is the ability for them to use off-the-shelf components as much as possible. I believe the software could benefit from a suite of commonly used Nodes.
There are currently 12 Node repositories in the Heron-repositories project on GitHub with more than 30 Nodes, 20 of which are general use (not implementing a specific experiment’s logic). This list will continue to grow but we fully appreciate the truth of the reviewer’s comment that adoption will depend on the existence of a large number of commonly used Nodes (for example Numpy, and OpenCV Nodes) and are working towards this goal.
(7) It is not clear to me if there is any capability or utilities for testing individual nodes without invoking a full system execution. This would be critical when designing new experiments and testing out each component.
There is no capability to run the code of an individual Node outside Heron’s GUI. A user could potentially design and test parts of the Node before they get added into a Node but we have found this to be a highly inefficient way of developing new Nodes. In our hands the best approach for Node development was to quickly generate test inputs and/or outputs using the “User Defined Function 1I 1O” Node where one can quickly write a function and make it accessible from a Node. Those test outputs can then be pushed in the Node under development or its outputs can be pushed in the test function, to allow for incremental development without having to connect it to the Nodes it would be connected in an actual pipeline. For example, one can easily create a small function that if a user presses a key will generate the same output (if run from a “User Defined Function 1I 1O” Node) as an Arduino Node reading some buttons. This output can then be passed into an experiment logic Node under development that needs to do something with this input. In this way during a Node development Heron allows the generation of simulated hardware inputs and outputs without actually running the actual hardware. We have added this way of developing Nodes also in our manuscript (Creating a new Node).
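The simulated-input approach can be sketched as follows. The first function stands in for code one might place in a “User Defined Function 1I 1O” Node; the message layout (a list of 0/1 button states), the key mapping, and all names are assumptions made for illustration, since the actual output format of the Arduino Node is not described here.

```python
def simulated_buttons(pressed_key: str, key_map: dict[str, int], n_buttons: int = 4) -> list[int]:
    """Return a list of 0/1 button states as if read from hardware.

    key_map maps a keyboard key to the index of the button it emulates.
    Unknown keys produce an all-zero (no button pressed) state.
    """
    states = [0] * n_buttons
    index = key_map.get(pressed_key)
    if index is not None and 0 <= index < n_buttons:
        states[index] = 1
    return states


def experiment_logic(button_states: list[int]) -> str:
    """Toy stand-in for an experiment-logic Node under development."""
    if button_states[0]:
        return "reward"
    if any(button_states):
        return "other_button"
    return "idle"
```

Feeding `simulated_buttons("a", {"a": 0, "s": 1})` into `experiment_logic` exercises the logic under development without any hardware attached.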
Reviewer #3 (Public Review):
Summary:
The authors present a Python tool, Heron, that provides a framework for defining and running experiments in a lab setting (e.g. in behavioural neuroscience). It consists of a graphical editor for defining the pipeline (interconnected nodes with parameters that can pass data between them), an API for defining the nodes of these pipelines, and a framework based on ZeroMQ, responsible for the overall control and data exchange between nodes. Since nodes run independently and only communicate via network messages, an experiment can make use of nodes running on several machines and in separate environments, including on different operating systems.
Strengths:
As the authors correctly identify, lab experiments often require a hodgepodge of separate hardware and software tools working together. A single, unified interface for defining these connections and running/supervising the experiment, together with flexibility in defining the individual subtasks (nodes) is therefore a very welcome approach. The GUI editor seems fairly intuitive, and Python as an accessible programming environment is a very sensible choice. By basing the communication on the widely used ZeroMQ framework, they have a solid base for the required non-trivial coordination and communication. Potential users reading the paper will have a good idea of how to use the software and whether it would be helpful for their own work. The presented experiments convincingly demonstrate the usefulness of the tool for realistic scientific applications.
Weaknesses:
(1) In my opinion, the authors somewhat oversell the reproducibility and "self-documentation" aspect of their solution. While it is certainly true that the graph representation gives a useful high-level overview of an experiment, it can also suffer from the same shortcomings as a "pure code" description of a model - if a user gives their nodes and parameters generic/unhelpful names, reading the graph will not help much.
This is a problem that, to our understanding, no software solution can address. Yet we argue that a visual representation of how the different inputs and outputs connect to each other is a substantial benefit over “pure code”, especially when the developer of the experiment has used badly chosen variable names.
(2) Making the link between the nodes and the actual code is also not straightforward, since the code for the nodes is spread out over several directories (or potentially even machines), and not directly accessible from within the GUI.
This is not accurate. The obligatory code of a Node always exists within a single folder and Heron’s API makes it rather cumbersome to spread scripts relating to a Node across separate folders. The Node folder structure can potentially be copied over different machines but this is why Heron is tightly integrated with git practices (and even politely asks the user with popup windows to create git repositories of any Nodes they create whilst using Heron’s automatic Node generator system). Heron’s documentation is also very clear on the folder structure of a Node which keeps the required code always in the same place across machines and more importantly across experiments and labs. Regarding the direct accessibility of the code from the GUI, we took on board the reviewers’ comments and have taken the first step towards correcting this. Now one can attach to Heron their favourite IDE and then they can double click on any Node to open its two main scripts (com and worker) in that IDE embedded in whatever code project they choose (also set in Heron’s settings windows). On top of this, Heron now allows the addition of notes both for a Node and for all its parameters, inputs and outputs which can be viewed by hovering the mouse over them on the Nodes’ GUIs. The final step towards GUI-code integration will be to have a Heron GUI code editor but this is something that has to wait for further development from Heron’s underlying GUI library DearPyGUI.
(3) The authors state that "[Heron’s approach] confers obvious benefits to the exchange and reproducibility of experiments", but the paper does not discuss how one would actually exchange an experiment and its parameters, given that the graph (and its json representation) contains user-specific absolute filenames, machine IP addresses, etc, and the parameter values that were used are stored in general data frames, potentially separate from the results. Neither does it address how a user could keep track of which versions of files were used (including Heron itself).
Heron’s Graphs, like any experimental implementation, must contain machine specific strings. These are accessible either from Heron’s GUI when a Graph json file is opened or from the json file itself. Heron in this regard does not do anything different to any other software, other than saving the graphs into human readable json files that users can easily manipulate directly.
Heron provides a method for users to save every change of the Node parameters that might happen during an experiment so that it can be fully reproduced. The generated dataframes are saved in the folders specified by the user in each of the Nodes (and all those paths are saved in the json file of the Graph). We understand that Heron offers a certain degree of freedom to the user (Heron’s main reason to exist is exactly this versatility) to generate data files wherever they want, but it makes sure every file path gets recorded for subsequent reproduction. So, Heron behaves pretty much exactly like any other open source software. What we wanted to focus on as the benefit of Heron for exchange and reproducibility was the ability of experimenters to take a Graph from another lab (with its machine specific file paths and IP addresses) and, by examining its graphical interface, to quickly tweak it to make it run on their own systems. That is achievable because a Heron experiment will be constructed from a small number of Nodes (5 to 15 usually) whose file paths can be trivially changed in the GUI or directly in the json file, while the LAN setup of the machines used can be easily reconstructed from the information saved in the secondary GUIs.
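Editing machine-specific paths directly in the json file could look roughly like the sketch below. The schema of a Heron Graph json file is not described here, so the code makes no assumption about key names: it walks any nested JSON structure and rewrites string values that begin with a given prefix. The function names and the example paths are hypothetical.

```python
import json
from typing import Any


def rewrite_prefix(node: Any, old: str, new: str) -> Any:
    """Recursively replace a leading `old` prefix in every string value."""
    if isinstance(node, dict):
        return {key: rewrite_prefix(value, old, new) for key, value in node.items()}
    if isinstance(node, list):
        return [rewrite_prefix(item, old, new) for item in node]
    if isinstance(node, str) and node.startswith(old):
        return new + node[len(old):]
    return node  # numbers, booleans, None and non-matching strings are kept


def retarget_graph(json_text: str, old: str, new: str) -> str:
    """Return the graph JSON with every `old` path prefix replaced by `new`."""
    return json.dumps(rewrite_prefix(json.loads(json_text), old, new), indent=2)
```

For example, `retarget_graph(text, "C:/labA", "/home/labB")` would move every path rooted at the first lab's folder while leaving IP addresses and numeric parameters untouched.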
Where Heron needs to improve (and this is a major point in Heron’s roadmap) is in better integrating the different saved experiments with the git versions of Heron and the Nodes that were used for that specific save. This, we appreciate, is very important for full reproducibility of the experiment and it is a feature we will soon implement. More specifically, users will save together with a graph the versions of all the used repositories, and during load the code base utilised will come from the recorded versions and not from the current head of the different repositories. We are currently working on this feature and, as our roadmap suggests, it will be implemented by the release of Heron 1.0.
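One way such version recording might work is sketched below. This is not the planned Heron implementation, which is not described here; it only shows the general idea of capturing the checked-out commit of each repository so it can be stored next to a saved graph. The function names and the pluggable `resolve` argument are assumptions; the only external call is the standard `git rev-parse HEAD` command.

```python
import subprocess
from pathlib import Path
from typing import Callable


def git_head(repo: Path) -> str:
    """Return the commit hash currently checked out in `repo`."""
    result = subprocess.run(
        ["git", "-C", str(repo), "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,  # raise if `repo` is not a git repository
    )
    return result.stdout.strip()


def snapshot_versions(
    repos: list[Path], resolve: Callable[[Path], str] = git_head
) -> dict[str, str]:
    """Map each repository path to its commit hash, for saving next to a graph."""
    return {str(repo): resolve(repo) for repo in repos}
```

On load, the recorded hashes could be compared with the current heads to warn the user, or used to check out the matching commits before the graph is run.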
(4) Another limitation that in my opinion is not sufficiently addressed is the communication between the nodes, and the effect of passing all communications via the host machine and SSH. What does this mean for the resulting throughput and latency - in particular in comparison to software such as Bonsai or Autopilot? The paper also states that "Heron is designed to have no message buffering, thus automatically dropping any messages that come into a Node’s inputs while the Node’s worker function is still running."- it seems to be up to the user to debug and handle this manually?
There are a few points raised here that require addressing. The first is Heron’s requirement to pass all communication through the main (GUI) machine. We understand (and also state in the manuscript) that this is a limitation that needs to be addressed. We plan to do this by adding to Heron the feature of running headless (see our roadmap). This will allow us to run whole Heron pipelines on a second machine which will communicate with the main pipeline (run on the GUI machine) through special Nodes. That will allow experimenters to define whole pipelines on secondary machines where the data between their Nodes stay on the machine running the pipeline. This is an important feature for Heron and it will be one of the first features to be implemented next (after the integration of the saving system with git).
The second point is regarding Heron’s throughput and latency. In our original manuscript we did not have any description of Heron’s capabilities in this respect and both other reviewers mentioned this as a limitation. As mentioned above, we have now addressed this by adding a section to our third experimental example that fully describes how much CPU is required to run a full experimental pipeline running on two machines and also utilising non-Python executables (a Unity game). This gives an overview of how heavy pipelines can run on normal computers given adequate optimisation and utilising Heron’s feature of forcing some Nodes to run their Worker processes on a specific core. At the same time, Heron’s use of the 0MQ protocol makes sure there are no other delays or speed limitations to message passing. So, message passing within the same machine is just an exchange of memory pointers, while messages passing between different machines face the standard speed limitations of the Local Area Network’s ethernet card speeds.
Finally, regarding the message dropping feature of Heron, as mentioned above this is an architectural decision given the use cases of message passing we expect Heron to come in contact with. For a full explanation of the logic here please see our answer to the 3rd comment by Reviewer 2.
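The no-buffering behaviour can be illustrated with a small single-process simulation. It does not use ZeroMQ and is not Heron's implementation; the class and method names are hypothetical. It only models the rule quoted by the reviewer: a message that arrives while the worker function is still running is dropped rather than queued.

```python
class UnbufferedInput:
    """Input that accepts a message only when the worker is idle."""

    def __init__(self) -> None:
        self.busy = False
        self.processed: list[str] = []
        self.dropped: list[str] = []

    def deliver(self, message: str) -> bool:
        """Offer a message; it is dropped if the worker is still running."""
        if self.busy:
            self.dropped.append(message)
            return False
        self.busy = True  # the worker function starts on this message
        self.processed.append(message)
        return True

    def finish(self) -> None:
        """Mark the current worker call as complete."""
        self.busy = False
```

If three frames arrive and the worker finishes only after the second, the first and third are processed and the second is dropped, so a slow Node always works on recent data instead of falling behind on a growing queue.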
(5) As a final comment, I have to admit that I was a bit confused by the use of the term "Knowledge Graph" in the title and elsewhere. In my opinion, the Heron software describes "pipelines" or "data workflows", not knowledge graphs - I’d understand a knowledge graph to be about entities and their relationships. As the authors state, it is usually meant to make it possible to "test propositions against the knowledge and also create novel propositions" - how would this apply here?
We have described Heron as a Knowledge Graph instead of a pipeline, data workflow or computation graph in order to emphasise Heron’s distinct operation in contrast to what one would consider a standard pipeline and data workflow generated by other visual based software (like LabView and Bonsai). This difference lies in what a user should think of as the base element of a graph, i.e. the Node. In all other visual programming paradigms, the Node is defined as a low-level computation, usually a language keyword, language flow control or some simple function. The logic in this case is generated by composing together the visual elements (Nodes). In Heron, the Node is to be thought of as a process which can be of arbitrary complexity, and the logic of the graph is composed by the user both within each Node and by the way the Nodes are combined together. This is an important distinction in Heron’s basic operation logic and it is, we argue, the main way Heron allows flexibility in what can be achieved while retaining ease of graph composition (by users defining their own level of complexity and functionality encompassed within each Node). We have found that calling this approach a computation graph (which it is) or a pipeline or data workflow would not accentuate this difference. The term Knowledge Graph was the most appropriate as it captures the essence of variable information complexity (even in terms of length of shortest string required) defined by a Node.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
- No buffering implies dropped messages when a node is busy. It seems like this could be very problematic for some use cases...
This is a design principle of Heron. We have now provided a detailed explanation of the reasoning behind it in our answer to Reviewer 2 (Paragraph 3) as well as in the manuscript.
- How are ssh passwords stored, and is it secure in some way or just in plain text?
For now they are plain text in an unencrypted file that is not part of the repo (if one gets Heron from the repo). Eventually, we would like to go to private/public key pairs but this is not a priority due to the local nature of Heron’s use cases (all machines in an experiment are expected to connect in a LAN).
Minor notes / copyedits:
- Figure 2A: right and left seem to be reversed in the caption.
They were. This is now fixed.
- Figure 2B: the text says that proof of life messages are sent to each worker process but in the figure, it looks like they are published by the workers? Also true in the online documentation.
The Figure caption was wrong. This is now fixed.
- psutil package is not included in the requirements for GitHub
We have now included psutil in the requirements.
- GitHub readme says Python >=3.7 but Heron will not run as written without python >= 3.9 (which is alluded to in the paper)
The new Heron updates require Python 3.11. We have now updated GitHub and the documentation to reflect this.
- The paper mentions that the Heron editor must be run on Windows, but this is not mentioned in the Github readme.
This was an error in the manuscript that we have now corrected.
- It’s unclear from the readme/manual how to remove a node from the editor once it’s been added.
We have now added an X button on each Node to complement the Del button on the keyboard (for MacOS users, who most of the time do not have this button).
- The first example experiment is called the Probabilistic Reversal Learning experiment in text, but the uncertainty experiment in the supplemental and on GitHub.
We have now used the correct name (Probabilistic Reversal Learning) in both the supplemental material and on GitHub.
- Since Python >=3.9 is required, consider using f-strings instead of str.format for clarity in the codebase
Thank you for the suggestion. The latest Heron development has been using f-strings and we will do a refactoring in the near future.
- Grasshopper cameras can run on linux as well through the spinnaker SDK, not just Windows.
Fixed in the manuscript.
- Figure 4: Square and star indicators are unclear.
Increased the size of the indicators to make them clear.
- End of page 9: "an of the self" presumably a typo for "off the shelf"?
Corrected.
- Page 10 first paragraph. "second root" should be "second route"
Corrected.
- When running Heron, the terminal constantly spams Blowfish encryption deprecation warnings, making it difficult to see the useful messages.
The solution to this problem is to either update paramiko or install Heron through pip. This possible issue is mentioned in the documentation.
- Node input /output hitboxes in the GUI are pretty small. If they could be bigger it would make it easier to connect nodes reliably without mis-clicks.
We have redone the Node GUI, also increasing the size of the In/Out points.
Reviewer #2 (Recommendations For The Authors):
(1) There are quite a few typos in the manuscript, for example: "one can accessess the code", "an of the self", etc.
Thanks for the comment. We have now screened the manuscript for possible typos.
(2) Heron’s GUI can only run on Windows! This seems to be the opposite of the key argument about the portability of the experimental setup.
As explained in the answers to Reviewer 1, Heron can run on most machines that the underlying python libraries run, i.e. Windows and Linux (both for x86 and Arm architectures). We have tested it on Windows (10 and 11, both x64), Linux PC (Ubuntu 20.04.6, x64) and Raspberry Pi 4 (Debian GNU/Linux 12 (bookworm), aarch64). We have now revised the manuscript and the GitHub repo to reflect this.
(3) Currently, the output is displayed along the left edge of the node, but the yellow dot connector is on the right. It would make more sense to have the text displayed next to the connectors.
We have redesigned the Node GUI and have now placed the Out connectors on the right side of the Node.
(4) The edges are often occluded by the nodes in the GUI. Sometimes it leads to some confusion, particularly when the number of nodes is large, e.g., Fig 4.
This is something that is dependent on the capabilities of the DearPyGUI module. At the moment there is no way to control the way the edges are drawn.
Reviewer #3 (Recommendations For The Authors):
A few comments on the software and the documentation itself:
- From a software engineering point of view, the implementation seems to be rather immature. While I get the general appeal of "no installation necessary", I do not think that installing dependencies by hand and cloning a GitHub repository is easier than installing a standard package.
We have now added a pip install capability which also creates a Heron command line command to start Heron with.
- The generous use of global variables to store state (minor point, given that all nodes run in different processes), boilerplate code that each node needs to repeat, and the absence of any kind of automatic testing do not give the impression of a very mature software (case in point: I had to delete a line from editor.py to be able to start it on a non-Windows system).
As mentioned, the use of global variables in the worker scripts is fine partly due to the multi process nature of the development and we have found it is a friendly approach to Matlab users who are just starting with Python (a serious consideration for Heron). Also, the parts of the code that would require a singleton (the Editor for example) are treated as scripts with global variables while the parts that require the construction of objects are fully embedded in classes (the Node for example). A future refactoring might also make all the parts of the code not seen by the user fully object oriented, but this is a decision with pros and cons that need to be weighed first.
Absence of testing is an important issue we recognise, but Heron is a GUI app and nontrivial unit tests would require some keystroke/mouse movement emulator (like QTest or pytest-qt for Qt-based GUIs). This will be dealt with in the near future (using more general solutions like PyAutoGUI), but it needs a serious amount of effort (quite a bit more than writing unit tests for non-GUI software) and, more importantly, it is nowhere near as robust as standard unit tests (due to the variable nature of the GUI through development), making automatic test authoring almost as laborious a process as the one it is supposed to automate.
- From looking at the examples, I did not quite see why it is necessary to write the ..._com.py scripts as Python files, since they only seem to consist of boilerplate code and variable definitions. Wouldn’t it be more convenient to represent this information in configuration files (e.g. yaml or toml)?
The com is not a configuration file, it is a script that launches the communication process of the Node. We could remove the variable definitions to a separate toml file (which then the com script would have to read). The pros and cons of such a set up should be considered in a future refactoring.
Minor comments for the paper:
- p.7 (top left): "through its return statement" - the worker loop is an infinite loop that forwards data with a return statement?
This is now corrected. The worker loop is an infinite loop and does not return anything, but at each iteration pushes data to the Node’s output.
- p.9 (bottom right): "of the self" → "off-the-shelf"
Corrected.
- p.10 (bottom left): "second root" → "second route"
Corrected.
- Supplementary Figure 3: Green star and square seem to be swapped (the green star on top is a camera image and the green star on the bottom is value visualization - inversely for the green square).
The star and square have been swapped around.
- Caption Supplementary Figure 4 (end): "rashes to receive" → "rushes to receive"
Corrected.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Response to reviewer’s public reviews:
We chose the dose of J60 based on a prior publication that established that off-target effects were possible at relatively high doses (1). The dose that we used (0.1 mg/kg) was 30-fold less than the dose that was reported in that paper to potentially have off-target responses (3 mg/kg). Further, Author response image 1 shows the results of experiments in which J60 was given to animals that did not have the excitatory DREADD expressed in the spinal cord. This includes a sample of mice (n = 2) and rats (n = 3), recorded using the same diaphragm EMG procedure described in the manuscript. The figure shows that there was no consistent response to J60 at 0.1 mg/kg in the “control experiment” in which the DREADD was not expressed in the spinal cord.
Author response image 1.
Diaphragm EMG response to J60 administered to naïve rats and mice. Panels a-b show raw EMG values at baseline, following vehicle (saline) and J60 administration for the left and right hemidiaphragm. Panels c-d show EMG values normalized to baseline. Neither one-way RM ANOVA (panels a-b) nor paired t-test (panels c-d) returned significant p values (p < 0.05).
Response to specific reviewer comments:
Reviewer #1:
How old were the animals at the time of AAV injection, and in subsequent experiments?
The wildtype cohort of mice were 7-9 weeks old at time of AAV injection and DREADD experiments took place 4-5 weeks after AAV injection. ChAT-Cre mice were 6-10 weeks old at time of AAV injection and DREADD experiments took place 4-9 weeks after AAV injection. ChAT-Cre rats were 2-5 months old at time of AAV spinal injection. These animals underwent plethysmography recordings 3-4 months post-AAV injection and subsequently phrenic nerve recording 3-8 weeks later. These details have been added to the Method section.
How many mice were excluded from electrophysiology experiments due to deteriorating electrode contact?
No mice were excluded from electrophysiology experiments due to deteriorating electrode contact. If you are referring to the n = 1 excluded ChAT-Cre mouse (line 368) this animal was excluded because it showed no histological evidence of DREADD expression (lines 200-206).
What was the urethane dose?
The urethane dose for phrenic nerve recordings was 2.1 g/kg. See methods section line 395.
A graphical timeline of the experimental progression for plethysmography and electrophysiology studies would enhance clarity.
A graphical timeline has been added. See Figure S6.
Significance indicators in the figures would greatly enhance clarity. It is a little awkward to have to refer to supplemental tables to figure out statistical differences.
Significance indicators have been added. See Figures 1, 2, 4, and 5.
In Figures 1, 2, and 5, individual data points should be shown, as in Fig 4.
Thank you for this suggestion. We agree that, in general, it is best practice to scatter individual data points. However, when we drafted the new figures, it was apparent that including individual scatter points, in this case, created very “cluttered” figures that were very difficult to interpret.
More detail regarding the plethysmography studies is needed. Was saline/J60 infused via a tail vein catheter? Were animals handled during the infusion? How long is the "IV" period? What volume of fluid was delivered?
All IV infusions were delivered via a tail vein catheter. Animals were not handled during infusion nor at any point during the recording. An IV catheter was externalized via a port in the plethysmograph allowing for IV infusion without handling of the animal or opening the plethysmograph. The infusion period for both saline and J60 was standardized to 2 minutes. The volume of fluid of both saline and J60 was standardized to 0.6 mL. This information has been added to the methods section (lines 408-410, 415-16, 419-420).
Reviewer #2:
The abstract could be improved by briefly highlighting the rationale, scope, and novelty of the study - the intro does a great job of highlighting the scope of the study and the research questions.
A brief explanation of the rationale, scope, and novelty of the study has been added to the abstract. See lines 2-8.
Line 18: specify that this was done under urethane anesthesia.
This detail has been added to the abstract (line 20).
The methods section should be moved to the end of the manuscript according to Journal policy.
The methods section has been moved to the end of the manuscript.
The authors mention the use of both female and male rats but it is not indicated if they tested for and observed any differences between sexes across experiments.
We included the use of both male and female animals in this study to improve the generalizability of the results. However, we were not adequately powered for sex comparisons and therefore did not perform any statistical analysis to assess differences between sexes across experiments. Text has been added to the methods section (lines 534-537) to clarify.
Line 40, since delivery of J60 was performed in both IV and IP, this general statement should be updated.
This detail has been revised to include both IV and IP. See line 43.
Line 42. "First, we determined if effective diaphragm activation requires focal DREADD expression targeting phrenic motor neurons, or if non-specific expression in the immediate vicinity of the phrenic motor nucleus would be sufficient...." I don't think that in the experiments with wild-type mice the authors can claim that they selectively targeted the cervical propriospinal network (in isolation from the motoneurons). Given the fact that the histological analysis did not quantify interneurons or motoneurons in the spinal cord, authors should be cautious in proposing which neuronal population is activated in the non-specific approach.
We agree, and this was a poorly worded statement in our original text. We agree that wild-type DREADD expression was not limited to the cervical propriospinal networks but likely a mix of interneurons and motoneurons. The text has been edited to reflect that (see lines 56-60).
AAV virus source is not described.
All AAVs were obtained from the UF Powell Gene Therapy Center. Details of virus source and production have been added to the methods section. See lines 336-347.
Line 108-125. Because the diaphragm EMG recordings are only described for mice here, I would suggest editing this methods section to clearly state mice instead of vaguely describing "animals" in the procedure.
“Animals” has been changed to “mice” to avoid ambiguity.
Line 120, add parenthesis.
Parenthesis has been added.
Line 126. Whole body plethysmography protocol. Three hypercapnic hypoxic challenges are a lot for a rat within a 3-hour recording session in freely behaving rats. Did the authors verify with control/ vehicle experiments that repeated challenges in the absence of J60 do not cause potentiation of the response? I understand that it is not possible to invert the order of the injections (due to likely long-term effects of J60) or it is too late to perform vehicle and J60 injections on different days, but controls for repeated challenges should be performed in this type of experiment, especially considering the great variability in the response observed in Figure 4 (in normoxic conditions).
We did not conduct control experiments to assess the impact of repeated hypercapnic hypoxic challenges on the naïve response (i.e., in the absence of J60). However, our experimental protocol was designed such that each experimental period (i.e., post-vehicle or post-J60 infusion) was normalized to baseline recordings taken immediately prior to the vehicle or J60 infusion. While repeated exposure to hypercapnic hypoxic challenges may have altered respiratory output, we are confident that normalizing each experimental period to its respective baseline effectively captures the impact of DREADD activation on ventilation, independent of any potential potentiation that may have occurred due to gas challenge exposure. We have included raw values for all plethysmography outcomes (see Figure 4, panels a-c) to ensure full data transparency. Still, we believe that the baseline-normalized values more accurately reflect the impact of DREADD activation on the components of ventilation.
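The arithmetic behind per-period baseline normalization can be sketched as below. Expressing values as a percentage of the mean of the immediately preceding baseline is an assumption about the exact formula, and the function name is hypothetical. The sketch only shows the calculation; it does not demonstrate that normalization removes any potentiation effect.

```python
def normalize_to_baseline(values: list[float], baseline: list[float]) -> list[float]:
    """Express each value as a percentage of the mean of its own baseline."""
    if not baseline:
        raise ValueError("baseline must contain at least one sample")
    mean_baseline = sum(baseline) / len(baseline)
    if mean_baseline == 0:
        raise ValueError("baseline mean is zero; cannot normalize")
    return [100.0 * v / mean_baseline for v in values]
```

With arbitrary illustrative numbers, a post-infusion value of 3.0 against a baseline of 2.0 and a later value of 6.0 against a baseline of 4.0 both normalize to 150%, which is the sense in which each period is judged relative to its own starting point.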
Furthermore, why are the responses to the hypercapnic hypoxic challenges not reported? These could be very interesting to determine the effects of DREADD stimulation on chemosensory responses and enhance the significance of the study.
Response to the hypercapnic hypoxic challenges has been added to the manuscript. See Figure S3 and results section lines 162-167. Briefly, there were no statistically significant (p < 0.05) differences in tidal volume, respiratory rate, or minute ventilation between J60 vs sham condition during hypercapnic-hypoxic ventilatory challenges.
Line 200 - what is the reason behind performing a qualitative analysis of mCherry in various quadrants? This limits the interpretation of the results. If the authors used Chat-cre rats, the virus should only be in Chat+ MN. Knowing how selective the virus is, and whether its expression was selective for Phrenic MN versus other MN pools, could address several technical questions.
We agree that detailed quantification of expression by motoneuron pool would be of value in future work. However, for these initial proof-of-concept experiments, we performed the quadrant-based qualitative analysis of mCherry expression to provide a simple comparison of mCherry expression between groups (i.e., ChAT-Cre vs. wildtype mice). This analysis allowed us to: 1) show the reader that each animal included in the study showed evidence of mCherry expression and 2) give the reader an idea of patterns of mCherry expression throughout the mid-cervical spinal cord. Additionally, it is important to note that while ChAT is a marker of motoneurons some populations of interneurons also express ChAT(2-4).
Given the increased values of Dia EMG AUC and no changes in respiratory rate, did the authors determine if there was a change in the inspiratory time with J60 administration?
We did not assess inspiratory time.
High death rate in DREADD WT mice - was histological analysis performed on these mice? Could it be due to the large volume injected into the spinal cord that affects not only descending pathways but also ascending ones? Or caused by neuronal death due to the large volume of viral solution injected in mice?
Histological analysis was performed on these animals to assess mCherry expression only (i.e., no staining for NeuN or other markers was performed). While the reviewer's speculations are reasonable, we feel these reasons are unlikely to explain the death rate in DREADD WT mice as ChAT-Cre mice received the same volume injected into their spine and lived up until and during diaphragm EMG recordings. Additionally, WT mice lived for 4-5 weeks post-injection which would be past the acute phase that a large immune response to the viral dose would have occurred.
Line 299-304. Can you please clarify whether these rats were tested under anesthesia?
These rats were assessed under anesthesia. This detail has been added (line 146).
Given some of the unexpected results on cardiovascular parameters in urethane anesthetized rats, did the authors test the effects of J60 in the absence of AAV construct infection?
A small cohort (n = 2) of urethane anesthetized naïve wildtype rats were given the J60 ligand (IV, 0.1 mg/kg dose). We did observe a sudden drop in blood pressure after J60 administration that was sustained for the duration of the recording. One animal showed a 12% decrease in mean arterial blood pressure following J60 administration while the other showed a 35% decrease. Thus, it does appear that in this preparation the J60 ligand is producing a drop in arterial blood pressure.
Line 393. I believe this comment refers to the intrapleural and diaphragmatic injection. Maybe this should be clarified in the sentence.
This sentence has been revised for clarity (see lines 248-250).
Figures 1 and 2. It would be informative to show raw traces of the Diaphragm EMG to demonstrate the increase in tonic EMG. It is not possible to determine that from the integrated traces in Figures 1A and B.
Thank you for bringing up this concern. While the mean data in Figures 1F and 2F do indicate that, on average, animals had tonic diaphragm EMG responses to DREADD activation, the examples given in Figures 1A and 2A show minimal responses. This makes it difficult to fully appreciate the tonic response from those particular traces. However, clear tonic activity can be appreciated from Figures 5A and S2. In these figures, tonic activity is evident from the integrated EMG signals, presenting as a sustained increase in baseline activity between bursts—essentially an upward shift from the zero point.
References
(1) Van Savage, J. & Avegno, E. M. High dose administration of DREADD agonist JHU37160 produces increases in anxiety-like behavior in male rats. Behav Brain Res 452, 114553 (2023). https://doi.org/10.1016/j.bbr.2023.114553
(2) Mesnage, B. et al. Morphological and functional characterization of cholinergic interneurons in the dorsal horn of the mouse spinal cord. J Comp Neurol 519, 3139-3158 (2011). https://doi.org/10.1002/cne.22668
(3) Gotts, J., Atkinson, L., Yanagawa, Y., Deuchars, J. & Deuchars, S. A. Co-expression of GAD67 and choline acetyltransferase in neurons in the mouse spinal cord: A focus on lamina X. Brain Res 1646, 570-579 (2016). https://doi.org/10.1016/j.brainres.2016.07.001
(4) Alkaslasi, M. R. et al. Single nucleus RNA-sequencing defines unexpected diversity of cholinergic neuron types in the adult mouse spinal cord. Nat Commun 12, 2471 (2021). https://doi.org/10.1038/s41467-021-22691-2
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
These experiments are some of the first to assess the role of dopamine release and the activity of D1 and D2 MSNs in pair bond formation in Mandarin voles. This is a novel and comprehensive study that presents exciting data about how the dopamine system is involved in pair bonding. The authors provide very detailed methods and clearly presented results. Here they show dopamine release in the NAc shell is enhanced when male voles encounter their pair bonded partner 7 days after cohabitation. In addition, D2 MSN activity decreases whereas D1 MSN activity increases when sniffing the pair-bonded partner.
The authors do not provide justification for why they only use males in the current study, without discussing sex as a biological variable these data can only inform readers about one sex (which in pair-bonded animals by definition have 2 sexes). In addition, the authors do not use an isosbestic control wavelength in photometry experiments, although they do use EGFP control mice which show no effects of these interventions, a within-subject control such as an isosbestic excitation wavelength could give more confidence in these data and rule out motion artefacts within subjects.
We agree that the mechanisms underlying pair bonding in females should also be investigated. In general, natal philopatry among mammals is female-biased in the wild (Greenwood, 1983; Brody and Armitage, 1985; Ims, 1990; Solomon and Jacquot, 2002); social mammals are rarely characterized by exclusively male natal philopatry (Solomon and Jacquot, 2002). Males often disperse from their natal area to a new place. Thus, male rodents may play a dominant role in the formation and maintenance of mating relationships. This is one reason we investigated pair bonding in males first. Certainly, female mate selection and sexual receptivity or refusal in response to olfactory cues from males also affect the formation and maintenance of pair bonding (Hoglen and Manoli, 2022). This is why future research should focus on the mechanisms underlying pair bond formation in females. This point has been added to the limitations in the discussion.
In the photometry experiments, rAAV-D1/D2-GCaMP6m, a D1/D2 genetically encoded fluorescent calcium sensor, was injected into the NAc shell. The changes in fluorescence signals during social interactions were collected and digitized. To assess whether the fluorescence response was specific to social stimuli, changes in fluorescence signals during non-social behavioral bouts (such as freezing, exploration of the environment, grooming, rearing, etc.) were also recorded and analyzed. The results showed that dopamine release and D1/D2 MSN activity displayed no significant changes during non-social behaviors such as freezing, exploring, grooming and rearing after 3 or 7 days of cohabitation. In addition, GCaMP6m is a genetically encoded calcium indicator, and changes in its fluorescence signal reflect changes in intracellular calcium ion concentration. Using an EGFP virus as a control makes it possible to determine whether the fluorescence signal observed in the experiment is generated by the specific response of GCaMP6m to calcium or whether other non-specific factors lead to fluorescence changes. If there is no similar fluorescence change in the EGFP control group, this more strongly supports the conclusion that the signal detected by GCaMP6m is a calcium-related specific signal. Other research articles have also used an EGFP control group in photometry experiments (Yamaguchi et al., 2020; Qu et al., 2024; Zhan et al., 2024). Therefore, the changes in fluorescence signals observed in the present study reflect neuronal activity during specific social behaviors and were not affected by motion artefacts.
There is an existing literature (cited in this manuscript) from Aragona et al., (particularly Aragona et al., 2006) which has highlighted key differences in the roles of rostral versus caudal NAc shell dopamine in pair bond formation and maintenance. Specifically, they report that dopamine transmission promoting pair bonding only occurs in the rostral shell and not the caudal shell or core regions. Given that the authors have targeted more caudally a discussion of how these results fit with previous work and why there may be differences in these areas is warranted.
Thank you for this thoughtful comment. In the report from Aragona et al. (2006), the coordinates of the bilateral 26-gauge guide cannulae targeting the NAc were 1.6 mm rostral, ± 1 mm bilateral, and 4.5 mm ventral (for shell) or 3.5 mm ventral (for core) from bregma. In the present study, the coordinates of virus injection were AP: +1.5, ML: ±0.99, DV: −4.2 (for NAc shell). Thus, the virus injection sites in our study were close to the rostral shell. However, because viral expression is diffuse, some neurons at the rostrocaudal border and in the caudal shell were also infected by the virus, so we did not distinguish different subregions of the NAc shell. In the future, we will use AAV13, a viral strategy that could target and manipulate precise local neural populations, to address this issue. The NAc is a complex brain structure with distinct regions that have different functions. A previous study suggested that GABAergic substrates of positive and negative types of motivated behavior in the nucleus accumbens shell are segregated along a rostrocaudal gradient (Reynolds and Berridge, 2001). However, another study found that food intake is significantly enhanced by administering μ-selective opioid agonists into the NAc, especially its shell region (Znamensky et al., 2001). Also, μ-opioid stimulation increases the motivation to eat (“wanting”) both in the NAc shell and throughout the entire NAc, as well as in several limbic or striatal structures beyond. For DAMGO stimulation of eating, the “wanting” substrates extend anatomically beyond the rostrodorsal shell and throughout the entire shell (including the caudal shell). Furthermore, DAMGO stimulates eating in the NAc shell and core, as well as in the neostriatum, amygdala… (Gosnell et al., 1986; Gosnell and Majchrzak, 1989; Peciña and Berridge, 2000; Zhang and Kelley, 2000; Echo et al., 2002; Peciña and Berridge, 2005, 2013; Castro and Berridge, 2014).
In pair bond formation and maintenance, the rostral shell is the specific subregion of the NAc important for DA regulation of partner preference (Aragona et al., 2006). In conclusion, the changes in real-time dopamine release and in the activity and electrophysiological properties of D1R and D2R MSNs in the NAc shell after pair bond formation were likely recorded primarily from the rostral shell in our study, which is consistent with the report from Aragona et al.
The authors could discuss the differences between pair bond formation and pair bond maintenance more deeply.
Thanks for your suggestion. We have discussed the differences between pair bond formation and pair bond maintenance in more depth.
Dopamine and the different types of dopamine receptors in the NAc may play different roles in the regulation of pair bond formation and maintenance. The chemogenetic manipulation revealed that VP-projecting D2 MSNs are necessary and more important in pair bond formation than VP-projecting D1 MSNs. This is consistent with previous pharmacological experiments showing that blocking D2R with a specific antagonist, while leaving D1R unblocked, can prevent the formation of a pair bond in prairie voles (Gingrich et al., 2000). This indicates that D2R is crucial for the initial formation of the pair bond. D2R is involved in the reward aspects related to mating. In female prairie voles, D2R in the NAc is important for partner preference formation. The activation of D2R may help to condition the brain to assign a positive valence to the partner's cues during mating, facilitating the development of a preference for a particular mate. In addition, cohabitation caused DA release, and the high-affinity Gi-coupled D2R was activated first, which inhibited D2 MSN activity and promoted pair bond formation. Then, after 7 days of cohabitation, when the pair bond was already established, the significantly increased release of dopamine activated the Gs-coupled D1R, which has a low affinity for dopamine; this increased D1 MSN activity and maintained the partner preference. While D1R is also present and involved in the overall process, its role in the initial formation of the pair bond is not as dominant as that of D2R (Aragona et al., 2006). However, it still participates in the neurobiological processes related to pair bond formation. For example, in male mandarin voles, after 7 days of cohabitation with females, D1R activity in the NAc shell was affected during pair bond formation. The extracellular DA concentration was higher when sniffing their partner compared to a stranger, and this increase in DA release led to an increase in D1R activity in the NAc shell.
In prairie voles, dopamine D1 receptors seem to be essential for pair bond maintenance. Neonatal treatment with D1 agonists can impair partner preference formation later in life, suggesting an organizational role for D1 in maintaining the bond (Aragona et al., 2006). In pair-bonded male prairie voles, D1R is involved in inducing aggressive behavior toward strangers, which helps to maintain the pair bond by protecting it from potential rivals. In the NAc shell, D1 agonist decreases the latency to attack same-sex conspecifics, while D1 antagonism increases it (Aragona et al., 2006). In summary, D2R is more crucial for pair bond formation, being involved in reward association and necessary for the initial development of the pair bond. D1R, on the other hand, is more important for pair bond maintenance, being involved in aggression and mate guarding behaviors and having an organizational role in maintaining the pair bond over time. We therefore suggest that D2 MSNs are more predominantly involved in the formation of a pair bond compared with D1 MSNs.
The authors have successfully characterised the involvement of dopamine release, changes in D1 and D2 MSNs, and projections to the VP in pair bonding voles. Their conclusions are supported by their data and they make a number of very reasonable discussion points acknowledging various limitations
Reviewer #2 (Public review):
Summary:
Using in vivo fiber-photometry the authors first establish that DA release when contacting their partner mouse increases with days of cohabitation while this increase is not observed when contacting a stranger mouse. Similar effects are found in D1-MSNs and D2-MSNs with the D1-MSN responses increasing and D2-MSN responses decreasing with days of cohabitation. They then use slice physiology to identify underlying plasticity/adaptation mechanisms that could contribute to the changes in D1/D2-MSN responses. Last, to address causality the authors use chemogenetic tools to selectively inhibit or activate NAc shell D1 or D2 neurons that project to the ventral pallidum. They found that D2 inhibition facilitates bond formation while D2 excitation inhibits bond formation. In contrast, both D1-MSN activation and inhibition inhibit bond formation.
Strengths:
The strength of the manuscript lies in combining in vivo physiology to demonstrate circuit engagement and chemogenetic manipulation studies to address circuit involvement in pair bond formation in a monogamous vole.
Weaknesses:
Comment: Weaknesses include that a large set of experiments within the manuscript are dependent on using short promoters for D1 and D2 receptors in viral vectors. As the authors acknowledge this approach can lead to ectopic expression and the presented immunohistochemistry supports this notion. It seems to me that the presented quantification underestimates the degree of ectopic expression that is observed by eye when looking at the presented immunohistochemistry. However, given that Cre transgenic animals are not available for Microtus mandarinus and given the distinct physiological and behavioral outcomes when imaging and manipulating both viral-targeted populations this concern is minor.
Thank you for this comment. The viruses used in the present study were purchased from BrainVTA. The D1/D2 receptor promoter sequences were predicted and amplified for validation by the company. Each promoter was cloned and packaged in an AAV vector (taking the rAAV-D2-mCherry-WPRE-bGH_polyA virus as an example, Author response image 1A). The D1/D2 promoter sequences are shown in Author response image 1B-C. In addition, the D1 and D2 receptor gene promoter viruses used in this paper have been used in several published papers with high specificity (Zhao et al., 2019; Ying et al., 2022). In our paper, FISH verification showed a high proportion of co-localization between virus expression and receptor mRNA, indicating high specificity of the viruses (Figure S15, S16).
Author response image 1.
(A) Gene carrier of rAAV-D2-mCherry-WPRE-bGH_polyA. (B-C) Gene sequence of D1 promoter and D2 promoter.
The slice physiology experiments provide some interesting outcomes but it is unclear how they can be linked to the in vivo physiological outcomes and some of the outcomes don't match intuitively (e.g. cohabitation enhances excitatory/inhibitory balance in D2-MSNs but the degree of contact-induced inhibition is enhanced in D2-MSN).
Thanks for your comment. The present study found that the frequencies of sEPSC and sIPSC were significantly enhanced after the formation of a pair bond in NAc shell D2 MSNs. The excitatory/inhibitory balance of D2 MSNs was enhanced after cohabitation. These results are not consistent with the findings from fiber photometry of calcium signals. One study showed that NAc D2 MSNs were linked to both ‘liking’ (food consumption) and ‘wanting’ (food approach) but with opposing actions; high D2 MSN activity signaled ‘wanting’, and low D2 MSN activity enhanced ‘liking’. D2 MSNs are faced with a tradeoff between increasing ‘wanting’ by being more active or allowing ‘liking’ by remaining silent (Guillaumin et al., 2023). Therefore, the increase in frequencies of sEPSC and sIPSC in D2 MSNs may reflect two processes, liking and wanting, respectively. We think that hedonia and motivation might influence D2 MSN activity differently during cohabitation and contribute to the processing of pair bond formation in a more dynamic and complex way than previously expected.
Moreover, the frequencies of sEPSC and sIPSC were significantly reduced in the NAc shell D1 MSNs after pair bonding, whereas the intrinsic excitability increased after cohabitation with females.
The bidirectional modifications (reduced synaptic inputs vs. increased excitability) observed in D1 MSNs might result from homeostatic regulation. The overall synaptic transmission may produce no net changes, given that reductions in both excitatory and inhibitory synaptic transmission of D1 MSNs were observed. Also, increases in the intrinsic excitability of D1 MSNs would result in an overall excitation gain on D1 MSNs.
One interesting finding is that the relationship between D2-MSN and pair bond formation is quite clear (inhibition facilitates while excitation inhibits pair bond formation). In contrast, the role of D1-MSNs is more complicated since both excitation and inhibition disrupt pair bond formation. This is not convincingly discussed.
Considering the reviewer’s suggestion, the discussion has been added in the revised manuscript.
In the present study, DREADDs approaches were used to inhibit or excite NAc MSNs to VP projection and it was found that D1 and D2 NAc MSNs projecting to VP play different roles in the formation of a pair bond. Chemogenetic inhibition of VP-projecting D2 MSNs promoted partner preference formation, while activation of VP-projecting D2 MSNs inhibited it (Figure 6). Chemogenetic activation of D2 MSNs produced the opposite effect of DA on the D2 MSNs on partner preference, while inhibition of these neurons produced the same effects of DA on D2 MSNs. DA binding with D2R is coupled with Gi and produces an inhibitory effect (Lobo and Nestler, 2011). It is generally assumed that activation of D2R produces aversive and negative reinforcement. These results were consistent with the reduced D2 MSNs activity upon sniffing their partner in the fiber photometry test and the increased frequency and amplitude of sIPSC in the present study. Our results also agree with other previous studies that chemogenetic inhibition of NAc D2 MSNs is sufficient to enhance reward-oriented motivation in a motivational task (Carvalho Poyraz et al., 2016; Gallo et al., 2018). Inhibition of D2 MSNs during self-administration enhanced response and motivation to obtain cocaine (Bock et al., 2013). This also suggests that the mechanism underlying attachment to a partner and drug addiction is similar.
Besides, in the present study, the formation of partner preference was inhibited after activation or inhibition of VP-projecting D1 MSNs, which is not consistent with conventional understanding of prairie vole behavior. Alternatively, DA binding with D1R is coupled with Gs and produces an excitatory effect (Lobo and Nestler, 2011), while activation of D1R produces reward and positive reinforcement (Hikida et al., 2010; Tai et al., 2012; Kwak and Jung, 2019). For example, activation of D1 MSNs enhances the cocaine-induced conditioned place preference (Lobo et al., 2010). In addition, D1R activation by DA promotes D1 MSNs activation, which promotes reinforcement. However, a recent study found that NAc-ventral mesencephalon D1 MSNs promote reward and positive reinforcement learning; in contrast, NAc-VP D1 MSNs led to aversion and negative reinforcement learning (Liu et al., 2022). It is consistent with our results that activation of NAc-VP D1 MSNs pathway reduced time spent side-by-side and impaired partner preference after 7 days of cohabitation. In contrast to inhibition of D2 MSNs, we found that inhibition of the D1 MSNs did not elicit corresponding increases in partner preference. One possible explanation is that almost all D1 MSNs projecting to the VTA/ substantia nigra (SN) send collaterals to the VP (Pardo-Garcia et al., 2019). For example, optogenetically stimulating VP axons may inadvertently cause effects in the VTA/SN through the antidromic activation of axon collaterals (Yizhar et al., 2011). Therefore, chemogenetic inhibition of D1 MSNs may also inhibit DA neurons in VTA, subsequently inhibiting the formation of a pair bond.
Dopamine and the different types of dopamine receptors in the NAc may play different roles in the regulation of pair bond formation and maintenance. The chemogenetic manipulation revealed that VP-projecting D2 MSNs are necessary and more important in pair bond formation than VP-projecting D1 MSNs. This is consistent with previous pharmacological experiments showing that blocking D2R with a specific antagonist, while leaving D1R unblocked, can prevent the formation of a pair bond in prairie voles (Gingrich et al., 2000). This indicates that D2R is crucial for the initial formation of the pair bond. D2R is involved in the reward aspects related to mating. In female prairie voles, D2R in the NAc is important for partner preference formation. The activation of D2R may help to condition the brain to assign a positive valence to the partner's cues during mating, facilitating the development of a preference for a particular mate. In addition, cohabitation caused DA release, and the high-affinity Gi-coupled D2R was activated first, which inhibited D2 MSN activity and promoted pair bond formation. Then, after 7 days of cohabitation, when the pair bond was already established, the significantly increased release of dopamine activated the Gs-coupled D1R, which has a low affinity for dopamine; this increased D1 MSN activity and maintained the partner preference. While D1R is also present and involved in the overall process, its role in the initial formation of the pair bond is not as dominant as that of D2R (Aragona et al., 2006). However, it still participates in the neurobiological processes related to pair bond formation. For example, in male mandarin voles, after 7 days of cohabitation with females, D1R activity in the NAc shell was affected during pair bond formation. The extracellular DA concentration was higher when sniffing their partner compared to a stranger, and this increase in DA release led to an increase in D1R activity in the NAc shell.
In prairie voles, dopamine D1 receptors seem to be essential for pair bond maintenance. Neonatal treatment with D1 agonists can impair partner preference formation later in life, suggesting an organizational role for D1 in maintaining the bond (Aragona et al., 2006). In pair-bonded male prairie voles, D1R is involved in inducing aggressive behavior toward strangers, which helps to maintain the pair bond by protecting it from potential rivals. In the NAc shell, D1 agonist decreases the latency to attack same-sex conspecifics, while D1 antagonism increases it (Aragona et al., 2006). In summary, D2R is more crucial for pair bond formation, being involved in reward association and necessary for the initial development of the bond. D1R, on the other hand, is more important for pair bond maintenance, being involved in aggression and mate guarding behaviors and having an organizational role in maintaining the bond over time. We therefore suggest that D2 MSNs are more predominantly involved in the formation of a pair bond compared with D1 MSNs.
It seemed a missed opportunity that physiological readout is limited to males. I understand though that adding females may be beyond the scope of this manuscript.
We greatly appreciate your valuable comment. Reviewer 1 also raised this issue, and we responded as follows.
In general, natal philopatry among mammals is female-biased in the wild (Greenwood, 1983; Brody and Armitage, 1985; Ims, 1990; Solomon and Jacquot, 2002); social mammals are rarely characterized by exclusively male natal philopatry (Solomon and Jacquot, 2002). Males often disperse from their natal area to a new place. Thus, male rodents may play a dominant role in the formation and maintenance of mating relationships. This is one reason we investigated pair bonding in males first. Certainly, female mate selection and sexual receptivity or refusal in response to olfactory cues from males also affect the formation and maintenance of pair bonding (Hoglen and Manoli, 2022). This is why future research should focus on the mechanisms underlying pair bond formation in females. This point has been added to the limitations in the discussion.
Reviewer #3 (Public review):
Summary:
The manuscript is evaluating changes in dopamine signaling in the nucleus accumbens following pair bonding and exposure to various stimuli in mandarin voles. In addition, the authors present chemogenetic data that demonstrate excitation and inhibition of D1 and D2 MSN affect pair bond formation.
Strengths:
The experimental designs are strong. The approaches are innovative and use cutting-edge methods.
The manuscript is well written.
Weaknesses:
The statistical results are not presented, and not all statistical analyses are appropriate.
Additionally, some details of methods are absent.
As you suggested, we added the detailed information in the revised manuscript.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Remove references to 'extreme significance' - p is set as a threshold and the test is either significant or not.
Thanks for your suggestion. We have removed 'extreme significance' in the revised manuscript.
(2) The second half of the abstract is a little confusing the use of activation/inhibition makes it difficult to read and follow, this could be re-worded for clarity.
We apologize for the confusion. We have reworded the sentence as follows.
In addition, chemogenetic inhibition of ventral pallidum-projecting D2 MSNs in the NAc shell enhanced pair bond formation, while chemogenetic activation of VP-projecting D2 MSNs in the NAc shell inhibited pair bond formation.
Reviewer #2 (Recommendations for the authors):
(1) In many instances repeated measures are presented from the same mice (e.g. Figures 1F, I; S1BC). Repeated measures for each mouse should be connected with a line in the figures. This will allow the reader to visually compare the repeated measures for each animal.
Thanks for your careful consideration. As the reviewer suggested, the figures have been changed.
(2) It is unclear to me how the time point 0 for sniffing was determined. How is the time point 0 for side-by-side contact determined?
Sniffing is an olfactory investigation behavior, defined as the animal using its nose to inspect any portion of the stimulus animal’s body, including the tail. Time point 0 for sniffing was the onset of the sniffing behavior. Side-by-side behavior is defined as significant physical contact with a social object while huddling in a quiescent state. Time point 0 for side-by-side behavior was the onset of the side-by-side behavior.
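To illustrate what anchoring time point 0 at behavior onset means in practice, the following is a minimal sketch of event alignment for a photometry trace. It is not the authors' analysis pipeline: the function name `align_to_onsets`, the window lengths, and the choice of the pre-onset window as the ΔF/F baseline are all assumptions made for illustration.

```python
import numpy as np

def align_to_onsets(signal, fs, onsets_s, pre_s=2.0, post_s=5.0):
    """Cut a window around each behavior onset so that time 0 is the onset.

    signal   : 1-D fluorescence trace (hypothetical input)
    fs       : sampling rate in Hz
    onsets_s : onset times of the scored behavior, in seconds
    Window lengths and the baseline definition are illustrative assumptions.
    """
    pre = int(round(pre_s * fs))
    post = int(round(post_s * fs))
    trials = []
    for t in onsets_s:
        i = int(round(t * fs))
        if i - pre < 0 or i + post > len(signal):
            continue  # skip events too close to the start or end of the recording
        window = np.asarray(signal[i - pre:i + post], dtype=float)
        baseline = window[:pre].mean()  # mean of the pre-onset period
        trials.append((window - baseline) / baseline)  # dF/F relative to baseline
    time = np.arange(-pre, post) / fs  # time axis in seconds, 0 at onset
    return time, np.array(trials)
```

Each row of the returned array is one behavioral bout, and averaging rows gives the mean onset-aligned response for one animal.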
(3) Figure 1-3: For the fiber photometry data 7 events (sniffs) are shown in the heat maps. Are these the first 7 sniffs? What went into the quantification? It seems that DA and D1/D2 responses are habituating. This could be analyzed and would need to be discussed.
In the heat maps (Figure 1-3), each row shows the mean fluorescence signal change of one subject (n = 7 voles) upon sniffing the partner, a stranger or an object, not the fluorescence signal change of individual sniffing events in one vole. The quantification of changes in the mean fluorescence signal of all subjects is shown in Figure 1F, 1I, Figure 2F, 2I, Figure 3F and 3I.
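The distinction between per-event and per-subject heat-map rows can be sketched as follows. This is an illustrative assumption about the data layout, not the authors' code: `trials_by_subject` is a hypothetical mapping from an animal ID to an array of onset-aligned traces (events × samples).

```python
import numpy as np

def subject_rows(trials_by_subject):
    """Average onset-aligned traces within each animal.

    Returns the sorted animal IDs and a 2-D array with one row per animal,
    which is the layout described for the heat maps (7 rows for n = 7 voles).
    """
    ids = sorted(trials_by_subject)
    rows = np.vstack([np.mean(trials_by_subject[s], axis=0) for s in ids])
    return ids, rows
```

Under this layout, any apparent trend from the top to the bottom row reflects differences between animals rather than habituation across successive sniffs.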
(4) Generally, it is very difficult to obtain cell type selectivity using short promoters in viruses (the authors acknowledge this). Which D1 and D2 promoter sequences were used for obtaining specificity? The degree of ectopic expression looks much higher than the quantification (e.g. in Fig. 3b, 6C, 7C, S14A, C). Is this due to thresholding?
The viruses used in the present study were purchased from BrainVTA. The D1/D2 receptor promoter sequences were predicted and amplified for validation. Each promoter was cloned and packaged in an AAV vector (taking the rAAV-D2-mCherry-WPRE-bGH_polyA virus as an example, Author response image 1A). The D1/D2 promoter sequences are shown in Author response image 1B-C. In addition, the D1 and D2 receptor gene promoter viruses used in this paper have been used in several published papers with high specificity (Zhao et al., 2019; Ying et al., 2022). In Figure 6C, the first image is the merged fluorescence image taken under different fluorescence channels with the 20× objective. The second and third images were taken under the 40× objective from the field marked by the white box in the first image, and were merged into the fourth image. Because of the different exposure times and intensities, the fluorescence images taken at 40× are clearer than the image taken at 20×. For example, the labeled cells in Figure 6C are presented in Author response image 2. In our paper, virus expression and receptor mRNA detected by FISH were co-localized in a high proportion of cells, indicating high specificity of the viruses (Figure S15, S16). Certainly, the number of positive neurons may depend on visibility (thresholding); only visible cells were counted. The cell counts in Author response image 2B and 2C are similar to the quantification in Figure 6C.
Author response image 2.
(A) Immunohistological image showing co-localization of hM3Dq- mCherry-anti expression (green), D2R-mRNA (red), and DAPI (blue) in the NAc shell. Scale bar: 100 μm. (B) The cell counts and the determination of colocalization of the 20× immunohistochemistry images. The marked neurons were counted with white dots. (C) The cell counts and the determination of colocalization of the 40× immunohistochemistry images. The marked neurons were counted with white dots.
(5) Figure 6D/7D: the time scale seems to be off for both traces (40 seconds). For the hM3D Gq experiment, only one trace is shown. It would be more convincing to provide an input-output curve from several mice and to statistically compare the curves.
Thanks for your careful consideration. As the reviewer suggested, a figure showing resting membrane potentials before and after CNO exposure from several voles was added to the revised manuscript.
(6) The presence of GIRK channels in MSNs has been a long debate and hM4D Gi activation may mostly act at the level of terminals by inhibiting neurotransmitter release. For demonstrating hyperpolarization of the soma showing the resting membrane potential before and after drug CNO exposure would be more convincing.
Thanks for your careful consideration. As the reviewer suggested, a figure showing the resting membrane potential before and after CNO exposure has been added to the revised manuscript.
(7) It is unclear to me how far the slice physiology informs the in vivo physiology (e.g. cohabitation enhances excitatory/inhibitory balance in D2-MSNs but the degree of contact-induced inhibition is enhanced in D2-MSNs; D2-MSNs become less responsive to DA in the slice, yet at the time of enhanced DA release D2-MSN activity is also strongly reduced).
The present study found that the frequencies of sEPSCs and sIPSCs were significantly enhanced in NAc shell D2 MSNs after the formation of a pair bond. The excitatory/inhibitory balance of D2 MSNs was enhanced after cohabitation. These results are not consistent with the findings from fiber photometry of calcium signals. One study showed that NAc D2 MSN activity was linked to both ‘liking’ (food consumption) and ‘wanting’ (food approach) but with opposing actions; high D2 MSN activity signaled ‘wanting’, and low D2 MSN activity enhanced ‘liking’. D2 MSNs are faced with a tradeoff between increasing ‘wanting’ by being more active or allowing ‘liking’ by remaining silent (Guillaumin et al., 2023). Therefore, the increases in the frequencies of sEPSCs and sIPSCs in D2 MSNs may reflect these two processes, wanting and liking. We think that hedonia and motivation might differentially influence D2 MSN activity during cohabitation and contribute to pair bond formation in a more dynamic and complex way than previously expected.
Moreover, the frequencies of sEPSC and sIPSC were significantly reduced in the NAc shell D1 MSNs after pair bonding, whereas the intrinsic excitability increased after cohabitation with females.
The bidirectional modifications (reduced synaptic inputs vs. increased excitability) observed in D1 MSNs might result from homeostatic regulation. The overall synaptic transmission may produce no net changes, given that reductions in both excitatory and inhibitory synaptic transmission of D1 MSNs were observed. Also, increases in the intrinsic excitability of D1 MSNs would result in an overall excitation gain on D1 MSNs.
(8) One interesting finding is that the relationship between D2-MSN and pair bond formation is quite clear (inhibition facilitates while excitation inhibits pair bond formation). In contrast, the role of D1-MSNs is more complicated since both excitation and inhibition disrupt pair bond formation.
The discussion of this would benefit from another attempt.
As the reviewer suggested, the following discussion has been added to the revised manuscript.
In the present study, DREADD approaches were used to inhibit or excite NAc MSNs projecting to the VP, and we found that VP-projecting D1 and D2 NAc MSNs play different roles in the formation of a pair bond. Chemogenetic inhibition of VP-projecting D2 MSNs promoted partner preference formation, while activation of VP-projecting D2 MSNs inhibited it (Figure 6). Chemogenetic activation of D2 MSNs thus produced the opposite effect on partner preference to that of DA acting on D2 MSNs, while inhibition of these neurons produced the same effect as DA. DA binding to D2R is coupled to Gi and produces an inhibitory effect (Lobo and Nestler, 2011). It is generally assumed that activation of D2R produces aversion and negative reinforcement. These results are consistent with the reduced D2 MSN activity upon sniffing the partner in the fiber photometry test and with the increased frequency and amplitude of sIPSCs in the present study. Our results also agree with previous studies showing that chemogenetic inhibition of NAc D2 MSNs is sufficient to enhance reward-oriented motivation in a motivational task (Carvalho Poyraz et al., 2016; Gallo et al., 2018). Inhibition of D2 MSNs during self-administration enhanced responding and motivation to obtain cocaine (Bock et al., 2013). This also suggests that the mechanisms underlying attachment to a partner and drug addiction are similar.
In addition, in the present study, the formation of partner preference was inhibited after either activation or inhibition of VP-projecting D1 MSNs, which is not consistent with the conventional understanding of prairie vole behavior. DA binding to D1R is coupled to Gs and produces an excitatory effect (Lobo and Nestler, 2011), and activation of D1R produces reward and positive reinforcement (Hikida et al., 2010; Tai et al., 2012; Kwak and Jung, 2019). For example, activation of D1 MSNs enhances cocaine-induced conditioned place preference (Lobo et al., 2010). In addition, D1R activation by DA promotes D1 MSN activation, which promotes reinforcement. However, a recent study found that NAc-ventral mesencephalon D1 MSNs promote reward and positive reinforcement learning, whereas NAc-VP D1 MSNs lead to aversion and negative reinforcement learning (Liu et al., 2022). This is consistent with our result that activation of the NAc-VP D1 MSN pathway reduced time spent side-by-side and impaired partner preference after 7 days of cohabitation. In contrast to inhibition of D2 MSNs, we found that inhibition of D1 MSNs did not elicit a corresponding increase in partner preference. One possible explanation is that almost all D1 MSNs projecting to the VTA/substantia nigra (SN) send collaterals to the VP (Pardo-Garcia et al., 2019). For example, optogenetically stimulating VP axons may inadvertently cause effects in the VTA/SN through the antidromic activation of axon collaterals (Yizhar et al., 2011). Therefore, chemogenetic inhibition of D1 MSNs may also inhibit DA neurons in the VTA, subsequently inhibiting the formation of a pair bond.
Dopamine and the different types of dopamine receptors in the NAc may play different roles in the regulation of pair bond formation and maintenance. The chemogenetic manipulations revealed that VP-projecting D2 MSNs are necessary for pair bond formation and more important in it than VP-projecting D1 MSNs. This is consistent with previous pharmacological experiments showing that blocking D2R with a specific antagonist, but not blocking D1R, can prevent the formation of a pair bond in prairie voles (Gingrich et al., 2000). This indicates that D2R is crucial for the initial formation of the pair bond. D2R is involved in the reward aspects related to mating. In female prairie voles, D2R in the NAc is important for partner preference formation. The activation of D2R may help to condition the brain to assign a positive valence to the partner's cues during mating, facilitating the development of a preference for a particular mate. In addition, cohabitation caused DA release, and the high-affinity Gi-coupled D2R was activated first, which inhibited D2 MSN activity and promoted pair bond formation. Then, after 7 days of cohabitation, when the pair bond was already established, the significantly increased release of dopamine activated the Gs-coupled D1R, which has a low affinity for dopamine; this increased D1 MSN activity and maintained the partner preference. While D1R is also present and involved in the overall process, its role in the initial formation of the pair bond is not as dominant as that of D2R (Aragona et al., 2006). However, it still participates in the neurobiological processes related to pair bond formation. For example, in male mandarin voles, after 7 days of cohabitation with females, D1R activity in the NAc shell was affected during pair bond formation. The extracellular DA concentration was higher when sniffing their partner compared to a stranger, and this increase in DA release led to an increase in D1R activity in the NAc shell.
In prairie voles, dopamine D1 receptors seem to be essential for pair bond maintenance. Neonatal treatment with D1 agonists can impair partner preference formation later in life, suggesting an organizational role for D1 in maintaining the bond (Aragona et al., 2006). In pair-bonded male prairie voles, D1R is involved in inducing aggressive behavior toward strangers, which helps to maintain the pair bond by protecting it from potential rivals. In the NAc shell, a D1 agonist decreases the latency to attack same-sex conspecifics, while D1 antagonism increases it (Aragona et al., 2006). In summary, D2R is more crucial for pair bond formation, being involved in reward association and necessary for the initial development of the bond. D1R, on the other hand, is more important for pair bond maintenance, being involved in aggression and mate-guarding behaviors and having an organizational role in maintaining the bond over time. We therefore suggest that D2 MSNs are more prominently involved in the formation of a pair bond than D1 MSNs.
(9) For the chemogenetic inhibition/excitation experiment please specify the temporal relationship between CNO injection and the behavioral testing. Are the DREADDs activated during the preference testing or are we only looking at the consequences of DREADD activation during cohabitation? This would impact the interpretation of the results.
Following the reviewer’s suggestion, we have clarified the timing of CNO injection relative to behavioral testing. In the chemogenetic experiments, male voles were injected with CNO (1 mg/kg, i.p.) or saline once per day during the 7-day cohabitation period. On day 3 and day 7 of cohabitation, the partner preference tests (3 h) were conducted 3 h after injection. Pekcec and colleagues (Jendryka et al., 2019) found that, in mice, free CNO levels in CSF and cortex tissue dropped surprisingly sharply after i.p. injection, and CNO could no longer be detected after 60 min. However, the associated biological effects are reported to endure for 6 - 24 h after CNO treatment (Farzi et al., 2018; Desloovere et al., 2019; Paretkar and Dimitrov, 2019). For example, Hen and colleagues (Anacker et al., 2018) showed, using DREADDs for 10 days, that chemogenetic inhibition of adult-born neurons in the vDG promotes susceptibility to social defeat stress, whereas increasing neurogenesis confers resilience to chronic stress. Moreover, Ming-Ming Zhang et al. (Zhang et al., 2022) revealed that selective activation or inhibition of the IC-BLA projection pathway strengthens or weakens the intensity of observational pain when CNO (1 mg/kg) was injected i.p. into the infected mice on days 1, 3, 5, and 7 after virus expression. Furthermore, Herman and colleagues (Nawreen et al., 2020) found that chronic inhibition of IL PV INs reduces passive and increases active coping behavior in the FST. Therefore, we believe that 7 days of CNO injections can produce chronic effects on MSNs and alter the formation of partner preferences.
(10) Discussion: "The observed increase in DA release resulted in suppression of D2 neurons in the NAc shell". "In contrast, the rise in DA release increases D1 activity selectively in response to their partner after extended cohabitation." These statements would need to be weakened as causality is not shown here.
Thanks for your rigorous consideration. We have reorganized the discussion in the revised manuscript.
“The observed increase in DA release resulted in alterations in activities of D2 and D1 neurons in the NAc shell selectively in response to their partner after extended cohabitation.”
(11) It would help if the order of supplementary figures would match their order of figures appearance in the result section.
Thanks for your suggestion. We have reordered the supplementary figures in the revised manuscript to match their order of appearance in the Results section.
(12) This may be beyond the focus of the study but it would be very interesting to know whether the physiological responses to partner contact are similarly observed in females.
Thanks for raising this point. Regrettably, we did not record the physiological responses of females to partner contact. We predict that females may show similar response patterns to their partner. In the future, we will extend this research to the mechanisms of partner preference in female voles.
Reviewer #3 (Recommendations for the authors):
The manuscript is evaluating changes in dopamine signaling in the nucleus accumbens following pair bonding and exposure to various stimuli in mandarin voles. The manuscript is generally well-written. The experiment designs seem strong, although there are missing details to fully evaluate them. The statistics are not completed correctly, and the statistical values are not reported making them even harder to evaluate. There are a lot of potential strengths in this research. However, my review is limited because I am limited in how to evaluate data interpretation when statistical analyses are not clear. I provide details below.
Major
(1) Statistics should be provided in the Results section. It is not clear how to evaluate the authors' interpretations without presenting the statistical data. What stats are being reported about viral expression in cells on lines 192-194? What posthocs? There is only one condition, so I assume the statistic was a one-sample t-test. The authors should report the t-value, df, and p-value. No post-hoc is needed. There are many issues like this, which makes reviewing this manuscript very difficult. If the statistics were not conducted properly and reported clearly, I do not have confidence that I can evaluate the author's interpretation of the results.
Thanks for your suggestion. We now report the t-value, df, and p-value in the Results section.
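To make the requested values concrete, the following is a minimal sketch of how the t-value and df of a one-sample t-test can be computed. It is an illustration only: the function name `one_sample_t`, the data, and the null value `mu0` are hypothetical and are not taken from the manuscript. The p-value would then be read from the t distribution with the returned df using a statistics package.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Return (t, df) for a one-sample t-test of mean(values) against mu0."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("zero variance: t is undefined")
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical co-localization percentages from five sections (not real data)
coloc = [90.0, 92.0, 94.0, 96.0, 98.0]
t_value, df = one_sample_t(coloc, mu0=90.0)
print(f"t({df}) = {t_value:.3f}")
```

Reporting the result in the form t(df) = value, p = value lets a reader check the test against the sample size.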
(2) Statistical tests should be labeled correctly. ANOVAs (found in figure caption) for Figure 1 data are not repeated measures. Rather, they are one-way ANOVA (with stimulus as a within-subject variable).
We analyzed the changes in fluorescence signals in Figures 1-3 with one-way repeated measures ANOVA. In the experiment, the changes in fluorescence signals of every subject were collected upon sniffing the partner, an unknown female, and an object. Because each subject was measured under all three conditions, stimulus is a within-subject factor, and a one-way repeated measures ANOVA is the appropriate test.
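The design described here (every subject measured under each stimulus) can be sketched as follows. This is a minimal illustration of a one-way repeated measures ANOVA in plain Python, assuming a complete subjects-by-conditions table; the function name `rm_anova_oneway` and the numbers are hypothetical, not the data from the study, and no sphericity correction is applied.

```python
def rm_anova_oneway(data):
    """One-way repeated measures ANOVA.

    data: list of rows, one row per subject, one column per condition.
    Returns (F, df_conditions, df_error).
    """
    n = len(data)      # number of subjects
    k = len(data[0])   # number of within-subject conditions
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need a complete table with >= 2 subjects and conditions")

    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Subject-to-subject variability is removed from the error term
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    if ss_error <= 0:
        raise ValueError("error sum of squares is zero: F is undefined")
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical signal changes: columns are partner, stranger, object
signals = [
    [5.0, 3.0, 1.0],
    [6.0, 4.0, 2.0],
    [7.0, 3.0, 2.0],
    [6.0, 2.0, 1.0],
]
f_value, df_cond, df_error = rm_anova_oneway(signals)
print(f"F({df_cond}, {df_error}) = {f_value:.2f}")
```

The key difference from an ordinary one-way ANOVA is that the between-subject sum of squares is subtracted from the error term, which is why the error df is (k - 1)(n - 1) rather than k(n - 1).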
(3) The protocol for behavioral assessment and stimulus presentation during fiber photometry recording is not clear. For example, the authors mention on line 662 that voles ate carrots during some of the recording sessions, but nothing else is described about the recording session. What was the order of stimulus presentation? What was the object provided? Why is eating carrots analyzed separately from object, partner, and stranger exposure?
Response: Sorry for the confusion. A detailed description has been added. After 3 and 7 days of cohabitation, males were exposed to their partner or an unfamiliar female (each exposure lasted 30 min) in random order in a clean social interaction cage. The changes in fluorescence signals during these social interactions with their partner, an unfamiliar vole of the opposite sex, or an object (Rubik's Cube) were collected and digitized by CamFiberPhotometry software (ThinkerTech). To rule out the possibility that differences in fluorescence signals were caused by differences in virus expression at the different time points, we applied the same experimental strategy to new male mandarin voles and measured the fluorescence signal changes upon eating carrot after 3 and 7 days of cohabitation (the male mandarin voles were fasted for four hours before the test). Since sniffing (object, partner, and stranger) and eating carrot were not tested in the same males, we analyzed sniffing and eating carrot separately.
(4) Supplement figures would be better as figures instead of tables. Many effects are hard to interpret.
As you suggested, we have added the information from Supplementary Table 1 to the Results.
(5) Citations should be included to note when pair bonding occurs in mandarin voles.
As you suggested, we added the citation in the revised manuscript.
Minor
(1) Add a citation for the statement that married people live longer than unmarried people (Lines 51-52).
As you suggested, we added the citation in the revised manuscript.
(2) There is a table labeling viral vectors, but the table is not titled properly or referenced in the methods section.
Thanks for your careful checking. We have revised the table title, and the table is now cited in the revised manuscript.
(3) Sentences on lines 608-610 and 610-612 seem redundant.
This sentence was corrected.
(4) This is a rather subjective statement "Carrots are voles' favorite food."
We reorganized the sentence in the revised manuscript.
"Carrots are voles' daily food."
Anacker C, Luna VM, Stevens GS, Millette A, Shores R, Jimenez JC, Chen B, Hen R (2018) Hippocampal neurogenesis confers stress resilience by inhibiting the ventral dentate gyrus. Nature 559:98-102.
Aragona BJ, Liu Y, Yu YJ, Curtis JT, Detwiler JM, Insel TR, Wang Z (2006) Nucleus accumbens dopamine differentially mediates the formation and maintenance of monogamous pair bonds. Nature neuroscience 9:133-139.
Bock R, Shin JH, Kaplan AR, Dobi A, Markey E, Kramer PF, Gremel CM, Christensen CH, Adrover MF, Alvarez VA (2013) Strengthening the accumbal indirect pathway promotes resilience to compulsive cocaine use. Nature neuroscience 16:632-638.
Brody AK, Armitage KB (1985) The effects of adult removal on dispersal of yearling yellow-bellied marmots. Canadian Journal of Zoology 63:2560-2564.
Carvalho Poyraz F, Holzner E, Bailey MR, Meszaros J, Kenney L, Kheirbek MA, Balsam PD, Kellendonk C (2016) Decreasing Striatopallidal Pathway Function Enhances Motivation by Energizing the Initiation of Goal-Directed Action. The Journal of neuroscience : the official journal of the Society for Neuroscience 36:5988-6001.
Castro DC, Berridge KC (2014) Opioid hedonic hotspot in nucleus accumbens shell: mu, delta, and kappa maps for enhancement of sweetness "liking" and "wanting". The Journal of neuroscience : the official journal of the Society for Neuroscience 34:4239-4250.
Desloovere J, Boon P, Larsen LE, Merckx C, Goossens MG, Van den Haute C, Baekelandt V, De Bundel D, Carrette E, Delbeke J, Meurs A, Vonck K, Wadman W, Raedt R (2019) Long-term chemogenetic suppression of spontaneous seizures in a mouse model for temporal lobe epilepsy. Epilepsia 60:2314-2324.
Echo JA, Lamonte N, Ackerman TF, Bodnar RJ (2002) Alterations in food intake elicited by GABA and opioid agonists and antagonists administered into the ventral tegmental area region of rats. Physiology & behavior 76:107-116.
Farzi A, Lau J, Ip CK, Qi Y, Shi YC, Zhang L, Tasan R, Sperk G, Herzog H (2018) Arcuate nucleus and lateral hypothalamic CART neurons in the mouse brain exert opposing effects on energy expenditure. eLife 7.
Gallo EF, Meszaros J, Sherman JD, Chohan MO, Teboul E, Choi CS, Moore H, Javitch JA, Kellendonk C (2018) Accumbens dopamine D2 receptors increase motivation by decreasing inhibitory transmission to the ventral pallidum. Nature communications 9:1086.
Gingrich B, Liu Y, Cascio C, Wang Z, Insel TR (2000) Dopamine D2 receptors in the nucleus accumbens are important for social attachment in female prairie voles (Microtus ochrogaster). Behavioral neuroscience 114:173-183.
Gosnell BA, Majchrzak MJ (1989) Centrally administered opioid peptides stimulate saccharin intake in nondeprived rats. Pharmacology, biochemistry, and behavior 33:805-810.
Gosnell BA, Levine AS, Morley JE (1986) The stimulation of food intake by selective agonists of mu, kappa and delta opioid receptors. Life sciences 38:1081-1088.
Greenwood PJ (1983) Mating systems and the evolutionary consequences of dispersal. The ecology of animal movement:116-131.
Guillaumin MCC, Viskaitis P, Bracey E, Burdakov D, Peleg-Raibstein D (2023) Disentangling the role of NAc D1 and D2 cells in hedonic eating. Molecular psychiatry 28:3531-3547.
Hikida T, Kimura K, Wada N, Funabiki K, Nakanishi S (2010) Distinct roles of synaptic transmission in direct and indirect striatal pathways to reward and aversive behavior. Neuron 66:896-907.
Hoglen NEG, Manoli DS (2022) Cupid's quiver: Integrating sensory cues in rodent mating systems. Frontiers in neural circuits 16:944895.
Ims RA (1990) Determinants of natal dispersal and space use in grey-sided voles, Clethrionomys rufocanus : a combined field and laboratory experiment. Oikos 57:106-113.
Jendryka M, Palchaudhuri M, Ursu D, van der Veen B, Liss B, Kätzel D, Nissen W, Pekcec A (2019) Pharmacokinetic and pharmacodynamic actions of clozapine-N-oxide, clozapine, and compound 21 in DREADD-based chemogenetics in mice. Scientific reports 9:4522.
Kwak S, Jung MW (2019) Distinct roles of striatal direct and indirect pathways in value-based decision making. eLife 8.
Liu Z, Le Q, Lv Y, Chen X, Cui J, Zhou Y, Cheng D, Ma C, Su X, Xiao L, Yang R, Zhang J, Ma L, Liu X (2022) A distinct D1-MSN subpopulation down-regulates dopamine to promote negative emotional state. Cell Res 32:139-156.
Lobo MK, Nestler EJ (2011) The striatal balancing act in drug addiction: distinct roles of direct and indirect pathway medium spiny neurons. Front Neuroanat 5:41.
Lobo MK, Covington HE, 3rd, Chaudhury D, Friedman AK, Sun H, Damez-Werno D, Dietz DM, Zaman S, Koo JW, Kennedy PJ, Mouzon E, Mogri M, Neve RL, Deisseroth K, Han MH, Nestler EJ (2010) Cell type-specific loss of BDNF signaling mimics optogenetic control of cocaine reward. Science (New York, NY) 330:385-390.
Nawreen N, Cotella EM, Morano R, Mahbod P, Dalal KS, Fitzgerald M, Martelle S, Packard BA, Franco-Villanueva A, Moloney RD, Herman JP (2020) Chemogenetic Inhibition of Infralimbic Prefrontal Cortex GABAergic Parvalbumin Interneurons Attenuates the Impact of Chronic Stress in Male Mice. eNeuro 7.
Pardo-Garcia TR, Garcia-Keller C, Penaloza T, Richie CT, Pickel J, Hope BT, Harvey BK, Kalivas PW, Heinsbroek JA (2019) Ventral Pallidum Is the Primary Target for Accumbens D1 Projections Driving Cocaine Seeking. The Journal of neuroscience : the official journal of the Society for Neuroscience 39:2041-2051.
Paretkar T, Dimitrov E (2019) Activation of enkephalinergic (Enk) interneurons in the central amygdala (CeA) buffers the behavioral effects of persistent pain. Neurobiology of disease 124:364-372.
Peciña S, Berridge KC (2000) Opioid site in nucleus accumbens shell mediates eating and hedonic 'liking' for food: map based on microinjection Fos plumes. Brain research 863:71-86.
Peciña S, Berridge KC (2005) Hedonic hot spot in nucleus accumbens shell: where do mu-opioids cause increased hedonic impact of sweetness? The Journal of neuroscience : the official journal of the Society for Neuroscience 25:11777-11786.
Peciña S, Berridge KC (2013) Dopamine or opioid stimulation of nucleus accumbens similarly amplify cue-triggered 'wanting' for reward: entire core and medial shell mapped as substrates for PIT enhancement. The European journal of neuroscience 37:1529-1540.
Qu Y, Zhang L, Hou W, Liu L, Liu J, Li L, Guo X, Li Y, Huang C, He Z, Tai F (2024) Distinct medial amygdala oxytocin receptor neurons projections respectively control consolation or aggression in male mandarin voles. Nature communications 15:8139.
Reynolds SM, Berridge KC (2001) Fear and feeding in the nucleus accumbens shell: rostrocaudal segregation of GABA-elicited defensive behavior versus eating behavior. The Journal of neuroscience : the official journal of the Society for Neuroscience 21:3261-3270.
Solomon NG, Jacquot JJ (2002) Characteristics of resident and wandering prairie voles, Microtus ochrogaster. Canadian Journal of Zoology 80:951-955.
Tai LH, Lee AM, Benavidez N, Bonci A, Wilbrecht L (2012) Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value. Nature neuroscience 15:1281-1289.
Yamaguchi T, Wei D, Song SC, Lim B, Tritsch NX, Lin D (2020) Posterior amygdala regulates sexual and aggressive behaviors in male mice. Nature neuroscience 23:1111-1124.
Ying L, Zhao J, Ye Y, Liu Y, Xiao B, Xue T, Zhu H, Wu Y, He J, Qin S, Jiang Y, Guo F, Zhang L, Liu N, Zhang L (2022) Regulation of Cdc42 signaling by the dopamine D2 receptor in a mouse model of Parkinson's disease. Aging cell 21:e13588.
Yizhar O, Fenno LE, Davidson TJ, Mogri M, Deisseroth K (2011) Optogenetics in neural systems. Neuron 71:9-34.
Zhan S, Qi Z, Cai F, Gao Z, Xie J, Hu J (2024) Oxytocin neurons mediate stress-induced social memory impairment. Current biology : CB 34:36-45.e34.
Zhang M, Kelley AE (2000) Enhanced intake of high-fat food following striatal mu-opioid stimulation: microinjection mapping and fos expression. Neuroscience 99:267-277.
Zhang MM et al. (2022) Glutamatergic synapses from the insular cortex to the basolateral amygdala encode observational pain. Neuron 110:1993-2008.e1996.
Zhao J, Ying L, Liu Y, Liu N, Tu G, Zhu M, Wu Y, Xiao B, Ye L, Li J, Guo F, Zhang L, Wang H, Zhang L (2019) Different roles of Rac1 in the acquisition and extinction of methamphetamine-associated contextual memory in the nucleus accumbens. Theranostics 9:7051-7071.
Znamensky V, Echo JA, Lamonte N, Christian G, Ragnauth A, Bodnar RJ (2001) gamma-Aminobutyric acid receptor subtype antagonists differentially alter opioid-induced feeding in the shell region of the nucleus accumbens in rats. Brain research 906:84-91.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Public Reviews:
Reviewer #1 (Public Review):
The authors demonstrated that NINJ1 promotes TF-positive MV release during pyroptosis and thereby triggers coagulation. Coagulation is one of the risk factors that can cause secondary complications in various inflammatory diseases, making it a highly important therapeutic target in clinical treatment. This paper effectively explains the connection between pyroptosis and MV release with Ninj1, which is a significant strength. It provides valuable insight into the potential of targeting Ninj1 as a therapeutic strategy.
Although the advances in this paper are valuable, several aspects need to be clarified. Some comments are discussed below.
(1) Since it is not Ninj1 directly regulating coagulation but rather the MV released by Ninj1 playing a role, the title should include that. The current title makes it seem like Ninj1 directly regulates inflammation and coagulation. It would be better to revise the title.
Thanks for the thoughtful comments. We show that the release of procoagulant MVs by plasma membrane rupture (PMR) is a critical step in the activation of coagulation. In addition, the release of cytokines and danger molecules by PMR may also contribute to coagulation. In choosing the title, we are trying to emphasize NINJ1-dependent PMR as a common trigger for these biological processes.
(2) Ninj1 is known to be an induced protein that is barely expressed in normal conditions. As you showed in "Fig1G" data, control samples showed no detection of Ninj1. However, in "Figure S1", all tissues (liver, lung, kidney and spleen) expressed Ninj1 protein. If the authors stimulated the mice with fla injection, it should be mentioned in the figure legend.
We respectfully disagree with the comment that “Ninj1 is known to be an induced protein that is barely expressed in normal conditions”. NINJ1 protein is abundantly expressed (without induction) in tissues including liver, lung, kidney, and spleen, as shown in Fig S1. Consistently, other groups have shown abundant NINJ1 expression at baseline in tissues and cells such as liver (Kayagaki et al. Nature 2023) and BMDMs (Kayagaki et al. Nature 2021; Borges et al. eLife 2023). Fig 1G shows fibrin deposition as an indicator of coagulation, not NINJ1 protein.
(3) In "Fig3A", the Ninj1 protein expression was increased in the control of BMDM +/- cell lysate rather than fla stimulation. However, in MV, Ninj1 was not detected at all in +/- control but was only observed with Fla injection. The authors need to provide an explanation for this observation. Additionally, looking at the MV β-actin lane, the band thicknesses appear to be very different between groups. It seems necessary to equalize the protein amounts. If that is difficult, at least between the +/+ and +/- controls.
Thanks for the valuable comments. In Fla-stimulated Ninj1+/- BMDMs, most of the NINJ1 is released in MVs and is therefore not found in the cell lysate, as shown in Fig 3A. The difference in beta-actin band intensity correlates with the MV numbers shown in Fig 3B. We ensured consistency by using the same number of cells.
(4) Since the authors focused Ninj1-dependent microvesicle (MV) release, they need to show MV characterizations (EM, NTA, Western for MV markers, etc...).
Thanks for the suggestion. We now add NTA analysis of MV for BMDMs in Fig S4C.
(5) To clarify whether Ninj1-dependent MV induces coagulation, the authors need to determine whether platelet aggregation is reduced with isolated +/- MVs compared to +/+ MVs.
Thanks for the suggestion. We agree that platelet aggregation is closely linked to blood coagulation but would argue that one does not directly cause the other. While it would be interesting to examine whether MVs induce platelet aggregation, we hope the reviewer would agree that the outcome of this experiment would neither significantly support nor challenge our statement that NINJ1-dependent PMR promotes coagulation.
(6) Even with the authors well established experiments with haploid mice, it is a critical limitation of this paper. To improve the quality of this paper, the authors should consider confirming the findings using mouse macrophage cell lines, such as generating Ninj1-/- Raw264.7 cell lines, to examine the homozygous effect.
Thanks for the valuable comments. We acknowledge the limitation of using heterozygous mice in this study. However, our data provide strong evidence, obtained in primary macrophages, supporting the role of NINJ1-dependent plasma membrane rupture in blood coagulation.
(7) There was a paper reported in 2023 (Zhou, X. et al., NINJ1 Regulates Platelet Activation and PANoptosis in Septic Disseminated Intravascular Coagulation. Int. J. Mol. Sci. 2023) that revealed the relationship between Ninj1 and coagulation. According to this paper, inhibition of Ninj1 in platelets prevents pyroptosis, leading to reduced platelet activation and, consequently, the suppression of thrombosis. How about the activation of platelets in Ninj1 +/- mice? The author should add this paper in the reference section and discuss the platelet functions in their mice.
Thanks for the valuable comments. We examine PT time, plasma TAT, and tissue fibrin deposition as direct evidence of blood coagulation in this manuscript. We acknowledge that platelets play a key role in thrombosis; however, we hope the reviewer would agree that tissue factor-induced blood coagulation and platelet aggregation are linked yet distinct processes. Therefore, the role of NINJ1 in platelet aggregation falls beyond the scope of this manuscript.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Referring to previous research findings, the authors explain the connection between NINJ1 and MVs. Additional experiments and clarifications will strengthen the conclusions of this study.
Below are some comments I feel could strengthen the manuscript:
(1) The authors mentioned their choice of using heterozygous NINJ1+/- mice on page 4, because of lethality and hydrocephalus. Nonetheless, there is a substantial number of references that use homozygous NINJ1-/- mice. Could there be any other specific reasons for using heterozygous mice in this study?
Thanks for the thoughtful comments. We are aware that a few homozygous NINJ1-/- mouse strains were used in several publications by different groups, including Drs. Kayagaki and Dixit (Genentech), from whom we obtained the heterozygous NINJ1+/- breeders. We do not have experience with the homozygous NINJ1-/- mice used by other groups. It’s reasonable to assume that homozygous NINJ1-/- mice, if healthy, would have even stronger protection against coagulopathy than heterozygous NINJ1+/- mice. The only reason for not using homozygous mice in this study is that a majority of our homozygous NINJ1-/- mice develop hydrocephalus around weaning, and these mice must be euthanized under the rules of our DLAR facility. Although our homozygous NINJ1-/- mice develop hydrocephalus (as also reported by Drs. Kayagaki and Dixit, PMID: 37196676, PMCID: PMC10307625), heterozygous NINJ1+/- mice remain healthy.
(2) Figure S2 clearly shows the method of pyroptosis induction by flagellin. It is also necessary as a prerequisite for this paper to show the changes in flagellin-induced pyroptosis in heterozygous NINJ1+/- mice.
Thanks for the valuable suggestions. We agree that a plasma LDH measurement as an indicator of pyroptosis in vivo would add to the manuscript. Therefore, we have made several attempts to measure plasma LDH in flagellin-challenged WT and NINJ1+/- mice using the CytoTox96 Non-Radioactive Cytotoxicity Assay (a Promega kit commonly used for LDH, Promega#G1780). Flagellin-challenged WT and NINJ1+/- mice develop hemolysis, which renders the plasma red. Because plasma coloring interferes with the assay, we could not get a meaningful reading to make an accurate comparison. We also tried the LDH-Glo Cytotoxicity Assay (luciferase based, Promega#J2380), with no luck on either plasma or serum. We hope the reviewer would agree that the reduced plasma MV count (Fig 3C) would serve as an alternative indicator of reduced pyroptosis.
(3) IL-1ß levels controlled by GSDMD were not affected by NINJ1 expression according to previous studies (Ref 37, 29, Nature volume 618, pages 1065-1071 (2023)). GSDMD also plays an important role in TF release in pyroptosis. Are GSDMD levels not altered in heterozygous NINJ1 +/- mice?
Thanks for raising these great points. It’s been reported that IL-1β secretion into cell culture supernatant was not affected by NINJ1 deficiency or inhibition when BMDMs were stimulated by LPS (Ref 29, 37, now Ref 29, 35) or nigericin (Ref 29). As the GSDMD pore has been shown to facilitate the release of mature IL-1β, these in vitro observations are reasonable given that NINJ1-mediated PMR is a later event than GSDMD pore formation. However, we observed that plasma IL-1β (and also TNFα and IL-6) levels in Ninj1+/- mice were significantly lower. There are a few differences in the experimental conditions that might contribute to the discrepancy: 1, there was no priming in our in vivo experiment, while priming of BMDMs was performed in both in vitro studies before stimulation with LPS or nigericin; 2, the flagellin in our study engages a different inflammasome than either LPS or nigericin. Priming might change the expression and dynamics of IL-1β. More importantly, there might be unrecognized mechanisms of IL-1β secretion in vivo. We have now added discussion of this to the main text.
We examined GSDMD protein levels in liver, lung, kidney, and spleen from WT and NINJ1+/- mice by Western blotting. The data are now presented in the updated Fig S1. We did not observe an apparent difference in GSDMD expression between the two genotypes.
(4) In Fig 1 F, the authors used a fibrin-specific monoclonal antibody for staining fibrin, but it's not clearly defined. There may be some problem with the quality of antibody or technical issues. Considering this, exploring alternative methods to visualize fibrin might be beneficial. Fibrin is an acidophil material, so attempting H&E staining or Movat's pentachrome staining might help identify fibrin areas.
Thanks for the valuable suggestions. The fibrin-specific monoclonal antibody in our study is the mouse anti-fibrin monoclonal antibody 59D8. This antibody has been shown to bind to fibrin even in the presence of human fibrinogen at the concentration found in plasma [Hui et al. (1983). Science. 222 (4628); 1129-1132]. We apologize that we did not cite the reference in our initial submission. We obtained this antibody from Dr. Hartmut Weiler at the Medical College of Wisconsin and Dr. Rodney M. Camire at the University of Pennsylvania, who were acknowledged in our initial submission.
We performed H&E staining on serial sections of the same tissues for Figure 1F. The data is now presented as Fig S3.
Reviewer #2 (Public Review):
Summary:
The authors' main goal is to understand the mechanism by which pyroptosis (through the formation of Gasdermin D (GSDMD) pores in the plasma membrane) contributes to increased release of procoagulant Tissue Factor-containing microvesicles (MV). Their previous data demonstrate that GSDMD is critical for the release of MV that contains Tissue Factor (TF), thus making a link between pyroptosis and hypercoagulation. Given the recent identification of NINJ1 being responsible for plasma membrane rupture (Kayagaki et al. Nature 2021), the authors wanted to determine if NINJ1 is responsible for TF-containing MV release. Given the constitutive ninj1 KO mouse leads to partial embryonic lethality, the authors decided to use a heterozygous ninj1 KO mouse (ninj1+/-). While the data are well controlled, there is limited understanding of the mechanism of action. Also, given that the GSDMD pores have an ~18 nm inner diameter enough to release IL-1β, while larger molecules like LDH (140 kDa) and other DAMPs require plasma membrane rupture (likely mediated by NINJ1), it is not unexpected that large MVs require NINJ1-mediated plasma membrane rupture.
Strengths:
The authors convincingly demonstrate that ninj1 haploinsufficiency leads to decreased prothrombin time, plasma TAT and plasma cytokines 90 minutes post-treatment in mice, which leads to partial protection from lethality.
Weaknesses:
- In the abstract, the authors say "...cytokines and protected against blood coagulation and lethality triggered by bacterial flagellin". This conclusion is not substantiated by the data, as you still see 70% mortality at 24 hours in the ninj1+/- mice.
Thanks for the thoughtful comments. We corrected the text to “partially protected against blood coagulation and lethality triggered by bacterial flagellin”.
- The previous publication by the authors (Wu et al. Immunity 2019) clearly shows that GSDMD-dependent pyroptosis is required for inflammasome-induced coagulation and mouse lethality. However, as it is not possible for the authors to use the homozygous ninj1 KO mouse due to partial embryonic lethality, it becomes challenging to compare these two studies and the contributions of GSDMD vs. NINJ1. Comparing the contributions of GSDMD and NINJ1 in human blood-derived monocytes/macrophages where you can delete both genes and assess their relevant contributions to TF-containing MV release within the same background would be crucial in comparing how much contribution NINJ1 has versus what has been published for GSDMD. This would help support the in vivo findings and further corroborate the proposed conclusions made in this manuscript.
Thanks for the valuable question. We have shown that plasma MV TF activity was reduced in both GSDMD deficient mice (Ref 23) and Ninj1+/- mice (present manuscript). Given that TF is a plasma membrane protein, MV TF most likely comes from ruptured plasma membrane. In flagellin-induced pyroptosis, GSDMD and NINJ1 deficiency equally blocked LDH release (plasma membrane rupture) in BMDMs (Ref 29). Further, in pyroptosis glycine acts downstream of GSDMD pore formation for its effect against NINJ1 activation (Ref 35). Therefore, GSDMD pore formation should be upstream of NINJ1 activation in pyroptosis (which may not be the case in other forms of cell death), and GSDMD and NINJ1 likely have equal effects on MV release in flagellin-induced pyroptosis. As the reviewer suggested, experiments using human blood-derived monocytes/macrophages would enable a direct comparison to determine the relative contributions. However, this approach presents a few technical difficulties: in our experience, it’s not easy to manipulate gene expression in primary human monocytes/macrophages, and variable efficiency in gene manipulation of GSDMD and NINJ1 would complicate the comparison. We hope the reviewer would agree that a direct comparison between GSDMD and NINJ1 is not required to support our conclusion that NINJ1-dependent membrane rupture is involved in inflammasome-pyroptosis induced coagulation and inflammation.
- What are the levels of plasma TAT, PT, and inflammatory cytokines if you collect plasma after 90 minutes? Given the majority (~70%) of the ninj+/- mice are dead by 24 hours, it is imperative to determine whether the 90-minute timeframe data (in Fig 1A-G) is also representative of later time points. The question is whether ninj1+/- just delays the increases in prothrombin time, plasma TAT, and plasma cytokines.
Thanks for the valuable question. The time point (90 min) was chosen based on our in vitro observation that flagellin-induced pyroptosis in BMDMs largely occurs within 60-90 min.
Our focus is on the primary effect of flagellin in vivo; potential secondary effects at later time points may complicate the results and are hard to interpret. As the reviewer suggested, we have measured plasma PT and TAT at 6 hours post-flagellin challenge. The significant difference in PT between Ninj1+/+ and Ninj1+/- mice was sustained (Fig A), suggesting that coagulation proteins remained more depleted in Ninj1+/+ mice than in Ninj1+/- mice. However, plasma TAT levels had diminished to baseline (refer to Fig 1B in main text) in both groups and showed no significant difference between groups (Fig B), which could be explained by the short half-life of TAT (less than 30 min) in the blood. Since flagellin challenge is a one-time hit, there might not be a second episode of coagulation after the 90-minute time point, at least not one triggered by flagellin, as supported by the plasma TAT levels at 6 hours. We now comment on this limitation at the end of the main text.
Based on our previous studies, plasma IL-1β and TNFα peaked at an early time point and diminished over time, but plasma IL-6 levels were maintained. As shown below, plasma IL-6 appeared higher in Ninj1+/+ compared with Ninj1+/- mice, but the difference was not statistically significant (partly because one missing sample, n = 4 not 5, in the Ninj1+/+ group decreased the statistical power to detect a difference).
Author response image 1.
Mice were injected with Fla (500 ng LFn-Fla plus 3 μg PA). Blood was collected 6 hours after Fla injection. Prothrombin time (A), plasma TAT (B), and plasma IL-6 (C) were measured. Mann-Whitney tests were performed.
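The remark about reduced statistical power with n = 4 versus 5 can be made concrete. The sketch below is a minimal exact two-sided Mann-Whitney test that assumes no tied values; the function names and the IL-6 values are hypothetical illustrations, not data from this study.

```python
from itertools import combinations
from math import comb


def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U test (assumes no tied values)."""
    pooled = list(x) + list(y)
    if len(set(pooled)) != len(pooled):
        raise ValueError("tied values present; this sketch assumes none")
    n1, n2 = len(x), len(y)
    # U for group x: number of (x_i, y_j) pairs with x_i > y_j.
    u_obs = sum(1 for a in x for b in y if a > b)
    mean_u = n1 * n2 / 2
    # Enumerate every way of assigning n1 of the pooled ranks to group x
    # and count assignments at least as far from the mean as observed.
    extreme = 0
    for ranks in combinations(range(1, n1 + n2 + 1), n1):
        u = sum(ranks) - n1 * (n1 + 1) // 2
        if abs(u - mean_u) >= abs(u_obs - mean_u):
            extreme += 1
    return u_obs, extreme / comb(n1 + n2, n1)


def smallest_possible_p(n1, n2):
    """Two-sided p-value when the two groups do not overlap at all."""
    return 2 / comb(n1 + n2, n1)


# Hypothetical plasma IL-6 readings (arbitrary units), n = 4 versus n = 5.
wild_type = [410, 520, 300, 610]
heterozygous = [280, 350, 190, 240, 330]
u_stat, p_value = mann_whitney_exact(wild_type, heterozygous)
print(u_stat, round(p_value, 4), round(smallest_possible_p(4, 5), 4))
```

With groups of 4 and 5 the exact test cannot return a two-sided p-value below 2/126 (about 0.016), so a moderate overlap between groups is enough to push the result above 0.05; with 5 and 5 the floor drops to 2/252.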
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
- Fig 1F: are there lower magnification images that capture the fibrin deposition? The IHC data seems at odds with the WB data in Fig. 1G where there is still significant fibrin detected in the heterozygous lungs and liver. Quantitating the Fig. 1G Western blot would also be helpful.
IHC surveys a thin tissue section while WB surveys a piece of tissue; therefore, fibrin deposition may be missed by IHC but detected by WB. That is why we used two methods. Below we provide lower magnification images of fibrin deposition (about 2 x 1.6 mm area).
Author response image 2.
- Fig1H - lethality study uses 5x dose of Fla used in earlier studies. In the lethality data where there is a delay in ninj1+/- mortality, are the parameters (prothrombin time, plasma TAT, and plasma cytokines) measured at 90 minutes different between WT and ninj+/- mice? This would be critical to confirm that this is not merely due to a delayed release of TF-containing MVs.
We used a 5x lower dose of Fla in the coagulation study than in the lethality study because it’s not as easy to draw blood from septic mice given the higher dose of flagellin. We need to euthanize the mice to collect blood for plasma measurements, and therefore the parameters were not measured for mice in the lethality study.
- What is the effect of ninj+/- on E. coli-induced lethality in mice? How do these data compare to E. coli infection of GSDMD-/- mice?
We did not examine the effect of Ninj1+/- on E. coli-induced lethality. After the initial submission of our manuscript, we have focused on Ninj1 flox/flox mice instead of Ninj1+/- for NINJ1 deficiency. We are using induced global Ninj1 deficient mice for polymicrobial infection-induced lethality in our new studies.
- Fig 2 - in the E. coli model, the prothrombin time, plasma TAT, and plasma cytokines are measured 6 hours post-infection. How were these time points chosen? Did the authors measure prothrombin time, plasma TAT, and plasma cytokines at different time points?
The in vivo time points for flagellin and E. coli were chosen based on our in vitro observations of the timelines of BMDM pyroptosis induced by flagellin and bacteria. This disparity probably arises from distinct dynamics between purified protein and bacterial infections. Purified proteins can swiftly translocate into cells and take effect immediately after injection. Conversely, during bacterial infection, macrophages engulf and digest the bacteria to expose their antigens. Subsequently, these antigens initiate further effects, a process that takes some time to unfold.
Our focus is on the primary effect of flagellin in vivo; potential secondary effects at later time points may complicate the results and are hard to interpret. As the reviewer suggested, we have measured plasma PT and TAT at 6 hours post-flagellin challenge. The significant difference in PT between Ninj1+/+ and Ninj1+/- mice was sustained (Fig A), suggesting that coagulation proteins remained more depleted in Ninj1+/+ mice than in Ninj1+/- mice. However, plasma TAT levels had diminished to baseline (refer to Fig 1B in main text) in both groups and showed no significant difference between groups (Fig B), which could be explained by the short half-life of TAT (less than 30 min) in the blood. Since flagellin challenge is a one-time hit, there might not be a second episode of coagulation after the 90-minute time point, at least not one triggered by flagellin, as supported by the plasma TAT levels at 6 hours. We now comment on this limitation at the end of the main text.
Based on our previous studies, plasma IL-1β and TNFα peaked at an early time point and diminished over time, but plasma IL-6 levels were maintained. As shown below, plasma IL-6 appeared higher in Ninj1+/+ compared with Ninj1+/- mice, but the difference was not statistically significant (partly because one missing sample, n = 4 not 5, in the Ninj1+/+ group decreased the statistical power to detect a difference).
- Fig 3 - the sequence of figure panels listed in the legend needs to be corrected. Fig 3A requires quantitation of NINJ1 levels compared to beta-actin. Fig 3C - needs a control for equal MV loading.
Thanks for the recommendations. The figure sequence has been corrected. There remain no common markers or loading controls for MVs, so we used equal plasma volume as the loading control.
Additional comments:
(1) In Fig 3A, the size of NINJ1 appears to be increased in the NINJ+/- group.
This discrepancy is likely attributed to a technical issue when running the protein gel and protein transfer, which makes the image tilt to one side.
(2) Describe the method of BMDM isolation.
Thanks for the recommendations. We now include the method of BMDM isolation. In brief, the mouse femur and tibia from one leg are harvested and rinsed in ice-cold PBS, followed by a brief rinse in 70% ethanol for 10-15 seconds. Both ends of the bones are then cut open, and the bone marrow is flushed out using a 10 ml syringe with a 26-gauge needle. The marrow is passed through a 19-gauge needle once to disperse the cells. After filtering through a 70-μm cell strainer, the cells are collected by centrifugation at 250 g for 5 minutes at 4 °C, then suspended in two 150 mm petri dishes, each containing 25 ml of L-cell conditioned medium (RPMI-1640 supplemented with 10% FBS, 2 mM L-glutamine, 10 mM HEPES, 15% LCM, and penicillin/streptomycin). After 3 days, 15 ml of LCM medium is added to each dish. The cells typically reach full confluency by days 5-7.
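The per-dish volumes implied by the medium recipe above can be worked out as follows. This is a minimal sketch: the percentages and final concentrations come from the text, while the stock concentrations (200 mM L-glutamine, 1 M HEPES, 100x penicillin/streptomycin used at 1x) are assumptions, since the response does not state them.

```python
def medium_volumes_ml(total_ml,
                      fbs_percent=10,            # from the text
                      lcm_percent=15,            # from the text
                      glutamine_final_mm=2,      # from the text
                      hepes_final_mm=10,         # from the text
                      glutamine_stock_mm=200,    # assumed stock concentration
                      hepes_stock_mm=1000,       # assumed stock concentration
                      pen_strep_stock_fold=100): # assumed 100x stock, used at 1x
    """Return the volume (ml) of each component for one batch of medium."""
    volumes = {
        "FBS": total_ml * fbs_percent / 100,
        "LCM": total_ml * lcm_percent / 100,
        # C1 * V1 = C2 * V2, solved for the stock volume V1.
        "L-glutamine stock": total_ml * glutamine_final_mm / glutamine_stock_mm,
        "HEPES stock": total_ml * hepes_final_mm / hepes_stock_mm,
        "pen/strep stock": total_ml / pen_strep_stock_fold,
    }
    # RPMI-1640 makes up whatever volume remains.
    volumes["RPMI-1640"] = total_ml - sum(volumes.values())
    return volumes


# One 150 mm dish holds 25 ml according to the protocol.
for component, volume in medium_volumes_ml(25).items():
    print(f"{component}: {volume:.2f} ml")
```

Changing the assumed stock concentrations only shifts the small-volume additives; the FBS and LCM volumes follow directly from the stated percentages.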
(3) According to this method, BMDMs are seeded without any M-CSF or L929-cell conditioned medium. How many macrophages survive under this condition?
BMDMs are cultured and differentiated in medium supplemented with 15% L929-cell conditioned medium. For the experiment, the cells were seeded in Opti-MEM medium (Thermo Fisher Scientific, Cat# 51985034) without M-CSF or L929-cell conditioned medium. BMDMs can survive under this condition, as evidenced by low LDH and high ATP measurement (Fig S5).
Reviewer #2 (Recommendations For The Authors):
- There is significant information missing in the methods and this makes it unclear how to interpret how some of the experiments were performed. For example, there is no detailed description or references in the methods on how the in vivo experiments were performed. The methods section needs significantly more details so that any reader is able to follow the protocols in this manuscript. References to previous work should also be included as needed.
Thanks for the recommendations. We had some of the details in the figure legend. We now add details in the methods for better interpretation of our data.
- Line numbers in the manuscript would be helpful when resubmitting the manuscript so that the reviewer can easily point to the main text when making comments.
Thanks for the recommendations. We now add line numbers in the manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
(1) This is a valuable manuscript that successfully integrates several data sets to determine genomic interactions with nuclear bodies.
In this paper we both challenge and/or revise multiple long-standing “textbook” models of nuclear genome organization while also revealing new features of nuclear genome organization. Therefore, we argue that the contributions of this paper extend well beyond “valuable”. Specifically, these contributions include:
a. We challenge a several-decades-long focus on the correlation of gene positioning relative to the nuclear lamina. Instead, through comparison of cell lines, we show a strong correlation of differences in gene activity with differences in relative distance to nuclear speckles, in contrast to a very weak correlation with differences in relative distance to the nuclear lamina. This inference of little correlation of gene expression with nuclear lamina association was supported by direct experimental manipulation of genome positioning relative to the nuclear lamina. Despite pronounced changes in relative distances to the nuclear lamina, there was little change relative to nuclear speckles and little change in gene expression.
b. We similarly challenge the long-standing proposed functional correlation between the radial positioning of genes and gene expression. Here, and in a now published companion paper (doi.org/10.1038/s42003-024-06838-7), we demonstrate how nuclear speckle positioning relative to nucleoli and the nuclear lamina varies among cell types, as does the inverse relationship between genome positioning relative to nuclear speckles and the nuclear lamina. Again, this is consistent with the primary correlation of gene activity being the positioning of genes relative to nuclear speckles and also explains previous observations showing a strong relationship between radial position and gene expression only in some cell types.
c. We identified a new partially repressed, middle to late DNA replicating type of chromosome domain, “p-w-v fiLADs”, by their weak interaction with the nuclear lamina, which, based on our LMNA/LBR KO experimental results, compete with LADs for nuclear lamina association. Moreover, we show that when fLADs convert to iLADs, most conversions are to this p-w-v fiLAD state, although ~one third are to a normal, active, early replicating iLAD state. Thus, fLADs can convert between repressed, partially repressed, and active states, challenging the prevailing assumption of the division of the genome into two states – active, early replicating A compartment/iLAD regions versus inactive, late replicating, B compartment/LAD regions.
d. We identified nuclear speckle associated domains as DNA replication initiation zones, with the domains showing strongest nuclear speckle attachment initiating DNA replication earliest in S-phase.
e. We describe for the first time an overall polarization of nuclear genome organization in adherent cells with the most active, earliest replicating genomic regions located towards the equatorial plane and less expressed genomic regions towards the nuclear top or bottom surfaces. This includes polarization of some LAD regions to the nuclear lamina at the equatorial plane and other LAD regions to the top or bottom nuclear surfaces.
We have now rewritten the text to make the significance of these new findings clearer.
(2) Strength of evidence: The evidence supporting the central claims is varied in its strength ranging from solid to incomplete. Orthogonal evidence validating the novel methodologies with alternative approaches would better support the central claims.
We argue that our work exploited methods, data, and analyses equal to or more rigorous than the current state-of-the-art. This indeed includes orthogonal evidence using alternative methods which both supported our novel methodologies and demonstrated their robustness relative to more conventional approaches. This explains how we were able to challenge/revise long-standing models and discover new features of nuclear genome organization. More specifically:
a. Unlike most previous analyses, we have integrated both genomic and imaging approaches to examine the nuclear genome organization relative to not one, but several different nuclear locales and we have done this across several cell types. To our knowledge, this is the first such integrated approach and has been key to our success in appreciating new features of nuclear genome organization.
b. The 16-fraction DNA replication Repli-seq data we developed and applied to this project represents the highest temporal resolution mapping of DNA replication timing to date.
c. The TSA-seq approach that we used remains the most accurate sequence-based method for estimating microscopic distance of chromosome regions to different nuclear locales. As implemented, this method is unusually robust and direct as it exploits the exponential micron-scale gradient established by the diffusion of the free radicals generated by peroxidase labeling to measure relative distances of chromosome regions to labeled nuclear locales. We had previously demonstrated that TSA-seq was able to estimate the average distances of genomic regions to nuclear speckles with an accuracy of ~50 nm, as validated by light microscopy. The TSA-seq 2.0 protocol we developed and applied to this project maintained the original resolution of TSA-seq to estimate to an accuracy of ~50 nm the average distances of genomic regions from nuclear speckles, as validated by light microscopy, while achieving more than a 10-fold reduction in the required number of cells.
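The idea of reading distance off an exponential labeling gradient can be sketched in a few lines. This is an illustration only: it assumes the simplest single-exponential model with no background term, and the function name, the peak enrichment, and the decay length are hypothetical values rather than the fitted parameters from the published TSA-seq work.

```python
import math


def distance_from_enrichment(enrichment, peak_enrichment, decay_length_um):
    """Invert s(d) = peak * exp(-d / decay_length) to recover d in microns.

    Assumes a single-exponential gradient with no baseline offset.
    """
    if enrichment <= 0 or peak_enrichment <= 0 or decay_length_um <= 0:
        raise ValueError("all inputs must be positive")
    return decay_length_um * math.log(peak_enrichment / enrichment)


# Hypothetical parameters: signal at the labeled locale and decay length.
PEAK = 8.0          # fold enrichment at distance zero (assumed)
DECAY_UM = 0.4      # distance over which signal falls by a factor of e (assumed)

for signal in (8.0, 4.0, 1.0):
    d = distance_from_enrichment(signal, PEAK, DECAY_UM)
    print(f"enrichment {signal:.1f} -> estimated distance {d:.3f} um")
```

Under this model, equal ratios of enrichment correspond to equal steps in distance, which is why the log-scaled signal behaves like a ruler over the range where the gradient is measurable.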
We have rewritten the text to address the reviewer concerns that led them to their initial characterization of the TSA-seq as novel and not yet validated.
First, we have added a discussion of how the use of nuclear speckle TSA-seq as a “cytological ruler” was based on an extensive initial characterization of TSA-seq as described in previous published literature. In that previous literature we showed how the conventional molecular proximity method, ChIP-seq, instead showed local accumulation of the same marker proteins over short DNA regions unrelated to speckle distances. Second, we reference our companion paper, now published, and describe how the extension of TSA-seq to measure relative distances to nucleoli was further validated and shown to be robust by comparison to NAD-seq and extensive multiplexed immuno-FISH data. We further discuss how in the same companion paper we show how nucleolar DamID instead was inconsistent with both the NAD-seq and multiplexed immuno-FISH data as well as the nucleolar TSA-seq.
Third, we have added scatterplots showing exactly how highly the estimated microscopic distances to all three nuclear locales, measured in IMR90 fibroblasts, correlate with the TSA-seq measurements in HFF fibroblasts. This addresses the concern that we were not using the exact same fibroblast cell line for the TSA-seq versus microscopic measurements. The strong correlation already observed would only be expected to become even stronger with use of the exact same fibroblast cell lines for both measurements.
Fourth, we have addressed the reviewer concern that the nuclear lamin TSA-seq was not properly validated because it did not match nuclear lamin DamID. We have now added to the text a more complete explanation of how microscopic proximity assays such as TSA-seq measure something different from molecular proximity assays such as DamID or NAD-seq. We have added further explanation of how TSA-seq complements molecular proximity assays such as DamID and NAD-seq, allowing us to extract further information than either measurement alone. We also briefly discuss why TSA-seq succeeds for certain nuclear locales using multiple independent markers whereas molecular proximity assays may fail against the same nuclear locales using the same markers. This includes brief discussion from our own experience attempting unsuccessfully to use DamID against nucleoli and nuclear speckles.
Reviewer #1 (Public Review):
(1) The weakness of this study lies in the fact that many of the genomic datasets originated from novel methods that were not validated with orthogonal approaches, such as DNA-FISH. Therefore, the detailed correlations described in this work are based on methodologies whose efficacy is not clearly established. Specifically, the authors utilized two modified protocols of TSA-seq for the detection of NADs (MKI67IP TSA-seq) and LADs (LMNB1-TSA-seq).
We disagree with the statement that the TSA-seq approach and data has not been validated by orthogonal approaches. We have now addressed this point in the revised manuscript text:
a) We added text to describe how previously FISH was used to validate speckle TSA-seq by demonstrating a residual of ~50 nm between the TSA-seq predicted distance to speckles and the distance measured by light microscopy using FISH:
"In contrast, TSA-seq measures relative distances to targets on a microscopic scale corresponding to 100s of nm to ~ 1 micron based on the measured diffusion radius of tyramide-biotin free-radicals (Chen et al., 2018). Exploiting the measured exponential decay of the tyramide-biotin free-radical concentration, we showed how the mean distance of chromosomes to nuclear speckles could be estimated from the TSA-seq data to an accuracy of ~50 nm, as validated by FISH (Chen et al., 2018)."
b) We note that we also previously have validated lamina (Chen et al, JCB 2018) and nucleolar (Kumar et al, 2024) TSA-seq and further validated speckle TSA-seq (Zhang et al, Genome Research 2021) by traditional immuno-FISH and/or immunostaining. The overall high correlation between lamina TSA-seq and the orthogonal lamina DamID method was also extensively discussed in the first TSA-seq paper (Chen et al, JCB 2018). Included in this discussion was description of how the differences between lamina TSA-seq and DamID were expected, given that DamID produces a signal more proportional to contact frequency, and independent of distance from the nuclear lamina, whereas TSA-seq produces a signal that is a function of microscopic distance from the lamina, as validated by traditional FISH.
c) We added text to describe how the nucleolar TSA-seq previously was validated by two orthogonal methods: NAD-seq and multiplexed DNA immuno-FISH:
"We successfully developed nucleolar TSA-seq, which we extensively validated using comparisons with two different orthogonal genome-wide approaches (Kumar et al., 2024)- NAD-seq, based on the biochemical isolation of nucleoli, and previously published direct microscopic measurements using highly multiplexed immuno-FISH (Su et al., 2020)."
d) We have now added panels A&B to Fig. 7 and a new Supplementary Fig. 7 demonstrating further validation of TSA-seq based on the high correlation between TSA-seq scores and the microscopically measured distances of many hundreds of genomic sites across the genome from different nuclear locales. As discussed in response #2 below, we compared distances measured in IMR90 fibroblasts with TSA-seq scores measured in HFF fibroblasts. We would therefore argue that these correlations are a lower estimate and that the correlation between microscopic distances and TSA-seq scores would likely have been still higher if we had performed both assays in the exact same cell line.
(2) Although these methods have been described in a bioRxiv manuscript by Kumar et al., they have not yet been published. Moreover, and surprisingly, Kumar et al., work is not cited in the current manuscript, despite its use of all TSA-seq data for NADs and LADs across the four cell lines.
The Kumar et al, Communications Biology, 2024 paper is now published and is cited properly in our revision. We apologize for this oversight and for any confusion our initial omission of this citation may have created. We had been writing this manuscript and the Kumar et al manuscript in parallel and had intended to co-submit. We planned to cross-reference the two at the time we co-submitted, adding the Kumar et al reference to the first version of this manuscript once we obtained a doi from bioRxiv. However, we then submitted the Kumar et al manuscript several months earlier and forgot that we had not added the reference to the first version of this manuscript.
(3) Moreover, Kumar et al. did not provide any DNA-FISH validation for their methods.
As we described in our response to Reviewer 1's comment #1, we had previously provided traditional FISH validation of lamina TSA-seq in our first TSA-seq paper as well as validation by comparison with lamina DamID (Chen et al, 2018).
We also described how the nucleolar TSA-seq was extensively cross-validated in the Kumar et al, 2024 paper by both NAD-seq and the highly multiplexed immuno-FISH data from Su et al, 2020.
We note additionally that in the Kumar et al, 2024 paper the nucleolar TSA-seq was further validated by correlating the variations in centromeric association with nucleoli across the four cell lines predicted by nucleolar TSA-seq with the variations observed by traditional immunofluorescence microscopy.
(4) Therefore, the interesting correlations described in this work are not based on robust technologies.
This comment was made in reference to the Kumar et al paper not having been published, and, as noted in responses to points #2 and #3, the paper is now published.
We also want to note specifically that, in our experience, TSA-seq has proven remarkably robust in comparison to molecular proximity assays. We've described in our responses to the previous points how TSA-seq has been cross-validated by both microscopy and by comparison with lamina DamID and nucleolar NAD-seq. We note also that in every application of TSA-seq to date, all antibodies that produced good immunostaining showed good TSA-seq results. Moreover, we obtained nearly identical results in every case in which we performed TSA-seq with different antibodies against the same target. Thus anti-SON and anti-SC35 staining produced very similar speckle TSA-seq data (Chen et al, 2018), anti-lamin A and anti-lamin B staining produced very similar lamina TSA-seq data (Chen et al, 2018), anti-nucleolin and anti-POLR1E staining produced very similar DFC/FC nucleolar TSA-seq data (Kumar et al, 2024), and anti-MKI67IP and anti-DDX18 staining produced very similar GC nucleolar TSA-seq data (Kumar et al, 2024).
This independence of results with TSA-seq to the particular antibody chosen to label a target differs from experience with methods such as ChIP, DamID, and Cut and Run/Tag in which results can differ or be skewed based on variable distance and therefore reactivity of target proteins from the DNA or due to other factors such as non-specific binding during pulldown (ChIP) or differential extraction by salt washes (Cut and Tag).
Our experience in every case to date is that antibodies that produce similar immunofluorescence staining produce similar TSA-seq results. We attribute this robustness to the fact that TSA-seq is based only on the original immunostaining specificity provided by the primary and secondary antibodies plus the diffusion properties of the tyramide free radical.
We've now added the following text to our revised manuscript:
"As previously demonstrated for both SON and lamin TSA-seq (Chen et al., 2018), nucleolar TSA-seq was also robust in the sense that multiple target proteins showing similar nucleolar staining showed similar TSA-seq results (Kumar et al., 2024); this robustness is intrinsic to TSA-seq being a microscopic rather than molecular proximity assay, and therefore not sensitive to the exact molecular binding partners and molecular distance of the target proteins to the DNA."
(5) An attempt to validate the data was made for SON-TSA-seq of human foreskin fibroblasts (HFF) using multiplexed FISH data from IMR90 fibroblasts (from the lung) by the Zhuang lab (Su et al., 2020). However, the comparability of these datasets is questionable. It might have been more reasonable for the authors to conduct their analyses in IMR90 cells, thereby allowing them to utilize MERFISH data for validating the TSA-seq method and also for mapping NADs and LADs.
We disagree with the reviewer's overall assessment that the use of the IMR90 data to further validate the TSA-seq is questionable because the TSA-seq data from HFF fibroblasts is not necessarily comparable with multiplexed immuno-FISH microscopic distances measured in IMR90 fibroblasts.
In response we have now added panels to Fig. 7 and Supplementary Fig. 7, showing:
a) There is very little difference in correlation between speckle TSA-seq and measured distances from speckles in IMR90 cells whether we use IMR90 or HFF cell SON TSA-seq data (R² = 0.81 versus 0.76) (new Fig. 7A).
b) There is also a high correlation between lamina (R² = 0.62) and nucleolar (R² = 0.73) HFF TSA-seq and measured distances in IMR90 cells. Thus, we conclude that this high correlation shows that the multiplexed data from ~1000 genomic locations does validate the TSA-seq. These correlations should be considered lower bounds on what we would have measured using IMR90 TSA-seq data. Thus, the true correlation between distances of loci from nuclear locales and TSA-seq would be expected to be either comparable or even stronger than what we are seeing with the IMR90 versus HFF fibroblast comparisons.
c) This correlation is cell-type specific (Fig. 7B, new SFig. 7). Thus, even for speckle TSA-seq, highly conserved between cell types, the highest correlation of IMR90 distances with speckle TSA-seq is with IMR90 and HFF fibroblast data. For lamina and nucleolar TSA-seq, which show much lower conservation between cell types, the correlation of IMR90 distances is high for HFF data but much lower for data from the other cell types. This further justifies the use of IMR90 fibroblast distance measurements as a proxy for HFF fibroblast measurements.
Thus, we have added the following text to the revised manuscript:
"We reasoned that the nuclear genome organization in the two human fibroblast cell lines would be sufficiently similar to justify using IMR90 multiplexed FISH data [43] as a proxy for our analysis of HFF TSA-seq data. Indeed, the high inverse correlation (R= -0.86) of distances to speckles measured by MERFISH in IMR90 cells with HFF SON TSA-seq scores is nearly identical to the inverse correlation (R= -0.89) measured instead using IMR90 SON TSA-seq scores (Fig. 7A). Similarly, distances to the nuclear lamina and nucleoli show high inverse correlations with lamina and nucleolar TSA-seq, respectively (Fig. 7A). These correlations were cell type specific, particularly for the lamina and nucleolar distance correlations, as these correlations were reduced if we used TSA-seq data from other cell types (SFig. 7A). Therefore, the high correlation between IMR90 microscopic distances and HFF TSA-seq scores can be considered a lower bound on the likely true correlation, justifying the use of IMR90 as a proxy for HFF for testing our predictions."
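As a minimal illustration of the kind of comparison described above, the sketch below computes a Pearson correlation coefficient (R) and its square between per-locus distances to a nuclear locale and TSA-seq scores. The numerical values are synthetic placeholders, not data from this study, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Synthetic placeholder values (NOT data from the study):
# microscopic distance of each locus to a nuclear locale, in microns,
# and the TSA-seq score of the same locus.
distances_um = [0.2, 0.4, 0.6, 0.9, 1.3, 1.8]
tsa_scores = [2.1, 1.6, 1.2, 0.5, -0.3, -1.0]

r = pearson_r(distances_um, tsa_scores)
# An inverse correlation (negative R) is expected when higher scores
# correspond to shorter distances.
print(f"R = {r:.2f}, R^2 = {r * r:.2f}")
```

Because higher TSA-seq scores correspond to shorter distances, R is negative in this toy example while R² is positive, which is why both conventions can describe the same relationship.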
Reviewer #2 (Public Review):
Weaknesses:
(1) The experiments are largely descriptive, and it is difficult to draw many cause-and-effect relationships...The study would benefit from a clear and specific hypothesis.
This study was hypothesis-generating rather than hypothesis-testing in its goal. Our research was funded through the NIH 4D-Nucleome Consortium, which had as its initial goal the development, benchmarking, and validation of new genomic technologies. Our Center focused on the mapping of the genome relative to different nuclear locales and the correlation of this intranuclear positioning of the genome with functions, specifically gene expression and DNA replication timing. By its very nature, this project took a discovery-driven versus hypothesis-driven scientific approach. Our question fundamentally was whether we could gain new insights into nuclear genome organization through the integration of genomic and microscopic measurements of chromosome positioning relative to multiple different nuclear compartments/bodies and their correlation with functional assays such as RNA-seq and Repli-seq.
Indeed, this study resulted in multiple new insights into nuclear genome organization as summarized in our last main figure. We believe our work and conclusions will be of general interest to scientists working in the fields of 3D genome organization and nuclear cell biology. We anticipate that each of these new insights will prompt future hypothesis-driven science focused on specific questions and the testing of cause-and-effect relationships.
However, we do want to point out that our comparison of wild-type K562 cells with the LMNA/LBR double knockout was designed to test the long-standing model that nuclear lamina association of genomic loci contributes to gene silencing. This experiment was motivated by our surprising result that gene expression differences between cell lines correlated strongly with differences in positioning relative to nuclear speckles rather than the nuclear lamina. Despite documenting in these double knockout cells a decreased nuclear lamina association of most LADs, and an increased nuclear lamina association of the “p-w-v” fiLADs identified in this manuscript, we saw no significant change in gene expression in any of these regions as compared to wild-type K562 cells. Meanwhile, distances to nuclear speckles as measured by TSA-seq remained nearly constant.
We would argue that this represents a specific example in which new insights generated by our genomics comparison of cell lines led to a clear and specific hypothesis and the experimental testing of this hypothesis.
(2) Similarly, the paper would be very much strengthened if the authors provided additional summary statements and interpretation of their results (especially for those not as familiar with 3D genome organization).
We appreciate this feedback and agree with the reviewer that this would be useful, especially for those not familiar with previous work in the field of 3D genome organization. In an earlier draft, we had included additional summary and interpretation statements in both the Introduction and Results sections. At the start of each Results section, we had also previously included brief discussion of what was known before and the context for the subsequent analysis contained in that section. However, we had thought we might be submitting to a journal with specific word limits and had significantly cut out that text.
We have now restored this text and, in certain cases, added additional explanations and context.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Figures 1C and D. Please add the units at the values of each y-axis.
We have done that.
The representation of Figure 2C lacks clarity and is difficult to understand. The x-axis labeling regarding the gene fraction number needs clarification.
We've modified the text to the Fig. 2C legend: "Fraction of genes showing significant difference in relative positioning to nuclear speckles (gene fraction, x-axis) versus log2 (HFF FKPM / H1 FKPM) (y-axis);"
"We next used live-cell imaging to corroborate that chromosome regions close to nuclear speckles, primarily Type I peaks, would show the earliest DNA replication timing." This sentence requires modification as Supplementary Figure 3F does not demonstrate that Type I peaks exhibit the earliest DNA replication timing; it only indicates that the first PCNA foci in S-phase are in proximity to nuclear speckles.
We've modified the text to: "We next used live-cell imaging to show that chromosome regions close to nuclear speckles show the earliest DNA replication timing; this is consistent with the earliest firing DNA replication IZs, as determined by Repli-seq, aligning with Type 1 peaks that are closely associated with nuclear speckles."
In Figure 5, the authors employed LaminB1-DamID to quantify LADs in LBR-KO and LMNA/LBR-DKO K562 cells. These are interesting results. However, for these experiments, it is crucial to assess LMNB1 signal at the nuclear periphery via immunofluorescence (IF) to confirm the absence of changes, ensuring that the DamID signal solely reflects contacts with the nuclear lamina. Furthermore, in this instance, their findings should be validated through DNA-FISH.
Immunostaining of LMNB1 was performed and showed a normal staining pattern as a ring adjacent to the nuclear periphery. Images of this staining were included in the metadata tied to the sequencing data sets deposited on the 4D Nucleome Data portal. We thank the reviewer for bringing up this point, and have added a sentence mentioning this result in the Results Section:
"Immunostaining against LMNB1 revealed the normal ring of staining around the nuclear periphery seen in wt cells (images deposited as metadata in the deposited sequencing data sets)."
Because both TSA-seq and DamID have been extensively validated by FISH, as detailed in our previous responses to the public reviewer comments, we feel it is unnecessary to validate these findings by FISH.
p-w-v-fiLADs should be labelled in Figure 5B.
We've added labeling as suggested.
"The consistent trend of slightly later DNA replication timing for regions (primarily p-w-v fiLADs) moving closer to the lamina" is not visible in the representation of the data of Figure 5G.
We did not make a change as we believed this trend was apparent in the Figure.
To reduce the descriptive nature of the data, it would be pertinent to conduct H3K9me3 and H3K27me3 ChIP-seq analyses in both the parental and DKO mutant cells. This would elucidate whether p-w-v-fiLADs and NADs anchoring to the nuclear lamina undergo changes in their histone modification profile.
We believe further analysis of the reasons underlying these shifts in positioning, including such ChIP-seq or equivalent analysis, is of interest but beyond the scope of this publication. We see such measurements as the beginning of a new story but insufficient alone to determine mechanism. Therefore we believe such experiments should be part of that future study.
The description of Figure 7 lacks clarity. Additionally, it appears that TSA-seq for NADs and LADs may not be universally applicable across all cell types, particularly in flat cells, whereas DamID scores demonstrate less variation across cell lines, as also stated by the authors.
TSA-seq is a complement to rather than a replacement for either DamID or NAD-seq. TSA-seq reports on microscopic distances whereas both DamID and NAD-seq instead are more proportional to contact frequency with the nuclear lamina or nucleoli, respectively, and insensitive to distances of loci away from the lamina or nucleoli. Thus, TSA-seq provides additional information based on the intrinsic differences in what TSA-seq measures relative to molecular proximity methods such as DamID or NAD-seq. The entire point is that the convolution of the exponential point-spread-function of the TSA-seq with the shape of the nuclear periphery allows us to distinguish genomic regions in the equatorial plane versus the top and bottom of the nuclei. The TSA-seq is therefore highly "applicable" when properly interpreted in discerning new features of genome organization. As we stated in the revised manuscript, the lamina DamID and TSA-seq are complementary and provide more information together than either method alone. The same is true for the NAD-seq and nucleolar TSA-seq comparison, as described in more detail in the Kumar, et al, 2024 paper.
Introduction:
The list of methodologies for mapping genomic contacts with nucleoli (NADs) should also include recent technologies, such as Nucleolar-DamID (Bersaglieri et al., PMID: 35304483), which has been validated through DNA-FISH.
We did not include nucleolar DamID in the mention in the Introduction of methods for identifying differential lamina versus nucleolar interactions of heterochromatin, either from our own collaborative group or from the cited reference, because we did not have confidence in the accuracy of this method in identifying NADs. In the case of the published nucleolar DamID from our collaborative group, published in Wang et al, 2021, we later discovered that despite apparent agreement of the nucleolar DamID with a small number of published FISH localizations, the overall correlation of the nucleolar DamID with nucleolar localization was poor. As described in detail in the Kumar et al, 2024 publication, this poor correlation of the nucleolar DamID was established using three orthogonal methods: nucleolar TSA-seq, NAD-seq, and multiplexed immuno-FISH measurements from ~1000 genomic locations. Instead, we found that this nucleolar DamID showed high correlation with lamina DamID. We note that many strong NADs are also LADs, which we think is why validation with only several FISH probes is inadequate to demonstrate overall validation of the approach.
We could not compare our nucleolar-DamID data in human cells with the alternative nucleolar-DamID results cited by the reviewer, which were performed in mouse cells. We note that in this paper the nucleolar DamID FISH validation only included several putative NAD chromosome regions and, we believe, one LAD region. However, our initial comparison of the nucleolar DamID cited by the reviewer with unpublished TSA-seq data from mouse ESCs produced by the Belmont laboratory and with NAD-seq data from the Kaufman laboratory shows a similar lack of correlation of the nucleolar DamID signal with nucleolar TSA-seq and NAD-seq, as well as multiplexed immuno-FISH data from the Long Cai laboratory, as we saw in our analysis of our own nucleolar DamID data in human cells.
We have added explanation concerning the lack of correlation of our nucleolar DamID with orthogonal measurements of nucleolar proximity in the added text (below) to our revised manuscript:
"Nucleolar DamID instead showed broad positive peaks over large chromatin domains, largely overlapping with LADs mapped by LMNB1 DamID (Wang et al., 2021). However, this nucleolar DamID signal, while strongly correlated with lamin DamID, showed poor correlation with either NAD-seq or nucleolar distances mapped by multiplexed immuno-FISH (Kumar et al., 2024). We suspect the problem is that with molecular proximity assays the output signals are disproportionally dominated by the small fraction of target proteins juxtaposed in sufficient proximity to the DNA to produce a signal rather than the amount of protein concentrated in the target nuclear body."
Our mention of nucleolar TSA-seq was in the context of why we focused on nucleolar TSA-seq and excluded our own nucleolar DamID. We chose not to discuss the second nucleolar DamID method cited above 1) because it was not appropriate to our discussion of our own experimental approach and 2) also because we cannot yet make a definitive statement of its accuracy for nucleolar mapping.
Reviewer #2 (Recommendations For The Authors):
(1) The authors start the manuscript by describing the 'radial genome organization' model and contrast it with the 'binary model' of genome organization. It would be helpful for the authors to contextualize their results a bit more with regard to these two different models in the discussion.
We have added several sentences in the first paragraph of the Discussion to accomplish this contextualization. The new paragraph reads:
"Here we integrated imaging with both spatial (DamID, TSA-seq) and functional (Repli-seq, RNA-seq) genomic readouts across four human cell lines. Overall, our results significantly extend previous nuclear genome organization models, while also demonstrating a cell-type dependent complexity of nuclear genome organization. Briefly, in contrast to the previous radial model of genome organization, we reveal a primary correlation of gene expression with relative distances to nuclear speckles rather than radial position. Additionally, beyond a correlation of nuclear genome organization with radial position, in cells with flat nuclei we show a pronounced correlation of nuclear genome organization with distance from the equatorial plane. In contrast to previous binary models of genome organization, we describe how both iLAD / A compartment and LAD / B compartment contain within them smaller chromosome regions with distinct biochemical and/or functional properties that segregate differentially with respect to relative distances to nuclear locales and geometry."
(2) Data should be provided demonstrating KO of LBR and LMNA - immunoblotting for both proteins would be one approach. In addition, it would be helpful to provide additional nuclear morphology measurements of the DKO cells (volume, surface area, volume of speckles/nucleoli, number of speckles/nucleoli).
We've added additional description describing the generation and validation of the KO lines:
"To create LMNA and LBR knockout (KO) lines and the LMNA/LBR double knockout (DKO) line, we started with a parental "wt" K562 cell line, clone #17, expressing an inducible form of Cas9 (Brinkman et al., 2018). The single KO and DKO were generated by CRISPR-mediated frameshift mutation according to the procedure described previously (Schep et al., 2021). The "wt" K562 clone #17 was used for comparison with the KO clones.
The LBR KO clone, K562 LBR-KO #19, was generated, using the LBR2 oligonucleotide GCCGATGGTGAAGTGGTAAG to produce the gRNA, and validated previously, using TIDE (Brinkman et al., 2014) to check for frameshifts in all alleles as described elsewhere (Schep et al., 2021). The LMNA/LBR DKO, K562 LBR-LMNA DKO #14, was made similarly, starting with the LBR KO line and using the combination of two oligonucleotides to produce gRNAs:
LMNA-KO1: ACTGAGAGCAGTGCTCAGTG, LMNA-KO2: TCTCAGTGAGAAGCGCACGC.
Additionally, the LMNA KO line, K562 LMNA-KO #14, was made the same way but starting with the "wt" K562 cell line. Validation was as described above; additionally, for the new LMNA KO and LMNA/LBR DKO lines, immunostaining showed the absence of anti-LMNA antibody signal under confocal imaging conditions used to visualize the wt LMNA staining while the RNA-seq from these clones revealed an ~20-fold reduction in LMNA RNA reads relative to the wt K562 clone."
As suggested, we also added morphological data for the DKO line in a modified SFig.5.
(3) The rationale for using LMNB1 TSA-seq and LMNB1 DAMID is not immediately clear. The LMNB1 TSA-seq is more variable across cell types and replicates than the DAMID. Could the authors please compare the datasets a bit more to understand the differences? For example, the authors demonstrate that "40-70% of the genome shows statistically significant differences in Lamina TSA-seq over regions 100 kb or larger, with most of these regions showing little or no differences in speckle TSA-seq scores." If the LMNB1 DAMID data is used for this analysis or Figure 2D, is the same conclusion reached? Also, in Figure 6, the authors conclude that C1 and C3 LAD regions are enriched for constitutive LADs, while C2 and C4 LAD regions are fLADs. This is a bit surprising because the authors and others have previously shown that constitutive LADs have higher LMNB1 contact frequency than facultative LADs (Kind, et al Cell 2015, Figure 3C).
Indeed, in the first TSA-seq paper (Chen et al, 2018) we did observe that cLADs had the highest LMNB TSA-seq scores; this was for K562 cells with round nuclei in which there is therefore no difference in lamina TSA-seq scores produced by nuclear shape over the entire nucleus.
However, there are differences between TSA-seq and DamID in terms of what they measure and we refer the reviewer to the first TSA-seq paper (Chen et al, 2018) that explains in greater depth these differences. This first paper explains how DamID is indeed related to contact frequency but how the TSA-seq instead estimates mean distances from the target, in this case the nuclear lamina. This is because the diffusion of tyramide free radicals from the site of their constant HRP production produces an exponential decay gradient of tyramide free radical concentration at steady state.
We have summarized these differences in text we have added to introduce both DamID and TSA-seq in the second Results section:
"DamID is a well-established molecular proximity assay; DamID applied to the nuclear lamina divides the genome into lamina-associated domains (LADs) versus non-associated “inter-LADs” or “iLADs” (Guelen et al., 2008; van Steensel and Belmont, 2017). In contrast, TSA-seq measures relative distances to targets on a microscopic scale corresponding to 100s of nm to ~ 1 micron based on the measured diffusion radius of tyramide-biotin free-radicals (Chen et al., 2018)... While LMNB1 DamID segments LADs most accurately, lamin TSA-seq provides distance information not provided by DamID, for example, variations in relative distances to the nuclear lamina of different iLADs and iLAD regions. These differences between the LMNB1 DamID and LMNB TSA-seq signals are also crucial to a computational approach, SPIN, that segments the genome into multiple states based on their varying nuclear localization, including biochemically and functionally distinct lamina-associated versus near-lamina states (Consortium et al., 2024; Wang et al., 2021).
Thus, lamin DamID and TSA-seq complement each other, providing more information together than either one separately."
We note that these differences in lamina DamID and TSA-seq are crucial to being able to gain additional information by comparing variations in the lamina TSA-seq for LADs in Figs. 6&7. See our response to point (4) below, for further explanation.
(4) In 7B/C, the authors show that the highest LMNB1 regions in HFF are equator of IMR90s. However, in Figure 7G, their cLAD score indicates that constitutive LADs are not at the equator. This is a bit surprising given the point above and raises the possibility that SON signals (as opposed to LMNB1 signals) might be more responsible for correlation to localization relative to the equator. Hence, it might be helpful if the authors repeat the analyses in Figures 7B/C in regions with differing LMNB1 signals but similar SON signals (and vice versa).
Again, this is based on the apparent assumption by the reviewer that DamID and TSA-seq work the same way and measure the same thing. But as explained above in the previous point, this is not true.
In our first TSA-seq paper (Chen et al, 2018) we showed how we could use the exponential decay point-spread-function produced by TSA, measured directly by light microscopy, to convert sequencing reads from the TSA-seq into a predicted mean distance from nuclear speckles, approximated as point sources. These mean distances predicted from the SON TSA-seq data agreed with measured FISH distances to nuclear speckles to within ~50 nm for a set of DNA probes from different chromosome regions. Moreover, varying TSA staining conditions changed the decay constants of this exponential decay, thus producing differences in the SON TSA-seq signals. By using these different exponential decay functions to convert the TSA-seq scores from these independent data sets to estimated distances from nuclear speckles, we again observed a distance residual of ~50 nm; in this case though this distance residual of ~50 nm represented the mean residual observed genome-wide. This gives us great confidence that the TSA-seq is working as we have modeled it.
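The conversion described above can be sketched as follows, assuming a simple single-exponential model in which the TSA-seq enrichment at distance d from a point source is E(d) = A·exp(-d / d0). The omission of a background term, the parameter names `amplitude` and `decay_um`, and all numerical values are illustrative assumptions and are not the fitted values from Chen et al, 2018.

```python
import math


def enrichment_from_distance(distance_um, amplitude, decay_um):
    """Forward model (assumed): single-exponential decay from a point source."""
    return amplitude * math.exp(-distance_um / decay_um)


def distance_from_enrichment(enrichment, amplitude, decay_um):
    """Invert the assumed model to estimate distance from an enrichment value."""
    if enrichment <= 0 or amplitude <= 0 or decay_um <= 0:
        raise ValueError("enrichment, amplitude and decay_um must be positive")
    return -decay_um * math.log(enrichment / amplitude)


# Two hypothetical staining conditions with different decay lengths
# (amplitude, decay length in microns); values are made up for illustration.
conditions = {
    "condition_1": (8.0, 0.3),
    "condition_2": (5.0, 0.6),
}

true_distance_um = 0.45  # hypothetical locus-to-speckle distance
signals = {}
estimates = {}
for name, (amplitude, decay_um) in conditions.items():
    # Each condition yields a different enrichment for the same locus...
    signals[name] = enrichment_from_distance(true_distance_um, amplitude, decay_um)
    # ...but inverting with that condition's own parameters recovers the distance.
    estimates[name] = distance_from_enrichment(signals[name], amplitude, decay_um)

for name in conditions:
    print(name, f"signal={signals[name]:.3f}", f"distance={estimates[name]:.3f} um")
```

In this toy model the two conditions give different signals for the same locus yet the same estimated distance, which is the logic behind comparing distance estimates across independent data sets; with real data the agreement would be limited by noise and by how well the model fits.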
As we mentioned in our response to point 3 above, we did see the highest LMNB TSA-seq signal for cLADs in K562 cells with round nuclei (Chen et al, 2018).
But as we now show in our simulation performed in this paper for Fig. 7, the observed tyramide free radical exponential decay gradient convolved with the flat nuclear lamina shape produces a higher equatorial LMNB1 TSA-seq signal for LADs at the equatorial plane. We confirmed that LADs with this higher TSA-seq signal were enriched at the equatorial plane by mining the multiplexed IMR90 imaging data. Similar mining of the multiplexed FISH IMR90 data showed localization of cLADs away from the equatorial plane.
We are not clear about the rationale for what the reviewer is suggesting about SON signals "being more responsible for correlation to localization to the equator". We have provided an explanation for the higher lamina TSA-seq scores for LADs near the equator based on the measured spreading of the tyramide free radicals convolved with the effect of the nuclear shape. This makes a prediction that the observed variation in lamina TSA-seq scores for LADs with similar DamID scores is related to their positioning relative to the equatorial plane as we then validated through our mining of the IMR90 multiplexed FISH data.
(5) FISH of individual LADs, v-fiLADs, and p-w-v-fiLADs relative to the lamina and speckle would be helpful to understand their relative positioning in control and LBR/LMNA double KO cells. This would significantly bolster the claim that "histone mark enrichments...more precisely revealed the differential spatial distribution of LAD regions...".
Adequately testing these predictions made from the lamina/SON TSA-seq scatterplots by direct FISH measurements would require measurements from large numbers of different chromosome regions through a highly multiplexed immuno-FISH approach. We are not equipped currently in any of our laboratories to do such measurements and we leave this therefore for future studies.
Rather our statement is based on our use of TSA-seq analyzed through these 2D scatterplots and should be valid to the degree that our TSA-seq measurements do indeed correlate with microscopy derived distances.
However, we do now include demonstration of a high correlation of speckle, lamina, and nucleolar TSA-seq with highly multiplexed immuno-FISH measurement of distances to these locales in a revised Fig. 7. The high correlation shown between the TSA-seq scores and measured distances does therefore add additional support to our claim that the reviewer is discussing, even without our own multiplexed FISH validation.
(6) "In contrast, genes within genomic regions which in pair-wise comparisons of cell lines show a statistically significant difference in lamina TSA-seq show no obvious trend in their expression differences (Figure 2C).". This appears to be an overstatement based on the left panel of 2D.
We do not follow the reviewer's point. In Fig. 2C we show little bias in the differences in gene expression between the two cell types for regions that showed differences in lamina TSA-seq. The reviewer is suggesting something otherwise based on their impression, not explicitly stated, of the left panel of Fig. 2D. But we see similar shades of blue extending vertically at low SON values and similar shades of red extending vertically at high SON values, suggesting a correlation of gene expression only with the SON TSA-seq score but not with the LMNB1 TSA-seq score displayed on the y-axis. This is also consistent with the very small and/or insignificant correlation coefficients measured in our linear model relating differences in LMNB1 TSA-seq to differences in expression but the large correlation coefficient observed for SON TSA-seq (Fig. 2E). Thus, we see Fig. 2C-E as self-consistent.
(7) In the section on "Polarity of Nuclear Genome Organization" - "....Using the IMR90 multiplexed FISH data set [43]...." - The references are not numbered.
We thank the reviewer for this correction.
(8) I believe there is an error in the Figure 7B legend. The descriptions of Cluster 1 and 2 do not match those indicated in the figure.
We again thank the reviewer for this correction.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The drug Ivermectin is used to effectively treat a variety of worm parasites in the world, however resistance to Ivermectin poses a rising challenge for this treatment strategy. In this study, the authors found that loss of the E3 ubiquitin ligase UBR-1 in the worm C. elegans results in resistance to Ivermectin. In particular, the authors found that ubr-1 mutants are resistant to the effects of Ivermectin on worm viability, body size, pharyngeal pumping, and locomotion. The authors previously showed that loss of UBR-1 disrupts homeostasis of the amino acid and neurotransmitter glutamate resulting in increased levels of glutamate in C. elegans. Here, the authors found that the sensitivity of ubr-1 mutants to Ivermectin can be restored if glutamate levels are reduced using a variety of different methods. Conversely, treating worms with exogenous glutamate to increase glutamate levels also results in resistance to Ivermectin supporting the idea that increased glutamate promotes resistance to Ivermectin. The authors found that the primary known targets of Ivermectin, glutamate-gated chloride channels (GluCls), are downregulated in ubr-1 mutants providing a plausible mechanism for why ubr-1 mutants are resistant to Ivermectin. Although it is clear that loss of GluCls can lead to resistance to Ivermectin, this study suggests that one potential mechanism to decrease GluCl expression is via disruption of glutamate homeostasis that leads to increased glutamate. This study suggests that if parasitic worms become resistant to Ivermectin due to increased glutamate, their sensitivity to Ivermectin could be restored by reducing glutamate levels using drugs such as Ceftriaxone in a combination drug treatment strategy.
Strengths:
(1) The use of multiple independent assays (i.e., viability, body size, pharyngeal pumping, locomotion, and serotonin-stimulated pharyngeal muscle activity) to monitor the effects of Ivermectin
(2) The use of multiple independent approaches (got-1, eat-4, ceftriaxone drug, exogenous glutamate treatment) to alter glutamate levels to support the conclusion that increased glutamate in ubr-1 mutants contributes to Ivermectin resistance.
Weaknesses:
(1) The primary target of Ivermectin is GluCls so it is not surprising that alteration of GluCl expression or function would lead to Ivermectin resistance.
(2) It remains to be seen what percent of Ivermectin-resistant parasites in the wild have disrupted glutamate homeostasis as opposed to mutations that more directly decrease GluCl expression or function.
Thank you for your thoughtful and constructive comments. We completely agree with your observation that alterations in GluCl expression or function can lead to Ivermectin resistance. However, we would like to emphasize that our study highlights an additional mechanism: disruptions in glutamate homeostasis can also lead to decreased GluCl expression, thereby contributing to Ivermectin resistance. This mechanism, which has not been fully explored previously, offers new insights into the complexity of drug resistance and could have important implications for understanding the development of Ivermectin resistance in parasitic nematodes.
As you pointed out, the role of disrupted glutamate homeostasis in wild parasitic populations and the proportion of resistant parasites with this mechanism remain unknown. We believe this uncertainty underlines the significance of our findings, as they suggest a novel avenue for studying Ivermectin resistance and for developing potential strategies to counteract it.
We have incorporated this discussion into the revised manuscript to further enrich the context of our findings.
Reviewer #2 (Public review):
Summary:
The authors provide a very thorough investigation of the role of UBR-1 in anthelmintic resistance using the non-parasitic nematode, C. elegans. Anthelmintic resistance to macrocyclic lactones is a major problem in veterinary medicine, and it is likely just a matter of time until resistance emerges in human parasites too. Therefore, this study providing novel insight into the mechanisms of ivermectin resistance is particularly important and significant.
Strengths:
The authors use very diverse technologies (behavior, genetics, pharmacology, genetically encoded reporters) to dissect the role of UBR-1 in ivermectin resistance. Deploying such a comprehensive suite of tools and approaches provides exceptional insight into the mechanism of how UBR-1 functions in terms of ivermectin resistance.
Weaknesses:
I do not see any major weaknesses in this study. My only concern is whether the observations made by the authors would translate to any of the important parasitic helminths in which resistance has naturally emerged in the field. This is always a concern when leveraging a non-parasitic nematode to shed light on a potential mechanism of resistance of parasitic nematodes, and I understand that it is likely beyond the scope of this paper to test some of their results in parasitic nematodes.
Thank you for your kind words and positive feedback on our work. We greatly appreciate your acknowledgment of the diverse technologies and comprehensive approaches we utilized to uncover the role of UBR-1 in ivermectin resistance.
Your concern about whether our findings in C. elegans translate to parasitic helminths in which ivermectin resistance has naturally emerged is both valid and critical. This is indeed a key question we aim to address in future studies. Collaborating with parasitologists to investigate whether naturally occurring mutations in ubr-1 exist in parasitic and non-parasitic nematodes is a priority for us. We hope that these efforts will lead to meaningful discoveries that have a significant impact on both livestock management and medicine.
Reviewer #3 (Public review):
Summary:
Li et al. propose to better understand the mechanisms of drug resistance in nematode parasites by studying mutants of the model roundworm C. elegans that are resistant to the deworming drug ivermectin. They provide compelling evidence that loss-of-function mutations in the E3 ubiquitin ligase encoded by the UBR-1 gene make worms resistant to the effects of ivermectin (and related compounds) on viability, body size, pharyngeal pumping rate, and locomotion and that these mutant phenotypes are rescued by a UBR-1 transgene. They propose that the mechanism of resistance is indirect, via the effects of UBR-1 on glutamate production. They show mutations (vesicular glutamate transporter eat-4, glutamate synthase got-1) and drugs (glutamate, glutamate uptake enhancer ceftriaxone) affecting glutamate metabolism/transport modulate sensitivity to ivermectin in wild-type and ubr-1 mutants. The data are generally consistent with greater glutamate tone equating to ivermectin resistance. Finally, they show that manipulations that are expected to increase glutamate tone appear to reduce expression of the targets of ivermectin, the glutamate-gated chloride channels, which is known to increase resistance.
There is a need for genetic markers of ivermectin resistance in livestock parasites that can be used to better track resistance and to tailor drug treatment. The discovery of UBR-1 as a resistance gene in C. elegans will provide a candidate marker that can be followed up in parasites. The data suggest Ceftriaxone would be a candidate compound to reverse resistance.
Strengths:
The strength of the study is the thoroughness of the analysis and the quality of the data. There can be little doubt that ubr-1 mutations do indeed confer ivermectin resistance. The use of both rescue constructs and RNAi to validate mutant phenotypes is notable. Further, the variety of manipulations they use to affect glutamate metabolism/transport makes a compelling argument for some kind of role for glutamate in resistance.
Weaknesses:
The proposed mechanism of ubr-1 resistance, i.e., that the UBR-1 E3 ligase regulates glutamate tone, which in turn regulates ivermectin receptor expression, is broadly consistent with the data but somewhat difficult to reconcile with the specific functions of the genes regulating glutamatergic tone. Ceftriaxone and eat-4 mutants reduce extracellular/synaptic glutamate concentrations by sequestering available glutamate in neurons, suggesting that it is extracellular glutamate that is important. But then why does rescuing ubr-1 specifically in the pharyngeal muscle have such a strong effect on ivermectin sensitivity? Is glutamate leaking out of the pharyngeal muscle into the extracellular space/synapse? Is it possible that UBR-1 acts directly on the avr-15 subunit, both of which are expressed in the muscle, perhaps as part of a glutamate sensing/homeostasis mechanism?
Thank you for your insightful feedback and thought-provoking questions. These are excellent points that have prompted us to critically reconsider our findings and the proposed mechanism.
Several potential explanations could be considered, although we currently lack direct evidence to support any of them: (1) The pharynx likely plays a dominant role in ivermectin resistance, as previously reported (Dent et al., 1997; Dent et al., 2000), and overexpression of UBR-1 in the pharyngeal muscle may therefore have a strong effect on ivermectin sensitivity. (2) It is also possible that pharyngeal muscle cells have the capacity to release glutamate into the extracellular space, which could contribute to the observed effect. (3) Alternatively, UBR-1 expression in the pharyngeal muscle may regulate other indirect pathways affecting extracellular or synaptic glutamate concentrations.
We also appreciate your suggestion that UBR-1 may act directly on AVR-15 in the pharynx. While this is an interesting possibility, UBR-1 is an E3 ubiquitin ligase, and if AVR-15 were a direct target, we would expect UBR-1 to ubiquitinate AVR-15 and promote its degradation. In this case, loss of UBR-1 should inhibit AVR-15 ubiquitination, reduce its degradation, and lead to increased AVR-15 protein levels in the pharynx. However, our experimental data show a reduction, rather than an increase, in AVR-15::GFP levels in ubr-1 mutants (Figure 4A). This observation suggests that AVR-15 is less likely to be a direct target of UBR-1. To definitively address this hypothesis, a direct assessment of AVR-15 ubiquitination levels in wild-type and ubr-1 mutant backgrounds would be needed. We agree that this is an important avenue for future investigation.
The use of single ivermectin dose assays can be misleading. A response change at a single dose shows that the dose-response curve has shifted, but the response is not linear with dose, so the degree of that shift may be difficult to discern and may result from a change in slope but not EC50. Similarly, in Figure 3C, the reader is meant to understand that eat-4 mutant is epistatic to ubr-1 because the double mutant has a wild-type response to ivermectin. But eat-4 alone is more sensitive, so (eyeballing it and interpolating) the shift in EC50 caused by the ubr-1 mutant in a wild type background appears to be the same as in an eat-4 background, so arguably you are seeing an additive effect, not epistasis. For the above reasons, it would be desirable to have results for rescuing constructs in a wild-type background, in addition to the mutant background.
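The point about single-dose assays can be illustrated numerically. The following is a minimal sketch that assumes a Hill-type dose-response model, which neither the review nor the manuscript specifies; the function name `hill_response`, the EC50 values, the slopes, and the test dose are all hypothetical and chosen only for illustration.

```python
def hill_response(dose, ec50, slope):
    """Fractional drug effect (0 to 1) at a given dose under a Hill model."""
    return dose**slope / (ec50**slope + dose**slope)


# All parameters below are hypothetical, chosen only for illustration.
TEST_DOSE = 4.0  # single test dose, arbitrary units

wild_type = hill_response(TEST_DOSE, ec50=2.0, slope=2.0)

# Same EC50 as wild type but a shallower slope:
# the response at the single test dose still drops.
slope_only = hill_response(TEST_DOSE, ec50=2.0, slope=0.5)

# Two different EC50 shifts (with different slopes) that give
# the same response at the single test dose.
mutant_a = hill_response(TEST_DOSE, ec50=8.0, slope=2.0)
mutant_b = hill_response(TEST_DOSE, ec50=16.0, slope=1.0)

for name, value in [
    ("wild type", wild_type),
    ("slope change only", slope_only),
    ("mutant A (EC50 x4)", mutant_a),
    ("mutant B (EC50 x8)", mutant_b),
]:
    print(f"{name}: {value:.3f}")
```

Under these assumed parameters, a reduced response at one dose is compatible with an unchanged EC50, and two curves whose EC50 values differ twofold are indistinguishable at that dose, which is why a full dose-response curve is needed to quantify the size of a shift.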
Thank you for your detailed feedback and observations.
The potential additive effect you noted in Figure 3C appears to be specific to the body length analysis. In our other three ivermectin resistance assays (viability, pumping rate, and locomotion velocity), this additive effect was not observed. A possible explanation for this is that eat-4 and got-1 single mutants inherently exhibit reduced body length compared to wild-type worms (Mörck and Pilon 2006; Greer et al. 2008; Chitturi et al. 2018), which may give the appearance of an additive effect in this particular assay.
Regarding the use of rescuing constructs, we performed these experiments in the ubr-1;got-1 and ubr-1;eat-4 double mutant backgrounds. This was designed to test whether the suppression of ubr-1-mediated ivermectin resistance by got-1 or eat-4 mutations is indeed due to the functional activity of GOT-1 and EAT-4, respectively. The choice of this setup was to ensure that the double mutant phenotype was fully addressed. In contrast, rescuing constructs of GOT-1 and/or EAT-4 in a wild-type background might not sufficiently reveal the relationship between GOT-1, EAT-4, and UBR-1. However, we are open to further testing your suggestion in the future.
To aid interpretation and clarify the apparent effects, we have revised the annotation of Figure 3 to clearly represent the data and the comparisons being made. We hope this adjustment makes the results more straightforward and easier for readers to understand.
The added value of the pumping data in Figure 5 (using calcium imaging) over the pump counts (from video) in Figure 1G, Figure 2E, F, K, & Figure 3D, H is not clearly explained. It may have to do with the use of "dissected" pharynxes, the nature/advantage of which is not sufficiently documented in the Methods/Results.
Thank you for pointing this out. The behavioral pumping data in Figure 1G, Figure 2E, F, K, & Figure 3D and the calcium imaging data in Figure 5 were obtained under different experimental conditions. Specifically, the behavioral assays (pumping rate) were conducted on standard culture plates with freely moving worms, whereas the calcium imaging experiments were performed in a liquid environment with immobilized worms. In the calcium imaging setup, the dissection refers to gently puncturing the epidermis behind the head of the worm with a glass electrode to relieve internal pressure, which aids in stabilizing the calcium imaging process and ensures better visualization of pharyngeal muscle activity.
We compared the pharyngeal muscle activity of worms that were not subjected to puncturing the epidermis and found no significant difference when activated by 20 mM serotonin. Therefore, we speculate that there is no direct interaction between the bath solution and the pharynx or head neurons. To avoid confusion, we have removed the term "dissected" from the manuscript and added additional experimental details in the Methods section.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) The authors propose that ubr-1 mutants are resistant to ivermectin due to persistent elevation of glutamate that leads to a compensatory reduction in GluCl levels and thus resistance to Ivermectin. This model would be strengthened by experiments more directly connecting glutamate, GluCls and Ivermectin sensitivity. For example, does overexpression of a relevant GluCl such as AVR-15 restore Ivermectin sensitivity to ubr-1 mutants? Does Ceftriaxone treatment affect the Ivermectin resistance of worms lacking the relevant GluCls (i.e., avr-15, avr-14 and glc-1)? - The model suggests that Ceftriaxone treatment would have no effect in the latter case.
Thank you for your valuable suggestion. Based on your recommendation, we have performed two additional experiments to strengthen our model. First, we conducted an overexpression experiment of AVR-15 and found that it significantly, though partially, restored ivermectin sensitivity in ubr-1 mutants (p < 0.01, Supplemental Figure S5D). Second, we tested the effect of Ceftriaxone treatment on the IVM resistance of avr-15; avr-14; glc-1 triple mutants, which lack the genes encoding the most critical glutamate receptors involved in IVM sensitivity. As expected, we found that Ceftriaxone treatment did not alter the IVM resistance in these triple mutants (Supplemental Figure S5E), supporting the idea that these specific GluCls are key to the observed resistance.
These two experiments provide further support for our proposed model. We have integrated the results into the manuscript, updating the Results section and Supplemental Figure S5D, E, as well as the corresponding Figure Legends.
(2) Line 211 - Ceftriaxone is known to upregulate EAAT2 expression in mammals. Do the authors know if the drug also increases EAAT expression in C. elegans?
Thank you for raising this point. To our knowledge, this is the first study to demonstrate the antagonistic effect of ceftriaxone on ivermectin resistance in C. elegans, particularly in the context of ubr-1-mediated resistance. Ceftriaxone enhances glutamate uptake by increasing the expression of excitatory amino acid transporter-2 (EAAT2) in mammals (Rothstein et al., 2005, Lee et al., 2008). C. elegans has six glutamate transporters encoded by glt-1 and glt-3–7 (Mano et al. 2007).
Compared to testing whether ceftriaxone increases the expression of these EAATs in C. elegans, identifying which specific glt gene is targeted by ceftriaxone may better reveal its mechanism of action. To investigate this, we performed a genetic analysis. In the ubr-1 mutant, we individually deleted the six glt genes and found that ceftriaxone’s ability to restore ivermectin sensitivity was specifically suppressed in the ubr-1; glt-1 and ubr-1; glt-5 double mutants (Author response image 1A). This suggests that glt-1 and glt-5 may be the targets of ceftriaxone in C. elegans. In contrast, ivermectin sensitivity was unaffected in the individual glt mutants (Author response image 1B), indicating that a single glt deletion may not be sufficient to alter glutamate levels or induce downregulation of GluRs. Further studies are needed to determine whether ceftriaxone directly increases GLT-1 and GLT-5 expression in C. elegans and to explore the underlying mechanisms.
Author response image 1.
Glutamate transporter removal inhibits ceftriaxone-mediated restoration of ivermectin sensitivity in ubr-1. (A) Compared to the ubr-1 mutants, the ubr-1; glt-1 and ubr-1; glt-5 double mutants show enhanced ivermectin resistance under ceftriaxone treatment. (B) The glt mutants do not show resistance to ivermectin. ****p < 0.0001; one-way ANOVA test.
(3) Line 64 - as part of the rationale for the study, the authors state that "...increasing reports of unknown causes of IVM resistance continue to emerge...suggesting that additional unknown mechanisms are awaiting investigation." While this may be true, the ultimate conclusion from this study is that decreasing expression of Ivermectin-targeted GluCls causes Ivermectin resistance, which is a known mechanism. The field already knows that Ivermectin targets GluCls and thus decreasing GluCl expression or function would lead to Ivermectin resistance. The authors may want to edit the sentence mentioned above for clarity.
Thanks for the suggestion. We have revised the sentence for clarity: “…, suggesting that previously unrecognized or additional mechanisms regulating GluCls expression may await further investigation.” This revision better reflects the focus on GluCl regulation and clarifies the potential for additional mechanisms to be explored.
(4) The introduction to the serotonin-stimulated pharyngeal Calcium imaging section is a little confusing. The role of the various GluCls in pharyngeal pumping should be defined/clarified in the introduction to the last section (lines 337-341).
Thanks. We have revised and clarified the introduction as follows: “GluCls downregulation was functionally validated by the diminished IVM-mediated inhibition of serotonin-activated pharyngeal Ca²⁺ activity observed in ubr-1 mutants.”
Additionally, the role of the various GluCls in pharyngeal pumping has been clarified:
“Using translational reporters, we found that IVM resistance in ubr-1 mutants is caused by the functional downregulation of IVM-targeted GluCls, including AVR-15, AVR-14, and GLC-1. These receptors are activated by glutamate to facilitate chloride ion influx into pharyngeal muscle cells, resulting in the inhibition of muscle contractions and the suppression of food intake in C. elegans.”
We hope these revisions address the concerns raised and improve the clarity of this section.
(5) The color code key on the right-hand side of the Raster Plots in Figure 1H should be made larger for clarity.
Revised.
(6) In Figure S3, a legend should be included to define the black and blue box plots.
Thank you for your comment. We have added the following clarification to the figure legend: “Black plots: wild-type, blue plots: ubr-1 mutants.” This should now make the distinction between the two groups clear.
(7) Figure S4, the brackets above the graphs are misleading. It is not clear which comparisons are being made.
Thank you for your feedback. We have clarified the figure by updating the legend to include the statement: “All statistical analyses were performed against the ubr-1 mutant.” This clarification is now also included in Figure 3F-I to ensure consistency and avoid any confusion regarding the comparisons being made.
Reviewer #2 (Recommendations for the authors):
(1) In Figure 1A: the "trials" table needs more clarification to orient the reader.
To improve clarity and better orient the reader, we have updated Figure 1A by explicitly adding the number of trials and including a statistical analysis of the viability of wild-type and ubr-1 mutants under different ML conditions. In the Figure 1A legend, we have added: “we used shades of red to represent worm viability on each experimental plate (n = 50 animals per plate), with darker shades indicating lower survival rates. The viability test was repeated at least 5 times (5 trials).” These modifications aim to provide a clearer understanding of the data presentation and its significance.
(2) In Figure S2: it would benefit the reader to include the major human parasitic nematodes in the phylogeny and include a discussion of the conservation.
Thank you for your insightful comment. In Figure S2A, we have included the human parasitic nematodes Onchocerca volvulus, Brugia malayi, and Toxocara canis. Unfortunately, other major human parasitic nematodes, such as Ascaris lumbricoides (roundworm), Ancylostoma duodenale (hookworm), and Trichuris trichiura (whipworm), currently lack reported homologs of the ubr-1 gene.
To provide some context, Onchocerca volvulus is a leading cause of infectious blindness globally, affecting millions of people, while Brugia malayi causes lymphatic filariasis, a significant tropical disease. Toxocara canis is a zoonotic parasite responsible for serious human syndromes such as visceral and ocular larval migration. Ivermectin remains a primary treatment for these parasitic infections.
Interestingly, we have identified relevant sequences in Onchocerca volvulus, Brugia malayi, and Toxocara canis, and potential mutations in ubr-1-like genes in these parasitic nematodes may lead to ivermectin resistance. Sequence comparison analysis could shed light on the risks of such mutations and their relevance to ivermectin treatment failure, warranting further attention. We have added a discussion of this potential risk in the manuscript.
Reviewer #3 (Recommendations for the authors):
Minor corrections/suggestions:
(1) The level of resistance in ubr-1 is similar to dyf genes. Should double-check ubr-1 mutant is not dyf.
Thank you for your insightful suggestion. We were also interested in this point and designed the following experiments. We first directly tested the Dyf phenotype of ubr-1 using standard DiO dye staining (Author response image 2A) and found that ubr-1 clearly shows a "dye filling defective" phenotype (Author response image 2B). This raises an interesting question: Could the IVM resistance observed in ubr-1 be due to its Dyf defect? To address this, we used Ceftriaxone to further test the Dyf phenotype of ubr-1. Ceftriaxone can fully rescue the sensitivity of ubr-1 to IVM (Figure 2). If the IVM resistance observed in ubr-1 were due to its Dyf defect, Ceftriaxone should also rescue the Dyf defect. After treating ubr-1 mutants with Ceftriaxone (50 μg/mL) until the L4 stage, we again performed DiO dye staining and found that while Ceftriaxone fully rescued IVM resistance in ubr-1, it did not rescue the Dyf defect (Author response image 2C). These results suggest that while ubr-1 has a Dyf defect, this defect is unlikely to be the primary cause of the IVM resistance in the ubr-1 mutant.
Author response image 2.
ubr-1 mutant is not dyf. (A) Depiction of the DiO dye-staining assays. Diagram is adapted from (Power et al. 2020). (B) ubr-1 mutant exhibits an obvious Dyf phenotype. (C) Cef treatment (50 μg/mL) does not alter the ubr-1 Dyf defect phenotype. Scale bar, 20 µm.
(2) 367 "in IVM" superscript.
(3) 429 ubr-1 italics.
Thanks, revised.
(4) Methods: Need more info on dissection: if there is direct interaction of bath with pharynx, as suggested by bath solution, then 5HT concentrations are too high. Direct exposure to 20mM 5HT will kill a pharynx. 20uM 5HT?
Thank you for your comment. We have reviewed our experimental records and confirmed that the concentration mentioned in the manuscript is correct. In our experiment, the dissection refers to gently puncturing the epidermis behind the head of the worm with a glass electrode to relieve internal pressure, which helps stabilize the calcium imaging process. In fact, there is no direct interaction between the bath solution and the pharynx or head neurons. We have revised the Methods section to clarify this point.
(5) Figure 2: Meaning of "Trials" arrow on grid y-axis is not immediately obvious to me. Would prefer you just label/number individual trials.
Sure, we have labeled the trials accordingly in the revised Figures 1, 2, and S1.
(6) Figure 3: Legend should include [IVM]. Meaning of +EAT-4, +GOT-1 should be described in the legend.
Thank you for your suggestion. We have updated the figure legend to include the IVM concentration (5 ng/mL). Additionally, we have clarified the meaning of +EAT-4 and +GOT-1 in the legend with the description: “…whereas the re-expression of GOT-1 (+GOT-1) and EAT-4 (+EAT-4) partially reinstated IVM resistance in the respective double mutants.” This ensures the figure is more informative and accessible to the reader.
(7) 784 signalling pathway should just be pathway.
Thanks, revised.
(8) Line 811 " Both types of motor neurons are innervated by serotonin (5 -HT)." Innervated by serotonergic "neurons"? However, even that is misleading because serotonin is not necessarily synaptic.
Thank you for your comment. We have revised the sentence to: “Both types of motor neurons could be activated by serotonin (5-HT).” This clarification better reflects the role of serotonin in modulating motor neuron activity.
(9) Line 814 puffing or perfusion. Perfusion seems more accurate. Make the figure consistent.
Thanks, revised.
(10) Figure S1 requires an x axis label with better explanation.
Thank you for your feedback. We have revised Figure S1 and added an x-axis label to clarify that it represents the trial number. Additionally, we have updated the figure legend to include the experimental conditions: “The shades of red represent worm viability, with darker shades indicating lower survival rates, based on 100 animals per plate and at least 5 trials.”
(11) Figure S2 C-F needs ivermectin concentration.
(12) Line 865 plants -> plates?
Thanks, revised.
(13) Figure S4. 875 "Rescue of IVM sensitivity of the ubr-1 mutant by the UBR-1 genomic fragment." Wrong title? Describes GFP expression and RNAi experiments.
Thank you for pointing out the mistake in the title. We have revised the title to: “Knockdown of UBR-1 induces IVM resistance phenotypes.” Additionally, we have updated the figure description to include details about GFP expression and RNAi experiments. The GFP expression is now described as: “Expression of functional UBR-1::GFP, driven by its endogenous promoter, was observed predominantly in the pharynx, head neurons, and body wall muscles with weaker expression detected in vulval muscles and the intestine.” The RNAi experiments are described as: “Double-stranded RNA (dsRNA) interference was employed to suppress gene expression in specific tissues (Methods).”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1:
The entire study is based on only 2 adult animals, that were used for both the single cell dataset and the HCR. Additionally, the animals were caught from the ocean preventing information about their age or their life history. This makes the n extremely small and reduces the confidence of the conclusions.
This statement is incorrect. While the scRNAseq was indeed performed in two animals (n=2), the HCR-FISH was performed in 3-5 animals (depending on the probe used). These were different animals from those used for the scRNAseq. The number of animals used has now been included in the manuscript.
All the fluorescent pictures present in this manuscript present red nuclei and green signals being not color-blind friendly. Additionally, many of the images lack sufficient quality to determine if the signal is real. Additional images of a control animal (not eviscerated) and of a negative control would help data interpretation. Finally, in many occasions a zoomed out image would help the reader to provide context and have a better understanding of where the signal is localized.
Fluorescent photos have been changed to color-blind friendly colors. Diagrams, arrows and new photos have been included to guide readers to the signal or labeling in cells. Controls for HCR-FISH and labeling in normal intestines have been included.
Reviewer #2:
The spatial context of the RNA localization images is not well represented, making it difficult to understand how the schematic model was generated from the data. In addition, multiple strong statements in the conclusion should be better justified and connected to the data provided.
As explained above we have made an effort to provide a better understanding of the cellular/tissue localization of the labeled cells. Similarly, we have revised the conclusions so that the statements made are well justified.
Reviewer #3:
Possible theoretical advances regarding lineage trajectories of cells during sea cucumber gut regeneration, but the claims that can be made with this data alone are still predictive.
We are conscious that the results from these lineage trajectories are still predictive and have emphasized this in the text. Nonetheless, they are an important part of our analyses and provide the theoretical basis for future experiments.
Better microscopy is needed for many figures to be convincing. Some minor additions to the figures will help readers understand the data more clearly.
As explained above we have made an effort to provide a better understanding of the cellular/tissue localization of the labeled cells.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
- Page 4, line 70-81: if the reader is not familiar with holothurian anatomy and regeneration process, this section can be complicated to fully understand. An illustration, together with clear definitions of mesothelium, coelomic epithelium, celothelium and luminal cells would help the reader.
A figure (now Figure 1) detailing the holothurian anatomy of normal and regenerating animals has been added. A figure detailing the intestinal regeneration process has also been included (S1).
- Page 5 line 92-104: this paragraph could be shortened. It would be more important to explain what the main question is the Authors would like to answer and why single cell would be the best technique to answer it, than listing previous studies that used scRNA-Seq.
The paragraph has been shortened and the focus has been shifted to the question of cellular components of regenerative tissues in holothurians.
- Page 6, line 125-127 and line 129-132: this belongs to the method section.
This information is now provided in the Materials and Methods section.
- Page 11, line 210-217: this belongs to the discussion.
This section has now been included in the Discussion.
- How many mesenteries are present in one animal?
This has now been included as part of Figure S1.
- In the methods there is no information about the quality of the dataset and the sequencing, or about the difference between the 2 samples coming from the 2 animals. How many cells come from each sample and what is the coverage? The Authors provided this info only between mesentery and anlage but not between animals.
We have added additional information about the sequencing statistics in S4 Fig and S15 Table. Description has also been added in the methods in lines 922-926 under Single Cell RNA Sequencing and Data Analysis section.
- The result section "An in-depth analysis of the various cluster..." is particularly long and very repetitive. I would encourage the Authors to remove a lot of the details (list of genes and GO terms) that can be found in the figures and stress only the most important elements that they will need to support their conclusions. Having full and abbreviated gene names and the long list of references makes the text difficult to read and it is challenging to identify the main point that the Authors are trying to highlight.
This section has been abbreviated.
- Figure 1: I would suggest adding a graph of holothurian anatomy before and after the evisceration to provide more context of the process we are looking at and remove 1C.
Information on the holothurian anatomy has been included in a new Fig 1 and in supplementary figure S1.
- Figure 2: I would suggest removing this figure that is redundant with Figure 3 and several genes are not cluster specific. Figure 3 is doing a better job in showing similar concepts.
Figure 2 was removed and placed in the Supplement section.
- In figure 3 how were the 3 cell types defined? Was this done manually or through a bioinformatic analysis?
The cell definition was done following the analysis of the highly expressed transcripts and comparisons to what has been shown in the scientific literature.
- Figure 2O shows that one of the supra-clusters is made of C2, C7, C6 and C10. This contradicts the text page 9, line 195.
The transcript chosen for this figure gives the wrong idea that these 4 clusters are similar. We have now addressed this in the manuscript.
- Figure 4A and 4C: if these are representing a subset of Figure 3, they should be removed in one or the other. The same comment is valid also for Figures 5, 6 and 7. In general the manuscript is very redundant both in terms of Figures and text.
These are indeed subsets of Fig 3 that were added with the purpose of clarifying the findings, however, in view of the reviewer’s comment we have deleted the redundant information from all figures.
- Figure 9: since the panels are not in order, it is difficult to follow the flow of the figure.
- All UMAPs should have the cluster number on the UMAP itself instead of relying only on the color code, in order to be color-blind friendly.
The figure has been modified and clusters are now identified in the UMAP by their number.
- Figure S1F seems acquired in very different conditions compared to the other images in the same figure.
Fig S1F (now S2 Fig) is an overlay of fluorescent immunohistochemistry (UV light detected) with “classical” toluidine blue labeling (visible light detected). This has now been explained in the figure legend.
- Table S7 is lacking some product numbers.
The toluidine blue product number has now been added to the table. The antibodies that lack product numbers correspond to antibodies generated in our lab and described in the references provided.
- The discussion is pretty long and partially redundant with the result section. I would encourage the Authors to shorten the text and shorten paragraphs that have repeating information.
- It might be out of the scope of the Authors but the readers would benefit from having a manuscript that focuses more on the novel aspects discovered with the single-cell RNA-Seq and then have a review that will bring together all the literature published on this topic and integrate the single-cell data with everything that is known so far.
We have tried to shorten the discussion by eliminating redundant text.
Reviewer #2 (Recommendations For The Authors):
- An intriguing finding is the lack of significant difference in the cell clusters between the anlage and mesentery during regeneration. This discovery raises important questions about the regenerative process. The authors should provide a more detailed explanation of the implications of this finding. For example, does it suggest that both organs contribute equally to the regenerated tissues?
The lack of significant differences in the cell clusters between the anlage and the mesentery is somewhat surprising but can be explained by two facts. First, we have previously shown that many of the cellular processes that take place in the anlage, including cell proliferation, apoptosis, dedifferentiation and ECM remodeling, occur in a gradient that begins at the tip of the mesentery, where the anlage forms, and extends to various degrees into the mesentery. Similarly, migrating cells move along the connective tissue of the mesentery to the anlage. Thus, there is no clear partition of the two regions that would account for distinct cell populations associated with the regenerative stage. Second, the two cell populations that would have been found in the mesentery but not in the regenerating anlage, mature muscle and neurons, were not dissociated by our experimental protocol in a way that allowed their sequencing. Our current experiments use single-nucleus RNA sequencing to overcome this hurdle. This has now been included in the discussion.
- Proliferating cells are obviously important to the study of regeneration as it is assumed these form the regenerating tissue. The authors describe cluster 8 as the proliferative cells. Is there evidence of proliferation in other cell types or are these truly the only dividing cells? Is c8 of multiple cell types but the clustering algorithm picks up on the markers of cell division i.e. what happens if you mask cell division markers - does this cluster collapse into other cluster types? This is important as if there is only one truly proliferating cell type then this may be the origin of the regenerative tissues and is important for this study to know this.
As the reviewer highlights, we also believe this to be an important aspect to discuss. We have addressed this in the manuscript discussion with the following: “Our data suggest that there appears to be a specific population of only proliferative cells (C8) characterized by a large number of cell proliferation genes, which can be visualized by the top genes shown in Fig 3. These cell proliferation genes are specific to C8, with minimum representation in other populations. Interestingly, as mentioned before, C8 expresses at lower levels many of the genes of other coelomic epithelium populations. Nevertheless, even if we mask the top 38 proliferation genes (not shown), this cluster is maintained as an independent cluster, suggesting that its identity is conferred by a complex transcriptomic profile rather than only a few proliferation-related genes. Therefore, the identity and potential role of C8 could be further described by two distinct alternatives: (1) cells of C8 could be an intermediate state between the anlage precursor cells (discussed below) and the specialized cell populations or (2) cells of C8 are the source of the anlage precursor populations from which all other populations arise. The pseudotime data is certainly complex and challenging to interpret with our current dataset, yet the RNA velocity analysis shown in Fig 11B would suggest that cells of C8 transition into the anlage precursor populations, rather than being an intermediate state. This is also supported by the Slingshot pseudotime analysis that incorporates C8 (S13 Fig).
Nevertheless, additional experiments are needed to confirm this hypothesis.”
- The schematic model presented in Fig 10 is essential for clarifying the paper's findings and will provide a crucial baseline model for future research. However, the comparison of the data shown in the HCR figures with the schematic is challenging due to the lack of spatial context in the HCR figures. The authors should find a way to provide better context in the figures, such as providing two-color in situ images to compare spatial relationships of cell types and/or including lower resolution and side-by-side fluorescent and bright field images if possible.
The figure has been modified to explain the spatial arrangement of the tissues.
The authors make several strong statements in the discussion that weren't well connected to the findings in the data. Specifically:
“Regardless of which cell population is responsible for giving rise to the cells of the regenerating intestine, our study reveals that the coelomic epithelium, as a tissue layer, is pluripotent.”
This has now been expanded to better explain the statement.
738 “…we postulate that cells from C1 stand as the precursor cell population from which the rest of the cells in the coelomic epithelium arise”.
This has now been expanded to better explain the statement.
748 “differentiation: muscle, neuroepithelium, and coelomic epithelium cells. We also propose the presence of undifferentiated and proliferating cell populations in the coelomic epithelia, which give rise to the cells in this layer…”
This has now been expanded to better explain the statement.
777 “amphibians, the cells of the holothurian anlage coelomic epithelium are proliferative undifferentiated cells and originated via a dedifferentiation process…”
This has now been expanded to better explain the statement.
Reviewer #3 (Recommendations For The Authors):
Specific questions:
- Is there any way to systematically compare these cells to evolutionarily-diverged cells in distant relatives to sea cucumbers? Or even on a case-by-case basis? For example, is there evidence for any of these transitory cell types to have correlate(s) in vertebrate gut regeneration?
This is a most interesting question but one that is perhaps a bit premature to answer, for several reasons. First, most of the studies in vertebrates focus on the regeneration of the luminal epithelium, a layer that we are not studying in our system since it appears later in the regeneration process. Second, there is still too little data from adult echinoderms to fully comprehend which cells are orthologous to vertebrate cells. Third, we are only analyzing one regenerative stage. It is our hope that this is just the start of a full description of what cell types/stages are found and how they function in regeneration and that this will lead us to identify the cellular orthologues among animal species.
Major revisions:
- If lineage tracing is within the scope of this paper, it would provide more definitive evidence to the conclusions made about the precursor populations of the regenerating anlage.
Response: This is certainly one of the next steps; however, at present it is not possible due to technical limitations.
Minor revisions:
- Line 47: "for decades" even longer! Could the authors also cite some other amphibians, such as other salamanders (newts) and larval frogs?
References have been added.
- Line 85: "specially"-could authors potentially change to "specifically"
Corrected
- Line 122: Authors should add the full words of what these abbreviations stand for in the caption for Figure 1 or in Figure 1A itself.
Corrected
- Lines 153: What conclusions are the authors trying to make from one type of tubulin presence compared to the others? It's unclear from the text.
The authors are not trying to reach any particular conclusion. They are just stating what was found using several markers, and the possibility that what might be viewed at first glance as a single cell population might be more heterogeneous. Although the tubulin-type information might not be relevant for the conclusions in the present manuscript, it might be important for future work on the cell types involved in the regeneration process.
- Line 226: Could the authors clarify if "WNT9" is "WNT9a". Figure 3 lists WNT9a but authors refer to WNT9 in the text.
The gene names in Fig 3 are based on the human identifiers. H. glaberrima only has one sequence of Wnt9 (Auger et al. 2023) and this sequence shares the highest similarity to human Wnt9a, thus the name in the list. We have now identified the gene as Wnt9 to avoid confusion.
- Lines 236-237: Can authors rule out that some immune cells might infiltrate the mesenchymal population?
No, this cannot be ruled out. In fact, we believe that most of the immune cells found in our scRNA-seq are indeed cells that have infiltrated the anlage and are part of the mesenchyma. This has been reported by us previously (see Garcia-Arraras et al. 2006). We have now included this in the text.
- Line 452-453: The over-representation of ribosomal genes not shown. Would it be possible to show this information in the supplementary figures?
The sentence has been modified; the data are being prepared as part of a separate publication that focuses on the ribosomal genes.
- Line 480: Could authors clarify if it's WNT9a or just WNT9?
It is indeed Wnt9. See previous response above.
- Line 500: In future experiments, it would be interesting to compare to populations at different timepoints in order see how the populations are changing or if certain precursors are activated at different times.
We fully agree with the reviewer. These are ongoing experiments or are part of new grant proposals.
- Line 567-568: Choosing 9-dpe allowed for 13 clusters, but do authors expect a different number of clusters at different timepoints as things become more terminally differentiated?
Definitely, we believe that clusters related to the different regenerative stages of cells can be found by looking at earlier or later regeneration stages of the organ. A clear example is that if the experiment is done at 14-dpe, when the lumen is forming, cells related to luminal epithelium populations will appear. It is also possible that different immune cells will be associated with the different regeneration stages.
- Line 653: References Figure 10D (not in this manuscript). Are authors referring to only 1D or 9D or an old draft figure number?
As the reviewer correctly points out, this was a mistake where the reference is to a previous draft. It has now been corrected.
- Line 701: "our study reveals that the coelomic epithelium, as a tissue layer, is pluripotent." Phrasing may be better as referring to the cell population making up the tissue layer as pluripotent/multipotent or that the cells it contains would likely be pluripotent or multipotent. Additionally, lineage tracing may be needed to definitively demonstrate this.
This has been modified.
- Line 808: The authors may make a more accurate conclusion by saying that the characteristics are similar to blastemas or behave like a blastema rather than it is blastema. There is ambiguity about the meaning of this term in the field, but most researchers seem to currently have in mind that the "blastema" definition includes a discrete spatial organization of cells, and here these cells are much more spread out. This could be a good opportunity for the authors to engage in this dialogue, perhaps parsing out the nuances of what a "blastema" is, what the term has traditionally referred to, and how we might consider updating this term or at least re-framing the terminology to be inclusive of functions that "blastemas" have traditionally had in the literature and how they may be dispersed over geographical space in an organism more so than the more rigid, geographically-restricted definition many researchers have in mind. However, if the authors choose to elaborate on these issues, those elaborations do belong in the discussion, and the more provisional terminology we mention here could be used throughout the paper until that element of the revised discussion is presented. We would welcome the authors to do this as a way to point the field in this direction as this is also how we view the matter. For example, some of the genes whose expression has been observed to be enriched following removal of brain tissue in axolotls (such as kazald2, Lust et al.), are also upregulated in traditional blastemas, for instance, in the limb, but we appreciate that the expression domain may not be as localized as in a limb blastema. 
Additionally, since there is now evidence that some aspects of progenitor cell activation even in limb regeneration extend far beyond the local site of amputation injury (Johnson et al., Payzin-Dogru et al.), there is an opportunity to connect the dots and make the claim that there could be more dispersion of "blastema function" than previously appreciated in the field. Diving a bit more into these nuances may also enable better conceptual framework of how blastema function may evolve across vast evolutionary time and between different injury contexts in super-regenerative organisms.
We have followed the reviewer’s suggestion and stated that the holothurian anlage behaves as a blastema. Though we would love to elaborate on the blastema topic, as suggested by the reviewer, we believe that it would extend the discussion too much and that the topic might be better served in a different publication.
- In the discussion, it would be important not to leave the reader with the impression that all amphibian blastema cells originate via dedifferentiation. This is not the case. For example, in axolotls (Sandoval-Guzman et al.) and in larval/juvenile newts, muscle progenitors within the blastema structure have been shown to originate from muscle satellite cells, a kind of stem cell, in stump tissues (while adult newts use dedifferentiation of myofibers to generate muscle progenitors in the blastema). Most cell lineages simply have not been evaluated in the level of detail that would be required to definitively conclude one way or the other, and the door is open for a more substantial contribution from stem cell populations than previously appreciated especially because new tools exist to detect and study them. Providing the reader with a more nuanced view of this situation will not negatively impact the findings in this paper, but it will show that there is biological complexity still waiting to be discovered and that we don't have all the answers at this point.
This has now been corrected.
Figures: Overall, the figures need minor work.
- Figure 1A: Can the authors draw a smaller, full-body cartoon and feature the current high-mag cartoon as an inset to that? Can they label the axes and make it clear how the geometry works here?
Fig 1 has been re-done and now is split into Fig 1 and Fig 2.
- Figure 1B: Can the authors label the UMAP with cluster identities on the map itself? This will make it easier to identify each cluster (especially to make sure cluster 11 is easier to find).
This has been corrected.
- Figure 2: Could the authors put boxes/clearly distinguish panel labels around each cluster (A-O), so that there are clear boundaries?
Fig 2 has been moved to the Supplement, following another reviewer’s recommendation.
- "Gene identifiers starting with "g" correspond to uncharacterized gene models of H. glaberrima." - The sentence is from another figure caption but this figure would benefit from having this sentence in the figure caption as well.
This has been added to other figures as suggested.
- Figure 3A: Can the authors potentially bold, highlight, or underline genes you discuss in text, so it's easier for the reader to reference?
This has been added as suggested.
- Figure 3C: Can the authors please label the cell types directly on the UMAP here as well?
The changes were made following the reviewer’s recommendation.
- Figure 4D-E: There's not much context here to determine if this HCR-FISH validation can tell us anything about these cells besides some of them appear to be there. Do authors expect the coelomocyte morphology to look different in regenerating/injured tissue versus normal animals? Can the authors provide some double in situs, as well as some lower-magnification views showing where the higher-magnification insets are located? Is there any spatial pattern to where these cells are found? Counter stains would be helpful.
- Figure 6C: If clusters C5, C8, C9 are part of the coelomic epithelium, then authors could show a smaller diagram above with blue and grey to show types and then show clusters separately to help get their point across better.
- Figure 6G: This image appears to have high background- would it be possible for authors to repeat phalloidin stain or reimage with a lower exposure/gain. Additionally, imaging with Zstacks would help to obtain maximum intensity projections. It would greatly aid the reader if each image was labeled with HCR probes/antibodies that have been applied to the sample.
- Figure 7E: The cells appear to be out of focus and have high background. Additionally, they are lacking the speckled appearance expected to be seen with HCR-FISH. Would it be possible for authors to collect another image utilizing z-stacks?
HCR-FISH figures identifying the gene expression characteristic of cell clusters have been modified following the reviewer’s concerns. The changes include:
(1) Additional clusters have been verified with probes to gene identifiers. These include clusters 8, 9 and 12.
(2) Redundant information has been removed.
(3) Colors have been changed to make figures friendlier to color-impaired readers.
(4) Spatial context has been added or identified.
(5) In some cases, improved photos have been added.
(6) Better labels have been included.
(7) When necessary individual photos used for the overlay have been included.
- Figure 9A: Could authors add cluster labels onto UMAP directly?
This change was made to Fig 2A. The UMAP in Fig 9A is the same and is used only as a reference for the subset.
- Figure 10: It could be useful if authors put a small map of the sea cucumber like in other images so that readers know where in the anlage this zoomed in model represents.
Added as suggested by the reviewer.
- Supplementary figure 1F: Could authors add an arrow to the dark cell that's being pointed out?
Change made as suggested by the reviewer.
- Supplementary figure 1: Could authors label clearly what color is labeled with what marker?
Change made as suggested by the reviewer.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
The authors present valuable findings on trends in hind limb morphology throughout the evolution of titanosaurian sauropod dinosaurs, the land animals that reached the most remarkable gigantic sizes. The solid results include the use of 3D geometric morphometrics to examine the femur, tibia, and fibula to provide new information on the evolution of this clade and understand the evolutionary trends between morphology and allometry. Further justification of the ontogenetic stages of the sampled individuals would help strengthen the manuscript's conclusions, and the inclusion of additional large-body mass taxa could provide expanded insights into the proposed trends.
Most of the analyzed specimens, especially those from the smaller taxa, are adult or subadult individuals. None exhibit features that may indicate juvenile status. However, we lack paleohistological information, which would be a stronger indicator of the ontogenetic status of each individual, and some of the operational taxonomic units used in the study are the mean shape of all the sampled specimens.
Current information on morphological differences between adult and subadult or juvenile specimens indicates that even early juvenile specimens may share the same morphological features and overall morphology as adults (e.g., see Curry-Rogers et al., 2016; Appendix S3). In Appendix S3, we included a comprehensive analysis of the impact of juvenile specimens as one aspect of the intraspecific variability that may alter our results.
Public Reviews:
Reviewer #1:
Weaknesses:
Several sentences throughout the manuscript could benefit from citations. For example, the discussion of using hind limb centroid size as a proxy for body mass has no citations attributed. This should be cited or described as a new method for estimating body mass with data from extant taxa presented in support of this relationship. This particular instance is a very important point to include supporting documentation because the authors' conclusions about evolutionary trends in body size are predicated on this relationship.
We address this issue in the text (Lines 32 and 64). Centroid size seems a good indicator because it reflects the overall size of the entire hind limb, and the lengths of the femur and tibia are each well correlated with body size/mass. Also, as we use few landmarks, all of them strictly type I or II, with curves of semilandmarks bounded by them, centroid size is not sensitive to differences in landmark number across the sample in our study (centroid size depends on the number of landmarks used in a study as well as on the physical dimensions of the specimens).
We have repeated all the analyses using other proxies, such as femoral length and the body mass estimated with the methods of Campione & Evans (2020) and Mazzetta et al. (2004). A comprehensive description of the method is in Appendix S2, the alternative analyses can be accessed in Appendices S3 and S4, and the code for the alternative analyses can be accessed in the modified Appendix S5. All yield results similar to those obtained in our analyses with body size proxied by the centroid size of the hind limb landmark configuration.
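To make the remark about centroid size and landmark number concrete, the following is a minimal sketch of the standard definition (the square root of the summed squared distances of all landmarks to their centroid). It is an illustration only and not the authors' code, which is provided in Appendix S5; the function name `centroid_size` and the four landmark coordinates are hypothetical.

```python
import math

def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks to their centroid."""
    n = len(landmarks)
    dims = len(landmarks[0])
    centroid = [sum(p[d] for p in landmarks) / n for d in range(dims)]
    return math.sqrt(
        sum((p[d] - centroid[d]) ** 2 for p in landmarks for d in range(dims))
    )

# Hypothetical 3D landmarks in arbitrary units (not taken from the study).
config = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]

cs = centroid_size(config)

# Same shape, three times larger: centroid size scales with physical dimensions.
cs_scaled = centroid_size([tuple(3 * c for c in p) for p in config])

# Same shape and physical size, but every landmark sampled twice:
# centroid size grows with the number of landmarks alone.
cs_dense = centroid_size(config + config)
```

In this toy case, tripling the coordinates triples the centroid size, while doubling the number of landmarks on an identical object multiplies it by the square root of two. This is why centroid sizes are only comparable among specimens digitized with the same landmark configuration.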
An additional area of concern is the lack of any discussion of taphonomic deformation in Section 3.3 Caveats of This Study, the results, or the methods. The authors provide a long and detailed discussion of taphonomic loss and how this study does a good job of addressing it; however, taphonomic deformation to specimens and its potential effects on the ensuing results were not addressed at all. Hedrick and Dodson (2013) highlight that, with fossils, a PCA typically includes the effects of taphonomic deformation in addition to differences in morphology, which results in morphometric graphs representing taphomorphospaces. For example, in this study, the extreme negative positioning of Dreadnoughtus on PC 2 (which the authors highlight as "remarkable") is almost certainly the result of taphonomic deformation to the distal end of the holotype femur, as noted by Ullmann and Lacovara (2016).
We included a brief commentary in the Caveats of This Study (Line 467) and greatly expanded on this issue in Appendix S3. We followed the methodology proposed by Lefebvre et al. (2020) to discuss the effects of taphonomic deformation in the shape analyses.
Our shape variables (PCs obtained from the shape PCA) should be viewed as taphomorphospaces, as Hedrick and Dodson, as well as the reviewer, point out for such cases.
The analysis of the effects of taphonomy or errors induced by the landmark estimation method indicates that Dreadnoughtus schrani is one of the few sampled taxa that may have a noticeable impact on our analyses due to lithostatic deformation. Other taxa, like Mendozasaurus neguyelap or Ampelosaurus atacis, may also induce some alterations to the PCs. In general, the PCs slightly altered by taphonomy did not exhibit phylogenetic signal and account for a small proportion of the sample variance; D. schrani is the only sauropod that may alter an entire PC (PC2).
The authors investigated 17 taxa and divided them into 9 clades, with only Titanosauria and Lithostrotia including more than two taxa (and four clades are only represented by one taxon). While some of these clades represent the average of multiple individuals, the small number of plotted taxa can only weakly support trends within Titanosauria. If similar general trends could be found when the taxa are parsed into fewer, more inclusive clades, it would support and strengthen their claims. Of course, the authors can only study what is preserved in the fossil record, and titanosaurian remains are often highly fragmentary; these deficiencies should therefore not be held against the authors. They clearly put effort and thought into their choices of taxa to include in this study, but there are limitations arising from this low sample size that inherently limit the confidence that can be placed on their conclusions, and this caveat should be more clearly discussed. Specifically, the authors note that their dataset contains many lithostrotians, but they do not discuss unevenness in body size sampling. As neither their size-category boundaries nor the taxa which fall into each of them are clearly stated, the reader must parse the discussion to glean which taxa are in each size category. It should be noted that the authors include both Jainosaurus and Dreadnoughtus as 'large' taxa even though the latter is estimated to have been roughly five times the body mass of the former, making Dreadnoughtus the only taxon included in this extreme size category. The effects that this may have on body size trends are not discussed. Additionally, few taxa between the body masses of Jainosaurus and Dreadnoughtus have been included even though the hind limbs of several such macronarians have been digitized in prior studies (such as Diamantinasaurus and Giraffititan; Klinkhamer et al. 2018). 
Also, several members of Colossosauria are more similar in general body size to Dreadnoughtus than Jainosaurus, but unfortunately, they do not preserve a known femur, tibia, and fibula, so the authors could not include them in this study. Exclusion of these taxa may bias inferences about body size evolution, and this is a sampling caveat that could have been discussed more clearly. Future studies including these and other taxa will be important for further evaluating the hypotheses about macronarian evolution advanced by Páramo et al. in this study.
Sadly, we could not include some larger-sized titanosaurian sauropods. As the reviewer points out, the lack of larger sauropods among the sampled taxa may hinder our results, as the “large-bodied” category is filled with some mid-sized taxa and Dreadnoughtus schrani, which is five times larger than some of them. We tried to include Elaltitan lilloi, digitized for this study and included in preliminary analyses, but its fragmentary status greatly increased the error of the estimation method, as only a proximal third or mid femur is preserved from this taxon. Therefore, we opted to exclude it from our database.
Other taxa suggested by the reviewer were not readily available to the authors at the time this study was conducted, and including them now might increase the possible bias of our study. Giraffatitan brancai is a Late Jurassic brachiosaurid, so it would again increase the number of early-branching titanosauriforms with large body masses, while most of the smaller taxa sampled are recovered among deeply-branching macronarians (including Diamantinasaurus matildae, had we also included it). Future analyses may include a wider sample of the mid- to large-bodied titanosaurians, especially lithostrotians, as well as some colossosaurs like Patagotitan mayorum.
Reviewer #1 (Recommendations For The Authors):
These are all minor comments that would improve the manuscript.
- There are a few typos throughout the manuscript such as: line 70 should be 2016 and line 242 should be forelimb.
Corrected.
- To me, the most interesting aspect of your study is the diversity and trends recovered in titanosaurian subclades and I would highlight this, not gigantism, in the title if you choose to revise the title.
It has been addressed. The specificity of some of the tests and their implications for the acquisition of the spread limb posture and gigantism in early-branching taxa are nonetheless important, so we think that gigantism may remain in the title.
- The abstract should provide more details on the results such as none of the listed trends were statistically significant.
Many of the trends exhibit phylogenetic signal, but not the allometric components. We have briefly addressed them.
- Several sentences in the manuscript need citations such as: line 48 the reference to other megaherbivores, line 66 the discussion of poor understanding of the relationship of wide gauge posture and gigantism, and the use of centroid size as an estimate of body mass (see Public Review).
We changed line 66 to focus on the current state of the art regarding the hypothesized relationship between arched limbs and the increase in body size. We included a section on centroid size as a proxy (due to the good correlation between femur and tibia length and body mass) and the caveats of using it. We also expanded on the use of centroid size and the alternative models in Appendix S2.
- With titanosaur evolution, you mention that they are adapting to new niches and topography (line 64). What support is there for this versus they are adapting to be more successful in their current environment?
Noted. We have changed the phrase to refer to improved efficiency in exploiting inland environments, as they may be either opening new inland niches or adapting better to existing inland niches that were already exploited by less deeply-branching sauropods. However, testing this is beyond the scope of the current work.
- Line 384-385: the discussion of Rapetosaurus should mention that it is a juvenile and some studies have suggested that titanosaur limbs grow allometrically.
We have included a short line. Whether or not Rapetosaurus krausei exhibits allometric growth may not greatly change the discussion, perhaps only excluding it as morphologically convergent with Lirainosaurus and Muyelensaurus. If that is so, it would be further evidence that small-sized titanosaurs exhibit the robust skeleton expected in giant titanosaurs.
- I would consider addressing the question of if we are certain enough in our understanding of titanosaurian phylogeny to rule out homology, especially when you discuss the uncertainty of the placement of specific taxa. Also, Diamantinasaurus is not the only titanosaur that has been proposed as a member of both basal and more derived subclades (e.g., Dreadnoughtus).
We tried to take a more conservative approach. We cannot fully rule out that some of the features observed in the sampled deeply-branching lithostrotians, especially saltasauroids, are present in the entire somphospondylan lineage. However, none of the less deeply-branching or early-branching titanosaurs exhibit this kind of morphology. Recent studies propose the possibility that entire groups included in this study, such as Colossosauria, change their position in the phylogeny. However, despite the debated phylogenetic position of Diamantinasaurus or Dreadnoughtus, or even the inclusion of Colossosauria within the saltasauroids and the inclusion of the Ibero-Armorican lithostrotians as putative saltasaurids (Mocho et al. 2024), we did not notice any relevant differences in our conclusions about hind limb arched morphology or size when considering these changes. Distal hind limb overall robustness should indeed be addressed in the light of shifts in phylogenetic position, including some interesting sauropods like Diamantinasaurus or expanding the sample of large-sized Colossosauria or early-branching somphospondyls, as it may have profound implications for the morphofunctional adaptations to specific feeding niches, e.g., see current hypotheses about rearing as mentioned in Bates et al. (2016), Ullmann et al. (2017) or Vidal et al. (2020). We did not have enough information to conclude the presence of any plesiomorphic condition or analogous feature with our current sample and the debated titanosaurian phylogeny.
- I understand this is not standard in the field, but your study provides the opportunity to conduct sensitivity testing of the effects of cartilage thickness and user articulation of the bones on PCA results. This would be an inciteful addition to the field of GMM.
We are currently developing such a comprehensive analysis, along with several other tests of its implications for our past results. However, we feel that it is beyond the scope of the current study. We appreciate the suggestion nonetheless, as it would provide a sensitivity test of the impact of several of our assumptions on the final results, something that is often not considered.
- In Figure 1, if all the limbs were arranged the same way it would be easier to interpret. Consider flipping panels B and D to match A and C.
Accepted.
- In Figures 2-4, the views in C should be labeled in the figure or caption. Oceanotitan is also in the PCA plot but not included in the figure caption. Also, consider changing the names to represent the paraphyletic groupings you are using instead of formal clade names. For example, change 'Titanosauria' to 'Basal Titanosaurs' to reflect that it is not including all titanosaurs in the sample.
Changes accepted for the shape PCA results. The informal (i.e., paraphyletic) terms such as “Basal Titanosaurs” were used only in the shape analyses because, in the RMA, Titanosauria (and other more inclusive groups) were treated as natural groups. Each partial RMA model is based on a sample of all the taxa that are included within that particular clade (e.g., Titanosauria includes both Dreadnoughtus and Saltasaurus; Lithostrotia excludes the former).
- I am concerned that centroid size does not scale evenly across the wide-ranging body mass of titanosaurs. I do not know if this affects your size trends or their significance, but as I mentioned above Dreadnoughtus is much bigger than most of the taxa included and that isn't as drastically apparent in centroid size (in Figure 5) as it is when taxa are plotted by body mass.
The main problem with the centroid size of the hind limb is the shift in the body plan of deeply branching titanosaurs: the center of mass is displaced toward the anterior portion of the body, which has been attributed to greater development of the forelimb region (e.g., Bates et al. 2016). However, this would only increase the effects of the phyletic body size reduction, as smaller taxa tend to have a 1:1 forelimb to hind limb ratio (e.g., our past analyses in Páramo et al. 2019), and the sacrum is not as beveled as in earlier somphospondylans (e.g., Vidal et al. 2020). The role of the low-browsing feeding habits of deeply branching lithostrotians should be explored elsewhere, as it may be the main driving force of this effect. Our point is that the proxy used may have a slight offset because some high-browsing, giant, early-branching titanosaurs have a more developed cranial region, which increases their body size and mass beyond our bare-minimum estimate based on the hind limb region. Overall, however, this offset is assumed to be small. We repeated the analyses with femoral length as a proxy for body size and with a mass estimation, including the quadratic equation based on both humeral and femoral lengths, and the results remain similar. Another problem that arises with the use of centroid size is how it is calculated, but as we used the same number of landmarks and curve semilandmarks throughout, all of them bound to anatomical features, it remains comparable at least within our sample (but cannot be extrapolated to other geometric morphometric studies that do not use the same configurations).
We appreciate the reviewer's concerns nonetheless, as this was one of our own when designing this study. In the future we will try to expand the analyses, and we advise anyone expanding on this study to do the same, using total body size/volume estimations following Bates et al. (2016), which would also include tests of the effects of the different whole-body estimation models.
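To make the point about landmark configurations concrete, the following is a minimal sketch of the standard definition of centroid size (the square root of the summed squared distances of all landmarks from their centroid). It is not the code used in the study; the function name and the toy coordinates are hypothetical and serve only to illustrate why centroid size is comparable only between configurations with the same landmarks.

```python
import math

def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks to their centroid."""
    n = len(landmarks)
    dims = len(landmarks[0])
    # Centroid: mean coordinate along each axis
    centroid = [sum(p[d] for p in landmarks) / n for d in range(dims)]
    return math.sqrt(
        sum((p[d] - centroid[d]) ** 2 for p in landmarks for d in range(dims))
    )

# Hypothetical toy "bone": four landmarks at the corners of a unit square
sparse = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
# The same object at twice the linear size
doubled = [(2 * x, 2 * y, 2 * z) for x, y, z in sparse]
# The same unit square, but sampled with four extra (edge midpoint) landmarks
dense = sparse + [(0.5, 0.0, 0.0), (1.0, 0.5, 0.0), (0.5, 1.0, 0.0), (0.0, 0.5, 0.0)]

print(centroid_size(sparse), centroid_size(doubled), centroid_size(dense))
```

In this toy case, doubling the object doubles the centroid size, whereas simply adding landmarks to the same object also raises the value. This is why the measure should not be compared across studies that use different landmark and semilandmark configurations.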
Cites:
Bates KT, Mannion PD, Falkingham PL, Brusatte SL, Hutchinson JR, Otero A, Sellers WI, Sullivan C, Stevens KA, Allen V. 2016. Temporal and phylogenetic evolution of the sauropod dinosaur body plan. Royal Society Open Science 3:150636. doi:10.1098/rsos.150636
Mocho P, Escaso F, Marcos-Fernández F, Páramo A, Sanz JL, Vidal D, Ortega F. 2024. A Spanish saltasauroid titanosaur reveals Europe as a melting pot of endemic and immigrant sauropods in the Late Cretaceous. Commun Biol 7:1016. doi:10.1038/s42003-024-06653-0
Páramo A, Ortega F, Sanz JL. 2019. A Niche Partitioning Scenario for the Titanosaurs of Lo Hueco (Upper Cretaceous, Spain). International Congress of Vertebrate Morphology (ICVM) - Abstract Volume, Journal of Morphology. Prague. p. S197.
Ullmann PV, Bonnan MF, Lacovara KJ. 2017. Characterizing the Evolution of Wide-Gauge Features in Stylopodial Limb Elements of Titanosauriform Sauropods via Geometric Morphometrics. The Anatomical Record 300:1618–1635. doi:10.1002/ar.23607
Vidal D, Mocho P, Aberasturi A, Sanz JL, Ortega F. 2020. High browsing skeletal adaptations in Spinophorosaurus reveal an evolutionary innovation in sauropod dinosaurs. Sci Rep 10:6638. doi:10.1038/s41598-020-63439-0
Reviewer #2:
The authors report a quantitative comparative study regarding hind limb evolution among titanosaurs. I find the conclusions and findings of the manuscript interesting and relevant. The strength of the paper would be increased if the authors were to improve their reporting of taxon sampling and their discussion of age estimation and the potential implications that uncertainty in these estimates would have for their conclusions regarding gigantism (vs. ontogenetic patterns).
Considering the observations made by Reviewer #1, we included data on the impact of ontogenetic patterns and other intraspecific variability in Appendix S3. We considered increasing the sample, but this was not possible at the time the study was carried out.
Reviewer #2 (Recommendations For The Authors):
I have a few concerns/requests for the authors, that I hope can be easily resolved.
Comments:
- What drove taxon sampling?
Taxa were drawn as a random sample of somphospondylan sauropods, focused on the clade Lithostrotia, for the thesis project of one of the authors (APB). Logistics were also a source of bias in our sample. Given the available titanosaurian material, we left out several macronarians that had already been sampled, because including them would further reinforce an "early-branching large sauropod versus deeply branching small sauropod" pattern that might alter our results.
- Which phylogenies were used to create the supertree applied to the analyses? What references were used to time-calibrate the tips and deeper nodes? I couldn't find any reference to this. Additionally, more information regarding the R packages and analytical pipeline would be appreciated: e.g. were measurements used in the analyses log-transformed?
A comprehensive description of the methodology is provided in Appendix S2.
- Age estimate: can the author confirm the skeletal maturity of the sampled individuals? If this is not the case, how can the author be sure that the patterns towards gigantism are not reflecting different ontogenetic stages? I believe this should be part of both methods and discussion.
As commented before, we excluded small, probable juvenile specimens from our sample. We have no paleohistological samples to back claims about the ontogenetic status of some of the specimens that were included or excluded when calculating the mean shape for the operational taxonomic units. However, we followed a set of criteria to identify relative ontogenetic status, and these have been included in Appendix S3.
- The authors used the centroid size for regressions in Figure 6. Although I believe that this is a good variable, would the author be willing to use body mass and log-transformed femur length in addition to what was done? These would be very useful considering that these variables are (relatively) independent from shape/morphology.
Accepted. We tested our hypotheses with three alternative models based on femoral length and on body mass estimations from combined femoral and humeral lengths. The methodology can be found in Appendix S2, the results in Appendix S4, and the code for the alternative methods in Appendix S5.
- Data access: will .stl files of the limb elements be shared and freely available? If so, where will the files be deposited?
At the time of the current study, some of the sampled specimens cannot be made available (material under study), but the mean shapes can be generated from the landmarks and semilandmark curves and the “atlas” mesh.
- Additionally, outstanding references regarding limb evolution, GMM, role of ontogeny, and evolution of columnar gait are missing. The authors should reinforce the literature review with the following (alphabetical order):
Bonnan, M. F. (2003). The evolution of manus shape in sauropod dinosaurs: implications for functional morphology, forelimb orientation, and phylogeny. Journal of Vertebrate Paleontology, 23(3), 595-613.
Botha, J., Choiniere, J. N., & Benson, R. B. (2022). Rapid growth preceded gigantism in sauropodomorph evolution. Current Biology, 32(20), 4501-4507.
Curry Rogers, K., Whitney, M., D'Emic, M., & Bagley, B. (2016). Precocity in a tiny titanosaur from the Cretaceous of Madagascar. Science, 352(6284), 450-453.
Day, J. J., Upchurch, P., Norman, D. B., Gale, A. S., & Powell, H. P. (2002). Sauropod trackways, evolution, and behavior. Science, 296(5573), 1659-1659.
Fabbri, M., Navalón, G., Benson, R. B., Pol, D., O'Connor, J., Bhullar, B. A. S., ... & Ibrahim, N. (2022). Subaqueous foraging among carnivorous dinosaurs. Nature, 603(7903), 852-857.
Fabbri, M., Navalón, G., Mongiardino Koch, N., Hanson, M., Petermann, H., & Bhullar, B. A. (2021). A shift in ontogenetic timing produced the unique sauropod skull. Evolution, 75(4), 819-831.
González Riga, B. J., Lamanna, M. C., Ortiz David, L. D., Calvo, J. O., & Coria, J. P. (2016). A gigantic new dinosaur from Argentina and the evolution of the sauropod hind foot. Scientific Reports, 6(1), 19165.
Lefebvre, R., Allain, R., & Houssaye, A. (2023). What's inside a sauropod limb? First three‐dimensional investigation of the limb long bone microanatomy of a sauropod dinosaur, Nigersaurus taqueti (Neosauropoda, Rebbachisauridae), and implications for the weight‐bearing function. Palaeontology, 66(4), e12670.
McPhee, B. W., Benson, R. B., Botha-Brink, J., Bordy, E. M., & Choiniere, J. N. (2018). A giant dinosaur from the earliest Jurassic of South Africa and the transition to quadrupedality in early sauropodomorphs. Current Biology, 28(19), 3143-3151.
Martin Sander, P., Mateus, O., Laven, T., & Knötschke, N. (2006). Bone histology indicates insular dwarfism in a new Late Jurassic sauropod dinosaur. Nature, 441(7094), 739-741.
Remes, K. (2008). Evolution of the pectoral girdle and forelimb in Sauropodomorpha (Dinosauria, Saurischia): osteology, myology and function (Doctoral dissertation, München, Univ., Diss., 2008).
Sander, P. M., & Clauss, M. (2008). Sauropod gigantism. Science, 322(5899), 200-201.
Yates, A. M., & Kitching, J. W. (2003). The earliest known sauropod dinosaur and the first steps towards sauropod locomotion. Proceedings of the Royal Society of London. Series B: Biological Sciences, 270(1525), 1753-1758.
We appreciate this suggestion. We already used some of these articles in our study, but the selection of citations was also constrained by the manuscript space allowed by the editorial guidelines. We would have liked to include several of these works, but we opted to cite works that summarize some of them and to exclude others.
-
-
www.researchsquare.com www.researchsquare.com
-
Author response:
The following is the authors’ response to the original reviews.
We thank the reviewers for their constructive criticism. It is rare and gratifying to receive such thoughtful feedback, and the result is a much stronger paper. We made significant changes to our statistical analyses and figures to better differentiate the effects of sex and dominance rank on food-cleaning behaviors. These revisions uphold our original conclusion––that rank-related variation overwhelms any sex difference in cleaning behavior. We hope that these edits, together with the rest of our responses, provide a convincing demonstration of the tradeoffs of eliminating quartz from food surfaces.
Reviewer #1 (Public Review):
Summary
We have no objections to Reviewer 1’s summary of our manuscript.
Strengths
Reviewer 1 is extremely gracious, and we are grateful for the kind words.
Weaknesses
Reviewer 1 identified several weaknesses, enumerating three types: (1) statistics, (2) insufficient links to foraging theory, and (3) interpretation and validity of the model. The present response is organized around these same categories.
(1) Statistics
We put all of our data and code into the Zenodo repository prior to submission. This content should have been accessible to Reviewer 1 from the outset. But in any event, we are very sorry for the mix-up. To ensure access to our data and code during the present stage of review, we included the URL in the main manuscript and here: https://doi.org/10.5281/zenodo.14002737
(a) AIC and outcome distributions
Reviewer 1 criticized our use of AIC for determining model selection. We agree and this aspect of our manuscript is now removed. In lieu of AIC, we produced two data sets consisting of whole number counts (seconds) with means <5. The data were right-skewed due to high concentrations of biologically-meaningful zeros (i.e., bouts of food handling without any cleaning effort). Following the recommendations of Bolker et al. (2008) and others (Brooks et al. 2017, 2019), we chose an outcome distribution (zero-inflated Poisson, see response below) that best matched this data distribution. In addition, we evaluated the post-hoc performance of each of our models using the standardized residual diagnostic tools for hierarchical regression models available in the DHARMa package (Hartig, 2022). To further evaluate our choice of outcome distribution, we generated QQ-plots and residual vs. predicted plots for each model and included them in our revision as Figures S3-S5.
(b) zeros
Reviewer 1 expressed concern over our treatment of biologically-meaningful zeros, and recommended use of a zero-inflated GLMM with either a Poisson or negative binomial outcome distribution. We agree that such models are best for our two data sets. Accordingly, we fit a series of zero-inflated generalized linear mixed models (ZIGLMM) using the glmmTMB package in R, each with a logit-link function, a single zero-inflation parameter applying to all observations, and a Poisson error distribution. For the food-brushing model, we fit a zero-inflated Poisson (ZIP), which produced favorable standardized residual diagnostic plots with no major patterns of deviation (Figure S3) and minor, but non-significant underdispersion (DHARMa dispersion statistic = 0.99, p = 0.80). For our two food-washing models, we used zero-inflated models with Conway-Maxwell Poisson (ZICMP) distributions, an error distribution chosen for its ability to handle data that are more underdispersed (DHARMa dispersion statistic = 8.2E-09, p = 0.74) than the standard zero-inflated Poisson (Brooks et al. 2019). Using this error distribution improved residual diagnostic plots over a standard ZIP model and we view any deviations in the standardized residuals as minor and attributable to the smaller sample size of our food-washing data set (see Figures S4 and S5) (Hartig, 2022). We reported the summarized fixed effects tests for each GLMM in Tables S1-S3 as Analysis of Deviance Tables (Type II Wald chi square tests, one-sided) along with 𝜒2 values, degrees of freedom, and p-values (one-sided tests). Full model summaries with standard errors and confidence intervals are also included in Tables S4-S6. For all statistical analyses, we set 𝛼 = 0.05.
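For readers less familiar with zero-inflated counts, the following is a minimal sketch of the zero-inflated Poisson probability mass function, in which a zero can arise either as a structural zero (with probability pi) or from the Poisson process itself. It is illustrative only and is not the glmmTMB code in the Zenodo repository; the function names and the parameter values are hypothetical.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson.

    lam: Poisson rate of the count process
    pi:  probability of a structural (excess) zero
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

def zip_mean(lam, pi):
    """Expected count under the zero-inflated Poisson."""
    return (1.0 - pi) * lam

def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of observed counts for fixed lam and pi."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)

# Hypothetical values: mean cleaning effort of 2 s when cleaning occurs,
# with 30% of handling bouts involving no cleaning at all
lam, pi = 2.0, 0.3
print(zip_pmf(0, lam, pi), math.exp(-lam))  # ZIP zero mass vs. plain Poisson zero mass
print(zip_mean(lam, pi))
```

The comparison in the first print statement shows how the extra mass at zero exceeds what a plain Poisson with the same rate would predict, which is the feature of the data that motivates a zero-inflated outcome distribution.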
(2) Absence of Links to Foraging Theory
This critique has three components. The first revisits the absence of code for the optimal cleaning time model. This omission was an unfortunate error at the moment of submission, but our code is available now as a Mathematica notebook in Zenodo (https://doi.org/10.5281/zenodo.14002737). The second pivots around our scholarship, admonishing us for failing to acknowledge the marginal value theorem of Charnov (1976). It is a fair point and we have corrected the oversight with a citation to this classic paper. The third criticism is also rooted in scholarship, with Reviewer 1 asking for greater connection to the existing literature on optimal foraging theory, a point echoed in the summary assessment of the editors at eLife. This comment and the weight given to it by eLife’s editors put us in a difficult spot, as our paper is focused on the optimization of delayed gratification, not food acquisition per se. So, we are in the awkward position of gently resisting this recommendation while simultaneously agreeing with Reviewer 1 that we need to better situate our findings in the landscape of existing literature. To thread this needle, we produced Box 2 with a photograph and 410 words. This display box puts our findings into direct conversation with recent research focused on the sunk cost fallacy.
(3) Interpretation and validity of model relative to data
This critique is focused on the simulated brushing and washing results reported in Figure S1, along with its captioning, which was inadequate. We edited the caption to identify the author (JER) who simulated the brushing and washing behaviors of the monkeys. In addition, we clarified the number of brushing replicates (3) and washing replicates (3) for each of three treatments, for a total of 18 simulations.
We followed Reviewer 1’s suggestion, incorporating the experimental uncertainty of grit removal into our optimal cleaning time model. We drew % grit removed values from a distribution, discounting the rare event when values ≥ 100% were drawn. As the % grit removed is used to estimate the cleaning inefficiency parameter 𝑐 for brushing and washing, the included uncertainty now allows us to evaluate these parameters as distributions; and, in turn, obtain a distribution for our predicted brushing and washing optimal cleaning times. As we now describe in the main text, the optimal cleaning times for brushing and washing are 𝑡* = 0.98 ± 0.19 s and 𝑡* = 2.40 ± 0.74 s, respectively. We are grateful for Reviewer 1’s suggestion, for it added valuable context to our model predictions. Notably, the inclusion of experimental uncertainty did not change the qualitative nature of our results, or the interpretations of our model predictions compared to observed cleaning behaviors.
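The sampling step described here, drawing % grit removed values from a distribution and discarding draws of 100% or more, can be sketched as simple rejection sampling. This is an illustration only, not the Mathematica notebook in Zenodo: the normal distribution, the mean and standard deviation, and the function name are all assumptions made for the example, and the mapping from % grit removed to the parameter 𝑐 is deliberately left out because it is not specified in this response.

```python
import random
import statistics

def draw_percent_removed(mean, sd, n, rng):
    """Draw n values of % grit removed, redrawing any value >= 100%.

    Assumes (hypothetically) a normal distribution for the measurement error.
    """
    draws = []
    while len(draws) < n:
        value = rng.gauss(mean, sd)
        if value < 100.0:  # discard the rare impossible draws
            draws.append(value)
    return draws

rng = random.Random(42)  # fixed seed so the sketch is reproducible
# Hypothetical measurement: 90% of grit removed, with a standard deviation of 5%
draws = draw_percent_removed(mean=90.0, sd=5.0, n=10_000, rng=rng)

print(f"mean = {statistics.mean(draws):.2f}, sd = {statistics.stdev(draws):.2f}")
```

Each retained draw would then be passed through the model to obtain one predicted optimum, so that the set of draws yields a distribution of optimal cleaning times that can be summarized as a mean ± standard deviation.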
We chose to exclude variability in handling time h to generate predicted cleaning time optima, at least in the main text. Our reasoning stems from the observation that handling time variability is long-tailed, with the longer handling times associated with behaviors that we do not account for in our analysis. For example, individuals carrying multiple cucumber slices to the ocean were apt to drop them, struggling at times to re-grasp so many at once. Such moments increased handling times substantially. Still, we acted on Reviewer 1’s suggestion, accounting for the tandem effects of handling time variability and uncertainty in % grit removed (see Figure S6). Drawing handling time estimates from a log-normal distribution fitted to the handling time data, we found that these dual sources of uncertainty did not qualitatively change our results. They added further uncertainty to the predicted washing time, but the mean remains roughly equivalent. (We note that brushing is assumed to have a constant handling time––composed of only assessment time and no travel––such that the results for brushing do not change.) Both analyses are included in the Mathematica notebook in Zenodo (https://doi.org/10.5281/zenodo.14002737).
Reviewer #2 (Public Review):
Summary
We have no objections to Reviewer 2’s summary of our manuscript.
Strengths
Reviewer 2 is extremely gracious, and we are grateful for the kind words.
Weaknesses
Reviewer 2 noted that our manuscript failed to provide “sufficient background on [our study] population of animals and their prior demonstrations of food-cleaning behavior or other object-handling behaviors (e.g., stone handling).” To address this comment, we edited the introduction (lines 56-58) to alert readers to the onset of regular food-cleaning behaviors sometime after December 26, 2004. In addition, we edited our methods text (lines 155-160) to highlight the onset and limited scope of prior research with this study population:
“The animals are well habituated to human observers due to regular tourism and sustained study since 2013 (Tan et al., 2018). Most of this research has revolved around stone tool-mediated foraging on mollusks, the only activity known to elicit stone handling (Malaivijitnond et al., 2007; Gumert and Malaivijitnond, 2012, 2013; Tan et al., 2015), although infants and juveniles will sometimes use stones during object play (Tan, 2017). There has been no prior examination of food-cleaning behaviors.”
Reviewer #3 (Public Review):
Reviewer 3 identified three weaknesses, which we address in three paragraphs.
Reviewer 3 questioned our methods for determining rank-dependent differences in cleaning behavior, arguing that our conclusions were unsupported. It is a fair point, and it compelled us to combine males and females into a single standardized ordinal rank of 24 individuals. This unified ranking is now reflected in the x-axes of Figure 2 and Figure S2. Plotting the data this way––see Figure S2––underscores Reviewer 3’s concern that sex and dominance rank are confounding variables. To address this problem, our GLMM included rank and sex as predictor variables, which controls for the effect of sex when assessing the relationship between rank and cleaning time across the three treatments. Reported in Tables S1-S3, these findings show that the effect of sex on either brushing or washing time was not significant. This result bolsters our original contention that rank-related variation in cleaning time overwhelms any sex differences.
Relatedly, Reviewer 3 questioned our conclusions on the effects of rank because our study was focused on a single social group. In other words, it is plausible that our results were heavily influenced by the idiosyncrasies of select individuals, not dominance rank per se. It is a fair point, and it compelled us to include individual ID as a random effect in each of our GLMMs. Including individual ID as a random intercept allowed us to control for inter-individual variation in cleaning duration while assessing the effects of rank. An analysis based on additional social groups or longitudinal data is certainly desirable, but also well beyond the scope of a Short Report for eLife.
Finally, Reviewer 3 objected to fragments of sentences in our abstract, introduction, and discussion, combining them into a criticism of claims that we did not and do not make. It probably wasn’t intentional, but it puts us in the awkward position of deconstructing a strawman:
● Reviewer 3 begins, “there is no evidence presented on the actual fitness-related costs of tooth wear or the benefits of slightly faster food consumption”. This statement is true while insinuating that collecting such evidence was our intent. To be clear, our experiment was never designed to measure tooth wear or reproductive fitness, nor do we make any claims of having done so.
● Reviewer 3 adds, “Support for these arguments is provided based on other papers, some of which come from highly resource-limited populations (and different species). But this is a population that is supplemented by tourists with melons, cucumbers, and pineapples!” We were puzzled over these sentences. The first fails to mention that the citations exist in our discussion. Citing relevant work in a discussion is a basic convention of scientific writing. But it seems the underlying intent of these words is to denigrate the value of our study population because two dozen tourists visit Koram Island once a day. Exclamations to the contrary, the amount of tourist-provisioned food in the diet of any one monkey is negligible.
● Last, Reviewer 3 commented on matters of style, objecting to “overly strong claims.” We puzzled over this criticism because the claims in question are broader points of introduction or discussion, not results. The root problem appears to be the final sentence of our abstract:
“Dominant monkeys abstained from washing, balancing the long-term benefits of mitigating tooth wear against immediate energetic requirements, an essential predictor of reproductive fitness.”
This sentence has three clauses. The first is a statement of results, whereas the second and third are meant to mirror our discussion on the importance of our findings. We combined the concepts into a single concluding sentence for the sake of concision, but we can appreciate how a reader could feel deceived, expecting to see data on tooth wear and fitness. So, our impression is that we are dealing with a simple misunderstanding of our own making, and that this single sentence explains Reviewer 3’s criticism and tone––it cast a long shadow over the substance of our paper. To resolve this problem, we edited the sentence:
“Dominant monkeys abstained from washing, a choice consistent with the impulses of dominant monkeys elsewhere: to prioritize rapid food intake and greater reproductive fitness over the long-term benefits of prolonging tooth function.”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
In their manuscript, Gomez-Frittelli and colleagues characterize the expression of cadherin6 (and -8) in colonic IPANs of mice. Moreover, they found that these cdh6-expressing IPANs are capable of initiating colonic motor complexes in the distal colon, but not proximal and midcolon. They support their claim by morphological, electrophysiological, optogenetic, and pharmacological experiments.
Strengths:
The work is very impressive and involves several genetic models and state-of-the-art physiological setups including respective controls. It is a very well-written manuscript that truly contributes to our understanding of GI-motility and its anatomical and physiological basis. The authors were able to convincingly answer their research questions with a wide range of methods without overselling their results.
We greatly appreciate the reviewer’s time, careful reading and support of our study.
Weaknesses:
The authors put quite some emphasis on stating that cdh6 is a synaptic protein (in the title and throughout the text), which interacts in a homophilic fashion. They deduce that cdh6 might be involved in IPAN-IPAN synapses (line 247ff.). However, Cdh6 does not only interact in synapses and is expressed by non-neuronal cells as well (see e.g., expression in the proximal tubuli of the kidney). Moreover, cdh6 does not only build homodimers, but also heterodimers with Cdh9 as well as Cdh7, -10, and -14 (see e.g., Shimoyama et al. 2000, DOI: 10.1042/0264-6021:3490159). It would therefore be interesting to assess the expression pattern of cdh6 proteins using immunostainings in combination with synaptic markers to substantiate the authors' claim or at least add the possibility of cell-cell-interactions other than synapses to the discussion. Additionally, an immunostaining of cdh6 would confirm if the expression of tdTomato in smooth muscle cells of the cdh6-creERT model is valid or a leaky expression (false positive).
We agree with the reviewer that Cdh6 could be mediating some other cell-cell interaction besides synapses between IPANs, and we noted it in the discussion. Cdh6 primarily forms homodimers but, as the reviewer points out, has been known to also form heterodimers with some other cadherins. We performed RNAscope in the colonic myenteric plexus with Cdh7 and found no expression (data not shown). Cdh10 is suggested to have very low expression (Drokhlyansky et al., 2020), possibly in putative secretomotor vasodilator neurons, and Cdh14 has not been assayed in any RNAseq screens. We attempted to visualize Cdh6 protein via antibody staining (Duan et al., 2018) but our efforts did not result in sufficient signal or resolution to identify synapses in the ENS, which remain broadly challenging to assay. Similarly, neither immunostaining with Cdh6 antibody nor RNAscope was able to confirm Cdh6 expression in tdT-expressing muscle cells. We have addressed these caveats in the discussion section.
(1) E. Drokhlyansky, C. S. Smillie, N. V. Wittenberghe, M. Ericsson, G. K. Griffin, G. Eraslan, D. Dionne, M. S. Cuoco, M. N. Goder-Reiser, T. Sharova, O. Kuksenko, A. J. Aguirre, G. M. Boland, D. Graham, O. Rozenblatt-Rosen, R. J. Xavier, A. Regev, The Human and Mouse Enteric Nervous System at Single-Cell Resolution. Cell 182, 1606-1622.e23 (2020).
(2) X. Duan, A. Krishnaswamy, M. A. Laboulaye, J. Liu, Y.-R. Peng, M. Yamagata, K. Toma, J. R. Sanes, Cadherin Combinations Recruit Dendrites of Distinct Retinal Neurons to a Shared Interneuronal Scaffold. Neuron 99, 1145-1154.e6 (2018).
Reviewer #2 (Public review):
Summary:
Intrinsic primary afferent neurons are an interesting population of enteric neurons that transduce stimuli from the mucosa, initiate reflexive neurocircuitry involved in motor and secretory functions, and modulate gut immune responses. The morphology, neurochemical coding, and electrophysiological properties of these cells have been relatively well described in a long literature dating back to the late 1800's but questions remain regarding their roles in enteric neurocircuitry, potential subsets with unique functions, and contributions to disease. Here, the authors provide RNAscope, immunolabeling, electrophysiological, and organ function data characterizing IPANs in mice and suggest that Cdh6 is an additional marker of these cells.
Strengths:
This paper would likely be of interest to a focused enteric neuroscience audience and increase information regarding the properties of IPANs in mice. These data are useful and suggest that prior data from studies of IPANs in other species are likely translatable to mice.
We appreciate the reviewer’s support of our study and insightful critiques for its improvement.
Weaknesses:
The advance presented here beyond what is already known is minimal. Some of the core conclusions are overstated and there are multiple other major issues that limit enthusiasm. Key control experiments are lacking and data do not specifically address the properties of the proposed Cdh6+ population.
Major weaknesses:
(1) The novelty of this study is relatively low. The main point of novelty suggests an additional marker of IPANs (Cdh6) that would add to the known list of markers for these cells. How useful this would be is unclear. Other main findings basically confirm that IPANs in mice display the same classical characteristics that have been known for many years from studies in guinea pigs, rats, mice and humans.
We appreciate the already existing markers for IPANs in the ENS and the existing literature characterizing these neurons. The primary intent of this study was to use these well-established characteristics of IPANs in both mice and other species to characterize Cdh6-expressing neurons in the mouse myenteric plexus and confirm their classification as IPANs.
(2) Some of the main conclusions of this study are overstated and claims of priority are made that are not true. For example, the authors state in lines 27-28 of the abstract that their findings provide the "first demonstration of selective activation of a single neurochemical and functional class of enteric neurons". This is certainly not true since Gould et al (AJP-GIL 2019) expressed ChR2 in nitrergic enteric neurons and showed that activating those cells disrupted CMC activity. In fact, prior work by the authors themselves (Hibberd et al., Gastro 2018) showed that activating calretinin neurons with ChR2 evoked motor responses. Work by other groups has used chemogenetics and optogenetics to show the effects of activating multiple other classes of neurons in the gut.
We thank the reviewer for bringing up this important point and apologize if our wording was not clear. Whilst single neurochemical classes of enteric neurons have been manipulated to alter gut functions, none of these instances to date represents manipulation of a single functional class of enteric neurons. In the given examples, multiple functional classes that use the same neurotransmitter are activated, as NOS and calretinin are each expressed to varying degrees across putative motor neurons, interneurons and IPANs. In contrast, Cdh6 is restricted to IPANs and therefore this study is the first optogenetic investigation of enteric neurons from a single putative functional class. Our abstract and discussion emphasize this point and differentiate this study from previous ones.
(3) Critical controls are needed to support the optogenetic experiments. Control experiments are needed to show that ChR2 expression a) does not change the baseline properties of the neurons, b) that stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons, and c) that stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions focused on here.
We completely agree that controls are essential. However, our paper is not the first to express ChR2 in enteric neurons. Authors of our paper showed in Hibberd et al. 2018 that expression of ChR2 in a heterogeneous population of myenteric neurons did not change network properties of the myenteric plexus. This was demonstrated by the lack of change in control CMC characteristics in mice expressing ChR2 under basal conditions (without blue light exposure). Regarding question (b), whether stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons, we show the restricted expression of ChR2 in IPANs and that motor responses (to blue light) are blocked by selective nerve conduction blockade.
Regarding question (c), whether stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions, we would not expect each region of the gut to behave comparably. This is because the different gut regions (i.e. proximal, mid, distal) are very different anatomically, as is the anatomy of the myenteric plexus and myenteric ganglia between regions, including the density of IPANs within each ganglion, in addition to the presence of different patterns of electrical and mechanical activity [Spencer et al., 2020]. Hence, stimulation of ChR2 should not be expected to induce similar physiological responses between regions. The motor output we record in our study (CMCs) is a unified motor program that involves the temporal coordination of hundreds of thousands of enteric neurons and a complex neural circuit that we have previously characterized [Spencer et al., 2018]. However, no study until now has been able to selectively stimulate a single functional class of enteric neurons (with light) and so avoid indiscriminate activation of other classes of neurons.
(1) T. J. Hibberd, J. Feng, J. Luo, P. Yang, V. K. Samineni, R. W. Gereau, N. Kelley, H. Hu, N. J. Spencer, Optogenetic Induction of Colonic Motility in Mice. Gastroenterology 155, 514-528.e6 (2018).
(2) N. J. Spencer, L. Travis, L. Wiklendt, T. J. Hibberd, M. Costa, P. Dinning, H. Hu, Diversity of neurogenic smooth muscle electrical rhythmicity in mouse proximal colon. American Journal of Physiology-Gastrointestinal and Liver Physiology 318, G244–G253 (2020).
(3) N. J. Spencer, T. J. Hibberd, L. Travis, L. Wiklendt, M. Costa, H. Hu, S. J. Brookes, D. A. Wattchow, P. G. Dinning, D. J. Keating, J. Sorensen, Identification of a Rhythmic Firing Pattern in the Enteric Nervous System That Generates Rhythmic Electrical Activity in Smooth Muscle. The Journal of Neuroscience 38, 5507–5522 (2018).
(4) The electrophysiological characterization of mouse IPANs is useful but this is a basic characterization of any IPAN and really says nothing specifically about Cdh6+ neurons. The electrophysiological characterization was also only done in a small fraction of colonic IPANs, and it is not clear if these represent cell properties in the distal colon or proximal colon, and whether these properties might be extrapolated to IPANs in the different regions. Similarly, blocking IH with ZD7288 affects all IPANs and does not add specific information regarding the role of the proposed Cdh6+ subtype.
Our electrophysiological characterization was targeted to a subset of Cdh6+ neurons by Hb9:GFP expression. As in our response to comment (1) above, we used these experiments to confirm classification of Cdh6+ (Hb9:GFP+) neurons in the distal colon as IPANs. We have clarified in the results and methods that these experiments were performed in the distal colon and agree that we cannot extrapolate that these properties are also representative of IPANs in the proximal colon. We apologize that this was confusing. Finally, we agree with the reviewer that ZD7288 affects all IPANs in the ENS and have clarified this in the text.
(5) Why SMP IPANs were not included in the analysis of Cdh6 expression is a little puzzling. IPANs are present in the SMP of the small intestine and colon, and it would be useful to know if this proposed marker is also present in these cells.
We agree with the reviewer. In addition to characterizing Cdh6 in the myenteric plexus, it would be interesting to query if sensory neurons located within the SMP also express Cdh6. Our preliminary data (n=2) show ~6-12% tdT/Hu neurons in Cdh6-tdT ileum and colon (data not shown). We have added a sentence to the discussion.
(6) The emphasis on IH being a rhythmicity indicator seems a bit premature. There is no evidence to suggest that IH and IT are rhythm-generating currents in the ENS.
We agree with the reviewer that rhythm generation by IH and IT in the ENS has not been explicitly confirmed. We are confident the reviewer agrees that an absence of evidence is not evidence of absence, and we note that the presence of IH has been well described in enteric neurons. We have modified the text in the results to indicate more clearly that IH and IT are known to participate in rhythm generation in thalamocortical circuits, though their roles in the ENS remain unknown. Our discussion of the potential role of IH or IT in rhythm generation or oscillatory firing of the ENS is confined to speculation in the discussion section of the text.
(7) As the authors point out in the introduction and discuss later on, Type II Cadherins such as Cdh6 bind homophilically to the same cadherin at both pre- and post-synapse. The apparent enrichment of Cdh6 in IPANs would suggest extensive expression in synaptic terminals that would also suggest extensive IPAN-IPAN connections unless other subtypes of neurons express this protein. Such synaptic connections are not typical of IPANs and raise the question of whether or not IPANs actually express the functional protein and if so, what might be its role. Not having this information limits the usefulness of this as a proposed marker.
We agree with the reviewer that the proposed IPAN-IPAN connection is novel although it has been proposed before (Kunze et al., 1993). As detailed in our response to Reviewer #1, we attempted to confirm Cdh6 protein expression, but were unsuccessful, due to insufficient signal and resolution. We therefore discuss potential IPAN interconnectivity in the discussion, in the context of contrasting literature.
(1) W. A. A. Kunze, J. B. Furness, J. C. Bornstein, Simultaneous intracellular recordings from enteric neurons reveal that myenteric ah neurons transmit via slow excitatory postsynaptic potentials. Neuroscience 55, 685–694 (1993).
(8) Experiments shown in Figures 6J and K use a tethered pellet to drive motor responses. By definition, these are not CMCs as stated by the authors.
The reviewer makes a valid criticism as to the terminology, since tethered pellet experiments do not record propagation. We believe the periodic bouts of propulsive force on the pellet are triggered by the same activity underlying the CMC. In our experience, these activities have similar periodicity and force, and identical pharmacological properties. Consistent with this, we also tested full colons (n = 2) set up for typical CMC recordings by multiple force transducers, finding that CMCs were abolished by ZD7288, similar to fixed pellet recordings (data not shown).
(9) The data from the optogenetic experiments are difficult to understand. How would stimulating IPANs in the distal colon generate retrograde CMCs and stimulating IPANs in the proximal colon do nothing? Additional characterization of the Cdh6+ population of cells is needed to understand the mechanisms underlying these effects.
We agree that the different optogenetic responses in the proximal and distal colon are challenging to interpret, but perhaps not surprising in the wider context. It is possible that the different optogenetic responses in this study reflect not only regional differences in the Cdh6+ neuronal populations, but also differences in neural circuits within these gut regions. An earlier study by the authors showed that electrical stimulation of the proximal mouse colon was unable to evoke a retrograde (aborally) propagating CMC (Spencer, Bywater, 2002), whereas stimulation of the distal colon readily did so. We concluded that at the oral lesion site there is a preferential bias of descending inhibitory nerve projections, since the ascending excitatory pathways have been cut off. In contrast, stimulation of the distal colon was readily able to activate an ascending excitatory neural pathway, and hence induce the complex CMC circuits required to generate an orally propagating CMC. Indeed, other recent studies have added to a growing body of evidence for significant differences in the behaviors and neural circuits of the two regions (Li et al., 2019, Costa et al., 2021a, Costa et al., 2021b, Nestor-Kalinoski et al., 2022). We have expanded this discussion.
(1) N. J. Spencer, R. A. Bywater, Enteric nerve stimulation evokes a premature colonic migrating motor complex in mouse. Neurogastroenterology & Motility 14, 657–665 (2002).
(2) Li Z, Hao MM, Van den Haute C, Baekelandt V, Boesmans W, Vanden Berghe P, Regional complexity in enteric neuron wiring reflects diversity of motility patterns in the mouse large intestine. Elife 8:e42914 (2019).
(3) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Dinning PG, Brookes SJ, Spencer NJ, Motor patterns in the proximal and distal mouse colon which underlie formation and propulsion of feces. Neurogastroenterology & Motility e14098 (2021a).
(4) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Smolilo DJ, Dinning PG, Brookes SJ, Spencer NJ, Characterization of alternating neurogenic motor patterns in mouse colon. Neurogastroenterology & Motility 33:e14047 (2021b).
(5) Nestor-Kalinoski A, Smith-Edwards KM, Meerschaert K, Margiotta JF, Rajwa B, Davis BM, Howard MJ, Unique Neural Circuit Connectivity of Mouse Proximal, Middle, and Distal Colon Defines Regional Colonic Motor Patterns. Cellular and Molecular Gastroenterology and Hepatology 13:309-337.e303 (2022).
Recommendations for the Authors:
Reviewer #1 (Recommendations for the authors):
As mentioned above, immunolocalization of Cdh6 would be helpful to substantiate the claims regarding IPAN-IPAN synapses.
As mentioned in our response to both reviewers’ public reviews, we attempted to visualize Cdh6 protein via antibody staining (Duan et al., 2018), but our efforts did not result in sufficient signal or resolution to identify Cdh6+ synapses.
(1) X. Duan, A. Krishnaswamy, M. A. Laboulaye, J. Liu, Y.-R. Peng, M. Yamagata, K. Toma, J. R. Sanes, Cadherin Combinations Recruit Dendrites of Distinct Retinal Neurons to a Shared Interneuronal Scaffold. Neuron 99, 1145-1154.e6 (2018).
Reviewer #2 (Recommendations for the authors):
(1) The authors repeatedly refer to IPANs as "sensory" neurons (e.g. in title, abstract, and introduction) but there is some debate regarding whether these cells are truly "sensory" because the information they convey never reaches sensory perception. This is why they have classically been referred to as intrinsic primary afferent (IPAN) neurons. It would be more appropriate to stick with this terminology unless the authors have compelling data showing that information detected by IPANs reaches the sensory cortex.
We thank the reviewer for their comment, but respectfully disagree. The term “sensory neuron” is well established in the ENS. The first definitive proof that “sensory neurons” exist in the ENS was published in Kunze et al., 1995. We note that this paper did not use the word “IPAN” but used the term “sensory neuron”. Furthermore, mechanosensory neurons were described in Spencer and Smith (2004).
Regarding the reviewer’s comment that the authors would need compelling data showing that information detected by IPANs reaches the sensory cortex before the term “sensory neuron” should be valid, it is important to note that many sensory neurons do not provide direct information to the cortex.
(1) W. A. A. Kunze, J. C. Bornstein, J. B. Furness, Identification of sensory nerve cells in a peripheral organ (the intestine) of a mammal. Neuroscience 66, 1–4 (1995).
(2) N. J. Spencer, T. K. Smith, Mechanosensory S-neurons rather than AH-neurons appear to generate a rhythmic motor pattern in guinea-pig distal colon. The Journal of Physiology 558, 577–596 (2004).
(2) Important information regarding the gut region shown and other details are absent from many figure legends.
We apologize for this omission. We have updated the figure legends to include information on gut regions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Thank you for the constructive feedback from the reviewers. We are grateful for their insights and are committed to addressing the key concerns raised in the public reviews through the following revisions:
(1) Validating Axoneme Stability Claims
We have procured new antibodies for DRC11, as well as marker proteins for ODA, IDA, and RS. We will conduct quantitative immunofluorescence staining to validate our claims regarding axoneme stability.
(2) Investigating ANKRD5 Expression in Other Ciliated Cells
We plan to examine the expression of ANKRD5 in mouse respiratory cilia to determine whether it is also expressed in these cells.
(3) Supplementing Key Citations for N-DRC Components
We will add references to published studies on N-DRC components (e.g., DRC1, DRC2, DRC3, DRC5) associated with male infertility in the Introduction to strengthen the background context.
(4) Further Analysis and Validation of ANKRD5 Interactome
We will conduct additional analyses and validation of the interactome of ANKRD5 detected by LC-MS.
(5) Elucidating the Function of ANKRD5 in Mitochondria
We will further investigate the role of ANKRD5 in mitochondrial function.
(6) Investigating Mitochondrial Function and Energy Metabolism
We will further explore the role of ANKRD5 in mitochondrial function and energy metabolism.
(7) Improving Cryo-ET Data Quality and Interpretation
We will attempt to further improve the quality of the STA results and try to calculate the DMT structure with a period of 96 nm. We will also use the WT density map with the same period to generate a difference map.
(8) Expanding Discussion and Correcting Terminology
The Discussion section will be revised to elaborate on the implications of ANKRD5 for male contraceptive research, particularly in targeting sperm motility. We will also correct terminology inaccuracies (e.g., changing "9+2 microtubule doublet" to "9+2 structure") and address formatting issues (e.g., capitalizing "Control").
Response to Reviewer #2 Comment 4:
We appreciate the reviewer's careful consideration of our proteomic data. However, our Gene Set Enrichment Analysis (GSEA) of glycolysis/gluconeogenesis pathways showed no significant enrichment (p-value = 0.089, NES = 0.708; Fig. 6D), which does not meet the statistical thresholds for biological significance (|NES| > 1, p-value < 0.05). This observation is further corroborated by our direct ATP measurements showing no difference between genotypes (Fig. 6E). We agree that further studies on metabolic regulation could be valuable, but current evidence does not support glycolysis disruption as a primary mechanism for the motility defects observed in Ankrd5-null sperm. This misinterpretation likely arose from the reviewer's overinterpretation of non-significant proteomic trends. We request that this specific claim be excluded from the assessment to avoid misleading readers.
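The threshold comparison in this response can be written as a short check. This is a minimal sketch, not the authors' analysis code: the function name `meets_gsea_thresholds` is hypothetical, while the cutoffs (|NES| > 1, p-value < 0.05) and the reported values (NES = 0.708, p-value = 0.089) are taken from the response itself.

```python
def meets_gsea_thresholds(nes: float, p_value: float,
                          nes_cutoff: float = 1.0,
                          p_cutoff: float = 0.05) -> bool:
    """Return True only if both cutoffs quoted in the response are met.

    The default cutoffs mirror the response (|NES| > 1, p-value < 0.05);
    the function itself is a hypothetical helper for illustration.
    """
    return abs(nes) > nes_cutoff and p_value < p_cutoff


# Values reported for the glycolysis/gluconeogenesis gene set (Fig. 6D).
glycolysis_enriched = meets_gsea_thresholds(nes=0.708, p_value=0.089)
print(glycolysis_enriched)
```

Both conditions must hold for a gene set to pass, so failing either cutoff is enough for the check to return False.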
We will provide a comprehensive point-by-point response, along with detailed experimental data and revised figures, in the resubmitted manuscript. Thank you once again for the opportunity to address the reviewers' concerns. We are confident that these revisions will strengthen our manuscript and contribute to the scientific community.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this study, the authors examined the role of Afadin, a key adaptor protein associated with cell-adhesion molecules, in retinal development. Using a conditional knockout mouse line (Six3-Cre; AfadinF/F), the authors successfully characterized a disorganized pattern of various neuron types in the mutant retinae. Despite these altered distributions, the retinal neurons maintained normal cell numbers and seemingly preserved some synaptic connections. Notably, tracing results indicated mistargeting of retinal ganglion cell (RGC) axon projections to the superior colliculus, and electroretinography (ERG) analyses suggested deficits in visual functions.
Thank you for the summary and highlights of our study. We appreciate the input from Reviewer 1 and the Editor on this study, with focus on laminar choices, synaptic choices and axonal projections.
Strengths:
This compelling study provides solid evidence addressing the important question of how cell-adhesion molecules influence neuronal development. Compared to previous research conducted in other parts of the central nervous system (CNS), the clearly defined lamination of cell types in the retina serves as a unique model for studying the aberrant neuronal localizations caused by Afadin knockout. The data suggest that cell-cell interactions are critical for retinal cellular organization and proper axon pathfinding, while aspects of cell fate determination and synaptogenesis remain less understood. This work has broad implications not only for retinal studies but also for developmental biology and regenerative medicine.
Weaknesses:
While the phenotypes observed in the Afadin knockout (cKO) mice are intriguing, I would expect to see evidence confirming that Afadin is indeed knocked out in the retina through immunostaining. Specifically, is Afadin knocked out only in certain retinal regions and not others, as suggested by Figures 4A-B? Are Afadin levels different among distinct neuron types, which could mean that its knockout may have a more pronounced impact on certain cell types, such as rods compared to others?
The authors suggest that synapses may form between canonical synaptic partners, based on the proximity of their processes (Figure 2). However, more solid evidence is needed to verify these synapses through the use of synaptic marker staining or transsynaptic labeling before drawing further conclusions.
Although the Afadin cKO mice displayed dramatic phenotypes, additional experiments are necessary to clarify the details of this process. By manipulating Afadin levels in specific cell types or at different developmental time points, we could gain a better understanding of how Afadin regulates accurate retinal lamination and axonal projection.
Regarding antibody confirmation of the knockout, we tested the commercially available antibody from Sigma but were not able to confirm its specificity. A homemade antibody exists in another, Japan-based laboratory, but it was not available for sharing at the time the study was conducted. Nonetheless, the original allele was derived for hippocampal and cortical studies by Louis Reichardt’s Lab (UCSF), with verified efficacy of the KO allele.
Regarding phenotypical penetrance, this likely arises from the mosaicism of the clone and the symmetric cell division, leading to a rosette-like structure. At this moment, we reason that Afadin KO does NOT lead to direct neuronal loss, and the selective rod loss may derive from other issues, but we lack direct evidence to validate this point.
With regard to the specific neuronal types and synaptic pairs, we acknowledge the limitations of the current Figure 2 in linking the mutant phenotypes to circuit changes. However, the current genetic reagents (Six3Cre) are not compatible with neuron-type-specific synaptic labeling; that is, cell-type-specific Cre lines and additional Cre-dependent AAV tools would be needed. To do so, we would need to initiate cell-type-specific breeding of transgenic markers such as Hb9GFP for ooDSGCs, or Chat-Cre and VGlut3-Cre for starburst amacrine cells and vG3 amacrine cells, followed by retinal physiology. These experiments require multi-allelic genetic crosses with a very low breeding yield (1/16 or 1/32 Mendelian ratio). These extensive genetic tests are beyond the scope of the current manuscript.
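The quoted breeding yields follow from multiplying per-locus inheritance probabilities. The sketch below is illustrative only: it assumes independently segregating loci, each contributing the required genotype with probability 1/2, which is a simplification and not the authors' stated breeding scheme. The function name `expected_yield` is hypothetical.

```python
from fractions import Fraction


def expected_yield(n_loci: int, per_locus: Fraction = Fraction(1, 2)) -> Fraction:
    """Fraction of offspring carrying the required genotype at every locus.

    Assumes the loci segregate independently and that each contributes
    the required genotype with the same probability (an assumption made
    for illustration only).
    """
    return per_locus ** n_loci


# Four or five independent requirements give the ratios quoted above.
for n in (4, 5):
    print(n, expected_yield(n))
```

Under these assumptions, each additional required allele halves the expected fraction of usable offspring, which is why multi-allelic crosses quickly become impractical.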
Reviewer #2 (Public review):
Summary:
This study by Lum and colleagues reports on the role of Afadin, a cytosolic adapter protein that organizes multiple cell adhesion molecule families, in the generation and maintenance of complex cellular layers in the mouse retina. They used a conditional deletion approach, removing Afadin in retinal progenitors, and allowing them to analyze broad effects on retinal neuron development.
The study presents high-quality and extensive characterization of the cellular phenotypes, supporting the main conclusions of the paper. They show that Afadin loss results in significant disorganization of the retinal cellular layers and the neuropil, producing rosettes and displacement of cells away from their resident layers. The major classes of neurons in the inner retina are affected, and some neurons are, remarkably, displaced to the other side of the inner plexiform layer. Nevertheless, they mostly target their synaptic partners, including the RGCs to distant retinorecipient targets in the brain. The main conclusions are as follows. Afadin is necessary for establishing and maintaining the retinal architecture. It is not necessary for the generation of the correct numbers/densities of retinal neuron subtypes. Moreover, Afadin loss preserves associations between known synaptic partners and preserves axonal targeting to retinorecipient layers. The consequences on photoreceptor viability and visual processing are also interesting, underscoring the essential function for maintaining retinal structure and function. Overall, the main conclusions describing the consequences are supported by the results.
Strengths:
The study provides new knowledge on the requirement of Afadin in retinal development. The introduction and discussion effectively set up the rationale for this work, and place it in the context of previous studies of Afadin in other regions of the CNS.
The study presents high-quality and extensive characterizations of the cellular phenotypes resulting from Afadin loss. By analyzing various aspects of retinal organization - from cellular densities to axon targeting to brain - the study narrows down the role of a structure for promoting the establishment of the layers, or maintenance. The data are straightforward and convincing, and the interpretations are bounded by the data shown (though minor weakness re. survival). Another important finding is that the targeting of retinal neuron processes to synaptic partners, including retinorecipient targets in the brain, is intact.
The study is important as it establishes a focused requirement for Afadin to set up and preserve the overall cellular organizations within the retinal tissue. The demonstration that Afadin is needed for photoreceptor viability and overall visual function enhances impact by establishing its functional importance.
The manuscript is well-written and presented. The images are attractive and compelling, and the figures are well organized.
Thank you for your high praise on the logic, data presentation, and significance of the current manuscript. We appreciate your comments on the novelty and impact of our study using retinal circuits as a model.
Weaknesses:
(1) Expanding on the developmental mechanism is beyond the scope of the study, and would not add to the main conclusions. However, the manuscript would be improved by providing more clarity on the developmental emergence of the defects. The study left me questioning whether the rosettes and cell displacements occur during earlier stages of retina development, or are progressive. For instance, do the RGCs migrate and establish within the GCL correctly at first, and then are displaced with the progressive disorganization? Or are they disorganized and delaminate en route? Images of RGC staining at P0, or earlier during their migration, would be informative. Data in Figure 1 is limited to DAPI staining at P7. Figure 4 shows an image of rod photoreceptors at P7, with their displacement in the GCL layer (and not contained within a rosette). Are the progenitors mislocalized due to delamination? A few additional thoughts on how these defects compare to other mutants with rosettes might give us more context for understanding the results.
We chose P7 as our focus due to the lamination in controls. In the revised manuscript, we plan to include earlier time points, as suggested by the reviewer. The data in Figure 1 at P7 use well-established cell type markers (RBPMS, Chx10, Ap2α) and are not limited to DAPI. Additionally, we will revise the discussion section and place our mutant analyses in the context of other mutants with rosettes (beta-catenin, etc.) in the retina. Finally, we will address the comment on progenitor lamination by exploring earlier developmental time points.
(2) The manuscript reports that the densities of major inner retinal classes are unaffected. There are a few details missing for this point. How were the cell densities quantified (in terms of ROI size), and normalized? This information is lacking in the methods. There is a striking thickening of the GCL in the DAPI-labeled images shown in Figure 1. What are these cells?
We will revise the manuscript, particularly the methods section, to address these comments. Additionally, we will specify the ROI size and the normalization used. The cells in the thickened GCL were identified as displaced amacrine cells and bipolar cells.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1:
Summary:
The authors address the role of the centromere histone core in force transduction by the kinetochore.
Strengths:
They use a hybrid DNA sequence that combines CDEII and CDEIII as well as Widom 601 so they can make stable histones for biophysical studies (provided by the Widom sequence) and maintain features of the centromere (CDE II and III).
Weaknesses:
The main results are shown in one figure (Figure 2). Indeed the Centromere core of Widom and CDE II and III contribute to strengthening the binding force for the OA-beads. The data are very nicely done and convincingly demonstrate the point. The weakness is that this is the entire paper. It is certainly of interest to investigators in kinetochore biology, but beyond that, the impact is fairly limited in scope.
This reviewer might have missed that this is a Research Advance, not an article. Research Advances are limited in scope by definition and provide a new development that builds on research reported in a prior paper. They can be of any length. Our Research Advance builds on our prior work, Hamilton et al., 2020 and provides the new result that native centromere sequences strengthen the attachment of the kinetochore to the nucleosome.
Reviewer #2:
Summary:
This paper provides a valuable addendum to the findings described in Hamilton et al. 2020 (https://doi.org/10.7554/eLife.56582). In the earlier paper, the authors reconstituted the budding yeast centromeric nucleosome together with parts of the budding yeast kinetochore and tested which elements are required and sufficient for force transmission from microtubules to the nucleosome. Although budding yeast centromeres are defined by specific DNA sequences, this earlier paper did not use centromeric DNA but instead the generic Widom 601 DNA. The reason is that it has so far been impossible to stably reconstitute a budding yeast centromeric nucleosome using centromeric DNA.
In this new study, the authors now report that they were able to replace part of the Widom 601 DNA with centromeric DNA from chromosome 3. This makes the assay more closely resemble the in vivo situation. Interestingly, the presence of the centromeric DNA fragment makes one type of minimal kinetochore assembly, but not the other, withstand stronger forces.
We thank the reviewer for their careful and positive assessment of our work.
Which kinetochore assembly turned out to be affected was somewhat unexpected, and can currently not be reconciled with structural knowledge of the budding yeast centromere/kinetochore. This highlights that, despite recent advances (e.g. Guan et al., 2021; Dendooven et al., 2023), aspects of budding yeast kinetochore architecture and function remain to be understood and that it will be important to dissect the contributions of the centromeric DNA sequence.
We couldn’t agree more.
Given the unexpected result, the study would become yet more informative if the authors were able to pinpoint which interactions contribute to the enhanced force resistance in the presence of centromeric DNA.
Strength:
The paper demonstrates that centromeric DNA can increase the attachment strength between budding yeast microtubules and centromeric nucleosomes.
Weakness:
How centromeric DNA exerts this effect remains unclear.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
eLife Assessment
In this work, the authors use a Drosophila adult ventral nerve cord injury model extending and confirming previous observations; this important study reveals key aspects of adult neural plasticity. Taking advantage of several genetic reporter and fate tracing tools, the authors provide solid evidence for different forms of glial plasticity, that are increased upon injury. The data on detected plasticity under physiologic conditions and especially the extent of cell divisions and cell fate changes upon injury would benefit from validation by additional markers. The experimental part would improve if strengthened and accompanied by a more comprehensive integration of results regarding glial reactivity in the adult CNS.
Thank you very much for your thoughtful comments and constructive feedback regarding our manuscript. We appreciate all the positive remarks on the significance of our findings on neural plasticity in this Drosophila adult ventral nerve cord injury model.
In response to your suggestion, we fully agree that the continuation of this project should address cell fate changes in detail with additional markers, if available, or with an “omic” approach such as scRNAseq. Unfortunately, these further experiments are beyond the scope of this paper, which aims to describe the in vivo phenomena of cell reprogramming and the cellular events that lead glial cells to convert into neurons or neuronal precursors.
Additionally, we agree that the experimental part can be further improved by providing a more comprehensive integration of our results with current knowledge on glial reactivity in the adult CNS. We will revise the manuscript accordingly to include a deeper discussion of the broader implications of our findings and their alignment with existing literature.
Thank you again for your valuable input, which will undoubtedly enhance the quality of our work. We look forward to submitting the revised manuscript for your consideration.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Casas-Tinto et al. present convincing data that injury of the adult Drosophila CNS triggers transdifferentiation of glial cells and even the generation of neurons from glial cells. This observation opens up the possibility to get a handle on the molecular basis of neuronal and glial generation in the vertebrate CNS after traumatic injury caused by stroke or crush injury. The authors use an array of sophisticated tools to follow the development of glial cells at the injury site in very young and mature adults. The results in mature adults reveal a remarkable plasticity in the fly CNS and dispel the notion that repair after injury may be only possible in nerve cords which are still developing. The observation of so-called VC cells, which do not express the glial marker repo, could point to the generation of neurons by former glial cells.
Conclusion:
The authors present an interesting story which is technically sound and could form the basis for an in-depth analysis of the molecular mechanism driving repair after brain injury in Drosophila and vertebrates.
Strengths:
The evidence for transdifferentiation of glial cells is convincing. In addition, the injury to the adult CNS shows an inherent plasticity of the mature ventral nerve cord which is unexpected.
Weaknesses:
Traumatic brain injury in Drosophila has been previously reported to trigger mitosis of glial cells and generation of neural stem cells in the larval CNS and the adult brain hemispheres. Therefore this report adds to but does not significantly change our current understanding. The origin and identity of VC cells are still unclear. The authors show that VC cells are not GABAergic or glutamatergic. Yet, there are many other neurotransmitters and neuropeptides. It would have been nice to see a staining with another general neuronal marker such as anti-Syt1 to confirm the neuronal identity of VC cells.
We thank the reviewer for the constructive comments and positive feedback. We concur that previous studies have demonstrated glial cell proliferation in response to CNS injury. In contrast, our study focuses on glial transdifferentiation, which emerges as a novel phenomenon, particularly in response to injury. We found that neuropile glia lose their glial identity and express the pan-neuronal marker Elav. To investigate the functional identity of these newly observed Elav-positive cells, we employed anti-ChAT, anti-GABA and anti-GluRIIA antibodies. In addition, we stained them for other neuronal markers such as Enabled, Gigas or Dac (not shown); however, our attempts yielded limited success. To address this, we have now included a discussion section exploring the potential identity of these cells, considering the possibility that they may represent immature neurons.
Reviewer #2 (Public review):
Summary:
Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is specifically favoured following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.
Strengths:
This study highlights a new facet of adult nervous system plasticity at the level of the ventral nerve cord, supporting the view that proliferative capacity is maintained in the mature CNS and stimulated upon injury.
The injury paradigm is well chosen, as the organization of the neuromeres allows specific targeting of one segment while the remaining segments stay intact, with the potential to later link observed plasticity to behaviour such as locomotion.
Numerous experiments have been carried out in 7-day-old flies, showing that the observed plasticity is not due to residual developmental remodelling or a still immature VNC.
By elegantly combining different methods, the authors show glial divisions including with mitotic-dependent tracing and find that the number of generated glia is refined by apoptosis later on.
The work identifies prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.
We would like to thank the reviewer for his/her comments and the positive analysis of this work.
Weaknesses:
The authors observe consistent inter-conversion of EG to ALG glial subtypes that is further stimulated upon injury. The authors conclude that these findings have important consequences for CNS regeneration and potentially for memory and learning. However, it remains somewhat unclear how glial transformation could contribute to regeneration and functional recovery.
This is an ongoing question in the laboratory and in the field. We know that glial cells contribute to the regenerative program in the nervous system, and that molecular signalling in glial cells is a determinant of functional recovery (Losada-Perez et al 2021). Therefore, we include this concept in the discussion, as the evidence indicates that glial cells participate in these programs. However, further investigation is required to clarify and determine the mechanisms underlying this glial contribution. To determine if glial-to-neuron transformation contributes to functional recovery, we would need to compare the recovery of animals with new VC to animals without VC. However, the molecular mechanism that produces this change of identity is still unknown, and therefore we are not able to generate injured flies with no new VC.
The signal of the Fucci cell cycle reporter seems more complex to interpret based on the panels provided compared to the other methods employed by the authors to assess cell divisions.
We agree that Fly Fucci is a genetic reporter that might be more complex to interpret than EdU staining or other markers. However, glial cell proliferation is a key point of this manuscript, and we used the different available tools to confirm our results. We have revised this specific section to ensure that the text is clear and straightforward.
Elav+ cells originating from glia do not express markers for mature neurons at the analysed time-point. If they will eventually differentiate or what type of structure is formed by them will have to be followed up in future studies.
We fully agree with the reviewer, and we will analyze later time points to study neuronal fate and contribution to VNC function.
Context/Discussion
There is some lack of connecting or later comparing the observed forms of glial plasticity in the VNC with respect to plasticity described in the fly brain.
Highlighting some differences in the reactiveness of glia in the VNC compared to the brain could point to relevant differences in repair capacity in different areas of the CNS.
Based on the assays employed, the study points to a significant amount of glial "identity" changes or interconversions under homeostatic conditions. The potential significance of this rather unexpected "baseline" plasticity in adult tissues is not explicitly pointed out and could improve the understanding of the findings.
Some speculations if "interconversion" of glia is driven by the needs in the tissue could enrich the discussion.
We would like to thank the reviewer for these suggestions. We have changed the discussion to introduce these concepts.
Reviewer #3 (Public review):
In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism, and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury. This paper provides a new mechanism in regeneration, and gives an understanding of the role of glial cells in the process.
Comments on revisions:
In the previous version of the manuscript, I had suggested several recommendations for the authors. Unfortunately, none of these were addressed in the authors' revision.
We are sorry for this error. We apologize but we never received these comments. We have now found them, and we have incorporated these comments in the new version of the manuscript.
(1) Have you tried screening for other markers for the EdU+ Repo+ Pros- cells?
We have identified these cells as glial cells (Repo+) that are not astrocyte-like glia (pros-), but we have not further characterized their identity. Our aim was to identify these proliferating glial cells as NPG (Neuropile glia), which comprise Astrocyte-Like Glia (ALG), as previous works suggest in larvae (Kato et al., 2020; Losada-Perez et al., 2016), and Ensheathing Glia (EG). To rule out ALG identity, we used prospero as the best marker. The results indicate that there are ALG among the proliferating population, but in addition, we also found pros- glial cells that were EdU positive. These cells are located at the interface between cortex and neuropile, where the neuropile glia position is described. The anti-pros staining indicated that they were not ALG, which suggests that they are EG.
There is no specific nuclear marker for EG cells; therefore we used FLY_FUCCI under the control of an EG-specific promoter (R56F03-Gal4) to determine if the other dividing cells were EG. These results indicate that EG divide, although their proliferation does not increase upon injury.
The R56F03 Gal4 construct is described as ensheathing glia specific by previous publications, including:
(1) Kremer M. C., Jung C., Batelli S., Rubin G. M. and Gaul U. (2017). The glia of the adult Drosophila nervous system. Glia 65, 606-638. 10.1002/glia.23115
(2) Qingzhong Ren, Takeshi Awasaki, Yu-Chun Wang, Yu-Fen Huang, Tzumin Lee. Lineage-guided Notch-dependent gliogenesis by Drosophila multi-potent progenitors. Development. 2018 Jun 11;145(11):dev160127. doi: 10.1242/dev.160127
To summarize, our results suggest that these proliferating glial cells include both ALG and EG. Our results cannot rule out that a residual fraction of these proliferating cells are neither ALG nor EG.
(2) You mentioned that ALG are heterogenous in size and shape, does that mean that you may have different subpopulations of ALG? Would that also mean that only a portion of them responds to injury?
Yes, as with astrocytes in vertebrates, this population is highly heterogeneous. Currently there are no molecular tools to specifically identify these subpopulations and characterize their distinct roles. However, emerging research suggests that differences in size, shape, and potentially molecular markers could correlate with functional diversity. This implies that certain subpopulations of ALG may be more specialized or primed to respond to injury, while others may play roles in homeostasis or other processes. Understanding this heterogeneity will require advanced techniques such as single-cell RNA sequencing, spatial transcriptomics, or live imaging to unravel how these subpopulations contribute to injury responses and overall tissue dynamics.
(3) You mentioned that NP-like cells have similar nuclear shape and size to ALG and EG, while Ventral cortex cells have larger nuclei. Can you please show a quantification of the NP-like cells and Ventral cortex cells size, and show a direct comparison with ALG and EG cells to support those claims (images, quantification and analysis)?
We added a new supplementary figure with a graph showing nuclei size differences between VC and NP-like cells, and a diagram showing VC cell localization. Images in figure 2A-A’ and 2B-B’ show both types of cells with the same scale, additionally, NPG cells are shown in red (current expression of the specific Gal4 line). A direct comparison between EG and NP-like glia can be observed in Figure 3 as well.
Besides size and localization, we conclude that VC and NP-like cells present different molecular markers, as VC are elav-positive and repo-negative whereas NP-like cells are repo-positive and elav-negative.
(4) In Figure 2B, the repo expression is not very clear. I suggest using a different example to support the claim that NP cells are Repo+.
We have changed the color of the anti-elav staining to facilitate visualisation.
(5) Again, in Figure 2C, you need quantification and analysis to support the claim that you used nuclear shape and size to identify VC vs. NP like cells.
The quantification is described in our response to point 3, and the criteria are shown in Figure S1.
(6) What is the identity of the newly formed neurons? Other than Elav, have you tried using other markers of neurons that are typically found in this area?
This question is of great interest and relevance. We have made great efforts to address this open question and, so far, our data suggest that these neurons might be in an immature state. In this latest version of the manuscript, we included the results (Figure S1) with several different markers.
The molecular identity of these cell populations, glia and neurons, is currently under investigation.
Minor comments:
(1) In the abstract, EG and ALG abbreviations are not introduced properly.
Thank you very much for noticing this missing information, we have now included it in the abstract.
(2) Please include a representation of the NPG somata location in Figure 1A.
We have included this information in the figure.
(3) A schematic showing the differences between ALG and EG cells would be helpful as well.
We have included in the introduction references and reviews where other authors describe in detail the differences.
(4) In Figure 1 E, G, H, please indicate the genotype of the fly used in the panel as well as the cell type studied.
The complete genotype is included in the corresponding figure legend. We have added a simplified genotype in the figure for clarity.
(5) Please show the genotype used for images in Figure 2: ALG or EG specific drivers.
This information is included in the corresponding figure legend. We believe that it is better to keep the figure clean so we decided to keep the complete genotype, which is considerably long, only in the figure legend.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We appreciate the constructive feedback provided by the reviewers and the editorial board. We are delighted by the positive reception of our work and the thoughtful insights shared.
Regarding the validation of our predicted interactions, we are currently conducting yeast two-hybrid (Y2H) assays using a commercially available Arabidopsis thaliana cDNA library to screen for interacting partners of the ANK putative effector PBTT_00818 from Plasmodiophora brassicae. Following this initial screening, we will validate positive interactions through targeted 1-to-1 Y2H assays. In particular, we aim to confirm the AlphaFold Multimer-predicted interaction between PBTT_00818 and MPK3, a key immunity-related kinase in Arabidopsis.
We are grateful for the reviewers’ thoughtful suggestions regarding clustering visualization, sequence vs. structure-based motif alignments, and structural confidence assessments. We will carefully incorporate these improvements in our planned revisions.
Once again, we thank the editors and reviewers for their rigorous and constructive assessment. We look forward to implementing these refinements and submitting an updated version that further enhances the impact of our study.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Recommendations for the authors):
Minor Points:
• HEK293T cells are not typically Type 1 IFN-producing cells; it is recommended to use other immune cell lines to validate results obtained with ORMDL3 overexpression in 293T cells. The same applies to A549 alveolar basal epithelial cells.
Thanks for the reviewer’s insightful comment. In Figure 1C, we overexpressed ORMDL3 in mouse primary BMDMs and stimulated them with poly(I:C) or poly(dG:dC); the results suggest that ORMDL3 inhibits IFN expression in primary BMDMs.
• Clarify whether TLR3 is expressed in the cell lines used in Figure 1 and whether TLR3 is present in mouse BMDMs.
Thanks for your suggestions. We examined whether TLR3 is expressed in HEK293T, A549 and BMDM. We designed primers for human TLR3 and murine Tlr3, and the results showed that Tlr3 is expressed in BMDM but that TLR3 is not expressed in HEK293T or A549, as shown in Author response image 1.
Author response image 1.
PCR amplification of human TLR3 was conducted on cDNA derived from HEK293T and A549 cells (lanes 1 and 2, respectively), and PCR amplification of murine Tlr3 was performed on cDNA from BMDM (lane 3). Human spleen cDNA (lane 4, TAKARA Human MTC™ Panel I, Cat# 636742) served as a positive control, and 18S rRNA was used as an internal control.
primer sequences:
human TLR3: forward TTGCCTTGTATCTACTTTTGGGG reverse TCAACACTGTTATGTTTGTGGGT
murine Tlr3: forward GTGAGATACAACGTAGCTGACTG reverse TCCTGCATCCAAGATAGCAAGT
18S (human/mouse): forward GTAACCCGTTGAACCCCATT reverse CCATCCAATCGGTAGTAGCG
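As a quick sanity check on the primer sequences listed above, the following minimal sketch reports the length and GC fraction of each primer. The names `PRIMERS`, `gc_fraction` and `summarize` are hypothetical and chosen only for this illustration; the script is not part of the authors' workflow, and it makes no claim about annealing temperature or specificity.

```python
# Illustrative sanity check for the primers listed in the response.
# PRIMERS, gc_fraction and summarize are hypothetical names for this sketch.

PRIMERS = {
    "human TLR3 forward": "TTGCCTTGTATCTACTTTTGGGG",
    "human TLR3 reverse": "TCAACACTGTTATGTTTGTGGGT",
    "murine Tlr3 forward": "GTGAGATACAACGTAGCTGACTG",
    "murine Tlr3 reverse": "TCCTGCATCCAAGATAGCAAGT",
    "18S forward": "GTAACCCGTTGAACCCCATT",
    "18S reverse": "CCATCCAATCGGTAGTAGCG",
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers: dict) -> dict:
    """Map each primer name to (length in nt, GC fraction rounded to 2 places)."""
    return {
        name: (len(seq), round(gc_fraction(seq), 2))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc) in summarize(PRIMERS).items():
        print(f"{name}: {length} nt, GC = {gc:.2f}")
```

Input validation is deliberately strict so that a transcription error in a sequence (for example, a stray character) is flagged rather than silently counted.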
• Specify the type of luciferase reporter assay used in Figure 1E.
Thanks for the reviewer’s insightful comment. The Dual-Luciferase® Reporter (DLR™) Assay System efficiently measures two luciferase signals. In brief, the IFN-reporter luciferase is derived from firefly (Photinus pyralis), while the internal control luciferase is from Renilla (Renilla reniformis or sea pansy). These dual luciferases are measured sequentially from a single sample. In Figure 1E, we measured the luciferase activity of IFN (firefly) and internal control gene TK (Renilla), and their ratio is shown in Figure 1E.
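The normalization described above can be sketched as follows. This is a minimal illustration that assumes raw firefly and Renilla readings are available per well; the function names and the optional fold-change step relative to control wells are assumptions for this example and are not taken from the manuscript.

```python
# Minimal sketch of dual-luciferase normalization.
# Function names and the fold-change step are assumptions for illustration.
from statistics import mean


def relative_luciferase(firefly: float, renilla: float) -> float:
    """Firefly (reporter) signal divided by Renilla (internal control) signal."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_induction(samples, controls):
    """Express each sample ratio relative to the mean ratio of the control wells.

    `samples` and `controls` are lists of (firefly, renilla) tuples.
    """
    control_mean = mean(relative_luciferase(f, r) for f, r in controls)
    return [relative_luciferase(f, r) / control_mean for f, r in samples]


if __name__ == "__main__":
    # Made-up readings, for demonstration only.
    controls = [(100.0, 100.0), (300.0, 100.0)]
    samples = [(400.0, 100.0)]
    print(fold_induction(samples, controls))
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency and cell number, which is why the ratio rather than the raw firefly signal is reported.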
• Clarify what was knocked down in the A549 stable KD cell line and whether HSV-1 infects and replicates in A549 cells.
We sincerely appreciate the reviewer’s concern and apologize for any ambiguous descriptions. In Figure 1H, we knocked down ORMDL3 and infected the cells with HSV-1, which shows that ORMDL3 does not affect the infection and replication of HSV-1 in A549.
• In Figure 2E, provide the rationale for using the same tag (Flag) in overexpression experiments with different molecules such as Flag-ORMDL3 and Flag-RIG-I.
We thank the reviewer for raising this concern. We tried co-expressing ORMDL3 and innate immunity proteins carrying different tags, and we obtained the same results as before. ORMDL3-Myc overexpression can only promote the degradation of Flag-RIG-I-N, as shown in the current Figure 2E.
• Address the low knockdown efficiency shown in Figure 2D and consider whether it is sufficient for drawing conclusions.
Thanks for the reviewer’s concern. Because the ORMDL3 antibody (Abcam 107639) recognizes all ORMDL family members (ORMDL1, 2 and 3), this may explain why the knockdown efficiency of ORMDL3 is not apparent in Figure 2D. We also measured the knockdown efficiency of ORMDL3 at the mRNA level, which showed that ORMDL3 was silenced efficiently and specifically (Figure S2C).
• Replace the Tubulin/β-Actin WB control with a more distinguishable band.
Thanks for the suggestion. Owing to different gel concentrations, the protein bands sometimes appear fused, but it can still be seen that the internal controls are consistent.
• In Figures 3D/E, the expression level of the Lysine mutant of RIG-I-N is too low. Please provide an explanation or repeat the experiment to achieve comparable expression levels and update the figure accordingly.
Thanks for the question. The expression of the lysine mutant of RIG-I-N is low; we increased the amount of plasmid used for transfection, but this did not raise its expression level. Although its abundance is low, we provide evidence that it is not degraded by ORMDL3. Previous reports (for example: RNF122 suppresses antiviral type I interferon production by targeting RIG-I CARDs to mediate RIG-I degradation. Proc Natl Acad Sci U S A. 2016 Aug 23;113(34):9581-6; TRIM4 modulates type I interferon induction and cellular antiviral response by targeting RIG-I for K63-linked ubiquitination. J Mol Cell Biol. 2014 Apr;6(2):154-63.) have also shown that lysine mutations can affect RIG-I stability. In addition, we speculate that the 4KR mutant (K146R, K154R, K164R, K172R) may change the conformation of RIG-I, which would explain its lower expression.
• Explain why there is no difference in MAVS expression levels despite binding with MAVS.
Thanks for the question. In our experiment, ORMDL3 has no effect on MAVS expression. Our results showed that ORMDL3 interacts with MAVS and promotes the degradation of RIG-I, so only RIG-I level has a significant difference.
• Verify if Flag-tagged ORMDL3 is present in the IP sample in Figure 3G.
Thanks for the comment. We reloaded the samples and blotted for Flag, and we found that ORMDL3 cannot be pulled down by RIG-I. We have added the results in Figure 3G.
• Reload the samples in Figure 4C to clearly identify the correct band for GFP-tagged ORMDL3.
Thanks for the question. As ORMDL3 is a small protein, we fused it and its fragments to GFP to increase its molecular weight. With our GFP vector, for some unknown reason, the 26 kDa band is always present. This is a technical difficulty. Although the GFP-fused protein and GFP bands are very close, they can still be distinguished as two bands.
• Rerun the Western blot for Actin IB in Figure 4E, as the ORMDL3-GFP (1-153) full-length appears abnormal.
Thanks for the question. The actin blot appears abnormal because we first blotted for GFP and then blotted for actin on the same membrane. We reloaded the previous sample and blotted for actin again.
• Clarify in which figure RIG-I ubiquitination is shown and whether ORMDL3 has E3 ubiquitin ligase activity. Explain how ORMDL3 facilitates USP10 transfer to RIG-I despite no direct interaction.
Thank you for your question. In Figure 3B we show the ubiquitination of RIG-I; ORMDL3 does not have E3 ubiquitin ligase activity. Our results show that although ORMDL3 does not directly interact with RIG-I, it forms a complex with USP10 (Figure 5B, 5C) and disrupts USP10-induced RIG-I stabilization by decreasing the interaction between USP10 and RIG-I (Figure 6A). The detailed mechanism needs further investigation.
• Provide quantification for Figure 5D. Explain why the bands are not degraded by RIG-I and USP10.
Thanks for the concern. We quantified the bands and found that overexpression of USP10 increased RIG-I protein abundance. The quantitative gray values are added into the image. USP10 functions to stabilize RIG-I rather than promoting its degradation.
• Explain the decrease in RIG-I levels in Figure 5E when USP10 levels decrease.
Thanks for the concern. As shown in the working model (Supplementary Figure 8), USP10 is a deubiquitinase that stabilizes RIG-I by decreasing its K48-linked ubiquitination. So, in Figure 5E, we knocked down USP10 and found a decrease in RIG-I levels, which is consistent with Figure 5D.
• Clarify whether K48 ubiquitination on RIG-I has decreased in Figure 5F, as this is not clear from the image.
Thanks for the question. In Figure 5F it is shown that the K48 ubiquitination level of RIG-I significantly decreased (please see the density of the bands in the IP samples).
• Address whether ORMDL3 reduces RIG-I-N degradation in Figure 5H, as the results do not clearly support this claim.
Thanks for the concern. We quantified the bands and the results showed that ORMDL3 promotes the degradation of RIG-I-N. The quantitative gray values are added into the image.
• Reload Flag-ORMDL3 in Figure 6C to determine whether RIG-I-N is restored in the MG132-treated samples.
Thank you for your question. We quantified the bands and the results showed that RIG-I-N is restored in the MG132-treated samples. The quantitative gray values are added into the image.
• Correct numerous typos and errors, especially in the Discussion section, to improve readability.
Thanks for the suggestion. We have revised the manuscript carefully to correct these errors.
Reviewer #2 (Recommendations for the authors):
(1) In Figure 1G and H, the number of virus-infected cells was observed using a fluorescence microscope. In addition, can the authors use other techniques to detect the impact of ORMDL3 on virus replication?
Thanks for the question. In addition to fluorescence microscopy, we also used RT-PCR to quantify the amount of viral mRNA, and the results were added to Figure 1G and H.
(2) In Figure 3C, ORMDL3 overexpression promotes the degradation of RIG-I-N. ORMDL3 is one of three ORMDL proteins with similar amino acid sequences, does ORMDL1/2 also have this function?
Thanks for the suggestion. We compared the function of the ORMDLs and found that only ORMDL3 overexpression facilitated RIG-I-N degradation. The results are shown in Figure S2D.
(3) In Figure 5A, USP10 is not the top protein in the mass spec assay. Did the authors verify the interaction between ORMDL3 and other proteins (for example, CAND1)?
Thanks for your suggestion. We verified that ORMDL3 has no interaction with CAND1 and UFL1 but only interacts with USP10, as Figure S5 shows.
(4) A scale bar to be added to the images in Figure 1 G, H and Figure 7K.
Thanks for the suggestion. We have added the scale bars.
(5) The annotations in Figure 4B, C and E should be aligned.
Thanks for the suggestion. We have aligned the annotations.
(6) Provide statistical methods.
Thanks for the suggestion. We have provided the statistical methods in the materials and methods part.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This important study identifies the "H-state" as a potential conformational marker distinguishing amyloidogenic from non-amyloidogenic light chains, addressing a critical problem in protein misfolding and amyloidosis. By combining advanced techniques such as small-angle X-ray scattering, molecular dynamics simulations, and H-D exchange mass spectrometry, the authors provide convincing evidence for their novel findings. However, incomplete experimental descriptions, limitations in SAXS data interpretation, and the way HDX MS data is presented affect the strength and generalizability of the conclusions. Strengthening these aspects would enhance the impact of this work for researchers in amyloidosis and protein misfolding.
We thank eLife editors and reviewers for their constructive feedback. The manuscript has been improved to provide a more complete description of the experiments and to strengthen the interpretation and presentation of all data. Updated Figures (Figure 2 and Figure 5) and a new Table (Table 2) in the main text provide a more complete and clearer comparison of the SAXS data with MD simulations as well as a clearer representation of the HDX MS data. Additional figures have been added in SI. The text has been extended accordingly and complete materials and methods are now included in the main text. Abstract, introduction and discussion have been revised to improve the overall readability of the manuscript.
Public Reviews:
Reviewer #1 (Public review):
The study investigates light chains (LCs) using three distinct approaches, with a focus on identifying a conformational fingerprint to differentiate amyloidogenic light chains from multiple myeloma light chains. The study's major contribution is identifying a low-populated "H state," which the authors propose as a unique marker for AL-LCs. While this finding is promising, the review highlights several strengths and weaknesses. Strengths include the valuable contribution of identifying the H state and using multiple approaches, which provide a comprehensive understanding of LC structural dynamics. However, the study suffers from weaknesses, particularly in interpreting SAXS data, lack of clarity in presentation, and methodological inconsistencies. Critical concerns include high error margins between SAXS profiles and MD fits, unclear validation of oligomeric species in SAXS measurements, and insufficient quantitative cross-validation between experimental (HDX) and computational data (MD). This reviewer calls for major revisions including clearer definitions, improved methodology, and additional validation, to strengthen the conclusions.
We thank the reviewer for the supportive comments. In the revised version of the manuscript we have focused on improving the clarity and completeness of our work. For example, we are sorry not to have made sufficiently clear that the comparison of SAXS with MD simulations was not the one shown in the main text in Figure 1 and Table 1 (which is the comparison with single structures) but the one reported in the SI (previously Figure S1 and Table S2, showing very good fits). These data have been moved to the main text in the reworked Figure 2 and new Table 2. We have also improved the presentation of the HDX MS data in Figure 5 and in the text, and added further analysis in the SI. Materials and methods are now entirely in the main text. We have also revised the manuscript generally for clarity.
Reviewer #2 (Public review):
Summary:
This well-written manuscript addresses an important but recalcitrant problem - the molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a major life-threatening form of systemic human amyloidosis. The authors use expertly recorded and analyzed small-angle X-ray scattering (SAXS) data as a restraint for molecular dynamics simulations (called M&M) and to explore six patient-based LC proteins. The authors report that a highly populated "H-state" determined computationally, wherein the two domains in an LC molecule acquire a straight rather than bent conformation, is what distinguishes AL from non-AL LCs. They then use H-D exchange mass spectrometry to verify this conclusion. If confirmed, this is a novel and interesting finding with potentially important translational implications.
We thank the reviewer for the supportive comments.
Strengths:
Expertly recorded and analyzed SAXS data combined with clever M&M simulations lead to a novel and interesting conclusion. Regardless of whether or not the CL-CL domain interface is destabilized in AL LCs explored in this (Figure 6) and other studies, stabilization of this interface is an excellent idea that may help protect at least a subset of AL LCs from misfolding in amyloid. This idea increases the potential impact of this interesting study.
We thank the reviewer for the supportive comments.
Weaknesses:
The HDX analysis could be strengthened.
We have extended the analysis and improved the presentation of the HDX data. Figure 5 has been reworked, the text has been improved accordingly, and additional analyses are reported in the SI.
Reviewer #3 (Public review):
Summary:
This study identifies conformational fingerprints of amyloidogenic light chains, that set them apart from the non-amyloidogenic ones.
We thank the reviewer for the supportive comments.
Strengths:
The research employs a comprehensive combination of structural and dynamic analysis techniques, providing evidence that conformational dynamics at the VL-CL interface and structural expansion are distinguishing features of amyloidogenic LCs.
We thank the reviewer for the supportive comments.
Weaknesses:
The sample size is limited, which may aHect the generalizability of the findings. Additionally, the study could benefit from deeper analysis of specific mutations driving this unique conformation to further strengthen therapeutic relevance.
We agree; we tried to maximise the sample size, and this was the best we could do. Regarding the analysis of the mutations, we tried to discuss some of them in view of previous works, but because our set covers multiple germlines instead of focusing on a single one, our ability to discuss single point mutations systematically is limited. At the same time, single point mutations have been the focus of many recent works, while our approach provides a different point of view.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
This study provides an investigation of light chains (LCs) using three distinct approaches, focusing primarily on identifying a conformational fingerprint to distinguish amyloidogenic light chains (AL-LCs) from multiple myeloma light chains (MM-LCs). The authors propose that the presence of a low-populated "H state," characterized by an extended quaternary structure and a perturbed CL-CL interface, is unique to AL-LCs. This finding is validated through hydrogen-deuterium exchange mass spectrometry (HDX-MS). The study makes a valuable contribution to understanding the structural dynamics of light chains, particularly with the identification of the H state in AL-LCs. However, significant concerns regarding the interpretation of the SAXS data, clarity in presentation, and methodological rigor must be addressed. I recommend major revisions and resubmission of the work.
Major concerns:
(1) A critical concern is how the authors ensure that the SAXS profiles represent only dimeric species, given the high propensity of LCs to aggregate. If higher-order aggregates or monomers were present, this would significantly impact the SAXS data and SAXS-MD integration. Some measurements are bulk SAXS, while others are SEC-SAXS, making the study questionable. The authors need to clarify how only dimeric species were measured for the SEC-SAXS analysis, and all assessments of the dimeric state should be shown in the SI. Additionally, complementary techniques such as DLS or SEC-MALS should be used to verify the oligomeric state of the samples. Without this validation, the SAXS profiles may not be reliable.
We added SEC-MALS and SEC-SAXS data to the SI (Figures S20 and S21), as well as the SAXS curves shown in a log-log plot (Figure S1), which display a flat trend at low q that excludes aggregation. SAXS is very sensitive to oligomers and aggregates, and our data do not indicate the presence of those species. When we had an indication of possible aggregation in the sample, we used SEC-SAXS.
(2) A major problem with the paper is that the claim of the "H state," which is the novelty of the study and serves as a marker of aggregation, is derived from samples where the error between the SAXS profiles and MD fits is extremely high. This casts doubt on whether the structure is indeed resolved by MD. The main conclusion of the paper is derived from weak consistency between experiment and simulation. In AL55, the error between experiment and simulation is greater than 5; for H7, it is higher than 2.8. The residuals show significant error at mid-q values, suggesting that long-range distance correlations (20-10 Å, CL, VL positioning) are not consistent between simulation and experiment. Furthermore, the FES plots of two independent replicas show deviation in the existence of the H state. One shows a minimum in that region, while the other does not. So, how robust is this conclusion? What is the chi-squared value if each replica is used independently? A separate experimental cross-validation is necessary to claim the existence of the H state.
We apologise for the misunderstanding underlying this reviewer comment. The poor agreement mentioned is not between the SAXS and MD simulations, but with the individual structures, and this disagreement led us to perform MD simulations that are in much better agreement with the data (previously Fig. S1 and Table S2). To avoid this misunderstanding, which would indeed weaken our work, we have now moved both the figure and the table in the main text to the updated Figure 2 and the new Table 2.
Regarding the robustness of the sampling, we believe that Table 3 (previously Table 2) clearly shows the statistical convergence of the data; differences in the presentation of the free energy are purely interpolation issues. The chi-squares of each replicate are reported in Table 2 (previously Table S2).
(3) There is insufficient discussion about SAXS computations from MD trajectories. The accuracy of these calculations is crucial to deriving the existing conclusions, and the study's reliance on the PLUMED plugin, which is known to give inaccurate results for SAXS computations, raises concerns. How the solvent is treated in the SAXS computations needs to be explained. Alternative methods like WAXSiS or Crysol should be explored to check whether the SAXS profiles derived from the MD trajectory are consistent across other SAXS computation methods for the major conformers of the proteins.
We have now clarified that, while the SAXS calculations used to perform the Metainference MD were done using PLUMED (which, to our knowledge, is as accurate as Crysol), the SAXS curves used for analysis were calculated using Crysol.
(4) The HDX and MD results do not seem to correlate well, and there is a disconnect between Figure 2 (SAXS profiles) and Figure 5 (HDX structural interpretation). The authors should quantitatively assess residue-level dynamics by comparing HDX signals with MD-derived HDX signals for each protein. This would provide a cross-validation between the experimental and computational data.
In our opinion, our SAXS, MD and HDX-MS data provide a consistent picture. Our HDX-MS data do not provide per-residue information, making a quantitative comparison out of scope. RMSF data do not necessarily need to correlate with the deuterium uptake.
(5) MD simulations are only used to refine the structure of AlphaFold predictions, but the trajectories could help explain why these structures differ, what stabilizes the dimer, or what leads to the conformational transition of the H state. A lack of analysis regarding the physical mechanism behind these structural changes is a weakness of the study. The authors should dedicate more effort to analyzing their data and provide physical insights into why these changes are observed.
Our aim was to identify a property that could discriminate between AL and MM LCs. We used MD simulations, not to refine structures, but to explore the conformational dynamics of LCs (starting from either X-ray structures, homology or AlphaFold models), because SAXS data suggested that conformational dynamics could discriminate between AL- and MM-LCs. Simulations allowed us to propose a hypothesis, which we tested by HDX MS. While more insight is always welcome, we believe that we have achieved our goal for now. In the discussion, we present additional analysis of the simulations to connect with previous literature. We agree that more analysis can be done and, also for this reason, all our data are publicly available.
Minor concerns
(6) The abstract leans heavily on describing the problem and methods but lacks a clear presentation of key results. Providing a concise summary of the main findings (e.g., the identification of the H state) would better balance the abstract.
We agree with the reviewer and we rewrote the abstract.
(7) In the abstract, the term "experimental structure" is used ambiguously. Since SAXS also provides an experimental structure, it is unclear what the authors are referring to. This should be clarified.
We agree with the reviewer and we rewrote the abstract.
(8) Abbreviations such as VL (variable domain) and CL (constant domain) are not defined, making it harder for readers unfamiliar with the field to follow. Abbreviations should be defined when first mentioned.
We agree with the reviewer and we rewrote the abstract.
(9) The introduction provides a good general context but fails to explicitly define the knowledge gap. Specifically, the structural and dynamic determinants of LC amyloidogenicity are not well established, and this study could be framed as addressing that gap.
We thank the reviewer and agree that this could be better framed; we improved the introduction accordingly.
(10) The introduction does not present the novel discovery of the H state early enough. The unique contribution of identifying this state as a marker for AL-LCs should be mentioned upfront to guide the reader through the significance of the study.
We thank the reviewer and we have now made more explicit what we found.
(11) The therapeutic implications of this research should be highlighted more clearly in the discussion. Examples of how these findings could be utilized in drug design or therapeutic approaches would enhance the study's impact.
We thank the reviewer, but while we think that the H-state could be targeted for drug design, since we do not have data yet we do not want to stress this point more than what we are already doing.
(12) There is an overwhelming use of abbreviations such as H3, H7, H18, M7, and M10 without proper introduction. This makes it difficult for readers to follow the results, and the average reader may become lost in the details. An introductory figure summarizing the sequences under study, along with a schematic of the dimeric structure defining VL and CL domains, would significantly aid comprehension.
We agree and we tried to better introduce the systems and simplify the language without adding a figure that we think would be redundant.
(13) In Figure 1, add labels to each SAXS curve to indicate which protein they correspond to. Also, what does online SEC-SAXS mean?
Done
(14) The caption of Figure 3 is unclear, particularly with abbreviations like Lb, Ls, G, and H, which are not mentioned in the captions. The authors should define these terms for clarity.
Done
(15) The study claims that the dominant structure of the dimer changes between different LCs. However, Figure 5 shows identical structures for all proteins, raising questions about the consistency between the SAXS and HDX data. This inconsistency is a general problem between the MD and HDX sections, where cross-communication and comparisons are not properly addressed.
We do not claim that the dominant structure of the dimer changes between different LCs; this would also be in contradiction with the current literature. We claim a difference in a low-populated state. From this point of view, always using the same structure is consistent and should simplify the representation of the results. We agree that the manuscript may not always be easy to follow, and we thank the reviewer for helping us improve it.
(16) The authors show I(q) vs q and residuals for each protein. The Kratky plots are not sufficient to compare the SAXS computations with the measured profile.
Showing Kratky and residuals is a standard and complementary way to present and compare SAXS data to structures. Chi-square values are also reported. Log-log plots have been added to SI in response to previous comments.
(17) The authors need to explain how they estimate the Rg values (from simulation or SAXS profiles). If they are using simulations, they should compute the Rg values from the simulations for comparison.
Rg values reported in Table 1 are derived from SAXS. Rg from simulations have been added in Table 2.
(18) The evolution of the sampling is unclear. The authors need to show the initial starting conformation in each case and the most likely conformation after M&M in the SI, to demonstrate that their approach indeed caused changes in the initial predictions.
Our approach is not structure refinement, and as such the proposed analysis would be misleading. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that, as a whole, reproduce the data. Differences (or not) between initial and selected configurations will not be particularly informative in this context.
(19) The authors should also provide a running average of chi-squared values over time to demonstrate that the conformational ensemble converged toward the SAXS profile.
Our simulations are not driven to improve the agreement with SAXS over time; this is not structure refinement. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that, as a whole, reproduce the data. The suggested analysis would be a misinterpretation of our simulations. The comparison with SAXS is provided in Figure 2 and Table 2, as mentioned above.
(20) The aggregate simulation time of 120 microseconds is misleading, as each replica was only run for 2-3 microseconds. This should be clarified.
The number reported in the text is accurate and represents the aggregated sampling. The number of replicas for each metainference simulation and their length are reported in Table 2, now moved for clarity from the SI to the main text.
(21) It is not clear how the replicas were weighted to compute the SAXS profiles and FES. There are two independent runs in each case, and each run has about 30 replicas. How these replicas are weighted needs to be discussed in the SI.
Done
(22) The methods section is unevenly distributed, with detailed explanations of LC production and purification, while other key methodologies like SAXS+MD integration and HDX are not even mentioned in the main text (they are in the Supporting Information). The authors should provide a brief overview of all methodologies in the main text or move everything to the SI for consistency.
We agree with the reviewer, all methods are now in main text.
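For readers less familiar with the quantities referred to in the responses to points (16) and (19) above, the following is a minimal sketch of how a chi-square, normalised residuals and a Kratky transform can be computed when comparing a calculated SAXS profile with a measured one. It is an illustration only, not the procedure used in the manuscript: the function names, the per-point normalisation of the chi-square (conventions differ between programs) and the synthetic profile are all assumptions made for this example.

```python
import math

def scale_factor(i_exp, i_calc, sigma):
    """Least-squares scale c minimising sum(((i_exp - c * i_calc) / sigma) ** 2)."""
    num = sum(e * m / s ** 2 for e, m, s in zip(i_exp, i_calc, sigma))
    den = sum(m * m / s ** 2 for m, s in zip(i_calc, sigma))
    return num / den

def chi_square_per_point(i_exp, i_calc, sigma):
    """Return (chi-square divided by the number of points, normalised residuals)."""
    c = scale_factor(i_exp, i_calc, sigma)
    residuals = [(e - c * m) / s for e, m, s in zip(i_exp, i_calc, sigma)]
    return sum(r * r for r in residuals) / len(residuals), residuals

def kratky(q, intensity):
    """Kratky transform q^2 * I(q), evaluated point by point."""
    return [qi * qi * ii for qi, ii in zip(q, intensity)]

# Synthetic illustration only (not data from the study): a Guinier-like
# profile with an assumed Rg of 20 Å, "measured" at twice the model scale.
q = [0.01 * k for k in range(1, 51)]                      # 1/Å
i_model = [math.exp(-(qi * 20.0) ** 2 / 3.0) for qi in q]
i_obs = [2.0 * m for m in i_model]
sigma = [0.01] * len(q)

chi2, residuals = chi_square_per_point(i_obs, i_model, sigma)
```

In this synthetic case the fitted scale absorbs the factor of two, so the residuals vanish. With real data, structured residuals at particular q ranges are what indicate a disagreement that a single chi-square value can hide.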
Reviewer #2 (Recommendations for the authors):
(1) Computational M&M evidence is strong (Figure 3) and is supported by SAXS (used as restraints). However, Kratky plots reported in the main MS Figure 1 show significant differences between the data and the structural model only for one protein, AL-55. It is hard for the general reader to see how these SAXS data support a clear difference between AL and non-AL proteins. If possible, please strengthen the evidence; if not, soften the conclusions.
We thank the reviewer for the comments. The chi-square (Table 1) and the residuals (Figure 1) are a strong indication of the difference. To strengthen the evidence, following also the comment from reviewer 3, we calculated the p-value (< 10⁻⁵) on the significance of the radius of gyration to discriminate AL and MM LCs. We agree that SAXS alone was not enough, and this is indeed what prompted us to perform MD simulations.
(2) HDX MS results are cursory and not very convincing as presented. The butterfly plots in Figure 5 are too small to read and are unlabeled so it is unclear which protein is which.
Figure 5 has been reworked for readability. More data have been added in SI.
(3) What labeling time was selected to construct these plots and why?
The deuterium uptakes at 30 min HDX time showed the most pronounced differences between different proteins and were therefore chosen to illustrate the key structural features in the main figure panel (Figure 5).
How different are the results at other labeling times? Showing uptake curves (with errors) for more than just two peptides in the supplement Figure S12 might be helpful.
We found a continuous increase in deuterium uptake as we increased the exchange time from 0.5 to 240 min, which reached saturation at 120 min. Therefore, the exchange follows the same pattern at all time points. Butterfly plots at different HDX times from 0.5 to 240 min are shown in a gradient from light blue to dark blue, which clearly shows the pattern of deuterium uptake at increasing incubation times (Figure 5). The HDX uptake kinetics of selected peptides with corresponding error bars are shown in Figure S12.
How redundant are the data, i.e. how good is the peptide coverage/resolution in key regions at the domain-domain interface that the authors deem important? Mapping the maximal deuterium uptake on the structures in Figure 5 is not very helpful. Perhaps mapping the whole range of uptake using a gradient color scheme would be more informative.
Overall coverage and redundancy for all four proteins are > 90% and > 4.0, respectively, with an average error margin in fractional uptake among all peptides of 0.04-0.05 Da, which suggests that our data are reliable (Table S3). We modified the main panel figures to show the gradient of deuterium uptake in blue-white-red for 0 to 30% of deuterium uptake on chain A of the dimeric LCs.
(3) Is the conformational heterogeneity depicted in M&M simulations consistent with HDX results? The authors may want to address this by looking at the EX1/EX2 exchange kinetics for AL vs. non-AL proteins. Do AL proteins show more EX1?
No, we do not see any EX1 exchange kinetics in our analysis. This is compatible with the prediction that the H-state is a native-like state and not an unfolded or partially folded state.
(4) Perhaps the main conclusion could be softened given the small number of proteins (six), esp. since only four (3 AL and 1 non-AL) could be explored by HDX. Are other HDX MS data of AL LCs from the same Lambda6 family (e.g. PMID: 34678302) consistent with the conclusions that a particular domain-domain interface is weakened in AL vs. non-AL LCs?
We thank the reviewer for this suggestion. A difference in HDX MS data is indeed visible between AL and MM proteins for peptide 33-47 in the suggested paper (Figures 4, S5 and S8). The difference is reduced by the mutation identified in the paper as driving the aggregation in that specific case. We now mention this in the discussion.
(5) Please clarify if the H* state is the same for a covalent vs. non-covalent LC dimer.
We do not know, because our data are only for covalent dimers. Interestingly, however, the state is very similar to what was observed for a model kappa light chain in Weber et al.; we have better highlighted this point in the discussion.
(6) Please try and better explain why a smaller distance between CL domains in H7 protein and a larger distance in other AL proteins both promote protein misfolding.
We do not have elements to discuss this point in more detail.
(7) Please comment on the Kratky plots data vs. model agreement (see comments above).
Done.
(8) Please find a better way to display, describe, and interpret the HD exchange MS data.
We have generated a new main-text figure (new Figure 5) and SI figures that we think allow the reader to better appreciate our observations. The corresponding results sections have also been improved.
Minor points:
(9) Is the population of the H-state with perturbed CL-CL domain interface, which was obtained in M&M simulations, sufficient to be observable by HDX MS?
While populations alone are not enough to determine what is observable by HDX MS, a 10% population corresponds roughly to 6 kJ/mol of ΔG and is compatible with EX2 kinetics. Previous works suggested that HDX-MS data should be sensitive to subpopulations of the order of 10% (https://doi.org/10.1016/j.bpj.2020.02.005, https://doi.org/10.1021/jacs.2c06148).
(10) Typically, an excited intermediate in protein unfolding is a monomer, while here it is an LC dimer. Is this unusual?
This is a good point. We think that intermediates have mostly been studied in monomeric proteins because these are more commonly used as model systems, but we prefer not to discuss this point further.
(11) Low deuterium uptake is consistent with a rigid structure but may also reflect buried structure and/or structure that moves on a time scale greater than the labeling time.
We agree.
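As a worked illustration of the estimate given in the response to point (9) above, the following sketch converts a minor-state population into a free-energy difference using a simple two-state Boltzmann relation, ΔG = -RT ln(p / (1 - p)). The two-state model, the temperature of 298.15 K and the function names are assumptions made for this example, not details taken from the manuscript.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)

def delta_g_two_state(population, temperature=298.15):
    """Free energy (kJ/mol) of a minor state relative to the major state,
    assuming only two states and an assumed temperature in kelvin."""
    return -R_KJ * temperature * math.log(population / (1.0 - population))

def population_from_delta_g(delta_g, temperature=298.15):
    """Inverse relation: minor-state population for a given free energy (kJ/mol)."""
    k = math.exp(-delta_g / (R_KJ * temperature))
    return k / (1.0 + k)

# A 10% population under these assumptions.
dg_10_percent = delta_g_two_state(0.10)
```

Under these assumptions a 10% population gives a value between 5 and 6 kJ/mol, in line with the rough figure quoted in the response. The exact number depends on the temperature and on whether the population is referred to the major state or to the whole ensemble.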
Reviewer #3 (Recommendations for the authors):
(1) The p-value (statistical significance) of Rg difference should be computed.
We thank the reviewer for the suggestion; we calculated the p-value, which turned out to be highly significant.
(2) The significance of mutations (SHM?) at the interface, such as A40G should be compared with previous observations. (Garrofalo et al., 2021).
We thank the reviewer for the suggestion, a sentence has been added in the discussion.
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This important study identifies the "H-state" as a potential conformational marker distinguishing amyloidogenic from non-amyloidogenic light chains, addressing a critical problem in protein misfolding and amyloidosis. By combining advanced techniques such as small-angle X-ray scattering, molecular dynamics simulations, and H-D exchange mass spectrometry, the authors provide convincing evidence for their novel findings. However, incomplete experimental descriptions, limitations in SAXS data interpretation, and the way HDX MS data is presented affect the strength and generalizability of the conclusions. Strengthening these aspects would enhance the impact of this work for researchers in amyloidosis and protein misfolding.
We thank eLife editors and reviewers for their constructive feedback. The manuscript has been improved to provide a more complete description of the experiments and to strengthen the interpretation and presentation of all data. Updated Figures (Figure 2 and Figure 5) and a new Table (Table 2) in the main text provide a more complete and clearer comparison of the SAXS data with MD simulations as well as a clearer representation of the HDX MS data. Additional figures have been added in SI. The text has been extended accordingly and complete materials and methods are now included in the main text. Abstract, introduction and discussion have been revised to improve the overall readability of the manuscript.
Public Reviews:
Reviewer #1 (Public review):
The study investigates light chains (LCs) using three distinct approaches, with a focus on identifying a conformational fingerprint to differentiate amyloidogenic light chains from multiple myeloma light chains. The study's major contribution is identifying a low-populated "H state," which the authors propose as a unique marker for AL-LCs. While this finding is promising, the review highlights several strengths and weaknesses. Strengths include the valuable contribution of identifying the H state and using multiple approaches, which provide a comprehensive understanding of LC structural dynamics. However, the study suffers from weaknesses, particularly in interpreting SAXS data, lack of clarity in presentation, and methodological inconsistencies. Critical concerns include high error margins between SAXS profiles and MD fits, unclear validation of oligomeric species in SAXS measurements, and insufficient quantitative cross-validation between experimental (HDX) and computational data (MD). This reviewer calls for major revisions including clearer definitions, improved methodology, and additional validation, to strengthen the conclusions.
We thank the reviewer for the supportive comments. In the revised version of the manuscript, we have focused on improving the clarity and completeness of our work. We are sorry, for example, not to have previously made it clear enough that the comparison of SAXS with MD simulations was not the one shown in the main text in Figure 1 and Table 1 (this is the comparison with single structures) but the one reported in the SI (previously Figure S1 and Table S2, showing very good fits). These data have been moved to the main text in the reworked Figure 2 and the new Table 2. We have also improved the presentation of the HDX MS data in Figure 5 and in the text, adding further analysis in the SI. Materials and methods are now entirely in the main text. We generally revised the manuscript for clarity.
Reviewer #2 (Public review):
Summary:
This well-written manuscript addresses an important but recalcitrant problem - the molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a major life-threatening form of systemic human amyloidosis. The authors use expertly recorded and analyzed small-angle X-ray scattering (SAXS) data as a restraint for molecular dynamics simulations (called M&M) and to explore six patient-based LC proteins. The authors report that a highly populated "H-state" determined computationally, wherein the two domains in an LC molecule acquire a straight rather than bent conformation, is what distinguishes AL from non-AL LCs. They then use H-D exchange mass spectrometry to verify this conclusion. If confirmed, this is a novel and interesting finding with potentially important translational implications.
We thank the reviewer for the supportive comments.
Strengths:
Expertly recorded and analyzed SAXS data combined with clever M&M simulations lead to a novel and interesting conclusion. Regardless of whether or not the CL-CL domain interface is destabilized in AL LCs explored in this (Figure 6) and other studies, stabilization of this interface is an excellent idea that may help protect at least a subset of AL LCs from misfolding in amyloid. This idea increases the potential impact of this interesting study.
We thank the reviewer for the supportive comments.
Weaknesses:
The HDX analysis could be strengthened.
We have extended the analysis and improved the presentation of the HDX data. Figure 5 has been reworked, the text has been improved accordingly, and additional analyses have been reported in the SI.
Reviewer #3 (Public review):
Summary:
This study identifies conformational fingerprints of amyloidogenic light chains, that set them apart from the non-amyloidogenic ones.
We thank the reviewer for the supportive comments.
Strengths:
The research employs a comprehensive combination of structural and dynamic analysis techniques, providing evidence that conformational dynamics at the VL-CL interface and structural expansion are distinguished features of amyloidogenic LCs.
We thank the reviewer for the supportive comments.
Weaknesses:
The sample size is limited, which may affect the generalizability of the findings. Additionally, the study could benefit from deeper analysis of specific mutations driving this unique conformation to further strengthen therapeutic relevance.
We agree; we tried to maximise the sample size, and this was the best we could do. Regarding the analysis of the mutations, we tried to discuss some of them in view of previous works, but because our set covers multiple germlines instead of focusing on a single one, our ability to discuss single point mutations systematically is limited. At the same time, single point mutations have been the focus of many recent works, while our approach provides a different point of view.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
This study provides an investigation of light chains (LCs) using three distinct approaches, focusing primarily on identifying a conformational fingerprint to distinguish amyloidogenic light chains (AL-LCs) from multiple myeloma light chains (MM-LCs). The authors propose that the presence of a low-populated "H state," characterized by an extended quaternary structure and a perturbed CL-CL interface, is unique to AL-LCs. This finding is validated through hydrogen-deuterium exchange mass spectrometry (HDX-MS). The study makes a valuable contribution to understanding the structural dynamics of light chains, particularly with the identification of the H state in AL-LCs. However, significant concerns regarding the interpretation of the SAXS data, clarity in presentation, and methodological rigor must be addressed. I recommend major revisions and resubmission of the work.
Major concerns:
(1) A critical concern is how the authors ensure that the SAXS profiles represent only dimeric species, given the high propensity of LCs to aggregate. If higher-order aggregates or monomers were present, this would significantly impact the SAXS data and SAXS-MD integration. Some measurements are bulk SAXS, while others are SEC-SAXS, making the study questionable. The authors need to clarify how only dimeric species were measured for the SEC-SAXS analysis, and all assessments of the dimeric state should be shown in the SI. Additionally, complementary techniques such as DLS or SEC-MALS should be used to verify the oligomeric state of the samples. Without this validation, the SAXS profiles may not be reliable.
We added SEC-MALS and SEC-SAXS data to the SI (Figures S20 and S21), as well as the SAXS curves shown in a log-log plot (Figure S1), which display a flat trend at low q that excludes aggregation. SAXS is very sensitive to oligomers and aggregates, and our data do not indicate the presence of those species. When we had an indication of possible aggregation in the sample, we used SEC-SAXS.
(2) A major problem with the paper is that the claim of the "H state," which is the novelty of the study and serves as a marker of aggregation, is derived from samples where the error between the SAXS profiles and MD fits is extremely high. This casts doubt on whether the structure is indeed resolved by MD. The main conclusion of the paper is derived from weak consistency between experiment and simulation. In AL55, the error between experiment and simulation is greater than 5; for H7, it is higher than 2.8. The residuals show significant error at mid-q values, suggesting that long-range distance correlations (20-10 Å, CL, VL positioning) are not consistent between simulation and experiment. Furthermore, the FES plots of two independent replicas show deviation in the existence of the H state. One shows a minimum in that region, while the other does not. So, how robust is this conclusion? What is the chi-squared value if each replica is used independently? A separate experimental cross-validation is necessary to claim the existence of the H state.
We apologise for the misunderstanding underlying this reviewer comment. The poor agreement mentioned is not between the SAXS and MD simulations, but with the individual structures, and this disagreement led us to perform MD simulations that are in much better agreement with the data (previously Fig. S1 and Table S2). To avoid this misunderstanding, which would indeed weaken our work, we have now moved both the figure and the table in the main text to the updated Figure 2 and the new Table 2.
Regarding the robustness of the sampling, we believe that Table 3 (previously Table 2) clearly shows the statistical convergence of the data; differences in the presentation of the free energy are purely interpolation issues. The chi-squares of each replicate are reported in Table 2 (previously Table S2).
(3) There is insufficient discussion about SAXS computations from MD trajectories. The accuracy of these calculations is crucial to deriving the existing conclusions, and the study's reliance on the PLUMED plugin, which is known to give inaccurate results for SAXS computations, raises concerns. How the solvent is treated in the SAXS computations needs to be explained. Alternative methods like WAXSiS or Crysol should be explored to check whether the SAXS profiles derived from the MD trajectory are consistent across other SAXS computation methods for the major conformers of the proteins.
We have now clarified that, while the SAXS calculations used to perform the Metainference MD were done using PLUMED (which, to our knowledge, is as accurate as Crysol), the SAXS curves used for analysis were calculated using Crysol.
(4) The HDX and MD results do not seem to correlate well, and there is a disconnect between Figure 2 (SAXS profiles) and Figure 5 (HDX structural interpretation). The authors should quantitatively assess residue-level dynamics by comparing HDX signals with MD-derived HDX signals for each protein. This would provide a cross-validation between the experimental and computational data.
In our opinion our SAXS, MD and HDX MS data provide a consistent picture. Our HDX-MS do not provide per residue data, making a quantitative comparison out of scope. RMSF data do not necessarily need to correlate with the deuterium uptake.
(5) MD simulations are only used to refine the structure of AlphaFold predictions, but the trajectories could help explain why these structures diHer, what stabilizes the dimer, or what leads to the conformational transition of the H state. A lack of analysis regarding the physical mechanism behind these structural changes is a weakness of the study. The authors should dedicate more eHort to analyzing their data and provide physical insights into why these changes are observed.
Our aim was to identify a property that could discriminate between AL and MM LCs. We used MD simulations, not to refine structures, but to explore the conformational dynamics of LCs (starting from either X-ray structures, homology or AlphaFold models), because SAXS data suggested that conformational dynamics could discriminate between AL- and MM-LCs. Simulations allowed us to propose a hypothesis, which we tested by HDX-MS. While more insight is always welcome, we believe that we have achieved our goal for now. In the discussion, we present additional analysis of the simulations to connect with previous literature. We agree that more analysis can be done and, also for this reason, all our data are publicly available.
Minor concerns
(6) The abstract leans heavily on describing the problem and methods but lacks a clear presentation of key results. Providing a concise summary of the main findings (e.g., the identification of the H state) would better balance the abstract.
We agree with the reviewer and we rewrote the abstract.
(7) In the abstract, the term "experimental structure" is used ambiguously. Since SAXS also provides an experimental structure, it is unclear what the authors are referring to. This should be clarified.
We agree with the reviewer and we rewrote the abstract.
(8) Abbreviations such as VL (variable domain) and CL (constant domain) are not defined, making it harder for readers unfamiliar with the field to follow. Abbreviations should be defined when first mentioned.
We agree with the reviewer and we rewrote the abstract.
(9) The introduction provides a good general context but fails to explicitly define the knowledge gap. Specifically, the structural and dynamic determinants of LC amyloidogenicity are not well established, and this study could be framed as addressing that gap.
We thank the reviewer and agree that this could be better framed; we improved the introduction accordingly.
(10) The introduction does not present the novel discovery of the H state early enough. The unique contribution of identifying this state as a marker for AL-LCs should be mentioned upfront to guide the reader through the significance of the study.
We thank the reviewer and we have now made more explicit what we found.
(11) The therapeutic implications of this research should be highlighted more clearly in the discussion. Examples of how these findings could be utilized in drug design or therapeutic approaches would enhance the study's impact.
We thank the reviewer, but while we think that the H-state could be targeted for drug design, since we do not have data yet we do not want to stress this point more than what we are already doing.
(12) There is an overwhelming use of abbreviations such as H3, H7, H18, M7, and M10 without proper introduction. This makes it difficult for readers to follow the results, and the average reader may become lost in the details. An introductory figure summarizing the sequences under study, along with a schematic of the dimeric structure defining VL and CL domains, would significantly aid comprehension.
We agree and we tried to better introduce the systems and simplify the language without adding a figure that we think would be redundant.
(13) In Figure 1, add labels to each SAXS curve to indicate which protein they correspond to. Also, what does online SEC-SAXS mean?
Done
(14) The caption of Figure 3 is unclear, particularly with abbreviations like Lb, Ls, G, and H, which are not mentioned in the captions. The authors should define these terms for clarity.
Done
(15) The study claims that the dominant structure of the dimer changes between different LCs. However, Figure 5 shows identical structures for all proteins, raising questions about the consistency between the SAXS and HDX data. This inconsistency is a general problem between the MD and HDX sections, where cross-communication and comparisons are not properly addressed.
We do not claim that the dominant structure of the dimer changes between different LCs; this would also contradict the current literature. We claim a difference in a low-populated state. From this point of view, always using the same structure is consistent and should simplify the representation of the results. We agree that the manuscript may not always be easy to follow, and we thank the reviewer for helping us improve it.
(16) The authors show I(q) vs q and residuals for each protein. The Kratky plots are not sufficient to compare the SAXS computations with the measured profile.
Showing Kratky and residuals is a standard and complementary way to present and compare SAXS data to structures. Chi-square values are also reported. Log-log plots have been added to SI in response to previous comments.
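For readers unfamiliar with how a chi-square value summarizes the agreement between a calculated and a measured SAXS profile, the following is a minimal sketch, not the procedure used by crysol or PLUMED. It assumes a single least-squares scale factor and normalization by the number of points; the function name and the toy intensities are hypothetical, and other tools may also fit a constant background or hydration parameters.

```python
def scaled_chi_square(i_exp, i_calc, sigma):
    """Return (scale, chi2) for a calculated profile fitted to experimental data.

    Assumption: only one multiplicative scale factor is fitted, and chi2 is
    divided by the number of points N (some programs divide by N - 1).
    """
    if not i_exp or not (len(i_exp) == len(i_calc) == len(sigma)):
        raise ValueError("profiles must be non-empty and of equal length")
    # Error-weighted least-squares scale factor between the two curves.
    num = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum(c * c / s ** 2 for c, s in zip(i_calc, sigma))
    scale = num / den
    # Mean squared, error-normalized residual.
    chi2 = sum(((e - scale * c) / s) ** 2
               for e, c, s in zip(i_exp, i_calc, sigma)) / len(i_exp)
    return scale, chi2


# Toy data (hypothetical): the experimental curve is exactly twice the model.
scale, chi2 = scaled_chi_square([2.0, 1.0, 0.5], [1.0, 0.5, 0.25], [0.1, 0.1, 0.1])
print(scale, chi2)
```

The residuals shown alongside Kratky plots are the individual terms of this sum before squaring, which is why the two presentations are complementary.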
(17) The authors need to explain how they estimate the Rg values (from simulation or SAXS profiles). If they are using simulations, they should compute the Rg values from the simulations for comparison.
Rg values reported in Table 1 are derived from SAXS. Rg from simulations have been added in Table 2.
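As an illustration of what an Rg computed from simulation coordinates means, the sketch below evaluates a mass-weighted radius of gyration for a single conformation. It is a generic geometric definition with hypothetical function names and toy coordinates, not the analysis code used for Table 2. Note that a purely geometric Rg of this kind is not identical to a SAXS-derived Rg, which also reflects the contrast of the hydration layer.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one conformation.

    coords: list of (x, y, z) tuples; masses: optional list of weights
    (equal weights are assumed when omitted).
    """
    if not coords:
        raise ValueError("coords must not be empty")
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Center of mass along each Cartesian axis.
    center = [sum(m * c[i] for m, c in zip(masses, coords)) / total
              for i in range(3)]
    # Mass-weighted mean squared distance from the center of mass.
    msd = sum(m * sum((c[i] - center[i]) ** 2 for i in range(3))
              for m, c in zip(masses, coords)) / total
    return math.sqrt(msd)


# Toy example (hypothetical): two equal masses placed 2 units apart.
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

For an ensemble, the per-frame values would then be averaged with whatever statistical weights the sampling scheme assigns to each frame.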
(18) The evolution of the sampling is unclear. The authors need to show the initial starting conformation in each case and the most likely conformation after M&M in the SI, to demonstrate that their approach indeed caused changes in the initial predictions.
Our approach is not structure refinement and as such the proposed analysis would be misleading. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that as a whole reproduce the data. Differences (or the lack thereof) between initial and selected configurations will not be particularly informative in this context.
(19) The authors should also provide a running average of chi-squared values over time to demonstrate that the conformational ensemble converged toward the SAXS profile.
Our simulations are not driven to improve the agreement with SAXS over time; this is not structure refinement. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that as a whole reproduce the data. The suggested analysis would be a misinterpretation of our simulations. The comparison with SAXS is provided in Figure 2 and Table 2 as mentioned above.
(20) The aggregate simulation time of 120 microseconds is misleading, as each replica was only run for 2-3 microseconds. This should be clarified.
The number reported in the text is accurate and represents the aggregated sampling. The number of replicas for each metainference simulation and their length are reported in Table 2, now moved for clarity from the SI to the main text.
(21) It is not clear how the replicas were weighted to compute the SAXS profiles and FES. There are two independent runs in each case, and each run has about 30 replicas. How these replicas are weighted needs to be discussed in the SI.
Done
(22) The methods section is unevenly distributed, with detailed explanations of LC production and purification, while other key methodologies like SAXS+MD integration and HDX are not even mentioned in the main text (they are in the Supporting Information). The authors should provide a brief overview of all methodologies in the main text or move everything to the SI for consistency.
We agree with the reviewer; all methods are now in the main text.
Reviewer #2 (Recommendations for the authors):
(1) Computational M&M evidence is strong (Figure 3) and is supported by SAXS (used as restraints). However, Kratky plots reported in the main MS Figure 1 show significant differences between the data and the structural model only for one protein, AL-55. It is hard for the general reader to see how these SAXS data support a clear difference between AL and non-AL proteins. If possible, please strengthen the evidence; if not, soften the conclusions.
We thank the reviewer for the comments. The chi-square (Table 1) and the residuals (Figure 1) are a strong indication of the difference. To strengthen the evidence, following also the comment from reviewer 3, we calculated the p-value (<10⁻⁵) on the significance of the radius of gyration to discriminate AL and MM LCs. We agree that SAXS alone was not enough and this is indeed what prompted us to perform MD simulations.
(2) HDX MS results are cursory and not very convincing as presented. The butterfly plots in Figure 5 are too small to read and are unlabeled so it is unclear which protein is which.
Figure 5 has been reworked for readability. More data have been added in SI.
(3) What labeling time was selected to construct these plots and why?
The deuterium uptake at 30 min of HDX showed the most pronounced differences between proteins and was therefore chosen to illustrate the key structural features in the main figure panel (Figure 5).
How different are the results at other labeling times? Showing uptake curves (with errors) for more than just two peptides in the supplement Figure S12 might be helpful.
We found a continuous increase in deuterium uptake as we increased the exchange time from 0.5 to 240 min, which reached saturation at 120 min. Therefore, the exchange follows the same pattern at all time points. Butterfly plots at different HDX times from 0.5 to 240 min are shown in a gradient from light blue to dark blue, which clearly shows the pattern of deuterium uptake at increasing incubation times (Figure 5). The HDX uptake kinetics of selected peptides with corresponding error bars are shown in Figure S12.
How redundant are the data, i.e. how good is the peptide coverage/resolution in key regions at the domain-domain interface that the authors deem important? Mapping the maximal deuterium uptake on the structures in Figure 5 is not very helpful. Perhaps mapping the whole range of uptake using a gradient color scheme would be more informative.
Overall coverage and redundancy for all four proteins are > 90% and > 4.0, respectively, and the average error margin in fractional uptake among all peptides is 0.04-0.05 Da, which suggests that our data are reliable (Table S3). We modified the main panel figures to show the gradient of deuterium uptake in blue-white-red for 0 to 30% deuterium uptake on chain A of the dimeric LCs.
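To make the two quality metrics concrete, the sketch below computes sequence coverage and redundancy from a list of peptide boundaries. It is illustrative only: the peptide list is hypothetical, the function name is not from any HDX software, and redundancy is taken here as the average number of peptides per covered residue, whereas some programs normalize by the full sequence length instead.

```python
def coverage_and_redundancy(peptides, sequence_length):
    """Return (coverage, redundancy) for peptides given as (start, end) pairs.

    Positions are 1-based and inclusive. Coverage is the fraction of residues
    seen in at least one peptide; redundancy is the mean number of peptides
    per covered residue (an assumed definition; conventions vary).
    """
    counts = [0] * sequence_length
    for start, end in peptides:
        if not 1 <= start <= end <= sequence_length:
            raise ValueError(f"peptide ({start}, {end}) is out of range")
        for pos in range(start - 1, end):
            counts[pos] += 1
    covered = sum(1 for c in counts if c > 0)
    coverage = covered / sequence_length
    redundancy = sum(counts) / covered if covered else 0.0
    return coverage, redundancy


# Hypothetical example: two overlapping peptides on a 10-residue sequence.
print(coverage_and_redundancy([(1, 5), (3, 8)], 10))
```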
(3) Is the conformational heterogeneity depicted in M&M simulations consistent with HDX results? The authors may want to address this by looking at the EX1/EX2 exchange kinetics for AL vs. non-AL proteins. Do AL proteins show more EX1?
No, we don’t see any EX1 exchange kinetics in our analysis. This is compatible with the prediction of the H-state, which is a native-like state and not an unfolded or partially folded state.
(4) Perhaps the main conclusion could be softened given the small number of proteins (six), esp. since only four (3 AL and 1 non-AL) could be explored by HDX. Are other HDX MS data of AL LCs from the same Lambda6 family (e.g. PMID: 34678302) consistent with the conclusions that a particular domain-domain interface is weakened in AL vs. non-AL LCs?
We thank the reviewer for this suggestion. A difference in HDX-MS data is indeed visible between AL and MM proteins for peptide 33-47 in the suggested paper (Figures 4, S5 and S8). The difference is reduced by the mutation identified in the paper as driving the aggregation in that specific case. We now mention this in the discussion.
(5) Please clarify if the H* state is the same for a covalent vs. non-covalent LC dimer.
We do not know, because our data are only for covalent dimers. Interestingly, however, the state is very similar to what was observed for a model kappa light chain in Weber et al.; we have better highlighted this point in the discussion.
(6) Please try and better explain why a smaller distance between CL domains in H7 protein and a larger distance in other AL proteins both promote protein misfolding.
We do not have elements to discuss this point in more detail.
(7) Please comment on the Kratky plots data vs. model agreement (see comments above).
Done.
(8) Please find a better way to display, describe, and interpret the HD exchange MS data.
We have generated new main text (new Figure 5) and SI figures that we think allow the reader to better appreciate our observations. The corresponding results sections have also been improved.
Minor points:
(9) Is the population of the H-state with perturbed CL-CL domain interface, which was obtained in M&M simulations, sufficient to be observable by HDX MS?
While populations alone are not enough to determine what is observable by HDX MS, a 10% population corresponds roughly to 6 kJ/mol of ΔG and is compatible with EX2 kinetics. Previous works suggested that HDX-MS data should be sensitive to subpopulations of the order of 10% (https://doi.org/10.1016/j.bpj.2020.02.005, https://doi.org/10.1021/jacs.2c06148).
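The order of magnitude quoted above can be checked with the two-state Boltzmann relation ΔG = -RT ln(p / (1 - p)). The sketch below is a minimal illustration that assumes a strict two-state equilibrium at 298 K; the temperature and the function name are assumptions for illustration and are not taken from the manuscript.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def free_energy_gap(population, temperature_k=298.0):
    """Free energy (kJ/mol) of a minor state relative to the major state.

    Assumes a two-state model, so the major state has population 1 - p.
    """
    if not 0.0 < population < 1.0:
        raise ValueError("population must lie strictly between 0 and 1")
    ratio = population / (1.0 - population)
    return -R_KJ_PER_MOL_K * temperature_k * math.log(ratio)


# A 10% minor state at 298 K gives a gap of a few kJ/mol.
print(f"{free_energy_gap(0.10):.1f} kJ/mol")
```

Under these assumptions the result is about 5.4 kJ/mol, the same order as the value of roughly 6 kJ/mol given in the response; the exact number depends on the temperature and on whether the reference is the major state or the whole ensemble.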
(10) Typically, an excited intermediate in protein unfolding is a monomer, while here it is an LC dimer. Is this unusual?
This is a good point. We think that intermediates have mostly been studied in monomeric proteins because these are more commonly used as model systems, but we prefer not to discuss this point further.
(11) Low deuterium uptake is consistent with a rigid structure but may also reflect buried structure and/or structure that moves on a time scale greater than the labeling time.
We agree.
Reviewer #3 (Recommendations for the authors):
(1) The p-value (statistical significance) of Rg difference should be computed.
We thank the reviewer for the suggestion; we calculated the p-value, which proved highly significant.
(2) The significance of mutations (SHM?) at the interface, such as A40G should be compared with previous observations. (Garrofalo et al., 2021).
We thank the reviewer for the suggestion, a sentence has been added in the discussion.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this article, the authors describe mouse models presenting with Becker muscular dystrophy. They created three transgenic models carrying three representative exon deletions: ex45-48 del., ex45-47 del., and ex45-49 del. This article is well written but needs improvement in some points.
Strengths:
This article is well written. The evidence supporting the authors' claims is robust, though further implementation is necessary. The experiments conducted align with the current state-of-the-art methodologies.
Weaknesses:
This article does not analyze atrophy in the various mouse models. Implementing this point would improve the impact of the work.
We thank the reviewer for their constructive suggestions and comments on this work. Muscle hypertrophy is shown with growth in dystrophin-deficient skeletal muscle in mdx mice; thus, we did not pay attention to the factors associated with muscle atrophy in BMD mice. As the reviewer suggested, the examination of the association between type IIa fiber reduction and muscle atrophy is important, and the result is considered to be helpful in resolving the cause of type IIa fiber reduction in BMD mice.
In response, we reviewed the following.
(1) The cross-sectional areas (CSAs) of muscles. We confirmed that the CSAs in BMD and mdx mice were rather high at 3 months, in accordance with muscle hypertrophy, compared with those of WT mice. The data is presented in Fig. 4–figure supplement 1B.
(2) The mRNA expression levels of Murf1 and atrogin-1. We confirmed that these muscle atrophy inducing factors did not differ among WT, BMD, and mdx mice. The data is presented in Fig. 4–figure supplements 1C and 1D.
Reviewer #2 (Public review):
Summary:
Miyazaki et al. established three distinct BMD mouse models by deleting different exon regions of the dystrophin gene, observed in human BMD. The authors demonstrated that these models exhibit pathophysiological changes, including variations in body weight, muscle force, muscle degeneration, and levels of fibrosis, alongside underlying molecular alterations such as changes in dystrophin and nNOS levels. Notably, these molecular and pathological changes progress at different rates depending on the specific exon deletions in the dystrophin gene. Additionally, the authors conducted extensive fiber typing, revealing a site-specific decline in type IIa fibers in BMD mice, which they suggest may be due to muscle degeneration and reduced capillary formation around these fibers.
Strengths:
The manuscript introduces three novel BMD mouse models with different dystrophin exon deletions, each demonstrating varying rates of disease progression similar to the human BMD phenotype. The authors also conducted extensive fiber typing across different muscles and regions within the muscles, effectively highlighting a site-specific decline in type IIa muscle fibers in BMD mice.
Weaknesses:
The authors have inadequate experiments to support their hypothesis that the decay of type IIa muscle fibers is likely due to muscle degeneration and reduced capillary formation. Further investigation into capillary density and histopathological changes across different muscle fibers is needed, which could clarify the mechanisms behind these observations.
We thank the reviewer for these positive comments and the very important suggestion about type IIa fiber reduction and capillary change around muscle fibers in BMD mice. From the results of the cardiotoxin-induced muscle degeneration and regeneration model, type IIa and IIx fibers showed delayed recovery compared with that of type-IIb fibers. However, this delayed recovery of type IIa and IIx could not explain the cause of the selective muscle fiber reduction limited to type IIa fibers in BMD mice. Therefore, we considered vascular dysfunction as the reason for the selective type IIa fiber reduction, and we found morphological capillary changes from a “ring pattern” to a “dot pattern” around type IIa fibers in BMD mice. However, the association between selective type IIa fiber reduction and the capillary change around muscle fibers in BMD mice remains unclear due to the lack of information about capillaries around type IIx and IIb fibers. The reviewer pointed out this insufficient evaluation of capillaries around other muscle fibers (except for type IIa fibers), and this suggestion is very helpful for explaining the association between selective type IIa fiber reduction and vascular dysfunction in BMD mice.
In response, we reviewed the following.
(1) The capillary formation around type IIx, IIb, and I fibers, in addition to that around type IIa fibers. We found that capillaries contacting type IIx, IIb, and I fibers were poor in WT mice compared with those around type IIa fibers, with ‘incomplete ring-patterns’ around type IIx fibers, and ‘dot-patterns’ around type IIb and I fibers in WT mice. Morphological capillary changes around muscle fibers from WT to d45-49 and mdx mice were ‘incomplete ring-pattern’ to ‘dot-pattern’ around type IIx fibers, and ‘dot-pattern’ to ‘dot-pattern’ around type IIb and I fibers. This was in contrast to those around type IIa fibers: remarkable ‘ring-pattern’ to ‘dot-pattern’. These data are presented in Fig. 6B.
(2) The endothelial area in contact with type IIx, IIb, and I fibers, and additionally that in contact with type IIa fibers. The endothelial area in contact with both type IIa and IIx fibers was less in d45-49 and mdx mice than in WT mice, but the reduction was larger around type IIa fibers than around type IIx fibers, reflecting the difference between the ‘ring-pattern’ around the former and the ‘incomplete ring-pattern’ around the latter in WT mice. These data are presented in Fig. 6C.
(3) Transversely interconnected branches and capillary loops, using longitudinal muscle sections. We confirmed that there were fewer interconnected capillaries in BMD and mdx mice than in WT mice. These data are presented in Fig. 6E.
(4) The mRNA expression levels of neuronal nitric oxide synthase (nNOS). We confirmed that nNOS protein expression levels were decreased in BMD and mdx mice in spite of adequate levels of nNOS mRNA expression. The data on nNOS mRNA expression levels is presented in Fig. 3–figure supplement 1C.
(5) We added a sentence in the Abstract about the potential utility of BMD mice in developing vascular targeted therapies.
Recommendation for the authors:
Reviewer #1 (Recommendation for the authors):
Abstract:
More emphasis should be placed on the pathological implications of Becker muscular dystrophy (BMD). Furthermore, the findings made in this article and the conclusions should be emphasized. Abbreviations such as DMD and MDX should be written in full at first use and only then abbreviated.
We appreciate the reviewers’ comments, and we apologize for the confusion over abbreviations. DMD is the gene name encoding dystrophin, and mdx is the strain name of mouse lacking dystrophin.
In the Abstract and the Figure legends we changed:
(1) DMD to DMD (italicized, as a gene name);
(2) mdx mice to mdx mice (strain name italicized).
Results:
Line 95: in this line, the authors evaluated serum creatine kinase (CK) levels at 1, 3, 6 and 12 months in WT mice and mdx mice. Why did you decide to study it? This part should be described in more detail. Serum CK is one of the main markers of muscle necrosis; therefore, I would report this data alongside the description of the muscle histology and necrotic fibers.
We thank the reviewers for the important remarks. In this study, serum creatine kinase (CK) levels were two-fold to four-fold higher in BMD mice than in WT mice, but its rate of increase was less than that of mdx mice. We consider that the lesser changes in serum CK levels in BMD mice may be due to the smaller area of muscle degeneration because of focal and uneven muscle degeneration compared with that in mdx mice, which showed diffuse muscle degeneration.
In response, we have moved the description of serum CK levels in the Results, from the section about the establishment of BMD mice to the section about site-specific muscle degeneration in BMD mice.
In addition, we added a description in the Discussion about the possible association between the lesser changes in serum CK levels in BMD mice and its uneven distribution of muscle degeneration.
Line 192-202: In these lines, the authors observed a decrease in type IIa fibers after 3 months in BMD mice. I suggest also evaluating atrophy by measuring cross-sectional areas (CSA) and the expression of Murf1 and Atrogin1.
We thank the reviewer for the point about the association between type IIa fiber reduction and muscle atrophy. We evaluated the CSAs and the mRNA expression levels of Murf1 and atrogin-1. We confirmed that the CSAs in BMD and mdx mice were rather high at 3 months, in accordance with muscle hypertrophy, compared with those of WT mice, and that Murf1 and atrogin-1 mRNA expression levels did not differ among WT, BMD, and mdx mice. These data are presented in Fig. 4–figure supplements 1B, 1C, and 1D. We added a sentence about the changes in CSA and muscle atrophy inducing factors in the Discussion.
Methods and material
Line 342-348: authors have described animals, but not specified sex and number of mice in each group. This part should be improved.
We apologize for our insufficient information about the sex and number of mice in the Materials and methods.
We added a sentence specifying the sex, number, and evaluation period of each mouse group in the section on the generation of BMD mice.
Line 426-433: authors described qPCR. It is necessary that the authors also describe primer sequences.
We apologize for any lack of information about the primer sequences used in qPCR analysis. Supplemental Table 1 lists the primer sequences.
We also added a sentence about the information in the primer list in the section on RNA isolation and RT-PCR in the Materials and methods.
Reviewer #2 (Recommendation for the authors):
Miyazaki et al. established three distinct BMD mouse models by removing different exon regions of the dystrophin gene. The authors demonstrated that the pathophysiological and molecular changes in these models progress at varying rates. Additionally, they observed a site-specific decline in type IIa fibers in BMD mice, while the proportions of other fiber types, such as type I and type IIx, remained consistent with those in wild-type mice. They proposed that the selective decay of type IIa fibers in BMD mice could be due to two primary factors: 1) muscle degeneration and regeneration, supported by their findings in cardiotoxin-treated mouse models, and 2) reduced capillary formation around type IIa fibers. However, the authors also presented evidence that type IIx fibers exhibited delayed recovery, similar to type IIa fibers, as demonstrated in cardiotoxin-induced regeneration models. Additionally, dot-patterned capillary formations were observed around both type IIa and type IIx fibers. Despite these findings, BMD mice did not show any changes in the proportion of type IIx fibers in inner BMD muscles. The authors should consider adding further analysis to strengthen their hypothesis and to disclose any possible mechanisms that led to these discrepancies.
If the authors hypothesize that reduced capillary density around type IIa fibers contribute to their site-specific decay in BMD mice, they should consider measuring and statistically analyzing the endothelial area around all fiber types. By plotting and comparing these measurements across different fiber types between wild-type, BMD, and mdx mice, the authors could provide more robust evidence to support their hypothesis. This approach would help clarify whether reduced capillary density is a contributing factor to the site-specific decay of type IIa fibers in BMD mice and the more diffuse, non-specific muscle changes observed in mdx mice.
The authors reported in the first part of the manuscript that histopathological changes, including muscle degeneration in BMD mice, are predominantly restricted to the inner part of the muscles. In the second part, they noted a decline in type IIa fibers specifically in the inner muscle region. To strengthen the hypothesis that the decay of type IIa fibers in the inner muscle is linked to muscle degeneration, the authors should consider performing histopathological measurements across different fiber types within the inner muscle. Reporting the correlations between these measurements would provide more compelling evidence to support their hypothesis.
We thank the reviewer for these important suggestions about the association between type IIa fiber reduction and capillary change around muscle fibers in BMD mice. We prepared an additional evaluation about the capillary formation (in Fig. 6B) and endothelial area (in Fig. 6C) around type IIx, IIb, and I fibers. We found that capillaries contacting around type IIx, IIb, and I fibers were poor in WT mice compared with those around type IIa fibers, and showed an ‘incomplete ring-pattern’ around type IIx fibers and a ‘dot-pattern’ around type IIb and I fibers in WT mice, in contrast with type IIa fibers, which showed remarkable ‘ring-pattern’ capillaries. Reflecting this, the changes in endothelial area around type IIx, IIb, and I fibers between WT and BMD mice were less than those around type IIa fibers. These results suggest that type IIa fibers may require numerous capillaries and maintained blood flow compared with type IIx, IIb, and I fibers, and this high requirement for blood flow might be associated with the type IIa fiber-specific decay in BMD mice.
We added the following.
(1) Sentences in the Results about the capillary changes around type IIx, IIb, and I fibers in WT, d45-49, and mdx mice.
(2) Sentences in the Results about the changes in endothelial area around type IIx, IIb, and I fibers in WT, d45-49, and mdx mice.
(3) Sentences in the Discussion about the association between the type IIa fiber-specific decay in BMD mice and the differences in capillary changes of each muscle fiber from WT to BMD mice.
We changed a sentence in the Discussion about the delayed recovery of type IIa and IIx fibers after CTX injection, to make it clear that the recovery of type IIx fibers was slower than that of type IIa fibers after CTX injection, and that therefore the type IIa fiber-specific decay in BMD mice might not be explained by this vulnerability and delayed recovery during muscle degeneration and regeneration.
Minor Issues:
Line 103: The word "mice" is duplicated and should be corrected.
We apologize that “mice” was duplicated. We have corrected it.
Line 120: Revise for clarity: "The proportion of opaque fibers is significantly different between d45-48 mice and WT at 3 months, with an increased tendency observed only in 1-month-old mice."
We apologize for the confusion about the proportion of opaque fibers. We revised this sentence as follows.
“Opaque fibers, which are thought to be precursors of necrotic fibers, increased at an earlier age of 1 month in d45–49 mice compared with WT mice; in contrast, the proportion of opaque fibers differs significantly between d45–47 and WT mice at 3 months, with an increased tendency only in 1-month-old mice (Fig. 2C).”
Line 152: Clarify the statement regarding utrophin levels, as it currently contradicts the Western blot data. The sentence reads: "The increased levels of utrophin are 8-fold higher at 1 month and 30-fold higher at 3 months." This should be verified against the data, as the band densities in the Western blots suggest otherwise.
We apologize for the confusion about utrophin expression levels. We revised this sentence as follows.
“By western blot analysis, the utrophin expression levels showed only an increased tendency in all BMD mice at 3 months, whereas there was a significant increase in mdx mice (8-fold at 1 month, and 30-fold at 3 months) compared to WT mice (Figs. 3C and F).”
Line 235: Correct the sentence to accurately reflect the findings: "BMD mice showed reduced muscle weakness."
We apologize for our incorrect wording. We have removed the word “reduced” in this sentence.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
The manuscript by Dr. Shinkai and colleagues is about the posttranslational modification of a highly important protein, MT3, also known as the growth inhibitory factor. Authors postulate that MT3, or generally all MT isoforms, are sulfane sulfur binding proteins. The presence of sulfane sulfur at each Cys residue has, according to the authors, a critical impact on redox protein properties and almost does not affect zinc binding. They show a model in which 20 Cys residues with sulfane sulfur atoms can still bind seven zinc ions in the same clusters as unmodified protein. They also show that recombinant MT3 (but also MT1 and MT2) protein can react with HPE-IAM, an efficient trapping reagent of persulfides/polysulfides. This reaction performed in a new approach (high temperature and high reagent concentration) resulted in the formation of bis-S-HPE-AM product, which was quantitatively analyzed using LC-MS/MS. This analysis indicated that all Cys residues of MT proteins are modified by sulfane sulfur atoms. The authors performed a series of experiments showing that such protein can bind zinc, which dissociates in the reaction with hydrogen peroxide or SNAP. They also show that oxidized MT3 is reduced by thioredoxin. It gives a story about a new redox-dependent switching mechanism of zinc/persulfide cluster involving the formation of cystine tetrasulfide bridge.
The whole story is hard to follow due to the lack of many essential explanations or full discussion. What needs to be clarified is the conclusion (or its lack) about MT3 modification proven by mass spectrometry. Figure 1B shows the FT-ICR-MALDI-TOF/MS spectrum of recombinant MT3. It clearly shows the presence of unmodified MT3 protein without zinc ions. Ions dissociate in acidic conditions used for MALDI sample preparation. If the protein contained all Cys residues modified, its molecular weight would be significantly higher. Then, they show the MS spectrum (low quality) of oxidized protein (Fig. 1C), in which new signals (besides reduced apo-MT3) are observed. They conclude that new signals come from protein oxidation and modification with one or two sulfur atoms. If the conclusion on Cys residue oxidation is reasonable, how this protein contains sulfur is unclear. What is the origin of the sulfur if apo-MT does not contain it? Oxidized protein was obtained by acidification of the protein, leading to zinc dissociation and subsequent neutralization and air oxidation. Authors should perform a detailed isotope analysis of the isotopic envelope to prove that sulfur is bound to the protein. They say that the +32 mass increase is not due to the appearance of two oxygen donors. They do not provide evidence. This protein is not a sulfane sulfur binding protein, or its minority is modified. Moreover, it is unacceptable to write that during MT3 oxidation are "released nine molecules of H2". How is hydrogen molecule produced? Moreover, zinc is not "released", it dissociates from protein in a chemical process.
Thank you for your comment. According to your suggestion, we have rewritten the corresponding sentences below, together with addition of new Fig.1D.
First, the sentence “which corresponded to the mass of zinc-free apo-GIF/MT3 and indicated that zinc was removed during MS analysis.” was changed to “which corresponded to the mass of zinc-free apo-GIF/MT3 and indicated that zinc dissociates from protein in acidic conditions used for MALDI sample preparation.” in the introduction section. Second, we have added the following sentence “However, FT-ICR-MALDI-TOF/MS analysis failed to detect sulfur modifications in GIF/MT-3 (Fig. 1B), suggesting that sulfur modifications in the protein were dissociated during laser desorption/ionization. Therefore, we postulate that the small amount of sulfur detected in oxidized apo-GIF/MT-3 is derived from the effect of laser desorption/ionization rather than any actual modification of the minority component.” in the discussion section. Third, we have added new Fig. 1D and the corresponding citation in the introduction. Fourth, the sentence “An increase in mass of 32 Da can also result from addition of two oxygen atoms, but we attributed it to one sulfur atom for reasons described later.” was changed to “Note that an increase in mass of 32 Da can also result from addition of two oxygen atoms.”.
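The ambiguity between one sulfur atom and two oxygen atoms can be made concrete with monoisotopic masses. The following is a minimal sketch, assuming standard monoisotopic masses for 32S and 16O; the function names and the m/z value of 7000 are hypothetical illustrations, not values taken from the manuscript.

```python
# Monoisotopic masses in Da for the most abundant isotopes (standard values).
MASS_32S = 31.97207117
MASS_16O = 15.99491462


def mass_shift_gap():
    """Mass difference (Da) between adding two O atoms and adding one S atom."""
    return 2 * MASS_16O - MASS_32S


def resolving_power_needed(ion_mz, charge=1):
    """Minimum m/dm needed to separate the +S and +2O species at a given m/z."""
    # On the m/z axis the two species are separated by (mass gap / charge).
    return ion_mz / (mass_shift_gap() / charge)


def persulfidation_shift(n_sulfur):
    """Monoisotopic mass increase (Da) from adding n sulfane sulfur atoms."""
    return n_sulfur * MASS_32S


# Hypothetical, illustrative m/z only (assumption, not from the manuscript).
hypothetical_mz = 7000.0
print(f"+2O minus +S gap: {mass_shift_gap():.5f} Da")
print(f"Resolving power needed at m/z {hypothetical_mz:.0f}: "
      f"{resolving_power_needed(hypothetical_mz):.0f}")
print(f"Shift for 20 sulfane sulfur atoms: {persulfidation_shift(20):.2f} Da")
```

The gap of roughly 0.018 Da is why a nominal +32 Da shift alone cannot distinguish the two assignments, whereas a shift from 20 added sulfur atoms (about 639 Da) would be unmistakable if it survived ionization.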
Another important point is a new approach to the HPE-IAM application. Zinc-binding MT3 was incubated with 5 mM reagent at 60°C for 36 h. The authors claim that a high concentration was required because apoMT3 has a stable conformation. Figure 2B shows that product concentration increases with higher temperature, but it is unclear why such a high temperature was used. Figure 1D shows that at 37°C, there is almost no reaction at 5 mM reagent. Changing parameters sounds reasonable only when the reaction is monitored by mass spectrometry. In conclusion, about 20 sulfane sulfur atoms present in MT3 would be clearly visible. Such evidence was not provided. Increased temperature and reagent concentration could cause modification of cysteinyl thiol/thiolates as well, not only persulfides/polysulfides. Therefore, it is highly possible that non-modified MT3 protein could react with HPE-IAM, giving false results. Besides mass spectrometry, which would clearly prove modifications of 20 Cys, the authors should use a very important control, which could be a chemically synthesized beta- or alpha-domain of MT3 reconstituted with zinc (many protocols are present in the literature). Such models are commonly used to test any kind of chemistry of MTs. If a non-modified chemically obtained domain would undergo a reaction with HPE-IAM under such rigorous conditions, then my expectation would be right.
Thank you for your comments. Although we have already confirmed that no false-positive results were observed using this method in Fig. 5 (previously Fig. 4), we have conducted additional experiments by preparing chemically synthesized α- and β-domains of GIF/MT-3, as well as recombinant α- and β-domains of GIF/MT-3. As shown in the new Fig. S2A, the chemically synthesized α- and β-domains of GIF/MT-3 detected almost no sulfane sulfur (less than 1 molecule per protein), whereas the recombinant α- and β-domains detected several molecules of sulfane sulfur (more than 5 molecules per protein) (Fig. S2A). Therefore, I would like to emphasize here that the cysteine residue itself cannot be the source of the bis-S-HPE-AM product (sulfane sulfur derivative).
Accordingly, we have added the following sentence in the results section: “Because this assay was performed at relatively high temperatures (60°C), we also examined the sulfane sulfur levels of several mutant proteins using chemically synthesized α- and β-domains of GIF/MT-3 to eliminate false-positive results. As shown in Fig. S2A, sulfane sulfur (less than 1 molecule per protein) was undetectable in chemically synthesized α- and β-domains of GIF/MT-3, whereas several molecules of sulfane sulfur per protein were detected in recombinant α- and β-domains (Fig. S2B, left panel). These findings indicated that the sulfane sulfur detected in our assay was derived from biological processes executed during the production of GIF/MT-3 protein. We further analyzed mutant proteins with β-Cys-to-Ala and α-Cys-to-Ala substitutions and found that their sulfane sulfur levels were comparable with those of the α- and β-domains of GIF/MT-3, respectively (Fig. S2B, left panel). Additionally, Ser-to-Ala mutation did not affect the sulfane sulfur levels of GIF/MT-3. The zinc content of each mutant protein was also determined under these conditions (Fig. S2B, right panel).”
- The remaining experiments provided in the manuscript can also be applied for non-modified protein (without sulfane sulfur modification) and do not provide worthwhile evidence. For instance, hydrogen peroxide or SNAP may interact with non-modified MTs. Zinc ions dissociate due to cysteine residue modification, and TCEP may reduce oxidized residue to rescue zinc binding. Again, mass spectrometry would provide nice evidence.
Thank you for your comment. We understand that such experiments can also be applied to non-modified proteins (without sulfane sulfur modification). However, the experiments shown in Fig. 4 and Fig. 6 were conducted to investigate the role of sulfane sulfur under oxidative stress conditions, rather than to examine sulfur modification in the protein itself. As mentioned previously, it is difficult to detect sulfur modifications directly in the protein using MALDI-TOF/MS (Fig. 1), as sulfur modifications appear to dissociate during the laser desorption/ionization process.
- The same is thioredoxin (Fig. 7) and its reaction with oxidized MT3. Nonmodified and oxidized MT3 would react as well.
Thank you for your comment. We understand that such experiments can also be applied to non-modified MT-3 protein. However, to the best of our knowledge, this is the first report demonstrating that apo-MT-3 can serve as a good substrate for the Trx system. In fact, this experiment is not intended to prove that MT-3 is sulfane sulfur-binding protein. Rather, it demonstrates the novel finding that apo-MT3 serves as an excellent substrate for Trx and that the sulfane sulfur (persulfide structure) remains intact throughout the reduction process.
- If HPE-IAM reacts with Cys residues of unmodified MT3, which is more likely the case under the conditions used, the protein product of such a reaction will not bind zinc. This could explain the cyanolysis experiment (Fig. 6).
Thank you for your comment. As you pointed out, HPE-IAM reacts with cysteine residues in unmodified MT-3, thereby preventing zinc from binding to the protein. However, we did not use HPE-IAM prior to measuring zinc binding. Instead, HPE-IAM was used solely for determining the sulfane sulfur content in the protein, and thus it cannot explain the results of the cyanolysis experiment.
- Figure 4 shows the reactivity of (poly)sulfides with TCEP and HPE-IAM. What are redox potentials? Do they correlate with the obtained results?
Thank you for your comment. However, we must apologize, as we do not fully understand the rationale for determining redox potentials in this experiment. We believe the data are clear and the results convincing.
- Raman spectroscopy experiments would illustrate the presence of sulfane sulfur in MT3 only if all Cys were modified.
Yes, that is correct. Since approximately 20 sulfane sulfur atoms are detected in the protein with 20 cysteine residues, we believe that nearly all cysteine residues are modified by sulfane sulfur. Therefore, Raman spectroscopy is considered applicable to our current study.
- The modeling presented in this study is very interesting and confirms the flexibility of metallothioneins. MT domains are known to bind various metal ions of different diameters. In this way, they adapt to ions of larger size. The same mechanism could be present from the protein site. The presence of 9 or 11 sulfur atoms in the beta or alpha domain would increase the size of the domains without changing the cluster structure.
We truly appreciate your positive evaluation of this work.
- Comment to authors. Apo-MT is not present in the cell. It exists as a partially metallated species. The term "apo-MT" was introduced to explain that MTs are not fully saturated by metals and function as a metal buffer system. Apo-MT comes from old ages when MT was considered to be present only in two forms: apo-form and fully saturated forms.
Thank you for your insightful comments. We find it reasonable to understand that apo-MT exists as a partially metallated species within the cell.
Reviewer #2 (Public Review):
Summary:
In this manuscript, the authors reveal that GIF/MT-3 regulates zinc homeostasis depending on the cellular redox status. The manuscript is technically sound, and the data concretely suggest that the recombinant MTs, not only GIF/MT-3 but also canonical MTs such as MT-1 and MT-2, contain sulfane sulfur atoms for Zn binding. The scenario proposed by the authors seems reasonable to explain Zn homeostasis by the cellular redox balance.
Strengths:
The data presented in the manuscript solidly reveal that recombinant GIF/MT-3 contains sulfane sulfur.
Weaknesses:
It is still unclear whether native MTs, in particular, induced MTs in vivo contain sulfane sulfur or not.
Thank you for pointing out the strengths and weaknesses of this manuscript. Based on your suggestions, we have determined the sulfane sulfur content in the native GIF/MT-3 protein, as explained in our response to "Recommendations for the Authors #2."
Reviewer #3 (Public Review):
Summary:
The authors were trying to show that a novel neuronal metallothionein of poorly defined function, GIF/MT3, is actually heavily persulfidated in both the Zn-bound and apo (metal-free) forms of the molecule as purified from a heterologous or native host. Evidence in support of this conclusion is compelling, with both spectroscopic and mass spectrometry evidence strongly consistent with this general conclusion. The authors would appear to have achieved their aims.
Strengths:
The analytical data in support of the authors' primary conclusions are compelling. The authors also provide some modeling evidence that strongly supports the contention that MT3 (and other MTs) can readily accommodate sulfane sulfur on each of the 20 cysteines in the Zn-bound structure, with little perturbation of the structure. This is not the case with Cys trisulfides, which suggests that the persulfide-metallated state is clearly positioned at lower energy relative to the immediately adjacent thiolate- or trisulfidated metal coordination complexes.
Weaknesses:
The biological significance of the findings is not entirely clear. On the one hand, the analytical data are clearly solid (albeit using a protein derived from a bacterial over-expression experiment), and yes, it's true that sulfane S can protect Cys from overoxidation, but everything shown in the summary figure (Fig. 8D) can be done with Zn release from a thiol by ROS, and subsequent reduction by the Trx/TR system. In addition, it's long been known that Zn itself can protect Cys from oxidation. I view this as a minor weakness that will motivate follow-up studies. Fig. 1 was incomplete in its discussion and only suggests that a few S atoms may be covalently bound to MT3 as isolated. This is in contrast to the sulfane S "release" experiment, which I find quite compelling.
Impact:
The impact will be high since the finding is potentially disruptive to the metals-in-biology field in general and the MT field for sure. The sulfane sulfur counting experiment (the HPE-IAM electrophile trapping experiment) may well be widely adopted by the field. Those of us in the metals field always knew that this was a possibility, and it will be interesting to see the extent to which metal-binding thiolates broadly incorporate sulfane sulfur into their first coordination shells.
Thank you for pointing out the strengths and weaknesses of this manuscript. As you noted, the explanations and discussions regarding Fig. 1 were missing. To address this, we have added the following sentences to the discussion section: “However, FT-ICR-MALDI-TOF/MS analysis failed to detect sulfur modifications in GIF/MT-3 (Fig. 1B), suggesting that sulfur modifications in the protein were dissociated during laser desorption/ionization. Therefore, we postulate that the small amount of sulfur detected in oxidized apo-GIF/MT-3 is derived from the effect of laser desorption/ionization rather than any actual modification of the minority component.”
Reviewer #1 (Recommendations For The Authors):
Overall, the topic of the study is interesting, but the provided evidence is insufficient to claim that MT3 is a sulfane sulfur-binding protein. Indeed, some recent studies showed that natural and recombinant MT proteins can be modified, but only one or a few cysteine residues were modified. Authors should follow my suggestion and apply mass spectrometry to all performed reactions and, first of all, to freshly obtained protein. I strongly suggest using chemically synthesized and reconstituted domains to test whether the home-developed approach is appropriate. Moreover, native MS and ICP-MS analysis of MT3 would support their claims.
Thank you for your insightful comments. Following your suggestions, we have prepared chemically synthesized proteins of the α- and β-domains of GIF/MT-3 and conducted additional experiments, as explained in response comments to “Public Review #1”. Regarding the MS analysis, we have also added a discussion on the difficulty of detecting sulfur modifications in the protein.
Reviewer #2 (Recommendations For The Authors):
I have some minor points which should be considered by the authors.
(1) Table 1: In the simulation by MOE, the authors speculated 7 atoms of metal bound to GIF/MT-3. Although a total of 7 atoms of Zn or Cd are actually bound to MTs as a divalent ion, the number of Cu and Hg bound to MTs as a monovalent ion is scientifically controversial. Several ideas have been proposed in the literature, however, "7 atoms of Cu or Hg" could be inappropriate as far as I know. The authors should simulate again using a more appropriate number of Cu or Hg in MTs.
Thank you for providing this valuable information. We reviewed several papers by the Stillman group and found that the relative binding constants of Cu4-MT, Cu6-MT, and Cu10-MT were determined after the addition of Cu(I) to apo MT-1A, MT-2, and MT-3 (Melenbacher and Stillman, Metallomics, 2024). However, incorporating these copper numbers into our GIF/MT-3 simulation model proved challenging. Therefore, we decided to omit the score value for copper in Table 1.
On the other hand, some researchers have reported that mercury binds to MT as a divalent ion, and the formation of Hg<sub>7</sub>MT is possible (not just other forms). Therefore, we decided to continue using the score value for mercury shown in Table 1.
(2) If possible, native MT samples isolated from an experimental animal should be evaluated for the sulfane sulfur content. Canonical MTs, MT-1 and MT-2, are highly inducible by not only heavy metals but also oxidative stress. Under the oxidative stress condition such as the exposure of hydrogen peroxide, it is questionable whether the induced Zn-MTs contain sulfane sulfur or not.
According to your suggestion, we evaluated the sulfane sulfur content in native GIF/MT-3 samples isolated from mouse brain cytosol (Fig. 10). The measured amount was 3.3 per protein. This suggests that sulfane sulfur in GIF/MT-3 could be consumed under oxidative conditions, as you anticipated. Another possible explanation for the discrepancy between the native form and the recombinant protein relates to metal binding in the protein. It is generally understood that both zinc and copper bind to GIF/MT-3 in approximately equal proportions in vivo. When we prepared recombinant copper-binding GIF/MT-3 protein, the sulfane sulfur content in the protein was significantly different (approximately 4.0 per protein) compared to the Zn<sub>7</sub>GIF/MT-3 form. Further studies are needed to clarify the relationship between sulfane sulfur binding and the type of bound metal.
(3) The biological significance of sulfane sulfur in MTs is still unclear to me.
Thank you for your comments. To address this question, we have added the following sentence to the discussion section: “The biological significance of sulfane sulfur in MTs lies in its ability to 1) contribute to metal binding affinity, 2) provide a sensing mechanism against oxidative stress, and 3) aid in the regeneration of the protein.”
(4) According to the widely accepted nomenclature of MT, "MT3" should be amended to "MT-3".
According to your suggestion, we have amended MT3 to MT-3 throughout the manuscript.
Reviewer #3 (Recommendations For The Authors):
Most of my comments are editorial in nature, largely focused on what I perceive as overinterpretation or unnecessary speculation.
The authors state in the abstract that the intersection of sulfane sulfur and Zn enzymes "has been overlooked." This is not actually true - please tone down to "under investigated" or something like this.
Based on your suggestion, we have replaced the term “has been overlooked” with “has been under investigated” in the abstract.
Line 228: The discussion of Fig. 6C involved too much speculation. I cannot see a quantitative experiment that supports this.
Based on your suggestion, we have removed Fig. 6C (currently referred to as Fig. 7C). Additionally, we have revised the sentence from “implying that the sulfane sulfur is an essential zinc ligand in apo-GIF/MT3 and that an asymmetric SSH or SH ligand is insufficient for native zinc binding (Fig. 6C)” to “implying the contribution of sulfane sulfur to zinc binding in GIF/MT-3”.
Line 247 "persulfide in apo-GIF/MT3 seems.." I think the authors mean that the Zn form of the protein is resistant to Trx or TCEP.
Thank you for pointing this out. We realized that the term “persulfide in apo-GIF/MT3” might be confusing. Therefore, we have replaced it with “persulfide formation derived from apo-GIF/MT3” in the corresponding sentence.
Molecular modeling: We need more details- were these structures energy-minimized in any way? Can the authors comment on the plethora of S-S dihedral angles in these structures, and whether they are consistent with expectations of covalent geometry? Please add text to explain or even a table that compiles these data.
Thank you for your comment. Yes, energy minimization calculations for structural optimization were conducted during homology modeling in MOE. In fact, we have already stated in the Methods section that “Refinement of the model with the lowest generalized Born/volume integral (GBVI) score was achieved through energy minimization of outlier residues in Ramachandran plots generated within MOE.” In this model, covalent geometry, including the S-S dihedral angles, is also taken into consideration.
What is a thermostability score? Perhaps a bit more discussion here and what relationship this has to an apparent (or macroscopic) metal affinity constant.
The thermostability score is used to compare the thermal stability between the wild-type and mutant proteins. As shown in Equation (1) in the Methods section, it is calculated by subtracting the energy of the hypothetical unfolded state from the energy of the folded state. Since obtaining the structure of the unfolded state requires extensive computational effort, MOE employs an empirical formula based on two-dimensional structural features to estimate it. The ΔΔG values represent the difference between ΔGf(WT) and ΔGf(Mut). However, because it is difficult to directly determine ΔGf(Mut) and ΔGf(WT), MOE calculates ΔΔG using the thermodynamic cycle equivalence ΔΔGs = ΔGsf(WT→Mut) − ΔGsu(WT→Mut), as expressed in Equation (1).
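The thermodynamic cycle invoked here can be written out explicitly. This is a sketch under an assumed sign convention (mutant minus wild type) and with assumed symbols, where G with superscript F or U denotes the free energy of the folded or unfolded state; it is not copied from Equation (1) of the manuscript, which may use different notation.

```latex
\begin{align*}
\Delta\Delta G
  &= \Delta G_f(\mathrm{Mut}) - \Delta G_f(\mathrm{WT}) \\
  &= \bigl[G^{\mathrm{F}}_{\mathrm{Mut}} - G^{\mathrm{U}}_{\mathrm{Mut}}\bigr]
   - \bigl[G^{\mathrm{F}}_{\mathrm{WT}} - G^{\mathrm{U}}_{\mathrm{WT}}\bigr] \\
  &= \bigl[G^{\mathrm{F}}_{\mathrm{Mut}} - G^{\mathrm{F}}_{\mathrm{WT}}\bigr]
   - \bigl[G^{\mathrm{U}}_{\mathrm{Mut}} - G^{\mathrm{U}}_{\mathrm{WT}}\bigr] \\
  &= \Delta G^{f}(\mathrm{WT}\to\mathrm{Mut}) - \Delta G^{u}(\mathrm{WT}\to\mathrm{Mut})
\end{align*}
```

Regrouping the four terms replaces two folding free energies, which are hard to compute directly, with two mutation free energies evaluated separately in the folded and unfolded states.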
On the other hand, the affinity score represents the interaction energy between the target ligand and the protein. In this study, we calculated the affinity score by selecting metal atoms as the ligands. The interaction energy (E<sub>int</sub>) is defined as:
E<sub>int</sub> = E<sub>complex</sub> − E<sub>receptor</sub> − E<sub>ligand</sub>
where each term is as follows:
E<sub>complex</sub>: Potential energy of the complex.
E<sub>receptor</sub>: Potential energy of the receptor alone.
E<sub>ligand</sub>: Potential energy of the ligand alone.
Each potential energy term includes contributions from bonded interactions such as bond lengths and bond angles. However, since the receptor and ligand structures do not differ between the bound and isolated states, the bonded energy components cancel out. Consequently, E<sub>int</sub> is determined as:
E<sub>int</sub> = ΔE<sub>ele</sub> + ΔE<sub>vdW</sub> + ΔE<sub>sol</sub>
Here, a negative E<sub>int</sub> indicates that the complex is more stable, while a positive E<sub>int</sub> implies that the receptor and ligand are more stable in their dissociated states.
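The interaction-energy bookkeeping described above can be sketched in a few lines. This is an illustrative sketch only: the function names and the numerical energies are hypothetical, and it does not reproduce how MOE computes the individual terms.

```python
def interaction_energy(e_complex, e_receptor, e_ligand):
    """E_int = E_complex - E_receptor - E_ligand (all in the same energy units)."""
    return e_complex - e_receptor - e_ligand


def interaction_energy_from_terms(d_ele, d_vdw, d_sol):
    """E_int from its non-bonded parts, assuming bonded terms cancel."""
    return d_ele + d_vdw + d_sol


def favored_state(e_int):
    """Interpret the sign of E_int as described in the text."""
    if e_int < 0:
        return "complex"
    if e_int > 0:
        return "dissociated"
    return "neither"


# Hypothetical energies chosen only to illustrate the sign convention.
e_int = interaction_energy(e_complex=-150.0, e_receptor=-100.0, e_ligand=-20.0)
print(e_int, favored_state(e_int))
```

With these made-up inputs the complex is 30 units lower in energy than the separated receptor and ligand, so the bound state is the favored one.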
We have revised the sentence "The affinity score was also calculated using MOE software as the difference between the ΔΔGs values of the protein, free zinc, and metal–protein complex” to "The affinity score was also calculated using MOE software as the difference between the potential energy values of the protein, free zinc, and metal–protein complex” to correct the misdescription.
Lines 278-280: The authors state that they observe a "marked enhancement of metal binding affinity, and rearrangement of zinc ions." I don't see support for this rather provocative conclusion. This is the expectation of course. I would love to see actual experimental data on this point, direct binding titrations with metals performed before and after the release of the sulfane sulfur atoms.
Thank you for your comments. Although this statement is based on the 3D modeling simulation, we have also experimentally observed that the diminishment of sulfane sulfur in GIF/MT-3 resulted in a decrease in zinc binding levels, as shown in Fig. 7. However, conducting direct binding titration experiments was difficult for us due to the difficulty in preparing pure GIF/MT-3 protein with or without sulfane sulfur. Therefore, we have revised the sentence "marked enhancement of metal binding affinity, and rearrangement of zinc ions" to simply "enhancement of metal binding affinity" to avoid over-speculation.
Table 1: quantitatively lower stability for the Cu complex. The stoichiometry is clearly wrong in this simulation; please redo this simulation with the right stoichiometry of Cu to MT3 (consult a Stillman paper).
Thank you for providing this valuable information. We reviewed several papers by the Stillman group and found that the relative binding constants of Cu4-MT, Cu6-MT, and Cu10-MT were determined after the addition of Cu(I) to apo MT-1A, MT-2, and MT-3 (Melenbacher and Stillman, Metallomics, 2024). However, incorporating these copper numbers into our GIF/MT-3 simulation model proved challenging. Therefore, we decided to omit the score value for copper in Table 1.
I like the model for reversible metal release mediated by the thioredoxin system (Fig. 8D), but you can also do this with thiols; nothing really novel here. Has it been generally established that tetrasulfides are better substrates for the Trx/TR system? The data shown in Fig. 7B seem to suggest this, but is this broadly true, from the literature?
There are reports describing that persulfides and polysulfides are reduced by the thioredoxin system. However, it is not well established that tetrasulfides are better substrates for the Trx/TR system. To the best of our knowledge, this is the first report demonstrating that apo-MT-3 can serve as a good substrate for the Trx/TR system. Further research is required to compare the catalytic efficiency between proteins containing disulfide moieties and those containing tetrasulfide moieties.
Line 380: Many groups have reported that many proteins are per- or polysulfidated in a whole host of cells using mass spectrometry workflows, and that terminal persulfides can be readily reduced by general or specific Trx/TR systems. This work could be better acknowledged in the context of the authors' demonstration of the reduction of the tetrasulfides, which itself would appear to be novel (and exciting!).
We truly appreciate your positive evaluation of this work.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This manuscript uses the eye lens as a model to investigate basic mechanisms in the Fgf signaling pathway. Understanding Fgf signaling is of broad importance to biologists as it is involved in the regulation of various developmental processes in different tissues/organs and is often misregulated in disease states. The Fgf pathway has been studied in embryonic lens development, namely with regards to its involvement in controlling events such as tissue invagination, vesicle formation, epithelium proliferation, and cellular differentiation, thus making the lens a good system to uncover the mechanistic basis of how the modulation of this pathway drives specific outcomes. Previous work has suggested that proteins, other than the ones currently known (e.g., the adaptor protein Frs2), are likely involved in Fgfr signaling. The present study focuses on the role of Shp2 and Shc1 proteins in the recruitment of Grb2 in the events downstream of Fgfr activation.
Strengths:
The findings reveal that the juxtamembrane region of the Fgf receptor is necessary for proper control of downstream events such as facilitating key changes in transcription and cytoskeleton during tissue morphogenesis. The authors conditionally deleted all four Fgfrs in the mouse lens that resulted in molecular and morphological lens defects, most importantly, preventing the upregulation of the lens induction markers Sox2 and Foxe3 and the apical localization of F-actin, thus demonstrating the importance of Fgfrs in early lens development, i.e. during lens induction. They also examined the impact of deleting Fgfr1 and 2, on the following stage, i.e. lens vesicle development, which could be rescued by expressing constitutively active KrasG12D. By using specific mutations (e.g. Fgfr1ΔFrs lacking the Frs2 binding domain and Fgfr2LR harboring mutations that prevent binding of Frs2), it is demonstrated that the Frs2 binding site on Fgfr is necessary for specific events such as morphogenesis of lens vesicle. Further, by studying Shp2 mutations and deletions, the authors present a case for Shp2 protein to function in a context-specific manner in the role of an adaptor protein and a phosphatase enzyme. Finally, the key surprising finding from this study is that downstream of Fgfr signaling, Shc1 is an important alternative pathway - in addition to Shp2 - involved in the recruitment of Grb2 and in the subsequent activation of Ras. The methodologies, namely, mouse genetics and state-of-the-art cell/molecular/biochemical assays are appropriately used to collect the data, which are soundly interpreted to reach these important conclusions. Overall, these findings reveal the flexibility of the Fgf signaling pathway and its downstream mediators in regulating cellular events. This work is expected to be of broad interest to molecular and developmental biologists.
Weaknesses:
A weakness that needs to be discussed is that Le-Cre depends on Pax6 activation, and hence its use in specific gene deletion will not allow evaluation of the requirement of Fgfrs in the expression of Pax6 itself. But since this is the earliest Cre available for deletion in the lens, mentioning this in the discussion would make the readers aware of this issue. Referring to Jag1 among "lens-specific markers" (page 5) is debatable, suggesting changing to the lines of "the expected upregulation of Jag1 in lens vesicle". The Abstract could be modified to clearly convey the existing knowledge gap and the key findings of the present study. As it stands now, it is a bit all over the place. Some typos in the manuscript need to be fixed, e.g. "...yet its molecular mechanism remains largely resolved" - unresolved? "...in the development lens" - in the developing lens? In Figure 4 legend, "(B) Grb2 mutants Grb2 mutants displayed...", etc.
We thank the reviewer for the thoughtful and constructive feedback. We have added the caveat regarding the Le-Cre dependency on Pax6 expression to the discussion, removed the reference to Jag1 as a “lens-specific marker” and corrected the typographical errors noted by the reviewer.
Reviewer #2 (Public review):
Summary:
I have reviewed a manuscript submitted by Wang et al., which is entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development". In this paper, the authors first examined lens phenotypes in mice with Le-Cre-mediated knockdown (KD) of all four FGFR (FGFR1-4), and found that pERK signals, Jag1, and foxe3 expression are absent or drastically reduced, indicating that FGF signaling is essential for lens induction. Next, the authors examined lens phenotypes of FGFR1/2-KD mice and found that lens fiber differentiation is compromised and that proliferative activity and cell survival are also compromised in lens epithelium. Interestingly, Kras activation rescues defects in lens growth and lens fiber differentiation in FGFR1/2-KD mice, indicating that Ras activation is a key step for lens development. Next, the authors examined the role of Frs2, Shp2, and Grb2 in FGF signaling for lens development. They confirmed that lens fiber differentiation is compromised in FGFR1/3-KD mice combined with Frs2-dysfunctional FGFR2 mutants, which is similar to lens phenotypes of Grb2-KD mice. However, lens defects are milder in mice with Shp2YF/YF and Shp2CS mutant alleles, indicating that the involvement of Shp2 is limited for the Grb2 recruitment for lens fiber differentiation. Lastly, the authors showed new evidence on the possibility that another adapter protein, Shc1, promotes Grb2 recruitment independent of Frs2/Shp2-mediated Grb2 recruitment.
Strengths:
Overall, the manuscript provides valuable data on how FGFR activation leads to Ras activation through the adapter platform of Frs2/Shp2/Grb2, which advances our understanding of complex modification of the FGF signaling pathway. The authors applied a genetic approach using mice, whose methods and results are valid to support the conclusion. The discussion also well summarizes the significance of their findings.
Weaknesses:
The authors eventually found that the new adaptor protein Shc1 is involved in Grb2 recruitment in response to FGF receptor activation. However, the main data for Shc1 are histological sections and statistical evaluation of lens size. So, my major concern is that the authors need to provide more detailed data to support the involvement of Shc1 in Grb2 recruitment of FGF signaling for lens development.
We thank the reviewer for the positive comments and valuable suggestions. We have addressed the concerns in detail in the response to the recommendation outlined below.
Reviewer #3 (Public review):
Summary:
The manuscript entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development" by Wang et al., investigates the molecular mechanism used by FGFR signaling to support lens development. The lens has long been known to depend on FGFR signaling for proper development. Previous investigations have demonstrated that FGFR signaling is required for embryonic lens cell survival and for lens fiber cell differentiation. The requirement of FGFR signaling for lens induction has remained more controversial as deletion of both Fgfr1 and Fgfr2 during lens placode formation does not prevent the induction of definitive lens markers such as FOXE3 or αA-crystallin. Here the authors have used the Le-Cre driver to delete all four FGFR genes from the developing lens placode demonstrating a definitive failure of lens induction in the absence of FGFR signaling. The authors focused on FGFR1 and FGFR2, the two primary FGFRs present during early lens development, and demonstrated that lens development could be significantly rescued in lenses lacking both FGFR1 and FGFR2 by expressing a constitutively active allele of KRAS. They also showed that the removal of pro-apoptotic genes Bax and Bak could also lead to a substantial rescue of lens development in lenses lacking both FGFR1 and FGFR2. In both cases, the lens rescue included both increased lens size and the expression of genes characteristic of lens cells.
Significantly, the authors concentrated on the juxtamembrane domain, a portion of the FGFRs associated with FRS2. Previous investigations have demonstrated the importance of FRS2 activation for mediating a sustained level of ERK activation. FRS2 is known to associate both with GRB2 and SHP2 to activate RAS. The authors utilized a mutant allele of Fgfr1, lacking the entire juxtamembrane domain (Fgfr1ΔFrs), and an allele of Fgfr2 containing two point mutations essential for Frs2 binding (Fgfr2LR). When combining three floxed alleles and leaving only one functional allele (Fgfr1ΔFrs or Fgfr2LR) the authors got strikingly different phenotypes. When only the Fgfr1ΔFrs allele was retained, the lens phenotype matched that of deleting both Fgfr1 and Fgfr2. However, when only the Fgfr2LR allele was retained the phenotype was significantly milder, primarily affecting lens fiber cell differentiation, suggesting that something other than FRS2 might be interacting with the juxtamembrane domain to support FGFR signaling in the lens. The authors also deleted Grb2 in the lens and showed that the phenotype was similar to that of the lenses only retaining the Fgfr2LR allele, resulting in a failure of lens fiber cell differentiation and decreased lens cell survival. However, mutating the major tyrosine phosphorylation site of GRB2 did not affect lens development. The authors additionally investigated the role of SHP2 in lens development by either deleting SHP2 or making mutations in the SHP2 catalytic domain. The deletion of the SHP2 phosphatase activity did not affect lens development as severely as the total loss of SHP2 protein, suggesting a function for SHP2 outside of its catalytic activity. Although the loss of Shc1 alone has only a slight effect on lens size and pERK activation in the lens, the authors showed that the loss of Shc1 exacerbated the lens phenotype in lenses lacking both Frs2 and Shp2.
The authors suggest that SHC1 binds to the FGFR juxtamembrane domain allowing for the recruitment of GRB2 independently of FRS2.
Strengths:
(1) The authors used a variety of genetic tools to carefully dissect the essential signals downstream of FGFR signaling during lens development.
(2) The authors made a convincing case that something other than FRS2 binding mediates FGFR signaling in the juxtamembrane domain.
(3) The authors demonstrated that although both the adaptor function and phosphatase activity of SHP2 are required for embryonic survival, neither of these activities is absolutely required for lens development.
(4) The authors provide more information as to why FGFR loss has a phenotype much more severe than the loss of FRS2 alone during lens development.
(5) The authors followed up their work analyzing various signaling molecules in the context of lens development with biochemical analyses of FGF-induced phosphorylation in murine embryonic fibroblasts (MEFs).
(6) In general, this manuscript represents a Herculean effort to dissect FGFR signaling in vivo, with biochemical backing from cell culture experiments in vitro.
We thank the reviewer for the thorough review of our paper and positive comments.
Weaknesses:
(1) The authors demonstrate that the loss of FGFR1 and FGFR2 can be compensated by a constitutively active KRAS allele in the lens and suggest that FGFRs largely support lens development only by driving ERK activation. However, the authors also saw that lens development was substantially rescued by preventing apoptosis through the deletion of BAK and BAX. To my knowledge, the deletion of BAK and BAX should not independently activate ERK. The authors do not show whether ERK activation is restored in the BAK/BAX deficient lenses. Do the authors suggest that FGFR3 and/or FGFR4 provide sufficient RAS and ERK activation for lens development when apoptosis is suppressed? Alternatively, is it the survival function of FGFR-signaling as much as a direct effect on lens differentiation?
Our interpretation is that at the lens induction stage, where FGFR1 and FGFR2 are crucial, their primary function operates through Ras signaling to promote cell survival. Thus, either constitutively active KRAS or the direct suppression of apoptosis by deleting Bak and Bax is sufficient to rescue lens induction. This rescue enables the subsequent differentiation of lens progenitor cells, a process that FGFR3 and FGFR4 are sufficient to support.
(2) The authors make the argument that deleting all four FGFRs prevented lens induction but that the deletion of only FGFR1 and FGFR2 did not. Part of this argument is the retention of FOXE3 expression, αA-crystallin expression, and PROX1 expression in the FGFR1/2 double mutants. However, in Figure 1E, and Figure 1F, the staining of the double mutant lens tissue with FOXE3, αA-crystallin, and PROX1 is unconvincing. However, the retention of FOXE3 expression in the FGFR1/FGFR2 double mutants was previously demonstrated in Garcia et al 2011. Also, there needs to be an enlargement or inset to demonstrate the retention of pSMAD in the quadruple FGFR mutants in Figure 1D.
We have updated Figure 1E with a clearer image of FOXE3 staining to better illustrate FOXE3 expression in the FGFR1/2 double mutants. It seems there may have been a misunderstanding regarding our claims about αA-crystallin and PROX1. To clarify, our observation is that both αA-crystallin and PROX1 are lost in the FGFR1/2 double mutants, which we believe is clearly demonstrated in Figure 1F. Additionally, we have added insets to Figure 1D to highlight the retention of pSMAD.
(3) Do the authors suggest that GRB2 is required for RAS activation and ultimately ERK activation? If so, do the authors suggest that ERK activation is not required for FGFR-signaling to mediate lens induction? This would follow considering that the GRB2 deficient lenses lack a problem with lens induction.
We do believe that GRB2 is required for RAS-ERK signaling activation; however, ERK activation is not absolutely required for lens induction. This conclusion is consistent with our previous study, which showed that deletion of ERK1/2 did not prevent lens induction (Garg et al. eLife 2020;9:e51915), as well as with our current findings demonstrating that the GRB2-deficient mutant is still capable of supporting lens induction.
(4) The increase in p-Shc is only slightly higher in the Cre FGFR1f/f FGFR2r/LR than in the FGFR1f/Δfrs FGFR2f/f. Can the authors provide quantification?
pShc quantification is now provided in Fig. 7B.
(5) The authors have not shown directly that Shc1 binds to the juxtamembrane region of either Fgfr1 or Fgfr2.
It is not yet clear whether Shc1 directly binds to the juxtamembrane region of FGFR1 or FGFR2, as it may also be recruited indirectly. We acknowledge this as an important question that warrants further investigation in future studies.
(6) The authors have used the Le-Cre strain for all of their lens deletion experiments. Previous work has documented that the Le-Cre transgene can cause lens defects independent of any floxed alleles in both homozygous and hemizygous states on some genetic backgrounds (Dora et al., 2014 PLoS One 9:e109193 and Lam et al., Human Genomics 2019 13(1):10). Are the controls used in these experiments Le-Cre hemizygotes?
As stated in the Methods section, Le-Cre only or Le-Cre and heterozygous flox mice were used as controls.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Weaknesses
There are only a few minor weaknesses that need to be addressed.
(1) The point could be made in the Discussion that since Le-Cre depends on Pax6 placodal expression, it is challenging to evaluate the impact of deletion of the four Fgfrs on the expression of Pax6 (since Pax6 needs to be activated prior to achieving Fgfr deletion). A different Cre line (e.g. a Cre which is expressed in the surface ectoderm prior to lens placode formation) could help partially address this question, although it may not be able to comment on the requirement of the Fgfrs specifically in the lens ectoderm. Thus, it will be prudent to mention this in the discussion.
We have added the caveat regarding the Le-Cre dependency on Pax6 expression to the discussion.
(2) Referring to Jag1 among "lens-specific markers" (page 5) is debatable, I suggest changing it along the lines of "the expected upregulation of Jag1 in lens vesicle".
The wording has been changed as suggested.
(3) The Abstract could be modified to clearly convey the existing knowledge gap and the key findings of the present study. As it stands now, it is a bit all over the place.
The abstract has been revised.
(4) Some typos in the manuscript need to be fixed.
e.g. "...yet its molecular mechanism remains largely resolved" - unresolved?, "...in the development lens" - in the developing lens?, In Fig. 4 legend, "(B) Grb2 mutants Grb2 mutants displayed...", etc.
These typos have been corrected.
Reviewer #2 (Recommendations for the authors):
My specific suggestions are shown below.
(1) The authors need to describe the role of Shc1 in FGF signaling and vertebrate lens development, by citing previous publications in the introduction.
We have detailed previous studies on the role of Shc in FGF signaling in the Introduction and discussed its function in the vertebrate lens in the Discussion section.
(2) Figure 1B bottom panels: Inset images seem to be missing, although frames and arrowheads are there. Please check them.
The inset images were correctly placed.
(3) Results (page 5, line 13): The authors mentioned "Sox2 expression remained at basal levels". Since Figure 1B indicates that Sox2 expression fails to be upregulated in FGFR1/2 mutant lens placode in contrast to Pax6, it is better to clearly mention the failure in upregulation of Sox2 expression in the FGFR1/2 mutants.
This sentence has been rewritten as suggested.
(4) Results (page 6, line 8): The authors mentioned "we observed .... expression of Foxe3 in ...mutant lens cells" (Figure 1E, arrows). However, Foxe3-expressing lens cells are a very small population in Figure 1E. It is important to state the decreased number of Foxe3-expressing lens cells in FGFR1/2 mutants. In addition, I would like to request the authors to show histograms indicating sample size and statistical analysis for marker expression: Foxe3 (Figure 1E), Prox1 and αA-crystallin (Fig. 1F), cyclin D1 and TUNEL (Fig. 1G) and pmTOR and pS6 (Supplementary figure 1B).
We added a statement indicating that the number of Foxe3-expressing cells is reduced in FGFR1/2 mutants, which is now quantified in Fig. 1H. Quantifications for Cyclin D1 and TUNEL are now shown in Fig. 1I and J, respectively. However, we chose not to quantify Prox1, αA-crystallin, pmTOR, and pS6, as the FGFR1/2 mutants showed no staining for these markers.
(5) Results (page 6, line 19 - page 7, line 6): The authors showed that inducible expression of constitutively active Kras, KrasG12D, using Le-Cre, recovered lens size to half the level of the wild-type control. However, in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D, pERK was detected in the most posterior edge of the lens fiber core, whereas pERK was detected in the broader area of the lens in control. Furthermore, pMEK was detected in the whole lens of mice with Le-Cre; FGFR1/2f/f; and LSL-KrasG12D, whereas pMEK was detected only in the lens epithelial cells at the equator. So, the spatial profile of pERK and pMEK expression was different from that of wild-type, although the authors observed that Prox1 and Crystallin expression are normally induced in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D. I wonder whether the lens normally develops in mice with Le-Cre; LSL-KrasG12D? Is the lens growth enhanced in mice with Le-Cre; LSL-KrasG12D? Please add the panels of mice with Le-Cre; LSL-KrasG12D in Figure 2B and 2C. In addition, I wonder whether apoptosis is suppressed in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D?
As we previously reported (Developmental Biology 355, 2011, 12–20), Le-Cre; LSL-KrasG12D did not lead to enhanced lens growth. While we agree that including images of Le-Cre; LSL-KrasG12D as controls in Fig. 2B and C and evaluating apoptosis in Le-Cre; FGFR1/2f/f; LSL-KrasG12D mutants would be appropriate, we regretfully no longer have these animals available to conduct these experiments.
(6) Results (page 11, line 15): the PCR genotyping image of Fig. 6C seems to be missing.
The PCR genotyping image was correctly placed below Fig. 6B.
(7) Results (page 11, lines 15-20): there is no citation of Figure 6D in the results section.
The citation for Fig. 6D is added in the results section.
(8) Figures 5H, 6H, and 7A: Western blotting of some of the pERK, ERK lanes is missing.
These western blots all have pERK/ERK overlay images.
(9) Figure 7A, western blotting data on pShc levels are important to suggest the involvement of Shc1 in Frs2-independent Grb2 activation by FGF stimulation. Please provide the histogram for statistical analysis.
pShc quantification is now provided in Fig. 7B.
(10) There is no citation of Figure 7D, E, and F in the results section. Please add them.
These citations have been added.
(11) Figures 7E and 7F: The authors showed lens morphology and lens size evaluation in genetic combinations: control, Frs2/Shc1 KD, Frs2/Shp2 KD, and Frs2/Shp2/Shc1 KD. However, I would like to request the authors to show more detailed data in these genetic combinations, for example, pERK, Foxe3, Maf, Prox1, Jag1, p57, cyclin D3, γ-crystallin, and TUNEL.
Unfortunately, we no longer have these mutant mice to perform this detailed staining.
Reviewer #3 (Recommendations for the authors):
(1) The figure legend for Figure 2 lists (G) twice. The second (G) should be (H). Also, in Figures 2G and H there is no indication as to what stage lenses were used for the TUNEL and size analyses. I assume that it was E13.5, but it should be explicitly stated.
The figure labeling has been corrected and the stage added to the figure legend.
(2) In Figure 4 A the label should be gamma-crystallin rather than r-crystallin.
The figure labeling has been corrected.
(3) In Figure 6 D, I believe that the immunolabeling for Maf and Foxe3 are reversed. The Maf should be red as it is in the fibers and the Foxe3 should be green as it is epithelial.
The figure labeling has been corrected.
(4) In Figure 6C I believe that the labels for the WT and YF alleles on the western blot are reversed.
The YF PCR band was designed to be larger than WT, so the labeling was correct as is.
(5) In Figure 6F I believe that the labels for WT and CS on the western blot are reversed.
The figure labeling has been corrected.
(6) In Supplemental figure 2 there are no genotype labels for the TUNEL bar graph.
The figure labeling has been added.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer 3 (Public review):
Major comments:
(1) Can isolated mitochondria be transported to cultured cardiomyocytes, such as H9C2 cells, in vitro?
Thank you for this insightful question. Mitochondria are highly dynamic organelles that play a crucial role in cellular energy metabolism. When cells encounter various stressors and increased energy demands, they can benefit from the incorporation of exogenous mitochondria. In 2013, Masuzawa et al. (Masuzawa, et al.,2013) were the first to demonstrate that transplanted mitochondria are internalized by cardiomyocytes 2 to 8 hours after transplantation, significantly contributing to the preservation of myocardial energetics. Ali et al. (Ali, et al.,2020) discovered that exogenous mitochondria could be internalized by H9C2 cardiomyocytes as quickly as 5 minutes after co-incubation, resulting in an acute enhancement of normal cellular bioenergetics following mitochondrial transplantation. Pacak et al. (Pacak, et al.,2015) established that the internalization of mitochondria into cardiomyocytes is time-dependent and occurs through actin-dependent endocytosis.
Collectively, this evidence illustrates that exogenous mitochondria can be effectively internalized by H9C2 cells and other cardiomyocytes. Our experiments further confirmed that transplanted mitochondria can be incorporated by the myocardium in vivo.
(2) The description of results in the manuscript is too simple. It lacks detail on the rationale behind the experiments and the significance of the data.
Thank you for this suggestion. We have realized that the results in the submitted manuscript have not been adequately interpreted. We have added necessary details on the rationale behind the experiments and the significance of the data to the results section (Lines 57~59, 69~73, 81~88, 91~98, 100~102, 103~104, 109~115, 124~129, 135~146, 149~157, 159~161, 168~169, 178~179). We would like to express our gratitude to the reviewers once again and hope that our modifications will meet their requirements.
(3) The authors demonstrate that mitochondrial transplantation reduces cardiomyocyte apoptosis. Therefore, Western blot analysis of apoptosis-related caspases could be provided for further confirmation.
Thank you for this constructive comment. We fully agree with the reviewer's perspective on the detection of apoptosis-related caspases and have conducted a Western blot assay to investigate the impact of mitochondria on myocardial tissue. Our new evidence indicates that rats receiving mitochondrial transplantation exhibited reduced expression of cleaved caspase-3 compared with those in the NS and Vehicle groups (Fig. 6G, 6H, Lines 168~169), suggesting that mitochondrial transplantation decreased the level of apoptosis in the myocardium.
(4) Do donor mitochondria fuse with recipient mitochondria? Relevant experiments and data should be provided to address this question.
This is a very helpful comment. Investigating the fate of transplanted mitochondria in myocardial cells after CA is of great significance. The internalization of exogenous mitochondria has been observed across various cell types (Liu, et al.,2021; Shanmughapriya, et al.,2020). Notably, a recent study indicated that after being incorporated into host cells, isolated mitochondria are transported to endosomes and lysosomes. Subsequently, most of these mitochondria escape from these compartments and fuse with the endogenous mitochondrial network (Cowan, et al.,2017). We have discussed this in the manuscript. (Lines 217~220)
Oxidative stress, a pathophysiological phenomenon common to cells suffering from ischemia/reperfusion insults after CA/CPR, has been implicated in promoting the internalization and survival of exogenous mitochondria (Aharoni-Simon, et al.,2022). In our study, we confirmed that mitochondrial transplantation can enhance the metabolism of cardiomyocytes, increase ATP levels, and reduce reactive oxygen species (ROS). Our results indirectly confirm that isolated mitochondria can successfully fuse with myocardial mitochondria.
(5) In Figure 5A, the histograms are not labeled with the specific experimental groups.
We apologize for this oversight. We have labeled the specific experimental groups in the histograms presented in Figure 6B and 6C (originally Figure 5A).
Reviewer #1 (Recommendations For The Authors):
(1) The age, gender, and strain of the donor rats should be specified in the Methods section. Additionally, it is not obvious what doses of mitochondria were injected into the rats and how the dosage was initially determined.
Thanks for your suggestion. We have included relevant information about the donor rats in the Methods section (Lines 361~362).
In the Mito group, each animal received 0.5 mL of a 1 × 10<sup>9</sup>/mL mitochondrial suspension (Lines 342~345). Considerable amounts of data have demonstrated the efficacy of mitochondrial transplantation in cellular, animal, and human research (Alemany, et al.,2024; Kaza, et al.,2017; Liu, et al.,2023). However, there is currently no evidence to determine the optimal dosage for transplantation. In previous research, isolated mitochondria (1 × 10<sup>9</sup>) delivered to the left coronary ostium in pigs were shown to be a viable treatment modality in cardiac ischemia-reperfusion injury (Blitzer, et al.,2020; Guariento, et al.,2020). Additionally, a dose of 1 × 10<sup>9</sup> mitochondria achieves the maximal hyperemic effect when administered via intracoronary injection (Shin, et al.,2019). Considering that Sprague-Dawley (SD) rats are smaller than pigs and that there is a loss of mitochondria during pulmonary circulation, we adopted a mitochondrial transplantation dose of 5 × 10<sup>8</sup>. We will explore the optimal dosage in our future research.
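As a quick consistency check on the figures quoted above, the injected volume and suspension concentration can be multiplied to recover the total dose. The short sketch below is illustrative only; the function name `total_mitochondria` is hypothetical and is not taken from the manuscript.

```python
def total_mitochondria(volume_ml: float, concentration_per_ml: float) -> float:
    """Return the total number of mitochondria in an injected volume.

    volume_ml: injected volume in millilitres (0.5 mL in the response above)
    concentration_per_ml: mitochondria per millilitre (1e9/mL in the response above)
    """
    return volume_ml * concentration_per_ml


# Values quoted in the author response; the helper itself is an assumption
# introduced for illustration.
dose = total_mitochondria(0.5, 1e9)
print(f"{dose:.1e}")  # 5.0e+08
```

The result, 5 × 10<sup>8</sup> mitochondria per animal, matches the dose the authors state they adopted.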
(2) In Figure 4a, the number of transplanted mitochondria appears to be very low. Considering the high number of mitochondria present in cardiomyocytes, it is unclear whether this small amount of transplanted mitochondria can significantly impact complex II activity and ATP levels in myocardial tissues, as shown in Figures 4b-d, or improve survival post-ROSC, as shown in Figure 2d. Could the observed benefits of mitochondrial transplantation be due to the indirect effects of the injected mitochondria, such as the release of mitochondrial contents, rather than the mitochondria themselves, as discussed by Bertero et al. (2021, Circ. Research)? This issue should be addressed in the manuscript.
Thanks for this wonderful comment. As presented in Fig. 4 (originally Figure 4A), our results indicated the internalization of mitochondria by the myocardium, shown by colocalization of Mito-tracker and a myocardium marker. We would like to make the following points regarding Fig. 4:
(1) Significant left ventricular systolic and diastolic dysfunction that occurs in the myocardium shortly after ROSC is referred to as post-cardiac arrest myocardial dysfunction (PAMD) (Laurent, et al.,2002). The efficacy of mitochondrial transplantation for the heart following ischemia-reperfusion injury has been demonstrated in cellular, animal, and human studies, despite inadequate mitochondrial internalization (Liu, et al.,2023). Thus, even a low number of transplanted mitochondria may improve cardiac function.
(2) Only biologically active mitochondria can be specifically labeled with Mito-tracker. Therefore, the mitochondria taken up by cardiomyocytes possess complete functionality. Previous results have demonstrated that mitochondrial contents, such as nonviable mitochondria, mitochondrial fractions, mitochondrial deoxyribonucleic acid, ribonucleic acid, exogenous adenosine diphosphate and ATP, do not provide protection to the ischemic heart (McCully, et al.,2017; McCully, et al.,2009).
(3) The specific mechanism for mitochondrial internalization has yet to be fully elucidated. We fully agree with the reviewer’s opinion that other mechanisms of mitochondrial transplantation may play a role in cardiac protection. Multiple mechanisms may be involved in the cardioprotective effect of mitochondrial transplantation, and we are actively seeking a reasonable approach to verify these hypotheses in an ongoing study (Lines 236~246).
(3) In Figure 4g, the claims regarding sarcomere length, mitochondrial structure, the number of cristae, accumulated calcium etc. seem to rely on the visual interpretation of representative images. To ensure a reliable interpretation of the data, a blinded quantification of each image in each group should be conducted. The same applies to the claims made in Figure 5E.
Thanks for this suggestion. We have quantitatively evaluated the electron microscope images and HE images of the myocardium to ensure reliable interpretation. Corresponding supplements have been added to the methods (Lines 433~441, 494~496), results sections (Lines 109~115, 178~179), and Figures 5C, 5D, 6K and 6H (originally Figures 4G and 5E).
(4) In line 69, it is unclear why the authors claim that MAP and HR decrease at 1, 2, 3, and 4 hours after ROSC in all groups compared to the Sham group, despite stating in line 72 that "MAP and HR did not differ at any observational time points (P>0.05, Figure 2C)."
We apologize for our inaccurate phrasing. In the present study, there was no statistically significant difference in MAP or HR at any observational time point (P>0.05, Figure 2C). In the NS, Vehicle and Mito groups, the MAP and HR decreased at 1, 2, 3, and 4 hours after ROSC, reaching their nadir at 1 hour. Subsequently, MAP and HR increased gradually but did not show any statistically significant differences compared with the Sham group. (Lines 69~73).
(5) The absence of increased mitochondrial content in the mito-groups should be discussed further in the manuscript.
Thank you for your suggestion. In Lines 224~235, we discussed the reasons why the mass of isolated mitochondria did not increase.
(6) The N in Figure 5d should be provided.
Thanks for your suggestion. We have revised the figure legend to include the N for Figure 6F (originally Figure 5D).
(7) Figure 6 demonstrates content beyond the findings in this manuscript. This reviewer recommends limiting the graphical abstract to the findings specifically in this paper.
Thanks for your great advice. We have revised Figure 7 (originally Figure 6) and restricted the graphical abstract to the findings presented in this paper.
Minor issues:
(8) The order of data in Figure 4 should be consistent with the text in the manuscript. Figures 4E-F-G are described before Figures 4B-C-D in the text. Similarly, Figure 5F was described before Figure 5E in the text.
Thanks for your great advice. We have rearranged the order of the pictures to align with the text. Thank you for your proposal.
(9) In Figure 4A, the locations of the epicardium, muscle, and endocardium should be indicated for clarity. Also, it is not obvious where the close-up box refers to in the actual image.
Thank you for your suggestion. We primarily sought evidence of mitochondrial internalization within the endocardium, as injury occurs there first during myocardial ischemia (Kuwada and Takenaka,2000). The close-up box in Fig. 4 refers to the endocardium.
(10) In Figure 5A, the group annotations are missing from the MDA and SOD graphs. The standard deviation bars for the SOD vehicle and SOD mito groups (3rd and 4th columns) appear to overlap. Can the authors provide the actual p-values?
We apologize for the omission of group annotations in the MDA and SOD graphs. The p-value between the Vehicle group and the Mito group was 0.004. The SOD activity levels of myocardial samples in each group are presented in Author response table 1.
Author response table 1.
The SOD activity levels of myocardial samples in groups (U/mgprot)
(11) In line 58, NS abbreviation is used without defining what NS is.
We apologize for not including the full name of NS. NS is the abbreviation of normal saline. It has now been defined in the manuscript. (Line 58)
(12) In line 118, what MDA stands for is not described until line 348. MDA should be defined in the text for the general audience.
We apologize for this. We have defined it in the manuscript. (Lines 156~157)
(13) In line 192, the authors state that "mitochondrial transplantation... increased the expression of antioxidant enzymes after four hours of ROSC," while only SOD activity levels were assessed in the manuscript. Increased activity levels do not necessarily imply an increase in expression levels. This discrepancy should be addressed in the Discussion section.
We apologize for confusing ‘activity’ with ‘expression’. Although mitochondrial transplantation has been shown to be involved in the restoration of manganese superoxide dismutase levels after ischemic insults (Tashiro, et al.,2022), the changes in antioxidant enzyme expression were not evaluated at the protein level in this paper. To avoid misunderstandings, we have replaced the term ‘expression’ with ‘activity’ as appropriate. (Lines 268~271)
(14) Mitochondria from non-ischemic gastrocnemius muscle of health donor animals were isolated and a manner that maximized their healing potential. This sentence is not clear.
We apologize for the confusing sentence in the original manuscript. To improve clarity, we have revised that sentence. We isolated mitochondria from allogeneic gastrocnemius muscle tissue of healthy rats and maintained optimal mitochondrial activity and therapeutic effects. (Lines 199~201)
Minor grammar issues:
In line 153, mitochondrial should be mitochondria.
Figure 2D: Percent servival should be percent survival.
There should be a blank in complex IIactivity Figure 4B, and complex IV activity in Figure 4C.
In line 134, Four hours of ROSC, Tissue samples from. Tissue is capital.
In line 190, Similaerly should be similarly.
Thank you for your valuable comments. We apologize for the grammatical issues caused by our oversight. We have made the necessary corrections in the manuscript and figures. (Lines 198, 179, and 268), Figure 2D, Figure 5E (originally Figure 4B); Figure 5F (originally Figure 4C).
Reviewer #2 (Recommendations For The Authors):
Some details are lacking clarity, such as the rationale behind choosing certain doses or time points for interventions.
Thank you for this valuable suggestion. We have explained the rationale behind the selection of the dosage and the timing of the intervention. (Lines 201~212)
I would suggest verifying mitochondrial function using the seahorse experiment oxygen consumption, and to check mitochondrial oxidative stress. I would also suggest checking the mitochondrial permeability transition pore opening, using for example calcein cobalt quenching or simply a kit to examine this further.
Thank you for your valuable advice. In our manuscript, we added results regarding mitochondrial reactive oxygen species (ROS) and the mitochondrial permeability transition pore (mPTP) opening. As anticipated, mitochondrial transplantation reduced the increase in mitochondrial ROS and the mPTP opening in ischemic myocardium. (Lines 135~146, 149~157, 442~455, 460~476, Figure 5H, 5I, 6A)
We agree that Seahorse oxygen consumption measurements would be beneficial for understanding the intricacies of their interactions and enhancements. Additionally, Ali et al. (Ali, et al.,2020) have demonstrated that introducing non-autologous mitochondria from healthy skeletal muscle cells into normal cardiomyocytes results in a short-term improvement in bioenergetics, as measured using a Seahorse Extracellular Flux Analyzer. We have not yet conducted cellular experiments. The process of isolating cells from the myocardial tissue of adult SD rats for Seahorse analysis can lead to secondary damage to the myocardial cells (Jacobson, et al.,1985). In this experiment, we measured ATP content and the activity of mitochondrial complexes to evaluate energy changes after mitochondrial transplantation. We will conduct cell experiments and utilize Seahorse measurements to further clarify the alterations in myocardial energy in the future.
For Figure 3B, it would be beneficial to include the relative quantification of the mitochondrial marker COX-IV. Additionally, if feasible, I suggest verifying the representation of the mitochondria outer membrane TOM20 or VDAC.
Thank you for your great suggestion. As suggested, we added TOM20 to assess the purity of the isolated mitochondria and reached the same conclusion: the isolated mitochondria exhibited high purity (Figure 3B). TOM20 was expressed in both muscle lysates and isolated mitochondria, whereas GAPDH was exclusively found in the muscle lysate. (We re-validated the purity of the mitochondria by using relative quantification of TOM20 and COX IV.)
In Figure 2C, the clarity of the graphs depicting both arterial pressure (MAP) and heart rate (HR) is lacking and could potentially confuse the reader. I recommend incorporating color coding instead of relying solely on symbols, or by presenting the data in a more comprehensible format and that aligns with graph B as well.
Thank you for your constructive comments. We have color-coded the diagrams in Figure 2B and 2C.
In Figure 4A, please include high-magnification of the mitochondria to provide a more detailed examination.
Thank you for this insightful comment. We have provided a high-magnification image of the mitochondria in Figure 4.
Regarding lines 81-82, I recommend specifying the sentence more precisely for better clarity and understanding.
Thank you for your comments. We have revised the sentences in lines 83~86 to enhance their clarity for readers.
In the Materials and Methods section, it is crucial to provide precise details. For instance, when staining the exogenous mitochondria with MitoTracker Red, it is important to specify the duration of staining, such as the standard 20 minutes for example. Additionally, it is advisable to mention the number of times these mitochondria were washed with the respiratory solution to ensure thorough removal of excess MitoTracker, thus preventing unintended staining of endogenous mitochondria with MitoTracker red upon injection of pre-labeled mitochondria.
Thank you for your suggestion. We have added the necessary details regarding MitoTracker Red staining (Lines 373~376). In addition, we have added other details where necessary (Lines 379~382, 395~396, 397~400, 487~488). We appreciate your suggestion once again.
The sensitivity of JC-1 dye to temperature and pH fluctuations underscores the necessity for meticulous experimental conditions. It is crucial for the authors to elucidate why they chose to maintain the samples at 4 °C for 60 minutes, especially considering the dye's optimal operating temperature of 25 °C. Providing a rationale behind this deviation from standard protocol would enhance the scientific rigor and reproducibility of the study. Please add more information on the objectives used in the fluorescence microscope (BX53, OLYMPUS, Tokyo, Japan) and the software used.
We sincerely apologize for the mistake in this sentence. The purified mitochondria, which are stained with JC-1, should be stored at 4°C and examined using a fluorescence microscope within 60 minutes. Purified mitochondria were incubated with JC-1 staining solution at 37°C for 20 minutes. The fluorescence microscope used in our experiment is equipped with a WHN 10/22 eyepiece, and the software version is OLYMPUS cellSens Standard 3.2. (Lines 379~382)
Moreover, in the context of immunoblotting, it is imperative for the authors to furnish detailed information regarding the preparation of muscle tissue homogenates. Specifically, clarification is needed regarding the solution utilized for tissue grinding. Did the authors employ ice-cold RIPA lysis buffer or an alternative lysis buffer, supplemented with a protease inhibitor cocktail? Such details are pivotal for methodological transparency.
Thanks for this wonderful comment. In the methods section, we added detailed information about protein extraction. (Lines 383~385)
Furthermore, it would be beneficial for the authors to specify the instrument employed for scanning the immunoblots, as well as the software utilized for subsequent analysis of the immunoblot images. Providing this information would not only enhance the reproducibility of the findings but also facilitate the evaluation of the experimental results.
Thank you for your suggestion. We have included the instrument used for scanning the Western blot, as well as the software used for image analysis in the manuscript. (Lines 397~400)
Authors must exercise caution against copy-pasting. In line 282, there's a query regarding how the mitochondria were isolated. It is recommended to cite a specific reference and offer more comprehensive details. Despite the authors referencing a number within the text, the absence of numbered references makes it challenging to cross-reference.
Thank you for pointing this out; we have updated the citation accordingly (Line 361).
In Figure 5C, please double-check for misspelled labels (e.g., "Vehicle", not "Vehucle").
We apologize for the misspelling in Figure 6E (originally Figure 5C) and have corrected it. Additionally, we have thoroughly reviewed the text for spelling errors and sincerely apologize once again for the previous mistakes. (Lines 249~252, 322)
References:
Aharoni-Simon M, Ben-Yaakov K, Sharvit-Bader M, Raz D, Haim Y, Ghannam W, Porat N, Leiba H, Marcovich A, Eisenberg-Lerner A, Rotfogel Z. 2022. Oxidative stress facilitates exogenous mitochondria internalization and survival in retinal ganglion precursor-like cells. SCI REP-UK 12:5122. doi:10.1038/s41598-022-08747-3
Alemany VS, Nomoto R, Saeed MY, Celik A, Regan WL, Matte GS, Recco DP, Emani SM, Del NP, McCully JD. 2024. Mitochondrial transplantation preserves myocardial function and viability in pediatric and neonatal pig hearts donated after circulatory death. J THORAC CARDIOV SUR 167: e6-e21. doi: 10.1016/j.jtcvs.2023.05.010
Ali PP, Kenney MC, Kheradvar A. 2020. Bioenergetics Consequences of Mitochondrial Transplantation in Cardiomyocytes. J AM HEART ASSOC 9: e14501. doi:10.1161/JAHA.119.014501
Blitzer D, Guariento A, Doulamis IP, Shin B, Moskowitzova K, Barbieri GR, Orfany A, Del NP, McCully JD. 2020. Delayed Transplantation of Autologous Mitochondria for Cardioprotection in a Porcine Model. ANN THORAC SURG 109:711-719. doi: 10.1016/j.athoracsur.2019.06.075
Cowan DB, Yao R, Thedsanamoorthy JK, Zurakowski D, Del NP, McCully JD. 2017. Transit and integration of extracellular mitochondria in human heart cells. SCI REP-UK 7:17450. doi:10.1038/s41598-017-17813-0
Guariento A, Blitzer D, Doulamis I, Shin B, Moskowitzova K, Orfany A, Ramirez-Barbieri G, Staffa SJ, Zurakowski D, Del NP, McCully JD. 2020. Preischemic autologous mitochondrial transplantation by intracoronary injection for myocardial protection. J THORAC CARDIOV SUR 160: e15-e29. doi: 10.1016/j.jtcvs.2019.06.111
Jacobson SL, Banfalvi M, Schwarzfeld TA. 1985. Long-term primary cultures of adult human and rat cardiomyocytes. BASIC RES CARDIOL 80 Suppl 1:79-82. doi:10.1007/978-3-662-11041-6_15
Kaza AK, Wamala I, Friehs I, Kuebler JD, Rathod RH, Berra I, Ericsson M, Yao R, Thedsanamoorthy JK, Zurakowski D, Levitsky S, Del NP, Cowan DB, McCully JD. 2017. Myocardial rescue with autologous mitochondrial transplantation in a porcine model of ischemia/reperfusion. J THORAC CARDIOV SUR 153:934-943. doi: 10.1016/j.jtcvs.2016.10.077
Kuwada Y, Takenaka K. 2000. [Transmural heterogeneity of the left ventricular wall: subendocardial layer and subepicardial layer]. J CARDIOL 35:205-218.
Laurent I, Monchi M, Chiche JD, Joly LM, Spaulding C, Bourgeois B, Cariou A, Rozenberg A, Carli P, Weber S, Dhainaut JF. 2002. Reversible myocardial dysfunction in survivors of out-of-hospital cardiac arrest. J AM COLL CARDIOL 40:2110-2116. doi:10.1016/s0735-1097(02)02594-9
Liu D, Gao Y, Liu J, Huang Y, Yin J, Feng Y, Shi L, Meloni BP, Zhang C, Zheng M, Gao J. 2021. Intercellular mitochondrial transfer as a means of tissue revitalization. SIGNAL TRANSDUCT TAR 6:65. doi:10.1038/s41392-020-00440-z
Liu Q, Liu M, Yang T, Wang X, Cheng P, Zhou H. 2023. What can we do to optimize mitochondrial transplantation therapy for myocardial ischemia-reperfusion injury? MITOCHONDRION 72:72-83. doi: 10.1016/j.mito.2023.08.001
Masuzawa A, Black KM, Pacak CA, Ericsson M, Barnett RJ, Drumm C, Seth P, Bloch DB, Levitsky S, Cowan DB, McCully JD. 2013. Transplantation of autologously derived mitochondria protects the heart from ischemia-reperfusion injury. AM J PHYSIOL-HEART C 304:H966-H982. doi:10.1152/ajpheart.00883.2012
McCully JD, Cowan DB, Emani SM, Del NP. 2017. Mitochondrial transplantation: From animal models to clinical use in humans. MITOCHONDRION 34:127-134. doi: 10.1016/j.mito.2017.03.004
McCully JD, Cowan DB, Pacak CA, Toumpoulis IK, Dayalan H, Levitsky S. 2009. Injection of isolated mitochondria during early reperfusion for cardioprotection. AM J PHYSIOL-HEART C 296:H94-H105. doi:10.1152/ajpheart.00567.2008
Pacak CA, Preble JM, Kondo H, Seibel P, Levitsky S, Del NP, Cowan DB, McCully JD. 2015. Actin-dependent mitochondrial internalization in cardiomyocytes: evidence for rescue of mitochondrial function. BIOL OPEN 4:622-626. doi:10.1242/bio.201511478
Shanmughapriya S, Langford D, Natarajaseenivasan K. 2020. Inter and Intracellular mitochondrial trafficking in health and disease. AGEING RES REV 62:101128. doi: 10.1016/j.arr.2020.101128
Shin B, Saeed MY, Esch JJ, Guariento A, Blitzer D, Moskowitzova K, Ramirez-Barbieri G, Orfany A, Thedsanamoorthy JK, Cowan DB, Inkster JA, Snay ER, Staffa SJ, Packard AB, Zurakowski D, Del NP, McCully JD. 2019. A Novel Biological Strategy for Myocardial Protection by Intracoronary Delivery of Mitochondria: Safety and Efficacy. JACC-BASIC TRANSL SC 4:871-888. doi: 10.1016/j.jacbts.2019.08.007
Tashiro R, Bautista-Garrido J, Ozaki D, Sun G, Obertas L, Mobley AS, Kim GS, Aronowski J, Jung JE. 2022. Transplantation of Astrocytic Mitochondria Modulates Neuronal Antioxidant Defense and Neuroplasticity and Promotes Functional Recovery after Intracerebral Hemorrhage. J NEUROSCI 42:7001-7014. doi:10.1523/JNEUROSCI.2222-21.2022
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
This manuscript investigates a mechanism between the histone reader protein YEATS2 and the metabolic enzyme GCDH, particularly in regulating epithelial-to-mesenchymal transition (EMT) in head and neck cancer (HNC).
Strengths:
Great detailing of the mechanistic aspect of the above axis is the primary strength of the manuscript.
Weaknesses:
Several critical points require clarification, including the rationale behind EMT marker selection, the inclusion of metastasis data, the role of key metabolic enzymes like ECHS1, and the molecular mechanisms governing p300 and YEATS2 interactions.
We would like to sincerely thank the reviewer for the detailed, in-depth, and positive response. We are committed to implementing constructive revisions to the manuscript to address the reviewer’s concerns effectively.
Major Comments:
(1) The title, "Interplay of YEATS2 and GCDH mediates histone crotonylation and drives EMT in head and neck cancer," appears somewhat misleading, as it implies that YEATS2 directly drives histone crotonylation. However, YEATS2 functions as a reader of histone crotonylation rather than a writer or mediator of this modification. It cannot itself mediate the addition of crotonyl groups onto histones. Instead, the enzyme GCDH is the one responsible for generating crotonyl-CoA, which enables histone crotonylation. Therefore, while YEATS2 plays a role in recognizing crotonylation marks and may regulate gene expression through this mechanism, it does not directly catalyse or promote the crotonylation process.
We thank the reviewer for raising this concern. As stated by the reviewer, YEATS2 functions as a reader protein, capable of recognizing histone crotonylation marks and assisting in the addition of this mark to nearby histone residues, possibly by assisting the recruitment of the writer protein for crotonylation. Our data indicates the involvement of YEATS2 in the recruitment of writer protein p300 on the promoter of the SPARC gene, making YEATS2 a regulatory factor responsible for the addition of crotonyl marks in an indirect manner. Thus, we have decided to make changes in the title by replacing the word “mediates” with “regulates”. Therefore, the updated title can be read as: “Interplay of YEATS2 and GCDH regulates histone crotonylation and drives EMT in head and neck cancer”.
(2) The study suggests a link between YEATS2 and metastasis due to its role in EMT, but the lack of clinical or pre-clinical evidence of metastasis is concerning. Only primary tumor (PT) data is shown, but if the hypothesis is that YEATS2 promotes metastasis via EMT, then evidence from metastatic samples or in vivo models should be included to solidify this claim.
We appreciate the reviewer’s suggestion. Here, we would like to state that the primary aim of this study was to delineate the molecular mechanisms behind the role of YEATS2 in maintaining histone crotonylation at the promoter of genes that favour EMT in head and neck cancer. We have dissected the importance of histone crotonylation in the regulation of gene expression in head and neck cancer in great detail, having investigated the upstream and downstream molecular players involved in this process that promote EMT. Moreover, with the help of multiple phenotypic assays, such as Matrigel invasion, wound healing, and 3D invasion assays, we have shown the functional importance of YEATS2 in promoting EMT in head and neck cancer cells. Since EMT is known to be a prerequisite process for cancer cells undergoing metastasis(1), the evidence of YEATS2 being associated with EMT demonstrates a potential correlation of YEATS2 with metastasis. However, as part of the revision, we will use publicly available patient data to investigate the direct association of YEATS2 with metastasis by checking the expression of YEATS2 between different grades of head and neck cancer, as an increase in tumor grade is often correlated with the incidence of metastasis(2).
(3) There seems to be some discrepancy in the invasion data with BICR10 control cells (Figure 2C). BICR10 control cells with mock plasmids, specifically shControl and pEGFP-C3 show an unclear distinction between invasion capacities. Normally, we would expect the control cells to invade somewhat similarly, in terms of area covered, within the same time interval (24 hours here). But we clearly see more control cells invading when the invasion is done with KD and fewer control cells invading when the invasion is done with OE. Are these just plasmid-specific significant effects on normal cell invasion? This needs to be addressed.
We thank the reviewer for the thorough evaluation of the manuscript. The figure panels in question, Figure 2B and 2C, represent two experiments performed independently: the invasion assay after knockdown and after overexpression of YEATS2, respectively. We would like to clarify that the two panels represent distinct, independent results and that the methods used to knock down and overexpress YEATS2 also differ. As stated in the Materials and Methods section, the knockdown is performed using lentivirus-mediated transduction of cells, whereas the overexpression is done by standard transfection, in which the transfection reagent and the respective plasmids are mixed directly before the mix is added to the cells. The difference in experimental conditions between these two experiments might have contributed to the differences seen in the controls, as observed previously(3). Hence, we would like to state that the results in Figure 2B and Figure 2C should be evaluated independently of each other.
(4) In Figure 3G, the Western blot shows an unclear band for YEATS2 in shSP1 cells with YEATS2 overexpression condition. The authors need to clearly identify which band corresponds to YEATS2 in this case.
The two bands seen in the shSP1+pEGFP-C3-YEATS2 condition correspond to the endogenous YEATS2 band (lower band, indicated by * in the shControl lane) and YEATS2-GFP band (upper band, corresponding to overexpressed YEATS2-GFP fusion protein, which has a higher molecular weight). To avoid confusion, the endogenous band will be highlighted (marked by *) in the lane representing the shSP1+pEGFP-C3-YEATS2 condition in the revised version of the manuscript.
(5) In ChIP assays with SP1, YEATS2 and p300 which promoter regions were selected for the respective genes? Please provide data for all the different promoter regions that must have been analysed, highlighting the region where enrichment/depletion was observed. Including data from negative control regions would improve the validity of the results.
Throughout our study, we have performed ChIP-qPCR assays to check the binding of SP1 on YEATS2 and GCDH promoter, and to check YEATS2 and p300 binding on SPARC promoter. Using transcription factor binding prediction tools and luciferase assays, we selected multiple sites on the YEATS2 and GCDH promoter to check for SP1 binding. The results corresponding to the site that showed significant enrichment were provided in the manuscript. The region of SPARC promoter in YEATS2 and p300 ChIP assay was selected on the basis of YEATS2 enrichment found in the YEATS2 ChIP-seq data. We will provide data for all the promoter regions investigated (including negative controls) in the revised version of the manuscript.
(6) The authors establish a link between H3K27Cr marks and GCDH expression, and this is an already well-known pathway. A critical missing piece is the level of ECHS1 in patient samples. This will clearly delineate if the balance shifted towards crotonylation.
We thank the reviewer for their valuable suggestion. To support our claim, we checked the expression of GCDH and ECHS1 in TCGA HNC RNA-seq data (provided in Figure 4—figure supplement 1A and B) and found that GCDH expression was increased while ECHS1 expression was decreased in tumor samples compared to normal samples. We hypothesized that higher GCDH expression and decreased ECHS1 expression might lead to an increase in the levels of crotonylation in HNC. To further substantiate our claim, we will check the abundance of ECHS1 in HNC patient samples as part of the revision.
(7) The p300 ChIP data on the SPARC promoter is confusing. The authors report reduced p300 occupancy in YEATS2-silenced cells, on SPARC promoter. However, this is paradoxical, as p300 is a writer, a histone acetyltransferase (HAT). The absence of a reader (YEATS2) shouldn't affect the writer (p300) unless a complex relationship between p300 and YEATS2 is present. The role of p300 should be further clarified in this case. Additionally, transcriptional regulation of SPARC expression in YEATS2 silenced cells could be analysed via downstream events, like Pol-II recruitment. Assays such as Pol-II ChIP-qPCR could help explain this.
Using RNA-seq and ChIP-seq analyses, we have shown that YEATS2 affects the expression of several genes by regulating the level of histone crotonylation at gene promoters globally. The histone writer p300 is a promiscuous acyltransferase protein that has been shown to be involved in the addition of several non-acetyl marks on histone residues, including crotonylation(4). Our data provides evidence for the dependency of the writer p300 on YEATS2 in mediating histone crotonylation, as YEATS2 downregulation led to decreased occupancy of p300 on the SPARC promoter (Figure 5F). However, the exact mechanism of cooperativity between YEATS2 and p300 in maintaining histone crotonylation remains to be investigated. To address the reviewer’s concern, we will perform various experiments to delineate the molecular mechanism pertaining to the association of YEATS2 with p300 in regulating histone crotonylation. Following are the experiments that will be performed:
(a) Co-immunoprecipitation experiments to check the physical interaction between YEATS2 and p300.
(b) We will check H3K27cr levels on the SPARC promoter and SPARC expression in p300-depleted HNC cells.
(c) Rescue experiments to check if the decrease in p300 occupancy on the SPARC promoter can be compensated by overexpressing YEATS2.
(d) As suggested by the reviewer, Pol-II ChIP-qPCR at the promoter of SPARC will be performed in YEATS2-silenced cells to explain the mode of transcriptional regulation of SPARC expression by YEATS2.
(8) The role of GCDH in producing crotonyl-CoA is already well-established in the literature. The authors' hypothesis that GCDH is essential for crotonyl-CoA production has been proven, and it's unclear why this is presented as a novel finding. It has been shown that YEATS2 KD leads to reduced H3K27cr, however, it remains unclear how the reader is affecting crotonylation levels. Are GCDH levels also reduced in the YEATS2 KD condition? Are YEATS2 levels regulating GCDH expression? One possible mechanism is YEATS2 occupancy on GCDH promoter and therefore reduced GCDH levels upon YEATS2 KD. This aspect is crucial to the study's proposed mechanism but is not addressed thoroughly.
The source for histone crotonylation, crotonyl-CoA, can be produced by several enzymes in the cell, such as ACSS2, GCDH, and ACOX3(5). Since metabolic intermediates produced by several cellular pathways can act as substrates for epigenetic factors, we wanted to investigate whether such an epigenetic-metabolism crosstalk existed in the context of YEATS2. As described in the manuscript, we performed GSEA using publicly available TCGA RNA-seq data and found that patients with higher YEATS2 expression also showed a high correlation with expression levels of genes involved in the lysine degradation pathway, including GCDH. Since the preferential binding of YEATS2 to H3K27cr and the role of GCDH in producing crotonyl-CoA were known(6,7), we hypothesized that higher H3K27cr in HNC could be a result of both YEATS2 and GCDH. We found that the presence of GCDH in the nucleus of HNC cells is correlated with higher H3K27cr abundance, which could be a result of excess levels of crotonyl-CoA produced via GCDH. We also found a correlation between H3K27cr levels and YEATS2 expression, which could arise from YEATS2-mediated preferential maintenance of crotonylation. This indicates that, despite being a reader protein, YEATS2 affects promoter H3K27cr levels, possibly by helping to recruit p300 (as shown in Figure 5F). Thus, YEATS2 and GCDH are both responsible for the regulation of histone crotonylation-mediated gene expression in HNC.
We did not find any evidence of YEATS2 regulating the expression of GCDH in HNC cells. However, we found that YEATS2 downregulation reduced the nuclear pool of GCDH in head and neck cancer cells (Figure 7F). This suggests that YEATS2 not only regulates histone crotonylation by affecting promoter H3K27cr levels (with p300), but also by affecting the nuclear localization of crotonyl-CoA producing GCDH. Also, we observed that the expression of YEATS2 and GCDH are regulated by the same transcription factor SP1 in HNC. We found that the transcription factor SP1 binds to the promoter of both genes, and its downregulation led to a decrease in their expression (Figure 3 and Figure 7).
We would like to state that the relationship between YEATS2 and the nuclear localization of GCDH, as well as the underlying molecular mechanism, remains unexplored and presents an open question for future investigation.
(9) The authors should provide IHC analysis of YEATS2, SPARC alongside H3K27cr and GCDH staining in normal vs. tumor tissues from HNC patients.
We thank the reviewer for their suggestion. We are consulting our clinical collaborators to assess the feasibility of including this IHC analysis in our revision and will make every effort to incorporate it.
Reviewer #2 (Public review):
Summary:
The manuscript emphasises the increased invasive potential of histone reader YEATS2 in an SP1-dependent manner. They report that YEATS2 maintains high H3K27cr levels at the promoter of EMT-promoting gene SPARC. These findings assigned a novel functional implication of histone acylation, crotonylation.
We thank the reviewer for the constructive comments. We are committed to making beneficial changes to the manuscript in order to alleviate the reviewer’s concerns.
Concerns:
(1) The patient cohort is very small with just 10 patients. To establish a significant result the cohort size should be increased.
We thank the reviewer for this suggestion. We will increase the number of patient samples to assess the levels of YEATS2 and H3K27cr in normal vs. tumor samples.
(2) Figure 4D compares H3K27Cr levels in tumor and normal tissue samples. Figure 1G shows overexpression of YEATS2 in a tumor as compared to normal samples. The loading control is missing in both. Loading control is essential to eliminate any disparity in protein concentration that is loaded.
In Figures 1G and 4D, we used Ponceau S staining as a control for equal loading. Ponceau S staining is frequently used as an alternative to housekeeping proteins such as GAPDH as a control for protein loading(8). It avoids the potential for variability in housekeeping gene expression, although it may be less quantitative than using housekeeping proteins. To address the reviewer’s concern, we will probe with an antibody against a housekeeping protein as a loading control in the revised figures, provided its expression remains stable across the conditions tested.
(3) Figure 4D only mentions 5 patient samples checked for the increased levels of crotonylation and hence forms the basis of their hypothesis (increased crotonylation in a tumor as compared to normal). The sample size should be more and patient details should be mentioned.
A total of 9 samples were checked for H3K27cr levels (5 of them are included in Figure 4D and the rest are included in Figure 4—figure supplement 1D). However, as part of the revision, we will check the H3K27cr levels in more patient samples.
(4) YEATS2 maintains H3K27Cr levels at the SPARC promoter. The p300 is reported to be hyper-activated (hyperautoacetylated) in oral cancer. Probably, the activated p300 causes hyper-crotonylation, and other protein factors cause the functional translation of this modification. The authors need to clarify this with a suitable experiment.
In our study, we have shown that p300 is dependent on YEATS2 for its recruitment on the SPARC promoter. As a part of the revision, we propose the following experiments to further substantiate the role of p300 in YEATS2-mediated gene regulation:
(a) Co-immunoprecipitation experiments to check the physical interaction between YEATS2 and p300.
(b) We will check H3K27cr levels on the SPARC promoter and SPARC expression in p300-depleted HNC cells.
(c) Rescue experiments to check if the decrease in p300 occupancy on the SPARC promoter can be compensated by overexpressing YEATS2.
(d) Pol-II ChIP-qPCR at the promoter of SPARC will be performed in YEATS2-silenced cells to explain the mode of transcriptional regulation of SPARC expression by YEATS2.
(5) I do not entirely agree with using GAPDH as a control in the western blot experiment since GAPDH has been reported to be overexpressed in oral cancer.
We would like to clarify that GAPDH was not used as a loading control for protein expression comparisons between normal and tumor samples. GAPDH was used as a loading control only in experiments using head and neck cancer cell lines where shRNA-mediated knockdown or overexpression was employed. These manipulations specifically target the genes of interest and are not expected to alter GAPDH expression, making it a suitable loading control in these instances.
(6) The expression of EMT markers has been checked in shControl and shYEATS2 transfected cell lines (Figure 2A). However, their expression should first be checked directly in the patients' normal vs. tumor samples.
We thank the reviewer for the suggestion. To address this, we will check the expression of EMT markers alongside YEATS2 expression in normal vs. tumor samples.
(7) In Figure 3G, knockdown of SP1 led to the reduced expression of YEATS2 controlled gene Twist1. Ectopic expression of YEATS2 was able to rescue Twist1 partially. In order to establish that SP1 directly regulates YEATS2, SP1 should also be re-introduced upon the knockdown background along with YEATS2 for complete rescue of Twist1 expression.
To address the reviewer’s concern regarding the partial rescue of Twist1 in SP1 depleted-YEATS2 overexpressed cells, we will perform the experiment as suggested by the reviewer. In brief, we will overexpress both SP1 and YEATS2 in SP1-depleted cells and then assess the expression of Twist1.
(8) In Figure 7G, the expression of EMT genes should also be checked upon rescue of SPARC expression.
We thank the reviewer for the suggestion. We will check the expression of EMT markers on YEATS2/ GCDH rescue and update Figure 7G in the revised version of the manuscript.
References
(1) T. Brabletz, R. Kalluri, M. A. Nieto and R. A. Weinberg, Nat Rev Cancer, 2018, 18, 128–134.
(2) P. Pisani, M. Airoldi, A. Allais, P. Aluffi Valletti, M. Battista, M. Benazzo, R. Briatore, S. Cacciola, S. Cocuzza, A. Colombo, B. Conti, A. Costanzo, L. Della Vecchia, N. Denaro, C. Fantozzi, D. Galizia, M. Garzaro, I. Genta, G. A. Iasi, M. Krengli, V. Landolfo, G. V. Lanza, M. Magnano, M. Mancuso, R. Maroldi, L. Masini, M. C. Merlano, M. Piemonte, S. Pisani, A. Prina-Mello, L. Prioglio, M. G. Rugiu, F. Scasso, A. Serra, G. Valente, M. Zannetti and A. Zigliani, Acta Otorhinolaryngol Ital, 2020, 40, S1–S86.
(3) J. Lin, P. Zhang, W. Liu, G. Liu, J. Zhang, M. Yan, Y. Duan and N. Yang, Elife, 2023, 12, RP87510.
(4) X. Liu, W. Wei, Y. Liu, X. Yang, J. Wu, Y. Zhang, Q. Zhang, T. Shi, J. X. Du, Y. Zhao, M. Lei, J.-Q. Zhou, J. Li and J. Wong, Cell Discov, 2017, 3, 17016.
(5) G. Jiang, C. Li, M. Lu, K. Lu and H. Li, Cell Death Dis, 2021, 12, 703.
(6) D. Zhao, H. Guan, S. Zhao, W. Mi, H. Wen, Y. Li, Y. Zhao, C. D. Allis, X. Shi and H. Li, Cell Res, 2016, 26, 629–632.
(7) H. Yuan, X. Wu, Q. Wu, A. Chatoff, E. Megill, J. Gao, T. Huang, T. Duan, K. Yang, C. Jin, F. Yuan, S. Wang, L. Zhao, P. O. Zinn, K. G. Abdullah, Y. Zhao, N. W. Snyder and J. N. Rich, Nature, 2023, 617, 818–826.
(8) I. Romero-Calvo, B. Ocón, P. Martínez-Moya, M. D. Suárez, A. Zarzuelo, O. Martínez-Augustin and F. S. de Medina, Anal Biochem, 2010, 401, 318–320.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank the reviewers for their careful evaluation of our manuscript and appreciate the suggestions for improvement. We will outline our planned revisions in response to these reviews.
Reviewer 2:
“The one exception is the claim that "maintenance of respiration is the only cellular target of chalkophore mediated copper acquisition." While under the in vitro conditions tested this does appear to be the case; however, it can't be ruled out that the chalkophore is important in other situations. In particular, for maintenance of the periplasmic superoxide dismutase, SodC, which is the other M. tuberculosis enzyme known to require copper.”
And
Reviewer 3:
“Because the phenotype of M. tuberculosis lacking chalkophores is similar, if not identical, to using Q203, an inhibitor of cytochrome bcc:aa3, the authors propose that the copper-containing cytochrome bcc:aa3 is the only recipient of copper-uptake by chalkophores. A minor weakness of the work is that this latter conclusion is not verified under infection conditions and other copper-enzymes might still be functionally required during one or more stages of infection.”
Both comments concern the question of whether the bcc:aa3 respiratory oxidase supercomplex is the only target of chalkophore-delivered copper. In culture, our experiments suggest that bcc:aa3 is the only target. The evidence for this claim is in Figure 2E and F. In 2E, we show that M. tuberculosis ΔctaD (a subunit of bcc:aa3) is growth impaired, that copper chelation with TTM does not exacerbate that growth defect, and that a ΔctaDΔnrp double mutant is no more sensitive to TTM than ΔctaD. These data indicate that the role of the chalkophore in protecting against copper deprivation is absent when the bcc:aa3 oxidase is missing. Similar results were obtained with Q203 (Figure 2F). Q203 or TTM arrests growth of M. tuberculosis Δnrp, but the combination has no additional effect, indicating that when Q203 is inhibiting the bcc:aa3 oxidase, the chalkophore has no additional role. However, we agree with the reviewers that we cannot exclude the possibility that, during infection, there is an additional target of chalkophore-mediated Cu acquisition. We will add this caveat to the revised version of this manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
In previous work, the authors described necrosis-induced apoptosis (NiA) as a consequence of induced necrosis. Specifically, experimentally induced necrosis in the distal pouch of larval wing imaginal discs triggers NiA in the lateral pouch. In this manuscript, the authors confirmed this observation and found that while necrosis can kill all areas of the disc, NiA is limited to the pouch and to some extent to the notum, but is excluded from the hinge region. Interestingly and unexpectedly, signaling by the Jak/Stat and Wg pathways inhibits NiA. Further characterization of NiA by the authors reveals that NiA also triggers regenerative proliferation which can last up to 64 hours following necrosis induction. This regenerative response to necrosis is significantly stronger compared to discs ablated by apoptosis. Furthermore, the regenerative proliferation induced by necrosis is dependent on the apoptotic pathway because RNAi targeting the RHG genes is sufficient to block proliferation. However, NiA does not promote proliferation through the previously described apoptosis-induced proliferation (AiP) pathway, although cells at the wound edge undergo AiP. Further examination of the caspase levels in NiA cells allowed the authors to group these cells into two clusters: some cells (NiA) undergo apoptosis and are removed, while others referred to as Necrosis-induced Caspase Positive (NiCP) cells survive despite caspase activity. It is the NiCP cells that repair cellular damage including DNA damage and that promote regenerative proliferation. Caspase sensors demonstrate that both groups of cells have initiator caspase activity, while only the NiA cells contain effector caspase activity. Under certain conditions, the authors were also able to visualize effector caspase activity in NiCP cells, but the level was low, likely below the threshold for apoptosis. 
Finally, the authors found that loss of the initiator caspase Dronc blocks regenerative proliferation, while inhibiting effector caspases by expression of p35 does not, suggesting that Dronc can induce regenerative proliferation following necrosis in a non-apoptotic manner. This last finding is very interesting as it implies that Dronc can induce proliferation in at least two ways in addition to its requirement in AiP.
Strengths:
This is a very interesting manuscript. The authors demonstrate that epithelial tissue that contains a significant number of necrotic cells is able to regenerate. This regenerative response is dependent on the apoptotic pathway which is induced at a distance from the necrotic cells. Although regenerative proliferation following necrosis requires the initiator caspase Dronc, Dronc does not induce a classical AiP response for this type of regenerative response. In future work, it will be very interesting to dissect this regenerative response pathway genetically.
Weaknesses:
No weaknesses were identified.
We thank the reviewer for their positive evaluation and kind words.
Reviewer #2 (Public Review):
Summary / Strengths:
In this manuscript, Klemm et al. build on past published findings (Klemm et al., 2021) to characterize caspase activation in distal cells following necrotic tissue damage within the Drosophila wing imaginal disc. Previously, in Klemm et al., 2021, the authors described necrosis-induced apoptosis (NiA) following the development of a genetic system to study necrosis that is caused by the expression of a constitutively active GluR1 (Glutamate/Ca2+ channel), and they discovered that the appearance of NiA cells was important for promoting regeneration.
In this manuscript, the authors aim to investigate how tissues regenerate following necrotic cell death. They find that: the cells of the wing pouch are more likely to have non-autonomous caspase activation than other regions within the wing imaginal disc (hinge and notum); two signaling pathways that are known to be upregulated during regeneration, Wnt (wingless) and JAK/Stat signaling, act to prevent additional NiA in pouch cells, and may explain the region specificity; the presence of NiA cells promotes regenerative proliferation in late stages of regeneration; not all caspase-positive cells are cleared from the epithelium (these cells are then referred to as Necrosis-induced Caspase Positive (NiCP) cells); these NiCP cells continue to live and promote proliferation in adjacent cells; and the caspase Dronc is important for creating NiA/NiCP cells and for these cells to promote proliferation. Animals heterozygous for a Dronc null allele show a decrease in regeneration following necrotic tissue damage.
The study has the potential to be broadly interesting due to the insights into how tissues differentially respond to necrosis as compared to apoptosis to promote regeneration.
Weaknesses:
However, here are some of my current concerns for the manuscript in its current version:
The presence of cells with activated caspase that don't die (NiCP cells) is an interesting biological phenomenon but is not described until Figure 5. How does the existence of NiCP cells impact the earlier findings presented? Is late proliferation due to NiA, NiCP, or both? Do Wg and JAK/STAT signaling act to prevent the formation of both NiA and NiCP cells or only NiA cells? Moreover, the authors are able to specifically manipulate the wound edge (WE) and lateral pouch cells (LP), but don't show how these manipulations within these distinct populations impact regeneration. The authors provide evidence that driving UAS-mir(RHG) throughout the pouch, in the LP or the WE all decrease the amount of NiA/NiCP in Figure 3G-O, but no data on final regenerative outcomes for these manipulations are presented (such as those presented for Dronc-/+ in Fig 7M). The manuscript would be greatly enhanced by quantification of more of the findings, especially in describing whether the specific manipulations that impacted NiA/NiCP cells disrupt end-point regeneration phenotypes.
We have added a line to the results to clarify that we believe the finding that some NiA likely persist as NiCP does not affect our conclusions up to this point.
We have added a statement emphasizing the results from our first paper, which demonstrate that LP>miRHG expression reduces the overall capacity to regenerate.
Quantification of the change in posterior NiA number has been added to Figure 2L to strengthen the evidence. Likewise, we have included quantification of the E2F time course presented in Figure 3A (Figure 3 – Figure supplement 1C), and quantification of the change in GC3Ai signal over time has been added (Figure 5 – Figure supplement 1D) to emphasize the perdurance of GC3Ai-positive NiA/NiCP.
How long does apoptosis take within the wing disc epithelium? How many of the caspase(+) cells are present for the whole 48 hours of regeneration? Are new cells also induced to activate caspase during this time window? The authors presented a number of interesting experiments characterizing the NiCP cells. For the caspase sensor GC3Ai experiments in Figure 5, is there a way to differentiate between cells that have maintained fluorescent GC3Ai and cells that have newly activated caspase? What is the timeline for when NiA and NiCP are specified? In addition, what fraction of NiCP cells contribute to the regenerated epithelium? Additional information about the temporal dynamics of NiA and NiCP specification/commitment would be greatly appreciated.
We have included more information concerning the kinetics of apoptotic cell removal, and how this compares to the observations we have made with NiA/NiCP in our GC3Ai experiments. Additionally, we have included a quantification of the percent of the whole wing pouch with GC3Ai signal over time (Figure 5F) as well as the distal wing pouch with GC3Ai signal over time (Figure 5 – Figure supplement 1D) to further support the idea that NiCP persist over time.
We acknowledge that our GC3Ai time course unfortunately cannot confirm whether the increase in GC3Ai signal over time is due to cells with new caspase activity or proliferating NiCP and have included this point in the discussion.
We attempted to track the lineage of NiA/NiCP into the pupal and adult wings with CasExpress and DBS, however the results of these experiments were inconsistent, and therefore we did not feel confident to include these data or draw conclusions in either direction. We are currently designing variations of these lineage trace tools in order to better track the lineage of these cells that we hope to include in a future paper.
The notum also does not express developmental JAK/STAT, yet little NiA was observed within the notum. Do the authors have any additional insights into the differential response between the pouch and notum? What makes the pouch unique? Are NiA/NiCP cells created within other imaginal discs and other tissues? Are they similarly important for regenerative responses in other contexts?
We have added a brief mention of these points to the appropriate results section to avoid further increasing the length of the discussion.
Data on the necrosis of other imaginal discs through FLP/FRT clone formation in haltere and leg discs have been added to Figure 1 – Figure supplement 1J, and described in the text.
Reviewer #3 (Public Review):
The manuscript "Regeneration following tissue necrosis is mediated by non-apoptotic caspase activity" by Klemm et al. is an exploration of what happens to a group of cells that experience caspase activation after necrosis occurs some distance away from the cells of interest. These experiments have been conducted in the Drosophila wing imaginal disc, which has been used extensively to study the response of a developing epithelium to damage and stress. The authors revise and refine their earlier discovery of apoptosis initiated by necrosis, here showing that many of those presumed apoptotic cells do not complete apoptosis. Thus, the most interesting aspect of the paper is the characterization of a group of cells that experience mild caspase activation in response to an unknown signal, followed by some effector caspase activation and DNA damage, but that then recover from the DNA damage, avoid apoptosis, and proliferate instead. Many questions remain unanswered, including the signal that stimulates the mild caspase activation, and the mechanism through which this activation stimulates enhanced proliferation.
The authors should consider answering additional questions, clarifying some points, and making some minor corrections:
Major concerns affecting the interpretation of experimental results:
Expression of STAT92E RNAi had no apparent effect on the ability of hinge cells to undergo NiA, leading the authors to conclude that other protective signals must exist. However, the authors have not shown that this STAT92E RNAi is capable of eliminating JAK/STAT signaling in the hinge under these experimental conditions. Using a reporter for JAK/STAT signaling, such as the STAT-GFP, as a readout would confirm the reduction or elimination of signaling. This confirmation would be necessary to support the negative result as presented.
We have included data demonstrating our ability to knock down JAK/STAT activity in the hinge with UAS-Stat92E RNAi (Figure 2 – Figure supplement 1E and F). Additionally, we have included a quantification of posterior NiA/NiCP with the Stat92E RNAi (as well as wg RNAi and Zfh-2 RNAi, Figure 2L) to strengthen our conclusion that JAK/STAT and WNT signaling act to regulate NiA formation within the pouch.
Similarly, the authors should confirm that the Zfh2 RNAi is reducing or eliminating Zfh2 levels in the hinge under these experimental conditions, before concluding that Zfh2 does not play a role in stopping hinge cells from undergoing NiA.
We have repeated this experiment with a longer knockdown using a GAL4 driver that expresses from early larval stages until our evaluation at L3, but were unable to demonstrate a loss of Zfh-2 with IF labeling. Additionally, we have quantified posterior NiA/NiCP with a Zfh-2 RNAi (Figure 2L) and do find a slight increase in NiA/NiCP number; however, this change is not significant. We have altered our conclusions to reflect these new data.
EdU incorporation was quantified by measuring the fluorescence intensity of the pouch and normalizing it to the fluorescence intensity of the whole disc. However, the images show that EdU fluorescence intensity of other regions of the disc, especially the notum, varied substantially when comparing the different genetic backgrounds (for example, note the substantially reduced EdU in the notum of Figure 3 B' and B'). Indeed, it has been shown that tissue damage can lead to suppression of proliferation in the notum and elsewhere in the disc, unless the signaling that induces the suppression is altered. Therefore, the normalization may be skewing the results because the notum EdU is not consistent across samples, possibly because the damage-induced suppression of proliferation in the notum is different across the different genetic backgrounds.
To more accurately reflect the observations that we have made with the EdU assay, we have changed our terminology to indicate that the EdU signal is more localized to the damaged tissue in ablated discs, thus taking into account the relative changes across the disc, rather than referring to it as an increase in the pouch. To further strengthen our observation that damage results in a localized proliferation, we have included a quantification of the E2F time course presented in Figure 3A (Figure 3 – Figure supplement 1C), which underscores the trend observed in our EdU experiments.
The authors expressed p35 to attempt to generate "undead cells". They take an absence of mitogen secretion or increased proliferation as evidence that undead cells were not generated. However, there could be undead cells that do not stimulate proliferation non-autonomously, which could be detected by the persistence of caspase activity in cells that do not complete apoptosis. Indeed, expressing p35 and observing sustained effector caspase activation could help answer the later question of what percentage of this cell population would otherwise complete apoptosis (NiA, rescued by p35) vs reverse course and proliferate (NiCP, unaffected by p35).
In our previous work, we showed that P35 expression impairs our ability to detect effector caspases with IF-based tools. This can also be seen in Figure 4 of this work (Figure 4C and F). Given that P35 expression precludes our ability to label and assay effector caspase activity visually, and thus address the concerns outlined above, we relied on other tools such as reporters of AiP mitogens (wg-lacZ & dpp-lacZ) to assay whether NiA participate in AiP. As a functional readout, we also paired P35 expression with the EdU assay to test whether proliferation was altered by the presence of undead cells. The results discussed in Figure 4 lead us to conclude that NiA likely do not participate in the canonical AiP feedforward loop, although it is possible that these experiments generate another type of undead cell – one that utilizes a different mechanism to promote proliferation.
It is unclear if the authors' model is that the NiCP cells lead to autonomous or non-autonomous cell proliferation, or both. Could the lineage-tracing experiments and/or the experiments marking mitosis relative to caspase activity answer this question?
We have added further details to the discussion on the potential for NiA/NiCP to induce cell autonomous/non-autonomous proliferation.
Many of the conclusions rely on single images. Quantification of many samples should be included wherever possible.
We have added quantification to strengthen the results of Figures 2, 3 and 5.
Why does the reduction of Dronc appear to affect regenerative growth in females but not males?
We have repeated these regeneration scoring experiments and have increased the N for control versus droncI29 mutant males; however, the results of the analysis for male wing size remain not significant, although the general trend is that droncI29 wings are slightly smaller. While there could be sex-specific differences in the capacity to regenerate that contribute to this observation, it is unclear what the underlying mechanism could be.
Reviewer #1 (Recommendations for the authors):
The work in this paper is already very complete and very well worked out. The conclusions are well supported by the data in this manuscript. I do not have any experimental requests, only a few minor and formal requests/questions.
(1) Why does Diap1 overexpression not affect regenerative proliferation, whereas mir(RHG) and dronc[I29] do, given that Diap1 acts between RHG and Dronc?
We speculate on this point in the discussion section but have adjusted some of the phrasing for clarity.
(2) I assume that the authors used the cleaved Dcp-1 antibody from Cell Signaling Technologies. I recommend that the authors refer to this antibody as cDcp-1 in text and figures as this antibody specifically detects the cleaved, and thus activated form of Dcp-1, and not the uncleaved, inactive form of Dcp-1 which has a uniform expression in the discs.
Changed to cDcp-1.
(3) Line 299: Hay et al. 1994 did not show that p35 inhibits Drice and Dcp-1 (in fact, both genes were not even cloned yet). This was shown by Meier et al. 2000 and Hawkins et al. 2000. Please correct references.
Corrected.
(4) Line 574/575. Meier et al. 2000 did not show that Dronc is mono-ubiquitylated. This was shown by Kamber-Kaya et al., 2017. Please correct.
Corrected.
Reviewer #2 (Recommendations for the authors):
(1) Does domeless knockdown cause apoptosis without tissue ablation (Figures 2C-E)? Currently, the non-ablation control is not shown.
Domeless knockdown does not cause apoptosis in the absence of ablation (Added Figure 2 – Figure supplement 1A).
(2) The supplemental experiment with zfh2-RNAi is hard to interpret because there is no evidence of RNAi knockdown based on the staining with the anti-Zfh2 antibody.
As noted above, a longer zfh-2 knockdown does not appear to alter Zfh-2 protein levels. A quantification of posterior NiA/NiCP following knockdown shows a slight (non-significant) increase in posterior NiA/NiCP. Considering these new results, we have altered our interpretation within the appropriate results and discussion sections.
(3) The authors should consider adding a diagram showing where mir(RHG) and DIAP1 are in the apoptotic/caspase activation pathway (Figure 7N).
Completed, Figure 7N and 7O.
Reviewer #3 (Recommendations for the authors):
(1) Figure 2 I -The purported increase in NiA should be quantitated relative to the NiA in G across many discs.
Completed (Figure 2L)
(2) Figure 2 M - contrary to the conclusion drawn, the posterior Dcp1 does not appear different from that in the control (K). This conclusion that the NiA does not occur in the margin could be better supported with more images/quantification.
We have exchanged the image for a representative one that more clearly shows the lack of margin NiA and highlighted with an arrowhead (Figure 2K)
(3) Figure 2 supp 1 E - the "slight increase" in NiA in the pouch is relative to which control? Can this conclusion be supported by quantification?
Figure 2L now quantifies this change.
(4) Figure 2 Supp 1 D, E - these discs supposedly have Zfh2 RNAi expressed, but there appears to be no reduction in Zfh2.
We were unable to demonstrate a reduction of Zfh2, even with a longer knockdown. Considering these new data, we have altered our conclusions from the Zfh2 experiments.
(5) Figure 2 Supp 1 I - please quantitate the Dcp-1 across many discs to support the conclusion.
This is the UAS-wg experiment, which we decided to remove from the quantification given the non-specific increase in cDcp-1 throughout the disc (likely as a result of ectopic Wg expression).
(6) Figure 4 legend M - The authors conclude that the experiment indicates that "NiA promote proliferation independent of AiP". It would be more precise to say that NiA cells do not secrete AiP mitogens and do not increase the proliferation of surrounding cells when prevented from completing apoptosis. To say that the NiA-induced proliferation does not require AiP would require eliminating AiP, perhaps through reaper hid grim knockdown or mitogen knockdown.
Corrected.
Minor concerns and clarification needed:
(7) Line 61 - consider the distinction between a feed-forward loop and a positive feedback loop.
Corrected.
(8) Line 338 - it would be helpful to have a brief explanation of what the GC3Ai consists of and how it reports caspase activity.
Corrected.
(9) Line 343 - the authors should clarify by what they mean when they state GC3Ai-positive cells are "associated with" mitotic cells. Are the GC3Ai cells undergoing mitosis? Or is the increase in mitosis non-autonomous?
Adjusted. “associated with adjacent proliferative cells”.
(10) Lines 392-394 - the authors should add brief descriptions of how the Drice-Based sensor and the CasExpress function, so the readers can better understand the distinctions between these sensors and the previously mentioned sensors (anti-Dcp1 and GC3Ai). In addition, please clarify how the Gal80ts modulates the sensitivity of the CasExpress.
Descriptions of DBS and CasExpress and additional clarification provided.
(11) Line 413: How does Gal80ts suppress the background developmental caspase signal, and how does this suppression lead to NiCP cells expressing GFP?
This section has been reworded to clarify.
(12) Line 417 - which GFP label is referred to here?
This section has been reworded to clarify.
(13) Line 445 is the first mention of the CARD domain - it could be introduced more fully and explained why the DroncDN's lack of effect on proliferation excludes the CARD domain as being important.
Clarified. See also the discussion for the significance of the CARD domain as dispensable for regenerative proliferation following necrosis.
(14) Line 452 - "As mentioned" - the manuscript has not previously mentioned DIAP1 modification of the CARD domain and what that modification does. Perhaps the previous explanatory text was inadvertently removed?
Corrected.
(15) The Discussion is a lengthy list of experiments that the authors did not do or observations they were unable to make. This section could benefit from a more in-depth discussion of necrosis and the possibility that NiCP cells contribute to repair after injury across contexts and species.
We have made several changes to the discussion that elaborate on some of the points listed in the public reviews.
(16) All figures: Consider making single-channel panels grayscale to aid visualization. Also consider using color combinations that can be distinguished by color-blind readers.
We appreciate these suggestions and will consider them for future manuscripts.
(17) All figure legends - are error bars SD or SEM?
Standard deviation. Added to appropriate legends.
(18) Figure 1A,C - it would be helpful in the diagrams to note when the necrosis occurs/completes.
The endpoint of necrosis is not well defined, given the simultaneous changes that occur with regeneration. Thus, we opted to not include an indicator of when necrotic ablation ends.
(19) Figure 1B - it would be helpful to name the GAL4 drivers whose expression domain is depicted to correlate with the terms used in the text.
Completed.
(20) Figure 1 legend- what do the different colors of the arrowheads denote? The dotted lines are in R' and S', not N' and O'.
Completed.
(21) Figure 2G - the yellow dashed line is not in the same place in the two images.
Corrected.
(22) Figure 2I - what is the open arrowhead?
Completed (Figure 2I legend).
(23) Figure 3 legend - please describe what the time course is observing (EdU).
Completed.
(24) Figure 4 - please include the yellow boxes in the Dcp-1 channels.
Completed.
(25) Figure 5 F' - add the arrowheads to all the panels. The yellow arrowhead appears to be pointing to nothing.
Completed.
(27) Figure 5 legend - what is a "cytoplasmic undisturbed cell"? What is the arrowhead in G? J and J' should show the same view at different time points or different views at the same time point.
Figure legend has been corrected.
(28) Figure 5 Supp 1 would be especially helped by having more single-channel panels in grayscale.
For clarity and consistency, we chose to maintain the different color channels.
(29) Figure 5 Supp 1 D and E - It would be helpful to have higher magnification and arrows pointing to the cells of interest. Why are there TUNEL+ cells that do not have caspase activation (green)?
We have added arrowheads as suggested. We believe the disparity between the TUNEL and GC3Ai signals is a result of the different sensitivities of the IF staining and the TUNEL assay.
(30) Figure 5 Supp 1 F - perhaps the arrowheads should be in all panels - they point to empty spaces with no H2Av staining in the final panel. Perhaps a higher magnification image would make the "strong overlap" of the two signals more apparent?
We have added arrowheads where appropriate.
(31) Figure 6 D-E - does the widespread GFP lineage tracing signal suggest that most cells in the repaired tissue originated from cells that once had caspases activity?
Possibly; however, given that CasExpress leads to significant developmental labeling, we were unable to determine to what extent the signal in this experiment comes from NiA/NiCP activity versus developmental labeling. Note that tubGAL80ts is not present in this experiment.
(32) Writing corrections:
Line 343 "positive" is misspelled.
Completed
Line 429 - a word may be missing.
Completed
Line 639 - the word "day" may be missing.
Completed
Line 658 - what temperature was the recovery?
Completed
Lines 706-708 - were the discs incubated in 55 mL and 65 mL of liquid, or a smaller volume?
Completed
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1:
Overall I find the evidence very well presented and the study compelling. It offers an important new perspective on the key properties of neoblasts. I do have some comments to clarify the presentation and significance of the work.
We thank the reviewer for the positive feedback and plan to improve the presentation of the work.
Reviewer #2:
However, the absence of a cell-cell feedback mechanism during colony growth and the likelihood of the difference needs to be clarified. Is there any difference in interpreting the results if this mechanism is considered?
We will improve the description of the model assumptions and the interpretation of the data on the basis of these assumptions.
Although hnf-4 and foxF have been silenced together to validate the model, a deeper understanding of the tgs-1+ cell type and the non-significant reduction of tgs-1+ neoblasts in zfp-1 RNAi colonies is necessary, considering a high neural lineage frequency.
We will improve the analysis of this result in light of the experimentally determined frequency of the tgs-1+ neoblast population.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Cheng et al. explore the utility of analyte ratios instead of relative abundance alone for biological interpretation of tissue in a MALDI MSI workflow. Utilizing the ratio of metabolites and lipids that have complementary value in metabolic pathways, they show the ratio as a heat map which enhances the understanding of how multiple analytes relate to each other spatially. Normally, this is done by projecting each analyte as a unique color but using a ratio can help clarify visualization and add to biological interpretability. However, existing tools to perform this task are available in open-source repositories, and fundamental limitations inherent to MALDI MSI need to be made clear to the reader. The study lacks rigor and controls, i.e. without quantitative data from a variety of standards (internal isotopic or tissue mimetic models for example), the potential delta in ionization efficiencies of different species subtracts from the utility of pathway analysis using metabolite ratios.
We thank the reviewer for comments on the availability of four other commercial and open-source tools for performing ratio imaging: ENVI® Geospatial Analysis Software, MATLAB image processing toolbox, Spectral Python (SPy) and QGIS. We now highlight these in the introduction (page 3 line 80-86). However, in contrast to these targeted ratio imaging methods, our approach uniquely enables the untargeted discovery of correlated (or anti-correlated) ratios of molecular features, whether the species are structurally known or unknown.
ENVI® Geospatial Analysis Software and the MATLAB image processing toolbox for hyperspectral imaging are both paid programs, limiting free access and software evaluation for the potential application of untargeted ratiometric imaging. We were able to evaluate MATLAB RatioImage because Weill Cornell Medicine has an institutional subscription to MathWorks MATLAB. Notably, MATLAB RatioImage computes and displays an individual intensity-modulated ratiometric image from a chosen numerator and denominator image. This software tool only images the ratios of selected metabolites from an input list of multiple species and does not allow for the possibility of untargeted ratiometric images of all metabolite pairs.
While Spectral Python (SPy) and QGIS are both freely available software packages, and both can produce individual metabolite ratio images, neither allows for untargeted ratiometric imaging of all pairs from a multiple-metabolite input list. Table S1 (below) compares the ratio imaging tool that we offer with other previously available tools.
We appreciate the reviewer’s insightful comments on differential ionization efficiency among metabolites and the importance of using stable isotope internal standards to obtain absolute quantification.
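The distinction drawn here between a targeted ratio image (one chosen numerator and denominator) and untargeted ratio imaging of all pairs can be illustrated with a minimal Python sketch. This is not the authors' R package: the function names, the small constant `eps` used to avoid division by zero, and the toy intensity array are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def ratio_image(numerator, denominator, eps=1e-9):
    """Targeted case: pixel-by-pixel ratio of two ion images.

    eps is an assumed guard against division by zero; the source does not
    say how zero-intensity pixels are handled.
    """
    return numerator / (denominator + eps)


def all_pair_ratio_images(cube, names, eps=1e-9):
    """Untargeted case: one ratio image per unordered pair of features.

    cube has shape (n_features, height, width); names labels the features.
    """
    return {
        (names[i], names[j]): ratio_image(cube[i], cube[j], eps)
        for i, j in combinations(range(len(names)), 2)
    }


# Toy data: 4 hypothetical features imaged on a 3 x 5 pixel grid.
rng = np.random.default_rng(0)
names = ["m1", "m2", "m3", "m4"]
cube = rng.uniform(1.0, 10.0, size=(len(names), 3, 5))

ratios = all_pair_ratio_images(cube, names)
print(len(ratios))  # n * (n - 1) / 2 pairs for n features
```

Because the number of pairs grows quadratically with the number of features, a real implementation would likely filter or rank pairs rather than keep every image in memory.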
A fundamental advantage of our ratiometric imaging tool is to provide better image contrast for tissue regions with differential ionization efficiency, with the potential to discover new “metabolic” regions that can be revealed by metabolite ratio. Note that comparison for ratio image abundance is limited to tissue groups in the equivalent region which is expected to have similar ionization efficiency for given metabolites. Furthermore, the power of our strategy is to provide untargeted (and targeted) ratio imaging as a hypothesis generation tool and this use does not require absolute quantification. If cost was not an issue, an extensive group of stable isotope standards could theoretically be used for absolute metabolite quantification of target metabolites with known identity.
Using the tissue mimetic model, we generated calibration curves for stable isotope standards spiked into carboxymethylcellulose (CMC)-embedded brain homogenate cryosections and quantified brain glucose, lactate and ascorbate concentrations. Similar ratio images among these metabolites are obtained from abundance data and from quantified concentration data (Fig S3). While stable isotope standards are often used to obtain quantitative concentrations of metabolites/lipids of interest, this approach is not applicable to untargeted metabolite ratios that include an assessment of structurally undefined species. Nevertheless, our data indicate that absolute quantification is not necessary for the targeted and untargeted ratio imaging described here (Page 6, line 196-205).
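One way to see why abundance-based and concentration-based ratio images can look alike is sketched below. The sketch assumes each metabolite's calibration is linear through the origin (signal = slope × concentration), which is an assumption of this example and not something the response states. Under that assumption the concentration ratio equals the abundance ratio times a constant, so the spatial pattern is unchanged. All values and names are hypothetical.

```python
import numpy as np


def fit_slope_through_origin(conc, signal):
    """Least-squares slope for the assumed model signal = slope * conc."""
    conc = np.asarray(conc, dtype=float)
    signal = np.asarray(signal, dtype=float)
    return float(conc @ signal / (conc @ conc))


# Hypothetical calibration series for two metabolites, A and B.
spiked = np.array([1.0, 2.0, 4.0, 8.0])
slope_a = fit_slope_through_origin(spiked, 50.0 * spiked)
slope_b = fit_slope_through_origin(spiked, 20.0 * spiked)

# Hypothetical abundance images on a 4 x 4 pixel grid.
rng = np.random.default_rng(1)
abund_a = rng.uniform(100.0, 500.0, size=(4, 4))
abund_b = rng.uniform(100.0, 500.0, size=(4, 4))

# Convert abundance to concentration with the fitted slopes.
conc_a = abund_a / slope_a
conc_b = abund_b / slope_b

abundance_ratio = abund_a / abund_b
concentration_ratio = conc_a / conc_b

# The two ratio images differ only by the constant slope_b / slope_a.
scale = slope_b / slope_a
print(np.allclose(concentration_ratio, scale * abundance_ratio))
```

If a calibration has a nonzero intercept or a non-linear response, the two ratio images are no longer exactly proportional, so this argument holds only to the extent that the linear model does.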
Reviewer #2 (Public Review):
Summary:
In the article, "Untargeted Pixel-by-Pixel Imaging of Metabolite Ratio Pairs as a Novel Tool for Biomedical Discovery in Mass Spectrometry Imaging" the authors describe their software package in R for visualizing metabolite ratio pairs. I think the novelty of this manuscript is overstated and there are several notable issues with the figures that prevent detailed assessment but the work would be of interest to the mass spectrometry community.
Strengths:
The authors describe a software that would be of use to those performing MALDI MSI. This software would certainly add to the understanding of metabolomics data and enhance the identification of critical metabolites.
Weaknesses:
The authors are missing several references and discussion points, particularly about SIMS MSI, where ratio imaging has been previously performed.
There are several misleading sentences about the novelty of the approach and the limitations of metabolite imaging.
Several sentences lack rigor and are not quantitative enough.
The figures are difficult to interpret/ analyze in their current state and lack some critical components, including labels and scale bars.
We thank the reviewer for the very helpful comments. The tone of the manuscript has been adjusted to highlight that the real novelty of this method lies in the ease of computing and its application to MS-specific projects (abstract line 26-30). All figures have been updated to include labels and scale bars, with improved resolution. References for the use of ratio imaging in SIMS MSI have been added to the introduction (Page 3, line 80-89).
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Major Comments:
In the Abstract it is stated that: "the research community lacks a discovery tool that images all metabolite abundance ratio pairs." However, the following tools exist that perform this fundamental task.
A "pixel by pixel" data frame in .csv form has a very similar data structure to that of many instruments, such as satellite imaging or other hyperspectral tools. It is true that this does not exist in the MALDI-specific context, but it would not be difficult to perform this task with the following programs. Highlight that the novelty here is not ratios but the ease of computing them and the application in the specific project. Also, describe the available tools and which of their shortcomings this package addresses. A supplemental table of MSI data analysis tools and the function of each would be a good addition.
List of tools to perform band ratio computation with minimal modification:
(1) ENVI IDL: geospatial imaging tool that allows ratio computation between spectral bands.
(2) MATLAB image processing toolbox for hyperspectral imaging.
(3) Spectral Python package (SPy).
(4) QGIS with plugins can be used for hyperspectral image analysis with a ratio between bands.
We revised the abstract and introduction to include novelty and comparison to other existing methods listed in Table S1.
"untargeted R package workflow" - If there are functions used outside the SCiLS Lab API client then write it up and include a GitHub link for open access to fit the mission of eLife.
As shown in Scheme I, we developed two types of code for untargeted ratio imaging. The first type uses the SCiLS Lab API client to extend the function of targeted and untargeted ratio imaging and all related spatial image analysis. This is suitable for SCiLS Lab users. The second type does not require the SCiLS Lab API; it extracts pixel data from an imzML file and then proceeds to targeted and untargeted imaging and analysis. Both codes are now deposited on GitHub with public access (https://github.com/qic2005/Untargeted-massspectrometry-ratio-imaging.git).
"across cells and tissue subregions" The value in reporting cell type and tissue type-specific differences in any metric is powerful, but not done in this paper. Only whole samples are compared such as "KO vs WT" and the annotations in Figure 3 are not leveraged for increased biological relevance. This paper treats each image as a homogenization experiment in a practical sense beyond just visually inspecting each image. Remove this claim or do the calculations on region/tissue/cell-type specific differences with the appropriate tools to show the data beyond simple heat map images.
We have deleted the sentence containing “across cells and tissue subregions” from the abstract.
"enhances spatial image resolution" Clarify. The resolution in MALDI is set by the raster size of the pixels which is an instrument parameter and cannot be changed post-acquisition. Image-specific methods to increase resolution exist, but dividing the value in one peak column by another does not change functional resolution in the context of the instruments here.
We thank the reviewer for pointing out this typo. We have changed it to “enhances spatial image contrast” in the abstract (line 34).
"pixel-by-pixel imaging of the ratio of an enzyme's substrate to its derived product offers an opportunity to view the distribution of functional activity for a given metabolic pathway across tissue" - Appropriately calibrate the impact of this work and correct this statement to better reflect the capabilities of this approach. Do not oversell the exploration of pathway activity since the raw quantity reported as relative abundance does not provide biologically interpretable pathway information. This is due to unaccounted differences in ionization efficiencies between analytes in a pathway and lack of determination of rate. Without a calibration curve and more techniques on the analytical chemistry side of the project, it is possible a relative abundance of one analyte (like the product of a pathway) could be higher than the relative abundance of another analyte (a precursor), but due to structural differences, the actual quantity of the higher relative abundance species could be significantly different or even lower than its counterpart. Secondly, "functional activity" cannot be assessed in this manner without isotopic labeling or additional techniques. This does not subtract from the overall validity and impact of the work, but highlighting these shortcomings and slight alterations to the claim are important for a multidisciplinary audience.
Although we show that the abundance ratio yields images similar to those of the concentration ratio for brain metabolites such as lactate, glucose and ascorbate, we agree with the reviewer that the abundance ratio differs numerically from the absolute concentration ratio because of differences in ionization efficiency. We have deleted the sentence “pixel-by-pixel imaging of the ratio of an enzyme's substrate to its derived product offers an opportunity to view the distribution of functional activity for a given metabolic pathway across tissue” from the abstract. We apologize for not explaining this application clearly. We meant to compare pathway activity among equivalent and similar pixels/regions of tissues from different biological groups, under the assumption that ionization efficiency is identical for equivalent pixels from different tissue sections (i.e., same cell type and microenvironment), especially for metabolites with similar functional structure in the same pathway. For example, fatty acids with different chain lengths and phospholipids with the same head group are expected to have similar ionization efficiency in the same tissue pixel/region. We have therefore rewritten this section (Page 7, line 239-247).
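The reasoning in this response can be made concrete with a small Python sketch. It assumes, purely for illustration, that measured abundance equals ionization efficiency multiplied by concentration; the function name `abundance_ratio`, the efficiencies, and the concentrations are all hypothetical. Under that assumption, the abundance ratio within one group differs from the concentration ratio, but the efficiencies cancel when the same ratio is compared between equivalent regions of two groups.

```python
def abundance_ratio(conc_a, conc_b, eff_a, eff_b):
    """Measured A/B ratio when signal = ionization efficiency * concentration."""
    return (eff_a * conc_a) / (eff_b * conc_b)


# Hypothetical ionization efficiencies, assumed identical in equivalent
# regions of the two groups being compared.
eff_a, eff_b = 0.8, 0.2

# Hypothetical concentrations of metabolites A and B in WT and KO tissue.
wt_ratio = abundance_ratio(2.0, 4.0, eff_a, eff_b)  # concentration ratio is 0.5
ko_ratio = abundance_ratio(6.0, 4.0, eff_a, eff_b)  # concentration ratio is 1.5

# The between-group fold change of the abundance ratio matches the fold
# change of the concentration ratio because eff_a / eff_b cancels.
fold_change_measured = ko_ratio / wt_ratio
fold_change_true = (6.0 / 4.0) / (2.0 / 4.0)
```

If the efficiencies differ between the regions being compared, the cancellation no longer holds, which is why the comparison is restricted to equivalent regions.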
"We further show that ratio imaging minimizes systematic variations in MSI data by sample handling and instrument drift, improves image resolution, enables anatomical mapping of metabotype heterogeneity, facilitates biomarker discovery, and reveals new spatially resolved tissue regions of interest (ROIs) that are metabolically distinct but otherwise unrecognized."
Instrument drift is not accounted for by ratios as it impacts the process before ratio computation. "metabotype" - spelling?
Instrument drift here refers to changes in individual ion abundance during long data acquisition. A ratio may offer a better readout than individual metabolite abundance alone. However, for data acquired after total ion normalization, ratio data would not differ from non-ratio data. Therefore, we have deleted instrument drift from the sentence (Page 2, line 33, and Page 3, line 99).
Metabotype is a term widely used in the metabolomics field. It refers to a category defined by similar metabolic profiles, which are based on combinations of specific metabolites. https://nutritionandmetabolism.biomedcentral.com/articles/10.1186/s12986-020-00499-z
Results 3: Justify the claim that the ratio reduces artifacts. A ratio is the value from one m/z area over another and would seem that the quality of the ratio would be always lower than the individually higher quality pixel signal of the two analytes that compose a ratio.
Ratio images are indeed heatmaps of pixel-by-pixel ratio data, scaled by the range of all ratio values. For very abundant ion pairs, the individual images may not be better than the ratio image, depending on how abundance changes among pixels within tissue sections. Similarly, the quality of a ratio image may not be higher than that of the individual images if the distribution of ratios does not change much among pixels in tissue sections. For example, the metabolites and lipids in Figures 2 and 5 are abundant, but the non-ratio images are not of better quality than the ratio images. Furthermore, a ratio image provides additional information on how the ratio of a metabolite pair changes pixel by pixel across all tissue sections; such additional information could be useful for data interpretation.
Results 4: The metabolite pairs are biologically sensible but should be clearly stated that they do not account for differences in ionization efficiency between metabolites and cannot provide quantitative pathway analysis with a high degree of biological confidence.
We apologize for not explaining this application clearly. We meant to compare pathway activity among equivalent and similar pixels/regions of tissues from different biological groups, under the assumption that ionization efficiency is identical for equivalent pixels from different tissue sections (i.e., same cell type and microenvironment), especially for metabolites with similar functional structure in the same pathway. For example, fatty acids with different chain lengths and phospholipids with the same head group are expected to have similar ionization efficiency in the same tissue pixel/region. We have therefore rewritten this section (Page 7, 239-247, 254-255).
Results 4: "cell-type specific metabolic activity at cellular (10 µm) spatial resolution" Prove the cell type differences with IHC coregistration or MALDI IHC if you want to make claims about them. Just visually determining a tissue type of a scan of a slide is inadequate to support this claim.
We agree with the reviewer’s comments. We meant to provide additional information on cellular-level metabolic activity, such as adenosine nucleotide phosphorylation status (ATP/AMP ratio), at 10 µm resolution. Hippocampal neurons provide a good example of this utility. We have rewritten the claim to highlight the role of ratio imaging in providing additional metabolic information (Page 8, line 288-290).
Minor Comments:
Table 2 "Aspartiate" spelling
We have corrected it.
Describe the process and mathematical background for ratio computation in the Methods section. As this paper introduces a package, describing its underlying functions has value.
We have added R-script comments to illustrate the untargeted ratio calculation, which applies R's combination and division functions to every pair of metabolites in a data matrix (Page 4, line 139-141).
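The combination-and-division step described here can be sketched as follows. This is a Python illustration rather than the authors' R script; the function name `pairwise_ratios`, the feature labels, and the pixel values are hypothetical. It assumes rows are pixels, columns are m/z features, and that zeros or missing values have already been replaced so that every division is defined.

```python
from itertools import combinations


def pairwise_ratios(pixel_rows, feature_names):
    """Return {(feature_i, feature_j): [ratio per pixel]} for every unordered pair."""
    columns = list(zip(*pixel_rows))  # transpose: one tuple of pixel values per feature
    ratios = {}
    for i, j in combinations(range(len(feature_names)), 2):
        # Assumes no zero denominators (missing values imputed beforehand).
        ratios[(feature_names[i], feature_names[j])] = [
            a / b for a, b in zip(columns[i], columns[j])
        ]
    return ratios


# Hypothetical matrix: 3 pixels (rows) x 3 m/z features (columns).
pixels = [
    [2.0, 4.0, 8.0],
    [1.0, 1.0, 2.0],
    [6.0, 3.0, 1.5],
]
labels = ["mz_a", "mz_b", "mz_c"]
ratio_table = pairwise_ratios(pixels, labels)
```

For n features this produces n(n-1)/2 ratio images, so the number of pairs grows quadratically with the size of the feature list.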
"we annotate missing values with 1/5 the minimum value quantified in all pixels in which it was detected" This is explicit (ie only values with exactly 1/5 the value are annotated" - make it clear this is a threshold.
We apologize for the misunderstanding. Missing values either have no value or have an abundance of exactly zero. We first calculate the minimum abundance of a particular m/z among all pixels with detectable abundance (i.e., excluding missing values), then use 1/5 of this minimum value as a threshold to annotate missing values (Page 4, 133-139).
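A minimal Python sketch of this missing-value rule is shown below. It is an illustration and not the authors' R code; the function name `impute_missing` is hypothetical, and it assumes that missing values are represented as `None` or zero for a single m/z feature across pixels.

```python
def impute_missing(values, divisor=5):
    """Replace missing entries (None or 0) with min(detected values) / divisor."""
    detected = [v for v in values if v is not None and v > 0]
    if not detected:
        # Feature never detected: nothing to base a fill value on.
        return list(values)
    fill = min(detected) / divisor
    return [v if (v is not None and v > 0) else fill for v in values]


# Hypothetical abundances of one m/z feature across five pixels.
feature = [10.0, 0.0, None, 5.0, 20.0]
imputed = impute_missing(feature)
```

Filling with a small positive value keeps every ratio defined while keeping imputed pixels below any measured abundance for that feature.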
Figure 1: legend scils is branded SCiLS and EXCEL does not need caps lock (Excel).
Figure 1 legend has been corrected.
Conflicts of interest "None" - there are Bruker employees on a paper about MALDI method development in a field they dominate.
We added Joshua Fischer as a Bruker employee.
Figure 3: The legend does not describe the purple arrow in J.
Purple arrow description is added to figure legend.
Figure 5: Fix orientation inconsistencies in G, H, I, and J. Especially in J - they are opposite directions. This is arbitrary and determined in SCiLS lab with simple rotation.
Orientation has been made consistent in G, H, I and J.
Figure S8: Provide exact number of biological and technical replicates used to generate this figure.
Figure S8, now Figure S9, was generated from 4 biological replicates of KO and 4 biological replicates of WT brain sections in the ROI7 region. This information has been added to the figure legend.
Figure S9: Make consistent orientation of all brains
We have made brain orientations consistent.
In addition to ionization efficiencies impacting the value of the numeric relative abundance where ratio computation originates from, it should be mentioned how different classes of metabolites are differentially impacted by the euthanasia and collection methods used for various tissue types. For example, it is well established the ATP/AMP ratio can change drastically from tissue collection.
We have added this to page 8, line 315-319.
Perform standards to adjust for ionization efficiency between different m/z features.
Untargeted ratio imaging serves as an add-on MSI data analysis tool whose primary use is comparing ratios among equivalent regions/pixels with similar ionization efficiencies. It is a hypothesis-generation tool. Using standards to adjust for ionization efficiency would be a great idea for a more accurate assessment of ratio values. Because of the cost and limited availability of stable isotope standards for different m/z features, we chose glucose, lactate and ascorbate as example brain metabolites to show that abundance ratios and concentration ratios result in similar images (page 6, 196-205).
Add more controls to support the claims.
We have 4 biological replicates for each genotype of brain. We have added the number of controls in all figure legends.
Significantly tone down the claims, it is unclear how knowledgeable the authors are about the current literature of SW regarding MALDI.
The tone has been significantly toned down throughout the revised manuscript.
Reviewer #2 (Recommendations For The Authors):
Abstract:
"relative abundance of structurally identified and yet-undefined metabolites across tissue cryosections" is misleading, since tandem MS can be performed in an imaging context and is often also compatible with the same instrument.
We have deleted this sentence in the abstract.
Intro:
Paragraph 1: The authors mention MALDI and DESI, but I would argue that SIMS is more abundantly used than DESI within single-cell applications.
We have added SIMS to the introduction (Page 3, line 67).
Paragraph 2: While it may not be all detected pairs, there are many examples of ratio imaging in the MALDI MSI and SIMS communities, particularly for bacterial signaling. These would be important examples to reference.
We have added the application of SIMS ratio imaging to the introduction, page 3, line 74-75.
Materials :
Paragraph 1: More specificity on sample size is required. 3 or 4 per group is not specific. Which has four and which has three? Why are they different?
We have corrected sample numbers for specific genotype in the text and figure legends. The number of sections per group is different due to the availability of fresh-frozen tissues (Page 4, line 115-117).
Results:
Paragraph 1: Am I correct in reading that an .imzml can't be used directly? Why not?
Imaging Mass Spectrometry Markup Language (imzML) is a common data format for mass spectrometry imaging. It was developed to allow the flexible and efficient exchange of large MS imaging data between different instruments and data analysis software (Schramm et al, 2012). It contains two sets of data: the mass spectral data, which are stored in a binary file (.ibd file) to ensure efficient storage, and the XML metadata (.imzML file), which stores instrumental parameters and sample details. Therefore, it can’t be used directly. We have added this to Result 1 (Page 5, line 160-169).
Paragraph 4: "Additionally, nonlipid small molecule metabolites suffer from smearing and/or diffusion during cryosection processing, including over the course of matrix deposition for MALDI-MSI." This is misleading. There are several examples of MALDI MSI of small metabolites that are nonlipids, where smearing or diffusion have not occurred. It would be beneficial to have a more accurate discussion of this instead. The authors should also provide some evidence of this, since they continue to focus on it for the full paragraph and don't provide references.
We initially meant that the poor image quality of small-molecule metabolites is due to their interaction with the aqueous phase of the spraying solution, their rapid degradation rate and matrix interference. We have deleted this sentence in the revised version.
Section 5 Paragraph 2; "However, ratio imaging revealed a much greater aspartate to glutamate ratio in an unusual "moon arc" region across the amygdala and hypothalamus relative to the rest of the coronal brain." Much greater isn't scientifically accurate or descript. Use real numbers and be quantitative.
We used pixel data from all 8 sections to obtain quantitative changes in the ratio-generated “moon arc” region compared to the rest of the coronal brain (page 8, line 331-337). Ratio imaging revealed an average 1.59-fold increase in the aspartate to glutamate ratio in an unusual “moon arc” region across the amygdala and hypothalamus (mean abundance 0.563 in 6345 pixels) relative to the rest of the coronal brain (mean abundance 0.353 in 45742 pixels, Figure 5D). Similar but different arc-like structures are encompassed within the ventral thalamus and hypothalamus, wherein the glutamate to glutamine ratio shows a 1.63-fold increase in intensity compared to the rest of the brain (mean abundance of 0.695 in 7108 pixels vs 0.428 in 44979 pixels, Figure 5E).
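The fold changes quoted in this response follow from dividing the mean ratio in the region of interest by the mean ratio in the rest of the section. The Python sketch below reproduces that arithmetic from the rounded means given in the text; the function name `fold_change` is hypothetical. With these rounded inputs the first pair gives about 1.59 and the second about 1.62, slightly below the reported 1.63, a gap that would be consistent with the reported value having been computed from unrounded means (an assumption, not something stated in the response).

```python
def fold_change(mean_region, mean_reference):
    """Fold change of a region's mean ratio relative to a reference mean ratio."""
    return mean_region / mean_reference


# Rounded mean ratio values as quoted in the response.
asp_glu = fold_change(0.563, 0.353)  # "moon arc" region vs rest of coronal brain
glu_gln = fold_change(0.695, 0.428)  # arc-like region vs rest of brain
```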
Section 8 Paragraph 2: "UMAPing" is not scientifically written.
We have replaced UMAPing with UMAP.
Figure 2 is difficult to interpret, given the small sizes of the images. Align the images, reduce the white space, clearly label the different tissues, add scale bars, increase size, etc. This applies to all figures, except for 3. This will make it possible to review.
All figures have been resized by removing extra space between sections.
Figure 3. There seems to be a change in tissue after section I, so a different diagram would be helpful. SCD has a high abundance in an area that seems to be off of the tissue. Can the authors explain this? Some of the images also appear to be low signal-to-noise. Example spectra in the SI would be helpful, so I can more accurately judge the quality of the data.
We apologize for the discrepancy. All images are from the same sample. We initially cropped the individual image from multiple page PDF plot, then inserted it in Figure 3. Resizing and cropping inconsistency may lead to the small difference in image size. In the revised version, we plot all images in one page, which eliminates the inconsistency.
Figure 3 example pixel data, ratio pixel data, mass spectra and ratio images can be downloaded below:
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public review):
In this revised manuscript, the authors aim to elucidate the cytological mechanisms by which conjugated linoleic acids (CLAs) influence intramuscular fat deposition and muscle fiber transformation in pig models. They have utilized single-nucleus RNA sequencing (snRNA-seq) to explore the effects of CLA supplementation on cell populations, muscle fiber types, and adipocyte differentiation pathways in pig skeletal muscles. Notably, the authors have made significant efforts in addressing the previous concerns raised by the reviewers, clarifying key aspects of their methodology and data analysis.
Strengths:
(1) Thorough validation of key findings: The authors have addressed the need for further validation by including qPCR, immunofluorescence staining, and western blotting to verify changes in muscle fiber types and adipocyte populations, which strengthens their conclusions.
(2) Improved figure presentation: The authors have enhanced figure quality, particularly for the Oil Red O and Nile Red staining images, which now better depict the organization of lipid droplets (Figure 7A). Statistical significance markers have also been clarified (Figure 7I and 7K).
Thanks!
Weaknesses:
(1) Cross-species analysis and generalizability of the results: Although the authors could not perform a comparative analysis across species due to data limitations, they acknowledged this gap and focused on analyzing regulatory mechanisms specific to pigs. Their explanation is reasonable given the current availability of snRNA-seq datasets on muscle fat deposition in other species such as human and mouse.
Thanks for your suggestion!
(2) Mechanistic depth in JNK signaling pathway: While the inclusion of additional experiments is a positive step, the exploration of the JNK signaling pathway could still benefit from deeper analysis of downstream transcriptional regulators. The current discussion acknowledges this limitation, but future studies should aim to address this gap fully.
Thanks! As noted in the Discussion, further studies should focus on the downstream transcriptional regulators of the JNK signaling pathway in IMF deposition.
(3) Limited exploration of other muscle groups: The authors did not expand their analysis to additional muscle groups, leaving some uncertainty regarding whether other muscle groups might respond differently to CLA supplementation. Further studies in this direction could enhance the understanding of muscle fiber dynamics across the organism.
Thanks for your suggestion! In this study, we mainly focused on the adipocyte, muscle and FAP subpopulations, which play important roles in lipid deposition. As you suggested, our future studies will focus on other subpopulations such as endothelial cells and immune cells.
Reviewer #2 (Public review):
Summary:
This study comprehensively presents data from single nuclei sequencing of Heigai pig skeletal muscle in response to conjugated linoleic acid supplementation. The authors identify changes in myofiber type and adipocyte subpopulations induced by linoleic acid at depth previously unobserved. The authors show that linoleic acid supplementation decreased the total myofiber count, specifically reducing type II muscle fiber types (IIB), myotendinous junctions, and neuromuscular junctions, whereas type I muscle fibers are increased. Moreover, the authors identify changes in adipocyte pools, specifically in a population marked by SCD1/DGAT2. To validate the skeletal muscle remodeling in response to linoleic acid supplementation, the authors compare transcriptomics data from Laiwu pigs, a model of high intramuscular fat, to Heigai pigs. The results verify changes in adipocyte subpopulations when pigs have higher intramuscular fat, either genetically or diet-induced. Targeted examination using cell-cell communication network analysis revealed associations with high intramuscular fat with fibro-adipogenic progenitors (FAPs). The authors then conclude that conjugated linoleic acid induces FAPs towards adipogenic commitment. Specifically, they show that linoleic acid stimulates FAPs to become SCD1/DGAT2+ adipocytes via JNK signaling. The authors conclude that their findings demonstrate the effects of conjugated linoleic acid on skeletal muscle fat formation in pigs, which could serve as a model for studying human skeletal muscle diseases.
Strengths:
The comprehensive data analysis provides information on conjugated linoleic acid effects on pig skeletal muscle and organ function. The notion that linoleic acid induces skeletal muscle composition and fat accumulation is considered a strength and demonstrates the effect of dietary interactions on organ remodeling. This could have implications for the pig farming industry to promote muscle marbling. Additionally, these data may inform the remodeling of human skeletal muscle under dietary behaviors, such as elimination and supplementation diets and chronic overnutrition of nutrient-poor diets. However, the biggest strength resides in thorough data collection at the single nuclei level, which was extrapolated to other types of Chinese pigs.
Weaknesses:
Although the authors compiled a substantial and comprehensive dataset, the scope of cellular and molecular-level validation still needs to be expanded. For instance, the single nuclei data suggest changes in myofiber type after linoleic acid supplementation, but these findings need more thorough validation. Further histological and physiological assessments are necessary to address fiber types and oxidative potential. Similarly, the authors propose that linoleic acid alters adipocyte populations, FAPs, and preadipocytes; however, there are limited cellular and molecular analyses to confirm these findings. The identified JNK signaling pathways require additional follow-ups on the molecular mechanism or transcriptional regulation. However, these issues are discussed as potential areas for future exploration. While various individual studies have been conducted on mouse/human skeletal muscle and adipose tissues, these have only been briefly discussed, and further investigation is warranted. Additionally, the authors incorporate two pig models into their results, but they only examine one muscle group. Exploring whether other muscle groups respond similarly or differently to linoleic acid supplementation would be valuable. Furthermore, the authors should discuss how their results translate to human and pig nutrition, such as the desirability and cost-effectiveness for pig farmers and human diets high in linoleic acid. Notably, while the single nuclei data is comprehensive, there needs to be a statement on data deposition and code availability, allowing others access to these datasets.
Thanks for your suggestion!
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The authors have discussed and provided some experimental evidence to address the related issues to help justify their conclusions. The reviewer believes that authors should deposit their single-cell sequencing data and code for the broader research community.
Thank you! We have uploaded our raw dataset to the Genome Sequence Archive (Genomics, Proteomics & Bioinformatics 2021) in the National Genomics Data Center (Nucleic Acids Res 2022), China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences, and the data availability section has been updated (line 575-579).
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This study investigates the role of macrophage lipid metabolism in the intracellular growth of Mycobacterium tuberculosis. By using a CRISPR-Cas9 gene-editing approach, the authors knocked out key genes involved in fatty acid import, lipid droplet formation, and fatty acid oxidation in macrophages. Their results show that disrupting various stages of fatty acid metabolism significantly impairs the ability of Mtb to replicate inside macrophages. The mechanisms of growth restriction included increased glycolysis, oxidative stress, pro-inflammatory cytokine production, enhanced autophagy, and nutrient limitation. The study demonstrates that targeting fatty acid homeostasis at different stages of the lipid metabolic process could offer new strategies for host-directed therapies against tuberculosis.
The work is convincing and methodologically strong, combining genetic, metabolic, and transcriptomic analyses to provide deep insights into how host lipid metabolism affects bacterial survival.
Strengths:
The study uses a multifaceted approach, including CRISPR-Cas9 gene knockouts, metabolic assays, and dual RNA sequencing, to assess how various stages of macrophage lipid metabolism affect Mtb growth. The use of CRISPR-Cas9 to selectively knock out key genes involved in fatty acid metabolism enables precise investigation of how each step (lipid import, lipid droplet formation, and fatty acid oxidation) affects Mtb survival. The study offers mechanistic insights into how different impairments in lipid metabolism lead to diverse antimicrobial responses, including glycolysis, oxidative stress, and autophagy. This deepens the understanding of macrophage function in immune defense.
The use of functional assays to validate findings (e.g., metabolic flux analyses, lipid droplet formation assays, and rescue experiments with fatty acid supplementation) strengthens the reliability and applicability of the results.
By highlighting potential targets for HDT that exploit macrophage lipid metabolism to restrict Mtb growth, the work has significant implications for developing new tuberculosis treatments.
Weaknesses:
The experiments were primarily conducted in vitro using CRISPR-modified macrophages. While these provide valuable insights, they may not fully replicate the complexity of the in vivo environment where multiple cell types and factors influence Mtb infection and immune responses.
We thank the reviewer for pointing this out. We acknowledge that our in vitro system may indeed not fully replicate the complex in vivo environment, given what is coming to light about heterogeneous macrophage responses to Mtb infection in whole-animal models. We do believe, however, that the Hoxb8 in vitro model provides a powerful genetic tool to interrogate host-Mtb interactions using primary macrophages that represent the bone marrow-derived macrophage lineage.
Reviewer #2 (Public review):
Summary:
Host-derived lipids are an important factor during Mtb infection. In this study, using CRISPR knockouts of genes involved in fatty acid uptake and metabolism, the authors claim that a compromised uptake, storage, or metabolism of fatty acid restricts Mtb growth upon infection. Further, the authors claim that the mechanism involves increased glycolysis, autophagy, oxidative stress, pro-inflammatory cytokines, and nutrient limitation. The authors also claim that impaired lipid droplet formation restricts Mtb growth. However, promoting lipid droplet biogenesis does not reverse/promote Mtb growth.
Strengths:
The strength of the study is the use of clean HOXB8-derived primary mouse macrophage lines for generating CRISPR knockouts.
Weaknesses:
There are many weaknesses of this study, they are clubbed into four categories below
(1) Evidence and interpretations: The results shown in this study at several places do not support the interpretations made or are internally contradictory or inconsistent. There are several important observations, but none were taken forward for in-depth analysis.
a) The phenotypes of PLIN2<sup>-/-</sup>, FATP1<sup>-/-</sup>, and CPT<sup>-/-</sup> are comparable in terms of bacterial growth restriction; however, their phenotypes in terms of lipid body formation, IL1B expression, etc., are not consistent. These are interesting observations and suggest additional mechanisms specific to specific target genes; however, clubbing them all as altered fatty acid uptake or catabolism-dependent phenotypes takes away this important point.
We thank the reviewer for highlighting this. Our focus was on assessing the impact of manipulating lipid homeostasis in macrophages at several stages and the consequences this has on the intracellular growth of Mtb. Throughout the manuscript (abstract, results and discussion), we have continuously emphasized that interfering with lipid handling at several stages in macrophages results in both conserved and divergent antimicrobial responses against intracellular Mtb.
b) Finding the FATP1 transcript in the HOXB8-derived FATP1<sup>-/-</sup> CRISPR KO line is a bit confusing. There is less than a two-fold decrease in relative transcript abundance in the KO line compared to the WT line, leaving concerns regarding the robustness of other experiments as well using FATP1<sup>-/-</sup> cells.
CRISPR-Cas9 targeting of genes with single sgRNAs, as is the case with our mutants, generates insertions and deletions (INDELs) at the CRISPR cut site. These INDELs do not block mRNA transcription totally, and this is widely reported in the field. Because of this, quantitative RT-PCR or RNA-seq methods are not routinely used to verify CRISPR knockouts as they are not sensitive enough to identify INDELs. We provide INDEL quantification and knockout efficiencies by ICE analysis in supplemental file 1 for all the mutants used in the study. We also demonstrate protein depletion by western blot and flow cytometry for all the mutants (Figure 1 - figure supplement 1). Only mutants with >90% protein depletion were used for subsequent characterization.
c) No gene showing differential regulation in FATP<sup>-/-</sup> macrophages, which is very surprising.
We assume the reviewer is referring to the Mtb transcriptome response in FATP1<sup>-/-</sup> macrophages, which we agree was unexpected. However, we saw a significant compensatory response in the host cell (at transcriptional level) in FATP1<sup>-/-</sup> macrophages as evidenced by an upregulation of other fatty acid transporters (Figure 5 - figure supplement 1, now Figure 6 - figure supplement 1). We believe that these compensatory responses could, in part, alleviate the stresses the bacteria experience within the cell. We discuss this point in the manuscript.
d) ROS measurements should be done using flow cytometry and not by microscopy to nail the actual pattern.
We thank the reviewer for the suggestion. However, confocal imaging is also widely used to measure ROS with similar quantitative power and individual cell resolution (PMID: 32636249, 35737799).
(2) Experimental design: For a few assays, the experimental design is inappropriate
a) For autophagy flux assay, immunoblot of LC3II alone is not sufficient to make any interpretation regarding the state of autophagy. This assay must be done with BafA1 or CQ controls to assess the true state of autophagy.
We would like to point out that monitoring LC3I to LC3II conversion by western blot, confocal imaging of LC3 puncta and qPCR analysis of autophagy-related genes are all validated assays for monitoring autophagic flux in a wide variety of cells. We refer the reviewer to the latest extensive guidelines on the subject (PMID: 33634751). Furthermore, Bafilomycin A and chloroquine are not specific inhibitors of autophagy and therefore are of limited value as controls. BafA is an inhibitor of the proton-ATPase apparatus and can indirectly impact autophagy through activity on the Ca-P60A/SERCA pathway. Chloroquine impacts vacuole acidification, autophagosome/lysosome fusion and slows phagosome maturation. So, while BafA and chloroquine will reduce autophagy, their effects are pleiotropic and their impact on Mtb is unknown.
b) Similarly, qPCR analyses of autophagy-related gene expression do not reflect anything on the state of autophagy flux.
See our response above.
(3) Using correlative observations as evidence:
a) Observations based on RNAseq analyses are presented as functional readouts, which is incorrect.
We are not entirely sure where we used our RNA-seq data sets as functional readouts. We used our transcriptome data to provide a preliminary identification of anti-microbial responses in the mutant macrophages infected with Mtb and we mention this at the beginning of the RNA-seq results sections. Where applicable, we followed up and confirmed the more compelling RNA-seq data by metabolic flux analyses, qPCR, ROS measurements, or quantitative imaging.
b) Claiming that the inability to generate lipid droplets in PLIN2<sup>-/-</sup> cells led to the upregulation of several pathways in the cells is purely correlative, and the causal relationship does not exist in the data presented.
It was not our intention to infer causality. We have re-written the beginning of the sentence, and it now starts with “Meanwhile, Mtb infection of PLIN2<sup>-/-</sup> macrophages led to upregulation” which hopefully eliminates any association to causality.
(4) Novelty: A few main observations described in this study were previously reported. That includes Mtb growth restriction in PLIN2 and FATP1 deficient cells. Similarly, the impact of Metformin and TMZ on intracellular Mtb growth is well-reported. While that validates these observations in this study, it takes away any novelty from the study.
To the best of our knowledge, Mtb growth restrictions in PLIN2 and FATP1 deficient macrophages have not been reported elsewhere. To the contrary, PLIN2 knockout macrophages obtained from PLIN2 deficient mice have been reported to robustly support Mtb replication (PMID: 29370315). We extensively discuss these discrepancies in the manuscript. We also discuss and cite appropriate references where Mtb growth restriction for similar macrophage mutants has been reported (CD36<sup>-/-</sup> and CPT2<sup>-/-</sup>). Our aim was to carry out a systematic myeloid-specific genetic interference of fatty acid import, storage and catabolism to assess the effect on Mtb growth at all stages of lipid handling instead of focusing on one target. In the chemical approach, we used TMZ and Metformin deliberately because they had already been reported as being active against intracellular Mtb and we wished to place our data in the context of existing literature. These studies have been referenced extensively in the text.
(5) Manuscript organisation: It will be very helpful to rearrange figures and supplementary figures.
New figures have been added, and existing ones have been re-arranged where necessary. See our responses to recommendations for authors.
Reviewer #3 (Public review):
Summary:
This study provides significant insights into how host metabolism, specifically lipids, influences the pathogenesis of Mycobacterium tuberculosis (Mtb). It builds on existing knowledge about Mtb's reliance on host lipids and emphasizes the potential of targeting fatty acid metabolism for therapeutic intervention.
Strengths:
To generate the data, the authors use CRISPR technology to precisely disrupt the genes involved in lipid import (CD36, FATP1), lipid droplet formation (PLIN2), and fatty acid oxidation (CPT1A, CPT2) in mouse primary macrophages. The Mtb Erdman strain is used to infect the macrophage mutants. The study reveals specific roles of different lipid-related genes. Importantly, results challenge previous assumptions about lipid droplet formation and show that macrophage responses to lipid metabolism impairments are complex and multifaceted. The experiments are well-controlled and the data are convincing.
Overall, this well-written paper makes a meaningful contribution to the field of tuberculosis research, particularly in the context of host-directed therapies (HDTs). It suggests that manipulating macrophage metabolism could be an effective strategy to limit Mtb growth.
Weaknesses:
None noted. The manuscript provides important new knowledge that will lead to novel host-directed therapies to control Mtb infections.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
The study presents compelling and well-supported conclusions based on a solid body of evidence. However, the clarity of several figures could be improved for better understanding.
(1) In Figure 1, panels B and C are referenced incorrectly in the text.
We thank the reviewer for identifying the error. This has now been corrected.
(2) Figures 2 and S2 would benefit from being combined or reorganized to display the data related to infected and uninfected cells together, making it easier for the reader to interpret.
We thank the reviewer for the suggestion. However, we believe that combining the two figures would make the merged figure even more difficult to interpret. We decided to highlight the mutant macrophages' responses upon Mtb infection in Figure 2 and put the uninfected data sets in supplementary information given that the OCR and ECAR trends were similar and as expected in both infected and uninfected states.
(3) Figure 3 is mislabeled, with four panels shown in the figure, but only panels A and B are mentioned in both the text and the figure legend.
We thank the reviewer for the observation. Figure 3 has been extensively revised. We have included new blots, statistical comparisons and a corresponding new supplementary figure (Figure 3 - figure supplement 1). We have verified that the figure panels are labelled correctly and appropriately referenced in the manuscript text.
(4) Figure 5 is overly complex and difficult to interpret. Simplifying the figure, possibly by reducing the amount of data or breaking it into more digestible parts, would enhance its readability.
We thank the reviewer for the suggestion. We have separated the figure into two parts which are now Figure 5 for the PCA and Venn diagrams and Figure 6 for the pathway enrichment figure panels. We have increased the resolution of both figures in the revised manuscript to improve readability.
(5) Panel 6A is not particularly informative and could either be omitted with a more detailed explanation provided in the text, or replaced with a clearer visual representation, such as Venn diagrams, to improve data visualization.
We thank the reviewer for the suggestion. We have removed Figure 6A given that detailed explanation of the panel is already available in the manuscript text.
(6) Additionally, on line 309, the word "to" is missing before "generate".
We thank the reviewer for identifying this. This sentence has now been re-written to address some unintended inferences of causation in line with recommendations from reviewer 2.
Reviewer #2 (Recommendations for the authors):
(1) Manuscript Organisations: The manuscript is very poorly organised. Supplemental figures are labelled very unconventionally, and that creates much confusion in following the manuscript. Some of the results in the supplementary figures could be easily kept in the main figures, as it is difficult to compare plots between the main figures and the supple figures. The results of RNAseq experiments are impossible to follow with very small fonts. Overall, the figures are very casually organised and can certainly be improved.
We would like to clarify that supplemental figures are labelled and organized in line with eLife formatting of supplemental figures. We deliberately put some redundant figures like Figure 2 - figure supplement 1 in supplementary information (see our response to reviewer 1 recommendations on the same). We have split the RNA-seq Figure 5 into two separate figures (now Figure 5 and 6) and increased their resolution to improve readability.
(2) Figure 3: Among the KO lines, only PLIN2<sup>-/-</sup> had a higher HIF1a level before infection. Infection surely leads to higher levels across the three cases.
We have generated replicate western blots and provide statistical quantitation for HIF1a, AMPK and pAMPK. Figure 3 has now been revised extensively; replicate blots are in Figure 3 - figure supplement 1. We have updated the text to reflect the reviewer's observation, which was also consistent with our statistical quantification.
(3) pAMPK blots are of very poor quality. Without quantification, the trend mentioned in the text is not clearly visible.
We have provided two more replicate blots for AMPK/pAMPK and provide statistical quantification as described above.
(4) Line 230: Regarding autophagy flux, neither the data suggest what is interpreted nor is this experiment correctly done. LC3 WB and autophagy gene qPCR: Unfortunately, LC3 WB, the way it was done, does not tell anything about the state of autophagy in these cells. A very mild LC3II increase is noted in CPT2<sup>-/-</sup> cells upon infection; the rest of the others do not show any change. This assay is not done correctly. To interpret LC3II WB, one needs to include the Bafilomycin A1 control, usually +Baf and -Baf run in the adjacent wells in the gel. Similarly, qPCR results are not indicative of any increase in autophagy. Regulation of ATG7, MAP1LC3B, and ULK1 is more at the post-translational level than the transcriptional level.
We have provided an additional replicate blot together with statistical quantification of LC3II/LC3I ratios in the revised Figure 3 - figure supplement 2. Our quantifications remain consistent with our prior assertions in the manuscript text. See our response in the public review section concerning autophagy assays and the use of Baf or chloroquine as controls.
(5) Exogenous oleate fails to rescue the Mtb icl1-deficient mutant in FATP1<sup>-/-</sup>, PLIN2<sup>-/-</sup> and CPT2<sup>-/-</sup> macrophages: this result is confusing. Lipid uptake and metabolism have been the central players so far; however, here, the phenotypes of FATP1 and CPT2 in terms of lipid body accumulation are very distinct. Therefore, the assessment that Mtb growth inhibition is due to factors other than limited access to fatty acid is not consistent with the theme of the study.
Nutrient limitation is a distinct transcriptional signature of Mtb, at least in PLIN2<sup>-/-</sup> macrophages (Figure 7). We used the oleate supplementation assay with the Mtb Δicl1 mutant to assess whether nutrient restriction was the sole anti-microbial pathway against Mtb in the knockout macrophages. This would have been the case (to a certain extent) if the growth of the Mtb Δicl1 mutant was rescuable upon addition of exogenous oleate in the knockout macrophages. Our data clearly shows that this is not the case and that in addition to nutrient limitation, interference with lipid processing results in several other macrophage anti-microbial responses against the bacteria. We extensively discuss these points in the abstract, results and discussion sections of the manuscript.
(6) Line 309: "Meanwhile, inability generate lipid droplets in Mtb infected PLIN2<sup>-/-</sup> macrophages led to upregulation in pathways involved in ribosomal biology, MHC class 1 antigen presentation, canonical glycolysis, ATP metabolic processes and type 1 interferon responses (Figure 5C, Supplementary file 3)." This is just a correlative observation. However, it is mentioned here as a causal mechanism.
We have revised this sentence to remove any unintended inference of causation.
(7) IL-1b is upregulated in FATP-/- macrophages, no effect in CPT2<sup>-/-</sup> macrophages, but downregulated in PLIN2<sup>-/-</sup> macrophages. Moreover, this effect is very transient, and by 24 hours, all these differences are lost. This suggests the mechanism of action, as their pro-bacterial function shown in Figure 1, is very distinct for different proteins, and FA metabolism is probably not the common denominator across these phenotypes.
We agree with the reviewer, and we extensively discuss this in the manuscript text (results and discussion). Clearly, there are shared anti-microbial responses across the mutants, but there are also points of divergence. We would like to further clarify that pro-inflammatory responses (IL-1b or IFN-B) in Mtb infected macrophages show a biphasic early upregulation (up to 8 hours of infection) followed by a rapid resolution phase (24-48 hours post infection). This is well reported in the literature (PMID: 30914513). It is common for pro-inflammatory gene expression differences to be temporarily lost during the resolution phase (PMID: 30914513, 39472457). IL-1b expression profiles return to the 4-hour equivalent profile in Mtb infected FATP1<sup>-/-</sup> and PLIN2<sup>-/-</sup> macrophages 4 days post infection (Figure 6A, Figure 6 - figure supplement 2B, Supplementary file 2).
(8) It is very surprising that FATP-/- macrophages do not show any change in Mtb gene expression. The robustness of this experiment and analysis appears doubtful, given that the phenotype in terms of bacterial growth was clean.
See our response to this comment in the public reviews section.
(9) Figure 5, Supplementary Figure 1: Among the FA transporters, authors also show data for FATP1. I am surprised to see FATP1 expression levels in the FATP1<sup>-/-</sup> cells. This puts into doubt every dataset using FATP-/- cells in this study.
See our response to this comment in the public reviews section.
(10) Unfortunately, with the kind of evidence presented, it is far-fetched to claim that PLIN2<sup>-/-</sup> macrophages restrict Mtb growth by increasing ROS production. There is no evidence for this statement. The MFI units in Figure 6, Supplementary 1 are too small to extract meaningful interpretations. Moreover, the data appears to be arrived at by combining multiple technical replicates. Usually, flow cytometry data are more reliable for CellROX assays. Microscopy is not the technique of choice for this assay.
We would like to point out that MFIs are arbitrary units set to predetermined reference points. In our case, the reference was background fluorescence in CellROX unstained cells and cells stained with CellROX equivalent fluorophore conjugated isotype antibodies. We are not entirely sure what the reviewer means by “small” in these contexts. Moreover, the data is not entirely from technical replicates. Reported MFIs are from three independent repeats with MFI reads of at least 30 cells per replicate. We have added this clarification in Figure 6 - figure supplement 1 legend, now Figure 7 - figure supplement 1. See our response in the public reviews section on the use of confocal microscopy to image and quantify ROS. Furthermore, the Mtb transcriptional response in PLIN2<sup>-/-</sup> and CPT2<sup>-/-</sup> macrophages is clearly indicative of increased oxidative stresses (Figure 7).
(11) The CFU results with Metformin and TMZ are on the expected lines, as published earlier by others. FATP1 In data is good and aligned with the knockout phenotype.
We thank the reviewer for the note.
(12) Western blots, when interpreted for quantitative differences, must be quantified, and data should be represented as plots with statistical analysis.
Replicate blots have been provided and statistical quantifications performed.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public reviews
Reviewer #1 (Public review):
Overall I find the evidence very well presented and the study compelling. It offers an important new perspective on the key properties of neoblasts. I do have some comments to clarify the presentation and significance of the work.
We thank the reviewer for the positive feedback and plan to improve the presentation of the work.
Reviewer #2 (Public review):
However, the absence of a cell-cell feedback mechanism during colony growth and the likelihood of the difference needs to be clarified. Is there any difference in interpreting the results if this mechanism is considered?
We will improve the description of the model assumptions and the interpretation of the data on the basis of these assumptions.
Although hnf-4 and foxF have been silenced together to validate the model, a deeper understanding of the tgs-1+ cell type and the non-significant reduction of tgs-1+ neoblasts in zfp-1 RNAi colonies is necessary, considering a high neural lineage frequency.
We will improve the analysis of this result in light of the experimentally determined frequency of the tgs-1+ neoblast population.
Recommendations for the authors
Reviewing Editor Comments:
After consultation, we have compiled a list of the key changes to be made to the manuscript, along with reviewer-specific recommendations to follow.
(1) Include a section that explicitly describes the assumptions and limitations of the study, particularly with respect to the following assumptions:
We thank the reviewers for the comment. We added a description of the model assumptions in the methods section “Assumptions underlying neoblast colony growth model”.
a) All known types of specialized neoblasts cycle at the same rate (see points from Reviewer 1).
We thank the reviewers for the comment. The current data used to estimate τ (Lei et al., Dev Cell, 2016) does not allow the direct estimation of individual cycling behaviors. Consequently, we assume that all specialized neoblasts cycle at the same average rate, a simplification supported by the model's accurate prediction of colony growth.
b) The assumption that any FSTF-like gene would behave like zfp1 or foxF and hnfA genes. The manuscript does not mention that there may be fundamental differences among these different FSTFs that could be uncovered by future work. A strong addition to the paper would be to test other epithelial genes (e.g. p53, chd4, egr5) to show reproducible behavior within a single lineage.
We thank the reviewers for the comment. Colony size reduction following inhibition of Smed-p53 and failure to produce epidermal progenitors is strongly supported by previous analysis (Wagner et al., Cell Stem Cell, 2012). We refer to this observation in the paper in the section titled: “Inhibition of zfp-1 does not induce overexpression of other lineages in homeostasis”. We added the following sentence to the discussion (Line 460-462): Interestingly, suppression of Smed-p53, a TF expressed in neoblasts and required for epidermal cell production, has resulted in a similar reduction in colony size (Wagner et al., Cell Stem Cell, 2012).
Of note, Chd4 expression is not limited to specialized neoblasts or to a specific lineage (Scinome et al., Development, 2010), and therefore its inhibition likely has a more complex outcome than an effect on a single lineage. Furthermore, egr-5 is not expressed in neoblasts (Tu et al, eLife, 2015), making this experimental condition more challenging to examine in the context of neoblast colonies at the time points assessed in this study.
c) The fact that the data used to feed the model relies on radiated animals which are likely to have altered cell cycle rates compared to unirradiated animals (see comment by Reviewer 1). Of note, the model predicts a steady increase in colony size, but colony size does not change between 9dpi and 12dpi.
We thank the reviewers for the comment. The colony size in control animals increased between 9 and 12 dpi (Fig 3B), as predicted by the model. In zfp-1 (RNAi) animals, the median colony size has also increased over this period, at a slower rate, which we attribute to the increase in q. We attribute the unchanged average colony size to an increase in the frequency of cells failing to proliferate, because of selection of a fate they cannot fully differentiate into.
d) In light of both reviewers' comments about colony expansion vs. feedback, the authors should discuss how predicted changes to division frequencies might change as homeostasis is reached, or explain how their model accounts for the predicted rate differences under homeostatic conditions in which overall neoblast numbers do not change. Can the model estimate when this transition might occur?
We thank the reviewers for the comment. Our colony assays are constrained by the animals' survival following sub-total irradiation (16 to 20 days). In this timeframe, the neoblast population is overwhelmingly smaller in comparison to non-irradiated animals. Therefore, the animals do not reach homeostasis during the experiment, and the model does not allow us to estimate the time the system would need to return to homeostasis.
(2) In Figure 2D, the assumption is that these adjacent smedwi-1+ cells are sisters. Previous data analyzing this relied on EdU or H3P staining to show a shared division history. When these images were collected is therefore extremely critical to include (the methods suggest 7, 9, or 12 days). The authors should justify why they believe that these adjacent cells are derived from a single neoblast that has divided only once.
We thank the reviewers for the comment. The images were collected at 7 dpi. We modified the figure legend and the associated methods to include this information. At this early time point, smedwi-1+ cell dyads are spatially separated from other neighboring cells, suggesting that they are the product of a single cell division. Importantly, our data is in complete agreement with previous estimates of symmetric renewal division rate (Raz et al., Cell Stem Cell, 2021; Lei et al, Developmental Cell, 2016).
(3) Clarify the wording 'pre-selected' in the abstract as described by Reviewer 1.
We thank the reviewers for the comment, and for clarity we replaced the wording “pre-select” with “select”.
(4) Experimental details that are important to the interpretation should be added. For example, how is belonging to a colony defined? This is important because some of the data (e.g. Figure S1A: similar numbers of smedwi-1+ cells are observed at 2dpi and 4dpi, but 4dpi is considered a colony whereas 2dpi is not). The timing of quantification should be included in each figure (it is missing in Figure S2, and Figure 3C and 3D). How the authors distinguish biological vs technical replicates is not mentioned.
We thank the reviewers for the comment. Subtotal irradiation may result in formation of a spatially-isolated cluster of neoblasts that is not distributed throughout the animal (Wagner et al., Science, 2011). This localized cluster of neoblasts is defined as a neoblast colony (Wagner et al., Science, 2011; Wagner et al., Cell Stem Cell, 2012). The small number of high smedwi-1+ cells observed at 4 dpi in our experiments aligns with this definition (Fig S1A). By contrast, the low smedwi-1 expression detected across the animal 2 dpi does not fit this definition and likely reflects remnants of dying neoblasts resulting from irradiation. The following text was added to the figure legend: “isolated cells expressing low levels of smedwi-1+ were scattered in the planarian parenchyma, likely reflecting remnants of dying neoblasts”.
(5) Figure 5F appears to use SMEDWI-1 antibody (based on capital letters and increased signal in the brain). Is this the case? The methods do not mention the use of a SMEDWI-1 antibody, and the text indicates that these are progenitors, but SMEDWI-1 protein is well known to not mark neoblasts. If the antibody was used, the authors should not claim that these are neoblasts.
We thank the reviewers for the comment. The SMEDWI-1 antibody used in the experiments described in Figure 5F indeed labels neoblasts and their progeny (Guo et al., Developmental Cell, 2006). The methods section “Immunofluorescence combined with FISH” details the labeling procedure, which combines FISH and IF using this antibody.
All microscopy images are difficult to see. Perhaps this is because they are formatted as CMYK images. They should be converted to RGB format to make them appear less dull.
We thank the reviewer for the comment. Improved versions of the figures have now been uploaded.
The terminology used in Figure 5 to describe upregulation should not be "overexpression".
We thank the reviewers for the comment. We changed the terminology to “upregulated”.
Reviewer #1 (Recommendations for the authors):
I think the authors should include a section that explicitly lays out the assumptions and limitations of the study. For example, I believe that determining tau requires assuming that all different types of specialized neoblasts cycle at the same rates. Also there is the assumption that any FSTF-like gene would behave like zfp1 or foxF and hnfA genes. It seems to remain possible that a future study could find that a subset of FSTFs might indeed exert "either/or" decisions in fating, just not the particular genes under investigation here.
We thank the reviewer for the comment. We added a description of the model assumptions in the methods section.
In the abstract, the wording "pre-selected" is somewhat puzzling to me. I would interpret a preselection as a process that defines the next specified state prior to its manifestation. Instead, and as I understand the authors argue this as well, the study provides good evidence that the determination mechanism is random in that subsequent neoblast choices do not likely depend on prior states. So I would suggest changing that wording.
We thank the reviewer for the comment. We replaced “pre-select” with “select”.
Is it possible to determine the uncertainty in measuring tau the cell cycle time and would this have an impact on subsequent modeling?
We thank the reviewers for the comment. The current data that was used to estimate tau (Lei et al., Dev Cell, 2016) does not allow us to directly estimate the uncertainty in measuring τ.
For lines 154-164 I would suggest doing a little more to explicitly write out the logic of determining the growth constants within the main text and not just in methods, for ease of reading.
We thank the reviewer for the comment, and have added explanations for how we determined the growth constant in the text. The text now reads (lines 160-166): “Considering an average cell cycle length of 29.7 hours, we calculated the value of q using the following approach: the probabilities of all cell division outcomes must sum to 1. Our experimental data showed that symmetric renewal (p) and asymmetric division (a) occur at equal rates (i.e., p = a). By fitting these parameters to the experimental data, we determined that the difference between the probabilities of symmetric renewal and symmetric differentiation (i.e., p - q) was 0.345 (Fig 2E, S1D-E). Therefore, with these criteria, we estimated the probabilities of cell division outcomes in the colony as p = 0.45, a = 0.45, and q = 0.1 (Fig 2G; Methods).”
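The arithmetic in the quoted passage can be checked with a minimal sketch. It assumes only the three constraints stated in the text (p + a + q = 1, p = a, and p - q = 0.345); the function name `division_probabilities` is hypothetical and is not taken from the manuscript or its methods.

```python
def division_probabilities(p_minus_q):
    """Solve p + a + q = 1 with p = a and p - q = p_minus_q.

    Hypothetical helper for illustration; not the authors' fitting code.
    """
    # Substituting a = p and q = p - d into p + a + q = 1 gives 3p - d = 1.
    p = (1 + p_minus_q) / 3
    a = p                      # stated constraint: symmetric renewal = asymmetric division
    q = p - p_minus_q          # stated constraint: p - q equals the fitted difference
    return p, a, q


p, a, q = division_probabilities(0.345)
print(f"p = {p:.2f}, a = {a:.2f}, q = {q:.2f}")  # prints: p = 0.45, a = 0.45, q = 0.10
```

The rounded values agree with the estimates reported in the response (p = 0.45, a = 0.45, q = 0.1).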
Line 192 why does post-mitotic progeny number linearly relate to neoblast number? In clones, a change in q has an exponential effect. I feel like I am missing something.
We thank the reviewer for the comment. In colonies, 50% of cell divisions result in the production of post-mitotic progeny (asymmetric division). Therefore, the number of produced progenitors in a given cell cycle is linearly correlated with the number of neoblasts. This statement is in line with previous analysis of planarian colony size (Wagner et al., Cell Stem Cell, 2012).
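The distinction between the two quantities can be illustrated with a small expected-value sketch. It is not the authors' model: it assumes, for illustration only, that every neoblast divides once per cycle, that symmetric renewal yields two neoblasts, asymmetric division yields one neoblast and one progeny cell, and symmetric differentiation yields two progeny cells. The function name `expected_per_cycle` is hypothetical.

```python
def expected_per_cycle(neoblasts, p, a, q):
    """Expected neoblast count after one cycle and progeny made in that cycle.

    Illustrative assumption (ours, not the manuscript's): each neoblast divides
    once per cycle with outcome probabilities p, a, q.
    """
    # Symmetric renewal -> 2 neoblasts; asymmetric division -> 1 neoblast.
    next_neoblasts = neoblasts * (2 * p + a)
    # Asymmetric division -> 1 progeny; symmetric differentiation -> 2 progeny.
    new_progeny = neoblasts * (a + 2 * q)
    return next_neoblasts, new_progeny


P, A, Q = 0.45, 0.45, 0.10  # estimates quoted in the response
n = 1.0
for cycle in range(1, 6):
    n_next, progeny = expected_per_cycle(n, P, A, Q)
    print(f"cycle {cycle}: neoblasts {n:.2f} -> {n_next:.2f}, new progeny {progeny:.2f}")
    n = n_next
```

Within any single cycle the progeny produced is a fixed multiple of the current neoblast number (the linear relation), while the neoblast number itself compounds from cycle to cycle, which is where a change in q has its exponential effect.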
Line 103 it also seems possible, although less likely, that the specified state is not fixed within a given cell cycle and could be that cells that try to switch into zeta-neoblasts mid-cell cycle arrest in proliferation etc just for that time.
We thank the reviewer for the comment and agree that this is a possibility. However, our observations suggest that incorporating this factor into the model is unnecessary for accurately predicting colony size.
In terms of the feedback mechanism proposed to operate in homeostasis, I think in the case of zfp-1 it is quite likely that loss of epidermal differentiation results in wound responses (this phenomenon has been documented in egr-5 RNAi in Tu et al 2015 I believe). This could play out differently in the clone assay because the effects of sublethal irradiation on this process would predominate in both control versus zfp1(RNAi) conditions.
We thank the reviewer for the comment. Our RNA-seq analysis following zfp-1 inhibition did not show overexpression of injury-induced genes at an early time point (6 days; Fig. 5B-C). However, an increase in cycling cells was detected much earlier via EdU labeling (3 days; Fig. 5D). In the case of egr-5 suppression, Tu et al. analyzed injury-induced gene expression at a later stage (21 days of RNAi), where they found significant epidermal defects (see Fig. 5C in Tu et al.). We agree that sublethal irradiation effects likely predominate in colony analysis for both control and zfp-1 (RNAi) animals. In homeostasis, additional factors likely influence cell proliferation and differentiation.
It seems likely that some of the differences noted between homeostasis versus clone growth could ultimately arise from the different growth parameters under each setting. Could the rate parameters be estimated from prior data in homeostasis as well? It seems to me that with the framework the authors use, homeostasis must involve a net zero change to neoblast abundance (also shown by Wagner 2011 by the sigmoidal curve of neoblast abundance at the endpoint of clone expansion). Therefore, in these conditions p=q by definition. Experimental evidence from Lei 2016 (Figure S7M) suggests asymmetric divisions and symmetric renewing divisions are about equally abundant (5/12 41% sym renewing vs 7/12 69% asymmetric renewing). Therefore, under homeostasis, there would be an estimated p=q=0.3 and a=0.4. Compared to clone growth conditions then, in homeostasis, it seems that roughly the rate of symmetric renewal decreases and the rate of symmetric differentiation also increases. I wonder, could this kind of difference potentially account for the differences between homeostasis versus clone expansion settings? It is also worth noting that the clone expansion context has been used as a sensitized genetic background for identifying effects of gene inhibition on neoblast self-renewal, so perhaps the reason this works is that the rates of self-renewal are relatively less in homeostasis so that clone expansion represents a case where there is greater demand for self-renewal.
We thank the reviewer for the comment. We agree that under homeostatic conditions, where the population size remains stable, the average probability of symmetric renewal matches the average probability of symmetric differentiation or elimination. By contrast, during colony expansion, the probability of symmetric renewal exceeds that of symmetric differentiation or elimination. The differences in response to a lineage block between homeostasis and colony expansion can have multiple interpretations. However, data from homeostatic animals does not permit the analysis of individual neoblasts or their specific responses to a lineage block. Consequently, we cannot determine whether the proliferative response following the lineage block during homeostasis is a direct response to the lineage block or an indirect effect resulting from changes in other neoblasts. We discuss these possibilities further in lines 472 - 484.
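The relationship between the division probabilities and population size can be written out as a short worked equation. This is a sketch under the assumption that each division has exactly three outcomes (symmetric renewal with probability $p$, asymmetric division with probability $a$, symmetric differentiation or elimination with probability $q$), using the reviewer's notation; $N_t$ for the neoblast number at cycle $t$ is a symbol introduced here for illustration.

```latex
\begin{align}
  p + a + q &= 1, \\
  \mathbb{E}\left[N_{t+1} \mid N_t\right] &= (2p + a)\,N_t = (1 + p - q)\,N_t .
\end{align}
```

Under this assumption the expected population grows when $p > q$ (colony expansion) and is stable when $p = q$ (homeostasis). The reviewer's estimate of $p = q = 0.3$ and $a = 0.4$ gives a factor of $2(0.3) + 0.4 = 1$, that is, no net change.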
In terms of the memory effect, I recall some arguments presented in the Raz 2021 study that were consistent with a slight memory for neoblast specification being retained. I believe this was a minor point from detecting a slightly higher likelihood of identifying 2-cell clones that both took on prog1+ identity compared to the population average. If this is the case, it may be worth the authors commenting on reconciling those observations with their model.
We thank the reviewer for their comment. Raz et al. (Cell Stem Cell, 2021) reported that in the asymmetric division of a zeta-neoblast, which generates a prog-2+ cell and a neoblast, there was a slightly higher observed frequency of zfp-1 expression in the neoblast compared to the expected rate (Expected: 32%, Observed: 44%). This small increase may reflect a mild memory effect, experimental variability, or both. However, statistical analysis using Fisher's exact test yielded a non-significant p-value (p = 0.1), suggesting that this difference could be attributed to experimental variability. Other data from Raz et al., such as lineage representation in early colonies, also did not show significant memory effects, indicating that any such effects, if present, are minimal and difficult to detect. Therefore, while we do not, and cannot, rule out the presence of minor memory effects, we expect that effects of this magnitude will have minimal impact on our model.
Reviewer #2 (Recommendations for the authors):
Figure 2C and 2D:
Please provide the specific time points for the data presented.
We thank the reviewer for the comment. The information was added to the figure legend.
Colony growth and homeostasis:
It would be beneficial to estimate a time point at which colony growth transitions to a model with a cell-cell feedback mechanism, similar to that observed in homeostasis. This would help in understanding the dynamics and timing of these processes.
We thank the reviewer for the comment. Our colony assays were constrained by the animals' survival following sub-total irradiation (16 to 20 days). Neoblast numbers are substantially reduced compared to unirradiated animals, preventing us from determining the time point at which homeostasis is achieved.
Methods:
μl should be μL
The text was changed accordingly.
Line 526: H2O should be H₂O
The text was changed accordingly.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This well-written report uses functional neuroimaging in human observers to provide convincing evidence that activity in the early visual cortex is suppressed at locations that are frequently occupied by a task-irrelevant but salient item. This suppression appears to be general to any kind of stimulus, and also occurs in advance of any item actually appearing. The work in its present form will be valuable to those examining attention, perception, learning and prediction, but with a few additional analyses could more informatively rule out potential alternative hypotheses. Further discussion of the mechanistic implications could clarify further the broad extent of its significance.
We thank the editor and the reviewers for the positive evaluation of our manuscript and the thoughtful comments. Below we provide a detailed point-by-point reply to the reviewers’ comments.
In addition to addressing the reviewers' comments, we have improved the figure legends by explicitly describing the type of error bars depicted in the figures, information which was previously only listed in the Materials and Methods section. Specifically, the statement: “Error bars denote within-subject SEM” was added to several figures, as applicable. We believe that briefly reiterating this information in the figure legends enhances clarity and enables readers to interpret the results more accurately and efficiently. We also updated our code and data sharing statement, as well as opened the repository for the public: “Analysis and experiment code, as well as data required to replicate the results reported in this manuscript are available here: https://doi.org/10.17605/OSF.IO/G4RXV. Raw MRI data is available upon request.”
Public Reviews
Reviewer #1 (Public review):
Summary:
The authors investigated if/how distractor suppression derived from statistical learning may be implemented in early visual cortex. While in a scanner, participants conducted a standard additional singleton task in which one location more frequently contained a salient distractor. The results showed that activity in EVC was suppressed for the location of the salient distractor as well as for neighbouring neutral locations. This suppression was not stimulus specific - meaning it occurred equally for distractors, targets and neutral items - and it was even present in trials in which the search display was omitted. Generally, the paper was clear, the experiment was well-designed, and the data are interesting. Nevertheless, I do have several concerns mostly regarding the interpretation of the results.
(1) My biggest concern with the study is regarding the interpretation of some of the results. Specifically, regarding the dynamics of the suppression. I appreciate that there are some limitations with what you might be able to say here given the method but I do feel as if you have committed to a single interpretation where others might still be at play. Below I've listed a few alternatives to consider.
We agree with the reviewer that there are important alternatives to consider. Adequately addressing these alternatives will substantially increase the inferences we can draw from our data. Therefore, we address each alternative interpretation in detail below.
(a) Sustained Suppression. I was wondering if there is anything in your results that would speak for or against the suppression being task specific. That is, is it possible that people are just suppressing the HPDL throughout the entire experiment (i.e., also through ITI, breaks, etc., rather than just before and during the search). Since the suppression does not seem volitional, I wonder if participants might apply a blanket suppression to HPDL until they learn otherwise. Since your localiser comes after the task you might be able to see hints of sustained suppression in the HPDL during these trials.
It is indeed possible that participants suppressed the HPDL throughout the entire experiment, instead of proactively instantiating suppression on each trial. While possible, we believe that this account is less likely to explain the present results, given the utilized analysis approach, a voxel-wise GLM fit to the BOLD data per run (see Materials and Methods for details). Specifically, we derived parameter estimates from this GLM per location to estimate the relative suppression. Sustained suppression would modulate BOLD responses throughout the run, i.e. presumably also during the implicit baseline period used to estimate the contrast parameter estimates per location. Hence, sustained suppression should not result in a differential modulation between locations, as the BOLD response at the HPDL during the baseline period would be equally suppressed as during the trial. Inspired by the reviewer’s comment, we now clarify this critical point in the manuscript’s Discussion section:
“Third, participants might have suppressed the HPDL consistently throughout the experiment. This sustained suppression account differs from the proactive suppression proposed here. While this alternative is plausible, we believe that it is less likely to account for the present results, given the analysis conducted. Specifically, we computed voxel-wise parameter estimates and contrasted the obtained betas between locations. Under a sustained suppression account, the HPDL would show suppression even during the implicit baseline period, which would obscure the observed BOLD suppression at and near the HPDL.”
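The baseline argument can be illustrated with a toy GLM. This is a minimal sketch, not the analysis pipeline used in the study: it assumes a noise-free signal, a boxcar trial regressor without HRF convolution, and an additive suppression term. All names and values are hypothetical.

```python
import numpy as np


def trial_beta(signal, trial_regressor):
    """Least-squares beta for the trial regressor; the intercept plays the
    role of the implicit baseline."""
    design = np.column_stack([trial_regressor, np.ones_like(trial_regressor)])
    betas, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return float(betas[0])


# Ten alternating rest/trial blocks (no HRF convolution, no noise).
trial = np.tile(np.r_[np.zeros(10), np.ones(10)], 10)

evoked = 1.0        # hypothetical evoked response, arbitrary units
suppression = 0.3   # hypothetical suppression magnitude

nl_far = evoked * trial
hpdl_sustained = evoked * trial - suppression        # offset present all run
hpdl_trial_locked = (evoked - suppression) * trial   # reduction only in trials

beta_far = trial_beta(nl_far, trial)
beta_sustained = trial_beta(hpdl_sustained, trial)
beta_trial_locked = trial_beta(hpdl_trial_locked, trial)
```

In this toy case the run-long offset is absorbed by the intercept, so `beta_sustained` matches `beta_far`. Only the trial-locked reduction lowers the trial beta relative to the far location.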
(b) Enhancement followed by suppression. Another alternative that wasn't discussed would be an initial transient enhancement of the HPDL which might be brought on by the placeholders followed by more sustained suppression through the search task. Of course, on the whole this would look like suppression, but this still seems like it would hold different implications compared to simply "proactive suppression". This would be something like search and destroy however could be on the location level before the actual onset of the search display.
R1 correctly points out that BOLD data, given the poor temporal resolution, do not allow for the detection of potential transient enhancements at the HPDL followed by a later and more pronounced suppression (akin to “search and destroy”). We fully agree with this assessment. However, we also argue that a transient enhancement followed by sustained suppression before search display onset constitutes proactive suppression in line with our interpretation, because suppression would still arise proactively (i.e., before search, and hence distractor, onset). Whether transient enhancement precedes suppression cannot be elucidated by our data, but we believe that it constitutes an interesting avenue for future studies using time-resolved and spatially specific recording methods. We now clarify this important implementational variation in the updated manuscript.
“Finally, due to the limited temporal resolution of BOLD data, the present data do not elucidate whether the present suppression is preceded by a brief attentional enhancement of the HPDL, as implied by some prior work (Huang et al., 2024). On this account the HPDL would see transient enhancement, followed by sustained suppression, akin to a ‘search and destroy’ mechanism. Critically, we believe that this variation would nonetheless constitute proactive distractor suppression as the suppression would still arise before search onset. Using temporally and spatially resolved methods to explore potential transient enhancements preceding suppression is a promising avenue for future research charting the neural mechanisms underlying distractor suppression.”
(2) I was also considering whether your effects might be at least partially attributable to priming type effects. This would be on the spatial (not feature) level as it is clear that the distractors are switching colours. Basically, is it possible that on trial n participants see the HPDL with the distractor in it and then on trial n+1 they suppress that location. This would be something distinct from the statistical learning framework and from the repetition suppression discussion you have already included. To test for this, you could look at the trials that follow omission or trials. If there is no suppression or less suppression on these trials it would seem fair to conclude that the suppression is at least in part due to the previous trial.
We agree with the reviewer that it is plausible that participants particularly suppress locations which on previous trials contained a distractor. To address this possibility, we conducted a new analysis and adjusted the manuscript accordingly:
“Second, participants may have suppressed locations that contained the distractor on the previous trial, reflecting a spatial priming effect. This account constitutes a complementary but different perspective than statistical learning, which integrates implicit prior knowledge across many trials. We ruled out that spatial priming explains the present results by contrasting BOLD suppression magnitudes between trials on which the distractor was at the HPDL on the previous trial and trials on which it was not. Results, depicted in Supplementary Figure 4, showed that distractor suppression was statistically significant across both trial types, including trials without a distractor at the HPDL on the preceding trial. This indicates that the observed BOLD suppression is unlikely to be driven by priming and is instead more consistent with statistical learning. Moreover, results did not yield a statistically significant difference between trial types based on the distractor location in the preceding trial. However, these results should not be taken to suggest that spatial priming cannot contribute to distractor suppression; for details see: Supplementary Figure 4.” (p. 13).
We note that this analysis approach slightly differs from the reviewer’s suggestion, which considered omission trials. However, we decided to exclude trials immediately following an omission to ensure that both conditions were matched as closely as possible. In particular, omission trials represent extended rest periods, which could alter participants’ state and especially modulate the visually evoked BOLD responses (e.g., potentially increasing the dynamic range) compared to trials that did not follow omissions. Our analysis approach avoids this difference while still addressing the hypothesis put forward by the reviewer. We now provide the full explanation and results figure of this priming analysis in the figure text of Supplementary Figure 4.
Reviewer #2 (Public review):
The authors of this work set out to test ideas about how observers learn to ignore irrelevant visual information. Specifically, they used fMRI to scan participants who performed a visual search task. The task was designed in such a way that highly salient but irrelevant search items were more likely to appear at a given spatial location. With a region-of-interest approach, the authors found that activity in visual cortex that selectively responds to that location was generally suppressed, in response to all stimuli (search targets, salient distractors, or neutral items), as well as in the absence of an anticipated stimulus.
Strengths of the study include: A well-written and well-argued manuscript; clever application of a region of interest approach to fMRI design, which allows articulating clear tests of different hypotheses; careful application of follow-up analyses to rule out alternative, strategy-based accounts of the findings; tests of the robustness of the findings to detailed analysis parameters such as ROI size; and exclusion of the role of regional baseline differences in BOLD responses.
We thank the reviewer for the positive evaluation of our manuscript.
The report might be enhanced by analyses (perhaps in a surface space) that distinguish amongst the multiple "early" retinotopic visual areas that are analysed in the aggregate here.
We agree with the reviewer that an exploratory analysis separating early visual cortex (EVC) into its retinotopic areas could be an interesting addition. Our reasoning for combining early visual areas into one mask in the original analyses was two-fold. First, we did not have an a priori reason to expect distinct neural suppression between these early ROIs. Second, we therefore did not acquire retinotopy data to reliably separate early visual areas (e.g. V1, V2 and V3), instead opting to increase the number of search task trials. The lack of retinotopy data inherently limits the reliability of the resulting cortical segmentation. However, we have now performed an analysis separating early visual cortex into V1 and V2 and report the details as Supplementary Text 1:
“In an exploratory analysis we investigated whether subdivisions of EVC exhibit different representations of priority signals. In brief, we used FreeSurfer to reconstruct brain surfaces (recon-all) from each subject’s anatomical scan. From these reconstructions we derived V1_exvivo and V2_exvivo labels, which were transformed into volume space using ‘mri_label2vol’ and merged into a bilateral mask for each ROI. We then selected the voxels within each ROI that were most responsive to the four stimulus locations, based on independent localizer data. This voxel selection followed the procedure outlined in the Materials and Methods: Region of Interest (ROI) Definition. To accommodate the subdivision into two ROIs (V1 and V2) compared to the single EVC ROI in the main analysis, we halved the number of voxels selected per location. Finally, we applied the same ROI analysis to investigate distractor suppression during search and omission trials, following the procedure described in Materials and Methods: Statistical Analysis.
Results of this more fine-grained ROI analysis are depicted in Supplementary Figure 1. First, the results from V2 qualitatively mirrored our primary ROI analysis. BOLD responses in V2 differed significantly between stimulus types (main effect of stimulus type: F<sub>(2,54)</sub> = 31.11, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.54). Targets elicited larger BOLD responses compared to distractors (t<sub>(27)</sub> = 3.05, p<sub>holm</sub> = 0.004, d = 0.06) and neutral stimuli (t<sub>(27)</sub> = 7.82, p<sub>holm</sub> < 0.001, d = 0.14). Distractors also evoked larger responses than neutral stimuli (t<sub>(27)</sub> = 4.78, p<sub>holm</sub> < 0.001, d = 0.09). These results likely reflect top-down modulation due to target relevance and bottom-up effects of distractor salience. Consistent with the primary ROI analysis, the manipulation of distractor predictability showed a distinct pattern of location-specific BOLD suppression in V2 (main effect of location: F<sub>(1.1,52.8)</sub> = 5.01, p = 0.030, η<sub>p</sub><sup>2</sup> = 0.16). Neural populations with receptive fields at the HPDL showed significantly reduced BOLD responses compared to the diagonally opposite neutral location (NL-far; post hoc test HPDL vs NL-far: t<sub>(27)</sub> = 2.69, p<sub>holm</sub> = 0.022, d = 0.62). Again, this suppression was not confined to the HPDL but also extended to close-by neutral locations (NL-near vs NL-far: t<sub>(27)</sub> = 2.79, p<sub>holm</sub> = 0.022, d = 0.65). BOLD responses did not differ between HPDL and NL-near locations (HPDL vs NL-near: t<sub>(27)</sub> = 0.11, p<sub>holm</sub> = 0.915, d = 0.03; BF<sub>10</sub> = 0.13). As in the EVC ROI analysis, this suppression pattern was consistent across distractor, target, and neutral stimuli presented at the HPDL and NL-near locations compared to NL-far.
In sum, neural responses in V2 were significantly modulated by the distractor contingencies, evident as reduced BOLD responses in neural populations with receptive fields at the HPDL and neutral locations near the location of the frequent distractor (NL-near), relative to the neutral location diagonally across the HPDL (NL-far).
In V1, BOLD responses also differed significantly between stimulus types (main effect of stimulus type: F<sub>(1.3,35.6)</sub> = 6.69, p = 0.009, η<sub>p</sub><sup>2</sup> = 0.20). Targets elicited larger BOLD responses compared to neutral stimuli (t<sub>(27)</sub> = 3.52, p<sub>holm</sub> = 0.003, d = 0.12) and distractors evoked larger responses than neutral stimuli (t<sub>(27)</sub> = 2.62, p<sub>holm</sub> = 0.023, d = 0.09). However, no difference between targets and distractors was observed (t<sub>(27)</sub> = 0.90, p<sub>holm</sub> = 0.375, d = 0.03; BF<sub>10</sub> = 0.17), suggesting reduced sensitivity to task-related effects in V1. Indeed, analyzing the effect of distractor predictability for BOLD responses in V1 showed a different result than in V2 and the combined EVC ROI. There was no significant main effect of location (F<sub>(2,54)</sub> = 2.20, p = 0.120, η<sub>p</sub><sup>2</sup> = 0.08; BF<sub>10</sub> = 0.77). BOLD responses at NL-near and NL-far were similar (BF<sub>10</sub> = 0.171), with the only reliable difference found between target stimuli at the HPDL and NL-far locations (W = 94, p<sub>holm</sub> = 0.012, r = 0.54).”
We include the new result figure as Supplementary Figure 5.
We now include reference to these results in the manuscript’s Discussion section:
“Are representations of priority signals uniform across EVC? A priori we did not have any hypotheses regarding distinct neural suppression profiles across different early visual areas, hence our primary analyses focused on stimulus responses of neural populations in EVC, irrespective of subdivision. However, an exploratory analysis suggests that distractor suppression may show different patterns in V1 compared to V2 (Supplementary Figure 5 and Supplementary Text 1). In brief, results in V2 mirrored those reported for the combined EVC ROI (Figure 4). In contrast, results in V1 appeared to be only partially modulated by distractor contingencies, and if so, the modulation was less robust and not as spatially broad as in V2. This suggests the possibility of different effects of distractor predictability across subdivisions of early visual areas. However, these results should be interpreted with caution. First, our design did not optimize the delineation of early visual areas (e.g., no functional retinotopy), limiting the accuracy of V1 and V2 segmentation. Additionally, analyses were conducted in volumetric space, which further reduces spatial precision. Future studies could improve this by including retinotopy runs to accurately delineate V1, V2, and V3, and by performing analyses in surface space. Higher-resolution functional and anatomical MRI sequences would also help elucidate how distractor suppression is implemented across EVC with greater precision.”
Furthermore, the study could benefit from an analysis that tests the correlation over observers between the magnitude of their behavioural effects and their neural responses.
R2 highlights that behavioral facilitation and neural suppression could be correlated across participants. The rationale is that if neural suppression in EVC is related to the facilitation of behavioral responses, we should expect a positive relationship between neural suppression at the HPDL and RT facilitation across participants. In this analysis we focused on the contrast between HPDL and NL-far, as this contrast was statistically significant in both the RT (Figure 2) and the neural suppression analysis (Figure 4). First, we computed for each participant the behavioural benefit of distractor suppression as: RT<sub>facilitation</sub> = RT<sub>NL-far</sub> – RT<sub>HPDL</sub>. Thereby RT facilitation reflects the response speeding due to a distractor appearing at the high probability distractor location compared to the far neutral location. Next, we computed neural suppression as: BOLD<sub>suppression</sub> = BOLD<sub>NL-far</sub> – BOLD<sub>HPDL</sub>. Thus, positive values reflect the suppression of BOLD responses at the HPDL compared to the NL-far location. The BOLD suppression index was computed for each stimulus type separately, as in the main ROI analysis (i.e. for Targets, Neutrals and Distractors). Finally, we correlated RT<sub>facilitation</sub> with BOLD<sub>suppression</sub> across participants using Pearson correlation. Results showed a small, but not statistically significant correlation between RT facilitation and BOLD suppression for distractor (r<sub>(26)</sub> = 0.22, p = 0.257), target (r<sub>(26)</sub> = 0.10, p = 0.598) and neutral (r<sub>(26)</sub> = 0.13, p = 0.519) stimuli. Thus, while the direction of the correlation was in line with the speculation by the reviewer in the “Recommendations for the authors”, results were not statistically reliable and therefore inconclusive. As also noted in our preliminary reply to the reviewer comments, it was a priori unlikely that this analysis would yield a statistically significant correlation.
An a priori power analysis suggested that, to reach a power of 0.8 at a standard alpha of 0.05, given the present sample size of n=28, the effect size would need to exceed r = 0.75, which seemed unlikely for the correlation of behavioural and neural difference scores. Given the inconclusive nature of the results, we prefer to not include this additional analysis in the manuscript, as we believe that it does not add to the main message of the paper, but we keep it accessible to the interested reader in the public “peer review process”.
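The difference-score correlation described in this reply can be sketched as follows. This is a minimal illustration, assuming one value per participant and condition. The helper names are hypothetical and the numbers are synthetic placeholders, not the study's data, so the resulting coefficient carries no empirical meaning.

```python
import numpy as np


def difference_score(nl_far, hpdl):
    """Per-participant NL-far minus HPDL difference."""
    return np.asarray(nl_far, dtype=float) - np.asarray(hpdl, dtype=float)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic placeholder data for 28 participants (NOT the study's data).
rng = np.random.default_rng(0)
n = 28
rt_far = rng.normal(700, 50, n)      # hypothetical RTs in ms
rt_hpdl = rng.normal(680, 50, n)
bold_far = rng.normal(1.0, 0.2, n)   # hypothetical parameter estimates
bold_hpdl = rng.normal(0.9, 0.2, n)

rt_facilitation = difference_score(rt_far, rt_hpdl)
bold_suppression = difference_score(bold_far, bold_hpdl)
r = pearson_r(rt_facilitation, bold_suppression)
```

With real data, the same two difference scores would be computed separately for each stimulus type before correlating them.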
The study provides an advance over previous studies, which identified enhancement or suppression in visual cortex as a function of search target/distractor predictability, but in a less spatially specific way. It also speaks to open questions about whether such suppression/enhancement is observed only in response to the arrival of visual information, or instead is preparatory, favouring the latter view. The theoretical advance is moderate, in that it is largely congruent with previous frameworks, rather than strongly excluding an opposing view or providing a major step change in our understanding of how distractor suppression unfolds.
We agree with the reviewer that our results are an advancement of prior work, particularly with respect to narrowing down the role of sensory areas and the proactive nature of distractor suppression. However, we argue that this represents a significant step forward for several reasons. First, to our knowledge, the literature on distractor suppression, and visual search in general, is by no means unanimous with respect to the conclusion that distractor suppression is instantiated proactively (Huang et al., 2021, 2022). Indeed, there are several studies suggesting the opposite account, reactive suppression (Chang et al., 2023), or contributions by both proactive and reactive mechanisms (Sauter et al., 2021; Wang et al., 2019). Moreover, studies in support of proactive distractor suppression did not investigate the involvement of (early) sensory areas during suppression. Conversely, to our knowledge most studies investigating the involvement of sensory cortex during distractor suppression did not address the question of whether suppression arises proactively or reactively.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Minor Points:
(1) There are several disconnects between the behaviour and the MR results - i.e. not stimulus specific yet there are no deficits for targets appearing at the HPDL, also no behavioural suppression for the NL-near but neural suppression found. Nevertheless, the behaviour is used as a way to rule out potential attentional strategies when considering whether there is enhancement in the NL-far condition. I realise you have a few other points here, but I think it's worth addressing what could be seen as a double standard.
The reviewer points out an important concern, which we feel could have been better addressed in the manuscript. From our point of view a partial dissociation between neural modulations in EVC and eventual behavioural facilitation is not surprising, given the extensive neural processing beyond EVC required for behaviour. However, this assessment may differ if one stresses an explicit volitional attentional strategy over an implicit statistical learning account. That said, we clearly do not want to create the impression of using a double standard. The lack of behavioural facilitation for targets at NL-far is not a critical part of our argument against explicit attentional strategies. Therefore, we rephrased the relevant paragraph in the Discussion section to now emphasize the importance of the control analysis, which excluded participants who reported the correct HPDL in the questionnaire (Figure 5) but nonetheless yielded qualitatively identical results to the main ROI analysis (Figure 4). In our opinion, this control analysis provides more compelling evidence against a volitional attentional strategy account without the risk of creating the impression of applying a double standard in the interpretation of behavioural data. Additionally, we now acknowledge the limitation of relying on behavioral data in ruling out volitional attentional strategies in the updated manuscript:
“It is well established that attention enhances BOLD responses in visual cortex (Maunsell, 2015; Reynolds & Chelazzi, 2004; Williford & Maunsell, 2006). If participants learned the underlying distractor contingencies, they could deploy an explicit strategy by directing their attention away from the HPDL, for example by focusing attention on the diagonally opposite neutral location. This account provides an alternative explanation for the observed EVC modulations. However, while credible, the current findings are not consistent with such an interpretation. First, there was no behavioral facilitation for target stimuli presented at the far neutral location, contrary to what one might expect if participants employed an explicit strategy. However, given the partial dissociation between neural suppression in EVC and behavioral facilitation, additional neural data analyses are required to rule out volitional attention strategies. Thus, we performed a control analysis that excluded all participants that indicated the correct HPDL location in the questionnaire, thereby possibly expressing explicit awareness of the contingencies. This control analysis yielded qualitatively identical results to the full sample, showing significant distractor suppression in EVC. Therefore, it is unlikely that explicit attentional strategies, and the enhancement of locations far from the HPDL, drive the results observed here. Instead, the current findings are consistent with an account emphasizing the automatic deployment of spatial priors (He et al., 2022) based on implicitly learned statistical regularities.”
(2) Does the level of suppression change in any way through the experiment? I.e., does it get stronger in the second vs. first half of the experiment?
The reviewer askes an interesting question, whether BOLD suppression may change across the experiment. To address this question, we performed an additional analysis testing BOLD suppression in EVC during the first compared to second half of the MRI experiment. Here we defined BOLD suppression as: BOLD<sub>suppression</sub> = ((BOLD<sub>NL-far</sub> – BOLD<sub>HPDL</sub>) + (BOLD<sub>NL-far</sub> – BOLD<sub>NL-near</sub>)) / 2. Thus, in this formula on of BOLD suppression we summarize the two primary BOLD suppression effects observed in our main results (Figure 4). Additionally, as we previously did not observe any significant differences in BOLD suppression magnitudes between different stimulus types (i.e. suppression was similar for target, distractor and neutral stimuli), we collapsed across stimulus types in this analysis.
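The combined suppression index defined above can be written as a short function. This is a minimal sketch: the function name, argument names, and input values are illustrative assumptions and not taken from the study's code or data.

```python
import numpy as np


def bold_suppression_index(nl_far, hpdl, nl_near):
    """Mean of the two suppression effects: (NL-far - HPDL) and
    (NL-far - NL-near). Accepts scalars or per-participant arrays."""
    nl_far = np.asarray(nl_far, dtype=float)
    hpdl = np.asarray(hpdl, dtype=float)
    nl_near = np.asarray(nl_near, dtype=float)
    return ((nl_far - hpdl) + (nl_far - nl_near)) / 2


# Hypothetical parameter estimates for three participants.
index = bold_suppression_index(
    nl_far=[1.0, 1.2, 0.9],
    hpdl=[0.6, 1.0, 0.9],
    nl_near=[0.8, 1.0, 0.7],
)
```

Algebraically the index equals the NL-far response minus the mean of the HPDL and NL-near responses, so positive values indicate lower responses at and near the HPDL.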
Results, depicted below, showed that BOLD suppression was statistically significant during both the initial (Run 1+2) and later part (Run 4+5) of the MRI experiment (BOLD suppression Run 1+2: W = 331, p = 0.003, r = 0.63; BOLD suppression Run 4+5: W = 320, p = 0.007, r = 0.58), confirming our main results of reliable distractor suppression even in this subset of trials. However, we did not observe any statistically significant differences between early and late runs of the experiment (t<sub>(27)</sub> = -0.21, p = 0.835, d = -0.04). In fact, a Bayesian paired t-test provided evidence for the absence of a difference in BOLD suppression between early compared to later runs (BF<sub>10</sub> = 0.205), suggesting that distractor suppression in EVC was stable throughout the experiment. A qualitatively similar pattern was evident during omission trials, with significant distractor suppression during early runs (t<sub>(27)</sub> = 2.70, p = 0.012, d = 0.51), but a modulation that did not reach statistical significance for later runs (t<sub>(27)</sub> = 1.97, p = 0.059, d = 0.37). Again, there was no evidence for a difference in suppression magnitudes across the experiment (W = 198, p = 0.920, d = -0.025) and support for the absence of a difference in BOLD suppression between early and late runs (BF<sub>10</sub> = 0.278).
Author response image 1.
Analysis of BOLD suppression magnitudes in EVC across the MRI experiment phases. BOLD suppression was comparable between early (Run 1+2) and late (Run 4+5) phases of the MRI experiment, suggesting consistent suppression in EVC following statistical learning. Error-bars denote within-subject SEM. * p < 0.05, ** p < 0.01, = BF<sub>10</sub> < 1/3.
In sum, results suggest that distractor suppression in EVC was stable across runs and did not change significantly throughout the experiment. This result was a priori likely, given that participants already underwent behavioral training before entering the MRI. This enabled them to establish modified spatial priority maps, containing the high probability distractor location contingencies, already before the first MRI run. While speculative, it is possible that participants may still have consolidated the spatial priority maps during the initial runs, but that this additional consolidation is not evident in the data, as later runs may see less engagement by participants due to increasing fatigue towards the end of the MRI experiment. Indeed, rapid learning and stable suppression throughout the remainder of the experiment is also reported by prior work (Lin et al., 2021). We believe that it is highly interesting for future studies to investigate the development of distractor suppression across learning, with initial exposure to the contingencies inside the MRI. However, as the present results are inconclusive, we prefer to not include this analysis in the main manuscript, as it may not provide significant additional insight into the neural mechanisms underlying distractor suppression.
(3) In the methods vs. results you have reported the probabilities slightly differently. In the methods you say the HPDL was 6x more likely to contain a distractor whereas in the results you say 4x. Based on the reported trial numbers I think it should be 4, but probably you want to double check that this is consistent and correct throughout.
We thank the reviewer for bringing this inconsistency to our attention. We have corrected this oversight in the adjusted manuscript:
“One of the four locations of interest was designated the high probability distractor location (HPDL), which contained distractor stimuli (unique color) four times more often than any of the remaining three locations of interest. In other words, if a distractor was present on a given trial (42 trials per run), the distractor appeared 57% (24 trials per run) at the HPDL and at one of the other three locations with equal probability (i.e., 14% or 6 trials per run per location).”
Reviewer #2 (Recommendations for the authors):
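The trial counts quoted above can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the reported numbers (42 distractor-present trials per run, 24 at the HPDL, 6 at each of the three other locations); the variable names are illustrative and not taken from the manuscript.

```python
# Trial counts per run, as reported in the corrected Methods text.
DISTRACTOR_TRIALS_PER_RUN = 42
HPDL_TRIALS = 24
OTHER_TRIALS_PER_LOCATION = 6
N_OTHER_LOCATIONS = 3

# The counts should add up to the total number of distractor-present trials.
total = HPDL_TRIALS + N_OTHER_LOCATIONS * OTHER_TRIALS_PER_LOCATION

# How much more often the HPDL holds a distractor than any single other location.
ratio = HPDL_TRIALS / OTHER_TRIALS_PER_LOCATION

# Conditional probabilities given that a distractor is present.
p_hpdl = HPDL_TRIALS / DISTRACTOR_TRIALS_PER_RUN
p_other = OTHER_TRIALS_PER_LOCATION / DISTRACTOR_TRIALS_PER_RUN

print(total, ratio, round(p_hpdl * 100), round(p_other * 100))
```

With these counts the ratio is 4 rather than 6, and the percentages round to 57% and 14%, in line with the corrected text.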
The authors have performed their analyses in the volume rather than the surface, and have grouped together V1, V2, and V3 as "early visual cortex". As the authors' claims lean heavily on the idea that they are measuring "early" visual responses, the study would be improved by delineating the ROIs within these different retinotopic regions. Such an approach might be facilitated by analysing data on the reconstructed surface.
Please refer to our reply to this analysis suggested in the Public review.
The authors rightly tread carefully on the causal link between their neural findings and the behavioural outcomes. The picture might be clarified somewhat further by testing for a positive relationship between behavioural effect sizes and neural effect sizes across participants, e.g. to what extent is the search advantage when distractors are presented at the "HPDL" linked to greater suppression of BOLD at the HPDL region of early visual cortex?
Please refer to our reply to this analysis suggested in the Public review.
Some of the claims based on null hypotheses would be better supported by Bayesian tests e.g. page 6 "This pattern of results was the same regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far ..." and "BOLD responses between HPDL and NL-near locations did not reliably differ ..." This is similar to the approach that the authors adopted later in the section "Ruling out attentional modulation".
We agree with the reviewer that our ROI analyses would benefit from providing evidence for the absence of a modulation. Accordingly, we updated our results by adding equivalent Bayesian tests. Bayes Factors were computed using JASP 0.18.2 (JASP Team, 2024; RRID:SCR_015823) with default settings; i.e. for Bayesian paired t-tests with a Cauchy prior width of 0.707. Qualitative interpretations of BFs were based on Lee and Wagenmakers (2014). We now report the obtained BF in the Results section.
“BOLD responses between HPDL and NL-near locations did not reliably differ (HPDL vs NL-near: t<sub>(27)</sub> = 0.47, p<sub>holm</sub> = 0.643, d = 0.08; BF<sub>10</sub> = 0.19).”
And:
“Neural responses at HPDL and NL-near did not reliably differ (t<sub>(27)</sub> = 0.21, p<sub>holm</sub> = 0.835, d = 0.04; BF<sub>10</sub> = 0.21).”
Moreover, we now denote any equivalent results (defined as BF<sub>10</sub> < 1/3) in Fig. 4 and Fig. 5, and included the description of the associated symbol in the figure text (“ = BF<sub>10</sub> < 1/3”).
Additionally, we now also report the BF for all paired t-tests reported in Supplementary Table 1.
Finally, we addressed the statement: “This pattern of results was the same regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far”. Our intention was to emphasize that the pattern of results reported in the sentence preceding it was evident for distractor, target, or neutral stimulus, and not to suggest that the magnitude of the effect is the same. Hence, to more accurately reflect the results, we changed this sentence to: “This pattern of results was present regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
This manuscript describes the role of PRDM16 in modulating BMP response during choroid plexus (ChP) development. The authors combine PRDM16 knockout mice and cultured PRDM16 KO primary neural stem cells (NSCs) to determine the interactions between BMP signaling and PRDM16 in ChP differentiation.
They show PRDM16 KO affects ChP development in vivo and BMP4 response in vitro. They determine genes regulated by BMP and PRDM16 by ChIP-seq or CUT&TAG for PRDM16, pSMAD1/5/8, and SMAD4. They then measure gene activity in primary NSCs through H3K4me3 and find more genes are co-repressed than co-activated by BMP signaling and PRDM16. They focus on the 31 genes found to be co-repressed by BMP and PRDM16. Wnt7b is in this set and the authors then provide evidence that PRDM16 and BMP signaling together repress Wnt activity in the developing choroid plexus.
Strengths:
Understanding context-dependent responses to cell signals during development is an important problem. The authors use a powerful combination of in vivo and in vitro systems to dissect how PRDM16 may modulate BMP response in early brain development.
Main weaknesses of the experimental setup:
(1) Because the authors state that primary NSCs cultured in vitro lose endogenous Prdm16 expression, they drive expression by a constitutive promoter. However, this means the expression levels are very different from endogenous levels (as explicitly shown in Supplementary Figure 2B) and the effect of many transcription factors is strongly dose-dependent, likely creating differences between the PRDM16-dependent transcriptional response in the in vitro system and in vivo.
We acknowledge that our in vitro experiments may not ideally replicate the in vivo situation, a common limitation of such experiments. Our primary aim was to explore the molecular relationship between PRDM16 and BMP signaling in gene regulation. Such molecular investigations are challenging to conduct using in vivo tissues. In vitro NSCs treated with BMP4 have been used as a model to investigate NSC proliferation and quiescence, drawing on previous studies (e.g., Helena Mira, 2010; Marlen Knobloch, 2017). Crucially, to ensure the relevance of our in vitro findings to the in vivo context, we confirmed that cultured cells could indeed be induced into quiescence by BMP4, and this induction necessitated the presence of PRDM16. Furthermore, upon identifying target genes co-regulated by PRDM16 and SMADs, we validated PRDM16's regulatory role on a subset of these genes in the developing Choroid Plexus (ChP) (Fig. 7 and Suppl. Fig. 7-8). Only by combining evidence from both in vitro and in vivo experiments could we confidently conclude that PRDM16 serves as an essential co-factor for BMP signaling in restricting NSC proliferation.
(2) It seems that the authors compare Prdm16_KO cells to Prdm16 WT cells overexpressing flag_Prdm16. Aside from the possible expression of endogenous Prdm16, other cell differences may have arisen between these cell lines. A properly controlled experiment would compare Prdm16_KO ctrl (possibly infected with a control vector without Prdm16) to Prdm16_KO_E (i.e. the Prdm16_KO cells with and without Prdm16 overexpression.)
We agree that Prdm16 KO cells carrying the Prdm16-expressing vector would be a good comparison with those carrying the KO_vector. However, despite more than 10 attempts with various optimization conditions, we were unable to establish a viable cell line after infecting Prdm16 KO cells with the Prdm16-expressing vector. The overall survival rate for primary NSCs after viral infection is low, and we observed that KO cells were particularly sensitive to infection treatment when the viral vector was large (the Prdm16 ORF is more than 3 kb).
As an alternative way to assess vector effects, we instead included two other control cell lines, wt and KO cells infected with the 3xNLS_Flag-tag viral vector, and presented the results in Supplementary Fig. 2. When we compared the responses of the four lines (wt, KO, wt infected with the Flag vector, KO infected with the Flag vector) to the addition and removal of BMP4, we confirmed that the viral infection itself has no significant impact on the responses of these cells to these treatments regarding changes in cell proliferation and Ttr induction.
Given that wt cells and KO cells, with or without viral backbone infection, behave quite similarly in terms of cell proliferation, we speculate that even if we were successful in obtaining a cell line with the Prdm16-expressing vector in the KO cells, it may not exhibit substantial differences compared to wt cells infected with the Prdm16-expressing vector.
Other experimental weaknesses that make the evidence less convincing:
(1) The authors show in Figure 2E that Ttr is not upregulated by BMP4 in PRDM16_KO NSCs. Does this appear inconsistent with the presence of Ttr expression in the PRDM16_KO brain in Figure1C?
The reviewer’s point is that there was no significant increase in Ttr expression in Prdm16_KO cells after BMP4 treatment (Fig. 2E), but there remained residual Ttr mRNA signals in the Prdm16 mutant ChP (Fig. 1C). We think the difference lies in the measurable level of Ttr expression between that induced by BMP4 in NSC culture and that in the ChP. This is based on our immunostaining experiment in which we tried to detect Ttr using a Ttr antibody. This antibody could not detect the Ttr protein in BMP4-treated Prdm16_expressing NSCs but clearly showed Ttr signal in the wt ChP. This means that although Ttr expression can be significantly increased by BMP4 in vitro to a level measurable by RT-qPCR, its absolute quantity even in the Prdm16_expressing condition is much lower compared to that in vivo. Our results in Fig 1C and Fig 2E, as well as Fig 7B, all consistently showed that Prdm16 depletion significantly reduced Ttr expression in vitro and in vivo.
(2) Figure 3: The authors use H3K4me3 to measure gene activity. This is however, very indirect, with bulk RNA-seq providing the most direct readout and polymerase binding (ChIP-seq) another more direct readout. Transcription can be regulated without expected changes in histone methylation, see e.g. papers from Josh Brickman. They verify their H3K4me3 predictions with qPCR for a select number of genes, all related to the kinetochore, but it is not clear why these genes were picked, and one could worry whether these are representative.
H3K4me3 has been widely used as an indicator of active transcription and is a mark for cell identity genes. It has also been demonstrated that H3K4me3 has a direct function in regulating transcription at the step of RNA Pol II pause release. As stated in the text, there are advantages and disadvantages of using H3K4me3 compared to using RNA-seq. RNA-seq profiles all gene products, which are affected by transcription and RNA stability and turnover. In contrast, H3K4me3 levels at gene promoters reflect transcriptional activity. In our case, we aimed to identify differential gene expression between proliferation and quiescence states. The transition between these two states is fast and dynamic. RNA-seq may not be able to identify functionally relevant genes and is more likely to produce false positive and negative results. Therefore, we chose H3K4me3 profiling.
We agree that transcription may change without histone methylation changes. This may cause an under-estimation of the number of changed genes between the conditions.
We validated 7 out of 31 genes (Wnt7b, Id3, Mybl2, Spc24, Spc25, Ndc80 and Nuf2). We chose these genes based on two criteria: 1) their function is implicated in cell proliferation and cell-cycle regulation based on gene ontology analysis; 2) their gene products are detectable in the developing ChP based on the scRNA-seq data. Three of these genes (Wnt7b, Id3, Mybl2) are not related to the kinetochore. We now clarify this description in the revised text.
(3) Line 256: The overlap of 31 genes between 184 BMP-repressed genes and 240 PRDM16-repressed genes seems quite small.
This indicates that in addition to co-repressing cell-cycle genes, BMP and PRDM16 have independent functions. For example, it was reported that BMP regulates neuronal and astrocyte differentiation (Katada, S. 2021), while our previous work demonstrated that Prdm16 controls temporal identity of NSCs (He, L. 2021).
(4) The Wnt7b H3K4me3 track in Fig. 3G is not discussed in the text but it shows H3K4me3 high in _KO and low in _E regardless of BMP4. This seems to contradict the heatmap of H3K4me3 in Figure 3E which shows H3K4me3 high in _E no BMP4 and low in _E BMP4 while omitting _KO no BMP4. Meanwhile CDKN1A, the other gene shown in 3G, is missing from 3E.
The track in Fig 3G shows the absolute signal of H3K4me3 after mapping the sequencing reads to the genome and normalizing them to library size. Comparing the signal in Prdm16_E with BMP4 and that in Prdm16_E without BMP4, the one with BMP4 has a lower peak. The same trend can be seen for the pair of Prdm16_KO cells with or without BMP4. The heatmap in Fig. 3E shows the relative level of H3K4me3 in three conditions. The Prdm16_E cells with BMP4 have the lowest level, while the other two conditions (Prdm16_KO with BMP4 and Prdm16_E without BMP4) display a higher level. These two graphs show a consistent trend of H3K4me3 changes at the Wnt7b promoter across these conditions.
(5) The authors use PRDM16 CUT&TAG on dissected dorsal midline tissues to determine if their 31 identified PRDM16-BMP4 co-repressed genes are regulated directly by PRDM16 in vivo. By manual inspection, they find that "most" of these show a PRDM16 peak. How many is most? If using the same parameters for determining peaks, how many genes in an appropriately chosen negative control set of genes would show peaks? Can the authors rigorously establish the statistical significance of this observation? And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.
In our text, we indicated the genes containing PRDM16 binding peaks in the figures and described them as “Text in black in Fig. 6A and Supplementary Fig. 5A”. We will add the precise number “25 of these genes” in the main text to clarify it. To define a negative control set of genes, we will use the BMP-only repressed genes (184 - 31 = 153 genes, excluding the PRDM16-BMP4 co-repressed genes), and of these 153 genes, we will determine how many have PRDM16 peaks in the E12.5 ChP data, say X. Then we will use a binomial test to calculate the p-value: binom_test(25, 31, X/153, alternative="greater").
We are confused by the second part of the comment “And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.” If the reviewer meant why we didn’t sequence the material from sequential-ChIP or validate more target genes, the reason is the limitation of the material. Sequential ChIP requires a large quantity of the antibodies, and yields little material, barely sufficient for a few qPCR reactions after the second round of IP. This yield was far below the minimum required for library construction. The PRDM16 antibody was a gift, and the quantity we have was very limited. We made considerable efforts to optimize all available commercial antibodies in ChIP and Cut&Tag, but none of them worked.
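For illustration, the one-sided binomial test proposed above can be sketched using only the Python standard library. This is not the authors' analysis script: the function name is hypothetical, and because the number of BMP-only repressed genes with PRDM16 peaks (X) has not yet been determined, the value used below is an explicitly made-up placeholder.

```python
from math import comb


def binomial_p_greater(k, n, p):
    """One-sided ('greater') binomial p-value: P(K >= k) for K ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts taken from the response: 25 of the 31 co-repressed genes show a
# PRDM16 peak, and the control set contains 184 - 31 = 153 genes.
genes_with_peak = 25
co_repressed_genes = 31
control_genes = 153

# PLACEHOLDER ONLY: X is still to be measured; 60 is an arbitrary stand-in.
x_placeholder = 60

null_rate = x_placeholder / control_genes
p_value = binomial_p_greater(genes_with_peak, co_repressed_genes, null_rate)
print(p_value)
```

The same quantity should be obtainable from `scipy.stats.binomtest(25, 31, X/153, alternative="greater")` in recent SciPy releases, where `binomtest` replaces the older `binom_test` function named in the response.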
(6) In comparing RNA in situ between WT and PRDM16 KO in Figure 7, the authors state they use the Wnt2b signal to identify the border between CH and neocortex. However, the Wnt2b signal is shown in grey and it is impossible for this reviewer to see clear Wnt2b expression or where the boundaries are in Figure 7A. The authors also do not show where they placed the boundaries in their analysis. Furthermore, Figure 7B only shows insets for one of the regions being compared making it difficult to see differences from the other region. Finally, the authors do not show an example of their spot segmentation to judge whether their spot counting is reliable. Overall, this makes it difficult to judge whether the quantification in Figure 7C can be trusted.
To address these questions, in the revised manuscript we will include an individual channel of Wnt2b and mark the boundaries. We will also provide full-view images and examples of spot segmentation in supplementary figures, owing to space limitations in the main figures.
(7) The correlation between mKi67 and Axin2 in Figure 7 is interesting but does not convincingly show that Wnt downstream of PRDM16 and BMP is responsible for the increased proliferation in PRDM16 mutants.
We agree that this result (the correlation between mKi67 and Axin2) alone only suggests that Wnt signaling is related to the proliferation defect in the Prdm16 mutant, and does not necessarily mean that Wnt is downstream of PRDM16 and BMP. Our conclusion is backed up by two additional lines of evidence: the Cut&Tag data, in which PRDM16 binds to regulatory regions of Wnt7b and Wnt3a, and the finding that BMP and PRDM16 co-repress Wnt7b in vitro.
An ideal result would be that down-regulating Wnt signaling in the Prdm16 mutant rescues the Prdm16 mutant phenotype. Such an experiment is technically challenging. Wnt plays diverse and essential roles in NSC regulation, and one would need to use a cell type- and stage-specific tool to down-regulate Wnt in the background of the Prdm16 mutation. Moreover, Wnt genes are not the only targets regulated by PRDM16 in these cells, and downregulating Wnt may not be sufficient to rescue the phenotype.
Weaknesses of the presentation:
Overall, the manuscript is not easy to read. This can cause confusion.
We will revise the text to improve the clarity.
Reviewer #2 (Public review):
Summary:
This article investigates the role of PRDM16 in regulating cell proliferation and differentiation during choroid plexus (ChP) development in mice. The study finds that PRDM16 acts as a corepressor in the BMP signaling pathway, which is crucial for ChP formation.
The key findings of the study are:
(1) PRDM16 promotes cell cycle exit in neural epithelial cells at the ChP primordium.
(2) PRDM16 and BMP signaling work together to induce neural stem cell (NSC) quiescence in vitro.
(3) BMP signaling and PRDM16 cooperatively repress proliferation genes.
(4) PRDM16 assists genomic binding of SMAD4 and pSMAD1/5/8.
(5) Genes co-regulated by SMADs and PRDM16 in NSCs are repressed in the developing ChP.
(6) PRDM16 represses Wnt7b and Wnt activity in the developing ChP.
(7) Levels of Wnt activity correlate with cell proliferation in the developing ChP and CH.
In summary, this study identifies PRDM16 as a key regulator of the balance between BMP and Wnt signaling during ChP development. PRDM16 facilitates the repressive function of BMP signaling on cell proliferation while simultaneously suppressing Wnt signaling. This interplay between signaling pathways and PRDM16 is essential for the proper specification and differentiation of ChP epithelial cells. This study provides new insights into the molecular mechanisms governing ChP development and may have implications for understanding the pathogenesis of ChP tumors and other related diseases.
Strengths:
(1) Combining in vitro and in vivo experiments to provide a comprehensive understanding of PRDM16 function in ChP development.
(2) Use of a variety of techniques, including immunostaining, RNA in situ hybridization, RT-qPCR, CUT&Tag, ChIP-seq, and SCRINSHOT.
(3) Identifying a novel role for PRDM16 in regulating the balance between BMP and Wnt signaling.
(4) Providing a mechanistic explanation for how PRDM16 enhances the repressive function of BMP signaling. The identification of SMAD palindromic motifs as preferred binding sites for the SMAD/PRDM16 complex suggests a specific mechanism for PRDM16-mediated gene repression.
(5) Highlighting the potential clinical relevance of PRDM16 in the context of ChP tumors and other related diseases. By demonstrating the crucial role of PRDM16 in controlling ChP development, the study suggests that dysregulation of PRDM16 may contribute to the pathogenesis of these conditions.
Weaknesses:
(1) Limited investigation of the mechanism controlling PRDM16 protein stability and nuclear localization in vivo. The study observed that PRDM16 protein became nearly undetectable in NSCs cultured in vitro, despite high mRNA levels. While the authors speculate that post-translational modifications might regulate PRDM16 in NSCs similar to brown adipocytes, further investigation is needed to confirm this and understand the precise mechanism controlling PRDM16 protein levels in vivo.
While mechanisms controlling PRDM16 protein stability and nuclear localization in the developing brain are interesting, the scope of this paper is revealing the function of PRDM16 in the choroid plexus and its interaction with BMP signaling. We will be happy to pursue this direction in our next study.
(2) Reliance on overexpression of PRDM16 in NSC cultures. To study PRDM16 function in vitro, the authors used a lentiviral construct to constitutively express PRDM16 in NSCs. While this approach allowed them to overcome the issue of low PRDM16 protein levels in vitro, it is important to consider that overexpressing PRDM16 may not fully recapitulate its physiological role in regulating gene expression and cell behavior.
As stated above, we acknowledge that findings from cultured NSCs may not directly apply to ChP cells in vivo. We are cautious with our statements. The cell culture work was aimed at identifying potential mechanisms by which PRDM16 and SMADs interact to regulate gene expression and target genes co-regulated by these factors. We expect that not all targets from cell culture are regulated by PRDM16 and SMADs in the ChP, so we validated expression changes of several target genes in the developing ChP and have now included the new data in Fig. 7 and Supplementary Fig. 7. Out of the 31 genes identified from cultured cells, four cell cycle regulators including Wnt7b, Id3, Spc24/25/Nuf2 and Mybl2, showed de-repression in Prdm16 mutant ChP. These genes can be relevant downstream genes in the ChP, and other target genes may be cortical NSC-specific or less dependent on Prdm16 in vivo.
(3) Lack of direct evidence for AP1 as the co-factor responsible for SMAD relocation in the absence of PRDM16. While the study identified the AP1 motif as enriched in SMAD binding sites in Prdm16 knockout cells, they only provided ChIP-qPCR validation for c-FOS binding at two specific loci (Wnt7b and Id3). Further investigation is needed to confirm the direct interaction between AP1 and SMAD proteins in the absence of PRDM16 and to rule out other potential co-factors.
We agree that the finding of the AP1 motif enriched at the PRDM16 and SMAD co-binding regions in Prdm16 KO cells can only indirectly suggest AP1 as a co-factor for SMAD relocation. That’s why we used ChIP-qPCR to examine the presence of C-fos at these sites. Although we only validated two targets, the result confirms that C-fos binds to the sites only in the Prdm16 KO cells but not Prdm16_expressing cells, suggesting AP1 is a co-factor. Our results cannot rule out the presence of other co-factors.
Reviewer #3 (Public review):
Summary:
Bone morphogenetic protein (BMP) signaling instructs multiple processes during development including cell proliferation and differentiation. The authors set out to understand the role of PRDM16 in these various functions of BMP signaling. They find that PRDM16 and BMP co-operate to repress stem cell proliferation by regulating the genomic distribution of BMP pathway transcription factors. They additionally show that PRDM16 impacts choroid plexus epithelial cell specification. The authors provide evidence for a regulatory circuit (constituting of BMP, PRDM16, and Wnt) that influences stem cell proliferation/differentiation.
Strengths:
I find the topics studied by the authors in this study of general interest to the field, the experiments well-controlled and the analysis in the paper sound.
Weaknesses:
I have no major scientific concerns. I have some minor recommendations that will help improve the paper (regarding the discussion).
We will revise the discussion according to the suggestions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
eLife Assessment
The authors utilize a valuable computational approach to exploring the mechanisms of memory-dependent klinotaxis, with a hypothesis that is both plausible and testable. Although they provide a solid hypothesis of circuit function based on an established model, the model's lack of integration of newer experimental findings, its reliance on predefined synaptic states, and oversimplified sensory dynamics, make the investigation incomplete for both memory and internal-state modulation of taxis.
We would like to express our gratitude to the editor for the assessment of our work. However, we respectfully disagree with the assessment that our investigation is incomplete, if the negative assessment is primarily due to the impact of AIY interneuron ablation on the chemotaxis index (CI) which was reported in Reference [1]. It is crucial to acknowledge that the CI determined through experimental means incorporates contributions from both klinokinesis and klinotaxis [1]. It is plausible that the impact of AIY ablation was not adequately reflected in the CI value. Consequently, the experimental observation does not necessarily diminish the role of AIY in klinotaxis. Anatomical evidence provided by the database (http://ims.dse.ibaraki.ac.jp/ccep-tool/) substantiates that ASE sensory neurons and AIZ interneurons, which have been demonstrated to play a crucial role in klinotaxis [Matsumoto et al., PNAS 121 (5) e2310735121], have the highest number of synaptic connections with AIY interneurons. These findings provide substantial evidence supporting the validity of the presented minimal neural network responsible for salt klinotaxis.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This research focuses on C. elegans klinotaxis, a chemotactic behavior characterized by gradual turning, aiming to uncover the neural circuit mechanism responsible for the context-dependent reversal of salt concentration preference. The phenomenon observed is that the preferred salt concentration depends on the difference between the pre-assay cultivation conditions and the current environmental salt levels.
We would like to express our gratitude for the time and consideration you have dedicated to reviewing our manuscript.
The authors propose that a synaptic-reversal plasticity mechanism at the primary sensory neuron, ASER, is critical for this memory- and context-dependent switching of preference. They build on prior findings regarding synaptic reversal between ASER and AIB, as well as the receptor composition of AIY neurons, to hypothesize that similar "plasticity" between ASER and AIY underpins salt preference behavior in klinotaxis. This plasticity differs conceptually from the classical one as it does not rely on any structural changes but rather synaptic transmission is modulated by the basal level of glutamate, and can switch from inhibitory to excitatory.
To test this hypothesis, the study employs a previously established neuroanatomically grounded model [4] and demonstrates that reversing the ASER-AIY synapse sign in the model agent reproduces the observed reversal in salt preference. The model is parameterized using a computational search technique (evolutionary algorithm) to optimize unknown electrophysiological parameters for chemotaxis performance. Experimental validity is ensured by incorporating constraints derived from published findings, confirming the plausibility of the proposed mechanism.
Finally, the circuit mechanism allowing C. elegans to switch behaviour to an exploration run when starved is also investigated. This extension highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.
We would like to thank the reviewer for the appropriate summary of our work.
Strengths and weaknesses:
The authors' approach of integrating prior knowledge of receptor composition and synaptic reversal with the repurposing of a published neuroanatomical model [4] is a significant strength.
This methodology not only ensures biological plausibility but also leverages a solid, reproducible modeling foundation to explore and test novel hypotheses effectively.
The evidence produced that the original model has been successfully reproduced is convincing.
The writing of the manuscript needs revision as it makes comprehension difficult.
We would like to thank the reviewer for recognizing the usefulness of our approach. In the revised version, we will improve the explanation.
One major weakness is that the model does not incorporate key findings that have emerged since the original model's publication in 2013, limiting the support for the proposed mechanism. In particular, ablation studies indicate that AIY is not critical for chemotaxis, and other interneurons may play partially overlapping roles in positive versus negative chemotaxis. These findings challenge the centrality of AIY and suggest the model oversimplifies the circuit involved in klinotaxis.
We would like to express our gratitude for the constructive feedback. We concur with some of your points. Our model is a minimal network for salt klinotaxis: it includes only the interneurons that are connected to each other via the highest number of synaptic connections, and it does not consider redundant interneurons with overlapping roles. Consequently, the model is not applicable to studying the impact of interneuron ablation. Reference [1] investigated the influence of interneuron ablations on the chemotaxis index (CI). The experimentally determined CI value incorporates contributions from both klinokinesis and klinotaxis, so it is plausible that the impact of AIY ablation was not significantly reflected in the CI value. The experimental observation therefore does not necessarily diminish the role of AIY in klinotaxis.
Reference [1] also shows that ASER neurons exhibit complex, memory- and context-dependent responses, which are not accounted for in the model and may have a significant impact on chemotactic model behaviour.
As pointed out by the reviewer, our model does not incorporate the context-dependent response of the ASER. Instead, the present study considers the salt concentration-dependent glutamate release from the ASER [S. Hiroki et al. Nat Commun 13, 2928 (2022)], which results from the ASER responses.
The hypothesis of synaptic reversal between ASER and AIY is not explicitly modeled in terms of receptor-specific dynamics or glutamate basal levels. Instead, the ASER-to-AIY connection is predefined as inhibitory or excitatory in separate models. This approach limits the model's ability to test the full range of mechanisms hypothesized to drive behavioral switching.
We would like to thank the reviewer for the helpful comments. In the revised version, we will mention the limitation.
While the main results - such as response dependence on step inputs at different phases of the oscillator - are consistent with those observed in chemotaxis models with explicit neural dynamics (e.g., Reference [2]), the lack of richer neural dynamics could overlook critical effects. For example, the authors highlight the influence of gap junctions on turning sensitivity but do not sufficiently analyze the underlying mechanisms driving these effects. The role of gap junctions in the model may be oversimplified because, as in the original model [4], the oscillator dynamics are not intrinsically generated by an oscillator circuit but are instead externally imposed via $z_\text{osc}$. This simplification should be carefully considered when interpreting the contributions of specific connections to network dynamics. Lastly, the complex and context-dependent responses of ASER [1] might interact with circuit dynamics in ways that are not captured by the current simplified implementation. These simplifications could limit the model's ability to account for the interplay between sensory encoding and motor responses in C. elegans chemotaxis.
We may not have fully understood the substance of this comment. We do recognize that the oscillator dynamics in our model are not generated by an oscillator neural circuit. The present study instead focuses on how the sensory input and the resulting interneuron dynamics regulate the oscillatory activity of SMB motor neurons to generate klinotaxis.
Appraisal:
The authors show that their model can reproduce memory-dependent reversal of preference in klinotaxis, demonstrating that the ASER-to-AIY synapse plays a key role in switching chemotactic preferences. By switching the ASER-AIY connection from excitatory to inhibitory they indeed show that salt preference reverses. They also show that the curving/turn rate underlying the preference change is gradual and depends on the weight between ASER-AIY. They further support their claim by showing that curving rates also depend on cultivated (set-point).
We would like to thank the reviewer for assessing our work.
Thus within the constraints of the hypothesis and the framework, the model operates as expected and aligns with some experimental findings. However, significant omissions of key experimental evidence raise questions on whether the proposed neural mechanisms are sufficient for reversal in salt-preference chemotaxis.
We agree with your opinion. The present hypothesis should be verified by experiments.
Previous work [1] has shown that individually ablating the AIZ or AIY interneurons has essentially no effect on the Chemotactic Index (CI) toward the set point ([1] Figure 6). Furthermore, in [1] the authors report that different postsynaptic neurons are required for movement above or below the set point. The manuscript should address how this evidence fits with their model by attempting similar ablations. It is possible that the CI is rescued by klinokinesis but this needs to be tested on an extension of this model to provide a more compelling argument.
We would like to express our gratitude for the constructive feedback. Reference [1] investigated the influence of interneuron ablations on the chemotaxis index (CI). It is important to acknowledge that the experimentally determined CI value encompasses the contributions of both klinokinesis and klinotaxis, so it is plausible that the impact of AIY ablation was not reflected in the CI value. Consequently, these experimental observations do not necessarily diminish the role of AIY in klinotaxis. The neural circuit model employed in the present study constitutes a minimal network for salt klinotaxis, encompassing only interneurons that are connected to each other via the highest number of synaptic connections. Anatomical evidence provided by the database (http://ims.dse.ibaraki.ac.jp/ccep-tool/) substantiates that ASE sensory neurons and AIZ interneurons, which have been demonstrated to play a crucial role in klinotaxis [Matsumoto et al., PNAS 121 (5) e2310735121], have the highest number of synaptic connections with AIY interneurons. Our model does not take into account redundant interneurons with overlapping roles and is therefore not applicable to studying the effects of interneuron ablation.
The investigation of dispersal behaviour in starved individuals is rather limited to testing by imposing inhibition of the SMB neurons. Although a circuit is proposed for how hunger states modulate taxis in the absence of food, this circuit hypothesis is not explicitly modelled to test the theory or provide novel insights.
As pointed out by the reviewer, the neural circuit that inhibits the SMB motor neurons was not explicitly incorporated in our model. Instead, we examined whether our minimal network model could reproduce dispersal behavior under starvation conditions solely through the experimentally identified inhibition of the SMB motor neurons.
Impact:
This research underscores the value of an embodied approach to understanding chemotaxis, addressing an important memory mechanism that enables adaptive behavior in the sensorimotor circuits supporting C. elegans chemotaxis. The principle of operation - the dependence of motor responses to sensory inputs on the phase of oscillation - appears to be a convergent solution to taxis. Similar mechanisms have been proposed in Drosophila larvae chemotaxis [2], zebrafish phototaxis [3], and other systems. Consequently, the proposed mechanism has broader implications for understanding how adaptive behaviors are embedded within sensorimotor systems and how experience shapes these circuits across species.
We would like to express our gratitude for the useful suggestion. In the revised version, we will add the argument that the reviewer mentioned.
Although the reported reversal of synaptic connection from excitatory to inhibitory is an exciting phenomenon of broad interest, it is not entirely new, as the authors acknowledge similar reversals have been reported in ASER-to-AIB signaling for klinokinesis (Hiroki et al., 2022). The proposed reversal of the ASER-to-AIY synaptic connection from inhibitory to excitatory is a novel contribution in the specific context of klinotaxis. While the ASER's role in gradient sensing and memory encoding has been previously identified, the current paper mechanistically models these processes, introducing a hypothesis for synaptic plasticity as the basis for bidirectional salt preference in klinotaxis.
The research also highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.
The methodology of parameter search on a neural model of a connectome used here yielded the valuable insight that connectome information alone does not provide enough constraints to reproduce the neural circuits for behaviour. It demonstrates that additional neurophysiological constraints are required.
We would like to acknowledge the appropriate recognition of our work.
Additional Context
Oscillators with stimulus-driven perturbations appear to be a convergent solution for taxis and navigation across species. Similar mechanisms have been studied in zebrafish phototaxis [3] and Drosophila larvae chemotaxis [2], and have even been proposed to underlie search runs in ants.
The modulation of taxis by context and memory is a ubiquitous requirement, with parallels across species. For example, Drosophila larvae modulate taxis based on current food availability and predicted rewards associated with odors, though the underlying mechanism remains elusive. The synaptic reversal mechanism highlighted in this study offers a compelling framework for understanding how taxis circuits integrate context-related memory retrieval more broadly.
We would like to express our gratitude for the insightful commentary. In the revised version, we will add a discussion noting that a similar oscillator mechanism with stimulus-driven perturbations has been observed in zebrafish phototaxis [3] and Drosophila larvae chemotaxis [2].
As a side note, an interesting difference emerges when comparing C. elegans and Drosophila larvae chemotaxis. In Drosophila larvae, oscillatory mechanisms are hypothesized to underlie all chemotactic reorientations, ranging from large turns to smaller directional biases (weathervaning). By contrast, in C. elegans, weathervaning and pirouettes are treated as distinct strategies, often attributed to separate neural mechanisms. This raises the possibility that their motor execution could share a common oscillator-based framework. Re-examining their overlap might reveal deeper insights into the neural principles underlying these maneuvers.
We would like to acknowledge your thoughtfully articulated comment. Using the anatomical database (http://ims.dse.ibaraki.ac.jp/ccep-tool/), we found that the neural circuits underlying weathervaning and pirouettes in C. elegans are predominantly distinct but exhibit partial overlap. When we restrict our search to the neurons that are connected to each other with the highest number of synaptic connections, we identify projections from the weathervaning circuit to the pirouette circuit; however, we observed no projections in the reverse direction. This finding suggests that the weathervaning circuit, namely our minimal neural network, is unlikely to be affected by the pirouette circuit, which consists of AIB interneurons and the interneurons and motor neurons downstream of them.
(1) Luo, L., Wen, Q., Ren, J., Hendricks, M., Gershow, M., Qin, Y., Greenwood, J., Soucy, E.R., Klein, M., Smith-Parker, H.K., & Calvo, A.C. (2014). Dynamic encoding of perception, memory, and movement in a C. elegans chemotaxis circuit. Neuron, 82(5), 1115-1128.
(2) Antoine Wystrach, Konstantinos Lagogiannis, Barbara Webb (2016) Continuous lateral oscillations as a core mechanism for taxis in Drosophila larvae eLife 5:e15504.
(3) Wolf, S., Dubreuil, A.M., Bertoni, T. et al. Sensorimotor computation underlying phototaxis in zebrafish. Nat Commun 8, 651 (2017).
(4) Izquierdo, E.J. and Beer, R.D., 2013. Connecting a connectome to behavior: an ensemble of neuroanatomical models of C. elegans klinotaxis. PLoS computational biology, 9(2), p.e1002890.
Reviewer #2 (Public review):
Summary:
This study explores how a simple sensorimotor circuit in the nematode C. elegans enables it to navigate salt gradients based on past experiences. Using computational simulations and previously described neural connections, the study demonstrates how a single neuron, ASER, can change its signaling behavior in response to different salt conditions, with which the worm is able to "remember" prior environments and adjust its navigation toward "preferred" salinity accordingly.
We would like to express our gratitude for the time and consideration the reviewer has dedicated to reviewing our manuscript.
Strengths:
The key novelty and strength of this paper is the explicit demonstration of computational neurobehavioral modeling and evolutionary algorithms to elucidate the synaptic plasticity in a minimal neural circuit that is sufficient to replicate memory-based chemotaxis. In particular, with changes in ASER's glutamate release and sensitivity of downstream neurons, the ASER neuron adjusts its output to be either excitatory or inhibitory depending on ambient salt concentration, enabling the worm to navigate toward or away from salt gradients based on prior exposure to salt concentration.
We would like to thank the reviewer for appreciating our research.
Weaknesses:
While the model successfully replicates some behaviors observed in previous experiments, many key assumptions lack direct biological validation. As to the model output readouts, the model considers only endpoint behaviors (chemotaxis index) rather than the full dynamics of navigation, which limits its predictive power. Moreover, some results presented in the paper lack interpretation, and many descriptions in the main text are overly technical and require clearer definitions.
We would like to thank the reviewer for the constructive feedback. As the reviewer noted, the fundamental assumptions posited in the study have yet to be substantiated by biological validation. Consequently, these assumptions must be directly assessed by biological experimentation. The model performance for salt klinotaxis is evaluated by multiple measures, including not only the chemotaxis index but also the curving rate vs. bearing (Fig. 4a; the bearing is defined in Fig. A3) and the curving rate vs. normal gradient (Fig. 4c). The latter two measures characterize the trajectory during salt klinotaxis. In the revised version, we will carefully revise the manuscript according to the reviewer's suggestions. We would like to express our sincere gratitude for your insightful review of our work.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank all the reviewers for their detailed comments. In response, we will address the comments with further analysis, experiments and an expanded discussion.
In terms of each specific reviewer's comments:
Reviewer 1 was positive overall but had several suggestions and requested more rigorous controls. These are highly constructive technical concerns and will be addressed through additional experimentation and methods for quantification.
Reviewer 2 summarised the strengths of the study as being largely confirmatory. They have perhaps not fully appreciated that this is the first published functional assessment of cerebral vascular permeability in a pericyte deficient zebrafish model.
The reviewer has made a number of very helpful suggestions to improve technical aspects of the analysis. Many align with the suggestions of Reviewer 1. Additional experiments that include more rigorous controls and further methods to quantify vessel permeability will address these concerns in revision.
We also note that the reviewer calls for a more nuanced and careful discussion section. We take the reviewer's point and do appreciate their concerns. We were limited by word count in the initial submission in short report format, but in response will expand and provide a more thorough discussion.
Reviewer 3 was positive overall but has suggested additional controls and experiments to further strengthen the findings and support our conclusions. Some align with the suggestions of Reviewers 1 and 2. We agree and aim to address them through additional work in revision.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The investigators in this study analyzed the dataset assembly from 540 Salmonella isolates, and those from 45 recent isolates from Zhejiang University of China. The analysis and comparison of the resistome and mobilome of these isolates identified a significantly higher rate of cross-region dissemination compared to localized propagation. This study highlights the key role of the resistome in driving the transition and evolutionary history of S. Gallinarum.
Strengths:
The isolates included in this study were from 16 countries in the past century (1920 to 2023). While the study uses S. Gallinarum as the prototype, the conclusion from this work will likely apply to other Salmonella serotypes and other pathogens.
Thank you very much for your positive feedback. We recognize, as you noted, that emphasizing Salmonella enterica Serovar Gallinarum in the title may lead readers to perceive our methods and conclusions as overly restrictive. In light of your evaluation of our work, we have revised the title to: “Avian-specific Salmonella transition to endemicity is accompanied by localized resistome and mobilome interaction”. We believe this final version not only reflects the applicability of our conclusions, as you appreciated, but also addresses your previous suggestion to highlight the resistome and mobilome.
Revisions in the manuscript
Lines: 1-3
Weaknesses:
While the isolates came from 16 countries, most strains in this study were originally from China.
We believe that this issue was discussed in detail in our previous response. Although potential bias exists, we have minimized its impact by constructing the largest global S. Gallinarum genome dataset to date. In addition, we have further emphasized these limitations in the manuscript.
Comments on revisions:
This reviewer is happy with the detailed responses from the authors regarding revising this manuscript. I do not have further comments.
We greatly appreciate your positive feedback and are pleased that our responses have addressed your concerns.
Reviewer #2 (Public review):
Summary:
The authors sequence 45 new samples of S. Gallinarum, a commensal Salmonella found in chickens, which can sometimes cause disease. They combine these sequences with around 500 from public databases, determine the population structure of the pathogen, and coarse relationships of lineages with geography. The authors further investigate known anti-microbial genes found in these genomes, how they associate with each other, whether they have been horizontally transferred, and date the emergence of clades.
Strengths:
- It doesn't seem that much is known about this serovar, so publicly available new sequences from a high burden region are a valuable addition to the literature.
- Combining these sequences with publicly available sequences is a good way to better contextualise any findings.
- The genomic analyses have been greatly improved since the first version of the manuscript, and appropriately analyse the population and date emergence of clades.
- The SNP thresholds are contextualised in terms of evolutionary time.
- The importance and context of the findings are fairly well described.
Thank you so much for your thorough review and constructive comments on the manuscript.
Weaknesses:
- There are still a few issues with the genomic analyses, although they no longer undermine the main conclusions:
We are grateful for the valuable time and effort you have dedicated to improving our manuscript. In this revision, we have provided a point-by-point response to each of your concerns. Moreover, with the addition of new supplementary materials and modifications to the figures, we have re-examined and adjusted the numbering of figures and supplementary materials in the text to ensure they appear correctly in the manuscript.
(1) Although the SNP distance is now considered in terms of time, the 5 SNP distance presented still represents ~7yrs evolution, so it is unlikely to be a transmission event, as described. It would be better to use a much lower threshold or describe the interpretation of these clusters more clearly. Bringing in epidemiological evidence or external references on the likely time interval between transmissions would be helpful.
We sincerely thank you for highlighting this issue. We appreciate your concern regarding the use of a 5-SNP threshold to define a transmission event, especially given the approximate 7-year evolutionary timeframe. Considering our updated estimate for the evolutionary rate of S. Gallinarum (approximately 0.74 SNPs per year, with a 95% HPD range of 0.42 to 1.06), we have revised the manuscript to use a 2-SNP threshold (approximately representing less than two years of evolution) to better control the temporal span of transmission events. In addition, we have updated the manuscript to reflect this new threshold and demonstrated that the use of a more stringent SNP threshold does not affect the overall conclusions of the study.
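The relationship between a SNP threshold and elapsed time can be sketched as a simple division by the clock rate. The snippet below is only an illustration under a strict-clock assumption; the function name and the `lineages` parameter are hypothetical and are not taken from the manuscript. With `lineages=1` the distance is divided by the rate directly, which reproduces the reviewer's figure of roughly seven years for five SNPs. Setting `lineages=2` instead treats a pairwise distance as accumulating along both lineages since their common ancestor.

```python
def snps_to_years(snp_distance: float, snps_per_year: float, lineages: int = 1) -> float:
    """Convert a SNP distance into elapsed years under a strict molecular clock.

    lineages=1 divides the distance by the rate directly;
    lineages=2 assumes the distance accumulated along two diverging lineages.
    Both conventions are assumptions made for illustration.
    """
    if snps_per_year <= 0:
        raise ValueError("clock rate must be positive")
    if lineages not in (1, 2):
        raise ValueError("lineages must be 1 or 2")
    return snp_distance / (snps_per_year * lineages)


RATE_MEAN = 0.74          # SNPs per year, as quoted in the response
RATE_HPD = (0.42, 1.06)   # 95% HPD bounds, as quoted in the response

if __name__ == "__main__":
    years_mean = snps_to_years(5, RATE_MEAN)
    # A faster clock gives a shorter time span, so the bounds swap order.
    years_short = snps_to_years(5, RATE_HPD[1])
    years_long = snps_to_years(5, RATE_HPD[0])
    print(f"5 SNPs: {years_mean:.2f} years ({years_short:.2f} to {years_long:.2f})")
```

Which convention applies depends on how the SNP distances and the clock rate were defined in the underlying analysis, so the numbers should be read as rough guides only.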
Specifically, we adopted the newly established 2-SNP threshold to update Figure 3a and corresponding Supplementary Figure 8. The heatmap on the far right of New Figure 3a illustrates the SNP distances among 45 newly isolated S. Gallinarum strains from two locations in Zhejiang Province (Taishun and Yueqing). New Supplementary Figure 8 simulates potential transmission events between the bvSP strains isolated from Zhejiang Province (n=95) and those from other regions of China with available provincial information (n=435). These analyses collectively demonstrate the localized transmission patterns of bvSP within China.
For New Figure 3a, we found that even with the 2-SNP threshold, the number of potential transmission events among the 45 newly isolated S. Gallinarum strains from the two Zhejiang locations (Taishun and Yueqing) remains unchanged. In fact, we observed that the results from SNP tracing using an SNP threshold of less than 5 are consistent (see Author response image 1).
Author response image 1.
Clustering results of 45 newly isolated S. Gallinarum strains using different SNP thresholds of 1, 2, 3, 4, and 5 SNPs. The five subplots represent the clustering results under each threshold. Each point corresponds to an individual strain, and lines connect strains with potential transmission relationships.
For New Supplementary Figure 8, we employed the 2-SNP threshold and found that the number of transmission events between the bvSP strains isolated from Zhejiang Province (n=95) and those from other Chinese provinces (n=435) decreased from 91 to 53. The names of the strains involved in these potential transmission events are listed in Supplementary Table 5.
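The threshold-based tracing described above can be sketched as linking every pair of strains whose pairwise SNP distance does not exceed the threshold and then grouping linked strains into clusters. This is a minimal illustration, not the authors' pipeline: the function name, the use of `<=` rather than `<`, and the toy distances are all assumptions.

```python
from itertools import combinations


def cluster_by_snp_threshold(names, dist, threshold):
    """Link strain pairs with SNP distance <= threshold and group them.

    names:     list of strain identifiers
    dist:      dict mapping frozenset({a, b}) -> pairwise SNP distance
    threshold: maximum SNP distance for a putative transmission link
    Returns (links, clusters); clusters contain at least two strains.
    """
    parent = {n: n for n in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    links = []
    for a, b in combinations(names, 2):
        d = dist[frozenset((a, b))]
        if d <= threshold:
            links.append((a, b, d))
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    return links, clusters


# Toy, made-up distances purely for demonstration.
TOY_NAMES = ["A", "B", "C", "D"]
TOY_DIST = {
    frozenset(("A", "B")): 1, frozenset(("A", "C")): 4, frozenset(("A", "D")): 30,
    frozenset(("B", "C")): 3, frozenset(("B", "D")): 28, frozenset(("C", "D")): 31,
}

if __name__ == "__main__":
    for t in (1, 2, 3, 4, 5):
        links, clusters = cluster_by_snp_threshold(TOY_NAMES, TOY_DIST, t)
        print(t, len(links), clusters)
```

Running the same function across several thresholds is one way to check, as in Author response image 1, whether the set of links is stable when the threshold is tightened.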
Revisions in the manuscript
Lines: 352-357
Figures: Figure 3; Supplementary Figure 8
Table: Supplementary Table 5
(2) The HGT definition has not fundamentally been changed and therefore still has some issues, mainly that vertical evolution is still not systematically controlled for.
We sincerely thank you for highlighting this issue. We hope the following explanation will help clarify and improve our manuscript, as well as address your concerns.
In bacteria, mobile genetic elements (MGEs) such as plasmids, transposons, integrons, and prophages, as mentioned in our manuscript, are segments of DNA that encode enzymes and proteins responsible for mediating the movement of genetic material between bacterial genomes (commonly referred to as “jumping genes”). These MGEs contribute to the mechanisms of horizontal gene transfer (HGT) in Salmonella, including transduction (via prophages), conjugation (via plasmids), and transposition (via integrons and transposons) (Nat Rev Microbiol. 2005 Sep;3(9):722-32). These “jumping genes” can enable Salmonella to acquire additional antimicrobial resistance genes (ARGs), which may not only originate from other Salmonella strains but also from distantly related species.
To further address your concern regarding the systematic control of vertical evolution, we employed the HGTphyloDetect pipeline developed by Le Yuan et al. (Brief Bioinform. 2023 Mar 19;24(2):bbad035) to control for vertical evolution in the ARG sequences mentioned in our manuscript. We chose HGTphyloDetect because, as noted, "jumping genes" often occur among evolutionarily distant species, rendering the use of Gubbins potentially unsuitable for these distant HGT events.
Using the HGTphyloDetect pipeline, we extracted base sequences for the eight ARGs shown in Figure 6b with an HGT frequency greater than zero (bla<sup>TEM-1B</sup>, sul1, dfrA17, aadA5, sul2, aph(3’’)-Ib, tet(A), aph(6)-Id). For bla<sup>TEM-1B</sup>, sul1, dfrA17, aadA5, and sul2, the HGT frequency reached 100% across different isolates, indicating that these ARG sequences have a unique sequence type. In contrast, due to the ResFinder settings requiring both similarity and coverage to meet a minimum value of 90%, the base sequences for aph(3’’)-Ib, tet(A), and aph(6)-Id are not unique. Consequently, we applied the HGTphyloDetect pipeline individually to each sequence type of ARGs to verify their association with HGT events. Specifically, among 436 bvSP isolates collected in China, we identified two sequence types of aph(3’’)-Ib, four sequence types of tet(A), and three sequence types of aph(6)-Id.
Subsequently, to identify potential ARGs horizontally acquired from evolutionarily distant organisms, we queried the translated amino acid sequences of each ARG against the National Center for Biotechnology Information (NCBI) non-redundant protein database. We then evaluated whether these sequences were products of HGT by calculating Alien Index (AI) scores and out_perc values.
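The AI score is calculated as follows:
$$\mathrm{AI} = \ln\left(\mathrm{bbhG} + 10^{-200}\right) - \ln\left(\mathrm{bbhO} + 10^{-200}\right)$$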
In this study, bbhG and bbhO represent the E-values of the best blast hit in ingroup and outgroup lineages, respectively. The outgroup lineage is defined as all species outside of the kingdom, while the ingroup lineage encompasses species within the kingdom but outside of the subphylum. An AI score ≥ 45 is considered a strong indicator that the gene in question is likely derived from an HGT event.
Regarding the calculation method for out_perc:
Finally, according to the definition provided by the HGTphyloDetect pipeline, ARGs with AI score ≥ 45 and out_perc ≥ 90% are presumed to be potential candidates for HGT from evolutionarily distant species. We have compiled the calculation results for the aforementioned genes in New Supplementary Table 9. The results indicate that all ARGs presented in Figure 6b, which exhibited a HGT frequency greater than zero, were acquired horizontally by S. Gallinarum. Based on these findings, we have revised the manuscript accordingly.
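The screening rule described in this passage can be sketched in a few lines. The snippet assumes the usual Alien Index form, a difference of log E-values with a small constant added to avoid taking the logarithm of zero. It also assumes that out_perc is the percentage of BLAST hits that fall in the outgroup lineage. The function names are hypothetical, and the code is not the HGTphyloDetect implementation itself.

```python
import math

EPS = 1e-200  # small constant so that an E-value of 0 does not break the logarithm


def alien_index(bbh_ingroup_evalue: float, bbh_outgroup_evalue: float) -> float:
    """AI from the best-hit E-values in the ingroup (bbhG) and outgroup (bbhO)."""
    return math.log(bbh_ingroup_evalue + EPS) - math.log(bbh_outgroup_evalue + EPS)


def out_perc(n_outgroup_hits: int, n_ingroup_hits: int) -> float:
    """Percentage of hits in the outgroup (assumed definition, for illustration)."""
    total = n_outgroup_hits + n_ingroup_hits
    if total == 0:
        return 0.0
    return 100.0 * n_outgroup_hits / total


def is_hgt_candidate(ai: float, outp: float,
                     ai_min: float = 45.0, out_min: float = 90.0) -> bool:
    """Apply the two thresholds quoted in the text (AI >= 45, out_perc >= 90%)."""
    return ai >= ai_min and outp >= out_min


if __name__ == "__main__":
    # Made-up inputs: a weak ingroup hit and a very strong outgroup hit.
    ai = alien_index(1.0, 1e-50)
    pct = out_perc(95, 5)
    print(round(ai, 2), pct, is_hgt_candidate(ai, pct))
```

A large positive AI means the best outgroup hit is far more significant than the best ingroup hit, which is why the score is read as evidence for acquisition from a distant lineage.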
Revisions in the manuscript
Lines: 302-307; 616-650; 955-957
Table: Supplementary Table 9
Using a 5kb window is not sufficient, as LD may extend across the entire genome.
We agree with your point that linkage disequilibrium (LD) could influence the transmission of genes within chromosomal regions. LD can lead to the non-random co-occurrence of alleles at different loci within a population. Considering that horizontal gene transfer (HGT) events involving more distantly related ARGs may be accompanied by vertical propagation on chromosomes, and to simultaneously assess the impact of LD, we conducted two evaluations.
It is important to note that the following assessments are based on the assumption that plasmid replicons detected by PlasmidFinder are part of self-replicating, extrachromosomal DNA.
(1) In the revised pipeline used to calculate ARG HGT frequencies, we categorized a total of 621 ARGs carried by 436 bvSP isolates collected in China and found that 415 of these ARGs were located on MGEs. We further investigated the distribution of these 415 ARGs across different MGEs, taking into account the complex nesting relationships among them. We observed that 90% of the ARGs (372/415) were located on plasmid contigs. It is important to clarify that this finding does not contradict our statement in the manuscript regarding plasmids and transposons as the primary reservoirs for resistome geo-temporal dissemination. This is because transposons, integrons, and prophages carrying ARGs can also be found on plasmids. Additionally, only 25 bvSG isolates from China contained ARGs, which were likely acquired via transposons or integrons located on the chromosome.
(2) In our manuscript, we searched for ARGs within a 5kb upstream and downstream region (a total of 10kb) of transposons and integrons (the BLASTn parameters used in the Bacant pipeline to identify transposons and integrons were set to a coverage threshold of 60%, rather than 100%). However, in light of the potential impact of LD on vertical transmission, we expanded our search to a 10kb upstream and downstream range (a total of 20kb) for these 25 isolates. The decision to expand the search range to 10kb upstream and downstream was based on two considerations: 1) Based on the literature, we determined the overall lengths of the integrons and transposons carried by the 25 isolates (Tn801, Tn6205, Tn1721, In498, In1440, In473, and In282) and found that the maximum length of these elements is ~13.5 kb, so a 10kb upstream and downstream threshold effectively covers them. 2) Genomic fragmentation in next-generation sequencing assemblies restricts the search range. We present the results of this expanded search for colocalization of ARGs with transposons and integrons at: Figshare: https://doi.org/10.6084/m9.figshare.28129130.v1
We found that these results were consistent with those obtained using the previous search range.
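The flanking-window search can be sketched as an interval check on each contig: an ARG is counted as co-located when it overlaps the element itself or the region extending a fixed number of bases on either side. The class, function, and coordinates below are hypothetical and chosen only to illustrate the comparison between a 5kb and a 10kb flank. They are not taken from the Bacant pipeline or from the authors' notebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    contig: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str


def args_near_mges(args, mges, flank_bp=5_000):
    """Return (ARG name, MGE name) pairs where the ARG overlaps the MGE
    or its flanking window of flank_bp on each side, on the same contig."""
    hits = []
    for mge in mges:
        lo, hi = mge.start - flank_bp, mge.end + flank_bp
        for arg in args:
            if arg.contig == mge.contig and arg.start <= hi and arg.end >= lo:
                hits.append((arg.name, mge.name))
    return hits


# Made-up coordinates purely for demonstration.
TOY_MGES = [Feature("contig_1", 20_000, 31_000, "MGE_1")]
TOY_ARGS = [
    Feature("contig_1", 33_000, 34_200, "ARG_near"),   # 2 kb downstream
    Feature("contig_1", 38_000, 38_800, "ARG_far"),    # 7 kb downstream
    Feature("contig_2", 21_000, 22_000, "ARG_other"),  # different contig
]

if __name__ == "__main__":
    print(args_near_mges(TOY_ARGS, TOY_MGES, flank_bp=5_000))
    print(args_near_mges(TOY_ARGS, TOY_MGES, flank_bp=10_000))
```

Comparing the hit lists at the two flank sizes mirrors the consistency check described above; on fragmented assemblies the window is additionally capped by the contig boundaries.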
Taken together, these results suggest that although linkage disequilibrium may influence genetic processes within chromosomal regions, particularly for the few chromosome-associated antibiotic resistance genes linked to integrons and transposons, the overall impact in our study is likely minimal. This conclusion is supported by the observation that 90% of the ARGs in our dataset are located on plasmids, and even an expanded search range does not alter this outcome. Additionally, by incorporating Alien Index scores and calculating out_perc, we can further confirm the occurrence of horizontal gene transfer events.
However, it is undeniable that other studies using our current pipeline may be affected. As a temporary remedial measure, we have included a note in the "README" file as below (https://github.com/tjiaa/Cal_HGT_Frequency):
“Note: Considering that ARGs located on the chromosome and carried by mobile genetic elements—such as integrons and transposons—may introduce potential computational errors, we recommend evaluating the number of ARGs associated with these elements on the chromosome during your analysis. If a majority of ARGs in your dataset fall into this category, we suggest using additional methods to evaluate the potential impact of linkage disequilibrium. Additionally, by modifying the “MGE_start” and “MGE_end” parameters in the “eLife_MGE_ARG_Co_location.ipynb” script, you can assess the distance between different ARGs and integrons or transposons on the chromosome. This approach will further aid in evaluating the impact of linkage disequilibrium on the genetic process.”
We believe this approach will assist researchers in further assessing the potential impact of vertical evolution and help other users determine whether additional methods are necessary to account for such effects.
As the authors have now run gubbins correctly, they could use the results from this existing analysis to find recent HGT.
We sincerely thank you for your valuable suggestion. Utilizing additional methods to predict potential horizontal gene transfer (HGT) events could indeed enhance the robustness of the results. However, "jumping genes" often occur among evolutionarily distant species, rendering the use of Gubbins potentially unsuitable for these distant HGT events.
Furthermore, the primary focus of our study is to identify HGT of antimicrobial resistance genes (ARGs) in the Salmonella genome driven by mobile genetic elements. Therefore, we employed the HGTphyloDetect pipeline developed by Le Yuan et al. (Brief Bioinform. 2023 Mar 19;24(2):bbad035) to control for vertical evolution in the ARG sequences. The specific computational methods and conclusions have been detailed above.
To define mobilisation, perhaps a standard pipeline (e.g. https://github.com/EBI-Metagenomics/mobilome-annotation-pipeline) would be more convincing.
Thank you for your valuable suggestion. We agree that defining mobilization using a standardized pipeline can add rigor and clarity to our analysis. The pipeline you referenced (https://github.com/EBI-Metagenomics/mobilome-annotation-pipeline) is an excellent resource and provides a robust approach to the identification and annotation of mobile genetic elements.
We have examined and run this pipeline, which uses “IntegronFinder” and “ICEfinder” to detect integrons, “geNomad” to identify plasmids, and “geNomad” and “VIRify” to detect prophages. Our initial checks revealed that the numbers of integrons, plasmids, and prophages identified using this pipeline were consistent with those detected in our study. However, due to the significantly different output formats, the results from this pipeline could not be integrated with the pipeline we used for calculating HGT frequency.
We will incorporate the standardized pipeline you suggested in future studies to further improve the reliability of our findings.
(3) The invasiveness index is better described, but the authors still did not provide convincing evidence that the small difference is actually biologically meaningful (there was no statistical difference between the two strains provided in response Figure 6). What do other Salmonella papers using this approach find, and can their links be brought in? If there is still no good evidence, a better description of this difference would help make the conclusions better supported.
We sincerely appreciate your thoughtful feedback. The initial introduction of the invasiveness index in our manuscript aimed to quantitatively assess the differences in invasiveness between two geographically distinct strains of S. Gallinarum (isolated from Taishun and Yueqing) by comparing the degradation of 196 top predicted genes associated with invasiveness in their genomes. We found a highly significant statistical difference (P < 0.0001) in the invasiveness index between them.
Several studies have also employed the invasiveness index to predict biological relevance in Salmonella strains, and we believe these examples provide further context for our approach:
(1) Caisey V. Pulford et al, Nat Microbiol, 2021, used the same method to calculate the invasiveness index for Salmonella Typhimurium and employed it to characterize the invasiveness of different lineage strains. They found that Salmonella in Lineage-3 exhibited the highest invasiveness index, suggesting an adaptation from an intestinal to a systemic lifestyle. The authors noted, "Although the invasiveness index cannot yet be experimentally validated, Salmonella isolates with different invasiveness indices produce distinct clinical symptoms in a human population (BMC Med. 2020 Jul 17; 18(1):212)". They emphasized the necessity of developing more robust methods to measure Salmonella invasiveness.
(2) Sandra Van Puyvelde et al, Nat Commun, 2019, reported that Salmonella Typhimurium sequence type 313 (ST313) lineage II.1 exhibited a higher invasiveness index compared to lineage II, suggesting that the two lineages might have distinct adaptations to an invasive lifestyle. Further experiments demonstrated significant differences between these lineages in terms of biofilm formation (A red dry and rough (RDAR) assay) and metabolic capacity for carbon compounds.
(3) Wim L. Cuypers et al, Nat Commun, 2023, calculated the invasiveness index for 284 global Salmonella Concord strains across different lineages and found that Lineage-4 potentially exhibited the highest invasiveness.
We acknowledge that no significant difference in mortality was observed between the L2b and L3b S. Gallinarum strains in 16-day-old SPF chicken embryos. However, the literature cited above suggests that strains with different invasiveness indices may still differ in biofilm formation and metabolic capacities, reflecting their adaptation to different host environments. As such, we maintain that the invasiveness index remains a valuable metric for evaluating the genomic differences between S. Gallinarum strains from Taishun and Yueqing. We plan to investigate these differences further through phenotypic experiments in future work.
In the revised manuscript, we have added the following discussion along with additional references:
Lines 358-365: “Moreover, the invasiveness index of bvSP from Taishun and Yueqing suggests that different lineages of S. Gallinarum recovered from distinct regions may exhibit biological differences. Previous studies have shown that strains with higher invasiveness indexes tend to be more virulent in hosts (30, 31), potentially causing neurological or arthritic symptoms in S. Gallinarum infections. Furthermore, strains with varying invasiveness indexes have been confirmed to differ in their biofilm formation abilities and metabolic capacities for carbon compounds (32).”
Revisions in the manuscript:
Lines: 358-365, 806-827.
In summary, the analysis is broadly well described and feels appropriate. Some of the conclusions are still not fully supported, although the main points and context of the paper now appear sound.
Thank you so much for your positive evaluation of our work. We hope that the revised manuscript meets your expectations and offers a more accurate interpretation of our findings.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
This is a great improvement over the first version and I thank the authors for a thorough response, as well as changing their conclusions in response to their improvements.
Other small remaining issues:
Figure 3: Heatmap of SNPs is hard to read in grayscale. It also just represents the between clade distances already shown by the tree. It would be more useful to present intraclade distances only to see the SNP resolution _within_ each lineage. Using a better colour scheme would also help.
Thank you for your insightful comments and suggestions regarding Figure 3. We agree that the grayscale heatmap may present challenges in terms of visual clarity. To address this, we have updated the heatmap with a more distinct color gradient, ensuring better contrast and easier interpretation (New Figure 3).
Regarding your second suggestion: "It would be more useful to present intraclade distances only to see the SNP resolution within each lineage," we believe it is already addressed in the current version of New Figure 3. Specifically, the heatmap on the right side of New Figure 3 illustrates the SNP distances between S. Gallinarum isolates from Taishun and Yueqing, with the goal of demonstrating that genomic variation within isolates from a single region is generally smaller compared to those from different regions. In this figure, 45 newly isolated S. Gallinarum strains are categorized into two lineages: L2b and L3b. The heatmap on the right side of Figure 3 displays the SNP distances between all pairwise combinations of these 45 strains, where the intraclade distances are represented by the red regions (highlighting the pairwise distances within each lineage, specifically L3b and L2b, which are indicated by two triangles). The between-clade distances are shown by the blue regions.
We also agree that exploring the intraclade distances across the entire dataset of 580 S. Gallinarum strains could provide additional insights. However, this analysis would extend beyond the scope of the current section.
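As a rough illustration of the distinction drawn above between intraclade and between-clade SNP distances, the pairwise distances can be partitioned by lineage label. This is only a sketch: the function name, the strain identifiers, and the distance values are hypothetical toy data, not values from the manuscript.

```python
from itertools import combinations


def split_pairwise_distances(dist, lineage):
    """Partition pairwise SNP distances into within-lineage and
    between-lineage groups.

    dist:    dict mapping frozenset({strain_a, strain_b}) -> SNP distance
    lineage: dict mapping strain -> lineage label
    """
    within = {}
    between = []
    for a, b in combinations(sorted(lineage), 2):
        d = dist[frozenset((a, b))]
        if lineage[a] == lineage[b]:
            within.setdefault(lineage[a], []).append(d)
        else:
            between.append(d)
    return within, between


# Hypothetical toy data: two strains per lineage.
lineage = {"s1": "L2b", "s2": "L2b", "s3": "L3b", "s4": "L3b"}
dist = {
    frozenset(("s1", "s2")): 4,
    frozenset(("s3", "s4")): 7,
    frozenset(("s1", "s3")): 310,
    frozenset(("s1", "s4")): 305,
    frozenset(("s2", "s3")): 312,
    frozenset(("s2", "s4")): 308,
}

within, between = split_pairwise_distances(dist, lineage)
print(within)
print(sorted(between))
```

Plotting only the `within` groups would give the per-lineage resolution the reviewer asked about, while `between` corresponds to the distances already implied by the tree.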
Revisions in the manuscript Line: 998
Figure: Figure 3
Please remove Figure 6c, it does not add anything to the paper and raises questions about performing this regression.
Thank you for pointing out this issue. We have removed Figure 6c and the corresponding description in the "Results" section from the manuscript (New Figure 6).
Revisions in the manuscript Lines: 316, 319, 1035-1041.
Figure: Figure 6
Again, thank you all for your time and efforts in reviewing our work. We believe the improved manuscript meets the high standards of the journal.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
We thank Reviewers for highlighting the strengths of our work along with suggestions for future directions.
We agree with the Reviewers that RPS26 depletion may impact not only RAN translation initiation and codon selection (as shown in the experiments in Figure 4G), but also other mechanisms, such as the speed of PIC scanning, as we stated in the discussion. However, we did provide data showing that the mRNA level of exogenous FMR1-GFP does not change upon RPS26 depletion (Figure 3B&C); hence, the observed effect most likely stems from translational regulation. In addition, the experiment with ASO-ACG treatment (Figure 4G) suggests that near-cognate start codon selection or the speed of PIC scanning may be part of the regulation of RAN translation that is sensitive to RPS26 depletion. Furthermore, our latest unpublished results (Niewiadomska D. et al., in revision) indicate that FMRpolyG in fusion with GFP is fairly stable, particularly when derived from long repeats (>90xCGG), suggesting that protein stability is not at play in RPS26-dependent regulation.
We would like to stress that, to avoid bias in interpreting the results and to mimic the natural situation, the majority of experiments concerning FMRpolyG levels were performed in cell models with stable expression of ACG-initiated FMRpolyG. Currently, we do not possess a cell model with stable expression of AUG-initiated FMRpolyG, and experiments based on a transient transfection system would not necessarily be comparable to results obtained in a stable expression system. However, we believe that the experiment presented in Figure 2B serves as a good control for the overall translation level upon RPS26 depletion, indicating that RPS26 insufficiency does not affect global translation and that the observed regulation is specific to some mRNAs, including the one encoding the FMRpolyG frame. We also show that the level of ca. 80% of identified canonical proteins, including FMRP, did not change upon RPS26 silencing (SILAC-MS, Figure 4A). Indeed, we did not explore ribosome composition upon RPS26 and TSR2 depletion, although the pool of functional ribosomes in the cell is most likely sufficient to support the basal translation level (SUnSET assays, Figure 2B & 5C). However, we cannot exclude the possibility that for some mRNAs, including the one encoding FMRpolyG, the observed effect is partially caused by a lower number of fully active ribosomes, especially in transient transfection experiments, where transgene expression is hundreds of times higher than that of an average native mRNA.
Finally, we agree with the Reviewer that an in vitro translation assay would provide evidence of a direct effect of RPS26 on the FMRpolyG level; however, we did not manage to overcome technical difficulties in obtaining cellular lysate devoid of RPS26 from vendor companies.
The following is the authors’ response to the original reviews.
General Comments
We thank the Reviewers for the critical comments and experimental suggestions. We took most of the advice into account in the revised version of the manuscript, which allowed for a more balanced interpretation of the results presented and further supported the major statement of the manuscript: that insufficiency of RPS26 and RPS25 plays a role in modulating the efficiency of noncanonical RAN translation from FMR1 mRNA, which results in the production of toxic polyglycine protein (FMRpolyG). Firstly, in new experiments, we showed that silencing of RPS26 and its chaperone protein TSR2, which regulates loading/exchange of RPS26 in the maturing small ribosomal subunit, did not elicit global translation inhibition. Secondly, we demonstrated that, in contrast to RPS26 and RPS25 depletion, silencing the RPS6 protein, a core component of the 40S subunit, did not affect FMRpolyG production, further supporting the specific effect of RPS26 and RPS25 on RAN translation regulation of mutant FMR1 mRNA. We also observed that depletion of RPS26, RPS25 and RPS6 had a significant negative effect on cell proliferation, which is in line with previously published results indicating that insufficiencies of ribosomal proteins negatively affect cell growth. Moreover, we showed that FMRpolyG production is significantly affected by RPS26 depletion when initiated at ACG, but not at other near-cognate start codons. Importantly, translation of FMRP initiated at the canonical AUG codon of the same mRNA, upstream of the CGGexp, was not affected by RPS26 silencing, similarly to the vast majority of the human proteome. This implies that RAN translation of FMR1 mRNA mediated by RPS26 insufficiency is likely to be dependent on start codon selection/fidelity. In essence, we provide several lines of evidence indicating that the cellular amount of the 40S ribosomal proteins RPS26 and RPS25 is an important factor in CGG-related RAN translation regulation. Finally, we also decided to tone down our claims.
We now state that RPS26/25/TSR2 insufficiency or depletion affects RAN translation, rather than that the composition of the 40S ribosomal subunit per se influences RAN translation. We have addressed all specific concerns below and made changes to the new version of the manuscript.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this manuscript, Tutak et al use a combination of pulldowns, analyzed by mass spectrometry, reporter assays, and fluorescence experiments to decipher the mechanism of protein translation in fragile X-related diseases. The topic is interesting and important.
Although a role for Rps26-deficient ribosomes in toxic protein translation is plausible based on already available data, the authors' data are not carefully controlled and thus do not support the conclusions of the paper.
We sincerely appreciate your rigorous, insightful, and constructive feedback throughout the revision process. We believe your guidance has been instrumental in significantly enhancing the quality of our research. Below, we have addressed your comments point-by-point.
Strengths:
The topic is interesting and important.
Weaknesses:
In particular, there is very little data to support the notion that Rps26-deficient ribosomes are even produced under the circumstances. And no data that indicate that they are involved in the RAN translation. Essential controls (for ribosome numbers) are lacking, no information is presented on the viability of the cells (Rps26 is an essential protein), and the differences in protein levels could well arise from block in protein synthesis, and cell division coupled to differential stability of the proteins.
We agree that the data presented in the first version of the manuscript did not directly address the following processes: ribosome content, global translation rate and cell viability upon RPS26 depletion. We have therefore addressed some of these issues in the revised version of the manuscript. In particular, we showed that RPS26 and TSR2 knockdown did not inhibit global translation (new Figure 2B & 4C); hence, we concluded that the changes in FMRpolyG level did not arise from a general translational shutdown. On the other hand, RPS26, RPS25 and RPS6 depletion negatively affected cell proliferation (new Figure 2A,5D,6C), which is in line with a number of previously published studies (e.g. Cheng et al, 2019; Havkin-Solomon et al, 2023). However, the extent of the proliferation abnormalities is limited. We agree that the observed effects on RAN translation from mutant FMR1 mRNA may stem from a combination of altered protein synthesis and the condition of the cells, but also from cis-acting factors of mRNA sequence/structure. In new experiments, we showed that single-nucleotide substitution of ACG by other near-cognate start codons changes the sensitivity of RAN translation to RPS26 insufficiency (new Figure 4F). In addition, the inhibitory effect of an antisense oligonucleotide binding to the region of the 5’UTR containing the ACG initiation codon (ASO_ACG) differs between cells with different amounts of RPS26 (new Figure 4G).
We also agree that our data only partially support the role of RPS26-deficient ribosomes in RAN translation. Therefore, we have toned down our claims. We now state that RPS26/25/TSR2 insufficiency or depletion affects RAN translation. We also changed the title of the manuscript to: “Insufficiency of 40S ribosomal proteins, RPS26 and RPS25, negatively affects biosynthesis of polyglycine-containing proteins in fragile-X associated conditions” (previously: “Ribosomal composition affects the noncanonical translation and toxicity of polyglycine-containing proteins in fragile X-associated conditions”).
Specific points:
(1) Analysis of the mass spec data in Supplemental Table S3 indicates that for many of the proteins that are differentially enriched in one sample, a single peptide is identified. So the difference is between 1 peptide and 0. I don't understand how one can do a statistical analysis on that, or how it would give out anything of significance. I certainly do not think it is significant. This is exacerbated by the fact that the contaminants in the assay (keratins) are many, many-fold more abundant, and so are proteins that are known to be mitochondrial or nuclear, and therefore likely not actual targets (e.g. MCCC1, PC, NPM1; this includes many proteins "of significance" in Table S1, including Rrp1B, NAF1, Top1, TCEPB, DHX16, etc...).
The data in Table S6/Figure 3A suffer from the same problem.
I am not convinced that the mass spec data is reliable.
We thank the Reviewer for the comment concerning the MS data; however, we believe that it may stem from a misunderstanding of the data presented in Tables S3 and S6. Both tables represent the output from MaxQuant analysis (the so-called ProteinGroup) of MS .raw files, without any filtering. As stated in the Materials and Methods, we applied the default parameters suggested by the MaxQuant developers to analyze the MS data. These include identification of proteins based on at least 1 unique peptide, and thus some proteins with only 1 unique peptide are shown in Tables S1 and S3. The Reviewer is also right that common contaminants, such as keratins, are included in this output table. However, these identifications are denoted as “CON_” and are filtered out during statistical analysis in the Perseus software. During the statistical analysis, we first filtered out irrelevant protein group identifications, such as contaminants or those only identified by site modifications.
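The filtering step described above was performed in Perseus; purely as an illustration of the logic, it can be sketched in plain Python. The sketch assumes that each protein group is a row with the flag columns "Potential contaminant", "Reverse" and "Only identified by site" marked with "+", and a "Unique peptides" count, which is how recent MaxQuant proteinGroups output is typically laid out. The function name and the protein identifiers are hypothetical.

```python
# Flag columns assumed to be marked with "+" in a MaxQuant-style
# proteinGroups table (an assumption about the input layout).
FLAG_COLUMNS = ("Potential contaminant", "Reverse", "Only identified by site")


def filter_protein_groups(rows, min_unique_peptides=1):
    """Drop flagged rows and rows below the unique-peptide threshold."""
    kept = []
    for row in rows:
        if any(row.get(col, "") == "+" for col in FLAG_COLUMNS):
            continue  # contaminant, decoy, or site-only identification
        if int(row.get("Unique peptides", 0)) < min_unique_peptides:
            continue
        kept.append(row)
    return kept


# Hypothetical toy rows, not data from the manuscript.
rows = [
    {"Protein IDs": "PROT_A", "Unique peptides": "3"},
    {"Protein IDs": "CON__KERATIN_X", "Unique peptides": "12",
     "Potential contaminant": "+"},
    {"Protein IDs": "PROT_B", "Unique peptides": "1"},
    {"Protein IDs": "PROT_C", "Unique peptides": "2",
     "Only identified by site": "+"},
]

default_ids = [r["Protein IDs"] for r in filter_protein_groups(rows)]
strict_ids = [r["Protein IDs"] for r in filter_protein_groups(rows, 2)]
print(default_ids)
print(strict_ids)
```

Raising `min_unique_peptides` to 2 shows how a stricter identification criterion would remove the single-peptide entries the Reviewer was concerned about.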
We have changed the names of the Supplementary Table files to give a more detailed description. We hope this will help the broader readership avoid misunderstanding. Secondly, when comparing the data presented in Table S3 with the volcano plot in Figure 1B, one can see that the majority of identified proteins are indeed not statistically significant (grey points) and were thus not selected for further stratification. The lack of significance of these proteins may be partially due to poor MS identification; however, they are not included in the following parts of the manuscript. Further, we selected only eight proteins (out of over 150) for stratification by orthogonal techniques, and we argue that this step validates the biological relevance of the chosen candidate RAN translation modifiers. One should also keep in mind that pull-down samples analyzed by MS often yield lower intensity and identification rates than whole-cell analysis, as a result of lower protein input or the stringent washes used during sample preparation.
Regarding the data presented in Table S6 (SILAC data), we argue that these data are of very good quality. More than 2,000 proteins were identified in a 125 min gradient, with over 80% of proteins identified with at least 2 unique peptides. Each of three biological replicates was analyzed three times (technical replicates), giving a total of 9 high-resolution MS runs. Together, we strongly believe that these data are of high confidence.
(2) The mass-spec data however claims to identify Rps26 as a factor binding the toxic RNA specifically. The rest of the paper seeks to develop a story of how Rps26-deficient ribosomes play a role in the translation of this RNA. I do not consider that this makes sense.
Indeed, we identified RPS26 as a protein that co-precipitated with FMR1 RNA containing expanded CGG repeats (Supplementary Figure 1G) and found that depletion of RPS26 hindered RAN translation of FMRpolyG, suggesting that RPS26 positively affects RAN translation. However, we did not state that RPS26 directly interacts with the toxic RNA. To confirm the specificity of RAN translation regulation by RPS26 insufficiency, we tested whether depletion of another 40S ribosomal protein, RPS6, affects FMRpolyG synthesis. Our experiments showed no significant effect on RAN translation efficiency after RPS6 silencing (new Figure 5C). Importantly, we showed that RPS26 depletion did not inhibit global translation (new Figure 2B). In addition, mutagenesis of the near-cognate start codon (new Figure 4F) and ASO_ACG treatment (new Figure 4G) provided evidence that modulation of FMRpolyG biosynthesis by the RPS26 level may depend on start codon selection. In essence, our data suggest that RPS26 depletion specifically affects synthesis of FMRpolyG, but not of FMRP derived from the same FMR1 mRNA with CGGexp. However, we do not claim that the observed effect is the consequence of a direct interaction between RPS26 and the 5’UTR of FMR1 mRNA. Downregulation of FMRpolyG biosynthesis could be an outcome of altered ribosomal assembly, decreased efficiency and fidelity of PIC scanning/initiation, impeded elongation, or a combination of all these processes. In the manuscript, we presented the results of experiments that tested many of these possibilities.
(3) Rps26 is an essential gene, I am sure the same is true for DHX15. What happens to cell viability? Protein synthesis? The yeast experiments were carefully carried out under experiments where Rps26 was reduced, not fully depleted to give small growth defects.
We agree with the Reviewer that RPS26 and DHX15 are essential proteins, similarly to all RNA binding proteins, and caution should be taken during experimental design. To address this, we titrated different concentrations of siRPS26, and found that administration of 5 nM siRPS26, which just partially silenced RPS26, decreased FMRpolyG by around 50% (new Figure 1D). This impact was even greater with 15 nM siRPS26, as we observed around 80% decrease of FMRpolyG.
Havkin-Solomon et al. (2023) showed that the proliferation rate is decreased in cells with a mutated C-terminus of RPS26, which is required for contacting mRNA. In accordance with this study, we showed that cells with knocked-down RPS26 proliferate less efficiently (new Figure 2A), but depletion of RPS26 did not impact global translation (new Figure 2B). In addition, our SILAC-MS data indicate that ~80% of proteins with a determined expression level were not affected by RPS26 insufficiency, whereas ~20% of the proteins turned out to be sensitive to the RPS26 decrease. These data, however, do not take protein stability into account.
(4) Knockdown efficiency for all tested genes must be shown to evaluate knockdown efficiency.
The current version of the manuscript contains representative western blots with validation of knock-down efficiency (for example in Figure 3B, C, E, Figure 6A) and we included knock-down validations where applicable (Figures 1D, 2B, 4G and 5C).
(5) The data in Figure 1E have just one mock control, but two cell types (control si and Rps26 depletion).
The mock control corresponds to cells treated with lipofectamine reagent and was included in the study to determine the “background” signal from cells treated with the delivery agent and the reagents used to measure apoptosis. These cells expressed neither FMRpolyG nor siRNAs. Luminescence signals were normalized to the values obtained from the mock control. We added more details describing this assay to the Figure 1 legend.
(6) The authors' data indicate that the effects are not specific to Rps26 but indeed also observed upon Rps25 knockdown. This suggests strongly that the effects are from reduced ribosome content or blocked protein synthesis. Additional controls should deplete a core RP to ascertain this conclusion.
We agree that the observed effects may stem from reduced ribosome content; however, we argue that this is not the only possible explanation. Previously, it was shown that RPS25 regulates G4C2-related RAN translation, but knockout of RPS25 does not affect global translation (Yamada S, 2019, Nat. Neuroscience). Similarly, we showed that KD of RPS26 or TSR2 did not significantly reduce the global translation rate (SUnSET assay; new Figure 2B and 5C, respectively).
Moreover, in the new version of the manuscript we included a control experiment in which we silenced a core ribosomal protein (RPS6) and found that RPS6 depletion did not affect RAN translation from mutant FMR1 mRNA (new Figure 5C), thus strengthening our conclusion about specific RAN translation regulation by the level of RPS26 and RPS25.
Finally, our observation aligns well with current knowledge about how deficiency of different ribosomal proteins alters translation of some classes of mRNAs (Luan Y, 2022, Nucleic Acids Res; Cheng Z, 2019, Mol Cell). It was shown that depletion of RPS26 affects translation rate of different mRNAs compared to depletion of other proteins of small ribosomal subunit.
(7) Supplemental Figure S3 demonstrates that the depletion of S26 does not affect the selection of the start codon context. Any other claim must be deleted. All the 5'-UTR logos are essentially identical, indicating that "picking" happens by abundance (background).
Supplementary Figure 3D shows that the mutation at the -4 position (from G to A) did not affect RAN translation regardless of RPS26 presence or depletion. However, this result does not imply that RPS26 does not affect start codon selection in a sequence- or RNA structure-dependent manner. We verified this particular -4 position because it was previously suggested to be an important RPS26-sensitive site in yeast (Ferretti M, 2017, Nat Struct Mol Biol). We agree with the Reviewer that none of the 5’UTR logos presented in our paper showed statistical significance at any tested position for human mRNAs. In contrast, we observed that regulation sensitive to the RPS26 level depends on the start codon used for RAN translation, in particular ACG initiation (new Figure 4F&G). RPS26 depletion affected ACG-initiated but not GTG- or CTG-initiated RAN translation.
In the previous version of the manuscript, we wrote that we did not identify any specific motifs or enrichment within the analyzed transcripts in comparison to the background. On the other hand, we found that the GC content among the analyzed transcripts is higher within 5’UTRs and in close proximity to the ATG in coding sequences (Figure 4D), which suggests the importance of stable RNA structures in this region. In addition, we showed that mRNAs encoding proteins that respond to RPS26 depletion have shorter than average 5’UTRs (new Figure 4E).
(8) Mechanism is lacking entirely. There are many ways in which ribosomes could have mRNA-specific effects. The authors tried to find an effect from the Kozak sequence, unsuccessfully (however, they also did not do the experiment correctly, as they failed to recognize that the Kozak sequence differs between yeast, where it is A-rich, and mammalian cells, where it is GGCGCC). Collisions could be another mechanism.
Indeed, collisions as well as other mechanisms, such as skewed start codon fidelity, may affect the efficiency of FMRpolyG biosynthesis. In the current version of the manuscript, we show that regulation sensitive to the amount of RPS26 seems to depend on start codon selection (new Figure 4F&G).
Reviewer #2 (Public Review):
Summary:
Translation of CGG repeats leads to the accumulation of poly G, which is associated with neurological disorders. This is a valuable paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in Hela cells bound to CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. There are data supporting the claim that RPS26 depletion modulates RAN translation in this RNA, although for some results, the Western results are not strong. The data to support increased aggregation by polyG expression upon S26 KD are incomplete.
We thank the Reviewer for critical comments and suggestions. We sincerely appreciate your rigorous, insightful, and constructive feedback throughout the revision process.
Below each specific point, we addressed the mentioned issues.
Strengths:
The authors have proteomics data that show the enrichment of a set of proteins on FMR1 RNA but not a related RNA.
We thank the Reviewer for appreciating the MS-screening results, which identified proteins enriched on FMR1 RNA with expanded CGG repeats.
Weaknesses:
- It is insinuated that RPS26 binds the RNA to enhance CGG-containing protein expression. However, RPS26 reduction was also shown previously to affect ribosome levels, and reduced ribosome levels can result in ribosomes translating very different RNA pools.
In the previous version of the manuscript, we did not state that RPS26 binds directly to RNA with expanded CGG repeats, and we did not show an experiment indicating a direct interaction between the studied RNA and RPS26. What we showed is that RPS26 was enriched in the FMR1 RNA MS samples; however, we did not verify whether this interaction is direct or indirect. We also tested the hypothesis that the lack of RPS26 in the PIC may affect the efficiency of RAN translation initiation via a specific Kozak context previously described in yeast (Ferretti M, 2017, Nat Struct Mol Biol). As we described, this hypothesis was not supported. However, we showed that other features of 5’UTR sequences (e.g. higher GC content or a shorter leader sequence) are potentially important for translation efficiency in cells with depleted RPS26.
Indeed, RPS26 is involved in 40S maturation steps (Plassart L, 2021, eLife), and its insufficiency, mutation, or blocked incorporation into the 40S ribosome may result in incomplete 40S maturation, which might subsequently negatively affect translation per se. However, we did not observe global translation inhibition after depletion of RPS26 or of TSR2, the chaperone involved in incorporation/exchange of RPS26 into the small ribosomal subunit (new Figure 2B and 5C). In addition, our SILAC-MS data indicate that the majority of studied proteins (including FMRP, the main product of the FMR1 gene) were not affected by RPS26 depletion, which can be cautiously extrapolated to global translation. In the revised manuscript, we also showed that relatively mild silencing of RPS26 also decreased FMRpolyG production in model cells (new Figure 1D).
We agree that reduced ribosome levels can result in different translation efficiencies for different RNA pools. We have emphasized this statement in the revised manuscript. However, we also showed that introducing different near-cognate start codons specific to RAN translation into the same mRNA (single- or two-nucleotide substitutions), or targeting this codon with antisense oligonucleotides, altered the sensitivity of FMR1 mRNA translation to RPS26 depletion (new Figure 4F).
- A significant claim is that RPS26 KD alleviates the effects of FMRpolyG expression, but those data aren't presented well.
We thank the Reviewer for this comment. In the new version of the manuscript, we have added new microscopic images and improved the explanation of Figure 1E. We have also completed the interpretation of Figure 1F in the main text, the figure itself, and the figure legend, and we hope that these changes will improve understanding of our data.
Recommendations For The Authors:
- A significant claim is that RPS26 KD alleviates the effects of FMR polyG expression, but those data aren't presented well:
Figure 1D (supporting data in S2) and 2D - the authors need to show representative images of a control that has aggregation and indicate aggregates being counted on an image. The legend states that there are no aggregates, but the quantification of aggregates/nucleus is ~1, suggesting there are at least 1 per cell. It is preferred to show at least a representative of what is quantified in the main figure instead of a bar graph.
Representative images of control and siRPS26-treated cells are now shown in the revised version of Figure 1E. Additionally, we completed the figure legend for this part and extended the description of the experiment in the Materials & Methods section.
Figure 1E - it is unclear what luminescence signal is being measured. Is this a dye for an apoptotic marker? More information is needed in the legend.
This information was added to the legend of modified Figure 1F (previously 1E) as suggested.
- Some of the Western blots are not very convincing. Better evidence for the changes in bar graphs would improve how convincing the data are:
Fig 2B. The western for FMR95G in the first model is not very convincing. The difference by eye for the second siRNA seems to give a larger effect than the first for 95G construct but they appear almost the same on the graph. More supporting information for the quantification is needed.
We provided a better explanation of WB quantification in the M&M section of the manuscript. Also, we provided an additional blot showing an independent biological replicate of this experiment in the supplementary materials (Supplementary Figure S2E).
Figure 4A, the blots for RPS26 and FMR95G are not convincing. They are quite smeary compared to all of the others shown for these proteins in other figures. Could a different replicate be shown?
We provided an additional blot demonstrating the effect of TSR2 depletion on transiently expressed FMRpolyG in the COS7 cell line (Supplementary Figure S4A).
Figure 5A and 5B blots are not ideal. Could a different replicate be shown? Or show multiple replicates in the supplemental figure?
We provided additional blots from the same experiment, although the data are not statistically significant, most likely due to the low quality of the normalization factor, Vinculin (Supplementary Figure S5A). Nevertheless, the level of FMRpolyG is decreased by ~70% after RPS25 silencing in SH-SY5Y cells.
Figure 2C. Please use the same y axes for all four Westerns in B and C. One would like to compare 95 and 15 repeats, but it is difficult when the y axes are different.
Thank you for this comment. The y axis was adjusted as suggested by the Reviewer.
Figure 3D-The text suggests a significant difference between positive and negative responders that is not clear in the figure.
In the main body of the manuscript we state that: “We did not observe any significant differences in the frequency of individual nucleotide positions in the 20-nucleotide vicinity of the start codon relative to the expected distribution in the BG”, which is in line with the graph shown in Figure 4D (previously 3D).
Reviewer #3 (Public Review):
Tutak et al provide interesting data showing that RPS26 and relevant proteins such as TSR2 and RPS25 affect RAN translation from CGG repeat RNA in fragile X-associated conditions. They identified RPS26 as a potential regulator of RAN translation by RNA-tagging system and mass spectrometry-based screening for proteins binding to CGG repeat RNA and confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models and patient-derived fibroblasts. Quantitative mass spectrometry analysis found that the expressions of some ribosomal proteins are sensitive to RPS26 depletion while approximately 80% of proteins including FMRP were not influenced. Since the roles of ribosomal proteins in RAN translation regulation have not been fully examined, this study provides novel insights into this research field. However, some data presented in this manuscript are limited and preliminary, and their conclusions are not fully supported.
(1) While the authors emphasized the importance of ribosomal composition for RAN translation regulation in the title and the article body, the association between RAN translation and ribosomal composition is apparently not evaluated in this work. They found that specific ribosomal proteins (RPS26 and RPS25) can have regulatory effects on RAN translation (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B), and that the expression levels of some ribosomal proteins can be changed by RPS26 knockdown (Figure 3B, however, the change of the ribosome compositions involved in the actual translation has not been elucidated). Therefore, their conclusive statement, that is, "ribosome composition affects RAN translation" is not fully supported by the presented data and is misleading.
We thank the Reviewer for critical comments and suggestions. We agree that the initial title and some statements in the text were misleading and that the presented data did not fully support the aforementioned statement regarding ribosomal composition affecting FMRpolyG synthesis. Therefore, in the revised version of the manuscript we included a control experiment indicating that depletion of another core 40S ribosomal protein (RPS6) did not impact FMRpolyG synthesis (new Figure 5C), which supports our hypothesis that RPS26 and RPS25 are specific CGG-related RAN translation modifiers. To deliver the main message of our work precisely, we changed the title to indicate the specific effect of RPS26 and RPS25 insufficiency on RAN translation of FMRpolyG. Proposed title: “Insufficiency of 40S ribosomal proteins, RPS26 and RPS25 negatively affects biosynthesis of polyglycine-containing proteins in fragile-X associated conditions”. We also changed all statements regarding “ribosomal composition” in the main text of the new version of the manuscript.
(2) The study provides insufficient data on the mechanisms of how RPS26 regulates RAN translation. Although authors speculate that RPS26 may affect initiation codon fidelity and regulate RAN translation in a CGG repeat sequence-independent manner (Page 9 and Page 11), what they really have shown is just identification of this protein by the screening for proteins binding to CGG repeat RNA (Figure 1A, 1B), and effects of this protein on CGG repeat-RAN translation. It is essential to clarify whether the regulatory effect of RPS26 on RAN translation is dependent on CGG repeat sequence or near-cognate initiation codons like ACG and GUG in the 5' upstream sequence of the repeat. It would be better to validate the effects of RPS26 on translation from control constructs, such as one composed of the 5' upstream sequence of FMR1 with no CGG repeat, and one with an ATG substitution in the 5' upstream sequence of FMR1 instead of near-cognate initiation codons.
We agree that the data presented in the manuscript imply that insufficiency of RPS26 plays a pivotal role in the regulation of CGG-related RAN translation, and in the revised version of the manuscript we included a series of experiments indicating that ACG codon selection seems to be an important part of RPS26 level-dependent regulation of polyglycine production (new Figure 4F&G; see point 3 below for more details). Importantly, in the luciferase assay shown in Figure 4F we used the AUG-initiated firefly luciferase reporter as a normalization control.
Moreover, to verify whether the FMRpolyG response to RPS26 deficiency depends on the type of reporter used, we repeated many experiments using FMRpolyG fused with different tags. The luciferase-based assays were in line with experiments conducted on constructs with a GFP tag (new Figure 1D), thus strengthening our previous data. In addition, in this series of experiments we show that FMRP synthesis, which is initiated from the ATG codon located in FMR1 exon 1, was not affected by RPS26 depletion (Figure 3E & 4C), even though its translation occurs on the same mRNA as FMRpolyG. This indicates specific RPS26 regulation of the polyglycine frame initiated from the ACG near-cognate codon.
(3) The regulatory effects of RPS26 and other molecules on RAN translation have all been investigated as effects on the expression levels of FMRpolyG-GFP proteins in cellular models expressing CGG repeat sequences (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B). In these cellular experiments, there are multiple confounding factors affecting the expression levels of FMRpolyG-GFP proteins other than RAN translation, including template RNA expression, template RNA distribution, and FMRpolyG-GFP protein degradation. Although authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the effect of these regulators on RAN translation by other experiments such as in vitro translation assay that can directly evaluate RAN translation.
We agree that multiple factors affect the final levels of FMRpolyG-GFP proteins, including the aforementioned processes. We evaluated the level of FMR1 mRNA, which was not decreased upon RPS26 depletion (Figure 3B&C); therefore, we assumed that what we observed was regulation at the level of translation, especially since RPS26 is a ribosomal protein that contacts mRNA in the E-site. We believe that direct assays such as in vitro translation may be beneficial; however, depleting RPS26 from the cellular lysate provided by the vendor seems technically challenging, if not impossible. Instead, we focused on sequence/structure-specific regulation of RAN translation, with an emphasis on start codon selection. This generated valuable results pointing to a role of RPS26 in start codon fidelity (Figure 4F&G). These new results showed that translation from mRNAs differing only by a single- or two-nucleotide substitution in the near-cognate start codon (ACG to GUG or ACG to CUG), although it results in exactly the same protein, is differently sensitive to RPS26 silencing (new Figure 4F). Similar differences were observed for translation efficiency from the same mRNA targeted or not with an antisense oligonucleotide complementary to the region of the RAN translation initiation codon (new Figure 4G). These results also suggest that the stability of FMRpolyG is not affected in cells with a decreased level of RPS26.
(4) While the authors state that RPS26 modulated the FMRpolyG-mediated toxicity, they presented limited data on apoptotic markers, not cellular viability (Figure 1E), not fully supporting this conclusion. Since previous work showed that FMRpolyG protein reduces cellular viability (Hoem G, 2019,Front Genet), additional evaluations for cellular viability would strengthen this conclusion.
We thank the Reviewer for this suggestion. We addressed the apoptotic process in order to determine the effect of RPS26 depletion on RAN translation-related toxicity (Figure 1F). In the revised version of the manuscript, we also added an evaluation of how cell proliferation was affected by RPS26, RPS25, RPS6 and TSR2 depletion. Our data indicate that TSR2 silencing slightly impacted cellular fitness (new Figure 5D), whereas insufficiencies of RPS26, RPS25 and RPS6 had a much stronger negative effect on proliferation (new Figure 2A, 5D, 6C), which is in line with previous data (Cheng Z, 2019, Mol Cell; Luan Y, 2022, Nucleic Acids Res). The difference in proliferation rate after treatment with siRPS26 makes proper interpretation of cellular viability assessments very difficult.
Recommendations For The Authors:
(1) It would be nice to validate the effects of overexpression of RPS26 and other regulators on RAN translation, not limited to knockdown experiments, to support the conclusion.
We did not perform such experiments because we believe that RPS26 overexpression may have no effect, or only a marginal one, on translation or RAN translation. It is likely impossible to efficiently incorporate overexpressed RPS26 into 40S subunits, because the concentration of all ribosomal proteins in the cells is very high.
(2) It would be better to explain how authors selected 8 proteins for siRNA-based validation (Figure 1C, 1D, S1D) from 32 proteins enriched in CGG repeat RNA in the first screening.
We selected those candidates based on their functions connected to translation, structured RNA unwinding or mRNA processing. For example, we tested a few RNA helicases because of their known function in RAN translation regulation described by other researchers. This explanation was added to the revised version of the manuscript.
(3) Original image data showing nuclear FMRpolyG-GFP aggregates should be presented in Figure 1D.
Representative images of control and siRPS26-treated cells are now shown in the modified version of Figure 1E and described in more detail in the legend.
(4) Image data in Figure 2A and 2D have poor signal/noise ratio and the resolution should be improved. In addition, aggregates should be clearly indicated in Figure 2D in an appropriate manner.
The stable S-FMR95xG cellular model is characterized by very low expression of RAN-translated FMR95xG; therefore, it is challenging to obtain microscopic images of better quality with a higher GFP signal. In the L-99xCGG model, expression of the transgene is higher. Therefore, we provided a new image in the new version of Figure 3D (former 2D). Moreover, we showed aggregates in an image obtained using confocal microscopy (new Supplementary Figure 2D).
(5) The detailed information on patient-derived fibroblast (age and sex of the patient, the number of CGG repeats, etc.) in Figure 2F needed to be presented.
This information was added to the figure legend (Figure 3F; previously 2F) and to the Materials and Methods section as suggested.
(6) It would be better to normalize RNA expression levels of FMR1 and FMR1-GFP by the housekeeping gene in Figure S2C, like other RT-qPCR experimental data such as Figure 2B.
Normalization of FMR1-GFP to GAPDH is now shown in the modified version of Figure S2C (right graph) as requested by the Reviewer.
(7) It would be better to add information on molecular weight on all Western blotting data.
Marks corresponding to the molecular weight ladder were added to all images.
Full blots, including protein ladders, were deposited in the Zenodo repository under doi: 10.5281/zenodo.13860370
References
Cheng Z, Mugler CF, Keskin A, Hodapp S, Chan LYL, Weis K, Mertins P, Regev A, Jovanovic M & Brar GA (2019) Small and Large Ribosomal Subunit Deficiencies Lead to Distinct Gene Expression Signatures that Reflect Cellular Growth Rate. Mol Cell 73: 36-47.e10
Havkin-Solomon T, Fraticelli D, Bahat A, Hayat D, Reuven N, Shaul Y & Dikstein R (2023) Translation regulation of specific mRNAs by RPS26 C-terminal RNA-binding tail integrates energy metabolism and AMPK-mTOR signaling. Nucleic Acids Res 51: 4415–4428
Hoem G, Larsen KB, Øvervatn A, Brech A, Lamark T, Sjøttem E & Johansen T (2019) The FMRpolyGlycine protein mediates aggregate formation and toxicity independent of the CGG mRNA hairpin in a cellular model for FXTAS. Front Genet 10: 1–18
Luan Y, Tang N, Yang J, Liu S, Cheng C, Wang Y, Chen C, Guo YN, Wang H, Zhao W, et al (2022) Deficiency of ribosomal proteins reshapes the transcriptional and translational landscape in human cells. Nucleic Acids Res 50: 6601–6617
Plassart L, Shayan R, Montellese C, Rinaldi D, Larburu N, Pichereaux C, Froment C, Lebaron S, O’Donohue MF, Kutay U, et al (2021) The final step of 40S ribosomal subunit maturation is controlled by a dual key lock. eLife 10
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
It would be great if the authors could add clarification about the NMDS analyses and the associated results (Fig. 1, Table 1 and Tables S2-4). The overall aim of these analyses was to see how plot characteristics (e.g. canopy cover) and composition of one taxonomic group were related to the composition of another taxonomic group. The authors quantified species composition by two axes from NMDS. (1) This analysis may yield an interpretation problem: if we only find one of the axes, but not the other, was significantly related to one variable, it would be difficult to determine whether that specific variable is important to the species composition because the composition is co-determined by two axes. (2) It is unclear how the authors did the correlation analyses for Tables S2-4. If correlation coefficients were presented in these tables, then these coefficients should be the same or very similar if we switch the positions of y vs. x. That is, the correlation between host vs. parasite phylogenetic composition would be very close to the correlation between parasite vs. host phylogenetic composition, but not as the author found that these two relationships were quite different, leading to the interpretation of bottom-up or top-down processes. It is also unclear which correlation coefficient was significant or not because only one P value was provided per row in these tables. (3) In addition to the issues of multiple axes (point 1), NMDS axes simply define the relative positions of the objects in multi-dimensional space, but not the actual dissimilarities. Other methods, such as generalized dissimilarity modeling, redundancy analysis and MANOVA, can be better alternatives.
Thank you for the thorough and constructive review. We have taken the concerns and questions raised by the editors and reviewers into account and provided clarification about the NMDS analyses as well as additional analyses to confirm our results. First, we have now added a brief explanation in the manuscript regarding the interpretation of the two NMDS axes and how they relate to species composition. Specifically, we clarified that while NMDS defines the relative positions of objects in multi-dimensional space, the two axes together provide a more comprehensive representation of the community composition, which is not solely determined by either axis independently. Second, we acknowledge that alternative approaches could help further strengthen our conclusions. To address this, we incorporated Mantel tests and PERMANOVA (with ‘adonis2’) as additional validation methods. These analyses allowed us to summarize compositional patterns while testing our hypotheses within the framework of the plot characteristics and taxonomic relationships. We have added these analyses and their results in the manuscript to reinforce our findings.
In methods: L478-481 “To strengthen the robustness of our findings based on NMDS, we further validated the results using Mantel test and PERMANOVA (with ‘adonis2’) for correlation between communities and relationships between communities and environmental variables.”
L469-475 “NMDS was used to summarize the variation in species composition across plots. The two axes extracted from the NMDS represent gradients in community composition, where each axis reflects a subset of the compositional variation. These axes should not be interpreted in isolation, as the overall species composition is co-determined by their combined variation. For clarity, results were interpreted based on the relationships of variables with the compositional gradients captured by both axes together."
In results: L172-177 “The PERMANOVA analysis also highlighted the important role of canopy cover for host and parasitoid community (Table S6-9). The Mantel test revealed a consistent pattern with the NMDS analysis, highlighting a pronounced relationship between the species composition of hosts and parasitoids (Table S10). However, the correlation between the phylogenetic composition of hosts and parasitoids was not significant.”
In discussion: L257-261 “However, this significant pattern was observed only in the NMDS analysis and not in the Mantel test, suggesting that the non-random interactions between hosts and parasitoids could not be simply predicted by their community similarity and associations between the phylogenetic composition of hosts and parasitoids are more complex and require further investigation in the future.”
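As an illustration of the Mantel test mentioned in the response above, the following is a minimal Python sketch and not the authors' code (the response names only ‘adonis2’ and does not say how the Mantel test was run). It assumes two square distance matrices over the same plots, a Pearson correlation between their upper triangles, and a one-sided permutation p-value; the function names `mantel_test` and `euclidean_distances` are hypothetical.

```python
import numpy as np


def euclidean_distances(points):
    """Pairwise Euclidean distance matrix for an (n_plots, n_features) array."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def mantel_test(d1, d2, n_perm=999, seed=0):
    """One-sided permutation Mantel test between two distance matrices.

    Returns the Pearson correlation of the upper triangles and the
    proportion of permutations with a correlation at least as large.
    """
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    if d1.shape != d2.shape or d1.ndim != 2 or d1.shape[0] != d1.shape[1]:
        raise ValueError("need two square matrices of equal size")
    n = d1.shape[0]
    upper = np.triu_indices(n, k=1)  # pairs of plots, diagonal excluded
    r_obs = np.corrcoef(d1[upper], d2[upper])[0, 1]

    rng = np.random.default_rng(seed)
    at_least_as_large = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Relabel plots in d1 only: rows and columns are permuted together.
        d1_perm = d1[np.ix_(perm, perm)]
        r_perm = np.corrcoef(d1_perm[upper], d2[upper])[0, 1]
        if r_perm >= r_obs:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return r_obs, p_value
```

Because the statistic is a correlation, swapping the two matrices leaves it unchanged, which is the symmetry the reviewing editor expected in point (2).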
-- One additional minor point: "site" would be better set as a fixed rather than random term in the linear mixed-effects models, because the site number (2) is too small to make a proper estimate of random component.
We now treat “site” as a fixed factor in our models, interacting with tree species richness/tree MPD and tree functional diversity, to reflect the spatial and tree compositional variation between the two sites. The main results did not change: both sites showed consistent patterns for the effects of tree richness/MPD on network metrics, although the effects were more pronounced at one site.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The authors analyzed how biotic and abiotic factors impact antagonistic host-parasitoid interaction systems in a large BEF experiment. They found the linkage between the tree community and host-parasitoid community from the perspective of the multi-dimensionality of biodiversity. Their results revealed that the structure of the tree community (habitat) and canopy cover influence host-parasitoid compositions and their interaction pattern. This interaction pattern is also determined by phylogenetic associations among species. This paper provides a nice framework for detecting the determinants of network topological structures.
Strengths:
This study was conducted using a five-year sampling in a well-designed BEF experiment. The effects of the multi-dimensional diversity of tree communities have been well explained in a forest ecosystem with an antagonistic host-parasitoid interaction.
The network analysis has been well conducted. The combination of phylogenetic analysis and network analysis is uncommon among similar studies, especially for studies of trophic cascades. Still, this study has discussed the effect of phylogenetic features on interacting networks in depth.
Weaknesses:
(1) The authors should examine species and interaction completeness in this study to confirm that their sampling efforts are sufficient.
(2) The authors only used Rao's Q to assess the functional diversity of tree communities. However, multiple metrics of functional diversity exist (e.g., functional evenness, functional dispersion, and functional divergence). It is better to check the results from other metrics and confirm whether these results further support the authors' results.
(3) The authors did not elaborate on which extinction sequence was used in robustness analysis. The authors should consider interaction abundance in calculating robustness. In this case, the author may use another null model for binary networks to get random distributions.
(4) The causal relationship between host and parasitoid communities is unclear. Normally, it is easy to understand that host community composition (low trophic level) could influence parasitoid community composition (high trophic level). I suggest using the 'correlation' between host and parasitoid communities unless there is strong evidence of causation.
Thank you very much for your thoughtful and constructive review of our manuscript. We have carefully addressed your comments and made several revisions to improve the clarity and robustness of our work.
1) We appreciate your suggestion regarding species and interaction completeness. To confirm that our sampling efforts were sufficient, we have now included a figure (Fig. S1) showing the species accumulation curve and the coverage of interactions in our study. This ensures that the data collected provide a comprehensive representation of the system.
2) Regarding the use of only Rao’s Q to assess functional diversity, we acknowledge that multiple metrics of functional diversity exist. However, due to the large number of predictors in our analysis, we decided to streamline our approach and focus on Rao’s Q as it provides a robust measure for our research objectives. We have discussed this decision in the revised manuscript and clarified that, while additional metrics could be informative, we believe Rao’s Q sufficiently captures the key aspects of functional diversity in our study.
3) We have elaborated on the robustness analysis and the null model used in our study. Specifically, we have now clarified which extinction sequence (random extinction) was used in our manuscript, and explained that interaction abundance was incorporated into the robustness calculations (networklevel function, weighted=TRUE; see L506).
4) We have revised the text to clarify the relationship between host and parasitoid communities. As you correctly pointed out, while it is intuitive that host community composition influences parasitoid community composition, we have reframed our analysis to emphasize the correlation between the two communities rather than implying causation without strong evidence. We have revised the manuscript to reflect this distinction.
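To make the random-extinction robustness mentioned in point 3 concrete, here is a simplified Python sketch under stated assumptions; it is not the authors' calculation. The response reports using the networklevel function with the weighted option, whereas this sketch treats the web as binary: hosts are removed in random order, a parasitoid counts as lost once none of its hosts remain, and robustness is the area under the resulting survival curve. The function name `robustness_random_extinction` is hypothetical.

```python
import numpy as np


def robustness_random_extinction(web, n_rep=100, seed=0):
    """Mean area under the parasitoid survival curve (binary web).

    web: 2D array with hosts in rows and parasitoids in columns; any
    entry > 0 is treated as an observed interaction.
    """
    present = np.asarray(web) > 0
    if present.ndim != 2 or 0 in present.shape:
        raise ValueError("web must be a non-empty 2D array")
    n_hosts = present.shape[0]
    rng = np.random.default_rng(seed)

    areas = []
    for _ in range(n_rep):
        order = rng.permutation(n_hosts)  # random extinction sequence
        alive = np.ones(n_hosts, dtype=bool)
        # Fraction of parasitoids that still have at least one host.
        surviving = [present[alive].any(axis=0).mean()]
        for host in order:
            alive[host] = False
            surviving.append(present[alive].any(axis=0).mean())
        surviving = np.array(surviving)
        # Trapezoid rule over the fraction of hosts removed (0 to 1).
        area = ((surviving[:-1] + surviving[1:]) / 2).sum() / n_hosts
        areas.append(area)
    return float(np.mean(areas))
```

In this binary form, a web of strict specialists (one host per parasitoid) gives an area of 0.5, while parasitoids that use every host persist until the last host is removed and the area approaches 1.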
Reviewer #2 (Public Review):
Summary:
In their manuscript, Multi-dimensionality of tree communities structure host-parasitoid networks and their phylogenetic composition, Wang et al. examine the effects of tree diversity and environmental variables on communities of reed-nesting insects and their parasitoids. Additionally, they look for the correlations in community composition and network properties of the two interacting insect guilds. They use a data set collected in a subtropical tree biodiversity experiment over five years of sampling. The authors find that the tree species, functional, and phylogenetic diversity as well as some of the environmental factors have varying impacts on both host and parasitoid communities. Additionally, the communities of the host and parasitoid showed correlations in their structures. Also, the network metrics of the host-parasitoid network showed patterns against environmental variables.
Strengths:
The main strength of the manuscript lies in the massive long-term data set collected on host-parasitoid interactions. The data provides interesting opportunities to advance our knowledge on the effects of environmental diversity (tree diversity) on the network and community structure of insect hosts and their parasitoids in a relatively poorly known system.
Weaknesses:
To me, there are no major issues regarding the manuscript, though sometimes I disagree with the interpretation of the results and some of the conclusions might be too far-fetched given the analyses and the results (namely the top-down control in the system). Additionally, the methods section (especially statistics) was lacking some details, but I would not consider it too concerning. Sometimes, the logic of the text could be improved to better support the studied hypotheses throughout the text. Also, the results section cannot be understood as a stand-alone without reading the methods first. The study design and the rationale of the analyses should be described somewhere in the intro or presented with the results.
Thank you very much for your valuable comments and suggestions on our manuscript! We appreciate your feedback and have made revisions accordingly. Specifically, we have rephrased the interpretation of the results and conclusions to better align with the analyses and avoid overstatements, particularly concerning the top-down control in the system. In addition, we have expanded the methods section by providing more details, especially regarding the statistical approaches, to address the points you raised. To enhance the clarity of the manuscript, we have also ensured that the logic of the text better supports the hypotheses throughout. Please see our point-by-point responses below for additional clarifications.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Line 120: "... and large ecosystems susceptible to global change (add citation here)": Citation(s)?
We have now added the missing citations.
Line 141: Add sampling completeness information.
Now we provide a new figure about sampling completeness (Fig. S1) in the supplementary materials, showing the adequate sampling effort for our study.
Line 151: use more metrics in the evaluation of functional diversity
We used Rao’s Q for tree functional diversity, which is an integrated and widely used metric representing the functional dissimilarity of trees. As our study focuses on multiple diversity indices of trees, we preferred not to place extra emphasis on one type of diversity. Thank you for your suggestion!
Line 164: host vulnerability. Although generality and vulnerability are commonly used in network analysis, it is better to link these metrics with the trophic level, like the 'host vulnerability' you used. Thus, you can use 'parasitoid generality' instead of 'generality'.
Thanks for your suggestion. Now the metrics were labeled with the trophic levels in the full text.
Line 169: two'.'
Corrected.
Line 173: 'parasitoid robustness' Or 'robustness of parasitoids'?
Now changed it to ‘robustness of parasitoid’.
Lines 173, 468: For the robustness estimations, maybe use null model for binary networks to get random distributions?
Thanks for the suggestion. We did use Patefield null models to compare randomized and observed robustness, which helps assess whether the robustness of the observed network differs significantly from that expected by chance. All robustness indices across plots differed significantly from the random distribution; see the Results section, L197-201.
Line 184: modulating interacting communities of hosts and parasitoids.
Changed accordingly.
Line 186: determined host-parasitoid interaction patterns
Changed accordingly.
Line 191: Biodiversity loss in this study refers to low trophic levels.
Now we clarified this point.
Line 190: understand
Changed accordingly.
Lines 215-216: Reorganize these sentences
Line 227: indirectly influenced by...
Changed accordingly.
Line 238: Be more specific. Which type of further study?
Rephrased to be more specific.
Lines 297-299: rewrite this sentence to make it more transparent.
Now we rewrote the sentence accordingly.
Line 302: Certain
Changed accordingly.
Line 453: effective
Changed accordingly.
Finally, the authors should check the text carefully to avoid grammatical errors.
Thanks, now we have checked the full text to avoid grammatical errors.
Reviewer #2 (Recommendations For The Authors):
I feel that the authors have very interesting data and have a solid set of analyses. I do not have major issues regarding the manuscript, though sometimes I disagree with the interpretation of the results and some of the conclusions might be too far-fetched given the analyses and the results. Additionally, the methods section (especially statistics) was lacking some details, but I would not consider it too concerning at this point.
I feel that the largest caveat of the manuscript remains in the representation of the rationale of the study. I felt the introduction could be more concise and be better focused to back up the study questions and hypotheses. Many times, the sentences were too vague and unspecific, and thus, it was difficult to understand what was meant to be said. The authors could mention something more about how community composition of hosts and parasitoids is expected to change with the studied experimental design regarding the metrics you mention in the introduction (stronger hypotheses). The results section cannot be understood as a stand-alone without reading the methods first. The study design and the rationale of the analyses must be described somewhere in the intro or results, if the journal/authors want to keep the methods-last structure. Also, the results and discussion could be more focused around the hypotheses. Naturally, these things can be easily fixed.
I also disagree with the interpretation of results finding top-down control in the system (it might well be there, but I do not think that the current methods and tests are suitable in finding it). First, the used methodology cannot distinguish parasitoids if the hosts are not there and the probability to detect parasitoid likely depends on the abundance of the host. Thus, the top-down regulation is difficult to prove (is it the parasitoids that have driven the host population down). Secondly, I would be hesitant to say anything about the top-down and bottom-up control in the systems as the data in the manuscript is pooled across five years while the top-down/bottom-up regulation in insect systems usually spans only one season/generation in time (much shorter than five years). Consequently, the analyses are comparing communities of species, some of which most likely do not co-exist (they were found in the same space but not during the same time). Luckily, the top-down/bottom-up effects could potentially be explored by using separately the time steps of the now pooled community data: e.g., does the population of the host decrease in t if the parasitoids are abundant in t-1? There are also other statistical tests to explore these patterns.
In the manuscript "Phylogenetic composition" refers to Mean Pairwise Distance. I would use "phylogenetic diversity" instead throughout the text. Also, to my understanding, in trees both "phylogenetic composition" and "phylogenetic diversity" are used even though based on their descriptions, they are the same.
Detailed comments:
Punctuation needs to be checked and edited at some point (I think copy-pasting had left things in the wrong places). Please check that "-" instead of "-" is used in host-parasitoid.
1-2 The title is not very matching with the content. "Multi-dimensionality" is not mentioned in the text. "phylogenetic composition" -> "phylogenetic diversity"
We did not find a role of tree functional diversity in host-parasitoid interactions, but tree richness and phylogenetic diversity remain relevant. We also disagree with replacing phylogenetic composition with phylogenetic diversity, because diversity highlights higher or lower phylogenetic distance among communities, while the latter highlights the phylogenetic dissimilarity across communities.
53-57 This sentence is quite vague and because of it, difficult to follow. Consider rephrasing and avoiding unspecified terms such as "tree identity", "genetic diversity", and "overall community composition of higher trophic levels" (at least, I was not sure what taxa/level you meant with them).
Rephrased.
L58-61 “Especially, we lack a comprehensive understanding of the ways that biotic factors, including plant richness, overall community phylogenetic and functional composition of consumers, and abiotic factors such as microclimate, determining host–parasitoid network structure and host–parasitoid community dynamics.”
56 I would remove "interact" as no interactions were tested.
Removed accordingly.
59-60 This needs rephrasing. I feel "taxonomic and phylogenetic composition" should be just "species composition". To better match what was done: "taxonomic, phylogenetic, and network composition of both host and parasitoid communities" -> "species and phylogenetic diversity of both host and parasitoid communities and the composition of their interaction networks"
Changed accordingly.
62 Remove "tree composition".
Done.
62 Replace "taxonomic" with "species". Throughout the text.
Done.
63-64 "Generally, top-down control was stronger than bottom-up control via phylogenetic association between hosts and parasitoids" I disagree, see my comments elsewhere.
We have now rephrased the sentence.
L68-70 “Generally, phylogenetic associations between hosts and parasitoids reflect non-randomly structured interactions between phylogenetic trees of hosts and parasitoids.”
68 "habitat structure and heterogeneity" This is too strong and general of a statement based on the results. You did not really measure habitat structure or heterogeneity.
We have now rephrased the statement to avoid an overly strong and general description.
L71-73 “Our study indicates that the composition of higher trophic levels and corresponding interaction networks are determined by plant diversity and canopy cover especially via trophic phylogenetic links in species-rich ecosystems.”
69 Specify "phylogenetic links". Trophic links?
Specified to “trophic phylogenetic links”.
75-77 The sentence is a bit difficult to follow. Consider rephrasing.
We have now rephrased it.
L79-82 “Changes in network structure of higher trophic levels usually coincide with variations in their diversity and community, which could be in turn affected by the changes in producers via trophic cascades”
76 Be more specific about what you mean by "community of trophic levels".
Specified to “community composition”.
79 Remove "basal changes of", it only makes the sentence heavier.
Done.
81 What is "species codependence"?
We aimed to describe species co-occurrence arising from their close relationships. For clarity, we have now changed this to “species coexistence”.
82 What do you mean by "complex dynamics"?
Rephrased to “mechanisms on dynamics of networks”.
83 onward: I would not focus so much on top-down/bottom-up as I feel that your current analyses cannot really say anything too strong about these causalities but are rather correlative.
Thanks, we have now removed the relevant content from the discussion. However, we kept one sentence in the Introduction, because it should be highlighted to make readers aware of this issue (the other text on this topic was removed).
89 Remove "environmental".
Done.
90 Specify what you mean by "these forces".
Done.
98-99 I have difficulties following the logic here "potential specialization of their hosts may cascade up to impact the parasitoids' presence or absence". Consider rephrasing.
We have now rephrased it.
L101-102 “…and their host fluctuations may cascade up to impact the parasitoids’ presence or absence.”
100 Be more specific with "habitat-level changes".
Specified to “community-level changes”
100 I do not see why host-parasitoid systems would be ideal to study "species interactions". There are much simpler and easier systems available.
Changed to “… one of ideal…”
101-103 "influence of" on what?
We have now rephrased the sentence.
L104-105 “Previous studies mainly focused on the influence of abiotic factors on host-parasitoid interactions”
104 Be more specific in "the role of multiple components of plant diversity".
Now we specified "the role of multiple components of plant diversity".
L107-108 “…the role of multiple components of plant diversity (i.e. taxonomic, functional and phylogenetic diversity)…”
106 "diversity associations" of what?
“diversity associations between host and parasitoids”.
108 Specify the "direct and indirect effects".
Now we specified it to “…direct and indirect effects (i.e. one pathway and more pathways via other variables)…”
110-113 A bit heavy sentence to follow. Consider rephrasing.
We have now rephrased the sentence to make it more readable.
114 Give an example of "phylogenetic dependences".
Done. Phylogenetic dependences (e.g. phylogenetic diversity)
117 Move the "e.g. taxonomic, phylogenetic, functional" within brackets in 117 after "dimensions of biodiversity".
Done.
120 "(add citation here)" Yes please!
Done.
120-121 Specify "such relationships".
Done. Specified to “multiple dimensions of biodiversity”
128-130 This is difficult to follow. Please rephrase.
We have now rephrased the sentence.
L135-137 “We aimed to discern the primary components of the diversity and composition of tree communities that affect higher trophic level interactions via quantifying the strength and complexity of associations between hosts and parasitoid.”
131-132 Remove "phylogenetic and". It is redundant to phylogenetic diversity.
Done.
128 Tested robustness does not really capture "stability of associations".
Yes, we agree. We have now rephrased the sentence and removed the “stability” description.
133 Specify "phylogenetic processes".
Now we specified “phylogenetic processes”.
L140-141 “…especially via phylogenetic processes (e.g. lineages of trophic levels diverge and evolve over time)…”
141 I would like to have more details on the data set somewhere in the results. How many individuals and species were found in each plot (on average)? Was there a lot of temporal variation (e.g. between the seasons)? On how many sites were the insect species found?
Thanks for your suggestion. We now provide more details on the data set in the results (L153-156), including the mean numbers of individuals and species per plot. However, temporal variation is a relatively independent topic that deserves its own study, as our study focuses on the general pattern of interactions between hosts and parasitoids. Therefore, we have not added more information on temporal changes, so that readers do not get lost in the text.
153-156 “Among them, we found 56 host species (12 bees and 44 wasps, mean abundance and richness are 400.05 and 45.14, respectively, for each plot) and 50 parasitoid species (38 Hymenoptera and 12 Diptera, mean abundance and richness are 14.07 and 9.05, respectively, for each plot).”
149 tree -> trees
Done.
149 Should there read also some else than "NMDS scores"?
Thanks! Now we provided more details about “NMDS scores”.
L161-162 “(NMDS axis scores; i.e. preserving the rank order of pairwise dissimilarities between samples)”
149 You could mention the amount of variation explained by the first two axes of the NMDSs. Now it is difficult to estimate how much the models actually explain.
Thanks for your comments! However, we could not directly provide the explanatory power of the two axes, because NMDS is based on rank-order distances rather than linear relationships like in PCA. However, the goodness of fit for the NMDS solution is typically evaluated using the stress value. We provide the stress value in the figure caption.
150 "tree MPD" is mentioned for the first time. Spell it out.
Done.
150 Explain "eastness".
Done.
L163-164 “…eastness (sine-transformed radian values of aspect) )”
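The eastness transformation quoted above can be illustrated with a minimal Python sketch. It assumes that aspect is recorded in degrees clockwise from north, which is not stated in the response; the function name is hypothetical.

```python
import math


def eastness(aspect_degrees: float) -> float:
    """Sine of the aspect angle after conversion to radians.

    Assumption: aspect is given in degrees clockwise from north,
    so due east (90) maps to +1 and due west (270) maps to -1.
    """
    return math.sin(math.radians(aspect_degrees))


# North- and south-facing slopes both have an eastness close to 0.
for aspect in (0, 90, 180, 270):
    print(aspect, round(eastness(aspect), 3))
```

Under this convention a plot facing north-east (45 degrees) would score about 0.71, so the variable captures only the east-west component of slope orientation.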
151 How was "tree functional diversity" quantified?
Please see methods. L437-L438.
160 Specify that you talk about phylogenetic compositions of the host and parasitoid communities here.
We prefer to keep the wording concise here, consistent with “species composition”. Phylogenetic composition simply represents the dissimilarities of phylogenetic lineages within a community.
161 Describe "parafit" test here when first mentioned.
Done, see methods L485-487.
182 Keep on referring to tables and figures in the discussion! Also, more clearly discuss your hypotheses. There are lots of discussions on top-down/bottom-up control. It could be good to form a hypothesis on them and predict what kind of patterns would suggest either one and what would you expect to find regarding them.
We now refer to figures and tables in the discussion. As the content on top-down and bottom-up control did not fit our study very well (as also suggested by the reviewers), we rephrased the discussion and now discuss our hypotheses more clearly. See L218, L226, and L237 etc.
186 "partly determined host-parasitoid networks" Be more specific.
Done.
L206-207 “…partly determined host-parasitoid network indices, including vulnerability, linkage density, and interaction evenness.”
195 Tell what you mean by "other biotic factors".
Specified it: “…other biotic factors such as elevation and slope…”
197-198 "It seems likely that these results are based on bee linkages to pollen resources" I would be hesitant to conclude this as the bees most likely forage way beyond the borders of the 30m by 30m study plots.
Thanks for your concern about this problem. While it is true that bees can forage beyond 30 x 30m, the study focuses on their nesting behavior and activity within this defined area, rather than their entire foraging range. Existing literature shows bees often forage locally when resources are available (e.g. Ebeling et al., 2012 Oecologia; Guo et al., year, Basic and Applied Ecology). Therefore, we are confident that this pattern could be associated with the resources around the trap nests.
223 "This could be further tested by collecting the food directly used by the wasps (caterpillars)" A bit unnecessary addition.
Thanks for your suggestion. This is definitely a good point. We currently do not have enough data on caterpillars, but we will follow up on this in the future.
232-238 I disagree with the authors on the interpretation of the causality of the results here. I think that the community of parasitoids simply indicates which host species are available, while the host community does not have an as strong effect on parasitoid community as parasitoids do not utilise the whole species pool of the hosts. (Presence of parasitoid tells that the host is around while the presence of the host does not necessarily tell about the presence of the parasitoid.) To me, this would rather indicate a bottom-up than top-down regulation. Similar patterns are also visible in species communities of hosts and parasites.
Thank you for your suggestion. We agree that parasitoids are more dependent on hosts, as hosts are not always attacked by parasitoids. We have now rephrased our explanation to follow this argument.
L254-256 “Such pattern could be further confirmed by the significant association between host phylogenetic composition and parasitoid phylogenetic composition (Fig. 1c), which suggested that their interactions are phylogenetically structured to some extent.”
247-266 The logic in this section is difficult to follow. Try rephrasing.
We have now rephrased the section to make the logic clearer.
L270-287 “Tree community species richness did not significantly influence the diversity of hosts targeted by parasitoids (parasitoid generality), but caused a significant increase in the diversity of parasitoids per host species (host vulnerability) (Fig. 3a; Table 2). This is likely because niche differentiation often influences network specialization via potential higher resource diversity in plots with higher tree diversity (Lopez-Carretero et al. 2014). Such positive relationship between host vulnerability and tree species richness suggested that host-parasitoid interactions could be driven through bottom-up effects via benefit from tree diversity. For example, parasitoid species increases more than host diversity with increasing tree species richness (Guo et al. 2021), resulting increasing of host vulnerability at community level. According to the enemies hypothesis (Root 1973), which posits a positive effects of plant richness on natural enemies, the higher trophic levels in our study (e.g. predators and parasitoids) would benefit from tree diversity and regulate herbivores thereby (Staab and Schuldt 2020). Indeed, previous studies at the same site found that bee parasitoid richness and abundance were positively related to tree species richness, but not their bee hosts (Fornoff et al. 2021, Guo et al. 2021). Because our dataset considered all hosts and reflects an overall pattern of host-parasitoid interactions, the effects of tree species richness on parasitoid generality might be more complex and difficult to predict, as we found that neither tree species richness nor tree MPD were related to parasitoid generality.”
249 "This is likely because niche differentiation often influences network specialization via potential higher resource diversity in plots with higher tree diversity" This is a bit contradicting your vulnerability results as niche differentiation should increase specialization and diversity and specialization should decrease vulnerability (less host per parasitoid).
Thanks! We understand that the concepts of “generality” and “vulnerability” can be a bit confusing. To clarify, “fewer hosts per parasitoid” actually corresponds to lower generality at the community level.
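The distinction between the two indices can be made concrete with a small Python sketch. It uses the simplest unweighted definitions (mean number of host species per parasitoid species, and mean number of parasitoid species per host species); quantitative network indices are usually weighted by interaction frequencies, which this sketch deliberately does not attempt. The function name and the example web are hypothetical.

```python
def generality_and_vulnerability(web):
    """Unweighted parasitoid generality and host vulnerability.

    web[h][p] holds the number of interactions recorded between host h
    (rows) and parasitoid p (columns). Only presence or absence of a
    link is used here; interaction frequencies are ignored.
    """
    n_hosts = len(web)
    n_parasitoids = len(web[0])
    links = sum(1 for row in web for count in row if count > 0)
    generality = links / n_parasitoids   # mean hosts per parasitoid
    vulnerability = links / n_hosts      # mean parasitoids per host
    return generality, vulnerability


# Hypothetical web: 4 host species (rows) x 2 parasitoid species (columns).
example_web = [
    [1, 0],
    [2, 1],
    [0, 3],
    [0, 1],
]
generality, vulnerability = generality_and_vulnerability(example_web)
print(generality, vulnerability)
```

In this toy web each parasitoid attacks 2.5 host species on average, while each host is attacked by 1.25 parasitoid species on average, so "fewer hosts per parasitoid" lowers the first number, not the second.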
332-337 How did you select the species growing on your plots? Or was only species number considered? What was the pool of tree species growing on the selected plots? Was the selection similar at both sites?
Now we provided more information on the experiment design.
L354-356 “The species pools of the two plots are nonoverlapping (16 species for each site). The composition of tree species within the study plots is based on a “broken-stick” design (see Bruelheide et al. 2014).”
342 Remove "centrally per plot"?
Done.
346-347 Was the selection of different reed diameters similar in all the plots?
Diameters and the relative distribution of diameters were similar in all trap nests.
399 & 432 Are "phylogenetic diversity of the tree communities" and "phylogenetic composition of trees" the same? They are both described as mean pairwise distance.
These two are actually different: we use the terms to distinguish phylogenetic diversity within communities from the rank order of dissimilarities between tree communities. Here, the phylogenetic diversity of the tree communities is the mean pairwise phylogenetic distance among the species of each tree community. Tree phylogenetic composition is the rank order of pairwise dissimilarities between tree communities based on NMDS.
400 Do you think that MPD makes any sense with the monocultures (value is always 0)? Does this have a potential to bias your analyses and result?
We agree with your point. However, we do not think that this is a major problem in the analyses. We followed the experimental design and considered low phylogenetic relatedness of tree species in a plot (likewise, in monocultures the tree species richness is always 1).
402-405 MNTD is not mentioned before or after this. Consider removing this section.
We tested the potential effects of MNTD in our models. We now mention it in our results.
L194-195 “Tree mean nearest taxon distance (MNTD) was unrelated to any network indices.”
405 "Phylogenetic metrics of trees" Which ones?
Both tree MPD and MNTD. Now we have noted it in the manuscript. (L432)
410 Further details on "Rao's Q" and how the functional diversity of the communities was calculated are needed.
Now more details were provided.
L435-438 “Specifically, seven leaf traits were used for calculation of tree functional diversity, which was calculated as the mean pairwise distance in trait values among tree species, weighted by tree wood volume, and expressed as Rao's Q”
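The verbal definition of Rao's Q quoted above can be written as a short Python sketch. It assumes the common form Q = sum over i and j of d_ij * p_i * p_j, where p_i is the relative weight of species i (here, its share of wood volume) and d_ij is the trait distance between species i and j. Some implementations sum over unordered pairs only, which halves the value, so the factor used in the manuscript is not confirmed here. The function name and example values are hypothetical.

```python
def rao_q(distances, weights):
    """Rao's quadratic entropy for one community.

    distances: square matrix (list of lists) of pairwise trait distances.
    weights:   one non-negative weight per species (e.g. wood volume);
               they are rescaled to proportions that sum to 1.
    """
    total = sum(weights)
    p = [w / total for w in weights]
    n = len(p)
    return sum(distances[i][j] * p[i] * p[j] for i in range(n) for j in range(n))


# Two equally weighted species separated by a trait distance of 1.
print(rao_q([[0, 1], [1, 0]], [1, 1]))
# A monoculture has no trait dissimilarity to average over.
print(rao_q([[0]], [3]))
```

Because the proportions are squared, a community dominated by one species scores lower than an even community with the same trait distances.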
413 Specify "higher trophic levels".
Now we specified the trophic levels.
L440-441 “…higher trophic levels in our study area, such as herbivores and predators”
417-424 What about the position of the plots within study sites? Is there potential for edge effects (e.g. bees finding easier the trap nest close to the edge of the experimental forest)? Were there any differences between the two sites? What is the elevation range of the plots?
Thank you for your attention to the details of our study. First, all plots were randomly distributed within the study sites (see Fig. S2). Admittedly, several plots are located at the edges of the sites, and we did not consider potential edge effects in our analysis; this will be a good point to address in future studies. Moreover, the biggest difference between the two sites is the non-overlapping tree species pool; the two study sites are 5 km apart in the same town. Finally, the elevation gradient across the plots is not very pronounced (112 m - 260 m).
432-434 "The species and phylogenetic composition of trees, hosts, and parasitoids were quantified at each plot with nonmetric multidimensional scaling (NMDS) analysis based on Morisita-Horn distances" This section needs to be more specific and detailed. Did you do the NMDS separately for each plot as suggested in the text?
We provided more details of the section.
L462-465 “The minimum number of required dimensions in the NMDS based on the reduction in stress value was determined in the analysis (k = 2 in our case). We centred the results to acquire maximum variance on the first dimension, and used the principal components rotation in the analysis.”
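For readers unfamiliar with the Morisita-Horn distance that underlies the NMDS ordinations discussed here, the following Python sketch computes it for two abundance vectors. It uses the standard Horn formulation (one minus the similarity) and is not the routine used by the authors; the function name is hypothetical.

```python
def morisita_horn_distance(x, y):
    """Morisita-Horn dissimilarity between two communities.

    x and y are abundance vectors over the same ordered species list.
    Both communities must contain at least one individual.
    """
    total_x, total_y = sum(x), sum(y)
    cross = sum(a * b for a, b in zip(x, y))
    lambda_x = sum(a * a for a in x) / total_x ** 2
    lambda_y = sum(b * b for b in y) / total_y ** 2
    similarity = 2 * cross / ((lambda_x + lambda_y) * total_x * total_y)
    return 1.0 - similarity


# Identical relative abundances give a distance of 0,
# communities with no shared species give a distance of 1.
print(morisita_horn_distance([2, 2], [1, 1]))
print(morisita_horn_distance([5, 0], [0, 3]))
```

The index depends on relative rather than absolute abundances, which is one reason it is often chosen when sample sizes differ between plots.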
435 Specify how picante was used (function and arguments)!
Now we specified the function.
L465-467 “The phylogenetic composition was calculated by mean pairwise distance among the host or parasitoid communities per plot with the R package “picante” with ‘mpd’ function.”
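The quantity described here, a mean pairwise distance among the species present in a plot, can be sketched in plain Python as follows. This is an illustration of the unweighted definition only, not the picante implementation; the distance values are invented, and returning 0.0 for a single-species community is an assumption made to mirror the monoculture case raised by the reviewer.

```python
from itertools import combinations


def mean_pairwise_distance(present, dist):
    """Unweighted mean phylogenetic distance among all species pairs.

    present: set of species names occurring in the plot.
    dist:    nested dict with dist[a][b] = distance between a and b.
    Assumption: a community with fewer than two species returns 0.0.
    """
    pairs = list(combinations(sorted(present), 2))
    if not pairs:
        return 0.0
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


# Hypothetical distances among three species.
dist = {
    "A": {"B": 2.0, "C": 4.0},
    "B": {"A": 2.0, "C": 6.0},
    "C": {"A": 4.0, "B": 6.0},
}
print(mean_pairwise_distance({"A", "B", "C"}, dist))
print(mean_pairwise_distance({"A"}, dist))
```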
436 "standardized values" Of what? How was the standardisation done?
We now cite a supplementary table (Table S2) to specify this (see L469). For the standardization, we used the ‘scale’ function in R, which centers and scales the data. Specifically, it subtracts the mean and divides by the standard deviation for each variable.
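The standardization described in this response (subtract the mean, divide by the standard deviation) can be sketched in Python for a single variable. The sketch assumes the sample standard deviation with an n - 1 denominator; the function name and example values are hypothetical.

```python
import statistics


def standardize(values):
    """Center a variable on its mean and scale it by its sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 denominator
    return [(v - mean) / sd for v in values]


# Hypothetical predictor values for three plots.
print(standardize([1, 2, 3]))
```

After this transformation each predictor has mean 0 and standard deviation 1, so model coefficients are expressed per standard deviation of the predictor and can be compared across variables.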
443 Provide more details on parafit.
Actually, we have provided the reason why we use the parafit test and the usage.
L483-486 “We used a parafit test (9,999 permutations) with the R package “ape” to test whether the associations were non-random between hosts and parasitoids. This is widely used to assess host-parasite co-phylogeny by analyzing the congruence between host and parasite phylogenies using a distance-based matrix approach.”
449-451 Rephrase the sentence.
Rephrased.
L490-491 “We constructed quantitative host-parasitoid networks at community level with the R package “bipartite” for each plot of the two sites.”
451 "six" Should this be five?
Yes, should be five, thanks.
470-481 What package and function were used for the LMMs?
As we now use linear models, we no longer use an R package for LMMs.
470 "mix" -> mixed
Changed to linear models.
472 "six" Should this be five?
Again, we changed it to five.
479-481 How did you treat the variables from the two different sites when testing for the correlations to avoid two geographic clusters of data points?
Now we considered the two study sites as fixed factor in our linear models. Moreover, tree-based variables were additionally included as interaction terms with the study sites.
501 "mix" -> mixed
Changed to linear models.
The panel selection for figures 3 and 4 seems random. Justify it!
Thank you. To avoid including too many figures in the main text, which could potentially confuse readers, we have selected the key results that are of primary interest. The remaining figures are provided in the appendix for reference.
533 "Note that axes are on a log scale for tree species richness." Why the log-scale if the analyses were performed with linear fit? Also, the drawn regression lines do not match the model description (non-linear, while a linear model is described in the text). The models should probably be described in more detail.
We used a log transformation to improve the normality of the data. The drawn regression lines are straight lines, which fit our models.
539 "Values were adjusted for covariates of the final regression model." How?
We used a residual plot, which directly visualizes the relationship between the predictor and the response variable together with the fitted regression line, making it easier to assess the model's fit.
Fig. S4 text does not match the figure.
Thanks! We have now deleted the mismatched text in the figure.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing.
The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text.
We thank the reviewer for their positive assessment of our work and for their extremely helpful and constructive comments that helped to significantly improve the quality of our manuscript.
(1) The perceptual performance on Fig1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to postselection or exclusion of participants, but at the same time do not discuss this equally important point.
Performance was indeed highly variable between observers, as is commonly found in attentional-blink (AB) and masking studies. For some observers, the AB pushes performance almost to chance level, whereas for others it has almost no effect. A similar effect can be seen in masking. We did our best to match accuracy over participants, while also matching accuracy within participants as well as possible, adjusting mask contrast manually during the experimental session. Naturally, those that are strongly affected by masking need not be the same participants as those that are strongly affected by the AB, given the fact that they rely on different mechanisms (which is also one of the main points of the manuscript). To answer the research question, what mattered most was that, at the group level, performance was well matched between the two key conditions. As all our statistical inferences, both for behavior and EEG decoding, rest on this group level, we do not think that variability at the individual-subject level detracts from this general approach.
In the Results, we added that our goal was to match performance across participants:
“Importantly, mask contrast in the masked condition was adjusted using a staircasing procedure to match performance in the AB condition, ensuring comparable perceptual performance in the masked and the AB condition across participants (see Methods for more details).”
In the Methods, we added:
“Second, during the experimental session, after every 32 masked trials, mask contrast could be manually updated in accordance with our goal to match accuracy over participants, while also matching accuracy within participants as well as possible.”
(2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and nonillusory share the same shape, so more elaborate object processing could also be occurring. Please discuss.
We agree with this qualification of our interpretation, and included the reviewer’s account as an alternative explanation in the Discussion section:
“It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processes representing the triangular shapes as well.”
(3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blink or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead.
We agree with the reviewer that the interpretation of this result depends on the definition of consciousness that one adheres to. If one takes report as the leading metric for consciousness (=conscious access), one can indeed conclude that perceptual segmentation/organization can also occur unconsciously. However, if the processing that results in the qualitative nature of an image (rather than whether it is reported) is taken as leading – such as the processing that results in the formation of an illusory percept – (=phenomenal) the conclusion can be quite different. This speaks to the still ongoing debate regarding the existence of phenomenal vs access consciousness, and the literature on no-report paradigms amongst others (see last paragraph of the discussion). Because the current data do not speak directly to this debate, we decided to remove the sentence about “conscious experience”, and edited this part of the manuscript (also addressing a comment about preserved unconscious processing during masking by Reviewer 2) by limiting the interpretation of unconscious processing to those aspects that are uncontroversial:
“Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling deep unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”
(4) The two paradigms developed here could be used jointly to highlight nonidiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer?
To avoid issues with post-hoc selection of (visible vs. invisible) trials (discussed in the Introduction), we did not divide our trials into conscious and unconscious trials, and thus did not attempt to reveal NCCs, or NCCs generalizing across the two paradigms. Note also that this approach alone would not resolve the debate regarding the ‘true’ NCC as it hinges on the operational definition of consciousness one adheres to; also see our response to the previous point the reviewer raised. Our main analysis revealed that the illusory triangle could be decoded with above-chance accuracy during both masking and the AB over extended periods of time with similar topographies (Fig. 2B), so that significant cross-decoding would be expected over roughly the same extended period of time (except for the heightened 200-250 ms peak). However, as our focus was on differences between the two manipulations and because we did not use post-hoc sorting of trials, we did not add these analyses.
(5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy?
Compared to certain manipulations of spatial attention, the AB phenomenon is generally considered to represent an instance of “late” attentional filtering. In the Discussion section we included a paragraph on classic load theory, where early and late filtering depend on perceptual and attentional load. Just preceding this paragraph, we added this:
“Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”
Reviewer #2 (Public Review):
Summary:
This is a very elegant and important EEG study that unifies within a single set of behaviorally equated experimental conditions conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, they could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' unconscious levels of processing. In particular, the authors could provide strong evidence to distinguish here within the same paradigm these two levels of unconscious processing that precede conscious access: (i) an early (< 80 ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, (ii) a later stage and more integrated processing (200-250 ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. Then, the late conscious access stage appeared as a P3b-like event.
Strengths:
The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing.
Weaknesses:
- The authors could improve clarity of the rich set of decoding analyses across conditions.
- They could also enrich their Introduction and Discussion sections by taking into account the importance of conscious influences on some unconscious cognitive processes (revision of traditional concept of 'automaticity'), that may introduce some complexity in Results interpretation.
- They should discuss the rich literature reporting high-level unconscious processing in masking paradigms (culminating in semantic processing of digits, words or even small group of words, and pictures) in the light of their proposal (deeper unconscious processing during AB than during masking).
We thank the reviewer for their positive assessment of our study and for their insightful comments and helpful suggestions that helped to significantly strengthen our paper. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we followed the reviewer’s suggestions and revised the Results/Discussion to include references to influences on unconscious processes and expanded our discussion of unconscious effects during masking vs. AB.
Reviewer #3 (Public Review):
Summary:
This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and AB. However, the later stage of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access.
Strengths:
The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions.
Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response).
The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception.
Weaknesses:
The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. An early classification peak was instead only affected by masking. However, a late classification peak showed a pattern similar to the behavioural results, with classification affected by both masking and AB.
The interpretation of the results mainly centres on the theoretical framework of the recurrent processing theory of consciousness (Lamme, 2020), which led to the assumption that local contrast, collinearity, and the illusory perception reflect feedforward, local recurrent, and global recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems at least limited. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to a selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of the Kanizsa figures, particularly in the superficial layers. They all mention feedback connections, but none seem to point to lateral connections.
Moreover, the evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Furthermore, the other studies mentioned in the manuscript did not investigate feedback connections but only lateral ones, making it difficult to draw any clear conclusions.
We thank the reviewer for their careful review and positive assessment of our study, as well as for their constructive criticism and helpful suggestions. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we addressed the reviewer’s comments and suggestions by better relating our study to Fahrenfort et al.’s (2017) paper and by highlighting the limitations inherent in linking our findings to distinct neural mechanisms (in particular, to lateral vs. feedback connections).
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
- Methods: it states that "The distance between the three Pac-Man stimuli as well as between the three aligned two-legged white circles was 2.8 degrees of visual angle". It is unclear what this distance refers to. Is it the shortest distance between the edges of the objects?
It is indeed the shortest distance between the edges of the objects. This is now included in the Methods.
- Methods: It's unclear to me if the mask updating procedure during the experimental session was based on detection rate or on the perceptual performance index reported on Fig1D. Please clarify.
It was based on accuracy calculated over 32 trials. We have included this information in the Methods.
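As an illustration only (not the authors' code), a block-wise update of this kind can be sketched as follows. The response above states only that the update was based on accuracy over 32 trials; the target accuracy, tolerance band, step size, and contrast range used here are hypothetical placeholders, and the function name is invented for the sketch.

```python
def update_mask_contrast(contrast, responses_correct, target=0.75,
                         tolerance=0.05, step=0.05, lo=0.0, hi=1.0):
    """Return (new_contrast, accuracy) after one block of trials.

    `responses_correct` holds one boolean per trial (e.g., 32 of them).
    `target`, `tolerance`, `step`, `lo`, and `hi` are assumed values,
    not parameters reported in the manuscript.
    """
    accuracy = sum(responses_correct) / len(responses_correct)
    if accuracy > target + tolerance:
        contrast += step   # performance too high: strengthen the mask
    elif accuracy < target - tolerance:
        contrast -= step   # performance too low: weaken the mask
    # Keep the contrast inside the displayable range.
    return min(hi, max(lo, contrast)), accuracy


# Example: 32 trials, all correct, so the mask contrast is raised by one step.
new_contrast, acc = update_mask_contrast(0.5, [True] * 32)
```

Updating on block accuracy rather than trial by trial trades responsiveness for a more stable estimate of performance in the masked condition.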
- Methods and Results: I did not understand why the described procedure used to ensure that confidence ratings are not contaminated by differences in perceptual performance was necessary. To me, it just seems to make the "no manipulations" and "both manipulations" less comparable to the other 2 conditions.
To calculate accurate estimates of metacognitive sensitivity for the two matched conditions, we wanted participants to make use of the full confidence scale (asking them to distribute their responses evenly over all ratings within a block). By mixing all conditions in the same block, we would have run the risk of participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition). We made this point explicit in the Results section and in the Methods section:
“To ensure that the distribution of confidence ratings in the performance-matched masked and AB condition was not influenced by participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition, respectively), the masked and AB condition were presented in the same experimental block, while the other block type included the no and both manipulations condition.”
“To ensure that confidence ratings for these matched conditions (masked, long lag and unmasked, short lag) were not influenced by participants anchoring their confidence ratings to the very easy and very difficult unmatched conditions (no and both manipulations, respectively), one type of block only contained the matched conditions, while the other block type contained the two remaining, unmatched conditions (masked, short lag and unmasked, long lag).”
- Methods: what priors were used for Bayesian analyses?
Bayesian statistics were calculated in JASP (JASP Team, 2024) with default prior scales (Cauchy distribution, scale 0.707). This is now added to the Methods.
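To make concrete what a Cauchy prior with scale 0.707 means, the sketch below computes a JZS Bayes factor for a one-sample (or paired) t-test using the integral form given by Rouder et al. (2009). This is an illustration under assumptions, not JASP's implementation: the function name and the simple midpoint integration are choices made for the sketch, and the Bayesian ANOVAs reported in the manuscript rest on a different model than this t-test case.

```python
import math


def jzs_bf10(t, n, r=0.707, steps=20000):
    """BF10 for a one-sample t-test with a Cauchy(0, r) prior on effect size.

    t: observed t statistic; n: sample size; r: prior scale.
    The Cauchy prior is written as a normal prior whose variance g follows
    an inverse-gamma(1/2, 1/2) distribution, and g is integrated out
    numerically with the substitution g = u / (1 - u), u in (0, 1).
    """
    nu = n - 1
    # Likelihood of t under H0 (constant factors shared with H1 cancel).
    null_like = (1 + t * t / nu) ** (-(nu + 1) / 2)
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) / steps          # midpoint rule on (0, 1)
        g = u / (1 - u)
        a = 1 + n * g * r * r
        like = a ** -0.5 * (1 + t * t / (a * nu)) ** (-(nu + 1) / 2)
        prior = (2 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1 / (2 * g))
        total += like * prior / (1 - u) ** 2   # Jacobian dg/du
    alt_like = total / steps
    return alt_like / null_like


# Example: evidence for an effect given t = 2.5 from 30 participants.
bf10 = jzs_bf10(2.5, 30)
```

Values of BF10 below 1 favor the null hypothesis and values above 1 favor the alternative; a wider prior scale r places more prior mass on large effects and therefore penalizes small observed effects more strongly.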
- Results, line 162: It states that classifiers were applied on "raw EEG activity" but the Methods specify preprocessing steps. "Preprocessed EEG activity" seems more appropriate.
We changed the term to “preprocessed EEG activity” in the Methods and to “(minimally) preprocessed EEG activity (see Methods)” in the Results, respectively.
- Results, line 173: The effect of masking on local contrast decoding is reported as "marginal". If the alpha is set at 0.05, it seems that this effect is significant and should not be reported as marginal.
We changed the wording from “marginal” to “small but significant.”
- Fig1: The fixation cross is not displayed.
Because adding the fixation cross would have made the figure of the trial design look crowded and less clear, we decided to exclude it from this schematic trial representation. We are now stating this also in the legend of figure 1.
- Fig 3A: In the upper left panel, isn't there a missing significant effect of the "local contrast training and testing" condition in the first window? If not, this condition seems oddly underpowered compared to the other two conditions.
Thanks for the catch! The highlighting in bold and the significance bar were indeed lacking for this condition in the upper left panel (blue line). We corrected the figure in our revision.
- Supplementary text and Fig S6: It is unclear to me why the two control analyses (the black lines vs. the green and purple lines) are pooled together in the same figure. They seem to test for different, non-comparable contrasts (they share neither training nor testing sets), and I find it confusing to find them on the same figure.
We agree that this may be confusing, and deleted the results from one control analysis from the figure (black line, i.e., training on contrast, testing on illusion), as the reviewer correctly pointed out that it displayed a non-comparable analysis. Given that this control analysis did not reveal any significant decoding, we now report its results only in the Supplementary text.
- Fig S6: I think the title of the legend should say testing on the non-illusory triangle instead of testing on the illusory triangle to match the supplementary text.
This was a typo – thank you! Corrected.
Reviewer #2 (Recommendations For The Authors):
Issue #1: One key asymmetry between the three levels of T2 attributes (i.e.: local contrast; non-illusory triangle; illusory Kanizsa triangle) is related to the top-down conscious posture driven by the task that was exclusively focusing on the last attribute (illusory Kanizsa triangle). Therefore, any difference in EEG decoding performance across these three levels could also depend on this asymmetry. For instance, if participants were engaged to report local contrast or non-illusory triangle, one could wonder if decoding performance could differ from the one used here. This potential confound was addressed by the authors by using decoders trained in different datasets in which the main task was to report one of the two other attributes. They could then test how classifiers trained on the task-related attribute behave on the main dataset. However, this part of the study is crucial but not 100% clear, and the links with the results of these control experiments are not fully explicit. Could the authors better clarify this important point (see also Issue #1 and #3).
The reviewer raises an important point, alluding to potential differences between decoded features regarding task relevance. There are two separate sets of analyses where task relevance may have been a factor, our main analyses comparing illusion to contrast decoding, and our comparison of collinearity vs. illusion-specific processing.
In our main analysis, we are indeed reporting decoding of a task-relevant feature (illusion) and of a task-irrelevant feature (local contrast, i.e., rotation of the Pac-Man inducers). Note, however, that the Pac-Man inducers were always task-relevant, as they needed to be processed to perceive illusory triangles, so that local contrast decoding was based on task-relevant stimulus elements, even though participants did not respond to local contrast differences in the main experiment. However, we also ran control analyses testing the effect of task-relevance on local contrast decoding in our independent training data set and in another (independent) study, where local contrast was, in separate experimental blocks, task-relevant or task-irrelevant. The results are reported in the Supplementary Text and in Figure S5. In brief, task-relevance did not improve early (70–95 ms) decoding of local contrast. We are thus confident that the comparison of local contrast to illusion decoding in our main analysis was not substantially affected by differences in task relevance. In our previous manuscript version, we referred to these control analyses only in the collinearity-vs-illusion section of the Results. In our revision, we added the following in the Results section comparing illusion to contrast decoding:
“In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”
In addition to our main analysis, there is the concern that our comparison of collinearity vs. illusion-specific processing may have been affected by differences in task-relevance between the stimuli inducing the non-illusory triangle (the “two-legged white circles”, collinearity-only) and the stimuli inducing the Kanizsa illusion (the Pac-Man inducers, collinearity-plus-illusion). We would like to emphasize that in our main analysis classifiers were always used to decode T2 illusion presence vs. absence (collinearity-plus-illusion), and never to decode T2 collinearity-only. To distinguish collinearity-only from collinearity-plus-illusion processing, we only varied the training data (training classifiers on collinearity-only or collinearity-plus-illusion), using the independent training data set, where collinearity-only and collinearity-plus-illusion (and rotation) were task-relevant (in separate blocks). As discussed in the Supplementary Information, for this analysis approach to be valid, collinearity-only processing should be similar for the illusory and the non-illusory triangle, and this is what control analyses demonstrated (Fig. S7). In any case, general task-relevance was equated for the collinearity-only and the collinearity-plus-illusion classifiers.
Finally, in supplementary Figure 6 we also show that our main results reported in Figure 2 (discussed at the top of this response) were very similar when the classifiers were trained on the independent localizer dataset in which each stimulus feature could be task-relevant.
Together, for the reasons described above, we believe that differences in EEG decoding performance across these three stimulus levels are unlikely to also depend on a “task-relevance” asymmetry.
Issue #2: Following on my previous point the authors should better mention the concept of conscious influences on unconscious processing that led to a full revision of the notion of automaticity in cognitive science [1, 2, 3, 4]. For instance, the discovery that conscious endogenous temporal and spatial attention modulate unconscious subliminal processing paved the way to this revision. This concept raises the importance of Issue #1: equating performance on the main task across AB and masking is not enough to guarantee that differences of neural processing of the unattended attributes of T2 (i.e.: task-unrelated attributes) are not, in part, due to this asymmetry rather than to a systematic difference of unconscious processing strength [5, 6-8]. Obviously, the reported differences for real-triangle decoding between AB and masking cannot be totally explained by such a factor (because this is a task-unrelated attribute for both AB and masking conditions), but still this issue should be better introduced, addressed, clarified (Issue #1 and #3) and discussed.
We would like to refer to our response to the previous point: Control analyses for local contrast decoding showed that task relevance had no influence on our marker for feedforward processing. Most importantly, as outlined above, we did not perform real-triangle decoding – all our decoding analyses focused on comparing collinearity-only vs. collinearity-plus-illusion were run on the task-relevant T2 illusion (decoding its presence vs. absence). The key difference was solely the training set, where the collinearity-only classifier was trained on the (task-relevant) real triangle and the collinearity-plus-illusion classifier was trained on the (task-relevant) Kanizsa triangle. Thus, overall task relevance was controlled in these analyses.
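The logic of this design (two classifiers that differ only in their training set, both applied to the same test data) can be sketched with simulated data. Everything in the sketch is an assumption made for illustration: the 8-channel patterns are hypothetical, the nearest-centroid classifier and the accuracy metric are stand-ins for whatever classifier and metric the manuscript used, and the simulated numbers carry no empirical meaning.

```python
import random


def simulate(n_trials, pattern, rng):
    """Label-1 trials add `pattern` to unit Gaussian noise; label-0 trials subtract it."""
    data, labels = [], []
    for i in range(n_trials):
        label = i % 2
        sign = 1.0 if label else -1.0
        data.append([sign * p + rng.gauss(0.0, 1.0) for p in pattern])
        labels.append(label)
    return data, labels


def train_centroids(data, labels):
    """Return the mean channel pattern of each class."""
    centroids = {}
    for cls in (0, 1):
        rows = [x for x, y in zip(data, labels) if y == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def accuracy(centroids, data, labels):
    """Fraction of trials assigned to the nearer class centroid."""
    def dist(x, c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    hits = sum(
        (dist(x, centroids[1]) < dist(x, centroids[0])) == bool(y)
        for x, y in zip(data, labels)
    )
    return hits / len(labels)


rng = random.Random(0)
shared = [1.0] * 4 + [0.0] * 4   # hypothetical collinearity-related pattern
extra = [0.0] * 4 + [1.0] * 4    # hypothetical illusion-specific pattern
both = [s + e for s, e in zip(shared, extra)]

# Two training sets (independent data), one common test set (T2 illusion present vs. absent).
train_collinearity = simulate(400, shared, rng)
train_illusion = simulate(400, both, rng)
test_t2 = simulate(400, [0.5 * b for b in both], rng)   # weaker signal at test

acc_collinearity = accuracy(train_centroids(*train_collinearity), *test_t2)
acc_illusion = accuracy(train_centroids(*train_illusion), *test_t2)
```

Because the test trials and the decoded contrast are identical for both classifiers, any difference between the two accuracies can only come from what each training set allowed the classifier to learn.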
In our revision, we are now also citing the studies proposed by the reviewer, when discussing the control analyses testing for an effect of task-relevance on local contrast decoding:
“In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”
Issue #3: In terms of clarity, I would suggest the authors to add a synthetic figure providing an overall view of all pairs of intra and cross-conditions decoding analyses and mentioning main task for training and testing sets for each analysis (see my previous and related points). Indeed, at one point, the reader can get lost and this would not only strengthen accessibility to the detailed picture of results, but also pinpoint the limits of the work (see previous point).
We understand the point the reviewer is raising and acknowledge that some of our analyses, in particular those using different training and testing sets, may be difficult to grasp. But given the variety of different analyses using different training and testing sets, different temporal windows, as well as different stimulus features, it was not possible to design an intuitive synthetic figure summarizing the key results. We hope that the added text in the Results and Discussion section will be sufficient to guide the reader through our set of analyses.
In our revision, we are now more clearly highlighting that, in addition to presenting the key results in our main text that were based on training classifiers on the T1 data, “we replicated all key findings when training the classifiers on an independent training set where individual stimuli were presented in isolation (Fig. 3A, results in the Supplementary Information and Fig. S6).” For this, we added a schematic showing the procedure of the independent training set to Figure 3, more clearly pointing the reader to the use of a separate training data set.
Issue #4: In the light of these findings the authors should discuss more thoroughly the question of unconscious high-level representations in masking versus AB: in particular, a longstanding issue relates to unconscious semantic processing of words, numbers or pictures. According to their findings, they tend to suggest that semantic processing should be more enabled in AB than in masking. However, a rich literature provided a substantial number of results (including results from the last author, Simon van Gaal) that tend to support the notion of unconscious semantic processing in subliminal processing (see in particular: [9, 10, 11, 12, 13]). So, and as mentioned by the authors, while there is evidence for semantic processing during AB they should better discuss how they would explain unconscious semantic subliminal processing. While a possibility could be to question the unconscious attribute of several subliminal results, the same argument also holds for AB studies. Another possible track of discussion would be to differentiate AB and subliminal perception in terms of strength and durability of the corresponding unconscious representations, but not necessarily in terms of cognitive richness. Indeed, one may discuss that semantic processing of stimuli that do not need complex spatial integration (e.g.: words or digits as compared to illusory Kanizsa tested here) can still be observed under subliminal conditions.
We thank the reviewer for pointing us to this shortcoming of our previous Discussion. Note that our data does not directly speak to the question of high-level unconscious representations in masking vs AB, because such conclusions would hinge on the operational definition of consciousness one adheres to (also see response to Reviewer 1). Nevertheless, we do follow the reviewer’s suggestions and added the following in the Discussion (also addressing a point about other forms of attention raised by Reviewer 1):
“Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”
And, in a following paragraph in the Discussion:
“Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling high-level unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”
Reviewer #3 (Recommendations For The Authors):
(1) The objective of Fahrenfort et al., 2017 seems very similar to that of the current study. What are the main differences between the two studies? Moreover, Fahrenfort et al., 2017 conducted similar decoding analyses to those performed in the current study.
Which results were replicated in the current study, and which ones are novel? Highlighting these differences in the manuscript would be beneficial.
We now provide a more comprehensive coverage of the study by Fahrenfort et al., 2017. In the Introduction, we added a brief summary of the key findings, highlighting that this study’s findings could have reflected differences in task performance rather than differences between masking and AB:
“For example, Fahrenfort and colleagues (2017) found that illusory surfaces could be decoded from electroencephalogram (EEG) data during the AB but not during masking. This was taken as evidence that local recurrent interactions, supporting perceptual integration, were preserved during inattention but fully abolished by masking. However, masking had a much stronger behavioral effect than the AB, effectively reducing task performance to chance level. Indeed, a control experiment using weaker masking, which resulted in behavioral performance well above chance similar to the main experiment’s AB condition, revealed some evidence for preserved local recurrent interactions also during masking. However, these conditions were tested in separate experiments with small samples, precluding a direct comparison of perceptual vs. attentional blindness at matched levels of behavioral performance. To test …”
In the Results, we are now also highlighting this key advancement by directly referencing the previous study:
“Thus, whereas in previous studies task performance was considerably higher during the AB than during masking (e.g., Fahrenfort et al., 2017), in the present study the masked and the AB condition were matched in both measures of conscious access.” When reporting the EEG decoding results in the Results section, we continuously cite the Fahrenfort et al. (2017) study to highlight similarities in the study’s findings. We also added a few sentences explicitly relating the key findings of the two studies:
“This suggests that the AB allowed for greater local recurrent processing than masking, replicating the key finding by Fahrenfort and colleagues (2017). Importantly, the present result demonstrates that this effect reflects the difference between the perceptual vs. attentional manipulation rather than differences in behavior, as the masked and the AB condition were matched for perceptual performance and metacognition.”
“This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues (2017) who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”
We also more clearly highlighted where our study goes beyond Fahrenfort et al.’s (2017), e.g., in the Results:
“The addition of this element of collinearity to our stimuli was a key difference to the study by Fahrenfort and colleagues (2017), allowing us to compare non-illusory triangle decoding to illusory triangle decoding in order to distinguish between collinearity and illusion-specific processing.”
And in the Discussion:
“Furthermore, the addition of line segments forming a non-illusory triangle to the stimulus employed in the present study allowed us to distinguish between collinearity and illusion-specific processing.”
Also, in the Discussion, we added a paragraph “summarizing which results were replicated in the current study, and which ones are novel”, as suggested by the reviewer:
“This pattern of results is consistent with a previous study that used EEG to decode Kanizsa-like illusory surfaces during masking and the AB (Fahrenfort et al., 2017). However, the present study also revealed some effects where Fahrenfort and colleagues (2017) failed to obtain statistical significance, likely reflecting the present study’s considerably larger sample size and greater statistical power. For example, in the present study the marker for feedforward processing was weakly but significantly impaired by masking, and the marker for local recurrency was significantly impaired not only by masking but also by the AB, although to a lesser extent. Most importantly, however, we replicated the key findings that local recurrent processing was more strongly impaired by masking than by the AB, and that global recurrent processing was similarly impaired by masking and the AB and closely linked to task performance, reflecting conscious access. Crucially, having matched the key conditions behaviorally, the present finding of greater local recurrency during the AB can now unequivocally be attributed to the attentional vs. perceptual manipulation of consciousness.”
Finally, we changed the title to “Distinct neural mechanisms underlying perceptual and attentional impairments of conscious access despite equal task performance” to highlight one of the crucial differences between the Fahrenfort et al. study and this study, namely the fact that we equalized task performance between the two critical conditions (AB and masking).
(2) It is not clear from the text the link between the current study and the literature on the role of lateral and feedback connections in consciousness (Lamme, 2020). A better explanation is needed.
To our knowledge, consciousness theories such as recurrent processing theory by Lamme currently make no distinction between the role of lateral and feedback connections for consciousness. The principled distinction lies between unconscious feedforward processing and phenomenally conscious or “preconscious” local recurrent processing, where local recurrency refers to both lateral (or horizontal) and feedback connections. We added a sentence in the Discussion:
“As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness …”
(3) When training on T1 and testing on T2, EEG data showed an early peak in local contrast classification at 75-95 ms over posterior electrodes. The authors stated that this modulation was only marginally affected by masking (and not at all by AB); however, the main effect of masking is significant. Why was this effect interpreted as nonrelevant?
Following this and Reviewer 1’s comment, we changed the wording from “marginal” to “weak but significant.” We considered this effect “weak” and of lesser relevance, because its Bayes factor indicated that the alternative hypothesis was only 1.31 times more likely than the null hypothesis of no effect, representing only “anecdotal” evidence, which is in sharp contrast to the robust effects of the consciousness manipulations on illusion decoding reported later. Furthermore, later ANOVAs comparing the effect of masking on contrast vs. illusion decoding revealed much stronger effects on illusion decoding than on contrast decoding (BFs>3.59×10<sup>4</sup>).
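For readers less familiar with Bayes factors, the verbal labels used in this response can be sketched in code. The thresholds below follow a widely used convention (1 to 3 anecdotal, 3 to 10 moderate, 10 to 30 strong, 30 to 100 very strong, above 100 extreme) and are an assumption made for illustration; the function name is hypothetical and is not part of JASP or of the analysis pipeline.

```python
def describe_bayes_factor(bf10: float) -> str:
    """Return a conventional verbal label for BF10 (evidence for H1 over H0)."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive")
    if bf10 == 1:
        return "no evidence either way"
    # Evidence favouring H0 is described by inverting the ratio (BF01 = 1 / BF10).
    favoured = "H1" if bf10 > 1 else "H0"
    strength = bf10 if bf10 > 1 else 1.0 / bf10
    # Upper bounds of commonly used categories (a convention, not a strict rule).
    bands = [(3, "anecdotal"), (10, "moderate"), (30, "strong"), (100, "very strong")]
    for upper, label in bands:
        if strength < upper:
            return f"{label} evidence for {favoured}"
    return f"extreme evidence for {favoured}"


# BF10 = 1.31 is the value quoted above for the masking effect on contrast decoding.
print(describe_bayes_factor(1.31))
# A BF01 of 8.47 corresponds to BF10 = 1 / 8.47, i.e. evidence for the null.
print(describe_bayes_factor(1 / 8.47))
```

Under this convention a Bayes factor of 1.31 falls in the lowest band, which is why the effect is described as weak even though the frequentist test was significant.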
(4) The decoding analysis on the illusory percept yielded two separate peaks of decoding, one from 200 to 250 ms and another from 275 to 475 ms. The early component was localized occipitally and interpreted as local sensory processing, while the late peak was described as a marker for global recurrent processing. This latter peak was localized in the parietal cortex and associated with the P300. Can the authors show the topography of the P300 evoked response obtained from the current study as a comparison? Moreover, source reconstruction analysis would probably provide a better understanding of the cortical localization of the two peaks.
Figure S4 now shows the P300 from electrode Pz, demonstrating a stronger positivity between 375 and 475 ms when the illusory triangle was present than when it was absent. We did not run a source reconstruction analysis.
(5) The authors mention that the behavioural results closely resembled the pattern of the second decoding peak results. However, they did not show any evidence for this relationship. For instance, is there a correlation between the two measures across or within participants? Does this relationship differ between the illusion report and the confidence rating?
This relationship became evident from simply eyeballing the results figures: in both behavior and EEG decoding, performance dropped from the both-manipulations condition to the AB and masked conditions, while these conditions did not differ significantly. Following a similar observation of a close similarity between behavior and the second/late illusion decoding peak in the study by Fahrenfort et al. (2017), we adopted their analysis approach and ran two additional ANOVAs, adding “measure” (behavior vs. EEG) as a factor. For this analysis, we dropped the both-manipulations condition due to scale restrictions (as noted in footnote 1: “We excluded the both-manipulations condition from this analysis due to scale restrictions: in this condition, EEG decoding at the second peak was at chance, while behavioral performance was above chance, leaving more room for behavior to drop from the masked and AB condition.”). The analysis revealed that there were no interactions with condition:
“The pattern of behavioral results, both for perceptual performance and metacognitive sensitivity, closely resembled the second decoding peak: sensitivity in all three metrics dropped from the no-manipulations condition to the masked and AB conditions, while sensitivity did not differ significantly between these performance-matched conditions (Fig. 2C). Two additional rm ANOVAs with the factors measure (behavior, second EEG decoding peak) and condition (no-manipulations, masked, AB)<sup>1</sup> for perceptual performance and metacognitive sensitivity revealed no significant interaction (performance: F<sub>2,58</sub>=0.27, P=0.762, BF<sub>01</sub>=8.47; metacognition: F<sub>2,58</sub>=0.54, P=0.586, BF<sub>01</sub>=6.04). This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues (2017) who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”
(6) The marker for illusion-specific processing emerged later (200-250 ms), with the no-manipulation decoding performing better after training on the illusion than the non-illusory triangle. This difference emerged only in the AB condition, and it was fully abolished by masking. The authors confirmed that the illusion-specific processing was not affected by the AB manipulations by running a rm ANOVA which did not result in a significant interaction between condition and training set. However, unlike the other non-significant results, a Bayes Factor is missing here.
We added Bayes factors to all (significant and non-significant) rm ANOVAs.
(7) The same analysis yielded a second illusion decoding peak at 375-475 ms. This effect was impaired by both masking and AB, with no significant differences between the two conditions. The authors stated that this result was directly linked to behavioural performance. However, it is not clear to me what they mean (see point 5).
We added analyses comparing behavior and EEG decoding directly (see our response to point 5).
(8) The introduction starts by stating that perceptual and attentional processes differently affect consciousness access. This differentiation has been studied thoroughly in the consciousness literature, with a focus on how attention differs from consciousness (e.g., Koch & Tsuchiya, TiCS, 2007; Pitts, Lutsyshyna & Hillyard, Phil. Trans. Roy. Soc. B Biol. Sci., 2018). The authors stated that "these findings confirm and enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness clearly distinguishing and specifying the neural profiles of each processing stage of the influential four-stage model of conscious experience". I found it surprising that this aspect was not discussed further. What was the state of the art before this study was conducted? What are the mentioned neural profiles? How did the current results enrich the literature on this topic?
We would like to point out that our study is not primarily concerned with the conceptual distinction between consciousness and attention, which has been the central focus of e.g., Koch and Tsuchiya (2007). While this literature was concerned with ways to dissociate consciousness and attention, we tacitly assumed that attention and consciousness are now generally considered as different constructs. Our study is thus not dealing with dissociations between attention and consciousness, nor with the distinction between phenomenal consciousness and conscious access, but is concerned with different ways of impairing conscious access (defined as the ability to report about a stimulus), either via perceptual or via attentional manipulations. For the state of the art before the study was conducted, we would like to refer to the motivation of our study in the Introduction, e.g., previous studies’ difficulties in unequivocally linking greater local recurrency during attentional than perceptual blindness to the consciousness manipulation, given performance confounds (we expanded this Introduction section). We also expanded a paragraph in the discussion to remind the reader of the neural profiles of the 4-stage model and to highlight the novelty of our findings related to the distinction between lateral and feedback processes:
“As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness (Block, 2005; Dehaene et al., 2006; Hatamimajoumerd et al., 2022; Lamme, 2010; Pitts et al., 2018; Sergent & Dehaene, 2004), clearly distinguishing the neural profiles of each processing stage of the influential four-stage model of conscious experience (Fig. 1A). Along with the distinct temporal and spatial EEG decoding patterns associated with lateral and feedback processing, our findings suggest a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report.”
(9) When stating that this is the first study in which behavioural measures of conscious perception were matched between the attentional blink and masking, it would be beneficial to highlight the main differences between the current study and the one from Fahrenfort et al., 2017, with which the current study shares many similarities in the experimental design (see point 1).
We would like to refer the reviewer to our response to point 1), where we detail how we expanded the discussion of similarities and differences between our present study and Fahrenfort et al. (2017).
(10) The discussion emphasizes how the current study "suggests a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report". For transparency, it is important, though, to highlight that one limitation of the current study is that it does not provide direct evidence for the specified types of connections (see point 6).
We added a qualification in the Discussion section:
“Although the present EEG decoding measures cannot provide direct evidence for feedback vs. lateral processes, based on neurophysiological evidence, …”
Furthermore, we added this qualification in the Discussion section:
“It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processing as well.”
References
Angelucci, A., Levitt, J. B., Walton, E. J. S., Hupe, J.-M., Bullier, J., & Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 22(19), 8633–8646.
Bair, W., Cavanaugh, J. R., & Movshon, J. A. (2003). Time course and time-distance relationships for surround suppression in macaque V1 neurons. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 23(20), 7690–7701.
Block, N. (2005). Two neural correlates of consciousness. Trends in Cognitive Sciences, 9(2), 46–52.
Chen, M., Yan, Y., Gong, X., Gilbert, C. D., Liang, H., & Li, W. (2014). Incremental integration of global contours through interplay between visual cortical areas. Neuron, 82(3), 682–694.
Dehaene, S., Changeux, J.-P., Naccache, L., Sackur, J., & Sergent, C. (2006). Conscious, preconscious, and subliminal processing: a testable taxonomy. Trends in Cognitive Sciences, 10(5), 204–211.
Hatamimajoumerd, E., Ratan Murty, N. A., Pitts, M., & Cohen, M. A. (2022). Decoding perceptual awareness across the brain with a no-report fMRI masking paradigm. Current Biology: CB. https://doi.org/10.1016/j.cub.2022.07.068
JASP Team. (2024). JASP (Version 0.19.0) [Computer software]. https://jasp-stats.org/
Kentridge, R. W., Heywood, C. A., & Weiskrantz, L. (2004). Spatial attention speeds discrimination without awareness in blindsight. Neuropsychologia, 42(6), 831–835.
Kiefer, M., & Brendel, D. (2006). Attentional Modulation of Unconscious “Automatic” Processes: Evidence from Event-related Potentials in a Masked Priming Paradigm. Journal of Cognitive Neuroscience, 18(2), 184–198.
Kouider, S., & Dehaene, S. (2007). Levels of processing during non-conscious perception: a critical review of visual masking. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 857–875.
Lamme, V. A. F. (2010). How neuroscience will change our view on consciousness. Cognitive Neuroscience, 1(3), 204–220.
Luck, S. J., & Hillyard, S. A. (1994). Electrophysiological correlates of feature analysis during visual search. Psychophysiology, 31(3), 291–308.
Naccache, L., Blandin, E., & Dehaene, S. (2002). Unconscious masked priming depends on temporal attention. Psychological Science, 13(5), 416–424.
Pitts, M. A., Lutsyshyna, L. A., & Hillyard, S. A. (2018). The relationship between attention and consciousness: an expanded taxonomy and implications for ‘no-report’ paradigms. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 373(1755), 20170348.
Sergent, C., & Dehaene, S. (2004). Is consciousness a gradual phenomenon? Evidence for an all-or-none bifurcation during the attentional blink. Psychological Science, 15(11), 720–728.
Van den Bussche, E., Van den Noortgate, W., & Reynvoet, B. (2009). Mechanisms of masked priming: a meta-analysis. Psychological Bulletin, 135(3), 452–477.
van Gaal, S., & Lamme, V. A. F. (2012). Unconscious high-level information processing: implication for neurobiological theories of consciousness. The Neuroscientist: A Review Journal Bringing Neurobiology, Neurology and Psychiatry, 18(3), 287–301.
Vogel, E. K., Luck, S. J., & Shapiro, K. L. (1998). Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. Journal of Experimental Psychology. Human Perception and Performance, 24(6), 1656–1674.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this study, Kroll et al. conduct an in-depth behavioral analysis of F0 knockouts of 4 genes associated with late-onset Alzheimer's Disease (AD), together with 3 genes associated with early-onset AD. Kroll and colleagues developed a web application (ZOLTAR) to compare sleep-associated traits of genetic mutants with those obtained from a panel of small molecules to promote the identification of affected pathways and potential therapeutic interventions. The authors make a set of potentially important findings vis-à-vis the relationship between AD-associated genes and sleep. First, they find that loss-of-function in late-onset AD genes universally results in night-time sleep loss, consistent with the well supported hypothesis that sleep disruption contributes to Alzheimer's-related pathologies. psen1, an early-onset associated AD gene, which the authors find is principally responsible for the generation of Aβ40 and Aβ42 in zebrafish, also shows a slight increase in activity at night and slight decreases in night-time sleep. Conversely, psen2 mutations increase daytime sleep, while appa/appb mutations have no impact on sleep. Finally, using ZOLTAR, the authors identify serotonin receptor activity as potentially disrupted in sorl1 mutants, while betamethasone is identified as a potential therapeutic to promote reversal of psen2 knockout-associated phenotypes.
This is a highly innovative and thorough study, yet a handful of key questions remain. First, are night-time sleep loss phenotypes observed in all knockouts for late-onset AD genes in the larval zebrafish a valid proxy for AD risk?
We cannot say, but it is an interesting question. We selected the four late-onset Alzheimer’s risk genes (APOE, CD2AP, CLU, SORL1) based on human genetics data and brain expression in zebrafish larvae, not based on their likelihood to modify sleep behaviour, which we could have tried by searching for overlaps with GWAS of sleep phenotypes, for example. Consequently, we find it remarkable that all four of these genes caused a night-time sleep phenotype when mutated. We also find it reassuring that knockout of appa/appb and psen2 did not cause a night-time sleep phenotype, which largely excludes the possibility that the phenotype is a technical artefact (e.g. caused by the F0 knockout method) or a property of every gene expressed in the larval brain.
Having said that, it could still be a coincidence, rather than a special property of genes associated with late-onset AD. In addition to testing additional late-onset Alzheimer’s risk genes, the ideal way to answer this question would be to test in parallel a random set of genes expressed in the brain at this stage of development. From this random set, one could estimate the proportion of genes that cause a night-time sleep phenotype when mutated. One could then use that information to test whether late-onset Alzheimer’s risk genes are indeed enriched for genes that cause a night-time sleep phenotype when mutated.
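The enrichment test described above can be sketched with a binomial tail probability. This is a minimal illustration only: the background rates are hypothetical placeholders (no random gene set has been screened), the function name is invented for the example, and the calculation treats the background rate as known exactly, whereas a real analysis would also account for the uncertainty in estimating it (for instance with Fisher's exact test on the two sets of genes).

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Observed in the study: 4 of 4 late-onset risk genes gave a night-time sleep phenotype.
observed_hits, genes_tested = 4, 4

# Hypothetical background rates: the fraction of random brain-expressed genes
# that would give the same phenotype. These values are NOT measured.
for background_rate in (0.10, 0.25, 0.50):
    p_value = binomial_upper_tail(observed_hits, genes_tested, background_rate)
    print(f"background rate {background_rate:.2f}: P(4/4 by chance) = {p_value:.4f}")
```

The sketch makes the dependence explicit: with only four genes tested, the conclusion hinges almost entirely on how common the phenotype is among random brain-expressed genes.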
For those mutants that cause night-time sleep disturbances, do these phenotypes share a common underlying pathway? e.g. Do 5-HT reuptake inhibitors promote sleep across all 4 late-onset genes in addition to psen1? Can 5-HT reuptake inhibitors reverse other AD-related pathologies in zebrafish? Can compounds be identified that have a common behavioral fingerprint across all or multiple AD risk genes? Do these modify sleep phenotypes?
To attempt to answer these questions, we used ZOLTAR to generate predictions for all the knockout behavioural fingerprints presented in the study, in the same way as for sorl1 in Fig. 5 and Fig. 5–supplement 1. Here are the indications, targets, and KEGG pathways which are shared by the largest number of knockouts (Author response image 1):
– One indication is shared by 4/7 knockouts: “opioid dependence” (significant for appa/appb, psen1, apoea/apoeb, cd2ap).
– Four targets are shared by 4/7 knockouts: “strychnine-binding glycine receptor” (psen1, apoea/apoeb, clu, sorl1); “neuronal acetylcholine receptor beta-2” (psen1, apoea/apoeb, cd2ap, clu); thyroid peroxidase (psen1, apoea/apoeb, cd2ap, clu); carbonic anhydrase IV (appa/appb, psen1, psen2, cd2ap).
– Three KEGG pathways are shared by 5/7 knockouts: “cholinergic synapse” (psen1, apoea/apoeb, cd2ap, clu, sorl1); tyrosine metabolism (psen2, apoea/apoeb, cd2ap, clu, sorl1); and “nitrogen metabolism” (appa/appb, psen1, psen2, apoea/apoeb, cd2ap).
As a reminder, we hypothesised that loss of Sorl1 affected serotonin signalling based on the following annotations being significant: indication “depression”, target “serotonin transporter”, and KEGG pathway “serotonergic synapse”. Indication “depression” is only significant for sorl1 knockouts; target “serotonin transporter” is also significant for appa/appb and psen2 knockouts; and KEGG pathway “serotonergic synapse” is also significant for psen2 knockouts. ZOLTAR therefore does not predict serotonin signalling to be a major theme common to all mutants with a night-time sleep loss phenotype.
Particularly interesting is cholinergic signalling appearing in the most common targets and KEGG pathways. Acetylcholine signalling is a major theme in research on AD. For example, the first four drugs ever approved by the FDA to treat AD were acetylcholinesterase inhibitors, which increase acetylcholine signalling by preventing its breakdown by acetylcholinesterase. These drugs are generally considered only to treat symptoms and not modify disease course, but this view has been called into question (Munoz-Torrero, 2008; Relkin, 2007). If, as ZOLTAR suggests, mutations in several Alzheimer’s risk genes affect cholinergic signalling early in development, this would point to a potential causal role of cholinergic disruption in AD.
Author response image 1.
Common predictions from ZOLTAR for the seven Alzheimer’s risk genes tested. Predictions from ZOLTAR which are shared by multiple knockout behavioural fingerprints presented in the study. Only indications, targets, and KEGG pathways which are significant for at least three of the seven knockouts tested are shown, ranked from the annotations which are significant for the largest number of knockouts.
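The filtering and ranking described in this caption can be sketched as follows. The dictionary holds only the KEGG pathways named in the text of this response, so it is incomplete by construction, and the function name and data layout are assumptions for illustration rather than ZOLTAR's actual code.

```python
from collections import Counter

# Significant KEGG pathways per knockout, transcribed only from this response.
significant = {
    "appa/appb": {"nitrogen metabolism"},
    "psen1": {"cholinergic synapse", "nitrogen metabolism"},
    "psen2": {"tyrosine metabolism", "nitrogen metabolism", "serotonergic synapse"},
    "apoea/apoeb": {"cholinergic synapse", "tyrosine metabolism", "nitrogen metabolism"},
    "cd2ap": {"cholinergic synapse", "tyrosine metabolism", "nitrogen metabolism"},
    "clu": {"cholinergic synapse", "tyrosine metabolism"},
    "sorl1": {"cholinergic synapse", "tyrosine metabolism", "serotonergic synapse"},
}


def shared_annotations(per_knockout, min_knockouts=3):
    """Count knockouts per annotation, keep the frequent ones, rank by count."""
    counts = Counter(a for annotations in per_knockout.values() for a in annotations)
    kept = [(name, n) for name, n in counts.items() if n >= min_knockouts]
    # Most widely shared first; ties broken alphabetically for a stable order.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


for name, n in shared_annotations(significant):
    print(f"{name}: significant for {n}/{len(significant)} knockouts")
```

With the threshold of three knockouts, “serotonergic synapse” (significant for sorl1 and psen2 only) is filtered out, which matches the statement that serotonin signalling is not a theme common to all mutants.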
Finally, the web-based platform presented could be expanded to facilitate comparison of other behavioral phenotypes, including stimulus-evoked behaviors.
Yes, absolutely. The behavioural dataset we used (Rihel et al., 2010) did not measure responses to stimuli other than day/night light transitions, but the “SauronX” platform and dataset (Myers-Turnbull et al., 2022) seem particularly well suited for this. To provide some context, we and collaborators have occasionally used the dataset by Rihel et al. (2010) to generate hypotheses or find candidate drugs that reverse a behavioural phenotype measured in the sleep/wake assay (Ashlin et al., 2018; Hoffman et al., 2016). The present work was the occasion to enable a wider and more intuitive use of this dataset through the ZOLTAR app, which has already proven successful. Future versions of ZOLTAR may seek to incorporate larger drug datasets using more types of measurements.
Finally, the authors propose but do not test the hypothesis that sorl1 might regulate localization/surface expression of 5-HT2 receptors. This could provide exciting / more convincing mechanistic support for the assertion that serotonin signaling is disrupted upon loss of AD-associated genes.
While working on the Author Response, we made some changes to the analysis run by ZOLTAR to calculate enrichments (see Methods and github.com/francoiskroll/ZOLTAR, notes on v2). With the new version, 5-HT receptor type 2 is not a significantly enriched target for the sorl1 knockout fingerprint but type 4 is. 5-HT receptor type 4 was also shown to interact with sorting nexin 27, a subunit of retromer, so is a promising candidate (Joubert et al., 2004). Antibodies against human 5-HT receptor type 2 and 4a exist; whether they would work in zebrafish remains to be tested. In our experience, the availability of antibodies suitable for immunohistochemistry in the zebrafish is a serious experimental roadblock.
Note, all the results presented in the “Version of Record” are from ZOLTAR v2.
Despite these important considerations, this study provides a valuable platform for high-throughput analysis of sleep phenotypes and correlation with small-molecule-induced sleep phenotypes.
Strengths:
- Provides a useful platform for comparison of sleep phenotypes across genotypes/drug manipulations.
- Presents convincing evidence that night-time sleep is disrupted in mutants for multiple late onset AD-related genes.
- Provides potential mechanistic insights for how AD-related genes might impact sleep and identifies a few drugs that modify their identified phenotypes
Weaknesses:
- Exploration of potential mechanisms for serotonin disruption in sorl1 mutants is limited.
- The pipeline developed can only be used to examine sleep-related / spontaneous movement phenotypes and stimulus-evoked behaviors are not examined.
- Comparisons between mutants/exploration of commonly affected pathways are limited.
Thank you for these excellent suggestions, please see our answers above.
Reviewer #2 (Public Review):
Summary:
This work delineates the larval zebrafish behavioral phenotypes caused by the F0 knockout of several important genes that increase the risk for Alzheimer's disease. Using behavioral pharmacology, comparing the behavioral fingerprint of previously assayed molecules to the newly generated knockout data, compounds were discovered that impacted larval movement in ways that suggest interaction with or recovery of disrupted mechanisms.
Strengths:
This is a well-written manuscript that uses newly developed analysis methods to present the findings in a clear, high-quality way. The addition of an extensive behavioral analysis pipeline is of value to the field of zebrafish neuroscience and will be particularly helpful for researchers who prefer the R programming language. Even the behavioral profiling of these AD risk genes, regardless of the pharmacology aspect, is an important contribution. The recovery of most behavioral parameters in the psen2 knockout with betamethasone, predicted by comparing fingerprints, is an exciting demonstration of the approach. The hypotheses generated by this work are important stepping stones to future studies uncovering the molecular basis of the proposed gene-drug interactions and discovering novel therapeutics to treat AD or co-occurring conditions such as sleep disturbance.
Weaknesses:
- The overarching concept of the work is that comparing behavioral fingerprints can align genes and molecules with similarly disrupted molecular pathways. While the recovery of the psen2 phenotypes by one molecule with the opposite phenotype is interesting, as are previous studies that show similar behaviorally-based recoveries, the underlying assumption that normalizing the larval movement normalizes the mechanism still lacks substantial support. There are many ways that a reduction in movement bouts could be returned to baseline that are unrelated to the root cause of the genetically driven phenotype. An ideal experiment would be to thoroughly characterize a mutant, such as by identifying a missing population of neurons, and use this approach to find a small molecule that rescues both behavior and the cellular phenotype. If the connection to serotonin in the sorl1 was more complete, for example, the overarching idea would be more compelling.
Thank you for this cogent criticism.
On the first point, we were careful not to claim that betamethasone normalises the molecular/cellular mechanism that causes the psen2 behavioural phenotype. Having said that, yes, to a certain extent that would be the hope of the approach. As you say, every compound which normalises the behavioural fingerprint will not normalise the underlying mechanism, but the opposite seems true: every compound that normalises the underlying mechanism should also normalise the behavioural fingerprint. We think this logic makes the “behaviour-first” approach innovative and interesting. The logic is to discover compounds that normalise the behavioural phenotype first, only subsequently test whether they also normalise the molecular mechanism, akin to testing first whether a drug resolves the symptoms before testing whether it actually modifies disease course. While in practice testing thousands of drugs in sufficient sample sizes and replicates on a mutant line is challenging, the dataset queried through ZOLTAR provides a potential shortcut by shortlisting in silico compounds that have the opposite effect on behaviour.
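The in silico shortlisting idea can be illustrated with a small sketch that ranks compounds by how strongly their behavioural fingerprint opposes a knockout fingerprint. Everything here is an assumption for illustration: cosine similarity is used as a stand-in similarity measure, the compound names and fingerprint values are invented, and the function names do not correspond to ZOLTAR's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two fingerprints of equal length."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same number of parameters")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an all-zero fingerprint")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def shortlist_opposite(knockout, library, top_n=2):
    """Return the top_n compounds whose fingerprints are most anti-correlated."""
    scored = [(name, cosine_similarity(knockout, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: item[1])  # most negative similarity first
    return scored[:top_n]


# Hypothetical z-scored fingerprints (six behavioural parameters each).
knockout_fingerprint = [-1.0, -0.5, 0.2, 0.0, 0.8, -0.3]
drug_library = {
    "compound_A": [1.0, 0.5, -0.2, 0.0, -0.8, 0.3],   # mirror image of the knockout
    "compound_B": [-1.0, -0.5, 0.2, 0.0, 0.8, -0.3],  # same direction as the knockout
    "compound_C": [0.1, -0.2, 0.3, 0.5, 0.0, 0.1],    # largely unrelated
}

for name, score in shortlist_opposite(knockout_fingerprint, drug_library):
    print(f"{name}: similarity = {score:+.2f}")
```

A compound at the top of such a list is only a candidate: as argued above, reversing the fingerprint does not by itself show that the underlying mechanism is normalised.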
You mention a “reduction in movement bouts” but note here that the number of behavioural parameters tested is key to our argument. To take the two extremes, say the only behavioural parameter we measured in psen2 knockout larvae was time active during the day, then, yes, any stimulant used at the right concentration could probably normalise the phenotype. In this situation, claiming that the stimulant is likely to also normalise the underlying mechanism, or even that it is a genuine “phenotypic rescue”, would not be convincing. Conversely, say we were measuring thousands of behavioural parameters under various stimuli, such as swimming speed, position in the well, bout usage, tail movements, and eye angles, it seems almost impossible for a compound to rescue most parameters without also normalising the underlying mechanism. The present approach is somewhere in between: ZOLTAR uses six behavioural parameters for prediction (e.g. Fig 6a), but all 17 parameters calculated by FramebyFrame can be used to assess rescue during a subsequent experiment (Fig. 6c). For both, splitting each parameter in day and night increases the resolution of the approach, which partly answers your criticism. For example, betamethasone rescued the day-time hypoactivity without causing night-time hyperactivity, so we are not making the “straw man argument” explained above of using any broad stimulant to rescue the hypoactivity phenotype.
Furthermore, for diseases where the behavioural defect is the primary concern, such as autism or bipolar disorder, perhaps this behaviour-first approach is all that is needed, and whether or not the compound precisely rescues the underlying mechanism is somewhat secondary. The use of lithium to prevent manic episodes in bipolar disorder is a good example. It was initially tested because mania was thought to be caused by excess uric acid and lithium can dissolve uric acid (Mitchell and Hadzi-Pavlovic, 2000). The theory is now discredited, but lithium continues to be used without a precise understanding of its mode of action. In this example, behavioural rescue alone, assuming the secondary effects are tolerable, is sufficient to be beneficial to patients, and whether it modulates the correct causal pathway is secondary.
On the second point, we agree that first testing ZOLTAR on a mutant for which we have a fairly good understanding of the mechanism causing the behavioural phenotype could have been a productive approach. Note, however, that examples already exist in the literature (Ashlin et al., 2018; Hoffman et al., 2016). The example from Hoffman et al. (2016) is especially convincing. Drugs generating behavioural fingerprints that positively correlate with the cntnap2a/cntnap2b double knockout fingerprint were enriched with NMDA and GABA receptor antagonists. In experiments analogous to our citalopram and fluvoxamine treatments (Fig. 5c,d and Fig. 5–supplement 1c,d), cntnap2a/cntnap2b knockout larvae were overly sensitive to the NMDA receptor antagonist MK-801 and the GABAA receptor antagonist pentylenetetrazol (PTZ). Among other drugs tested, zolpidem, a GABAA receptor agonist, caused opposite effects on wild-type and cntnap2a/cntnap2b knockout larvae. Knockout larvae were found to have fewer GABAergic neurons in the forebrain. While these studies did not use precisely the same analysis that ZOLTAR runs, they used the same rationale and behavioural dataset to make these predictions (Rihel et al., 2010), which shows that approaches like ZOLTAR can point to causal processes.
On your last point, we hope our experiment testing fluvoxamine, another selective serotonin reuptake inhibitor (SSRI), makes the connection between Sorl1 and serotonin signalling more convincing.
- The behavioral difference between the sorl1 KO and scrambled at the higher dose of the citalopram is based on a small number of animals. The KO Euclidean distance measure is also more spread out than for the other datasets, and it looks like only five or so fish are driving the group difference. It also appears as though the numbers were also from two injection series. While there is nothing obviously wrong with the data, I would feel more comfortable if such a strong statement of a result from a relatively subtle phenotype were backed up by a higher N or a stable line. It is not impossible that the observed difference is an experimental fluke. If something obvious had emerged through the HCR, that would have also supported the conclusions. As it stands, if no more experiments are done to bolster the claim, the confidence in the strength of the link to serotonin should be reduced (possibly putting the entire section in the supplement and modifying the discussion). The discussion section about serotonin and AD is interesting, but I think that it is excessive without additional evidence.
We mostly agree with this criticism. One could interpret the larger spread of the data for sorl1 KO larvae treated with 10 µM citalopram as evidence that the knockout larvae do indeed react differently to the drug at this dose, regardless of being driven by a subset of the animals. The result indeed does not survive removing the top 5 (p = 0.87) or top 3 (p = 0.18) sorl1 KO + 10 µM larvae, but this amounts to excluding roughly 35% (5/14) or 20% (3/14) of the datapoints as potential outliers, which is unreasonable. In fact, excluding the top 5 sorl1 KO + 10 µM is equivalent to calling any datapoint with z-score > 0.2 an outlier (z-scores of the top 5 datapoints are 0.2–1.8). Applying the same criterion consistently to the scrambled + 10 µM group would remove the top 6 datapoints (z-scores = 0.5–3.9). Comparing the resulting two distributions again gives the sorl1 KO + 10 µM distribution as significantly higher (p = 0.0015). We would also mention that Euclidean distance, as a summary metric for distance between behavioural fingerprints, has limitations. For example, the measure will be more sensitive to changes in some parameters but not others, depending on how much room there is for a given parameter to change. We included this metric to lend support to the observation one can draw from the fingerprint plot (Fig. 5c) that sorl1 mutants respond in an exaggerated way to citalopram across many parameters, while being agnostic to which parameter might matter most.
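To make the two quantities in this reply concrete, here is a minimal Python sketch of a Euclidean distance between fingerprints and of a z-score outlier cutoff applied identically to any group. It is an illustration only, not the code used in the study: the function names are hypothetical, and computing z-scores within each group from the sample standard deviation is an assumption.

```python
import math
import statistics


def euclidean_distance(fp_a, fp_b):
    """Euclidean distance between two fingerprints (equal-length lists of z-scores)."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same number of parameters")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fp_a, fp_b)))


def within_group_zscores(values):
    """z-score of each datapoint relative to its own group (assumption: sample SD)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def drop_above_cutoff(values, cutoff):
    """Keep only the datapoints whose within-group z-score is at or below the cutoff."""
    zscores = within_group_zscores(values)
    return [v for v, z in zip(values, zscores) if z <= cutoff]
```

Calling `drop_above_cutoff` with the same cutoff on both treatment groups is what applying the criterion consistently means here: the rule is defined once and not tuned per group.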
Given that the HCR did not reveal anything striking, we agree with you that too much of our argument relied on this result being robust. As you and Reviewer #3 suggested, we repeated this experiment with a different SSRI, fluvoxamine (Fig. 5–supplement 1). We cannot readily explain why the result was opposite to what we found with citalopram, but in both cases sorl1 knockout larvae reacted differently than their control siblings, which adds an argument to our claim that ZOLTAR correctly predicted serotonin signalling as a disrupted pathway from the behavioural fingerprint. Accordingly, we mostly kept the Discussion on Sorl1 the same, although we concede that we may not have identified the molecular mechanism.
- The authors suggest two hypotheses for the behavioral difference between the sorl1 KO and scrambled at the higher dose of the citalopram. While the first is tested, and found to not be supported, the second is not tested at all ("Ruling out the first hypothesis, sorl1 knockouts may react excessively to a given spike in serotonin." and "Second, sorl1 knockouts may be overly sensitive to serotonin itself because post-synaptic neurons have higher levels of serotonin receptors."). Assuming that the finding is robust, there are probably other reasons why the mutants could have a different sensitivity to this molecule. However, if this particular one is going to be mentioned, it is surprising that it was not tested alongside the first hypothesis. This work could proceed without a complete explanation, but additional discussion of the possibilities would be helpful or why the second hypothesis was not tested.
There are no strong scientific reasons why this hypothesis was not tested. The lead author (F Kroll) moved to a different lab and country so the project was finalised at that time. We do not plan on testing this hypothesis at this stage. However, we adapted the wording to make it clear this is one possible alternative hypothesis which could be tested in the future. The small differences found by HCR are actually more in line with the new results from the fluvoxamine experiment, so it may also be that both hypotheses (pre-synaptic neurons releasing less serotonin when reuptake is blocked; or post-synaptic neurons being less sensitive) contribute. The fluvoxamine experiment was performed in a different lab (ICM, Paris; all other experiments were done in UCL, London) in a different wild-type strain (TL in ICM, AB x Tup LF in UCL), which complicates how one interprets this discrepancy.
- The authors claim that "all four genes produced a fairly consistent phenotype at night". While it is interesting that this result arose in the different lines, the second clutch for some genes did not replicate as well as others. I think the findings are compelling, regardless, but the sometimes missing replicability should be discussed. I wonder if the F0 strategy adds noise to the results and if clean null lines would yield stronger phenotypes. Please discuss this possibility, or others, in regard to the variability in some phenotypes.
For the first part of this point, please see below our answer to Reviewer #3, point (2) c.
Regarding the F0 strategy potentially adding variability, it is an interesting question which we tested in a larger dataset of behavioural recordings from F0 and stable knockouts for the same genes (unpublished). In summary, the F0 knockout method does not increase clutch-to-clutch or larva-to-larva variability in the assay. F0 knockout experiments found many more significant parameters and larger effect sizes than stable knockout experiments, but this difference could largely be explained by the larger sample sizes of F0 knockout experiments. In fact, larger sample sizes within individual clutches appear to be a major advantage of the F0 knockout approach over in-cross of heterozygous knockout animals as they increase sensitivity of the assay without causing substantial variability. We plan to report in more detail on this analysis in a separate paper as we think it would dilute the focus of the present work.
- In this work, the knockout of appa/appb is included. While APP is a well-known risk gene, there is no clear justification for making a knockout model. It is well known that the upregulation of app is the driver of Alzheimer's, not downregulation. The authors even indicate an expectation that it could be similar to the other knockouts ("Moreover, the behavioural phenotypes of appa/appb and psen1 knockout larvae had little overlap while they presumably both resulted in the loss of Aβ." and "Comparing with early-onset genes, psen1 knockouts had similar night-time phenotypes, but loss of psen2 or appa/appb had no effect on night-time sleep."). There is no reason to expect similarity between appa/appb and psen1/2. I understand that the app knockouts could unveil interesting early neurodevelopmental roles, but the manuscript needs to be clarified that any findings could be the opposite of expectation in AD.
On “there is no reason to expect similarity […]”, we disagree. Knockout of appa/appb and knockout of psen1 will both result in loss of Aβ (appa/appb encode Aβ and psen1 cleaves Appa/Appb to release Aβ, cf. Fig. 3e). Consequently, a phenotype caused by the loss of Aβ, or possibly other Appa/Appb cleavage products, should logically be found in both appa/appb and psen1 knockouts.
On “it is well known that the upregulation of APP is the driver of Alzheimer’s, not downregulation”; we of course agree. Among others, the examples of Down syndrome, APP duplication (Sleegers et al., 2006), or mouse models overexpressing human APP show definitively that overexpression of APP is sufficient to cause AD. Having said that, we would not be so quick to dismiss APP knockout as potentially relevant to understanding AD.
Loss of soluble Aβ due to aggregation could contribute to pathology (Espay et al., 2023). Without getting too much into this intricate debate, links between levels of Aβ and risk of disease are often counter-intuitive too. For example, out of 138 PSEN1 mutations screened in vitro, 104 reduced total Aβ production and 11 even seemingly abolished the production of both Aβ40 and Aβ42 (Sun et al., 2017). In short, loss of soluble Aβ occurs in both AD and in our appa/appb knockout larvae.
We added a sentence in Results (section psen2 knockouts […]) to briefly justify our appa/appb knockout approach. To be clear, we do not want to imply, for example, that the absence of a night-time sleep phenotype for appa/appb is contradictory to the body of literature showing links between Aβ and sleep, including in zebrafish (Özcan et al., 2020). As you say, our experiment tested loss of App, including Aβ, while the literature typically reports on overexpression of APP, as in APP/PSEN1-overexpressing mice (Jagirdar et al., 2021).
Reviewer #3 (Public Review):
In this manuscript by Kroll and colleagues, the authors describe combining behavioral pharmacology with sleep profiling to predict disease and potential treatment pathways at play in AD. AD is used here as a case study, but the approaches detailed can be used for other genetic screens related to normal or pathological states for which sleep/arousal is relevant. The data are for the most part convincing, although generally the phenotypes are relatively small and there are no major new mechanistic insights. Nonetheless, the approaches are certainly of broad interest and the data are comprehensive and detailed. A notable weakness is the introduction, which overly generalizes numerous concepts and fails to provide the necessary background to set the stage for the data.
Major points
(1) The authors should spend more time explaining what they see as the meaning of the large number of behavioral parameters assayed and specifically what they tell readers about the biology of the animal. Many are hard to understand--e.g. a "slope" parameter.
We agree that some parameters do not tell something intuitive about the biology of the animal. It would be easy to speculate. For example, the “activity slope” parameter may indicate how quickly the animal becomes tired over the course of the day. On the other hand, fractal dimension describes the “roughness/smoothness” of the larva’s activity trace (Fig. 2–supplement 1a); but it is not obvious how to translate this into information about the physiology of the animal. We do not see this as an issue though. While some parameters do provide intuitive information about the animal’s behaviour (e.g. sleep duration or sunset startle as a measure of startle response), the benefit of having a large number of behavioural parameters is to compare behavioural fingerprints and assess rescue of the behavioural phenotype by small molecules (Fig. 6c). For this purpose, the more parameters the better. The “MoSeq” approach from Wiltschko et al., 2020 is a good example from literature that inspired our own Fig. 6c. While some of the “behavioural syllables” may be intuitive (e.g. running or grooming), it is probably pointless to try to explain the ‘meaning’ of the “small left turn in place with head motion” syllable (Wiltschko et al., 2020). Nonetheless, this syllable was useful to assess whether a drug specifically treats the behavioural phenotype under study without causing too many side effects. Unfortunately, ZOLTAR has to reduce the FramebyFrame fingerprint (17 parameters) to just six parameters to compare it to the behavioural dataset from Rihel et al., 2010, but here, more parameters would almost certainly translate into better predictions too, regardless of their intuitiveness.
It is true however that we did not give much information on how some of the less intuitive parameters, such as activity slope or fractal dimension, are calculated or what they describe about the dataset (e.g. roughness/smoothness for fractal dimension). We added a few sentences in the legend of Fig. 2–supplement 1.
(2) Because in the end the authors did not screen that many lines, it would increase confidence in the phenotypes to provide more validation of KO specificity. Some suggestions include:
a. The authors cite a psen1 and psen2 germline mutant lines. Can these be tested in the FramebyFrame R analysis? Do they phenocopy F0 KO larvae?
We unfortunately do not have those lines. We investigated the possibility of importing a psen2 knockout line from abroad, but the process of shipping live animals is becoming more and more cost- and time-prohibitive. However, we observed the same pigmentation phenotype for psen2 knockouts as reported by Jiang et al., 2018, which is at least a partial confirmation of phenocopying a loss-of-function stable mutant.
b. psen2_KO is one of the larger centerpieces of the paper. The authors should present more compelling evidence that animals are truly functionally null. Without this, how do we interpret their phenotypes?
We disagree that there should be significant doubt about these mutants being truly functionally null, given the high mutation rate and presence of the expected pigmentation phenotype (Jiang et al., 2018, Fig. 3f and Fig. 3–supplement 3a). The psen2 F0 knockouts were virtually 100% mutated at three exons across the gene (mutation rates were locus 1: 100 ± 0%; locus 2: 99.99 ± 0.06%; locus 3: 99.85 ± 0.24%). Additionally, two of the three mutated exons had particularly high rates of frameshift mutations (locus 1: 97 ± 5%; locus 2: 88 ± 17% frameshift mutation rate). It is virtually impossible that a functional protein is translated given this burden of frameshift mutations. Phenotypically, in addition to the pigmentation defect, double psen1/psen2 F0 knockout larvae had curved tails, the same phenotype as caused by a high dose of the γ-secretase inhibitor DAPT (Yang et al., 2008). These double F0 knockouts were lethal, while knockout of psen1 or psen2 alone did not cause obvious morphological defects. Evidently, most larvae must have been psen2 null mutants in this experiment, otherwise functional Psen2 would have prevented early lethality.
Translation of zebrafish psen2 can start at downstream start codons if the first exon has a frameshift mutation, generating a seemingly functional Psen2 missing the N-terminus (Jiang et al., 2020). Zebrafish homozygous for this early frameshift mutation had normal pigmentation, showing it is a reliable marker of Psen2 function even when it is mutated. This mechanism is not a concern here as the alternative start codons are still upstream of two of the three mutated exons (the alternative start codons discovered by Jiang et al., 2020 are in exon 2 and 3, but we targeted exon 3, exon 4, and exon 6).
We understand that the zebrafish community may be cautious about F0 phenotyping compared to stably generated mutants. As mentioned to Reviewer #2, we are planning to assemble a paper that expressly compares behavioural phenotypes measured in F0 vs. stable mutants to allay some of these concerns. Our current manuscript, which combines CRISPR-Cas9 rapid F0 screening with in silico pharmacological predictions, inevitably represents a first step in characterizing the functions of these genes.
c. Related to the above, for cd2AP and sorl1 KO, some of the effect sizes seem to be driven by one clutch and not the other. In other words, great clutch-to-clutch variability. Should the authors increase the number of clutches assayed?
Correct, there is substantial clutch-to-clutch variability in this behavioural assay. This is not specific to our experiments. Even within the same strain, wild-type larvae from different clutches (i.e. non-siblings) behave differently (Joo et al., 2021). This is why it is essential to compare behavioural phenotypes within individual clutches (i.e. from a single pair of parents, one male and one female), as we explain in Methods (section Behavioural video-tracking) and in the documentation of the FramebyFrame package. We often see two different experimental designs in the literature: comparing non-sibling wild-type and mutant larvae, or pooling different clutches which include all genotypes (e.g. pooling multiple clutches from heterozygous in-crosses or pooling wild-type clutches before injecting them). The first experimental design causes false positive findings (Joo et al., 2021), as the clutch-to-clutch variability we and others observe gets interpreted as a behavioural phenotype. The second experimental design should not cause false positives but likely decreases the sensitivity of the assay by increasing the spread within genotypes. In both cases, the clutch-to-clutch variability is hidden, either by interpreting it as a phenotype (first case) or by adding it to animal-to-animal variability (second case). Our experimental design is technically more challenging as it requires obtaining large clutches from unique pairs of parents. However, this approach is better as it clearly separates the different sources of variability (clutch-to-clutch or animal-to-animal). As for every experiment, yes, a larger number of replicates would be better, but we do not plan to assay additional clutches at this time. Our work heavily focuses on the sorl1 and psen2 knockout behavioural phenotypes.
The key aspects of these phenotypes were effectively tested in four experiments (five to six clutches) as sorl1 knockout larvae were also tracked in the citalopram and fluvoxamine experiments (Fig. 5 and Fig. 5–supplement 1), and psen2 knockout larvae were also tracked in the small molecule rescue experiment (Fig. 6 and Fig. 6–supplement 1).
The psen2 behavioural phenotype replicated well across the six clutches tested (pairwise cosine similarities: 0.62 ± 0.15; Author response image 2a). 5/6 clutches were less active and initiating more sleep bouts during the day, as we claimed in Fig. 3.
In the citalopram experiment, the H<sub>2</sub>O-treated sorl1 knockout fingerprint replicated the baseline recordings in Fig. 4 fairly well, despite the smaller sample size (cos = 0.30 and 0.78; Author response image 2b, see “KO Fig. 5”). 5/6 of the significant parameters presented in Fig. 4–supplement 4 moved in the same direction, and knockout larvae were also hypoactive during the day but hyperactive at night. Note that two clutches were tracked on the same 96-well plate in this experiment. We calculated each larva’s z-score using the average of its control siblings, then we averaged all the z-scores to generate the fingerprint. The H<sub>2</sub>O-treated sorl1 knockout clutch from the fluvoxamine experiment did not replicate the baseline recordings well (cos = 0.08 and 0.11; Author response image 2b, see “KO Fig. 5–suppl. 1”). Knockout larvae were hypoactive during the day as expected, but behaviour at night was not as robustly affected. As mentioned above, knockouts were made in a different genetic background (TL, instead of AB x Tup LF used for all other experiments), which could explain the discrepancy.
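The two-clutch normalisation described above can be sketched in Python as follows. This is a hedged illustration, not the FramebyFrame implementation: the function names are hypothetical, and dividing by the standard deviation of the control siblings is an assumption (the text only states that each larva is z-scored against the average of its control siblings).

```python
import statistics


def zscores_vs_siblings(ko_values, control_values):
    """z-score each knockout larva against the control siblings of its own clutch.

    Assumption: the deviation from the control mean is scaled by the control SD.
    """
    ctrl_mean = statistics.mean(control_values)
    ctrl_sd = statistics.stdev(control_values)
    return [(v - ctrl_mean) / ctrl_sd for v in ko_values]


def fingerprint_value(clutches):
    """Fingerprint entry for ONE behavioural parameter.

    `clutches` is a list of (ko_values, control_values) pairs, one pair per clutch.
    Each larva is normalised within its own clutch first; only then are the
    z-scores of all clutches pooled and averaged.
    """
    pooled = []
    for ko_values, control_values in clutches:
        pooled.extend(zscores_vs_siblings(ko_values, control_values))
    return statistics.mean(pooled)
```

Normalising before pooling keeps a difference in baseline between the two clutches from being mistaken for a knockout effect.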
We also took the opportunity to check whether our SSRI treatments replicated well the data from Rihel et al., 2010. For both citalopram (n = 3 fingerprints in the database) and fluvoxamine (n = 4 fingerprints in the database), replication was excellent (cos ≥ 0.67 for all comparisons of a fingerprint from this study vs. a fingerprint from Rihel et al. 2010; Author response image 2c,d). Note that the scrambled + 10 µM citalopram and + 10 µM fluvoxamine fingerprints correlate extremely well (cos = 0.92; can be seen in Author response image 2c,d), which was predicted by the small molecule screen dataset.
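For readers unfamiliar with the metric, the cosine similarity used throughout this reply can be computed as in the minimal Python sketch below. The function names are hypothetical and the sketch is not taken from the FramebyFrame or ZOLTAR code. A value of 1 means two fingerprints point in the same direction across parameters, 0 means they are unrelated, and −1 means they are opposite.

```python
import math
from itertools import combinations


def cosine_similarity(fp_a, fp_b):
    """Cosine of the angle between two fingerprints (equal-length lists of z-scores)."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same number of parameters")
    dot = sum(a * b for a, b in zip(fp_a, fp_b))
    norm_a = math.sqrt(sum(a * a for a in fp_a))
    norm_b = math.sqrt(sum(b * b for b in fp_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an all-zero fingerprint")
    return dot / (norm_a * norm_b)


def pairwise_cosine(fingerprints):
    """Cosine similarity for every pair of fingerprints, keyed by their indices."""
    return {
        (i, j): cosine_similarity(fingerprints[i], fingerprints[j])
        for i, j in combinations(range(len(fingerprints)), 2)
    }
```

Because the cosine ignores vector length, two fingerprints with the same shape but different effect sizes still score close to 1, unlike with the Euclidean distance.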
Author response image 2.
Replication of psen2 and sorl1 F0 knockout fingerprints and SSRI treatments from Rihel et al., 2010. a, (left) Every psen2 F0 knockout behavioural fingerprint generated in this study. Each dot represents the mean deviation from the same-clutch scrambled-injected mean for that parameter (z-score, mean ± SEM). From the experiments in Fig. 6, the psen2 F0 knockout + H<sub>2</sub>O fingerprints are presented. The fingerprints in grey (“not shown”) are from a preliminary drug treatment experiment we did not include in the final study. These fingerprints are from psen2 F0 knockout larvae treated with 0.2% DMSO, normalised to scrambled-injected siblings also treated with 0.2% DMSO. (right) Pairwise cosine similarities (−1.0–1.0) for the fingerprints presented. b, Every sorl1 F0 knockout behavioural fingerprint, as in a). c, The scrambled-injected + citalopram (10 µM) fingerprints (grey) in comparison to the citalopram (10–15 µM) fingerprints from the Rihel et al., 2010 database (green). d, The scrambled-injected + fluvoxamine (10 µM) fingerprint (grey) in comparison to the fluvoxamine fingerprints from the Rihel et al., 2010 database (pink). In c) and d), the scrambled-injected fingerprints are from the experiments in Fig. 5 and Fig. 5–suppl. 1, but were converted here into the behavioural parameters used by Rihel et al., 2010 for comparison. Parameters: 1, average activity (sec active/min); 2, average waking activity (sec active/min, excluding inactive minutes); 3, total sleep (hr); 4, number of sleep bouts; 5, sleep bout length (min); 6, sleep latency (min until first sleep bout).
(3) The authors make the point that most of the AD risk genes are expressed in fish during development. Is there public data to comment on whether the genes of interest are expressed in mature/old fish as well? Just because the genes are expressed early does not at all mean that early-life dysfunction is related to future AD (though this could be the case, of course). Genes with exclusive developmental expression would be strong candidates for such an early-life role, however. I presume the case is made because sleep studies are mainly done in juvenile fish, but I think it is really a pretty minor point and such a strong claim does not even need to be made.
This is a fair criticism but we do not make this claim (“early-life dysfunction is related to future AD”) from expression alone. The reviewer is probably referring to the following quote:
“[…] most of these were expressed in the brain of 5–6-dpf zebrafish larvae, suggesting they play a role in early brain development or function,” which does not mention future risk of AD. We do suggest that these genes have a function in development. After all, every gene that plays a role in brain development must be expressed during development, so this wording seemed reasonable. Nevertheless, we adapted the wording to address this point and Reviewer #2’s complaint below. As noted, the primary goal was to check that the genes we selected were indeed expressed in zebrafish larvae before performing knockout experiments. Our discussion does raise the hypothesis that mutations in Alzheimer’s risk genes impact brain development and sleep early in life, but this argument primarily relies on our observation that knockout of late-onset Alzheimer’s risk genes causes sleep phenotypes in 7-day old zebrafish larvae and from previous work showing brain structural differences in children at high genetic risk of AD (Dean et al., 2014; Quiroz et al., 2015), not solely on gene expression early in life.
Please also see our answer to a similar point raised by Reviewer #2 below (cf. Author response image 7).
(4) A common quandary with defining sleep behaviorally is how to rectify sleep and activity changes that influence one another. With psen2 KOs, the authors describe reduced activity and increased sleep during the day. But how do we know if the reduced activity drives increased behavioral quiescence that is incorrectly defined as sleep? In instances where sleep is increased but activity during periods during wake are normal or elevated, this is not an issue. But here, the animals might very well be unhealthy, and less active, so naturally they stop moving more for prolonged periods, but the main conclusion is not sleep per se. This is an area where more experiments should be added if the authors do not wish to change/temper the conclusions they draw. Are psen2 KOs responsive to startling stimuli like controls when awake? Do they respond normally when quiescent? Great care must be taken in all models using inactivity as a proxy for sleep, and it can harm the field when there is no acknowledgment that overall health/activity changes could be a confound. Particularly worrisome is the betamethasone data in Figure 6, where activity and sleep are once again coordinately modified by the drug.
This is a fair criticism. We agree it is a concern, especially in the case of psen2 as we claim that day-time sleep is increased while zebrafish are diurnal. We do not rely heavily on the day-time inactivity being sleep (the ZOLTAR predictions or the small molecule rescue do not change whether the parameter is called sleep or inactivity), but our choice of labelling can fairly be challenged.
To address “are psen2 KO responsive to startling stimuli like controls when awake/when quiescent”, we looked at the larvae’s behaviour immediately after lights abruptly switched on in the mornings. Almost every larva, regardless of genotype, responded strongly to every lights-off transition during the experiment. We therefore chose the lights-on transition for this analysis because it is a weaker startling stimulus for the larvae than the lights-off transition (Fig. 3–supplement 3), potentially exposing differences between genotypes or behavioural states (quiescent or awake). We defined a larva as having reacted to the lights switching on if it made a swimming bout during the second (25 frames) after the lights-on transition. Across two clutches and two lights-on transitions, an average of 65% (range 52–73%) of all larvae reacted to the stimulus. psen2 knockout larvae were similarly likely, if not more likely, to respond (on average 69% responded, range 60–76%) than controls (60% average, range 44–75%). When the lights switched on, about half of the larvae (39–51%) would have been classified as asleep according to the one-minute inactivity definition (i.e. the larva did not move in the minute preceding the lights transition). This allowed us to also compare behavioural states, as suggested by the reviewer. For three of the four light transitions, larvae which were awake when lights switched on were more likely to react than asleep larvae, but this difference was not striking (overall, awake larvae were only 1.1× more likely to react; Author response image 3). Awake psen2 knockout larvae were 1.1× (range 1.04–1.11×) more likely to react than awake control larvae, so, yes, psen2 knockout larvae respond normally when awake. Asleep psen2 knockout larvae were 1.4× (range 0.63–2.19×) more likely to react than asleep control larvae, so psen2 knockouts are at least as likely to react as control larvae when asleep.
In summary, the overall health of psen2 knockouts did not seem to be a significant confound in the experiment. As the reviewer suggested, if psen2 knockout larvae were seriously unhealthy, they would not be as responsive as control larvae to a startling stimulus.
Author response image 3.
psen2 F0 knockouts react normally to lights switching on, indicating they are largely healthy. At each lights-on transition (9 AM), each larva was categorised as awake if it had moved in the preceding one minute or asleep if it had been inactive for at least one minute. Darker tiles represent larvae which performed a swimming bout during the second following lights-on; lighter tiles represent larvae which did not move during that second. The total count of each waffle plot was normalised to 25 so plots can be compared to each other. The real count is indicated in the corner of each plot. Data is from the baseline psen2 knockout trackings presented in Fig. 3 and Fig. 3–suppl. 2.
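The classification behind this analysis can be sketched in Python as follows. This is a minimal illustration, not the analysis code used for the figure. It assumes that each larva is represented by a per-frame activity trace in which any value above zero counts as movement, and that `lights_on_frame` is the index of the first frame with lights on. The function names are hypothetical.

```python
FPS = 25  # the response states that one second corresponds to 25 frames


def classify_larva(trace, lights_on_frame, fps=FPS, sleep_window_s=60):
    """Return (state, reacted) for one larva at one lights-on transition.

    state   : "asleep" if no movement in the preceding minute, else "awake"
    reacted : True if the larva moved during the second following lights-on
    """
    window = fps * sleep_window_s
    if lights_on_frame < window or lights_on_frame + fps > len(trace):
        raise ValueError("trace is too short around the lights-on transition")
    before = trace[lights_on_frame - window:lights_on_frame]
    after = trace[lights_on_frame:lights_on_frame + fps]
    state = "awake" if any(v > 0 for v in before) else "asleep"
    reacted = any(v > 0 for v in after)
    return state, reacted


def reaction_rates(traces, lights_on_frame, fps=FPS, sleep_window_s=60):
    """Fraction of larvae that reacted, separately for awake and asleep larvae."""
    counts = {"awake": [0, 0], "asleep": [0, 0]}  # state -> [reacted, total]
    for trace in traces:
        state, reacted = classify_larva(trace, lights_on_frame, fps, sleep_window_s)
        counts[state][0] += int(reacted)
        counts[state][1] += 1
    # None marks a state with no larvae, where a rate cannot be computed
    return {s: (r / n if n else None) for s, (r, n) in counts.items()}
```

Running `reaction_rates` once per genotype and dividing the knockout rate by the control rate for a given state would give a fold difference of the kind quoted in the text.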
Next, we compared inactive period durations during the day between psen2 and control larvae. If psen2 knockout larvae indeed sleep more during the day compared to controls, we may predict inactive periods longer than one minute to increase disproportionately compared to the increase in shorter inactive periods. This broadly appeared to be the case, especially for one of the two clutches (Author response image 4). In clutch 1, inactive periods lasting 1–60 sec were equally frequent in both psen2 and control larvae (fold change 1.0× during both days), while inactive periods lasting 1–2 min were 1.5× (day 1) and 2.5× (day 2) more frequent in psen2 larvae compared to control larvae. In clutch 2, 1–60 sec inactive periods were also equally frequent in both psen2 and control larvae, while inactive periods lasting 1–2 min were 3.4× (day 1) and 1.5× (day 2) more frequent in psen2 larvae compared to control larvae. Therefore, psen2 knockouts disproportionately increased the frequency of inactive periods longer than one minute, suggesting they genuinely slept more during the day.
Author response image 4.
psen2 F0 knockouts preferentially increased the frequency of longer inactive bouts. For each day and clutch, we calculated the mean distribution of inactive bout lengths across larvae of the same genotype (psen2 F0 knockout or scrambled-injected), then compared the frequency of inactive bouts of different lengths between the two genotypes. For example, in clutch 1 during day 2, 0.01% of the average scrambled-injected larva’s inactive bouts lasted 111–120 seconds (X axis 120 sec) while 0.05% of the average psen2 F0 knockout larva’s inactive bouts lasted this long, so the fold change was 5×. Inactive bouts lasting < 1 sec were excluded from the analysis. In clutch 2, day 1 plot, two datapoints fall outside the Y axis limit: 140 sec, Y = 32×; 170 sec, Y = 16×. Data is from the baseline psen2 knockout trackings presented in Fig. 3 and Fig. 3–suppl. 2.
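The fold-change calculation described in this legend can be sketched in Python as follows. It is an illustration under stated assumptions, not the code that produced the figure. It assumes a per-frame activity trace in which zero means inactive, and 10-second bins labelled by their upper edge, which matches the "111–120 seconds (X axis 120 sec)" example. All function names are hypothetical.

```python
import math
from itertools import groupby


def inactive_bout_lengths(trace, fps=25):
    """Durations (s) of runs of consecutive inactive frames; bouts < 1 s are excluded."""
    lengths = []
    for is_inactive, run in groupby(trace, key=lambda v: v == 0):
        if is_inactive:
            seconds = len(list(run)) / fps
            if seconds >= 1:
                lengths.append(seconds)
    return lengths


def bin_frequencies(lengths, bin_width_s=10):
    """Fraction of one larva's bouts in each bin, keyed by the bin's upper edge."""
    counts = {}
    for seconds in lengths:
        upper = math.ceil(seconds / bin_width_s) * bin_width_s
        counts[upper] = counts.get(upper, 0) + 1
    return {upper: n / len(lengths) for upper, n in counts.items()}


def mean_distribution(per_larva_freqs):
    """Average the per-larva distributions; a bin missing for a larva counts as 0."""
    bins = set().union(*per_larva_freqs)
    n = len(per_larva_freqs)
    return {b: sum(f.get(b, 0.0) for f in per_larva_freqs) / n for b in bins}


def fold_change(ko_dist, ctrl_dist):
    """Knockout / control frequency per bin; bins absent in controls are skipped."""
    return {b: ko_dist.get(b, 0.0) / c for b, c in ctrl_dist.items() if c > 0}
```

Averaging per-larva frequencies, rather than pooling all bouts, gives every larva equal weight regardless of how many bouts it produced.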
Ultimately, this criticism seems challenging to address definitively by experiment. A possible approach could be to use a closed-loop system which, after one minute of inactivity, triggers a stimulus that is sufficient to startle an awake larva but not an asleep larva. If psen2 knockout larvae indeed sleep more during the day, the stimulus should usually not be sufficient to startle them. Nevertheless, we believe the two analyses presented here are consistent with psen2 knockout larvae genuinely sleeping more during the day, so we decided to keep this label. We agree with the reviewer that the one-minute inactivity definition has limitations, especially for day-time inactivity.
(5) The conclusions for the serotonin section are overstated. Behavioural pharmacology purports to predict a signaling pathway disrupted with sorl1 KO. But is it not just possible that the drug acts in parallel to the true disrupted pathway in these fish? There is no direct evidence for serotonin dysfunction - that conclusion is based on response to the drug. Moreover, it is just one drug - is the same phenotype present with another SSRI? Likewise, language should be toned down in the discussion, as this hypothesis is not "confirmed" by the results (consider "supported"). The lack of measured serotonin differences further raises concern that this is not the true pathway. This is another major point that deserves further experimental evidence, because without it, the entire approach (behavioral pharm screen) seems more shaky as a way to identify mechanisms. There are any number of testable hypotheses to pursue such as a) Using transient transgenesis to visualize 5HT neuron morphology (is development perturbed: cell number, neurite morphology, synapse formation); b) Using transgenic Ca reporters to assay 5HT neuron activity.
Regarding the comment, “is it not just possible that the drug acts in parallel to the true disrupted pathway”, we think no, assuming we understand correctly the question. Key to our argument is the fact that sorl1 knockout larvae react differently to the drug(s) than control larvae. As an example, take night-time sleep bout length, which was not affected by knockout of sorl1 (Fig. 4–supplement 4). For the sake of the argument, say only dopamine signalling (the “true disrupted pathway”) was affected in sorl1 knockouts and that serotonin signalling was intact. Assuming that citalopram specifically alters serotonin signalling, then treatment should cause the same increase in sleep bout length in both knockouts and controls as serotonin signalling is intact in both. This is not what we see, however. Citalopram caused a greater increase in sleep bout length in sorl1 knockouts than in scrambled-injected larvae. In other words, the effect is non-additive, in the sense that citalopram did not add the same number of z-scores to sorl1 knockouts or controls. We think this shows that serotonin signalling is somehow different in sorl1 knockouts. Nonetheless, we concede that the experiment does not necessarily say much about the importance of the serotonin disruption caused by loss of Sorl1. It could be, for example, that the most salient consequence of loss of Sorl1 is cholinergic disruption (see reply to Reviewer #1 above) and that serotonin signalling is a minor theme.
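The non-additivity argument can be written as a simple interaction contrast, sketched below in Python. The function names are hypothetical and the numbers in the example are invented purely for illustration; they are not data from the study.

```python
def drug_effect(treated, untreated):
    """Change in one parameter (in z-score units) caused by the drug within one genotype."""
    return treated - untreated


def interaction(ko_drug, ko_water, ctrl_drug, ctrl_water):
    """Drug effect in knockouts minus drug effect in controls.

    Zero means the drug adds the same amount to both genotypes (additive);
    a non-zero value means the genotypes respond differently to the drug.
    """
    return drug_effect(ko_drug, ko_water) - drug_effect(ctrl_drug, ctrl_water)


# Hypothetical z-scores for night-time sleep bout length (illustrative only).
# The parameter is unaffected by the knockout alone, so both water values are 0.
additive = interaction(ko_drug=1.5, ko_water=0.0, ctrl_drug=1.5, ctrl_water=0.0)
non_additive = interaction(ko_drug=3.0, ko_water=0.0, ctrl_drug=1.5, ctrl_water=0.0)
```

In the first case the contrast is zero, as expected if serotonin signalling were intact in both genotypes. In the second it is positive, which is the pattern the reply describes for citalopram.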
Furthermore, we agree with the reviewer and Reviewer #2 that the conclusions were overly confident. As suggested, we decided to repeat this experiment with another SSRI, fluvoxamine. Please find the results of this experiment in Fig. 5–supplement 1. The suggestions to further test the serotonin system in the sorl1 knockouts are excellent as well, however we do not plan to pursue them at this stage.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Major Comments:
- Data are presented in a variety of different ways, occasionally making comparisons across figures difficult. Perhaps at a minimum, behavioral fingerprints as in Figure 3 - Supplementary Figure 1 should be presented for all mutants in the main figures.
We like this suggestion! Thank you. We brought the behavioural fingerprints figure (previously Fig. 4–supplement 5) as main Fig. 4, and put the figure focused on the sorl1 knockout behavioural phenotype in supplementary, with the other gene-by-gene figures.
- It is not clear why some data were selected for supplemental rather than main figures. In many cases, detailed phenotypic data is provided for one example mutant in the main figures, and then additional mutants are described in detail in the supplement. Again, to facilitate comparisons between mutants, fingerprints could be provided for all mutants in a main figure, with detailed analyses moved to the supplements.
The logic was to dedicate one main figure to psen2 (Fig. 3) as an example of an early-onset Alzheimer’s risk gene, and one to sorl1 (previously Fig. 4) as an example of a late-onset Alzheimer’s risk gene. We focused on them in main figures as they are both tested again later (Fig. 5 and Fig. 6). Having said that, we agree that the fingerprints may be a better use of main figure space than the parameters plots. In addition to the above (fingerprints of late-onset Alzheimer’s risk genes in main figure), we rearranged the figures in the early-onset AD section to have the psen2 F0 knockout fingerprint in main.
- The explication of the utility of behavioral fingerprinting on page 35 is somewhat confusing. The authors describe drugs used to treat depression as enriched among small molecules anti-correlating with the sorl1 fingerprint. However, in Figure 5 - Supplementary Figure 1, drugs used to treat depression are biased toward positive cosines, which are indicated as having a more similar fingerprint to sorl1. These drugs should be described as more present among compounds positively correlating with the sorl1 fingerprint.
Sorry, the confusion is about “(anti-)correlating”. Precisely, we meant “correlating and/or anti-correlating”, not just anti-correlating. We changed to that wording. In short, the analysis is by design agnostic to whether compounds with a given annotation are found more on the positive cosines side (left side in Fig. 5–supplement 1a) or the negative cosines side (right side). This is because the dataset often includes both agonists and antagonists to a given pathway but these are difficult to annotate. For example, say 10 compounds in the dataset target the dopamine D4 receptor, but these are an unknown mix of agonists and antagonists. In this case, we want ZOLTAR to generate a low p-value when all 10 compounds are found at extreme ends of the list, regardless of which end(s) that is (e.g. top 8 and bottom 2 should give an extremely low p-value). Initially, we were splitting the list, for each annotation, into positive-cosine fingerprints and negative-cosine fingerprints and testing enrichment on both separately, but we think the current approach is better as it better reflects the cases we want to detect and considers all available examples for a given annotation in one test. In sum, yes, in this case drugs used to treat depression were mostly on the positive-cosine side, but the other drugs on the negative-cosine side also contributed to the p-value, so it reflects the analysis better to say “correlating and/or anti-correlating”. You can read more about our logic for the analysis in Methods (section Behavioural pharmacology from sorl1 F0 knockout’s fingerprint).
- The authors conclude the above-described section by stating: "sorl1 knockout larvae behaved similarly to larvae treated with small molecules targeting serotonin signaling, suggesting that the loss of Sorl1 disrupted serotonin signaling." Directionality here may be important. Are all of the drugs targeting the serotonin transporter SSRIs or similar? If so, then a correct statement would be that loss of Sorl1 causes similar phenotypes to drugs enhancing serotonin signaling. Finally, based on the correlation between serotonin transporter inhibitor trazodone and the sorl1 crispant phenotype, it is potentially surprising that the SSRI citalopram caused the opposite phenotype from sorl1, that is, increased sleep during the day and night. It is potentially interesting that this result was enhanced in mutants, and suggests dysfunction of serotonin signaling, but the statement that "our behavioral pharmacology approach correctly predicted from behaviour alone that serotonin signaling was disrupted" is too strong a conclusion.
We understand “disrupt” as potentially going either way, but this may not be the common usage. We changed to “altered”.
The point regarding directionality is excellent, however. We tested the proportion of serotonin transporter agonists and antagonists (SSRIs) on each side of the ranked list of small molecule fingerprints. We used the STITCH database for this analysis as it has more drug–target interactions, though these are likely less curated, than the Therapeutic Target Database (Szklarczyk et al., 2016). As with the Therapeutic Target Database, most fingerprints of compounds interacting with the serotonin transporter SLC6A4 were found on the side of positive cosines (p ~ 0.005 using the custom permutation test), which replicates Fig. 5a with a different source for the drug–target annotations (Author response image 5). On the side of positive cosines (small molecules which generate behavioural fingerprints correlating with the sorl1 fingerprint), there were 2 agonists and 26 antagonists. On the side of negative cosines (small molecules which generate behavioural fingerprints anti-correlating with the sorl1 fingerprint), there were 3 agonists and 2 antagonists. Using a Chi-squared test, this suggests a significant (p = 0.002) over-representation of antagonists (SSRIs) on the positive side (expected count = 24, vs. 26 observed) and agonists on the negative side (expected count = 1, vs. 3 observed). If SLC6A4 antagonists, i.e. SSRIs, indeed tend to cause a behavioural phenotype similar to that of sorl1 knockout, this would point in the direction of our original interpretation of the citalopram experiment, which was that excessive serotonin signalling is what causes the sorl1 behavioural phenotype.
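For readers who wish to check the arithmetic, the 2 × 2 comparison above can be sketched as follows. This is only an illustration: it assumes an uncorrected Pearson Chi-squared test (the response does not state whether a continuity correction was applied), and the function name `chi_squared_2x2` is ours, not part of any package. Under that assumption the expected counts round to the 24 and 1 quoted above, and the p-value is close to the reported 0.002.

```python
import math

# Observed counts as reported above.
# Rows: side of the ranked list; columns: (agonists, antagonists).
observed = [
    [2, 26],  # positive cosines
    [3, 2],   # negative cosines
]

def chi_squared_2x2(table):
    """Pearson Chi-squared test on a 2x2 table, without continuity correction (assumption)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for each cell under independence.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With 1 degree of freedom, the upper tail of the Chi-squared
    # distribution equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return expected, stat, p_value

expected, stat, p_value = chi_squared_2x2(observed)
print(f"expected antagonists, positive side: {expected[0][1]:.1f}")
print(f"expected agonists, negative side: {expected[1][0]:.1f}")
print(f"Chi-squared = {stat:.2f}, p = {p_value:.4f}")
```

Note that one expected count is below 5, so an exact test might be preferred in practice; the sketch only reproduces the calculation as described.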
Author response image 5.
Using the STITCH database as source of annotations also predicts SLC6A4 as an enriched target for the sorl1 behavioural fingerprint. Same figures as Fig. 5a,b but using the STITCH database (Szklarczyk et al., 2016) as source for the drug targets. a, Compounds annotated by STITCH as interacting with the serotonin transporter SLC6A4 tend to generate behavioural phenotypes similar to the sorl1 F0 knockout fingerprint. 40,522 compound–target protein pairs (vertical bars; 1,592 unique compounds) are ranked from the fingerprint with the most positive cosine to the fingerprint with the most negative cosine in comparison with the mean sorl1 F0 knockout fingerprint. Fingerprints of drugs that interact with SLC6A4 are coloured in yellow. Simulated p-value = 0.005 for enrichment of drugs interacting with SLC6A4 at the top (positive cosine) and/or bottom (negative cosine) of the ranked list by a custom permutation test. b, Result of the permutation test for top and/or bottom enrichment of drugs interacting with SLC6A4 in the ranked list. The absolute cosines of the fingerprints of drugs interacting with SLC6A4 (n = 52, one fingerprint per compound) were summed, giving sum of cosines = 15.9. To simulate a null distribution, 52 fingerprints were randomly drawn 100,000 times, generating a distribution of 100,000 random sum of cosines. Here, only 499 random draws gave a larger sum of cosines, so the simulated p-value was p = 499/100,000 = 0.005 **.
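The permutation test described in this legend can be sketched as below. This is a minimal illustration and not the ZOLTAR implementation: the function name `permutation_p_value`, the toy cosines, and the choice to draw without replacement are all assumptions made for the example. The statistic is the sum of absolute cosines of the annotated fingerprints, compared against sums from random draws of the same size.

```python
import random

def permutation_p_value(cosines, is_annotated, n_draws=100_000, seed=0):
    """Top-and/or-bottom enrichment by permutation.

    cosines: one cosine per fingerprint in the ranked list.
    is_annotated: booleans marking fingerprints that carry the annotation.
    Returns the observed sum of |cosine| and the simulated p-value.
    """
    rng = random.Random(seed)
    abs_cos = [abs(c) for c in cosines]
    k = sum(is_annotated)  # number of annotated fingerprints
    observed = sum(a for a, flag in zip(abs_cos, is_annotated) if flag)
    # Count random draws of k fingerprints whose sum is strictly larger.
    larger = sum(
        sum(rng.sample(abs_cos, k)) > observed for _ in range(n_draws)
    )
    return observed, larger / n_draws

# Toy ranked list (invented values): 201 cosines from -1.00 to 1.00.
toy_cosines = [i / 100 - 1 for i in range(201)]
# Annotation found at both extremes of the list (2 at one end, 5 at the other).
extreme = [i in (0, 1, 196, 197, 198, 199, 200) for i in range(201)]
# Annotation found in the middle of the list (cosines near 0).
central = [95 <= i <= 101 for i in range(201)]

obs_extreme, p_extreme = permutation_p_value(toy_cosines, extreme, n_draws=10_000)
obs_central, p_central = permutation_p_value(toy_cosines, central, n_draws=10_000)
print(p_extreme, p_central)
```

Because absolute cosines are summed, compounds at either end of the list contribute equally, which is the property that makes the test agnostic to an unknown mix of agonists and antagonists.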
If this were true, we would expect, as the reviewer suggested, SSRI treatment (citalopram or fluvoxamine) of control larvae to give a behavioural phenotype similar to that of sorl1 knockout. However, this generally did not appear to be the case (sorl1 knockout fingerprint vs. SSRI-treated control fingerprint, cosine = 0.08 ± 0.35; Author response image 6).
Author response image 6.
sorl1 F0 knockouts in comparison to controls treated with SSRIs. a, sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H<sub>2</sub>O fingerprint from the citalopram experiment) in comparison with the scrambled-injected + citalopram (1 or 10 µM) fingerprints. Each dot represents the mean deviation from the same-clutch scrambled-injected H<sub>2</sub>O-treated mean for that parameter (z-score, mean ± SEM). b, As in a), sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H<sub>2</sub>O fingerprint from the fluvoxamine experiment) in comparison with the scrambled-injected + fluvoxamine (10 µM) fingerprint.
The comparison with trazodone is an interesting observation, but it is only a weak serotonin reuptake inhibitor (Ki for SLC6A4 = 690 nM, vs. 8.9 nM for citalopram; Owens et al., 1997) and it has many other targets, as agonist or antagonist, including serotonin, adrenergic, and histamine receptors (Mittur, 2011). In any case, the average trazodone fingerprint does not correlate particularly well with the sorl1 knockout fingerprint (cos = 0.3). Finally, the sorl1 knockout behavioural phenotype could be primarily caused by altered serotonin signalling in the hypothalamus, where we found both the biggest difference in tph1a/1b/2 HCR signal intensity (Fig. 5f) and the highest expression of sorl1 across scRNA-seq clusters (Fig. 1–supplement 2). In this case, it would be correct to expect sorl1 knockouts to react differently to SSRIs than controls, but it would be incorrect to expect SSRI treatment to cause the same behavioural phenotype, as it concurrently affects every other serotonergic neuron in the brain.
Finally, we agree the quoted conclusion was too strong given the current evidence. We since tested another SSRI, fluvoxamine, on sorl1 knockouts.
- Also in reference to Figure 5: in panel c, data are presented as deviation from vehicle treated. Because of this data presentation choice, it's no longer possible to determine whether, in this experiment, sorl1 crispants sleep less at night relative to their siblings. Does citalopram rescue / reverse sleep deficits in sorl1 mutants?
On your first point, please see our response to Reviewer #3 (2)c and Author Response 2b above.
On “does citalopram rescue/reverse sleep deficits in sorl1 mutants”: citalopram (and fluvoxamine) tends to reverse the key aspects of the sorl1 knockout behavioural phenotype by reducing night-time activity (% time active and total Δ pixels), increasing night-time sleep, and shortening sleep latency (Author response image 7). Extrapolating from the hypothesis presented in Discussion, this may be interpreted as a hint that sorl1 knockouts have reduced levels of 5-HT receptors, as increasing serotonin signalling using an SSRI tends to rescue the phenotype. However, we do not think that focusing on the significant behavioural parameters necessarily makes sense here. Rather, one should take all parameters into account to conclude whether knockouts react differently to the drug than wild types (also see answer to Reviewer #3, (7) on this). For example, citalopram increased night-time sleep bout length more in sorl1 knockouts than in controls (Fig. 5), but this parameter was not modified by knockout of sorl1 (Fig. 4). To explain the rationale more informally, citalopram is only used as a tool here to probe serotonin signalling in sorl1 knockouts. Whether it worsens or rescues the behavioural phenotype is somewhat secondary; the key question is whether knockouts react differently than controls.
Author response image 7.
Comparing untreated sorl1 F0 knockouts vs. treated with SSRIs. a, sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H<sub>2</sub>O fingerprint from the citalopram experiment) in comparison with the sorl1 knockout + citalopram (1 or 10 µM) fingerprints. Each dot represents the mean deviation from the same-clutch scrambled-injected H<sub>2</sub>O-treated mean for that parameter (z-score, mean ± SEM). b, As in a), sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H<sub>2</sub>O fingerprint from the fluvoxamine experiment) in comparison with the sorl1 + fluvoxamine (10 µM) fingerprint.
- Possible molecular pathways targeted by tinidazole, fenoprofen, and betamethasone are not described.
Tinidazole is an antibiotic, fenoprofen is a non-steroidal anti-inflammatory drug (NSAID), and betamethasone is a steroidal anti-inflammatory drug. Interestingly, long-term use of NSAIDs reduces the risk of AD (in ’t Veld et al., 2001). Several mechanisms are possible (Weggen et al., 2007), including reduction of Aβ42 production by interacting with γ-secretase (Eriksen et al., 2003). However, we did not explore the mechanism of action of these drugs on psen2 knockouts so do not feel comfortable speculating. We do not know, for example, whether these findings apply to betamethasone.
Minor Comments:
- On page 25, panel "g" should be labeled as "f".
Thank you!
- On page 35, a reference should be provided for the statement "From genomic studies of AD, we know that mutations in genes such as SORL1 modify risk by disrupting some biological processes.".
Thank you, this is now corrected. These were the same studies as mentioned in the Introduction.
- On page 43, the word "and" should be added - "in wild-type rats and mice, overexpressing mutated human APP and PSEN1, AND restricting sleep for 21 days...".
Right, this sentence could be misread, we edited it. “overexpressing […]” only applied to the mice, not the rats (as they are wild-type); and both are sleep-deprived.
- On page 45, a reference should be provided for the statement "SSRIs can generally be used continuously with no adverse effects" and this statement should potentially be softened.
The reference is at the end of that sentence (Cirrito et al., 2011). You are correct though; we reformulated this statement to: “SSRIs can generally be used safely for many years”. SSRIs indeed have side effects.
- On page 54, a 60-minute rolling average is described as 45k rows, but this seems to be a 30-minute rolling average.
Thank you! We corrected. It should have been 90k rows, as in: 25 frames-per-second × 60 seconds × 60 minutes.
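As a quick check of this arithmetic, the window size in rows follows directly from the frame rate. The helper name `rows_in_window` is purely illustrative and is not a function of the FramebyFrame package.

```python
FRAMES_PER_SECOND = 25  # frame rate quoted in the reply above

def rows_in_window(minutes, fps=FRAMES_PER_SECOND):
    """Number of frame-by-frame rows covered by a rolling window of the given length."""
    return fps * 60 * minutes

rows_60_min = rows_in_window(60)  # 60-minute rolling average
rows_30_min = rows_in_window(30)  # what a 45k-row window corresponds to
print(rows_60_min, rows_30_min)
```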
Reviewer #2 (Recommendations For The Authors):
"As we observed in the scRNA-seq data, most genes tested (appa, appb, psen1, psen2, apoea, cd2ap, sorl1) were broadly expressed throughout the 6-dpf brain (Fig. 1d and Fig. 1supplement 3 and 4)."
- apoea and appb are actually not expressed highly in the scRNA-seq data, and the apoea in situ looks odd, as if it has no expression. The appb gene mysteriously does not look as though it has high expression in the Raj data, but it is clearly expressed based on the in situ. I had previously noticed the same discrepancy, and I attribute it to the transcriptome used to map the Raj data, as the new DanioCell data uses a new transcriptome and indicates high appb expression in the brain. Please point out the discrepancy and possible explanation, perhaps in the figure legend.
All excellent points, thank you. We included them directly in Results text.
"most of these were expressed in the brain of 5-6-dpf zebrafish larvae, suggesting they play a role in early brain development or function."
- Evidence of expression does not suggest function, particularly not a function in brain development. As one example, almost half of the genome is expressed prior to the maternal-zygotic transition but does not have a function in those earliest stages of development. There are numerous other instances where expression does not equal function. Please change the sentence even as simply as "it is possible that they".
We mostly agree and edited to “[…], so they could play a role […]”.
Out of curiosity, we plotted, for each zebrafish developmental stage, the proportion of Alzheimer’s risk gene orthologues expressed in comparison to the proportion of all genes expressed (Author response image 8). We defined “all genes” as every gene that is expressed in at least one of the developmental stages (n = 24,856), not the complete transcriptome, to avoid including genes that are never expressed in the brain or whose expression is always below detection limit. We counted a gene as “expressed” if at least three cells had detectable transcripts. Using these definitions, 82 ± 7% of genes are expressed during development. For every developmental stage except 5 dpf (so 11/12), a larger proportion of Alzheimer’s risk genes than all genes are expressed (+5 ± 4%).
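The definition of “expressed” used here can be sketched as follows. This is a hedged illustration only: the function name, the dictionary layout, and the placeholder gene names and counts are invented for the example and do not come from the Raj et al. dataset.

```python
def proportion_expressed(cells_with_transcripts, gene_set, min_cells=3):
    """Proportion of genes in gene_set detected in at least min_cells cells.

    cells_with_transcripts: dict mapping gene name to the number of cells
    with detectable transcripts at one developmental stage.
    """
    n_expressed = sum(
        1 for gene in gene_set if cells_with_transcripts.get(gene, 0) >= min_cells
    )
    return n_expressed / len(gene_set)

# Placeholder data for a single stage (invented counts).
stage_counts = {"geneA": 120, "geneB": 2, "geneC": 3, "geneD": 0, "geneE": 45}
all_genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
risk_genes = ["geneA", "geneC", "geneE"]  # hypothetical risk gene orthologues

prop_all = proportion_expressed(stage_counts, all_genes)
prop_risk = proportion_expressed(stage_counts, risk_genes)
print(prop_all, prop_risk)
```

Repeating this for each developmental stage and taking the difference between the two proportions gives the per-stage comparison summarised above.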
Author response image 8.
Proportion of Alzheimer’s risk gene orthologues expressed throughout zebrafish development. Proportion of Alzheimer’s risk gene orthologues (n = 42) and all genes (n = 24,856) expressed in the zebrafish brain at each developmental stage, from 12 hours post-fertilisation (hpf) to 15 days post-fertilisation (dpf). “All genes” corresponds to every gene expressed in the brain at any of the developmental stages, not the complete transcriptome. A gene is considered “expressed” (green) if at least three cells had detectable transcripts. Single-cell RNA-seq dataset from Raj et al., 2020.
"This frame-by-frame analysis has several advantages over previous methods that analysed activity data at the one-minute resolution."
- Which methods are these? There are no citations. There are certainly existing methods in the zebrafish field that can produce similar data to the method developed for this project. This new package is useful, as most existing software is not written in R, so it would help scientists who prefer this programming language. However, I would be careful not to oversell its novelty, since many methods do exist that produce similar results.
We added the references. They were referenced above after “we combined previous sleep/wake analysis methods”, but should have been referenced again here.
We are not convinced by this criticism. We would obviously not claim that the FramebyFrame package is as sophisticated and versatile as video-tracking tools like SLEAP or DeepLabCut, but we do think it answers a genuine need that was not addressed by other methods. Specifically, we know of many labs recording pixel count data across multiple days using the Zebrabox or DanioVision (we added support for DanioVision data after submission), but there were no packages to extract behavioural parameters from these data. Other methods involved standalone scripts with no documentation or version tracking. We would concede the FramebyFrame package is mostly targeted at these labs, but we already know of six labs routinely using it and were recently contacted by a researcher tracking Daphnia in the Zebrabox.
"F0 knockouts of both cutches" - "clutches"
Thank you!
Reviewer #3 (Recommendations For The Authors):
I would suggest totally revamping the Introduction section, and being sure to provide readers with the context and background they need for the data that comes thereafter. Key areas to touch on, in no particular order, include:
• Far more detail on the behavioral pharm screen upon which this paper builds, as a brief overview of that approach and the data generated are needed.
Thank you for the suggestion, we added a sentence hinting at this work in the last Introduction paragraph.
• Limitations of current zebrafish sleep/arousal assays that motivated the authors to develop a new, temporally high-resolution system.
We think this is better explained in Results, as it is currently. For example, we need to point to Fig. 2–supplement 2a,b,c to explain that one-minute methods were missing sleep bouts and how FramebyFrame resolves this issue.
• A paragraph about sleep and AD, that does a better job of citing work in humans, mammalian, and invertebrate models that motivate the interest in the connection pursued here.
Sorry, we think this would place too much focus on sleep and AD. We want the main topic of the paper to be the behavioural pharmacology approach, not AD or sleep per se. As the Introduction states, we see Alzheimer’s risk genes as a case study for the behavioural pharmacology approach, rather than the reason why the approach was developed. Additionally, presenting sleep and AD in Introduction risks sounding like ZOLTAR is specifically designed for this context, while we conceived of it as much more generalisable and explicitly encourage its use to study genes associated with other diseases. Note that the paragraph you suggest is, we think, mostly present in Discussion (section Disrupted sleep and serotonin signalling […]).
• I modestly suggest eliminating making such a strong case for a gene-first approach being the best way to understand disease. It is not a zero-sum game, and there is plenty to learn from proteomics, metabolomics, etc. I suspect nobody will argue with the authors saying they leveraged the strength of their system and focused on key AD genes of interest.
From your point below, we understand the following quote is the source of the issue: “For finding causal processes, studying the genome, rather than the transcriptome or epigenome, is advantageous because the chronology from genomic variant to disease is unambiguous […]”. We did not want to suggest it is a zero-sum game, but we now understand how it can be read this way. We slightly adapted the wording. What we want to do is highlight the causality argument as the advantage of the genomics approach. We feel we do not read this argument often enough, while it remains a ‘magic power’ of genomics. One essentially does not have to worry about causality when studying a pathogenic germline variant, while it is a constant concern when studying the transcriptome or epigenome (i.e. did the change in this transcript’s level cause disease, or vice-versa?). To take an example in the context of AD, arguments based on genomics (e.g. Down syndrome or APP duplication) are often the definitive arbiters when debating the amyloid hypothesis, exactly because their causality cannot be doubted.
Minor comments
(1) The opening of the introduction is perhaps overly broad, spending an entire paragraph on genome vs transcriptome, etc and making the claim that a gene-first approach is the best path. It isn't zero-sum, and the authors could just get right into AD and study genes of interest. Similar issues occur throughout the manuscript, with sentences/paragraphs that are not necessarily needed.
Please see our answer to your previous point. On the introduction being overly broad, we perfectly agree it is broad, but related to your point about presenting sleep and AD in the Introduction, we wish to talk about finding causal processes from genomics findings using behavioural pharmacology. We purposefully present research on AD as one instance of this broader goal, not the primary topic of the paper.
Another example are these sentences, which could be totally removed as the following paragraph starts off making the same point much more succinctly. "From genomic studies of AD, we know that mutations in genes such as SORL1 modify risk by disrupting some biological processes. Presumably, the same processes are disrupted in zebrafish sorl1 knockouts, and some caused the behavioural alterations we observed. Can we now follow the thread backwards and predict some of the biological processes in which Sorl1 is involved based on the behavioural profile of sorl1 knockouts?"
Thanks for the suggestion, but we think these sentences are useful to place back this Results section in the context of the Introduction. Think of the paper as mainly about the behavioural pharmacology approach, not on Alzheimer’s risk genes. The function of the paragraph here is not simply to explain the method by which we decided to study sorl1; it is to reiterate the rationale behind the behavioural pharmacology approach so that the reader understands where this Results section fits in the overall structure.
(2) Related to the above, the authors use lecanemab as an example to support their approach, but there has been a great deal of controversy regarding this drug. I don't think such extensive justification is needed. This study uses AD risk genes as a case study in a newly developed behavioral pharm pipeline. A great deal of the rest of the intro seems to just fill space and could be more focused on the study at hand. Interestingly, after gene selection, the next step in their pipeline is sleep/wake analysis yet nothing is covered about AD and sleep in the intro. Some justification of that approach (why focus on sleep/wake as a starting point for behavioral pharm rather than learning and memory?) would be a better use of intro space.
There has indeed been controversy about lecanemab, but even the harshest critiques of the amyloid hypothesis concede that it slows down cognitive decline (Espay et al., 2023). That is all that is needed to support our argument, which is that research on AD started primarily from genomics and thereby yielded a disease-modifying drug. The controversy seems mostly focused on whether this effect size is clinically significant, and we think we correctly represent this uncertainty (e.g. “antibodies against Aβ such as lecanemab show promise in slowing down disease progression” and “the beneficial effects from targeting Aβ aggregation currently remain modest”).
Your next point is entirely fair. We mostly answered it above. To explain further, the primary reason why we measured sleep/wake behaviour is to match the behavioural dataset from Rihel et al., 2010 so we can use it to make predictions, not to study sleep in the context of AD per se. Sure, perhaps learning and memory would have been interesting, but we do not know of any study testing thousands of small molecules on zebrafish larvae during a memory task. We understand it can be slightly confusing though, as we then spend a paragraph of Discussion on sleep as a causal process in AD, but we obviously need to discuss this topic given the findings. However, to reiterate, we purposefully designed FramebyFrame and ZOLTAR to be useful beyond studying sleep/wake behaviour. For example, FramebyFrame would not calculate 17 behavioural parameters if the only goal was to measure sleep. We now mention the Rihel et al., 2010 study in the Introduction as you suggested above (“Far more detail on the behavioral pharm screen […]”), as that is the real reason why sleep/wake behaviour was measured in the first place.
(3) Also related to the above, another more relevant point that could be talked about in the intro is the need for more refined approaches to analyze sleep in zebrafish, given the effort that went into the new analysis system described here. Again, I think the context for why the authors developed this system would be more meaningful than the current content.
Thank you, we think we answered this point above (especially below Limitations of current zebrafish sleep/arousal assays […]).
(4) GWAS can stand for Genome-wide association studies (plural) so I do not think the extra "s" is needed (GWASs).
Indeed, that seems to be the common usage. Thank you.
(5) AD candidate risk genes were determined from loci using "mainly statistic colocalization". Can the authors add a few more details about what was done and what the "mainly" caveat refers to?
“Mainly” simply refers to the fact that other methods were used by Schwartzentruber et al. (2021) to annotate the GWAS loci with likely causal genes, but that most calls were ultimately made from statistic colocalisation. Readers can refer to this work to learn more about the methods used.
(6) The authors write "The loss of psen1 only had mild effects on behaviour" but I think they mean "sleep behaviors" as there could be many other behaviors that are disrupted but were not assessed. The same issue a few sentences later with "Behaviour during the day was not affected" and at the end of the following paragraph.
Yes, that would be more precise, thank you.
(7) For the Sorl1 pharmacology data, it is very hard to understand what is being measured behaviorally. Are the authors measuring sleep +/- citalopram, or something else, and why the change to Euclidean distance rather than all the measures we were just introduced to earlier in the manuscript?
We understand these plots (Fig. 5c,d) are less intuitive, but it is important that we show the difference in behaviour compared to H<sub>2</sub>O-treated larvae of the same genotype. The claim is that citalopram has a larger effect on knockouts than on controls, so the reader needs to focus on the effect of the drug on each genotype, not on the effect of sorl1 knockout. We added the standard fingerprints (i.e. setting controls to z-score = 0) here in Author response figures.
Euclidean distance takes as input all the measures we introduced. The point is precisely not to select a single measure. For example, say we were only plotting active bout number during the day, we would conclude that 10 µM citalopram has the same effect on knockouts and controls. Conversely, if we had taken sleep bout length at night, we would conclude 10 µM has a stronger effect on knockouts. What is the correct parameter to select? Using Euclidean distance resolves this by taking all parameters into account, rather than arbitrarily choosing one.
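The argument can be illustrated with a toy example. The z-scores below are invented for illustration and are not data from the study; the function name `euclidean_distance` and the two-parameter fingerprints are likewise assumptions. Each fingerprint holds the drug-treated z-scores relative to H<sub>2</sub>O-treated larvae of the same genotype, so the untreated reference is a vector of zeros.

```python
import math

def euclidean_distance(fingerprint_a, fingerprint_b):
    """Euclidean distance between two fingerprints with the same parameters."""
    if len(fingerprint_a) != len(fingerprint_b):
        raise ValueError("fingerprints must have the same number of parameters")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fingerprint_a, fingerprint_b)))

# Invented z-scores for two parameters:
# [active bout number (day), sleep bout length (night)]
untreated = [0.0, 0.0]          # H2O-treated larvae of the same genotype
control_drug = [1.0, 1.0]       # scrambled-injected + drug
knockout_drug = [1.0, 2.0]      # knockout + drug

dist_control = euclidean_distance(control_drug, untreated)
dist_knockout = euclidean_distance(knockout_drug, untreated)
print(dist_control, dist_knockout)
```

In this toy case the first parameter alone would suggest an identical drug effect in both genotypes and the second alone a stronger effect in knockouts; the distance combines both without choosing between them.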
And what exactly is a "given spike in serotonin"? and how is this hypothesis the conclusion based on the lack of evidence for the second hypothesis? As the authors say, there could be other ways sorl1 knockouts are more sensitive to citalopram, so the absence of evidence for one hypothesis certainly does not support the other hypothesis.
We mean a given release of serotonin in the synaptic cleft. We have fixed this wording.
We tend to disagree on the second point. We can think of two ways that sorl1 knockouts are more sensitive to citalopram: 1) they produce more serotonin, so blocking reuptake causes a larger spike in knockouts; or 2) blocking reuptake causes the same increase in both knockouts and wild-types but knockouts react more strongly to serotonin. We cannot in fact think of another way to explain the citalopram results. Not finding overwhelming evidence for 1) surely supports 2) somewhat, even if we do not have direct evidence for it. As an analogy, if two diagnoses are possible for a patient, testing negative for the first one supports the other one, even before it is directly tested.
(8) Again some language is used without enough care. Fish are referred to as "drowsier" under some drug conditions. How do the authors know the animal is drowsy? The phenotype is more specific - more sleep, less activity.
Thank you, we switched to “Furthermore, fenoprofen worsened the day-time hypoactivity of psen2 knockout larvae […]”.
(9) This sentence is misleading as it gives the impression that results in this manuscript suggest the conclusion: "Our observation that disruption of genes associated with AD diagnosis after 65 years reduces sleep in 7-day zebrafish larvae suggest that disrupted sleep may be a common mechanism through which these genes exert an effect on risk." That idea is widely held in the field, and numerous other previous manuscripts/reviews should be cited for clarity of where this hypothesis came from.
This idea is not widely held in the field. You likely read this point as “disrupted sleep is a risk factor for AD”, which, yes, is widely discussed in the field, but is not precisely what we are saying. We hypothesise that mutations in some of the Alzheimer’s risk genes cause disrupted sleep, possibly from a very early age, which then causes AD decades later. Studies and reviews on sleep and AD rarely make this hypothesis, at least not explicitly. The closest we know of are a few recent human genetics studies, typically using Mendelian Randomisation, finding that higher genetic risk of AD correlates with some sleep phenotypes, such as sleep duration (Chen et al., 2022; Leng et al., 2021). The work of Muto et al. (2021) is particularly interesting as it found correlations between higher genetic risk of AD and some sleep phenotypes in men in their early twenties, which seems unlikely to be a consequence of early pathology. Note, however, that even these studies do not mention sleep possibly being disrupted early in development, which is what our findings in zebrafish larvae support. As we mention, we think a team should test whether sleep is different in infants at higher genetic risk of AD, essentially performing an experiment analogous to, but obviously much more difficult than, the one we did in zebrafish larvae. We do not know of any study testing this or even raising this idea, so evidently it is not widely held. Having said that, the studies we mention here were not referenced in the Discussion paragraph. We have now corrected this.
Ashlin TG, Blunsom NJ, Ghosh M, Cockcroft S, Rihel J. 2018. Pitpnc1a Regulates Zebrafish Sleep and Wake Behavior through Modulation of Insulin-like Growth Factor Signaling. Cell Rep 24:1389–1396. doi:10.1016/j.celrep.2018.07.012
Chen D, Wang X, Huang T, Jia J. 2022. Sleep and Late-Onset Alzheimer’s Disease: Shared Genetic Risk Factors, Drug Targets, Molecular Mechanisms, and Causal Effects. Front Genet 13. doi:10.3389/fgene.2022.794202
Cirrito JR, Disabato BM, Restivo JL, Verges DK, Goebel WD, Sathyan A, Hayreh D, D’Angelo G, Benzinger T, Yoon H, Kim J, Morris JC, Mintun MA, Sheline YI. 2011. Serotonin signaling is associated with lower amyloid-β levels and plaques in transgenic mice and humans. Proc Natl Acad Sci U S A 108:14968–14973. doi:10.1073/pnas.1107411108
Dean DC, Jerskey BA, Chen K, Protas H, Thiyyagura P, Roontiva A, O’Muircheartaigh J, Dirks H, Waskiewicz N, Lehman K, Siniard AL, Turk MN, Hua X, Madsen SK, Thompson PM, Fleisher AS, Huentelman MJ, Deoni SCL, Reiman EM. 2014. Brain Differences in Infants at Differential Genetic Risk for Late-Onset Alzheimer Disease A Cross-sectional Imaging Study. JAMA Neurol 71:11–22. doi:10.1001/jamaneurol.2013.4544
Eriksen JL, Sagi SA, Smith TE, Weggen S, Das P, McLendon DC, Ozols VV, Jessing KW, Zavitz KH, Koo EH, Golde TE. 2003. NSAIDs and enantiomers of flurbiprofen target γ-secretase and lower Aβ42 in vivo. J Clin Invest 112:440–449. doi:10.1172/JCI18162
Espay AJ, Herrup K, Kepp KP, Daly T. 2023. The proteinopenia hypothesis: Loss of Aβ42 and the onset of Alzheimer’s Disease. Ageing Res Rev 92:102112. doi:10.1016/j.arr.2023.102112
Hoffman EJ, Turner KJ, Fernandez JM, Cifuentes D, Ghosh M, Ijaz S, Jain RA, Kubo F, Bill BR, Baier H, Granato M, Barresi MJF, Wilson SW, Rihel J, State MW, Giraldez AJ. 2016. Estrogens Suppress a Behavioral Phenotype in Zebrafish Mutants of the Autism Risk Gene, CNTNAP2. Neuron 89:725–733. doi:10.1016/j.neuron.2015.12.039
in ’t Veld Bas A, Ruitenberg A, Hofman A, Launer LJ, van Duijn CM, Stijnen T, Breteler MMB, Stricker BHC. 2001. Nonsteroidal Anti-inflammatory Drugs and the Risk of Alzheimer’s Disease. N Engl J Med 345:1515–1521. doi:10.1056/NEJMoa010178
Jagirdar R, Fu C-H, Park J, Corbett BF, Seibt FM, Beierlein M, Chin J. 2021. Restoring activity in the thalamic reticular nucleus improves sleep architecture and reduces Aβ accumulation in mice. Sci Transl Med 13:eabh4284. doi:10.1126/scitranslmed.abh4284
Jiang H, Newman M, Lardelli M. 2018. The zebrafish orthologue of familial Alzheimer’s disease gene PRESENILIN 2 is required for normal adult melanotic skin pigmentation. PLOS ONE 13:e0206155. doi:10.1371/journal.pone.0206155
Jiang H, Pederson SM, Newman M, Dong Y, Barthelson K, Lardelli M. 2020. Transcriptome analysis indicates dominant effects on ribosome and mitochondrial function of a premature termination codon mutation in the zebrafish gene psen2. PloS One 15:e0232559. doi:10.1371/journal.pone.0232559
Joo W, Vivian MD, Graham BJ, Soucy ER, Thyme SB. 2021. A Customizable Low-Cost System for Massively Parallel Zebrafish Behavioral Phenotyping. Front Behav Neurosci 14.
Joubert L, Hanson B, Barthet G, Sebben M, Claeysen S, Hong W, Marin P, Dumuis A, Bockaert J. 2004. New sorting nexin (SNX27) and NHERF specifically interact with the 5-HT4a receptor splice variant: roles in receptor targeting. J Cell Sci 117:5367–5379. doi:10.1242/jcs.01379
Leng Y, Ackley SF, Glymour MM, Yaffe K, Brenowitz WD. 2021. Genetic Risk of Alzheimer’s Disease and Sleep Duration in Non-Demented Elders. Ann Neurol 89:177–181. doi:10.1002/ana.25910
Mitchell PB, Hadzi-Pavlovic D. 2000. Lithium treatment for bipolar disorder. Bull World Health Organ 78:515–517.
Mittur A. 2011. Trazodone: properties and utility in multiple disorders. Expert Rev Clin Pharmacol 4:181–196. doi:10.1586/ecp.10.138
Munoz-Torrero D. 2008. Acetylcholinesterase Inhibitors as Disease-Modifying Therapies for Alzheimer’s Disease. Curr Med Chem 15:2433–2455. doi:10.2174/092986708785909067
Muto V, Koshmanova E, Ghaemmaghami P, Jaspar M, Meyer C, Elansary M, Van Egroo M, Chylinski D, Berthomier C, Brandewinder M, Mouraux C, Schmidt C, Hammad G, Coppieters W, Ahariz N, Degueldre C, Luxen A, Salmon E, Phillips C, Archer SN, Yengo L, Byrne E, Collette F, Georges M, Dijk D-J, Maquet P, Visscher PM, Vandewalle G. 2021. Alzheimer’s disease genetic risk and sleep phenotypes in healthy young men: association with more slow waves and daytime sleepiness. Sleep 44. doi:10.1093/sleep/zsaa137
Myers-Turnbull D, Taylor JC, Helsell C, McCarroll MN, Ki CS, Tummino TA, Ravikumar S, Kinser R, Gendelev L, Alexander R, Keiser MJ, Kokel D. 2022. Simultaneous analysis of neuroactive compounds in zebrafish. doi:10.1101/2020.01.01.891432
Owens MJ, Morgan WN, Plott SJ, Nemeroff CB. 1997. Neurotransmitter receptor and transporter binding profile of antidepressants and their metabolites. J Pharmacol Exp Ther 283:1305–1322.
Özcan GG, Lim S, Leighton PL, Allison WT, Rihel J. 2020. Sleep is bi-directionally modified by amyloid beta oligomers. eLife 9:e53995. doi:10.7554/eLife.53995
Quiroz YT, Schultz AP, Chen K, Protas HD, Brickhouse M, Fleisher AS, Langbaum JB, Thiyyagura P, Fagan AM, Shah AR, Muniz M, Arboleda-Velasquez JF, Munoz C, Garcia G, Acosta-Baena N, Giraldo M, Tirado V, Ramírez DL, Tariot PN, Dickerson BC, Sperling RA, Lopera F, Reiman EM. 2015. Brain Imaging and Blood Biomarker Abnormalities in Children With Autosomal Dominant Alzheimer Disease: A Cross-Sectional Study. JAMA Neurol 72:912–919. doi:10.1001/jamaneurol.2015.1099
Relkin NR. 2007. Beyond symptomatic therapy: a reexamination of acetylcholinesterase inhibitors in Alzheimer’s disease. Expert Rev Neurother 7:735–748. doi:10.1586/14737175.7.6.735
Rihel J, Prober DA, Arvanites A, Lam K, Zimmerman S, Jang S, Haggarty SJ, Kokel D, Rubin LL, Peterson RT, Schier AF. 2010. Zebrafish Behavioral Profiling Links Drugs to Biological Targets and Rest/Wake Regulation. Science 327:348–351. doi:10.1126/science.1183090
Sleegers K, Brouwers N, Gijselinck I, Theuns J, Goossens D, Wauters J, Del-Favero J, Cruts M, van Duijn CM, Van Broeckhoven C. 2006. APP duplication is sufficient to cause early onset Alzheimer’s dementia with cerebral amyloid angiopathy. Brain J Neurol 129:2977–2983. doi:10.1093/brain/awl203
Sun L, Zhou R, Yang G, Shi Y. 2017. Analysis of 138 pathogenic mutations in presenilin-1 on the in vitro production of Aβ42 and Aβ40 peptides by γ-secretase. Proc Natl Acad Sci 114:E476–E485. doi:10.1073/pnas.1618657114
Szklarczyk D, Santos A, von Mering C, Jensen LJ, Bork P, Kuhn M. 2016. STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data. Nucleic Acids Res 44:D380–D384. doi:10.1093/nar/gkv1277
Weggen S, Rogers M, Eriksen J. 2007. NSAIDs: small molecules for prevention of Alzheimer’s disease or precursors for future drug development? Trends Pharmacol Sci 28:536–543. doi:10.1016/j.tips.2007.09.004
Wiltschko AB, Tsukahara T, Zeine A, Anyoha R, Gillis WF, Markowitz JE, Peterson RE, Katon J, Johnson MJ, Datta SR. 2020. Revealing the structure of pharmacobehavioral space through motion sequencing. Nat Neurosci 23:1433–1443. doi:10.1038/s41593-020-00706-3
Yang T, Arslanova D, Gu Y, Augelli-Szafran C, Xia W. 2008. Quantification of gamma-secretase modulation differentiates inhibitor compound selectivity between two substrates Notch and amyloid precursor protein. Mol Brain 1:15. doi:10.1186/1756-6606-1-15
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Weaknesses:
(1) While the overall results are interesting, I am somewhat left confused about how to interpret the difference in the scores derived from different conditions. For example, the authors stated "Comparing the weights for in-group and out-group distractors, the effect of proximity was larger than that of aggression and grooming" in p.8. Does this mean that the proximity is indeed the type of behavior most affected in the out-group condition compared to the in-group condition? The out-group effects are difficult to examine with actual behavioral data, but some in-group effects such as those involving OT can be tested, which possibly provides good insights into interpreting the differences of the weights observed across the experimental conditions.
Thank you for your thoughtful comments and for highlighting an important aspect of our findings. The statement on page 8 refers to the relative impact of different social behaviors (proximity, aggression, and grooming) on the derived weights for in-group and out-group distractors. Specifically, the data suggest that proximity exerts a stronger influence than aggression or grooming in differentiating the effects of out-group versus in-group distractors. Regarding the out-group condition, we acknowledge that it presents challenges for direct behavioral observation, as interactions involving out-group members are often more difficult to quantify in naturalistic settings. However, we agree with your suggestion to test certain in-group effects, particularly those influenced by oxytocin (OT), as they offer a more controlled framework to validate and interpret the observed differences in weights across experimental conditions. In line with this, we examined specific in-group behaviors under OT administration to disentangle their contributions to attentional dynamics (Fig. 4 and Fig. 5 e to h). By integrating controlled experimental manipulations, we think these results could provide deeper insights into how social relationships shape the observed patterns of attention.
(2) I think it is important to provide how variable spontaneous social interactions were across sessions and how impactful the variability of the interactions is on the SEI and IEI, as it helps to understand how meaningful the differences of weights are across the conditions, but such data are missing. In line with this point, although the conclusions still hold as those data were obtained during the same experimental periods, shouldn't the weights in Fig. 3f and Figs. 4g and 4h (saline) be expected to be similar, if not the same?
Thank you for your insightful comments. As highlighted, we utilized the entire experimental period as the dataset to evaluate the monkeys' social interactions. The experiments presented in Figures 3 and 4 were designed to examine how social relationships correlate with patterns of social attention under two distinct conditions: without manipulation (Fig. 3) and with nebulized exposure to oxytocin and saline (Fig. 4). Theoretically, the weights observed in the unmanipulated condition and the nebulized saline condition should be similar. However, our results indicate that distractor biases shifted significantly following nebulized saline exposure (Fig. 4) compared to the unmanipulated condition (Fig. 3) (MK: p = 9.3×10<sup>-3</sup>, ML: p = 9.77×10<sup>-4</sup>, MC: p = 9.77×10<sup>-4</sup>, MA: p = 0.09; n<sub>1</sub> = n<sub>2</sub> = 12 experimental days; Two-sided Wilcoxon signed-rank test). This suggests that the nebulization process itself, despite acclimating the monkeys to saline exposure for approximately two weeks prior to the experiments, still influenced their attentional behaviors.
While the primary goal of nebulization was to assess the effects of oxytocin on social attention, our main conclusions remain robust, even considering the impact of nebulization on distractor biases. We acknowledge that variability in spontaneous social interactions across days or experimental sessions could be an important factor influencing the SEI and IEI. The dynamic nature of social interactions within the colony is likely affected by numerous variables. Future research will aim to integrate these factors into a more comprehensive and dynamic framework to better interpret their influence on social attention metrics.
Reviewer #2 (Public review):
Weaknesses:
(1) The study's conclusions are based on observations of only four monkeys, which limits the generalizability of the findings. Larger sample sizes could strengthen the validity of the results.
Thank you for your valuable comment. We acknowledge that the relatively small sample size could influence the generalizability of the findings. However, despite this limitation, our work systematically examined multifaceted social relationships among monkeys and their attentional strategies within a well-controlled experimental setup. We reported results across sessions and conditions (e.g., in-group vs. out-group; saline vs. Oxytocin), which strengthens the reliability of the observed effects of social networks within this context. We agree that increasing the sample size would improve the generalizability of the results. Future studies with a larger cohort will be critical for confirming the robustness of our findings and expanding their broader applicability. We have acknowledged this limitation in the revised manuscript and highlighted the potential for further research with larger sample sizes to validate and extend our conclusions.
(2) The limited set of stimulus images (in-group and out-group faces) may introduce unintended biases. This could be addressed by increasing the diversity of stimuli or incorporating a broader range of out-group members.
Thank you for your thoughtful comment. We acknowledge that the use of a limited set of six monkey faces as stimuli for in-group and out-group conditions could potentially introduce biases. To address this concern, we conducted an additional analysis to minimize the potential impact of individual images on our findings using the current dataset. Specifically, we randomly excluded one in-group and one out-group image and reanalyzed distractor biases using the remaining two images (Supplementary Fig. 3a). For each subject, this approach generated three sets of two distractors per group, resulting in 81(3<sup>4</sup>) combinations across four monkey subjects, and a total of 81 × 81 subject-distractor pairings. We statistically compared distractor biases between in-group and out-group faces for each combination (Supplementary Fig. 3b). As shown in Supplementary Fig. 3c, 99.30% of the 6,561 combinations demonstrated significantly lower distractor biases towards in-group faces compared to out-group faces (two-sided Wilcoxon signed-rank test, p < 0.05). These results suggest that the observed differences in social attention between in-group and out-group monkeys are unlikely to be driven by specific images within the stimulus set. That said, we agree that increasing the diversity of stimulus images or incorporating a broader range of out-group members would improve the generalizability of the results. We have acknowledged this limitation in the revised manuscript and highlighted the potential for further research to incorporate a more diverse stimulus set to validate and extend our findings.
“However, these conclusions may be constrained by the relatively small sample size and the homogeneity of stimulus set in the study. Future research focusing on larger, more diverse cohorts and incorporating a broader range of stimuli will enhance the generalizability and applicability of the findings.”
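The image-exclusion count in the response to point (2) can be checked with a short enumeration. This is only a sketch of the arithmetic, not the authors' analysis code: it assumes three in-group and three out-group images per subject (inferred from "six monkey faces" and "the remaining two images"), and the image labels are hypothetical placeholders.

```python
from itertools import combinations, product

# Hypothetical labels; the response only states that one of three images
# is excluded per group, leaving two.
images = ["img1", "img2", "img3"]
n_subjects = 4

# Excluding one of three images leaves C(3, 2) = 3 possible pairs.
pairs = list(combinations(images, 2))

# One pair is chosen independently for each of the four subjects.
in_group_sets = list(product(pairs, repeat=n_subjects))   # 3**4
out_group_sets = list(product(pairs, repeat=n_subjects))  # 3**4

# Every in-group choice is crossed with every out-group choice.
total = len(in_group_sets) * len(out_group_sets)

print(len(pairs), len(in_group_sets), total)
```

Under these assumptions the enumeration reproduces the 81 per-group combinations and the 81 × 81 pairings quoted in the response.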
Reviewer #1 (Recommendations for the authors):
It is difficult to distinguish "Getting fighted" and "Fighting partner" in Fig. 1b (esp. when printed). I thought Actor showed "Fighting partner" several times in Session 2, but it seems to be "Getting fighted" judging from Figs. 1c and 1d. Is this correct? If so, I would suggest to change the color to improve visibility.
Thank you for your valuable comment. We apologize for the confusion in the previous version. To improve clarity, we have changed both terms to “begin fighting” and “being fought”. As shown in Figure 1b, we now explicitly define the identities of the two monkeys as the actor (K) and the partner (L), with all behaviors described from the perspective of the actor. For example, when the actor (K) initiates the fight, it is marked as “begin fighting”, whereas when the partner (L) initiates the fight, the actor (K) is the recipient and labeled as “being fought”. Additionally, we have implemented your suggestion by changing the colors to enhance visibility, especially for the terms “begin fighting” and “being fought”.
Reviewer #2 (Recommendations for the authors):
I have some minor concerns:
(1) Figure1B, caption for x axis is missing, 4 means 4 days?
Thank you so much for the comment. We have clarified the x-axis in Figure 1B, where the label "4" corresponds to 4 hours of video typing on each experimental day. The revised figure now includes the appropriate label for better clarity. We appreciate your careful attention to this detail.
(2) I am slightly concerned about animal safety. How do the experimenters ensure the animals' safety and well-being in cases of aggressive interactions or attacks?
Thank you for your comment. We share your concern regarding animal safety and take care to ensure the well-being of the monkeys in the study. All experimental procedures were reviewed and approved by the Institutional Animal Care and Use Committee at the Institute of Biophysics, Chinese Academy of Sciences (IBP-NHP-002(22)). The monkeys were housed together in the same colony room for over four years, in interconnected cages that allowed for direct physical interaction. Animal behaviors in cages were closely monitored via a live video system to ensure their safety. To prevent potential injuries, a sliding partition system was in place, enabling the isolation of individual animals when necessary, minimizing risks to their well-being.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
We made a serious effort to address the reviewers’ comments. If we have come up short, then let this be stated and explained in the eLife review. But we would be grateful if you did not include in the revised eLife review comments that were corrected or addressed last time, unless of course there is disagreement, or if our response was unsatisfactory. If either of the latter, then please explain and we will respond.
As to the exceptionally minor issue, namely correction for multiple statistical tests (minor because the data and the error are presented in the text), we have now conducted one-way ANOVA to back the data displayed in Fig. 4A and Supp. Figs 19 and 21. In each case ANOVA revealed a highly significant difference among means; Dunnett’s post hoc test was then used to test each result against SBW25, with the multiple comparisons corrected for in the analysis.
This resulted in changes to the description of the statistical analysis in the following captions:
To Figure 4.
Where we previously referred to paired t-tests we now state: ANOVA revealed a highly significant difference among means [F<sub>7,16</sub> = 8.19, p < 0.001] with Dunnett’s post-hoc test adjusted for multiple comparisons showing that five genotypes (*) differ significantly (p < 0.05) from SBW25.
To Supplementary Figure 19.
Where we previously referred to paired t-tests we now state: ANOVA revealed a highly significant difference among means [F<sub>7,16</sub> = 16.74, p < 0.001] with Dunnett’s post-hoc test adjusted for multiple comparisons showing that three genotypes (*) differ significantly (p < 0.05) from SBW25.
To Supplementary Figure 21.
Where we previously referred to paired t-tests we now state: ANOVA revealed a highly significant difference among means [F<sub>7,89</sub> = 9.97, p < 0.0001] with Dunnett’s post-hoc test adjusted for multiple comparisons showing that SBW25 ∆mreB and SBW25 ∆PFLU4921-4925 are significantly different (*) from SBW25 (p < 0.05).
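For readers who want to see where degrees of freedom such as F<sub>7,16</sub> come from, the following is a minimal sketch of a one-way ANOVA in plain Python. The data are invented for illustration and are not the authors' measurements. Eight groups and 24 observations follow from df = (7, 16); an equal three replicates per genotype is an assumption. Dunnett's post-hoc comparison against the control is not reimplemented here; recent SciPy releases provide it as `scipy.stats.dunnett`.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: one control (standing in for SBW25) plus seven
# mutant genotypes, three replicates each.
control = [1.00, 1.05, 0.95]
mutants = [[1.00 + 0.1 * i, 1.05 + 0.1 * i, 0.95 + 0.1 * i] for i in range(1, 8)]

f_stat, df_b, df_w = one_way_anova([control] + mutants)
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

With eight groups of three, the sketch yields the same degrees of freedom as the Figure 4 and Supplementary Figure 19 captions; the F value itself depends entirely on the made-up numbers.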
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
This study shows a new mechanism of GS regulation in the archaeon Methanosarcina mazei and clarifies the direct activation of GS activity by 2-oxoglutarate, thus featuring another way in which 2-oxoglutarate acts as a central status reporter of C/N sensing.
Mass photometry and single particle cryoEM structure analysis convincingly show the direct regulation of GS activity by 2-OG promoted formation of the dodecameric structure of GS. The previously recognized small proteins GlnK1 and Sp26 seem to play a subordinate role in GS regulation, which is in good agreement with previous data. Although these data are quite clear now, there remains one major open question: how does 2-OG further increase GS activity once the full dodecameric state is achieved (at 5 mM)? This point needs to be reconsidered.
Weaknesses:
It is not entirely clear, how very high 2-OG concentrations activate GS beyond dodecamer formation.
The data presented in this work are in stark contrast to the previously reported structure of M. mazei GS by the Schumacher lab. This is very confusing for the scientific community and requires clarification. The discussion should consider possible reasons for the contradictory results.
Importantly, it is puzzling how Schumacher could achieve an apo-structure of dodecameric GS. If 2-OG is necessary for dodecameric formation, this should be discussed. If GlnK1 doesn't form a complex with the dodecameric GS, how could such a complex be resolved there?
In addition, the text is in principle clear but could be improved by professional editing. Most obviously there is insufficient comma placement.
We thank Reviewer #1 for the professional evaluation and raising important points. We will address those comments in the updated manuscript and especially improve the discussion in respect to the two points of concern.
(1) How can GlnA1 activity be stimulated further by increasing 2-OG after the dodecamer is already fully assembled at 5 mM 2-OG?
We assume a two-step requirement for 2-OG, the dodecameric assembly and the priming of the active sites. The assembly step is based on cooperative effects of 2-OG and does not require the presence of 2-OG in all 2-OG-binding pockets: 2-OG-binding to one binding pocket also causes a domino effect of conformational changes in the adjacent 2-OG-unbound subunit, as also described for Methanothermococcus thermolithotrophicus GS in Müller et al. 2023. Due to the introduction of these conformational changes, the dodecameric form becomes more favourable even without all 2-OG binding sites being occupied. With higher 2-OG concentrations present (> 5mM), the activity increased further until finally all 2-OG-binding pockets were occupied, resulting in the priming of all active sites (all subunits) and thereby reaching the maximal activity.
(2) The contradictory results with previously published data on the structure of M. mazei by Schumacher et al. 2023.
We certainly agree that it is confusing that Schumacher et al. 2023 obtained a dodecameric structure without the addition of 2-OG, which we claim to be essential for the dodecameric form. 2-OG is a cellular metabolite that is naturally present in E. coli, the heterologous expression host both groups used. Since our main question focused on analysing the 2-OG effect on GS, we have performed thorough dialysis of the purified protein to remove all 2-OG before performing MP experiments. In the absence of 2-OG we never observed significant enzyme activity and always detected a fast disassembly after incubation on ice. We thus assume that a dodecamer without 2-OG in Schumacher et al. 2023 is an inactive oligomer of a once 2-OG-bound form, stabilized e.g. by the presence of 5 mM MgCl2.
The GlnA1-GlnK1-structure (crystallography) by Schumacher et al. 2023 is in stark contrast to our findings that GlnK1 and GlnA1 do not interact as shown by mass photometry with purified proteins. A possible reason for this discrepancy might be that at the high protein concentrations used in the crystallization assay, complexes are formed based on hydrophobic or ionic protein interactions, which would not form under physiological concentrations.
Reviewer #2 (Public Review):
Summary:
Herdering et al. introduced research on an archaeal glutamine synthetase (GS) from Methanosarcina mazei, which exhibits sensitivity to the environmental presence of 2-oxoglutarate (2-OG). While previous studies have indicated 2-OG's ability to enhance GS activity, the precise underlying mechanism remains unclear. Initially, the authors utilized biophysical characterization, primarily employing a nanomolar-scale detection method called mass photometry, to explore the molecular assembly of Methanosarcina mazei GS (M. mazei GS) in the absence or presence of 2-OG. Similar to other GS enzymes, the target M. mazei GS forms a stable dodecamer, with two hexameric rings stacked in tail-to-tail interactions. Despite approximately 40% of M. mazei GS existing as monomeric or dimeric entities in the detectable solution, the majority spontaneously assemble into a dodecameric state. Upon mixing 2-OG with M. mazei GS, the population of the dodecameric form increases proportionally with the concentration of 2-OG, indicating that 2-OG either promotes or stabilizes the assembly process. The cryo-electron microscopy (cryo-EM) structure reveals that 2-OG is positioned near the interface of two hexameric rings. At a resolution of 2.39 Å, the cryo-EM map vividly illustrates 2-OG forming hydrogen bonds with two individual GS subunits as well as with solvent water molecules. Moreover, local side-chain reorientation and conformational changes of loops in response to 2-OG further delineate the 2-OG-stabilized assembly of M. mazei GS.
Strengths & Weaknesses:
The investigation studies the impact of 2-oxoglutarate (2-OG) on the assembly of Methanosarcina mazei glutamine synthetase (M. mazei GS). Utilizing cutting-edge mass photometry, the authors scrutinized the population dynamics of GS assembly in response to varying concentrations of 2-OG. Notably, the findings demonstrate a promising and straightforward correlation, revealing that dodecamer formation can be stimulated by 2-OG concentrations of up to 10 mM, although GS assembly never reaches 100% dodecamerization in this study. Furthermore, catalytic activities showed a remarkable enhancement, escalating from 0.0 U/mg to 7.8 U/mg with increasing concentrations of 2-OG, peaking at 12.5 mM. However, an intriguing gap arises between the incomplete dodecameric formation observed at 10 mM 2-OG, as revealed by mass photometry, and the continued increase in activity from 5 mM to 10 mM 2-OG for M. mazei GS. This prompts questions regarding the inability of M. mazei GS to achieve complete dodecamer formation and the underlying factors that further enhance GS activity within this concentration range of 2-OG.
Moreover, the cryo-electron microscopy (cryo-EM) analysis provides additional support for the biophysical and biochemical characterization, elucidating the precise localization of 2-OG at the interface of two GS subunits within two hexameric rings. The observed correlation between GS assembly facilitated by 2-OG and its catalytic activity is substantiated by structural reorientations at the GS-GS interface, confirming the previously reported phenomenon of "funnel activation" in GS. However, the authors did not present the cryo-EM structure of M. mazei GS in complex with ATP and glutamate in the presence of 2-OG, which could have shed light on the differences in glutamine biosynthesis between previously reported GS enzymes and the 2-OG-bound M. mazei GS.
Furthermore, besides revealing the cryo-EM structure of 2-OG-bound GS, the study also observed the filamentous form of GS, suggesting that filament formation may be a universal stacking mechanism across archaeal and bacterial species. However, efforts to enhance resolution to investigate whether the stacked polymer is induced by 2-OG or other factors such as ions or metabolites were not undertaken by the authors, leaving room for further exploration into the mechanisms underlying filament formation in GS.
We thank Reviewer #2 for the detailed assessment and valuable input. We will address those comments in the updated manuscript and clarify the message.
(1) The discrepancy between the dodecamer formation (max. at 5 mM 2-OG) and the enzyme activity (max. at 12.5 mM 2-OG). We assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site (see also Reviewer #1 R.1). We assume this is the reason why the activity of dodecameric GlnA1 can be further enhanced by increased 2-OG concentration until all catalytic sites are primed.
(2) The lack of the structure of a 2-OG and ATP-bound GlnA1. Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.
(3) The observed GlnA1-filaments are an interesting finding. We certainly agree with the referee on that point, that the stacked polymers are potentially induced by 2-OG or ions. However, it is out of the main focus of this manuscript to further explore those filaments. Nevertheless, this observation could serve as an interesting starting point for future experiments.
Reviewer #3 (Public Review):
Summary:
The current manuscript investigates the effect of 2-oxoglutarate and the GlnK1 protein as modulators of the enzymatic reactivity of glutamine synthetase. To do this, the authors rely on mass photometry, specific activity measurements, and single-particle cryo-EM data.
From the results obtained, the authors convey that glutamine synthetase from Methanosarcina mazei exists in a non-active monomeric/dimeric form under low concentrations of 2-oxoglutarate, and its oligomerization into a dodecameric complex is triggered by higher concentration of 2-oxoglutarate, also resulting in the enhancement of the enzyme activity.
Strengths:
Glutamine synthetase is a crucial enzyme in all domains of life. The dodecameric fold of GS is recurrent amongst prokaryotic and archaea organisms, while the enzyme activity can be regulated in distinct ways. This is a very interesting work combining protein biochemistry with structural biology.
The role of 2-OG is here highlighted as a crucial effector for enzyme oligomerization and full reactivity.
Weaknesses:
Various opportunities to enhance the current state-of-the-art were missed. In particular, omissions of the ligand-bound state of GlnK1 leave unexplained the lack of its interaction with GS (in contradiction with previous results from the authors). A finer dissection of the effect and role of 2-oxoglutarate is missing and important questions remain unanswered (e.g. are dimers relevant during early stages of the interaction or why previous GS dodecameric structures do not show 2-oxoglutarate).
We thank Reviewer #3 for the expert evaluation and inspiring criticism.
(1) Encouragement to examine ligand-bound states of GlnK1. We agree and plan to perform the suggested experiments exploring the conditions under which GlnA1 and GlnK1 might interact. We will perform the MP experiments in the presence of ATP. In GlnA1 activity test assays when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.
(2) The exact role of 2-OG could have been dissected much better. We agree on that point and will improve the clarity of the manuscript. See also Reviewer #1 R.1.
(3) The lack of studies on dimers. This is actually an interesting point, which we did not consider during writing the manuscript. Now, re-analysing all our MP data in this respect, GlnA1 is likely a dimer as smallest species. Consequently, we will add more supplementary data which supports this observation and change the text accordingly.
(4) Previous studies and structures did not show the 2-OG. We assume that for other structures, no additional 2-OG was added, and the groups did not specifically analyse for this metabolite either. All methanoarchaea perform methanogenesis and contain the oxidative part of the TCA cycle exclusively for the generation of glutamate (anabolism) but not a closed TCA cycle, enabling them to use the internal 2-OG concentration as an internal signal for nitrogen availability. In the case of bacterial GS from organisms with a closed TCA cycle used for energy metabolism (oxidation of acetyl-CoA), e.g. E. coli, the formation of an active dodecameric GS form is governed by another mechanism independent of 2-OG. In the case of the recent M. mazei GS structures published by Schumacher et al. 2023, the dodecameric structure is probably a result of the heterologous expression in and purification from E. coli (see also Reviewer #1 R.2). One example of a methanoarchaeal glutamine synthetase structure that does in fact contain 2-OG is Müller et al. 2023.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Specific issues:
L 141: 2-OG levels increase due to slowing GOGAT reaction (due to Gln limitation as a consequence of N-starvation).... (2-OG also increases in bacteria that lack GDH...)
As the GS-GOGAT cycle is the major route of ammonium assimilation, consumption of 2-OG by GDH is probably only relevant under high ammonium concentrations.
In methanoarchaea, GS is strictly regulated and its expression strongly repressed under nitrogen sufficiency. Glutamate for anabolism is therefore mainly generated by GDH under N sufficiency, consuming 2-OG delivered by the oxidative part of the TCA cycle (methanogenesis is the energy metabolism in methanoarchaea; a closed TCA cycle is not present). Consequently, 2-OG increases under nitrogen limitation, when no NH3 is available for GDH.
L148: it is not clear what is meant by: "and due to the indirect GS activity assay"
We apologize for not being clear here. The GS activity assay used is the classical assay by Shapiro & Stadtman 1970, a coupled optical assay (coupling the ATP consumption of the GS reaction to the oxidation of NADH by lactate dehydrogenase). Because the assay is coupled, measurements of low activities show a high deviation. We have now added this information to the revised MS.
L: 177: arguing about 2-OG affinities: more precisely, the 0.75 mM 2-OG is the EC50 concentration of 2-OG for triggering dodecameric formation; it might not directly reflect the total 2-OG affinity, since the affinity may be modulated by (anti)cooperative effects, or by additional sites... as there may be different 2-OG binding sites involved... (same in line 201)
Thank you for the valuable input. We changed KD to EC50 within the entire manuscript. Concerning possible additional 2-OG binding sites: we did not see any other 2-OG in the cryo-EM structure aside from the described one and we therefore assume that the one described in the manuscript is the main and only one. Considering the high amounts of 2-OG (12.5 mM) used in the structure, it is quite unlikely that additional 2-OG sites exist since they would have unphysiologically low affinities.
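The distinction between an EC50 and a binding constant can be illustrated with a Hill-type dose-response curve. The following is only a minimal sketch and not the authors' fitting procedure: the function name `hill_fraction` and the Hill coefficients are assumptions chosen for illustration, and only the 0.75 mM EC50 is taken from the text above.

```python
def hill_fraction(conc_mM, ec50_mM, hill_n):
    """Fraction of the maximal response (e.g. dodecamer assembly) at a given
    effector concentration, using a Hill-type dose-response curve."""
    if conc_mM < 0 or ec50_mM <= 0 or hill_n <= 0:
        raise ValueError("concentration must be >= 0; EC50 and Hill coefficient must be > 0")
    return conc_mM ** hill_n / (ec50_mM ** hill_n + conc_mM ** hill_n)


EC50_MM = 0.75  # EC50 for dodecamer formation quoted in the text (mM)

# Hypothetical Hill coefficients: the response at the EC50 is always 50 %,
# but the steepness around it (and hence any inferred affinity) differs.
for n in (1.0, 2.0, 4.0):
    at_ec50 = hill_fraction(EC50_MM, EC50_MM, n)
    below = hill_fraction(0.25, EC50_MM, n)
    print(f"n={n}: response at EC50 = {at_ec50:.2f}, at 0.25 mM = {below:.3f}")
```

Whatever the cooperativity, the response at the EC50 is half-maximal by definition, which is why the EC50 alone does not report the affinity of an individual binding site.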
In this respect, instead of the rather poor assay shown in Figure 1D, a more detailed determination of catalytic activation by different 2-OG concentrations should be done (similar to 1A)... This would allow a direct comparison between dodecamerization and enzymatic activation.
We agree and performed the respective experiments, which are now presented in the revised Fig. 1D.
Discussion: the role of 2-OG as a direct activator, comparison with other prokaryotic GS: in other cases, 2-OG affects GS indirectly by being sensed by PII proteins or other 2-OG sensing mechanisms (like 2OG-NtcA-mediated repression of IF factors in cyanobacteria)
We agree and have added that information in the discussion as suggested.
290. Unclear: As a second step of activation, the allosteric binding of 2-OG causes a series of conformational.... where is this site located? According to the catalytic effects (compare 1A and 1D) this site should have a lower affinity …
Thank you very much for pointing this out. Binding of 2-OG only occurs at one specific allosteric binding site. Binding, however, has two effects on GlnA1: dodecamer assembly and priming of the active site (with two specific EC50 values, which are now shown in Fig. 1A and D).
See also public comment #1 (1).
Reviewer #2 (Recommendations For The Authors):
The primary concern for me is that mass photometry might lead to incorrect conclusions. The differences in the forms of GS seen in SEC and MP suggest that GS can indeed form a stable dodecamer when the concentration of GS is high enough, as shown in Figure S1B. I strongly suggest using an additional biophysical method to explore the connection between GS and 2-OG in terms of both assembly and activity, to truly understand 2-OG's role in the process of assembly and catalysis.
We apologize if we did not present this clearly enough; however, the MP analysis of GlnA1 in the absence of 2-OG always showed (monomers/) dimers, and dodecamers were only present in the presence of 2-OG. The SEC analysis in Fig. S1B was performed in the presence of 12.5 mM 2-OG. We realized this information was missing in the figure legend and have now added it in the revised version. In addition, the 2-OG is visible in the cryo-EM structure. Thus, we do not see the need for additional biophysical methods.
As for the other experimental findings, they appear satisfactory to me, and I have no reservations regarding the cryoEM data.
(1) Mass photometry is a fancy technique that uses only a tiny amount of protein to study how they come together. However, the concentration of the protein used in the experiment might be lower than what's needed for them to stick together properly. So, the authors saw a lot of single proteins or pairs instead of bigger groups. They showed in Figure S1B that the M. mazei GS came out earlier than a 440-kDa reference protein, indicating it's actually a dodecamer. But when they looked at the dodecamer fraction using mass photometry, they found smaller bits, suggesting the GS was breaking apart because the concentration used was too low. To fix this, they could try using a technique called analytic ultracentrifuge (AUC) with different amounts of 2-OG to see if they can spot single proteins or pairs when they use a bit more GS. They could also try another technique called SEC-MALS to do similar tests. If they do this, they could replace Figure 1A with new data showing fully formed GS dodecamers when they use the right amount of 2-OG.
Thank you for this input. In MP we looked at dodecamer formation after removing the 2-OG entirely and re-adding it in the respective concentration. We think that GlnA1 is much more unstable in its monomeric/dimeric fraction and that the complete and harsh removal of 2-OG results in some dysfunctional protein which does not recover the dodecameric conformation after dialysis and re-addition of 2-OG. Looking at the dodecamer-peak right after SEC however, we exclusively see dodecamers, which is now included as an additional supplementary figure (suppl. Fig. 1C). Consequently, we did not perform additional experiments.
(2) Building on the last point, the estimated binding strength (Kd) between 2-OG and GS might be lower than it really is, because the GS often breaks apart from its dodecameric form in this experiment, even though 2-OG helps keep the pairs together, as seen with cryoEM. What if they used 5-10 times more GS in the mass photometry experiment? Would the estimated bond strength stay the same? Could they use AUC or other techniques like ITC to find out the real, not just estimated, strength of the bond?
We agree that the term KD is not suitable. We have changed the term KD to EC50 as suggested by reviewer #1, which describes the effective concentration required for 50 % dodecamer assembly. Furthermore, we disagree that the dodecamer breaks apart when the concentrations are as low as in MP experiments. The actual reason for the breaking is rather the harsh dialysis to remove all 2-OG before MP experiments. Right after SEC, we exclusively see dodecamers in MP (suppl. Fig. S1C). See also #2 (1).
(3) The fact that the GS hardly works without 2-OG is interesting. I tried to understand the experiment setup, but it wasn't clear as the protocol mentioned in the author's 2021 FEBS paper referred to an old paper from 1970. The "coupled optical test assay" they talked about wasn't explained well. I found other papers that used phosphometry assays to see how much ATP was used up. I suggest the authors give a better, more detailed explanation of their experiments in the methods section. Also, it's unclear why the GS activity keeps going up from 5 to 12.5 mM 2-OG, even though they said it's saturated. They suggested there might be another change happening from 5 to 12.5 mM 2-OG. If that's the case, they should try to get a cryo-EM picture of the GS with lots of 2-OG, both with and without ATP/glutamate (or the Met-Sox-P-ADP inhibitor), to see what's happening at a structural level during this change caused by 2-OG.
We agree with the reviewer that the GS assay was not explained in detail (since it has been published and known for several years). However, we have now added a more detailed description of the assay to the revised MS. The assay also measures the ATP used up by GS, but couples the generation of ADP to an optical readout: pyruvate kinase present in the assay produces pyruvate from PEP using the generated ADP. This pyruvate is finally reduced to lactate by the lactate dehydrogenase present, consuming NADH, the oxidation of which is monitored at 340 nm.
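The readout of such a coupled assay can be converted into a specific activity with the Beer-Lambert law. The sketch below is an illustration under stated assumptions, not the authors' protocol: the function name, the 1 cm path length, and the example numbers are hypothetical, while the NADH extinction coefficient at 340 nm (6.22 mM⁻¹ cm⁻¹) is the standard literature value. It assumes one NADH is oxidized per ATP hydrolyzed, as implied by the pyruvate kinase/lactate dehydrogenase coupling.

```python
NADH_EXT_COEFF = 6.22  # mM^-1 cm^-1 at 340 nm (standard literature value)


def specific_activity(delta_a340_per_min, assay_volume_ml, enzyme_mg, path_cm=1.0):
    """Return the specific activity in U/mg, where 1 U = 1 µmol ATP consumed per minute.

    delta_a340_per_min is negative while NADH is being oxidized.
    Assumes 1 NADH oxidized per ATP hydrolyzed (coupled PK/LDH assay).
    """
    if enzyme_mg <= 0 or assay_volume_ml <= 0 or path_cm <= 0:
        raise ValueError("enzyme amount, volume and path length must be positive")
    # Beer-Lambert: rate of NADH loss in mM/min
    rate_mM_per_min = -delta_a340_per_min / (NADH_EXT_COEFF * path_cm)
    # mM x mL = µmol, so this is µmol NADH oxidized per minute in the cuvette
    umol_per_min = rate_mM_per_min * assay_volume_ml
    return umol_per_min / enzyme_mg


# Hypothetical example values, not taken from the manuscript
activity = specific_activity(delta_a340_per_min=-0.311, assay_volume_ml=1.0, enzyme_mg=0.01)
print(f"{activity:.2f} U/mg")
```

Because the signal is a small difference in absorbance over time, low activities translate into small slopes, which is consistent with the higher deviation the authors report for low-activity measurements.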
The still increasing activity of GS after dodecamer formation (max. at 5 mM 2-OG) and the continuously increasing enzyme activity (max. at 12.5 mM 2-OG): see also the public reviews. We assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site.
The suggested additional experiments with and without ATP/Glutamate: Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.
(4) Please remake Figure S2, the panels are too small to read the words. At least I have difficulty doing so.
We assume the reviewer is pointing to Suppl. Fig. S3; we have now changed this figure accordingly.
Line 153, the reference Schumacher et al. 23, should be 2023?
Yes, thank you. We corrected that.
Line 497. I believe it's UCSF ChimeraX, not Chimera.
We apologize and corrected accordingly.
Reviewer #3 (Recommendations For The Authors):
Recent studies on the Methanothermococcus thermolithotrophicus glutamine synthetase, published by Müller et al., 2024, have identified the binding site for 2-oxoglutarate as well as the conformational changes that were induced in the protein by its presence. In the present study, the authors confirm these observations and additionally establish a link between the presence of 2-oxoglutarate and the dodecameric fold and full activation of GS.
Curiously, here, the authors could not confirm their own findings that the dodecameric GS can directly interact with the PII-like GlnK1 protein and the small peptide sP26. However, the lack of mention of the GlnK-bound state in these studies is very alarming since it certainly is highly relevant here.
We agree with the reviewer that we have not observed the interaction with GlnK1 and sP26 in the recent study. Consequently, we speculate that yet unknown cellular factor(s) might be required for an interaction of GlnA1 with GlnK1 and sP26, which were not present in the in vitro experiments using purified proteins, however they were present in the previous pull-down approaches (Ehlers et al. 2005, Gutt et al. 2021). Another reason might be that post-translational modifications occur in M. mazei, which might be important for the interaction, which are also not present in purified proteins expressed in E. coli.
The manuscript interest could have been substantially increased if the authors had done finer biochemical and enzymatic analyses on the oligomerization process of GS, used GlnK1 bound to known effectors in their assays and would have done some more efforts to extrapolate their findings (even if a small niche) of related glutamine synthetases.
We thank the reviewer for their valuable encouragement to explore ligand-bound-states of GlnK1. However, in this manuscript we mainly focused on 2-OG as activator of GlnA1 and decided to dedicate future experiments to the exploration of conditions that possibly favor GlnK1-binding.
In principle, we have explored the ATP bound GlnK1 effects on GlnA1 activity in the activity assays (Fig. 2E) since ATP (3.6 mM) is present. GlnK1 however showed no effects on GlnA1 activity.
In general, the manuscript is poorly written, with grammatically incorrect sentences that at times stand in the way of conveying the message of the manuscript.
Particular points:
(1) It is mentioned that 2-OG induces the active oligomeric (dodecamer, 12-mer) state of GlnA1 without detectable intermediates. However, only 62 % of the starting inactive enzyme yields active 12-mers. Note that this is contradicted in line 212.
Thanks for pointing out this discrepancy. After removing all 2-OG as we did before MP-experiments, GlnA1 doesn’t reach full dodecamers anymore when 2-OG is re-added. This is not because the 2-OG amount is not enough to trigger full assembly, but because the protein is much more unstable in the absence of 2-OG, so we predict that some GlnA1 breaks during dialysis. See also answer reviewer #2 (1) and supplementary figure S1C.
Is there any protein precipitation upon the addition of 2-OG? Is all protein being detected in the assay, meaning, is monomer/dimer + dodecamer yields close to 100% of the total enzyme in the assay?
There is no protein precipitation upon the addition of 2-OG, indeed, GlnA1 is much more stable in the presence of 2-OG. In the mass photometry experiments, all particles are measured, precipitated protein would be visible as big entities in the MP.
Please add to Figure 1 the amount of monomer/dimer during titration. Some debate why there is no full conversion should be tentatively provided.
We agree with the reviewer and included the amount of monomer/dimer in the figure, as well as some discussion on why it is not fully converted again. GlnA1 is unstable without 2-OG and it was dialysed against buffer without 2-OG before MP measurements. This sample mistreatment prevented full re-assembly after re-adding 2-OG, although the protein was fully dodecameric before dialysis (suppl. Fig. S1C).
(2) Figure 1B reflects an exemplary result. Here, the addition of 0.1 mM 2-OG seems to promote monomer to dimer transition. Why was this not studied in further detail? It seems highly relevant to know from which species the dodecamer is assembled.
We thank the reviewer for their comment. However, we would like to point out that, although not shown in the figure, GlnA1 is always mainly present as dimers as the smallest entity. As suggested earlier, we have added the amount of monomers/dimers to Figure 1A, which shows low monomer-counts at all 2-OG concentrations (Fig.1A). Although not depicted in the graph starting at 0.01 mM OG, we also see mainly dimers at 0 mM 2-OG.
How does the y-axis compare to the number and percentage of counts assigned to the peaks? In line 713, it is written that the percentage of dodecamer considers the total number of counts, and this was plotted against the 2-OG concentration.
We thank the reviewer for addressing this lack of clarity. Line 713 corresponds to Figure 1A, where we indeed plotted the percentage of dodecamer against the 2-OG concentration. The percentage of dodecamer corresponds to the percentage calculated from the Gaussian fit of the MP dodecamer peak. In Figure 1B, however, the y-axis displays the relative amount of counts per mass; multiple similar masses then add up to the percentage of the respective peak (Gaussian fit over similar masses).
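How a peak percentage can follow from a Gaussian fit of a mass histogram may be sketched as below. This is an assumption about the general approach, not the analysis software used by the authors: the function names, the bin width, and all numbers are hypothetical. The fitted amplitude is taken to be in counts per histogram bin, so the area under the curve divided by the bin width approximates the number of events in the peak.

```python
import math


def gaussian_peak_counts(amplitude, sigma_kda, bin_width_kda):
    """Approximate number of binding events under a fitted Gaussian peak.

    amplitude: fitted peak height in counts per histogram bin
    sigma_kda: fitted standard deviation of the peak (kDa)
    bin_width_kda: width of one histogram bin (kDa)
    """
    if sigma_kda <= 0 or bin_width_kda <= 0:
        raise ValueError("sigma and bin width must be positive")
    area = amplitude * sigma_kda * math.sqrt(2.0 * math.pi)  # counts x kDa
    return area / bin_width_kda


def peak_percentage(peak_counts, total_counts):
    """Share of all detected events that falls into one peak, in percent."""
    if total_counts <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * peak_counts / total_counts


# Hypothetical fit parameters for a dodecamer peak
dodecamer_counts = gaussian_peak_counts(amplitude=100.0, sigma_kda=10.0, bin_width_kda=5.0)
print(f"{peak_percentage(dodecamer_counts, total_counts=1000.0):.1f} % of counts")
```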
(3) Lines 714 and 721 (and elsewhere): Why only partial data is used for statistical purposes?
We in general only show one exemplary biological replicate, since the quality of the respective GlnA1 purification sometimes varied (maximum activity ranging from 5 - 10 U/mg). Therefore, we only compared activities within the same protein purification. For the EC50 calculations of all measurements, we refer to the supplement.
(4) Lines 192-193: It is claimed that GlnK1 was previously shown to both regulate the activity of GlnA1 and form a complex with GlnA1. Please mention the ratio between GlnK1 and GlnA1 in this complex.
We now included the requested information (GlnA1:GlnK1 1:1 (Ehlers et al. 2005); His6-GlnA1 (0.95 μM), His6-GlnK1 (0.65 μM), 2:1.4 (Gutt et al. 2021)).
It is also known that PII proteins such as GlnK1 can bind ADP, ATP, and 2-OG. Interestingly, however, for various described PII proteins, 2-OG can only bind after the binding of ATP.
So, the crucial question here is what is the binding state of GlnK1?
Were these assays performed in the absence of ATP? This is key to fully understand and connect the results to the previous observations. For example, if the GlnK1 used was bound to ADP but not to ATP, then the added 2-OG might indeed only be able to affect GlnA1 (leading to its activation/oligomerization). If this were true and according to the data reported, ADP would prevent GlnK1 from interacting with any oligomeric form of GlnA1. However, if GlnK1 bound to ATP is the form that interacts with GlnA1 (potentially validating previous results?) then, 2-OG would first bind to GlnK1 (assuming a higher affinity of 2-OG to GlnK1), eventually causing its release from GlnA1 followed by binding and activation of GlnA1.
These experiments need to be done as they are essential to further understand the process. Given the ability of the authors to produce the protein and run such assays, it is unclear why they were not done here. As written in line 203, in this case, "under the conditions tested" is not a good enough statement, considering what is known in the field and how many more conclusions could easily be taken from such a setup.
Thanks for the encouragement to investigate the ligand-bound states of GlnK1. We agree and plan to perform the suggested mass photometry experiments exploring the conditions under which GlnA1 and GlnK1 might interact in future work. In GlnA1 activity test assays, when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.
(5) Figure 2D legend claims that the graphic shows the percentage of dodecameric GlnA1 as a function of the concentration of 2-OG. This is not what the figure shows; Figure 2D shows the dodecamer/dimer (although legend claims monomer was used, in line 732) ratio as a function of 2-OG (stated in line 736!). If this is true, a ratio of 1 means 50 % of dodecamers and dimers co-exist. This appears to be the case when GlnK1 was added, while in the absence of GlnK1 higher ratios are shown for higher 2-OG concentration implying that about 3 times more dodecamers were formed than dimers. However, wouldn't a 50 % ratio be physiologically significant?
We apologize for the partially incorrect and also misleading figure legend and corrected it. Indeed, the ratio of dodecamers and dimers is shown. Furthermore, we did not use monomeric GlnA1 (the smallest entity is mainly a dimer, see Fig 1A), however, the molarity was calculated based on the monomer-mass. Concerning the significance of the difference between the maximum ratio of GlnA1 and GlnK1: The ratio does appear higher, but this is mostly because adding large quantities of GlnK1 broadens all peaks at low molecular weight. This happens because the GlnK1 signal starts overlapping with the signal from GlnA1, leading to inflated GlnA1 dimer counts. We therefore do not think that this is biologically significant, especially as the activities do not differ under these conditions.
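The relationship between a dodecamer/dimer ratio and a percentage can be made explicit. The sketch below is illustrative only: the function names are hypothetical, and it assumes the ratio is a ratio of particle counts, with 12 protomers per dodecamer and 2 per dimer. It distinguishes the share of particles from the share of protein (protomers) present in dodecamers.

```python
def particle_fraction(ratio):
    """Fraction of counted particles that are dodecamers, given the
    dodecamer/dimer ratio of particle counts."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


def protomer_fraction(ratio, protomers_large=12, protomers_small=2):
    """Fraction of all protomers that sit in dodecamers, given the same ratio.
    A dodecamer carries six times as many protomers as a dimer."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    in_large = protomers_large * ratio
    return in_large / (in_large + protomers_small)


for r in (1.0, 3.0):
    print(f"ratio {r}: {particle_fraction(r):.0%} of particles, "
          f"{protomer_fraction(r):.0%} of protomers in dodecamers")
```

Under these assumptions, a ratio of 1 corresponds to half of the particles but to the large majority of the protein being dodecameric, which may be relevant when judging whether a difference between ratios of 1 and 3 is meaningful.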
(6) Is it possible that the uncleaved GlnA1 tag is preventing interaction with GlnK1? This should be discussed.
This is of course a very important point. We however realized that Schumacher et al. also used an N-terminal His-tag, so we assume that the N-terminal tag is not hampering the interaction.
(7) Line 228: Please detail the reported discrepancies in rmsd between the current protein and the gram-negative enzymes.
The differences in rmsd between our M. mazei GlnA1 structure and the structures of gram-negative enzymes are caused by a) sequence similarity: e.g. M. mazei GlnA1 and B. subtilis GlnA have a sequence identity of 58.47 %; b) ligands in the structure: the B. subtilis structure contains L-methionine-S-sulfoximine phosphate, a transition state inhibitor, while the M. mazei structure contains 2-OG; c) methodology: the structural determination methods also contribute to these differences. B. subtilis GlnA was determined using X-ray crystallography, while the M. mazei GlnA1 structure was resolved using cryo-EM, where the protein behaves differently in ice compared to a crystal.
(8) Line 747: The figure title claims "dimeric interface" although the manuscript body only refers to "hexameric interface" or "inter-hexamer interface" (line 224). Moreover, the figure 4 legend uses terms such as vertical and horizontal dimers and this too should be uniformized within the manuscript.
Thank you for your valuable feedback. We have updated both the figure title and the figure legend as well in the main text to ensure consistency in the description.
(9) Line 752: The description of the color scheme used here is somehow unclear.
Thanks for pointing this out. We changed the description to make it more comprehensive.
(10) Please label H14/H15 and H14'/H15' in Fig 4C zoom.
We agree that this has not been very clear. We added helix labels.
(11) In Figure 4D legend, make sure to note that the binding sites for the substrate are based on homologies with another enzyme poised with these molecules.
The same should be clear in the text: sites are not known, they are assumed to be, based on homologies (paragraph starting at line 239).
Concerning this comment we want to point out that we studied the exact same enzyme as the Schumacher group, except that we used 2-OG in our experiments, which they did not.
(12) Figure 3 appears redundant in light of Figure 4.
(13) Line 235: When mentioning F24, please refer to Figure 5.
Thank you, we changed that accordingly.
(14) Please provide the distances for the bonds depicted in Figure 4B.
Thanks for pointing this out. We added distance labels to Figure 4B; for reasons of clarity, we labeled only three H-bonds.
(15) Line 241: D57 is likely serving to abstract a proton from ammonium, what is residue Glu307 potentially doing? The information seems missing in light of how the sentence is built.
Thanks for pointing this out. According to previous studies both residues are likely involved in proton abstraction - first from ammonium, and then from the formed gamma-ammonium group. Additionally, they contribute in shielding the active site from bulk solvent to prevent hydrolysis of the formed phospho-glutamate.
(16) Why do the authors assume that increased concentrations of 2-OG are a signal for N starvation only in M. mazei and not in all prokaryotic equivalent systems (line 288)?
In line 288, we did not claim that this is a unique signal for M. mazei. It is also the central N-starvation signal in Cyanobacteria but not directly perceived by the cyanobacterial GS through binding directly to GS.
The authors should look into the residues that bind 2-OG and check if they are conserved in other GS. The results of this sequence analysis should be discussed in line with the variable prokaryotic glutamine synthetase types of activity modulation that were exposed in the introduction and Figure 7.
Please refer to supplementary figure S5, where we already aligned the mentioned glutamine synthetase sequences. Since this was also already discussed in Müller et al. 2024, we did not want to repeat their observations and refer to our supplementary figure in too much detail.
(17) Figure 5 title: Replace TS by transition state structures of homology enzymes, or alike.
Thank you for this suggestion. We did not change the title however, since it is not a homologue but the exact same glutamine synthetase from Methanosarcina mazei.
(18) Line 249: D170 is not shown in Figure 5A or elsewhere in Figure 5.
Thank you for pointing this out. We added D170 to figure 5A.
(19) Representative density for the residues binding 2-OG should be provided, maybe in a supplemental figure.
Thank you for the suggestion. We added the densities of the 2-OG-binding residues to Figure 4B.
(20) Line 260: Please add a reference when describing the phosphoryl transfer.
We thank the reviewer for this important point and added that accordingly.
(21) Line 296: The binding of 2-OG indeed appears to be cooperative, such that at concentrations above its binding affinity to the protein, only dodecamers are seen (under experimental conditions). However, claiming that the oligomerization is fast is not correct when the experimental setup includes 10 minutes of incubation before measurements are done. Please correct this within the entire manuscript.
A (fast) continuous kinetic assay could have confirmed this point and revealed the oligomerization steps and the intermediaries in the process (maybe monomer/dimers, then dimers/hexamers, and then hexamers/dodecamers). Such assays would have been highly valuable to this study.
We thank the reviewer for this suggestion, but disagree. It is indeed a rather fast regulation (activity assays without pre-incubation take only 1 min longer to reach full activity, see the newly included suppl. Fig. S6). Compared with other regulatory mechanisms, e.g. regulation of transcription or translation, an activation that takes only 60 s is actually quite quick.
(22) Line 305 (and elsewhere in the manuscript): the authors state that 2-OG primes the active site for a transition state. This appears incorrect. The transition state is the highest energy state in an enzymatic reaction progressing from substrate to product. Meaning, the transition state is a state that has a more or less modified form of the original substrate bound to the active site. This is not the case.
In line 366 an "active open state" appears much more adequate to use.
We agree and changed accordingly throughout the manuscript.
(23) Line 330: Please delete "found". Eventually replace it with "confirmed": As the authors write, others have described this residue as a ligand to glutamine.
Thanks, we changed that accordingly, although previous descriptions were just based on homologies without the experimental validation.
(24) The discussion at various points summarizes the results again. It should be trimmed and improved.
(25) Line 381: replace "two fast" with "fast"?
We thank the reviewer for this suggestion, but disagree on this point. We especially wanted to highlight that there are two central nitrogen-metabolites involved in the direct regulation of GlnA1, that means TWO fast direct processes mediated by 2-OG and glutamine.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In their manuscript entitled 'The domesticated transposon protein L1TD1 associates with its ancestor L1 ORF1p to promote LINE-1 retrotransposition', Kavaklıoğlu and colleagues delve into the role of L1TD1, an RNA binding protein (RBP) derived from a LINE1 transposon. L1TD1 proves crucial for maintaining pluripotency in embryonic stem cells and is linked to cancer progression in germ cell tumors, yet its precise molecular function remains elusive. Here, the authors uncover an intriguing interaction between L1TD1 and its ancestral LINE-1 retrotransposon.
The authors delete the DNA methyltransferase DNMT1 in a haploid human cell line (HAP1), inducing widespread DNA hypo-methylation. This hypomethylation prompts abnormal expression of L1TD1. To scrutinize L1TD1's function in a DNMT1 knock-out setting, the authors create DNMT1/L1TD1 double knock-out cell lines (DKO). Curiously, while the loss of global DNA methylation doesn't impede proliferation, additional depletion of L1TD1 leads to DNA damage and apoptosis.
To unravel the molecular mechanism underpinning L1TD1's protective role in the absence of DNA methylation, the authors dissect L1TD1 complexes in terms of protein and RNA composition. They unveil an association with the LINE-1 transposon protein L1-ORF1 and LINE-1 transcripts, among others.
Surprisingly, the authors note fewer LINE-1 retro-transposition events in DKO cells compared to DNMT1 KO alone.
Strengths:
The authors present compelling data suggesting the interplay of a transposon-derived human RNA binding protein with its ancestral transposable element. Their findings spur interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.
Weaknesses:
Suggestions for refinement:
The initial experiment, inducing global hypo-methylation by eliminating DNMT1 in HAP1 cells, is intriguing and warrants more detailed description. How many genes experience misregulation or aberrant expression? What phenotypic changes occur in these cells? Why did the authors focus on L1TD1? Providing some of this data would be helpful to understand the rationale behind the thorough analysis of L1TD1.
The finding that L1TD1/DNMT1 DKO cells exhibit increased apoptosis and DNA damage but decreased L1 retro-transposition is unexpected. Considering the DNA damage associated with retro-transposition and the DNA damage and apoptosis observed in L1TD1/DNMT1 DKO cells, one would anticipate the opposite outcome. Could it be that the observation of fewer transposition-positive colonies stems from the demise of the most transposition-positive colonies? Further exploration of this phenomenon would be intriguing.
Reviewer #2 (Public review):
In this study, Kavaklıoğlu et al. investigated and presented evidence for a role for the domesticated transposon protein L1TD1 in enabling its ancestral relative, L1 ORF1p, to retrotranspose in HAP1 human tumor cells. The authors provided insight into the molecular function of L1TD1 and shed some clarifying light on previous studies that showed somewhat contradictory outcomes surrounding L1TD1 expression. Here, L1TD1 expression was correlated with L1 activation in a hypomethylation-dependent manner, due to DNMT1 deletion in the HAP1 cell line. The authors then identified L1TD1-associated RNAs using RIPSeq, which display a disconnect between transcript and protein abundance (via Tandem Mass Tag multiplex mass spectrometry analysis). The one exception was L1TD1 itself, which is consistent with a model in which the RNA transcripts associated with L1TD1 are not directly regulated at the translation level. Instead, the authors found L1TD1 protein associated with L1-RNPs, and this interaction is associated with increased L1 retrotransposition, at least in the context of HAP1 cells. Overall, these results support a model in which L1TD1 is restrained by DNA methylation, but in the absence of this repressive mark, L1TD1 is expressed and collaborates with L1 ORF1p (either directly or through interaction with L1 RNA, which remains unclear based on current results), leading to enhanced L1 retrotransposition. These results establish feasibility of this relationship existing in vivo in either development or disease, or both.
Comments on revised version:
In general, the authors did an acceptable job addressing the major concerns throughout the manuscript. This revision is much clearer and has improved in terms of logical progression.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
The authors have addressed all my questions in the revised version of the manuscript.
Reviewer #2 (Recommendations for the authors):
Revised comments:
A few points we'd like to see addressed are our comments about the model (Figure S7C), as this is important for the readership to understand this complex finding. Please try to apply some quantification, if possible (question 8). Please do your best to tone down the direct relationship of these findings to embryology (question 11). Based on both reviewers' comments, we believe addressing Reviewer #1's "Suggestions for refinement" (2 points) would help us change our view from solid to convincing.
Responses to changes:
Major
(1) The study only used one knockout (KO) cell line generated by CRISPR/Cas9.
Considering the possibility of an off-target effect, I suggest the authors attempt one or both of these suggestions.
A) Generate or acquire a similar DNMT1 deletion that uses distinct sgRNAs, so that the likelihood of off-targets is negligible. A few simple experiments such as qRT-PCR would be sufficient to suggest the same phenotype.
B) Confirm the DNMT1 depletion also by siRNA/ASO KD to phenocopy the KO effect.
(2) In addition to the strategies to demonstrate reproducibility, a rescue experiment restoring DNMT1 to the KO or KD cells would be more convincing. (Partial rescue would suffice in this case, as exact endogenous expression levels may be hard to replicate).
We have undertaken several approaches to study the effect of DNMT1 loss or inactivation: As described above, we have generated a conditional KO mouse with ablation of DNMT1 in the epidermis. DNMT1-deficient keratinocytes isolated from these mice show a significant increase in L1TD1 expression. In addition, treatment of primary human keratinocytes and two squamous cell carcinoma cell lines with the DNMT inhibitor aza-deoxycytidine led to upregulation of L1TD1 expression. Thus, the derepression of L1TD1 upon loss of DNMT1 expression or activity is not a clonal effect.
Also, the spectrum of RNAs identified in RIP experiments as L1TD1-associated transcripts in HAP1 DNMT1 KO cells showed a strong overlap with the RNAs isolated by a related yet different method in human embryonic stem cells. When it comes to the effect of L1TD1 on L1 retrotransposition, a recent study has reported a similar effect of L1TD1 upon overexpression in HeLa cells [4].
All of these points together help to convince us that our findings with HAP1 DNMT KO are in agreement with results obtained in various other cell systems and are therefore not due to off-target effects. With that in mind, we would pursue the suggestion of Reviewer 1 to analyze the effects of DNA hypomethylation upon DNMT1 ablation.
Thank you for addressing this concern. The reference to Beck 2021 and the additional cell lines (R2: keratinocytes and R3: squamous cell carcinoma) provide sufficient evidence that this result is unlikely to be a result of clonal expansion or off-target effects.
Question: Was the human ES Cell RIP Experiment shown here? What is the overlap?
We refer to the recently published study by Jin et al. (PMID: 38165001). As stated in the Discussion, the majority of L1TD1-associated transcripts in HAP1 cells (69%) identified in our study were also reported as L1TD1 targets in hESCs suggesting a conserved binding affinity of this domesticated transposon protein across different cell types.
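For readers who want to reproduce this kind of comparison, the overlap figure quoted above is most naturally read as the share of HAP1 L1TD1-associated transcripts that also appear in the hESC target list. The sketch below illustrates that reading only; the function name and the placeholder identifiers are hypothetical and are not taken from either dataset.

```python
def overlap_fraction(query_ids, reference_ids):
    """Fraction of unique query transcripts also present in the reference list.

    Assumption: the denominator is the query set (e.g. HAP1 hits), not the
    union of both sets, so this is not a Jaccard index.
    """
    query = set(query_ids)
    if not query:
        raise ValueError("query list is empty")
    return len(query & set(reference_ids)) / len(query)


# Placeholder identifiers, for illustration only.
hap1_hits = ["TX_A", "TX_B", "TX_C", "TX_D"]
hesc_targets = ["TX_A", "TX_B", "TX_C", "TX_X", "TX_Y"]
shared = overlap_fraction(hap1_hits, hesc_targets)
```

Swapping the two arguments changes the denominator and therefore the percentage, so it is worth stating which list serves as the reference whenever such a number is reported.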
(3) As stated in the introduction, L1TD1 and ORF1p share "sequence resemblance" (Martin 2006). Is the L1TD1 antibody specific or do we see L1 ORF1p if Fig 1C were uncropped?
(6) Is it possible the L1TD1 antibody binds L1 ORF1p? This could make Figure 2D somewhat difficult to interpret. Some validation of the specificity of the L1TD1 antibody would remove this concern (see minor concern below).
This is a relevant question. We are convinced that the L1TD1 antibody does not cross-react with L1 ORF1p for the following reasons: Firstly, the antibody does not recognize L1 ORF1p (40 kDa) in the uncropped Western blot for Figure 1C (Figure R4A). Secondly, the L1TD1 antibody gives only background signals in DKO cells in the indirect immunofluorescence experiment shown in Figure 1E of the manuscript.
Thirdly, the immunogen sequence of L1TD1 that determines the specificity of the antibody was checked in the antibody data sheet from Sigma Aldrich. The corresponding epitope is not present in the L1 ORF1p sequence.
Finally, we have shown that the ORF1p antibody does not cross-react with L1TD1 (Figure R4B).
Response: Thank you for sharing these images. These full images relieve concerns about specificity. The increase of ORF1P in R4B and Main figure 3C is interesting and pointed out in the manuscript. Not for the purposes of this review, but the observation of reduced transposition despite increased ORF1P could be an interesting follow up to this study (combined with the similar UPF1 result could indicate a complex of some kind).
(4) In abstract (P2), the authors mentioned that L1TD1 works as an RNA chaperone, but in the result section (P13), they showed that L1TD1 associates with L1 ORF1p in an RNA independent manner. Those conclusions appear contradictory. Clarification or revision is required.
Our findings that both proteins bind L1 RNA, and that L1TD1 interacts with ORF1p are compatible with a scenario where L1TD1/ORF1p heteromultimers bind to L1 RNA. The additional presence of L1TD1 might thereby enhance the RNA chaperone function of ORF1p. This model is visualized now in Suppl. Figure S7C.
Response: Thank you for the model. To further clarify, do you mean that L1TD1 can bind L1 RNA, but this is not needed for the effect, however this "bonus" binding (that is enabled by heteromultimerization) appears to enhance the retrotransposition frequency? Do you think L1TD1 is binding L1 RNA in this context or simply "stabilizing" ORF1P (Trimer) RNP?
Based on our data, L1TD1 associates with L1 RNA and interacts with L1 ORF1p. Both features might contribute to the enhanced retrotransposition frequency. Interestingly, the L1TD1 protein shares with its ancestor L1 ORF1p the non-canonical RNA recognition motif and the coiled-coil motif required for the trimerization but has two copies instead of one of the C-terminal domain (CTD), a structure with RNA binding and chaperone function. We speculate that the presence of an additional CTD within the L1TD1 protein might thereby enhance the RNA binding and chaperone function of L1TD1/ORF1p heteromultimers.
(5) Figure 2C fold enrichment for L1TD1 and ARMC1 is a bit difficult to fully appreciate. A 100 to 200-fold enrichment does not seem physiological. This appears to be a "divide by zero" type of result, as the CT for these genes was likely near 40 or undetectable. Another qRT-PCR based approach (absolute quantification) would be a more revealing experiment.
This is the validation of the RIP experiments, and the presentation mode is specifically developed for quantification of RIP assays (Sigma Aldrich RIP-qRT-PCR: Data Analysis Calculation Shell). The unspecific binding of the transcript in the absence of L1TD1 in DNMT1/L1TD1 DKO cells is set to 1, and the value in KO cells represents the specific binding relative to the unspecific binding. The calculation also corrects for potential differences in the abundance of the respective transcript in the two cell lines. This is not a physiological value but the quantification of specific binding of transcripts to L1TD1. GAPDH as negative control shows no enrichment, whereas specifically associated transcripts show strong enrichment. We have explained the details of RIP-qRT-PCR evaluation in Materials and Methods (page 14) and the legend of Figure 2C in the revised manuscript.
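To make the normalisation described in this response concrete, the following is a minimal sketch of an input-corrected ΔΔCt fold-enrichment calculation. It assumes 100% amplification efficiency (a factor of 2 per cycle) and a known input fraction; it is not a reproduction of the Sigma Aldrich calculation shell, and the function names and the default `input_fraction` are assumptions made for illustration.

```python
import math


def normalized_ip_ct(ct_ip, ct_input, input_fraction):
    """Ct of the IP relative to its own input, scaled to 100% of the material.

    input_fraction is the share of lysate kept as input (assumed, e.g. 0.1).
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return ct_ip - adjusted_input_ct


def fold_enrichment(ko_ip, ko_input, dko_ip, dko_input, input_fraction=0.1):
    """Specific binding in KO relative to unspecific binding in DKO (set to 1)."""
    delta_ko = normalized_ip_ct(ko_ip, ko_input, input_fraction)
    delta_dko = normalized_ip_ct(dko_ip, dko_input, input_fraction)
    # Each cycle of difference corresponds to a factor of 2.
    return 2.0 ** -(delta_ko - delta_dko)


# Hypothetical Ct values: the KO IP crosses threshold 7 cycles earlier than
# the DKO IP at equal input, which corresponds to 2**7 = 128-fold enrichment.
example = fold_enrichment(ko_ip=25.0, ko_input=22.0, dko_ip=32.0, dko_input=22.0)
```

Because each IP is normalised to the input of its own cell line, a transcript that is simply more abundant in KO cells does not score as enriched. The sketch also shows why a late or undetected Ct in the negative control yields very large ratios, which is the concern raised in the comment.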
Response: Thank you for the clarification and additional information in the manuscript.
(6) Is it possible the L1TD1 antibody binds L1 ORF1p? This could make Figure 2D somewhat difficult to interpret. Some validation of the specificity of the L1TD1 antibody would remove this concern (see minor concern below).
See response to (3).
Response: Thanks.
(7) Figure S4A and S4B: There appear to be a few unusual aspects of these figures that should be pointed out and addressed. First, there doesn't seem to be any ORF1p in the Input (if there is, the exposure is too low). Second, there might be some L1TD1 in the DKO (lane 2) and lane 3. This could be non-specific, but the size is concerning. Overexposure would help see this.
The ORF1p IP gives rise to strong ORF1p signals in the immunoprecipitated complexes even after short exposure. Under these conditions ORF1p is hardly detectable in the input. Regarding the faint band in DKO HAP1 cells, this might be due to a technical problem during Western blot loading. Therefore, the input samples were loaded again on a Western blot and analyzed for the presence of ORF1p, L1TD1 and beta-actin (as loading control) and shown as separate panel in Suppl. Figure S4A.
The enhanced image is clearer. Thanks.
S4A and S4B now appear to the S6A and S6B, is that correct? (This is due to the addition of new S1 and S2, but please verify image orders were not disturbed).
Yes, the input is shown now as a separate panel in Suppl. Figure S6A.
(8) Figure S4C: This is related to our previous concerns involving antibody cross-reactivity. Figure 3E partially addresses this, where it looks like the L1TD1 "speckles" outnumber the ORF1p puncta, but overlap with all of them. This might be consistent with the antibody cross-reacting. The western blot (Figure 3C) suggests an upregulation of ORF1p by at least 23x in the DKO, but the IF image in 3E is hard to tell if this is the case (slightly more signal, but fewer foci). Can you return to the images and confirm the contrasts are comparable? Can you massively overexpose the red channel in 3E to see if there is residual overlap?
In Figure 3E the L1TD1 antibody gives no signal in DNMT1/L1TD1 DKO cells, confirming that it does not recognize ORF1p. In agreement with the Western blot in Figure 3C, the L1 ORF1p signal in Figure 3E is stronger in DKO cells. In DNMT1 KO cells the L1 ORF1p antibody does not recognize all L1TD1 speckles. This result is in agreement with the Western blot shown above in Figure R4B and indicates that the L1 ORF1p antibody does not recognize the L1TD1 protein. The contrast is comparable, and after overexposure there are still L1TD1-specific speckles. This might be due to differences in abundance of the two proteins.
Response: Suggestion: Would it be possible to use a program like ImageJ to supplement the western blot observation? Qualitatively, in Figure 3E, it appears that there is more signal in the DKO, but this could also be due to there being multiple cells clustered together or a particularly nicely stained region. Could you randomly sample 20-30 cells across a few experiments to see if this holds up? I am interested in whether the puncta in the KO image(s) are a very highly concentrated region and whether in the DKO this is more dispersed. Also, the representative DKO seems to be cropped slightly wrong. (Please use puncta as a guide to make the cropping more precise.)
As suggested by the reviewer, we have quantified the signals of 60 KO cells and 56 DKO cells in three different IF experiments by ImageJ. We measured a 1.4-fold higher expression level of L1 ORF1p in DKO cells. However, the difference is not statistically significant. This is most probably due to the change in cell size and protein content during the cell cycle, with increasing protein content from G1 to G2. Western blot analysis provides signals of comparable protein amounts representing an average expression level over tens of thousands of cells. Nevertheless, the quantification results reflect in principle the IF pictures shown in Figure 3E, but IF is probably not the best method to quantify protein amounts. We have also corrected Figure 3E.
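One way such per-cell measurements could be compared is sketched below: a fold change of mean intensities together with a two-sided permutation test, which makes no assumption about the shape of the intensity distribution. The response does not state which test was used, so this is an illustrative choice; the function names are hypothetical and the example values are placeholders, not measured data.

```python
import random
from statistics import mean


def fold_change(ko, dko):
    """Ratio of mean per-cell intensity, DKO over KO."""
    return mean(dko) / mean(ko)


def permutation_p_value(ko, dko, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(dko) - mean(ko))
    pooled = list(ko) + list(dko)
    n_ko = len(ko)
    hits = 0
    for _ in range(n_perm):
        # Reassign cells to the two groups at random, keeping group sizes.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_ko:]) - mean(pooled[:n_ko]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Placeholder intensities (arbitrary units), for illustration only.
ko_cells = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7]
dko_cells = [1.2, 1.6, 0.9, 1.9, 1.1, 1.5]
ratio = fold_change(ko_cells, dko_cells)
p_value = permutation_p_value(ko_cells, dko_cells)
```

With wide cell-to-cell variation, as would be expected across the cell cycle, a moderate fold change can easily fail to reach significance at these sample sizes, which is in line with the explanation given in the response.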
Author response image 1.
(9) The choice of ARMC1 and YY2 is unclear. What are the criteria for the selection?
ARMC1 was one of the top hits in a pilot RIP-seq experiment (IP versus input and IP versus IgG IP). In the actual RIP-seq experiment with DKO HAP1 cells instead of IgG IP as a negative control, we found ARMC1 as an enriched hit, although it was not among the top 5 hits. The results from the 2nd RIP-seq further confirmed the validity of ARMC1 as an L1TD1-interacting transcript. YY2 was of potential biological relevance as an L1TD1 target due to the fact that it is a processed pseudogene originating from YY1 mRNA as a result of retrotransposition. This is mentioned on page 6 of the revised manuscript.
Response: Appreciated!
(10) (P16) L1 is the only protein-coding transposon that is active in humans. This is perhaps too generalized of a statement as written. Other examples are readily found in the literature.
Please clarify.
We will tone down this statement in the revised manuscript.
Response: Appreciated! To further clarify, the term "active" when it comes to transposable elements, has not been solidified. It can span "retrotransposition competent" to "transcripts can be recovered". There are quite a few reports of GAG transcripts and protein from various ERV/LTR subfamilies in various cells and tissues (in mouse and human at least), however whether they contribute to new insertions is actively researched.
(11) In both the abstract and last sentence in the discussion section (P17), embryogenesis is mentioned, but this is not addressed at all in the manuscript. Please refrain from implying normal biological functions based on the results of this study unless appropriate samples are used to support them.
Much of the published data on L1TD1 function are related to embryonic stem cells [3-7]. Therefore, it is important to discuss our findings in the context of previous reports.
Response: It is well established that embryonic stem cells are not perfect or direct proxies for the inner cell mass of embryos, as multiple reports have demonstrated transcriptomic, epigenetic, and chromatin accessibility differences. The exact origin of ES cells is also considered controversial. We maintain that embryos/embryogenesis and the results presented in the manuscript are not yet interchangeable. An important exception would be complex models of embryogenesis such as embryoids (or synthetic/artificial embryo models that have been carefully termed as such so as to not suggest direct implications for embryos). https://www.nature.com/articles/ncb2965
We have deleted the corresponding paragraph in the Discussion.
(12) Figure 3E: The format of Figures 1A and 3E are internally inconsistent. Please present similar data/images in a cohesive way throughout the manuscript.
We now show consistent IF figures in the revised manuscript.
Response: Thanks
Minor:
In general:
Still need checking for typos, mostly in Materials and Methods section; Please keep a consistent writing style throughout the whole manuscript. If you use L1 ORF1p, then please use L1 instead of LINE-1, or if you keep LINE-1 in your manuscript, then you should use LINE-1 ORF1p.
A lab member from the US checked again the Materials and Methods section for typos. We keep the short version L1 ORF1p.
(1) Intro:
- Is L1TD1 in mice and humans? How "conserved" is it and does this suggest function?
Murine and human L1TD1 proteins share 44% identity on the amino acid level, and it was suggested that the corresponding genes were under positive selection during evolution with functions in transposon control and maintenance of pluripotency [8].
- Why HAP1? (Haploid?) The importance of this cell line is not clear.
HAP1 is a nearly haploid human cancer cell line derived from the KBM-7 chronic myelogenous leukemia (CML) cell line [9, 10]. Due to its haploidy, it is perfectly suited and widely used for loss-of-function screens and gene editing. After gene editing, cells can be used in the nearly haploid or in the diploid state. We usually perform all experiments with diploid HAP1 cell lines. Importantly, in contrast to other human tumor cell lines, this cell line tolerates ablation of DNMT1. We have included a corresponding explanation in the revised manuscript on page 5, first paragraph.
- Global methylation status in DNMT1 KO? (Methylations near L1 insertions, for example?)
The HAP1 DNMT1 KO cell line with a 20 bp deletion in exon 4 used in our study was validated in the study by Smits et al. [11]. The authors report a significant reduction in overall DNA methylation. However, we are not aware of a DNA methylome study on this cell line. We show now data on the methylation of L1 elements in HAP1 cells and upon DNMT1 deletion in the revised manuscript in Suppl. Figure S1B.
Response: Looks great!
(2) Figure 1:
- Figure 1C. Why is LMNB used instead of Actin (Fig1D)?
We show now beta-actin as loading control in the revised manuscript.
- Figure 1G shows increased Caspase 3 in KO, while the matching sentence in the result section skips over this. It might be more accurate to mention this and suggest that the single KO has perhaps an intermediate phenotype (Figure 1F shows a slight but not significant trend).
We fully agree with the reviewer and have changed the sentence on page 6, 2nd paragraph accordingly.
- Would 96 hrs trend closer to significance? An interpretation is that L1TD1 loss could speed up this negative consequence.
We thank the reviewer for the suggestion. We have performed a time course experiment with 6 biological replicates for each time point up to 96 hours and found significant changes in viability upon loss of DNMT1 and a further significant reduction in viability upon additional loss of L1TD1 (shown in Figure 1F). These data suggest that, as expected, loss of DNMT1 leads to a significant reduction in viability and that additional ablation of L1TD1 further enhances this effect.
Response: Looks good!
- What are the "stringent conditions" used to remove non-specific binders and artifacts (negative control subtraction?)
Yes, we considered only hits from both analyses, L1TD1 IP in KO versus input and L1TD1 IP in KO versus L1TD1 IP in DKO. This is now explained in more detail in the revised manuscript on page 6, 3rd paragraph.
(3) Figure 2:
- Figure 2A is a bit too small to read when printed.
We have changed this in the revised manuscript.
- Since WT and DKO lack detectable L1TD1, would you expect any difference in RIP-Seq results between these two?
Due to the lack of DNMT1 and the resulting DNA hypomethylation, DKO cells are more similar to KO cells than WT cells with respect to the expressed transcripts.
- Legend says selected dots are in green (it appears blue to me).
We have changed this in the revised manuscript.
- Would you recover L1 ORF1p and its binding partners in the KO? (Is the antibody specific in the absence of L1TD1 or can it recognize L1?) I noticed an increase in ORF1p in the KO in Figure 3C.
Thank you for the suggestion. Yes, L1 ORF1p shows slightly increased expression in the proteome analysis and we have marked the corresponding dot in the Volcano plot (Figure 3A).
- Should the figure panel reference near the (Rosspopoff & Trono) reference instead be Sup S1C as well? Otherwise, I don't think S1C is mentioned at all.
- What are the red vs. green dots in 2D? Can you highlight ERV and ALU with different colors?
We added the reference to Suppl. Figure S1C (now S3C) in the revised manuscript. In Figure 2D L1 elements are highlighted in green, ERV elements in yellow, and other associated transposon transcripts in red.
Response: Much better, thanks!
- Which L1 subfamily from Figure 2D is represented in the qRT-PCR in 2E "LINE-1"? Do the primers match a specific L1 subfamily? If so, which?
We used primers specific for the human L1.2 subfamily.
- Pulling down SINE element transcripts makes some sense, as many insertions "borrow" L1 sequences for non-autonomous retro transposition, but can you speculate as to why ERVs are recovered? There should be essentially no overlap in sequence.
In the L1TD1 evolution paper [8], a potential link between L1TD1 and ERV elements was discussed:
"Alternatively, L1TD1 in sigmodonts could play a role in genome defense against another element active in these genomes. Indeed, the sigmodontine rodents have a highly active family of ERVs, the mysTR elements [46]. Expansion of this family preceded the death of L1s, but these elements are very active, with 3500 to 7000 species-specific insertions in the L1-extinct species examined [47]. This recent ERV amplification in Sigmodontinae contrasts with the megabats (where L1TD1 has been lost in many species); there are apparently no highly active DNA or RNA elements in megabats [48]. If L1TD1 can suppress retroelements other than L1s, this could explain why the gene is retained in sigmodontine rodents but not in megabats."
Furthermore, Jin et al. report the binding of L1TD1 to repetitive sequences in transcripts [12]. It is possible that some of these sequences are also present in ERV RNAs.
Response: Interesting, thanks for sharing
- Is S2B a screenshot? (the red underline).
No, it is a Powerpoint figure, and we have removed the red underline.
(4) Figure 3:
- Text refers to Figure 3B as a western blot. Figure 3B shows a volcano plot. This is likely 3C but would still be out of order (3A>3C>3B referencing). I think this error is repeated in the last result section.
- Figure and legends fail to mention what gene was used for ddCT method (actin, gapdh, etc.).
- In general, the supplemental legends feel underwritten and could benefit from additional explanations. (Main figures are appropriate but please double-check that all statistical tests have been mentioned correctly).
Thank you for pointing this out. We have corrected these errors in the revised manuscript.
(5) Discussion:
- Aluy connection is interesting. Is there an "Alu retrotransposition reporter assay" to test whether L1TD1 enhances this as well?
Thank you for the suggestion. There is indeed an Alu retrotransposition reporter assay reported by Dewannieux et al. [13]. The assay is based on a Neo selection marker. We have previously tested a Neo selection-based L1 retrotransposition reporter assay, but this system failed to work properly in HAP1 cells; therefore, we switched to a blasticidin-based L1 retrotransposition reporter assay. A corresponding blasticidin-based Alu retrotransposition reporter assay might be interesting for future studies (mentioned in the Discussion, page 11, paragraph 4 of the revised manuscript).
(6) Material and Methods :
- The number of typos in the materials and methods is too numerous to list. Instead, please refer to the next section that broadly describes the issues seen throughout the manuscript.
Writing style
(1) Keep a consistent style throughout the manuscript: for example, L1 or LINE-1 (also L1 ORF1p or LINE-1 ORF1p); per or "/"; knockout or knock-out; min or minute; 3 times or three times; media or medium. Additionally, as TE naming conventions are not uniform, it is important to maintain internal consistency so as to not accidentally establish an imprecise version.
(2) There's a period between "et al" and the comma, and "et al." should be italic.
(3) The authors should explain what the key jargon is when it is first used in the manuscript, such as "retrotransposon" and "retrotransposition".
(4) The authors should show the full spelling of some acronyms when they use it for the first time, such as RNA Immunoprecipitation (RIP).
(5) Use a space between numbers and alphabets, such as 5 μg.
(6) 2.0 × 10^5 cells, that's not an "x".
(7) Numbers in the reference section are lacking (hard to parse).
(8) In general, there are a significant number of typos in this draft which at times becomes distracting. For example, (P3) Introduction: Yet, co-option of TEs thorough (not thorough, it should be through) evolution has created so-called domesticated genes beneficial to the gene network in a wide range of organisms. Please carefully revise the entire manuscript for these minor issues that collectively erode the quality of this submission.
Thank you for pointing out these mistakes. We have corrected them in the revised manuscript. A native speaker from our research group has carefully checked the paper. In summary, we have added Supplementary Figure S7C and have changed Figures 1C, 1E, 1F, 2A, 2D, 3A, 4B, S3A-D, S4B and S6A based on these comments.
Response: Thank you for taking these comments on board!
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Public review):
Wang et al., recorded concurrent EEG-fMRI in 107 participants during nocturnal NREM sleep to investigate brain activity and connectivity related to slow oscillations (SO), sleep spindles, and in particular their co-occurrence. The authors found SO-spindle coupling to be correlated with increased thalamic and hippocampal activity, and with increased functional connectivity from the hippocampus to the thalamus and from the thalamus to the neocortex, especially the medial prefrontal cortex (mPFC). They concluded the brain-wide activation pattern to resemble episodic memory processing, but to be dissociated from task-related processing and suggest that the thalamus plays a crucial role in coordinating the hippocampal-cortical dialogue during sleep.
The paper offers an impressively large and highly valuable dataset that provides the opportunity for gaining important new insights into the network substrate involved in SOs, spindles, and their coupling. However, the paper does unfortunately not exploit the full potential of this dataset with the analyses currently provided, and the interpretation of the results is often not backed up by the results presented. I have the following specific comments.
Thank you for your thoughtful and constructive feedback. We greatly appreciate your recognition of the strengths of our dataset and findings. Below, we respond to each point you raised to ensure our methods and results are as transparent and comprehensible as possible. We hope these revisions address your comments and further strengthen our manuscript.
(1) The introduction is lacking sufficient review of the already existing literature on EEG-fMRI during sleep and the BOLD-correlates of slow oscillations and spindles in particular (Laufs et al., 2007; Schabus et al., 2007; Horovitz et al., 2008; Laufs, 2008; Czisch et al., 2009; Picchioni et al., 2010; Spoormaker et al., 2010; Caporro et al., 2011; Bergmann et al., 2012; Hale et al., 2016; Fogel et al., 2017; Moehlman et al., 2018; Ilhan-Bayrakci et al., 2022). The few studies mentioned are not discussed in terms of the methods used or insights gained.
We acknowledge the need for a more comprehensive review of prior EEG-fMRI studies investigating BOLD correlates of slow oscillations and spindles. However, not all of these articles concern sleep SOs or spindles. Several (Hale et al., 2016; Horovitz et al., 2008; Laufs, 2008; Laufs, Walker, & Lund, 2007; Spoormaker et al., 2010) mainly focus on methodology for EEG-fMRI, sleep stages, or brain networks, which are not the focus of our study. Thank you again for your attention to the comprehensiveness of our literature review. We will expand the introduction to include a more detailed discussion of the existing literature, ensuring that the contributions of previous EEG-fMRI sleep studies are adequately acknowledged.
Introduction, Page 4 Lines 62-76
“Investigating these sleep-related neural processes in humans is challenging because it requires tracking transient sleep rhythms while simultaneously assessing their widespread brain activation. Recent advances in simultaneous EEG-fMRI techniques provide a unique opportunity to explore these processes. EEG allows for precise event-based detection of neural signal, while fMRI provides insight into the broader spatial patterns of brain activation and functional connectivity (Horovitz et al., 2008; Huang et al., 2024; Laufs, 2008; Laufs, Walker, & Lund, 2007; Schabus et al., 2007; Spoormaker et al., 2010). Previous EEG-fMRI studies on sleep have focused on classifying sleep stages or examining the neural correlates of specific waves (Bergmann et al., 2012; Caporro et al., 2012; Czisch et al., 2009; Fogel et al., 2017; Hale et al., 2016; Ilhan-Bayrakcı et al., 2022; Moehlman et al., 2019; Picchioni et al., 2011). These studies have generally reported that slow oscillations are associated with widespread cortical and subcortical BOLD changes, whereas spindles elicit activation in the thalamus, as well as in several cortical and paralimbic regions. Although these findings provide valuable insights into the BOLD correlates of sleep rhythms, they often do not employ sophisticated temporal modeling (Huang et al., 2024), to capture the dynamic interactions between different oscillatory events, e.g., the coupling between SOs and spindles.”
(2) The paper falls short in discussing the specific insights gained into the neurobiological substrate of the investigated slow oscillations, spindles, and their interactions. The validity of the inverse inference approach ("Open ended cognitive state decoding"), assuming certain cognitive functions to be related to these oscillations because of the brain regions/networks activated in temporal association with these events, is debatable at best. It is also unclear why eventually only episodic memory processing-like brain-wide activation is discussed further, although 16 of 50 feature terms from the NeuroSynth v3 dataset were significant (episodic memory, declarative memory, working memory, task representation, language, learning, faces, visuospatial processing, category recognition, cognitive control, reading, cued attention, inhibition, and action).
Thank you for pointing this out, particularly regarding the use of inverse inference approaches such as “open-ended cognitive state decoding.” Given the concerns about the indirectness of this approach, we decided to remove its related content and results from Figure 3 in the main text and include it in Supplementary Figure 7. We will refocus the main text on direct neurobiological insights gained from our EEG-fMRI analyses, particularly emphasizing the hippocampal-thalamocortical network dynamics underlying SO-spindle coupling, and we will acknowledge the exploratory nature of these findings and highlight their limitations.
Discussion, Page 17-18 Lines 323-332
“To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”
(3) Hippocampal activation during SO-spindles is stated as a main hypothesis of the paper - for good reasons - however, other regions (e.g., several cortical as well as thalamic) would be equally expected given the known origin of both oscillations and the existing sleep-EEG-fMRI literature. However, this focus on the hippocampus contrasts with the focus on investigating the key role of the thalamus instead in the Results section.
We appreciate your insight regarding the relative emphasis on hippocampal and thalamic activation in our study. We recognize that the manuscript may currently present an inconsistency between our initial hypothesis and the main focus of the results. To address this concern, we will ensure that our Introduction and Discussion sections explicitly discuss both regions, highlighting the complementary roles of the hippocampus (memory processing and reactivation) and the thalamus (spindle generation and cortico-hippocampal coordination) in SO-spindle dynamics.
Introduction, Page 5 Lines 87-103
“To address this gap, our study investigates brain-wide activation and functional connectivity patterns associated with SO-spindle coupling, and employs a cognitive state decoding approach (Margulies et al., 2016; Yarkoni et al., 2011)—albeit indirectly—to infer potential cognitive functions. In the current study, we used simultaneous EEG-fMRI recordings during nocturnal naps (detailed sleep staging results are provided in the Methods and Table S1) in 107 participants. Although directly detecting hippocampal ripples using scalp EEG or fMRI is challenging, we expected that hippocampal activation in fMRI would coincide with SO-spindle coupling detected by EEG, given that SOs, spindles, and ripples frequently co-occur during NREM sleep. We also anticipated a critical role of the thalamus, particularly thalamic spindles, in coordinating hippocampal-cortical communication.
We found significant coupling between SOs and spindles during NREM sleep (N2/3), with spindle peaks occurring slightly before the SO peak. This coupling was associated with increased activation in both the thalamus and hippocampus, with functional connectivity patterns suggesting thalamic coordination of hippocampal-cortical communication. These findings highlight the key role of the thalamus in coordinating hippocampal-cortical interactions during human sleep and provide new insights into the neural mechanisms underlying sleep-dependent brain communication. A deeper understanding of these mechanisms may contribute to future neuromodulation approaches aimed at enhancing sleep-dependent cognitive function and treating sleep-related disorders.”
Discussion, Page 16-17 Lines 292-307
“When modeling the timing of these sleep rhythms in the fMRI, we observed hippocampal activation selectively during SO-spindle events. This suggests the possibility of triple coupling (SOs–spindles–ripples), even though our scalp EEG was not sufficiently sensitive to detect hippocampal ripples—key markers of memory replay (Buzsáki, 2015). Recent iEEG evidence indicates that ripples often co-occur with both spindles (Ngo, Fell, & Staresina, 2020) and SOs (Staresina et al., 2015; Staresina et al., 2023). Therefore, the hippocampal involvement during SO-spindle events in our study may reflect memory replay from the hippocampus, propagated via thalamic spindles to distributed cortical regions.
The thalamus, known to generate spindles (Halassa et al., 2011), plays a key role in producing and coordinating sleep rhythms (Coulon, Budde, & Pape, 2012; Crunelli et al., 2018), while the hippocampus is found essential for memory consolidation (Buzsáki, 2015; Diba & Buzsáki, 2007; Singh, Norman, & Schapiro, 2022). The increased hippocampal and thalamic activity, along with strengthened connectivity between these regions and the mPFC during SO-spindle events, underscores a hippocampal-thalamic-neocortical information flow. This aligns with recent findings suggesting the thalamus orchestrates neocortical oscillations during sleep (Schreiner et al., 2022). The thalamus and hippocampus thus appear central to memory consolidation during sleep, guiding information transfer to the neocortex, e.g., mPFC.”
(4) The study included an impressive number of 107 subjects. It is surprising though that only 31 subjects had to be excluded under these difficult recording conditions, especially since no adaptation night was performed. Since only subjects were excluded who slept less than 10 min (or had excessive head movements) there are likely several datasets included with comparably short durations and only a small number of SOs and spindles and even less combined SO-spindle events. A comprehensive table should be provided (supplement) including for each subject (included and excluded) the duration of included NREM sleep, number of SOs, spindles, and SO+spindle events. Also, some descriptive statistics (mean/SD/range) would be helpful.
We appreciate your recognition of our sample size and the challenges associated with simultaneous EEG-fMRI sleep recordings. We acknowledge the importance of transparently reporting individual subject data, particularly regarding sleep duration and the number of detected SOs, spindles, and SO-spindle events. To address this, we will provide comprehensive tables in the supplementary materials, containing descriptive information about sleep-related characteristics (Table S1) as well as detailed information about sleep waves at each sleep stage for all 107 subjects (Tables S2-S4), listing for each subject: (1) duration of each sleep stage; (2) number of detected SOs; (3) number of detected spindles; (4) number of detected SO-spindle coupling events; (5) density of detected SOs; (6) density of detected spindles; (7) density of detected SO-spindle coupling events.
However, most of the excluded participants were unable to fall asleep or slept too briefly and therefore had essentially no NREM sleep, so NREM sleep duration and the numbers of SOs, spindles, and coupling events could not be computed for them.
Supplementary Materials, Page 42-54, Table S1-S4
(Given their length, we do not reproduce the tables here. Please refer to the revised manuscript.)
(5) Was the 20-channel head coil dedicated for EEG-fMRI measurements? How were the electrode cables guided through/out of the head coil? Usually, the 64-channel head coil is used for EEG-fMRI measurements in a Siemens PRISMA 3T scanner, which has a cable duct at the back that allows to guide the cables straight out of the head coil (to minimize MR-related artifacts). The choice for the 20-channel head coil should be motivated. Photos of the recording setup would also be helpful.
Thank you for your comment regarding our choice of the 20-channel head coil for EEG-fMRI measurements. We acknowledge that the 64-channel head coil is commonly used in Siemens PRISMA 3T scanners; however, the 20-channel coil was selected due to specific practical and technical considerations in our study. In particular, the 20-channel head coil was compatible with our EEG system and ensured sufficient signal-to-noise ratio (SNR) for both EEG and fMRI acquisition. The EEG electrode cables were guided through the lateral and posterior openings of the head coil, secured with foam padding to reduce motion and minimize MR-related artifacts. Moreover, given the extended nature of nocturnal sleep recordings, the 20-channel coil allowed us to maintain participant comfort while still achieving high-quality simultaneous EEG-fMRI data.
We have made this clearer in the revised manuscript.
Methods, Page 20 Lines 385-392
“All MRI data were acquired using a 20-channel head coil on a research-dedicated 3-Tesla Siemens Magnetom Prisma MRI scanner. Earplugs and cushions were provided for noise protection and head motion restriction. We chose the 20-channel head coil because it was compatible with our EEG system and ensured sufficient signal-to-noise ratio (SNR) for both EEG and fMRI acquisition. The EEG electrode cables were guided through the lateral and posterior openings of the head coil, secured with foam padding to reduce motion and minimize MR-related artifacts. Moreover, given the extended nature of nocturnal sleep recordings, the 20-channel coil helped maintain participant comfort while still achieving high-quality simultaneous EEG-fMRI data.”
(6) Was the EEG sampling synchronized to the MR scanner (gradient system) clock (the 10 MHz signal; not referring to the volume TTL triggers here)? This is a requirement for stable gradient artifact shape over time and thus accurate gradient noise removal.
Thank you for raising this important point. We confirm that the EEG sampling was synchronized to the MR scanner’s 10 MHz gradient system clock, ensuring a stable gradient artifact shape over time and enabling accurate artifact removal. This synchronization was achieved using the standard clock synchronization interface of the EEG amplifier, minimizing timing jitter and drift. As a result, the gradient artifact waveform remained stable across volumes, allowing for more effective artifact correction during preprocessing. We appreciate your attention to this critical aspect of EEG-fMRI data acquisition.
We have made this clearer in the revised manuscript.
Methods, Page 19-20 Lines 371-383
“EEG was recorded simultaneously with fMRI data using an MR-compatible EEG amplifier system (BrainAmps MR-Plus, Brain Products, Germany), along with a specialized electrode cap. The recording was done using 64 channels in the international 10/20 system, with the reference channel positioned at FCz. In order to adhere to polysomnography (PSG) recording standards, six electrodes were removed from the EEG cap: one for electrocardiogram (ECG) recording, two for electrooculogram (EOG) recording, and three for electromyogram (EMG) recording. EEG data was recorded at a sample rate of 5000 Hz, the resistance of the reference and ground channels was kept below 10 kΩ, and the resistance of the other channels was kept below 20 kΩ. To synchronize the EEG and fMRI recordings, the BrainVision recording software (BrainProducts, Germany) was utilized to capture triggers from the MRI scanner. The EEG sampling was synchronized to the MR scanner’s 10 MHz gradient system clock, ensuring a stable gradient artifact shape over time and enabling accurate artifact removal. This was achieved via the standard clock synchronization interface of the EEG amplifier, minimizing timing jitter and drift.”
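The benefit of clock synchronization can be illustrated with simple arithmetic. The sketch below is not part of the authors' pipeline. It uses the 10 MHz clock and 5000 Hz sampling rate stated above, together with the TR of 2000 ms reported in the sequence specification, to show that each EEG sample and each volume spans a whole number of clock ticks and samples. The `drift_samples` helper and the 20 ppm mismatch are hypothetical, introduced only to show how a free-running EEG clock would slide relative to the gradient artifact.

```python
from fractions import Fraction

CLOCK_HZ = 10_000_000   # scanner gradient system clock (stated in the response)
EEG_RATE_HZ = 5000      # EEG sampling rate (stated in the Methods excerpt)
TR_S = Fraction(2)      # TR = 2000 ms (from the sequence specification)

# With a synchronized clock, both ratios are whole numbers, so every volume
# is sampled at the same phase of the gradient waveform.
ticks_per_sample = Fraction(CLOCK_HZ, EEG_RATE_HZ)
samples_per_tr = EEG_RATE_HZ * TR_S


def drift_samples(ppm_offset, duration_s):
    """Samples of drift accumulated by an unsynchronized EEG clock that is
    off by ppm_offset parts per million over duration_s seconds."""
    return Fraction(ppm_offset, 1_000_000) * EEG_RATE_HZ * duration_s


# Hypothetical example: a 20 ppm mismatch over one hour of recording.
drift = drift_samples(20, 3600)
print(ticks_per_sample, samples_per_tr, drift)
```

Under these assumptions, an unsynchronized recording would shift by hundreds of samples relative to the scanner over an hour, which is why template-based gradient artifact subtraction relies on a shared clock.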
(7) The TR is quite long and the voxel size is quite large in comparison to state-of-the-art EPI sequences. What was the rationale behind choosing a sequence with relatively low temporal and spatial resolution?
We acknowledge that our chosen TR and voxel size are relatively long and large compared to state-of-the-art EPI sequences. This decision was made to optimize the signal-to-noise ratio (SNR) and reduce susceptibility-related distortions, which are particularly critical in EEG-fMRI sleep studies where head motion and physiological noise can be substantial. A longer TR allowed us to sample whole-brain activity with sufficient coverage, while a larger voxel size helped enhance BOLD sensitivity and minimize partial volume effects in deep brain structures such as the thalamus and hippocampus, which are key regions of interest in our study. We appreciate your concern and hope this clarification provides sufficient rationale for our sequence parameters.
We have made this clearer in the revised manuscript.
Methods, Page 20-21 Lines 398-408
“Then, the “sleep” session began after the participants were instructed to try and fall asleep. For the functional scans, whole-brain images were acquired using k-space and steady-state T2*-weighted gradient echo-planar imaging (EPI) sequence that is sensitive to the BOLD contrast. This measures local magnetic changes caused by changes in blood oxygenation that accompany neural activity (sequence specification: 33 slices in interleaved ascending order, TR = 2000 ms, TE = 30 ms, voxel size = 3.5 × 3.5 × 4.2 mm<sup>3</sup>, FA = 90°, matrix = 64 × 64, gap = 0.7 mm). A relatively long TR and larger voxel size were chosen to optimize SNR and reduce susceptibility-related distortions, which are critical in EEG-fMRI sleep studies where head motion and physiological noise can be substantial. The longer TR allowed whole-brain coverage with sufficient temporal resolution, while the larger voxel size helped enhance BOLD sensitivity and minimize partial volume effects in deep brain structures (e.g., the thalamus and hippocampus), which are key regions of interest in this study.”
(8) The anatomically defined ROIs are quite large. It should be elaborated on how this might reduce sensitivity to sleep rhythm-specific activity within sub-regions, especially for the thalamus, which has distinct nuclei involved in sleep functions.
We appreciate your insight regarding the use of anatomically defined ROIs and their potential limitations in detecting sleep rhythm-specific activity within sub-regions, particularly in the thalamus. Given the distinct functional roles of thalamic nuclei in sleep processes, we acknowledge that using a single, large thalamic ROI may reduce sensitivity to localized activity patterns. To address this, we will discuss this limitation in the revised manuscript, acknowledging that our approach prioritizes whole-structure effects but may not fully capture nucleus-specific contributions.
Discussion, Page 18 Lines 333-341
“Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”
(9) The study reports SO & spindle amplitudes & densities, as well as SO+spindle coupling, to be larger during N2/3 sleep compared to N1 and REM sleep, which is trivial but can be seen as a sanity check of the data. However, the amount of SOs and spindles reported for N1 and REM sleep is concerning, as per definition there should be hardly any (if SOs or spindles occur in N1 it becomes by definition N2, and the interval between spindles has to be considerably large in REM to still be scored as such). Thus, on the one hand, the report of these comparisons takes too much space in the main manuscript as it is trivial, but on the other hand, it raises concerns about the validity of the scoring.
We appreciate your concern regarding the reported presence of SOs and spindles in N1 and REM sleep and the potential implications. Our methods for detecting SOs, spindles, and their coupling were originally designed only for N2 and N3 sleep data, based on the characteristics of those stages, and are widely recognized and used in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because SO and spindle detection is based on percentiles, the method will always detect a certain number of events when applied to data from other stages (N1 and REM), and how these events differ from those detected in N2/N3 remains unclear. We will acknowledge the reasons for these results in the Methods section and emphasize that they are used only as sanity checks.
Methods, Page 25 Lines 515-524
“We note that the above methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).”
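The point about percentile-based detection can be made concrete with a toy simulation. This is a minimal sketch, not the detection pipeline used in the study: the 75th-percentile cutoff, the amplitude distributions, the 20 µV fixed threshold, and the function names are all hypothetical. It shows that a threshold defined relative to each stage's own amplitude distribution flags the same fraction of candidate waves regardless of how large those waves are, whereas a fixed absolute threshold does not.

```python
import random


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def count_relative(amplitudes, q=75.0):
    """Count waves exceeding the q-th percentile of their own stage."""
    threshold = percentile(amplitudes, q)
    return sum(1 for a in amplitudes if a > threshold)


def count_absolute(amplitudes, threshold_uv):
    """Count waves exceeding a fixed amplitude threshold in microvolts."""
    return sum(1 for a in amplitudes if a > threshold_uv)


rng = random.Random(0)
# Hypothetical candidate-wave amplitudes: large in N2/N3, small in REM.
n23 = [abs(rng.gauss(25.0, 8.0)) for _ in range(400)]
rem = [abs(rng.gauss(6.0, 2.0)) for _ in range(400)]

n23_relative = count_relative(n23)
rem_relative = count_relative(rem)
rem_absolute = count_absolute(rem, 20.0)
print(n23_relative, rem_relative, rem_absolute)
```

In this toy setting the relative criterion returns the same number of "events" for both stages, which is why event counts in N1 and REM are best read as a sanity check rather than as evidence of genuine SOs or spindles in those stages.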
(10) Why was electrode F3 used to quantify the occurrence of SOs and spindles? Why not a midline frontal electrode like Fz (or a number of frontal electrodes for SOs) and Cz (or a number of centroparietal electrodes) for spindles to be closer to their maximum topography?
We appreciate your suggestion regarding electrode selection for SO and spindle quantification. Our choice of F3 was primarily based on previous studies (Massimini et al., 2004; Molle et al., 2011), where bilateral frontal electrodes are commonly used for detecting SOs and spindles. Additionally, we considered the impact of MRI-related noise and, after a comprehensive evaluation, determined that F3 provided an optimal balance between signal quality and artifact minimization. We also acknowledge that alternative electrode choices, such as Fz for SOs and Cz for spindles, could provide additional insights into their topographical distributions.
(11) Functional connectivity (hippocampus -> thalamus -> cortex (mPFC)) is reported to be increased during SO-spindle coupling and interpreted as evidence for coordination of hippocampo-neocortical communication likely by thalamic spindles. However, functional connectivity was only analysed during coupled SO+spindle events, not during isolated SOs or isolated spindles. Without the direct comparison of the connectivity patterns between these three events, it remains unclear whether this is specific for coupled SO+spindle events or rather associated with one or both of the other isolated events. The PPIs need to be conducted for those isolated events as well and compared statistically to the coupled events.
We appreciate your critical perspective on our functional connectivity analysis and the interpretation of hippocampus-thalamus-cortex (mPFC) interactions during SO-spindle coupling. We acknowledge that, in the original analysis, functional connectivity was only examined during coupled SO-spindle events, without direct comparison to isolated SOs or isolated spindles. To address this concern, we have conducted PPI analyses for all three ROIs (hippocampus, thalamus, mPFC) and all three event types (SO-spindle couplings, isolated SOs, and isolated spindles). Our results indicate that neither isolated SOs nor isolated spindles yielded significant connectivity changes in any of the three ROIs, as none survived multiple comparison correction. This suggests that the observed connectivity increase is specific to SO-spindle coupling, rather than being independently driven by either SOs or spindles alone.
Results, Page 14 Lines 248-255
“Crucially, the interaction between FC and SO-spindle coupling revealed that only the functional connectivity of hippocampus -> thalamus (ROI analysis, t<sub>(106)</sub> = 1.86, p = 0.0328) and thalamus -> mPFC (ROI analysis, t<sub>(106)</sub> = 1.98, p = 0.0251) significantly increased during SO-spindle coupling, with no significant changes in all other pathways (Fig. 4e). We also conducted PPI analyses for the other two events (SOs and spindles), and neither yielded significant connectivity changes in the three ROIs, as all failed to survive whole-brain FWE correction at the cluster level (p < 0.05). Together, these findings suggest that the thalamus, likely via spindles, coordinates hippocampal-cortical communication selectively during SO-spindle coupling, but not isolated SOs or spindle events alone.”
(12) The limited temporal resolution of fMRI does indeed not allow for easily distinguishing between fMRI activation patterns related to SO-up- vs. SO-down-states. For this, one could try to extract the amplitudes of SO-up- and SO-down-states separately for each SO event and model them as two separate parametric modulators (with the risk of collinearity as they are likely correlated).
We appreciate your insightful comment regarding the challenge of distinguishing fMRI activation patterns related to SO-up vs. SO-down states due to the limited temporal resolution of fMRI. While our current analysis does not differentiate between these two phases, we acknowledge that separately modeling SO-up and SO-down states using parametric modulators could provide a more refined understanding of their distinct neural correlates. However, as you note, this approach carries the risk of collinearity, and there is indeed a high correlation between the two amplitudes across all subjects in our results (r = 0.98). Future studies could address this by leveraging high-temporal-resolution techniques. While implementing this in the current study is beyond our scope, we will acknowledge this limitation in the Discussion section.
Discussion, Page 17 Lines 308-322
“An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.”
Discussion, Page 18 Lines 333-341
“Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”
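To put the collinearity concern from comment (12) in numbers, the following sketch computes the variance inflation factor (VIF) for two regressors with correlation r, using the standard two-predictor formula VIF = 1 / (1 - r^2). It assumes, for illustration only, that the reported across-subject correlation of r = 0.98 between SO peak and trough amplitudes is representative of the correlation between the two parametric modulators in a first-level model. The function name is hypothetical.

```python
def variance_inflation_factor(r):
    """VIF for one of two regressors whose Pearson correlation is r."""
    return 1.0 / (1.0 - r * r)


r = 0.98  # correlation between SO peak and trough amplitudes (from the response)
vif = variance_inflation_factor(r)

# The standard error of each coefficient grows with the square root of the VIF.
se_inflation = vif ** 0.5
print(round(vif, 1), round(se_inflation, 2))
```

Under that assumption the variance of each modulator's estimate would be inflated roughly 25-fold (standard errors about 5-fold), which is consistent with the decision not to model UP- and DOWN-state amplitudes separately.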
(13) L327: "It is likely that our findings of diminished DMN activity reflect brain activity during the SO DOWN-state, as this state consistently shows higher amplitude compared to the UP-state within subjects, which is why we modelled the SO trough as its onset in the fMRI analysis." This conclusion is not justified as the fact that SO down-states are larger in amplitude does not mean their impact on the BOLD response is larger.
We appreciate your concern regarding our interpretation of diminished DMN activity reflecting the SO down-state. We acknowledge that the original wording was somewhat misleading. Our interpretation is that the effect could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained if modelled at the SO peak instead). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from the DOWN-state. We will make this clear in the Discussion section.
Discussion, Page 17 Lines 308-322
“An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.”
(14) Line 77: "In the current study, while directly capturing hippocampal ripples with scalp EEG or fMRI is difficult, we expect to observe hippocampal activation in fMRI whenever SOs-spindles coupling is detected by EEG, if SOs- spindles-ripples triple coupling occurs during human NREM sleep". Not all SO-spindle events are associated with ripples (Staresina et al., 2015), but hippocampal activation may also be expected based on the occurrence of spindles alone (Bergmann et al., 2012).
We appreciate your clarification regarding the relationship between SO-spindle coupling and hippocampal ripples. We acknowledge that not all SO-spindle events are necessarily accompanied by ripples (Staresina et al., 2015). However, previous research indicates that hippocampal ripples are significantly more likely to occur during SO-spindle coupling events. This suggests that while ripple occurrence is not guaranteed, SO-spindle coupling creates a favorable network state for ripple generation and potential hippocampal activation. To ensure accuracy, we will revise the manuscript to delete this misleading sentence from the Introduction and acknowledge in the Discussion that our results cannot directly demonstrate the triple coupling of SOs, spindles, and hippocampal ripples.
Discussion, Page 18 Lines 333-341
“Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”
Reviewer #2 (Public review):
In this study, Wang and colleagues aimed to explore brain-wide activation patterns associated with NREM sleep oscillations, including slow oscillations (SOs), spindles, and SO-spindle coupling events. Their findings reveal that SO-spindle events corresponded with increased activation in both the thalamus and hippocampus. Additionally, they observed that SO-spindle coupling was linked to heightened functional connectivity from the hippocampus to the thalamus, and from the thalamus to the medial prefrontal cortex, three key regions involved in memory consolidation and episodic memory processes.
This study's findings are timely and highly relevant to the field. The authors' extensive data collection, involving 107 participants sleeping in an fMRI while undergoing simultaneous EEG recording, deserves special recognition. If shared, this unique dataset could lead to further valuable insights. While the conclusions of the data seem overall well supported by the data, some aspects with regard to the detection of sleep oscillations need clarification.
The authors report that coupled SO-spindle events were most frequent during NREM sleep (2.46 ± 0.06 events/min), but they also observed a surprisingly high occurrence of these events during N1 and REM sleep (2.23 ± 0.09 and 2.32 ± 0.09 events/min, respectively), where SO-spindle coupling would not typically be expected. Combined with the relatively modest SO amplitudes reported (~25 µV, whereas >75 µV would be expected when using mastoids as reference electrodes), this raises the possibility that the parameters used for event detection may not have been conservative enough - or that sleep staging was inaccurately performed. This issue could present a significant challenge, as the fMRI findings are largely dependent on the reliability of these detected events.
Thank you very much for your thorough and encouraging review. We appreciate your recognition of the significance and relevance of our study and dataset, particularly in highlighting how simultaneous EEG-fMRI recordings can provide complementary insights into the temporal dynamics of neural oscillations and their associated spatial activation patterns during sleep. In the sections that follow, we address each of your comments in detail. We have revised the text and conducted additional analyses wherever possible to strengthen our argument and clarify our methodological choices. We believe these revisions improve the clarity and rigor of our work, and we thank you for helping us refine it.
We appreciate your insightful comments regarding the detection of sleep oscillations. Our methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM. We will acknowledge the reasons for these results in the Methods section and emphasize that they are used only for sanity checks.
Regarding the reported SO amplitudes (~25 µV): during preprocessing, we applied the Signal Space Projection (SSP) method to more effectively remove MRI gradient artifacts and cardiac pulse noise. While this approach enhances data quality, it also reduces overall signal power, leading to systematically lower reported amplitudes. Despite this, our SO detections in NREM sleep (especially N2/N3) remain physiologically meaningful and are consistent with previous fMRI studies using similar artifact removal techniques. We appreciate your careful evaluation and valuable suggestions.
In addition, we will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics (Table S1), as well as detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: (1) duration of each sleep stage; (2) number of detected SOs; (3) number of detected spindles; (4) number of detected SO-spindle coupling events; (5) density of detected SOs; (6) density of detected spindles; (7) density of detected SO-spindle coupling events.
Methods, Page 25 Lines 515-524
“We note that the above methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).”
Supplementary Materials, Page 42-54, Table S1-S4
(Considering the length, we do not list all the tables here. Please refer to the revised manuscript.)
Reviewer #3 (Public review):
Summary:
Wang et al., examined the brain activity patterns during sleep, especially when locked to those canonical sleep rhythms such as SO, spindle, and their coupling. Analyzing data from a large sample, the authors found significant coupling between spindles and SOs, particularly during the upstate of the SO. Moreover, the authors examined the patterns of whole-brain activity locked to these sleep rhythms. To understand the functional significance of these brain activities, the authors further conducted open-ended cognitive state decoding and found a variety of cognitive processing may be involved during SO-spindle coupling and during other sleep events. The authors next investigated the functional connectivity analyses and found enhanced connectivity between the hippocampus, the thalamus, and the medial PFC. These results reinforced the theoretical model of sleep-dependent memory consolidation, such that SO-spindle coupling is conducive to systems-level memory reactivation and consolidation.
Strengths:
There are obvious strengths in this work, including the large sample size, state-of-the-art neuroimaging and neural oscillation analyses, and the richness of results.
Weaknesses:
Despite these strengths and the insights gained, there are weaknesses in the design, the analyses, and inferences.
Thank you for your detailed and thoughtful review of our manuscript. We are delighted that you recognize our advanced analysis methods, the richness of our neuroimaging and neural oscillation results, and the large sample size. In the following sections, we provide detailed responses to each of your comments. We have also revised the text and conducted additional analyses to strengthen our arguments and clarify our methodological choices. We believe these revisions enhance the clarity and rigor of our work, and we sincerely appreciate your thoughtful feedback in helping us refine the manuscript.
(1) A repeating statement in the manuscript is that brain activity could indicate memory reactivation and thus consolidation. This is indeed a highly relevant question that could be informed by the current data/results. However, an inherent weakness of the design is that there is no memory task before and after sleep. Thus, it is difficult (if not impossible) to make a strong argument linking SO/spindle/coupling-locked brain activity with memory reactivation or consolidation.
We appreciate your suggestion regarding the lack of a pre- and post-sleep memory task in our study design. We acknowledge that, in the absence of behavioral measures, it is hard to directly link SO-spindle coupling to memory consolidation in an outcome-driven manner. Our interpretation is instead based on the well-established role of these oscillations in memory processes, as demonstrated in previous studies. We sincerely appreciate this feedback and will adjust our Discussion accordingly to reflect a more precise interpretation of our findings.
Discussion, Page 18 Lines 333-341
“Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”
(2) Relatedly, to understand the functional implications of the sleep rhythm-locked brain activity, the authors employed the "open-ended cognitive state decoding" method. While this method is interesting, it is rather indirect given that there were no behavioral indices in the manuscript. Thus, discussions based on these analyses are speculative at best. Please either tone down the language or find additional evidence to support these claims.
Moreover, the results from this method are difficult to understand. Figure 3e showed that for all three types of sleep events (SO, spindle, SO-spindle), the same mental states (e.g., working memory, episodic memory, declarative memory) showed opposite directions of activation (left and right panels showed negative and positive activation, respectively). How to interpret these conflicting results? This ambiguity is also reflected by the term used: declarative memory and episodic memories are both indexed in the results. Yet these two processes can be largely overlapped. So which specific memory processes do these brain activity patterns reflect? The Discussion shall discuss these results and the limitations of this method.
We appreciate your critical assessment of the open-ended cognitive state decoding method and its interpretational challenges. Given the concerns about the indirectness of this approach, we decided to remove its related content and results from Figure 3 in the main text and include it in Supplementary Figure 7.
Due to the complexity of memory-related processes, we acknowledge that distinguishing between episodic and declarative memory based solely on this approach is not straightforward. We will revise the Supplementary Materials to explicitly discuss these limitations and clarify that our findings do not isolate specific cognitive processes but rather suggest general associations with memory-related networks.
Discussion, Page 17-18 Lines 323-332
“To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”
(3) The coupling strength is somehow inconsistent with prior results (Hahn et al., 2020, eLife, Helfrich et al., 2018, Neuron). Specifically, Helfrich et al. showed that among young adults, the spindle is coupled to the peak of the SO. Here, the authors reported that the spindles were coupled to down-to-up transitions of SO and before the SO peak. It is possible that participants' age may influence the coupling (see Helfrich et al., 2018). Please discuss the findings in the context of previous research on SO-spindle coupling.
We appreciate your concern regarding the temporal characteristics of SO-spindle coupling. We acknowledge that the SO-spindle coupling phase results in our study are not identical to those reported by Hahn et al. (2020); Helfrich et al. (2018). However, these differences may arise due to slight variations in event detection parameters, which can influence the precise phase estimation of coupling. Notably, Hahn et al. (2020) also reported slight discrepancies in their group-level coupling phase results, highlighting that methodological differences can contribute to variability across studies. Furthermore, our findings are consistent with those of Schreiner et al. (2021), further supporting the robustness of our observations.
That said, we acknowledge that our original description of SO-spindle coupling as occurring at the "transition from the lower state to the upper state" was not entirely precise. The -π/2 phase represents the true transition point, while our observed coupling phase is actually closer to the SO peak rather than strictly at the transition. We will revise this statement in the manuscript to ensure clarity and accuracy in describing the coupling phase.
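The distinction between the transition phase and the peak phase can be made concrete with a short sketch. It assumes a cosine phase convention (0 rad at the SO peak, -π/2 at the DOWN-to-UP transition), which matches the description above; the per-participant phases are made-up values, not data from the study, and the circular mean is one common way to summarise a preferred phase rather than necessarily the statistic the authors used.

```python
import cmath
import math

SO_PEAK = 0.0               # UP-state peak under a cosine phase convention
DOWN_TO_UP = -math.pi / 2   # DOWN-to-UP transition (rising zero crossing)

def circular_mean(phases):
    """Return (mean direction in radians, resultant length in [0, 1])."""
    if not phases:
        raise ValueError("phases must not be empty")
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return cmath.phase(resultant), abs(resultant)

def circular_distance(a, b):
    """Smallest absolute angular difference between two phases (radians)."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))

# Hypothetical preferred coupling phases (radians), one per participant.
phases = [-0.6, -0.5, -0.4, -0.3, -0.2]
mean_phase, strength = circular_mean(phases)

closer_to_peak = (
    circular_distance(mean_phase, SO_PEAK)
    < circular_distance(mean_phase, DOWN_TO_UP)
)
print(round(mean_phase, 3), round(strength, 3), closer_to_peak)
```

A mean phase that is slightly negative lies before the SO peak but is still nearer to the peak than to the transition at -π/2.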
Discussion, Page 16 Lines 283-291
“Our data provide insights into the neurobiological underpinnings of these sleep rhythms. SOs, originating mainly in neocortical areas such as the mPFC, alternate between DOWN- and UP-states. The thalamus generates sleep spindles, which in turn couple with SOs. Our finding that spindle peaks consistently occurred slightly before the UP-state peak of SOs (in 83 out of 107 participants), concurs with prior studies, including Schreiner et al. (2021). Yet it differs from some results suggesting spindles might peak right at the SO UP-state (Hahn et al., 2020; Helfrich et al., 2018). Such discrepancies could arise from differences in detection algorithms, participant age (Helfrich et al., 2018), or subtle variations in cortical-thalamic timing. Nonetheless, these results underscore the importance of coordinated SO-spindle interplay in supporting sleep-dependent processes.”
(4) The discussion is rather superficial with only two pages, without delving into many important arguments regarding the possible functional significance of these results. For example, the author wrote, "This internal processing contrasts with the brain patterns associated with external tasks, such as working memory." Without any references to working memory, and without delineating why WM is considered as an external task even working memory operations can be internal. Similarly, for the interesting results on SO and reduced DMN activity, the authors wrote "The DMN is typically active during wakeful rest and is associated with self-referential processes like mind-wandering, daydreaming, and task representation (Yeshurun, Nguyen, & Hasson, 2021). Its reduced activity during SOs may signal a shift towards endogenous processes such as memory consolidation." This argument is flawed. DMN is active during self-referential processing and mind-wandering, i.e., when the brain shifts from external stimuli processing to internal mental processing. During sleep, endogenous memory reactivation and consolidation are also part of the internal mental processing given the lack of external environmental stimulation. So why during SO or during memory consolidation, the DMN activity would be reduced? Were there differences in DMN activity between SO and SO-spindle coupling events?
We appreciate your concerns regarding the brevity of the discussion and the need for clearer theoretical arguments. We will expand this section to provide more in-depth interpretations of our findings in the context of prior literature. Regarding working memory (WM), we acknowledge that our phrasing was ambiguous. We will modify this statement in the Discussion section.
For the SO-related reduction in DMN activity, we recognize the need for a more precise explanation. This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state.
To address your final question, we have conducted an additional post hoc comparison of DMN activity between isolated SOs and SO-spindle coupling events. Our results indicate that DMN activation during SOs was significantly lower than during SO-spindle coupling (t<sub>(106)</sub> = -4.17, p < 1e-4). This suggests that SO-spindle coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. We appreciate your constructive feedback and will integrate these expanded analyses and discussions into our revised manuscript.
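For readers less familiar with this kind of comparison, the sketch below shows how a within-subject t statistic with 106 degrees of freedom arises from 107 participants. It assumes a paired design (one DMN estimate per participant per condition), which is consistent with the reported degrees of freedom but is not stated explicitly here; the simulated values are hypothetical and do not reproduce the reported statistic.

```python
import math
import random
import statistics

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two matched samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

rng = random.Random(1)
# Hypothetical per-participant DMN estimates (arbitrary units), n = 107.
dmn_so = [rng.gauss(-0.5, 1.0) for _ in range(107)]
dmn_coupling = [value + rng.gauss(0.4, 0.5) for value in dmn_so]

t_value, dof = paired_t(dmn_so, dmn_coupling)
print(round(t_value, 2), dof)
```

A negative t value here means the first condition (SO) has lower estimates than the second (coupling); obtaining a p-value additionally requires the t distribution, which the standard library does not provide.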
Results, Page 11 Lines 199-208
“Spindles were correlated with positive activation in the thalamus (ROI analysis, t<sub>(106)</sub> = 15.39, p < 1e-4), the anterior cingulate cortex (ACC), and the putamen, alongside deactivation in the DMN (Fig. 3c). Notably, SO-spindle coupling was linked to significant activation in both the thalamus (ROI analysis, t<sub>(106)</sub> = 3.38, p = 0.0005) and the hippocampus (ROI analysis, t<sub>(106)</sub> = 2.50, p = 0.0070, Fig. 3d). However, no decrease in DMN activity was found during SO-spindle coupling, and DMN activity during SO was significantly lower than during coupling (ROI analysis, t<sub>(106)</sub> = -4.17, p < 1e-4). For more detailed activation patterns, see Table S5-S7. We also varied the threshold used to detect SO events to assess its effect on hippocampal activation during SO-spindle coupling and observed that hippocampal activation remained significant when the percentile thresholds for SO detection ranged between 71% and 80% (see Fig. S6).”
Discussion, Page 17-18 Lines 308-332
“An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.
To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”
Reviewing Editor Comment:
The reviewers think that you are working on a relevant and important topic. They are praising the large sample size used in the study. The reviewers are not all in line regarding the overall significance of the findings, but they all agree the paper would strongly benefit from some extra work, as all reviewers raise various critical points that need serious consideration.
We appreciate your recognition of the relevance and importance of our study, as well as your acknowledgment of the large sample size as a strength of our work. We understand that there are differing perspectives regarding the overall significance of our findings, and we value the constructive critiques provided. We are committed to addressing the key concerns raised by all reviewers, including refining our analyses, clarifying our interpretations, and incorporating additional discussions to strengthen the manuscript. Below, we address your specific recommendations and provide responses to each point you raised to ensure our methods and results are as transparent and comprehensible as possible. We believe that these revisions will significantly enhance the rigor and impact of our study, and we sincerely appreciate your thoughtful feedback in helping us improve our work.
Reviewer #1 (Recommendations for the authors):
(1) The phrase "overnight sleep" suggests an entire night, while these were rather "nocturnal naps". Please rephrase.
Thank you for pointing this out. We have revised the phrasing in our manuscript to "nocturnal naps" instead of "overnight sleep" to more accurately reflect the duration of the sleep recordings.
(2) Sleep staging results (macroscopic sleep architecture) should be provided in more detail (at least min and % of the different sleep stages, sleep onset latency, total sleep duration, total recording duration), at least mean/SD/range.
Thank you for this suggestion. We will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics. This information will help provide a clearer overview of the macroscopic sleep architecture in our dataset.
Supplementary Materials, Page 42, Table S1
Author response table 1.
Descriptive results of demographic information and sleep characteristics. Note: The total recorded time is equal to the awake time plus the total sleep time. The sleep onset latency is the time taken to reach the first sleep epoch. The Sleep Efficiency is the ratio of actual sleep time to total recording time.
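The two definitions in the note above can be written out as a small worked example. The function name and the input durations are hypothetical and are not taken from the table.

```python
def sleep_summary(awake_min, sleep_min):
    """Return (total recorded time in minutes, sleep efficiency in [0, 1]).

    Follows the definitions in the table note: total recorded time is
    awake time plus total sleep time, and sleep efficiency is total
    sleep time divided by total recorded time.
    """
    if awake_min < 0 or sleep_min < 0:
        raise ValueError("durations must be non-negative")
    total_min = awake_min + sleep_min
    if total_min == 0:
        raise ValueError("total recorded time must be positive")
    return total_min, sleep_min / total_min

# Hypothetical recording: 30 min awake, 90 min asleep.
total_min, efficiency = sleep_summary(awake_min=30.0, sleep_min=90.0)
print(total_min, efficiency)
```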
Reviewer #2 (Recommendations for the authors):
In order to allow for a better estimation of the reliability of the detected sleep events, please:
(1) Provide densities and absolute numbers of all detected SOs and spindles (N1, NREM, and REM sleep).
Thank you for pointing this out. We will provide comprehensive tables in the supplementary materials containing detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: 1) duration of each sleep stage; 2) number of detected SOs; 3) number of detected spindles; 4) number of detected SO-spindle coupling events; 5) density of detected SOs; 6) density of detected spindles; 7) density of detected SO-spindle coupling events.
Supplementary Materials, Page 43-54, Table S2-S4
(Considering the length, we do not list all the tables here. Please refer to the revised manuscript.)
(2) Show ERPs for all detected SOs and spindles (per sleep stage).
Thank you for the suggestion. We will provide ERPs for all detected SOs and spindles, separated by sleep stage (N1, N2&N3, and REM) in supplementary Fig. S2-S4. These ERP waveforms will help illustrate the characteristic temporal profiles of SOs and spindles across different sleep stages.
Methods, Page 25, Line 525-532
“Event-related potentials (ERP) analysis. After completing the detection of each sleep rhythm event, we performed ERP analyses for SOs, spindles, and coupling events in different sleep stages. Specifically, for SO events, we took the trough of the DOWN-state of each SO as the zero-time point, then extracted data in a [-2 s to 2 s] window from the broadband (0.1–30 Hz) EEG and used [-2 s to -0.5 s] for baseline correction; the results were then averaged across 107 subjects (see Fig. S2a). For spindle events, we used the peak of each spindle as the zero-time point and applied the same data extraction window and baseline correction before averaging across 107 subjects (see Fig. S2b). Finally, for SO-spindle coupling events, we followed the same procedure used for SO events (see Fig. 2a, Figs. S3–S4).”
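The epoching and baseline-correction steps described in this passage can be sketched for a single subject and channel as follows. This is only an illustration under stated assumptions: the function name, the 100 Hz sampling rate, and the toy signal are hypothetical, plain Python lists are used to keep the example self-contained, and the averaging across 107 subjects mentioned above is not shown.

```python
def erp(signal, event_samples, sfreq, window=(-2.0, 2.0), baseline=(-2.0, -0.5)):
    """Average baseline-corrected epochs around events.

    signal: list of EEG samples; event_samples: sample indices used as
    time zero (e.g., SO troughs or spindle peaks); sfreq: sampling rate (Hz).
    """
    pre = int(round(-window[0] * sfreq))    # samples before time zero
    post = int(round(window[1] * sfreq))    # samples after time zero
    b0 = int(round((baseline[0] - window[0]) * sfreq))
    b1 = int(round((baseline[1] - window[0]) * sfreq))
    epochs = []
    for ev in event_samples:
        if ev - pre < 0 or ev + post >= len(signal):
            continue  # skip events too close to the recording edges
        epoch = signal[ev - pre: ev + post + 1]
        base = sum(epoch[b0:b1]) / (b1 - b0)   # mean of the baseline window
        epochs.append([v - base for v in epoch])
    if not epochs:
        raise ValueError("no complete epochs available")
    return [sum(column) / len(epochs) for column in zip(*epochs)]

# Toy signal: constant 5 µV offset with a 30 µV dip at each event.
sfreq = 100
signal = [5.0] * 3000
events = [50, 1000, 2000]          # the first event is too close to the edge
for ev in events:
    signal[ev] -= 30.0

erp_wave = erp(signal, events, sfreq)
print(len(erp_wave), erp_wave[200])
```

With a [-2 s, 2 s] window at 100 Hz, each epoch has 401 samples and time zero sits at index 200; the constant offset is removed by the baseline correction.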
Supplementary Materials, Page 36-38, Fig. S2-S4
Author response image 1.
ERPs of SOs and spindles coupling during different sleep stages across all 107 subjects. a. ERP of SOs in different sleep stages using the broadband (0.1–30 Hz) EEG data. We align the trough of the DOWN-state of each SO at time zero (see Methods for details). The orange line represents the SO ERP in the N1 stage, the black line represents the SO ERP in the N2&N3 stage, and the green line represents the SO ERP in the REM stage. b. ERP of spindles in different sleep stages using the broadband (0.1–30 Hz) EEG data. We align the peak of each spindle at time zero (see Methods for details). The color scheme is the same as in panel a.
Author response image 2.
ERP and time-frequency patterns of SO-spindle coupling in the N1 stage. The averaged temporal frequency pattern and ERP across all instances of SO-spindle coupling, computed over all subjects, following the same procedure as in Fig. 2a, but for N1 stage.
Author response image 3.
ERP and time-frequency patterns of SO-spindle coupling in the REM stage. The averaged temporal frequency pattern and ERP across all instances of SO-spindle coupling, computed over all subjects, again following the same procedure as in Fig. 2a, but for REM stage.
(3) Provide detailed info concerning sleep characteristics (time spent in each sleep stage etc.).
Thank you for this suggestion. As in our response above, we will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics.
Supplementary Materials, Page 42, Table S1 (same as above)
(4) What would happen if more stringent parameters were used for event detection? Would the authors still observe a significant number of SO spindles during N1 and REM? Would this affect the fMRI-related results?
Thank you for this suggestion. Our methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).
Furthermore, in order to explore the impact of this on our fMRI results, we conducted an additional sensitivity analysis by applying different detection parameters for SOs. Specifically, we adjusted amplitude percentile thresholds for SO detection (the parameter that has the greatest impact on the results). We used the hippocampal activation value during N2&N3 stage SO-spindle coupling as an anchor value and found that when the parameters gradually became stricter, the results were similar to or even better than the current results. However, when we continued to increase the threshold, the results began to gradually decrease until the threshold was increased to 80%, and the results were no longer significant. This indicates that our results are robust within a specific range of parameters, but as the threshold increases, the number of trials decreases, ultimately weakening the statistical power of the fMRI analysis.
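The trade-off described above, where stricter thresholds leave fewer events for the fMRI model, can be illustrated with a simple threshold sweep. The amplitude distribution and the set of thresholds are hypothetical, and the sketch only counts events; it does not rebuild a GLM design matrix or estimate hippocampal activation.

```python
import random
import statistics

def count_events(amplitudes, q):
    """Number of candidate waves above the q-th percentile of amplitude."""
    cut_points = statistics.quantiles(amplitudes, n=100, method="inclusive")
    threshold = cut_points[q - 1]   # cut_points[0] is the 1st percentile
    return sum(1 for a in amplitudes if a > threshold)

rng = random.Random(42)
# Hypothetical trough-to-peak amplitudes (µV) of 2000 candidate waves.
amplitudes = [abs(rng.gauss(50.0, 15.0)) for _ in range(2000)]

counts = {q: count_events(amplitudes, q) for q in range(70, 91, 5)}
for q, n_events in counts.items():
    print(f"{q}th percentile: {n_events} events")
```

Each five-point increase in the percentile removes a further 5% of the candidate waves, so the number of trials entering any downstream analysis shrinks steadily as the criterion becomes stricter.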
Thank you again for your suggestions on sleep rhythm event detection. We will add the results in Supplementary and revise our manuscript accordingly.
Results, Page 11, Line 199-208
“Spindles were correlated with positive activation in the thalamus (ROI analysis, t<sub>(106)</sub> = 15.39, p < 1e-4), the anterior cingulate cortex (ACC), and the putamen, alongside deactivation in the DMN (Fig. 3c). Notably, SO-spindle coupling was linked to significant activation in both the thalamus (ROI analysis, t<sub>(106)</sub> = 3.38, p = 0.0005) and the hippocampus (ROI analysis, t<sub>(106)</sub> = 2.50, p = 0.0070, Fig. 3d). However, no decrease in DMN activity was found during SO-spindle coupling, and DMN activity during SO was significantly lower than during coupling (ROI analysis, t<sub>(106)</sub> = -4.17, p < 1e-4). For more detailed activation patterns, see Table S5-S7. We also varied the threshold used to detect SO events to assess its effect on hippocampal activation during SO-spindle coupling and observed that hippocampal activation remained significant when the percentile thresholds for SO detection ranged between 71% and 80% (see Fig. S6).”
Supplementary Materials, Page 40, Fig. S6
Author response image 4.
Influence of the percentile threshold for SO detection on hippocampal activation (ROI) during SO-spindle coupling. We changed the percentile threshold for SO event detection in the EEG data analysis and then reconstructed the GLM design matrix based on the SO events detected at each threshold. The brain-wide activation pattern of SO-spindle couplings in the N2/3 stage was extracted using the same method as shown in Fig. 3. The gray horizontal line represents the significant range (71%–80%). * p < 0.05.
Finally, we sincerely thank all of you again for your thoughtful and constructive feedback. Your insights have been invaluable in refining our analyses, strengthening our interpretations, and improving the clarity and rigor of our manuscript. We appreciate the time and effort you have dedicated to reviewing our work, and we are grateful for the opportunity to enhance our study based on your recommendations.
References:
Bergmann, T. O., Mölle, M., Diedrichs, J., Born, J., & Siebner, H. R. (2012). Sleep spindle-related reactivation of category-specific cortical regions after learning face-scene associations. NeuroImage, 59(3), 2733-2742.
Buzsáki, G. (2015). Hippocampal sharp wave‐ripple: A cognitive biomarker for episodic memory and planning. Hippocampus, 25(10), 1073-1188.
Caporro, M., Haneef, Z., Yeh, H. J., Lenartowicz, A., Buttinelli, C., Parvizi, J., & Stern, J. M. (2012). Functional MRI of sleep spindles and K-complexes. Clinical Neurophysiology, 123(2), 303-309.
Coulon, P., Budde, T., & Pape, H.-C. (2012). The sleep relay—the role of the thalamus in central and decentral sleep regulation. Pflügers Archiv-European Journal of Physiology, 463, 53-71.
Crunelli, V., Lőrincz, M. L., Connelly, W. M., David, F., Hughes, S. W., Lambert, R. C., Leresche, N., & Errington, A. C. (2018). Dual function of thalamic low-vigilance state oscillations: rhythm-regulation and plasticity. Nature Reviews Neuroscience, 19(2), 107-118.
Czisch, M., Wehrle, R., Stiegler, A., Peters, H., Andrade, K., Holsboer, F., & Sämann, P. G. (2009). Acoustic oddball during NREM sleep: a combined EEG/fMRI study. PLoS ONE, 4(8), e6749.
Diba, K., & Buzsáki, G. (2007). Forward and reverse hippocampal place-cell sequences during ripples. Nature Neuroscience, 10(10), 1241.
Diekelmann, S., & Born, J. (2010). The memory function of sleep. Nature Reviews Neuroscience, 11(2), 114-126.
Fogel, S., Albouy, G., King, B. R., Lungu, O., Vien, C., Bore, A., Pinsard, B., Benali, H., Carrier, J., & Doyon, J. (2017). Reactivation or transformation? Motor memory consolidation associated with cerebral activation time-locked to sleep spindles. PLoS ONE, 12(4), e0174755.
Hahn, M. A., Heib, D., Schabus, M., Hoedlmoser, K., & Helfrich, R. F. (2020). Slow oscillation-spindle coupling predicts enhanced memory formation from childhood to adolescence. eLife, 9, e53730.
Halassa, M. M., Siegle, J. H., Ritt, J. T., Ting, J. T., Feng, G., & Moore, C. I. (2011). Selective optical drive of thalamic reticular nucleus generates thalamic bursts and cortical spindles. Nature Neuroscience, 14(9), 1118-1120.
Hale, J. R., White, T. P., Mayhew, S. D., Wilson, R. S., Rollings, D. T., Khalsa, S., Arvanitis, T. N., & Bagshaw, A. P. (2016). Altered thalamocortical and intra-thalamic functional connectivity during light sleep compared with wake. NeuroImage, 125, 657-667.
Helfrich, R. F., Lendner, J. D., Mander, B. A., Guillen, H., Paff, M., Mnatsakanyan, L., Vadera, S., Walker, M. P., Lin, J. J., & Knight, R. T. (2019). Bidirectional prefrontal-hippocampal dynamics organize information transfer during sleep in humans. Nature Communications, 10(1), 3572.
Helfrich, R. F., Mander, B. A., Jagust, W. J., Knight, R. T., & Walker, M. P. (2018). Old brains come uncoupled in sleep: slow wave-spindle synchrony, brain atrophy, and forgetting. Neuron, 97(1), 221-230. e224.
Horovitz, S. G., Fukunaga, M., de Zwart, J. A., van Gelderen, P., Fulton, S. C., Balkin, T. J., & Duyn, J. H. (2008). Low frequency BOLD fluctuations during resting wakefulness and light sleep: A simultaneous EEG‐fMRI study. Human Brain Mapping, 29(6), 671-682.
Huang, Q., Xiao, Z., Yu, Q., Luo, Y., Xu, J., Qu, Y., Dolan, R., Behrens, T., & Liu, Y. (2024). Replay-triggered brain-wide activation in humans. Nature Communications, 15(1), 7185.
Ilhan-Bayrakcı, M., Cabral-Calderin, Y., Bergmann, T. O., Tüscher, O., & Stroh, A. (2022). Individual slow wave events give rise to macroscopic fMRI signatures and drive the strength of the BOLD signal in human resting-state EEG-fMRI recordings. Cerebral Cortex, 32(21), 4782-4796.
Laufs, H. (2008). Endogenous brain oscillations and related networks detected by surface EEG‐combined fMRI. Human brain mapping, 29(7), 762-769.
Laufs, H., Walker, M. C., & Lund, T. E. (2007). ‘Brain activation and hypothalamic functional connectivity during human non-rapid eye movement sleep: an EEG/fMRI study’—its limitations and an alternative approach. Brain, 130(7), e75.
Margulies, D. S., Ghosh, S. S., Goulas, A., Falkiewicz, M., Huntenburg, J. M., Langs, G., Bezgin, G., Eickhoff, S. B., Castellanos, F. X., & Petrides, M. (2016). Situating the default-mode network along a principal gradient of macroscale cortical organization. Proceedings of the National Academy of Sciences, 113(44), 12574-12579.
Massimini, M., Huber, R., Ferrarelli, F., Hill, S., & Tononi, G. (2004). The sleep slow oscillation as a traveling wave. Journal of Neuroscience, 24(31), 6862-6870.
Moehlman, T. M., de Zwart, J. A., Chappel-Farley, M. G., Liu, X., McClain, I. B., Chang, C., Mandelkow, H., Özbay, P. S., Johnson, N. L., & Bieber, R. E. (2019). All-night functional magnetic resonance imaging sleep studies. Journal of neuroscience methods, 316, 83-98.
Molle, M., Bergmann, T. O., Marshall, L., & Born, J. (2011). Fast and slow spindles during the sleep slow oscillation: disparate coalescence and engagement in memory processing. Sleep, 34(10), 1411-1421.
Ngo, H.-V., Fell, J., & Staresina, B. (2020). Sleep spindles mediate hippocampal-neocortical coupling during long-duration ripples. Elife, 9, e57011.
Picchioni, D., Horovitz, S. G., Fukunaga, M., Carr, W. S., Meltzer, J. A., Balkin, T. J., Duyn, J. H., & Braun, A. R. (2011). Infraslow EEG oscillations organize large-scale cortical–subcortical interactions during sleep: a combined EEG/fMRI study. Brain research, 1374, 63-72.
Schabus, M., Dang-Vu, T. T., Albouy, G., Balteau, E., Boly, M., Carrier, J., Darsaud, A., Degueldre, C., Desseilles, M., & Gais, S. (2007). Hemodynamic cerebral correlates of sleep spindles during human non-rapid eye movement sleep. Proceedings of the National Academy of Sciences, 104(32), 13164-13169.
Schreiner, T., Kaufmann, E., Noachtar, S., Mehrkens, J.-H., & Staudigl, T. (2022). The human thalamus orchestrates neocortical oscillations during NREM sleep. Nature communications, 13(1), 5231.
Schreiner, T., Petzka, M., Staudigl, T., & Staresina, B. P. (2021). Endogenous memory reactivation during sleep in humans is clocked by slow oscillation-spindle complexes. Nature Communications, 12(1), 3112.
Singh, D., Norman, K. A., & Schapiro, A. C. (2022). A model of autonomous interactions between hippocampus and neocortex driving sleep-dependent memory consolidation. Proceedings of the National Academy of Sciences, 119(44), e2123432119.
Spoormaker, V. I., Schröter, M. S., Gleiser, P. M., Andrade, K. C., Dresler, M., Wehrle, R., Sämann, P. G., & Czisch, M. (2010). Development of a large-scale functional brain network during human non-rapid eye movement sleep. Journal of Neuroscience, 30(34), 11379-11387.
Staresina, B. P., Bergmann, T. O., Bonnefond, M., van der Meij, R., Jensen, O., Deuker, L., Elger, C. E., Axmacher, N., & Fell, J. (2015). Hierarchical nesting of slow oscillations, spindles and ripples in the human hippocampus during sleep. Nature Neuroscience, 18(11), 1679-1686.
Staresina, B. P., Niediek, J., Borger, V., Surges, R., & Mormann, F. (2023). How coupled slow oscillations, spindles and ripples coordinate neuronal processing and communication during human sleep. Nature Neuroscience, 1-9.
Yarkoni, T., Poldrack, R. A., Nichols, T. E., Van Essen, D. C., & Wager, T. D. (2011). Large-scale automated synthesis of human functional neuroimaging data. Nature methods, 8(8), 665-670.
Yeshurun, Y., Nguyen, M., & Hasson, U. (2021). The default mode network: where the idiosyncratic self meets the shared social world. Nature Reviews Neuroscience, 1-12.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer 1 (Public reviews):
Summary
Howard et al. performed deep mutational scanning on the MC4R gene, using a reporter assay to investigate two distinct downstream pathways across multiple experimental conditions. They validated their findings with ClinVar data and previous studies. Additionally, they provided insights into the application of DMS results for personalized drug therapy and differential ligand responses across variant types.
Strengths
They captured over 99% of variants with robust signals and investigated subtle functionalities, such as pathway-specific activities and interactions with different ligands, by refining both the experimental design and analytical methods.
Weaknesses
While the study generated informative results, it lacks a detailed explanation regarding the input library, replicate correlation, and sequencing depth for a given number of cells. Additionally, there are several questions that it would be helpful for authors to clarify.
(1) It would be helpful to clarify the information regarding the quality of the input library and experimental replicates. Are variants evenly represented in the library? Additionally, have the authors considered using long-read sequencing to confirm the presence of a single intended variant per construct? Finally, could the authors provide details on the correlation between experimental replicates under each condition?
Are variants evenly represented in the library?
We strive to achieve as evenly balanced a library as possible at every stage of the DMS process (e.g., from initial cloning in E. coli through integration into human cells). Below is a representative plot showing the number of barcodes per amino acid variant at each position in a given ~60 amino acid subregion of MC4R, which highlights how evenly variants are represented at the E. coli cloning stage.
Author response image 1.
We also make similar measurements after the library is integrated into HEK293T cell lines, and see similarly even coverage across all variants, as shown in the plot below:
Author response image 2.
Additionally, have the authors considered using long-read sequencing to confirm the presence of a single intended variant per construct?
We agree long-read sequencing would be an excellent way to confirm that our constructs contain a single intended variant. However, we opted for an alternative method (outlined in more detail in Jones et al. 2020) that leverages multiple layers of validation. First, the oligo chip-synthesized portions of the protein containing the variants are cloned into a sequence-verified plasmid backbone, which greatly decreases the chances of spuriously generating a mutation in a different portion of the protein. We then sequence both the oligo portion and random barcode using overlapping paired-end reads during barcode mapping to avoid sequencing errors and to help detect DNA synthesis errors. At this stage, we computationally reject any constructs that have more than one variant. Given this, the vast majority of remaining unintended variants would come from somatic mutations introduced by the E. coli cloning or replication process, which should be low frequency. We have used our in-house full plasmid sequencing method, OCTOPUS, to sample and spot check this for several other DMS libraries we have generated using the same cloning methods. We have found variants in the plasmid backbone in only ~1% of plasmids in these libraries. Our statistical model also helps correct for this by accounting for barcode-specific variation. Finally, we believe this provides further motivation for having multiple barcodes per variant, which dilutes the effect of any unintended additional variants.
Finally, could the authors provide details on the correlation between experimental replicates under each condition?
Certainly! In general, the Gs reporter had higher correlation between replicates than the Gq system (r ~ 0.5 vs r ~ 0.4). The plots below, which have been added as a panel to Supplementary Figure 1, show two representative replicate correlations of barcode read counts at the RNA-seq stage for the low α-MSH conditions.
We added the following text to reference this panel:
(see Methods > Sequence processing for barcode expression): “The correlation (r) of barcode readcounts between replicates was ~0.5 and ~0.4 for the Gs and Gq assays, respectively (Supplementary Fig. 1E).”
One important advantage of our statistical model is that it’s able to leverage information from barcodes regardless of the number of replicates they appear in.
(2) Since the functional readout of variants is conducted through RNA sequencing, it seems crucial to sequence a sufficient number of cells with adequate sequencing saturation. Could the authors clarify the coverage depth used for each RNA-seq experiment and how this depth was determined? Additionally, how many cells were sequenced in each experiment?
The text has been added in the manuscript as follows:
(in Methods > Running DMS Assays): “Given the seeding density (~17x10<sup>6</sup> cells per 150 mm replicate dish), time from seeding to collection, and doubling time of HEK293T cells, approximately 25.5x10<sup>6</sup> cells were collected per replicate. This translates to approximately 30-60x cellular coverage per amino acid variant in each replicate.”
(in Methods > Sequence processing for barcode expression): “Total mapped reads per replicate at the RNA-seq stage were as follows:
- Gs/CRE: 9.1-18.2 million mapped reads, median=12.3
- Gq/UAS: 8.6-24.1 million mapped reads, median=14.5
- Gs/CRE+Chaperone: 6.4-9.5 million mapped reads, median=7.5”
The median read counts per sample per barcode were 8, 10, and 6 reads for Gs/CRE, Gq/UAS, and Gs/CRE+Chaperone assays, respectively. The median number of barcodes per variant across all samples (the “median of medians”) were 56 for Gs/CRE, 28 for Gq/UAS, and 44 for Gs/CRE+Chaperone.”
(3) It appears that the frequencies of individual RNA-seq barcode variants were used as a proxy for MC4R activity. Would it be important to also normalize for heterogeneity in RNA-seq coverage across different cells in the experiment? Variability in cell representation (i.e., the distribution of variants across cells) could lead to misinterpretation of variant effects. For example, suppose barcode_a1 represents variant A and barcode_b1 represents variant B. If the RNA-seq results show 6 reads for barcode_a1 and 7 reads for barcode_b1, it might initially appear that both variants have similar effect sizes. However, if these reads correspond to 6 separate cells each containing 1 copy of barcode_a1, and only 1 cell containing 7 copies of barcode_b1, the interpretation changes significantly. Additionally, if certain variants occupy a larger proportion of the cell population, they are more likely to be overrepresented in RNA sequencing.
We account for this heterogeneity in several ways. First, as shown above (see Response to Reviewer 1, Question 1), we aim to have even representation of variants within our libraries. Second, we utilize compositional control conditions like forskolin or unstimulated conditions to obtain treatment-independent measurements of barcode abundance and, consequently, of mutant-vs-WT effects that are due to compositional rather than biological variability. We expect that variability observed under these controls is due to subtle effects of molecular cloning, gene expression, and stochasticity. Using these controls, we observe that mutant-vs-WT effects are generally close to zero in these normalization conditions (e.g., in untreated Gq, see Supplementary Figure 3) as compared to treated conditions. For example, premature stops behave similarly to WT in normalization conditions. This indicates that mutant abundance is relatively homogeneous. Where there are barcode-dependent effects on abundance, we can use information from these conditions to normalize that effect. Finally, our mixed-effect model accounts for barcode-specific deviations from the expected mutant effect (e.g., a “high count” barcode consistently being high relative to the mean).
(4) Although the assay system appears to effectively represent MC4R functionality at the molecular level, we are curious about the potential disparity between the DMS score system and physiological relevance. How do variants reported in gnomAD distribute within the DMS scoring system?
Figure 2D shows DMS scores (variant effect on Gs signaling) relative to human population frequency for all MC4R variants reported in gnomAD as of January 8, 2024.
(5) To measure Gq signaling, the authors used the GAL4-VPR relay system. Is there additional experimental data to support that this relay system accurately represents Gq signaling?
The full Gq reporter uses an NFAT response element from the IL-2 promoter to regulate the expression of the GAL4-VPR relay. In this system, the activation of Gq signaling results in the activation of the NFAT response element, and this signal is then amplified by the GAL4-VPR relay. The NFAT response element has been previously well-validated to respond to the activation of Gq signaling (e.g., Boss, Talpade, and Murphy 1996). We have added this reference to the text (see Results > Assays for disease-relevant mechanisms) to further support the use of the Gq assay.
(6) Identifying the variants responsive to the corrector was impressive. However, we are curious about how the authors confirmed that the restoration of MC4R activity was due to the correction of the MC4R protein itself. Is there a possibility that the observed effect could be influenced by other factors affected by the corrector? When the corrector was applied to the cells, were any expected or unexpected differential gene expression changes observed?
While we do not directly measure whether Ipsen-17 has effects on other signaling processes, previous work has shown that Ipsen-17 treatment does not indirectly alter signaling kinetics such as receptor internalization (Wang et al., 2014). Furthermore, our analysis methods inherently account for this by normalizing variant effects to WT signaling levels. Any observed rescue of a given variant inherently means that the variant is specifically more responsive to Ipsen-17 than WT, and the fact that different variants exhibit different levels of rescue is reassuring that the mechanism is on target to MC4R. Lastly, Ipsen-17 is known to be an antagonist of alpha-MSH activity and is thought to bind directly to the same site on MC4R (Wang et al., 2014).
We have revised text in the Methods section as follows (see Running DMS Assays) to better articulate this: “For chaperone experiments, cells were washed 3x with 10 mL DMEM to remove Ipsen 17 prior to agonist stimulation as it has been shown to be an antagonist of α-MSH activity and is thought to bind directly to the same site on MC4R (Wang et al. 2014).”
(7) As mentioned in the introduction, gain-of-function (GoF) variants are known to be protective against obesity. It would be interesting to see further studies on the observed GoF variants. Do the authors have any plans for additional research on these variants?
We agree this would be an excellent line of inquiry, but due to changes in company priorities we unfortunately do not have any plans for additional research on these variants.
Reviewer 2 (Public reviews):
Overview
In this manuscript, the authors use deep mutational scanning to assess the effect of ~6,600 protein-coding variants in MC4R, a G protein-coupled receptor associated with obesity. Reasoning that current deep mutational scanning approaches are insufficiently precise for some drug development applications, they focus on articulating new, more precise approaches. These approaches, which include a new statistical model and innovative reporter assay, enable them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor with high precision and statistical rigor.
They use the resulting data for a variety of purposes, including probing the relationship between MC4R's sequence and structure, analyzing the effect of clinically important variants, identifying variants that disrupt downstream MC4R signaling via one but not both pathways, identifying loss of function variants that are amenable to a corrector drug, and exploring how deep mutational scanning data could guide small molecule drug optimization.
Strengths
The analysis and statistical framework developed by the authors represent a significant advance. In particular, the study makes use of barcode-level internally replicated measurements to more accurately estimate measurement noise.
The framework allows variant effects to be compared across experimental conditions, a task that is currently hard to do with rigor. Thus, this framework will be applicable to a large number of existing and future deep mutational scanning experiments.
The authors refine their existing barcode transcription-based assay for GPCR signaling, and develop a clever new "relay" reporter system to boost signaling in a particular pathway. They show that these reporters can be used to measure both gain of function and loss of function effects, which many deep mutational scanning approaches cannot do.
The use of systematic approaches to integrate and then interrogate high-dimensional deep mutational scanning data is a big strength. For example, the authors applied PCA to the variant effect results from reporters for two different MC4R signaling pathways and were able to discover variants that biased signaling through one or the other pathway. This approach paves the way for analyses of higher dimensional deep mutational scans.
The authors use the deep mutational scanning data they collect to map how different variants affect the ability of small molecule agonists to activate MC4R signaling. This is an exciting idea, because developing small-molecule protein-targeting therapeutics is difficult, and this manuscript suggests a new way to map small-molecule-protein interactions.
Weaknesses
The authors derive insights into the relationship between MC4R signaling through different pathways and its structure. While these make sense based on what is already known, the manuscript would be stronger if some of these insights were validated using methods other than deep mutational scanning.
Likewise, the authors use their data to identify positions where variants disrupt MC4R activation by one small molecule agonist but not another. They hypothesize these effects point to positions that are more or less important for the binding of different small molecule agonists. The manuscript would be stronger if some of these insights were explored further.
Impact
In this manuscript, the authors present new methods, including a statistical framework for analyzing deep mutational scanning data that will have a broad impact. They also generate MC4R variant effect data that is of interest to the GPCR community.
Recommendations for the authors:
(1) Page 7 - the Gq reporter relay system is clever. Could the authors include the original data showing that the simpler design didn't work at all, or at least revise the text to say more precisely what "not suitable due to weak SNR" means?
We added a panel (D) to Supplementary Figure 2 showing that the native NFAT reporter was ~10x weaker than the CRE reporter, and the relay system amplified the NFAT signal to be comparable to the CRE reporter:
(2) Page 7 - Even though the relay system gives some signal, it's clearly less sensitive/higher background than Gs. How does that play out in the quantitative analysis?
—AND—
(4) Page 10 - The Gq library had fewer barcodes per variant, and, as noted above, the Gq reporter doesn't work quite as well as the Gs one. It would be nice if the authors could comment on how these aspects of the Gq experiments affected data quality/power to detect effects.
Following the reviewer's excellent suggestion, we updated Supplementary Figure 2B to better contextualize the quantitative effects of the difference in signal-to-noise ratio between the Gq and Gs reporter systems (see changes below). These distributions show the Z-statistic for testing either each stop mutation (red) or all possible coding variants against WT. Thus, |Z| > 1.96 corresponds to p < 0.05 in a two-sided Wald test. We can see that in the Gs reporter, 95% of the stops are nominally significantly different from WT (visualized above with the majority of the red distribution being < -1.96). By contrast, only 64% of stops are nominally significantly different from WT in Gq. This implies that it will be more difficult to detect effects in the Gq system, especially those less severe than stops.
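The relationship between the Z-statistic and the nominal significance threshold quoted above can be checked with a short calculation. The following is a minimal sketch, not the authors' analysis code: it assumes only that the Wald Z-statistic is compared against a standard normal distribution, and the function names are hypothetical.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a Z-statistic under a standard normal null.

    For Z ~ N(0, 1), P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_nominally_significant(z: float, alpha: float = 0.05) -> bool:
    """True if the two-sided p-value falls below alpha (no multiplicity correction)."""
    return two_sided_p_from_z(z) < alpha


if __name__ == "__main__":
    # Illustrative Z values only; these are not taken from the study.
    for z in (-1.0, -1.96, -3.0):
        print(f"Z = {z:+.2f} -> two-sided p = {two_sided_p_from_z(z):.4f}")
```

This is a nominal, per-variant test; it does not include the false discovery rate control that the manuscript applies elsewhere.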
In addition to the overall signal-to-noise ratio being lower in the Gq system, there were also fewer barcodes per variant (28 vs 56 barcodes per variant on average for Gq vs Gs). As demonstrated in Supplementary Figure 2C, the error bars on our estimates are related to the number of barcodes per variant (Standard Error ~ 1 / sqrt(Number of Barcodes), as shown in the plot below). This suggests that our estimates of mutant effects will be less certain in the Gq library than in the Gs library. For example, the average standard error in the Gq library was 0.260, which was ~1.58 times larger than the Gs library's 0.165. Finally, we believe this further underscores the power of our statistical framework, as it naturally enables formalized hypothesis testing that takes these errors into account when making comparisons both within reporters and across reporters.
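As a quick arithmetic check of the figures quoted above, the sketch below compares the standard-error ratio predicted from barcode counts alone (assuming the stated 1 / sqrt(N) scaling and nothing else) with the ratio of the reported average standard errors. The function name is hypothetical, and attributing any remaining gap to the lower signal-to-noise ratio of the Gq reporter is an interpretation rather than something this calculation demonstrates.

```python
import math


def expected_se_ratio(n_barcodes_a: int, n_barcodes_b: int) -> float:
    """Ratio SE_a / SE_b if standard error scales as 1 / sqrt(number of barcodes)."""
    return math.sqrt(n_barcodes_b / n_barcodes_a)


# Values quoted in the response above.
GQ_BARCODES, GS_BARCODES = 28, 56
GQ_SE, GS_SE = 0.260, 0.165

predicted = expected_se_ratio(GQ_BARCODES, GS_BARCODES)  # barcode counts alone
observed = GQ_SE / GS_SE                                 # reported average standard errors

if __name__ == "__main__":
    print(f"Predicted Gq/Gs SE ratio from barcode counts: {predicted:.2f}")
    print(f"Observed Gq/Gs SE ratio: {observed:.2f}")
```

Under these assumptions, halving the number of barcodes per variant predicts a ratio of sqrt(2), somewhat below the observed ratio, which is in line with the response's point that both barcode count and reporter signal-to-noise contribute to uncertainty.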
(3) Page 9 - it would be nice to see the analysis framework applied to a few existing datasets from other types of assays, to really judge its performance. That's not the main point of this paper, and it's fine, but it would be lovely!
We agree with the reviewer and hope others apply our framework to their problems to further refine its utility and applicability! To that end, we’ve open-sourced it under a permissive license to help encourage the community to use it. Part of the challenge in applying it to other existing datasets is that few DMS experiments leverage variant-level replication through barcodes. While we re-analyzed an older DMS dataset from Jones et al. 2020 to produce the distributions in Supplementary Figure 2B, a more thorough comparison is outside the scope of this paper. That said, we have two additional manuscripts in preparation that leverage this framework to analyze DMS data in different proteins and assay types.
(5) Page 10 - In discussing the relationship of the data to ClinVar and AM, the authors use qualitative comparisons like "majority" and "typically." Just giving numbers would better help the reader appreciate how the data compare.
We added specific proportions for these statements to the text for the ClinVar and AlphaMissense comparisons as follows:
(See Results > Comprehensive Deep Mutational Scanning of MC4R): “For example, the majority (63.3%, 31/49) of human MC4R variants classified as pathogenic or likely pathogenic in ClinVar (Landrum et al., 2014) lead to a significant reduction of Gs signaling under low α-MSH stimulation conditions (significance threshold: false discovery rate (FDR) < 1%; Fig. 2C). Variants that are significantly loss-of-function in this condition are rarer in the human population, and more common human variants have no significant effect on MC4R function (significance threshold: FDR < 1%; Fig. 2D). Loss-of-function variants by our DMS assay are also typically (e.g., AlphaMissense: 93.4%, 1894/2028) predicted to be deleterious by commonly used variant effect predictors like AlphaMissense (Cheng et al., 2023) and popEVE (Orenbuch et al., 2023) (Supplementary Fig. 5).”
(6) Pages 10-12, Figures 2C, E. The data look really nice, but the correlation with clinvar and the Huang data is not perfect (e.g. many pathogenic variants are classified as WT and partial LoF variants too). Can the authors comment on this discrepancy? For ClinVar, they should say when ClinVar was accessed and also how they filtered variants. I would recommend using variants with at least 1 star. Provided they did use high-quality clinical classifications, do they think the classifications are wrong, or their data? The same goes for Huang.
—AND—
(7) Page 13 - similar to previous comments, I'm curious about the 5 path/likely path ClinVar variants that are not LoF in the assay. Are they high noise/fewer barcodes? Or does the assay just miss some aspect of human biology?
ClinVar data was accessed on January 5, 2024 (see Methods: Comparison to human genetics data and variant effect predictors). No annotation quality filtering was performed, and we have revised the text as follows to clarify this:
(see Methods > Comparison to Human Genetics Data and Variant Effect Predictors): “Pathogenicity classifications of MC4R missense and nonsense variants were obtained from ClinVar (Landrum et al., 2014) on January 5, 2024, and all available annotations were included in the analysis regardless of ClinVar review status metric.”
A substantial proportion of the discrepancy between our data and ClinVar is, as the reviewer suggests, likely due to low quality ClinVar annotations. Of the five variants that the reviewer notes were reported as pathogenic/likely pathogenic but did not result in loss of protein function in any of our DMS assays, two (V50M and V166I) have been reclassified in ClinVar to uncertain or conflicting interpretation since we accessed annotations in early 2024. An additional two of the five discrepant variants (Q43K and S58C) currently have 0 star ratings to support their pathogenic/likely pathogenic annotation. The remaining discrepant variant (S94N) has a 1 star rating supporting an annotation of “likely pathogenic.”
The Huang et al. paper did an admirably thorough job of aggregating variant annotations from more than a dozen primary literature sources that each reported functional validation data for small panels of variants. However, one inherent limitation of this approach is that the resulting annotation classes are based on experiments that were carried out using inconsistent methods and/or scoring criteria. For example, classifications in the Huang et al. paper are based on an inconsistent mix of functional assay types (e.g., Gs signaling, Gq signaling, protein cell surface expression, etc.), and different variants were tested in different cell types (e.g., HEK293T, CHO, Cos-7, etc.). In principle, DMS assays should provide a more accurate assessment of the relative quantitative differences between alleles since each variant was tested using identical experimental conditions and analysis parameters.
That being said, while very good, our assays are likely missing or only indirectly reporting on at least some aspects of MC4R biology. For example, in addition to Gs and Gq signaling, MC4R interfaces with β-arrestin. Variants that are protective against obesity-related phenotypes have been shown to increase recruitment of β-arrestin to MC4R, and we did not directly assess this function.
(8) Page 15, Fig 3C - The three variants they highlight all have paradoxical changes in bias as α-MSH dose is increased (e.g. the bias inverts). I'm not a GPCR expert, but this seems interesting and a little weird. Perhaps the authors could comment on it?
We agree this is an interesting observation that deserves further study, but unfortunately is outside the scope of our priorities at the moment. As noted, all three highlighted variants in this region have a biased basal activity, and this bias inverts upon stimulation. While we don’t have a good explanation for why this would be the case, this phenomenon has been previously observed for 158R (Paisdzior et al., 2020). Our DMS data emphasizes how diverse biased effects can be and further highlights the importance of characterizing these effects. It would be interesting if further studies could elucidate the mechanistic basis for this behavior and how it may be related to G protein coupling in this region.
(9) Page 16 - I'm not familiar with the A21x1 formalism. For the general reader, maybe the authors could introduce this formalism.
Given the shared structural topology of GPCRs, others have developed a variety of numbering schemes that describe where a residue sits within that topology, allowing more direct comparisons between different GPCRs. We use the GPCRDB.org numbering scheme (e.g., F202<sup>5x48</sup>) as it takes experimentally determined structures into account. Roughly speaking, the number preceding the “x” corresponds to the transmembrane domain (one through seven) or region in which the residue is located. The number following the “x” gives the residue's position in that region relative to a structurally conserved residue that is always assigned 50. For example, F202<sup>5x48</sup> means that F202 is located in the 5th transmembrane helix and is 2 residues before the most conserved residue, M204<sup>5x50</sup>. We updated the text to clarify this accordingly:
(see Results > Structural Insights into Biased Signaling): “Upon ligand binding, W258 (W258<sup>6x48</sup> in https://gpcrdb.org/ nomenclature, where 6 corresponds to the 6th transmembrane helix and 48 denotes 258 is 2 residues before the most conserved residue in that helix (Isberg et al., 2015)) of the conserved CWxP motif undergoes a conformational rearrangement that is translated to L133<sup>3x36</sup> and I137<sup>3x40</sup>, of the conserved PIF motif (MIF in melanocortin receptors).”
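The numbering convention described above can be illustrated with a small parser. This is a simplified sketch under stated assumptions: it handles only labels of the plain form "<segment>x<two-digit index>" (such as 5x48 or 6x48), it is not the GPCRdb implementation, and the names `GpcrdbPosition` and `parse_gpcrdb` are hypothetical. Labels in any other format are rejected rather than guessed at.

```python
import re
from typing import NamedTuple


class GpcrdbPosition(NamedTuple):
    segment: int  # number before the "x": transmembrane helix (1-7) or other region
    index: int    # number after the "x"; 50 marks the reference residue
    offset: int   # index - 50; negative means before the reference residue


# Only the simple "<segment>x<two-digit index>" form is accepted in this sketch.
_LABEL = re.compile(r"^(\d+)x(\d{2})$")


def parse_gpcrdb(label: str) -> GpcrdbPosition:
    """Split a label such as '5x48' into segment, index, and offset from position 50."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"Unsupported label format: {label!r}")
    segment, index = int(match.group(1)), int(match.group(2))
    return GpcrdbPosition(segment, index, index - 50)


if __name__ == "__main__":
    # Labels taken from the response text above.
    for residue, label in (("F202", "5x48"), ("M204", "5x50"), ("W258", "6x48")):
        pos = parse_gpcrdb(label)
        print(f"{residue}: segment {pos.segment}, offset {pos.offset:+d} from x50")
```

For F202 (5x48) the offset is -2, matching the statement that it lies 2 residues before M204 (5x50) in the 5th transmembrane helix.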
(10) Page 17, Figure 3A - Since 137, 254, and 140 are not picked out on the structure, I have no idea where they are. If the authors want to show readers these residues, perhaps they could be annotated or a panel added. Since ~1 entire page of the manuscript is dedicated to this cascade, it might make sense to add a panel. Just amplifying the comment above as regards position 79, others were discussed in that paragraph but not highlighted.
We updated Supplementary Fig. 6C,D to label all of the listed residues on the protein structure for easy reference.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This manuscript describes an important study of the giant virus Jyvaskylavirus. The characterisation presented is solid, although, in the current form, it is not clear to what extent these findings change our perception of how giant viruses, especially those isolated from a cold environment, function. The work will be of interest to virologists working on giant viruses as well as those working with other members of the PRD1/Adenoviridae lineage.
Thank you for the review and positive comments. We have submitted a revised version of the manuscript with changes made in light of the comments from the editorial team and the reviewers. We hope that the manuscript is now in better shape and addresses all comments received. Major changes made were:
- We changed the author order considering reviewer 2 comments (point 11). Note that no author was added or removed; we just rearranged the order of authorship.
- We included a new supplementary table with the Jyvaskylavirus genome annotation. This is now supplementary table 2.
- We included a supplementary figure 9 to support our changes based on reviewer 2 comments (point 6).
- Figures 2,5,6,7 and the supplementary figure 2 were updated to accommodate our answers to different reviewer comments.
- Three new references were added to support some of our changes.
Below you will find our responses to each specific point raised by the reviewers.
Public Reviews:
Reviewer #1 (Public review):
This study presents Jyvaskylavirus, a new member of the Marseilleviridae family, infecting Acanthamoeba castellanii. The study provides a detailed and comprehensive genomic and structural analysis of Jyvaskylavirus. The authors identified ORF142 as the capsid penton protein and additional structural proteins that comprise the virion. Using a combination of imaging techniques the authors provide new insights into the giant virus architecture and lifecycle. The study could be improved by providing atomic coordinates and refinement statistics, comparisons with available giant virus structures could be expanded, and the novelty in terms of the first isolated example of a giant virus from Finland could be expounded upon.
The study contributes new structural and genomic diversity to the Marseilleviridae family, hinting at a broader distribution and ecological significance of giant viruses than previously thought.
Thank you for your constructive comments. We have addressed each point raised in our rebuttal letter and revised the manuscript accordingly. By following your specific comments, we improved the manuscript regarding atomic coordinates, refinement statistics and novelty of finding a Finnish marseillevirus. Details are provided in the specific answers to your points.
Reviewer #2 (Public review):
Summary:
This paper describes the molecular characterisation of a new isolate of the giant virus Jyvaskylavirus, a member of the Marseilleviridae family infecting Acanthamoeba castellanii. The isolate comes from a boreal environment in Finland, showcasing that giant viruses can thrive in this ecological niche. The authors came up with a non-trivial isolation procedure that can be applied to characterise other members of the family and will be beneficial for the virology field. The genome shows typical Marseilleviridae features and phylogenetically belongs to their clade B. The structural characterisation was performed on the level of isolated virion morphology by negative stain EM, virions associated with cells either during the attachment or release by helium microscopy, the visualisation of the virus assembly inside cells using stained thin sections, and lastly on the protein secondary structure level by reconstructing ~6 Å icosahedral map of the massive virion using cryoEM. The cryoEM density combined with gene product structure prediction enabled the identification and functional assessment of various virion proteins.
Strengths:
The detailed description of the virus isolation protocol is the largest strength of the paper and this reviewer believes it can be modified for isolating various viruses infecting small eukaryotes. The cryoEM map allows us to understand how exceptionally large virions of these viruses are stabilised by minor capsid proteins and nicely demonstrates the integration of medium-resolution cryoEM with protein structure prediction in deciphering virion protein function. The visualisation of ongoing virus assembly inside virus factories brings interesting hypotheses about the process, which, however, need to be verified in future studies.
Weaknesses:
The conclusions from helium microscopy images are overinterpreted, as the native membrane structure cannot be preserved in a fixed and dehydrated sample. In the image, there are many other parts of the curved membrane and a lot of virions; to me it seems the specific position of the highlighted virion could arise by random chance. The claim that the cells were imaged in the near-original state by this method should therefore be omitted. Also, no mass spectrometry data are presented that would supplement and confirm the identity of the virion proteins whose predicted models were fitted into the cryoEM density. For a general virology reader outside of the giant virus field, the results presented in the current state might not have enough influence and the section should be rewritten to better showcase the novelty of findings.
Thank you for your constructive comments. We thank reviewer #2 for highlighting these weaknesses, giving us the opportunity to improve our study. We have removed the claim that the cells were imaged in a near-original state. Additionally, we agree that the positions of the virions on the cell surface could result from a random distribution. However, the specific virion in panel 3C is situated halfway into a crevice, and it cannot be ruled out that this particular virion is in the process of being taken up by endocytosis. This is why we used the term "probably" when referring to this finding. Regarding the mass spectrometry data, we understand that MS data would provide an additional layer of evidence to validate which proteins are present in the virion, but they would not confirm the precise location or role of these proteins within the virion.
We have addressed each point raised in our rebuttal letter and revised the manuscript accordingly.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
I have only minor comments which should be relatively simple to address:
(1) Atomic coordinates should be deposited in the PDB, and refinement statistics for the models provided, for example by expanding Table S2.
We thank reviewer #1 for the suggestion. In the ‘Data availability’ statement of the original submission we stated that ‘Predicted Jyvaskylavirus PDB models using ModelAngelo and Alphafold have been deposited at BioStudies under the accession number S-BSST1654’. The atomic coordinates of all predicted models are therefore publicly available at https://www.ebi.ac.uk/biostudies/; for additional clarity we have also added the link to the ‘Data availability’ statement in the revised version.
Our reason for not depositing them in the Protein Data Bank in association with our EMD-51613 entry is that they remain predicted models rigid-body fitted into the Jyvaskylavirus density map of 6.3 Å resolution. However, we have added to our BioStudies deposition (S-BSST1654) the whole Jyvaskylavirus pentameric assembly model (including all identified and predicted major and minor capsid proteins) rigid-body fitted into the Jyvaskylavirus map, and it can be easily downloaded.
We did not perform the real-space ‘minimization_global’ refinement of the predicted models corresponding to the ORFs of Melbournevirus (or Jyvaskylavirus) into the corresponding available Melbournevirus densities (entries EMD-37188, 37189, 37190 at ~3.5 Å resolution, obtained by block-based reconstruction methods), as these maps were generated and deposited by other authors. Instead, we performed the rigid-body fit-into-map procedure of the individual predicted Jyvaskylavirus models into the previously deposited Melbournevirus maps using ChimeraX, demonstrating a fold-map alignment and assignment (see for example the individual stereo views in Supplementary Figure 6).
In the revised version, we now provide the refinement statistics for the complete Jyvaskylavirus pentameric assembly (inclusive of peripentonal major capsid and minor capsid proteins) rigid-body fitted as a whole into the Melbournevirus 5-block reconstruction map using PHENIX, resulting in a CC<sub>mask</sub> of 57.3% (this is also stated in Supplementary Figure 7). The same pentameric assembly model was then placed into our lower-resolution 6.3 Å Jyvaskylavirus 3D density map in ChimeraX and rigid-body refined as a whole in PHENIX, yielding a predictably lower CC<sub>mask</sub> of 33%. This pentameric assembly model has now also been included in the BioStudies entry.
The procedure for this rigid body fitting and refinement has been clarified and added to the 'Materials and Methods' section as follows:
“Then, the corresponding full 3D models were predicted using AlphaFold3 and fitted into the Melbournevirus and Jyvaskylavirus cryoEM density using the fit-into-map routine in ChimeraX together with the peripentonal capsomers (Meng et al 2023). To assess the metric of this fitting (Supplementary Figure 7), the 3.5 Å five-fold Melbournevirus block 3D density (EMDB-37190) was boxed around the pentameric assembly model and refined as a whole using rigid-body refinement in PHENIX, yielding a CC<sub>mask</sub> of 57.3%. The same pentameric model was subsequently fitted into the 6.3 Å Jyvaskylavirus 3D cryo-EM density (previously boxed around the model), resulting in a lower CC<sub>mask</sub> of 33%, consistent with the limited resolution of the capsid map and below regions.”
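As a rough illustration of what a masked map-model correlation such as CC<sub>mask</sub> measures, the sketch below computes a Pearson-style correlation between an experimental density and a model-derived density using only the voxels inside a mask around the model. This is an assumption-laden toy: the function name, the flat-list representation of voxels and the example values are hypothetical, and PHENIX's actual implementation may differ in how the mask and the model map are built.

```python
import math


def masked_correlation(map_values, model_values, mask):
    """Pearson correlation between two density arrays over masked voxels only.

    Hypothetical helper for illustration; not the PHENIX implementation.
    """
    # Keep only voxel pairs that fall inside the mask around the model.
    pairs = [(m, c) for m, c, keep in zip(map_values, model_values, mask) if keep]
    if len(pairs) < 2:
        raise ValueError("need at least two masked voxels")
    n = len(pairs)
    mean_map = sum(m for m, _ in pairs) / n
    mean_model = sum(c for _, c in pairs) / n
    cov = sum((m - mean_map) * (c - mean_model) for m, c in pairs)
    var_map = sum((m - mean_map) ** 2 for m, _ in pairs)
    var_model = sum((c - mean_model) ** 2 for _, c in pairs)
    if var_map == 0 or var_model == 0:
        raise ValueError("density is constant inside the mask")
    return cov / math.sqrt(var_map * var_model)


# Toy, made-up voxel values (not data from the study).
experimental = [0.1, 0.8, 0.9, 0.2, 0.7, 0.0]
model_derived = [0.0, 1.0, 0.8, 0.1, 0.9, 0.5]
near_model = [True, True, True, True, True, False]
cc = masked_correlation(experimental, model_derived, near_model)
print(f"CC inside mask: {cc:.3f}")
```

Read this way, a lower value for the 6.3 Å map than for the 3.5 Å map is what one would expect when the same rigid model is compared with a blurrier density.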
(2) The results section 'Jyvaskylavirus three-dimensional architecture' could be expanded to compare and contrast with other giant virus structures, in terms of T-number, diameter, and features on and inside the capsid. This is not essential but would help focus claims of novelty with regard to structure.
Following reviewer #1's suggestion, we have added a few lines that place Jyvaskylavirus in morphological context with other NCLDVs, as follows:
“Both the capsid organization and virion size are similar to those of other Marseilleviruses, such as Melbournevirus and Tokyovirus. Pacmanvirus, considered to be at the crossroads between Asfarviridae and Faustoviruses, also possesses the same T number (309) and a comparable diameter to Jyvaskylavirus. In contrast, other giant viruses, such as African swine fever virus (ASFV), representative of the Asfarviridae family, have a T number of 277 and a diameter of approximately 2,100 Å, while PBCV-1, a member of the Phycodnaviridae family, has a T number of 169 and an average diameter of 1,900 Å. All of the above-mentioned viruses have been shown to possess a major capsid protein with a vertical double jelly-roll fold that composes the capsid shell, along with an internal membrane bilayer. Minor capsid proteins have been identified and structurally modelled for the smaller virions ASFV and PBCV-1 (Wang et al. 2019; Shao et al. 2022).”
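For readers less familiar with triangulation numbers, the T values quoted in this passage can be related to lattice indices through the standard Caspar-Klug relation T = h² + hk + k². The sketch below searches for the (h, k) pairs that give each quoted T number and lists the corresponding capsomer counts (12 pentons and 10(T − 1) pseudo-hexameric capsomers). The function names are hypothetical, and the code is only an arithmetic illustration, not part of the authors' analysis.

```python
def triangulation_indices(t_number):
    """Return all (h, k) with 0 <= h <= k and h*h + h*k + k*k == t_number."""
    solutions = []
    h = 0
    # With h <= k, the smallest possible T for a given h is 3*h*h.
    while 3 * h * h <= t_number:
        k = h
        while h * h + h * k + k * k <= t_number:
            if h * h + h * k + k * k == t_number:
                solutions.append((h, k))
            k += 1
        h += 1
    return solutions


def capsomer_counts(t_number):
    """Caspar-Klug counts: 12 pentons plus 10*(T-1) pseudo-hexameric capsomers."""
    return {"pentons": 12, "pseudo_hexameric_capsomers": 10 * (t_number - 1)}


# T numbers as quoted in the response text above.
for name, t in [("Jyvaskylavirus / Pacmanvirus", 309), ("ASFV", 277), ("PBCV-1", 169)]:
    print(name, t, triangulation_indices(t), capsomer_counts(t))
```

Note that some T numbers admit more than one index pair: 169 is satisfied by both (0, 13) and (7, 8), so the T number alone does not always fix the lattice.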
(3) The authors highlight one of the main novelties of the virus as being the first to be isolated from Finland. The first isolation of a giant virus from the region is indeed a success but reported isolation experiments for giant viruses are still relatively few. To help shed light on the likely distribution of Jyvaskylavirus-like viruses in the region, and further afield, the genome of Jyvaskylavirus could be searched against relevant available metagenomes.
In the last decade, interest in finding giant viruses by metagenomics has increased. However, the focus has been on marine environments, where these viruses have been shown to be prevalent. Besides the few isolates from the Northern hemisphere mentioned in the manuscript, northern giant viruses have been detected in metagenome datasets from glacier samples, epishelf lakes, the permafrost, the Nordic seas and a deep-sea hydrothermal vent. Most of the genomic hits are for mimivirus-like or phycodnavirus-like sequences. A few marseilleviruses were found in the Loki’s Castle deep-sea vent, and we have already included these sequences in the analysis shown in Supplementary Figure 3. In this case the deep-sea vent viruses cluster outside the conventional clades of the Marseilleviridae family, evidencing their uniqueness.
In response to the suggestion of exploring the distribution of Jyvaskylavirus, we used the MGnify database to search for DNA polymerase (DNApol) and major capsid protein (MCP) sequences. Our search returned multiple hits with very low E-values (< 1e-80), with both DNApol and MCP detected in the same studies, indicating the presence of similar virus-like particles (VLPs) globally. Of particular interest was the detection of similar sequences in metagenomes and transcriptomes obtained from drinking water distribution systems of ground and surface waterworks in central and eastern Finland (https://www.ebi.ac.uk/metagenomics/studies/MGYS00005650#overview). We have acknowledged this in the manuscript and cited the appropriate references, as follows:
Results: “Searching the Jyvaskylavirus major capsid protein and DNA polymerase sequences in the MGnify-database (Richardson et al 2023) yields multiple hits with significantly low E-values (< 1e-80), as expected from the apparent ubiquity of marseilleviruses. Of note was the detection of similar sequences in metagenomes and transcriptomes obtained from drinking water distribution systems of ground and surface waterworks in central and eastern Finland, evidencing that marseilleviruses are prevalent but still unexplored in this region (Tiwari et al 2022)”.
Discussion: “Marseillevirus DNA polymerase sequences are present in metagenomes from Finnish drinking water distribution systems (Tiwari et al 2022), hinting to a wide distribution of these viruses and still unknown ecological role in Central and Eastern Finland.”
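The filtering logic described in this response (keeping hits below an E-value cutoff and then asking in which studies both marker proteins were found) can be sketched as follows. The record layout, the function name and the placeholder study identifiers are all hypothetical; the example rows are invented to exercise the code and are not MGnify results.

```python
E_VALUE_CUTOFF = 1e-80  # threshold quoted in the response


def studies_with_both_markers(hits, cutoff=E_VALUE_CUTOFF):
    """Return study IDs in which both DNApol and MCP have a hit below the cutoff.

    `hits` is an iterable of (study_id, marker, e_value) tuples (assumed layout).
    """
    markers_by_study = {}
    for study_id, marker, e_value in hits:
        if e_value < cutoff:
            markers_by_study.setdefault(study_id, set()).add(marker)
    required = {"DNApol", "MCP"}
    return sorted(s for s, markers in markers_by_study.items() if required <= markers)


# Placeholder rows for illustration only (not real search output).
example_hits = [
    ("STUDY_A", "DNApol", 1e-120),
    ("STUDY_A", "MCP", 3e-95),
    ("STUDY_B", "DNApol", 1e-90),
    ("STUDY_B", "MCP", 1e-20),   # fails the cutoff, so STUDY_B is excluded
    ("STUDY_C", "MCP", 1e-150),  # only one marker, so STUDY_C is excluded
]
print(studies_with_both_markers(example_hits))
```

Requiring both markers in the same study is a stricter criterion than counting hits for either protein alone, which is why it supports the presence of related viruses rather than isolated homologous genes.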
Reviewer #2 (Recommendations for the authors):
Apart from the major comments in the weaknesses section, I have these additional minor comments to the authors:
(1) I do not understand why the authors emphasized the uniqueness of isolating a giant virus from Finland. I think the manuscript would benefit if they rather emphasize that the virus comes from a boreal environment.
The first giant virus, APMV, was described in 2003. In the following years the apparent ubiquity of these viruses was demonstrated on two fronts. Metagenomics made clear that giant viruses are found almost everywhere, with a bias towards the oceans. Isolation efforts have brought new virus groups to light but have so far been biased towards samples from central Europe and South America. The closest isolated giant viruses to Jyvaskylavirus would be either an uncharacterized Swedish cedratvirus or a few microalgae-infecting mimivirus-like and phycodnavirus-like isolates from Norway. Among marseilleviruses, Jyvaskylavirus is the northernmost isolate so far. Other marseilleviruses from the northern hemisphere were found in France, India, Japan and Algeria only.
We still believe that finding a giant virus in Finland is relevant, considering that no other is known to date, either as an isolate or detected by genomics. We have made these observations clearer in the manuscript, giving emphasis to the boreal environment as well.
(2) All discussed AlphaFold models should be added as Supplementary PDB data.
We thank reviewer #2 for the suggestion. In the ‘Data availability’ statement of the original submission we stated that ‘Predicted Jyvaskylavirus PDB models using ModelAngelo and Alphafold have been deposited at BioStudies under the accession number S-BSST1654’. The atomic coordinates of all predicted models are therefore publicly available at https://www.ebi.ac.uk/biostudies/; for additional clarity we have also added the link to the ‘Data availability’ statement in the revised version.
Our reason for not depositing them in the Protein Data Bank in association with our EMD-51613 entry is that they remain predicted models rigid-body fitted into the Jyvaskylavirus density map of 6.3 Å resolution. However, we have added to our BioStudies deposition (S-BSST1654) the whole Jyvaskylavirus pentameric assembly model (including all identified and predicted major and minor capsid proteins) rigid-body fitted into the Jyvaskylavirus map, and it can be easily downloaded.
We did not perform the real-space ‘minimization_global’ refinement of the predicted models corresponding to the ORFs of Melbournevirus (or Jyvaskylavirus) into the corresponding available Melbournevirus densities (entries EMD-37188, 37189, 37190 at ~3.5 Å resolution, obtained by block-based reconstruction methods), as these maps were generated and deposited by other authors. Instead, we performed the rigid-body fit-into-map procedure of the individual predicted Jyvaskylavirus models into the previously deposited Melbournevirus maps using ChimeraX, demonstrating a fold-map alignment and assignment (see for example the individual stereo views in Supplementary Figure 6).
In the revised version, we now provide the refinement statistics for the complete Jyvaskylavirus pentameric assembly (inclusive of peripentonal major capsid and minor capsid proteins) rigid-body fitted as a whole into the Melbournevirus 5-block reconstruction map using PHENIX, resulting in a CC<sub>mask</sub> of 57.3% (this is also stated in Supplementary Figure 7).
The same pentameric assembly model was then placed into our lower-resolution 6.3 Å Jyvaskylavirus 3D density map in ChimeraX and rigid-body refined as a whole in PHENIX, yielding a predictably lower CC<sub>mask</sub> of 33%. This pentameric assembly model has now also been included in the BioStudies entry.
The procedure for this rigid body fitting and refinement has been clarified and added to the 'Materials and Methods' section as follows:
“Then, the corresponding full 3D models were predicted using AlphaFold3 and fitted into the Melbournevirus and Jyvaskylavirus cryoEM density using the fit-into-map routine in ChimeraX together with the peripentonal capsomers (Meng et al 2023). To assess the metric of this fitting (Supplementary Figure 7), the 3.5 Å five-fold Melbournevirus block 3D density (EMDB-37190) was boxed around the pentameric assembly model and refined as a whole using rigid-body refinement in PHENIX, yielding a CC<sub>mask</sub> of 57.3%. The same pentameric model was subsequently fitted into the 6.3 Å Jyvaskylavirus 3D cryo-EM density (previously boxed around the model), resulting in a lower CC<sub>mask</sub> of 33%, consistent with the limited resolution of the capsid map and below regions.”
(3) Figure 2A: Could ORFs that encode structural proteins discussed in the paper, be somehow highlighted?
We have updated Figure 2A to include this information.
(4) Figure 2C: Could the members on which structural characterisation has been conducted be highlighted somehow (e.g. by some symbol next to the name)?
We have updated Figure 2C to include this information.
(5) Figure 5A: Could the central bid be shown in a lower threshold (you can retain the threshold for the protein shell)? It would be interesting to see some details of the interior, rather than a massive blob.
We have decreased the threshold level of the map as suggested.
(6) Figure 6: the density corresponding to MCPs, minor capsid, and penton proteins respectively could be colour-zoned in Chimera(X). This would better visualise where each entity lies.
About ORF142 - what other virus protein possesses this fold? Is it similar to the penton protein in other PRD1/Adenoviridae viruses? Maybe some comparison could be presented?
We have incorporated the feedback from reviewer #2 by modifying the corresponding panel A in Figure 6. We have colour-zoned the penton (ORF142) and part of the density corresponding to the MCPs (ORF184) and to the minor capsid proteins (ORF121). We have kept in grey the density corresponding to other minor proteins; those we were able to identify are introduced later and shown as individual coloured cartoon tube models fitted into the density in panel A of Figure 7.
Regarding ORF142, we have included a reference in the Discussion section to a new Supplementary Figure 9, where we provide a side-by-side comparison of the predicted Jyvaskylavirus penton protein model with experimentally derived penton protein models of PRD1 and HCIV-1. In light of this comparison, we have also added a brief clarification in the Discussion as follows:
“However, in ORF142, the CHEF strands are predicted to be tilted relative to the BIDG strands, with an estimated angle of approximately 60° based on visual inspection (Supplementary Figure 9).”
(7) Figure 7B: Could the density around the protein be zoned (rather than side view clipped), as this would better showcase how it fits the density?
Initially, we presented a side view of the clipped surface to highlight the correspondence between the wall-shaped density, characteristic of a low-resolution beta-barrel, and the beta-barrel of the predicted model. Following the Reviewer’s suggestion, we have now surface-zoned the density and provided a stereo view of the density with the model fitted into the map using ChimeraX. While we recognize that stereo views are no longer commonly used in main text figures, we believe they remain valuable for visually assessing the overall match in low-resolution 3D density maps.
(8) The authors did not try to reconstruct the asymmetric feature of the virion by classifying pentons, which may have identified a special vertex, one they claim might be required for genome packaging in "open particles". I understand the number of particles is low, but even low-resolution classification in C5 might be of interest in the field.
We thank reviewer #2 for this valuable comment. The potential existence of a unique vertex in Marseilleviruses remains an open and intriguing question. Further investigations, including a significant increase in the number of particles, may help clarify this issue, and we plan to explore this topic in future structural studies.
(9) Supplementary Figure 2: It would be interesting how the titre changes after the 12 hours, will it plateau? Could you add a bar showing the original titre to the chart showing stability after 109 days? I like the data in this figure and think it should be transferred to the main text.
The titre at the 12h time point is very close to the titre we often get in our stocks, indicating that indeed it is close to peaking. For comparison: the titre of the 12-hour time point was 10<sup>11.55</sup> TCID50/ml, whereas our stock has a titre of 10<sup>11.66</sup> TCID50/ml. Our growth curve had more time points up to 48h and we lost the later time points due to a higher viral load than predicted, which led to us not being able to count these time points with the dilutions used. Showing the first 12 hours was enough for our initial purpose, which was to show a quick replication cycle for Jyvaskylavirus, in accordance with the other marseilleviruses in which the timing of the replication cycle was observed (see the answer for point 10 below).
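To make the comparison of the two log-scale titres quoted above concrete, the short sketch below converts the difference in log10 values into a fold difference. The function name is hypothetical and only the two exponents (11.55 and 11.66) come from the response; this is simple arithmetic, not a reanalysis of the titration data.

```python
def fold_difference(log10_a, log10_b):
    """Fold difference b / a for two titres given as log10(TCID50/ml)."""
    return 10 ** (log10_b - log10_a)


twelve_hour_log10 = 11.55  # titre at the 12-hour time point, from the response
stock_log10 = 11.66        # titre of the stock, from the response

ratio = fold_difference(twelve_hour_log10, stock_log10)
print(f"stock / 12 h titre = {ratio:.2f}-fold")
print(f"12 h titre as a share of stock = {100 / ratio:.0f}%")
```

A difference of 0.11 log10 units corresponds to roughly a 1.3-fold difference, which is small on the scale of a titration assay and is consistent with the statement that the 12-hour titre is close to the stock titre.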
We have added a bar representing the original titre of the stock used for the stability experiment as suggested.
While preparing the draft we were divided over whether to place the growth and stability figure in the main text or in the supplementary material. Our decision was to move these data to the supplementary material and keep the focus of the main text on the discovery, genome analysis and structural data, as these are the main findings of our work. The specifics regarding stability, growth and other uncharacterized VLPs went to the supplementary material for those in the field who are interested in looking deeper. That being said, we would prefer to keep these data as supplementary material, if you and the editor agree.
(10) In the Discussion, the authors should focus on how our perception of giant viruses changes by this study - compare with other growth curves, stability assays, and structures of giant viruses, showcasing how prevalent those stabilising minor capsid proteins are, etc. My impression is that in the current form, it is just not clear if/how substantial these findings are and such a comparison and putting the results in a bigger picture would considerably increase the impact of the paper.
Our comparisons with other marseilleviruses were based on genomic and structural characteristics, the two areas for which data from the literature and databases were available for comparison. Unfortunately, little information is available on the stability and growth of other isolates that could be used for an in-depth comparison. For example, although marseilleviruses are known to have a fast replication cycle, this has been measured by DAPI staining of DNA inside infected cells to evaluate viral factory formation (Boyer et al 2009), or by time-series observations of viral cycle stages by electron microscopy (Fabre et al 2017), and not by viral titration as done here. We now cite these references in the Results:
“A fast replication cycle is a feature also shown for other marseilleviruses (Boyer et al 2009 ; Fabre et al 2017).”
The literature also does not report virion stability for other isolates, making a comparison with Jyvaskylavirus impossible. A comparative study testing different isolates side by side is definitely of relevance and interest, but it would be difficult to complete in a short time because other isolates would first have to be obtained. We believe the results in this manuscript might set some parameters to be used for comparing with other marseilleviruses, by our groups and others, in the future.
Regarding the prevalence of the minor capsid proteins, we have expanded and clarified the identification of ORFs in Melbournevirus in the ‘Results’ and ‘Discussion’ sections. Supplementary Table 4 has been updated accordingly and is referenced in the Results to clarify that the identification of Melbournevirus ORFs was carried out in BLASTp by querying the Jyvaskylavirus minor protein sequences exclusively against Melbournevirus isolate 1 (NCBI Reference Sequence: NC_025412.1). BLASTp was then performed against the full sequence database, and homologous sequences were primarily retrieved from other marseilleviruses. These results have been compiled in a new Supplementary Table 5.
However, Supplementary Table 5 also shows that the hits for Melbournevirus are not ranked at the top, and in some cases, they do not appear among the top hits.
The ‘Results’ section now contains the following text:
“To this end, we identified the corresponding Jyvaskylavirus ORFs in Melbournevirus through sequence comparison with Melbournevirus isolate 1 (NCBI Reference Sequence: NC_025412.1) (Supplementary Table 4). However, when the identified Jyvaskylavirus ORF sequences were analyzed using BLASTp without restricting the search to the Melbournevirus reference, many hits were observed in other giant viruses, primarily marseillevirus. Remarkably, some of these hits scored higher than those for Melbournevirus, supporting the presence of homologous proteins in these viruses (Supplementary Table 5).”
The ‘Discussion’ section now contains the following text:
“Additionally, the observation that the identified Jyvaskylavirus minor capsid protein sequences are shared across other marseillaviruses supports their essential structural and stabilizing roles in these viruses.”
At the same time, we have modified the ‘Materials and Methods’ section to include a reference to Supplementary Figure 5, where the use of ModelAngelo is mentioned. Additionally, a new Supplementary Figure 10 has been included to clarify how the residues built into the Melbournevirus density using ModelAngelo (without prior knowledge of any sequence) are subsequently matched with the Jyvaskylavirus sequences.
(11) Based on the author's statement, Iker Arriaga did all the cryoEM experiments. It is strange to me they are not placed higher on the author's list.
We thank you for this observation and agree with your comment. This manuscript has been in preparation for a few years, and the author order in the first draft was defined before the structural data collection and analyses were completed. Iker's participation was indeed important and substantial from the first draft to the submitted version, and he definitely deserves a better author placement. We have modified the author order to accommodate this. Note that only the author order changed and that no author has been added or removed.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this manuscript, the authors provide strong evidence that the cell surface E3 ubiquitin ligases RNF43 and ZNRF3, which are well known for their role in regulating cell surface levels of WNT receptors encoded by FZD genes, also target EGFR for degradation. This is a newly identified function for these ubiquitin ligases beyond their role in regulating WNT signaling. Loss of RNF43/ZNRF3 expression leads to elevated EGFR levels and signaling, suggesting a potential new axis to drive tumorigenesis, whereas overexpression of RNF43 or ZNRF3 decreases EGFR levels and signaling. Furthermore, RNF43 and ZNRF3 directly interact with EGFR through their extracellular domains.
Strengths:
The data showing that RNF43 and ZNRF3 interact with EGFR and regulate its levels and activity are thorough and convincing, and the conclusions are largely supported.
Weaknesses:
While the data support that EGFR is a target for RNF43/ZNRF3, some of the authors' interpretations of the data on EGFR's role relative to WNT's roles downstream of RNF43/ZNRF3 are overstated. The authors, perhaps not intentionally, promote the effect of RNF43/ZNRF3 on EGFR while minimizing their role in WNT signaling. This is the case in most of the biological assays (cell and organoid growth and mouse tumor models). For example, the conclusion of "no substantial activation of Wnt signaling" (page 14) in the prostate cancer model is currently not supported by the data and requires further examination. In fact, examination of the data presented here indicates effects on WNT/β-catenin signaling, consistent with previous studies.
Cancers in which RNF43 or ZNRF3 are deleted are often considered to be "WNT addicted", and inhibition of WNT signaling generally potently inhibits tumor growth. In particular, treatment of WNT-addicted tumors with Porcupine inhibitors leads to tumor regression. The authors should test to what extent PORCN inhibition affects tumor (and APC-min intestinal organoid) growth. If the biological effects of RNF43/ZNRF3 loss are mediated primarily or predominantly through EGFR, then PORCN inhibition should not affect tumor or organoid growth.
We thank the reviewer for appreciating the key strength of our study. We fully agree with the reviewer that RNF43/ZNRF3 play key roles in restraining WNT signaling and that their deletion activates WNT signaling, which promotes cancer, as discussed and cited in our manuscript (Hao et al, 2012; Koo et al, 2012). We have revised the language in this manuscript to avoid any confusion or appearance of downplaying this known signaling pathway in cancer progression.
What we would like to highlight in this work is that our study uncovered an effect of RNF43/ZNRF3 on EGFR, leading to biological impact in multiple model systems. In particular, we included the APC-mutated human cancer cell line HT29 and Apc min mouse intestinal tumor organoids. In the context of APC mutations, β-catenin stabilization and the activation of WNT target genes are essentially decoupled from upstream WNT ligand binding to WNT receptors; thus we could primarily focus on the effect of RNF43/ZNRF3 on EGFR. Our statement of “no substantial activation of WNT signaling”, as cited by the reviewer, was made in describing the data in Fig. 7E, where we did not observe β-catenin accumulation in the nucleus and therefore inferred no substantial activation of canonical WNT signaling. We agree that further examination would help strengthen the conclusion and appreciate the reviewer’s suggestion of PORCN inhibition experiments. While PORCN inhibition is a valuable experiment in models with an abundance of WNT ligands/receptors and non-mutationally activated regulators of WNT signaling (Yu et al, 2020), in biological scenarios with existing APC mutations, another group has previously demonstrated that PORCN inhibition had no observable effect on WNT signaling in APC-deficient cells (PMID: 29533772). In our initial submission, we confirmed this predicted low response to manipulation of WNT signaling components upstream of a mutated APC. We showed that addition of RSPO1 to Apc min mouse intestinal tumor organoids failed to further activate WNT target expression (Fig. 6G). Furthermore, in this revised manuscript, we added new data on EGFR inhibition and PORCN inhibition in WT and Znrf3 KO MEFs (Fig. 6L). PORCN inhibition had no impact on cell growth in either WT or Znrf3 KO MEFs, suggesting that the growth-promoting effect of Znrf3 KO in MEFs is WNT independent. In contrast, inhibition of EGFR downstream signaling components (Fig. 6L) significantly blocked MEF growth and abolished the impact of Znrf3 KO on MEF growth. This new evidence further supports our main conclusion that RNF43/ZNRF3 controls EGFR signaling to regulate cell growth.
Reviewer #2 (Public Review):
Using proteogenomic analysis of human cancer datasets, Yu et al. found that EGFR protein levels negatively correlate with ZNRF3/RNF43 expression across multiple cancers. Interestingly, they found that CRC harbouring the frequent RNF43 G659Vfs*41 mutation exhibits higher levels of EGFR when compared to RNF43 wild-type tumors. This is highly interesting since this mutation is generally not thought to influence Frizzled levels and Wnt-β-catenin pathway activity. Using CRISPR knockouts and overexpression experiments, the authors show that EGFR levels are modulated by ZNRF3/RNF43. Supporting these findings, modulation of ZNRF3/RNF43 activity using Rspondin also leads to increased EGFR levels. Mechanistically, the authors show that ZNRF3/RNF43 ubiquitinate EGFR, leading to its degradation. Finally, the authors present functional evidence that loss of ZNRF3/RNF43 unleashes EGFR-mediated cell growth in 2D culture and organoids and promotes tumor growth in vivo.
Overall, the conclusions of the manuscript are well supported by the data presented, but some aspects of the mechanism presented need to be reinforced to fully support the claims made by the authors. Additionally, the title of the paper suggests that ZNRF3 and RNF43 loss leads to the hyperactivity of EGFR and that its signalling activity contributes to cancer initiation/progression. I don't think the authors convincingly showed this in their study.
We thank the reviewer for commenting that our “conclusions of the manuscript are well supported by the data presented.” We address the concerns raised by this reviewer point by point below:
Major points:
(1) EGFR ubiquitination. All of the experiments supporting that ZNRF3/RNF43 mediates EGFR ubiquitination are performed under overexpression conditions. A major caveat is also that none of the ubiquitination experiments are performed under denaturing conditions. Therefore, it is impossible to claim that the ubiquitin immunoreactivity observed on the western blots presented in Figure 4 corresponds to ubiquitinated-EGFR species. Another issue is that in Figure 4A, the experiments suggest that the RNF43-dependent ubiquitination of EGFR is promoted by EGF. However, there is no control showing the ubiquitination of EGFR in the absence of EGF but under RNF43 overexpression. According to the other experiments presented in Figures 4B, 4C, and 4F, there seems to be a constitutive ubiquitination of EGFR upon overexpression. How do the authors reconcile the role of ZNRF3/RNF43 vs c-cbl?
We agree with the reviewer about the limitations of overexpression experiments. In this manuscript, we leveraged both overexpression and knockout systems to demonstrate that ZNRF3/RNF43 regulates EGFR ubiquitination: in Fig 4A, we showed that overexpression of RNF43 increased EGFR ubiquitination; in Fig 4B&C and Fig S3A, we showed that RNF43 knockout decreased EGFR ubiquitination; in Fig 4F, we showed that overexpression of ZNRF3 WT increased EGFR ubiquitination, whereas overexpression of the ZNRF3 RING domain deletion mutant failed to do so.
We also appreciate the rigor with which the reviewer has approached our methodology. We acknowledge that denaturing conditions can provide additional validation, but the technical challenges associated with denaturing conditions include the potential disruption of epitope structures recognized by these antibodies. Our methodology was chosen to balance the need for accurate detection with the preservation of protein structure and function, which are crucial for understanding the biological implications of EGFR ubiquitination. Moreover, our immunoprecipitation and subsequent Western blotting were stringent with high SDS and 2-ME, optimized to minimize non-specific binding and enhance the specificity of detection. We believe that the data presented are robust and contribute significantly to the existing body of knowledge on EGFR ubiquitination.
CBL is a well-known E3 ligase of EGFR, and it induces EGFR ubiquitination upon EGF ligand stimulation. Therefore, to allow a fair comparison of the effects of RNF43 and CBL on EGFR ubiquitination, we designed Fig 4A and related experiments in the setting of EGF stimulation. We observed that RNF43 overexpression increased EGFR ubiquitination as potently as CBL did. Following this result, we further demonstrated that knockout of RNF43 decreased the level of endogenous ubiquitinated EGFR in the unstimulated/basal condition (Fig 4B) as well as in the EGF-stimulated condition (Fig 4C). We acknowledge the importance of fully understanding how ZNRF3/RNF43 interplays with CBL in regulating EGFR ubiquitination. This line of investigation holds the potential to uncover novel regulatory mechanisms in detail. However, the primary focus of the current study was to establish a foundational understanding of the role of ZNRF3/RNF43 in regulating EGFR ubiquitination. We look forward to exploring this further in future work.
(2) EGFR degradation vs internalization. In Figure 3C, the authors show experiments that demonstrate that RNF43 KO increases steady-state levels of EGFR and prevents its EGF-dependent proteolysis. Using flow cytometry they then present evidence that the reduction in cell surface levels of EGFR mediated by EGF is inhibited in the absence of RNF43. The authors conclude that this is due to inhibition of EGF-induced internalization of surface EGFR. However, the experiments are not designed to study internalization and rather merely examine steady-state levels of surface EGFR pre and post-treatment. These changes are an integration of many things (retrograde and anterograde transport mechanisms presumably modulated by EGF). What process(es) is/are specifically affected by ZNRF3/RNF43? Are these processes differently regulated by c-cbl? If the authors are specifically interested in internalization/recycling, the use of cell surface biotinylation experiments and time courses are needed to examine the effect of EGF in the presence or absence of the E3 ligases.
We agree that our study design primarily assesses EGFR levels on the cell surface before and after EGF treatment and does not comprehensively measure the whole internalization process. In response to the reviewer’s comments, we have revised the relevant sections of the manuscript to clarify that our current findings are focused on changes in cell surface EGFR and do not extend to the detailed mechanisms of EGF-induced internalization or recycling.
(3) RNF43 G659Vfs*41. The authors make a point in Figure 1D that this mutant leads to elevated EGFR in cancers but do not present evidence that this mutant is ineffective in mediating ubiquitination and degradation of EGFR. As this mutant maintains its ability to promote Frizzled ubiquitination and degradation, it would be important to show side by side that it does not affect EGFR. This would perhaps imply differential mechanisms for these two substrates.
Fig 1D is based on bioinformatic analysis of colon cancer patient samples, showing that RNF43 G659Vfs*41 mutant tumors exhibited significantly higher levels of EGFR protein compared to RNF43 WT tumors. Following this lead, we investigated whether the RNF43 G659Vfs*41 hotspot mutation abolishes the ability of RNF43 to downregulate EGFR. To this end, we transfected the same amount of control vector, RNF43 WT, RING deletion mutant, or G659Vfs*41 mutant DNA into 293T cells and measured the level of co-transfected EGFR. As shown in Author response image 1, overexpression of RNF43 WT decreased the EGFR level, while overexpression of the RING deletion mutant had no impact on the EGFR level as compared with the Vector group, which is consistent with our findings in the manuscript. Cells transfected with the RNF43 G659Vfs*41 mutant exhibited nearly normal levels of EGFR; however, we also observed that RNF43 G659Vfs*41 was expressed at a lower level than WT, even though the same amounts of DNA were transfected. Therefore, the insubstantial impact on EGFR levels could be attributed to either functional loss or compromised stability of RNF43 G659Vfs*41 mRNA or protein, or to both. Further investigation of RNF43 G659Vfs*41 mRNA and protein stability vs. RNF43 G659Vfs*41 protein function is needed to draw a solid conclusion.
Author response image 1.
(4) "Unleashing EGFR activity". The title of the paper implies that ZNRF3/RNF43 loss leads to increased EGFR expression and hence increased activity that underlies cancer. However, I could find only one direct evidence showing that increased proliferation of the HT29 cell line mutant for RNF43 could be inhibited by the EGFR inhibitor Erlotinib. All the other evidence presented that I could find is correlative or indirect (e.g. RPPA showing increased phosphorylation of pathway members upon RNF43 KO, increased proliferation of a cell line upon ZNRF3/ RNF43 KO, decreased proliferation of a cell line upon ZNRF3/RNF43 OE in vitro or in xeno...). Importantly, the authors claim that cancer initiation/ progression in ZNRF3/RNF43 mutants may in some contexts be independent of their regulation of Wnt-bcatenin signaling and relying on EGFR activity upregulation. However, this has not been tested directly. Could the authors leverage their znrf3/RNF43 prostate cancer model to test whether EGFR inhibition could lead to reduced cancer burden whereas a Frizzled or Wnt inhibitor does not?
More broadly, if EGFR signaling were to be unleashed in cancer, then one prediction would be that these cells would be more sensitive to EGFR pathway inhibition. Could the authors provide evidence that this is the case? Perhaps using isogenic cell lines or a panel of patient-derived organoids (with known genotypes).
We appreciate the reviewer’s suggestion to provide more direct evidence demonstrating the importance of the ZNRF3/RNF43-EGFR axis in cancer cell proliferation. In this revised manuscript, we further studied this issue in WT vs. Znrf3 KO MEF cells. We observed that treatment with the EGFR inhibitor erlotinib did not affect WT MEFs but blunted the growth advantage of Znrf3 KO MEF cells (Fig. 6L). On the other hand, treatment with the porcupine inhibitor C59 did not impact either WT or Znrf3 KO MEF cells (Fig. 6L), suggesting a more important role of the ZNRF3/RNF43-EGFR axis in mediating the enhanced MEF growth caused by Znrf3 knockout. Furthermore, considering that EGFR is often mutated in human cancer, to increase the clinical relevance of our study, we also tested the effect of RNF43 knockout on EGFR L858R (Fig. 2D), a common oncogenic EGFR mutant, and found that RNF43 knockout in HT29 boosted levels of this EGFR mutant detected by its FLAG tag, suggesting that RNF43 degrades both WT and mutated EGFR and that its loss can enhance signaling of both WT EGFR and its oncogenic mutant. However, we emphasize again that this manuscript is in no way written to diminish the proven importance of the ZNRF3/RNF43-WNT-β-catenin axis in cancer and development.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
The main conclusion that EGFR is targeted for degradation by RNF43 and ZNRF3 is well supported and documented. Figures 1-5 and associated supplemental figures contain largely convincing data. Figures 6 and 7, however, require some modifications, as follows in order of appearance:
Figure 6C: Growth of intestinal tumor organoids from Apcmin mice does not require Rspo, however, the authors show that these organoids grow larger in the presence of Rspo, an effect they attribute to increased EGFR activity, rather than increased WNT activity. While this conclusion may be correct, the authors should address this possibility by treating the organoids with PORCN inhibitor. The prediction would be that Rspo treatment still increases organoid size in the presence of PORCN inhibition. A further prediction would be that blocking EGFR (e.g. with Cetuximab) will abrogate the RSPO1 effect.
Yes, we attributed the impact of Rspo on Apc min organoid growth to enhanced EGFR activity because we observed increased EGFR levels (Fig 6F) but no detectable increase in the eight WNT target genes assayed. We agree that further pharmacologic experiments would strengthen our conclusion, but our few attempts at treating organoids encountered technical difficulties. Hence, we switched to testing PORCN inhibition vs EGFR inhibition in WT and Znrf3 KO MEFs. As shown in the revised Fig. 6L, EGFR inhibition significantly reversed the growth advantage caused by Znrf3 KO but C59 did not.
Figure 6G: It is unclear why the authors provide "8-day RSPO1 treatment" data. Here, EGFR mRNA appears to be elevated 2-fold (perhaps not statistically significant), and the Wnt targets Lef1 and Axin2 are decreased, as indicated by the statistical significance. What point is being made here?
Our observations of increased size of Apc min mouse intestinal tumor organoids and increased EGFR protein levels were made at 8 days of RSPO1 treatment. Therefore, we measured mRNA levels at the same time point, with the 2-day time point also included for comparison. The goal of this qPCR experiment was to detect the contribution of WNT signaling, and we did not detect an increased transcriptional readout. We included EGFR mRNA levels for comparison, and we did not detect a statistically significant increase, consistent with our experiments concluding that ZNRF3/RNF43 regulate EGFR at the protein level. As stated in the preceding response, these data led us to attribute the impact of Rspo on Apc min organoid growth to enhanced EGFR activity.
Figure 7A: This requires quantitation. How many mice were used per cell line? The data shown is not particularly convincing, with ZNRF3 overexpressing HT29 cells growing detectably. Showing representative mice is fine, but this should be supplemented with quantitation of all mice.
We had provided this data. The BLI signal quantification was shown below the representative BLI images. Seven mice were used per cell line, as annotated at the top of the graph.
Figure 7B: The authors assert that "canonical WNT signaling, based on levels of active-β-Catenin (non-phosphorylated at Ser33/37/Thr41; Figure 7B), remained unaffected". As shown, 2 of the 3 Myc-Znrf3 tumors have increased active-b-catenin signal over the GFP tumors. This indicates to me that canonical Wnt signaling was affected. The authors either need to present quantitative data that supports this claim or modify their conclusions. As presented, I don't think it is appropriate to decouple the effect of Znrf3 overexpression on EGFR from its effect on WNT.
As requested, we have quantified the level of non-phospho β-Catenin at Ser33/37/Thr41 and found no significant differences (p > 0.05) between the control group vs. ZNRF3 overexpression group. We once again note that our manuscript was not meant to dispute the proven signaling and biological significance of WNT signaling regulation by ZNRF3/RNF43, and we have proof-read the manuscript multiple times to ensure that we did not make any generalized or misleading statements in this aspect.
Author response image 2.
Figure 7E: Here the authors assert that "no substantial activation of canonical Wnt signaling" in the Z&R KO tumors, however, the figure shows a substantial increase in active b-catenin staining. The current resolution is insufficient to claim that there is no increase in nuclear b-catenin. The authors' claim that WNT signaling is not involved here is not supported by the data presented here. One way to demonstrate that this effect is through EGFR activation and not through WNT activation is to treat mice with PORCN inhibitor. WNT-addicted tumors, such as by Rnf43 or Znrf3 deletion, regress upon PORCN inhibition. In this case, if the effect of Z&R KO is mediated through EGFR rather than WNT, then there should be no effect on tumor growth upon PORCN inhibition. This is a critical experiment in order to make this point.
We appreciate the reviewer’s comments and suggestion of experiments. We based our initial statement on insubstantial nuclear β-catenin staining, but we agree that immunohistochemical staining lacks the resolution suitable for quantification. We could not generate an adequate number of KO animals for these in vivo experiments in the window of time planned for this revision. Rather, as shown in the newly added Fig. 6L, we tested EGFR inhibition and PORCN inhibition in Znrf3 KO MEFs and obtained strong data further supporting a role for EGFR in mediating the growth advantage of Znrf3 KO MEFs. Notwithstanding, we have carefully revised our description of the in vivo data in Fig 7E to avoid any confusion or over-interpretation.
Minor points:
Figure 2A: provide quantitation of this immunoblot.
We have revised the manuscript to show the quantification result next to the immunoblot.
Figure 2B: provide more detail in the figure legend and in the Materials and Methods section on how the KO MEFs were generated. Confirmation that Znrf3 (or in cases of Rnf43 KO) expression is lost in KO would be advisable.
We have confirmed Znrf3 KO by genotyping and RNF43 KO by immunofluorescent staining. We have also tested multiple commercial anti-ZNRF3 antibodies and anti-RNF43 antibodies for Western blotting, but they all failed.
Figure 4C is a little misleading. The schematic indicates that ECD-TM and TM-ICD truncations were analyzed for both ZNRF3 and RNF43. However, Figure 4 only shows data for ZNRF3, and the corresponding Figure S4 lacks data for the TM-ICD of Rnf43. A recommendation is to show only those schematics for which data is presented in that figure. On a related topic, the results using the deltaRING constructs (Figure S5) are not mentioned/described in the text.
We think that the reviewer meant Fig 5C. We have revised Fig 5C by removing the RNF43 label, and we confirm that the Results section does include the data in Fig S5.
Figure S4A: Only ZNRF3 is indicated in this figure. Please explain why RNF43 is not represented here. Also, indicate what is plotted along the x-axis.
We only detected the endogenous ZNRF3-EGFR interaction, possibly because the RNF43 protein level is relatively low in the cell line we used for the mass spec experiment. The x-axis shows the proteins ordered by their y-axis values, as detailed in the figure legend: each data point was arranged along the x-axis based on the log2 fold change of iBAQ of EGFR-associated proteins identified in EGF-stimulated vs. control samples, from low to high (from left to right on the x-axis). We have added the label “Proteins detected by Mass-Spec” to the x-axis.
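The ordering described for the x-axis can be illustrated with a minimal sketch. The protein names and iBAQ intensities below are hypothetical placeholders, not values from the study, and the helper names `log2_fold_change` and `rank_by_log2fc` are introduced here for illustration only.

```python
import math

# Hypothetical iBAQ intensities as (control, EGF-stimulated) pairs.
# These are illustrative placeholders, not data from the study.
ibaq = {
    "ProteinA": (2.0e6, 1.0e6),
    "ProteinB": (1.0e6, 1.0e6),
    "ProteinC": (5.0e5, 4.0e6),
}


def log2_fold_change(control, stimulated):
    """log2 of the stimulated/control intensity ratio."""
    return math.log2(stimulated / control)


def rank_by_log2fc(intensities):
    """Return (protein, log2FC) pairs sorted from low to high log2FC."""
    scored = [
        (name, log2_fold_change(control, stimulated))
        for name, (control, stimulated) in intensities.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


ranked = rank_by_log2fc(ibaq)

# The x position is simply the rank; the y value is the log2 fold change.
for x_position, (name, lfc) in enumerate(ranked, start=1):
    print(x_position, name, round(lfc, 2))
```

In such a rank-ordered plot the x-axis carries no quantity of its own; it only spreads the proteins out so that those most enriched upon EGF stimulation appear at the right.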
Reviewer #2 (Recommendations For The Authors):
Minor Points.
(1) In Figure 2B, the authors claim that Znrf3 KO enhanced both EGFR and p-EGFR levels both in the absence and presence of EGF. Although it is clear in the presence of EGF, the increased in p-EGFR in the absence of EGF is less than clear.
We have revised the manuscript to more clearly state the result in Fig 2B.
(2) Importantly the authors validated their findings using three independent RNF43 gRNA (fig S2D) but they do not show the editing efficiency obtained with the gRNA.
We did not include an RNF43 IB in this Figure due to the lack of specific antibodies for detecting RNF43 by IB. We have no reason to doubt adequate knockout efficiency, since EGFR was increased compared to the control group. As a result, we did not perform deep sequencing to validate knockout efficiency.
(3) In S2E, the authors show that KO of either ZNRF3 or RNF43 enhance HER2 levels. This suggests that there is no redundancy between these E3 ligases, at least in this context. How do the authors reconcile that?
The reviewer raised an interesting issue. Due to the lack of WB antibodies for these two proteins, we could not easily assess the feedback impact of knocking out either gene on the protein level of the other. We speculate that there may be a threshold level of the sum of the two proteins that is needed for adequate degradation of HER2, leading to a HER2 increase when either gene is knocked out. A detailed study of this issue is beyond the scope of the current work.
(4) Experiments performed in Fig 3C are performed in only one clone. The authors need to repeat in an additional clone or rescue this phenotype using a RNF43 cDNA.
Our RNF43 KO HT29 line is a pool of KO cells, not a single clone.
(5) In Figure 7E, the authors suggest that the absence of nuclear bcatenin means that canonical Wnt signaling is unaffected. It is widely known that nuclear bcatenin is often not correlating with pathway activity.
As stated above, we have revised the manuscript to avoid confusion and misinterpretation.
(6) What is the nature of the error bars in Fig 3c? Are the differences statistically significant?
As mentioned in the figure legend, the error bars are SEM. The result is statistically significant, and p-value is noted in the graph.
(7) In the Figure legends, it should be stated clearly how many biological replicates were performed for each experiment and single data points should be plotted where applicable (e.g. qPCR data). It would be helpful if the uncropped and unprocessed Western blot membranes and replicates that are not shown would be accessible to allow the reader a more comprehensive view of the acquired data, especially for blots that were quantified (e.g. Figure 2F, Figure 3C, there is clearly some defect on the blot).
For WB representation, it would be helpful to include more size markers on the Western blots (especially on the IPs that show ubiquitin smear) and in general to use a reference protein (GAPDH, Actin, Vinculin) that is closer to the protein being assessed.
More details should be added in the Methods section to explain how protocols were performed in detail. For example, it should be explained how the viruses used for infecting cells were produced (which plasmids were transfected using which transfection reagent, how long was the virus collected for, etc). Then, it should be stated how long the cells were undergoing selection before being harvested. Because the expression of the viral constructs potentially has an effect on cell proliferation through EGFR, this information is quite relevant. This is just an example, there are details missing in nearly every section (Flow: washing protocols, gating protocols (Live/dead stain?), WB: RIPA lysis buffer composition? How much protein was loaded on blots? How was protein quantification done? IP: how were washes performed and how often repeated?)
Missing: antibody dilutions for IF, IHC, and WB, plasmid backbones, sequences and availability, qPCR primer sequences from Origene.
Incucyte experiments are not described.
We have revised the relevant sections to include more details.
(8) Line 141: revise text: 2x mRNA abundance in the same sentence.
Line 162: define intermediate expression better.
Line 197/198: revise text ('the predominant one'?).
Line 218/219: revise text (Internalisation of surface EGFR?).
Line 245: clarify in text that it is endogenous EGFR that is being pulled down.
Line 264: typo: conserved instead of conservative.
Line 324: revise text (What does 'unknown significance' mean).
Line 396/397: revise text: 2x Co-IP in the same sentence.
Figure 3 D/E: more details on the Method in the figure legend.
We have revised them accordingly.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank the reviewers and editors for these careful and constructive comments. Based on these comments, we plan to perform new experiments and revised analyses, summarized as follows:
(1) A more thorough analysis and experimental test of the effects of YW->SR variants on baseline AP excitability in neurons in the absence of any pharmacology.
(2) More details on modeling of selective block of Na<sub>V</sub>1.2 and Na<sub>V</sub>1.6.
(3) Revisions to text, figure contents, and figure order to better convey key points and better frame these findings in the context of current clinically available anti-seizure medications that interact with sodium channels.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank both reviewers for their comments on our manuscript. We are pleased that the value of this research has been communicated effectively, and that the reviewers agree that whilst our sample size of individuals is relatively small, it offers a unique perspective for understanding the effects of aging on wild chimpanzees’ technological behaviors. Whilst only yielding data on a few individuals, the Bossou archive is the only available data source with which we can currently address these questions over extended timescales, and is key for understanding longitudinal effects of aging on specific individuals. This is particularly true if we are to understand the life-long dynamics of chimpanzees’ technical skills during tasks which require the organization of multiple movable elements. Bossou is the only community where chimpanzees both perform nut cracking with movable hammer and anvil stones, and have been systematically studied over a period of decades. Moreover, given the dwindling population at Bossou (N = 3 as of 2025), we must make every effort to understand these effects with existing data. We agree that this work will likely form a valuable foundation for future studies, which may aim to either replicate our results, or use our findings to design more specific research questions and approaches.
In the next iteration of the manuscript, we will explain our choice of field seasons more clearly. This choice was a logistical tradeoff between needing to sample across a long lifespan using fine-granularity behavior coding, versus the time constraints for our project and the likely yield of data collection. We sampled from the middle of individuals’ prime age, up until the oldest recorded ages of individuals’ lifespans (17 years). Where possible we aimed to use consistent time intervals (approximately 4 years); however, this was not always possible, as in some years data was not collected by researchers at Bossou (for example, during years where there were Ebola outbreaks affecting the region). In such instances, we sampled the closest available year that offered sufficient data to meet our sampling requirements.
Reviewer 2 raises the possibility that there may be a disconnect between how human observers and chimpanzees conceive of efficiency when nut cracking, and supports this idea with a citation to previous work on efficiency of Oldowan stone knapping. We agree that precisely how chimpanzees perceive their own efficiency during tool use cannot be known through observation alone, nor can we assess the true extent to which chimpanzees are concerned about the efficiency of their nut-cracking. However, following previous studies, it is reasonable to assume that adult chimpanzees embody some level of efficiency, given that adults often select tools which aid efficient nut cracking (Braun et al. 2025, J. Hum. Evol.; Carvalho et al. 2008, J. Hum. Evol.; Sirianni et al. 2015, Animal Behav.); perform nut cracking using more streamlined combinations of actions than less experienced individuals (Howard-Spink et al. 2024, PeerJ; Inoue-Nakamura & Matsuzawa 1997, J. Comp. Psychol.); and consequently end up cracking nuts using fewer hammer strikes, indicating a higher level of skill (Biro et al. 2003, Animal Cogn.; Boesch et al. 2019, Sci. Rep.). Ultimately, these factors suggest that across adulthood, experienced chimpanzees perform nut cracking with a level of efficiency which exceeds that of novice individuals, including across the chaîne opératoire.
To account for the multiple ways in which reduced efficiency may manifest later in life, we provide one of the most flexible measures of efficiency in wild chimpanzee tool use to date, which incorporates more classical measures of time and hammer strikes (see previous examples of Biro et al. 2003, Animal Cogn.; Boesch et al. 2019, Sci. Rep.) as well as additional variables which aim to characterize how streamlined behavioral sequences are (tool rotations, tool swaps, nut replacements, etc.; see Berdugo et al. 2024, Nat. Hum. Behav., for other analyses using similar metrics). In the case of swapping out tools, Reviewer 2 suggests that some of these tool swaps may in fact aid nut cracking by maintaining kernel integrity (a key result relating to Yo’s coula nut cracking efficiency). This however seems unlikely, given that these behaviors were performed extremely rarely by chimpanzees in early field seasons, and were not performed more frequently by other individuals with aging. We will provide additional information behind our metrics for measuring efficiency, with reference to earlier work, and will also incorporate the points raised by Reviewer 2 concerning the limitations with which we can infer chimpanzees’ goals, and how efficiently they meet them.
Reviewer 1 questioned why we did not sample efficiency data for younger individuals, and compare this data with older individuals to detect the effects of aging. Throughout our manuscript, we compared aging individuals’ nut-cracking efficiency with their efficiency in previous years (thus, at younger ages). This offered a personalized benchmark of earlier-life efficiency for each individual, and allowed us to identify aging effects whilst controlling for long-term interindividual variation in skill levels. Indeed, previous analyses at Bossou find that across the majority of adulthood, efficiency varies between individuals, but is relatively stable within individuals (see Berdugo et al. 2024, Nat. Hum. Behav.). As focal aging chimpanzees cracked multiple nuts each field season (and each encounter), we had ample data to fit models that examine individuals’ efficiency over field seasons, using random slopes to model correlations for each individual. By taking this approach, our paper offers a novel perspective by being able to report the longitudinal effects of aging on tool-using efficiency, rather than averaged cross-sectional effects between young and old cohorts. As random slope models (and not just random intercept models) offered the best explanation for variation in aging individuals’ efficiency over our sample period, this implies that focal chimpanzees were experiencing individual-level changes in efficiency over time, thus giving us key evidence that interindividual variation in tool-using efficiency can be compounded by aging.
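The intuition behind individual-level trends can be sketched with a deliberately simplified example. This is not the mixed-effects model described above: it fits a separate least-squares line per individual, a crude stand-in for the random slopes that a mixed model would estimate jointly with partial pooling. The individual labels, years, and hammer-strike values are hypothetical, and `per_individual_slopes` is a name introduced here for illustration.

```python
from statistics import mean

# Hypothetical records: (individual, field season year, mean hammer strikes per nut).
# Illustrative placeholders only, not data from the study.
records = [
    ("ind_A", 1999, 4.0), ("ind_A", 2004, 4.0), ("ind_A", 2008, 4.0), ("ind_A", 2012, 4.0),
    ("ind_B", 1999, 3.0), ("ind_B", 2004, 3.0), ("ind_B", 2008, 5.0), ("ind_B", 2012, 7.0),
]


def per_individual_slopes(rows):
    """Ordinary least-squares slope of the measure over years, per individual."""
    by_individual = {}
    for individual, year, value in rows:
        by_individual.setdefault(individual, []).append((year, value))

    slopes = {}
    for individual, points in by_individual.items():
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
        slopes[individual] = sxy / sxx
    return slopes


slopes = per_individual_slopes(records)
for individual, slope in slopes.items():
    print(individual, round(slope, 3))
```

Here each individual is compared only against its own earlier seasons: a flat slope for one individual and a positive slope for another is the kind of between-individual difference in change over time that a random-slope term is meant to capture, and that a random-intercept-only model would miss.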
We argue that the reductions in efficiency observed for some individuals (e.g. Yo & Velu) are unlikely to be due to environmental changes (e.g. nuts becoming harder in later field seasons), because if this were the case, these effects would be detected across the behaviors of all individuals (which was not observed). Additionally, in the specific case of the hardness of nuts, nuts used in our experiment were sourced from local communities, and were moderately aged. This avoided the use of young nuts, which are harder to crack, or older nuts, which are often worm-eaten or can be empty (Sakura & Matsuzawa 1991, Ethology). We will update our manuscript with this information.
Whilst other factors may introduce general variation into our efficiency data (such as different stones used on different encounters, or more general variation in nut hardness across encounters), very few of these factors predict directional long-term changes in efficiency. Rather, if these factors were driving the majority of variation in our data, we would expect them to lead to variation across visits during earlier field seasons (such as 1999-2008) and later field seasons (2011 onwards) equally, and in a way which does not necessarily correlate with age. This does not match the pattern we observed in our data, where for some individuals (e.g. Yo & Velu), efficiency in nut cracking reduced in later field seasons only, and was relatively consistent across field seasons prior to 2011. Moreover, for Yo, the individual who exhibited the greatest reductions in tool-using efficiency, efficiency continued to decrease across the three latest sampled field seasons. Thus, it is more likely that Yo was experiencing deleterious effects of aging. We do however agree that additional data on these variables would help us to rule out confounding factors more rigorously; we will include recommendations for this data to be collected in future studies.
When modelling the effect of aging on attendance at the outdoor laboratory, we could not use the same approach we used when modelling tool-using efficiency, as we could only acquire one datapoint (attendance rate) per individual for each field season. We therefore had to adapt our analysis, and introduce attendance rates for younger individuals as a baseline to compare against the attendance rates of older individuals across years. We observed a significant interaction effect, where across field seasons, attendance dropped significantly more rapidly for older individuals than younger ones. Reviewer 2 has asked why we do not consider inter-annual variability across this time period, and suggested that we ignored intervening years. This is not the case. When fitting models that examined the effects of aging on attendance, we used all data across all field seasons. We reported an approximate effect size for this significant correlation using a digestible comparison of the attendance rates in the initial and final field seasons sampled. We will ensure that this is clear in the next iteration of our manuscript.
Reviewer 2 noted that many factors may have influenced the decision for chimpanzees to attend the outdoor laboratory in older field seasons, and the current data may not be used to make strong arguments for changes in attendance rates being due to dietary preferences. We agree that many factors may have influenced these attendance rates, and that is what we have aimed to transparently report within our discussion, where we raise an extensive, non-exhaustive list of hypotheses for why we have observed this age-related change in our data. We will aim to ensure that this is exceptionally clear prior to resubmission, and where relevant, will further emphasize points raised by Reviewer 2. We consider some points raised by Reviewer 2 to be unlikely to apply to our study; for example, it is unlikely that neophobia has influenced the behaviors of chimpanzees, as these chimpanzees habitually attended the outdoor laboratory of their own accord for over a decade prior to the earliest year we sampled in this study (reflecting extremely high levels of habituation to the experimental set-up). Previous studies at Bossou have surveyed the ecology of stone tool use across the home range, and confirm that the outdoor laboratory is visited by chimpanzees during ranging as a food patch (Almeida-Warren et al. 2022 Int. J. Primatol.).
Reviewer 2 suggested that it would be helpful to have additional data on variables such as hand grip, as this may reveal further information about how cognitive and physiological senescence influences reductions in tool-using efficiency. We agree that whilst further data on hand grips are not required to detect reductions in efficiency per se, it would be profitable for future analyses to collect similar data; we will add this as a recommendation to our discussion.
Finally, Reviewer 2 commented that they found our discussion of coula-nut cracking disruptive to the flow of the manuscript, given that we could not compare with coula-nut cracking in earlier years. We reported the coula-nut cracking of Yo in 2011 as it was part of our sampled data, and we felt that the comparison with other individuals in the same year was an interesting discussion point; however, we acknowledge this limitation. We will move all data and discussion of coula-nut cracking to the Supplementary Materials, where we will present it as an interesting additional observation that may warrant further investigation using additional data from the Bossou archive. Data collection for this future project could include collecting data on the additional variables raised by both reviewers (e.g. hand grips).
We thank both reviewers for their comments. We believe that their feedback will improve the quality of our reporting, and the validity of our interpretations.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
eLife Assessment
The conclusions of this work are based on valuable simulations of a detailed model of striatal dopamine dynamics. Establishing that a lower dopamine uptake rate can lead to a 'tonic' level of dopamine in the ventral but not dorsal striatum, and that dopamine concentration changes at short delays can be tracked by D1 but not D2 receptor activation, is of value and will be of interest to dopamine aficionados. However, the simulations are incomplete, providing only partial support for the key claims. Several things can be done to strengthen the conclusions, including, for example, but not exclusively, a demonstration of how the results would change as a function of changes in D2 affinity.
We sincerely thank the Editors and Reviewers for their insightful comments on our manuscript. We are pleased that our simulations are recognized as interesting, sophisticated and valuable. Moreover, we fully agree that many of the findings will be of particular interest to dopamine aficionados. While we maintain that our simulations provide a solid basis for the key claims, we acknowledge that the conclusions can be further strengthened by the revisions suggested below.
Reviewer #1 (Public review):
Ejdrup, Gether, and colleagues present a sophisticated simulation of dopamine (DA) dynamics based on a substantial volume of striatum with many DA release sites. The key observation is that a reduced DA uptake rate in the ventral striatum (VS) compared to the dorsal striatum (DS) can produce an appreciable "tonic" level of DA in VS and not DS. In both areas they find that a large proportion of D2 receptors are occupied at "baseline"; this proportion increases with simulated DA cell phasic bursts but has little sensitivity to simulated DA cell pauses. They also examine, in a separate model, the effects of clustering dopamine transporters (DAT) into nanoclusters and say this may be a way of regulating tonic DA levels in VS. I found this work of interest and I think it will be useful to the community. At the same time, there are a number of weaknesses that should be addressed, and the authors need to more carefully explain how their conclusions are distinct from those based on prior models.
(1) The conclusion that even an unrealistically long (1s) and complete pause in DA firing has little effect on DA receptor occupancy is potentially important. The ability to respond to DA pauses has been thought to be a key reason why D2 receptors (may) have high affinity. This simulation instead finds evidence that DA pauses may be useless. This result should be highlighted in the abstract and discussed more.
We appreciate that the reviewer finds our work interesting and useful to the community. However, we acknowledge that in the revised version we need to better describe how our conclusions differ from those reached based on previous models.
We will also carry out new simulations across a range of D2R affinities to assess how this affects the finding that even a long pause in DA firing has little effect on D2 receptor occupancy. As also suggested, the results will be highlighted and further discussed.
(2) The claim of "DAT nanoclustering as a way to shape tonic levels of DA" is not very well supported at present. None of the panels in Figure 4 simply show mean steady-state extracellular DA as a function of clustering. Perhaps mean DA is not the relevant measure, but then the authors need to better define what is and why. This issue may be linked to the fact that DAT clustering is modeled separately (Figure 4) to the main model of DA dynamics (Figures 1-3) which per the Methods assumes even distribution of uptake. Presumably, this is because the spatial resolution of the main model is too coarse to incorporate DAT nanoclusters, but it is still a limitation.
We will improve our definitions and descriptions relating to nanoclustering of DAT in the revised version of the manuscript. We fully agree that the spatial resolution of the main model is a limitation and, ideally, that the nanoclustering should be combined with the large-scale release simulations. Unfortunately, this would require many orders of magnitude more computational power than currently available.
As it stands it is convincing (but too obvious) that DAT clustering will increase DA away from clusters, while decreasing it near clusters. I.e. clustering increases heterogeneity, but how this could be relevant to striatal function is not made clear, especially given the different spatial scales of the models.
Thank you for raising this important point. While it is true that DAT clustering increases heterogeneity in DA distribution at the microscopic level, the diffusion rate is, in most circumstances, too fast to permit concentration differences on a spatial scale relevant for nearby receptors. Accordingly, we propose that the primary effect of DAT nanoclustering is to decrease the overall uptake capacity, which in turn increases overall extracellular DA concentrations. Thus, homogeneous changes in extracellular DA concentrations can arise from regulating a heterogeneous DAT distribution. An exception to this would be the circumstance where the receptor is located directly next to a dense cluster (i.e., within nanometers). In such cases, local DA availability may be more directly influenced by clustering effects. This will be further discussed in the revised manuscript.
(3) I question how reasonable the "12/40" simulated burst firing condition is, since to my knowledge this is well outside the range of firing patterns actually observed for dopamine cells. It would be better to base key results on more realistic values (in particular, fewer action potentials than 12).
We fully agree that this typically is outside the physiological range. The values are included to showcase what extreme situations would look like.
(4) There is a need to better explain why "focality" is important, and justify the measure used.
We will expand on the intention of this measure in the revised manuscript. Thank you for pointing out this lack of clarification.
(5) Line 191: " D1 receptors (-Rs) were assumed to have a half maximal effective concentration (EC50) of 1000 nM" The assumptions about receptor EC50s are critical to this work and need to be better justified. It would also be good to show what happens if these EC50 numbers are changed by an order of magnitude up or down.
We agree that these assumptions are critical. Simulations on effective off-rates across a range of EC50 values will be included in the revised version.
(6) Line 459: "we based our receptor kinetics on newer pharmacological experiments in live cells (Agren et al., 2021) and properties of the recently developed DA receptor-based biosensors (Labouesse & Patriarchi, 2021). Indeed, these sensors are mutated receptors but only on the intracellular domains with no changes of the binding site (Labouesse & Patriarchi, 2021)”
This argument is diminished by the observation that different sensors based on the same binding site have different affinities (e.g. in Patriarchi et al. 2018, dLight1.1 has Kd of 330nM while dlight1.3b has Kd of 1600nM).
We sincerely thank the reviewer for highlighting this important point. We fully recognize the fundamental importance of absolute and relative DA receptor kinetics for modeling DA actions and acknowledge that differences in affinity estimates from sensor-based measurements highlight the inherent uncertainty in selecting receptor kinetics parameters. While we have based our modeling decisions on what we believe to be the most relevant available data, we acknowledge that the choice of receptor kinetics is a topic of ongoing debate. Importantly, we are making our model available to the research community, allowing others to test their own estimates of receptor kinetics and assess their impact on the model’s behavior. In our revised manuscript, we will further discuss the rationale behind our parameter choices, including (i) our selection of a Kd value of 1000 nM for D1R (based on the observed affinities for D1R sensors) and an extrapolated Koff of 19.5 s⁻¹ (Labouesse & Patriarchi, 2021), and (ii) our use of a Kd value of 7 nM and an extrapolated Koff of 0.2 s⁻¹ for D2R, consistent with recent binding studies (Ågren et al., 2021).
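To make the consequences of these parameter choices concrete, the following is a minimal sketch of simple one-site binding kinetics using the Kd and Koff values quoted above. It is an illustration only and not the authors' simulation code: the on-rate is derived by assuming Kd = Koff / Kon, the 20 nM baseline DA concentration and the complete 1 s pause (DA set to zero) are hypothetical inputs, and all function names are invented for this example.

```python
import math

# Parameters quoted in the response above (Kd in nM, Koff in 1/s).
RECEPTORS = {
    "D1R": {"kd_nM": 1000.0, "koff_per_s": 19.5},
    "D2R": {"kd_nM": 7.0, "koff_per_s": 0.2},
}


def kon_per_nM_s(kd_nM, koff_per_s):
    """On-rate implied by Kd = Koff / Kon (assumes simple one-site binding)."""
    return koff_per_s / kd_nM


def equilibrium_occupancy(da_nM, kd_nM):
    """Bound fraction at steady state: [DA] / ([DA] + Kd)."""
    return da_nM / (da_nM + kd_nM)


def occupancy_after_step(b0, da_nM, kd_nM, koff_per_s, t_s):
    """Bound fraction t_s seconds after [DA] steps to a new constant value.

    Analytical solution of dB/dt = Kon*[DA]*(1 - B) - Koff*B,
    starting from bound fraction b0.
    """
    kon = kon_per_nM_s(kd_nM, koff_per_s)
    rate = kon * da_nM + koff_per_s          # relaxation rate in 1/s
    b_inf = equilibrium_occupancy(da_nM, kd_nM)
    return b_inf + (b0 - b_inf) * math.exp(-rate * t_s)


def unbinding_time_constant_s(koff_per_s):
    """Time constant of occupancy decay when [DA] drops to zero."""
    return 1.0 / koff_per_s


if __name__ == "__main__":
    baseline_da_nM = 20.0   # hypothetical baseline, for illustration only
    pause_s = 1.0           # complete pause: [DA] assumed to be 0 nM
    for name, p in RECEPTORS.items():
        b0 = equilibrium_occupancy(baseline_da_nM, p["kd_nM"])
        b1 = occupancy_after_step(b0, 0.0, p["kd_nM"], p["koff_per_s"], pause_s)
        tau_ms = 1000.0 * unbinding_time_constant_s(p["koff_per_s"])
        print(f"{name}: tau_off = {tau_ms:.0f} ms, "
              f"baseline = {b0:.3f}, after pause = {b1:.3f}")
```

Under these assumptions, the unbinding time constant is 1/19.5 s (about 51 ms) for D1R and 1/0.2 s (5 s) for D2R, so during a 1 s pause D2R occupancy can decay by at most a factor of exp(-0.2), roughly 0.82, whereas D1R occupancy relaxes almost completely. Changing `kd_nM` or `koff_per_s` in the dictionary shows how sensitive this behavior is to the chosen receptor kinetics.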
(7) Estimates of Vmax for DA uptake are entirely based on prior fast-scan voltammetry studies (Table S2). But FSCV likely produces distorted measures of uptake rate due to the kinetics of DA adsorption and release on the carbon fiber surface.
We fully agree that this is a limitation of FSCV. However, most of the cited papers attempt to correct for this by way of fitting the output to a multi-parameter model for DA kinetics. If newer literature brings the Vmax values estimated into question, we have made the model publicly available to rerun the simulations with new parameters.
(8) It is assumed that tortuosity is the same in DS and VS - is this a safe assumption?
The original paper cited does not specify which region the values are measured in. However, a separate paper estimates the rat cerebellum has a comparable tortuosity index (Nicholson and Phillips, J Physiol. (1981)), suggesting it may be a rather uniform value across brain regions.
(9) More discussion is needed about how the conclusions derived from this more elaborate model of DA dynamics are the same, and different, to conclusions drawn from prior relevant models (including those cited, e.g. from Hunger et al. 2020, etc).
As part of our revision, we will expand the current discussion of our findings in the context of previous models.
Reviewer #2 (Public review):
The work presents a model of dopamine release, diffusion, and reuptake in a small (100 micrometer^2 maximum) volume of striatum. This extends previous work by this group and others by comparing dopamine dynamics in the dorsal and ventral striatum and by using a model of immediate dopamine-receptor activation inferred from recent dopamine sensor data. From their simulations, the authors report two main conclusions. The first is that the dorsal striatum does not appear to have a sustained, relatively uniform concentration of dopamine driven by the constant 4Hz firing of dopamine neurons; rather that constant firing appears to create hotspots of dopamine. By contrast, the lower density of release sites and lower rate of reuptake in the ventral striatum creates a sustained concentration of dopamine. The second main conclusion is that D1 receptor (D1R) activation is able to track dopamine concentration changes at short delays but D2 receptor activation cannot.
The simulations of the dorsal striatum will be of interest to dopamine aficionados as they throw some doubt on the classic model of "tonic" and "phasic" dopamine actions, further show the disconnect between dopamine neuron firing and consequent release, and thus raise issues for the reward-prediction error theory of dopamine.
There is some careful work here checking the dependence of results on the spatial volume and its discretisation. The simulations of dopamine concentration are checked over a range of values for key parameters. The model is good, the simulations are well done, and the evidence for robust differences between dorsal and ventral striatum dopamine concentration is good.
However, the main weakness here is that neither of the main conclusions is strongly evidenced as yet. The claim that the dorsal striatum has no "tonic" dopamine concentration is based on the single example simulation of Figure 1 not the extensive simulations over a range of parameters. Some of those later simulations seem to show that the dorsal striatum can have a "tonic" dopamine concentration, though the measurement of this is indirect. It is not clear why the reader should believe the example simulation over those in the robustness checks, for example by identifying which range of parameter values is more realistic.
We appreciate that the reviewer finds our work interesting and carefully performed. The reviewer is correct that DA dynamics, including the presence and level of tonic DA, are parameter-dependent in both the dorsal striatum (DS) and ventral striatum (VS). Indeed, our simulations across a broad range of biological parameters were intended to help readers understand how such variation would impact the model’s outcomes, particularly since many of the parameters remain contested. Naturally, altering these parameters results in changes to the observed dynamics. However, to derive possible conclusions, we selected a subset of parameters that we believe best reflect the physiological conditions, as elaborated in the manuscript. This is eventually required in computational modelling of biological systems. In response to the reviewer’s comment, we will place greater emphasis on clarifying which parameter regimes produce a "tonic" versus "non-tonic" DA state in the DS. Additionally, we will underscore that the distinction between tonic and non-tonic states is not a binary outcome but a parameter-dependent continuum—one that our model now allows researchers to explore systematically. Finally, we will highlight how our simulations across parameter space not only capture this continuum but also identify the regimes that produce the most heterogeneous DA signaling, both within and across striatal regions.
The claim that D1Rs can track rapid changes in dopamine is not well supported. It is based on a single simulation in Figure 1 (DS) and 2 (VS) by visual inspection of simulated dopamine concentration traces - and even then it is unclear that D1Rs actually track dynamics because they clearly do not track rapid changes in dopamine that are almost as large as those driven by bursts (cf Figure 1i).
We would also like to draw attention to Fig. S1, where the claim that D1Rs track rapid changes is supported in more depth. According to this figure, upon coordinated burst firing, D1R occupancy rapidly increased as diffusion no longer equilibrated the extracellular concentrations on a timescale faster than the receptors, and D1R occupancy closely tracked extracellular DA with a delay on the order of tens of milliseconds. Note that the increases in [DA] from uncoordinated stochastic release events during tonic firing in Fig. 1i are too brief to drive D1 signaling, as the released DA diffuses into the remaining extracellular space on a timescale of 1-5 ms. This is faster than the receptors’ response rate, and does not lead to any downstream signaling according to our simulations. This means D1 kinetics are rapid enough to track coordinated signaling on a ~50 ms timescale and slower, but not fast enough to respond to individual release events from tonic activity. In our revised manuscript we will expand the discussion of this topic to provide greater clarity.
The claim also depends on two things that are poorly explained. First, the model of binding here is missing from the text. It seems to be a simple bound-fraction model, simulating a single D1 or D2 receptor. It is unclear whether more complex models would show the same thing.
We realize that this is not made clear in the Methods and, accordingly, we will update the Methods section to elaborate on how we model receptor binding. The model simulates the occupied fraction of D1R and D2R in every single voxel of the simulation space.
Second, crucial to the receptor model here is the inference that D1 receptor unbinding is rapid; but this inference is made based on the kinetics of dopamine sensors and is superficially explained - it is unclear why sensor kinetics should let us extrapolate to receptor kinetics, and unclear how safe is the extrapolation of the linear regression by an order of magnitude to get the D1 unbinding rate.
We chose to use the sensors because it was possible to estimate precise affinities/off-rates from the fluorescence measurements. Although there might be some variation in affinities attributable to the mutations introduced in the sensors, the data clearly separated D1R and D2R, with a D1R affinity of ~1000 nM and a D2R affinity of ~7 nM (Labouesse & Patriarchi, 2021), consistent with earlier predictions of receptor affinities. From our assessment of the literature, we found that this was the most reasonable way to estimate affinities and thereby off-rates. Importantly, the model has been made publicly available, so should new measurements arise, the simulations can be rerun with adjusted input parameters.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank the reviewers for their thoughtful feedback. Below we provide an initial response to the central concerns that they have raised. In general, as part of our revisions, we plan to perform additional analyses to strengthen our conclusions, tone down more speculative interpretations, and clarify the novel contributions of our work. A full, point-by-point reply will follow alongside the revised manuscript.
Briefly, the reviewers’ central concerns are that some of the conclusions are not sufficiently supported by the experimental evidence, specifically (1) the involvement of sharp-wave ripple (SWR)-unmodulated PFC neurons in signaling upcoming choice and (2) the absence of SWR time-locking of PFC non-local representations. They further suggest that (3) the spatial tuning in the PFC may reflect other cognitive processes rather than encoding spatial information; and (4) the manuscript is ambiguous as to which results are novel or corroborating previous work.
(1) SWR-unmodulated PFC neurons signaling upcoming choice
Reviewer 1 suggests that our finding that SWR-modulated neurons relate to hippocampal non-local representations contradicts the manuscript’s main conclusion. However, in our view, there is no contradiction and the finding highlights the distinction between the two sub-populations, namely the SWR-modulated neurons linked to hippocampal non-local representations, and the SWR-unmodulated neurons that are more active during prefrontal non-local representations.
We do agree with the reviewer that the observation of higher firing rates of SWR-unmodulated neurons in the expression of non-local representations does not mean that these neurons are the sole or even main contributors to the non-local decoding. To address both comments, we will perform additional analyses to further disentangle the contributions of SWR-modulated and SWR-unmodulated PFC neurons to the non-local representations of upcoming choice.
(2) Time-locking of PFC non-local representations to hippocampal SWRs
Reviewer 1 comments that in the analysis of time-locking to hippocampal SWRs and theta phase, the behavior of the animals needs to be taken into account (i.e., immobility or running). We confirm that this was indeed done in our analysis and we will clarify this point in the revised manuscript.
The reviewer further requested that PFC decoding during SWRs be performed at shorter timescales as in previous studies. We would like to point out that (1) we found no increase in non-local decoding in the PFC around SWR onset (see Fig 5a), and (2) most of the non-local representations in the PFC occurred during the expression of local representations in the hippocampus (see Fig 4d). These data suggest that the non-local representations in both brain regions are expressed independently. To further strengthen this idea, we plan to (1) include the result of decoding PFC activity during SWRs at fine timescales as the reviewer suggested, and (2) look at the firing rates of PFC neurons during non-local representations exclusively when the hippocampus is encoding the actual (local) position.
Following a suggestion by reviewer 2, we will also add a statistical assessment of how strongly the data supports the absence of time-locking.
(3) Spatial tuning in the mPFC
Reviewer 2 points out that the spatial tuning in the prefrontal cortex may be related to cognitive processes (e.g., attention or decision-making) rather than spatial encoding. However, our results show that decoded mPFC activity reliably differentiates between the two start and goal arms (Fig 4a), rate maps show little evidence of mirroring (Fig 3a), and the activity predicts turns in the cue-based task during which goal arms switch pseudo-randomly (meaning that the non-local representations encode the North and South arm alternatingly and correctly, rather than encoding a general rewarded goal arm; Fig. 4b). While it is likely that mPFC encodes several task-related variables, our data suggest that it also encodes distinct locations.
The reviewer further claims that the results of Jadhav et al. (2016) contradict our findings because they supposedly showed that mPFC neurons unmodulated by SWRs are less tuned to space. However, this is incorrect, as Jadhav et al. (2016) showed that SWR-unmodulated PFC neurons have lower spatial coverage and consequently are more spatially selective, which is consistent with our observations. We will rephrase this in the text to improve clarity.
(4) Novelty
We thank reviewer 2 for pointing out the significance of several novel findings in our work that deserve to be highlighted. This includes the dorsal-ventral profile of SWR-modulation and theta phase locking in the PFC and our observation that the neural representations in the PFC precede the behavioral switch in reversal learning. In our revised manuscript, we will rewrite the text to better emphasize our novel contributions, clearly distinguish new findings from confirmatory observations, and add missing citations where appropriate.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Hippocampal place cells display a sequence of firing activities when the animal travels through a spatial trajectory at a behavioral time scale of seconds to tens of seconds. Interestingly, parts of the firing sequence also occur at a much shorter time scale: ~120 ms within individual cycles of theta oscillation. These so-called theta sequences are originally thought to naturally result from the phenomenon of theta phase precession. However, there is evidence that theta sequences do not always occur even when theta phase precession is present, for example, during the early experience of a novel maze. The question is then how they emerge with experience (theta sequence development). This study presents evidence that a special group of place cells, those tuned to fast-gamma oscillations, may play a key role in theta sequence development.
The authors analyzed place cells, LFPs, and theta sequences as rats traveled a circular maze in repeated laps. They found that a group of place cells were significantly tuned to a particular phase of fast-gamma (FG-cells), in contrast to others that did not show such tuning (NFG-cells). The authors then omitted FG-cells or the same number of NFG-cells, in their algorithm of theta sequence detection and found that the quality of theta sequences, quantified by a weighted correlation, was worse with the FG-cell omission, compared to that with the NFG-cell omission, during later laps, but not during early laps. What made the FG-cells special for theta sequences? The authors found that FG-cells, but not NFG-cells, displayed phase precession to slow-gamma (25-45 Hz) oscillations (within theta cycles) during early laps (both FG- and NFG-cells showed slow-gamma phase precession during later laps). Overall, the authors conclude that FG-cells contribute to theta sequence development through slow-gamma phase precession during early laps.
How theta sequences are formed and developed during experience is an important question, because these sequences have been implicated in several cognitive functions of place cells, including memory-guided spatial navigation. The identification of FG-cells in this study is straightforward. Evidence is also presented for the role of these cells in theta sequence development. However, given several concerns elaborated below, whether the evidence is sufficiently strong for the conclusion needs further clarification, perhaps, in future studies.
We thank the reviewer for these positive comments.
(1) The results in Figure 3 and Figure 8 seem contradictory. In Figure 8, all theta sequences displayed a seemingly significant weighted correlation (above 0) even in early laps, which was mostly due to FG-cell sequences but not NFG-cell sequences (correlation for NFG-sequences appeared below 0). However, in Figure 3H, omitting FG-cells and omitting NFG-cells did not produce significant differences in the correlation. Conversely, FG-cell and NFG-cell sequences were similar in later laps in Figure 8 (NFG-cell sequences appeared even better than FG-cell sequences), yet omitting NFG-cells produced a better correlation than omitting FG-cells. This confusion may be related to how "FG-cell-dominant sequences" were defined, which is unclear in the manuscript. Nevertheless, the different results are not easy to understand.
We thank the reviewer for pointing out this important problem. The apparent contradiction can be explained by the different sets of sequences included in Fig 3 and Fig 8, as described below.
(1) In Fig 3, all sequences decoded without either FG or NFG cells were included, defined as exFG-sequences and exNFG-sequences. With this inclusive set, sequence development could not be observed during the early phase, and the weighted correlation was therefore low. (2) In Fig 8, however, only sequences in which either FG or NFG cells fired across at least 3 slow gamma cycles were included, defined as FG-cell sequences and NFG-cell sequences. This criterion allowed us to investigate the relationship between sequence development and slow gamma phase precession, because these sequences were contributed by cells likely to show slow gamma phase precession. These definitions have been added to the “Theta sequences detection” section of the Methods (Lines 606-619).
During the early phase, there was still no difference in weighted correlation between FG-cell sequences and NFG-cell sequences (Author response image 1A, Student’s t test, t(65)=0.2, p=0.8, Cohen's D=0.1), but the FG-cell sequences contained a high proportion of slow gamma phase precession (Fig 8F). During the late phase, both FG-cell sequences and NFG-cell sequences exhibited slow gamma phase precession, so their weighted correlations were both high, with no difference between them (Author response image 1B, Student’s t test, t(62)=-1.1, p=0.3, Cohen's D=0.3). This result further indicates that theta sequence development requires slow gamma phase precession, especially for FG cells during the early phase.
Author response image 1.
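For readers unfamiliar with the weighted correlation used to score theta sequences, the sketch below shows one commonly used definition: a Pearson correlation between time bin and position bin, weighted by the decoded posterior probability. It is an assumption that this matches the authors' exact implementation (bin sizes, restriction to a window around the animal's position, and normalization may differ); the function name and the toy matrices are hypothetical.

```python
import math


def weighted_correlation(prob):
    """Weighted Pearson correlation between position bin and time bin.

    prob[i][j] is the decoded probability of position bin i at time bin j.
    Values near +1 indicate a forward-sweeping sequence, values near -1 a
    reverse sweep, and values near 0 no ordered structure.
    """
    total = sum(sum(row) for row in prob)
    if total <= 0:
        raise ValueError("probability matrix must contain positive mass")

    cells = [(i, j, p) for i, row in enumerate(prob) for j, p in enumerate(row)]

    # Probability-weighted means of position and time.
    mean_x = sum(p * i for i, _, p in cells) / total
    mean_t = sum(p * j for _, j, p in cells) / total

    # Probability-weighted covariance and variances.
    cov = sum(p * (i - mean_x) * (j - mean_t) for i, j, p in cells) / total
    var_x = sum(p * (i - mean_x) ** 2 for i, _, p in cells) / total
    var_t = sum(p * (j - mean_t) ** 2 for _, j, p in cells) / total

    if var_x == 0 or var_t == 0:
        return 0.0  # degenerate case: all mass in one row or one column
    return cov / math.sqrt(var_x * var_t)


if __name__ == "__main__":
    # Toy posterior: decoded position advances by one bin per time bin.
    forward = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    print(weighted_correlation(forward))
```

Because the score depends only on the decoded probability matrix, omitting a subset of cells before decoding (as in the exFG and exNFG analyses) changes the score only through its effect on that matrix.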
(2) The different contributions between FG-cells and NFG-cells to theta sequences are supposed not to be caused by their different firing properties (Figure 5). However, Figure 5D and E showed a large effect size (Cohen's D = 0.7, 0.8), although not significant (P = 0.09, 0.06). But the seemingly non-significant P values could be simply due to smaller N's (~20). In other parts of the manuscript, the effect sizes were comparable or even smaller (e.g. D = 0.5 in Figure 7B), but interpreted as positive results: P values were significant with large N's (~480 in Fig. 7B). Drawing a conclusion purely based on a P value while N is large often renders the conclusion only statistical, with unclear physical meaning. Although this is common in neuroscience publications, it makes more sense to at least make multiple inferences using similar sample sizes in the same study.
We thank the reviewer for this kind suggestion. We now make multiple inferences using similar sample sizes wherever possible. For Fig 7B, we repeated the statistical analysis with sessions as samples and found that the result remained significant. These results have been added to the revised manuscript (Lines 269-270), and Fig 7B has been replaced accordingly.
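The reviewer's point about sample size can be illustrated with a short sketch relating Cohen's d to the two-sample t statistic. This is a generic illustration, not a reanalysis of the manuscript's data: the group sizes of 20 and 480 are treated as per-group N purely for illustration, and the p-value uses a normal approximation, which is rough (slightly too small) for small samples.

```python
import math
from statistics import NormalDist, mean, stdev


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def t_from_d(d, n_a, n_b):
    """Student's t statistic implied by effect size d and the group sizes."""
    return d * math.sqrt(n_a * n_b / (n_a + n_b))


def approx_two_sided_p(t):
    """Normal approximation to the two-sided p-value (rough for small n)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


if __name__ == "__main__":
    d = 0.5  # the same effect size in both scenarios
    for n in (20, 480):  # hypothetical per-group sample sizes
        t = t_from_d(d, n, n)
        print(f"n = {n:3d} per group: t = {t:.2f}, approx p = {approx_two_sided_p(t):.3g}")
```

With the effect size held fixed, the t statistic grows with the square root of the sample size, so an identical d can be non-significant at small N and highly significant at large N. This is why comparing inferences at similar sample sizes, as done in the revised Fig 7B analysis, makes the conclusions easier to interpret.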
(3) In supplementary Figure 2 - S2, FG-cells displayed stronger theta phase precession than NFG-cells, which could be a major reason why FG-cells impacted theta sequences more than NFG cells. Although factors other than theta phase precession may contribute to or interfere with theta sequences, stronger theta phase precession itself (without the interference of other factors), by definition, can lead to stronger theta sequences.
This is a very good point. The finding that FG-cells displayed stronger theta phase precession than NFG-cells is consistent with the finding of Guardamagna et al., 2023 Cell Rep, that the theta phase precession pattern emerged with strong fast gamma. Since slow gamma phase precession occurs within theta cycles, it is hard to assess the contribution of these factors to theta sequence development without taking theta phase precession into account. It should be noted, however, that theta sequences could fail to develop even when theta phase precession was present from the very beginning of exploration (Feng et al., 2025 J Neurosci). These findings suggest that theta phase precession, together with other factors, affects theta sequence development. However, the weight of each factor and their interactions still need further investigation. We have discussed this possibility in the Discussion section (Lines 361-373).
(4) The slow-gamma phase precession of FG-cells during early laps is supposed to mediate or contribute to the emergence of theta sequences during late laps (Figure 1). The logic of this model is unclear. The slow-gamma phase precession was present in both early and late laps for FG-cells, but only present in late laps for NFG-cells. It seems more straightforward to hypothesize that the difference in theta sequences between early and later laps is due to the difference in slow-gamma phase precession of NFG cells between early and late laps. Although this is not necessarily the case, the argument presented in the manuscript is not easy to follow.
We thank the reviewer for pointing this out. Slow gamma phase precession was first reported in my previous publication (Zheng et al., 2016 Neuron), where it was interpreted as a temporally compressed way of coding spatial information related to memory retrieval. In this case, we would expect slow gamma phase precession to occur in all cells during late laps, because spatial information is retrieved once rats have become familiar with the environment. However, during early laps, when novel information has only just been encoded, there would be a balance between fast gamma and slow gamma modulation of cells ahead of the upcoming encoding-retrieval transition. One possibility is that FG-cells support this balance by receiving both fast gamma and slow gamma modulation, but with distinct phase-coding modes (fast gamma phase locking and slow gamma phase precession) in a temporally coordinated manner. We have discussed this possibility in the Discussion section (Lines 415-428).
(5) There are several questions on the description of methods, which could be addressed to clarify or strengthen the conclusions.
(i) Were the identified fast- and slow-gamma episodes mutually exclusive?
Yes, the fast- and slow-gamma episodes are mutually exclusive. We have added descriptions in the “Detection of gamma episodes” section in the Methods part (Lines 538-550).
(ii) Was the task novel when the data were acquired? How many days (from the 1st day of the task) were included in the analysis? When the development of the theta sequence was mentioned, did it mean the development in a novel environment, in a novel task, or purely in a sense of early laps (Lap 1, 2) on each day?
We thank the reviewer for pointing this out. The task was not novel to the rats in this dataset, because only days with recording quality good enough for sequence decoding were included in this paper, which were about days 2-10 for each rat. However, we still observed the process of sequence formation because of the rats' interest in exploration during early laps. Thus, when the development of theta sequences is mentioned, it refers to early laps on each day.
(iii) How were the animals' behavioral parameters equalized between early and later laps? For example, speed or head direction could potentially produce the differences in theta sequences.
This is a very good point. Regarding the effect of running speed on theta sequences, we quantified the running speeds during theta sequences across trials 1-5. We found that the rats ran at a stable speed, as reported in Fig. 3F. Regarding the effect of head direction on theta sequences, we measured the angle difference between head direction and running direction. We found that the angle difference for each lap was distributed around 0, with no significant difference across laps (Fig. S3, Watson-Williams multi-sample test, F(4,55)=0.2, p=0.9, partial η<sup>2</sup>=0.01). These results indicate that the differences in theta sequences across trials cannot be explained by variability in behavioral parameters. We have added these results and the corresponding methods to the revised manuscript (Lines 172-175, Lines 507-511, with a new Fig. S3).
Reviewer #2 (Public Review):
This manuscript addresses an important question that has not yet been solved in the field, what is the contribution of different gamma oscillatory inputs to the development of "theta sequences" in the hippocampal CA1 region? Theta sequences have received much attention due to their proposed roles in encoding short-term behavioral predictions, mediating synaptic plasticity, and guiding flexible decision-making. Gamma oscillations in CA1 offer a readout of different inputs to this region and have been proposed to synchronize neuronal assemblies and modulate spike timing and temporal coding. However, the interactions between these two important phenomena have not been sufficiently investigated. The authors conducted place cell and local field potential (LFP) recordings in the CA1 region of rats running on a circular track. They then analyzed the phase locking of place cell spikes to slow and fast gamma rhythms, the evolution of theta sequences during behavior, and the interaction between these two phenomena. They found that place cells with the strongest modulation by fast gamma oscillations were the most important contributors to the early development of theta sequences and that they also displayed a faster form of phase precession within slow gamma cycles nested with theta. The results reported are interesting and support the main conclusions of the authors. However, the manuscript needs significant improvement in several aspects regarding data analysis, description of both experimental and analytical methods, and alternative interpretations, as I detail below.
• The experimental paradigm and recordings should be explained at the beginning of the Results section. Right now, there is no description whatsoever which makes it harder to understand the design of the study.
We thank the reviewer for this kind suggestion. A description of the experimental paradigm and recordings has been added to the beginning of the Results section (Lines 114-119).
• An important issue that needs to be addressed is the very small fraction of CA1 cells phase-locked to slow gamma rhythms (3.7%). This fraction is much lower than in many previous studies, which typically report it in the range of 20-50%. However, this discrepancy is not discussed by the authors. This needs to be explained and additional analysis considered. One analysis that I would suggest, although there are also other valid approaches, is to, instead of just analyzing the phase locking in two discrete frequency bands, compute the phase locking with all LFP frequencies from 25-100 Hz. This will offer a more comprehensive and unbiased view of the gamma modulation of place cell firing. Alternative metrics to mean vector length that are less sensitive to firing rates, such as the pairwise phase consistency index (Vinck et al., Neuroimage, 2010), could be implemented. This may reveal whether the low fraction of phase-locked cells could be due to a low number of spikes entering the analysis.
We thank the reviewer for this constructive suggestion. A previous study, also in Long-Evans rats, showed that the proportion of slow gamma phase-locked cells was ~20% during novelty exploration but dropped to ~10% during familiar exploration (Fig. 4E, Kitanishi et al., 2015 Neuron). This suggests that the proportion of slow gamma phase-locked cells may decrease with familiarity with the environment, which is consistent with our data. In addition, we calculated the pairwise phase consistency (PPC) index to assess the effect of spike counts on MVL. The trends of PPC (Author response image 2A) and MVL (Author response image 2B) across frequency bands were consistent across different subsets of cells, suggesting that the classification of cell subsets by the MVL metric was not biased by low spike numbers. These results further shed light on the contribution of slow gamma phase precession of place cells to theta sequence development.
Author response image 2.
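The difference between the two phase-locking metrics discussed above can be sketched in a few lines. This is a minimal illustration on simulated, unlocked spike phases, not the authors' pipeline; the function names and the simulation settings are assumptions, and PPC is computed here in the pairwise-cosine form described by Vinck et al. (2010).

```python
import cmath
import math
import random

def mean_vector_length(phases):
    """Length of the mean resultant vector of spike phases (radians)."""
    resultant = sum(cmath.exp(1j * p) for p in phases)
    return abs(resultant) / len(phases)

def pairwise_phase_consistency(phases):
    """Mean cosine of the phase difference over all distinct spike pairs.

    Uses sum_{j != k} cos(p_j - p_k) = |sum_j exp(i p_j)|^2 - n,
    so no explicit double loop over pairs is needed.
    """
    n = len(phases)
    if n < 2:
        raise ValueError("PPC needs at least two spikes")
    resultant_sq = abs(sum(cmath.exp(1j * p) for p in phases)) ** 2
    return (resultant_sq - n) / (n * (n - 1))

def average_over_random_cells(n_spikes, n_cells=2000, seed=0):
    """Average MVL and PPC across simulated cells with no phase locking."""
    rng = random.Random(seed)
    mvl_sum = ppc_sum = 0.0
    for _ in range(n_cells):
        phases = [rng.uniform(-math.pi, math.pi) for _ in range(n_spikes)]
        mvl_sum += mean_vector_length(phases)
        ppc_sum += pairwise_phase_consistency(phases)
    return mvl_sum / n_cells, ppc_sum / n_cells

for n_spikes in (10, 100, 1000):
    mvl, ppc = average_over_random_cells(n_spikes, n_cells=200)
    print(f"{n_spikes:5d} spikes: mean MVL = {mvl:.3f}, mean PPC = {ppc:.3f}")
```

For spikes with no phase preference, MVL is inflated at low spike counts, whereas PPC stays near zero on average. This is the property that makes PPC a useful check on MVL-based classification.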
• From the methods, it is not clear to me whether the reference LFP channel was consistently selected to be a different one than that from which the analyzed spikes were taken. This is the better practice to reduce the contribution of spike leakage that could substantially inflate the coupling with faster gamma frequencies. These analyses need to be described in more detail.
We thank the reviewer for pointing this out. In the main manuscript, we used local LFPs, i.e., LFPs recorded from the same tetrode as the cells. In addition, for each rat we selected a single tetrode located in stratum pyramidale at the center of the drive bundle. Using LFPs from this tetrode, we detected a proportion of FG-cells similar to that obtained with local LFPs (Author response image 3A-B, Chi-squared test, χ<sup>2</sup>=0.9, p=0.4, Cramér's V=0.03). We further found that the PPC of FG- and NFG-cells differed in the fast gamma band when using central LFPs (Author response image 3D), consistent with the result obtained with local LFPs (Author response image 3C). Therefore, these results suggest that the findings related to fast gamma were not due to spike leakage into the local LFPs. We have updated the description in the manuscript (Lines 553-557, 566-568).
Author response image 3.
• The initial framework of the authors of classifying cells into fast gamma and not fast gamma modulated implies a bimodality that may be artificial. The authors should discuss the nuances and limitations of this framework. For example, several previous studies have shown that the same place cell can couple to different gamma oscillations (e.g., Lasztóczi et al., Neuron, 2016; Fernandez-Ruiz et al., Neuron, 2017; Sharif et al., Neuron, 2021).
We thank the reviewer for this kind suggestion. We have cited these references and discussed the possibility of bimodal phase-locking in the manuscript (Lines 430-433).
• It would be useful to provide a more thorough characterization of the physiological properties of FG and NFG cells, as this distinction is the basis of the paper. Only very little characterization of some place cell properties is provided in Figure 5. Important characteristics that should be very feasible to compare include average firing rate, burstiness, estimated location within the layer (i.e., deep vs superficial sublayers) and along the transverse axis (i.e., proximal vs distal), theta oscillation frequency, phase precession metrics (given their fundamental relationship with theta sequences), etc.
We thank the reviewer for this constructive suggestion. In addition to the characterizations shown in Fig5, we also analyzed firing rate, anatomical location and theta modulation to compare the physiological properties of FG- and NFG-cells.
In terms of the firing properties of both types of cells, we found that the mean firing rate of FG-cells was higher than that of NFG-cells (Fig. 5A, Student's t-test, t(22) = 2.1, p = 0.04, Cohen's D = 0.9), which is consistent with a previous study showing that the firing rate was higher during fast gamma than during slow gamma (Zheng et al., 2015 Hippocampus). However, the spike counts of excluded FG- and NFG-cells for decoding were similar (Fig. 5B, Student's t-test, t(22) = 1.2, p = 0.3, Cohen's D = 0.5), suggesting that the differences found in theta sequences cannot be accounted for by differences in decoding quality related to spike counts. In addition, we measured burstiness based on the distribution of inter-spike intervals and found that the bursting probability of spikes was not significantly different between FG and NFG cells (Author response image 4A, Student's t-test, t(22) = 0.6, p=0.5, Cohen's d=0.3).
In terms of theta modulation of cells, we first compared the theta frequency related to the firing of FG and NFG cells. We detected the instantaneous theta frequency at each spike time of FG and NFG cells, and found that it was not significantly different between cell types (Author response image 4B, Student's t-test, t(22) = -0.5, p=0.6, Cohen's d=0.2). In addition, we found that the proportion of cells with significant theta phase precession was greater in FG-cells than in NFG-cells (Fig. S2E). However, the slope and starting phase of theta phase precession were not significantly different between FG and NFG cells (Author response image 4C, Student's t-test, t(21) = 0.3, p=0.8, Cohen's d=0.1; Author response image 4D, Watson-Williams test, F(1,21)=0.5, p=0.5, partial η<sup>2</sup>=0.02).
In terms of the anatomical location of FG and NFG cells, we identified tetrode traces in slices for each cell. We found that both FG and NFG cells were recorded from the deep layer of dorsal CA1, with no difference in proportions between cell types (Author response image 4E, Chi-squared test, χ<sup>2</sup>=0.5, p=0.5, Cramér's V=0.05). The distribution of FG-cells and NFG-cells along the transverse axis was also similar between cell types (Author response image 4F, χ<sup>2</sup>=0.08, p=0.8, Cramér's V=0.02).
Author response image 4.
• It is not clear to me how the analysis in Figure 6 was performed. In Figure 6B I would think that the grey line should connect with the bottom white dot in the third panel, which would be the interpretation of the results.
We thank the reviewer for raising this good point. The grey line was intended only as a visual guide, not as a quantitative analysis. We have removed the grey lines from all heat maps in Fig. 6.
Reviewer #3 (Public Review):
[Editors' note: This review contains many criticisms that apply to the whole sub-field of slow/fast gamma oscillations in the hippocampus, as opposed to this particular paper. In the editors' view, these comments are beyond the scope of any single paper. However, they represent a view that, if true, should contextualise the interpretation of this paper and all papers in the sub-field. In doing so, they highlight an ongoing debate within the broader field.]
Summary:
The authors aimed to elucidate the role of dynamic gamma modulation in the development of hippocampal theta sequences, utilizing the traditional framework of "two gammas," a slow and a fast rhythm. This framework is currently being challenged, necessitating further analyses to establish and secure the assumed premises before substantiating the claims made in the present article.
The results are too preliminary and need to integrate contemporary literature. New analyses are required to address these concerns. However, by addressing these issues, it may be possible to produce an impactful manuscript.
We thank the reviewer for raising these important questions in the hippocampal gamma field. We have done a lot of new analyses according to the comments to strengthen our manuscript.
I. Introduction
Within the introduction, multiple broad assertions are conveyed that serve as the premise for the research. However, equally important citations that are not mentioned potentially contradict the ideas that serve as the foundation. Instances of these are described below:
(1) Are there multiple gammas? The authors launched the study on the premise that two different gamma bands are communicated from CA3 and the entorhinal cortex. However, recent literature suggests otherwise, offering that the slow gamma component may be related to theta harmonics:
From a review by Etter, Carmichael and Williams (2023)
"Gamma-based coherence has been a prominent model for communication across the hippocampal-entorhinal circuit and has classically focused on slow and fast gamma oscillations originating in CA3 and medial entorhinal cortex, respectively. These two distinct gammas are then hypothesized to be integrated into hippocampal CA1 with theta oscillations on a cycle-to-cycle basis (Colgin et al., 2009; Schomburg et al., 2014). This would suggest that theta oscillations in CA1 could serve to partition temporal windows that enable the integration of inputs from these upstream regions using alternating gamma waves (Vinck et al., 2023). However, these models have largely been based on correlations between shifting CA3 and medial entorhinal cortex to CA1 coherence in theta and gamma bands. In vivo, excitatory inputs from the entorhinal cortex to the dentate gyrus are most coherent in the theta band, while gamma oscillations would be generated locally from presumed local inhibitory inputs (Pernía-Andrade and Jonas, 2014). This predominance of theta over gamma coherence has also been reported between hippocampal CA1 and the medial entorhinal cortex (Zhou et al., 2022). Another potential pitfall in the communication-through-coherence hypothesis is that theta oscillations harmonics could overlap with higher frequency bands (Czurkó et al., 1999; Terrazas et al., 2005), including slow gamma (Petersen and Buzsáki, 2020). The asymmetry of theta oscillations (Belluscio et al., 2012) can lead to harmonics that extend into the slow gamma range (Scheffer-Teixeira and Tort, 2016), which may lead to a misattribution as to the origin of slow-gamma coherence and the degree of spike modulation in the gamma range during movement (Zhou et al., 2019)."
And from Benjamin Griffiths and Ole Jensen (2023)
"That said, in both rodent and human studies, measurements of 'slow' gamma oscillations may be susceptible to distortion by theta harmonics [53], meaning open questions remain about what can be attributed to 'slow' gamma oscillations and what is attributable to theta."
This second statement should be heavily considered as it is from one of the original authors who reported the existence of slow gamma.
Yet another instance from Schomburg, Fernández-Ruiz, Mizuseki, Berényi, Anastassiou, Christof Koch, and Buzsáki (2014):
"Note that modulation from 20-30 Hz may not be related to gamma activity but, instead, reflect timing relationships with non-sinusoidal features of theta waves (Belluscio et al., 2012) and/or the 3rd theta harmonic."
One of this manuscript's authors is Fernández-Ruiz, a contemporary proponent of the multiple gamma theory. Thus, the modulation to slow gamma offered in the present manuscript may actually be related to theta harmonics.
With the above emphasis from proponents of the slow/fast gamma theory on disambiguating harmonics from slow gamma, our first suggestion to the authors is that they A) address these statements (citing the work of these authors in their manuscript) and B) demonstrably quantify theta harmonics in relation to slow gamma prior to making assertions of phase relationships (methodological suggestions below). As the frequency of theta harmonics can extend as high as 56 Hz (PMID: 32297752), overlapping with the slow gamma range defined here (25-45 Hz), it will be important to establish an approach that decouples the two phenomena using an approach other than an arbitrary frequency boundary.
We agree with the reviewer that the theta oscillations harmonics could overlap with higher frequency bands including slow gamma, as the above reviews discussed. In order to rule out the possibility of theta harmonics effects in this study, we added new analyses in this letter (see below).
(2) Can gammas be segregated into different lamina of the hippocampus? This idea appears to be foundational in the premise of the research but is also undergoing revision.
As discussed by Etter et al. above, the initial theory of gamma routing was launched on coherence values. However, the values reported by Colgin et al. (2009) lean more towards incoherence (a value of 0) rather than coherence (1), suggesting a weak to negligible interaction. Nevertheless, this theory is coupled with the idea that the different gamma frequencies are exclusive to the specific lamina of the hippocampus.
Recently, Douchamps et al. (2024) suggested a broader, more nuanced understanding of gamma oscillations than previously thought, emphasizing their wide range and variability across hippocampal layers. This perspective challenges the traditional dichotomy of gamma sub-bands (e.g., slow vs. medium gamma) and their associated cognitive functions based on a more rigid classification according to frequency and phase relative to the theta rhythm. Moreover, they observed all frequencies across all layers.
Similarly, the current source density plots from Belluscio et al. (2012) suggest that SG and FG can be observed in both the radiatum and lacunosum-moleculare.
Therefore, if the initial coherence values are weak to negligible and both slow and fast gamma are observed in all layers of the hippocampus, can the different gammas be exclusively related to either anatomical inputs or psychological functions (as done in the present manuscript)? Do these observations challenge the authors' premise of their research? At the least, please discuss.
We thank the reviewer for raising this point, which I believe remains controversial in this field. We also thank the reviewer for providing detailed evidence on the forms in which gamma rhythms exist. The reviewer considered two aspects of gamma: 1) the validity of dividing slow and fast gamma by specific frequency bands; 2) the existence of gamma across all hippocampal layers, which challenges the functional significance of different types of gamma rhythms. Although the results in Douchamps et al., 2024 challenged the idea of rigid gamma sub-bands, separate slow and fast gamma components can still be seen occurring exclusively of one another over time, with the central frequency of slow gamma below ~60 Hz and the central frequency of fast gamma above ~60 Hz (Fig. 1b of Douchamps et al., 2024). This was also seen in the rat dataset of this reference (Fig. S3). Since their behavioral test required both memory encoding and retrieval processes, it was hard to distinguish the roles of different gamma components, as they may coordinate dynamically during complex memory processes. Thus, although behavioral performance can be decoded from a broad range of gamma, we still cannot deny the existence of different gamma rhythms and their functional significance during different memory phases.
(3) Do place cells, phase precession, and theta sequences require input from afferent regions? It is offered in the introduction that "Fast gamma (~65-100Hz), associated with the input from the medial entorhinal cortex, is thought to rapidly encode ongoing novel information in the context (Fernandez-Ruiz et al., 2021; Kemere, Carr, Karlsson, & Frank, 2013; Zheng et al., 2016)".
Studies showing that CA1 place fields remain fairly intact following MEC inactivation include Ipshita Zutshi, Manuel Valero, Antonio Fernández-Ruiz, and György Buzsáki (2022) - "CA1 place cells and assemblies persist despite combined mEC and CA3 silencing" and Hadas E Sloin, Lidor Spivak, Amir Levi, Roni Gattegno, Shirly Someck, Eran Stark (2024) - "These findings are incompatible with precession models based on inheritance, dual-input, spreading activation, inhibition-excitation summation, or somato-dendritic competition. Thus, a precession generator resides locally within CA1."
These publications, at the least, challenge the inheritance model by which the afferent input controls CA1 place field spike timing. The research premise offered by the authors is couched in the logic of inheritance, when the effect that the authors are observing could be governed by local intrinsic activity (e.g., phase precession and gamma are locally generated, and the attribution to routed input is perhaps erroneous). Certainly, it is worth discussing these manuscripts in the context of the present manuscript.
We thank the reviewer for this discussion. The main purpose of our current study is to investigate the mechanism of theta sequence development during learning, which may or may not depend on theta phase precession of single place cells, as this remains controversial in the field. In addition, a limitation of this study is that all gamma components were recorded from stratum pyramidale, so we cannot draw any conclusion about the origin of the gamma that modulates sequence development.
II. Results
(1) Figure 2-
a. There is a bit of a puzzle here that should be discussed. If slow and fast frequencies modulate 25% of neurons, how can these rhythms serve as mechanisms of communication/support psychological functions? For instance, if fast gamma is engaged in rapid encoding (line 72) and slow gamma is related to the integration processing of learned information (line 84), and these are functions of the hippocampus, then why do these rhythms modulate so few cells? Is this to say 75% of CA1 neurons do not listen to CA3 or MEC input?
The ~25% figure refers to the proportion of place cells phase-locked to either slow or fast gamma. However, one of the main findings of this study was that most cells were modulated by slow gamma, in that their spikes showed slow gamma phase precession within a theta cycle (Figs 6-8), which would promote information compression for theta sequence development. Therefore, we did not mean that only a small proportion of cells were modulated by gamma rhythms and contributed to this process.
b. Figure 2. It is hard to know if the mean vector lengths presented are large or small. Moreover, one can expect to find significance due to chance. For instance, it is challenging to find a frequency in which modulation strength is zero (please see Figure 4 of PMID: 30428340 or Figure 7 of PMID: 31324673).
i. Please construct the histograms of Mean Vector Length as in the above papers, using 1 Hz filter steps from 1-120Hz and include it as part of Figure 2 (i.e., calculate the mean vector length for the filtered LFP in steps of 1-2 Hz, 2-3 Hz, 3-4 Hz,... etc). This should help the authors portray the amount of modulation these neurons have relative to the theta rhythm and other frequencies. If the theta mean vector length is higher, should it be considered the primary modulatory influence of these neurons (with slow and fast gammas as a minor influence)?
We thank the reviewer for this suggestion. We measured the mean vector length in 5 Hz steps (equivalent to 1 Hz steps), and we found that the FG-cells were phase-locked to fast gamma rhythms even more strongly than to theta (Author response image 2B, mean MVL of theta=0.126±0.007, mean MVL of fast gamma=0.175±0.006, paired t-test, t(112)=-5.9, p=0.01, Cohen's d=0.7). In addition, in some previous studies reporting significant fast gamma phase locking, the MVL values were around 0.15 when a broad gamma band was used (Kitanishi et al., 2015 Neuron, Lasztóczi et al., 2016 Neuron, Tomar et al., 2021 Front Behav Neurosci, and Asiminas et al., 2022 Molecular Autism), which is consistent with the value in this study. Therefore, we do not believe that fast gamma was only a minor influence on these neurons.
ii. It is possible to infer a neuron's degree of oscillatory modulation without using the LFP. For instance, one can create an ISI histogram as done in Figure 1 here (https://www.biorxiv.org/content/10.1101/2021.09.20.461152v3.full.pdf+html; "Distinct ground state and activated state modes of firing in forebrain neurons"). The reciprocal of the ISI values would be "instantaneous spike frequency". In favor of the Douchamps et al. (2024) results, the figure of the bioRxiv paper implies that there is a single gamma frequency modulation, as there is only a single bump in the ISIs in the 10^-1.5 to 10^-2 range. Therefore, to vet the slow gamma results and the premise of two gammas offered in the introduction, it would be worth including this analysis as part of Figure 2.
Using the suggested method, we calculated the ISI distribution on a log scale for FG-cells and NFG-cells during behavior (Author response image 5). The ISI distribution of FG-cells had a bump in the 10<sup>-1.5</sup> to 10<sup>-2</sup> range (black bar), in particular in the fast gamma range (10<sup>-2</sup> to 10<sup>-1.8</sup>).
Author response image 5.
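The mapping between log-scaled ISIs and instantaneous spike frequency used in this exchange can be checked with a short sketch. It assumes the ISIs are expressed in seconds, which the text does not state explicitly; the function names, bin range, and bin width are illustrative choices rather than the authors' settings.

```python
import math

def log10_isi_to_hz(log10_isi_s):
    """Instantaneous spike frequency (Hz) for an ISI given as log10(seconds)."""
    return 1.0 / (10.0 ** log10_isi_s)

def log_isi_histogram(spike_times_s, lo=-3.0, hi=1.0, bin_width=0.1):
    """Histogram of log10(ISI) for one cell; returns (bin_left_edges, counts)."""
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    times = sorted(spike_times_s)
    for earlier, later in zip(times, times[1:]):
        isi = later - earlier
        if isi <= 0:
            continue  # skip duplicate timestamps
        idx = int((math.log10(isi) - lo) // bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    edges = [lo + i * bin_width for i in range(n_bins)]
    return edges, counts

for exponent in (-1.5, -1.8, -2.0):
    print(f"ISI = 10^{exponent} s -> {log10_isi_to_hz(exponent):.1f} Hz")
```

If the ISIs are in seconds, the 10<sup>-2</sup> to 10<sup>-1.8</sup> window corresponds to roughly 63-100 Hz, which overlaps the fast gamma band (~65-100 Hz) quoted from the manuscript.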
c. There are some things generally concerning about Figure 2.
i. First, the raw trace does not seem to have clear theta epochs (it is challenging to ascertain the start and end of a theta cycle). Certainly, it would be worth highlighting the relationship between theta and the gammas and picking a nice theta epoch.
We thank the reviewer for this suggestion. We have updated this figure with a clearer theta epoch in the revised manuscript.
ii. Also, in panel A, there looks to be a declining amplitude relationship between the raw, fast, and slow gamma traces, assuming that the scale bars represent 100uV in all three traces. The raw trace is significantly larger than the fast gamma. However, this relationship does not seem to be the case in panel B (in which both the raw and unfiltered examples of slow and fast gamma appear to be equal; the right panels of B suggest that fast gamma is larger than slow, appearing to contradict the A= 1/f organization of the power spectral density). Please explain as to why this occurs. Including the power spectral density (see below) should resolve some of this.
We thank the reviewer for pointing this out. The y-axis scales of the LFP traces in Fig. 2B were not consistent, which made the amplitude comparison between slow and fast gamma misleading. We have unified the y-axis scales across gamma types in the revised manuscript. Moreover, we have also replaced these examples with more typical ones (see also the response below).
iii. Within the example of spiking to phase in the left side of Panel B (fast gamma example)- the neuron appears to fire near the trough twice, near the peak twice, and somewhere in between once. A similar relationship is observed for the slow gamma epoch. One would conclude from these plots that the interaction of the neuron with the two rhythms is the same. However, the mean vector lengths and histograms below these plots suggest a different story in which the neuron is modulated by FG but not SG. Please reconcile this.
We thank the reviewer for pointing this out. We found that fast gamma phase locking was robust across FG-cells, with the fast gamma peak as the preferred phase. Therefore, we have replaced these examples with more typical ones, so that the examples are consistent with the group effect.
iv. For calculating the MVL, it seems that the number of spikes that the neuron fires would play a significant role. Working towards our next point, there may be a bias of finding a relationship if there are too few spikes (spurious clustering due to sparse data) and/or higher coupling values for higher firing rate cells (cells with higher firing rates will clearly show a relationship), forming a sort of inverse Yerkes-Dodson curve. Also, without understanding the magnitude of the MVL relative to other frequencies, it may be that these values are indeed larger than zero, but not biologically significant.
- Please provide a scatter plot of Neuron MVL versus the Neuron's Firing Rate for 1) theta (7-9 Hz), 2) slow gamma, and 3) fast gamma, along with their line of best fit.
- Please run a shuffle control where the LFP trace is shifted by random values between 125-1000ms and recalculate the MVL for theta, slow, and fast gamma. Often, these shuffle controls are done between 100-1000 times (see cross-correlation analyses of Fujisawa, Buzsaki et al.).
- To establish that firing rate does not play a role in uncovering modulation, it would be worth conducting a spike number control, reducing the number of spikes per cell so that they are all equal before calculating the phase plots/MVL.
We thank the reviewer for raising this point. Besides the MVL value, we also calculated the pairwise phase consistency (PPC), as suggested by Reviewer 2, which is not sensitive to spike counts. We found that the phase locking strength to either rhythm (theta or gamma) was comparable between MVL and PPC measurements (Author response image 2). Moreover, we quantified the relationship between MVL and mean firing rate, as suggested. We found that the MVL values for theta, slow gamma and fast gamma were negatively correlated with mean firing rate (Author response image 6, Pearson correlation, theta: R<sup>2</sup>=0.06, Pearson’s r=-0.3, p=1.3×10<sup>-8</sup>; slow gamma: R<sup>2</sup>=0.1, Pearson’s r=-0.4, p=2.4×10<sup>-17</sup>; fast gamma: R<sup>2</sup>=0.03, Pearson’s r=-0.2, p=4.3×10<sup>-5</sup>). These results help us rule out concerns about the effect of spike counts on the phase modulation measurement.
Author response image 6.
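To illustrate why PPC is described as insensitive to spike count while MVL is not, the sketch below draws spike phases with no phase locking at all and averages both measures over repeated draws. It assumes the standard pairwise definition of PPC (the mean cosine of all pairwise phase differences); the spike counts and number of repeats are arbitrary choices for the example.

```python
import cmath
import math
import random


def mvl(phases):
    """Mean vector length: |mean of unit phase vectors|."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def ppc(phases):
    """Pairwise phase consistency: mean cos(theta_i - theta_j) over all pairs.

    Uses the identity |sum e^{i theta}|^2 = N + 2 * sum_{i<j} cos(theta_i - theta_j).
    """
    n = len(phases)
    resultant = abs(sum(cmath.exp(1j * p) for p in phases))
    return (resultant ** 2 - n) / (n * (n - 1))


rng = random.Random(1)


def average(metric, n_spikes, n_repeats=500):
    """Average a metric over repeated draws of uniformly random phases."""
    total = 0.0
    for _ in range(n_repeats):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_spikes)]
        total += metric(phases)
    return total / n_repeats


# With no true locking, MVL is inflated at low spike counts; PPC stays near 0.
for n_spikes in (10, 50, 250):
    print(n_spikes, round(average(mvl, n_spikes), 3), round(average(ppc, n_spikes), 3))
```

For large spike counts PPC approaches the square of the MVL, so the two measures rank well-sampled cells similarly.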
(2) Something that I anticipated to see addressed in the manuscript was the study from Grosmark and Buzsaki (2016): "Cell assembly sequences during learning are "replayed" during hippocampal ripples and contribute to the consolidation of episodic memories. However, neuronal sequences may also reflect preexisting dynamics. We report that sequences of place-cell firing in a novel environment are formed from a combination of the contributions of a rigid, predominantly fast-firing subset of pyramidal neurons with low spatial specificity and limited change across sleep-experience-sleep and a slow-firing plastic subset. Slow-firing cells, rather than fast-firing cells, gained high place specificity during exploration, elevated their association with ripples, and showed increased bursting and temporal coactivation during postexperience sleep. Thus, slow- and fast-firing neurons, although forming a continuous distribution, have different coding and plastic properties."
My concern is that much of the reported results in the present manuscript appear to recapitulate the observations of Grosmark and Buzsaki, but without accounting for differences in firing rate. A parsimonious alternative explanation for what is observed in the present manuscript is that high firing rate neurons, more integrated into the local network and orchestrating local gamma activity (PING), exhibit more coupling to theta and gamma. In this alternative perspective, it's not something special about how the neurons are entrained to the routed fast gamma, but that the higher firing rate neurons are better able to engage and entrain their local interneurons and, thus modulate local gamma. However, this interpretation challenges the discussion around the importance of fast gamma routed from the MEC.
a. Please integrate the Grosmark & Buzsaki paper into the discussion.
b. Also, please provide data that refutes or supports the alternative hypothesis in which the high firing rate cells are just more gamma modulated as they orchestrate local gamma activity through monosynaptic connections with local interneurons (e.g., Marshall et al., 2002, Hippocampal pyramidal cell-interneuron spike transmission is frequency dependent and responsible for place modulation of interneuron discharge). Otherwise, the attribution to a MEC routed fast gamma routing seems tenuous.
c. It is mentioned that fast-spiking interneurons were removed from the analysis. It would be worth including these cells, calculating the MVL in 1 Hz increments as well as the reciprocal of their ISIs (described above).
We thank the reviewer for this suggestion. Because we found that the mean firing rate of FG-cells was higher than that of NFG-cells, it is possible that FG-cells largely overlap with the fast-firing (rigid) cells in Grosmark et al., 2016 Science. However, in this study we aimed to investigate how fast and slow gamma rhythms dynamically modulate neurons during learning, rather than to define new cell types. Thus, we do not think this work is just a replication of the previous publication. We have added this description to the Discussion (Lines 439-441). In addition, we do not have enough interneurons to support an analysis of the interactions between interneurons and place cells. Therefore, we cannot make any statement in this study about where the fast gamma originated (locally in CA1 or routed from the MEC).
(3) Methods - Spectral decomposition and Theta Harmonics.
a. It is challenging to interpret the exact parameters that the authors used for their multi-taper analysis in the methods (lines 516-526). Tallon-Baudry et al., (1997; Oscillatory γ-Band (30-70 Hz) Activity Induced by a Visual Search Task in Humans) discuss a time-frequency trade-off where frequency resolution changes with different temporal windows of analysis. This trade-off between time and frequency resolution is well known as the uncertainty principle of signal analysis, transcending all decomposition methods. It is not only a function of wavelet or FFT, and multi-tapers do not directly address this. (The multitaper method, by using multiple specially designed tapers -like the Slepian sequences- smooths the spectrum. This smoothing doesn't eliminate leakage but distributes its impact across multiple estimates). Given the brevity of methods and the issues of theta harmonics as offered above, it is worth including some benchmark trace testing for the multi-taper as part of the supplemental figures.
i. Please spectrally decompose an asymmetric 8 Hz sawtooth wave showing the trace and the related power spectral density using the multiple taper method discussed in the methods.
ii. Please also do the same for an elliptical oscillation (perfectly symmetrical waves, but also capable of casting harmonics). Matlab code on how to generate this time series is provided below:
A = 1; % Amplitude
T = 1/8; % Period corresponding to 8 Hz frequency
omega = 2*pi/T; % Angular frequency
C = 1; % Wave speed
m = 0.9; % Modulus for the elliptic function (0<m<1 for cnoidal waves)
x = linspace(0, 2*pi, 1000); % temporal domain
t = 0; % Time instant
% Calculate B based on frequency and speed
B = sqrt(omega/C);
% Cnoidal wave equation using the Jacobi elliptic function
u = A .* ellipj(B.*(x - C*t), m).^2;
% Plotting the cnoidal wave
figure;
plot(x./max(x), u);
title('8 Hz Cnoidal Wave');
xlabel('time (x)');
ylabel('Wave amplitude (u)');
grid on;
The Symbolic Math Toolbox needs to be installed and accessible in your MATLAB environment to use ellipj. Otherwise, I trust that, rather than plotting a periodic orbit around a circle (sin wave) the authors can trace the movement around an ellipse with significant eccentricity (the distance between the two foci should be twice the distance between the co-vertices).
We thank the reviewer for this suggestion. In the main text of the manuscript, we only applied the Morlet wavelet method to calculate the time-varying power of rhythms. The multitaper method was used for the estimation of power spectra across running speeds, which was shown in the manuscript. Therefore, we removed the description of the multitaper method and updated the Morlet wavelet power spectral analysis in the Methods (Lines 541-544).
As suggested, we estimated the power spectral densities of an 8 Hz sawtooth and an elliptical oscillation using these methods, and compared them with the results from the FFT. We found that both the multitaper and Morlet wavelet methods captured the 8 Hz oscillatory component well (Author response image 7). However, we could observe harmonic components in the FFT spectrum.
Author response image 7.
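The harmonic content at issue in this benchmark can be reproduced with a plain discrete Fourier transform. The sketch below is illustrative only: the sampling rate, the 2 s duration and the 80/20 rise/fall asymmetry of the sawtooth are assumptions, and the transform is evaluated at a handful of frequencies rather than with any of the toolboxes named in the response.

```python
import cmath
import math

fs = 1000.0          # assumed sampling rate (Hz)
duration = 2.0       # seconds, so 8 Hz fits an integer number of cycles
n = int(fs * duration)
f0 = 8.0             # fundamental frequency (Hz)


def asymmetric_sawtooth(t, rise_fraction=0.8):
    """Wave that rises for rise_fraction of each cycle and falls for the rest."""
    x = (f0 * t) % 1.0
    if x < rise_fraction:
        return -1.0 + 2.0 * x / rise_fraction
    return 1.0 - 2.0 * (x - rise_fraction) / (1.0 - rise_fraction)


signal = [asymmetric_sawtooth(i / fs) for i in range(n)]


def power_at(freq):
    """Squared magnitude of the normalized DFT coefficient at one frequency."""
    coeff = sum(x * cmath.exp(-2j * math.pi * freq * i / fs)
                for i, x in enumerate(signal))
    return abs(coeff / n) ** 2


# Power sits at 8 Hz and its integer multiples, and is absent in between.
for freq in (8.0, 12.0, 16.0, 24.0, 32.0):
    print(f"{freq:5.1f} Hz  power = {power_at(freq):.6f}")
```

Power at 16, 24 and 32 Hz arises purely from the waveform shape, which is why a spectral peak in the slow gamma range is not by itself evidence of an independent oscillator.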
iii. Line 522: "The power spectra across running speeds and absolute power spectrum (both results were not shown).". Given the potential complications of multi-taper discussed above, and as each convolution further removes one from the raw data, it would be the most transparent, simple, and straightforward to provide power spectra using the simple fft.m code in Matlab (We imagine that the authors will agree that the results should be robust against different spectral decomposition methods. Otherwise, it is concerning that the results depend on the algorithm implemented and should be discussed. If gamma transience is a concern, the authors should trigger to 2-second epochs in which slow/fast gamma exceeds 3-7 std. dev. above the mean, comparing those resulting power spectra to 2-second epochs with ripples - also a transient event). The time series should be at least 2 seconds in length (to avoid spectral leakage issues and the issues discussed in Tallon-Baudry et al., 1997 above).
Please show the unmolested power spectra (Y-axis units in mV2/Hz, X-axis units as Hz) as a function of running speed (increments of 5 cm/s) for each animal. I imagine three of these PSDs for 3 of the animals will appear in supplemental methods while one will serve as a nice manuscript figure. With this plot, please highlight the regions that the authors are describing as theta, slow, and fast gamma. Also, any issues should be addressed should there be notable differences in power across animals or tetrodes (issues with locations along proximal-distal CA1 in terms of MEC/LEC input and using a local reference electrode are discussed below).
As suggested, we first estimated the power spectra as a function of running speed in each running lap, and showed them separately for each rat, using the multitaper spectral analysis (Author response image 8). In addition, to obtain unmolested power spectra, the short-time Fourier transform (STFT) was used for this analysis at the same frequency resolution (Author response image 9). We could see that the power spectra were consistent between these two methods. Notably, there seems to be no significant theta harmonic component in the slow gamma band range.
The multitaper spectral analysis was performed as follows. The power spectra were measured across different running speeds as described previously (Ahmed et al., 2012 J Neurosci; Zheng et al., 2015 Hippocampus; Zheng et al., 2016 eNeuro). Briefly, the absolute power spectrum was calculated for a 0.5 s moving window with a 0.2 s step size on the LFP recordings in each lap, using the multitaper spectral analysis in the Chronux toolbox (Mitra and Bokil, 2008, http://chronux.org/) and the STFT spectral analysis in the Matlab script stft.m. In the multitaper method, the time-bandwidth product parameter (TW) was set at 3, and the number of tapers (K) was set at 5. In the STFT method, the FFT length was set at 2048, which was equivalent to the parameters used in the multitaper method. Running speed was calculated (see “Estimation of running speed and head direction” section in the manuscript) and averaged within each 0.5 s time window corresponding to the LFP segments. Then, the absolute power at each frequency was smoothed with a Gaussian kernel centered on a given speed bin. The power spectra as a function of running speed and frequency were plotted in log scale. Also, the colormap was in log scale, allowing for comparisons across different frequencies that would otherwise be difficult due to the 1/f decay of power in physiological signals.
Author response image 8.
Author response image 9.
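The time-frequency trade-off behind these parameters can be made concrete with a short calculation. The sketch below assumes the usual Slepian-taper convention, in which the spectral half-bandwidth is W = TW / T and at most 2·TW - 1 tapers are well concentrated; the helper names are hypothetical and the 2 s and 4 s windows are included only because they are the lengths the reviewer requested.

```python
def rayleigh_resolution_hz(window_s):
    """Smallest frequency separation resolvable by a window of this length."""
    return 1.0 / window_s


def multitaper_half_bandwidth_hz(time_bandwidth, window_s):
    """Half-bandwidth W (Hz) over which a multitaper estimate is smoothed."""
    return time_bandwidth / window_s


def max_useful_tapers(time_bandwidth):
    """Number of well-concentrated Slepian tapers, 2*TW - 1."""
    return int(2 * time_bandwidth - 1)


window_s = 0.5   # window length stated in the response
tw = 3           # time-bandwidth product stated in the response

print(multitaper_half_bandwidth_hz(tw, window_s))   # 6.0 Hz on either side
print(max_useful_tapers(tw))                        # 5, matching K = 5
for w in (0.5, 2.0, 4.0):
    print(w, rayleigh_resolution_hz(w))             # 2.0, 0.5, 0.25 Hz
```

Under these assumptions each multitaper estimate averages over roughly ±6 Hz, which is comparable to the 8 Hz spacing between successive theta harmonics; this is one reason longer windows are preferred when the goal is to separate harmonics from slow gamma.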
iv. Schomberg and colleagues (2014) suggested that the modulation of neurons in the slow gamma range could be related to theta harmonics (see above). Harmonics can often extend almost indefinitely as they regress into the 1/f background (contributing to power, but without a peak above the power spectral density slope), making arbitrary frequency limits inappropriate. Therefore, in order to support the analyses and assertions regarding slow gamma, it seems necessary to calculate a "theta harmonic/slow gamma ratio". Aru et al. (2015; Untangling cross-frequency coupling in neuroscience) offer that: "The presence of harmonics in the signal should be tested by a bicoherence analysis and its contribution to CFC should be discussed." Please test both the synthetic signals above and the raw LFP, using temporal windows of greater than 4 seconds (again, the large window optimizes for frequency resolution in the time-frequency trade-off) to calculate the bicoherence. As harmonics are integer multiples of theta coupled to itself and slow gamma is also coupled to theta, a nice illustration and contribution to the field would be a method that uses the bispectrum to isolate and create a "slow gamma/harmonic" ratio.
We thank the reviewer for providing the method regarding theta harmonics. We first measured the theta harmonics in the synthesized signal using the bicoherence method, and we could clearly observe the nonlinear coupling between the theta rhythm and its harmonics (Author response image 10).
Author response image 10.
In addition, we also measured the bicoherence on raw traces during slow gamma episodes. We did not see nonlinear coupling between slow gamma and theta bands in this real data (mean bicoherence=0.1±0.0002) compared with that in the synthesized signal (mean bicoherence=0.7 for elliptical waves and 0.5 for sawtooth waves), suggesting that the slow gamma detected in this study was not purely a theta harmonic (Author response image 11C, F, I, in red boxes). Therefore, we believe that the contribution of theta harmonics to slow gamma is not significant.
Author response image 11.
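The logic of the bicoherence test can be sketched on synthetic segments. This is a minimal illustration under stated assumptions, not the authors' analysis: it uses one common normalization of the bispectrum (the squared-modulus form bounded by 1), evaluates a single frequency pair (8 Hz, 8 Hz), and compares a sawtooth, whose harmonics are phase-locked to the fundamental, with a control in which 8 Hz and 16 Hz components have independent random phases in every segment. The sampling rate, segment length and noise level are arbitrary.

```python
import cmath
import math
import random

rng = random.Random(2)
fs = 500.0          # assumed sampling rate (Hz)
seg_len = 500       # 1 s segments
n_segments = 50
f0 = 8.0


def dft_at(segment, freq):
    """Single-frequency DFT coefficient of one segment."""
    return sum(x * cmath.exp(-2j * math.pi * freq * i / fs)
               for i, x in enumerate(segment))


def bicoherence(segments, f1, f2):
    """|sum X(f1) X(f2) X*(f1+f2)| normalized to lie between 0 and 1."""
    num = 0j
    den_a = 0.0
    den_b = 0.0
    for seg in segments:
        x1, x2, x12 = dft_at(seg, f1), dft_at(seg, f2), dft_at(seg, f1 + f2)
        num += x1 * x2 * x12.conjugate()
        den_a += abs(x1 * x2) ** 2
        den_b += abs(x12) ** 2
    return abs(num) / math.sqrt(den_a * den_b)


def sawtooth_segment():
    """8 Hz sawtooth with a random start phase plus weak noise."""
    offset = rng.random()
    return [2.0 * ((f0 * i / fs + offset) % 1.0) - 1.0 + rng.gauss(0.0, 0.1)
            for i in range(seg_len)]


def independent_segment():
    """8 Hz and 16 Hz sinusoids whose phases are unrelated to each other."""
    p1 = rng.uniform(0.0, 2.0 * math.pi)
    p2 = rng.uniform(0.0, 2.0 * math.pi)
    return [math.sin(2.0 * math.pi * f0 * i / fs + p1)
            + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0 * i / fs + p2)
            + rng.gauss(0.0, 0.1)
            for i in range(seg_len)]


saw = bicoherence([sawtooth_segment() for _ in range(n_segments)], f0, f0)
ctrl = bicoherence([independent_segment() for _ in range(n_segments)], f0, f0)
print(f"sawtooth: {saw:.2f}   independent components: {ctrl:.2f}")
```

Both signals carry power at 16 Hz, but only the sawtooth shows high bicoherence, because only there is the 16 Hz phase determined by the 8 Hz phase. Absolute values depend on the normalization, noise level and number of segments, so they are not directly comparable with the figures quoted in the response.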
(4) I appreciate the inclusion of the histology for the 4 animals. Knierim and colleagues describe a difference in MEC projection along the proximal-distal axis of the CA1 region (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3866456/)- "There are also differences in their direct projections along the transverse axis of CA1, as the LEC innervates the region of CA1 closer to the subiculum (distal CA1), whereas the MEC innervates the region of CA1 closer to CA2 and CA3 (proximal CA1)". From the histology, it looks like some of the electrodes are in the part of CA1 that would be dominated by LEC input while a few are closer to where the MEC would project.
a. How do the authors control for these differences in projections? Wouldn't this change whether or not fast gamma is observed in CA1?
b. I am only aware of one manuscript that describes slow gamma in the LEC which appeared in contrast to fast gamma from the MEC (https://www.science.org/doi/10.1126/science.abf3119). One would surmise that the authors in the present manuscript would have varying levels of fast gamma in their CA1 recordings depending on the location of the electrodes in the Proximal-distal axis, to the extent that some of the more medial tetrodes may need to be excluded (as they should not have fast gamma, rather they should be exclusively dominated by slow gamma). Alternatively, the authors may find that there is equal fast gamma power across the entire proximal-distal axis. However, this would pose a significant challenge to the LEC/slow gamma and MEC/fast gamma routing story of Fernandez-Ruiz et al. and require reconciliation/discussion.
c. Is there a difference in neuron modulation to these frequencies based on electrode location in CA1?
We thank the reviewer for this concern, which was also raised by Reviewer 2. We aligned the physical locations of LFP channels along the proximal-distal axis based on histology. In our dataset, only 2 rats were recorded from both distal and proximal hippocampus, so we calculated the gamma power from both sites in these rats. We found that slow gamma power was higher at proximal tetrodes than at distal tetrodes (Author response image 12, repeated measures ANOVA, F(1,7)=10.2, p=0.02, partial η<sup>2</sup>=0.8). However, fast gamma power was similar between recording sites (F(1,7)=0.008, p=0.9, partial η<sup>2</sup>=0.001). These results are partially consistent with the LEC/slow gamma and MEC/fast gamma routing story of Fernandez-Ruiz’s work. The main reason would be that all LFPs were recorded from tetrodes in stratum pyramidale, the deep layer in particular (Author response image 4E), so that it was hard to precisely identify their distance to distal/proximal apical dendrites.
Author response image 12.
In terms of the anatomical location of FG and NFG cells, we identified tetrode traces in slices for each cell. We found that both FG and NFG cells were recorded from the deep layer of dorsal CA1, with no difference in proportions between cell types (Author response image 4E, Chi-squared test, χ<sup>2</sup>=0.5, p=0.5, Cramér's V=0.05). The distribution of FG-cells and NFG-cells along the transverse axis was also similar between cell types (Author response image 4F, χ<sup>2</sup>=0.08, p=0.8, Cramér's V=0.02).
(5) Given a comment in the discussion (see below), it will be worth exploring changes in theta, theta harmonic, slow gamma, and fast gamma power with running speed as no changes were observed with theta sequences or lap number versus. Notably, Czurko et al., report an increase in theta and harmonic power with running speed (1999) while Ahmed and Mehta (2012) report a similar effect for gamma.
a. Please determine if the oscillations change in power and frequency of the rhythms discussed above change with running speed using the same parameters applied in the present manuscript. The specific concern is that how the authors calculate running speed is not sensitive enough to evaluate changes.
We thank the reviewer for this suggestion. The description of running speed quantification has been updated in the Methods (see “Estimation of running speed and head direction” section, Lines 501-511). Overall, the sampling frequency of running speed was 25 Hz, which should be sensitive enough to evaluate the behavioral changes.
By measuring rhythmic power as a function of running speed (Author response image 8 and Author response image 9), we could observe that theta power increased with running speed. Consistent with the results in (Ahmed and Mehta, 2012) and our previous study (Zheng et al., 2015), fast gamma power increased and slow gamma power decreased as running speed increased.
In addition, we also estimated the rhythmic frequency as a function of running speed in the slow and fast episodes respectively. We found that fast gamma frequency increased with running speed (Author response image 13, linear regression, R<sup>2</sup>=0.4, corr=0.6, p=9.9×10<sup>-15</sup>), whereas slow gamma frequency decreased with running speed (R<sup>2</sup>=0.2, corr=-0.4, p=8.8×10<sup>-6</sup>). Although a significant correlation was found between gamma frequency and running speed, consistent with previous studies, the frequency change (~70-75Hz for fast gamma and ~30-28Hz for slow gamma) was not big enough to affect the sequence findings in this study. In addition, theta frequency was maintained in both slow episodes (R<sup>2</sup>=0.02, corr=-0.1, p=0.1) and fast episodes (R<sup>2</sup>=0.004, corr=0.06, p=0.5), consistent with results in Fig. 1G of Kropff et al., 2021 Neuron.
Author response image 13.
b. It is astounding that animals ran as fast as they did in what appears to be the first lap (Figure 3F), especially as rats' natural proclivity is thigmotaxis and inquisitive exploration in novel environments. Can the authors expand on why they believe their rats ran so quickly on the first lap in a novel environment and how to replicate this? Also, please include the individual values for each animal on the same plot.
We thank the reviewer for pointing this out. The task was not brand new to the rats in this dataset, because only days with good enough recording quality for sequence decoding were included in this paper, which were about days 2-10 for each rat. However, we still observed the process of sequence formation because of the rats’ interest in exploration during early laps. Thus, in terms of exploration behavior, the rats ran at relatively high speeds across laps (Author response image 14, each gray line represents the running speed within an individual session).
Author response image 14.
c. Can the authors explain how the statistics on line 169 (F(4,44)) work? Specifically, it is challenging to determine how the degrees of freedom were calculated in this case and throughout if there were only 4 animals (reported in methods) over 5 laps (depicted in Figure 3F. Given line 439, it looks like trials and laps are used synonymously). Four animals over 5 laps should have a DOF of 16.
This statistical test was performed with each session/day as a sample (n=12 sessions/days). The statistics were generated by repeated measures ANOVA on 5 trials in 12 sessions, giving error degrees of freedom of (5-1)×(12-1)=44.
(6) Throughout the manuscript, I am concerned about an inflation of statistical power. For example on line 162, F(2,4844). The large degrees of freedom indicate that the sample size was theta sequences or a number of cells. Since multiple observations were obtained from the same animal, the statistical assumption of independence is violated. Therefore, the stats need to be conducted using a nested model as described in Aarts et al. (2014; https://pubmed.ncbi.nlm.nih.gov/24671065/). A statistical consult may be warranted.
We thank the reviewer for this suggestion. We have replaced this statistical result with one obtained from a generalized linear mixed model with ratID as a covariate. These results have been updated in the revised manuscript (Lines 164-167).
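The reviewer's concern about inflated statistical power can be illustrated with the standard design-effect approximation for clustered data, n_eff = N / (1 + (m - 1)·ICC), where m is the number of observations per cluster. The sketch below is hypothetical throughout: it assumes that F(2,4844) came from a one-way design with 3 groups (so N = 4847), that observations are split equally across the 4 animals, and an intraclass correlation of 0.1 chosen purely for illustration. None of these values is taken from the manuscript.

```python
def effective_sample_size(n_obs, n_clusters, icc):
    """Effective number of independent observations under equal cluster sizes.

    n_obs      total number of observations
    n_clusters number of clusters (here, animals)
    icc        intraclass correlation, between 0 and 1 (assumed, not measured)
    """
    per_cluster = n_obs / n_clusters
    design_effect = 1.0 + (per_cluster - 1.0) * icc
    return n_obs / design_effect


n_obs = 4844 + 3      # hypothetical N implied by error df 4844 with 3 groups
n_animals = 4

for icc in (0.0, 0.01, 0.1, 0.5):
    n_eff = effective_sample_size(n_obs, n_animals, icc)
    print(f"ICC = {icc:4.2f}  effective N = {n_eff:7.1f}")
```

With an ICC of 0.1 the several thousand observations carry roughly the information of about 40 independent ones, which is the motivation for nested or mixed models in which the animal is treated as a grouping factor.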
(7) It is stated that one tetrode served as a quiet recording reference. The "quiet" part is an assumption when often, theta and gamma can be volume conducted to the cortex (e.g., Sirota et al., 2008; This is often why laboratories that study hippocampal rhythms use the cerebellum for the differential recording electrode and not an electrode in the corpus callosum). Generally, high frequencies propagate as well as low frequencies in the extracellular milieu (https://www.eneuro.org/content/4/1/ENEURO.0291-16.2016). For transparency, the authors should include a limitation paragraph in their discussion that describes how their local tetrode reference may be inadvertently diminishing and/or distorting the signal that they are trying to isolate. Otherwise, it would be worth hearing an explanation as to how the author's approach avoids this issue.
In terms of the locations of references, we had 2 screws above the cerebellum in the skull connected to the recording drive ground, and 1 tetrode in a quiet area of the cortex serving as the recording reference. We agree that theta and gamma can be volume conducted to the cortex, which may affect the power of these rhythms in the stratum pyramidale. However, we did not intend to measure or compare the absolute theta or gamma power in this study, as we were only concerned with the gamma phase modulation of place cells. Therefore, we believe the location of the recording reference would not significantly affect our conclusions.
Apologetically, this review is already getting long. Moreover, I have substantial concerns that should be resolved prior to delving into the remainder of the analyses. e.g., the analyses related to Figure 3-5 assert that FG cells are important for sequences. However, the relationship to gamma may be secondary to either their relationship to theta or, based on the Grosmark and Buzsaki paper, it may just be a phenomenon coupled to the fast-firing cells (fast-firing cells showing higher gamma modulation due to a local PING dynamic). Moreover, the observation of slow gamma is being challenged as theta harmonics, even by the major proponents of the slow/fast gamma theory. Therefore, the report of slow gamma precession would come as an unsurprising extension should they be revealed to be theta harmonics (however, no control for harmonics was implemented; suggestions were made above). Following these amendments, I would be grateful for the opportunity to provide further feedback.
III. Discussion.
a. Line 330- it was offered that fast gamma encodes information while slow gamma integrates in the introduction. However, in a task such as circular track running (from the methods, it appears that there is no new information to be acquired within a trial), one would guess that after the first few laps, slow gamma would be the dominant rhythm. Therefore, one must wonder why there are so few neurons modulated by slow gamma (~3.7%).
The proportion of ~3.7% refers to the place cells that were phase-locked to slow gamma. However, our aim was to show that slow gamma phase precession of place cells promoted theta sequence development. We would not expect cells to be phase-locked to slow gamma if phase precession occurred.
b. Line 375: The authors contend that: "...slow gamma, related to information compression, was also required to modulate fast gamma phase-locked cells during sequence development. We replicated the results of slow gamma phase precession at the ensemble level (Zheng et al., 2016), and furthermore observed it at late development, but not early development, of theta sequences." In relation to the idea that slow gamma may be coupled to - if not a distorted representation of - theta harmonics, it has been observed that there are changes in theta relative to novelty.
i. A. Jeewajee, C. Lever, S. Burton, J. O'Keefe, and N. Burgess (2008) report a decrease in theta frequency in novel circumstances that disappears with increasing familiarity.
ii. One could surmise that this change in frequency is associated with alterations in theta harmonics (observed here as slow gamma), challenging the author's interpretation.
iii. Therefore, the authors have a compelling opportunity to replicate the results of Jeewajee et al., characterizing changes of theta along with the development of slow gamma precession, as the environment becomes familiar. It will become important to demonstrate, using bicoherence as offered by Aru et al., how slow gamma can be disambiguated from theta harmonics. Specifically, we anticipate that the authors will be able to quantify A) theta harmonics (the number, and their respective frequencies and amplitudes), B) the frequency and amplitude of slow gamma, and C) how they can be quantitatively decoupled. Through this, their discussion of oscillatory changes with novelty-familiarity will garner a significant impact.
We think we have demonstrated that the slow gamma observed in this study was not purely theta harmonics. We didn’t focus on the frequency change of slow gamma or theta rhythms in this study. Further investigation will be carried out on this topic in the future.
c. Broadly, it is interesting that the authors emphasize the gamma frequency throughout the discussion. Given that the power spectral density of the Local Field Potential (LFP) exhibits a log-log relationship between amplitude and frequency, as described by Buzsáki (2005) in "Rhythms of the Brain," and considering that the LFP is primarily generated through synaptic transmembrane currents (Buzsáki et al., 2012), it seems parsimonious to consider that the bulk of synaptic activity occurs at lower frequencies (e.g., theta). Since synaptic transmission represents the most direct form of inter-regional communication, one might wonder why gamma (characterized by lower amplitude rhythms) is esteemed so highly compared to the higher amplitude theta rhythm. Why isn't the theta rhythm, instead, regarded as the primary mode of communication across brain regions? A discussion exploring this question would be beneficial.
We thank the reviewer for this thoughtful comment. In stating our conclusions on gamma rhythms, we did not mean to diminish the role of the theta rhythm. On the contrary, the fast and slow gamma episodes were detected riding on theta rhythms, and we believe that information compression should occur at a finer scale within a theta cycle. More investigation will be carried out on this topic in the future.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) It is helpful to clearly define "FG-cell sequences" before the relevant results are described in the Results section. More importantly, the seemingly conflicting results between Figure 3 and Figure 8 may need to be clarified.
The “exFG-sequences and exNFG sequences”, “FG-cell sequences and NFG-cell sequences” have been defined clearly in the revised manuscript. Moreover, the seemingly conflicting results between Figure 3 and Figure 8 have been interpreted properly.
(2) It is helpful to clearly state the N and what defines a sample whenever a result is described.
For each statistical result, the N and what defines a sample have been clarified in the revised manuscript.
(3) Addressing the questions regarding the methods (#5) would clarify some of the results.
The questions regarding the Methods have been addressed in the revised manuscript.
(4) Line #244: "successful" should be "successive"?
Fixed.
Reviewer #2 (Recommendations For The Authors):
- The writing of the manuscript can be substantially improved.
The manuscript has been substantially revised and updated.
- I noticed that the last author of the manuscript is not the lead or corresponding and has only provided a limited contribution to this work (according to the detailed author contributions). The second to last author seems to be the main senior intellectual contributor and supervisor, together with the third to last author. This speaks of potential bad academic practices where a senior person whose intellectual contribution to the study is relatively minor takes the last author position, against the standard conventions on authorship worldwide. I strongly suggest that this is corrected.
We thank the reviewer for raising this issue. The last author, Dr. Ming, was also a senior author who supervised this project and made a substantial contribution. We have corrected his role to co-corresponding author in the revised manuscript.
-
-
arxiv.org arxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
This paper concerns mechanisms of foraging behavior in C. elegans. Upon removal from food, C. elegans first executes a stereotypical local search behavior in which it explores a small area by executing many random, undirected reversals and turns called "reorientations." If the worm fails to find food, it transitions to a global search in which it explores larger areas by suppressing reorientations and executing long forward runs (Hills et al., 2004). At the population level, the reorientation rate declines gradually. Nevertheless, about 50% of individual worms appear to exhibit an abrupt transition between local and global search, which is evident as a discrete transition from high to low reorientation rate (Lopez-Cruz et al., 2019). This observation has given rise to the hypothesis that local and global search correspond to separate internal states with the possibility of sudden transitions between them (Calhoun et al., 2014). The main conclusion of the paper is that it is not necessary to posit distinct internal states to account for discrete transitions from high to low reorientation rates. On the contrary, discrete transitions can occur simply because of the stochastic nature of the reorientation behavior itself.
Strengths:
The strength of the paper is the demonstration that a more parsimonious model explains abrupt transitions in the reorientation rate.
Weaknesses:
(1) Use of the Gillespie algorithm is not well justified. A conventional model with a fixed dt and an exponentially decaying reorientation rate would be adequate and far easier to explain. It would also be sufficiently accurate - given the appropriate choice of dt - to support the main claims of the paper, which are merely qualitative. In some respects, the whole point of the paper - that discrete transitions are an epiphenomenon of stochastic behavior - can be made with the authors' version of the model having a constant reorientation rate (Figure 2f).
We apologize, but we are not sure what the reviewer means by “fixed dt”. If the reviewer means taking discrete steps in time (dt), and modeling whether a reorientation occurs, we would argue that the Gillespie algorithm is a better way to do this because it provides floating-point precision time resolution, rather than a time resolution limited by dt, which we hopefully explain in the comments below.
The reviewer is correct that discrete transitions are an epiphenomenon of stochastic behavior, as we show in Figure 2f. However, abrupt stochastic jumps that occur at a constant rate do not produce persistent changes in the observed rate, because the rate is by definition constant. The theory that there are local and global searches is based on the observation that individual worms often abruptly change their rates, but this observation holds for only a fraction of worms. We argue that it is not observed for all, or even most, worms because the apparent transitions result from stochastic sampling, not from a sudden change in search strategy.
(2) In the manuscript, the Gillespie algorithm is very poorly explained, even for readers who already understand the algorithm; for those who do not it will be essentially impossible to comprehend. To take just a few examples: in Equation (1), omega is defined as reorientations instead of cumulative reorientations; it is unclear how (4) follows from (2) and (3); notation in (5), line 133, and (7) is idiosyncratic. Figure 1a does not help, partly because the notation is unexplained. For example, what do the arrows mean, what does "*" mean?
We apologize for this. You are correct: Ω is cumulative reorientations, and we will edit the text as follows:
Experimentally, reorientation rate is measured as the number of reorientation events that occurred in an observational window. However, these are discrete stochastic events, so we should describe them in terms of propensity, i.e. the probability of observing a transitional event (in this case, a reorientation) is:
Here, P(Ω+1,t) is the probability of observing a reorientation event at time t, and a₁ is the propensity for this event to occur. Observationally, the frequency of reorientations observed decays over time, so we can define the propensity as:
Where α is the initial propensity at t=0.
We can model this decay as the reorientation propensity coupled to a decaying factor (M):
Where the propensity of this event (a₂) is:
Since M is a first-order decay process, when integrated, the cumulative M observed is:
We can couple the probability of observing a reorientation to this decay by redefining a₁ as:
So that now:
A critical detail should be noted. While reorientations are modeled as discrete events, the amount of M at time t=0 is chosen to be large (M₀←1,000), so that over the timescale of 40 minutes, the decay in M is practically continuous. This ensures that sudden changes in reorientations are not due to sudden changes in M, but due to the inherent stochasticity of reorientations.
To model both processes, we can create the master equation:
Since these are both Poisson processes, the probability density function for a state change i occurring after a time interval τ is:
The probability that an event will not occur in time interval τ is:
The probability that no events will occur for ALL transitions in this time interval is:
We can draw a random number (r₁ ∈ [0,1]) that represents the probability of no events in time interval τ, so that this time interval can be assigned by rearranging equation 11:
where:
This is the time interval for any event (Ω+1 or M-1) happening at t + τ. The probability of which event occurs is proportional to its propensity:
We can draw a second number (r₂ ∈ [0,1]) that represents this probability so that which event occurs at time t + τ is determined by the smallest n that satisfies:
so that:
The elegant efficiency of the Gillespie algorithm is two-fold. First, it models all transitions simultaneously, not separately. Second, it provides floating-point time resolution. Rather than drawing a random number, and using a cumulative probability distribution of interval-times to decide whether an event occurs at discrete steps in time, the Gillespie algorithm uses this distribution to draw the interval-time itself. The time resolution of the prior approach is limited by step size, whereas the Gillespie algorithm’s time resolution is limited by the floating-point precision of the random number that is drawn.
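In standard notation, the steps above correspond to the textbook relations of Gillespie's direct method. The sketch below assumes the two propensities take the forms a_1 = αM/M_0 (reorientation, incrementing the cumulative count Ω) and a_2 = γM (first-order decay of M); this is one reading of the description and not necessarily the authors' exact definitions. Here τ denotes the waiting time to the next event of either kind, and r_1, r_2 are the two uniform random numbers.

```latex
\begin{aligned}
a_1 &= \alpha \frac{M}{M_0} \quad (\Omega \to \Omega + 1), \\
a_2 &= \gamma M \quad (M \to M - 1), \\
a_0 &= a_1 + a_2, \\
P(\text{no event in } \tau) &= e^{-a_1 \tau}\, e^{-a_2 \tau} = e^{-a_0 \tau}, \\
r_1 = e^{-a_0 \tau} \;&\Longrightarrow\; \tau = \frac{1}{a_0} \ln\frac{1}{r_1}, \\
n &= \min\Bigl\{ n : \sum_{i=1}^{n} a_i > r_2\, a_0 \Bigr\}.
\end{aligned}
```

Because τ is computed directly from r_1, event times are continuous values, not multiples of a step size.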
We are happy to add this text to improve clarity.
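The procedure can also be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `gillespie_foraging`, the propensity forms `a1 = alpha * M / m0` and `a2 = gamma * M`, and the parameter values in the usage lines are all hypothetical; only M₀ = 1,000 and the 40-minute window come from the text.

```python
import math
import random


def gillespie_foraging(alpha, gamma, m0, t_end, seed=None):
    """Direct-method Gillespie simulation of two coupled events.

    Event 1 (reorientation): propensity a1 = alpha * M / m0  (assumed form)
    Event 2 (decay of M):    propensity a2 = gamma * M       (first-order decay)
    Returns the list of reorientation times in [0, t_end).
    """
    rng = random.Random(seed)
    t, m = 0.0, m0
    reorientations = []
    while m > 0:  # once M reaches 0, both propensities are 0
        a1 = alpha * m / m0
        a2 = gamma * m
        a0 = a1 + a2
        # Waiting time to the next event of either kind.
        r1 = 1.0 - rng.random()  # lies in (0, 1], so log(r1) is defined
        tau = -math.log(r1) / a0
        t += tau
        if t >= t_end:
            break
        # Choose which event fires, in proportion to its propensity.
        r2 = rng.random()
        if r2 * a0 < a1:
            reorientations.append(t)
        else:
            m -= 1
    return reorientations


# Hypothetical rates (per minute); M0 and the 40 min window follow the text.
times = gillespie_foraging(alpha=2.0, gamma=0.1, m0=1000, t_end=40.0, seed=1)
print(len(times), "reorientations; first few:", [round(x, 2) for x in times[:3]])
```

Passing `m0=1` makes M binary, which corresponds to the two-state limit raised later in this response.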
We apologize for the arrow notation confusion. Arrow notation is commonly used in pseudocode to indicate variable assignment, and so we used it to indicate variable assignment updates in the algorithm.
We added Figure 2a to help explain the Gillespie algorithm for people who are unfamiliar with it, but you are correct that some notation, such as the probabilities, was left unexplained. We will address this to improve clarity.
(3) In the model, the reorientation rate dΩ⁄dt declines to zero but the empirical rate clearly does not. This is a major flaw. It would have been easy to fix by adding a constant to the exponentially declining rate in (1). Perhaps fixing this obvious problem would mitigate the discrepancies between the data and the model in Figure 2d.
You are correct that the model deviates slightly at longer times, but this result is consistent with Klein et al., who show a continuous decline in reorientations. However, we could add a constant to the model, since an infinite run length is likely not physiological.
(4) Evidence that the model fits the data (Figure 2d) is unconvincing. I would like to have seen the proportion of runs in which the model generated one as opposed to multiple or no transitions in reorientation rate; in the real data, the proportion is 50% (Lopez). It is claimed that the "model demonstrated a continuum of switching to non-switching behavior" as seen in the experimental data but no evidence is provided.
We should clarify that the 50% proportion cited by López-Cruz et al. was based on an arbitrary difference in slopes and on visual assessment of the data. We sought to avoid this subjective assessment by plotting the distribution of slopes and transition times produced by the method used in López-Cruz et al. We should also clarify what we meant by “a continuum of switching and non-switching” behavior. Neither the transition-time distributions nor the slope-difference distributions appear to be a mixture of two distributions. This is unlike roaming and dwelling on food, where two distinct distributions of behavioral metrics can be identified based on speed and angular speed (Flavell et al, 2009, Fig S2a). We will add a permutation test to verify that the mean differences in slopes and transition times between the experiment and model are not significant.
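A permutation test of this kind could be sketched as follows. This is a generic two-sided test for a difference in means, offered as an illustration and not as the authors' analysis; the function name `permutation_test_mean` and the sample values in the usage lines are placeholders, not data from the study.

```python
import random


def permutation_test_mean(x, y, n_perm=10000, seed=None):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference mean(x) - mean(y) and the p-value.
    """
    rng = random.Random(seed)
    observed = sum(x) / len(x) - sum(y) / len(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / (len(pooled) - n_x)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Placeholder transition times (minutes) for an "experiment" and a "model".
experiment = [8.1, 12.4, 9.7, 15.2, 11.0, 13.3, 10.5, 14.8]
model = [9.0, 11.8, 10.2, 14.1, 12.6, 12.9, 9.9, 15.5]
diff, p = permutation_test_mean(experiment, model, n_perm=5000, seed=0)
print(f"difference in means = {diff:.2f}, p = {p:.3f}")
```

The same function can be applied to variances by swapping the mean for a variance statistic.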
(5) The explanation for the poor fit between the model and data (lines 166-174) is unclear. Why would externally triggered collisions cause a shift in the transition distribution?
Thank you, we should rewrite the text to clarify this. There were no externally triggered collisions; 10 animals were used per experiment. They would occasionally collide during the experiment, but these collisions were excluded from the data that were provided. However, worms are also known to increase reorientations when they encounter a pheromone trail, and it is unknown (from this dataset) which reorientations may have resulted from this phenomenon.
(6) The discussion of Levy walks and the accompanying figure are off-topic and should be deleted.
Thank you, we agree that this topic is tangential, and we will remove it.
Reviewer #2 (Public review):
Summary:
In this study, the authors build a statistical model that stochastically samples from a time-interval distribution of reorientation rates. The form of the distribution is extracted from a large array of behavioral data, and is then used to describe not only the dynamics of individual worms (including the inter-individual variability in behavior), but also the aggregate population behavior. The authors note that the model does not require assumptions about behavioral state transitions, or evidence accumulation, as has been done previously, but rather that the stochastic nature of behavior is "simply the product of stochastic sampling from an exponential function".
Strengths:
This model provides a strong juxtaposition to other foraging models in the worm. Rather than evoking a behavioral transition function (that might arise from a change in internal state or the activity of a cell type in the network), or evidence accumulation (which again maps onto a cell type, or the activity of a network) - this model explains behavior via the stochastic sampling of a function of an exponential decay. The underlying model and the dynamics being simulated, as well as the process of stochastic sampling, are well described and the model fits the exponential function (Equation 1) to data on a large array of worms exhibiting diverse behaviors (1600+ worms from Lopez-Cruz et al). The work of this study is able to explain or describe the inter-individual diversity of worm behavior across a large population. The model is also able to capture two aspects of the reorientations, including the dynamics (to switch or not to switch) and the kinetics (slow vs fast reorientations). The authors also work to compare their model to a few others including the Levy walk (whose construction arises from a Markov process) to a simple exponential distribution, all of which have been used to study foraging and search behaviors.
Weaknesses:
This manuscript has two weaknesses that dampen the enthusiasm for the results. First, in all of the examples the authors cite where a Gillespie algorithm is used to sample from a distribution, be it the kinetics associated with chemical dynamics, or a Lotka-Volterra Competition Model, there are underlying processes that govern the evolution of the dynamics, and thus the sampling from distributions. In one of their references, for instance, the stochasticity arises from the birth and death rates, thereby influencing the genetic drift in the model. In these examples, the process governing the dynamics (and thus generating the distributions from which one samples) is distinct from the behavior being studied. In this manuscript, the distribution being sampled is the exponential decay function of the reorientation rate (lines 100-102). This appears to be tautological - a decay function fitted to the reorientation data is then sampled to generate the distributions of the reorientation data. That the model performs well and matches the data is commendable, but it is unclear how that could not be the case if the underlying function generating the distribution was fit to the data.
Thank you, we apologize that this was not clearer. In the Lotka-Volterra model, the densities of predators and prey are being modeled, with the underlying assumption that rates of birth and death are inherently stochastic. In our model, the number of reorientations is being modeled, with the assumption (based on the experiments) that the occurrence of reorientations is stochastic, just like the occurrence (birth) of a prey animal is stochastic. However, the decay in M is phenomenological, and we speculate about the nature of M later in the manuscript.
You are absolutely right that the decay function for M was fitted to the population average of reorientations and then sampled to generate the distributions of the reorientation data. This was intentional, to show that the parameters chosen to match the population average would produce individual trajectories with stochastic “switching” comparable to the experimental data. All we are trying to show is that observed sudden changes in reorientation that appear persistent can be produced by a stochastic process without resorting to binary state assignments. Calhoun et al., 2014 report that all animals produced switch-like behavior, whereas Klein et al., 2017 report that no animals showed abrupt transitions. López-Cruz et al. seem to show a mix of these results, which can be easily explained by an underlying stochastic process.
The second weakness is somewhat related to the first, in that absent an underlying mechanism or framework, one is left wondering what insight the model provides. Stochastic sampling a function generated by fitting the data to produce stochastic behavior is where one ends up in this framework, and the authors indeed point this out: "simple stochastic models should be sufficient to explain observably stochastic behaviors." (Line 233-234). But if that is the case, what do we learn about how the foraging is happening? The authors suggest that the decay parameter M can be considered a memory timescale; which offers some suggestion, but then go on to say that the "physical basis of M can come from multiple sources". Here is where one is left for want: The mechanisms suggested, including loss of sensory stimuli, alternations in motor integration, ionotropic glutamate signaling, dopamine, and neuropeptides are all suggested: these are basically all of the possible biological sources that can govern behavior, and one is left not knowing what insight the model provides. The array of biological processes listed is so variable in dynamics and meaning, that their explanation of what governs M is at best unsatisfying. Molecular dynamics models that generate distributions can point to certain properties of the model, such as the binding kinetics (on and off rates, etc.) as explanations for the mechanisms generating the distributions, and therefore point to how a change in the biology affects the stochasticity of the process. It is unclear how this model provides such a connection, especially taken in aggregate with the previous weakness.
Providing a roadmap of how to think about the processes generating M, the meaning of those processes in search, and potential frameworks that are more constrained and with more precise biological underpinning (beyond the array of possibilities described) would go a long way to assuaging the weaknesses.
Thank you, these are all excellent points. We should clarify that in López-Cruz et al, they claim that only 50% of the animals fit a local/global search paradigm. We are simply proposing there is no need for designating local and global searches if the data don’t really support it. The underlying behavior is stochastic, so the sudden switches sometimes observed can be explained by a stochastic process where the underlying rate is slowing down, thus producing the persistently slow reorientation rate when an apparent “switch” occurs. What we hope to convey is that foraging doesn’t appear to follow a decision paradigm, but instead a gradual change in reorientations which, for individual worms, can occasionally produce reorientation trajectories that appear switch-like.
As for M, you are correct, we should be more explicit. A decay in reorientation rate, rather than a sudden change, is consistent with observations made by López-Cruz et al. They found that the neurons AIA and ADE redundantly suppress reorientations, and that silencing either one was sufficient to restore the large number of reorientations during early foraging. The synaptic output of AIA and ADE was inhibited over long timescales (tens of minutes) by presynaptic glutamate binding to MGL-1, a slow G-protein-coupled receptor expressed in AIA and ADE. Their results support a model where sensory neurons suppress the synaptic output of AIA and ADE, which in turn leads to a large number of reorientations early in foraging. As time passes, glutamatergic input from the sensory neurons decreases, which leads to disinhibition of AIA and ADE, and a subsequent suppression of reorientations.
The sensory inputs into AIA and ADE are sequestered into two separate circuits, with AIA receiving chemosensory input and ADE receiving mechanosensory input. Since the suppression of either AIA or ADE is sufficient to increase reorientations, the decay in reorientations is likely due to the synaptic output of both of these neurons decaying in time. This correlates with an observed decrease in sensory neuron activity as well, so the timescale of reorientation decay could be tied to the timescale of sensory neuron activity, which in turn is influencing the timescale of AIA/ADE reorientation suppression. This implies that our factor “M” is likely the sum of several different sensory inputs decaying in time.
The molecular basis of which sensory neuron signaling factors contribute to decreased AIA and ADE activity is made more complicated by the observation that the glutamatergic input provided by the sensory neurons was not essential, and that additional factors besides glutamate contribute to the signaling to AIA and ADE. In addition to this, it is not simply the sensory neuron activity that decays in time, but also the sensitivity of AIA and ADE to sensory neuron input. Simply depolarizing sensory neurons after the animals had starved for 30 minutes was insufficient to rescue the reorientation rates observed earlier in the foraging assay. This observation could be due to decreased presynaptic vesicle release, and/or decreased receptor localization on the postsynaptic side.
In summary, there are two neuronal properties that appear to decay in time. One is sensory neuron activity, and the other is the potentiation of presynaptic input onto AIA and ADE. Our factor “M” is a phenomenological manifestation of these numerous decaying factors.
Reviewer #3 (Public review):
Summary:
This intriguing paper addresses a special case of a fundamental statistical question: how to distinguish between stochastic point processes that derive from a single "state" (or single process) and more than one state/process. In the language of the paper, a "state" (perhaps more intuitively called a strategy/process) refers to a set of rules that determine the temporal statistics of the system. The rules give rise to probability distributions (here, the probability for turning events). The difficulty arises when the sampling time is finite, and hence, the empirical data is finite, and affected by the sampling of the underlying distribution(s). The specific problem being tackled is the foraging behavior of C. elegans nematodes, removed from food. Such foraging has been studied for decades, and described by a transition over time from 'local'/'area-restricted' search (roughly in the initial 10-30 minutes of the experiments, in which animals execute frequent turns) to 'dispersion', or 'global search' (characterized by a low frequency of turns). The authors propose an alternative to this two-state description - a potentially more parsimonious single 'state' with time-changing parameters, which they claim can account for the full-time course of these observations.
Figure 1a shows the mean rate of turning events as a function of time (averaged across the population). Here, we see a rapid transient, followed by a gradual 4-5 fold decay in the rate, which then levels off. This picture seems consistent with the two-state description. However, the authors demonstrate that individual animals exhibit different "transition" statistics (Figure 1e) and wish to explain this. They do so by fitting this mean with a single function (Equations 1-3).
Strengths:
As a qualitative exercise, the paper might have some merit. It demonstrates that apparently discrete states can sometimes be artifacts of sampling from smoothly time-changing dynamics. However, as a generic point, this is not novel, and so without the grounding in C. elegans data, is less interesting.
Weaknesses:
(1) The authors claim that only about half the animals tested exhibit discontinuity in turning rates. Can they automatically separate the empirical and model population into these two subpopulations (with the same method), and compare the results?
Thank you, we should clarify that the observation that about half the animals exhibit discontinuity was not made by us, but by López-Cruz et al. The observed fraction of 50% was based on a visual assessment of the dual regression method we described. To make the process more objective, we decided to simply plot the distributions of the metrics they used for this assessment to see if two distinct populations could be observed. However, the distributions of slope differences and transition times do not produce two distinct populations. Our stochastic approach, which does not assume abrupt state-transitions, also produces comparable distributions. To quantify this, we will perform permutation tests on the differences in means and variances between the experimental and model data.
(2) The equations consider an exponentially decaying rate of turning events. If so, Figure 2b should be shown on a semi-logarithmic scale.
We are happy to add this panel as well.
(3) The variables in Equations 1-3 and the methods for simulating them are not well defined, making the method difficult to follow. Assuming my reading is correct, Omega should be defined as the cumulative number of turning events over time (Omega(t)), not as a "turn" or "reorientation", which has no derivative. The relevant entity in Figure 1a is apparently <Omega (t)>, i.e. the mean number of events across a population which can be modelled by an expectation value. The time derivative would then give the expected rate of turning events as a function of time.
Thank you, you are correct. Please see response to Reviewer #1.
(4) Equations 1-3 are cryptic. The authors need to spell out up front that they are using a pair of coupled stochastic processes, sampling a hidden state M (to model the dynamic turning rate) and the actual turn events, Omega(t), separately, as described in Figure 2a. In this case, the model no longer appears more parsimonious than the original 2-state model. What then is its benefit or explanatory power (especially since the process involving M is not observable experimentally)?
Thank you, yes, we see how this was confusing as written. In our response to Reviewer #1, we added an important detail:
While reorientations are modeled as discrete events, which is observationally true, the amount of M at time t=0 is chosen to be large (M₀←1,000), so that over the timescale of 40 minutes, the decay in M is practically continuous. This ensures that sudden changes in reorientations are not due to sudden changes in M, but due to the inherent stochasticity of reorientations.
However, you are correct that if M were chosen to have a binary value of 0 or 1, this would indeed be the two-state model. Adding this as an additional model to compare against the experimental data would be a good idea, and we are happy to add it.
(5) Further, as currently stated in the paper, Equations 1-3 are only for the mean rate of events. However, the expectation value is not a complete description of a stochastic system. Instead, the authors need to formulate the equations for the probability of events, from which they can extract any moment (they write something in Figure 2a, but the notation there is unclear, and this needs to be incorporated here).
Thank you, yes please see our response to Reviewer #1.
(6) Equations 1-3 have three constants (alpha and gamma which were fit to the data, and M0 which was presumably set to 1000). How does the choice of M0 affect the results?
Thank you, this is a good question. We will test this down to a binary state of M as mentioned in comment #4.
(7) M decays to near 0 over 40 minutes, abolishing omega turns by the end of the simulations. Are omega turns entirely abolished in worms after 30-40 minutes off food? How do the authors reconcile this decay with the leveling of the turning rate in Figure 1a?
Yes, reviewer #1 recommended adding a baseline reorientation rate, which is likely more biologically plausible. However, we should also note that Klein et al. observed a continuous decay over 50 minutes.
(8) The fit given in Figure 2b does not look convincing. No statistical test was used to compare the two functions (empirical and fit). No error bars were given (to either). These should be added. In the discussion, the authors explain the discrepancy away as experimental limitations. This is not unreasonable, but on the flip side, makes the argument inconclusive. If the authors could model and simulate these limitations, and show that they account for the discrepancies with the data, the model would be much more compelling. To do this, I would imagine that the authors would need to take the output of their model (lists of turning times) and convert them into simulated trajectories over time. These trajectories could be used to detect boundary events (for a given size of arena), collisions between individuals, etc. in their simulations and to see their effects on the turn statistics.
Thank you, we will add error bars and perform a permutation test on the mean and variance differences between experiment and model over the 40 minute window.
(9) The other figures similarly lack any statistical tests and by eye, they do not look convincing. The exception is the 6 anecdotal examples in Figure 2e. Those anecdotal examples match remarkably closely, almost suspiciously so. I'm not sure I understood this though - the caption refers to "different" models of M decay (and at least one of the 6 examples clearly shows a much shallower exponential). If different M models are allowed for each animal, this is no longer parsimonious. Are the results in Figure 2d for a single M model? Can Figure 2e explain the data with a single (stochastic) M model?
Thank you, yes, we will perform permutation tests on the mean and variance differences in the observed distributions in Figure 2d. We certainly don’t want the panels in Figure 2e to be suspicious! These comparisons were drawn by calculating the correlations between all model traces and all experimental traces, and then choosing the top hits. Every time we run the simulation, we arrive at a different set of examples. Since it was recommended that we add a baseline rate, these examples will be a completely different set when we run the simulation again.
We apologize for the confusion regarding M. Since the worms do not all start out with identical reorientation rates, we drew the initial M value from a distribution centered on M0 and a variance to match the initial distribution of observed experimental rates.
(10) The left axes of Figure 2e should be reverted to cumulative counts (without the normalization).
Thank you, we will add this. We want to clarify that we normalized it because we chose these examples based on correlation to show that the same types of sudden changes in search strategy can occur with a model that doesn’t rely on sudden rate changes.
(11) The authors give an alternative model of a Levy flight, but do not give the obvious alternative models:
a) the 1-state model in which P(t) = alpha exp (-gamma t) dt (i.e. a single stochastic process, without a hidden M, collapsing equations 1-3 into a single equation).
b) the originally proposed 2-state model (with 3 parameters, a high turn rate, a low turn rate, and the local-to-global search transition time, which can be taken from the data, or sampled from the empirical probability distributions). Why not? The former seems necessary to justify the more complicated 2-process model, and the latter seems necessary since it's the model they are trying to replace. Including these two controls would allow them to compare the number of free parameters as well as the model results. I am also surprised by the Levy model since Levy is a family of models. How were the parameters of the Levy walk chosen?
Thank you, we will remove this section completely, as it is tangential to the main point of the paper.
(12) One point that is entirely missing in the discussion is the individuality of worms. It is by now well known that individual animals have individual behaviors. Some are slow/fast, and similarly, their turn rates vary. This makes this problem even harder. Combined with the tiny number of events concerned (typically 20-40 per experiment), it seems daunting to determine the underlying model from behavioral statistics alone.
Thank you, yes we should have been more explicit in the reasoning behind drawing the initial M from a distribution (response to comment #9). We assume that not every worm starts out with the same reorientation rate, but that some start out fast (high M) and some start out slow (low M). However, we do assume M decays with the same kinetics, which seems sufficient to produce the observed phenomena.
(13) That said, it's well-known which neurons underpin the suppression of turning events (starting already with Gray et al 2005, which, strangely, was not cited here). Some discussion of the neuronal predictions for each of the two (or more) models would be appropriate.
Thank you, yes, we will add Gray et al., along with the more detailed discussion from our response to Reviewer #2.
(14) An additional point is the reliance entirely on simulations. A rigorous formulation (of the probability distribution rather than just the mean) should be analytically tractable (at least for the first moment, and possibly higher moments). If higher moments are not obtainable analytically, then the equations should be numerically integrable. It seems strange not to do this.
Thank you for suggesting this, we will add these analyses.
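As a sketch of what the first moment could look like, assume the reorientation propensity is a_1 = αM/M_0 and that M undergoes first-order decay with rate constant γ from an initial value M_0. These forms are an assumption based on the description in the response to Reviewer #1 and are not taken from the manuscript's equations. Under those assumptions the expected cumulative number of reorientations follows directly:

```latex
\begin{aligned}
\mathbb{E}[M(t)] &= M_0\, e^{-\gamma t}, \\
\frac{d}{dt}\,\mathbb{E}[\Omega(t)] &= \mathbb{E}[a_1(t)] = \frac{\alpha}{M_0}\,\mathbb{E}[M(t)] = \alpha\, e^{-\gamma t}, \\
\mathbb{E}[\Omega(t)] &= \frac{\alpha}{\gamma}\left(1 - e^{-\gamma t}\right), \\
\operatorname{Var}[\Omega(t)] &\approx \mathbb{E}[\Omega(t)] \quad (M_0 \gg 1).
\end{aligned}
```

The last line holds in the limit where M is effectively deterministic, so that Ω(t) is an inhomogeneous Poisson process; at finite M_0 the fluctuations in M add a further positive contribution to the variance.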
In summary, while sample simulations do nicely match the examples in the data (of discontinuous vs continuous turning rates), this is not sufficient to demonstrate that the transition from ARS to dispersion in C. elegans is, in fact, likely to be a single 'state', or this (eq 1-3) single state. Of course, the model can be made more complicated to better match the data, but the approach of the authors, seeking an elegant and parsimonious model, is in principle valid, i.e. avoiding a many-parameter model-fitting exercise.
As a qualitative exercise, the paper might have some merit. It demonstrates that apparently discrete states can sometimes be artifacts of sampling from smoothly time-changing dynamics. However, as a generic point, this is not novel, and so without the grounding in C. elegans data, is less interesting.
Thank you, we agree that this is a generic phenomenon, which is partly why we did this. The data from López-Cruz et al. seem to agree in part with Calhoun et al., who claim that abrupt transitions occur, and in part with Klein et al., who claim that they do not. Since the underlying phenomenon is stochastic, we propose that the mixed observations of sudden and gradual changes in search strategy are simply the result of a stochastic process, which can produce both phenomena for individual observations.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This study investigated the mechanism by which PGE2 inhibits the release of insulin from pancreatic beta cells in response to glucose. The researchers used a combination of cell line experiments and studies in mice with genetic ablation of the Kv2.2 channel. Their findings suggest a novel pathway where PGE2 acts through EP2/EP4 receptors to activate PKA, which directly phosphorylates a specific site (S448) on the Kv2.2 channel, inhibiting its activity and reducing GSIS.
Strengths:
- The study elegantly demonstrates a potential pathway connecting PGE2, EP2/EP4 receptors, PKA, and Kv2.2 channel activity, using an embryonic cell line.
- Additional experiments in INS1 and primary mouse beta cells with altered Kv2.2 function partially support the inhibitory role of PGE2 on GSIS through Kv2.2 inhibition.
Weaknesses:
- A critical limitation is the use of HEK293T cells, which are not pancreatic beta cells. Functional aspects can differ significantly between these cell types.
- The study needs to address the apparent contradiction of PKA activating insulin secretion in beta cells, while also inhibiting GSIS through the proposed mechanism.
- A more thorough explanation is needed for the discrepancies observed between the effects of PGE2 versus Kv2.2 knockdown/mutation on the electrical activity of beta cells and GSIS.
Thank you for your positive evaluation and constructive feedback on our study. We appreciate the concern regarding the use of HEK293T cells, which are not pancreatic beta cells and may exhibit functional differences. In response, we have repeated our key experiments using INS1 cells and primary mouse beta cells, which are more representative of the native beta cell environment. These additional experiments confirm our hypothesis and further support the role of Kv2.2 in PGE2-induced inhibition of GSIS. In beta cells, glucose-induced PKA activation is highly localized. As a result, while some PKA pathways promote insulin secretion, others may inhibit it. To directly demonstrate that PGE2-induced PKA phosphorylation of Kv2.2 is involved in the inhibitory effect on GSIS, we overexpressed the S448A mutant Kv2.2 channel in INS-1(832/13) cells. Our results show that Kv2.2-S448A channels significantly attenuate the inhibitory effect of PGE2 on GSIS, further supporting the critical role of Kv2.2 phosphorylation at S448. These data have been added to the revised Figure 7C.
Reviewer #2 (Public Review):
The authors identified new target elements for prostaglandin E2 (PGE2) through which insulin release can be regulated in pancreatic beta cells under physiological conditions. In vitro extracellular exposure to PGE2 could directly and dose-dependently inhibit the potassium channel Kv2.2. In vitro pharmacology revealed that this inhibition occurs through the EP2/4 receptors, which activate protein kinase A (PKA). By screening specific sites of the Kv2.2 channel, the target phosphorylation site (S448) for PKA regulation was found. The physiological relevance of the described signaling cascade was investigated and confirmed in vivo, using a Kv2.2 knockdown mouse model.
The strength of this manuscript is the novelty of the (EP2/4-PKA-Kv2.2 channel) molecular pathway described and the comprehensive methodological toolkit the authors have relied upon.
The introduction is detailed and contains all the information necessary to place the claims in context. Although the dataset is comprehensive and a logical lead is consistently built, there is one important point to consider: to clarify that the described signaling pathway is characteristic of normal physiological conditions and thus differs from pathological changes. It would be useful to carry out basic experiments in a diabetes model (regardless of whether this is in mice or rats).
Thank you for your positive evaluation and insightful comment. We have clarified in the Discussion section that our findings pertain specifically to physiological conditions. We acknowledge the importance of investigating the signaling pathway in a pathological context and plan to conduct experiments using a diabetes model in future studies to explore how this pathway may differ under such conditions.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) Figure 3A-C: PKA activation regulates different functional aspects in beta cells and HEK293T cells. It is well known that PKA activation enhances insulin secretion in beta cells, therefore the mechanisms that allow the same pathway at the same time to inhibit GSIS are not clear and should be addressed by experiments in beta cells.
Thank you for your insightful comment. Specificity and versatility in cAMP-PKA signaling are governed by the spatial localization and temporal dynamics of the signal. In beta cells, glucose-induced PKA activation is highly localized (Tengholm and Gylfe, 2017). As a result, while some PKA pathways promote insulin secretion, others may inhibit it. For example, a global increase in cAMP, such as through treatment with Db-cAMP, can simultaneously activate both stimulatory and inhibitory PKA pathways, reflecting a more integrated, complex response. In previous studies, 1 mM Db-cAMP was shown to enhance GSIS in INS-1 cells (Dezaki et al., 2011). We observed that 1 mM Db-cAMP increased GSIS, but a lower concentration (10 μM) decreased GSIS (as shown in Author response image 1). These findings suggest that not all PKA signaling events increase GSIS. To further investigate the role of PGE2-induced PKA phosphorylation of Kv2.2 in the inhibition of GSIS, we overexpressed the S448A mutant of Kv2.2 in INS-1(832/13) cells. Our results showed that the Kv2.2-S448A mutant significantly attenuated the inhibitory effect of PGE2 on GSIS. These new data have been incorporated into the revised Figure 7C.
Author response image 1.
Effect of Db-cAMP on GSIS in INS-1 cells. Statistics for the effect of different concentrations of Db-cAMP on GSIS in INS-1(832/13) cells. One-way ANOVA with Bonferroni post hoc test. *p < 0.05; ***p < 0.001; ****p < 0.0001; n.s., not significant.
(2) Figure 3G: One would expect that the phospho-mimetic mutation, S448D, will have an opposite effect to S448A and a similar effect as PGE2 or PKA activator in Figure 3B. There is no explanation by the authors for having the same effect in S448A and S448D.
Thank you for your thoughtful comment. Indeed, the S448D mutation exhibited a similar effect to PGE2 on Kv2.2 channels, as we observed significantly smaller currents compared to wild-type Kv2.2 (Figure 3F). The S448D mutation mimics the phosphorylated state of S448, and since PGE2 regulates Kv2.2 channels by phosphorylating this residue, it has no further effect on the S448D mutant (Figure 3G). In contrast, the S448A mutation prevents phosphorylation at this site, which explains why PGE2 has no effect on the currents of S448A mutant Kv2.2 channels (Figure 3H). These results confirm that PGE2 modulates Kv2.2 channels specifically through phosphorylation of S448, as evidenced by the lack of effect on both the S448A and S448D mutants.
(3) Figure 4E: Since both PGE2 and Kv2.2 KD inhibit the activity of the channel, it doesn't definitively prove whether PGE2 acts through Kv2.2 in INS-1 cells. A complementary experiment should be done in which overactivation of Kv2.2 rescues the effect of PGE2. For example, with the S448A form of the channel.
We appreciate your comment and valuable suggestion. Knockdown of Kv2.2 abrogated the inhibitory effect of PGE2 on IK currents in INS-1 cells (Figure 4E and F), which strongly indicates that PGE2 acts through Kv2.2. While we agree that the suggested complementary experiment with Kv2.2 overactivation (e.g., using the S448A mutant) could provide additional insights, we believe the current data sufficiently support our conclusion, as the knockdown of Kv2.2 eliminates the observed PGE2 effect, providing direct evidence of the channel's involvement.
(4) Figure 5C: This result requires further explanation. If PGE2 downregulates Kv2.2 activity and has an inhibitory effect on GSIS, why does Kv2.2 KD have the opposite effect?
The knockdown of Kv2.2 (Fig. 5C) reduced action potential (AP) firing rates compared to the scramble control (Fig. 5B), which is expected because Kv2.2 is critical for maintaining AP firing. When Kv2.2 is knocked down, the reduced AP firing diminishes the system’s responsiveness to further modulation by PGE2. This is because PGE2 exerts its effects primarily through Kv2.2 channels. Therefore, in the Kv2.2 knockdown condition, PGE2 does not exert an additional inhibitory effect on AP firing rates, as the channels critical for its action are already impaired.
(5) Figure 5D - The EP1-EP4 receptor antibodies should be validated at least in INS-1(832/13) cells using knockdowns.
Thank you for your suggestion. We have validated the EP1-EP4 receptor antibodies in INS-1(832/13) cells using knockdown experiments. The validation results, including confirmation of specificity and knockdown efficiency, are provided in Supplemental Figure S2.
(6) Figure 7B - These experiments don't necessarily prove that PGE2 acts directly through Kv2.2 inhibition. Using the S448A mutation in these experiments could prove this point.
Thank you for this valuable suggestion. We have now overexpressed the S448A mutant Kv2.2 channels in INS-1(832/13) cells, and the results demonstrate that Kv2.2-S448A channels significantly reduce the inhibitory effect of PGE2 on GSIS. These new data have been incorporated into the revised Figure 7C.
Reviewer #2 (Recommendations For The Authors):
(1) Deficiencies and inaccuracies in the description of the methods (animal numbers, name of vendors, abbreviations) and the typos in the figures (axis label) require correction.
Thank you for pointing this out. We have carefully reviewed the manuscript and the figures, making the necessary corrections to address the deficiencies in the methods section and the typos in the figure axis labels.
(2) Reducing the number of figures (Figures 7/C-E: knockout mouse line test and Figure1/HEK cell experiments could be part of supplementary) and paragraphs would make the manuscript more compact and powerful. It would also ease its reading for non-experts.
Thank you for your suggestion. We have moved Figures 7C-E to the supplementary data (Supplemental Figure S1) to streamline the main manuscript.
(3) Multiple immunostainings for EP receptors in insulinoma cells or pancreatic islets would be representative.
Because the EP1, EP2, and EP4 antibodies are all rabbit-derived, performing multiple immunostainings on the same samples is not feasible owing to potential cross-reactivity. However, the immunohistochemistry images demonstrate that each antibody labels more than 90% of the cells, indicating that β-cells express different subtypes of EP receptors simultaneously.
(4) The antagonists chosen (AH6809, AH23848) are non-specific. Experiments should be re-run (at least some) under more stringent conditions.
Thank you for your suggestion. AH6809 and AH23848 are well-documented, widely used antagonists in the literature. To further strengthen our findings, we have included additional, widely-used antagonists: the EP2-specific antagonist TG4155 and the EP4-specific antagonist GW627368. The results obtained with these new antagonists were consistent with those observed using AH6809 and AH23848. These updated data are now included in the revised Figure 4I and 4J.
(5) It would be very helpful to indeed emphasise that this work is for physiological conditions and that it is (or is not) modified in diabetes. Maybe even irrelevant for diabetes (?). This needs to be clarified and supported by data even if one could assume the authors intend to have a follow-up entirely dedicated to pathological changes, perhaps.
Thank you for this insightful comment. We have clarified in the Discussion that our findings are specific to physiological conditions. To address this point, we have added the following statement:
"Importantly, our findings pertain to physiological conditions. While we demonstrate the inhibitory effects of PGE2 on Kv2.2 channels in normal β-cells, the role of this pathway under diabetic conditions remains to be investigated and will be the focus of future studies."
Dezaki K, Damdindorj B, Sone H, Dyachok O, Tengholm A, Gylfe E, Kurashina T, Yoshida M, Kakei M, Yada T (2011) Ghrelin attenuates cAMP-PKA signaling to evoke insulinostatic cascade in islet beta-cells. Diabetes 60:2315-2324.
Tengholm A, Gylfe E (2017) cAMP signalling in insulin and glucagon secretion. Diabetes Obes Metab 19 Suppl 1:42-53.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer 1 (Public Review):
O’Neill et al. have developed a software analysis application, miniML, that enables the quantification of electrophysiological events. They utilize a supervised deep learning-based method to optimize the software. miniML is able to quantify and standardize the analyses of miniature events, using both voltage and current clamp electrophysiology, as well as optically driven events using iGluSnFR3, in a variety of preparations, including in the cerebellum, calyx of Held, Golgi cell, human iPSC cultures, zebrafish, and Drosophila. The software appears to be flexible, in that users are able to hone and adapt the software to new preparations and events. Importantly, miniML is an open-source software free for researchers to use and enables users to adapt new features using Python.
Overall this new software has the potential to become widely used in the field and an asset to researchers. However, the authors fail to discuss or even cite a similar analysis tool recently developed (SimplyFire), and determine how miniML performs relative to this platform. There are a handful of additional suggestions to make miniML more user-friendly, and of broad utility to a variety of researchers, as well as some suggestions to further validate and strengthen areas of the manuscript:
(1) miniML relative to existing analysis methods: There is a major omission in this study, in that a similar open source, Python-based software package for event detection of synaptic events appears to be completely ignored. Earlier this year, another group published SimplyFire in eNeuro (Mori et al., 2024; doi: 10.1523/eneuro.0326-23.2023). Obviously, this previous study needs to be discussed and ideally compared to miniML to determine if SimplyFire is superior or similar in utility, and to underscore differences in approach and accuracy.
We thank the reviewer for bringing this interesting publication to our attention. We have included SimplyFire in our benchmarking for comprehensive comparison with miniML. The approach taken by SimplyFire differs from miniML in a number of ways. Our results show that miniML provides higher recall and precision than SimplyFire (revised Figure 3). We appreciate that SimplyFire provides a user-interface similar to the commonly used MiniAnalysis software. In addition, the peak-finding-based approach of SimplyFire makes it relatively robust to event shape, which facilitates analysis of diverse data. However, we noted a strong threshold-dependence and long run time of SimplyFire (revised Figure 3 and Figure 3—figure supplement 1). In addition, SimplyFire is not robust against various types of noise typically encountered in electrophysiological recordings. Our extended benchmark analysis thus indicates that AI-based event detection is superior to existing algorithmic approaches, including SimplyFire.
(2) The manuscript should comment on whether miniML works equally well to quantify current clamp events (voltage; e.g. EPSP/mEPSPs) compared to voltage clamp (currents, EPSC/mEPSCs), which the manuscript highlights. Are rise and decay time constants calculated for each event similarly?
miniML works equally well for current and voltage events (Figure 5, Figure 9). In general, events of opposite polarity can be analyzed by simply inverting the data. Transfer learning models may further improve the detection.
For each detected event, independent of data/recording type, rise times are calculated as 10–90% times (baseline–peak), and decay times are calculated as time to 50% of the peak. In addition, event decay time constants are calculated from a fit to the event average. With miniML being open-source, researchers can adapt the calculations of event statistics to their needs, if desired. In the revised manuscript, we have expanded the Methods section that describes the quantification of event statistics (Methods, Quantification).
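The per-event quantification described above can be illustrated with a minimal sketch. This is not the miniML implementation: the function name, its arguments, and the use of linear interpolation between samples are assumptions made for this example. It takes one positive-going event (negative-going events would be inverted first) and returns the 10–90% rise time and the time from the peak to 50% of the peak amplitude.

```python
def rise_and_half_decay(trace, dt, baseline_idx, peak_idx):
    """Return (10-90% rise time, time from peak to 50% of peak amplitude).

    Hypothetical helper, not part of miniML.
    trace: samples of one positive-going event
    dt: sampling interval (seconds)
    baseline_idx, peak_idx: sample indices of the event baseline and peak
    """
    baseline = trace[baseline_idx]
    amplitude = trace[peak_idx] - baseline

    def first_crossing(start, stop, level, rising):
        # Time at which the trace first crosses `level` between samples
        # start..stop, using linear interpolation between neighbours.
        for i in range(start, stop):
            a, b = trace[i], trace[i + 1]
            crossed = (a < level <= b) if rising else (a > level >= b)
            if crossed:
                return (i + (level - a) / (b - a)) * dt
        return None

    t10 = first_crossing(baseline_idx, peak_idx, baseline + 0.1 * amplitude, True)
    t90 = first_crossing(baseline_idx, peak_idx, baseline + 0.9 * amplitude, True)
    t50 = first_crossing(peak_idx, len(trace) - 1, baseline + 0.5 * amplitude, False)

    rise = None if t10 is None or t90 is None else t90 - t10
    half_decay = None if t50 is None else t50 - peak_idx * dt
    return rise, half_decay


# Synthetic event: linear rise over 10 samples, linear decay over 20 samples.
event = [i / 10 for i in range(11)] + [1 - j / 20 for j in range(1, 21)]
rise, half_decay = rise_and_half_decay(event, dt=1.0, baseline_idx=0, peak_idx=10)
```

The decay time constant mentioned in the response is obtained from a fit to the event average and is not covered by this sketch.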
(3) The interface and capabilities of miniML appear quite similar to Mini Analysis, the free software that many in the field currently use. While the ability and flexibility for users to adapt and adjust miniML for their own uses/needs using Python programming is a clear potential advantage, can the authors comment, or better yet, demonstrate, whether there is any advantage for researchers to use miniML over Mini Analysis or SimplyFire if they just need the standard analyses?
Following the reviewer’s suggestion, we developed a graphical user interface (GUI) for miniML to enhance its usability (Figure 2—figure supplement 2), which is provided on the GitHub repository. Our comprehensive benchmark analysis demonstrated that miniML outperforms existing tools such as MiniAnalysis and SimplyFire. The main advantages are (i) increased reliability of results, which eliminates the need for visual inspection; (ii) fast runtime and easy automation; (iii) superior detection performance as demonstrated by higher recall in both synthetic and real data; (iv) open-source Python-based design. We believe that these advantages make miniML a valuable tool for researchers recording various types of synaptic events, offering a more efficient and reliable solution compared to existing methods.
(4) Additional utilities for miniML: The authors show miniML can quantify miniature electrophysiological events both current and voltage clamp, as well as optical glutamate transients using iGluSnFR. As the authors mention in the discussion, the same approach could, in principle, be used to quantify evoked (EPSC/EPSP) events using electrophysiology, Ca2+ events (using GCaMP), and AP waveforms using voltage indicators like ASAP4. While I don’t think it is reasonable to ask the authors to generate any new experimental data, it would be great to see how miniML performs when analysing data from these approaches, particularly to quantify evoked synaptic events and/or Ca2+ (ideally postsynaptic Ca2+ signals from miniature events, as the Drosophila NMJ have developed nice approaches).
In the revised manuscript, we have extended the application examples of miniML. We applied miniML to detect mEPSPs recorded with the novel voltage-sensitive indicator ASAP5 (Figure 9 and Figure 9—figure supplement 1). We performed simultaneous recordings of membrane voltage through electrophysiology and ASAP5 voltage imaging in rat cultured neurons at physiological temperature. Data were analyzed using miniML, with electrophysiology data being used as ground-truth for assessing detection performance in imaging data. Our results demonstrate that miniML robustly detects mEPSPs in current-clamp, and can localize corresponding transients in imaging data. Furthermore, we observed that miniML performs better than template matching and deconvolution on ASAP5 imaging data (Figure 9 and Figure 9—figure supplement 2).
Reviewer 2 (Public Review):
This paper presents miniML as a supervised method for the detection of spontaneous synaptic events. Recordings of such events are typically of low SNR, where state-of-the-art methods are prone to high false positive rates. Unlike current methods, training miniML requires neither prior knowledge of the kinetics of events nor the tuning of parameters/thresholds.
The proposed method comprises four convolutional networks, followed by a bi-directional LSTM and a final fully connected layer which outputs a decision event/no event per time window. A sliding window is used when applying miniML to a temporal signal, followed by an additional estimation of events’ time stamps. miniML outperforms current methods for simulated events superimposed on real data (with no events) and presents compelling results for real data across experimental paradigms and species.
Strengths:
The authors present a pipeline for benchmarking based on simulated events superimposed on real data (with no events). Compared to five other state-of-the-art methods, miniML leads to the highest detection rates and is most robust to specific choices of threshold values for fast or slow kinetics. A major strength of miniML is the ability to use it for different datasets. For this purpose, the CNN part of the model is held fixed and the subsequent networks are trained to adapt to the new data. This Transfer Learning (TL) strategy reduces computation time significantly and more importantly, it allows for using a substantially smaller data set (compared to training a full model) which is crucial as training is supervised (i.e. uses labeled examples).
Weaknesses:
The authors do not indicate how the specific configuration of miniML was set, i.e. number of CNNs, units, LSTM, etc. Please provide further information regarding these design choices, whether they were based on similar models or if chosen based on performance.
The data for the benchmark system was augmented with equal amounts of segments with/without events. Data augmentation was undoubtedly crucial for successful training.
(1) Does a balanced dataset reflect the natural occurrence of events in real data? Could the authors provide more information regarding this matter?
In a given recording, the event frequency determines the ratio of event-containing vs. non-event-containing data segments. Whereas many synapses are skewed towards non-events, high event frequencies, as observed, e.g., in pyramidal cells or Purkinje neurons, can shift the ratio towards event-containing data.
For model training, we extracted data segments from mEPSC recordings in cerebellar granule cells, which have a low mEPSC frequency (about 0.2 Hz, Delvendahl et al. 2019). Unbalanced training data may complicate model training (Drummond and Holte 2003; Prati et al. 2009; Tyagi and Mittal 2020). We therefore decided to balance the training dataset for miniML by down-sampling the majority class (i.e., non-event segments), so that the final datasets for model training contained roughly equal amounts of events and non-events.
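The balancing step described above can be sketched as follows. This is a generic illustration of down-sampling the majority class, not the code used to train miniML; the function name, the label convention (1 for event, 0 for non-event), and the fixed random seed are assumptions of this example.

```python
import random


def balance_by_downsampling(segments, labels, seed=0):
    """Down-sample the larger class so both classes have equal counts.

    Hypothetical helper: labels are assumed to be 1 (event) or 0 (non-event).
    Returns a shuffled list of (segment, label) pairs.
    """
    rng = random.Random(seed)
    events = [s for s, y in zip(segments, labels) if y == 1]
    non_events = [s for s, y in zip(segments, labels) if y == 0]

    # Keep as many examples per class as the minority class provides.
    n = min(len(events), len(non_events))
    events = rng.sample(events, n)
    non_events = rng.sample(non_events, n)

    balanced = [(s, 1) for s in events] + [(s, 0) for s in non_events]
    rng.shuffle(balanced)
    return balanced


# Example: 10 event segments and 90 non-event segments (placeholders).
segments = list(range(100))
labels = [1] * 10 + [0] * 90
balanced = balance_by_downsampling(segments, labels)
```

With a low event frequency, as in the granule cell recordings, the non-event class is the one that gets reduced.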
(2) Please provide a more detailed description of this process as it would serve users aiming to use this method for other sub-fields.
We thank the reviewer for raising this point. In the revised manuscript, we present a systematic analysis of the impact of imbalanced training data on model training (Figure 1—figure supplement 2). In addition, we have revised the description of model training and data augmentation in the Methods section (Methods, Training data and annotation).
The benchmarking pipeline is indeed valuable and the results are compelling. However, the authors do not provide comparative results for miniML for real data (Figures 4-8). TL does not apply to the other methods. In my opinion, presenting the performance of other methods, trained using the smaller dataset would be convincing of the modularity and applicability of the proposed approach.
Quantitative comparison of synaptic detection methods on real-world data is challenging because the lack of ground-truth data prevents robust, quantitative analyses. Nevertheless, we compared miniML to common template-based and finite-threshold based methods on four different types of synapses. We noted that miniML generally detects more events, whereas other methods are susceptible to false-positives (Figure 4—figure supplement 1). In addition, we analyzed the performance of miniML on voltage imaging data (Figure 9). Simultaneous recordings of electrophysiological and imaging data allowed a quantitative comparison of detection methods in this dataset. Our results demonstrate that miniML provides higher recall for optical minis recorded using ASAP5 (Figure 9 and Figure 9—figure supplement 2; F1 score, Cohen’s d 1.35 vs. template matching and 5.1 vs. deconvolution).
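When ground-truth event times are available, as with the simultaneous electrophysiology recordings, precision, recall, and F1 score can be computed by matching detected events to ground-truth events. The sketch below shows one common way to do this; the greedy one-to-one matching and the tolerance window are assumptions of this example and are not taken from the manuscript.

```python
def detection_scores(detected, truth, tolerance):
    """Return (precision, recall, F1) for detected vs. ground-truth event times.

    Hypothetical helper: a detection counts as a true positive if it lies
    within `tolerance` of a not-yet-matched ground-truth event.
    """
    detected, truth = sorted(detected), sorted(truth)
    i = j = true_positives = 0
    while i < len(detected) and j < len(truth):
        if abs(detected[i] - truth[j]) <= tolerance:
            true_positives += 1
            i += 1
            j += 1
        elif detected[i] < truth[j]:
            i += 1  # detection without a matching event: false positive
        else:
            j += 1  # event without a matching detection: false negative

    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Example event times in seconds (made-up values for illustration only).
detected_times = [1.00, 2.02, 3.50, 5.00]
true_times = [1.01, 2.00, 4.00, 5.01, 6.00]
precision, recall, f1 = detection_scores(detected_times, true_times, tolerance=0.05)
```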
Impact:
Accurate detection of synaptic events is crucial for the study of neural function. miniML has a great potential to become a valuable tool for this purpose as it yields highly accurate detection rates, it is robust, and is relatively easily adaptable to different experimental setups.
Additional comments:
Line 73: the authors describe miniML as "parameter-free". Indeed, miniML does not require the selection of pulse shape, rise/fall time, or tuning of a threshold value. Still, I would not call it "parameter-free" as there are many parameters to tune, starting with the number of CNNs, and number of units through the parameters of the NNs. A more accurate description would be that as an AI-based method, the parameters of miniML are learned via training rather than tuned by the user.
We agree that a deep learning model is not parameter-free, and this term may be misleading. We have therefore changed this sentence in the introduction as follows: "The method is fast, robust to threshold choice, and generalizable across diverse data types [...]"
Line 302: the authors describe miniML as "threshold-independent". The output trace of the model has an extremely high SNR so a threshold of 0.5 typically works. Since a threshold is needed to determine the time stamps of events, I think a better description would be "robust to threshold choice".
To detect event localizations, a peak search is performed on the model output, which uses a minimum peak height parameter (or threshold). Extreme values for this parameter do indeed have a small impact on detection performance (Figure 3J). We have changed the description in the introduction and discussion according to the reviewer’s suggestion.
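The role of the minimum peak height can be illustrated with a small sketch of a thresholded peak search on a prediction trace. It is a simplified stand-in for the peak search described above, not the miniML code; the function name and the example trace are made up for illustration.

```python
def find_event_peaks(prediction, min_height=0.5):
    """Return indices of local maxima in `prediction` that reach `min_height`.

    Hypothetical helper operating on a list of model output values in [0, 1].
    """
    peaks = []
    for i in range(1, len(prediction) - 1):
        is_local_max = prediction[i - 1] < prediction[i] >= prediction[i + 1]
        if is_local_max and prediction[i] >= min_height:
            peaks.append(i)
    return peaks


# A high-SNR prediction trace: two clear events and one small bump.
prediction = [0.0, 0.1, 0.9, 0.2, 0.0, 0.3, 0.1, 0.0, 0.95, 0.97, 0.4]
peaks_default = find_event_peaks(prediction, min_height=0.5)
peaks_lenient = find_event_peaks(prediction, min_height=0.2)
peaks_extreme = find_event_peaks(prediction, min_height=0.99)
```

Because the two clear events are well separated from the small bump, a wide range of intermediate thresholds returns the same peaks, and only extreme values change the result.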
Reviewer 3 (Public Review):
miniML is presented as a novel supervised deep learning-based method for detecting and analyzing spontaneous synaptic events. The authors demonstrate the advantages of using their methods in comparison with previous approaches. The possibility to train the architecture on different tasks using transfer learning approaches is also an added value of the work. There are some technical aspects that would be worth clarifying in the manuscript:
(1) LSTM Layer Justification: Please provide a detailed explanation for the inclusion of the LSTM layer in the miniML architecture. What specific benefits does the LSTM layer offer in the context of synaptic event detection?
Our model design choice was inspired by similar approaches in the literature (Donahue et al. 2017; Islam et al. 2020; Passricha and Aggarwal 2019; Tasdelen and Sen 2021; Wang et al. 2020). Convolutional and recurrent neural networks are often combined for time-series classification problems as they allow learning spatial and temporal features, respectively. Combining the strengths of both network architectures can thus help improve the classification performance. Indeed, a CNN-LSTM architecture proved to be superior in both training accuracy and detection performance (Figure 1—figure supplement 2). Further, this architecture requires fewer free parameters than comparable model designs using fully connected layers instead. The revised manuscript shows a comparison of different model architectures (Figure 1—figure supplement 2), and we added the following description to the text (Methods, Deep learning model architecture):
"The combination of convolutional and recurrent neural network layers helps to improve the classification performance for time-series data. In particular, LSTM layers allow learning temporal features."
(2) Temporal Resolution: Can you elaborate on the reasons behind the lower temporal resolution of the output? Understanding whether this is due to specific design choices in the model, data preprocessing, or post-processing will clarify the nature of this limitation and its impact on the analysis.
When running inference on a continuous recording, we choose to use a sliding window approach with stride. Therefore, the model output has a lower temporal resolution than the raw data, which is determined by the stride length (i.e., how many samples to advance the sliding window). While using a stride is not required, it significantly reduces inference time (cf. Figure 2—figure supplement 1). We recommend a stride of 20 samples, which does not impact the detection of events. Any subsequent quantification of events (amplitude, area, risetimes, etc.) is performed on raw data. Based on the reviewer’s comment, we have adapted the code to resample the prediction trace to the sampling rate of the original data. This maintains temporal precision and avoids confusion.
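The relationship between stride and output resolution can be sketched as follows. This is an illustration only: the scoring function stands in for the trained model, and the function names and the choice of linear interpolation for resampling are assumptions of this example rather than a description of the miniML code.

```python
def sliding_window_scores(data, window, stride, score_fn):
    """Apply `score_fn` to windows of `data`, advancing by `stride` samples.

    Hypothetical helper: returns one score per window, so the output has
    roughly 1/stride of the temporal resolution of the input.
    """
    starts = range(0, len(data) - window + 1, stride)
    return [score_fn(data[s:s + window]) for s in starts]


def resample_to_raw_rate(scores, stride):
    """Linearly interpolate the scores back onto the raw sampling grid."""
    if len(scores) < 2:
        return list(scores)
    n_out = (len(scores) - 1) * stride + 1
    resampled = []
    for k in range(n_out):
        x = k / stride                      # position in units of windows
        i = min(int(x), len(scores) - 2)    # left neighbour
        frac = x - i
        resampled.append(scores[i] * (1 - frac) + scores[i + 1] * frac)
    return resampled


# Stand-in "model": the maximum of each window.
raw = list(range(100))
scores = sliding_window_scores(raw, window=20, stride=20, score_fn=max)
upsampled = resample_to_raw_rate([0.0, 1.0, 0.0], stride=4)
```

A larger stride means fewer model evaluations and therefore shorter inference time, at the cost of a coarser prediction trace before resampling.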
The Methods now include the following statement:
"To maintain temporal precision, the prediction trace is resampled to the sampling frequency of the raw data."
(3) Architecture optimization: how was the architecture CNN+LSTM optimized in terms of a number of CNN layers and size?
We performed a Bayesian optimization over a defined range of hyperparameters in combination with empirical hyperparameter tuning. We now describe this in the Methods section as follows:
"To optimise the model architecture, we performed a Bayesian optimisation of hyperparameters. Hyperparameter ranges were chosen for the free parameters of all layers. Optimisation was then performed with a maximum number of trials of 50. Models were evaluated using the validation dataset. Because higher number of free parameters tended to increase inference times, we then empirically tuned the chosen hyperparameter combination to achieve a trade-off between number of free parameters and accuracy."
Recommendations For The Authors
Reviewing Editor (Recommendations For The Authors):
Overall suggestions to the authors:
(1) Directly compare miniML with SimplyFire (which was not cited or discussed in the original manuscript), with both idealized and actual data. Discuss the pros/cons of each software.
We have conducted an extensive comparison between miniML and SimplyFire using both simulated and actual experimental data. This analysis is now presented in the revised Figure 3, Figure 3—figure supplement 1, and Figure 4—figure supplement 1. In addition, we have included relevant citations for SimplyFire in our manuscript. These additions provide a more comprehensive and balanced view of the available tools in the field, positioning our work within the broader context of existing solutions.
(2) Generate a better user interface akin to MiniAnalysis or SimplyFire.
We thank the editor and reviewers for the suggestion to improve the user interface. We have created a user-friendly graphical user interface (GUI) for miniML that is available on our GitHub repository. This GUI is now showcased in Figure 2—figure supplement 2 of the manuscript. The new interface allows users to load and analyze data through an intuitive point-and-click system, visualize results in real-time, and adjust parameters easily without coding knowledge. We have incorporated user feedback to refine the interface and improve user experience. These improvements significantly enhance the accessibility of miniML, making it more user-friendly for researchers with varying levels of programming expertise.
Reviewer 1 (Recommendations For The Authors):
Related to point (1) of the Public Review, we have taken the liberty to compare electrophysiological data using miniAnalysis, SimplyFire, and miniML. In our comparison, we note the following in our experience:
(1.1) In contrast to both SimplyFire and miniAnalysis, miniML does not currently have a user-friendly interface where the user can directly control or change the parameters of interest, nor does miniML have a user control center, so the user cannot simply type or select the mini manually. Rather, if any parameter needs to be changed, the user needs to read, understand, and change the original source code to generate the preferred change. This level of "activation energy" and required user coding expertise in computer science, which many researchers do not have, renders miniML much less accessible when directly compared to SimplyFire and miniAnalysis. Hence, unless miniML’s interface can be made more user-friendly, this is a major disadvantage, especially when compared to SimplyFire, which has many of the same features as miniML but with a much easier interface and user controls.
As suggested by the reviewer, we have created a graphical user interface (GUI) for miniML. The GUI allows easy data loading, filtering, analysis, event inspection, and saving of results without the need for writing Python code. Figure 2—figure supplement 2 illustrates the typical workflow for event analysis with miniML using the GUI and a screenshot of the user interface. Code to use miniML via the GUI is now included in the project’s GitHub repository. The GUI provides a simple and intuitive way to analyze synaptic events, whereas running miniML as a Python script allows for more customization and a high degree of automation.
(1.2) We compared electrophysiological miniature events between miniML, SimplyFire, and miniAnalysis. All three achieved similar mean amplitudes in "wild type" conditions, and conditions in which mini events were enhanced and diminished, so the overall means and utilities are similar, with miniML and SimplyFire being preferred given the flexibility and much faster analysis. We did note a few differences, however. SimplyFire tends to capture a high number of mini-events over miniML, especially in conditions of diminished mini amplitude (e.g., miniML found 76 events, while SimplyFire 587). The mean amplitudes, however, were similar. It seems that in data with low SNR, SimplyFire captures many more events as real minis that are probably noise, while miniML is more selective, which might be an advantage in miniML. That being said, we found SimplyFire to be superior in many respects, not least of which the user interface and experience.
We appreciate the reviewer’s thorough comparison of miniML, SimplyFire, and MiniAnalysis. While we acknowledge SimplyFire’s user-friendly interface, our study highlights several advantages of AI-based event analysis over conventional algorithmic approaches. Our updated benchmark analysis revealed better detection performance of miniML compared with SimplyFire (revised Figure 3), which had similar performance to deconvolution. As already noted by the reviewer, high false positive rates are a major issue of the SimplyFire approach. Although a minimum amplitude cutoff can partially resolve this problem, detection performance is highly sensitive to threshold setting (revised Figure 3). Another apparent disadvantage of SimplyFire is its relatively slow runtime (Figure 3—figure supplement 1). Finally, we have enhanced miniML’s accessibility by providing a graphical user interface that is easy to use and provides additional functionality.
Some technical comments:
(1) Improvements to the dependency versions of miniML: There is a need to clarify the versions of Python and TensorFlow used in this study and on GitHub. We used Python version 3.8.19 to load the miniML model. However, with Python >=3.9, as described on the GitHub page provided, it is difficult to install a matching h5py version. It is also inaccurate to state Python >=3.9, because the TensorFlow version for this framework needs to be around 2.13, whereas Python >=3.10 only offers TensorFlow 2.16 as the download choice. Therefore, as a Python framework, the dependency versions need to be specified on GitHub to allow researchers to access the model and use the entire work.
Thank you for highlighting this issue. We have now included specific version numbers in the requirements to avoid version conflicts and to ensure proper functioning of the code.
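To illustrate the kind of conflict the reviewer describes, the following minimal sketch compares installed package versions against a set of pins. The pin shown (TensorFlow 2.13) is taken from the reviewer's comment purely for illustration and is not necessarily what the repository's requirements specify; `PINS`, `matches_pin`, `find_conflicts`, and `installed_versions` are hypothetical names.

```python
from importlib import metadata

# Hypothetical pin for illustration only; consult the repository's
# requirements file for the versions the authors actually specify.
PINS = {"tensorflow": "2.13"}


def matches_pin(installed: str, pin: str) -> bool:
    """True if `installed` equals the pin or extends it (e.g. 2.13 -> 2.13.1)."""
    return installed == pin or installed.startswith(pin + ".")


def find_conflicts(installed: dict, pins: dict) -> dict:
    """Return {package: (installed_version_or_None, pin)} for every unmet pin."""
    conflicts = {}
    for name, pin in pins.items():
        have = installed.get(name)
        if have is None or not matches_pin(have, pin):
            conflicts[name] = (have, pin)
    return conflicts


def installed_versions(names) -> dict:
    """Look up installed versions; packages that are not installed are omitted."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    # An empty dict means the current environment satisfies every pin.
    print(find_conflicts(installed_versions(PINS), PINS))
```

Running the script inside the analysis environment before loading a model gives an early, readable warning instead of a failure at model-loading time.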
(2) Due to the intrinsic characteristics of the trained model, every model is only suitable for analyzing data with similar attributes. It is hard for researchers without a strong computer science background to train a new model themselves for their specific data. Therefore, it would be preferred if there were more available transfer learning models on GitHub accessible for researchers to adapt to their data.
We would like to thank the reviewer for this feedback. Trained models (such as the default model) can often be used on different data (see, e.g., Figure 4, where data from four distinct synaptic preparations were analyzed with the base model, and Figure 5—figure supplement 1). However, changes in event waveform and/or noise characteristics may necessitate transfer learning to obtain optimal results with miniML. We have revised the description and tutorial for model training on the project’s GitHub repository to provide more guidance in this process. In addition, we now provide a tutorial on how to use existing models on out-of-sample data with distinct kinetics, using resampling. We hope these updates to the miniML GitHub repository will facilitate the use of the method.
Following the suggestion by the reviewer, we have provided the transfer learning models used for the manuscript on the project’s GitHub repository to increase the number of available machine learning models for event detection. In addition, users of miniML are encouraged to supply their custom models. We hope that this will facilitate model exchange between laboratories in the future.
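The resampling idea mentioned above can be sketched as follows: if events in a new recording are slower or faster than those the model was trained on, the trace can be resampled so that, measured in samples, the events resemble the training data. This is a generic illustration using linear interpolation, not necessarily how the miniML tutorial implements it; `resample_to_match_kinetics` and `kinetics_ratio` are hypothetical names.

```python
import numpy as np


def resample_to_match_kinetics(trace, kinetics_ratio):
    """Resample `trace` so that event kinetics match the training data.

    kinetics_ratio is assumed to be (event time constant in the new data) /
    (event time constant in the training data). A ratio of 2 means events are
    twice as slow, so the trace is reduced to half as many samples.
    """
    if kinetics_ratio <= 0:
        raise ValueError("kinetics_ratio must be positive")
    trace = np.asarray(trace, dtype=float)
    n_new = max(2, int(round(trace.size / kinetics_ratio)))
    old_x = np.linspace(0.0, 1.0, trace.size)
    new_x = np.linspace(0.0, 1.0, n_new)
    # Linear interpolation only; when downsampling noisy data, a low-pass
    # filter should be applied first to limit aliasing.
    return np.interp(new_x, old_x, trace)
```

Event times detected on the resampled trace would then need to be multiplied by the same ratio to map them back onto the original time base.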
Reviewer 3:
I congratulate all authors for the convincing demonstration of their methodology, I do not have additional recommendations.
We would like to thank the reviewer for the positive assessment of our manuscript.
References
Delvendahl, I., Kita, K., & Müller, M. (2019). Rapid and sustained homeostatic control of presynaptic exocytosis at a central synapse. Proceedings of the National Academy of Sciences, 116(47), 23783–23789. https://doi.org/10.1073/pnas.1909675116
Donahue, J., Hendricks, L. A., Rohrbach, M., Venugopalan, S., Guadarrama, S., Saenko, K., & Darrell, T. (2017). Long-term recurrent convolutional networks for visual recognition and description. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4), 677–691. https://doi.org/10.1109/tpami.2016.2599174
Drummond, C., & Holte, R. C. (2003). C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. https://api.semanticscholar.org/CorpusID:204083391
Islam, M. Z., Islam, M. M., & Asraf, A. (2020). A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using x-ray images. Informatics in Medicine Unlocked, 20, 100412. https://doi.org/10.1016/j.imu.2020.100412
Passricha, V., & Aggarwal, R. K. (2019). A hybrid of deep CNN and bidirectional LSTM for automatic speech recognition. Journal of Intelligent Systems, 29(1), 1261–1274. https://doi.org/10.1515/jisys-2018-0372
Prati, R. C., Batista, G. E. A. P. A., & Monard, M. C. (2009). Data mining with imbalanced class distributions: Concepts and methods. Indian International Conference on Artificial Intelligence. https://api.semanticscholar.org/CorpusID:16651273
Tasdelen, A., & Sen, B. (2021). A hybrid CNN-LSTM model for pre-miRNA classification. Scientific Reports, 11(1). https://doi.org/10.1038/s41598-021-93656-0
Tyagi, S., & Mittal, S. (2020). Sampling approaches for imbalanced data classification problem in machine learning. In P. K. Singh, A. K. Kar, Y. Singh, M. H. Kolekar, & S. Tanwar (Eds.), Proceedings of ICRIC 2019 (pp. 209–221). Springer International Publishing.
Wang, H., Zhao, J., Li, J., Tian, L., Tu, P., Cao, T., An, Y., Wang, K., & Li, S. (2020). Wearable sensor-based human activity recognition using hybrid deep learning techniques. Security and Communication Networks, 2020, 1–12. https://doi.org/10.1155/2020/2132138
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Public review):
Summary:
This is a new and important system that can efficiently train mice to perform a variety of cognitive tasks in a flexible manner. It is innovative and opens the door to important experiments in the neurobiology of learning and memory.
Strengths:
Strengths include: high n's, a robust system, task flexibility, comparison of manual-like training vs constant training, circadian analysis, comparison of varying cue types, long-term measurement, and machine teaching.
Weaknesses:
I find no major problems with this report.
(1) Line 219: Water consumption per day remained the same, but the number of trials triggered increased as training continued. First, is this related to manual-type training? Also, I'm trying to understand this result quantitatively, since it seems counter-intuitive: I would assume that with more trials, more water would be consumed since accuracy should go up over training (so more water per average trial). Am I understanding this right? Can the authors give more detail or understanding to how more trials can be triggered but no more water is consumed despite training?
Thanks for the thoughtful comment. We would like to clarify the phenomenon described in Line 219: as training advanced, the number of trials triggered per day gradually decreased (rather than increased, as stated in the comment) for both the manual and autonomous groups of mice (Fig. 2H left). Performance, as you mentioned, improved over time, leading to an increased probability of obtaining water and thus a relatively stable daily water intake (Fig. 2H left). We believe this stable daily intake is the minimum amount of water required by the mice under autonomous behavioral training.
(2) Figure 2J: The X-axis should have some label: at least "training type". Ideally, a legend with colors can be included, although I see the colors elsewhere in the figure. If a legend cannot be added, then the color scheme should be explained in the caption.
(3) Figure 2K: What is the purple line? I encourage a legend here. The same legend could apply to 2J.
(4) Supplementary Figure S2 D: I do not think the phrase "relying on" is correct. Instead, I think "predicted by" or "correlating with" might be better.
We thank the reviewer for the valuable suggestion. We will address all these points and make the necessary revisions in the next version of our manuscript.
Reviewer #2 (Public review):
Summary:
The manuscript by Yu et al. describes a novel approach for collecting complex and different cognitive phenotypes in individually housed mice in their home cage. The authors report a simple yet elegant design that they developed for assessing a variety of complex and novel behavioral paradigms autonomously in mice.
Strengths:
The data are strong, the arguments are convincing, and I think the manuscript will be highly cited given the complexity of behavioral phenotypes one can collect using this relatively inexpensive ($100/box) and high throughput procedure (without the need for human interaction). Additionally, the authors include a machine learning algorithm to correct for erroneous strategies that mice develop which is incredibly elegant and important for this approach as mice will develop odd strategies when given complete freedom.
Weaknesses:
(1) A limitation of this approach is that it requires mice to be individually housed for days to months. This should be discussed in depth.
Thank you for raising this important point. We agree that the requirement for individual housing of mice during the training period is a limitation of our approach, and we appreciate the opportunity to discuss this in more depth. In the revised manuscript, we will add a dedicated section to the Discussion to address this limitation, including the potential impact of individual housing on the mice, the rationale for individual housing in our study, and efforts or alternatives made to mitigate the effects of individual housing.
(2) A major issue with continuous self-paced tasks such as the autonomous d2AFC used by the authors is that the inter-trial intervals can vary significantly. Mice may do a few trials, lose interest, and disengage from the task for several hours. This is problematic for data analysis that relies on trial duration to be similar between trials (e.g., reinforcement learning algorithms). It would be useful to see the task engagement of the mice across a 24-hour cycle (e.g., trials started, trials finished across a 24-hour period) and approaches for overcoming this issue of varying inter-trial intervals.
Thank you for your insightful comment regarding the variability in inter-trial intervals and its potential impact on data analysis. We agree that this is an important consideration for continuous self-paced tasks like the autonomous d2AFC paradigm used in our study. In the original manuscript, we showed the general task engagement across the 24-hour cycle (Fig. 2K). The distribution of inter-trial intervals was also illustrated (Fig. S3H); it shows that most trials have short intervals, although some are extremely long. We will include a more detailed analysis and discuss the challenges for data analysis.
Regarding approaches to mitigate the issue of varying inter-trial intervals, we will also discuss strategies to account for their effects, including trial selection and restricting engagement to a defined period (e.g., making the task available only during a fixed 2-hour period each day).
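As a rough illustration of trial selection, the sketch below groups trials into engagement bouts separated by long inter-trial intervals and counts trials per hour of the day. It assumes trial start times are given in seconds since midnight of the first recording day; the 60-second threshold and the names `assign_bouts` and `trials_per_hour` are arbitrary, hypothetical choices, not values from the manuscript.

```python
import numpy as np


def assign_bouts(start_times_s, max_iti_s=60.0):
    """Label each trial with a bout index; a new bout starts after a long gap.

    start_times_s: sorted trial start times in seconds.
    max_iti_s: gaps longer than this (an assumed threshold) split bouts.
    """
    t = np.asarray(start_times_s, dtype=float)
    if t.size == 0:
        return np.array([], dtype=int)
    iti = np.diff(t)
    # The first trial always opens bout 0.
    new_bout = np.concatenate(([True], iti > max_iti_s))
    return np.cumsum(new_bout) - 1


def trials_per_hour(start_times_s):
    """Count trials in each of the 24 clock hours, pooled across days."""
    t = np.asarray(start_times_s, dtype=float)
    hours = (t // 3600).astype(int) % 24
    return np.bincount(hours, minlength=24)
```

Analyses that assume comparable trial spacing could then be restricted to trials within a bout, for example by dropping the first trial of every bout.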
(3) Movies - it would be beneficial for the authors to add commentary to the video (hit, miss trials). It was interesting watching the mice but not clear whether they were doing the task correctly or not.
Thanks for the reminder. We will add subtitles to the videos in the next version.
(4) The strength of this paper (from my perspective) is the potential utility it has for other investigators trying to get mice to do behavioral tasks. However, not enough information was provided about the construction of the boxes, interface, and code for running the boxes. If the authors are not willing to provide this information through eLife, GitHub, or their own website then my evaluation of the impact and significance of this paper would go down significantly.
Thanks for this important comment. We would like to clarify that the construction methods, GUI, code for our system, PCB and CAD files (newly uploaded) have already been made publicly available on https://github.com/Yaoyao-Hao/HABITS. Additionally, we have open-sourced all the code and raw data for all training protocols (https://doi.org/10.6084/m9.figshare.27192897). We will continue to maintain these resources in the future.
Minor concerns:
Learning rate is confusing for Figure 3 results as it actually refers to trials to reach the criterion, and not the actual rate of learning (e.g., slope).
Thanks for pointing this out. We will make the revision in the next version.
Reviewer #3 (Public review):
Summary:
In this set of experiments, the authors describe a novel research tool for studying complex cognitive tasks in mice, the HABITS automated training apparatus, and a novel "machine teaching" approach they use to accelerate training by algorithmically providing trials to animals that provide the most information about the current rule state for a given task.
Strengths:
There is much to be celebrated in an inexpensively constructed, replicable training environment that can be used with mice, which have rapidly become the model species of choice for understanding the roles of distinct circuits and genetic factors in cognition. Lingering challenges in developing and testing cognitive tasks in mice remain, however, and these are often chalked up to cognitive limitations in the species. The authors' findings, however, suggest that instead, we may need to work creatively to meet mice where they live. In some cases, it may be that mice may require durations of training far longer than laboratories are able to invest with manual training (up to over 100k trials, over months of daily testing) but the tasks are achievable. The "machine teaching" approach further suggests that this duration could be substantially reduced by algorithmically optimizing each trial presented during training to maximize learning.
Weaknesses:
(1) Cognitive training and testing in rodent models fill a number of roles. Sometimes, investigators are interested in within-subjects questions - querying a specific circuit, genetically defined neuron population, or molecule/drug candidate, by interrogating or manipulating its function in a highly trained animal. In this scenario, a cohort of highly trained animals that have been trained via a method that aims to make their behavior as similar as possible is a strength.
However, often investigators are interested in between-subjects questions - querying a source of individual differences that can have long-term and/or developmental impacts, such as sex differences or gene variants. This is likely to often be the case in mouse models especially, because of their genetic tractability. In scenarios where investigators have examined cognitive processes between subjects in mice who vary across these sources of individual difference, the process of learning a task has been repeatedly shown to be different. The authors do not appear to have considered individual differences except perhaps as an obstacle to be overcome.
The authors have perhaps shown that their main focus is highly-controlled within-subjects questions, as their dataset is almost exclusively made up of several hundred young adult male mice, with the exception of 6 females in a supplemental figure. It is notable that these female mice do appear to learn the two-alternative forced-choice task somewhat more rapidly than the males in their cohort.
Thank you for your insightful comments and for highlighting the importance of considering both within-subject and between-subject questions in cognitive training and testing in rodent models.
We acknowledge that our study primarily focused on highly controlled within-subject questions. However, the datasets we provided show some evidence relevant to ‘between-subject’ questions: for example, the large variability in learning rates among mice (Fig. 2I), the overall difference in learning rate between male and female subjects (Fig. 2D vs. Fig. S2G, as the reviewer already mentioned), and the varying nocturnal behavioral patterns (Fig. 2K). We recognize the value of exploring between-subjects differences, and in the revised version we will discuss these points more systematically.
(2) Considering the implications for mice modeling relevant genetic variants, it is unclear to what extent the training protocols and especially the algorithmic machine teaching approach would be able to inform investigators about the differences between their groups during training. For investigators examining genetic models, it is unclear whether this extensive training experience would mitigate the ability to observe cognitive differences, or select the animals best able to overcome them - eliminating the animals of interest. Likewise, the algorithmic approach aims to mitigate features of training such as side biases, but it is worth noting that the strategic uses of side biases in mice, as in primates, can benefit learning, rather than side biases solely being a problem. However, the investigators may be able to highlight variables selected by the algorithm that are associated with individual strategies in performing their tasks, and this would be a significant contribution.
Thank you for the insightful comments. We acknowledge that the extensive training experience, particularly through the algorithmic machine teaching approach, could potentially influence the ability to observe cognitive differences between groups of mice with relevant genetic variants. However, our study design and findings suggest that this approach can still provide valuable insights into individual differences and the strategies used by the animals during training. First, the behavioral readouts mentioned above (including learning rate, engagement pattern, etc.) can reveal a number of differences among mice. Second, detailed modelling analysis (with logistic regression) can further dissect the strategies that mice use over the course of training (Fig. S2B). We have highlighted some variables selected by the regression that are associated with individual strategies in performing the tasks (Fig. S2C), and these strategies can differ between the manual and autonomous training groups (Fig. S2D). We will discuss these points further in the next version of the manuscript.
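A generic example of this kind of strategy analysis is a logistic regression of each choice on the current stimulus and the previous choice, where the fitted weights indicate how strongly the animal relies on the stimulus, on a fixed side bias, or on repeating its last choice. The sketch below is not the authors' model; the choice of regressors, the ±1 coding, and the names `build_design` and `fit_logistic` are assumptions made for illustration.

```python
import numpy as np


def build_design(stimulus, choice):
    """Build regressors for trials 2..N (the first trial has no history).

    stimulus, choice: arrays coded -1 (left) / +1 (right).
    Columns of X: side bias (constant), current stimulus, previous choice.
    y: 1.0 where the animal chose right, else 0.0.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    choice = np.asarray(choice, dtype=float)
    X = np.column_stack([
        np.ones(stimulus.size - 1),  # side bias
        stimulus[1:],                # current stimulus
        choice[:-1],                 # previous choice (perseveration)
    ])
    y = (choice[1:] > 0).astype(float)
    return X, y


def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Fit logistic-regression weights by plain gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * (X.T @ (p - y)) / y.size
    return w
```

Fitting the model in sliding windows over training would show how the weights change as an animal shifts from, say, a side bias towards stimulus-guided choices.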
(3) A final, intriguing finding in this manuscript is that animal self-paced training led to much slower learning than "manual" training, by having the experimenter introduce the animal to the apparatus for a few hours each day. Manual training resulted in significantly faster learning, in almost half the number of trials on average, and with significantly fewer omitted trials. This finding does not necessarily argue that manual training is universally a better choice because it leads to more limited water consumption. However, it suggests that there is a distinct contribution of experimenter interactions and/or switching contexts in cognitive training, for example by activating an "occasion setting" process to accelerate learning for a distinct period of time. Limiting experimenter interactions with mice may be a labor-saving intervention, but may not necessarily improve performance. This could be an interesting topic of future investigation, of relevance to understanding how animals of all species learn.
Thank you for your insightful comments. We agree that the finding that manual training led to significantly faster learning compared to self-paced training is both intriguing and important. One possible reason is the limited duration of engagement provided by the experimenter in manual training, which forced the mice to concentrate more on the trials (and thus omit fewer trials) than in autonomous training. Your suggestion that experimenter interactions might activate an "occasion setting" process is particularly interesting. In the context of our study, we could introduce, for example, a light serving as a cue that prompts the animals to engage; when the light is off, the task would no longer be accessible to the mice, simulating the manual training situation. We agree that this could be an interesting topic for future investigation and might create a more conducive environment for learning, thereby accelerating the learning rate.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1:
Summary:
The authors aim to predict ecological suitability for the transmission of highly pathogenic avian influenza (HPAI) using ecological niche models. This class of models identifies correlations between the locations of species or disease detections and the environment. These correlations are then used to predict habitat suitability (in this work, ecological suitability for disease transmission) in locations where surveillance of the species or disease has not been conducted. The authors fit separate models for HPAI detections in wild birds and farmed birds, for two strains of HPAI (H5N1 and H5Nx) and for two time periods, pre- and post-2020. The authors also validate models fitted to disease occurrence data from pre-2020 using post-2020 occurrence data.
Strengths:
The authors follow the established methods of Dhingra et al., 2016 to provide an updated spatial assessment of HPAI transmission suitability for two time periods, pre- and post-2020. They explore further methods of model cross-validation and consider the diversity of the bird species that HPAI has been detected in.
Weaknesses:
The precise ecological niche that the authors are modelling here is ambiguous: if we treat the transmission of HPAI in the wild bird population and in poultry populations as separate transmission cycles, linked by spillover events, then these transmission cycles are likely to have fundamentally different ecological niches.
We apologise if this aspect was not clear enough in the previous version of our manuscript. Our analyses do not assume distinct transmission cycles in wild and domestic bird species; these transmission cycles are indeed interconnected by frequent spillover events. We do, however, conduct independent ecological niche modelling analyses to estimate the ecological suitability for the risk of local circulation in domestic birds and, separately, in wild birds. This distinction does not imply that the virus circulates exclusively within one of these populations but rather allows us to identify potential differences in the environmental conditions associated with virus occurrences in each context.
Our results indicate that these two ecological niche models capture distinct environmental patterns. Virus occurrences in wild birds were primarily associated with factors such as open water and proximity to urban areas, while occurrences in domestic birds were more strongly linked to variables like poultry density and cultivated vegetation. This finding supports the existence of two distinct ecological niches for the virus, corresponding to virus circulation in wild and domestic bird populations. We thank the Reviewer for their feedback and we will take this opportunity to further clarify this aspect in the text.
While an "index case" in farmed poultry is relevant to the wildlife transmission cycle, further within-farm and farm-to-farm transmission is likely to be contingent on anthropogenic factors, rather than the environment. Similarly, we would expect "index cases" in outbreaks of HPAI in mammals to be relevant to transmission risk in wild birds - this data is not included in this manuscript. Such "index cases" in farmed poultry occur under separate ecological conditions to subsequent transmission in farmed poultry, so should be separated if possible. Some careful editing of the language used in the manuscript may elucidate some of my questions related to model conceptualisation.
We agree, but index cases are particularly difficult to separate from secondary spread in the absence of field investigation. Identification of index cases based on space-time filtering has been investigated previously but is strongly dependent on the quality of the surveillance, i.e. an “apparent” primary case can be a secondary case of previously undetected ones, and surveillance quality cannot be assumed to be homogeneous across countries. Our ecological niche modelling approach is based on HPAI cases reported in the EMPRES-i database, which includes all documented outbreaks without distinguishing primary introductions from subsequent farm-to-farm transmissions. Thus, our ecological niche models are trained on confirmed cases that result from a combination of different transmission dynamics, including introduction events in poultry populations (which can be impacted by ecological factors) and persistence within and between poultry populations (which can be impacted by anthropogenic factors).
For clarity, we will revise the manuscript to clarify that, while our study primarily aims to assess the environmental suitability for HPAI occurrences, the dataset does not exclude cases resulting from farm-to-farm spread. This means that our models can capture the environmental variables associated with the risk of cases associated with both primary introductions (e.g., spillover from wild birds) and secondary transmission events within poultry systems, although the latter is also influenced by anthropogenic factors such as biosecurity practices and poultry trade networks. These latter factors are not included in our models, which will be highlighted in the limitations (Discussion section) of the revised manuscript.
In addition, we note the Reviewer's comment regarding the relevance of “index cases” in mammalian outbreaks to understanding the risk of HPAI transmission in wild birds. Although these data are not included in our current study, we will highlight the potential value of incorporating these cases into future models in order to refine risk predictions, provided that they can be identified with some reasonable level of certainty.
The authors' handling of sampling bias in disease detection data in poultry is possibly inappropriate: one would expect the true spatial distribution of disease surveillance in poultry to be more closely correlated with poultry farming density, in contrast to human population density. This shortcoming in the modelling workflow possibly dilutes a key finding of the Results, that the transmission risk of HPAI in poultry is greatest in areas where poultry farming density is high.
The Reviewer raises a valid point that poultry surveillance efforts could be considered more closely correlated with poultry farm density than with human population density. While human population density can serve as a reasonable proxy for surveillance intensity (given that disease detection is often more active in areas with stronger veterinary notification systems), we acknowledge that poultry disease surveillance can also be influenced by the spatial distribution of poultry farms, as high-density poultry areas could be prioritised for monitoring. Please note that in our study, we followed a previously established approach (Dhingra et al. 2016) and weighted pseudo-absence sampling based on human population density to account for general surveillance biases. However, we do not agree that this choice dilutes the finding. In fact, assuming a sampling bias correlated with poultry density would reduce its estimated effect as a risk factor, whereas the current approach does not.
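Weighted pseudo-absence sampling of the kind described above can be sketched as drawing background grid cells with probability proportional to a bias surface such as human population density. This is a simplified illustration and not the authors' pipeline; excluding presence cells from the draw, sampling without replacement, and the name `sample_pseudo_absences` are assumptions.

```python
import numpy as np


def sample_pseudo_absences(weights, n_samples, presence_mask=None, seed=None):
    """Draw flat grid-cell indices with probability proportional to `weights`.

    weights: non-negative bias surface (e.g. human population density per cell).
    presence_mask: optional boolean array of the same shape; True cells hold
        recorded cases and are excluded from the draw (an assumed choice).
    """
    w = np.asarray(weights, dtype=float).ravel().copy()
    if presence_mask is not None:
        w[np.asarray(presence_mask, dtype=bool).ravel()] = 0.0
    total = w.sum()
    if total <= 0:
        raise ValueError("no cell has positive sampling weight")
    rng = np.random.default_rng(seed)
    # Sampling without replacement so that each cell is used at most once.
    return rng.choice(w.size, size=n_samples, replace=False, p=w / total)
```

For a two-dimensional raster, `np.unravel_index` converts the returned flat indices back to row and column positions. Replacing the weight surface with poultry density would place more pseudo-absences in poultry-dense cells, which is the mechanism by which the estimated effect of poultry density would be reduced.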
Reviewer #2:
Summary:
This study aimed to determine which spatial factors (conceived broadly as environmental, agronomic and socio-economic) explain greater avian influenza case numbers reported since 2020 (2020-2022) by comparing similar models built with data from the period 2015-2020. The authors have chosen an environmental niche modelling approach, where detected infections are modelled as a function of spatial covariates extracted at the location of each case. These covariates are available over the entire world so that the predictions can be projected back to space in the form of a continuous map.
Strengths:
The authors use boosted regression trees as the main analytical tool, which always feature among the best-performing models for environmental niche models (also known as habitat suitability models). They run replicate sets of the analysis for each of their model targets (wild/domestic x pathogen variant), which can help produce stable predictions. The authors take steps to ameliorate some forms of expected bias in the detection of cases, such as geographic variation in surveillance efforts, and in general more detections near areas of higher human population density.
Weaknesses:
The study is not altogether coherent with respect to time. Data sets for the response (H5N1 or H5Nx case data in domestic or wild birds) are divided into two periods: 2015-2020 and 2020-2022. Each set is modelled using a common suite of covariates that are not time-varying. That suggests that causation is inferred by virtue of cases being in different geographic areas in those two time periods. Furthermore, important predictors such as chicken density appear to be informed (in the areas of high risk) from census data from before 2010. The possibility for increased surveillance effort *through time* is overlooked, as is the possibility that previously high-burden locations may implement practice changes to reduce vulnerability.
We acknowledge the Reviewer's comments regarding the consistency of time periods in our study. Our approach is to divide the HPAI case data into two time periods (2015-2020 and 2020-2022) and to train ecological niche models using a common set of covariates that do not explicitly account for temporal variation. We will further clarify these aspects in the revised version of our manuscript:
(1) Our primary objective is to assess changes in ecological suitability over time rather than infer direct causation. By comparing models trained on pre-2020 data with post-2020 occurrences, we evaluate whether pre-2020 environmental conditions can predict recent HPAI suitability. However, we acknowledge that this does not capture dynamic changes in surveillance efforts, biosecurity measures, or host-pathogen interactions over time.
(2) Regarding predictor variables, we used poultry density data from 2015, rather than pre-2010 data. However, this dataset is not based on a single census year; instead, it represents a median estimate derived from subnational poultry census data collected between 2000 and 2019. This median year approach provides a more stable representation of poultry density than any single-year snapshot. Furthermore, while poultry production systems may exhibit some temporal variation, these changes are generally minor compared to the inter-annual variability observed in HPAI occurrence, which is largely driven by epidemic dynamics. Given the current limitations of global poultry data, distinguishing distributions from different years is not feasible with the available GLW dataset. We will clarify these points in the manuscript.
(3) We recognise that increased surveillance efforts and adaptive changes in poultry farming practices could influence the observed HPAI case distribution. While our current models do not incorporate time-varying surveillance intensity or biosecurity policies, we will address this limitation in the Discussion section and suggest that future work integrates dynamic surveillance data to improve risk assessments.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Public review):
Wang et al. investigated how sexual failure influences sweet taste perception in male Drosophila. The study revealed that courtship failure leads to decreased sweet sensitivity and feeding behavior via dopaminergic signaling. Specifically, the authors identified a group of dopaminergic neurons projecting to the suboesophageal zone that interacts with sweet-sensing Gr5a+ neurons. These dopaminergic neurons positively regulate the sweet sensitivity of Gr5a+ neurons via DopR1 and Dop2R receptors. Sexual failure diminishes the activity of these dopaminergic neurons, leading to reduced sweet-taste sensitivity and sugar-feeding behavior in male flies. These findings highlight the role of dopaminergic neurons in integrating reproductive experiences to modulate appetitive sensory responses.
Previous studies have explored the dopaminergic-to-Gr5a+ neuronal pathways in regulating sugar feeding under hunger conditions. Starvation has been shown to increase dopamine release from a subset of TH-GAL4 labeled neurons, known as TH-VUM, in the suboesophageal zone. This enhanced dopamine release activates dopamine receptors in Gr5a+ neurons, heightening their sensitivity to sugar and promoting sucrose acceptance in flies. Since the function of the dopaminergic-to-Gr5a+ circuit motif has been well established, the primary contribution of Wang et al. is to show that mating failure in male flies can also engage this circuit to modulate sugar-feeding behavior. This contribution is valuable because it highlights the role of dopaminergic neurons in integrating diverse internal state signals to inform behavioral decisions.
An intriguing discrepancy between Wang et al. and earlier studies lies in the involvement of dopamine receptors in Gr5a+ neurons. Prior research has shown that Dop2R and DopEcR, but not DopR1, mediate starvation-induced enhancement of sugar sensitivity in Gr5a+ neurons. In contrast, Wang et al. found that DopR1 and Dop2R, but not DopEcR, are involved in the sexual failure-induced decrease in sugar sensitivity in these neurons. I wish the authors had further explored or discussed this discrepancy, as it is unclear how dopamine release selectively engages different receptors to modulate neuronal sensitivity in a context-dependent manner.
Our immunostaining experiments showed that three dopamine receptors, DopR1, Dop2R, and DopEcR, were expressed in Gr5a<sup>+</sup> neurons in the proboscis, consistent with previous findings obtained using RT-PCR (Inagaki et al 2012). As the reviewer pointed out, we found that DopR1 and Dop2R were required for courtship failure-induced suppression of sugar sensitivity, whereas Marella et al 2012 and Inagaki et al 2012 found that Dop2R and DopEcR were required for starvation-induced enhancement of sugar sensitivity. These results may suggest that different internal states (courtship failure vs. starvation) modulate the peripheral sensory system via different signaling pathways (e.g. different subsets of dopaminergic neurons; different dopamine release mechanisms; and different dopamine receptors). We will further discuss these possibilities in the revised manuscript.
The data presented by Wang et al. are solid and effectively support their conclusions. However, certain aspects of their experimental design, data analysis, and interpretation warrant further review, as outlined below.
(1) The authors did not explicitly indicate the feeding status of the flies, but it appears they were not starved. However, the naive and satisfied flies in this study displayed high feeding and PER baselines, similar to those observed in starved flies in other studies. This raises the concern that sexually failed flies may have consumed additional food during the 4.5-hour conditioning period, potentially lowering their baseline hunger levels and subsequently reducing PER responses. This alternative explanation is worth considering, as an earlier study demonstrated that sexually deprived males consumed more alcohol, and both alcohol and food are known rewards for flies. To address this concern, the authors could remove food during the conditioning phase to rule out its influence on the results.
We think this is a valid concern. We will conduct courtship conditioning in the absence of food and test if courtship failure can still suppress sugar sensitivity in the revised manuscript.
(2) Figure 1B reveals that approximately half of the males in the Failed group did not consume sucrose yet Figure 1-S1A suggests that the total volume consumed remained unchanged. Were the flies that did not consume sucrose omitted from the dataset presented in Figure 1-S1A? If so, does this imply that only half of the male flies experience sexual failure, or that sexual failure affects only half of males while the others remain unaffected? The authors should clarify this point.
Here is a brief clarification of our experimental design and we will further clarify the details in the revised manuscript:
After the behavioral conditioning, male flies were divided for two assays. On the one hand, we quantified PER responses of individual flies. As shown in Figure 1C, Failed males exhibited decreased sweet sensitivity (as demonstrated by the right shift of the response curve).
On the other hand, we sought to quantify food consumption of individual flies by using the MAFE assay (Qi et al 2005). When presented with 400 mM sucrose, approximately 100% of the flies in the Naïve and Satisfied groups, and 50% of the flies in the Failed group, extended their proboscis and started feeding (Figure 1B). For these flies, we could quantify the consumed volumes and found there was no change (Figure 1, S1A). We should also note the consistency of these two experiments, e.g. in Figure 1C, only 50-60% of Failed males responded to 400 mM stimulation.
These two experiments in combination suggest that sexual failure suppressed sweet sensitivity of the Failed males. Meanwhile, as long as they still initiated feeding, the volume of food consumption remained unchanged. These results led us to focus on the modulatory effect of sexual failure on the sensory system, the main topic of this present study.
In addition, to further clarify the potential misunderstanding, we plan to examine food consumption by using 800 mM sucrose in the revised manuscript. As shown in Figure 1C, 800 mM sucrose was adequate to induce feeding in ~100% of the flies.
(3) The evidence linking TH-GAL4 labeled dopaminergic neurons to reduced sugar sensitivity in Gr5a+ neurons in sexually failed males could be further strengthened. Ideally, the authors would have activated TH-GAL4 neurons and observed whether this restored GCaMP responses in Gr5a+ neurons in sexually failed males. Instead, the authors performed a less direct experiment, shown in Figures 3-S1C and D. The manuscript does not describe the condition of the flies used in this experiment, but it appears that they were not sexually conditioned. I have two concerns with this experiment. First, no statistical analysis was provided to support the enhancement of sucrose responses following activation of TH-GAL4 neurons. Second, without performing this experiment in sexually failed males, the authors lack direct evidence to confirm that the dampened response of Gr5a+ neurons to sucrose results from decreased activity in TH-GAL4 neurons.
We think this is also a valid suggestion. We will directly examine whether activating TH<sup>+</sup> neurons in sexually conditioned males would enhance sugar responses of Gr5a<sup>+</sup> neurons in sexually failed males. We will also add in statistical analysis.
Nevertheless, we would still argue that our current experiments using Naive males (Figure 3, S1C-D) are adequate to show a functional link between TH<sup>+</sup> neurons and Gr5a<sup>+</sup> neurons. Combined with the results that these neurons form active synapses (Figure 3, S1B) and that the activity of TH<sup>+</sup> neurons was dampened in sexually failed males (Figure 3G-I), our current data support the notion that sexual failure suppresses sweet sensitivity via TH-Gr5a circuitry.
(4) The statistical methods used in this study are poorly described, making it unclear which method was used for each experiment. I suggest that the authors include a clear description of the statistical methods used for each experiment in the figure legends. Furthermore, as I have pointed out, there is a lack of statistical comparisons in Figures 3-S1C and D, a similar problem exists for Figures 6E and F.
We will add detailed information of statistical analysis in each figure legend.
(5) The experiments in Figure 5 lack specificity. The target neurons in this study are Gr5a+ neurons, which are directly involved in sugar sensing. However, the authors used the less specific Dop1R1- and Dop2R-GAL4 lines for their manipulations. Using Gr5a-GAL4 to specifically target Gr5a+ neurons would provide greater precision and ensure that the observed effects are directly attributable to the modulation of Gr5a+ neurons, rather than being influenced by potential off-target effects from other neuronal populations expressing these dopamine receptors.
We agree with the reviewer that manipulating Dop1R1 and Dop2R genes (Figure 4) and the neurons expressing them (Figure 5) might have broader impacts. In fact, we have also tested the role of Dop1R1 and Dop2R in Gr5a<sup>+</sup> neurons by RNAi experiments (Figure 6). As shown by both behavioral and calcium imaging experiments, knocking down Dop1R1 and Dop2R in Gr5a<sup>+</sup> neurons both eliminated the effect of sexual failure to dampen sweet sensitivity, further confirming the role of these two receptors in Gr5a<sup>+</sup> neurons.
(6) I found the results presented in Fig. 6F puzzling. The knockdown of Dop2R in Gr5a+ neurons would be expected to decrease sucrose responses in naive and satisfied flies, given the role of Dop2R in enhancing sweet sensitivity. However, the figure shows an apparent increase in responses across all three groups, which contradicts this expectation. The authors may want to provide an explanation for this unexpected result.
We agree that there might be some potential discrepancies. However, our current data are not adequate to clarify them, given that the experiments shown in Figure 6E-F and the apparent control (Figure 3C) were not conducted under identical settings at the same time (that’s why we did not directly compare these results). One way to address the issue is to conduct these calcium imaging experiments again with a head-to-head comparison with the control group (Gr5a-GCaMP, +/- Dop1R1 and Dop2R RNAi). We will conduct the experiments and present the data in the revised manuscript.
(7) In several instances in the manuscript, the authors described the effects of silencing dopamine signaling pathways or knocking down dopamine receptors in Gr5a neurons with phrases such as 'no longer exhibited reduced sweet sensitivity' (e.g., L269 and L288), 'prevent the reduction of sweet sensitivity' (e.g., L292), or 'this suppression was reversed' (e.g. L299). I found these descriptions misleading, as they suggest that sweet sensitivity in naive and satisfied groups remains normal while the reduction in failed flies is specifically prevented or reversed. However, this is not the case. The data indicate that these manipulations result in an overall decrease in sweet sensitivity across all groups, such that a further reduction in failed flies is not observed. I recommend revising these descriptions to accurately reflect the observed phenotypes and avoid any confusion regarding the effects of these manipulations.
We will change our expressions in the revised manuscript. In brief, we think that these manipulations (suppressing Dop1R1<sup>+</sup> and Dop2R<sup>+</sup> neurons) have two consequences: suppressing the overall sweet sensitivity and eliminating the effect of sexual failure.
Reviewer #2 (Public review):
Summary:
The authors exposed naïve male flies to different groups of females, either mated or virgin. Male flies can successfully copulate with virgin females; however, they are rejected by mated females. This rejection reduces sugar preference and sensitivity in males. Investigating the underlying neural circuits, the authors show that dopamine signaling onto GR5a sensory neurons is required for reduced sugar preference. GR5a sensory neurons respond less to sugar exposure when they lack dopamine receptors.
Strengths:
The findings add another strong phenotype to the existing dataset about brain-wide neuromodulatory effects of mating. The authors use several state-of-the-art methods, such as activity-dependent GRASP to decipher the underlying neural circuitry. They further perform rigorous behavioral tests and provide convincing evidence for the local labellar circuit.
Weaknesses:
The authors focus on the circuit connection between dopamine and gustatory sensory neurons in the male SEZ. Therefore, it is still unknown how mating modulates dopamine signaling and what possible implications on other behaviors might result from a reduced sugar preference.
We agree with the reviewer that in the current study, we did not examine how mating experience suppressed the activity of dopaminergic neurons in the SEZ. The current study mainly focused on the behavioral characterization (sexual failure suppresses sweet sensitivity) and the downstream mechanism (TH-Gr5a pathway). We think that examining the upstream modulatory mechanism may be more suitable for a separate future study.
We believe that a sustained reduction in sweet sensitivity (not limited to sucrose but extending to other sweet compounds, Figure 1, S1B-C) upon sexual failure suggests a generalized and sustained consequence on reward-related behaviors. Sexual failure may thus resemble a state of “primitive emotion” in fruit flies. We will further discuss this possibility in the revised manuscript.
Reviewer #3 (Public review):
Summary
In this work, the authors asked how mating experience impacts reward perception and processing. For this, they employ fruit flies as a model, with a combination of behavioral, immunostaining, and live calcium imaging approaches.
Their study allowed them to demonstrate that courtship failure decreases the fraction of flies motivated to eat sweet compounds, revealing a link between reproductive stress and reward-related behaviors. This effect is mediated by a small group of dopaminergic neurons projecting to the SEZ. After courtship failure, these dopaminergic neurons exhibit reduced activity, leading to decreased Gr5a+ neuron activity via Dop1R1 and Dop2R signaling, and leading to reduced sweet sensitivity. The authors therefore showed how mating failure influences broader behavioral outputs through suppression of the dopamine-mediated reward system and underscores the interactions between reproductive and reward pathways.
Concern
My main concern regarding this study lies in the way the authors chose to present their results. If I understood correctly, they provided evidence that mating failure induces a decrease in the fraction of flies exhibiting PER. However, they also showed that food consumption was not affected (Fig. 1, supplement), suggesting that individuals who did eat consumed more. This raises questions about the analysis and interpretation of the results. Should we consider the group as a whole, with a reduced sensitivity to sweetness, or should we focus on individuals, with each one eating more? I am also concerned about how this could influence the results obtained using live imaging approaches, as the flies being imaged might or might not have been motivated to eat during the feeding assays. I would like the authors to clarify their choice of analysis and discuss this critical point, as the interpretation of the results could potentially be the opposite of what is presented in the manuscript.
Here is a brief clarification of our experimental design and we will further clarify the details in the revised manuscript:
After the behavioral conditioning, male flies were divided for two assays. On the one hand, we quantified PER responses of individual flies. As shown in Figure 1C, Failed males exhibited decreased sweet sensitivity (as demonstrated by the right shift of the response curve).
On the other hand, we sought to quantify food consumption of individual flies by using the MAFE assay (Qi et al 2005). When presented with 400 mM sucrose, approximately 100% of the flies in the Naïve and Satisfied groups, and 50% of the flies in the Failed group, extended their proboscis and started feeding (Figure 1B). For these flies, we could quantify the consumed volumes and found there was no change (Figure 1, S1A). We should also note the consistency of these two experiments, e.g. in Figure 1C, only 50-60% of Failed males responded to 400 mM stimulation.
These two experiments in combination suggest that sexual failure suppressed sweet sensitivity of the Failed males. Meanwhile, as long as they still initiated feeding, the volume of food consumption remained unchanged. These results led us to focus on the modulatory effect of sexual failure on the sensory system, the main topic of this present study.
In addition, to further clarify the potential misunderstanding, we plan to examine food consumption by using 800 mM sucrose instead. As shown in Figure 1C, 800 mM sucrose was adequate to induce feeding in ~100% of the flies.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This valuable study combined whole-head magnetoencephalography (MEG) and subthalamic (STN) local field potential (LFP) recordings in patients with Parkinson's disease undergoing deep brain stimulation surgery. The paper provides solid evidence that cortical and STN beta oscillations are sensitive to movement context and may play a role in the coordination of movement redirection.
We are grateful for the expert assessment by the editor and the reviewers. Below we provide point-by-point replies to both public and private reviews. We have tried to keep the answers in the public section short and concise, not citing the changed passages unless the point does not re-appear in the recommendations. There, we did include all of the changes to the manuscript, such that the reviewers need not go back and forth between replies and manuscript.
The reviewer comments have not only led to numerous improvements of the text, but also to new analyses, such as Granger causality analysis, and to methodological improvements e.g. including numerous covariates in the statistical analyses. We believe that the article improved substantially through the feedback, and we thank the reviewers and the editor for their effort.
Public Reviews
Reviewer #1 (Public review):
Summary:
Winkler et al. present brain activity patterns related to complex motor behaviour by combining whole-head magnetoencephalography (MEG) with subthalamic local field potential (LFP) recordings from people with Parkinson's disease. The motor task involved repetitive circular movements with stops or reversals associated with either predictable or unpredictable cues. Beta and gamma frequency oscillations are described, and the authors found complex interactions between recording sites and task conditions. For example, they observed stronger modulation of connectivity in unpredictable conditions. Moreover, STN power varied across patients during reversals, which differed from stopping movements. The authors conclude that cortex-STN beta modulation is sensitive to movement context, with potential relevance for movement redirection.
Strengths:
This study employs a unique methodology, leveraging the rare opportunity to simultaneously record both invasive and non-invasive brain activity to explore oscillatory networks.
Weaknesses:
It is difficult to interpret the role of the STN in the context of reversals because no consistent activity pattern emerged.
We thank the reviewer for the valuable feedback to our study. We agree that the interpretation of the role of the STN during reversals is rather difficult, because reversal-related STN activity was highly variable across patients. Although there seem to be consistent patterns in sub-groups of the current cohort, with some patients showing event-related increases (Fig. 3b) and others showing decreases, the current dataset is not large enough to substantiate or even explain the existence of such clusters. Thus, we limit ourselves to acknowledging this limitation and discussing potential reasons for the high variability, namely variability in electrode placement and insufficient spatial resolution for the separation of specialized cell ensembles within the STN (see Discussion, section Limitations and future directions).
Reviewer #2 (Public review):
Summary:
This study examines the role of beta oscillations in motor control, particularly during rapid changes in movement direction among patients with Parkinson's disease. The researchers utilized magnetoencephalography (MEG) and local field potential (LFP) recordings from the subthalamic nucleus to investigate variations in beta band activity within the cortex and STN during the initiation, cessation, and reversal of movements, as well as the impact of external cue predictability on these dynamics. The primary finding indicates that beta oscillations more effectively signify the start and end of motor sequences than transitions within those sequences. The article is well-written, clear, and concise.
Strengths:
The use of a continuous motion paradigm with rapid reversals extends the understanding of beta oscillations in motor control beyond simple tasks. It offers a comprehensive perspective on subthalamocortical interactions by combining MEG and LFP.
Weaknesses:
(1) The small and clinically diverse sample size may limit the robustness and generalizability of the findings. Additionally, the limited exploration of causal mechanisms reduces the depth of its conclusions and focusing solely on Parkinson's disease patients might restrict the applicability of the results to broader populations.
We thank the reviewer for the insightful feedback. We address these issues one by one in our responses to points 2, 4 and 6, respectively.
(2) The small sample size and variability in clinical characteristics among patients may limit the robustness of the study's conclusions. It would be beneficial for the authors to acknowledge this limitation and propose strategies for addressing it in future research. Additionally, incorporating patient-specific factors as covariates in the ANOVA could help mitigate the confounding effects of heterogeneity.
Thank you for this comment. The challenges associated with recording brain activity peri-operatively can be a limiting factor when it comes to sample size and cohort stratification. We now acknowledge this in the revised discussion (section Limitations and future directions). Furthermore, we suggest using sensing-capable devices in the future as a measure to increase sample sizes (Discussion, section Limitations and future directions). Lastly, we appreciate the idea of adding patient-specific factors as covariates to the ANOVAs and have thus included age, disease duration and pre-surgical UPDRS score into our models. This did not lead to any qualitative changes of statistical effects.
(3) The author may consider using standardized statistics, such as effect size, that would provide a clearer picture of the observed effect magnitude and improve comparability.
Thanks for the suggestion. As measures of effect size, we have added partial eta squared (η<sub>p</sub><sup>2</sup>) to the results of all ANOVAs and Cohen’s d to all follow-up t-tests.
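For readers unfamiliar with the two effect-size measures named in this reply, the following is a minimal sketch of how they are commonly computed. It is not taken from the manuscript: the function names are hypothetical, and the paired-samples form of Cohen's d (mean difference divided by the standard deviation of the differences) is an assumption, since the reply does not state which variant was used.

```python
import statistics


def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared from the effect and error sums of squares."""
    return ss_effect / (ss_effect + ss_error)


def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Equivalent form when only the F statistic and its degrees of freedom are reported."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_d_paired(a, b):
    """Cohen's d for paired samples (assumed variant): mean of differences / SD of differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / statistics.stdev(diffs)
```

The two partial eta squared functions agree whenever F equals (SS_effect / df_effect) / (SS_error / df_error), which is a convenient check when recomputing effect sizes from published ANOVA tables.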
(4) Although the study identifies relevance between beta activity and motor events, it lacks causal analysis and discussion of potential causal mechanisms. Given the valuable datasets collected, exploring or discussing causal mechanisms would enhance the depth of the study.
We appreciate this idea and have conducted Granger causality analyses in response to this comment. This new analysis reveals that there is a strong cortical drive to the STN for all movements of interest and predictability conditions in the beta band. The detailed results can be viewed on p. 16 in the section on Granger causality. For statistical testing, we conducted an rmANCOVA, similar to those for power and coherence (see p. 46-48 and 54-56 for the corresponding tables), as well as t-tests assessing directionality (Figure 6-figure supplement 2 on p. 35). In the discussion section, we connect these results with prior findings suggesting that the frontal cortex drives the STN in the beta band, likely through hyperdirect pathway fibers (p. 17).
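The idea behind Granger causality can be illustrated with a short time-domain sketch: a source signal is said to Granger-cause a target if adding the source's past to a model of the target's own past reduces the prediction error. The code below is an illustration only and is not the authors' analysis, which is reported per frequency band; the function names, the lag of one sample, and the simulated `cortex` and `stn` toy signals are all assumptions made for the example.

```python
import math
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (an intercept is added)."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations (X^T X) beta = X^T y, stored as an augmented matrix.
    aug = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           + [sum(d[i] * t for d, t in zip(design, targets))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, t in zip(design, targets))


def granger_index(source, target, lag=1):
    """ln(RSS_restricted / RSS_full) for the direction source -> target."""
    n = len(target)
    own_past = [[target[t - k] for k in range(1, lag + 1)] for t in range(lag, n)]
    both_past = [own + [source[t - k] for k in range(1, lag + 1)]
                 for own, t in zip(own_past, range(lag, n))]
    future = list(target[lag:])
    return math.log(ols_rss(own_past, future) / ols_rss(both_past, future))


# Toy data (hypothetical): "cortex" drives "stn" with a one-sample delay.
rng = random.Random(0)
cortex = [rng.gauss(0, 1) for _ in range(2000)]
stn = [0.0] + [0.8 * cortex[t - 1] + rng.gauss(0, 0.5) for t in range(1, 2000)]

cortex_to_stn = granger_index(cortex, stn)
stn_to_cortex = granger_index(stn, cortex)
```

In this simulation the index is large for the cortex-to-STN direction and close to zero for the reverse direction, which is the kind of asymmetry a directionality test looks for. A frequency-resolved analysis would additionally decompose the index across frequencies.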
(5) The study cohort focused on senior adults, who may exhibit age-related cortical responses during movement planning in neural mechanisms. These aspects were not discussed in the study.
We appreciate the comment and agree that age may have impacted neural oscillatory activity of patients in the present study. We now acknowledge this in the limitations section, and point out that our approach to handling these effects was including age as a covariate in the statistical analyses.
(6) Including a control group of patients with other movement disorders who also undergo DBS surgery would be beneficial, because we cannot otherwise determine whether the observed findings are specific to PD or can be generalized. Additionally, the current title and the article, which are oriented toward understanding human motor control, may not be appropriate.
We thank the reviewer for this comment and fully agree that it cannot be ruled out that the present findings are, in part, specific to PD. We acknowledge this limitation in the Limitations and future directions section (p. 20-21). Indeed, including a control group of patients with other disorders would be ideal, but the scarcity of patients with diseases other than PD who receive STN DBS in our centre makes this an unfeasible option in practical terms. We do suggest that future research may address this issue by extending our approach to different disorders or healthy participants on the cortical level (p. 21). Lastly, we appreciate the idea to adjust the title of the present article. The adjusted title is: “Context-Dependent Modulations of Subthalamo-Cortical Synchronization during Rapid Reversals of Movement Direction in Parkinson’s Disease”.
That being said, we do believe that our findings at least approximate healthy functioning and are not solely related to PD. For one, patients were on their usual dopaminergic medication and dopamine has been found to normalize pathological alterations of beta activity. Further, the general pattern of movement-related beta and gamma oscillations reported here has been observed in numerous diseases and brain structures, including cortical beta oscillations measured non-invasively in healthy participants.
Reviewer #3 (Public review):
Summary:
The study highlights how the initiation, reversal, and cessation of movements are linked to changes in beta synchronization within the basal ganglia-cortex loops. It was observed that different movement phases, such as starting, stopping briefly, and stopping completely, affect beta oscillations in the motor system.
It was found that unpredictable cues lead to stronger changes in STN-cortex beta coherence. Additionally, specific patterns of beta and gamma oscillations related to different movement actions and contexts were observed. Stopping movements was associated with a lack of the expected beta rebound during brief pauses within a movement sequence.
Overall, the results underline the complex and context-dependent nature of motor-control and emphasize the role of beta oscillations in managing movement according to changing external cues.
Strengths:
The paper is very well written, clear, and appears methodologically sound.
The use of continuous movement (turning) with reversals is more naturalistic than many previous button push paradigms.
Weaknesses:
The generalizability of the findings is somewhat curtailed by the fact that this was performed perioperatively during the period of the microlesion effect. Given the availability of sensing-enabled DBS devices now and HD-EEG, does MEG offer a significant enough gain in spatial localizability to offset the fact that it has to be done shortly postoperatively with externalized leads, with an attendant stun effect? Specifically, for paradigms that are not asking very spatially localized questions as a primary hypothesis?
We appreciate the reviewer’s feedback and acknowledge the valid point raised on the timing of our measurements. Indeed, sensing-enabled devices offer a valid alternative to peri-operative recordings, circumventing the stun effect. We acknowledge this in the revised discussion, section Limitations and future directions (p. 23): “Additionally, future research could capitalize on sensing-capable devices to circumvent the necessity to record brain activity peri-operatively, facilitating larger sample sizes and circumventing the stun effect, an immediate improvement in motor symptoms arising as a consequence of electrode implantation (Mann et al., 2009).” This alternative strategy, however, was not an option here because we did not have a sufficient number of patients implanted with sensing-enabled devices at the time when the data collection was initialized.
That being said, we would like to highlight that in the present study, our goal was not to study pathology related to Parkinson’s disease. Rather, we aimed to learn about motor control in general. The stun effect may have facilitated motor performance in our patients, which is actually beneficial to the research goals at hand.
Further investigation of the gamma signal seems warranted, even though it has a slightly lower proportional change in amplitude in beta. Given that the changes in gamma here are relatively wide band, this could represent a marker of neural firing that could be interestingly contrasted against the rhythm account presented.
We appreciate the reviewer’s interest and we have extended the investigation of gamma oscillations. We now provide statistics regarding the influence of predictability on gamma power and gamma coherence (no significant effects) and explore Granger causality in the gamma (and beta) band (see comment 4 of reviewer 2). Unfortunately, we cannot measure spiking via the DBS electrode, and therefore we cannot investigate correlations between gamma oscillatory activity and action potentials. We do agree with the reviewer, however, that action potentials rather than oscillations form the basis of motor control in the brain. This view of ours is now reflected in the revised discussion, section Limitations and future directions (p. 21): “Lastly, given the present study’s focus on understanding movement-related rhythms, particularly in the beta range, future research could further explore the role of gamma oscillations in continuous movement and their relation to action potentials in motor areas (Fischer et al., 2020; Igarashi, Isomura, Arai, Harukuni, & Fukai, 2013), which form the basis of movement encoding in the brain.”
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
This is a well-conducted study and overall the results are clear. I only have one minor suggestion for improvement of the manuscript. I found the order of appearance of the results somewhat confusing, switching from predictability-related behavioral effects to primarily stopping and reversal-related neurophysiological effects, back to predictability but starting with coherence. I would suggest that the authors try to follow a systematic order focused on the questions at hand. E.g. perhaps readability could be improved if the results section is split into reversal vs. stopping related effects, reporting behavior, power, and coherence in this order, followed by a predictability section, again reporting behavior, power, and coherence. Obviously, this is an optional suggestion. Apart from that, I just missed a more direct message related to the absence of statistical significance related to STN power changes during reversal. I think this could be made more clear in the text.
We thank the reviewer for the feedback on our study. To ease reading, we modified the order and added sub-titles to the results section. We start with Behavior (p. 4) and then move on to Power (general movement effects on power – movement effects on STN power – movement effects on cortical power – predictability effects on power). Next, we move on to Connectivity (movement effects on connectivity – predictability effects on connectivity – Granger causality). We hope that these adaptations will help guide the reader.
Additionally, we thank the reviewer for noting that we did not explicitly mention the lack of statistical significance of reversal-related beta power modulations in the STN. We have adapted the section on modulation of STN beta power associated with reversals (p. 8) to: “In the STN, reversals were associated with a brief modulation of beta power, which was weak in the group-average spectrum and did not reach significance (Fig. 3a).”
Reviewer #2 (Recommendations for the authors):
(1) The small sample size and variability in clinical characteristics among patients may limit the robustness of the study's conclusions. It would be beneficial for the authors to acknowledge this limitation and propose strategies for addressing it in future research. Additionally, incorporating patient-specific factors as covariates in the ANOVA could help mitigate the confounding effects of heterogeneity.
Thank you for this comment. The challenges associated with recording brain activity peri-operatively can be a limiting factor when it comes to sample size. We now acknowledge this in the revised discussion, section Limitations and future directions (p. 20):
“Invasive measurements of STN activity are only possible in patients who are undergoing or have undergone brain surgery. Studies drawing from this limited pool of candidate participants are typically limited in terms of sample size and cohort stratification, particularly when carried out in a peri-operative setting. Here, we had a sample size of 20, which is rather high for a peri-operative study, but still low in terms of absolute numbers.”
Furthermore, we suggest using sensing-capable devices in the future as a measure to increase sample sizes (p. 21):
“Additionally, future research could capitalize on sensing-capable devices to circumvent the necessity to record brain activity peri-operatively, facilitating larger sample sizes and circumventing the stun effect, an immediate improvement in motor symptoms arising as a consequence of electrode implantation (Mann et al., 2009).”
Lastly, we appreciate the idea of adding patient-specific factors as covariates to the ANOVAs and have thus included age, disease duration and pre-surgical UPDRS score into our models. This did not lead to any qualitative changes of statistical effects.
Revised article
Methods, Statistical analysis:
“To account for their potential influence on brain activity, we added age, pre-operative UPDRS score, and disease duration as covariates to all ANOVAs. Covariates were standardized by means of zscoring.”
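The standardization mentioned in this excerpt can be sketched as follows. This is a minimal illustration only: the function name `zscore`, the example ages, and the use of the sample standard deviation (`ddof=1`) are assumptions, not details given in the response.

```python
import numpy as np

def zscore(values):
    """Standardize a covariate to mean 0 and standard deviation 1.

    Assumption: sample standard deviation (ddof=1); the response does
    not state which variant was used.
    """
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std(ddof=1)

# Hypothetical ages, for illustration only (not data from the study).
example_ages = [58, 63, 67, 71, 66]
standardized_ages = zscore(example_ages)
```

After standardization, covariates measured on different scales (years, UPDRS points) enter the model on a comparable scale.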
(2) The author may consider using standardized statistics, such as effect size, that would provide a clearer picture of the observed effect magnitude and improve comparability.
Thanks for this useful suggestion. As measures of effect size, we have added partial eta squared (η<sub>p</sub><sup>2</sup>) to the results of all ANOVAs and Cohen’s d to all follow-up t-tests.
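For readers who want to relate these effect sizes to the test statistics quoted in this response, the sketch below uses the standard conversions from F and t values. It assumes that Cohen's d was obtained as t divided by the square root of the sample size (n = 20, as stated in the limitations excerpt); the response does not state the formula, but the reported values appear consistent with it. The function names are illustrative.

```python
import math

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def cohens_d_from_t(t_value, n):
    """Cohen's d for a one-sample or paired t-test (assumed form: t / sqrt(n))."""
    return t_value / math.sqrt(n)

# Statistics quoted in the Granger causality excerpt of this response:
# F(7,9) = 3.443 and t = 3.609 with n = 20 participants.
eta_beta = partial_eta_squared(3.443, 7, 9)   # approx. 0.728
d_m1_stn = cohens_d_from_t(3.609, 20)         # approx. 0.807
```

Both results agree with the values reported in the excerpt to three decimals.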
(3) Although the study identifies relevance between beta activity and motor events, it lacks causal analysis and discussion of potential causal mechanisms. Given the valuable datasets collected, exploring or discussing causal mechanisms would enhance the depth of the study.
We appreciate this idea and have conducted Granger causality analyses in response to this comment. This new analysis reveals that there is a strong cortical drive to the STN for all movements of interest and predictability conditions in the beta band, but no directed interactions in the gamma band. For statistical testing, we conducted an rmANCOVA, similar to the analysis of power and coherence (see p. 46-48 and 54-56 for the corresponding tables), as well as t-tests assessing directionality (Figure 6-figure supplement 2 on p. 35). In the discussion section, we connect these results with prior findings suggesting that the frontal cortex drives the STN in the beta band, likely through hyperdirect pathway fibers (p. 17).
Revised article
Methods Section, Granger Causality Analysis
“We computed beta and gamma band non-parametric Granger causality (Dhamala, Rangarajan, & Ding, 2008) between cortical ROIs and the STN in the hemisphere contralateral to movement for the post-event time windows (0 – 2 s with respect to start, reversal, and stop). Because estimates of Granger causality are often biased, we compared the original data to time-reversed data to suppress non-causal interactions. True directional influence is reflected by a higher causality measure in the original data than in its time-reversed version, resulting in a positive difference between the two, the opposite being the case for a signal that is “Granger-caused” by the other. Directionality is thus reflected by the sign of the estimate (Haufe, Nikulin, Müller, & Nolte, 2013). Because rmANCOVA results indicated no significant effects for predictability and movement type, and post-hoc tests did not detect significant differences between hemispheres, we averaged Granger causality estimates over movement types, hemispheres and predictability conditions in Figure 6-figure supplement 2.”
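The logic of the time-reversal contrast described in this excerpt can be illustrated with a toy example. Note the substitution: the sketch uses a simple time-domain Granger measure (log ratio of residual variances from lagged least-squares regressions), not the non-parametric spectral method of Dhamala et al. that the authors applied. The simulated signals, model order, and function names are all assumptions made for illustration.

```python
import numpy as np

def granger(source, target, order=2):
    """Time-domain Granger causality source -> target.

    Log ratio of residual variances: target predicted from its own past
    versus from its own past plus the past of the source.
    """
    n = len(target)
    y = target[order:]
    own = np.column_stack([target[order - k:n - k] for k in range(1, order + 1)])
    other = np.column_stack([source[order - k:n - k] for k in range(1, order + 1)])
    ones = np.ones((n - order, 1))
    restricted = np.hstack([ones, own])
    full = np.hstack([ones, own, other])
    res_r = y - restricted @ np.linalg.lstsq(restricted, y, rcond=None)[0]
    res_f = y - full @ np.linalg.lstsq(full, y, rcond=None)[0]
    return float(np.log(res_r.var() / res_f.var()))

def time_reversed_contrast(source, target, order=2):
    """Granger estimate on the original data minus the estimate on time-reversed data."""
    original = granger(source, target, order)
    reversed_estimate = granger(source[::-1], target[::-1], order)
    return original - reversed_estimate

# Toy system (hypothetical): x drives y with a lag of one sample.
rng = np.random.default_rng(0)
n_samples = 5000
x = rng.standard_normal(n_samples)
noise = rng.standard_normal(n_samples)
y = np.empty(n_samples)
y[0] = noise[0]
y[1:] = 0.8 * x[:-1] + noise[1:]

d_xy = time_reversed_contrast(x, y)  # positive: x drives y
d_yx = time_reversed_contrast(y, x)  # negative: y is driven by x
```

In this toy system the contrast is positive for the true direction and negative for the opposite one, which is the sign convention the excerpt describes.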
Results, Granger causality
“In general, cortex appeared to drive the STN in the beta band, regardless of the movement type and predictability condition. This was reflected in a main effect of ROI on Granger causality estimates (F<sub>ROI</sub>(7,9) = 3.443, p<sub>ROI</sub> = 0.044, η<sub>p</sub><sup>2</sup> = 0.728; refer to Supplementary File 4 for the full results of the ANOVA). In the hemisphere contralateral to movement, follow-up t-tests revealed significantly higher Granger causality estimates from M1 to the STN (t = 3.609, one-sided p < 0.001, d = 0.807) and from MSMC to the STN (t = 2.051, one-sided p < 0.027, d = 0.459) than the other way around. The same picture emerged in the hemisphere ipsilateral to movement (M1 to STN: t = 3.082, one-sided p = 0.003, d = 0.689; MSMC to STN: t = 1.833, one-sided p < 0.041, d = 0.410). In the gamma band, we did not detect a significant drive from one area to the other (F<sub>ROI</sub>(7,9) = 0.338, p<sub>ROI</sub> = 0.917, η<sub>p</sub><sup>2</sup> = 0.208, Supplementary File 6). Figure 6-figure supplement 2 demonstrates the differences in Granger causality between original and time-reversed data for the beta and gamma band.”
Discussion, The dynamics of STN-cortex coherence
“Considering the timing of the increase observed here, the STN’s role in movement inhibition (Benis et al., 2014; Ray et al., 2012) and the fact that frontal and prefrontal cortical areas are believed to drive subthalamic beta activity via the hyperdirect pathway (Chen et al., 2020; Oswal et al., 2021) it seems plausible that the increase of beta coherence reflects feedback of sensorimotor cortex to the STN in the course of post-movement processing. In line with this idea, we observed a cortical drive of subthalamic activity in the beta band.”
(4) The study cohort focused on senior adults, who may exhibit age-related cortical responses during movement planning in neural mechanisms. These aspects were not discussed in the study.
We appreciate the comment and agree that age may have impacted neural oscillatory activity of patients in the present study. We now acknowledge this in the limitations section, and point out that our approach to handling these effects was including age as a covariate in the statistical analyses.
Revised article
Discussion, Limitations and Future Directions
“Further, most of our participants were older than 60 years. To diminish any confounding effects of age on movement-related modulations of neural oscillations, such as beta suppression and rebound (Bardouille & Bailey, 2019; Espenhahn et al., 2019), we included age as a covariate in the statistical analyses.”
(5) Including a control group of patients with other movement disorders who also undergo DBS surgery would be beneficial. Because we cannot exclude the possibility that the observed findings are specific to PD or can be generalized. Additionally, the current title and the article, which are oriented toward understanding human motor control, may not be appropriate.
We thank the reviewer for this comment and fully agree that it cannot be ruled out that the present findings are, in part, specific to PD. We acknowledge this limitation in the Limitations and future directions section (p. 20-21). Indeed, including a control group of patients with other disorders would be ideal, but the scarcity of patients with diseases other than PD who receive STN DBS makes this an unfeasible option. We do suggest that future research may address this issue by extending our approach to different disorders or healthy participants on the cortical level (p. 21). Lastly, we appreciate the idea to adjust the title of the present article. The adjusted title is: “Context-Dependent Modulations of Subthalamo-Cortical Synchronization during Rapid Reversals of Movement Direction in Parkinson’s Disease”.
That being said, we do believe that our findings at least approximate healthy functioning and are not solely related to PD. For one, patients were on their usual dopaminergic medication for the study and dopamine has been found to normalize pathological alterations of beta activity. More importantly, the general pattern of movement-related beta and gamma oscillations has been observed in numerous diseases and brain structures, including cortical beta oscillations measured non-invasively in healthy participants. Thus, it is not unlikely that the new aspects discovered here are also general features of motor processing.
Revised article
Discussion, Limitations and future directions
“Furthermore, we cannot be sure to what extent the present study’s findings relate to PD pathology rather than general motor processing. We suggest that our approach at least approximates healthy brain functioning as patients were on their usual dopaminergic medication. Dopaminergic medication has been demonstrated to normalize power within the STN and globus pallidus internus, as well as STN-globus pallidus internus and STN-cortex coherence (Brown et al., 2001; Hirschmann et al., 2013). Additionally, several of our findings match observations made in other patient populations and healthy participants, who exhibit the same beta power dynamics at movement start and stop (Alegre et al., 2004) that we observed here. Notably, our finding of enhanced cortical involvement in face of uncertainty aligns well with established theories of cognitive processing, given the cortex' prominent role in managing higher cognitive functions (Altamura et al., 2010). Yet, transferring our approach and task to patients with different disorders, e.g. obsessive compulsive disorder, or examining young and healthy participants solely at the cortical level, could contribute to elucidating whether the synchronization dynamics reported here are indeed independent of PD and age.”
Reviewer #3 (Recommendations for the authors):
Despite the strengths of the "rhythm" account of cognitive processes, the paper could possibly be improved by making it less skewed to rhythms explaining all of the movement encoding.
Thank you for this comment - the point is well taken. There is a large body of literature relating neural oscillations to spiking in larger neural populations, which itself is likely the most relevant signal with respect to motor control. In our eyes, it is this link that justifies the rhythm account, i.e. we agree with the reviewer that action potentials are the basis of movement encoding in the brain, not oscillations. Unfortunately, we cannot measure spiking with the method at hand.
To better integrate this view into the current manuscript, we make the following suggestion for future research in the Limitations and future directions section (p. 21): “Lastly, given the present study’s focus on understanding movement-related rhythms, particularly in the beta range, future research could further explore the role of gamma oscillations in continuous movement and their relation to action potentials in motor areas (Fischer et al., 2020; Igarashi, Isomura, Arai, Harukuni, & Fukai, 2013), which form the basis of movement encoding in the brain.”
In Figure 5 - is the legend correct? Is it really just a 0.2% change in power only? That would be a very surprisingly small effect size.
We thank the reviewer for noting this. Indeed, the numbers on the scale quantify relative change (post - pre)/pre and should be multiplied by 100 to obtain %-change. We have adjusted the color bars accordingly.
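The scaling issue can be made concrete with a short sketch. The formula (post - pre)/pre is taken from the response above; the function names and example power values are hypothetical.

```python
def relative_change(pre, post):
    """Relative change of post-event versus pre-event power: (post - pre) / pre."""
    return (post - pre) / pre

def percent_change(pre, post):
    """Relative change expressed in percent."""
    return 100.0 * relative_change(pre, post)

# Hypothetical power values: a drop from 1.0 to 0.8 is a relative change of
# about -0.2, which corresponds to about -20 %, not -0.2 %.
example_relative = relative_change(1.0, 0.8)
example_percent = percent_change(1.0, 0.8)
```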
The dissociation between the effects of unpredictable cues in coherence versus raw power is interesting and could potentially be directly contrasted further in the discussion (here they are presented separately with separate discussions, but this seems like a pretty important and novel finding as beta coherence and power usually go in the same direction).
We appreciate the reviewer’s interest in our findings on the predictability of movement instructions. In case of coherence, the difference between pre- and post-event was generally more positive in the unpredictable condition, meaning that suppressions (negative pre-post difference) were diminished whereas increases (positive pre-post difference) were enhanced. With respect to power, we also observed less suppression in the unpredictable condition at movement start. Therefore, the direction of change is in fact the same. We made this clearer in the revised version by adapting the corresponding sections of the abstract, results and discussion (see below).
The only instance of coherence and power diverging (on a qualitative level) was observed during reversals: here, we noted post-event increases in coherence and post-event decreases in M1 power in the group-average spectra. However, when comparing the pre- and post-event epochs statistically by means of permutation testing, the coherence increase did not reach significance. Hence, we did not highlight this aspect.
Revised version
Abstract
“… Event-related increases of STN-cortex beta coherence were generally stronger in the unpredictable than in the predictable condition. …”
Results, Effects of predictability on beta power
“With respect to the effect of predictability of movement instructions on beta power dynamics (research aim 2), we observed an interaction between movement type and condition (F<sub>cond*mov</sub> (2,14) = 4.206, p<sub>cond*mov</sub> = 0.037, η<sub>p</sub><sup>2</sup> = 0.375), such that the beta power suppression at movement start was generally stronger in the predictable (M = -0.170, SD = 0.065) than in the unpredictable (M = -0.154, SD = 0.070) condition across ROIs (t = -1.888, one-sided p = 0.037, d = -0.422). We did not observe any modulation of gamma power by the predictability of movement instructions (F<sub>cond</sub> (1,15) = 0.792, p<sub>cond</sub> = 0.388, η<sub>p</sub><sup>2</sup> = 0.050, Supplementary File 5).”
Effects of predictability on STN-cortex coherence
“With respect to the effect of predictability of movement instructions on beta coherence (research aim 2), we found that the pre-post event differences were generally more positive in the unpredictable condition (main effect of predictability condition; F<sub>cond</sub>(1,15) = 8.684, p<sub>cond</sub> = 0.010, η<sub>p</sub><sup>2</sup> = 0.367; Supplementary File 3), meaning that the suppression following movement start was diminished and the increases following stop and reversal were enhanced in the unpredictable condition (Fig. 6a). This effect was most pronounced in the MSMC (Fig. 6b). When comparing region-average TFRs between the unpredictable and the predictable condition, we observed a significant difference only for stopping (t<sub>clustersum</sub> = 142.8, p = 0.023), suggesting that the predictability effect was mostly carried by increased beta coherence following stops. When repeating the rmANCOVA for pre-event coherence, we did not observe an effect of predictability (F<sub>cond</sub>(1,15) = 0.163, p<sub>cond</sub> = 0.692, η<sub>p</sub><sup>2</sup> = 0.011), i.e. the effect was most likely not due to a shift of baseline levels. The increased tendency for upward modulations and decreased tendency for downward modulations rather suggests that the inability to predict the next cue prompted intensified event-related interaction between STN and cortex. STN-cortex gamma coherence was not modulated by predictability (F<sub>cond</sub>(1,15) = 0.005, p<sub>cond</sub> = 0.944, η<sub>p</sub><sup>2</sup> = 0.000, Supplementary File 5).”
Discussion, Beta coherence and beta power are modulated by predictability
“In the present paradigm, patients were presented with cues that were either temporally predictable or unpredictable. We found that unpredictable movement prompts were associated with stronger upward modulations and weaker downward modulations of STN-cortex beta coherence, likely reflecting the patients adopting a more cautious approach, paying greater attention to instructive cues. Enhanced STN-cortex interactions might thus indicate the recruitment of additional neural resources, which might have allowed patients to maintain the same movement speed in both conditions. […]”
With respect to power, we observed reduced beta suppression in the unpredictable condition at movement start, consistent with the effect on coherence, likely demonstrating a lower level of motor preparation.
Given that you have a nice continuous data task here - the turning of the wheel, it might be interesting to cross-correlate the circular position (and separately - velocity) of the turning with the envelope of the beta signal. This would be a nice finding if you could also show that the beta is modulated continuously by the continuous movements. In the natural world, we rarely do a continuous movement with a sudden reversal, or stop, most of the time we are in continuous movement. Looking at this might also be a strength of your dataset.
We could not agree more. In fact, having a continuous behavioral output was a major motivation for choosing this particular task. We are very interested in state space models such as preferential subspace identification (Sani et al., 2021), for example. These models relate continuous brain signals to continuous behavioral target variables and should be of great help for questions such as: do oscillations relate to moment-by-moment adaptations of continuous movement? Which frequency bands and brain areas are important? Is angular position encoded by different brain areas/frequency bands than angular speed? These analyses are in fact ongoing. This project, however, is too large to fit into the current article.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This study is an important follow-up to their prior work - Wong et al. (2019), starting with clear questions and hypotheses, followed by a series of thoughtful and organized experiments. The method and results are convincing. Experiment 1 demonstrated the sensory preconditioned fear with few (8) or many (32) sound-light pairings. Experiments 2A and 2B showed the role of PRh NMDA receptors during conditioning for online integration, revealing that this contribution is present only after a few sound-light pairings, not after many sound-light pairings. Experiments 3A and 3B showed the contribution of PRh-BLA communication to online integration, again only after a few but not after many. Contrary to Experiments 3A and 3B, Experiments 4A and 4B showed the contribution of PRh-BLA communication to integration at test only after many but not few sound-light pairings.
Strengths:
Throughout the manuscript, the methods and results are clearly organized and described, and the use of statistics is solid, all contributing to the overall clarity of the research. The discussion section was also well-written, effectively comparing the current research with the prior work and offering insightful interpretations and potential future directions for this line of research. I have only a limited amount of concerns about some results and some details of experiments/statistics.
We thank the reviewer for their positive assessment.
Weaknesses:
Could you provide further interpretation regarding line 171: the observation that sensory preconditioned fear increased with the number of sound-light pairings? Was this increase due to better sound-light association learning during Stage 1? Additionally, were there any experimental differences between Experiment 1 and the other experiments that might explain why freezing was higher in the P32 group compared to the P8 group? This pattern seemed to be absent in the other experiments. If we consider the hypothesis that the online integration mechanism is more active with fewer pairings and the chaining mechanism at the test is more prominent with many pairings, we wouldn't expect a difference between the P8 and P32 groups. Given the relatively small sample size in Experiment 1, the authors might consider conducting a cross-experiment analysis or something similar to investigate this further.
We appreciate the reviewer’s point and thank them for the question. The heightened level of sensory preconditioned fear among rats that received many sound-light pairings in the initial control experiment (Group P32) may reflect the combined effects of both mediated learning and chaining at test. We are, however, reluctant to offer a strong interpretation of this result as it was not replicated in the subsequent experiments: i.e., the levels of freezing to the sensory preconditioned stimulus at test were almost identical among vehicle-injected controls that received either few (8) or many (32) sound-light pairings in Experiments 2A and 2B; and this was also true in Experiments 3A and 3B, and again in Experiments 4A and 4B. A key difference between the initial and subsequent experiments is that, in contrast to the initial experiment, rats in subsequent experiments underwent surgery for one reason or another (implantation of cannulas, lesion of the perirhinal cortex). The implication is that surgical interventions in the perirhinal cortex and/or basolateral amygdala might affect the way that rats integrate the sound-light and light-shock associations in sensory preconditioning: i.e., they may force rats to rely on one type of integration strategy or the other. This is, of course, purely speculative – it will be addressed in future research.
Reviewer #2 (Public review):
This manuscript builds on the authors' earlier work, most recently Wong et al. 2019, in which they showed the importance of the perirhinal cortex (PRh) during the first-order conditioning stage of sensory preconditioning. Sensory preconditioning requires learning between two neutral stimuli (S2-S1) and subsequent development of a conditioned response to one of the neutral stimuli after pairing of the other stimulus with a motivationally relevant unconditioned stimulus (S1-US). One highly debated question regarding the mechanisms of learning of sensory preconditioning has been whether conditioned responses evoked by the indirectly trained stimulus (S2) occur through a mediated representation at the time of the first-order US training, or whether the conditioned responses develop through a chained evoked representation (S2--> S1 --> US) at the time of test. The authors' prior findings provided strong evidence for PRh being involved in mediated learning during the first-order training. They showed that protein synthesis was required during the first-order S1-US learning to support the conditioned response to the indirectly trained stimulus (S2) at the test.
One question remaining following the previous paper was whether certain conditions may promote a chaining mechanism over mediated learning, as there is some evidence for chained representations at the time of the test. In this paper, the authors directly address this important question and find unambiguous results that the extent of training during the preconditioning stage impacts the involvement of PRh during the first-order conditioning or stage 2. They show that putative blockade of synaptic changes in PRh, using an NMDA antagonist, disrupts responding to the preconditioned cue at test during shorter duration preconditioning training (8 trials), but not during extended training (32 trials). They also show that this is the case for communication between the PRh and BLA during the same stage of training using a contralateral inactivation approach. This confirms their previous findings in 2019 of connectivity between these regions for the short-duration training, while they observe here for the first time that this is not the case for extended training. Finally, they show that with extended training, communication between BLA and the PRh is required at the final test of the preconditioned stimulus, but not for the short duration training.
The results are clear and extremely consistent across experiments within this paper as well as with earlier work. The experiments here are thorough, and well-conceived, and address an important and highly debated question in the field regarding the neural and psychological mechanisms underlying sensory preconditioning. This work is highly impactful for the field as the debate over mediated versus chaining mechanisms has been an important topic for more than 70 years.
We thank the reviewer for their kind assessment.
Reviewer #3 (Public review):
The authors tested whether the number of stimulus-stimulus pairings alters whether preconditioned fear depends on online integration during the formation of the stimulus-outcome memory or during the probe test/mobilization phase, when the original stimulus, which was never paired with aversive events, elicits fear via chaining of stimulus-stimulus and stimulus-outcome memories. They found that sensory preconditioning was successful with either 8 or 32 stimulus-stimulus pairings. Perirhinal cortex NMDA receptor blockade during stimulus-outcome learning impaired preconditioning following 8 but not 32 pairings during preconditioning. Therefore, perirhinal cortex NMDA activity is required for online integration or mediated learning. Perirhinal-basolateral amygdala had nearly identical effects with the same interpretation: these areas communicate during stimulus-outcome learning, and this online communication is required for later expressing preconditioned fear. Disconnection prior to the probe test, when chaining might occur, had different effects: it impaired the expression of preconditioned fear in rats that received 32, but not 8, pairings during preconditioning. The study has several strengths and provides a thoughtful discussion of future experiments. The study is highly impactful and significant; the authors were successful in describing the behavioral and neurobiological mechanisms of mediated learning versus chaining in sensory preconditioning, which is often debated in the learning field. Therefore this study will have a significant impact on the behavioral neurobiology and learning fields.
Strengths:
Careful, rigorous experimental design and statistics.
The discussion leaves open questions that are very much worth exploring. For example - why did perirhinal-amygdala disconnection prior to the probe have no effect in the 8-pairing group, when bilateral perirhinal inactivation did (in Wong et al, 2019)? The authors propose that perirhinal cortex outputs bypass the amygdala during the probe test, which is an excellent hypothesis to test.
The authors provide evidence that both mediated learning and chaining occur.
Thank you for the positive assessment – we fully intend to identify the circuitry that regulates retrieval/expression of sensory preconditioned fear when it is based on mediated learning in stage 2.
Weaknesses:
This is inherent to all neural interference and behavioral experiments: biological/psychological functions do not typically operate binarily. There is no single clear number or parameter at which mediated learning or chaining happens, and both probably happen to some extent. Addressing this is even more difficult given behavioral variability across subjects, implant sites, etc. Thus, this is not so much a weakness particular to this study as much as an existential problem, which the authors were able to work around with careful experimental design and appropriate controls.
We completely agree with the point raised here and thank the reviewer for their assessment.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) It appears that the method description for Sensory Preconditioning was copied from their previous Wong et al. (2019) paper, which is fine, but in the current research, the authors use 8 or 32 presentations, which is not reflected in the description.
Thank you for bringing this to our attention. This is now addressed in the method section on page 27 (beginning at line 655):
“Rats received either eight presentations of the sound and eight of the light in a single session, or 32 presentations of the sound and 32 of the light across four daily sessions. On Day 3, all rats received eight presentations of the sound and eight of the light. Each presentation of the sound was 30 s in duration and each presentation of the light was 10 s in duration. The first stimulus presentation occurred five min after rats were placed into the chambers. The offset of one stimulus co-occurred with the onset of the other stimulus for groups that received paired presentations of the sound and the light, while these stimuli were presented separately for groups that received explicitly unpaired presentations. The interval between each paired presentation was five min while the interval between each separately presented stimulus was 150 s. After the last stimulus presentation, rats remained in the chambers for an additional one min. They were then returned to their home cages. This training was repeated on Days 4-6 for rats that received 32 presentations of the sound and 32 of the light. All rats proceeded to first-order conditioning (details below) the day after their final session of sound and light exposures, which was Day 4 for rats exposed to eight presentations of the sound and light and Day 7 for rats exposed to 32 presentations of the sound and light.”
(2) Line 148: Could the authors clarify how the "significant linear increase" was assessed? From similar descriptions in later experiments, it seems it was based on a comparison of freezing across the four presentations, but the F(1,26) statistic suggests there seemed to be a half-split test. The same questions exist in all the experiments. Please clarify.
Conditioning data were analysed using contrasts with repeated measures in ANOVA. The repeated measures (or within-subject) factor was “trial” as all rats were exposed to four light-shock pairings in this stage of training. We examined whether there was a significant linear increase in freezing across trials using a standard within-subject contrast. The specific coefficients for this contrast, given the four trials, were -3, -1, 1, and 3. The reason that the degrees of freedom remain 1 and 26 in this analysis is because the within-subject contrast is part of a set of planned orthogonal contrasts. That is, in any planned analysis of the sort conducted here, the df1 will always be 1, indicating the very nature of the analysis. There was no splitting of the data, or comparisons between the split halves.
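The contrast described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the authors' analysis script: it assumes the trend is tested on the grand mean of per-rat contrast scores against the pooled within-group error, and it uses simulated freezing scores with hypothetical group sizes (30 rats in four groups, which would give the 26 error degrees of freedom mentioned in the question).

```python
import numpy as np

# Linear trend coefficients for four trials, as given in the response.
LINEAR_COEFFS = np.array([-3.0, -1.0, 1.0, 3.0])

def linear_trend_f(freezing, groups):
    """F test for a linear trend across trials, pooled over groups.

    freezing: array of shape (n_rats, 4); groups: one label per rat.
    Each rat is reduced to a single contrast score, so df1 is always 1;
    the error term is the pooled within-group variance of those scores,
    so df2 equals the number of rats minus the number of groups.
    """
    freezing = np.asarray(freezing, dtype=float)
    groups = np.asarray(groups)
    scores = freezing @ LINEAR_COEFFS
    labels = np.unique(groups)
    df_error = scores.size - labels.size
    ss_error = sum(
        ((scores[groups == g] - scores[groups == g].mean()) ** 2).sum()
        for g in labels
    )
    ms_error = ss_error / df_error
    ss_trend = scores.size * scores.mean() ** 2
    return ss_trend / ms_error, 1, df_error

# Simulated data (hypothetical): freezing rises by 15 points per trial.
rng = np.random.default_rng(1)
groups = np.repeat(["P8", "U8", "P32", "U32"], [8, 8, 7, 7])
trials = np.arange(4)
freezing = 10 + 15 * trials + rng.normal(0, 10, size=(30, 4))
f_value, df1, df2 = linear_trend_f(freezing, groups)
```

Because the coefficients sum to zero, a rat whose freezing is flat across trials has a contrast score of zero, and no splitting of the data into halves is involved.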
(3) Line 154: Could the authors clarify what is meant by "other main effects and their interactions"? It is not clearly inferable from the context.
Apologies for the confusion here. “Other main effects” refer to the two between-subject factors in isolation: i.e., the overall comparison of freezing to the light (averaged across the four trials) between groups that received either paired or unpaired stimulus presentations in stage 1 (factor 1 → main effect 1), and between groups that received either eight or 32 sound and light exposures in stage 1 (factor 2 → main effect 2). “Their interaction” refers to the assessment of whether the overall difference in freezing to the light (averaged across the four trials) between Groups P8 and U8 differs from the overall difference in freezing to the light (averaged across the four trials) between Groups P32 and U32. We have edited the text near line 153 to indicate that:
“The overall comparisons of freezing to the light (averaged across the four conditioning trials) between groups that received either paired or unpaired stimulus presentations in stage 1 (factor 1), and between groups that received either eight or 32 sound and light exposures in stage 1 (factor 2), were not significant (Fs < .45, p > .508). The interaction between these two between-subject factors was also not significant (F < .45, p > .508).”
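The two main effects and the interaction described above can be written as three orthogonal contrasts over the four groups. The following is a minimal sketch under stated assumptions: the group means are hypothetical (not data from the study), and the names `GROUPS`, `CONTRASTS`, `contrast_value` and `are_orthogonal` are introduced purely for illustration.

```python
# Group order used for every coefficient tuple below.
GROUPS = ("P8", "U8", "P32", "U32")

CONTRASTS = {
    "pairing": (1, -1, 1, -1),      # paired vs. unpaired (main effect 1)
    "exposures": (1, 1, -1, -1),    # eight vs. 32 exposures (main effect 2)
    "interaction": (1, -1, -1, 1),  # (P8 - U8) minus (P32 - U32)
}


def contrast_value(means, coeffs):
    """Weighted sum of the group means for one contrast."""
    return sum(c * means[g] for g, c in zip(GROUPS, coeffs))


def are_orthogonal(a, b):
    """Two contrasts are orthogonal (equal n) if their coefficient products sum to zero."""
    return sum(x * y for x, y in zip(a, b)) == 0


if __name__ == "__main__":
    # Hypothetical mean percent freezing to the light, averaged across trials.
    means = {"P8": 40, "U8": 38, "P32": 45, "U32": 41}
    for name, coeffs in CONTRASTS.items():
        print(name, contrast_value(means, coeffs))
```

With these made-up means, the interaction contrast equals (40 - 38) - (45 - 41) = -2, i.e., the paired-unpaired difference after eight exposures minus the same difference after 32 exposures.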
(4) The use of sound and light as preconditioned and conditioned cues are counterbalanced. Was there any difference in the increase of freezing during conditioning depending on the type of conditioned cues? Was there any difference in the preconditioned fear? While it is hard to assess statistical significance due to the sample size limit, even observing a trend could be interesting.
We examined whether the levels of freezing to the conditioned and preconditioned stimuli depend on their physical identity. In general, there was a slight trend towards more freezing to the preconditioned stimulus when it was a tone, and less freezing to the conditioned stimulus when it was a tone. These are, however, simply indications. None of the statistical comparisons between rats for which the preconditioned stimulus was the tone (and, thereby, conditioned stimulus was the light) and rats for which the preconditioned stimulus was the light (and, thereby, conditioned stimulus was the tone) reached the conventional level of significance.
(5) General suggestion on reporting non-significant statistics: the authors reported a small F statistic value a few times to suggest non-significance. But without clearly specifying degrees of freedom, it is hard to get a sense of statistical significance (e.g. Line 227, largest F<3.10). I recommend adding p values alongside the F statistics and reporting exact statistics whenever possible.
Apologies for the omission. The p values have now been included alongside all non-significant F statistics.
(6) Another general suggestion is to use non-parametric statistical testing with such small sample sizes. I recommend using the Kruskal-Wallis H test (the non-parametric equivalent of F-statistic) to replace the ANOVA result. Also, given many tests only involve comparing two independent groups, using Mann-Whitney U test (the non-parametric equivalent of independent t-test) would be sufficient.
We understand that small sample sizes can occasionally lead to unequal variances between groups, which necessitates the use of non-parametric statistics. However, as non-parametric statistics raise a different set of issues for data analysis (e.g., power) and interpretation, our general view for the type of data collected in this study is that parametric analyses are appropriate and should be retained (particularly in the absence of unequal variances between groups). We hold this view for two reasons. First, the hypotheses tested in the present series were derived from past work in which parametric analyses revealed meaningful patterns of results at the same level of statistical power. Second, the application of these analyses then yielded results consistent with our hypotheses: for the most part, we observed between-group differences where we expected there to be such differences and did not observe between-group differences where we did not expect there to be such differences. As such, we have not switched from a parametric to non-parametric analysis strategy. We do, however, appreciate the suggestion and will apply a non-parametric approach where it is warranted in our future work.
Reviewer #2 (Recommendations for the authors):
I have a few very minor comments for the authors regarding the discussion and interpretation of the very nice experimental results.
(1) In Figures 4 and 5, the authors provide a schematic of the experiment. It's very clearly indicated whether the BLA inactivation is ipsi- or contralateral, but the unilateral PRh lesion isn't mentioned. I'd recommend including that here so that someone reading through the figures can more easily understand the experiment. The hypothesis is clear and the experiment is so well designed that a read through of the figures can relay most information to an experienced reader.
Thank you for this suggestion – we have included information about the unilateral PRh lesion in the schematic for Figures 4 and 5.
(2) The authors have an extended description of backward conditioning in the discussion. It seems like the authors are suggesting this as an important future direction, but they never explicitly say this, resulting in a bit of confusion as to what this section refers to. Also, Ward-Robinson and Hall 1996 showed backward sensory preconditioning using a serial auditory-visual association and argued for a mediated solution based on their results. It may be worth citing that paper here.
Apologies for the lack of clarity. We have revised this point in the discussion (page 18, beginning line 434) and referenced Ward-Robinson and Hall (1996):
“Why does increasing the number of sound-light pairings change the way that rats integrate the sound-light and light-shock memories? One possibility is that increasing the number of sound-light pairings in stage 1 reduces the ability of each stimulus to activate the memory of the other. This is consistent with findings by Holland (1998), who showed that the likelihood of mediated learning in rats decreases with the amount of training (see also Holland, 2005); but inconsistent with our findings that, after extended training, rats continue to integrate the sound-light and light-shock associations through chaining at the time of testing (as chaining is predicated on the sound activating the memory of the light after extended training). Instead, we propose that the change in integration occurs because the increased number of sound-light pairings allows the rats to learn about the order in which the sound and light are presented (Figure 1; for evidence that rats acquire order information in sensory preconditioning, see Barnet et al., 1997; Hart et al., 2022; Leising et al., 2007; Miller & Barnet, 1993). This order hypothesis is consistent with evidence showing that the way in which animals represent an audio-visual compound changes across repeated compound exposures (e.g., Bellingham & Gillette, 1981; Holmes & Harris, 2009). It can be tested using a so-called “backward” sensory preconditioning protocol, which reverses the order of stimulus presentations in stage 1 (e.g., Ward-Robinson & Hall, 1996). That is, rather than rats being exposed to the “forward” sound-light pairings used here and by Wong et al. (2019), rats in a backward protocol are exposed to light-sound pairings. Increasing the number of light-sound pairings in this protocol should result in rats learning that the light is followed by the sound (light→sound) and that the sound is followed by nothing (sound→nothing). 
Hence, during the session of light-shock pairings in stage 2, the light should continue to activate the memory of the sound, resulting in formation of the mediated sound-shock association (e.g., Ward-Robinson & Hall, 1996). That is, if our order hypothesis is correct, increasing the number of light-sound pairings in the backward protocol should preserve the likelihood of mediated learning in stage 2 and, if anything, diminish the likelihood of chaining at test in stage 3 (as the sound is never followed by a light). Hence, PRh manipulations that fail to affect fear of the sound when administered after many sound-light pairings (e.g., infusion of DAP5) should disrupt that fear when administered after many light-sound pairings in the backward protocol. This will be assessed in future work.”
(3) Line 467 in the discussion suggests that the results are surprising that PRh-BLA communication is not needed at test when learning putatively occurs through a mediated mechanism during first-order conditioning. I was a bit surprised by this comment since I was under the assumption that only BLA was required at this point after consolidation of the mediated learning. Holmes et al., 2013 showed that BLA is required for extinction to S2 after first-order conditioning. In that experiment they inactivated BLA during S2- presentations (typically considered the extinction test), and showed that reduction to S2 did not occur the subsequent day, indicating the memory was stored in BLA and may not necessarily require PRh-BLA communication.
The result noted here was somewhat surprising as our past studies showed that silencing activity in the PRh prior to testing attenuates freezing to a sensory preconditioned stimulus (i.e., an S2). We took this to mean that the PRh is necessary for retrieval/expression of fear to S2 and supposed that this retrieval/expression would be achieved through communication between the PRh and BLA. However, the results of the PRh-BLA disconnection at test show that this communication is not required, leaving us to speculate that retrieval/expression of fear to S2 may be achieved through communication between the PRh and CeA.
We have edited the opening of the relevant paragraph to clarify why the result noted here was surprising (page 20, beginning line 485):
“While the PRh and BLA clearly communicate to support mediated learning about the sound, this communication is not required for retrieval/expression of the mediated sound-shock association at the time of testing. This result is somewhat surprising as activity in the PRh is needed for expression of fear to the sound (Holmes et al., 2013; Wong et al., 2019) and raises the question: how does the PRh-dependent sound-shock association come to be expressed in fear responses?”
(4) The authors reference Holland 1981 and 1998, yet there's not much discussion of these findings. I think there should be a bit more emphasis on these studies since they show how mediated learning greatly depends on the extent of training. Also, it may be worth considering Holland's theory of why mediated conditioning is more effective with shorter training. His theory may be consistent with the authors, but I believe he suggests that early in training a stronger mediated representation is evoked which tends to dissipate with time. I think this is a valid hypothesis to consider in this paper.
The Holland papers show that rats form mediated associations (Holland, 1981) and that the likelihood of them doing so decreases with the amount of training (Holland, 1998). These findings are paralleled by those reported in the present series of experiments. However, the protocols used by Holland were very different to those used in the present study; and the explanation for his 1998 findings (which is the more relevant of the two papers) simply does not apply to the case of sensory preconditioning.
To be clear: Holland (1998) exposed rats to either “few” or “many” tone-food pairings in stage 1, tone-lithium chloride pairings in stage 2 and, finally, tested rats with the food alone in stage 3. He predicted and showed that those exposed to few tone-food pairings showed an aversion to the food at test (i.e., they consumed less of the food than controls) whereas those exposed to many tone-food pairings showed no such aversion (i.e., they consumed the same amount of food as the controls). This was taken to mean that, across the series of tone-lithium pairings, the tone activated the memory of food among rats in the few condition, resulting in a mediated food-lithium association; but failed to do so among rats in the many condition, resulting in no food-lithium association. According to Holland, the tone failed to activate the memory of food in the many condition because, by the end of training in stage 1, it was not needed for them to know what to do when the tone was presented: they simply had to run to the magazine to collect the food when delivered. That is, the tone eventually associated with the responses that rats emitted in the training situation, thereby obviating any need for activation of the food memory.
While this explanation is both elegant and interesting, it cannot be applied to the results obtained in the present study where the initial stage of training involved few or many sound-light pairings. That is, unlike in the Holland study where rats in the many condition eventually learned a stimulus-“run to magazine” association that maintained performance in the absence of any mental image of food, in the present study, any stimulus-response association acquired in stage 1 (e.g., orienting responses towards the sources of the auditory and visual stimuli) cannot have contributed to the expression of sensory preconditioned fear at test. Hence, stimulus-response learning in the many condition cannot be invoked to explain the pattern of results in the present study, even if it adequately explains what-appears-to-be a similar finding in the Holland study.
Nonetheless, we have included a reference to the general style of explanation that was considered and rejected by Holland in his 1998 and 2005 papers. This appears on page 18 (beginning line 434) and reads:
“Why does increasing the number of sound-light pairings change the way that rats integrate the sound-light and light-shock memories? One possibility is that increasing the number of sound-light pairings in stage 1 reduces the ability of each stimulus to activate the memory of the other. This is consistent with findings by Holland (1998), who showed that the likelihood of mediated learning in rats decreases with the amount of training (see also Holland, 2005); but inconsistent with our findings that, after extended training, rats continue to integrate the sound-light and light-shock associations through chaining at the time of testing (as chaining is predicated on the sound activating the memory of the light after extended training). Instead, we propose that the change in integration occurs because the increased number of sound-light pairings allows the rats to learn about the order in which the sound and light are presented (Figure 1; for evidence that rats acquire order information in sensory preconditioning, see Barnet et al., 1997; Hart et al., 2022; Leising et al., 2007; Miller & Barnet, 1993)…”
(5) There is also a Holland 2005 paper in which he tests whether extended training of the initial stimulus associations may result in a reduced associability of those stimuli. This would potentially result in lower mediated learning due to a decreased associability of the mediated representation, thereby explaining why extended training reductions in mediated learning occur. Using a probabilistic design, Holland shows that this reduction in mediated learning is likely not due to a change in associability.
We appreciate the note re Holland (2005) and have included a reference to it in our General Discussion. We agree with Holland that the reduction in mediated learning across extended training is not due to reduced associability of the retrieved stimulus representation. If this were the case, it would remain to explain why stimulus representations continue to be activated at test, which must occur for successful chaining of the sound-light and light-shock associations upon presentations of the sound alone. This is included in the modified text on page 18 (beginning line 434), which is part of our response to point 4.
Reviewer #3 (Recommendations for the authors):
(1) I think the 4th intro paragraph is essentially saying that more pairings during preconditioning encourage chaining as opposed to mediated learning - I might recommend clarifying this a bit. It took me a while to put it together.
Apologies for the confusion. We have clarified the argument at this point in the Introduction with the following insertion on page 4 (beginning line 84):
“That is, increasing the number of sound-light pairings may allow rats to encode information about stimulus order in stage 1 and, thereby, shift the locus of integration from mediated conditioning in stage 2 to chaining at test in stage 3 (Holmes et al., 2022).”
(2) In analyzing test data I am assuming percent freezing is the average of the entire 30s or 10s CS period - could this be clarified?
This is correct and has been clarified in the section for ‘Scoring and Statistics’ on page 29 (beginning line 708):
“Freezing data were collected using a time-sampling procedure in which each rat was scored as either ‘freezing’ or ‘not freezing’ every two seconds by an observer blind to the rat’s group allocation. A percentage score was then calculated by dividing the number of samples scored as freezing by the total number of samples. The baseline level of freezing was established by scoring the first two min at the start of each experimental session: i.e., we divided the total number of samples scored as freezing by the total number of observed samples, which was 60. The levels of freezing to the 10 s conditioned stimulus and 30 s preconditioned stimulus were established in a similar manner: we scored the entire period of each stimulus presentation and divided the number of samples scored as freezing by the total number of observed samples, which was 5 for each presentation of the conditioned stimulus and 15 for each presentation of the preconditioned stimulus.”
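The scoring arithmetic described above can be sketched as follows. This is an illustrative reconstruction, not the authors' scoring software; the function names are hypothetical, and observations are represented as booleans (True when a sample was scored as freezing).

```python
SAMPLE_INTERVAL_S = 2  # each rat is scored once every two seconds


def n_samples(duration_s, interval_s=SAMPLE_INTERVAL_S):
    """Number of observations taken over a scoring period of `duration_s` seconds."""
    return duration_s // interval_s


def percent_freezing(samples):
    """Percentage of observations scored as freezing.

    `samples` is a sequence of booleans, one per observation.
    """
    if not samples:
        raise ValueError("at least one observation is required")
    return 100 * sum(samples) / len(samples)


if __name__ == "__main__":
    print(n_samples(120))  # two-min baseline period
    print(n_samples(10))   # one 10 s conditioned stimulus presentation
    print(n_samples(30))   # one 30 s preconditioned stimulus presentation
    print(percent_freezing([True, True, False, True, False]))
```

The sample counts follow directly from the two-second interval: 60 for the two-min baseline, 5 for each conditioned stimulus presentation and 15 for each preconditioned stimulus presentation.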
(3) Complementary to the above - during the probe test is there a difference during the first/last 2s of the CS? This would be interesting with respect to understanding the associative structure encoded.
We have previously examined whether freezing responses change across the duration of a 30 s preconditioned stimulus and a 10 s conditioned stimulus. We have never seen any such changes: in our past work and in the present series of experiments, the expression of freezing is largely uniform across each presentation of a preconditioned or conditioned stimulus.
(4) It is sort of unclear to me why more CS-CS pairings produced stronger preconditioned fear - is it that both mediated learning and chaining occur and giving 32 pairings permits both processes more than 8 pairings?
This is a very reasonable explanation for the heightened level of sensory preconditioned fear among rats that received many sound-light pairings in the initial control experiment. We are, however, reluctant to offer a strong interpretation of this result as it was not replicated across subsequent experiments in the series: i.e., the levels of freezing to the sensory preconditioned stimulus at test were largely the same among vehicle-injected controls that received either few (8) or many (32) sound-light pairings in Experiments 2A and 2B, and again in Experiments 3A and 3B as well as Experiments 4A and 4B.
(5) I would suggest individual data points overlaid on the bars, violin plots, or box and whisker plots to provide a better visualization of the data.
We appreciate the suggestion – individual data points have been overlaid on the bars in each histogram.
(6) There are other citations that would strengthen arguments for the idea that unidirectional/temporal associative structure can be acquired during (appetitive) sensory preconditioning: Leising 2007 Learning and Behavior, Hart 2022 Current Biology, for example.
Thank you for these citations. We have included references to the Leising et al. (2007) and Hart et al. (2022) papers in our discussion on pages 18-19 (beginning line 442):
“Instead, we propose that the change in integration occurs because the increased number of sound-light pairings allows the rats to learn about the order in which the sound and light are presented (Figure 1; for evidence that rats acquire order information in sensory preconditioning, see Barnet et al., 1997; Hart et al., 2022; Leising et al., 2007; Miller & Barnet, 1993)…”
Editor's note:
We agree with the suggestions about full statistical reporting for non-significant results and about putting individual data points, perhaps coded to identify sex, on top of the bar graphs. Both will increase the transparency of the rigor of the work for readers.
We thank the editors and reviewers for their suggestions. We have included full statistical reporting for non-significant results and overlaid individual data points on the bars in each histogram.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Joint Public Review:
Summary:
The behavioral switch between foraging and mating is important for resource allocation in insects. This study investigated the role of the neuropeptide, sulfakinin, and of its receptor, the sulfakinin receptor 1 (SkR1), in mediating this switch in the oriental fruit fly, Bactrocera dorsalis. The authors use genetic disruption of sulfakinin and of SkR1 to provide strong evidence that changes in sulfakinin signaling alter odorant receptor expression profiles and antennal responses and that these changes mediate the behavioral switch. The combination of molecular and physiological data is a strength of the study. Additional work would be needed to determine whether the physiological and molecular changes observed account for the behavioral changes observed.
Strengths:
(1) The authors show that sulfakinin signaling in the olfactory organ mediates the switch between foraging and mating, thereby providing evidence that peripheral sensory inputs contribute to this important change in behavior.
(2) The authors' development of an assay to investigate the behavioral switch and their use of different approaches to demonstrate the role of sulfakinin and SkR1 in this process provides strong support for their hypothesis.
(3) The manuscript is overall well-organized and documented.
Weaknesses:
(1) The authors claim that sulfakinin acts directly on SkR1-positive neurons to modulate the foraging and mating behaviors in B. dorsalis. The authors also indicated in the schematic that satiation suppresses SkR1 expression. Additional experiments and a more detailed discussion of the results would help support these claims.
(2) The findings reported could be strengthened with additional experimental details regarding time of day versus duration of starvation effects and additional genetic controls, amongst others.
Recommendations for the authors:
Major issues
(1) As written the introduction is somewhat fragmented and does not lay out a clear rationale for the current study in the species used by the authors. Others, including Guo et al. (2021) and Wang et al. (2022), have previously shown that sulfakinin signaling pathways are important for feeding and receptivity regulation in D. melanogaster. Thus, the novelty of this study should be more clearly articulated.
The introduction has been substantially revised to better describe the rationale for the study (lines 60-66 in the revision).
(2) In addition, the Introduction should provide more specific background information on the pheromonal activity of oriental fruit fly body extract, the odor-preferences, and the sex pheromone of this species compared to that of model insects such as Drosophila melanogaster.
The revised introduction now contains a paragraph on the chemical ecology of the oriental fruit fly as it relates to this study (lines 67-75).
(3) It isn't clear what the first image in Figure 1C represents - is this a schematic of the area or does it represent data?
Fig. 1C and the associated figure caption have been revised. The track colors were changed to make the figure easier to read. The figure caption now reads: “Representative foraging trajectories in the 100 mm diameter arenas within a 15-min observation period of flies starved for different durations.”
(4) The authors should include examples of the EAG recordings following the stimulation with food volatiles or pheromones, not only the results of their analyses. This could be included in the main figures or even in supporting information.
As suggested, we added examples of the EAG recordings following stimulation with food odors and body extracts to Figure 1 and Figure 3.
(5) The demonstration that removal of the antennae severely impairs mating is dispensable because the antennae are required for other functions in addition to olfaction.
We agree that the antennae likely serve functions beyond olfaction. As suggested, we removed the data.
(6) It is currently difficult to understand how the authors measured successful rates of foraging. Please provide more details.
In the revision, we added a sentence describing the measurement method in detail. See lines 269-273.
(7) The expression of sulfakinin does not change significantly in the antennae following starvation (Figure 2A). Do the authors know whether they change in the central nervous system under these conditions? Have the authors (or has anyone else) checked the expression pattern of sulfakinin in the antennae? This information would help determine whether the sulfakinin signal that acts on SkR1 is released from neurons in the central nervous system (Figure S4C) or whether it is also released from the neurons in the olfactory organs. Based on the immunochemistry results shown in Figure S4C, it would also be interesting to determine whether the intensity of anti-sulfakinin immunoreactivity changes before versus after starvation. This could help establish whether sulfakinin is released during starvation.
We added expression data showing that the mRNA level of Sk in the head is higher after refeeding (Fig. S3). The change in Sk expression is also described in the text (lines 107-110). We were unable to identify Sk neurons in the antennae, suggesting the possibility that humoral Sk acts directly on the antennae.
(8) In Figure 2A, the authors show that the expression levels of some neuropeptides system components change during starvation. However, it would be helpful if the authors could include more detailed information on how the results are shown in the figure legends (e.g., the expression level of each candidate in fed flies was set as 1, etc).
We revised the legend of Figure 2 to explain how the expression values are presented.
(9) In Figure 2D, null mutant males of sulfakinin and SkR1 consume more food at all times compared to the wild type. However, the corresponding mutant females consume more food only at night. Is this because the wild-type female flies eat more food during the day? In a related issue, Figure 2D shows differences in food consumption measured at different times of day, however, this is not directly addressed in the text, which instead mentions that "the amount of excess food consumed by the mutants was dependent on the duration of the starvation period in both sexes".
Thank you for the important suggestions. We speculate that the difference in the amount of food consumed by females occurs only at night because the high basal feeding rate of females during the daytime masks the increase in feeding caused by knocking out Sk signaling. As suggested, we have added a description of the difference in food consumption. In addition, we changed the Y-axis scale in the figure to allow a fair comparison between males and females. See lines 123-128.
(10) It isn't clear how the time of day relates to the duration of starvation. This suggests that mutant females only consume more at 21:00 (presumably at night) whereas males consume more throughout the day. Does this suggest an interaction with the circadian system? What is the duration of starvation in Figure 3A? In a related issue, in Figure 4 it would be useful to know what time of day the EAG analysis was done because the data shown in Figure 2D suggests that the time of day significantly impacts behavioral responses. And does the red versus blue color scheme of the OR subunits represent up/downregulated levels in wild-type animals? Please define this for the reader.
Please also see our response to point 9 regarding the amount of food consumed by females. As the reviewer noted, there was indeed a diurnal difference in the amount of food consumed by B. dorsalis. However, we have not yet studied in depth whether this is related to circadian rhythms. When measuring food intake at these three times of day, we ensured that the duration of starvation was the same (12 h) in all cases. The duration of starvation in Figure 3A is also 12 h. We have mentioned this in the manuscript. See lines 267-268.
The EAG responses to sex pheromones and body surface extracts were measured from 21:00-23:00, and those to food odor were measured from 9:00-11:00. The times of the experiments are described in the revision. See lines 309-311.
Accordingly, we revised the figure caption to explain the colored fonts. Red denotes the set of ORs related to foraging and blue denotes the set of ORs related to mating. Thus, the ORs in red were upregulated, and the ORs in blue were downregulated, in starved wild-type flies. We have defined this in the revised manuscript. See lines 672-673.
(11) The authors convincingly show that SKR1 is present in the antennae and is co-expressed with orco. It would be useful to discuss whether this receptor is also expressed in other tissues where there may be additional sites of action of this pathway.
Indeed, SkR1 is also expressed in the Drosophila brain. We added a discussion of the expression of SkR1 and its possible additional sites of action within the central nervous system. See lines 200-205.
(12) It isn't clear what the dotted arrows in the model shown in Figure 5 represent.
Dashed arrows represent additional possible pathways that were not tested in this study but are not excluded by the model. Please see the discussion for details of additional factors that may modulate odorant sensitivity in relation to satiety. See lines 210-229.
(13) In Figure 5, the authors indicate that satiation suppresses SkR1 expression. It would be helpful if the authors tested the expression level of SkR1 in re-fed flies (by feeding the flies after 12h starvation) to see whether levels of expression are rapidly restored to the levels seen in satiated animals. Such a result could further support the claims made by the authors.
Thank you for your suggestion. Indeed, refeeding after 12 h of starvation significantly decreased SkR1 expression. We added this result to the supporting information (Fig. S3; see line 713). For the results, see lines 107-110.
(14) The authors show that locomotor activity is unaffected in the mutants but body size comparison would be more useful here since this could also contribute to baseline differences in meal size.
In the revision, we provide a body size comparison between WT and Sk-/- flies in the supplementary data. The results show that mutant flies have the same body size as WT flies (Fig. S7; see line 742). For the results, see lines 120-121.
(15) Have the authors tested the behavioral phenotypes of heterozygotes mutant of both Sk and SkR1 flies? This may reveal whether a reduced expression of Sk-SkR1 will also cause significant changes in the foraging and mating behaviors seen during starvation.
We tested the behavioral phenotypes of flies heterozygous for the Sk knockout. The results showed that foraging and mating behaviors of Sk heterozygous mutants were unaffected during starvation, suggesting that the mutation is completely recessive. We have added the results to the supporting information (Fig. S8; see line 746). For the results, see lines 132-135.
(16) It would be useful to provide information about which SK peptide is detected by the antibody used in Figure S4C. In Figures S4C and S5D, it would be useful to include a counterstain to show that the general morphology is unaffected in the mutants.
As suggested, we added a detailed description of the rabbit anti-BdSk antibody. See lines 362-363. We have improved the background of the image so that the general structure is visible; counterstaining should therefore not be essential.
(17) The figure legends for supporting figures need to be improved as they are currently difficult to understand. For example, in S2: what is the meaning of "different removal of antennae"? In S3: it isn't clear how the authors evaluated the responses in EAG experiments; in S4A: there are several DNA sequences that do not appear in the main text of the manuscript; in S4C: the meaning of the boxes and the dots is unclear, as is the figure to the left; in S5D, the authors explain only the suppression of SKR1, yet the figure indicates some images for SKR IHC. These are only a few examples; we ask that the authors revise and improve the legends for supporting figures.
For S2, we removed the data as suggested. For S3, we added a sentence describing the measurement method in detail. See lines 707-709. For S4, the figure has been substantially changed in the revision and a detailed description has been added to the legend (lines 717-724 in the revision). For S5, we have improved our description. See lines 731-734. In addition, we have checked all the figure legends in our manuscript; the changes are shown in the tracked-changes version.
Minor issues
(1) It isn't clear what the meaning of "the complexity of sulfakinin pathways" is. Please explain.
We have rewritten the sentence in the revised manuscript by adding the description “…complexity of Sk pathways, special and temporal dynamics and multiple ligands and receptors, is…”. See lines 61-65.
(2) Please double-check the calls to the various figures in the text.
We have double-checked the calls to all the figures in the text to make sure they were correct.
(3) L125: What is the meaning of "olfactory reprogramming"? Please explain.
We rephrased it to “alteration of olfactory sensitivities”. See line 145.
(4) L135: After mentioning qRT-PCR the authors should include a call to a figure that shows these results.
Thank you for your suggestion. The qRT-PCR results are shown in Figure 4B, and we have added a call to this figure as suggested. See line 154.
(5) L270: Details are provided for the extraction of the pheromone. However, more details are needed on how the EAG and other functional assays were done.
We have described the assay procedures in detail in the Materials and Methods section. See lines 298-311.
(6) Figure 2B. Please remove the period(".") at the C-terminal end of WT sk.
We are sorry for our mistake. We have corrected it.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
[…] Weaknesses:
While there are no glaring weaknesses in this study, it should be noted that a great deal of literature has pinpointed the MPOA (and specifically inhibitory cells in this area) as being critical to sexual behavior, including female mating. However, no study to my knowledge has explored self-paced female mating with such fine control over manipulating and monitoring cellular activity in this region. In addition, this study may act to inspire others to further explore the additional brain regions found to show upregulation of neural activity (Fos) during mating completion in the female using the data sets generated here.
Reviewer #2 (Public Review):
[…] Weaknesses:
The authors include an elegant manipulation of ejaculation-activated neurons in the MPOA using DREADD. However, this study was limited to show that activation of previously activated cells was sufficient to reduce approach behavior in a paced mating paradigm and receiving intromissions in a home cage mating paradigm. An inhibition approach using DREADD would have been a great complement to this study as it would have examined if activation of the cells was required. Moreover, additional tests for sexual motivation would have greatly strengthened the overall conclusions.
Reviewer #3 (Public Review):
[…] Weaknesses:
(1) Their activity-dependent labeling strategy is not exclusive to mating completion but instead includes all neurons active before, during, and after the social encounter. In the manuscript, the authors did not discuss the time course of Fos activation or the timeframe of the FosTRAP labeling strategy. Fos continues to be expressed and is detectable for hours following neural activation. Therefore, the FosTRAP strategy also labels neurons that were activated 3 hours before the injection of 4-OHT. The original FosTRAP2 paper which is cited in this manuscript (DeNardo et al, 2019) performed a detailed analysis of the labeling window in Supplementary Figure 2 of that paper. Here is quoted text from that paper: "Resultant patterns of tdTomato expression revealed that the majority of TRAPing occurred within a 6-hour window centered around the 4-OHT injection." Thus, the FosTRAP "mating completion" groups throughout this manuscript also include neurons activated 3 hours before mating completion, which includes neurons activated during appetitive and consummatory mating behaviors.
This makes all of the FosTRAP data very difficult to interpret. Compounding this is the issue that the two groups the authors compare in their experiments are females administered 4-OHT following appetitive investigation behaviors (with the male removed before mating behaviors occurred) and females administered 4-OHT following mating completion. The "appetitive" group labeled neurons activated only during appetitive investigation, but the "completion" group labeled neurons activated during appetitive investigations, consummatory mating bouts, and mating completion. Therefore, in the brain-wide analysis of Figure 2, it is impossible to identify brain regions that were activated exclusively by mating completion and not by consummatory mating behaviors. This could have been achieved if the "completion" group was compared to a group of females that had commenced consummatory mating behaviors but were separated from the male before mating was completed. Then, any neurons labeled by the "completion" FosTRAP but not the "consummatory" FosTRAP would be neurons specifically activated by mating completion. In the current brain-wide analysis experiments, neurons activated by consummatory behaviors and mating completion can not be disassociated.
This same issue is present in the interpretation of the chemogenetic activation data in Figure 6. In the experiments of Figure 6, the authors are activating neurons naturally activated during consummatory mating behaviors as well as those activated during mating completion.
We appreciate the reviewer's comments and concerns about the TRAP method.
First, we agree that the FosTRAP method does not have the sensitivity to separate ensembles activated within a short time window. In our preliminary results, we observed that injecting 4-OHT after mating completion induces more tdTomato+ cells in the MPN than injection after appetitive or consummatory behavior (Author response image 1).
To further compare the “consummatory” and “completion” ensembles, we included an additional cohort in which we TRAPed cells responding to consummatory behavior. This cohort has been added to Figures 2, 6, S3, S4, S9, S10 and S11. From the whole-brain mapping of TRAP cells, we found that many hypothalamic and extended amygdala areas, including the medial preoptic area and the bed nucleus of the stria terminalis, had significantly higher tdTomato+ cell density in the completion group than in the appetitive group, while the consummatory group also tended to have higher cell density than the appetitive group. In the Gq-DREADD experiment, we found that the Completion-hM3Dq group, but not the Consummatory-hM3Dq group, showed reduced sexual motivation in the self-paced mating assay (Figure 6). The Completion-hM3Dq group, but not the Consummatory-hM3Dq group, also showed significantly fewer intromission events and tended to show lower receptivity in the home cage mating assay (Figure S10). Furthermore, post-hoc histological analysis showed that the population of c-Fos+ and TRAP-labeled cells in the MPN tended to be larger in the Completion-hM3Dq group than in the Consummatory-hM3Dq group (Figure S9). These results, together with the in vivo calcium imaging experiments in Figures 3, 4 and 5, suggest that the MPN contains male-ejaculation-responsive cells that are distinct from the male-mounting-responsive cells and that they are sufficient to suppress female sexual motivation.
However, it is true that, with the current state of mouse genetic tools, we do not have any methods with higher temporal precision. We have discussed the limitations of the FosTRAP method, in particular its low temporal resolution, in the Discussion section.
Author response image 1.
Representative image showing TRAP labeling in the MPN after mating completion and intromission
(2) This study does not definitively show that the female mice used in this study display decreased sexual motivation after the completion of mating. The females exhibit reduced interaction with males that had also just completed mating, but it is unclear if the females would continue to show reduced interaction time if given the choice to interact with a male that was not in the post-ejaculatory refractory period. Perhaps, these females have a natural preference to interact more with sexually motivated males compared to recently mated (not sexually motivated) males. To definitively show that these females exhibit decreased sexual motivation the authors should perform two control experiments: 1) provide the females with access to a fully sexually motivated male after the females have completed mating with a different male to see if interaction time changes, and 2) compare interaction time toward mated and non-mated males using the self-paced mating assay. These controls would show that the reduction in the interaction time is because the females have reduced sexual motivation and not because these females just naturally interact with sexually motivated males more than males in the post-ejaculatory refractory period.
We highly appreciate the reviewer's comments regarding the interpretation of the self-paced mating assay. To address the concerns, we added an experiment in which the female subjects were introduced to a novel, sexually motivated male mouse in the self-paced mating assay immediately after receiving ejaculation (Figure S2). As a result, we found that, as in the self-paced mating assay using the same male animal, the female subjects spent significantly more time in the isolation zone on the post-ejaculation day than on the pre-ejaculation day.
(3) It is unclear how the transient 90-second response of these MPOA neurons following the completion of mating causes the prolonged reduction in female sexual motivation that is at the minutes to hours timeframe. No molecular or cellular mechanism is discussed.
(4) The authors discuss potential cell types and neural population markers within the MPOA and go into some detail in Figure S3. However, their experiments are performed with only the larger excitatory and inhibitory MPOA neural populations.
While the molecular or cellular mechanism underlying the prolonged activity of MPOA neurons is critical for understanding how sustained neural activity in the MPOA suppresses female sexual motivation, it is beyond the scope of the current manuscript and is a subject for future studies. We have added a section to the Discussion to further discuss the potential molecular mechanisms.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
If the authors haven't already, it would be useful if the authors could make the brain-wide analysis of Fos activity publicly available.
We have deposited the data at https://dandiarchive.org/
I would also make sure the n's are included in each Figure Legend for each panel (some are missing in the Supplementals).
We appreciate the comment. We have added the number of subjects to Figures 3, 4 and 5.
It would also be best to provide clearer labels to some of the Figures, for example, Figure 5D, the Types should also be labeled with what behaviors they correspond to.
We appreciate the comment. Figure 5 is focused on post-ejaculation neural activity. The cell types are categorized based on neural activity after experiencing male ejaculation; they do not correspond to any behaviors.
Reviewer #2 (Recommendations For The Authors):
(1) A first recommendation is to replace the use of the term "mating completion" with "ejaculation". Male and female rodents display a period of reduced approach behavior following display or experiencing ejaculation, which is referred to as the post-ejaculatory interval. The current studies investigate the neural ensemble that contributes to this post-ejaculatory interval in female mice. In addition, male and female rodents will display a prolonged period of sexual inactivity referred to as satiety, which is typically observed after repeated display or experience of ejaculations. The current studies do not investigate satiety. Moreover, in the current studies, female mice appeared to display approach behavior (time in the interaction zone) even within the 10 minutes following experiencing ejaculation (Fig 1F). Hence, the term "completion" is not accurate and should be replaced by "ejaculation" in all figures and throughout the manuscript. Replacing completion with ejaculation will also clarify what defines "onset of completion", which this reviewer assumes refers to the onset of ejaculatory behavior observed in the male.
Thank you for the comment. We agree that the term “mating completion” was inappropriate. We have changed the wording to “ejaculation” or “post-ejaculatory period”.
(2) Likewise, a variety of other terms and descriptions need to be adjusted for consistency and accuracy. For example, "room" when referring to the interaction or isolation zones; "onset of mating completion" when referring to ejaculation; "male intruder" to refer to the introduction of the male mating partner, but using a term typically used for an intruder-resident aggression test. Replacing these terms will aid in reducing confusion for the reader and more accurately describe the behavioral parameters.
We appreciate the comment. We have changed the term “male intruder” to “partner”, and “room” to “area” or “zone”.
(3) The use of the paced mating paradigm is a strength of these studies. This paradigm has been widely used and validated to study female sexual behavior in rodents. Please refer to recent reviews and landmark papers using this paradigm in addition to the current cited papers to better reflect the vast wealth of studies that previously reported the behavioral data that were replicated in this study.
We have added a section discussing the self-paced mating assay, its merits and its caveats (P8).
(4) In the paced mating test, females can pace the receipt of sexual stimulation, and latencies to withdraw and return to the male-containing chamber are considered indicators of sexual motivation. Female withdrawal will increase with the intensity of the sexual stimulation and latency to return is longer following ejaculation. Paced mating is thus a balance of approach and withdrawal behaviors that increases reward and likelihood of pregnancy for females. Moreover, ejaculation-induced withdrawal and longer latencies to return and approach are altered by hormonal status and by the introduction of a novel male partner. Thus, female sexual behavior is complex and withdrawal behavior (in this paper measured as time spent in an isolation zone) needs to be interpreted with caution and not simply referred to as sexual motivation. I recommend expanding the description of the paradigm to highlight the strengths and limitations of this paradigm and use caution to interpret time spent in the isolation zone as a lack of sexual motivation. I also recommend referring to the period after ejaculation as the post-ejaculatory interval (instead of completion).
Thank you for the comment. We have changed the wording in the manuscript to adjust the way it refers to sexual motivation.
(5) In the current paper, time in the isolation zone and the number of transitions are used as the behavioral measures. Latencies, which are typically included in paced mating studies, were missing from the data. If data are available for latencies to withdraw and return to the interaction zone after mount, intromission, and ejaculation, please add these data. If such data were not collected or are not available, please recognize this caveat.
Thank you for the comment. For Figure 1, in which all animals experienced male ejaculation, we added a latency analysis (Figures 1I and 1P). The result indicates that, as suggested in the literature, female mice took significantly longer to return to the interaction zone after male ejaculation.
(6) The brain-wide mapping study of cFos expression after ejaculation confirms and extends prior findings, mostly in rats. Please reference prior papers in female rodents showing cFos after ejaculation and discuss how the current data replicate or differ from prior data.
In the manuscript (P8 L351), we have referred to Pfaus et al., 1993 to discuss the similarity to the c-Fos expression pattern studied in rats. We have added further descriptions to emphasize the similarity between the two datasets.
(7) A paragraph describing the specific cell types that are activated in the MPOA is an essential part of the study and is described in detail, but only shown in supplementary figures. Given the emphasis on this particular part of the study, a recommendation is to incorporate these data as a regular figure instead of supplementary material.
While we greatly appreciate the comment, we consider that the molecular characterization of MPOA neurons is not the main focus of the paper and have decided to keep it in the supplementary figures.
(8) Calcium imaging studies were performed in the home cage for obvious practical reasons. However, in the home cage testing, the females withdraw from the males using a different approach and do not exit an interaction zone through a division. There may also be differences in the male sexual behavior patterns and thus the stimulation that females receive from the male. Yet, it appears that ejaculation induces similar patterns of neural activation in this paradigm. Thus, it is likely that neuron activation is a result of receiving ejaculation, rather than withdraw behavior. Please briefly discuss the comparisons between the cFos and calcium imaging conclusions in these two different paradigms.
We have added a section discussing the self-paced mating assay, its merits and its caveats (P8). Withdrawal, latency and their interpretation are discussed in this section.
(9) The final study includes the manipulation of ejaculation-activated neurons in the MPOA using DREADD. This study was limited to show that activation of previously activated cells was sufficient to reduce approach behavior in a paced mating paradigm and receiving intromissions in a home cage mating paradigm. An inhibition approach using DREADD would have been a great complement to this study as it would have shown if activation of the cells was required. Moreover, additional tests for sexual motivation, such as partner preference tests would have greatly strengthened the results since a lack of entering an interaction zone can also be explained by impaired sensory processing or locomotor behavior. Finally, CNO also appeared to impact time in the isolation zone for a subset of animals in the ejaculation (completion) control group and the appetitive group. These effects didn't reach statistical significance, but groups also had low sample sizes (n=6-7) and may thus have been underpowered. The recommendation is to include these caveats and shortcomings in the discussion of these results.
We appreciate the comments. We first added an inhibitory approach to test the necessity of the MPOA neurons. As a result, we found that inhibition of these neurons did not affect behavior in the self-paced mating assay but increased the subjects' sexual receptivity (Figure S11). Regarding the low sample size, we have added a power analysis to the statistical section.
(10) The studies utilized ovariectomized females with hormone priming. Since sexual receptivity in females is highly dependent on the hormonal milieu, the authors are encouraged to add an explanation of why ovariectomized females were used and if the results may have differed in cycling females.
We appreciate the comments. The female subjects used in the TRAP experiment need to experience ejaculation from the male mice twice: once to label the cells, and a second time during the reactivation. To avoid pregnancy during the first experience, we ovariectomized the females and controlled their hormonal conditions. This method has been used successfully in other sexual behavior studies (Yang et al., 2013; Ring, 1944). This was described in P11. We have further demonstrated in Figure 1N-T that female mice that were not ovariectomized and were under the natural estrous cycle showed similar suppression of sexual interaction after the completion of mating. The manuscript was updated to discuss that the behavior change after mating completion is not dependent on the ovary.
(11) Overall, the paper lacks references to relevant prior studies. For example, many studies have been reported over the past 2-3 decades about the effects of female rodent sexual behavior on activation in the brain and the effects of different vaginocervical stimulation on pregnancy and fertility. It is absolutely the case that much remains unknown about the complex neural circuitries that control behavior during the post-ejaculatory interval and sexual satiety in both male and female rodents, but studies have indicated roles for hypothalamic areas, bed nucleus of the stria terminalis, ventral tegmental area, posterior thalamus, and prefrontal cortex. Hence, the current introduction and discussion do not adequately summarize or acknowledge these prior investigations and therefore do not place these new findings in the context of what was previously known.
We appreciate the comment and have added references at P2 L65 and P8 L355-357 to discuss the existing literature on c-Fos mapping analysis after ejaculation or genital stimulation in female rats.
(12) Finally, sample sizes appear to be modest, ranging n=4-8 (except n=14 in the completion group in Figure S7) and vary between groups within and between studies. Please explain in the methods section how sample sizes were pre-determined and acknowledge if studies may have potentially been underpowered.
The sample sizes for the behavior experiments in this study were n = 6-9. These were predetermined based on previous studies examining female sexual behavior (Ishii et al. 2017, Liu et al. 2022, Yin et al. 2022). To further examine the number of animals required for our behavioral experiments, we pooled data used in this study (n = 111 pooled data, control n = 94, stim n = 17) and conducted a power analysis using the variance calculated from the pooled average time in the isolation zone. These data were pooled from control animals in each experiment (e.g., animals injected with GFP control virus, saline, etc.). The average time in the isolation zone after ejaculation or after reactivating the completion cells was 420 ± 210 seconds, compared with 49 ± 91 seconds in the control group (mean ± s.d.). Within this population, we found that 5 animals were sufficient to detect the difference (p < 0.05, power = 0.8) with Student's t-test. We have added this explanation to the supplemental experimental procedures, P18, lines 817-827.
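For readers who want to see how a sample-size estimate of this kind can be approached, the following is a minimal sketch and not the authors' calculation. It assumes a two-sided, two-sample comparison with equal group sizes and uses the standard normal approximation; the function name `n_per_group` and the equal-weight pooling of the two standard deviations are assumptions made for illustration. Only the means, standard deviations, significance level and power are taken from the response above.

```python
import math
from statistics import NormalDist


def n_per_group(mean1, sd1, mean2, sd2, alpha=0.05, power=0.8):
    """Approximate number of animals per group (hypothetical helper).

    Normal approximation for a two-sided, two-sample comparison:
        n = 2 * ((z_alpha + z_power) / d) ** 2
    where d is the standardized difference between the two means.
    """
    # Pooled SD with equal weights, i.e. assuming equal group sizes.
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    d = abs(mean1 - mean2) / pooled_sd
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Values quoted in the response: 420 ± 210 s versus 49 ± 91 s (mean ± s.d.).
print(n_per_group(420, 210, 49, 91))
```

With these inputs the approximation gives 3 animals per group. The normal approximation ignores the heavier tails of the t distribution and so tends to underestimate the requirement at very small sample sizes; a t-based calculation on the actual pooled data can therefore be expected to give a somewhat larger number, such as the 5 reported.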
Reviewer #3 (Recommendations For The Authors):
The authors should discuss the fact that the FosTRAP2 strategy labels neurons activated 3 hours before the 4-OHT injection. As the manuscript is written, it seems to suggest that the 4-OHT injection given following mating completion only labeled neurons activated during mating completion. This is very misleading. I respect the amount of work and rigor that went into these experiments. The single-cell imaging, implementation of the FosTRAP strategy, and behavioral analysis are all well executed. Novel insights into the neural regulation of female sexual drive can be gleaned from the neural imaging experiments. Unfortunately, the limitations of the FosTRAP strategy make those studies very difficult to interpret, and therefore, a more candid discussion and re-interpretation of the data from the FosTRAP experiments is needed.
We appreciate the reviewer's comments and concerns about the TRAP method.
First, we agree that the FosTRAP method does not have the sensitivity to separate ensembles activated within a short time window. In our preliminary results, we observed that injecting 4-OHT after mating completion induces more tdTomato+ cells in the MPN than injection after appetitive or consummatory behavior (Author response image 1).
To further compare the “consummatory” and “completion” ensembles, we included an additional cohort in which we TRAPed cells responding to consummatory behavior. This cohort has been added to Figures 2, 6, S3, S4, S9, S10 and S11. From the whole-brain mapping of TRAP cells, we found that many hypothalamic and extended amygdala areas, including the medial preoptic area and the bed nucleus of the stria terminalis, had significantly higher tdTomato+ cell density in the completion group than in the appetitive group, while the consummatory group also tended to have higher cell density than the appetitive group. In the Gq-DREADD experiment, we found that the Completion-hM3Dq group, but not the Consummatory-hM3Dq group, showed reduced sexual motivation in the self-paced mating assay (Figure 6). The Completion-hM3Dq group, but not the Consummatory-hM3Dq group, also showed significantly fewer intromission events and tended to show lower receptivity in the home cage mating assay (Figure S10). Furthermore, post-hoc histological analysis showed that the population of c-Fos+ and TRAP-labeled cells in the MPN tended to be larger in the Completion-hM3Dq group than in the Consummatory-hM3Dq group (Figure S9). These results, together with the in vivo calcium imaging experiments in Figures 3, 4 and 5, suggest that the MPN contains male-ejaculation-responsive cells that are distinct from the male-mounting-responsive cells and that they are sufficient to suppress female sexual motivation.
However, it is true that, with the current state of mouse genetic tools, we do not have any methods with higher temporal precision. We have discussed the limitations of the FosTRAP method, in particular its low temporal resolution, in the Discussion section.
Editor notes:
Should you choose to revise your manuscript, please include full statistical reporting in the main text including test statistic, degrees of freedom, an exact P value.
Thank you for the comment. The statistical values were added to the manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #3 (Public review):
Human and simian immunodeficiency viruses (HIV and SIV, respectively) evolved numerous mechanisms to compromise effective immune responses but the underlying mechanisms remain incompletely understood. Here, Yamamoto and Matano examined the humoral immune response in a large number of rhesus macaques infected with the difficult-to-neutralize SIVmac239 strain and identified a subgroup of animals showing significant neutralizing Ab responses. Sequence analyses revealed that in most of these animals (7/9) but only a minority in the control group (2/19) SIVmac variants containing a CD8+ T-cell escape mutation of G63E/R in the viral Nef gene emerged. Functional analyses revealed that this change attenuates the ability of Nef to stimulate PI3K/Akt/mTORC2 signalling. The authors propose that this improved induction of SIVmac239 nAb is reciprocal to antibody dysregulation caused by a previously identified human PI3K gain-of-function mutation associated with impaired anti-viral B-cell responses. Altogether, the results suggest that PI3K signalling plays a role in B-cell maturation and generation of effective nAb responses. Preliminary data indicate that Nef might be transferred from infected T cells to B cells by direct contact. However, the exact mechanism and the relevance for vaccine development requires further studies
Strengths of the study are that the authors analyzed a large number of SIVmac-infected macaques to unravel the biological significance of the known effect of the interaction of Nef with PI3K/Akt/mTORC2 signaling. This is interesting and may provide a novel means to improve humoral immune responses to HIV. In the revised version the authors made an effort to address previous concerns. In particular, they provide data supporting that Nef might be transferred to B cells by direct cell-cell contact. In addition, they provide some evidence that G63R, which also emerged in most animals, does not share the disruptive effect of G63E, although the experimental examination and discussion of why G63R might emerge remain poor. Another weakness that remains is that some effects of the G63E mutation are modest and effects were not compared to SIVmac constructs lacking Nef entirely. The evidence for a role of the Nef G63E mutation in PI3K signaling and the association with improved nAb responses was largely convincing, and it is appreciated that the authors provide additional evidence for a potential impact of "soluble" Nef on neighboring B cells. However, the experimental set-up and the results are difficult to comprehend. It seems that direct cell-cell contact is required and membranes are exchanged. Since Nef is associated with cellular membranes this might lead to some transfer of Nef to B cells. However, the immunological and functional consequences of this remain largely elusive. Alternatively, Nef-mediated manipulation of helper CD4 T cells might also impact B cell function and effective humoral immune responses. As previously noted, the presentation of the results and conclusions was in part very convoluted and difficult to comprehend. While the authors made attempts to improve the writing, parts of the manuscript are still challenging to follow.
This applies even more to the rebuttal (complex words combined with poor grammar), which made it difficult to assess which concerns have been satisfactorily addressed.
We are grateful for the insightful comments. Based on the suggestions, we have edited the writing throughout and added remarks in the Discussion section on certain points raised. For points that need experimentation, we would like to address them in a follow-up study now under preparation.
Reviewer #3 (Recommendations for the authors):
Additional editing of the manuscript is highly recommended to make the results accessible for a broad readership.
We are grateful for the important suggestion. Accordingly, we have edited the manuscript with a broad readership in mind.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Wang et al. generate XAP5 and XAP5L knockout mice and find that they are male infertile due to meiotic arrest and reduced sperm motility, respectively. RNA-Seq was subsequently performed and the authors concluded that XAP5 and XAP5L are antagonistic transcription factors of ciliogenesis (in XAP5-KO P16 testis: 554 genes were upregulated and 1587 genes were downregulated; in XAP5L-KO sperm: 2093 genes were upregulated and 267 genes were downregulated).
We are grateful for the comprehensive summary.
Strengths:
Knockout mouse models provided strong evidence to indicate that XAP5 and XAP5L are critical for spermatogenesis and male fertility.
Thank you for your positive comment.
Weaknesses:
The key conclusions are not supported by evidence. First, the authors claim that XAP5 and XAP5L transcriptionally regulate sperm flagella development; however, detailed molecular experiments related to transcription regulation are lacking. How do XAP5 and XAP5L regulate their targets? RNA-Seq alone is not enough. Second, the authors declare that XAP5 and XAP5L are antagonistic transcription factors; however, how do XAP5 and XAP5L regulate sperm flagella development antagonistically? Again, RNA-Seq alone is not enough. Third, I am concerned about whether XAP5 really regulates sperm flagella development. XAP5 is specifically expressed in spermatogonia and XAP5-cKO mice are in meiotic arrest, indicating that XAP5 regulates meiosis rather than sperm flagella development.
Thank you for the critical comments. To strengthen our conclusions, we have included XAP5/XAP5L CUT&Tag data in our revised manuscript. This highly sensitive method has allowed us to identify direct target genes of XAP5 and XAP5L (Table S1, Figure S6). Notably, our results demonstrate that both FOXJ1 and RFX2 are occupied by XAP5 (Figure 4G). Additionally, real-time PCR validation confirmed that RFX2 is also associated with XAP5L, even though enriched peaks for the RFX2 gene were not detected in the initial CUT&Tag data (Figure 4G). These findings indicate that XAP5 and XAP5L regulate the expression of FOXJ1 and RFX2 by directly binding to these genes. De novo motif analyses revealed that XAP5 and XAP5L shared a conserved binding sequence (CCCCGCCC/GGGCGGGG) (Figure S6C), and the bound regions of FOXJ1 and RFX2 contain this sequence. Further analysis shows that many XAP5L target genes are also targets of XAP5 (Figure S6G), despite the limited number of identified XAP5L target genes. This differential binding and regulation of shared target genes underscore the antagonistic relationship between XAP5 and XAP5L. Collectively, these findings provide additional support for the idea that XAP5 and XAP5L function as antagonistic transcription factors, acting upstream of transcription factor families, including FOXJ1 and RFX factors, to coordinate ciliogenesis during spermatogenesis.
While we agree that XAP5 primarily regulates meiosis during spermatogenesis, our data also indicate that many cilia-related genes, including key transcription regulators of spermiogenesis such as RFX2 and SOX30, are downregulated in XAP5-cKO mice and are bound by XAP5 (Figure 4, Figures S4 and S6). It is important to note that genes coding for flagella components are expressed sequentially and in a germ cell-specific manner during development. When we refer to "regulating sperm flagella development", we mean the spatiotemporal regulation. We have revised the manuscript to clarify this point.
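The shared binding sequence reported in the response above is written as CCCCGCCC/GGGCGGGG, which are the two strands of a single motif. The sketch below checks that the two forms are reverse complements of each other and scans a sequence for the motif on both strands. It is an illustration only: the function names and the example sequences are hypothetical and are not taken from the manuscript, and real motif analysis would normally use position weight matrices rather than exact string matching.

```python
# Minimal sketch: exact-match scan for a motif on both DNA strands.
# Function names and test sequences are hypothetical illustrations.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif_both_strands(seq: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for each exact motif match.

    A '+' hit matches the motif as written; a '-' hit matches its
    reverse complement, i.e. the motif lies on the opposite strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    width = len(motif)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if window == motif:
            hits.append((start, "+"))
        elif window == rc_motif:
            hits.append((start, "-"))
    return hits
```

Here, `reverse_complement("CCCCGCCC")` evaluates to `"GGGCGGGG"`, which is why the response lists the motif in both forms.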
Reviewer #2 (Public Review):
In this study, Wang et al. report the significance of XAP5L and XAP5 in spermatogenesis, where they are involved in the transcriptional regulation of ciliary genes in the testes. In previous studies, the authors demonstrated that XAP5 is a transcription factor required for flagellar assembly in Chlamydomonas. Continuing from their previous study, the authors examine the conserved role of XAP5 and XAP5L, which form an orthologue pair in mammals.
XAP5 is expressed ubiquitously and XAP5L testis-specifically, and their absence in the testes causes male infertility with defective spermatogenesis. Interestingly, XAP5 deficiency arrests germ cell development at the pachytene stage, whereas XAP5L absence causes impaired flagellar formation. RNA-seq analyses demonstrated that XAP5 deficiency suppresses ciliary gene expression, including Foxj1 and Rfx family genes, in early testis. By contrast, in XAP5L-deficient mice, Foxj1 and Rfx transcripts abnormally persist in mature sperm. From the results, the authors conclude that XAP5 and XAP5L are antagonistic transcription factors that function upstream of Foxj1 and Rfx family genes.
This reviewer thinks the overall experiments are performed well and that the manuscript is clear. However, the current results do not directly support the authors' conclusion. For example, the transcriptional function of XAP5 and XAP5L requires more evidence. In addition, this reviewer wonders about the conserved XAP5 function of ciliary/flagellar gene transcription in mammals - the gene is ubiquitously expressed despite its functional importance in flagellar assembly in Chlamydomonas. Thus, this reviewer thinks authors are required to show more direct evidence to clearly support their conclusion with more descriptions of its role in ciliary/flagellar assembly.
Thank you for your thoughtful review of our work. We appreciate your positive feedback on the overall quality of the experiments and the clarity of the manuscript. In response to your concerns, we have included new experimental data and made revisions to the manuscript (lines 193-217) to better support our conclusions, particularly regarding the transcriptional function of XAP5 and XAP5L. Additionally, we have expanded on the role of XAP5 in ciliary and flagellar assembly to provide more direct evidence for its functional importance. Thank you for your insights.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
The title (Control of ciliary transcriptional programs during spermatogenesis by antagonistic transcription factors) is not specific and does tend to exaggerate.
Thank you for the comment, and we appreciate the opportunity to clarify the appropriateness of the title. Our paper extensively investigates the transcriptional regulation of ciliary genes during spermatogenesis. It demonstrates that XAP5/XAP5L are key transcription factors involved in this process. The title reflects our primary focus on the transcriptional programs that govern ciliary gene expression. Moreover, our paper shows that XAP5 positively regulates the expression of ciliary genes, particularly during the early stages of spermatogenesis, while XAP5L negatively regulates these genes. This antagonistic relationship is a crucial aspect of the study and is effectively conveyed in the title. In addition, our revised paper provides detailed insights into how XAP5/XAP5L control ciliary gene expression during spermatogenesis.
Figure 4C: FOXJ1 and RFX2 are absent in sperm from WT mice. Are you sure? They are highly expressed in WT testes.
Thank you for your careful review. While FOXJ1 and RFX2 are indeed highly expressed in the testes of wild-type (WT) mice, our data show that they are not detectable in mature sperm. This observation is consistent with published single-cell RNA-seq data (Jung et al., 2019), which indicate that FOXJ1 and RFX2 are primarily expressed in spermatocytes but not in spermatids (Figure S7). This expression pattern aligns with that of IFT-particle proteins, which are essential for the formation but not the maintenance of mammalian sperm flagella (San Agustin, Pazour, & Witman, 2015).
XAP5 is specifically expressed in spermatogonia and XAP5-cKO mice are in meiotic arrest, indicating that XAP5 regulates meiosis rather than sperm flagella development.
We appreciate your insightful comments. As mentioned above, we agree that XAP5 primarily regulates meiosis during spermatogenesis. When we mentioned "regulating sperm flagella development," we were referring to the spatiotemporal regulation of these processes. We have revised the manuscript to clarify this distinction. Thank you for your understanding.
The title of Figure 2 (XAP5L is required for normal sperm formation) is not accurate because the progress of spermatogenesis and sperm count is normal in XAP5L-KO mice (only sperm motility is reduced).
We apologize for any confusion caused by the previous figure. It did not accurately convey the changes in sperm count. In the revised Figure 2B, we clearly demonstrate that the sperm count in XAP5L-KO mice is indeed lower than that in WT mice. This revision aims to provide a more accurate representation of the effects of XAP5L deficiency on spermatogenesis. Thank you for bringing this to our attention.
Reviewer #2 (Recommendations For The Authors):
(1) Although XAP5 and XAP5L deficiency alters the transcription of Foxj1 and Rfx family genes, which are essential transcription factors for ciliogenesis, current data do not directly support that XAP5 and XAP5L are the upstream transcription factors. The authors need to show more direct evidence such as ChIP-Seq data.
Thank you for your valuable feedback! In this revised manuscript, we have included data identifying candidate direct targets of XAP5 and XAP5L using the highly sensitive CUT&Tag method (Kaya-Okur et al., 2019). Our results show that XAP5 occupies both FOXJ1 and RFX2 (Figure 4G). Furthermore, real-time PCR validation of the CUT&Tag experiments confirmed that RFX2 is also occupied by XAP5L (Figure 4G), despite the initial CUT&Tag data not revealing enriched peaks for the RFX2 gene (Table S1). Unfortunately, the limited number of enriched peaks identified for XAP5L (Table S1) suggests that the XAP5L antibody used in the CUT&Tag experiment might have suboptimal performance, which prevented us from detecting occupancy on the FOXJ1 promoter. Nevertheless, these additional data provide strong evidence that XAP5 and XAP5L function as upstream transcription factors for FOXJ1 and RFX family genes, supporting their essential roles in ciliogenesis.
(2) Shared transcripts that are altered by the absence of either XAP5 or XAP5L do not clearly support they are antagonistic transcription factors.
Thank you for your insightful comment. In our revised manuscript, we performed CUT&Tag analysis to identify target genes of XAP5 and XAP5L. Motif enrichment analysis revealed conserved binding sequences for both factors (Figure S6C), indicating a subset of shared downstream genes between XAP5 and XAP5L. Among the downregulated genes in XAP5 cKO germ cells, 891 genes were bound by XAP5 (Figure S6D). Although the number of enriched peaks identified for XAP5L was limited, 75 of the upregulated genes in XAP5L KO sperm were bound by XAP5L (Figure S6E). Importantly, of these 75 XAP5L target genes, approximately 30% (22 genes) were also identified as targets of XAP5 (Figure S6G), further supporting the idea that XAP5 and XAP5L function as antagonistic transcription factors.
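The overlap figure quoted above (22 of 75 XAP5L targets also bound by XAP5) is a simple set intersection expressed as a fraction of the smaller target list. The sketch below shows that calculation; the function name and the gene identifiers used in the usage note are hypothetical placeholders, not genes from Table S1.

```python
# Minimal sketch: shared targets between two gene lists.
# The function name is illustrative; gene symbols passed in are up to the caller.

def shared_target_fraction(targets_a, targets_b):
    """Return (number shared, fraction of targets_a also found in targets_b)."""
    a, b = set(targets_a), set(targets_b)
    shared = a & b
    fraction = len(shared) / len(a) if a else 0.0
    return len(shared), fraction


# Consistency check on the numbers reported in the response: 22 of 75.
reported_percent = round(22 / 75 * 100, 1)  # 29.3, i.e. "approximately 30%"
```

For example, `shared_target_fraction(["g1", "g2", "g3", "g4"], ["g2", "g4", "g9"])` returns `(2, 0.5)`. Note that an overlap fraction alone does not say whether the overlap exceeds chance; that would require a background gene set, which is not specified in the response.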
(3) XAP5 seems to be an ancient transcription factor for cilia and flagellar assembly. However, XAP5 expresses ubiquitously in mice. How can this discrepancy be explained? Is it also required for primary cilia assembly? Are their expression also directly linked to ciliogenesis in other types of cells?
Thank you for the thoughtful questions. The ubiquitous expression of XAP5 in mice can be understood in light of its role as an ancient transcription factor for cilia and flagellar assembly. Given that cilia are present on nearly every cell type in the mammalian body (O'Connor et al., 2013), this broad expression pattern makes sense. In fact, XAP5 serves not only as a master regulator of ciliogenesis but also as a critical regulator of various developmental processes (Kim et al., 2018; Lee et al., 2020; Xie et al., 2023).
Our current unpublished work demonstrates that XAP5 is essential for primary cilia assembly in different cell lines. The loss of XAP5 protein results in abnormal ciliogenesis, further supporting its vital role in ciliary formation across different cell types.
We believe that the widespread expression of XAP5 reflects its fundamental importance in multiple cellular processes, including ciliogenesis, development, and potentially other cellular functions yet to be discovered.
(4) XAP5L deficiency impairs flagellar assembly. Have the authors observed any other physiological defects in the absence of XAP5L in mouse models, such as hydrocephalus and/or tracheal defects?
Thank you for the questions. We have carefully examined XAP5L KO mice for other physiological defects. To date, we have not observed any additional physiological abnormalities. Specifically, we assessed the condition of tracheal cilia in XAP5L KO mice and found no significant differences compared to wild-type (WT) mice, as illustrated in Author response image 1 below.
Author response image 1.
References
Jung, M., Wells, D., Rusch, J., Ahmad, S., Marchini, J., Myers, S. R., & Conrad, D. F. (2019). Unified single-cell analysis of testis gene regulation and pathology in five mouse strains. eLife, 8. doi:10.7554/eLife.43966
Kaya-Okur, H. S., Wu, S. J., Codomo, C. A., Pledger, E. S., Bryson, T. D., Henikoff, J. G., . . . Henikoff, S. (2019). CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat Commun, 10(1), 1930. doi:10.1038/s41467-019-09982-5
Kim, Y., Hur, S. W., Jeong, B. C., Oh, S. H., Hwang, Y. C., Kim, S. H., & Koh, J. T. (2018). The Fam50a positively regulates ameloblast differentiation via interacting with Runx2. J Cell Physiol, 233(2), 1512-1522. doi:10.1002/jcp.26038
Lee, Y.-R., Khan, K., Armfield-Uhas, K., Srikanth, S., Thompson, N. A., Pardo, M., . . . Schwartz, C. E. (2020). Mutations in FAM50A suggest that Armfield XLID syndrome is a spliceosomopathy. Nat Commun, 11(1). doi:10.1038/s41467-020-17452-6
O'Connor, A. K., Malarkey, E. B., Berbari, N. F., Croyle, M. J., Haycraft, C. J., Bell, P. D., . . . Yoder, B. K. (2013). An inducible CiliaGFP mouse model for in vivo visualization and analysis of cilia in live tissue. Cilia, 2(1), 8. doi:10.1186/2046-2530-2-8
San Agustin, J. T., Pazour, G. J., & Witman, G. B. (2015). Intraflagellar transport is essential for mammalian spermiogenesis but is absent in mature sperm. Mol Biol Cell, 26(24), 4358-4372. doi:10.1091/mbc.E15-08-0578
Xie, X., Li, L., Tao, S., Chen, M., Fei, L., Yang, Q., . . . Chen, L. (2023). Proto-Oncogene FAM50A Can Regulate the Immune Microenvironment and Development of Hepatocellular Carcinoma In Vitro and In Vivo. Int J Mol Sci, 24(4). doi:10.3390/ijms24043217
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this manuscript, Corso-Diaz et al. focus on the NRL transcription factor (TF), which is critical for retinal rod photoreceptor development and function. The authors profile NRL's protein interactome, revealing several RNA-binding proteins (RBPs) among its components. Notably, many of these RBPs are associated with R-loop biology, including DHX9 helicase, which is the primary focus of this study. R-loops are three-stranded nucleic acid structures that frequently form during transcription. The authors demonstrate that R-loop levels increase during photoreceptor maturation and establish an interaction between NRL TF and DHX9 helicase. The association between NRL and RBPs like DHX9 suggests a cooperative regulation of gene expression in a cell-type-specific manner, an intriguing discovery relevant to photoreceptor health. Since DHX9 is a key regulator of R-loop homeostasis, the study proposes a potential mechanism where a cell-type-specific TF controls the expression of certain genes by modulating R-loop homeostasis. This study also presents the first data on R-loop mapping in mammalian retinas and shows the enrichment of R-loops over intergenic regions as well as genes encoding neuronal function factors. While the research topic is very important, there are some concerns regarding the data presented. There are substantial data supporting the interaction between NRL and DHX9, including pull-down experiments and proximity ligation assays (PLA); however, the data showing an interaction between NRL and DDX5, another R-loop-associated helicase, are inadequate. Importantly, the data supporting the claim that NRL interacts with R-loops are absolutely insufficient and, at best, correlative. Further concerns relate to the R-loop mapping data analysis and visualization.
Strengths:
There is compelling evidence that the NRL transcription factor interacts with several RNA binding proteins, and specifically, sufficient data supporting the interaction of NRL with DHX9 helicase.
A major strength is the use of the single-stranded R-loop mapping method in the mouse retina.
Weaknesses:
(1) Figure S1A: There is a strong band in GST-IP (control IP) for either HNRNPUL1 or HNRNPU, although the authors state in their results that there is a strong interaction of these two RBPs with NRL.
Under our experimental conditions, most RNA-binding proteins displayed elevated background binding to glutathione beads (Fig. S1A). However, GST-NRL purifications showed much stronger signals for the respective RBPs. In the case of HNRNPU and HNRNPUL1, white bands, which indicate substrate depletion due to higher protein levels, are observed in the GST-NRL lanes. Additionally, in Figures 1B and 1C, there is a clear enrichment of HNRNPU and HNRNPUL1 above the background signal. We added this to the text. See page 5.
Both DHX9 and DDX5 samples have a faint band in the GST-IP.
RNA-binding proteins may display some background as observed in other studies (e.g. PMID: 32704541). We think that showing the raw data without decreasing the exposure time is useful and that there is a clear enrichment compared to controls. In addition, we tested the interaction in multiple systems.
There is an extremely faint band for HNRNPA2B1 in the GST-NRL IP lane. Given that this is a pull-down with added benzonase treatment to remove all nucleic acids, these data suggest that previously observed NRL interactions with these particular RBPs are mediated via nucleic acids. Similarly, there is a loss of band signal for HNRNPM in this assay, although it was identified as an NRL-interacting protein in three assays, which again suggests that nucleic acids mediate the interaction.
Thank you for highlighting this point. We mention in the manuscript that the interactions of NRL with HNRNPM and HNRNPA1 depend on nucleic acids, as noted by the reviewer, since there is no obvious band after the pull-down. We have now added that the interaction of NRL with HNRNPA2B1 is likely dependent on nucleic acids as well, given its weak signal. See page 5.
(2) The data supporting NRL-DDX5 interaction in rod photoreceptor nuclei is very weak. In Figure 2D, the PLA signal for DDX5-NRL is very weak in the adult mouse retina and is absent in the human retina, as shown in Figure 2H.
We agree with the reviewer. We think that the signal for DDX5 is weak, and we addressed this in the text. We noted on page 7: “Taken together, these findings suggest a strong interaction between NRL and DHX9 throughout the nuclear compartment in the retina and that a transient and/or more regulated interaction of NRL with DDX5 may require additional protein partners.” We have modified this sentence to add that the data also suggest transient interaction or the requirement of additional protein partners for stable interaction. See page 7.
Given that there is no NRL-KO available for the human PLA assay, the control experiments using single-protein antibodies should be included in the assay. Similarly, the single-protein antibody control PLA experiments should be included in the experimental data presented in Figure 2J.
Thank you for the suggestion. We performed PLAs using both DHX9 and IgG in the human retina and observed no specific amplification signal. Some background is observed outside the nucleus and in the extracellular space. We added these results to the text and to the supplementary information. See page 7 and Fig.S2B.
(3) The EMSA experiment using a probe containing NRL binding motif within the DHX9 promoter should include incubation with retina nuclear extracts depleted for NRL as a control.
In EMSA experiments, we used bovine retina to obtain sufficient quantities of protein. As suggested by the reviewer, using NRL-depleted extract would increase the specificity of the observed gel shift and complement our pre-immune serum negative control. However, removing all of the NRL protein with the available antibodies was not feasible. In the future, we will use enough mice to obtain large quantities of protein for this experiment and will collect retinas from Nrl-knockout mice as a negative control.
(4) There is a reduced amount of DHX9 pulled down in NRL-IP in HEK293 cells, but there is no statistically significant difference in the reciprocal IP (DHX9-IP and blotting for NRL) (Figure 4C).
We believe the reviewer is referring to the data in Figure 4C showing that RNase H treatment led to significantly reduced pulldown of DHX9 as compared to control, but the reciprocal IP in Figure 4D showed no statistical significance between control and RNase H treatment. In Figure 4D, we hypothesize that NRL may account for only a small proportion of DHX9’s interactome, so the change in NRL levels could not be detected due to the sensitivity of our assay. DHX9 likely constitutes a large proportion of NRL’s interactome in HEK293 cells, hence the change in DHX9 level was more obvious when pulling down with NRL. We added this information to the results. See page 8.
(5) The only data supporting the claim that NRL interacts with R-loops are presented in Figure 5A.
Additional evidence that NRL interacts with R-loops comes from DRIP-Seq experiments where signals from R-loops overlap with NRL ChIP-Seq signals (Figure 7A). This shows that R-loops and NRL co-occur on multiple genomic regions. In addition, indirect evidence of NRL and R-loops’ interaction is shown in pull down experiments and PLA assays where R-loops influence DHX9 and NRL binding. We clarified this in the discussion. See page 14.
This is a co-IP of R-loops and then blotting for NRL, DHX9, and DDX5. Here, there is no signal for DDX5, quantification of DHX9 signal shows no statistically significant difference between RNase H treated and untreated samples, while NRL shows a signal in RNase H treated sample. These data are not sufficient to make the statement regarding the interaction of NRL with R-loops.
Thank you for this comment. We respectfully disagree, as we observe statistically significant enrichment for both NRL and DHX9 in these experiments (see Fig. 5A). Some NRL continues to bind to DNA that is pulled down nonspecifically, which may be expected since NRL is a transcription factor. See, for example, R-loop binding by the transcription factor Sox2 (PMID: 32704541). However, binding to R-loops is evidenced by an enrichment compared to the RNase H-treated sample. We clarified this in the Results section (see page 9).
(6) Regarding R-loop mapping, the data analysis is quite confusing. The authors perform two different types of analyses: either overall narrow and broad peak analysis or strand-specific analysis. Given that the authors used ssDRIP-seq, which is a method designed to map R-loops strand specifically, it is confusing to perform different types of analyses.
Thank you for highlighting this point. This has enhanced the clarity of the methods and enriched the discussion. We aimed to identify R-loops as accurately as possible. We conducted two types of analyses to capture different aspects of R-loops: one that looks at overall patterns (narrow and broad peaks) and another that focuses on specific strands of DNA.
Using ssDRIP-seq, which is designed to map R-loops on specific strands, allowed us to examine R-loops formed in only one strand and those formed on both strands. To identify strand-specific R-loops, we filtered our RNase-H enriched peaks for those enriched on one strand compared to the opposite strand. We clarified the analysis in the results section, and Figure 6B. See page 10 and methods section page 25.
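The strand filter described above can be sketched as a per-peak comparison of plus-strand and minus-strand signal. The sketch below is only an illustration of the idea: the `Peak` fields, the log2-ratio threshold, and the pseudocount are assumptions introduced here, not the authors' actual parameters, which are given in their methods section.

```python
# Minimal sketch: classify RNase H-sensitive peaks by strand bias.
# Threshold, pseudocount, and field names are hypothetical assumptions.
import math
from dataclasses import dataclass


@dataclass
class Peak:
    name: str
    plus_signal: float   # read signal on the plus strand
    minus_signal: float  # read signal on the minus strand


def classify_peak(peak: Peak, min_log2_ratio: float = 1.0,
                  pseudocount: float = 1.0) -> str:
    """Label a peak as plus-stranded, minus-stranded, or unstranded.

    A peak is called stranded when one strand exceeds the other by at
    least `min_log2_ratio` (log2 scale); otherwise signal is present on
    both strands at comparable levels and the peak is called unstranded.
    """
    log2_ratio = math.log2(
        (peak.plus_signal + pseudocount) / (peak.minus_signal + pseudocount)
    )
    if log2_ratio >= min_log2_ratio:
        return "plus-stranded"
    if log2_ratio <= -min_log2_ratio:
        return "minus-stranded"
    return "unstranded"
```

With these assumed defaults, a peak with plus/minus signals of 30 and 2 is labeled plus-stranded, whereas one with signals of 20 and 18 is labeled unstranded.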
Next, the peak analysis is usually performed based on the RNase H treated R-loop mapping; what does it mean then to have a pool of "Not R-loops", see Figure 6B?
The “Not R-loop” group refers to peaks called using the opposite strand that are not observed when calling peaks using RNase H as control. We modified this figure for clarity (Figure 6B).
In that regard, what does the term "unstranded" R-loops mean? Based on the authors' definition, these are R-loops that do not fall within the group of strand-specific R-loops. The authors should explain the reasons behind these types of analyses and explain, what the biological relevance of these different types of R-loops is.
Thank you for helping us clarify this point. Unstranded R-loops are DNA regions containing DNA:RNA hybrids on both plus and minus strands and possibly representing bidirectional transcription by Pol II. We observed that unstranded R-loops are enriched only in intergenic regions, H3K9me3 regions, and downstream of the transcriptional termination site (TTS). We added to the discussion the possible implications of these enrichments, including regulation of Pol II termination and transcription of long genes. See Page 13.
(7) It would be more useful to show the percent distribution of R-loops over the different genomic regions, instead of showing p-value enrichment, see Figure 6C.
Since most of the genome is non-coding, plotting the distribution as a proportion was not informative because the vast majority of the data falls in intergenic regions. However, we created a new figure showing the observed vs. expected ratio, which seems to be more informative, and moved the current p-value figure to the supplement in the revised version. See Figures 6C and S6D.
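An observed vs. expected ratio of this kind corrects for the fact that large annotation classes, such as intergenic regions, collect many peaks simply because of their size. The sketch below assumes that the expected count for a region class is proportional to the number of base pairs that class covers; this is one common convention and not necessarily the definition used for Figure 6C. The function name and the numbers in the usage note are made up for illustration.

```python
# Minimal sketch: observed/expected peak ratio per genomic region class.
# Assumes expected counts scale with the base pairs covered by each class.

def observed_expected_ratio(peak_counts: dict[str, int],
                            region_bp: dict[str, int]) -> dict[str, float]:
    """Return observed/expected peak ratio for each region class.

    peak_counts: number of peaks overlapping each region class.
    region_bp:   total base pairs annotated as each region class.
    """
    total_peaks = sum(peak_counts.values())
    total_bp = sum(region_bp.values())
    ratios = {}
    for region, observed in peak_counts.items():
        expected = total_peaks * region_bp[region] / total_bp
        ratios[region] = observed / expected
    return ratios
```

With made-up inputs of 30 promoter peaks and 70 intergenic peaks, where promoters cover 10 units and intergenic regions 90 units of sequence, promoters score 3.0 (enriched) and intergenic regions about 0.78 (depleted), even though most peaks are intergenic.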
(8) Based on the model presented, NRL regulates R-loop biology via interaction with RBPs, such as DHX9, a known R-loop resolution helicase. Given that the gene targets of NRL TF are known, it would be useful to then analyze the R-loop mapping data across this gene set.
Thank you for this suggestion. We performed an analysis of R-loops on NRL-regulated genes. Interestingly, NRL target genes have an enrichment of stranded R-loops at the promoter/TSS and unstranded R-loops on the gene body compared to all Ensembl genes (Figure S7B). We added a table containing all NRL-regulated genes we used for this analysis (table S5) and a figure showing this result (Fig. S7B).
Reviewer #2 (Public review):
Summary:
The authors utilize biochemical approaches to determine and validate NRL protein-protein interactions to further understand the mechanisms by which the NRL transcription factor controls rod photoreceptor gene regulatory networks. Observations that NRL displays numerous protein-protein interactions with RNA-binding proteins, many of which are involved in R-loop biology, led the authors to investigate the role of RNA and R-loops in mediating protein-protein interactions and profile the co-localization of R-loops with NRL genomic occupancy.
Strengths:
Overall, the manuscript is very well written, providing succinct explanations of the observed results and potential implications. Additionally, the authors use multiple orthogonal techniques and tissue samples to reproduce and validate that NRL interacts with DHX9 and DDX5. Experiments also utilize specific assays to understand the influence of RNA and R-loops on protein-protein interactions. The authors also use state-of-the-art techniques to profile R-loop localization within the retina and integrate multiple previously established datasets to correlate R-loop presence with transcription factor binding and chromatin marks in an attempt to understand the significance of R-loops in the retina.
Weaknesses:
In general, the authors provide superficial interpretations of the data that fit a narrative but fail to provide alternative explanations or address caveats of the results. Specifically, many bands are present in interaction studies either in control lanes (GST controls) of Westerns or large amounts of background in PLA experiments.
We have added additional information to the text regarding the presence of background signals in pull downs. We wish to note that experimental samples always exceeded background signals. We believe that reporting these raw findings (rather than showing shorter exposures) is valuable for the scientific community. We did not observe any background in the proximity ligation assay (PLA) that exceeded what is typically expected, and the signals were clearly discernible. Cases where signals are weaker, such as with DDX5, have been highlighted. In addition, we added a DHX9-IgG negative control for the human PLA experiment. See page 5 and Fig. S2B.
Additionally, the lack of experiments testing the functional significance of Nrl interactions or R-loops within the developing retina fails to provide novel biological insights into the regulation of gene regulatory networks other than, 'This could be a potentially important new mechanism'.
We agree that functional experiments are necessary to understand the molecular mechanisms behind R-loop regulation in the retina; however, we believe it goes beyond the scope of this initial characterization (as this is the first report on R-loops in the retina). We are currently pursuing these studies.
We performed new analysis on NRL-regulated genes as suggested by reviewer 1. We show that NRL target genes have an enrichment of stranded R-loops at the promoter/TSS and unstranded R-loops on the gene body compared to all Ensembl genes (Figure S7B), providing further evidence of the functional interaction between NRL and R-loops. See table S5 and Fig. S7B, and discussion.
Additionally, the authors test the necessity of RNA for NRL/DHX9 interactions but don't show RNA binding of NRL or DHX9 or the sufficiency of RNA to interfere/mediate protein-protein interactions. Recent work has highlighted the prevalence of RNA binding by transcription factors through Arginine Rich Motifs that are located near the DNA binding domains of transcription factors.
We agree that the role of RNA in these complexes is very exciting, and we are currently pursuing these studies. However, we believe that they fall outside the scope of this initial report on R-loops in the retina.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
There are a couple of minor comments:
(1) Unfinished sentence; page 11, the end of the first paragraph.
Thank you for catching this error. We removed the unfinished text.
(2) Page 6: Figure S2A should be Figure S2.
In general, the manuscript would benefit from a deeper explanation of the biological relevance of R-loop formation and the connection to NRL TF and the expression of genes regulated by NRL. In this regard, a more substantial description of the model would be useful.
We have modified the discussion for clarity and included new ideas on possible roles of R-loops in gene regulation of photoreceptors.
Reviewer #2 (Recommendations for the authors):
(1) The specificity of interactions needs to be addressed:
- Figure 1B - HNRNPUL1 bands present in GST control.
- Figure 1C - Bands present in the Empty Vector control IP for HNRNPU and DHX9.
- Supplemental Figure 1A - most proteins are present in GST control suggesting prevalent binding to GST and lack of specificity for other interactions.
Thank you for your comment. RNA-binding proteins can show more background, as observed in other studies (e.g. PMID: 32704541), but there is always a higher signal in experimental samples compared to controls. While we agree that the immunoprecipitation (IP) conditions could be improved by optimizing washing buffers, exposure, and other parameters, we believe the current data support our conclusions. We have added additional text explaining this. See page 5.
(2) Use of the term 'Strongest' interaction - IPs don't directly address the strength of interaction, but depend on levels of expression AND affinity. The strength of interaction should be tested using techniques like an OCTET or SPR assay. One can also quantify the effect that RNA would have in such an assay.
Thank you for your suggestion. We replaced the term "stronger" with "higher signal" and "robust" in most places. The source of protein lysates is the same for experiments and controls; thus, the amount of protein is consistent in both conditions and does not depend on the level of gene expression.
(3) In supplemental tables, please use the proper gene names, not the UniProt peptide name. For example, there are no genes named ELAV1-ELAV4. These should be ELAVL1-ELAVL4. A short glance identifies >10 gene name errors.
Thank you for the suggestion. We updated current gene names in all tables.
(4) Please provide the rationale for the choice of DNA sequence for the DHX9 nucleotide sequence used for EMSA assays. In the human DHX9 locus, the NRL ChIP-seq peak looks to be contained in Intron1 whereas the NRL ChIP-seq peak in mouse DHX9 looks to be in the proximal upstream promoter. Did the authors choose an evolutionarily conserved sequence in the promoter region that contained the NRL motif or does the probe sequence arise from the sequence that has known NRL binding as assayed by NRL ChIP-seq? A zoomed-in image of the NRL ChIP-seq pile-ups in the DHX9 locus in each species would be beneficial.
Thank you for this suggestion. The probe was chosen by scanning for NRL binding motifs within the ChIP-seq peak at the human DHX9 promoter. We added a zoomed-in image of the ChIP-seq or CUT&RUN reads for NRL in both human and mouse retinas. Figure 3D shows NRL binding in both species in regions containing the homologous motif. The sequence is partially conserved and shown in the figure.
(5) Normalization in RNaseH/RNaseA Co-IP experiments. Why does RNAseH treatment result in increased NRL IP (increased NRL expression?) or does RNaseA treatment cause reduced IP of DHX9? These differences seem to cause a 'denominator' effect, leading the Authors to conclude decreased co-IP of DHX9 with NRL when R-loops are inhibited or increased co-IP of NRL with DHX9 when RNA is degraded. An alternate interpretation would be that inhibiting the R-loop binding of NRL unmasks the epitope for antibody recognition. The authors should test NRL binding to RNA and determine if RNA binding affects the co-IP of NRL with DHX9.
We agree that removing total RNA with RNase A or R-loops with RNase H may alter the accessibility of the epitopes to our antibodies, resulting in differences in the level of total protein pulled down. Nevertheless, we quantified the level of the associating protein relative to the total protein and confirmed, in reciprocal assays, that RNase A treatment led to increased interaction between NRL and DHX9. The quantification was not consistent between the reciprocal IPs upon RNase H treatment. We reason that in Figure 4D, as NRL may account for only a small proportion of DHX9’s interactome, the change in NRL level could not be detected given the sensitivity of our assay. Reciprocally, DHX9 can constitute a larger proportion of NRL’s interactome in HEK293 cells, hence the change in DHX9 level was more obvious. We added this information to the text. See page 8.
(6) Figure 7 - Malat1 - there doesn't seem to be an overlap of NRL with Stranded R-loop peaks in this image. Nrl seems to flank the region of R-loops.
We replaced Malat1 with Mplkip, which shows a direct overlap of Nrl binding and R-loops. See Figure 7C.
(7) Results end with 'A Model'. Seems like some concluding remarks and references to Figure 8 were mistakenly left out.
Thank you for catching this typo. We removed the misplaced text.
(8) Model and Discussion - authors should show raw data for RHO with respect to NRL binding and R-loops. No evidence was provided regarding R-loops (or lack thereof) in the Rhodopsin locus. Additionally, conclusions stating that "R-loops... are specifically depleted from genes, such as Rhodopsin, with high expression levels" go against Figures 7B and 7C. Malat1 is one of the highest expressed genes in the retina and contains R-loops.
Thank you for helping us clarify our hypothesis. We added a genome browser view of Rhodopsin showing the absence of R-loops (Fig. S8). We hypothesize that R-loops could interfere with achieving higher rates of transcription; however, we did not mean to say that all highly expressed genes lack R-loops. We have rephrased the discussion to clarify this point.
(9) Neuronal genes, particularly those involved in synaptic transmission are known to be, on average, longer than most genes (Gabel, 2015; PMID: 25762136). Is it possible that R-loops are detected at genes involved in synaptic function/structure solely because of transcript length, as it takes longer for transcription termination to resolve in genes that are longer? A plot showing R-loop enrichment and transcript length would address this.
We added a plot showing gene length in relation to R-loops and expression levels. We observed that R-loops are more common over long genes regardless of their expression levels. We also observed that the concomitant presence of stranded and unstranded R-loops is restricted to the longest genes in most cases. We added this to Figure 7D.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
We thank the reviewers for their valuable feedback and comments. Based on the feedback, we revised the manuscript and believe that we have addressed most of the points the reviewers raised. Below we include a summary of key revisions and point-by-point responses to the reviewers' comments.
Abstract/Introduction
We further emphasized EP-GAN’s strength in parameter inference for detailed neuron models vs specialized models with reduced parameters.
Results
We further elaborated on the method of training EP-GAN on synthetic neurons and validating on both synthetic and experimental neurons.
We added a new section Statistical Analysis and Loss Extension which includes:
- Statistical evaluation of baseline EP-GAN and other methods on neurons with multi recording membrane potential responses/steady-state currents data: AWB, URX, HSN
- Evaluation of EP-GAN with added resting potential loss + longer simulations to ensure stability of membrane potential (EP-GAN-E)
Methods
We added a detailed explanation of the "inverse gradient process"
We added detailed current/voltage-clamp protocols for both synthetic and experimental validation and prediction scenarios (table 6)
Supplementary
We added error distribution and representative samples for synthetic neuron validations (Fig S1)
We added membrane potential response statistical analysis plots for existing methods for AWB, URX, HSN (Fig S6)
We added steady-state currents statistical analysis plots on EP-GAN + existing methods for AWB, URX, HSN (Fig S7)
We added mean membrane potential errors for AWB, URX, HSN normalized by empirical standard deviations for all methods (Table S4)
Please see our point-by-point responses to specific feedback and comments below.
Reviewer 1:
First, at the methodological level, the authors should explain the inverse gradient operation in more detail, as the reconstructed voltage will not only depend on the evaluation of the right-hand side of the HH-equations, as they write but also on the initial state of the system. Why did the authors not simply simulate the responses?
We thank the reviewer for the feedback regarding the need for further explanation. We have revised the Methods section to provide a more detailed description of the inverse gradient process. The process uses a discrete integration method, similar to Euler’s formula, which takes the system’s initial conditions into account. For the EP-GAN baseline, the initial states were picked soon after the start of the stimulus to reconstruct the voltage during the stimulation period. For EP-GAN with extended loss (EP-GAN-E), introduced in this revision in sub-section Statistical Analysis and Loss Extension, initial states before/after stimulations were also taken into account to incorporate resting voltage states into the target loss.
Since EP-GAN is a neural network and we want the inverse gradient process to be part of the training process (i.e., making EP-GAN a “model informed network”), the process needs to be implemented as a differentiable function of the generated parameters p. This enables the derivatives from reconstructed voltages to be traced back to all network components via the back-propagation algorithm.
Computationally, this requires the implementation of the process as a combination of discrete array operations with “auto-differentiation”, which allows automatic computation of derivatives for each operation. While explicit simulation of the responses using ODE solvers provides more accurate solutions, the algorithms used by these solvers typically do not support such specialized arrays nor are they compatible with neural network training. We thus utilized PyTorch tensors [54], which support both auto-differentiation and vectorization to implement the process.
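The discrete integration described above can be illustrated with a minimal sketch. It is not the implementation used in the manuscript: the single leak-current model, the parameter names (`g_leak`, `e_leak`, `c_m`) and the function `reconstruct_voltage` are hypothetical, units are arbitrary but consistent, and plain Python floats are used instead of PyTorch tensors, so no automatic differentiation takes place. The sketch only shows how a forward Euler update rebuilds a voltage trace from the right-hand side of the model equation together with an initial state.

```python
def dv_dt(v, i_stim, g_leak, e_leak, c_m):
    """Right-hand side of a toy leak-only membrane equation (an assumption,
    far simpler than the HH-model referenced in the text)."""
    return (-g_leak * (v - e_leak) + i_stim) / c_m


def reconstruct_voltage(v0, stimulus, dt, g_leak, e_leak, c_m):
    """Forward Euler reconstruction: V[k+1] = V[k] + dt * dV/dt(V[k]).

    The result depends on the initial state v0 as well as on the parameters,
    which is why the initial condition has to be supplied explicitly.
    """
    trace = [v0]
    for i_stim in stimulus[:-1]:
        v = trace[-1]
        trace.append(v + dt * dv_dt(v, i_stim, g_leak, e_leak, c_m))
    return trace


# Zero-stimulus protocol: the trace relaxes from v0 towards e_leak.
trace = reconstruct_voltage(
    v0=-50.0, stimulus=[0.0] * 2000, dt=0.01,
    g_leak=1.0, e_leak=-60.0, c_m=1.0,
)
```

Because every step is a simple arithmetic update, the same loop could in principle be written with tensor operations so that gradients with respect to the parameters flow through the reconstructed trace.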
The authors did not allow the models time to equilibrate before starting their reconstruction simulations, as testified by the large transients observed before stimulation onset in their plots. To get a sense of whether the models reproduce the equilibria of the measured responses to a reasonable degree, the authors should allow sufficient time for the models to equilibrate before starting their stimulation protocol.
In the added Statistical Analysis and Loss Extension sub-section under the Results section, we added results for EP-GAN-E, where we simulate the voltage responses with a 5-second stabilization period added at the beginning of the simulations. The added period mitigates the voltage fluctuations observed during the initial simulation phase, and we observe that the simulated voltage responses indeed reach a stable equilibrium both prior to stimulation and for the zero-stimulus current-clamp protocol (Figure 5 bottom, Column 3).
In fact, why did the authors not explicitly include the equilibrium voltage as a target loss in their set of loss functions? This would be an important quantity that determines the opening level of all the ion channels and therefore would influence the associated parameter values.
The EP-GAN baseline does include equilibrium voltage as a target loss, since all current-clamp protocols used in the study (both synthetic and experimental) include a membrane potential trace where the stimulus amplitude is zero throughout the entire recording duration (see added Table 6 for current-clamp protocols), thus forcing EP-GAN to optimize the resting membrane potential alongside the other, non-zero stimulus current-clamp scenarios.
To further study EP-GAN’s accuracy in resting potential, we evaluated EP-GAN with supplemental resting potential target loss and evaluated its performance in the sub-section Statistical Analysis and Loss Extension. The added loss, combined with 5 seconds of additional stabilization period, improved accuracy in predicting resting potentials by mitigating voltage fluctuations during the early simulation phase and made significant improvements to predicting AWB membrane potential responses where EP-GAN baseline resulted in overshoot of the resting potential.
The authors should provide a more detailed evaluation of the models. They should explicitly provide the IV curves (this should be easy enough, as they compute them anyway), and clearly describe the time-point at which they compute them, as their current figures suggest there might be strong transient changes in them.
We included predicted IV-curve vs ground truth plots in addition to the voltages in the supplementary materials (Figure S2, S5) in the original submitted version of the manuscript. In this revision, we added additional IV-curve plots with statistical analysis for the neurons with multi-recording data (AWB, URX, HSN) in the supplementary materials (Figure S7).
For the evaluation of predicted membrane potential responses, we added further details in Validation Scenarios (Synthetic) under the Results section so that it clearly describes the current-clamp protocols used for both synthetic and experimental neurons and the time interval over which the RMSE evaluations were performed.
In the sub-section Statistical Analysis and Loss Extension, we introduced new statistical metrics in addition to RMSE, applied to neurons AWB, URX and HSN: the percentage of predicted voltages that fall within the empirical range (i.e., mean ± 2 std) and the voltage error normalized by empirical standard deviations (Table S4).
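As a rough illustration of the two metrics named above, the following sketch computes them for a single predicted trace against several repeated recordings. The function names and the per-time-point definition of the normalized error are assumptions made for this example; the manuscript may define the quantities differently. The sketch also assumes at least two recordings and a non-zero standard deviation at every time point.

```python
from statistics import mean, stdev


def within_range_percentage(predicted, recordings, n_std=2.0):
    """Percentage of time points where the prediction lies inside
    mean +/- n_std * std of the repeated recordings."""
    hits = 0
    for v_pred, samples in zip(predicted, zip(*recordings)):
        mu, sd = mean(samples), stdev(samples)
        if abs(v_pred - mu) <= n_std * sd:
            hits += 1
    return 100.0 * hits / len(predicted)


def normalized_error(predicted, recordings):
    """Mean absolute deviation from the empirical mean, expressed in units
    of the empirical standard deviation at each time point."""
    errors = []
    for v_pred, samples in zip(predicted, zip(*recordings)):
        mu, sd = mean(samples), stdev(samples)
        errors.append(abs(v_pred - mu) / sd)
    return mean(errors)


# Hypothetical toy data: three recordings, two time points each (mV).
recordings = [[-61.0, -41.0], [-60.0, -40.0], [-59.0, -39.0]]
predicted = [-59.0, -37.0]
pct = within_range_percentage(predicted, recordings)
err = normalized_error(predicted, recordings)
```

In this toy case the first time point is 1 std from the empirical mean and the second is 3 std away, so half of the points fall inside the band.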
The authors should assess the stability of the models. Some of the models exhibit responses that look as if they might be unstable if simulated for sufficiently long periods of time. Therefore, the authors should investigate whether all obtained parameter sets lead to stable models.
In the sub-section Statistical Analysis and Loss Extension, we included individual voltage traces generated by both EP-GAN baseline and EP-GAN-E (extended) with longer simulation (+5 seconds) to ensure stability. EP-GAN-E is able to produce equilibrium voltages that are indeed stable and within empirical bounds throughout the simulations for the zero-stimulus current-clamp scenario (column 3) for the 3 tested neurons (AWB, URX, HSN).
Minor:
The authors should provide a description of the model, and its trainable parameters. At the moment, it is unclear which parameters of the ion channels are actually trained by the methodology.
The detailed description of the model and its ion channels can be found in [7]. Supplementary materials also include an Excel table of predicted parameters, which lists all EP-GAN fitted parameters for the 9 neurons (+3 new parameter sets for AWB, URX, HSN using EP-GAN-E) included in the study, the labels for trainability, and their respective lower/upper bounds used during training data generation. In the revised manuscript, we further elaborated on the above information in the second paragraph of the Results section.
Reviewer 2:
Major 1: While the models generated with EP-GAN reproduce the average voltage during current injections reasonably well, the dynamics of the response are not well captured. For example, for the neuron labeled RIM (Figure 2), the most depolarized voltage traces show an initial 'overshoot' of depolarization, i.e. they depolarize strongly within the first few hundred milliseconds but then fall back to a less depolarized membrane potential. In contrast, the empirical recording shows no such overshoot. Similarly, for the neuron labeled AFD, all empirically recorded traces slowly ramp up over time. In contrast, the simulated traces are mostly flat. Furthermore, all empirical traces return to the pre-stimulus membrane potential, but many of the simulated voltage traces remain significantly depolarized, far outside of the ranges of empirically observed membrane potentials. While these deviations may appear small in the Root mean Square Error (RMSE), the only metric used in the study to assess the quality of the models, they likely indicate a large mismatch between the model and the electrophysiological properties of the biological neuron.
EP-GAN’s main contribution is the inference of detailed neuron model parameters in a compute-efficient manner. This is a difficult problem to address even with current state-of-the-art fitting algorithms. While EP-GAN is not perfect in capturing the dynamics of the responses and RMSE does not fully reflect the quality of predicted electrophysiological properties, RMSE is a generic error metric for time series that is easily interpretable and applicable to all methods. Using this metric, our studies show that EP-GAN’s overall prediction quality exceeds that of existing methods when given identical optimization goals in a compute-normalized setup.
In our revised manuscript, we included a new section Statistical Analysis and Loss Extension under the Results section, where we performed additional statistical evaluations (e.g., % of predicted responses within empirical range) of EP-GAN’s predictions for neurons with multi-recording data. The results show that predicted voltage responses from the EP-GAN baseline (introduced in the original manuscript) are, in general, within the empirical range, with ~80% of its responses falling within ± 2 empirical standard deviations, which is higher than for existing methods: DEMO (57.9%), GDE3 (37.9%), NSDE (38%), NSGA2 (60.2%).
Major 2: Other metrics than the RMSE should be incorporated to validate simulated responses against electrophysiological data. A common approach is to extract multiple biologically meaningful features from the voltage traces before, during and after the stimulus, and compare the simulated responses to the experimentally observed distribution of these features. Typically, a model is only accepted if all features fall within the empirically observed ranges (see e.g. https://doi.org/10.1371/journal.pcbi.1002107). However, based on the deviations in resting membrane potential and the return to the resting membrane potential alone, most if not all the models shown in this study would not be accepted.
In our original manuscript, because each neuron had only a single set of recording data, RMSE was chosen as the most generic and interpretable error metric. We conducted additional electrophysiological recordings for 3 neurons in prediction scenarios (AWB, URX, HSN) and performed statistical analysis of the generated models in the sub-section Statistical Analysis and Loss Extension. Specifically, we evaluated the percentage of predicted voltage responses that fall within the empirical range (empirical mean ± 2 std, p ~ 0.05), encompassing the responses before, during and after the stimulus (Figure 5, Table 5), and the mean membrane potential error normalized by empirical standard deviations (Table S4).
The results show that the EP-GAN baseline achieves an average of ~80% of its predicted responses falling within the empirical range, which is higher than the other methods: DEMO (57.9%), GDE3 (37.9%), NSDE (38%), NSGA2 (60.2%). Supplementing EP-GAN with an additional resting potential loss (EP-GAN-E) increased the percentage to ~85%, with noticeable improvements in reproducing dynamical features for AWB (Figure 5). Evaluations of membrane potential errors normalized by empirical standard deviations showed similar results, where the EP-GAN baseline and EP-GAN-E have average errors of 1.0 std and 0.7 std respectively, outperforming DEMO (1.7 std), GDE3 (2.0 std), NSDE (3.0 std) and NSGA2 (1.5 std) (Table S4).
Major 3: Abstract and introduction imply that the 'ElectroPhysiome' refers to models that incorporate both the connectome and individual neuron physiology. However, the work presented in this study does not make use of any connectomics data. To make the claim that ElectroPhysiomeGAN can jointly capture both 'network interaction and cellular dynamics', the generated models would need to be evaluated for network inputs, for example by exposing them to naturalistic stimuli of synaptic inputs. It seems likely that dynamics that are currently poorly captured, like slow ramps, or the ability of the neuron to return to its resting membrane potential, will critically affect network computations.
In the paper, EP-GAN is introduced as a parameter estimation method that can aid the development of ElectroPhysiome, which is a network model - these are two different method types and we do not claim EP-GAN is a model that can capture network dynamics. To avoid possible confusion, we made further clarifications in the abstract/introduction that EP-GAN is a machine learning approach for neuron HH-parameter estimation.
I find it hard to believe that the methods EP-GAN is compared to could not perform any better. For example, multi-objective optimization algorithms are often successful in generating models that match empirical observations very well, but features used as target of the optimization need to be carefully selected for the optimization to succeed. Likely, each method requires extensive trial and error to achieve the best performance for a given problem. It is therefore hard to do a fair comparison. Given these complications, I would like to encourage the authors to rethink the framing of the story as a benchmark of EP-GAN vs. other methods. Also, the number of parameters does not seem that relevant to me, as long as the resulting models faithfully reproduce empirical data. What I find most interesting is that EP-GAN learns general relationships between electrophysiological responses and biophysical parameters, and likely could also be used to inspect the distribution of parameters that are consistent with a given empirical observation.
We thank the reviewer for providing this perspective. While it is indeed difficult to have a completely fair comparison between existing optimization methods vs EP-GAN due to the fundamental differences in their algorithms, we believe that the current comparisons with other methods are justified as they provide baseline performance metrics to test EP-GAN for its intended use cases.
The main strength of EP-GAN, as previously mentioned, is in its ability to efficiently navigate large detailed HH-models with many parameters so that it can aid in the development of nervous system models such as ElectroPhysiome, potentially fitting hundreds of neurons in a time efficient manner.
While EP-GAN’s ability to learn the general relationship between electrophysiological responses and parameter distribution is indeed interesting and warrants a more careful examination, this is not the main focus of the paper, since in this work we focus on introducing EP-GAN as a methodology for parameter inference.
In this context, we believe the comparisons with other methods, conducted in a compute-normalized manner (i.e., each method is given the same # of simulations) and with identical optimization targets, provide an adequate framework for evaluating the aforementioned EP-GAN aim. Indeed, while EP-GAN excels with larger HH-models, it performs slightly worse than DE for smaller models such as the one used by [16], despite being more compute efficient (Table S2).
To emphasize the EP-GAN aim, we revised the main manuscript description to focus on its intended use in parameter inference of detailed neuron parameters vs specialized models with reduced parameters.
I could not find important aspects of the methods. What are the 176 parameters that were targeted as trainable parameters? What are the parameter bounds? What are the remaining parameters that have been excluded? What are the Hodgkin-Huxley models used? Which channels do they represent? What are the stimulus protocols?
The detailed description and development of the HH-model that we use and its ion channel list can be found in [7]. Supplementary materials also include an Excel table of predicted parameters, which lists all EP-GAN fitted parameters for 9 neurons (+3 new parameter sets for AWB, URX, HSN using EP-GAN-E), the labels for trainability, and the parameter bounds used during the generation of training data.
We also added a new Table which details the current/voltage clamp protocols used for 9 neurons including the ones used for evaluating EP-GAN-E, which was supplemented with longer simulation time to ensure voltage stability (please see Table 6).
I could not assess the validation of the EP-GAN by modeling 200 synthetic neurons based on the data presented in the manuscript since the only reported metric is the RMSE (5.84mV and 5.81mV for neurons sampled from training data and testing data respectively) averaged over all 200 synthetic neurons. Please report the distribution of RMSEs, include other biologically more relevant metrics, and show representative examples. The responses should be carefully investigated for the types of mismatches that occur, and their biological relevance should be discussed. For example, is the EP-GAN biased to generate responses with certain characteristics, like the 'overshoot' discussed in Major 1? Is it generally poor at fitting the resting potential?
We thank the reviewer for the feedback regarding the need for additional supporting data for synthetic neuron validations. In the revised supplementary materials Figure S1, we included the distribution of RMSE errors for both groups of synthetic neuron validations (validation/test set) and representative samples for both EP-GAN baseline and EP-GAN-E. Notably, the inaccuracies observed during the experimental neuron predictions (e.g., resting potential, voltage overshoot) do not necessarily generalize to synthetic neurons, indicating that such mismatches could stem from the differences between synthetic neurons used for training and experimental neurons for predictions. While synthetic neurons are generated according to empirically determined parameter bounds, some experimental neuron types are rarer than others and may also involve other channels that have not been recorded or modeled in [7], which can affect the quality of predicted parameters (see 2nd and 4th paragraphs of the Discussion section for more detail). Also, properties such as recording error/noise that are often present in experimental neurons are not fully accounted for in synthetic neurons.
To further study how these mismatches can be mitigated, in the revision we added an extended version of EP-GAN where the target loss was supplemented with an additional resting potential term and a 5-second stabilization period during simulations (EP-GAN-E, described in Statistical Analysis and Loss Extension). With these extensions, EP-GAN-E improved its accuracy on both resting potentials and dynamical features, with the most notable improvements on AWB, where predicted voltage responses closely match the slowly rising voltage response during stimulation. EP-GAN-E is an example of how the loss function can be further extended to account for additional experimental features.
Furthermore, the conclusion of the ablation study ('EP-GAN preserves reasonable accuracy up to a 25% reduction in membrane potential responses') does not seem to be justified given the voltage traces shown in Figure 3. For example, for RIM, the resting membrane potential stays around 0 mV, but all empirical traces are around -40mV. For AFD, all simulated traces have a negative slope during the depolarizing stimuli, but a positive slope in all empirically observed traces. For AIY, the shape of hyperpolarized traces is off.
Since the EP-GAN baseline optimizes voltage responses during the stimulation period, RMSE was also evaluated with respect to this period. From these errors, we evaluated whether the predicted voltage error for each ablation scenario fell within 2 standard deviations of the mean error obtained from synthetic neuron test data (i.e., the baseline performance). We found that for input ablation of voltage responses, the error was within this range up to a 25% reduction, whereas for steady-state current input ablation, the 25%, 50% and 75% reductions all resulted in errors within the range.
We extended the “Ablation Studies” sub-section so that the above reasoning is better communicated to the readers.
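The acceptance criterion described for the ablation study can be written down in a few lines. This is a sketch under stated assumptions: the function name `within_baseline` is hypothetical, the criterion is read literally as a two-sided band around the mean baseline error, and the numbers in the usage example are invented for illustration and are not results from the manuscript.

```python
from statistics import mean, stdev


def within_baseline(error, baseline_errors, n_std=2.0):
    """True if an ablation error lies within n_std standard deviations
    of the mean error observed on the baseline (synthetic test) set."""
    mu, sd = mean(baseline_errors), stdev(baseline_errors)
    return abs(error - mu) <= n_std * sd


# Invented RMSE values (mV) standing in for synthetic test-set errors.
baseline_errors = [5.0, 6.0, 7.0]   # mean 6.0, sample std 1.0
accepted = within_baseline(7.5, baseline_errors)
rejected = within_baseline(8.5, baseline_errors)
```

A one-sided variant (only penalizing errors above the baseline mean) would be an equally plausible reading of the criterion.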
Additionally, I found a number of minor issues:
Minor 1: Table 1 lists the number of HH simulations as '32k (11k · 3)'. Should it be 33k, since 11.000 times 3 is 33.000? Please specify the exact number of samples.
Minor 2: x- and y-ticks are missing in Fig 2, Fig 3, Fig S1, Fig S2, Fig S3 and Fig S4.
Minor 3: All files in the supplementary zip file should be listed and described.
Minor 4: Code for training the GAN, generation of training datasets and for reproducing the figures should be provided.
Minor 5: In the reference (Figure 3A, Table 1 Row 2): should this refer to Table 2?
Minor 6: 'the ablation is done on stimulus space where a 50% reduction corresponds to removing half of the membrane potential responses traces each associated with a stimulus.' - which half is removed?
We thank the reviewer for pointing out these errors in the original manuscript. The revised manuscript includes corrections for these items. We will publish the Python code reproducing the results in a public repository in the near future.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Since multiple Reviewers requested that the results describing effects of TTX treatment on GluA2 receptor levels detected by immunofluorescence and confocal imaging be revised, we have made substantial changes, which are described below. We believe the changes have greatly improved the manuscript and thank the reviewers for their comments.
Lack of significant increase in GluA2 receptor data is due to too few cultures sampled; anything could have happened [in one] particular dissociation. A concern that the TTX effect might vary greatly from culture to culture was why we felt it was important to match the receptor measurements on the same cultures that we recorded mEPSCs. We now present the culture means in Figure 5A (mEPSCs) and 5B (GluA2 receptor cluster size). These plots make it clear that the variability in the GluA2 receptor cluster size effect is not attributable to a failure of that culture to show a homeostatic effect. That is, the variability in GluA2 receptor effect is independent of the variability in mEPSC effect. To increase sample size, we examined 2 additional cultures for synaptic GluA2 receptor levels in control vs. TTX treatment. These cultures showed very modest increases (Figure 5C). When cell means from these experiments were pooled with those from the 3 matched cultures, the TTX effect was still not statistically significant (Figure 5G).
Lack of significant increase in GluA2 receptor data is due to the choice to restrict our analysis to the primary dendrite, close to the cell body. We restricted our analysis to the primary dendrite because Figure 3 in Turrigiano et al, 1998, shows the increased response to exogenously applied glutamate after TTX treatment is greatest close to the cell body and wanes as the glutamate is applied further away (added to Results, new lines 388-389).
Variability in GluA2 receptor data is due to the much smaller number of synapses sampled, compared to mEPSCs. We matched the sampling for mEPSC amplitude data to that of imaging data by taking only 20 samples from each electrophysiological recording. Each mEPSC represents one synapse; in a set of 20 mEPSCs some might come from the same synapse, so that we are sampling from ≤ 20 synapses. The effect of TTX on mEPSC amplitudes remained significant despite the reduced samples per cell (Figure 5A).
Why do we fail to show a significant increase in receptors when this has been shown in many studies?
We have added to our discussion the point that several studies, including Wang et al. 2019, use the number of puncta, rather than the number of cells, as the sample number. We ran an analysis of GluA2 receptor cluster size where we sampled multiple synapses per cell, and used the number of clusters as the sample n. We found that even with as few as 6 synapses randomly selected from each cell, the effect of TTX on GluA2 receptor cluster size became highly significant (p = 0.001 for data from 3 cultures and p = 0.005 for data from 5 cultures) (see new lines 400-406 in Discussion). In sum, our data are not very different from those of some previous studies. We are not arguing that receptors do not increase. Instead, our point is that the increase is more variable than the increase in mEPSC amplitude and thus takes a much bigger sample size to detect. The difference between the mEPSC data and the receptor data is that the mEPSC data consistently show a ~20-25% increase, whereas the receptor data do not always show an increase and sometimes the increase is only ~10%. Finally, we added two matched culture experiments examining synaptic GluA1 receptor cluster characteristics. GluA1 receptor cluster size decreased in one culture, and increased very modestly in the other (Supplemental Figure 1B), whereas mEPSC amplitude robustly increased (Supplemental Figure 1A; Results, new lines 265-268).
We conclude that these data support the idea that there is another contributor to the TTX-induced increase in quantal size.
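The difference between using cells and using clusters as the sample n can be made concrete with a small sketch. Everything here is hypothetical: the function names, the seed, and the cluster-size values are invented for illustration, and no significance test is performed. The sketch only shows how the two sampling schemes produce different sample sizes from the same measurements.

```python
import random
from statistics import mean


def cell_means(cells):
    """One value per cell: the sample n equals the number of cells."""
    return [mean(clusters) for clusters in cells]


def pooled_subsample(cells, k, seed=0):
    """Randomly pick k clusters from each cell (without replacement) and
    pool them: the sample n equals k times the number of cells."""
    rng = random.Random(seed)
    pooled = []
    for clusters in cells:
        pooled.extend(rng.sample(clusters, k))
    return pooled


# Invented cluster sizes for three cells, ten clusters each.
cells = [
    [0.20 + 0.01 * i for i in range(10)],
    [0.25 + 0.01 * i for i in range(10)],
    [0.30 + 0.01 * i for i in range(10)],
]
per_cell = cell_means(cells)            # n = 3
per_cluster = pooled_subsample(cells, 6)  # n = 18
```

With six clusters drawn per cell, the pooled sample is six times larger than the per-cell sample, which is why the same underlying effect can reach significance under one scheme and not the other.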
Other changes in presentation of GluA2 receptor results: Since the effects on intensity and integral are of lesser magnitude than that on cluster size, we have removed these results from the graphs, although they are presented in Table 1. We have removed Figure 6, the presentation of individual culture results, since these results are now conveyed in Figure 5A-C. We have removed graphs depicting GluA2 receptor cluster size in response to TTX in Rab3A-/- cultures, but these data are still presented in Table 1.
We address other detailed comments below.
Public Reviews:
Reviewer #1 (Public review):
(2) The effects of Rab3A on TTX-induced mini frequency modulation remain unclear, because TTX does not induce a change in mini frequency in the Rab3A+/Ebd control (Fig. 2). The respective conclusions should be revised accordingly (l. 427).
The effects on mini frequency were added for completeness, but given the lack of consistently significant changes with TTX treatment or changes in the KO or Rab3A<sup>Ebd/Ebd</sup> cultures, we have removed comment on these results from the Discussion.
(3) The model is still not supported by the data. In particular, data supporting a negative regulation of Rab3A by APs, Rab3A-dependent release of a tropic factor, or a Rab3A-dependent increase in GluA2 abundance are not presented.
We have removed the model from the manuscript.
(4) Data points are not overlapping and appear "quantal" in most box plots. How were the data rounded?
The appearance of quantal variation in cell amplitude means is due to the binning that is part of the creation of the box plot. We have not remade the figures without binning, because the binning provides a visual depiction of the distribution of the data points. We have added the bin sizes to the appropriate figure legends.
Reviewer #2 (Public review):
However, the authors still have not provided further investigation of the mechanisms behind the role of Rab3A in this form of plasticity, and the revision therefore has added little to the significance of the study. Moreover, the experimental design for the investigation of the mismatch between mEPSC amplitude and GluA2 cluster fluorescence remains questionable, making it difficult to draw any credible conclusions from groups of data that not only look similar to the eye but also show no significance statistically.
To our knowledge, no other study has matched measurements of mEPSC amplitude in the same cultures where synaptic receptor levels were assessed. As stated above, we have revised the presentation of GluA2 receptor results, concluding from the lack of significant effects on receptor levels that the mEPSC amplitude increase cannot be fully explained by the receptor data (which is strengthened by addition of two more cultures analyzed for GluA2 immunofluorescence). This is an important addition to the significance of the study.
In summary, this study establishes that neuronal Rab3A plays a role in homeostatic synaptic plasticity, but so do a number of other molecules that have been implicated in homeostatic synaptic plasticity in the past two decades (a list that will only grow with new techniques such as RNA-seq). Without going beyond this finding and demonstrating how exactly Rab3A participates in the induction and/or expression of this form of plasticity, or maybe the potential Rab3A-mediated functional and behavioral defects in vivo, the contribution of the current study to the field is limited. However, given the presynaptic location of Rab3A, this finding could serve as a starting point for researchers interested in pre-postsynaptic cross-talk during homeostatic plasticity in general.
We previously published a review in which we list 19 molecules known at that time to be important for homeostatic synaptic plasticity (see Table 2, Koesters et al., 2024), and they fall into two categories: molecules involved in glutamate receptor expression or trafficking, and signaling molecules. Rab3A is the first synaptic vesicle protein to be implicated in homeostatic plasticity of quantal size. We have added this point to the Discussion, new lines 473-476. By demonstrating that Rab3A is not acting in glia (which release TNF, which regulates receptor expression), and that GluA2 receptor levels do not explain the homeostatic mEPSC increase in our experimental conditions, we have ruled out two major mechanisms.
Reviewer #3 (Public review):
Other questions arise from the NASPM experiments, used to justify looking at GluA2 (and not GluA1) in the immunostaining. First, there is a frequency effect that is unclear in origin. One would expect NASPM to merely block some fraction of the post-synaptic current, and not affect pre-synaptic release or block whole synapses. However the change in frequency seems to argue (as the authors do) that some synapses only have CP-AMPARs, while the rest of the synapses have few or none. Another possibility is that there are pre-synaptic NASPM-sensitive receptors that influence release probability. Further, the amplitude data show a strong trend towards smaller amplitude following NASPM treatment (Fig 3B). The p value for both control and TTX neurons was 0.08 - it is very difficult to argue that there is no effect. The decrease on average is larger in the TTX neurons, and some cells show a strong effect. It is possible there is some heterogeneity between neurons on whether GluA1/A2 heteromers or GluA1 homomers are added during HSP. This would impact the weakly supported conclusions about the GluA2 imaging vs mEPSC amplitude data.
We cannot rule out that the NASPM-induced decrease in mEPSC frequency is due to a loss of presynaptic glutamate receptor enhancement of release probability, and have added this statement to the Results, new lines 202-204. Regarding the p value of 0.08, we are not arguing that NASPM has no effect on mEPSC amplitude, only that it has no effect on the homeostatic increase in amplitude after TTX treatment. An increase in GluA1/A2 heteromers should have been detected in our imaging studies.
Unaddressed issues that would greatly increase the impact of the paper:
(1) Is Rab3A acting pre-synaptically, post-synaptically or both? The authors provide good evidence that Rab3A is acting within neurons and not astrocytes. But where it is acting (pre or post) would aid substantially in understanding its role. They could use sparse knockdown of Rab3A, or simply mix cultures from KO and WT mice (with appropriate tags/labels). The general view in the field has been that HSP is regulated post-synaptically via regulation of AMPAR trafficking, and considerable evidence supports this view. The more support for their suggestion of a pre-synaptic site of control, the better.
We agree that doing co-cultures of Rab3A-/- and Rab3A+/+ neurons is the definitive experiment to determine the locus of action of Rab3A in homeostatic synaptic plasticity. We hope to examine this question in a future manuscript.
(2) Rab3A is also found at inhibitory synapses. It would be very informative to know if HSP at inhibitory synapses is similarly affected. This is particularly relevant as at inhibitory synapses, one expects a removal of GABARs (ie the opposite of whatever is happening at excitatory synapses). If both processes are regulated by Rab3A, this might suggest a role for this protein more upstream in the signaling; an effect only at excitatory synapses would argue for a more specific role just at these synapses.
We agree that it would be very interesting to determine if the homeostatic decrease in mIPSCs after activity blockade depends on Rab3A. We hope to address this question in the future.
Recommendations for the authors:
Reviewer #3 (Recommendations for the authors):
Minor points:
The abstract is a bit repetitive in places. Some editing would be advised.
We did not identify anything repetitive in the abstract except the parallel construction referring to the previous findings at the NMJ and current findings in cortical neurons. However, we have eliminated a section in the introduction which went into detail about the receptor imaging results (previous lines 103-110).
Line 77: 'shift toward early awakening' is unclear; do you mean shorter sleep/wake cycle? Other circadian issues? A more complete description is needed.
We have moved the additional detail about the Earlybird mutation’s effect on circadian period from the Results to the Introduction, new lines 77 to 79.
The results section has many passages that seem more like discussion, offering various interpretation and alternatives for the data. While some commentary is appropriate, to justify the next series of experiments and maintain a logical flow, this manuscript has rather a high amount of this. Some editing and shifting material to the discussion might be warranted.
We have reduced the commentary in the Results section.
Line 245: GluA2 homomers are really unlikely, as they won't pass current (unless unedited) and don't often if ever form. But GluA2/A3 heteromers are likely (and detected by their methods).
GluA2 homomers do conduct current, albeit less than heteromers (Swanson et al., 1997; Oh and Derkach, 2005; Coombs et al., 2019). [The Oh and Derkach paper shows a GluA2 homomer current in Supplementary Figure 3]. We have modified the text to acknowledge that the GluA2 receptor imaging will detect heteromers and homomers (Results, new lines 214 to 215).
Line 258: If the number of synaptic pairs analyzed was usually <20, what was the average and range of pairs? This gets into the sampling issue.
We have added the average number of synaptic sites (20.4 ± 6.5) and range (11-38) to the text, Results, new line 229.
Are the stats of the baseline mEPSC amplitude and frequency shifts (WT vs KO on WT feeder layer) given somewhere (lines 398-402)? If not, please add them.
These stats have been added to the text: mEPSC amplitude (CON, WT on WT, 13.3 ± 0.5 pA; CON, KO on WT, 15.2 ± 1.1 pA, p = 0.23, Kruskal-Wallis test), new lines 325-326, and frequency (CON, WT on WT, 2.54 ± 0.57 sec<sup>-1</sup>; CON, KO on WT, 4.46 ± 1.21 sec<sup>-1</sup>, p = 0.23, Kruskal-Wallis test), new lines 329-330.
25mM K+ is going to be much more than 'mildly' depolarizing (line 697). Should just skip that word.
‘mildly’ has been removed.
The section on MiniAnalysis seems overly argumentative, and there is no need to discuss flaws in the Wu paper. The important thing (a bit buried at the end of this section) is that the manual mini selection was done blind to condition, which is the normal way of dealing with potential bias. It would be better to limit the methods to describing what was done.
The bulk of the justification of manual analysis has been removed from the text.
The discussion of potential conductance changes (lines 534-6) seems somewhat unwarranted.
Modification of GluA1 phosphorylation in the GluA1/A2 heteromer would not be detected by NASPM (and the NASPM data being a bit inconclusive anyway). Further, auxiliary subunits (like TARPs) can alter conductance of any of the AMPARs. So I don't think they have enough data to exclude such a possibility.
The discussion of contributions of conductance has been removed from the text.
Coombs ID, Soto D, McGee TP, Gold MG, Farrant M, Cull-Candy SG (2019) Homomeric GluA2(R) AMPA receptors can conduct when desensitized. Nat Commun 10:4312.
Oh MC, Derkach VA (2005) Dominant role of the GluR2 subunit in regulation of AMPA receptors by CaMKII. Nat Neurosci 8:853-854.
Swanson GT, Kamboj SK, Cull-Candy SG (1997) Single-channel properties of recombinant AMPA receptors depend on RNA editing, splice variation, and subunit composition. J Neurosci 17:5869.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
In this manuscript, the authors investigate the role of BEND2, a novel regulator of meiosis, in both male and female fertility. Huang et al have created a mouse model where the full-length BEND2 transcript is depleted but the truncated BEND2 version remains. This mouse model is fertile, and the authors used it to study the role of BEND2 on both male and female meiosis. Overall, the full-length BEND2 appears dispensable for male meiosis. The more interesting phenotype was observed in females. Females exhibit a lower ovarian reserve suggesting that full-length BEND2 is involved in the establishment of the primordial follicle pool.
Strengths:
The authors generated a mouse model that enabled them to study the role of BEND2 in meiosis. The role of BEND2 in female fertility is novel and enhances our knowledge of genes involved in the establishment of the primordial follicle pool.
Weaknesses:
The manuscript extensively explores the role of BEND2 in male meiosis; however, a more interesting result was obtained from the study of female mice. Only a few experiments were performed using female mice, therefore, more experiments should be performed to complete the story of the role of BEND2 on female fertility. In addition, the title and abstract of the manuscript do not align with the story, as female fertility is only a small portion of the data compared to the male fertility section.
We appreciate the reviewer’s thoughtful summary, recognition of the strengths of our study, and constructive feedback. In the revised manuscript, we have performed additional experiments to enhance our understanding of the role of BEND2 in female gametogenesis. These new experiments provide further insights into the establishment of the ovarian reserve and the role of BEND2 in female fertility.
Additionally, we have rewritten the title, abstract, and introduction to better align with the content of the manuscript and to reflect the balance between the male and female fertility results. We believe these changes address the reviewer’s concerns and improve the overall clarity and focus of the manuscript.
Reviewer #1 (Recommendations For The Authors):
• I recommend that the authors re-organize their abstract and introduction to accurately reflect the manuscript's primary focus on male fertility. Right now, the title of the manuscript is misleading. The manuscript does not investigate reproductive aging; rather, it primarily describes the depletion of primordial follicle number. The mechanism behind this depletion and whether this phenotype accelerates reproductive aging, are not explored. Clarifying these points will help align the title and content of the manuscript more accurately.
We thank the reviewer for this suggestion. We agree that the original title and abstract did not fully capture the focus of the study. In response, we have rewritten the title, abstract, and introduction to better align with the results presented, focusing more clearly on the implications of the effects of the full-length BEND2 depletion for spermatogenesis and oogenesis. These revisions ensure that the title, the abstract, and the manuscript's introduction are now more accurately reflective of the work performed.
• Figure 1: I couldn't find the validation of the polyclonal antibody against BEND2 that the authors generated.
Regarding this query about the validation of the polyclonal antibody against BEND2, we apologize for any confusion. We would like to clarify that this validation is indeed presented in Figure 2 of our manuscript. To ensure this information is easily accessible, we have revised the text to explicitly mention the validation in Figure 2.
• Figure 2A: Could you provide the actual numbers for the weight of the mice testis?
In response to this question regarding Figure 2A and the weights of the mouse testes, we have now included these data in a graph in Fig 2A and Table S1 and added this information in the results section.
• Figure 2C and D: I am confused by the fact that in the WB we can appreciate a high expression of the p75 protein, but the signal is very low in the IF (Figure 2D).
We thank the reviewer for raising this point. We acknowledge the apparent discrepancy between the strong p75 signal observed in the Western blot (Fig. 2C) and the weaker signal seen in the immunofluorescence (Fig. 2D). We think several factors could contribute to this difference, such as differences in sensitivity and detection methods, epitope accessibility, protein localization or differences in sample preparation, antibody affinity, and experimental conditions between Western blot and IF.
• In the same figure, the authors also mention that the p75 protein is functional. On what basis do they rely on reaching this conclusion?
We acknowledge that we cannot definitively confirm the functionality of the p75 protein. Our assumption was based on the observed fertility of the male mice and existing literature indicating that BEND2 is essential for completing meiosis (Ma et al., 2022). However, we understand the importance of clarity in our claims. To avoid any potential confusion, we have revised the sentence to read: "The p75 BEND2 protein—likely corresponding to an exon 11-skipped transcript—is present and might be functional in our mutant testis, based on the observed phenotype (see below)."
• The phenotype in females is very interesting. The authors conclude that BEND2 influences primordial follicle formation, oocyte quality, fertility, and reproductive aging by (1) performing follicle counts, (2) analyzing the litter size, and (3) analyzing meiotic progression. Given that the authors build their story around these experiments, I strongly encourage them to expand the section on female fertility, or reorganize the manuscript, or be more cautious with some of their conclusions. They might consider performing additional experiments such as:
- Oocyte quality: To determine whether BEND2 impacts oocyte quality, mice should be stimulated with hormones and oocyte quality should be analyzed (GV, MI, MII progression, spindle morphology and/or fertilization, and embryo development). Does the decrease in primordial follicles correlate with the number of ovulated oocytes, or is the impact only on oocyte quality?
We appreciate the reviewer's suggestion to assess the impact of BEND2 on oocyte quality. Following the reviewer’s recommendation, we stimulated three control and three mutant mice. We analyzed the number of ovulated oocytes, their fertilization rate, and the percentage of embryos that developed to the blastocyst stage. These new results are included in the revised manuscript (see Results section and new Table 1). Our analyses indicate that for all parameters assessed, control and mutant oocytes behaved similarly. Specifically, there were no significant differences in the number of ovulated oocytes, fertilization rates, or the ability of embryos to progress to the blastocyst stage between the control and mutant groups. These findings suggest that mutant oocyte quality is comparable to control mice of a similar age. We have incorporated these new results into the manuscript.
- Reproductive aging: A fertility trial would provide more information on whether BEND2 depletion triggers an acceleration of reproductive aging. In addition, the oldest mice used by the authors are 9 months old, and at this point, fertility has not declined yet.
We appreciate the reviewer's suggestion regarding the assessment of reproductive aging. However, we respectfully disagree with the assertion that fertility has not declined by 9 months of age. In our colony, we have observed a significant decline in fertility around 10 months of age. Specifically, out of 18 10-month-old female mice placed in breeding cages, we observed only three pregnancies within the first 30 days (N.N. and I.R., data not published). Based on these observations, we determined that fertility begins to decline around this age in our colony, which informed our decision to use 9-month-old mice as the oldest age group for our analysis. Thus, this age is appropriate for evaluating the potential effects of BEND2 depletion on reproductive aging in our specific mouse population.
- The observation that the primordial follicle pool is already diminished in mice that are 1 week old is very interesting. Some experiments that the authors could perform to figure out the mechanism are: (1) Analyzing apoptosis. Are the primordial follicles dying during the pool's establishment, or is this an ongoing apoptotic process throughout the mice's lifespan? (2) If the authors still have ovaries from mice younger than 1 week of age (when the primordial pool is forming), they could perform DDX4 staining and quantify the number of oocytes in follicles and the total number of oocytes. These experiments would provide mechanistic insights into whether BEND2 impacts the formation of the primordial follicle pool or if the pool forms but is then depleted.
We appreciate the reviewer's suggestion to further explore the mechanism behind the reduced primordial follicle pool. In response, we have analyzed the number of DDX4-positive cells (DDX4 labels oocytes) in newborn mutant and wild-type animals. Our results show that mutant ovaries contain significantly fewer oocytes compared to controls (see new Fig. 5). This finding supports the hypothesis that BEND2 is critical for the establishment of a normal ovarian reserve. We are grateful for this suggestion, as these additional data reinforce our conclusion that BEND2 is required to determine a normal ovarian reserve in mice.
• What is the red signal in Supplementary Figure 1C?
This image depicts the BEND2 staining pattern in 16 days post-coitum (dpc) wild-type mouse ovaries. To clarify this and prevent any confusion, we have updated the figure legend to explicitly state that the sample shown is from a wild-type mouse.
• Please spell out the full term of all the acronyms.
We apologize for the oversight in not fully spelling out some acronyms in the original manuscript. We have carefully reviewed the entire manuscript and have ensured that all acronyms are now spelled out in full upon their first use in the revised version. We want to thank the reviewer for bringing this to our attention.
• Is Line-1 also dysregulated in the ovary? This was one of the main findings from the male part. It would be interesting to perform the same analysis in the ovary since Line-1 has a role in establishing the ovarian reserve (PMID: 31949138).
We thank the reviewer for this insightful suggestion. We have analyzed the number of LINE-1- and SYCP3-positive cells in wild-type and mutant newborn ovaries (new Fig. S4). Our results show no significant difference between the two genotypes, suggesting that LINE-1 is not dysregulated in newborn Bend2 mutant oocytes. These findings indicate that, at least in the context of the newborn ovary, LINE-1 does not appear to be affected by BEND2 depletion.
Reviewer #2 (Public Review):
In their manuscript entitled "BEND2 is a crucial player in oogenesis and reproductive aging", the authors present their findings that full-length BEND2 is important for repair of meiotic double-strand breaks in spermatocytes, regulation of LINE-1 elements in spermatocytes, and proper oocyte meiosis and folliculogenesis in females. The manuscript utilizes an elegant system to specifically ablate the full-length form of BEND2 which has been historically difficult to study due to its location on the X chromosome and male sterility of global knockout animals.
While the manuscript is an overall excellent addition to the field, it would significantly benefit from a few additional experiments, as well as some additional clarification/elaboration.
The claim that BEND2 is required for ovarian reserve establishment is not supported, as the authors only look at folliculogenesis and oocyte abundance starting at one week of age, after the reserve is formed. Analysis of earlier time points would be much more convincing and would parse the role of BEND2 in the establishment vs. maintenance of this cell population. In spermatocytes, the authors demonstrate a loss of nuclear BEND2 in their mutant but do not comment on the change in localization (which is now cytoplasmic) of the remaining protein in these animals. This may have true biological significance and a discussion of this should be more thoroughly explored.
We thank the reviewer for their thoughtful feedback and constructive suggestions to improve our manuscript.
In response to the comment regarding the establishment of the ovarian reserve, we have now analyzed Bend2 mutant and control newborn ovaries. Our results show a significant reduction in the number of DDX4-positive cells in mutant ovaries compared to controls. These findings demonstrate that BEND2 is required for the establishment of the ovarian reserve, as the reduction is evident at birth.
Regarding the cytoplasmic staining of BEND2 in mutant spermatocytes, we did perform secondary-antibody-only controls using goat anti-rabbit Cy3 to address the specificity of the signal. The staining observed in the Bend2 mutants closely resembles background staining, suggesting that the cytoplasmic signal is nonspecific. Therefore, we do not believe this represents a meaningful change in the localization of BEND2 protein in the mutants. We have clarified this in the revised manuscript to address this point.
We hope these additional experiments and clarifications strengthen the manuscript and address the reviewer’s concerns.
Reviewer #2 (Recommendations For The Authors):
Major points:
(1) The title of the manuscript does not accurately capture the content of the work. The vast majority of the data presented here is from the male, which is not reflected at all in the title - perhaps considering revising it?
Thank you for your valuable suggestion. We agree that the original title did not fully reflect the focus of the manuscript. In response, we have revised the title, along with the abstract and introduction, to more accurately capture the content of the study and the emphasis on the male data. These changes ensure that the manuscript more clearly aligns with the results presented.
(2) In Figure 2D, the authors demonstrate that WT BEND2 expression and localization are lost in the mutant, but staining is still apparent, just in the cytoplasm. Did the authors perform secondary-antibody-only controls to determine if this was background staining or real staining? If real, can they comment on the change in localization of the protein?
We thank the reviewer for this insightful question. We have indeed performed secondary antibody-only controls using goat anti-rabbit Cy3. The staining observed in the Bend2 mutants closely resembles background staining, suggesting that the signal in the cytoplasm is not specific. Therefore, we do not believe this staining represents any real or meaningful expression of the BEND2 protein in the mutants.
(3) In Figure S2A, the authors show Ku70 staining and describe that it is similar between the genotypes, but - to my eye - it looks quite distinctly different. It appears to stain in patches in WT SYCP3+ spermatocytes, versus staining in patches in the more mature, SYCP3- germ cells closer to the lumen in the mutant. Can the authors please clarify, or provide arrows to point which foci they are referring to?
We apologize for the confusion caused by the image provided in the original submission. Upon review, we realized that the mutant image was not fully representative of the staining pattern observed in the majority of mutant samples. We have replaced this image with a new one in the revised manuscript, which more accurately reflects the similarity in Ku70 staining between wild-type and mutant testis. In this updated Figure S2, we have also included arrowheads to indicate the relevant foci, making it clearer to the reader. We have updated the figure legend to correspond with these changes as well.
(4) The authors state that BEND2 is "required to establish the ovarian reserve during oogenesis" but this has not been demonstrated. The authors do show a reduced density of primordial follicles at one week of age. While this is compelling data, the ovarian reserve is established earlier in the mouse, around postnatal days 0-1, so it is not clear from this manuscript whether BEND2 is required for the maintenance of this population after PND1, leading to reduced numbers by 1 week of age, OR if it is required for the establishment of this population, which would result in reduced numbers of oocytes around the time of birth. This is a critical experiment that should be performed in order to determine which of these possibilities is likely the case. Ideally, looking at embryonic through early postnatal time points during ovarian development would be very helpful.
We thank the reviewer for raising this important point. As mentioned earlier in response to Reviewer 1, we have performed the experiment suggested by Reviewer 2 and analyzed the number of DDX4-positive cells in newborn ovaries. Our results show that Bend2 mutant ovaries have fewer oocytes at birth than wild-type controls (Fig. 5H). This finding reinforces our conclusion that BEND2 is indeed required to establish the ovarian reserve, as the reduction in oocyte number is evident at the time of birth. We agree that this additional data strengthens our original claim, so we have included these results in the revised manuscript.
Reviewer #3 (Public Review):
Summary:
Huang et al. investigated the phenotype of Bend2 mutant mice which expressed a truncated isoform. The mutant males showed increased apoptosis due to unrepaired double-strand breaks. However, the mutant males are fertile, and this enabled them to analyze Bend2 function in females. They revealed that Bend2 mutation in females decreased follicle numbers, which leads to loss of ovarian reserve.
Strengths:
Since their Bend2 mutant males were fertile, they were able to analyze the function of Bend2 in females and they revealed that loss of Bend2 causes less follicle formation.
Weaknesses:
Why the phenotype of their mutant male is different from previous work (Ma et al.) is not clear enough although they discuss it.
We appreciate the reviewer’s comment regarding the differences between our Bend2 mutant male phenotype and the previously reported phenotype by Ma et al., 2022. We believe this discrepancy is due to the fact that the Bend2 locus encodes two BEND2 isoforms: p140 and p80. In contrast to the previous study, where both proteins were ablated by the mutation employed (the deletion of exons 12 and 13), our exon 11 deletion specifically ablates p140 expression while allowing the expression of p80 in the testis.
Based on the distinct phenotypes observed in the two Bend2 mutant mouse models, we hypothesize that p80 is sufficient to fulfill BEND2’s roles in meiosis, which could explain why our Bend2 mutant males remain fertile. We have rewritten the relevant sections in the results and discussion to better articulate this hypothesis and clarify the potential mechanisms behind the observed phenotypic differences.
We hope these clarifications and additional details adequately address the reviewer’s concerns.
Reviewer #3 (Recommendations For The Authors):
(1) The authors showed that Bend2 mutant females had decreased fertility. This may be due to decreased ovarian reserve. Did the authors check if the mutant mice decreased or lost fertility faster than WT? If the authors have the data, please refer to it in the manuscript.
We followed the breeding performance of a small number of control and Bend2 mutant females, and preliminary observations suggested no clear differences between the two groups. However, due to the limited sample size, we felt that these data were not conclusive enough to be included in the manuscript. We agree that a more thorough analysis of fertility decline over time would be valuable, and we plan to address this question in a future study.
(2) In Figure 1 A, there is no exon1 in the upper figure.
We thank the reviewer for pointing this out. We have revised Figure 1A to include exon 1 and ensure the schematic is accurate. The updated figure is included in the revised version of the manuscript.
(3) Figure 3A, it would be nice to show several tubules of the testis section as well as an enlarged one.
Following the reviewer's advice, we have revised Figure 3A to include new images showing several tubules and an enlarged view of one section of a tubule. These updates are included in the revised manuscript to better represent the testis sections.
(4) Please be consistent with the format of the graph, especially Supplemental figures 2C and 4D.
We have revised the figures, including Supplemental Figures 2C and 4D, to ensure consistency in the format throughout the manuscript. We have made modifications to the figures to align them more closely and improve the overall presentation.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We are grateful to the reviewers and editors for their time and positive assessment of our manuscript. We will incorporate all their comments to further improve our work. In the revised version of the manuscript, we will provide a more detailed description of the quantification of the wrapping index and further explain the differential roles of Htl and Uif during cell growth versus the role of Notch during axon wrapping. In addition, we will perform further experiments using combinations of reporters and antibodies to further explore the relationship between Htl, Uif and Notch. The discussion will be expanded and possible mechanisms by which Uif 'stabilises' a specific membrane domain will be included.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
This work seeks to provide genetic evidence for a role for beta-adrenergic receptors that regulate heart rate and blood flow on cavernous malformation development using a zebrafish model, and to extend information regarding beta-adrenergic drug blockade in cavernous malformation development, with the idea that these drugs may be useful therapeutically.
Strengths:
The work shows that genetic loss of a specific beta-adrenergic receptor in zebrafish, adrb1, prevents embryonic venous malformations and CCM in adult zebrafish brains. Two drugs, propranolol and metoprolol, also blunt CCM in the adult fish brain. These findings are predicted to potentially impact the treatment of human CCM, and they increase understanding of the factors leading to CCM.
Response 1: We are grateful for the reviewer’s acknowledgment of this study’s potential translational significance.
Weaknesses:
There are minor weaknesses that detract slightly from enthusiasm, including poor annotation of the Figure panels and lack of a baseline control for the study of Klf2 expression (Figure 4).
Response 2: We agree. Annotations of the Figure panels and a baseline control for the study of klf2a expression (Figure 4) were added. Details are described in the response to “recommendations for the authors”.
Reviewer #2 (Public review):
Summary:
Previously, the authors developed a zebrafish model for cerebral cavernous malformations (CCMs) via CRISPR/Cas9-based mosaic inactivation of the ccm2 gene. This model yields CCM-like lesions in the caudal venous plexus of 2 days post-fertilization embryos and classical CNS cavernomas in 8-week fish that depend, like the mouse model, on the upregulation of the KLF2 transcription factor. Remarkably, the morpholino-based knockdown of the gene encoding the Beta1 adrenergic receptor or B1AR (adrb1; a hemodynamic regulator) in fish and treatment with the anti-adrenergic S enantiomer of propranolol in both fish and mice reduce the frequency and size of CCM lesions.
In the present study, the authors aim to test the model that adrb1 is required for CCM lesion development using adrb1 mutant fish (rather than morpholino-mediated knockdown) and pharmacological treatments with the anti-adrenergic S enantiomer of propranolol or a racemic mix of metoprolol (a selective B1AR antagonist).
Strengths:
The goal of the work is important, and the findings are potentially highly relevant to cardiovascular medicine.
Response 3: We are grateful for the reviewer’s acknowledgment of this study’s scientific importance and clinical relevance.
Weaknesses:
(1) The following figures do not report sample sizes, making it difficult to assess the validity of the findings: Figures 1B and D (the number of scored embryos is missing), Figures 2G and 3B (should report both the number of fish and lesions scored, with color-coding to label the lesions corresponding to individual fish in which they were found).
Response 4: We agree. Sample sizes for Figures 1B and D were added to the figures and figure legends. Sample sizes for Figures 2G and 3B were added to their respective figure legends. The lesion volume in Figures 2G and 3B is the total lesion volume in each brain.
(2) Figure 4 has a few caveats. First, the use of adrb1 morphants (rather than mutants) is at odds with the authors' goal of using genetic validation to test the involvement of adrb1 in CCM2-induced lesion development.
Second, the authors should clarify if they have validated that the tnnt (tnnt2a) morpholino phenocopies tnnt2a mutants in the context in which they are using it (this reviewer found that the tnnt2a morpholino blocks the heartbeat just like the mutant, but induces additional phenotypes not observed in the mutants).
Response 5: We appreciate the reviewer’s comments; however, generating adrb1<sup>-/-</sup> and tnnt2a<sup>-/-</sup> klf2a reporter fish, while also ensuring the presence of only one EGFP transgene allele for intensity measurement, would require prohibitively time-consuming breeding efforts.
The use of morpholinos for tnnt2a and adrb1, as well as their effects on the heart, have been well-documented in previous studies (Sehnert AJ et al., Nat Genet. 2002;31:106-10; Steele SL et al., J Exp Biol. 2011;214:1445-57).
Third, the data in Figure 4E is from just two embryos per treatment, a tiny sample size. Furthermore, judging from the number of points in the graph, only a few endothelial PCV cells appear to have been sampled per embryo. Also, judging from the photos and white arrowheads and arrows (Figure 4A-D), only the cells at the ventral side of the vessel were scored (if so, the rationale behind this choice requires clarification).
Response 6: We have increased the sample size, as described in the Figure 4 legend. Regarding the scoring of endothelial nuclei, we focused on the ventral side of the vessel because nuclei on the dorsal side often reside at branching points of the venous plexus. This positional variance could influence klf2a expression levels; thus, we focused on the ventral surface to limit this potential confounding variable.
Fourth, it is unclear whether and how the Tg(kdrl:mcherry)is5 endothelial reporter was used to mask the signals from the klf2a reporter. The reviewer knows by experience that accuracy suffers if a cytosolic or cell membrane signal is used to mask a nuclear green signal.
Response 7: We agree that it is theoretically possible for Förster resonance energy transfer (FRET) to occur, as the emission spectrum of EGFP (495-550 nm in our filter setup) overlaps with the absorption spectrum of mCherry. However, several factors reduce the likelihood of FRET in our experimental setup:
(1) Without a nuclear localization signal, the majority of mCherry is localized in the cytoplasm, although small amounts may passively diffuse into the nucleus.
(2) EGFP, on the other hand, is predominantly localized in the nucleus due to the presence of a nuclear localization signal.
(3) FRET requires two fluorophores to be within a proximity of 8-10 nanometers or less for efficient energy transfer. The nuclear envelope, with a typical thickness of 30-50 nanometers, separates nuclear EGFP from cytoplasmic mCherry, and FRET efficiency declines steeply with distance (the transfer rate is inversely proportional to the sixth power of the distance between donor and acceptor). Thus, the theoretical likelihood of significant energy transfer under these conditions is low.
To empirically examine potential FRET between nuclear EGFP and mCherry in our experimental setup, we scanned and scored the Tg(klf2a:H2b-EGFP; kdrl:mcherry) double transgenic embryos and Tg(klf2a:H2b-EGFP) embryos for EGFP intensity. The result is attached here:
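As a rough illustration of the distance argument in point (3), the standard Förster relation E = 1 / (1 + (r / R0)^6) can be evaluated at a few donor-acceptor separations. This is a minimal sketch and not part of the authors' analysis: the Förster radius `R0_NM` of 5 nm is an assumed, order-of-magnitude value for a fluorescent-protein pair, and the function name `fret_efficiency` is hypothetical.

```python
# Assumed Förster radius in nanometers (illustrative value, not measured in the study).
R0_NM = 5.0


def fret_efficiency(r_nm: float, r0_nm: float = R0_NM) -> float:
    """Return the Förster transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if r_nm < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Separations in nm: within the Förster radius, at it, and at
    # distances comparable to the nuclear envelope thickness.
    for r in (2.5, 5.0, 10.0, 30.0, 50.0):
        print(f"r = {r:5.1f} nm -> E = {fret_efficiency(r):.6f}")
```

Under this assumed R0, the efficiency is 50% at 5 nm, about 1.5% at 10 nm, and below 0.01% at 30 nm, which is consistent with the qualitative argument that transfer across the nuclear envelope should be negligible.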
Author response image 1.
42 endothelial nuclei from 7 embryos were scored as described in the Experimental Procedures of the manuscript. A two-tailed t-test was performed; P = 0.4529.
Finally, the text and legend related to Figure 4 could be more explicit. What do the authors mean by a mosaic pattern of endothelial nuclear EGFP intensity, and how is that observation reflected in graph 4E? When I look at the graph, I understand that klf2a is decreased in C-D compared to A-B. Are some controls missing? Suppose the point is to show mosaicism of Klf2a levels upon ccm2 CRISPR. Don't you need embryos without ccm2 CRISPR to show that Klf2a levels in those backgrounds have average levels that vary within a defined range and that in the presence of ccm2 mosaicism, some cells have values significantly outside that range? Also, in 4A-D, what are the white arrowheads and arrows? The legend does not mention them.
Response 8: We have revised our description of Figure 4 to better convey that mosaic expression of klf2a is evidenced by the wide variability of klf2a reporter intensity in endothelial cells in ccm2 CRISPR embryos. A baseline control for the study of klf2a expression was added to Figure 4. The arrowheads and arrows in Figure 4A-D are explained in the Figure 4 legend.
Given the practical relevance of the findings to cardiovascular medicine, increasing the strength of the evidence would greatly enhance the value of this work.
Recommendations for the authors:
Reviewing Editor:
Concerns about the labeling of figures and sample sizes should both be addressed, as detailed in the reviews, as this will be important to ensure the robustness of the claims.
Reviewer #1 (Recommendations for the authors):
Overall a strong research advance that provides rigorous genetic analysis and further drug testing in the zebrafish CCM model. There are some minor issues that, if addressed, would strengthen the work.
Minor issues:
(1) Figures in general are very poorly annotated and labeled. None of the images in Figures 1-3 show the reporter used to visualize vessels/CM, and the scale bars are not sized in the Figures or legends. Figure 1B is an experiment where the effects of a drug that increases heart rate are evaluated in mutants and controls, but the drug is not mentioned in the figure panel. Figure 1D shows the percentage of embryos with CVP dilation, but the graph and accompanying description do not define whether the percent is relative to the total embryos from the intercross or the percent of that category having the CVP dilation.
Response 9: Changes were made in Figures and Figure legends. The transgenic reporter line Tg(fli1:EGFP) was annotated in Figures 1-3. Scale bars were sized in the Figures and Figure legends. The chemical used for Figure 1B was annotated in the Figure. The percentage of CVP dilation in the graph was explained in the Figure legend.
(2) Figure 4 does not include baseline data in unmanipulated embryos scored at the same time to show the increase in Klf2 expression with mosaic ccm2 deletion. This is important as the result in E is interpreted as a lack of change in the increase.
Response 10: A baseline control for the study of klf2a expression in Figure 4 was added.
Reviewer #2 (Recommendations for the authors):
SUGGESTIONS FOR EXPERIMENTS, DATA, OR ANALYSES
(1) For maximum rigor, in the Figure 4 experiment, use adrb1 mutants and tnnt2a (silent heart) mutants (or verify that the adrb1 and tnnt2a morpholinos faithfully copy the phenotype of interest). See: Guidelines for morpholino use in zebrafish (PMID: 29049395; PMCID: PMC5648102).
Response 11: See Response 5.
(2) Increase sample sizes if appropriate.
Response 12: In the revised version of the manuscript, we have increased the sample size, as described in the Figure 4 legend.
(3) The imaging and fluorescence intensity analysis methods require more detail for reproducibility's sake. Please provide this information. See as a guideline: Guillermo Marqués, Thomas Pengo, Mark A Sanders (2020) Science Forum: Imaging methods are vastly underreported in biomedical research. eLife 9:e55133.
Response 13: We added detailed procedures for the “Airyscan imaging and fluorescence intensity analysis” in the “Experimental Procedures”.
(4) I suggest further clarifying how inhibition of B1AR prevents cavernoma formation. Given that lesion formation is suppressed in adrb1 mutants (which have slow blood flow) and 2,3-BDM treatment (which also slows blood flow) has a similar effect, the beneficial effects of propranolol and metoprolol might be due to the slowing of blood flow via B1AR targeting rather than reflecting that B1AR is a critical component of the genetic circuit for cavernoma formation. Indeed, in prior work by the same first author and collaborators (Elife 2021 May 20:10:e62155), the investigators observed reduced cavernoma formation in embryos devoid of cardiac contractility and thus lacking blood flow (tnnt2a morphants). Such a scenario does not take away the value of a pharmacological treatment. Still, it implies a different mechanism and allows potentially many other drugs with similar effects on blood flow to be effective.
Discussing how B1AR activity is regulated and outlining future experiments would be helpful. Suggestions for the latter include testing the effect of normalizing blood flow in adrb1 mutants with a drug or providing exogenous B1AR in the myocardium or the endothelium to test the model further.
Response 14: We are grateful for the reviewer’s suggestions and added the statement for future experiments.
MINOR CORRECTIONS TO TEXT AND FIGURES
(1) Figure 4E: Label the four genotypes explicitly, rather than A-D for the reader's ease.
(2) Legend of Figure 4: "(F) EGFP intensity...". It should be (E).
CITATIONS TO CORRECT
(1) The citation for the Tg(kdrl:mcherry)is5 transgene needs to be corrected (reference 29 is from the Stainier lab). However, the "is" designation is for the Essner lab (https://zfin.org/action/feature/view/ZDB-ALT-110127-25)
Response 15: Corrections were made as instructed.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewing Editor Comment: Please note that all three reviewers suggested this manuscript would best fit as a resource paper at eLife.
Reviewer #1 (Public review):
Summary:
This impressive study presents a comprehensive scRNAseq atlas of the cranial region during neural induction, patterning, and morphogenesis. The authors collected a robust scRNAseq dataset covering six distinct developmental stages. The analysis focused on the neural tissue, resulting in a highly detailed temporal map of neural plate development. The findings demonstrate how different cell fates are organized in specific spatial patterns along the anterior-posterior and medial-lateral axes within the developing neural tissue. Additionally, the research utilized high-density single-cell RNA sequencing (scRNAseq) to reveal intricate spatial and temporal patterns independent of traditional spatial techniques.
The investigation utilized diffusion component analysis to spatially order cells based on their positioning along the anterior-posterior axis, corresponding to the forebrain, midbrain, hindbrain, and medial-lateral axis. By cross-referencing with MGI expression data, the identification of cell types was validated, affirming the expression patterns of numerous known genes and implicating others as differentially expressed along these axes. These findings significantly advance our understanding of the spatially regulated genes in neural tissues during early developmental stages. The emphasis on transcription factors, cell surface, and secreted proteins provides valuable insights into the intricate gene regulatory networks underpinning neural tissue patterning. Analysis of a second scRNAseq dataset where Shh signaling was activated by culturing embryos in SAG identified known and previously unknown transcripts regulated by Shh, including the Wnt pathway.
The data includes the neural plate and captures all major cell types in the head, including the mesoderm, endoderm, non-neural ectoderm, neural crest, notochord, and blood. With further analyses, this high-quality data promises to significantly advance our understanding of how these tissues develop in conjunction with the neural tissue, paving the way for future breakthroughs in developmental biology and genomics.
Strengths:
The data is well presented in the figures and thoroughly described in the text. The quality of the scRNAseq data and bioinformatic analysis is exceptional.
Weaknesses:
No weaknesses were identified by this reviewer.
Reviewer #2 (Public review):
Summary:
Brooks et al. generate a gene expression atlas of the early embryonic cranial neural plate. They generate single-cell transcriptome data from early cranial neural plate cells at 6 consecutive stages between E7.5 and E9. Utilizing computational analysis, they infer temporal gene expression dynamics and spatial gene expression patterns along the anterior-posterior and mediolateral axes of the neural plate. Subsequent comparison with known gene expression patterns revealed a good agreement with their inferred patterns, thus validating their approach. They then focus on Sonic Hedgehog (Shh) signalling, a key morphogen signal, whose activities partition the neural plate into distinct gene expression domains along the mediolateral axis. Single-cell transcriptome analysis of embryos in which the Shh pathway was pharmacologically activated throughout the neural plate revealed characteristic changes in gene expression along the mediolateral axis and the induction of distinct Shh-regulated gene expression programs in the developing fore-, mid-, and hindbrain.
Strengths:
This manuscript provides a comprehensive transcriptomic characterisation of the developing cranial neural plate, a part of the embryo that to my knowledge has not been extensively analysed by single-cell transcriptomic approaches. The single-cell sequencing data appears to be of high quality and will be a great resource for the wider scientific community. Moreover, the computational analysis is well executed and the validation of the sequencing data using published gene expression patterns is convincing. Taken together, this is a well-executed study that describes a relevant scientific resource for the wider scientific community.
Weaknesses:
Conceptually, the findings that gene expression patterns differ along the rostrocaudal, mediolateral, and temporal axes of the neural plate and that Shh signalling induces distinct target genes along the anterior-posterior axis of the nervous system are more expected than surprising. However, the strength of this manuscript is again the comprehensive characterization of the spatiotemporal gene expression patterns and how they change upon ectopic activation of the Shh pathway.
Reviewer #3 (Public review):
Summary:
The authors performed a detailed single-cell analysis of the early embryonic cranial neural plate with unprecedented temporal resolution between embryonic days 7.5 and 8.75. They employed diffusion analysis to identify genes that correspond to different temporal and spatial locations within the embryo. Finally, they also examined the global response of cranial tissue to a Smoothened agonist.
Strengths:
Overall, this is an impressive resource, well-validated against sets of genes with known temporal and spatial patterns of expression. It will be of great value to investigators examining the early stages of neural plate patterning, neural progenitor diversity, and the roles of signaling molecules and gene regulatory networks controlling the regionalization and diversification of the neural plate.
Weaknesses:
The manuscript should be considered a resource. Experimental manipulation is limited to the analysis of neural plate cells that were cultured in vitro for 12 hours with SAG. Besides the identification of a significant set of previously unreported genes that are differentially expressed in the cranial neural plate, there is little new biological insight emerging from this study. Some additional analyses might help to highlight novel hypotheses arising from this remarkable resource.
We thank all three reviewers for their thoughtful and constructive public reviews and believe they nicely capture the contributions of our study. We agree that this article represents a valuable resource for the community and agree with its designation as a Tools and Resources article.
We also thank the reviewers for their useful suggestions for improving the manuscript. In addition to addressing most of their comments, described below, we note that we have changed midbrain-hindbrain boundary (MHB) to rhombomere 1 (r1) throughout the paper and in Tables S4, S7, S10, and S11, as this designation is more closely aligned with the literature on this region. In addition, we added the anterior-posterior and mediolateral cluster identities from our wild-type analysis for the genes that were differentially expressed in SAG-treated embryos in Table S11. Lastly, we have added a new figure (Figure 5—figure supplement 2), as suggested by Reviewer 2, in which we compare our results with the published expression of genes in neural progenitor domains along the dorsal-ventral axis of the spinal cord.
Reviewer #1 (Recommendations for the authors):
I have a few small suggestions for improving the presentation of the data.
(1) It would be helpful to show illustrations and embryo images of all the stages utilized in the analysis in Figures 1A and B.
(2) It was difficult to distinguish all the different colors in Figures 3B and 4B. Could you label, as in Figure 4, supplements 1D, F?
(3) I was confused by the position of the color code key for Figure 7D-J, thinking it belonged to panels B and C. Could you put it under the figure/heatmap key so that it is clearly linked to panels D-J?
Thank you for these suggestions. We have incorporated the third suggestion to improve readability, but were not able to make the first two changes due to space limitations.
Reviewer #2 (Recommendations for the authors):
I only have a couple of minor additional suggestions/questions for the authors:
(1) The authors state that nearly half of the transcripts they found as differentially regulated in SAG-treated embryos were also characterized as spatially regulated in the wild-type embryos. It would be great if the authors could provide more detail here. How many of the transcripts that are differentially regulated along the mediolateral axis of the wild-type are characterized as differentially regulated in the SAG-treated embryos? How does this further break down into where these genes are expressed along the mediolateral and the anterior-posterior axes? I am aware that the authors answer some of these questions already by providing examples, but a more systematic characterisation would be appreciated here.
We have updated Table S11 to include the anterior-posterior and mediolateral cluster identities of differentially expressed genes in SAG-treated embryos, where applicable. In addition, we have added more discussion of the genes from our SAG analysis that were also found to be spatially patterned in wild-type embryos to the fourth paragraph of the last results section.
(2) Related to the previous question, the authors nicely demonstrate that SAG treatment of embryos causes many transcriptional changes, including the expression/repression of several transcription factors well-known to mediate spatial patterning, raising the question of which of these effects are directly due to gene regulation by the Shh pathway and which effects are secondary consequences of transcriptional changes of other transcription factors. Similarly, the authors' results also suggest that some genes are only induced in specific parts along the neuraxis, raising the question of why. The authors could attempt some type of regulon-inference approach to identify further candidates that may mediate these effects.
This is an excellent suggestion for a future extension of this work, as we agree that validation of the predicted SHH targets, including which targets are direct, indirect, or region-specific, would be required to evaluate the predictions of this scRNA-seq analysis.
(3) The authors report that they observed 'a previously unreported inhibition of Scube2' upon SAG treatment of the embryos. At least in the spinal cord Scube2 is well-known to be expressed at a distance from the source of Shh secretion (e.g. Kawakami et al. Curr. Biol. 2005), thus the direct or indirect repression by Shh signalling is strongly expected. Moreover, a recent preprint (Collins et al. bioRxiv, https://doi.org/10.1101/469239 ) suggests that the interaction between Shh and Scube2 can mediate the scale-invariance of Shh patterning. Of note, the authors of this preprint also state that 'upregulation of Shh represses scube2 expression while Shh downregulation increases scube2 expression thus establishing a negative feedback loop.'
Thank you for this suggestion. We have added these references.
(4) The authors partition genes based on different diffusion components as being differentially expressed along the mediolateral axis. However, starting from ~e8.5, neural progenitors in the neural tube can be partitioned based on the expression of well-characterised combinatorial sets of transcription factors into molecularly defined progenitor domains that subsequently give rise to functionally distinct types of neurons. How much of this patterning process can the authors capture with their diffusion component analysis and does their data also allow them to capture these finer-grained differences in gene expression along the mediolateral and prospective dorsal-ventral axis of the neural tube that are known to exist?
This is a very interesting point. We have added a new figure showing UMAPs of the E8.5-9.0 cranial neural plate for a subset of 29 genes (described in Delile et al., 2019) that define distinct neural progenitor domains along the dorsal-ventral axis of the spinal cord (Figure 5—figure supplement 2). We observed that 18 of 20 genes that were detected in the midbrain/r1 region in our dataset were expressed in broad domains along the mediolateral axis of the cranial neural plate that were roughly consistent with their expression domains along the dorsal-ventral axis of the spinal cord. Of these 18 genes, 14 were patterned along both anterior-posterior and mediolateral axes, 2 were patterned only along the mediolateral axis, and 2 were patterned only along the anterior-posterior axis. These results suggest a general correspondence between mediolateral patterning in the cranial neural plate and dorsal-ventral patterning in the spinal cord. However, less refinement of these domains along the mediolateral axis was observed in the cranial neural plate, possibly because the relatively early, pre-closure stages captured by our dataset may be before the establishment of secondary feedback systems that lead to fine-scale patterning of mutually exclusive neural precursor domains. These results are described in the last paragraph of the results section titled “An integrated framework for analyzing cell identity in multiscale space.”
(5) The authors state that they will not only make the raw sequencing data but also the processed intermediate data files available. This is greatly appreciated as it strongly facilitates the re-use of the data. However, it would be also appreciated if the authors made the computational code publicly available that was used to analyze the data and generate the figure panels in the manuscript.
We have deposited the processed h5ad files in the GEO database, accession number GSE273804. Additionally, we have made interactive python notebooks available with the code used to analyze gene expression and generate the figures in this study, as well as code used to automatically generate customizable links to gene expression images in the Mouse Genome Informatics Gene Expression database, on our lab GitHub page (https://github.com/ZallenLab). We have updated the Data availability section to reflect these changes.
Reviewer #3 (Recommendations for the authors):
(1) Considering that individual progenitor domains in the developing neural tube are typically sharply delineated with few cells exhibiting mixed identities, it is interesting that clustering of single-cell data results in a largely continuous “cloud” of cells. Is this because the early neural plate cells have not yet crystallized their identity, or would clustering based on a smaller set of genes that exhibit high variance across only neural plate cells result in improved granularity, allowing for better characterization and quantification of distinct progenitor subtypes?
Thank you for raising this interesting point. The apparent continuity of gene expression in the cranial neural plate could reflect a gene signature shared by cranial neural plate cells, and cells may not yet be extensively regionalized into unique populations at these early stages. We now discuss these possibilities in the third paragraph of the discussion.
(2) Can the authors clarify how neural plate cells were identified and how they were distinguished from the anterior epiblast?
Cell typing was performed by supervised clustering based on known markers of fate. Cranial neural plate cells were identified by their expression of pan-neural factors (Sox2 and Sox3), early or late neural plate markers (Cdh1 or Cdh2), and the lack of markers associated with non-neural ectodermal cell fates (Grhl2, Krt18, Tfap2a) or other cell types (Ets1, T, Tbx6). Full gene sets used to identify all cell types in our analysis are provided in Supplementary Table 13.
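The marker logic described above can be sketched as a simple per-cell rule. This is only an illustration under stated assumptions and is not the authors' supervised clustering procedure: it substitutes a per-cell threshold check for clustering, the expression threshold is hypothetical, "Sox2 and Sox3" is read as requiring at least one of the two, and the toy cells are invented values with no relation to the dataset.

```python
# Marker sets taken from the response text above.
PAN_NEURAL = ("Sox2", "Sox3")
NEURAL_PLATE = ("Cdh1", "Cdh2")  # early or late neural plate markers
EXCLUDED = ("Grhl2", "Krt18", "Tfap2a", "Ets1", "T", "Tbx6")

# Hypothetical cutoff: a gene counts as expressed above this normalized value.
THRESHOLD = 0.0


def expressed(cell: dict, gene: str, threshold: float = THRESHOLD) -> bool:
    """Return True if the gene's value in this cell exceeds the threshold."""
    return cell.get(gene, 0.0) > threshold


def is_cranial_neural_plate(cell: dict) -> bool:
    """Apply the rule: pan-neural AND neural-plate marker AND no excluded marker."""
    has_pan_neural = any(expressed(cell, g) for g in PAN_NEURAL)
    has_plate_marker = any(expressed(cell, g) for g in NEURAL_PLATE)
    has_excluded = any(expressed(cell, g) for g in EXCLUDED)
    return has_pan_neural and has_plate_marker and not has_excluded


if __name__ == "__main__":
    # Invented toy cells, for illustration only.
    toy_cells = {
        "cell_a": {"Sox2": 2.1, "Cdh2": 1.3},
        "cell_b": {"Sox2": 1.8, "Cdh1": 0.9, "Tfap2a": 1.2},
        "cell_c": {"T": 2.5, "Tbx6": 1.1},
    }
    for name, cell in toy_cells.items():
        print(name, is_cranial_neural_plate(cell))
```

In practice the assignment was made at the level of clusters using the full gene sets in Supplementary Table 13, so a per-cell rule like this one would be more sensitive to dropout in individual cells.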
(3) Did the study identify cells with cranial placode identity? Cranial placodes emerge during the same period, and it would be useful to highlight them in Figure 1.
Thank you for highlighting this point. Examination of the early placode markers Six1 and Eya1 indicates that cranial placode cells are a subset of the cells in PhenoGraph cluster 17 in our full dataset (Figure 1—figure supplement 1). We now mention this along with other cell types of interest in the last paragraph of the discussion.
(4) It could be interesting to provide more information about the novel genes identified as differentially expressed along the AP or mediolateral axes. Do they belong to gene families that were not previously implicated in neural patterning, or do they point to novel biological mechanisms controlling neural patterning?
Diverse gene families are represented by the genes that are patterned along the anterior-posterior and mediolateral axes of the cranial neural plate at these stages, likely due to the large number of genes that are spatially patterned in this tissue. Further investigation of the biological mechanisms suggested by these patterns is an important direction for future work, both in terms of molecularly classifying the genes identified as well as directly investigating their roles in neural patterning using genetic analysis.
(5) It would be helpful to discuss how the data presented here compare to other relevant single-cell analyses, such as PMC10901739. This would help to highlight aspects that are unique to this study.
We have added this reference as well as an earlier study from these authors and we discuss how our study complements this work in the introduction.
(6) The inclusion of single-cell data from control embryos that were cultured for 12 hours is of great interest. The authors should identify the set of genes that are deregulated in cultured cells and, taking advantage of their detailed temporal series, examine whether the maturation of cultured embryos progresses normally or whether there are genes that fail to mature correctly in vitro.
We agree that an analysis of the impact of ex vivo culture on gene expression would be useful. However, the large difference in the number of cells in our wild-type and cultured embryo datasets, as well as the lack of time-course data for the cultured embryos, could make a comparison between our current cultured and non-cultured embryo datasets difficult to interpret.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
The authors studied how hippocampal connectivity gradients change across the lifespan, and how these relate to memory function and neurotransmitter distributions. They observed that older age was associated with less distinct transitions, and they observed an association between gradient de-differentiation and cognitive decline.
This is overall an innovative and interesting study to assess gradient alterations across the lifespan and its associations to cognition.
The paper is well-written, and the methods appear sound and thoughtful. There are several strengths, including the inclusion of two independent cohorts, the use of gradient mapping and alignment techniques, and an overall sound statistical and analysis framework. There are several areas for potential improvements in the paper, and these are listed below:
We thank the Reviewer for their positive assessment and summary of our work. We address each of the Reviewer’s comments below, and outline the revisions we have made to the manuscript based on the Reviewer’s suggestions.
(1) The reported D1 associations appear a bit post-hoc in the current work and I was unclear why the authors specifically focussed on dopamine here, as other transmitter systems are similarly present at the level of the hippocampus and implicated in aging.
Other neurotransmitter systems may indeed be relevant in the context of hippocampal function in aging. In this study, however, we included a specific research question about the DA D1 receptor (D1DR) based on previous research 1) emphasizing the role of DA neuromodulation in maintaining functional network segregation in aging to support cognition (Pedersen et al., 2023), 2) reporting heterogeneous distribution of DA markers across the hippocampus, supporting efficient modulation of distinct behaviors (Dubovyk & Manahan-Vaughan, 2019; Edelmann & Lessmann, 2018; Gasbarri et al., 1994; Kempadoo et al., 2016), and 3) demonstrating the spatial distribution of D1DRs as varying across neocortex along a unimodal-transmodal gradient (Pedersen et al., 2024). To what degree this variation might be reflected in cortico-hippocampal connectivity, however, remained to be investigated. As such, one of the study’s specific aims was to evaluate the spatial distribution of D1DRs as a molecular correlate of the hippocampus’ functional organization. Importantly, we were interested in mapping associations between individual differences in the organization of connectivity and D1DRs. This was uniquely enabled by utilizing the DyNAMiC sample, as it includes structural and functional MRI data in combination with D1DR PET in the same individuals across the adult lifespan (n=180). However, after observing significant spatial correspondence between functional organization and D1DR expressed by the second hippocampal gradient (G2), we did indeed perform complementary analyses with group-averaged data of additional dopamine markers (D2DR from a subsample of our participants, as well as DAT and FDOPA from open sources) to test the generalizability of the original finding. Taken together, the original analyses based on subject-level data and complementary group-level analyses provided support for the interpretation of G2 as a dopaminergic mode.
We have updated the manuscript to clarify the focus on the D1 receptor and the contribution of including additional DA markers.
Updated paragraph in the Introduction, pages 5-6:
“Dopamine (DA) is one of the most important modulators of hippocampus-dependent function(47,48), and influences the brain’s functional architecture through enhancing specificity of neuronal signaling(49). Consistently, there is a DA-dependent aspect of maintained functional network segregation in aging which supports cognition(50). Animal models suggest heterogeneous patterns of DA innervation(51,52) and postsynaptic DA receptors(53), across both transverse and longitudinal hippocampal axes, likely allowing for separation between DA modulation of distinct hippocampus-dependent behaviors(47). Moreover, the human hippocampus has been linked to distinct DA circuits on the basis of long-axis variation in functional connectivity with midbrain and striatal regions(54,55). Taken together with recent findings revealing a unimodal-transmodal organization of the most abundantly expressed DA receptor subtype, D1 (D1DR), across cortex(56), we tested the hypothesis that the organization of hippocampal-neocortical connectivity partly reflects the underlying distribution of hippocampal DA receptors, predicting predominant spatial correspondence for any hippocampal gradient conveying a unimodal-transmodal pattern across cortex.”
Updated sections in the Results, page 13-14:
“Our next aim was to investigate to which extent the distribution of hippocampal DA D1 receptors (D1DRs), measured by [<sup>11</sup>C]SCH23390 PET in the DyNAMiC(58) sample, may serve as a molecular correlate of the hippocampus’ functional organization.”
“Complementary analyses were then conducted to further evaluate G2 as a dopaminergic hippocampal mode by utilizing additional DA markers at group-level.”
Moreover, the authors may be aware that multiple PET tracers are somewhat challenged in the mesiotemporal region. Is this the case for the D1 receptor as well? The hippocampus is a small and complex structure, and PET more of a low res technique so one would want to highlight and discuss the limitations of the correlations with PET maps here and/or evaluate whether the analysis adds necessary findings to the study.
We thank the Reviewer for raising this point. The lower resolution of PET is indeed a relevant aspect to consider when quantifying D1DR availability in the hippocampus, even though previous research indicates high test-retest reliability of [<sup>11</sup>C]SCH23390 PET measurement in this region (Kaller et al., 2017). We have now elaborated on PET limitations in the Discussion of the revised manuscript.
In our study, we made efforts to reduce potential partial volume effects (PVE) by correcting our PET data, and tested spatial associations between our functional gradients and D1DR maps using trend-surface modelling (TSM), rather than through voxel-wise comparisons. This allowed us to evaluate the spatial correspondence between functional connectivity and D1DRs at a level of spatial trends, estimated using TSM models computed at increasing levels of complexity. The results showed consistent spatial overlap between G2 and D1DRs across these models, that is, across spatial trends described at coarser-to-finer scales. Furthermore, this was replicated across several DA markers with PET and SPECT data from independent samples.
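The idea of comparing maps at the level of spatial trends rather than voxel by voxel can be illustrated with a minimal sketch. It assumes a trend-surface basis made of per-axis powers of the x/y/z coordinates without cross-terms, which is an assumption about the exact basis and not a description of the authors' implementation. The coordinates, the simulated map, and the function names `tsm_design` and `fit_tsm` are hypothetical.

```python
import numpy as np

def tsm_design(coords, order):
    """Assumed TSM basis: x, y, z, x^2, y^2, z^2, ... up to `order` (no cross-terms)."""
    coords = np.asarray(coords, dtype=float)
    # Standardize each axis so that higher powers stay numerically well-behaved.
    z = (coords - coords.mean(axis=0)) / coords.std(axis=0)
    return np.column_stack([z ** p for p in range(1, order + 1)])

def fit_tsm(coords, values, order):
    """Ordinary least-squares fit; returns (intercept, betas, fitted values)."""
    X = tsm_design(coords, order)
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, values, rcond=None)
    return coef[0], coef[1:], X1 @ coef

# Hypothetical voxel coordinates and a simulated map varying along one axis.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 20, size=(500, 3))
gradient = 0.5 * coords[:, 1] + 0.02 * coords[:, 1] ** 2 + rng.normal(0, 0.1, 500)

# Fit models of increasing complexity (coarser-to-finer spatial trends).
for order in (1, 2, 3, 4):
    _, betas, fitted = fit_tsm(coords, gradient, order)
    ss_res = np.sum((gradient - fitted) ** 2)
    ss_tot = np.sum((gradient - gradient.mean()) ** 2)
    print(order, betas.size, round(1 - ss_res / ss_tot, 3))
```

Under this assumed basis, the fitted `betas` of each map are the subject-level parameters that would then be related across modalities, and the number of parameters grows by three with each model order.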
Taken together, we agree with the Reviewer that the spatial correspondence observed between G2 and hippocampal D1DRs should be interpreted in the context of resolution-related limitations inherent to PET imaging. However, we strongly believe that our DA analyses offer valuable insight into the molecular underpinnings of hippocampal functional organization.
Updated paragraph in the Discussion, pages 25-26:
“We discovered that G2, specifically, manifested organizational principles shared among function, behavior, and neuromodulation. Meta-analytical decoding reproduced a unimodal-associative axis across G2 (Figure 3B), and analyses in relation to the distribution of D1DRs – which vary across cortex along a unimodal-transmodal axis(76,77) – demonstrated topographic correspondence both at the level of individual differences and across the group. It should, however, be acknowledged that PET imaging in the hippocampus is associated with resolution-related limitations, although previous research indicates high test-retest reliability of [<sup>11</sup>C]SCH23390 PET to quantify D1DR availability in this region(78). As such, mapping the distribution of hippocampal D1DRs at a fine spatial scale remains challenging, and replication of our results in terms of overlap with G2 is needed in independent samples. Here, we evaluated the observed spatial overlap between G2 topography and D1DRs across multiple TSM model orders, showing correspondence between modalities from simple to more complex parameterizations of their spatial properties. Topographic correspondence was additionally observed between G2 and other DA markers from independent datasets (Figure 3B), suggesting that G2 may constitute a mode reflecting a dopaminergic phenotype, which contributes to the currently limited understanding of its biological underpinnings.”
From my (perhaps somewhat biased) perspective, it might be valuable to instead or in addition look at measures of hippocampal microstructure and how these relate to the functional aging effects. This could be done, if available, using data from the same subjects (eg based on quantitative MRI contrasts and/or structural MRI) and/or using contextualization findings as implemented in eg hippomaps.readthedocs.io
We thank the Reviewer for this suggestion. We performed additional analyses investigating the spatial overlap between our connectivity gradients and estimates of hippocampal microstructure, computed as the ratio of T1- over T2-weighted (T1w/T2w) images (Glasser & Von Essen, 2011; vos de Wael et al., 2018). Analyses of spatial correspondence then followed the TSM-based method used to test the spatial overlap between functional connectivity gradients and D1DR distribution. Applying TSM to the T1w/T2w image computed for each participant yielded subject-level model parameters describing microstructure topography, which were then entered as predictors of connectivity topography in multivariate GLMs (separate models for each gradient and hemisphere, 6 models in total).
Analyses revealed that microstructure of the right hippocampus significantly predicted gradient topography of right-hemisphere G1 (F = 1.325, p = 0.034), while no other links between connectivity gradients and microstructure emerged as significant (F = 0.930-1.184, ps = 0.706-0.079).
These results, suggesting an association along the anteroposterior axis, deviate from previous findings linking hippocampal microstructure to G3-like, medial-lateral, connectivity organization (vos de Wael et al., 2018). As we believe that comprehensive analyses of our gradients in relation to microstructure across the lifespan would be best addressed in future work, we have not included these analyses of microstructure in the revised manuscript.
(2) Can the authors clarify why they did not replicate based on cohorts that are more widely used in the community and open access, such as CamCAN and/or HCP-Aging? It might connect their results with other studies if an attempt was made to also show that findings persist in either of these repositories.
We agree with the Reviewer that replication in samples such as CamCAN and/or HCP-Aging would provide valuable opportunities to connect our findings with those of other studies using those datasets. Here, we included the Betula dataset (Nilsson et al., 2004) as our replication sample, as it was immediately available to us and included a large sample of adults of comparable age and a word recall episodic memory task closely aligned with the one included in DyNAMiC. Importantly, leveraging the Betula dataset as our replication sample allows us to link our findings to a wide range of previous studies central to the understanding of neurocognitive aging in general, and hippocampal aging in particular (Nyberg, 2017; Nyberg et al., 2020). Betula is a large longitudinal project that has been tracking individuals since 1988, and is part of the National E-infrastructure for Aging Research (NEAR: www.near-aging.se), through which data from several Swedish studies are made available to both national and international researchers. While we acknowledge the value of extending replication efforts to datasets like CamCAN and HCP-Aging, we emphasize the significant contribution of having replicated our connectivity gradients in the Betula dataset.
(3) The authors applied TSM and related these parameters to topographic changes in the gradients. I was wondering whether and how such an approach controls for autocorrelation present in both the PET map and gradients. Could the authors clarify?
The Reviewer raises an important topic in spatial autocorrelation. The TSM approach used to parameterize the topography of the functional gradients and D1DR distribution, and to test the spatial correspondence between modalities, did not include any specific method to control for autocorrelation. Here, we highlight two aspects of our study in relation to this point. First, we demonstrated in the Supplementary information (S. Figure 4) that autocorrelation induced by spatial smoothing likely has limited effects on overall gradient topography and the ability of TSM parameters to capture meaningful inter-individual differences in terms of age. Second, in the case of spatial overlap effects being significantly impacted by autocorrelation, we would expect the association between right-hemisphere G2 and D1DR topography to similarly emerge for G2 in the left hemisphere. The absence of such an association may speak to a limited effect of spatial autocorrelation.
(4) The TSM approach quantifies the gradients in terms of x/y/z direction in a cartesian coordinate system. Wouldn't a shape intrinsic coordinate system in the hippocampus also be interesting, and perhaps even be more efficient to look at here (see eg DeKraker 2022 eLife or Paquola et al 2020 eLife)?
This is a very relevant question and we appreciate the Reviewer’s suggestion. We recognize that there may be several benefits associated with adopting a shape-intrinsic coordinate system when characterizing effects in the hippocampus, given its curved/folded anatomy. Approaches like the ones adopted in DeKraker et al., 2022 and Paquola et al., 2020, utilize geodesic coordinate frameworks to represent the hippocampus in surface space, enabling mapping of connectivity onto the hippocampal surface while respecting its inherent curvature and topology. We anticipate that quantifying gradients within such a framework would especially benefit identification of connectivity change across the hippocampal surface relative to reference points such as subfield boundaries, while minimizing effects of interindividual differences in hippocampal shape and folding. In our study, hippocampal gradients and their associated cortical patterns were computed in volumetric space, with TSM subsequently used to parameterize the change in connectivity along these gradients. This indeed yields a description of connectivity change within a coordinate system less specific to hippocampal anatomy, but may favor generalizability and integration with previous gradient findings within and beyond the hippocampus (e.g., Przeździk et al., 2019; Tian et al., 2020; Katsumi et al., 2023; Navarro-Schröder et al., 2015), as well as connections with broader neuroimaging frameworks through techniques such as meta-analytical decoding. In our view, the different coordinate frameworks offer complementary insight into hippocampal organization, and while we have opted to not undertake novel analyses to explore our gradients within a geodesic coordinate system for the purposes of this paper, we recognize the importance of such evaluation of our gradients in future analyses. We have made updates to the Discussion in the revised manuscript on this topic (pages 23-24):
“Greater anatomical specificity, with more precise characterization of connectivity in relation to subfield boundaries while minimizing effects of inter-individual differences in hippocampal shape and folding, might be achieved by adopting techniques implementing a geodesic coordinate system to represent effects within the hippocampus(68,69).”
Reviewer #2 (Public Review):
Summary:
This paper derives the first three functional gradients in the left and right hippocampus across two datasets. These gradient maps are then compared to dopamine receptor maps obtained with PET, associated with age, and linked to memory. Results reveal links between dopamine maps and gradient 2, age with gradients 1 and 2, and memory performance.
Strengths:
This paper investigates how hippocampal gradients relate to aging, memory, and dopamine receptors, which are interesting and important questions. A strength of the paper is that some of the findings were replicated in a separate sample.
Weaknesses:
The paper would benefit from added clarification on the number of models/comparisons for each test. Furthermore, it would be helpful to clarify whether or not multiple comparison correction was performed and - if so - what type or - if not - to provide a justification. The manuscript would furthermore benefit from code sharing and clarifying which results did/did not replicate.
We thank the Reviewer for their positive assessment and suggestions regarding further clarifications. We have addressed the Reviewer’s comments in a point-by-point manner under the “Recommendations for the authors” section.
Reviewer #3 (Public Review):
Summary:
In this study, the authors analyzed the complex functional organization of the hippocampus using two separate adult lifespan datasets. They investigated how individual variations in the detailed connectivity patterns within the hippocampus relate to behavioral and molecular traits. The findings confirm three overlapping hippocampal gradients and reveal that each is linked to established functional patterns in the cortex, the arrangement of dopamine receptors within the hippocampus, and differences in memory abilities among individuals. By employing multivariate data analysis techniques, they identified older adults who display a hippocampal gradient pattern resembling that of younger individuals and exhibit better memory performance compared to their age-matched peers. This underscores the behavioral importance of maintaining a specific functional organization within the hippocampus as people age.
Strengths:
The evidence supporting the conclusions is overall compelling, based on a unique dataset, rich set of carefully unpacked results, and an in-depth data analysis. Possible confounds are carefully considered and ruled out.
Weaknesses:
No major weaknesses. The transparency of the statistical analyses could be improved by explicitly (1) stating what tests and corrections (if any) were performed, and (2) justifying the elected statistical approaches. Further, some of the findings related to the DA markers are borderline statistically significant and therefore perhaps less compelling but they line up nicely with results obtained using experimental animals and I expect the small effect sizes to be largely related to the quality and specificity of the PET data rather than the derived functional connectivity gradients.
We thank the Reviewer for the thoughtful summary and positive assessment of our work. To increase transparency of the statistical analyses, we have in the revised manuscript added information regarding statistical tests and corrections for multiple comparisons. In the Results, p-values were reported at an uncorrected statistical threshold, and we have in the revised manuscript included the corresponding p-values adjusted for multiple comparisons using the Benjamini-Hochberg method to control the false discovery rate (FDR). Finally, in the revised manuscript, we have now elaborated on the potential limitations of our PET analyses and we include the updated paragraph below.
Addition made to the Results section, page 13:
“Individual maps of D1DR binding potential (BP) were also submitted to TSM, yielding a set of spatial model parameters describing the topographic characteristics of hippocampal D1DR distribution for each participant. D1DR parameters were subsequently used as predictors of gradient parameters in one multivariate GLM per gradient (in total 6 GLMs, controlled for age, sex, and mean FD). Results are reported with p-values at an uncorrected statistical threshold and p-values after adjustment for multiple comparisons using the Benjamini-Hochberg method to control the false discovery rate (FDR).”
Addition made to the Results section, page 15:
“Effects of age on gradient topography were assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD). One model was fitted per gradient and hemisphere, each model including all TSM parameters belonging to a gradient (in total, 6 GLMs).”
Addition made to the Results section, page 17:
“Models were assessed separately for left and right hemispheres, across the full sample and within age groups, yielding eight hierarchical models in total. Results are reported with p-values at an uncorrected statistical threshold and p-values after FDR adjustment.”
Updated paragraph in the Discussion, pages 25-26:
“We discovered that G2, specifically, manifested organizational principles shared among function, behavior, and neuromodulation. Meta-analytical decoding reproduced a unimodal-associative axis across G2 (Figure 3B), and analyses in relation to the distribution of D1DRs – which vary across cortex along a unimodal-transmodal axis(76,77) – demonstrated topographic correspondence both at the level of individual differences and across the group. It should, however, be acknowledged that PET imaging in the hippocampus is associated with resolution-related limitations, although previous research indicates high test-retest reliability of [<sup>11</sup>C]SCH23390 PET to quantify D1DR availability in this region(78). As such, mapping the distribution of hippocampal D1DRs at a fine spatial scale remains challenging, and replication of our results in terms of overlap with G2 is needed in independent samples. Here, we evaluated the observed spatial overlap between G2 topography and D1DRs across multiple TSM model orders, showing correspondence between modalities from simple to more complex parameterizations of their spatial properties. Topographic correspondence was additionally observed between G2 and other DA markers from independent datasets (Figure 3B), suggesting that G2 may constitute a mode reflecting a dopaminergic phenotype, which contributes to the currently limited understanding of its biological underpinnings.”
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Please see the comments in the public review.
We thank the Reviewer for their comments and recommendations, and have addressed them in the “Public review” section.
Reviewer #2 (Recommendations For The Authors):
(1) All statistical analyses are based on linear regressions using trend surface modeling (TSM) parameters that parameterize gradients at the subject level. These models resulted in 9 parameters for gradient 1 and 12 parameters each for gradients 2 and 3. The text states that 'Effects of age on gradient topography was assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD)'. Please clarify whether these GLMs were fitted separately for each TSM parameter (i.e., 9+12+12=33 models for both left and right = 66 total models) or on the overall model?
We appreciate the Reviewer’s request for clarification on this matter. These GLMs were fitted on the overall TSM model, that is, through one GLM per gradient (3) and hemisphere (2), each one including all TSM parameters belonging to a gradient (in total, 6 GLMs).
In the revised manuscript, we have added more details to the Results section, page 15: “Effects of age on gradient topography were assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD). One model was fitted per gradient and hemisphere, each model including all TSM parameters belonging to a gradient (in total, 6 GLMs).”
(2) Similarly, for memory it appears that multiple models were performed (left and right, young, middle-aged, old, whole groups). Please clarify whether and how multiple comparison correction was performed in this case.
In the revised manuscript, we have now specified the number of analyses conducted in relation to memory performance. We have also clarified that p-values were reported at an uncorrected statistical threshold, and we have in the revised manuscript included the corresponding p-values adjusted for multiple comparisons using the Benjamini-Hochberg method to control the FDR.
Updated section in the Results, page 17:
“Models were assessed separately for left and right hemispheres, across the full sample and within age groups, yielding eight hierarchical models in total. Results are reported with p-values at an uncorrected statistical threshold and p-values after FDR adjustment.”
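For readers unfamiliar with the adjustment referred to above, the Benjamini-Hochberg procedure can be sketched as follows. This is a generic illustration of the standard step-up adjustment, not the authors' code; the function name `bh_adjust` and the p-values are hypothetical and are not results from the study.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by m / rank.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted

# Hypothetical uncorrected p-values from a family of eight models.
raw = [0.004, 0.021, 0.030, 0.048, 0.110, 0.260, 0.510, 0.740]
for p_raw, p_adj in zip(raw, bh_adjust(raw)):
    print(f"uncorrected p = {p_raw:.3f}, FDR-adjusted p = {p_adj:.3f}")
```

Reporting both columns, as described in the updated Results text, lets readers see which effects survive adjustment within a given family of tests.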
(3) Although I applaud the authors for their replication efforts, the results do not appear to replicate well. For example, memory was linked to gradient 2 in the whole group but to gradient 1 in the young group. Furthermore, dopamine was linked to gradient 2 in the right but not the left hemisphere. Although the overall group-level gradients were very stable between the two datasets, it is not clear whether the age findings replicated and the memory subgroup findings only replicated at trend level for memory and only partially replicated at the TSM parameter level.
We thank the Reviewer for highlighting the inclusion of a replication dataset as a strength of our study, and we appreciate the recommendation to clarify to which extent results replicated. We provide a response to the Reviewer’s points below, and specify the revisions made to the manuscript in relation to this topic.
The main aim of our study was to characterize the topographic organization of functional hippocampal-neocortical connectivity within the hippocampus across the adult lifespan, as previous studies have limited their focus to younger adults. Given the lack of previous studies for comparison, together with our identification of a novel secondary long-axis connectivity gradient (G2) taking precedence over the previously established medial-lateral G3, we included the Betula sample (Nilsson et al., 2004) for the purpose of replication. There was a high level of consistency between our main dataset and our replication dataset, with gradients 1-3 in left and right hemispheres identified in both samples.
Further use of the replication dataset, beyond the identification of the connectivity gradients, was originally not planned. As such, not all subsequent analyses in the main dataset were conducted in the replication dataset. However, we found it critical to evaluate the observation that older individuals who maintained a youth-like gradient topography also exhibited higher levels of memory performance in an independent sample. This was possible given that the replication dataset included a comparable number of participants in similar ages and a word recall episodic memory task corresponding well to the one used in DyNAMiC. Overall, we conclude that these analyses replicated well across samples. Firstly, topography of left-hemisphere G1 informed the classification of older adults into youth-like and aged subgroups in both samples. Furthermore, in both samples, we observed that the older subgroups identified based on G1 topography also exhibited the youth-like vs. aged pattern in G2 topography. This pattern was, however, also evident in G3 only in the main sample, possibly suggesting a limited contribution of G3 topography in determining overall functional profiles in older age. In terms of the behavioral relevance of maintaining youth-like gradient topography in older age, we observed effects on word recall performance in both samples, although, as the Reviewer correctly points out, the difference between subgroups reached only trend-level significance (p = 0.058) in the replication dataset. While this indeed underscores the importance of replication efforts in additional samples, we argue that the pattern observed in our replication dataset is overall consistent with, and conveys effects in the expected direction based on, the original observations in our main dataset.
In revising the manuscript, we have performed additional analyses for replication purposes in terms of memory. Originally, we observed a significant association between G2 topography and episodic memory across the main sample. However, this effect did not remain significant after FDR adjustment for multiple comparisons. To evaluate this association further, we conducted a corresponding hierarchical multiple regression analysis in the replication dataset, which supported a role of G2 in memory (Adj. R<sup>2</sup> = 0.368, ΔR<sup>2</sup> = 0.081, F= 1.992, p = 0.028). Together, these analyses suggest that inter-individual differences in episodic memory performance may in part be explained by the spatial characteristics of G2 across the adult lifespan, although increased statistical power in relation to the large number of TSM parameters included in the hierarchical regression models may be needed to explore this association in smaller, age-stratified, groups. Relatedly, it is worth mentioning that higher levels of memory performance in older age were linked to the maintenance of youth-like G2 topography in both our main and replication datasets.
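The logic of the hierarchical regression described above, in which a block of gradient parameters is tested over and above an earlier block of predictors, can be sketched as follows. This is a generic illustration under stated assumptions: the data are simulated, the block sizes (3 covariates, 12 added parameters) and the names `r_squared` and `f_change` are hypothetical, and the sketch does not reproduce the authors' models or values.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def f_change(X_base, X_added, y):
    """Change in R^2 and its F statistic when X_added enters after X_base."""
    n = len(y)
    X_full = np.column_stack([X_base, X_added])
    r2_base = r_squared(X_base, y)
    r2_full = r_squared(X_full, y)
    k = X_added.shape[1]                     # number of added predictors
    df_resid = n - X_full.shape[1] - 1       # residual df of the full model
    delta = r2_full - r2_base
    f_stat = (delta / k) / ((1 - r2_full) / df_resid)
    return delta, f_stat, (k, df_resid)

# Simulated example: 150 participants, 3 covariates, 12 added parameters.
rng = np.random.default_rng(1)
covariates = rng.normal(size=(150, 3))
tsm_params = rng.normal(size=(150, 12))
memory = covariates @ [0.5, 0.2, -0.1] + 0.3 * tsm_params[:, 0] + rng.normal(size=150)

delta_r2, f_stat, dfs = f_change(covariates, tsm_params, memory)
print(f"delta R^2 = {delta_r2:.3f}, F{dfs} = {f_stat:.3f}")
```

Because the added block contributes many parameters at once, the F test spreads the explained variance over all of them, which is one way to see why power becomes limited in smaller, age-stratified groups.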
In parallel, topographic parameters of G1 predicted memory performance in the younger adults, which successfully replicates TSM-based results previously reported in Przeździk et al., 2019. Although similar associations were not evident within the other age groups, a link between G1 topography and memory was demonstrated in older age based on a) the identification of individuals maintaining a youth-like G1 profile and higher levels of memory, within which b) memory performance was, as in young adults, significantly predicted by G1 topography.
The spatial correspondence between G2 topography and distribution of hippocampal D1DRs was lateralized to the right, and as the Reviewer points out, as such did not replicate across hemispheres. To which extent replication across hemispheres should be expected in this case is, however, difficult to determine. Lateralization and/or hemispheric asymmetry is commonly observed in numerous hippocampal features, from the molecular level to its functional involvement in behavior (Nematis et al., 2023; Persson & Söderlund, 2015), including various dopaminergic markers tested in the animal literature (Afonso et al., 1993; Sadeghi et al., 2017). Yet, potential differences between hemispheres in D1DR availability and the spatial distribution of receptors along hippocampal axes remain less studied in humans. More data is therefore needed to determine the nature of this right-hemisphere lateralization.
In sum, we argue that our results show a good level of replication across independent datasets and across analyses in our main dataset. Whereas this study did not attempt replication of all analyses conducted in the main dataset, it has through replication across independent samples provided support for its main findings – the organization of hippocampal-neocortical connectivity along three main hippocampal gradients across the adult lifespan, and the gradient topography-based identification of older individuals maintaining a youth-like hippocampal organization in older age.
The revised manuscript includes edits made to incorporate the new analyses and clarifications of observations in relation to memory.
In the Results, page 17:
“Observing that the association between G2 and memory did not remain significant after FDR adjustment, we performed the same analysis in our replication dataset, which also included episodic memory testing. Consistent with the observation in our main dataset, G2 significantly predicted memory performance (Adj. R<sup>2</sup> = 0.368, ΔR<sup>2</sup> = 0.081, F = 1.992, p = 0.028) over and above covariates and topography of G1. Here, the analysis also showed that G1 topography predicted performance across the sample (Adj. R<sup>2</sup> = 0.325, ΔR<sup>2</sup> = 0.112, F = 3.431, p < 0.001).”
In the Discussion, page 26:
“Results linked both G1 and G2 to episodic memory, suggesting complementary contributions of these two overlapping long-axis modes. Considered together, analyses in the main and replication datasets indicated a role of G2 topography in memory across the adult lifespan, independent of age. A similar association with G1 was only evident across the entire sample in the replication dataset, whereas results in the main sample seemed to emphasize a role of youth-like G1 topography in memory performance. In line with previous research, memory was successfully predicted by G1 topography in young adults (30), and similarly predicted by G1 in older adults exhibiting a youth-like functional profile.”
(4) Please share the data and code and add a description of data and code availability in the manuscript.
We have now made our code available, and added a statement on data and code availability in the revised manuscript.
On page 37: “Data from the DyNAMiC study are not publicly available. Access to the original data may be shared upon request from the Principal investigator, Dr. Alireza Salami. The Matlab, R, and FSL codes used for analyses included in this study are openly available at https://github.com/kristinnordin/hcgradients. Computation of gradients was done using the freely available toolbox ConGrads: https://github.com/koenhaak/congrads.”
Reviewer #3 (Recommendations For The Authors):
Please see the comments in the public review.
We thank the Reviewer for their comments and recommendations, and have addressed them in the “Public review” section.
References
Afonso, D., Santana, C., & Rodriguez, M. (1993). Neonatal lateralization of behavior and brain dopaminergic asymmetry. Brain Research Bulletin, 32(1), 11–16. https://doi.org/10.1016/0361-9230(93)90312-Y
DeKraker, J., Haast, R. A., Yousif, M. D., Karat, B., Lau, J. C., Köhler, S., & Khan, A. R. (2022). Automated hippocampal unfolding for morphometry and subfield segmentation with HippUnfold. eLife, 11, e77945. https://doi.org/10.7554/eLife.77945
Dubovyk, V., & Manahan-Vaughan, D. (2019). Gradient of expression of dopamine D2 receptors along the dorso-ventral axis of the hippocampus. Frontiers in Synaptic Neuroscience, 11. https://doi.org/10.3389/fnsyn.2019.00028
Edelmann, E., & Lessmann, V. (2018). Dopaminergic innervation and modulation of hippocampal networks. Cell and Tissue Research, 373(3), 711–727. https://doi.org/10.1007/s00441-018-2800-7
Gasbarri, A., Verney, C., Innocenzi, R., Campana, E., & Pacitti, C. (1994). Mesolimbic dopaminergic neurons innervating the hippocampal formation in the rat: A combined retrograde tracing and immunohistochemical study. Brain Research, 668(1), 71–79. https://doi.org/10.1016/0006-8993(94)90512-6
Glasser, M. F., & Essen, D. C. V. (2011). Mapping Human Cortical Areas In Vivo Based on Myelin Content as Revealed by T1- and T2-Weighted MRI. Journal of Neuroscience, 31(32), 11597–11616. https://doi.org/10.1523/JNEUROSCI.2180-11.2011
Kaller, S., Rullmann, M., Patt, M., Becker, G.-A., Luthardt, J., Girbardt, J., Meyer, P. M., Werner, P., Barthel, H., Bresch, A., Fritz, T. H., Hesse, S., & Sabri, O. (2017). Test–retest measurements of dopamine D1-type receptors using simultaneous PET/MRI imaging. European Journal of Nuclear Medicine and Molecular Imaging, 44(6), 1025–1032. https://doi.org/10.1007/s00259-017-3645-0
Katsumi, Y., Zhang, J., Chen, D., Kamona, N., Bunce, J. G., Hutchinson, J. B., Yarossi, M., Tunik, E., Dickerson, B. C., Quigley, K. S., & Barrett, L. F. (2023). Correspondence of functional connectivity gradients across human isocortex, cerebellum, and hippocampus. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04796-0
Kempadoo, K. A., Mosharov, E. V., Choi, S. J., Sulzer, D., & Kandel, E. R. (2016). Dopamine release from the locus coeruleus to the dorsal hippocampus promotes spatial learning and memory. Proceedings of the National Academy of Sciences, 113(51), 14835–14840. https://doi.org/10.1073/pnas.1616515114
Navarro Schröder, T., Haak, K. V., Zaragoza Jimenez, N. I., Beckmann, C. F., & Doeller, C. F. (2015). Functional topography of the human entorhinal cortex. eLife, 4, e06738. https://doi.org/10.7554/eLife.06738
Nemati, S. S., Sadeghi, L., Dehghan, G., & Sheibani, N. (2023). Lateralization of the hippocampus: A review of molecular, functional, and physiological properties in health and disease. Behavioural Brain Research, 454, 114657. https://doi.org/10.1016/j.bbr.2023.114657
Nilsson, L.-G., Adolfsson, R., Bäckman, L., Frias, C. M. de, Molander, B., & Nyberg, L. (2004). Betula: A Prospective Cohort Study on Memory, Health and Aging. Aging, Neuropsychology, and Cognition, 11(2–3), 134–148. https://doi.org/10.1080/13825580490511026
Nyberg, L. (2017). Functional brain imaging of episodic memory decline in ageing. Journal of Internal Medicine, 281(1), 65–74. https://doi.org/10.1111/joim.12533
Nyberg, L., Boraxbekk, C.-J., Sörman, D. E., Hansson, P., Herlitz, A., Kauppi, K., Ljungberg, J. K., Lövheim, H., Lundquist, A., Adolfsson, A. N., Oudin, A., Pudas, S., Rönnlund, M., Stiernstedt, M., Sundström, A., & Adolfsson, R. (2020). Biological and environmental predictors of heterogeneity in neurocognitive ageing: Evidence from Betula and other longitudinal studies. Ageing Research Reviews, 64, 101184. https://doi.org/10.1016/j.arr.2020.101184
Paquola, C., Benkarim, O., DeKraker, J., Larivière, S., Frässle, S., Royer, J., Tavakol, S., Valk, S., Bernasconi, A., Bernasconi, N., Khan, A., Evans, A. C., Razi, A., Smallwood, J., & Bernhardt, B. C. (2020). Convergence of cortical types and functional motifs in the human mesiotemporal lobe. eLife, 9, e60673. https://doi.org/10.7554/eLife.60673
Pedersen, R., Johansson, J., Nordin, K., Rieckmann, A., Wåhlin, A., Nyberg, L., Bäckman, L., & Salami, A. (2024). Dopamine D1-Receptor Organization Contributes to Functional Brain Architecture. Journal of Neuroscience, 44(11). https://doi.org/10.1523/JNEUROSCI.0621-23.2024
Pedersen, R., Johansson, J., & Salami, A. (2023). Dopamine D1-signaling modulates maintenance of functional network segregation in aging. Aging Brain, 3, 100079. https://doi.org/10.1016/j.nbas.2023.100079
Persson, J., & Söderlund, H. (2015). Hippocampal hemispheric and long-axis differentiation of stimulus content during episodic memory encoding and retrieval: An activation likelihood estimation meta-analysis. Hippocampus, 25(12), 1614–1631. https://doi.org/10.1002/hipo.22482
Przeździk, I., Faber, M., Fernández, G., Beckmann, C. F., & Haak, K. V. (2019). The functional organisation of the hippocampus along its long axis is gradual and predicts recollection. Cortex, 119, 324–335. https://doi.org/10.1016/j.cortex.2019.04.015
Sadeghi, L., Rizvanov, A. A., Salafutdinov, I. I., Dabirmanesh, B., Sayyah, M., Fathollahi, Y., & Khajeh, K. (2017). Hippocampal asymmetry: Differences in the left and right hippocampus proteome in the rat model of temporal lobe epilepsy. Journal of Proteomics, 154, 22–29. https://doi.org/10.1016/j.jprot.2016.11.023
Tian, Y., Margulies, D. S., Breakspear, M., & Zalesky, A. (2020). Topographic organization of the human subcortex unveiled with functional connectivity gradients. Nature Neuroscience, 1–12. https://doi.org/10.1038/s41593-020-00711-6
vos de Wael, R., Larivière, S., Caldairou, B., Hong, S.-J., Margulies, D. S., Jefferies, E., Bernasconi, A., Smallwood, J., Bernasconi, N., & Bernhardt, B. C. (2018). Anatomical and microstructural determinants of hippocampal subfield functional connectome embedding. Proceedings of the National Academy of Sciences, 115(40), 10154–10159. https://doi.org/10.1073/pnas.1803667115
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this study, Yue et al. re-processed publicly available DNA methylation data (published in 2012 and 2017 from the Meissner lab) from pre- and post-implantation mouse embryos. Against the global wave of genome-wide reduction of DNA methylation occurring during pre-implantation development, they detected a slight increase (~1% on average) of DNA methylation at gene promoter regions during the transition from 8-cell to blastocyst stage. They claim that many such promoters are located in the X chromosome. Subsequently, they knocked down Dnmt3b (presumably because of its upregulation during the transition from the 8-cell to blastocyst stage) and detected the aberrant patterning of H3K27me3 in the mutant female embryos. Based on this observation, they claim that imprinted X-chromosome inactivation is impaired in the Dnmt3b-Kd pre-implantation embryos. Finally, they propose a model where such an increase of DNA methylation together with H3K27me3 regulates imprinted X-chromosome inactivation in the pre-implantation embryos. While their observation is of potential interest, the current version of the work fails to provide enough evidence to support their conclusions. Below are suggestions and comments on the manuscript.
Major issues:
(1) Sex of the embryos of the genome-wide bisulfite-sequencing data
The authors re-analyzed publicly available genome-wide DNA methylation data from the Meissner lab published in 2012 and 2017. The former used reduced representation bisulfite sequencing (RRBS) and the latter used whole-genome bisulfite sequencing (WGBS). Based mainly on the RRBS data, Yue et al. detected de novo DNA methylated promoters during the transition from 8-cell to blastocyst against the global wave of genome-wide DNA demethylation. They claim that such promoter regions are enriched at the "inactive" X chromosome. However, it would be difficult to discuss DNA methylation at inactive X-chromosomes as the RRBS data were derived from a mixture of male and female embryos. It would also be notable that the increase of DNA methylation at these promoter regions is ~1% on average. Such a slight increase in DNA methylation during pre-implantation development could also be due to the developmental variations between the embryos or between the sexes of embryos.
Thanks so much for your insightful comments. Whether de novo DNA methylation occurs in a sex-dimorphic manner would be of significance for our study. Based on your comments, we have added a reanalysis of publicly available single-cell multi-omics sequencing (COOL-seq) data from mouse early embryos (Guo et al., 2017). The results showed that both male and female embryonic cells gain DNA methylation during the transition from the 8-cell stage to the ICM (Figure 1—figure supplement 1C-D; Lines 112-115 in the revised manuscript).
With regard to the increase in the promoter region, many previous studies have revealed that promoters and overlapping CGI regions, especially high CpG promoters, consistently show low levels of DNA methylation (Auclair et al., 2014; Borgel et al., 2010; Dahlet et al., 2020). The relatively low basal levels make the increase seem slight. Thus, we have added relevant statements to clarify this information and rewritten the sentences in the revised manuscript (Lines 116-118, 125-127 in the revised manuscript).
In addition, using the single-cell COOL-seq data, we also specifically reanalyzed the DNA methylation changes on the X chromosome in female embryos. The X chromosome showed a more notable increase than the autosomes, and the female X chromosome showed a higher DNA methylation level than the male X chromosome (Figure 3—figure supplement 2A-B; Lines 203-206 in the revised manuscript).
Thanks again for your insightful and constructive comments that significantly strengthen our evidence. We have added these results in the revised manuscript.
(2) Imprinted X-chromosome inactivation and evaluation of H3K27me3 (related to Figures 2C, D; 3F; Figure2-supplement 2 F, G; Figure3-supplement 3G)
Based on the slight change in the H3K27me3 signals in the Dnmt3b-Kd blastocysts, the authors claim that imprinted X-chromosome inactivation is impaired in the mutant embryo. It would be not easy to reach this conclusion from such a rough analysis of H3K27me3 presented in Figure 2C, D. Rigorous quantification/evaluation of the H3K27me3 signals in the Dnmt3b-Kd embryos should be considered. Additional evidence for the impairment of H3K27me3 in the mutant embryos should also be provided (expression of a subset of X-linked genes by RNA-FISH or RT-PCR etc.). Though technically challenging, high-resolution genome-wide approach such as ChIP-seq of H3K27me3 in the Dnmt3b-kd female embryos (with traceable SNPs between maternal and paternal X chromosome to distinguish inactive and active X-chromosome) could more precisely evaluate regions that lose H3K27me3 in the X-chromosome (de novo DNA methylated promoters from 8-cell to blastocyst, for example).
Thanks so much for your insightful comments that make our results more convincing. The H3K27me3 domain is a classic marker for the establishment of XCI, which is achieved through X chromosome-wide heterochromatinization and transcriptional repression (Chow and Heard, 2009; Heard et al., 2004; Huynh and Lee, 2005). Thus, in the present study, we performed immunostaining for H3K27me3 domains to evaluate the iXCI status in the blastocysts, as previously reported (Fukuda et al., 2014; Gontan et al., 2018; Inoue et al., 2010; Tan et al., 2016). Based on your comments, we have added another statistical method to quantify the establishment of iXCI, i.e. the percentage of H3K27me3-positive and -negative cells among total trophoblast cells in female blastocysts with or without Dnmt3b knockdown. The result also indicated that Dnmt3b knockdown led to a significant loss of H3K27me3 domains from total trophoblast cells. Similarly, new data based on statistical analyses of total trophoblast cells have also been added to the results of Dnmt3b knockout and 5-aza-dC treatment (Figure 3F; Figure 3—figure supplement 3D, H in the revised manuscript).
To clarify the significance and reliability of detecting H3K27me3 domains, we have added a schematic diagram depicting the process of iXCI initiation and establishment, as well as the experimental design and workflows, to make our results easier to understand (Figure 3C in the revised manuscript).
In addition, we agree with your comment that additional evidence will benefit the conclusion. Thus, we have reanalyzed the RNA-seq and H3K27me3 ChIP-seq data from the extraembryonic ectoderm (ExE) of single E6.5 embryos that underwent Dnmt3a/3b knockout, because the preimplantation iXCI status is maintained in extraembryonic cells (Chen et al., 2019; Galupa and Heard, 2015; Schulz and Heard, 2013). The results showed that the Dnmt knockout-induced chromosome-wide loss of DNA methylation led to a nearly complete loss of H3K27me3 on the paternal X chromosome (specifically inactivated in iXCI), along with a notable transcriptional upregulation across the chromosome. By contrast, these changes were not observed on the maternal X chromosome.
We have added this result in the revised manuscript (Lines 253-261; Figure 3—figure supplement 4A in the revised manuscript).
(3) Analysis of the developmental potential of Dnmt3b-kd embryos
While the authors claim that Dnmt3b-mediated de novo DNA methylation plays an important role in imprinted X-chromosome inactivation, it remains unclear whether the analysis presented in Figure 4 is derived from "female" embryos. This analysis seemed confusing as the authors claim that de novo DNA methylation in the promoter regions during the transition from 8-cell to blastocyst regulates imprinted X-chromosome inactivation, but this should not happen in the male embryos. Was the impairment of embryonic proliferation and differentiation observed in both male and female embryos? Or is this specific to the female embryos? We think that the sex of the embryos would be critical for the analysis presented in Figure 4.
Thanks so much for your constructive comments, which make our results smoother and clearer. Figure 4 mainly presents the developmental role of minor de novo methylation based on the integrated analysis of DNA methylation and gene expression dynamics from the 8-cell stage to the ICM. Because our data indicated that both male and female embryos undergo minor de novo methylation (Figure 1—figure supplement 1C-D in the revised manuscript), this section mainly focused on genome-wide and general changes, not on sex-dimorphic consequences.
To avoid the possible confusion, we have reorganized the RESULTS AND DISCUSSION section and presented this section as Figure 2 in the revised manuscript, before the chromosomal distribution analysis and subsequent detection relevant to iXCI.
Reviewer #2 (Public Review):
Summary:
Here, Yue et al. set out to determine if the low DNMT3B expression that is observed prior to de novo DNA methylation (before the blastocyst stage) has a function. Re-analyzing existing DNA methylation data from Smith et al. (2012) they find a small DNA methylation gain over a subset of promoters and gene bodies, occurring between the 8-cell and blastocyst stages, and refer to this as "minor de novo DNA methylation". They attempt to assess the relevance/functionality of this minor DNA methylation gain, and report reduced H3K27me3 in Dnmt3b knockdown (KD) trophoblast cells that normally undergo imprinted X-chromosome inactivation (iXCI) before the blastocyst stage. In addition, they assess the proliferation, differentiation, metabolic function, implantation rate, and live birth rate of Dnmt3b KD blastocysts.
Strengths:
Working with early embryos is technically demanding, making the well-designed experiments from this manuscript useful to the epigenetics community. Particularly, the DNMT3B expression and 5-mC staining at different embryonic stages.
Thanks for your positive evaluation. We have revised the manuscript based on your comments, and the items that need to be addressed in detail are explained in the point-by-point response to each comment.
Weaknesses:
- Throughout the manuscript, please represent DNA methylation changes as delta DNA methylation instead of fold change.
Thanks so much for your constructive comments. We have represented DNA methylation changes as “ΔDNA methylation” (Figure 2—figure supplement 1A; Figure 3—figure supplement 1A; Figure 3—figure supplement 3I in the revised manuscript).
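The difference between the two representations can be illustrated with a minimal Python sketch. The promoter values below are hypothetical and chosen only for illustration; they are not taken from the manuscript.

```python
def delta_methylation(before: float, after: float) -> float:
    """Absolute change in methylation level (levels given as fractions, 0-1)."""
    return after - before


def fold_change(before: float, after: float) -> float:
    """Relative change in methylation level; undefined when `before` is 0."""
    if before == 0:
        raise ValueError("fold change is undefined for a baseline of 0")
    return after / before


# Hypothetical promoter with a low basal methylation level.
before, after = 0.01, 0.02
print(f"delta = {delta_methylation(before, after):.3f}")  # small absolute gain
print(f"fold  = {fold_change(before, after):.1f}")        # looks like a doubling
```

At a low basal level, a gain of one percentage point appears as a twofold change, which is why the delta gives a more conservative picture.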
- Detailed methods on the re-analysis of the DNA methylation data from Smith et al. 2012 are missing from the materials and methods section. Was a minimum coverage threshold used?
Thanks so much for your reminder. We have added relevant statements and provided the details of the coverage criteria in the Bioinformatics analysis subsection of the Materials and methods section as follows: RRBS data of mouse embryos (2-cell embryos, 4-cell embryos, 8-cell embryos, ICM, and E6.5 embryos) were downloaded from the article published by Smith et al. (Smith et al., 2012) (accession number: GSE34864). The methylation level was calculated as the number of “methylated” reads (reported as C) divided by the total number of “methylated” and “unmethylated” reads (reported as C or T). The genomic region information was downloaded from the mm9 Repeat Masker. As described in the published article, promoters were defined as 1 kb up- and downstream of the TSS and classified into high-density CpG promoters (HCP), intermediate-density CpG promoters (ICP) and low-density CpG promoters (LCP). Only CpG sites with at least fivefold coverage were included in the methylation analysis. We have added the relevant information in the revised manuscript (Lines 462-470 in the revised manuscript).
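The calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names and the example read counts are hypothetical, and only the formula, the fivefold coverage threshold, and the 1 kb promoter window come from the description above.

```python
from typing import Optional, Tuple

MIN_COVERAGE = 5  # "at least fivefold coverage", as stated in the response


def methylation_level(methylated_reads: int, unmethylated_reads: int,
                      min_coverage: int = MIN_COVERAGE) -> Optional[float]:
    """Return C / (C + T) for one CpG site, or None if coverage is too low."""
    total = methylated_reads + unmethylated_reads
    if total < min_coverage:
        return None  # site excluded from the methylation analysis
    return methylated_reads / total


def promoter_window(tss: int, flank: int = 1000) -> Tuple[int, int]:
    """Promoter defined as 1 kb up- and downstream of the TSS (clipped at 0)."""
    return max(0, tss - flank), tss + flank


# Hypothetical read counts for two CpG sites.
print(methylation_level(3, 7))   # 10 reads in total: included
print(methylation_level(2, 2))   # 4 reads in total: excluded (None)
print(promoter_window(5000))     # window around a hypothetical TSS
```

How per-site levels are combined into a promoter-level value is not specified in the response, so the sketch stops at the per-site calculation.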
- Detailed methods on the establishment and validation of Dnmt3b KO blastocysts and 5-aza-dC treated blastocysts are missing (related to Figure 2).
Thanks so much for your detailed reminder. In the present study, we used a well-established Dnmt3b-deficient mouse model (Okano et al., 1999) to validate the role of minor de novo DNA methylation in iXCI establishment. Heterozygous Dnmt3b<sup>+/-</sup> mice, which carry one mutant allele of Dnmt3b, were obtained from the Mutant Mouse Resource & Research Centers (MMRRC, NIH). Homozygous embryos were obtained by intercrossing Dnmt3b<sup>+/-</sup> male and female mice. Genotyping of collected embryos was performed by PCR using primers designed according to the gene targeting strategy, following the MMRRC genotyping protocol (https://www.med.unc.edu/mmrrc/genotyping-protocols/mmrrc-center-protocol-29886/). We have provided the detailed methods in the revised manuscript (Lines 350-354; 391-393 in the revised manuscript). In addition, we added a schematic diagram depicting the processes of embryo collection and detection (Figure 3—figure supplement 3A in the revised manuscript).
Similarly, we have provided relevant details of 5-aza-dC supplementation in the revised manuscript (Lines 412-415 in the revised manuscript) and added a schematic diagram depicting the details of experimental design and processes (Figure 3—figure supplement 3E in the revised manuscript).
- Detailed methods on the re-analysis of the ChIPseq data from Liu et al. 2016 are missing from the materials and methods section.
Thank you for pointing this out. The bigwig files of H3K27me3 ChIP-seq data were downloaded from the article published by Liu et al. (Liu et al., 2016) (accession number: GSE73952). These signal tracks were generated using the MACS2 (v2.0.10.20131216) pileup function and normalized to 1 million reads for visualization, as described in the original publication. We have added the relevant information to the MATERIALS AND METHODS section in the revised manuscript (Lines 474-479 in the revised manuscript).
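Normalizing a signal track to 1 million reads amounts to a simple rescaling, sketched below in plain Python. This is an illustration of the arithmetic only and is not MACS2 code; the function name, the pileup values, and the library size are hypothetical.

```python
from typing import List, Sequence


def normalize_to_million(pileup: Sequence[float], total_reads: int) -> List[float]:
    """Scale raw pileup values so that the track corresponds to 1 million reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    scale = 1_000_000 / total_reads
    return [value * scale for value in pileup]


# Hypothetical pileup values from a library of 2 million reads.
print(normalize_to_million([2, 4, 0], 2_000_000))
```

Rescaling in this way makes tracks from libraries of different sequencing depth comparable when they are displayed side by side.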
- Some of the data represented in bar graphs does not look convincing/significant. Maybe this data can be better represented differently, such as in box plots or violin plots, which would better represent the data.
Thanks so much for your comments, which improve the presentation of our results. The relevant results have been changed to box plots in the revised manuscript (Figure 3E; Figure 3—figure supplement 3C; Figure 3—figure supplement 3G in the revised manuscript). In addition, to strengthen our evidence, we have added an alternative statistical method to quantify the establishment of iXCI, i.e. the percentage of H3K27me3-positive and -negative cells among total trophoblast cells in female blastocysts with or without Dnmt3b knockdown (Figure 3F; Figure 3—figure supplement 3D, H in the revised manuscript).
- The relevance and rationale for experiments using 5-aza-dC treatment is unclear.
Thanks so much for reminding us to make our results more informative and convincing. 5-aza-dC is a well-established global DNA hypomethylating agent that efficiently inhibits the activity of all DNMTs, and thus has been frequently used to study the maintenance of DNA methylation and de novo DNA methylation (Maslov et al., 2012; Oka et al., 2005).
In our study, to validate the function of minor de novo DNA methylation in iXCI, we took advantage of 5-aza-dC-induced DNMT inhibition. Although its inhibitory effect is common to the various DNMTs, it allowed us to transiently treat embryos specifically during the window of minor de novo DNA methylation (from the 8-cell to the blastocyst stage). We have added these statements, as well as a schematic diagram depicting the experimental design, in the revised manuscript to make the rationale of our experiments clearer and easier to understand (Lines 183-188; Figure 3—figure supplement 3E in the revised manuscript).
References
Auclair, G., Guibert, S., Bender, A. and Weber, M. (2014). Ontogeny of CpG island methylation and specificity of DNMT3 methyltransferases during embryonic development in the mouse. Genome Biol. 15, 545.
Borgel, J., Guibert, S., Li, Y., Chiba, H., Schubeler, D., Sasaki, H., Forne, T. and Weber, M. (2010). Targets and dynamics of promoter DNA methylation during early mouse development. Nat. Genet. 42, 1093-1100.
Chen, Z., Yin, Q., Inoue, A., Zhang, C. and Zhang, Y. (2019). Allelic H3K27me3 to allelic DNA methylation switch maintains noncanonical imprinting in extraembryonic cells. Sci Adv 5, eaay7246.
Chow, J. and Heard, E. (2009). X inactivation and the complexities of silencing a sex chromosome. Curr. Opin. Cell Biol. 21, 359-366.
Dahlet, T., Argueso Lleida, A., Al Adhami, H., Dumas, M., Bender, A., Ngondo, R. P., Tanguy, M., Vallet, J., Auclair, G., Bardet, A. F., et al. (2020). Genome-wide analysis in the mouse embryo reveals the importance of DNA methylation for transcription integrity. Nat Commun 11, 3153.
Fukuda, A., Tomikawa, J., Miura, T., Hata, K., Nakabayashi, K., Eggan, K., Akutsu, H. and Umezawa, A. (2014). The role of maternal-specific H3K9me3 modification in establishing imprinted X-chromosome inactivation and embryogenesis in mice. Nat Commun 5, 5464.
Galupa, R. and Heard, E. (2015). X-chromosome inactivation: new insights into cis and trans regulation. Curr. Opin. Genet. Dev. 31, 57-66.
Gontan, C., Mira-Bontenbal, H., Magaraki, A., Dupont, C., Barakat, T. S., Rentmeester, E., Demmers, J. and Gribnau, J. (2018). REX1 is the critical target of RNF12 in imprinted X chromosome inactivation in mice. Nat Commun 9, 4752.
Guo, F., Li, L., Li, J., Wu, X., Hu, B., Zhu, P., Wen, L. and Tang, F. (2017). Single-cell multi-omics sequencing of mouse early embryos and embryonic stem cells. Cell Res. 27, 967-988.
Heard, E., Chaumeil, J., Masui, O. and Okamoto, I. (2004). Mammalian X-chromosome inactivation: an epigenetics paradigm. Cold Spring Harb. Symp. Quant. Biol. 69, 89-102.
Huynh, K. D. and Lee, J. T. (2005). X-chromosome inactivation: a hypothesis linking ontogeny and phylogeny. Nat. Rev. Genet. 6, 410-418.
Inoue, K., Kohda, T., Sugimoto, M., Sado, T., Ogonuki, N., Matoba, S., Shiura, H., Ikeda, R., Mochida, K., Fujii, T., et al. (2010). Impeding Xist expression from the active X chromosome improves mouse somatic cell nuclear transfer. Science 330, 496-499.
Liu, X. Y., Wang, C. F., Liu, W. Q., Li, J. Y., Li, C., Kou, X. C., Chen, J. Y., Zhao, Y. H., Gao, H. B., Wang, H., et al. (2016). Distinct features of H3K4me3 and H3K27me3 chromatin domains in pre-implantation embryos. Nature 537, 558-562.
Maslov, A. Y., Lee, M., Gundry, M., Gravina, S., Strogonova, N., Tazearslan, C., Bendebury, A., Suh, Y. and Vijg, J. (2012). 5-aza-2'-deoxycytidine-induced genome rearrangements are mediated by DNMT1. Oncogene 31, 5172-5179.
Oka, M., Meacham, A. M., Hamazaki, T., Rodic, N., Chang, L. J. and Terada, N. (2005). De novo DNA methyltransferases Dnmt3a and Dnmt3b primarily mediate the cytotoxic effect of 5-aza-2'-deoxycytidine. Oncogene 24, 3091-3099.
Okano, M., Bell, D. W., Haber, D. A. and Li, E. (1999). DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell 99, 247-257.
Schulz, E. G. and Heard, E. (2013). Role and control of X chromosome dosage in mammalian development. Curr. Opin. Genet. Dev. 23, 109-115.
Smith, Z. D., Chan, M. M., Mikkelsen, T. S., Gu, H. C., Gnirke, A., Regev, A. and Meissner, A. (2012). A unique regulatory phase of DNA methylation in the early mammalian embryo. Nature 484, 339-344.
Tan, K., An, L., Miao, K., Ren, L., Hou, Z., Tao, L., Zhang, Z., Wang, X., Xia, W., Liu, J., et al. (2016). Impaired imprinted X chromosome inactivation is responsible for the skewed sex ratio following in vitro fertilization. Proc. Natl. Acad. Sci. U. S. A. 113, 3197-3202.
Reviewer #1 (Recommendations For The Authors):
Title
It would be hard to understand what "co"-regulates means. Does this mean DNA methylation and H3K27me3 co-regulate imprinted X-chromosome inactivation? If so, the title can be reworded.
Thanks for your insightful comments. The title has been changed to “A wave of minor de novo DNA methylation initiates in mouse 8-cell embryos and co-regulates imprinted X-chromosome inactivation with H3K27me3” (Line 2 in the revised manuscript).
Text
(1) As DNA methylation analysis is a primary part of this study, how they processed DNA methylation data can be added to the "Bioinformatics analysis" in the MATERIALS AND METHODS section.
Thanks for your kind reminder. We have added relevant information in the Materials and methods section in the revised manuscript (Lines 462-474 in the revised manuscript).
(2) It seems that recent literature has not been cited in the manuscript. Specifically, none of the papers after 2018 were cited. Recent relevant papers should also be cited throughout the manuscript.
Thanks so much for your reminder. We have added more recent literature to update the relevant information, such as the evidence supporting the causal role between DNA methylation and XCI (Lines 225-228, 264-265 in the revised manuscript); the concurrent enrichment of DNA methylation and H3K27me3 in genes subject to XCI (Lines 301-303 in the revised manuscript); the dominant role of de novo methylation in X chromosome (Lines 253-256 in the revised manuscript), etc.
(3) Line 56: The first report that describes the dynamics of DNMT3B expression in pre-implantation embryonic development (Hirasawa et al., 2007) is missing. This paper should be cited.
We apologize for the omission. We have added the relevant reference and rewritten the sentence in the revised manuscript (Lines 56-57 in the revised manuscript). We believe the Reviewer is referring to the report by Hirasawa et al. (2008), which presented the expression and subcellular localization of Dnmt3a and Dnmt3b in mouse oocytes and preimplantation embryos.
(4) Line 98: It would be good to mention that the data were derived from reduced representation bisulfite sequencing as the authors used whole-genome bisulfite sequencing data from the same research group as well.
Thanks for your kind reminder. As you have suggested, we have added a description in the revised manuscript to emphasize that these data were derived from reduced representation bisulfite sequencing, whereas the other data were derived from whole-genome bisulfite sequencing (Lines 98-99, 111 in the revised manuscript).
(5) Line 101: We first... "the preferential target of DNMT3B (Auclair et al., 2014; Borgel et al., 2010)". More recent literature (Baubec et al., 2016, Duymich et al., 2016, for example) showed that the preferential target of DNMT3B is not a promoter but a gene body. This sentence should be reworded.
Thanks so much for your detailed reminder. As you have pointed out, “preferential target” seems to be an inaccurate statement. Besides promoters, gene bodies and other elements also undergo de novo DNA methylation (Auclair et al., 2014; Dahlet et al., 2020; Duymich et al., 2016).
We have rewritten the sentence as follows in the revised manuscript: “Promoter regions are important target sites of DNMT3B (Choi et al., 2011). The acquisition of DNA methylation in promoters, especially in intermediate and low CpG promoters, during implantation is largely dependent on DNMT3B and plays an important role in regulating developmental genes (Auclair et al., 2014; Borgel et al., 2010; Dahlet et al., 2020). Thus, among genomic regions that may undergo de novo DNA methylation, we initially focused our analysis on DNA methylation dynamics of promoters...” (Lines 100-106 in the revised manuscript)
(6) Lines 108-109: It would be good to mention that these data were derived from whole-genome bisulfite sequencing.
Thanks for your kind reminder. As aforementioned, we have added a description in the revised manuscript to distinguish between data derived from reduced representation bisulfite sequencing and whole-genome bisulfite sequencing (Lines 98-99, 111 in the revised manuscript).
(7) Line 141: rXCI should be defined.
Thanks for your kind reminder. We have added full descriptions and more necessary information about iXCI and rXCI, to make our statements clearer and easier to understand (Lines 210-213 in the revised manuscript). In addition, we carefully checked the relevant descriptions throughout the manuscript, and each abbreviation (such as “ICM”) has been defined at its first occurrence. Additionally, we have replaced abbreviations that appear only once in the manuscript with their full terms (Lines 122, 212 in the revised manuscript).
(8) Lines 145-149: The role of DNA methylation for imprinted X-inactivation has already been reported (Chiba et al., 2008). The relevant sentences should be reworded.
Thanks so much for reminding us of the important earlier literature that explores the relationship between DNA methylation and XCI. However, the primary aim and hypothesis of the study by Chiba et al. are different from those of our study. Chiba et al. focused on whether DNA methylation is the imprinting mark responsible for monoallelic expression of Xist (the initiation event of iXCI), while our study focused on the role of DNA methylation in achieving X chromosomal heterochromatinization (the late event of iXCI).
In detail, the study by Chiba et al. mainly focused on exploring why Xist is specifically expressed from the paternal allele and why iXCI occurs specifically on the paternal X chromosome in mouse preimplantation embryos. Because previous studies had suggested that genomic imprinting of Xist is established during oogenesis (Oikawa et al., 2014; Tada et al., 2000), Chiba et al. wanted to test whether the DNA methylation imprinting established during oogenesis is responsible for the monoallelic expression of Xist in preimplantation embryos. Analyses of DNA methyltransferase maternal knockout embryos revealed that oocyte DNA methylation is dispensable for Xist imprinting (Chiba et al., 2008). A follow-up study by Inoue et al. identified a broad H3K27me3 enrichment within the Xist 5’ region, which is established during oocyte growth and persists through preimplantation development, as the imprinting mark of Xist (Inoue et al., 2017). This series of studies is very important and allows us to understand the mechanism underlying paternal allele-specific iXCI in mouse preimplantation embryos and extraembryonic tissues.
However, the hypothesis is different in our study. Based on the finding of minor de novo DNA methylation and its preferential distribution on the X chromosome, we have speculated that the minor de novo methylation, which occurs from the 8-cell to blastocyst stage, may participate in achieving X chromosomal heterochromatinization. Although DNA methylation is essential for maintaining X chromosome-wide transcriptional silence of rXCI, its role in iXCI remains controversial and it is even plausibly thought that DNA methylation is not required for achieving iXCI because preimplantation embryos undergo global and massive DNA demethylation.
We have reorganized this paragraph and added relevant statements to make the background and discussion clearer and easier to understand (Lines 217-234 in the revised manuscript).
(9) Lines 164-165: Information regarding Dnmt3b KO is missing. Did the authors generate an original KO line or use an already published one? It should be explicitly stated.
Thank you so much for your kind reminder. The Dnmt3b heterozygous mice were obtained from the Mutant Mouse Resource & Research Centers (MMRRC), and Dnmt3b knockout (KO) embryos were generated by mating Dnmt3b heterozygous females with heterozygous males. The genotyping of Dnmt3b KO embryos was performed by PCR following the MMRRC genotyping protocol (https://www.med.unc.edu/mmrrc/genotyping-protocols/mmrrc-center-protocol-29886/). The relevant information has been added to the MATERIALS AND METHODS section in the revised manuscript (Lines 350-354; 391-393 in the revised manuscript).
(10) Line 165: chemical-induced inhibition of DNMT3B. As 5-aza-dC also blocks DNMT3A and DNMT1, this sentence should be reworded.
Thank you for your valuable comments. 5-aza-dC is a well-established global DNA hypomethylating agent that efficiently inhibits the activity of all DNMTs, and has been frequently used to study the maintenance of DNA methylation and de novo DNA methylation (Maslov et al., 2012; Oka et al., 2005). Thus, despite its inhibitory effect common to various DNMTs, chemical-induced inhibition of DNMTs has the advantage of allowing us to transiently treat embryos specifically during the window of minor de novo DNA methylation (the 8-cell to blastocyst stage). We have rewritten the relevant sentences in the revised manuscript (Lines 183-188 in the revised manuscript).
(11) Lines 171-174: "The role of de novo methylation in iXCI...". This possibility was already tested in the previous study from the Sasaki lab (Chiba et al., 2008).
As mentioned above, the primary aim and hypothesis of the study by Chiba et al. are different from those of our study. Chiba et al. mainly focused on exploring why Xist is specifically expressed from paternal allele and iXCI occurs specifically on the paternal X chromosome in mouse preimplantation embryos, so they tested whether the DNA methylation imprinting established during oogenesis is responsible for this monoallelic expression of Xist in preimplantation embryos (the initiation event of iXCI).
By contrast, based on the finding of minor de novo DNA methylation and its preferential distribution on X chromosome, our study has speculated that the minor de novo DNA methylation, which occurs from the 8-cell to blastocyst stage, may participate in achieving X chromosomal heterochromatinization (the late event of iXCI).
Thanks so much for reminding us of this important literature, which makes our discussion more informative. We have reorganized this paragraph by rewriting or adding relevant statements to make the background and discussion clearer and easier to understand (Lines 217-231 in the revised manuscript). In addition, to avoid repetition and make our discussion more concise, we have removed the similar sentences at the end of this paragraph.
(12) Lines 198-200: "Given DNA methylation...". These citations mention a general relationship between DNA methylation and H3K27me3 in cells in culture. As I believe the authors focus on X-chromosome inactivation in the female embryos, more relevant papers that discuss the order of the events for the establishment of H3K27me3 and DNA methylation in the inactive X-chromosome can be cited.
Thanks so much for your comment to improve our discussion. It has been thought that during the late phase of rXCI in fully differentiated cells, gene silencing is achieved by PRC2 complex-induced H3K27me3, and then is further stably maintained by the redundant action of multiple layers of epigenetic modifications, including DNA methylation, to reach the maximum level of chromatin compaction (Chow and Heard, 2009; Heard et al., 2004; Pintacuda and Cerase, 2015). In line with this, a recent multifaceted analysis showed that DNA methylation and H3K27me3 are concurrently enriched in genes subject to XCI (Balaton and Brown, 2021). We have added these statements in the revised manuscript (Lines 295-303 in the revised manuscript).
(13) Line 241: As 5-aza-dC blocks both de novo and maintenance DNA methylation, this sentence should be reworded.
Thank you for your kind reminder. As you have mentioned above, 5-aza-dC is a well-established global DNA hypomethylating agent that efficiently inhibits the activity of all DNMTs, and has been frequently used to study the maintenance of DNA methylation and de novo DNA methylation (Maslov et al., 2012; Oka et al., 2005). Thus, despite its inhibitory effect common to various DNMTs, chemical-induced inhibition of DNMTs has the advantage of allowing us to transiently treat embryos specifically during the window of minor de novo DNA methylation (the 8-cell to blastocyst stage). We have rewritten the relevant sentences in the revised manuscript (Lines 183-188 in the revised manuscript).
Figures
(1) Figure 1C, D: Do the rows in C and D show the corresponding genes?
Figure 1C and D represent the DNA methylation changes of promoters (C) and gene bodies (D), respectively, during the transition from the 8-cell to blastocyst stage. The two datasets were analyzed independently, and the rows do not show corresponding genes. Since we have focused on the minor de novo methylation in promoter regions, the results for gene bodies have been removed from the revised manuscript to avoid confusion.
(2) Figure 1G: Yy2 promoter gained DNA methylation during the transition from 8-cell to the blastocyst stage. Is this a representative locus for the de novo methylated promoters that are shown in Figure 1F where an increase of DNA methylation is about ~1% on average? Another representative locus could be shown instead of this gene promoter.
Thanks so much for your detailed reminder. The inconsistency between the global methylation change and the bisulfite sequencing analysis of Yy2 may be due to methodological details, such as C-T conversion efficiency, the number of picked colonies, etc. Since we have confirmed the presence of minor de novo DNA methylation using different publicly available data, we have removed this result from the revised manuscript to avoid ambiguity.
(3) Figures 2C and 3A: It would be helpful to mention what the arrowheads mean.
Thanks so much for your detailed reminder. In Figure 2C, the arrowhead indicates the H3K27me3 domain and the blank arrowhead indicates a blastomere without the H3K27me3 domain. In Figure 3A, the arrowhead indicates the Xist RNA domain and the blank arrowhead indicates a blastomere without the Xist RNA domain. We have added this information in the revised manuscript (Lines 736-738, 747-749 in the revised manuscript).
(4) Figure 3-figure supplement 2B: It would be hard to see whether H3K27me3 is enriched at the promoter regions of presented genes. It would be helpful to show the values for the Y-axis as in panel A.
Thanks for your helpful reminder. We have added the scales to the figure to improve the result presentation (Figure 4—figure supplement 2B in the revised manuscript).
(5) Figure 4-figure supplement 2: 5-aza-dC blocks not only the activity of DNMT3B but also DNMT1, and DNMT3A (all these DNMTs are expressed during pre-implantation embryos, see Hirasawa et al., 2007). This part can be omitted from the manuscript.
Thanks for your insightful comments. As you have mentioned above, the relevance and rationale for experiments using 5-aza-dC treatment should be clarified. 5-aza-dC is a well-established global DNA hypomethylating agent that efficiently inhibits the activity of all DNMTs, and thus has been frequently used to study the maintenance of DNA methylation and de novo DNA methylation (Maslov et al., 2012; Oka et al., 2005).
In our study, to validate the function of minor de novo DNA methylation in iXCI and blastocyst development, we took advantage of 5-aza-dC-induced DNMT inhibition, which allows us to transiently treat embryos specifically during the window of minor de novo DNA methylation (the 8-cell to blastocyst stage), despite its non-specificity among DNMTs.
Based on these considerations, we would like to retain this result and hope for your understanding.
We have added these statements in the revised manuscript to make the rationale for our experiments clearer and easier to understand (Lines 183-188 in the revised manuscript) and added a schematic diagram depicting the experimental design (Figure 3—figure supplement 3E in the revised manuscript).
Reviewer #2 (Recommendations For The Authors):
Recommendations/concerns in the text:
- Line 106, it is unclear what is meant by "in line with this"? Gene body DNA methylation is a characteristic of active transcription, so why would a gain in DNA methylation at promoters be in line with a gain in DNA methylation over gene bodies?
Thank you so much for your comments that pointed out our ambiguous statement. We meant both the promoter and gene body regions, albeit accounting for small proportions, gain DNA methylation during the transition from the 8-cell to blastocyst stage. Based on the comment by Reviewer#1, since we have focused on the minor de novo methylation in promoter regions, to avoid confusion, the results of the gene body have been removed from the revised manuscript.
- Line 111 & 114, can 6% DNA methylation really be considered "relatively hypermethylated" compared to 3% DNA methylation that is referred to as "more hypomethylated"?
We apologize for our unclear and ambiguous statements. Here we focused on the promoter regions. Many previous studies have revealed that compared with gene bodies and other genome elements, promoter and overlapping CGI regions, especially high CpG promoters, always showed low levels of DNA methylation. We have added relevant statements to clarify this information, and rewritten the sentences in the revised manuscript (Lines 100-106, 116-118, 121, 124 in the revised manuscript).
- Line 124, there are a number of processes identified, why only mention one in the text? Suggest changing writing to be more accurate, indicating what was included for the GO analysis and using the words "enriched for ... processes". Saying it may be linked to a process is an overstatement and not supported by further experiments/data.
Thank you so much for your detailed comments that make our results more informative. We have checked the relevant description and addressed your suggestions as follows: By performing gene ontology enrichment analysis of genes that undergo minor or major de novo DNA methylation respectively, we noticed that, besides many important basic processes common to the two waves of de novo DNA methylation, genes subject to minor de novo DNA methylation were enriched in processes such as organic substance transport, chromosome organization, and cell fate specification (Lines 129-134 in the revised manuscript).
- Lines 149 - 152: sentence/message unclear.
We apologize for the ambiguous description. We have corrected the relevant descriptions as follows: To identify the biological function of minor de novo DNA methylation in iXCI, we knocked down Dnmt3b in preimplantation embryos by microinjecting Dnmt3b siRNA into zygotes (Lines 234-236 in the revised manuscript).
- Lines 162-164: the data in Figure 2C/D does not support this statement, as it does not show H3K27me3 loss specifically at the inactive X-chromosome.
Thanks so much for your insightful comments. Despite the global enrichment of H3K27me3, the H3K27me3 domain detected by immunostaining is a classic marker for the establishment of XCI, which is achieved through X chromosome-wide heterochromatinization and transcriptional repression (Chow and Heard, 2009; Heard et al., 2004; Huynh and Lee, 2005). Thus, we have used immunostaining for H3K27me3 domains to evaluate iXCI establishment in the blastocysts, as previously reported (Fukuda et al., 2014; Gontan et al., 2018; Inoue et al., 2010; Tan et al., 2016). To make our results more convincing, we have added another statistical method to quantify the establishment of iXCI, i.e., the percentage of H3K27me3-positive and -negative trophoblast cells relative to total trophoblast cells in female blastocysts with or without Dnmt3b knockdown.
In addition, we have added a schematic diagram depicting the process of iXCI initiation and establishment, as well as the experimental design and work flows, to make the result easier to be understood.
In addition, we agree with your comments that additional evidence will benefit the conclusion. To strengthen the evidence and test whether DNA methylation loss leads to a prolonged effect on iXCI, we have reanalyzed the RNA-seq and H3K27me3 ChIP-seq data in extraembryonic ectoderm (ExE) of E6.5 single embryos that underwent Dnmt3a/3b knockout, because preimplantation iXCI status is maintained in extraembryonic cells (Chen et al., 2019; Galupa and Heard, 2015; Schulz and Heard, 2013). The results showed that chromosome-wide loss of DNA methylation led to a nearly complete loss of H3K27me3 on the paternal X chromosome (specifically inactivated in iXCI), along with a notable transcriptional upregulation across the chromosome. By contrast, these changes were not observed on the maternal X chromosome (Lines 253-261; Figure 3—figure supplement 4A in the revised manuscript).
- Lines 169-174: sentence/message unclear.
As mentioned above, we have reorganized this paragraph by rewriting or adding statements relevant to DNA methylation and XCI, to make the background and discussion clearer and easier to understand (Lines 217-234 in the revised manuscript). In addition, to avoid repetition and make our discussion more concise, we have removed the similar sentences at the end of this paragraph.
- Lines 177-179: this statement is too bold. The data does not support "direct evidence".
Thank you for your detailed reminder. We have rewritten the sentence to avoid confusion and overstatement (Lines 262-268 in the revised manuscript).
- Line 198: these are not all enzymes, but could be referred to as chromatin modifiers.
We apologize for the ambiguous description. As you suggested, we have corrected “enzymes” to “chromatin modifiers” (Lines 284, 287 in the revised manuscript).
- Line 199: this statement is not correct in all contexts. There are many studies showing antagonism between DNA methylation and H3K27me3.
Thanks so much for your careful reviewing. As you have pointed out, the reported relationships between DNA methylation and H3K27me3 are divergent and largely controversial among studies. Under certain circumstances, DNA methylation antagonizes H3K27me3 at promoters by excluding the binding of components of PRC2 (the main complex responsible for H3K27me3 deposition) to their targets (Bartke et al., 2010; Jermann et al., 2014), while other studies have presented alternative evidence that PRC2 and DNA methylation cooperate to achieve silencing (Hagarman et al., 2013; Vire et al., 2006). Thus, it has been thought that the relationship between DNA methylation and histone modifications is complex, possibly in a cell-type and/or genomic region-specific manner. Both antagonism and coordination can be observed in different regulatory elements in mouse ES cells (King et al., 2016).
We apologize for our incomplete statement, which arose because we mainly focused on their synergistic relationship. We have refined this section by rewriting relevant sentences and adding necessary statements (Lines 288-303 in the revised manuscript).
- Lines 228-230: the developmental significance of DNA methylation homeostasis is already well-established. Please reference relevant papers showing this here.
Thank you for this helpful suggestion. We have reorganized this section. Relevant references that highlight the developmental significance of DNA methylation homeostasis have been added. The sentence has been rewritten and moved to the end of this paragraph in the revised manuscript (Lines 159-161 in the revised manuscript).
- Line 238: an explanation/rationale for looking at energy metabolism is lacking.
Thank you for your comments, which help make our results easier to understand. The examination of energy metabolism is mainly based on the integrated analysis of DNA methylation and gene expression from the 8-cell embryos to ICM, to test the potential short- and long-term developmental consequences of minor de novo DNA methylation. Bioinformatic analysis suggested that many basic processes, such as cell differentiation, cell cycle and metabolic regulation, may be regulated by minor de novo DNA methylation. Among the enriched genes, several are related to energy metabolism. In addition, because energy metabolism is crucial for supporting embryo differentiation and development, and oxidative phosphorylation (OXPHOS) metabolism is highly activated during the blastocyst stage (Zhao et al., 2021), we next examined the energy metabolism, particularly OXPHOS activity, of Dnmt3b-KD embryos. We have refined the section by rewriting relevant sentences and adding necessary statements (Lines 175-179 in the revised manuscript).
- Lines 246-248: Looking at the data in Figure 2 figure supplement 2, this statement is simply not true with regards to DNMT3B protein, and also global DNA methylation level is reduced in the Dnmt3b KD blastocyst, which could lead to defective major de novo DNA methylation.
Thanks for your careful reviewing, we have rewritten the sentence to make our statement more accurate and avoid overstatement (Lines 188-190 in the revised manuscript).
Recommendations/concerns relating to figures:
Figure 1:
- Of all genic promoters, how many were included in the analysis (contained sufficient coverage)? What cut-off/thresholds were used to consider DNA methylation gain at a promoter?
Thanks for your comments. In total, 11662 promoters were analyzed. Promoter methylation is generally low, particularly at the 8-cell stage, when minor de novo methylation has just been initiated, and these low basal levels make the increase before the blastocyst stage appear slight. To capture these slight changes, we used a relaxed threshold based on ΔDNA methylation. Only CpG sites with at least fivefold coverage were included in the methylation analysis, which was based on data from Smith et al. (Smith et al., 2012). ΔDNA methylation greater or less than 0 was defined as gain or loss of DNA methylation, respectively. We have added this information in the revised manuscript (Lines 462-470 in the revised manuscript).
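For illustration only, the thresholding described above can be sketched as follows. This is a minimal sketch and not the pipeline used in the study: the data layout, the function names, and the choice to average the retained CpGs per promoter are all assumptions made for the example. Only the fivefold coverage cut-off and the rule that ΔDNA methylation above or below 0 counts as gain or loss come from the response.

```python
from statistics import mean

MIN_COVERAGE = 5  # "at least fivefold coverage", as stated in the response


def promoter_methylation(cpgs, min_coverage=MIN_COVERAGE):
    """Average methylation of one promoter over CpGs passing the coverage filter.

    cpgs: list of (methylation_fraction, read_coverage) tuples (hypothetical layout).
    Returns None when no CpG passes the filter.
    """
    kept = [meth for meth, cov in cpgs if cov >= min_coverage]
    return mean(kept) if kept else None


def classify_promoters(stage_a, stage_b):
    """Label each promoter present in both stages as 'gain', 'loss' or 'unchanged'.

    stage_a, stage_b: dicts mapping promoter id -> list of CpG tuples
    (e.g. 8-cell and ICM; the names are illustrative).
    """
    calls = {}
    for promoter in stage_a.keys() & stage_b.keys():
        before = promoter_methylation(stage_a[promoter])
        after = promoter_methylation(stage_b[promoter])
        if before is None or after is None:
            continue  # insufficient coverage in at least one stage
        delta = after - before  # delta DNA methylation
        if delta > 0:
            calls[promoter] = "gain"
        elif delta < 0:
            calls[promoter] = "loss"
        else:
            calls[promoter] = "unchanged"
    return calls
```

Because the cut-off is simply the sign of the difference, any promoter with a non-zero change is called, which is why the threshold is described as relaxed.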
- Does an average methylation level of 0.02 represent 2% DNA methylation? Presuming yes, is the average 1.5% DNA methylation gain at promoters real? And meaningful? Especially compared to the gain in DNA methylation that takes place between ICM and E6.5 (Figure 1 Figure Supplement 1 D)
As you have pointed out, an average methylation level of 0.02 represents 2% DNA methylation. As mentioned above, promoters exhibited an average DNA methylation gain of 1.5% during the transition from the 8-cell stage to ICM. The slight increase may be mainly due to the relatively low basal levels. As you expected, compared with the comprehensive de novo DNA methylation during implantation, preimplantation de novo methylation is much weaker and occurs at a small proportion of promoter regions, so we designated it as minor de novo DNA methylation. It should also be mentioned that a proportion of these promoters continue to gain substantial DNA methylation during implantation. We have refined the relevant sentences to provide more detailed information about our results (Lines 125-127 in the revised manuscript).
- Why is there a focus on promoters (which are not the preferential target of DNMT3B)?
Thanks so much for your detailed reminder. As you have pointed out, “preferential target” is an inaccurate statement. Besides promoters, gene bodies and other elements also undergo de novo DNA methylation (Auclair et al., 2014; Dahlet et al., 2020; Duymich et al., 2016). We have focused on the promoter regions based on the following considerations: (1) Promoter regions are important target sites of DNMT3B (Choi et al., 2011); (2) The acquisition of DNA methylation in promoters, especially in intermediate and low CpG promoters, during implantation is largely dependent on DNMT3B and plays an important role in regulating developmental genes (Auclair et al., 2014; Borgel et al., 2010; Dahlet et al., 2020). We have rewritten the relevant sentence in the revised manuscript (Lines 100-106 in the revised manuscript).
- Figure 1H shows that promoters that gain DNA methylation during the "minor de novo DNA methylation" continue to gain DNA methylation during "de novo DNA methylation". Is the ~1.5% DNA methylation gain just the slow start of the main de novo DNA methylation wave?
Your comments are very helpful for improving the description of our results. In the present study, our analysis indicated that a small proportion of promoters initially gain methylation during the transition from the 8-cell to ICM. The finding challenges current knowledge: (1) de novo DNA methylation occurs during implantation, by which globally hypomethylated blastocysts acquire genome-wide DNA methylation (Borgel et al., 2010; Dahlet et al., 2020; Smith et al., 2012); (2) during preimplantation development, embryos undergo massive and global DNA demethylation.
To distinguish our finding from the current knowledge of the timing and dynamics of DNA methylation during early development, we have designated the methylation gain during the transition from the 8-cell to blastocyst stage as minor de novo DNA methylation.
We agree with your point that most of the promoters undergoing minor de novo methylation continue to gain DNA methylation during implantation, as revealed in Fig. 1F. We have refined the relevant statement in the revised manuscript (Lines 125-127 in the revised manuscript).
- The GO analysis performed for Figure 1H, what was used as input? Promoters of genes that gain DNA methylation as identified in 1C?
Thank you for your comments. For the GO analysis shown in Figure 1H, we used as input the genes with promoter regions that gained or lost DNA methylation, respectively, during the transition from the 8-cell to ICM (identified in Figure 1C). This information has been clarified in the revised manuscript to ensure accuracy (Lines 129-134 in the revised manuscript).
- Figure 1 figure supplement 1, is there only a fold change as threshold or also a calculated significance (eg. p-value/FDR)?
Thanks for your valuable comments. Considering the relatively low DNA methylation levels at promoter regions and the slight changes occurring during preimplantation embryo development, we used a relaxed threshold based on ΔDNA methylation. Only CpG sites with at least fivefold coverage were included in the methylation analysis, which was based on data from Smith et al. (Smith et al., 2012). ΔDNA methylation greater or less than 0 was defined as gain or loss of DNA methylation, respectively. We have replaced relevant figures and added this information in the revised manuscript (Figure 1—figure supplement 1D-E; Lines 125-127 in the revised manuscript).
- To confirm DNMT3B is responsible for the DNA methylation gain: DNMT3B KD/KO followed by promoter DNA methylation analysis to confirm the promoters that gain DNA methylation between 8 cell and ICM don't gain DNA methylation in the absence of DNMT3B.
We agree with your comments that additional evidence will benefit the conclusion. To strengthen the evidence, we have reanalyzed the RNA-seq and H3K27me3 ChIP-seq data in extraembryonic ectoderm (ExE) of E6.5 single embryos that underwent Dnmt3a/3b knockout, because preimplantation iXCI status is maintained in extraembryonic cells (Chen et al., 2019; Galupa and Heard, 2015; Schulz and Heard, 2013). The results showed that chromosome-wide loss of DNA methylation led to a nearly complete loss of H3K27me3 on the paternal X chromosome (specifically inactivated in iXCI), which showed a notable transcriptional upregulation across the chromosome. By contrast, these changes were not observed on the maternal X chromosome. We have added this result in the revised manuscript (Lines 253-261; Figure 3—figure supplement 4A in the revised manuscript).
Figure 2:
- Figure 2A: label missing for what the numbers on the y-axis represent.
Thank you for pointing this out. We apologize for the oversight. We have added the y-axis label in Figure 2A to clarify what the numbers represent, making it easier to understand (Figure 3A in the revised manuscript).
- Figure 2B: y-axis is % of methylated promoters compared to all promoters?
Thank you for your suggestion. The y-axis in Figure 2B indeed represents the percentage of de novo methylated promoters relative to all promoters. As you have suggested, we have clarified this labeling in the revised manuscript (Figure 3B in the revised manuscript).
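As a hedged illustration of the quantity plotted on this y-axis, the percentage of de novo methylated promoters relative to all promoters could be computed per chromosome as below. This is a sketch under stated assumptions rather than the analysis code behind the figure: the grouping by chromosome, the input layout, and the function name are hypothetical, and "de novo methylated" is taken to mean ΔDNA methylation greater than 0, following the definition given elsewhere in this response.

```python
def percent_de_novo_methylated(deltas_by_chromosome):
    """Percentage of promoters with delta methylation > 0, per chromosome.

    deltas_by_chromosome: dict mapping a chromosome name to a list of
    per-promoter delta methylation values (hypothetical input layout).
    Chromosomes without any analyzed promoter are skipped.
    """
    result = {}
    for chrom, deltas in deltas_by_chromosome.items():
        if not deltas:
            continue  # no promoters analyzed on this chromosome
        gained = sum(1 for d in deltas if d > 0)
        result[chrom] = 100.0 * gained / len(deltas)
    return result
```

Expressing the value relative to all analyzed promoters on the same chromosome keeps chromosomes with different promoter counts comparable.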
- What is the delta DNA methylation gain specifically for X-linked promoters?
Thanks so much for your reminder. To provide more convincing evidence, we have reanalyzed single-cell COOL-seq data and specifically examined the DNA methylation changes at X chromosomal promoters in female embryos. The X chromosome showed a more notable increase in de novo methylated promoters than autosomes, and the female X chromosome showed higher DNA methylation levels than the male X chromosome (Figure 3—figure supplement 2A-B; Lines 203-206 in the revised manuscript).
- Figure 2C: include representative images of separate channels to better see the signal of CDX2 and H3K27me3. Quantification would be better represented with box plots.
Thank you for your helpful suggestions. We have added separate channel images in the revised manuscript. Additionally, we have adjusted the quantification to be represented as box plots, as you have suggested, to improve the accuracy and interpretability of the data presentation (Figure 3D-F in the revised manuscript).
- Figure 2C: Does the H3K27me3 signal overlap with the location of the inactive X-chromosome (is there maybe denser DAPI or do IF combined with Xist RNA-FISH)?
Thanks so much for your insightful comments. Despite the global enrichment of H3K27me3, the H3K27me3 domain detected by immunostaining is a classic marker for the establishment of XCI, which is achieved through X chromosome-wide heterochromatinization and transcriptional repression (Chow and Heard, 2009; Heard et al., 2004; Huynh and Lee, 2005). Thus, we have used immunostaining for H3K27me3 domains to evaluate iXCI establishment in the blastocysts, as previously reported (Fukuda et al., 2014; Gontan et al., 2018; Inoue et al., 2010; Tan et al., 2016). We attempted co-staining of H3K27me3 IF and Xist FISH but were hindered by technical challenges, and we hope for your understanding. However, as mentioned above, H3K27me3 is a well-accepted marker of XCI status.
In addition, to make our results more convincing, we have added an alternative statistical method to quantify the establishment of iXCI, i.e., the percentage of H3K27me3-positive and -negative trophoblast cells relative to total trophoblast cells in female blastocysts with or without Dnmt3b knockdown (Figure 3F; Lines 243-244 in the revised manuscript).
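The percentage described above amounts to a simple per-blastocyst ratio, sketched here for clarity. The function name and the idea of passing raw cell counts are assumptions for the example. How cells are scored as positive or negative is not specified here and is outside the scope of the sketch.

```python
def percent_h3k27me3_positive(n_positive, n_negative):
    """Percentage of trophoblast cells carrying an H3K27me3 domain.

    n_positive / n_negative: counts of H3K27me3-positive and -negative
    trophoblast cells scored in one female blastocyst (hypothetical inputs).
    """
    total = n_positive + n_negative
    if total == 0:
        raise ValueError("no trophoblast cells were scored")
    return 100.0 * n_positive / total
```

The percentage of negative cells is the complement (100 minus this value), so the two groups can be compared on either measure.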
- Figure 2 figure supplement 2A: relative expression of Dnmt3b?
Thanks for your detailed reminder. The data represent the relative expression level of Dnmt3b, as noted in the original figure legend. Based on your comments, we have added the gene name in the label of the Y-axis. Similarly, the protein name has been also added to make the results more informative (Figure 2 figure supplement 2A, C, E in the revised manuscript).
- Figure 2 figure supplement 2B/C: in the text, line 153, it is stated that "Dnmt3b mRNA and protein levels were significantly reduced in morulae, but not in blastocysts compared to those of negative control (NC) group". These figures do not support that statement. The IF images show a loss of DNMT3B in the Dnmt3b KD blastocysts. The IF quantification seems to have fewer datapoints for the blastocyst, and looking at the bar graphs, there seems to be a trend towards reduced DNMT3B in both the morula and blastocyst, which would also explain the reduction in DNA methylation in both stages as shown in Figure 2 figure supplement 2D/E.
Thanks so much for your careful reviewing that makes our statements more accurate. We have rewritten the sentence in the revised manuscript as follows: Dnmt3b mRNA and protein levels were significantly reduced in morulae, and tended to be lower in blastocysts compared to those of the negative control (NC) group. In addition, we have removed “transient” from the original statement “The transient inhibition of Dnmt3b” (Lines 168-170 in the revised manuscript).
- Figure 2 figure supplement 2F/G: include representative IF images with separation of all channels and the merged image.
Thank you for your suggestion. We have added the representative immunofluorescence (IF) images with separate channels and merged image in the revised manuscript (Figure 3—figure supplement 3B, F in the revised manuscript).
- Figure 2 figure supplement 2H: Instead of showing log2FC in methylation levels, delta methylation would be more informative. Are these genes already inactivated at the 8-cell stage? Or are they active and become inactivated by the gain in DNA methylation? Doing qPCR for these genes, or looking at published RNAseq data would be informative. What happens to the expression of these genes in the Dnmt3b KD?
Thank you for your suggestions. We now present DNA methylation changes as “ΔDNA methylation”. During mouse preimplantation development, iXCI is initiated in early cleavage-stage female embryos and depends on Xist upregulation around the 4- to 8-cell stage; Xist then specifically coats the paternal X chromosome and finally leads to chromosome-wide silencing via heterochromatinization in early blastocysts. Thus, these non-escaping genes, which are subject to XCI, would not yet be inactivated at the 8-cell stage.
Author response image 1.
The processes of iXCI initiation and establishment (left panel), and dynamics of total expression levels of X chromosome in male and female preimplantation embryos (right panel, note that X-dosage is balanced between sexes until the early blastocyst stage).
As you expected, most of these representative non-escaping genes are downregulated during the transition from the 8-cell to the blastocyst stage, consistent with their gain of DNA methylation. Additionally, since the preimplantation iXCI status is maintained in extraembryonic cells (Galupa and Heard, 2015; Schulz and Heard, 2013), we further reanalyzed the published RNA-seq data from extraembryonic ectoderm (ExE) of single E6.5 embryos that underwent DNA methyltransferase knockout (Chen et al., 2019). The results showed that chromosome-wide loss of DNA methylation led to chromosome-wide transcriptional upregulation on the paternal X chromosome, including at the loci of these non-escaping genes. We have added this result to the revised manuscript (Figure 3—figure supplement 3J; Figure 3—figure supplement 4A-B; Lines 253-261 in the revised manuscript).
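The difference between the two ways of reporting methylation change discussed in this response can be illustrated with a minimal sketch. The promoter names and methylation fractions below are hypothetical values chosen for illustration, not data from the manuscript, and the pseudocount is an assumed guard against division by zero.

```python
import math

# Hypothetical promoter methylation fractions (8-cell, blastocyst); not real data.
PROMOTERS = {
    "low_start": (0.02, 0.08),
    "high_start": (0.40, 0.80),
}


def delta_methylation(before, after):
    """Absolute change in methylation fraction (after minus before)."""
    return after - before


def log2_fold_change(before, after, pseudocount=0.01):
    """Relative change on a log2 scale; the pseudocount is an assumption."""
    return math.log2((after + pseudocount) / (before + pseudocount))


def summarize(promoters=PROMOTERS):
    """Return {name: (delta, log2FC)} for each promoter."""
    return {
        name: (delta_methylation(b, a), log2_fold_change(b, a))
        for name, (b, a) in promoters.items()
    }


if __name__ == "__main__":
    for name, (delta, lfc) in summarize().items():
        print(f"{name}: delta={delta:+.2f}, log2FC={lfc:+.2f}")
```

In this toy case the lowly methylated promoter has the larger log2 fold change but the smaller absolute gain, which is one reason Δ methylation can be easier to interpret for loci that start near zero.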
Figure 3:
- Figure 3 figure supplement 1: representative IF image missing.
Thanks for your kind reminder. We have added the representative IF images in the revised manuscript to provide a clearer illustration of the data (Figure 4—figure supplement 1A in the revised manuscript).
- Figure 3 figure supplement 2B: scales are missing for the H3K27me3 ChIP-seq data (are the 8-cell and ICM tracks set to the same scale?). It looks like the ICM track is cut off at the top (peaks not fully displayed) and the data looks very sparse. A more informative analysis would be to do peak calling over promoters and compare 8-cell with ICM.
Thank you for the detailed reminder. We apologize for the missing scale bars in the H3K27me3 ChIP-seq data. The 8-cell and ICM tracks were set to the same scale, and we have now added scales to the figure in the revised manuscript. Regarding your observation that the peaks appear cut off, the flattened appearance is not caused by track truncation, but rather by zooming into a specific region in the extended IGV files.
These results are based on the reanalysis of publicly available data from pooled embryos, which provide only suggestive, not direct, evidence for the role of DNA methylation in promoting X-linked H3K27me3 enrichment in iXCI.
To provide more convincing evidence, we have reanalyzed the RNA-seq and H3K27me3 ChIP-seq data from extraembryonic ectoderm (ExE) of E6.5 female embryos that underwent Dnmt3a/3b knockout, because the preimplantation iXCI status is maintained in extraembryonic cells (Chen et al., 2019; Galupa and Heard, 2015; Schulz and Heard, 2013). The results showed that Dnmt knockout led to a nearly complete loss of H3K27me3 on the paternal X chromosome (which is specifically inactivated in iXCI), accompanied by notable transcriptional upregulation across the chromosome. By contrast, these changes were not observed on the maternal X chromosome (Figure 3—figure supplement 4 in the revised manuscript). We have added these results to the revised manuscript.
- Figure 3E: Given all tested proteins give a positive signal, it would have been good to include a negative control chromatin protein that is known to not interact with DNMT3B. Given both PRC2 and DNMT3B are chromatin-binding proteins, can the signal be a result of close proximity instead of a direct interaction?
In the present study, to test the interaction between DNMT3B and PRC2 core components, we used the in situ proximity ligation assay (PLA), an increasingly popular technique for detecting the close proximity of two proteins in fixed samples using two primary antibodies (Alsemarz et al., 2018).
Author response image 2.
Schematic diagram of the principle of the in situ PLA.
Compared with the classical co-immunoprecipitation (Co-IP) method, in situ PLA has advantages in (1) detecting low-input samples or proteins expressed at low levels, which is extremely difficult using Co-IP; and (2) providing in situ or subcellular information on protein-protein interactions. However, it should be noted that the maximal distance allowing this reaction is 40 nm, which is not small enough to demonstrate a physical interaction between the two antigens, but is sufficient to support very close “proximity”.
In our study, in situ PLA, including the experimental design of the negative controls, was performed in accordance with the manufacturer’s instructions for the Duolink® In Situ Red Starter Kit (MilliporeSigma): “Technical negative controls included incubation with each primary antibody separately and no primary antibody”. We have refined the relevant sentence in the revised manuscript (Lines 308-310 in the revised manuscript).
- Figure 3G: It would have been good to include a negative control, and DNase/benzonase to exclude DNA/RNA-mediated protein interaction.
- (Of note, there have been previous studies reporting an interaction between PRC2 and DNMT3B in other cell types, such as in Weigert et al. 2023, but unfortunately, they don't seem to use DNase/benzonase either).
The Co-IP analysis of DNMT3B and PRC2 core components in differentiated female ES cells was presented as additional supportive evidence. Because Co-IP analysis is extremely difficult in preimplantation embryos, we used in situ PLA to detect their interaction. However, the maximal distance allowing the in situ PLA reaction is 40 nm, which is not small enough to demonstrate a physical interaction (Alsemarz et al., 2018). Thus, we added a Co-IP analysis using differentiated female ES cells, in which rXCI occurs upon differentiation.
Given the supporting role of this result, we have moved it from the main figure to the supplemental figure (Figure 4—figure supplement 3H in the revised manuscript).
- Figure 3 figure supplement 3G: what were the ESCs differentiated into? Did the Dnmt3b KO or Dnmt3a/b DKO show any differentiation defect?
The mouse ESC line PGK12.1 is a well-established ex vivo model of rXCI. Under standard culture conditions, PGK12.1 cells normally undergo neuroectodermal commitment.
Author response image 3.
Immunostaining of NESTIN, a neuroectodermal stem cell marker molecule, and NANOG in undifferentiated and differentiated PGK12.1 ESCs respectively.
No differentiation defects were observed in either Dnmt3b KO or Dnmt3a/3b DKO ESCs in our study. Dnmt KO/DKO/TKO ES cell lines have been successfully used as models of the interaction between DNA methylation and H3K27me3 deposition (King et al., 2016).
Figure 4:
- Figure 4B: Is there an explanation for seeing similar total cell numbers in Figure 4B, but showing decreased proliferation in Figure 4A?
Thank you for your insightful comments. The EdU cell proliferation assay labels cells during the S phase of the cell cycle, as 5-ethynyl-2´-deoxyuridine (EdU) is incorporated into newly synthesized DNA. This labeling identifies cells undergoing DNA synthesis, but these cells may not have completed mitosis at the time of detection. As a result, the total cell number may not immediately reflect the decrease in proliferation observed in the treated group. To address this point, we have rewritten the sentences in the revised manuscript (Lines 174-175 in the revised manuscript).
References
Alsemarz, A., Lasko, P. and Fagotto, F. J. B. (2018). Limited significance of the in situ proximity ligation assay. bioRxiv, 411355.
Auclair, G., Guibert, S., Bender, A. and Weber, M. (2014). Ontogeny of CpG island methylation and specificity of DNMT3 methyltransferases during embryonic development in the mouse. Genome Biol. 15, 545.
Balaton, B. P. and Brown, C. J. (2021). Contribution of genetic and epigenetic changes to escape from X-chromosome inactivation. Epigenetics Chromatin 14, 30.
Bartke, T., Vermeulen, M., Xhemalce, B., Robson, S. C., Mann, M. and Kouzarides, T. (2010). Nucleosome-interacting proteins regulated by DNA and histone methylation. Cell 143, 470-484.
Borgel, J., Guibert, S., Li, Y., Chiba, H., Schubeler, D., Sasaki, H., Forne, T. and Weber, M. (2010). Targets and dynamics of promoter DNA methylation during early mouse development. Nat. Genet. 42, 1093-1100.
Chen, Z., Yin, Q., Inoue, A., Zhang, C. and Zhang, Y. (2019). Allelic H3K27me3 to allelic DNA methylation switch maintains noncanonical imprinting in extraembryonic cells. Sci Adv 5, eaay7246.
Chiba, H., Hirasawa, R., Kaneda, M., Amakawa, Y., Li, E., Sado, T. and Sasaki, H. (2008). De novo DNA methylation independent establishment of maternal imprint on X chromosome in mouse oocytes. Genesis 46, 768-774.
Choi, S. H., Heo, K., Byun, H. M., An, W., Lu, W. and Yang, A. S. (2011). Identification of preferential target sites for human DNA methyltransferases. Nucleic Acids Res. 39, 104-118.
Chow, J. and Heard, E. (2009). X inactivation and the complexities of silencing a sex chromosome. Curr. Opin. Cell Biol. 21, 359-366.
Dahlet, T., Argueso Lleida, A., Al Adhami, H., Dumas, M., Bender, A., Ngondo, R. P., Tanguy, M., Vallet, J., Auclair, G., Bardet, A. F., et al. (2020). Genome-wide analysis in the mouse embryo reveals the importance of DNA methylation for transcription integrity. Nat Commun 11, 3153.
Duymich, C. E., Charlet, J., Yang, X. J., Jones, P. A. and Liang, G. N. (2016). DNMT3B isoforms without catalytic activity stimulate gene body methylation as accessory proteins in somatic cells. Nat Commun 7, 11453.
Fukuda, A., Tomikawa, J., Miura, T., Hata, K., Nakabayashi, K., Eggan, K., Akutsu, H. and Umezawa, A. (2014). The role of maternal-specific H3K9me3 modification in establishing imprinted X-chromosome inactivation and embryogenesis in mice. Nat Commun 5, 5464.
Galupa, R. and Heard, E. (2015). X-chromosome inactivation: new insights into cis and trans regulation. Curr. Opin. Genet. Dev. 31, 57-66.
Gontan, C., Mira-Bontenbal, H., Magaraki, A., Dupont, C., Barakat, T. S., Rentmeester, E., Demmers, J. and Gribnau, J. (2018). REX1 is the critical target of RNF12 in imprinted X chromosome inactivation in mice. Nat Commun 9, 4752.
Hagarman, J. A., Motley, M. P., Kristjansdottir, K. and Soloway, P. D. (2013). Coordinate regulation of DNA methylation and H3K27me3 in mouse embryonic stem cells. PLoS One 8, e53880.
Heard, E., Chaumeil, J., Masui, O. and Okamoto, I. (2004). Mammalian X-chromosome inactivation: an epigenetics paradigm. Cold Spring Harb. Symp. Quant. Biol. 69, 89-102.
Huynh, K. D. and Lee, J. T. (2005). X-chromosome inactivation: a hypothesis linking ontogeny and phylogeny. Nat. Rev. Genet. 6, 410-418.
Inoue, A., Jiang, L., Lu, F. and Zhang, Y. (2017). Genomic imprinting of Xist by maternal H3K27me3. Genes Dev. 31, 1927-1932.
Inoue, K., Kohda, T., Sugimoto, M., Sado, T., Ogonuki, N., Matoba, S., Shiura, H., Ikeda, R., Mochida, K., Fujii, T., et al. (2010). Impeding Xist expression from the active X chromosome improves mouse somatic cell nuclear transfer. Science 330, 496-499.
Jermann, P., Hoerner, L., Burger, L. and Schubeler, D. (2014). Short sequences can efficiently recruit histone H3 lysine 27 trimethylation in the absence of enhancer activity and DNA methylation. Proc. Natl. Acad. Sci. U. S. A. 111, E3415-3421.
King, A. D., Huang, K., Rubbi, L., Liu, S., Wang, C. Y., Wang, Y., Pellegrini, M. and Fan, G. (2016). Reversible Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells. Cell Rep. 17, 289-302.
Maslov, A. Y., Lee, M., Gundry, M., Gravina, S., Strogonova, N., Tazearslan, C., Bendebury, A., Suh, Y. and Vijg, J. (2012). 5-aza-2'-deoxycytidine-induced genome rearrangements are mediated by DNMT1. Oncogene 31, 5172-5179.
Oikawa, M., Inoue, K., Shiura, H., Matoba, S., Kamimura, S., Hirose, M., Mekada, K., Yoshiki, A., Tanaka, S., Abe, K., et al. (2014). Understanding the X chromosome inactivation cycle in mice: a comprehensive view provided by nuclear transfer. Epigenetics 9, 204-211.
Oka, M., Meacham, A. M., Hamazaki, T., Rodic, N., Chang, L. J. and Terada, N. (2005). De novo DNA methyltransferases Dnmt3a and Dnmt3b primarily mediate the cytotoxic effect of 5-aza-2'-deoxycytidine. Oncogene 24, 3091-3099.
Pintacuda, G. and Cerase, A. (2015). X Inactivation Lessons from Differentiating Mouse Embryonic Stem Cells. Stem Cell Rev Rep 11, 699-705.
Schulz, E. G. and Heard, E. (2013). Role and control of X chromosome dosage in mammalian development. Curr. Opin. Genet. Dev. 23, 109-115.
Smith, Z. D., Chan, M. M., Mikkelsen, T. S., Gu, H. C., Gnirke, A., Regev, A. and Meissner, A. (2012). A unique regulatory phase of DNA methylation in the early mammalian embryo. Nature 484, 339-344.
Tada, T., Obata, Y., Tada, M., Goto, Y., Nakatsuji, N., Tan, S., Kono, T. and Takagi, N. (2000). Imprint switching for non-random X-chromosome inactivation during mouse oocyte growth. Development 127, 3101-3105.
Tan, K., An, L., Miao, K., Ren, L., Hou, Z., Tao, L., Zhang, Z., Wang, X., Xia, W., Liu, J., et al. (2016). Impaired imprinted X chromosome inactivation is responsible for the skewed sex ratio following in vitro fertilization. Proc. Natl. Acad. Sci. U. S. A. 113, 3197-3202.
Vire, E., Brenner, C., Deplus, R., Blanchon, L., Fraga, M., Didelot, C., Morey, L., Van Eynde, A., Bernard, D., Vanderwinden, J. M., et al. (2006). The Polycomb group protein EZH2 directly controls DNA methylation. Nature 439, 871-874.
Zhao, J., Yao, K., Yu, H., Zhang, L., Xu, Y., Chen, L., Sun, Z., Zhu, Y., Zhang, C., Qian, Y., et al. (2021). Metabolic remodelling during early mouse embryo development. Nat Metab 3, 1372-1384.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This study presents valuable findings regarding the role of life history differences in determining population size and demography. The evidence for the claims is still partially incomplete, with concerns about generation times and population structure. Nonetheless, the work will be of considerable interest to biologists thinking about the evolutionary consequences of life history changes.
Thank you. We have addressed the generation time and population structure issues in detail in our revision and hope that you, like us, find them to be of sufficiently low concern (i.e., they are not driving the results) that they do not overshadow the main findings and conclusions.
The opportunity to make in-depth revisions also helped the manuscript in two ways unanticipated by both us and the reviewers. First, KW made a mistake in the original analysis of phylogenetic signal, and catching that error simplifies that aspect of the study (there is none in our measured variables). Second, in June 2024 Hilgers et al. (2024; https://doi.org/10.1101/2024.06.17.599025) posted an important manuscript to bioRxiv noting the possibility of false population size peaks in PSMC analyses using the standard default settings. Our results had three of those, which we have eliminated. Neither of these issues affects the overall conclusions, but their resolution improves the work.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This interesting study applies the PSMC model to a set of new genome sequences for migratory and nonmigratory thrushes and seeks to describe differences in the population size history among these groups. The authors create a set of summary statistics describing the PSMC traces - mean and standard deviation of N<sub>e</sub>, plus a set of metrics describing the shape of the oldest N<sub>e</sub> peak - and use these to compare across migratory and resident species (taking single samples sequenced here as representative of the species). The analyses are framed as supporting or refuting aspects of a biogeographic model describing colonization dynamics from tropical to temperate North and South America.
Strengths:
At a technical level, the sequencing and analysis up through PSMC looks good and the paper is engaging and interesting to read as an introduction to some verbal biogeographic models of avian evolution in the Pleistocene.
The core findings - higher and more variable N<sub>e</sub> in migratory species - seem robust, and the biogeographic explanation is plausible.
Thanks. We thought so as well. Our analyses go beyond being simply descriptive and test some simple hypotheses, including a biogeographic+ecological expansion opportunity gained in some lineages through the adoption of a seasonal migration life-history strategy.
Weaknesses:
I did not find the analyses particularly persuasive in linking specific aspects of clade-level PSMC patterns causally to evolutionary driving forces. To their credit, the authors have anticipated my main criticism in the discussion. This is that variation in population size inferred by methods like PSMC is in "effective" terms, and the link between effective and census population size is a morass of bias introduced by population structure and selection so robustly connecting specific aspects of PSMC traces to causal evolutionary forces is somewhere between extremely difficult and impossible.
As R1 notes, we do not attempt to link effective population sizes and census sizes (though we do discuss this), and we are also careful to discuss correlated rather than causative factors when going beyond the overarching hypotheses regarding life-history strategy.
Population structure is the most obvious force that can generate large N<sub>e</sub> changes mimicking the census-size-focused patterns the authors discuss. The authors argue in the discussion that since they focus on relatively deep time (>50kya at least, with most analyses focusing on the 5mya - 500kya range) population structure is "likely to become less important", and the resident species are usually more structured today (true) which might bias the findings against the observed higher N<sub>e</sub> in migrants.
To clarify, the patterns we discuss are entirely related to effective population size, not census size. But, yes, this is why we’ve given population structure its own section in the Discussion.
But is structure really unimportant in driving PSMC results at these specific timescales? There is no numerical analysis presented to support the claim in this paper. The biogeographic model of increased temperate-latitude land area supporting higher populations could yield high N<sub>e</sub> via high census size, but shifts in population structure (for example, from one large panmictic population to a series of isolated refugial populations as a result of glaciation-linked climate changes) could plausibly create elevated and more variable N<sub>e</sub>. Is it more land area and ecological release leading to a bigger and faster initial N<sub>e</sub> bump, or is it changes in population connectivity over time at expanding range edges, or is the whole single-bump PSMC trace an artifact of the dataset size, or what? The authors have convinced me that the N<sub>e</sub> history of migratory thrushes is on average very different from nonmigrant thrushes, but beyond that it's unclear what exactly we've learned here about the underlying process.
We do not argue that population structure is unimportant, only that it is less important as one goes into deeper time. Further, we agree with the reviewer’s observation above that structure is more likely to bias nonmigrant estimates of N<sub>e</sub>. In other words, following Li & Durbin’s (2011) simulations, we interpret that an inflated N<sub>e</sub> due to structure should occur more often among residents. We have clarified this in the revision. We also agree that what we’ve learned about the underlying process is not entirely clear, but as we stated, population structure does not seem to be the main driver, and there is evidence that both biogeographic and ecological factors are involved. With this being the first time that these questions have been asked, we think we’ve made an important advance and that we’ve opened a number of avenues for future study.
It is also important to consider the time scales involved and the sampling regime. Glacial-interglacial cycles averaged ~100 Kyr back to 0.74 Mya and then averaged ~41 Kyr from then back to 2.47 Mya; about 50-60 of these cycles occurred (Lisiecki & Raymo 2005: fig. 4). This probably caused a lot of population structuring and mixing in these lineages. In addition, in the PSMC output from one of our lineages, C. ustulatus swainsonii, we find that there are 54 time segments sampled for the Pleistocene, indicating the inadequacy of this method to reflect fine-scale changes and suggesting that each estimate is capturing a lot of both phenomena, structuring and mixing. We have added this to the revision.
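As a rough check on the figures quoted above, the number of cycles and the average time covered by each PSMC segment can be worked out directly. This is a back-of-the-envelope sketch: it assumes the 54 segments span the same 2.47 Myr window and treats them as evenly spaced, which PSMC time intervals are not, so the per-segment span is only an average.

```python
# Averages quoted in the response (attributed there to Lisiecki & Raymo 2005).
RECENT_CYCLE_KYR = 100.0   # mean cycle length from the present back to 0.74 Mya
OLDER_CYCLE_KYR = 41.0     # mean cycle length from 0.74 Mya back to 2.47 Mya
TRANSITION_KYA = 740.0
OLDEST_KYA = 2470.0
PSMC_SEGMENTS = 54         # time segments reported for C. ustulatus swainsonii


def estimate_cycles(transition=TRANSITION_KYA, oldest=OLDEST_KYA,
                    recent_cycle=RECENT_CYCLE_KYR, older_cycle=OLDER_CYCLE_KYR):
    """Approximate number of glacial-interglacial cycles in the window."""
    recent = transition / recent_cycle
    older = (oldest - transition) / older_cycle
    return recent + older


def mean_segment_span_kyr(oldest=OLDEST_KYA, segments=PSMC_SEGMENTS):
    """Average time per segment, assuming (unrealistically) even spacing."""
    return oldest / segments


if __name__ == "__main__":
    print(f"Estimated cycles: {estimate_cycles():.1f}")
    print(f"Mean span per PSMC segment: {mean_segment_span_kyr():.1f} Kyr")
```

Under these assumptions the cycle count lands near the lower end of the 50-60 range, and the mean segment span is of the same order as a single ~41 Kyr cycle, which is consistent with the point that each estimate averages over at least one full episode of structuring and mixing.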
I generally agree with the authors that "at present there is no way to fully disentangle the effects of population structure and geographic space on our results". But given that, I think there are two options - either we can fully acknowledge that oversimplified demographic models like PSMC cannot be interpreted as supporting evidence of any particular mechanistic or biogeographic hypothesis and stop trying to use them to do that, or we have to do our best to understand specifically which models can be distinguished by the analyses we're employing.
Short of developing some novel theory deep in the PSMC model, I think readers would need to see simulations showing that the analyses employed in this paper are capable of supporting or refuting their biogeographic hypothesis before viewing them as strongly supporting a specific biogeographic model. Tools like msprime and stdpopsim can be used to simulate genome-scale data with fairly complex biogeographic models. Running simulations of a thrush-like population under different biogeographic scenarios and then using PSMC to differentiate those patterns would be a more convincing argument for the biogeographic aspects of this paper. The other benefit of this approach would be to nail down a specific quantitative version of the taxon cycles model referenced in the abstract, and it would allow the authors to better study and explain the motivation behind the specific summary statistics they develop for PSMC posthoc analysis.
These could very well be fruitful pursuits for future work, but they are beyond the scope of this paper. The impossibility of reconstructing ranges through deep time makes anything other than the very general biogeographic hypothesis we’ve posed an uncertain pursuit. Also, a purely biogeographic approach neglects the likelihood of ecological expansion also being involved. We get at the importance of the latter in the “Geography and evolutionary ecology” section of the Discussion. Below, the editor states that discussions among reviewers indicate that simulations are not warranted at this time. We agree that the complexities involved are substantial, to the point of making direct relevance to this empirical study uncertain (especially in such an among-lineage context). Regarding taxon cycles, we merely point out that that conceptual framework seems relevant given our findings. This was not even remotely anticipated at the outset of the study, so we are reluctant to do anything more than point out its possible relevance in several aspects of the results. Finally, the study’s summary statistics were motivated entirely by the hypotheses, as given in Methods, and due to an earlier error (noted above), there are no post-hoc analyses in the revision. Sorry for the needless confusion.
Reviewer #2 (Public Review):
Summary:
Winker and Delmore present a study on the demographic consequences of migratory versus resident behavior by contrasting the evolutionary history of lineages within the same songbird group (thrushes of the genus Catharus).
Strengths:
I appreciate the test-of-hypothesis design of the study and the explicit formulation of three main expectations to test. The data analysis has been done with appropriate available tools.
Weaknesses:
The current version of the paper, with the case study chosen, the results, and the relative discussion, is not satisfying enough to support or reject the hypotheses here considered.
Given the stated strengths, the weaknesses noted seem a little incongruous, but we understand from the comments below that the reviewer would like to see the study redesigned and expanded.
The authors hypothesized that the wider realized breeding and ecological range characterising migrants versus resident lineages could be a major drive for increased effective population size and population expansion in migrants versus residents. I understand that this pattern (wider range in migrants) is a common characteristic across bird lineages and that it is viewed as a result of adapting to migration. A problem that I see in their dataset is that the breeding grounds range of the two groups are located in very different geographic areas (mainly South versus North America). The authors could have expanded their dataset to include species whose breeding grounds are from the two areas, regardless of their migratory behaviour, as a comparison to disentangle whether ecological differences of these two areas can affect the population sizes or growth rates.
Because the questions are about the migratory life history strategy and the best way to get at this is in a phylogenetic framework, we’re not sure how we could effectively add species “regardless of their migratory behavior.” Further, we know that migration causes lineages to experience variable ecological conditions that include breeding, migration, and wintering conditions. Obligate migrants are going to have different breeding ranges from their close relatives, and the more distantly related species are, the less likely it is that they respond to particular ecological conditions the same way. So we do not think that an approach that included miscellaneous species from northern and southern regions would strengthen this study. Here, the comparative framework of closely related lineages that possess or lack the trait of interest is a study design strength. We do agree, however, that future work is needed that does encompass more lineages (we would argue in a phylogenetic context), and that disentangling the effects of geography and ecology will also be an important future endeavor.
As I understand from previous literature, the time-scale to population growth and estimates of effective population sizes considered in the present paper for the resident versus migratory clades seem to widely predate the times to speciation for the same lineages, which were reported in previous work of the same authors (Everson et al 2019) and others (Termignoni-Garcia et al 2022). This piece of information makes the calculation of species-specific population size changes difficult to interpret in the light of lineages' comparison. It is unclear what the authors consider to be lineage-specific in these estimates, as the clades were likely undergoing substantial admixture during the time predating full isolation.
We do recognize that timing estimates vary among studies. Differences among studies in important variables like markers, methods, generation time, and mutation or substitution rates create much of this uncertainty. Also, we are not confident in prior dating efforts in this group, largely because of gene flow and its effects on bringing estimates closer to the present. As we point out (line 485), differences among studies on these issues do not detract from the strengths here for within-study, among-lineage contrasts. In short, the timing could be off in an among-study context (and likely is with prior work, given gene flow), but relative performance of among-lineage N<sub>e</sub> differences is less susceptible to these factors. This was shown fairly well in Li & Durbin’s initial use of the method among human populations. Regarding substantial admixture, PSMC curves often unite at their origins with sister lineages (when they were the same lineage). A good example is with the two C. guttatus E & W curves in Fig. S3, which still have substantial gene flow today (they are subspecies and in contact), yet they show remarkably different N<sub>e</sub> curves through their history. It is not possible to mark a cutoff point for each lineage that represents the cessation of admixture with another lineage (e.g., Everson et al. 2019 showed substantial admixture between three full species in this group); that period can be very long (Price et al. 2008), varies among lineages, and will not be available for deeper lineage divergences in the phylogeny. We therefore chose to use all of the time intervals retrievable from the genomic data in each lineage, considering that this uniform treatment is the best approach for our among-lineage comparison. And note that we were careful to label these as “the lineages’ PSMC inception” (line 190).
Regarding the methodological difficulties in interpreting the impact of population structure on the estimates of effective population sizes with the PSMC approach, I would think that performing simulations to compare different scenarios of different degrees of structured populations would have helped substantially understand some of the outcomes.
The complexities of such modeling in a system like this are daunting. The different degrees of structuring among all of these lineages across just a single glacial-interglacial cycle would necessitate a lot of guesswork; projecting that back across 50-60 such cycles just in the Pleistocene would probably end up being fiction. Disentangling the effects of structure versus changes in N<sub>e</sub> in a system like this would probably not be possible with that approach and these data. As noted above and below, there was agreement among reviewers and the editor that simulations in this case are not warranted for revision. We have added the nature of the glacial-interglacial cycles and the PSMC sampling time segments to help readers understand this better (see above in response to R1, and lines 272-278).
Additionally, I have struggled to understand if migratory behaviour in birds is considered to be acquired to relieve species competition, or as a consequence of expanded range (i.e., birds expand their range but their feeding ground is kept where speciation occurred as to exploit a ground with higher quality and abundance of seasonal local resources).
The origins of migration have been a struggle for researchers since the subject was taken up. But how the trait was acquired among these species does not really matter for our study. Here, migratory lineages possess different biogeographic+ecological attributes than their close relatives that are sedentary. Our focus is on the presence and absence of this life-history trait.
The points raised above could be considered to improve the current version of the paper.
Thank you. We appreciate the opportunity to guide our revision using your comments.
Reviewer #3 (Public Review):
Summary:
This paper applies PSMC and genomic data to test interesting questions about how life history changes impact long-term population sizes.
Strengths:
This is a creative use of PSMC to test explicit a priori hypotheses about seasonal migration and N<sub>e</sub>. The PSMC analyses seem well done and the authors acknowledge much of the complexity of interpretation in the discussion.
Weaknesses:
The authors use an average generation time for all taxa, but the citations imply generation time is known for at least some of them. Are there differences in generation time associated with migration? I am not a bird biologist, but quick googling suggests maybe this is the case (https://doi.org/10.1111/1365-2656.13983). I think it important the authors address this, as differences in generation time I believe should affect estimates of N<sub>e</sub> and growth.
Good point. The study cited by the reviewer encompasses a much higher degree of variation in body size and thus generation time. Differences in generation time in similarly sized close relatives, as in our study, should be small, and our approach has been to average those that are known. Unfortunately, generation times are not known for all of these species, but given their similarity in size we can have reasonable confidence in their being similar. We used data from the life-history research available (as cited) to obtain our average; there are not appropriate data for the residents, though. However, there is thought to be a generation time cost to seasonal migration in birds, and Bird et al. (2020) included this in their estimates to provide modeled values for all of the lineages we studied. We’re leery of using modeled values where good data for the nonmigrants in this group don’t exist (and the basis for quantifying this cost is tiny), but we recognize that this second approach is available and could leave some doubt in our results if not pursued. So we re-did everything with the modeled generation times of Bird et al. (2020). As expected, most of the differences are time-related. Importantly, our overall results are not different. We present them as Table S2 and have added the details on this to the Methods.
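To illustrate why a change in generation time would be expected to shift mainly the time axis, here is a minimal sketch of the scaling conventionally applied to PSMC output. It assumes a fixed per-generation mutation rate and the usual 100-bp bin size; the function name and all numeric values are hypothetical and are not taken from the study. If the mutation rate were instead specified per year and converted with the generation time, N<sub>e</sub> would change as well.

```python
def scale_psmc(theta0, t_k, lambda_k, mu_per_gen, gen_time_years, bin_size=100):
    """Convert scaled PSMC output to years and effective population sizes.

    Assumed conventional scaling (hypothetical helper, not the authors' code):
      N0      = theta0 / (4 * mu * bin_size)
      years_k = 2 * N0 * t_k * generation_time
      Ne_k    = N0 * lambda_k
    """
    n0 = theta0 / (4.0 * mu_per_gen * bin_size)
    years = [2.0 * n0 * t * gen_time_years for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return years, ne


# Hypothetical scaled output for three time segments.
theta0 = 0.02
t_k = [0.0, 0.5, 2.0]
lambda_k = [1.0, 3.0, 1.5]
mu = 2.5e-9  # assumed per-generation mutation rate

years_short, ne_short = scale_psmc(theta0, t_k, lambda_k, mu, gen_time_years=2.0)
years_long, ne_long = scale_psmc(theta0, t_k, lambda_k, mu, gen_time_years=3.0)

print("Ne (g = 2 yr):", ne_short)
print("Ne (g = 3 yr):", ne_long)
print("Years (g = 2 yr):", years_short)
print("Years (g = 3 yr):", years_long)
```

Under these assumptions the two runs give identical N<sub>e</sub> values, while every time point is stretched by the ratio of the generation times.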
The writing could be improved, both in the introduction for readers not familiar with the system and in the clarity and focus of the discussion.
We have added a phylogeny (new Fig. 1) to help readers better understand the system, and we’ve re-worked the Discussion to make it clearer what is clarified by our results and what remains unclear.
Recommendations for the authors:
Reviewing Editor comment:
I note that discussion among the reviewers made clear that simulations are probably not the right answer given the complexity of the modeling required.
We appreciate this conclusion, with which we agree.
Reviewer #2 (Recommendations For The Authors):
Apologies for the delay with the review, which came at a very busy time. I hope you will find my comments helpful.
Thanks. Your comments are helpful, and we fully understand how reviews (and our revisions!) have to wait until more pressing needs are addressed.
I enjoyed reading the manuscript but I believe that the discussion sections could be heavily rewritten for better clarity. The discussion is sometimes redundant and lacks some flow/clarity. In a nutshell, I had the feeling that a bit of everything is thrown in the discussion but clear conclusions are not made.
Yes, the Discussion has been difficult to write, because more issues arose in the Results than we anticipated at the outset. We feel that discussing them is relevant, but we agree that much remains unclear. This coupling of paleodemographics with geography and ecology is a new area, which opens some important new (and relevant) areas to consider. So clarity is not possible in some areas. We’ve revised to point out where we do have clarity (e.g., in migrant lineages having different paleodemographic attributes than nonmigrants) and where only further study can provide clarity (e.g., in the roles of geography versus ecology). The journal format does not seem to have secondary subheaders, but we’ve used bold in one place to highlight ‘ecological mechanisms’ to offset that section, one of the more complex. We’ve also added a paragraph in the conclusions to clarify where we have clear takeaways and where uncertainties remain.
Reviewer #3 (Recommendations For The Authors):
The introduction should engage the reader with biology, not the use of demographic methods or genomics (both of which have been around for more than a decade). I would drop the first paragraph and considerably expand the second. What has previous research on ecology/behavior/genetics found regarding the demographic effects of seasonal migration?
There are two important aspects to our study: 1) using paleodemographic methods to test hypotheses about adoption of a major life-history trait—an important biological question regardless of system, and so far (surprisingly) unaddressed; and 2) using this novel approach to study the effects of one such trait, seasonal migration. At these timescales, nothing exists on this subject, so there is really nothing to expand with. If there is relevant literature that we’ve missed, we’d be happy to add it.
What is the missing bit of information or angle the current study addresses (other than just doing it larger and fancier with genomics)?
The effects of major life-history traits on paleodemographics have not been addressed before, to our knowledge. The whole context is new, so we’re not doing something “larger and fancier” with genomics. We are doing something that has not been done before: testing hypotheses about the effects of a major life-history trait on population sizes in evolutionary time. We’re not sure how this can be made clearer. To us this seems like a very engaging biological question with wide applicability. We hope that this study is just the first of many to come, in a diversity of biological systems.
A figure showing the phylogenetic relationships of these taxa which are migratory would help the reader immensely. Although this is shown in Fig S3 I think it might be nice to have a map of the species and their ranges alongside a phylogeny as a main figure early on.
Thank you. This is a good suggestion. We can’t fit a phylogeny and all the distribution maps (Fig. S1) onto a page, but we can include a phylogeny as one of the main figures with nonmigrants highlighted. We’ve inserted this as a new Fig. 1.
If I understand correctly, the authors' arguments for why migratory species should show more growth hinge on large range size and geographic expansion. Yet they argue in the discussion that these forces are unlikely to be important (L226). I found the discussion on this confusing (e.g. L231 then says maybe it does matter). I think more clarity here would be helpful.
Our argument and predictions are based both on geographic and ecological expansion. This was clearly stated as our third prediction “3) early population growth would be higher as seasonal migration opens novel ecological and geographic space…” We have gone back through and reiterated the coupling of these two factors. The line mentioned concludes the first paragraph in the section ‘Geography and evolutionary ecology,’ which focuses on the difficulty of decoupling these in this system. As the paragraph relates, geography alone does not seem to be driving our results (we do not argue that it is unimportant).
I also would have liked more time in the discussion addressing why variation in N<sub>e</sub> may be higher in migratory lineages.
In addition to re-clarifying this in the Introduction, we have touched back on this now at line 221: “We attribute the higher variation in N<sub>e</sub> among migrants to be the result of the relative instability of northern biomes compared with tropical ones through glacial-interglacial cycles (e.g., Colinvaux et al., 2000; Pielou, 1991).”
Minor comments:
L 62: Presumably PSMC is limited by the coalescent depth of the genealogy, which may be younger or older than population "origins" depending on the history of colonization, lineage splitting, gene flow, etc.
We were careful to phrase these as “the lineages’ PSMC inception” (line 190), and responded to this issue in more detail above in response to R2’s public review.
L 338: I think a few more details on PSMC would be helpful. Was no maskfile used?
We did not use a maskfile, choosing instead to generate data of decent coverage and aligning reads to a single closely related relative.
Did the consensus fasta include all species?
No, we used a single reference high-quality fasta of Catharus ustulatus, as reported (lines 434-37). We have added that “Identical treatment of all lineages in these respects should provide a strong foundation for a comparative study like this among close relatives.”
L 361: Fair to assume the authors used a weighted average of N<sub>e</sub> from the output, rather than just averaging the N<sub>e</sub> values from each time segment?
No – we used all the values of N<sub>e</sub> produced by PSMC output. The PSMC method uses nonoverlapping portions of the genome in its analyses (which we’ve added to make that clear), and portions in juxtaposition will often provide data for very different periods in the time segments. Further, time segments are uneven within and among taxa, so it is not clear how a uniform and comparable weighting scheme could be implemented. We consider a uniform approach to be of primary importance, including for future comparisons among studies.
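The difference between the two averaging schemes discussed here can be sketched as follows. This is an illustration only: the response states that all N<sub>e</sub> values were used without weighting, and the duration-weighted variant is one possible reading of the reviewer's question. The function names, segment boundaries and N<sub>e</sub> values are hypothetical.

```python
from statistics import mean


def unweighted_mean_ne(ne_values):
    """Plain average of the per-segment Ne values (every segment counts equally)."""
    return mean(ne_values)


def duration_weighted_mean_ne(segment_starts, ne_values, end_time):
    """Average Ne weighted by how long each time segment lasts.

    segment_starts[k] is the start of segment k; the final segment ends at end_time.
    """
    edges = list(segment_starts) + [end_time]
    durations = [edges[k + 1] - edges[k] for k in range(len(ne_values))]
    weighted_sum = sum(d * n for d, n in zip(durations, ne_values))
    return weighted_sum / sum(durations)


# Hypothetical, unevenly spaced time segments (years) and Ne estimates.
segment_starts = [0, 10_000, 30_000, 100_000]
ne_values = [50_000, 200_000, 100_000, 400_000]
end_time = 300_000

plain = unweighted_mean_ne(ne_values)
weighted = duration_weighted_mean_ne(segment_starts, ne_values, end_time)
print(plain, weighted)
```

Because the segments are uneven, the weighted mean is dominated by the longest (oldest) segment, which shows how the two schemes can diverge and why a single, uniform convention matters for comparisons among taxa.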
L 383 "delta" typo
Thank you for catching this.
L 93: I'd be tempted to present the questions (how does seasonal migration affect population size trajectory, means, and variation) and rationale before presenting the hypotheses. I found myself reading the hypotheses and wondering "why?"
We’ve tried this change in the revision. It makes the hypotheses a little harder to pull out (they are no longer numbered in a short sequence), but it is shorter and solves this concern.
L 337 read depth is usually expressed as X (e.g. "23X") rather than bp.
Changed.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This important study further validates DNAH12 as a causative gene for asthenoteratozoospermia and male infertility in humans and mice. The data supporting the notion that DNAH12 is required for proper axonemal development are generally convincing, although more experiments would solidify the conclusions. This work will interest reproductive biologists working on spermatogenesis and sperm biology, as well as andrologists working on male fertility.
We thank the editor and the two reviewers for their time and careful evaluation of our manuscript. We sincerely appreciate their encouraging feedback and insightful guidance on improving our study. In the revised manuscript, we have performed additional experiments and provided quantitative data regarding the reviewers' comments.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Even though this is not the first report that the mutation in the DNAH12 gene causes asthenoteratozoospermia, the current study explores the sperm phenotype in-depth. The authors show experimentally that the said mutation disrupts the proper axonemal arrangement and recruitment of DNALI1 and DNAH1 - proteins of inner dynein arms. Based on these results, the authors propose a functional model of DNAH12 in proper axonemal development. Lastly, the authors demonstrate that the male infertility caused by the studied mutation can be rescued by ICSI treatment at least in the mouse. This study furthers our understanding of male infertility caused by a mutation of axonemal protein DNAH12, and how this type of infertility can be overcome using assisted reproductive therapy.
Strengths:
This is an in-depth functional study, employing multiple, complementary methodologies to support the proposed working model.
Thank you for your recognition of the strength of this study. Your positive feedback motivates us to continue refining our research and methodological rigor in future studies.
Weaknesses:
The study strength could be increased by including more controls such as peptide blocking of the in-house raised mouse and rat DNAH12 antibodies, and mass spectrometry of control IP with beads/IgG only to exclude non-specific binding. Objective quantifications of immunofluorescence images and WB seem to be missing. At least three technical replicates of western blotting of sperm and testis extracts could have been performed to demonstrate that the decrease of the signal intensity between WT and mutant was not caused by a methodological artifact.
Thank you for your comments. To study DNAH12 in depth, we analyzed the sequence features of the DNAH12 protein and selected amino acids 1-200 as the antigen on the basis of (1) high immunogenicity; (2) high hydrophilicity; (3) good surface leakage groups; and (4) sequence homology analysis to avoid nonspecific recognition of other proteins. The two anti-DNAH12 antibodies were developed with the help of Dia-An Biotech in 2022. We tried to obtain the polypeptide fragments of the target protein for peptide blocking, but the material had been discarded after the service. Nevertheless, western blotting detected the target DNAH12 band, which was absent in the knockout group; likewise, the DNAH12 immunofluorescence signals were strong but absent in the knockout group. In addition, we tested the in-house raised rabbit antibody and found it suitable for IP: it immunoprecipitated the DNAH12 band in Dnah12<sup>+/+</sup> mice but not in Dnah12<sup>-/-</sup> mice. Collectively, these data support the specificity of the raised DNAH12 antibodies. For the IP assay, we added an IgG group to the IP-mass spectrometry experiment to exclude non-specific binding; the experimental design is described in Figure 6B. The raw data were deposited in the iProX partner repository (accession number: PXD051681), and we have coordinated with the repository manager to make the data publicly accessible (https://www.iprox.cn/page/subproject.html?id=IPX0008674001).
In addition, we repeated the western blotting of sperm and testis extracts at least three times and added objective quantification of the immunofluorescence signals and WB images. The quantification of the blots is shown in the figures to help readers interpret these results.
Reviewer #2 (Public Review):
Summary:
The authors first conducted whole exome sequencing for infertile male patients and families where they co-segregated the biallelic mutations in the Dynein Axonemal Heavy Chain 12 (DNAH12) gene.
Sperm from patients with biallelic DNAH12 mutations exhibited a wide range of morphological abnormalities in both tails and heads, reminiscent of asthenoteratozoospermia, a prevalent cause of male infertility. To deepen the mechanistic understanding of DNAH12 in axonemal assembly, the authors generated two distinct DNAH12 knockout mouse lines via CRISPR/Cas9, both of which showed more severe phenotypes than observed in patients. Ultrastructural observations and biochemical studies revealed the requirement of DNAH12 in recruiting other axonemal proteins and that the lack of DNAH12 leads to the aberrant stretching in the manchette structure as early as stage XI-XII. Finally, the authors proposed intracytoplasmic sperm injection as a potential measure to rescue patients with DNAH12 mutations, where the knockout sperm culminated in the blastocyst formation with a comparable ratio to that in WT.
Strengths:
The authors convincingly showed the importance of DNAH12 in assembling cilia and flagella in both human and mouse sperm. This study is not a mere enumeration of the phenotypes, but a strong substantiation of DNAH12's essentiality in spermiogenesis, especially in axonemal assembly.
The analyses conducted include basic sperm characterizations (concentration, motility), detailed morphological observations in both testes and sperm (electron microscopy, immunostaining, histology), and biochemical studies (co-immunoprecipitation, mass-spec, computational prediction). Molecular characterizations employing knockout animals and recombinant proteins beautifully proved the interactions with other axonemal proteins.
Many proteins participate in properly organizing flagella, but the exact understanding of the coordination is still far from conclusive. The present study gives the starting point to untangle the direct relationships and order of manifestation of those players underpinning spermatogenesis. Furthermore, comparing flagella and trachea provides a unique perspective that attracts evolutional perspectives.
Thank you for your thoughtful and positive feedback. We are delighted that you found our study to be a strong substantiation of DNAH12's essential role in spermiogenesis, particularly in axonemal assembly. We believe that this study represents a meaningful step toward unraveling the intricate coordination of axonemal proteins during spermatogenesis, and your comments further inspire us to continue exploring these complex mechanisms in future work. Thank you once again for your valuable insights and summary of this work.
Weaknesses:
Seemingly minor, but the discrepancies found in patients and genetically modified animals were not fully explained. For example, both knockout mice vastly reduced the count of sperm in the epididymis and the motility, while phenotypes in patients were rather milder. Addressing the differences in the roles that the orthologs play in spermatogenesis would deepen the comprehensive understanding of axonemal assembly.
This is an interesting question. Although humans and mice share male infertility phenotypes when dynein proteins essential for sperm flagellar development are deficient, the phenotypes differ in some respects. For instance, deficiency in DNAH17 (Clin Genet. 2021. PMID: 33070343) or DNAH8 (Am J Hum Genet. 2020. PMID: 32619401; PMCID: PMC7413861), two other members of the Dynein Axonemal Heavy Chain family, has also been reported to cause a more severe phenotype in mice than in human patients carrying bi-allelic DNAH17 or DNAH8 loss-of-function mutations. In knockout mice, sperm counts are lower, and the proportion of abnormal sperm morphology is higher, whereas the phenotypes in human patients tend to be milder. These observations suggest that orthologs may influence spermatogenesis to slightly different extents in humans and mice. We plan to investigate the mechanisms underlying these discrepancies in future studies, which will provide deeper insights into axonemal assembly and the evolutionary aspects of spermatogenesis. Thank you again for bringing up this important issue.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
This reviewer is impressed by the study's depth and the extent of the methodology used in the study. The study is well-designed, and the results are very interesting. The reviewer's enthusiasm was reduced by the lack of some controls (provided that the reviewer did not miss them). Further are point-to-point suggestions that this reviewer believes will increase the merit of the present study.
Title:
(1) Why a "special" dynein? What makes it special when compared to other dyneins? I suggest removing the word special.
Through phylogenetic and protein domain analyses of the DNAH family, we found that DNAH12 is the shortest member and the only one that lacks a typical microtubule-binding domain (MTBD) in the DNAH family, thus we want to describe it as a “special” dynein. We have fully considered your valuable suggestion and decided to remove it from the title.
Abstract:
(2) L23: same as above, why special?
We identified DNAH12 as the shortest member of the DNAH family and uniquely lacking the typical microtubule-binding domain (MTBD). This distinct characteristic prompted us to describe it as a 'special' dynein in the abstract part.
(3) L37: the reviewer did not find a figure (neither main nor supplementary) that would demonstrate the proper organization of microtubules in cilia. Figure S11 only shows the presence of cilia in DNAH12-/- mouse. A TEM image of cilia is required to confirm or reject the claim that DNAH12 does not play a crucial role in proper microtubule organization in cilia.
We have now added TEM images of cilia in wild-type and Dnah12<sup>-/-</sup> mice. The ultrastructures of the ciliary axonemes were comparable in the wild-type and Dnah12<sup>-/-</sup> groups, suggesting that DNAH12 may not play a crucial role in proper microtubule organization in cilia. The results have now been added to Supplemental Figure 11F.
(4) L122-6: Did the authors also confirm these structures by cryo-EM? If not, this needs to be pointed out as a shortcoming in the discussion, that the structures and interactions are predicted in silico only.
Thank you for your comment. Owing to limited resources, we did not perform cryo-EM to confirm these structures. We will pursue the structural details at atomic resolution in a future study. We understand this point and have now noted it as a shortcoming in the Discussion.
(5) L134: Be more specific about what characteristics of DNAH12 were analyzed.
Thank you for your comment. We have now updated this in the Methods. The characteristics of DNAH12 that were analyzed include regional immunogenicity, hydrophilicity, surface leakage groups, and sequence homology.
(6) L137: Be more specific about how the antibodies were validated. Were the antibodies validated for both immunofluorescence and western blotting? I suggest doing peptide blocking of the antibody, for instance for ICC, preincubation of ab with immunizing peptide followed by primary ab incubation with studied cells/tissues.
Thank you for your comments and suggestions. We validated the antibodies for both immunofluorescence and western blotting to ensure their effectiveness in our experiments. The two anti-DNAH12 antibodies were developed with the help of Dia-An Biotech in 2022. We attempted to obtain the polypeptide fragments of the target protein for peptide blocking, but the material had been disposed of after the service. Nevertheless, western blotting detected a strong target DNAH12 band, which was absent in the knockout group; likewise, the DNAH12 immunofluorescence signals were strong but absent in the knockout group. In addition, the IP experiment showed that the raised rabbit antibody immunoprecipitated the DNAH12 band in Dnah12<sup>+/+</sup> mice but not in Dnah12<sup>-/-</sup> mice. Collectively, these data support the specificity of the raised DNAH12 antibodies. We sincerely appreciate your suggestion and will request the peptide material if we develop new antibodies.
(7) L142: This reviewer is unfamiliar with using TRIzol for sperm protein extraction. Is there a specific reason for not using PAGE loading buffer for human sperm protein extraction?
Thanks for your suggestions. TRIzol reagent can be used for small samples (5×10<sup>6</sup> cells) as well as large samples (>10<sup>7</sup> cells), and it is suitable for extracting RNA and protein at the same time. Our lab adopted this method in previous work (Hum Reprod Open. 2023; PMID: 37325547; PMCID: PMC10266965). It is very useful for processing small amounts of valuable samples. SDS sample buffer (PAGE loading buffer) was added to the human sperm protein extracts before SDS-PAGE separation. We have added this detail to the Methods and apologize for the confusion.
(8) L144: Were these the final concentrations of the SDS loading buffer? 1 × Laemmli buffer contains 62.5 mM TRIS, 2% (w/w) SDS, 10 % (w/v) glycerol, and 5% 2-mercaptoethanol. Please, amend accordingly.
Thanks for your suggestions. We apologize for the incorrect labelling of the concentrations (the previous values were for 3× SDS loading buffer). We have now amended the SDS loading buffer to 1 × Laemmli buffer as suggested.
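The relationship between a 3× stock and a 1× working concentration follows from simple mass balance. The sketch below is an illustration only, with a hypothetical function name and sample volume; it is not the authors' protocol.

```python
def stock_volume_for_final(sample_volume_ul, stock_x, final_x=1.0):
    """Volume of concentrated loading buffer to add to a sample.

    Solves  V_stock * stock_x = (V_sample + V_stock) * final_x  for V_stock.
    """
    if stock_x <= final_x:
        raise ValueError("stock must be more concentrated than the final mix")
    return sample_volume_ul * final_x / (stock_x - final_x)


# Hypothetical example: 20 µL of sample mixed with 3x buffer to reach 1x.
sample_ul = 20.0
buffer_ul = stock_volume_for_final(sample_ul, stock_x=3.0)
final_x = buffer_ul * 3.0 / (sample_ul + buffer_ul)
print(buffer_ul, final_x)
```

In other words, one part of 3× buffer to two parts of sample gives a 1× final concentration, so each listed component is present at one third of its 3× value.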
(9) L151: Table S2 contains other homemade antibodies than DNAH12. Please, include references to the studies where the generation and validation of these antibodies is described.
Thank you for your suggestions. We have developed a DNAH1 antibody for use in Western blot assays, with its generation and validation detailed in Frontiers in Endocrinology (Lausanne), 2021 (PMID: 34867808; PMCID: PMC8635859). Additionally, we have produced a DNAH17 antibody for both immunofluorescence (IF) and Western blot, as described in Journal of Experimental Medicine, 2020 (PMID: 31658987; PMCID: PMC7041708). These references have now been included.
(10) L167: Please, spell out ICR at its first appearance.
Done as suggested, Thank you. The full name of ICR is Institute of Cancer Research.
(11) L169: This reviewer is confused. It seems that the mouse encodes DNAH12 on exons 5 and 18 simultaneously. Each mouse model has only one exon targeted for a knockout. Would not this mean that the expression of DNAH12 in both models is not completely knocked down? Please, give more background in this paragraph for those less familiar with CRISPR/Cas9.
Thank you for your insightful comment. We appreciate your attention to detail. To clarify, while the mouse model does indeed encode DNAH12 on exons 5 and 18 simultaneously, we specifically targeted the key exon 5 or exon 18 in each model to achieve different knockout strategies. This approach allows us to assess the functional implications of any remaining DNAH12 expression in both models. We checked DNAH12 expression in both models, and DNAH12 protein was undetectable in both, indicating a complete knockout of DNAH12 in each. Additionally, we will revise the manuscript to include further details on the CRISPR/Cas9 methodology, ensuring accessibility for readers less familiar with this technique. Thank you again for your valuable feedback, which we believe will greatly enhance our manuscript.
(12) L201: 50 % PBS? As in 0.5 x concentrated PBS? Please, rewrite for clarity.
The term "50% PBS" refers to a 1:1 dilution of phosphate-buffered saline (PBS) with an appropriate diluent, resulting in a final concentration of 0.5x PBS. We will revise the text to explicitly clarify this, ensuring it is clear to all readers. Thank you for highlighting this point.
(13) L224: Please, state what beads those were (magnetic/agarose, conjugated to protein A/G...) Include catalog # and manufacturer.
Thank you for your suggestion. We have updated the manuscript to include this information. The beads used were Protein A/G Magnetic Beads (Catalog #B23202, Bimake, Texas, USA).
(14) L227: What was the reason for adding a proteasomal inhibitor? What concentration was used? Please, add this information to the text.
We added MG132 in the cell immunoprecipitation (IP) experiments to inhibit proteasomal activity and thereby prevent degradation of the target protein. This helps maintain the stability of the target protein during the experiment (Sci Adv. 2022. PMID: 35020426; PMCID: PMC8754306), enhancing its detectability in subsequent analyses. MG132 was added at 5 μM. We have added this information to the revised manuscript.
(15) L233: in vivo IP of mouse testis lysate? This does not make sense. I suggest removing "in vivo".
Thank you for your careful review and comments on our manuscript. We have modified as suggested.
(16) L317: Supplemental Figure 6 precedes Supplemental Figure 5 in the text, which is neither logical nor orderly.
Thank you for your suggestion. Since the N-terminal DNAH12 antibody is already described in the Methods section (L317), we propose removing Supplemental Figure 6 from the content to improve the logical flow and maintain an orderly presentation.
(17) L345 and elsewhere: how did the authors quantify the decrement of the signal? This needs to be measured objectively.
Thank you for your valuable suggestion. We quantified the signal intensity using Fiji (Nat Methods. 2012. PMID: 22743772; PMCID: PMC3855844), which allows for precise analysis of pixel intensity. The results are presented in the figures to effectively illustrate the decrement in signal intensity. We appreciate your suggestion, and we have provided a description of the method in our methodology section.
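The kind of measurement described here (mean pixel intensity in a region of interest, corrected for background) can be sketched in a few lines. This is not the Fiji workflow the authors used; it is a plain-Python stand-in with hypothetical function names, a hypothetical toy image, and hypothetical region coordinates.

```python
from statistics import mean


def roi_mean(image, top, left, height, width):
    """Mean pixel value inside a rectangular region of a 2D list of intensities."""
    return mean(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def background_corrected_intensity(image, signal_roi, background_roi):
    """Mean signal intensity minus mean background intensity."""
    return roi_mean(image, *signal_roi) - roi_mean(image, *background_roi)


# Hypothetical 4x4 image: a bright 2x2 spot on a uniform background.
image = [
    [10, 10, 10, 10],
    [10, 110, 110, 10],
    [10, 110, 110, 10],
    [10, 10, 10, 10],
]
signal_roi = (1, 1, 2, 2)      # top, left, height, width
background_roi = (0, 0, 1, 4)  # first row only

corrected = background_corrected_intensity(image, signal_roi, background_roi)
print(corrected)
```

Applying the same regions and the same background correction to wild-type and mutant images is what makes the resulting ratio an objective measure of signal loss.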
(18) L371: I recommend: ...and elongated spermatids; the abnormal...
Done as suggested. Thank you.
(19) L412-4: Cilia in both Dnah12<sup>mut/mut</sup> and Dnah12<sup>-/-</sup> are developed, but are they motile or immotile? This needs to be investigated. Is the DNAH12 in cilia truncated while still fulfilling its function?
Thanks for your comment. We examined ciliary motility using an inverted microscope and observed no significant difference between the knockout group and the control group, indicating that ciliary motility was not affected by DNAH12 deficiency. The N-terminal DNAH12 antibody was developed to detect whether a truncated protein is present in mouse tissues, and we did not detect DNAH12 signals by immunofluorescence on trachea sections of Dnah12<sup>-/-</sup> mice. These results indicate that DNAH12 may exert little influence on cilia, compared with its important function in flagella.
(20) L414-6: The results do not support this claim as the authors do not show that cilia are motile.
Thanks for your comment. Supplemental Videos 3-4, showing live tracheae of Dnah12<sup>+/+</sup> and Dnah12<sup>-/-</sup> mice, have been uploaded to support this conclusion.
(21) L421-3: Did the authors perform a negative test, where they let the testis lysate interact with beads/IgG only and performed the MS to identify non-specific binding? This is a crucial specificity test for this approach.
We have performed the negative test. In the IP assay, we added an IgG group to the IP-mass spectrometry to exclude non-specific binding, and the experimental design is described in Figure 6B. The raw data were deposited in the iProX partner repository (PXD051681); we have asked the repository manager to update the status to public so that the data will be visible to readers.
(22) L462: same as #18 the authors need to show that cilia are also motile. The mere presence of cilia in DNAH12-/- as shown in Fig S11C&D is not sufficient to conclude that the mice do not manifest PCD symptoms.
Thanks for your comment. We did not observe obvious differences between the cilia of Dnah12<sup>+/+</sup> and Dnah12<sup>-/-</sup> mice. Supplemental Videos 3-4, showing live tracheae of Dnah12<sup>+/+</sup> and Dnah12<sup>-/-</sup> mice, have been uploaded to show the motility of the tracheal cilia.
(23) L529: MTBD region instead of domain, as "domain" is already part of the abbreviation.
Done as suggested
(24) L875: Sperm is both the singular and plural form. Spermatozoon vs spermatozoa can be used where the distinction between singular and plural needs to be made.
Thanks for your suggestion. We have checked and changed this usage.
(25) Figure 3H: Is there a specific reason why P11 is not shown?
Because only a limited number of P11 smear slides were available, P11 had not previously been stained with the DNAH17 antibody. We have now performed this experiment, which showed that DNAH17 expression was not affected in patient P11. We have added this result to Figure 3H.
(26) Figure 8H: The authors in their MS do not describe what is happening to N-DRC proteins, yet they suggest in their model that it's unaffected in the mutant mouse/human. Please, address this in the MS and clearly state in the model that N-DRC needs further exploration in future studies.
Thanks for your suggestion. We checked the MS data but did not observe enrichment of nexin-dynein regulatory complex (N-DRC) proteins; only one known N-DRC protein, DRC1, was present, with just 1 unique peptide. Instead, enrichment of inner dynein arm proteins and radial spoke proteins was observed. However, we cannot determine whether the N-DRC structures are affected. We have stated this in the Discussion and will pursue this with high-resolution technology such as cryo-EM in the future.
(27) Figure 5F: Is it possible to choose a different Dnah12<sup>-/-</sup> spermatozoon to see a reduced level of DNALI1 so that it corresponds with the WB detection in Fig 5B?
Thanks for your suggestion, we have chosen a Dnah12<sup>-/-</sup> spermatozoon with faint remnants of the DNALI1 signal as the representative picture.
(28) Figure S2 and elsewhere: How were the authors able to resolve and calibrate 356 kDa protein using SDS PAGE? Agarose electrophoresis protein electrophoresis is more suitable for resolution of high molecular proteins. Most of the protein standards have as high molecular standard as 250 kDa.
We have found that high-molecular-weight proteins (e.g., 356 kDa) can be resolved on 4-12% gradient polyacrylamide gels by using appropriate voltages and longer run times during electrophoresis. The DNAH12 protein was calibrated using the HiMark™ Pre-Stained High Molecular Weight Protein Standard (30-460 kDa). We have now updated the blot images to show the size of the DNAH12 protein (Fig. S6B). The target band lies clearly between 268 kDa and 460 kDa, which makes it easy to identify the band detected by the DNAH12 antibody elsewhere. Thank you for your suggestion.
(29) Figure S5: similar to #24: Why P10 and P11 are not shown?
Because only a limited number of smear slides from P10 and P11 were available, we had not previously stained them with the ODF2 antibody. We have now repeated the experiments, which showed that ODF2 expression was not affected in patient P10 or P11. We have added this result to Figure S5.
(30) Figure S6B: The specificity of the anti-DNAH12 antibody against mouse DNAH12 seems to be questionable since the authors detect multiple bands on WB. I recommend doing peptide blocking to show that these are non-specific binding as opposed to off-target binding.
Thank you for your comments. We analyzed the sequence features of the DNAH12 protein and selected amino acids 1-200 as the antigen because of its favorable properties: (1) high immunogenicity; (2) high hydrophilicity; (3) good surface leakage groups; and (4) sequence homology analysis to avoid unspecific recognition of other proteins. The two anti-DNAH12 antibodies were developed with the help of Dia-An Biotech in 2022. We attempted to obtain the polypeptide fragments of the target protein for peptide blocking, but the material had been disposed of after the service was completed. Nevertheless, the target DNAH12 band showed a strong signal in western blotting and was not detected in the knockout mice; likewise, the DNAH12 immunofluorescence signal was strong but was absent in the knockout mice. In addition, we confirmed that the in-house rabbit antibody is suitable for IP: it immunoprecipitated the DNAH12 band from Dnah12<sup>+/+</sup> mice but not from Dnah12<sup>-/-</sup> mice. Collectively, these data support the specificity of the DNAH12 antibodies. We appreciate your suggestion and will request the peptide material if we develop new antibodies.
Reviewer #2 (Recommendations For The Authors):
Recruitment of DNAH1 and DNALI1 to the flagella is dependent on DNAH12 expression, according to the data. What would be the mechanism that locates DNAH12 which lacks MTBD to the flagella?
Thank you for your insightful question. We are currently investigating the mechanisms that facilitate the loading of DNAH12 to the flagella. Based on existing data, we hypothesize that CCDC39 and/or CCDC40 may play a critical role in the recruitment of DNAH12 to sperm flagella during spermiogenesis (Nat Genet. 2011, PMID: 21131972; PMCID: PMC3509786; Nat Genet. 2011, PMID: 21131974; PMCID: PMC3132183). Furthermore, a structural study by Walton et al. showed that DNAH12 associates with CCDC39/CCDC40 proteins (Nature. 2023, PMID: 37258679; PMCID: PMC10266980). These findings suggest that CCDC39 and/or CCDC40 may play a role in facilitating the localization of DNAH12 to the flagella. Additional studies are needed to identify other potential factors involved in this process and to further elucidate the mechanisms underlying this complex biological phenomenon.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors claim that they can use a combination of repetitive transcranial magnetic stimulation (intermittent theta burst-iTBS) and transcranial alternating current stimulation (gamma tACS) to cause slight improvements in memory in a face/name/profession task.
Strengths:
The idea of stimulating the human brain non-invasively is very attractive because, if it worked, it could lead to a host of interesting applications. The current study aims to evaluate one such exciting application.
Weaknesses:
(1) It is highly unclear what, if anything, transpires in the brain with non-invasive stimulation. To cite one example of many, a rigorous study in rats and human cadavers, compellingly showed that traditional parameters of transcranial electrical stimulation lead to no change in brain activity due to the attenuation by the soft tissue and skull (Mihály Vöröslakos et al Nature Communications 2018): https://www.nature.com/articles/s41467-018-02928-3. It would be very useful to demonstrate via invasive neurophysiological recordings that the parameters used in the current study do indeed lead to any kind of change in brain activity. Of course, this particular study uses a different non-invasive stimulation protocol.
Thank you for raising the important issue of the actual neurophysiological effects of non-invasive brain stimulation. Unfortunately, invasive neurophysiological recordings in humans are not feasible for this type of study due to ethical constraints, while studies on cadavers or rodents would not fully resolve our question. Indeed, the authors of the cited study (Mihály Vöröslakos et al., Nature Communications, 2018) highlight the impossibility of drawing definitive conclusions about the exact voltage required in the in-vivo human brain, given the significant differences between rats and humans, and between the in-vivo human brain and cadavers, where electrical conductivity is altered in postmortem tissue.
We acknowledge that further exploration of this aspect would be highly valuable, and we agree that it is worth discussing both as a technical limitation and as a potential direction for future research; we will therefore revise the manuscript accordingly. However, to address the challenge of in vivo recordings, we conducted Experiments 3 and 4, which respectively examined the neurophysiological and connectivity changes induced by the stimulation in a non-invasive manner. The observed changes in brain oscillatory activity (increased gamma oscillatory activity), cortical excitability (enhanced posteromedial parietal cortex reactivity), and brain connectivity (strengthened connections between the precuneus and hippocampi) provided evidence of the effects of our non-invasive brain stimulation protocol, further supporting the behavioral data.
Additionally, we carefully considered the issue of stimulation distribution and, in response, performed a biophysical modeling analysis and E-field calculation using the parameters employed in our study (see Supplementary Materials).
(2) If there is any brain activity triggered by the current stimulation parameters, then it is extremely difficult to understand how this activity can lead to enhancing memory. The brain is complex. There are hundreds of neuronal types. Each neuron receives precise input from about 10,000 other neurons with highly tuned synaptic strengths. Let us assume that the current protocol does lead to enhancing (or inhibiting) simultaneously the activity of millions of neurons. It is unclear whether there is any activity at all in the brain triggered by this protocol, it is also unclear whether such activity would be excitatory, or inhibitory. It is also unclear how many neurons, let alone what types of neurons would change their activity. How is it possible that this can lead to memory enhancement? This seems like using a hammer to knock on my laptop and hope that the laptop will output a new Mozart-like sonata.
Thank you for your musical observation. As you correctly point out, we still do not have precise knowledge of which neurons—and to what extent—are activated during non-invasive brain stimulation in humans. However, this challenge is not limited to brain stimulation but applies to many other therapeutic interventions, including psychiatric medications, without limiting their use.
Nevertheless, a substantial body of research has investigated the mechanisms underlying the efficacy of TMS in enhancing memory performance, primarily through its ability to induce long-term potentiation (Bliss & Collingridge, The Journal of Physiology, 1993a; Huang et al., Clinical Neurophysiology, 2007; Ridding & Rothwell, Nature Reviews Neuroscience, 2007; Koch et al., Neuroimage 2018; Koch et al., Brain 2022; Jannati et al., Neuropsychopharmacology, 2023).
We acknowledge that we took this important aspect for granted, and we will expand the discussion accordingly.
(3) Even if there is any kind of brain activation, it is unclear why the authors seem to be so sure that the precuneus is responsible. Are there neurophysiological data demonstrating that the current protocol only activates neurons in the precuneus? Of note, the non-invasive measurements shown in Figure 3 are very weak (Figure 3A top and bottom look very similar, and Figure 3C left and right look almost identical). Even if one were to accept the weak alleged differences in Figure 3, there is no indication in this figure that there is anything specific to the precuneus, rather a whole brain pattern. This would be the kind of minimally rigorous type of evidence required to make such claims. In a less convincing fashion, one could look at different positions of the stimulation apparatus. This would not be particularly compelling in terms of making a statement about the precuneus. But at least it would show that the position does matter, and over what range of distances it matters, if it matters.
Thank you for your feedback. We will improve the clarity of the manuscript to better address this important aspect. Our assumption that the precuneus plays a key role in the observed effects is based on several factors:
(1) The non-invasive stimulation protocol was applied to an individually identified precuneus for each participant. Given existing evidence on TMS propagation, we can reasonably assume that the precuneus was at least a mediator of the observed effects (Ridding & Rothwell, Nature Reviews Neuroscience 2007). For further details about target identification and TMS and tACS propagation, please refer to the MRI data acquisition section in the main text and Biophysical modeling and E-field calculation section in the supplementary materials.
(2) To investigate the effects of the neuromodulation protocol on cortical responses, we conducted a whole-brain analysis using multiple paired t-tests comparing each data point between different experimental conditions. To minimize the type I error rate, data were permuted with the Monte Carlo approach and significant p-values were corrected with the false discovery rate method (see the Methods section for details). The results identified the posterior-medial parietal areas as the only regions showing significant differences across conditions.
(3) To control for potential generalized effects, we included a control condition in which TMS-EEG recordings were performed over the left parietal cortex (adjacent to the precuneus). This condition did not yield any significant results, reinforcing the cortical specificity of the observed effects.
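The permutation and false-discovery-rate steps described in point (2) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a sign-flip permutation of paired differences on the mean difference (rather than a t-statistic) and the Benjamini-Hochberg procedure as the FDR method, neither of which is specified in the response. The function names are hypothetical.

```python
import random
from statistics import mean


def paired_permutation_p(a, b, n_perm=5000, seed=0):
    """Two-sided Monte Carlo p-value for a paired difference (sign-flip test).

    Assumption: the test statistic is the absolute mean paired difference.
    """
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    hits = 0
    for _ in range(n_perm):
        # Under the null hypothesis each paired difference is equally
        # likely to be positive or negative, so flip signs at random.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1
    # The add-one correction avoids reporting an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In a whole-brain analysis, one permutation p-value would be computed per data point (for example, per electrode and time sample), and the full set would then be passed to `benjamini_hochberg` before thresholding.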
However, as stated in the Discussion, we do not claim that precuneus activity alone accounts for the observed effects. As shown in Experiment 4, stimulation led to connectivity changes between the precuneus and hippocampus, a network widely recognized as a key contributor to long-term memory formation (Bliss & Collingridge, Nature 1993). These connectivity changes suggest that precuneus stimulation triggered a ripple effect extending beyond the stimulation site, engaging the broader precuneus-hippocampus network.
Regarding Figure 3A, it represents the overall expression of oscillatory activity detected by TMS-EEG. Since each frequency band has a different optimal scaling, the figure reflects a graphical compromise. A more detailed representation of the significant results is provided in Figure 3B. The effect sizes for gamma oscillatory activity in the delta T1 and T2 conditions were 0.52 and 0.50, respectively, which correspond to a medium effect based on Cohen’s d interpretation.
(4) In the absence of any neurophysiological documentation of a direct impact on the brain, an argument in this type of study is that the behavioral results show that there must be some kind of effect. I agree with this argument. This is also the argument for placebo effects, which can be extremely powerful and useful even if the mechanism is unrelated to what is studied. Then let us dig into the behavioral results.
Hoping to have already addressed your concern regarding the neurophysiological impact of the stimulation on the brain, we would like to emphasize that the behavioral results were obtained controlling for placebo effects. This was achieved by having participants perform the task under different stimulation conditions, including a sham condition.
(4a) There does not seem to be any effect on the STMB task, therefore we can ignore this.
(4b) The FNAT task is minimally described in the supplementary material. There are no experimental details to understand what was done. What was the size of the images? How long were the images presented for? Were there any repetitions of the images? For how long did the participants study the images? Presumably, all the names and occupations are different? What were the genders of the faces? What is chance level performance? Presumably, the same participant saw different faces across the different stimulation conditions. If not, then there can be memory effects across different conditions that are even more complex to study. If yes, then it would be useful to show that the difficulty is the same across the different stimuli.
Thank you for pointing out the gaps in our description of the FNAT task. We will add all the requested information to the manuscript.
In the meantime, we provide the answers to your questions here. The images measured 19 x 15 cm. They were presented for 8 seconds each in the learning phase and in the immediate recall, while in the delayed recall they were shown (after the face recognition phase) until the subject answered. The learning phase, in which name and occupation were shown together with the faces, lasted around 2 minutes including the instructions. We used a different set of stimuli for each stimulation condition, for a total of 3 parallel task forms balanced across conditions and session order. Each parallel form comprised 6 male and 6 female faces; for each sex there were 2 young adults (aged around 30 years), 2 middle-aged adults (aged around 50 years), and 2 older adults (aged around 70 years). Before the experiments, we ran a pilot study to ensure there were no differences between the parallel forms of the task. We can provide the task and its parallel forms upon request. The chance level in the immediate and delayed recall is not quantifiable, since participants had to freely recall the name and the occupation without multiple-choice options. In the recognition phase, the chance level was around 33% (since there were 3 possible answers).
(4c) Although not stated clearly, if I understand FNAT correctly, the task is based on just 12 presentations. Each point in Figure 2A represents a different participant. Unfortunately, there is no way of linking the performance of individual participants across the conditions with the information provided. Lines joining performance for each participant would be useful in this regard. Because there are only 12 faces, the results are quantized in multiples of 100/12 % in Figure 3A. While I do not doubt that the authors did their homework in terms of the statistical analyses, it is difficult to get too excited about these 12 measurements. For example, take Figure 3A immediate condition TOTAL, arguably the largest effect in the whole paper. It seems that on average, the participants may remember one more face/name/occupation.
Thank you for the suggestion. We will add another graph to the manuscript with lines connecting each participant's performance. Unfortunately, we were not able to incorporate it in the box-and-whisker plot.
We apologize for the lack of clarity in the description of the FNAT. As you correctly pointed out, we used the percentage based on the single association between face, name and occupation (12 in total). However, each association consisted of three items, resulting in a total of 36 items to learn and associate – we will make it more explicit in the manuscript.
In the example you mentioned, participants were, on average, able to recall three more items compared to the other conditions. While this difference may not seem striking at first glance, it is important to consider that we assessed memory performance after a single, three-minute stimulation session. Similar effects are typically observed only after multiple stimulation sessions (Koch et al., NeuroImage, 2018; Grover et al., Nature Neuroscience, 2022).
(4d) Block effects. If I understand correctly, the experiments were conducted in blocks. This is always problematic. Here is one example study that articulated the big problems in block designs (Li et al TPAMI 2021): https://ieeexplore.ieee.org/document/9264220
Thank you for the interesting reference. According to this paper, in a block design, EEG or fMRI recordings are performed in response to different stimuli of a given class presented in succession. If this is the case, it does not correspond to our experimental design where both TMS-EEG and fMRI were conducted in a resting state on different days according to the different stimulation conditions.
(4e) Even if we ignore the lack of experimental descriptions, problems with lack of evidence of brain activity, the minimalistic study of 12 faces, problems with the block design, etc. at the end of the day, the results are extremely weak. In FNAT, some results are statistically significant, some are not. The interpretation of all of this is extremely complex. Continuing with Figure 3A, it seems that the author claims that iTBS+gtACS > iTBS+sham-tACS, but iTBS+gtACS ~ sham+sham. I am struggling to interpret such a result. When separating results by name and occupation, the results are even more perplexing. There is only one condition that is statistically significant in Figure 3A NAME and none in the occupation condition.
Thank you again for your feedback. We will work on making the large amount of data we reported easier to interpret.
Hoping to have thoroughly addressed your initial concerns in our previous responses, we now move on to your observations regarding the behavioral results, assuming you were referring to Figure 2A. The main finding of this study is the improvement in long-term memory performance, specifically the ability to correctly recall the association between face, name, and occupation (total FNAT), which was significantly enhanced in both Experiments 1 and 2. However, we also aimed to explore the individual contributions of name and occupation separately to gain a deeper understanding of the results. Our analysis revealed that the improvement in total FNAT was primarily driven by an increase in name recall rather than occupation recall. We understand that this may have caused some confusion. Therefore we will clarify this in the manuscript and consider presenting the name and occupation in a separate plot.
Regarding the stimulation conditions, your concerns about the performance pattern (iTBS+gtACS > iTBS+sham-tACS, but iTBS+gtACS ~ sham+sham) are understandable. However, this new protocol was developed precisely in response to the variability observed in behavioral outcomes following non-invasive brain stimulation, particularly when used to modulate memory functions (Corp et al., 2020; Pabst et al., 2022). As discussed in the manuscript, it is intended as a boost to conventional non-invasive brain stimulation protocols, leveraging the mechanisms outlined in the Discussion section.
(5) In sum, it would be amazing to be able to use non-invasive stimulation for any kind of therapeutic purpose as the authors imagine. More work needs to be done to convince ourselves that this kind of approach is viable. The evidence provided in this study is weak.
We hope our response will be carefully considered, fostering a constructive exchange and leading to a reassessment of your evaluation.
Reviewer #2 (Public review):
Summary:
The manuscript "Dual transcranial electromagnetic stimulation of the precuneus-hippocampus network boosts human long-term memory" by Borghi and colleagues provides evidence that the combination of intermittent theta burst TMS stimulation and gamma transcranial alternating current stimulation (γtACS) targeting the precuneus increases long-term associative memory in healthy subjects compared to iTBS alone and sham conditions. Using a rich dataset of TMS-EEG and resting-state functional connectivity (rs-FC) maps and structural MRI data, the authors also provide evidence that dual stimulation increased gamma oscillations and functional connectivity between the precuneus and hippocampus. Enhanced memory performance was linked to increased gamma oscillatory activity and connectivity through white matter tracts.
Strengths:
The combination of personalized repetitive TMS (iTBS) and gamma tACS is a novel approach to targeting the precuneus, and thereby, connected memory-related regions to enhance long-term associative memory. The authors leverage an existing neural mechanism engaged in memory binding, theta-gamma coupling, by applying TMS at theta burst patterns and tACS at gamma frequencies to enhance gamma oscillations. The authors conducted a thorough study that suggests that simultaneous iTBS and gamma tACS could be a powerful approach for enhancing long-term associative memory. The paper was well-written, clear, and concise.
Weaknesses:
(1) The study did not include a condition where γtACS was applied alone. This was likely because a previous work indicated that a single 3-minute γtACS did not produce significant effects, but this limits the ability to isolate the specific contribution of γtACS in the context of this target and memory function
Thank you for your comments. As you pointed out, we did not include a condition where γtACS was applied alone. This decision was based on the findings of Guerra et al. (Brain Stimulation 2018), who investigated the same protocol and reported no aftereffects. Given the substantial burden of the experimental design on patients and our primary goal of demonstrating an enhancement of effects compared to the standalone iTBS protocol, we decided to leave out this condition. However, we agree that investigating the effects of γtACS alone is an interesting and relevant aspect worthy of further exploration. In line with these observations, we will expand the discussion on this point in the study’s limitations section.
(2) The authors applied stimulation for 3 minutes, which seems to be based on prior tACS protocols. It would be helpful to present some rationale for both the duration and timing relative to the learning phase of the memory task. Would you expect additional stimulation prior to recall to benefit long-term associative memory?
Thank you for your comment and for raising this interesting point. As you correctly noted, the protocol we used has a duration of three minutes, a choice based on previous studies demonstrating its greater neurophysiological efficacy compared with either stimulation alone. Specifically, these studies have shown that the combined stimulation enhanced gamma-band oscillations and increased cortical plasticity (Guerra et al., Brain Stimulation 2018; Maiella et al., Scientific Reports 2022). Given that the precuneus (Brodt et al., Science 2018; Schott et al., Human Brain Mapping 2018), gamma oscillations (Osipova et al., Journal of Neuroscience 2006; Deprés et al., Neurobiology of Aging 2017; Griffiths et al., Trends in Neurosciences 2023), and cortical plasticity (Brodt et al., Science 2018) are all associated with encoding processes, we decided to apply the co-stimulation immediately before encoding to enhance its efficacy.
Regarding the question of whether stimulation could also benefit recall, the answer is yes. We can speculate that repeating the stimulation before recall might provide an additional boost. This is supported by evidence showing that both the precuneus and gamma oscillations are involved in recall processes (Flanagin et al., Cerebral Cortex 2023; Griffiths et al., Trends in Neurosciences 2023). Furthermore, previous research suggests that reinstating the same brain state as during encoding can enhance recall performance (Javadi et al., The Journal of Neuroscience 2017).
We will expand the study rationale and include these considerations in the future directions section.
(3) How was the burst frequency of theta iTBS and gamma frequency of tACS chosen? Were these also personalized to subjects' endogenous theta and gamma oscillations? If not, were increases in gamma oscillations specific to patients' endogenous gamma oscillation frequencies or the tACS frequency?
The stimulation protocol was chosen based on previous studies (Guerra et al., Brain Stimulation 2018; Maiella et al., Scientific Reports 2022). The gamma tACS sinusoidal waveform was set at 70 Hz, while iTBS consisted of ten bursts of three pulses at 50 Hz lasting 2 s, repeated every 10 s with an 8 s pause between consecutive trains, for a total of 600 pulses lasting 190 s (see iTBS+γtACS neuromodulation protocol section). In particular, the theta-burst pattern of iTBS was inspired by protocols used in animal models to elicit LTP in the hippocampus (Huang et al., Neuron 2005). Consequently, neither the theta iTBS nor the gamma tACS frequency was personalized. The increase in gamma oscillations was measured relative to each participant's baseline and did not correspond to the administered tACS frequency.
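As a quick arithmetic check of the iTBS timing described above, the following sketch lays out the pulse times from the stated parameters. The 5 Hz burst rate is not stated in the response; it is derived here from "ten bursts ... lasting 2 s". The function name and the returned dictionary are hypothetical.

```python
# Parameters taken from the protocol description in the response.
PULSES_PER_BURST = 3
INTRA_BURST_HZ = 50.0      # pulses within a burst
BURSTS_PER_TRAIN = 10
TRAIN_DURATION_S = 2.0
TRAIN_PERIOD_S = 10.0      # 2 s train + 8 s pause
TOTAL_PULSES = 600


def itbs_schedule():
    """Build the pulse onset times (in seconds) for the whole session."""
    pulses_per_train = PULSES_PER_BURST * BURSTS_PER_TRAIN
    n_trains = TOTAL_PULSES // pulses_per_train
    # Derived assumption: bursts are evenly spaced within each 2 s train.
    burst_rate_hz = BURSTS_PER_TRAIN / TRAIN_DURATION_S
    pulse_times = []
    for train in range(n_trains):
        for burst in range(BURSTS_PER_TRAIN):
            for pulse in range(PULSES_PER_BURST):
                t = (train * TRAIN_PERIOD_S
                     + burst / burst_rate_hz
                     + pulse / INTRA_BURST_HZ)
                pulse_times.append(t)
    return {
        "n_trains": n_trains,
        "burst_rate_hz": burst_rate_hz,
        "pulse_times": pulse_times,
    }


if __name__ == "__main__":
    schedule = itbs_schedule()
    print(schedule["n_trains"], "trains,",
          len(schedule["pulse_times"]), "pulses,",
          "last train starts at",
          (schedule["n_trains"] - 1) * TRAIN_PERIOD_S, "s")
```

Under these assumptions the session comprises 20 trains of 30 pulses, the bursts recur at 5 Hz (the theta range), and the final train begins 190 s after the first pulse.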
(4) The authors do a thorough job of analyzing the increase in gamma oscillations in the precuneus through TMS-EEG; however, the authors may also analyze whether theta oscillations were also enhanced through this protocol due to the iTBS potentially targeting theta oscillations. This may also be more robust than gamma oscillations increases since gamma oscillations detected on the scalp are very low amplitude and susceptible to noise and may reflect activity from multiple overlapping sources, making precise localization difficult without advanced techniques.
Thank you for the suggestion. We analyzed theta oscillations and found no changes.
(5) Figure 4: Why are connectivity values pre-stimulation for the iTBS and sham tACS stimulation condition so much higher than the dual stimulation? We would expect baseline values to be more similar.
We acknowledge that the pre-stimulation connectivity values for the iTBS and sham tACS conditions appear higher than those for the dual stimulation condition. However, as noted in our statistical analyses, there were no significant differences at baseline between conditions (p-FDR= 0.3514), suggesting that any apparent discrepancy is due to natural variability rather than systematic bias. One potential explanation for these differences is individual variability in baseline connectivity measures, which can fluctuate due to factors such as intrinsic neural dynamics, participant state, or measurement noise. Despite these variations, our statistical approach ensures that any observed post-stimulation effects are not confounded by pre-existing differences.
(6) Figure 2: How are total association scores significantly different between stimulation conditions, but individual name and occupation associations are not? Further clarification of how the total FNAT score is calculated would be helpful.
We apologize for any lack of clarity. The total FNAT score reflects the ability to correctly recall all the information associated with a person—specifically, the correct pairing of the face, name, and occupation. Participants received one point for each triplet they accurately recalled. The scores were then converted into percentages, as detailed in the Face-Name Associative Task Construction and Scoring section in the supplementary materials.
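The scoring rule described above (one point per fully correct face-name-occupation triplet, converted to a percentage) can be sketched as follows. This is an illustration only: the data layout, the exact-match comparison, and the function name are assumptions, and the authors' actual scoring procedure is the one given in their supplementary materials.

```python
def fnat_scores(responses, answer_key):
    """Score recall for a set of faces.

    Both arguments are lists of (name, occupation) tuples, one per face,
    in the same face order. Returns percentages for the total score
    (both items correct) and for name and occupation separately.
    """
    n_faces = len(answer_key)
    total = name = occupation = 0
    for (given_name, given_job), (true_name, true_job) in zip(responses, answer_key):
        name_ok = given_name == true_name
        job_ok = given_job == true_job
        name += name_ok
        occupation += job_ok
        # A point on the total score requires the whole triplet to be right.
        total += name_ok and job_ok
    to_percent = lambda points: 100.0 * points / n_faces
    return {
        "total": to_percent(total),
        "name": to_percent(name),
        "occupation": to_percent(occupation),
    }


if __name__ == "__main__":
    key = [(f"name{i}", f"job{i}") for i in range(12)]
    # Hypothetical participant: 6 full triplets, 3 names only, 3 misses.
    answers = key[:6] + [(n, "?") for n, _ in key[6:9]] + [("?", "?")] * 3
    print(fnat_scores(answers, key))
```

With 12 faces, each score moves in steps of 100/12 (about 8.3) percentage points, and the total can never exceed the lower of the name and occupation scores.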
Total FNAT was the primary outcome measure. However, we also analyzed name and occupation recall separately to better understand their individual contributions. Our analysis revealed that the improvement in total FNAT was primarily driven by an increase in name recall rather than occupation recall.
We acknowledge that this distinction may have caused some confusion. To improve clarity, we will revise the manuscript accordingly and consider presenting name and occupation recall in separate plots.
Reviewer #3 (Public review):
Summary:
Borghi and colleagues present results from 4 experiments aimed at investigating the effects of dual γtACS and iTBS stimulation of the precuneus on behavioral and neural markers of memory formation. In their first experiment (n = 20), they found that a 3-minute offline (i.e., prior to task completion) stimulation that combines both techniques leads to superior memory recall performance in an associative memory task immediately after learning associations between pictures of faces, names, and occupation, as well as after a 15-minute delay, compared to iTBS alone (+ tACS sham) or no stimulation (sham for both iTBS and tACS). Performance in a second task probing short-term memory was unaffected by the stimulation condition. In a second experiment (n = 10), they show that these effects persist over 24 hours and up to a full week after initial stimulation. A third (n = 14) and fourth (n = 16) experiment were conducted to investigate the neural effects of the stimulation protocol. The authors report that, once again, only combined iTBS and γtACS increase gamma oscillatory activity and neural excitability (as measured by concurrent TMS-EEG) specific to the stimulated area at the precuneus compared to a control region, as well as precuneus-hippocampus functional connectivity (measured by resting-state MRI), which seemed to be associated with structural white matter integrity of the bilateral middle longitudinal fasciculus (measured by DTI).
Strengths:
Combining non-invasive brain stimulation techniques is a novel, potentially very powerful method to maximize the effects of these kinds of interventions that are usually well-tolerated and thus accepted by patients and healthy participants. It is also very impressive that the stimulation-induced improvements in memory performance resulted from a short (3 min) intervention protocol. If the effects reported here turn out to be as clinically meaningful and generalizable across populations as implied, this approach could represent a promising avenue for the treatment of impaired memory functions in many conditions.
Methodologically, this study is expertly done! I don't see any serious issues with the technical setup in any of the experiments (with the only caveat that I am not an expert in fMRI functional connectivity measures and DTI). It is also very commendable that the authors conceptually replicated the behavioral effects of experiment 1 in experiment 2 and then conducted two additional experiments to probe the neural mechanisms associated with these effects. This certainly increases the value of the study and the confidence in the results considerably.
The authors used a within-subject approach in their experiments, which increases statistical power and allows for stronger inferences about the tested effects. They also individualized stimulation locations and intensities, which should further optimize the signal-to-noise ratio.
Weaknesses:
I want to state clearly that I think the strengths of this study far outweigh the concerns I have. I still list some points that I think should be clarified by the authors or taken into account by readers when interpreting the presented findings.
I think one of the major weaknesses of this study is the overall low sample size in all of the experiments (between n = 10 and n = 20). This is, as I mentioned when discussing the strengths of the study, partly mitigated by the within-subject design and individualized stimulation parameters. The authors mention that they performed a power analysis, but this analysis seemed to be based on electrophysiological readouts similar to those obtained in experiment 3. It is thus unclear whether the other experiments were sufficiently powered to reliably detect the behavioral effects of interest. That being said, the authors do report significant effects, so they were by definition powered to find those. However, the effect sizes reported for their main findings are all relatively large, and it is known that significant findings from small samples may represent inflated effect sizes, which may hamper the generalizability of the current results. Ideally, the authors would replicate their main findings in a larger sample. Alternatively, I think running a sensitivity analysis to estimate the smallest effect the authors could have detected with a power of 80% could be very informative for readers to contextualize the findings. At the very least, however, I think it would be necessary to address this point as a potential limitation in the discussion of the paper.
Thank you for the observation. As you mentioned, our power analysis was based on our previous study investigating the same neuromodulation protocol with a corresponding experimental design. The relatively small sample could be considered a possible limitation of the study, which we will add to the discussion. A fundamental future step will be to replicate these results in a larger population. In the meantime, to strengthen our results, we performed the sensitivity analysis you suggested.
In detail, we performed a sensitivity analysis for repeated-measures ANOVA with α = 0.05 and power (1 − β) = 0.80 with no sphericity correction. For experiment 1, a sensitivity analysis with 1 group and 3 measurements showed a minimal detectable effect size of f = 0.524 with 20 participants. In our paper, the ANOVA on total FNAT immediate performance revealed an effect size of η² = 0.274, corresponding to f = 0.614; the ANOVA on FNAT delayed performance revealed an effect size of η² = 0.236, corresponding to f = 0.556. For experiment 2, a sensitivity analysis for total FNAT immediate performance (1 group and 3 measurements) showed a minimal detectable effect size of f = 0.797 with 10 participants. In our paper, the ANOVA on total FNAT immediate performance revealed an effect size of η² = 0.448, corresponding to f = 0.901. The sensitivity analysis for total FNAT delayed performance (1 group and 6 measurements) showed a minimal detectable effect size of f = 0.378 with 10 participants. In our paper, the ANOVA on total FNAT delayed performance revealed an effect size of η² = 0.484, corresponding to f = 0.968. Thus, in every case the observed effect size exceeded the minimal detectable effect size, indicating that both experiments were sufficiently powered to detect the reported effects. We have now added this information to the manuscript, and we thank the reviewer for the suggestion.
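The pairing of η² and Cohen's f values quoted above is consistent with the standard conversion f = √(η² / (1 − η²)). The following is a minimal sketch of that conversion, applied to the four η² values reported in this response; the function name and the dictionary labels are illustrative assumptions and are not taken from the original analysis scripts.

```python
import math


def eta_squared_to_cohens_f(eta_sq: float) -> float:
    """Convert an eta-squared effect size to Cohen's f.

    Uses the standard relation f = sqrt(eta_sq / (1 - eta_sq)).
    """
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("eta squared must lie in the interval [0, 1)")
    return math.sqrt(eta_sq / (1.0 - eta_sq))


# Eta-squared values quoted in the response (labels are illustrative).
reported_eta_sq = {
    "Exp. 1, FNAT immediate": 0.274,
    "Exp. 1, FNAT delayed": 0.236,
    "Exp. 2, FNAT immediate": 0.448,
    "Exp. 2, FNAT delayed": 0.484,
}

for label, eta_sq in reported_eta_sq.items():
    f = eta_squared_to_cohens_f(eta_sq)
    print(f"{label}: eta^2 = {eta_sq:.3f} -> f = {f:.3f}")
```

Rounded to three decimals, the results are 0.614, 0.556, 0.901, and 0.968, matching the f values stated above. Each can then be compared directly with the corresponding minimal detectable effect size from the sensitivity analysis.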
It seems that the statistical analysis approach differed slightly between studies. In experiment 1, the authors followed up significant effects of their ANOVAs by Bonferroni-adjusted post-hoc tests whereas it seems that in experiment 2, those post-hoc tests were "exploratory", which may suggest those were uncorrected. In experiment 3, the authors use one-tailed t-tests to follow up their ANOVAs. Given some of the reported p-values, these choices suggest that some of the comparisons might have failed to reach significance if properly corrected. This is not a critical issue per se, as the important test in all these cases is the initial ANOVA, but non-significant (corrected) post-hoc tests might be another indicator of an underpowered experiment. My assumptions here might be wrong, but even then, I would ask the authors to be more transparent about the reasons for their choices or provide additional justification. Finally, the authors sometimes report exact p-values whereas other times they simply say p < .05. I would ask them to be consistent and recommend using exact p-values for every result where p >= .001.
Thank you again for the suggestions. Your observations are correct: we used slightly different statistical approaches depending on our hypotheses. Here are the details:
In experiment 1, we used a repeated-measures ANOVA with one factor, “stimulation condition” (iTBS+γtACS; iTBS+sham-tACS; sham-iTBS+sham-tACS). Following the significant effect of this factor, we performed post-hoc analyses with Bonferroni correction.
In experiment 2, we used a repeated-measures ANOVA with two factors, “stimulation condition” and “time”. As expected, we observed a significant effect of condition, confirming the result of experiment 1, but not of time. This means that the neuromodulatory effect was present regardless of the time point. However, to explore whether the effect of stimulation condition was present at each time point, we performed exploratory t-tests without correction for multiple comparisons, since this was only an exploratory analysis.
In experiment 3, we used the same approach as in experiment 1. However, since we had a specific hypothesis on the direction of the effect already observed in our previous study, i.e. an increase in spectral power (Maiella et al., Scientific Reports 2022), our tests were one-tailed.
For the p-values, we will revise the manuscript to report the exact values for every result.
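To make concrete why the reviewer notes that some comparisons might not survive correction, the Bonferroni adjustment multiplies each raw p-value by the number of comparisons and caps the result at 1. The sketch below assumes three pairwise comparisons, as would arise from three stimulation conditions. The p-values are hypothetical placeholders chosen for illustration only; they are not values from the manuscript, and the function name is likewise an assumption.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * m) for m comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# HYPOTHETICAL raw p-values for three pairwise comparisons (illustration only).
raw_p = [0.004, 0.020, 0.300]
adjusted_p = bonferroni_adjust(raw_p)

for raw, adj in zip(raw_p, adjusted_p):
    verdict = "significant" if adj < 0.05 else "not significant"
    print(f"raw p = {raw:.3f} -> adjusted p = {adj:.3f} ({verdict} at alpha = 0.05)")
```

In this hypothetical case, a raw p-value of 0.020 would pass an uncorrected threshold of 0.05 but not the corrected one, which is the situation the reviewer asks the authors to rule out or report transparently.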
While the authors went to great lengths trying to probe the neural changes likely associated with the memory improvement after stimulation, it is impossible from their data to causally relate the findings from experiments 3 and 4 to the behavioral effects in experiments 1 and 2. This is acknowledged by the authors and there are good methodological reasons for why TMS-EEG and fMRI had to be collected in separate experiments, but it is still worth pointing out to readers that this limits inferences about how exactly dual iTBS and γtACS of the precuneus modulate learning and memory.
Thank you for your comment. We fully agree with your observation, which is why this aspect has been considered in the study's limitations. To address your concern, we will further emphasize the fact that our findings do not allow precise inferences regarding the specific mechanisms by which dual iTBS and γtACS of the precuneus modulate learning and memory.
There were no stimulation-related performance differences in the short-term memory task used in experiments 1 and 2. The authors argue that this demonstrates that the intervention specifically targeted long-term associative memory formation. While this is certainly possible, the STM task was a spatial memory task, whereas the LTM task relied (primarily) on verbal material. It is thus also possible that the stimulation effects were specific to a stimulus domain instead of memory type. In other words, could it be possible that the stimulation might have affected STM performance if the task taxed verbal STM instead? This is of course impossible to know without an additional experiment, but the authors could mention this possibility when discussing their findings regarding the lack of change in the STM task.
Thank you for your insightful observation. We argue that the intervention primarily targeted long-term associative memory formation, as our findings demonstrated effects only on FNAT. However, as you correctly pointed out, we cannot exclude the possibility that the stimulation may also influence short-term verbal associative memory. We will acknowledge this potential effect when discussing the absence of significant findings in the STM task.
While the authors discuss the potential neural mechanisms by which the combined stimulation conditions might have helped memory formation, the psychological processes are somewhat neglected. For example, do the authors think the stimulation primarily improves the encoding of new information or does it also improve consolidation processes? Interestingly, the beneficial effect of dual iTBS and γtACS on recall performance was very stable across all time points tested in experiments 1 and 2, as was the performance in the other conditions. Do the authors have any explanation as to why there seems to be no further forgetting of information over time in either condition when even at immediate recall, accuracy is below 50%? Further, participants started learning the associations of the FNAT immediately after the stimulation protocol was administered. What would happen if learning started with a delay? In other words, do the authors think there is an ideal time window post-stimulation in which memory formation is enhanced? If so, this might limit the usability of this procedure in real-life applications.
Thank you for your comment and for raising these important points.
We hypothesized that co-stimulation would enhance encoding processes. Previous studies have shown that co-stimulation can enhance gamma-band oscillations and increase cortical plasticity (Guerra et al., Brain Stimulation 2018; Maiella et al., Scientific Reports 2022). Given that the precuneus (Brodt et al., Science 2018; Schott et al., Human Brain Mapping 2018), gamma oscillations (Osipova et al., Journal of Neuroscience 2006; Deprés et al., Neurobiology of Aging 2017; Griffiths et al., Trends in Neurosciences 2023), and cortical plasticity (Brodt et al., Science 2018) have all been associated with encoding processes, we decided to apply co-stimulation before the encoding phase, to boost it.
We applied the co-stimulation immediately before the learning phase to maximize its potential effects. While we observed a significant increase in gamma oscillatory activity lasting up to 20 minutes, we cannot determine whether the behavioral effects we observed would have been the same with a co-stimulation applied 20 minutes before learning. Based on existing literature, a reduction in the efficacy of co-stimulation over time could be expected (Huang et al., Neuron 2005; Thut et al., Brain Topography 2009). However, we hypothesize that multiple stimulation sessions might provide an additional boost, helping to sustain the effects over time (Thut et al., Brain Topography 2009; Koch et al., Neuroimage 2018; Koch et al., Brain 2022).
Regarding the absence of further forgetting in both stimulation conditions, we think that the clinical and demographic characteristics of the sample (i.e. young and healthy subjects) explain the near absence of forgetting after one week.
-