  1. Oct 2024
    1. Author response:

      We thank the editors and reviewers for their valuable feedback and are committed to addressing their suggestions in a revised manuscript. We appreciate the reviewers’ recognition of the value of our findings, including the insights into the consequences of synaptic topography and the investigation of spike initiation zones in DNs, which further advance our understanding of signal processing. Our studies offer broader insights into synaptic organization and its significance for dendritic integration in an ethologically relevant context.

      We particularly appreciate the reviewer's suggestion to elaborate on the electrophysiological properties of DNs and to consider the electrotonic distance in our analysis. We also thank the reviewers for highlighting points that need clarification. In short, our models suggest that DNs effectively distribute synapses to maintain linear encoding of synapse numbers when multiple synapses are coactivated. This supports the results of an earlier study suggesting that synapse number gradients encode the location of an approaching stimulus in these neurons (Dombrovski et al., 2023).

      We also agree with the reviewers that the temporal activation of synapses is highly relevant for this system. However, we have focused on synaptic topography because the characterization of temporal patterns of VPN activity is currently lacking in the field. A more detailed investigation of temporal dynamics is therefore beyond the scope of this study.

      With the publication of the reviewed preprint, we have now made the computational pipeline and models available on GitHub (https://github.com/AusbornLab/VPN-DN-synapse-normalization).

      Reference

      Dombrovski M, Peek MY, Park J-Y, Vaccari A, Sumathipala M, Morrow C, Breads P, Zhao A, Kurmangaliyev YZ, Sanfilippo P, Rehan A, Polsky J, Alghailani S, Tenshaw E, Namiki S, Zipursky SL, Card GM. 2023. Synaptic gradients transform object location to action. Nature 613:534–542. doi:10.1038/s41586-022-05562-8

    1. eLife Assessment

      This valuable study on strategies used by Pseudomonas to subvert host immunity identifies a new immune evasion strategy. The study presents solid evidence on the cleavage of VgrG2b by caspase-11 and the generation of fragments that inhibit activity of the NLRP3 inflammasome. This work should be of interest to immunologists and microbiologists.

    2. Reviewer #1 (Public review):

      In the manuscript entitled "A VgrG2b fragment cleaved by caspase-11/4 promotes Pseudomonas aeruginosa infection through suppressing the NLRP3 inflammasome", Qian et al. found activation of the non-canonical inflammasome, but not of the downstream NLRP3 inflammasome, during the infection of macrophages by P. aeruginosa, in sharp contrast to infection by E. coli (Figure 1). Having found that the suppression of the NLRP3 inflammasome is caspase-11 dependent, the authors screened P. aeruginosa proteins and identified VgrG2b as a major substrate of caspase-11 (Figure 2). Next, the authors mapped the cleavage site on VgrG2b to D883 and demonstrated that cleavage of VgrG2b by caspase-11 is essential for the suppression of the NLRP3 inflammasome (Figure 3). Furthermore, they found that the C-terminal fragment of cleaved VgrG2b binds NLRP3 (Figure 4), and then showed that this binding blocks the association of NLRP3 with NEK7 (Figure 5). Finally, the authors demonstrated that blocking VgrG2b cleavage, by either mutation of D883 or administration of a designed peptide, effectively improved the survival rate of P. aeruginosa-infected mice (Figure 6). This is a well-designed and well-executed study, with the results clearly presented and stated.

    3. Reviewer #2 (Public review):

      Summary:

      In their manuscript, Qian and colleagues identified a novel mechanism by which Pseudomonas controls inflammatory responses upon inflammasome activation. They identified a caspase-11 substrate (VgrG2b) which, upon cleavage, binds and inhibits NLRP3 to reduce the production of pro-inflammatory cytokines. This is a unique mechanism that allows for the tailoring of the innate immune response upon bacterial recognition.

      Strengths:

      The authors present a novel conceptual framework in host-pathogen interactions. Their work is supported by a range of approaches (biochemical, cellular immunology, microbiology, animal models), and their conclusions are supported by multiple independent lines of evidence. The work is likely to have an important impact on the innate immunity and host-pathogen interaction fields and may guide the development of novel inhibitors.

      Weaknesses:

      Although the study is quite exhaustive, a few of the authors' conclusions are not fully supported (e.g., caspase-11 directly cleaving VgrG2b, the unique affinity of VgrG2b-C for NLRP3) and would require complementary approaches to fully validate them. These weaknesses are minor.

    1. eLife Assessment

      Wang et al.'s study addresses an important gap in our understanding of de novo epithelial polarization using MDCK cell doublets surrounded by ECM, providing convincing evidence through imaging and depletion studies on the role of conserved polarity proteins and the centrosome during this process. While the authors propose a clear hierarchical model, there is a need for further exploration of how microtubule organization contributes to this process. Specifically, live-cell imaging of microtubules in the mutants and under the ECM conditions included in the study, along with a more precise temporal mapping of microtubule dynamics in relation to proteins like Gp135, would strengthen the study's conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      Wang, Po-Kai, et al., utilized the de novo polarization of MDCK cells cultured in Matrigel to assess the interdependence between polarity protein localization, centrosome positioning, and apical membrane formation. They show that the inhibition of Plk4 with Centrinone does not prevent apical membrane formation, but does result in its delay, a phenotype the authors attribute to the loss of centrosomes due to the inhibition of centriole duplication. However, the targeted mutagenesis of specific centrosome proteins implicated in the positioning of centrosomes in other cell types (CEP164, ODF2, PCNT, and CEP120) did not affect centrosome positioning in 3D cultured MDCK cells. A screen of proteins previously implicated in MDCK polarization revealed that the polarity protein Par-3 was upstream of centrosome positioning, similar to other cell types.

      Strengths:

      The investigation into the temporal requirement and interdependence of previously proposed regulators of cell polarization and lumen formation is valuable to the community. Wang et al., have provided a detailed analysis of many of these components at defined stages of polarity establishment. Furthermore, the generation of PCNT, p53, ODF2, Cep120, and Cep164 knockout MDCK cell lines is likely valuable to the community.

      Weaknesses:

      Additional quantifications would greatly improve this manuscript. For example, it is unclear whether the centrosome perturbations affect gamma-tubulin levels and therefore microtubule nucleation, and it is also not clear how they affect the localization of the trafficking machinery and polarity proteins. In Figure 4, the authors measure the intensity of Gp135 at the apical membrane initiation site following cytokinesis, but there is no measure of Gp135 at the centrosome prior to this.

    3. Reviewer #2 (Public review):

      Summary:

      The authors decoupled several players that are thought to contribute to the establishment of epithelial polarity and determined their causal relationship. This provides a new picture of the respective roles of junctional proteins (Par3), the centrosome, and endomembrane compartments (Cdc42, Rab11, Gp135) from upstream to downstream.
      Their conclusions are based on live imaging of all players during the early steps of polarity establishment and on the knock-down of their expression in the simplest ever model of epithelial polarity: a cell doublet surrounded by ECM.

      The position of the centrosome is often taken as a readout for the orientation of the cell polarity axis. There is a long-standing debate about the actual role of the centrosome in the establishment of this polarity axis. Here, using a minimal model of epithelial polarization, a doublet of daughter MDCK cells cultured in Matrigel, the authors made several key observations that shed new light on a mechanism that has been studied for many years without being fully explained:

      (1) They showed that centrioles can reach their polarized position without most of their microtubule-anchoring structures. These observations challenge the standard model according to which centrosomes are moved by the production and transmission of forces along microtubules.

      (2) (However) they showed that epithelial polarity can be established in the absence of centrioles.

      (3) (Somewhat more expectedly) they also showed that epithelial polarity can't be established in the absence of Par3.

      (4) They found that most other polarity players that are transported through the cytoplasm in lipid vesicles, and finally fused to the basal or apical pole of epithelial cells, are moved along an axis which is defined by the position of the centrosome and the orientation of microtubules.

      (5) Surprisingly, two non-daughter cells that were brought into contact (for 6h) could partially polarize by recruiting a few Par3 molecules but not the other polarity markers.

      (6) Even more surprisingly, in the absence of ECM, Par3 and centrosomes could move to their proper position close to the intercellular junction after cytokinesis, but other polarity markers (at least Gp135) localized to the opposite, non-adhesive, side. So the polarity of the centrosome-microtubule network could be dissociated from the localisation of Gp135 (which was believed to be transported along this network).

      Strengths:

      (1) The simplicity and reproducibility of the system allow a very quantitative description of cell polarity and protein localisation.

      (2) The experiments are quite straightforward, well-executed, and properly analyzed.

      (3) The writing is clear and conclusions are convincing.

      Weaknesses:

      (1) The simplicity of the system may not capture some of the mechanisms involved in the establishment of cell polarity in more physiological conditions (fluid flow, electrical potential, ion gradients,...).

      (2) The absence of centrioles in centrinone-treated cells might not prevent the coalescence of centrosomal proteins into a kind of MTOC, which might still orient microtubules and intracellular traffic. How are microtubules organized in the absence of centrioles? If they still form a radial array, the absence of a centriole at its center does not really conflict with classical views in the field.

      (3) The mechanism is still far from clear and this study shines some light on our lack of understanding. Basic and key questions remain:
      a) How is the centrosome moved toward the Par3-rich pole? This is particularly difficult to answer if the mechanism does not imply the anchoring of MTs to the centriole or PCM.
      b) What happens during cytokinesis that organises Par3 and intercellular junction in a way that can't be achieved by simply bringing two cells together? In larger epithelia cells have neighbours that are not daughters, still, they can form tight junctions with Par3 which participates in the establishment of cell polarity as much as those that are closer to the cytokinetic bridge (as judged by the overall cell symmetry). Is the protocol of cell aggregation fully capturing the interaction mechanism of non-daughter cells?

    4. Reviewer #3 (Public review):

      Here, Wang et al. aim to clarify the role of the centrosome and conserved polarity regulators in apical membrane formation during the polarization of MDCK cells cultured in 3D. Through well-presented and rigorous studies, the authors focused on the emergence of polarity as a single MDCK cell divided in 3D culture to form a two-cell cyst with a nascent lumen. Focusing on these very initial stages, rather than in later large cyst formation as in most studies, is a real strength of this study. The authors found that conserved polarity regulators Gp135/podocalyxin, Crb3, Cdc42, and the recycling endosome component Rab11a all localize to the centrosome before localizing to the apical membrane initiation site (AMIS) following cytokinesis. This protein relocalization was concomitant with a repositioning of centrosomes towards the AMIS. In contrast, Par3, aPKC, and the junctional components E-cadherin and ZO1 localize directly to the AMIS without first localizing to the centrosome. Based on the timing of the localization of these proteins, these observational studies suggested that Par3 is upstream of centrosome repositioning towards the AMIS and that the centrosome might be required for delivery of apical/luminal proteins to the AMIS.

      To test this hypothesis, the authors generated numerous new cell lines and/or employed pharmacological inhibitors to determine the hierarchy of localization among these components. They found that removal of the centrosome via centrinone treatment severely delayed and weakened the delivery of Gp135 to the AMIS and single lumen formation, although normal lumenogenesis was apparently rescued with time. This effect was not due to the presence of CEP164, ODF2, CEP120, or Pericentrin. Par3 depletion perturbed the repositioning of the centrosome towards the AMIS and the relocalization of the Gp135 and Rab11 to the AMIS, causing these proteins to get stuck at the centrosome. Finally, the authors culture the MDCK cells in several ways (forced aggregation and ECM depleted) to try and further uncouple localization of the pertinent components, finding that Par3 can localize to the cell-cell interface in the absence of cell division. Par3 localized to the edge of the cell-cell contacts in the absence of ECM and this localization was not sufficient to orient the centrosomes to this site, indicating the importance of other factors in centrosome recruitment.

      Together, these data suggest a model where Par3 positions the centrosome at the AMIS and is required for the efficient transfer of more downstream polarity determinants (Gp135 and Rab11) to the apical membrane from the centrosome. The authors present solid and compelling data and are well-positioned to directly test this model with their existing system and tools. In particular, one obvious mechanism here is that centrosome-based microtubules help to efficiently direct the transport of molecules required to reinforce polarity and/or promote lumenogenesis. This model is not really explored by the authors except by Pericentrin and subdistal appendage depletion, and the authors do not test whether these perturbations affect centrosomal microtubules. Exploring the role of microtubules in this process could considerably add to the mechanisms presented here. In its current state, this paper is a careful observation of the events of MDCK polarization and will fill a knowledge gap in this field. However, the mechanism could be significantly bolstered with existing tools, thereby elevating our understanding of how polarity emerges in this system.

    1. eLife Assessment

      This important study shows that Toxoplasma gondii uses paracrine mechanisms, in addition to cell-intrinsic methods, to evade the host immune system, with MYR1 playing a key role in transporting effector molecules into host cells. The authors present convincing evidence that in vivo, MYR1-deficient parasites can be rescued by wild-type parasites, revealing a limitation in pooled CRISPR screens, where such paracrine effects may obscure the identification of key parasite pathways involved in immune evasion.

    2. Reviewer #1 (Public review):

      Previous studies have highlighted some of these paracrine activities of Toxoplasma. Rastogi et al. (mBio, 2020) used a single-cell sequencing approach on cells infected in vitro with WT or MYR1 KO parasites, and one of their conclusions was that MYR1-dependent paracrine activities counteract ROP-dependent processes. Similarly, Chen et al. (JEM, 2020) highlighted that a particular rhoptry protein (ROP16) could be injected into uninfected macrophages and move them to an anti-inflammatory state that might benefit the parasite.

      There are caveats around immunity and as yet no insight into how this works. In Figure 2 there is a marked defect in the ability of the parasites to expand at day 2 and day 5. Together, these data sets suggest that this paracrine effect mediated by MYR1 works early, well before the development of adaptive responses.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript by Torelli et al., the authors propose that the major function of MYR1 and MYR1-dependent secreted proteins is to contribute to parasite survival in a paracrine manner rather than to protect parasites from cell-autonomous immune response. The authors conclude that these paracrine effects rescue ∆MYR1 or knockouts of MYR1-dependent effectors within pooled in vivo CRISPR screens.

      Strengths:

      The authors raised a more general concern that pooled CRISPR screens (not only in Toxoplasma but also in other microbes or cancers) would miss important genes because of a "paracrine masking effect". Although there is no doubt that pooled CRISPR screens (especially in vivo CRISPR screens) are powerful techniques, I think this topic could be of interest to those fields and researchers.

      Weaknesses:

      In this version, the reviewer is not entirely convinced of the 'paracrine masking effect' because the in vivo experiments should include appropriate controls (see major point 2).

      (1) It is convincing that co-infection of WT and ∆MYR1 parasites could rescue the growth of ∆MYR1 in mice, as shown by in vivo luciferase imaging. This is also consistent with ∆MYR1 parasites showing no in vivo fitness defect in the in vivo CRISPR screens conducted by several groups. Meanwhile, it has been reported previously and shown in this manuscript that ∆MYR1 parasites have an in vitro growth defect; however, ∆MYR1 parasites show no in vitro fitness defect in the in vitro pooled CRISPR screen. The authors show that the competition defect of ∆MYR1 parasites cannot be rescued by co-infection with WT parasites in Figure 1c, which might indicate that no paracrine rescue occurred in an in vitro environment. The authors do not seem to mention these discrepancies between in vitro CRISPR screens and in vitro competition assays. Why do ∆MYR1 parasites possess neutral in vitro fitness scores in in vitro CRISPR screens? Could the authors describe a reasonable hypothesis?

      (2) The authors developed a mixed infection assay with an inoculum containing a 20:80 ratio of ∆MYR1-Luc parasites with either WT parasites or ∆MYR1 mutants not expressing luciferase, showing that the in vivo growth defect of ∆MYR1 parasites is rescued by the presence of WT parasites. Since this experiment lacks appropriate controls, interpretation could be difficult. Is this phenomenon specific to MYR1? If a co-inoculum of ∆GRA12-Luc with either WT parasites or ∆GRA12 parasites not expressing luciferase were included, the data could be appropriately interpreted.

      (3) In the Discussion part, the authors argue that the rescue phenotype of mixed infection is not due to co-infection of host cells (lines 307-310). This data is important to support the authors' paracrine hypothesis and should be shown in the main figure.

      (4) In the Discussion part, the authors assume that the rescue phenotype is the result of multiple MYR1-dependent effectors. I admit that this hypothesis is plausible, since a recently published paper described the concerted action of numerous MYR1-dependent or independent effectors contributing to the hypermigration of infected cells (Ten Hoeve et al., mBio, 2024). However, I think this paragraph is somewhat overstated, since the authors did not test any of the candidate effectors. Since the authors possess ∆IST parasites, they can test whether or not IST is involved in the "paracrine masking effect" to support their claim.

    1. eLife Assessment

      This important study reports a detailed quantification of the population dynamics of Salmonella enterica serovar Typhimurium in mice. Bacterial burden and founding population sizes across various organs were quantified, revealing pathways of dissemination and reseeding of the gastrointestinal tract from systemic organs. Using various techniques, including genetic distance measurements, the authors present compelling evidence to support their conclusions, thus presenting new knowledge that will be of broad interest to scientists focusing on infectious diseases.

    2. Reviewer #1 (Public review):

      Hotinger et al. explore the population dynamics of Salmonella enterica serovar Typhimurium in mice using genetically tagged bacteria. In addition to physiological observations, pathology assessments, and CFU measurements, the study emphasizes quantifying host bottleneck sizes that limit Salmonella colonization and dissemination. The authors also investigate the genetic distances between bacterial populations at various infection sites within the host.

      Initially, the study confirms that pretreatment with the antibiotic streptomycin before inoculation via orogastric gavage increases the bacterial burden in the gastrointestinal (GI) tract, leading to more severe symptoms and heightened fecal shedding of bacteria. This pretreatment also significantly reduces between-animal variation in bacterial burden and fecal shedding. The authors then calculate founding population sizes across different organs, discovering a severe bottleneck in the intestine, with founding populations reduced by approximately 10^6-fold compared to the inoculum size. Streptomycin pretreatment increases the founding population size and bacterial replication in the GI tract. Moreover, by calculating genetic distances between populations, the authors demonstrate that, in untreated mice, Salmonella populations within the GI tract are genetically dissimilar, suggesting limited exchange between colonization sites. In contrast, streptomycin pretreatment reduces genetic distances, indicating increased exchange.

      In extraintestinal organs, the bacterial burden is generally not substantially increased by streptomycin pretreatment, with significant differences observed only in the mesenteric lymph nodes and bile. However, the founding population sizes in these organs are increased. By comparing genetic distances between organs, the authors provide evidence that subpopulations colonizing extraintestinal organs diverge early after infection from those in the GI tract. This hypothesis is further tested by measuring bacterial burden and founding population sizes in the liver and GI tract at 5 and 120 hours post-infection. Additionally, they compare orogastric gavage infection with the less injurious method of infection via drinking, finding similar results for CFUs, founding populations, and genetic distances. These results argue against injuries during gavage as a route of direct infection.

      To bypass bottlenecks associated with the GI tract, the authors compare intravenous (IV) and intraperitoneal (IP) routes of infection. They find approximately a 10-fold increase in bacterial burden and founding population size in immune-rich organs with IV/IP routes compared to orogastric gavage in streptomycin-pretreated animals. This difference is interpreted as a result of "extra steps required to reach systemic organs."

      While IP and IV routes yield similar results in immune-rich organs, IP infections lead to higher bacterial burdens in nearby sites, such as the pancreas, adipose tissue, and intraperitoneal wash, as well as somewhat increased founding population sizes. The authors correlate these findings with the presence of white lesions in adipose tissue. Genetic distance comparisons reveal that, apart from the spleen and liver, IP infections lead to genetically distinct populations in infected organs, whereas IV infections generally result in higher genetic similarity.

      Finally, the authors investigate GI tract reseeding, identifying two distinct routes. They observe that the GI tracts of IP/IV-infected mice are colonized either by a clonal or a diversely tagged bacterial population. In clonally reseeded animals, the genetic distance within the GI tract is very low (often zero) compared to the bile population, which is predominantly clonal or pauciclonal. These animals also display pathological signs, such as cloudy/hardened bile and increased bacterial burden, leading the authors to conclude that the GI tract was reseeded by bacteria from the gallbladder bile. In contrast, animals reseeded by more complex bacterial populations show that bile contributes only a minor fraction of the tags. Given the large founding population size in these animals' GI tracts, which is larger than in orogastrically infected animals, the authors suggest a highly permissive second reseeding route, largely independent of bile. They speculate that this route may involve a reversal of known mechanisms that the pathogen uses to escape from the intestine.

      The manuscript presents a substantial body of work that offers a meticulously detailed understanding of the population dynamics of S. Typhimurium in mice. It quantifies the processes shaping the within-host dynamics of this pathogen and provides new insights into its spread, including previously unrecognized dissemination routes. The methodology is appropriate and carefully executed, and the manuscript is well-written, clearly presented, and concise. The authors' conclusions are well-supported by experimental results and thoroughly discussed. This work underscores the power of using highly diverse barcoded pathogens to uncover the within-host population dynamics of infections and will likely inspire further investigations into the molecular mechanisms underlying the bottlenecks and dissemination routes described here.

      Major point:

      Substantial conclusions in the manuscript rely on genetic distance measurements using the Cavalli-Sforza chord distance. However, it is unclear whether these genetic distance measurements are independent of the founding population size. I would anticipate that in populations with larger founding population sizes, where the relative tag frequencies are closer to those in the inoculum, the genetic distances would appear smaller compared to populations with smaller founding sizes independent of their actual relatedness. This potential dependency could have implications for the interpretation of findings, such as those in Figures 2B and 2D, where antibiotic-pretreated animals consistently exhibit higher founding population sizes and smaller genetic distances compared to untreated animals.
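
      The dependency raised in this point can be illustrated with a minimal simulation sketch. It assumes the standard Cavalli-Sforza and Edwards form of the chord distance, (2/π)·sqrt(2·(1 − Σ sqrt(p_i·q_i))), which may be scaled differently in the authors' pipeline. The helper names, the uniform 200-tag inoculum, and the sampling scheme are hypothetical choices for illustration and are not taken from the manuscript. Two populations are drawn independently from the same inoculum, so any difference in their distance comes from founding population size alone.

```python
import math
import random
from collections import Counter


def chord_distance(p, q):
    """Cavalli-Sforza & Edwards chord distance between two tag-frequency vectors.

    Assumed form: (2/pi) * sqrt(2 * (1 - sum_i sqrt(p_i * q_i))).
    """
    overlap = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp at zero to guard against floating-point overshoot of the overlap.
    return (2 / math.pi) * math.sqrt(2 * max(0.0, 1.0 - overlap))


def sample_population(n_tags, n_founders, rng):
    """Draw n_founders cells from a uniform inoculum of n_tags barcodes (hypothetical model)."""
    counts = Counter(rng.randrange(n_tags) for _ in range(n_founders))
    return [counts[t] / n_founders for t in range(n_tags)]


def mean_distance(n_tags, n_founders, n_pairs=20, seed=0):
    """Average chord distance between pairs of populations sampled independently."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        p = sample_population(n_tags, n_founders, rng)
        q = sample_population(n_tags, n_founders, rng)
        total += chord_distance(p, q)
    return total / n_pairs


if __name__ == "__main__":
    # Both populations are equally unrelated at every founding size;
    # only the number of founders changes between rows.
    for n_founders in (10, 100, 1000, 10000):
        print(n_founders, round(mean_distance(200, n_founders), 3))
```

      Under these assumptions, the mean distance shrinks as the founding population grows, even though the sampled populations are never more or less related to each other. This is only a toy model of the concern, not an analysis of the authors' data.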

    3. Reviewer #2 (Public review):

      In this paper, Hotinger et al. propose an improved barcoded library system, called STAMPR, to study Salmonella population dynamics during infection. Using this system, the authors demonstrate significant diversity in the colonization of different Salmonella clones (defined by the presence of different barcodes) not only across different organs (liver, spleen, adipose tissues, pancreas, and gall bladder) but also within different compartments of the same gastrointestinal tissue. Additionally, this system revealed that microbiota competition is the major bottleneck in Salmonella intestinal colonization, which can be mitigated by streptomycin treatment. However, this has been demonstrated previously in numerous publications. They also show that there was minimal sharing between populations found in the intestine and those in the other organs. Upon IV and IP infection to bypass the intestinal bottleneck, they were able to demonstrate, using this library, that Salmonella can re-enter the intestine through two possible routes. One route is essentially the reverse of the path used to escape the gut, leading to a diverse intestinal population, while the other, through the bile, typically results in a clonal population. Although the authors showed that the STAMPR pipeline improved the ability to identify founder populations and their diversity within the same animal during infections, some of the conclusions appear speculative and not fully supported.

      (1) It's particularly interesting how the authors, using this system, demonstrate the dominant role of the microbiota bottleneck in Salmonella colonization and how it is widened by antibiotic treatment (Figure 1). Additionally, the ability to track Salmonella reseeding of the gut from other organs starting with IV and IP injections of the pathogen provides a new tool to study population dynamics (Figure 5). However, I don't think it is possible to argue that the proximal and distal small intestine, Peyer's patches (PPs), cecum, colon, and feces have different founder populations for reasons other than stochastic variations. All the barcoded Salmonella clones have the same fitness, and the fact that some are found or expanded in one region of the gastrointestinal tract rather than another likely results from random chance (such as being forced into a specific region of the gut for physical or spatial reasons) and subsequent expansion, rather than any inherent biological cause. For example, some bacteria may randomly adhere to the mucus, some may swim toward the epithelial layer, while others remain in the lumen; all will proliferate in those respective sites. In this way, different founder populations arise based on random localization during movement through the gastrointestinal tract, which is an observation, but it doesn't significantly contribute to understanding pathogen colonization dynamics or pathogenesis. Therefore, I would suggest placing less emphasis on describing these differences or better discussing this aspect, especially in the context of the gastrointestinal tract.

      (2) I do think that STAMPR is useful for studying the dynamics of pathogen spread to organs where Salmonella likely resides intracellularly (Figure 3). The observation that the liver is colonized by an early intestinal population, which continues to proliferate at a steady rate throughout the infection, is very interesting and may be due to the unique nature of the organ compared to the mucosal environment. What is the biological relevance during infection? Do the authors observe the same pattern (Figures 3C and G) when normalizing the population data for the spleen and mesenteric lymph nodes (mLN)? If not, what do the authors think is driving this different distribution?

      (3) Figure 6: Could the bile pathology be due to increased general bacterial translocation rather than Salmonella colonization specifically? Did the authors check for the presence of other bacteria (potentially also proliferating) in the bile? Do the authors know whether Salmonella's metabolic activity in the bile could be responsible for gallbladder pathology?

    1. eLife Assessment

      This important study uses a large dataset from both recent isolates and genomes in databases to provide an analysis of the population structure of the pathogen Salmonella gallinarum. The results regarding regional adaptation and the evolutionary trajectory of the resistome and mobilome remain incomplete, requiring additional details to fully support their claims and assess the value of these insights for future policy interventions regarding this and other pathogens. This work will interest microbiologists and researchers working on genomics, evolution, and antimicrobial resistance.

    2. Reviewer #1 (Public review):

      Summary:

      The investigators in this study analyzed the dataset assembly from 540 Salmonella isolates, and those from 45 recent isolates from Zhejiang University of China. The analysis and comparison of the resistome and mobilome of these isolates identified a significantly higher rate of cross-region dissemination compared to localized propagation. This study highlights the key role of the resistome in driving the transition and evolutionary history of S. Gallinarum.

      Strengths:

      The isolates included in this study were from 16 countries in the past century (1920 to 2023). While the study uses S. Gallinarum as the prototype, the conclusion from this work will likely apply to other Salmonella serotypes and other pathogens.

      Weaknesses:

      While the isolates came from 16 countries, most strains in this study were originally from China.

    3. Reviewer #2 (Public review):

      Summary:

      The authors sequence 45 new samples of S. Gallinarum, a commensal Salmonella found in chickens, which can sometimes cause disease. They combine these sequences with around 500 from public databases, determine the population structure of the pathogen, and coarse relationships of lineages with geography. The authors further investigate known anti-microbial genes found in these genomes, how they associate with each other, whether they have been horizontally transferred, and date the emergence of clades.

      Strengths:

      (1) It doesn't seem that much is known about this serovar, so publicly available new sequences from a high-burden region are a valuable addition to the literature.

      (2) Combining these sequences with publicly available sequences is a good way to better contextualise any findings.

      Weaknesses:

      There are many issues with the genomic analysis that undermine the conclusions, the major ones I identified being:

      (1) Recombination removal using Gubbins was not presented fully anywhere. In this diversity of species, it is usually impossible to remove recombination in this way. A phylogeny with genetic scale and the Gubbins results is needed. Critically, results on timing the emergence (Figure 2) depend on this, and cannot be trusted given the data presented.

      (2) The use of BEAST was also only briefly presented, but is the basis of a major conclusion of the paper. Plot S3 (root-to-tip regression) is unconvincing as a basis of this data fitting a molecular clock model. We would need more information on this analysis, including convergence and credible intervals.

      (3) Using a distance of 100 SNPs for a transmission is completely arbitrary. This would at least need to be justified in terms of the evolutionary rate and serial interval.

      (4) The HGT definition is non-standard, and phylogeny (vertical inheritance) is not controlled for.

      The cited method:

      'In this study, potentially recently transferred ARGs were defined as those with perfect identity (more than 99% nucleotide identity and 100% coverage) in distinct plasmids in distinct host bacteria using BLASTn (E-value ≤ 10⁻⁵)'

      This clearly does not apply here, as the application of distinct hosts and plasmids cannot be used. Subsequent analysis using this method is likely invalid, and some of it (e.g. Figure 6c) is statistically very poor.

      (5) Associations between lineages, resistome, mobilome, etc do not control for the effect of genetic background/phylogeny. So e.g. the claim 'the resistome also demonstrated a lineage-preferential distribution' is not well-supported.

      (6) The invasiveness index is not well described, and the difference in means is not biologically convincing as although it appears significant, it is very small.

      (7) 'In more detail, both the resistome and mobilome exhibited a steady decline until the 1980s, followed by a consistent increase from the 1980s to the 2010s. However, after the 2010s, a subsequent decrease was identified.'

      Where is the data/plot to support this? Is it a significant change? Is this due to sampling or phylogenetics?

      (8) It is not clear what the burden of disease this pathogen causes in the population, or how significant it is to agricultural policy. The article claims to 'provide valuable insights for targeted policy interventions.', but no such interventions are described.

      (9) The abstract mentions stepwise evolution as a main aim, but no results refer to this.

      (10) The authors attribute changes in population dynamics to normalisation in China-EU relations and hen fever. However, even if the date is correct, this is not a strongly supported causal claim, as many other reasons are also possible (for example other industrial processes which may have changed during this period).

      (11) No acknowledgment of potential undersampling outside of China is made, for example, 'Notably, all bvSP isolates from Asia were exclusively found in China, which can be manually divided into three distinct regions (southern, eastern, and northern).'. Perhaps we just haven't looked in other places?

      (12) Many of the conclusions are highly speculative and not supported by the data.

      (13) The figures are not always the best presentation of the data:

      a. Stacked bar plots in Figure 1 are hard to interpret, the total numbers need to be shown. Panel C conveys little information.

      b. Figure 4B: stacked bars are hard to read and do not show totals.

      c. Figure 5 has no obvious interpretation or significance.

      In summary, the quality of analysis is poor and likely flawed (although there is not always enough information on methods present to confidently assess this or provide recommendations for how it might be improved). So, the stated conclusions are not supported.
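
      Weakness (3) above asks that the 100-SNP transmission threshold be justified in terms of the evolutionary rate and serial interval. A minimal sketch of that kind of check follows, assuming a strict molecular clock. The function names and every numeric value are hypothetical placeholders chosen for illustration; none of them comes from the manuscript or from the reviewer.

```python
# Illustrative sketch only: every numeric value below is a placeholder
# assumption, not an estimate from the manuscript or the review.

def expected_pairwise_snps(rate_per_site_per_year, genome_length_bp, years_since_mrca):
    """Expected SNPs separating two genomes under a strict clock.

    Both lineages accumulate substitutions after their most recent common
    ancestor (MRCA), hence the factor of 2.
    """
    return 2.0 * rate_per_site_per_year * genome_length_bp * years_since_mrca


def years_implied_by_threshold(snp_threshold, rate_per_site_per_year, genome_length_bp):
    """Time since the MRCA at which the expected distance reaches the threshold."""
    return snp_threshold / (2.0 * rate_per_site_per_year * genome_length_bp)


if __name__ == "__main__":
    ASSUMED_RATE = 1e-7        # substitutions/site/year (hypothetical)
    ASSUMED_GENOME_BP = 4.7e6  # genome length in bp (hypothetical)
    SNP_THRESHOLD = 100        # cutoff questioned in weakness (3)

    years = years_implied_by_threshold(SNP_THRESHOLD, ASSUMED_RATE, ASSUMED_GENOME_BP)
    print(f"{SNP_THRESHOLD} SNPs corresponds to about {years:.0f} years since the MRCA")
```

      Under these placeholder values, a 100-SNP cutoff would correspond to roughly a century of divergence. Comparing such a figure with a plausible serial interval, using the rate actually estimated for the dataset, is the kind of justification the comment requests.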

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript "Heat Shock Factor Regulation of Antimicrobial Peptides Expression Suggests a Conserved Defense Mechanism Induced by Febrile Temperature in Arthropods," Xiao and colleagues examine the role of the shrimp Litopenaeus vannamei HSF1 ortholog (LvHSF1) in the response to viral infection. The authors provide compelling support for their conclusions that the activation of LvHSF1 limits viral load at high temperatures. Specifically, the authors convincingly show that (i) LvHSF1 mRNA and protein are induced in response to viral infection at high temperatures, (ii) increased LvHSF1 levels can directly induce the expression of the nSWD (and directly or indirectly other antibacterial peptides, AMPs), (iii) nSWD's antimicrobial activities can limit viral load, and (iv) LvHSF1 protects survival at high temperatures following virus infection. These data thus provide a model by which an increase in HSF1 levels limits viral load through the transcription of antimicrobial peptides and provides a rationale for the febrile response as a conserved response to viral infection.

      Strengths:

      The large body of careful time series experiments, tissue profiling, and validation of RNA-seq data is convincing. Several experimental methodologies are used to support the authors' conclusions that nSWD is an LvHSF1 target and increased LvHSF1 alone can explain increased levels of nSWD. Similar carefully conducted experiments also conclusively implicate nSWD protein in limiting WSSV viral loads.

      Weaknesses:

      Despite these compelling data regarding the protective role of HSF1 in the febrile response, what remains unexplained and complicates the authors' model is the observation that losing LvHSF1 at 'normal' temperatures of 25 °C is not detrimental to survival, even though viral loads increase and nSWD is likely still subject to LvHSF1 regulation. These observations suggest that WSSV infection may have other detrimental effects on the cell not reflected by viral load and that LvHSF1 may play additional roles in protecting the organism from these effects of WSSV infection, such as, perhaps, perturbations to protein homeostasis. This is worth discussing, especially in light of the rather complicated roles of hormesis in protection from infection, the role of HSF1 in hormesis responses, and the findings from other groups that the authors discuss.

    2. Reviewer #2 (Public review):

      Temperature is a critical factor affecting the progression of viral diseases in vertebrates and invertebrates. In the current study, the authors investigate mechanisms by which high temperatures promote anti-viral resistance in shrimp. They show that high temperatures induce HSF1 expression, which in turn upregulates AMPs. The AMPs target viral envelope proteins and inhibit viral infection/replication. The authors confirm this process in Drosophila and suggest that there may be a conserved mechanism of high-temperature mediated anti-viral response in arthropods. These findings will enhance our understanding of how high temperature improves resistance to viral infection in animals.

      The conclusions of this paper are mostly well supported by data, but some aspects of data analysis need to be clarified and extended. Further investigation on how WSSV infection is affected by AMP would have strengthened the study.

    3. Reviewer #3 (Public review):

      In the manuscript titled "Heat Shock Factor Regulation of Antimicrobial Peptides Expression Suggests a Conserved Defense Mechanism Induced by Febrile Temperature in Arthropods", the authors investigate the role of heat shock factor 1 (HSF1) in regulating antimicrobial peptides (AMPs) in response to viral infections, particularly focusing on febrile temperatures. Using shrimp (Litopenaeus vannamei) and Drosophila S2 cells as models, this study shows that HSF1 induces the expression of AMPs, which in turn inhibit viral replication, offering insights into how febrile temperatures enhance immune responses. The study demonstrates that HSF1 binds to heat shock elements (HSE) in AMPs, suggesting a conserved antiviral defense mechanism in arthropods. The findings are informative for understanding innate immunity against viral infections, particularly in aquaculture. However, the logical flow of the paper can be improved.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      Dr. Santamaria's group previously utilized antigen-specific nanomedicines to induce immune tolerance in treating autoimmune diseases. The success of this therapeutic strategy has been linked to expanded regulatory mechanisms, particularly the role of T-regulatory type-1 (TR1) cells. However, the differentiation program of TR1 cells remained largely unclear. Previous work from the authors suggested that TR1 cells originate from T follicular helper (TFH) cells. In the current study, the authors aimed to investigate the epigenetic mechanisms underlying the transdifferentiation of TFH cells into IL-10-producing TR1 cells. Specifically, they sought to determine whether this process involves extensive chromatin remodeling or is driven by preexisting epigenetic modifications. Their goal was to understand the transcriptional and epigenetic changes facilitating this transition and to explore the potential therapeutic implications of manipulating this pathway. 

      The authors successfully demonstrated that the TFH-to-TR1 transdifferentiation process is driven by pre-existing epigenetic modifications rather than extensive new chromatin remodeling. The comprehensive transcriptional and epigenetic analyses provide robust evidence supporting their conclusions. 

      Strengths: 

      (1) The study employs a broad range of bulk and single-cell transcriptional and epigenetic tools, including RNA-seq, ATAC-seq, ChIP-seq, and DNA methylation analysis. This comprehensive approach provides a detailed examination of the epigenetic landscape during the TFH-to-TR1 transition. 

      (2) The use of high-throughput sequencing technologies and sophisticated bioinformatics analyses strengthens the foundation for the conclusions drawn. 

      (3) The data generated can serve as a valuable resource for the scientific community, offering insights into the epigenetic regulation of T-cell plasticity. 

      (4) The findings have significant implications for developing new therapeutic strategies for autoimmune diseases, making the research highly relevant and impactful. 

      We thank the reviewer for providing constructive feedback on the manuscript.

      Weaknesses: 

      (1) While the scope of this study lies in transcriptional and epigenetic analyses, the conclusions need to be validated by future functional analyses. 

      We fully agree with the reviewer’s suggestion. We have added the following text to the Discussion to address this concern: “The current study provides a foundational understanding of how the epigenetic landscape of TFH cells evolves as they transdifferentiate into TR1 progeny in response to chronic ligation of cognate TCRs using pMHCII-NPs. Our current studies focus on functional validation of these observations, by carrying out extensive perturbation studies of the TFH-TR1 transdifferentiation pathway in conditional transcription factor gene knock-out mice. In these ongoing studies, genes coding for a series of transcription factors expressed along the TFH-TR1 pathway are selectively knocked out in T cells, to ascertain (i) the specific roles of key transcription factors in the various cell conversion events and transcriptional changes that take place along the TFH-TR1 cell axis; (ii) the roles that such transcription factors play in the chromatin re-modeling events that underpin the TFH-TR1 transdifferentiation process; and (iii) the effects of transcription factor gene deletion on phenotypic and functional readouts of TFH and regulatory T cell function.”

      (2) This study successfully identified key transcription factors and epigenetic marks. How these factors mechanistically drive chromatin closure and gene expression changes during the TFH-to-TR1 transition requires further investigation. 

      Agreed. Please see our response to point #1 above.  

      (3) The study provides a snapshot of the epigenetic landscape. Future dynamic analysis may offer more insights into the progression and stability of the observed changes. 

      We have previously shown that the first event in the pMHCII-NP-induced TFH-TR1 transdifferentiation process involves proliferation of cognate TFH cells in the splenic germinal centers. This event is followed by immediate transdifferentiation of the proliferated TFH cells into transitional and terminally differentiated TR1 subsets. Although the snapshot provided by our single cell studies reported herein documents the simultaneous presence of the different subsets composing the transdifferentiation pathway at any given time point, the transdifferentiation process itself is extremely fast, such that proliferated TFH cells already transdifferentiate into TR1 cells after a single pMHCII-NP dose (Sole et al., 2023a). This makes it extremely challenging to pursue dynamic experiments. Notwithstanding this caveat, ongoing studies of cognate T cells post treatment withdrawal, coupled to single cell studies of the TFH-TR1 pathway in transcription factor gene knockout mice exhibiting perturbed transdifferentiation processes, are likely to shed light on the progression and stability of the epigenetic changes reported herein.

      To address this limitation in the manuscript, we have added the following paragraph to the Discussion: “Although the snapshot provided by our single cell studies reported herein documents the simultaneous presence of the different subsets composing the TFH-TR1 cell pathway upon the termination of treatment, the transdifferentiation process itself is extremely fast, such that proliferated TFH cells already transdifferentiate into TR1 cells after a single pMHCII-NP dose (6). This makes it extremely challenging to pursue dynamic experiments. Notwithstanding this caveat, ongoing studies of cognate T cells post treatment withdrawal, coupled to single cell studies of the TFH-TR1 pathway in transcription factor gene knockout mice exhibiting perturbed transdifferentiation processes are likely to shed light into the progression and stability of the epigenetic changes reported herein”. 

      Reviewer #1 (Recommendations for the authors): 

      The authors may consider the following suggestions to improve this study: 

      (1) The authors may include a brief background on type 1 diabetes and the model involving BDC2.5 T cells to provide context for readers who may not be familiar with these aspects. 

      We have added this information to the first paragraph in the Results section: “BDC2.5mi/I-Ag7-specific CD4+ T cells comprise a population of autoreactive T cells that contribute to the progression of spontaneous autoimmune diabetes in NOD mice. The size of this type 1 diabetes-relevant T cell specificity is small and barely detectable in untreated NOD mice, but treatment with cognate pMHCII-NPs leads to the expansion and formation of antidiabetogenic TR1 cells that retain the antigenic specificity of their precursors (3). As a result, treatment of hyperglycemic NOD mice with these compounds results in the reversal of type 1 diabetes (3).”

      (2) It is understandable that further biological and functional experiments are beyond the scope of this paper, but it would be of interest to know how the authors envision future studies based on the transcriptional and epigenetic information obtained thus far. 

      We have added the following text to the Discussion section: “The current study provides a foundational understanding of how the epigenetic landscape of TFH cells evolves as they transdifferentiate into TR1 progeny in response to chronic ligation of cognate TCRs using pMHCII-NPs. Our current studies focus on functional validation of these observations, by carrying out extensive perturbation studies of the TFH-TR1 transdifferentiation pathway in conditional transcription factor gene knock-out mice. In these ongoing studies, genes coding for a series of transcription factors expressed along the TFH-TR1 pathway are selectively knocked out in T cells, to ascertain (i) the specific roles of key transcription factors in the various cell conversion events and transcriptional changes that take place along the TFH-TR1 cell axis; (ii) the roles that such transcription factors play in the chromatin re-modeling events that underpin the TFH-TR1 transdifferentiation process; and (iii) the effects of transcription factor gene deletion on phenotypic and functional readouts of TFH and regulatory T cell function.”

      (3) The authors may consider adjusting figures where genes are crowded or difficult to read due to small font size. 

      Figures with crowded text have been modified to facilitate reading.

      Reviewer #2 (Public Review): 

      Summary: 

      This study, based on their previous findings that TFH cells can be converted into TR1 cells, conducted a highly detailed and comprehensive epigenetic investigation to answer whether TR1 differentiation from TFH is driven by epigenetic changes. Their evidence indicated that the downregulation of TFH-related genes during the TFH to TR1 transition depends on chromatin closure, while the upregulation of TR1-related genes does not depend on epigenetic changes. 

      Strengths: 

      (1) A significant advantage of their approach lies in its detailed and comprehensive assessment of epigenetics. Their analysis of epigenetics covers chromatin open regions, histone modifications, DNA methylation, and using both single-cell and bulk techniques to validate their findings. As for their results, observations from different epigenetic perspectives mutually supported each other, lending greater credibility to their conclusions. This study effectively demonstrates that (1) the TFH-to-TR1 differentiation process is associated with massive closure of OCRs, and (2) the TR1-poised epigenome of TFH cells is a key enabler of this transdifferentiation process. Considering the extensive changes in epigenetic patterns involved in other CD4+ T lineage commitment processes, the similarity between TFH and TR1 in their epigenetics is intriguing. 

      (2) They performed correlation analysis to answer the association between "pMHC-NP-induced epigenetic change" and "gene expression change in TR1". Also, they have made their raw data publicly available, providing a comprehensive epigenomic database of pMHC-NP-induced TR1 cells. This will serve as a valuable reference for future research.

      We thank the reviewer for his/her constructive feedback and suggestions for improvement of the manuscript.

      Weaknesses: 

      (1) A major limitation is that this study heavily relies on a premise from the previous studies performed by the same group on pMHC-NP-induced T-cell responses. This significantly limits the relevance of their conclusion to a broader perspective. Specifically, differential OCRs between Tet+ and naïve T cells were limited to only 821, as compared to 10,919 differential OCRs between KLH-TFH and naïve T cells (Figure 2A), indicating that the precursors and T cell clonotypes that responded to pMHC-NP were extremely limited. This limitation should be clearly discussed in the Discussion section. 

      We agree that this study focuses on a very specific, previously unrecognized pathway discovered in mice treated with pMHCII-NPs. Despite this apparent narrow perspective, we now have evidence that this is a naturally occurring pathway that also develops in other contexts (i.e., in mice that have not been treated with pMHCII-NPs). Furthermore, this pathway affords a unique opportunity to further understand the transcriptional and epigenetic mechanisms underpinning T cell plasticity; the findings reported can help guide/inform not only upcoming translational studies of pMHCII-NP therapy in humans, but also other research in this area. 

      We have added the following text to the Discussion to address this limitation: “Although the TFH-TR1 transdifferentiation was discovered in mice treated with pMHCII-NPs, we now have evidence that this is a naturally occurring pathway that also develops in other contexts (i.e., in mice that have not been treated with pMHCII-NPs). Importantly, the discovery of this transdifferentiation process affords a unique opportunity to further understand the transcriptional and epigenetic mechanisms underpinning T cell plasticity; the findings reported here can help guide/inform not only upcoming translational studies of pMHCII-NP therapy in humans, but also other research in this area”.

      We acknowledge that, in the bulk ATAC-seq studies, the differences in the number of OCRs found in tetramer+ cells or KLH-induced TFH cells vs. naïve T cells may be influenced by the intrinsic oligoclonality of the tetramer+ T cell pool arising in response to repeated pMHCII-NP challenge (Sole et al., 2023a). However, we note that our scATAC-seq studies of the tetramer+ T cell pool found similar differences between the oligoclonal tetramer+ TFH subpool and its (also oligoclonal) tetramer+ TR1 counterparts (i.e., substantially higher number of OCRs in the former vs. the latter relative to naïve T cells). 

      This has been clarified in the revised version of the manuscript, by adding the following text to the last paragraph of the Results subsection entitled “Contraction of the chromatin in pMHCII-NP-induced Tet+ vs. TFH cells at the bulk level”: “We acknowledge that, in the bulk ATAC-seq studies, the differences in the number of OCRs found in tetramer+ cells or KLH-induced TFH cells vs. naïve T cells may be influenced by the intrinsic oligoclonality of the tetramer+ T cell pool arising in response to repeated pMHCII-NP challenge (6). However, we note that scATAC-seq studies of the tetramer+ T cell pool found similar differences between the oligoclonal tetramer+ TFH subpool and its (also oligoclonal) tetramer+ TR1 counterparts (i.e., substantially higher number of OCRs in the former vs. the latter relative to naïve T cells)”.

      (2) This article uses peak calling to determine whether a region has histone modifications, claiming that the regions with histone modifications in TFH and TR1 are highly similar. However, they did not discuss the differences in histone modification intensities measured by ChIP-seq. For example, as shown in Figure 6C, IL10 H3K27ac modification in Tet+ cells showed significantly higher intensity than KLH-TFH, while in this article, it may be categorized as "possessing same histone modification region". Discussing these intensity differences would strengthen their conclusions.

      We appreciate your suggestion to discuss differences in histone modification intensities as measured by ChIP-seq. However, we respectfully disagree with the reviewer’s interpretation of these data.

      Our study primarily focuses on the identification of epigenetic similarities and differences between pMHCII-NP-induced tetramer+ cells and KLH-induced TFH cells relative to naive T cells. The outcome of direct comparisons of histone deposition (ChIP-seq) between these cell types is summarized in the lower part of Figure 4B and detailed in Datasheet 5. Throughout this section, we mention the number of differentially enriched regions, their overlap with OCRs shared between tetramer+ TFH and tetramer+ TR1 cells based on scATAC-seq data, and the associated genes. Clearly, the epigenetic modifications that TR1 cells inherit from TFH cells were acquired by TFH cells upon differentiation from naïve T cell precursors. 

      Regarding the specific point raised by the reviewer on differences in the intensity of the H3K27Ac peaks linked to Il10 in Figure 6C, we note that the genomic tracks shown are illustrative. Thorough statistical analyses involving signal background for each condition and p-value adjustment did not support differential enrichment for H3K27Ac deposition around the Il10 gene between pMHCII-NP-induced tetramer+ T cells and KLH-induced TFH cells. 

      This has now been clarified by adding the following text to the end of the Results subsection entitled “H3K4me3, H3K27me3 and H3K27ac marks in genes upregulated during the TFH-to-TR1 cell conversion are already in place at the TFH cell stage”: “We note that, although in the representative chromosome track views shown in Fig. 6C there appear to be differences in the intensity of the peaks, thorough statistical analyses involving signal background for each condition and p-value adjustment did not support differential enrichment for histone deposition around the Il10 gene between pMHCII-NP-induced tetramer+ T cells and KLH-induced TFH cells.”

      We have also clarified this in the corresponding part of the Methods section (“ATAC-seq and ChIP-seq” under “Bioinformatic and Statistical Analyses”): “Given that peak calling alone does not account for variations in the intensity of histone mark deposition, analysis of differential histone deposition includes both qualitative and quantitative assessments. Whereas qualitative assessment involves evaluating the overall pattern and distribution of the various histone marks, quantitative assessment measures the intensity and magnitude of histone mark deposition.”

      (3) Last, the key findings of this study are clear and convincing, but some results and figures are unnecessary and redundant. Some results are largely a mere confirmation of the relationship between histone marks and chromatin status. I propose to reduce the number of figures and text that are largely confirmatory. Overall, I feel this paper is too long for its current contents. 

      We understand your concern about the potential redundancy of some results and figures. Our aim in including these analyses was to provide a comprehensive understanding of the intricate relationships between epigenetic features and transcriptomic differences. We believe that a detailed examination of these relationships is crucial for several reasons: (i) the breadth of the data allows for a thorough exploration of the relationships between histone marks, open chromatin status and transcriptional differences, and this comprehensive approach helps to ensure that our conclusions are robust and well-supported; (ii) some of the results that may appear confirmatory are, in fact, important for validating and reinforcing the consistency of our findings across different contexts, and these details are intended to provide a nuanced understanding of the interactions between epigenetic features and gene expression; and (iii) by presenting a detailed analysis, we aim to offer a solid foundation for future research in this area. The extensive data presented will serve as a valuable resource for others in the field who may seek to build on our findings.

      That said, we have carefully reviewed the manuscript to identify and streamline elements that might be perceived as overly redundant, while retaining the depth of analysis that we believe is essential.

      Reviewer #2 (Recommendations for the authors): 

      (1) In Figure 1E, the text states "94% (n=217/231) of the genes associated with chromatin regions that had closed during the TFH-TR1 conversion,", but n=231 does not match the n=1820 given in Figure 1D for downregulated genes. This is one example of numbers that do not match between figures or that lack sufficient explanation. Please check those numbers carefully and add some sentences if necessary.

      We note that the text referring to Figure 1D describes the total number of differentially expressed genes between Tet+ TR1 and Tet+ TFH cells using the scMultiome dataset (n = 2,086 genes downregulated in the former vs. the latter; and n = 266 genes upregulated in the former vs. the latter). The text in the paragraph that follows (referring to Figure 1E) focuses exclusively on the genes that had closed chromatin regions during the TFH-to-TR1 conversion, to ascertain whether or not chromatin closure was indeed associated with such gene downregulation. 

      We have modified the first sentence in the paragraph referring to Figure 1E to clarify this point for the reader: “Further analyses focusing on the genes that had closed chromatin regions during the TFH-to-TR1 conversion, confirmed…”

    2. eLife Assessment

      This study provides important information on pre-existing epigenetic modification in T cell plasticity. The evidence supporting the conclusions is compelling, supported by comprehensive transcriptional and epigenetic analyses. The work will be of interest to immunologists and colleagues studying transcriptional regulation.

    3. Reviewer #1 (Public review):

      Summary:

      Dr. Santamaria's group previously utilized antigen-specific nanomedicines to induce immune tolerance in treating autoimmune diseases. The success of this therapeutic strategy has been linked to expanded regulatory mechanisms, particularly the role of T-regulatory type-1 (TR1) cells. However, the differentiation program of TR1 cells remained largely unclear. Previous work from the authors suggested that TR1 cells originate from T follicular helper (TFH) cells. In the current study, the authors aimed to investigate the epigenetic mechanisms underlying the transdifferentiation of TFH cells into IL-10-producing TR1 cells. Specifically, they sought to determine whether this process involves extensive chromatin remodeling or is driven by pre-existing epigenetic modifications. Their goal was to understand the transcriptional and epigenetic changes facilitating this transition and to explore the potential therapeutic implications of manipulating this pathway.

      The authors successfully demonstrated that the TFH-to-TR1 transdifferentiation process is driven by pre-existing epigenetic modifications rather than extensive new chromatin remodeling. The comprehensive transcriptional and epigenetic analyses provide robust evidence supporting their conclusions.

      Strengths:

      (1) The study employs a broad range of bulk and single-cell transcriptional and epigenetic tools, including RNA-seq, ATAC-seq, ChIP-seq, and DNA methylation analysis. This comprehensive approach provides a detailed examination of the epigenetic landscape during the TFH-to-TR1 transition.

      (2) The use of high-throughput sequencing technologies and sophisticated bioinformatics analyses strengthens the foundation for the conclusions drawn.

      (3) The data generated can serve as a valuable resource for the scientific community, offering insights into the epigenetic regulation of T cell plasticity.

      (4) The findings have significant implications for developing new therapeutic strategies for autoimmune diseases, making the research highly relevant and impactful.

      Weaknesses:

      (1) While the study focuses on transcriptional and epigenetic analyses, the authors are currently undertaking efforts to validate these findings functionally. Ongoing research aims to further explore the roles of key transcription factors in the TFH-to-TR1 transition, reflecting the authors' commitment to building on the insights gained from this study.

      (2) The identification of key transcription factors and epigenetic marks is a strong foundation for future work. The authors are actively investigating how these factors drive chromatin remodeling, which will enhance the mechanistic understanding of the TFH-to-TR1 process in future studies.

      (3) Although the study provides a valuable snapshot of the epigenetic landscape, the authors are pursuing additional research to assess the dynamics of these changes over time. These ongoing efforts will contribute to a deeper understanding of the stability and progression of the observed epigenetic modifications.

      Comments on revised version:

      The authors have effectively discussed and addressed all previously raised questions. There are no further concerns.

    4. Reviewer #2 (Public review):

      Summary:

      This study, based on their previous findings that TFH cells can be converted into TR1 cells, conducted a highly detailed and comprehensive epigenetic investigation to answer whether TR1 differentiation from TFH is driven by epigenetic changes. Their evidence indicated that the downregulation of TFH-related genes during the TFH to TR1 transition depends on chromatin closure, while the upregulation of TR1-related genes does not depend on epigenetic changes.

      Strengths:

      A significant advantage of their approach lies in its detailed and comprehensive assessment of epigenetics. Their analysis covers open chromatin regions, histone modifications, and DNA methylation, and uses both single-cell and bulk techniques to validate the findings. Observations from these different epigenetic perspectives mutually support each other, lending greater credibility to the conclusions. This study effectively demonstrates that (1) the TFH-to-TR1 differentiation process is associated with massive closure of OCRs, and (2) the TR1-poised epigenome of TFH cells is a key enabler of this transdifferentiation process. Considering the extensive changes in epigenetic patterns involved in other CD4+ T lineage commitment processes, the similarity between TFH and TR1 in their epigenetics is intriguing.

      They performed correlation analysis to assess the association between "pMHC-NP-induced epigenetic change" and "gene expression change in TR1". They have also made their raw data publicly available, providing a comprehensive epigenomic database of pMHC-NP-induced TR1 cells. This will serve as a valuable reference for future research.

      Weaknesses:

      A major limitation is that this study heavily relies on a premise from the previous studies performed by the same group on pMHC-NP-induced T cell responses. This significantly limits the relevance of their conclusion to a broader perspective. Specifically, differential OCRs between Tet+ and naïve T cells were limited to only 821, as compared to 10,919 differential OCRs between KLH-TFH and naïve T cells (Fig. 2A), indicating that the precursors and T cell clonotypes that responded to pMHC-NP were extremely limited. I acknowledge that this limitation has been added and discussed in the Discussion section of the revised manuscript.

    1. eLife Assessment

      The authors have developed a valuable approach that employs cell-free expression to reconstitute ion channels into giant unilamellar vesicles for biophysical characterisation. The work is convincing and will be of particular interest to those studying ion channels that primarily occur in organelles and are therefore not amenable to study by more traditional methods.

    2. Reviewer #1 (Public review):

      Summary:

      The authors have developed a valuable method based on a fully cell-free system to express a channel protein and integrate it into a membrane vesicle in order to characterize it biophysically. The study presents a useful alternative to study channels that are not amenable to being studied by more traditional methods.

      Strengths:

      The evidence supporting the claims of the authors is solid and convincing. The method will be of interest to researchers working on ionic channels, allowing them to study a wide range of ion channel functions such as those involved in transport, interaction with lipids, or pharmacology.

      Weaknesses:

      The inclusion of a mechanistic interpretation of how the channel protein folds into a protomer or a tetramer to become functional in the membrane would strengthen the study.

      Comments on revised version:

      In the revised version, the authors did not experimentally address how tetrameric or protomeric proteins are actually produced. However, they performed new experiments to assess the amount of tetramer that is actually formed. They used size-exclusion chromatography to conclude that both protomeric and tetrameric species are formed and assembled.

      The authors have addressed most of my minor concerns and have modified or updated the manuscript following my recommendations, so I have no further comments.

    3. Reviewer #2 (Public review):

      It is challenging to study the biophysical properties of organelle channels using conventional electrophysiology. The conventional reconstitution methods require multiple steps and can be contaminated by endogenous ionophores from the host cell lines after purification. To overcome this challenge, in this manuscript, Larmore et al. described a fully synthetic method to assay the functional properties of the TRPP channel family. The TRPP channels are an important organelle ion channel family that natively traffic to primary cilia and ER organelles. The authors utilized cell-free protein expression and reconstitution of the synthetic channel protein into giant unilamellar vesicles (GUVs), so that single-channel properties can be measured using voltage-clamp electrophysiology. Using this innovative method, the authors characterized their membrane integration, orientation, and conductance, comparing the results to those of endogenous channels. The manuscript is well-written and may present broad interest to the ion channel community studying organelle ion channels. Particularly because of the challenges of patching native cilia cells, the functional characterization is highly concentrated in very few labs. This method may provide an alternative approach to investigate other channels resistant to biophysical analysis and pharmacological characterization.

      Comments on revised version:

      The authors have addressed my concerns. This excellent method manuscript would benefit the study of organelle channels.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors have developed a valuable method based on a fully cell-free system to express a channel protein and integrate it into a membrane vesicle in order to characterize it biophysically. The study presents a useful alternative to study channels that are not amenable to being studied by more traditional methods.

      Strengths:

      The evidence supporting the claims of the authors is solid and convincing. The method will be of interest to researchers working on ionic channels, allowing them to study a wide range of ion channel functions such as those involved in transport, interaction with lipids, or pharmacology.

      Weaknesses:

      The inclusion of a mechanistic interpretation of how the channel protein folds into a protomer or a tetramer to become functional in the membrane would strengthen the study.

      Work from other labs has described key factors that can improve expression and artificial lipid integration of cell-free-derived transmembrane proteins (PMIDs: 35520093, 29625253, 26270393). However, a significant number of additional experiments would be needed to elucidate the exact biophysical properties governing channel assembly of synthetically derived polycystins. We carried out additional biochemical experiments to address these concerns (see new Figure 1—figure supplement 1D, E). We used fluorescence-detection size-exclusion chromatography (FSEC) with the goal of understanding what fraction of the CFE-derived protomers fold and assemble into functional tetramers upon incorporation into SUVs. When compared to recombinant protein from HEK cells, the production of assembled channels is less than 4% when using the CFE+SUV approach, an estimate based on the oligomer peak fluorescence. In the absence of chaperones found in cells, the assembly of synthetically derived protomers into tetramers is likely intrinsic to the chemical properties of the proteins and the biophysical principles governing helical membrane proteins when inserted into the lipid membrane (PMID: 35133709). We have added our interpretation in lines 111-121.

      Reviewer #2 (Public Review):

      It is challenging to study the biophysical properties of organelle channels using conventional electrophysiology. The conventional reconstitution methods require multiple steps and can be contaminated by endogenous ionophores from the host cell lines after purification. To overcome this challenge, in this manuscript, Larmore et al. described a fully synthetic method to assay the functional properties of the TRPP channel family. The TRPP channels are an important organelle ion channel family that natively traffic to primary cilia and ER organelles. The authors utilized cell-free protein expression and reconstitution of the synthetic channel protein into giant unilamellar vesicles (GUVs), so that single-channel properties can be measured using voltage-clamp electrophysiology. Using this innovative method, the authors characterized their membrane integration, orientation, and conductance, comparing the results to those of endogenous channels. The manuscript is well-written and may present broad interest to the ion channel community studying organelle ion channels. Particularly because of the challenges of patching native cilia cells, the functional characterization is highly concentrated in very few labs. This method may provide an alternative approach to investigate other channels resistant to biophysical analysis and pharmacological characterization.

      Thank you for evaluating our manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) It would be useful to explain how the Polycystin protein is folded under the experimental conditions used. The expression data shown in Figure 1 Supplement 1B show different protein concentrations of protomer or tetramer. However, it is not described how each form is identified and distinguished. It is also important to mention in the manuscript that this method is only applicable to membrane channels that do not require chaperons for its folding and expression into the membrane. How is the tetramer mechanistically conformed? In line 184, it is stated that this method can be leveraged for studying the effects of channel subunit composition. Would this method allow the expression of two different subunit proteins in order to produce a heteromeric channel?

      In Figure 1—figure supplement 1B, total fluorescence from the synthesized channel-GFP was measured. Protein concentration was calculated from a linear regression of the GFP standards. Monomeric protein concentration was reported directly from total fluorescence. Tetrameric protein concentration was calculated by dividing the fluorescence by four and then calculating the concentration from the GFP standards.
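
      The standard-curve calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used for the figure: the standard concentrations, fluorescence readings, units, and the helper names `fit_line` and `concentration_from_fluorescence` are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_fluorescence(fluorescence, slope, intercept):
    """Invert the standard curve: fluorescence -> GFP concentration."""
    return (fluorescence - intercept) / slope


# Hypothetical GFP standards: concentration (nM) vs. fluorescence (a.u.).
standard_conc = [0.0, 50.0, 100.0, 200.0]
standard_fluor = [10.0, 110.0, 210.0, 410.0]

slope, intercept = fit_line(standard_conc, standard_fluor)

# Hypothetical total fluorescence of one channel-GFP reaction.
sample_fluor = 250.0

# Protomer: read the concentration directly from total fluorescence.
protomer_conc = concentration_from_fluorescence(sample_fluor, slope, intercept)

# Tetramer: four GFP tags per channel, so divide the fluorescence by four
# before reading the concentration off the standard curve.
tetramer_conc = concentration_from_fluorescence(sample_fluor / 4, slope, intercept)
```

      Note that when the fitted intercept is not zero, dividing the fluorescence by four before applying the standard curve is not the same as dividing the protomer concentration by four; the sketch follows the order of operations stated in the response.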

      This is a good point. Based on your suggestion, we carried out additional biochemical experiments (see new Figure 1—figure supplement 1D, E). We used fluorescence-detection size-exclusion chromatography (FSEC) with the goal of understanding what fraction of the CFE-derived protomers fold and assemble into functional tetramers upon incorporation into SUVs. As controls, we produced recombinant PKD2-GFP and PKD2L1-GFP channels as elution time standards and to compare the relative production of tetrameric channels generated when using the two expression systems. The synthetically derived polycystin channels indeed produced tetramers and protomers, which supports the feasibility of using this method to assay their functional properties. When compared to recombinant protein from HEK cells, the production of assembled channels is less than 4% when using the CFE+SUV approach, an estimate based on the oligomer peak fluorescence. We speculate that assembly of synthetically derived protomers into tetramers is likely intrinsic to the chemical properties of the proteins and the biophysical principles governing helical membrane proteins when inserted into the lipid membrane (PMID: 35133709). Although an interesting question, a systematic analysis of these channel-lipid interactions is beyond the scope of this eLife Report but can be addressed in future studies. The limitation that this method can only characterize channels which fold and integrate into the membrane without the aid of molecular chaperones is now stated in lines 201-205. In principle, the CFE-GUV method can be deployed to co-express different subunits to produce heteromeric channels. We have modified the text in lines 192-197 to be clearer on this point.

      (2) The type of plasmid (and promoter) required for this methodology should be mentioned.

      Added to the methods (lines 210-211). “PKD2 and PKD2L1 are in pET19b plasmid under T7 promoter.”

      (3) Since this paper is methodological, it would be useful to have some information about the stability of the GUVs containing the synthetic channel. In Methods, it is stated that GUV vesicles are used on the same day (line 207). And in line 193 it says that the reactions (?) are placed at 4°C for storage.

      Restated in lines 226-228: GUVs are electroformed and used for electrophysiology the same day. SUVs with channel incorporated are stored at 4°C for 3 days.

      (4) A comment reasoning why the PKD2 protein is more frequently incorporated into the membrane in comparison to PKD2L1 should be included. A brief description of the differences between these two proteins would also be helpful for the reader.

      In terms of overall protein production and oligomeric assembly, more PKD2L1 channels are produced compared to PKD2 (see new Figure 1C, and Figure 1—figure supplement 1D, E). In lines 149-155 we note that single channel openings were frequently observed for the high-expressing PKD2L1 channels, but this often resulted in patch instability. As a result, GUV patches with the lower-expressing PKD2-GFP channel were more stable and thus more successfully recorded from. We have revised the text to be clearer on this point.

      (5) There are no methods for preparing hippocampal neurons or IMCD cells shown in Figure 4 Supplement 1. Instead, the method of mammalian cultures provided corresponds to HEK 293T cells.

      This information has been added to lines 273-284.

      (6) Minor:

      In Figure 2C, please include the actual % of the Cell488+Surface647+Clear lumen vesicles.

      Added

      Line 99, 108: Figures 1B and 1C are swapped. Please correct.

      Corrected in figure and figure legends.

      Line 108: misspelling: effect.

      Done

      Line 109: check sentence: verb is missing.

      Sentence now reads “Minimal changes in fluorescence were detected when a control plasmid (Ctrl) encoding a non-fluorescent protein (dihydrofolate reductase) was used in the reaction.”

      Line 145: recoding. Correct.

      Recoding changed to recordings

      Line 169: "from" is missing (recorded from MCD cilia).

      Added

      Line 169: In Table 1, the PKD2 K+ conductance magnitudes recorded from IMCD cilia were significantly smaller, not larger as stated, than those assayed using CFE-GUV system. Please correct.

      Corrected

      Line 180: "of" is missing (adaptation of CFE derived...).

      Corrected

      Line 182: "to" is missing (generalized to other channels).

      Corrected

      Line 193: "in" 4°C; correct to "at".

      Corrected

      Line 197: replace "mole" with "mol".

      Corrected

      Line 207: are used "within the" same day.

      Corrected

      Line 210: c-terminally. C-should be capital letter.

      Corrected

      Line 231: n-terminally. N- should be capital letter.

      Corrected

      Reviewer #2 (Recommendations For The Authors):

      The authors validated their method using PKD2 and PKD2L1 channels, demonstrating the potential of this approach. However, a few points merit further clarification or validation:

      (1) Stability of the protein vesicles for recording. The authors observed membrane instability during voltage transitions. It would be beneficial to discuss potential solutions to enhance stability.

      In lines 197-202, we have added a discussion of potential solutions to enhance stability. CsF in the intracellular saline could be added to stabilize the GUV membranes. CsF is frequently added to stabilize whole cell membranes in HTS planar patch clamp recording. We did not explore this formulation because Cs+ would limit outward polycystin conductance. We also suggest, but did not test, altering the membrane formulation of GUVs with additional cholesterol to stabilize these recordings.

      (2) Validation. Further discussion on how broadly this method can be applied to other channels would strengthen the manuscript.

      We have included further discussion on this point in lines 190-206. 

      (3) Protein production estimated by a standard GFP absorbance assay. The estimation of protein production using GFP absorption may be affected by improperly folded protein. Additional validation methods could be considered.

      C-terminal GFP fluorescence has been widely used in expression systems to designate proper folding of the target protein upstream of the GFP-tag (PMID: 22848743, PMID: 21805523, PMID: 35520093). Nonetheless, we have conducted additional experiments designed to estimate the amount of assembled PKD2 and PKD2L1 channels generated using the CFE method. In the new Figure 1—figure supplement 1D, E, we carried out fluorescence-detection size-exclusion chromatography and compared channel assembly of recombinant and CFE+SUV derived PKD2-GFP and PKD2L1-GFP. Here, we clearly observed tetrameric and protomeric forms of the channels using the synthetic approach, which supports the feasibility of using this method to assay their functional properties. When compared to recombinant protein from HEK cells, the production of assembled channels is less than 4% when using the CFE+SUV approach, an estimate based on the oligomer peak fluorescence.

      (4) Single channels were observed more frequently from PKD2 incorporated GUVs compared to PKD2L1. Does this just randomly happen or is there a reason behind this difference?

      In terms of overall protein production and oligomeric assembly, more PKD2L1 channels are produced compared to PKD2 (Figure 1C, and Figure 1—figure supplement 1D, E). This is apparent whether the channels are produced recombinantly in cells or when using the cell-free method (Figure 1—figure supplement 1D, E). In lines 149-155, we note that single channel openings were frequently observed but that the high expression of PKD2L1 often resulted in patch instability. As a result, GUV patches with the lower-expressing PKD2-GFP channel were more stable and thus more successfully recorded from. As requested, we have included a brief description of the two proteins in lines 76-78.

      (5) Additional validation or clarification for examining the channel orientation may strengthen the manuscript.

      We have modified the text to make this point clearer. 

      (6) Advantage and limitations. The authors compared the recordings from hippocampal primary cilia membranes, noting differences in conductance magnitudes compared to the GUV method. Further discussing the limitations and advantages of this approach for the biophysical properties of organelle channels would be beneficial.

      We have revised the final paragraph to discuss the limitations of this method.

      (7) Including experiments that demonstrate ligand-induced activation or inhibition to further validate the current using this method would strengthen the manuscript (optional, not required).

      Despite our best attempts, exchange of the external bath to apply inhibitors (Gd3+, La3+) resulted in GUV patch instability. Our plans are to investigate ways to stabilize the high resistance seals to develop pharmacological screening using the CFE+GUV method.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers for their interest in our studies. In response to their comments, we have conducted additional experiments and made the necessary revisions to the manuscript. The new studies included to address the reviewers’ comments are shown in Figure 1B, 1F, Figure 2—figure supplement 1, Figure 3, Figure 3—figure supplement 1, Figure 3—figure supplement 2, Figure 3—figure supplement 3, Figure 4E, Figure 4—figure supplement 1, Figure 5, Figure 5—figure supplement 1, Figure 5—figure supplement 2D, and Figure 6. We are grateful for the critiques, which have helped us substantially improve the quality of the manuscript.

      Below, we have provided a point-by-point response to the reviewers’ comments.  

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper, the authors show that disruption of calcineurin, which is encoded by tax-6 in C. elegans, results in increased susceptibility to P. aeruginosa, but extends lifespan. In exploring the mechanisms involved, the authors show that disruption of tax-6 decreases the rate of defecation leading to intestinal accumulation of bacteria and distension of the intestinal lumen. The authors further show that the lifespan extension is dependent on hlh-30, which may be involved in breaking down lipids following deficits in defecation, and nhr-8, whose levels are increased by deficits in defecation. The authors propose a model in which disruption of the defecation motor program is responsible for the effect of calcineurin on pathogen susceptibility and lifespan, but do not exclude the possibility that calcineurin affects these phenotypes independently of defecation.

      We thank the reviewer for providing an excellent summary of our work. We have performed additional experiments as suggested by both the reviewers and believe we have thoroughly addressed all the reviewers' concerns.

      Reviewer #2 (Public Review):

      The manuscript titled "Calcineurin Inhibition Enhances Caenorhabditis elegans Lifespan by Defecation Defects-Mediated Calorie Restriction and Nuclear Hormone Signaling" by Priyanka Das, Alejandro Aballay, and Jogender Singh reveals that inhibiting calcineurin, a conserved protein phosphatase, in C. elegans affects the defecation motor program (DMP), leading to intestinal bloating and increased susceptibility to bacterial infection. This intestinal bloating mimics calorie restriction, ultimately resulting in an enhanced lifespan. The research identifies the involvement of HLH-30 and NHR-8 proteins in this lifespan enhancement, providing new insights into the role of calcineurin in C. elegans DMP and mechanisms for longevity.

      The authors present novel findings on the role of calcineurin in regulating the defecation motor program in C. elegans and how its inhibition can lead to lifespan enhancement. The evidence provided is solid with multiple experiments supporting the main claims.

      Strengths:

      The manuscript's strength lies in the authors' use of genetic and biochemical techniques to investigate the role of calcineurin in regulating the DMP, innate immunity, and lifespan in C. elegans. Moreover, the authors' findings provide a new mechanism for calcineurin inhibition-mediated longevity extension, which could have significant implications for understanding the molecular basis of aging and developing interventions to promote healthy aging.

      (1) The study uncovers a new role for calcineurin in the regulation of C. elegans DMP and a potential novel pathway for enhancing lifespan via calorie restriction involving calcineurin, HLH-30, and NHR-8 in C. elegans.

      (2) Multiple signaling pathways involved in lifespan enhancement were investigated with fairly strong experimental evidence supporting their claims.

      We thank the reviewer for an excellent summary of our work and for highlighting the strengths of the findings.

      Weaknesses:

      The manuscript's weaknesses include the lack of mechanistic details regarding how calcineurin inhibition leads to defects in the DMP and induces calorie restriction-like effects on lifespan.

      The exact site of calcineurin action, i.e., whether in the intestine or enteric muscles (Lee et al., 2005), and the possible molecular mechanisms linking calcineurin inhibition, DMP defects, and lifespan were not adequately explored. Although characterization of the full mechanism is probably beyond the scope of this paper, given the relative simplicity and advantages of using C. elegans as a model organism for this study, some degree of rigor is expected with additional straightforward control experiments as listed below:

      The authors state that tax-6 knockdown animals had drastically reduced expulsion events (Figure 2G), leading to irregular DMP (Lines 144-145), but did not describe the nature of DMP irregularity. For example, did the reduced expulsion events still occur with regular intervals but longer cycle lengths? Or was the rhythmicity completely abolished? The former would suggest the intestine clock is still intact, and the latter would indicate that calcineurin is required for the clock to function. Therefore, ethograms of DMP in both wild-type and tax-6 mutant animals should be included in the manuscript. Along the same lines, besides the cycle length, the three separable motor steps (aBoc, pBoc, EMC) are easily measurable, with each step indicating where the program goes wrong, hence the site of action, which is precisely the beauty of studying C. elegans DMP. Unfortunately, the authors did not use this opportunity to characterize the exact behavior phenotypes of the tax-6 mutant to guide future investigations. Furthermore, it is interesting that about 64% of tax-6(p675) animals had normal DMP. The authors attributed this to p675 being a weak allele. It would be informative to further examine tax-6 RNAi as in other experiments or to make a tax-6 null mutant with CRISPR. In addition, in one of the cited papers (Lee et al., 2005), the exact calcineurin loss-of-function strain tax-6(p675) was shown to have normal defecation, including normal EMC, while the gain-of-function mutant of calcineurin tax-6(jh107) had abnormal EMC steps. It wasn't clear from Lee et al., 2005, if the reported "normal defecation" was only referring to the expulsion step or also included the cycle length. Nevertheless, this potential contradiction and the calcineurin gain-of-function mutant are highly relevant to the current study and should be further explored as a follow-up to previously reported results.

      For some of the key experiments, such as tax-6's effects on susceptibility to PA14, DMP, intestinal bloating, and lifespan, additional controls, as is the norm in C. elegans studies, including a second allele and rescue experiments, would strengthen the authors' claims and conclusions.

      We have now included lifespan, survival on P. aeruginosa, and DMP data using an additional knockout allele, tax-6(ok2065). Additionally, we have added ethograms of DMP for both tax-6 RNAi and the tax-6(ok2065) mutant. Our observations indicate that tax-6 inhibition leads to a complete loss of DMP rhythmicity, suggesting that calcineurin is essential for maintaining the DMP clock. While characterizing the DMP, we noticed that expulsion events appeared superficial in the tax-6(ok2065) mutant, with little to no gut content released. Consequently, we examined the movement of gut content and found that both tax-6(ok2065) mutants and tax-6 knockdown animals showed significantly reduced gut content movement. The new findings on the characterization of DMP are presented in Figure 2—figure supplement 1, Figure 3, Figure 3—figure supplement 1, and Figure 3—figure supplement 2. The text in the results section reads (lines 160-176): “Next, we investigated whether the reduced number of expulsion events was due to regular intervals with longer cycle lengths or if rhythmicity was entirely disrupted upon tax-6 knockdown. To assess this, we obtained ethograms of the DMP for N2 animals grown on control and tax-6 RNAi. While animals on control RNAi displayed regular cycles of pBoc, aBoc, and EMC, the tax-6 RNAi animals exhibited disrupted rhythmicity (Figure 3A and Figure 3—figure supplement 1). Most tax-6 knockdown animals lacked the pBoc and aBoc steps and had sporadic expulsion events. Isolated pBoc events were occasionally observed, indicating a complete loss of rhythmicity in tax-6 knockdown animals. Ethograms for tax-6(ok2065) animals also showed disrupted rhythmicity (Figure 3B and Figure 3—figure supplement 2). Although the number of expulsion events appeared higher in tax-6(ok2065) animals compared to tax-6 RNAi animals (Figure 3—figure supplement 1 and 2), these expulsion events seemed superficial, releasing little to no gut content. This suggested slow movement of gut content in tax-6(ok2065) animals, leading to constipation and intestinal bloating. We examined gut content movement by measuring the clearance of blue dye (erioglaucine disodium salt) from the gut. The clearance was significantly slower in tax-6(ok2065) animals compared to N2 animals (Figure 3C), indicating impaired gut content movement due to the loss of tax-6. Similarly, tax-6 knockdown animals also showed significantly slowed gut content movement (Figure 3D).”
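
      The distinction raised by the reviewer (regular cycles that are merely longer versus a loss of rhythmicity) can be illustrated by comparing the mean and the coefficient of variation of the intervals between successive events. The Python sketch below is illustrative only: the event times are hypothetical, the function name `cycle_stats` is introduced here for the example, and this is not necessarily how the ethograms in the manuscript were scored.

```python
from statistics import mean, pstdev


def cycle_stats(event_times_s):
    """Return (mean interval, coefficient of variation) for successive events.

    A long mean interval with a low CV suggests slow but regular cycling;
    a high CV suggests that rhythmicity itself is disrupted.
    """
    intervals = [b - a for a, b in zip(event_times_s, event_times_s[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three events to assess rhythmicity")
    avg = mean(intervals)
    return avg, pstdev(intervals) / avg


# Hypothetical event times (seconds from the start of observation).
regular = [0, 50, 100, 150, 200]           # short, evenly spaced cycles
slow_but_regular = [0, 90, 180, 270, 360]  # longer cycles, still evenly spaced
arrhythmic = [0, 20, 150, 170, 400]        # sporadic events

regular_mean, regular_cv = cycle_stats(regular)
slow_mean, slow_cv = cycle_stats(slow_but_regular)
arrhythmic_mean, arrhythmic_cv = cycle_stats(arrhythmic)
```

      In this toy data, the first two series differ only in mean interval, whereas the third has a much larger coefficient of variation, which is the pattern one would expect if the clock itself were impaired.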

      Moreover, we have added a potential reason for the contradictory tax-6(p675) results from Lee et al., 2005 (lines 154-159): “At the 1-day-old adult stage, about 36% of tax-6(p675) animals showed irregular and slowed DMP, while the remainder had regular DMP (Figure 2H), suggesting that tax-6(p675) is a weak allele. The fraction of the animals with irregular DMP appeared to increase with age, indicating that this phenotype might be age-dependent. This may also explain why tax-6(p675) animals were reported to have a normal defecation cycle in an earlier study (Lee et al., 2005).”

      The second weakness of this manuscript is the data presentation for all survival rate curves. The authors stated that three independent experiments or biological replicates were performed for each group but only showed one "representative" curve for each plot. Without seeing all individual datasets or the averaged data with error bars, there is no way to evaluate the variability and consistency of the survival rate reported in this study.

      We now provide the data for all replicates in the source data files.

      Overall, the authors' claims and conclusions are justified by their data, but further experiments are needed to confirm their findings and establish the detailed mechanisms underlying the observed effects of calcineurin inhibition on the DMP, calorie restriction, and lifespan in C. elegans.

      We have conducted additional experiments to elucidate the role of calcineurin in the DMP and to investigate the impact of the DMP on calorie restriction and lifespan in C. elegans, as described in the various responses to the reviewers’ comments. 

      Recommendations for the authors:

      Our specific comments to guide the authors, should they choose to revise the manuscript:

      The RNAi experiments in the eat-2 mutant background are difficult to interpret. If these animals are eating fewer bacteria, it is possible that there is also less tax-6 dsRNA being ingested and therefore less tax-6 inactivation. These experiments should be conducted with a tax-6 null allele.

      We have included lifespan experiments with the eat-2(ad465);tax-6(ok2065) double mutant, along with the individual single mutant controls, as shown in Figure 4E. These results demonstrate that the eat-2 mutation does not further extend the lifespan of the tax-6(ok2065) mutant. Additionally, we confirmed that the eat-2(ad465) mutants do not exhibit defects in feeding-based RNAi (Figure 4—figure supplement 1).

      While aak-2, hlh-30, and nhr-8 mutants may not have an eat phenotype, the negative tax-6 RNAi results should be confirmed with a tax-6 null mutant to obviate the consideration that these background mutations reduce RNAi efficacy.

      The genes hlh-30 and nhr-8 are located very close to tax-6 on chromosome IV (https://wormbase.org//#012-34-5), which made it challenging to generate double mutants. However, we tested the RNAi sensitivity of the hlh-30(tm1978) and nhr-8(ok186) mutants and confirmed that they are not defective in RNAi (Figure 5—figure supplement 1). We also found that tax-6 RNAi disrupted the DMP in both hlh-30(tm1978) and nhr-8(ok186) mutants (Figure 5—figure supplement 2). Furthermore, our results show that hlh-30(tm1978) and nhr-8(ok186) animals have increased susceptibility to P. aeruginosa upon tax-6 knockdown (Figure 6A, B), indicating that tax-6 RNAi was effective in these mutants. Since the phenotype in the aak-2 mutant was only partially observed, we did not conduct further experiments with aak-2 mutants.

      Reviewer #1 (Recommendations For The Authors):

      The low penetrance of defecation cycle defects in tax-6(p675) worms brings into question the role of the defecation deficits in the phenotypes caused by the disruption of tax-6. At the same time, the low penetrance provides a golden opportunity to test this. Do tax-6(p675) worms with a normal defecation cycle length have extended longevity? Increased susceptibility to bacterial pathogens? Smaller body size? Distended lumen? Decreased fat accumulation? Increased pha-4 and nhr-8 expression? It would be relatively straightforward to measure defecation cycle length in individual tax-6(p675) worms, bin them into normal defecation and slow defecation groups, and then compare the above-mentioned phenotypes.

      We appreciate the reviewer's interesting suggestion. However, the DMP defect phenotype in tax-6(p675) worms appears to be age-dependent, with the number of DMP-defective worms increasing as they age. Additionally, we observed that exposure to P. aeruginosa accelerates the onset of DMP defects in tax-6(p675) worms. As a result, tax-6(p675) worms are not suitable for the type of experiments the reviewer suggested. Nevertheless, we believe that the additional data using the tax-6(ok2065) mutant, along with the characterization of ethograms of DMP, firmly establish the role of calcineurin in maintaining a regular DMP in C. elegans.

      Another way to dissect specific effects of calcineurin disruption from phenotypes resulting from defecation motor program deficits would be to further characterize other worms with deficits in defecation (flr-1, nhx-2, pbo-1 RNAi). It is mentioned that they have decreased lifespan. Do they also show increased susceptibility to bacterial pathogens? Do they show decreased fat? Is their lifespan dependent on HLH-30 and NHR-8?

      We thank the reviewer for this important suggestion. We have now included data with flr-1, nhx-2, and pbo-1 RNAi, which shows that the knockdown of these genes also enhances susceptibility to P. aeruginosa (Figure 3—figure supplement 3G). Knockdown of these genes is already known to reduce fat levels in N2 worms, and we demonstrate that they similarly reduce fat levels in hlh-30(tm1978) and nhr-8(ok186) animals (Figure 5B, C, F, G). Additionally, we found that the increased lifespan observed upon knockdown of these genes (as well as with tax-6 knockdown) is dependent on HLH-30 and NHR-8 (Figure 5A, D).

      To place "enhanced susceptibility to pathogen" within the proposed model, it would be important to examine the effect of HLH-30 and NHR-8 disruption on this phenotype. The proposed model suggests that this phenotype is independent of HLH-30 and NHR-8, but this should be tested experimentally. Similarly, it would be important to test the effect of HLH-30 and NHR-8 disruption on defecation cycle length to determine if defecation deficits are upstream or downstream of deficits in the defecation motor program.

      We show that the knockdown of tax-6 leads to defects in the DMP in hlh-30(tm1978) and nhr-8(ok186) animals (Figure 5—figure supplement 2). Moreover, we show that hlh-30(tm1978) and nhr-8(ok186) animals have increased susceptibility to P. aeruginosa upon tax-6 knockdown (Figure 6A, B). These results are described as follows (lines 279-285): “Given that HLH-30 and NHR-8 are essential for lifespan extension upon calcineurin inhibition, we investigated whether these pathways also influence survival in response to P. aeruginosa infection following calcineurin knockdown. Both hlh-30(tm1978) and nhr-8(ok186) animals showed significantly reduced survival upon tax-6 RNAi (Figure 6A, B). These findings suggested that the reduced survival on P. aeruginosa following calcineurin inhibition is independent of HLH-30 and NHR-8 and is more likely due to increased gut colonization by P. aeruginosa resulting from DMP defects (Figure 6C).”

      Is the lifespan of tax-6(p675) increased? This would be important to measure and include in Figure 1.

      Indeed, the lifespan of tax-6(p675) mutants is increased. We have included the lifespan of tax-6(p675) and tax-6(ok2065) in Figure 1F.

      In Figure 2, disruption of tax-6 appears to result in a clear decrease in body size. To what extent is the decrease in fat/worm in Figure 3 simply a result of the worms being smaller? Perhaps, a measurement of Oil-Red-O intensity PER AREA would be a more appropriate measure.

      The ORO intensity values we had shown per animal were already area normalized. We have now indicated this in the Figure Legends.
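      As a minimal sketch of what area normalization means for the reviewer's concern, the snippet below divides a total Oil-Red-O signal by the outlined area of each animal. The record type, field names, and numbers are hypothetical illustrations and are not taken from the authors' analysis pipeline.

```python
from dataclasses import dataclass


@dataclass
class WormMeasurement:
    # Hypothetical per-animal record; units are arbitrary.
    integrated_intensity: float  # summed ORO signal inside the worm outline
    area_px: float               # area of the worm outline in pixels


def area_normalized_intensity(m: WormMeasurement) -> float:
    """Return ORO signal per unit area, so body size alone does not change the value."""
    if m.area_px <= 0:
        raise ValueError("area must be positive")
    return m.integrated_intensity / m.area_px


# Illustrative values only: a smaller worm with proportionally less total signal.
large = WormMeasurement(integrated_intensity=50_000.0, area_px=10_000.0)
small = WormMeasurement(integrated_intensity=25_000.0, area_px=5_000.0)
```

      In this illustration the two animals differ twofold in total signal but have the same signal per area, which is the distinction the reviewer asked about.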

      There are multiple long-lived mutant strains such as clk-1 and isp-1 that have an increased defecation cycle length. To what extent do these worms exhibit phenotypes similar to tax-6 disruption? isp-1 have increased resistance to bacterial pathogens suggesting that defecation motor program deficits are not sufficient to increase susceptibility to bacterial pathogens.

      We have now examined the clk-1 and isp-1 mutants and found that these mutants exhibit reduced gut colonization by P. aeruginosa compared to N2 animals. This reduction in colonization may be attributed to the slowed pharyngeal pumping rates observed in these mutants. These findings suggest that the phenotypes associated with a slow DMP versus a disrupted DMP could be significantly different. The manuscript with the new data on these mutants reads (lines 177-192): “We then explored whether the disruption of DMP rhythmicity due to tax-6 knockdown affected P. aeruginosa responses similarly to longer but regular DMP cycles. To do this, we studied P. aeruginosa colonization in clk-1(qm30) and isp-1(qm150) mutants, which have regular but extended DMP cycles (Feng et al., 2001; Wong et al., 1995). Interestingly, both clk-1(qm30) and isp-1(qm150) mutants showed significantly reduced intestinal colonization by P. aeruginosa compared to N2 animals (Figure 3—figure supplement 3A-D). This reduced colonization could be attributed to their significantly decreased pharyngeal pumping rates (Wong et al., 1995; Yee et al., 2014), suggesting a lower intake of bacterial food in these mutants. While the survival of clk-1(qm30) animals on P. aeruginosa was comparable to N2 animals (Figure 3—figure supplement 3E), isp-1(qm150) animals exhibited significantly improved survival (Figure 3—figure supplement 3F). Conversely, knockdown of flr-1, nhx-2, and pbo-1 in N2 animals resulted in significantly reduced survival on P. aeruginosa compared to control RNAi (Figure 3—figure supplement 3G). Knockdown of these genes causes complete disruption of DMP rhythmicity, increasing gut colonization by P. aeruginosa (Singh and Aballay, 2019a). Overall, these findings demonstrated that calcineurin is crucial for maintaining the DMP ultradian clock, and its inhibition increases susceptibility to P. aeruginosa by disrupting the DMP.”

      Line 192. This statement is speculative. There is no evidence that HLH-30 is mediating lipid depletion in these worms.

      We have removed this statement. We observed that the knockdown of flr-1, nhx-2, and pbo-1 resulted in significant fat depletion in hlh-30(tm1978) animals (Figure 5B, C). Additionally, tax-6 knockdown also caused a small but significant reduction in fat levels in hlh-30(tm1978) animals. This contrasts with our initial submission, possibly due to the increased number of animals included in the analysis. These findings suggest that the increase in lifespan due to DMP defects requires HLH-30, likely through a mechanism independent of HLH-30’s role in fat depletion. We have updated the manuscript text and model (Figure 6C) accordingly.

      In Figure S2, tax-6 RNAi appears to have a more detrimental effect in pmk-1 mutants than the other mutants. The authors should comment on this.

      We have added the following sentence in the manuscript (lines 123-125): “The knockdown of tax-6 appeared to have a more pronounced effect in pmk-1(km25) mutants than in other mutants, suggesting that inhibition of tax-6 might exacerbate the adverse effects observed in pmk-1(km25) mutants.”

      Reviewer #2 (Recommendations For The Authors):

      Line 192-193: The statement is confusing and not accurate because HLH-30 did not enhance lifespan with or without calcineurin (Figure 4A and S4A, also in Lapierre 2023). The takeaway should be along the lines of calcineurin inhibition enhancing lifespan through HLH-30 or HLH-30 being required for lifespan enhancement via calcineurin inhibition.

      We have removed this statement. We now state (lines 237-239): “Knockdown of tax-6 did not extend the lifespan of hlh-30(tm1978) animals (Figure 5A), indicating that HLH-30 is required for the increased lifespan observed with calcineurin inhibition.”

      Line 261: Similar to the point above. Where is the data showing NHR-8 increases lifespan with or without calcineurin?

      We have removed this sentence.

      Figure 1 legend line 699: animals per condition per replicate >90, but in the Method section Line 317, it says more than 80 animals per condition per replicate. Could be more accurate.

      We have now specified in the Methods section that the exact number of animals per condition is provided in the source data files. Since different lifespan curves within a given figure panel had varying numbers of animals, we have indicated the lower boundary for all curves (including the replicates). The precise number of animals for each lifespan experiment is available in the source data files.

      Figures 2F and G, "tax-6" should be labeled as "tax-6 RNAi" to be consistent with other figures.

      We thank the reviewer for this suggestion and have updated the label to “tax-6 RNAi”.

      In summary, we would like to thank the reviewers again for providing constructive critiques. We believe we have fully addressed all the concerns of the reviewers by carrying out several new experiments and modifying the text. The manuscript has undergone substantial revision and has thereby improved significantly. We do hope that the evidence in support of the conclusions is found to be complete in the revised manuscript.

    2. eLife Assessment

      This important study reveals insights into how calcineurin influences C. elegans pathogen susceptibility and lifespan through its role in controlling the defecation motor program. The authors provide convincing evidence to support a new mechanism through which calcineurin impacts longevity. This work will be of interest to investigators studying host-pathogen interactions and aging in a number of experimental systems.

    3. Reviewer #1 (Public review):

      In this paper, the authors show that disruption of calcineurin, which is encoded by tax-6 in C. elegans, results in increased susceptibility to P. aeruginosa but extends lifespan. In exploring the mechanisms involved, the authors show that disruption of tax-6 decreases the rate of defecation leading to intestinal accumulation of bacteria and distension of the intestinal lumen. The authors further show that the lifespan extension is dependent on hlh-30, which may be involved in breaking down lipids following deficits in defecation, and nhr-8, whose levels are increased by deficits in defecation. The authors propose a model in which disruption of the defecation motor program is responsible for the effect of calcineurin on pathogen susceptibility and lifespan, but do not exclude the possibility that calcineurin affects these phenotypes independently of defecation.

    4. Reviewer #2 (Public review):

      The relationships between genes and phenotypes are complex, and the impact of deleting a gene can often have multifaceted and unforeseen consequences. This paper dissected the role of calcineurin, encoded by tax-6, in various phenotypes in C. elegans, including lifespan, pathogen susceptibility, the defecation motor program, and nutrient absorption or calorie restriction, through a series of genetic and behavioral analyses. Many genes in these pathways were tested, yielding robust results. Classic epistasis analyses were used to distinguish between genes operating in the same or separate pathways. Researchers in the related fields will be very interested in looking through the data presented in this paper in great detail and will benefit from it.

      Overall, this paper supports a model in which the increased lifespan and heightened pathogen susceptibility observed following calcineurin inhibition result from the disruptions in the defecation motor program but through distinct pathways. A defective defecation motor program leads to intestine bloating and compromised nutrient absorption. Calorie restriction resulting from poor nutrient absorption affects lifespan, whereas increased colonization in the bloated intestine heightens pathogen susceptibility. The observation that knockdown of several other DMP-related genes also results in increased lifespan and pathogen susceptibility further reinforces the proposed model.

    1. eLife Assessment

      This important study presents a high-resolution cryoEM structure of the supercomplex between photosystem I (PSI) and fucoxanthin chlorophyll a/c-binding proteins (FCPs) from the model diatom Thalassiosira pseudonana CCMP1335, revealing subunits, protein:protein interactions and pigments not previously seen in other diatoms or red/green photosynthetic lineages. Combining structural, sequence and phylogenetic analyses, the authors provide convincing evidence of conserved motifs crucial for the binding of FCPs, accompanied by interesting speculation about the mechanisms governing the assembly of PSI-FCP supercomplexes in diatoms and their implications for related PSI-LHC supercomplexes in plants. The findings set the stage for functional experiments that will further advance the fields of photosynthesis, bioenergy, ocean biogeochemistry and evolutionary relationships between photosynthetic organisms.

    2. Reviewer #1 (Public review):

      The authors present the cryo-EM structure of PSI-fucoxanthin chlorophyll a/c-binding proteins (FCPs) supercomplex from the diatom Thalassiosira pseudonana CCMP1335 at a global resolution of 2.3 Å. This exceptional resolution allows the authors to construct a near-atomic model of the entire supercomplex and elucidate the molecular details of FCPs arrangement. The high-resolution structure reveals subunits not previously identified in earlier reconstructions and models, as well as sequence analysis of PSI-FCPIs from other diatoms and red algae. Additionally, the authors use their model in conjunction with a phylogenetic analysis to compare and contrast the structural features of the T. pseudonana supercomplex with those of Chaetoceros gracilis, uncovering key structural features that contribute to the efficiency of light energy conversion in diatoms.

      The study employs the advanced technique of single particle cryo-electron microscopy to visualize the complex architecture of the PSI supercomplex at near-atomic resolution and analyze the specific roles of FCPs in enhancing photosynthetic performance in diatoms.

      Overall, the approach and data are both compelling and of high quality. The paper is well written and will be of wide interest for comprehending the molecular mechanisms of photosynthesis in diatoms. This work provides valuable insights for applications in bioenergy, environmental conservation, plant physiology, and membrane protein structural biology.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors present the cryo-EM structure of PSI-fucoxanthin chlorophyll a/c-binding proteins (FCPs) supercomplex from the diatom Thalassiosira pseudonana CCMP1335 at a global resolution of 2.3 Å. This exceptional resolution allows the authors to construct a near-atomic model of the entire supercomplex and elucidate the molecular details of FCPs arrangement. The high-resolution structure reveals subunits not previously identified in earlier reconstructions and models, as well as sequence analysis of PSI-FCPIs from other diatoms and red algae. Additionally, the authors use their model in conjunction with a phylogenetic analysis to compare and contrast the structural features of the T. pseudonana supercomplex with those of Chaetoceros gracilis, uncovering key structural features that contribute to the efficiency of light energy conversion in diatoms.

      The study employs the advanced technique of single particle cryo-electron microscopy to visualize the complex architecture of the PSI supercomplex at near-atomic resolution and analyze the specific roles of FCPs in enhancing photosynthetic performance in diatoms.

      Overall, the approach and data are both compelling and of high quality. The paper is well written and will be of wide interest for comprehending the molecular mechanisms of photosynthesis in diatoms. This work provides valuable insights for applications in bioenergy, environmental conservation, plant physiology, and membrane protein structural biology.

      We thank you very much for your highly positive evaluation and comments on our manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript elucidated the cryo-electron microscopic structure of a PSI supercomplex incorporating fucoxanthin chlorophyll a/c-binding proteins (FCPs), designated as PSI-FCPI, isolated from the diatom Thalassiosira pseudonana CCMP1335. Combining structural, sequence, and phylogenetic analyses, the authors provided solid evidence to reveal the evolutionary conservation of protein motifs crucial for the selective binding of individual FCPI subunits and provided valuable information about the molecular mechanisms governing the assembly and selective binding of FCPIs in diatoms.

      Strengths:

      The manuscript is well-written and presented clearly as well as consistently. The supplemental figures are also of high quality.

      Weaknesses:

      Only minor comments (provided in recommendations for authors) to help improve the manuscript.

      We thank you very much for your highly positive evaluation and comments on our manuscript.

      Reviewer #3 (Public Review):

      Summary:

      Understanding the structure and function of the photosynthetic machinery is crucial for grasping its mode of action. Photosystem I (PSI) plays a vital role in light-driven electron transfer, which is essential for generating cellular reducing power. A primary strategy to mitigate light and environmental stresses involves incorporating peripheral light-harvesting proteins. Among various lineages, the number of LHCIs and their protein and pigment compositions differ significantly in PSI-LHCI structures. However, it is still unclear how LHCIs recognize their specific binding sites in the PSI core. This study aims to address this question by obtaining a high-resolution structure of the PSI supercomplex, including fucoxanthin chlorophyll a/c-binding proteins (FCPs), referred to as PSI-FCPI, isolated from the diatom Thalassiosira pseudonana. Through structural and sequence analyses, distinct protein-protein interactions are identified at the interfaces between FCPI and PSI subunits, as well as among FCPI subunits themselves.

      Strengths:

      The primary strength of this work lies in its superb isolation and structural determination, followed by clear discussion and conclusions. However, the interactions among the protein complexes and their relevance in formulating general rules are not definitively established. While efficiency is a crucial aspect, preventing damage is equally important, and currently, we cannot infer this from the provided structures.

      Weaknesses:

      The interactions among the protein complexes and their relevance in formulating general rules are not definitively established. While efficiency is a crucial aspect, preventing damage is equally important, and currently, we cannot infer this from the provided structures.

      We thank you very much for your highly positive evaluation and comments on our manuscript. This study aims to decipher the interactions among the different protein subunits within the PSI-FCPI supercomplex and to assess their relevance for formulating general rules. While we agree that preventing damage is equally important, it is unclear to us what kind of damage you are referring to, and we consider that this topic may need to be treated in a separate publication, as we cannot address everything in one paper.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Line 69: "Diatoms are one of the most important phytoplankton in aquatic environments and contribute to the primary production in the ocean remarkably." Check the sentence, something is missing.

      We modified the sentence as follows:

      "Diatoms are among the most essential phytoplankton in aquatic environments, playing a crucial role in the global carbon cycle, supporting marine food webs, and contributing significantly to nutrient cycling, thus ensuring the health and sustainability of marine ecosystems"

      (2) Supplementary Figure 1B: The SDS-PAGE gel shows multiple bands. Do the authors know the identity of these proteins, or have they considered analyzing the bands using mass spectrometry? The band at ~17 kDa is particularly intense. Could you comment on this? Have you tried running a Native-PAGE gel?

      We did not identify protein bands by MS analysis. The protein bands in the PSI-FCPI supercomplex of this diatom have been identified by Ikeda et al. 2013. The protein bands of our sample were similar to those of Ikeda et al. 2013. To explain this, we modified the sentences and cited Ikeda et al. 2013 in the revised manuscript (lines 89-91).

      "The PSI-FCPI supercomplexes were purified from the diatom T. pseudonana CCMP1335 and analyzed by biochemical and spectroscopic techniques (Fig. S1). Notably, the protein bands of PSI-FCPI closely resembled those reported in a previous study (31)."

      The ~17 kDa protein band appears to correspond to FCPIs, which were identified in Ikeda et al. 2013. We did not perform BN-PAGE of this sample; however, we performed trehalose density gradient centrifugation (Fig. S1A).

      (3) Can the authors comment on the position of the FCPI subunits in the PSI supercomplex in diatoms compared to the arrangement of LHCIs in complex with PSI in cyanobacteria, green algae, and angiosperms? This information would be useful to incorporate into the text.

      We previously compared the PSI-FCPI structures of the diatom C. gracilis to the PSI-LHCI structures of land plant, green alga, and red alga (Nagao et al., 2020). Also, Xu et al. 2020 compared the C. gracilis PSI-FCPI structure to the PSI-LHCI structures of land plant, green alga, and red alga. The binding sites between FCPIs and LHCIs are conserved to some extent. However, our recent study revealed that no orthologous relationship exists among LHCs bound to PSI between primitive red algae and diatoms (Kato et al., 2024). Consequently, we found that the information obtained from structural comparisons alone is extremely limited. To avoid misinterpretation, this study focused on comparing the structures and amino acid sequences of FCPIs between T. pseudonana and C. gracilis.

      (4) Line 104: Despite achieving high resolution, the authors modeled only six lipid densities (the PDB model actually contains 9 lipids; you should correct this in the text). Do you believe this is due to the detergent used for purification? Can you comment on the position, identity, and potential role of the lipids within your model?

      There are 6 lipids associated with the PSI core and 3 with FCP, giving a total of 9 lipids. This was described in our original text (lines 102-104 in the modified manuscript). Additionally, our structure reveals unidentified densities that likely represent lipids; they are modeled as 88 unknown lipids (UNLs). Thus, there are more lipids in the supercomplex. However, we also observed 4 β-DDM molecules (LMT) in the structure, which are used as detergents. Thus, it is possible that some lipids have dissociated and been replaced by detergent molecules. Many of the observed lipids are located between subunits, likely contributing to the stabilization of the complex.

      (5) Line 111: The global resolution is very high. Why does the unknown protein have such low resolution that it was impossible to model it properly and perform de novo identification from the density map? Is it due to a lower abundance of particles with this subunit bound? Have you tried improving this with 3D classification/ focus refinement /density modification?

      The Unknown subunit (UNK) is located peripherally, and its density is significantly lower compared to the neighboring subunits, which may suggest a low abundance. We applied density modification using Topaz for 3D map denoising, but the effect was minimal. As the low abundance of UNK may be the cause, 3D classification and focus refinement also had limited impact.

      (6) Figure 2A: It would be useful to show the density map for the subunit together with the model, especially to demonstrate visualization of the long loop.

      We added the model and map of Psa29 to Figure S4C in the revised manuscript.

      (7) Given the proximity of Psa29 to PsaC, is the protein involved in electron shuttling? If so, could you comment on this? In line 131, you state that Psa29 was not found in other organisms. Can the authors speculate on the potential role of this protein in diatoms?

      The function of Psa29 is unknown at present. However, Psa29 does not contain any cofactors, indicating that it does not contribute to electron transfer reactions. To understand the function of Psa29, a deletion mutant of this gene is required for examining its functional and physiological roles in diatom photosynthesis. To explain this, we added the following sentences to the revised manuscript (lines 129-133):

      "However, the functional and physiological roles of Psa29 remain unclear at present. It is evident that Psa29 does not have any pigments, quinones, or metal complexes, suggesting no contribution of Psa29 to electron transfer reactions within PSI. Further mutagenesis studies will be necessary to investigate the role of Psa29 in diatom photosynthesis."

      (8) Line 163: "Among the FCPI subunits, only FCPI-1 has BCRs in addition to Fxs and Ddxs (Figure S6A). FCPI-1 is a RedCAP, which belongs to the LHC protein superfamily but is distinct from the LHC protein family (6, 7)." It would be useful if the authors could add the carotenoid model embedded in the cryoEM density map to the figure to show the features that led to modeling BCR instead of other carotenoids. Additionally, it would be helpful to include in the text why RedCAPs differ from LHCIs and their proposed role.

      We added the model and map of two BCRs in FCPI-1 (RedCAP) to Figure S4F in the revised manuscript.

      Phylogenetic analysis showed that RedCAPs are distinct from the LHC protein family. This has been explained in lines 163-164. Also, the functional and physiological roles of RedCAP remain unclear. To explain this, we added the sentence "; however, the functional and physiological roles of RedCAP remain unclear" to the revised manuscript (lines 164-165).

      (9) Line 185: "However, it is unknown (i) whether CgRedCAP is indeed bound to the C. gracilis PSI-FCPI supercomplex and (ii) if a loop structure corresponding to the Q96-T116 loop of TpRedCAP exists in CgRedCAP." Have the authors attempted to model the protein using AlphaFold? If so, are there significant differences? Could you speculate on the absence of RedCAP in C. gracilis? Do you believe it is due to using a different detergent or related to environmental factors?

      We did not model CgRedCAP using AlphaFold. Our recent study (Kato et al., 2024) proposed that CgRedCAP binds to the LHCI-1 site in the PSI-FCPI structure based on sequence comparison. There are two types of PSI-FCPI supercomplexes from C. gracilis, one having 16 FCPIs and the other having 24 FCPs. The different antenna sizes may depend on the growth conditions of C. gracilis (Nagao et al. 2020). These explanations were already described in the manuscript (lines 243-246).

      (10) Line 193: Figure 8 is mentioned before Figures 4-7.

      We apologize for the error in the figure number. "Figure 8" referred to Supplementary Figure 8, so we changed it to Fig. S8B in the revised manuscript.

      (11) Line 223: FCPI-4 interacts only with FCPI-5, primarily through the interaction of Y196/4 with the FCPI-5 backbone. Is this interaction facilitated by other factors such as lipids, carotenoids, or other ligands? Also, FCPI-4 occupies a peculiar position compared to other LHCIs proteins (it is peripheral to FCPI-4 and FCPI-5). Do you believe this could be due to a transient interaction with the complex? Could the presence of this protein be related to the growth conditions experienced by the plant? Are there any literature reports on environmental conditions influencing FCPI arrangements? Including this information in the text would be interesting.

      Y196/4 interacts only with the backbone of FCPI-5, through hydrogen bonds; therefore, other cofactors do not contribute to this interaction.

      We do not believe that the interaction of FCPI-4 is transient; rather, this binding appears to be stable within the complex. Given that the PSI-FCPI supercomplexes were isolated by anion exchange chromatography, FCPI-4 and FCPI-5 are tightly associated within this complex. However, it is important to note that the expression of diatom FCPI proteins can indeed vary depending on growth conditions, as highlighted in our previous study (Nagao et al., 2020). While the peculiar position of FCPI-4 may not be directly related to transient interactions, environmental conditions could still influence the overall arrangement and expression levels of FCPIs. This information has already been described in the manuscript (lines 243-246).

      (12) Given the high resolution of your map, the overall model quality does not seem to match the map quality. Specifically, the clash score (10) and sidechain outliers (3%) are elevated. Could you comment on this? Do you believe it is related to the high number of ligands?

      Our structure contains a total of 295 ligands, including cofactors, detergents, and unknown lipids. We believe the high clash score and number of sidechain outliers are due to the large number of ligands present.

      (13) Supplementary Figure 2: You should show the 3D classes that were discarded.

      In response to your comment, we added the discarded 3D classes to Figure S2 and added the sentence "Red boxes highlight selected particles from each 3D classification." to its legend in the revised manuscript.

      (14) Which masks were used for refinement? How were they generated, and which parameters were chosen? This information should be added to the Materials and Methods section. You should show the masks used during classification, for example.

      We used a 240 Å spherical mask for refinement and classification, without applying any reference mask as input. To explain this, we added the corresponding sentence to Methods in the revised manuscript (lines 347-348) as follows:

      "A 240-Å spherical mask was used during the 3D classification and refinement processes."

      (15) Were any extra proteins detected in the early stages of the cryoEM analysis (i.e., 2D classification) that were discarded? Could you visualize the superior oligomeric states of the supercomplex?

      In the single-particle analysis, no larger particles than the analyzed complex were detected. The results of 2D classification using a sufficiently large spherical mask with a diameter of 320 Å are shown below.

      Author response image 1.

      (16) Have you tried using cryoSPARC for data analysis? If so, could you comment on that?

      We did not use cryoSPARC for data analysis.

      Reviewer #2 (Recommendations For The Authors):

      I have some minor comments below to help improve the manuscript. The line numbers below refer to those in the Word version of the manuscript.

      (1) Figure 1 legend, line 559, "membrane normal"? Panel A and B, structures with the same colors, do they refer to the closely related or interacted parts? For example, the red color for FCP1-1 in A and PsaA in B. If not, the authors may want to clarify it.

      The term 'membrane normal' refers to the direction perpendicular to the surface of a membrane. It is a concept frequently used in physics and biology to describe the orientation relative to the membrane's plane.

      The shared colors in Figure 1 were not meant to indicate closely related or interacting parts. In response to your comments, we changed the subunit colors in the revised manuscript.

      (2) Line 109-117. "Psa28 is a novel subunit found in the C. gracilis PSI-FCPI structure, and its name follows the nomenclature as suggested previously (31).... After psaZ, the newly identified genes should be named psa27, psa28, etc., and the corresponding proteins are called Psa27, Psa28, etc... Psa28 was also named PsaR in the PSI-FCPI structure of C. gracilis (16)". It is confusing. Was Psa28 named twice, PsaR and Psa28? It would be helpful to add a simple explanation here.

      In response to your comment, we modified the sentence as follows (lines 117-118):

      "However, Xu et al. named the subunit as PsaR in the PSI-FCPI structure of C. gracilis"

      (3) Line 134, "One of the Car molecules in PsaJ was identified as ZXT103 in the T. pseudonana PSI-FCPI structure but it is BCR112 in the C. gracilis PSI-FCPI structure (15)". Figure S4D mentioned BCR863 but did not mention BCR112. Figure S4C, D, it may need better explanations of the colors and labels, and indicate which parts are from T. pseudonana or C. gracilis.

      BCR112 was misnumbered; the correct number is BCR103. In response to your comments, we revised Figure S4C and D by labeling the characteristic pigments in the revised manuscript.

      (4) Figure S7, although mentioned in the legend, it would be helpful to label interaction pairs on the figure directly with corresponding colours.

      According to your comments, we modified the Figure and legends in the revised manuscript.

      (5) Figure 3E, it is better to avoid red/green colours in one figure as some readers may be colour-blind. It would also be helpful to label each FCPI with the same colour as its structure on the figure directly.

      According to your comments, we modified Figure 3E in the revised manuscript.

      (6) Line 185, "structures similar to the Q96-T116 loop in TpRedCAP found in the present study (Figure 8B).". The authors refer to Figure S8B? I have the same comment for line 186, Figure 8C.

      We apologize for the error in the figure number. Figure 8 should have been Supplementary Figure 8, so we changed it to Fig. S8B in the revised manuscript.

      (7) Line 270, "TpLhcq10 cannot bind at the FCPI-2 site". Why not use FCPI-3 for TpLhcq10?

      This means that the gene product of TpLhcq10 binds at the FCPI-3 site but not at the other sites such as FCPI-2. To avoid misreading, we modified the sentence as follows:

      "TpLhcq10 binds specifically at the FCPI-3 site but not at the other sites such as FCPI-2" (lines 278-279)

      Reviewer #3 (Recommendations For The Authors):

      I have no technical or conceptual suggestions at the current stage.

      Thank you.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      - The authors should think about revising the terminology used to describe electrophysiological data in zebrafish (Fig.5): "posterior" hair cells in a neuromast are sensitive to posterior-to-anterior flow, which is currently termed "anterior". This is confusing because when "posterior" or "anterior" is used, for instance in the labels of the figure, one may get confused about whether this applies to hair-cell position or directionality of the stimulus. It would help to always use clearer terminology for the stimulus (e.g. posterior-to-anterior (P-to-A) as in Kindig 2023, or "from the tail"). Also, the authors may want to clarify what we should see in Fig.5 demonstrating that posterior hair cells, with reversed hair-bundle polarity, actually evince transduction of similar magnitude as anterior hair cells, with normal polarity of their hair bundles. 

      This nomenclature can indeed be confusing. Per the reviewers' request, we have changed the terminology to always refer to the direction of flow sensed by the hair cells, for example, HCs that respond to posterior-directed flow or anterior-directed flow. For clarity, we now denote these HCs in the Figure as (A to P) and (P to A), respectively. We have modified Figure 5, the Figure 5 legend and Results (starting line 339) to reflect these changes.

      In addition, in our results we now provide more context when comparing the response magnitude of the anterior-sensing hair cells in gpr156 mutants to the response magnitude of the two different orientations of hair cells in controls.

      - Also, does it make sense that there is no defect in MET for mouse otolith organs with deleted GPR156, whereas there is a difference in the zebrafish lateral line? It would help motivate the study on mechanoelectrical transduction (see comment of Reviewer 1 below).

      We previously discussed this point and recognized that subtle effects remain possible in mouse (previously Discussion line 614). We have now modified the text in the Discussion to better emphasize this point (new line 627). The Eatock lab is currently working on developing calcium imaging in the mouse utricle to revisit this question in a future study. "Subtle effects remain possible, however, given the variance in single-cell electrophysiological data from both control and mutant mice. Nevertheless, current results are consistent with normal HC function in the Gpr156 mouse mutant, a prerequisite to interrogate how non-reversed HCs affects vestibular behavior."

      To help motivate transduction studies starting in the second Result paragraph, we added a transition at Line 205 that was indeed lacking:

      "Gpr156 inactivation could be a powerful model to specifically ask how HC reversal contributes to vestibular function. However, GPR156 may have other confounding roles in HCs besides regulating their orientation, similar to EMX2, which impacts mechanotransduction in zebrafish HCs (Kindig et al., 2023) and afferent innervation in mouse and zebrafish HCs (Ji et al., 2022; Ji et al., 2018)."

      (1) One overarching objective of this study was to use the Gpr156 KO model to discover how polarity reversal informs vestibular function (Introduction, overall summary in the last paragraph) . Pairing behavioral defects with hair cell orientation is only possible if hair cell transduction is normal, which had to be tested.

      (2) The notion that experiments that produced negative results are unnecessary and not properly motivated can only apply in retrospect. At early stages we performed electrophysiology because we did not know whether transduction would be normal in the absence of GPR156. We also did not know whether innervation would be normal. The fact that both appear normal makes Gpr156 KO a better model to address the importance of orientation reversal (conclusion of the Discussion line 705).

      See also reply to Reviewer #1 below.

      Reviewer #1 (Recommendations For The Authors): 

      Fig1, panel B appears to show different focal planes for Gpr156del/+ and Gpr156del/del.

      Figure 1B did indeed show the control and mutant panels at slightly different focal planes. We swapped the right (mutant) panel image and adjusted intensities in the control image to match adjustments of the new mutant image.

      Given that this work is largely about polarity and connectivity to neurons, I do not understand the need to assess mechanosensitivity in Gpr156 mutants. Please explain in the text, as follows: "After establishing normal numbers and types of mouse vestibular HCs, we assessed whether HCs respond normally to hair bundle deflections in the absence of GPR156." We did this because... 

      Please see reply above in 'Recommendations for the authors' for comment about the need to assess mechanosensitivity. We agree that this transition was lacking, and we added an explanation as recommended:

      "Gpr156 inactivation could be a powerful model to specifically ask how HC reversal contributes to vestibular function. However, GPR156 may have other confounding roles in HCs besides regulating their orientation, similar to EMX2, which impacts mechanotransduction in zebrafish HCs (Kindig et al., 2023) and afferent innervation in mouse and zebrafish HCs (Ji et al., 2022; Ji et al., 2018)."

      Anyway, the data in Figures 2, 3 and 4 seems somewhat superfluous to the main message of the paper. 

      Please see reply above in 'Recommendations for the authors'. These data may appear superfluous in retrospect, but we could not claim that behavioral changes in Gpr156 mutants reflect the role of the line of polarity reversal if, for example, hair cell transduction was abnormal. We had to perform experiments to find out. We were further motivated as data began to emerge from the zebrafish lateral line that showed effects on HC transduction. Although we did not get positive results on this question in the mouse, we think the difference between models should be included as a significant part of the narrative.

    2. eLife Assessment

      This valuable study provides convincing evidence that mutant hair cells with abnormal, reversed polarity of their hair bundles in mouse otolith organs retain wild-type localization, mechanoelectrical transduction and firing properties of their afferent innervation, leading to mild behavioral dysfunction. It thus demonstrates that the bimodal pattern of afferent nerve projections in this organ is not causally related to the bimodal distribution of hair-bundle orientations, as also confirmed in the zebrafish lateral line. The work will be of interest to scientists interested in the development and function of the vestibular system as well as in planar-cell polarity.

    3. Reviewer #1 (Public review):

      Summary:

      The authors aim at dissecting the relationship between hair-cell directional mechanosensation and orientation-linked synaptic selectivity, using mice and the zebrafish. They find that Gpr156 mutant animals homogenize the orientation of hair cells without affecting the selectivity of afferent neurons, suggesting that hair-cell orientation is not the feature that determines synaptic selectivity. Therefore, the process of Emx2-dependent synaptic selectivity bifurcates downstream of Gpr156.

      Strengths:

      This is an interesting and solid paper. It solves an interesting problem and establishes a framework for the following studies. That is, to ask what are the putative targets of Emx2 that affect synaptic selectivity.

      The quality of the data is generally excellent.

      Weaknesses:

      The feeling is that the advance derived from the results is very limited.

      Comments on revised version:

      I am happy with the authors' reply and do not wish to modify my initial assessment.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors inquire in particular whether the receptor Gpr156, which is necessary for hair cells to reverse their polarities in the zebrafish lateral line and mammalian otolith organs downstream of the differential expression of the transcription factor Emx2, also controls the mechanosensitive properties of hair cells and ultimately an animal's behavior. This study thoroughly addresses the issue by analyzing the morphology, electrophysiological responses, and afferent connections of hair cells found in different regions of the mammalian utricle and the Ca2+ responses of lateral line neuromasts in both wild-type animals and gpr156 mutants. Although many features of hair cell function are preserved in the mutants (such as development of the mechanosensory organs and the Emx2-dependent, polarity-specific afferent wiring and synaptic pairing), there are a few key changes. In the zebrafish neuromast, the magnitude of responses of all hair cells to water flow resembles that of the wild-type hair cells that respond to flow arriving from the tail. These responses are larger than those observed in hair cells that are sensitive to flow arriving from the head and resemble effects previously observed in Emx2 mutants. The authors note that this behavior suggests that the Emx2-GPR156 signaling axis also impinges on hair cell mechanotransduction. Although mutant mice exhibit normal posture and balance, they display defects in swimming behavior. Moreover, their vestibulo-ocular reflexes are perturbed. The authors note that the gpr156 mutant is a good model to study the role of opposing hair cell polarity in the vestibular system, for the wiring patterns follow the expression patterns of Emx2, even though hair cells are all of the same polarity. This paper excels at describing the effects of gpr156 perturbation in mouse and zebrafish models and will be of interest to those studying the vestibular system, hair cell polarity, and the role of inner-ear organs in animal behavior.

      The study is exceptional in including not only morphological and immunohistochemical indices of cellular identity but also electrophysiological properties. The mutant hair cells of murine maculæ display essentially normal mechanoelectrical transduction and adaptation (with two or even three kinetic components) as well as normal voltage-activated ionic currents.

    1. eLife Assessment

      The authors describe a model for tracking time-varying functional connectivity between neurons from multi-electrode spike recordings. This is an interesting and potentially useful approach to an open problem in neural data analysis, and could be an essential tool for investigating the neural code from large-scale in-vivo recordings of spiking activity. However, the evidence is incomplete: systematic comparisons with existing methods and/or demonstration of its utility relative to conventional methods are essential to demonstrate the usefulness of the method.

    2. Reviewer #1 (Public review):

      Summary:

      This work proposes a new method, DyNetCP, for inferring dynamic functional connectivity between neurons from spike data. DyNetCP is based on a neural network model with a two-stage model architecture of static and dynamic functional connectivity.

      This work evaluates the accuracy of the synaptic connectivity inference and shows that DyNetCP can infer the excitatory synaptic connectivity more accurately than a state-of-the-art model (GLMCC) by analyzing the simulated spike trains. Furthermore, it is shown that the inference results obtained by DyNetCP from large-scale in-vivo recordings are similar to the results obtained by the existing methods (jitter-corrected CCG and JPSTH). Finally, this work investigates the dynamic connectivity in the primary visual area VISp and in the visual areas using DyNetCP.

      Strengths:

      The strength of the paper is that it proposes a method to extract the dynamics of functional connectivity from spike trains of multiple neurons. The method is potentially useful for analyzing parallel spike trains in general, as there are only a few methods (e.g. Aertsen et al., J. Neurophysiol., 1989, Shimazaki et al., PLoS Comput Biol 2012) that infer the dynamic connectivity from spikes. Furthermore, the approach of DyNetCP is different from the existing methods: while the proposed method is based on the neural network, the previous methods are based on either the descriptive statistics (JPSTH) or the Ising model.

      Weaknesses:

      Although the paper proposes a new method, DyNetCP, for inferring the dynamic functional connectivity, its strengths are neither clear nor directly demonstrated in this paper. That is, insufficient analyses are performed to support the usefulness of DyNetCP.

      First, this paper attempts to show the superiority of DyNetCP by comparing the performance of synaptic connectivity inference with GLMCC (Fig. 2). However, the improvement in the synaptic connectivity inference does not seem to be convincing. While this paper compares the performance of DyNetCP with a state-of-the-art method (GLMCC), there are several problems with the comparison. For example,

      (1) It is unclear how accurately the proposed method can infer the dynamic connectivity.
      (2) This paper does not compare with existing approaches (e.g., classical JPSTH: Aertsen et al., J. Neurophysiol., 1989, and other methods: Stevenson and Koerding, NIPS, 2011; Linderman et al., NIPS, 2014; Song et al., J. Neurosci. Methods, 2015), and
      (3) only a population of neurons generated from the Hodgkin-Huxley model was evaluated.

      Thus, the results in this paper are not sufficient to conclude the superiority of DyNetCP in the estimation of synaptic connections. This paper also compares the proposed method with the standard statistical methods, such as jitter-corrected CCG (Fig. 3) and JPSTH (Fig. 4). It only shows that the results obtained by the proposed method are consistent with those obtained by the existing methods (CCG or JPSTH), which does not show the superiority of the proposed method.

      In summary, although DyNetCP has the potential to infer the dynamic (time-dependent) correlation more accurately than existing methods, the paper does not provide sufficient analysis to make this claim. It is also unclear whether the proposed method is superior to the existing methods for estimating functional connectivity, such as JPSTH and statistical approach (Stevenson and Koerding, NIPS, 2011; Linderman et al., NIPS, 2014). Thus, the strength of DyNetCP is unclear.

    3. Reviewer #2 (Public review):

      Summary:

      Here the authors describe a model for tracking time-varying coupling between neurons from multi-electrode spike recordings. Their approach extends a GLM with static coupling between neurons to include dynamic weights, learned by a long short-term memory (LSTM) model. Each connection has a corresponding LSTM embedding, which is read out by a multi-layer perceptron to predict the time-varying weight.

      Strengths:

      This is an interesting approach to an open problem in neural data analysis. I think, in general, the method would be interesting to computational neuroscientists.

      Weaknesses:

      It is somewhat difficult to interpret what the model is doing. I think it would be worthwhile to add some additional results that make it more clear what types of patterns are being described and how.

      Major Issues:

      Simulation for dynamic connectivity. It certainly seems doable to simulate a recurrent spiking network whose weights change over time, and I think this would be a worthwhile validation for this DyNetCP model. In particular, I think it would be valuable to understand how much the model overfits, and how accurately it can track known changes in coupling strength. If the only goal is "smoothing" time-varying CCGs, there are much easier statistical methods to do this (cf. McKenzie et al., Neuron, 2021; Ren, Wei, Ghanbari, Stevenson, J Neurosci, 2022), and simulations could be useful to illustrate what the model adds beyond smoothing.

      Stimulus vs noise correlations. For studying correlations between neurons in sensory systems that are strongly driven by stimuli, it's common to use shuffling over trials to distinguish between stimulus correlations and "noise" correlations or putative synaptic connections. This would be a valuable comparison for Fig 5 to show if these are dynamic stimulus correlations or noise correlations. I would also suggest just plotting the CCGs calculated with a moving window to better illustrate how (and if) the dynamic weights differ from the data.
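
      As an illustration of the trial-shuffling comparison suggested here, a minimal Python sketch follows. It assumes spikes are already binned into arrays of shape (trials, bins). The function names `ccg` and `shuffle_corrected_ccg` and the one-trial shift are illustrative choices, not part of DyNetCP or of the manuscript's analysis pipeline.

```python
import numpy as np

def ccg(x, y, max_lag):
    """Trial-averaged raw cross-correlogram of two binned spike trains.

    x, y: arrays of shape (n_trials, n_bins) holding spike counts per bin.
    Returns (lags, values), where values[i] = mean over trials of
    sum_t x[t] * y[t + lags[i]].
    """
    n_trials, n_bins = x.shape
    lags = np.arange(-max_lag, max_lag + 1)
    values = np.zeros(lags.size)
    for i, lag in enumerate(lags):
        if lag >= 0:
            prod = x[:, :n_bins - lag] * y[:, lag:]
        else:
            prod = x[:, -lag:] * y[:, :n_bins + lag]
        values[i] = prod.sum(axis=1).mean()
    return lags, values

def shuffle_corrected_ccg(x, y, max_lag, shift=1):
    """Raw CCG minus a shift predictor (assumed correction scheme).

    The predictor pairs trial k of x with trial k + shift of y, so it keeps
    stimulus-locked correlation but destroys within-trial coupling.
    """
    lags, raw = ccg(x, y, max_lag)
    _, predictor = ccg(x, np.roll(y, shift, axis=0), max_lag)
    return lags, raw - predictor
```

      Correlation that is identical on every trial (purely stimulus-locked) cancels in the subtraction, whereas a fixed within-trial lag between the two units survives as a peak. Applying the same functions to successive time windows of the trial would give the moving-window CCG mentioned above.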

      Minor Issues:

      Introduction - it may be useful to mention that there have been some previous attempts to describe time-varying connectivity from spikes both with probabilistic models: Stevenson and Kording, Neurips (2011), Linderman, Stock, and Adams, Neurips (2014), Robinson, Berger, and Song, Neural Computation (2016), Wei and Stevenson, Neural Comp (2021) ... and with descriptive statistics: Fujisawa et al. Nat Neuroscience (2008), English et al. Neuron (2017), McKenzie et al. Neuron (2021).

      In the sections "Static DyNetCP ...reproduce". It may be useful to have some additional context to interpret the CCG-DyNetCP correlations and CCG-GLMCC correlations (for simulation). If I understand right, these are on training data (not cross-validated) and the DyNetCP model is using NM+1 parameters to predict ~100 data points (It would also be good to say what N and M are for the results here). The GLMCC model has 2 or 3 parameters (if I remember right?).

      In the section "Static connectivity inferred by the DyNetCP from in-vivo recordings is biologically interpretable"... I may have missed it, but how is the "functional delay" calculated? And am I understanding right that for the DyNetCP you are just using [w_{i\to j}, w_{j\to i}] in place of the CCG?

    4. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for the constructive criticism and detailed assessment of our work which helped us to significantly improve our manuscript. We made significant changes to the text to better clarify our goals and approaches. To make our main goal of extracting the network dynamics clearer and to highlight the main advantage of our method in comparison with prior work we incorporated Videos 1-4 into the main text. We hope that these changes, together with the rest of our responses, convincingly demonstrate the utility of our method in producing results that are typically omitted from analysis by other methods and can provide important novel insights on the dynamics of the brain circuits. 

      Reviewer #1 (Public Review):

      (1) “First, this paper attempts to show the superiority of DyNetCP by comparing the performance of synaptic connectivity inference with GLMCC (Figure 2).”

      We believe that the goals of our work were not adequately formulated in the original manuscript, which led to this apparent misunderstanding. As opposed to most of the prior work, which focused on reconstruction of static connectivity from spiking data (including GLMCC), our ultimate goal is to learn the dynamic connectivity structure, i.e. to extract the time-dependent strength of the directed connectivity in the network. Since this formulation is fundamentally different from most of the prior work, the goal here is not to show "improvement" or "superiority" over prior methods that mostly focused on inference of static connectivity, but rather to thoroughly validate our approach and to show its usefulness for the dynamic analysis of experimental data.

      (2) “This paper also compares the proposed method with standard statistical methods, such as jitter-corrected CCG (Figure 3) and JPSTH (Figure 4). It only shows that the results obtained by the proposed method are consistent with those obtained by the existing methods (CCG or JPSTH), which does not show the superiority of the proposed method.”

      The major problem in designing such a dynamic model is the virtual absence of ground-truth data, either as verified experimental datasets or as synthetic data with known time-varying connectivity. In this situation, optimization of the model hyper-parameters and model verification largely become a "shot in the dark". To resolve this problem and make the model generalizable, we adopted a two-stage approach: in the first stage we learn static connections, and in the second stage we infer the temporally varying dynamic connectivity. Dividing the problem into two stages enables us to compare the results of each stage separately to traditional descriptive statistical approaches. Static connectivity results obtained in stage 1 are compared to classical pairwise CCG (Fig.2A,B) and GLMCC (Fig.2 C,D,E), while the dynamic connectivity obtained in stage 2 is compared to pairwise JPSTH (Fig.4D,E).

      Importantly, the goal here is therefore not to "outperform" the classical descriptive statistical or any other approaches, but rather to have solid guidance for designing the model architecture and optimizing the hyper-parameters. For example, to produce static weight results in Fig.2A,B that are statistically indistinguishable from the results of classical CCG, the procedure for selecting the weights that contribute to averaging is designed as shown in Fig.9 and discussed in detail in the Methods. Optimization of the L2 regularization parameter is illustrated in Fig.4 – figure supplement 1; it makes it possible to produce dynamic weights very close to cJPSTH, as evidenced by the Pearson coefficient and TOST statistical tests. These comparisons demonstrate that the results of CCG and JPSTH are indeed faithfully reproduced by our model, which, we conclude, is sufficient justification to apply the model to the analysis of experimental results.

      (3) “However, the improvement in the synaptic connectivity inference does not seem to be convincing.”

      We are grateful to the reviewer for pointing out this issue, which we believe, as mentioned above, results from the failure of the original manuscript to clarify the major motivation for this comparison. The comparison of the static connectivity inferred by stage 1 of our model to the results of GLMCC in Fig.2C,D,E is aimed at optimizing two further important parameters: the pair spike threshold and the peak height threshold. In Fig. 2D we show that when the peak height threshold is reduced from a rigorous 7 standard deviations (SD) to just 5 SD, our model recovers 74% of the ground truth connections, which is in fact better than the 69% produced by GLMCC for a comparable pair spike threshold of 80. As explained above, we do not intend to emphasize here that our model is "superior", since that was not our goal, but rather use this comparison to illustrate the approach for optimizing the thresholds for unit and pair filtering, as described in detail in Fig. 11 and the corresponding section in Methods.

      To address these misunderstandings and better clarify the goal of our work, we changed the text in the Introductory section accordingly. We also incorporated Videos 1-4 from the Supplementary Materials into the main text as Video 1, Video 2, Video 3, and Video 4. In fact, these videos represent the main advantage (or "superiority") of our model with respect to prior art: it makes it possible to infer the time-dependent dynamics of network connectivity as opposed to static connections.
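
      To make the role of the peak height threshold concrete, the following Python sketch shows one plausible way such a criterion can be applied to a CCG or static weight profile. It is an assumption-laden illustration, not the procedure of Fig. 11: the function name `has_significant_peak`, the use of the flanks as baseline, and the `peak_window` parameter are all hypothetical.

```python
import numpy as np

def has_significant_peak(values, lags, peak_window=10, n_sd=7.0):
    """Return True if the central peak exceeds the baseline by n_sd SDs.

    values: CCG (or weight) value at each lag.
    lags: lag of each entry, in bins.
    peak_window: |lag| range searched for the peak (hypothetical choice).
    The baseline mean and SD are estimated from the flanks outside that range.
    """
    values = np.asarray(values, dtype=float)
    lags = np.asarray(lags)
    center = np.abs(lags) <= peak_window
    flank = ~center
    baseline_mean = values[flank].mean()
    baseline_sd = values[flank].std()
    if baseline_sd == 0:
        return False  # no variability to normalize by; treat as undecided
    z = (values[center].max() - baseline_mean) / baseline_sd
    return bool(z >= n_sd)
```

      Under this kind of rule, lowering `n_sd` from 7 to 5 admits pairs whose peaks lie between 5 and 7 SD above baseline, which is the trade-off between recovered connections and false positives that the threshold comparison explores.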

      (4) “While this paper compares the performance of DyNetCP with a state-of-the-art method (GLMCC), there are several problems with the comparison. For example: 

      (a) This paper focused only on excitatory connections (i.e., ignoring inhibitory neurons). 

      (b) This paper does not compare with existing neural network-based methods (e.g., CoNNECT: Endo et al. Sci. Rep. 2021; Deep learning: Donner et al. bioRxiv, 2024).

      (c) Only a population of neurons generated from the Hodgkin-Huxley model was evaluated.”

      (a) In general, the model of Eq.1 is agnostic to whether the connections it recovers are excitatory or inhibitory. In fact, Fig. 5 and Fig.6 illustrate inferred dynamic weights for both excitatory (red arrows) and inhibitory (blue arrows) connections between excitatory (red triangles) and inhibitory (blue circles) neurons. Similarly, inhibitory and excitatory dynamic interactions between connections are represented in Fig. 7 for the larger network across all visual cortices.

      (b) As stated above, the goal for the comparison of the static connectivity results of stage 1 of our model to other approaches is to guide the choice of thresholds and optimization of hyperparameters rather than claiming “superiority” of our model. Therefore, comparison with “static” CNN-based model of Endo et al. or ANN-based static model of Donner et al. (submitted to bioRxiv several months after our submission to eLife) is beyond the scope of this work. 

      (c) We have chosen exactly the same sub-population of neurons from the synthetic HH dataset of Ref. 26 that is used in Fig.6 of Ref. 26, which allows a direct comparison between the connections reconstructed by GLMCC in the original Ref. 26 and the results of our model.

      (5) “In summary, although DyNetCP has the potential to infer synaptic connections more accurately than existing methods, the paper does not provide sufficient analysis to make this claim. It is also unclear whether the proposed method is superior to the existing methods for estimating functional connectivity, such as jitter-corrected CCG and JPSTH. Thus, the strength of DyNetCP is unclear.”

      As we explained above, we have no intention of claiming that our model is more accurate than existing static approaches. In fact, it is not feasible to estimate connectivity better than direct descriptive statistical methods such as CCG or JPSTH. Instead, comparisons with static (CCG and GLMCC) and temporal (JPSTH) approaches are used here to guide the choice of the model thresholds and to inform the optimization of hyper-parameters, so that the prediction of the dynamic network connectivity is reliable. The main strength of DyNetCP is the inference of dynamic connectivity, as illustrated in Videos 1-4. We demonstrated the utility of the method on the largest in-vivo experimental dataset available today and extracted the dynamics of cortical connectivity in local and global visual networks. This information is unattainable with any other contemporary method we are aware of. 

      Reviewer #1 (Recommendations for the Authors):

      (6) “First, the authors should clarify the goal of the analysis, i.e., to extract either the functional connectivity or the synaptic connectivity. While this paper assumes that they are the same, it should be noted that functional connectivity can be different from synaptic connectivity (see Steavenson IH, Neurons Behav. Data Anal. Theory 2023).”

      The goal of our analysis is to extract the dynamics of the spiking correlations. In this paper we intentionally avoided assigning a biological interpretation to the inferred dynamic weights. Our goal was to demonstrate that a trove of additional information on neural coding is hidden in the dynamics of neural correlations, information that is typically omitted from the analysis of neuroscience data. 

      Biological interpretation of the extracted dynamic weights can follow the terminology of short-term plasticity between synaptically connected neurons (Refs 25, 33-37) or spike transmission strength (Refs 30-32,46). Alternatively, temporal changes in connection weights can be interpreted in terms of dynamically reconfigurable functional interactions of cortical networks (Refs 8-11,13,47) through which the information is flowing. We also cannot exclude an interpretation that combines both ideas. In any event, our goal here is to extract these signals for a pair (Video 1, Fig.4), a cortical local circuit (Video 2, Fig.5), and for the whole visual cortical network (Videos 3, 4 and Fig.7). 

      To clarify this statement, we included a paragraph in the discussion section of the revised paper. 

      (7) “Finally, it would be valuable if the authors could also demonstrate the superiority of DyNetCP qualitatively. Can DyNetCP discover something interesting for neuroscientists from the large-scale in vivo dataset that the existing method cannot?”

      The model discovers dynamic time-varying changes in synchronous neuronal spiking (Videos 1-4) that more traditional methods like CCG or GLMCC are not able to detect. The revealed dynamics occur on very short time scales, on the order of just a few ms, during the stimulus presentation. Calculations of the intrinsic dimensionality of the spiking manifold (Fig. 8) reveal that up to 25 additional dimensions of the neural code can be recovered using our approach. These dimensions are typically omitted from the analysis of neural circuits using traditional methods.  

      Reviewer #2 (Public Review):

      (1) “Simulation for dynamic connectivity. It certainly seems doable to simulate a recurrent spiking network whose weights change over time, and I think this would be a worthwhile validation for this DyNetCP model. In particular, I think it would be valuable to understand how much the model overfits, and how accurately it can track known changes in coupling strength.”

      We are very grateful to the reviewer for this insight. Verification of the model on synthetic data with known time-varying connectivity would indeed be very useful. We did generate a synthetic dataset to test some of the model performance metrics, i.e. its ability to distinguish True Positive (TP) from False Positive (FP) “serial” or “common input” connections (Fig.10A,B). Comparison of dynamic and static weights might indeed help to distinguish TP connections from artifactual FP connections. 

      Generating a large synthetic dataset with known dynamic connections that mimics interactions in cortical networks is, however, a separate and nontrivial task that is beyond the scope of this work. Instead, we designed a model with an architecture in which overfitting can be tested in two consecutive stages by comparison with descriptive statistical approaches: CCG and JPSTH. Static stage 1 of the model predicts correlations that are statistically indistinguishable from the CCG results (Fig.2A,B). Dynamic stage 2 of the model produces dynamic weight matrices that faithfully reproduce the cJPSTH (Fig.4D,E). Calculated Pearson correlation coefficients and TOST testing enable optimization of the L2 regularization parameter, as shown in Fig.4 – supplement 1 and described in detail in the Methods section. The ability to test the results of both stages separately against descriptive statistical results is the main advantage of the chosen model architecture: it allows us to verify that the model does not overfit and can predict changes in coupling strength at least as well as descriptive statistical approaches (see also our answer above to the Reviewer #1 questions).

      (2) “If the only goal is "smoothing" time-varying CCGs, there are much easier statistical methods to do this (c.f. McKenzie et al. Neuron, 2021. Ren, Wei, Ghanbari, Stevenson. J Neurosci, 2022), and simulations could be useful to illustrate what the model adds beyond smoothing.”

      We are grateful to the reviewer for bringing up these very interesting and relevant references, which we added to the discussion section of the paper. The second one is of particular interest, as it calculates the time-varying CCG weight (“efficacy” in that paper's terms) on the same Allen Institute Visual dataset that our work uses. It is indeed an elegant way to extract a time-variable coupling strength similar to what our model generates. The major difference between our model and that of Ren et al., as well as GLMCC and other statistical approaches, is that DyNetCP learns the connections of an entire network jointly in one pass, rather than calculating the coupling separately for each pair in the dataset without considering the relative influence of other pairs in the network. Hence, our model can infer connections beyond pairwise (see Fig. 11 and the corresponding discussion in Methods) while performing the inference in a computationally efficient way. 

      (3) “Stimulus vs noise correlations. For studying correlations between neurons in sensory systems that are strongly driven by stimuli, it's common to use shuffling over trials to distinguish between stimulus correlations and "noise" correlations or putative synaptic connections. This would be a valuable comparison for Figure 5 to show if these are dynamic stimulus correlations or noise correlations. I would also suggest just plotting the CCGs calculated with a moving window to better illustrate how (and if) the dynamic weights differ from the data.”

      Thank you for this suggestion. Note that for all weight calculations in our model, the standard jitter correction procedure of Ref. 33 (Harrison et al., Neural Com 2009) is first applied to mitigate the influence of correlated slow fluctuations (slow “noise”). Please also note that to obtain the results in Fig. 5 we split the 440 total experimental trials for this session (when the animal is running, see Table 1) randomly into 352 training and 88 validation trials by selecting 44 training trials and 11 validation trials from each configuration of contrast or grating angle. We checked that changing this random selection produced the very same results as shown in Fig.5. 
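
      The per-configuration split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `stratified_split` is hypothetical, and the layout of 8 configurations with 55 trials each is an assumption inferred from the stated totals (352 / 44 = 88 / 11 = 8).

```python
import numpy as np

def stratified_split(config_ids, n_train, n_val, seed=0):
    """Randomly pick n_train training and n_val validation trials per configuration."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for config in np.unique(config_ids):
        # Shuffle the trial indices that belong to this stimulus configuration.
        trials = rng.permutation(np.flatnonzero(config_ids == config))
        train_idx.extend(trials[:n_train])
        val_idx.extend(trials[n_train:n_train + n_val])
    return np.sort(train_idx), np.sort(val_idx)

# Assumed layout: 8 stimulus configurations x 55 trials each = 440 trials.
config_ids = np.repeat(np.arange(8), 55)
train_idx, val_idx = stratified_split(config_ids, n_train=44, n_val=11)
print(len(train_idx), len(val_idx))
```

      Changing `seed` gives a different random selection, which is the kind of check the response refers to.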

      A comparison of the descriptive statistical results of the pairwise cJPSTH with the model is shown in Fig. 4D,E. The difference between the two is characterized in detail in Fig.4 – supplement 1 using the Pearson coefficient and TOST statistical tests.

      Reviewer #2 (Recommendations for the Authors):

      (4) “The method is described as "unsupervised" in the abstract, but most researchers would probably call this "supervised" (the static model, for instance, is logistic regression).”

      The model architecture is composed of two stages to make parameter optimization grounded. While the first stage is regression, the second and the most important stage is not. Therefore, we believe the term “unsupervised” is justified. 

      (5) “Introduction - it may be useful to mention that there have been some previous attempts to describe time-varying connectivity from spikes both with probabilistic models: Stevenson and Kording, Neurips (2011), Linderman, Stock, and Adams, Neurips (2014), Robinson, Berger, and Song, Neural Computation (2016), Wei and Stevenson, Neural Comp (2021) ... and with descriptive statistics: Fujisawa et al. Nat Neuroscience (2008), English et al. Neuron (2017), McKenzie et al. Neuron (2021).”

      We are very grateful to both reviewers for bringing up these very interesting and relevant references that we gladly included in the discussions within the Introduction and Discussion sections. 

      (6) “In the section "Static connectivity inferred by the DyNetCP from in-vivo recordings is biologically interpretable"... I may have missed it, but how is the "functional delay" calculated? And am I understanding right that for the DyNetCP you are just using [w_{i\to j}, w_{j\to i}] in place of the CCG?”

      The functional delay is calculated as a time lag of the maximum (or minimum) in the CCG (or static weight matrix). The static weight that the model is extracting is indeed the wiwj product. We changed the text in this section to better clarify these definitions. 
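
      As an illustration of this definition, the lag of the extremum can be read off a correlogram as in the sketch below. The function name `functional_delay` and the synthetic correlogram are hypothetical and only serve the example; the same call would apply to a static weight curve sampled on the same lag axis.

```python
import numpy as np

def functional_delay(ccg, lags_ms, use_minimum=False):
    """Return the time lag (in ms) at which the correlogram is maximal (or minimal)."""
    ccg = np.asarray(ccg)
    lags_ms = np.asarray(lags_ms)
    # Peak for putative excitatory effects, trough for putative inhibitory ones.
    idx = np.argmin(ccg) if use_minimum else np.argmax(ccg)
    return lags_ms[idx]

# Synthetic example: a peak at +1 ms and a trough at -2 ms.
lags = np.arange(-10, 11)
peak_ccg = np.exp(-0.5 * (lags - 1) ** 2)
trough_ccg = -np.exp(-0.5 * (lags + 2) ** 2)
print(functional_delay(peak_ccg, lags))
print(functional_delay(trough_ccg, lags, use_minimum=True))
```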

      (7) “P14 typo "sparce spiking" sparse”

      Fixed. Thank you. 

      (8) “Suggest rewarding "Extra-laminar interactions reveal formation of neuronal ensembles with both feedforward (e.g., layer 4 to layer 5), and feedback (e.g., layer 5 to layer 4) drives." I'm not sure this method can truly distinguish common input from directed, recurrent cortical effects. Just as an example in Figure 5, it looks like 2->4, 0->4, and 3>2 are 0 lag effects. If you wanted to add the "functional delay" analysis to this laminar result that could support some stronger claims about directionality, though.”

      The time lags for the results of Fig. 5 are indeed small, but nevertheless quantifiable. The left panel of Fig. 5A shows static results with the correlation peaks shifted by 1 ms from zero lag.

      (9) “Methods - I think it would be useful to mention how many parameters the full DyNetCP model has.”

      Overall, once the architecture of Fig.1C is established, the dynamic weight averaging procedure is selected (Fig.9), and Fourier features are introduced (Fig.10), there are just a few parameters to optimize, including the L2 regularization (Fig.4 – supplement 1) and the loss coefficient (Fig.1 – figure supplement 1A). Other variables, common to all statistical approaches, include the bin sizes in lag time and in trial time. Decreasing the bin size improves time resolution but reduces the number of spikes in each bin available for reliable inference. Therefore, the spike-count threshold and other related thresholds α_s, α_w, α_p, as well as λ_iλ_j, need to be adjusted accordingly (Fig.11), as discussed in detail in the Methods, Section 4. We included this sentence in the text. 

      (10) “It may be useful to also mention recent results in mice (Senzai et al. Neuron, 2019) and monkeys (Trepka...Moore. eLife, 2022) that are assessing similar laminar structures with CCGs.”

      Thank you for pointing out these very interesting references. We added a paragraph to the “Dynamic connectivity in VISp primary visual area” section comparing our results with these findings. In short, we observed that connections are distributed across the cortical depth with nearly the same maximum weights (Fig.7A). This is inconsistent with the greatly diminished static connection efficacy within <200µm from the source observed by Trepka et al., 2022. It is consistent, however, with the work of Senzai et al., 2019, which reveals much stronger long-distance correlations between layer 2/3 and layer 5 during waking than during sleep states. In both cases these observations represent static connections averaged over the trial time, while the results presented in Video 3 and Fig.7A show strong temporal modulation of the connection strength between all the layers during the stimulus presentation. Therefore, our results demonstrate that tracking dynamic connectivity patterns in local cortical networks can be invaluable in assessing circuit-level dynamic network organization.

    1. eLife Assessment

      This work provides an important and novel framework for interpreting the interactions between recurrent dynamics across stages of neural processing. The authors report that two different kinds of dynamics exist in recurrent networks differing in the extent to which they align with the output weights. The authors also present convincing evidence that both types of dynamics exist in the brain.

    2. Reviewer #1 (Public review):

      Summary:

      In this work, authors utilize recurrent neural networks (RNNs) to explore the question of when and how neural dynamics and the network's output are related from a geometrical point of view. The authors found that RNNs operate between two extremes: an 'aligned' regime in which the weights and the largest PCs are strongly correlated and an 'oblique' regime where the output weights and the largest PCs are poorly correlated. Large output weights led to oblique dynamics, and small output weights to aligned dynamics. This feature impacts whether networks are robust to perturbation along output directions. Results were linked to experimental data by showing that these different regimes can be identified in neural recordings from several experiments.

      Strengths:

      Diverse set of relevant tasks

      Similarity measure well chosen

      Explored various hyperparameter settings

      Weaknesses:

      One of the major connections found BCI data with neural variance aligned to the outputs. Maybe I was confused about something, but doesn't this have to be the case based on the design of the experiment? The outputs of the BCI are chosen to align with the largest principal components of the data.

      Proposed experiments may have already been done (New neural activity patterns emerge with long-term learning, Oby et al. 2019). My understanding of these results is that activity moved to be aligned as the manifold changed, but more analyses could be done to more fully understand the relationship between those experiments and this work.

      Analysis of networks was thorough, but connections to neural data were weak. I am thoroughly convinced of the reported effect of large or small output weights in networks. I also think this framing could aid in future studies of interactions between brain regions.

      This is an interesting framing to consider the relationship between upstream activity and downstream outputs. As more labs record from several brain regions simultaneously, this work will provide an important theoretical framework for thinking about the relative geometries of neural representations between brain regions.

      It will be interesting to compare the relationship between geometries of representations and neural dynamics across connected different brain areas that are closer to the periphery vs. more central.

      Exciting to think about the versatility of the oblique regime for shared representations and network dynamics across different computations.

      Versatility of oblique regime could lead to differences between subjects in neural data.

    3. Reviewer #2 (Public review):

      Summary:

      This paper tackles the problem of understanding when the dynamics of neural population activity do and do not align with some target output, such as an arm movement. The authors develop a theoretical framework based on RNNs showing that an alignment of neural dynamics to an output can be simply controlled by the magnitude of the read-out weight vector while the RNN is being trained: small magnitude vectors result in aligned dynamics, where low-dimensional neural activity recapitulates the target; large magnitude vectors result in "oblique" dynamics, where encoding is spread across many dimensions. The paper further explores how the aligned and oblique regimes differ, in particular that the oblique regime allows degenerate solutions for the same target output.

      Strengths:

      - A really interesting new idea that different dynamics of neural circuits can arise simply from the initial magnitude of the output weight vector: once written out (Eq 3) it becomes obvious, which I take as the mark of a genuinely insightful idea

      - The offered framework potentially unifies a collection of separate experimental results and ideas, largely from studies of motor cortex in primate: the idea that much of the ongoing dynamics do not encode movement parameters; the existence of the "null space" of preparatory activity; and that ongoing dynamics of motor cortex can rotate in the same direction even when the arm movement is rotating in opposite directions.

      - The main text is well written, with a wide-ranging set of key results synthesised and illustrated well and concisely

      - Shows the occurrence of the aligned and oblique regimes generalises across a range of simulated behavioural tasks

      - A deep analytical investigation of when the regimes occur and how they evolve over training

      - Shows where the oblique regime may be advantageous: allows multiple solutions to the same problem; and differs in sensitivity to perturbation and noise

      - An insightful corollary result that noise in training is needed to obtain the oblique regime

      - Tests whether the aligned and oblique regimes can be seen in neural recordings from primate cortex in a range of motor control tasks

      - The revised text offers greater clarity and precision about when the aligned and oblique regimes occur and in the interpretation of the analyses of neural data

      Weaknesses:

      - The depth of analytical treatment in the Methods is impressive; however, the paper and the Methods analyses are largely independent, with the numerous results in the latter not being mentioned in the Results or Discussion. It in effect operates as two papers.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, the authors utilize recurrent neural networks (RNNs) to explore the question of when and how neural dynamics and the network's output are related from a geometrical point of view. The authors found that RNNs operate between two extremes: an 'aligned' regime in which the weights and the largest PCs are strongly correlated and an 'oblique' regime where the output weights and the largest PCs are poorly correlated. Large output weights led to oblique dynamics, and small output weights to aligned dynamics. This feature impacts whether networks are robust to perturbation along output directions. Results were linked to experimental data by showing that these different regimes can be identified in neural recordings from several experiments.

      Strengths:

      A diverse set of relevant tasks.

      A well-chosen similarity measure.

      Exploration of various hyperparameter settings.

      Weaknesses:

      One of the major connections found BCI data with neural variance aligned to the outputs.

      Maybe I was confused about something, but doesn't this have to be the case based on the design of the experiment? The outputs of the BCI are chosen to align with the largest principal components of the data.

      The reviewer is correct. We indeed expected the BCI experiments to yield aligned dynamics. Our goal was to use this as a comparison for other, non-BCI recordings in which the correlation is smaller, i.e. dynamics closer to the oblique regime. We adjusted our wording accordingly and added a small discussion at the end of the experimental results, Section 2.6.

      Proposed experiments may have already been done (new neural activity patterns emerge with long-term learning, Oby et al. 2019). My understanding of these results is that activity moved to be aligned as the manifold changed, but more analyses could be done to more fully understand the relationship between those experiments and this work.

      The on- vs. off-manifold experiments are indeed very close to our work. On-manifold initializations, as stated above, are expected to yield aligned solutions. Off-manifold initializations allow, in principle, for both aligned and oblique solutions and are thus closer to our RNN simulations. If, during learning, the top PCs (dominant activity) rotate such that they align with the pre-defined output weights, then the system has reached an aligned solution. If the top PCs hardly change, and yet the behavior is still good, this is an oblique solution. There is some indication of an intermediate result (Figure 4C in Oby et al.), but the existing analysis there did not fully characterize these properties. Furthermore, our work suggests that systematically manipulating the norm of readout weights in off-manifold experiments can yield new insights. We thus view these as relevant results but suggest both further analysis and experiments. We rewrote the corresponding section in the discussion to include these points.

      Analysis of networks was thorough, but connections to neural data were weak. I am thoroughly convinced of the reported effect of large or small output weights in networks. I also think this framing could aid in future studies of interactions between brain regions.

      This is an interesting framing to consider the relationship between upstream activity and downstream outputs. As more labs record from several brain regions simultaneously, this work will provide an important theoretical framework for thinking about the relative geometries of neural representations between brain regions.

      It will be interesting to compare the relationship between geometries of representations and neural dynamics across connected different brain areas that are closer to the periphery vs. more central.

      It is exciting to think about the versatility of the oblique regime for shared representations and network dynamics across different computations.

      The versatility of the oblique regime could lead to differences between subjects in neural data.

      Thank you for the suggestions. Indeed, this is precisely why relative measures of the regime are valuable, even in the absence of absolute thresholds for regimes. We included your suggestions in the discussion.

      Reviewer #2 (Public Review):

      Summary:

      This paper tackles the problem of understanding when the dynamics of neural population activity do and do not align with some target output, such as an arm movement. The authors develop a theoretical framework based on RNNs showing that an alignment of neural dynamics to output can be simply controlled by the magnitude of the read-out weight vector while the RNN is being trained. Small magnitude vectors result in aligned dynamics, where low-dimensional neural activity recapitulates the target; large magnitude vectors result in "oblique" dynamics, where encoding is spread across many dimensions. The paper further explores how the aligned and oblique regimes differ, in particular, that the oblique regime allows degenerate solutions for the same target output.

      Strengths:

      - A really interesting new idea that different dynamics of neural circuits can arise simply from the initial magnitude of the output weight vector: once written out (Eq 3) it becomes obvious, which I take as the mark of a genuinely insightful idea.

      - The offered framework potentially unifies a collection of separate experimental results and ideas, largely from studies of the motor cortex in primates: the idea that much of the ongoing dynamics do not encode movement parameters; the existence of the "null space" of preparatory activity; and that ongoing dynamics of the motor cortex can rotate in the same direction even when the arm movement is rotating in opposite directions.

      - The main text is well written, with a wide-ranging set of key results synthesised and illustrated well and concisely.

      - The study shows that the occurrence of the aligned and oblique regimes generalises across a range of simulated behavioural tasks.

      - A deep analytical investigation of when the regimes occur and how they evolve over training.

      - The study shows where the oblique regime may be advantageous: allows multiple solutions to the same problem; and differs in sensitivity to perturbation and noise.

      - An insightful corollary result that noise in training is needed to obtain the oblique regime.

      - Tests whether the aligned and oblique regimes can be seen in neural recordings from primate cortex in a range of motor control tasks.

      Weaknesses:

      - The magnitude of the output weights is initially discussed as being fixed, and as far as I can tell all analytical results (sections 4.6-4.9) also assume this. But in all trained models that make up the bulk of the results (Figures 3-6) all three weight vectors/matrices (input, recurrent, and output) are trained by gradient descent. It would be good to see an explanation or results offered in the main text as to why the training always ends up in the same mapping (small->aligned; large->oblique) when it could, for example, optimise the output weights instead, which is the usual target (e.g. Sussillo & Abbott 2009 Neuron).

      We understand the reviewer’s surprise. We chose a typical setting (training all weights of an RNN with Adam) to show that we don’t have to fine-tune the setting (e.g. by fixing the output weights) to see the two regimes. However, other scenarios in which the output weights do change are possible, depending on the algorithm and details in the way the network is parameterized. Understanding why some settings lead to our scenario (no change in scale) and others don’t is not a simple question. A short explanation here, nonetheless:

      - Small changes to the internal weights are sufficient to solve the tasks.

      - Different versions of gradient descent and different ways of parametrizing the network lead to different outcomes regarding which parts of the weights get trained. This applies in particular to how weight scales are introduced, e.g. [Jacot et al. 2018 Neurips], [Geiger et al. 2020 Journal of Statistical Mechanics], or [Yang, Hu 2020, arXiv, Feature learning in infinite-width networks]. One insight from these works is that plain gradient descent (GD) with small output weights leads to learning only at the output (and often to divergence or unsuccessful learning). For this reason, plain GD (or stochastic GD) is not suitable for small output weights (the aligned regime). Other variants of GD, such as Adam or RMSprop, don’t have this problem because they shift the emphasis of learning to the hidden layers (here the recurrent weights). This is due to the normalization of the gradients.

      - FORCE learning [Sussillo & Abbott 2009] is somewhat special in that the output weights are simultaneously also used as feedback weights. That is, not only the output weights but also an additional low-rank feedback loop through these output weights is trained. As a side note: By construction, such a learning algorithm thus links the output directly to the internal dynamics, so that one would only expect aligned solutions – and the output weights remain correspondingly small in these algorithms [Mastrogiuseppe, Ostojic, 2019, Neural Comp].

      - In our setting, the output is not fed back to the network, so training the output alone would usually not suffice. Indeed, optimizing just the output weights is similar to what happens in the lazy training regime. These solutions, however, are not robust to noise, and we show that adding noise during the training does away with these solutions.
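
      The point about gradient normalization can be illustrated with a toy calculation. The sketch below is an assumption-laden simplification, not the RNN setting of the paper: it uses a single linear hidden layer with readout z = w_out · (W x), a squared-error loss, and only the first Adam step (where the bias-corrected update reduces to roughly the sign of the gradient). All names are hypothetical.

```python
import numpy as np

def toy_gradients(scale, n=50, seed=0):
    """Gradients of L = 0.5 * (w_out @ (W @ x) - target)**2 for a given readout scale."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    W = rng.standard_normal((n, n)) / np.sqrt(n)
    w_out = scale * rng.standard_normal(n) / np.sqrt(n)
    target = 1.0
    h = W @ x                            # hidden activity
    err = w_out @ h - target             # output error
    grad_W = err * np.outer(w_out, x)    # proportional to the readout scale
    grad_w_out = err * h                 # independent of the readout scale (up to err)
    return grad_W, grad_w_out

def adam_first_step(grad, lr=1e-3, eps=1e-8):
    """First Adam update after bias correction: -lr * g / (sqrt(g**2) + eps)."""
    return -lr * grad / (np.sqrt(grad ** 2) + eps)

for scale in (0.01, 1.0):
    gW, gw = toy_gradients(scale)
    ratio = np.linalg.norm(gW) / np.linalg.norm(gw)
    step = np.linalg.norm(adam_first_step(gW))
    print(f"scale={scale}: |grad_W|/|grad_w_out|={ratio:.4f}, |Adam step on W|={step:.4f}")
```

      In this toy setting, the plain-GD update of the hidden weights shrinks in proportion to the readout scale relative to the update of the readout itself, while the normalized Adam step on the hidden weights has essentially the same size for both scales.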

      To address this issue in the manuscript, we added the following sentence to section 2.2: “While explaining this observation is beyond the scope of this work, we note that (1) changing the internal weights suffices to solve the task, and that (2) the extent to which the output weights change during learning depends on the algorithm and specific parametrization [21, 27, 85].”

      - It is unclear what it means for neural activity to be "aligned" for target outputs that are not continuous time-series, such as the 1D or 2D oscillations used to illustrate most points here.

      Two of the modeled tasks have binary outputs; one has a 3-element binary vector.

      For any dynamics and output, we compare the alignment between the vector of output weights and the main PCs (the leading component of the dynamics). In the extreme of binary internal dynamics, i.e., two points {x_1, x_2}, there would only be one leading PC (the line connecting the two points, i.e. the choice decoder).
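
      A minimal sketch of such a comparison is given below. It is one simple way to quantify alignment (the fraction of the output weight norm that lies in the subspace of the top-k PCs) and is not necessarily the exact measure used in the paper; the function name `alignment_with_top_pcs` is hypothetical. The last lines reproduce the two-point example from the text.

```python
import numpy as np

def alignment_with_top_pcs(states, w_out, k=2):
    """Fraction of the norm of w_out lying in the subspace of the top-k PCs.

    states: array of shape (T, N), one row per time point or condition.
    w_out:  array of shape (N,), the output weight vector.
    """
    centered = states - states.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projection = vt[:k] @ w_out
    return np.linalg.norm(projection) / np.linalg.norm(w_out)

# Activity confined to the first two of ten dimensions.
rng = np.random.default_rng(0)
states = np.zeros((200, 10))
states[:, 0] = 3.0 * rng.standard_normal(200)
states[:, 1] = 2.0 * rng.standard_normal(200)
aligned = alignment_with_top_pcs(states, np.eye(10)[0])   # readout along a top PC
oblique = alignment_with_top_pcs(states, np.eye(10)[5])   # readout orthogonal to the activity

# Binary internal dynamics: two points, hence a single leading PC along x_2 - x_1.
x_1, x_2 = np.array([1.0, 0.0, 2.0]), np.array([-1.0, 1.0, 0.5])
binary = alignment_with_top_pcs(np.stack([x_1, x_2]), x_2 - x_1, k=1)
print(aligned, oblique, binary)
```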

      - It is unclear what criteria are used to assign the analysed neural data to the oblique or aligned regimes of dynamics.

      Such an assignment is indeed difficult to achieve. The RNN models we showed were at the extremes of the two regimes, and these regimes are well characterized in the case of large networks (as described in the methods section). For the neural data, we find different levels of alignment for different experiments. These differences may not be strong enough to assign different regimes. Instead, our measures (correlation and relative fitting dimension) allow us to order the datasets. Here, the BCI data is more aligned than the non-BCI data, which is perhaps unsurprising given the experimental design of the former and the previous findings for the rotation task [Russo et al, 2018]. We changed the manuscript accordingly, now focusing on the relative measure of alignment, even in the absence of absolute thresholds. We are curious whether future studies with more data, different tasks, or other brain regions might reveal stronger differentiation towards either extreme.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      There's so much interesting content in the supplement - it seemed like a whole other paper! It is interesting to read about the dynamics over the course of learning. Maybe you want to put this somewhere else so that more people read it?

      We are glad the reviewer appreciated this content. We think developing these analysis methods is essential for a more complete understanding of the oblique regime and how it arises, and that it should therefore be part of the current paper.

      Nice schematic in Figure 1.

      There were some statements in the text highlighting co-rotation in the top 2 PCs for oblique networks. Figure 4a looks like aligned networks might also co-rotate in a particular subspace that is not highlighted. I could be wrong, but the authors should look into this and correct it if so. If both aligned and oblique networks have co-rotation within the top 5 or so PCs, some text should be updated to reflect this.

      This is indeed the case, thanks for pointing this out! For one example, there is co-rotation for the aligned network already in the subspace spanned by PCs 1 and 3, see the figure below. We added a sentence indicating that co-rotation can take place at low-variance PCs for the aligned regime and pointed to this figure, which we added to the appendix (Fig. 17).

      While these observations are an important addition, we don’t think they qualitatively alter our results, particularly the stronger dissociation between output and internal dynamics for oblique than aligned dynamics.

      Figure 4 color labels were 'dark' and 'light'. I wasn't sure if this was a typo or if it was designed for colorblind readers? Either way, it wasn't too confusing, but adding more description might be useful.

      Fixed to red and yellow.

      Typo "Aligned networks have a ratio much large than one"

      Typo "just started to be explored" Typo "hence allowing to test"

      Fixed all typos.

      Reviewer #2 (Recommendations For The Authors):

      - Explain/discuss in the main text why the initial output weights reliably result in the required internal RNN dynamics (small->aligned; large->oblique) after training. The magnitude of the output weights is initially discussed as being fixed, and as far as I can tell all analytical results (sections 4.6-4.9) also assume this. But in all trained models that make up the bulk of the results (Figures 3-6) all three weight vectors/matrices (input, recurrent, and output) are trained by gradient descent. It would be good to see an explanation or results offered in the main text as to why the training always ends up in the same mapping (small->aligned; large->oblique) when it could, for example, just optimise the output weights instead.

      See the answer to a similar comment by Reviewer #1 above.

      - Page 6: explain the 5 tasks.

      We added a link to methods where the tasks are described.

      - Page 6/Fig 3 & Methods: explain assumptions used to compute a reconstruction R^2 between RNN PCs and a binary or vector target output.

      We added a new methods section, 4.4, where we explain the fitting process in Fig. 3. For all tasks, the target output was a time series with P specified target values in N_out dimensions. We thus always applied regression and did not differentiate between binary and non-binary tasks.

      - Page 8: methods and predictions are muddled up: paragraph ending "along different directions" should be followed by paragraph starting "Our intuition...". The intervening paragraph ("We apply perturbations...") should start after the first sentence of the paragraph "To test this,...".

      Right, these sentences were muddled up indeed. We put them in the correct order.

      - Page 10: what are the implications of the differences in noise alignment between the aligned and oblique regimes?

      The noise suppression in the oblique regime is a slow learning process that gradually renders the solution more stable. With a large readout, learning separates into two phases. In an early phase, a “lazy” solution is learned quickly; this solution is not robust to noise. In a second, slower phase, learning gradually leads to a more robust solution: the oblique solution. The main text emphasizes the result of this process (noise suppression). In the methods, we closely follow this process. This process is possibly related to other slow learning processes that fine-tune solutions, e.g., [Blanc et al. 2020, Li et al. 2021, Yang et al. 2023]. Furthermore, it would be interesting to see whether such fine-tuning happens in animals [Ratzon et al. 2024]. We added corresponding sentences to the discussion.

      - Neural data analysis:

      (i) Page 11 & Fig 7: the assignment of "aligned" or "oblique" to each neural dataset is based on the ratio of D_fit/D_x. But in all cases this ratio is less than 1, indicating fewer dimensions are needed for reconstruction than for explaining variance. Given the example in Figure 2 suggests this is an aligned regime, why assign any of them as "oblique"?

      We weakened the wording in the corresponding section, and now only state that BCI data leans more towards aligned, non-BCI data more towards oblique. This is consistent with the intuition that BCI is by construction aligned (decoder along largest PCs) and non-BCI data already showed signs of oblique dynamics (co-rotating leading PCs in the cycling task, Russo et al. 2018).

      We agree that Fig 2 (and Fig 3) could suggest distinguishing the regimes at a threshold D_fit/D_x = 1, although we hadn’t considered such a formal criterion.
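      The ratio D_fit/D_x discussed above can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes a single output dimension, activity already projected onto mutually orthogonal, mean-centered PCs, and a 90% threshold for both explained variance and reconstruction R^2; the function names, the threshold, and the toy data are hypothetical. Because the PC scores are orthogonal, the R^2 of a regression on the first k PCs is the sum of the per-PC squared correlations with the target.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cumulative_sum(values):
    out, running = [], 0.0
    for v in values:
        running += v
        out.append(running)
    return out

def first_k_reaching(cumulative, threshold):
    # Smallest number of leading PCs whose cumulative value reaches the threshold.
    for k, value in enumerate(cumulative, start=1):
        if value >= threshold - 1e-12:
            return k
    raise ValueError("threshold never reached")

def relative_fitting_dimension(pc_scores, target, threshold=0.9):
    """pc_scores: list of PC time series (orthogonal, centered), ordered by variance.
    target: centered output time series. Returns (D_fit, D_x, D_fit / D_x)."""
    variances = [dot(p, p) for p in pc_scores]
    total_variance = sum(variances)
    explained = [v / total_variance for v in cumulative_sum(variances)]
    d_x = first_k_reaching(explained, threshold)

    target_power = dot(target, target)
    # Orthogonal regressors: each PC contributes its squared correlation to R^2.
    r2_per_pc = [dot(p, target) ** 2 / (dot(p, p) * target_power) for p in pc_scores]
    d_fit = first_k_reaching(cumulative_sum(r2_per_pc), threshold)
    return d_fit, d_x, d_fit / d_x

# Hypothetical toy activity: three orthogonal PCs over four time points.
pcs = [[3, 3, -3, -3], [2, -2, 2, -2], [1, -1, -1, 1]]
aligned_target = [1, 1, -1, -1]   # output lies along the largest PC
oblique_target = [1, -1, -1, 1]   # output lies along the smallest PC

aligned_result = relative_fitting_dimension(pcs, aligned_target)
oblique_result = relative_fitting_dimension(pcs, oblique_target)
```

      In this toy case the target along the leading PC gives a ratio below one and the target along the low-variance PC gives a ratio above one, which matches the reading of the ratio as an ordering measure rather than an absolute criterion.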

      (ii) Figure 23 and main text page 11: discuss which outputs for NLB and BCI datasets were used in Figure 7 and the main text; the NLB results vary widely by output type - discuss in the main text; D_fit for NLB-maze-accuracy is missing from panel D; as the criterion is D_fit/D_x, plot this too.

      We now discuss which outputs were used in Fig. 7 in its caption: the velocity of the task-relevant entity (hand/finger/cursor). This was done to have one quantity across studies. We added a sentence to the main text, p. 11, which points to Fig 22 (which used to be Fig 23) and states that results are qualitatively similar for other decoded outputs, despite some fluctuations in numerical values and decodability.

      Regarding Fig 22: D_fit for NLB-maze-accuracy was beyond the manually set y-limit (for visibility of the other data points). We also extended the figure to include D_fit/D_x. We also discovered a small bug in the analysis code which required us to rerun the analysis and reproduce the plots. This also changed some of the numbers in the main text.

      - Discussion:

      "They do not explain why it [the "irrelevant activity"] is necessary", implies that the following sentence(s) will explain this, but do not. Instead, they go on to say:

      "Here, we showed that merely ensuring stability of neural dynamics can lead to the oblique regime": this does not explain why it is necessary, merely that it exists; and it is unclear what results "stability of neural dynamics" is referring to.

      We agree this was not a very clear formulation. We replaced these last three sentences with the following:

      “Our study systematically explains this phenomenon: generating task-related output in the presence of large, task-unrelated dynamics requires large readout weights. Conversely, in the presence of large output weights, resistance to noise or perturbations requires large, potentially task-unrelated neural dynamics (the oblique regime).”

      - The need for all 27 figures was unclear, especially as some seemed not to be referenced or were referenced out of order. Please check and clarify.

      Fig 16 (Details for network dynamics in cycling tasks) and Fig 21 (loss over learning time for the different tasks) were not referenced, and are now removed.

      We also reordered the figures in the appendix so that they would appear in the order they are referenced. Note that we added another figure (now Fig. 17) following a question from Reviewer #1.

    1. eLife Assessment

      This study presents useful findings on the role of the small GTPase Rab3A in homeostatic synaptic plasticity following activity suppression. While the study demonstrates that Rab3A is required for homeostatic scaling, the evidence supporting the model put forward by the authors is incomplete. The work will be of interest to the field of synaptic transmission and synaptic plasticity.

    2. Reviewer #1 (Public review):

      Koesters and colleagues investigated the role of the small GTPase Rab3A in homeostatic scaling of miniature synaptic transmission in primary mouse cortical cultures using electrophysiology and immunohistochemistry. The major finding is that TTX incubation for 48 hours does not induce an increase in the amplitude of excitatory synaptic miniature events in neuronal cultures derived from Rab3A KO and Rab3A Earlybird mutant mice. NASPM application had comparable effects on mEPSC amplitude in control and after TTX, implying that Ca2+-permeable glutamate receptors are unlikely modulated during synaptic scaling. Immunohistochemical analysis revealed no significant changes in GluA2 puncta size, intensity, and integral in control and Rab3A KO cultures. Finally, they provide evidence that loss of Rab3A in neurons, but not astrocytes, blocks homeostatic scaling. Based on these data, the authors propose a model in which neuronal Rab3A is required for homeostatic scaling of synaptic transmission through GluA2-dependent and independent mechanisms.

      While the title of the manuscript is mostly supported by data of solid quality, many conclusions, as well as the final model, cannot be derived from the results presented. Importantly, the data do not support that GluA2 levels change upon TTX treatment in control cultures, rendering conclusions regarding Rab3A's role in TTX-dependent GluA2 modulation spurious. Other aspects of the model, such as a Rab3A-dependent release of a trophic factor, cannot be derived from the data.

      The following points should be addressed:

      (1) There is no (significant) increase in GluA2 levels (intensity, area, or integral) upon TTX treatment in controls (Fig. 5). Conclusions regarding Rab3A's role in TTX-dependent GluA2 modulation should be revised accordingly. Hence, the data shown in Fig. 4 - 7 do not allow drawing conclusions in the context of Rab3A-dependent GluA2 modulation and scaling.

      (2) The effects of Rab3A on TTX-induced mini frequency modulation remain unclear, because TTX does not induce a change in mini frequency in the Rab3A+/Ebd control (Fig. 2). The respective conclusions should be revised accordingly (l. 427).

      (3) The model is still not supported by the data. In particular, data supporting a negative regulation of Rab3A by APs, Rab3A-dependent release of a trophic factor, or a Rab3A-dependent increase in GluA2 abundance are not presented.

      (4) Data points are not overlapping and appear "quantal" in most box plots. How were the data rounded?

    3. Reviewer #2 (Public review):

      In the revised manuscript, the authors investigated the role of a presynaptic protein, Rab3A, in the homeostatic synaptic plasticity in cultured cortical neurons. The study was motivated by their previous findings that Rab3A is required for expression of similar homeostatic mechanisms at the neuromuscular junction. The authors first show that untreated WT neurons express homeostatic synaptic plasticity in response to 48h of TTX treatment (upregulation of both mEPSC amplitude and frequency), whereas neurons lacking Rab3A or carrying a dominant negative mutated Rab3A (earlybird) do not. They also demonstrate that only neuronal, but not glial Rab3A is responsible for this defect. Furthermore, they confirm the increased mEPSC amplitudes in WT neurons reflect the addition of GluA2-containing AMPA receptors rather than calcium-permeable ones, as previously reported by multiple labs. However, the increase in mEPSC amplitude is not accompanied by a corresponding upregulation of GluA2 synaptic clusters according to their IHC data (cluster size and intensity trend slightly upwards but not reaching significance). They further show that this modest upward trend is absent in Rab3A KO neurons, and conclude that Rab3A is involved in postsynaptic GluA2 upregulation during homeostatic synaptic plasticity.

      When compared to the original version, the authors have done an admirable job in switching to more established ways to assess homeostatic synaptic plasticity by comparing the mean mEPSC amplitude and frequency, which has greatly improved the legibility of the manuscript to the public. Their data in Figures 1,2, and 8 clearly demonstrate that functional Rab3A in cortical neurons is required for the homeostatic regulation of mEPSCs.

      However, the authors still have not provided further investigation of the mechanisms behind the role of Rab3A in this form of plasticity, and the revision therefore has added little to the significance of the study. Moreover, the experimental design for the investigation of the mismatch between mEPSC amplitude and GluA2 cluster fluorescence remains questionable, making it difficult to draw any credible conclusions from groups of data that not only look similar to the eye but also show no significance statistically.

      A major claim the authors want to make is that Rab3A, although a presynaptic protein, regulates postsynaptic GluA2, and they do this by showing in Figure 5 that the upward trend of GluA2 cluster size and intensity is absent in Rab3A KO neurons. First, it is difficult to convince readers that this upward trend is real in Figures 5B-D without getting more samples. Second, the authors pick GluA2 clusters on the primary dendrites, whereas mEPSC events come from a much larger synapse population (e.g., more distal), therefore it makes sense that these two forms of measurement do not show matching changes, and this caveat could be addressed by sampling more diverse dendritic locations. Without a convincing phenotype in WT neurons, the support for this claim is weak.

      Another claim of the authors is that this mismatch between mEPSC amplitude and GluA2 cluster sizes within the same culture suggests there are other factors contributing to the mEPSC amplitude. They do this by comparing results from individual culture dissociations, which greatly suffer from undersampling (Figure 6). In particular, they point out that 2 out of 3 dissociations show "matching" upward trends in mEPSC and GluA2 cluster (Figures 6A and 6B) while the third one shows opposite trends, and use this to support their claim. Anyone who has done culture preparation would know the high variability between dissociations, which is why culture data are always pooled for assessment of any population trend. Anything could have happened to this particular dissociation (culture #3, Figure 6C), and drawing a conclusion from this single incident does little to support this claim. At least, they should double the dissociation numbers, and their claim would be much more convincing if a similar phenomenon occurs again. Besides, as mentioned above, all these mismatching trends could just be due to sampling differences.

      In summary, this study establishes that neuronal Rab3A plays a role in homeostatic synaptic plasticity, but so do a number of other molecules that have been implicated in homeostatic synaptic plasticity in the past two decades (a list that will only grow with new techniques such as RNAseq). Without going beyond this finding and demonstrating how exactly Rab3A participates in the induction and/or expression of this form of plasticity, or maybe the potential Rab3A-mediated functional and behavioral defects in vivo, the contribution of the current study to the field is limited. However, given the presynaptic location of Rab3A, this finding could serve as a starting point for researchers interested in pre-postsynaptic cross-talk during homeostatic plasticity in general.

    4. Reviewer #3 (Public review):

      This manuscript presents a number of interesting findings that have the potential to increase our understanding of the mechanism underlying homeostatic synaptic plasticity (HSP). The data broadly support that Rab3A plays a role in HSP, although the site and mechanism of action remain uncertain.

      The authors clearly demonstrate that Rab3A plays a role in HSP at excitatory synapses, with substantially less plasticity occurring in the Rab3A KO neurons. There is also no apparent HSP in the Earlybird Rab3A mutation, although baseline synaptic strength seems already elevated. In this context, it is unclear if the plasticity is absent or just occluded by a ceiling effect due to the synapses already being strengthened. Occlusion may also occur in the mixed cultures, with Rab3A missing from neurons but not astrocytes. The authors do appropriately discuss both options. There are also differences in genetic background between the Rab3A KO and Earlybird mutants that could also impact the results, which are also noted. The authors have solid data showing that Rab3A is unlikely to be active in astrocytes. Finally, they attempt to study the linkage between synaptic strength during HSP and AMPA receptor trafficking and conclude that trafficking may not be solely responsible for the changes in synaptic strength.

      Strengths:

      This work adds another player into the mechanisms underlying an important form of synaptic plasticity. The plasticity is likely only reduced, suggesting Rab3A is only partially required and perhaps multiple mechanisms contribute. The authors speculate about some possible novel mechanisms.

      However, the conclusions on the partial dissociation of AMPAR trafficking and synaptic response are made from somewhat weaker data. On average, across 3 culture sets, they saw similar magnitude of change in mEPSC amplitude and GluA2 cluster area and integral, but the GluA2 data was not significant. This is likely due to the nature of the datasets. Their imaging method involves only assessing puncta pairs (GluA2/VGlut1) clearly associated with a MAP2 labeled dendrite. This is a small subset of synapses, with usually less than 20 synapses per neuron analyzed (as stated by the authors). The mEPSC recordings will be averaging across several hundred events, which likely represent a hundred or more synapses given reasonable expectations on release probability. It has been reported, in work from this lab as well as by direct monitoring of tagged AMPARs during HSP (Wang, et al., 2019), that individual synapses are quite variable in their response. So there will almost necessarily be higher variability in the imaging data due to the smaller number of synapses sampled. The overall trends, though, are in alignment with previous data implicating receptor trafficking as the mechanism for HSP. However, the authors go on to evaluate each of the individual cultures, where 2 show similar changes between the mEPSC data and GluA2 clusters, and 1 culture shows little/no change in GluA2 clusters. The n's are very low here, and none of the datasets are significant. They want to conclude that, for this culture, there was a change in mEPSC amplitude that was not accompanied by a change in GluA2 at synaptic sites. But these data are collected from different coverslips, and due to the low n's, the potential under-sampling of the GluA2 clusters, and neuron-to-neuron variability, it is very hard to distinguish if this apparent difference is a methodological issue rather than a biological one. Much stronger data would be necessary to conclude that additional factors beyond receptor trafficking are required for HSP.

      Other questions arise from the NASPM experiments, used to justify looking at GluA2 (and not GluA1) in the immunostaining. First, there is a frequency effect that is unclear in origin. One would expect NASPM to merely block some fraction of the post-synaptic current, and not affect pre-synaptic release or block whole synapses. However the change in frequency seems to argue (as the authors do) that some synapses only have CP-AMPARs, while the rest of the synapses have few or none. Another possibility is that there are pre-synaptic NASPM-sensitive receptors that influence release probability. Further, the amplitude data show a strong trend towards smaller amplitude following NASPM treatment (Fig 3B). The p value for both control and TTX neurons was 0.08 - it is very difficult to argue that there is no effect. The decrease on average is larger in the TTX neurons, and some cells show a strong effect. It is possible there is some heterogeneity between neurons on whether GluA1/A2 heteromers or GluA1 homomers are added during HSP. This would impact the weakly supported conclusions about the GluA2 imaging vs mEPSC amplitude data.

      Unaddressed issues that would greatly increase the impact of the paper:

      (1) Is Rab3A acting pre-synaptically, post-synaptically or both? The authors provide good evidence that Rab3A is acting within neurons and not astrocytes. But where it is acting (pre or post) would aid substantially in understanding its role. They could use sparse knock-down of Rab3A, or simply mix cultures from KO and WT mice (with appropriate tags/labels). The general view in the field has been that HSP is regulated post-synaptically via regulation of AMPAR trafficking, and considerable evidence supports this view. The more support for their suggestion of a pre-synaptic site of control, the better.

      (2) Rab3A is also found at inhibitory synapses. It would be very informative to know if HSP at inhibitory synapses is similarly affected. This is particularly relevant as at inhibitory synapses, one expects a removal of GABARs (ie the opposite of whatever is happening at excitatory synapses). If both processes are regulated by Rab3A, this might suggest a role for this protein more upstream in the signaling; an effect only at excitatory synapses would argue for a more specific role just at these synapses.

    5. Author response:

      The following is the authors’ response to the original reviews.

      The detailed, thorough critique provided by the three reviewers is very much appreciated. We believe the manuscript is greatly improved by the changes we have made based on those reviews. The major changes are described below, followed by a point by point response.

      Major Changes:

      (1) We revised our model (old Fig. 10; new Fig. 9) to keep the explanation focused on the data shown in the current study. Specifically, references to GTP/GDP states of Rab3A and changes in the presynaptic quantum have been removed and the mechanisms depicted are confined to pre- or post-synaptic Rab3A participating in either controlling release of a trophic factor that regulates surface GluA2 receptors (pre- or postsynaptic) or directly affecting fusion of GluA2-receptor containing vesicles (postsynaptic).

      (2) We replaced all cumulative density function plots and ratio plots, based on multiple quantile samples per cell, with box plots of cell means. This affects new Figures 1, 2, 3, 5, 6, 7 and 8. All references to “scaling,” “divergent scaling,” or “uniform scaling,” have been removed. New p values for comparison of means are provided above every box plot in Figures 1, 2, 3, 5, 6, 7 and 8. The number of cultures is provided in the figure legends.
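      The change from pooled quantile samples to cell means can be illustrated with a minimal sketch. The amplitudes below are invented for illustration, and the exact permutation test is a stand-in chosen because it needs only the standard library; it is not claimed to be the test used in the manuscript. The point is that the unit of analysis becomes the cell, so n equals the number of cells rather than the number of events or quantile samples.

```python
from itertools import combinations

# Hypothetical mEPSC amplitudes (pA), several events per cell.
control_cells = [[12, 13, 14], [10, 11, 12], [13, 14, 15], [11, 12, 13]]
ttx_cells = [[15, 16, 17], [14, 15, 16], [16, 17, 18], [13, 14, 15]]

def mean(values):
    return sum(values) / len(values)

# One value per cell: the sample size is the number of cells.
control_means = [mean(cell) for cell in control_cells]
ttx_means = [mean(cell) for cell in ttx_cells]
n_cells = len(control_means) + len(ttx_means)

# Pooling every event would inflate n well beyond the number of cells.
n_pooled_events = sum(len(c) for c in control_cells + ttx_cells)

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = group_a + group_b
    observed = abs(mean(group_b) - mean(group_a))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        if abs(mean(b) - mean(a)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

p_value = exact_permutation_p(control_means, ttx_means)
```

      With only eight cells the smallest attainable p value is bounded by the number of possible relabelings, which is the honest counterpart of the very small p values that pooled samples can produce.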

      (3) We have added frequency to Figures 1, 2 and 8. Frequency values overall are more variable, and the effect of activity blockade less robust, than for mEPSC amplitudes. We have added text indicating that the increase in frequency after activity blockade was significant in neurons from cultures prepared from WT in the Rab3A+/- colony but not cultures prepared from KO mice (Results, lines 143 to 147, new Fig. 1G, H). The TTX-induced increase in frequency was significant in the NASPM experiments before NASPM, but not after NASPM (Results, lines 231 to 233, new Fig. 3, also cultures from WT in Rab3A+/- colony). The homeostatic plasticity effect on frequency did not reach significance in WT on WT glia cultures or WT on KO glia cultures, possibly due to the variability of frequency, combined with smaller sample sizes (Results, lines 400 to 403, new Fig. 8). In the cultures prepared from WT mice in the Rab3A+/Ebd colony, there was a trend towards higher frequency after TTX that did not reach statistical significance, and in cultures prepared from mutant mice, the p value was large, suggesting disruption of the effect, which appears to be due to an increase in frequency in untreated cultures, similar to the behavior of mEPSC amplitudes in neurons from mutant mice (Results, lines 161-167). In sum, the effect of activity on frequency requires Rab3A and Ca2+-permeable receptors, and is mimicked by the presence of the Rab3A Earlybird mutant. We have also added a discussion of these results (Discussion, lines 427-435).

      (4) In the revised manuscript we have added analysis of VGLUT1 levels for the same synaptic sites that we previously analyzed GluA2 levels, and these data are described in Results, lines 344 to 371, and appear in new Table 2. In contrast to previous studies, we did not find any evidence for an increase in VGLUT1 levels after activity blockade. We reviewed those studies to determine whether there might be differences in the experimental details that could explain the lack of effect we observed. In (De Gois et al., 2005), the authors measured mRNA and performed western blots to show increases in VGLUT1 after TTX treatment in older rat cortical cultures (DIV 19). The study performs immunofluorescence imaging of VGLUT1 but only after bicuculline treatment (it decreases), not after TTX treatment. In (Wilson et al., 2005), the hippocampal cultures are treated with AP5, not TTX, and the VGLUT1 levels in immunofluorescence images are reported relative to synapsin I. That the type of activity blockade matters is illustrated by the failure of Wilson and colleagues to observe a consistent increase in VGLUT1/Synapsin ratio in cultures treated with AMPA receptor blockade (NBQX; supplementary information). These points have been added to the Discussion, lines 436 to 447.

      Reviewer #1:

      (1) (model…is not supported by the data), (2) (The analysis of mEPSC data using quantile sampling…), (3) (…statistical analysis of CDFs suffers from n-inflation…), (4) (How does recording noise and the mEPSC amplitude threshold affect “divergent scaling?”) (5) (…justification for the line fits of the ratio data…), (7) (A comparison of p-values between conditions….) and (10) (Was VGLUT intensity altered in the stainings presented in the manuscript?)

      The major changes we made, described above, address Reviewer #1’s points. The remaining points are addressed below.

      (6) TTX application induces a significant increase in mEPSC amplitude in Rab3A-/- mice in two out of three data sets (Figs. 1 and 9). Hence, the major conclusion that Rab3A is required for homeostatic scaling is only partially supported by the data. 

      The p values based on CDF comparisons were problematic, but the point we were making is that they were much larger for amplitudes measured in cultures prepared from Rab3A-/- mice (Fig. 1, p = 0.04) compared to those from cultures prepared from Rab3A+/+ mice (Fig. 1, p = 4.6 × 10⁻⁴). Now that we are comparing means, there are no significant TTX-induced effects on mEPSC amplitudes for Rab3A-/- data. However, acknowledging that some increase after activity blockade remains, we describe homeostatic plasticity as being impaired or not significant, rather than abolished, by loss of Rab3A (Abstract, lines 37 to 39; Results, lines 141 to 143; Discussion, lines 415 to 418).

      (8) There is a significant increase in baseline mEPSC amplitude in Rab3AEbd/Ebd (15 pA) vs. Rab3AEbd/+ (11 pA) cultures, but not in Rab3A-/- (13.6 pA) vs. Rab3A+/- (13.9 pA). Although the nature of scaling was different between Rab3AEbd/Ebd vs. Rab3AEbd/+ and Rab3AEbd/Ebd with vs. without TTX, the question arises whether the increase in mEPSC amplitude in Rab3AEbd/Ebd is Rab3A dependent. Could a Rab3A independent mechanism occlude scaling?

      The Reviewer is concerned that the increase in mEPSC amplitude in the presence of the Rab3A point mutant may be through a ‘non-Rab3A’ mechanism (a concern raised by the lack of such effect in cultures from the Rab3A-/- mice), and secondly, that the already large mEPSC cannot be further increased by the homeostatic plasticity mechanism. It must always be considered that a mutant with an altered genetic sequence may bind to novel partners, causing activities that would not be either facilitated or inhibited by the original molecule. We have added this caveat to Results, lines 180 to 186. We added that a number of other manipulations, implicating individual molecules in the homeostatic mechanism, have caused an increase in mEPSC amplitude at baseline, potentially nonspecifically occluding the ability of activity blockade to induce a further increase (Results, lines 186 to 189). Still, it is a strong coincidence that the novel activity of the mutant Rab3A would affect mEPSC amplitude, the same characteristic that is affected by activity blockade in a Rab3A-dependent manner, a point which we added to Results, lines 189 to 191.

      (9) Figure 4: NASPM appears to have a stronger effect on mEPSC frequency in the TTX condition vs. control (-40% vs -15%). A larger sample size might be necessary to draw definitive conclusions on the contribution of Ca2+-permeable AMPARs.

      Our results, even with the modest sample size of 11 cells, are clear: NASPM does not disrupt the effect of TTX treatment on mEPSC amplitude (new Fig. 3A). It also looks like there is a greater magnitude effect of NASPM on frequency in TTX-treated cells; we note this, but point out that nevertheless, these mEPSCs are not contributing to the increase in mEPSC amplitude (Results, lines 238-241).

      (11) The change in GluA2 area or fluorescence intensity upon TTX treatment in controls is modest. How does the GluA2 integral change?

      We had reported that GluA2 area showed the most prominent increase following activity blockade, with intensity changing very little. When we examined the integral, it closely matched the change in area. We have added the values for integral to new Fig. 5 D, H; new Fig. 6 A-C; new Fig. 7 A-C and new Table 1 (for GluA2) and new Table 2 (for VGLUT1). These results are described in the text in the following places (Results, lines 289-292; 298-299; 311-319; 328-324). For VGLUT1, both area and intensity changed modestly, and the integral appeared to be a combination of the two, being higher in magnitude and resulting in smaller p values than either area or intensity (Results, lines 344-348; 353-359; new Table 2).

      (12) The quantitative comparison between physiology and microscopy data is problematic. The authors report a mismatch in ratio values between the smallest mEPSC amplitudes and the smallest GluA2 receptor cluster sizes (l. 464; Figure 8). Is this comparison affected by the fluorescence intensity threshold? What was the rationale for a threshold of 400 a.u. or 450 a.u.? How does this threshold compare to the mEPSC threshold of 3 pA?

      This concern is partially addressed by no longer comparing the rank ordered mEPSC amplitudes with the rank ordered GluA2 receptor characteristics. We had used multiple thresholds in the event that an experiment was not analyzable with the chosen threshold (this in fact happened for VGLUT1, see end of this paragraph). We created box plots of the mean GluA2 receptor cluster size, intensity and integral, for experiments in which we used all three thresholds, to determine if the effect of activity blockade was different depending on which threshold was applied, and found that there was no obvious difference in the results (Author response image 1). Nevertheless, since there is no need to use a different threshold for any of the 6 experiments (3 WT and 3 KO), for new Figures 5, 6 and 7 we used the same threshold for all data, 450; described in Methods, lines 746 to 749. For VGLUT1 levels, it was necessary to use a different threshold for Rab3A+/+ Culture #1 (400), but a threshold of 200 for the other five experiments (Methods, lines 751-757). The VGLUT1 immunofluorescent sites in Culture #1 had higher levels overall, and the low threshold caused the entire AOI to be counted as the synapse, which clearly included background levels outside of the synaptic site. Conversely, to use a threshold of 400 on the other experiments meant that the synaptic site found by the automated measurement tool was much smaller than what was visible by eye. In our judgement it would have been meaningless to adhere to a single threshold for VGLUT1 data.
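      The dependence of cluster measurements on the intensity threshold can be sketched as follows. This is an illustration only, not the automated measurement tool used in the study: the pixel values are invented, pixels strictly above the threshold are assumed to belong to the cluster, and the integral is assumed to be the summed intensity of those pixels.

```python
def cluster_metrics(aoi, threshold):
    """Return (area, mean_intensity, integral) for pixels above threshold.

    aoi: 2D list of pixel intensities (a.u.) for one area of interest.
    area is a pixel count; integral is the summed intensity of counted pixels.
    """
    selected = [value for row in aoi for value in row if value > threshold]
    area = len(selected)
    integral = sum(selected)
    mean_intensity = integral / area if area else 0.0
    return area, mean_intensity, integral

# Hypothetical 4 x 4 AOI: a bright 2 x 2 synaptic site on a dim background.
aoi = [
    [150, 160, 170, 150],
    [160, 600, 700, 170],
    [150, 650, 900, 160],
    [170, 160, 150, 150],
]

at_450 = cluster_metrics(aoi, 450)   # only the bright site is counted
at_100 = cluster_metrics(aoi, 100)   # threshold below background: whole AOI counted
```

      In this toy AOI, a threshold below the background level counts every pixel as part of the site, which is the failure described above for the low VGLUT1 threshold in Culture #1.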

      Author response image 1.

      Using different thresholds does not substantially alter GluA2 receptor cluster size data. A) Rab3A+/+ Culture #1, size data for three different thresholds, depicted above each graph. B) Rab3A+/+ Culture #2, size data for three different thresholds, depicted above each graph. Note scale bar in A is different from B, to highlight differences for different thresholds. (Culture #3 was only analyzed with 450 threshold).

      The conclusion that an increase in AMPAR levels is not fully responsible for the observed mEPSC increase is mainly based on the rank-order analysis of GluA2 intensity, yielding a slope of ~0.9. There are several points to consider here: (i) GluA2 fluorescence intensity did increase on average, as did GluA2 cluster size.

      (ii) The increase in GluA2 cluster size is very similar to the increase in mEPSC amplitude (each approx. 18-20%). (iii) Are there any reports that fluorescence intensity values are linearly reporting mEPSC amplitudes (in this system)? Antibody labelling efficiency, and false negatives of mEPSC recordings may influence the results. The latter was already noted by the authors.

      Our comparison between mEPSC amplitude and GluA2 receptor cluster characteristics has been reexamined in the revised version using means rather than rank-ordered data in rank-order plots or ratio plots. Importantly, all of these methods revealed that in one out of three WT cultures (Culture #3) GluA2 receptor cluster size (old Fig. 8, old Table 1; new Fig. 6, new Table 1), intensity and integral (new Fig. 6, new Table 1) values decreased following activity blockade while in the same culture, mEPSC amplitudes increased. It is based on this lack of correspondence that we conclude that increases in mEPSC amplitude are not fully explained by increases in GluA2 receptors, and suggest there may be other contributors. These points are made in the Abstract (lines 108-110); Results (lines 319 to 326; 330-337; 341-343) and the Discussion (lines 472 to 474). To our knowledge, there are not any reports that quantitatively compare receptor levels (area, intensity or integrals) to mEPSC amplitudes in the same cultures. We examined the comparisons very closely for 5 studies that used TTX to block activity and examined receptor levels using confocal imaging at identified synapses (Hou et al., 2008; Ibata et al., 2008; Jakawich et al., 2010a; Xu and Pozzo-Miller, 2017; Dubes et al., 2022). We were specifically looking for whether the receptor data were more variable than the mEPSC amplitude data, as we found. However, for 4 of the studies, sample sizes were very different so that we cannot simply compare the p values. Below is a table of the comparisons.

      Author response table 1.

      In Xu 2017 the sample sizes are close enough that we feel comfortable concluding that the receptor data were slightly more variable (p < 0.05) than mEPSC data (p < 0.01) but recognize that it is speculative to say our finding has been confirmed. A discussion of these articles is in Discussion, lines 456-474.
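      Why p values obtained from samples of very different size cannot be compared directly can be illustrated with a minimal sketch. It assumes a two-sample t statistic with equal group sizes and a shared standard deviation, which is not necessarily the test used in the cited studies; the effect size and the sample sizes below are invented for illustration.

```python
import math


def two_sample_t(effect_size, n_per_group):
    """t statistic for two equal-sized groups with a shared SD.

    effect_size: difference in means divided by the SD (hypothetical value).
    For equal n, t = effect_size * sqrt(n / 2).
    """
    return effect_size * math.sqrt(n_per_group / 2.0)


# Same hypothetical standardized effect at two hypothetical sample sizes.
t_small = two_sample_t(0.8, 10)
t_large = two_sample_t(0.8, 40)
print(round(t_small, 2), round(t_large, 2))
```

      For the same underlying effect, a fourfold larger sample doubles the statistic, so a smaller p value on its own does not show that one measurement is less variable than the other.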

      (iv) It is not entirely clear if their imaging experiments will sample from all synapses. Other AMPAR subtypes than GluA2 could contribute, as could kainate or NMDA receptors.

      While our imaging data only examined GluA2, we used the application of NASPM to demonstrate Ca2+-permeable receptors did not contribute quantitatively to the increase in mEPSC amplitude following TTX treatment. Since GluA3 and GluA4 are also Ca2+-permeable, the findings in new Figure 3 (old Fig. 4) likely rule out these receptors as well. There are also reports that kainate receptors are Ca2+-permeable and blocked by NASPM (Koike et al., 1997; Sun et al., 2009), suggesting the NASPM experiment also rules out the contribution of kainate receptors. Finally, given our recording conditions, which included normal magnesium levels in the extracellular solution as well as TTX to block action-potential evoked synaptic transmission, NMDA receptors would not be available to contribute currents to our recordings due to block by magnesium ions at resting Vm. These points have been added to the Methods section, lines 617 to 677 (NMDA); 687-694 (Ca2+-permeable AMPA receptors and kainate receptors).

      Furthermore, the statement “complete lack of correspondence of TTX/CON ratios” is not supported by the data presented (l. 515ff). First, under the assumption that no scaling occurs in Rab3A-/-, the TTX/CON ratios show a 20-30% change, which indicates the variation of this readout. Second, the two examples shown in Figure 8 for Rab3A+/+ are actually quite similar (culture #1 and #2), particularly when ignoring the leftmost section of the data, which is heavily affected by the raw values approaching zero.

      We are no longer presenting ratio plots in the revised manuscript, so our conclusion that mEPSC amplitude data do not always correspond to GluA2 receptor data is no longer based on the difference in behavior of TTX/CON ratio values, but only on the difference in direction of the TTX effect in one out of three cultures. We agree with the reviewer that the ratio plots are much more sensitive to differences between control and treated values than the rank order plot, and we feel these differences are important; for example, there is still a homeostatic increase in the Rab3A-/- cultures, and the effect is still divergent rather than uniform. But the comparison of ratio data will be presented elsewhere.

      (13) Figure 7A: TTX CDF was shifted to smaller mEPSC amplitude values in Rab3A-/- cultures. How can this be explained?

      While this result is most obvious in CDF plots, we still observe a trend towards smaller mEPSC amplitudes after TTX treatment in two of three individual cultures prepared from Rab3A-/- mice when comparing means (new Fig. 7, Table 1) which did not reach statistical significance for the pooled data (new Fig. 5, new Table 1). There was not any evidence of this decrease in the larger data set (new Fig. 1) nor for Rab3A-/- neurons on Rab3A+/+ glia (new Fig. 8). Given that this effect is not consistent, we did not comment on it in the revised manuscript. It may be that there is a non-Rab3A-dependent mechanism that results in a decrease in mEPSC amplitude after activity blockade, which normally pulls down the magnitude of the activity-dependent increase typically observed. But studying this second component would be difficult given its magnitude and inconsistent presentation.

      Reviewer #1 (Recommendations For the Authors):

      (1) Abstract, last sentence: The conclusion of the present manuscript should be primarily based on the results presented. At present, it is mainly based on a previous publication by the authors.

      We have revised the last sentence to reflect actual findings of the current study (Abstract, lines 47 to 49).

      (2) Line 55: “neurodevelopmental”

      This phrase has been removed.

      (3) Line 56: “AMPAergic” should be replaced by AMPAR-mediated

      This sentence was removed when all references to “scaling” were removed; no other instances of “AMPAergic” are present.

      (4) Figure 9: The use of BioRender should be disclosed in the Figure Legend.

      We used BioRender in new Figures 3, 7 and 8, and now acknowledge BioRender in those figure legends.

      (5) Figure legends and results: The number of cultures should be indicated for each comparison.

      Number of cultures has been added to the figure legends.

      (6) Line 289: A comparison of p-values between conditions does not allow any meaningful conclusions.

      Agreed, therefore we have removed CDFs and the KS test comparison p values. All comparisons in the revised manuscript are for cell means.

      (7) Line 623ff: The argument referring to NMJ data is weak, given that different types of receptors are involved.

      We still think it is valid to point out that Rab3A is required for the increase in mEPC at the NMJ but that ACh receptors do not increase (Discussion, lines 522 to 525). We are not saying that postsynaptic receptors do not contribute in cortical cultures, only that there could be another Rab3A-dependent mechanism that also affects mEPSC amplitude.

      (8) Plotting data points outside of the ranges should be avoided (e.g., Fig. 2Giii, 7F).

      These two figures are no longer present in the revised manuscript. In revising figures, we made sure no other plots have data points outside of the ranges.

      (9) The rationale for investigating Rab3AEbd/Ebd remains elusive and should be described.

      A rationale for investigating Rab3AEbd/Ebd is that if the results are similar to the KO, it strengthens the evidence for Rab3A being involved in homeostatic synaptic plasticity. In addition, since its phenotype of early awakening was stronger than that demonstrated in Rab3A KO mice (Kapfhamer et al., 2002), it was possible we would see a more robust effect. These points have been added to the Results, lines 118 to 126.

      (10) Figures 3 and 4, as well as Figure 5 and 6 could be merged.

      In the revised version, Figure 3 has been eliminated since its main point was a difference in scaling behavior. Figure 4 has been expanded to include a model of how NASPM could reduce frequency (new Fig. 3.) Images of the pyramidal cell body have been added to Figure 5 (new Fig. 4), and Figure 6 has been completely revised and now includes pooled data for both Rab3A+/+ and Rab3A-/- cultures, for mEPSC amplitude, GluA2 receptor cluster size, intensity and integral.

      (11) Figure 5: The legend refers to MAP2, but this is not indicated in the figure.

      MAP2 has now been added to the labels for each image and described in the figure legend (new Fig. 4).

      Reviewer #2:

      Technical concerns:

      (1) The culture condition is questionable. The authors saw no NMDAR current present during spontaneous recordings, which is worrisome since NMDARs should be active in cultures with normal network activity (Watt et al., 2000; Sutton et al., 2006). It is important to ensure there is enough spiking activity before doing any activity manipulation. Similarly it is also unknown whether spiking activity is normal in Rab3AKO/Ebd neurons.

      In the studies cited by the reviewer, NMDA currents were detected under experimental conditions in which magnesium was removed. In our recordings, we have normal magnesium (1.3 mM) and also TTX, which prevents the necessary depolarization to allow inward current through NMDA receptors. This point has been added to our Methods, lines 674 to 677. We acknowledge we do not know the level of spiking in cultures prepared from Rab3A+/+, Rab3A-/- or Rab3AEbd/Ebd mice. Given the similar mEPSC amplitude for untreated cultures from WT and KO studies, we think it unlikely that activity was low in the latter, but it remains a possibility for untreated cultures from Rab3AEbd/Ebd mice, where mEPSC amplitude was increased. These points are added to the Methods, lines 615 to 622.

      (2) Selection of mEPSC events is not conducted in an unbiased manner. Manually selecting events is insufficient for cumulative distribution analysis, where small biases could skew the entire distribution. Since the authors claim their ratio plot is a better method to detect the uniformity of scaling than the well-established rank-order plot, it is important to use an unbiased population to substantiate this claim.

      We no longer include any cumulative distributions or ratio plot analysis in the revised version. We have added the following text to Methods, lines 703 to 720:

      “MiniAnalysis selects many false positives with the automated feature when a small threshold amplitude value is employed, due to random fluctuations in noise, so manual re-evaluation of the automated process is necessary to eliminate false positives. If the threshold value is set high, there are few false positives but small amplitude events that visually are clearly mEPSCs are missed, and manual re-evaluation is necessary to add back false negatives or the population ends up biased towards large mEPSC amplitudes. As soon as there is a manual step, bias is introduced. Interestingly, a manual re-evaluation step was applied in a recent study that describes their process as ‘unbiased’ (Wu et al., 2020). In sum, we do not believe it is currently possible to perform a completely unbiased detection process. A fully manual detection process means that the same criterion (“does this look like an mEPSC?”) is applied to all events, not just the false positives, or the false negatives, which prevents the bias from being primarily at one end or the other of the range of mEPSC amplitudes. It is important to note that when performing the MiniAnalysis process, the researcher did not know whether a record was from an untreated cell or a TTX-treated cell.”

      (3) Immunohistochemistry data analysis is problematic. The authors only labeled dendrites without doing cell-fills to look at morphology, so it is questionable how they differentiate branches from pyramidal neurons and interneurons. Since glutamatergic synapse on these two types of neuron scale in the opposite directions, it is crucial to show that only pyramidal neurons are included for analysis.

      We identified neurons with a pyramidal shape and a prominent primary dendrite at 60x magnification without the zoom feature. This should have been made clear in the description of imaging. We have added an image of the two selected cells to our figure of dendrites (old Fig. 5, new Fig. 4), and described this process in the Methods, lines 736 to 739, and Results, lines 246 to 253. Given the morphology of the neurons selected it is highly unlikely that the dendrites we analyzed came from interneurons.

      Conceptual Concerns

      The only novel finding here is the implicated role for Rab3A in synaptic scaling, but insights into mechanisms behind this observation are lacking. The authors claim that Rab3A likely regulates scaling from the presynaptic side, yet there is no direct evidence from data presented. In its current form, this study’s contribution to the field is very limited.

      We have demonstrated that loss of Rab3A and expression of a Rab3A point mutant disrupt homeostatic plasticity of mEPSC amplitudes, and that in the absence of Rab3A, the increase in GluA2 receptors at synaptic sites is abolished. Further, we show that this effect cannot be through release of a factor, like TNFα, from astrocytes. In the new version, we add the finding that VGLUT1 is not increased after activity blockade, ruling out this presynaptic factor as a contributor to homeostatic increases in mEPSC amplitude. We show for the first time by examining mEPSC amplitudes and GluA2 receptors in the same cultures that the increases in GluA2 receptors are not as consistent as the increases in mEPSC amplitude, suggesting the possibility of another contributor to homeostatic increases in mEPSC amplitude. We first proposed this idea in our previous study of Rab3A-dependent homeostatic increases in mEPC amplitudes at the mouse neuromuscular junction. In sum, we dispute that there is only one novel finding and that we have no insights into mechanism. We acknowledge that we have no direct evidence for regulation from the presynaptic side, and have removed this claim from the revised manuscript. We have retained the Discussion of potential mechanisms affecting the presynaptic quantum and evidence that Rab3A is implicated in these mechanisms (vesicle size, fusion pore kinetics; Discussion, lines 537 to 563). One way to directly show that the amount of transmitter released for an mEPSC has been modified after activity blockade is to demonstrate that a fast off-rate antagonist has become less effective at inhibiting mEPSCs (because the increased glutamate released outcompetes it; see (Liu et al., 1999) and (Wilson et al., 2005) for example experiments).
This set of experiments is underway but will take more time than originally expected, because we are finding surprisingly large decreases in frequency, possibly the result of mEPSCs with very low glutamate concentration that are completely inhibited by the dose used. Once mEPSCs are lost, it is difficult to compare the mEPSC amplitude before and after application of the antagonist. Therefore we intend to include this experiment in a future report, once we determine the reason for the frequency reduction or find a dose where this does not occur.

      (1) Their major argument for this is that homeostatic effects on mEPSC amplitudes and GluA2 cluster sizes do not match. This is inconsistent with reports from multiple labs showing that upscaling of mEPSC amplitude and GluA2 accumulation occur side by side during scaling (Ibata et al., 2008; Pozo et al., 2012; Tan et al., 2015; Silva et al., 2019). Further, because the acquisition and quantification methods for mEPSC recordings and immunohistochemistry imaging are entirely different (each with its own limitations in signal detection), it is not convincing that the lack of proportional changes must signify a presynaptic component.

      Within the analyses in the revised manuscript, which are now based only on comparison of cell/dendrite means, we find a very good match in the magnitude of increase for the pooled data of mEPSC amplitudes and GluA2 receptor cluster sizes (+19.7% and +20.0% respectively; new Table 1). However, when looking at individual cultures, we had one of three WT cultures in which mEPSC amplitude increased 17.2% but GluA2 cluster size decreased 9.5%. This result suggests that while activity blockade does lead to an increase in GluA2 receptors, the effect is more variable than that for mEPSC amplitude. We went back to published studies to see if this has been previously observed, but found that it was difficult to compare because the sample sizes were different for the two characteristics (see Author response table 1). We included these particular 5 studies because they use the same treatment (TTX), examine receptors using imaging of identified synaptic sites, and record mEPSCs in their cultures (although the authors do not indicate that imaging and recordings are done simultaneously on the same cultures). Only one of the studies listed by the Reviewer is in our group (Ibata et al., 2008). The study by (Tan et al., 2015) uses western blots to measure receptors; the study by (Silva et al., 2019) blocks activity using a combination of AMPA and NMDA receptor blockers; the study by (Pozo et al., 2012) correlates mEPSC amplitude changes with imaging, not in response to activity blockade but in response to changing the expression of GluA2. While it may seem like splitting hairs to reject studies that use other treatment protocols, there is ample evidence that the mechanisms of homeostatic plasticity depend on how activity was altered; see the following studies for several examples (Sutton et al., 2006; Soden and Chen, 2010; Fong et al., 2015). A discussion of the 5 articles we selected is in the revised manuscript, Discussion, lines 456 to 474.
In sum, we provide evidence that activity blockade is associated with an overall increase in GluA2 receptors; what we propose is that this increase, being more variable, does not fully explain the increase in mEPSC amplitude. However, we acknowledge that the disparity could be explained by the differences in limitations of the two methods (Discussion, lines 469-472).

      (2) The authors also speculate in the discussion that presynaptic Rab3A could be interacting with retrograde BDNF signaling to regulate postsynaptic AMPARs. Without data showing Rab3A-dependent presynaptic changes after TTX treatment, this argument is not compelling. In this retrograde pathway, BDNF is synthesized in and released from dendrites (Jakawich et al., 2010b; Thapliyal et al., 2022), and it is entirely possible for postsynaptic Rab3A to interfere with this process cell-autonomously.

      We have added the information that Rab3A could control BDNF from the postsynaptic cell and included the two references provided by the reviewer, Discussion, lines 517 to 518. We have added new evidence, recently published, that the Rab3 family has been shown to regulate targeting of EGF receptors to rafts (among other plasma membrane molecules), with Rab3A itself clearly present in nonneuronal cells (Diaz-Rohrer et al., 2023) (added to Discussion, lines 509 to 515).

      (3) The authors propose that a change in AMPAR subunit composition from GluA2-containing ones to GluA1 homomers may account for the distinct changes in mEPSC amplitudes and GluA2 clusters. However, their data from the NASPM wash-in experiments clearly show that the GluA1 homomer contributions have not changed before and after TTX treatment.

      We have revised this section in the Discussion, lines 534 to 536, to clarify that any change due to GluA1 homomers should have been detectable by a greater ability of NASPM to reverse the TTX-induced increase.

      Reviewer #2 (Recommendations for the Authors):

      For authors to have more convincing arguments in general, they will need to clarify/improve certain details in their data collection by addressing the above technical concerns. Additionally, the authors should design experiments to test whether Rab3A regulates scaling from pre- or post-synaptic site. For example, they could sparsely knock out Rab3A in WT neurons to test the postsynaptic possibility. On the other hand, their argument for a presynaptic role would be much more compelling if they could show whether there are clear functional changes such as in vesicle sizes and release probability in the presynaptic terminal of Rab3AKO neurons.

      An important next step is to identify whether Rab3A is acting pre- or post-synaptically (Discussion, lines 572 to 573), but these experiments will be undertaken in the future. It would not add much to simply show vesicle size is altered in the KO (and we do not necessarily expect this since mEPSC amplitude is normal in the KO). It will be very difficult to establish that vesicle size is changing with activity blockade and that this change is prevented in the Rab3A KO, because we are looking for a ~25% increase in vesicle volume, which would correspond to a ~7.5% increase in diameter. Finally, we do not believe demonstrating changes in release probability tell us anything about a presynaptic role for Rab3A in regulating the size of the presynaptic quantum.
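      The volume-to-diameter conversion quoted above can be checked with a short sketch. It assumes vesicles are spheres, so that volume scales with the cube of the diameter; the function names are hypothetical.

```python
def diameter_change(volume_change):
    """Fractional diameter change of a sphere for a fractional volume change."""
    return (1.0 + volume_change) ** (1.0 / 3.0) - 1.0


def volume_change(diameter_change_fraction):
    """Fractional volume change of a sphere for a fractional diameter change."""
    return (1.0 + diameter_change_fraction) ** 3 - 1.0


# A 25% larger volume corresponds to a diameter increase of roughly 7.7%.
print(f"{diameter_change(0.25):.3f}")
# Conversely, a 7.5% larger diameter gives a volume increase of roughly 24%.
print(f"{volume_change(0.075):.3f}")
```

      Under the spherical assumption the two figures in the text agree to within rounding, which illustrates how small a diameter shift would need to be resolved.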

      Reviewer #3 (Public Review)

      Weaknesses: However, the rather strong conclusions on the dissociation of AMPAR trafficking and synaptic response are made from somewhat weaker data. The key issue is the GluA2 immunostaining in comparison with the mEPSC recordings. Their imaging method involves only assessing puncta clearly associated with a MAP2 labeled dendrite. This is a small subset of synapses, judging from the sample micrographs (Fig. 5). To my knowledge, this is a new and unvalidated approach that could represent a particular subset of synapses not representative of the synapses contributing to the mEPSC change (they are also sampling different neurons for the two measurements; an additional unknown detail is how far from the cell body were the analyzed dendrites for immunostaining.) While the authors acknowledge that a sampling issue could explain the data, they still use this data to draw strong conclusions about the lack of AMPAR trafficking contribution to the mEPSC amplitude change. This apparent difference may be a methodological issue rather than a biological one, and at this point it is impossible to differentiate these. It will unfortunately be difficult to validate their approach. Perhaps if they were to drive NMDA-dependent LTD or chemLTP, and show alignment of the imaging and ephys, that would help. More helpful would be recordings and imaging from the same neurons but this is challenging. Sampling from identified synapses would of course be ideal, perhaps from 2P uncaging combined with SEP-labeled AMPARs, but this is more challenging still. But without data to validate the method, it seems unwarranted to make such strong conclusions such as that AMPAR trafficking does not underlie the increase in mEPSC amplitude, given the previous data supporting such a model.

      In the new version, we soften our conclusion regarding the mismatch between GluA2 receptor levels and mEPSC amplitudes, now only stating that receptors may not be the sole contributor to the TTX effect on mEPSC amplitude (Discussion, lines 472 to 474). With our analysis in the new version focusing on comparisons of cell means, the GluA2 receptor cluster size and the mEPSC amplitude data match well in magnitude for the data pooled across the 3 matched cultures (20.0% and 19.7%, respectively, see new Table 1). However, in one of the three cultures the direction of change for GluA2 receptors is opposite that of mEPSC amplitudes (Table 1, Culture #3, -9.5% vs +17.2%, respectively).

      It is unlikely that the lack of matching of homeostatic plasticity in one culture, but very good matching in two other cultures, can be explained by an unvalidated focus on puncta associated with MAP2 positive dendrites. We chose to restrict analysis of synaptic GluA2 receptors to the primary dendrite in order to reduce variability, reasoning that we are always measuring synapses for an excitatory pyramidal neuron, synapses that are relatively close to the cell body, on the consistently identifiable primary dendrite. We measured how far this was for the two cells depicted in old Figure 5 (new Fig. 4). Because we always used the 5X zoom window which is a set length, and positioned it within ~10 microns of the cell body, these cells give a ballpark estimate for the usual distances. For the untreated cell, the average distance from the cell body was 38.5 ± 2.8 µm; for the TTX-treated cell, it was 42.4 ± 3.2 µm (p = 0.35, Kruskal-Wallis test). We have added these values to the Results, lines 270 to 274.

      We did not mean to propose that AMPA receptor levels do not contribute at all to mEPSC amplitude, and we acknowledge there are clear cases where the two characteristics change in parallel (for example, in the study cited by Reviewer #2, (Pozo et al., 2012), increases in GluA2 receptors due to exogenous expression are closely matched by increases in mEPSC amplitudes.) What our matched culture experiments demonstrate is that in the case of TTX treatment, both GluA2 receptors and mEPSC amplitudes increase on average, but sometimes mEPSC amplitudes can increase in the absence of an increase in GluA2 receptors (Culture #3, Rab3A+/+ cultures), and sometimes mEPSC amplitudes do not increase even though GluA2 receptor levels do increase (Culture #3, Rab3A-/- cultures). Therefore, it would not add anything to our argument to examine receptors and mEPSCs in NMDA-dependent LTP, a different plasticity paradigm in which changes in receptors and mEPSCs may more closely align. It has been demonstrated that mEPSCs of widely varying amplitude can be recorded from a single synaptic site (Liu and Tsien, 1995), so we would need to measure a large sample of individual synapse recordings to detect a modest shift in average values due to activity blockade. In addition, it would be essential to express fluorescent AMPA receptors in order to correlate receptor levels in the same cells we record from (or at the same synapses). And yet, even after these heroics, one is still left with the issue that the two methods, electrophysiology and fluorescent imaging, have distinct limitations and sources of variability that may obscure any true quantitative correlation.

      Other questions arise from the NASPM experiments, used to justify looking at GluA2 (and not GluA1) in the immunostaining. First, there is a frequency effect that is quite unclear in origin. One would expect NASPM to merely block some fraction of the post-synaptic current, and not affect pre-synaptic release or block whole synapses. It is also unclear why the authors argue this proves that NASPM was at an effective concentration (lines 399-400). Further, the amplitude data show a strong trend towards smaller amplitude. The p value for both control and TTX neurons was 0.08 – it is very difficult to argue that there is no effect. And the decrease is larger in the TTX neurons. Considering the strong claims for a presynaptic locus and the use of this data to justify only looking at GluA2 by immunostaining, these data do not offer much support of the conclusions. Between the sampling issues and perhaps looking at the wrong GluA subunit, it seems premature to argue that trafficking is not a contributor to the mEPSC amplitude change, especially given the substantial support for that hypothesis. Further, even if trafficking is not the major contributor, there could be shifts in conductance (perhaps due to regulation of auxiliary subunits) that does not necessitate a pre-synaptic locus. While the authors are free to hypothesize such a mechanism, it would be prudent to acknowledge other options and explanations.

      We have created a model cartoon to explain how NASPM could reduce mEPSC frequency (new Fig. 3D). mEPSCs that arise from a synaptic site that has only Ca2+-permeable AMPA receptors will be completely blocked by NASPM, if the NASPM concentration is maximal. The reason we conclude that we have sufficient NASPM reaching the cells is that the frequency is decreased, as expected if there are synaptic sites with only Ca2+-permeable AMPA receptors. We previously were not clear that there is an effect of NASPM on mEPSC amplitude, although it did not reach statistical significance (new Fig. 3B). Where there is no effect is on the TTX-induced increase in mEPSC amplitude, which remains after the acute NASPM application (new Fig. 3A). We have revised the description of these findings in Results, lines 220 to 241. In reviewing the literature further, we could find no previous studies demonstrating an increase in conductance in GluA2 or Ca2+-impermeable receptors, only in GluA1 homomers. In other words, any conductance change would have been due to a change in GluA1 homomers, and should have been visible as a disruption of the homeostatic plasticity by NASPM application. We have added text to Results, lines 211 to 217; 236-241; Discussion, lines 420 to 422; 526-536 and Methods, lines 685 to 695 regarding this point.

      The frequency data are missing from the paper, with the exception of the NASPM dataset. The mEPSC frequencies should be reported for all experiments, particularly given that Rab3A is generally viewed as a pre-synaptic protein regulating release. Also, in the NASPM experiments, the average frequency is much higher in the TTX treated cultures. Is this statistically above control values?

      This comment is addressed by the major change #3, above.

      Unaddressed issues that would greatly increase the impact of the paper:

      (1) Is Rab3A acting pre-synaptically, post-synaptically or both? The authors provide good evidence that Rab3A is acting within neurons and not astrocytes. But where is it acting (pre or post) would aid substantially in understanding its role (and particularly the hypothesized and somewhat novel idea that the amount of glutamate released per vesicle is altered in HSP). They could use sparse knockdown of Rab3A, or simply mix cultures from KO and WT mice (with appropriate tags/labels). The general view in the field has been that HSP is regulated post-synaptically via regulation of AMPAR trafficking, and considerable evidence supports this view. The more support for their suggestion of a pre-synaptic site of control, the better.

      This is similar to the request of Reviewer #2, Recommendations to the Authors. An important next step is to identify whether Rab3A is working pre- or postsynaptically. However, it is possible that it is acting pre-synaptically to anterogradely regulate trafficking of AMPAR, as we have depicted in our model, new Fig. 9. To demonstrate that the presynaptic quantum is being altered, we would need to show that vesicle size is increased, or the amount of transmitter being released during an mEPSC is increased after activity blockade. To that end, we are currently performing experiments using a fast off-rate antagonist. As described above in response to Reviewer #2’s Conceptual Concerns, we find dramatic decreases in frequency not explained by the 30-60% inhibition observed for the largest amplitude mEPSCs, which suggests the possibility that small mEPSCs are more sensitive than large mEPSCs and therefore may have less transmitter. Due to these complexities and the delay while we test other antagonists to see if the effect is specific to fast off-rate antagonists, we are not including these results here.

      (2) Rab3A is also found at inhibitory synapses. It would be very informative to know if HSP at inhibitory synapses is similarly affected. This is particularly relevant as at inhibitory synapses, one expects a removal of GABARs and/or a decrease of GABA-packaging in vesicles (i.e., the opposite of whatever is happening at excitatory synapses). If both processes are regulated by Rab3A, this might suggest a role for this protein more upstream in the signaling; an effect only at excitatory synapses would argue for a more specific role just at these synapses.

      It will be important to determine if homeostatic synaptic plasticity at inhibitory synapses on excitatory neurons is sensitive to Rab3A deletion, especially in light of the fact that unlike many of the other molecules implicated in homeostatic increases in mEPSCs, Rab3A is not a molecule known to be selective for glutamate receptor trafficking (in contrast to Arc/Arg3.1 or GRIP1, for example). Such a study would warrant its own publication.

      Reviewer #3 (Recommendations for the Authors):

      There are a number of minor points or suggestions for the authors:

      Is RIM1 part of this pathway (or expected to be)? Some discussion of this would be nice.

      RIM, Rab3-interacting molecule, has been implicated at the Drosophila neuromuscular junction in a presynaptic form of homeostatic synaptic plasticity in which evoked release is increased after block of postsynaptic receptors (Muller et al., 2012), a plasticity that also requires Rab3-GAP (Muller et al., 2011). To our knowledge there is no evidence that RIM is involved in the homeostatic plasticity of mEPSC amplitude after activity blockade by TTX. The Rim1a KO does not have a change in mEPSC amplitude relative to WT (Calakos et al., 2004), but that is not unexpected given the normal mEPSC amplitude in neurons from cultures prepared from Rab3A-/- mice in the current study. It would be interesting to look at homeostatic plasticity in cortical cultures prepared from Rim1a or other RIM deletion mice, but we have not added these points to the revised manuscript since there are a number of directions one could go in attempting to define the molecular pathway and we feel it is more important to discuss the potential location of action and physiological mechanisms.

      Is the Earlybird mutation a GOF? More information about this mutation would help.

      We have added a description of how the Earlybird mutation was identified, in a screen for rest:activity mutants (Results, lines 118 to 123). Rab3A Earlybird mice have a shortened circadian period, shifting their wake cycle earlier and earlier. When Rab3A deletion mice were tested in the same activity raster plot measurements, the shift was smaller than that for the Earlybird mutant, suggesting the possibility that it is a dominant negative mutation.

      The high K used in the NASPM experiments seems a bit unusual. Have the authors done high K/no drug controls to see if this affects the synapses in any way?

      We used the high K based on previous studies that indicated the blocking effect of the Ca2+-permeable receptor blockers was use dependent (Herlitze et al., 1993; Iino et al., 1996; Koike et al., 1997). We reasoned that a modest depolarization would increase the frequency of AMPA receptor mEPSCs and allow access of the NASPM. We have added this point to the Methods, lines 695 to 708.

      The NASPM experiments do not show that GluA1 does not contribute (line 401), only that GluA1 homomers are not contributing (much – see above). GluA1/A2 heteromers are quite likely involved. Also, the SEM is missing from the WT pre/post NASPM data.

      Imaging of GluA2-positive sites will not distinguish between GluA2 homomers and GluA2-GluA1 heteromers, so we have added this clarification to Results, lines 242 to 246. We have remade the NASPM pre-post line plots so that the mean values and error bars are more visible (new Fig. 3B, C).

      It seems odd to speculate based on non-significant findings (line 650-1), with lower significance (p = 0.11) than findings being dismissed in the paper (NASPM on mEPSC amplitude; p = 0.08).

      We did not mean to dismiss the effect of NASPM on mEPSC amplitude (new Fig. 3B), rather, we dismiss the effect of NASPM on the homeostatic increase in mEPSC amplitude caused by TTX treatment (new Fig. 3A). We have emphasized this distinction in Results, lines 223 to 225, and Discussion, lines 420 to 422, as well as adding that the stronger effect of NASPM on frequency after TTX treatment suggests an activity-dependent increase in the number of synapses expressing only Ca2+ permeable homomers (Results, lines 236 to 241; Discussion, lines 431 to 435).

      Fig. 4 could be labeled better (to make it clear that B is amplitude and C is freq from the same cells).

      Fig. 4 has been revised—now the amplitude and frequency plots from the same condition (new Fig. 3, B, C; CON or TTX) are in a vertical line and the figure legend states that the frequency data are from the same cells as in Fig. 3A.

      The raw amplitude data seems a bit hidden in the inset panels – I would suggest these data are at least as important as the cumulative distributions in the main panel. Maybe re-organizing the figures would help.

      We have removed all cumulative distributions, rank order plots, and ratio plots. The box plots are now full size in new Figures 1, 2, 5, 6, 7 and 8.

      I’m not sure I would argue in the paper that 12 cells a day is a limiting issue for experiments. It doesn’t add anything and doesn’t seem like that high a barrier. It is fine to just say it is difficult and therefore there is a limited amount of data meeting the criteria.

      We have removed the comment regarding difficulty.

      Calakos N, Schoch S, Sudhof TC, Malenka RC (2004) Multiple roles for the active zone protein RIM1alpha in late stages of neurotransmitter release. Neuron 42:889-896.

      De Gois S, Schafer MK, Defamie N, Chen C, Ricci A, Weihe E, Varoqui H, Erickson JD (2005) Homeostatic scaling of vesicular glutamate and GABA transporter expression in rat neocortical circuits. J Neurosci 25:7121-7133.

      Diaz-Rohrer B, Castello-Serrano I, Chan SH, Wang HY, Shurer CR, Levental KR, Levental I (2023) Rab3 mediates a pathway for endocytic sorting and plasma membrane recycling of ordered microdomains. Proc Natl Acad Sci U S A 120:e2207461120.

      Dubes S, Soula A, Benquet S, Tessier B, Poujol C, Favereaux A, Thoumine O, Letellier M (2022) miR-124-dependent tagging of synapses by synaptopodin enables input-specific homeostatic plasticity. EMBO J 41:e109012.

      Fong MF, Newman JP, Potter SM, Wenner P (2015) Upward synaptic scaling is dependent on neurotransmission rather than spiking. Nat Commun 6:6339.

      Herlitze S, Raditsch M, Ruppersberg JP, Jahn W, Monyer H, Schoepfer R, Witzemann V (1993) Argiotoxin detects molecular differences in AMPA receptor channels. Neuron 10:1131-1140.

      Hou Q, Zhang D, Jarzylo L, Huganir RL, Man HY (2008) Homeostatic regulation of AMPA receptor expression at single hippocampal synapses. Proc Natl Acad Sci U S A 105:775-780.

      Ibata K, Sun Q, Turrigiano GG (2008) Rapid synaptic scaling induced by changes in postsynaptic firing. Neuron 57:819-826.

      Iino M, Koike M, Isa T, Ozawa S (1996) Voltage-dependent blockage of Ca(2+)-permeable AMPA receptors by joro spider toxin in cultured rat hippocampal neurones. J Physiol 496 (Pt 2):431-437.

      Jakawich SK, Neely RM, Djakovic SN, Patrick GN, Sutton MA (2010a) An essential postsynaptic role for the ubiquitin proteasome system in slow homeostatic synaptic plasticity in cultured hippocampal neurons. Neuroscience 171:1016-1031.

      Jakawich SK, Nasser HB, Strong MJ, McCartney AJ, Perez AS, Rakesh N, Carruthers CJ, Sutton MA (2010b) Local presynaptic activity gates homeostatic changes in presynaptic function driven by dendritic BDNF synthesis. Neuron 68:1143-1158.

      Kapfhamer D, Valladares O, Sun Y, Nolan PM, Rux JJ, Arnold SE, Veasey SC, Bucan M (2002) Mutations in Rab3a alter circadian period and homeostatic response to sleep loss in the mouse. Nat Genet 32:290-295.

      Koike M, Iino M, Ozawa S (1997) Blocking effect of 1-naphthyl acetyl spermine on Ca(2+)-permeable AMPA receptors in cultured rat hippocampal neurons. Neurosci Res 29:27-36.

      Liu G, Tsien RW (1995) Properties of synaptic transmission at single hippocampal synaptic boutons. Nature 375:404-408.

      Liu G, Choi S, Tsien RW (1999) Variability of neurotransmitter concentration and nonsaturation of postsynaptic AMPA receptors at synapses in hippocampal cultures and slices. Neuron 22:395-409.

      Muller M, Pym EC, Tong A, Davis GW (2011) Rab3-GAP controls the progression of synaptic homeostasis at a late stage of vesicle release. Neuron 69:749-762.

      Muller M, Liu KS, Sigrist SJ, Davis GW (2012) RIM controls homeostatic plasticity through modulation of the readily-releasable vesicle pool. J Neurosci 32:16574-16585.

      Pozo K, Cingolani LA, Bassani S, Laurent F, Passafaro M, Goda Y (2012) beta3 integrin interacts directly with GluA2 AMPA receptor subunit and regulates AMPA receptor expression in hippocampal neurons. Proc Natl Acad Sci U S A 109:1323-1328.

      Silva MM, Rodrigues B, Fernandes J, Santos SD, Carreto L, Santos MAS, Pinheiro P, Carvalho AL (2019) MicroRNA-186-5p controls GluA2 surface expression and synaptic scaling in hippocampal neurons. Proc Natl Acad Sci U S A 116:5727-5736.

      Soden ME, Chen L (2010) Fragile X protein FMRP is required for homeostatic plasticity and regulation of synaptic strength by retinoic acid. J Neurosci 30:16910-16921.

      Sun HY, Bartley AF, Dobrunz LE (2009) Calcium-permeable presynaptic kainate receptors involved in excitatory short-term facilitation onto somatostatin interneurons during natural stimulus patterns. J Neurophysiol 101:1043-1055.

      Sutton MA, Ito HT, Cressy P, Kempf C, Woo JC, Schuman EM (2006) Miniature neurotransmission stabilizes synaptic function via tonic suppression of local dendritic protein synthesis. Cell 125:785-799.

      Tan HL, Queenan BN, Huganir RL (2015) GRIP1 is required for homeostatic regulation of AMPAR trafficking. Proc Natl Acad Sci U S A 112:10026-10031.

      Thapliyal S, Arendt KL, Lau AG, Chen L (2022) Retinoic acid-gated BDNF synthesis in neuronal dendrites drives presynaptic homeostatic plasticity. Elife 11.

      Wilson NR, Kang J, Hueske EV, Leung T, Varoqui H, Murnick JG, Erickson JD, Liu G (2005) Presynaptic regulation of quantal size by the vesicular glutamate transporter VGLUT1. J Neurosci 25:6221-6234.

      Wu YK, Hengen KB, Turrigiano GG, Gjorgjieva J (2020) Homeostatic mechanisms regulate distinct aspects of cortical circuit dynamics. Proc Natl Acad Sci U S A 117:24514-24525.

      Xu X, Pozzo-Miller L (2017) EEA1 restores homeostatic synaptic plasticity in hippocampal neurons from Rett syndrome mice. J Physiol 595:5699-5712.

    1. eLife Assessment

      This study provides important findings that during credit assignment, the lateral orbitofrontal cortex (lOFC) and hippocampus (HC) encode causal choice representations, while the frontopolar cortex (FPl) mediates HC-lOFC interactions when the causality needs to be maintained over longer distractions. While this research offers compelling evidence and employs sophisticated multivariate pattern analysis, there are some concerns regarding a) task design which may have oversimplified real-world credit assignment complexities, and b) the interpretation of results. This work will be of interest to cognitive and computational neuroscientists who work on value-based decision-making and fronto-hippocampal circuits.

    2. Reviewer #1 (Public review):

      Summary:

      The authors conducted a study on one of the fundamental research topics in neuroscience: neural mechanisms of credit assignment. Building on the original studies of Walton and his colleagues and subsequent studies on the same topic, the authors extended the research into the delayed credit assignment problem with clever task design, which compared the non-delayed (direct) and delayed (indirect) credit assignment processes. Their primary goal was to elucidate the neural basis of these processes in humans, advancing our understanding beyond previous studies.

      Strengths:

      (1) Innovative task design distinguishing between direct and indirect credit assignment.

      (2) Use of sophisticated multivariate pattern analysis to identify neural correlates of pending representations.

      (3) Well-executed study with clear presentation of results.

      (4) Extension of previous research to human subjects, providing valuable comparative insights.

      Considerations for Future Research:

      (1) The task design, while clear and effective, might be further developed to capture more real-world complexity in credit assignment.

      (2) There's potential for deeper exploration of the role of task structure understanding in credit assignment processes.

      (3) The interpretation of lateral orbitofrontal cortex (lOFC) involvement could be expanded to consider its role in both credit assignment and task structure representation.

      Achievement of Aims and Support of Conclusions:

      The authors successfully achieved their aim of investigating direct and indirect credit assignment processes in humans. Their results provide valuable insights into the neural representations involved in these processes. The study's conclusions are generally well-supported by the data, particularly in identifying neural correlates of pending representations crucial for delayed credit assignment.

      Impact on the Field and Utility of Methods:

      This study makes a significant contribution to the field of credit assignment research by bridging animal and human studies. The methods, particularly the multivariate pattern analysis approach, provide a robust template for future investigations in this area. The data generated offers valuable insights for researchers comparing human and animal models of credit assignment, as well as those studying the neural basis of decision-making and learning.

      The study's focus on the lOFC and its role in credit assignment adds to our understanding of this brain region's function.

      Additional Context and Future Directions:

      (1) Temporal ambiguity in credit assignment: While the current design provides clear task conditions, future studies could explore more ambiguous scenarios to further reflect real-world complexity.

      (2) Role of task structure understanding: The difference in task comprehension between human subjects in this study and animal subjects in previous studies offers an interesting point of comparison.

      (3) The authors used a sophisticated method of multivariate pattern analysis to find the neural correlate of the pending representation of the previous choice, which will be used for the credit assignment process in the later trials. The authors tend to use expressions that these representations are maintained throughout this intervening period. However, the analysis period is specifically at the feedback period, which is irrelevant to the credit assignment of the immediately preceding choice. This task period can interfere with the ongoing credit assignment process. Thus, rather than the passive process of maintaining the information of the previous choice, the activity of this specific period can mean the active process of protecting the information from interfering and irrelevant information. It would be great if the authors could comment on this important interpretational issue.

      (4) Broader neural involvement: While the focus on specific regions of interest (ROIs) provided clear results, future studies could benefit from a whole-brain analysis approach to provide a more comprehensive understanding of the neural networks involved in credit assignment.

    3. Reviewer #2 (Public review):

      Summary:

      The present manuscript addresses a longstanding challenge in neuroscience: how the brain assigns credit for delayed outcomes, especially in real-world learning scenarios where decisions and outcomes are separated by time. The authors focus on the lateral orbitofrontal cortex and hippocampus, key regions involved in contingent learning. By integrating fMRI data and behavioral tasks, the authors examined how neural circuits maintain a causal link between past decisions and delayed outcomes. Their findings offer insights into mechanisms that could have critical implications for understanding human decision-making.

      Strengths:

      (1) The experimental designs were extremely well thought-out. The authors successfully coupled behavioral data and neural measures (through fMRI) to explore the neural mechanisms of contingent learning. This integration adds robustness to the findings and strengthens their relevance.

      (2) The emphasis on the interaction between the lateral orbitofrontal cortex (lOFC) and hippocampus (HC) in this study is very well-targeted. The reported findings regarding their dynamic interactions provide valuable insights into contingent learning in humans.

      (3) The use of an advanced modeling framework and analytical techniques allowed the authors to uncover new mechanistic insights regarding a complex case of the decision-making process. The methods developed will also benefit analyses of future neuroimaging data on a range of decision-making tasks.

      Weaknesses:

      Given the limited temporal resolution of fMRI and that the measured signal is an indirect measure of neural activity, it is unclear to what extent the reported causality reflects the true relationship/interactions between neurons in different regions.

    4. Reviewer #3 (Public review):

      The authors apply multivoxel decoding analyses from fMRI during reward feedback about the cues previously chosen that led to that feedback. They compare two versions of the task - one in which the feedback is provided about the current trial, and one in which the feedback is provided about the previous trial. Reward probability changes slowly over time, so subjects need to identify which cues are leading to reward at a given time. They find evidence for recall of the cue in the lateral orbitofrontal cortex (lOFC) and hippocampus (HC). They also find that in the second condition, where feedback is for the one-back trial, this representation is mediated by the lateral frontal pole (FPl).

      Overall, the analyses are clean and elegant and seem to be complete. I have only a few comments.

      (1) They do find (not surprisingly) that the one-back task is harder. It would be good to ensure that the reason that they had more trouble detecting direct HC & lOFC effects on the harder task was not because the task is harder and thus that there are more learning failures on the harder one-back task. (I suspect their explanation that it is mediated by FPl is likely to be correct. But it would be nice to do some subsampling of the zero-back task [matched to the success rate of the one-back task] to ensure that they still see the direct HC and lOFC there).

      (2) The evidence that they present in the main text (Figure 3) that the HC and lOFC are mediated by FPl is a correlation. I found the evidence presented in Supplemental Figure 7 to be much more convincing. As I understand it, what they are showing in SF7 is that when FPl decodes the cue, then (and only then) HC and lOFC decode the cue. If my understanding is correct, then this is a much cleaner explanation for what is going on than the secondary correlation analysis. If my understanding here is incorrect, then they should provide a better explanation of what is going on so as to not confuse the reader.

      (3) I like the idea of "credit spreading" across trials (Figure 1E). I think that credit spreading in each direction (into the past [lower left] and into the future [upper right]) is not equivalent. This can be seen in Figure 1D, where the two tasks show credit spreading differently. I think a lot more could be studied here. Does credit spreading in each of these directions decode in interesting ways in different places in the brain?
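
      The subsampling control suggested in point (1) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' or the reviewer's procedure: the function name `subsample_to_match_rate`, the per-trial success coding, and the choice to keep every failed trial while randomly dropping successful ones are all hypothetical.

```python
import random


def subsample_to_match_rate(correct, target_rate, seed=0):
    """Return sorted trial indices whose success rate approximates target_rate.

    correct     : list of bools, one per zero-back trial (True = successful trial).
    target_rate : success rate to match (e.g. the rate observed in the one-back task).

    Assumption (hypothetical scheme): all failed trials are kept and successful
    trials are randomly dropped until the rate falls to about target_rate.
    """
    rng = random.Random(seed)
    hits = [i for i, c in enumerate(correct) if c]
    misses = [i for i, c in enumerate(correct) if not c]

    # Nothing to do if the rate is already at or below the target,
    # or if there are no failed trials to dilute the successes with.
    if not misses or target_rate >= len(hits) / len(correct):
        return list(range(len(correct)))

    # Solve h / (h + m) = target_rate for h, the number of successes to keep.
    n_hits = round(target_rate * len(misses) / (1 - target_rate))
    n_hits = min(n_hits, len(hits))

    return sorted(rng.sample(hits, n_hits) + misses)
```

      The decoding analysis would then be rerun on the returned subset of zero-back trials, ideally over many random seeds, to check whether the direct HC and lOFC effects survive at the matched success rate.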

    1. eLife Assessment

      The study from Frank and colleagues reports potentially important cryo-EM observations of mouse glutamatergic synapses isolated from adult mammalian brains. The authors used a combination of mouse genetics to generate PSD95-GFP labeling in vivo, a rapid synaptosome isolation and cryo-protectant strategy, and cryogenic correlated light-electron microscopy (cryoCLEM) to record tomograms of synapses, which together provide convincing support for their conclusions. Controversially, the authors report that forebrain glutamatergic synapses do not contain postsynaptic "densities" (PSD), a defining feature of synapse structure identified in chemically-fixed and resin-embedded brain samples. The work questions a long-standing concept in neurobiology and is primarily of interest to specialists in synaptic structure and function.

    2. Reviewer #1 (Public review):

      The authors survey the ultrastructural organization of glutamatergic synapses by cryo-ET and image processing tools using two complementary experimental approaches. The first approach employs so-called "ultra-fresh" preparations of brain homogenates from a knock-in mouse expressing a GFP-tagged version of PSD-95, allowing Peukes and colleagues to specifically target excitatory glutamatergic synapses. In the second approach, direct in-tissue (using cortical and hippocampal regions) targeting of the glutamatergic synapses employing the same mouse model is presented. In order to ascertain whether the isolation procedure causes any significant changes in the ultrastructural organization (and possibly synaptic macromolecular organization) the authors compare their findings using both of these approaches. The quantitation of the synaptic cleft height reveals an unexpected variability, while the STA analysis of the ionotropic receptors provides insights into their distribution with respect to the synaptic cleft.

      The main novelty of this study lies in the continuous claims by the authors that the sample preservation methods developed here are superior to any others previously used. This leads them as well to systematically downplay or directly ignore a substantial body of previous cryo-ET studies of synaptic structure. Without comparisons with the cryo-ET literature, it is very hard to judge the impact of this work in the field. Furthermore, the data does not show any better preservation in the so-called "ultra-fresh" preparation than in the literature, perhaps to the contrary as synapses with strangely elongated vesicles are often seen. Such synapses have been regularly discarded for further analysis in previous synaptosome studies (e.g. Martinez-Sanchez 2021). Whilst the targeting approach using a fluorescent PSD95 marker is novel and seems sufficiently precise, the authors use a somewhat outdated approach (cryo-sectioning) to generate in-tissue tomograms of poor quality. To what extent such tomograms can be interpreted in molecular terms is highly questionable. The authors also don't discuss the physiological influence of 20% dextran used for high-pressure freezing of these "very native" specimens.

      Lastly, a large part of the paper is devoted to image analysis of the PSD which is not convincing (including a somewhat forced comparison with the fixed and heavy-metal staining room temperature approach). Despite being a technically challenging study, the results fall short of expectations.

    3. Reviewer #2 (Public review):

      Summary:

      The authors set out to visualize the molecular architecture of the adult forebrain glutamatergic synapses in a near-native state. To this end, they use a rapid workflow to extract and plunge-freeze mouse synapses for cryo-electron tomography. In addition, the authors use knock-in mice expressing PSD95-GFP in order to perform correlated light and electron microscopy to clearly identify pre- and postsynaptic membranes. By thorough quantification of tomograms from plunge- and high-pressure frozen samples, the authors show that the previously reported 'post-synaptic density' does not occur at high frequency and is therefore not a defining feature of a glutamatergic synapse.

      Subsequently, the authors are able to reproduce the frequency of post-synaptic density when preparing conventional electron microscopy samples, thus indicating that density prevalence is an artifact of sample preparation. The authors go on to describe the arrangement of cytoskeletal components, membranous compartments, and ionotropic receptor clusters across synapses.

      Demonstrating that the frequency of the post-synaptic density in prior work is likely an artifact and not a defining feature of glutamatergic synapses is significant. The descriptions of distributions and morphologies of proteins and membranes in this work may serve as a basis for future investigation by readers interested in these features.

      Strengths:

      The authors perform a rigorous quantification of the molecular density profiles across synapses to determine the frequency of the post-synaptic density. They prepare samples using two cryogenic electron microscopy sample preparation methods, as well as one set of samples using conventional electron microscopy methods. The authors can reproduce previous reports of the frequency of the post-synaptic density by conventional sample preparation, but not by either of the cryogenic methods, thus strongly supporting their claim.

    4. Reviewer #3 (Public review):

      Summary:

      The authors use cryo-electron tomography to thoroughly investigate the complexity of purified, excitatory synapses. They make several major interesting discoveries: polyhedral vesicles that have not been observed before in neurons; analysis of the intermembrane distance, and a link to potentiation, essentially updating distances reported from plastic-embedded specimens; and the finding that the postsynaptic density does not appear as a dense accumulation of proteins in all vitrified samples (less than half), a feature which served as a hallmark to identify excitatory plastic-embedded synapses.

      Strengths:

      (1) The presented work is thorough: the authors compare purified, endogenously labeled synapses to wild-type synapses to exclude artifacts that could arise through the homogenization step, and, in addition, analyse plastic-embedded, stained synapses prepared using the same quick workflow, to ensure their findings have not been caused by way of purification of the synapses. Interestingly, the 'thick lines of PSD' are evident in most of their stained synapses.

      (2) I commend the authors on the exceptional technical achievement of preparing frozen specimens from a mouse within two minutes.

      (3) The approaches highlighted here can be used in other fields studying cell-cell junctions.

      (4) The tomograms will be deposited upon publication which will enable neurobiologists and researchers from other fields to carry on data evaluation in their field of expertise since tomography is still a specialized skill and they collected and reconstructed over 100 excellent tomograms of synapses, which generates a wealth of information to be also used in future studies.

      (5) The authors have identified ionotropic receptor positions and that they are linked to actin filaments, and appear to be associated with membrane and other cytosolic scaffolds, which is highly exciting.

      (6) The authors achieved their aims to study neuronal excitatory synapses in great detail, were thorough in their experiments, and made multiple fascinating discoveries. They challenge dogmas that have been in place for decades and highlight the benefit of implementing and developing new methods to carefully understand the underlying molecular machines of synapses.

      Weaknesses:

      The authors show informative segmentations in their figures but none have been overlaid with any of the tomograms in the submitted videos. It would be helpful for data evaluation by a broad audience to be able to view these together as videos to study these tomograms and extract more information. Deposition of segmentations associated with the tomograms would be tremendously helpful to neurobiologists, cryo-ET method developers, and others to push the boundaries.

      Impact on community:

      The findings presented by Peukes et al. pertaining to synapse biology change dogmas about the fundamental understanding of synaptic ultrastructure. The work presented by the authors, particularly the associated change of intermembrane distance with potentiation and the distinct appearance of the PSD as an irregular amorphous 'cloud' will provide food for thought and an incentive for more analysis and additional studies, as will the discovery of large membranous and cytosolic protein complexes linked to ionotropic receptors within and outside of the synaptic cleft, which are ripe for investigation. The findings and tomograms available will carry far in the synapse fields and the approach and methods will move other fields outside of neurobiology forward. The method and impactful results of preparing cryogenic, unlabeled, unstained, near-native synapses may enable the study of how synapses function at high resolution in the future.

    5. Author response:

      Reviewer #1 (Public review): 

      The authors survey the ultrastructural organization of glutamatergic synapses by cryo-ET and image processing tools using two complementary experimental approaches. The first approach employs so-called "ultra-fresh" preparations of brain homogenates from a knock-in mouse expressing a GFP-tagged version of PSD-95, allowing Peukes and colleagues to specifically target excitatory glutamatergic synapses. In the second approach, direct in-tissue (using cortical and hippocampal regions) targeting of the glutamatergic synapses employing the same mouse model is presented. In order to ascertain whether the isolation procedure causes any significant changes in the ultrastructural organization (and possibly synaptic macromolecular organization) the authors compare their findings using both of these approaches. The quantitation of the synaptic cleft height reveals an unexpected variability, while the STA analysis of the ionotropic receptors provides insights into their distribution with respect to the synaptic cleft.

      The main novelty of this study lies in the continuous claims by the authors that the sample preservation methods developed here are superior to any others previously used. This leads them as well to systematically downplay or directly ignore a substantial body of previous cryo-ET studies of synaptic structure. Without comparisons with the cryo-ET literature, it is very hard to judge the impact of this work in the field. Furthermore, the data does not show any better preservation in the so-called "ultra-fresh" preparation than in the literature, perhaps to the contrary as synapses with strangely elongated vesicles are often seen. Such synapses have been regularly discarded for further analysis in previous synaptosome studies (e.g. Martinez-Sanchez 2021). Whilst the targeting approach using a fluorescent PSD95 marker is novel and seems sufficiently precise, the authors use a somewhat outdated approach (cryo-sectioning) to generate in-tissue tomograms of poor quality. To what extent such tomograms can be interpreted in molecular terms is highly questionable. The authors also don't discuss the physiological influence of 20% dextran used for high-pressure freezing of these "very native" specimens.

      Lastly, a large part of the paper is devoted to image analysis of the PSD which is not convincing (including a somewhat forced comparison with the fixed and heavy-metal staining room temperature approach). Despite being a technically challenging study, the results fall short of expectations. 

      Our manuscript contains a discussion of both conventional EM and cryoET of synapses. We apologise if we have omitted referencing or discussing any earlier cryoET work. This was certainly not our intention, and we include a more complete discussion of published cryoET work on synapses in our revised manuscript.

      The reviewer is concerned that the synaptic vesicles in some synapse tomograms are “stretched” and that this may reflect poor preservation. We would like to point out that such non-spherical synaptic vesicles have also been previously reported in cryoET of primary neurons grown on EM grids (Tao et al., J. Neurosci., 2018). Indeed, there is no reason per se to suppose that synaptic vesicles are always spherical, and many diverse families of proteins expressed at the synapse shape membrane curvature (BAR domain proteins, synaptotagmin, epsins, endophilins and others). We will add further discussion of this issue in the revised manuscript.

      The reviewer regards ‘cryo-sectioning’ as outdated and cryoET data from these preparations as “poor quality”. We respectfully disagree. Preparing brain tissues for cryoET is generally considered to be challenging. The first successful demonstration of preparing such samples, by Zuber et al. (Proc. Natl. Acad. Sci., 2005), predated the cryoEM resolution revolution (with electron counting detectors) and used cryo-sections/CEMOVIS of in vitro brain cultures. We followed this technique to prepare tissue cryo-sections for cryoET in our manuscript. Recently, cryoFIB-SEM liftout has been developed as an alternative method to prepare tissue samples for cryoET (Mahamid et al., J. Struct. Biol., 2015), and only more recently has this method become available to more laboratories. Both techniques introduce damage, as has been described (Han et al., J. Microsc., 2008; Lucas et al., Proc. Natl. Acad. Sci., 2023). Importantly, no like-for-like, quantitative comparison of these two methodologies has yet been performed. We have recently demonstrated that the molecular structure of amyloid fibrils within human brain is preserved down to the protein fold level in samples prepared by cryo-sectioning (Gilbert et al., Nature, 2024). We will add further detail on the process by which we excluded poor quality tomograms from our analysis, which we described in detail in our methods section.

      The reviewer asks what the physiological effect is of adding 20% w/v ~40,000 Da dextran. This is a reasonable concern, since this could in principle exert osmotic pressure on the tissue sample. While we did not investigate this ourselves, earlier studies have (Zuber et al., 2005), showing that this concentration of dextran did not damage cell membranes and had no detectable effect on cell structure.

      The reviewer is not convinced by our analysis of the apparent molecular density of macromolecules in the postsynaptic compartment that in conventional EM is called the postsynaptic density. However, the reviewer provides no reasoning for this assessment, nor alternative approaches that could be attempted. We would like to add that we have tested multiple different approaches to objectively measure molecular crowding in cryoET data, which give comparable results. We believe that our conclusion (that we do not observe an increased molecular density conserved at the postsynaptic membrane, and that the PSD that we and others observed by conventional EM does not correspond to a region of increased molecular density) is well supported by our data. We and the other reviewers consider this an important and novel observation.

      Reviewer #2 (Public review): 

      Summary: 

      The authors set out to visualize the molecular architecture of the adult forebrain glutamatergic synapses in a near-native state. To this end, they use a rapid workflow to extract and plunge-freeze mouse synapses for cryo-electron tomography. In addition, the authors use knock-in mice expressing PSD95-GFP in order to perform correlated light and electron microscopy to clearly identify pre- and postsynaptic membranes. By thorough quantification of tomograms from plunge- and high-pressure frozen samples, the authors show that the previously reported 'post-synaptic density' does not occur at high frequency and is therefore not a defining feature of a glutamatergic synapse.

      Subsequently, the authors are able to reproduce the frequency of post-synaptic density when preparing conventional electron microscopy samples, thus indicating that density prevalence is an artifact of sample preparation. The authors go on to describe the arrangement of cytoskeletal components, membranous compartments, and ionotropic receptor clusters across synapses.

      Demonstrating that the frequency of the post-synaptic density in prior work is likely an artifact and not a defining feature of glutamatergic synapses is significant. The descriptions of distributions and morphologies of proteins and membranes in this work may serve as a basis for future investigation by readers interested in these features.

      Strengths: 

      The authors perform a rigorous quantification of the molecular density profiles across synapses to determine the frequency of the post-synaptic density. They prepare samples using two cryogenic electron microscopy sample preparation methods, as well as one set of samples using conventional electron microscopy methods. The authors can reproduce previous reports of the frequency of the post-synaptic density by conventional sample preparation, but not by either of the cryogenic methods, thus strongly supporting their claim. 

      We thank the reviewer for their generous assessment of our manuscript.

      Reviewer #3 (Public review): 

      Summary: 

      The authors use cryo-electron tomography to thoroughly investigate the complexity of purified, excitatory synapses. They make several major interesting discoveries: polyhedral vesicles that have not been observed before in neurons; an analysis of the intermembrane distance, and a link to potentiation, essentially updating distances reported from plastic-embedded specimens; and the finding that the postsynaptic density does not appear as a dense accumulation of proteins in all vitrified samples (less than half), although this feature has served as a hallmark for identifying excitatory plastic-embedded synapses.

      Strengths: 

      (1) The presented work is thorough: the authors compare purified, endogenously labeled synapses to wild-type synapses to exclude artifacts that could arise through the homogenization step, and, in addition, analyse plastic-embedded, stained synapses prepared using the same quick workflow, to ensure their findings have not been caused by way of purification of the synapses. Interestingly, the 'thick lines of PSD' are evident in most of their stained synapses.

      (2) I commend the authors on the exceptional technical achievement of preparing frozen specimens from a mouse within two minutes.

      (3) The approaches highlighted here can be used in other fields studying cell-cell junctions.

      (4) The tomograms will be deposited upon publication which will enable neurobiologists and researchers from other fields to carry on data evaluation in their field of expertise since tomography is still a specialized skill and they collected and reconstructed over 100 excellent tomograms of synapses, which generates a wealth of information to be also used in future studies.

      (5) The authors have identified ionotropic receptor positions and that they are linked to actin filaments, and appear to be associated with membrane and other cytosolic scaffolds, which is highly exciting.

      (6) The authors achieved their aims to study neuronal excitatory synapses in great detail, were thorough in their experiments, and made multiple fascinating discoveries. They challenge dogmas that have been in place for decades and highlight the benefit of implementing and developing new methods to carefully understand the underlying molecular machines of synapses.

      Weaknesses: 

      The authors show informative segmentations in their figures but none have been overlaid with any of the tomograms in the submitted videos. It would be helpful for data evaluation by a broad audience to be able to view these together as videos to study these tomograms and extract more information. Deposition of segmentations associated with the tomograms would be tremendously helpful to neurobiologists, cryo-ET method developers, and others to push the boundaries.

      Impact on community: 

      The findings presented by Peukes et al. pertaining to synapse biology change dogmas about the fundamental understanding of synaptic ultrastructure. The work presented by the authors, particularly the associated change of intermembrane distance with potentiation and the distinct appearance of the PSD as an irregular amorphous 'cloud' will provide food for thought and an incentive for more analysis and additional studies, as will the discovery of large membranous and cytosolic protein complexes linked to ionotropic receptors within and outside of the synaptic cleft, which are ripe for investigation. The findings and tomograms available will carry far in the synapse fields and the approach and methods will move other fields outside of neurobiology forward. The method and impactful results of preparing cryogenic, unlabelled, unstained, near-native synapses may enable the study of how synapses function at high resolution in the future.

      We thank the reviewer for their supportive assessment of our manuscript. We thank the reviewer for suggesting overlaying segmentations with videos of the raw tomographic volumes. We will include this in our revised manuscript.

    1. eLife Assessment

      This important study provides new and nuanced insights into the evolution of morphs in a textbook example of Batesian mimicry. The evidence supporting the claims about the origin and dominance relationships among morphs is convincing, but the interpretation of signals needs improvement with complementary analysis and some nuanced interpretation. Pending a revision, this work will be of interest to a broad range of evolutionary biologists.

    2. Reviewer #1 (Public review):

      In this study, Deshmukh et al. provide an elegant illustration of Haldane's sieve, the population genetics concept stating that novel advantageous alleles are more likely to fix if dominant because dominant alleles are more readily exposed to selection. To achieve this, the authors rely on a uniquely suited study system, the female-polymorphic butterfly Papilio polytes.

      Deshmukh et al. first reconstruct the chronology of allele evolution in the P. polytes species group, clearly establishing the non-mimetic cyrus allele as ancestral, followed by the origin of the mimetic allele polytes/theseus, via a previously characterized inversion of the dsx locus, and most recently, the origin of the romulus allele in the P. polytes lineage, after its split from P. javanus. The authors then examine the two crucial predictions of Haldane's sieve, using the three alleles of P. polytes (cyrus, polytes, and romulus). First, they report with compelling evidence that these alleles are sequentially dominant, or put in other words, novel adaptive alleles either are or quickly become dominant upon their origin. Second, the authors find a robust signature of positive selection at the dsx locus, across all five species that share the polytes allele.

      In addition to exquisitely exemplifying Haldane's sieve, this study characterizes the genetic differences (or lack thereof) between mimetic alleles at the dsx locus. Remarkably, the polytes and romulus alleles are profoundly differentiated, despite their short divergence time (< 0.5 my), whereas the polytes and theseus alleles are indistinguishable across both coding and intronic sequences of dsx. Finally, the study reports incidental evidence of exon swaps between the polytes and romulus alleles. These exon swaps caused intermediate colour patterns and suggest that (rare) recombination might be a mechanism by which novel morphs evolve.

      This study advances our understanding of the evolution of the mimicry polymorphism in Papilio butterflies. This is an important contribution to a system already at the forefront of research on the genetic and developmental basis of sex-specific phenotypic morphs, which are common in insects. More generally, the findings of this study have important implications for how we think about the molecular dynamics of adaptation. In particular, I found the extensive genetic divergence between the polytes and romulus alleles striking, and it challenges the way I used to think about the evolution of this and other otherwise conserved developmental genes. I think that this study is also a great resource for teaching evolution. By linking classic population genetic theory to modern genomic methods, while using visually appealing traits (colour patterns), this study provides a simple yet compelling example to bring to a classroom.

      In general, I think that the conclusions of the study, in terms of the evolutionary history of the locus, the dominance relationships between P. polytes alleles, and the inference of a selective sweep in spite of contemporary balancing selection, are strongly supported; the data set is impressive and the analyses are all rigorous. I nonetheless think that there are a few ways in which the current presentation of these data could lead to confusion, and should be clarified and potentially also expanded.

      (1) The study is presented as addressing a paradox related to the evolution of phenotypic novelty in "highly constrained genetic architectures". If I understand correctly, these constraints are assumed to arise because the dsx inversion acts as a barrier to recombination. I agree that recombination in the mimicry locus is reduced and that recombination can be a source of phenotypic novelty. However, I'm not convinced that the presence of a structural variant necessarily constrains the potential evolution of novel discrete phenotypes. Instead, I'm having a hard time coming up with examples of discrete phenotypic polymorphisms that do not involve structural variants. If there is a paradox here, I think it should be more clearly justified, including an explanation of what a constrained genetic architecture means. I also think that the Discussion would be the place to return to this supposed paradox, and tell us exactly how the observations of exon swaps and the genetic characterization of the different mimicry alleles help resolve it.

      (2) While Haldane's sieve is clearly demonstrated in the P. polytes lineage (with cyrus, polytes, and romulus alleles), there is another allele trio (cyrus, polytes, and theseus) for which Haldane's sieve could also be expected. However, the chronological order in which polytes and theseus evolved remains unresolved, precluding a similar investigation of sequential dominance. Likewise, the locus that differentiates polytes from theseus is unknown, so it's not currently feasible to identify a signature of positive selection shared by P. javanus and P. alphenor at this locus. I, therefore, think that it is premature to conclude that the evolution of these mimicry polymorphisms generally follows Haldane's sieve; of two allele trios, only one currently shows the expected pattern.

    3. Reviewer #2 (Public review):

      Summary:

      Deshmukh and colleagues studied the evolution of mimetic morphs in the Papilio polytes species group. They investigate the timing of origin of haplotypes associated with different morphs, their dominance relationships, associations with different isoform expressions, and evidence for selection and recombination in the sequence data. P. polytes is a textbook example of a Batesian mimic, and this study provides important nuanced insights into its evolution, and will therefore be relevant to many evolutionary biologists. I find the results regarding dominance and the sequence of events generally convincing, but I have some concerns about the motivation and interpretation of some other analyses, particularly the tests for selection.

      Strengths:

      This study uses widespread sampling, large sample sizes from crossing experiments, and a wide range of data sources.

      Weaknesses:

      (1) Purpose and premise of selective sweep analysis

      A major narrative of the paper is that new mimetic alleles have arisen and spread to high frequency, and their dominance over the pre-existing alleles is consistent with Haldane's sieve. It would therefore make sense to test for selective sweep signatures within each morph (and its corresponding dsx haplotype), rather than at the species level. This would allow a test of the prediction that those morphs that arose most recently would have the strongest sweep signatures.

      Sweep signatures erode over time - see Figure 2 of Moest et al. 2020 (https://doi.org/10.1371/journal.pbio.3000597), and it is unclear whether we expect the signatures of the original sweeps of these haplotypes to still be detectable at all. Moest et al show that sweep signatures are completely eroded by 1N generations after the event, and probably not detectable much sooner than that, so assuming effective population sizes of these species of a few million, at what time scale can we expect to detect sweeps? If these putative sweeps are in fact more recent than the origin of the different morphs, perhaps they would more likely be associated with the refinement of mimicry, but not necessarily providing evidence for or against a Haldane's sieve process in the origin of the morphs.

      (2) Selective sweep methods

      A tool called RAiSD was used to detect signatures of selective sweeps, but this manuscript does not describe what signatures this tool considers (reduced diversity, skewed frequency spectrum, increased LD, all of the above?). Given the comment above, would this tool be sensitive to incomplete sweeps that affect only one morph in a species-level dataset? It is also not clear how RAiSD could identify signatures of selective sweeps at individual SNPs (line 206). Sweeps occur over tracts of the genome and it is often difficult to associate a sweep with a single gene.

      (3) Episodic diversification

      Very little information is provided about the Branch-site Unrestricted Statistical Test for Episodic Diversification (BUSTED) and Mixed Effects Model of Evolution (MEME), and what hypothesis the authors were testing by applying these methods. Although it is not mentioned in the manuscript, a quick search reveals that these are methods to study codon evolution along branches of a phylogeny. Without this information, it is difficult to understand the motivation for this analysis.

      (4) GWAS for form romulus

      The authors argue that the lack of SNP associations within dsx for form romulus is caused by poor read mapping in the inverted region itself (line 125). If this is true, we would expect strong association in the regions immediately outside the inversion. From Figure S3, there are four discrete peaks of association, and the location of dsx and the inversion are not indicated, so it is difficult to understand the authors' interpretation in light of this figure.

      (5) Form theseus

      Since there appears to be only one sequence available for form theseus (actually it is said to be "P. javanus f. polytes/theseus"), is it reasonable to conclude that "the dsx coding sequence of f. theseus was identical to that of f. polytes in both P. javanus and P. alphenor" (Line 151)? Looking at the Clarke and Sheppard (1972) paper cited in the statement that "f. polytes and f. theseus show equal dominance" (line 153), it seems to me that their definition of theseus is quite different from that here. Without addressing this discrepancy, the results are difficult to interpret.

    4. Author Response:

      Reviewer #1 (Public review):

      In this study, Deshmukh et al. provide an elegant illustration of Haldane's sieve, the population genetics concept stating that novel advantageous alleles are more likely to fix if dominant because dominant alleles are more readily exposed to selection. To achieve this, the authors rely on a uniquely suited study system, the female-polymorphic butterfly Papilio polytes.

      Deshmukh et al. first reconstruct the chronology of allele evolution in the P. polytes species group, clearly establishing the non-mimetic cyrus allele as ancestral, followed by the origin of the mimetic allele polytes/theseus, via a previously characterized inversion of the dsx locus, and most recently, the origin of the romulus allele in the P. polytes lineage, after its split from P. javanus. The authors then examine the two crucial predictions of Haldane's sieve, using the three alleles of P. polytes (cyrus, polytes, and romulus). First, they report with compelling evidence that these alleles are sequentially dominant, or put in other words, novel adaptive alleles either are or quickly become dominant upon their origin. Second, the authors find a robust signature of positive selection at the dsx locus, across all five species that share the polytes allele.

      In addition to exquisitely exemplifying Haldane's sieve, this study characterizes the genetic differences (or lack thereof) between mimetic alleles at the dsx locus. Remarkably, the polytes and romulus alleles are profoundly differentiated, despite their short divergence time (< 0.5 my), whereas the polytes and theseus alleles are indistinguishable across both coding and intronic sequences of dsx. Finally, the study reports incidental evidence of exon swaps between the polytes and romulus alleles. These exon swaps caused intermediate colour patterns and suggest that (rare) recombination might be a mechanism by which novel morphs evolve.

      This study advances our understanding of the evolution of the mimicry polymorphism in Papilio butterflies. This is an important contribution to a system already at the forefront of research on the genetic and developmental basis of sex-specific phenotypic morphs, which are common in insects. More generally, the findings of this study have important implications for how we think about the molecular dynamics of adaptation. In particular, I found the extensive genetic divergence between the polytes and romulus alleles striking, and it challenges the way I used to think about the evolution of this and other otherwise conserved developmental genes. I think that this study is also a great resource for teaching evolution. By linking classic population genetic theory to modern genomic methods, while using visually appealing traits (colour patterns), this study provides a simple yet compelling example to bring to a classroom.

      In general, I think that the conclusions of the study, in terms of the evolutionary history of the locus, the dominance relationships between P. polytes alleles, and the inference of a selective sweep in spite of contemporary balancing selection, are strongly supported; the data set is impressive and the analyses are all rigorous. I nonetheless think that there are a few ways in which the current presentation of these data could lead to confusion, and should be clarified and potentially also expanded.

      We thank the reviewer for the kind and encouraging assessment of our work.

      (1) The study is presented as addressing a paradox related to the evolution of phenotypic novelty in "highly constrained genetic architectures". If I understand correctly, these constraints are assumed to arise because the dsx inversion acts as a barrier to recombination. I agree that recombination in the mimicry locus is reduced and that recombination can be a source of phenotypic novelty. However, I'm not convinced that the presence of a structural variant necessarily constrains the potential evolution of novel discrete phenotypes. Instead, I'm having a hard time coming up with examples of discrete phenotypic polymorphisms that do not involve structural variants. If there is a paradox here, I think it should be more clearly justified, including an explanation of what a constrained genetic architecture means. I also think that the Discussion would be the place to return to this supposed paradox, and tell us exactly how the observations of exon swaps and the genetic characterization of the different mimicry alleles help resolve it.

      The paradox that we refer to here is essentially the tension between evolving new, genetically regulated adaptive traits and maintaining the existing adaptive trait(s) at their fitness peak. While one of the mechanisms to achieve this could be differential structural rearrangement at the chromosomal level, it could also arise through alternative alleles or splice variants of a key gene (caste determination in Cardiocondyla ants) or through differential regulation of expression (the spatial regulation of melanization in Nymphalid butterflies by ivory lncRNA). In each of these cases, a new mutation would have to give rise to a new phenotype without diluting the existing adaptive traits when it arises. We focused on structural variants because that was the case in our study system; however, the point we were making referred to the evolution of novel traits in general. We will add a section in the revised discussion to address this.

      (2) While Haldane's sieve is clearly demonstrated in the P. polytes lineage (with cyrus, polytes, and romulus alleles), there is another allele trio (cyrus, polytes, and theseus) for which Haldane's sieve could also be expected. However, the chronological order in which polytes and theseus evolved remains unresolved, precluding a similar investigation of sequential dominance. Likewise, the locus that differentiates polytes from theseus is unknown, so it's not currently feasible to identify a signature of positive selection shared by P. javanus and P. alphenor at this locus. I, therefore, think that it is premature to conclude that the evolution of these mimicry polymorphisms generally follows Haldane's sieve; of two allele trios, only one currently shows the expected pattern.

      We agree with the reviewer that the genetic basis of f. theseus requires further investigation. f. theseus occupies the same level on the dominance hierarchy of dsx alleles as f. polytes (Clarke and Sheppard, 1972) and the allelic variant of dsx present in both these female forms is identical, so there exists just one trio of alleles of dsx. Based on this evidence, we cannot comment on the origin of forms theseus and polytes. They could have arisen at the same time or sequentially. Since our paper is largely focused on the sequential evolution of dsx alleles through Haldane’s sieve, we have included f. theseus in our conclusions. We think that it fits into the framework of Haldane’s sieve due to its genetic dominance over the non-mimetic female form. However, this aspect needs to be explored further in a more specific study focusing on the characterization, origin, and developmental genetics of f. theseus in the future.

      Reviewer #2 (Public review):

      Summary:

      Deshmukh and colleagues studied the evolution of mimetic morphs in the Papilio polytes species group. They investigate the timing of origin of haplotypes associated with different morphs, their dominance relationships, associations with different isoform expressions, and evidence for selection and recombination in the sequence data. P. polytes is a textbook example of a Batesian mimic, and this study provides important nuanced insights into its evolution, and will therefore be relevant to many evolutionary biologists. I find the results regarding dominance and the sequence of events generally convincing, but I have some concerns about the motivation and interpretation of some other analyses, particularly the tests for selection.

      We thank the reviewer for these insightful remarks.

      Strengths:

      This study uses widespread sampling, large sample sizes from crossing experiments, and a wide range of data sources.

      We appreciate this point. This strength has indeed helped us illuminate the evolutionary dynamics of this classic example of balanced polymorphism.

      Weaknesses:

      (1) Purpose and premise of selective sweep analysis

      A major narrative of the paper is that new mimetic alleles have arisen and spread to high frequency, and their dominance over the pre-existing alleles is consistent with Haldane's sieve. It would therefore make sense to test for selective sweep signatures within each morph (and its corresponding dsx haplotype), rather than at the species level. This would allow a test of the prediction that those morphs that arose most recently would have the strongest sweep signatures.

      Sweep signatures erode over time - see Figure 2 of Moest et al. 2020 (https://doi.org/10.1371/journal.pbio.3000597), and it is unclear whether we expect the signatures of the original sweeps of these haplotypes to still be detectable at all. Moest et al show that sweep signatures are completely eroded by 1N generations after the event, and probably not detectable much sooner than that, so assuming effective population sizes of these species of a few million, at what time scale can we expect to detect sweeps? If these putative sweeps are in fact more recent than the origin of the different morphs, perhaps they would more likely be associated with the refinement of mimicry, but not necessarily providing evidence for or against a Haldane's sieve process in the origin of the morphs.

      Our original plan was to test for signatures of sweeps in individual morphs, but we have very small sample sizes for individual morphs in some species, which made it difficult to perform the analysis. We agree that signatures of selective sweeps cannot give us an estimate of possible timescales of the sweep. They simply indicate that there may have been a sweep in a certain genomic region. Therefore, with just the data from selective sweeps, we cannot determine whether these occurred with the refinement of mimicry or with the origin of the mimetic phenotype itself. We have thus made no interpretations regarding time scales or causal events of the sweep. Additionally, we discuss that the results we obtained for individual alleles may represent what occurred at the point of origin of mimetic resemblance or in the course of perfecting the resemblance, although we cannot differentiate between the two at this point (lines 320 to 333).

      (2) Selective sweep methods

      A tool called RAiSD was used to detect signatures of selective sweeps, but this manuscript does not describe what signatures this tool considers (reduced diversity, skewed frequency spectrum, increased LD, all of the above?). Given the comment above, would this tool be sensitive to incomplete sweeps that affect only one morph in a species-level dataset? It is also not clear how RAiSD could identify signatures of selective sweeps at individual SNPs (line 206). Sweeps occur over tracts of the genome and it is often difficult to associate a sweep with a single gene.

      RAiSD (https://www.nature.com/articles/s42003-018-0085-8) detects selective sweeps using the μ statistic, which is a combined score of SFS, LD, and genetic diversity along a chromosome. The tool is quite sensitive and is able to detect soft sweeps. RAiSD can use a VCF variant file comprising SNP data as input and uses an SNP-driven sliding window approach to scan the genome for sweep signatures. Using an SNP file instead of runs of sequences prevents repeated calculations in regions that are sparse in variants, thereby optimizing execution time. Due to the nature of the input we used, the μ statistic was also calculated per site. We then annotated the SNPs according to the genes in which they occur and found that all species showing mimicry had at least one site with a sweep signature within the dsx locus.

      (3) Episodic diversification

      Very little information is provided about the Branch-site Unrestricted Statistical Test for Episodic Diversification (BUSTED) and Mixed Effects Model of Evolution (MEME), and what hypothesis the authors were testing by applying these methods. Although it is not mentioned in the manuscript, a quick search reveals that these are methods to study codon evolution along branches of a phylogeny. Without this information, it is difficult to understand the motivation for this analysis.

      We thank you for bringing this to our notice. We will add a few lines in the Methods about the hypothesis we were testing and the motivation behind this analysis. We will additionally cite a previous study from our group that used these and other methods to study the molecular evolution of dsx across insect lineages.

      (4) GWAS for form romulus

      The authors argue that the lack of SNP associations within dsx for form romulus is caused by poor read mapping in the inverted region itself (line 125). If this is true, we would expect strong association in the regions immediately outside the inversion. From Figure S3, there are four discrete peaks of association, and the location of dsx and the inversion are not indicated, so it is difficult to understand the authors' interpretation in light of this figure.

      We indeed observe the regions flanking dsx showing the highest association in our GWAS. This is a bit tricky to demonstrate in the figure as the genome is not assembled at the chromosome level. However, the association peaks occur on scf 908437033 at positions 2192979, 1181012 and 1352228 (Fig. S3c, Table S3) while dsx is located between 1938098 and 2045969. We will add the position of dsx in the figure legend of the revised manuscript.
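
      The positions quoted above can be checked against the dsx interval with a few lines of Python. This is only a worked-arithmetic sketch using the coordinates given in this response; the helper name `offset_from_interval` is ours, and "before" or "after" refers to scaffold coordinates only, since the orientation of the gene on the scaffold is not stated here.

```python
# Coordinates on the scaffold as quoted in the response (Fig. S3c, Table S3).
DSX_START, DSX_END = 1938098, 2045969
peaks = [2192979, 1181012, 1352228]


def offset_from_interval(pos, start, end):
    """Signed distance in bp from pos to the interval; 0 if pos lies inside."""
    if pos < start:
        return pos - start   # negative: before the interval
    if pos > end:
        return pos - end     # positive: after the interval
    return 0


offsets = {p: offset_from_interval(p, DSX_START, DSX_END) for p in peaks}
for pos, off in offsets.items():
    print(pos, "inside dsx" if off == 0 else f"{abs(off)} bp outside dsx")
```

      None of the three peak positions falls inside the dsx interval: one lies after its end and two lie before its start.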

      (5) Form theseus

      Since there appears to be only one sequence available for form theseus (actually it is said to be "P. javanus f. polytes/theseus"), is it reasonable to conclude that "the dsx coding sequence of f. theseus was identical to that of f. polytes in both P. javanus and P. alphenor" (Line 151)? Looking at the Clarke and Sheppard (1972) paper cited in the statement that "f. polytes and f. theseus show equal dominance" (line 153), it seems to me that their definition of theseus is quite different from that here. Without addressing this discrepancy, the results are difficult to interpret.

      Among the P. javanus individuals we sampled, we obtained just one individual with f. theseus and the H P allele. However, the data we added from a previously published study (Zhang et al. 2017) contributed nine more individuals of this form (Fig. S4b and S7). Although we did not show these individuals in Fig. 3 (which was based on PCR amplification and sequencing of individual exons of dsx), all analyses with sequence data were performed on 10 theseus individuals in total. When comparing theseus and polytes dsx alleles, Zhang et al. observed what we now know are species-specific differences rather than allele-specific differences. Our observations were consistent with these findings.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

      Summary: 

      The authors compared four types of hiPSCs and four types of hESCs at the proteome level to elucidate the differences between hiPSCs and hESCs. Semi-quantitative calculations of protein copy numbers revealed increased protein content in iPSCs. Particularly in iPSCs, proteins related to mitochondrial and cytoplasmic were suggested to reflect the state of the original differentiated cells to some extent. However, the most important result of this study is the calculation of the protein copy numbers per cell, and the validity of this result is problematic. In addition, several experiments need to be improved, such as using cells of different genders (iPSC: female, ESC: male) in mitochondrial metabolism experiments.

      Strengths: 

      The focus on the number of copies of proteins is exciting and appreciated if the estimated calculation result is correct and biologically reproducible. 

      Weaknesses: 

      The proteome results in this study were likely obtained by simply looking at differences between clones, and the proteome data need to be validated. First, there were only a few clones for comparison, and the gender and number of cells did not match between ESCs and iPSCs. Second, no data show the accuracy of the protein copy number per cell obtained by the proteome data. 

      We agree with the reviewer that it would be useful to have data from more independent stem cell clones and that an equal gender balance among donors would be preferable. As usual, practical considerations of cost, benefit, and available time affect the scope of work that can be performed. We note that the impact of biological donor sex on proteome expression in iPSC lines has already been addressed in previous studies (ref. 13). We will, however, revise the manuscript to include specific mention of these limitations and propose a larger-scale follow-up when resources are available.

      Regarding the estimation of protein copy numbers in our study, we would like to highlight that the proteome ruler approach we have used has been employed extensively in the field previously, with direct validation of differences in copy numbers provided using methods orthogonal to MS, e.g., FACS (refs. 2-4, 7, 10). Furthermore, the original manuscript (ref. 14) directly compared the copy numbers estimated using the “proteomic ruler” to spike-in protein epitope signature tags and found remarkable concordance. This original study was performed with an older generation mass spectrometer and reduced peptide coverage, compared with the instrumentation used in our present study. Further, we note that these authors predicted that higher peptide coverage, such as we report in our study, would further increase quantitative performance.
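
      For readers unfamiliar with the approach, the following Python sketch shows our understanding of the core calculation behind a histone-based "proteomic ruler" (ref. 14): the summed histone MS signal is taken to scale with the DNA mass per cell, which converts a protein's share of the signal into a mass and then into a copy number. The function name, the default DNA mass, and the example intensities are illustrative assumptions, not values or code from this study.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def ruler_copy_number(protein_signal, histone_signal_total, mw_da, dna_mass_pg=6.5):
    """Estimate protein copies per cell via a histone-based scaling.

    protein_signal       : summed MS intensity of the protein
    histone_signal_total : summed MS intensity of all histones in the same sample
    mw_da                : protein molecular weight in g/mol (Da)
    dna_mass_pg          : assumed DNA mass per cell in picograms (illustrative)
    """
    # Protein mass per cell, scaled by the histone signal and the DNA mass.
    protein_mass_g = (protein_signal / histone_signal_total) * dna_mass_pg * 1e-12
    # Convert grams to moles to molecules.
    return protein_mass_g / mw_da * AVOGADRO


# Hypothetical intensities for a 50 kDa protein.
copies = ruler_copy_number(protein_signal=1e8, histone_signal_total=1e10, mw_da=50_000)
```

      Because the scaling uses a within-sample reference rather than the total or median signal, a global change in protein content per cell is carried through to the estimated copy numbers.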

      Reviewer #2 (Public Review):

      Summary: 

      Pluripotent stem cells are powerful tools for understanding development, differentiation, and disease modeling. The capacity of stem cells to differentiate into various cell types holds great promise for therapeutic applications. However, ethical concerns restrict the use of human embryonic stem cells (hESCs). Consequently, induced human pluripotent stem cells (ihPSCs) offer an attractive alternative for modeling rare diseases, drug screening, and regenerative medicine. A comprehensive understanding of ihPSCs is crucial to establish their similarities and differences compared to hESCs. This work demonstrates systematic differences in the reprogramming of nuclear and non-nuclear proteomes in ihPSCs. 

      We thank the reviewer for the positive assessment.

      Strengths: 

      The authors employed quantitative mass spectrometry to compare protein expression differences between independently derived ihPSC and hESC cell lines. Qualitatively, protein expression profiles in ihPSC and hESC were found to be very similar. However, when comparing protein concentration at a cellular level, it became evident that ihPSCs express higher levels of proteins in the cytoplasm, mitochondria, and plasma membrane, while the expression of nuclear proteins is similar between ihPSCs and hESCs. A higher expression of proteins in ihPSCs was verified by an independent approach, and flow cytometry confirmed that ihPSCs had larger cell sizes than hESCs. The differences in protein expression were reflected in functional distinctions. For instance, the higher expression of mitochondrial metabolic enzymes, glutamine transporters, and lipid biosynthesis enzymes in ihPSCs was associated with enhanced mitochondrial potential, increased ability to uptake glutamine, and increased ability to form lipid droplets. 

      Weaknesses: 

      While this finding is intriguing and interesting, the study falls short of explaining the mechanistic reasons for the observed quantitative proteome differences. It remains unclear whether the increased expression of proteins in ihPSCs is due to enhanced transcription of the genes encoding this group of proteins or due to other reasons, for example, differences in mRNA translation efficiency. Another unresolved question pertains to how the cell type origin influences ihPSC proteomes. For instance, whether ihPSCs derived from fibroblasts, lymphocytes, and other cell types all exhibit differences in their cell size and increased expression of cytoplasmic and mitochondrial proteins. Analyzing ihPSCs derived from different cell types and by different investigators would be necessary to address these questions. 

      We agree with the Reviewer that our study does not extend to also providing a detailed mechanistic explanation for the quantitative differences observed between the two stem cell types and did not claim to have done so. We have now included an expanded section in the discussion where we discuss potential causes. However, in our view fully understanding the reasons for this difference is likely to involve extensive future in-depth analysis in additional studies and is not something that can be determined just by one or two additional supplemental experiments.

      We also agree studying hiPSCs reprogrammed from different cell types, such as blood lymphocytes, would be of great interest. Again, while we agree it is a useful way forward, in practice this will require a very substantial additional commitment of time and resources. We have now included a section discussing this opportunity within the discussion to encourage further research into the area.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) aizi1 and ueah1 clones, which were analyzed in Figure 1A, were excluded from the proteome analysis. In particular, the GAPDH expression level of the aizi1 clone is similar to that of ESCs and different from other iPSC clones. An explanation of how the clones were selected for proteome analysis is needed. Previously, the comparative analysis of iPSCs and ESCs reported in many studies from 2009-2017 (Ref#1-7) has already shown that the number of clones used in the comparative analysis is small, claiming differences (Ref#1-3) and that the differences become indistinguishable when the number of clones is increased (Ref#4-7). Certainly, few studies have been done at the proteome level, so it is important to examine what differences exist in the proteome. Also, it is interesting to focus on the amount of protein per cell. However, if the authors want to describe biological differences, it would be better to get the proteome data in biological duplicate and state the reason for selecting the clones used.

      (1) M. Chin, Cell Stem Cell, 2009, PMID: 19570518

      (2) K. Kim, Nat Biotechnol., 2011, PMID: 22119740

      (3) R. Lister, Nature, 2011, PMID: 21289626

      (4) A.M. Newman, Cell Stem Cell, 2010, PMID: 20682451

      (5) M.G. Guenther, Cell Stem Cell, 2010, PMID: 20682450

      (6) C. Bock, Cell, 2010, PMID: 21295703

      (7) S. Yamanaka, Cell Stem Cell, PMID: 22704507

      We agree with the reviewer that analysing more clones would be beneficial. We have included a section on this topic in the discussion. In our study, we only had access to the 4 hESC lines included; therefore, in the original proteomic study we also analysed 4 hiPSC lines, which were routinely grown within our stem cell facility. Although the stem cell facility expanded the culture of additional hiPSC lines as the study progressed, unfortunately we could not access additional hESC lines.

      We agree that ideally combining each biological replicate with additional technical replicates would provide extra robustness. As usual, cost and practical considerations at the time the experiments were performed affected the experimental design chosen. For the experimental design, each experiment was contained within 1 batch to avoid the strong batch effects present in TMT (Brenes et al 2019).

      (2) iPSC samples used in the proteome analysis are two types of female and two types of male, while ESC samples are three types of female and one type of female. The number of sexes of the cells in the comparative analysis should be matched because sex differences may bias the results.

      While we agree with the reviewer in principle, we have previously performed detailed comparisons of proteome expression in many independent iPSC lines from both biological male and female donors (see Brenes et al., Cell Reports 2021), and it seems unlikely that biological sex differences alone could account for the proteome differences between iPSC and ESC lines uncovered in this study. However, as this is a relevant point, we have revised the manuscript to explicitly mention this caveat within the discussion section.

      (3) In Figure 1h, I suspect that the variation of PCA plots is very similar between ESCs and iPSCs. In particular, the authors wrote "copy numbers for all 8 replicates" in the legend, but if Figure 1b was done 8 times, there should be 8 types of cells x 8 measurements = 64 points. Even if iPSCs and ESCs are grouped together, there should be 8 points for each cell type. Is it possible that there is only one TMT measurement for this analysis? If so, at least technical duplicates or biological duplicates would be necessary. I also think each cell should be plotted in the PCA analysis instead of combining the four types of ESCs and iPSCs into one.

      We thank the reviewer for bringing this error to our attention. The legend has been corrected to state, “for all 8 stem cell lines”. Each dot represents the proteome of each of the 4 hESCs and 4 hiPSCs that were analysed using proteomics.

      (4) It is necessary to show what functions are enriched in the 4408 proteins whose protein copies per cell were increased in the iPSCs obtained in Figure 2B.

      The enrichment analysis requested has been performed and is now included as a new supplemental figure 2. We find it very interesting that despite the large number of proteins involved here (4,408), the enrichment analysis still shows clear enrichment for specific cellular processes. The summary plot using affinity propagation within webgestalt is included here:

      Author response image 1.

      (5) The Proteomic Ruler method used in this study is a semi-quantitative method to calculate protein copy numbers and is a concentration estimation method. Therefore, if the authors want to have a biological discussion based on the results, they need to show that the estimated concentrations are correct. For example, there are Western Blotting (WB) results for genes with no change in protein levels in hESC and hiPSC in Fig. 6ij, but the WB results for the group of genes that are claimed to have changed are not shown throughout the paper. Also, there is no difference in the total protein level between iPSCs and ESCs from the ponceau staining in Fig.6ij. WB results for at least a few genes are needed to show whether the concentration estimates obtained from the proteome analysis are plausible. If the protein per cell is increased in these iPSC clones, performing WB analysis using an equal number of cells would be better.

      Regarding the ‘proteome ruler’ approach, we would like to highlight that this method has previously been used extensively in the field, with detailed validation, as already explained above. It is also not ‘semi-quantitative’ and can estimate absolute abundance, as well as concentrations. Our work does not use their concentration formulas, but the estimation of protein copy numbers, which was shown to closely match the observed copy numbers as determined when spike-ins are used (ref. 14).

      In providing additional validation here using Western blotting (WB), we prioritised proteins related to pluripotency markers, which are vital for determining the pluripotency state of the hESCs and hiPSCs, as well as histone markers. We have included a section in the discussion concerning additional validation data and agree in general that further validation is always useful.

      (6) Regarding the experiment shown in Figure 4l, the gender of iPSC used (wibj2) is female and WA01 (H1; WA01) is male. Certainly, there is a difference in the P/E control ratio, but isn't this just a gender difference? The sexes of the cells need to be matched.

      We accept that the sexes of donors should ideally have been matched and have mentioned this within the discussion. Nonetheless, as previously mentioned, our previous detailed proteomic analyses of multiple hiPSC lines (ref. 13) derived from both biological male and female donors provide relevant evidence that the results shown in this study are not simply a reflection of the sex of the donors for the respective iPSC and ESC lines. When comparing eroded and non-eroded female hiPSCs to male hiPSCs, we found no significant differences between males and females in any electron transport chain proteins or TCA proteins.

      Minor comments:

      (1) Method: Information on the hiPSCs and hESCs used in this study should be described. In particular, the type of differentiated cells, gender, and protocols that were used in the reprogramming are needed.

      We agree with the reviewer on this. The hiPSC lines were generated by the HipSci consortium, as described in the flagship HipSci paper (ref. 15). We cite the flagship paper, which specifies in great detail the reprogramming protocols and quality control measures, including analysis of copy number variations (ref. 15). However, we agree that this information may not be easily accessible for readers and that it is relevant to include it explicitly in our present manuscript, instead of expecting readers to consult the flagship paper. These details have therefore been added to the revised version.

      (2) Method: In Figure1a, Figure 6i, j, the antibody information of Nanog, Oct4, Sox2, and Gapdh is not written in the method and needs to be shown.

      The data relating to these has now been included within the methods section.

      (3) Method: In Figure 1b and other figures, the authors should indicate which iPSC corresponds to which TMT label; the data in the Supplemental Table also needs to indicate which data is which clone.

      We have now added this to the methods section.

      (4) Method: The method of the FACS experiment used in Figure 2 should be described.

      The methods related to the FACS analysis have now been included within the manuscript.

      (5) Method: The cell name used in the mitochondria experiment shown in Figure 4 is listed as WA01, which is thought to be H1. Variations in notation should be corrected.

      This has now been corrected.

      (6) Method: The name of the cell clone shown in Figure 3l,m should be mentioned.

      We have now added these details on the corresponding figure and legend.

      Reviewer #2 (Recommendations For The Authors):

      This study utilized quantitative mass spectrometry to compare protein expression in independently derived 4 ihPSC and 4 hESC cell lines. The investigation quantified approximately 7,900 proteins, and employing the "Proteome ruler" approach, estimated protein copy numbers per cell. Principal component analyses, based on protein copy number per cell, clearly separated hiPSC and hESC, while different hiPSCs and hESCs grouped together. The study revealed a global increase in the expression of cytoplasmic, mitochondrial, membrane transporters, and secreted proteins in hiPSCs compared to hESCs. Interestingly, standard median-based normalization approaches failed to capture these differences, and the disparities became apparent only when protein copy numbers were adjusted for cell numbers. Increased protein abundance in hiPSC was associated with augmented ribosome biogenesis. Total protein content was >50% higher in hiPSCs compared to hESCs, an observation independently verified by total protein content measurement via the EZQ assay and further supported by the larger cell size of hiPSCs in flow cytometry. However, the cell cycle distribution of hiPSC and hESC was similar, indicating that the difference in protein content was not due to variations in the cell cycle. At the phenotypic level, differences in protein expression also correlated with increased glutamine uptake, enhanced mitochondrial potential, and lipid droplet formation in hiPSCs. ihPSCs also expressed higher levels of extracellular matrix components and growth factors.

      Overall, the presented conclusions are adequately supported by the data. Although the mechanistic basis of proteome differences in ihPSC and hESC is not investigated, the work presents interesting findings that are worthy of publication. Below, I have listed my specific questions and comments for the authors.

      (1) Figure 1a displays immunoblots from 6 iPSC and 4 ESC cell lines, with 8 cell lines (4 hESC, 4 hiPSC) utilized in proteomic analyses (Fig. 1b). The figure legend should specify the 8 cell lines included in the proteomic analyses. The manuscript text describing these results should explicitly mention the number and names of cell lines used in these assays.

      We agree with the reviewer and have now marked in figure 1 all the lines that were used for proteomics and have added a section in the methods specifying which cell lines were analysed in each TMT channel.

      (2) In most figures, the quantitative differences in protein expression between hiPSC and hESC are evident, and protein expression is highly consistent among different hiPSCs and hESCs. However, the glutamine uptake capacity of different hiPSC cell lines, and to some extent hESC cell lines, appears highly variable (Figure 3e). While proteome changes were measured in 4 hiPSCs and 4 hESCs, the glutamine uptake assays were performed on a larger number of cell lines. The authors should clarify the number of cell lines used in the glutamine uptake assay, clearly indicating the cell lines used in the proteome measurements. Given the large variation in glutamine uptake among different cell lines, it would be useful to plot the correlation between the expression of glutamine transporters and glutamine uptake in individual cell lines. This may help understand whether differences in glutamine uptake are related to variations in the expression of glutamine transporters.

      The “proteomic ruler” has the capacity to estimate the protein copy numbers per cell; as such, changes in the absolute number of cells that were analysed do not cause major complications in quantification. Furthermore, TMT-based proteomics is the most precise proteomics method available, where the same peptides are detected in all samples across the same data points and peaks, as long as the analysis is done within a single batch, as is the case here.

      The glutamine uptake assay is much more sensitive to variation in the number of cells. The number of cells was estimated by plating approximately 5e4 cells two days before the assay, which creates variability. Furthermore, hESCs and hiPSCs are more adhesive than the cells used in the original protocol; hence, the quench data were noisier for these lines, making the data from the assay more variable.

      (3) In Figure 4j, it would be helpful to indicate whether the observed differences in the respiration parameters are statistically significant.

      We have now modified the plot to show which proteins were significantly different.

      (4) The iPSCs used here are generated from human primary skin fibroblasts. Different cells vary in size; for instance, fibroblast cells are generally larger than blood lymphocytes. This raises the question of whether the parent cell origin impacts differences in hiPSCs and hESC proteomes. For example, do the authors anticipate that hiPSCs derived from small somatic cells would also display higher expression of cytoplasmic, mitochondrial, and membrane transporters compared to ESC? The authors may consider discussing this point.

      This is a very interesting point. We have now added an extension to the discussion focussed on this subject.

      (5) One wonders if the "Proteome ruler" approach could be applied retrospectively to previously published ihPSC and hESC proteome data, confirming higher expression of cytoplasmic and mitochondrial proteins in ihPSCs, which may have been masked in previous analyses due to median-based normalization.

      We agree with the reviewer and think this is a very good suggestion. Unfortunately, in the main proteomic papers comparing hESCs and hiPSCs (refs. 16, 17), the authors did not upload their raw files to a public repository (as it was not mandatory at that time), and they also used the International Protein Index (IPI), which is a discontinued database. As a result, the raw files cannot be reprocessed and the database does not match the modern SwissProt entries. Therefore, reprocessing the previous data was impractical.
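
      The masking effect of median-based normalization raised in this comment can be illustrated with a small Python sketch. The copy numbers below are hypothetical and chosen only to make the arithmetic transparent; they are not data from this study or from the earlier papers.

```python
from statistics import median


def median_normalize(values):
    """Divide every value by the sample median."""
    m = median(values)
    return [v / m for v in values]


# Hypothetical per-cell copy numbers for five proteins in one hESC sample.
hesc = [100.0, 200.0, 400.0, 800.0, 1600.0]
# Hypothetical hiPSC sample in which every protein is 1.5x more abundant per cell.
hipsc = [1.5 * v for v in hesc]

# Ratios on the per-cell scale preserve the global 1.5x difference.
raw_ratios = [b / a for a, b in zip(hesc, hipsc)]
# After median normalization the global shift cancels and all ratios become 1.
norm_ratios = [b / a for a, b in zip(median_normalize(hesc), median_normalize(hipsc))]
```

      A uniform increase in protein content per cell therefore disappears after median normalization, which is consistent with the reviewer's point that such a difference could have been masked in earlier analyses.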

      (6) The work raises a fundamental question: what is the mechanistic basis for the higher expression of cytoplasmic and mitochondrial proteins in ihPSCs? Conceivably, this could be due to two reasons: (a) Genes encoding cytoplasmic and mitochondrial proteins are expressed at a higher level in ihPSCs compared to hESC. (b) mRNAs encoding cytoplasmic and mitochondrial proteins are translated at a higher level in ihPSCs compared to hESC. The authors may check published transcriptome data from the same cell lines to shed light on this point.

      This is a very interesting point. We believe that the reprogrammed cells contained mature mitochondria, which are not fully regressed upon reprogramming, and that this can establish a growth advantage in the normoxic environments in which the cells are grown. Unfortunately, the available transcriptomic data lacked spike-ins and thus only enable comparison of concentrations, not of copy numbers (ref. 13). Therefore, we could not determine with the available data whether there was an increase in the copies of specific mRNAs. However, this would be very interesting to analyse in a future study that includes a transcriptomic dataset with spike-ins.

      Reviewer #3 (Recommendations For The Authors):

      It is unclear whether changes in protein levels relate to any phenotypic features of cell lines used. For example, the authors highlight that increased protein expression in hiPSC lines is consistent with the requirement to sustain high growth rates, but there is no data to demonstrate whether hiPSC lines used indeed have higher growth rates.

      We respectfully disagree with the reviewer on this point. Our data show that hESCs and hiPSCs show significant differences in protein mass and cell size, with the MS data validated by the EZQ assay and FACS, while having no significant differences in their cell cycle profiles. Thus, increased size and protein content would require higher growth rates to sustain the increased mass, which is what we observe.

      The authors claim that the cell cycle of the lines is unchanged. However, no details of the method for assessing the cell cycle were included so it is difficult to appreciate if this assessment was appropriately carried out and controlled for.

      We apologise for this omission; the details have been included in the revised version of the manuscript.

      Details and characterisation of iPSC and ESC lines used in this study are overall lacking. The lines used are merely listed in methods, but no references are included for published lines, how lines were obtained, what passage they were used at, their karyotype status etc. For details of basic characterisation, the authors should refer to the ISSCR Standards for the use of human stem cells in research. In particular, the authors should consider whether any of the changes they see may be attributed to copy number variants in different lines.

      We agree with the reviewer on this and refer to the reply above concerning this issue.

      The expression data for markers of undifferentiated state in Figure 1a would ideally be shown by immunocytochemistry or flow cytometry as it is impossible to tell whether cultures are heterogeneous for marker expression.

      We agree with the reviewer on this. FACS is indeed much more quantitative and a better method to study heterogeneity. However, we did not have protocols to study these markers using FACS.

      TEM analysis should ideally be quantified.

      We agree with the reviewer that it would be nice to have a quantitative measure.

      All figure legends should explicitly state what graphs are representing (e.g. average/mean; how many replicates (biological or technical), which lines)? Some data is included in Methods (e.g. glutamine uptake), but not for all of the data (e.g. TEM).

      We agree with the reviewer. These have been corrected in the revised version of the manuscript, with additional details included.

      Validation experiments were performed typically on one or two cell lines, but the lines used were not consistent (e.g. wibj_2 versus H1 for respirometry and wibj_2, oaqd_3 versus SA121 and SA181 for glutamine uptake). Can the authors explain how the lines were chosen?

      The validation experiments were performed at different time points, and the selection of lines reflected the availability of hiPSC and hESC lines within our stem cell facility at a given point in time.

      We chose to use a range of different lines for comparison, rather than always comparing only one set of lines, to try to avoid a possible bias in our conclusions and thus to make the results more general.

      The authors should acknowledge the need for further functional validation of the results related to immunosuppressive proteins.

      We agree with the reviewer and have added a sentence in the discussion making this point explicitly.

      Differences in H1 histones abundance were highlighted. Can the authors speculate as to the meaning of these differences?

      Regarding H1 histones, neither our study of the literature nor discussions with chromatin and histone experts, both within our institute and externally, have shed light on what the differences could imply. We therefore think that this is a striking and interesting result that merits further study, but we have not yet been able to formulate a clear hypothesis on the consequences.

      (1) Howden, A. J. M. et al. Quantitative analysis of T cell proteomes and environmental sensors during T cell differentiation. Nat Immunol, doi:10.1038/s41590-019-0495-x (2019).

      (2) Marchingo, J. M., Sinclair, L. V., Howden, A. J. & Cantrell, D. A. Quantitative analysis of how Myc controls T cell proteomes and metabolic pathways during T cell activation. Elife 9, doi:10.7554/eLife.53725 (2020).

      (3) Damasio, M. P. et al. Extracellular signal-regulated kinase (ERK) pathway control of CD8+ T cell differentiation. Biochem J 478, 79-98, doi:10.1042/BCJ20200661 (2021).

      (4) Salerno, F. et al. An integrated proteome and transcriptome of B cell maturation defines poised activation states of transitional and mature B cells. Nat Commun 14, 5116, doi:10.1038/s41467-023-40621-2 (2023).

      (5) Antico, O., Nirujogi, R. S. & Muqit, M. M. K. Whole proteome copy number dataset in primary mouse cortical neurons. Data Brief 49, 109336, doi:10.1016/j.dib.2023.109336 (2023).

      (6) Edwards, W. et al. Quantitative proteomic profiling identifies global protein network dynamics in murine embryonic heart development. Dev Cell 58, 1087-1105 e1084, doi:10.1016/j.devcel.2023.04.011 (2023).

      (7) Barton, P. R. et al. Super-killer CTLs are generated by single gene deletion of Bach2. Eur J Immunol 52, 1776-1788, doi:10.1002/eji.202249797 (2022).

      (8) Phair, I. R., Sumoreeah, M. C., Scott, N., Spinelli, L. & Arthur, J. S. C. IL-33 induces granzyme C expression in murine mast cells via an MSK1/2-CREB-dependent pathway. Biosci Rep 42, doi:10.1042/BSR20221165 (2022).

      (9) Niu, L. et al. Dynamic human liver proteome atlas reveals functional insights into disease pathways. Mol Syst Biol 18, e10947, doi:10.15252/msb.202210947 (2022).

      (10) Murugesan, G., Davidson, L., Jannetti, L., Crocker, P. R. & Weigle, B. Quantitative Proteomics of Polarised Macrophages Derived from Induced Pluripotent Stem Cells. Biomedicines 10, doi:10.3390/biomedicines10020239 (2022).

      (11) Ryan, D. G. et al. Nrf2 activation reprograms macrophage intermediary metabolism and suppresses the type I interferon response. iScience 25, 103827, doi:10.1016/j.isci.2022.103827 (2022).

      (12) Nicolas, P. et al. Systems-level conservation of the proximal TCR signaling network of mice and humans. J Exp Med 219, doi:10.1084/jem.20211295 (2022).

      (13) Brenes, A. J. et al. Erosion of human X chromosome inactivation causes major remodeling of the iPSC proteome. Cell Rep 35, 109032, doi:10.1016/j.celrep.2021.109032 (2021).

      (14) Wisniewski, J. R., Hein, M. Y., Cox, J. & Mann, M. A "proteomic ruler" for protein copy number and concentration estimation without spike-in standards. Mol Cell Proteomics 13, 3497-3506, doi:10.1074/mcp.M113.037309 (2014).

      (15) Kilpinen, H. et al. Common genetic variation drives molecular heterogeneity in human iPSCs. Nature 546, 370-375, doi:10.1038/nature22403 (2017).

      (16) Phanstiel, D. H. et al. Proteomic and phosphoproteomic comparison of human ES and iPS cells. Nat Methods 8, 821-827, doi:10.1038/nmeth.1699 (2011).

      (17) Munoz, J. et al. The quantitative proteomes of human-induced pluripotent stem cells and embryonic stem cells. Mol Syst Biol 7, 550, doi:10.1038/msb.2011.84 (2011).

    2. eLife Assessment

      This study reports differences in proteomic profiles of embryonic versus induced pluripotent stem cells. This important finding cautions against the interchangeable use of both types of cells in biomedical research, although the mechanisms responsible for these differences remain unknown. The proteomic evidence is convincing, even though there is limited validation with other methods.

    3. Reviewer #1 (Public review):

      Summary:

      The authors compared four types of hiPSCs and four types of hESCs at the proteome level to determine their differences. Semiquantitative calculations of protein copy number revealed increased protein content in iPSCs. In particular, the results suggest that mitochondria- and cytoplasm-associated proteins in iPSCs reflect to some extent the state of the original differentiated cells. The revised manuscript contains responses to almost all comments and adds text, mainly to the discussion. No additional experiments were performed in the revision, but I believe that future validation using methods other than proteomics would provide more support for the results.

      Pros:

      Mitochondrial function was verified by high-resolution respirometry, indicating increased ATP-producing capacity of the phosphorylation system in iPSCs.

      Weaknesses:

      The proteome data in this study may be the result of a simple examination of differences between the clones, and proteome data should be verified using various methods in the future.

    4. Reviewer #2 (Public review):

      Summary:

      Pluripotent stem cells are powerful tools for understanding development, differentiation, and disease modeling. The capacity of stem cells to differentiate into various cell types holds great promise for therapeutic applications. However, ethical concerns restrict the use of human embryonic stem cells (hESCs). Consequently, induced human pluripotent stem cells (ihPSCs) offer an attractive alternative for modeling rare diseases, drug screening, and regenerative medicine. A comprehensive understanding of ihPSCs is crucial to establish their similarities and differences compared to hESCs. This work demonstrates systematic differences in the reprogramming of nuclear and non-nuclear proteomes in ihPSCs.

      Strengths:

      The authors employed quantitative mass spectrometry to compare protein expression differences between independently derived ihPSC and hESC cell lines. Qualitatively, protein expression profiles in ihPSC and hESC were found to be very similar. However, when comparing protein concentration at a cellular level, it became evident that ihPSCs express higher levels of proteins in the cytoplasm, mitochondria, and plasma membrane, while the expression of nuclear proteins is similar between ihPSCs and hESCs. A higher expression of proteins in ihPSCs was verified by an independent approach, and flow cytometry confirmed that ihPSCs had larger cell size than hESCs. The differences in protein expression were reflected in functional distinctions. For instance, the higher expression of mitochondrial metabolic enzymes, glutamine transporters, and lipid biosynthesis enzymes in ihPSCs was associated with enhanced mitochondrial potential, increased ability to take up glutamine, and increased ability to form lipid droplets.

      Weaknesses:

      While this finding is intriguing and interesting, the study falls short of explaining the mechanistic reasons for the observed quantitative proteome differences. It remains unclear whether the increased expression of proteins in ihPSCs is due to enhanced transcription of the genes encoding this group of proteins or due to other reasons, for example, differences in mRNA translation efficiency. Another unresolved question pertains to how the cell type origin influences ihPSC proteomes. For instance, whether ihPSCs derived from fibroblasts, lymphocytes, and other cell types all exhibit differences in their cell size and increased expression of cytoplasmic and mitochondrial proteins. Analyzing ihPSCs derived from different cell types and by different investigators would be necessary to address these questions.

    5. Reviewer #3 (Public review):

      This study provides a useful insight into the proteomic analysis of several human induced pluripotent (hiPSC) and human embryonic stem cell (hESC) lines. Although the study is largely descriptive with limited validation of the differences found in the proteomic screen, the findings provide a solid platform for further mechanistic discovery.

    1. eLife Assessment

      This important study advances our understanding of the temporal dynamics and cortical mechanisms of eye movements and the cognitive process of attention. The evidence supporting the conclusions is convincing and based on measuring the time course of the eye movement-attention interaction in a novel, carefully-controlled experimental task. This study will be of broad interest to psychologists and neuroscientists interested in the dynamics of cognitive processes.

    2. Reviewer #2 (Public review):

      Goldstein et al. provide a thorough characterization of the interaction of attention and eye movement planning. These processes have been thought to be intertwined since at least the development of the Premotor Theory of Attention in 1987, and their relationship has been a continual source of debate and research for decades. Here, Goldstein et al. capitalize on their novel urgent saccade task to dissociate the effects of endogenous and exogenous attention on saccades towards and away from the cue. They find that attention and eye movements are, to some extent, linked to one another but that this link is transient and depends on the nature of the task. A primary strength of the work is that the researchers are able to carefully measure the time course of the interaction between attention and eye movements in various well-controlled experimental conditions. As a result, the behavioral interplay of two forms of attention (endogenous and exogenous) is illustrated at the level of tens of milliseconds as they interact with the planning and execution of saccades towards and away from the cued location. Overall, the results allow the authors to make meaningful claims about the time course of visual behavior, attention, and the potential neural mechanisms at a timescale relevant to everyday human behavior.

    3. Reviewer #3 (Public review):

      The present study used an experimental procedure involving time-pressure for responding, in order to uncover how the control of saccades by exogenous and endogenous attention unfolds over time. The findings of the study indicate that saccade planning is influenced by the locus of endogenous attention, but that this influence was short-lasting and could be overcome quickly. Taken together, the present findings reveal new dynamics between endogenous attention and eye movement control and lead the way for studying them using experiments under time-pressure.

      The results achieved by the present study advance our understanding of vision, eye movements, and their control by brain mechanisms for attention. In addition, they demonstrate how tasks involving time-pressure can be used to study the dynamics of cognitive processes. Therefore, the present study seems highly important not only for vision science, but also for psychology, (cognitive) neuroscience, and related research fields in general.

      I think the authors addressed all of the reviewers' points successfully and in detail, so I don't have any further suggestions or comments.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The main research question could be defined more clearly. In the abstract and at some points throughout the manuscript, the authors indicate that the main purpose of the study was to assess whether the allocation of endogenous attention requires saccade planning [e.g., ll.3-5 or ll.247-248]. While the data show a coupling between endogenous attention and saccades, they do not point to a specific direction of this coupling (i.e., whether endogenous attention is necessary to successfully execute a saccade plan or whether a saccade plan necessarily accompanies endogenous attention).

      Thanks for the suggestion. We have modified the text in the abstract and at various points in the text to make it more clear that the study investigates the relationship between attention and saccades in one particular direction, first attentional deployment and then saccade planning.

      Some of the analyses were performed only on subgroups of the participants. The reporting of these subgroup analyses is transparent and data from all participants are reported in the supplementary figures. Still, these subgroup analyses may make the data appear more consistent, compared to when data is considered across all participants. For instance, the exogenous capture in Experiments 1 and 2 appears much weaker in Figure 2 (subgroup) than Figure S3 (all participants). Moreover, because different subgroups were used for different analyses, it is often difficult to follow and evaluate the results. For instance, the tachometric curves in Figure 2 (see also Figure 3 and 4) show no motor bias towards the cue (i.e., performance was at ~50% for rPTs <75 ms). I assume that the subsequent analyses of the motor bias were based on a very different subgroup. In fact, based on Figure S2, it seems that the motor bias was predominantly seen in the unreliable participants. Therefore, I often found the figures that were based on data across all participants (Figures 7 and S3) more informative to evaluate the overall pattern of results.

      Indeed, our intent was to dissociate the effects on saccade bias and timing as clearly as possible, even if that meant having to parse the data into subgroups of participants for different analyses. We do think conceptually this is the better strategy, because the bias and timing effects were distinct and not strongly correlated with specific participants or task variants. For instance, the unreliable participants were somewhat more consistently biased in the same direction, but the reliable participants also showed substantial biases, so the difference in magnitude was relatively modest. This can be more easily appreciated now that the reliable and unreliable participants are indicated in Figures 3 and 5. The impact of the bias is also discussed further in the last paragraphs of the Results, which note that the bias was not a reliable predictor of overall success during informed choices.

      Reviewer #3 (Public Review):

      (1) In this experimental paradigm, participants must decide where to saccade based on the color of the cue in the visual periphery (they should have made a prosaccade toward a green cue and an antisaccade away from a magenta cue). Thus, irrespective of whether the cue signaled that a prosaccade or an antisaccade was to be made, the identity of the cue was always essential for the task (as the authors explain on p. 5, lines 129-138). Also, the location where the cue appeared was blocked, and thus known to the participants in advance, so that endogenous attention could be directed to the cue at the beginning of a trial (e.g., p. 5, lines 129-132). These aspects of the experimental paradigm differ from the classic prosaccade/antisaccade paradigm (e.g. Antoniades et al., 2013, Vision Research). In the classic paradigm, the identity of the cues does not have to be distinguished to solve the task, since there is only one stimulus that should be looked at (prosaccade) or away from (antisaccade), and whether a prosaccade or antisaccade was required is constant across a block of trials. Thus, in contrast to the present paradigm, in the classic paradigm, the participants do not know where the cue is about to appear, but they know whether to perform a prosaccade or an antisaccade based on the location of the cue.

      The present paradigm keeps the location of the cue constant in a block of trials by intention, because this ensures that endogenous attention is allocated to its location and is not overpowered by the exogenous capture of attention that would happen when a single stimulus appeared abruptly in the visual field. Thus, the reason for keeping the location of the cue constant seems convincing. However, I wondered what consequences the constant location would have for the task representations that persist across the task and govern how attention is allocated. In the classic paradigm, there is always a single stimulus that captures attention exogenously (as it appears abruptly). In a prosaccade block, participants can prioritize the visual transient caused by the stimulus, and follow it with a saccade to its coordinates. In an antisaccade block, following the transient with a saccade would always be wrong, so that participants could try to suppress the attention capture by the transient, and base their saccade on the coordinates of the opposite location. Thus, in prosaccade and antisaccade blocks, the task representations controlling how visual transients are processed to perform the task differ. In the present task, prosaccades and antisaccades cannot be distinguished by the visual transients. Thus, such a situation could favor endogenous attention and increase its influence on saccade planning, even though saccade planning under more naturalistic conditions would be dominated by visual transients. I suggest discussing how this (and vice versa the emphasis on visual transients in the classic paradigm) could affect the generality of the presented findings (e.g., how does this relate to the interpretation that saccade plans are obligatorily coupled to endogenous attention? See, Results, p. 10, lines 306-308, see also Deubel & Schneider, 1996, Vision Research).

      Great discussion point. There are indeed many ways to set up an experiment where one must either look to a relevant cue or look away from it. Furthermore, it is also possible to arrange an experiment where the behavior is essentially identical to that in the classic antisaccade task without ever introducing the idea of looking away from something (Oor et al., 2023). More important than the specific task instructions or the structure of the event sequence, we think the fundamental factors that determine behavior in all of these cases are the magnitudes of the resulting exogenous and endogenous signals, and whether they are aligned or misaligned. Under urgent conditions, consideration of these elements and their relevant time scales explains behavior in a wide variety of tasks (see Salinas and Stanford, 2021). Furthermore, a recent study (Zhu et al., 2024) showed that the activation patterns of neurons in monkey prefrontal cortex during the antisaccade task can be accurately predicted from their stimulus- and saccade-related responses during a simpler task (a memory guided saccade task). This lends credence to the idea that, at the circuit level, the qualities that are critical for target selection and oculomotor performance are the relative strengths of the exogenous and endogenous signals, and their alignment in space and time. If we understand what those signals are, then it no longer matters how they were generated. The Discussion now includes a paragraph on this issue.

      (2) Discussion (p. 16, lines 472-475): The authors suppose that "It is as if the exogenous response was automatically followed by a motor bias in the opposite direction. Perhaps the oculomotor circuitry is such that an exogenous signal can rapidly trigger a saccade, but if it does not, then the corresponding motor plan is rapidly suppressed regardless of anything else.". I think this interesting point should be discussed in more detail. Could it also be that instead of suppression, other currently active motor plans were enhanced? Would this involve attention? Some attention models assume that attention works by distributing available (neuronal) processing resources (e.g., Desimone & Duncan, 1995, Annual Review of Neuroscience; Bundesen, 1990, Psychological Review; Bundesen et al., 2005, Psychological Review) so that the information receiving the largest share of resources results in perception and is used for action, but this happens without the active suppression of information.

      The rebound seen after the exogenously driven changes is certainly interesting, and we agree that it could involve not only the suppression of a specific motor plan but also enhancement of another (opposite) plan. However, we think that, given the lack of prior data with the requisite temporal precision, further elaboration of this point would just be too speculative in the context of the point that we are trying to make, which is simply that the underlying choice dynamics are more rapid and intricate than is generally appreciated.

      (3) Methods, p. 19, lines 593-596: It is reported that saccades were scored based on their direction. I think more information should be provided to understand which eye movements entered the analysis. Was there a criterion for saccade amplitude? I think it would be very helpful to provide data on the distributions of saccade amplitudes or on their accuracy (e.g. average distance from target) or reliability (e.g. standard deviation of landing points). Also, it is reported that some data was excluded from the analysis, and I suggest reporting how much of the data was excluded. Was the exclusion of the data related to whether participants were "reliable" or "unreliable" performers?

      The reported results are based on all saccades (detected according to a velocity threshold) that were produced after the go signal and in a predominantly horizontal direction (within ± 60° of the cue or non-cue), which were the vast majority (> 99%). Indeed, most saccades were directed to the choice targets, with 95% of them within ± 14.2° of the horizontal plane. The excluded (non-scored) trials were primarily fixation breaks plus a small fraction of trials with blinks, which compromised saccade determination. There was no explicit amplitude criterion; applying one (for instance, excluding any saccades with amplitude < 2°) produced minimal changes to the data. Overall, saccade amplitudes were distributed unimodally with a median of 7.7° and a 95% confidence interval of [3.7°, 9.7°], whereas the choice targets were located at ± 8° horizontally. This is now reported in the Methods.
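
      The direction-based scoring rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' analysis code: the function name `score_saccade`, the use of horizontal and vertical displacement in degrees, and the "left"/"right" labels are hypothetical. Only the ± 60° window is taken from the response.

```python
import math

# Angular window around the horizontal axis, taken from the response (± 60°).
MAX_ANGLE_DEG = 60.0

def score_saccade(dx, dy, cue_side):
    """Classify a saccade by direction only (hypothetical helper).

    dx, dy   : horizontal and vertical displacement of the saccade, in degrees
    cue_side : "left" or "right", the side on which the cue appeared
    Returns "cue", "non-cue", or None if the saccade is not scored.
    """
    if dx == 0 and dy == 0:
        return None  # no displacement, so no direction to score
    # Angle between the saccade vector and the horizontal axis, in [0, 90].
    angle_from_horizontal = abs(math.degrees(math.atan2(dy, abs(dx))))
    if angle_from_horizontal > MAX_ANGLE_DEG:
        return None  # predominantly vertical, not scored
    saccade_side = "right" if dx > 0 else "left"
    return "cue" if saccade_side == cue_side else "non-cue"
```

      Note that no amplitude criterion appears in this sketch, which matches the statement that saccades were scored on direction alone.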

      As far as data exclusion, analyses were based on urgent trials (gap > 0); non-urgent (gap < 0) trials were excluded from calculation of the tachometric curves simply because they might correspond to a slightly different regime (go signal after cue onset) and to long processing times in the asymptotic range (rPT in 200–300 ms) or beyond, which are not as informative. However, including them made no appreciable difference to the results. No data were excluded based on participant performance or identity; all psychometric analyses were carried out after the selection of trials based on the scoring criteria described above. This is now stated in the Methods.

      (4) Results, p. 9, lines 262-266: Some data analyses are performed on a subset of participants that met certain performance criteria. The reasons for this data selection seem convincing (e.g. to ensure empirical curves were not flat, line 264). Nevertheless, I suggest to explain and justify this step in more detail. In addition, if not all participants achieved an acceptable performance and data quality, this could also speak to the experimental task and its difficulty. Thus, I suggest discussing the potential implications of this, in particular, how this could affect the studied mechanisms, and whether it could limit the presented findings to a special group within the studied population.

      The ideal (i.e., best) analysis for determining the cost of an antisaccade for each individual participant (Fig. 4c) was based on curve fitting and required task performance to rise consistently above chance at long rPTs in both pro and anti trials. This is why the mentioned conditions on the fits were imposed. This is now explained in the text. This ideal analysis was not viable for all tachometric curves, not only because of task difficulty but also because of high variability or high bias in a particular experiment/condition. It is true that the task was somewhat difficult, but this manifested in various ways across the dataset, so attempting to draw a clean-cut classification of participants based on “difficulty” may not be easy or all that informative (as can be gleaned from Fig. S1). There simply was a range of success levels, as one might expect from any task that requires some nontrivial cognitive processing. Also note that no participants were excluded outright from analysis. Thus, at the mentioned point in the text, we simply note that a complementary analysis is presented later that includes all participants and all conditions and provides a highly consistent result (namely, Fig. 7e). Then, in the last section of the Results, where Fig. 7 is presented, we point out that there is considerable variance in performance at long rPTs, and that it relates to both the bias and the difficulty of the task across participants.

      Reviewer #1 (Recommendations For The Authors):

      (1) I have some questions related to the initial motor bias:

      a) Based on Figure S3, which shows the tachometric curves using data from all participants, there only seems to be a systematic motor bias in Experiments 1 and 3 but no bias in Experiments 2 and 4. It is unclear to me why this is different from the data shown in Figure 7.

      For the bars in Fig. 7, accuracy (% correct) was computed for each participant and then averaged across participants, whereas for the data in Fig. S3, trials were first pooled across participants and then accuracy was computed for each rPT bin. The different averaging methods produce slightly different results because some participants had more trials in the guessing range than others, and different biases.  
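
      The difference between the two averaging methods can be illustrated with a small numerical sketch. The trial counts below are made up for illustration and are not data from the study; the function names are likewise hypothetical.

```python
# Hypothetical trial counts for two participants in one rPT bin
# (illustrative numbers only, not data from the study).
participants = {
    "P1": {"correct": 45, "total": 50},   # 90% correct, few trials
    "P2": {"correct": 75, "total": 150},  # 50% correct, many trials
}

def mean_of_participant_accuracies(data):
    """Compute accuracy per participant, then average across participants."""
    accuracies = [d["correct"] / d["total"] for d in data.values()]
    return sum(accuracies) / len(accuracies)

def pooled_accuracy(data):
    """Pool trials across participants first, then compute one accuracy."""
    correct = sum(d["correct"] for d in data.values())
    total = sum(d["total"] for d in data.values())
    return correct / total

per_participant = mean_of_participant_accuracies(participants)  # (0.9 + 0.5) / 2
pooled = pooled_accuracy(participants)                          # 120 / 200
```

      With these numbers the per-participant average is 0.70 while the pooled accuracy is 0.60, because pooling weights each participant by the number of trials contributed to the bin.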

      b) Based on Figure 7 (and Figure S3), there was no motor bias in Experiment 4. Based on the correlations between motor bias and time difference between pro and antisaccades, I would expect that the rise points between pro and antisaccades would be more similar in this Experiment. Was this the case?

      No. Figs. 3c and S3d show that the rise times of pro and anti trials for Experiment 4 still differ by about 30 ms (around the 75% correct mark), and the rest of the panels in those figures show that the difference is similar for all experiments. What happens is that Figs. 7 and S3 show that on average the bias is zero for Experiment 4, but that does not mean that the average difference in rise times is zero because there is an offset in the data (correlation is not the same as regression). The most relevant evidence is in Fig. 6c, which shows that, for an overall bias of zero, one would still expect a positive difference in rise times of about 25–30 ms. This figure now includes a regression line, and the corresponding text now explains the relationship between bias and rise times more clearly. Thanks for asking; this is an important point that was not sufficiently elaborated before.
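
      The role of the offset can be made concrete with a minimal least-squares sketch. The values below are invented for illustration (chosen so that the intercept falls in the 25–30 ms range mentioned above) and are not the data of Fig. 6c; `fit_line` is a hypothetical helper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical per-participant values: motor bias (arbitrary units) and
# difference in rise time between anti and pro trials (ms).
bias = [-0.2, -0.1, 0.0, 0.1, 0.2]
rise_diff_ms = [20.0, 24.0, 28.0, 32.0, 36.0]

slope, intercept = fit_line(bias, rise_diff_ms)
mean_bias = sum(bias) / len(bias)                  # zero on average
predicted_at_zero_bias = intercept + slope * 0.0   # equals the intercept
```

      In this toy example the bias and the rise-time difference are perfectly correlated and the mean bias is zero, yet the predicted rise-time difference at zero bias is the intercept (28 ms here), not zero.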

      c) If I understand correctly, the initial motor bias was predominantly observed in participants who were classified as 'unreliable performers' (comparing Figure S2 and Figure 2). Was there a correlation between the motor bias and overall success in the task? In other words: Was a strong motor bias generally disadvantageous?

      Good question. Participants classified as ‘unreliable’ were somewhat more consistently biased in the same direction than those classified as ‘reliable’, but the distinction in magnitude was not large. This can be better appreciated now in Fig. 5 by noting the mix of black (reliable) and gray labels (unreliable) along the x axes. The unreliable participants were also, by definition, less accurate in their asymptotic performance in at least one experiment (Fig. S1). In general, however, this classification was used simply to distinguish more clearly the two main effects in the data (timing cost and bias). In fact, the motor bias was not a reliable predictor of performance during informed choices: across all participants, the mean accuracy in the asymptotic range (rPT > 200 ms) had a weak, non-significant correlation with the bias (ρ = ‒0.07, p = 0.7). So, no, the motor bias did not incur an obvious disadvantage in terms of overall success in the task. Its more relevant effect was the asymmetry in performance that it promoted between pro- and antisaccade trials (Fig. 6c). This is now explained at the end of the Results.

      (2) One of the key analyses of the current study is the comparison of the rPT required to make informed pro and antisaccades (ll.246 ff). I think it would be informative for readers to see the results of this analysis separately for all four experiments. For instance, based on Figure 4a and b, it looks like the rise points were actually very similar between pro and antisaccades in Experiment 1.

      We agree that the ideal analysis would be to compute the performance rise point for pro- and antisaccade curves for each experiment and each participant, but as is now noted in the text, this requires a steady and substantial rise in the tachometric curve, which is not always obtained at such a fine-grained level; the underlying variability can be glimpsed from the individual points in Fig. 7a, b. Indeed, in Fig. 4a, b the mean difference between pro and anti rise points appears small for Experiment 1 — but note that the two panels include data from only partially overlapping sets of participants; the figure legend now makes this more clear. Again, this is because the required fitting procedure was not always reliable in both conditions (pro and anti) for a given subject in a given experiment. Thus, panels a and b cannot be directly compared. The key results are those in Fig. 4c, which compare the rise points in the two conditions for the same participants (11 of them, for which both rise points could be reliably determined). In that case the mean difference is evident, and the individual effect consistent for 9 of the 11 participants (as now noted).

      A similar comparison for Experiments 1 or 2 individually would include fewer data points and lose statistical power. However, on average, the results for Experiments 1 and 2 (separately) were indeed very similar; in both cases, the comparison between pro and anti curves pooled across the same qualifying participants as in Fig. 4c produced results that were nearly identical to those of Fig. 4d (as can be inferred from Fig. 2a, b). Furthermore, results for the four individual experiments pooled across all participants are presented in Figure S3, which shows delayed rises in antisaccade performance consistent with the single participant data (Fig. 4c).

      (3) Figure 3: It would be helpful to indicate the reliable performers that were used for Figure 3a in the bar plots in Figure 3b. Same for Figures 3c and d.

      Done. Thanks for the suggestion.

      (4) Introduction: The literature on the link between covert attention and directional biases in microsaccades seems relevant in the context of the current study (e.g., Hafed et al., 2002, Vision Res; Engbert & Kliegl, 2003, Vision Res; Willett & Mayo, 2023, Proc Natl Acad Sci USA).

      Yes, thanks for the suggestion. The introduction now mentions the link between attentional allocation and microsaccade production.

      (5) ll.395ff & Figure 7f: Please clarify whether data were pooled across all four experiments for this analysis.

      Yes, the data were pooled, but a positive trend was observed for each of the four experiments individually. This is now stated.

      (6) ll.432-433: There is evidence that the attentional locus and the actual saccade endpoint can also be dissociated (e.g., Wollenberg et al., 2018, PLoS Biol; Hanning et al., 2019, Proc Natl Acad Sci USA).

      True. We have rephrased accordingly. Thanks for the correction.

      (7) ll.438-440: This sentence is difficult to parse.

      Fixed.

      Reviewer #2 (Recommendations For The Authors):

      The manuscript is well-written and compelling. The biggest issue for me was keeping track of the specifics of the individual experiments. I think some small efforts to reinforce those details along the way would help the reader. For example, in the Figure 3 figure legend, I found the parenthetical phrase "(high luminance cue, low luminance non-cue)" immensely helpful. It would be helpful and trivial to add the corresponding phrase after "Experiment 4" in the same legend.

      Thanks for the suggestion. Legends and/or labels have been expanded accordingly in this and other figures.

      Line 314: "..had any effect on performance,..." Should there be a callout to Figure 2 here?

      Done.

      It wasn't clear to me why the specific high and low luminance values (48 and 0.25) were chosen. I assume there was at least some quick perceptual assessment. If that's the case or if the values were taken from prior work, please include that information.

      Done.

      Reviewer #3 (Recommendations For The Authors):

      Minor points. Please note that the comments made in the public review above are not repeated here.

      (1) Introduction, p. 2, lines 41-45: It is mentioned that the effects of covert attention or a saccade can be quite distinct. I suggest specifying in what way.

      Done.

      (2) Introduction, p. 2, lines 46-47: It is said that the relation between attention and saccade planning was still uncertain and then it is stressed that this was the case for more natural viewing conditions. However, the discussed literature and the experimental approach of the current study still rely on experimental paradigms that are far from natural viewing conditions. Thus, I suggest either discussing the link between these paradigms and natural viewing in more detail or leaving out the reference to natural viewing at this point (I think the latter suggestion would fit the present paper best).

      We followed the latter suggestion.

      (3) Introduction (e.g. p. 3, lines 55-58): The authors discuss the effects that sustaining fixation might have on attention and eye movements. Recently, it has been found that maintaining fixation can ameliorate cognitive conflicts that involve spatial attention (Krause & Poth, 2023, iScience). It seems interesting to include this finding in the discussion, because it supports the authors' view that it is necessary to study fixation and eye movements rather than eye movements alone to uncover their interplay with attention and decision-making.

      Thanks for the reference. The reported finding is certainly interesting, but we find it somewhat tangential to the specific point we make about strong fixation constraints — which is that they suppress internally driven motor activity, including biases, that are highly informative of the relationship between attention and saccade planning (lines 466‒472, 541‒561). Whether fixation state has other subtle consequences for cognitive control is an intriguing, important issue, for sure. But we would rather maintain the readers’ focus on the reasons why less restrictive fixation requirements are relevant for understanding the deployment of attention.

      (4) Results, p. 9, lines 264-266: It is reported that "The rise points were statistically the same across experiments for both prosaccades (p=0.08, n=10, permutation test)...", but the p-value seems quite close to significance. I suggest mentioning this and phrasing the sentence a bit more carefully.

      We now refer to the rise points as “similar”.

      (5) Figure 7 a-d: It might help readers who first skim through the figures before reading the text to use other labels for the bins on the x-axis that spell out the name of the phase in the trial. It might also help to visualize the bins on the plot of a tachometric function (in this case, changing the labels could be unnecessary).

      Thanks for the suggestion. We added an insert to the figure to indicate the correspondence between labels and time bins more intuitively.

      (6) Methods, p. 18, lines 566-567: On some trials, participants received an auditory beep as a feedback stimulus. As this could induce a burst of arousal, I wondered how it affected the subsequent trials.

      This is an interesting issue to ponder. We agree that, in principle, the beep could have an impact on arousal. However, what exactly would be predicted as a consequence? The absence of a beep is meant to increase the urgency of the participant, so some effect of the beep event on RT would be expected anyway as per task instructions. Thus, it is unclear whether an arousal contribution could be isolated from other confounds. That said, three observations suggest that, at most, an independent arousal effect would be very small. First, we have performed multisensory experiments (unpublished) with auditory and visual stimuli, and have found that it is difficult to obtain a measurable effect of sound on an urgent visual choice task unless the experimental conditions are particularly conducive; namely, when the visual stimuli are dim and the sound is loud and lateralized. None of these conditions applies to the standard feedback beep. Second, because most trials are on time, the meaningful feedback signal is conveyed by the absence of the beep. But this signal to alter behavior (i.e., respond sooner) has zero intensity and is therefore unlikely to trigger a strong exogenous, automatic response. Finally, in our data, we can parse the trials that followed a beep (the majority) from those that did not (a minority). In doing so, we found no differences with respect to perceptual performance; only minor differences in RT that were identical for pro- and antisaccade trials. All this suggests to us that it is very unlikely that the feedback alters arousal significantly on specific trials, somehow impacting the tachometric curve (a contribution to general arousal across blocks or sessions is possible, of course, but would be of little consequence to the aims of the study).

      (7) Methods, p. 18, lines 574-577: I suggest referring to the colors or the conditions in the text as it was done in the experiments, just to prevent readers being confused before reading the methods.

      We appreciate the thought, but think that the study is easier to understand by pretending, initially, that the color assignments were fixed. This is a harmless simplification. Mentioning the actual color assignments early on would be potentially more confusing and make the description of the task longer and more contrived.

      (8) Methods, p. 18, Table 1: Given that the authors had a spectrophotometer, I suggest providing (approximate) measurements for the stimulus colors in addition to the luminance (i.e. not just RGB values).

      Unfortunately, we have since switched the monitor in our setup, so we don’t have the exact color measurements for the stimuli used at the time. We will keep the suggestion in mind for future studies though.

      References

      Oor EE, Stanford TR, Salinas E (2023) Stimulus salience conflicts and colludes with endogenous goals during urgent choices. iScience 26:106253.

      Salinas E, Stanford TR (2021) Under time pressure, the exogenous modulation of saccade plans is ubiquitous, intricate, and lawful. Curr Opin Neurobiol 70:154-162.

      Zhu J, Zhou XM, Constantinidis C, Salinas E, Stanford TR (2024) Parallel signatures of cognitive maturation in primate antisaccade performance and prefrontal activity. iScience. doi: https://doi.org/10.1016/j.isci.2024.110488.

    1. eLife Assessment

      The main idea tested in this work is that host galectin-9 inhibits Mycobacterium tuberculosis (Mtb) growth by recognizing the Mtb cell wall component arabinogalactan (AG) and, as a result, disrupting mycobacterial cell wall structure. Moreover, a similar effect is achieved by anti-AG antibodies. While the hypothesis is intriguing and the work has the potential to make a valuable contribution to Mtb therapy, the evidence presented is incomplete and does not explain several critical points including the dose-independent effect of galectin-9 on Mtb growth and how anti-AG antibodies and galectin-9 access the AG layer of intact Mtb.

    2. Reviewer #1 (Public review):

      The molecular interactions which determine infection (and disease) trajectory following human exposure to Mycobacterium tuberculosis (Mtb) are critical to understanding mycobacterial pathogenicity and tuberculosis (TB), a global public health threat which disproportionately impacts a number of high-burden countries and, owing to the emergence of multidrug-resistant Mtb strains, is a major contributor to antimicrobial resistance (AMR). In this submission, Qin and colleagues extend their own previous work which identified a potential role for host galectin-9 in recognizing the major Mtb cell wall component, arabinogalactan (AG). First, the authors present data indicating that galectin-9 inhibits mycobacterial growth during in vitro culture in liquid and on solid media, and that the inhibition depends on carbohydrate recognition by galectin-9. Next, the authors identify anti-AG antibodies in sera of TB patients and use this observation to inform isolation of monoclonal anti-AG antibodies (mAbs) via an in vitro screen. Finally, they apply the identified anti-AG mAbs to inhibit Mtb growth in vitro via a mechanism which proteomic and microscopic analyses suggest is dependent on disruption of cell wall structure. In summary, the dual observation of (i) the apparent role of naturally arising host anti-AG antibodies to control infection and (ii) the potential utility of anti-AG monoclonal antibodies as novel anti-Mtb therapeutics is compelling; however, as noted in the comments below, the evidence presented to support these insights is not adequate and the authors should address the following:

      (1) The experiment which utilizes lactose or glucose supplementation to infer the importance of carbohydrate recognition by galectin-9 cannot be interpreted unequivocally owing to the growth-enhancing effect of lactose supplementation on Mtb during liquid culture in vitro.

      (2) Similar to the comment above, the apparent dose-independent effect of galectin-9 on Mtb growth in vitro is difficult to reconcile with the interpretation that galectin is functioning as claimed.

      (3) The claimed differences in galectin-9 concentration in sera from tuberculin skin test (TST)-negative or TST-positive non-TB cases versus active TB patients are not immediately apparent from the data presented.

      (4) Neither fluorescence microscopy nor electron microscopy analyses are supported by high-quality, interpretable images which, in the absence of supporting quantitative data, renders any claims of anti-AG mAb specificity (fluorescence microscopy) or putative mAb-mediated cell wall swelling (electron microscopy) highly speculative.

      (5) Finally, the absence of any discussion of how anti-AG antibodies (similarly, galectin-9) gain access to the AG layer in the outer membrane of intact Mtb bacilli (which may additionally possess an extracellular capsule/coat) is a critical omission - situating these results in the context of current knowledge about Mtb cellular structure (especially the mycobacterial outer membrane) is essential for plausibility of the inferred galectin-9 and anti-AG mAb activities.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors extend the observation, reported in their 2021 EMBO Reports manuscript, that galectin-9 interacts with arabinogalactans of Mtb. Here they provide evidence that the CARD2 domain of galectin-9 can inhibit the growth of Mtb in culture. In addition, antibodies that also bind to AG appear to inhibit Mtb growth in culture. These data indicate that, independent of the common cell-associated responses to galectin-9 and antibodies, interaction of these proteins with AG of mycobacteria may have consequences for bacterial growth.

      Strengths:

      The authors provided several lines of evidence in culture media that the introduction of galectin-9 proteins and antibodies inhibits the growth rate of Mtb.

      Weaknesses:

      The methodology for generating and screening the anti-AG antibodies lacks pertinent details for recapitulating and interpreting the results.

      The figure legends and methods associated with the microscopy assays lack sufficient details to appropriately interpret the experiments conducted.

      The galectin-9 measured in the sera of TB patients does not approach the concentrations required for Mtb growth restriction in the in vitro assays performed by the authors. It remains difficult to envision how greater levels of galectin-9 release might contribute to Mtb control in severe forms of TB, since higher levels of serum Gal9 have been observed in other human studies and correlate with poorly controlled infection. The authors over-interpret the role of Gal9 in bacterial control during disease/infection without any evidence of impact on in vivo (animal model) control.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Question 1: The experiment that utilizes lactose or glucose supplementation to infer the importance of carbohydrate recognition by galectin-9 cannot be interpreted unequivocally owing to the growth-enhancing effect of lactose supplementation on Mtb during liquid culture in vitro.

      Thank you for this very constructive comment. We repeated the experiments, lowering the concentration of lactose or AG from 10 μg/mL to 1 μg/mL. We found that a low concentration of lactose or AG had a negligible effect on Mtb growth but still reversed the inhibitory effect of galectin-9 on mycobacterial growth (revised Fig. 2A, C). Therefore, we consider that supplementation with lactose or AG reverses galectin-9-mediated inhibition of Mtb growth largely by competing for carbohydrate recognition by galectin-9 rather than through a growth-enhancing effect.

      Question 2: Similar to the comment above, the apparent dose-independent effect of galectin-9 on Mtb growth in vitro is difficult to reconcile with the interpretation that galectin is functioning as claimed.

      We thank the reviewer for the correction. Indeed, as the reviewer pointed out, galectin-9 inhibits Mtb growth in a dose-independent manner. We have corrected the claim in the revised manuscript (Line 114).

      Question 3: The claimed differences in galectin-9 concentration in sera from tuberculin skin test (TST)-negative or TST-positive non-TB cases versus active TB patients are not immediately apparent from the data presented.

      We appreciate your concern. The previous samples are from a cohort set up at the Max Planck Institute for Infection Biology. We have now measured galectin-9 in sera from another, independent cohort of active TB patients and healthy donors in China, and we found a higher abundance of galectin-9 in serum from TB patients than in serum from healthy donors (revised Fig. 1E).

      Question 4: Neither fluorescence microscopy nor electron microscopy analyses are supported by high-quality, interpretable images which, in the absence of supporting quantitative data, renders any claims of anti-AG mAb specificity (fluorescence microscopy) or putative mAb-mediated cell wall swelling (electron microscopy) highly speculative.

      We appreciate your concern. We have improved the procedure of the immunofluorescence assay and obtained high-quality, interpretable images with quantitative data (revised Fig. 4F). As for the electron microscopy analyses, we added clearer labels indicating the cell wall in the revised manuscript (revised Fig. 7C).

      Question 5: Finally, the absence of any discussion of how anti-AG antibodies (similarly, galectin-9) gain access to the AG layer in the outer membrane of intact Mtb bacilli (which may additionally possess an extracellular capsule/coat) is a critical omission - situating these results in the context of current knowledge about Mtb cellular structure (especially the mycobacterial outer membrane) is essential for plausibility of the inferred galectin-9 and anti-AG mAb activities.

      Indeed, AG is hidden by mycolic acids in the outer layer of the Mtb cell wall. As noted in the Discussion of the previous manuscript (line 285), we speculate that during Mtb replication, cell wall synthesis is active and AG becomes exposed, thereby facilitating its binding to galectin-9 or AG antibody and leading to Mtb growth arrest. It is highly possible that galectin-9 or AG antibody targets replicating Mtb.

      To Reviewer #2 (Public Review):

      Question 1: In light of other observations that cleaved galectin-9 levels in the plasma are a biomarker for severe infection (Padilla A et al. Biomolecules 2021 and Iwasaki-Hozumi H et al. Biomolecules 2021), it is difficult to reconcile the authors' interpretation that the elevated gal-9 in Active TB patients (Figure 1E) contributes to the maintenance of latent infection in humans. The authors should consider incorporating these observations in the interpretation of their own results.

      Thank you for these very insightful comments. We observed elevated levels of galectin-9 in the serum of active TB patients, consistent with reports indicating that cleaved galectin-9 levels in the serum serve as a biomarker for severe infection (Iwasaki-Hozumi et al., 2021; Padilla et al., 2020). We consider that the elevated levels of galectin-9 in the serum of active TB patients may be an indicator of the host immune response to Mtb infection; however, the magnitude of the elevation is not sufficient to control Mtb infection and maintain latent infection. This is highly similar to other protective immune factors such as interferon gamma, which is elevated in active TB as well (El-Masry et al., 2007; Hasan et al., 2009). We have included this discussion in the revised manuscript (line 298).

      Question 2: The anti-AG titers were measured only in individuals with active TB (Figure 3C), generally thought to be a less protective immunological state. The speculation that individuals with anti-AG titers have some protection is not founded. Further only 2 mAbs were tested to demonstrate restriction of Mtb in culture. It is possible that clones of different affinities for AG present within a patient's polyclonal AG-antibody responses may or may not display a direct growth restriction pressure on Mtb in culture. The authors should soften the claims about the presence of AG-titers in TB patients being indicative of protection.

      We appreciate your concern. As per your suggestion, we have softened the claim to: “We speculate that during Mtb infection, anti-AG IgG antibodies are induced, which potentially contribute to protection against TB by directly inhibiting Mtb replication albeit seemingly in vain.”

      References

      El-Masry, S., Lotfy, M., Nasif, W.A., El-Kady, I.M., and Al-Badrawy, M. (2007). Elevated serum level of interleukin (IL)-18, interferon (IFN)-gamma and soluble Fas in patients with pulmonary complications in tuberculosis. Acta microbiologica et immunologica Hungarica 54, 65-77.

      Hasan, Z., Jamil, B., Khan, J., Ali, R., Khan, M.A., Nasir, N., Yusuf, M.S., Jamil, S., Irfan, M., and Hussain, R. (2009). Relationship between circulating levels of IFN-gamma, IL-10, CXCL9 and CCL2 in pulmonary and extrapulmonary tuberculosis is dependent on disease severity. Scandinavian journal of immunology 69, 259-267.

      Iwasaki-Hozumi, H., Chagan-Yasutan, H., Ashino, Y., and Hattori, T. (2021). Blood Levels of Galectin-9, an Immuno-Regulating Molecule, Reflect the Severity for the Acute and Chronic Infectious Diseases. Biomolecules 11.

      Padilla, S.T., Niki, T., Furushima, D., Bai, G., Chagan-Yasutan, H., Telan, E.F., Tactacan-Abrenica, R.J., Maeda, Y., Solante, R., and Hattori, T. (2020). Plasma Levels of a Cleaved Form of Galectin-9 Are the Most Sensitive Biomarkers of Acquired Immune Deficiency Syndrome and Tuberculosis Coinfection. Biomolecules 10.

    1. eLife Assessment

      This important work uses in vivo foveal cone-resolved imaging and simultaneous microscopic photostimulation to investigate the relationship between ocular drift - eye movements long thought to be random - and visual acuity. The surprising result is that ocular drift is systematic - causing the object to move to the center of the cone mosaic over the course of each perceptual trial. The tools used to reach this conclusion are state-of-the-art and the evidence presented is convincing. This work advances our understanding of the visuomotor system and the interplay of anatomy, oculomotor behavior, and visual acuity.

    2. Reviewer #1 (Public review):

      Summary:

      This paper investigates the relationship between ocular drift - eye movements long thought to be random - and visual acuity. This is a fundamental issue for how vision works. The work uses adaptive optics retinal imaging to monitor eye movements and where a target object is in the cone photoreceptor array. The surprising result is that ocular drift is systematic - causing the object to move to the center of the cone mosaic over the course of each perceptual trial. The tools used to reach this conclusion are state-of-the-art and the evidence presented is convincing.

      Strengths

      The central question of the paper is interesting, as far as I know, it has not been answered in past work, and the approaches employed in this work are appropriate and provide clear answers.

      The central finding - that ocular drift is not a completely random process - is important and has a broad impact on how we think about the relationship between eye movements and visual perception.

      The presentation is quite nice: the figures clearly illustrate key points and have a nice mix of primary and analyzed data, and the writing (with one important exception) is generally clear.

      Weaknesses

      The primary concern I had about the previous version of the manuscript was how the Nyquist limit was described. The changes the authors made have improved this substantially in the current version.

    3. Reviewer #2 (Public review):

      Summary:

      In this work, Witten et al. assess visual acuity, cone density, and fixational behavior in the central foveal region in a large number of subjects. This work elegantly presents a number of important findings, and I can see this becoming a landmark work in the field. First, it shows that acuity is determined by the cone mosaic, hence, subjects characterized by higher cone densities show higher acuity in diffraction limited settings. Second, it shows that humans can achieve higher visual resolution than what is dictated by cone sampling, suggesting that this is likely the result of fixational drift, which constantly moves the stimuli over the cone mosaic. Third, the study reports a correlation between the amplitude of fixational motion and acuity, namely, subjects with smaller drifts have higher acuities and higher cone density. Fourth, it is shown that humans tend to move the fixated object toward the region of higher cone density in the retina, lending further support to the idea that drift is not a random process, but is likely controlled. This is a beautiful and unique work that furthers our understanding of the visuomotor system and the interplay of anatomy, oculomotor behavior, and visual acuity.

      Strengths:

      The work is rigorously conducted, it uses state-of-the-art technology to record fixational eye movements while imaging the central fovea at high resolution, and examines exactly where the viewed stimulus falls on individuals' foveal cone mosaic with respect to different anatomical landmarks in this region. Figures are clear and nicely packaged. It is important to emphasize that this study is a real tour-de-force in which the authors collected a massive amount of data on 20 subjects. This is particularly remarkable considering how challenging it is to run psychophysics experiments using this sophisticated technology. Most of the studies using psychophysics with AO are, indeed, limited to a few subjects. Therefore, this work shows a unique set of data, filling a gap in the literature.

      Weaknesses:

      Data analysis has been improved after the first round of review. The revised version of the manuscript is solid, and there are no weaknesses that should be addressed. The authors added more statistical tests and analyses, reported comparable effects even when different metrics are used (e.g., diffusion constant), and removed the confusing text on myopia. I think this work represents a significant scientific contribution to vision science.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Witten et al., aims to investigate the link between acuity thresholds (and hyperacuity) and retinal sampling. Specifically, using in vivo foveal cone-resolved imaging and simultaneous microscopic photo stimulation, the researchers examined visual acuity thresholds in 16 volunteers and correlated them with each individual's retinal sampling capacity and the characteristics of ocular drift.

      First, the authors found that although visual acuity was highly correlated with the individual spatial arrangement of cones, for all participants, visual resolution exceeded the Nyquist sampling.

      Thus, the researchers hypothesized that this increase in acuity, which could not be explained in terms of spatial encoding mechanisms, might result from exploiting the spatiotemporal characteristics of the visual input associated with the dynamics of the fixational eye movements (and ocular drift in particular).

      The authors reported a correlation between acuity threshold and drift amplitude, suggesting that the visual system benefits from transforming spatial input into a spatiotemporal flow. Finally, they showed that drift, contrary to the traditional view of it as random involuntary movement, appears to exhibit directionality: drift tends to move stimuli to higher cone density areas, therefore enhancing visual resolution.

      I find the work of broad interest, its methods are clear, and the results solid.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:  

      This paper investigates the relationship between ocular drift - eye movements long thought to be random - and visual acuity. This is a fundamental issue for how vision works. The work uses adaptive optics retinal imaging to monitor eye movements and where a target object is in the cone photoreceptor array. The surprising result is that ocular drift is systematic - causing the object to move to the center of the cone mosaic over the course of each perceptual trial. The tools used to reach this conclusion are state-of-the-art and the evidence presented is convincing.

      Strengths  

      P1.1. The central question of the paper is interesting, as far as I know, it has not been answered in past work, and the approaches employed in this work are appropriate and provide clear answers.

      P1.2. The central finding - that ocular drift is not a completely random process - is important and has a broad impact on how we think about the relationship between eye movements and visual perception.

      P1.3. The presentation is quite nice: the figures clearly illustrate key points and have a nice mix of primary and analyzed data, and the writing (with one important exception) is generally clear.

      Thank you for your positive feedback.

      Weaknesses

      P1.4. The handling of the Nyquist limit is confusing throughout the paper and could be improved. It is not clear (at least to me) how the Nyquist limit applies to the specific task considered. I think of the Nyquist limit as saying that spatial frequencies above a certain cutoff set by the cone spacing are being aliased and cannot be disambiguated from the structure at a lower spatial frequency. In other words, there is a limit to the spatial frequency content that can be uniquely represented by discrete cone sampling locations. Acuity beyond that limit is certainly possible with a stationary image - e.g. a line will set up a distribution of responses in the cones that it covers, and without noise, an arbitrarily small displacement of the line would change the distribution of cone responses in a way that could be resolved. This is an important point because it relates to whether some kind of active sampling or movement of the detectors is needed to explain the spatial resolution results in the paper. This issue comes up in the introduction, results, and discussion. It arises in particular in the two Discussion paragraphs starting on line 343.

      We thank you for pointing out a possible confusion for readers. Overall, we contrast our results to the static Nyquist limit because it is generally regarded as the upper limit of resolution acuity. We updated our text in a few places, especially the Discussion, and added a reference to make our use of the Nyquist limit clearer.

      We agree with the reviewer on how the Nyquist limit is interpreted within the context of visual structure. If visual structure is under-sampled, it is not lost, but creates new, interfered visual structure at lower spatial frequency. For regular patterns like gratings, interference patterns may emerge akin to Moiré patterns, which have been shown to occur in the human eye, and whose form depends on the arrangement and regularity of the photoreceptor mosaic (Williams, 1985). We note, however, that the successfully resolved lower-frequency pattern does not necessarily carry the same structural information, specifically orientation, and the aliased structure might indeed mask the original stimulus. Please compare Figure 1f, where we show individual static snapshots of such aliased patterns, especially visible when the optotypes are small (towards the lower right of the figure). We note that theoretical work predicts that, with prior knowledge about the stimulus, even such static images might be possible to de-alias (Ruderman & Bialek, 1992). We added this to our manuscript.
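
      The aliasing argument above can be illustrated with a one-dimensional sketch. It assumes idealized, equally spaced point samplers, which is a strong simplification of the two-dimensional cone mosaic, and the spacing and grating frequencies are hypothetical values chosen for illustration, not measurements from the study.

```python
import math

def nyquist_limit(spacing):
    """Highest spatial frequency (cycles per unit length) that samplers
    separated by `spacing` can represent without ambiguity."""
    return 1.0 / (2.0 * spacing)

def alias_frequency(freq, spacing):
    """Apparent frequency of a grating of frequency `freq` once it is
    sampled every `spacing` units (folding about the sampling rate)."""
    fs = 1.0 / spacing
    folded = freq % fs
    return min(folded, fs - folded)

def sample_grating(freq, spacing, n):
    """Sample cos(2*pi*freq*x) at x = 0, spacing, 2*spacing, ..."""
    return [math.cos(2.0 * math.pi * freq * k * spacing) for k in range(n)]

spacing = 0.5                      # hypothetical sampler spacing (arcmin)
limit = nyquist_limit(spacing)     # cycles/arcmin
above = 1.6                        # hypothetical grating above the limit
alias = alias_frequency(above, spacing)

# The under-sampled grating and its low-frequency alias produce the same
# sample values, so the static samples alone cannot tell them apart.
a = sample_grating(above, spacing, 20)
b = sample_grating(alias, spacing, 20)
identical = all(abs(x - y) < 1e-9 for x, y in zip(a, b))
print(limit, round(alias, 3), identical)
```

      In this toy case the structure above the limit is not lost but reappears at a lower frequency, which is the sense in which the static Nyquist limit is used as a reference in the response.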

      We think the reviewer’s following point, about the resolution of a line position, is only partially connected to the first, however. In our manuscript we note in the Introduction that resolution of the relative position of visual objects is a so-called hyperacuity phenomenon. The fact that it occurs in humans and other animals demonstrates that visual brains have come up with neuronal mechanisms to determine relative stimulus position with sub-Nyquist resolution. The exact mechanism is, however, not fully clear. One solution is that relative cone signal intensities could be harnessed, similar to what is employed technically, e.g., in a quadrant-cell detector. Its positional precision is much higher than the individual cell’s size (or Nyquist limit), and is predominantly determined by the detector’s sensitivity and to a lesser degree its size. On the other hand, such a detector, being hyperacute with object location, would not have the same resolution for, say, letter-E orientation discrimination.
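
      The quadrant-cell analogy can be made concrete with a minimal, noise-free sketch. It assumes a Gaussian light spot of known width falling on four quadrants that meet at the origin; the function names, spot width, and displacement are hypothetical and only illustrate how relative signal intensities encode position far below the size of a single detector element.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def quadrant_signals(x0, y0, sigma):
    """Light collected by each of four quadrants (split at the origin)
    from a Gaussian spot centred at (x0, y0); total light is 1."""
    p_right = STD_NORMAL.cdf(x0 / sigma)  # fraction of light with x > 0
    p_top = STD_NORMAL.cdf(y0 / sigma)    # fraction of light with y > 0
    return {
        "top_right": p_right * p_top,
        "top_left": (1.0 - p_right) * p_top,
        "bottom_left": (1.0 - p_right) * (1.0 - p_top),
        "bottom_right": p_right * (1.0 - p_top),
    }

def estimate_position(q, sigma):
    """Recover the spot centre from the relative quadrant intensities."""
    total = sum(q.values())
    right = (q["top_right"] + q["bottom_right"]) / total
    top = (q["top_right"] + q["top_left"]) / total
    return (sigma * STD_NORMAL.inv_cdf(right),
            sigma * STD_NORMAL.inv_cdf(top))

# Hypothetical values: quadrants of order 1 unit, spot displaced by 1-2 %
# of that size.
sigma = 0.3
signals = quadrant_signals(0.01, -0.02, sigma)
est = estimate_position(signals, sigma)
print(est)
```

      Without noise the displacement is recovered essentially exactly; with noise, the achievable precision would depend on how reliably the four intensities are measured, in line with the remark that sensitivity rather than element size sets the limit.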

      Note that in all of the above cases, a static image-sensor relationship is assumed. In our paper, we aimed to convey, as others have before, that a moving stimulus may give rise to sub-Nyquist structural resolution, beyond what is already known for positional acuity and hence, classical hyperacuity.

      Based on the data shown in this manuscript and other experimental data currently collected in the lab, it seems to us that eye movements are indeed the crucial point in achieving sub-Nyquist resolution. For example, ultra-short presentation durations, allowing virtually no retinal slip, push thresholds close to the Nyquist limit and above. Furthermore, with AOSLO stimulation, it is possible to stabilize a stimulus on the retina, which would be a useful tool for studying this hypothesis. Our current level of stabilization is, however, not accurate enough to completely mitigate retinal image motion in the foveola, where cells are smallest, and transients could occur. From what we observe, and from other studies that looked at resolution thresholds at more peripheral retinal locations, we would predict that foveolar resolution of a perfectly stabilized stimulus would indeed be limited by the Nyquist limit of the receptor mosaic.

      P1.5. One question that came up as I read the paper was whether the eye movement parameters depend on the size of the E. In other words, to what extent is ocular drift tuned to specific behavioral tasks?

      This is an interesting question. Unfortunately, the experimental data collected for the current manuscript do not contain enough dispersion in target size to give a definitive answer. A larger range of stimulus sizes and especially a similar number of trials per size would be required. Nonetheless, when individual trials were re-grouped into percentiles of all stimulus sizes (scaled for each eye individually), we found that drift length and directionality were not significantly different between any percentile groups of stimulus sizes (Wilcoxon signed-rank test, p > 0.12, see also Figure R1). Our experimental trials started with a stimulus demanding visual acuity of 20/16 (logMAR = -0.1), therefore all presented stimulus sizes were rather close to threshold. The high visual demand in this AO resolution task might bring the oculomotor system to a limit, where ocular drift length can’t be decreased further. However, given the limitation imposed by the small range of stimulus sizes, further investigations would be needed. Given this, and because this topic is the subject of ongoing research in our lab in which more complex dynamics of FEM patterns are also considered, we refrain from showing this analysis in the current manuscript.

      Author response image 1.

      Drift length does not depend on stimulus sizes close to threshold. All experimental trials were sorted by stimulus size and then grouped into percentiles for each participant (left). Additionally, 10 % of trials with stimulus sizes just above or below threshold are shown for comparison (right). For each group, median drift lengths (z-scored) are shown as box and whiskers plot. Drift length was not significantly different across groups.  

      Reviewer #2 (Public Review):

      Summary:

      In this work, Witten et al. assess visual acuity, cone density, and fixational behavior in the central foveal region in a large number of subjects.

      This work elegantly presents a number of important findings, and I can see this becoming a landmark work in the field. First, it shows that acuity is determined by the cone mosaic, hence, subjects characterized by higher cone densities show higher acuity in diffraction-limited settings. Second, it shows that humans can achieve higher visual resolution than what is dictated by cone sampling, suggesting that this is likely the result of fixational drift, which constantly moves the stimuli over the cone mosaic. Third, the study reports a correlation between the amplitude of fixational motion and acuity, namely, subjects with smaller drifts have higher acuities and higher cone density. Fourth, it is shown that humans tend to move the fixated object toward the region of higher cone density in the retina, lending further support to the idea that drift is not a random process, but is likely controlled. This is a beautiful and unique work that furthers our understanding of the visuomotor system and the interplay of anatomy, oculomotor behavior, and visual acuity.

      Strengths:

      P2.1. The work is rigorously conducted, it uses state-of-the-art technology to record fixational eye movements while imaging the central fovea at high resolution and examines exactly where the viewed stimulus falls on individuals' foveal cone mosaic with respect to different anatomical landmarks in this region. The figures are clear and nicely packaged. It is important to emphasize that this study is a real tour-de-force in which the authors collected a massive amount of data on 20 subjects. This is particularly remarkable considering how challenging it is to run psychophysics experiments using this sophisticated technology. Most of the studies using psychophysics with AO are, indeed, limited to a few subjects. Therefore, this work shows a unique set of data, filling a gap in the literature.

      Thank you, we are very grateful for your positive feedback.

      Weaknesses:

      P2.2. No major weakness was noted, but data analysis could be further improved by examining drift instantaneous direction rather than start-point-end-point direction, and by adding a statistical quantification of the difference in direction tuning between the three anatomical landmarks considered.

      Thank you for these two suggestions. We now show the development of directionality with time (after the first frame, 33 ms as well as 165 ms, 330 ms and 462 ms), and performed a Rayleigh test for non-uniformity of circular data. Please also see our response to comment R2.4.

      Briefly, directional tuning was already visible at 33 ms after stimulus onset and continuously increases with longer analysis duration. Directionality is thus not pronounced at shorter analysis windows. These results have been added to the text and figures (Figure 4 - figure supplement 1).

      The statistical tests showed that circular sample directionality was not uniformly distributed for all three retinal locations. The circular average was between -10 ° and 10 ° in all cases and the variance decreased with increasing time (from 48.5 ° to 34.3 ° for CDC, 49.6 ° to 38.6 ° for PRL and 53.9 ° to 43.4 ° for PCD location, between frame 2 and 15). As we have discussed in the paper, we would expect all three locations to come out as significant, given their vicinity to the CDC (which is systematic in the case of PRL, and random in the case of PCD, see also comment R2.2).
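
      As an illustration of the kind of test referred to above, the following is a minimal sketch of a Rayleigh test for non-uniformity of circular data. It assumes drift directions are expressed in degrees relative to the landmark of interest; the function name and the sample angles are hypothetical, and the p-value uses a common closed-form approximation, which is not necessarily the implementation used for the reported analysis.

```python
import math

def rayleigh_test(angles_deg):
    """Rayleigh test sketch: returns (mean direction in deg, resultant length, p)."""
    n = len(angles_deg)
    # Mean cosine and sine components of the unit vectors.
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    r = math.hypot(c, s)                    # mean resultant length, 0..1
    mean_dir = math.degrees(math.atan2(s, c))
    big_r = n * r                           # resultant length (not normalized)
    # Closed-form approximation of the p-value (assumption: adequate for
    # illustration; exact tables or a statistics package may differ slightly).
    p = math.exp(math.sqrt(1 + 4 * n + 4 * (n * n - big_r * big_r)) - (1 + 2 * n))
    return mean_dir, r, p

# Hypothetical samples: directions clustered around 0 deg vs. evenly spread.
clustered = [-20, -10, -5, 0, 5, 10, 20] * 5
spread = [0, 90, 180, 270] * 5

print(rayleigh_test(clustered))
print(rayleigh_test(spread))
```

      A small p-value indicates that the directions are unlikely to come from a uniform circular distribution; the test by itself does not say which landmark the directions point to, which is why the circular mean is reported alongside it.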

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Witten et al., titled "Sub-cone visual resolution by active, adaptive sampling in the human foveola," aims to investigate the link between acuity thresholds (and hyperacuity) and retinal sampling. Specifically, using in vivo foveal cone-resolved imaging and simultaneous microscopic photostimulation, the researchers examined visual acuity thresholds in 16 volunteers and correlated them with each individual's retinal sampling capacity and the characteristics of ocular drift.

      First, the authors found that although visual acuity was highly correlated with the individual spatial arrangement of cones, for all participants, visual resolution exceeded the Nyquist sampling limit - a well-known phenomenon in the literature called hyperacuity.

      Thus, the researchers hypothesized that this increase in acuity, which could not be explained in terms of spatial encoding mechanisms, might result from exploiting the spatiotemporal characteristics of visual input, which is continuously modulated over time by eye movements even during so-called fixations (e.g., ocular drift).

      Authors reported a correlation between subjects, between acuity threshold and drift amplitude, suggesting that the visual system benefits from transforming spatial input into a spatiotemporal flow. Finally, they showed that drift, contrary to the traditional view of it as random involuntary movement, appears to exhibit directionality: drift tends to move stimuli to higher cone density areas, therefore enhancing visual resolution.

      Strengths:

      P3.1. The work is of broad interest, the methods are clear, and the results are solid.

      Thank you.

      Weaknesses:

      P3.2. Literature (1/2): The authors do not appear to be aware of an important paper published in 2023 by Lin et al. (https://doi.org/10.1016/j.cub.2023.03.026), which nicely demonstrates that (i) ocular drifts are under cognitive influence, and (ii) specific task knowledge influences the dominant orientation of these ocular drifts even in the absence of visual information. The results of this article are particularly relevant and should be discussed in light of the findings of the current experiment.

      Thank you for pointing to this important work which we were aware of. It simply slipped through during writing. It is now discussed in lines 390-393. 

      P3.3. Literature (2/2): The hypothesis that hyperacuity is attributable to ocular movements has been proposed by other authors and should be cited and discussed (e.g., https://doi.org/10.3389/fncom.2012.00089, https://doi.org/10.10

      Thank you for pointing us towards these works which we have now added to the Discussion section. We would like to stress however, that we see a distinction between classical hyperacuity phenomena (Vernier, stereo, centering, etc.) as a form of positional acuity, and orientation discrimination.  

      P3.4. Drift Dynamic Characterization: The drift is primarily characterized as the "concatenated vector sum of all frame-wise motion vectors within the 500 ms stimulus duration." To better compare with other studies investigating the link between drift dynamics and visual acuity (e.g., Clark et al., 2022), it would be interesting to analyze the drift-diffusion constant, which might be the parameter most capable of describing the dynamic characteristics of drift.

      During our analysis, we have computed the diffusion coefficient (D) and it showed qualitatively similar results to the drift length (see figures below). We decided to not show these results, because we are convinced that D is indeed not the most capable parameter to describe the typical drift characteristic seen here. The diffusion coefficient is computed as the slope of the mean square displacement (MSD). In our view, there are two main issues with applying this metric to our data, one conceptual, one factual:

      (1) Computation of a diffusion coefficient is based upon the assumption that the underlying movement is similar to a random walk process. From a historical perspective, where drift has been regarded as more random, this makes sense. We also agree that D can serve as a valuable metric, depending on the individual research question. In our data, however, we clearly show that drift is not random, and a metric quantifying randomness is thus ill-defined. 

      (2) We often observed out- and in-type motion traces, i.e. where the eye somewhat backtracks from where it started. Traces in this case are as long (and as fast) as traces that move in a single direction, but D would be much smaller, as the MSD first increases and then decreases. In reality, the same number of cones would have been traversed as with the larger D of straight outward movement, albeit not unique cones. For our current analyses, the drift length captures this relationship better.
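
      The second point can be illustrated with a toy calculation. The sketch below compares a straight outward trace with an out-and-back trace of the same path length. It is a simplification under stated assumptions: the traces are hypothetical, units are arbitrary, and the squared displacement from the start position of a single trace stands in for a properly averaged MSD (a full analysis would average over trials or lags and convert the slope to D).

```python
import math

def path_length(trace):
    """Drift length: sum of all frame-to-frame displacements."""
    return sum(math.dist(p, q) for p, q in zip(trace, trace[1:]))

def amplitude(trace):
    """Start-point to end-point distance."""
    return math.dist(trace[0], trace[-1])

def squared_displacement(trace):
    """Squared distance from the start position at every frame."""
    x0, y0 = trace[0]
    return [(x - x0) ** 2 + (y - y0) ** 2 for x, y in trace]

def ls_slope(ys):
    """Least-squares slope of ys against frame index 0..n-1."""
    n = len(ys)
    mx = (n - 1) / 2
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in enumerate(ys))
    den = sum((x - mx) ** 2 for x in range(n))
    return num / den

# Hypothetical traces, one unit step per frame, 10 steps each.
straight = [(float(i), 0.0) for i in range(11)]
out_and_back = ([(float(i), 0.0) for i in range(6)]
                + [(float(i), 0.0) for i in range(4, -1, -1)])

for name, tr in (("straight", straight), ("out_and_back", out_and_back)):
    print(name, path_length(tr), amplitude(tr), ls_slope(squared_displacement(tr)))
```

      Both traces cover the same distance, yet the fitted slope of the squared displacement is large for the straight trace and close to zero for the backtracking one, which is the mismatch described in point (2).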

      Author response image 2.

      Diffusion coefficient (D) and the relation to visual acuity (see Figure 3 e-g for comparison to drift length). a, D was strongly correlated between fellow eyes. b, Cone density and D were not significantly correlated. c, The median D had a moderate correlation with visual acuity thresholds in dominant as well as non-dominant eyes. Dominant eyes are indicated by filled, nondominant eyes by open markers.

      We would like to put forward that, in general, better metrics are needed, especially in respect to the visual signals arising from the moving eye. We are actively looking into this in follow-up work, and we hope that the current manuscript might spark also others to come up with new ways of characterizing the fine movements of the eye during fixation.

      P3.5. Possible inconsistencies: Binocular differences are not expected based on the hypothesis; the authors may speculate a bit more about this. Additionally, the fact that hyperacuity does not occur with longer infrared wavelengths but the drift dynamics do not vary between the two conditions is interesting and should be discussed more thoroughly.

      Binocularity: the differences in performance between fellow eyes are rather subtle, and we do not have a firm grip on differences other than the cone mosaic and fixational motor behavior between the two eyes. We would rather not speculate beyond what we already do, namely that some factor related to the development of ocular dominance is at play. What we do show with our data is that cone density and drift patterns seem to have no part in it.

      Effect of wavelength: even with the longer 840 nm wavelength, most eyes resolve below the Nyquist limit, with a general increase in thresholds (getting worse) compared to 788 nm. As we wrote in the manuscript, we assume that the increased image blur and reduced cone contrast introduced by the longer wavelength are key to why there is an overall reduction in acuity. No changes were made to the manuscript. As a more general remark, we would not consider the sub-Nyquist performances seen in our data to be a hyperacuity, although technically it is. The reason is that hyperacuity is usually associated with stimuli that require resolving positional shifts, and not orientation. There is a log unit of difference between thresholds in these tasks.  

      P3.6. As a Suggestion: can the authors predict the accuracy of individual participants in single trials just by looking at the drift dynamics?

      That’s a very interesting point that we are indeed currently looking at in another project. As a comment, we can add that by purely looking at the drift dynamics in the current data, we could not predict the accuracy (percent correct) of the participant. When comparing drift length or diffusion coefficients between trials with correct or incorrect responses, we do not observe a significant difference. Also, when adding an anatomical correlate and comparing between trials where sampling density increases or decreases, there is no significant trend. We think that it is a more complex interplay between all the influencing factors that can perhaps be met by a model considering all drift dynamics, photoreceptor geometry and stimulus characteristics.

      No changes were made to the manuscript.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      As you will see, the reviewers were quite enthusiastic about your work, but have a few issues for your consideration. We hope that this is helpful. We'll consider any revisions in composing a final eLife assessment.

      Reviewer #1 (Recommendations For The Authors):

      R1.1:  Discussion of myopia. Myopia takes a fair bit of space in the Discussion, but the paper does not include any subjects that are sufficiently myopic to test the predictions. I would suggest reducing the amount of space devoted to this issue, and instead making the prediction that myopia may help with resolution quickly. The introduction (lines 54-56) left me expecting a test of this hypothesis, and I think similarly that issue could be left out of the introduction.

      We have removed this part from the Introduction and shortened the Discussion.  

      R1.2: Line 118: define CDC here.

      Thank you for pointing this out, it is now defined at this location.  

      R1.3: Line 159-162: suggest breaking this sentence into two. This sentence also serves as a transition to the next section, but the wording suggests it is a result that is shown in the prior section. Suggest rewording to make the transition part clear. Maybe something like "Hence the spatial arrangement of cones only partially ... . Next we show that ocular motion and the associated ... are another important factor."

      Text was changed as suggested.  

      R1.4.: Figure 3: The retina images are a bit hard to see - suggest making them larger to take an entire row. As a reader, I also was wondering about the temporal progression of the drift trajectories and the relation to the CDC. Since you get to that in Figure 4, you could clarify in the text that you are starting by analyzing distance traveled and will return to the issue of directed trajectories.

      Visibility was probably an issue during the initial submission and review process where images were produced at lower resolution. The original figures are of sufficient resolution to fully appreciate the underlying cone mosaic, and readers will be able to zoom in on them in the online publication.

      We added a mention of the order of analysis in the Results section (LL 163-165).

      R1.5: Line 176: define "sum of piecewise drift amplitude" (e.g. refer to Figure where it is defined).

      We now refer to this metric as the drift length (as rightfully pointed out by reviewer #2), and added its definition at this location.

      R1.6: Lines 205-208: suggest clarifying this sentence is a transition to the next section. As for the earlier sentence mentioned above, this sounds like a result rather than a transition to an issue you will consider next.

      This sentence was changed to make the transition clearer. 

      R1.7: Line 225: suggest starting a new paragraph here.

      Done as suggested

      Reviewer #2 (Recommendations For The Authors):

      I don't have any major concerns, mostly suggestions and minor comments.

      R2.1: (1) The authors use piecewise amplitude as a measure of the amount of retinal motion introduced by ocular drift. However, to me, this sounds like what is normally referred to as the path length of a trace rather than its amplitude. I would suggest using the term length rather than amplitude, as amplitude is normally considered the distance between the starting and the ending point of a trace.

      This was changed as suggested throughout the manuscript. 

      R2.2: (2) It would be useful to elaborate more on the difference between CDC and PCD, I know the authors do this in other publications, but to the naïve reader, it comes a bit as a surprise that drift directionality is toward the CDC but less so toward the PCD. Is the difference between these metrics simply related to the fact that defining the PCD location is more susceptible to errors, especially if image quality is not optimal? If indeed the PCD is the point of peak cone density, assuming no errors or variability in the estimation of this point, shouldn't we expect drift moving stimuli toward this point, as the CDC will be characterized by a slightly lower density? I.e., is the absence of a PCD directionality trend as strong as the trend seen for the CDC simply the result of variability and error in the estimate of the PCD or it is primarily due to the distribution of cone density not being symmetrical around the PCD?

      Thank you for this comment. We already refer in the Methods section to the respective papers where this difference is analyzed in more detail, and shortly discuss it here.

      To briefly answer the reviewer’s final question: PCD location is too variable, and ought to be avoided as a retinal landmark. While we believe there is value in reporting the PCD as a metric of maximum density, it has been shown recently (Reiniger et al., 2021; Warr et al., 2024; Wynne et al., 2022) and is visible in our own (partly unpublished) data, that its location will change with changing one or more of these factors: cone density metric, window size or cone quantity selected, cone annotation quality, image quality (e.g. across days), individual grader, annotation software, and likely more. Each of these factors alone can change the PCD location quite drastically, all while of course, the retina does not change. The CDC on the other hand, given its low-pass filtering nature, is immune to the aforementioned changes within a much wider range and will thus reflect the anatomical and, shown here, functional center of vision, better. However, there will always be individual eyes where PCD location and the CDC are close, and thus researchers might be inclined to also use the PCD as a landmark. We strongly advise against this. In a way, the PCD is a non-sense location while its dimension, density, can be a valuable metric, as density does not vary that much (see e.g. data on CDC density and PCD density reported in this manuscript).  

      Below we append a direct comparison of PCD vs CDC location stability when only one of the mentioned factors is changed. Sixteen retinas imaged on two different days were annotated and analyzed by the same grader with the same approach, and the differences in both locations are shown.

      Author response image 3.

      Reproducibility of CDC and PCD location in comparison. Two retinal mosaics which were recorded at two different timepoints, maximum 1 year apart from each other, were compared for 16 eyes. The retinal mosaics were carefully aligned. The retinal locations for CDC and PCD that were computed for the first timepoint were used as the spatial anchor (coordinate center), the locations plotted here as red circles (CDC) and gray diamonds (PCD) represent the deviations that were measured at the second timepoint for both metrics.  

      R2.3.: I don't see a statistical comparison between the drift angle tuning for CDC, PRL, and PCD. The distributions in Figure 4F look very similar and all with a relatively wide std. It would be useful to mark the mean of the distributions and report statistical tests. What are the data shown in this figure, single subjects, all subjects pooled together, average across subjects? Please specify in the caption.

      We added a Rayleigh test to test each distribution for non-uniformity and Kolmogorov-Smirnov tests to compare the distributions towards the different landmarks. We added the missing specifications to the figure caption of Figure 4 – figure supplement 1.

      R2.4: I would suggest also calculating drift direction based on the average instantaneous drift velocity, similarly to what is done with amplitude. From Figure 3B it is clear that some drifts are more curved than others. For curved drifts with small amplitudes the start-point- end-point (SE) direction is not very meaningful and it is not a good representation of the overall directionality of the segment. Some drifts also seem to be monotonic and then change direction (eg. the last three examples from participant 10). In this case, the SE direction is likely quite different from the average instantaneous direction. I suspect that if direction is calculated this way it may show the trend of drifting toward the CDC more clearly.

      In response to this and a comment of reviewer #1, we have added a calculation of initial drift direction (and for increasing duration) and show it in Figure 4 – figure supplement 1. By doing so, we hope to capture initial directionality, irrespective of whether later parts in the path change direction. We find that directionality increases with increasing presentation duration.

      R2.5: I find the discussion point on myopia a bit confusing. Considering that this is a rather tangential point and there are only two myopic participants, I would suggest either removing it from the discussion or explaining it more clearly.

      We changed this section, also in response to comment R1.1.

      R2.6: I would suggest adding to the discussion more elaboration on how these results may relate to acuity in normal conditions (in the presence of optical aberrations). For example, will this relationship between sampling cone density and visual acuity also hold under natural viewing conditions?

      We added only a half sentence to the first paragraph of the discussion. We are hesitant to extend this because there is very likely a non-straightforward relationship between acuity in normal and fully corrected conditions. We would predict that, if each eye were given the same type and magnitude of aberrations (similar to what we achieved by removing them), cone density will be the most prominent factor of acuity differences. Given that individual aberrations can vary substantially between eyes, this effect will be diluted, up to the point where aberrations will be the most important factor to acuity. As an example, under natural viewing conditions, pupil size will dominantly modulate the magnitude of aberrations.

      R2.7: Line 398 - the point on the superdiffusive nature of drift comes out of the blue and it is unclear. What is it meant by "superdiffusive"?

      We simply wanted to express that some drift properties seem to be adaptable while others aren’t. The text was changed at this location to remove this seemingly unmotivated term. 

      R2.8: Although it is true that drift has been assumed to be a random motion, there has been mounting evidence, especially in recent years, showing a degree of control and knowledge about ocular drift (eg. Poletti et al, 2015, JN; Lin et al, 2023, Current Biology).

      We agree, of course. We mention this fact several times in the paper and adjusted some sentences to prevent misunderstandings. The mentioned papers are now cited in the Discussion. 

      R2.9: Reference 23 is out of context and should be removed as it deals with the control of fine spatial attention in the foveola rather than microsaccades or drift.

      We removed this reference. 

      R2.10: Minor point: Figures appear to be low resolution in the pdf.

      This seemed to have been an issue with the submission process. All figures will be available in high resolution in the final online version. 

      R2.11: Figure S3, it would be useful to mark the CDC at the center with a different color maybe shaded so it can be visible also on the plot on the left.

      We changed the color and added a small amount of transparency to the PRL markers to make the CDC marker more visible. 

      R2.12: Figure S2, it would be useful to show the same graphs with respect to the PCD and PRL and maybe highlight the subjects who showed the largest (or smallest) distance between PRL and CDC).

      Please find new Figure 4 supplement 1, which contains this information in the group histograms. Also, Figure 4 supplement 2 is now ordered by the distance PRL-CDC (while the participant naming is kept as maximum acuity exhibited). In this way, it should be possible to infer the information of whether PRL-CDC distance plays a role. For us it does not seem to be crucial. Rather, stimulus onset and drift length were related, which is captured in Figure 4g.

      R2.13: There is a typo in Line 410.

      We could not find a typo in this line, nor in the ones above and below. “Interindividual” was written on purpose, maybe “intraindividual” was expected? No changes were made to the text. 

      References

      Reiniger, J. L., Domdei, N., Holz, F. G., & Harmening, W. M. (2021). Human gaze is systematically offset from the center of cone topography. Current Biology, 31(18), 4188–4193. https://doi.org/10.1016/j.cub.2021.07.005

      Ruderman, D. L., & Bialek, W. (1992). Seeing Beyond the Nyquist Limit. Neural Computation, 4(5), 682–690. https://doi.org/10.1162/neco.1992.4.5.682

      Warr, E., Grieshop, J., Cooper, R. F., & Carroll, J. (2024). The effect of sampling window size on topographical maps of foveal cone density. Frontiers in Ophthalmology, 4, 1348950. https://doi.org/10.3389/fopht.2024.1348950

      Williams, D. R. (1985). Aliasing in human foveal vision. Vision Research, 25(2), 195–205. https://doi.org/10.1016/0042-6989(85)90113-0

      Wynne, N., Cava, J. A., Gaffney, M., Heitkotter, H., Scheidt, A., Reiniger, J. L., Grieshop, J., Yang, K., Harmening, W. M., Cooper, R. F., & Carroll, J. (2022). Intergrader agreement of foveal cone topography measured using adaptive optics scanning light ophthalmoscopy. Biomedical Optics Express, 13(8), 4445–4454. https://doi.org/10.1364/boe.460821

    1. eLife Assessment

      This valuable study provides a novel method to detect sleep cycles based on variations in the slope of the power spectrum from electroencephalography signals. The method, dispensing with time-consuming and potentially subjective manual identification of sleep cycles, is supported by solid evidence and analyses but some aspects could be better illustrated and the source of the discrepancies between classical and fractal cycles should be identified. This study will be of interest to researchers and clinicians working on sleep and brain dynamics.

    2. Reviewer #1 (Public review):

      In this study, Rosenblum et al introduce a novel and automatic way of calculating sleep cycles from human EEG. Previous results have shown that the slope of the non-oscillatory component of the power spectrum (called the aperiodic or fractal component) changes with sleep stage. Building on this, the authors present an algorithm that extracts the continuous-time fluctuations in the fractal slope and propose that peaks in this variable can be used to identify sleep cycle limits. Cycles defined in this way are termed "fractal cycles". The main focus of the article is a comparison of "fractal" and "classical" (ie defined manually based on the hypnogram) sleep cycles in numerous datasets.
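
      To make the idea of peak-based cycle limits concrete, the following is a rough sketch and not the authors' algorithm: it smooths a slope time series with a moving average, marks local maxima above a height threshold as cycle limits, and reports the time between successive peaks. The window length, the threshold, the 30 s epoch length and the synthetic input are all hypothetical choices made for illustration.

```python
import math

def moving_average(values, window):
    """Centered moving average; the window is truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def find_peaks(values, min_height):
    """Indices of interior local maxima at or above min_height."""
    return [i for i in range(1, len(values) - 1)
            if values[i] > values[i - 1]
            and values[i] >= values[i + 1]
            and values[i] >= min_height]

def cycle_durations_min(peaks, epoch_sec):
    """Durations (minutes) between successive peaks."""
    return [(b - a) * epoch_sec / 60 for a, b in zip(peaks, peaks[1:])]

# Synthetic slope series: one value per 30 s epoch, period of 180 epochs.
slopes = [math.cos(2 * math.pi * i / 180) for i in range(541)]
smoothed = moving_average(slopes, window=5)
peaks = find_peaks(smoothed, min_height=0.5)
print(peaks, cycle_durations_min(peaks, epoch_sec=30))
```

      In real recordings the choice of smoothing window and peak criterion changes how many cycles are detected, which is one reason the correspondence with manually defined cycles is not expected to be exact.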

      The manuscript amply illustrates through examples the strong overlap between fractal and classical cycle identification. Accordingly, a high percentage (81%) can be matched one-to-one between methods and sleep cycle duration is well correlated (around R = 0.5). Moreover, the methods track certain global changes in sleep structure in different populations: shorter cycles in children and longer cycles in patients medicated with REM-suppressing anti-depressants. Finally, a major strength of the results is that they show similar agreement between fractal and classical sleep cycle length in 5 different data sets, showing that it is robust to changes in recording settings and methods.

      The match between fractal and classical cycles is not one-to-one. For example, the fractal method identifies a correlation between age and cycle duration in adults that is not apparent with the classical method. The differences between the fractal and classical methods appear to be linked to the uncertain definition of sleep cycles since they are tied to when exactly the cycle begins/ends and whether or not to count cycles during fractured sleep architecture at sleep onset. Moreover, the discrepancies between the two are on the order of that found between classical cycles defined manually or via an automatic algorithm.

      Overall the fractal cycle is an attractive method to study sleep architecture since it dispenses with time-consuming and potentially subjective manual identification of sleep cycles. However, given its difference from the classical method, it is unlikely that fractal scoring will be able to replace classical scoring directly. By providing a complementary quantification, it will likely contribute to refining the definition of sleep cycles that is currently ambiguous in certain cases. Moreover, it has the potential to be applied to animal studies which rarely deal with sleep cycle structure.

    3. Reviewer #2 (Public review):

      Summary:

      This study focused on using strictly the slope of the power spectral density (PSD) to perform automated sleep scoring and evaluation of the durations of sleep cycles. The method appears to work well because the slope of the PSD is highest during slow-wave sleep, and lowest during waking and REM sleep. Therefore, when smoothed and analyzed across time, there are cyclical variations in the slope of the PSD, fit using an IRASA (Irregularly resampled auto-spectral analysis) algorithm proposed by Wen & Liu (2016).
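
      For readers unfamiliar with the quantity being tracked, the sketch below shows what a spectral slope is in the simplest case: the slope of a straight-line fit to the power spectrum in log-log coordinates. This is deliberately not IRASA, which first separates the fractal component from oscillatory peaks before fitting; the function name, frequency range and synthetic spectra are assumptions made for illustration only.

```python
import math

def spectral_slope(freqs, power):
    """Least-squares slope of log10(power) against log10(frequency)."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic power-law spectra from 0.5 to 30 Hz (hypothetical exponents).
freqs = [0.5 * k for k in range(1, 61)]
steep = [f ** -3.0 for f in freqs]    # steeper spectrum
flat = [f ** -1.5 for f in freqs]     # flatter spectrum

print(spectral_slope(freqs, steep), spectral_slope(freqs, flat))
```

      Computing such a slope in consecutive epochs yields the time series whose slow fluctuations the method uses to delimit cycles.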

      Strengths:

      The main novelty of the study is that the non-fractal (oscillatory) components of the PSD that are more typically used during sleep scoring can be essentially ignored because the key information is already contained within the fractal (slope) component. The authors show that for the most part, results are fairly consistent between this and conventional sleep scoring, but in some cases show disagreements that may be scientifically interesting.

      Weaknesses:

      The previous weaknesses were well-addressed by the authors in the revised manuscript. I will note that from the fractal cycle perspective, waking and REM sleep are not very dissimilar. Combining these states underlies some of the key results of this study.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Weaknesses:

      The match between fractal and classical cycles is not one-to-one. For example, the fractal method identifies a correlation between age and cycle duration in adults that is not apparent with the classical method. This raises the question as to whether differences are due to one method being more reliable than another or whether they are also identifying different underlying biological differences. It is not clear for example whether the agreement between the two methods is better or worse than between two human scorers, which generally serve as a gold standard to validate novel methods. The authors provide some insight into differences between the methods that could account for differences in results. However, given that the fractal method is automatic it would be important to clearly identify criteria for recordings in which it will produce similar results to the classical method.

      Thank you for these insightful suggestions. In the revised Manuscript, we have added a number of additional analyses that provide a quantitative comparison between the classical and fractal cycle approaches aiming to identify the source of the discrepancies between classical and fractal cycle durations. Likewise, we assessed the intra-fractal and intra-classical method reliability as outlined below.

      Reviewer #1 (Recommendations For The Authors):

      One of the challenges in interpreting the results of the manuscript is understanding whether the differences between the two methods are due to a genuine difference in what these two methods are quantifying or simply noise/variability in each method. If the authors could provide some more insight into this, it would be a great help in assessing their findings and I think bolster the applicability of their method.

      (1) Method reliability: The manuscript clearly shows that cycle length is robustly correlated between fractal and classical in multiple datasets, however, it is hard to assign a meaningful interpretation to the correlation value (ie R = 0.5) without some reference point. This could be provided by looking at the intra-method correlation of cycle lengths. In the case of classical scoring, inter-scorer results could be compared, if the R-value here is significantly higher than 0.5 it would suggest genuine differences between the methods. In the case of fractal scoring, inter-electrode results could be compared / results with slight changes to the peak prominence threshold or smoothing window.

      In the revised Manuscript, we performed the following analyses to show the intra-method reliability:

      a) Classical cycle reliability: For the revised Manuscript, an additional scorer has independently defined classical sleep cycles for all datasets and marked sleep cycles with skipped REM sleep. Likewise, we have performed automatic sleep cycle detection using the R “SleepCycles” package by Blume & Cajochen (2021). We have added a new Table S8 to Supplementary Material 2 that shows the averaged cycle durations and cycle numbers obtained by the two human scorers and the automatic algorithm as well as the inter-scorer agreement rate. We have also added a new sheet named “Classical method reliability” to the Supplementary Excel file; it reports classical cycle durations for each participant and each dataset as defined by the two human scorers and the algorithm.

      We found that the correlation coefficients between two human scorers ranged from 0.69 to 0.91 (in literature, r’s > 0.7 are defined as strong scores) in different datasets, thus being higher than correlation coefficients between fractal and classical cycle durations, which in turn ranged from 0.41 to 0.55 (r’s in the range of 0.3 – 0.7 are considered moderate scores). The correlation coefficients between human raters and the automatic algorithm showed remarkably lower coefficients ranging from 0.30 to 0.69 (moderate scores) in different datasets, thus lying within the range of the correlation coefficients between fractal and classical cycle durations. This analysis is reported in Supplementary Material 2, section ”Intra-classical method reliability” and Table S8.

      b) Fractal cycle reliability: In the revised Supplementary Material 2 of our Manuscript, we assessed the intra-fractal method reliability by correlating the durations of fractal cycles calculated as defined in the main text, i.e., using a minimum peak prominence of 0.94 z and a smoothing window of 101 thirty-second epochs, with those calculated using a minimum peak prominence ranging from 0.86 to 1.20 z with a step size of 0.04 z and smoothing windows ranging from 81 to 121 thirty-second epochs with a step size of 10 epochs (Table S7). We found that fractal cycle durations calculated using adjacent minimum peak prominences (i.e., those that differed by 0.04 z) showed r’s > 0.92, while those calculated using adjacent smoothing windows (i.e., those that differed by 10 epochs) showed r’s > 0.84. In addition, we correlated fractal cycle durations defined using different channels and found that the correlation coefficients ranged between 0.66 and 0.67 (Table S1). Thus, most of the correlations performed to assess intra-fractal method reliability showed correlation coefficients (r > 0.6) higher than those obtained to assess inter-method reliability (r = 0.41 – 0.55), i.e., correlations between fractal and classical cycle durations. This analysis is reported in Supplementary Material 2, section “Intra-fractal method reliability” and Table S7. Likewise, we have added a new sheet named “Fractal method reliability” that reports the actual values for the abovementioned parameters to the Supplementary Excel file. For a discussion on potential sources of differences, see below.
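      The sensitivity analysis described above can be illustrated with a minimal sketch. It assumes that fractal cycles are taken as the intervals between successive peaks of a smoothed, z-scored slope time series. The function names (`moving_average`, `peak_prominence`, `detect_peaks`, `cycle_durations_min`), the centered moving-average smoother and the synthetic cosine input are all hypothetical stand-ins, not the authors' pipeline.

```python
import math

def moving_average(x, window):
    """Centered moving average; the window shrinks near the edges (an assumption)."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def peak_prominence(x, i):
    """Height of peak i above the higher of its two flanking minima."""
    left_min, j = x[i], i - 1
    while j >= 0 and x[j] <= x[i]:          # walk left until a higher sample or the edge
        left_min = min(left_min, x[j])
        j -= 1
    right_min, j = x[i], i + 1
    while j < len(x) and x[j] <= x[i]:      # walk right until a higher sample or the edge
        right_min = min(right_min, x[j])
        j += 1
    return x[i] - max(left_min, right_min)

def detect_peaks(x, min_prominence):
    """Indices of interior local maxima whose prominence reaches the threshold."""
    return [i for i in range(1, len(x) - 1)
            if x[i - 1] < x[i] >= x[i + 1]
            and peak_prominence(x, i) >= min_prominence]

def cycle_durations_min(peaks, epoch_s=30):
    """Peak-to-peak intervals converted from epochs to minutes."""
    return [(b - a) * epoch_s / 60 for a, b in zip(peaks, peaks[1:])]

# Synthetic stand-in for a slope time series: one oscillation every 180 epochs (90 min).
slopes = [2 * math.cos(2 * math.pi * t / 180) for t in range(900)]

smoothed = moving_average(slopes, window=101)       # smoothing window quoted in the response
peaks = detect_peaks(smoothed, min_prominence=0.94)  # prominence threshold quoted in the response
durations = cycle_durations_min(peaks)
print(peaks, durations)
```

      Re-running `detect_peaks` with neighbouring thresholds or window lengths and correlating the resulting durations would mirror the kind of comparison reported in Table S7.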

      (2) Origin of method differences: The authors outline a few possible sources of discrepancies between the two methods (peak vs REM end, skipped REM cycle detection...) but do not quantify these contributions. It would be interesting to identify some factors that could predict for either a given night of sleep or dataset whether it is likely to show a strong or weak agreement between methods. This could be achieved by correlating measures of the proposed differences ("peak flatness", fractal cycle depth, or proportion of skipped REM cycles) with the mismatch between the two methods.

      In the revised Manuscript, we have quantified a few possible sources of discrepancies between the durations of fractal vs classical cycles and added a new section named “Sources of fractal and classical cycle mismatches” to the Results as well as new Tables 5 and S10 (Supplementary Material 2). Namely, we correlated the difference in classical vs fractal sleep cycle durations on the one side, and either the amplitude of fractal descent/ascent (to reflect fractal cycle depth), duration of cycles with skipped REM sleep/TST, duration of wake after sleep onset/TST or the REM episode length of a given cycle (to reflect peak flatness) on the other side. We found that a higher difference in classical vs fractal cycle duration was associated with a higher proportion of wake after sleep onset (r = 0.226, p = 0.001), shallower fractal descents (r = 0.15, p = 0.002) and longer REM episodes (r = 0.358, p < 0.001, n = 417 cycles, Table S10 in Supplementary Material 2). The rest of the assessed parameters showed no significant correlations (Table S10). We have added a new sheet named “Fractal-classical mismatch” that reports the actual values for the abovementioned parameters to the Supplementary Excel file.  

      (3) Skipped REM cycles: the authors underline that the fractal method identified skipped REM cycles. It seems likely that manual identification of skipped REM cycles is particularly challenging (ie we would expect this to be a particular source of error between two human scorers). If this is indeed the case, it would be interesting to discuss, since it would highlight an advantage of their methodology that they already point out (l644).

      In the revised Manuscript, we have added the inter-scorer agreement rate for cycles with skipped REM sleep. It was equal to 61%, which is 32 percentage points lower than the performance of our fractal cycle algorithm (93%). These findings are now reported in the “Skipped cycles” section of the Results and in Table S9 of Supplementary Material 2. We also discuss them in Discussion:

      “Our algorithm detected skipped cycles in 93% of cases while the hypnogram-based agreement on the presence/absence of skipped cycles between two independent human raters was 61% only; thus, 32% lower. We deduce that the fractal cycle algorithm detected skipped cycles since a lightening of sleep that replaces a REM episode in skipped cycles is often expressed as a local peak in fractal time series.” Discussion, section “Fractal and classical cycles comparison”, paragraph 5.

      Minor comments:

      - In the subjects where the number of fractal and classical cycles did not match, how large was the difference (ie just one extra cycle or more)? Correlating cycle numbers could be one way to quantify this.

      In the revised Manuscript, we have reported the required information for the participants with no one-to-one match (46% of all participants) as follows: 

      “In the remaining 46% of the participants, the difference between the fractal and classical cycle numbers ranged from -2 to 2 with the average of -0.23 ± 1.23 cycle. This subgroup had 4.6 ± 1.2 fractal cycles per participant, while the number of classical cycles was 4.9 ± 0.7 cycles per participant. The correlation coefficient between the fractal and classical cycle numbers was 0.280 (p = 0.006) and between the cycle durations – 0.278 (p=0.006).” Results, section “Correspondence between fractal and classical cycles”, last paragraph.

      - When discussing the skipped REM cycles (l467), the authors explain: "For simplicity and between-subject consistency, we included in the analysis only the first cycles". I'm not sure I understood this, could they clarify to which analysis they are referring to?

      In the revised Manuscript, we performed this analysis twice: using first cycles and using all cycles and therefore have rephrased this as follows:

      “We tested whether the fractal cycle algorithm can detect skipped cycles, i.e., the cycles where an anticipated REM episode is skipped (possibly due to too high homeostatic pressure). We performed this analysis twice. First, we counted all skipped cycles (except the last cycles of a night, which might lack REM episode for other reasons, e.g., a participant had/was woken up). Second, we counted only the first classical cycles (i.e., the first cycle out of the 4 – 6 cycles that each participant had per night, Fig. 3 A – B) as these cycles coincide with the highest NREM pressure. An additional reason to disregard skipped cycles observed later during the night was our aim to achieve higher between-subject consistency as later skipped cycles were observed in only a small number of participants.” Results, section “Skipped cycles”, first paragraph.

      - The inclusion of all the hypnograms as a supplementary is a great idea to give the reader concrete intuition of the data. If the limits of the sleep cycles for both methods could be added it would be very useful.

      Supplementary Material 1 has been updated such that each graph has a mark showing the onsets of fractal and classical sleep cycles, including classical cycles with skipped REM sleep.

      - The difference in cycle duration between adults and children seems stronger / more reliable for the fractal cycle method, particularly in the histogram (Figure 3C). Is this difference statistically significant?

      In the revised Manuscript, we have added the Multivariate Analysis of Variance to compare F-values, partial R-squared and eta squared. The findings are as follows:

      “To compare the fractal approach with the classical one, we performed a Multivariate Analysis of Variance with fractal and classical cycle durations as dependent variables, the group as an independent variable and the age as a covariate. We found that fractal cycle durations showed higher F-values (F(1, 43) = 4.5 vs F(1, 43) = 3.1), adjusted R squared (0.138 vs 0.089) and effect sizes (partial eta squared 0.18 vs 0.13) than classical cycle durations.” Results, Fractal cycles in children and adolescents, paragraph 3.

      There have been some recent efforts to define sleep cycles in an automatic way using machine learning approaches. It could be interesting to mention these in the discussion and highlight their relevance to the general endeavour of automatizing the sleep cycle identification process.

      In the Discussion of the revised Manuscript, we have added the section on the existing automatic sleep cycle definition algorithms:

      “Even though recently, there has been a significant surge in sleep analysis incorporating various machine learning techniques and deep neural network architectures, we should stress that this research line mainly focused on the automatic classification of sleep stages and disorders almost ignoring the area of sleep cycles. Here, as a reference method, we used one of the very few available algorithms for sleep cycle detection (Blume & Cajochen, 2021). We found that automatically identified classical sleep cycles only moderately correlated with those detected by human raters (r’s = 0.3 – 0.7 in different datasets). These coefficients lay within the range of the coefficients between fractal and classical cycle durations (r = 0.41 – 0.55, moderate) and outside the range of the coefficients between classical cycle durations detected by two human scorers (r’s = 0.7 – 0.9, strong, Supplementary Material 2, Table S8).” Discussion, section “Fractal and classical cycles comparison”, paragraph 4.

      Reviewer #2 (Public Review):

      One weakness of the study, from my perspective, was that the IRASA fits to the data (e.g. the PSD, such as in Figure 1B), were not illustrated. One cannot get a sense of whether or not the algorithm is based entirely on the fractal component or whether the oscillatory component of the PSD also influences the slope calculations. This should be better illustrated, but I assume the fits are quite good.

      Thank you for this suggestion. In the revised Manuscript, we have added a new figure (Fig.S1 E, Supplementary Material 2), illustrating the goodness of fit of the data as assessed by the IRASA method.

      The cycles detected using IRASA are called fractal cycles. I appreciate the use of a simple term for this, but I am also concerned whether it could be potentially misleading? The term suggests there is something fractal about the cycle, whereas it's really just that the fractal component of the PSD is used to detect the cycle. A more appropriate term could be "fractal-detected cycles" or "fractal-based cycle" perhaps?

      We agree that these cycles are not fractal per se. In the Introduction, when we mention them for the first time, we name them “fractal activity-based cycles of sleep” and immediately after that add “or fractal cycles for short”. In the revised version, we renewed this abbreviation with each new major section and in Abstract. Nevertheless, given that the term “fractal cycles” is used 88 times, after those “reminders”, we used the short name again to facilitate readability. We hope that this will highlight that the cycles are not fractal per se and thus reduce the possible confusion while keeping the manuscript short.

      The study performs various comparisons of the durations of sleep cycles evaluated by the IRASA-based algorithm vs. conventional sleep scoring. One concern I had was that it appears cycles were simply identified by their order (first, second, etc.) but were not otherwise matched. This is problematic because, as evident from examples such as Figure 3B, sometimes one cycle conventionally scored is matched onto two fractal-based cycles. In the case of the Figure 3B example, it would be more appropriate to compare the duration of conventional cycle 5 vs. fractal cycle 7, rather than 5 vs. 5, as it appears is currently being performed.

      In cases where the number of fractal cycles differed from the number of classical cycles (from 34 to 55% in different datasets as in the case of Fig.3B), we did not perform one-to-one matching of cycles. Instead, we averaged the duration of the fractal and classical cycles over each participant and only then correlated between them (Fig.2C). For a subset of the participants (45 – 66% of the participants in different datasets) with a one-to-one match between the fractal and classical cycles, we performed an additional correlation without averaging, i.e., we correlated the durations of individual fractal and classical cycles (Fig.4S of Supplementary Material 2). This is stated in the Methods, section Statistical analysis, paragraph 2.
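      The two correlation strategies described above can be sketched as follows. All durations and participant labels are made up for illustration, and `pearson_r` is a hypothetical helper rather than the authors' code: cycle durations are first averaged per participant and then correlated, and a cycle-by-cycle correlation is computed only for participants whose fractal and classical cycle counts match.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

# Hypothetical cycle durations in minutes, keyed by participant.
fractal = {"p1": [80, 95, 100, 90, 85], "p2": [110, 100, 105], "p3": [70, 75, 90, 85]}
classical = {"p1": [90, 100, 95, 85], "p2": [105, 115, 100], "p3": [80, 85, 75, 80]}

ids = sorted(fractal)

# Strategy 1: average over each participant, then correlate the averages.
fractal_means = [mean(fractal[p]) for p in ids]
classical_means = [mean(classical[p]) for p in ids]
r_avg = pearson_r(fractal_means, classical_means)

# Strategy 2: cycle-by-cycle correlation, restricted to participants
# with the same number of fractal and classical cycles.
matched = [p for p in ids if len(fractal[p]) == len(classical[p])]
pairs = [(f, c) for p in matched for f, c in zip(fractal[p], classical[p])]
r_cycle = pearson_r([f for f, _ in pairs], [c for _, c in pairs])

print(matched, round(r_avg, 3), round(r_cycle, 3))
```

      Averaging first avoids having to decide which fractal cycle corresponds to which classical cycle when the counts differ, at the cost of discarding within-night variability.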

      There are a few statements in the discussion that I felt were not well-supported. L629: about the "little biological foundation" of categorical definitions, e.g. for REM sleep or wake? I cannot agree with this statement as written. Also about "the gradual nature of typical biological processes". Surely the action potential is not gradual and there are many other examples of all-or-none biological events.

      In the revised Manuscript, we have removed these statements from both Introduction and Discussion.

      The authors appear to acknowledge a key point, which is that their methods do not discriminate between awake and REM periods. Thus their algorithm essentially detected cycles of slow-wave sleep alternating with wake/REM. Judging by the examples provided this appears to account for both the correspondence between fractal-based and conventional cycles, as well as their disagreements during the early part of the sleep cycle. While this point is acknowledged in the discussion section around L686, I am surprised that the authors then argue against this correspondence on L695. I did not find the "not-a-number" controls to be convincing. No examples were provided of such cycles, and it's hard to understand how positive z-values of the slopes are possible without the presence of some wake unless N1 stages are sufficient to provide a detected cycle (in which case, the argument still holds except that it's alternations between slow-wave sleep and N1 that could be what drives the detection).

      In the revised Manuscript, we have removed the “NaN analysis” from both Results and Discussion. We have replaced it with the correlation between the difference between the durations of the classical and fractal cycles and proportion of wake after sleep onset. The finding is as follows:

      “A larger difference between the durations of the classical and fractal cycles was associated with a higher proportion of wake after sleep onset in 3/5 datasets as well as in the merged dataset (Supplementary Material 2, Table S10).” Results, section “Fractal cycles and wake after sleep onset”, last two sentences. This is also discussed in Discussion, section “Fractal cycles and age”, paragraph 1, last sentence. 

      To me, it seems important to make clear whether the paper is proposing a different definition of cycles that could be easily detected without considering fractals or spectral slopes, but simply adjusting what one calls the onset/offset of a cycle, or whether there is something fundamentally important about measuring the PSD slope. The paper seems to be suggesting the latter but my sense from the results is that it's rather the former.

      Thank you for this important comment. Overall, our paper suggests that the fractal approach might reflect the cycling nature of sleep in a more precise and sensitive way than classical hypnograms. Importantly, neither fractal nor classical methods can shed light on the mechanism underlying sleep cycle generation due to their correlational approach. Despite this, the advantages of fractal over classical methods mentioned in our Manuscript are as follows:

      (1) Fractal cycles are based on a real-valued metric with known neurophysiological functional significance, which introduces a biological foundation and a more gradual impression of nocturnal changes compared to the abrupt changes that are inherent to hypnograms that use a rather arbitrary assigned categorical value (e.g., wake=0, REM=-1, N1=-2, N2=-3 and SWS=-4, Fig.2 A).

      (2) Fractal cycle computation is automatic and thus objective, whereas classical sleep cycle detection is usually based on the visual inspection of hypnograms, which is time-consuming, subjective and error-prone. Few automatic algorithms are available for sleep cycle detection, and the one tested here correlated only moderately with classical cycles detected by human raters (r’s = 0.3 – 0.7 in different datasets).

      (3) Defining the precise end of a classical sleep cycle with skipped REM sleep that is common in children, adolescents and young adults using a hypnogram is often difficult and arbitrary.   The fractal cycle algorithm could detect such cycles in 93% of cases while the hypnogram-based agreement on the presence/absence of skipped cycles between two independent human raters was 61% only; thus, 32% lower.

      (4) The fractal analysis showed a stronger effect size, higher F-value and R-squared than the classical analysis for the cycle duration comparison in children and adolescents vs young adults. The first and second fractal cycles were significantly shorter in the pediatric compared to the adult group, whereas the classical approach could not detect this difference.

      (5) Fractal – but not classical – cycle durations correlated with the age of adult participants.

      These bullets are now summarized in Table 5 that has been added to the Discussion of the revised manuscript.

    1. eLife Assessment

      Liu and colleagues' study provides important insights into the neural mechanisms of narrative comprehension by identifying three distinct brain states using a hidden Markov model on fMRI data. The work is compelling, as it demonstrates that the dynamics of these brain states, particularly their timely expression, are linked to better comprehension and are specific to spoken language processing. The study's robust findings, validated in a separate dataset, will be of broad interest to researchers exploring the neural basis of speech and language comprehension, as well as those studying the relationship between dynamic brain states and cognition.

    2. Reviewer #1 (Public review):

      Summary:

      Liu and colleagues applied the hidden Markov model on fMRI to show three brain states underlying speech comprehension. Many interesting findings were presented: brain state dynamics were related to various speech and semantic properties, timely expression of brain states (rather than their occurrence probabilities) was correlated with better comprehension, and the estimated brain states were specific to speech comprehension but not at rest or when listening to non-comprehensible speech.

      Strengths:

      Recently, the HMM has been applied to many fMRI studies, including movie watching and rest. The authors cleverly used the HMM to test the external/linguistic/internal processing theory that was suggested in comprehension literature. I appreciated the way the authors theoretically grounded their hypotheses and reviewed relevant papers that used the HMM on other naturalistic datasets. The manuscript was well written, the analyses were sound, and the results had clear implications.

      Weaknesses:

      Further details are needed on the experimental procedure, some statistics/analyses need adjustment, and the results need further interpretation/rationale.

    3. Reviewer #2 (Public review):

      Liu et al. applied hidden Markov models (HMM) to fMRI data from 64 participants listening to audio stories. The authors identified three brain states, characterized by specific patterns of activity and connectivity, that the brain transitions between during story listening. Drawing on a theoretical framework proposed by Berwick et al. (TICS 2023), the authors interpret these states as corresponding to external sensory-motor processing (State 1), lexical processing (State 2), and internal mental representations (State 3). States 1 and 3 were more likely to transition to State 2 than between one another, suggesting that State 2 acts as a transition hub between states. Participants whose brain state trajectories closely matched those of an individual with high comprehension scores tended to have higher comprehension scores themselves, suggesting that optimal transitions between brain states facilitated narrative comprehension.

      Overall, the conclusions of the paper are well-supported by the data. Several recent studies (e.g., Song, Shim, and Rosenberg, eLife, 2023) have found that the brain transitions between a small number of states; however, the functional role of these states remains under-explored. An important contribution of this paper is that it relates the expression of brain states to specific features of the stimulus in a manner that is consistent with theoretical predictions.

      (1) It is worth noting, however, that the correlation between narrative features and brain state expression (as shown in Figure 3) is relatively low (~0.03). Additionally, it was unclear if the temporal correlation of the brain state expression was considered when generating the null distribution. It would be helpful to clarify whether the brain state expression time courses were circularly shifted when generating the null.

      (2) A strength of the paper is that the authors repeated the HMM analyses across different tasks (Figure 5) and an independent dataset (Figure S3) and found that the data was consistently best fit by 3 brain states. However, it was not entirely clear to me how well the 3 states identified in these other analyses matched the brain states reported in the main analyses. In particular, the confusion matrices shown in Figure 5 and Figure S3 suggest that the states were confusable across studies (State 2 vs. State 3 in Fig. 5A and S3A, State 1 vs. State 2 in Figure 5B). I don't think this takes away from the main results, but it does call into question the generalizability of the brain states across tasks and populations.

      (3) The three states identified in the manuscript correspond rather well to areas with short, medium, and long temporal timescales (see Hasson, Chen & Honey, TiCs, 2015). Given the relationship with behavior, where State 1 responds to acoustic properties, State 2 responds to word-level properties, and State 3 responds to clause-level properties, the authors may want to consider a "single-process" account where the states differ in terms of the temporal window for which one needs to integrate information over, rather than a multi-process account where the states correspond to distinct processes.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Liu and colleagues applied the hidden Markov model on fMRI to show three brain states underlying speech comprehension. Many interesting findings were presented: brain state dynamics were related to various speech and semantic properties, timely expression of brain states (rather than their occurrence probabilities) was correlated with better comprehension, and the estimated brain states were specific to speech comprehension but not at rest or when listening to non-comprehensible speech.

      Strengths:

      Recently, the HMM has been applied to many fMRI studies, including movie watching and rest. The authors cleverly used the HMM to test the external/linguistic/internal processing theory that was suggested in comprehension literature. I appreciated the way the authors theoretically grounded their hypotheses and reviewed relevant papers that used the HMM on other naturalistic datasets. The manuscript was well written, the analyses were sound, and the results had clear implications.

      Weaknesses:

      Further details are needed on the experimental procedure, some statistics/analyses need adjustment, and the results need further interpretation/rationale.

      We greatly appreciate the reviewers for the insightful comments and constructive suggestions. Below are the revisions we plan to make:

      (1) Experimental Procedure: We will provide a more detailed description of the stimuli and comprehension tests in the revised manuscript. Additionally, we will upload the corresponding audio files and transcriptions as supplementary data to ensure full transparency. 

      (2) Statistics/Analyses: In response to the reviewer's suggestions, we have reproduced the states' spatial maps using unnormalized activity patterns. For the resting state, we observed a state similar to the baseline state described by Song, Shim, & Rosenberg (2023). However, for the speech comprehension task, all three states showed network activity levels that deviated significantly from zero. Furthermore, we regenerated the null distribution for behavior-brain state correlations using a circular shift approach, and the results remain largely consistent with our previous findings. We have also made other adjustments to the analyses and introduced some additional analyses, as per the reviewer's recommendations. These changes will be incorporated into the revised manuscript.

      (3) Interpretation/Rationale: We will expand on the interpretation of the relationship between state occurrence and semantic coherence. Specifically, we will highlight that higher semantic coherence may enable the brain to more effectively accumulate information over time. State #2 appears to be involved in the integration of information over shorter timescales (hundreds of milliseconds), while State #3 is engaged in longer timescales (several seconds). 

      Reviewer #2 (Public review):

      Liu et al. applied hidden Markov models (HMM) to fMRI data from 64 participants listening to audio stories. The authors identified three brain states, characterized by specific patterns of activity and connectivity, that the brain transitions between during story listening. Drawing on a theoretical framework proposed by Berwick et al. (TICS 2023), the authors interpret these states as corresponding to external sensory-motor processing (State 1), lexical processing (State 2), and internal mental representations (State 3). States 1 and 3 were more likely to transition to State 2 than between one another, suggesting that State 2 acts as a transition hub between states. Participants whose brain state trajectories closely matched those of an individual with high comprehension scores tended to have higher comprehension scores themselves, suggesting that optimal transitions between brain states facilitated narrative comprehension.

      Overall, the conclusions of the paper are well-supported by the data. Several recent studies (e.g., Song, Shim, and Rosenberg, eLife, 2023) have found that the brain transitions between a small number of states; however, the functional role of these states remains under-explored. An important contribution of this paper is that it relates the expression of brain states to specific features of the stimulus in a manner that is consistent with theoretical predictions.

      (1) It is worth noting, however, that the correlation between narrative features and brain state expression (as shown in Figure 3) is relatively low (~0.03). Additionally, it was unclear if the temporal correlation of the brain state expression was considered when generating the null distribution. It would be helpful to clarify whether the brain state expression time courses were circularly shifted when generating the null. 

      We have regenerated the null distribution by circularly shifting the state time courses. The results remain consistent with our previous findings: p = 0.002 for the speech envelope, p = 0.007 for word-level coherence, and p = 0.001 for clause-level coherence. 
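      For readers unfamiliar with the approach, a circular-shift null can be sketched as below. This is a minimal illustration under stated assumptions, not the analysis code: the function name `circular_shift_p`, the minimum shift, the number of permutations, the two-sided test and the synthetic autocorrelated data are all hypothetical choices. The key idea is that rotating the state time course preserves its autocorrelation while breaking its alignment with the stimulus feature.

```python
import math
import random
from statistics import mean

def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def circular_shift_p(state_tc, feature, n_perm=1000, min_shift=20, seed=0):
    """Two-sided p-value from a null built by circularly shifting state_tc."""
    rng = random.Random(seed)
    n = len(state_tc)
    observed = pearson_r(state_tc, feature)
    exceed = 0
    for _ in range(n_perm):
        k = rng.randrange(min_shift, n - min_shift + 1)  # avoid near-zero shifts
        shifted = state_tc[k:] + state_tc[:k]            # rotate; temporal structure is kept
        if abs(pearson_r(shifted, feature)) >= abs(observed):
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)         # add-one correction avoids p = 0

# Synthetic example: an autocorrelated feature and a noisy state time course that tracks it.
gen = random.Random(1)
feature = [0.0]
for _ in range(299):
    feature.append(0.8 * feature[-1] + gen.gauss(0, 1))
state_tc = [f + gen.gauss(0, 0.5) for f in feature]

r_obs, p_val = circular_shift_p(state_tc, feature)
print(round(r_obs, 3), p_val)
```

      Shuffling time points independently would instead destroy the autocorrelation and typically yield a null distribution that is too narrow, which is the concern raised by the reviewer.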

      We note that in other studies which examined the relationship between brain activity and word embedding features, the group-mean correlation values are similarly low but statistically significant and theoretically meaningful (e.g., Fernandino et al., 2022; Oota et al., 2022). We think these relatively low correlations are primarily due to the high level of noise inherent in neural data. Brain activity fluctuations are shaped by a variety of factors, including task-related cognitive processing, internal thoughts, physiological states, as well as arousal and vigilance. Additionally, the narrative features we measured may account for only a small portion of the cognitive processes occurring during the task. As a result, the variance in narrative features can only explain a limited portion of the overall variance in brain activity fluctuations.

      We will update Figure 3 and relevant supplementary figures to reflect the new null distribution generated via circular shift. Furthermore, we will expand the discussion to address why the observed brain-stimuli correlations are relatively small, despite their statistical significance.

      (2) A strength of the paper is that the authors repeated the HMM analyses across different tasks (Figure 5) and an independent dataset (Figure S3) and found that the data was consistently best fit by 3 brain states. However, it was not entirely clear to me how well the 3 states identified in these other analyses matched the brain states reported in the main analyses. In particular, the confusion matrices shown in Figure 5 and Figure S3 suggest that the states were confusable across studies (State 2 vs. State 3 in Fig. 5A and S3A, State 1 vs. State 2 in Figure 5B). I don't think this takes away from the main results, but it does call into question the generalizability of the brain states across tasks and populations. 

      We identified matching states across analyses based on similarity in the activity patterns of the nine networks. For each candidate state identified in other analyses, we calculated the correlation between its network activity pattern and those of the three predefined states from the main analysis, and matched it to the predefined state it most closely resembled. For instance, if a candidate state showed the highest correlation with State #1, it was labelled State #1 accordingly. 

      Each column in the confusion matrix depicts the similarity of each candidate state with the three predefined states. In Figure S3 (analysis for the replication dataset), the highest similarity occurred along the diagonal of the confusion matrix. This means that each of the three candidate states was best matched to State #1, State #2, and State #3, respectively, maintaining a one-to-one correspondence between the states from the two analyses.

      For the comparison of the speech comprehension task with the resting and incomprehensible speech conditions, there was some degree of overlap or "confusion." In Figure 5A, two candidate states showed the highest similarity to State #2. In this case, we labelled the candidate state with the strongest similarity as State #2, while the other candidate state was assigned to State #3 based on this ranking of similarity. This strategy was also applied to the naming of states for the incomprehensible condition. The observed confusion supports the idea that the tripartite-state space is not an intrinsic, task-free property. To make the labeling clearer in the presentation of results, we will use a prime symbol (e.g., State #3') to indicate cases where such confusion occurred, helping to distinguish these ambiguous matches.
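
      The matching rule described above can be sketched as a greedy one-to-one assignment on the correlation between network activity patterns. This is one possible reading of the procedure, not the code used in the study; the function names and the greedy tie-handling are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length activity patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def match_states(candidates, references):
    """Map each candidate state index to a reference state index.

    candidates and references are lists of network activity patterns
    (e.g. nine values per state). Pairs are visited from the most to the
    least similar, so when two candidates both resemble the same reference
    state most, the stronger match keeps that label and the weaker one
    falls back to its next-best free reference state.
    """
    pairs = [
        (pearson(c, r), ci, ri)
        for ci, c in enumerate(candidates)
        for ri, r in enumerate(references)
    ]
    pairs.sort(reverse=True)
    labels, used = {}, set()
    for _, ci, ri in pairs:
        if ci not in labels and ri not in used:
            labels[ci] = ri
            used.add(ri)
    return labels
```

      A candidate whose label comes from a fallback rather than from its best match is the kind of ambiguous case that the prime notation is meant to flag.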

      In the revised manuscript, we will give a detailed illustration of how the correspondence of states across analyses was established. 

      (3) The three states identified in the manuscript correspond rather well to areas with short, medium, and long temporal timescales (see Hasson, Chen & Honey, TiCs, 2015). Given the relationship with behavior, where State 1 responds to acoustic properties, State 2 responds to word-level properties, and State 3 responds to clause-level properties, the authors may want to consider a "single-process" account where the states differ in terms of the temporal window for which one needs to integrate information over, rather than a multi-process account where the states correspond to distinct processes.

      The temporal window hypothesis indeed provides a better explanation for our results. Based on the spatial maps and their modulation by speech features, States #1, #2, and #3 seem to correspond to the short, medium, and long processing timescales, respectively. We will update the discussion to reflect this interpretation. 

      We sincerely appreciate the constructive suggestions from the two anonymous reviewers, which have been highly valuable in improving the quality of the manuscript.

    1. eLife assessment

      This study seeks to expand the understanding of insulin and glucose responses in the brain, specifically by implicating a family of protein kinases responsive to insulin. The significance of the study to the field is valuable. The evidence supporting the conclusions about brain glucose utilization is convincing, although there are several aspects that could benefit from additional validation to strengthen the claims.

    2. Joint Public Review:

      Summary:

      The study by Akita B. Jaykumar et al. explores an interesting and relevant hypothesis: whether serine/threonine With-No-lysine (K) kinases (WNK)-1, -2, -3, and -4 engage in insulin-dependent glucose transporter-4 (GLUT4) signaling in the murine central nervous system. The authors especially focused on the hippocampus as this brain region exhibits high expression of insulin and GLUT4. Additionally, disrupted glucose metabolism in the hippocampus has been associated with anxiety disorders, while impaired WNK signaling has been linked to hypertension, learning disabilities, psychiatric disorders, or Alzheimer's disease. The study took advantage of the selective pan-WNK inhibitor WNK463 as the main tool to manipulate WNK1-4 activity both in vivo, by daily per-oral drug administration to wild-type mice, and in vitro, by treating adult murine brain synaptosomes, hippocampal slices, primary cortical cultures, and human cell lines (HEK293, SH-SY5Y). Using a battery of standard behavior paradigms such as the open field test, elevated plus maze test, and fear conditioning, the authors convincingly demonstrate that the inhibition of WNK1-4 results in behavior changes, especially in enhanced learning and memory of WNK463-treated mice. To shed light on the underlying molecular mechanism, the authors implemented multiple biochemical approaches including immunoprecipitation, glucose-uptake assay, surface biotinylation assay, immunoblotting, and immunofluorescence. The data suggest that simultaneous insulin stimulation and WNK1-4 inhibition result in increased glucose uptake and increased activity of insulin's downstream effectors, phosphorylated Akt and phosphorylated AS160. Moreover, the authors demonstrate that insulin treatment enhances the physical interaction of the WNK effector OSR1/SPAK with Akt substrate AS160. As a result, combined treatment with insulin and the WNK463 inhibitor synergistically increases the targeting of GLUT4 to the plasma membrane. Collectively, these data strongly support the initial hypothesis that neuronal insulin- and WNK-dependent pathways do interact and engage in cognitive functions.

      Strengths:

      The insulin-dependent signaling in the central nervous system is relatively understudied. This explorative study delves into several interesting and clinically relevant possibilities, examining how insulin-dependent signaling and its crosstalk with WNK kinases might affect brain circuits involved in memory formation and/or anxiety. Therefore, these findings might inspire follow-up studies performed in disease models for disorders that exhibit impaired glucose metabolism, deficient memory, or anxiety, such as Diabetes mellitus, Alzheimer's disease, or most psychiatric disorders.

      The graphical presentation of the figures is of high quality, which helps the reader to obtain a good overview and easily understand the experimental design, results, and conclusions.

      The behavioral studies are well conducted and provide valuable insights into the role of WNK kinases in glucose metabolism and their effect on learning and memory. Additionally, the authors evaluate the levels of basal and induced anxiety in Figures 1 and 2, enhancing our understanding of how WNK signaling might engage in cognitive function and anxiety-like behavior, particularly in the context of altered glucose metabolism.

      Weaknesses:

      The study used the WNK463 inhibitor as the only tool to manipulate WNK1-4 activity. This inhibitor seems selective; however, it has been reported that it differs in its efficiency at inhibiting the individual WNK kinases (e.g. PMID: 31017050, PMID: 36712947). Additionally, the authors neither analyze nor report the expression profiles or activity levels of WNK1, WNK2, WNK3, and WNK4 within the relevant brain regions (i.e. hippocampus, cortex, amygdala). Combined, these weaknesses raise concerns about the direct involvement of WNK kinases within the selected brain regions and behavior circuits. It would be beneficial if the authors provided gene profiling for WNK1, -2, -3, and -4 (e.g. using the Allen Brain Atlas). To confirm the observations, the authors should either add results from using other WNK inhibitors or, preferentially, analyze knock-down or knock-out animals/tissue targeting the single kinases.

      The authors do not report any data on whether the global inhibition of WNKs affects insulin levels. Since the authors wish to demonstrate the synergistic effect of simultaneous insulin treatment and WNK1-4 inhibition, such data are missing.

      The study discovered that the Sortilin receptor binds to OSR1, leading the authors to speculate that Sortilin may be involved in the insulin-dependent GLUT4 surface trafficking. However, the authors do not provide any evidence supporting Sortilin's involvement in insulin- or WNK-dependent GLUT4 trafficking. Thus, this conclusion should be qualified, rephrased, or additional data included.

    3. Author response:

      Joint Public Review:

      Strengths:

      The insulin-dependent signaling in the central nervous system is relatively understudied. This explorative study delves into several interesting and clinically relevant possibilities, examining how insulin-dependent signaling and its crosstalk with WNK kinases might affect brain circuits involved in memory formation and/or anxiety. Therefore, these findings might inspire follow-up studies performed in disease models for disorders that exhibit impaired glucose metabolism, deficient memory, or anxiety, such as Diabetes mellitus, Alzheimer's disease, or most psychiatric disorders.

      The graphical presentation of the figures is of high quality, which helps the reader to obtain a good overview and easily understand the experimental design, results, and conclusions.

      The behavioral studies are well conducted and provide valuable insights into the role of WNK kinases in glucose metabolism and their effect on learning and memory. Additionally, the authors evaluate the levels of basal and induced anxiety in Figures 1 and 2, enhancing our understanding of how WNK signaling might engage in cognitive function and anxiety-like behavior, particularly in the context of altered glucose metabolism.

      We thank the reviewers for recognizing the strengths of our study.

      Weaknesses:

      The study used the WNK463 inhibitor as the only tool to manipulate WNK1-4 activity. This inhibitor seems selective; however, it has been reported that it differs in its efficiency at inhibiting the individual WNK kinases (e.g. PMID: 31017050, PMID: 36712947). Additionally, the authors neither analyze nor report the expression profiles or activity levels of WNK1, WNK2, WNK3, and WNK4 within the relevant brain regions (i.e. hippocampus, cortex, amygdala). Combined, these weaknesses raise concerns about the direct involvement of WNK kinases within the selected brain regions and behavior circuits. It would be beneficial if the authors provided gene profiling for WNK1, -2, -3, and -4 (e.g. using the Allen Brain Atlas). To confirm the observations, the authors should either add results from using other WNK inhibitors or, preferentially, analyze knock-down or knock-out animals/tissue targeting the single kinases.

      We thank the reviewers for the suggestions. To address the criticism, and as recommended, we plan to include gene profiling for WNK1-4 in the brain from the Allen Brain Atlas. Additionally, we plan to include the effect of WNK1 knockdown on pAKT levels in immortalized SH-SY5Y cells.

      The authors do not report any data on whether the global inhibition of WNKs affects insulin levels. Since the authors wish to demonstrate the synergistic effect of simultaneous insulin treatment and WNK1-4 inhibition, such data are missing.

      To address this critique, we plan to include plasma insulin levels upon global inhibition of WNKs using WNK463 in C57BL/6J mice.

      The study discovered that the Sortilin receptor binds to OSR1, leading the authors to speculate that Sortilin may be involved in the insulin-dependent GLUT4 surface trafficking. However, the authors do not provide any evidence supporting Sortilin's involvement in insulin- or WNK-dependent GLUT4 trafficking. Thus, this conclusion should be qualified, rephrased, or additional data included.

      We thank the reviewers for suggesting experiments that will significantly enhance the clarity of our conclusions. We plan to include immunofluorescence staining data for sortilin localization in SH-SY5Y cells under conditions of DMSO, insulin and/or WNK463 treatment. These data would indicate whether WNK463 treatment affects the localization of sortilin in the Golgi network, which previous studies have shown to affect sortilin-dependent GLUT4 trafficking.

    1. eLife Assessment

      This important study examines the stability and compensatory plasticity in the retinotopic mapping in patients with congenital achromatopsia. It provides convincing evidence for a stable mapping of the visual field in V1, alongside changes of the readout from V1 into V3, which shows revised receptive field location and size. With additional control for potential confounding variables, this paper would be of interest to scientists studying the visual system, brain plasticity, and development.

    2. Reviewer #1 (Public review):

      Summary:

      This paper examines plasticity in early cortical (V1-V3) areas in an impressively large number of rod monochromats (individuals with achromatopsia). The paper examines three things:

      (1) Cortical thickness. It is now well established that early complete blindness leads to increases in cortical thickness. This paper shows increased thickness confined to the foveal projection zone within achromats. This paper replicates the work by Molz (2022) and Lowndes (2021), but the detailed mapping of cortical thickness as a function of eccentricity and the inclusion of higher visual areas is particularly elegant.

      (2) Failure to show large-scale reorganization of early visual areas using retinotopic mapping. This is a replication of a very recent study by Molz et al., but I believe that, given anatomical variability (and the very large n in this study) and how susceptible pRF findings are to small changes in procedure, this replication is also of interest.

      (3) Connective field modelling, examining the connections between V3-V1. The paper finds changes in the pattern of connections, and smaller connective fields in individuals with achromatopsia than normally sighted controls, and suggests that these reflect compensatory plasticity, with V3 compensating for the lower resolution V1 signal in individuals with achromatopsia.

      Strengths:

      This is a carefully done study (both in terms of data collection and analysis) that is an impressive amount of work. I have a number of methodological comments but I hope they will be considered as constructive engagement - this work is highly technical with a large number of factors to consider.

      Weaknesses:

      (1) Effects of eye-movements

      I have some concerns with how the effects of eye-movements are being examined. There are two main reasons the authors give for excluding eye-movements as a factor in their results. Both explanations have limitations.

      a) The first is that R2 values are similar across groups in the foveal confluence. This is fine as far as it goes, but R2 values are going to be low in that region. So this shows that eye-movements don't affect coverage (the number of voxels that generate a reliable pRF), but doesn't show that eye-movements aren't impacting their other measures.

      b) The authors don't see a clear relationship between coverage and fixation stability. This seems to rest on a few ad hoc examples. (What happens if one plots mean fixation deviation vs. coverage, setting the individuals who could not be calibrated to the highest value of calibrated fixation deviation? Does a relationship then emerge?)

      In any case, I wouldn't expect coverage to be particularly susceptible to eye-movements. If a voxel in the cortex entirely projects to the scotoma then it should be robustly silent. The effects of eye-movements will be to distort the size and eccentricity estimates of voxels that are not entirely silent.

      There are many places in the paper where eye-movements might be playing an important role.

      Examples include the larger pRF sizes observed in achromats. Are those related to fixation instability? Given that fixation instability is expected to increase pRF size by a fixed amount, that would explain why ratios are close to 1 in V3 (Figure 4).
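
      The arithmetic behind this point can be illustrated with a short sketch. It assumes a purely additive inflation of pRF size by fixation instability, and all numbers are hypothetical values in degrees of visual angle, chosen only to show the direction of the effect: the same added amount yields a patient-to-control ratio closer to 1 where baseline pRFs are larger, as in V3.

```python
def size_ratio(control_size, inflation):
    """Patient/control pRF size ratio if instability adds a fixed amount."""
    return (control_size + inflation) / control_size


# Hypothetical baseline pRF sizes and inflation, for illustration only.
v1_control, v3_control, inflation = 1.0, 4.0, 0.5

v1_ratio = size_ratio(v1_control, inflation)  # 1.5
v3_ratio = size_ratio(v3_control, inflation)  # 1.125
print(v1_ratio, v3_ratio)
```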

      (2) Topography

      The claim of no change in topography is a little confusing given that you do see a change in eccentricity mapping in achromats.

      Either this result is real, in which case there *is* a change in topography, albeit subtle, or it's an artifact.

      Perhaps these results need a little bit of additional scrutiny.

      One reason for concern is that you see different functions relating eccentricity to V1 segments depending on the stimulus. That almost certainly reflects biases in the modelling, not reorganization - the curves of Figure 2D are exactly what Binda et al. predict.

      Another reason for concern is that I'm very surprised that you see so little effect of including/not including the scotoma - the differences seem more like what I'd expect from simply repeating the same code twice. (The quickest sanity check is just to increase the size of the estimated scotoma to be even bigger?).

      I'd also look at voxels that pass an R2>0.2 threshold for both the non-selective and selective stimulus. Are the pRF sizes the same for both stimuli? Are the eccentricity estimates? If not, that's another clear warning sign.

      (3) Connective field modelling

      Let's imagine a voxel on the edge of the scotoma. It will tend to have a connective field that borders the scotoma, and will be reduced in size (since it will likely exclude the cortical region of V1 that is solely driven by resting state activity). This predicts your rod monochromat data. The interesting question is why this doesn't happen for controls. One possibility is that there is top-down 'predictive' activity that smooths out the border of the scotoma (there's some hint of that in the data), e.g., Masuda and Wandell.

      One thing that concerns me is that the smaller connective fields don't make sense intuitively. When there is a visual stimulus, connective fields are predominantly driven by the visual signal. In achromats, there is a large swath of cortex (between 1-2.5 degrees) which shows relatively flat tuning as regards eccentricity. The curves for controls are much steeper (see Figure 2b). This predicts that visually driven connective fields should be larger for achromats. So, what's going on? The beta parameter is not described (and I believe it can alter connective field sizes). Similarly, it's possible to get very small connective fields, but there wasn't a minimum size described in the thresholding. I might be missing something obvious, but I'm just deeply confused as to how the visual maps and the connectome maps can provide contradictory results given that the connectome maps are predominantly determined by the visual signal. Some intuition would be helpful.

      Some analyses might also help provide the reader with insight. For example, doing analyses separately on V3 voxels that project entirely to scotoma regions, project entirely to stimulus-driven regions, and V3 voxels that project to 'mixed' regions.

      The finding that pRF sizes are larger in achromats by a constant factor as a function of eccentricity is what differences in eye-movements would predict. It would be worth examining the relationship between pRF sizes and fixation stability.

    3. Reviewer #2 (Public review):

      Summary:

      The authors inspect the stability and compensatory plasticity in the retinotopic mapping in patients with congenital achromatopsia. They report increased cortical thickness in central V1 (eccentricities 0-2 deg) and the extension of this effect to V2 (trend) and V3 in a cohort with an average age in adolescence.

      In analyzing the receptive fields, they show that V1 had increased receptive field sizes in achromats, but there were no clear signs of reorganization filling in the rod-free area. In contrast, V3 showed an altered readout of V1 receptive fields. V3 of achromats oversampled the receptive fields bordering the rod-free zone, presumably to compensate and arrive at similar receptive fields as in the controls.

      These findings support a retention of peripheral-V1 connectivity, but a reorganization of later hierarchical stages of the visual system to compensate for the loss, highlighting a balance between stability and compensation in different stages of the visual hierarchy.

      Strengths:

      The experiment is carefully analyzed, and the data convey a clear and interesting message about the capacities of plasticity.

      Weaknesses:

      The existence of unstable fixation and nystagmus in the patient group is alluded to, but not quantified or modeled out in the analyses. The authors may want to address this possible confound with a quantitative approach.

    4. Author response:

      We would like to thank the reviewers for their positive evaluation of our work, and their comments inspiring useful discussion. We will provide an in-depth response once one of the key authors has returned from parental leave (in some months), but below we share initial thoughts:  

      Both reviewers asked to see more gaze data to understand how eye movements in patients with achromatopsia might drive our results. We will expand our analyses of eye tracking data and discuss the implications in more depth, but would like to note that our key findings (no change in signal coverage in the foveal rod-scotoma projection zone in achromats, and changes in connective fields) are both robust to eye movement and unlikely to be driven by gaze differences. Where this is less clear (i.e., population receptive field eccentricities are shifted outwards and sizes are increased), we have highlighted this and avoided drawing strong conclusions. 

      Reviewer 1 questioned why smaller connective fields (CFs) were observed in achromats, suggesting that their flatter V1 eccentricity tuning should predict larger CFs. It’s not straightforward to predict how V1's population receptive field (pRF) tuning profile shapes V3's sampling extent, as CFs are driven, but not dictated, by V1: they combine and integrate V1 signals. As we’re dealing with an atypically developed visual system, assumptions about expected relationships are complicated further. We believe that the aspect of the pRF data most relevant to interpreting V3 CF extent is the ratio between V1 and V3 pRF sizes. Our outcomes show that pRF sizes in achromats, while larger in V1, are more normalized in V3, predicting more local V3 sampling from V1. This is what our quantifications of CF size show across two independent measures with different stimuli. We will provide further data to address reviewer 1's various queries about the potential causes of the pRF eccentricity shifts in achromats, the relationship between pRFs and CFs, and methodological details of CF fits.

      We thank the reviewers again for their insightful comments and look forward to providing more comprehensive responses to their queries, substantiated with data, as soon as possible.

    1. eLife Assessment

      This work provides important findings characterizing potential synaptic mechanisms supporting the role of midline thalamus-hippocampal projections in fear memory extinction in mice. The methods and approaches used were solid. However, the evidence itself is incomplete, as there are concerns with whether the findings fully support the conclusions drawn.

    2. Reviewer #1 (Public review):

      The findings of Ziolkowska and colleagues show that a specific projection from the nucleus reuniens of the thalamus (RE) to dorsal hippocampal CA1 neurons plays an important role in fear extinction learning in male and female mice. In and of itself, this is not a particularly new finding, although the authors' identification of structural alterations from within dorsal CA1 stratum lacunosum moleculare (SLM) as a candidate mechanism for the learning-related plasticity is potentially novel and exciting. The authors use a range of anatomical and functional approaches to demonstrate structural synaptic changes in dorsal CA1 that parallel the necessary role of RE inputs in modulating extinction learning. Yet, the significance of these findings is substantially limited by several technical shortcomings in the experimental design, and the authors' central interpretation. Otherwise, there remain several strengths in the design and interpretation that offset some of these concerns.

      Given that much is already known about the role of RE and hippocampus in modulating fear learning and extinction, it remains unclear whether addressing these concerns would substantially increase the impact of this study beyond the specific area of speciality. Below, several major weaknesses will be highlighted, followed by several miscellaneous comments.

      Methodological:

      One major methodological weakness in the experimental design involves the widespread misapplication of Ns used for the statistical analyses. Much of the anatomical analyses of structural synaptic changes in the RE-CA1 pathway use N = number of axons (Figs. 1, 2), N = number of dendrites (Figs. 3, 4), and N = number of sections (Fig. 7; note that there are 7 figures in total). In every instance, N = animal number should be used. It is unclear which of these results would remain significant if N = animal number were used in each or how many more animals would be required. This is problematic since these data comprise the main evidence for the authors' central conclusion that specific structural synaptic changes are associated with fear extinction learning.
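
      One common way to follow this recommendation is to collapse the repeated measurements (axons, dendrites, or sections) to a single value per animal before testing, so that N equals the number of animals. The sketch below is illustrative only: the identifiers and values are hypothetical, and a mixed-effects model with animal as a random factor would be an alternative not shown here.

```python
from collections import defaultdict
from statistics import mean


def per_animal_means(measurements):
    """Collapse (animal_id, value) pairs to one mean value per animal."""
    by_animal = defaultdict(list)
    for animal_id, value in measurements:
        by_animal[animal_id].append(value)
    return {animal: mean(values) for animal, values in by_animal.items()}


# Hypothetical bouton densities: three measurements from two animals.
example = [("mouse_1", 1.0), ("mouse_1", 3.0), ("mouse_2", 4.0)]
summary = per_animal_means(example)  # N = 2 animals, not 3 axons
print(summary)
```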

      There is a lack of specific information regarding what constitutes learning with respect to behavioral freezing. It is never clearly stated what specific intervals are used over which freezing is measured during acquisition, extinction, and in extinction retrieval tests. Additionally, assessment of freezing during retrieval at 5- and 30-min time points doesn't lay to rest the possibility that there were differences in the decay rate over the 30-min period (also see below).

      A minor-to-moderate methodological weakness concerns the authors' decision to utilize saline injected groups as controls for the chemogenetics experiments (Figs. 5, 6). The correct design is to have a CNO-only group with the same viral procedure sans hM4Di. This concern is partly mitigated by the inclusion of a CNO vs. saline injection control experiment (Fig. 6).

      In the electron microscopic analyses of dendritic spines (Fig. 5), the comparison of only fear acquisition versus extinction training, without a naïve control group, makes it difficult to understand how these structural synaptic changes occur relative to baseline. It is noteworthy that the authors utilize the tripartite design in other anatomical analyses (Figs. 2-4).

      Interpretation:

      The main interpretive weakness in the study is the authors' claim that their data show a role for the RE-CA1 pathway in memory consolidation (i.e., see Abstract). This claim is based on the premise that, although RE-CA1 pathway inactivation with CNO treatment 30 min prior to contextual fear extinction did not affect freezing at 5- and 30-min time points relative to saline controls, these mice showed greater freezing when tested on extinction retrieval 24 h thereafter. First, the data do not rule out possible differences in the decay rate of freezing during extinction training due to CNO administration. Next, the fact that CNO is given prior to training still leaves open the possibility that acquisition was affected, even if there were not any frank differences in freezing. Support for this latter possibility derives from the fact that mice tested for extinction retrieval as early as 5 min after extinction training (Fig. 6C) showed the same impairments as mice tested 24 h later (Fig. 6A). Further, all the structural synaptic changes argued to underlie consolidation were based on analysis at a time point immediately following extinction training, which is too early to allow for any long-term changes that would underlie memory consolidation, but instead would confer changes associated with the extinction training event.

    3. Reviewer #2 (Public review):

      Summary:

      Ziółkowska et al. characterize the synaptic mechanisms underlying the RE-dCA1 contribution to the consolidation of fear memory extinction. In particular, they describe a layer-specific modulation of RE-dCA1 excitatory synapses associated with contextual fear extinction, which is impaired by transient chemogenetic inhibition of this pathway. These results indicate that RE activity-mediated modulation of synaptic morphology contributes to the consolidation of contextual fear extinction.

      Strengths:

      The manuscript is well conceived, the statistical analysis is solid and methodology appropriate. The strength of this work is that it nicely builds up on existing literature and provides new molecular insight on a thalamo-hippocampal circuit previously known for its role in fear extinction. In addition, the quantification of pre- and post-synapses is particularly thorough.

      Weaknesses:

      The findings in this paper are well supported by the data, but a more detailed description of the methods is needed.

      (1) In the paragraph Analysis of dCA1 synapses after contextual fear extinction (CFE), more experimental and methodological data should be given in the text: -how was PSD95 used for the analysis, what was the difference between RE. Even if Thy1-GFP mice were used in Fig. 2, it appears they were not used for bouton size analysis. To improve clarity, I suggest moving panel 2C to Figure 3. It is not clear whether all RE axons were indiscriminately analysed in Fig. 2 or if only the ones displaying colocalization with both PSD95 and GFP were analysed. If GFP was not taken into account here, analysed boutons could reflect synapses onto inhibitory neurons and this potential scenario should be discussed.

      (2) In the methods: The volume of intra-hippocampal CNO injections should be indicated. The concentration of 3 µM seems pretty low in comparison with previous studies. More details of what software/algorithm was used to score freezing should be included. CNO source is missing. Antibody dilutions for IHC should be indicated. Secondary antibody incubation time should be indicated.

      No statement about code and data availability is present.

    4. Reviewer #3 (Public review):

      Summary:

      This paper examined the role of nucleus reuniens (RE) projections to dorsal CA1 neurons in context fear extinction learning. First, the authors show that RE neurons send excitatory projections to the stratum oriens (SO) and the stratum lacunosum moleculare (SLM), but not the stratum radiatum (SR). After context fear conditioning, the synaptic connections between RE and dCA1 neurons in the SLM (but not the SO) are weakened (reduced bouton and spine density). This weakening is reversed by extinction learning, which leads to enhanced synaptic connectivity between RE inputs and dendrites in the SLM. Control experiments demonstrate that the observed changes are due to extinction and not caused by simple exposure to the context. Extinction learning also induced increases in the size (volume and surface area) of the post-synaptic density (PSD) in SLM. To establish the functional role of RE inputs to dCA1, the researchers used an inhibitory DREADD to silence this pathway during extinction learning. They observe that extinction memory (measured 2 hours or 24 hours later) is impaired by this inhibition. Control experiments show that the extinction memory deficit is not simply due to increased freezing caused by inactivation of the pathway or injections of CNO. Inhibiting the RE projection during extinction learning also reduced PSD-95 protein levels in the spines of dCA1 neurons.

      Strengths:

      Based on their results, the authors conclude that, "the RE→SLM pathway participates in the updating of fearful context value by actively regulating CFE-induced molecular and structural synaptic plasticity in the SLM.". I believe the data are generally consistent with this hypothesis, although there is an important control condition missing from the behavioral experiments.

      Weaknesses:

      (1) A defining feature of extinction learning is that it is context specific (Bouton, 2004). It is expressed where it was learned, but not in other environments. Similarly, it has been shown that internal contexts (or states) also modulate the expression of extinction (Bouton, 1990). For example, if a drug is administered during extinction learning, it can induce a specific internal state. If this state is not present during subsequent testing, the expression of extinction is impaired just as it is when the physical context is altered (Bouton, 2004). It is possible that something similar is happening in Figure 6. In these experiments, CNO is administered to inactivate the RE-dCA1 projection during extinction learning. The authors observe that this manipulation impairs the expression of extinction the next day (or 2-hours later). However, the drug is not given again during the test. Therefore, it is possible that CNO (and/or inactivation of the RE-dCA1 pathway) induces a state change during extinction that is not present during subsequent testing. Based on the literature cited above, this would be expected to disrupt fear extinction as the authors observed. To determine if this alternative explanation is correct, the researchers need to add groups that receive CNO during extinction training and subsequent extinction testing. If the deficits in extinction expression reported in Figure 6 result from a state change, then these groups should not exhibit an impairment. In contrast, if the authors' account is correct, then the expression of extinction should still be disrupted in mice that receive CNO during training and testing.

      (2) In their analysis of dCA1 synapses after contextual fear extinction (CFE) (Figure 4), the authors should have compared Ctx and Ctx-Ctx animals against naïve animals (as they did in Figure 3) when comparing 5US and Ext with naïve animals. Otherwise, the authors cannot make the following conclusion; "since changes of SLM synapses were not observed in the animals exposed to the familiar context that was not associated with the USs, our data support the role of the described structural plasticity at the RE→SLM synapses in CFE, rather than in processing contextual information in general.".

      (3) In the materials and methods section, the description of cannula placements is confusing and needs to be rewritten.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      The findings of Ziolkowska and colleagues show that a specific projection from the nucleus reuniens of the thalamus (RE) to dorsal hippocampal CA1 neurons plays an important role in fear extinction learning in male and female mice. In and of itself, this is not a particularly new finding, although the authors' identification of structural alterations from within dorsal CA1 stratum lacunosum moleculare (SLM) as a candidate mechanism for the learning-related plasticity is potentially novel and exciting. The authors use a range of anatomical and functional approaches to demonstrate structural synaptic changes in dorsal CA1 that parallel the necessary role of RE inputs in modulating extinction learning. Yet, the significance of these findings is substantially limited by several technical shortcomings in the experimental design, and the authors' central interpretation. Otherwise, there remain several strengths in the design and interpretation that offset some of these concerns.

      Given that much is already known about the role of RE and hippocampus in modulating fear learning and extinction, it remains unclear whether addressing these concerns would substantially increase the impact of this study beyond the specific area of speciality. Below, several major weaknesses will be highlighted, followed by several miscellaneous comments.

      Methodological:

      (1) One major methodological weakness in the experimental design involves the widespread misapplication of Ns used for the statistical analyses. Much of the anatomical analyses of structural synaptic changes in the RE-CA1 pathway use N = number of axons (Figs. 1, 2), N = number of dendrites (Figs. 3, 4), and N = number of sections (Fig. 7; note that there are 7 figures in total). In every instance, N = animal number should be used. It is unclear which of these results would remain significant if N = animal number were used in each or how many more animals would be required. This is problematic since these data comprise the main evidence for the authors' central conclusion that specific structural synaptic changes are associated with fear extinction learning.

      We do agree with the reviewer that N = animal number is the preferred way to present data in most of our experiments. However, in some experimental groups we observed a very low number of entries. For example, in the 5US group we found RE+/+ spines only in 3 out of 6 analyzed animals. We believe that this observation is not due to technical problems, as the mCherry virus transduction required to find RE+/+ spines is similar in all experimental groups and we analyzed similar volumes of tissue. While this result still allows the calculation of the density of RE+/+ spines per animal, it generates no entries for spine area and PSD95 mean gray value if N = animal number. Hence, we decided to use N = animals to calculate spine and bouton densities, and N = dendritic spines/boutons to calculate other spine/bouton parameters.
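
      The difference between the two choices of N discussed in this exchange can be illustrated with a small sketch. It is not the authors' analysis code: the function name `per_animal_means`, the animal identifiers, and all values are invented for illustration. It only shows how pooling individual spines gives a larger N than first averaging within each animal.

```python
from collections import defaultdict
from statistics import mean


def per_animal_means(measurements):
    """Collapse (animal_id, value) pairs to one mean value per animal.

    Hypothetical helper: using these means as data points makes N the
    number of animals rather than the number of spines.
    """
    by_animal = defaultdict(list)
    for animal_id, value in measurements:
        by_animal[animal_id].append(value)
    return {animal_id: mean(values) for animal_id, values in by_animal.items()}


# Invented spine areas (arbitrary units) from three hypothetical animals.
spines = [
    ("m1", 0.10), ("m1", 0.14), ("m1", 0.12),
    ("m2", 0.30),
    ("m3", 0.20), ("m3", 0.24),
]

animal_means = per_animal_means(spines)
n_spines = len(spines)         # N when every spine is a data point
n_animals = len(animal_means)  # N when every animal is a data point
```

      In this toy case an animal with no detected spines would contribute no entry at all, which is the situation the authors describe for the 5US group.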

      (2) There is a lack of specific information regarding what constitutes learning with respect to behavioral freezing. It is never clearly stated what specific intervals are used over which freezing is measured during acquisition, extinction, and in extinction retrieval tests. Additionally, assessment of freezing during retrieval at 5- and 30-min time points doesn't lay to rest the possibility that there were differences in the decay rate over the 30-min period (also see below).

      We added a detailed description of how learning was assessed.

      ln 125-134: For assessment of learning we used percent of time spent by animals freezing (% freezing). Freezing behavior was defined as complete lack of movement, except respiration. To assess within-session learning (working memory) we compared pre- and post-US freezing frequency (the first 148 sec vs last 30 sec) during the CFC session (day 1). To assess formation of long-term contextual fear memory, we compared pre-US freezing (day 1) and the first 5 minutes of the Extinction session (day 2). To assess within session contextual fear extinction we ran 2-way ANOVA to assess the effect of time and manipulation on freezing frequency. Freezing data were analyzed in 5-minute bins. To assess formation of long-term contextual fear extinction memory we compared the first 5 minutes of the Extinction session (day 2) and Test session (day 3).
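
      As a minimal sketch of the binning described above, assuming a per-second binary freezing record (1 = freezing, 0 = not freezing): the function name `percent_freezing_by_bin` and the toy session are hypothetical, and the scoring itself was done by the authors with commercial software, not with this code.

```python
from typing import List


def percent_freezing_by_bin(freezing: List[int], bin_seconds: int = 300) -> List[float]:
    """Return percent of time spent freezing in each consecutive bin.

    `freezing` is an assumed per-second record (1 = freezing, 0 = moving).
    A trailing partial bin is scored over the seconds it actually contains.
    """
    if bin_seconds <= 0:
        raise ValueError("bin_seconds must be positive")
    percents = []
    for start in range(0, len(freezing), bin_seconds):
        chunk = freezing[start:start + bin_seconds]
        percents.append(100.0 * sum(chunk) / len(chunk))
    return percents


# Toy 30-minute session: freezing for the first 150 s, none afterwards.
session = [1] * 150 + [0] * 1650
bins = percent_freezing_by_bin(session)  # six 5-minute bins
```

      Comparing the first bin of the Extinction session with the first bin of the Test session then corresponds to the long-term extinction memory comparison described in the quoted methods text.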

      As suggested by the reviewer, we also added data for all six 5-minute bins of Extinction sessions. 

      (3) A minor-to-moderate methodological weakness concerns the authors' decision to utilize saline injected groups as controls for the chemogenetics experiments (Figs. 5, 6). The correct design is to have a CNO-only group with the same viral procedure sans hM4Di. This concern is partly mitigated by the inclusion of a CNO vs. saline injection control experiment (Fig. 6).

      Figure 5 does not describe a chemogenetic experiment.

      We added new groups with control virus (CNO vs saline) to Figure 6 (now Fig. 6D and H). 

      The chemogenetic experiment shown in Figure 7 has all 4 experimental groups (Control vs hM4Di and saline vs CNO).

      (4) In the electron microscopic analyses of dendritic spines (Fig. 5), comparison of only the fear acquisition versus extinction training, and the lack of inclusion of a naïve control group, makes it difficult to understand how these structural synaptic changes are occurring relative to baseline. It is noteworthy that the authors utilize the tripartite design in other anatomical analyses (Fig. 2-4).

      We added data for the Naive mice to Figure 5.

      (5) Interpretation:

      The main interpretive weakness in the study is the authors' claim that their data shows a role for the RE-CA1 pathway in memory consolidation (i.e., see Abstract). This claim is based on the premise that, although RE-CA1 pathway inactivation with CNO treatment 30 min prior to contextual fear extinction did not affect freezing at 5- and 30-min time points relative to saline controls, these rats showed greater freezing when tested on extinction retrieval 24 h thereafter. First, the data do not rule out possible differences in the decay rate of freezing during extinction training due to CNO administration. Next, the fact that CNO is given prior to training still leaves open the possibility that acquisition was affected, even if there were not any frank differences in freezing. Support for this latter possibility derives from the fact that mice tested for extinction retrieval as early as 5 min after extinction training (Fig. 6C) showed the same impairments as mice tested 24 h later (Figs. 6A). Further, all the structural synaptic changes argued to underlie consolidation were based on analysis at a time point immediately following extinction training, which is too early to allow for any long-term changes that would underlie memory consolidation, but instead would confer changes associated with the extinction training event.

      We do agree with the reviewer that our data do not allow us to conclude whether RE-CA1 pathway is involved in acquisition or consolidation of CFE memory. Therefore, we avoid those terms in the manuscript. We just conclude that RE→CA1 participates in the CFE.

      Reviewer #2 (Public review):

      Summary:

      Ziółkowska et al. characterize the synaptic mechanisms at the basis of the RE-dCA1 contribution to the consolidation of fear memory extinction. In particular, they describe a layer-specific modulation of RE-dCA1 excitatory synapses associated with contextual fear extinction, which is impaired by transient chemogenetic inhibition of this pathway. These results indicate that RE activity-mediated modulation of synaptic morphology contributes to the consolidation of contextual fear extinction.

      Strengths:

      The manuscript is well conceived, the statistical analysis is solid and methodology appropriate. The strength of this work is that it nicely builds up on existing literature and provides new molecular insight on a thalamo-hippocampal circuit previously known for its role in fear extinction. In addition, the quantification of pre- and post-synapses is particularly thorough.

      Weaknesses:

      The findings in this paper are well supported by the data, but a more detailed description of the methods is needed.

      (1) In the paragraph Analysis of dCA1 synapses after contextual fear extinction (CFE), more experimental and methodological data should be given in the text: 

      - how was PSD95 used for the analysis, what was the difference between RE. Even if Thy1-GFP mice were used in Fig.2, it appears they were not used for bouton size analysis. To improve clarity, I suggest moving panel 2C to Figure 3. It is not clear whether all RE axons were indiscriminately analysed in Fig. 2 or if only the ones displaying colocalization with both PSD95 and GFP were analysed. If GFP was not taken into account here, analysed boutons could reflect synapses onto inhibitory neurons and this potential scenario should be discussed.

      PSD-95 immunostaining in close apposition to boutons was used to identify RE boutons innervating CA1 (Fig 1 and 2). In these cases PSD-95 signal was not quantified. PSD-95 in close apposition to dendritic spines was used as a proxy of PSDs in CA1 (Figure 3, 4 and 7). In these cases we assessed the integrated mean gray value of PSD-95 signal per dendritic spine (Figure 3, 4) or per ROI (Figure 7). This is explained in detail in the section Confocal microscopy and image quantification (ln 149-172).

      GFP signal was not taken into account during bouton analysis. This is explained in the materials and methods section Confocal microscopy and image quantification (ln 149-172).

      We indicate that PSD-95 is a marker of excitatory synapses located both on excitatory and inhibitory neurons.

      Ln 258: RE boutons were identified in SO and SLM as axonal thickenings in close apposition to PSD-95-positive puncta (a synaptic scaffold used as a marker of excitatory synapses located both on excitatory and inhibitory neurons) (Kornau et al., 1995; El-Husseini et al., 2000; Chen et al., 2011; Dharmasri et al., 2024). 

      We also cite literature demonstrating that RE projects to the hippocampal formation and forms asymmetric synapses with dendritic spines and dendrites, suggesting innervation of excitatory synapses on both excitatory and aspiny inhibitory neurons (ln 673).

      As advised by the reviewer the Figure 2C panel was moved to Figure 3 (now it is Fig 3A).

      (2) in the methods: The volume of intra-hippocampal CNO injections should be indicated. The concentration of 3 uM seems pretty low in comparison with previous studies. CNO source is missing.

      This section has been rewritten for clarity. The concentration of CNO was chosen based on previous studies (Stachniak et al., 2014).

      ln 103: Cannula placement. Mice were anesthetized by inhalation of 3–5% isoflurane (IsoFlo; Abbott Animal Health) in oxygen and positioned in a stereotaxic frame (51503, Stoelting, Wood Dale, IL, USA). Two holes were drilled in the skull, and a double guide cannula (2 mm apart and 2 mm long; 26GA, Plastics One) was lowered into the holes such that the cannula tip was located over dorsal CA1 area (2 mm posterior to bregma, ±1 mm lateral, and −1.3 mm vertical). Cannulae were kept patent by using 33-gauge internal dummy cannulae (Plastics One). The animals were used in contextual fear conditioning 21 days after the cannulation. Animals received bilateral CNO (3 μM, 0.2 μl per side for 1 min; Tocris Bioscience, Cat. No. 4936) (Stachniak et al., 2014) or saline injections (0.2 μl per side) 30 minutes before Extinction session via intrahippocampal injection cannulae (33-gauge). After the infusion, the cannula was left in place for 30 seconds. The cannula placement was verified by histology, and only data from animals with correct cannula implants were included in statistical analyses.

      (3) More details of what software/algorithm was used to score freezing should be included. 

      Freezing was automatically scored with VideoFreeze™ Software (Med Associates Inc.).

      (4) Antibody dilutions for IHC should be indicated. Secondary antibody incubation time should be indicated.

      The missing information is added.

      ln 144: Next, sections were incubated at 4°C overnight with primary antibodies directed against PSD-95 (1:500, Millipore, MAB 1598), washed three times in 0.3% Triton X-100 in PBS and incubated at room temperature for 90 minutes with a secondary antibody bound with Alexa Fluor 647 (1:500, Invitrogen, A31571). 

      (5) No statement about code and data availability is present.

      The statements are added.

      ln 785: Raw data and the code used for analysis of confocal data are available at OSF (https://osf.io/bnkpx/).

      Reviewer #3 (Public review):

      Summary:

      This paper examined the role of nucleus reuniens (RE) projections to dorsal CA1 neurons in context fear extinction learning. First, they show that RE neurons send excitatory projections to the stratum oriens (SO) and the stratum lacunosum moleculare (SLM), but not the stratum radiatum (SR). After mice undergo context fear conditioning, the synaptic connections between RE and dCA1 neurons in the SLM (but not the SO) are weakened (reduced bouton and spine density). This weakening is reversed by extinction learning, which leads to enhanced synaptic connectivity between RE inputs and dendrites in the SLM. Control experiments demonstrate that the observed changes are due to extinction and not caused by simple exposure to the context. Extinction learning also induced increases in the size (volume and surface area) of the post-synaptic density (PSD) in SLM. To establish the functional role of RE inputs to dCA1, the researchers used an inhibitory DREADD to silence this pathway during extinction learning. They observe that extinction memory (measured 2-hours or 24-hours later) is impaired by this inhibition. Control experiments show that the extinction memory deficit is not simply due to increased freezing caused by inactivation of the pathway or injections of CNO. Inhibiting the RE projection during extinction learning also reduced PSD-95 protein levels in the spines of dCA1 neurons.

      Strengths:

      Based on their results, the authors conclude that, "the RE→SLM pathway participates in the updating of fearful context value by actively regulating CFE-induced molecular and structural synaptic plasticity in the SLM.". I believe the data are generally consistent with this hypothesis, although there is an important control condition missing from the behavioral experiments.

      Weaknesses:

      (1) A defining feature of extinction learning is that it is context specific (Bouton, 2004). It is expressed where it was learned, but not in other environments. Similarly, it has been shown that internal contexts (or states) also modulate the expression of extinction (Bouton, 1990). For example, if a drug is administered during extinction learning, it can induce a specific internal state. If this state is not present during subsequent testing, the expression of extinction is impaired just as it is when the physical context is altered (Bouton, 2004). It is possible that something similar is happening in Figure 6. In these experiments, CNO is administered to inactivate the RE-dCA1 projection during extinction learning. The authors observe that this manipulation impairs the expression of extinction the next day (or 2-hours later). However, the drug is not given again during the test. Therefore, it is possible that CNO (and/or inactivation of the RE-dCA1 pathway) induces a state change during extinction that is not present during subsequent testing. Based on the literature cited above, this would be expected to disrupt fear extinction as the authors observed. To determine if this alternative explanation is correct, the researchers need to add groups that receive CNO during extinction training and subsequent extinction testing. If the deficits in extinction expression reported in Figure 6 result from a state change, then these groups should not exhibit an impairment. In contrast, if the authors' account is correct, then the expression of extinction should still be disrupted in mice that receive CNO during training and testing.

      We do agree with the reviewer that such an experiment would be interesting. However, it could be also confusing as we could not distinguish whether the possible behavioral effects are related to the state-dependent aspects of CFE or impaired recall of CFE. Importantly, previous studies showed that RE is crucial for extinction recall (Totty et al., 2023). We also show that CFE memory is impaired not only when the animals recall CFE without CNO (day 3) but also with CNO (day 4) (Figure 6C). Moreover, we do not see the effects of CNO on CFE in the control groups (Figure 6D and H). So we believe that it is unlikely that CNO results in state-dependent CFE.

      (2) In their analysis of dCA1 synapses after contextual fear extinction (CFE) (Figure 4), the authors should have compared Ctx and Ctx-Ctx animals against naïve animals (as they did in Figure 3) when comparing 5US and Ext with naïve animals. Otherwise, the authors cannot make the following conclusion; "since changes of SLM synapses were not observed in the animals exposed to the familiar context that was not associated with the USs, our data support the role of the described structural plasticity at the RE→SLM synapses in CFE, rather than in processing contextual information in general.".

      We assume that the key experimental groups to conclude about synaptic plasticity related to particular behavior are the groups that differ just by one factor/experience. For CFE that would be mice sacrificed immediately before and after CFE session (Figure 2 & 3); on the other hand to conclude about the effects of the re-exposure to the neutral context mice sacrificed before and after second exposure to the neutral context are needed (Figure 4). The naive group, as it differs by at least two manipulations from the Ext and Ctx-Ctx groups, is interesting but not crucial in both cases. This group would be necessary if we focused on the memories of FC or novel context. However, these topics are not the main focus of the current manuscript. Still, the naive group is shown on Figures 2 & 3 to check if CFE brings spine parameters to the levels observed in mice with low freezing.

      We have re-written the cited paragraph to be more precise in our conclusions. 

      "Overall, our data demonstrate that synapses in all dCA1 strata undergo structural or molecular changes relevant to CFC and/or CFE. However, only in SLM CFE-induced synaptic changes are likely to be directly regulated by RE inputs as they appear on RE+ dendrites and spines. Since such changes of SLM synapses were not observed in the animals re-exposed to the neutral context, our data support the role of the described structural plasticity at the RE→SLM synapses in CFE, rather than in processing contextual information in general."

      (3) In the materials and methods section, the description of cannula placements is confusing and needs to be rewritten.

      This section has been rewritten.

      ln 103: Cannula placement. Mice were anesthetized by inhalation of 3–5% isoflurane (IsoFlo; Abbott Animal Health) in oxygen and positioned in a stereotaxic frame (51503, Stoelting, Wood Dale, IL, USA). Two holes were drilled in the skull, and a double guide cannula (2 mm apart and 2 mm long; 26GA, Plastics One) was lowered into the holes such that the cannula tip was located over dorsal CA1 area (2 mm posterior to bregma, ±1 mm lateral, and −1.3 mm vertical). Cannulae were kept patent by using 33-gauge internal dummy cannulae (Plastics One). The animals were used in contextual fear conditioning 21 days after the cannulation. Animals received bilateral CNO (3 μM, 0.2 μl per side for 1 min; Tocris Bioscience, Cat. No. 4936) (Stachniak et al., 2014) or saline injections (0.2 μl per side) 30 minutes before Extinction session via intrahippocampal injection cannulae (33-gauge). After the infusion, the cannula was left in place for 30 seconds. The cannula placement was verified by histology, and only data from animals with correct cannula implants were included in statistical analyses.

    1. eLife Assessment

      This valuable report describes the changing antiviral activity of IFIT1 across mammals and in response to distinct viruses, likely as a result of past arms races. One of the main strengths of the manuscript is the breadth of mammalian IFIT1 orthologs and viruses that were tested. Overall the evidence is solid, but the analysis of positive selection could benefit from more thorough validation with complementary selection tests and also from assessing or more extended discussion of the impact of recombination and/or physical interactions with other IFITs.

    2. Reviewer #1 (Public review):

      Summary:

      McDougal et al. aimed to characterize the antiviral activity of mammalian IFIT1 orthologs. They first performed three different evolutionary selection analyses within each major mammalian clade and identified some overlapping positive selection sites in IFIT1. They found that one site that is positively selected in primates is in the RNA-binding exit tunnel of IFIT1 and is tolerant of mutations to amino acids with similar biochemical properties. They then tested 9 diverse mammalian IFIT1 proteins against VEEV, VSV, PIV3, and SINV and found that each ortholog has distinct antiviral activities. Lastly, they compared human and chimpanzee IFIT1 and found that the determinant of their differential anti-VEEV activity may be partly attributed to their ability to bind Cap0 RNA.

      Strengths:

      The study is one of the first to test the antiviral activity of IFIT1 from diverse mammalian clades against VEEV, VSV, PIV3, and SINV. Cloning and expressing these 39 IFIT1 orthologs in addition to single and combinatorial mutants is not a trivial task. The positive connection between anti-VEEV activity and Cap0 RNA binding is interesting, suggesting that differences in RNA binding may explain differences in antiviral activity.

      Weaknesses:

      The evolutionary selection analyses yielded interesting results, but were not used to inform follow-up studies except for a positively selected site identified in primates. Since positive selection is one of the two major angles the authors proposed to investigate mammalian IFIT1 orthologs with, they should integrate the positive selection results with the rest of the paper more seamlessly, such as discussing the positive selection results and their implications, rather than just pointing out that positively selected sites were identified. The paper should elaborate on how the positive selection analyses PAML, FUBAR, and MEME complement one another to explain why the tests gave them different results. Interestingly, MEME which usually provides more sites did not identify site 193 in primates that was identified by both PAML and FUBAR. The authors should also provide the rationale for choosing to focus on the 3 sites identified in primates only. One of those sites, 193, was also found to be positively selected in bats, although the authors did not discuss or integrate that finding into the study. In Figure 1A, they also showed a dN/dS < 1 from PAML, which is confusing and would suggest negative selection instead of positive selection. Importantly, since the authors focused on the rapidly evolving site 193 in primates, they should test the IFIT1 orthologs against viruses that are known to infect primates to directly investigate the impact of the evolutionary arms race at this site on IFIT1 function.

      Some of the data interpretation is not accurate. For example:

      (1) Lines 232-234: "...western blot analysis revealed that the expression of IFIT1 orthologs was relatively uniform, except for the higher expression of orca IFIT1 and notably lower expression of pangolin IFIT1 (Figure 4B)." In fact, most of the orthologs are not expressed in a "relatively uniform" manner e.g. big brown bat vs. shrew are quite different.

      (2) Line 245: "...mammalian IFIT1 species-specific differences in viral suppression are largely independent of expression differences." While it is true that there is no correlation between protein expression and antiviral activity in each species, the authors cannot definitively conclude that the species-specific differences are independent of expression differences. Since the orthologs are clearly not expressed in the same amounts, it is impossible to fully assess their true antiviral activity. At the very least, the authors should acknowledge that the protein expression can affect antiviral activity. They should also consider quantifying the IFIT1 protein bands and normalizing each to GAPDH for readers to better compare protein expression and antiviral activity. The same issue is in Line 267.

      (3) Line 263: "SINV... was modestly suppressed by pangolin, sheep, and chinchilla IFIT1 (Figure 4E)..." The term "modestly suppressed" does not seem fitting if there is 60-70% infection in cells expressing pangolin and chinchilla IFIT1.

      (4) The study can be significantly improved if the authors can find a thread to connect each piece of data together, so the readers can form a cohesive story about mammalian IFIT1.
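
      The band quantification suggested in point (2) above amounts to simple arithmetic, sketched here under stated assumptions. The function name `normalize_to_loading_control`, the lane labels, and the intensity values are all invented; they are not measurements from the manuscript.

```python
def normalize_to_loading_control(target, loading):
    """Divide each lane's target band intensity by its loading-control intensity.

    Both arguments are dicts keyed by lane label (hypothetical densitometry
    values in arbitrary units). Lanes must match and controls must be non-zero.
    """
    if set(target) != set(loading):
        raise ValueError("target and loading control must cover the same lanes")
    if any(value == 0 for value in loading.values()):
        raise ValueError("loading-control intensity must be non-zero")
    return {lane: target[lane] / loading[lane] for lane in target}


# Invented intensities for two lanes with equal GAPDH signal.
ifit1 = {"human": 1000.0, "pangolin": 200.0}
gapdh = {"human": 2000.0, "pangolin": 2000.0}

normalized = normalize_to_loading_control(ifit1, gapdh)
# Express each lane relative to a chosen reference lane (here, human).
relative = {lane: value / normalized["human"] for lane, value in normalized.items()}
```

      Reporting such relative values next to the infection data would let readers judge how far expression differences could account for differences in antiviral activity.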

    3. Reviewer #2 (Public review):

      McDougal et al. describe the surprising finding that IFIT1 proteins from different mammalian species inhibit the replication of different viruses, indicating that the evolution of IFIT1 across mammals has resulted in host species-specific antiviral specificity. Before this work, research into the antiviral activity and specificity of IFIT1 had mostly focused on the human ortholog, which was described to inhibit viruses including vesicular stomatitis virus (VSV) and Venezuelan equine encephalitis virus (VEEV) but not other viruses including Sindbis virus (SINV) and parainfluenza virus type 3 (PIV3). In the current work, the authors first perform evolutionary analyses on IFIT1 genes across a wide range of mammalian species and reveal that IFIT1 genes have evolved under positive selection in primates, bats, carnivores, and ungulates. Based on these data, they hypothesize that IFIT1 proteins from these diverse mammalian groups may show distinct antiviral specificities against a panel of viruses. By generating human cells that express IFIT1 proteins from different mammalian species, the authors show a wide range of antiviral activities of mammalian IFIT1s. Most strikingly, they find several IFIT1 proteins that have completely different antiviral specificities relative to human IFIT1, including IFIT1s that fail to inhibit VSV or VEEV, but strongly inhibit PIV3 or SINV. These results indicate that there is potential for IFIT1 to inhibit a much wider range of viruses than human IFIT1 inhibits. Electrophoretic mobility shift assays (EMSAs) suggest that some of these changes in antiviral specificity can be ascribed to changes in the direct binding of viral RNAs. Interestingly, they also find that chimpanzee IFIT1, which is >98% identical to human IFIT1, fails to inhibit any tested virus. Replacing three residues from chimpanzee IFIT1 with those from human IFIT1, one of which has evolved under positive selection in primates, restores activity to chimpanzee IFIT1. 
      Together, these data reveal a vast diversity of IFIT1 antiviral specificity encoded by mammals, consistent with an IFIT1-virus evolutionary "arms race".

      Overall, this is a very interesting and well-written manuscript that combines evolutionary and functional approaches to provide new insight into IFIT1 antiviral activity and species-specific antiviral immunity. The conclusion that IFIT1 genes in several mammalian lineages are evolving under positive selection is supported by the data, although there are some important analyses that need to be done to remove any confounding effects from gene recombination that has previously been described between IFIT1 and its paralog IFIT1B. The virology results, which convincingly show that IFIT1s from different species have distinct antiviral specificity, are the most surprising and exciting part of the paper. As such, this paper will be interesting for researchers studying mechanisms of innate antiviral immunity, as well as those interested in species-specific antiviral immunity. Moreover, it may prompt others to test a wide range of orthologs of antiviral factors beyond those from humans or mice, which could further the concept of host-specific innate antiviral specificity. Additional areas for improvement, which are mostly to clarify the presentation of data and conclusions, are described below.

      Strengths:

      (1) This paper is a very strong demonstration of the concept that orthologous innate immune proteins can evolve distinct antiviral specificities. Specifically, the authors show that IFIT1 proteins from different mammalian species are able to inhibit the replication of distinct groups of viruses, which is most clearly illustrated in Figure 4G. This is an unexpected finding, as the mechanism by which IFIT1 inhibits viral replication was assumed to be similar across orthologs. While the molecular basis for these differences remains unresolved, this is a clear indication that IFIT1 evolution functionally impacts host-specific antiviral immunity and that IFIT1 has the potential to inhibit a much wider range of viruses than previously described.

      (2) By revealing these differences in antiviral specificity across IFIT1 orthologs, the authors highlight the importance of sampling antiviral proteins from different mammalian species to understand what functions are conserved and what functions are lineage- or species-specific. These results might therefore prompt similar investigations with other antiviral proteins, which could reveal a previously undiscovered diversity of specificities for other antiviral immunity proteins.

      (3) The authors also surprisingly reveal that chimpanzee IFIT1 shows no antiviral activity against any tested virus despite only differing from human IFIT1 by eight amino acids. By mapping this loss of function to three residues on one helix of the protein, the authors shed new light on a region of the protein with no previously known function.

      (4) Combined with evolutionary analyses that indicate that IFIT1 genes are evolving under positive selection in several mammalian groups, these functional data indicate that IFIT1 is engaged in an evolutionary "arms race" with viruses, which results in distinct antiviral specificities of IFIT1 proteins from different species.

      Weaknesses:

      (1) The evolutionary analyses the authors perform appear to indicate that IFIT1 genes in several mammalian groups have evolved under positive selection. However, IFIT1 has previously been shown to have undergone recurrent instances of recombination with the paralogous IFIT1B, which can confound positive selection analyses such as the ones the authors perform. The authors should analyze their alignments for evidence of recombination using a tool such as GARD (in the same HyPhy package along with MEME and FUBAR). Detection of recombination in these alignments would invalidate their positive selection inferences, in which case the authors need to either analyze individual non-recombining domains or limit the number of species to those that are not undergoing recombination. While it is likely that these analyses will still reveal a signature of positive selection, this step is necessary to ensure that the signatures of selection and sites of positive selection are accurate.

      (2) The choice of IFIT1 homologs for study needs to be described in more detail. Many mammalian species encode IFIT1 and IFIT1B proteins, which have been shown to have different antiviral specificity, and the evolutionary relationship between IFIT1 and IFIT1B paralogs is complicated by recombination. As such, the assertion that the proteins studied in this manuscript are IFIT1 orthologs requires more support than the percent identity plot shown in Figure 3B.

      (3) Some of the results and discussion text could be more focused on the model of evolution-driven changes in IFIT1 specificity. In particular, the chimpanzee data are interesting, but it would appear that this protein has lost all antiviral function, rather than changing its antiviral specificity like some other examples in this paper. As such, the connection between the functional mapping of individual residues with the positive selection analysis is somewhat confusing. It would be more clear to discuss this as a natural loss of function of this IFIT1, which has occurred elsewhere repeatedly across the mammalian tree.

      (4) In other places in the manuscript, the strength of the differences in antiviral specificity could be highlighted to a greater degree. Specifically, the text describes a number of interesting examples of differences in inhibition of VSV versus VEEV from Figure 3C and 3D, but it is difficult for a reader to assess this as most of the dots are unlabeled and the primary data are not uploaded. A few potential suggestions would be to have a table of each ortholog with % infection by VSV and % infection by VEEV. Another possibility would be to plot these data as an XY scatter plot. This would highlight any species that deviate from the expected linear relationship between the inhibition of these two viruses, which would provide a larger panel of interesting IFIT1 antiviral specificities than the smaller number of species shown in Figure 4.
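
      The XY comparison suggested in point (4) can be sketched as follows. This is a minimal illustration, not the authors' analysis: it fits an ordinary least-squares line to per-ortholog % infection values and ranks orthologs by their residual from that line. The function name `rank_by_deviation`, the ortholog labels, and all numbers are hypothetical placeholders, not data from the study.

```python
from typing import Dict, List, Tuple


def rank_by_deviation(
    infection: Dict[str, Tuple[float, float]]
) -> List[Tuple[str, float]]:
    """Rank orthologs by their residual from the VSV-vs-VEEV least-squares line.

    `infection` maps an ortholog label to (% infection VSV, % infection VEEV),
    both normalized to control. Returns (label, residual) pairs sorted by
    absolute residual, largest first.
    """
    labels = list(infection)
    xs = [infection[k][0] for k in labels]
    ys = [infection[k][1] for k in labels]
    n = len(labels)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [
        (k, y - (slope * x + intercept)) for k, x, y in zip(labels, xs, ys)
    ]
    return sorted(residuals, key=lambda kv: abs(kv[1]), reverse=True)


# Placeholder values for illustration only; these are NOT data from the study.
example = {
    "ortholog_A": (10.0, 12.0),
    "ortholog_B": (50.0, 48.0),
    "ortholog_C": (90.0, 91.0),
    "ortholog_D": (20.0, 85.0),  # hypothetical: restricts VSV but not VEEV
}
ranked = rank_by_deviation(example)
print(ranked[0][0])  # label with the largest deviation from the fitted line
```

      A positive residual marks an ortholog that permits more VEEV infection than its VSV inhibition would predict, and a negative residual the reverse. Note that a strongly deviating point also pulls the fitted line toward itself, so a robust fit may be preferable for real data.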

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript by McDougal et al. demonstrates species-specific activities of diverse IFIT1 orthologs and seeks to utilize evolutionary analysis to identify key amino acids under positive selection that contribute to the antiviral activity of this host factor. While the authors identify amino acid residues as important for the antiviral activity of some orthologs and propose a possible mechanism by which these residues may function, the significance or applicability of these findings to other orthologs is unclear. However, the subject matter is of interest to the field, and these findings could be significantly strengthened with additional data.

      Strengths:

      Assessment of multiple IFIT1 orthologs shows the wide variety of antiviral activity of IFIT1, and identification of residues outside of the known RNA binding pocket in the protein suggests additional novel mechanisms that may regulate IFIT1 activity.

      Weaknesses:

      Alternative hypotheses that might explain the variable and seemingly inconsistent antiviral activity of IFIT1 orthologs were not really considered. For example, studies show that IFIT1 activity may be regulated by interaction with other IFIT proteins, but this was not assessed in this study.

      Given that there appears to be very little overlap observed in orthologs that inhibited the viruses tested, it's possible that other amino acids may be key drivers of antiviral activity in these other orthologs. Thus, it's difficult to conclude whether the findings that residues 362/4/6 are important for IFIT1 activity can be broadly applied to other orthologs, or whether these are unique to human and chimpanzee IFIT1. Similarly, while the hypothesis that these residues impact IFIT1 activity in an allosteric manner is an attractive one, there is no data to support this.

    5. Author response:

      We are grateful to the reviewers for their thoughtful and constructive feedback on our manuscript. Based on the Public Reviews, we will address the concerns raised by each reviewer through a combination of new analyses, clarifications, and expanded discussion as outlined below:

      Reviewer #1:

      (1) Integration of Positive Selection Results:

      We will enhance the integration of positive selection analyses throughout the manuscript. Specifically, we will discuss how the positively selected sites in primates, including site 193, inform IFIT1 function. We will expand the discussion to explain how PAML, FUBAR, and MEME complement each other and why MEME did not detect site 193 in primates. Additionally, we will provide a rationale for focusing on the three sites identified in primates and address the overlap with bat orthologs.

      (2) Expression Levels and Antiviral Activity:

      We acknowledge the variability in IFIT1 ortholog expression levels. To address this, we will quantify and normalize protein expression to GAPDH across all orthologs, allowing for a more accurate comparison of antiviral activity. We will revise the text to clarify that species-specific differences in viral suppression may be influenced by expression levels.

      (3) Clarification of Terminology and Data Interpretation:

      We will refine our description of the antiviral effects observed for SINV in Figure 4E. We will also revise statements related to protein expression in the relevant sections to improve accuracy.

      (4) Cohesion of Data:

      We will work to more tightly connect the evolutionary analysis with the functional virology data, framing the manuscript around how positive selection shapes IFIT1 function across species. 

      Reviewer #2:

      (1) Recombination Analysis of IFIT1:

      We will conduct a recombination analysis using GARD from the HyPhy package to ensure that the signatures of positive selection are not confounded by recombination between IFIT1 and IFIT1B. 

      (2) Clarification of IFIT1 Homologs Studied:

      We will provide additional details on how IFIT1 orthologs were selected, including addressing the relationship between IFIT1 and IFIT1B. We will support this by presenting additional sequence comparisons to demonstrate the orthology of the proteins studied.

      (3) Chimpanzee IFIT1 Loss of Function:

      We will revise the discussion of chimpanzee IFIT1 to better reflect the data. 

      (4) Presentation of Antiviral Specificity Data:

      We will include a supplementary table listing the percentage of infection normalized to control by VSV and VEEV for each ortholog to allow for clearer comparisons.

      Additionally, we will provide an alternative visualization to better compare the data sets. 

      Reviewer #3:

      (1) Alternative Hypotheses for IFIT1 Antiviral Activity such as IFIT1-IFIT interactions:

      We will expand the discussion to consider alternative hypotheses, including the potential for IFIT1 activity to be regulated through interactions with other IFIT family members. Therefore, we will address how IFIT1-IFIT interactions may be broadly applicable to our findings with IFIT1 orthologs. In addition, we will clarify that we do not conclude that residues 362/4/6 are the sole drivers of antiviral specificity across the orthologs tested in this study.

      (2) Generalization of Findings Across Orthologs:

      We acknowledge that the functional importance of residues 362/4/6 may not be generalizable across all orthologs. We will discuss this limitation more explicitly in the manuscript, while also expanding on how these findings apply specifically to primate IFIT1 orthologs.

      We believe that these revisions will address the key concerns raised by the reviewers and strengthen the manuscript. We look forward to submitting the revised version for further consideration.

    1. Author Response:

      We are grateful to the reviewers for their encouraging comments and constructive suggestions. These suggestions will be valuable to improve the revised manuscript.

      Reviewer 1:

      PD-1 signaling is suppressive to the establishment of cytokine-producing effector cells in general. However, as the reviewer pointed out, one of the results in Fig. 2H showing a decrease of IFN-gamma-producing cells is against this trend. The data indicate percentages of cytokine-producing cells, which are not always consistent with the absolute number of activated T cells. Nonetheless, we plan additional experiments in order to address the question.

      For PD-1YFYF experiments in Figs. 3-5, there were moderate changes in cytokine production between wild-type and mutant PD-1. We conducted gene transduction to newly prepared T cells in each experiment. In addition, to monitor the immunosuppressive effect of PD-1 agonist antibodies, these T cells were stimulated using PD-L1-deficient APC. Therefore, we think these differences in cytokine levels most likely reflect technical variation rather than a specific function of PD-1YFYF.

      Anti-PD-L1 mAb was used for optimal blockade of the PD-1/PD-L1 interaction, and the concentration of antibody (5 µg/ml) is within a normal range for this purpose. We used variable concentrations of OVA peptide to set up experiments with different intensities of TCR stimulation. TCR signal intensity has been shown to affect CD4+ T cell differentiation into Th1 and Th2 cells. We lowered the peptide concentration to test the effect of PD-1 signals under suboptimal TCR stimulation.

      Reviewer 2:

      Antigen-specific T cells from immunized mice are not ideal for Th differentiation studies because activated T cells in response to the antigen might have already undergone functional differentiation in vivo. Incorporating the reviewer’s suggestion, we will test alternative approaches, including human CD4+ T cells.

      For the allergy model, we will expand the analysis for inflammatory effectors.

    2. eLife Assessment

      This valuable study reports on a novel role of PD-1 in early T cell differentiation, showing that PD-1 stimulation impairs Th2 differentiation more effectively than that of Th1, with implications for the treatment of allergies. However, whereas the series of well-designed experiments using OVA-specific CD4 T cells from DO11.10 mice and the use of an allergy model generated compelling data, the study is still incomplete since it shows gaps in the rationale for the experimental protocols, contradictory data regarding IFN-gamma and IL-4 production, and the lack of in vivo experiments on Th2 differentiation to further support the main hypothesis. Nonetheless, the reported data would be of interest to immunologists working on T cell differentiation and allergy.

    3. Reviewer #1 (Public review):

      Summary:

      The authors analyze the roles of PD-1 in the early stages (pre-activation) of T cell differentiation and show that naïve CD4+ T cell differentiation is altered upon early PD-1 stimulation, with Th2 differentiation in particular being strongly impaired. The results have important implications for the immunotherapy area, but I think the manuscript requires some revisions.

      Strengths:

      (1) Novel Insights into PD-1 in Early T Cell Differentiation: The study provides new insights into the role of PD-1 during the pre-activation phase of T cell differentiation, particularly its impact on naïve CD4+ T cells and Th2 differentiation. This is a significant contribution to immunotherapy research.

      (2) Relevance to Immunotherapy: The findings have potential implications for the development of immunotherapies by demonstrating how PD-1 signaling affects specific T cell subsets early in differentiation.

      Weaknesses:

      (1) Inconsistent and Confusing Data: There are contradictions between the figures and the conclusions, particularly regarding IL-4 and IFN-gamma production in PD-1-expressing cells. This raises concerns about data interpretation and experimental accuracy.

      (2) Unclear Experimental Rationale: The reviewer questions the rationale behind key methodological choices, such as the high concentration of PD-L1 antibody and varying OVA peptide concentrations. These decisions need more justification.

    4. Reviewer #2 (Public review):

      Summary:

      The authors try to demonstrate that PD-1 regulates not only the quantity but also the quality of the immune response by determining Th differentiation. The authors suggest that the ability of PD-1 agonists to dampen Th2 differentiation could be exploited as a therapeutic approach in allergies or classical Th2-mediated disease.

      Strengths:

      The authors performed a series of elegant experiments using OVA-specific CD4 T cells from mice, showing a strong reduction of Th2 differentiation in vitro. They also perform some experiments with a model of allergies, showing an amelioration of the phenotype after administration of PD-1 agonist with a reduction of Th2 cells.

      Weaknesses:

      The authors perform all the experiments using DO11.10 mouse cells. Such cells have a TCR with very high affinity, so it would be relevant to repeat at least some of the in vitro assays in a more physiological setting (you can immunise mice with OVA to increase the pool of OVA-specific T cells, and then repeat the restimulation experiment). Also, a longer kinetic would be of interest to see the effect of the agonist on Th1 cells.

      Another drawback is the lack of experiments with human cells. It would be really important to repeat the experiments with CD4 T cells from healthy donors (the antibody that the authors use as PD-1 agonist is human, so it would not be a complicated experiment).

      It would be also interesting to show in the allergic disease model the effect of the agonism on the T cell response in general.

    1. eLife Assessment

      The presented soft tissue data of pterosaur tail vanes represent a valuable contribution to ongoing research efforts to decipher the flight abilities of pterosaurs in the fields of paleontology, comparative biomechanics, and bioinspired design. The new methods are compelling and give new detail on tail morphology that has the potential to resolve how pterosaurs were able to control and maintain tail stiffness to furnish flight control.

    2. Reviewer #1 (Public review):

      Summary:

      This paper reports fossil soft-tissue structures (tail vanes) of pterosaurs, and attempts to relate this to flight performance and other proposed functions for the tail.

      Strengths:

      The paper presents new evidence for soft-tissue strengthening of vanes using exciting new methods.

      Weaknesses:

      There seems to be no discussion of bias in the sample selection method - even a simple consideration of whether discarded specimens were likely not to have had the cross-linking lattice, or if it was not visible.

      There seems to be no supporting evidence or theory to show how the lattice could have functioned, other than a narrative description. Moreover, there is no comparison to extant organisms where a comparison of function might be drawn.

    3. Reviewer #2 (Public review):

      Summary:

      The authors have set out to investigate and explain how early members of the Pterosauria were able to maintain stiffness in the vane of their tails. This stiffness, it is said, was crucial for flight in early members of this clade. Through the use of Laser-Stimulated Fluorescence imaging, the authors have revealed that certain pterosaurs had a sophisticated dynamic tensioning system that has previously been unappreciated.

      Strengths:

      The choice of method of investigation for the key question is sound enough, and the execution of the same is excellent. Overall the paper is well written and well presented, and provides a very succinct, accessible and clear conclusion.

      Weaknesses:

      None

    1. eLife Assessment

      This study presents numerical results on a potentially useful framework for understanding the dynamics of subthreshold waves in a network of electrical synapses modeled on the connectome data of the C. elegans nematode. However, the strength of the evidence presented in favor of interference effects being a major component in subthreshold wave dynamics is inadequate.

    2. Joint Public Review

      This work investigates numerically the propagation of subthreshold waves in a model neural network that is derived from the C. elegans connectome. Using a scattering formalism and tight-binding description of the network -- approximations which are commonplace in condensed matter physics -- this work attempts to show the relevance of interference phenomena, such as wavenumber-dependent propagation, for the dynamics of subthreshold waves propagating in a network of electrical synapses.

      The primary strength of the work is in trying to use theoretical tools from a far-away corner of fundamental physics to shed light on the properties of a real neural system. While a system composed of neurons and synapses is classical in nature, there are occasions in which interference or localization effects are useful for understanding wave propagation in complex media [review, van Rossum & Nieuwenhuizen, 1999]. However, it is expected that localization effects only have an impact in some parameter regimes and with low phase dissipation. The authors should have addressed the existence of this validity regime in detail prior to assuming that interference effects are important.

      An additional approximation that was made without adequate justification is the use of a tight-binding Hamiltonian. This can be a reasonable approximation, even for classical waves, in particular in the presence of high-quality-factor resonators, where most of the wave amplitude is concentrated on the nodes of the network, and nodes are coupled evanescently with each other. Neither of these conditions was verified for this study.

      The motivation for this work is to understand the basic mechanisms underlying subthreshold intrinsic oscillations in the inferior olive, but detailed connectivity patterns in this brain area are not available. The connectome is known for C. elegans, but sub-threshold oscillations have not been observed there, and the implications of this work for C. elegans neuroscience remain unclear. The authors should also give more evidence for the claim that their study may give a mechanism for synchronized rhythmic activity in the mammalian inferior olive nucleus, or refrain from making this conclusion.

      In the same vein, since the work emphasizes the dependence on the wavenumber for the propagation of subthreshold oscillations, the authors should make an attempt at estimating the wavenumber of subthreshold oscillations in C. elegans if they were to exist and be observed. Next, the presence of two "mobility edges" in the transmission coefficient calculated in this work is unmistakably due to the discrete nature of the system, coming from the tight-binding approximation, and it is unclear if this approximation is justified in the current system.
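
      The link between discreteness and a bounded transmission window can be illustrated with the textbook case of a uniform one-dimensional tight-binding chain. This is an assumption made purely for illustration and is not the network model used in the manuscript; the on-site energy $\varepsilon_0$, the nearest-neighbour coupling $t$, and the lattice spacing $a$ are generic symbols rather than quantities taken from the preprint.

```latex
H = \varepsilon_0 \sum_{n} |n\rangle\langle n|
  + t \sum_{n} \bigl( |n\rangle\langle n+1| + |n+1\rangle\langle n| \bigr),
\qquad
E(k) = \varepsilon_0 + 2t\cos(ka), \quad k \in \left[-\tfrac{\pi}{a}, \tfrac{\pi}{a}\right].
```

      Because $\cos(ka)$ is bounded, propagating solutions exist only for $|E - \varepsilon_0| \le 2|t|$, so the spectrum has both a lower and an upper edge. A continuum wave equation has no such upper cutoff, which is why two edges in the transmission are expected for any lattice description regardless of the underlying biology.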

      Similarly, it is possible that the wavenumber-dependent transmission observed depends strongly on the addition of a large number of virtual nodes (VNs) in the network, which the authors give little to no motivation for. As these nodes are not present in the C. elegans connectome, the authors should explain the motivation for their inclusion in the model and should discuss their consequences on the transmission properties of the network.

      As it stands, the work would only have a very limited impact on the understanding of subthreshold oscillations in the rat or in C. elegans. Indeed, the preprint falls short of relating its numerical results to any phenomena which could be observed in the lab.

    1. eLife Assessment

      Gil Ávila et al. evaluated the aperiodic component in the medial prefrontal cortex using resting-state EEG recordings from 149 individuals with chronic pain and 115 healthy participants. The authors present compelling evidence that the aperiodic component of the EEG does not differentiate between those with chronic pain and healthy individuals. The study was well-designed and rigorously conducted, and the clear and conclusive results provide important insights that can guide future research in the field of pain neuroscience.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Avila et al. tested the hypothesis that chronic pain states are associated with changes in the excitability of the medial prefrontal cortex (mPFC). The authors used the slope of the aperiodic component of the EEG power spectrum (= the aperiodic exponent) as a novel, non-invasive proxy for the cortical excitation-inhibition ratio. They performed source localization to estimate the EEG signals generated specifically by the mPFC. By pooling resting-state EEG recordings from three existing datasets, the authors were able to compare the aperiodic exponent in the mPFC and across the whole brain (at all modeled cortical sources) between 149 chronic pain patients and 115 healthy controls. Additionally, they assessed the relationship between the aperiodic exponent and pain intensity reported by the patients. To account for heterogeneity in pain etiology, the analysis was also performed separately for two patient subgroups with different chronic pain conditions (chronic back pain and chronic widespread pain). The study found robust evidence against differences in the aperiodic exponent in the mPFC between people with chronic pain and healthy participants, and no correlation was observed between the aperiodic exponent and pain intensity. These findings were consistent across different patient subgroups and were corroborated by the whole-brain analysis.
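
      The quantity at the centre of this summary, the aperiodic exponent, can be sketched in its simplest form as the negated slope of a straight line fitted to the power spectrum in log-log coordinates. The sketch below is an illustration under that assumption only: it ignores oscillatory peaks, which dedicated spectral parameterization tools model explicitly, and it is not the pipeline used in the study. The function name `aperiodic_exponent` and the synthetic spectrum are hypothetical.

```python
import math
from typing import Sequence


def aperiodic_exponent(freqs: Sequence[float], power: Sequence[float]) -> float:
    """Estimate the aperiodic exponent as the negated log-log slope.

    Fits log10(power) = offset - exponent * log10(freq) by ordinary least
    squares and returns `exponent`. Frequencies and powers must be positive.
    """
    log_f = [math.log10(f) for f in freqs]
    log_p = [math.log10(p) for p in power]
    n = len(log_f)
    mean_f = sum(log_f) / n
    mean_p = sum(log_p) / n
    sxx = sum((x - mean_f) ** 2 for x in log_f)
    sxy = sum((x - mean_f) * (y - mean_p) for x, y in zip(log_f, log_p))
    slope = sxy / sxx
    return -slope


# Synthetic 1/f^2 spectrum for illustration (not EEG data).
freqs = [float(f) for f in range(2, 41)]
power = [10.0 / f ** 2 for f in freqs]
exponent = aperiodic_exponent(freqs, power)
print(round(exponent, 3))
```

      A steeper spectrum gives a larger exponent, which is the direction usually interpreted as relatively stronger inhibition. The choice of frequency range and the handling of peaks both affect the estimate, which is one reason a multiverse analysis across such choices is informative.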

      Strengths:

      The study is based on sound scientific reasoning and rigorously employs suitable methods to test the hypothesis. It follows a pre-registered protocol, which greatly increases the transparency and, consequently, the credibility of the reported results. In addition to the planned steps, the authors used a multiverse analysis to ensure the robustness of the results across different methodological choices. I find this particularly interesting, as the EEG aperiodic exponent has only recently been linked to network excitability, and the most appropriate methods for its extraction and analysis are still being determined. The methods are clearly and comprehensively described, making this paper very useful for researchers planning similar studies. The results are convincing, and supported by informative figures, and the lack of the expected difference in mPFC excitability between the tested groups is thoroughly and constructively discussed.

      Weaknesses:

      Firstly, although I appreciate the relatively large sample size, pooling data recorded by different researchers using different experimental protocols inevitably increases sample variability and may limit the availability of certain measures, as was the case here with the reports of pain intensity in the patient group. Secondly, the analysis heavily relies on the estimation of cortical sources, an approach that offers many advantages but may yield imprecise results, especially when default conduction models, source models, and electrode coordinates are used. In my opinion, this point should be discussed as well.

    3. Reviewer #2 (Public review):

      Summary:

      This study evaluated the aperiodic component in the medial prefrontal cortex (mPFC) using resting-state EEG recordings from 149 individuals with chronic pain and 115 healthy participants. The findings showed no significant differences in the aperiodic component of the mPFC between the two groups, nor was there any correlation between the aperiodic component and pain intensity. These results were consistent across various chronic pain subtypes and were corroborated by whole-brain analyses. The study's robustness was further reinforced by preregistration and multiverse analyses, which accounted for a wide range of methodological choices.

      Strengths:

      This study was rigorously conducted, yielding clear and conclusive results. Furthermore, it adhered to stringent open and reproducible science practices, including preregistration, blinded data analysis, and Bayesian hypothesis testing. All data and code have been made openly available, underscoring the study's commitment to transparency and reproducibility.

      Weaknesses:

      The aperiodic exponent of the EEG power spectrum is often regarded as an indicator of the excitatory/inhibitory (E/I) balance. However, this measure may not be the most accurate or optimal for quantifying E/I balance, a limitation that the authors might consider addressing in the future.

    1. eLife Assessment

      This important study presents results for the theory of odor coding in hyperbolic spaces by revealing spiral trajectories in the dynamics of odors during natural, ethologically relevant processes such as ripening. In the current manuscript, the strength of the evidence is solid and would be strengthened by answering several technical points raised by reviewers.

    2. Reviewer #1 (Public review):

      Summary:

      This work represents a new development in the theory of odor coding and recognition, based on mapping odor mixtures in low-dimensional hyperbolic spaces. The authors describe the dynamics of odor mapping, across stages of ripening and fermentation (trajectories in odor space), which, surprisingly, generalize across fruit types.

      Strengths:

      The approach provides a remarkably concise and clear description of the odor dynamics. As a model, the approach is mathematically exhaustive and generalizable. The analyses are technically correct and statistically robust.

      Weaknesses:

      None.

    3. Reviewer #2 (Public review):

      This article presents an analysis of the chemical composition of head-space generated by fruit at differing stages of ripeness. The authors used gas chromatography-mass spectrometry (GC-MS) to record the chemical makeup of the respective head-space samples. The authors process the data and present it in a low dimensional space. They then draw conclusions from the geometry of that representation about the process of fermentation.

      I have a number of major concerns with some of the stages in the argument advanced by the authors:

      (1) As far as I understand, the authors restrict their analysis to 13 molecules which appear in samples of all three levels of ripeness. This choice causes the analysis to overlook the very likely (and meaningful) possibility that different molecules present at different levels of ripeness are informative and might support different results.

      (2) It is unclear what was used as a control. Was it an empty bag? Please include the control results in your supplementary table, or indicate in the text if you eliminated compounds that were found in the control.

      (3) It is not clear that Figure 2-H _looks_ like a spiral. The authors should provide a quantifiable measure of the quality of the fit of a spiral relative to other paths. Furthermore, in the section "collective spiral ...", at the end of paragraph one, the authors write that "the points were best fitted by a two parameter Archimedean spiral". Best out of what? Best out of all two-parameter spirals? Please explain.

      (4) In the section "estimating odor source phenotype ... " the authors write: "we first calculated the association of odorant compounds with different phenotypes in this dataset" how was that done?

      (5) Even if hyperbolic space MDS is slightly better, an R^2 value for Euclidean MDS of 0.797 is very good and one could say that Euclidean MDS is also an option.

      (6) In the section "collective spiral ...", near the end of paragraph two: "we removed outlier samples for days 10 and 17 for two reasons...". Why should a smaller number of samples make a certain day an outlier?

      (7) In the section titled "collective spiral progression of multiple...", the authors write: "the hyperbolic t-sne embedding exhibited batch effects across runs that amounted to rotation of the data. To compensate for these effects and combine data across runs we performed Procrustes analysis to align data across runs".

      Can we be sure that this process does not itself manufacture an alignment of data? The authors should apply the same process to random or shuffled data and see if the result is different from the actual data.
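
      The shuffle control requested in point (7) can be sketched as a permutation test. The example below is a simplified illustration and makes several assumptions that are not taken from the manuscript: it works in the flat 2D plane rather than in the hyperbolic embedding, it fits rotation only (no scaling or reflection), and the names `rotation_disparity` and `shuffle_null` as well as the spiral-shaped test points are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def center(points: Sequence[Point]) -> List[Point]:
    """Translate points so that their centroid is at the origin."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return [(x - cx, y - cy) for x, y in points]


def rotation_disparity(reference: Sequence[Point], target: Sequence[Point]) -> float:
    """Sum of squared distances after optimally rotating target onto reference."""
    ref = center(reference)
    tgt = center(target)
    # Closed-form least-squares rotation angle in 2D.
    num = sum(tx * ry - ty * rx for (rx, ry), (tx, ty) in zip(ref, tgt))
    den = sum(tx * rx + ty * ry for (rx, ry), (tx, ty) in zip(ref, tgt))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    rotated = [(c * tx - s * ty, s * tx + c * ty) for tx, ty in tgt]
    return sum((rx - x) ** 2 + (ry - y) ** 2 for (rx, ry), (x, y) in zip(ref, rotated))


def shuffle_null(
    reference: Sequence[Point],
    target: Sequence[Point],
    n_shuffles: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return the observed disparity and a permutation p-value.

    The null distribution is built by shuffling which target point is paired
    with which reference point before re-running the same alignment.
    """
    rng = random.Random(seed)
    observed = rotation_disparity(reference, target)
    shuffled = list(target)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if rotation_disparity(reference, shuffled) <= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_shuffles + 1)


# Hypothetical test case: a spiral-shaped run and a copy rotated by 40 degrees.
run_a = [((1 + 0.3 * 0.5 * i) * math.cos(0.5 * i),
          (1 + 0.3 * 0.5 * i) * math.sin(0.5 * i)) for i in range(12)]
angle = math.radians(40)
run_b = [(math.cos(angle) * x - math.sin(angle) * y,
          math.sin(angle) * x + math.cos(angle) * y) for x, y in run_a]
observed, p_value = shuffle_null(run_a, run_b)
print(observed, p_value)
```

      If Procrustes alignment of shuffled data produced disparities as low as those of the real data, the apparent agreement across runs would be a product of the alignment step. A small p-value indicates instead that the agreement depends on the true sample correspondence.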

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

      Aging is associated with a number of physiologic changes including perturbed circadian rhythms. However, mechanisms by which rhythms are altered remain unknown. Here the authors tested the hypothesis that age-dependent factors in the sera affect the core clock or outputs of the core clock in cultured fibroblasts. They find that both sera from young and old donors are equally potent at driving robust ~24h oscillations in gene expression, and report the surprising finding that the cyclic transcriptome after stimulation by young or old sera differs markedly. In particular, genes involved in the cell cycle and transcription/translation remain rhythmic in both conditions, while genes associated with oxidative phosphorylation and Alzheimer's Disease lose rhythmicity in the aged condition. Also, the expression of cycling genes associated with cholesterol biosynthesis increases in the cells entrained with old serum. Together, the findings suggest that age-dependent blood-borne factors, yet to be identified, affect circadian rhythms in the periphery. The most interesting aspect of the paper is that the data suggest that the same system (BJ-5TA), may significantly change its rhythmic transcriptome depending on how the cells are synchronized. While there is a succinct discussion point on this, it should be expanded and described whether there are parallels with previous works, as well as what would be possible mechanisms for such an effect.

      We’ve expanded our discussion in the manuscript to discuss possible mechanisms and also how the genes/pathways implicated in our study relate to other aging literature.  

      Major points: 

      Fig 1 and Table S1. Serum composition and levels of relevant blood-borne factors probably change as a function of time. At what time of the day were the serum samples from the old and young groups collected? This important information should be provided in the text and added to Table S1.

      We made sure to highlight the collection time in the abstract of the manuscript: “We collected blood from apparently healthy young (age 25-30) and old (age 70-76) individuals at 14:00 (1) and used the serum to synchronize cultured fibroblasts.” The time of blood draw is also stated in other sections of the paper (Introduction and Methods). Since Table S1 is demographic information, we did not think that the blood draw time fit best there, but hopefully it is now clear in the text.

      Fig 2A. Luminescence traces: the manuscript would greatly benefit from inclusion of raw luminescence traces.

      Raw luminescence traces have been added to Figure S3 (S3A).

      Fig 2. Of the many genes that change their rhythms after stimulation with young and old sera, what are the typical fold changes? For example, it would be useful to show histograms for the two groups. Does one group tend to have transcript rhythms of higher or lower fold changes? 

      We’ve presented these data in Figure S5. There are a few significant differences, but largely the groups are similar in terms of fold change.

      Fig. 2 Gene expression. Also here, the presentation would benefit from showing a few key examples for different types of responses. 

      Sample traces of genes that gain rhythmicity, lose rhythmicity, phase shift, and change MESOR are now illustrated in Figure S6.

      What was the rationale to use these cells over the more common U2OS cells? Are there similarities between the rhythmic transcriptomes of the BJ-5TA cells and that of U2OS cells or other human cells? This could easily be assessed using published datasets. 

      The original rationale to use BJ-5TA fibroblast cells was that we were aiming to build upon an observation found in a previous study (2) which showed that circadian period changes with age in human fibroblasts. While our findings did not match theirs, we think an added benefit of using the BJ-5TA line is that, unlike U2OS cells, it is not a carcinoma-derived cell line. We’ve added this point in lines 98-101.

      Our study finds many more rhythmic transcripts compared to the previous studies examining U2OS cells. This can be attributed to several factors including differences in methods, including the use of human serum in our study, cell type differences, or decoupling of rhythms in some cancer cells. While a comparison of BJ-5TA cells and U2OS cells could be interesting, a proper comparison requires investigation of many data sets, since any pair of BJ-5TA and U2OS data sets will most likely differ in some detail of experimental design or data processing pipeline, which could contribute to observed differences in rhythmic transcripts.

      That being said, we compared clock reference genes (see Author response image 1) between BJ-5TA and U2OS cells, comparing circadian profiles obtained from our data with those available on CircaDB. These circadian profiles exhibit many similarities and a few differences. The peak to trough ratios (amplitudes) are quite similar for ARNTL, NR1D1, NR1D2, PER2, PER3, and are about 25% lower for CRY1 and somewhat higher for TEF (about 15%) in our data. We find that the MESORs are generally similar with the exception of NR1D1, which is much lower, and NR1D2, which is much higher in our data.

      Author response image 1.

      BJ-5TA and U2OS Cells Exhibit Similar Profiles of Circadian Gene Transcription. We compared the transcriptomic profiles of the BJ-5TA cells in young and old serum (left) to the U2OS transcriptomic data (right) available on CircaDB, a database containing profiles of several circadian reference genes in U2OS cells. This figure suggests that circadian profiles of these genes exhibit many similarities. We find that the peak to trough ratios (amplitudes) are similar for ARNTL, NR1D1, NR1D2, PER2, PER3, and that the MESORs are similar (with the exception of NR1D1, which is much lower, and NR1D2, which is much higher in the BJ-5TA cells). We find that the amplitude of CRY1 is ~25% lower and that of TEF is ~15% higher for the BJ-5TA cells. The axes for plots on the left show counts divided by 3.5 in order to make the MESORs of ARNTL similar and ease comparison.
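
      For readers unfamiliar with the quantities compared above, the following is a minimal sketch of how a MESOR, an amplitude and a peak to trough ratio can be estimated from a time series. It assumes a single-component cosinor model with a fixed 24 h period fitted by ordinary least squares; the rhythmicity method actually used in the manuscript is not stated in this response, and the function name `cosinor_fit` and the synthetic trace are hypothetical.

```python
import numpy as np


def cosinor_fit(t_hours, y, period=24.0):
    """Least-squares fit of y = MESOR + A * cos(2*pi*(t - acrophase) / period)."""
    omega = 2 * np.pi / period
    # Linearised design matrix: intercept, cosine term and sine term.
    design = np.column_stack([np.ones_like(t_hours),
                              np.cos(omega * t_hours),
                              np.sin(omega * t_hours)])
    (mesor, beta, gamma), *_ = np.linalg.lstsq(design, y, rcond=None)
    amplitude = np.hypot(beta, gamma)
    # Time of the fitted peak, in hours, wrapped into [0, period).
    acrophase = (np.arctan2(gamma, beta) / omega) % period
    # Only meaningful when the fitted trough (mesor - amplitude) is positive.
    peak_to_trough = (mesor + amplitude) / (mesor - amplitude)
    return mesor, amplitude, acrophase, peak_to_trough


# Synthetic, noise-free trace sampled every 2 h over two days.
t = np.arange(0.0, 48.0, 2.0)
y = 100.0 + 20.0 * np.cos(2 * np.pi * (t - 6.0) / 24.0)

mesor, amplitude, acrophase, peak_to_trough = cosinor_fit(t, y)
print(mesor, amplitude, acrophase, peak_to_trough)
```

      In this parameterisation, dividing all counts by a constant (as done for the left-hand plots) rescales the MESOR and amplitude together and leaves the peak to trough ratio unchanged.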

      For the rhythmic cell cycle genes, could this be the consequence of the serum which synchronizes also the cell cycle, or is it rather an effect of the circadian oscillator driving rhythms of cell cycle genes? 

      This is an interesting point. Given our previous data showing that the cell cycle gene cyclin D1 is regulated by clock transcription factors (3), we believe the circadian oscillator drives, or at least contributes to, rhythms of cell cycle genes. However, the serum clearly makes a difference as we find that MESORs of cell cycle genes decrease with aged serum. This is consistent with the decreased proliferation previously observed in aged human tissue (4).

      While the reduction of rhythmicity in the old serum for oxidative phosphorylation transcripts is very interesting and fits with the general theme that metabolic function decreases with age, it is puzzling that the recipient cells are the same, but it is only the synchronization by the old and young serum that changes. Are the authors thus suggesting that decrease of metabolic rhythms is primarily a non cell-autonomous and systemic phenomenon? What would be a potential mechanism? 

      We are indeed suggesting this, although it is also possible that it is not cycling per se, but rather an overall inefficiency of oxidative phosphorylation that is conveyed by the serum. Relating other work in the field to our findings, we’ve added the following to our discussion: “Previous work in the field demonstrates that synchronization of the circadian clock in culture results in cycling of mitochondrial respiratory activity (5,6) further underscoring the different effects of old serum, which does not support oscillations of oxidative phosphorylation associated transcripts. Age-dependent decrease in oxidative phosphorylation and increase in mitochondrial dysfunction (7) has been seen in aged fibroblasts (8) and contributes to age-related diseases (9). We suggest that the age-related inefficiency of oxidative phosphorylation is conferred by serum signals to the cells such that oxidative phosphorylation cycles are mitigated. On the other hand, loss of cycling could contribute to impairments in mitochondrial function with age.”

      The delayed shifts after aged serum for clock transcripts (but not for Bmal1) are interesting and indicate that there may be a decoupling of Bmal1 transcript levels from the other clock gene phases. How do the authors interpret this? Could it be related to altered chronotypes in the elderly?

      One possible explanation is that the delay of NPAS2, BMAL1’s binding partner, results in a delay in the transcription of clock-controlled genes/negative arm genes. Since the RORs do not seem to be affected, Bmal1 is transcribed/translated as usual, but there isn’t enough NPAS2 to bind with BMAL1. In this case downstream genes are slower to transcribe, causing the phase delay.

      Reviewer #2 (Public Review): 

      Schwarz et al. have presented a study aiming to investigate whether circulating factors in sera of subjects are able to synchronize, depending on age, circadian rhythms of fibroblasts. The authors used human serum taken from either old (age 70-76) or young (age 25-30) individuals to synchronise cultured fibroblasts containing a clock gene promoter driven luciferase reporter, followed by RNA sequencing to investigate whole gene expression.

      This study has the potential to be very interesting, as evidence of circulating factors in sera that mediate peripheral rhythms has long been sought after. Moreover, those factors may be affected by age, which could contribute to the weakened circadian rhythmicity observed with aging.

      Here, the authors concluded that both old and young sera are equally competent at driving robust 24 hour oscillations, in particular for clock genes, although the cycling behaviour and nature of different genes is altered between the two groups, which is attributed to the age of the individuals. This conclusion could however be influenced by individual variabilities within and between the two age groups. The groups are relatively small, with only four individuals (two females and two males) per group. In addition, factors such as food intake and exercise prior to the blood draw, and/or chronotype, known to affect systemic signals, are not taken into consideration. As seen in figure 4, traces from different individuals vary heavily in terms of their patterns, which is not addressed in the text. Only analysing the summary average curve of the entire group may be masking the true data. More focus should be attributed to investigating the effects of serum from each individual and observing common patterns. Additionally, there are many potential causes of variability, instead of or in addition to age, that may be contributing to the variation both between the groups and between individuals within groups. All of this should be addressed by the authors and commented on appropriately in the text.

      We are not aware of any specific feature distinguishing the subjects (other than age) that could account for the differences between old and young. The fact that we see significant differences between the two groups, even with the relatively small size of the groups, suggests strongly that these differences are largely due to age. Nevertheless, we acknowledge that individual variability can be a contributing factor. For instance, the change in phase of clock genes appears to be driven largely by two subjects. We have commented on this and individual differences, in general, in the discussion.  

      The authors also note in the introduction that rhythms in different peripheral tissues vary in different ways with age; however, the entire study is performed only on fibroblasts, classified as peripheral tissue by the authors. It would be very interesting to investigate whether the observed changes in fibroblasts extend to other cell lines of diverse organ origin. This could provide information about whether circulating circadian synchronising factors could exert their function systemically or on specific tissues. At the very least, this hypothesis should be addressed within the discussion.

      It is likely that factors circulating in serum act on several tissues, and so their effects are relatively broad. However, this would require extensive investigation of other tissues. We now discuss this in the manuscript.

      In addition to the limitations indicated above, I consider the data of the study insufficiently analysed beyond the rhythmicity analysis. Results from the STRING and IPA analysis were merely descriptive, and a more comprehensive bioinformatic analysis would provide additional information about potential molecular mechanisms explaining the differential gene expression. For example, enrichment of transcription factor binding sites in those genes with different patterns could pinpoint chromatin regulatory pathways.

      We performed LinC similarity analysis (LISA) to study enrichment of transcription factor binding. Results are displayed in Fig 3B and in lines 157-168. 

      Recommendations for the authors:

      The two reviewers and reviewing editor have agreed on the following recommendations for the authors: 

      Major: 

      (1) The bioinformatic analysis would benefit from a more thorough focus on variability between individuals. Specifically, the main conclusion of the manuscript could be significantly influenced by individual variabilities within and between the two age groups. This is of particular concern, as the groups are relatively small (four individuals, two females and two males, per group). In addition, the consideration of factors such as food intake and exercise prior to the blood draw, and/or chronotype, known to affect systemic signals, should be more adequately explained. The lab is an experienced chronobiology lab, and thus we are confident that these factors had been thought of, but this needs to be made clearer.

      As seen in Figure 4, traces from different individuals vary heavily in terms of their patterns, which is not addressed in the text. Only analysing the summary average curve of the entire group may be masking the relevant data. Furthermore, there are many potential causes of variability, instead of or in addition to age, that may be contributing to the variation both between the groups and between individuals within groups. All of this should be addressed by the authors and commented on appropriately in the text.

      We are not aware of any specific feature distinguishing the subjects (other than age) that could account for the differences between old and young. The fact that we see significant differences between the two groups, even with the relatively small size of the groups, suggests strongly that these differences are largely due to age. Nevertheless, we acknowledge that individual variability can be a contributing factor. For instance, the change in phase of clock genes appears to be driven largely by two subjects. We have commented on this and individual differences, in general, in the discussion. 

      (2) The study would benefit from a more thorough analysis of the data beyond the rhythmicity analysis. Results from the STRING and IPA analysis were merely descriptive, and a more comprehensive bioinformatic analysis would provide additional information about potential molecular mechanisms explaining the differential gene expression. For example, enrichment of transcription factor binding sites in those genes with different patterns could pinpoint chromatin regulatory pathways. This would provide additional value to the study, especially given the otherwise apparent lack of any mechanistic explanation.

      We performed LinC similarity analysis (LISA) to study enrichment of transcription factor binding. Results are displayed in Fig 3B and in lines 157-168.

      (3) There were some questions about the amplitude of the core circadian clock gene rhythms raised, which in other human cell types would be much higher. A comment on this matter and the provision of the raw luminescence traces for Fig 2A would be greatly beneficial.

      Addressing the same topic: what are the typical fold changes of the many genes that change their rhythms after stimulation with young and old sera? For example, it would be useful to show histograms for the two groups. Does one group tend to have transcript rhythms of higher or lower fold changes? The presentation of the manuscript would further benefit from showing a few key examples for different types of responses. 

      The average luminescence trace for each individual serum sample from Fig 2A has been added to Fig S3A.

      We’ve presented the fold change data in Figure S5. There are a few significant differences, but largely the groups are similar in terms of fold change.

      (4) There are several points that we recommend to consider to add to the discussion: 

      What was the rationale to use these cells over the more common U2OS cells? Are there similarities between the rhythmic transcriptomes of the BJ-5TA cells and that of U2OS cells or other human cells? It should be relatively easy to address this point by assessing published datasets. 

      The original rationale to use BJ-5TA fibroblast cells was that we were aiming to build upon an observation found in a previous study (2) which showed that circadian period changes with age in human fibroblasts. While our findings did not match theirs, we think an added benefit of using the BJ-5TA line is that, unlike U2OS cells, it is not a carcinoma-derived cell line. We’ve added this point in lines 98-101.

      Our study finds many more rhythmic transcripts compared to the previous studies examining U2OS cells. This can be attributed to several factors including differences in methods, including the use of human serum in our study, cell type differences, or decoupling of rhythms in some cancer cells. While a comparison of BJ-5TA cells and U2OS cells could be interesting, a proper comparison requires investigation of many data sets, since any pair of BJ-5TA and U2OS data sets will most likely differ in some detail of experimental design or data processing pipeline, which could contribute to observed differences in rhythmic transcripts.

      That being said, we compared clock reference genes (see Author response image 1) between BJ-5TA and U2OS cells, comparing circadian profiles obtained from our data with those available on CircaDB. These circadian profiles exhibit many similarities and a few differences. The peak to trough ratios (amplitudes) are quite similar for ARNTL, NR1D1, NR1D2, PER2, PER3, and are about 25% lower for CRY1 and somewhat higher for TEF (about 15%) in our data. We find that the MESORs are generally similar with the exception of NR1D1, which is much lower, and NR1D2, which is much higher in our data.

      For the rhythmic cell cycle genes, could this be the consequence of the serum which synchronizes also the cell cycle, or is it rather an effect of the circadian oscillator driving rhythms of cell cycle genes? 

      This is an interesting point. Given our previous data showing that the cell cycle gene cyclin D1 is regulated by clock transcription factors (3), we believe the circadian oscillator drives, or at least contributes to, rhythms of cell cycle genes. However, the serum clearly makes a difference as we find that MESORs of cell cycle genes decrease with aged serum. This is consistent with the decreased proliferation previously observed in aged human tissue.

      While the reduction of rhythmicity in the old serum for oxidative phosphorylation transcripts is very interesting and fits with the general theme that metabolic function decreases with age, it is puzzling that the recipient cells are the same, but it is only the synchronization by the old and young serum that changes. Are the authors thus suggesting that decrease of metabolic rhythms is primarily a non cell-autonomous and systemic phenomenon? What would be a potential mechanism? 

      It may not be the cycling per se, but rather an overall inefficiency of oxidative phosphorylation that is conveyed by the serum. Relating other work in the field to our findings, we’ve added the following to our discussion: “Previous work in the field demonstrates that synchronization of the circadian clock in culture results in cycling of mitochondrial respiratory activity (5,6) further underscoring the different effects of old serum, which does not support oscillations of oxidative phosphorylation associated transcripts. Age-dependent decrease in oxidative phosphorylation and increase in mitochondrial dysfunction (7) is seen also in aged fibroblasts (8) and contributes to age-related diseases (9). We suggest that the age-related inefficiency of oxidative phosphorylation is conferred by serum signals to the cells such that oxidative phosphorylation cycles are mitigated. On the other hand, loss of cycling could contribute to impairments in mitochondrial function with age.”

      The delayed shifts after aged serum for clock transcripts (but not for Bmal1) are interesting and indicate that there may be a decoupling of Bmal1 transcript levels from the other clock gene phases. How do the authors interpret this? Could it be related to altered chronotypes in the elderly? 

      One possible explanation is that the delay of NPAS2, BMAL1’s binding partner, results in a delay in the transcription of clock-controlled genes/negative arm genes. Since the RORs do not seem to be affected, Bmal1 is transcribed/translated as usual, but there isn’t enough NPAS2 to bind with BMAL1. In this case downstream genes are slower to transcribe, causing the phase delay.

      The discussion would also benefit from mentioning parallels and dissimilarities with previous works, as well as what would be possible mechanisms for such an effect.

      We’ve expanded our discussion in the manuscript to discuss possible mechanisms and also how the genes/pathways implicated in our study relate to other aging literature.  

      Minor: 

      While time of serum collection is provided in the methods, it would be very useful to provide this information, along with the accompanying argumentation also at a more prominent position and to also add it to Table S1. 

      We made sure to highlight the collection time in the abstract of the manuscript: “We collected blood from apparently healthy young (age 25-30) and old (age 70-76) individuals at 14:00 (1) and used the serum to synchronize cultured fibroblasts.” The time of blood draw is also stated in other sections of the paper (Introduction and Methods). Since Table S1 is demographic information, we did not think that the blood draw time fit best there, but hopefully it is now clear in the text.

      L73 EKG: define the abbreviation 

      We rewrote this paragraph, but defined the term where it is used in the paper.

      L77: transfected BJ-5TA fibroblasts. Mention in the text that these are stably transfected cells. 

      We added this to the text.

      L88: Day 2 also revealed different phases of cyclic expression between young and old "groups" for a larger number of genes. Here it is only two donors, right? 

      Yes, we swapped out the word “groups” for “subjects”.

      L115. MESORs of steroid biosynthesis genes, particularly those relating to cholesterol biosynthesis, were also increased in the old sera condition. This is quite interesting, can the authors speculate on the significance of this finding? 

      We’ve added discussion about this finding in the context of the literature in our discussion.

      Fig 3. - FDRs are only listed for certain KEGG pathways, and gene counts for each pathway are also missing, which excludes some valuable context for drawing conclusions. Full tables of KEGG pathway enrichment outputs should be provided in supplementary materials. Input gene lists should also be uploaded as supplementary data files.

      Both output and input files are included in this submission as additional files.  

      Line 322 - How many replicates were excluded in the end for each group? Providing this information would strengthen the claim that the ability of both old and young serum to drive 24h oscillations in fibroblasts is robust and not only individual. 

      Each serum was tested in triplicate in two individual runs of the experiment. Of the 15 serum samples, on one of the runs, a triplicate for each of two serum samples (one old, one young) was excluded. Given that only one technical replicate in one run of the experiment had to be excluded for one old and one young individual out of all the samples assayed, this supports the idea that young and old serum drive robust oscillations.

      Line 373 - Should list which active interaction sources were used for analysis. 

      In this manuscript we used STRING (search tool for retrieval of interacting genes) analysis to broadly identify relevant pathways defined by different algorithms. From these data, we focused in particular on KEGG pathways.

      Reviewer #1 (Recommendations For The Authors): 

      These comments are in addition to those provided above: 

      Minor: 

      L73 EKG: define the abbreviation 

      We rewrote this paragraph, but defined the term where it is used in the paper.

      L77: transfected BJ-5TA fibroblasts. Mention in the text that these are stably transfected cells. 

      We added this to the text.

      L88: Day 2 also revealed different phases of cyclic expression between young and old "groups" for a larger number of genes. Here it is only two donors, right?

      Yes, we swapped out the word “groups” for “subjects”.

      L115. MESORs of steroid biosynthesis genes, particularly those relating to cholesterol biosynthesis, were also increased in the old sera condition. This is quite interesting, can the authors speculate on the significance of this finding? 

      We’ve added discussion about this finding in the context of the literature.

      Fig.4 The fold change amplitude of the clock gene seems quite a bit lower than what is usually expected (for Nr1d1 it is usually 10 fold). The authors should provide an explanation and discuss this. 

      There are a variety of factors that contribute to the fold change amplitude of clock genes. First, the change in amplitude of clock genes is lower in vitro compared to in vivo samples. For example, in U2OS cell cultures the fold change in the cycling of Nr1d1 is only 2 fold and is not significantly different from the fold change we observe (as shown in the U2OS data from CircaDB plotted in Figure 1R). Second, the method of synchronization contributes to the strength of the rhythms. Serum synchronization is generally less effective at driving strong clock cycling than forskolin or dexamethasone although, as noted in the manuscript, it may promote the cycling of more genes. Lastly, rhythm amplitude is also dependent on the cell type in question so cell to cell variability also contributes to differences. However, overall, we do not find major differences in comparing the U2OS data and ours. Please note that the y-axis has a logarithmic scale.

      What is the authors' strategy to identify which serum components that are responsible for the reported changes? This should be discussed. 

      In the future, we intend to analyze the serum factors using a combination of fractionation and either proteomics or metabolomics to identify relevant factors. We have added this to the discussion.

      Reviewer #2 (Recommendations For The Authors): 

      Overall, the article is well-written but lacks some more rigorous data analysis as mentioned in the public review above. In addition to a more thorough analysis approach focusing much more heavily on individual variability, several other changes can be made to strengthen this study:

      Fig 3. - FDRs are only listed for certain KEGG pathways, and gene counts for each pathway are also missing, which excludes some valuable context for drawing conclusions. Full tables of KEGG pathway enrichment outputs should be provided in supplementary materials. Input gene lists should also be uploaded as supplementary data files. 

      Both output and input files are included in this submission as additional files.

      Fig 1A. - Only n=5 participants were used for this analysis, explanation of the exclusion criteria for the other participants would be useful. 

      As Figure 1A is a schematic, we assume the reviewer is referring to Figure 1B. We’ve provided a flow chart of subject inclusion/exclusion in Figure S2.

      Fig 2. - For circadian transcriptome analysis only n=4 participants were used - what criteria were used to exclude individuals, and why were only these individuals used in the end?

      As patient recruitment was interrupted by COVID, we selected samples where we had sufficient serum to effectively carry out the RNA seq experiment and control for age and sex.

      Line 322 - How many replicates were excluded in the end for each group? Providing this information would strengthen the claim that the ability of both old and young serum to drive 24h oscillations in fibroblasts is robust and not only individual. 

      Each serum was tested in triplicate in two individual runs of the experiment. Of the 15 serum samples, on one of the runs, a triplicate for each of two serum samples (one old, one young) was excluded. Given that only one technical replicate in one run of the experiment had to be excluded for one old and one young individual out of all the samples assayed, this supports the idea that young and old serum drive robust oscillations.

      Line 373 - Should list which active interaction sources were used for analysis. 

      In this manuscript we used STRING (search tool for retrieval of interacting genes) analysis to identify relevant pathways. We do not present any STRING networks in the paper.

      Line 68 - "These novel findings suggest that it may be possible to treat impaired circadian physiology and the associated disease risks by targeting blood borne factors." This is a complete overstatement that cannot be sustained by the limited findings provided by the authors.

      We’ve modified this statement to avoid overstating results.

      (1) Pagani, L. et al. Serum factors in older individuals change cellular clock properties. Proceedings of the National Academy of Sciences 108, 7218–7223 (2011).

      (2) Pagani, L. et al. Serum factors in older individuals change cellular clock properties. Proc Natl Acad Sci U S A 108, 7218–7223 (2011).

      (3) Lee, Y. et al. G1/S cell cycle regulators mediate effects of circadian dysregulation on tumor growth and provide targets for timed anticancer treatment. PLOS Biology 17, e3000228 (2019).

      (4) Tomasetti, C. et al. Cell division rates decrease with age, providing a potential explanation for the age-dependent deceleration in cancer incidence. Proceedings of the National Academy of Sciences 116, 20482–20488 (2019).

      (5) Cela, O. et al. Clock genes-dependent acetylation of complex I sets rhythmic activity of mitochondrial OxPhos. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research 1863, 596–606 (2016).

      (6) Scrima, R. et al. Mitochondrial calcium drives clock gene-dependent activation of pyruvate dehydrogenase and of oxidative phosphorylation. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research 1867, 118815 (2020).

      (7) Lesnefsky, E. J. & Hoppel, C. L. Oxidative phosphorylation and aging. Ageing Research Reviews 5, 402–433 (2006).

      (8) Greco, M. et al. Marked aging-related decline in efficiency of oxidative phosphorylation in human skin fibroblasts. The FASEB Journal 17, 1706–1708 (2003).

      (9) Federico, A. et al. Mitochondria, oxidative stress and neurodegeneration. Journal of the Neurological Sciences 322, 254–262 (2012).

    2. eLife Assessment

      The authors tested the hypothesis that age-dependent factors in human sera affect the core circadian clock or its outputs in cultured fibroblasts, and they provide compelling evidence that genes involved in the cell cycle and transcription/translation remain rhythmic in both conditions, genes associated with oxidative phosphorylation and Alzheimer's Disease lose rhythmicity in the aged condition, while the expression of cycling genes associated with cholesterol biosynthesis increases in the cells entrained with old serum. Together, the findings suggest that yet to be identified age-dependent blood-borne factors affect circadian rhythms in the periphery. The paper provides fundamental insights and a possible explanation for previous observations showing that circadian gene expression in peripheral tissues tends to dampen or phase-shift with age.

    3. Reviewer #1 (Public review):

      Aging is associated with a number of physiologic changes including perturbed circadian rhythms. However, mechanisms by which rhythms are altered remain unknown. Here the authors tested the hypothesis that age-dependent factors in the sera affect the core clock or outputs of the core clock in cultured fibroblasts. They find that sera from both young and old donors are equally potent at driving robust ~24h oscillations in gene expression, and report the surprising finding that the cyclic transcriptome after stimulation by young or old sera differs markedly. In particular, genes involved in the cell cycle and transcription/translation remain rhythmic in both conditions, while genes associated with oxidative phosphorylation and Alzheimer's Disease lose rhythmicity in the aged condition. Also, the expression of cycling genes associated with cholesterol biosynthesis increases in the cells entrained with old serum. Together, the findings suggest that age-dependent blood-borne factors, yet to be identified, affect circadian rhythms in the periphery. The most interesting aspect of the paper is that the data suggest that the same system (BJ-5TA) may significantly change its rhythmic transcriptome depending on how the cells are synchronized. While there is a succinct discussion point on this, it should be expanded to describe whether there are parallels with previous works, as well as possible mechanisms for such an effect.

      Comments on revised version:

      The authors have done a thorough revision of their manuscript and provided convincing answers to all of my points. In particular, I applaud the authors for having added raw luminescence traces, and for providing Figure S5 on the amplitudes. Perhaps the authors could add a comment in the final text that the amplitudes are fairly low: 10^0.1 ≈ 1.26, which means that the bulk of those genes have rhythms of at most about 26%, which could reflect that the synchronization of the cells is partial.
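
      The amplitude arithmetic in this comment can be checked directly. The following is a minimal sketch, assuming the amplitudes in Figure S5 are expressed as log10 fold changes (an assumption, as is the hypothetical function name); it converts such an amplitude into a percent change.

```python
def log10_amplitude_to_percent(log10_amplitude: float) -> float:
    """Convert a log10 fold-change amplitude into a percent change.

    Assumes the amplitude is a log10 ratio (an assumption about Figure S5).
    """
    fold_change = 10 ** log10_amplitude
    return (fold_change - 1.0) * 100.0


# An amplitude of 0.1 on a log10 scale corresponds to a change of roughly a quarter.
percent = log10_amplitude_to_percent(0.1)
print(f"{percent:.1f}%")
```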

    1. eLife Assessment

      This technical study presents a novel sampling strategy for detecting synaptic coupling between neurons from dual pipette patch-clamp recordings in acute slices of mammalian brain tissue in vitro. The authors present solid evidence that this strategy, which incorporates automated patch clamp electrode positioning and cleaning for reuse with strategic neuron targeting, has the potential to substantially improve the efficiency of neuronal sampling with paired recordings. This technique and the extensions discussed will be useful for neuroscientists wanting to apply or already conducting automated multi-pipette patch clamp recording electrophysiology experiments in vitro for neuron connectivity analyses.

    2. Reviewer #1 (Public review):

      Summary:

      In this technical paper, the authors introduce an important variation on the fully automated multi-electrode patch-clamp recording technique for probing synaptic connections that they term "patch-walking". The patch-walking approach involves coordinated pipette route-planning and automated pipette cleaning procedures for pipette reuse to improve recording throughput efficiency, which the authors argue can theoretically yield almost twice the number of connections to be probed by paired recordings on a multi-patch electrophysiology setup for a given number of cells compared to conventional manual patch-clamping approaches used in brain slices in vitro. The authors show convincing results from recordings in mouse in vitro cortical slices, demonstrating the efficient recording of dozens of paired neurons with a dual patch pipette configuration for paired recordings and detection of synaptic connections. This approach will be of interest and valuable to neuroscientists conducting automated multi-patch in vitro electrophysiology experiments and seeking to increase efficiency of neuron connectivity detection while avoiding the more complex recording configurations (e.g., 8 pipette multi-patch recording configurations) used by several laboratories that are not readily implementable by most of the neuroscience community.

      Strengths:

      (1) The authors introduce the theory and methods and show experimental results for a fully automated electrophysiology dual patch-clamp recording approach with a coordinated patch-clamp pipette route-planning and automated pipette cleaning procedures to "patch-walk" across an in vitro brain slice.

      (2) The patch-walking approach offers throughput efficiency improvements over manual patch clamp recording approaches, especially for investigators looking to utilize paired patch electrode recordings in electrophysiology experiments in vitro.

      (3) Experimental results are presented from in vitro mouse cortical slices demonstrating the efficiency of recording dozens of paired neurons with a two-patch pipette configuration for paired recordings and detection of synaptic connections, demonstrating the feasibility and efficiency of the patch-walking approach.

      (4) The authors suggest extensions of their technique while keeping the number of recording pipettes employed and recording rig complexity low, which are important practical technical considerations for investigators wanting to avoid the more complex recording configurations (e.g., 8-10 pipette multi-patch recording configurations) used by several laboratories that are not readily implementable by most of the neuroscience community.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors aim to combine automated whole-cell patch clamp recording simultaneously from multiple cells. Using a 2-electrode approach, they are able to sample as many cells (and connections) from one slice, as would be achieved with a more technically demanding and materially expensive 8-electrode patch clamp system. They provide data to show that this approach is able to successfully record from 52% of attempted cells, which was able to detect 3 pairs in 71 screened neurons. The authors state that this is a step forward in our ability to record from randomly connected ensembles of neurons.

      Strengths:

      The conceptual approach of recording multiple partner cells one after another in a stepwise manner indeed increases the number of tested connections. This approach is widely applicable to both automated and manual methods. Such a method could be adopted for many connectivity studies using dual recording electrodes.

      The implementation of automated robotic whole-cell patch-clamp techniques from multiple cells simultaneously is a useful addition to the multiple techniques available to ex vivo slice electrophysiologists.

      The approach using 2 electrodes, which are washed between cells, is economically favourable, as this reduces equipment costs for recording multiple cells, and limits the wastage of capillary glass that would otherwise be used once.

      Weaknesses:

      (1) Based on the revised manuscript - a discussion of the implementation of this approach to manual methods is still lacking.

      (2) A comparison of measurements shown in Figure 2 to other methods has not been addressed adequately.

      (3) The morphological identification of neurons is understandably outside the remit of this project - but should be discussed and/or addressed. It was not suggested to perform detailed anatomical analysis - but to highlight the importance of this, and it should still be discussed.

      (4) The revised manuscript does not clearly state which cells were included in the analysis as far as I can see - and indeed cells with Access Resistance >40 MOhm appear to still be included in the data.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Yip and colleagues incorporated the pipette cleaning technique into their existing dual-patch robotic system, "the PatcherBot", to allow sequential patching of more cells for synaptic connection detection in living brain slices. During dual-patching, instead of retracting both electrodes after each recording attempt, the system cleaned just one of the electrodes and reused it to obtain another recording while maintaining the other. With one new patch clamp recording attempt, new connections can be probed. By placing one pipette in front of the other in this way, one can "walk" across the tissue, termed "patch-walking." This application could allow for probing additional neurons to test the connectivity using the same pipette in the same preparation.
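
      The efficiency argument behind patch-walking can be illustrated with a simple count. The sketch below is not taken from the manuscript: it assumes every patch attempt succeeds and that each recorded pair is tested in both directions, and the function names are hypothetical.

```python
def connections_probed_conventional(n_cells: int) -> int:
    """Conventional dual patch: each pair uses two fresh cells.

    Each pair is assumed to be tested in both directions (2 connections).
    """
    pairs = n_cells // 2
    return 2 * pairs


def connections_probed_patch_walking(n_cells: int) -> int:
    """Patch-walking: every new cell after the first is paired with the
    cell still held by the other pipette."""
    pairs = max(n_cells - 1, 0)
    return 2 * pairs


for n in (2, 10, 50):
    conventional = connections_probed_conventional(n)
    walking = connections_probed_patch_walking(n)
    print(f"{n} cells: conventional={conventional}, patch-walking={walking}")
```

      Under these idealized assumptions the ratio approaches two as the number of cells grows, which is consistent with the "almost twice" figure quoted in the reviews; failed attempts would reduce both counts.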

      Strengths:

      Compared to regular dual-patch recordings, this new approach could allow for probing more possible connections in brain slices with dual-patch recordings, thus having the potential to improve the efficiency of identifying synaptic connections.

      Weaknesses:

      While this new approach offers the potential to increase efficiency, it has several limitations that could curtail its widespread use.

      Loss of Morphological Information: Unlike traditional multi-patch recording, this approach likely loses all detailed morphology of each recorded neuron. This loss is significant because morphology can be crucial for cell type verification and understanding connectivity patterns by morphological cell type.

      Spatial Restrictions: The robotic system appears primarily suited to probing connections between neurons with greater spatial separation (~100µm ISD). This means it may not reliably detect connections between neurons in close proximity, a potential drawback given that the connectivity is much higher between spatially close neurons. This limitation could help explain the low connectivity rate (5%) reported in the study.

      Limited Applicability: While the approach might be valuable in specific research contexts, its overall applicability seems limited. It's important to consider the trade-off between efficiency and the specific questions being asked.

      Scalability Challenges: Scaling this method beyond a two-pipette setup may be difficult. Additional pipettes would introduce significant technical and logistical complexities.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers and editors for insightful feedback on how we could improve the manuscript. We have revised the manuscript and addressed the points raised.

      Regarding the technical issues raised about the quality of patch clamp recordings (Reviewer 2), we acknowledge that the upper limit of the access resistance cutoff should be lower and that the accepted change should be 10-20%. To this end, we have revised the manuscript to more accurately detail the quality metrics used. The access resistance for the neurons in paired recordings was below 40 MΩ (similar to the metric used by Kolb et al. 2019), and if the access changed above 50 MΩ, we stopped recording from that neuron. Furthermore, the inclusion of neurons in the histogram with access resistance above 50 MΩ was to highlight the total number of neurons patched but not necessarily used in paired recordings. As this was done with an automated robotic system, the neurons would still undergo an initial voltage clamp and current clamp protocol before the pipette would release the neuron and patch another cell. To the point of Reviewer 2, this patch-walk protocol could also be implemented using manual recording approaches, and this point has been included in the revised manuscript.

      Regarding the spatial restrictions (Reviewer 3), we agree that the average intersomatic distance is higher than ideal. This was likely due to failed patch attempts; for instance, if one pipette successfully achieved whole cell, and the other pipette had several sequential failed patch attempts, the intersomatic distance (ISD) would increase with each failed attempt due to the user-selected index of cells. Ideally, the pipettes would be walking across a slice with low ISD if the whole-cell success rate was closer to 100%. To overcome this challenge in future work, automated cell identification and tracking could enable the path planning to be continuously updated after each patch attempt. Given the whole-cell success rate efficiency for a given electrophysiologist, we believe that the automated robot could be improved in later versions to include route-planning algorithms to minimize the distance between neurons. Alternatively, this patch-walk system could also be integrated to improve connectivity yields for manual recording approaches.
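
      The idea of updating the path plan after each patch attempt can be sketched as a greedy nearest-neighbour walk. This is an illustrative sketch only, not the authors' route-planning algorithm: the function names, the 2D soma coordinates (assumed to be in µm), and the success callback are all hypothetical.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]  # hypothetical soma position in µm


def nearest_untried(held: Point, untried: List[Point]) -> Point:
    """Return the untried cell closest to the cell currently held."""
    return min(untried, key=lambda cell: math.dist(held, cell))


def patch_walk(
    start: Point,
    cells: List[Point],
    succeeds: Callable[[Point], bool],
) -> List[Tuple[Point, Point, float]]:
    """Greedy walk that re-plans from the held cell after every attempt.

    Returns (held cell, new cell, intersomatic distance) for each pair.
    """
    held = start
    untried = list(cells)
    pairs = []
    while untried:
        target = nearest_untried(held, untried)
        untried.remove(target)
        if succeeds(target):
            pairs.append((held, target, math.dist(held, target)))
            held = target  # the other pipette is cleaned and walks on from here
        # On failure the held cell is unchanged and the plan is recomputed.
    return pairs


# Toy example: the attempt on the cell at (10, 0) fails, so the walk
# re-plans to the next-closest cell instead of following a fixed index.
pairs = patch_walk(
    start=(0.0, 0.0),
    cells=[(100.0, 0.0), (10.0, 0.0), (20.0, 0.0)],
    succeeds=lambda cell: cell != (10.0, 0.0),
)
for held, target, isd in pairs:
    print(held, "->", target, f"ISD = {isd:.0f} µm")
```

      A greedy choice like this keeps each step short but is not guaranteed to minimize the total distance; it is shown only to make the re-planning step concrete.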

      For the point raised about morphological identification, we believe that while important, morphological identification is out of the scope for this project. Future work will include neuronal reconstruction. Regarding the other points, we will amend the manuscript to highlight other key metrics such as maximum time we could hold a neuron under the whole-cell configuration. Additionally, we agree with Reviewer 3 that some of the current language may cause confusion, and we will amend it accordingly.

      To all the reviewers, thank you for your time, understanding, and the opportunity to improve our manuscript.

    1. eLife Assessment

      This important study provides proof of principle that C. elegans models can be used to accelerate the discovery of candidate treatments for human Mendelian diseases by detailed high-throughput phenotyping of strains harboring mutations in orthologs of human disease genes. The data are compelling and support an approach that enables the potential rapid repurposing of FDA-approved drugs to treat rare diseases for which there are currently no effective treatments. The authors should provide a clearer explanation of how the statistical analyses were performed, as well as a link to a GitHub repository to clarify how figures and tables in the manuscript were generated from the phenotypic data.

    2. Reviewer #1 (Public review):

      Summary:

      As the scientific community identifies increasing numbers of genes and genetic variants that cause rare human diseases, a challenge in the field is to quickly identify pharmacological interventions to address known deficits. The authors point out that defining phenotypic outcomes required for drug screen assays is often a bottleneck, and emphasize how invertebrate models can be used for quick ID of compounds that may address genetic deficits. A major contribution of this work is to establish a framework for potential intervention drug screening based on quantitative imaging of morphology and mobility behavior, using methods that the authors show can define subtle phenotypes in a high proportion of disease gene knockout mutants. Overall, the work constitutes an elegant combination of previously developed high-volume imaging with highly detailed quantitative phenotyping (and some paring down to specific phenotypes) to establish proof of principle on how the combined applications can contribute to screens for compounds that may address specific genetic deficits, which can, in turn, suggest both mechanism and therapy.

      In brief, the authors selected 25 genes for which loss of function is implicated in human neuro-muscular disease and engineered deletions in the corresponding C. elegans homologs. The authors then imaged morphological features and behaviors prior to, during, and after blue light stimuli, quantitating features, and clustering outcomes as they elegantly developed previously (PMID 35322206; 30171234; 30201839). In doing so, phenotypes in 23/25 tested mutants could be separated enough to distinguish WT from mutant and half of those with adequate robustness to permit high-throughput screens, an outcome that supports the utility of related general efforts to ID phenotypes in C. elegans disease orthologs. A detailed discussion of 4 ciliopathy gene defects, and NALCN-related channelopathy mutants reveals both expected and novel phenotypes, validating the basic approach to modeling vetted targets and underscoring that quantitative imaging approaches reiterate known biology.

      The authors then screened a library of nearly 750 FDA-approved drugs for the capacity to shift the unc-80 NALCN channel-disrupted phenotype closer to the wild type. Top "mover" compounds shift the outcome in the experimental outcome space, and also reveal how "side effects" can be evaluated to prioritize compounds that confer the fewest changes of other parameters away from the center.

      Strengths:

      Although the imaging and data analysis approaches have been reported and the screen is restricted in scope and intervention exposure, it is impressive, encouraging and important that the authors strongly combine tools to demonstrate how quantitative imaging phenotypes can be integrated with C. elegans genetics to accelerate the identification of potential modulators of disease (easily extendable to other goals). Generation of deletion alleles and documentation of their associated phenotypes (available in supplemental data) provide potentially useful reagents/data to the field. The capacity to identify "over-shooting" of compound applications with suggestions for scale back and to sort efficacious interventions to minimize other changes to behavioral and physical profiles is a strong contribution.

      Weaknesses:

      The work does not have major weaknesses, and in revision, the authors have expanded the discussion to potential utility and application in the field.

      The authors have also taken into account minor modifications in writing.

    3. Reviewer #2 (Public review):

      Summary and strengths:

      O'Brien et al. present a compelling strategy to both understand rare disease that could have a neuronal focus and discover drugs for repurposing that can affect rare disease phenotypes. Using C. elegans, they optimize the Brown lab worm tracker and Tierpsy analysis platform to look at movement behaviors of 25 knockout strains. These gene knockouts were chosen based on a process to identify human orthologs that could underlie rare diseases. I found the manuscript interesting and a powerful approach to make genotype-phenotype connections using C. elegans. Given the rate that rare Mendelian diseases are found and candidate genes suggested, human geneticists need to consider orthologous approaches to understand the disease and seek treatments on a rapid time scale. This approach is one such way. Overall, I have a few minor suggestions and some specific edits.

      Weaknesses:

      (1) Throughout the text on figures, labels are nearly impossible to read. I had to zoom into the PDF to determine what the figure was showing. Please make text in all figures a minimum of 10 point font. Similarly, Figure 2D point type is impossible to read. Points should be larger in all figures. Gene names should be in italics in all figures, following C. elegans convention.

      (2) I have a strong bias against the second point in Figure 1A. Sequencing of trios, cohorts, or individuals NEVER identifies causal genes in the disease. This technique proposes a candidate gene. Future experiments (oftentimes in model organisms) are required to make those connections to causality. Please edit this figure and parts of the text.

      (3) How were the high-confidence orthologs filtered from 767 to 543 (lines 128-131)? Also, the choice of the final list of 25 genes is not well justified. Please expand more about how these choices were made.

      (4) Figures 3 and 4, why show all 8289 features? It might be easier to understand and read if only the 256 Tierpsy features were plotted in the heat maps.

      (5) The unc-80 mutant screen is clever. In the feature space, it is likely better to focus on the 256 less-redundant Tierpsy features instead of just a number of features. It is unclear to me how many of these features are correlated and not providing more information. In other words, the "worsening" of less-redundant features is far more of a concern than "worsening" of 1000 correlated features.

    4. Reviewer #3 (Public review):

      In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans.

      First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models.

      Subsequently, the study focuses on the NALCN channel gene family, comparing the phenotypes of mutants of nca-1, unc-77, and unc-80. This initial characterization identifies three behavioral parameters that exhibit significant differences from the wild type and could serve as indicators for pharmacological modulation.

      As a proof-of-concept, O'Brien et al. present a drug repurposing screen using an FDA-approved compound library, identifying two compounds capable of rescuing the behavioral phenotype in a model with UNC80 deficiency. The relatively short time and low cost associated with creating and phenotyping these strains suggest that high-throughput worm tracking could serve as a scalable approach for drug repurposing, addressing the multitude of Mendelian diseases. Interestingly, by measuring a wide range of behavioural parameters, this strategy also simultaneously reveals deleterious side effects of tested drugs that may confound the analysis.

      Considering the wealth of data generated in this study regarding important human disease genes, it is regrettable that the data is not made accessible to researchers less versed in data analysis methods. This diminishes the study's utility. It would have a far greater impact if an accessible and user-friendly online interface were established to facilitate data querying and feature extraction for specific mutants. This would empower researchers to compare their findings with the extensive dataset created here.

      Another technical limitation of the study is the use of single alleles. Large deletion alleles were generated by CRISPR/Cas9 gene editing. At first glance, this seems like a good idea because it limits the risk that background mutations, present in chemically-generated alleles, will affect behavioral parameters. However, these large deletions can also remove non-coding RNAs or other regulatory genetic elements, as found, for example, in introns. Therefore, it would be prudent to validate the behavioral effects by testing additional loss-of-function alleles produced through early stop codons or targeted deletion of key functional domains.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      Summary: 

      As the scientific community identifies increasing numbers of genetic variants that cause rare human diseases, a challenge is how the field can most quickly identify pharmacological interventions to address known deficits. The authors point out that defining phenotypic outcomes required for drug screen assays is often challenging, and emphasize how invertebrate models can be used for quick ID of compounds that may address genetic deficits. A major contribution of this work is to establish a framework for potential intervention drug screening based on quantitative imaging of morphology and mobility behavior, using methods that the authors show can define subtle phenotypes in a high proportion of disease gene knockout mutants. 

      Overall, the work constitutes an elegant combination of previously developed high-volume imaging with highly detailed quantitative phenotyping (and some paring down to specific phenotypes) to establish proof of principle on how the combined applications can contribute to screens for compounds that may address specific genetic deficits, which can suggest both mechanism and therapy. 

      In brief, the authors selected 25 genes for which loss of function is implicated in human neuro-muscular disease and engineered deletions in the corresponding C. elegans homologs. The authors then imaged morphological features and behaviors prior to, during, and after blue light stimuli, quantitating features, and clustering outcomes as they elegantly developed previously (PMID 35322206; 30171234; 30201839). In doing so, phenotypes in 23/25 tested mutants could be separated enough to distinguish WT from mutant and half of those with adequate robustness to permit high-throughput screens, an outcome that supports the utility of general efforts to ID phenotypes in C. elegans disease orthologs using this approach. A detailed discussion of 4 ciliopathy gene defects, and NALCN-related channelopathy mutants reveals both expected and novel phenotypes, validating the basic approach to modeling vetted targets and underscoring that quantitative imaging approaches reiterate known biology. The authors then screened a library of nearly 750 FDA-approved drugs for the capacity to shift the unc-80 NALCN channel-disrupted phenotype closer to the wild type. Top "mover" compounds shift the outcome in the experimental outcome space, and also reveal how "side effects" can be evaluated to prioritize compounds that confer the fewest changes of other parameters away from the center. 

      Strengths: 

      Although the imaging and data analysis approaches have been reported and the screen is limited in scope and intervention exposure, it is important that the authors strongly combine individual approach elements to demonstrate how quantitative imaging phenotypes can be integrated with C. elegans genetics to accelerate the identification of potential modulators of disease (easily extendable to other goals). Generation of deletion alleles and documentation of their associated phenotypes (available in supplemental data) provide potentially useful reagents/data to the field. The capacity to identify "over-shooting" of compound applications with suggestions for scale back and to sort efficacious interventions to minimize other changes to behavioral and physical profiles is a strong contribution. 

      Weaknesses: 

      The work does not have major weaknesses, although it may be possible to expand the discussion to increase utility in the field: 

      (1) Increased discussion of the challenges and limitations of the approach may enhance successful adaptation and application in the field. 

      It is quite possible that morphological and behavioral phenotypes have nothing to do with disease mechanisms and rather reflect secondary outcomes, such that positive hits will address "off-target" consequences. 

      This is possible and can only be determined with human data. We now discuss the possibility in the discussion.

      The deletion approach is adequately justified in the text, but the authors may make the point somewhere that screening target outcomes might be enhanced by the inclusion of engineered alleles that match the human disease condition. Their work on sod-1 alleles (PMID 35322206) might be noted in this discussion. 

      We agree and now mention this work in the discussion. We are currently working on a collection of strains with patient-specific mutations.

      Drug testing here involved a strikingly brief exposure to a compound, which holds implications for how a given drug might engage in adult animals. The authors might comment more extensively on extended treatments that include earlier life or more extended targeting. The assumption is that different exposure periods and durations could be administered, but if the authors are aware of challenges associated with more prolonged applications, larger scale, etc., it would be useful to note them. 

      More prolonged applications are definitely possible. We chose short treatments for this screen to model the potential for changing neural phenotypes once developmental effects of the mutation have already occurred. We now briefly discuss this choice and the potential of longer treatments in the discussion.

      (2) More justification of the shift to only a few target parameters for judging compound effectiveness. 

      - In the screen in Figure 4D and text around line 313, 3 selected core features of the unc-80 mutant (fraction that pause in response to blue light, speed, and curvature) were used to avoid the high replicate requirements to identify subtle phenotypes. Although this strategy was successful as reported in Figure 5, the pared-down approach seems a bit at odds with the emphasis on the range of features that can be compared mutant/wt with the authors' powerful image analysis. Adding details about the reduced statistical power upon multiple comparisons, with a concrete example calculated, might help interested scientists better assess how to apply this tool in experimental design. 

      To empirically test the effect of including more features on the subsequent screen, we have repeated the analysis using increasing numbers of features. In a new supplementary figure we find increasing the number of features reduces our power to detect rescue. At 256 features, we would not be able to detect any compounds that rescued the disease model phenotype.
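
      The loss of power with more features can be illustrated with a generic multiple-comparisons calculation. The sketch below uses a Bonferroni correction purely for illustration; the correction actually used in the manuscript is not stated here, and the significance level of 0.05 and the function name are assumptions.

```python
def bonferroni_threshold(alpha: float, n_features: int) -> float:
    """Per-feature p-value threshold that keeps the family-wise error at alpha."""
    return alpha / n_features


# Feature counts mentioned in the reviews: 3 core features,
# the 256-feature Tierpsy set, and the full set of 8289 features.
for n_features in (3, 256, 8289):
    threshold = bonferroni_threshold(0.05, n_features)
    print(f"{n_features:>5} features: per-feature threshold = {threshold:.2e}")
```

      With a fixed number of replicates, a stricter per-feature threshold means a compound must produce a larger or more consistent shift to be called a rescue, which is one way to understand why testing more features can leave fewer detectable hits.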

      (3) More development of the side-effect concept. The side effects analysis is interesting and potentially powerful. Prioritization of an intervention because of minimal perturbation of other phenotypes might be better documented and discussed a bit further; how reliably does the metric of low side effects correlate with drug effectiveness? 

      Ultimately this can only be determined with clinical trial data on multiple drugs, but there are currently no therapeutic options for UNC80 deficiency in humans. We have included some extra discussion of the side effect concept.

      Reviewer #2 (Public Review): 

      Summary and strengths: 

      O'Brien et al. present a compelling strategy to both understand rare disease that could have a neuronal focus and discover drugs for repurposing that can affect rare disease phenotypes. Using C. elegans, they optimize the Brown lab worm tracker and Tierpsy analysis platform to look at the movement behaviors of 25 knockout strains. These gene knockouts were chosen based on a process to identify human orthologs that could underlie rare diseases. I found the manuscript interesting and a powerful approach to making genotype-phenotype connections using C. elegans. Given the rate at which rare Mendelian diseases are found and candidate genes suggested, human geneticists need to consider orthologous approaches to understand the disease and seek treatments on a rapid time scale. This approach is one such way. Overall, I have a few minor suggestions and some specific edits. 

      Weaknesses: 

      (1) Throughout the text on figures, labels are nearly impossible to read. I had to zoom into the PDF to determine what the figure was showing. Please make text in all figures a minimum of 10-point font. Similarly, the Figure 2D point type is impossible to read. Points should be larger in all figures. Gene names should be in italics in all figures, following C. elegans convention. 

      We have updated all figures with larger labels and, where necessary, split figures to allow for better readability. We’ve also corrected italicisation.

      (2) I have a strong bias against the second point in Figure 1A. Sequencing of trios, cohorts, or individuals NEVER identifies causal genes in the disease. This technique proposes a candidate gene. Future experiments (oftentimes in model organisms) are required to make those connections to causality. Please edit this figure and parts of the text. 

      We have removed references to causation. We were thinking of cases where a known variant is found in a patient where causality has already been established rather than cases of new variant discovery.

      (3) How were the high-confidence orthologs filtered from 767 to 543 (lines 128-131)? Also, the choice of the final list of 25 genes is not well justified. Please expand more about how these choices were made. 

      We now explain the extra keyword filtering step. For the final filtering step, we simply examined the list and chose 25. There is therefore little justification to provide and we acknowledge these cannot be seen as representative of the larger set according to well-defined rules. The choice was based on which genes we thought would be interesting using their descriptions or our prior knowledge (“subjective interestingness” in the main text).

      (4) Figures 3 and 4, why show all 8289 features? It might be easier to understand and read if only the 256 Tierpsy features were plotted in the heat maps. 

      In this case, we included all features because they were all tested for differences between mutants and controls. By consistently using all features for each fingerprint, we can be sure that any differing feature we highlight in a box plot can also be located in the fingerprint.

      (5) The unc-80 mutant screen is clever. In the feature space, it is likely better to focus on the 256 less-redundant Tierpsy features instead of just a number of features. It is unclear to me how many of these features are correlated and not providing more information. In other words, the "worsening" of less-redundant features is far more of a concern than the "worsening" of 1000 correlated features. 

      This is a good point. We’ve redone the analysis using the Tierpsy 256 feature set and included this as a supplementary figure. We find that the same trend exists when looking at this reduced feature set.

      Reviewer #3 (Public Review): 

      In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans. 

      First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models. 

      Subsequently, the study focuses on the NALCN channel gene family, comparing the phenotypes of mutants of nca-1, unc-77, and unc-80. This initial characterization identifies three behavioral parameters that exhibit significant differences from the wild type and could serve as indicators for pharmacological modulation. 

      As a proof-of-concept, O'Brien et al. present a drug repurposing screen using an FDA-approved compound library, identifying two compounds capable of rescuing the behavioral phenotype in a model with UNC80 deficiency. The relatively short time and low cost associated with creating and phenotyping these strains suggest that high-throughput worm tracking could serve as a scalable approach for drug repurposing, addressing the multitude of Mendelian diseases. Interestingly, by measuring a wide range of behavioural parameters, this strategy also simultaneously reveals deleterious side effects of tested drugs that may confound the analysis. 

      Considering the wealth of data generated in this study regarding important human disease genes, it is regrettable that the data is not actually made accessible. This diminishes the study's utility. It would have a far greater impact if an accessible and user-friendly online interface were established to facilitate data querying and feature extraction for specific mutants. This would empower researchers to compare their findings with the extensive dataset created here. Otherwise, one is left with a very limited set of exploitable data. 

      We have now made the feature data available on Zenodo (https://doi.org/10.5281/zenodo.12684118) as a matrix of feature summaries and individual skeleton timeseries data (the feature matrix makes it more straightforward to extract the data from particular mutants for reanalysis). We have also created a static html version of the heatmap in Figure 2 containing the entire behavioural feature set extracted by Tierpsy. This can be opened in a browser and zoomed for detailed inspection. Mousing over the heatmap shows the names of features at each position making it easier to arrive at intuitive conclusions like ‘strain A is slow’ or ‘strain B is more curved’.

      Another technical limitation of the study is the use of single alleles. Large deletion alleles were generated by CRISPR/Cas9 gene editing. At first glance, this seems like a good idea because it limits the risk that background mutations, present in chemically-generated alleles, will affect behavioral parameters. However, these large deletions can also remove non-coding RNAs or other regulatory genetic elements, as found, for example, in introns. Therefore, it would be prudent to validate the behavioral effects by testing additional loss-of-function alleles produced through early stop codons or targeted deletion of key functional domains. 

      We have added a note in the main text on limitations of deletion alleles. We like the idea of making multiple alleles in future studies, especially in cases where a project is focussed on just one or a few genes.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors): 

      Note that none of the above suggestions or the one immediately below are considered mandatory. 

      One additional minor point: The dual implication of mevalonate perturbations for NALCN deficiencies is striking. At the same time, the mevalonate pathway is critical for embryo viability among other things, which prompts questions about how reproductive physiology is integrated in this screen approach. It appears that sterilization protocols are not used to prepare screen target animals, but it would be useful to know if there were a signature associated with drug-induced sterility that might help identify one potential common non-interesting outcome of compound treatments in general. In this work, the screen treatment is only 4 hours, which is probably too short to compromise reproduction, but as noted above, it is likely users would intend to expose test subjects for much longer than 4-hour periods. 

      This is an interesting point. In its current form our screen doesn’t assess reproductive physiology. This is something that we will consider in ongoing projects.

      Figures 

      Figure 1D might be omitted or moved to supplement. 

      We have removed Figure 1D and converted Figure 1E into a standalone table (Table 1) to improve readability.

      Figure 2D "key" is hard to make out size differences for prestim, bluelight, and poststim -more distinctive symbols should be used. 

      We have increased the size of the symbols so that the key is easier to read.

      Line 412 unc-25 should be in italics 

      Corrected

      Reviewer #2 (Recommendations For The Authors): 

      Specific edits: 

      All of the errors below have been corrected.

      Line 47, "loss of function" should be hyphenated because it is a compound adjective that modifies mutations. 

      Line 50, "genetically-tractable" should not be hyphenated because it is not a compound adjective. It is an adverb-adjective pair. Line 102 has the same grammatical issue. 

      Line 85, "rare genetic diseases" do not "affect nervous system function". The disease might have deficits in this function, but the disease does not do anything to function. 

      Line 86, it should be mutations not mutants. Mutations are changes to DNA. Mutants are individuals with mutations. 

      Throughout, wild-type should be hyphenated when it is used as a compound adjective. 

      Figure 4, asterisks is spelled incorrectly. 

      Reviewer #3 (Recommendations For The Authors): 

      - As stated in the public review, the utility of the study is limited by the lack of access to the complete dataset. The wealth of data produced by the study is one of its major outputs. 

      We have made the data publicly available on Zenodo. We appreciate the request.

      - Describe the exact break-points of the different alleles, because it was not readily feasible to derive them from the gene fact sheets provided in the supplementary materials. 

      We have now provided the start position and total length of deletion for each gene in the gene fact sheets.

      - Figure 1C: what does "Genetic homology"/"sequence identity" refer to? How were these values calculated? 

      UNC-49 is clearly not 95% identical to vertebrate GABAR subunits at the protein level. 

      We have changed the axis label to “BLAST % Sequence Identity” to clarify that these values are calculated from BLAST sequence alignments on the WormBase and Alliance of Genome Resources webpages.

      - Figure 1E : The data presented in Figure 1E appears somewhat unreliable. For example, a cursory check showed: 

      (1) Wrong human ortholog: unc-49 is a Gaba receptor, not a Glycine receptor as indicated in the second column. 

      (2) Wrong disease association: dys-1 is not associated with Bardet-Biedl syndrome; overall the data indicated in the table does not seem to fully match the HPO database. 

      (3) Inconsistent disease association: why don't the avr-14 and glc-2 (and even unc-49) profiles overlap/coincide given that they present overlapping sets of human orthologs. 

      Thank you for catching this! We have corrected gene names which were mistakenly pasted. We have also made this a standalone table (Table 1) for improved readability.

      - Error in legend to figure 4I : "with ciliopathies and N2" > ciliopathies should be "NALCN disease". 

      - Error at line 301: "Figures 2E-H" should be "Figures 4E-H". 

      Corrected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study explores the sequence characteristics and features of high-occupancy target (HOT) loci across the human genome. The computational analyses presented in this paper provide insight into the correlation of TF binding and regulatory networks at HOT loci that were regarded as lacking sequence specificity.

      By leveraging hundreds of ChIP-seq datasets from the ENCODE Project to delineate HOT loci in HepG2, K562, and H1-hESC cells, the investigators identified the regulatory significance and participation in 3D chromatin interactions of HOT loci. Subsequent exploration focused on the interaction of DNA-associated proteins (DAPs) with HOT loci using computational models. The models established that the potential formation of HOT loci is likely embedded in their DNA sequences and is significantly influenced by GC contents. Further inquiry exposed contrasting roles of HOT loci in housekeeping and tissue-specific functions spanning various cell types, with distinctions between embryonic and differentiated states, including instances of polymorphic variability. The authors conclude with a speculative model that HOT loci serve as anchors where phase-separated transcriptional condensates form. The findings presented here open avenues for future research, encouraging more exploration of the functional implications of HOT loci.

      Strengths:

      The concept of using computational models to define characteristics of HOT loci is refreshing and allows researchers to take a different approach to identifying potential targets. The major strength of the study lies in the very large number of datasets analyzed, with hundreds of ChIP-seq datasets for both HepG2 and K562 cells as part of the ENCODE project. Such quantitative power allowed the authors to delve deeply into HOT loci, which were previously thought to be artifacts.

      Weaknesses:

      While this study contributes to our knowledge of HOT loci, there are critical weaknesses that need to be addressed. There are questions on the validity of the assumptions made for certain analyses. The speculative nature of the proposed model involving transcriptional condensates needs either further validation or to be toned down. Furthermore, some apparent contradictions exist among the main conclusions, and these either need to be better explained or corrected. Lastly, several figure panels could be better explained or described in the figure legends.

      We thank the reviewer for their valuable comments.

      - We have extended the study and included a new chapter focusing on the condensate hypothesis, added more supporting evidence (including the ones suggested by the reviewer), and made explicit statements on the speculative nature of this model.

      - We have restructured the text to remove the sentences which might be construed as contradictory.

      Reviewer #2 (Public Review):

      Summary:

      The paper 'Sequence characteristics and an accurate model of abundant hyperactive loci in the human genome' by Hudaiberdiev and Ovcharenko offers comprehensive analyses and insights about the 'high-occupancy target' (HOT) loci in the human genome. These are considered genomic regions that overlap with transcription factor binding sites. The authors provided very comprehensive analyses of the TF composition characteristics of these HOT loci. They showed that these HOT loci tend to overlap with annotated promoters and enhancers, GC-rich regions, open chromatin signals, and highly conserved regions, and that these loci are also enriched with potentially causal variants with different traits.

      Strengths:

      Overall, the definition of HOT loci is clear and the data of HOT regions across the genome can be a useful dataset for studies that use HepG2 or K562 as a model. I appreciate the authors' efforts in presenting many analyses and plots backing up each statement.

      Weaknesses:

      It is noteworthy that the HOT concept and their signature characteristics as being highly functional regions of the genome are not presented for the first time here. Additionally, I find the main manuscript, though very comprehensive, long-winded; it could be put in a shorter, more digestible format without sacrificing scientific content.

      The introduction's mention of the blacklisted region can be rather misleading because when I read it, I was anticipating that we are uncovering new regulatory regions within the blacklisted region. However, the paper does not seem to address the question of whether the HOT regions overlap, if any, with the ENCODE blacklisted regions afterward. This plays into the central assessment that this manuscript is long-winded.

      The introduction also mentioned that HOT regions correspond to 'genomic regions that seemingly get bound by a large number of TFs with no apparent DNA sequence specificity' (this point of 'no sequence specificity' is reiterated in the discussion lines 485-486). However, later in the paper, the authors also presented models, such as convolutional neural networks that take in one-hot-encoded DNA sequence to predict HOT loci, that performed really well. This means that sequence contexts with potential motifs can still play a role in forming the HOT loci. At the same time, lines 59-60 also cited studies that "detected putative drive motifs at the core segments of the HOT loci". The authors should edit the manuscript to clarify (or eradicate) contradictory statements.

      We thank the reviewer for their valuable comments. Below are our responses to each paragraph in the given order:

      We added the following sentence to the introduction, summarizing other publications that studied the functional aspects of HOT loci:

      “Other studies have concluded that these regions are highly functionally consequential regions enriched in epigenetic signals of active regulatory elements such as histone modification regions and high chromatin accessibility”.

      We significantly shortened the manuscript by a) moving the detailed analyses of the computational model to the supplemental materials, and b) shortening the discussions by around half, focusing on core analyses that would be most beneficial to the field.

      Because the ENCODE guidelines recommend avoiding the blacklisted regions when mapping ChIP-seq (and other NGS) data, we excluded them from our analyzed regions before mapping to the genome. Instead, we relied on the conclusions of other publications on HOT loci, which found that a fraction of the initially reported HOT loci resulted from including loci that were later assigned to blacklisted regions.

      We addressed the potential confusion caused by the expression “no sequence specificity” by a) adding a clarification to the sentence in the introduction, which now reads “... with no apparent DNA sequence specificity in terms of detectible binding motifs of corresponding motifs”, and b) removing that phrase from the sentence in the discussion.

      Reviewer #3 (Public Review):

      Summary:

      Hudaiberdiev and Ovcharenko investigate regions within the genome where a high abundance of DNA-associated proteins are located and identify DNA sequence features enriched in these regions, their conservation in evolution, and variation in disease. Using ChIP-seq binding profiles of over 1,000 proteins in three human cell lines (HepG2, K562, and H1) as a data source they're able to identify nearly 44,000 high-occupancy target loci (HOT) that form at promoter and enhancer regions, thus suggesting these HOT loci regulate housekeeping and cell identity genes. Their primary investigative tool is HepG2 cells, but they employ K562 and H1 cells as tools to validate these assertions in other human cell types. Their analyses use RNA pol II signal, super-enhancer, regular-enhancer, and epigenetic marks to support the identification of these regions. The work is notable, in that it identifies a set of proteins that are invariantly associated with high-occupancy enhancers and promoters and argues for the integration of these molecules at different genomic loci. These observations are leveraged by the authors to argue HOT loci as potential sites of transcriptional condensates, a claim that they are well poised to provide information in support of. This work would benefit from refinement and some additional work to support the claims.

      Comments:

      (1) Condensates are thought to be scaffolded by one or more proteins or RNA molecules that are associated together to induce phase separation. The authors can readily provide from their analysis a check of whether HOT loci exist within different condensate compartments (or a marker for them). Generally, ChIP-seq signal from MED1 and Ronin (THAP11) would be anticipated to correspond with transcriptional condensates of different flavors; other coactivator proteins (e.g., BRD4) would be useful to include as well. Similarly, condensate scaffolding proteins of facultative and constitutive heterochromatin (HP1a and EZH2/1) would augment the authors' model by providing further evidence that HOT loci occur at transcriptional condensates and not heterochromatin condensates. Sites of splicing might be informative as well: splicing condensates (or nuclear speckles) are scaffolded by SRRM/SON, which is probably not in their data set, but members of the serine arginine-rich splicing factor family of proteins can serve as a proxy; SRSF2 is the best studied of this set. This would provide a significant improvement to their proposed model and be expected since the authors note that these proteins occur at the enhancers and promoter regions of highly expressed genes.

      (2) It is curious that MAX is found to be highly enriched without its binding partner Myc, is Myc's signal simply lower in abundance, or is it absent from HOT loci? How could it be possible that a pair of proteins, which bind DNA as a heterodimer are found in HOT loci without invoking a condensate model to interpret the results?

      (3) Numerous studies have linked the physical properties of transcription factor proteins to their role in the genome. The authors here provide a limited analysis of the proteins found at different HOT loci by employing GO terms. Is there evidence for specific types of structural motifs, disordered motifs, or related properties of these proteins present in specific loci?

      (4) Condensates themselves possess different emergent properties, but it is a product of the proteins and RNAs that concentrate in them and not a result of any one specific function (condensates can have multiple functions!)

      (5) Transcriptional condensates serve as functional bodies. The notion the authors present in their discussion is not held by practitioners of condensate science, in that condensates exist to perform biochemical functions and are dissolved in response to satisfying that need, not that they serve simply as reservoirs of active molecules. For example, transcriptional condensates form at enhancers or promoters that concentrate factors involved in the activation and expression of that gene and are subsequently dissolved in response to a regulatory signal (in transcription this can be the nascently synthesized RNA itself or other factors). The association reactions driving the formation of active biochemical machinery within condensates are materially changed, as are the kinetics of assembly. It is unnecessary and inaccurate to qualify transcriptional condensates as depots for transcriptional machinery.

      (6) This work has the potential to advance the field by providing a detailed perspective on what proteins are located in what regions of the genome. Publication of this information alongside the manuscript would advance the field materially.

      We thank the reviewer for constructive comments and suggestions. Below are our point-by-point responses:

      (1) We added a new short section “Transcriptional condensates as a model for explaining the HOT regions” with additional support for the condensate hypothesis, wherein some of the points raised here were addressed. Specifically, we used a curated database of LLPS proteins (CD-CODE) and provided statistics for the DAPs annotated as condensate-related.

      Regarding the DAPs mentioned in this question, we observed that the distributions of the corresponding ChIP-seq peaks confirm the patterns expected by the reviewer (Author response image 1). Namely:

      - MED1 and Ronin (THAP11) are abundant in the HOT loci, being present in 67% and 64% of HOT loci, respectively.

      - BRD4 is present in 28% of the HOT loci, while the presence of DAPs with annotated LLPS activity ranged from 3% to 73%, providing further support for the condensate hypothesis.

      - The ENCODE database does not contain a ChIP-seq dataset for HP1A. EZH2 peaks were absent in the HOT loci (0.4% overlap), suggesting the lack of heterochromatin condensate involvement.

      - Serine/arginine-rich splicing factor family proteins were present in only 7.7% of the HOT loci, suggesting absent or limited overlap with splicing condensates or nuclear speckles.

      Author response image 1.
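      The percentages above are fractions of HOT loci that overlap at least one ChIP-seq peak of a given DAP. A minimal sketch of that calculation follows, assuming loci and peaks are given as half-open `(chrom, start, end)` tuples; the function name `fraction_of_loci_bound` and the toy coordinates are hypothetical, and the actual pipeline may define overlap differently (for example, with a minimum overlap length).

```python
def fraction_of_loci_bound(loci, peaks):
    """Return the fraction of loci overlapping at least one peak.

    Both arguments are iterables of (chrom, start, end) tuples with
    half-open coordinates; any overlap of one base or more counts.
    """
    # Group peaks by chromosome so each locus is compared only to
    # peaks on the same chromosome.
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))

    loci = list(loci)
    bound = 0
    for chrom, start, end in loci:
        candidates = peaks_by_chrom.get(chrom, [])
        # Two half-open intervals overlap when each starts before the other ends.
        if any(p_start < end and start < p_end for p_start, p_end in candidates):
            bound += 1
    return bound / len(loci) if loci else 0.0


# Toy example with made-up coordinates.
hot_loci = [("chr1", 0, 100), ("chr1", 200, 300), ("chr2", 0, 100)]
dap_peaks = [("chr1", 50, 60), ("chr2", 100, 150)]
print(fraction_of_loci_bound(hot_loci, dap_peaks))
```

      The linear scan per locus keeps the sketch short; for genome-scale peak sets a sorted or interval-tree lookup would be the usual choice.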

      (2) In this study, we selected the TF ChIP-seq datasets with stringent quality metrics, excluding those with audit warnings or errors attached. As a result, the set of DAPs analyzed in HepG2 did not include MYC, since the corresponding ChIP-seq dataset had the audit warning tags of "borderline replicate concordance, insufficient read length, insufficient read depth, extremely low read depth". Analyses in K562 and H1 did include the MYC (alongside MAX) ChIP-seq datasets.

      To address this question, we added the mentioned ChIP-seq dataset (ENCODE ID: ENCFF800JFG) and analyzed the colocalization patterns of MYC and MAX. We observed that the MYC ChIP-seq peaks in HepG2 display spurious results, overlapping with only 5% of HOT loci. Meanwhile in K562 and H1, MYC and MAX are jointly present in 54% and 44% of the HOT loci, respectively (Author response image 2).

      Author response image 2.

      These observations were also supported by Jaccard indices between the MYC and MAX ChIP-seq peaks. For this analysis, we calculated the pairwise Jaccard index between MYC and MAX and divided it by the average Jaccard index of 2000 randomly selected DAP pairs. In K562 and H1, the Jaccard indices between MYC and MAX are 5.72x and 2.53x greater than the random background, respectively. For HepG2, the ratio was 0.21x, indicating that the HepG2 MYC ChIP-seq dataset is likely erroneous.

      Author response image 3.
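      The Jaccard enrichment described above can be sketched as follows. This is an illustration under stated assumptions: each DAP's peaks are represented as a set of locus identifiers (the original analysis may instead compute Jaccard indices at base-pair level), and `jaccard`, `jaccard_enrichment`, and the toy data are hypothetical names and values.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_enrichment(peaks_by_dap, dap_x, dap_y, n_pairs=2000, seed=0):
    """Jaccard index of (dap_x, dap_y) divided by the mean Jaccard index
    of up to n_pairs randomly selected DAP pairs."""
    rng = random.Random(seed)
    all_pairs = list(combinations(sorted(peaks_by_dap), 2))
    sampled = rng.sample(all_pairs, min(n_pairs, len(all_pairs)))
    background = sum(
        jaccard(peaks_by_dap[p], peaks_by_dap[q]) for p, q in sampled
    ) / len(sampled)
    if background == 0:
        raise ValueError("background Jaccard index is zero; ratio undefined")
    return jaccard(peaks_by_dap[dap_x], peaks_by_dap[dap_y]) / background


# Toy data: integers stand in for locus identifiers.
toy = {
    "MYC": {1, 2, 3, 4},
    "MAX": {1, 2, 3, 5},
    "A": {10, 11},
    "B": {11, 12},
    "C": {20},
}
print(jaccard_enrichment(toy, "MYC", "MAX"))
```

      A ratio well above 1 indicates that the two DAPs colocalize more than a typical pair, whereas a ratio well below 1, as reported for HepG2, indicates less overlap than expected from the background.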

      (3) Despite numerous publications focusing on different structural domains in transcription factors, we could not find an extensive database or a survey study focusing on annotations of structural motifs in human TFs. A survey at such a scale would therefore be outside this study’s scope. We added only the analysis of intrinsically disordered regions, as it pertains to the condensate hypothesis. To emphasize this shortcoming, we added the following sentence to the end of the discussion section.

      “Further, one of the hallmarks of LLPS proteins that have been associated with their abilities to phase-separate is the overrepresentation of certain structural motifs, which we did not pursue due to size limitations.”

      (4, 5) We agree with these statements and thank the reviewer for pointing out this faulty statement. We modified the sections of the discussion related to condensates and removed the part where we implied that condensates serve mostly a single function as a TF reservoir.

      (6) We added a table to the supplemental materials (Zenodo repository) with detailed annotation of HOT and non-HOT DAP-bound loci in the genome.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      The clause with "inadequate" would be dropped if the authors sufficiently address reviewer concerns about clarity of writing, including:

      (1) Editing the title to better reflect the findings of the paper.

      (2) Making clear that the condensate model is speculative and not explicitly tested in this study (and may be better described as a hypothesis).

      (3) Resolving apparent contradictions regarding DNA sequence specificity and the interpretation of ChIP-seq signal intensity.

      (4) Better specifying and justifying model parameters, thresholds, and assumptions.

      (5) Shortening the manuscript to emphasize the main, well-supported claims and to enhance readability (especially the discussion section).

      We thank the Editor for their work. We followed their advice and implemented changes and additions to address all 5 points.

      Reviewer #1 (Recommendations For The Authors):

      (1) The title "Sequence characteristics and an accurate model of abundant hyperactive loci in the human genome" does not accurately reflect the findings of the paper. We are unclear as to what the 'accurate model' refers to. Is it the proposed model 'based on the existence of large transcriptional condensates' (abstract)? If so, there are concerns below regarding this statement (see comment 2). If the authors are referring to the computational modeling presented in Figure 5, it is unclear that any one of them performed that much better than the others and the best single model was not identified. Furthermore, the models being developed in the study constitute only a portion of the paper and lacked validation through additional datasets. Additionally, sequence characteristics were not a primary focus of the study. Only figure 5 talks about the model and sequence characteristics, the rest of the figures are left out of the equation.

      We agree with and thank the reviewer for this idea of clarifying the intended meaning.

      - We changed the title to clarify that the computational model is meant:

      “Functional characteristics and a computational model of abundant hyperactive loci in the human genome”.

      - We shortened the part of the manuscript discussing the computational models and pointed out the CNNs as “the best single model”.

      (2) The abstract and discussion (and perhaps the title) propose a model of transcriptional condensates in relation to HOT loci. However, there is no data provided in the manuscript that relates to condensates. Therefore, anything relating to condensates is primarily speculative. This distinction needs to be properly made, especially in the abstract (and cannot be included in the title). Otherwise, these statements are misleading. Although the field of transcriptional condensates is relatively new, there have been several factors studied. The authors could include in Figure 2d which factors have been shown to form transcriptional condensates. This might provide some support for the model, though it would still largely remain speculative unless further testing is done.

      We added a new short chapter, “Transcriptional condensates as a model for explaining the HOT regions”, with additional analyses testing the condensate hypothesis. We provided supporting evidence by analyzing metrics used as hallmarks of condensates, including the distributions of annotated condensate-related proteins, nascent transcription, and protein-RNA interaction levels in HOT loci. Still, we acknowledge that this is a speculative hypothesis, and we clarified that with the following statement in the discussion:

      “It is important to note here that our proposed condensate model is a speculative hypothesis. Further experimental studies in the field are needed to confirm or reject it.”

      (3) Several apparent contradictions exist throughout the manuscript. For example, "HOT locus formation are likely encoded in their DNA sequences" (lines 329-330) vs the proposed model of formation through condensates (abstract). These two statements do not seem compatible, or at the very least, the authors can explain how they are consistent with each other. Another example: "ChIP-seq signal intensity as a proxy for... binding affinity" (line 229) vs. "ChIP-seq signal intensities do not seem to be a function of the DNA-binding properties of the DAPs" (lines 259-260). The first statement is the assumption for subsequent analyses, which has its own concerns (see comment 4). But the conclusion from that analysis seems to contradict the assumption, at least as it is stated.

      In this study, we argue that the two statements may not necessarily contradict each other. We aimed to a) demonstrate that the observed intensity of DAP-DNA interactions as measured by ChIP-seq experiments at HOT loci cannot be explained with direct DNA-binding events of the DAPs alone and b) propose a hypothesis that this observation can be at least partially explained if the HOT loci have the propensity to either facilitate or take part in the formation of transcriptional condensates.

      One of the conditions for condensates to form at enhancers was shown to be the presence of strong binding sites of key TFs (Shrinivas et al. 2019, “Enhancer features that drive the formation of transcriptional condensates”), although that study was conducted using only one TF (OCT4) and one coactivator (MED1). To the best of our knowledge, no such study has been conducted involving many TFs and cofactors simultaneously. We also know that the factors that lead to liquid-liquid phase separation include weak multivalent IDR-IDR, IDR-DNA, and IDR-RNA interactions. As a result, the observed total sum of ChIP-seq peaks in HOT loci reflects direct DNA-binding events combined with indirect DAP-DNA interactions, some of which may be facilitated by condensates. Moreover, the fact that CNNs can recognize HOT loci with high accuracy suggests that there must be an underlying motif grammar specific to HOT loci.

      We emphasized this conclusion in the discussions.

      The comment on using the ChIP-seq signal as a proxy for DNA-binding affinity is addressed under comment 4.

      (4) In lines 229-230, the authors used "the ChIP-seq signal intensity as a proxy for the DAP binding affinity." What is the basis for this assumption? If there is a study that can be referenced, it should be added. However, ChIP-seq signal intensity is generally regarded as a combination of abundance, frequency, or percentage of cells with binding. RNA Pol2 is a good example of this as it has no specific binding affinity but the peak heights indicate level of expression. Therefore, the analyses and conclusions in Figure 4, particularly panel A, are problematic. In addition, clarification from lines 258-260 is needed as it contradicts the earlier premise of the section (see comment 3).

      We thank the reviewer for pointing out this error. The main conclusion of the paragraph is that the average ChIP-seq signal values at HOT loci do not correlate well with the sequence-specificity of TFs. We reworded the paragraph to state that we analyze the patterns of ChIP-seq signals across the HOT loci, and we removed the statement that we use them as a proxy for sequence-specific binding affinity.

      (5) In Figure 1A, the authors show that "the distribution of the number of loci is not multimodal, but rather follows a uniform spectrum, and thus, this definition of HOT loci is ad-hoc" (lines 92-95). The threshold to determine how a locus is considered to be HOT is unclear. How did the authors decide to use the current threshold given the uniform spectrum observed? How does this method of calling HOT loci compare to previous studies? How much overlap is there in the HOT loci in this study versus previous ones?

      We moved the corresponding explanation from the supplemental methods to the main methods section of the manuscript.

      Briefly, our reasoning was as follows: assuming that an average TFBS is 8bp long and given that we analyze the loci of length 400bp, we can set the theoretical maximum number of simultaneous binding events to be 50. Hence, if there are >50 TF ChIP-seq peaks in a given 400bp locus, it is highly unlikely that the majority of ChIP-seq peaks can be explained by direct TF-DNA interactions. The condition of >50 TFs corresponded to the last four bins of our binning scale, which was used as an operational definition for HOT loci.
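
      As an illustration only (not taken from the authors' pipeline), the arithmetic behind this threshold can be sketched as follows. The constant and function names are hypothetical; the numbers 400 bp and 8 bp come from the response above.

```python
# Values stated in the response; the names are illustrative assumptions.
LOCUS_LENGTH_BP = 400
AVERAGE_TFBS_LENGTH_BP = 8


def max_simultaneous_binding_events(locus_length_bp: int, tfbs_length_bp: int) -> int:
    """Upper bound on non-overlapping binding sites that fit in one locus."""
    return locus_length_bp // tfbs_length_bp


def is_hot(n_bound_tfs: int, threshold: int) -> bool:
    """Operational definition: strictly more bound TFs than the upper bound."""
    return n_bound_tfs > threshold


threshold = max_simultaneous_binding_events(LOCUS_LENGTH_BP, AVERAGE_TFBS_LENGTH_BP)
print(threshold)                                      # 50
print(is_hot(50, threshold), is_hot(51, threshold))   # False True
```

      The strict inequality mirrors the ">50 TFs" condition: a locus with exactly 50 peaks could still, in principle, be explained by direct binding alone.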

      We have compared our definition of HOT loci to those reported in previous studies by Ramaker et al. and Boyle et al. The results of these analyses are described in lines 147-154.

      (6) In Figure 3B, the authors state that of "the loop anchor regions with >3 overlapping loops, 51% contained at least one HOT locus, suggesting an interplay between chromatin loops and HOT loci." However, it is unclear how "51%" is calculated from the figure. Similarly, in the following sentence, "94% of HOT loci are located in regions with at least one chromatin interaction". It is unclear as to how the number was obtained based on the referenced figure.

      Initially, the x-axis labels in Figure 3B were missing, making it hard to understand what we meant. We added the x-axis numbers and changed “51%” to “more than half”. We intended to say that, of the loci with 4 and 5 overlapping loops, exactly 50% contain at least one HOT locus. However, since for x=6 the percentage is 100% (there is only one such locus), the overall percentage is technically “more than half”.

      The percentage of HOT loci engaging in chromatin interaction regions (91%) was calculated by simply overlapping the HOT regions with Hi-C long-range contact anchors. The details of extracting these regions using FitHiChip are described in Supplemental Methods 1.3.
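
      A minimal sketch of this kind of overlap calculation is shown below. It assumes half-open (chromosome, start, end) intervals and uses made-up toy coordinates; it does not reproduce the FitHiChip output or the authors' actual code, and all names are hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed convention.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(loci: List[Interval], anchors: List[Interval]) -> float:
    """Fraction of loci that overlap at least one anchor region."""
    if not loci:
        return 0.0
    hits = sum(1 for locus in loci if any(overlaps(locus, anchor) for anchor in anchors))
    return hits / len(loci)


# Toy data, not real coordinates.
hot_loci = [("chr1", 100, 500), ("chr1", 5000, 5400), ("chr2", 100, 500), ("chr2", 9000, 9400)]
loop_anchors = [("chr1", 0, 1000), ("chr1", 5300, 6300), ("chr2", 400, 1400)]

print(fraction_overlapping(hot_loci, loop_anchors))  # 0.75
```

      The nested scan is quadratic and is only meant to make the definition explicit; for genome-scale data an interval tree or a sorted sweep would be the usual choice.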

      (7) While we have a limited basis to evaluate computational models, we would like to see a clearer explanation of the model set-up in terms of the number of trained vs. test datasets. In addition, it would be interesting to see if the models can be applied to data from different cell lines.

      We added the table with the sizes of the datasets used for classification in Supplemental Methods 1.6.1.

      Evaluating the models trained on the HOT loci of HepG2 and K562 on other cell lines would pose challenges, since the number of available ENCODE TF ChIP-seq datasets for other cell lines is significantly smaller than for these two. Therefore, we conducted the proposed analysis between the studied cell lines. Specifically, we used the CNN models trained on HOT and regular enhancers of HepG2 and K562. Then, we evaluated each model on the test sets of each classification experiment (Author response image 4). We observed that the classification results of the HOT loci demonstrated a higher level of tissue-specificity compared to the same classification results of the regular enhancers.

      Author response image 4.

      (8) Lines 349-351. The significance of highly expressed genes being more prone to having multiple HOT loci, and vice versa, appears conventional and remains unclear. Intuitively, it makes sense for higher expressed genes to have more of the transcriptional machinery bound, and would bias the analysis. One way to circumvent this is to only analyze sequence-specific TFs and remove ones that are directly related to transcription machinery.

      We thank the reviewer for this suggestion. Our attempt to re-annotate the HOT loci with only sequence-specific TFs led to a significantly different set of loci, which would not be strictly comparable to the HOT loci defined by this study. Analyzing these new sets of loci would create a noticeable departure from the flow of the manuscript and further extend the already long scope of the study.

      Moreover, numerous studies have shown that super-enhancers recruit large numbers of TFs via transcriptional condensates (Boija et al., 2018; Cho et al., 2018; Sabari et al., 2018). We hope that our results can serve as data-driven supportive evidence for those studies.

      (9) Lines 393-396. We would like to see a reference to the models shown in the figures, if these models have been published previously.

      We could not understand the question. Lines 393-396 contain the following sentence:

      “However, many of the features of the loci that we’ve analyzed so far demonstrated similar patterns (GC contents, target gene expressions, ChIP-seq signal values etc.) when compared to the DAP-bound loci in HepG2 and K562, suggesting that albeit limited, the distribution of the DAPs in H1 likely reflects the true distribution of HOT loci.”

      In case the question was about the models that we trained to classify the HOT loci, we have deposited the models and codebase in Zenodo and a GitHub repository.

      (10) Values in Figure 7D are not reflected in the text. Specifically, the text states "Average ... phastCons of the developmental HOT loci are 1.3x higher than K562 and HepG2 HOT loci (Figure 7D)" (lines 408-409). Figure 7D shows conservation scores between HOT enhancers vs promoters for each cell line, and does not seem to reflect the text.

      We modified the figure to reflect the statement appropriately.

      (11) Methodology should include a justification for the use of the Mann-Whitney U-test (non-parametric) over other statistical tests.

      We added the following description to the methods section:

      “For calculating the statistical significance, we used the non-parametric Mann-Whitney U-test when the compared data points are non-linearly correlated and multi-modal. When the data distributions are bell-curve shaped, the Student’s t-test was used.”
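
      For readers unfamiliar with the test, the Mann-Whitney U statistic depends only on the relative ordering of the two samples, which is why it needs no normality assumption. The sketch below computes U directly from its pairwise definition on toy data; it is an illustration with hypothetical names, not the authors' implementation, and it does not compute a p-value.

```python
from typing import Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x: number of (x_i, y_j) pairs with x_i > y_j.

    Ties contribute 0.5, following the usual convention.
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Toy samples: every value in `low` is below every value in `high`.
low = [1.0, 2.0, 3.0]
high = [4.0, 5.0, 6.0]

print(mann_whitney_u(low, high))   # 0.0
print(mann_whitney_u(high, low))   # 9.0
```

      The two U values always sum to the product of the sample sizes, so complete separation of the samples gives the extreme values 0 and n*m.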

      Minor:

      (1) Figure 2b was never mentioned in the paper. This can be added alongside Figure S6C, line 148.

      Indeed, Figure 2B was supposed to be cited together with Figure S6C but was omitted by mistake. This has been corrected.

      (2) Supplementary Figure 8 has two Cs. Needs to be corrected to D.

      Fixed.

      (3) Figure 3B is missing labels on the x-axis.

      Fixed.

      (4) The horizontal bar graph on the bottom left of Figure 1E needs to be described in the figure legend.

      Description added to the figure caption.

      (5) Line 345, Fig 15A should be Fig S15A.

      Corrected.

      Reviewer #2 (Recommendations For The Authors):

      I listed all my concerns about the paper in the public comments. I think the manuscript is very comprehensive and it is valuable, but it should be cut short and presented in a more digestible way.

      We thank the reviewer for their valuable comments and suggestions. We addressed all the concerns listed in the public comments. We shortened the manuscript by condensing the section on computational classification models and cutting the Discussion to about half its original length.

      Line 55: What are chromatin-associated proteins, i.e. are they histone modifications?

      To clarify the definition used in the cited study, we changed the sentence to the following:

      “For instance, Partridge et al. studied the HOT loci in the context of 208 proteins including TFs, cofactors, and chromatin regulators which they called chromatin-associated proteins.”

      Though most of the paper can be cut short to avoid analysis paralysis for readers, there are details that still need filling in. For example, how did the authors perform PCA analysis, i.e. what are the features of each data point in the PCA analysis? Lines 214-215: How do we calculate the number of multi-way contacts in Hi-C data?

      We added clarifying descriptions and changed the mentioned sentences to the following:

      PCA:

      “To analyze the signatures of unique DAPs in HOT loci, we performed a PCA analysis where each HOT locus is represented by a binary (presence/absence) vector of length equal to the total number of DAPs analyzed.”
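
      To make the feature representation concrete, the sketch below builds the presence/absence vectors from toy data. The locus identifiers and the dictionary layout are assumptions made for illustration; the PCA step itself is omitted and would be applied to the resulting matrix with a numerical library of choice.

```python
from typing import Dict, List, Set


def binary_matrix(locus_to_daps: Dict[str, Set[str]], all_daps: List[str]) -> List[List[int]]:
    """One row per locus (sorted by locus id), one column per DAP, 1 = DAP present."""
    return [
        [1 if dap in locus_to_daps[locus] else 0 for dap in all_daps]
        for locus in sorted(locus_to_daps)
    ]


# Toy example with three DAPs and two hypothetical loci.
all_daps = ["CTCF", "MAX", "POLR2A"]
locus_to_daps = {
    "hot_1": {"MAX", "CTCF"},
    "hot_2": {"MAX"},
}

matrix = binary_matrix(locus_to_daps, all_daps)
print(matrix)  # [[1, 1, 0], [0, 1, 0]]
```

      Each row has length equal to the total number of DAPs analyzed, matching the description in the quoted sentence.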

      Multi-way contacts on loop anchors:

      “To investigate further, we analyzed the loop anchor regions harboring HOT loci and observed that the number of multi-way contacts on loop anchors (i.e. loci which serve as anchors to multiple loops) correlates with the number of bound DAPs (rho=0.84, p-value<10E-4; Pearson correlation).”
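
      The Pearson coefficient mentioned in this sentence can be computed as in the following sketch. The input values are invented toy numbers, not the study's data, and the function name is hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data: number of loops sharing an anchor vs. number of bound DAPs.
loops_per_anchor = [1, 2, 3, 4, 5]
bound_daps = [10, 25, 30, 60, 70]

r = pearson_r(loops_per_anchor, bound_daps)
print(round(r, 3))
```

      A coefficient close to 1 indicates a strong positive linear relationship; assessing significance would additionally require a p-value, which this sketch does not compute.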

      - Lines 251-252: How did the referenced study categorize DAPs? It is important for any manuscript to be self-contained.

      We added the explanation and changed the sentence to the following:

      “To test this hypothesis, we classified the DAPs into those two categories using the definitions provided in the study (Lambert et al. 2018), where the TFs are classified by manual curation through extensive literature review and supported by annotations such as the presence of DNA-binding domains and validated binding motifs. Based on this classification, we categorized the ChIP-seq signal values into these two groups.”

      - Lines 181-185, sentences starting with 'To test' can be moved to the methods, leaving only brief mentions of the statistic tests if needed.

      We removed the mentioned sentence from the main text and moved it to the supplemental methods (1.4).

      - Lines 217-220: I find this sentence extremely redundant unless it can offer more specific insights about a particular set of DAPs or if the DAPs are closer/or a proven distal enhancer to a confirmed causal gene.

      We removed the mentioned sentence from the text.

      - Lines 243-246: How did the authors determine the set DAPs that have stabilizing effects, and how exactly are the 'stabilizing effects' observed/measured?

      We added explanations to Supplemental Methods 3.1 and Fig S18, S19.

      While addressing this comment we realized that the reported value of the ratio is 1.91x, not 1.7x. We corrected that value in the main text and added the p-value.

      - When discussing the phastCons scores analyses, such as in lines 268-271, how did the authors calculate the relationship between phastCons scores and HOT loci, i.e. was the score averaged across the 400-bp locus to obtain a locus-specific conservation score?

      Yes, per-locus conservation scores were averaged over the bps of loci. We added this clarification to the methods.
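
      A minimal sketch of this per-locus averaging, assuming one phastCons score per base and toy values rather than real conservation tracks (the function name is hypothetical):

```python
from statistics import fmean
from typing import Sequence


def locus_conservation(per_base_scores: Sequence[float]) -> float:
    """Mean per-base conservation score over all bases of one locus."""
    if not per_base_scores:
        raise ValueError("locus has no scored bases")
    return fmean(per_base_scores)


# Toy 400-bp locus: the first 100 bases fully conserved, the rest not at all.
scores = [1.0] * 100 + [0.0] * 300

print(locus_conservation(scores))  # 0.25
```

      How bases without a score are handled (skipped or counted as zero) changes the average; this sketch assumes every base has a score.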

      - Line 311: What is the role of the 'control sets' in the analyses of the sequence's relationship with HOT?

      In this specific case, the control sets are used as background or negative sets to set up the classification tasks. In other words, we are asking whether the HOT loci can be distinguished from random chromatin-accessible regions, promoters, or regular enhancers. We clarified this in the text.

      - I also find the discussion about different machine learning methods that classify HOT loci based on sequence contexts quite redundant UNLESS the authors decide to go further into the features' importance (such as motifs) in the models that predict/ are associated with HOT loci, which in itself can constitute another study.

      We agree with the reviewer. We shortened the discussion of the models by limiting it to the 3 main models and moved the rest to the supplemental materials.

      - Can the authors clarify where they obtain data on super-enhancers?

      We obtained the super-enhancer definitions from the original study (Hnisz et al. 2013, PMID: 24119843) where the super-enhancers were defined for multiple cell lines. We clarified this in the methods.

      - Figure 1B, the x and y axis should be clarified.

      We clarified it by using MAX as an example case in the figure caption as follows:

      “Prevalence of DAPs in HOT loci. Each dot represents a DAP. X-axis: percentage of HOT loci in which DAP is present (e.g. MAX is present in 80% of HOT loci). Y-axis: percentage of total peaks of DAPs that are located in HOT loci (e.g. 45% of all the ChIP-seq peaks of MAX are located in the HOT loci). Dot color and size are proportional to the total number of ChIP-seq peaks of DAP.”
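
      The two axes described in this caption can be illustrated with the sketch below. It makes the simplifying assumption that each ChIP-seq peak of a DAP maps to exactly one locus identifier, and it uses toy identifiers rather than the MAX data; the function name is hypothetical.

```python
from typing import Set, Tuple


def dap_prevalence(dap_peak_loci: Set[str], hot_loci: Set[str]) -> Tuple[float, float]:
    """Return (x, y) for one DAP.

    x: percentage of HOT loci in which the DAP is present.
    y: percentage of the DAP's peaks that fall in HOT loci.
    Assumes one peak per locus id, which is a simplification.
    """
    in_hot = dap_peak_loci & hot_loci
    x = 100.0 * len(in_hot) / len(hot_loci)
    y = 100.0 * len(in_hot) / len(dap_peak_loci)
    return x, y


# Toy data: 5 HOT loci; the DAP has 8 peaks, 4 of them in HOT loci.
hot = {"h1", "h2", "h3", "h4", "h5"}
dap_peaks = {"h1", "h2", "h3", "h4", "r1", "r2", "r3", "r4"}

print(dap_prevalence(dap_peaks, hot))  # (80.0, 50.0)
```

      The two percentages share a numerator but have different denominators, which is why a DAP can be present in most HOT loci while only a minority of its peaks lie in them.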

      Reviewer #3 (Recommendations For The Authors):

      The list of proteins associated with different types of genomic loci at a meta level (enhancers, promoters, and gene body etc.), and an annotation of the genome at the specific loci level.

      The authors use a wide range of acronyms throughout the text and figure legends, they do a reasonably good job, but the main text section "HOT-loci are enriched in causal variants" and Figure 8 would be materially improved if they held it to the same standard.

      Size is a physical property and not a physicochemical property.

      We thank the reviewer for their comments and suggestions. We added a table to supplemental files with detailed annotations of analyzed loci.

      We reviewed the section “HOT loci are enriched in causal variants” and corrected a few mismatches in the acronyms.

    2. eLife Assessment

      This valuable study explores the sequence characteristics and conservation of high-occupancy target loci, regions in the human genome such as promoters and enhancers that are bound by a multitude of transcription factors. The computational analyses presented in this study are solid. This study would be a helpful resource for researchers performing ChIP-seq based analyses of transcription factor binding.

    3. Reviewer #1 (Public review):

      Summary:

      This study explores the sequence characteristics and features of high-occupancy target (HOT) loci across the human genome. The computational analyses presented in this paper provide insight into the correlation of TF binding and regulatory networks at HOT loci that were regarded as lacking sequence specificity.

      By leveraging hundreds of ChIP-seq datasets from the ENCODE Project to delineate HOT loci in HepG2, K562, and H1-hESC cells, the investigators identified the regulatory significance and participation in 3D chromatin interactions of HOT loci. Subsequent exploration focused on the interaction of DNA-associated proteins (DAPs) with HOT loci using computational models. The models established that the potential formation of HOT loci is likely embedded in their DNA sequences and is significantly influenced by GC contents. Further inquiry exposed contrasting roles of HOT loci in housekeeping and tissue-specific functions spanning various cell types, with distinctions between embryonic and differentiated states, including instances of polymorphic variability. The authors conclude with a speculative model that HOT loci serve as anchors where phase-separated transcriptional condensates form. The findings presented here open avenues for future research, encouraging more exploration of the functional implications of HOT loci.

      Strengths:

      The concept of using computational models to define characteristics of HOT loci is refreshing and allows researchers to take a different approach in identifying potential targets. The major strengths of the study lie in the very large number of datasets analyzed, with hundreds of ChIP-seq data sets for both HepG2 and K562 cells as part of the ENCODE project. Such quantitative power allowed the authors to delve deeply into HOT loci, which were previously thought to be artifacts.

      Weaknesses:

      While this study contributes to our knowledge of HOT loci, there are critical weaknesses that need to be addressed. There are questions on the validity of the assumptions made for certain analyses. The speculative model involving transcriptional condensates either needs further validation or should be toned down. Furthermore, some apparent contradictions exist among the main conclusions, and these either need to be better explained or corrected. Lastly, several figure panels could be better explained or described in the figure legends.

      Update After Revisions:

      The authors have addressed the above comments and concerns appropriately. The addition of the new Figure 9 is particularly compelling and strengthens the authors' conclusions. This reviewer has no further concerns.

    4. Reviewer #2 (Public review):

      Summary:

      The paper by Hudaiberdiev and Ovcharenko offers comprehensive analyses and insights about the 'high-occupancy target' (HOT) loci in the human genome. These are considered genomic regions that overlap with transcription factor binding sites. The authors provided very comprehensive analyses of the TF composition characteristics of these HOT loci. They showed that these HOT loci tend to overlap with annotated promoters and enhancers, GC-rich regions, open chromatin signals, and highly conserved regions and that these loci are also enriched with potentially causal variants for different traits.

      Strengths:

      Overall, the definition of HOT loci is clear and the data of HOT regions across the genome can be a useful dataset for studies that use HepG2 or K562 as a model. I appreciate the authors' efforts in presenting many analyses and plots backing up each statement.

      Comments on revised version:

      In the second round of review, I think the authors have sufficiently addressed all of my previous comments. The study itself is very comprehensive, tackling all aspects of the HOT loci, though I still find the paper to be unnecessarily long and long-winded. That said, consistent with the long and detailed paper, the provided GitHub repository and Zenodo archive are well documented. I appreciate that the authors include a detailed README about the different data files available for readers. The list of HOT loci is probably the most useful asset in this manuscript and the authors did a good job documenting data availability in both GitHub and Zenodo.

    5. Reviewer #3 (Public review):

      Summary:

      Hudaiberdiev and Ovcharenko investigate regions within the genome where a high abundance of DNA-associated proteins are located and identify DNA sequence features enriched in these regions, their conservation in evolution, and variation in disease. Using ChIP-seq binding profiles of over 1,000 proteins in three human cell lines (HepG2, K562, and H1) as a data source, they are able to identify nearly 44,000 high-occupancy target loci (HOT) that form at promoter and enhancer regions, thus suggesting these HOT loci regulate housekeeping and cell identity genes. Their primary investigative tool is HepG2 cells, but they employ K562 and H1 cells as tools to validate these assertions in other human cell types. Their analyses use RNA pol II signal, super-enhancer, regular enhancer, and epigenetic marks to support the identification of these regions. The work is notable in that it identifies a set of proteins that are invariantly associated with high-occupancy enhancers and promoters and argues for the integration of these molecules at different genomic loci. These observations are leveraged by the authors to argue that HOT loci are potential sites of transcriptional condensates, a claim that they provide information in support of. Transcriptional condensates are an important "family" of condensates, regulating different types of genes, and this work supports the hypothesis that they possess similar protein partner molecules as those thought to define such bodies.

    1. eLife Assessment

      This important study identifies biallelic variants of DNAH3 in unrelated infertile men and reports infertility in DNAH3 knockout mice. The authors demonstrate that compromised DNAH3 activity decreases the expression of IDA-associated proteins in the spermatozoa of human patients and knockout mice, providing convincing evidence that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility. The study will be of substantial interest to clinicians, reproductive counselors, embryologists, and basic researchers working on infertility and assisted reproductive technology.

    2. Reviewer #1 (Public Review):

      Summary:

      Wang and colleagues identify biallelic variants of DNAH3 in four unrelated Han Chinese infertile men through whole-exome sequencing, which contribute to abnormal sperm flagellar morphology and ultrastructure. To investigate the importance of DNAH3 in male infertility, the authors generated crispant Dnah3 knockout (KO) male mice. They observed that KO mice are also infertile, showing a severe reduction in sperm movement with abnormal IDA (inner dynein arms) and mitochondrion structure. Moreover, nonfunctional DNAH3 expression decreased the expression of IDA-associated proteins in the spermatozoa of patients and KO mice, which are involved in the disruption of sperm motility. Interestingly, the infertility of patients and KO mice is rescued by intracytoplasmic sperm injection (ICSI). Taken together, the authors propose that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility.

      Strengths:

      This work investigates the role of DNAH3 in sperm mobility and male infertility. By using gold-standard molecular biology techniques, the authors demonstrate with exquisite resolution the importance of DNAH3 in sperm morphology, showing strong evidence of its role in male infertility. Overall, this is a very interesting, well-written, and appealing article. All aspects of the study design and methods are well described and appropriate to address the main question of the manuscript. The conclusions drawn are consistent with the analyses conducted and supported by the data.

      Weaknesses:

      The paper is solid, and in its current form, I have not detected relevant weaknesses.

    3. Reviewer #2 (Public Review):

      Wang et al. investigated the role of dynein axonemal heavy chain 3 (DNAH3) in male infertility. They found that variants of DNAH3 were present in four infertile men, and the deficiency of DNAH3 in sperm affects sperm mobility. Additionally, they showed that Dnah3 knockout male mice are infertile. Furthermore, they demonstrated that DNAH3 influences inner dynein arms by regulating several DNAH proteins. Importantly, they showed that intracytoplasmic sperm injection (ICSI) can rescue the infertility in Dnah3 knockout mice and two patients with DNAH3 variants.

      Strengths:

      The conclusions of this paper are well-supported by data.

      Weaknesses:

      The sample/patient size is small; however, the findings are consistent with those of a recent study on DNAH3 in male infertility with 432 patients.

    4. Reviewer #3 (Public Review):

      Summary:

      (1) To further explore the genetic basis of asthenoteratozoospermia, the authors performed whole-exome sequencing analyses among infertile males affected by asthenoteratozoospermia. Four unrelated Han Chinese patients were found to carry biallelic variations of DNAH3, a gene encoding an IDA-associated protein.

      (2) To verify the function of the IDA-associated protein DNAH3, the authors generated a Dnah3-KO mouse model and revealed that the loss of DNAH3 leads to severe male infertility as a result of the severe reduction in sperm movement with abnormal IDA and mitochondrion structures.

      (3) Mechanistically, they confirmed decreased expression of IDA-associated proteins (including DNAH1, DNAH6 and DNALI1) in the spermatozoa from patients with DNAH3 mutations and Dnah3-KO male mice.

      (4) Then, they also found that male infertility caused by DNAH3 deficiency could be rescued by intracytoplasmic sperm injection (ICSI) treatment in humans and mice.

      Strengths:

      (1) In addition to existing research, the authors provided novel variants of DNAH3 as important factors leading to asthenoteratozoospermia. This further expands the spectrum of pathogenic variants in asthenoteratozoospermia.

      (2) By mechanistic studies, they found that DNAH3 deficiency led to decreased expression of IDA-associated proteins, which may be used to explain the disruption of sperm motility and reduced fertility caused by DNAH3 deficiency.

      (3) Then, successful ICSI outcomes were observed in patients with DNAH3 mutations and Dnah3 KO mice, which will provide an important reference for genetic counselling and clinical treatment of male infertility.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Wang and colleagues identify biallelic variants of DNAH3 in four unrelated Han Chinese infertile men through whole-exome sequencing, which contribute to abnormal sperm flagellar morphology and ultrastructure. To investigate the importance of DNAH3 in male infertility, the authors generated crispant Dnah3 knockout (KO) male mice. They observed that KO mice are also infertile, showing a severe reduction in sperm movement with abnormal IDA (inner dynein arms) and mitochondrion structure. Moreover, nonfunctional DNAH3 expression decreased the expression of IDA-associated proteins in the spermatozoa of patients and KO mice, which are involved in the disruption of sperm motility. Interestingly, the infertility of patients and KO mice is rescued by intracytoplasmic sperm injection (ICSI). Taken together, the authors propose that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility.

      Strengths:

      This work investigates the role of DNAH3 in sperm mobility and male infertility. By using gold-standard molecular biology techniques, the authors demonstrate with exquisite resolution the importance of DNAH3 in sperm morphology, showing strong evidence of its role in male infertility. Overall, this is a very interesting, well-written, and appealing article. All aspects of the study design and methods are well described and appropriate to address the main question of the manuscript. The conclusions drawn are consistent with the analyses conducted and supported by the data.

      Weaknesses:

      The paper is solid, and in its current form, I have not detected relevant weaknesses.

      We thank the reviewer very much for the comments.

      Reviewer #2 (Public Review):

      Wang et al. investigated the role of dynein axonemal heavy chain 3 (DNAH3) in male infertility. They found that variants of DNAH3 were present in four infertile men, and the deficiency of DNAH3 in sperm affects sperm mobility. Additionally, they showed that Dnah3 knockout male mice are infertile. Furthermore, they demonstrated that DNAH3 influences inner dynein arms by regulating several DNAH proteins. Importantly, they showed that intracytoplasmic sperm injection (ICSI) can rescue the infertility in Dnah3 knockout mice and two patients with DNAH3 variants.

      Strengths:

      The conclusions of this paper are well-supported by data.

      Weaknesses:

      The sample/patient size is small; however, the findings are consistent with those of a recent study on DNAH3 in male infertility involving 432 patients.

      We extend our sincere gratitude to the expert reviewers for their valuable comments and insightful suggestions.

      A cohort of 587 unrelated infertile men with asthenoteratozoospermia was recruited to investigate the potential genetic etiology using WES. In addition to the DNAH3 mutations identified in four patients, we also identified mutations in several other genes previously reported by our group, including CFAP65 (Zhang et al., 2019. PMID: 31571197), DNAH8 (Yang et al., 2020. PMID: 32681648), DNAH12 (Li et al., 2022. PMID: 34791246), FSIP2 (Zheng et al., 2023. PMID: 35654582), CEP128 (Zhang et al., 2022. PMID: 35296684), CEP78 (Zhang et al., 2022. PMID: 36206347), CT55 (Zhang et al., 2023. PMID: 36481789), SPATA20 (Wang et al., 2023. PMID: 36415156), TENT5D (Zhang et al., 2024. PMID: 38228861), CFAP52 (Jin et al., 2023. PMID: 38126872), CEP70 (Ruan et al., 2023. PMID: 36967801), and PRSS55 (Liu et al., 2022. PMID: 35821214), as well as other unreported variants.

      Reviewer #3 (Public Review):

      Summary:

      (1) To further explore the genetic basis of asthenoteratozoospermia, the authors performed whole-exome sequencing analyses among infertile males affected by asthenoteratozoospermia. Four unrelated Han Chinese patients were found to carry biallelic variations of DNAH3, a gene encoding an IDA-associated protein.

      (2) To verify the function of the IDA-associated protein DNAH3, the authors generated a Dnah3-KO mouse model and revealed that the loss of DNAH3 leads to severe male infertility as a result of the severe reduction in sperm movement with abnormal IDA and mitochondrion structures.

      (3) Mechanistically, they confirmed decreased expression of IDA-associated proteins (including DNAH1, DNAH6 and DNALI1) in the spermatozoa from patients with DNAH3 mutations and Dnah3-KO male mice.

      (4) Then, they also found that male infertility caused by DNAH3 deficiency could be rescued by intracytoplasmic sperm injection (ICSI) treatment in humans and mice.

      Strengths:

      (1) In addition to existing research, the authors provided novel variants of DNAH3 as important factors leading to asthenoteratozoospermia. This further expands the spectrum of pathogenic variants in asthenoteratozoospermia.

      (2) By mechanistic studies, they found that DNAH3 deficiency led to decreased expression of IDA-associated proteins, which may be used to explain the disruption of sperm motility and reduced fertility caused by DNAH3 deficiency.

      (3) Then, successful ICSI outcomes were observed in patients with DNAH3 mutations and Dnah3 KO mice, which will provide an important reference for genetic counselling and clinical treatment of male infertility.

      We are very grateful for the reviewer's careful comments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for The Authors):

      I have carefully read the revised versions of this manuscript, and I would like to thank the authors for addressing all my previous concerns.

      I have no additional comments or suggestions.

      We thank the reviewer for reviewing our revised manuscript.

      Reviewer #2 (Recommendations for The Authors):

      (1) Statistical analyses should be provided alongside the quantification (Fig S1B, S7C).

      According to the suggestions of the reviewer, we have added statistical analyses of the corresponding quantification in the legends of Figure S1 and Figure S7.

      (2) The numbers of sperms counted in Fig S1A should be listed.

      In response to the reviewer's valuable suggestion, we have listed the corresponding ratios of the different morphological defects in the sperm tails of the patients in Figure S1A.

      (3) Due to the high similarities in experimental design, data and conclusions between the current study and previously published work by Meng et al. (2024), as well as the very similar titles of the two studies, it is crucial to emphasize the differences in the Discussion section.

      Many thanks for the reviewer's kind suggestions for our revised manuscript.

      Employing whole-exome sequencing (WES) on infertile men to identify candidate variants, followed by in-silico and functional analysis of these variants, and generating mouse models using CRISPR-Cas9 technology, has proven to be an efficient and widely used approach for uncovering the causative genes of male infertility associated with sperm defects. Both our study and the recent work by Meng et al. utilized this approach to verify whether DNAH3 mutations are a cause of asthenoteratozoospermia. Additionally, we have also updated the title of our study to: 'DNAH3 deficiency causes flagellar inner dynein arm loss and male infertility in humans and mice'.

      Meng et al. reported DNAH3 mutations in asthenoteratozoospermia-affected patients, revealing multiple morphological defects in the sperm tail. Moreover, ultrastructural abnormalities of the flagellar axoneme were evident in these patients, characterized by a disrupted '9+2' arrangement and the notable absence of IDAs. Additionally, they generated Dnah3 KO mice, which were infertile and exhibited moderate morphological abnormalities. While the '9+2' microtubule arrangement in the flagella of their Dnah3 KO mice remained intact, the IDAs on the microtubules were partially absent. In our study, we observed similar phenotypic differences between DNAH3-deficient patients and Dnah3 KO mice. Both studies suggest that DNAH3 plays a crucial role in human and mouse male reproduction.

      However, there are notable differences between the two studies. Firstly, the phenotypes of Dnah3 KO mice showed slight differences. Meng et al. generated two Dnah3 KO mouse models (KO1 and KO2), both of which exhibited significantly higher sperm motility and progressive motility than the mice in our study, in which nearly all sperm were completely immobile. Furthermore, their Dnah3 KO2 mice even displayed motility comparable to WT mice and retained partial fertility. We speculate that these differences may be attributed to variations in mouse genetic background or the presence of a truncated DNAH3 protein resulting from specific knockout strategies. Secondly, we conducted additional research and uncovered novel findings. We revealed that male infertility caused by DNAH3 mutations follows an autosomal recessive inheritance pattern, as confirmed through Sanger sequencing of the patients' parents. We also discovered the dynamic expression and localization of DNAH3 during spermatogenesis in humans and mice through immunofluorescent staining. We further found that DNAH3 deficiency had no impact on ciliary development in the oviduct or on oogenesis in mice, resulting in normal female fertility. Moreover, in the absence of DNAH3 in both humans and mice, the expression of IDA-associated proteins, including DNAH1, DNAH6 and DNALI1, was decreased, while the expression of ODA-associated proteins remained unaffected, indicating that DNAH3 is involved in sperm axonemal development, specifically through its role in the assembly of IDAs. Collectively, our study corroborates the findings of Meng et al. and provides additional unique insights, comprehensively elucidating the critical role of DNAH3 in human and mouse spermatogenesis.

      We have added these discussions in line 275 to line 306.

      Reviewer #3 (Recommendations for The Authors):

      I have no more recommendations for the authors.

      We thank the reviewer for reviewing our revised manuscript.

    1. eLife Assessment

      This study, which proposes a new role of ATG6 in plant immune response, makes a valuable contribution to our understanding of plant immunity. The results suggest a direct interaction between ATG6 and NPR1, a salicylic acid receptor protein, and they will be of interest to scientists studying the regulation of plant immunity. The data presented are convincing, although there are discrepancies between data from fluorescence microscopy and protein blots, particularly in the interpretation of ATG6-mCherry fusion proteins. Addressing these inconsistencies would enhance the study's overall impact.

    2. Reviewer #1 (Public Review):

      The authors showed that autophagy-related genes are involved in plant immunity by regulating the protein level of the salicylic acid receptor, NPR1.

      The experiments are carefully designed and the data is convincing. The authors did a good job of understanding the relationship between ATG6 and NPR1.

      Comments on latest version:

      The authors have already addressed all my comments. I have no further issues with the manuscript.

    3. Reviewer #2 (Public Review):

      The manuscript by Zhang et al. explores the effect of autophagy regulator ATG6 on NPR1-mediated immunity. The authors propose that ATG6 directly interacts with NPR1 in the nucleus to increase its stability and promote NPR1-dependent immune gene expression and pathogen resistance. This novel role of ATG6 is proposed to be independent of its role in autophagy in the cytoplasm. The authors demonstrate through biochemical analysis that ATG6 interacts with NPR1 in yeast and very weakly in vitro. They further demonstrate using overexpression transgenic plants that in the presence of ATG6-mcherry the stability of NPR1-GFP and its nuclear pool is increased.

      Comments on latest version:

      The term "invasion" has to be replaced with infection, as it doesn't have much meaning to this particular study. I already explained this point in the first review, but authors did not address it throughout the manuscript.

      In fig. 1e there's no statistical analysis. How can one show measurements from multiple samples without statistical analysis? All the data points have to be shown in the graph and statistics performed. In the atg6-npr1 and snrk-npr1 pairs no nuclear marker is included. How can one know where the nucleus is, particularly in such poor-quality, low-resolution images? The nucleus marker has to be included in this analysis and shown. This is an important aspect of the study as nuclear localization of ATG6 is proposed to be essential for its new function. Co-localization provided in the fig. S2 cannot complement this analysis, particularly since no cytoplasmic fraction is present for NPR1-GFP in fig. S2.

      In the alignment in fig 2c, it is not explained which species the atg6 sequences are taken from. The predicted NLS has to be shown in the context of either the entire protein sequence alignment or at least individual domain alignment with the indication of conserved residues (consensus). They have to include more species in the analysis, instead of including 3 proteins from a single species. Also, the predicted NLS in atg6 doesn't really have the classical type architecture, which might be an indication that it is a weak NLS, consistent with the fact that the protein has significant cytoplasmic accumulation. They also need to provide the NLS prediction cut-off score, as this parameter is a measure of NLS strength.

      Line 150: the NLS sequence "FLKEKKKKK" is a wrong sequence.

      In fig. 3d no explanation for the error bars is included, and what type of statistical analysis is performed is not explained.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors showed that autophagy-related genes are involved in plant immunity by regulating the protein level of the salicylic acid receptor, NPR1. The experiments are carefully designed and the data is convincing. The authors did a good job of understanding the relationship between ATG6 and NPR1.

      The authors have addressed most of my previous concerns.

      Thank you so much for acknowledging our research. It is incredibly rewarding to see our work recognized. We hope that our findings will inspire new perspectives and foster further exploration in this area.

      Reviewer #2 (Public Review):

      The manuscript by Zhang et al. explores the effect of autophagy regulator ATG6 on NPR1-mediated immunity. The authors propose that ATG6 directly interacts with NPR1 in the nucleus to increase its stability and promote NPR1-dependent immune gene expression and pathogen resistance. This novel role of ATG6 is proposed to be independent of its role in autophagy in the cytoplasm. The authors demonstrate through biochemical analysis that ATG6 interacts with NPR1 in yeast and very weakly in vitro. They further demonstrate using overexpression transgenic plants that in the presence of ATG6-mcherry the stability of NPR1-GFP and its nuclear pool is increased.

      Comments on revised version:

      The authors demonstrate the correlation between overexpression of atg6 and higher stability and activity of npr1. They claim a novel activity of atg6 in the nucleus.

      Overall, the experimental scope of the study is solid, however, the over-interpretation of the results substantially reduces the significance and value of this study for the target plant immunity readership.

      Thank you very much for your constructive and insightful comments, as well as for acknowledging the experimental scope of this study. In addition, we have made every effort to address the over-interpretation of the results, as per your comments, ensuring they are more accurate and concise. In the revised version, the modified content has been highlighted in blue to clearly indicate the changes made.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors have addressed most of my concerns. I have no further comments.

      Thank you so much for acknowledging our research. It is incredibly rewarding to see our work recognized. We hope that our findings will inspire new perspectives and foster further exploration in this area.

      Reviewer #2 (Recommendations For The Authors):

      As I previously commented, in fig. 2a and c, the discrepancy between levels of atg6-mcherry in microscope image vs WB has to be explained. The explanation provided by the authors is incomplete and may mislead. The most likely reason for the difference is that the fluorescence signal in fig. 2a is predominantly from free mCherry, rather than the atg6-mcherry fusion. This has to be included in the main text to avoid misleading the reader.

      Thank you very much for your constructive and insightful comments. In response, we have incorporated the necessary explanations into the revised manuscript (lines 160-164).

      In fig. 1B, the PD fraction has to show the size range of free GST. Also, please use "anti" to indicate that these are immunoblots.

      Thank you for pointing this out. In the revised manuscript, we identified the range of free GST and used "anti:GST and anti:His" to indicate that these are immunoblots.

      In fig 1C, the WB has to show the free GFP band in the input and IP fractions together with NPR1, rather than in separate blots.

      Thank you for bringing this to our attention. Fig. 1c has been replaced, and the updated image now shows the free GFP band in the input and IP fractions together with NPR1-GFP.

      In fig. 1d, the bifc signal has to be quantified from multiple images across the biological repeats. Also, there's no significance in showing the chlorophyll autofluorescence. What is the purpose of this? They need to use a nuclear marker instead.

      Thank you for your suggestion. Based on your input, we utilized ImageJ software to quantify the YFP fluorescence signal. A total of n = 15 independent images were analyzed, and the corresponding results have been added to Figure 1e. Monitoring chlorophyll autofluorescence serves as a useful background signal, aiding in the distinction between the fluorescence signal of the target protein and background noise. This approach helps reduce potential signal overlap or interference during the experiment, thereby enhancing the reliability of the results.

      Please provide a sequence alignment with multiple ATGs to show the conservation of the presumed bipartite NLS. This information has to be included in the main data.

      Thank you very much for your constructive and insightful comments. We analyzed the putative nuclear localization signal (NLS) in the ATG6 protein sequence using the online INSP (Identification of Nuclear Signal Peptide) prediction software (http://www.csbio.sjtu.edu.cn/bioinf/INSP/). The prediction results indicated the presence of a potential nuclear localization sequence "FLKEKKKKK" within the ATG6 protein, spanning from the 217th to the 223rd amino acid. Additionally, we utilized INSP to investigate the nuclear localization sequences of various ATG proteins (TaATG6a [1], TaATG6b [1], TaATG6c [1], SlATG8h [2]) that have been previously reported to localize in the nucleus. This analysis revealed a relatively conserved NLS sequence motif: "E/K-K/E-K-K-L/K-K" in these ATG proteins. In line with your suggestion, the results of this sequence comparison have been incorporated into the revised manuscript as Figure 2c. The revised manuscript includes a description of the corresponding results (lines 146-156).

      Fig. 3d and f, how many blots are used for this quantification? Please include all the individual analyzed blots in the supplementary data. In addition, if you present such quantification with error bars, then statistical analysis is required.

      Thank you for pointing this out. In Figure 3d, three independent blots were utilized for this quantification. In Figure 3f, two independent blots were used. The individual analyzed blots have been included in the supplementary Figure 7. We also conducted a statistical analysis as shown in Fig 3d and f, with a detailed description included in the legend section (lines 858 and 861).

      In fig. 4, please indicate what is the normalizing gene. Also, what are the error bars?

      Thank you for pointing this out. In Fig. 4, values are means ± SD (n = 3 biological replicates). The AtActin gene was used as the internal control. We have included a detailed description in the figure notes.

      In fig. 4b the labeling is missing.

      Thank you for bringing this to our attention. We have included the labeling for Fig. 4 in the revised manuscript.

      Lines 236-239: this statement contradicts the data in fig. 5b: the levels of NPR1-GFP are actually reduced in the presence of atg6 at 24h. So, this result has to be described more accurately by stating that the increase is transient, and it is evident more at 8h, but not at 20-24h.

      Thank you very much for your constructive and insightful comments. We have revised the description of this section to provide a more accurate account of the results (lines 253-258).

      Reference

      (1) Yue J, Sun H, Zhang W, et al. Wheat homologs of yeast ATG6 function in autophagy and are implicated in powdery mildew immunity. BMC Plant Biol. 2015;15:95.

      (2) Li F, Zhang M, Zhang C, et al. Nuclear autophagy degrades a geminivirus nuclear protein to restrict viral infection in solanaceous plants. New Phytol. 2020;225:1746-1761.

    1. eLife Assessment

      This important work substantially advances our understanding of nocturnal animal navigation and the ways that animals use polarized light. The evidence supporting the conclusions is convincing, with elegant behavioural experiments in actively navigating ants. The work will be of interest to biologists working on animal navigation or sensory ecology.

    2. Reviewer #1 (Public review):

      Freas et al. investigated if the exceedingly dim polarization pattern produced by the moon can be used by animals to guide a genuine navigational task. The sun and moon are celestial beacons for directional information, but they can be obscured by clouds, canopy, or the horizon. However, even when hidden from view, these celestial bodies provide directional information through the polarized light patterns in the sky. While the sun's polarization pattern is famously used by many animals for compass orientation, until now it has never been shown that the extremely dim polarization pattern of the moon can be used for navigation. To test this, Freas et al. studied nocturnal bull ants, by placing a linear polarizer in the homing path on a freely navigating ant 45 degrees shifted to the moon's natural polarization pattern. They recorded the homing direction of an ant before entering the polarizer, under the polarizer, and again after leaving the area covered by the polarizer. The results very clearly show, that ants walking under the linear polarizer change their homing direction by about 45 degrees in comparison to the homing direction under the natural polarization pattern and change it back after leaving the area covered by the polarizer again. These results can be repeated throughout the lunar month, showing that bull ants can use the moon's polarization pattern even under crescent moon conditions. Finally, the authors show, that the degree in which the ants change their homing direction is dependent on the length of their home vector, just as it is for the solar polarization pattern.

      The behavioral experiments are very well designed, and the statistical analyses are appropriate for the data presented. The authors' conclusions are nicely supported by the data and clearly show nocturnal bull ants use the dim polarization pattern of the moon for homing, in the same way many animals use the sun's polarization pattern during the day. This is the first proof of the use of the lunar polarization pattern in any animal.

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to understand whether polarised moonlight could be used as a directional cue for nocturnal animals homing at night, particularly at times of night when polarised light is not available from the sun. To do this, the authors used nocturnal ants, and previously established methods, to show that the walking paths of ants can be altered predictably when the angle of polarised moonlight illuminating them from above is turned by a known angle (here +/- 45 degrees).

      Strengths:

      The behavioural data are very clear and unambiguous. The results clearly show that when the angle of downwelling polarised moonlight is turned, ants turn in the same direction. The data also clearly show that this result is maintained even for different phases (and intensities) of the moon, although during the waning cycle of the moon the ants' turn is considerably less than may be expected.

      Weaknesses:

      The final section of the results - concerning the weighting of polarised light cues into the path integrator - lacks clarity and should be re-worked and expanded in both the Methods and the Results (also possibly with an extra methods figure). I was really unsure of what these experiments were trying to show or what the meaning of the results actually are.

      Impact:

      The authors have discovered that nocturnal bull ants, while homing back to their nest holes at night, are able to use the dim polarised light pattern formed around the moon for path integration. Even though similar methods have previously shown the ability of dung beetles to orient along straight trajectories for short distances using polarised moonlight, this is the first evidence of an animal that uses polarised moonlight in homing. This is quite significant, and their findings are well supported by their data.

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript presents a series of experiments aimed at investigating orientation to polarized lunar skylight in a nocturnal ant, the first report of its kind that I am aware of.

      Strengths:

      The study was conducted carefully and is clearly explained here.

      Weaknesses:

      The revised manuscript is much improved.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Freas et al. investigated if the exceedingly dim polarization pattern produced by the moon can be used by animals to guide a genuine navigational task. The sun and moon have long been celestial beacons for directional information, but they can be obscured by clouds, canopy, or the horizon. However, even when hidden from view, these celestial bodies provide directional information through the polarized light patterns in the sky. While the sun's polarization pattern is famously used by many animals for compass orientation, until now it has never been shown that the extremely dim polarization pattern of the moon can be used for navigation. To test this, Freas et al. studied nocturnal bull ants, by placing a linear polarizer in the homing path on freely navigating ants 45 degrees shifted to the moon's natural polarization pattern. They recorded the homing direction of an ant before entering the polarizer, under the polarizer, and again after leaving the area covered by the polarizer. The results very clearly show, that ants walking under the linear polarizer change their homing direction by about 45 degrees in comparison to the homing direction under the natural polarization pattern and change it back after leaving the area covered by the polarizer again. These results can be repeated throughout the lunar month, showing that bull ants can use the moon's polarization pattern even under crescent moon conditions. Finally, the authors show, that the degree in which the ants change their homing direction is dependent on the length of their home vector, just as it is for the solar polarization pattern. 

      The behavioral experiments are very well designed, and the statistical analyses are appropriate for the data presented. The authors' conclusions are nicely supported by the data and clearly show that nocturnal bull ants use the dim polarization pattern of the moon for homing, in the same way many animals use the sun's polarization pattern during the day. This is the first proof of the use of the lunar polarization pattern in any animal.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors aimed to understand whether polarised moonlight could be used as a directional cue for nocturnal animals homing at night, particularly at times of night when polarised light is not available from the sun. To do this, the authors used nocturnal ants, and previously established methods, to show that the walking paths of ants can be altered predictably when the angle of polarised moonlight illuminating them from above is turned by a known angle (here +/- 45 degrees).

      Strengths: 

      The behavioural data are very clear and unambiguous. The results clearly show that when the angle of downwelling polarised moonlight is turned, ants turn in the same direction. The data also clearly show that this result is maintained even for different phases (and intensities) of the moon, although during the waning cycle of the moon the ants' turn is considerably less than may be expected.

      Weaknesses: 

      The final section of the results - concerning the weighting of polarised light cues into the path integrator - lacks clarity and should be reworked and expanded in both the Methods and the Results (also possibly with an extra methods figure). I was really unsure of what these experiments were trying to show or what the meaning of the results actually are.

      Rewrote these sections and added figure panel to Figure 6.

      Impact: 

      The authors have discovered that nocturnal bull ants while homing back to their nest holes at night, are able to use the dim polarised light pattern formed around the moon for path integration. Even though similar methods have previously shown the ability of dung beetles to orient along straight trajectories for short distances using polarised moonlight, this is the first evidence of an animal that uses polarised moonlight in homing. This is quite significant, and their findings are well supported by their data.

      Reviewer #3 (Public Review): 

      Summary: 

      This manuscript presents a series of experiments aimed at investigating orientation to polarized lunar skylight in a nocturnal ant, the first report of its kind that I am aware of.

      Strengths: 

      The study was conducted carefully and is clearly explained here. 

      Weaknesses: 

      I have only a few comments and suggestions, that I hope will make the manuscript clearer and easier to understand.

      Time compensation or periodic snapshots 

      In the introduction, the authors compare their discovery with that in dung beetles, which have only been observed to use lunar skylight to hold their course, not to travel to a specific location as the ants must. It is not entirely clear from the discussion whether the authors are suggesting that the ants navigate home by using a time-compensated lunar compass, or that they update their polarization compass with reference to other cues as the pattern of lunar skylight gradually shifts over the course of the night - though in the discussion they appear to lean towards the latter without addressing the former. Any clues in this direction might help us understand how ants adapted to navigate using solar skylight polarization might adapt use to lunar skylight polarization and account for its different schedule. I would guess that the waxing and waning moon data can be interpreted to this effect.

      Added a paragraph discussing this distinction in mechanisms and the limits of the current data set in untangling them. An interesting topic for a follow up to be sure.

      Effects of moon fullness and phase on precision 

      As well as the noted effect on shift magnitudes, the distributions of exit headings and reorientations also appear to differ in their precision (i.e., mean vector length) across moon phases, with somewhat shorter vectors for smaller fractions of the moon illuminated. Although these distributions are a composite of the two distributions of angles subtracted from one another to obtain these turn angles, the precision of the resulting distribution should be proportional to the original distributions. It would be interesting to know whether these differences result from poorer overall orientation precision, or more variability in reorientation, on quarter moon and crescent moon nights, and to what extent this might be attributed to sky brightness or degree of polarization.

      See below for response to this and the next reviewer comment

      N.B. The Watson-Williams tests for difference in mean angle are also sensitive to differences in sample variance. This can be ruled out with another variety of the test, also proposed by Watson and Williams, to check for unequal variances, for which the F statistic is [(n2-1)*(n1-R1)] / [(n1-1)*(n2-R2)] or its inverse, whichever is >1.

      We have looked at the amount of variance from the mean heading direction in terms of both the shifts and the reorientations and found no significant difference in variance between all relevant conditions. It is possible (and probably likely) that with a higher n we might find these differences but with the current data set we cannot make statistical statements regarding degradations in navigational precision.  

      As an additional analysis to address the Watson-Williams test’s sensitivity to changes in variance, we have added var test comparisons for each of the comparisons; this is a well-established test for differences in variance. None of these were significantly different, suggesting the observed differences in the WW tests are due to changes in the mean vector and not the distribution. We have added this test to the text.
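      The variance check that the reviewer describes can be sketched as follows. This is a minimal illustration of the F ratio quoted in the comment, not the authors' analysis (they report using a var test instead). The function names and the sample headings are hypothetical, and the ratio is generally treated as an approximation suited to fairly concentrated samples.

```python
import math


def resultant_length(angles_deg):
    """Length R of the summed unit vectors for a sample of headings in degrees."""
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.hypot(c, s)


def concentration_f(sample1, sample2):
    """F ratio [(n2-1)*(n1-R1)] / [(n1-1)*(n2-R2)], or its inverse, whichever is >= 1.

    Returns (F, (df_numerator, df_denominator)). Each term (n - R) / (n - 1)
    measures the dispersion of one sample. A sample in which every heading is
    identical has zero dispersion and would cause a division by zero here.
    """
    n1, n2 = len(sample1), len(sample2)
    d1 = (n1 - resultant_length(sample1)) / (n1 - 1)
    d2 = (n2 - resultant_length(sample2)) / (n2 - 1)
    if d1 >= d2:
        return d1 / d2, (n1 - 1, n2 - 1)
    return d2 / d1, (n2 - 1, n1 - 1)


# Hypothetical headings (degrees), purely for illustration.
tight = [40, 45, 50, 44, 46]
loose = [0, 45, 90, 30, 60]
f_stat, dof = concentration_f(tight, loose)
```

      The resulting statistic would then be compared against an F distribution with the returned degrees of freedom; a non-significant result suggests that a difference found by the Watson-Williams test reflects the mean direction rather than the spread.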

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      I have only very few minor suggestions to improve the manuscript: 

      (1) While I fully agree with the authors that their study, to the best of my knowledge, provides the first proof (in any animal) of the use of the moon's polarization pattern, the many repetitions of this fact disturb the flow of the text and could be cut at several instances. 

      Yes, it is indeed repeated to an annoying degree. 

      We have removed these beyond bookending mentions (Abstract and Discussion).

      (2) In my opinion, the authors did not change the "ambient polarization pattern" when using the linear polarization filter (e.g., l. 55, 170, 177 ...). The linear polarizer presents an artificial polarization pattern with a much higher degree of polarization in comparison to the ambient polarization pattern. I would suggest re-phrasing this, to emphasize the artificial nature of the polarization pattern under the polarizer.

      We have made these suggested changes throughout the text to clarify. We no longer say the ambient pattern was   

      (3) Line 377: I do not see the link between the sentence and Figure 7 

      Changed where in the discussion we refer to Figure 7.

      (4) Figure 7 upper part: In my opinion, the upper part of Figure 7 does not add any additional value to the illustration of the data as compared to Figure 5 and could be cut.

      We thought it might be easier for some readers to see the shifts as a dial representation with the shift magnitude converted to 0-100% rather than the shifts in Figure 5. This makes it somewhat like a graphical abstract summarising the whole study.

      I agree that Figure 5 tells the same story but a reader that has little background in directional stats might find figure 7 more intuitive. This was the intent at least. 

      If it becomes a sticking point, then we can remove the upper portion.  
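      The 0-100% conversion mentioned for the dial representation can be illustrated with a short sketch. It assumes that the percentage is the observed turn relative to the 45-degree rotation of the polarizer; that definition is an assumption made for illustration and is not stated in this response. The function names are hypothetical.

```python
def signed_turn(before_deg, after_deg):
    """Smallest signed angle in degrees from one heading to another, in (-180, 180]."""
    diff = (after_deg - before_deg) % 360.0
    return diff - 360.0 if diff > 180.0 else diff


def shift_percent(turn_deg, expected_deg=45.0):
    """Observed turn as a percentage of the expected shift (assumed to be 45 degrees)."""
    return 100.0 * turn_deg / expected_deg
```

      Under this assumption, a heading that changes from 350 degrees to 35 degrees counts as a full 45-degree turn (100%), and a 22.5-degree turn maps to 50%.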

      Reviewer #2 (Recommendations For The Authors): 

      Minor corrections and queries 

      Line 117: THE majority 

      Corrected

      Lines 129-130: Do you have a reference to support this statement? I am unaware of experiments that show that homing ants count their steps, but I could have missed it.

      We have added the references that unpack the ant pedometer.  

      Line 140: remove "the" in this line. 

      Removed

      Line 170: We need more details here about the spectral transmission properties of the polariser (and indeed which brand of filter, etc.). For instance, does it allow the transmission of UV light?

      Added

      Line 239: "...tested identicALLY to ...." 

      Corrected

      Lines 242-258 (Vector testing): I must admit I found the description of these experiments very difficult to follow. I read this section several times and felt no wiser as a result. I think some thought needs to be given to better introduce the reader to the rationale behind the experiment (e.g., start by expanding lines 243-246, and maybe add a methods figure that shows the different experimental procedures).

      I have rewritten this section of the methods to clearly state the experimental rationale and to make the methodology clearer.

      Also added a methods panel to Figure 6.

      Line 247: "reoriented only halfway". What does this mean? Do you mean with half the expected angle?

      Yes, this is a bit unclear. We have altered for clarity:

      ‘only altered their headings by about half of the 45° e-vector shift (25.2°± 3.7°), despite being tested on near-full-moon nights.’

      Results section (in general): In Figure 1 (which is a very nice figure!) you go to all the trouble of defining b degrees (exit headings) and c degrees (reorientation headings), which are very intuitive for interpreting the results, and then you totally abandon these convenient angles in favour of an amorphous Greek symbol Phi (Figs. 2-6) to describe BOTH exit and reorientation headings. Why?? It becomes even more confusing when headings described by Phi can be typically greater than 300 degrees in the figures, but they are never even close to this in the text (where you seem to have gone back to using the b degrees and c degrees angles, without explicitly saying so). Personally, I think the b degrees and c degrees angles are more intuitive (and should be used in both the text and the figures), but if you do insist on using Phi then you should use it consistently in both the text and the figures. 

      Replaced Phi with b° and c° for both figures and in the text.

      Finally, for reorientation angles in Figure 4A, you say that the angle is 16.5 degrees. This angle should have been 143.5 degrees to be consistent with other figures. 

      Yes, the reorientation was erroneously copied from the shift data (it is identical in both the +45 shift and reorientation for Figure 4A). This has now been corrected.

      Line 280, and many other lines: Wherever you refer to two panels of the same figure, they should be written as (say) Figure 2A, B not Figure 2AB.

      Changed as requested throughout the text.

      Line 295 (Waxing lunar phases): For these experiments, which nest are you using? 1 or 2?

      We have added that this is nest 1. 

      Figure 3B: The title of this panel should be "Waxing Crescent Moon" I think. 

      Ah yes, this is incorrect in the original submission. I have fixed this.

      Lines 312-313: Here it sounds as though the ants went right back to the full +/- 45 degrees orientations when they clearly didn't (it was -26.6 degrees and 189.9 degrees). Maybe tone the language down a bit here.

      Changed this to make clear the orientation shift is only ‘towards’ the ambient lunar e-vector.

      Line 327: Insert "see" before "Figure 5" 

      Added

      Line 329: See comment for Line 295. 

      We have added that this is nest 1. 

      Lines 357-373 (Vector testing): Again, because of the somewhat confusing methods section describing these experiments, these results were hard to follow, both here and in the Discussion. I don't really understand what you have shown here. Re-think how you present this (and maybe re-working the Methods will be half the battle won). 

      I have rewritten these sections to make clear that these are ants with different vector lengths (6m vs. 2m) tested at the same location. Hopefully this is much clearer, but if these portions remain confusing, a full rename of the conditions may be in order. Names such as long vector and short vector would help but would not describe the purpose of the test, which is to control for location; hence the current condition names. As it stands, I hope the new clarifications adequately describe the reasoning while keeping the condition names. Of course, I am happy to make more changes here, as making this clear to readers is important for driving home that the path integrator is in play.

      See current change to results as an example: ‘Both foragers with a long ~6m remaining vector (Halfway Release), or a short ~2m remaining vector (Halfway Collection & Release), tested at the same location, exhibited significant shifts to the right of initial headings when the e-vector was rotated clockwise +45°.’

      Line 361: I think this should be 16.8 not 6.8 

      Yes, you are correct. Fixed in text (16.8).

      Line 365: I think this should be -12.7 not 12.7 

      Yes, you are correct. Fixed in text (–12.7).

      Line 408: "morning twilight". Should this be "morning solar twilight"? Plus "M midas" should be "M. midas"

      Added and fixed respectively.

      Line 440. "location" is spelt wrong. 

      Fixed spelling.

      Line 444: "...WITH longer accumulated vectors, ..." 

      Added ‘with’ to sentence. 

      Line 447: Remove "that just as"

      Removed.

      Line 448: "Moonlight polarised light" should be "Polarised moonlight" 

      Corrected.

      Lines 450-453: This sentence makes little sense scientifically or grammatically. A "limiting factor" can't be "accomplished". Please rephrase and explain in more detail.

      This sentence has been rephrased:

      ‘The limiting factors to lunar cue use for navigation would instead be the ant’s detection threshold to either absolute light intensity, polarization sensitivity and spectral sensitivity. Moonlight is less UV rich compared to direct sunlight and the spectrum changes across the lunar cycle (Palmer and Johnsen 2015).’

      Line 474: Re-write as "... due to the incorporation of the celestial compass into the path integrator..."

      Added.

      Reviewer #3 (Recommendations For The Authors): 

      Minor comments 

      Line 84 I am not sure that we can infer attentional processes in orientation to lunar skylight, at least it has not yet been investigated.

      Yes, this is a good point. We have changed ‘attend’ to ‘use’.  

      Line 90 This description of polarized light is a little vague; what is meant by the phrase "waves which occur along a single plane"? (What about the magnetic component? These waves can be redirected, are they then still polarized? Circular polarization?). I would recommend looking at how polarized light is described in textbooks on optics.

      We have rewritten the polarised light section to be clearer using optics and light physics for background. 

      Line 92 The phrase "e-vector" has not been described or introduced up to this point.

      We now introduce e-vector and define it. 

      ‘Polarised light comprises light waves which occur along a single plane and are produced as a by-product of light passing through the upper atmosphere (Horváth & Varjú 2004; Horváth et al., 2014). The scattering of this light creates an e-vector pattern in the sky, which is arranged in concentric circles around the sun or moon's position with the maximum degree of polarisation located 90° from the source. Hence when the sun/moon is near the horizon, the pattern of polarised skylight is particularly simple with uniform direction of polarisation approximately parallel to the north-south axes (Dacke et al., 1999, 2003; Reid et al. 2011; Zeil et al., 2014).’

      Happy to make further changes as well.  

      Line 107 Diurnal dung beetles can also orient to lunar skylight if roused at night (Smolka et al., 2016), provided the sky is bright enough. Perhaps diurnal ants might do the same?

      Added the diurnal dung beetles mention as well as the reference.

      Also, a very good suggestion using diurnal bull ants.

      Line 146 Instead of lunar calendar the authors appear to mean "lunar cycle". 

      Changed

      Line 165 In Figure 1B, it looks like visual access to the sky was only partly "unobstructed". Indeed, foliage covers at least part of the sky right up to the zenith.

      We have added that the sky is partially obstructed. 

      Line 179 This could also presumably be checked with a camera? 

      For this testing we tried to keep equipment to a minimum for a single researcher walking to and from the field site, given the lack of public transport between 1 and 4am. But yes, for future work a camera-based confirmation system would be easier. 

      Line 243 The abbreviation "PI" has not been described or introduced up to this point.

      Changed to ‘path integration derived vector lengths….’

      Line 267 The method for comparing the leftwards and rightwards shifts should be described in full here (presumably one set of shifts was mirrored onto the other?).

      We have added the below description to indicate the full description of the mirroring done to counterclockwise shifts.

      ‘To assess shift magnitude between −45° and +45° foragers within conditions, we calculated the mirror of shift in each −45° condition, allowing shift magnitude comparisons within each condition. Mirroring the −45° conditions was calculated by mirroring each shift across the 0° to 180° plane and was then compared to the corresponding unaltered +45 condition.’
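      The mirroring described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: it assumes shifts are expressed in degrees on a 0° to 360° circle, so that reflecting across the 0° to 180° axis maps an angle θ to −θ (mod 360°). The function name and the example values are hypothetical.

```python
def mirror_across_0_180(angle_deg):
    """Reflect a heading (degrees) across the 0-180 degree axis.

    A counterclockwise shift of x degrees (stored as 360 - x) becomes a
    clockwise shift of x degrees; 0 and 180 map onto themselves.
    """
    return (-angle_deg) % 360.0

# Hypothetical shifts recorded in a -45 degree condition (counterclockwise,
# so they sit just below 360 on the circle).
minus_45_shifts = [333.4, 340.0, 315.0]

# After mirroring, the shifts lie on the clockwise side and can be compared
# directly with the unaltered +45 degree condition.
mirrored = [mirror_across_0_180(a) for a in minus_45_shifts]
print([round(a, 1) for a in mirrored])
```

      Mirroring only changes the sign of each shift, so the magnitude of every shift is preserved and the two rotation directions become directly comparable.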

      Discussion Might the brightness and spectrum of lunar skylight also play a role here?

      We have added a section to the discussion to mention the aspects of moonlight which may be important to these animals, including the spectrum, brightness and polarisation intensity.  

      Line 451 The sensitivity threshold to absolute light intensity would not be the only limiting factor here. Polarization sensitivity and spectral sensitivity may also play a role (moonlight is less UV rich than sunlight and the spectrum of twilight changes across the lunar cycle: Palmer & Johnsen, 2015). 

      Added this clarification.

      Line 478 Instead of the "masculine ordinal" symbol used (U+00BA) here a degree symbol (U+00B0) should be used.

      Ah thank you, we have replaced this everywhere in the text.  

      Line 485 It should be possible to calculate the misalignment between polarization pattern before and after this interruption of celestial cues. Does the magnitude of this misalignment help predict the size of the reorientation?

      Reorientations are highly correlated with the shift size under the filter, which makes sense as larger shifts mean that foragers need to turn back more to reorient to both the ambient pattern and to return to their visual route. Reorientation sizes do not show a consistent reduction compared to under-the-filter shifts when the lunar phase is low and is potentially harder to detect.

      I have reworked this line in the text, as I do not think there is much evidence for misalignment. It might be more precise to say that overnight periods where the moon is not visible may adversely impact the path integrator estimate, though the full impact of this celestial cue gap is currently unknown, as is whether other cues might also play a role.

      Line 642 "from their" should be "relative to" 

      Changed as requested

      Figure 1B Some mention should be made of the differences in vegetation density. 

      Added a sentence to the figure caption discussing the differences in both vegetation along the horizon and canopy cover.

      Figures 2-6 A reference line at 0 degrees change might help the reader to assess the size of orientation changes visually. Confidence intervals around the mean orientation change would also help here.

      We have now added circular grid lines and confidence intervals to the circular plots. These should help make the heading changes clear to readers.

    1. eLife Assessment

      In this compelling study, the authors examine the interactions between stellate cells and PV+ interneurons in the medial entorhinal cortex. Huang et al. focus on the spatial distribution of synaptic inputs and demonstrate that closely located neuron pairs receive common inputs, suggesting a structured functional organization in the entorhinal cortex. Advanced dual whole-cell patch recordings further reveal patterns of postsynaptic activation, indicating intensive interactions within clusters of these neurons, with weaker interactions between clusters. These findings offer significant insights into the functional dynamics of the entorhinal cortex and the circuit mechanisms that shape grid cell activity. This study is important not only for the field of MEC and grid cells, but also for broader fields of continuous attractor networks and neural circuits.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment 

      This is a valuable study in the Jurkat T cell line that calls attention to the phosphorylation of formin-like 1 β and its role in polarization of CD63 positive extracellular vesicles (referred to as exosomes). The evidence presented in the Jurkat model is solid, but concerns have been raised about the statistical analysis and more details would be required to fully assess the significance of the results. For example, ANOVA is the method described, but it requires large amounts of normally distributed data in multiple groups and cannot be used to make pairwise comparisons within groups, which would require a post-hoc method (which is not discussed). In addition, the data showing formin-like 1 β in primary human T cells without and with a CAR are provided without quantification and don't investigate any of the novel claims, so they don't address the relevance of Formin-like 1 β beyond the Jurkat model. Nonetheless, the consistent trends in the body of the study provide solid support for the claims.

      We acknowledge this general statement on statistics. We have now discussed the post-hoc method (Tukey) and provided more details on it as a new Supplementary data S13 (p-values after applying Tukey's post-hoc method to the one-way ANOVA for all the pairwise comparisons). Additionally, we have now provided quantitative data on the percentage of primary cells with and without CAR that show FMNL1 accumulations at the immune synapse (Suppl. Fig. S7). Regarding the data in primary human T cells, we have already changed the title of the manuscript to strictly adjust it to the main body of the data and our conclusions in the well-established Jurkat synapse model. We also want to emphasize that we did not intend to extrapolate the relevance of our data regarding FMNL1 and exosomes beyond the Jurkat model. Thus, we have included some additional sentences and/or nuances in the Discussion to somewhat soften our statements in this regard (i.e. “…..provided that the FMNL1 effect on exosome secretion in Jurkat cells can be extended to primary T lymphocytes”) and to clarify this important point.

      Reviewer 1:

      (1) The main findings have been obtained in clones of Jurkat cells. They have not been confirmed in primary T cells. The only experiment performed in primary cells is shown in Figure S7 (primary human T lymphoblasts) for which only the distribution of FMNL1 is shown without quantification. No results presenting the effect of FMNL1 KO and expression of mutants in primary T cells are shown.

      The referee is right regarding the extension of exosome secretion studies to primary human T lymphocytes. Unfortunately, it is well known that primary T lymphocytes are extremely difficult to transfect. Moreover, the expression of our large bi-cistronic plasmids (>15 Kb) is very inefficient, coupled with the challenge of expressing large proteins, such as the 180 kDa YFP-FMNL1 chimeric variants. The convergence of all these undesirable factors synergistically hampers these studies and we have been unable to consistently achieve enough transfection efficiency to perform these experiments. However, the role of FMNL1 on MTOC/MVB polarization in Jurkat cells, confirmed in this manuscript, has already been extended to primary CD8+ T cell clones (DOI: 10.1016/j.immuni.2007.01.008). Given that exosome secretion requires MTOC/MVB polarization both in Jurkat and primary T lymphoblasts (10.1038/cdd.2010.184, 10.3389/fimmu.2019.00851), this suggests FMNL1 may also control exosome secretion in primary T cells, although the formal demonstration will require further research.

      A new sentence has been included in the Discussion to address this important point. Regarding the second request, we have quantified the images mentioned in Suppl. Fig. S7, and the percentages of fixed T cells showing FMNL1 accumulations at the immune synapse are included in the figure legend.

      (2) In-depth analysis of the defect in actin remodeling (quantification of the images, analysis of some key actors of actin remodeling) is still lacking. Only F-actin is shown; no attempt to look more precisely at actors of actin remodeling has been made.

      The referee is right. Since we have obtained new results on the role of FMNL1 on actin remodeling, we have focused on this formin, which is already a key actor in this process. In this context, we have previously shown that the formin Dia1, another major actor of actin remodeling in T lymphocytes along with FMNL1 (DOI: 10.1016/j.immuni.2007.01.008), does not undergo phosphorylation upon PKC activation (Suppl. Fig. 5 in https://doi.org/10.1080/20013078.2020.1759926). Since our aim was to unravel the PKC-mediated pathway controlling actin remodeling, we did not pursue further studies on Dia1. Therefore, we have included a new sentence to emphasize the specific role of FMNL1 phosphorylation, but not Dia1, in this regard. Nonetheless, future studies aimed at identifying new important players in this or related pathways could offer significant insights.

      (3) The defect in the secretion of extracellular vesicles is still very preliminary. Examples of STED images given by the authors are nice, yet no quantification is performed.

      The referee is right regarding this point and we acknowledge this comment. Accordingly, we have now quantified the STED images and provided numerical data on the percentages of cells exhibiting the observed phenotypes (see the figure legend for Fig. 10).

      (4) Results shown in Figure S12 on the colocalization of proteins phosphorylated on Ser/Thr are still not convincing. It seems indeed that "phospho-PKC" is labeling more preferentially the CMAC positive cells (Raji) than the Jurkat T cells. It is thus particularly difficult to conclude on the colocalization and even more on the recruitment of phosphorylated-FMNL1 at the IS. Thus, these experiments are not conclusive and cannot be the basis even for their cautious conclusion: "Although all these data did not allow us to infer that FMNL1b is phosphorylated at the IS due to the resolution limit of confocal and STED microscopes, the results are compatible with the idea that both endogenous FMNL1 and YFP-FMNL1bWT are specifically phosphorylated at the cIS".

      The referee may be correct regarding the detail of the "phospho-PKC" labeling. However, it cannot be overlooked that Raji cells also contain proteins that are or may be potential PKC substrates. As a matter of fact, Raji cells also express FMNL1. In addition, MHCII triggering in B cells induces PKC activation (https://doi.org/10.1002/eji.200323351). Regarding which cell type is preferentially labeled, this is a variable topic depending on the analyzed synapse. 

      It is true that there are likely several PKC substrates, both in Jurkat and in Raji cells, but our point is that one of these substrates either colocalizes with FMNL1 or is FMNL1 itself. We do not claim at any point that FMNL1 is the only PKC substrate, either in Jurkat or in Raji cells. 

      Apparently, the referee has either overlooked our results or we did not emphasize them sufficiently. Our results effectively validated the PKC substrate antibody, both on endogenous phospho-FMNL1 and phospho-YFP-FMNL1β by WB (Fig. 3). Moreover, the phospho-PKC antibody does not recognize the YFP-FMNL1β S1086A or S1086D variants (Fig. 3). Last, but not least, when FMNL1 is interfered in the Jurkat cell, the phospho-PKC signal does not colocalize with FMNL1, but it strongly colocalizes at the synapse with expressed YFP-FMNL1βWT in the Jurkat cell (Fig. S11). Indeed, YFP-FMNL1β belonged to the Jurkat cell. Taken together, these results demonstrate that: 1. the phospho-PKC antibody is specific, 2. the phospho-PKC antibody certainly recognizes phosphorylated YFP-FMNL1β but not its non-phosphorylatable mutant variants, 3. the colocalization of phospho-PKC with anti-FMNL1 is specific. We have included some sentences to clarify these points and to avoid possible misunderstandings by potential readers. We thank the referee for his/her clarifying point, and we firmly believe our mentioned cautious conclusion is strictly correct, although we have tuned it to consider the possibility that a different PKC substrate could be closely associated to FMNL1, producing the observed colocalization: “Although all these data do not yet allow us to infer that FMNL1b is phosphorylated at the IS due to the resolution limits of super resolution microscopy and the possibility that another PKC substrate may be associated to FMNL1 or very close to FMNL1, in a strictly S1086-dependent manner”.

      To clear any doubt regarding which cell is labelled with phospho-PKC, we have changed the lower panels in Suppl. Fig. S12, and it is now more evident that FMNL1 and phospho-PKC belong to the Jurkat cell.

      The study would benefit from a more careful statistical analysis. The dot plots showing polarity are presented for one experiment. Yet, the distribution of the polarity is broad. Results of the 3 independent experiments should be shown and a statistical analysis performed on the independent experiments.

      The referee is right and we have now included further post-hoc analysis data (Tukey) in Suppl. Fig. S13. Tukey’s test values were included for all the dot plot figures. We have not included all the plots from 3 different experiments since the manuscript already contains 10+12 multi panel figures and is too large. However, we have stated in the figure legend that the experiments shown are representative of the data obtained from 3 independent experiments. The referee’s consideration regarding the broad distribution of polarity data is correct. In the first version of the manuscript we included a sentence in this regard, which may have been overlooked: “Remarkably, one important feature of the IS consists of both the onset of the initial cell-cell contacts and the establishment of a mature, fully productive IS, are intrinsically stochastic, rapid and asynchronous processes (87, 88) (43). Thus, the score of the PI corresponding to the distance of MTOC/MVB with respect the IS (42) may be contaminated by background MTOC/MVB polarization, in great part due to the stochastic nature of IS formation (87)”.

    2. eLife Assessment

      This important study uses the Jurkat T cell model to study the role of Formin-like 1 β phosphorylation at S1086 on actin dynamics and exosome release at the immunological synapse. The evidence supporting these findings is compelling within the framework of the Jurkat model. As the Jurkat model is known to have a bias toward formin-mediated actin filament formation at the expense of Arp2/3-mediated branched F-actin foci observed in primary T cells, it will be beneficial in the future to confirm major findings in primary T cells.

    3. Joint Public Review:

      Summary

      Based on i) the documented role of FMNL1 proteins in IS formation; ii) their ability to regulate F-actin dynamics; iii) the implication of PKCdelta in MVB polarization to the IS and FMNL1beta phosphorylation; and iv) the homology of the C-terminal DAD domain of FMNL1beta with FMNL2, where a phosphorylatable serine residue regulating its auto-inhibitory function had been previously identified, the authors have addressed the role of S1086 in the FMNL1beta DAD domain in F-actin dynamics, MVB polarization and exosome secretion, and investigated the potential implication of PKCdelta, which they had previously shown to regulate these processes, in FMNL1beta S1086 phosphorylation. They demonstrate that FMNL1beta is indeed phosphorylated on S1086 in a PKCdelta-dependent manner and that S1086-phosphorylated FMNL1beta acts downstream of PKCdelta to regulate centrosome and MVB polarization to the IS and exosome release. They provide evidence that FMNL1beta accumulates at the IS where it promotes F-actin clearance from the IS center, thus allowing for MVB secretion.

      Strengths

      The work is based on a solid rationale, which includes previous findings by the authors establishing a link between PKCdelta, FMNL1beta phosphorylation, synaptic F-actin clearance and MVB polarization to the IS. The authors have thoroughly addressed the working hypotheses using robust tools. Among these, of particular value is an expression vector that allows for simultaneous RNAi-based knockdown of the endogenous protein of interest (here all FMNL1 isoforms) and expression of wild-type or mutated versions of the protein as YFP-tagged proteins to facilitate imaging studies. The imaging analyses, which are the core of the manuscript, have been complemented by immunoblot and immunoprecipitation studies, as well as by the measurement of exosome release (using a transfected MVB/exosome reporter to discriminate exosomes secreted by T cells).

      Weaknesses

      As stated in the title of the article, the main findings have been obtained in clones of Jurkat cells and have not been confirmed in primary T cells.

    1. eLife Assessment

      The research has the potential to be a valuable addition to the field, and the conclusions are solid, but there is a need for more reproducible data to address existing discrepancies and enhance its impact.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aim to investigate the role of ORMDL3 in regulating Type 1 interferon (IFN) responses and its effect on tumor growth inhibition. The study focuses on the mechanisms involving the RIG-I pathway and USP10-mediated degradation and attempts to establish a link between ORMDL3 expression and the effectiveness of cancer therapy. The authors also explore the broader implications of ORMDL3 in immune signaling, particularly within the context of Type 1 IFN signaling and its therapeutic potential.

      Strengths:

      • The manuscript explores a novel aspect of cancer immunology by examining the relationship between ORMDL3 and Type 1 IFN signaling, potentially offering new therapeutic avenues.
      • A variety of experimental approaches are employed, including knockdown models, overexpression assays, and protein interaction analyses, to elucidate the role of ORMDL3 in modulating immune responses.
      • The findings suggest a potential mechanism by which ORMDL3 affects the tumor microenvironment and immune responses, which could have significant implications for understanding cancer progression and therapy.

      Weaknesses:

      • The study does not clearly establish the relationship between Type 1 IFN and cancer therapy, and more robust data are needed to support the claim that tumor growth inhibition occurs via Type 1 IFN upregulation following ORMDL3 knockdown.
      • There is ambiguity regarding whether ORMDL3 has a positive or negative role in the Type 1 IFN pathway, especially given conflicting findings in the literature that link higher ORMDL3 levels to increased Type 1 IFN expression.
      • The use of certain experimental models, such as HEK293T cells (which are not typical Type 1 IFN producers), raises concerns about the validity and generalizability of the results. Further clarity is needed regarding the rationale for using the same tag in overexpression experiments.
      • The manuscript contains several inconsistencies and lacks detailed explanations of critical areas, such as the mechanism by which ORMDL3 facilitates USP10 transfer to RIG-I despite no direct interaction between ORMDL3 and RIG-I.

    3. Reviewer #2 (Public review):

      Summary:

      The authors identified ORMDL3 as a negative regulator of the RLR pathway and anti-tumor immunity. Mechanistically, ORMDL3 interacts with MAVS and further promotes RIG-I for proteasome degradation. In addition, the deubiquitinating enzyme USP10 stabilizes RIG-I and ORMDL3 disturbs this process. Moreover, in subcutaneous syngeneic tumor models in C57BL/6 mice, they showed that inhibition of ORMDL3 enhances anti-tumor efficacy by augmenting the proportion of cytotoxic CD8-positive T cells and IFN production in the tumor microenvironment (TME).

      Strengths:

      The paper has a clearly arranged structure and the English is easy to understand. It is well written. The results are clearly supporting the conclusion.

    4. Author response:

      • The study does not clearly establish the relationship between Type 1 IFN and cancer therapy, and more robust data are needed to support the claim that tumor growth inhibition occurs via Type 1 IFN upregulation following ORMDL3 knockdown.

      We thank the reviewer for raising this concern. In Figure 6 we detected the expression of IFNB1 and ISGs in MC38 and LLC tumors upon ORMDL3 knockdown. At the same time, we also used IHC to explore the abundance of RIG-I and ORMDL3 in these tumors. In addition, in Figure S5 we performed western blots to detect the expression of RIG-I with or without ORMDL3 knockdown. All these results support our hypothesis that ORMDL3 is a negative regulator of interferon via modulating RIG-I abundance.

      • There is ambiguity regarding whether ORMDL3 has a positive or negative role in the Type 1 IFN pathway, especially given conflicting findings in the literature that link higher ORMDL3 levels to increased Type 1 IFN expression.

      We appreciate the reviewer’s concern. In our system and experiments, we validated that ORMDL3 is a negative regulator of interferon, although there is also literature that links higher ORMDL3 levels to an increased type-I IFN response. ORMDL3 has been reported to be associated with rhinovirus-induced childhood asthma (Nature. 2007;448(7152):470-473; N Engl J Med. 2013 Apr 11;368(15):1398-407), and the ORMDL3 level is positively associated with rhinovirus abundance (N Engl J Med. 2013 Apr 11;368(15):1398-407). There are reports indicating that ORMDL3 supports the replication of rhinovirus (for example, Am J Respir Cell Mol Biol. 2020 Jun;62(6):783-792). This phenomenon is consistent with our findings that higher ORMDL3 expression leads to lower interferon production, which facilitates viral replication. We believe that the different conclusions obtained in these experiments are due to different experimental conditions and different stimuli. In our research, we provided comprehensive studies at the molecular, cellular, and animal levels to support the conclusion that ORMDL3 is a negative regulator of type-I interferon.

      • The use of certain experimental models, such as HEK293T cells (which are not typical Type 1 IFN producers), raises concerns about the validity and generalizability of the results. Further clarity is needed regarding the rationale for using the same tag in overexpression experiments.

      We thank the reviewer for this suggestion. Besides HEK293T cells, in Figures 1C and 1D we also overexpressed ORMDL3 in A549 cells and BMDMs and stimulated them with polyI:C or polyG:C. Our results showed that ORMDL3 inhibits RLR signaling in particular. Additionally, in Figure 3H we found that endogenous RIG-I expression decreased when we overexpressed ORMDL3 in BMDMs. Regarding the use of the same protein tag, we plan to use different tags to validate our results.

      • The manuscript contains several inconsistencies and lacks detailed explanations of critical areas, such as the mechanism by which ORMDL3 facilitates USP10 transfer to RIG-I despite no direct interaction between ORMDL3 and RIG-I.

      Some ERMC (ER-mitochondria contact) proteins mediate the interaction between the ER and mitochondria. ORMDL3 localizes to the ER and has been reported to be associated with calcium transport. Meanwhile, calcium transfer between the ER and mitochondria plays an important role in protein synthesis. It is possible that some ERMC proteins mediate the interaction between ORMDL3 and MAVS. In addition, we validated that ORMDL3 interacts with USP10 (Figure 5B). Although ORMDL3 and RIG-I do not interact directly, we propose a mechanistic model in which ORMDL3 and MAVS recruit USP10 and RIG-I to ERMCS, respectively, so that USP10 can form a complex with RIG-I (Figure 5C) and regulate the stability of RIG-I upon RNA sensing.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The main hypothesis/conclusion is summarized in the abstract: "Our study presents an intriguing model of cilia length regulation via controlling IFT speed through the modulation of the size of the IFT complex." The data clearly document the remarkable correlation between IFT velocity and ciliary length in the different cells/tissues/organs analyzed. The experimental test of this idea, i.e., the knock-down of GFP-IFT88, further supports the conclusion but needs to be interpreted more carefully. While IFT particle size and train velocity were reduced in the IFT88 morphants, the number of IFT particles is even more decreased. Thus, the contributions of the reduction in train size and velocity to ciliary length are, in my opinion, not unambiguous. Also, the concept that larger trains move faster, likely because they dock more motors and/or better coordinating kinesin-2 and that faster IFT causes cilia to be longer, is to my knowledge, not further supported by observations in other systems (see below).

      Thank you for your comments. We agree with the reviewer that the final section on IFT train size, velocity, and ciliary length regulation requires additional evidence. The purpose of the knockdown experiments was to investigate the potential relationship between IFT speed and IFT train size. We hypothesize that a deficiency in IFT88 protein may disrupt the regular assembly of IFT particles, leading to the formation of shorter IFT trains. Indeed, we observed shorter IFT particles and a slight reduction in the transport speed of IFT particles in the morphants. It would certainly be more convincing to distinguish these IFT trains through ultrastructural analysis. However, with current techniques, performing such an analysis in the zebrafish model would be very difficult because of the limited sample size. In the revised version, we have tempered the conclusions in these sections, as also suggested by the other reviewers.

      (2) I think the manuscript would be strengthened if the IFT frequency would also be analyzed in the five types of cilia. This could be done based on the existing kymographs from the spinning disk videos. As mentioned above, transport frequency in addition to train size and velocity is an important part of estimating the total number of IFT particles, which bind the actual cargoes, entering/moving in cilia.

      Thank you. We have analyzed the entry frequency of IFT in five types of cilia, both anterior and posterior. The analysis indicates that longer cilia also exhibit a higher frequency of fluorescent particles entering the cilia. These results are presented in Figure 3J.
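
      As an illustration of how an entry frequency can be read off a kymograph, the sketch below counts the particle tracks that start at the ciliary base during a recording and divides by the recording time. The function name, the input format (a list of entry times in seconds), and the example numbers are assumptions made for illustration; they are not taken from the manuscript.

```python
def entry_frequency(entry_times_s, duration_s):
    """Return particle entries per second for a recording lasting duration_s seconds.

    entry_times_s: times (s) at which anterograde tracks leave the ciliary base,
    as read from a kymograph (hypothetical input format).
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    # Count only entries that fall inside the recording window.
    n_entries = sum(1 for t in entry_times_s if 0 <= t <= duration_s)
    return n_entries / duration_s


# Illustrative numbers only: four entries during a 10 s movie.
print(entry_frequency([1.0, 3.5, 7.2, 9.9], 10.0))
```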

      (3) Here, the variation in IFT velocity in cilia of different lengths within one species is documented - the results document a remarkable correlation between IFT velocity and ciliary length. These data need to be compared to observations from the literature. For example, the velocity of IFT in the quite long (~ 100 um) olfactory cilia of mice is similar to that observed in the rather short cilia of fibroblasts (~0.6 um/s). In Chlamydomonas, IFT velocity is not different in long flagella mutants compared to controls. Probably data are also available for C. elegans or other systems. Discussing these data would provide a broader perspective on the applicability of the model outside of zebrafish.

      Thank you for your suggestions. We believe the most significant novelty of our manuscript is the discovery that IFT velocities are closely related to cilia length in an in vivo model system. Our data suggest that longer cilia may require faster IFT transport to maintain their stable length, powered by larger IFT trains. We did observe substantial variability in IFT velocities across different studies. For example, anterograde IFT transport ranges from 0.2 µm/s in mouse olfactory neurons (Williams et al, 2014) to 0.8 µm/s in 293T cells (See et al, 2016) and 0.4 µm/s in IMCD-3 cells (Broekhuis et al, 2014). Even in NIH-3T3 cells, two studies report significant differences, despite using the same IFT reporters: 0.3 µm/s versus 0.9 µm/s (Kunova Bosakova et al, 2018; Luo et al, 2017). These findings suggest that cell types and culture conditions can influence IFT velocities in vitro, which may not accurately represent in vivo conditions. Interestingly, research on mouse olfactory neurons showed a strong correlation between anterograde and retrograde IFT velocities. Additionally, IFT velocity is closely related to the cell types within the olfactory neuron population, consistent with our results (Williams et al., 2014). 

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors study intraflagellar transport (IFT) in cilia of diverse organs in zebrafish. They elucidate that IFT88-GFP (an IFT-B core complex protein) can substitute for endogenous IFT88 in promoting ciliogenesis and use it as a reporter to visualize IFT dynamics in living zebrafish embryos. They observe striking differences in cilia lengths and velocity of IFT trains in different cilia types, with smaller cilia lengths correlating with lower IFT speed. They generate several mutants and show that disrupting the function of different kinesin-2 motors and BBSome or altering post-translational modifications of tubulin does not have a significant impact on IFT velocity. They however observe that when the amount of IFT88 is reduced it impacts the cilia length, IFT velocity as well as the number and size of IFT trains. They also show that the IFT train size is slightly smaller in one of the organs with shorter cilia (spinal cord). Based on their observations they propose that IFT velocity determines cilia length and go one step further to propose that IFT velocity is regulated by the size of IFT trains.

      Strengths:

      The main highlight of this study is the direct visualization of IFT dynamics in multiple organs of a living complex multi-cellular organism, zebrafish. The quality of the imaging is really good. Further, the authors have developed phenomenal resources to study IFT in zebrafish which would allow us to explore several mechanisms involved in IFT regulation in future studies. They make some interesting findings in mutants with disrupted function of kinesin-2, BBSome, and tubulin modifying enzymes which are interesting to compare with cilia studies in other model organisms. Also, their observation of a possible link between cilia length and IFT speed is potentially fascinating.

      Weaknesses:

      The manuscript as it stands, has several issues.

      (1) The study does not provide a qualitative description of cilia organization in different cell types, the cilia length variation within the same organ, and IFT dynamics. The methodology is also described minimally and must be detailed with more care such that similar studies can be done in other laboratories.

      Thank you for your comments. We found that cilia length is generally consistent within the same cell types we examined, including those in the pronephric duct, spinal cord, and epidermal cells. However, we observed variability in cilia length within ear crista cilia. Upon comparing IFT velocities, we found no differences among these cilia, further confirming our conclusion that IFT velocity is directly related to cell type rather than cilia length. These new results are presented in Figure S4 of the revised version.

      We apologize for the lack of methodological details in the original manuscript. Following the reviewer's suggestion, we have added a detailed description of the methods used to generate the transgenic line and to perform IFT velocity analysis. These details are included in Figure S2 and are thoroughly described in the methods section of the revised manuscript.

      (2) They provide remarkable new observations for all the mutants. However, discussion regarding what the findings imply and how these observations align (or contradict) with what has been observed in cilia studies in other organisms is incomprehensive.

      Thank you for this suggestion. We initially submitted this paper as a report, which has a word limit. We believe the main finding of our work is that IFT velocity is directly associated with cell type, with longer cilia requiring higher velocities to maintain their length. This association of IFT velocity with cell type has also been observed in mouse olfactory neurons (Williams et al., 2014). We have included a discussion of our findings, along with related data published in other organisms, in the revised version.

      (3) The analysis of IFT velocities, the main parameter they compare between experiments, is not described at all. The IFT velocities appear variable in several kymographs (and movies) and are visually difficult to see in shorter cilia. It is unclear how they make sure that the velocity readout is robust. Perhaps, a more automated approach is necessary to obtain more precise velocity estimates.

      Thank you for these comments. To measure the IFT velocities, we first used ImageJ software to generate a kymograph, where moving particles appear as oblique lines. The velocity of these particles can be calculated based on the slope of the lines (Zhou et al, 2001). In the initial version, most of the lines were drawn manually. To eliminate potential artifacts, we also used KymographDirect software to automatically trace the particle paths. The velocities obtained with this method were similar to those calculated manually. These new data are now shown in Figure S2 B-D. For shorter cilia, we only used particles with clear moving paths for our calculations. In the revised version, we have included a detailed description of the velocity analysis methods.
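
      The slope-based calculation described above can be sketched as follows. This is a minimal illustration, assuming a kymograph whose horizontal axis is position along the cilium and whose vertical axis is time; the function name, the pixel calibration parameters, and the sign convention (positive toward the tip) are assumptions and do not reproduce the ImageJ or KymographDirect implementations.

```python
def kymograph_velocity(x0_px, t0_px, x1_px, t1_px, um_per_px, s_per_px):
    """Velocity (µm/s) of a straight track drawn on a kymograph.

    (x0_px, t0_px) and (x1_px, t1_px) are the endpoints of the line in pixels.
    um_per_px converts the position axis to micrometres, and s_per_px converts
    the time axis to seconds (one pixel row per frame).
    Positive values mean movement toward increasing x (assumed: base at x = 0).
    """
    dt_s = (t1_px - t0_px) * s_per_px
    if dt_s == 0:
        raise ValueError("track has zero duration; slope is undefined")
    dx_um = (x1_px - x0_px) * um_per_px
    return dx_um / dt_s


# Illustrative values: 50 px travelled in 20 frames at 0.1 µm/px and 0.25 s/frame.
print(kymograph_velocity(0, 0, 50, 20, um_per_px=0.1, s_per_px=0.25))
```

      With this convention, anterograde tracks give positive velocities and retrograde tracks negative ones, so the two directions can be separated by sign before averaging.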

      (4) They claim that IFT speeds are determined by the size of IFT trains, based on their observations in samples with a reduced amount of IFT88. If this was indeed the case, the velocity of a brighter IFT train (larger train) would be higher than the velocity of a dimmer IFT train (smaller train) within the same cilia. This is not apparent from the movies and such a correlation should be verified to make their claim stronger.

      Thank you for these excellent suggestions. We measured the particle size and fluorescence intensity of 3 dpf crista cilia using high-resolution images acquired with Abberior STEDYCON. The results showed a positive correlation between the two. These data have been added to the revised version in Figure 5I, which includes both control and ift88 morphant data.
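
      One common way to quantify such a relationship is the Pearson correlation coefficient. The sketch below is only an illustration: the response does not state which correlation measure was used, and the function name and example values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical particle sizes (nm) and fluorescence intensities (a.u.).
sizes = [180.0, 220.0, 260.0, 300.0]
intensities = [1.1, 1.6, 1.9, 2.6]
print(pearson_r(sizes, intensities))
```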

      (5) They make an even larger claim that the cilia length (and IFT velocity) in different organs is different due to differences in the sizes of IFT trains. This is based on a marginal difference they observe between the cilia of crista and the spinal cord in immunofluorescence experiments (Figure 5C). Inferring that this minor difference is key to the striking difference in cilia length and IFT velocity is incorrect in my opinion.

      Impact:

      Overall, I think this work develops an exciting new multicellular model organism to study IFT mechanisms. Zebrafish is a vertebrate where we can perform genetic modifications with relative ease. This could be an ideal model to study not just the role of IFT in connection with ciliary function but also ciliopathies. Further, from an evolutionary perspective, it is fascinating to compare IFT mechanisms in zebrafish with unicellular protists like Chlamydomonas, simple multicellular organisms like C elegans, and primary mammalian cell cultures. Having said that, the underlying storyline of this study is flawed in my opinion and I would recommend the authors to report the striking findings and methodology in more detail while significantly toning down their proposed hypothesis on ciliary length regulation. Given the technological advancements made in this study, I think it is fine if it is a descriptive manuscript and doesn't necessarily need a breakthrough hypothesis based on preliminary evidence.

      Thank you for these comments. We agree with the reviewer that more evidence is required to explain why IFT is faster in longer cilia. In the revised version, we have modified and softened this section, focusing primarily on the novel finding that IFT velocity differs between cilia of varying lengths.

      Reviewer #3 (Public Review):

      Summary:

      A known feature of cilia in vertebrates and many, if not all, invertebrates is the striking heterogeneity of their lengths among different cell types. The underlying mechanisms, however, remain largely elusive. In the manuscript, the authors addressed this question from the angle of intraflagellar transport (IFT), a cilia-specific bidirectional transportation machinery essential to biogenesis, homeostasis, and functions of cilia, by using zebrafish as a model organism. They conducted a series of experiments and proposed an interesting mechanism. Furthermore, they achieved in situ live imaging of IFT in zebrafish larvae, which is a technical advance in the field.

      Strengths:

      The authors initially demonstrated that ectopically expressed Ift88-GFP through a certain heatshock induction protocol fully sustained the normal development of mutant zebrafish that would otherwise be dead by 7 dpf due to the lack of this critical component of IFT-B complex.

      Accordingly, cilia formations were also fully restored in the tissues examined. By imaging the IFT using Ift88-GFP in the mutant fish as a marker, they unexpectedly found that both anterograde and retrograde velocities of IFT trains varied among cilia of different cell types and appeared to be positively correlated with the length of the cilia.

      For insights into the possible cause(s) of the heterogeneity in IFT velocities, the authors assessed the effects of IFT kinesin Kif3b and Kif17, BBSome, and glycylation or glutamylation of axonemal tubulin on IFT and excluded their contributions. They also used a cilia-localized ATP reporter to exclude the possibility of different ciliary ATP concentrations. When they compared the size of Ift88-GFP puncta in crista cilia, which are long, and spinal cord cilia, which are relatively short, by imaging with a cutting-edge super-resolution microscope, they noticed a positive correlation between the puncta size, which presumably reflected the size of IFT trains, and the length of the cilia.

      Finally, they investigated whether it is the size of IFT trains that dictates the ciliary length. They injected a low dose (0.5 ng/embryo) of ift88 MO and showed that, although such a dosage did not induce the body curvature of the zebrafish larvae, crista cilia were shorter and contained less Ift88-GFP puncta. The particle size was also reduced. These data collectively suggested mildly downregulated expression levels of Ift88-GFP. Surprisingly, they observed significant reductions in both retrograde and anterograde IFT velocities. Therefore, they proposed that longer IFT trains would facilitate faster IFT and result in longer cilia.

      Weaknesses:

      The current manuscript, however, contains serious flaws that markedly limit the credibility of major results and findings. Firstly, important experimental information is frequently missing, including (but not limited to) developmental stages of zebrafish larvae assayed (Figures 1, 3, and 5), how the embryos or larvae were treated to express Ift88-GFP (Figures 3-5), and descriptions on sample sizes and the number of independent experiments or larvae examined in statistical results (Figures 3-5, S3, S6). For instance, although Figure 1B appears to be the standard experimental scheme, the authors provided results from 30-hpf larvae (Figure 3) that, according to Figure 1B, are supposed to neither express Ift88-GFP nor be genotyped because both the first round of heat shock treatment and the genotyping were arranged at 48 hpf. Similarly, the results that ovl larvae containing Tg(hsp70l:ift88 GFP) (again, because the genotype is not disclosed in the manuscript, one can only deduce) display normal body curvature at 2 dpf after the injection of 0.5 ng of ift88 MO (Fig 5D) is quite confusing because the larvae should also have been negative for Ift88-GFP and thus displayed body curvature. Secondly, some inferences are more or less logically flawed. The authors tend to use negative results on specific assays to exclude all possibilities. For instance, the negative results in Figures 4A-B are not sufficient to "suggest that the variability in IFT speeds among different cilia cannot be attributed to the use of different motor proteins" because the authors have not checked dynein-2 and other IFT kinesins. In fact, in their previous publication (Zhao et al., 2012), the authors actually demonstrated that different IFT kinesins have different effects on ciliogenesis and ciliary length in different tissues. Furthermore, instead of also examining cilia affected by Kif3b or Kif17 mutation, they only examined crista cilia, which are not sensitive to the mutations. Similarly, their results in Figures 4C-G only excluded the importance of tubulin glycylation or glutamylation in IFT. Thirdly, the conclusive model is based on certain assumptions, e.g., constant IFT velocities in a given cell type. The authors, however, do not discuss other possibilities.

      Thank you for pointing out the flaws in our experiments. We apologize for any confusion caused by the lack of detail in our descriptions. Regarding Figure 2B, we want to clarify that it depicts the procedure for heat shock experiments conducted for the ovl mutants' rescue assay, not the experimental procedure for IFT imaging. In the revised version, we have included detailed methods on how to induce the expression of Ift88-GFP via heat shock and the subsequent image processing. The procedure for heat induction is also shown in Figure S2A. We have also added the sample sizes for each experiment and descriptions of the statistical tests used in the appropriate sections of the revised version.

      Regarding the comments on the relationship between IFT speed variability and motor proteins, we completely agree with the reviewer. We have revised our description of this part accordingly.

      Lastly, the results shown in Figure 5D are from a wild-type background, not ovl mutants. We aimed to demonstrate that a lower dose of ift88 morpholino (0.5 ng) can partially knock down Ift88, allowing embryos to maintain a generally normal body axis, while the cilia in the ear crista became significantly shorter.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor

      (I recommend adding page numbers and probably line numbers. This makes commenting easier)

      We have added page numbers and line numbers in the revised manuscript.

      Intro: Furthermore, ultra-high-resolution microscopy showed a close association between cilia length in different organs and the size of IFT fluorescent particles, indicating the presence of larger IFT trains in longer cilia.

      This correlation is not that strong and data are only available for 2 types of cilia.

      Thanks. We have modified this part.

      P5) cilia (Fig. 1D) -> (Fig. S1)

      Thanks. We have corrected this.

      P5) "These movies provide a great opportunity to compare IFT across different cilia." Rewrite: "This approach allows one to determine the velocity and frequency of IFT based on kymographs" or similar.

      Thank you for your correction. We have changed this in the revised manuscript.

      This observation suggests that cargo and motor proteins are more effectively coordinated in transporting materials, resulting in increased IFT velocity-a novel regulatory mechanism governing IFT speed in vertebrate cilia.

      This is a somewhat cryptic phrase, rewrite?

      We have modified this sentence.

      P6 and elsewhere: "IFT in the absence of Kif17 or Bbs proteins" I wonder if it would be better to provide subheadings summarizing the main observation instead of descriptive titles. This includes the title of the manuscript.

      Thanks for this suggestion. We have changed the subheadings in the revised manuscript. We prefer to keep the current title of the manuscript, as the paper mainly describes IFT in different types of cilia.

      Is it known whether IFT protein and motors are alternatively spliced in the various ciliated cells of zebrafish? In this context, is it known whether the cells express IFT proteins at different levels?

      We analyzed the transcript isoforms of several ciliary genes, including ift88, ift52, ift70, ift172, and kif3a. Most of these IFT genes possess only a single transcript isoform. The Kif3a motor protein has two isoforms (long and short); however, the shorter isoform contains only the motor domain and is presumed to be nonfunctional for IFT. While we cannot completely rule out this possibility, we consider it unlikely that the variation in IFT speed is due to alternative splicing in ciliary tissues.

      P6) The relation between osm-3 and Kif17 needs to be introduced briefly.  

      Thank you for pointing this out. We have added this at the appropriate place in the revised manuscript.

      P6) "IFT was driven by kinesin or dynein motor proteins along the ciliary axoneme." "is driven"?

      Delete phrase and IFT to the next sentence?

      We have deleted this sentence.

      P7) "Moreover, the mutants were able to survive to adulthood and there is no difference in the fertility or sperm motility between mutants and control siblings, which is slightly different from those observed in mouse mutants(Gadadhar et al., 2021)." Could some of these data be shown? 

      Thanks for this suggestion. When crossed with wild-type females, all homozygous mutants showed no difference in fertility compared to controls. The fertilization rate in mutants was 90.5% (n = 7), similar to that in wild type (87.2%, n = 7). We determined the trajectories of free-swimming sperm by high-speed video microscopy. The vast majority of sperm in ttll3 mutants, like wild-type sperm, swam almost entirely along a straight path, which differs from what was observed in the mouse mutant (where 86% of TTLL3-/-TTLL8-/- sperm rotate in situ). We also assessed cilia motility in the pronephric ducts of 5 dpf embryos using high-speed video microscopy. The ttll3 mutant exhibited a rhythmic sinusoidal wave pattern similar to the control, and there was no significant difference in ciliary beating frequency. These new data are now included in Figure S7C-H.

      P7) "which has been shown early to reduce" earlier

      We have changed it. Thanks.

      Maybe the authors could speculate how the cells ensure the assembly of larger/faster trains in certain cells. Are the relative expression levels known or worth exploring?

      Thank you for these suggestions. We believe that longer cilia may maintain larger IFT particle pools in the basal body region, facilitating the assembly of large IFT trains. The higher frequency of IFT injection in longer cilia further supports this hypothesis. It is likely that cells with longer cilia have higher expression levels of IFT proteins. However, because suitable antibodies against zebrafish IFT proteins are lacking, it is currently not feasible to compare expression levels. This question is certainly worth investigating in the future. We have added this discussion to the revised manuscript.

      Reviewer #2 (Recommendations for The Authors):

      Here are detailed comments for the authors:

      (1) The authors need to describe their methodology of imaging and what they observe in much greater detail. How were the different cilia types organized? Approximately how many were observed in every organ? How were they oriented? Were there length variations between cilia in the same organ? While imaging, were individual cilium mostly lying in a single focal plane of imaging or the authors often performed z-scans over multiple planes. Velocity measurement is highly variable if individual cilia are spanning over a large volume, with only part of it in focus in single plane acquisition.

      Thank you for your comments. We apologize for the lack of details in the methodology. We have added a detailed description in the 'Materials and Methods' section and illustrated the experimental paradigm in Figure S2A of the revised manuscript. In most tissues we examined, the length of cilia was relatively uniform, except in the crista. The cilia in the crista were significantly longer than those in other tissues, with lengths varying between 5 and 30 μm. We categorized the crista cilia into three groups by length, at intervals of 10 μm, and measured the anterograde and retrograde velocities of IFT in each group. The results, shown in Figure S4, revealed no significant difference in IFT velocity among the different cilia lengths within the same tissue. Regarding the imaging, all IFT movies were captured in a single focal plane. In most cases, we did not observe significant velocity variability within the same cilium.
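
      The grouping step can be illustrated with a short sketch that bins cilia by length in 10 μm intervals and averages the IFT velocity in each bin. The exact bin edges used for Figure S4 are not given in this response, so placing the edges at multiples of 10 μm is an assumption, as are the function name and the example measurements.

```python
from collections import defaultdict


def mean_velocity_by_length_bin(measurements, bin_width_um=10.0):
    """Group (cilium_length_um, velocity_um_per_s) pairs into length bins.

    Returns a dict mapping (lower_edge, upper_edge) in µm to the mean velocity
    of all measurements whose length satisfies lower_edge <= length < upper_edge.
    Bin edges sit at multiples of bin_width_um (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for length_um, velocity in measurements:
        lower = int(length_um // bin_width_um) * bin_width_um
        groups[(lower, lower + bin_width_um)].append(velocity)
    return {edges: sum(v) / len(v) for edges, v in sorted(groups.items())}


# Hypothetical measurements: (cilium length in µm, anterograde velocity in µm/s).
data = [(7.0, 1.0), (9.0, 3.0), (15.0, 2.0), (25.0, 2.5)]
for edges, mean_v in mean_velocity_by_length_bin(data).items():
    print(edges, mean_v)
```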

      (2) It is very difficult to directly observe the large differences in IFT velocity from the kymographs, especially in the case of shorter cilia and retrograde motion in them. The quality of the example kymographs could be improved and more zoomed in several cases.

      Thank you for this suggestion. We have modified this.

      (3) The authors do not describe at all, how velocity analysis was done on the kymographs? Were lines drawn manually on the kymographs? From the movies and the kymographs it is visible that the IFT motion is often variable and sometimes gets stuck. How did the authors determine the velocities of such trains? A single slope through the entire train or part of the train? Were they consistent with this? Such variable motion is not so easy to discern in the case of really short cilia. The authors could use a more automatic way of extracting velocities from kymographs using tools such as kymodirect or kymobutler. Keeping in mind that IFT velocity is the main parameter studied in this work, it is important that the analysis is robust.

      We apologize for the previous lack of detailed description. We utilized ImageJ software to generate kymographs, where particles appear as lines. For a moving particle, this line appears oblique. We manually drew lines on the kymographs, and the velocity of particles was calculated based on the slope (Zhou et al., 2001). We only analyzed particles that tracked the full length of the cilia. Following the reviewer's suggestions, we also used the automatic software KymographDirect to calculate the velocity of IFT particles. The results were similar to those calculated using the previous method. These new data are now shown in Figure S2B-D. For shorter cilia, we only used particles with clear moving paths for our calculations. In the revised version, we have included a detailed description of the velocity analysis methods.

      (4) In line with the previous point, as visible from the kymographs the velocity is significantly slower near the transition zone. Did the authors make sure they are not including the region around the transition zone while measuring the IFT velocity, especially in the case of shorter cilia?

      Thank you for the comment. In the revised manuscript, we automatically extracted the path of each particle using KymographDirect software. Quantification of each particle's velocity versus position in the crista reveals that anterograde IFT proceeds from the base to the tip at a relatively constant speed, whereas retrograde IFT accelerates slightly when returning to the base (Fig. S2E). This finding differs from observations in C. elegans, in which dynein-2 first accelerates and then decelerates back to 1.2 μm/s adjacent to the ciliary base (Yi et al, 2017). We believe it is very unlikely that the slow IFT velocity results from measuring IFT only in the transition zone of shorter cilia.
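
      A velocity-versus-position profile of this kind can be approximated from a traced particle path by finite differences between consecutive points. The sketch below is a generic illustration under that assumption; it is not the algorithm used by KymographDirect, and the function name and sample track are hypothetical.

```python
def velocity_along_track(times_s, positions_um):
    """Return (midpoint_position_um, velocity_um_per_s) for consecutive track points.

    times_s and positions_um describe one particle path sampled frame by frame.
    Each velocity is the finite difference between two neighbouring points and
    is assigned to the midpoint of the two positions.
    """
    if len(times_s) != len(positions_um):
        raise ValueError("times_s and positions_um must have the same length")
    profile = []
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("times_s must be strictly increasing")
        velocity = (positions_um[i] - positions_um[i - 1]) / dt
        midpoint = 0.5 * (positions_um[i] + positions_um[i - 1])
        profile.append((midpoint, velocity))
    return profile


# Hypothetical anterograde track that speeds up away from the base.
print(velocity_along_track([0, 1, 2], [0.0, 1.0, 3.0]))
```

      Plotting these pairs for many tracks shows whether velocity is roughly constant along the cilium or changes near the base, which is the comparison made in the reply above.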

      (5) There are several fascinating findings in this work that the authors do not discuss properly. Firstly, do the authors have a hypothesis as to why IFT speeds are so radically different in different cilia types, given that they are driven by the same motor proteins and have the same ATP levels? They make a big claim in this paper that IFT train sizes correlate with train velocities. IFT trains have a highly ordered structure with regular binding sites for motor proteins. So, a smaller train would have a proportional number of motors attached to them. Why (and how) are the motors moving trains so slowly in some cilia and not in others? If there is no clear answer, the authors must put forward the open question with greater clarity.

      Thank you for the comment. We hypothesize that if multiple motors drive the movement of cargoes synergistically, the speed of IFT could increase. A useful analogy is the multiple-unit high-speed train, which uses motors in each individual car to achieve high speeds. Of course, this is just one hypothesis, and we cannot exclude other possibilities, such as the use of different adaptors in different cell types. We have revised our conclusions accordingly in the updated manuscript.

      (6) They find that IFT speeds do not change in kif17 mutants. Are the cilia length also similar (does not appear to be the case in Figure 4 and Figure S3)? Cilia length needs to be quantified. Further, they mention that in C elegans, heterotrimeric kinesin-2 and homodimeric kinesin-2 coordinate IFT. However, from several previous studies, we know that in Chlamydomonas and in mammalian cilia IFT is driven primarily by heterotrimeric kinesin-2 with no evidence that homodimeric kinesin-2 is linked with driving IFT. It appears to be the same in zebrafish. This is an interesting finding and needs to be discussed far more comprehensively.

      Thank you for your comments. We have previously shown that the number and length of crista cilia were grossly normal in kif17 mutants (Zhao et al, 2012). The length of crista cilia displayed slight variability even in wild-type larvae. We quantified the length of cilia in both the crista and neuromast within different mutants, and our analysis revealed no significant difference (see Author response image 1). We agree with the reviewer that Kif17 may play a minor role in driving IFT in cilia. However, previous studies have shown that KIF17 exhibits robust, processive particle movement in both the anterograde and retrograde directions along the entire olfactory sensory neuron cilia in mice. This suggests that, although not essential, KIF17 may also be involved in IFT (Williams et al., 2014). We have added more discussion about Kif17 and heterotrimeric kinesin in the appropriate section of the revised manuscript.

      Author response image 1.

      Statistical significance is based on the Kruskal-Wallis test followed by Dunn's multiple comparisons test. n.s., not significant, p>0.05.
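
      For readers unfamiliar with the test named in the legend, the following is a minimal sketch of how the Kruskal-Wallis H statistic is computed from pooled ranks, with the standard correction for ties. It is a generic illustration, not the software used for the image; the function name is hypothetical, the example values are made up, and Dunn's post hoc test is not implemented here.

```python
from itertools import chain


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (tie-corrected) for a list of samples."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    ranks = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                            # size of this tie group
        ranks[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    # H from the squared rank sums of each group.
    rank_sum_term = sum(sum(ranks[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical cilia lengths (µm) for three genotypes; illustrative only.
example_h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

      Under the null hypothesis, H approximately follows a chi-square distribution with (number of groups minus 1) degrees of freedom, which is how the p-value is obtained.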

      (7) Again, they find that IFT speeds do not change in BBS-4 mutants. I have the same comment about the cilia length as for kif17 mutants. Further, the discussion for this finding is lacking. The authors mention that IFT is disrupted in BBSome mutants of C elegans. Is this the case in other organisms as well? Structural studies on IFT trains reveal that BBSomes are not part of the core structure, while other studies reveal that BBSomes are not essential for IFT. So perhaps the results here are not too surprising.

      We agree with the reviewer that BBSome is possibly not essential for IFT in most cilia. However, in the cilia of olfactory sensory neurons, BBSome is involved in IFT in both mice and nematodes (Ou et al, 2005; Williams et al., 2014). We have added more discussion about BBSome in the appropriate section of the revised manuscript.

      (8) No change in IFT velocities in kif3b mutants is rather surprising. The authors suggest that Kif3C homodimerizes to carry out IFT in the absence of Kif3B. Even if that is the case, the individual homodimer constituents of heterotrimeric kinesin-2 have been shown in previous studies to have different motor properties when homodimerized artificially. Why is IFT not affected in these mutants? This should be discussed. Also, the cilia lengths should be quantified.

      We think the presence of the Kif3A/Kif3C/KAP3 trimeric kinesin may substitute for the Kif3A/Kif3B/KAP3 motors in kif3b mutants, which show normal cristae cilia length. The Kif3A/Kif3C/KAP3 trimeric kinesin may have transport speeds similar to those of the Kif3A/Kif3B/KAP3 motors. We did not propose that the Kif3C homodimer can drive the cargoes alone. We apologize for this misunderstanding. Additionally, we have reevaluated the IFT velocities among different lengths of cristae cilia and found no difference between longer and shorter cilia within the same cell types.

      (9) The findings with tubulin modifications should also be discussed in comparison to what has been observed in other organisms.

      We have added further discussion about this result in the revised manuscript.

      (10) The authors find that IFT velocity is lower in ift88 morphants. They also find that the cilia length is shorter (in which cilia type?). Immunofluorescence experiments show that the IFT particle number and size are lower in the ift88 morphants. How many organisms did they look at for this data? What is the experimental variability in intensity measurements in immunofluorescence experiments? Wouldn't the authors expect much higher variability in ift88 morphants (between individual organisms) due to different amounts of IFT88 than for wildtype?

      Thank you for your comments. We apologize for the lack of information regarding the number of organisms observed in Figure 5. These numbers have been added to the figure legends in the revised manuscript. When a low dose of ift88 morpholino was injected, we observed significant shortening of cilia in the ear crista, along with reduced IFT speed. We measured the fluorescence intensity of different IFT particles and found a positive correlation between IFT particle size and fluorescence intensity (Fig 5I). Moreover, the variability of cilia length in cristae is slightly higher in ift88 morphants. These new data have been included in the revised version.

      (11) From their observations they make the claim that IFT velocity is directly proportional to IFT train size. Now within every cilium, IFT trains have large size variations, given the variable intensities for different IFT trains. The authors themselves show that they resolve far more trains when imaging with STED (possibly because they are able to visualize the smaller trains). Is the IFT velocity within the same cilium directly correlated with the intensity of the train, both for wildtype and ift88 morphants? That is the most direct way the authors can test that their hypothesis is true. Higher intensity (larger train size) results in faster velocity. From a qualitative look at their movies, I do not see any strong evidence for that.

      Thank you for your comments. We have measured the size and fluorescence intensity of IFT particles in 3 dpf crista cilia using high-resolution images acquired with Abberior STEDYCON. The results, shown in Figure 5I, demonstrate a positive correlation between particle size and fluorescence intensity.

      (12) Are the sizes of both anterograde and retrograde trains lower in ift88 morphants? It's not clear from the data. It should be clearly stated that the authors speculate this and this is not directly evident from the data.

      Because the size of IFT fluorescence particles is based on immunostaining results, not live imaging, we cannot determine whether they are anterograde or retrograde IFT particles. Therefore, we can only speculate that both anterograde and retrograde trains are possibly reduced in ift88 morphants.

      (13) The biggest claim in this paper is that the cilia lengths in different organs are different due to differences in IFT train sizes. This is based on highly preliminary data shown in Figure 5C (how many organisms did they measure?). The difference is marginal and the dataset for spinal cord cilia is really small. The internal variability within the same cilia type is larger than the difference. How is this tiny difference resulting in such a large difference in IFT speeds? I believe their conclusions based on this data are incorrect.

      From our results, we believe that IFT velocity is related to cell types rather than the length of cilia (Fig. S4), which has also been mentioned in previous studies (Williams et al., 2014).  We agree with the reviewer that the evidence for faster IFT speed due to larger train size is not very solid. We have accordingly softened our conclusion and mentioned other possibilities in the revised version.

      Minor comments:

      (1) The authors only mention the number of IFT particles for their data. They should provide the number of cilia and the number of organisms as well.

      Thank you for your suggestion. We added the number of cilia and organisms next to the number of particles in Figure 3, Figure S2-S5 and Table S1 of the revised manuscript.

      (2) Cilia and flagella are similar structurally but not the same. The authors should change the following sentence: In contrast to the localization of most organelles within cells, cilia (also known as flagellar) are microtubule-based structures that extend from the cell surface, facilitating a more straightforward quantification of their size.  

      Thank you for the detailed review. We have changed it in our revised manuscript. 

      (3) The authors should provide references here. For example, Chlamydomonas has two flagella with lengths ranging from 10 to 14 μm, while sensory cilia in C. elegans vary from approximately 1.5 μm to 7.5 μm. In most mammalian cells, the primary cilium typically measures between 3 and 10 μm.  

      We have added it in our revised manuscript. 

      (4) They should mention ovl mutants are IFT88 mutants when they introduce it in the main text.

      We have added it in our revised manuscript. 

      (5) Correct the grammar here: The velocity of IFT within different cilia also seems unchanged (Figure 4F, Movie S9, Table S1).  

      We have changed it. 

      (6) Correct the grammar here: Similarly, the IFT speeds also exhibited only slight changes in ccp5 morphants, which decreased the deglutamylase activities of Ccp5 and resulted in a hyperglutamylated tubulin

      We have changed it. 

      Reviewer #3 (Recommendations For The Authors):

      Introduction:

      1st paragraph, "flagellar" should be "flagella"; 2nd paragraph, "result a wide range of" should be "result in a...".  

      We have changed it. 

      Results and discussion:

      "...certain specialized cell types, including olfactory epithelia and pronephric duct, ...": olfactory epithelia and pronephric duct are tissues, not cells.  

      "...the GFP fluorescence of the transgene was prominently enriched in the cilia (Fig 1D)" : Fig 2D?  

      "The velocity of IFT within different cilia was also seems unchanged (Fig. 4 F, Movie S9, Table S1)": "was" and "seems" cannot be used together.  

      "...driven by b-actin2 promotor": β-actin2?

      "...each dynein motor protein might propel multiple IFT complexes": The "protein" should be deleted.  

      Thanks. We have corrected all of these mistakes.  

      Figures:

      Figure 1: Dyes and antibodies used other than the anti-acetylated tubulin antibody should be mentioned. The developmental stages of zebrafish used for the imaging are mostly missing.

      Thanks. In the revised version, we have updated the figure legends to include descriptions of the antibodies, developmental stages, as well as N numbers.

      Figure 2B: What "hphs" means should be explained somewhere.  

      Thanks. We have added the full names for these abbreviations.

      Figures 3A-E: For clarity, the cilia whose IFT kymographs are shown should be marked. "Representative particle traces are marked with white lines in panels D and E" (legend): they are actually black lines. The authors should also clearly disclose the developmental stages of zebrafish used for the imaging.  

      Thank you for your comments. In the revised manuscript, the cilia used to generate the kymograph are marked by yellow arrows. We have updated the legend to change "white" to "black." Additionally, we have included the developmental stages of zebrafish used for imaging in Figure 3A.

      Figures 3G-K: The authors used quantification results from 4-dpf larvae and 30-hpf embryos for comparisons. Nevertheless, according to their experimental scheme in Figure 2B, 30-hpf embryos were not subjected to heat-shock treatment and genotyping. How could they express Ift88-GFP for the imaging? How could the authors choose larvae of the right genotypes? In addition, even if the authors heat-shocked them in time but forgot to mention, there are issues that need to be clarified experimentally and/or through citations, at least through discussions. Firstly, at 30 hpf, those motile cilia are probably still elongating. If this is the case, their final lengths would be longer than those presented (H; the authors need to disclose whether the lengths were measured from ciliary Ift88-GFP or another marker). In other words, the correlation with IFT velocities (H and I) might no longer exist when mature cilia were measured. Similarly, cilia undergo gradual disassembly during the cell cycle. Epidermal cells at 30-hpf are likely proliferating actively, and the average length of their cilia (H) would be shorter than that measured from quiescent epidermal cells in later stages.

      Thank you for these comments. First, we want to clarify that Figure 2B depicts the procedure for heat shock experiments conducted for the ovl mutants' rescue assay, not the experimental procedure for IFT imaging. We visualized IFT in five types of cilia using Tg (hsp70l: ift88-GFP) embryos without the ovl mutant background. In the revised manuscript, we have provided a detailed description of embryo treatment in the 'Materials and Methods' section and illustrated the experimental paradigm in Figure S2A. 

      Regarding the ciliary length differences between different developmental stages, we quantified cilia length in epidermal cells at 30 hpf versus 4 dpf, and in pronephric duct cilia at 30 hpf versus 48 hpf. Our analysis found no significant difference in length between earlier and later stages. Additionally, IFT velocities were comparable between these stages. These findings suggest that slower IFT velocities may not be attributed to the selection of different embryonic stages. Furthermore, we demonstrated that longer and shorter cilia maintain similar IFT velocities in crista cilia, indicating that elongated cilia within the same cell type exhibit comparable IFT velocities. These new results are presented in Figures S4 and S5 in the revised version.

      Secondly, do IFT velocities differ between elongating and mature cilia or remain relatively constant for a given cell type? The authors apparently take the latter for granted without even discussing the possibility of the former. In addition, whether the quantification results were from cilia of one or multiple fish, an important parameter to reflect the reproducibility, and sample sizes for the length data are not disclosed. The lack of descriptions on sample sizes and the number of independent experiments or larvae examined are actually common for statistical results in this manuscript.

      Thank you for your comments. We apologize for omitting the basic description of sample sizes and the number of cilia analyzed. We have addressed these issues in the revised manuscript. The length of crista cilia at 4 dpf is variable, with longer cilia reaching up to 30 µm and shorter cilia measuring only around 5 µm within the same crista. We categorized crista cilia into three length groups at intervals of 10 µm and measured the anterograde and retrograde IFT velocities in each group. The results revealed no significant difference in IFT velocity between elongating and mature cilia within the crista. These supplementary data are now included in Figure S4.
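
      The grouping step described above can be sketched as follows. This is only an illustration of binning cilia by length at a fixed interval and averaging a velocity per bin; the function name and the example records are hypothetical and do not reproduce the data in Figure S4.

```python
from collections import defaultdict
from statistics import mean


def mean_velocity_by_length_bin(records, bin_width_um=10.0):
    """Group (cilium length in µm, IFT velocity in µm/s) records into
    length bins of fixed width and return the mean velocity per bin.

    Bins are half-open, [lower, upper), so a cilium of exactly 30 µm
    would fall into the 30-40 µm bin.
    """
    bins = defaultdict(list)
    for length_um, velocity in records:
        bins[int(length_um // bin_width_um)].append(velocity)
    return {
        (idx * bin_width_um, (idx + 1) * bin_width_um): mean(velocities)
        for idx, velocities in sorted(bins.items())
    }


# Made-up records for illustration: (length in µm, anterograde velocity in µm/s).
example_records = [(6.0, 1.5), (8.0, 1.7), (14.0, 1.6), (25.0, 1.6), (29.0, 1.8)]
example_means = mean_velocity_by_length_bin(example_records)
```

      The per-bin velocity lists, rather than only their means, would be the input to a group comparison such as a Kruskal-Wallis test.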

      Figures 4A-B: When mutating neither Kif17 nor Kif3b affected the IFT of crista cilia, the data unlikely "suggest that the variability in IFT speeds among different cilia cannot be attributed to the use of different motor proteins". In fact, in the cited publication (Zhao et al., 2012), the authors used the same and additional mutants (Kif3c and Kif3cl) to demonstrate that different IFT-related kinesin motors have different effects on ciliogenesis and ciliary length in different tissues, results actually implying tissue-specific contributions of different kinesin motors to IFT. Furthermore, although likely only cytoplasmic dynein-2 is involved in the retrograde IFT, the authors cannot exclude the possibility that different combinations or isoforms of its many subunits and regulators contribute to the velocity regulation. Therefore, the authors need to reconsider their wording. This reviewer would suggest that the authors examine the IFT status of cilia that were previously reported to be shortened in the Kif3b mutant to see whether the correlation between ciliary length and IFT velocities still stands. This would actually be a critical assay to assess whether the proposed correlation is only a coincidence or indeed has a certain causality.

      Thank you for your comments. The shortened cilia observed in Kif3b mutants may be attributed to the presence of maternal Kif3b proteins, making it challenging to exclude the involvement of Kif3b motor. Regarding the relationship between IFT speed variability and motor proteins, we agree with the reviewer that we cannot entirely dismiss the possibility of different motors or adaptors being involved. We have revised our description of this aspect accordingly.

      Figures 4C-G: Similarly, when the authors found that tubulin glycylation or glutamylation has little effect on IFT, they cannot use these observations to exclude possible influences of other types of tubulin modifications on IFT. They should only stick to their observations.

      Yes, we agree. We have changed the description in the revised manuscript.

      Figure 5:

      A-C: When the authors only compared immotile cilia of crista with motile cilia of the spinal cord, it is hard to say whether the difference in particle size is correlated with ciliary length or motility. Cilia from more tissues should be included to strengthen their point, especially when the authors want to make this point the central one.

      D: The authors showed that ovl larvae containing Tg(hsp70l:ift88 GFP) (as they do not indicate the genotype, this reviewer can only deduce) display normal body curvature at 2 dpf after the injection of 0.5 ng of ift88 MO. Such a result, however, is quite confusing. According to their experimental scheme in Figure 2B, these larvae were not subjected to heat shock induction for Ift88-GFP. Do ovl larvae containing Tg(hsp70l:ift88 GFP) naturally display normal body curvature at 2 dpf? 

      Thank you for your comments. Due to technical limitations, comparing IFT particle size across different cilia using STED is challenging. We agree with this reviewer that the evidence supporting this aspect is relatively weak. Accordingly, we have modified and softened our conclusion in the revised version.

      Regarding the injection of ift88 morpholino, we want to clarify that we injected it into wild-type embryos, not ovl mutants. The lower dose of ift88 morpholino (0.5 ng) partially knocked down Ift88, allowing embryos to maintain a grossly normal body axis while resulting in shorter cilia in the ear crista.

      E: The authors need to indicate the developmental stage of the larvae examined. One piece of missing data is global expression levels of both endogenous (maternal) Ift88 and exogenous Ift88-GFP in zebrafish larvae that are either uninjected, 8-ng-ift88 MO-injected, or 0.5-ng-ift88 MO-injected, preferably at multiple time points up to 3 dpf. The results will clarify (1) the total levels of Ift88 following time; (2) the extent of downregulation the MO injections achieved at different developmental stages; and importantly (3) whether the low MO dosage (0.5 ng) indeed allowed a persistent downregulation to affect IFT trains at 3 dpf, a time the authors made the assays for Figures 5F-J to reach the model (K). It will be great to include wild-type larvae for comparison.

      Thank you for these valuable suggestions. The ift88 morpholino (MO) was designed to block the splicing of ift88 transcripts and has been used in multiple studies. This morpholino specifically blocks the expression of endogenous ift88, while the expression of the Ift88-GFP transgene remains unaffected. It would be beneficial to titrate the expression level of Ift88 in the morphants at different stages. Unfortunately, we do not have access to a zebrafish Ift88 antibody. We assessed the effects of a lower amount of MO based on our observation that the fish maintained a normal body axis while exhibiting shorter cilia. Ideally, the amount of Ift88 should be lower in the morphants, considering the presence of ciliogenesis defects. We have included additional comments regarding this limitation in the revised version.

      Movies:

      Movies 1-5: Elapsed time is not provided. Furthermore, cilia in the pronephric duct and spinal cord are known to beat rapidly. Their motilities, however, appear to be largely compromised in Movies 3 and 4. Although the quantification results in Fig 3G imply that the authors imaged 30hpf embryos for such cilia, there is no statement on real conditions.

      Thank you for your comments. We apologize for the missing elapsed time in our movies. We have addressed this issue in the revised manuscript. Motile cilia are difficult to image due to their fast beating. To immobilize the moving cilia and enable the capture of IFT movement within them, we gently pressed the embryo with a round cover glass to inhibit ciliary beating. Data from each embryo were collected within 5 minutes to avoid the impact of embryo death on the results. We have added a detailed description in the 'Materials and Methods' section.

      Materials:

      The sequence of morpholino oligonucleotide against ift88 is missing.  

      We have added the sequence of ift88 morpholino in the revised manuscript.

      References:

      Important references are missing, including (1) the paper by Leventea et al., 2016 (PMID: 27263414), which shows cilia morphologies in various zebrafish tissues with more detailed descriptions of tissue anatomies and experimental techniques; (2) papers documenting that dynein motors "move faster than Kinesin motors" in IFT of C. reinhardtii and C. elegans cilia; and (3) the paper by Li et al., 2020 (PMID: 33112235), in which the authors constructed a hybrid IFT kinesin to markedly reduce anterograde IFT velocity (~2.8 fold) and IFT injection rate in C. reinhardtii cilia and found only a mild reduction (~15%) in ciliary length. This paper is important because it is a pioneering one that elegantly investigated the relationship between IFT velocity and ciliary length. The findings, however, do not necessarily contradict the current manuscript due to differences in, e.g., model organisms and methodology.

      Thank you for the detailed review. We have cited these references in the appropriate places in the revised manuscript.

      Reference

      Broekhuis JR, Verhey KJ, Jansen G (2014) Regulation of cilium length and intraflagellar transport by the RCK-kinases ICK and MOK in renal epithelial cells. PLoS One 9: e108470

      Kunova Bosakova M, Varecha M, Hampl M, Duran I, Nita A, Buchtova M, Dosedelova H, Machat R, Xie Y, Ni Z et al (2018) Regulation of ciliary function by fibroblast growth factor signaling identifies FGFR3-related disorders achondroplasia and thanatophoric dysplasia as ciliopathies. Hum Mol Genet 27: 1093-1105

      Luo W, Ruba A, Takao D, Zweifel LP, Lim RYH, Verhey KJ, Yang W (2017) Axonemal Lumen Dominates Cytosolic Protein Diffusion inside the Primary Cilium. Sci Rep 7: 15793

      Ou G, Blacque OE, Snow JJ, Leroux MR, Scholey JM (2005) Functional coordination of intraflagellar transport motors. Nature 436: 583-587

      See SK, Hoogendoorn S, Chung AH, Ye F, Steinman JB, Sakata-Kato T, Miller RM, Cupido T, Zalyte R, Carter AP et al (2016) Cytoplasmic Dynein Antagonists with Improved Potency and Isoform Selectivity. ACS Chem Biol 11: 53-60

      Williams CL, McIntyre JC, Norris SR, Jenkins PM, Zhang L, Pei Q, Verhey K, Martens JR (2014) Direct evidence for BBSome-associated intraflagellar transport reveals distinct properties of native mammalian cilia. Nat Commun 5: 5813

      Yi P, Li WJ, Dong MQ, Ou G (2017) Dynein-Driven Retrograde Intraflagellar Transport Is Triphasic in C. elegans Sensory Cilia. Curr Biol 27: 1448-1461 e1447

      Zhao C, Omori Y, Brodowska K, Kovach P, Malicki J (2012) Kinesin-2 family in vertebrate ciliogenesis. Proc Natl Acad Sci USA 109: 2388-2393

      Zhou HM, Brust-Mascher I, Scholey JM (2001) Direct visualization of the movement of the monomeric axonal transport motor UNC-104 along neuronal processes in living Caenorhabditis elegans. J Neurosci 21: 3749-3755

    2. eLife Assessment

      The manuscript represents a valuable conceptual and technical contribution to our understanding of ciliogenesis and intraflagellar transport in vertebrates. Through a series of solid and technically superb live imaging experiments to directly visualize intraflagellar transport in various zebrafish ciliated tissues, the authors unveil the surprising breadth of intraflagellar transport speed among differing organs and link this to cell type-specific differences in cilia length and intraflagellar transport train size. This work will be of broad interest to researchers in numerous fields, including development, cell biology, and imaging.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors examine intraflagellar transport (IFT) in cilia of diverse organs in zebrafish. They show that IFT88-GFP (an IFT-B core complex protein) can substitute for endogenous IFT88 in promoting ciliogenesis and use it as a reporter to visualize IFT dynamics in living zebrafish embryos. They observe striking differences in cilia lengths and velocity of IFT trains in different cilia types, with smaller cilia length correlating with lower IFT speed. They generate several mutants and show that disrupting the function of different kinesin-2 motors and the BBSome, or altering post-translational modifications of tubulin, does not have a significant impact on IFT velocity. They do observe, however, that reducing the amount of IFT88 impacts the cilia length, the IFT velocity, and the number and size of IFT trains. They also show that IFT train size is slightly smaller in one of the organs with shorter cilia (spinal cord). Based on their observations they propose that IFT velocity determines cilia length and go one step further to propose that IFT velocity is regulated by the size of IFT trains.

      Strengths:

      The main highlight of this study is the direct visualization of IFT dynamics in multiple organs of a living complex multicellular organism, zebrafish. The quality of the imaging is really good. Further, the authors have developed phenomenal resources to study IFT in zebrafish, which would allow us to explore several mechanisms involved in IFT regulation in future studies. They make some interesting findings in mutants with disrupted function of kinesin-2, BBSome and tubulin-modifying enzymes, which are interesting to compare with cilia studies in other model organisms. Also, their observation of a possible link between cilia length and IFT speed is potentially fascinating.

      Weaknesses:

      The central hypothesis of the manuscript, namely that cilia length regulation occurs via controlling IFT speed through the modulation of the size of the IFT complex, is supported only with preliminary data and needs stronger evidence.

      The authors have robustly shown that the cilia length and IFT train speeds are highly variable between organs and have a strong correlation. With this they hypothesize that IFT train speeds could play a role in determining ciliary length, which is an interesting hypothesis that merits discussion. However, the claim that the cilia length (and IFT velocity) in different organs is different due to difference in the sizes of IFT trains is based on weak evidence. This is based on a marginal difference of IFT train sizes they observe between cilia of crista and spinal cord in immunofluorescence experiments (Fig. 5C). Inferring that this minor difference is key to the striking difference in cilia length and IFT velocity is too bold in my opinion.

      To back this hypothesis, they look at ift88 morphants where there is a reduced pool of IFT88 (part of the IFTB1 complex which forms the core of IFT trains, based on multiple cryo-EM studies of IFT trains). Disruption (or reduced number) of IFTB1 complex could indeed lead to IFT trains not being formed properly, which can have an impact on IFT (train size, speed, frequency, etc.) and ciliary structure, as shown by the authors. However, this does not directly imply that under wild-type conditions, cilia in spinal cord have poorly formed slightly shorter IFT trains (cilia length ~0.9 µm in spinal cord vs ~1.2 µm in cristae; Fig. 3G) which results in strikingly lower speeds (~0.4 µm/s in spinal cord vs ~1.6 µm/s in cristae; Fig. 3G) and shorter cilia (~3 µm in spinal cord vs ~26 µm in cristae; Fig. 3H). Such a claim would require much stronger evidence.

      Finally, if IFT train speeds directly correlate with size of IFT train, the authors should be able to see this within the same cilia, i.e., the velocity of a brighter IFT train (larger train) would be higher than the velocity of a dimmer IFT train (smaller train) within the same cilia. This is not apparent from the movies and such a correlation should be verified to make their claim stronger.
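
      One way such a within-cilium check could be set up is sketched below: trains are grouped by cilium, and a Pearson correlation between train intensity and train velocity is computed for each cilium separately. The function names, the minimum number of trains per cilium, and the example values are assumptions made for illustration; they are not taken from the manuscript.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def per_cilium_correlation(trains, min_trains=3):
    """trains: iterable of (cilium_id, intensity, velocity).

    Returns {cilium_id: r} for cilia with at least min_trains trains.
    """
    by_cilium = {}
    for cilium_id, intensity, velocity in trains:
        by_cilium.setdefault(cilium_id, []).append((intensity, velocity))
    return {
        cilium_id: pearson_r([i for i, _ in pairs], [v for _, v in pairs])
        for cilium_id, pairs in by_cilium.items()
        if len(pairs) >= min_trains
    }


# Made-up trains from two hypothetical cilia; values are illustrative only.
example_trains = [
    ("cilium_a", 100.0, 1.0), ("cilium_a", 200.0, 1.5), ("cilium_a", 300.0, 2.0),
    ("cilium_b", 100.0, 1.6), ("cilium_b", 200.0, 1.6),
]
example_r = per_cilium_correlation(example_trains)
```

      Consistently positive coefficients across cilia would support the proposed link between train size and speed, whereas coefficients scattered around zero would argue against it.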

      Impact:

      Overall, I think this work develops an exciting new multicellular model organism to study IFT mechanisms. Zebrafish is a vertebrate where we can perform genetic modifications with relative ease. This could be an ideal model to study not just the role of IFT in connection with ciliary function but also ciliopathies. Further, from an evolutionary perspective, it is fascinating to compare IFT mechanisms in zebrafish with unicellular protists like Chlamydomonas, simple multicellular organisms like C. elegans and primary mammalian cell cultures. Having said that, the central hypothesis of the manuscript is not backed with strong evidence and I would recommend that the authors not give too much weight to the hypothesis that IFT train velocity is determined by the size of IFT trains. Given the technological advancements made in this study, I think it is fine if it is a descriptive manuscript and doesn't necessarily need a breakthrough hypothesis based on the marginal correlation they observe.

    4. Reviewer #3 (Public review):

      Summary:

      An interesting feature of cilia in vertebrates and many, if not all, invertebrates is the striking heterogeneity of their lengths among different cell types. As mutations interfering with ciliary length usually impair ciliary functions, ciliary length appears to be tuned for proper ciliary functions in a given type of cells. Although ciliary length is known to be affected by multiple factors, including intraflagellar transport (IFT), a cilia-specific, train-like bidirectional transportation, and ciliary proteins regulating microtubule dynamics, how it is intrinsically controlled remains largely elusive.

      In the manuscript, the authors addressed this question from the angle of IFT by using zebrafish as a model organism. They demonstrated that ectopically expressed Ift88-GFP induced by heat shock treatment was able to sustain the normal development of and the cilia formation in ovl-/- zebrafish that would otherwise be dead by 7 dpf and lack cilia due to the lack of Ift88, a critical component of IFT-B complex, suggesting a full function of the exogenous protein. They next live imaged Ift88-GFP in wild-type zebrafish larvae to visualize the IFT. Interestingly, they found that both anterograde and retrograde velocities of Ift88-GFP puncta differed in cilia of different cell types (crista, neuromast, pronephric duct, spinal cord, and epidermal cells) and displayed a positive correlation with the inherent length of the cilia. Similar results were obtained with ectopically expressed tdTomato-Ift43 driven by a beta-actin promoter. In the same cell type, however, the velocities of Ift88-GFP puncta did not differ among cilia of different lengths or at different developmental stages. Depletion of proteins such as Bbs4, Ttll3, Ttll6, and Ccp5 did not substantially alter the IFT velocities, excluding contributions of the BBSome or the enzymes involved in tubulin glycylation or glutamylation. They also used a cilia-localized ATP reporter to exclude the possibility of different ciliary ATP concentrations. When they compared the size of Ift88-GFP puncta in crista cilia, which are inherently long, and spinal cord cilia, which are relatively short, by imaging with a STED super-resolution microscope, they noticed a positive correlation between the puncta size, which presumably reflected the size of IFT trains, and the length of the cilia. Furthermore, in morphant larvae with slightly decreased Ift88 levels, judged by the grossly normal body axis, IFT particle sizes, their velocities, and ciliary lengths were all reduced as compared to control morphants. Therefore, they proposed that longer IFT trains facilitate faster IFT to result in longer cilia.

      Strengths:

      The authors demonstrated that: (1) both anterograde and retrograde IFT velocities can differ markedly in cilia of different cell types in zebrafish larvae; (2) specific IFT velocities are intrinsic to cell types; (3) IFT velocities in different types of cells are positively correlated with inherent ciliary lengths; and (4) IFT velocities are positively correlated with the size of IFT trains. These findings provide both new knowledge on IFT properties in zebrafish and insights that would facilitate understanding of the mechanisms underlying the diversity of ciliary lengths in multicellular organisms. The experiments were carefully done and the results are generally convincing. The imaging methods for tracing IFT in cilia of multiple cell types in zebrafish larvae are expected to be useful to other researchers in the field.

      Weaknesses:

      (1) Although the proposed model is reasonable, it is largely based on correlations.

      (2) The effects of anti-sense RNA-induced Ift88 downregulation on IFT and ciliary length are artificial. It is unclear whether the levels of one or more IFT components are indeed regulated to control IFT train sizes and ciliary lengths in physiological conditions. Similarly, whether IFT velocities are indeed dictated by the size of IFT trains remains to be clarified.

      (3) In the Discussion section, Kif17 is described as an important motor for IFT in mouse olfactory cilia. In the cited literature (Williams et al., 2014), however, Kif17 is reported to be dispensable for IFT in mouse olfactory cilia. This makes the discussions on Kif17 absurd.

    1. eLife Assessment

      This study provides an important resource by thoroughly benchmarking multiple sequencing-based tRNA quantification methods. The suggested best practice is supported by convincing evidence from in silico experiments in multiple scenarios.

    2. Reviewer #1 (Public review):

      Summary:

      In the manuscript titled "Benchmarking tRNA-Seq quantification approaches by realistic tRNA-Seq data simulation identifies two novel approaches with higher accuracy," Tom Smith and colleagues conducted a comparative evaluation of various sequencing-based tRNA quantification methods. The inherent challenges in accurately quantifying tRNA transcriptional levels, stemming from their short sequences (70-100nt), extensive redundancy (~600 copies in human genomes with numerous isoacceptors and isodecoders), and potential for over 100 post-transcriptional chemical modifications, necessitate sophisticated approaches. Several wet-experimental methods (QuantM-tRNA, mim-tRNA, YAMAT, DM-tRNA, and ALL-tRNA) combined with bioinformatics tools (bowtie2-based, SHRiMP, and mimseq) have been proposed for this purpose. However, their practical strengths and weaknesses have not been comprehensively explored to date. In this study, the authors systematically assessed and compared these methods, considering factors such as incorrect alignments, multiple alignments, misincorporated bases (experimental errors), truncated reads, and correct assignments. Additionally, the authors introduced their own bioinformatic approaches (referred to as Decision and Salmon), which, while not without flaws (as perfection is unattainable), exhibit significant improvements over existing methods.

      Strengths:

      The manuscript meticulously compares tRNA quantification methods, offering a comprehensive exploration of each method's relative performance using standardized evaluation criteria. Recognizing the absence of "ground-truth" data, the authors generated in silico datasets mirroring common error profiles observed in real tRNA-seq data. Through the utilization of these datasets, the authors gained insights into prevalent sources of tRNA read misalignment and their implications for accurate quantification. Lastly, the authors proposed their own downstream analysis pipelines (Salmon and Decision), enhancing the manuscript's utility.

    3. Reviewer #2 (Public review):

      Summary:

      The authors provided benchmarking study results on tRNA-seq in terms of read alignment and quantification software with optimal parameterization. This result can be a useful guideline for choosing optimal parameters for tRNA-seq read alignment and quantification.

      Strengths:

      Benchmarking results for read alignment can be a useful guideline for choosing optimal parameters and mapping strategy (mapping to amino acid) for various tRNAseq.

      Weaknesses:

      Some explanations of the sequencing data analysis pipeline are not clear for general readers.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Because tRNA-sequencing methods have not been widely used (compared to mRNA-seq), many readers would not be familiar with the characteristics of different methods introduced in this study (QuantM-tRNA, mim-tRNA, YAMAT, DM-tRNA, and ALL-tRNA; bowtie2-based, SHRiMP, and mimseq; what are the main features of "Salmon?"). The manuscript will read better when the basic features of these methods are described in the manuscript, however brief.

      Introduction page 4 now clarifies a little more the difference between bowtie2, SHRiMP and mimseq. Results page 9 briefly summarises the differences between the tRNA-Seq methods. Results page 14 clarifies how Decision and Salmon work.

      Reviewer 2:

      (1) The explanation of the parameter D for bowtie2 sounds ambiguous. "How much effort to expend" needs to be explained in more detail.

      Results page 6 gives a more precise explanation of the D parameter.

      (2) Please provide optimal parameters (L and D) for tRNA-seq alignment.

      I think optimal here is not possible to determine. It will depend on the species, the frequency of misincorporations due to modifications (tRNA-Seq protocol specific) and how long one is willing to let bowtie continue searching for a better match. The point of Figure 1a is that D needs to be increased if L is decreased and an error is allowed in the seed. I think the sentence in the results section Figure 1a is the appropriate way to express this without committing to a single ‘optimal’ parameterisation: ‘We observed that when an error in the seed is allowed, as the seed length is decreased, there needs to be a concomitant increase in effort expended to allow bowtie2 more opportunities to find the best possible alignment, especially with respect to the Transcript ID’.
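
      As a rough illustration of the trade-off described above, the sketch below assembles a bowtie2 command line in which a shorter seed (`-L`) is paired with a higher effort setting (`-D`) while one seed mismatch is allowed (`-N 1`). The values L=10 and D=100 are the ones Reviewer 2 mentions in point (3); the function name and the file names are hypothetical, and this is not the authors' actual pipeline.

```python
def bowtie2_command(index_prefix, reads_fastq, out_sam,
                    seed_len=10, effort=100, seed_mismatches=1):
    """Build an argument list for a bowtie2 run (sketch, not the authors' script).

    -N : mismatches allowed in the seed (bowtie2 accepts 0 or 1)
    -L : seed length
    -D : consecutive failed seed extensions tolerated before giving up
    File names passed in are placeholders chosen for illustration.
    """
    return [
        "bowtie2",
        "-N", str(seed_mismatches),
        "-L", str(seed_len),
        "-D", str(effort),
        "-x", index_prefix,   # bowtie2 index built from the tRNA reference
        "-U", reads_fastq,    # unpaired reads
        "-S", out_sam,        # SAM output
    ]


if __name__ == "__main__":
    print(" ".join(bowtie2_command("trna_index", "reads.fastq", "aligned.sam")))
```

      Keeping the parameters as function arguments makes it easy to scan several (L, D) pairs, which matches the response's point that no single setting is optimal for every species or protocol.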

      (3) I think the authors chose L=10 and D=100 based on Figure 1A. Which dataset did you choose for this parameterization among ALL-tRNAseq, DM-tRNAseq, mim-tRNAseq, QuantM-tRNA-seq, and YAMAT-seq?

      Figure 1A is based on simulation of full length reads with only sequencing errors, i.e., not from any tRNA-Seq method in particular. This is stated in the results text and I’ve clarified in the figure legend.

      (4) Salmon does not need a read alignment process such as Bowtie2. Hence, it is not clear "Only results from alignment with bowtie2" in Figure legend for Figure 4a.

      I’m using Salmon in ‘alignment-mode’, taking the alignments from bowtie2. I’ve clarified this in results page 14.
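
      For readers unfamiliar with Salmon's alignment-based mode, the sketch below shows how such a call could be assembled: Salmon is given existing alignments (`-a`) and the reference sequences they were aligned to (`-t`) instead of raw reads and a Salmon index. The function name, the file names, and the automatic library type (`-l A`) are assumptions made for illustration, not settings taken from the manuscript.

```python
def salmon_alignment_mode_command(transcripts_fasta, alignments, out_dir, lib_type="A"):
    """Build an argument list for `salmon quant` in alignment-based mode (sketch).

    -t : FASTA of the sequences the reads were aligned against
    -l : library type ("A" asks Salmon to infer it)
    -a : SAM/BAM alignments, here assumed to come from bowtie2
    -o : output directory
    """
    return [
        "salmon", "quant",
        "-t", transcripts_fasta,
        "-l", lib_type,
        "-a", alignments,
        "-o", out_dir,
    ]


if __name__ == "__main__":
    print(" ".join(salmon_alignment_mode_command("trna.fa", "aligned.bam", "salmon_out")))
```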

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is an interesting and potentially important paper, which however has some deficiencies.

      Strengths:

      A significant amount of potentially useful data.

      Weaknesses:

      One issue is a confusion of thermal stability with solubility. The thermal stability of a protein is a thermodynamic parameter that can be described by the Gibbs-Helmholtz equation, which relates the free energy difference between the folded and unfolded states to temperature and to the entropy of unfolding. What is actually measured in PISA, by contrast, is a change in protein solubility, which is an empirical parameter affected by a great many variables, including the presence and concentration of other ambient proteins and other molecules. One might possibly argue that in TPP, where one measures the melting temperature change ∆Tm, thermal stability plays a decisive or at least an important role, but no such assertion can be made in PISA analysis that measures the solubility shift.
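
      For reference, the integrated form of the Gibbs-Helmholtz relation that is commonly used for protein stability curves is sketched below. The symbols are standard textbook notation rather than anything defined in the manuscript or the review, and the expression assumes two-state unfolding with a temperature-independent heat capacity change.

```latex
% \Delta G_u(T): free energy of unfolding at temperature T
% T_m: melting temperature; \Delta H_m, \Delta S_m: enthalpy and entropy of unfolding at T_m
% \Delta C_p: heat capacity change on unfolding (assumed constant)
\Delta G_u(T) = \Delta H_m\left(1 - \frac{T}{T_m}\right)
              + \Delta C_p\left[(T - T_m) - T\ln\frac{T}{T_m}\right],
\qquad
\Delta S_m = \frac{\Delta H_m}{T_m}
```

      At T = T_m the expression gives zero, which is the thermodynamic definition of the melting temperature; no comparable closed form exists for the solubility readout discussed here.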

      We completely agree with the insightful comment from the reviewer and we are very grateful that the point was raised. Our goal was to make this manuscript easily accessible to the entire scientific community, not just experts in the field. In an attempt to simplify the language, we likely also simplified the underlying physical principles that these assays exploit. In defense of our initial manuscript, we did state that PISA measures “a fold change in the abundance of soluble protein in a compound-treated sample vs. a vehicle-treated control after thermal denaturation and high-speed centrifugation.” Despite this attempt to accurately communicate the reviewer’s point, we seem to have not been sufficiently clear. Therefore, we tried to further elaborate on this point and made it clear that we are measuring differences in solubility and interpreting these differences as changes in thermal stability. 

      In the revised version of the manuscript, we elaborated significantly on our original explanation. The following excerpt appears in the introduction (p. 3):

      “So, while CETSA and TPP measure a change in melting temperature (∆TM), PISA measures a change in solubility (∆SM). Critically, there is a strong correlation between ∆TM and ∆SM, which makes PISA a reliable, if still imperfect, surrogate for measuring direct changes in protein thermal stability (Gaetani et al., 2019; Li et al., 2020). Thus, in the context of PISA, a change in protein thermal stability (or a thermal shift) can be defined as a fold change in the abundance of soluble protein in a compound-treated sample vs. a vehicle-treated control after thermal denaturation and high-speed centrifugation. Therefore, an increase in melting temperature, which one could determine using CETSA or TPP, will lead to an increase in the area under the curve and an increase in the soluble protein abundance relative to controls (positive log2 fold change). Conversely, a decrease in melting temperature will result in a decrease in the area under the curve and a decrease in the soluble protein abundance relative to controls (negative log2 fold change).”

      And the following excerpt appears in the results section (p. 4): 

      “In a PISA experiment, a change in melting temperature or a thermal shift is approximated as a significant deviation in soluble protein abundance following thermal melting and high-speed centrifugation. Throughout this manuscript, we will interpret these observed alterations in solubility as changes in protein thermal stability. Most commonly this is manifested as a log2 fold change comparing the soluble protein abundance of a compound treated sample to a vehicle-treated control (Figure 1 – figure supplement 1A).”

      We have now drawn a clear distinction between what we were actually measuring (changes in solubility) and how we were interpreting these changes (as thermal shifts). We trust that the Reviewer will agree with this point, as they rightly claim that many of the observations presented in our work, which measures thermal stability indirectly, are consistent with previous studies that measured thermal stability directly. Again, we thank the reviewer for raising the point and feel that these changes have significantly improved the manuscript.

      Another important issue is that the authors claim to have discovered for the first time a number of effects well described in prior literature, sometimes a decade ago. For instance, they marvel at the differences between the solubility changes observed in lysate versus intact cells, while this difference has been investigated in a number of prior studies. No reference to these studies is given during the relevant discussion.

      We thank the reviewer for raising this point. Our aim with this paper was to test the proficiency of this assay in high-throughput screening-type applications. We considered these observations as validation of our workflow, but admit that our choice of wording was not always appropriate and that we should have included more references to previous work. It was certainly never our intention to take credit for these discoveries. Therefore, we were more than happy to include more references in the revised version. We think that this makes the paper considerably better and will help readers better understand the context of our study.  

      The validity of statistical analysis raises concern. In fact, no calculation of statistical power is provided.

      As only two replicates were used in most cases, the statistical power must have been pretty limited. Also, there seems to be an absence of the multiple-hypothesis correction.

      We agree with the reviewer that a classical comparison using a t-test would be underpowered comparing all log2 normalized fold changes. We know from the data and our validation experiments that stability changes that generate log2 fold changes of 0.2 are indicative of compound engagement. When we use 0.2 to calculate power for a standard two-sample t-test with duplicates, we estimated this to have a power of 19.1%. Importantly, increasing this to n=3 resulted in a power estimate of only 39.9%, which would canonically still be considered to be underpowered. Thus, it is important to note that we instead use the distribution of all measurements for a single protein across all compound treatments to calculate standard deviations (nSD) as presented in this work. Thus, rather than a 2-by-2 comparison, we are comparing two duplicate compound treatments to 94 other compound treatments and 18 DMSO vehicle controls. Moreover, we are using this larger sample set to estimate the sampling distribution. Estimating this with a standard z-test would result in a p-value estimate <<< 0.0001 using the population standard deviation. Additionally, rather than estimate an FDR using, say, a Benjamini-Hochberg correction, we estimated an empirical FDR for target calls based on applying the same cutoffs to our DMSO controls and measuring the proportion of hits called in control samples at each set of thresholds. Finally, we note that several other PISA-based methods have used fold-change thresholds similar to, or less than, those employed in this work (PMID: 35506705, 36377428, 34878405, 38293219).
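
      To make the power argument concrete, the sketch below estimates the power of a two-sided, two-sample t-test by simulation for a given effect size and group size. The standard deviation passed in is a hypothetical value chosen for illustration, since the response does not state the variance estimate behind its 19.1% and 39.9% figures, so the output should not be expected to reproduce them. The critical values are the standard two-sided 5% Student's t values for df = 2n - 2, and the data are assumed to be normally distributed with equal variance.

```python
import random

# Two-sided 5% critical values of Student's t, keyed by replicates per group
# (df = 2n - 2): df=2 -> 4.303, df=4 -> 2.776, df=6 -> 2.447, df=8 -> 2.306.
T_CRIT = {2: 4.303, 3: 2.776, 4: 2.447, 5: 2.306}


def _mean_var(xs):
    """Return the mean and the unbiased sample variance of a list."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def simulated_power(delta, sd, n, n_sim=20000, seed=0):
    """Monte Carlo power of a pooled-variance two-sample t-test (sketch).

    delta : true difference in log2 fold change between the groups
    sd    : assumed per-measurement standard deviation (hypothetical)
    n     : replicates per group (2 to 5)
    """
    rng = random.Random(seed)
    t_crit = T_CRIT[n]
    rejections = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, sd) for _ in range(n)]
        treated = [rng.gauss(delta, sd) for _ in range(n)]
        m0, v0 = _mean_var(control)
        m1, v1 = _mean_var(treated)
        se = ((v0 + v1) / 2 * 2 / n) ** 0.5  # pooled standard error
        if se == 0:
            continue  # degenerate draw, counted as no rejection
        if abs((m1 - m0) / se) > t_crit:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    for n in (2, 3):
        print(n, simulated_power(delta=0.2, sd=0.1, n=n))
```

      Running the function with `delta=0` gives the false-positive rate of the test, which is a quick sanity check that the critical values and the pooled standard error are wired together correctly.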

      Also, the authors forgot that whatever results PISA produces, even at high statistical significance, represent just a prediction that needs to be validated by orthogonal means. In the absolute majority of cases such validation is missing.

      We appreciate this point and we can assure the reviewer that this point was not lost on us. To this point, we state throughout the paper that the primary purpose of this paper was to execute a chemical screen. Furthermore, we do not claim to present a definitive list of protein targets for each compound. Instead, our intention is to provide a framework for performing PISA studies at scale. In total, we quantified thousands of changes and feel that it would be unreasonable to validate the majority of these cases. Instead, as has been done for CETSA (PMID: 34265272), PISA (PMID: 31545609), and TPP (PMID: 25278616) experiments before, we chose to highlight a few examples and provide a reasonable amount of validation for these specific observations. In Figure 2, we show that two screening compounds, palbociclib and NVP-TAE-226, have a similar impact on PLK1 solubility as the two known PLK1 inhibitors. We then assay each of these compounds, alongside BI 2536, and show that the same compounds that impact the solubility of PLK1 also inhibit its activity in cell-based assays. Finally, we model the structure of palbociclib (which is highly similar to BI 2536) in the PLK1 active site. In Figure 4, we show that AZD-5438 causes a change in solubility of RIPK1 in cell- and lysate-based assays to a similar extent as other compounds known to engage RIPK1. We then test these compounds in cell-based assays and show that they are capable of inhibiting RIPK1 activity in vivo. Finally, in Figure 5, we show that treatment with tyrosine kinase inhibitors and AZD-7762 results in a decrease in the solubility of CRKL. We showed that these compounds, specifically, prevented the phosphorylation of CRKL at Y207. Next, we show that AZD-7762 impacts the thermal stability of tyrosine kinases in lysate-based PISA. Finally, we performed phosphoproteomic profiling of cells treated with bafetinib and AZD-7762 and find that the abundance of many pY sites is decreased after treatment with each compound. It is also worth stating that an important goal of this study was to determine the proficiency of these methods in identifying the targets of each compound. We do not feel that comprehensive validation of the “absolute majority of cases” would significantly improve this manuscript.

      Finally, to be a community-useful resource the paper needs to provide the dataset with a user interface so that the users can data-mine on their own.

      We agree and are working to develop an extensible resource for this. Owing to the size and complexities there, that work will need to be included in a follow-up manuscript. For now, we feel that the supplemental table we provide allows the full dataset to be navigated easily. Indeed, this has been the main resource that we have been emailed about since the preprint was first made public. We are glad that the Reviewer considers this dataset to be a highly valuable resource for the scientific community.

      Reviewer #2 (Public Review):

      Summary:

      Using K562 (Leukemia) cells as an experimental model, Van Vracken et al. use Thermal Proteome Profiling (TPP) to investigate changes in protein stability after exposing either live cells or crude cell lysates to a library of anti-cancer drugs. This was a large-scale and highly ambitious study, involving thousands of hours of mass spectrometry instrument time. The authors used an innovative combination of TPP together with Proteome Integral Solubility Alteration (PISA) assays to reduce the amount of instrument time needed, without compromising on the amount of data obtained.

      The paper is very well written, the relevance of this work is immediately apparent, and the results are well-explained and easy to follow even for a non-expert. The figures are well-presented. The methods appear to be explained in sufficient detail to allow others to reproduce the work.

      We thank the reviewer. One of our major goals was to make these assays and the resulting data approachable, especially for non-experts. We are glad that this turned out to be the case. 

      Strengths:

      Using CDK4/6 inhibitors, the authors observe strong changes in protein stability upon exposure to the drug. This is expected and shows their methodology is robust. Further, it adds confidence when the authors report changes in protein stability for drugs whose targets are not well-known. Many of the drugs used in this study - even those whose protein targets are already known - display numerous off-target effects. Although many of these are not rigorously followed up in this current study, the authors rightly highlight this point as a focus for future work.

      Weaknesses:

      While the off-target effects of several drugs could've been more rigorously investigated, it is clear the authors have already put a tremendous amount of time and effort into this study. The authors have made their entire dataset available to the scientific community - this will be a valuable resource to others working in the fields of cancer biology/drug discovery.

      We agree with the reviewer that there are more leads here that could be followed and we look forward to both exploring these in future work and seeing what the community does with these data.

      Reviewer #3 (Public Review):

      Summary:

      This work aims to demonstrate how recent advances in thermal stability assays can be utilised to screen chemical libraries and determine the compound mechanism of action. Focusing on 96 compounds with known mechanisms of action, they use the PISA assay to measure changes in protein stability upon treatment with a high dose (10uM) in live K562 cells and whole cell lysates from K562 or HCT116. They intend this work to showcase a robust workflow that can serve as a roadmap for future studies.

      Strengths:

      The major strength of this study is the combination of live and whole cell lysates experiments. This allows the authors to compare the results from these two approaches to identify novel ligand-induced changes in thermal stability with greater confidence. More usefully, this also enables the authors to separate the primary and secondary effects of the compounds within the live cell assay.

      The study also benefits from the number of compounds tested within the same framework, which allows the authors to make direct comparisons between compounds.

      These two strengths are combined when they compare CHEK1 inhibitors and suggest that AZD-7762 likely induces secondary destabilisation of CRKL through off-target engagement with tyrosine kinases.

      Weaknesses:

      One of the stated benefits of PISA compared to the TPP in the original publication (Gaetani et al 2019) was that the reduced number of samples required allows more replicate experiments to be performed. Despite this, the authors of this study performed only duplicate experiments. They acknowledge this precludes the use of frequentist statistical tests to identify significant changes in protein stability. Instead, they apply an 'empirically derived framework' in which they apply two thresholds to the fold change vs DMSO: absolute z-score (calculated from all compounds for a protein) > 3.5 and absolute log2 fold-change > 0.2. They state that the fold-change threshold was necessary to exclude nonspecific interactors. While the thresholds appear relatively stringent, this approach will likely reduce the robustness of their findings in comparison to an experimental design incorporating more replicates. Firstly, the magnitude of the effect size should not be taken as a proxy for the importance of the effect.

      They acknowledge this and demonstrate it using their data for PIK3CB and p38α inhibitors (Figures 2B-C). They have thus likely missed many small, but biologically relevant changes in thermal stability due to the fold-change threshold. Secondly, this approach relies upon the fold-changes between DMSO and compound for each protein being comparable, despite them being drawn from samples spread across 16 TMT multiplexes. Each multiplex necessitates a separate MS run and the quantification of a distinct set of peptides, from which the protein-level abundances are estimated. Thus, it is unlikely the fold changes for unaffected proteins are drawn from the same distribution, which is an unstated assumption of their thresholding approach. The authors could alleviate the second concern by demonstrating that there is very little or no batch effect across the TMT multiplexes. However, the first concern would remain. The limitations of their approach could have been avoided with more replicates and the use of an appropriate statistical test. It would be helpful if the authors could clarify if any of the missed targets passed the z-score threshold but fell below the fold-change threshold.
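
      The two-threshold rule discussed in this weakness can be sketched as follows: for one protein, a z-score is computed for each compound against the distribution of log2 fold changes across all compounds, and a compound is called a hit only if both |z| > 3.5 and |log2 fold change| > 0.2. The function names and the toy data are hypothetical, and the exact way the authors compute their standard deviations may differ from this simple version.

```python
import statistics


def passes(log2fc, mean, sd, z_cut=3.5, fc_cut=0.2):
    """True if a measurement clears both the z-score and fold-change cutoffs."""
    z = (log2fc - mean) / sd
    return abs(z) > z_cut and abs(log2fc) > fc_cut


def call_hits(log2fc_by_compound, z_cut=3.5, fc_cut=0.2):
    """Return the compounds called as hits for a single protein (sketch).

    log2fc_by_compound maps a compound name to that protein's log2 fold
    change vs. DMSO. The mean and standard deviation are taken over all
    compounds, which assumes the values are comparable across multiplexes.
    """
    values = list(log2fc_by_compound.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [name for name, fc in log2fc_by_compound.items()
            if passes(fc, mean, sd, z_cut, fc_cut)]


if __name__ == "__main__":
    # Toy data: 40 compounds with near-zero shifts plus one strong shift.
    toy = {f"cmpd_{i}": (0.02 if i % 2 == 0 else -0.02) for i in range(40)}
    toy["cmpd_hit"] = 0.8
    print(call_hits(toy))
```

      A shift that is extreme relative to a very tight distribution but smaller than 0.2 in absolute terms is rejected by this rule, which is exactly the class of small but possibly relevant changes the reviewer is concerned about.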

      The authors use a single, high, concentration of 10uM for all compounds. Given that many of the compounds likely have low nM IC50s, this concentration will often be multiple orders of magnitude above the one at which they inhibit their target. This makes it difficult to assess the relevance of the off-target effects identified to clinical applications of the compounds or biological experiments. The authors acknowledge this and use ranges of concentrations for follow-up studies (e.g. Figure 2E-F). Nonetheless, this weakness is present for the vast bulk of the data presented.

      We agree that there is potential to drive off-target effects at such high concentrations. However, we note that the concentration we employ is in the same range as previous PISA/CETSA/TPP studies. For example, 10 µM treatments were used in the initial descriptions of TPP (Savitski et al., 2014) and PISA (Gaetani et al., 2019). We also note that temperature may affect off-rates and binding interactions (PMID: 32946682) potentiating the need to use compound concentrations to overcome these effects.

      Additionally, these compounds likely accumulate in human plasma/tissues at concentrations that far exceed the compound IC50 values. For example, in patients treated with a standard clinical dose of ribociclib, the concentration of the compound in the plasma fluctuates between 1 µM and 10 µM. (Bao, X., Wu, J., Sanai, N., & Li, J. (2019). Determination of total and unbound ribociclib in human plasma and brain tumor tissues using liquid chromatography coupled with tandem mass spectrometry. Journal of pharmaceutical and biomedical analysis, 166, 197–204. https://doi.org/10.1016/j.jpba.2019.01.017)

      The authors' claim that combining cell-based and lysate-based assays increases coverage (Figure 3F) is not supported by their data. The '% targets' presented in Figure 3F have a different denominator for each bar. As it stands, all 49 targets quantified in both assays which have a significant change in thermal stability may be significant in the cell-based assay. If so, the apparent increase in % targets when combining reflects only the subsetting of the data. To alleviate this lack of clarity, the authors could update Figure 3F so that all three bars present the % targets figure for just the 60 compounds present in both assays.

      We spent much time debating the best way to present this data, so we are grateful for the feedback. Consistent with the Reviewer’s suggestion, we have included a figure that only considers the 60 compounds for which a target was quantified in both cell-based and lysate-based PISA (now Figure 3E). In addition, we included a pie chart that further illustrates our point (now Figure 3 – figure supplement 2A). Of the 60 compounds, there were 37 compounds that had a known target pass as a hit using both approaches, 6 compounds that had a known target pass as a hit in only cell-based experiments, and 6 compounds that had a known target pass as a hit in only lysate-based experiments.

      Within the Venn diagram, we also included a few examples of compounds that fit into each category. Furthermore, we highlighted two examples of compound-target pairs that pass as a hit with one approach, but not the other (Figure 3 – figure supplement 2B,C). We would also like to refer the reviewer to Figure 4D, which indicates that BRAF inhibitors cause a significant change in BRAF thermal stability in lysates but not cells. 

      Aims achieved, impact and utility:

      The authors have achieved their main aim of presenting a workflow that serves to demonstrate the potential value of this approach. However, by using a single high dose of each compound and failing to adequately replicate their experiments and instead applying heuristic thresholds, they have limited the impact of their findings. Their results will be a useful resource for researchers wishing to explore potential off-target interactions and/or mechanisms of action for these 96 compounds, but are expected to be superseded by more robust datasets in the near future. The most valuable aspect of the study is the demonstration that combining live cell and whole cell lysate PISA assays across multiple related compounds can help to elucidate the mechanisms of action.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      More specifically:

      P 1 l 20, we quantified 1.498 million thermal stability measurements.

      It's a staggering assertion, and it takes some reading to realize that the authors mean the total number of proteins identified and quantified in all experiments. But far from all of these proteins were quantified with enough precision to provide meaningful solubility shifts.

      We can assure the reviewer that we were not trying to deceive the readers. We stated ‘1.498 million thermal stability measurements.’ We did not say ‘1.498 million compound-specific thermal stability shifts.’ We assume that most readers will appreciate that the overall quality of the measurements will be variable across the dataset, e.g., in any work that describes quantitation of thousands of proteins in a proteomics dataset. In accordance with the Reviewer’s suggestion, we have weakened this statement. The revised version of the manuscript now reads as follows (p. 1):

      “Taking advantage of this advance, we quantified more than one million thermal stability measurements in response to multiple classes of therapeutic and tool compounds (96 compounds in living cells and 70 compounds in lysates).”

      P 7 l 28. We observed a large range of thermal stability measurements for known compound-target pairs, from a four-fold reduction in protein stability to a four-fold increase in protein stability upon compound engagement (Figure 2A).

      PISA-derived solubility shift cannot be interpreted simply as a "four-fold reduction/increase in protein stability".

      We thank the Reviewer for highlighting this specific passage and agree that it was worded poorly. As such, we have modified the manuscript to the following (p. 8): 

      “We observed a large range of thermal stability measurements for known compound-target pairs, from a four-fold reduction in protein solubility after thermal denaturation to a four-fold increase in protein solubility upon compound engagement (Figure 2A).”

      P 8, l 6. Instead, we posit that maximum ligand-induced change in thermal stability is target-specific.

      Yes, that's right, but this has been shown in a number of prior studies.

      We agree with the reviewer and accept that we made a mistake in how we worded this sentence, which we regret upon reflection. As such, we have modified this sentence to the following:

      “Instead, our data appears to be consistent with the previous observation that the maximum ligand-induced change in thermal stability is target-specific (Savitski et al., 2014; Becher et al., 2016).”

      P 11 l 7. Combining the two approaches allows for greater coverage of the cellular proteome and provides a better chance of observing the protein target for a compound of interest. In fact, the main difference is that in-cell PISA provides targets in cases when the compound is a pro-drug that needs to be metabolically processed before engaging the intended target. This has been shown in a number of prior studies, but not mentioned in this manuscript.

      While our study was not focused on the issue of pro-drugs, this is an important point and we would be happy to re-iterate it in our manuscript. We thank the Reviewer for the suggestion and have modified the manuscript to reflect this point (p. 19): 

      “Cell-based studies, on the other hand, have the added potential to identify the targets of pro-drugs that must be metabolized in the cell to become active and secondary changes that occur independent of direct engagement (Savitski et al., 2014; Franken et al., 2015; Almqvist et al., 2016; Becher et al., 2016; Liang et al., 2022).”

      While we are happy to make this change, we also would like to point out that the reviewer’s assertion that “the main difference is that in-cell PISA provides targets in cases when the compound is a pro-drug that needs to be metabolically processed before engaging the intended target” also may not fully capture the nuances of protein engagement effectors in the cellular context. Thus, we believe it is important to highlight the ability of cell-based assays to identify secondary changes in thermal stability.

      P 11 l 28. These data suggest that the thermal destabilization observed in cell-based experiments might stem from a complex biophysical rearrangement. That's right because it is not about thermal stability, but about protein solubility which is much affected by the environment.

      We agree that the readout of solubility is an important caveat for nearly every experiment in the family of assays associated with ‘thermal proteome profiling’. Complex biophysical arrangements could affect the inherent stability and solubility of a protein or complex. Thus, we would be happy to make the following change consistent with the reviewer’s suggestion (p. 12):

      “These data suggest that the decrease in solubility observed in cell-based experiments might stem from a complex biophysical rearrangement.”

      P 12 l 7 A). Thus, certain protein targets are more prone to thermal stability changes in one experimental setting compared to the other. Same thing - it's about solubility, not stability.

      We thank the Reviewer for the recommendation and have modified the revised manuscript as follows (p. 13):

      “Thus, certain protein targets were more prone to solubility (thermal stability) changes in one experimental setting compared to the other (Huber et al., 2015).”

      P13 l 15. While the data suggests that cell- and lysate-based PISA are equally valuable in screening the proteome for evidence of target engagement... No, they are not equally valuable - cell-based PISA can provide targets of prodrugs, which lysate PISA cannot.

      We have removed this sentence to avoid any confusion. We will not place any value judgments on the two approaches. 

      P 18 l 10. In general, a compound-dependent thermal shift that occurs in a lysate-based experiment is almost certain to stem from direct target engagement. That's true and has been known for a decade. Reference needed.

      We recognize this oversight and would be happy to include references. The revised manuscript reads as follows: 

      “In general, a compound-dependent thermal shift that occurs in a lysate-based experiment is almost certain to stem from direct target engagement (Savitski et al., 2014; Becher et al., 2016). This is because cell signaling pathways and cellular structures are disrupted and diluted. Cell-based studies, on the other hand, have the added potential to identify the targets of pro-drugs that must be metabolized in the cell to become active and secondary changes that occur independent of direct engagement (Savitski et al., 2014; Franken et al., 2015; Almqvist et al., 2016; Becher et al., 2016; Liang et al., 2022).”

      P 18 l 29. the data seemed to indicate that the maximal PISA fold change is protein-specific. Therefore, a log2 fold change of 2 for one compound-protein pair could be just as meaningful as a log2 fold change of 0.2 for another. This is also not new information.

      We again thank the Reviewer for highlighting this oversight. The revised manuscript reads as follows:

      “Ultimately, the data seemed to be consistent with previous studies that indicate the maximal change in thermal stability is protein specific (Savitski et al., 2014; Becher et al., 2016; Sabatier et al., 2022). Therefore, a log2 fold change of 2 for one compound-protein pair could be just as meaningful as a log2 fold change of 0.2 for another.”

      P 19 l 5. Specifically, the compounds that most strongly impacted the thermal stability of targets, also acted as the most potent inhibitors. I wish this was true, but this is not always so. For instance, in Nat Meth 2019, 16, 894-901 it was postulated that large ∆Tm correspond to biologically most important sites ("hot spots") - the idea that was later challenged and largely discredited in subsequent studies.

      Indeed, we agree with the Reviewer that there may be no essential connection between these. Rather, we are simply drawing conclusions from observations within the presented dataset. 

      Saying nothing about the work presented in the paper that the reviewer notes above, the referenced definition is also more nuanced: “…we hypothesized that ‘hotspot’ modification sites identified in this screen (namely, those significantly shifted relative to the unmodified, bulk and even other phosphomodiforms of the same protein) may represent sites with disproportionate effects on protein structure and function under specific cellular conditions.” Indeed, in the response to that work, Potel et al. (https://doi.org/10.1038/s41592-021-01177-5) “agree with the premise of the Huang et al. study that phosphorylation sites that have a significant effect on protein thermal stability are more likely to be functionally relevant, for example, by modulating protein conformation, localization and protein interactions.”

      Anecdotally, we also speculate that if we observe proteome engagement for two compounds (let’s say two ATP-competitive kinase inhibitors) that bind in the same pocket (let’s say the ATP binding site) and one causes a greater change in solubility, then it is reasonable to treat the larger change as stronger evidence of engagement. We see evidence supporting this claim in Figures 2, 3, 4, and 5.

      It is also important to point out that previous work has also made similar points. This is highlighted in a review article by Mateus et al. (10.1186/s12953-017-0122-4). The authors state, “To obtain affinity estimates with TPP, a compound concentration range TPP (TPP-CCR) can be performed. In TPP-CCR, cells are incubated with a range of concentrations of compound and heated to a single temperature.” In support of this claim, the authors reference two papers—Savitski et al., 2014 and Becher et al., 2016. We have updated this section in the revised manuscript (p. 20):

      “While the primary screen was carried out at fixed dose, the increased throughput of PISA allowed for certain compounds to be assayed at multiple doses in a single experiment. In these instances, there was a clear dose-dependent change in thermal stability of primary targets, off-targets, and secondary targets. This not only helped corroborate observations from the primary screen, but also seemed to provide a qualitative assessment of relative compound potency in agreement with previous studies (Savitski et al., 2014; Becher et al., 2016; Mateus et al., 2017). Specifically, the compounds that most strongly impacted the thermal stability of targets, also acted as the most potent inhibitors. In order to be a candidate for this type of study, a target must have a large maximal thermal shift (magnitude of log2 fold change) because there must be a large enough dynamic range to clearly resolve different doses.”

      Also, the compound efficacy is strongly dependent upon the residence time of the drug, which may or may not correlate with the PISA shift. Also important is the concentration at which target engagement occurs (Anal Chem 2022, 94, 15772-15780).

      In our study, the time and concentration of treatment were fixed for all compounds at 30 minutes and 10 µM, respectively. Therefore, we do not believe these parameters will affect our conclusions.

      P 19 l 19. For example, we found that the clinically-deployed CDK4/6 inhibitor palbociclib is capable of directly engaging and inhibiting PLK1. This is a PISA-based prediction that needs to be validated by orthogonal means.

      As we demonstrate in this work, the PISA assays serve as powerful screening methods, thus we agree that validation is important for these types of studies. To this end, we show the following:  

      • Proteomics: Palbociclib causes a decrease in solubility following thermal melting in cells.

      • Chemical informatics: Palbociclib is structurally similar to BI 2536.

      • Protein informatics: Modeling of palbociclib in empirical structures of the PLK1 active site generates negligible steric clashes. 

      • Biochemical: Palbociclib inhibits PLK1 activity in cells.

      We have changed this text to the following to clarify these points:

      “For example, we found that the clinically-deployed CDK4/6 inhibitor palbociclib has a dramatic impact on PLK1 thermal stability in live cells, is capable of inhibiting PLK1 activity in cell-based assays, and can be modelled into the PLK1 active site.”

      Reviewer #2 (Recommendations For The Authors):

      I am wondering why the authors chose to use K562 (leukaemia) cells in this work as opposed to a different cancer cell line (HeLa? Panc1?). It would be helpful if the authors could present some rationale for this decision.

      This is a great question. Two reasons really. First, they are commonly used in various fields of research, especially in previous studies using proteome-wide thermal shift assays (PMID: 25278616, 32060372) and large-scale chemical perturbation screens (PMID: 31806696). Second, they are a suspension line, which makes executing the experiments easier because they do not need to be detached from a plate prior to thermal melting. We think this is a valuable point to make in the manuscript, so that non-experts understand this concept. We tried to communicate this succinctly in the revised manuscript, but would be happy to elaborate further if the Reviewer would like us to.

      “To enable large-scale chemical perturbation screening, we first sought to establish a robust workflow for assessing protein thermal stability changes in living cells. We chose K562 cells, which grow in suspension, because they have been frequently used in similar studies and can easily be transferred from a culture flask to PCR tubes for thermal melting (Savitski et al., 2014; Jarzab et al., 2020).”

      I note that integral membrane proteins are over-represented among targets for anti-cancer therapeutics. To what extent is the membrane proteome (plasma membrane in particular) identified in this work? After examining the methods, I would expect at least some integral membrane proteins to be identified. Do the authors observe any differences in the behaviour of water-soluble proteins versus integral membrane proteins in their assays? It would be helpful if the authors could comment on this in a potential revision.

      We agree this is an important point when considering the usage of PISA and thermal stability assays in general for specific classes of therapeutics. To address this, we explored what effect the analysis of thermal stability/solubility had on the proportion of membrane proteins in our data (Author response image 1). Annotations were extracted from Uniprot based on each protein being assigned to the “plasma membrane” (07/2024). We quantified 1,448 (16.5% of total proteins) and 1,558 (17.3% of total proteins) membrane proteins in our cell and lysate PISA datasets, respectively. We also compared the proportion of annotated proteins in these datasets to a recent TMTpro dataset (Lin et al.; PMID: 38853901) and found that the PISA datasets recovered a slightly lower proportion of membrane proteins (~17% in PISA versus 18.9% in total proteome analysis). Yet, we note that we expect more membrane proteins in urea/SDS based lysis methods compared to 0.5% NP-40 extractions.

      Author response image 1.

      We were not able to find an appropriate place to insert these data into the manuscript, so we have left them here in the response. If the Reviewer feels strongly that these data should be included in the manuscript, we would be happy to include them.

      A final note: I commend the authors for making their full dataset publicly available upon submission to this journal. This data promises to be a very useful resource for those working in the field.

      We thank the Reviewer for this and note that we are excited for this data to be of use to the community.

      Reviewer #3 (Recommendations For The Authors):

      There is no dataset PDX048009 in ProteomeXchange Consortium. I assume this is because it's under an embargo which needs to be released.

      We can confirm that data was uploaded to ProteomeXchange.

      MS data added to the manuscript during revisions was submitted to ProteomeXchange with the identifier – PDX053138.

      Page 9 line 5 refers to 59 compounds quantified in both cell-based and lysate-based, but Figure 3E shows 60 compounds quantified in both. I believe these numbers should match.

      We thank the Reviewer for catching this. In response to critiques from this Reviewer in the Public Review, we re-worked this section considerably. Please see the above critique/response for more details. 

      Page 10, lines 26-28: It would help the reader if some of the potential 'artefactual effects of lysate-based analyses' were described briefly.

      We thank the Reviewer for raising this point. The truth is that we are not exactly sure what is happening here, but we know that, at least for vorinostat, this excess of changes in lysate-based PISA is consistent across experiments. We also do not see pervasive issues within the plexes containing these compounds. Therefore, we do not think this is due to a mistake or other experimental error. We hypothesize that the effect might result from a change in pH or other similar property that occurs upon addition of the molecule, though we note that we have previously seen that vorinostat can induce large numbers of solubility changes in related solvent shift assays (doi: 10.7554/eLife.70784). We have modified the text to indicate that we do not fully understand the reason for the observation (p. 11):

      “It is highly unlikely that these three molecules actively engage so many proteins and, therefore, the 2,176 hits in the lysate-based screen were likely affected in part by consistent, but artefactual effects of lysate-based analyses that we do not fully understand (Van Vranken et al., 2021).”

      Page 24, lines 29-30 appear to contain a typo. I believe the '>' should be '<' or the 'exclude' should be 'retain'.

      The Reviewer is completely correct. We appreciate the attention to detail. This mistake has been corrected in the revised manuscript.  

      Page 25, lines 5-7: The methods need to explain how the trimmed standard deviation is calculated.

      We apologize for this oversight. To calculate the trimmed standard deviation, we used proteins that were measured in at least 30 conditions. For these, we then removed the top 5% of absolute log2 fold-changes (compared to DMSO controls) and calculated the standard deviation of the resulting set of log2 fold-changes. This is similar in concept to the utilization of “trimmed means” in proteomics data (https://doi.org/10.15252/msb.20145625), which helps to overcome issues due to extreme outliers in datasets. We have added the following statement to the methods to clarify this point (p. 27):

      “Second, for each protein across all cells or lysate assays, the number of standard deviations away from the mean thermal stability measurement (z-score) for a given protein was quantified based on a trimmed standard deviation. Briefly, the trimmed standard deviation was calculated for proteins that were measured in at least 30 conditions. For these, we removed the top 5% of absolute log2 fold-changes (compared to DMSO controls) and calculated the standard deviation of the resulting set of log2 fold-changes.”
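
      The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `trimmed_sd`, rounding the 5% cut down to a whole number of values, and the use of the sample standard deviation are choices made here and are not specified in the response.

```python
import math
import statistics


def trimmed_sd(log2_fold_changes, trim_fraction=0.05, min_conditions=30):
    """Standard deviation after dropping the largest absolute log2 fold-changes.

    Returns None when the protein was measured in fewer than `min_conditions`
    conditions (hypothetical handling; the response only says such proteins
    are not used).
    """
    values = list(log2_fold_changes)
    if len(values) < min_conditions:
        return None
    # Assumption: the number of trimmed values is rounded down.
    n_drop = math.floor(len(values) * trim_fraction)
    # Sort by magnitude and keep everything except the n_drop largest |log2 FC|.
    kept = sorted(values, key=abs)[: len(values) - n_drop]
    # Assumption: sample (n - 1) standard deviation.
    return statistics.stdev(kept)


# Toy data, invented for illustration: 38 small changes plus two extreme outliers.
toy = [0.1, -0.1] * 19 + [3.0, -2.5]
print(trimmed_sd(toy), statistics.stdev(toy))
```

      In this toy case the two outliers fall within the trimmed 5%, so the trimmed value reflects only the spread of the unaffected measurements.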

      Page 25, lines 9-11 needs editing for clarity.

      We tested empirical hit rates to estimate which mean and trimmed standard deviation (trimmedSD) thresholds to apply, in order to maximize sensitivity and minimize the ‘False Hit Rate’, or the number of proteins in the DMSO control samples called as hits divided by the total number of proteins called as hits with a given threshold applied.

      Author response image 2.

      Hit calling threshold setting based on maximizing the total hits called and minimizing the False Hit Rate in cells (number of DMSO hits divided by the total number of hits).

      Author response image 3.

      Hit calling threshold setting based on maximizing the total hits called and minimizing the False Hit Rate in lysates (number of DMSO hits divided by the total number of hits).
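
      The False Hit Rate defined above can be illustrated with a short Python sketch. Everything named here is hypothetical: the tuple layout, the function names, and the toy numbers are invented for illustration, and the sketch assumes that DMSO hits are counted in the total number of hits.

```python
def is_hit(z_score, log2_fc, z_cut, fc_cut):
    """A measurement is called a hit only if it exceeds both cutoffs."""
    return abs(z_score) > z_cut and abs(log2_fc) > fc_cut


def false_hit_rate(measurements, z_cut, fc_cut):
    """Return (total hits, DMSO hits / total hits) for one pair of cutoffs.

    `measurements` is an iterable of (is_dmso_control, z_score, log2_fc).
    Assumption: the rate is reported as 0.0 when nothing is called a hit.
    """
    total = 0
    dmso = 0
    for is_dmso, z_score, log2_fc in measurements:
        if is_hit(z_score, log2_fc, z_cut, fc_cut):
            total += 1
            if is_dmso:
                dmso += 1
    rate = dmso / total if total else 0.0
    return total, rate


# Toy measurements, invented for illustration only.
toy = [
    (False, 5.0, 0.50),
    (False, 4.0, 0.30),
    (True, 3.8, 0.25),
    (False, 2.0, 0.60),
    (True, 1.0, 0.05),
]

# Scan a few candidate z-score cutoffs at a fixed fold-change cutoff.
for z_cut in (3.0, 3.5, 4.5):
    print(z_cut, false_hit_rate(toy, z_cut, 0.2))
```

      Scanning cutoffs in this way makes the trade-off explicit: stricter thresholds lower the False Hit Rate but also reduce the total number of hits called.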

      Figure 1 supplementary 2a legend states: '32 DMSO controls'. Should that be 64?

      We thank the Reviewer for catching our mistake. This has been corrected in the revised manuscript. 

      I suggest removing Figure 1 supplementary 3c which is superfluous as only the number it presents is already stated in the text (page 5, line 9).

      We thank the Reviewer for the suggestion and agree that this panel is superfluous. It has been removed from the revised manuscript.

      New data and tables added during revisions:  

      (1) Table 3 – All log2 fold change values for the cell-based screen. Using this table, protein-centric solubility profiles can be plotted (as in Figures 2D and others). 

      (2) Table 4 – All log2 fold change values for the lysate-based screen. Using this table, protein-centric solubility profiles can be plotted (as in Figures 2D and others). 

      (3) Figure 1 – Figure supplement 3H – Table highlighting proteins that pass log2 fold change cutoffs, but not nSD cutoffs and vice versa. 

      (4) Figure 2 – Panels H and I were updated with a new color scheme. 

      (5) Figure 3 – Updated main figure and supplement at the request of Reviewer 3. 

      • Figure 3E – Compares on-target hits for the cell- and lysate-based screens for all compounds for which a target was quantified in both screens. 

      • Figure 3 – Figure supplement 2 – Highlights on-target hits in both screens, exclusively in cells, and exclusively in lysates. 

      (6) Figure 5 – PISA data for K562 lysates treated with AZD-7762 at multiple concentrations.

      • Figure 5F

      • Figure 5 – Figure supplement 3A-C

      • Figure 5 – Source data 2

      (7) Figure 5 – Phosphoproteomic profiling of K562 cells treated with AZD7762 or Bafetinib. 

      • Figure 5G

      • Figure 5 – Figure supplement 4A-F

      • Figure 5 – Source data 3 (phosphoproteome)

      • Figure 5 – Source data 4 (associated proteome data)

    2. Reviewer #1 (Public review):

      This paper describes proteome solubility analysis (PISA) of 96 compounds in living cells and 70 compounds in cell lysates. A wealth of information related to on- and off-target engagement is uncovered. This work fits well the eLife profile, will be of interest to a large community of proteomics researchers, and thus is likely to be reasonably highly cited.

    3. Reviewer #3 (Public review):

      Summary:

      This work aims to demonstrate how recent advances in thermal stability assays can be utilised to screen chemical libraries and determine compound mechanism of action. Focusing on 96 compounds with known mechanisms of action, they use the PISA assay to measure changes in protein stability upon treatment with a high dose (10 µM) in live K562 cells and whole cell lysates from K562 or HCT116. They intend this work to showcase a robust workflow which can serve as a roadmap for future studies.

      Strengths:

      The major strength of this study is the combination of live and whole cell lysates experiments. This allows the authors to compare the results from these two approaches to identify novel ligand-induced changes in thermal stability with greater confidence. More usefully, this also enables the authors to separate primary and secondary effects of the compounds within the live cell assay.

      The study also benefits from the number of compounds tested within the same framework, which allows the authors to make direct comparisons between compounds.

      These two strengths are combined when they compare between CHEK1 inhibitors and suggest that AZD-7762 likely induces secondary destabilisation of CRKL through off-target engagement with tyrosine kinases.

      Weaknesses:

      One of the stated benefits of PISA compared to the TPP in the original publication (Gaetani et al 2019) was that the reduced number of samples required allows more replicate experiments to be performed. Despite this, the authors of this study performed only duplicate experiments. They acknowledge this precludes use of frequentist statistical tests to identify significant changes in protein stability. Instead, they apply an 'empirically derived framework' in which they apply two thresholds to the fold change vs DMSO: absolute z-score (calculated from all compounds for a protein) > 3.5 and absolute log2 fold-change > 0.2. They state that the fold-change threshold was necessary to exclude non-specific interactors. While the thresholds appear relatively stringent, this approach will likely reduce the robustness of their findings in comparison to an experimental design incorporating more replicates. Firstly, the magnitude of the effect size should not be taken as a proxy for the importance of the effect. They acknowledge this and demonstrate it using their own data for PIK3CB and p38α inhibitors (Figure 2B-C). They have thus likely missed many small, but biologically relevant changes in thermal stability due to the fold-change threshold. Secondly, this approach relies upon the fold-changes between DMSO and compound for each protein being comparable, despite them being drawn from samples spread across 16 TMT multiplexes. Each multiplex necessitates a separate MS run and the quantification of a distinct set of peptides, from which the protein-level abundances are estimated. Thus, it is unlikely the fold-changes for unaffected proteins are drawn from the same distribution, which is an unstated assumption of their thresholding approach. The authors could alleviate the second concern by demonstrating that there is very little or no batch effect across the TMT multiplexes. However, the first concern would remain.

      The limitations of their approach could have been avoided with more replicates and use of an appropriate statistical test. It would be helpful if the authors could clarify if any of the missed targets passed the z-score threshold but fell below the fold-change threshold.
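
      The two-threshold rule described above, and the question of which measurements pass one cutoff but not the other, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `classify`, the category labels, and the toy values are hypothetical, while the default cutoffs (3.5 and 0.2) are those quoted in the review.

```python
def classify(z_score, log2_fc, z_cut=3.5, fc_cut=0.2):
    """Label a measurement by which of the two cutoffs it exceeds."""
    passes_z = abs(z_score) > z_cut
    passes_fc = abs(log2_fc) > fc_cut
    if passes_z and passes_fc:
        return "hit"
    if passes_z:
        return "z-score only"
    if passes_fc:
        return "fold-change only"
    return "neither"


# Toy values, invented for illustration only.
examples = {
    "protein_A": (6.0, 0.80),   # large, consistent shift
    "protein_B": (4.2, 0.15),   # consistent but small shift
    "protein_C": (1.1, -0.35),  # large but noisy shift
}
for name, (z_score, log2_fc) in examples.items():
    print(name, classify(z_score, log2_fc))
```

      Measurements in the "z-score only" category correspond to the small but consistent changes that the fold-change threshold would exclude.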

      The authors use a single, high concentration of 10 µM for all compounds. Given that many of the compounds may have low nM IC50s, this concentration could be orders of magnitude above the one at which they inhibit their target. This makes it difficult to assess the relevance of the off-target effects identified to clinical applications of the compounds or biological experiments. The authors acknowledge this and use ranges of concentrations for follow-up studies (e.g. Figure 2E-F). Nonetheless, this weakness is present for the vast bulk of the data presented.

      Aims achieved, impact and utility:

      The authors have achieved their main aim of presenting a workflow which serves to demonstrate the potential value of this approach. However, by using a single high dose of each compound and failing to adequately replicate their experiments and instead applying heuristic thresholds, they have limited the impact of their findings. Their results will be a useful resource for researchers wishing to explore potential off-target interactions and/or mechanisms of action for these 96 compounds but are expected to be superseded by more robust datasets in the near future. The most valuable aspect of the study is the demonstration that combining live cell and whole cell lysate PISA assays across multiple related compounds can help to elucidate the mechanisms of action.

    1. eLife Assessment

      Research on push-pull systems often focuses on controlled environments, limiting our understanding of their effectiveness under real-world conditions. This important study has validated how push-pull systems work in natural settings. However, the manuscript remains incomplete, since the findings have only been partially supported, as acknowledged by the authors.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript of Odermatt et al. investigates the volatiles released by two species of Desmodium plants and the response of herbivores to maize plants alone or in combination with these species. The results show that Desmodium releases volatiles in both the laboratory and the field. Maize grown in the laboratory also released volatiles, in a similar range. While female moths preferred to oviposit on maize, the authors found no evidence that Desmodium volatiles played a role in lowering attraction to or oviposition on maize.

      Strengths:

      The manuscript is a response to recently published papers that presented conflicting results with respect to whether Desmodium releases volatiles constitutively or in response to biotic stress, the level at which such volatiles are released, and the behavioral effect it has on the fall armyworm. These questions are relevant as Desmodium is used in a textbook example of pest-suppressive sustainable intercropping technology called push-pull, which has supported tens of thousands of smallholder farmers in suppressing moth pests in maize. A large number of research papers over more than two decades have implied that Desmodium suppresses herbivores in push-pull intercropping through the release of large amounts of volatiles that repel herbivores. This premise has been questioned in recent papers. Odermatt et al. thus contribute to this discussion by testing the role of odors in oviposition choice. The paper confirms that ovipositing FAW preferred maize, and also confirms that odors released from Desmodium appeared not to be important in their bioassays.

      The paper is a welcome addition to the literature and adds quality headspace analyses of Desmodium from the laboratory and the field. Furthermore, the authors, some of whom have long contributed to developing push-pull, also find that Desmodium odors are not significant in the moths' choice between maize plants. This advances our knowledge of the mechanisms through which push-pull suppresses herbivores, which is critically important to evolving the technique to fit different farming systems and translating this mechanism to fit with other crops and in other geographical areas.

      Weaknesses:

      Below I outline the major concerns:

      (1) Clear induction of the experimental plants, and lack of reflective discussion around this: from literature data and previous studies of maize and Desmodium, it is clear that the plants used in this study, particularly the Desmodium, were induced. Maize appeared to be primarily manually damaged, possibly due to sampling (release of GLV, but little to no terpenoids, which is indicative of mostly physical stress and damage; see, for example, one of the coauthors' own papers, Tamiru et al. 2011), whereas Desmodium releases a blend of many compounds (many terpenoids indicative of herbivore induction). Erdei et al. also clearly show that under controlled conditions maize, silver leaf and green leaf Desmodium release volatiles in very low amounts. While the condition of the plants in Odermatt et al. may be reflective of situations in push-pull fields, the authors should elaborate on the above in the discussion (see comments) so that readers understand the plants' condition during the experiments. This is particularly important because it has been assumed that Desmodium releases typical herbivore-induced volatiles constitutively, which is not the case (see Erdei et al. 2024). This reflection is currently lacking in the manuscript.

      (2) Lack of controls that would have provided context to the data: The experiments lack important controls that would have helped in the interpretation:

      (2a) The authors did not control the conditions of the plants. To understand the release of volatiles and their importance in the field, the authors should have included controlled herbivory in both maize and Desmodium. This would have placed the current volatile profiles in a herbivory context. Now the volatile measurements hang in midair, leading to discussions that are not well anchored (and should be rephrased thoroughly, see e.g. lines 183-188). It is well known that maize releases only very low levels of volatiles without abiotic and biotic stressors. However, this changes upon stress (GLVs by direct, physical damage and e.g. terpenoids upon herbivory, see above). Erdei et al. confirm this pattern in Desmodium. Not having these controls means that the authors need to put the data in the context of what has been published (see above).

      (2b) It would also have been better if the authors had sampled maize from the field while sampling Desmodium. Together with the above point (inclusion of herbivore-induced maize and Desmodium), the levels of volatile release by Desmodium would have been placed into context.

      (2c) To put the volatiles release in the context of push-pull, it would have been important to sample other plants which are frequently used as intercrop by smallholder farmers, but which are not considered effective as push crops, particularly edible legumes. Sampling the headspace of these plants, both 'clean' and herbivore-induced, would have provided a context to the volatiles that Desmodium (induced) releases in the field - one would expect unsuccessful push crops to not release any of these 'bioactive' volatiles (although 'bioactive' should be avoided) if these odors are responsible for the pest suppressive effect of Desmodium. Many edible intercrops have been tested to increase the adoption of push-pull technology but with little success.

      Because of the lack of the above, the conclusions the authors can draw from their data are weakened. The data are still valuable in the current discussion around push-pull, provided that a proper context is given in the discussion along the points above.

      (3) 'Tendency' of the authors to accept the odor hypothesis (i.e. that Desmodium odors are responsible for repelling FAW and thereby reducing infestation in maize under push-pull management) in spite of their own data: The authors tested the effects of odor in oviposition choice, both in a cage assay and in a 'wind tunnel'. From the cage experiments, it is clear that FAW preferred maize over Desmodium, confirming other reports (including Erdei et al. 2024). However, when choosing between two maize plants, one of which was placed next to Desmodium to which FAW had no tactile access (taste, structure, etc.), FAW chose equally. Similarly, in their wind tunnel setup (this term should not be used to describe the assay, see below), no preference was found between maize odor in the presence or absence of Desmodium. This too confirms results obtained by Erdei et al. (but adds an important element to it by using Desmodium plants that had been induced and released volatiles, contrary to Erdei et al. 2024). Even though no support was found for repellency by Desmodium odors, the authors in many instances in the manuscript (lines 30-33, 164-169, 202, 279, 284, 304-307, 311-312, 320) appear to elevate non-significant tendencies as being important. This misleads readers into thinking that these interactions were significant, and the discussion in fact presents them as confirmed. The authors should stay true to their own data obtained when testing the hypothesis of whether odors play a role in the pest-suppressive effect of push-pull.

      (4) Oviposition bioassay: with so many assays in close proximity, it is hard to certify that the experiments are independent. Please discuss this in the appropriate place in the discussion.

      (5) The wind tunnel has a number of issues (besides being poorly detailed):

      (5a) The setup which the authors refer to as a 'wind tunnel' does not qualify as a wind tunnel. First, there is no directional flow: there are two flows entering the setup at opposite sides. Second, the flow is way too low for moths to orient in (in a wind tunnel, wind should be presented as a directional cue). Only around 1.5 l/min enters the wind tunnel in a volume of approximately 90 l, which does not create any directional flow. Solution: change 'wind tunnel' throughout the text to a dual-choice setup/assay.

      (5b) There is no control over the flows in the flight section of the setup. It is very well possible that moths at the release point may only sense one of the 'options'. Please discuss this.

      (5c) Too low a flow (1.5 l per minute) implies largely stagnant air, which means cross-contamination between experiments. An experiment takes 5 minutes, but it takes minimally 1.5 hours at these flows to replace the flight chamber air (and in reality much longer, as the fresh air does not replace the old air but mixes with it). The setup does not seem to be equipped with e.g. fans to quickly vent the air out of the setup. See comments in the text. Please discuss the limitations of the experimental setup at the appropriate place in the discussion.

      (5d) The stimulus air enters through a tube (what type of tube, diameter, length, etc.) containing pressurized air (how was the air introduced into the bags; what type of bag was used, and how was it sealed?), with the efflux going directly into the flight chamber (how; through a nozzle?). However, it seems that there is no control of the efflux. How was leakage prevented, and in particular how were the bags sealed airtight around the plants?

      (5e) The plants were bagged in very narrowly fitting bags. The maize plants look bent and damaged, which probably explains the GLVs found in the samples. The Desmodium in the picture (Figure 5 supplement, which we should assume is at least a representative picture?) appears to be rather crammed into the bag with maize and looks in rather poor condition to start with (perhaps also indicating why they release these volatiles?). It would be good to describe the sampling of the plants in detail and explain that the way they were handled may have caused the release of GLVs.

      (6) Figure 1 seems redundant as a main figure in the text. Much of the information is not pertinent to the paper. It can be used in a review on the topic. Or perhaps if the authors strongly wish to keep it, it could be placed in the supplemental material.

    3. Reviewer #2 (Public review):

      Based on the controversy of whether the Desmodium intercrop emits bioactive volatiles that repel the fall armyworm, the authors conducted this study to assess the effects of the volatiles from Desmodium plants in the push-pull system on the oviposition behavior of FAW. This topic is interesting and the results are valuable for understanding the push-pull system for the management of FAW, a serious pest. The methodology used in this study is valid, leading to reliable results and conclusions. I just have a few concerns and suggestions for improvement of this paper:

      (1) The volatiles emitted from D. incanum were analyzed and their effects on the oviposition behavior of FAW moth were confirmed. However, it would be better and useful to identify the specific compounds that are crucial for the success of the push-pull system.

      (2) It would be good to add "symbols" of significance in Figure 4 (D).

      (3) Figure A is difficult for readers to understand.

      (4) It would be good to discuss in more depth the functions of the important volatile compounds identified here, in comparison with the results of previous studies.

    4. Author response:

      We thank both reviewers for their thorough and insightful feedback, which will contribute to improving our manuscript. In summary, the key concerns raised include the potential induction of GLV emission due to plant handling, limitations in the design of the "wind tunnel" bioassay, and the need for a deeper analysis of specific volatile compounds that contribute to the success of push-pull systems. We are happy to revise the entire manuscript according to all comments of the reviewers. This includes clarifying our methodology and providing a more reflective discussion on how physical stress might have influenced volatile emissions. Additionally, we will conduct new experiments with a modified bioassay setup to address concerns about directional cues and airflow control, minimizing cross-contamination. While the identification of individual compounds was beyond the scope of this study, we acknowledge its importance and propose it as a direction for future research.

      Reviewer #1 (Public review):

      Summary:

      The manuscript of Odermatt et al. investigates the volatiles released by two species of Desmodium plants and the response of herbivores to maize plants alone or in combination with these species. The results show that Desmodium releases volatiles in both the laboratory and the field. Maize grown in the laboratory also released volatiles, in a similar range. While female moths preferred to oviposit on maize, the authors found no evidence that Desmodium volatiles played a role in lowering attraction to or oviposition on maize.

      Strengths:

      The manuscript is a response to recently published papers that presented conflicting results with respect to whether Desmodium releases volatiles constitutively or in response to biotic stress, the level at which such volatiles are released, and the behavioral effect it has on the fall armyworm. These questions are relevant as Desmodium is used in a textbook example of pest-suppressive sustainable intercropping technology called push-pull, which has supported tens of thousands of smallholder farmers in suppressing moth pests in maize. A large number of research papers over more than two decades have implied that Desmodium suppresses herbivores in push-pull intercropping through the release of large amounts of volatiles that repel herbivores. This premise has been questioned in recent papers. Odermatt et al. thus contribute to this discussion by testing the role of odors in oviposition choice. The paper confirms that ovipositing FAW preferred maize, and also confirmed that odors released from Desmodium appeared not important in their bioassays.

      The paper is a welcome addition to the literature and adds quality headspace analyses of Desmodium from the laboratory and the field. Furthermore, the authors, some of whom have since long contributed to developing push-pull, also find that Desmodium odors are not significant in their choice between maize plants. This advances our knowledge of the mechanisms through which push-pull suppresses herbivores, which is critically important to evolving the technique to fit different farming systems and translating this mechanism to fit with other crops and in other geographical areas.

      Thank you for your careful assessment of our manuscript.

      Weaknesses:

      Below I outline the major concerns:

      (1) Clear induction of the experimental plants, and lack of reflective discussion around this: from literature data and previous studies of maize and Desmodium, it is clear that the plants used in this study, particularly the Desmodium, were induced. Maize appeared to be primarily manually damaged, possibly due to sampling (release of GLV, but little to no terpenoids, which is indicative of mostly physical stress and damage; see, for example, one of the coauthors' own papers, Tamiru et al. 2011), whereas Desmodium releases a blend of many compounds (many terpenoids indicative of herbivore induction). Erdei et al. also clearly show that under controlled conditions maize, silver leaf and green leaf Desmodium release volatiles in very low amounts. While the condition of the plants in Odermatt et al. may be reflective of situations in push-pull fields, the authors should elaborate on the above in the discussion (see comments) such that readers understand the plants' condition during the experiments. This is particularly important because it has been assumed that Desmodium releases typical herbivore-induced volatiles constitutively, which is not the case (see Erdei et al. 2024). This reflection is currently lacking in the manuscript.

      We acknowledge the need for a more reflective discussion on the possible causes of GLV (green leaf volatiles) emission, particularly regarding physical damage. Although the field plants were carefully handled, it is possible that some physical stress may have contributed to the release of GLVs. We will ensure the revised manuscript reflects this nuanced interpretation. However, we will also explain more clearly that our aim was to capture the volatile emission of plants used by farmers under realistic conditions and moth responses to these plants, not to be able to attribute the volatile emission to a specific cause. We think that this is also clear in the manuscript. However, we plan to revise relevant passages throughout the manuscript to ensure that we do not make any claims about the reason for volatile emissions, and that our claims regarding these plants and their headspace being representative of the system as practiced by farmers are supported. In the revised manuscript we will explain better that the volatile profiles comprise a majority of non-GLV compounds. As shown in figure 1, the majority of the substances that were found in the headspace of the sampled plants of Desmodium intortum or Desmodium incanum are non-GLV monoterpenes, sesquiterpenes, or aromatic compounds. We will also note that the experimental plants used in the study were grown in insect proof screenhouses and were checked for any insect damage before volatile collection and bioassay.

      (2) Lack of controls that would have provided context to the data: The experiments lack important controls that would have helped in the interpretation:

      (2a) The authors did not control the conditions of the plants. To understand the release of volatiles and their importance in the field, the authors should have included controlled herbivory in both maize and Desmodium. This would have placed the current volatile profiles in a herbivory context. Now the volatile measurements hang in midair, leading to discussions that are not well anchored (and should be rephrased thoroughly, see eg lines 183-188). It is well known that maize releases only very low levels of volatiles without abiotic and biotic stressors. However, this changes upon stress (GLVs by direct, physical damage and eg terpenoids upon herbivory, see above). Erdei et al. confirm this pattern in Desmodium. Not having these controls, means that the authors need to put the data in the context of what has been published (see above).

      We appreciate this concern. Our study aimed to capture the real-world conditions of push-pull fields, where Desmodium and maize grow in natural environments without the direct induction of herbivory for experimental purposes. We will update the discussion to provide better context based on existing literature regarding the volatile release under stress conditions. We agree that in further studies it would be important to carry out experiments under different environmental conditions, including herbivore damage. However, this was not within the scope of the present study.

      (2b) It would also have been better if the authors had sampled maize from the field while sampling Desmodium. Together with the above point (inclusion of herbivore-induced maize and Desmodium), the levels of volatile release by Desmodium would have been placed into context.

      We acknowledge that sampling maize and other intercrop plants, such as edible legumes, alongside Desmodium in the push-pull field would have allowed us to make direct comparisons of the volatile profiles of different plants in the push-pull system under shared field conditions. Again, this should be done in future experiments but was beyond the scope of the present study. Given the number of samples we could handle within the available budget and workload, we chose to focus on Desmodium because there is much less literature on the volatile profiles of field-grown Desmodium than on those of maize plants in the field: we are aware of one study attempting to measure field volatile profiles from Desmodium intortum (Erdei et al. 2024) and no study attempting this for Desmodium incanum. We will point out this justification for our focus on Desmodium in the manuscript. Additionally, we will suggest in the discussion that future studies should measure volatile profiles from maize and intercrop legumes alongside Desmodium and border grass in push-pull fields.

      (2c) To put the volatiles release in the context of push-pull, it would have been important to sample other plants which are frequently used as intercrop by smallholder farmers, but which are not considered effective as push crops, particularly edible legumes. Sampling the headspace of these plants, both 'clean' and herbivore-induced, would have provided a context to the volatiles that Desmodium (induced) releases in the field - one would expect unsuccessful push crops to not release any of these 'bioactive' volatiles (although 'bioactive' should be avoided) if these odors are responsible for the pest suppressive effect of Desmodium. Many edible intercrops have been tested to increase the adoption of push-pull technology but with little success.

      Again, we very much agree that such measurements are important for the longer-term research program in this field. For the current study, however, they would have increased the size of the required experiment beyond what was feasible. Regarding bioactivity, we have been careful to use the phrase "potentially bioactive", or to cite other studies showing bioactivity, where we have not demonstrated bioactivity ourselves.

      Because of the lack of the above, the conclusions the authors can draw from their data are weakened. The data are still valuable in the current discussion around push-pull, provided that a proper context is given in the discussion along the points above.

      We agree that our study is limited to its specific aims. We think the revisions will make these aims more explicit and help to avoid misleading claims.

      (3) 'Tendency' of the authors to accept the odor hypothesis (i.e. that Desmodium odors are responsible for repelling FAW and thereby reduce infestation in maize under push-pull management) in spite of their own data: The authors tested the effects of odor in oviposition choice, both in a cage assay and in a 'wind tunnel'. From the cage experiments, it is clear that FAW preferred maize over Desmodium, confirming other reports (including Erdei et al. 2024). However, when choosing between two maize plants, one of which was placed next to Desmodium to which FAW had no tactile access (taste, structure, etc.), FAW chose equally. Similarly, in their wind tunnel setup (this term should not be used to describe the assay, see below), no preference was found between maize odor in the presence or absence of Desmodium. This too confirms results obtained by Erdei et al. (but adds an important element by using Desmodium plants that had been induced and released volatiles, contrary to Erdei et al. 2024). Even though no support was found for repellency by Desmodium odors, the authors in many instances in the manuscript (lines 30-33, 164-169, 202, 279, 284, 304-307, 311-312, 320) appear to elevate non-significant tendencies as being important. This misleads readers into thinking that these interactions were significant, an impression that the discussion in fact confirms. The authors should stay true to their own data obtained when testing the hypothesis of whether odors play a role in the pest-suppressive effect of push-pull.

      We appreciate this feedback and agree that we may have overstated claims that could not be supported by strict significance tests. However, we believe that non-significant tendencies can still provide valuable insights. In the revised version of the manuscript, we will ensure a clear distinction between statistically significant findings and non-significant trends and remove any language that may imply stronger support for the odor hypothesis than what the data show.

      (4) Oviposition bioassay: with so many assays in close proximity, it is hard to certify that the experiments are independent. Please discuss this in the appropriate place in the discussion.

      We pointed this out in the submitted manuscript in lines 275–279. Furthermore, we include detailed captions for figure 4 - supporting figure 3 and figure 4 - supporting figure 4. We are aware that in all such experiments there is a danger of between-treatment interference, which we will point out for our specific case. We will also mention that this common caveat does not invalidate experimental designs that use replication and randomization, and that insects can be assumed to select suitable oviposition sites against a background of such confounding factors under realistic conditions. We will also mention explicitly that with our experimental setup we tried to minimize interference between treatments by spacing and temporal staggering.

      (5) The wind tunnel has a number of issues (besides being poorly detailed):

      (5a) The setup which the authors refer to as a 'wind tunnel' does not qualify as a wind tunnel. First, there is no directional flow: there are two flows entering the setup at opposite sides. Second, the flow is far too low for moths to orient in (in a wind tunnel, wind should be presented as a directional cue). Only around 1.5 l/min enters the setup, which has a volume of approximately 90 l, and this does not create any directional flow. Solution: change 'wind tunnel' throughout the text to a dual-choice setup/assay.

      We agree with these criticisms and will change the terminology accordingly. We also plan to conduct an additional experiment with a no-choice arena that provides conditions closer to a true wind tunnel. The setup of the added experiment features an odor entry point at only one side of the chamber to create a more directional airflow. Each treatment (maize alone, maize + D. intortum, maize + D. incanum, and a control with no plants) will be tested separately, with only one treatment conducted per evening to avoid cross-contamination.

      (5b) There is no control over the flows in the flight section of the setup. It is very well possible that moths at the release point may only sense one of the 'options'. Please discuss this.

      We will add this to the discussion. The newly planned assays also address this concern by using a setup with laminar flow.

      (5c) Too low a flow (1.5 l per minute) implies largely stagnant air, which means cross-contamination between experiments. An experiment takes 5 minutes, but it takes minimally 1.5 hours at these flows to replace the flight chamber air (and in reality much longer, as the fresh air does not replace the old air but mixes with it). The setup does not seem to be equipped with e.g. fans to quickly vent the air out of the setup. See comments in the text. Please discuss the limitations of the experimental setup at the appropriate place in the discussion.

      We will add these limitations to the discussion and will address these concerns with new experiments (see answer 5a).

      (5d) The stimulus air enters through a tube (what type of tube, diameter, length, etc.) containing pressurized air (how was the air introduced into the bags; what type of bag was used, and how was it sealed?), with the efflux going directly into the flight chamber (how; through a nozzle?). However, it seems that there is no control of the efflux. How was leakage prevented, and in particular how were the bags sealed airtight around the plants?

      We will add the missing information to the methods and provide details about types of bags, manufacturers, and pre-treatments. In short, Teflon tubes connected bagged plants to the bioassay setup and air was pumped in at an overpressure, so leakage was not eliminated but contamination from ambient air was avoided.

      (5e) The plants were bagged in very narrowly fitting bags. The maize plants look bent and damaged, which probably explains the GLVs found in the samples. The Desmodium in the picture (Figure 5 supplement, which we should assume is at least a representative picture?) appears to be rather crammed into the bag with maize and looks in rather poor condition to start with (perhaps also indicating why they release these volatiles?). It would be good to describe the sampling of the plants in detail and explain that the way they were handled may have caused the release of GLVs.

      We will include a more detailed description of the plant handling and bagging processes in the methods to clarify how the plants were treated during all assays reported in the submitted manuscript and in the newly planned assays. This will address concerns about the possible influence of plant stress, such as GLV emission due to bagging, on the results. We respectfully disagree that the maize plants were damaged and that the Desmodium plants were not representative of those encountered in the field. The Desmodium plant pictured was D. incanum, which has sparser foliage and smaller leaves than D. intortum.

      (6) Figure 1 seems redundant as a main figure in the text. Much of the information is not pertinent to the paper. It can be used in a review on the topic. Or perhaps if the authors strongly wish to keep it, it could be placed in the supplemental material.

      We think that Figure 1 provides essential information about the push-pull system and the FAW. To our knowledge, this partly contradictory evidence has not so far been synthesized in the literature. We realize that such a figure would more commonly be provided in a review article, but we do not think that the small number of studies on this topic so far justifies a stand-alone review. Instead, the introduction to our manuscript includes a brief review of these few studies, complemented by the visual summary provided in Figure 1 and a detailed supplementary table. We will revise the figure and associated text in the introduction to highlight its relevance for the current study and to reduce redundant information.

      Reviewer #2 (Public review):

      Based on the controversy of whether the Desmodium intercrop emits bioactive volatiles that repel the fall armyworm, the authors conducted this study to assess the effects of the volatiles from Desmodium plants in the push-pull system on the oviposition behavior of FAW. This topic is interesting and the results are valuable for understanding the push-pull system for the management of FAW, a serious pest. The methodology used in this study is valid, leading to reliable results and conclusions. I just have a few concerns and suggestions for improvement of this paper:

      (1) The volatiles emitted from D. incanum were analyzed and their effects on the oviposition behavior of FAW moth were confirmed. However, it would be better and useful to identify the specific compounds that are crucial for the success of the push-pull system.

      We fully agree that identifying specific volatile compounds responsible for the push-pull effect would provide valuable insights into the underlying mechanisms of the system. However, the primary focus of this study was to address the still unresolved question of whether Desmodium emits volatiles at all under field conditions, and the secondary aim was to test whether we could demonstrate a behavioral effect of Desmodium headspace on FAW moths. Before conducting our experiments, we carefully considered the option of using single volatile compounds and synthetic blends in bioassays. We decided against this because we judged that the contradictory evidence in the literature was not a sufficient basis for composing representative blends. Furthermore, we think it is an important first step to test for behavioral responses to the headspaces of real plants. We consider bioassays with pure compounds to be important for confirmation and more detailed investigation in future studies. There was also contradictory evidence in the literature regarding moth responses to plants. We thus opted to focus on experiments with whole plants to maintain ecological relevance.

      (2) It would be good to add "symbols" of significance in Figure 4 (D).

      We report the statistical significance of the parameters in Figure 4 (D) in Table 3. While testing significance between groups is a standard approach, we used a more robust model-based analysis to assess the effects of multiple factors simultaneously. We will clarify this in the figure legend and provide a cross-reference to Table 3 for readers to easily find the statistical details.

      (3) Figure A is difficult for readers to understand.

      Unfortunately, it is not entirely clear which specific figure is being referred to as "Figure A" in this comment. We kindly request further clarification on which figure needs improvement, and we will make adjustments accordingly to ensure that all figures are easily comprehensible for readers.

      (4) It would be good to discuss in more depth the functions of the important volatile compounds identified here, in comparison with the results of previous studies.

      Our study does not provide strong evidence that specific volatiles from Desmodium plants are important determinants of FAW oviposition or choice in the push-pull system. Therefore, we prefer to refrain from detailed discussions of the potential importance of individual compounds. However, in the revised version, we will indicate specifically which of the volatiles we identified overlap with those previously reported from Desmodium, as only the total numbers are summarized in the discussion of the submitted paper.

    1. eLife Assessment

      This study attempts to understand the functional roles of the human DCP1 paralogs in regulating RNA decay by DCP2. Using a combination of cellular-based assays and in vitro assays, the authors conclude that DCP1a/b plays a role in regulating DCP2 activity. While this revised version presents some new and interesting observations on human DCP1, the underlying data to support its claims remain incomplete. Overall, these results will be useful to the RNA community.

    2. Reviewer #1 (Public review):

      Summary & Assessment:

      The catalytic core of the eukaryotic decapping complex consists of the decapping enzyme DCP2 and its key activator DCP1. In humans, there are two paralogs of DCP1, DCP1a and DCP1b, that are known to interact with DCP2 and recruit additional cofactors or coactivators to the decapping complex; however, the mechanisms by which DCP1 activates decapping and the specific roles of DCP1a versus DCP1b, remain poorly defined. In this manuscript, the authors used CRISPR/Cas9-generated DCP1a/b knockout cells to begin to unravel some of the differential roles for human DCP1a and DCP1b in mRNA decapping, gene regulation, and cellular metabolism. While this manuscript presents some new and interesting observations on human DCP1 (e.g. human DCP1a/b KO cells are viable and can be used to investigate DCP1 function; only the EVH1 domain, and not its disordered C-terminal region which recruits many decapping cofactors, is apparently required for efficient decapping in cells; DCP1a and b target different subsets of mRNAs for decay and may regulate different aspects of metabolism), there is one key claim about the role of DCP1 in regulating DCP2-mediated decapping that is still incompletely or inconsistently supported by the presented data in this revised version of the manuscript.

      Strengths & well-supported claims:

      • Through in vivo tethering assays in CRISPR/Cas9-generated DCP1a/b knockout cells, the authors show that DCP1 depletion leads to significant defects in decapping and the accumulation of capped, deadenylated mRNA decay intermediates.
      • DCP1 truncation experiments reveal that only the EVH1 domain of DCP1 is necessary to rescue decapping defects in DCP1a/b KO cells.
      • RNA and protein immunoprecipitation experiments suggest that DCP1 acts as a scaffold to help recruit multiple decapping cofactors to the decapping complex (e.g. EDC3, DDX6, PATL1, PNRC1, and PNRC2), but that none of these cofactors are essential for DCP2-mediated decapping in cells.
      • The authors investigated the differential roles of DCP1a and DCP1b in gene regulation through transcriptomic and metabolomic analysis and found that these DCP1 paralogs target different mRNA transcripts for decapping and have different roles in cellular metabolism and their apparent links to human cancers. (Although I will note that I can't comment on the experimental details and/or rigor of the transcriptomic and metabolomic analyses, as these are outside my expertise.)

      Weaknesses & incompletely supported claims:

      (1) One of the key mechanistic claims of the paper is that "DCP1a can regulate DCP2's cellular decapping activity by enhancing DCP2's affinity to RNA, in addition to bridging the interactions of DCP2 with other decapping factors. This represents a pivotal molecular mechanism by which DCP1a exerts its regulatory control over the mRNA decapping process." Similar versions of this claim are repeated in the abstract and discussion sections. However, this claim appears to be at odds with the observations that: (a) in vitro decapping assays with immunoprecipitated DCP2 show that DCP1 knockout does not significantly affect the enzymatic activity of DCP2 (Fig 2C&D; I note that there may be a very small change in DCP2 activity shown in panel D, but this may be due to slightly different amounts of immunoprecipitated DCP2 used in the assay); and (b) the authors show only weak changes in relative RNA levels immunoprecipitated by DCP2 with versus without DCP1 (~2-3 fold change in Fig 3H, where expression of the EVH1 domain, previously shown in this manuscript to fully rescue the DCP1 KO decapping defects in cells, looks to be almost within error of the control in terms of increasing RNA binding). If DCP1 pivotally regulates decapping activity by enhancing RNA binding to DCP2, why is no difference in in vitro decapping activity observed in the absence of DCP1, and very little change observed in the amounts of RNA immunoprecipitated by DCP2 with the addition of the DCP1 EVH1 domain?

      In the revised manuscript and in their response to initial reviews, the authors rightly point out that in vivo effects may not always be fully reflected by or recapitulated in in vitro experiments due to the lack of cellular cofactors and simpler environment for the in vitro experiment, as compared to the complex environment in the cell. I fully agree with this of course! And further completely agree with the authors that this highlights the critical importance of in cell experiments to investigate biological functions and mechanisms! However, because the in vitro kinetic and IP/binding data both suggest that the DCP1 EVH1 domain has minimal to no effects on RNA decapping or binding affinity, while the in cell data suggest the EVH1 domain alone is sufficient to rescue large decapping defects in DCP1a/b KO cells (and that all the decapping cofactors tested were dispensable for this), I would argue there is insufficient evidence here to make a claim that (maybe weakly) enhanced RNA binding induced by DCP1 is what is regulating the cellular decapping activity. Maybe there are as-yet-untested cellular cofactors that bind to the EVH1 domain of DCP1 that change either RNA recruitment or the kinetics of RNA decapping in cells; we can't really tell from the presented data so far. Furthermore, even if it is the case that the EVH1 domain modestly enhances RNA binding to DCP2, the authors haven't shown that this effect is what actually regulates the large change in DCP2 activity upon DCP1 KO observed in the cell.

      Overall, while I absolutely appreciate that there are many possible reasons for the differences observed in the in vitro versus in cell RNA decapping and binding assays, because this discrepancy between those data exists, it seems difficult to draw any clear conclusions about the actual mechanisms by which DCP1 helps regulate RNA decapping by DCP2. For example, in the cell it could be that DCP1 enhances RNA binding, or recruits unidentified cofactors that themselves enhance RNA binding, or that DCP1 allosterically enhances DCP2-mediated decapping kinetics, or a combination of these, etc; my point is that without in vitro data that clearly support one of those mechanisms and links this mechanism back to cellular DCP2 decapping activity (for example, in cell data that show EVH1 mutants that impair RNA binding fail to rescue DCP1 KO decapping defects), it's difficult to attribute the observed in cell effects of DCP1a/b KO and rescue by the EVH1 domain directly to enhancement of RNA binding (precisely because, as the authors describe, the decapping process and regulation may be very complex in the cell!).

      This contradiction between the in vitro and in-cell decapping data undercuts one of the main mechanistic takeaways from the first half of the paper; I still think this conclusion is overstated in the revised manuscript.

      Additional minor comment:

      • Related to point (1) above, the kinetic analysis presented in Fig 2C shows that the large majority of transcript is mostly decapped at the first 5 minute timepoint; it may be that DCP2-mediated decapping activity is actually different in vitro with or without DCP1, but that this is being missed because the reaction is basically done in less than 5 minutes under the conditions being assayed (i.e. these are basically endpoint assays under these conditions). It may be that if kinetics were done under conditions to slow down the reaction somewhat (e.g. lower Dcp2 concentration, lower temperatures), so that more of the kinetic behavior is captured, the apparent discrepancy between in vitro and in-cell data would be much less. Indeed, previous studies have shown that in yeast, Dcp1 strongly activates the catalytic step (kcat) of decapping by ~10-fold, and reduces the KM by only ~2 fold (Floor et al, NSMB 2010). It might be beneficial to use purified proteins here, if possible, to better control reaction conditions.

      In their response to initial reviews, the authors comment that they tried to purify human DCP2 from E. coli, but were unable to obtain active enzyme in this way. Fair enough! I will only comment that just varying the relative concentration of immunoprecipitated DCP2 would likely be enough to slow down the reaction and see if activity differences are seen in different kinetic regimes, without the need to obtain fully purified / recombinant Dcp2.
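
      The endpoint-assay concern can be illustrated with a minimal sketch. It assumes single-exponential (pseudo-first-order) decapping and an observed rate constant that scales linearly with enzyme concentration; the rate constants and the 20-fold dilution below are hypothetical values chosen for illustration, not measurements from the manuscript.

```python
import math

# Hypothetical observed rate constants (per minute); NOT measured values.
K_WITH_DCP1 = 10.0
K_WITHOUT_DCP1 = 1.0  # assumed to be 10-fold slower without DCP1


def fraction_decapped(k_obs_per_min, t_min):
    """Fraction of substrate decapped at time t for a single-exponential reaction."""
    return 1.0 - math.exp(-k_obs_per_min * t_min)


def compare(t_min, dilution=1.0):
    """Return (with DCP1, without DCP1) fractions decapped at t_min.

    Assumes k_obs scales linearly with enzyme concentration, so diluting
    the enzyme by `dilution` divides both rate constants by that factor.
    """
    return (
        fraction_decapped(K_WITH_DCP1 / dilution, t_min),
        fraction_decapped(K_WITHOUT_DCP1 / dilution, t_min),
    )


if __name__ == "__main__":
    for label, dilution in (("undiluted", 1.0), ("20-fold diluted", 20.0)):
        with_dcp1, without_dcp1 = compare(5.0, dilution)
        print(f"{label}: +DCP1 {with_dcp1:.3f}, -DCP1 {without_dcp1:.3f}")
```

      Under these assumptions, both reactions are essentially complete at the first 5-minute timepoint when undiluted, so a 10-fold rate difference is invisible; after dilution the same difference becomes easy to see at that timepoint.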

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Weaknesses & incompletely supported claims:

      (1) A central mechanistic claim of the paper is that "DCP1a can regulate DCP2's cellular decapping activity by enhancing DCP2's affinity to RNA, in addition to bridging the interactions of DCP2 with other decapping factors. This represents a pivotal molecular mechanism by which DCP1a exerts its regulatory control over the mRNA decapping process." Similar versions of this claim are repeated in the abstract and discussion sections. However, this appears to be entirely at odds with the observation from in vitro decapping assays with immunoprecipitated DCP2 that showed DCP1 knockout does not significantly affect the enzymatic activity of DCP2 (Figures 2B-D; I note that there may be a very small change in DCP2 activity shown in panel C, but this may be due to slightly different amounts of immunoprecipitated DCP2 used in the assay, as suggested by panel D). If DCP1 pivotally regulates decapping activity by enhancing RNA binding to DCP2, why is no difference in decapping activity observed in the absence of DCP1?

      Furthermore, the authors show only weak changes in relative RNA levels immunoprecipitated by DCP2 with versus without DCP1 (~2-3 fold change; consistent with the Valkov 2016 NSMB paper, which shows what looks like only modest changes in RNA binding affinity for yeast Dcp2 +/- Dcp1). Is the argument that only a 2-3 fold change in RNA binding affinity is responsible for the sizable decapping defects and significant accumulation of deadenylated intermediates observed in cells upon Dcp1 depletion? (and if so, why is this the case for in-cell data, but not the immunoprecipitated in vitro data?)

      We appreciate the reviewer's thoughtful comments on our paper. The reviewer points out an apparent contradiction between the claim that DCP1a regulates DCP2's cellular decapping activity and the observation that knocking out DCP1a does not significantly affect DCP2's enzymatic activity in vitro. However, it is important to underscore the challenge of reconciling differences between in vitro and in vivo experiments in scientific research. Although in vitro systems provide a controlled environment, they have inherent limitations that often fail to capture the complexities of cellular processes. Our in vitro experiments used immunoprecipitated proteins to ensure the presence of relevant factors, but these experiments cannot fully replicate the precise stoichiometry and dynamic interactions present in a cellular environment. Furthermore, the limited volume in vitro can actually facilitate reactions that may not occur as readily in the complex and heterogeneous environment of a cell. Therefore, the lack of a significant difference in decapping activity observed in vitro does not necessarily negate the regulatory role of DCP1 in the cellular context. Rather, it underscores our previous oversight of DCP1's importance in the decapping process under in vitro conditions. The conclusions regarding DCP1's regulatory mechanisms remain valid and supported by the presented evidence, especially when considering the inherent differences between in vitro and in vivo experimental conditions. It is precisely because of these differences that we recognized our previous underestimation of DCP1's significance. Therefore, our subsequent experiments focused on elucidating DCP1's regulatory mechanisms in the decapping process.

      The authors acknowledge this apparent discrepancy between the in vitro DCP2 decapping assays and in-cell decapping data, writing: "this observation could be attributed to the inherent constraints of in vitro assays, which often fall short of faithfully replicating the complexity of the cellular environment where multiple factors and cofactors are at play. To determine the underlying cause, we postulated that the observed cellular decapping defect in DCP1a/b knockout cells might be attributed to DCP1 functioning as a scaffold." This is fair. They next show that DCP1 acts as a scaffold to recruit multiple factors to DCP2 in cells (EDC3, DDX6, PatL1, and PNRC1 and 2). However, while DCP1 is shown to recruit multiple cofactors to DCP2 (consistent with other studies in the decapping field, and primarily through motifs in the Dcp1 C-terminal tail), the authors ultimately show that *none* of these cofactors are actually essential for DCP2-mediated decapping in cells (Figures 3A-F). More specifically, the authors showed that the EVH1 domain was sufficient to rescue decapping defects in DCP1a/b knockout cells, that PNRC1 and PNRC2 were the only cofactors that interact with the EVH1 domain, and finally that shRNA-mediated PNRC1 or PNRC2 knockdown has no effect on in-cell decapping (Figures 3E and F). Therefore, based on the presented data, while DCP1 certainly does act as a scaffold, it doesn't seem to be the case that the major cellular decapping defect observed in DCP1a/b knockout is due to DCP1's ability to recruit specific cofactors to DCP2.

      The findings that none of the decapping cofactors recruited by DCP1 to DCP2 are essential for decapping in cells further underscore the complexity of the decapping process in vivo. This observation suggests that while DCP1's scaffolding function is crucial for recruiting cofactors, the decapping process likely involves additional layers of regulation that are not fully captured by our current understanding of DCP1. Furthermore, the reviewer mentions that the observed changes in RNA binding affinity (approximately 2-3 fold) in our in vitro experiments seem relatively modest. While these changes may appear insignificant in vitro, their cumulative impact in the dynamic cellular environment could be substantial. Even minor perturbations in RNA binding affinity can trigger cascading effects, leading to significant changes in decapping activity and the accumulation of deadenylated intermediates upon Dcp1 depletion. Cellular processes involve complex networks of interrelated events, and small molecular changes can result in amplified biological outcomes. The subtle molecular variations observed in vitro may translate into significant phenotypic outcomes within the complex cellular environment, underscoring the importance of DCP1a's regulatory role in the cellular decapping process.

      So as far as I can tell, the discrepancy between the in vitro (DCP1 not required) and in-cell (DCP1 required) decapping data, remains entirely unresolved. Therefore, I don't think that the conclusions that DCP1 regulates decapping by (a) changing RNA binding affinity (authors show this doesn't matter in vitro, and that the change in RNA binding affinity is very small) or (b) by bridging interactions of cofactors with DCP2 (authors show all tested cofactors are dispensable for robust in-cell decapping activity), are supported by the evidence presented in the paper (or convincingly supported by previous structural and functional studies of the decapping complex).

      We have addressed the reconciliation of differences between in vitro and in vivo experiments in the revised manuscript and emphasized the importance of considering cellular interactions when interpreting our findings.

      (2) Related to the RNA binding claims mentioned above, are the differences shown in Figure 3H statistically significant? Why are there no error bars shown for the MBP control? (I understand this was normalized to 1, but presumably, there were 3 biological replicates here that have some spread of values?). The individual data points for each replicate should be displayed for each bar so that readers can better assess the spread of data and the significance of the observed differences. I've listed these points as major because of the key mechanistic claim that DCP1 enhances RNA binding to DCP2 hinges in large part on this data.

      Thank you for your feedback. Regarding your comments on the statistical significance of the differences shown in Figure 3H and the absence of error bars for the MBP control, we will address these concerns in the revised manuscript. We’ll include individual data points for the three biological replicates and corresponding statistical analysis to more clearly demonstrate the data spread and significance of the observed differences.

      (3) Also related to point (1) above, the kinetic analysis presented in Figure 2C shows that the large majority of transcript is mostly decapped at the first 5-minute timepoint; it may be that DCP2-mediated decapping activity is actually different in vitro with or without DCP1, but that this is being missed because the reaction is basically done in less than 5 minutes under the conditions being assayed (i.e. these are basically endpoint assays under these conditions). It may be that if kinetics were done under conditions to slow down the reaction somewhat (e.g. lower Dcp2 concentration, lower temperatures), so that more of the kinetic behavior is captured, the apparent discrepancy between in vitro and in-cell data would be much less. Indeed, previous studies have shown that in yeast, Dcp1 strongly activates the catalytic step (kcat) of decapping by ~10-fold, and reduces the KM by only ~2 fold (Floor et al, NSMB 2010). It might be beneficial to use purified proteins here (only a Western blot is used in Figure 2D to show the presence of DCP2 and/or DCP1, but do these complexes have other, and different, components immunoprecipitated along with them?), if possible, to better control reaction conditions.

      This contradiction between the in vitro and in-cell decapping data undercuts one of the main mechanistic takeaways from the first half of the paper. This needs to be addressed/resolved with further experiments to better define the role of DCP1-mediated activation, or the mechanistic conclusions significantly changed or removed.

      We genuinely appreciate the reviewer’s insightful comments on the kinetic analysis presented in Figure 2C. Your astute observation regarding the potential influence of reaction duration on the interpretation of in vitro decapping activity, especially in the absence of DCP1, is well-received. The time-sensitive nature of our experiments, as you rightly pointed out, might not fully capture the nuanced kinetic behaviors. In addition, the DCP2 complex purified from cells could not be precisely quantified. In response to your suggestion, we attempted to purify human DCP2 protein from E. coli; however, regrettably, the purified protein failed to exhibit any enzymatic activity. This disparity may be attributed to species differences.

      Considering the reviewer’s valuable insights, our revised manuscript emphasized that purified DCP2 from cells exhibits activity regardless of the presence of DCP1. This adjustment aims to provide a clearer perspective on our findings and to better align with the nuances of our experimental design and the meticulous consideration of the results.

      (4) The second half of the paper compares the transcriptomic and metabolic profiles of DCP1a versus DCP1b knockouts to reveal that these target a different subset of mRNAs for degradation and have different levels of cellular metabolites. This is a great application of the DCP1a/b KO cells developed in this paper and provides new information about DCP1a vs b function in metazoans, which to my knowledge has not really been explored at all. However, the analysis of DCP1 function/expression levels in human cancer seems superficial and inconclusive: for example, the authors conclude that "...these findings indicate that DCP1a and DCP1b likely have distinct and non-redundant roles in the development and progression of cancer", but what is the evidence for this? I see that DCP1a and b levels vary in different cancer cell types, but is there any evidence that these changes are actually linked to cancer development, progression, or tumorigenesis? If not, these broader conclusions should be removed.

      Thank you to the reviewer for pointing out that such a description may be misleading. We have removed our previous broader conclusion and revised our sentences. To further explore the potential impact of DCP1a and DCP1b on cancer progression, we examined the association between the expression levels of DCP1a and DCP1b and progression-free interval (PFI). We have incorporated this information into our revised manuscript.

      (5) The authors used CRISPR-Cas9 to introduce frameshift mutations that result in premature termination codons in DCP1a/b knockout cells (verified by Sanger sequencing). They then use Western blotting with DCP1a or DCP1b antibodies to confirm the absence of DCP1 in the knockout cell lines. However, the DCP1a antibody used in this study (Sigma D5444) is targeted to the C-terminal end of DCP1a. Can the authors conclusively rule out that the CRISPR/Cas-generated mutations result in the production of a truncated DCP1a that simply cannot be detected by the C-terminally targeted antibody? While it is likely that the introduced premature termination codon in the DCP1a gene results in nonsense-mediated decay of the resulting transcript (an outcome supported by the knockout results showing large defects in cellular decapping, which can be rescued by the addition of the EVH1 domain), it would be better to carefully validate the success of the DCP1a knockout and conclusively show that no truncated DCP1a is produced by using N-terminally targeted DCP1a antibodies (as was the case for DCP1b).

      Thank you for your insightful comment regarding the validation of our DCP1a/b knockout cell line. We acknowledge your point about the DCP1a C-terminal targeting of the Sigma D5444 antibody used in our Western blot analysis. We agree that we cannot definitively rule out the possibility of truncated DCP1a protein production solely based on the lack of full-length protein detection. To address this limitation, we utilized a commercially available N-terminally targeted DCP1a antibody (Aviva ARP39353_T100) in a Western blot analysis. This allows us to detect any truncated protein fragments remaining after the CRISPR-Cas9-generated frameshift mutation.

      Some additional minor comments:

      • More information would be helpful on the choice of DCP1 truncation boundaries; why was 1-254 chosen as one of the truncations?

      Thank you for the reviewer's comment and suggestion. The DCP1 1-254 truncation boundaries were chosen based on the predicted structure from AlphaFoldDB (A0A087WT55). We will include this information in the revised manuscript.

      • Figure S2D is a pretty important experiment because it suggests that the observed deadenylated intermediates are in fact still capped; can a positive control be added to these experiments to show that removal of cap results in rapid terminator-mediated degradation?

      Unfortunately, due to our institution's current laboratory safety policies, we are unable to perform experiments involving the use of radioactive isotopes such as 32P. Therefore, while adding the suggested positive control experiment to demonstrate rapid RNA degradation upon decapping would further validate our interpretation, we regret that we cannot carry out this experiment at the moment. However, the observed deadenylated intermediates in Figure S2D match the predicted size of capped RNA fragments, and not the expected sizes of degradation products after decapping. Furthermore, previous literature has well-established that for these types of RNAs, decapping leads directly to rapid 5' to 3' exonuclease-mediated degradation, without producing stable deadenylated intermediates. Thus, we believe that the current data is sufficient to support our conclusion that the deadenylated intermediates retain the 5' cap structure.

      Reviewer #2 (Public Review):

      Weaknesses:

      The direct targets of DCP1a and/or DCP1b were not determined as the analysis was restricted to RNA-seq to assess RNA abundance, which can be a result of direct or indirect regulation by DCP1a/b.

      Thank you for raising this important point. In our study, we acknowledge that the use of RNA-seq to assess RNA abundance provides a broad overview of the regulatory impacts of DCP1a and DCP1b. This method captures changes in RNA levels that may arise from both direct and indirect regulatory actions of these proteins. While we did not directly determine the targets of DCP1a and DCP1b, the data obtained from our RNA-seq analysis serve as a foundational step for future targeted experiments, which could include techniques such as RIP-seq, to delineate the direct targets of DCP1a and DCP1b more precisely. We believe that our current findings contribute valuable information to the field and pave the way for these subsequent analyses.

      P-bodies appear to be larger in human cells lacking DCP1a and DCP1b but a lack of image quantification prevents this conclusion from being drawn.

      Thank you for the reviewer’s valuable feedback. We have addressed the reviewer’s concern regarding P-bodies' size in human cells lacking DCP1a and DCP1b. We have now performed image quantification and can confirm that P-bodies are indeed larger in these cells.

      The lack of details in the methodology and figure legends limit reader understanding.

      We acknowledge the reviewer's concerns regarding the level of detail provided in the methodology and figure legends. To address this, we are committed to enhancing both sections with additional details and clarifications in our revised manuscript. Thank you for bringing this to our attention.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) To me, the second half of the paper comparing DCP1a and DCP1b is in many ways distinct from the first half and could stand on its own as an interesting paper if this comparative analysis is explored a little deeper (maybe by validating some of the differences in decay observed for individual mRNAs targeted by DCP1a versus DCP1b, by measuring and comparing the decay rates of some individual transcripts under differential control by DCP1a vs b?), and revising the conclusions about links to cancer as mentioned above. I think these later comparative results in the paper present the most new and interesting data concerning DCP1 function in humans (especially since I think the mechanistic conclusions from the first half aren't well supported yet or are at least inconsistent), but when I read these later sections of the paper I struggle to understand the key takeaways from the transcriptomic and metabolomic data.

      Thank you for the reviewer's suggestions. Estimating the decay rates of individual transcripts within the transcriptomes of DCP1a_KO, DCP1b_KO, and wild type can provide insight into the direct targets of DCP1a or DCP1b. However, this requires either time-series RNA-seq or specialized sequencing technologies such as Precision Run-On sequencing (PRO-seq) or RNA Approach to Equilibrium Sequencing (RATE-Seq). Unfortunately, we lack the necessary dataset in our project to estimate the decay rates for the potential targets identified in our RNA-seq data. Despite this limitation, we acknowledge the potential of this approach in identifying the true targets of DCP1a and DCP1b and have included this idea in our discussion.

      (2) I think it would be helpful to add a little more descriptive or narrative language to the figure legends (I know some of them are already quite long!) so that readers can follow the general idea of the experiment through the figure legend as well as the main text; as written, the figure legends are mostly exclusively technical details, so it can be hard to parse what experiment is being carried out in some cases.

      Thank you for the reviewer’s suggestion. We will improve the figure legends so that they retain the technical details while clearly conveying the main idea of each experiment, making it easier for readers to parse which experiment is being carried out.

      Reviewer #2 (Recommendations For The Authors):

      Suggestions for improved or additional experiments, data, or analyses:

      The use of RNA-seq to measure RNA abundance in DCP1a and/or b knockout cells can give some insight into both the indirect and direct effects of DCP1a/b on gene expression but cannot identify the direct targets of these genes. Rather, global analysis of RNA stability or capturing uncapped RNA decay intermediates would allow the authors to conclude they have identified direct targets of DCP1a and/or b. Without such analyses, the interpretation of these data should be scaled back to clearly state that RNA levels can be altered through indirect effects of DCP1a/b absence throughout the text.

      We appreciate the reviewer's suggestion. We have modified our sentences to emphasize that the dysregulated genes could be caused by both direct and indirect effects.

      A control/randomly generated gene list should be analyzed for GO terms to determine whether the enrichment of cancer-related pathways in the differentially expressed genes in the DCP1a/b knockout cells is meaningful.

      Thank you for the reviewer's comment. We shuffled our gene list and repeated the pathway enrichment analysis in Figure 4C and 4D 1,000 times. We focused on the following cancer-related pathways: E2F targets, MTORC1 signaling, G2M checkpoint, MYC target V1, EMT transition, KRAS signaling DN, P53 pathway, and NOTCH signaling pathways. We then calculated how many times the q-values obtained from the shuffled gene list were more significant than the q-value obtained from our real data. In four of the eight pathways (E2F targets, MTORC1 signaling, G2M checkpoint, and MYC target V1), none of the shuffled gene lists resulted in a q-value smaller than the real one. In the other four pathways (EMT transition, KRAS signaling DN, P53 pathway, and NOTCH signaling pathways), the shuffled q-values were smaller than the real q-value 2, 11, 4, and 4 times, respectively, out of the 1,000 shuffles. Based on the shuffled results, we conclude that the transcriptome of DCP1a/b knockout cells is statistically enriched in these cancer-related pathways.

      Author response image 1.

      Distribution of q-values resulting from the Gene Set Enrichment Analysis (GSEA) conducted on 1,000 shuffled gene lists for eight cancer-related pathways. The q-values derived from Figure 4C and 4D are indicated by red (DCP1a_KO) and blue (DCP1b_KO) dashed lines, respectively. Some q-values derived from Figure 4C are too small to be labeled on the plots, such as in E2F targets (q value: 5.87E-07), MTORC1 signaling (q values: 6.59E-07 and 1.58E-06 for DCP1a_KO and DCP1b_KO, respectively), MYC target V1 (q value: 0.004644174 for DCP1a_KO), etc. The numbers x/1000 indicate how often the shuffled q-values were smaller than the real q-value out of 1,000 permutations.
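
      The counts reported above can be turned into empirical p-values, as in the following minimal sketch. The counts are taken from the response; the add-one correction is a common convention for permutation tests and is an assumption here, not necessarily what was used for the analysis.

```python
N_SHUFFLES = 1000

# Number of shuffled gene lists whose q-value was smaller than the real one,
# as reported in the response for the eight cancer-related pathways.
SHUFFLES_BEATING_REAL = {
    "E2F targets": 0,
    "MTORC1 signaling": 0,
    "G2M checkpoint": 0,
    "MYC target V1": 0,
    "EMT transition": 2,
    "KRAS signaling DN": 11,
    "P53 pathway": 4,
    "NOTCH signaling": 4,
}


def empirical_p(n_beating, n_shuffles, add_one=True):
    """Empirical p-value from a permutation count.

    With add_one=True the estimate is (n + 1) / (N + 1), which avoids
    reporting p = 0 when no shuffle beats the real statistic (assumed
    convention). With add_one=False it is the raw proportion n / N.
    """
    if add_one:
        return (n_beating + 1) / (n_shuffles + 1)
    return n_beating / n_shuffles


if __name__ == "__main__":
    for pathway, count in SHUFFLES_BEATING_REAL.items():
        raw = empirical_p(count, N_SHUFFLES, add_one=False)
        corrected = empirical_p(count, N_SHUFFLES)
        print(f"{pathway}: raw {raw:.3f}, add-one {corrected:.4f}")
```

      With either convention, the largest value (KRAS signaling DN, 11 of 1,000 shuffles) remains close to 0.01.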

      Comparisons of the DCP1a and/or b knockout RNA-seq results should be done to published datasets such as those published by Luo et al., Cell Chemical Biology (2021) to determine whether there are common targets with DCP2 and validate the reported findings.

      Thank you for the reviewer’s suggestion. We compared the upregulated genes from DCP1a_KO, DCP1b_KO, and DCP1a/b_KO cell lines with the 91 targets of DCP2 identified by Luo et al. in Cell Chemical Biology (2021). Only EPPK1 overlapped between the potential DCP1b_KO targets and the targets of DCP2. No genes overlapped between the potential DCP1a_KO targets and the targets of DCP2. However, three genes, TES, PAX6, and C18orf21, overlapped between the significantly upregulated DEGs of DCP1a/b_KO and the targets of DCP2. We have included this information in the discussion section.
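
      A comparison of this kind amounts to a set intersection per cell line, sketched below. The gene sets are toy placeholders: only EPPK1, TES, PAX6, and C18orf21 come from the response, the `PLACEHOLDER_*` entries are hypothetical, and the real lists (including the full set of 91 DCP2 targets) are not reproduced here.

```python
# Toy stand-in for the DCP2 target list; the published list has 91 genes.
dcp2_targets = {"EPPK1", "TES", "PAX6", "C18orf21", "PLACEHOLDER_T1"}

# Toy stand-ins for the upregulated genes in each knockout line.
upregulated_by_line = {
    "DCP1a_KO": {"PLACEHOLDER_A1", "PLACEHOLDER_A2"},
    "DCP1b_KO": {"EPPK1", "PLACEHOLDER_B1"},
    "DCP1a/b_KO": {"TES", "PAX6", "C18orf21", "PLACEHOLDER_AB1"},
}


def overlap_with_targets(upregulated, targets):
    """Return, for each cell line, the sorted genes shared with `targets`."""
    return {line: sorted(genes & targets) for line, genes in upregulated.items()}


overlaps = overlap_with_targets(upregulated_by_line, dcp2_targets)

if __name__ == "__main__":
    for line, shared in overlaps.items():
        print(f"{line}: {len(shared)} shared gene(s) {shared}")
```

      Gene symbols would need to be harmonized (same nomenclature and case) between datasets before intersecting, otherwise true overlaps can be missed.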

      The RNA tethering assays are not clear and are difficult to interpret without further controls to delineate the polyadenylated and deadenylated species.

      Thank you for the reviewer’s feedback. We acknowledge that the reviewer might harbor some doubts regarding the outcomes of the RNA tethering assays. Nonetheless, this methodology is well-established and has also found extensive application across many studies. We are committed to enhancing the clarity of our experiment’s details and results within the figure legends and textual descriptions.

      The representative images of p-bodies clearly show that DCP1a/b KO cells have larger p-bodies than the wild-type cells. The authors should quantify p-body size in each image set as the current interpretation of the data is that there is no difference in size or number of p-bodies, but the data suggest otherwise.

      Thank you very much for the reviewer’s insightful comments and for drawing our attention to the need to quantify p-body sizes in DCP1a/b KO and wild-type cells. We agree with the reviewer’s assessment that the representative images suggest a difference in p-body size between DCP1a/b KO cells and wild-type cells, which we initially overlooked. We will revise our manuscript accordingly to include these findings, ensuring that our interpretation of the data aligns with the observed differences.

      Statistical analysis of the Figure 2C results should be included because the difference between the wild-type and Dcp1a/b KO cells with GFP-DCP2 looks significantly different but is interpreted in the text as not significant.

      Thank you for pointing out the need for a statistical analysis of the results shown in Figure 2C. We acknowledge that the visual difference between the wild-type and Dcp1a/b KO cells with GFP-DCP2 suggests a significant variation, which may not have been clearly communicated in our text. We will conduct the necessary statistical analysis to substantiate the observations made in Figure 2C. Furthermore, we would like to emphasize that our primary focus was to demonstrate that DCP2 purified from cells retains its activity even in the absence of DCP1. This critical point will be highlighted and clarified in the revised version of our manuscript to prevent any misunderstanding.

      Recommendations for improving the writing and presentation:

      Additional context including what is known about the role of dcp1 in decapping from the decades of work in yeast and other model organisms should be incorporated into the introduction and discussion sections.

      Thank you for the reviewer’s suggestion. We will incorporate additional context about the function and significance of DCP1 in decapping processes within our revised manuscript's introduction and discussion sections.

      Details should be provided within the figure legends and methods section on experimental approaches and the number of replicates and statistical analyses used throughout the manuscript. For example, it is not clear whether western blots or RNA-IP experiments were performed more than once as representative images are shown.

      Thank you for the reviewer’s suggestion. In the figure legends and methods section, we will provide more details about the experimental methods, number of replicates, and statistical analyses. Regarding the Western blots and RNA-IP experiments the reviewer mentioned, we performed multiple experiments and presented representative images in the manuscript. We will clarify this in the revised manuscript to eliminate potential confusion.

      The rationale for performing metabolic profiling is not clear.

      We appreciate the reviewer's thoughtful feedback. The rationale behind conducting metabolic profiling in our study is rooted in its efficacy as a valuable tool for deciphering the consequences of specific gene mutations, particularly those closely associated with phenotypic changes or final metabolic pathways. Our objective is to utilize metabolic profiling to unravel the distinct biofunctions of DCP1a and DCP1b. By employing this approach, we aim to gain insights into the intricate metabolic alterations that result from the absence of these genes, thereby enhancing our understanding of their roles in cellular processes. We recognize the necessity of clearly presenting this rationale and promise to bolster the articulation of these points in the revised version of our manuscript to ensure the clarity and transparency of our research motivation.

      Details in the methods section should be included for the CRISPR/Cas9-mediated gene editing validation. The Sanger sequencing results presented in Figure S1b should be explained. The entire western blot(s) should be shown in Figure S1A to give confidence the Dcp1a/b KO cells are not expressing truncated proteins and the epitopes of the antibodies used to detect Dcp1a/b should be described. The northern blot probes should be described and sequences included. The transcriptomics method should be detailed.

      Thank you for your feedback. In the revised manuscript, we will detail the CRISPR/Cas9 gene editing validation, explain the Sanger sequencing results in Figure S1b, show the full Western blot in Figure S1A to confirm that the Dcp1a/b knockout cells are not expressing truncated proteins, describe the Northern blot probes used, and detail the transcriptomics method, all to ensure clarity and comprehensiveness in our experimental procedures and results.

      A diagram showing the RNA tethering assays with labels corresponding to all blots/gels should be provided.

      Thank you for your suggestion. We will provide a diagram showing the RNA tethering assays with labels corresponding to all blots/gels in our revised manuscript. This will help readers better understand our experimental design and results.

      The statement, "This suggests that the disruption of the decapping process in DCP1a/b-knockout cells results in the accumulation of unprocessed mRNA intermediates" regarding the results of the RNA-seq assay is not supported by the evidence as RNA-seq does not measure RNA decay intermediates or RNA decay rates.

      We thank the reviewer for this comment. We agree that RNA-seq experiments do not directly measure RNA decay intermediates or RNA decay rates. Our statement could have caused confusion, and we have therefore removed this sentence from the manuscript.

      Minor corrections to the text and figures:

      Figure S6A is uninterpretable as presented.

      Thank you for the reviewer’s valuable feedback. We have taken note and made improvements. We have simplified Figure S6A to enhance its interpretability, hoping that the current version will make it easier for the readers to understand.

    1. eLife Assessment

      In this manuscript, the authors present valuable findings on the apparent role of a salience-network anterior insula node in directing fronto-parietal and default-mode network activity within a tripartite network during control of memory, drawn from an impressive invasive human neurophysiological dataset. Overall, the authors have presented a convincing set of analyses. We also commend the use of a large intracranial EEG dataset to approach this question.

    2. Reviewer #1 (Public review):

      Summary

      Das and Menon describe an analysis of a large open-source iEEG dataset (UPENN-RAM). From encoding and recall phases of memory tasks, they analyzed power and phase-transfer entropy as a measure of directed information flow in regions across a hypothesized tripartite network system. The anterior insula (AI) was found to have heightened high gamma power during encoding and retrieval, which corresponded to suppression of high gamma power in medial prefrontal cortex (mPFC) and posterior cingulate cortex (PCC) during encoding but not recall. In contrast, directed information flow from (but not to) AI to mPFC and PCC is high during both time periods when PTE is analyzed with broadband but not narrowband activity. They claim that these findings significantly advance an understanding of how network communication facilitates cognitive operations during memory tasks, and that the AI of the salience network (SN) is responsible for influencing both the frontoparietal network (FPN) and default-mode network (DMN) during memory encoding and retrieval.

      I find this question interesting and important and agree with the authors that iEEG presents a unique opportunity to investigate the temporal dynamics within network nodes. Their findings convey intriguing information about the structure and order of communication between network regions during on-task cognition in general (though, perhaps not specific to memory - see Weaknesses), with the AI of the SN ostensibly playing an important role in possibly influencing the DMN and FPN.

      Strengths

      - The authors present results from an impressively-sized iEEG sample. For reader context, this type of invasive human data is difficult and time-consuming to collect and many similar studies in high-level journals include 5-20 participants, typically not all of whom have electrodes in all regions of interest. It is excellent that they have been able to leverage open-source data in this way.
      - Preprocessing of iEEG data also seems sensible and appropriate based on field standards.
      - The authors tackle the replication issues inherent in much of the literature by replicating findings across task contexts, demonstrating that the principles of network communication evidenced by their results generalize in multiple task memory contexts. Again, the number of iEEG patients who have multiple tasks' worth of data is impressive.
      - Though the revised manuscript presents a broader and more novel investigation of the tripartite network's role in memory encoding and retrieval (as opposed to cognitive control of memory) the authors now thoroughly review the literature motivating this investigation of open-source data.

      Weaknesses

      - As the authors discuss, it is currently unclear if the directed information flow from AI to DMN and FPN nodes truly arises from memory-associated processes as opposed to more general attentional and cognitive demands, especially given that information flow does not relate meaningfully to task performance (whether memory retrieval is successful or not). I also note this is a concern because - though the authors have now demonstrated that information flow is increased compared to an off-task baseline - influences of AI on DMN or FPN were not increased relative to baseline epochs during the task in the original preprint version, again suggesting these effects may not be specific to the memory component of the analyzed tasks. The authors have thoughtfully noted in the Discussion several ways that experimental design can be improved in future studies to address this limitation.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Removing claims of causality: To avoid confusion, we have now removed claims of causality from our manuscript and changed the title of the manuscript accordingly to:

      "Electrophysiological dynamics of salience, default mode, and frontoparietal networks during episodic memory formation and recall: A multi-experiment iEEG replication".

      Control analyses directly comparing AI and IFG: As per the reviewer’s suggestion, we have carried out additional control analyses by directly comparing the net inward/outward balance between the AI and the IFG. Our analysis revealed that the net outflow for the AI is significantly higher compared to the IFG during both encoding and recall phases, a pattern that was replicated across all four experiments. 

      These findings further highlight the unique role of the AI as a key hub in coordinating network interactions during episodic memory formation and retrieval, distinguishing it from a key anatomically adjacent prefrontal region implicated in cognitive control.

      We have incorporated these results into the manuscript (see new Figure S6 and updated Results section). 

      Control analyses directly comparing task with resting state: As per the reviewer’s suggestion, we compared the AI's net outflow during task periods to resting state, finding significantly higher outflow during both encoding and recall across all experiments (ps < 0.05). These results provide further evidence for an enhanced role of the AI's net directed information flow to the DMN and FPN during memory processing compared to the resting state.

      We have incorporated these results into the manuscript (see new Figure S9 and updated Results section). 

      Control analysis using every region of the brain outside the considered networks: We appreciate the reviewer's suggestion to conduct additional control analyses. However, we have concerns about implementing this approach for several reasons:

      (1) Hypothesis-driven research: Our study was designed based on a strong hypothesis derived from prior fMRI studies, which have consistently shown that the salience network (SN), anchored by the anterior insula (AI), plays a critical role in regulating the engagement and disengagement of the default mode network (DMN) and frontoparietal network (FPN) across diverse cognitive tasks.

      (2) Risk of p-hacking: Running analyses on a large number of brain regions outside our networks of interest without a priori hypotheses could lead to p-hacking, a practice strongly criticized in the scientific community, including by eLife editors (Makin & Orban de Xivry, 2019). Such an approach could potentially yield spurious results and undermine the validity of our findings.

      (3) Principled control region selection: Our choice of the inferior frontal gyrus (IFG) as a control region was hypothesis-driven, based on its (a) anatomical adjacency to the AI, (b) involvement in cognitive control functions, including response inhibition, and (c) frequent coactivation with the AI in fMRI studies.

      (4) Robustness of current findings: Our PTE analysis involving the IFG, along with the additional control analyses requested by the reviewer (comparing the task-related net balance of the AI with the IFG and with resting state, see response to reviewer comment 2.1), strongly support a key role for the AI in orchestrating large-scale network dynamics during memory processes.

      (5) Specificity of findings: The contrast between AI and IFG results demonstrates that our observed patterns are not general to all task-active regions but are specific to the AI's role in network coordination. 

      We believe that our current analyses, including the additional controls, provide a comprehensive and rigorous examination of the AI's role in memory-related network dynamics. Adding analyses of numerous additional regions without clear hypotheses could potentially dilute the focus and interpretability of our results. 

      However, we acknowledge the importance of considering broader network interactions. In future studies, we could explore the role of other key regions in a hypothesis-driven manner, potentially expanding our understanding of the complex interactions between multiple brain networks during memory processes.

      These revisions, combined with our rigorous methodologies and comprehensive analyses, provide compelling support for the central claims of our manuscript. We believe these changes significantly enhance the scientific contribution of our work.

      Our point-by-point responses to the reviewers' comments are provided below.

      Reviewer 1:

      (1.1) Because phase-transfer entropy is referenced as a "causal" analysis in this investigation (PTE), I believe it is important to highlight for readers recent discussions surrounding the description of "causal mechanisms" in neuroscience (see "Confusion about causation" section from Ross and Bassett, 2024, Nature Neuroscience). A large proportion of neuroscientists (myself included) use "causal" only to refer to a mechanism whose modulation or removal (with direct manipulation, such as by lesion or stimulation) is known to change or control a given outcome (such as a successful behavior). As Ross and Bassett highlight, it is debatable whether such mechanistic causality is captured by Granger "causality" (a.k.a. Granger prediction) or the parametric PTE, and imprecise use of "causation" may be confusing. The authors have defined in the revised Introduction what their definition of "causality" is within the context of this investigation. 

      We appreciate the reviewer's feedback in terms of the terminology used in our manuscript. To avoid confusion, we have now removed claims of causality from our manuscript and also changed the title of the manuscript accordingly. 

      Reviewer 2:

      (2.1) Clarifying the new control analyses. The authors have been responsive to our feedback and implemented several new analyses. The use of a pre-task baseline period and a control brain region (IFG) definitively help to contextualize their results, and the findings shown in the revision do suggest that (1) relative to a pre-task baseline, directed interactions from the AI are stronger and (2) relative to a nearby region, the IFG, the AI exhibits greater outward-directed influence. 

      However, it is difficult to draw strong quantitative conclusions from the analyses as presented, because they do not directly statistically contrast the effect in question (directed interactions with the FPN and DMN) between two conditions (e.g. during baseline vs. during memory encoding/retrieval). As I understand it, in their main figures the authors ask, "Is there statistically greater influence from the AI to the DMN/FPN in one direction versus another?" And in the AI they show greater "outward" PTE than "inward" PTE from other networks during encoding/retrieval. The balance of directed information favors an outward influence from the AI to DMN/FPN. 

      But in their new analyses, they simply show that the degree of "outward" PTE is greater during task relative to baseline in (almost) all tasks. I believe a more appropriately matched analysis would be to quantify the inward/outward balance during task states, quantify the inward/outward balance during rest states, and then directly statistically compare the two. It could be that the relative balance of directed information flow is nonsignificantly changed between task and rest states, which would be important to know. 

      We thank the reviewer for this suggestion. We have now run an additional analysis directly comparing the inward/outward balance during the task versus the rest states. We quantified this balance as the net outflow, defined as the difference between the total outgoing information and the total incoming information (PTE(out)–PTE(in)). This analysis revealed that net outflow during task periods is significantly higher compared to rest, during both encoding and recall, and across the four experiments (ps < 0.05). These results provide further evidence for an enhanced role of the AI's net directed information flow to the DMN and FPN during memory processing compared to the resting state. These new results have now been included in the revised manuscript (page 12).
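      For readers who want to see the shape of this comparison, the sketch below computes the net outflow (PTE(out)–PTE(in)) and contrasts task with rest. It is only an illustration under stated assumptions: the function names, the toy values, and the choice of a one-sided paired sign-flip permutation test are hypothetical and are not taken from the manuscript, which does not specify its statistical test in this response.

```python
import numpy as np


def net_outflow(pte_out, pte_in):
    """Net outflow per observation, defined as PTE(out) - PTE(in)."""
    return np.asarray(pte_out, dtype=float) - np.asarray(pte_in, dtype=float)


def paired_sign_flip_test(task, rest, n_perm=10000, seed=0):
    """One-sided paired permutation test of mean(task - rest) > 0.

    Assumption: observations are paired (e.g. the same electrode pairs
    measured in both states). The test randomly flips the sign of each
    paired difference to build a null distribution of the mean.
    """
    rng = np.random.default_rng(seed)
    diff = np.asarray(task, dtype=float) - np.asarray(rest, dtype=float)
    observed = diff.mean()
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diff.size))
    null = (signs * diff).mean(axis=1)
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy values, not data from the study.
task_net = net_outflow(
    [0.62, 0.58, 0.60, 0.65, 0.59, 0.61, 0.63, 0.60],
    [0.50, 0.51, 0.49, 0.52, 0.50, 0.50, 0.51, 0.49],
)
rest_net = net_outflow(
    [0.53, 0.52, 0.51, 0.54, 0.52, 0.51, 0.53, 0.52],
    [0.50, 0.51, 0.50, 0.51, 0.50, 0.50, 0.51, 0.50],
)
mean_diff, p_value = paired_sign_flip_test(task_net, rest_net)
print(f"mean task-minus-rest net outflow: {mean_diff:.3f}, p = {p_value:.4f}")
```

      The same two functions could be reused for a region-to-region contrast by passing the net outflow of two regions instead of two states, provided the observations can be paired; otherwise an unpaired test would be the more appropriate choice.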

      Likewise, a similar principle applies to their IFG analysis. They show that the IFG tends to have an "inward" balance of influence from the DMN/FPN (the opposite of the AI's effect), but this does not directly answer whether the AI occupies a statistically unique position in terms of the magnitude of its influence on other regions. More appropriate, as I suggest above, would be to quantify the relative balance of inward/outward influence, both for the IFG and the AI, and then directly compare those two quantities. (Given the inversion of the direction of effect, this is likely to be a significant result, but I think it deserves a careful approach regardless.)

      We appreciate the reviewer's suggestion. Accordingly, we directly compared the net inward/outward balance between the AI and the IFG. Specifically, we compared the net outflow (PTE(out)–PTE(in)) for the AI with that for the IFG. This analysis revealed that the net outflow for the AI is significantly higher compared to the IFG during both encoding and recall, and across the four experiments. These findings further highlight a key role for the AI in orchestrating large-scale network dynamics during memory processes. The AI's pattern of directed information flow stands in contrast to that of the IFG, despite their anatomical proximity and shared involvement in cognitive control processes. This dissociation underscores the specificity of the AI's function in coordinating network interactions during memory formation and retrieval. These new results have now been included in our revised manuscript (page 11).

      (2.2) Consider additional control regions. The authors justify their choice of IFG as a control region very well. In my original comments, I perhaps should have been more clear that the most compelling control analyses here would be to subject every region of the brain outside these networks (with good coverage) to the same analysis, quantify the degree of inward/outward balance, and then see how the magnitude of the AI effect stacks up against all possible other options. If the assertion is that the AI plays a uniquely important role in these memory processes, showing how its influence stacks up against all possible "competitors" would be a very compelling demonstration of their argument. 

      We thank the reviewer for this suggestion. However, please note that running a large number of analyses on every region of the brain outside these networks and comparing their dynamics to the AI without a hypothesis or solid principle amounts to p-hacking, which has been previously strongly criticized by the eLife editors (Makin & Orban de Xivry, 2019). Our study was strongly driven by a solid hypothesis based on prior fMRI studies that have shown that the SN, anchored by the anterior insula (AI), plays a critical role in regulating the engagement and disengagement of the DMN and FPN across diverse cognitive tasks (Bressler & Menon, 2010; Cai et al., 2016; Cai, Ryali, Pasumarthy, Talasila, & Menon, 2021; Chen, Cai, Ryali, Supekar, & Menon, 2016; Kronemer et al., 2022; Raichle et al., 2001; Seeley et al., 2007; Sridharan, Levitin, & Menon, 2008). Moreover, our selection of the IFG as a control region for comparison was also very strongly hypothesis driven, due to its anatomical adjacency to the AI, its involvement in a wide range of cognitive control functions including response inhibition (Cai, Ryali, Chen, Li, & Menon, 2014), and its frequent co-activation with the AI in fMRI studies. Furthermore, the IFG has been associated with controlled retrieval of memory (Badre, Poldrack, Paré-Blagoev, Insler, & Wagner, 2005; Badre & Wagner, 2007; Wagner, Paré-Blagoev, Clark, & Poldrack, 2001), making it a compelling region for comparison. Our findings from the PTE analysis involving the IFG, together with the additional control analyses requested by the reviewer (directly comparing the task-related net balance of the AI with the IFG and with resting state; please see response to reviewer comment 2.1), strongly highlight a key role of the AI in orchestrating large-scale network dynamics during memory processes.

      We believe that our current analyses, including the additional controls, provide a comprehensive and rigorous examination of the AI's role in memory-related network dynamics. Adding analyses of numerous additional regions without clear hypotheses could potentially dilute the focus and interpretability of our results.

      However, we acknowledge the importance of considering broader network interactions. In future studies, we could explore the role of other key regions in a hypothesis-driven manner, potentially expanding our understanding of the complex interactions between multiple brain networks during memory processes.

      (2.3) Reporting of successful vs. unsuccessful memory results. I apologize if I was not clear in my original comment (2.7, pg. 13 of the response document) regarding successful vs. unsuccessful memory. The fact that no significant difference was found in PTE between successful/unsuccessful memory is a very important finding that adds valuable context to the rest of the manuscript. I believe it deserves a figure, at least in the Supplement, so that readers can visualize the extent of the effect in successful/unsuccessful trials. This is especially important now that the manuscript has been reframed to focus more directly on claims regarding episodic memory processing; if that is indeed the focus, and their central analysis does not show a significant effect conditionalized on the success of memory encoding/retrieval, it is important that readers can see these data directly.

      As per the reviewer’s suggestion, we have now included figures showing the results of the successful versus unsuccessful comparison in the Supplementary Materials of the revised manuscript (Figures S10, S11).

      (2.4) Claims regarding causal relationships in the brain. I understand that the authors have defined "causal" in a specific way in the context of their manuscript; I do believe that as a matter of clear and transparent scientific communication, the authors nonetheless bear a responsibility to appreciate how this word may be erroneously interpreted/overinterpreted and I would urge further review of the manuscript to tone down claims of causality. Reflective of this, I was very surprised that even as both reviewers remarked on the need to use the word "causal" with extreme caution, the authors added it to the title in their revised manuscript.

      We thank the reviewer for this suggestion. To avoid confusion, we have now removed claims of causality from our manuscript and also changed the title of the manuscript accordingly. 

      References 

      Badre, D., Poldrack, R. A., Paré-Blagoev, E. J., Insler, R. Z., & Wagner, A. D. (2005). Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron, 47(6), 907-918. doi:10.1016/j.neuron.2005.07.023

      Badre, D., & Wagner, A. D. (2007). Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia, 45(13), 2883-2901. doi:10.1016/j.neuropsychologia.2007.06.015

      Bressler, S. L., & Menon, V. (2010). Large-scale brain networks in cognition: emerging methods and principles. Trends in Cognitive Sciences, 14(6), 277-290. doi:10.1016/j.tics.2010.04.004

      Cai, W., Chen, T., Ryali, S., Kochalka, J., Li, C. S., & Menon, V. (2016). Causal Interactions Within a Frontal-Cingulate-Parietal Network During Cognitive Control: Convergent Evidence from a Multisite-Multitask Investigation. Cereb Cortex, 26(5), 2140-2153. doi:10.1093/cercor/bhv046

      Cai, W., Ryali, S., Chen, T., Li, C. S., & Menon, V. (2014). Dissociable roles of right inferior frontal cortex and anterior insula in inhibitory control: evidence from intrinsic and task-related functional parcellation, connectivity, and response profile analyses across multiple datasets. J Neurosci, 34(44), 14652-14667. doi:10.1523/jneurosci.3048-14.2014

      Cai, W., Ryali, S., Pasumarthy, R., Talasila, V., & Menon, V. (2021). Dynamic causal brain circuits during working memory and their functional controllability. Nat Commun, 12(1), 3314. doi:10.1038/s41467-021-23509-x

      Chen, T., Cai, W., Ryali, S., Supekar, K., & Menon, V. (2016). Distinct Global Brain Dynamics and Spatiotemporal Organization of the Salience Network. PLOS Biology, 14(6), e1002469. doi:10.1371/journal.pbio.1002469

      Kronemer, S. I., Aksen, M., Ding, J. Z., Ryu, J. H., Xin, Q., Ding, Z., . . . Blumenfeld, H. (2022). Human visual consciousness involves large scale cortical and subcortical networks independent of task report and eye movement activity. Nat Commun, 13(1), 7342. doi:10.1038/s41467-022-35117-4

      Makin, T. R., & Orban de Xivry, J. J. (2019). Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. eLife, 8. doi:10.7554/eLife.48175

      Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., & Shulman, G. L. (2001). A default mode of brain function. Proc Natl Acad Sci U S A, 98(2), 676-682. doi:10.1073/pnas.98.2.676

      Seeley, W. W., Menon, V., Schatzberg, A. F., Keller, J., Glover, G. H., Kenna, H., . . . Greicius, M. D. (2007). Dissociable Intrinsic Connectivity Networks for Salience Processing and Executive Control. Journal of Neuroscience, 27(9), 2349-2356. doi:10.1523/JNEUROSCI.5587-06.2007

      Sridharan, D., Levitin, D. J., & Menon, V. (2008). A critical role for the right fronto-insular cortex in switching between central-executive and default-mode networks. Proceedings of the National Academy of Sciences, 105(34), 12569-12574. doi:10.1073/pnas.0800005105

      Wagner, A. D., Paré-Blagoev, E. J., Clark, J., & Poldrack, R. A. (2001). Recovering meaning: left prefrontal cortex guides controlled semantic retrieval. Neuron, 31(2), 329-338. doi:10.1016/s0896-6273(01)00359-2

    1. eLife Assessment

      This is an important piece of work that sheds light on our understanding of early lung development. There is solid evidence that there is a key new role for Svep1, which may be acting via FGF9. A more precise understanding of the interactions between Svep1 and FGF9, with a possibility of other ECM factors, would add value.

    2. Reviewer #1 (Public review):

      Summary:

      This is an important contribution to the field of molecular embryology of the lung. The authors introduce a novel mesenchymally expressed molecule Svep1. Knocking it out in mice produces a profoundly hypoplastic phenotype which can be rescued in vitro. Svep1 interacts with the FGF signaling complex to control differentiation and expression of smooth muscle in lung mesenchyme, thereby affecting proximal-distal patterning of the airway branches by acting as a putative branch suppressor.

      Strengths:

      The study shows strong evidence in mouse knockouts, in vitro embryonic lung culture as well as gene expression and in vitro rescue studies that confirm a key role for Svep1. It is a beautiful piece of work and an important contribution to our understanding of early lung branching morphogenesis.

      Weaknesses:

      Claiming a possible therapeutic role for this gene is a bit far-fetched at the present state of the art.

    3. Reviewer #2 (Public review):

      Summary:

      In an effort to elucidate the role of the ECM protein Svep1 in lung development, Foxworth and colleagues have generated a Svep1 mutant (lacking exon 8). Based on their developmental analyses of branching morphogenesis and expression of epithelial, mesenchymal, and mesothelial markers in these mutants, the authors conclude that Svep1 is essential for normal lung growth, morphogenesis, and patterning. They propose that the Svep1 protein regulates, in part, FGF9 signalling. Overall, the paper demonstrates that the ECM is important for lung development and tries to implicate the ECM in the regulation of epithelial-mesenchymal interactions during lung development.

      Strengths:

      The strengths of this paper are the careful spatiotemporal characterization of 1) lung growth, 2) branching morphogenesis, and 3) epithelial marker expression. The differential perturbation of growth and branching morphogenesis along the D-V axis and the progressive perturbation of branching morphogenesis are both very interesting and noteworthy phenotypes.

      Weaknesses:

      The weakness of this paper is that it does not present a convincing explanation for how Svep1 regulates any of the phenotypes described. In this regard, a demonstration of a genetic interaction between Svep1 and FGF9 mutants or a careful characterization of a tissue-specific knockdown of Svep1, could be insightful. In addition, a comparison of the phenotype of Svep1 mutants and the phenotypes of other mutants affecting ECM components would be worthwhile.

      A minor weakness is that the title of the paper is not fully supported by the data presented. While the defects in the morphogenesis of the distal lung in Svep1 mutants presage a defect in alveolar differentiation, this cannot be formally demonstrated since the animals die soon after birth.

    4. Author response:

      Response to Reviewer #1:

      “Claiming a possible therapeutic role for this gene is a bit far-fetched at the present state of the art”.

      We agree that while the therapeutic relevance of Svep1 is not clear at this point, this potential is always something we consider in interpreting our data.

      Response to Reviewer #2:

      a. “The weakness of this paper is that it does not present a convincing explanation for how Svep1 regulates any of the phenotypes described. In this regard, a demonstration of a genetic interaction between Svep1 and FGF9 mutants or a careful characterization of a tissue-specific knockdown of Svep1, could be insightful. In addition, a comparison of the phenotype of Svep1 mutants and the phenotypes of other mutants affecting ECM components would be worthwhile”. 

      We agree that additional experiments are needed to determine how exactly Svep1 contributes to the phenotypes described. While our preliminary data point to an interaction of Svep1 and Fgf9, we agree that additional data are needed to prove that such interaction is a primary driver of the phenotypes observed.

      b. “A minor weakness is that the title of the paper is not fully supported by the data presented. While the defects in the morphogenesis of the distal lung in Svep1 mutants presage a defect in alveolar differentiation, this cannot be formally demonstrated since the animals die soon after birth”

      The reviewer is correct that we cannot formally demonstrate this in the current model. The profound defects observed in Svep1 mutants lead to early death, making it challenging to study the full process of alveolarization. However, it is important to note that lung morphogenesis is a continuous process in which earlier developmental phases lay the groundwork for subsequent stages. During the branching phase, the fate of alveolar cell types is established, while the saccular stage serves as a critical foundation for alveolar development, where alveolar cells begin to differentiate. We believe that the significant abnormalities in cellular differentiation observed prior to the bulk of alveolarization indicate likely defects in the later stages of alveolar differentiation. Therefore, while the model limits our ability to directly assess alveolarization, we anticipate that defects in cellular differentiation will continue to manifest beyond the saccular stage in Svep1 mice.

    1. eLife Assessment

      The study presents very well-illustrated specimens of the artiopodan Cindarella eucalla from the Chengjiang Biota, using computer tomography (CT) scanning to illustrate multiple specimens with preserved appendages, a rarity in artiopodans. The description of these fossils is important for expanding our understanding of this taxon and its relatives. The imaging and morphological description are followed by a discussion of how this morphology relates to other Cambrian arthropods and its potential ecological function. The evidence provided in this section about resulting function and ecology is presently incomplete and the conclusions are put forward too strongly. This assessment could be improved if the work is revised with more careful wording and additional data.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Zhang et al. analyzed 17 specimens of Cindarella eucalla with 3D technology and discussed the anatomical findings, the relationship to other artiopods, and the ecology of the animal. The results are excellent and the findings are very interesting. However, the discussion needs to be extended, as the point the authors are trying to make is not always clear. I also recommend some restructuring of the discussion. Overall this is an important manuscript, and I'm looking forward to reading the edited version.

      Strengths:

      The analyses and the 3D data are excellent and provide new information.

      Weaknesses:

      The discussion - the authors provide information for the findings, but do not discuss them in detail. More information is needed.

    3. Reviewer #2 (Public review):

      Summary:

      Zhang et al. present very well-illustrated specimens of the artiopodan Cindarella eucalla from the Chengjiang Biota. Multiple specimens are shown with preserved appendages, which is rare for artiopodans and will greatly help our understanding of this taxon. The authors use CT scanning to reveal the ventral organization of this taxon. The description of the taxon needs some modification, specifically expansion of the gut and limb morphology. The conclusion that Cindarella was a fast-moving animal is very weak; comparisons with extant fast animals and possibly FEA analyses are necessary to support such a claim. Although the potential insights provided by such well-preserved fossils could be valuable, the claims made are tenuous based on the evidence presented herein.

      Strengths:

      The images produced through CT scanning specimens reveal the very fine detail of the appendages and are well illustrated. Specimens preserve guts and limbs, which are informative both for the phylogenetic position and ecology of this taxon. The limbs are very well preserved, with protopodite, exopodite, and endopodites visible. Addressing the weaknesses below will make the most of this compelling data that demonstrates the morphology of the limbs well.

      Weaknesses:

      Although this paper includes very well-illustrated fossils, including new information on the eyes, guts, and limbs of Cindarella, the data are not fully explained, and the conclusions are weakly supported.

      The authors suggest the preservation of complex ramifying diverticula, but these should be better illustrated and the discussion of the gut diverticula should be longer, especially as gut morphology can provide insights into the feeding strategy.

      The conclusion that Cindarella eucalla was fast and sediment-feeding in a muddy environment is not well supported. These claims seem to be tenuously made without any evidence to support them. The authors should add a new section in the discussion focused on feeding ecology where they explicitly compare the morphology to suspension-feeding artiopodans to justify whether it fed that way or not. To further explore feeding, the protopodite morphology needs to be more carefully described and compared to other known taxa. The function of endites on the endopodite to stir up sediment for particle feeding in a muddy environment would also need to be more thoroughly discussed and compared with modern analogs. The impact of their findings is not highlighted in the discussion, which is currently more of a review of what has been previously said and should focus more on what insights are provided by the great fossils illustrated by the authors.

      The authors argue that their data support fast escaping capabilities, but it is not clear how they reached that conclusion based on the data available. Is there a way this can be further evaluated? The data are impressive, so including comparisons with extant taxa that display fast escaping strategies would help the authors make their case more compelling. The authors also claim that the limbs of Cindarella are strong, but the basis for this conclusion is likewise unclear. Comparison with the limbs of other taxa to show their robustness would be useful. To actually test how these limbs deal with the force and strain applied to them by a sudden burst of movement, the authors could conduct Finite Element Analyses.

    4. Reviewer #3 (Public review):

      This paper provides an interesting description of the ventral parts of the Cambrian xandarellid Cindarella eucalla, derived from exceptionally preserved specimens of the Chengjiang Biota. These morphological data are useful for our broad understanding and future research on Xandarellida, and are generally well-represented in the description and accompanying figures. The strengths of this work rest in this morphological description of exceptional fossil material, and this is generally well supported. In addition, the authors put this description in the context of the morphology of other xandarellids and Cambrian arthropod groups, with most of these parallels being useful and reasonably supported, though in several places homology is assumed and this currently lacks evidence. The manuscript goes on to use these morphological data and comparisons to other groups (particularly trilobites) to make suggestions for the ecology of Cindarella eucalla and other xandarellids. The majority of my comments on this work relate to this latter aim - the ecological conclusions drawn are generally derived through morphological comparisons, where a specific morphology has been suggested as an adaption to a particular ecological function in another extinct arthropod group. However, the original suggestions for ecological function are untested, and so remain hypotheses. Despite this, they are frequently presented as truisms to enable ecological conclusions to be drawn for Cindarella eucalla. I have listed my comments and queries on the study below for the authors to address or respond to, and I hope they are useful to the authors.

      Comments:

      There are a number of ecological and functional morphology conclusions that seem too strongly stated to be considered sufficiently supported by the evidence given. These relate to both the description of C. eucalla and comparisons to other extinct arthropod taxa (notably trilobites). Many of these latter statements are assumptions of functional morphology and should not be repeated as truisms; rather, they represent suggested functions and ecologies based on the known morphological descriptions. This aspect occurs throughout the article and, for me, is the primary concern.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Zhang et al. analyzed 17 specimens of Cindarella eucalla with 3D technology and discussed the anatomical findings, the relationship to other artiopods, and the ecology of the animal. The results are excellent and the findings are very interesting. However, the discussion needs to be extended, as the point the authors are trying to make is not always clear. I also recommend some restructuring of the discussion. Overall this is an important manuscript, and I'm looking forward to reading the edited version.

      Strengths:

      The analyses and the 3D data are excellent and provide new information.

      Weaknesses:

      The discussion - the authors provide information for the findings, but do not discuss them in detail. More information is needed.

      We are committed to enhancing the quality of our manuscript further and, in response to your comments, will implement the following improvements:

      (1) Comparative Analysis of Eyes: We will expand our discussion to include a detailed comparative analysis of the eyes of Cindarella eucalla with those of other artiopods (e.g. Xandarellids, trilobites, living insects), focusing on morphology, size, and other relevant characteristics.

      (2) Segmental Mismatch Discussion: We will provide an in-depth exploration of the specifics and significance of the segmental mismatch to offer a clearer understanding of its implications. We will also compare the characteristics of this mismatch in our focal species with those observed in extant arthropods, such as spiders and myriapods. This comparison will be further enriched by integrating our phylogenetic analysis, thereby providing a broader evolutionary context.

      (3) Methodological Clarity: We will provide more detailed information on the parameters used for the analyses in the Methods section, especially the phylogenetic sections and the X-ray tomography section.

      (4) Phylogenetic Analysis: We will engage in a more in-depth discussion of certain characters (e.g. anterior sclerite, hypostome, endopodite, segmental mismatch, etc.) within our phylogenetic analyses to clarify their relevance and contribution to our findings.

      Reviewer #2 (Public review):

      Summary:

      Zhang et al. present very well-illustrated specimens of the artiopodan Cindarella eucalla from the Chengjiang Biota. Multiple specimens are shown with preserved appendages, which is rare for artiopodans and will greatly help our understanding of this taxon. The authors use CT scanning to reveal the ventral organization of this taxon. The description of the taxon needs some modification, specifically expansion of the gut and limb morphology. The conclusion that Cindarella was a fast-moving animal is very weak; comparisons with extant fast animals and possibly finite element analyses (FEA) are necessary to support such a claim. Although the potential insights provided by such well-preserved fossils could be valuable, the claims made are tenuous given the evidence presented herein.

      Strengths:

      The images produced through CT scanning specimens reveal the very fine detail of the appendages and are well illustrated. Specimens preserve guts and limbs, which are informative both for the phylogenetic position and ecology of this taxon. The limbs are very well preserved, with protopodite, exopodite, and endopodites visible. Addressing the weaknesses below will make the most of this compelling data that demonstrates the morphology of the limbs well.

      Weaknesses:

      Although this paper includes very well-illustrated fossils, including new information on the eyes, guts, and limbs of Cindarella, the data are not fully explained, and the conclusions are weakly supported.

      The authors suggest the preservation of complex ramifying diverticula, but these should be better illustrated and the discussion of the gut diverticula should be longer, especially as gut morphology can provide insights into the feeding strategy.

      The conclusion that Cindarella eucalla was fast and sediment-feeding in a muddy environment is not well supported. These claims seem to be tenuously made without any evidence to support them. The authors should add a new section in the discussion focused on feeding ecology where they explicitly compare the morphology to suspension-feeding artiopodans to justify whether it fed that way or not. To further explore feeding, the protopodite morphology needs to be more carefully described and compared to other known taxa. The function of endites on the endopodite to stir up sediment for particle feeding in a muddy environment would also need to be more thoroughly discussed and compared with modern analogs. The impact of their findings is not highlighted in the discussion, which is currently more of a review of what has been previously said and should focus more on what insights are provided by the great fossils illustrated by the authors.

      The authors argue that their data support fast escaping capabilities, but it is not clear how they reached that conclusion based on the data available. Is there a way this can be further evaluated? The data are impressive, so including comparisons with extant taxa that display fast escaping strategies would help the authors make their case more compelling. The authors also claim that the limbs of Cindarella are strong, but the basis for this conclusion is likewise unclear. Comparison with the limbs of other taxa to show their robustness would be useful. To actually test how these limbs deal with the force and strain applied to them by a sudden burst of movement, the authors could conduct Finite Element Analyses.

      Here are the key points we plan to address:

      (1) Gut and Limb Morphology: We will expand our description of the gut and limb morphology of C. eucalla, providing a more detailed comparison and analysis. This will include a revised discussion on the function and ecological implications of these features.

      (2) Fast-Moving Animal Claim: We acknowledge your concern about the conclusion that C. eucalla was a fast-moving animal. We will conduct a more detailed comparison between C. eucalla and other Cambrian artiopods and living arthropods, focusing on morphological and functional aspects. We will also reconsider our claim and be more cautious in our conclusions. If the comparison proves insufficient, we will remove this assertion from the manuscript. However, we may not conduct Finite Element Analysis, as a comprehensive and cautious analysis would be a massive project in its own right.

      (3) Sediment Feeding in a Muddy Environment: We will revise the section discussing the feeding ecology of C. eucalla. We will enhance this section by comparing the morphology of C. eucalla to that of suspension-feeding artiopods, which will help to substantiate our claims. Additionally, we will expand the discussion to include a more detailed examination of endites, gnathobases, and other relevant anatomical structures.

      (4) Impact of Findings: We will endeavor to highlight the impact of our findings in the discussion, focusing on the insights provided by the well-preserved fossils illustrated in our study.

      Reviewer #3 (Public review):

      This paper provides an interesting description of the ventral parts of the Cambrian xandarellid Cindarella eucalla, derived from exceptionally preserved specimens of the Chengjiang Biota. These morphological data are useful for our broad understanding and future research on Xandarellida, and are generally well-represented in the description and accompanying figures. The strengths of this work rest in this morphological description of exceptional fossil material, and this is generally well supported. In addition, the authors put this description in the context of the morphology of other xandarellids and Cambrian arthropod groups, with most of these parallels being useful and reasonably supported, though in several places homology is assumed and this currently lacks evidence. The manuscript goes on to use these morphological data and comparisons to other groups (particularly trilobites) to make suggestions for the ecology of Cindarella eucalla and other xandarellids. The majority of my comments on this work relate to this latter aim - the ecological conclusions drawn are generally derived through morphological comparisons, where a specific morphology has been suggested as an adaption to a particular ecological function in another extinct arthropod group. However, the original suggestions for ecological function are untested, and so remain hypotheses. Despite this, they are frequently presented as truisms to enable ecological conclusions to be drawn for Cindarella eucalla. I have listed my comments and queries on the study below for the authors to address or respond to, and I hope they are useful to the authors.

      Comments:

      There are a number of ecological and functional morphology conclusions that seem too strongly stated to be considered sufficiently supported by the evidence given. These relate to both the description of C. eucalla and comparisons to other extinct arthropod taxa (notably trilobites). Many of these latter statements are assumptions of functional morphology and should not be repeated as truisms; rather, they represent suggested functions and ecologies based on the known morphological descriptions. This aspect occurs throughout the article and, for me, is the primary concern.

      We plan to address the following points upon revision:

      (1) Homology Assumptions: You pointed out that we have assumed homology in certain instances without sufficient evidence. We will revise the manuscript to include a more detailed analysis of the anterior sclerite and exite, considering phylogenetic relationships and morphological comparisons to provide a more robust discussion.

      (2) Ecological and Functional Morphology: We acknowledge that our conclusions regarding the ecological function were presented with too much certainty. We will adopt a more cautious approach in our discussion, ensuring that our ideas are clearly labeled as such and are supported by a comparison of relevant studies on Cambrian artiopods and extant arthropods, including fluid dynamics, functional morphology, etc. We will re-evaluate the ecological function section, and if it does not add value and clarity to the manuscript (that is, if our speculations do not contribute to the understanding of the specimen or may lead to misunderstandings), we will remove the relevant parts. We believe these changes will reflect a more cautious and rigorous approach to the ecological and functional interpretations of C. eucalla.

    1. eLife Assessment

      The paper presents a new pipeline for functional validation of genes known to underlie fragile bone disorders, using CRISPR-mediated knockouts and a number of phenotypic assessments in zebrafish. The solid data demonstrate the feasibility and validity of the approach, which presents a valuable tool for rapid functional validation of candidate gene(s) associated with heritable bone diseases identified from genetic studies.

    2. Reviewer #1 (Public review):

      Summary:

      In this work, a screening platform is presented for rapid and cost-effective screening of candidate genes involved in Fragile Bone Disorders. The authors validate the approach of using crispants, generating F0 mosaic mutants, to evaluate the function of specific target genes in this particular condition. The design of the guide RNAs is convincingly described, and the effectiveness of the method is estimated at 60% to 92% presumed inactivation of the respective target genes. Thus, injected F0 larvae can be directly used to investigate the consequences of this inactivation.

      Skeletal formation is then evaluated at 7dpf and 14dpf, first using a transgenic reporter line revealing fluorescent osteoblasts, and second using alizarin-red staining of mineralized structures. In general, it appears that the osteoblast-positive areas are more often affected in the crispants compared to the mineralized areas, an observation that appears to correlate with the observed reduced expression of bglap, a marker for mature osteoblasts, and the increased expression of col1a1a in more immature osteoblasts.

      Finally, the injected fish (except two lines that revealed high mortality) are also analyzed at 90dpf, using alizarin red staining and micro-CT analysis, revealing an increased incidence of skeletal deformities in the vertebral arches, fractures, as well as vertebral fusions and compressions for all crispants except those for daam2. Finally, the Tissue Mineral Density (TMD) as determined by micro-CT is proposed as an important marker for investigating genes involved in osteoporosis.

      Taken together, this manuscript is well presented, the data are clear and well analyzed, and the methods are well described. It makes a compelling case for using the crispant technology to screen the function of candidate genes in a specific condition, as shown here for bone disorders.

      Strengths:

      Strengths are the clever combination of existing technologies from different fields to build a screening platform. All the required methods are comprehensively described.

      Weaknesses:

      One might have wished to see one or two of the crispants brought to the stage of bona fide mutants to confirm the results of the screening; however, this has been done for some of the tested genes, as laid out in the discussion.

    3. Reviewer #2 (Public review):

      Summary:

      More and more genes and genetic loci are being linked to bone fragility disorders like osteoporosis and osteogenesis imperfecta through GWAS and clinical sequencing. In this study, the authors seek to develop a pipeline for validating these new candidate genes using crispant screening in zebrafish. Candidates were selected based on GWAS bone density evidence (4 genes) or linkage to OI cases plus some aspect of bone biology (6 genes). NGS was performed on embryos injected with different gRNAs/Cas9 to confirm high mutagenic efficacy and off-target cutting was verified to be low. Bone growth, mineralization, density, and gene expression levels were carefully measured and compared across crispants using a battery of assays at three different stages.

      Strengths:

      (1) The pipeline would be straightforward to replicate in other labs, and the study could thus make a real contribution towards resolving the major bottleneck of candidate gene validation.

      (2) The study is clearly written and extensively quantified.

      (3) The discussion attempts to place the phenotypes of different crispant lines into the context of what is already known about each gene's function.

      (4) There is added value in seeing the results for the different crispant lines side by side for each assay.

      Weaknesses:

      (1) The study uses only well-established methods and is strategy-driven rather than question/hypothesis-driven.

      (2) Some of the measurements are inadequately normalized and not as specific to bone as suggested:

      (a) The measurements of surface area covered by osteoblasts or mineralized bone (Figure 1) should be normalized to body size. The authors note that such measures provide "insight into the formation of new skeletal tissue during early development" and reflect "the quantity of osteoblasts within a given structure and [is] a measure of the formation of bone matrix." I agree in principle, but these measures are also secondarily impacted by the overall growth and health of the larva. The surface area data are normalized to the control but not to the size/length of each fish - the esr1 line in particular appears quite developmentally advanced in some of the images shown, which could easily explain the larger bone areas. The fact that the images in Figure S5 were not all taken at the same magnification further complicates this interpretation.

      (b) Some of the genes evaluated by RT-PCR in Figure 2 are expressed in other tissues in addition to bone (as are the candidate genes themselves); because whole-body samples were used for these assays, there is a nonzero possibility that observed changes may be rooted in other, non-skeletal cell types.

      (3) Though the assays evaluate bone development and quality at several levels, it is still difficult to synthesize all the results for a given gene into a coherent model of its requirement.

      (4) Several additional caveats to crispant analyses are worth noting:

      (a) False negatives, i.e. individual fish may not carry many (or any!) mutant alleles. The crispant individuals used for most assays here were not directly genotyped, and no control appears to have been used to confirm successful injection. The authors therefore cannot rule out that some individuals were not, in fact, mutagenized at the loci of interest, potentially due to human error. While this doesn't invalidate the results, it is worth acknowledging the limitation.

      (b) Many/most loci identified through GWAS are non-coding and not easily associated with a nearby gene. The authors should discuss whether their coding gene-focused pipeline could be applied in such cases and how that might work.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript "Crispant analysis in zebrafish as a tool for rapid functional screening of disease-causing genes for bone fragility" describes the use of CRISPR gene editing coupled with phenotyping mosaic zebrafish larvae to characterize functions of genes implicated in heritable fragile bone disorders (FBDs). The authors targeted six high-confidence candidate genes implicated in severe recessive forms of FBDs and four Osteoporosis GWAS-implicated genes and observed varied developmental phenotypes across all crispants, in addition to adult skeletal phenotypes.

      A major strength of the paper is the streamlined method that produced significant phenotypes for all candidate genes tested.

      A major weakness is a lack of new insights into underlying mechanisms that may contribute to disease phenotypes, nor any clear commonalities across gene sets. This was most evident in the qRT-PCR analysis of select skeletal developmental genes, which all showed varied changes in fold and direction, but with little insight into the implications of the results.

      Ultimately, the authors were able to show their approach is capable of connecting candidate genes with perturbation of skeletal phenotypes. It was surprising that all four GWAS candidate genes (which presumably were lower confidence) also produced a result. These authors have previously demonstrated that crispants recapitulate skeletal phenotypes of stable mutant lines for a single gene, somewhat reducing the novelty of the study.

    1. eLife Assessment

      This important work advances our understanding of trained immunity, especially in the context of Bacillus Calmette-Guérin (BCG) administration and host-pathogen interactions. The evidence supporting the conclusions is convincing, based on a combination of state-of-the-art omics techniques such as bulk and single-cell RNA sequencing with the use of JAK/STAT signaling inhibitors. The work will be of broad interest to immunologists and infection biologists.

    2. Reviewer #1 (Public review):

      Summary:

      In the submitted manuscript, Solomon et al carefully detail shifts in tissue-specific myeloid populations associated with trained immunity using intraperitoneal BCG injection as a model for induction. They define the kinetics of shifts in myeloid populations within the spleen and the transcriptional response associated with IP BCG exposure. In lineage tracing experiments, they demonstrate that tissue-resident macrophages, red-pulp macrophages (RPM) that are rapidly depleted after BCG exposure, are replenished from recruited monocytes and expansion of tissue-resident cells; they use transcriptional profiling to characterize those cells. In contrast to previous descriptions of BCG-driven immune training, they do not find BCG in the bone marrow in their model, suggesting that there is not direct training of myeloid precursor populations in the bone marrow. They then link the observed trained immunity phenotype (restriction of heterologous infection with ST) with early activation of STAT1 through IFN-γ.

      Strengths:

      The work includes careful detailing of shifts and origins of myeloid populations within tissue associated with trained immunity and is a meaningful advance for the field. Given that the temporality of exposure relative to trained immunity phenotypes is a major focus of the work, there are some additional experiments that would make the work stronger.

      Weaknesses:

      (1) The contribution of persistent BCG in spleen to the observed trained immunity phenotypes is not clear: The trained immunity phenotypes are interpreted as being driven by the early (within days) response to BCG exposure. While the fedratinib data generally support this interpretation, the authors show that BCG remains present in spleen albeit at low levels all the way out to 60 days post exposure. Given that the focus in the paper is on tissue-specific immune training, it would be helpful to know whether the ongoing presence of BCG at low levels in the profiled tissue contributes to the trained immunity phenotypes observed.

      (2) Unclear temporality of STAT1/IFN-γ requirement for the trained immunity phenotype: The data demonstrate that STAT1/IFN-γ are required at the earliest time points post-BCG exposure for trained immunity to be initiated. Related to the point about BCG above, it would be helpful to understand whether this is a specifically time-limited requirement when trained immunity is first induced, or whether ongoing signaling through this axis is required for maintenance of the observed trained immunity phenotypes.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, Solomon and colleagues demonstrate that trained immunity induced by BCG can reprogram myeloid cells within localised tissue, which can sustain prolonged protective effects. The authors further demonstrate an activation of STAT1-dependent pathways.

      Strengths:

      The main strength of this paper is the in-depth investigation of cell populations affected by BCG training, and how their transcriptome changes at different time points post-training. Through use of flow cytometry and sequencing methods, the authors identify a new cell population derived from classical monocytes. They also show that long-term trained immunity protection in the spleen is dependent on resident cells. Through sequencing, drug and recombinant inhibition of IFNg pathways, the authors reveal STAT1-dependent responses are required for changes in the myeloid population upon training, and recruitment of trained monocytes.

      Weaknesses:

      A significant amount of work has already been performed for this study. No significant weaknesses found.