10,000 Matching Annotations
  1. Jun 2025
    1. caterpillar

      caterpillar

      English Explanation

      A caterpillar is the larval stage of an insect, particularly of butterflies and moths, which belong to the order Lepidoptera. Caterpillars are known for their elongated, worm-like bodies and are often highly voracious eaters, primarily consuming leaves and other plant material. They are typically characterized by having a segmented body with a head, thorax, and abdomen, and they may have many small legs.

      Caterpillars undergo a transformation process known as metamorphosis, which involves several stages: the egg from which they hatch, the larval stage (caterpillar), the pupal stage (a chrysalis, or a pupa enclosed in a cocoon), and finally, the adult stage (butterfly or moth). This process allows the caterpillar to develop into a completely different form. During the caterpillar stage, they grow significantly before forming a pupa, where they undergo a radical transformation into their adult form. Many caterpillars are also camouflaged or possess defensive mechanisms to protect themselves from predators.

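      Chinese Explanation

      A caterpillar is the larval stage of an insect, specifically of insects in the order Lepidoptera (butterflies and moths). Caterpillars are known for their long, worm-like bodies and are usually voracious feeders that eat mainly leaves and other plant material. Their bodies are typically segmented into a head, thorax, and abdomen, and they may have many small legs.

      Caterpillars go through a transformation called metamorphosis, which has several stages: hatching from the egg, the larval stage (caterpillar), the pupal stage (chrysalis or cocoon), and finally the adult stage (butterfly or moth). This process is a striking change that lets the caterpillar develop into a completely different form. During the caterpillar stage they grow considerably, then form a pupa, within which they are thoroughly transformed into their adult form. Many caterpillars also protect themselves from predators through camouflage or defensive mechanisms.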

    2. mantises

      Mantises, commonly known as mantids or praying mantises, belong to the order Mantodea. They are characterized by their distinctive posture, which involves holding their front limbs up in a position that resembles prayer. Here are some key points about mantises:

      Physical Characteristics:

      • Appearance: Mantises have elongated bodies and a triangular head. Their large, compound eyes provide excellent vision.
      • Limbs: They are known for their raptorial forelegs, which are adapted for grasping and capturing prey.

      Behavior:

      • Predatory Nature: Mantises are carnivorous insects that primarily feed on other insects, although larger species can also prey on small vertebrates.
      • Camouflage: Many mantises can blend into their surroundings, making it easier for them to ambush prey.

      Reproduction:

      • Mating: The mating process can be risky for male mantises, as females of some species may eat the males after or during mating.
      • Egg Cases: Female mantises produce egg cases known as oothecae, which contain multiple eggs and are protected by a frothy secretion that hardens into a tough outer casing.

      Habitat:

      • Mantises can be found in a variety of environments, including gardens, forests, and grasslands, and they are distributed across most of the world except for extreme cold regions.

      Importance:

      • Mantises play a crucial role in the ecosystem as pest controllers, helping to manage populations of agricultural pests.

    1. And his trails do not fade. Several years later, his talk with a friend turns to the queer ways in which a people resist innovations, even of vital interest. He has an example, in the fact that the outranged Europeans still failed to adopt the Turkish bow. In fact he has a trail on it. A touch brings up the code book. Tapping a few keys projects the head of the trail. A lever runs through it at will, stopping at interesting items, going off on side excursions. It is an interesting trail, pertinent to the discussion. So he sets a reproducer in action, photographs the whole trail out, and passes it to his friend for insertion in his own memex, there to be linked into the more general trail.

      his trails do not fade

    1. head over heels.

      head over heels.

      The phrase "head over heels" refers to someone who is deeply in love or infatuated, often to the point of being irrational. In the context provided, it suggests that George has strong feelings for someone, which could explain why he hasn't shared the news previously.

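      In this context, "head over heels" describes a person who has fallen deeply in love with, or become infatuated with, someone, often to an irrational degree. In the dialogue, the phrase implies that George has strong feelings for someone, which also explains why he had not shared the news earlier.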

    1. Google has become nearly unusable for me because of those AI-generated summaries, and I'd hate to see Wikipedia head in the same direction.

      I'm not 100% sure, but I think the way Google and Wikipedia would use it is different.

      The proposed use here in Wikipedia directly summarizes the content of the article where you are reading the summary.

      Whereas in the Google case it may be using knowledge embedded in the model itself, plus summarizing some choice among the search results.

    1. Reviewer #2 (Public review):

      This paper introduces a framework for modeling individual differences in decision-making by learning a low-dimensional representation (the "individuality index") from one task and using it to predict behaviour in a different task. The approach is evaluated on two types of tasks: a sequential value-based decision-making task and a perceptual decision task (MNIST). The model shows improved prediction accuracy when incorporating this learned representation compared to baseline models.

      The motivation is solid, and the modelling approach is interesting, especially the use of individual embeddings to enable cross-task generalization. That said, several aspects of the evaluation and analysis could be strengthened.

      (1) The MNIST SX baseline appears weak. RTNet isn't directly comparable in structure or training. A stronger baseline would involve training the GRU directly on the task without using the individuality index (e.g., by fixing the decoder head). This would provide a clearer picture of what the index contributes.

      (2) Although the focus is on prediction, the framework could offer more insight into how behaviour in one task generalizes to another. For example, simulating predicted behaviours while varying the individuality index might help reveal what behavioural traits it encodes.

      (3) It's not clear whether the model can reproduce human behaviour when acting on-policy. Simulating behaviour using the trained task solver and comparing it with actual participant data would help assess how well the model captures individual decision tendencies.

      (4) Figures 3 and S1 aim to show that individuality indices from the same participant are closer together than those from different participants. However, this isn't fully convincing from the visualizations alone. Including a quantitative presentation would help support the claim.

      (5) The transfer scenarios are often between very similar task conditions (e.g., different versions of MNIST or two-step vs three-step MDP). This limits the strength of the generalization claims. In particular, the effects in the MNIST experiment appear relatively modest, and the transfer is between experimental conditions within the same perceptual task. To better support the idea of generalizing behavioural traits across tasks, it would be valuable to include transfers across more structurally distinct tasks.

      (6) For both experiments, it would help to show basic summaries of participants' behavioural performance. For example, in the MDP task, first-stage choice proportions based on transition types are commonly reported. These kinds of benchmarks provide useful context.

      (7) For the MDP task, consider reporting the number or proportion of correct choices in addition to negative log-likelihood. This would make the results more interpretable.

      (8) In Figure 5, what is the difference between the "% correct" and "% match to behaviour"? If these are distinct measures, it would help to clarify the distinction in the text or figure captions.

      (9) For the cognitive model, it would be useful to report the fitted parameters (e.g., learning rate, inverse temperature) per individual. This can offer insight into what kinds of behavioural variability the individuality index might be capturing.

      (10) A few of the terms and labels in the paper could be made more intuitive. For example, the name "individuality index" might give the impression of a scalar value rather than a latent vector, and the labels "SX" and "SY" are somewhat arbitrary. You might consider whether clearer or more descriptive alternatives would help readers follow the paper more easily.

      (11) Please consider including training and validation curves for your models. These would help readers assess convergence, overfitting, and general training stability, especially given the complexity of the encoder-decoder architecture.
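
      As a hedged illustration of point (7), the sketch below computes a mean negative log-likelihood next to an accuracy-style summary for binary choices. The function names, the 0.5 threshold, and the toy probabilities are assumptions made for illustration; they are not code or values from the paper, and the paper's own definition of "correct" may differ.

```python
import math

def mean_nll(probs_of_chosen):
    """Mean negative log-likelihood of the observed choices.

    probs_of_chosen[i] is the model's predicted probability of the action
    the participant actually took on trial i (assumed to be > 0).
    """
    return -sum(math.log(p) for p in probs_of_chosen) / len(probs_of_chosen)

def proportion_matched(probs_of_chosen, threshold=0.5):
    """Fraction of binary-choice trials on which the model's most likely
    action is the one the participant took (predicted probability > threshold)."""
    hits = sum(1 for p in probs_of_chosen if p > threshold)
    return hits / len(probs_of_chosen)

# Hypothetical predicted probabilities of the chosen action on five trials.
probs = [0.9, 0.6, 0.4, 0.8, 0.55]
print(round(mean_nll(probs), 3))
print(proportion_matched(probs))
```

      The two numbers answer different questions: the log-likelihood rewards well-calibrated confidence, while the matched proportion only counts whether the most likely action agrees with the participant.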

    1. Then we noticed that in the second pillow was the indentation of a head. One of us lifted something from it, and leaning forward, that faint and invisible dust dry and acrid in the nostrils, we saw a long strand of iron-gray hair

      We now know that Emily had in fact killed Homer. In a very twisted way I understand her. She wanted nothing more than to be loved and made sure she had the one she wanted. Got to love a girl who knows what she wants.

    1. Take your dog to play in the water, and what’s the first thing she’ll do when she gets out? Likely shake her head and shoulders in a semicircular motion, flinging fat water droplets onto everything in the vicinity.

      I can relate to this from personal experience!

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The authors report a study on how stimulation of the receptive-field surround of V1 and LGN neurons affects their firing rates. Specifically, they examine stimuli in which a grey patch covers the classical RF of the cell and a stimulus appears in the surround. Using a number of different stimulus paradigms they find a long latency response in V1 (but not the LGN) which does not depend strongly on the characteristics of the surround grating (drifting vs static, continuous vs discontinuous, predictable grating vs unpredictable pink noise). They find that population responses to simple achromatic stimuli have a different structure that does not distinguish so clearly between the grey patch and other conditions, and the latency of the response was similar regardless of whether the center or surround was stimulated by the achromatic surface. Taken together they propose that the surround-response is related to the representation of the grey surface itself. They relate their findings to previous studies that have put forward the concept of an ‘inverse RF’ based on strong responses to small grey patches on a full-screen grating. They also discuss their results in the context of studies that suggest that surround responses are related to predictions of the RF content or figure-ground segregation.

      Strengths:

      I find the study to be an interesting extension of the work on surround stimulation and the addition of the LGN data is useful showing that the surround-induced responses are not present in the feedforward path. The conclusions appear solid, being based on large numbers of neurons obtained through Neuropixels recordings. The use of many different stimulus combinations provides a rich view of the nature of the surround-induced responses.

      Weaknesses:

      The statistics are pooled across animals, which is less appropriate for hierarchical data. There is no histological confirmation of placement of the electrode in the LGN and there is no analysis of eye or face movements which may have contributed to the surround-induced responses. There are also some missing statistics and methods details which make interpretation more difficult.

      We thank the reviewer for their positive and constructive comments, and have addressed these specific issues in response to the minor comments. For the statistics across animals, we refer to “Reviewer 1 recommendations” point 1. For the histological analysis, we refer to “Reviewer 1 recommendations point 2”. For the eye and facial movements, we refer to “Reviewer 1 recommendations point 5”. Concerning missing statistics and methods details, we refer to various responses to “Reviewer 1 recommendations”. We thoroughly reviewed the manuscript and included all missing statistical and methodological details.

      Reviewer #2 (Public review):

      Cuevas et al. investigate the stimulus selectivity of surround-induced responses in the mouse primary visual cortex (V1). While classical experiments in non-human primates and cats have generally demonstrated that stimuli in the surround receptive field (RF) of V1 neurons only modulate activity to stimuli presented in the center RF, without eliciting responses when presented in isolation, recent studies in mouse V1 have indicated the presence of purely surround-induced responses. These have been linked to prediction error signals. In this study, the authors build on these previous findings by systematically examining the stimulus selectivity of surround-induced responses.

      Using neuropixels recordings in V1 and the dorsal lateral geniculate nucleus (dLGN) of head-fixed, awake mice, the authors presented various stimulus types (gratings, noise, surfaces) to the center and surround, as well as to the surround only, while also varying the size of the stimuli. Their results confirm the existence of surround-induced responses in mouse V1 neurons, demonstrating that these responses do not require spatial or temporal coherence across the surround, as would be expected if they were linked to prediction error signals. Instead, they suggest that surround-induced responses primarily reflect the representation of the achromatic surface itself.

      The literature on center-surround effects in V1 is extensive and sometimes confusing, likely due to the use of different species, stimulus configurations, contrast levels, and stimulus sizes across different studies. It is plausible that surround modulation serves multiple functions depending on these parameters. Within this context, the study by Cuevas et al. makes a significant contribution by exploring the relationship between surround-induced responses in mouse V1 and stimulus statistics. The research is meticulously conducted and incorporates a wide range of experimental stimulus conditions, providing valuable new insights regarding center-surround interactions.

      However, the current manuscript presents challenges in readability for both non-experts and experts. Some conclusions are difficult to follow or not clearly justified.

      I recommend the following improvements to enhance clarity and comprehension:

      (1) Clearly state the hypotheses being tested at the beginning of the manuscript.

      (2) Always specify the species used in referenced studies to avoid confusion (esp. Introduction and Discussion).

      (3) Briefly summarize the main findings at the beginning of each section to provide context.

      (4) Clearly define important terms such as “surface stimulus” and “early vs. late stimulus period” to ensure understanding.

      (5) Provide a rationale for each result section, explaining the significance of the findings.

      (6) Offer a detailed explanation of why the results do not support the prediction error signal hypothesis but instead suggest an encoding of the achromatic surface.

      These adjustments will help make the manuscript more accessible and its conclusions more compelling.

      We thank the reviewer for their constructive feedback and for highlighting the need for improved clarity regarding the hypotheses and their relation to the experimental findings.

      • We have strongly improved the Introduction and Discussion section, explaining the different hypotheses and their relation to the performed experiments.

      • In the Introduction, we have clearly outlined each hypothesis and its predictions, providing a structured framework for understanding the rationale behind our experimental design.

      • In the Discussion, we have been more explicit in explaining how the experimental findings inform these hypotheses.

      • We explicitly mentioned the species used in the referenced studies.

      • We provided a clearer rationale for each experiment in the Results section.

      We now consistently state the species used in previous studies, in both the Introduction and the Discussion.

      Reviewer #3 (Public review):

      Summary:

      This paper explores the phenomenon whereby some V1 neurons can respond to stimuli presented far outside their receptive field. It introduces three possible explanations for this phenomenon and it presents experiments that it argues favor the third explanation, based on figure/ground segregation.

      Strengths:

      I found it useful to see that there are three possible interpretations of this finding (prediction error, interpolation, and figure/ground). I also found it useful to see a comparison with LGN responses and to see that the effect there is not only absent but actually the opposite: stimuli presented far outside the receptive field suppress rather than drive the neurons. Other experiments presented here may also be of interest to the field.

      Weaknesses:

      The paper is not particularly clear. I came out of it rather confused as to which hypotheses were still standing and which hypotheses were ruled out. There are numerous ways to make it clearer.

      We thank the reviewer for their constructive feedback and for highlighting the need for improved clarity regarding the hypotheses and their relation to the experimental findings.

      • We have strongly improved the Introduction and Discussion section, explaining the different hypotheses and their relation to the performed experiments.

      • In the Introduction, we have clearly outlined each hypothesis and its predictions, providing a structured framework for understanding the rationale behind our experimental design.

      • In the Discussion, we have been more explicit in explaining how the experimental findings inform these hypotheses.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations for the Authors):

      (1) Given the data is hierarchical with neurons clustered within 6 mice (how many recording sessions per animal?) I would recommend the use of Linear Mixed Effects models. Simply pooling all neurons increases the risk of false alarms.

      To clarify: We used the standard method for analyzing single-unit recordings, by comparing the responses of a population of single neurons between two different conditions. This means that the responses of each single neuron were measured in the different conditions, and the statistics were therefore based on the pairwise differences computed for each neuron separately. This is a common and standard procedure in systems neuroscience, and was also used in the previous studies on this topic (Keller et al., 2020; Kirchberger et al., 2023). We were not concerned with comparing two groups of animals, for which hierarchical analyses are recommended. To address the reviewer’s concern, we did examine whether differences between baseline and the gray/drift condition, as well as the gray/drift compared to the grating condition, were consistent across sessions, which was indeed the case. These findings are presented in Supplementary Figure 6.

      (2) Line 432: “The study utilized three to eight-month-old mice of both genders”. This is confusing, I assume they mean six mice in total, please restate. What about the LGN recordings, were these done in the same mice? Can the authors please clarify how many animals, how many total units, how many included units, how many recording sessions per animal, and whether the same units were recorded in all experiments?

      We have now clarified the information regarding the animals used in the Methods section.

      • We state that “We included female and male mice (C57BL/6), a total of six animals for V1 recordings between three and eight months old. In two of those animals, we recorded simultaneously from LGN and V1.”

      • We state that “For each animal, we recorded around 2-3 sessions from each hemisphere, and we recorded from both hemispheres.”

      • We noted that the number of neurons was not mentioned in each figure caption. We apologize for this omission. We have now added the number for all of the figures and protocols to the revised manuscript. We note that the same neurons were recorded for the different conditions within each protocol; however, because a few sessions were short, we recorded more units for the grating protocol. Note that we did not make statistical comparisons between protocols.

      (3) I see no histology for confirmation of placement of the electrode in the LGN, how can they be sure they were recording from the LGN? There is also little description of the LGN experiments in the methods.

      For better clarity, we have included a reconstruction of the electrode track from histological sections of one animal post-experiment (Figure S4). The LGN was targeted via stereotactical surgery, and the visual responses in this area are highly distinct. In addition, we used a flash protocol to identify the early-latency responses typical for the LGN, which is described in the Methods section: “A flash stimulus was employed to confirm the locations of LGN at the beginning of the recording sessions, similar to our previous work in which we recorded from LGN and V1 simultaneously (Schneider et al., 2023). This stimulus consisted of a 100 ms white screen and a 2 s gray screen as the inter-stimulus interval, designed to identify visually responsive areas. The responses of multi-unit activity (MUA) to the flash stimulus were extracted and a CSD analysis was then performed on the MUA, sampling every two channels. The resulting CSD profiles were plotted to identify channels corresponding to the LGN. During LGN recordings, simultaneous recordings were made from V1, revealing visually responsive areas interspersed with non-responsive channels.”

      (4) Many statements are not backed up by statistics, for example, each time the authors report that the response at 90° is higher than baseline (Line 121 amongst other places) there is no test to support this. Also Line 140 (negative correlation), Line 145, Line 180.

      For comparison purposes, we only presented statistical analyses across conditions. However, we have now added information to the figure captions stating that all conditions show values higher than the baseline.

      (5) As far as I can see there is no analysis of eye movements or facial movements. This could be an issue, for example, if the onset of the far surround stimuli induces movements this may lead to spurious activations in V1 that would be interpreted as surround-induced responses.

      To address this point, we have included a supplementary figure analyzing facial movements across different sessions and comparing them between conditions (Supplementary Figure 5). A detailed explanation of this analysis has been added to the Methods section. Overall, we observed no significant differences in face movements between trials with gratings, trials with the gray patch, and trials with the gray screen presented during baseline. Animals exhibited similar face movements across all three conditions, supporting the conclusion that the observed neural firing rate increases for the gray-patch condition are not related to face movements.

      (6) The experiments with the rectangular patch (Figure 3) seem to give a slightly different result as the responses for large sizes (75, 90) don’t appear to be above baseline. This condition is also perceptually the least consistent with a grey surface in the RF, the grey patch doesn’t appear to occlude the surface in this condition. I think this is largely consistent with their conclusions and it could merit some discussion in the results/discussion section.

      While the effect is maybe a bit weaker, the total surround stimulated also covers a smaller area because of the large rectangular gray patch. Furthermore, the early responses are clearly elevated above baseline, and the responses up to 70 degrees are still higher than baseline. Hence we think this data point for 90 degrees does not warrant a strong interpretation.

      Minor points:

      (1) Figure 1h: What is the statistical test reported in the panel (I guess a signed rank based on later figures)? Figure 4d doesn’t appear to be significantly different but is reported as so. Perhaps the median can be indicated on the distribution?

      We explained that we used a signed rank test for Figure 1h and now included the median of the distributions in Figure 4d.

      (2) What was the reason for having the gratings only extend to half the x-axis of the screen, rather than being full-screen? This creates a percept (in humans at least) that is more consistent with the grey patch being a hole in the grating as the grey patch has the same luminance as the background outside the grating.

      We explained in the Methods section that “We presented only half of the x-axis due to the large size of our monitor, in order to avoid over-stimulation of the animals with very large grating stimuli.” Perceptually speaking, the gray patch appears as something occluding the grating, not as a “hole”.

      (3) Line 103: “and, importantly, had less than 10° (absolute) distance to the grating stimulus’ RF center.” Re-phrase, a stimulus doesn’t have an RF center.

      We corrected this to “We included only single units into the analysis that met several criteria in terms of visual responses (see Methods) and, importantly, the RF center had less than 10° (absolute) distance to the grating stimulus’ center.”

      (4) Line 143: “We recorded single neurons LGN” - should be “single LGN neurons”.

      We corrected this to “we recorded single LGN neurons”.

      (5) Line 200: They could spell out here that the latency is consistent with the latency observed for the grey patch conditions in the previous experiments.

      (6) Line 465: This is very brief. What criteria did they use for single-unit assignation? Were all units well-isolated or were multi-units included?

      We clarified in the Methods section that “We isolated single units with Kilosort 2.5 (Steinmetz et al., 2021) and manually curated them with Phy2 (Rossant et al., 2021). We included only single units with a maximum contamination of 10 percent.”

      (7) Line 469: “The experiment was run on a Windows 10”. Typo.

      We corrected this to “The experiment was run on Windows 10”.

      (8) Line 481: “We averaged the response over all trials and positions of the screen”. What do they mean by ‘positions of the screen’?

      We changed this to “We computed the response for each position separately, by averaging the response across all the trials where a square was presented at a given position.”

      (9) Line 483: “We fitted an ellipse in the center of the response”. How?

      We additionally explain how we performed the detection of the RF using an ellipse fit: “A heatmap of the response was computed. This heatmap was then smoothed, and we calculated the location of the peak response. From the heatmap we calculated the centroid of the response using the function regionprops.m that finds unique objects, we then selected the biggest area detected. Using the centroids provided as output. We then fitted an ellipse centered on this peak response location to the smoothed heatmap using the MATLAB function ellipse.m.”

      (10) Line 485: “...and positioned the stimulus at the response peak previously found”. Unclear wording, do you mean the center of the ellipse fit to the MUA response averaged across channels or something else?

      (11) Line 487: “We performed a permutation test of the responses inside the RF detected vs a circle from the same area where the screen was gray for the same trials.”. The wording is a bit unclear here, can they clarify what they mean by the ‘same trials’, what is being compared to what here?

      We used a permutation test to compare the neuron’s responses to black and white squares inside the RF to the condition where there was no square in the RF (i.e. the RF was covered by the gray background).
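
      A minimal sketch of such a comparison is given below. It assumes a two-sided test on the difference of mean responses between the two conditions; the response does not state the test statistic or the number of permutations, so both, along with the function name and the toy firing rates, are illustrative assumptions rather than the authors' implementation.

```python
import random

def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference of means of a and b.

    Condition labels are shuffled n_perm times; the p-value is the share of
    shuffles whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    n_b = len(pooled) - n_a
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (count + 1) / (n_perm + 1)

# Hypothetical firing rates (spikes/s): square inside the RF vs gray background.
square_in_rf = [12, 15, 14, 16, 13, 17, 15, 14]
gray_in_rf = [5, 6, 4, 7, 5, 6, 5, 4]
print(permutation_test(square_in_rf, gray_in_rf))
```

      A neuron would then count as visually responsive at that location if the p-value falls below a chosen threshold, which is likewise not specified here.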

      (12) Was the pink noise background regenerated on each trial or as the same noise pattern shown on each trial?

      We explain that “We randomly presented one of two different pink noise images.”

      (13) Line 552: “...used a time window of the Gaussian smoothing kernel from -.05 to .05”. Missing units.

      We explained that “we used a time window of the Gaussian smoothing kernel from -.05 s to .05 s, with a standard deviation of 0.0125 s.”
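
      The kernel described above can be sketched as follows. The window (-0.05 s to 0.05 s) and the standard deviation (0.0125 s) come from the response; the 1 ms bin width, the zero-padded edges, and the function names are assumptions made for illustration.

```python
import math

def gaussian_kernel(half_width_s=0.05, sd_s=0.0125, dt_s=0.001):
    """Gaussian sampled from -half_width_s to +half_width_s, normalised to sum to 1."""
    n = int(round(half_width_s / dt_s))
    times = [k * dt_s for k in range(-n, n + 1)]
    weights = [math.exp(-0.5 * (t / sd_s) ** 2) for t in times]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(spike_counts, kernel):
    """Convolve binned spike counts with a symmetric kernel.

    Output has the same length as the input; bins beyond the edges are
    treated as zero.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(spike_counts)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(spike_counts):
                acc += spike_counts[idx] * w
        out.append(acc)
    return out

dt = 0.001                      # assumed bin width in seconds
kernel = gaussian_kernel(dt_s=dt)
spikes = [0] * 301
spikes[150] = 1                 # a single spike in the middle of the trial
rate_hz = [v / dt for v in smooth(spikes, kernel)]
print(len(kernel), round(max(rate_hz), 2))
```

      Because the kernel sums to one, smoothing spreads each spike over neighbouring bins without changing the total spike count away from the edges.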

      (14) Line 565: “Additionally, for the occluded stimulus, we included patch sizes of 70° and larger.”. Not sure what they’re referring to here.

      We changed this to: “For the population analyses, we analyzed the conditions in which the gray patch sizes were 70 degrees and 90 degrees”.

      (15) Line 569: What is perplexity, and how does changing it affect the t-SNE embeddings?

      Note that t-SNE is only used for visualization purposes. In the revised manuscript, we have expanded our explanation regarding the use of t-SNE and the choice of perplexity values. Specifically, we have clarified that we used a perplexity value of 20 for the Gratings with circular and rectangular occluders and 100 for the black-and-white condition. These values were empirically selected to ensure that the groups in the data were clearly separable while maintaining the balance between local and global relationships in the projected space. This choice allowed us to visually distinguish the different groups while preserving the meaningful structure encoded in the dissimilarity matrices. In particular, varying the perplexity values would not alter the conclusions drawn from the visualization, as t-SNE does not affect the underlying analytical steps of our study.

      (16) Line 572: “We trained a C-Support Vector Classifier based on dissimilarity matrices”. This is overly brief, please describe the construction of the dissimilarity matrices and how the training was implemented. Was this binary, multi-class? What conditions were compared exactly?

      In the revised manuscript, we have expanded our explanation regarding the construction of the dissimilarity matrices and the implementation of the C-Support Vector Classification (C-SVC) model (See Methods section).

      The dissimilarity matrices were calculated using the Euclidean distance between firing rate vectors for all pairs of trials (as shown in Figure 6a-b). These matrices were used directly as input for the classifier. It is important to note that t-SNE was not used for classification but only for visualization purposes. The classifier was binary, distinguishing between two classes (e.g., Dr vs St). We trained the model using 60% of the data for training and used 40% for testing. The C-SVC was implemented using sklearn, and the classification score corresponds to the average accuracy across 20 repetitions.
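
      The first step, the dissimilarity matrix, can be sketched as below. The sketch assumes each trial is a vector of firing rates with one entry per neuron; the function name and the toy data are hypothetical, and the classifier itself (the C-SVC fitted with sklearn on a 60/40 split) is deliberately not reproduced here.

```python
import math

def dissimilarity_matrix(trials):
    """Pairwise Euclidean distances between per-trial firing-rate vectors.

    trials is a list of equal-length sequences (one value per neuron);
    the result is a symmetric n_trials x n_trials matrix with a zero diagonal.
    """
    n = len(trials)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(trials[i], trials[j])
            d[i][j] = d[j][i] = dist
    return d

# Hypothetical firing rates for four trials of three neurons.
trials = [
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [0.0, 0.0, 2.0],
    [3.0, 4.0, 2.0],
]
for row in dissimilarity_matrix(trials):
    print([round(v, 2) for v in row])
```

      Each row of the matrix describes one trial by its distances to every other trial, so trials from the same condition should show similar rows if the population response separates the conditions.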

      Reviewer #2 (Recommendations for the Authors):

      The relationship between the current paper and Keller et al. is challenging to understand. It seems like the study is critiquing the previous study but rather implicitly and not directly. I would suggest either directly stating the criticism or presenting the current study as a follow-up investigation that further explores the observed effect or provides an alternative function. Additionally, defining the inverse RF versus surround-induced responses earlier than in the discussion would be beneficial. Some suggestions:

      (1) The introduction is well-written, but it would be helpful to clearly define the hypotheses regarding the function of surround-induced responses and revisit these hypotheses one by one in the results section.

      Indeed, we have generally improved the Introduction of the manuscript, and stated the hypotheses and their relationships to the Experiments more clearly.

      (2) Explicitly mention how you compare classic grating stimuli of varying sizes with gray patch stimuli. Do the patch stimuli all come with a full-field grating? For the full-field grating, you have one size parameter, while for the patch stimuli, you have two (size of the patch and size of the grating).

      We now clearly describe how we compare grating stimuli of varying sizes with gray patch stimuli.

      (3) The third paragraph in the introduction reads more like a discussion and might be better placed there.

      We have moved content from the third paragraph of the Introduction to the Discussion, where it fits more naturally.

      (4) Include 1-2 sentences explaining how you center RFs and detail the resolution of your method.

      We have added an explanation to the Methods: “To center the visual stimuli during the recording session, we averaged the multiunit activity across the responsive channels and positioned the stimulus at the center of the ellipse fit to the MUA response averaged across channels.”.

      (5) Motivate the use of achromatic stimuli. This section is generally quite hard to understand, so try to simplify it.

      We now explain more clearly in the Introduction why we performed this particular experiment.

      (6) The decoding analysis is great, but it is somewhat difficult to understand the most important results. Consider summarizing the key findings at the beginning of this section.

      We now provide a clearer motivation at the start of the Decoding section.

      Reviewer #3 (Recommendations for the Authors):

      I have a few suggestions to improve the clarity of the presentation.

      Abstract: it lists a series of observations and it ends with a conclusion (“based on these findings...”). However, it provides little explanation for how this conclusion would arise from the observations. It would be more helpful to introduce the reasoning at the top and show what is consistent with it.

      We have revised the abstract of the paper to incorporate this feedback.

      To some extent, this applies to Results too. Sometimes we are shown the results of some experiment just because others have done a similar experiment. Would it be better to tell us which hypotheses it tests and whether the results are consistent with all 3 hypotheses or might rule one or more out? I came out of the paper rather confused as to which hypotheses were still standing and which hypotheses were ruled out.

      We have strongly improved our explanation of the hypotheses and the relationships to the experiments in the Introduction.

      It would be best if the Results section focused on the results of the study, without much emphasis on what previous studies did or did not measure. Here, instead, in the middle of Results we are told multiple times what Keller et al. (2020) did or did not measure, and what they did or did not find. Please focus on the questions and on the results. Where they agree or disagree with previous papers, tell us briefly that this is the case.

      We have revised the Results section in the revised manuscript, and ensured that there is much less focus on what previous studies did in the Results. Differences to previous work are now discussed in the Discussion section.

      The notation is extremely awkward. For instance “Gc” stands for two words (Gray center) but “Gr” stands for a single word (Grating). The double meaning of G is one of many sources of confusion.

      This notation needs to be revised. Here is one way to make it simpler: choose one word for each type of stimulus (e.g. Gray, White, Black, Drift, Stat, Noise) and use it without abbreviations. To indicate the configuration, combine two of those words (e.g. Gray/Drift for Gray in the center and Drift in the surround).

      We have corrected the notation in the figures and text to enhance readability and improve the reader’s understanding.

      Figure 1e and many subsequent ones: it is not clear why the firing rate is shown in a logarithmic scale. Why not show it in a linear scale? Anyway, if the logarithmic scale is preferred for some reason, then please give us ticks at numbers that we can interpret, like 0.1,1,10,100... or 0.5,1,2,4... Also, please use the same y-scale across figures so we can compare.

      To clarify: it is necessary to normalize the firing rates relative to baseline in order to pool across neurons. However, such a divisive normalization is by itself problematic: a doubling (a change from 1 to 2) and a halving (a change from 1 to 0.5) are equivalent relative changes, yet they appear at unequal distances from 1 on a linear scale. Furthermore, such a division is highly sensitive to outliers. For these reasons, taking the logarithm (base 10) of the ratio is an appropriate transformation. We changed the tick labels to 1, 2, 4, as the reviewer suggested.
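
      The argument for the log transform can be checked with a few lines of arithmetic. The sketch below is illustrative only: the baseline and firing-rate values are hypothetical, and it assumes a strictly positive baseline and rate so that the ratio and its logarithm are defined.

```python
import math


def log_ratio(rate, baseline):
    """Base-10 logarithm of the firing rate relative to baseline."""
    return math.log10(rate / baseline)


baseline = 4.0  # hypothetical baseline rate in spikes/s
doubled = log_ratio(8.0, baseline)  # ratio of 2
halved = log_ratio(2.0, baseline)   # ratio of 0.5

# On a linear ratio scale the two changes sit at unequal distances from 1 ...
linear_up = 8.0 / baseline - 1.0
linear_down = 2.0 / baseline - 1.0

# ... while in log space they are equal in size and opposite in sign.
symmetric = math.isclose(doubled, -halved)

# Axis positions for tick labels given as interpretable ratios.
tick_labels = [1, 2, 4]
tick_positions = [math.log10(t) for t in tick_labels]
```

      Placing ticks at the logarithm of round ratios, as in `tick_positions`, keeps the log-scaled axis while letting the labels be read directly as fold changes.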

      Figure 3: it is not clear what “size” refers to in the stimuli where there is no gray center. Is it the horizontal size of the overall stimulus? Some cartoons might help. Or just some words to explain.

      Figure 3: if my understanding of “size” above is correct, the results are remarkable: there is no effect whatsoever of replacing the center stimulus with a gray rectangle. Shouldn’t this be remarked upon?

      We have added a paragraph under Figure 3 and in the Methods section explaining that the sizes represent the varying horizontal dimensions of the rectangular patch. In this protocol, the classical condition (i.e. without a gray patch) was shown only as full-field gratings, which is depicted in the plot as size 0, indicating that no rectangular patch was present.

      DETAILS

      The word “achromatic” appears many times in the paper and is essentially uninformative (all stimuli in this study are achromatic, including the gratings). It could be removed in most places except a few, where it is actually used to mean “uniform”. In those cases, it should be replaced by “uniform”.

      Ditto for the word “luminous”, which appears twice and has no apparent meaning. Please replace it with “uniform”.

      We have replaced the words achromatic and luminous with “uniform” stimuli to improve the clarity when we refer to only black or white stimuli.

      Page 3, line 70: “We raise some important factors to consider when describing responses to only surround stimulation.” This sentence might belong in the Discussion but not in the middle of a paragraph of Results.

      We removed this sentence.

      Neuropixel - Neuropixels (plural)

      “area LGN” - LGN

      We corrected these misspellings.

      References

      Keller, A.J., Roth, M.M., Scanziani, M., 2020. Feedback generates a second receptive field in neurons of the visual cortex. Nature 582, 545–549. doi:10.1038/s41586-020-2319-4.

      Kirchberger, L., Mukherjee, S., Self, M.W., Roelfsema, P.R., 2023. Contextual drive of neuronal responses in mouse V1 in the absence of feedforward input. Science Advances 9, eadd2498. doi:10.1126/sciadv.add2498.

      Rossant, C., et al., 2021. phy: Interactive analysis of large-scale electrophysiological data. https://github.com/cortex-lab/phy.

      Schneider, M., Tzanou, A., Uran, C., Vinck, M., 2023. Cell-type-specific propagation of visual flicker. Cell Reports 42.

      Steinmetz, N.A., Aydin, C., Lebedeva, A., Okun, M., Pachitariu, M., Bauza, M., Beau, M., Bhagat, J., Böhm, C., Broux, M., Chen, S., Colonell, J., Gardner, R.J., Karsh, B., Kloosterman, F., Kostadinov, D., Mora-Lopez, C., O’Callaghan, J., Park, J., Putzeys, J., Sauerbrei, B., van Daal, R.J.J., Vollan, A.Z., Wang, S., Welkenhuysen, M., Ye, Z., Dudman, J.T., Dutta, B., Hantman, A.W., Harris, K.D., Lee, A.K., Moser, E.I., O’Keefe, J., Renart, A., Svoboda, K., Häusser, M., Haesler, S., Carandini, M., Harris, T.D., 2021. Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings. Science 372, eabf4588. doi:10.1126/science.abf4588.

    1. The universe emerged at the moment of the Big Bang, 13.8 billion years ago. This is our starting point for time

      I still can’t wrap my head around the universe having a start. I always wondered what it was before then if there even was anything.

    1. Reviewer #1 (Public review):

      This study is part of an ongoing effort to clarify the effects of cochlear neural degeneration (CND) on auditory processing in listeners with normal audiograms. This effort is important because ~10% of people who seek help for hearing difficulties have normal audiograms and current hearing healthcare has nothing to offer them.

      The authors identify two shortcomings in previous work that they intend to fix. The first is a lack of cross-species studies that make direct comparisons between animal models in which CND can be confirmed and humans for which CND must be inferred indirectly. The second is the low sensitivity of purely perceptual measures to subtle changes in auditory processing. To fix these shortcomings, the authors measure envelope following responses (EFRs) in gerbils and humans using the same sounds, while also performing histological analysis of the gerbil cochleae, and testing speech perception while measuring pupil size in the humans.

      The study begins with a comprehensive assessment of the hearing status of the human listeners. The only differences found between the young adult (YA) and middle aged (MA) groups are in thresholds at frequencies > 10 kHz and DPOAE amplitudes at frequencies > 5 kHz. The authors then present the EFR results, first for the humans and then for the gerbils, showing that amplitudes decrease more rapidly with increasing envelope frequency for MA than for YA in both species. The histological analysis of the gerbil cochleae shows that there were, on average, 20% fewer IHC-AN synapses at the 3 kHz place in MA relative to YA, and the number of synapses per IHC was correlated with the EFR amplitude at 1024 Hz.

      The study then returns to the humans to report the results of the speech perception tests and pupillometry. The correct understanding of keywords decreased more rapidly with decreasing SNR in MA than in YA, with a noticeable difference at 0 dB, while pupillary slope (a proxy for listening effort) increased more rapidly with decreasing SNR for MA than for YA, with the largest differences at SNRs between 5 and 15 dB. Finally, the authors report that a linear combination of audiometric threshold, EFR amplitude at 1024 Hz, and a few measures of pupillary slope is predictive of speech perception at 0 dB SNR.

      I only have two questions/concerns about the specific methodologies used:

      (1) Synapse counts were made only at the 3 kHz place on the cochlea. But the EFR sounds were presented at 85 dB SPL, which means that a rather large section of the cochlea will actually be excited. Do we know how much of the EFR actually reflects AN fibers coming from the 3 kHz place? And are we sure that this is the same for gerbils and humans given the differences in cochlear geometry, head size, etc.?

      [Note added after revision: the authors have added new data, references, and discussion that have answered my initial questions].

      (2) Unless I misunderstood, the predictive power of the final model was not tested on held-out data. The standard way to fit and test such a model would be to split the data into two segments, one for training and hyperparameter optimization, and one for testing. But it seems that the only split was for training and hyperparameter optimization.

      [Note added after revision: the authors now make it clear in their response that the modeling tells us how much of the current data can be explained but not necessarily about generalization to other datasets.]

      While I find the study to be generally well executed, I am left wondering what to make of it all. The purpose of the study with respect to fixing previous methodological shortcomings was clear, but exactly how fixing these shortcomings has allowed us to advance is not. I think we can be more confident than before that EFR amplitude is sensitive to CND, and we now know that measures of listening effort may also be sensitive to CND. But where is this leading us?

      I think what this line of work is eventually aiming for is to develop a clinical tool that can be used to infer someone's CND profile. That seems like a worthwhile goal but getting there will require going beyond exploratory association studies. I think we're ready to start being explicit about what properties a CND inference tool would need to be practically useful. I have no idea whether the associations reported in this study are encouraging or not because I have no idea what level of inferential power is ultimately required.

      [Note added after revision: the authors have added to the Discussion to put their work into a broader perspective.]

      That brings me to my final comment: there is an inappropriate emphasis on statistical significance. The sample size was chosen arbitrarily. What if the sample had been half the size? Then few, if any, of the observed effects would have been significant. What if the sample had been twice the size? Then many more of the observed effects would have been significant (particularly for the pupillometry). I hope that future studies will follow a more principled approach in which relevant effect sizes are pre-specified (ideally as the strength of association that would be practically useful) and sample sizes are determined accordingly.

      [Note added after revision: my intention with this comment was not to make a philosophical or nitty-gritty point about statistics. It was more of a follow on to the previous point. Because I don't know what sort of effect size is big enough to matter (for whatever purpose), I don't find the statistical significance (or lack thereof) of the effect size observed to be informative. But I don't think there is anything more that the authors can or should do in this regard.]
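
      The dependence of statistical significance on sample size raised in this comment can be illustrated with a short calculation. The sketch below is not drawn from the study: the effect size and group sizes are hypothetical, and it uses a normal approximation to a two-sample comparison with equal group sizes and unit variance.

```python
import math


def two_sample_p_value(effect_size, n_per_group):
    """Two-sided p-value for a standardized mean difference (normal approximation)."""
    z = effect_size * math.sqrt(n_per_group / 2.0)
    return math.erfc(abs(z) / math.sqrt(2.0))


effect_size = 0.5  # hypothetical standardized difference, not taken from the study
p_values = {n: two_sample_p_value(effect_size, n) for n in (16, 32, 64)}
```

      With the effect size held fixed, the p-value falls below 0.05 only once the group size reaches 32 in this example, and it keeps shrinking as the sample grows. Significance alone therefore says little about whether an effect is large enough to matter.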

      So, in summary, I think this study is a valuable but limited advance. The results increase my confidence that non-invasive measures can be used to infer underlying CND, but I am unsure how much closer we are to anything that is practically useful.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Wang et al. investigated how sexual failure influences sweet taste perception in male Drosophila. The study revealed that courtship failure leads to decreased sweet sensitivity and feeding behavior via dopaminergic signaling. Specifically, the authors identified a group of dopaminergic neurons projecting to the suboesophageal zone that interacts with sweet-sensing Gr5a+ neurons. These dopaminergic neurons positively regulate the sweet sensitivity of Gr5a+ neurons via DopR1 and Dop2R receptors. Sexual failure diminishes the activity of these dopaminergic neurons, leading to reduced sweet-taste sensitivity and sugar-feeding behavior in male flies. These findings highlight the role of dopaminergic neurons in integrating reproductive experiences to modulate appetitive sensory responses.

      Previous studies have explored the dopaminergic-to-Gr5a+ neuronal pathways in regulating sugar feeding under hunger conditions. Starvation has been shown to increase dopamine release from a subset of TH-GAL4 labeled neurons, known as TH-VUM, in the suboesophageal zone. This enhanced dopamine release activates dopamine receptors in Gr5a+ neurons, heightening their sensitivity to sugar and promoting sucrose acceptance in flies. Since the function of the dopaminergic-to-Gr5a+ circuit motif has been well established, the primary contribution of Wang et al. is to show that mating failure in male flies can also engage this circuit to modulate sugar-feeding behavior. This contribution is valuable because it highlights the role of dopaminergic neurons in integrating diverse internal state signals to inform behavioral decisions.

      An intriguing discrepancy between Wang et al. and earlier studies lies in the involvement of dopamine receptors in Gr5a+ neurons. Prior research has shown that Dop2R and DopEcR, but not DopR1, mediate starvation-induced enhancement of sugar sensitivity in Gr5a+ neurons. In contrast, Wang et al. found that DopR1 and Dop2R, but not DopEcR, are involved in the sexual failure-induced decrease in sugar sensitivity in these neurons. I wish the authors had further explored or discussed this discrepancy, as it is unclear how dopamine release selectively engages different receptors to modulate neuronal sensitivity in a context-dependent manner.

      Our immunostaining experiments showed that three dopamine receptors, Dop1R1, Dop2R, and DopEcR, were expressed in Gr5a<sup>+</sup> neurons in the proboscis, consistent with previous RT-PCR findings (Inagaki et al. 2012). As the reviewer pointed out, we found that Dop1R1 and Dop2R were required for courtship failure-induced suppression of sugar sensitivity, whereas Marella et al. 2012 and Inagaki et al. 2012 found that Dop2R and DopEcR were required for starvation-induced enhancement of sugar sensitivity. These results suggest that different internal states (courtship failure vs. starvation) may modulate the peripheral sensory system via different signaling pathways (e.g. different subsets of dopaminergic neurons, different dopamine release mechanisms, and different dopamine receptors). We have discussed these possibilities in the revised manuscript.

      The data presented by Wang et al. are solid and effectively support their conclusions. However, certain aspects of their experimental design, data analysis, and interpretation warrant further review, as outlined below.

      (1) The authors did not explicitly indicate the feeding status of the flies, but it appears they were not starved. However, the naive and satisfied flies in this study displayed high feeding and PER baselines, similar to those observed in starved flies in other studies. This raises the concern that sexually failed flies may have consumed additional food during the 4.5-hour conditioning period, potentially lowering their baseline hunger levels and subsequently reducing PER responses. This alternative explanation is worth considering, as an earlier study demonstrated that sexually deprived males consumed more alcohol, and both alcohol and food are known rewards for flies. To address this concern, the authors could remove food during the conditioning phase to rule out its influence on the results.

      This is an important consideration. To rule out a potential confound from food intake during courtship conditioning, we have now also conducted courtship conditioning in vials without food. In the absence of any feeding opportunity over the 4.5-hour courtship conditioning period, sexually rejected males still exhibited a robust decrease in sweet taste sensitivity compared with Naïve and Satisfied controls (Figure 1-supplement 1C). These data confirm that the suppression of PER is driven by courtship failure per se, rather than by differences in feeding during the conditioning phase.

      (2) Figure 1B reveals that approximately half of the males in the Failed group did not consume sucrose yet Figure 1-S1A suggests that the total volume consumed remained unchanged. Were the flies that did not consume sucrose omitted from the dataset presented in Figure 1-S1A? If so, does this imply that only half of the male flies experience sexual failure, or that sexual failure affects only half of males while the others remain unaffected? The authors should clarify this point.

      Our initial description of the experimental setup may have been confusing. We briefly clarify the experimental design here and have added further details to the revised manuscript, which should resolve the reviewer’s concerns:

      After the behavioral conditioning, male flies were divided between two assays. On the one hand, we quantified PER responses of individual flies. As shown in Figure 1C, Failed males exhibited decreased sweet sensitivity (as demonstrated by the right shift of the dose-response curve). On the other hand, we sought to quantify food consumption of individual flies by using the MAFE assay (Qi et al 2005).

      In the initial submission, we used 400 mM sucrose for the MAFE assay. When presented with 400 mM sucrose, approximately 100% of the flies in the Naïve and Satisfied groups, and 50% of the flies in the Failed group, extended their proboscis and started feeding, as a natural consequence of decreased sugar sensitivity (Figure 1B). We quantified the actual volume of food consumed by the flies that showed PER responses towards 400 mM sucrose and observed no change (Figure 1-supplement 1A, left). To avoid potential confusion, we have now repeated the MAFE assay with 800 mM sucrose, which elicited feeding in ~100% of flies in all three groups, as shown in Figure 1C. Again, we observed no change in food intake (Figure 1-supplement 1A, right).

      These experiments in combination suggest that sexual failure suppresses sweet sensitivity of the Failed males. Meanwhile, as long as they still responded to a certain food stimulus and initiated feeding, the volume of food consumption remained unchanged. These results led us to focus on the modulatory effect of sexual failure on the sensory system, the main topic of this present study.

      (3) The evidence linking TH-GAL4 labeled dopaminergic neurons to reduced sugar sensitivity in Gr5a+ neurons in sexually failed males could be further strengthened. Ideally, the authors would have activated TH-GAL4 neurons and observed whether this restored GCaMP responses in Gr5a+ neurons in sexually failed males. Instead, the authors performed a less direct experiment, shown in Figures 3-S1C and D. The manuscript does not describe the condition of the flies used in this experiment, but it appears that they were not sexually conditioned. I have two concerns with this experiment. First, no statistical analysis was provided to support the enhancement of sucrose responses following activation of TH-GAL4 neurons. Second, without performing this experiment in sexually failed males, the authors lack direct evidence to confirm that the dampened response of Gr5a+ neurons to sucrose results from decreased activity in TH-GAL4 neurons.

      We have now quantified the effect of TH<sup>+</sup> neuron activation on Gr5a<sup>+</sup> neuron calcium responses. In Naïve males, dTRPA1-mediated activation of TH<sup>+</sup> cells significantly enhanced sucrose-induced calcium responses (Figure 3-supplement 1C). In Failed males, although the baseline activity of Gr5a<sup>+</sup> neurons was lower (Figure 3C), the same activation also produced a significant (even slightly larger) effect on the calcium responses of Gr5a<sup>+</sup> neurons (Figure 3-supplement 1D).

      Taken together, we would argue that these experiments using both Naïve and Failed males were adequate to show a functional link between TH<sup>+</sup> neurons and Gr5a<sup>+</sup> neurons. Combined with the results that these neurons form active synapses (Figure 3-supplement 1B) and that the activity of TH<sup>+</sup> neurons was dampened in sexually failed males (Figure 3G-I), our data support the notion that sexual failure suppresses sweet sensitivity via TH-Gr5a circuitry.

      (4) The statistical methods used in this study are poorly described, making it unclear which method was used for each experiment. I suggest that the authors include a clear description of the statistical methods used for each experiment in the figure legends. Furthermore, as I have pointed out, there is a lack of statistical comparisons in Figures 3-S1C and D, a similar problem exists for Figures 6E and F.

      We have added detailed information on the statistical analysis to each figure legend.

      (5) The experiments in Figure 5 lack specificity. The target neurons in this study are Gr5a+ neurons, which are directly involved in sugar sensing. However, the authors used the less specific Dop1R1- and Dop2R-GAL4 lines for their manipulations. Using Gr5a-GAL4 to specifically target Gr5a+ neurons would provide greater precision and ensure that the observed effects are directly attributable to the modulation of Gr5a+ neurons, rather than being influenced by potential off-target effects from other neuronal populations expressing these dopamine receptors.

      We agree with the reviewer that manipulating Dop1R1 and Dop2R genes (Figure 4) and the neurons expressing them (Figure 5) might have broader impacts. For specificity, we have also tested the role of Dop1R1 and Dop2R in Gr5a<sup>+</sup> neurons by RNAi experiments (Figure 6). As shown by both behavioral and calcium imaging experiments, knocking down either Dop1R1 or Dop2R in Gr5a<sup>+</sup> neurons eliminated the effect of sexual failure to dampen sweet sensitivity, further confirming the role of these two receptors in Gr5a<sup>+</sup> neurons.

      (6) I found the results presented in Fig. 6F puzzling. The knockdown of Dop2R in Gr5a+ neurons would be expected to decrease sucrose responses in naive and satisfied flies, given the role of Dop2R in enhancing sweet sensitivity. However, the figure shows an apparent increase in responses across all three groups, which contradicts this expectation. The authors may want to provide an explanation for this unexpected result.

      We agree that there might be some potential discrepancies. We have now addressed the issue by repeating these calcium imaging experiments with a head-to-head comparison with the controls (Gr5a-GCaMP, +/- Dop1R1 and Dop2R RNAi).

      In these new experiments, Dop1R1 or Dop2R knockdown completely prevented the suppression of Gr5a<sup>+</sup> neuron responsiveness by courtship failure (Figure 6E), whereas the activities of Gr5a<sup>+</sup> neurons in Naïve/Satisfied groups were not altered. These results demonstrate that Dop1R1 and Dop2R are specifically required to mediate the decrease in sweet sensitivity following courtship failure.

      (7) In several instances in the manuscript, the authors described the effects of silencing dopamine signaling pathways or knocking down dopamine receptors in Gr5a neurons with phrases such as 'no longer exhibited reduced sweet sensitivity' (e.g., L269 and L288), 'prevent the reduction of sweet sensitivity' (e.g., L292), or 'this suppression was reversed' (e.g. L299). I found these descriptions misleading, as they suggest that sweet sensitivity in naive and satisfied groups remains normal while the reduction in failed flies is specifically prevented or reversed. However, this is not the case. The data indicate that these manipulations result in an overall decrease in sweet sensitivity across all groups, such that a further reduction in failed flies is not observed. I recommend revising these descriptions to accurately reflect the observed phenotypes and avoid any confusion regarding the effects of these manipulations.

      We have changed the wording in the revised manuscript. In brief, we think that these manipulations have two consequences: suppressing the overall sweet sensitivity, and eliminating the effect of sexual failure on sweet sensitivity.

      Reviewer #2 (Public review):

      Summary:

      The authors exposed naïve male flies to different groups of females, either mated or virgin. Male flies can successfully copulate with virgin females; however, they are rejected by mated females. This rejection reduces sugar preference and sensitivity in males. Investigating the underlying neural circuits, the authors show that dopamine signaling onto GR5a sensory neurons is required for reduced sugar preference. GR5a sensory neurons respond less to sugar exposure when they lack dopamine receptors.

      Strengths:

      The findings add another strong phenotype to the existing dataset about brain-wide neuromodulatory effects of mating. The authors use several state-of-the-art methods, such as activity-dependent GRASP to decipher the underlying neural circuitry. They further perform rigorous behavioral tests and provide convincing evidence for the local labellar circuit.

      Weaknesses:

      The authors focus on the circuit connection between dopamine and gustatory sensory neurons in the male SEZ. Therefore, it is still unknown how mating modulates dopamine signaling and what possible implications on other behaviors might result from a reduced sugar preference.

      We agree with the reviewer that in the current study, we did not examine the exact mechanism of how mating experience suppressed the activity of dopaminergic neurons in the SEZ. The current study mainly focused on the behavioral characterization (sexual failure suppresses sweet sensitivity) and the downstream mechanism (TH-Gr5a pathway). We think that examining the upstream modulatory mechanism may be more suitable for a separate future study.

      We believe that a sustained reduction in sweet sensitivity (not limited to sucrose but extending to other sweet compounds; Figure 1-supplement 1D-E) upon courtship failure suggests a generalized and sustained consequence on reward-related behaviors. Sexual failure may thus resemble a state of “primitive emotion” in fruit flies. We have further discussed this possibility in the revised manuscript.

      Reviewer #3 (Public review):

      Summary

      In this work, the authors asked how mating experience impacts reward perception and processing. For this, they employ fruit flies as a model, with a combination of behavioral, immunostaining, and live calcium imaging approaches.

      Their study allowed them to demonstrate that courtship failure decreases the fraction of flies motivated to eat sweet compounds, revealing a link between reproductive stress and reward-related behaviors. This effect is mediated by a small group of dopaminergic neurons projecting to the SEZ. After courtship failure, these dopaminergic neurons exhibit reduced activity, leading to decreased Gr5a+ neuron activity via Dop1R1 and Dop2R signaling, and leading to reduced sweet sensitivity. The authors therefore showed how mating failure influences broader behavioral outputs through suppression of the dopamine-mediated reward system and underscores the interactions between reproductive and reward pathways.

      Concern

      My main concern regarding this study lies in the way the authors chose to present their results. If I understood correctly, they provided evidence that mating failure induces a decrease in the fraction of flies exhibiting PER. However, they also showed that food consumption was not affected (Fig. 1, supplement), suggesting that individuals who did eat consumed more. This raises questions about the analysis and interpretation of the results. Should we consider the group as a whole, with a reduced sensitivity to sweetness, or should we focus on individuals, with each one eating more? I am also concerned about how this could influence the results obtained using live imaging approaches, as the flies being imaged might or might not have been motivated to eat during the feeding assays. I would like the authors to clarify their choice of analysis and discuss this critical point, as the interpretation of the results could potentially be the opposite of what is presented in the manuscript.

      Please refer to our responses to the Public Review (Reviewer 1, Point 2) for details.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The label for the y-axis in Figure 1B should be "fraction", not "percentage".

      We have revised the figure as suggested.

      (2) I suggest that the authors indicate the ROIs they used to quantify the signal intensity in Figure 3E and G.

      We have revised the figures as suggested.

      (3) There is a typo in Figure 4A: it should be "Wild type", not "Wide type".

      We have revised the figure as suggested.

      (4) The elav-GAL4/+ data in Figure 4-S1B, C, and D appears to be reused across these panels. However, the number of asterisks indicating significance in the MAT plots differs between them (three in panels B and C, and four in panel D). Is this a typo?

      It is indeed a typo, and we have revised the figure accordingly.

      Reviewer #2 (Recommendations for the authors):

      Additional comments:

      The authors should add this missing literature about dopamine and neuromodulation in courtship:

      Boehm et al., 2022 (eLife) - this study shows that mating affects olfactory behavior in females.

      Cazalé-Debat et al., 2024 (Nature) - Mating proximity blinds threat perception.

      Gautham et al., 2024 (Nature) - A dopamine-gated learning circuit underpins reproductive state-dependent odor preference in Drosophila females.

      We have added these references in the introduction section.

      Has the mating behavior been quantified? How often did males copulate with mated and virgin females?

      We tried to examine the copulation behavior based on our video recordings. In the “Failed” group (males paired with mated females), we observed virtually no successful copulation events at all, confirming that nearly 100% of those males experienced sexual failure. In contrast, males in the “Satisfied” group (paired with virgin females) mated on average 2-3 times during the 4.5-hour conditioning period. We have added some explanations in the manuscript.

      Do the rejected males live shorter? Is the effect also visible when they are fed with normal fly food, or is it only working with sugar?

      We did not directly measure the lifespan of these males. However, we conducted a related assay (starvation resistance), in which “Failed” males died significantly faster than both Naïve and Satisfied controls, indicating a clear reduction in their ability to endure food deprivation (Figure 1-supplement 1B). Since sweet taste is a primary cue for food detection in Drosophila, and sugar makes up a large portion of their standard diet, the drop in sugar sensitivity we observed in Failed males could likewise impair their perception and consumption of regular fly food, and hence their resistance to starvation.

      Also, the authors mention that the reward pathway is affected, this is probably the case as sugar sensation is impaired. One interesting experiment would be (and maybe has been done?) to test rejected males in normal odor-fructose conditioning. The data would suggest that they would do worse.

      We have already measured how courtship failure affected fructose sensitivity (Figure 1 supplement 1D), and we found that the reduction in fructose perception was even more profound than for sucrose. We have not yet tested whether Failed males show deficits in odor-fructose associative conditioning. That is indeed a very interesting direction to explore. However, olfactory reward learning relies on molecular and circuit mechanisms distinct from those governing taste. We therefore argue that such experiments would be more suitable for a separate follow-up study.

      The authors could have added another group where males are exposed to other males. It would be interesting if this is also a "stressful" context and if it would also reduce sugar preference - probably beyond the scope of this paper.

      In our experiments, all flies, including those in the Naïve, Failed, and Satisfied groups, were housed in groups of 25 males per vial before the conditioning period (and the Naïve group remained in the same group housing until PER testing). This means every cohort experienced the same level of “social stress” from male-male interactions. While it would indeed be interesting to compare that to solitary housing or other male-only exposures, isolation itself imposes a different kind of stress, and disentangling these effects on sugar preference would require a separate, dedicated study beyond the scope of the present work.

      Would the behavior effect also show up with experienced males? Maybe this has been tested before. Does mating rejection in formerly successful males have the same impact?

      As suggested by the reviewer, we performed an additional experiment in which males that had previously mated successfully were subsequently subjected to courtship rejection. As shown in Figure 1 supplement 1F, prior successful mating did not prevent the decline in sweet sensitivity induced by subsequent mating failure, indicating that even experienced males exhibit the reduction in sugar sensitivity after rejection.

      Is the same circuit present and functioning in females? Does manipulating dopamine receptors in GR5a neurons in females lead to the same phenotype? This would suggest that different internal states in males and females could lead to the same phenotype and circuit modulations.

      This is indeed a very interesting suggestion. In male flies, Gr5a-specific knockdown of dopamine receptors did not alter baseline sweet sensitivity, but it selectively prevented the reduction in sugar perception that followed mating failure (Figure 6C-D), indicating that this dopaminergic pathway is engaged only in the context of courtship rejection. By extension, knocking down the same receptors in female Gr5a neurons would likewise be expected to leave their basal sugar sensitivity unchanged. Moreover, because there is currently no established paradigm for inducing mating failure in female flies, we cannot yet test whether sexual rejection similarly modulates sweet taste in females, or whether it operates via the same circuit.

      Reviewer #3 (Recommendations for the authors):

      Suggestions to the authors:

      Introduction, line 61. I suggest the authors add references in fruit flies concerning the rewarding nature of mating. For example, the paper from Zhang et al, 2016 "Dopaminergic Circuitry Underlying Mating Drive" demonstrates the role of the dopamine rewarding system in mating drive. There is a large body of literature showing the link between dopamine and mating.

      We have added this literature in the introduction section.

      Figure 1B and Figure Supplement 1: If I understood correctly, Figure Supplement 1A shows that the total food consumption across all tested flies remains unchanged. However, fewer flies that failed to mate consumed sucrose. I would be curious to see the results for sucrose consumption per individual fly that did eat. According to their results, individual flies that failed to mate should consume more sucrose. This would change the conclusion. The authors currently show that a group of flies that failed to mate consumed less sucrose overall, but since fewer males actually ate, those that failed to mate and did eat consumed more sucrose. The authors should distinguish between failed and satisfied flies in two groups: those that ate and those that did not.

      Please see our responses to the Public Review for details (Reviewer 1, Point 2).
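
      The reviewer's arithmetic can be made explicit with a minimal sketch. It assumes that the group-level intake per fly is the product of the fraction of flies that ate and the mean intake among those flies; the function name and all numbers below are hypothetical and are not taken from the manuscript.

```python
def per_eater_consumption(group_mean_intake: float, fraction_eating: float) -> float:
    """Mean intake among flies that ate.

    Assumes group_mean_intake = fraction_eating * per_eater_intake,
    i.e. flies that did not eat contribute zero to the group mean.
    """
    if not 0 < fraction_eating <= 1:
        raise ValueError("fraction_eating must be in (0, 1]")
    return group_mean_intake / fraction_eating


# Hypothetical values: identical group-level intake, different fractions eating.
group_mean = 0.10  # arbitrary units per tested fly, same in both groups
satisfied = per_eater_consumption(group_mean, fraction_eating=0.8)
failed = per_eater_consumption(group_mean, fraction_eating=0.5)

print(f"Satisfied, per eater: {satisfied:.3f}")
print(f"Failed, per eater:    {failed:.3f}")
```

      Under this assumption, an unchanged group mean combined with a smaller fraction of feeding flies necessarily implies a larger intake per feeding fly, which is the point the reviewer asks the authors to address.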

      Figure 1C, right: For a better understanding of all the "MAT" figures, I suggest the authors start the Y axis with the unit 25 and increase it to 400. This would match better the text (line 114) saying that it was significantly elevated in the failed group. As it is, we have the impression of a decrease in the graph.

      We have revised the figures accordingly.

      Line 103: When suggesting a reduced likelihood of meal initiation of these males, do these males take longer to eat when they did it? In other words, is the latency to eat increased in failed males? That would be a good measure of motivational state.

      We attempted to analyze feeding latency in the MAFE assay by measuring the time from sucrose presentation to the first proboscis extension, but this interval was too short to be measured accurately. Nevertheless, while conducting the experiments, we did not observe any obvious difference in feeding latency between Failed males and Naïve or Satisfied controls.

      Line 117. I don't understand which results the authors refer to when writing "an overall elevation in the threshold to initiate feeding upon appetitive cues". Please specify.

      This phrase refers to the fact that for every sweet tastant we tested, including sucrose (Figure 1C), fructose and glucose (Figure 1 supplement 1D-E), the concentration-response curve in Failed males shifted to the right, and the Mean Acceptance Threshold (MAT) was significantly higher. In other words, for these different appetitive cues, mating failure raised the concentration of sugar required to trigger a proboscis extension, indicating a general elevation in the threshold to initiate feeding upon an appetitive cue.
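
      A rightward shift of a concentration-response curve can be summarized as a single threshold value, as in the following minimal sketch. It assumes that the acceptance threshold is the sugar concentration at which half of the flies show PER, estimated by linear interpolation on a log-concentration axis; this definition, the function name, and the data are illustrative assumptions, and the manuscript's exact MAT procedure may differ.

```python
import math


def acceptance_threshold(concs, fractions, level=0.5):
    """Concentration at which the PER fraction first reaches `level`.

    Interpolates linearly between neighbouring points on log10(concentration).
    Returns None if the curve never reaches `level`.
    """
    pairs = sorted(zip(concs, fractions))
    if pairs[0][1] >= level:
        return pairs[0][0]  # already at or above the level at the lowest dose
    for (c0, f0), (c1, f1) in zip(pairs, pairs[1:]):
        if f0 < level <= f1:
            t = (level - f0) / (f1 - f0)
            log_c = math.log10(c0) + t * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    return None


# Hypothetical PER fractions at five sucrose concentrations (mM).
concs = [25, 50, 100, 200, 400]
naive = [0.2, 0.5, 0.8, 0.9, 1.0]
failed = [0.1, 0.2, 0.5, 0.8, 0.9]

print("Naive threshold (mM): ", acceptance_threshold(concs, naive))
print("Failed threshold (mM):", acceptance_threshold(concs, failed))
```

      With these made-up curves, the Failed group reaches the half-response level only at a higher concentration, which is what a rightward shift and an elevated threshold mean in practice.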

      Figure 1D. Please specify the time for the satisfied group.

      For clarity, the Naïve and Satisfied groups in Figure 1D each represent pooled data from 0 to 72 hours post-treatment, as their sweet sensitivity remained stable throughout this period. Only the Failed group was shown with time-resolved data, since it was the only group exhibiting a dynamic change in sugar sensitivity over time. We have now specified this in the figure legend.

      Figure 1F. The phenotype was not totally reversed in failed-re-copulated males. Could it be due to the timing between failure and re-copulation? I suggest the authors mention in the figure or in the text, the time interval between failure and re-copulation.

      We’d like to clarify that the interval between the initial treatment (“Failed”) and the opportunity for re-copulation was within 30 minutes. The incomplete reversal in the Failed-re-copulated group indeed raises interesting questions. One possible explanation is that mating failure reduces synaptic transmission between the SEZ dopaminergic neurons and Gr5a+ sweet sensory neurons (Figure 3), and that the recovery of this transmission takes longer. We have added this information to the figure legend and the Methods section.

      Line 227-228 and Figure 3E. The authors showed that the synaptic connections between dopaminergic neurons and Gr5a+ GRNs were significantly weakened. I am wondering about the delay between mating failure and the GFP observation. It would be informative to know this timing to interpret this decrease in synaptic connections. If the timing is relatively long, it is possible that we can observe a neuronal plasticity. However, if this timing is very short, I would not expect such synaptic plasticity.

      The interval between the behavioral treatment and the GRASP-GFP experiment was approximately 20 hours. We chose this time window because it was sufficient for both GFP expression and accumulation. Therefore, the observed reduction in synaptic connections between dopaminergic neurons and Gr5a+ GRNs likely reflects a genuine, experience-induced structural and functional change rather than an immediate, transient effect. We have added this information to the Methods section of the revised manuscript for clarity.

      Line 240-243: The authors demonstrated that there is a reduction of CaLexA-mediated GFP signals in dopaminergic neurons in the SEZ after mating failure, but not a reduction in Gr5a+ GRNs. I suggest replacing "indicate" with "suggest" in line 240.

      We have made the change accordingly. Meanwhile, we would like to clarify that while we observed a reduction of NFAT signal in SEZ dopaminergic neurons (Figure 3G), we did not directly test NFAT signal in Gr5a+ neurons. Notably, the weakened synaptic transmission from SEZ dopaminergic neurons to Gr5a+ neurons (Figure 3E-F) and the reduction of NFAT signal in SEZ dopaminergic neurons (Figure 3G-I) were in line with a reduction in sweet sensitivity of Gr5a+ neurons upon courtship failure (Figure 3B-D).

      Line 243: replace "consecutive" with "constitutive".

      We have revised it accordingly.

      Figure 5: I have trouble understanding the results obtained in Figure 5. Both constitutive activation and inhibition of Dop1R1 and Dop2R neurons lead to the same results, knowing that males who failed mating no longer exhibit decreased sweet sensitivity. I would have expected contrary results for both experimental conditions. I suggest the authors discuss their results.

      Both activation and inhibition of Dop1R1 and Dop2R neurons eliminated the effect of courtship failure on sweet sensitivity (Figure 5). These results are in line with our hypothesis that courtship failure leads to changes in dopamine signaling and hence sweet sensitivity. If dopamine signaling via Dop1R1 and Dop2R was locked, either to a silenced or a constitutively activated state, the effect of courtship failure on sweet sensitivity was eliminated.

      Nevertheless, as the reviewer pointed out, constitutive activation and inhibition should in principle have opposite effects on Naïve flies. In fact, when Dop1R1+/Dop2R+ neurons were silenced in Naïve flies, PER to sucrose was significantly reduced (Figure 5C-D), confirming that these neurons normally facilitate sweet sensation. Meanwhile, although neuronal activation by NaChBac did show a trend towards enhanced PER compared to the GAL4/+ controls, it did not differ from the +>UAS-NaChBac controls, which showed a high PER level, likely due to a ceiling effect. We have added this discussion to the manuscript.

      Figure 7: I suggest the authors modify their figure a bit. It is not clear why in failed mating, the red arrow in "behavioral modulation" goes to the fly. The authors should find another way to show that mating failure decreased the percentage of flies that are motivated to eat sugar.

      We have modified the figure as suggested.

      Overall, I would suggest the authors be cautious with their conclusion. For example, line 337: "sexual failure suppressed feeding behavior". This is not what is shown by this study. Here, the study shows that mating failure decreases the fraction of flies that eat sucrose. Unless the authors demonstrate that this decrease is generalizable to other metabolites, I suggest the authors modify their conclusion.

      While we primarily used sucrose as the stimulant in our experiments, we also tested responses to two other sugars: fructose and glucose (Figure 1 supplement 1D-E). In all three cases, mating failure led to a significant reduction in sweet perception, suggesting that the effect of courtship failure is not limited to a single metabolite but rather reflects a general decrease in sweet sensitivity. Meanwhile, reduced sweet sensitivity indeed led to a reduction of feeding initiation (Figure 1).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In their comprehensive analysis, Diallo et al. deorphanise the first olfactory receptor of a non-hymenopteran eusocial insect, a termite, and identify the well-established trail pheromone neocembrene as the receptor's best ligand. By using a large set of odorants the authors convincingly show that, as expected for a pheromone receptor, PsimOR14 is very narrowly tuned. While the authors first make use of an ectopic expression system, the empty neuron of Drosophila melanogaster, to characterise the receptor's responses, they next perform single sensillum recordings with different sensilla types on the termite antenna. By that, they are able to identify a sensillum that houses three neurons, of which the B neuron exhibits the narrow responses described for PsimOR14. Hence the authors not only identify the first pheromone receptor in a termite but can even localize its expression on the antenna. The authors in addition perform a structural analysis to explain the binding properties of the receptor and its major and minor ligands (as this is beyond my expertise, I cannot judge this part of the manuscript). Finally, they compare expression patterns of ORs in different castes and find that PsimOR14 is more strongly expressed in workers than in soldier termites, which corresponds well with stronger antennal responses in the worker caste.

      Strengths:

      The manuscript is well-written and a pleasure to read. The figures are beautiful and clear. I actually had a hard time coming up with suggestions.

      We thank the reviewer for the positive comments.

      Weaknesses:

      Whenever it comes to the deorphanization of a receptor and its potential role in behaviour (in the case of the manuscript it would be trail-following of the termite) one thinks immediately of knocking out the receptor to check whether it is necessary for the behaviour. However, I definitely do not want to ask for this (especially as the establishment of CRISPR Cas-9 in eusocial insects usually turns out to be a nightmare). I also do not know either, whether knockdowns via RNAi have been established in termites, but maybe the authors could consider some speculation on this in the discussion.

      We agree that a functional proof of the PsimOR14 function using reverse genetics would be a valuable addition to the study to firmly establish its role in trail pheromone sensing. Nevertheless, such a functional proof is difficult to obtain. Due to the very slow ontogenetic development inherent to termites (several months from an egg to the worker stage), CRISPR Cas-9 is not a practical technique for this taxon. By contrast, termites are quite responsive to RNAi-mediated silencing, and RNAi has previously been used to silence the ORCo co-receptor in termites, resulting in impairment of the trail-following behavior (DOI: 10.1093/jee/toaa248). Likewise, our previous experiments showed a decreased ORCo transcript abundance, lower sensitivity to neocembrene and reduced neocembrene trail following upon dsPsimORCo administration to P. simplex workers, while we did not succeed in reducing the transcript abundance of PsimOR14 upon dsPsimOR14 injection. We do not report these negative results in the present manuscript so as not to dilute the main message. In parallel, we are currently developing an alternative way of dsRNA delivery using nanoparticle coating, which may improve the RNAi experiments with ORs in termites.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors performed the functional analysis of odorant receptors (ORs) of the termite Prorhinotermes simplex to identify the receptor of trail-following pheromone. The authors performed single-sensillum recording (SSR) using the transgenic Drosophila flies expressing a candidate of the pheromone receptor and revealed that PsimOR14 strongly responds to neocembrene, the major component of the pheromone. Also, the authors found that one sensillum type (S I) detects neocembrene and also performed SSR for S I in wild termite workers. Furthermore, the authors revealed the gene, transcript, and protein structures of PsimOR14, predicted the 3D model and ligand docking of PsimOR14, and demonstrated that PsimOR14 is more highly expressed in workers than in soldiers using RNA-seq for heads of workers and soldiers of P. simplex and that EAG response to neocembrene is higher in workers than in soldiers. I consider that this study will contribute to further understanding of the molecular and evolutionary mechanisms of the chemoreception system in termites.

      Strength:

      The manuscript is well written. As far as I know, this study is the first study that identified a pheromone receptor in termites. The authors not only present a methodology for analyzing the function of termite pheromone receptors but also provide important insights in terms of the evolution of ligand selectivity of termite pheromone receptors.

      We thank the reviewer for the overall positive evaluation of the manuscript.

      Weakness:

      As you can see in the "Recommendations to the Authors" section below, there are several things in this paper that are not fully explained about experimental methods. Except for this point, this paper appears to me to have no major weaknesses.

      We address the specific comments point by point in the Recommendations for the authors section below.

      Reviewer #3 (Public review):

      Summary:

      Chemical communication is essential for the organization of eusocial insect societies. It is used in various important contexts, such as foraging and recruiting colony members to food sources. While such pheromones have been chemically identified and their function demonstrated in bioassays, little is known about their perception. Excellent candidates are the odorant receptors that have been shown to be involved in pheromone perception in other insects including ants and bees but not termites. The authors investigated the function of the odorant receptor PsimOR14, which was one of four target odorant receptors based on gene sequences and phylogenetic analyses. They used the Drosophila empty neuron system to demonstrate that the receptor was narrowly tuned to the trail pheromone neocembrene. Similar responses to the odor panel and neocembrene in antennal recordings suggested that one specific antennal sensillum expresses PsimOR14. Additional protein modeling approaches characterized the properties of the ligand binding pocket in the receptor. Finally, PsimOR14 transcripts were found to be significantly higher in worker antennae compared to soldier antennae, which corresponds to the worker's higher sensitivity to neocembrene.

      Strengths:

      The study presents an excellent characterization of a trail pheromone receptor in a termite species. The integration of receptor phylogeny, receptor functional characterization, antennal sensilla responses, receptor structure modeling, and transcriptomic analysis is especially powerful. All parts build on each other and are well supported with a good sample size.

      We thank the reviewer for these positive comments.

      Weaknesses:

      The manuscript would benefit from a more detailed explanation of the research advances this work provides. Stating that this is the first deorphanization of an odorant receptor in a clade is insufficient. The introduction primarily reviews termite chemical communication and deorphanization of olfactory receptors previously performed. Although this is essential background, it lacks a good integration into explaining what problem the current study solves.

      We understand the comment about the lack of an intelligible cue to highlight the motivation and importance of the present study. In the current version of the manuscript the introduction has been reworked. As suggested by Reviewer 3 in the Recommendations section below, the introduction now integrates some parts of the original discussion, especially the part discussing the OR evolution and emergence of eusociality in hymenopteran social insects and in termites, while underscoring the need for data from termites to compare the commonalities and idiosyncrasies in neurophysiological (pre)adaptations potentially linked with the independent evolution of eusociality in the two main social insect clades.

      Selecting target ORs for deorphanization is an essential step in the approach. Unfortunately, the process of choosing these ORs has not been described. Were the authors just lucky that they found the correct OR out of the 50, or was there a specific selection process that increased the probability of success?

      Indeed, we were extremely lucky. Our strategy was to first select a modest set of ORs to confirm the feasibility of the Empty Neuron Drosophila system and newly established SSR setup, while taking advantage of having a set of termite pheromones, including those previously identified in the P. simplex model, some of them de novo synthesized for this project. The selection criteria for the first set of four receptors were (i) to have a full-length ORF and at least 6 unambiguously predicted transmembrane regions, and (ii) to be represented on different branches (subbranches) of the phylogenetic tree. Then it was a matter of good luck to hit PsimOR14, which responds selectively to the main component of the genuine P. simplex trail-following pheromone. In the revised version, we state these selection criteria in the results section (Phylogenetic reconstruction and candidate OR selection).

      The deorphanization attempts of additional P. simplex ORs are currently running.

      The authors assigned antennal sensilla into five categories. Unfortunately, they did not support their categories well. It is not clear how they were able to differentiate SI and SII in their antennal recordings.

      We agree that the classification of multiporous sensilla into five categories lacks robust discrimination cues. The identification of the neocembrene-responding sensillum was initially carried out by SSR measurements on individual olfactory sensilla of P. simplex workers one-by-one and the topology of each tested sensillum was recorded on optical microscope photographs taken during the SSR experiment. Subsequently, the SEM and HR-SEM were performed in which we localized the neocembrene sensillum and tried to find distinguishing characters. We admit that these are not robust. Therefore, in the revised version of the manuscript we decided to abandon the attempt of sensilla classification and only report the observations about the specific sensillum in which we consistently recorded the response to neocembrene (and geranylgeraniol). The modifications affect Fig. 4, its legend and the corresponding part of the results section (Identification of P. simplex olfactory sensillum responding to neocembrene).

      The authors used a large odorant panel to determine receptor tuning. The panel included volatile polar compounds and non-volatile non-polar hydrocarbons. Usually, some heat is applied to such non-volatile odorants to increase volatility for receptor testing. It is unclear how it is possible that these non-volatile compounds can reach the tested sensilla without heat application.

      The reviewer points to an important methodological error we made while designing the experiments. Indeed, the inclusion of long-chain hydrocarbons into Panel 1 without additional heat applied to the odor cartridges was inappropriate, even though the experiments were performed at 25–26 °C. We carefully considered the best solution to correct the mistake and finally decided to remove all tested ligands beyond C22 from Panel 1, i.e. altogether five compounds. These changes did not affect the remaining Panels 2-4 (containing compounds with sufficient volatility), nor did they affect the message of the manuscript on highly selective response of PsimOR14 to neocembrene (and geranylgeraniol). In consequence, Figures 2, 3 and 5 were updated, along with the supplementary tables containing the raw data on SSR measurements. In addition, the tuning curve for PsimOR14 was re-built and receptor lifetime sparseness value re-calculated (without any important change). We also exchanged squalene for limonene in the docking and molecular dynamics analysis and made new calculations.
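
      For readers unfamiliar with lifetime sparseness, the following minimal sketch shows one commonly used definition of the measure, in which negative (inhibitory) responses are set to zero and the value ranges from 0 for a receptor responding equally to all odorants to 1 for a receptor responding to a single odorant. The response does not state which formula the authors used, so this definition, the function name, and the response values are assumptions for illustration only.

```python
def lifetime_sparseness(responses):
    """Lifetime sparseness S of one receptor across N odorants.

    S = (1 - (sum(r)/N)**2 / (sum(r**2)/N)) / (1 - 1/N),
    with negative responses clipped to zero before the calculation.
    """
    r = [max(x, 0.0) for x in responses]
    n = len(r)
    if n < 2:
        raise ValueError("need at least two odorants")
    mean_sq = sum(x * x for x in r) / n
    if mean_sq == 0:
        raise ValueError("no positive responses")
    mean = sum(r) / n
    return (1 - mean ** 2 / mean_sq) / (1 - 1 / n)


# Hypothetical spike-rate changes (spikes/s) for four odorants.
narrow = [100.0, 0.0, 0.0, 0.0]    # responds to one ligand only
broad = [10.0, 10.0, 10.0, 10.0]   # responds equally to all ligands

print("narrowly tuned:", lifetime_sparseness(narrow))
print("broadly tuned: ", lifetime_sparseness(broad))
```

      Because the measure depends on the number and identity of odorants in the panel, removing compounds from a panel changes N and requires the value to be recomputed, as the authors describe.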

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) L 208: "than" instead of "that"

      Corrected.

      (2) L 527+527 strange squares (•) before dimensions

      Apparently an error upon file conversion, corrected.

      (3) L553 "reconstructing" instead of "reconstruct"

      Corrected.

      (4) Two references (Chahda et al. and Chang et al.) appear too late in the alphabet.

      Corrected. Thank you for spotting this mistake. The reference list had been ordered according to the Czech alphabet, which ranks CH after H.

      Reviewer #2 (Recommendations for the authors):

      (1) L148: Why did the authors select only four ORs (PsimOR9, 14, 30, and 31) though there are 50 ORs in P. simplex? I would like you to explain why you chose them.

      Our strategy was to first select a modest set of ORs to confirm the feasibility of the Empty Neuron Drosophila system and newly established SSR setup, while taking advantage of having a set of termite pheromones, including those previously identified in the P. simplex model, some of them de novo synthesized for this project. Then, it was a matter of good luck to hit PsimOR14, which responds selectively to the main component of the genuine P. simplex trail-following pheromone, while the deorphanization attempts for a set of additional P. simplex ORs are currently running. In the revised version of the manuscript, we state the selection criteria for the four ORs studied in the Results section (Phylogenetic reconstruction and candidate OR selection).

      (2) L149: Where is Figure 1A? Does this mean Figure 1?

      Thank you for spotting this mistake. Fig. 1 is now properly labelled as Fig. 1A and 1B in the figure itself and in the legend. The text now also refers specifically to either 1A or 1B.

      (3) Figure 1: The authors also showed the transcription abundance of all 50 ORs of P. simplex in the right bottom of Figure 1, but there is no explanation about it in the main text.

      The heatmap reporting the transcript abundances is now labelled as Fig. 1B and is referred to in the discussion section (in the original manuscript it was referred to in the same place as Fig. 1).

      (4) L260-265: The authors confirmed higher expression of PsimOR14 in workers than soldiers by using RNA-seq data and stronger EAG responses of PsimOR14 to neocembrene in workers than soldiers, but I think that confirming the expression levels of PsimOR14 in workers and soldiers by RT-qPCR would strengthen the authors' argument (it is optional).

      qPCR validation is a suitable complement to read count comparison of RNA Seq data, especially when the data come from one-sample transcriptomes and/or low coverage sequencing. Yet, our RNA Seq analysis is based on sequencing of three independent biological replicates per phenotype (worker heads vs. soldier heads) with ~20 million reads per sample. Thus, the resulting differential gene expression analysis is a sufficient and powerful technique in terms of detection limit and dynamic range.

      We admit that the replicate numbers and origin of the RNA Seq data should be better specified, since the Methods section only referred to the GenBank accession numbers in the original manuscript. Therefore, we added more information in the Methods section (Bioinformatics) and made clear in the Methods that these data come from our previous research and related bioproject.

      (5) L491: I think that "The synthetic processes of these fatty alcohols are ..." is better.

      We replaced the sentence with “The de novo organic synthesis of these fatty alcohols is described …”

      (6) L525 and 527: There are white squares between the number and the unit. Perhaps some characters have been garbled.

      Apparently an error upon file conversion, corrected.

      (7) L795: ORCo?

      Corrected.

      (8) L829-830 & Figure 4: Where is Figure 4D?

      Thank you for spotting this mistake from the older version of Figure 4. The SSR traces referred to in the legend are in fact a part of Figure 5. Moreover, Figure 4 is now reworked based on the comments by Reviewer 3.

      (9) L860-864: Why did the authors select the result of edgeR for the volcano plot in Figure 7 although the authors use both DESeq2 and edgeR? An explanation would be needed.

      Both algorithms, DESeq2 and edgeR, are routinely used for differential gene expression analysis. Since they differ in read count normalization method and statistical testing, we decided to use both of them independently in order to reduce false positives. Because the resulting fold changes were practically identical in both algorithms (results for both analyses are listed in Supplementary table S15), we only reported in Fig. 7 the outputs for edgeR to avoid redundancies. We added in the Results section the information that both techniques listed PsimOR14 among the most upregulated in workers.
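
      The idea of running two methods independently and trusting only genes flagged by both can be sketched as a simple set intersection. The sketch below is not the authors' pipeline: it assumes each method's output has already been exported as a table of log2 fold change and adjusted p-value per gene, and the gene names, values, and cutoffs are hypothetical.

```python
def significant(table, max_fdr=0.05, min_abs_log2fc=1.0):
    """Genes passing both an adjusted p-value and a fold-change cutoff.

    `table` maps gene name -> (log2 fold change, adjusted p-value).
    The cutoffs are illustrative defaults, not values from the manuscript.
    """
    return {
        gene
        for gene, (log2fc, fdr) in table.items()
        if fdr <= max_fdr and abs(log2fc) >= min_abs_log2fc
    }


# Hypothetical exported results from the two methods.
deseq2_results = {"geneA": (3.1, 0.001), "geneB": (1.4, 0.03), "geneC": (0.4, 0.20)}
edger_results = {"geneA": (3.0, 0.002), "geneB": (1.3, 0.08), "geneC": (0.5, 0.30)}

# Keep only genes called by both methods to reduce false positives.
consensus = significant(deseq2_results) & significant(edger_results)
print(sorted(consensus))
```

      In this toy example a gene called by only one method is dropped from the consensus set, which is the conservative behavior the authors describe.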

      Reviewer #3 (Recommendations for the authors):

      The discussion contains many descriptions that would fit better into the introduction, where they could be used to hint at the study's importance (e.g., 292-311, 381-412). The remaining parts often lack a detailed discussion of the results that integrates details from other insect studies. Although references were provided, no details were usually outlined. It would be helpful to see a stronger emphasis on what we learn from this study.

      Along with rewriting the introduction, we also modified the discussion. As suggested, the lines 292-311 were rewritten and placed in the introduction. By contrast, we preferred to keep the two paragraphs 381-412 in the discussion, since both of them outline the potential future interesting targets of research on termite ORs.

      As suggested, the discussion has been enriched and now includes comparative examples and relevant references about the broad/narrow selectivity of insect ORs, about the expected breadth of tuning of pheromone receptors vs. ORs detecting environmental cues, about the potential role of additional neurons housed in the neocembrene-detecting sensillum of P. simplex workers, etc. From both introduction and discussion the redundant details on the chemistry of termite communication have been removed.

      This includes explanations of the advantages of the specific methodologies the authors used and how they helped solve the manuscript's problem. What does the phylogeny solve? Was it used to select the ORs tested? It would be helpful to discuss what the phylogeny shows in comparison to other well-studied OR phylogenies, like those from the social Hymenoptera.

      We understand the comment. In fact, our motivation to include the phylogenetic tree of termite ORs was essentially (i) to demonstrate the orthologous nature of OR diversity with few expansions on low taxonomic levels, and (ii) to demonstrate graphically the relationship among the four selected sequences. We do not attempt a comprehensive phylogenetic analysis here, because it would be redundant given that we recently published a large OR phylogeny which includes all sequences used in the present manuscript and analysed them in the proper context of related (cockroaches) and unrelated insect taxa (Johny et al., 2023). That paper also compares the termite phylogenetic pattern with those observed in other Insecta. It is repeatedly cited at appropriate places in the present manuscript and its main observations are provided in the Introduction section. Therefore, we feel that a thorough discussion of termite phylogeny would be redundant in the present paper.

      The authors categorized the sensilla types. Potential problems in the categorization aside, it would be helpful to know if it is expected that you have sensilla specialized in perceiving one specific pheromone. What is known about sensilla in other insects?

      We understand. In the discussion of the revised version, we elaborate on the features typical/expected for a pheromone receptor and for the sensillum housing this receptor together with two other olfactory sensory neurons, including examples from other insects.

      As the manuscript currently stands, specialist readers with their respective background knowledge would find this study very interesting. In contrast, the general reader would probably fail to appreciate the importance of the results.

      We hope that the re-organized and simplified introduction may now be more intelligible even for non-specialist readers.

      (1) L35: Should "workers" be replaced with "worker antennae"?

      Corrected.

      (2) L62: Should "conservativeness" be replaced by "conservation"?

      Replaced with “parsimony”.

      (3) L129: How and why did the authors choose four candidate ORs? I could not find any information about this in the manuscript. I wondered why they did not pick the more highly expressed PsimOr20 and 26 (Figure 7).

      As already replied above in the Weaknesses section, we selected for the first deorphanization attempts only a modest set of four ORs, while an additional set is currently being tested. We also explained above the inclusion criteria, i.e. (i) full-length ORF and at least 6 unambiguously predicted transmembrane regions, and (ii) presence on different branches (subbranches) of the OR phylogeny. For these reasons, we did not primarily consider the expression patterns of different ORs. As for Fig. 7, it shows differential expression between soldiers and workers, which was not the primary guideline either and the data was obtained only after having the ORs tested by SSR. Yet, even though we had data on P. simplex ORs expression (Fig. 1B), we did not presume that pheromone receptors should be among the most expressed ORs, given the richness of chemical cues detected by worker termites and unlike, e.g., male moths, where ORs for sex pheromones are intuitively highly expressed.

      The strategy of OR selection is specified in the results section of the revised manuscript under “Phylogenetic reconstruction and candidate OR selection”.

      (4) 198 to 200: SI, II, and III look very similar. Additional measurements rather than qualitative descriptions are required to consider them distinct sensilla. The bending of SIII could be an artifact of preparation. I do not see how the authors could distinguish between SI and SII under the optical microscope for recordings. A detailed explanation is required.

      As we responded above in the “Weaknesses” section, we admit that the sensilla classification is not intelligible. Therefore, we decided in the revised version to abandon the classification of sensilla types and only focus on the observations made on the neocembrene-responding sensillum. To recognize the specific sensillum, we used its topology on the last antennal segment. Because termite antennae are not densely populated with sensilla, it is relatively easy to distinguish individual sensilla based on their topology on the antenna, both under the optical microscope and in SEM photographs. The modifications affect Fig. 4, its legend and the corresponding part of the results section (Identification of P. simplex olfactory sensillum responding to neocembrene).

      (5) 208: "Than" instead of "that"

      Corrected.

      (6) 280: I suggest replacing "demand" with "capabilities"

      Corrected.

      (7) 312: Why "nevertheless? It sounds as if the authors suggest that there is evidence that ORs are not important for communication. This should be reworded.

      We removed “Nevertheless” from the beginning of the sentence.

      (8) 321 to 323: This sentence sounds as if something is missing. I suggest rewriting it.

      This sentence simply says that the Drosophila empty neuron system is a good tool for termite OR deorphanization and that termite ORs work well with Drosophila ORCo. We reworded the sentence.

      (9) 323: I suggest starting a new paragraph.

      Corrected.

      (10) 421: How many colonies were used for each of the analyses?

      The data for this manuscript were collected from three different colonies collected in Cuba. We now describe in the Materials and Methods section which analyses were conducted with each of the colonies.

      (11) 430: Did the termites originate from one or multiple colonies and did the authors sample from the Florida and Cuba population?

      The data for this manuscript were collected from three different colonies collected in Cuba. We now describe in the Materials and Methods section which analyses were conducted with each of the colonies.

      (12) 501: How was the termite antenna fixated? The authors refer to the Drosophila methods, but given the large antennal differences between these species, more specific information would be helpful.

      Understood. We added the following information into the Methods section under “Electrophysiology”: “The grounding electrode was carefully inserted into the clypeus and the antenna was fixed on a microscope slide using a glass electrode. To avoid the antennal movement, the microscope slide was covered with double-sided tape and the three distal antennal segments were attached to the slide.”

      (13) 509: I want to confirm that the authors indicate that the outlet of the glass tube with the airstream and odorant is 4 cm away from the Drosophila or termite antenna. The distance seems to be very large.

      Thank you for spotting this obvious mistake. The 4 cm distance applies to the opening for Pasteur pipette insertion into the delivery tube; the outlet itself is situated approx. 1 cm from the antenna. This information is now corrected.

      (14) 510/527: It looks like all odor panels were equally applied onto the filter paper despite the difference in solvent (hexane and paraffin oil). How was the solvent difference addressed?

      In our study we combined two types of odorant panels. First, we tested on all four studied receptors a panel containing several compounds relevant for termite chemical communication, including the C12 unsaturated alcohols, the diterpene neocembrene, the sesquiterpene (3R,6E)-nerolidol and other compounds. These compounds are stored in the laboratory as hexane solutions to prevent oxidation/polymerization, and it is not advisable to transfer them to another solvent. In the second step we used three additional panels of frequently occurring insect semiochemicals, which are stored as paraffin oil solutions, so as to address the breadth of PsimOR14 tuning. We are aware that the evaporation dynamics differ between the two solvents, but we did not have any suitable way to solve this problem. We believe that the use of the two solvents does not compromise the general message on the receptor specificity. For each panel, the corresponding solvent is used as a control. Similarly, the use of two different solvents for SSR can be encountered in other studies, e.g. 10.1016/j.celrep.2015.07.031.

      (15) 518: delta spikes/sec works for all tables except for the wild type in Table S5. I could not figure out how the authors get to delta spikes/sec in that table.

      Thank you for your sharp eye. Due to our mistake, the values of Δ spikes per second reported in Table S5 for W1118 were erroneously calculated using the formula for 0.5 sec stimulation instead of 1 sec. We corrected this mistake, which does not affect the interpretation of the results in Table S5 and Fig. 2.
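
To see why the stimulation window matters, here is a minimal sketch of one common way to express an SSR response as Δ spikes per second. The exact formula used for the tables is not given here, so this assumes that spikes are counted over equal-length windows before and during stimulation and that the difference is divided by the window length; the function name `delta_spikes_per_second` and the counts are hypothetical.

```python
def delta_spikes_per_second(spikes_during, spikes_before, window_s):
    """Change in firing rate between equal-length windows, in spikes/s.

    Assumed convention (not taken from the manuscript): both counts cover
    windows of window_s seconds, immediately before and during stimulation.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return (spikes_during - spikes_before) / window_s


# Hypothetical counts: 30 spikes during stimulation, 10 spikes before.
counts = (30, 10)
print(delta_spikes_per_second(*counts, window_s=1.0))  # 20.0
print(delta_spikes_per_second(*counts, window_s=0.5))  # 40.0
```

Under this assumed convention, applying the 0.5 s formula to counts taken over 1 s doubles the reported value, so the error is a constant scaling factor across all odorants measured in the same way.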

      522: Did the workers and soldiers originate from different colonies or different populations?

      We now clearly describe in the Material and Methods section the origin of termites for different experiments. EAG measurements were made using individuals (workers, soldiers) from one Cuban colony.

      (16) Figure 6C/D: I suggest matching colors between the two figures. For example, instead of using an orange circle in C and a green coloration of the intracellular flap in D, I recommend using blue, which is not used for something else. In addition, the binding pocket could be separated better from anything else in a different color.

      We agree that the color match for the intracellular flap was missing. This figure is now reworked and the colors should have a better match and the binding region is better delineated.

      (17) Figure 7/Table S15: It is unclear where the transcriptome data originate and what they are based on. Are these antennal transcriptomes or head transcriptomes? Do these data come from previous data sets or data generated in this study? Figure 7 refers to heads, Table S15 to workers and soldiers, and the methods only refer to antennal extractions. This should be clarified in the text, the figure, and the table.

      We admit that the replicate numbers and origin of the RNA Seq data should be better specified, and that the information that the RNA Seq data originated from samples of heads+antennae of workers and soldiers should be provided at appropriate places. Therefore, we added more information on replicates and origin of the data in the Methods section (Bioinformatics), made clear that these data come from our previous research, and refer to the corresponding bioproject. Likewise, the Figure 7 legend and Table S15 heading have been updated.

    1. Reviewer #1 (Public review):

      SMC5/6 is a highly conserved complex able to dynamically alter chromatin structure, playing in this way critical roles in genome stability and integrity that include homologous recombination and telomere maintenance. In the last years, a number of studies have revealed the importance of SMC5/6 in restricting viral expression, which is in part related to its ability to repress transcription from circular DNA. In this context, Oravcova and colleagues recently reported how SMC5/6 is recruited by two mutually exclusive complexes (orthologs of yeast Nse5/6) to SV40 LT-induced PML nuclear bodies (SIMC/SLF2) and DNA lesions (SLF1/2). In this current work, the authors extend this study, providing some new results. However, as a whole, the story lacks unity and does not delve into the molecular mechanisms responsible for the silencing process. One has the feeling that the story is somewhat incomplete, putting together not directly connected results.

      (1) In the first part of the work, the authors confirm previous conclusions about the relevance of a conserved domain defined by the interaction of SIMC and SLF2 for their binding to SMC6, and extend the structural analysis to the modelling of the SIMC/SLF2/SMC complex by AlphaFold. Their data support a model where this conserved surface of SIMC/SLF2 interacts with SMC at the backside of SMC6's head domain, confirming the relevance of this interaction site with specific mutations. These results are interesting but confirmatory of a previous and more complete structural analysis in yeast (Li et al. NSMB 2024). In any case, they reveal the conservation of the interaction. My major concern is the lack of connection with the rest of the article. This structure does not help to understand the process of transcriptional silencing reported later beyond its relevance to recruit SMC5/6 to its targets, which was already demonstrated in the previous study.

      (2) In the second part of the work, the authors focus on the functionality of the different complexes. The authors demonstrate that SMC5/6's role in transcription silencing is specific to its interaction with SIMC/SLF2, whereas SMC5/6's role in DNA repair depends on SLF1/2. These results are quite expected according to previous results. The authors already demonstrated that SLF1/2, but not SIMC/SLF2, are recruited to DNA lesions. Accordingly, they observe here that SMC5/6 recruitment to DNA lesions requires SLF1/2 but not SIMC/SLF2. Likewise, the authors already demonstrated that SIMC/SLF2, but not SLF1/2, targets SMC5/6 to PML NBs. Taking into account the evidence that connects SMC5/6's viral resistance at PML NBs with transcription repression, the observed requirement of SIMC/SLF2 but not SLF1/2 in plasmid silencing is somewhat expected. This does not mean that the expectation need not be experimentally confirmed. However, the study falls short in advancing the mechanistic process, despite some interesting results such as the dispensability of the PML NBs or the antagonistic role of the SV40 large T antigen. It would have been interesting to explore how LT overcomes SMC5/6-mediated repression: Does LT prevent SIMC/SLF2 from interacting with SMC5/6? Or does it prevent SMC5/6 from binding the plasmid? Is the transcription-dependent plasmid topology altered in cells lacking SIMC/SLF2? And in cells expressing LT? In its current form, the study is confirmatory and preliminary. In agreement with this, the cartoons modelling results here and in the previous work look basically the same.

      (3) There are some points about the presented data that need to be clarified.

    2. Author response:

      This study builds on, extends, and experimentally validates results/models from our previous study. Our and others’ data implicated SMC5/6, PML nuclear bodies (PML NBs), and SUMOylation in the transcriptional repression of extrachromosomal circular DNA (ecDNA). Moreover, multiple viruses were found to express early genes that combat SMC5/6-based repression through targeted proteasomal degradation (e.g. Hepatitis B virus HBx and HIV-1 Vpr). Thus, our analysis of the roles of the foregoing in plasmid repression yields a coherent set of results for the field to build on.

      In our previous study we presented a model, but no supportive ecDNA silencing data, suggesting that distinct SMC5/6 subcomplexes, SIMC1-SLF2 and SLF1/2, separately control its transcriptional repression and DNA repair activities. In this study we experimentally validate that prediction using an ecDNA silencing assay and SMC5/6 localization analysis following DNA damage.

      Our study further reveals the unexpected dispensability of PML NBs in the silencing of simple plasmid DNA, a departure from current dogma. This raises important questions for the field to address in terms of the silencing mechanisms for different ecDNAs across different cell types. Despite the dispensability of SUMO-rich PML NBs, SUMOylation is required for ecDNA repression. Lastly, the SV40 LT antigen early gene product counteracts ecDNA silencing. These results used genetic epistasis arguments to implicate SUMO and LT in SMC5/6-based transcriptional silencing. We provide provisional responses for some of the reviewer’s general comments below.

      Public Reviews:

      Reviewer #1 (Public review):

      SMC5/6 is a highly conserved complex able to dynamically alter chromatin structure, playing in this way critical roles in genome stability and integrity that include homologous recombination and telomere maintenance. In the last years, a number of studies have revealed the importance of SMC5/6 in restricting viral expression, which is in part related to its ability to repress transcription from circular DNA. In this context, Oravcova and colleagues recently reported how SMC5/6 is recruited by two mutually exclusive complexes (orthologs of yeast Nse5/6) to SV40 LT-induced PML nuclear bodies (SIMC/SLF2) and DNA lesions (SLF1/2). In this current work, the authors extend this study, providing some new results. However, as a whole, the story lacks unity and does not delve into the molecular mechanisms responsible for the silencing process. One has the feeling that the story is somewhat incomplete, putting together not directly connected results.

      Please see the introductory overview above.

      (1) In the first part of the work, the authors confirm previous conclusions about the relevance of a conserved domain defined by the interaction of SIMC and SLF2 for their binding to SMC6, and extend the structural analysis to the modelling of the SIMC/SLF2/SMC complex by AlphaFold. Their data support a model where this conserved surface of SIMC/SLF2 interacts with SMC at the backside of SMC6's head domain, confirming the relevance of this interaction site with specific mutations. These results are interesting but confirmatory of a previous and more complete structural analysis in yeast (Li et al. NSMB 2024). In any case, they reveal the conservation of the interaction. My major concern is the lack of connection with the rest of the article. This structure does not help to understand the process of transcriptional silencing reported later beyond its relevance to recruit SMC5/6 to its targets, which was already demonstrated in the previous study.

      Demonstrating the existence of a conserved interface between the Nse5/6-like complexes and SMC6 in both yeast and human is foundationally important and was not revealed in our previous study. It remains unclear how this interface regulates SMC5/6 function, but yeast studies suggest a potential role in inhibiting the SMC5/6 ATPase cycle. Nevertheless, the precise function of Nse5/6 and its human orthologs in SMC5/6 regulation remains undefined, largely due to technical limitations in available in vivo analyses. The SIMC1/SLF2/SMC6 complex structure likely extends to the SLF1/2/SMC6 complex, suggesting a unifying function of the Nse5/6-like complexes in SMC5/6 regulation, albeit in the distinct processes of ecDNA silencing and DNA repair. There have been no studies to date (including this one) showing that SIMC1-SLF2 is required for SMC5/6 recruitment to ecDNA. Our previous study showed that SIMC1 was needed for SMC5/6 to colocalize with SV40 LT antigen at PML NBs. Here we show that SIMC1 is required for ecDNA repression, in the absence of PML NBs, which was not anticipated.

      (2) In the second part of the work, the authors focus on the functionality of the different complexes. The authors demonstrate that SMC5/6's role in transcription silencing is specific to its interaction with SIMC/SLF2, whereas SMC5/6's role in DNA repair depends on SLF1/2. These results are quite expected according to previous results. The authors already demonstrated that SLF1/2, but not SIMC/SLF2, are recruited to DNA lesions. Accordingly, they observe here that SMC5/6 recruitment to DNA lesions requires SLF1/2 but not SIMC/SLF2.

      Our previous study only examined the localization of SLF1 and SIMC1 at DNA lesions. The localization of these subcomplexes alone should not be used to define their roles in SMC5/6 localization. Indeed, the field is split in terms of whether Nse5/6-like complexes are required for ecDNA binding/loading, or regulation of SMC5/6 once bound.

      Likewise, the authors already demonstrated that SIMC/SLF2, but not SLF1/2, targets SMC5/6 to PML NBs. Taking into account the evidence that connects SMC5/6's viral resistance at PML NBs with transcription repression, the observed requirement of SIMC/SLF2 but not SLF1/2 in plasmid silencing is somewhat expected. This does not mean that the expectation need not be experimentally confirmed. However, the study falls short in advancing the mechanistic process, despite some interesting results such as the dispensability of the PML NBs or the antagonistic role of the SV40 large T antigen. It would have been interesting to explore how LT overcomes SMC5/6-mediated repression: Does LT prevent SIMC/SLF2 from interacting with SMC5/6? Or does it prevent SMC5/6 from binding the plasmid? Is the transcription-dependent plasmid topology altered in cells lacking SIMC/SLF2? And in cells expressing LT? In its current form, the study is confirmatory and preliminary. In agreement with this, the cartoons modelling results here and in the previous work look basically the same.

      We agree that determining the potential mechanism of action of LT in overcoming SMC5/6-based repression is an important next step. It will require the identification of any direct interactions with SMC5/6 subunits, and better methods for assessing SMC5/6 loading and activity on ecDNAs. Unlike HBx, Vpr, and BNRF1, LT does not appear to induce degradation of SMC5/6, making it a more complex and interesting challenge. Also, the dispensability of PML NBs in plasmid silencing versus viral silencing raises multiple important questions about SMC5/6’s repression mechanism.

      (3) There are some points about the presented data that need to be clarified.

      Reviewer #2 (Public review):

      Oravcová et al. present data supporting a role for SIMC1/SLF2 in silencing plasmid DNA via the SMC5/6 complex. Their findings are of interest, and they provide further mechanistic detail of how the SMC5/6 complex is recruited to disparate DNA elements. In essence, the present report builds on the authors' previous paper in eLife in 2022 (PMID: 36373674, "The Nse5/6-like SIMC1-SLF2 complex localizes SMC5/6 to viral replication centers") by showing the role of SIMC1/SLF2 in localisation of the SMC5/6 complex to plasmid DNA, and the distinct requirements as compared to recruitment to DNA damage foci. Although the findings of the manuscript are of interest, we are not yet convinced that the new data presented here represent a compelling new body of work; they would better fit the format of a "research advance" article. In their previous paper, Oravcová et al. show that the recruitment of SMC5/6 to SV40 replication centres is dependent on SIMC1, and specifically, that it is dependent on SIMC1 residues adjacent to neighbouring SLF2.

      We agree: this manuscript fits the Research Advance model, which is the format in which it was submitted.

      Reviewer #3 (Public review):

      Summary:

      This study by the Boddy and Otomo laboratories further characterizes the roles of SMC5/6 loader proteins and related factors in SMC5/6-mediated repression of extrachromosomal circular DNA. The work shows that mutations engineered at an AlphaFold-predicted protein-protein interface formed between the loader SLF2/SIMC1 and SMC6 (similar to the interface in the yeast counterparts observed by cryo-EM) prevent co-IP of the respective proteins. The mutations in SLF2 also hinder plasmid DNA silencing when expressed in SLF2-/- cell lines, suggesting that this interface is needed for silencing. SIMC1 is dispensable for recruitment of SMC5/6 to sites of DNA damage, while SLF1 is required, thus separating the functions of the two loader complexes. Preventing SUMOylation (with a chemical inhibitor) increases transcription from plasmids but does not do so in SLF2-deleted cell lines, indicating that SMC5/6 silences plasmids in a SUMOylation-dependent manner. Expression of LT is sufficient for increased expression and, again, is not additive or synergistic with SIMC1 or SLF2 deletion, indicating that LT prevents silencing by directly inhibiting SMC5/6. In contrast, PML bodies appear dispensable for plasmid silencing.

      Strengths:

      The manuscript defines the requirements for plasmid silencing by SMC5/6 (an interaction of Smc6 with the loader complex SLF2/SIMC1, SUMOylation activity) and shows that SLF1 and PML bodies are dispensable for silencing. Furthermore, the authors show that LT can overcome silencing, likely by directly binding to (but not degrading) SMC5/6.

      Weaknesses:

      (1) Many of the findings were expected based on recent publications.

      Please see introductory paragraphs above.

    1. The sun shines full in the faces of the expectant multitude, but a Greek is not fastidious about weather;—besides, there is a pleasant breeze blowing over us from the sea. And the time is passed in discussion of the probable character of the different plays, and the chances of the competitors. These are not, as we might have expected, the poets whose plays are to be presented, but the rich men who put the several plays upon the stage. A poet is not usually a rich man, and could not of course afford to hire, as he must, a chorus and actors, and get dresses and scenery arranged; left to himself, he could no more bring out his piece than the ordinary composer could bring out an opera. So the plan in Athens was this. The rich men in each tribe were required to contribute out of their wealth to the benefit and amusement of their fellow-citizens. When ships were wanted, the burden of supplying them was laid on the wealthier citizens, to each of whom, or to several clubbed together, the duty of providing a ship was assigned. Similarly, when the festivals were to be supplied with plays, the office of putting a piece on the stage—of furnishing a chorus, as it was called—devolved upon some one very rich citizen, or upon several of moderate wealth who bore the expense between them. The play to be thus provided for was assigned by the magistrates out of those which the rival poets had sent in. The furnisher of the chorus then collected men who could sing and dance to be trained for the chorus, chose the two or the three actors among whom the parts should be distributed, had scenes painted and dresses hired, and provided whatever else was needed for the due performance of the piece. 
It was a point of honour to do the whole as liberally and artistically as possible; and an ambitious man would gain popularity by introducing new stage-machinery, new effects in the music, or new inventions for making the gestures of the actors visible and their voices audible throughout the immense building. For it will seem most wonderful, if we consider the case, that any actor could make himself heard by thirty thousand people in the open air; still more that his voice, so elevated as to penetrate through all that multitude, should be able to preserve distinct the various tones of grief or joy, of submission or command. To meet this difficulty the Greeks contrived masks, which enclosed, it seems, the whole head, and were fitted with acoustic arrangements such as are unknown to us, by which the power of the human voice was wonderfully increased. In the same way, in order that the persons of the actors might not appear diminutive from the great distance at which most of the spectators saw them, they were made taller by very thick-soled boots, and broader by the judicious arrangement of their dresses; while the mask, no doubt, rendered the appearance of the head proportionate to this enlarged stature. There were, too, in the building of the wall which formed the back of the stage, acoustic principles observed, by which those who spoke from the interior—as from within a house or a room—might be heard more distinctly. And improvements in these matters were made from time to time by those to whom the equipment of plays was assigned. So when the names of such and such men are mentioned as probable competitors, it is these furnishers of the chorus who are meant, though the success of any one of them would no doubt be considered the more probable if he had Æschylus or Sophocles for his poet.

      The passage describes the poet as an ordinary person with less money than people might think, and it shows how much the furnisher of the chorus had to juggle to get everything needed for a show. I like that, from a financial perspective, the rich had to contribute more to help the less fortunate and to provide the city as a whole with entertainment. I also like that the wealthy were tasked with doing more because they were able to, so that the citizens and the city could grow and do more together, especially in entertainment. Toward the end, the passage says that a furnisher's success would have been considered more probable if he had Aeschylus or Sophocles as his poet, and I like how Sophocles is named next to Aeschylus to show their greatness in this field.

    1. With conventional RT, xerostomia is permanent. Salivary gland‑sparing techniques using IMRT (intensity‑modulated irradiation technique - yoğunluk ayarlı radyoterapi) have been now in use. IMRT is rapidly emerging as the standard of care for head and neck cancer. Salivary gland‑sparing IMRT is associated with a gradual recovery of salivary flow over time and improved quality‑of‑life as compared to conventional RT. A team including speech therapist, dietitian, dental specialist, and psychologist along with radiation oncology is required to deal with those complications and prevent to morbidity and mortality of head and neck cancer patients during and after radiotherapy.

      ① With conventional RT, xerostomia is permanent. ① Konvansiyonel radyoterapide, ağız kuruluğu (xerostomia) kalıcıdır.

      ② Salivary gland‑sparing techniques using IMRT (intensity‑modulated irradiation technique - yoğunluk ayarlı radyoterapi) have been now in use. ② Şu anda, tükürük bezlerini koruyan yöntemler olarak IMRT (yoğunluk ayarlı radyoterapi) kullanılmaktadır.

      ③ IMRT is rapidly emerging as the standard of care for head and neck cancer. ③ IMRT, baş ve boyun kanserleri için hızla standart tedavi yöntemi haline gelmektedir.

      ④ Salivary gland‑sparing IMRT is associated with a gradual recovery of salivary flow over time and improved quality‑of‑life as compared to conventional RT. ④ Tükürük bezlerini koruyan IMRT, konvansiyonel radyoterapiye kıyasla zamanla tükürük akışında kademeli iyileşme ve yaşam kalitesinde artış ile ilişkilidir.

      ⑤ A team including a speech therapist, dietitian, dental specialist, and psychologist, along with radiation oncology, is required to deal with those complications and to prevent morbidity and mortality in head and neck cancer patients during and after radiotherapy. ⑤ Konuşma terapisti, diyetisyen, diş hekimi, psikolog ve radyasyon onkolojisi uzmanlarından oluşan bir ekip, baş ve boyun kanseri hastalarının radyoterapi sırasında ve sonrasında ortaya çıkan komplikasyonları yönetmek ve hastalık yükünü azaltmak için gereklidir.

    2. To minimize patient discomfort and morbidity, an understanding of the deleterious effects of radiotherapy is required. Introducing good oral home care and more frequent oral prophylaxis visits to the dentists before radiotherapy will allow for continuing care during and after therapy. The cancer patient who is to receive or has received curative doses of radiation to the head and neck cancer presents a challenge for the dentist. The importance of patient compliance should be emphasized.

      ❶ To minimize patient discomfort and morbidity, an understanding of the deleterious effects of radiotherapy is required. ❶ Hasta rahatsızlığını ve hastalık yükünü en aza indirmek için, radyoterapinin zararlı etkilerinin anlaşılması gereklidir.

      ❷ Introducing good oral home care and more frequent oral prophylaxis visits to the dentists before radiotherapy will allow for continuing care during and after therapy. ❷ Radyoterapiden önce iyi ağız hijyeni uygulamalarının başlatılması ve diş hekimine daha sık profilaktik ziyaretlerin yapılması, tedavi sırasında ve sonrasında devam eden bakımın sağlanmasına olanak tanır.

      ❸ The cancer patient who is to receive or has received curative doses of radiation to the head and neck cancer presents a challenge for the dentist. ❸ Baş ve boyun kanseri için küratif dozlarda radyasyon alacak veya alan kanser hastası, diş hekimi için zorlu bir durum oluşturur.

      ❹ The importance of patient compliance should be emphasized. ❹ Hasta uyumunun önemi vurgulanmalıdır.

    3. Osteoradionecrosis is a condition of a nonvital bone in the site of radiation injury. ORN can be spontaneous, but it is most often related to hypovascular, hypocellular, and hypoxic conditions that exist in bone. Induced deficient cellular turnover and collagen synthesis in a hypoxic, hypovascular, and hypocellular environment in which tissue breakdown exceeds the repair capabilities of the wounded tissue. Clinically, ORN may initially present as bone lysis under gingiva and mucosa. This process is self‑limiting because the damaged bone sequestrates then is shed with subsequent healing. If the soft tissue breaks down, the bone becomes exposed to saliva and secondary contamination occurs. Sepsis may also be introduced by dental extraction or surgery producing a more aggressive form. This progressive form may produce severe pain or fracture and require extensive resection. The reported incidence of ORN ranges from 0.92% of all head and neck cancer patients receiving radiotherapy.

      ❶ Osteoradionecrosis is a condition of a nonvital bone in the site of radiation injury. ❶ Osteoradiyonekroz, radyasyon hasarı olan bölgede canlılığını yitirmiş kemik durumudur.

      ❷ ORN can be spontaneous, but it is most often related to hypovascular, hypocellular, and hypoxic conditions that exist in bone. ❷ ORN kendiliğinden ortaya çıkabilir ancak daha çok kemikte bulunan az damar, az hücre ve oksijen yetersizliği koşullarıyla ilişkilidir.

      ❸ Induced deficient cellular turnover and collagen synthesis in a hypoxic, hypovascular, and hypocellular environment in which tissue breakdown exceeds the repair capabilities of the wounded tissue. ❸ Oksijen yetersizliği, az damar ve az hücre içeren ortamda hücre yenilenmesi ve kolajen sentezindeki yetersizlik, doku yıkımının yaralı dokunun onarım kapasitesini aşmasıyla oluşur.

      ❹ Clinically, ORN may initially present as bone lysis under gingiva and mucosa. ❹ Klinik olarak ORN, başlangıçta diş eti ve mukoza altında kemik çözünmesi şeklinde görülebilir.

      ❺ This process is self‑limiting because the damaged bone sequestrates then is shed with subsequent healing. ❺ Bu süreç kendi kendini sınırlar çünkü hasarlı kemik ayrılır ve ardından iyileşme gerçekleşir.

      ❻ If the soft tissue breaks down, the bone becomes exposed to saliva and secondary contamination occurs. ❻ Yumuşak dokunun yıkılması halinde, kemik tükürüğe maruz kalır ve ikincil kontaminasyon gelişir.

      ❼ Sepsis may also be introduced by dental extraction or surgery producing a more aggressive form. ❼ Diş çekimi veya cerrahi işlemler sepsise neden olarak daha agresif bir form oluşturabilir.

      ❽ This progressive form may produce severe pain or fracture and require extensive resection. ❽ Bu ilerleyici form şiddetli ağrıya veya kırığa yol açabilir ve geniş rezeksiyon gerektirebilir.

      ❾ The reported incidence of ORN ranges from 0.92% of all head and neck cancer patients receiving radiotherapy. ❾ ORN bildirilen insidansı, radyoterapi alan tüm baş ve boyun kanseri hastalarının %0,92’si arasında değişmektedir.

    4. Patients treated for head and neck cancer have a high risk of malnutrition. Patients may lose the desire to eat because of soreness of the mouth, trouble swallowing, or dry mouth. When eating causes discomfort or pain, the patient's quality of life and nutritional well-being suffer.
       • Nutritional support may include liquid diet and tube feeding
       • High calorie, high protein liquid to meet their needs
       • Intravenous infusion of nutritional supplements.
       Swallowing problems are managed by a team of experts
       • Speech therapist
       • Dietician
       • Dental specialist
       • Psychologist

      ❶ Patients treated for head and neck cancer have a high risk of malnutrition. ❶ Baş ve boyun kanseri tedavisi gören hastalarda malnütrisyon riski yüksektir.

      ❷ Patients may lose the desire to eat because of soreness of the mouth, trouble swallowing, or dry mouth. ❷ Hastalar, ağızda ağrı, yutma güçlüğü veya kuru ağız nedeniyle yemek yeme isteğini kaybedebilirler.

      ❸ When eating causes discomfort or pain, the patient's quality of life and nutritional well-being suffer. ❸ Yemek yemek rahatsızlık veya ağrıya neden olduğunda, hastaların yaşam kalitesi ve beslenme durumu olumsuz etkilenir.

      ❹ Nutritional support may include liquid diet and tube feeding ❹ Beslenme desteği sıvı diyet ve beslenme tüpü ile olabilir.

      ❺ High calorie, high protein liquid to meet their needs ❺ İhtiyaçlarını karşılamak için yüksek kalorili, yüksek proteinli sıvılar.

      ❻ Intravenous infusion of nutritional supplements. ❻ Besin takviyelerinin damar yoluyla verilmesi.

      ❼ Swallowing problems are managed by a team of experts ❼ Yutma problemleri bir uzman ekibi tarafından yönetilir.

      ❽ Speech therapist ❽ Konuşma terapisti

      ❾ Dietician ❾ Diyetisyen

      ❿ Dental specialist ❿ Diş hekimi uzmanı

      ⓫ Psychologist ⓫ Psikolog

    5. Radiation‑induced dry mouth is a common and significant consequence of head and neck radiotherapy; oral dryness reflects the progressive radiation‑induced salivary gland acinar cell inflammation, fibrosis, and degeneration. The salivary glands are very sensitive to radiation. There is a sharp decrease in the salivary flow rate during the 1st week of RT with conventional fractionation (2 Gy/day). The decrease in flow rate is continuous throughout the treatment period especially when both parotids are irradiated. This correlates to the dose and duration of RT. There is immediate serous cell death accompanied by inflammatory cell infiltration and then continuous reduction of salivary flow rates. Patients often complain of thick, ropy saliva and a sensation that there is too much saliva because it is difficult to swallow.

      ❶ Radiation‑induced dry mouth is a common and significant consequence of head and neck radiotherapy; oral dryness reflects the progressive radiation‑induced salivary gland acinar cell inflammation, fibrosis, and degeneration. ❶ Radyasyon kaynaklı kuru ağız, baş ve boyun radyoterapisinin yaygın ve önemli bir sonucudur; ağız kuruluğu, radyasyona bağlı tükürük bezi asinar hücrelerinde ilerleyici iltihaplanma, fibrozis ve dejenerasyonu yansıtır.

      ❷ The salivary glands are very sensitive to radiation. There is a sharp decrease in the salivary flow rate during the 1st week of RT with conventional fractionation (2 Gy/day). ❷ Tükürük bezleri radyasyona karşı çok hassastır. Konvansiyonel fraksiyonlamada (günde 2 Gy) radyoterapinin ilk haftasında tükürük akış hızında keskin bir azalma olur.

      ❸ The decrease in flow rate is continuous throughout the treatment period especially when both parotids are irradiated. This correlates to the dose and duration of RT. ❸ Akış hızındaki azalma, özellikle her iki parotis bezi de ışınlandığında, tedavi süresi boyunca devam eder. Bu durum radyasyon dozu ve süresiyle ilişkilidir.

      ❹ There is immediate serous cell death accompanied by inflammatory cell infiltration and then continuous reduction of salivary flow rates. ❹ İnflamatuar hücre infiltrasyonuyla birlikte hemen seröz hücre ölümü meydana gelir ve ardından tükürük akış hızlarında sürekli azalma olur.

      ❺ Patients often complain of thick, ropy saliva and a sensation that there is too much saliva because it is difficult to swallow. ❺ Hastalar genellikle yoğun, yapışkan tükürükten ve yutkunmanın zor olması nedeniyle aşırı tükürük varmış hissinden şikayet ederler.

    6. Damage to the lining of the mouth and a weakened immune system make it easy for infection to occur. Oral mucositis breaks down the lining of the mouth, which lets bacteria, viruses, and fungi get into the blood. Dry mouth, which is common during radiotherapy to head and neck, may also raise the risk of infection in the mouth. Infection may be caused by bacteria, viruses, or fungi. A systematic review indicated that the weighted mean prevalence of clinical oral candidiasis during head and neck RT is 37.4%.

      ❶ Damage to the lining of the mouth and a weakened immune system make it easy for infection to occur. ❶ Ağız astarının (mukozasının) zarar görmesi ve bağışıklık sisteminin zayıflaması, enfeksiyonun oluşmasını kolaylaştırır.

      ❷ Oral mucositis breaks down the lining of the mouth, which lets bacteria, viruses, and fungi get into the blood. ❷ Oral mukozit, ağız astarını bozar ve bu durum bakteri, virüs ve mantarların kana karışmasına neden olabilir.

      ❸ Dry mouth, which is common during radiotherapy to head and neck, may also raise the risk of infection in the mouth. ❸ Baş ve boyun radyoterapisi sırasında yaygın olan ağız kuruluğu da ağızda enfeksiyon riskini artırabilir.

      ❹ Infection may be caused by bacteria, viruses, or fungi. ❹ Enfeksiyon; bakteri, virüs veya mantar kaynaklı olabilir.

      ❺ A systematic review indicated that the weighted mean prevalence of clinical oral candidiasis during head and neck RT is 37.4%. ❺ Sistematik bir derleme, baş ve boyun radyoterapisi sırasında klinik oral kandidiyazis görülme sıklığının ağırlıklı ortalama %37.4 olduğunu göstermiştir.

    7. The oral complications of head and neck radiation can be divided into two groups on the basis of the usual time of their occurrence.
       Acute complications:
       A therapeutic dose of radiation in head and neck cancer usually comprises a total of 64 Gy to 70 Gy in 32–35 fractions with the daily dose of 1.8–2.0 Gy/fraction. Acute complication appears 1–2 weeks after radiation starts; it depends on dose and site of radiation also.
       • Oropharyngeal mucositis
       • Change in salivary composition
       • Alteration of taste (Dysgeusia)
       • Infection (bacterial, fungal and viral)
       • Periodontium pain

      ❶ The oral complications of head and neck radiation can be divided into two groups on the basis of the usual time of their occurrence. ❶ Baş ve boyun radyasyonunun ağız komplikasyonları, genellikle ortaya çıkış zamanlarına göre iki gruba ayrılabilir.

      ❷ Acute complications ❷ Akut komplikasyonlar

      ❸ A therapeutic dose of radiation in head and neck cancer usually comprises a total of 64 Gy to 70 Gy in 32–35 fractions with the daily dose of 1.8–2.0 Gy/fraction. ❸ Baş ve boyun kanserinde terapötik radyasyon dozu genellikle toplam 64 Gy ile 70 Gy arasında, 32–35 fraksiyonda ve günlük 1.8–2.0 Gy/fraksiyon şeklindedir.

      ❹ Acute complication appears 1–2 weeks after radiation starts; it depends on dose and site of radiation also. ❹ Akut komplikasyonlar radyasyon başladıktan 1–2 hafta sonra ortaya çıkar; doz ve radyasyon bölgesine bağlıdır.

      ❺ Common acute complications include: ❺ Yaygın akut komplikasyonlar şunlardır:

      Oropharyngeal mucositis Orofaringeal mukozit

      Change in salivary composition Tükürük bileşiminde değişiklik

      Alteration of taste (Dysgeusia) Tat değişikliği (Dizgezi)

      Infection (bacterial, fungal and viral) Enfeksiyon (bakteriyel, fungal ve viral)

      Periodontium pain Periodonsiyum ağrısı
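The fractionation figures quoted in item ❸ above can be checked with simple arithmetic: the total dose is the number of fractions multiplied by the dose per fraction. The sketch below is illustrative only; `total_dose_gy` is a hypothetical helper name, not something from the source text.

```python
def total_dose_gy(fractions: int, dose_per_fraction_gy: float) -> float:
    """Total prescribed dose in gray: number of fractions x dose per fraction."""
    return fractions * dose_per_fraction_gy


# Endpoints of the ranges quoted in the passage (32-35 fractions, 1.8-2.0 Gy each).
for fractions in (32, 35):
    for dose in (1.8, 2.0):
        total = round(total_dose_gy(fractions, dose), 1)
        print(f"{fractions} fractions x {dose} Gy = {total} Gy")
```

Under this arithmetic, the quoted 64 to 70 Gy total matches the 2.0 Gy/fraction end of the daily range (32 × 2.0 = 64, 35 × 2.0 = 70); at 1.8 Gy per fraction, 35 fractions give 63 Gy.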

    8. Oral complications of head and neck radiation are more predictable, and are often more severe and can lead to permanent tissue changes that put the patient at risk for serious chronic complications. Patients should go for:

      ❶ Oral complications of head and neck radiation are more predictable, and are often more severe and can lead to permanent tissue changes that put the patient at risk for serious chronic complications. ❶ Baş ve boyun radyasyonuna bağlı oral komplikasyonlar daha öngörülebilirdir, genellikle daha şiddetlidir ve hastayı ciddi kronik komplikasyon riskine sokabilecek kalıcı doku değişikliklerine yol açabilir.

      ❷ Patients should go for routine dental evaluation, preventive oral care, and close follow-up before, during, and after radiation therapy. ❷ Hastalar radyoterapi öncesinde, sırasında ve sonrasında düzenli diş muayenesi, koruyucu ağız bakımı ve yakın takip yaptırmalıdır.

    1. Reviewer #1 (Public review):

      Summary:

      The authors note that it is challenging to perform diffusion MRI tractography consistently in both humans and macaques, particularly when deep subcortical structures are involved. The scientific advance described in this paper is effectively an update to the tracts that the XTRACT software supports. The claims of robustness are based on a very small selection of subjects from a very atypical dMRI acquisition (n=50 from HCP-Adult) and an even smaller selection of subjects from a more typical study (n=10 from ON-Harmony).

      Strengths:

      The changes to XTRACT are soundly motivated in theory (based on anatomical tracer studies) and practice (changes in seeding/masking for tractography), and I think the value added by these changes to XTRACT should be shared with the field. While other bundle segmentation software typically includes these types of changes in release notes, I think papers are more appropriate.

      Weaknesses:

      The demonstration of the new tracts does not include a large number of carefully selected scans and is only compared to the prior methods in XTRACT. The small n and limited statistical comparisons are insufficient to claim that they are better than an alternative. Qualitatively, this method looks sound.

      Subject selection at each stage is unclear in this manuscript. On page 5 the data are described as "Using dMRI data from the macaque (𝑁 = 6) and human brain (𝑁 = 50)". Were the 50 HCP subjects selected to cover a range of noise levels or subject head motion? Figure 4 describes 72 pairs for each of monozygotic, dizygotic, non-twin siblings, and unrelated pairs - are these treated separately? Similarly, NH had 10 subjects, but each was scanned 5 times. How was this represented in the sample construction?

      In the paper, the authors state "the mean agreement between HCP and NH reconstructions was lower for the new tracts, compared to the original protocols (𝑝 < 10^−10). This was due to occasionally reconstructing a sparser path distribution, i.e., slightly higher false negative rate," - how can we know this is a false negative rate without knowing the ground truth?

    1. Joint Public Review:

      Summary:

      The authors have conducted the largest to date Mendelian Randomization (MR) analysis of the association between genetically predicted measures of adiposity and risk of head and neck cancer (HNC) overall and by subsites within HNC. MR uses genetic predictors of an exposure, such as gene variants associated with high BMI or tobacco use, rather than data from individual physical exams or questionnaires, and if it can be done in its idealized state, there should be no problems with confounding. Traditional epidemiologic studies have reported a variety of associations between BMI (and a few other measures of adiposity) and risk of HNC that typically differ by the smoking status of the subjects. Those findings are controversial given the complex relationship between tobacco and both BMI and HNC risk. Tobacco smokers are often thinner than non-smokers, so this could create an artificial ('confounded') association that may not be fully adjusted away in risk models. The findings of a BMI-HNC association are often attributed to residual confounding, and this seems ripe for an MR approach if suitable genetic instrumental variables can be created. Here, the authors built a variety of genetic instrumental variables for BMI and other measures of adiposity, as well as two instrumental variables for smoking habits, and then tested their hypotheses in a large case-control set of HNC and controls with genetic data.

      The authors found that the genetic model for BMI was associated with HNC risk in simple models, but this association disappeared when using models that better accounted for pleiotropy, the condition when genetic variants are associated with more than one trait, such as both BMI and tobacco use. When they used both adiposity and tobacco use genetic instruments in a single model, there was a strong association with genetically predicted tobacco use (as is expected), but there was no remaining association with genetic predictors of adiposity. They conclude that high BMI/adiposity is not a risk factor for HNC.

      Strengths:

      The primary strength was the expansive use of a variety of different genetic instruments for BMI/adiposity/body size, along with employing a variety of MR model types, several of which are known to be less sensitive to pleiotropy. They also used the largest case-control sample size to date.

      Weaknesses:

      The lack of pleiotropy is an unconfirmable assumption of MR, and the addition of those models is therefore quite important, as this is a primary weakness of the MR approach. Given that concern, I read the sensitivity analyses using pleiotropy-robust models as the main result, and in that case, they can't test their hypotheses as these models do not show a BMI instrumental variable association. The other weakness, which might be remedied, is that the power of the tests here is not described. When a hypothesis is tested with an under-powered model, the apparent lack of association could be due to inadequate sample size rather than a true null. Typically, when a statistically significant association is reported, power concerns are discounted as long as the study is not so small as to create spurious findings. That is the case with their primary BMI instrumental variable model - they find an association so we can presume it was adequately powered. But the primary models they share are not the pleiotropy-robust methods MR-Egger, weighted median, and weighted mode. The tests for these models are null, and that could mean a couple of things: (1) the original primary significant association between the BMI genetic instrument was due to pleiotropy, and they therefore don't have a robust model to explore the effects of the tobacco genetic instrument. (2) The power for the sensitivity analysis models (the pleiotropy-robust methods) is inadequate, and the authors share no discussion about the relative power of the different MR approaches. If they do have adequate power, then again, there is no need to explore the tobacco instrument.

      Reviewing Editor Comments:

      We suggest that the authors add power estimates to assess whether the sample size is sufficient, given the strength and variability of the genetic instruments. It would also be helpful to present effect estimates for the tobacco instruments alone, to clarify their independent contribution and improve the interpretation of the joint models. In addition, the role of pleiotropy should be addressed more clearly, including which model is considered primary. Stratified analyses by smoking status are encouraged, as prior studies indicate that BMI-HNC associations may differ between smokers and non-smokers. Finally, the comparison with previous studies should be revised, as most reported null findings without accounting for tobacco instruments. If this study finds an association, it should not be framed as a replication.

    2. Author response:

      Our response aims to address the following:

      The lack of pleiotropy is an unconfirmable assumption of MR, and the addition of those models is therefore quite important, as this is a primary weakness of the MR approach. Given that concern, I read the sensitivity analyses using pleiotropy-robust models as the main result, and in that case, they can't test their hypotheses as these models do not show a BMI instrumental variable association. The other weakness, which might be remedied, is that the power of the tests here is not described. When a hypothesis is tested with an under-powered model, the apparent lack of association could be due to inadequate sample size rather than a true null. Typically, when a statistically significant association is reported, power concerns are discounted as long as the study is not so small as to create spurious findings. That is the case with their primary BMI instrumental variable model - they find an association so we can presume it was adequately powered. But the primary models they share are not the pleiotropy-robust methods MR-Egger, weighted median, and weighted mode. The tests for these models are null, and that could mean a couple of things: (1) the original primary significant association between the BMI genetic instrument was due to pleiotropy, and they therefore don't have a robust model to explore the effects of the tobacco genetic instrument. (2) The power for the sensitivity analysis models (the pleiotropy-robust methods) is inadequate, and the authors share no discussion about the relative power of the different MR approaches. If they do have adequate power, then again, there is no need to explore the tobacco instrument.

      We would like to highlight that post-hoc power calculations are often considered redundant since the statistical power estimated for an observed association is directly related to its p-value[1]. In other words, the uncertainty of the association is already reflected in its 95% confidence interval. However, we understand power calculations may still be of interest to the reader, so we will incorporate them in the revised manuscript.
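The point that post-hoc power is a direct function of the p-value can be illustrated numerically. The sketch below assumes a two-sided z-test and takes the observed effect as the true effect; both are simplifying assumptions made here for illustration, and `observed_power` is a hypothetical helper, not code from the manuscript.

```python
from statistics import NormalDist


def observed_power(p_value: float, alpha: float = 0.05) -> float:
    """Post-hoc ("observed") power of a two-sided z-test, computed from p alone.

    Assumes the observed z-statistic equals the true standardized effect.
    """
    norm = NormalDist()
    z_obs = norm.inv_cdf(1 - p_value / 2)   # |z| implied by the p-value
    z_crit = norm.inv_cdf(1 - alpha / 2)    # rejection threshold
    # Probability that a new z ~ N(z_obs, 1) falls in either rejection region.
    return norm.cdf(z_obs - z_crit) + norm.cdf(-z_obs - z_crit)


for p in (0.001, 0.01, 0.05, 0.2, 0.5):
    print(f"p = {p}: observed power = {observed_power(p):.3f}")
```

Because the function takes only the p-value (and alpha) as input, it adds no information beyond the p-value itself; a result sitting exactly at p = alpha always has observed power of roughly 0.5.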

      The reason we use inverse variance weighted (IVW) Mendelian randomization (MR) to obtain our main results rather than the pleiotropy-robust methods mentioned by the reviewer/editors (i.e., MR-Egger, weighted median and weighted mode) is that the former has greater statistical power than the latter[2]. Hence, instead of focussing on the statistical significance of the pleiotropy-robust analyses, we consider it is of more value to compare the consistency of the effect sizes and direction of the effect estimates across methods. Any evidence of such consistency increases our confidence in our main findings, since each method relies on different assumptions. As we cannot be sure about the presence and nature of horizontal pleiotropy, it is useful to compare results across methods even though they are not equally powered. It is true that our results for the genetically predicted effects of body mass index (BMI) on the risk of head and neck cancer (HNC) differ across methods. This is precisely what led us to question the validity of our main finding (suggesting a positive effect of BMI on HNC risk). We will clarify this in the discussion section of the revised manuscript as advised.
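For readers unfamiliar with the IVW method, a minimal sketch of the fixed-effect IVW estimator from summarized data, in the form described in reference (2), is given below. The per-variant effect sizes and standard errors are made-up numbers for illustration, and `ivw_estimate` is a hypothetical helper, not the authors' analysis code.

```python
from math import sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted (IVW) causal estimate.

    Equivalent to a weighted regression of variant-outcome effects on
    variant-exposure effects through the origin, with weights 1 / se_outcome**2.
    Returns (estimate, standard_error).
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    return numerator / denominator, sqrt(1.0 / denominator)


# Hypothetical summary statistics for three genetic variants.
bx = [0.10, 0.20, 0.30]      # variant-exposure associations
by = [0.05, 0.10, 0.15]      # variant-outcome associations
se_by = [0.01, 0.02, 0.03]   # standard errors of the outcome associations

estimate, se = ivw_estimate(bx, by, se_by)
print(f"IVW estimate = {estimate:.3f}, SE = {se:.3f}")
```

IVW pools every variant into one weighted average, which is the source of its efficiency, but it is only unbiased if the variants show no unbalanced horizontal pleiotropy. This is why comparing it against estimators that rely on different assumptions is informative.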

      We understand that the reviewer/editors are concerned that we do not have a robust model to explore the role of tobacco consumption in the link between BMI and HNC. However, we have a different perspective on the matter. If indeed, the main IVW finding for BMI and HNC is due to pleiotropy (since some of the pleiotropy-robust methods suggest conflicting results), then the IVW multivariable MR method is a way to explore the potential source of this bias[3]. We were particularly interested in exploring the role of smoking in the observed association because smoking and adiposity are known to influence each other [4-9] and share a genetic basis[10, 11].

      References:

      (1) Heinsberg LW, Weeks DE: Post hoc power is not informative. Genet Epidemiol 2022, 46(7):390-394.

      (2) Burgess S, Butterworth A, Thompson SG: Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol 2013, 37(7):658-665.

      (3) Burgess S, Davey Smith G, Davies NM, Dudbridge F, Gill D, Glymour MM, Hartwig FP, Kutalik Z, Holmes MV, Minelli C et al: Guidelines for performing Mendelian randomization investigations: update for summer 2023. Wellcome Open Res 2019, 4:186.

      (4) Morris RW, Taylor AE, Fluharty ME, Bjorngaard JH, Asvold BO, Elvestad Gabrielsen M, Campbell A, Marioni R, Kumari M, Korhonen T et al: Heavier smoking may lead to a relative increase in waist circumference: evidence for a causal relationship from a Mendelian randomisation meta-analysis. The CARTA consortium. BMJ Open 2015, 5(8):e008808.

      (5) Taylor AE, Morris RW, Fluharty ME, Bjorngaard JH, Asvold BO, Gabrielsen ME, Campbell A, Marioni R, Kumari M, Hallfors J et al: Stratification by smoking status reveals an association of CHRNA5-A3-B4 genotype with body mass index in never smokers. PLoS Genet 2014, 10(12):e1004799.

      (6) Taylor AE, Richmond RC, Palviainen T, Loukola A, Wootton RE, Kaprio J, Relton CL, Davey Smith G, Munafo MR: The effect of body mass index on smoking behaviour and nicotine metabolism: a Mendelian randomization study. Hum Mol Genet 2019, 28(8):1322-1330.

      (7) Asvold BO, Bjorngaard JH, Carslake D, Gabrielsen ME, Skorpen F, Smith GD, Romundstad PR: Causal associations of tobacco smoking with cardiovascular risk factors: a Mendelian randomization analysis of the HUNT Study in Norway. Int J Epidemiol 2014, 43(5):1458-1470.

      (8) Carreras-Torres R, Johansson M, Haycock PC, Relton CL, Davey Smith G, Brennan P, Martin RM: Role of obesity in smoking behaviour: Mendelian randomisation study in UK Biobank. BMJ 2018, 361:k1767.

      (9) Freathy RM, Kazeem GR, Morris RW, Johnson PC, Paternoster L, Ebrahim S, Hattersley AT, Hill A, Hingorani AD, Holst C et al: Genetic variation at CHRNA5-CHRNA3-CHRNB4 interacts with smoking status to influence body mass index. Int J Epidemiol 2011, 40(6):1617-1628.

      (10) Thorgeirsson TE, Gudbjartsson DF, Sulem P, Besenbacher S, Styrkarsdottir U, Thorleifsson G, Walters GB, Consortium TAG, Oxford GSKC, consortium E et al: A common biological basis of obesity and nicotine addiction. Transl Psychiatry 2013, 3(10):e308.

      (11) Wills AG, Hopfer C: Phenotypic and genetic relationship between BMI and cigarette smoking in a sample of UK adults. Addict Behav 2019, 89:98-103.

    3. eLife Assessment

      The findings represent an important contribution to understanding whether BMI influences head and neck cancer (HNC) risk after accounting for tobacco use. Within the context of the Mendelian Randomization (MR) field, the strength of evidence appears convincing, supported by rigorous methods and a thorough exploration of multiple genetic models of adiposity using diverse MR approaches. Limitations include the absence of associations in sensitivity models designed to better account for pleiotropy, which prevents evaluation of whether incorporating an instrumental variable for tobacco use would alter the findings. Additionally, the lack of a formal power assessment for detecting associations with the instrumental variables employed limits the interpretability and reach of the results.

    1. 1. Be aware of ergonomic problems in the operatory.
       2. Chairside stretching is an important strategy to perform throughout the workday to prevent microtrauma and muscle imbalances. Strengthen specific stabilizing muscles (like shoulder and back). Physical therapists or neuromuscular therapists should be consulted for musculoskeletal disorders.
       3. Patient should be seated so that all his body parts are well supported. The patient’s head should always be supported by an adjustable/articulated headrest.
       4. Upright Position is the initial position of chair from which further adjustments are made. Almost Supine position is such that patient’s head, knees and feet are approximately at same level, patient is almost in a lying position. Patient’s head should not be lower than feet. In reclined 45 degree, chair is reclined at 45°, and the mandibular occlusal surfaces are almost at 45° to the floor.
       5. For better understanding, sitting positions of operator are related to a clock. In this clock concept, an imaginary circle is drawn over the dental chair, keeping the patient’s head at the center of the circle. Then the numbering to circle is given similar to a clock with the top of the circle at 12 o’clock.
       6. Accordingly the operator’s positions (right handed operator) can be 7 o’clock, 9 o’clock, 11 o’clock, and 12 o’clock and for left handed operator, it can be 5 o’clock, 3 o’clock and 1 o’clock.
       7. Stool height of assistant should be 4 to 6 inches above the dentist’s eye level.

      ① Be aware of ergonomic problems in the operatory. ① Klinik ortamındaki ergonomik sorunların farkında olun.

      ② Chairside stretching is an important strategy to perform throughout the workday to prevent microtrauma and muscle imbalances. Strengthen specific stabilizing muscles (like shoulder and back). Physical therapists, neuromuscular therapist should be consulted for musculoskeletal disorders. ② Gün boyunca yapılacak sandalye başı esneme hareketleri, mikrotravmaları ve kas dengesizliklerini önlemek için önemli bir stratejidir. Omuz ve sırt gibi belirli dengeleyici kaslar güçlendirilmelidir. Kas-iskelet sistemi rahatsızlıklarında fizyoterapistlere veya nöromüsküler terapistlere danışılmalıdır.

      ③ Patient should be seated so that all his body parts are well supported. The patient’s head should always be supported by an adjustable/articulated headrest. ③ Hastanın tüm vücut bölgeleri iyi bir şekilde desteklenecek biçimde oturtulmalıdır. Hastanın başı her zaman ayarlanabilir/eklemli bir baş desteği ile desteklenmelidir.

      ④ Upright Position is the initial position of chair from which further adjustments are made. Almost Supine position is such that patient’s head, knees and feet are approximately at same level, patient is almost in a lying position. Patient’s head should not be lower than feet. In reclined 45 degree, chair is reclined at 45°, and the mandibular occlusal surfaces are almost at 45° to the floor. ④ Dik pozisyon, koltuğun başlangıç konumudur ve buradan sonra ayarlamalar yapılır. Neredeyse sırtüstü pozisyonda hastanın başı, dizleri ve ayakları yaklaşık aynı seviyededir; hasta neredeyse yatar durumdadır. Hastanın başı, ayaklarından daha aşağıda olmamalıdır. 45 derece yatırılmış pozisyonda ise koltuk 45° eğimle yatırılmıştır ve mandibular oklüzal yüzeyler zemine yaklaşık 45° açıyla durur.

      ⑤ For better understanding, sitting positions of operator are related to a clock. In this clock concept, an imaginary circle is drawn over the dental chair, keeping the patient’s head at the center of the circle. Then the numbering to circle is given similar to a clock with the top of the circle at 12 o’clock. ⑤ Daha iyi anlaşılması için operatörün oturma pozisyonları bir saat ile ilişkilendirilir. Bu saat kavramında, hastanın başı merkeze alınarak diş üniti üzerine hayali bir daire çizilir. Dairenin üst kısmı 12 yönü olacak şekilde saat numaraları verilir.

      ⑥ Accordingly the operator’s positions (right handed operator) can be 7 o’clock, 9 o’clock, 11 o’clock, and 12 o’clock and for left handed operator, it can be 5 o’clock, 3 o’clock and 1 o’clock. ⑥ Buna göre, sağ elini kullanan bir operatör için pozisyonlar 7, 9, 11 ve 12 yönleri olabilirken; sol elini kullanan operatör için bu pozisyonlar 5, 3 ve 1 yönleri olabilir.

      ⑦ Stool height of assistant should be 4 to 6 inches above the dentist’s eye level. ⑦ Asistanın oturduğu taburenin yüksekliği, diş hekiminin göz hizasından 4 ila 6 inç daha yüksek olmalıdır.

    2. Nowadays, stools having backrest with curved extensions that provide additional body support are also available. • If seat is positioned too high, the edges will cut off supply to user’s legs resulting in more fatigue and stress. • Assistant should sit as close as possible to the back of the patient’s chair with feet directed towards the head of the chair. • Stool height of assistant should be 4 to 6 inches above the dentist’s eye level. • The assistant should sit in an erect position with feet firmly placed on the foot-support ring at the base of the assistant chair. • The instrument tray should be placed towards the head of the patient’s chair, and positioned to allow easy access to the instruments and materials. • Above positions can be adjusted according to specific needs.

      ① Nowadays, stools having backrest with curved extensions that provide additional body support are also available. ① Günümüzde, ilave vücut desteği sağlayan kıvrımlı uzantılara sahip sırt dayanaklı tabureler de mevcuttur.

      ② If seat is positioned too high, the edges will cut off supply to user’s legs resulting in more fatigue and stress. ② Eğer oturma yeri çok yüksek ayarlanırsa, kenarları bacaklardaki kan dolaşımını keserek daha fazla yorgunluk ve strese neden olur.

      ③ Assistant should sit as close as possible to the back of the patient’s chair with feet directed towards the head of the chair. ③ Asistan, hastanın sandalyesinin arkasına mümkün olduğunca yakın oturmalı ve ayakları sandalyenin baş kısmına doğru yönlendirilmelidir.

      ④ Stool height of assistant should be 4 to 6 inches above the dentist’s eye level. ④ Asistanın taburesinin yüksekliği, diş hekiminin göz seviyesinden 4 ila 6 inç daha yüksek olmalıdır.

      ⑤ The assistant should sit in an erect position with feet firmly placed on the foot-support ring at the base of the assistant chair. ⑤ Asistan, dik bir pozisyonda oturmalı ve ayakları asistan sandalyesinin tabanındaki ayak destek halkasına sağlam şekilde yerleştirilmelidir.

      ⑥ The instrument tray should be placed towards the head of the patient’s chair, and positioned to allow easy access to the instruments and materials. ⑥ Alet tepsisi, hastanın sandalyesinin baş kısmına yerleştirilmeli ve aletlere ve malzemelere kolay erişim sağlayacak şekilde konumlandırılmalıdır.

      ⑦ Above positions can be adjusted according to specific needs. ⑦ Yukarıdaki pozisyonlar, özel ihtiyaçlara göre ayarlanabilir.

    3. CHAIR AND PATIENT POSITIONS Dental chair and patient positions are an important aspect in restorative dentistry. Modern dental chairs are properly designed so as to provide total body support and comfort in any position. Patient should be seated so that all his body parts are well supported. The patient’s head should always be supported by adjustable/articulated headrest. Preferably the patient’s head should be in line with his back, whether the dental chair base is parallel or slightly at an angle to the floor. The dental chair should be designed in such a way that it should provide maximum working area to the operator. The foot switches are preferred over hand switches so as to improve infection control. And the adjustable control switches should be conveniently located. The chair height should be kept low, backrest should be upright and armrest should be adjustable while seating the patient in the dental chair. Now, the chair can be adjusted to place the patient in reclining position. Patient position can vary with operator, type of procedure and area of the oral cavity.

      ① CHAIR AND PATIENT POSITIONS Dental chair and patient positions are an important aspect in restorative dentistry. ⓵ Sandalye ve hasta pozisyonları, restoratif diş hekimliğinde önemli bir unsurdur.

      ② Modern dental chairs are properly designed so as to provide total body support and comfort in any position. ⓶ Modern diş hekimi koltukları, her pozisyonda tam vücut desteği ve konfor sağlamak üzere uygun şekilde tasarlanmıştır.

      ③ Patient should be seated so that all his body parts are well supported. ⓷ Hasta, tüm vücut bölümleri iyi desteklenecek şekilde oturtulmalıdır.

      ④ The patient’s head should always be supported by adjustable/articulated headrest. ⓸ Hastanın başı her zaman ayarlanabilir veya eklemli baş desteği ile desteklenmelidir.

      ⑤ Preferably the patient’s head should be in line with his back, whether the dental chair base is parallel or slightly at an angle to the floor. ⓹ Tercihen, hastanın başı sırtıyla aynı hizada olmalıdır; diş hekimi koltuğunun tabanı yere paralel ya da hafif açılı olsa bile.

      ⑥ The dental chair should be designed in such a way that it should provide maximum working area to the operator. ⓺ Diş ünitesi, operatöre maksimum çalışma alanı sağlayacak şekilde tasarlanmalıdır.

      ⑦ The foot switches are preferred over hand switches so as to improve infection control. ⓻ Enfeksiyon kontrolünü artırmak için ayak pedalları, el anahtarlarına göre daha fazla tercih edilmelidir.

      ⑧ And the adjustable control switches should be conveniently located. ⓼ Ayarlanabilir kontrol düğmeleri, kolay erişilebilir bir konumda yer almalıdır.

      ⑨ The chair height should be kept low, backrest should be upright and armrest should be adjustable while seating the patient in the dental chair. ⓽ Hastayı koltuğa oturturken, koltuk yüksekliği düşük, sırt desteği dik ve kol destekleri ayarlanabilir olmalıdır.

      ⑩ Now, the chair can be adjusted to place the patient in reclining position. ⓾ Bundan sonra, hasta yatırılabilir pozisyona getirilmek üzere koltuk ayarlanabilir.

      ⑪ Patient position can vary with operator, type of procedure and area of the oral cavity. ⓫ Hasta pozisyonu, operatöre, yapılan işleme ve ağız boşluğundaki bölgeye göre değişkenlik gösterebilir.

    4. The goal of overhead lighting is to produce even, shadow-free, color-corrected illumination that is concentrated on the operating field. In general, the intensity ratio between task lighting (the dental operating light) and ambient room lighting should be no greater than 3 to 1.6. Furthermore, the light source should be in the patient’s mid-sagittal plane; directly above and slightly behind the patient’s oral cavity, and 5° toward the head of the operator in the 12 o’clock position.

      ① The goal of overhead lighting is to produce even, shadow-free, color-corrected illumination that is concentrated on the operating field. ⓵ Üstten aydınlatmanın amacı, operasyon alanına odaklanmış, eşit, gölgesiz ve renk düzeltmeli aydınlatma sağlamaktır.

      ② In general, the intensity ratio between task lighting (the dental operating light) and ambient room lighting should be no greater than 3 to 1.6. ⓶ Genel olarak, görev aydınlatması (diş operasyon ışığı) ile ortam ışığı arasındaki yoğunluk oranı 3’e 1.6’dan fazla olmamalıdır.

      ③ Furthermore, the light source should be in the patient’s mid-sagittal plane; directly above and slightly behind the patient’s oral cavity, and 5° toward the head of the operator in the 12 o’clock position. ⓷ Ayrıca, ışık kaynağı hastanın orta-sagital düzleminde olmalı; hastanın ağız boşluğunun tam üstünde ve hafifçe arkasında, operatörün başına doğru 12 saat pozisyonunda 5° açıyla konumlandırılmalıdır.

    5. Various delivery systems have advantages and disadvantages. When working in four-handed dentistry the dentist maintains a position around the operating field with limited hand, arm and body movement, and should best confine eye focus to the working field. Additionally, the dental equipment and instruments should be centered on the dental assistant. From an ergonomic viewpoint, over-the-head and over-the-patient delivery systems better allow the dental assistant to access the handpieces for bur changes or other operations.

      ① Various delivery systems have advantages and disadvantages. ⓵ Çeşitli teslimat sistemlerinin avantajları ve dezavantajları vardır.

      ② When working in four-handed dentistry the dentist maintains a position around the operating field with limited hand, arm and body movement, and should best confine eye focus to the working field. ⓶ Dört elli diş hekimliği çalışılırken, diş hekimi ellerini, kollarını ve vücudunu sınırlı hareket ettirerek operasyona yakın bir pozisyonda kalmalı ve gözlerini en iyi şekilde çalışma alanına odaklamalıdır.

      ③ Additionally, the dental equipment and instruments should be centered on the dental assistant. ⓷ Ayrıca, diş ekipmanları ve aletleri diş asistanının erişiminde ve merkezinde olmalıdır.

      ④ From an ergonomic viewpoint, over-the-head and over-the-patient delivery systems better allow the dental assistant to access the handpieces for bur changes or other operations. ⓸ Ergonomik açıdan bakıldığında, baş üstü ve hasta üstü teslimat sistemleri, diş asistanının bur değiştirme veya diğer işlemler için handpiece'lere daha kolay erişmesini sağlar.

    Annotators

    1. Author response:

      The following is the authors’ response to the current reviews.

      We wanted to clarify Reviewer #1’s latest comment in the last round of review, “Furthermore, the referee appreciates that the authors have echoed the concern regarding the limited statistical robustness of the observed scrambling events.” We appreciate the follow-up information provided by Reviewer #1 that their comment concerns specifically the low-count alternative-pathway events that we observe at the dimer interface, and not the statistics of the manuscript overall, as they believe that “the study presents a statistically rigorous analysis of lipid scrambling events across multiple structures and conformations (Reviewer #1)”. We agree with the Reviewer and note that, overall, our coarse-grained study represents the most comprehensive single study of the entire TMEM16 family to date.


      The following is the authors’ response to the original reviews.

      Public Review:

      Reviewer #1 (Public review):

      Summary:

      The manuscript investigates lipid scrambling mechanisms across TMEM16 family members using coarse-grained molecular dynamics (MD) simulations. While the study presents a statistically rigorous analysis of lipid scrambling events across multiple structures and conformations, several critical issues undermine its novelty, impact, and alignment with experimental observations.

      Critical issues:

      (1) Lack of Novelty:

      The phenomenon of lipid scrambling via an open hydrophilic groove is already well-established in the literature, including through atomistic MD simulations. The authors themselves acknowledge this fact in their introduction and discussion. By employing coarse-grained simulations, the study essentially reiterates previously known findings with limited additional mechanistic insight. The repeated observation of scrambling occurring predominantly via the groove does not offer significant advancement beyond prior work.

      We agree with the reviewer’s statement regarding the lack of novelty when it comes to our observations of scrambling in the groove of open Ca2+-bound TMEM16 structures. However, we feel that the inclusion of closed structures in this study, which attempts to address the as yet unanswered question of how scrambling by TMEM16s occurs in the absence of Ca2+, offers new observations for the field. In our study we specifically address to what extent the induced membrane deformation, which has been theorized to help lipids cross the bilayer especially in the absence of Ca2+, contributes to the rate of scrambling (see references 36, 59, and 66). There are also several TMEM16F structures solved under activating conditions (bound to Ca2+ and in the presence of PIP2) which feature structural rearrangements to TM6 that may be indicative of an open state (PDB 6P48) and had not been tested in simulations. We show that these structures do not scramble and thereby present evidence against an out-of-the-groove scrambling mechanism for these states. Although we find a handful of examples of lipids being scrambled by Ca2+-free structures of TMEM16 scramblases, none of our simulations suggest that these events are related to the degree of deformation.

      (2) Redundancy Across Systems:

      The manuscript explores multiple TMEM16 family members in activating and non-activating conformations, but the conclusions remain largely confirmatory. The extensive dataset generated through coarse-grained MD simulations primarily reinforces established mechanistic models rather than uncovering fundamentally new insights. The effort, while statistically robust, feels excessive given the incremental nature of the findings.

      Again, we agree with the reviewer’s statement that our results largely confirm those published by other groups and our own. We think, however, that there is value in comparing the scrambling competence of these TMEM16 structures in a consistent manner in a single study, to reduce inconsistencies that may be introduced by different simulation methods, parameters, and environmental variables such as lipid composition, as used in other published works on single family members. The consistency across our simulations and the high number of observed scrambling events have allowed us to confirm that the mechanism of scrambling is shared by multiple family members and relies most obviously on groove dilation.
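
To make the notion of "observed scrambling events" concrete, the following is a minimal sketch of how leaflet flips could be counted from lipid headgroup heights. It is not the authors' analysis protocol: the function names, the `z_cut` threshold, and the convention that the bilayer midplane sits at z = 0 are all assumptions made for illustration.

```python
def count_scrambling_events(z_traj, z_cut=1.0):
    """Count leaflet flips for one lipid from its headgroup z-coordinate (nm).

    Assumption: the bilayer midplane is at z = 0. A lipid is assigned to the
    upper leaflet when z > +z_cut and to the lower leaflet when z < -z_cut.
    Frames in between are skipped, so brief excursions into the membrane core
    that return to the same leaflet are not counted as events.
    """
    events = 0
    leaflet = None
    for z in z_traj:
        if z > z_cut:
            current = "upper"
        elif z < -z_cut:
            current = "lower"
        else:
            continue  # undecided frame: keep the previous assignment
        if leaflet is not None and current != leaflet:
            events += 1
        leaflet = current
    return events


def scrambling_rate(all_z_trajs, sim_time_us, z_cut=1.0):
    """Total scrambling events per microsecond, summed over all lipids."""
    total = sum(count_scrambling_events(z, z_cut) for z in all_z_trajs)
    return total / sim_time_us
```

With a hysteresis threshold of this kind, the count is insensitive to lipids that dip into the core and return, which is one reason a consistent definition across all structures matters when rates are compared.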

      (3) Discrepancy with Experimental Observations:

      The use of coarse-grained simulations introduces inherent limitations in accurately representing lipid scrambling dynamics at the atomistic level. Experimental studies have highlighted nuances in lipid permeation that are not fully captured by coarse-grained models. This discrepancy raises questions about the biological relevance of the reported scrambling events, especially those occurring outside the canonical groove.

      We thank the reviewer for bringing up the possible inaccuracies introduced by coarse graining our simulations. This is also a concern for us, and we address this issue extensively in our discussion. As the reviewer pointed out above, our CG simulations have largely confirmed existing evidence in the field which we think speaks well to the transferability of observations from atomistic simulations to the coarse-grained level of detail. We have made both qualitative and quantitative comparisons between atomistic and coarse-grained simulations of nhTMEM16 and TMEM16F (Figure 1, Figure 4-figure supplement 1, Figure 4-figure supplement 5) showing the two methods give similar answers for where lipids interact with the protein, including outside of the canonical groove. We do not dispute the possible discrepancy between our simulations and experiment, but our goal is to share new nuanced ideas for the predicted TMEM16 scrambling mechanism that we hope will be tested by future experimental studies.

      (4) Alternative Scrambling Sites:

      The manuscript reports scrambling events at the dimer-dimer interface as a novel mechanism. While this observation is intriguing, it is not explored in sufficient detail to establish its functional significance. Furthermore, the low frequency of these events (relative to groove-mediated scrambling) suggests they may be artifacts of the simulation model rather than biologically meaningful pathways.

      We agree with the reviewer that our observed number of scrambling events in the dimer interface is too low to present it as strong evidence for it being the alternative mechanism for Ca2+-independent scrambling. This will require additional experiments and computational studies which we plan to do in future research. However, we are less certain that these are artifacts of the coarse-grained simulation system as we observed a similar event in an atomistic simulation of TMEM16F.

      Conclusion:

      Overall, while the study is technically sound and presents a large dataset of lipid scrambling events across multiple TMEM16 structures, it falls short in terms of novelty and mechanistic advancement. The findings are largely confirmatory and do not bridge the gap between coarse-grained simulations and experimental observations. Future efforts should focus on resolving these limitations, possibly through atomistic simulations or experimental validation of the alternative scrambling pathways.

      Reviewer #2 (Public review):

      Summary:

      Stephens et al. present a comprehensive study of TMEM16-members via coarse-grained MD simulations (CGMD). They particularly focus on the scramblase ability of these proteins and aim to characterize the "energetics of scrambling". Through their simulations, the authors interestingly relate protein conformational states to the membrane's thickness and link those to the scrambling ability of TMEM members, measured as the trespassing tendency of lipids across leaflets. They validate their simulation with a direct qualitative comparison with Cryo-EM maps.

      Strengths:

      The study demonstrates an efficient use of CGMD simulations to explore lipid scrambling across various TMEM16 family members. By leveraging this approach, the authors are able to bypass some of the sampling limitations inherent in all-atom simulations, providing a more comprehensive and high-throughput analysis of lipid scrambling. Their comparison of different protein conformations, including open and closed groove states, presents a detailed exploration of how structural features influence scrambling activity, adding significant value to the field. A key contribution of this study is the finding that groove dilation plays a central role in lipid scrambling. The authors observe that for scrambling-competent TMEM16 structures, there is substantial membrane thinning and groove widening. The open Ca2+-bound nhTMEM16 structure (PDB ID 4WIS) was identified as the fastest scrambler in their simulations, with scrambling rates as high as 24.4 ± 5.2 events per μs. This structure also shows significant membrane thinning (up to 18 Å), which supports the hypothesis that groove dilation lowers the energetic barrier for lipid translocation, facilitating scrambling.

      The study also establishes a correlation between structural features and scrambling competence, though analyses often lack statistical robustness and quantitative comparisons. The simulations differentiate between open and closed conformations of TMEM16 structures, with open-groove structures exhibiting increased scrambling activity, while closed-groove structures do not. This finding aligns with previous research suggesting that the structural dynamics of the groove are critical for scrambling. Furthermore, the authors explore how the physical dimensions of the groove qualitatively correlate with observed scrambling rates. For example, TMEM16K induces increased membrane thinning in its open form, suggesting that membrane properties, along with structural features, play a role in modulating scrambling activity.

      Another significant finding is the concept of "out-of-the-groove" scrambling, where lipid translocation occurs outside the protein's groove. This observation introduces the possibility of alternate scrambling mechanisms that do not follow the traditional "credit-card model" of groove-mediated lipid scrambling. In their simulations, the authors note that these out-of-the-groove events predominantly occur at the dimer interface between TM3 and TM10, especially in mammalian TMEM16 structures. While these events were not observed in fungal TMEM16s, they may provide insight into Ca2+-independent scrambling mechanisms, as they do not require groove opening.

      Weaknesses:

      A significant challenge of the study is the discrepancy between the scrambling rates observed in CGMD simulations and those reported experimentally. Despite the authors' claim that the rates are in line with experiment, the observed differences can mean large energetic discrepancies in describing scrambling (larger than 1 kT barrier in reality). For instance, the authors report scrambling rates of 10.7 events per μs for TMEM16F and 24.4 events per μs for nhTMEM16, which are several orders of magnitude faster than experimental rates. While the authors suggest that this discrepancy could be due to the Martini 3 force field's faster diffusion dynamics, this explanation does not fully account for the large difference in rates. A more thorough discussion on how the choice of force field and simulation parameters influence the results, and how these discrepancies can be reconciled with experimental data, would strengthen the conclusions. Likewise, rate calculations in the study are based on 10 μs simulations, while experimental scrambling rates occur over seconds. This timescale discrepancy limits the study's accuracy, as the simulations may not capture rare or slow scrambling events that are observed experimentally and therefore might underestimate the kinetics of scrambling. It is, however, important to recognize that it is hard (borderline unachievable) to pinpoint reasonable kinetics for systems like this using the currently available computational power and force field accuracy. The faster diffusion in simulations may lead to overestimated scrambling rates, making the simulation results less comparable to real-world observations. I would therefore read the findings qualitatively rather than quantitatively. An interesting observation is the asymmetry observed in the scrambling rates of the two monomers.
      Since MARTINI is known to be limited in correctly sampling protein dynamics, the authors - in order to preserve the fold - have applied a strong (500 kJ mol⁻¹ nm⁻²) elastic network. However, I am wondering how the ENM applies across the dimer and if any asymmetry can be noticed in the application of restraints for each monomer and at the dimer interface. How can this have potentially biased the asymmetry in the scrambling rates observed between the monomers? Is this artificially obtained from restraining the initial structure, or is the asymmetry somehow gatekeeping the scrambling mechanism to occur majorly across a single monomer? Answering this question would have far-reaching implications to better describe the mechanism of scrambling.

      The main aim of our computational survey was to directly compare all relevant published TMEM16 structures in both open and closed states using the Martini 3 CGMD force field. Our standardized simulation and analysis protocol allowed us to quantitatively compare scrambling rates across the TMEM16 family, something that has never been done before. We do acknowledge that direct comparison between simulated and experimental scrambling rates is complicated and is best interpreted qualitatively. In line with other reports (e.g., Li et al, PNAS 2024), lipid scrambling in CGMD is 2-3 orders of magnitude faster than typical experimental findings. In the CG simulation field, these increased dynamics due to the smoother energy landscape are a well-known phenomenon. In our view, this is a valuable trade-off for being able to capture statistically robust scrambling dynamics and gain mechanistic understanding in the first place, since these are currently challenging to obtain otherwise. For example, with all-atom MD it would have been near-impossible to conclude that groove openness and high scrambling rates are closely related, simply because one would only measure a handful of scrambling events in (at most) a handful of structures.
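
As a rough aid for reading the "2-3 orders of magnitude" figure against the reviewer's remark about barriers larger than 1 kT, the sketch below converts a rate ratio into an apparent barrier difference. It assumes an Arrhenius-like relation, k ∝ exp(-ΔG/kT), with an unchanged prefactor, and it attributes the whole speed-up to the barrier, which is a simplification: faster diffusion on the smoother coarse-grained landscape also changes the prefactor. The function name is hypothetical.

```python
import math


def barrier_shift_kT(rate_ratio):
    """Apparent barrier difference, in units of kT, implied by a rate ratio.

    Assumes k ~ exp(-dG / kT) with the same prefactor for both rates, so that
    ddG / kT = ln(k_fast / k_slow).
    """
    return math.log(rate_ratio)


for orders in (2, 3):
    print(orders, round(barrier_shift_kT(10 ** orders), 1))
# 2 4.6
# 3 6.9
```

Under these assumptions, a speed-up of two to three orders of magnitude is equivalent to roughly 4.6 to 6.9 kT, which is consistent with treating the simulated rates as a relative ranking rather than as absolute kinetics.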

      Considering the elastic network: the reviewer is correct in that the elastic network restrains the overall structure to the experimental conformation. This is necessary because the Martini 3 force field does not accurately model changes in secondary (and tertiary) structure. In fact, by retaining the structural information from the experimental structures, we argue that the elastic network helped us arrive at the conclusion that groove openness is the major contributing factor in determining a protein’s scrambling rate. This is best exemplified by the asymmetric X-ray structure of TMEM16K (5OC9), in which the groove of one subunit is more dilated than the other. In our simulation, this information was stored in the elastic network, yielding a 4x higher rate in the open groove than in the closed groove, within the same trajectory.
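
The idea that structural information is "stored in the elastic network" can be illustrated with a minimal sketch of a harmonic elastic network built from a reference structure. Only the 500 kJ mol⁻¹ nm⁻² force constant comes from the text above; the distance cutoffs, the function names, and the choice to pair every bead with every other bead are illustrative assumptions. In particular, the sketch takes no position on whether the authors' network spans the dimer interface.

```python
import math
from itertools import combinations

FORCE_CONSTANT = 500.0  # kJ mol^-1 nm^-2, the value quoted by the reviewer


def build_elastic_network(ref_coords, lower=0.5, upper=0.9):
    """Return (i, j, r0) for bead pairs whose reference distance lies in
    [lower, upper] nm. The cutoffs are assumptions, not the authors' settings."""
    bonds = []
    for i, j in combinations(range(len(ref_coords)), 2):
        r0 = math.dist(ref_coords[i], ref_coords[j])
        if lower <= r0 <= upper:
            bonds.append((i, j, r0))
    return bonds


def elastic_energy(coords, bonds, k=FORCE_CONSTANT):
    """Harmonic restraint energy in kJ/mol: sum of 0.5 * k * (r - r0)^2."""
    return sum(
        0.5 * k * (math.dist(coords[i], coords[j]) - r0) ** 2
        for i, j, r0 in bonds
    )
```

Because each equilibrium distance r0 is taken from the input structure, an asymmetric dimer yields a different set of restraints for each subunit, and any deviation from the experimental geometry is penalized quadratically.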

      Notably, the manuscript does not explore the impact of membrane composition on scrambling rates. While the authors use a specific lipid composition (DOPC) in their simulations, they acknowledge that membrane composition can influence scrambling activity. However, the study does not explore how different lipids or membrane environments or varying membrane curvature and tension, could alter scrambling behaviour. I appreciate that this might have been beyond the scope of this particular paper and the authors plan to further chase these questions, as this work sets a strong protocol for this study. Contextualizing scrambling in the context of membrane composition is particularly relevant since the authors note that TMEM16K's scrambling rate increases tenfold in thinner membranes, suggesting that lipid-specific or membrane-thickness-dependent effects could play a role.

      Considering different membrane compositions: for this study, we chose to keep the membranes as simple as possible. We opted for pure DOPC membranes because DOPC (1) has negligible intrinsic curvature, (2) forms fluid membranes, and (3) was used previously by others (Li et al, PNAS 2024). As mentioned by the reviewer, we believe our current study defines a good, standardized protocol and solid baseline for future efforts looking into the additional effects of membrane composition, tension, and curvature that could all affect TMEM16-mediated lipid scrambling.

      Reviewer #3 (Public review):

      Strengths:

      The strength of this study emerges from a comparative analysis of multiple structural starting points and understanding global/local motions of the protein with respect to lipid movement. Although the protein is well-studied, both experimentally and computationally, the understanding of conformational events in different family members, especially membrane thickness less compared to fungal scramblases offers good insights.

      We appreciate the reviewer recognizing the value of the comparative study. In addition to valuable insights from previous experimental and computational work, we hope to put forward a unifying framework that highlights various TMEM16 structural features and membrane properties that underlie scrambling function.

      Weaknesses:

      The weakness of the work is that it does not fully reconcile with experimental evidence of Ca²⁺-independent scrambling rates observed in prior studies, although this is also challenging with coarse-grained molecular simulations. Previous reports have identified lipid crossing, packing defects, and other associated events, so it is difficult to place this paper in that context. However, the absence of validation leaves certain claims, like alternative scrambling pathways, speculative.

      Answer: It is generally difficult to quantitatively compare bulk measurements of scrambling phenomena with simulation results. The advantage of simulations is to directly observe the transient scrambling events at a spatial and temporal resolution that is currently unattainable for experiments. The current experimental evidence for the precise mechanism of Ca2+-independent scrambling is still under debate. We therefore hope to leverage the strength of MD and statistical rigor of coarse-grained simulations to generate testable hypotheses for further structural, biochemical, and computational studies.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The findings are largely confirmatory and do not bridge the gap between coarse-grained simulations and experimental observations. Future efforts should focus on resolving these limitations, possibly through atomistic simulations or experimental validation of the alternative scrambling pathways.

      While we agree with what the reviewer may be hinting at regarding limitations of coarse-grained MD simulations, we believe that our study holds much more merit than this comment suggests. We have provided something that has yet to be done in the field: a comprehensive study that directly compares the scrambling rates of multiple TMEM16 family members in different conformations using identical simulation conditions. Our work clearly shows that a sufficiently dilated groove is the major structural feature that enables robust scrambling for all TMEM16 scramblase members with solved structures. While all TMEM16s cause significant distortion and thinning of the membrane, we assert that the extreme thinning observed around open grooves is significantly enhanced by the lipid scrambling itself as the two leaflets merge through lipid exchange. We saw no evidence that membrane thinning/distortion alone, in the absence of an open groove, could support scrambling at the rates observed under activating conditions or even the low rates observed in Ca2+-independent scrambling. Moreover, our handful of observations of scrambling events outside of the groove, which has not yet been reported in any study, opens an exciting new direction for studying alternative scrambling mechanisms. That said, we are currently following up on many of the observations reported here such as: scrambling events outside the groove, the kinetics of scrambling, the possibility that lipids line the groove of non-scramblers like TMEM16A, etc. This is being done experimentally with our collaborators through site-directed mutagenesis and with all-atom MD in our lab. Unfortunately, it is well beyond the scope of the current study to include all of this in the current paper.

      Reviewer #2 (Recommendations for the authors):

      Major comments and questions:

      (1) Line 214 and Figure 1- Figure Supplement 1: why have you only compared the final frame of the trajectory to the cryo-EM structure? Even if these comparisons are qualitative, they should be representative of the entire trajectory, not a single frame.

      We thank the reviewer for this suggestion and have replaced the single-frame snapshots in Figure 1-figure supplement 1 with ensemble-averaged headgroup densities. The overall agreement between membrane shapes in CGMD and cryo-EM was not affected by this change.
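
The difference between a single-frame snapshot and an ensemble-averaged density can be sketched as follows. This is an illustrative 2D histogram over hypothetical headgroup (x, z) positions, not the authors' pipeline; the function name, bin width, and grid size are assumptions.

```python
def headgroup_density(frames, bin_width=0.5, n_bins=4):
    """Average per-frame 2D histograms of headgroup (x, z) positions.

    frames: list of frames, each a list of (x, z) positions in nm.
    Positions outside [0, n_bins * bin_width) in either axis are ignored.
    Returns an n_bins x n_bins grid of mean counts per frame, indexed
    as grid[x_bin][z_bin].
    """
    if not frames:
        raise ValueError("at least one frame is required")
    grid = [[0.0] * n_bins for _ in range(n_bins)]
    for frame in frames:
        for x, z in frame:
            ix, iz = int(x // bin_width), int(z // bin_width)
            if 0 <= ix < n_bins and 0 <= iz < n_bins:
                grid[ix][iz] += 1.0
    n_frames = len(frames)
    return [[count / n_frames for count in row] for row in grid]
```

Passing only the last frame reproduces the snapshot-style comparison the reviewer questioned, while passing the full trajectory gives a map in which rarely visited positions are weighted by how often they actually occur.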

      (2) Lines 228-231: You comment 'Residues in this site on nhTMEM16 and TMEMF also seem to play a role in scrambling but the mechanism by which they do so is unclear.' This is something you could attempt to quantify in the simulations by calculating the correlation between scrambling and protein-membrane interactions/contacts in this site. Can you speculate on a mechanism that might be a contributing factor?

      We probed the correlation between these residues and scrambling lipids, as suggested by the reviewer, and interestingly not all scrambling lipids interact with these residues. Yet there is strong lipid density in this vicinity (see insets in Figure 1 and Figure 4-figure supplement 2). These observations lead us to suspect these residues impact scrambling indirectly through influencing the conformation of the protein or flexibility and shape of the membrane. This interpretation fits with mutagenesis studies highlighting a role for these residues in scrambling (see refs 59, 62, and 67). Specifically, Falzone et al. 2022 (ref 59) suggested that they may thin the membrane near the groove, but this has not been tested via structure determination and a detailed model of how they impact scrambling is missing. We could address this question with in silico mutations; however, CG simulation is not an appropriate method to study large scale protein dynamics, and AA simulations are likely best, but beyond the scope of this paper.
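
One simple way to express the statement that "not all scrambling lipids interact with these residues" is the fraction of scrambling lipids that ever contact the site. The sketch below assumes a precomputed contact map; the function name, the lipid identifiers, and the residue numbers in the usage example are hypothetical and are not taken from the manuscript.

```python
def contact_fraction(scrambling_lipids, contacts, site_residues):
    """Fraction of scrambling lipids that contact any residue of the site.

    contacts: dict mapping lipid id -> set of residue ids that the lipid
    touched at any frame (how a contact is defined is left to the caller).
    """
    if not scrambling_lipids:
        return 0.0
    site = set(site_residues)
    hits = sum(
        1 for lipid in scrambling_lipids if contacts.get(lipid, set()) & site
    )
    return hits / len(scrambling_lipids)


# Hypothetical example: four scrambling lipids, two of which touch the site.
example_contacts = {1: {10, 11}, 2: {50}, 3: {11}}
print(contact_fraction([1, 2, 3, 4], example_contacts, [10, 11]))  # 0.5
```

A fraction well below 1 would be consistent with the indirect role proposed above, although it cannot by itself distinguish an effect on protein conformation from an effect on membrane shape.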

      (3) Lines 240-245 and Figure 1B: This section discusses the coupling between membrane distortions and the sinusoidal curve around the protein, however, Figure 1B only shows snapshots of the membrane distortions. Is it possible to understand how these two collective variables are correlated quantitatively (as opposed to the current qualitative analysis)?

      We believe that it may be possible to quantitatively capture these two key features of the membrane, as we did previously with nhTMEM16 using our continuum elasticity-based model of the membrane (Bethel and Grabe 2016). Our model agreed with all atom MD surfaces to within ~1 Å, hence showing good quantitative agreement throughout the entire membrane. However, we doubt that we could distill the essence of our model down to a simple functional relationship between the sinusoidal wave and pinching, which we think the reviewer is asking. Rather, we believe that the large-scale sinusoidal distortion (collective variable 1) and pinching/distortion (collective variable 2) near the groove arise from the interplay of the specific protein surface chemistry for each protein (patterning of polar and non-polar residues) and the membrane. This is why we chose to simply report the distinct patterns that the family members impose on the surrounding membrane, which we think is fascinating. Specifically, Fig. 1B shows that different TMEM16 family members distort the membrane in different ways. Most notably, fungal TMEM16s feature a more pronounced sinusoidal deformation, whereas the mammalian members primarily produce local pinching. Then, in Fig. 3A we show that the thinning at the groove happens in all structures and is more pronounced in open, scrambling-competent conformations. In other words, proteins can show very strong thinning (e.g. TMEM16K, 5OC9) even though the membrane generally remains flat.

      (4) Lines 257-258: Authors comment that TMEM16A lacks scramblase activity yet can achieve a fully lipid-lined groove (note the typo - should be lipid-lined, not lipid-line). Is a fully lipid-lined groove a prerequisite for scramblase activity? Are lipid-lined grooves the only requirement for scramblase activity? Could the authors clarify exactly what the prerequisite for scramblase activity is to avoid any confusion; this will be useful for later descriptions (i.e. line 295) where scrambling competence is again referred to. Additionally, the associated figure panel (Figure 1D) shows a snapshot of this finding but lacks any statistical quantifications - is a fully lipid-lined groove a single event? Perhaps the additional analyses, such as the groove-lipid contacts, may be useful here.

      The definition of lipid scrambling is that a lipid fully transitions from one membrane leaflet to the other. While a single lipid could transition through the groove on its own, it is well documented in both atomistic and CGMD simulations that lipid scrambling typically happens through a lipid-lined groove, as shown in Fig. 1A-B. The lipids tend to form strong choline-to-phosphate interactions with nearest neighbors that make this energetically favorable. That said, lipid-lined grooves are not sufficient for robust scrambling, which is what we show in Fig. 1D, where the non-scrambler TMEM16A did in fact feature a lipid-lined groove. As suggested, we performed contact analysis and found that residue K645 on TM6 in the middle of the groove contacts lipids in 9.2% of the simulation frames.

      To get a better understanding of how populated the TM4-TM6 pathway is with lipids across all simulated structures, we determined for every simulation frame how many headgroup beads resided in the groove. This indicates that the ion-conductive state of TMEM16A (5OYB*, Fig. 1D) only had 1 lipid in the pathway, on average, meaning that the configuration shown in Fig. 1D is indeed exceptional. As a reference, our strongest scrambler, nhTMEM16 (4WIS), had an average of 2.8 lipids in the groove. We added a table containing the means and standard deviations that resulted from this analysis as Figure 1-Table supplement 1.
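
      The per-frame groove occupancy described above can be sketched as follows. This is a minimal illustration and not the analysis code used for the manuscript: the function name `groove_occupancy`, the axis-aligned box used to delimit the TM4-TM6 pathway, and the toy coordinates are all assumptions made for the example.

```python
import numpy as np


def groove_occupancy(headgroup_xyz, box_min, box_max):
    """Count headgroup beads inside a box for every frame.

    headgroup_xyz: array of shape (n_frames, n_beads, 3), coordinates in Å.
    box_min, box_max: length-3 corners of a hypothetical box around the groove.
    Returns (counts per frame, mean count, standard deviation of the count).
    """
    xyz = np.asarray(headgroup_xyz, dtype=float)
    lo = np.asarray(box_min, dtype=float)
    hi = np.asarray(box_max, dtype=float)
    # A bead is "in the groove" when all three coordinates fall inside the box.
    inside = np.all((xyz >= lo) & (xyz <= hi), axis=-1)  # (n_frames, n_beads)
    counts = inside.sum(axis=1)
    return counts, float(counts.mean()), float(counts.std())


# Toy trajectory: 2 frames, 3 headgroup beads each (illustrative values only).
toy_xyz = [
    [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [10.0, 10.0, 10.0]],
    [[5.0, 5.0, 5.0], [0.0, 0.0, 0.0], [9.0, 9.0, 9.0]],
]
counts, mean_count, std_count = groove_occupancy(toy_xyz, [-1, -1, -1], [2, 2, 2])
print(counts, mean_count, std_count)
```

      In practice the groove region would more likely be defined relative to the TM4 and TM6 backbone than by a fixed box; the box is used here only to keep the sketch self-contained.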

      (5) Lines 295-298 : The scrambling rates of the Ca²⁺-bound and Ca²⁺-free structures fall within overlapping error margins, it becomes difficult to definitively state that Ca²⁺ binding significantly enhances scrambling activity. This undermines the claim that the Ca²⁺-bound structure is the strongest scrambler. The authors should conduct statistical analyses to determine if the difference between the two conditions is statistically significant.

      In contrast to the reviewer’s comment, we do not claim that Ca2+-binding itself enhances lipid scrambling. Instead, what we show is that WT structures that are solved in an open conformation (all of which are Ca2+-bound, except 6QM6) are robust scramblers. For nhTMEM16, we did not observe any scrambling events for the closed-groove proteins, making further statistical analysis redundant.

      (6) The authors claim that the scrambling rates derived from their MD simulations are in "excellent agreement" with experimental findings (lines 294-295), despite significant discrepancy between simulated and experimentally measured rates. For example, the simulated rate of 24.4 ± 5.2 events/µs for the open, Ca²⁺-bound fungal nhTMEM16 (PDB ID 4WIS) corresponds to approximately 24 million events per second, which is vastly higher than experimental rates. Experimental studies have reported scrambling rate constants of ~0.003 s⁻¹ for TMEM16 family members in the absence of Ca²⁺, measured under physiological conditions (https://doi.org/10.1038/s41467-019-11753-1). Even with Ca²⁺ activation, scrambling rates remain several orders of magnitude lower than the rates observed in simulations. Moreover, this highlights a larger problem: lipid scrambling rates occur over timescales that are not captured by these simulations. While the authors allude to these discrepancies (lines 605-606), they should be emphasised in the text, as opposed to the table caption. These should also be traced back to differences between the membrane compositions of different studies.

      We agree with the spirit of the reviewer’s comment, and because of that, we were very careful not to claim that we reproduce experimental scrambling rates, just that the trends (scrambling-competent, or not) are correct. On lines 294-295, we actually said that the scrambling rates in our simulations excellently agree with “the presumed scrambling competence of each experimental structure”, which is true. 

      As explained extensively in the discussion section of our paper (and by many others), direct comparison between MD (e.g., Martini 3, but also atomistic force fields) dynamics and experimental measurements is challenging. The primary goal of our paper is to quantify and compare the scrambling capacity of different TMEM16 family members and different states, within a CGMD context.

      That said, we agree with the reviewer that we may have missed rare or long-timescale events (as is the case in any MD experiment) and added this point to the discussion.
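
      To make the scale of the gap discussed under comment (6) concrete, the unit conversion can be written out as below. This is only an order-of-magnitude sketch: it uses the two figures quoted in the reviewer's comment (24.4 events/µs simulated for Ca²⁺-bound nhTMEM16 and ~0.003 s⁻¹ measured in the absence of Ca²⁺), which refer to different conditions, so the ratio is indicative rather than a like-for-like comparison. The variable names are ours.

```python
import math

SIM_RATE_PER_US = 24.4   # simulated events per microsecond (value quoted above)
EXP_RATE_PER_S = 0.003   # experimental rate constant in s^-1 (value quoted above)
US_PER_S = 1e6           # microseconds per second

# Convert the simulated rate to events per second.
sim_rate_per_s = SIM_RATE_PER_US * US_PER_S

# Ratio between simulated and experimental rates, and its order of magnitude.
ratio = sim_rate_per_s / EXP_RATE_PER_S
orders_of_magnitude = math.log10(ratio)

print(f"simulated rate: {sim_rate_per_s:.3g} events/s")
print(f"ratio: {ratio:.3g} (about {orders_of_magnitude:.1f} orders of magnitude)")
```

      Under these assumptions the two numbers differ by roughly ten orders of magnitude, which is why the rates are treated here as indicators of scrambling competence rather than as absolute values.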

      (7) To address these discrepancies, the authors should: i) emphasize that simulated rates serve as qualitative indicators of scrambling competence rather than absolute values comparable to experimental findings and ii) discuss potential reasons for the divergence, such as simulation timescale limitations or lipid bilayer compositions that may favor scrambling and force field inaccuracies.

      Please see our answer to question 6. Within the context of our CGMD survey, we confidently call our results quantitative. However, we agree with the reviewer that comparison with experimental scrambling rates is qualitative and should be interpreted with caution. To reflect this, we rewrote the first sentence of the relevant paragraph in the discussion section.

      (8) Line 310: Can the authors provide a rationale as to why one monomer has a wider groove than the other? Perhaps a contact analysis could be useful. See the comment above about ENM.

      The simulation of Ca2+-bound TMEM16K was initiated from an asymmetric X-ray structure in which chain B features a more dilated groove than chain A (PDB 5OC9). The backbones of TM4 and TM6 in the closed groove (A) are close enough together to be directly interconnected by the elastic network. In contrast, TM4 and TM6 in the more dilated subunit (B) are not restricted by the elastic network and, as a consequence, display some “breathing” behavior (Fig. 3B and Fig. 3-Suppl. 6A), giving rise to a ~4x higher scrambling rate. We explicitly added the word “cryo-EM” and the PDB ID to the sentence to emphasize that the asymmetry stems from the original experimental structure.

      While answering this question, we also corrected a mislabeled chain identifier in Fig. 2-Suppl. 3A: it read ‘chain A’ in the original manuscript but should be ‘chain B’.

      (9) Line 312: Authors speculate that increased groove width likely accounts for increased scrambling rates. For statistical significance, authors should attempt to correlate scrambling rates and groove width over the simulation period.

      The Reviewer is referring to our description of the scrambling rates we measured for TMEM16K, where we noted that the groove with the highest scrambling rate is, on average, wider than that of the opposite subunit, which stays below 6 Å. We do not suggest that the correlation between scrambling and groove width is continuous, as the Reviewer may have interpreted from our original submission. Rather, we think it is a binary outcome: lipids cannot easily enter narrow grooves (< 6 Å), so scrambling can only occur once this threshold is reached, at which point it occurs at a near-constant rate. We showed this for 4 different family members in the original Fig. 3B, where scrambling events (black dots) were much more likely during, or right after, groove dilation to distances > 6 Å.

      (10) Line 359: Authors have plotted the minimum distance between residues TM4 and TM6 in Fig. 3A/B, claiming that a wide groove is required for scrambling. Upon closer examination, it is clear that several of these distributions overlap, reducing the statistical significance of these claims. Statistical tests (i.e. KS-tests) should be performed to determine whether the differences in distributions are significant.

      The Reviewer appears to be asking for a statistical test between the six distance distributions represented by the data in Fig. 3A for the scrambling competent structures (6QP6*, 8B8J, 6QM6, 7RXG, 4WIS, 5OC9), and we think this is being asked because it is believed that we are making a claim that the greater the distance, the greater the scrambling rate. If we have interpreted this comment correctly, we are not making this claim. Rather, we are simply stating that we only observe robust scrambling when the groove width regularly separates beyond 6 Å. The full distance distributions can now be found in Figure 3-figure supplement 6B, and we agree there is significant overlap between some of these distributions. However, the distinguishing characteristic of the 6 distributions from scrambling competent proteins is that they all access large distances, while the others do not. Notably, TMEM16F proteins (6QP6*, 8B8J) are below the 6 Å threshold on average, but they have wide standard deviations and spend well over ¼ of their time in the permissive regime (the upper error bar in the whisker plots in Fig. 3A is the 75% boundary).
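
      The threshold argument above can be illustrated with a short sketch. It is not the authors' analysis code: the function names, the `lookback` window, and the toy distance series are assumptions introduced for the example; only the 6 Å threshold and the use of the 75% boundary come from the text.

```python
import numpy as np


def permissive_fraction(distances, threshold=6.0):
    """Fraction of frames in which the TM4-TM6 distance exceeds the threshold (Å)."""
    d = np.asarray(distances, dtype=float)
    return float(np.mean(d > threshold))


def upper_quartile(distances):
    """75th percentile of the distance distribution (upper whisker boundary)."""
    return float(np.percentile(distances, 75))


def events_near_dilation(distances, event_frames, threshold=6.0, lookback=0):
    """Fraction of scrambling events that occur during, or up to `lookback`
    frames after, a frame in which the groove is wider than the threshold."""
    open_mask = np.asarray(distances, dtype=float) > threshold
    hits = [open_mask[max(0, f - lookback): f + 1].any() for f in event_frames]
    return float(np.mean(hits)) if hits else float("nan")


# Toy minimum TM4-TM6 distances per frame, in Å (illustrative values only).
toy_distances = [4.0, 5.0, 5.5, 6.5, 7.0, 8.0, 5.0, 4.5]
print(permissive_fraction(toy_distances))                     # frames above 6 Å
print(upper_quartile(toy_distances))                          # 75% boundary
print(events_near_dilation(toy_distances, [3, 6], lookback=1))
```

      A structure whose mean distance lies below 6 Å can still have an upper quartile above it, which is the situation described for the TMEM16F structures.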

      (11) Line 363-364: The authors state that all TMEM16 structures thin the membrane. Could the authors include a description of how membrane thinning is calculated, for instance, is the entire membrane considered, or is thinning calculated on a membrane patch close to the protein? Do membrane patches closer to the transmembrane protein increase or decrease thickness due to hydrophobic packing interactions? The latter question is of particular concern since Martini3 has been shown to induce local thinning of the membrane close to transmembrane helices, yielding thicknesses 2-3 Å thinner than those reported experimentally (https://doi.org/10.1016/j.cplett.2023.140436). This could be an important consideration in the authors' comparison to the bulk membrane thickness (line 364). Finally, how is the 'bulk membrane thickness' measured (i.e., from the CG simulations, from AA simulations, or from experiments)?

      Regarding the calculation of thinning and bulk membrane thickness, as described in Method “Quantification of membrane deformations”, the minimal membrane thickness, or thinning, is defined as the shortest distance between any two points from the interpolated upper and lower leaflet surfaces constructed using the glycerol beads (GL1 and GL2). Bulk membrane thickness is calculated by taking the vertical distance between the averaged glycerol surfaces at the membrane edge.
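
      The two definitions above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: the leaflet surfaces are assumed to be already interpolated and supplied as point sets, the function names are hypothetical, and the brute-force pairwise distance is used only for clarity.

```python
import numpy as np


def minimal_thickness(upper_pts, lower_pts):
    """Shortest distance (Å) between any point on the upper surface and any
    point on the lower surface. Inputs have shape (n_points, 3)."""
    up = np.asarray(upper_pts, dtype=float)
    lo = np.asarray(lower_pts, dtype=float)
    # Pairwise displacement vectors, shape (n_upper, n_lower, 3).
    diff = up[:, None, :] - lo[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).min())


def bulk_thickness(upper_edge_z, lower_edge_z):
    """Vertical distance (Å) between the averaged upper and lower surfaces,
    using only heights sampled at the membrane edge."""
    return float(np.mean(upper_edge_z) - np.mean(lower_edge_z))


# Toy surfaces: flat far from the protein, pinched near the groove.
upper = [[0.0, 0.0, 20.0], [10.0, 0.0, 12.0]]
lower = [[0.0, 0.0, -20.0], [10.0, 0.0, -3.0]]
print(minimal_thickness(upper, lower))             # thinnest point
print(bulk_thickness([20.0, 20.0], [-20.0, -20.0]))  # edge-based bulk value
```

      Because the minimal thickness is taken over all point pairs, it need not be a purely vertical distance, whereas the bulk value is vertical by construction.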

      The concern about localized membrane deformation due to force field artifacts is well-founded. However, the sinusoidal deformations shown here are much greater than the 2-3 Å Martini3 imperfections, and they extend for up to 10 Å radially away from the protein into the bulk membrane (see Figure 3-figure supplements 1-5 for more of a description). Most importantly, the sinusoidal wave patterns set up by the proteins are very similar to those described in the previous continuum calculation and all-atom MD for nhTMEM16 (https://www.pnas.org/doi/full/10.1073/pnas.1607574113).

      (12) Line 374: The authors state a 'positive correlation' between membrane thinning/groove opening and scrambling rates. To support this claim, the authors should report the correlation coefficients.

      We have removed any discussion concerning correlations between the magnitude of the scrambling rate and the degree of membrane thinning/groove opening. Rather we simply state that opening beyond a threshold distance is required for robust scrambling, as shown in our analysis in Fig. 3A.

      Concerning the relation between thinning and scrambling: Instantaneous membrane thinning is poorly defined (because it is governed by fluctuations of single lipids), and therefore difficult to correlate with the timing of individual scrambling events in a meaningful way.  Moreover, as we state later in that same section, “we argue that the extremely thin membranes are likely correlated with groove opening, rather than being an independent contributing factor to lipid scrambling”.

      (13) Line 396: It is stated that TMEM16A is not a scramblase but the simulated scrambling activity is not zero. How can you be sure that you are monitoring the correct collective variable if you are getting a false positive with respect to experiments?

      We only observe 2 scrambling events in 10 ms, which is a very small rate compared to the scrambling-competent states. A previous large-scale Martini CG simulation survey that inspired our protocol (Li et al, PNAS 2024) employed a 1 event/ms cut-off to distinguish scramblers from non-scramblers. Hence, they would have called TMEM16A a non-scrambler as well. We suspect that a false positive in this context might be an artifact of the CG force field, or it could be that TMEM16A can scramble but too slowly to be experimentally detected. Regarding the collective variable for lipid flipping, it is correct, and we know that this lipid actually flipped.

      (14) Line 402: Distance distributions for the electrostatic interactions between E633 and K645 should be included in the manuscript. This is also the case for the interactions between E843-K850 (lines 491-492).

      Our description of interactions between lipid headgroups and E633 and K645 in TMEM16A (5OYB*) is based on qualitative observations of the MD trajectory, and we highlight an example of this interaction in Figure 3-video 4. The video clearly shows that the lipid headgroups in the center of the groove orient themselves such that the phosphate bead (red) rests just above K645 (blue) and at other times the choline bead (blue) rests just below E633 (red). We do not think an additional plot with the distance distributions between lipids and these residues will add to our understanding of how lipids interact with residues in the TMEM16A pore.

      We made a similar qualitative observation for the interaction between the POPC choline and E843 and between the POPC phosphate and K850 while watching the AAMD simulation trajectory of TMEM16F (PDB ID 6QP6). Given that this was a single observation, and the same interaction does not appear in the CG simulation of the same structure (see simulation snapshots in Figure 4-figure supplement 5), we do not think additional analysis would add significantly to our understanding of which residues may stabilize lipids in the dimer interface.

      (15) Lines 450-451: 'As the groove opens, water is exposed to the membrane core and lipid headgroups insert themselves into the water-filled groove to bridge the leaflets.' Is this a qualitative observation? Could the authors report the correlation between groove dilation and the number of water permeation events?

      Yes, this is qualitative, and it sketches the order of events during scrambling, and we revised the main text starting at line 450 to indicate this. As illustrated by the density isosurfaces in Appendix 1-Figure 2A, the amount of water found in the closed versus open grooves is striking – there is a significant flood of water that connects the upper and lower solutions upon groove opening. Moreover, Appendix 1-Figure 2B shows much greater water permeation for open structures (4WIS, 7RXG, 5OC9, 8B8J, …) compared to closed structures (6QMB, 6QMA, 8B8Q, and many of the non-labeled data in the figure that all have closed grooves and near 0 water permeation). A notable exception is TMEM16A (7ZK3*8), which has water permeation but a closed groove and little-to-no lipid scrambling.

      Minor Comments:

      (1) Inconsistent use of '10' and 'ten' throughout.

      We would like to kindly point out that we did not find examples of inconsistent use.

      (2) Line 32: 'TM6 along with 3, 4 and 5...' should be 'TM6 along with TM3, TM4 and TM5...'. Same in line 142. Naming should stay consistent.

      Changes are reflected in the updated manuscript.

      (3) Line 141: do you mean traverse (i.e. to travel across)? Or transverse (i.e. to extend across the membrane)?

      This is a typo. We meant “traverse”. Thanks for pointing it out.

      (4) Line 142: 'greasy' should be 'strongly hydrophobic'.

      Changes are reflected in the updated manuscript.

      (5) Line 143-144: "credit card mechanism" requires quotation marks.

      Changes are reflected in the updated manuscript.

      (6) Line 144: state if Nectria haematococca is mammalian or fungal, this is not obvious for all readers.

      Changes are reflected in the updated manuscript.

      (7) Line 147-148: Is TMEM16A/TMEM16K fungal or mammalian? What was the residue before the mutation and which residue is mutated? Perhaps the nomenclature should read as TMEM16X10Y where X=the residue prior to the mutation, 10 is a placeholder for the residue number that is mutated and Y=the new residue following mutation.

      “TMEM16” is the protein family. “A” denotes the specific homolog rather than residue.  

      (8) Lines 157-158: same as 10, it is unclear if these are fungal or mammalian.

      Clarifications added.

      (9) Line 184: "...CGMD simulation" should be "...CGMD simulations".

      Changes made.

      (10) Line 191-192: It would help to create a table of all of the mutants (including if they are mammalian or fungal) summarizing the salt concentrations, lipid and detergent environments, the presence of modulators/activators, etc.

      We added this information to Appendix 1-Table 1 in the supplemental information. We did not specify NaCl concentrations, because all experimental procedures used standard physiological values (100-150 mM).

      (11) Line 210: inconsistencies with 'CG' and 'coarse-grain'.

      Changes made.

      (12) Figure 1 caption: '...totaling ~2μs (B)...' is missing the fullstop after 2μs.

      Changes made.

      (13) Figure 1B: it may be useful to label where the Ca2+ ion binds or include a schematic.

      We updated Fig. 1A to illustrate where Ca2+ binds.

      (14) Line 311: Are these mean distances? The authors should add standard deviations.

      Yes, they are. We added the standard deviations to the text.

      (15) Line 321-322: Perhaps a schematic in Figure 2 would be useful to visualize the structural features described here.

      We would kindly refer interested readers to reference [60].

      (16) Line 377: '...are likely a correlate of groove opening...' should read as: '...are likely correlated to groove opening...'.

      Thank you for pointing it out. Changes made.

      (17) Line 398: the '...empirically determined 6Å threshold for scrambling.' Was this determined from the simulations or from experiments? What does "empirically" mean here? Please state this.

      This value was determined from the simulations. Based on our analysis of the correlation between scrambling rate and groove dilation, we found that the minimal TM4/6 distance of 6 Å can distinguish between the high and low activity scramblers. The exact numerical value is somewhat arbitrary as there is a range of values around 6 Å that serve to distinguish scramblers from non-scramblers.

      (18) Figure 4: This figure should be labelled as A, B, C and D, with the figure caption updated accordingly.

      We updated Figure 4 and its caption.

      Reviewer #3 (Recommendations for Authors):

      The authors must do additional simulations to further validate their claim with different lipids and further substantiate dimer interface independent of Ca2+ ions.

      Thank you for the suggestion. We completely agree that studying scrambling in the context of a diverse lipid environment is an exciting area to explore. We are indeed actively working on a project built around a similar idea. We decided not to include that study because we think the additional discussion involved would be excessive for the current manuscript. We do, however, look forward to publishing our findings in a separate manuscript in the near future. In terms of Ca2+-independent scrambling, we are planning mutagenesis studies with our experimental collaborators that target the residues we identified along the dimer interface.

      Since calcium ions are critical for the stability of these structures, authors should show that they were placed throughout the simulations consistently.

      As stated in the Methods section “Coarse-grained system preparation and simulation detail”, all Ca2+ ions were manually placed into the coarse-grained structure at the start of the simulation, at the positions they occupy in the experimental structure, and were harmonically bonded to adjacent acidic residues for the full duration of the simulation. We have also added a label to Fig 1A to indicate where the two Ca2+ ions are located.

      The comparison with experimental structures should be consistent with complete simulation, and not the last structure of the trajectory. Depending on the conformational variability, this might be misleading.

      We agree and updated Figure 1-figure supplement 1 accordingly. The overall agreement between membrane shapes in CGMD and cryo-EM was not affected by this change.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Wang et al., recorded concurrent EEG-fMRI in 107 participants during nocturnal NREM sleep to investigate brain activity and connectivity related to slow oscillations (SO), sleep spindles, and in particular their co-occurrence. The authors found SO-spindle coupling to be correlated with increased thalamic and hippocampal activity, and with increased functional connectivity from the hippocampus to the thalamus and from the thalamus to the neocortex, especially the medial prefrontal cortex (mPFC). They concluded the brain-wide activation pattern to resemble episodic memory processing, but to be dissociated from task-related processing and suggest that the thalamus plays a crucial role in coordinating the hippocampal-cortical dialogue during sleep.

      The paper offers an impressively large and highly valuable dataset that provides the opportunity for gaining important new insights into the network substrate involved in SOs, spindles, and their coupling. However, the paper does unfortunately not exploit the full potential of this dataset with the analyses currently provided, and the interpretation of the results is often not backed up by the results presented. I have the following specific comments.

      Thank you for your thoughtful and constructive feedback. We greatly appreciate your recognition of the strengths of our dataset and findings. Below, we address your specific comments and provide responses to each point you raised to ensure our methods and results are as transparent and comprehensible as possible. We hope these revisions address your comments and further strengthen our manuscript.

      (1) The introduction is lacking sufficient review of the already existing literature on EEG-fMRI during sleep and the BOLD-correlates of slow oscillations and spindles in particular (Laufs et al., 2007; Schabus et al., 2007; Horovitz et al., 2008; Laufs, 2008; Czisch et al., 2009; Picchioni et al., 2010; Spoormaker et al., 2010; Caporro et al., 2011; Bergmann et al., 2012; Hale et al., 2016; Fogel et al., 2017; Moehlman et al., 2018; Ilhan-Bayrakci et al., 2022). The few studies mentioned are not discussed in terms of the methods used or insights gained.

      We acknowledge the need for a more comprehensive review of prior EEG-fMRI studies investigating BOLD correlates of slow oscillations and spindles. However, not all of these articles concern sleep SOs or spindles. Several (Hale et al., 2016; Horovitz et al., 2008; Laufs, 2008; Laufs, Walker, & Lund, 2007; Spoormaker et al., 2010) mainly focus on methodology for EEG-fMRI, sleep stages, or brain networks, which are not the focus of our study. Thank you again for your attention to the comprehensiveness of our literature review. We will expand the introduction to include a more detailed discussion of the existing literature, ensuring that the contributions of previous EEG-fMRI sleep studies are adequately acknowledged.

      Introduction, Page 4 Lines 62-76

      “Investigating these sleep-related neural processes in humans is challenging because it requires tracking transient sleep rhythms while simultaneously assessing their widespread brain activation. Recent advances in simultaneous EEG-fMRI techniques provide a unique opportunity to explore these processes. EEG allows for precise event-based detection of neural signal, while fMRI provides insight into the broader spatial patterns of brain activation and functional connectivity (Horovitz et al., 2008; Huang et al., 2024; Laufs, 2008; Laufs, Walker, & Lund, 2007; Schabus et al., 2007; Spoormaker et al., 2010). Previous EEG-fMRI studies on sleep have focused on classifying sleep stages or examining the neural correlates of specific waves (Bergmann et al., 2012; Caporro et al., 2012; Czisch et al., 2009; Fogel et al., 2017; Hale et al., 2016; Ilhan-Bayrakcı et al., 2022; Moehlman et al., 2019; Picchioni et al., 2011). These studies have generally reported that slow oscillations are associated with widespread cortical and subcortical BOLD changes, whereas spindles elicit activation in the thalamus, as well as in several cortical and paralimbic regions. Although these findings provide valuable insights into the BOLD correlates of sleep rhythms, they often do not employ sophisticated temporal modeling (Huang et al., 2024), to capture the dynamic interactions between different oscillatory events, e.g., the coupling between SOs and spindles.”

      (2) The paper falls short in discussing the specific insights gained into the neurobiological substrate of the investigated slow oscillations, spindles, and their interactions. The validity of the inverse inference approach ("Open ended cognitive state decoding"), assuming certain cognitive functions to be related to these oscillations because of the brain regions/networks activated in temporal association with these events, is debatable at best. It is also unclear why eventually only episodic memory processing-like brain-wide activation is discussed further, despite the activity of 16 of 50 feature terms from the NeuroSynth v3 dataset were significant (episodic memory, declarative memory, working memory, task representation, language, learning, faces, visuospatial processing, category recognition, cognitive control, reading, cued attention, inhibition, and action).

      Thank you for pointing this out, particularly regarding the use of inverse inference approaches such as “open-ended cognitive state decoding.” Given the concerns about the indirectness of this approach, we decided to remove its related content and results from Figure 3 in the main text and include it in Supplementary Figure 7. We will refocus the main text on direct neurobiological insights gained from our EEG-fMRI analyses, particularly emphasizing the hippocampal-thalamocortical network dynamics underlying SO-spindle coupling, and we will acknowledge the exploratory nature of these findings and highlight their limitations.

      Discussion, Page 17-18 Lines 323-332

      “To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      (3) Hippocampal activation during SO-spindles is stated as a main hypothesis of the paper - for good reasons - however, other regions (e.g., several cortical as well as thalamic) would be equally expected given the known origin of both oscillations and the existing sleep-EEG-fMRI literature. However, this focus on the hippocampus contrasts with the focus on investigating the key role of the thalamus instead in the Results section.

      We appreciate your insight regarding the relative emphasis on hippocampal and thalamic activation in our study. We recognize that the manuscript may currently present an inconsistency between our initial hypothesis and the main focus of the results. To address this concern, we will ensure that our Introduction and Discussion sections explicitly discuss both regions, highlighting the complementary roles of the hippocampus (memory processing and reactivation) and the thalamus (spindle generation and cortico-hippocampal coordination) in SO-spindle dynamics.

      Introduction, Page 5 Lines 87-103

      “To address this gap, our study investigates brain-wide activation and functional connectivity patterns associated with SO-spindle coupling, and employs a cognitive state decoding approach (Margulies et al., 2016; Yarkoni et al., 2011)—albeit indirectly—to infer potential cognitive functions. In the current study, we used simultaneous EEG-fMRI recordings during nocturnal naps (detailed sleep staging results are provided in the Methods and Table S1) in 107 participants. Although directly detecting hippocampal ripples using scalp EEG or fMRI is challenging, we expected that hippocampal activation in fMRI would coincide with SO-spindle coupling detected by EEG, given that SOs, spindles, and ripples frequently co-occur during NREM sleep. We also anticipated a critical role of the thalamus, particularly thalamic spindles, in coordinating hippocampal-cortical communication.

      We found significant coupling between SOs and spindles during NREM sleep (N2/3), with spindle peaks occurring slightly before the SO peak. This coupling was associated with increased activation in both the thalamus and hippocampus, with functional connectivity patterns suggesting thalamic coordination of hippocampal-cortical communication. These findings highlight the key role of the thalamus in coordinating hippocampal-cortical interactions during human sleep and provide new insights into the neural mechanisms underlying sleep-dependent brain communication. A deeper understanding of these mechanisms may contribute to future neuromodulation approaches aimed at enhancing sleep-dependent cognitive function and treating sleep-related disorders.”

      Discussion, Page 16-17 Lines 292-307

      “When modeling the timing of these sleep rhythms in the fMRI, we observed hippocampal activation selectively during SO-spindle events. This suggests the possibility of triple coupling (SOs–spindles–ripples), even though our scalp EEG was not sufficiently sensitive to detect hippocampal ripples—key markers of memory replay (Buzsáki, 2015). Recent iEEG evidence indicates that ripples often co-occur with both spindles (Ngo, Fell, & Staresina, 2020) and SOs (Staresina et al., 2015; Staresina et al., 2023). Therefore, the hippocampal involvement during SO-spindle events in our study may reflect memory replay from the hippocampus, propagated via thalamic spindles to distributed cortical regions.

      The thalamus, known to generate spindles (Halassa et al., 2011), plays a key role in producing and coordinating sleep rhythms (Coulon, Budde, & Pape, 2012; Crunelli et al., 2018), while the hippocampus is essential for memory consolidation (Buzsáki, 2015; Diba & Buzsáki, 2007; Singh, Norman, & Schapiro, 2022). The increased hippocampal and thalamic activity, along with strengthened connectivity between these regions and the mPFC during SO-spindle events, underscores a hippocampal-thalamic-neocortical information flow. This aligns with recent findings suggesting the thalamus orchestrates neocortical oscillations during sleep (Schreiner et al., 2022). The thalamus and hippocampus thus appear central to memory consolidation during sleep, guiding information transfer to the neocortex, e.g., mPFC.”

      (4) The study included an impressive number of 107 subjects. It is surprising though that only 31 subjects had to be excluded under these difficult recording conditions, especially since no adaptation night was performed. Since only subjects were excluded who slept less than 10 min (or had excessive head movements) there are likely several datasets included with comparably short durations and only a small number of SOs and spindles and even less combined SO-spindle events. A comprehensive table should be provided (supplement) including for each subject (included and excluded) the duration of included NREM sleep, number of SOs, spindles, and SO+spindle events. Also, some descriptive statistics (mean/SD/range) would be helpful.

      We appreciate your recognition of our sample size and the challenges associated with simultaneous EEG-fMRI sleep recordings. We acknowledge the importance of transparently reporting individual subject data, particularly regarding sleep duration and the number of detected SOs, spindles, and SO-spindle events. To address this, we will provide comprehensive tables in the supplementary materials, containing descriptive information about sleep-related characteristics (Table S1) as well as detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: (1) duration of each sleep stage; (2) number of detected SOs; (3) number of detected spindles; (4) number of detected SO-spindle coupling events; (5) density of detected SOs; (6) density of detected spindles; (7) density of detected SO-spindle coupling events.

      However, most of the excluded participants were unable to fall asleep or slept too briefly to have any meaningful NREM period, so NREM sleep duration and the numbers of SOs, spindles, and coupling events could not be computed for them.

      Supplementary Materials, Page 42-54, Table S1-S4
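
      The per-subject densities and descriptive statistics (mean/SD/range) requested above can be sketched as follows. This is only an illustration and not the authors' code: the `density` helper, the record layout, and every number in `records` are made-up assumptions. A subject with no scored NREM sleep gets no density, which mirrors why these counts cannot be reported for most excluded participants.

```python
from statistics import mean, stdev

def density(count, minutes):
    """Events per minute, or None when no sleep of that stage was scored."""
    return count / minutes if minutes > 0 else None

# Made-up records for illustration only; these are NOT values from the study.
records = [
    {"subject": "S01", "nrem_min": 42.0, "coupled_events": 98},
    {"subject": "S02", "nrem_min": 18.5, "coupled_events": 45},
    {"subject": "S03", "nrem_min": 61.0, "coupled_events": 150},
    {"subject": "S04", "nrem_min": 0.0, "coupled_events": 0},  # never reached NREM
]

# Density per subject; subjects without NREM sleep map to None.
densities = {r["subject"]: density(r["coupled_events"], r["nrem_min"]) for r in records}
valid = [d for d in densities.values() if d is not None]

# Descriptive statistics over subjects with a defined density.
summary = {
    "n": len(valid),
    "mean": mean(valid),
    "sd": stdev(valid),
    "min": min(valid),
    "max": max(valid),
}
print(densities)
print(summary)
```

      Returning `None` rather than zero keeps subjects without NREM sleep out of the group mean instead of biasing it downward.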

      (5) Was the 20-channel head coil dedicated for EEG-fMRI measurements? How were the electrode cables guided through/out of the head coil? Usually, the 64-channel head coil is used for EEG-fMRI measurements in a Siemens PRISMA 3T scanner, which has a cable duct at the back that allows to guide the cables straight out of the head coil (to minimize MR-related artifacts). The choice for the 20-channel head coil should be motivated. Photos of the recording setup would also be helpful.

      Thank you for your comment regarding our choice of the 20-channel head coil for EEG-fMRI measurements. We acknowledge that the 64-channel head coil is commonly used in Siemens PRISMA 3T scanners; however, the 20-channel coil was selected due to specific practical and technical considerations in our study. In particular, the 20-channel head coil was compatible with our EEG system and ensured sufficient signal-to-noise ratio (SNR) for both EEG and fMRI acquisition. The EEG electrode cables were guided through the lateral and posterior openings of the head coil, secured with foam padding to reduce motion and minimize MR-related artifacts. Moreover, given the extended nature of nocturnal sleep recordings, the 20-channel coil allowed us to maintain participant comfort while still achieving high-quality simultaneous EEG-fMRI data.

      We have made this clearer in the revised manuscript. 

      Methods, Page 20 Lines 385-392

      “All MRI data were acquired using a 20-channel head coil on a research-dedicated 3-Tesla Siemens Magnetom Prisma MRI scanner. Earplugs and cushions were provided for noise protection and head motion restriction. We chose the 20-channel head coil because it was compatible with our EEG system and ensured sufficient signal-to-noise ratio (SNR) for both EEG and fMRI acquisition. The EEG electrode cables were guided through the lateral and posterior openings of the head coil, secured with foam padding to reduce motion and minimize MR-related artifacts. Moreover, given the extended nature of nocturnal sleep recordings, the 20-channel coil helped maintain participant comfort while still achieving high-quality simultaneous EEG-fMRI data.”

      (6) Was the EEG sampling synchronized to the MR scanner (gradient system) clock (the 10 MHz signal; not referring to the volume TTL triggers here)? This is a requirement for stable gradient artifact shape over time and thus accurate gradient noise removal.

      Thank you for raising this important point. We confirm that the EEG sampling was synchronized to the MR scanner’s 10 MHz gradient system clock, ensuring a stable gradient artifact shape over time and enabling accurate artifact removal. This synchronization was achieved using the standard clock synchronization interface of the EEG amplifier, minimizing timing jitter and drift. As a result, the gradient artifact waveform remained stable across volumes, allowing for more effective artifact correction during preprocessing. We appreciate your attention to this critical aspect of EEG-fMRI data acquisition.

      We have made this clearer in the revised manuscript. 

      Methods, Page 19-20 Lines 371-383

      “EEG was recorded simultaneously with fMRI data using an MR-compatible EEG amplifier system (BrainAmps MR-Plus, Brain Products, Germany), along with a specialized electrode cap. The recording was done using 64 channels in the international 10/20 system, with the reference channel positioned at FCz. In order to adhere to polysomnography (PSG) recording standards, six electrodes were removed from the EEG cap: one for electrocardiogram (ECG) recording, two for electrooculogram (EOG) recording, and three for electromyogram (EMG) recording. EEG data was recorded at a sample rate of 5000 Hz, the resistance of the reference and ground channels was kept below 10 kΩ, and the resistance of the other channels was kept below 20 kΩ. To synchronize the EEG and fMRI recordings, the BrainVision recording software (BrainProducts, Germany) was utilized to capture triggers from the MRI scanner. The EEG sampling was synchronized to the MR scanner’s 10 MHz gradient system clock, ensuring a stable gradient artifact shape over time and enabling accurate artifact removal. This was achieved via the standard clock synchronization interface of the EEG amplifier, minimizing timing jitter and drift.”

      (7) The TR is quite long and the voxel size is quite large in comparison to state-of-the-art EPI sequences. What was the rationale behind choosing a sequence with relatively low temporal and spatial resolution?

      We acknowledge that our chosen TR and voxel size are relatively long and large compared to state-of-the-art EPI sequences. This decision was made to optimize the signal-to-noise ratio (SNR) and reduce susceptibility-related distortions, which are particularly critical in EEG-fMRI sleep studies where head motion and physiological noise can be substantial. A longer TR allowed us to sample whole-brain activity with sufficient coverage, while a larger voxel size helped enhance BOLD sensitivity and minimize partial volume effects in deep brain structures such as the thalamus and hippocampus, which are key regions of interest in our study. We appreciate your concern and hope this clarification provides sufficient rationale for our sequence parameters.

      We have made this clearer in the revised manuscript. 

      Methods, Page 20-21 Lines 398-408

      “Then, the “sleep” session began after the participants were instructed to try and fall asleep. For the functional scans, whole-brain images were acquired using a k-space and steady-state T2*-weighted gradient echo-planar imaging (EPI) sequence that is sensitive to the BOLD contrast. This measures local magnetic changes caused by changes in blood oxygenation that accompany neural activity (sequence specification: 33 slices in interleaved ascending order, TR = 2000 ms, TE = 30 ms, voxel size = 3.5 × 3.5 × 4.2 mm³, FA = 90°, matrix = 64 × 64, gap = 0.7 mm). A relatively long TR and larger voxel size were chosen to optimize SNR and reduce susceptibility-related distortions, which are critical in EEG-fMRI sleep studies where head motion and physiological noise can be substantial. The longer TR allowed whole-brain coverage with sufficient temporal resolution, while the larger voxel size helped enhance BOLD sensitivity and minimize partial volume effects in deep brain structures (e.g., the thalamus and hippocampus), which are key regions of interest in this study.”

      (8) The anatomically defined ROIs are quite large. It should be elaborated on how this might reduce sensitivity to sleep rhythm-specific activity within sub-regions, especially for the thalamus, which has distinct nuclei involved in sleep functions.

      We appreciate your insight regarding the use of anatomically defined ROIs and their potential limitations in detecting sleep rhythm-specific activity within sub-regions, particularly in the thalamus. Given the distinct functional roles of thalamic nuclei in sleep processes, we acknowledge that using a single, large thalamic ROI may reduce sensitivity to localized activity patterns. To address this, we will discuss this limitation in the revised manuscript, acknowledging that our approach prioritizes whole-structure effects but may not fully capture nucleus-specific contributions.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (9) The study reports SO & spindle amplitudes & densities, as well as SO+spindle coupling, to be larger during N2/3 sleep compared to N1 and REM sleep, which is trivial but can be seen as a sanity check of the data. However, the amount of SOs and spindles reported for N1 and REM sleep is concerning, as per definition there should be hardly any (if SOs or spindles occur in N1 it becomes by definition N2, and the interval between spindles has to be considerably large in REM to still be scored as such). Thus, on the one hand, the report of these comparisons takes too much space in the main manuscript as it is trivial, but on the other hand, it raises concerns about the validity of the scoring.

      We appreciate your concern regarding the reported presence of SOs and spindles in N1 and REM sleep and the potential implications. Our methods for detecting SOs, spindles, and their coupling were originally designed only for N2 and N3 sleep data, based on the characteristics of those data, and they are widely recognized and used in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because the detection methods for SOs and spindles are percentile-based, they will always detect a certain number of events when applied to data from other stages (N1 and REM), and how these events differ from those detected in N2/N3 remains unclear. We will acknowledge the reasons for these results in the Methods section and emphasize that they are used only for sanity checks.

      Methods, Page 25 Lines 515-524

      “We note that the above methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).”
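
      The point about percentile-based detection can be made concrete with a minimal sketch. It is not the authors' detection pipeline: the helper `detect_by_percentile`, the 75th-percentile cutoff, and the synthetic amplitudes are all assumptions chosen for illustration. Because the threshold is derived from the same data it is applied to, a fixed share of candidates is labelled as events whether the recording is high-amplitude (N2/N3-like) or low-amplitude (REM-like).

```python
import math
import random

def detect_by_percentile(amplitudes, percentile=75.0):
    """Return indices of candidates whose amplitude exceeds the given
    percentile of the SAME recording (hypothetical helper)."""
    ranked = sorted(amplitudes)
    # Nearest-rank percentile of the recording itself.
    rank = max(1, math.ceil(percentile / 100.0 * len(ranked)))
    threshold = ranked[rank - 1]
    return [i for i, a in enumerate(amplitudes) if a > threshold]

rng = random.Random(0)
# Synthetic candidate amplitudes in arbitrary units (assumed, not measured).
n23_like = [rng.gauss(60.0, 15.0) for _ in range(1000)]  # large waves
rem_like = [rng.gauss(15.0, 4.0) for _ in range(1000)]   # small waves

n23_events = detect_by_percentile(n23_like)
rem_events = detect_by_percentile(rem_like)

# The counts match even though the underlying amplitudes differ greatly.
print(len(n23_events), len(rem_events))
print(min(n23_like[i] for i in n23_events), min(rem_like[i] for i in rem_events))
```

      The event counts are identical by construction, while the smallest accepted amplitude differs between the two synthetic stages. This is why event counts in N1 and REM obtained this way are best read as a sanity check rather than as evidence of genuine SOs or spindles.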

      (10) Why was electrode F3 used to quantify the occurrence of SOs and spindles? Why not a midline frontal electrode like Fz (or a number of frontal electrodes for SOs) and Cz (or a number of centroparietal electrodes) for spindles to be closer to their maximum topography?

      We appreciate your suggestion regarding electrode selection for SO and spindle quantification. Our choice of F3 was primarily based on previous studies (Massimini et al., 2004; Molle et al., 2011), where bilateral frontal electrodes are commonly used for detecting SOs and spindles. Additionally, we considered the impact of MRI-related noise and, after a comprehensive evaluation, determined that F3 provided an optimal balance between signal quality and artifact minimization. We also acknowledge that alternative electrode choices, such as Fz for SOs and Cz for spindles, could provide additional insights into their topographical distributions.

      (11) Functional connectivity (hippocampus -> thalamus -> cortex (mPFC)) is reported to be increased during SO-spindle coupling and interpreted as evidence for coordination of hippocampo-neocortical communication likely by thalamic spindles. However, functional connectivity was only analysed during coupled SO+spindle events, not during isolated SOs or isolated spindles. Without the direct comparison of the connectivity patterns between these three events, it remains unclear whether this is specific for coupled SO+spindle events or rather associated with one or both of the other isolated events. The PPIs need to be conducted for those isolated events as well and compared statistically to the coupled events.

      We appreciate your critical perspective on our functional connectivity analysis and the interpretation of hippocampus-thalamus-cortex (mPFC) interactions during SO-spindle coupling. We acknowledge that, in the current analysis, functional connectivity was only examined during coupled SO-spindle events, without direct comparison to isolated SOs or isolated spindles. To address this concern, we have conducted PPI analyses for all three ROIs (hippocampus, thalamus, mPFC) and all three event types (SO-spindle couplings, isolated SOs, and isolated spindles). Our results indicate that neither isolated SOs nor isolated spindles yielded significant connectivity changes in any of the three ROIs, as none survived multiple comparison correction. This suggests that the observed connectivity increase is specific to SO-spindle coupling, rather than being independently driven by either SOs or spindles alone.

      Results, Page 14 Lines 248-255

      “Crucially, the interaction between FC and SO-spindle coupling revealed that only the functional connectivity of hippocampus -> thalamus (ROI analysis, t(106) = 1.86, p = 0.0328) and thalamus -> mPFC (ROI analysis, t(106) = 1.98, p = 0.0251) significantly increased during SO-spindle coupling, with no significant changes in any other pathway (Fig. 4e). We also conducted PPI analyses for the other two events (SOs and spindles), and neither yielded significant connectivity changes in the three ROIs, as none survived whole-brain FWE correction at the cluster level (p < 0.05). Together, these findings suggest that the thalamus, likely via spindles, coordinates hippocampal-cortical communication selectively during SO-spindle coupling, but not during isolated SOs or spindle events alone.”
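
      For readers less familiar with PPI, the quantity its interaction term tests can be sketched on simulated data. This is a deliberately simplified illustration and not the authors' analysis: a real PPI works on HRF-convolved (and usually deconvolved) signals within a full GLM, whereas here the context regressor is a plain binary label per volume, and all names and simulated coefficients are assumptions. For a binary context, the coefficient of the seed × context interaction equals the difference between the seed-to-target slopes inside and outside the context, which is what the code computes.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x, with an intercept."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(1)
n_vols = 2000

# Hypothetical context: 1 on volumes labelled as containing a coupling event.
context = [1 if rng.random() < 0.3 else 0 for _ in range(n_vols)]
# Hypothetical seed-region time series (e.g. thalamus), arbitrary units.
seed = [rng.gauss(0.0, 1.0) for _ in range(n_vols)]
# Simulated target: coupling to the seed is 0.2 at baseline, 0.7 during events.
target = [(0.2 + 0.5 * c) * s + rng.gauss(0.0, 1.0) for c, s in zip(context, seed)]

# Seed-to-target slope estimated separately inside and outside the context.
event_idx = [i for i, c in enumerate(context) if c == 1]
rest_idx = [i for i, c in enumerate(context) if c == 0]
slope_event = ols_slope([seed[i] for i in event_idx], [target[i] for i in event_idx])
slope_rest = ols_slope([seed[i] for i in rest_idx], [target[i] for i in rest_idx])

# This difference is the interaction ("PPI") effect in this simplified setting.
ppi_effect = slope_event - slope_rest
print(round(slope_rest, 2), round(slope_event, 2), round(ppi_effect, 2))
```

      Running the same comparison with isolated SOs or isolated spindles as the context is the logic behind the additional analyses requested by the reviewer.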

      (12) The limited temporal resolution of fMRI does indeed not allow for easily distinguishing between fMRI activation patterns related to SO-up- vs. SO-down-states. For this, one could try to extract the amplitudes of SO-up- and SO-down-states separately for each SO event and model them as two separate parametric modulators (with the risk of collinearity as they are likely correlated).

      We appreciate your insightful comment regarding the challenge of distinguishing fMRI activation patterns related to SO-up vs. SO-down states due to the limited temporal resolution of fMRI. While our current analysis does not differentiate between these two phases, we acknowledge that separately modeling SO-up and SO-down states using parametric modulators could provide a more refined understanding of their distinct neural correlates. However, as you note, this approach carries the risk of collinearity, and there is indeed a high correlation between the two amplitudes across all subjects in our results (r = 0.98). Future studies could address this by leveraging high-temporal-resolution techniques. While implementing this in the current study is beyond our scope, we will acknowledge this limitation in the Discussion section.
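
      To give a sense of how severe a correlation of r = 0.98 is, the standard variance inflation factor for two correlated regressors, VIF = 1 / (1 - r²), can be computed directly. This is a back-of-the-envelope sketch that assumes only the two modulators are correlated; it ignores the other regressors in the actual GLM, and the function name is illustrative.

```python
def variance_inflation_factor(r):
    """VIF for one of two regressors whose Pearson correlation is r."""
    return 1.0 / (1.0 - r ** 2)

r = 0.98  # correlation between UP- and DOWN-state amplitudes reported above
vif = variance_inflation_factor(r)

# The standard error of each coefficient grows with the square root of the VIF.
se_inflation = vif ** 0.5
print(f"VIF = {vif:.1f}, standard errors inflated about {se_inflation:.1f}-fold")
```

      A VIF of roughly 25 is well above the commonly used rule-of-thumb cutoffs of 5 to 10, which supports treating the two modulators as not separately estimable in this design.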

      Discussion, Page 17 Lines 308-322

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during the processing of external stimuli (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of the DMN during SO events reflects a shift from internal cognition to external responses, given that it occurs during deep sleep. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained when modeled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition of the DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.”

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (13) L327: "It is likely that our findings of diminished DMN activity reflect brain activity during the SO DOWN-state, as this state consistently shows higher amplitude compared to the UP-state within subjects, which is why we modelled the SO trough as its onset in the fMRI analysis." This conclusion is not justified as the fact that SO down-states are larger in amplitude does not mean their impact on the BOLD response is larger.

      We appreciate your concern regarding our interpretation of diminished DMN activity as reflecting the SO DOWN-state. We acknowledge that the current wording is somewhat misleading. Our interpretation is that the effect could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained when modeled at the SO peak instead). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition of the DOWN-state. We will make this clear in the Discussion section.

      Discussion, Page 17 Lines 308-322

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during the processing of external stimuli (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of the DMN during SO events reflects a shift from internal cognition to external responses, given that it occurs during deep sleep. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained when modeled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition of the DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.”

      (14) Line 77: "In the current study, while directly capturing hippocampal ripples with scalp EEG or fMRI is difficult, we expect to observe hippocampal activation in fMRI whenever SOs-spindles coupling is detected by EEG, if SOs- spindles-ripples triple coupling occurs during human NREM sleep". Not all SO-spindle events are associated with ripples (Staresina et al., 2015), but hippocampal activation may also be expected based on the occurrence of spindles alone (Bergmann et al., 2012).

      We appreciate your clarification regarding the relationship between SO-spindle coupling and hippocampal ripples. We acknowledge that not all SO-spindle events are necessarily accompanied by ripples (Staresina et al., 2015). However, previous research indicates that hippocampal ripples are significantly more likely to occur during SO-spindle coupling events. This suggests that while ripple occurrence is not guaranteed, SO-spindle coupling creates a favorable network state for ripple generation and potential hippocampal activation. To ensure accuracy, we will delete this misleading sentence from the Introduction and acknowledge in the Discussion that our data cannot directly demonstrate the triple coupling of SOs, spindles, and hippocampal ripples.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      Reviewer #2 (Public review):

      In this study, Wang and colleagues aimed to explore brain-wide activation patterns associated with NREM sleep oscillations, including slow oscillations (SOs), spindles, and SO-spindle coupling events. Their findings reveal that SO-spindle events corresponded with increased activation in both the thalamus and hippocampus. Additionally, they observed that SO-spindle coupling was linked to heightened functional connectivity from the hippocampus to the thalamus, and from the thalamus to the medial prefrontal cortex, three key regions involved in memory consolidation and episodic memory processes.

      This study's findings are timely and highly relevant to the field. The authors' extensive data collection, involving 107 participants sleeping in an fMRI scanner while undergoing simultaneous EEG recording, deserves special recognition. If shared, this unique dataset could lead to further valuable insights. While the conclusions seem overall well supported by the data, some aspects with regard to the detection of sleep oscillations need clarification.

      The authors report that coupled SO-spindle events were most frequent during NREM sleep (2.46 ± 0.06 events/min), but they also observed a surprisingly high occurrence of these events during N1 and REM sleep (2.23 ± 0.09 and 2.32 ± 0.09 events/min, respectively), where SO-spindle coupling would not typically be expected. Combined with the relatively modest SO amplitudes reported (~25 µV, whereas >75 µV would be expected when using mastoids as reference electrodes), this raises the possibility that the parameters used for event detection may not have been conservative enough, or that sleep staging was inaccurately performed. This issue could present a significant challenge, as the fMRI findings are largely dependent on the reliability of these detected events.

      Thank you very much for your thorough and encouraging review. We appreciate your recognition of the significance and relevance of our study and dataset, particularly in highlighting how simultaneous EEG-fMRI recordings can provide complementary insights into the temporal dynamics of neural oscillations and their associated spatial activation patterns during sleep. In the sections that follow, we address each of your comments in detail. We have revised the text and conducted additional analyses wherever possible to strengthen our argument and clarify our methodological choices. We believe these revisions improve the clarity and rigor of our work, and we thank you for helping us refine it.

      We appreciate your insightful comments regarding the detection of sleep oscillations. Our methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM. We will acknowledge the reasons for these results in the Methods section and emphasize that they are used only for sanity checks.

      Regarding the reported SO amplitudes (~25 µV): during preprocessing, we applied the Signal Space Projection (SSP) method to more effectively remove MRI gradient artifacts and cardiac pulse noise. While this approach enhances data quality, it also reduces overall signal power, leading to systematically lower reported amplitudes. Despite this, the SOs we detected in NREM sleep (especially N2/N3) remain physiologically meaningful and are consistent with previous fMRI studies using similar artifact removal techniques. We appreciate your careful evaluation and valuable suggestions.

      In addition, we will provide comprehensive tables in the supplementary materials, containing descriptive information about sleep-related characteristics (Table S1) as well as detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: (1) duration of each sleep stage; (2) number of detected SOs; (3) number of detected spindles; (4) number of detected SO-spindle coupling events; (5) density of detected SOs; (6) density of detected spindles; (7) density of detected SO-spindle coupling events.

      Methods, Page 25 Lines 515-524

      “We note that the above methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).”

      Supplementary Materials, Page 42-54, Table S1-S4

      Reviewer #3 (Public review):

      Summary:

      Wang et al., examined the brain activity patterns during sleep, especially when locked to those canonical sleep rhythms such as SO, spindle, and their coupling. Analyzing data from a large sample, the authors found significant coupling between spindles and SOs, particularly during the upstate of the SO. Moreover, the authors examined the patterns of whole-brain activity locked to these sleep rhythms. To understand the functional significance of these brain activities, the authors further conducted open-ended cognitive state decoding and found a variety of cognitive processing may be involved during SO-spindle coupling and during other sleep events. The authors next investigated the functional connectivity analyses and found enhanced connectivity between the hippocampus, the thalamus, and the medial PFC. These results reinforced the theoretical model of sleep-dependent memory consolidation, such that SO-spindle coupling is conducive to systems-level memory reactivation and consolidation.

      Strengths:

      There are obvious strengths in this work, including the large sample size, state-of-the-art neuroimaging and neural oscillation analyses, and the richness of results.

      Weaknesses:

      Despite these strengths and the insights gained, there are weaknesses in the design, the analyses, and inferences.

      Thank you for your detailed and thoughtful review of our manuscript. We are delighted that you recognize the large sample size, the advanced neuroimaging and neural oscillation analyses, and the richness of the results. In the following sections, we provide detailed responses to each of your comments. We have also revised the text and conducted additional analyses to strengthen our arguments and clarify our methodological choices. We believe these revisions enhance the clarity and rigor of our work, and we sincerely appreciate your feedback in helping us refine the manuscript.

      (1) A repeating statement in the manuscript is that brain activity could indicate memory reactivation and thus consolidation. This is indeed a highly relevant question that could be informed by the current data/results. However, an inherent weakness of the design is that there is no memory task before and after sleep. Thus, it is difficult (if not impossible) to make a strong argument linking SO/spindle/coupling-locked brain activity with memory reactivation or consolidation.

      We appreciate your suggestion regarding the lack of a pre- and post-sleep memory task in our study design. We acknowledge that, in the absence of behavioral measures, it is hard to directly link SO-spindle coupling to memory consolidation in an outcome-driven manner. Our interpretation is instead based on the well-established role of these oscillations in memory processes, as demonstrated in previous studies. We sincerely appreciate this feedback and will adjust our Discussion accordingly to reflect a more precise interpretation of our findings.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (2) Relatedly, to understand the functional implications of the sleep rhythm-locked brain activity, the authors employed the "open-ended cognitive state decoding" method. While this method is interesting, it is rather indirect given that there were no behavioral indices in the manuscript. Thus, discussions based on these analyses are speculative at best. Please either tone down the language or find additional evidence to support these claims.

      Moreover, the results from this method are difficult to understand. Figure 3e showed that for all three types of sleep events (SO, spindle, SO-spindle), the same mental states (e.g., working memory, episodic memory, declarative memory) showed opposite directions of activation (left and right panels showed negative and positive activation, respectively). How to interpret these conflicting results? This ambiguity is also reflected by the term used: declarative memory and episodic memories are both indexed in the results. Yet these two processes can be largely overlapped. So which specific memory processes do these brain activity patterns reflect? The Discussion shall discuss these results and the limitations of this method.

      We appreciate your critical assessment of the open-ended cognitive state decoding method and its interpretational challenges. Given the concerns about the indirectness of this approach, we decided to move the related content and results from Figure 3 in the main text to Supplementary Figure 7.

      Due to the complexity of memory-related processes, we acknowledge that distinguishing between episodic and declarative memory based solely on this approach is not straightforward. We will revise the Supplementary Materials to explicitly discuss these limitations and clarify that our findings do not isolate specific cognitive processes but rather suggest general associations with memory-related networks.

      Discussion, Page 17-18 Lines 323-332

      “To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      (3) The coupling strength is somehow inconsistent with prior results (Hahn et al., 2020, eLife, Helfrich et al., 2018, Neuron). Specifically, Helfrich et al. showed that among young adults, the spindle is coupled to the peak of the SO. Here, the authors reported that the spindles were coupled to down-to-up transitions of SO and before the SO peak. It is possible that participants' age may influence the coupling (see Helfrich et al., 2018). Please discuss the findings in the context of previous research on SO-spindle coupling.

      We appreciate your concern regarding the temporal characteristics of SO-spindle coupling. We acknowledge that the SO-spindle coupling phase results in our study are not identical to those reported by Hahn et al. (2020) and Helfrich et al. (2018). These differences may arise from slight variations in event detection parameters, which can influence the precise phase estimate of the coupling. Notably, Hahn et al. (2020) also reported slight discrepancies in their group-level coupling phase results, highlighting that methodological differences can contribute to variability across studies. Furthermore, our findings are consistent with those of Schreiner et al. (2021), which supports the robustness of our observations.

      That said, we acknowledge that our original description of SO-spindle coupling as occurring at the "transition from the lower state to the upper state" was not entirely precise. The -π/2 phase represents the true transition point, while our observed coupling phase is actually closer to the SO peak rather than strictly at the transition. We will revise this statement in the manuscript to ensure clarity and accuracy in describing the coupling phase.  
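
The distinction between the transition point and the peak can be made concrete with a short sketch. It assumes a cosine phase convention in which 0 is the SO peak (UP-state), ±π is the trough (DOWN-state), and -π/2 is the down-to-up transition; the per-participant phases below are invented for illustration and are not our data.

```python
import cmath
import math

def circular_mean(phases):
    """Mean direction (radians, in (-pi, pi]) of a list of phase angles."""
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return cmath.phase(resultant)

def circular_distance(a, b):
    """Shortest angular distance between two phases, in [0, pi]."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))

def describe_phase(phase):
    """Say whether a phase lies nearer the SO peak (0) or the
    down-to-up transition (-pi/2) under the assumed convention."""
    to_peak = circular_distance(phase, 0.0)
    to_transition = circular_distance(phase, -math.pi / 2)
    return "nearer the peak" if to_peak < to_transition else "nearer the transition"

# Hypothetical preferred coupling phases, all slightly before the peak.
phases = [-0.5, -0.3, -0.4, -0.6, -0.2]
mean_phase = circular_mean(phases)
print(round(mean_phase, 3), describe_phase(mean_phase))
```

A circular mean is used rather than an arithmetic mean because phases wrap around at ±π.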

      Discussion, Page 16 Lines 283-291

      “Our data provide insights into the neurobiological underpinnings of these sleep rhythms. SOs, originating mainly in neocortical areas such as the mPFC, alternate between DOWN- and UP-states. The thalamus generates sleep spindles, which in turn couple with SOs. Our finding that spindle peaks consistently occurred slightly before the UP-state peak of SOs (in 83 out of 107 participants) concurs with prior studies, including Schreiner et al. (2021). Yet it differs from some results suggesting that spindles might peak right at the SO UP-state (Hahn et al., 2020; Helfrich et al., 2018). Such discrepancies could arise from differences in detection algorithms, participant age (Helfrich et al., 2018), or subtle variations in cortical-thalamic timing. Nonetheless, these results underscore the importance of coordinated SO-spindle interplay in supporting sleep-dependent processes.”

      (4) The discussion is rather superficial with only two pages, without delving into many important arguments regarding the possible functional significance of these results. For example, the author wrote, "This internal processing contrasts with the brain patterns associated with external tasks, such as working memory." Without any references to working memory, and without delineating why WM is considered as an external task even working memory operations can be internal. Similarly, for the interesting results on SO and reduced DMN activity, the authors wrote "The DMN is typically active during wakeful rest and is associated with self-referential processes like mind-wandering, daydreaming, and task representation (Yeshurun, Nguyen, & Hasson, 2021). Its reduced activity during SOs may signal a shift towards endogenous processes such as memory consolidation." This argument is flawed. DMN is active during self-referential processing and mind-wandering, i.e., when the brain shifts from external stimuli processing to internal mental processing. During sleep, endogenous memory reactivation and consolidation are also part of the internal mental processing given the lack of external environmental stimulation. So why during SO or during memory consolidation, the DMN activity would be reduced? Were there differences in DMN activity between SO and SO-spindle coupling events?

      We appreciate your concerns regarding the brevity of the discussion and the need for clearer theoretical arguments. We will expand this section to provide more in-depth interpretations of our findings in the context of prior literature. Regarding working memory (WM), we acknowledge that our phrasing was ambiguous. We will modify this statement in the Discussion section.

      For the SO-related reduction in DMN activity, we recognize the need for a more precise explanation. This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during the processing of external stimuli (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of the DMN during SO events reflects a shift from internal cognition to external responses, given that it occurs during deep sleep. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained when events were modelled at the SO peak instead). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition of the DOWN-state.

      To address your final question, we conducted an additional post hoc comparison of DMN activity between isolated SOs and SO-spindle coupling events. Our results indicate that DMN activation during SOs was significantly lower than during SO-spindle coupling (t(106) = -4.17, p < 1e-4). This suggests that SO-spindle coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. We appreciate your constructive feedback and will integrate these expanded analyses and discussions into our revised manuscript.
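
For readers who wish to see how a statistic of the form t(106) arises from 107 participants, the following is a minimal sketch of a paired-samples t statistic. It assumes a within-subject (paired) comparison, which is consistent with the reported degrees of freedom but is our assumption here; the simulated values are toy data, not our DMN estimates.

```python
import math
import random
import statistics

def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

rng = random.Random(1)
n_subjects = 107
# Toy per-subject DMN estimates: lower during isolated SOs than during coupling.
dmn_so = [rng.gauss(-1.0, 1.0) for _ in range(n_subjects)]
dmn_coupling = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]

t_value, dof = paired_t(dmn_so, dmn_coupling)
print(f"t({dof}) = {t_value:.2f}")
```

A negative t value here means the first condition (isolated SOs) has the lower mean, matching the direction of the effect reported above.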

      Results, Page 11 Lines 199-208

      “Spindles were correlated with positive activation in the thalamus (ROI analysis, t(106) = 15.39, p < 1e-4), the anterior cingulate cortex (ACC), and the putamen, alongside deactivation in the DMN (Fig. 3c). Notably, SO-spindle coupling was linked to significant activation in both the thalamus (ROI analysis, t(106) = 3.38, p = 0.0005) and the hippocampus (ROI analysis, t(106) = 2.50, p = 0.0070, Fig. 3d). However, no decrease in DMN activity was found during SO-spindle coupling, and DMN activity during SO was significantly lower than during coupling (ROI analysis, t(106) = -4.17, p < 1e-4). For more detailed activation patterns, see Table S5-S7. We also varied the threshold used to detect SO events to assess its effect on hippocampal activation during SO-spindle coupling and observed that hippocampal activation remained significant when the percentile thresholds for SO detection ranged between 71% and 80% (see Fig. S6).”

      Discussion, Page 17-18 Lines 308-332

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during the processing of external stimuli (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of the DMN during SO events reflects a shift from internal cognition to external responses, given that it occurs during deep sleep. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained when events were modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition of the DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.

      To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      Recommendations for the authors:

      Reviewing Editor Comment:

      The reviewers think that you are working on a relevant and important topic. They are praising the large sample size used in the study. The reviewers are not all in line regarding the overall significance of the findings, but they all agree the paper would strongly benefit from some extra work, as all reviewers raise various critical points that need serious consideration.

      We appreciate your recognition of the relevance and importance of our study, as well as your acknowledgment of the large sample size as a strength of our work. We understand that there are differing perspectives regarding the overall significance of our findings, and we value the constructive critiques provided. We are committed to addressing the key concerns raised by all reviewers, including refining our analyses, clarifying our interpretations, and incorporating additional discussions to strengthen the manuscript. Below, we address your specific recommendations and provide responses to each point you raised to ensure our methods and results are as transparent and comprehensible as possible. We believe that these revisions will significantly enhance the rigor and impact of our study, and we sincerely appreciate your thoughtful feedback in helping us improve our work.

      Reviewer #1 (Recommendations for the authors):

      (1) The phrase "overnight sleep" suggests an entire night, while these were rather "nocturnal naps". Please rephrase.

      Response: Thank you for pointing this out. We have revised the phrasing in our manuscript to "nocturnal naps" instead of "overnight sleep" to more accurately reflect the duration of the sleep recordings.

      (2) Sleep staging results (macroscopic sleep architecture) should be provided in more detail (at least min and % of the different sleep stages, sleep onset latency, total sleep duration, total recording duration), at least mean/SD/range.

      Thank you for this suggestion. We will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics. This information will give a clearer overview of the macroscopic sleep architecture in our dataset.

      Reviewer #2 (Recommendations for the authors):

      In order to allow for a better estimation of the reliability of the detected sleep events, please:

      (1) Provide densities and absolute numbers of all detected SOs and spindles (N1, NREM, and REM sleep).

      Thank you for pointing this out. We will provide comprehensive tables in the supplementary materials containing detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: 1) duration of each sleep stage; 2) number of detected SOs; 3) number of detected spindles; 4) number of detected SO-spindle coupling events; 5) density of detected SOs; 6) density of detected spindles; 7) density of detected SO-spindle coupling events.

      Supplementary Materials, Page 43-54, Table S2-S4
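
As a reading aid for these tables, the sketch below shows how the density columns relate to the count and duration columns. It assumes density is expressed as events per minute of time spent in the stage, which is a common convention but is an assumption here; the class, the function names, and the numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class StageCounts:
    stage: str
    duration_min: float   # time spent in this stage, in minutes
    n_so: int             # detected slow oscillations
    n_spindle: int        # detected spindles
    n_coupling: int       # detected SO-spindle coupling events

def per_minute(count, duration_min):
    """Events per minute of stage time; None when the stage was not observed."""
    return count / duration_min if duration_min > 0 else None

def table_row(c):
    """Assemble one per-subject, per-stage row with counts and densities."""
    return {
        "stage": c.stage,
        "duration_min": c.duration_min,
        "n_so": c.n_so,
        "n_spindle": c.n_spindle,
        "n_coupling": c.n_coupling,
        "so_density": per_minute(c.n_so, c.duration_min),
        "spindle_density": per_minute(c.n_spindle, c.duration_min),
        "coupling_density": per_minute(c.n_coupling, c.duration_min),
    }

row = table_row(StageCounts("N2/N3", 40.0, 120, 200, 60))
print(row)
```

Reporting densities alongside raw counts allows comparison across subjects and stages with different durations.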

      (2) Show ERPs for all detected SOs and spindles (per sleep stage).

      Thank you for the suggestion. We will provide ERPs for all detected SOs and spindles, separated by sleep stage (N1, N2&N3, and REM) in supplementary Fig. S2-S4. These ERP waveforms will help illustrate the characteristic temporal profiles of SOs and spindles across different sleep stages.

      Methods, Page 25, Line 525-532

      “Event-related potentials (ERP) analysis. After completing the detection of each sleep rhythm event, we performed ERP analyses for SOs, spindles, and coupling events in different sleep stages. Specifically, for SO events, we took the trough of the DOWN-state of each SO as the zero-time point, then extracted data in a [-2 s to 2 s] window from the broadband (0.1–30 Hz) EEG and used [-2 s to -0.5 s] for baseline correction; the results were then averaged across 107 subjects (see Fig. S2a). For spindle events, we used the peak of each spindle as the zero-time point and applied the same data extraction window and baseline correction before averaging across 107 subjects (see Fig. S2b). Finally, for SO-spindle coupling events, we followed the same procedure used for SO events (see Fig. 2a, Figs. S3–S4).”
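
The epoching and baseline-correction steps described above can be sketched as follows for a single channel of one subject. This is a simplified illustration under stated assumptions, not our analysis code: the function names, the toy signal, and the 10 Hz sampling rate are hypothetical, and events too close to the recording edges are simply skipped.

```python
def extract_epoch(signal, center, sfreq, tmin=-2.0, tmax=2.0, base=(-2.0, -0.5)):
    """Cut a [tmin, tmax] s window around sample `center` and subtract the
    mean of the baseline interval `base` (seconds relative to the event)."""
    start = center + round(tmin * sfreq)
    stop = center + round(tmax * sfreq) + 1
    if start < 0 or stop > len(signal):
        return None  # window runs past the recording edges
    epoch = signal[start:stop]
    b0 = round((base[0] - tmin) * sfreq)
    b1 = round((base[1] - tmin) * sfreq) + 1
    baseline = sum(epoch[b0:b1]) / (b1 - b0)
    return [v - baseline for v in epoch]

def average_erp(signal, event_samples, sfreq):
    """Average baseline-corrected epochs over all usable events."""
    epochs = [extract_epoch(signal, c, sfreq) for c in event_samples]
    epochs = [e for e in epochs if e is not None]
    return [sum(col) / len(epochs) for col in zip(*epochs)]

# Toy signal: flat at 5.0 with a sharp trough at each event sample.
sfreq = 10
signal = [5.0] * 100
events = [5, 30, 60]          # the first event is too close to the start
for e in events:
    signal[e] = -5.0

erp = average_erp(signal, events, sfreq)
print(len(erp), erp[20])
```

In the analysis described above, the zero-time point would be the SO trough or the spindle peak, and the resulting per-subject waveforms would then be averaged across the 107 subjects.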

      (3) Provide detailed info concerning sleep characteristics (time spent in each sleep stage etc.).

      Thank you for this suggestion. As in our response above, we will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics.

      Supplementary Materials, Page 42, Table S1 (same as above)

      (4) What would happen if more stringent parameters were used for event detection? Would the authors still observe a significant number of SO spindles during N1 and REM? Would this affect the fMRI-related results?

      Thank you for this suggestion. Our methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).

      Furthermore, to explore how the detection parameters affect our fMRI results, we conducted an additional sensitivity analysis with different detection parameters for SOs. Specifically, we adjusted the amplitude percentile threshold for SO detection (the parameter with the greatest impact on the results). Using hippocampal activation during N2&N3 SO-spindle coupling as the reference value, we found that as the threshold became gradually stricter, the results were at first similar to or even stronger than the current results. However, as we continued to raise the threshold, the effect gradually weakened until, once the threshold was raised to 80%, the results were no longer significant. This indicates that our results are robust within a specific range of parameters, but as the threshold increases, the number of trials decreases, ultimately weakening the statistical power of the fMRI analysis.

      Thank you again for your suggestions on sleep rhythm event detection. We will add the results in Supplementary and revise our manuscript accordingly.

      Results, Page 11, Line 199-208

      “Spindles were correlated with positive activation in the thalamus (ROI analysis, t(106) = 15.39, p < 1e-4), the anterior cingulate cortex (ACC), and the putamen, alongside deactivation in the DMN (Fig. 3c). Notably, SO-spindle coupling was linked to significant activation in both the thalamus (ROI analysis, t(106) = 3.38, p = 0.0005) and the hippocampus (ROI analysis, t(106) = 2.50, p = 0.0070, Fig. 3d). However, no decrease in DMN activity was found during SO-spindle coupling, and DMN activity during SO was significantly lower than during coupling (ROI analysis, t(106) = -4.17, p < 1e-4). For more detailed activation patterns, see Table S5-S7. We also varied the threshold used to detect SO events to assess its effect on hippocampal activation during SO-spindle coupling and observed that hippocampal activation remained significant when the percentile thresholds for SO detection ranged between 71% and 80% (see Fig. S6).”

      Finally, we sincerely thank you all again for your thoughtful and constructive feedback. Your insights have been invaluable in refining our analyses, strengthening our interpretations, and improving the clarity and rigor of our manuscript. We appreciate the time and effort you have dedicated to reviewing our work, and we are grateful for the opportunity to improve our study based on your recommendations.

      References:

      Bergmann, T. O., Mölle, M., Diedrichs, J., Born, J., & Siebner, H. R. (2012). Sleep spindle-related reactivation of category-specific cortical regions after learning face-scene associations. NeuroImage, 59(3), 2733-2742. 

      Buzsáki, G. (2015). Hippocampal sharp wave‐ripple: A cognitive biomarker for episodic memory and planning. Hippocampus, 25(10), 1073-1188. 

      Caporro, M., Haneef, Z., Yeh, H. J., Lenartowicz, A., Buttinelli, C., Parvizi, J., & Stern, J. M. (2012). Functional MRI of sleep spindles and K-complexes. Clinical neurophysiology, 123(2), 303-309. 

      Coulon, P., Budde, T., & Pape, H.-C. (2012). The sleep relay—the role of the thalamus in central and decentral sleep regulation. Pflügers Archiv-European Journal of Physiology, 463, 53-71. 

      Crunelli, V., Lőrincz, M. L., Connelly, W. M., David, F., Hughes, S. W., Lambert, R. C., Leresche, N., & Errington, A. C. (2018). Dual function of thalamic low-vigilance state oscillations: rhythm-regulation and plasticity. Nature Reviews Neuroscience, 19(2), 107-118. 

      Czisch, M., Wehrle, R., Stiegler, A., Peters, H., Andrade, K., Holsboer, F., & Sämann, P. G. (2009). Acoustic oddball during NREM sleep: a combined EEG/fMRI study. PloS one, 4(8), e6749. 

      Diba, K., & Buzsáki, G. (2007). Forward and reverse hippocampal place-cell sequences during ripples. Nature Neuroscience, 10(10), 1241. 

      Diekelmann, S., & Born, J. (2010). The memory function of sleep. Nature Reviews Neuroscience, 11(2), 114-126. 

      Fogel, S., Albouy, G., King, B. R., Lungu, O., Vien, C., Bore, A., Pinsard, B., Benali, H., Carrier, J., & Doyon, J. (2017). Reactivation or transformation? Motor memory consolidation associated with cerebral activation time-locked to sleep spindles. PloS one, 12(4), e0174755. 

      Hahn, M. A., Heib, D., Schabus, M., Hoedlmoser, K., & Helfrich, R. F. (2020). Slow oscillation-spindle coupling predicts enhanced memory formation from childhood to adolescence. eLife, 9, e53730.

      Halassa, M. M., Siegle, J. H., Ritt, J. T., Ting, J. T., Feng, G., & Moore, C. I. (2011). Selective optical drive of thalamic reticular nucleus generates thalamic bursts and cortical spindles. Nature Neuroscience, 14(9), 1118-1120. 

      Hale, J. R., White, T. P., Mayhew, S. D., Wilson, R. S., Rollings, D. T., Khalsa, S., Arvanitis, T. N., & Bagshaw, A. P. (2016). Altered thalamocortical and intra-thalamic functional connectivity during light sleep compared with wake. NeuroImage, 125, 657-667. 

      Helfrich, R. F., Lendner, J. D., Mander, B. A., Guillen, H., Paff, M., Mnatsakanyan, L., Vadera, S., Walker, M. P., Lin, J. J., & Knight, R. T. (2019). Bidirectional prefrontal-hippocampal dynamics organize information transfer during sleep in humans. Nature Communications, 10(1), 3572. 

      Helfrich, R. F., Mander, B. A., Jagust, W. J., Knight, R. T., & Walker, M. P. (2018). Old brains come uncoupled in sleep: slow wave-spindle synchrony, brain atrophy, and forgetting. Neuron, 97(1), 221-230. e224. 

      Horovitz, S. G., Fukunaga, M., de Zwart, J. A., van Gelderen, P., Fulton, S. C., Balkin, T. J., & Duyn, J. H. (2008). Low frequency BOLD fluctuations during resting wakefulness and light sleep: A simultaneous EEG‐fMRI study. Human brain mapping, 29(6), 671-682. 

      Huang, Q., Xiao, Z., Yu, Q., Luo, Y., Xu, J., Qu, Y., Dolan, R., Behrens, T., & Liu, Y. (2024). Replay-triggered brain-wide activation in humans. Nature Communications, 15(1), 7185. 

      Ilhan-Bayrakcı, M., Cabral-Calderin, Y., Bergmann, T. O., Tüscher, O., & Stroh, A. (2022). Individual slow wave events give rise to macroscopic fMRI signatures and drive the strength of the BOLD signal in human resting-state EEG-fMRI recordings. Cerebral Cortex, 32(21), 4782-4796. 

      Laufs, H. (2008). Endogenous brain oscillations and related networks detected by surface EEG‐combined fMRI. Human brain mapping, 29(7), 762-769. 

      Laufs, H., Walker, M. C., & Lund, T. E. (2007). ‘Brain activation and hypothalamic functional connectivity during human non-rapid eye movement sleep: an EEG/fMRI study’—its limitations and an alternative approach. Brain, 130(7), e75. 

      Margulies, D. S., Ghosh, S. S., Goulas, A., Falkiewicz, M., Huntenburg, J. M., Langs, G., Bezgin, G., Eickhoff, S. B., Castellanos, F. X., & Petrides, M. (2016). Situating the default-mode network along a principal gradient of macroscale cortical organization. Proceedings of the National Academy of Sciences, 113(44), 12574-12579. 

      Massimini, M., Huber, R., Ferrarelli, F., Hill, S., & Tononi, G. (2004). The sleep slow oscillation as a traveling wave. Journal of Neuroscience, 24(31), 6862-6870. 

      Moehlman, T. M., de Zwart, J. A., Chappel-Farley, M. G., Liu, X., McClain, I. B., Chang, C., Mandelkow, H., Özbay, P. S., Johnson, N. L., & Bieber, R. E. (2019). All-night functional magnetic resonance imaging sleep studies. Journal of neuroscience methods, 316, 83-98. 

      Mölle, M., Bergmann, T. O., Marshall, L., & Born, J. (2011). Fast and slow spindles during the sleep slow oscillation: disparate coalescence and engagement in memory processing. Sleep, 34(10), 1411-1421.

      Ngo, H.-V., Fell, J., & Staresina, B. (2020). Sleep spindles mediate hippocampal-neocortical coupling during long-duration ripples. eLife, 9, e57011.

      Picchioni, D., Horovitz, S. G., Fukunaga, M., Carr, W. S., Meltzer, J. A., Balkin, T. J., Duyn, J. H., & Braun, A. R. (2011). Infraslow EEG oscillations organize large-scale cortical– subcortical interactions during sleep: a combined EEG/fMRI study. Brain research, 1374, 63-72. 

      Schabus, M., Dang-Vu, T. T., Albouy, G., Balteau, E., Boly, M., Carrier, J., Darsaud, A., Degueldre, C., Desseilles, M., & Gais, S. (2007). Hemodynamic cerebral correlates of sleep spindles during human non-rapid eye movement sleep. Proceedings of the National Academy of Sciences, 104(32), 13164-13169. 

      Schreiner, T., Kaufmann, E., Noachtar, S., Mehrkens, J.-H., & Staudigl, T. (2022). The human thalamus orchestrates neocortical oscillations during NREM sleep. Nature communications, 13(1), 5231. 

      Schreiner, T., Petzka, M., Staudigl, T., & Staresina, B. P. (2021). Endogenous memory reactivation during sleep in humans is clocked by slow oscillation-spindle complexes. Nature Communications, 12(1), 3112. 

      Singh, D., Norman, K. A., & Schapiro, A. C. (2022). A model of autonomous interactions between hippocampus and neocortex driving sleep-dependent memory consolidation. Proceedings of the National Academy of Sciences, 119(44), e2123432119. 

      Spoormaker, V. I., Schröter, M. S., Gleiser, P. M., Andrade, K. C., Dresler, M., Wehrle, R., Sämann, P. G., & Czisch, M. (2010). Development of a large-scale functional brain network during human non-rapid eye movement sleep. Journal of Neuroscience, 30(34), 11379-11387. 

      Staresina, B. P., Bergmann, T. O., Bonnefond, M., van der Meij, R., Jensen, O., Deuker, L., Elger, C. E., Axmacher, N., & Fell, J. (2015). Hierarchical nesting of slow oscillations, spindles and ripples in the human hippocampus during sleep. Nature Neuroscience, 18(11), 1679-1686. 

      Staresina, B. P., Niediek, J., Borger, V., Surges, R., & Mormann, F. (2023). How coupled slow oscillations, spindles and ripples coordinate neuronal processing and communication during human sleep. Nature Neuroscience, 1-9. 

      Yarkoni, T., Poldrack, R. A., Nichols, T. E., Van Essen, D. C., & Wager, T. D. (2011). Large-scale automated synthesis of human functional neuroimaging data. Nature methods, 8(8), 665-670. 

      Yeshurun, Y., Nguyen, M., & Hasson, U. (2021). The default mode network: where the idiosyncratic self meets the shared social world. Nature Reviews Neuroscience, 1-12.

    1. If you’re having an especially hard time listening to the thoughts inside your head, journaling can be a great way of working through and evaluating those emotions

      It would be incredibly difficult to be open to even listening to your friends vent about their day if you can't calm the storm in your head enough to hear your own thoughts! This is a great example of why it's so important to take care of yourself so you have the capacity to be there for your community.

    1. Author response:

      We thank the reviewers for the valuable and constructive reviews. Thanks to these, we believe the article will be considerably improved. We have organized our response to address points that are relevant to both reviewers first, after which we address the unique concerns of each individual reviewer separately. We briefly paraphrase each concern and provide comments for clarification, outlining the precise changes that we will make to the text.

      Common Concerns (Reviewer 1 & Reviewer 2):

      Can you clarify how NREM and REM sleep relate to the oneirogen hypothesis?

      Within the submission draft we tried to stay agnostic as to whether mechanistically similar replay events occur during NREM or REM sleep; however, upon a more thorough literature review, we think that there is moderately greater evidence in favor of Wake-Sleep-type replay occurring during REM sleep, which is related to classical psychedelic drug mechanisms of action.

      First, we should clarify that replay has been observed during both REM and NREM sleep, and dreams have been documented during both sleep stages, though the characteristics of dreams differ across stages, with NREM dreams being more closely tied to recent episodic experience and REM dreams being more bizarre/hallucinatory (see Stickgold et al., 2001 for a review). Replay during sleep has been studied most thoroughly during NREM sharp-wave ripple events, in which significant cortical-hippocampal coupling has been observed (Ji & Wilson, 2007). However, it is critical to note that the quantification methods used to identify replay events in the hippocampal literature usually focus on identifying what we term ‘episodic replay,’ which involves a near-identical recapitulation of neural trajectories that were recently experienced during waking experimental recordings (Tingley & Peyrache, 2020). In contrast, our model focuses on ‘generative replay,’ where one expects only a statistically similar reproduction of neural activity, without any particular bias towards recent or experimentally controlled experience. This latter form of replay may look closer to the ‘reactivation’ observed in cortex by many studies (e.g. Nguyen et al., 2024), where correlation structures of neural activity similar to those observed during stimulus-driven experience are recapitulated. Under experimental conditions in which an animal is experiencing highly stereotyped activity repeatedly, over extended periods of time, these two forms of replay may be difficult to dissociate.

      Interestingly, though NREM replay has been shown to couple hippocampal and cortical activity, a similar study in waking animals administered psychedelics found hippocampal replay without any obvious coupling to cortical activity (Domenico et al., 2021). This could be because the coupling was not strong enough to produce full trajectories in the cortex (psychedelic administration did not increase ‘alpha’ enough), in which case a causal manipulation of apical/basal influence in the cortex may be necessary to observe the increased coupling. Alternatively, as Reviewer 1 noted, it may be that psychedelics induce a form of hippocampus-decoupled replay, as one would expect from the REM stage of a recently proposed complementary learning systems model (Singh et al., 2022).

      Evidence in favor of a similarity between the mechanism of action of classical psychedelics and the mechanism of action of memory consolidation/learning during REM sleep is actually quite strong. In particular, studies have shown that REM sleep increases the activity of soma-targeting parvalbumin (PV) interneurons and decreases the activity of apical dendrite-targeting somatostatin (SOM) interneurons (Niethard et al., 2021), that this shift in balance is controlled by higher-order thalamic nuclei, and that this shift in balance is critical both for synaptic consolidation of monocular deprivation effects in early visual cortex (Zhou et al., 2020) and for the consolidation of auditory fear conditioning in the dorsal prefrontal cortex (Aime et al., 2022). These last studies were not discussed in the present manuscript; we will add them, in addition to a more nuanced description of the evidence connecting our model to NREM and REM replay.

      Can you explain how synaptic plasticity induced by psychedelics within your model relates to learning at a behavioral level?

      While the Wake-Sleep algorithm is a useful model for unsupervised statistical learning, it is not a model of reward or fear-based conditioning, which likely occur via different mechanisms in the brain (e.g. dopamine-dependent reinforcement learning or serotonin-dependent emotional learning). The Wake-Sleep algorithm is a ‘normative plasticity algorithm’ that connects synaptic plasticity to the formation of structured neural representations, but it is not the case that all synaptic plasticity induced by psychedelic administration within our model should induce beneficial learning effects. According to the Wake-Sleep algorithm, plasticity at apical synapses is enhanced during the Wake phase, and plasticity at basal synapses is enhanced during the Sleep phase; under the oneirogen hypothesis, hallucinatory conditions (increased ‘alpha’) cause an increase in plasticity at both apical and basal sites. Because neural activity is in a fundamentally aberrant state when ‘alpha’ is increased, there are no theoretical guarantees that plasticity will improve performance on any objective: psychedelic-induced plasticity within our model could perhaps better be thought of as ‘noise’ that may have a positive or negative effect depending on the context.

      In particular, such ‘noise’ may be beneficial for individuals or networks whose synapses have become locked in a suboptimal local minimum. The addition of large amounts of random plasticity could allow a system to extricate itself from such local minima over subsequent learning (or with careful selection of stimuli during psychedelic experience), similar to simulated annealing optimization approaches. If our model were fully validated, this view of psychedelic-induced plasticity as ‘noise’ could have relevance for efforts to alleviate the adverse effects of PTSD, early life trauma, or sensory deprivation; it may also provide a cautionary note against repeated use of psychedelic drugs within a short time frame, as the plasticity changes induced by psychedelic administration under our model are not guaranteed to be good or useful in and of themselves without subsequent re-learning and compensation.
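
      The simulated annealing analogy can be made concrete with a minimal sketch. The one-dimensional `energy` function, the cooling schedule, and all parameter values below are hypothetical choices made only for illustration; they are not taken from our model. The point is simply that random perturbations, accepted with a temperature-dependent probability, give a search process a chance to leave a shallow minimum.

```python
import math
import random


def energy(x):
    """Hypothetical 1-D loss: shallow minimum near x = +1, deeper one near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def anneal(x0, steps=5000, t_start=1.0, t_end=0.01, step_size=0.5, seed=0):
    """Metropolis-style simulated annealing; returns the best point visited."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        candidate = x + rng.uniform(-step_size, step_size)
        e_new = energy(candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp((e - e_new) / t):
            x, e = candidate, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


start = 1.0  # begin in the basin of the shallower (suboptimal) minimum
best_x, best_e = anneal(start)
print(f"start energy {energy(start):.3f}, best energy {best_e:.3f} at x = {best_x:.3f}")
```

      Because the routine tracks the best state visited, the returned energy can never exceed the starting energy; whether the deeper minimum is actually reached depends on the noise level and schedule, which mirrors the caveat that added ‘noise’ is not guaranteed to help.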

      We should also note that we have deliberately avoided connecting the oneirogen hypothesis model to fear extinction experimental results that have been observed through recordings of the hippocampus or the amygdala (Bombardi & Giovanni, 2013; Jiang et al., 2009; Kelly et al., 2024; Tiwari et al., 2024). Both regions receive extensive innervation directly from serotonergic synapses originating in the dorsal raphe nucleus, which have been shown to play an important role in emotional learning (Lesch & Waider, 2012); because classical psychedelics may play a more direct role in modulating this serotonergic innervation, it is possible that fear conditioning results (in addition to the anxiolytic effects of psychedelics) cannot be attributed to a shift in balance between apical and basal synapses induced by psychedelic administration. We will provide a more detailed review of these results in the text, as well as more clarity regarding their relation to our model.

      Reviewer 1 Concerns:

      Is it reasonable to assign a scalar parameter ‘alpha’ to the effects of classical psychedelics? And is your proposed mechanism of action unique to classical psychedelics? E.g. Could this idea also apply to kappa opioid agonists, ketamine, or the neural mechanisms of hallucination disorders?

      We will clarify that within our model ‘alpha’ is a parameter that reflects the balance between apical and basal synapses in determining the activity of neurons in the network. For the sake of simplicity we used a single ‘alpha’ parameter, but realistically, each neuron would have its own ‘alpha’ parameter, and different layers or individual neurons could be affected differentially by the administration of any particular drug; therefore, our scalar ‘alpha’ value can be thought of as a mean parameter for all neurons, disregarding heterogeneity across individual neurons.
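
      To illustrate what a scalar ‘alpha’ summarizes, the sketch below blends basal and apical drive per neuron and compares heterogeneous per-neuron values with a single population mean. The convex-combination form in `mix`, the input values, and the per-neuron ‘alpha’ values are assumptions made for this example and should not be read as the exact equations of our model.

```python
def mix(basal, apical, alpha):
    """Blend basal (bottom-up) and apical (top-down) drive for one neuron.

    alpha = 0 -> purely basal drive; alpha = 1 -> purely apical drive.
    The convex-combination form is an assumption made for illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * basal + alpha * apical


# Hypothetical inputs for three neurons.
basal_drive = [1.0, 0.5, -0.2]
apical_drive = [0.0, 1.0, 0.8]

# Heterogeneous case: each neuron has its own alpha.
per_neuron_alpha = [0.1, 0.3, 0.5]
heterogeneous = [
    mix(b, a, al) for b, a, al in zip(basal_drive, apical_drive, per_neuron_alpha)
]

# Simplified case: a single scalar alpha equal to the population mean.
mean_alpha = sum(per_neuron_alpha) / len(per_neuron_alpha)
homogeneous = [mix(b, a, mean_alpha) for b, a in zip(basal_drive, apical_drive)]

print("per-neuron alpha:", heterogeneous)
print("scalar mean alpha:", homogeneous)
```

      The two outputs differ neuron by neuron, which is the heterogeneity that a single mean parameter disregards.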

      There are many different mechanisms that could theoretically affect this ‘alpha’ parameter, including: 5-HT2a receptor agonism, kappa opioid receptor binding, ketamine administration, or possibly the effects of genetic mutations underlying the pathophysiology of complex developmental hallucination disorders. We focused exclusively on 5-HT2a receptor agonism for this study because the mechanism is comparatively simple and extensively characterized, but similar mechanisms may well be responsible for the hallucinatory symptoms of a variety of drugs and disorders.

      Can you clarify the role of 5-HT2a receptor expression on interneurons within your model?

      While we mostly focused on the effects of 5-HT2a receptors on the apical dendrites of pyramidal neurons, these receptors are also expressed on soma-targeting parvalbumin (PV) interneurons. This expression on PV interneurons is consistent with our proposed psychedelic mechanism of action, because it could lead to a coordinated decrease in the influence of somatic and proximal dendritic inputs while increasing the influence of apical dendritic inputs. We will elaborate on this point, and will move the discussion earlier in the text.

      Discussions of indigenous use of psychedelics over millennia may amount to over-romanticization.

      We will take great care to conduct a more thorough literature review to reevaluate our statement regarding indigenous psychedelic use (including the citations you suggested), and will either provide a more careful statement or remove this discussion from our introduction entirely, as it has little bearing on the rest of the text. The Ethics Statement will also be modified accordingly.

      You isolate the 5-HT2a agonism as the mechanism of action underlying ‘alpha’ in your model, but there exist 5-HT2a agonists that do not have hallucinatory effects (e.g. lisuride). How do you explain this?

      Lisuride has much-reduced hallucinatory effects compared to other psychedelic drugs at clinical doses (though it does indeed induce hallucinations at high doses; Marona-Lewicka et al., 2002), and we should note that serotonin (5-HT) itself is pervasive in the cortex without inducing hallucinatory effects during natural function. Similarly, MDMA is a partial agonist for 5-HT2a receptors, but it has much-reduced perceptual hallucination effects relative to classical psychedelics (Green et al., 2003) in addition to many other effects not induced by classical psychedelics.

      Therefore, while we argue that 5-HT2a agonism induces an increase in influence of apical dendritic compartments and a decrease in influence of basal/somatic compartments, and that this change induces hallucinations, we also note that there are many other factors that control whether or not hallucinations are ultimately produced, so that not all 5-HT2a agonists are hallucinogenic. We will discuss two such factors in our revision: 5-HT receptor binding affinity and cellular membrane permeability.

      Importantly, many 5-HT2a receptor agonists are also 5-HT1a receptor agonists (e.g. serotonin itself and lisuride), while MDMA has also been shown to increase serotonin, norepinephrine, and dopamine release (Green et al., 2003). While 5-HT2a receptor agonism has been shown to reduce sensory stimulus responses (Michaiel et al., 2019), 5-HT1a receptor agonism inhibits spontaneous cortical activity (Azimi et al., 2020); thus one might expect the net effect of administering serotonin or a nonselective 5-HT receptor agonist to be widespread inhibition of a circuit, as has been observed in visual cortex (Azimi et al., 2020). Therefore, selective 5-HT2a agonism is critical for the induction of hallucinations according to our model, though any intervention that jointly excites pyramidal neurons’ apical dendrites and inhibits their basal/somatic compartments across a broad enough area of cortex would be predicted to have a similar effect. Lisuride has a much higher binding affinity for 5-HT1a receptors than, for instance, LSD (Marona-Lewicka et al., 2002).

      Secondly, it has recently been shown that both the head-twitch effect (a coarse behavioral readout of hallucinations in animals) and the plasticity effects of psychedelics are abolished when administering 5-HT2a agonists that are impermeable to the cellular membrane because of high polarity, and that these effects can be rescued by temporarily rendering the cellular membrane permeable (Vargas et al., 2023). This suggests that the critical hallucinatory effects of psychedelics (apical excitation according to our model) may be mediated by intracellular 5-HT2a receptors. Notably, serotonin itself is not membrane permeable in the cortex.

      Therefore, either of these two properties could play a role in whether a given 5-HT2a agonist induces hallucinatory effects. We will provide a considerably extended discussion of these nuances in our revision.

      Your model proposes that an increase in top-down influence on neural activity underlies the hallucinatory effects of psychedelics. How do you explain experimental results that show increases in bottom-up functional connectivity (either from early sensory areas or the thalamus)?

      Firstly, we should note that our proposed increase in top-down influence is a causal, biophysical property, not necessarily a statistical/correlative one. As such, we will stress that the best way to test our model is via direct intervention in cortical microcircuitry, as opposed to correlative approaches taken by most fMRI studies, which have shown mixed results with regard to this particular question. Correlative approaches can be misleading due to dense recurrent coupling in the system, and due to the coarse temporal and spatial resolution provided by noninvasive recording technologies (changes in statistical/functional connectivity do not necessarily correspond to changes in causal/mechanistic connectivity, i.e. correlation does not imply causation).
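
      A toy simulation can illustrate why statistical coupling need not reflect causal coupling. In the sketch below, two hypothetical areas, A and B, receive a shared upstream drive but have no direct connection; the area names, noise levels, and sample size are arbitrary assumptions. The two traces are strongly correlated, yet silencing A (a direct intervention) leaves B's activity exactly unchanged.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, silence_a=False, seed=0):
    """Two areas driven by a shared source, with no A -> B connection."""
    rng = random.Random(seed)
    a_trace, b_trace = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)    # common upstream drive
        noise_a = rng.gauss(0.0, 0.5)   # always drawn so both runs use the same samples
        noise_b = rng.gauss(0.0, 0.5)
        a_trace.append(0.0 if silence_a else shared + noise_a)
        b_trace.append(shared + noise_b)  # B never reads from A
    return a_trace, b_trace


a, b = simulate()
_, b_after = simulate(silence_a=True)
print("corr(A, B):", round(pearson(a, b), 3))
print("B unchanged by silencing A:", b == b_after)
```

      A correlative analysis of `a` and `b` alone would suggest strong coupling between the areas, whereas the intervention reveals that A has no influence on B.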

      Two experimental results that appear to contradict our hypothesis deserve special consideration in our revision. The first shows an increase in directional thalamic influence on the distributed cortical networks after psychedelic administration (Preller et al., 2018). To explain this, we note that this study does not distinguish between lower-order sensory thalamic nuclei (e.g. the lateral and medial geniculate nuclei receiving visual and auditory stimuli respectively) and the higher-order thalamic nuclei that participate in thalamocortical connectivity loops (Whyte et al., 2024). Subsequent more fine-grained studies have noted an increase in influence of higher-order thalamic nuclei on the cortex (Pizzi et al., 2023; Gaddis et al., 2022), and in fact extensive causal intervention research has shown that classical psychedelics (and 5-HT2a agonism) decrease the influence of incoming sensory stimuli on the activity of early sensory cortical areas, indicating decoupling from the sensory thalamus (Evarts et al., 1955; Azimi et al., 2020; Michaiel et al., 2019). The increased influence of higher-order thalamic nuclei is consistent with both the cortico-striatal-thalamo-cortical (CSTC) model of psychedelic action and the oneirogen hypothesis, since higher-order thalamic inputs modulate the apical dendrites of pyramidal neurons in cortex (Whyte et al., 2024).

      The second experimental result notes that DMT induces traveling waves during resting state activity that propagate from early visual cortex to deeper cortical layers (Alamia et al., 2020). There are several possibilities that could explain this phenomenon: 1) it could be due to the aforementioned difficulties associated with directed functional connectivity analyses, 2) it could be due to a possible high binding affinity for DMT in the visual cortex relative to other brain areas, or 3) it could be due to increases in apical influence on activity caused by local recurrent connectivity within the visual cortex which, in the absence of sensory input, could lead to propagation of neural activity from the visual cortex to the rest of the brain. This last possibility is closest to the model proposed by Ermentrout & Cowan (1979), and we believe it would be best explained within our framework by a topographically connected recurrent network architecture trained on video data, a potentially fruitful direction for future research.

      Shouldn’t the hallucinations generated by your model look more ‘psychedelic,’ like those produced by the DeepDream algorithm?

      We believe that the differences in hallucination visualization quality between our algorithm and DeepDream are mostly due to differences in the scale and power of the models used across these two studies. We are confident that with more resources (and potentially theoretical innovations to improve the Wake-Sleep algorithm’s performance) the produced hallucination visualizations could become more realistic, but we believe this falls outside the scope of the present study.

      We note that more powerful generative models trained with backpropagation are able to produce surreal images of comparable quality (Rezende et al., 2014; Goodfellow et al., 2020; Vahdat & Kautz, 2020), though these have not yet been used as a model of psychedelic hallucinations. However, the DeepDream model operates on top of large pretrained image processing models, and does not provide a biologically mechanistic/testable interpretation of its hallucination effects. When training smaller models with a local synaptic plasticity rule (as opposed to backpropagation), the hallucination effects are less visually striking due to the reduced quality of our trained generative model, though they are still strongly tied to the statistics of sensory inputs, as quantified by our correlation similarity metric (Fig. 5b). We will provide a more detailed explanation of this phenomenon when we discuss our model limitations in our revised manuscript.

      Your model assumes domination by entirely bottom-up activity during the ‘wake’ phase, and domination entirely by top-down activity during ‘sleep,’ despite experimental evidence indicating that a mixture of top-down and bottom-up inputs influence neural activity during both stages in the brain. How do you explain this?

      Our use of the Wake-Sleep algorithm, in which top-down inputs (Sleep) or bottom-up inputs (Wake) dominate network activity, is an over-simplification made within our model for computational and theoretical reasons. Models that receive a mixture of top-down and bottom-up inputs during ‘Wake’ activity do exist (in particular the closely related Boltzmann machine (Ackley et al., 1985)), but these models are considerably more computationally costly to train due to a need to run extensive recurrent network relaxation dynamics for each input stimulus. Further, these models do not generalize as cleanly to processing temporal inputs. For this reason, we focused on the Wake-Sleep algorithm, at the cost of some biological realism, though we note that our model should certainly be extended to support mixed apical-basal waking regimes. We will make sure to discuss this in our ‘Model Limitations’ section.

      Your model proposes that 5-HT2a agonism enhances glutamatergic transmission, but this is not true in the hippocampus, which shows decreases in glutamate after psychedelic administration.

      We should note that our model suggests only compartment-specific increases in glutamatergic transmission; as such, our model does not predict any particular directionality for measures of glutamatergic transmission that include signaling at both apical and basal compartments in aggregate, as was measured in the provided study (Mason et al., 2020).

      You claim that your model is consistent with the Entropic Brain theory, but you report increases in variance, not entropy. In fact, it has been shown that variance decreases while entropy increases under psychedelic administration. How do you explain this discrepancy?

      Unfortunately, ‘entropy’ and ‘variance’ are heavily overloaded terms in the noninvasive imaging literature, and the particularities of the method employed can exert a strong influence on the reported effects. The reduction in variance reported by Carhart-Harris et al. (2016) is a very particular measure: they are reporting the variance of resting state synchronous activity, averaged across a functional subnetwork that spans many voxels; as such, the reduction in variance in this case is a reduction in broad, synchronous activity. We do not have any resting state synchronous activity in our network due to the simplified nature of our model (particularly an absence of recurrent temporal dynamics), so we see no reduction in variance in our model due to these effects.
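
      To show how two ‘variance’ measures can move in opposite directions on the same data, the sketch below contrasts the variance of a subnetwork-averaged signal with the variance that remains once that shared signal is subtracted from each unit. The two-unit toy network, its amplitudes, and the function names are hypothetical and chosen so the arithmetic is exact; this is not the analysis pipeline of any cited study.

```python
from statistics import pvariance


def network_mean_variance(units):
    """Variance over time of the signal averaged across units (shared activity)."""
    n_units = len(units)
    mean_trace = [sum(col) / n_units for col in zip(*units)]
    return pvariance(mean_trace)


def residual_variance(units):
    """Variance of unit activity after subtracting the across-unit mean at each time."""
    n_units = len(units)
    mean_trace = [sum(col) / n_units for col in zip(*units)]
    residuals = [x - m for unit in units for x, m in zip(unit, mean_trace)]
    return pvariance(residuals)


def toy_network(shared_amp, private_amp):
    """Two hypothetical units: a shared oscillation plus opposite-signed private activity."""
    shared = [shared_amp, -shared_amp] * 4
    private = [private_amp, private_amp, -private_amp, -private_amp] * 2
    unit_a = [s + p for s, p in zip(shared, private)]
    unit_b = [s - p for s, p in zip(shared, private)]
    return [unit_a, unit_b]


baseline = toy_network(shared_amp=2.0, private_amp=1.0)
altered = toy_network(shared_amp=1.0, private_amp=2.0)

for name, net in [("baseline", baseline), ("altered", altered)]:
    print(name, network_mean_variance(net), residual_variance(net))
```

      In this toy case, weakening the shared oscillation while strengthening unit-specific activity lowers the first measure and raises the second, so a reported ‘decrease in variance’ and a reported ‘increase in variability’ need not conflict.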

      Other studies estimate ‘entropy’ or network state disorder via three different methods that we have been able to identify. 1) Carhart-Harris et al. (2014) use a different measure of variance: in this case, they subtract out synchronous activity within functional subnetworks, and calculate variability across units in the network. This measure reports increases in variance (Fig. 6), and is the closest measure to the one we employ in this study. 2) Lebedev et al. (2016) use sample entropy, which is a measure of temporal sequence predictability. It is specifically designed to disregard highly predictable signals, and so one might imagine that it is a measure that is robust to shared synchronous activity (e.g. resting state oscillations). 3) Mediano et al. (2024) use Lempel-Ziv complexity, which is, similar to sample entropy, a measure of sequence diversity; in this case the signal is binarized before calculation, which makes this method considerably different from ours. All three of the preceding methods report increases in sequence diversity, in agreement with our quantification method. Our strongest explanation for why the variance calculation in Carhart-Harris et al. (2016) produces a variance reduction is therefore a reduction in low-rank synchronous activity in subnetworks during resting state.
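
      As a rough illustration of what a Lempel-Ziv-style diversity measure computes on a binarized signal, the sketch below thresholds a signal and counts the distinct phrases found by a simple dictionary parse. Two choices here are assumptions made for brevity: thresholding at the mean, and an LZ78-style parse swapped in for whichever Lempel-Ziv variant a given study uses. The example signal is hypothetical.

```python
def binarize(signal):
    """Threshold a real-valued signal at its mean, giving a string of '0'/'1'."""
    mean = sum(signal) / len(signal)
    return "".join("1" if x > mean else "0" for x in signal)


def lz_phrase_count(bits):
    """Count phrases in a simple LZ78-style parse (more phrases = more diverse)."""
    seen = set()
    phrase = ""
    for bit in bits:
        phrase += bit
        if phrase not in seen:
            seen.add(phrase)
            phrase = ""
    # A trailing, already-seen phrase still counts as one more parsed unit.
    return len(seen) + (1 if phrase else 0)


flat = "0" * 8
irregular = binarize([0.1, 0.9, 0.8, 0.2, 0.7, 0.3, 0.1, 0.6])

print(flat, lz_phrase_count(flat))
print(irregular, lz_phrase_count(irregular))
```

      A constant sequence parses into fewer phrases than an irregular one of the same length. Note that binarization discards amplitude information, which is one reason such a measure differs considerably from a variance-based one.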

      As for whether the entropy increase is meaningful: we share Reviewer 1’s concern that increases in entropy could simply be due to a higher degree of cognitive engagement during resting state recordings, due to the presence of sensory hallucinations or due to an inability to fall asleep. This could explain why entropy increases are much more minimal relative to non-hallucinating conditions during audiovisual task performance (Siegel et al., 2024; Mediano et al., 2024). However, we can say that our model is consistent with the Entropic Brain Theory without including any form of ‘cognitive processing’: we observe increases in variability during resting state in our model, but we observe highly similar distributions of activity when averaging over a wide variety of sensory stimulus presentations (Fig. 5b-c). This is because variability in our model is not due to unstructured noise: it corresponds to an exploration of network states that would ordinarily be visited by some stimulus. Therefore, when averaging across a wide variety of stimuli, the distribution of network states under hallucinating or non-hallucinating conditions should be highly similar.

      One final point of clarification: here we are distinguishing Entropic Brain Theory (EBT) from the REBUS model. The oneirogen hypothesis is consistent with the increase in entropy observed experimentally, but in our model this entropy increase is not due to increased influence of bottom-up inputs (it is due instead to an increase in top-down influence). Therefore, one could view the oneirogen hypothesis as consistent with EBT, but inconsistent with REBUS.

      You relate your plasticity rule to behavioral-timescale plasticity (BTSP) in the hippocampus, but plasticity has been shown to be reduced in the hippocampus after psychedelic administration. Could you elaborate on this connection?

      When we were establishing a connection between our ‘Wake-Sleep’ plasticity rule and BTSP learning, the intended connection was exclusively to the mathematical form of the plasticity rule, in which activity in the apical dendrites of pyramidal neurons functions as an instructive signal for plasticity in basal synapses (and vice versa): we will clarify this in the text. Similarly, we point out that such a plasticity rule tends to result in correlated tuning between apical and basal dendritic compartments, which has been observed in hippocampus and cortex: this is intended as a sanity check of our mapping of the Wake-Sleep algorithm to cortical microcircuitry, and has limited further bearing on the effects of psychedelics specifically.

      Reduction in plasticity in the hippocampus after psychedelic administration could be due to a complementary learning systems-type model, in which the hippocampus becomes partly decoupled from the cortex during REM sleep (Singh et al., 2022); were this to be the case, it would not be incompatible with our model, which is mostly focused on the cortex. Notably, potentiating 5-HT2a receptors in the ventral hippocampus does not induce the head-twitch response, though it does produce anxiolytic effects (Tiwari et al., 2024), indicating that the hallucinatory and anxiolytic effects of classical psychedelics may be partly decoupled.

      Reviewer 2 Concerns:

      Could you provide visualizations of the ‘ripple’ phenomenon that you’re referring to?

      We will do this! For now, you can get a decent understanding of what the ‘ripple effect’ looks like from the ‘eyes closed’ hallucination condition for networks trained on CIFAR10 (Fig. 2d). The ripple effect that we are referring to is very similar, except it is superimposed on a naturalistic image under ordinary viewing conditions; to give a higher quality visualization of the ripple phenomenon itself, we will subtract out the static contribution of the image itself, leaving only the ripple phenomenon.

      Could you provide a more nuanced description of alternative roles for top-down feedback, beyond being used exclusively for learning as depicted in your model?

      For the sake of simplicity, we only treat top-down inputs in our model as a source of an instructive teaching signal, the originator of generative replay events during the Sleep phase, and as the mechanism of hallucination generation. However, as discussed in a response to a previous question, in the cortex pyramidal neurons receive and respond to a mixture of top-down and bottom-up processing.

      There are a variety of theories for what role top-down inputs could play in determining network activity. To name several, top-down input could function as: 1) a denoising/pattern completion signal (Kadkhodaie & Simoncelli, 2021), 2) a feedback control signal (Podlaski & Machens, 2020), 3) an attention signal (Lindsay, 2020), or 4) ordinary inputs for dynamic recurrent processing that play no specialized role distinct from bottom-up or lateral inputs except to provide inputs from higher-order association areas or other sensory modalities (Kar et al., 2019; Tugsbayar et al., 2025). Though our model does not include these features, they are perfectly consistent with our approach.

      In particular, denoising/pattern completion signals in the predictive coding framework (closely related to the Wake-Sleep algorithm) also play a role as an instructive learning signal (Salvatori et al., 2021); and top-down control signals can play a similar role in some models (Gilra & Gerstner, 2017; Meulemans et al., 2021). Thus, options 1 and 2 are heavily overlapping with our approach, and are a natural consequence of many biologically plausible learning algorithms that minimize a variational free energy loss (Rao & Ballard, 1997; Ackley et al., 1985). Similarly, top-down attentional signals can exist alongside top-down learning signals, and some models have argued that such signals can be heavily overlapping or mutually interchangeable (Roelfsema & van Ooyen, 2005). Lastly, generic recurrent connectivity (from any source) can be incorporated into the Wake-Sleep algorithm (Dayan & Hinton, 1996), though we avoided doing this in the present study due to an absence of empirical architecture exploration in the literature and the computational complexity associated with training on time series data.

      To conclude, there are a variety of alternative functions proposed for top-down inputs onto pyramidal neurons in the cortex, and we view these additional features as mutually compatible with our approach; for simplicity we did not include them in our model, but we believe that these features are unlikely to interfere with our testable predictions or empirical results.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Recent work has demonstrated that the hummingbird hawkmoth, Macroglossum stellatarum, like many other flying insects, uses ventrolateral optic flow cues for flight control. However, unlike other flying insects, the same stimulus presented in the dorsal visual field elicits a directional response. Bigge et al. use behavioral flight experiments to set these two pathways in conflict in order to understand whether these two pathways (ventrolateral and dorsal) work together to direct flight and if so, how. The authors characterize the visual environment (the amount of contrast and translational optic flow) of the hawkmoth and find that different regions of the visual field are matched to relevant visual cues in their natural environment and that the integration of the two pathways reflects a prioritization for generating behavior that supports hawkmoth safety rather than the prevalence of a particular visual cue in the environment.

      Strengths:

      This study creatively utilizes previous findings that the hawkmoth partitions their visual field as a way to examine parallel processing. The behavioral assay is well-established and the authors take the extra steps to characterize the visual ecology of the hawkmoth habitat to draw exciting conclusions about the hierarchy of each pathway as it contributes to flight control.

      Weaknesses:

      The work would be further clarified and strengthened by additional explanation included in the main text, figure legends, and methods that would permit the reader to draw their own conclusions more feasibly. It would be helpful to have all figure panels referenced in the text and referenced in order, as they are currently not. In addition, it seems that sometimes the incorrect figure panel is referenced in the text, Figure S2 is mislabeled with D-E instead of A-C and Table S1 is not referenced in the main text at all. Table S1 is extremely important for understanding the figures in the main text and eliminating acronyms here would support reader comprehension, especially as there is no legend provided for Table S1. For example, a reader that does not specialize in vision may not know that OF stands for optic flow. Further detail in figure legends would also support the reader in drawing their own conclusions. For example, dashed red lines in Figures 3 and 4 A and B are not described and the letters representing statistical significance could be further explained either in the figure legend or materials to help the reader draw their own conclusions.

      We appreciate the suggestions to improve the clarity of the manuscript. We have extensively restructured the entire manuscript. Among other changes, we have referenced all figure panels in the text in the order they appear. To do so, we combined the optic flow and contrast measurements of our setup with the methods description of the behavioural experiments (formerly Figs. 5 and 2, respectively). This new Fig. 2 now introduces the methods of the study, while the remainder of the former Fig. 2, which presented the experiments that investigated the ventrolateral and dorsal response in more detail, is now a separate figure (Fig. 3). This arrangement also better balances the amount of information contained in each figure.

      Reviewer #2 (Public review):

      Summary:

      Bigge and colleagues use a sophisticated free-flight setup to study visuo-motor responses elicited in different parts of the visual field in the hummingbird hawkmoth. Hawkmoths have been previously shown to rely on translational optic flow information for flight control exclusively in the ventral and lateral parts of their visual field. Dorsally presented patterns elicit a formerly completely unknown response: instead of using dorsal patterns to maintain straight flight paths, hawkmoths fly, more often, in a direction aligned with the main axis of the pattern presented (Bigge et al., 2021). Here, the authors go further and put ventral/lateral and dorsal visual cues into conflict. They found that the different visuomotor pathways act in parallel, and they identified a 'hierarchy': the avoidance of dorsal patterns had the strongest weight and optic flow-based speed regulation the lowest weight.

      Strengths:

      The data are very interesting, unique, and compelling. The manuscript provides a thorough analysis of free-flight behavior in a non-model organism that is extremely interesting for comparative reasons (and on its own). These data are both difficult to obtain and very valuable to the field.

      Weaknesses:

      While the present manuscript clearly goes beyond Bigge et al., 2021, the advance could perhaps have been even stronger with a more fine-grained investigation of the visual responses in the dorsal visual field. Do hawkmoths, for example, show optomotor responses to rotational optic flow in the dorsal visual field?

      We thank the reviewer for the feedback, and the suggestions for improvement of the manuscript (our implementations are detailed below). We fully agree that this study raises several intriguing questions regarding the dorsal visual response, including how the animals perceive and respond to rotational optic flow in their dorsal visual field, particularly since rotational optic flow may be processed separately from translational optic flow.

      In our free-flight setup, it was not possible to generate rotational optic flow in a controlled manner. To explore this aspect more systematically, a tethered-flight setup would be ideal, or alternatively, a free-flight setup integrated with virtual reality. This would be a compelling direction for a follow-up study.

      Reviewer #3 (Public review):

      The central goal of this paper as I understand it is to extract the "integration hierarchy" of stimulus in the dorsal and ventrolateral visual fields. The segregation of these responses is different from what is thought to occur in bees and flies and was established in the authors' prior work. Showing how the stimuli combine and are prioritized goes beyond the authors' prior conclusions that separated the response into two visual regions. The data presented do indeed support the hierarchy reported in Figure 5 and that is a nice summary of the authors' work. The moths respond to combinations of dorsal and lateral cues in a mixed way but also seem to strongly prioritize avoiding dorsal optic flow which the authors interpret as a closed and potentially dangerous ecological context for these animals. The authors use clever combinations of stimuli to put cues into conflict to reveal the response hierarchy.

      My most significant concern is that this hierarchy of stimulus responses might be limited to the specific parameters chosen in this study. Presumably, there are parameters of these stimuli that modulate the response (spatial frequency, different amounts of optic flow, contrast, color, etc). While I agree that the hierarchy in Figure 5 is consistent for the particular stimuli given, this may not extend to other parameter combinations of the same cues. For example, as the contrast of the dorsal stimuli is reduced, the inequality may shift. This does not preclude the authors' conclusions but it does mean that they may not generalize, even within this species. For example, other cue conflict studies have quantified the responses to ranges of the parameters (e.g. frequency) and shown that one cue might be prioritized or up-weighted in one frequency band but not in others. I could imagine ecological signatures of dorsal clutter and translational positioning cues could depend on the dynamic range of the optic flow, or even having spatial-temporal frequency-dependent integration independent of net optic flow.

      We absolutely agree that, in principle, an observed integration hierarchy is only valid for the stimuli tested. Yet, we do believe that we provide good evidence that our key observations are also robust for stimuli related to the ones tested:

      Most importantly, we found that both pathways act in parallel (and are not mutually exclusive, or winner-takes-all, for example), when the animals can enact the locomotion induced by the dorsal and ventrolateral pathway. We tested this with the same dorsal cue (the line switching direction), but different behavioural paradigms (centring vs unilateral avoidance), and different ventrolateral stimuli (red gratings of one spatial frequency, and 100% nominal contrast black-and-white checkerboard stimuli which comprised a range of spatial frequencies) – and found the same integration strategy.

      Certainly, if the contrast of the visual cues was reduced to the point that the dorsal or ventrolateral responses became weaker, we would expect this to be visible in the combined responses, with the respective reduction in response strength for either pathway, to the same degree as they would be reduced when stimuli were shown independently in the dorsal and ventrolateral visual field.

      For testing whether the animals would show a weighting of responses when it was not possible to enact locomotion to both pathways, we felt it was important to use similar external stimuli to be able to compare the responses, so that we can confidently interpret their responses in terms of integration. Indeed, how this is translated to responses in the two pathways depends a) on the spatiotemporal tuning, contrast sensitivity and exact receptive fields of the two systems, b) on the geometry of the setup and stimulus coverage, and therefore the ability of the animals to enact responses to both pathways independently, and c) on the integration weights.

      It would indeed be fascinating to obtain this tuning and the receptive fields, and having these, test a large array of combinations of stimuli and presentation geometries, so that one could extract integration weights for different presentation scenarios from the resulting flight responses in a future study.

      We also expanded the respective discussion section to reflect these points: l. 391-417. We also updated the former Fig. 5, now Fig. 6 to reflect this discussion.

      The second part of this concern is that there seems to be a missed opportunity to quantify the integration, especially when the optic flow magnitude is already calculated. The discussion even highlights that an advantage of the conflict paradigm is that the weights of the integration hierarchy can be compared. But these weights, which I would interpret as stimulus-response gains, are not reported. What is the ratio of moth response to optic flow in the different regions? When the moth balances responses in the dorsal and ventrolateral region, is it a simple weighted average of the two? When it prioritizes one over the other, is the response gain unchanged? This plays into the first concern because such gain responses could strongly depend on the specific stimulus parameters rather than being constant.

      Indeed, we set up stimuli that are comparable: they are all in the visual domain, and we can calculate their external optic flow and contrast magnitudes to control for imbalances in stimulus presentation, which is important for the interpretation of the resulting data.

      As we discussed above, we are confident that we are observing general principles of the integration of the two parallel pathways. However, we refrained from calculating integration weights, because these might be misleading for several reasons:

      (1) In situations where the animals can enact responses to both pathways, we show that they do so at the full original magnitudes. So there are no “weights” of the hierarchy in this case.

      (2) Only when responses to both systems are not possible in parallel, do we see a hierarchy. However, combined with point (1), this hierarchy likely depends on the geometry of the moths’ environment: it will be more pronounced the less both systems can be enacted in parallel.

      (3) The hierarchy also does not affect all features of the dorsal or ventrolateral pathway equally. The hawkmoths still regulate their perpendicular distance to ventral gratings with dorsal gratings present, to the same degree as with only ventral gratings, because perpendicular distance regulation is not a feature of the dorsal response. And while the hawkmoths show a significant reduction in their position adjustment to dorsal contrast when it is in conflict with lateral gratings (Fig. 4C), they show exactly the same amount of lateral movement and speed adjustment as for dorsal gratings alone, when not combined with lateral ones (Fig. 4D and Fig. S3A). So even for one particular setup geometry and stimulus combination, there clearly is not one integration weight for all features of the responses.

      We extended the discussion section to clarify these points: “The benefit of our study system is that the same cues activate different control pathways in different regions of the visual field, so that the resulting behaviour can directly be interpreted in terms of integration weights” (l. 448-451).

      See also l. 391-417. We also updated the former Fig. 5, now Fig. 6, to reflect this discussion.

      The authors do explain the choice of specific stimuli in the context of their very nice natural scene analysis in Fig. 1 and there is an excellent discussion of the ecological context for the behaviors. However, I struggled to directly map the results from the natural scenes to the conclusions of the paper. How do they directly inform the methods and conclusions for the laboratory experiments? Most important is the discussion in the middle paragraph of page 12, which suggests a relationship with Figure 1B and seems provocative, but lacks a quantification with respect to the laboratory stimuli.

      We show that contrast cues and translational optic flow are not homogeneously distributed in the natural environments of hawkmoths. This directly relates to our laboratory findings on the responses to these stimuli in different parts of their visual field. In order to interpret the results of these behavioural experiments with respect to the visual stimuli, we performed measurements of translational optic flow and contrast cues in the laboratory setup. As a result, we make several predictions about the animals’ use of translational optic flow and contrast cues in natural settings:

      a) Hawkmoths in the lab responded most strongly to ventral optic flow, even though, given our measurements, it was not stronger in magnitude than lateral optic flow. Thus, we propose that the stronger response to ventral optic flow might be an evolutionary adaptation to the natural distribution of translational optic flow cues.

      b) In the natural habitats of hawkmoths, dorsal coverage is much less frequent than ventrolateral structures generating translational optic flow, yet the hawkmoths responded with a much higher weight to the former. Moreover, in our flight tunnel experiments, the animals responded with the same or higher weights to dorsal cues, which had a lower magnitude of translational optic flow and contrast than the same cues in the ventrolateral visual field. So we showed, by combining behavioural experiments and stimulus measurements in the lab, that the weighting of dorsal and ventrolateral cues did not follow their stimulus magnitude in the lab. Moreover, comparing to the natural cue distributions, we suggest that the integration weights also did not evolve to match the prevalence of these cues in natural habitats.

      We integrated the measurements of natural visual scene statistics in the new Fig. 6, to relate the behavioural findings to the natural context also in the figure structure, and sequence logic of the text, as they are discussed here.

      The central conclusion of the first section of the results is that there are likely two different pathways mediating the dorsal and the ventrolateral response. This seems reasonable given the data, however, this was also the message that I got from the authors' prior paper (ref 11). There are certainly more comparisons being done here than in that paper and it is perfectly reasonable to reinforce the conclusion from that study but I think what is new about these results needs to be highlighted in this section and differentiated from prior results. Perhaps one way to help would be to be more explicit with the open hypotheses that remain from that prior paper.

      We appreciate the suggestion to highlight more clearly which open questions are addressed in this study. As a result, we have entirely restructured the introduction, added sections to the discussion and fundamentally changed the graphical result summary in Fig. 6, to reflect the following new findings (and differences to the previous paper):

      The previous paper demonstrated that there are two different pathways in hummingbird hawkmoths that mediate visual flight guidance, and newly described one of them, the dorsal response. This established flight guidance in hummingbird hawkmoths as a model for the questions asked in the current study, which are very different in nature from the previous paper.  

      The main question addressed in the current study is how these two flight guidance pathways interact to generate consistent behaviour. Throughout the literature on parallel sensory and motor pathways guiding behaviour, there are different solutions, from winner-takes-all to equal mixed responses. We tested this fundamental question using the hummingbird hawkmoth flight guidance systems as a model.

      This is the main question addressed in the various conflict experiments in this study, and we show that indeed, the two systems operate in parallel. As long as the animals can enact both dorsal and optic-flow responses, they do so at the original strengths of the responses. Only when this is not possible, hierarchies become visible. We carefully measured the optic flow and contrast cues generated by the different stimuli to ensure that the hierarchies we observed were not generated by imbalances of the external stimuli.

      - Does the interaction hierarchy of the two pathways follow the statistics of natural environments? We previously showed qualitatively how optic flow and contrast cues are distributed across the visual field in natural habitats of the hummingbird hawkmoth. In this study, we quantitatively analysed the natural image data, including a new analysis for the contrast edges, and statistically compared the results across conditions. This quantitative analysis supported the previous qualitative assessment that the prevalence of translational optic flow was highest in the ventral and lowest in the dorsal visual field in all natural habitat types. The distribution of contrast edges across the visual field depended on habitat type much more strongly than was visible in the qualitative analysis in the previous paper. When compared to the magnitude of the behavioural responses, and considering that the hummingbird hawkmoth is predominantly found in open and semi-open habitats, the natural distributions of optic flow and contrast edges did not align with the response hierarchy observed in our laboratory experiments. Dorsal cues elicited much stronger responses relative to ventrolateral optic flow responses than would be expected.

      To provide a more complete picture of the dorsal pathway, which will be important to understand its nature and to compare it to other species, we conducted additional experiments that were specifically set up to test for response features known from the translational optic flow response, in order to compare and contrast the two systems. These experiments allowed us to show that the dorsal response is not simply a translational optic flow reduction response that creates much stronger output than the ventrolateral optic flow response. In particular, we show that the dorsal response lacked the perpendicular distance regulation of the optic flow response, while it did provide alignment with prominent contrasts (possibly to reduce the perceived translational optic flow), which is not observed in the ventrolateral optic flow response. The strong avoidance of any dorsal contrast cues, not just those inducing translational optic flow, is another feature not found in the ventrolateral pathway.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Many comparisons between visual conditions are made and it was confusing at times to know which conditions the authors were comparing. Thinking of a way to label each condition with a letter or number so that the authors could specify which conditions are specifically being compared would greatly enhance comprehension and readability.

      We appreciate this concern. To be able to refer to the individual stimulus conditions in the analysis and results description, we gave each stimulus a unique identifier (see table S1), and provided these identifiers in the respective figures and throughout the text. We hope that this makes the identification of the individual stimuli easier.

      Consider adding in descriptive words to the y-axis labels for the position graphs that would help the reader quickly understand what a positive or negative value means with respect to the visual condition.

      We have now changed the viewpoint on the example tracks in Figs. 2-5 to a virtual viewpoint from the top, rather than from below as the camera recorded, which required some mental rotation to reconcile the left and right sides. Moreover, we noticed that the example track axes were labelled in mm, while the axes for the plots showing median position in the tunnel were labelled in cm. We reconciled the units as well. This will make it easier to see the direct equivalence of the axes (as well as positive and negative values) between the example tracks in those figures and the median positions, as well as the cross-index.

      There are no line numbers provided so it is a bit challenging to provide feedback on specific sentences but there are a handful of typos in the manuscript, a few examples:

      (1) Cue conflict section, first paragraph: "When both cues were presented to in combination, ..." (remove to)

      (2) The ecological relevance section, first paragraph, first sentence: "would is not to fly"

      (3) Figure S3 legend: explanation for C is labeled as B and B is not included with A

      We apologise for the missing line numbers. We added these and resolved the issues 1-3.

      Reviewer #2 (Recommendations for the authors):

      - The pictograms in Fig. 1a were at first glance not clear to me, maybe adding l, r, d, v to the first pictogram could make the figure more immediately accessible.

      We added these labels to make it more accessible.

      - I would suggest noting in the main text that the red patterns were chosen for technical reasons (see Methods), if this is correct.

      We added this information and a reference to the methods in the main text (lines 100-102).

      - "Thus, hawkmoths are currently the only insect species for which a partitioning of the visual field has been demonstrated in terms of optic-flow-based flight control [33-35]." I think that is a bit too strong and maybe it would be more interesting to connect the current data to connected data in other insects to perhaps discuss important similarities. Ref 32 for example shows that fruit flies weigh ventral translational optic flow considerably more than dorsal translational optic flow. Reichardt 1983 (Naturwissenschaften) showed that stripe fixation in large flies (a behaviour relying in part on the motion pathway) is confined to the ventral visual field, etc...

      We have changed this sentence to acknowledge partitioning in other insects and to motivate the use of our model species for this study: While fruit flies weight ventral translational optic flow stronger than dorsal optic flow, the most extreme partitioning of the visual field in terms of optic-flow-based flight control has been observed in hawkmoths [33-35]. (lines 60-62)

      - I think the statistical differences in group means could be described in more detail, at least in Fig. 2 (to me the description was not immediately clear, in particular with the double letters).

      We added an explanation of the letter nomenclature to all respective figure legends:

      Black letters show statistically significant differences in group means or median, depending on the normality of the test residuals (see Methods, confidence level: 5%). The red letters represent statistically significant differences in group variance from pairwise Brown–Forsythe tests (significance level 5%). Conditions with different letters were significantly different from each other. The white boxplots depict the median and 25% to 75% range, the whiskers represent the data exceeding the box by more than 1.5 interquartile ranges, and the violin plots indicate the distribution of the individual data points shown in black.

      - "When translational optic flow was presented laterally" I would use a more wordy description, since it is the hawkmoth that is controlling the optic flow and in addition to translational optic flow, there might also be rotational components, retinal expansion etc.

      We extended the description to explain that the moths were generating the optic flow percept based on stationary gratings in different orientations, by way of their flight through the tunnel (lines 127-129).

      - While it is clearly stated that the measure of the perpendicular distance from the ventral and dorsal pattern via the size of the insect as seen by the camera is indirect, I would suggest determining the measurement uncertainty of the distance estimate.

      - Connected to above - is the hawkmoth area averaged over the entire flight and is the variance across frames similar in all the stimuli conditions? Is it, in principle, conceivable that the hawkmoths' pitch (up or down) is different across conditions, e.g. with moths rising and falling more frequently in a certain condition, which could influence the area in addition to distance?

      There are a number of sources that generate variance in the distance estimate (which was based on the size of the moth in each video frame, after background subtraction): the size of the animal, the contrast with which the animal was filmed (which also depended on the type of pattern in the tunnel: it was lower with ventral or dorsal patterns as a background than with lateral ones), and the speed of the animal, as motion blur could impact the moth’s image on the video. The latter is hard to calibrate, but the uncertainty related to animal size and pattern types could theoretically be estimated. However, since we moved between finishing the data acquisition for this study and publishing the paper, the original setup has been dismantled. We could attempt to recreate it as faithfully as possible, but would worry about introducing further noise. We therefore decided not to attempt to characterise the uncertainty, so as not to give a false impression of quantifiability of this measure. For the purpose of this study, it will have to remain a qualitative, rather than a quantitative, measure. If we use a similar measure again, we will make sure to quantify all sources of uncertainty that we have access to.

      The variance in area is different between conditions. Most likely, the animals vary their flight height differently for different dorsal and ventral patterns, as they vary their lateral flight straightness with different lateral visual input. For the reasons mentioned above, we cannot disentangle the effects of variations in flight height and other sources of uncertainty relating to animal size in the video frames. We therefore averaged the extracted area across the entire flight, to obtain a coarse measure of their flight height. Future studies focusing specifically on the vertical component or filming in 3D will be required to determine the exact amount of vertical flight variation.
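
      The relationship between apparent size and distance that underlies such a coarse height measure can be illustrated with a minimal sketch. It assumes an ideal pinhole camera and a moth of constant size and orientation, so that the projected area falls with the square of the distance to the camera; the function names, the reference area, and the sample values are hypothetical and are not taken from the study. Because contrast, motion blur, and body pitch also change the measured area, the output should be read only as a qualitative indicator.

```python
from math import sqrt
from statistics import mean


def mean_area(areas_px):
    """Average the per-frame moth area (in pixels) over one flight."""
    return mean(areas_px)


def relative_distance(area_px, reference_area_px):
    """Distance to the camera relative to a reference distance.

    Assumption: pinhole camera and constant moth size/orientation, so
    area is proportional to 1 / distance**2 and therefore
    d / d_ref = sqrt(A_ref / A).
    """
    return sqrt(reference_area_px / area_px)


# Hypothetical per-frame areas for one flight (pixels).
flight_areas = [400.0, 420.0, 380.0, 400.0]
# Hypothetical area of the same moth at a known reference distance.
reference_area = 1600.0

flight_mean = mean_area(flight_areas)
rel_dist = relative_distance(flight_mean, reference_area)
print(flight_mean, rel_dist)
```

      Averaging the area before converting keeps the sketch close to the per-flight summary described above; a per-frame conversion would additionally need the per-frame uncertainty that could not be characterised here.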

      - Results second paragraph, suggestion: pattern wavelength or spatial frequency instead of spatial resolution.

      - Same paragraph, suggestion: For an optimal wavelength/spatial frequency of XX

      We corrected these to spatial frequency.

      - Above Fig 3- "this strongly suggests a different visual pathway". In my opinion it would be better to say sensory-motor/visuomotor pathway or to define visual pathway more clearly. Could one in principle imagine a uniform set of local motion-sensitive neurons across the entire visual field that connect differentially to descending/motor neurons?

      We appreciate this point and changed this, and further instances in the manuscript, to visuomotor pathway.

      - If I understood correctly, you calculated the magnitude of optic flow in the different tunnel conditions based on the image of a fisheye camera moving centrally in the tunnel, equidistant from all walls. I did not understand why the magnitude of optic flow should differ between the four quadrants showing the same squarewave patterns. Apologies if I missed something, but maybe it is worth explaining this in more detail in the manuscript.

      We recognize that this point may not have been immediately clear and have therefore provided additional clarification in the Methods and results section (lines 106-111, 543-549). We anticipated differences in the magnitude of optic flow due to potential contrast variations arising from the way the stimuli were generated—being mounted on the inner surfaces of different tunnel walls while the light source was positioned above. On the dorsal wall, light from the overhead lamps passed through the red material. For laterally mounted patterns, the animals perceived mainly reflected light, as these tunnel walls were not transparent.

      A similar principle applied to the background, which consisted of a white diffuser allowing light to pass through dorsally, but white non-transmissive paper laterally, with a 5% contrast random checkerboard pattern. The ventral side presented a more complex scenario, as it needed to be partially transparent for the ventrally mounted camera. Consequently, the animals perceived a combination of light reflections from the red patterns and the white gauze covering the ventral tunnel side, against the much darker background of the surrounding room.

      To ensure that the observed flight responses were not artifacts of deviations in visual stimulation from an ideal homogeneous environment, we used the camera to quantify the magnitude of optic flow and contrast patterns under these real experimental conditions. This approach also allowed us to directly relate the optic flow measurements taken indoors to those recorded outdoors, as we employed the same camera and analytical procedures for both datasets.

      Reviewer #3 (Recommendations for the authors):

      In addition to the considerations above I had a few minor points:

      There are so many different directions of stimuli and response that it is quite challenging to parse the results. Can this be made a little easier for the reader?

      We appreciate this concern. To be able to refer to the individual stimulus conditions in the analysis and results description, we gave each stimulus a unique identifier (see table S1), and provided these identifiers in the respective figures and throughout the text. We hope that this makes the identification of the individual stimuli easier.

      One suggestion (only a suggestion): I found myself continuously rotating the violin plots in my head so that the lateral position axis lined up with the lateral position of the tunnel icons below. Consider if rotating the plots 90 degs would help interpretability. It was challenging to keep track of which side was side.

      We did discuss this with a number of test-readers, and tried multiple configurations. They all have advantages and drawbacks, but the preferred configuration for the majority of testers was the current one. To help the mental transformations from the example flight tracks in the figures, we now present the example flight tracks in Figs. 2-5 in the same reference frame as the figures showing median position (so positive and negative values on those axes correspond directly), and changed the view from a below-the-tunnel to an above-the-tunnel view, as this is the more typical depiction. We hope that this enhances readability.

      Are height measurements sensitive to the roll and pitch of the animal? I suspect this is likely small but worth acknowledging.

      They are indeed. These effects are likely small but contribute to the overall inaccuracy, which we could not quantify in this particular setup (see also the response to Reviewer #2 on that point). This is why the height measurements have to be considered a qualitative approximation rather than a quantification of flight height. We added text to acknowledge the effects of roll and pitch specifically (lines 657-658).

      The Brown-Forsythe test was reported as paired but this seems odd because the same moths were not used in each condition. Maybe the authors meant something different by "paired" than a paired statistical design?

      Indeed, the data were not paired in the sense that we could attribute individual datapoints to individual moths across conditions. We applied the Brown-Forsythe test in a pairwise manner, comparing the variance of each condition with that of each other condition, to test whether the variance in position differed across conditions. We phrased this misleadingly and have corrected it to “The variance in the median lateral position (in other words, the spread of the median flight position) was statistically compared between the groups using the pairwise Brown–Forsythe tests” (l. 187-188).
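
      To illustrate what applying a variance test "in a pairwise manner" can look like, here is a minimal sketch in plain Python. It is not the authors' analysis code. The condition names and position values are hypothetical, and the function names `brown_forsythe_statistic` and `pairwise_brown_forsythe` are made up for this example. The sketch computes only the Brown-Forsythe F statistic (a one-way ANOVA on absolute deviations from each group's median). It does not compute p-values or apply any multiple-comparison correction.

```python
from itertools import combinations
from statistics import mean, median


def brown_forsythe_statistic(*groups):
    """Brown-Forsythe F statistic for equality of variances across groups."""
    # Transform each observation into its absolute deviation from the group median.
    deviations = []
    for group in groups:
        centre = median(group)
        deviations.append([abs(x - centre) for x in group])

    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    group_means = [mean(d) for d in deviations]
    grand_mean = sum(sum(d) for d in deviations) / n_total

    # One-way ANOVA F ratio computed on the transformed values.
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((z - m) ** 2
                 for d, m in zip(deviations, group_means) for z in d)
    return (n_total - k) / (k - 1) * between / within


def pairwise_brown_forsythe(conditions):
    """Compare every pair of conditions; no pairing of individuals is implied."""
    return {
        (a, b): brown_forsythe_statistic(conditions[a], conditions[b])
        for a, b in combinations(conditions, 2)
    }


# Hypothetical median lateral positions per flight track, for illustration only.
example_positions = {
    "cond_A": [-1.0, -0.5, 0.0, 0.4, 1.1],
    "cond_B": [-3.2, -1.5, 0.1, 1.8, 2.9],
    "cond_C": [-0.3, -0.1, 0.0, 0.2, 0.3],
}
results = pairwise_brown_forsythe(example_positions)
for pair, statistic in results.items():
    print(pair, round(statistic, 3))
```

      Each statistic would be compared against an F distribution with (k - 1, N - k) degrees of freedom to obtain a p-value. If SciPy is available, `scipy.stats.levene` with `center='median'` performs the same test and returns the p-value directly.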

      There is some concern about individual moth preferences and bias due to repeated measures. I appreciate that the individual moth's identity was not likely known in most cases, but can the authors provide an approximate breakdown of how many individual moths provided the N sample trajectories?

      This is a very valid concern, and indeed one we did investigate in a previous study with this setup. We confirmed that the majority of animals (70%, 68% and 53% out of 40 hawkmoths, measured on three consecutive days) crossed the tunnel within a randomly picked window of 3h (Stöckl et al. 2019). We now state this explicitly in the methods section (lines 594-597). Thus, for the sample sizes in our study, statistically, each moth would have contributed a small number of tracks compared to the overall number of tracks sampled.

      The statistics section of the methods said that both Tukey-Kramer (post-hoc corrected means) and Kruskal-Wallis (non-parametric medians) were done. It is sometimes not clear which test was done for which figure, and where the Kruskal-Wallis test was done there does not seem to be a corrected statistical significance threshold for the many multiple comparisons (Fig. 2). It is quite possible I am just missing the details and they need to be clarified. I think there also needs to be a correction for the Brown-Forsythe tests but I don't know this method well.

      We first performed an ANOVA, and if the test residuals were not normally distributed, we used a Kruskal-Wallis test instead. For the post-hoc tests of both we used Tukey-Kramer to correct for multiple comparisons. The figure legends did indeed miss this information. We added it to clarify our statistical analysis strategy and refer to the methods section for more details (i.e. l. 185-186). All statistical results, including the type of statistical test used, have been uploaded to the data repository as well.

      The connection to stimulus reliability in the discussion seems to conflate reliability with prevalence or magnitude.

      We have rephrased the respective discussion sections to clearly separate the prevalence and magnitude of stimuli, which was measured, from an implied or hypothesized reliability (lines 510-511).

      Line numbers would be helpful for future review.

      We apologize for missing the line numbers and have added them to the revised manuscript.

    1. We just need to be clear on terms. There are a few terms that are often confused or used interchangeably—“learning,” “education,” “training,” and “school”—but there are important differences between them. Learning is the process of acquiring new skills and understanding. Education is an organized system of learning. Training is a type of education that is focused on learning specific skills. A school is a community of learners: a group that comes together to learn with and from each other. It is vital that we differentiate these terms: children love to learn, they do it naturally; many have a hard time with education, and some have big problems with school.

      This whole paragraph really hit the nail on the head, in my opinion. Those four terms cannot be used interchangeably because they mean entirely different things. I believe that everyone naturally loves to learn; it is the vector through which their education is brought to them that plays a key role in determining whether the training or education was valuable.

  2. social-media-ethics-automation.github.io
    1. Steve Jobs. December 2023. Page Version ID: 1189127326. URL: https://en.wikipedia.org/w/index.php?title=Steve_Jobs&oldid=1189127326 (visited on 2023-12-10).

      Steve Jobs was an inventor who co-founded Apple and stood at the forefront of modern tech. The introduction of the iPhone completely reshaped the smartphone industry, and the development of the iPad and MacBook continues to impact the market today. I do feel he has been slightly overshadowed by Microsoft in recent years, and he should still be remembered as one of the great pioneers of a new era of tech.

  3. May 2025
    1. Try a toothbrush with a long, thin handle and a small head.

      The brushes sold with many metal and rubber straws, or thin bottle brushes, are also excellent for reaching into places like this. Sometimes you can find similar thin brushes in the baby bottle section of big box retailers or specialty stores selling baby goods.

      Similarly a plastic oiler with mineral spirits in combination with an air compressor/blow gun or canned air is also a solid way to go.

      Long handle cotton swabs can also be used if necessary.

      reply to u/General-Writing-1764 at https://old.reddit.com/r/typewriters/comments/1l05a17/i_dont_think_that_superficial_dust_should_be_a/

    1. And the king said: “Yes, I am Montezuma.” Then he stood up to welcome Cortés; he came forward, bowed his head low and addressed him in these words: “Our lord, you are weary. The journey has tired you, but now you have arrived on the earth. You have come to your city, Mexico. You have come here to sit on your throne, to sit under its canopy.

      The king of Tenochtitlan and the incoming Spanish are having a friendly conversation showing no hatred or fear or longing to conquer. The king even treats the Spanish as gods coming to their land and praises them.

    1. Contacted LATAM head, close to the founder. Asked to follow up in 2 weeks. Interested in talking.

      They are not selling now; it's majority-owned by Russian partners, and they are considering a minority funding round to become a top 2 player by year-end.

      Will talk just to see the possibilities

    1. When we drew near the island, we found it was at a place where there could be no landing, there being a great surff on the stony beach. So we dropt anchor, and swung round towards the shore. Some people came down to the water edge and hallow’d to us, as we did to them; but the wind was so high, and the surff so loud, that we could not hear so as to understand each other. There were canoes on the shore, and we made signs, and hallow’d that they should fetch us; but they either did not understand us, or thought it impracticable, so they went away, and night coming on, we had no remedy but to wait till the wind should abate; and, in the meantime, the boatman and I concluded to sleep, if we could; and so crowded into the scuttle, with the Dutchman, who was still wet, and the spray beating over the head of our boat, leak’d thro’ to us, so that we were soon almost as wet as he. In this manner we lay all night, with very little rest; but, the wind abating the next day, we made a shift to reach Amboy before night, having been thirty hours on the water, without victuals, or any drink but a bottle of filthy rum, and the water we sail’d on being salt. In the evening I found myself very feverish, and went in to bed; but, having read somewhere that cold water drank plentifully was good for a fever, I follow’d the prescription, sweat plentiful most of the night, my fever left me, and in the morning, crossing the ferry, I proceeded on my journey on foot, having fifty miles to Burlington, where I was told I should find boats that would carry me the rest of the way to Philadelphia.

      This part seems similar to the travel narratives of Smith and De Vaca, as Franklin is talking about his own struggles travelling to Philadelphia.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript the authors have done cryo-electron tomography of the manchette, a microtubule-based structure important for proper sperm head formation during spermatogenesis. They also did mass-spectrometry of the isolated structures. Vesicles, actin and their linkers to microtubules within the structure are shown.

      We thank the reviewer for the critical reading of our manuscript; we have implemented the suggestions as detailed below, which we believe indeed improved the manuscript.

      Major:

      The data the conclusions are based on seem very limited and sometimes overinterpreted. For example, only one connection between actin and microtubules was observed, and this is thought to be MACF1 simply based on its presence in the MS.

      We regret giving the impression that the data is limited. We in fact collected >100 tilt series from 3 biological replicates for the isolated manchette.

      In the revised version, we added data from in-situ studies showing vesicles interacting with the manchette (as requested below, new Fig. 1).

      Specifically, for the interaction of actin with microtubule we added more examples (Revised Fig. 6) and we toned down the discussion related to the relevance of this interaction (lines 193-194, 253-255). MACF1 is mentioned only as a possible candidate in the discussion (line 254).

      Another, and larger concern, is that the authors do a structural study on something that has been purified out of the cell, a process which is extremely disruptive. Vesicles, actin and other cellular components could easily be trapped in this cytoskeletal sieve during the purification process and as such, not be bona fide manchette components. This could create both misleading proteomics and imaging. Therefore, an approach not requiring extraction such as high-pressure freezing, sectioning and room-temperature electron tomography and/or immunoEM on sections to set aside this concern is strongly recommended. As an additional bonus, it would show if the vesicles containing ATP synthase are deformed mitochondria.

      We recognise the concern raised by the reviewer.

      To alleviate this concern, we added imaging data of manchettes in-situ that show vesicles, mitochondria and filaments interacting with the manchette (new Fig. 1), essentially confirming the observations that were made on the isolated manchette.

      The benefits of imaging the isolated manchette were better throughput (being able to collect more data) and higher resolution, which allowed us to unequivocally resolve the dynein/dynactin and actin filaments.

      Minor: Line 99: "to study IMT with cryo-ET, manchettes were isolated ...(insert from which organism)..."

      Added in line 102 in the revised version.

      Line 102 "...demonstrating that they can be used to study IMT".. can the authors please clarify?

      This paragraph was revised (lines 131-137); we hope it is now clearer.

      Line 111 "densities face towards the MT plus-end" How can a density "face" anywhere? For this, it needs to have a defined front and back.

      Microtubule motor proteins (kinesin and dynein) are often attached to the microtubules at an angle, with dynactin and cargo on one side (plus end). We rephrased this part and removed the word “face” in the revised version to make it clearer (lines 161-162).

      Line 137: is the "perinuclear ring" the same as the manchette?

      The perinuclear ring is the apical part of the manchette that connects it to the nucleus. We added to the revised version imaging of the perinuclear ring with observations on how it changes when the manchette elongates (new Fig. 2).

      Figure 2B: How did the authors decide not to model the electron density found between the vesicle and the MT at 3 O'clock? Is there no other proteins with a similar lollipop structure as ATP synthase, so that this can be said to be this protein with such certainty?

      The densities connecting the vesicles to the microtubules shown in (now) Fig. 4D are not consistent enough to be averaged.

      The densities resembling ATP synthase are inside the vesicles. Nevertheless, we have decided to remove the averaging of the ATP synthases from the revised manuscript, as they are not of great importance for this manuscript. Instead, the new in-situ data clearly show mitochondria (with their characteristic double membrane and cristae) interacting with manchette microtubules (new Fig 1C).

      Line 189: "F-actin formed organized bundles running parallel to mMTs" - this observation needs confirming in a less disrupted sample.

      Phalloidin (actin marker) has previously been shown to stain the manchette (PMID: 36734600). As actin filaments are very thin (7 nm), they are very hard to observe in plastic-embedded EM.

      In the in-situ data we added to the revised manuscript (new Fig 1D), we observe filaments with a diameter corresponding to actin. In addition, we added more examples of microtubules interacting with actin in isolated manchette (new Fig. 6 E-K).

      Line 242 remove first comma sign.

      Removed.

      Line 363 "a total of 2 datasets" - is this manuscript based on only two tilt-series? Or two datasets from each of the 4 grids? In any case, this is very limited data.

      We apologise for not clearly providing the information about the data size in the original manuscript. The data is based on three biological replicates (3 animals). We collected more than 100 tomograms of different regions of the manchettes. As such, we would argue that the data is not limited per se.

      Reviewer #1 (Significance (Required)):

      The article is very interesting, and if presented together with the suggested controls, would be informative to both microtubule/motor protein researchers as well as those studying spermatogenesis.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manchette appears as a shield-like structure surrounding the flagellar basal body upon spermiogenesis. It consists of a number of microtubules like a comb, but actin (Mochida et al. 1998 Dev. Biol. 200, 46) and myosin (Hayasaka et al. 2008 Asian J. Androl. 10, 561) were found, suggesting transportation inside the manchette. Detailed structural information and functional insight into the manchette was still awaited. There is a hypothesis called IMT (intra-manchette transport) based on the fact that the manchette and IFT (intraflagellar transport) share common components (or homologues) and on their transition along the stages of spermiogenesis. While IMT is considered as a potential hypothesis to explain delivery of centrosomal and flagellar components, no one has witnessed IMT at the same level as IFT. IMT has never been purified, visualized in motion or at high resolution. This study for the first time visualized the manchette using high-end cryo-electron tomography of isolated manchettes, addressing structural characterization of IMT. The authors successfully visualized microtubular bundles, vesicles located between microtubules and a linker-like structure connecting the vesicle and the microtubule. On multilamellar membranes in the vesicles they found particles and assigned them to ATPase complexes, based on an intermediate (~60 Å) resolution structure. They further identified interesting structures, such as (1) particles on microtubules, which resemble dynein, and (2) filaments which show the symmetry of F-actin. All the molecular assignments are consistent with their proteomics of manchettes.

      We thank the reviewer for highlighting the novelty of our study.

      Their assignment of ATPase will be strengthened by MS data, if it proves absence of other possible proteins forming such a membrane protein complex.

      All the ATPase components were indeed found in our proteomics data. Nevertheless, we have decided to remove the averaging of the ATPase as it does not directly relate to IMT, the focus of this manuscript.

      They discussed possible role of various motor proteins based on their abundance (Line 134-151, Line 200). This makes sense only with a control. Absolute abundance of proteins would not necessarily present their local importance or roles. This reviewer would suggest quantitative proteomics of other organelles, or whole cells, or other fractions obtained during manchette isolation, to demonstrate unique abundance of KIF27 and other proteins of their interest.

      We agree with the reviewer that absolute abundance does not necessarily indicate importance or a role. As such, we removed this part of the discussion from the revised manuscript.

      A single image from a tomogram, Fig.6B, is not enough to prove actin-MT interaction. A gallery and a number (how many such junctions were found from how many MTs) will be necessary.

      We agree that one example is not enough. In the new Fig. 6E-K, we provide a gallery of more examples. We have revised the text to reflect the point that these observations are still rare and more data will be needed to quantify this interaction (Lines 253-254).

      Minor points: Their manchette purification is based on Mochida et al., which showed (their Fig.2) similarity to the in vivo structure (for example, Fig.1 of Kierszenbaum 2001 Mol. Reproduc. Dev. 59, 347). Nevertheless, since this is not a very common prep, it is helpful to show the isolated manchette’s wide view (low mag cryo-EM or ET) to prove its intactness.

      We thank the reviewer for this suggestion. In the revised version, new Fig. 2 provides a cryo-EM overview of purified manchettes from different developmental stages.

      Line 81: Myosin -> myosin (to be consistent with other protein names)

      Corrected.

      This work is a significant step toward the understanding of manchettes. While the molecular assignment of dynein and ATPase is not fully decisive, due to limitation of resolution (this reviewer thinks the assignment of actin filament is convincing, based on its helical symmetry), their speculative model still deserves publication.

      Reviewer #2 (Significance (Required)):

      This work is a significant step toward the understanding of manchettes. While the molecular assignment of dynein and ATPase is not fully decisive, due to limitation of resolution (this reviewer thinks the assignment of actin filament is convincing, based on its helical symmetry), their speculative model still deserves publication.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      ->Summary:

      The manchette is a temporary microtubule (MT)-based structure essential for the development of the highly polarised sperm cell. In this study, the authors employed cryo-electron tomography (cryo-ET) and proteomics to investigate the intra-manchette transport system. Cryo-EM analysis of purified rat manchette revealed a high density of MTs interspersed with actin filaments, which appeared either bundled or as single filaments. Vesicles were observed among the MTs, connected by stick-like densities that, based on their orientation relative to MT polarity, were inferred to be kinesins. Subtomogram averaging (STA) confirmed the presence of dynein motor proteins. Proteomic analysis further validated the presence of dynein and kinesins and showed the presence of actin crosslinkers that could bundle actin filaments. Proteomics data also indicated the involvement of actin-based transport mediated by myosin. Importantly, the data indicated that the intraflagellar transport (IFT) system is not part of the intra-manchette transport mechanism. The visualisation of motor proteins directly from a biological sample represents a notable technical advancement, providing new insights into the organisation of the intra-manchette transport system in developing sperm.

      We thank the reviewer for summarising the novelty of our observations.

      -> Are the key conclusions convincing? Below we comment on three main conclusions.

      MT and F-actin bundles are both constituents of the manchette

      While the data convincingly shows that MT and F-actin are part of the manchette, one cannot conclude from it that F-actin is an integral part of the manchette. The authors would need to rephrase so that it is clear that they are speculating.

      We have rephrased our statements and replaced “integral” with “actin filaments are associated”. Of note, previous studies, including phalloidin staining, suggested that actin is part of the manchette (PMID: 36734600, PMID: 9698455, PMID: 18478159), and here we visualised the actin at high resolution.

      The transport system employs different transport machinery on these MTs

      Proteomics data indicates the presence of multiple motor proteins in the manchette, while cryo-EM data corroborates this by revealing morphologically distinct densities associated with the MTs. However, the nature of only one of these MT-associated densities has been confirmed, specifically dynein, as identified through STA. The presence of kinesin or myosin in the EM data remains unconfirmed based on just the cryo-ET density, and therefore it is unclear whether these proteins are actively involved in cargo transport, as this cannot be supported by just the proteomics data. In summary, we recommend that the authors rephrase this conclusion and avoid using the term "employ".

      We agree that our cryo-ET only confirmed the motor protein dynein. As such, we removed the term employ and rephrased our claims regarding the active transport and accordingly changed the title.

      Dynein mediated transport (Line 225-227)

      The data shows that dynein is present in the manchette; however, whether it plays an active role in transport cannot be determined from the cryo-ET data provided in the manuscript, as it does not clearly display a dynein-dynactin complex attached to cargo. The attachment to cargo is also not revealed via proteomics as no adaptor proteins that link dynein-dynactin to its cargo have been shown.

      A list of cargo adaptor proteins was found in our proteomics data, but we agree that cryo-ET and proteomics alone cannot prove active transport. As such, we toned down the discussion about active transport (lines 212-220).

      -> Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      F-actin

      • In the abstract, the authors state that F-actin provides tracks for transport as well as having structural and mechanical roles. However, the manuscript does not include experiments demonstrating a mechanical role. The authors appear to base this statement on literature where actin bundles have been shown to play a mechanical role in other model systems. We suggest they clarify that the mechanical role the authors suggest is speculative and add references if appropriate.

      We removed the claim about the mechanical role of the actin from the abstract and rephrased this in the discussion to suggest this role for the F-actin (lines 242-243).

      • Lines 15,92, 180 and 255: The statement "Filamentous actin is an integral part of the manchette" is misleading. While the authors show that F-actin is present in their purified manchette structures, whether it is integral has not been tested. Authors should rephrase the sentence.

      We removed the word integral.

      • To support the claim that F-actin plays a role in transport within the manchette, the authors present only one instance where an unidentified density is attached to an actin filament. This is insufficient evidence to claim that it is myosin actively transporting cargo. Although the proteomics data show the presence of myosin, we suggest the authors exercise more caution with this claim.

      We agree that our data do not demonstrate active transport; as such, we removed that claim. We mention the possibility of cargo transport in the discussion (lines 250-255).

      • The authors mention the presence of F-actin bundles but do not show direct crosslinking between the F-actin filaments. They could in principle just be closely packed F-actin filaments that are not necessarily linked, so the term "bundle" should be used more cautiously.

      We do not assume that a bundle means that the F-actin filaments are crosslinked. A bundle simply indicates the presence of multiple F-actin filaments together. We rephrased it to call them actin clusters.

      Observations of dynein

      • Relating to Figure 2B: From the provided image it is not clear whether the density corresponds to a dynein complex, as it does not exhibit the characteristic morphological features of dynein or dynactin molecules.

      We indeed do not claim that the densities in this figure are dynein or dynactin. We revised this paragraph and hope that it is now more clear (lines 135-137).

      • Lines 171-172 and Figure 4: It is well established that dynein is a dimer and should always possess two motor domains. The authors have incorrectly assumed they observed single motor heads, except possibly in Figure 4A (marked by an arrow). In all other instances, the dynein complexes show two motor domains in proximity, but these have not been segmented accurately. Furthermore, the "cargos" shown in grey are more likely to represent dynein tails or the dynactin molecule, based on comparisons with in vitro structures of these complexes (see references 1-3).

      We thank the reviewer for this correction. We improved the annotations in the figure and revised the text to clarify that we identified dimers of dynein motor heads (lines 140-144). We further added a projection of a dynein-dynactin complex for comparison with our observations on the manchette (new Fig. 5E). We further changed claims on the presence of protein cargo to the presence of dynein/dynactin that allows cargo tethering, based on the presence of cargo adaptors in the proteomics data.

      • Lines 21, 173, and 233 mention cargos, but as noted above, it seems to be parts of the dynein complex the authors are referring to.

      This was corrected as mentioned above.

      • Panel 4B appears to show a dynein-dynactin complex, but whether there is a cargo is unclear, and if there is, it should be labelled accordingly. To assess whether there is any cargo bound to the dynein-dynactin complex, a larger crop of the panel would be helpful.

      In summary, we recommend that the authors revisit their segmentations in Figures 2B and 4, revise their text based on these observations, and perform quantification of the data (as suggested in the next section).

      We thank the reviewers for sharing their expertise on dynein-dynactin complexes. We have revised the text as detailed above and excluded the assignment of any cargo, as we cannot (even from larger panels) see a clear association of cargo. We have made clear that we only refer to dynein-dynactin with the capability of linking cargo, based on the presence of cargo adaptors in the proteomics data. We have removed claims on active transport with dynein.

      Dynein versus kinesin-based transport

      The calculation presented in lines 147-151 does not account for the fact that both the dynein-dynactin complex and kinesin proteins require cargo adaptors to transport cargo. Additionally, the authors overlook the possibility that multiple motors could be attached to a single cargo. If the authors did not observe this, they should explicitly mention it to support their argument. In short, the calculations are based on an incorrect premise, rendering the comparison inaccurate. Unless the authors have identified any dynein-dynactin or kinesin cargo adaptors in their proteomics data which could be used for such a comparison, we believe the authors lack sufficient data to accurately estimate the "active transport ratio" between dynein and kinesin.

      Even though we detect cargo adaptors in our proteomics, we agree that calculating relative transport based only on the proteomics can be inaccurate. As such, we removed the absolute quantification and the comparison between dynein- and kinesin-based IMT.

      • Would additional experiments be essential to support the claims of the paper?

      F-actin distance and length distribution

      • To support the claim that F-actin is bundled (line 189), could the authors provide the distance between each F-actin filament and its neighbours? Additionally, could they compare the average distance to the length of actin crosslinkers found in their proteomics data, or compare it to the distances between crosslinked F-actin observed in other research studies?

      We measured distances between the actin filaments and added a plot to new Fig 6.

      • While showing that F-actin is important for the manchette would require cellular experiments, authors could provide quantification of how frequently these actin structures are observed in comparison to MTs to support their claims that these actin filaments could be important for the manchette structure.

      We agree that claims on the role and function of actin in the manchette require cellular experiments that are beyond the scope of this study. Absolute quantification of the ratio between MTs and actin from cryoET is very hard and will be inaccurate as the manchette cannot be imaged as a whole due to its size and thickness. The ratio we have is based on the relative abundance provided by the proteomics (Fig. 5F).

      • In line 193, the authors claim that the F-actin in bundles appears too short for transport. Could they provide length distributions for these filaments? This might provide further support to their claim that individual F-actin filaments can serve as transport tracks (line 266).

      In addition to the limitation mentioned in the previous point, quantification of length from high-magnification imaging will likely be inaccurate, as the length of the actin filaments in most cases exceeds the captured field of view. Nevertheless, we removed the claim about the actin being too short for transport.

      • Could the authors also quantify the abundance of individual F-actin filaments observed, compared to MTs and F-actin bundles, to support the idea that they could play a role in transport?

      As explained for the points above, absolute quantification of the ratio between MTs and actin is not feasible from cryo-ET data, which cannot capture the whole manchette at high enough resolution to resolve the actin.

      • In the discussion, the authors mention "interactions between F-actin singlets and mMTs" (line 269), yet they report observing only one instance of this interaction (lines 210 and 211). Given the limited data, they should refer to this as a single interaction in the discussion. The scarcity of data raises questions about how representative this event truly is.

      We agree that one example is not enough. In the new Fig. 6E-K, we provide a gallery of more examples as also requested by reviewers 1 and 2. We have also revised the text to reflect the point that these observations are still rare (Lines 190-194).

      Quantifications for judgement of representativity The authors should quantify how often they observed vesicles with a stick-like connection to MTs (lines 106-107); this would strengthen the interpretation of the density, as currently only one example is shown in the manuscript (Figure 4A). If possible, they could show how many of them are facing towards the MT plus end.

      As mentioned in the text (lines 135-137), the linkers connecting vesicles to MTs were irregular, so we could not interpret them further. This is in contrast to dynein motors, which were easily recognisable but were not associated with vesicles.

      Dynein quantifications: • The authors are advised to quantify how many dynein molecules per micron of MT they observe and how often they are angled with their MT binding domain towards the minus-end.

      As the manchette is large and highly dense, any quantification will likely be biased towards parts of the manchette that are easier to image, for example the periphery. As such, we do not think quantifying the dynein density will yield meaningful insight.

      • Could the authors quantify how many dynein densities they found to be attached to a (vesicle) cargo, if any (line 175)? They could show these observations in a supplementary figure.

      We did not observe any connection between a vesicle and dynein motors. We edited this sentence to make that clearer.

      • For densities that match the size and location of dynein but lack clear dynein morphology (as seen in Figure 2B), could the authors quantify how many are oriented towards the MT minus end?

      We had many cases where the connection did not have a clear dynein morphology, and as the morphology is not clear, it is impossible to make a claim about whether they are oriented towards the minus end.

      Artefacts due to purification: Authors should discuss if the purification could have effects on visualizing components of the manchette. For example, if it has effect on the MTs and actin structure or the abundance/structure of the motor protein complexes (bound to cargo or isolated).

      We followed a previously published protocol that was shown to preserve the overall integrity of the manchette. Nevertheless, some loss of connections between the manchette and other cellular organelles is expected. To address this point, we added in-situ data (new Fig 1) showing the manchette in intact spermatids interacting with vesicles and mitochondria, as well as overviews of manchettes (new Fig 2). The text was revised accordingly.

      • Are the experiments adequately replicated and statistical analysis adequate? The cryo-ET data presented in the manuscript is collected using two separate sample preparations. Along with the quantifications of the different observations suggested above which will help the reader assess how abundant and representative these observations are, the authors could further strengthen their claims by acquiring data from a third sample preparation and then analysing how consistent their observations are between different purifications. This however could be time consuming so it is not a major requirement but recommended if possible within a short time frame.

      We regret not explicitly mentioning the size of our data set; it has now been added to the revised version. In essence, the data are based on three biological replicates (3 animals). We collected more than 100 tomograms of different regions of the manchettes. In the revised version we provide more observations (new Fig 1, 2, 4B-C and 6E-K).

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Most of the comments deal with either modifying the text or analysing the data already presented, so the revision could be done within 1-3 months.


      Minor comments: - Specific experimental issues that are easily addressable. 1) Could the authors state how many tilt series were collected for each dataset/independent sample preparation? We recommend that they upload their raw data or tomograms to EMPIAR.

      We added this information in the material and methods.

      2) It is not clear to me if the same sample was used for cryo-ET and proteomics. Could the authors clarify how comparable the sample preparation for the cryo-ET and proteomics data is or if the same sample was used for both. If there is a discrepancy between these preparations, they would need to discuss how this can affect comparing observations from cryo-ET and mass spectrometry. Ideally both samples should be the same.

      After sample preparation, the manchettes were directly frozen on grids. The rest of the sample was used for proteomics. Consequently, EM and MS data were acquired on the same samples. We clarified this in the text (lines 327-328).

      • Are prior studies referenced appropriately? We recommend including additional references to support the claim that F-actin has a mechanical role (line 242). Could the authors compare their proteomics data to other mass spectrometry studies conducted on the Manchette (for example, see reference 4)?

      We added the comparison, but it is important to point out that in reference 4 the manchettes were isolated from mouse testes.

      • Are the text and figures clear and accurate? Text: We do not see the necessity of specifying the microtubules (MTs) in the data as "manchette MTs" or "mMTs" rather than simply "MTs". However, we recommend that the authors use either "MT" or "mMT" consistently throughout the manuscript.

      We changed to only MTs.

      The authors appear to refer to both dynein-1 (cytoplasmic dynein) and dynein-2 (axonemal dynein or IFT dynein). To avoid confusion, it is important that the authors clearly specify which dynein they are referring to throughout the text. This is particularly relevant as the study aims to demonstrate that IFT is not part of the manchette transport system.

      • Introduction: In the third paragraph (lines 59-75), the authors should specify that they are referring to dynein-2, which is distinct from cytoplasmic dynein discussed in the previous paragraph (lines 44-58).

      We now specify the respective dyneins in the text (lines 66, 140-141, 145).

      • Figure 4D: The authors could fit a dynein-1 motor domain instead of a dynein-2 into the density to stay consistent with the fact that the density belongs to cytoplasmic dynein-1.

      We changed the figure and fitted a cytosolic dynein-1 structure (5nvu) instead.

      Figures: • Figure 2B: The legend mentions a large linker complex; however, this may correspond to two or three separate densities.

      We have addressed this and changed the wording.

      • Figure 4: please revisit the segmentation of this whole figure based on previous comments.

      We revised as suggested.

      • Figures 1, 2, 4, 5, and 6: It would be helpful to state in the legends that the tomograms are denoised. There are stripe-like densities visible in the images (e.g., in the vesicle in Figure 2B). Do these artefacts also appear in the raw data?

      As stated in the Methods section, tomograms were generally denoised with CryoCare for visualisation purposes. The “stripe-like densities” are artefacts of the gold fiducials used for tomogram alignment and appear in the raw data (before denoising).

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? We suggest revising the paragraph title "Dynein-mediated cargo along the manchette" (line 165) to "Dynein-mediated cargo transport along the manchette".

      We have changed this in the revised version.

      We recommend that the authors provide additional evidence to support the interpretation that the observed EM densities correspond to motor proteins. Specifically: • Include scale bars or reference lines indicating the known dimensions of motor proteins, based on previous data, to demonstrate that the observed densities match the expected size.

      The dynein structure is provided for reference. We also added the cytosolic dynein–dynactin as a reference (Fig 5E).

      • Make direct comparisons to existing EM data and highlight morphological similarities.

      We have added a comparison to existing data (Fig 5E).

      In the discussion (lines 249-254), the authors could speculate on alternative roles for the IFT components in the manchette, particularly if they are not part of the IFT trains. We also suggest rephrasing the claim in line 266 to make it more speculative in tone.

      We have addressed this in the revised version (lines 221-230).

      Finally, a schematic overview of the manchette ultrastructure in a spermatid would greatly aid the reader in understanding the material presented.

      We now include a graphical abstract and overviews of isolated manchettes on cryo-EM grids.

      References: 1. Chowdhury, S., Ketcham, S., Schroer, T. et al. Structural organization of the dynein-dynactin complex bound to microtubules. Nat Struct Mol Biol 22, 345-347 (2015). https://doi.org/10.1038/nsmb.2996

      2. Grotjahn, D.A., Chowdhury, S., Xu, Y. et al. Cryo-electron tomography reveals that dynactin recruits a team of dyneins for processive motility. Nat Struct Mol Biol 25, 203-207 (2018). https://doi.org/10.1038/s41594-018-0027-7

      3. Chaaban, S., Carter, A.P. Structure of dynein-dynactin on microtubules shows tandem adaptor binding. Nature 610, 212-216 (2022). https://doi.org/10.1038/s41586-022-05186-y

      4. Hu, W., Zhang, R., Xu, H. et al. CAMSAP1 role in orchestrating structure and dynamics of manchette microtubule minus-ends impacts male fertility during spermiogenesis. Proc Natl Acad Sci USA 120, e2313787120 (2023). https://doi.org/10.1073/pnas.2313787120

      Reviewer #3 (Significance (Required)):

      This study employs cryo-electron tomography (cryo-ET) and proteomics to elucidate the architecture of the manchette. It advances our understanding of the components involved in intracellular transport within the manchette and introduces the following technical and conceptual innovations:

      a) Technical Advances: The authors have visualized the manchette at high resolution using cryo-ET. They optimized a purification pipeline capable of retaining, at least partially, the transport machinery of the manchette. Notably, they observed dynein and putative kinesin motors attached to microtubules, a significant achievement that, to our knowledge, has not been reported previously.

      b) Conceptual Advances: This study provides novel insights into spermatogenesis. The findings suggest that intraflagellar transport (IFT) is unlikely to play a role at this stage of sperm development while shedding light on alternative transport systems. Importantly, the authors demonstrate that actin filaments organize in two distinct ways: clustering parallel to microtubules or forming single filaments.

      This work is likely to be of considerable interest to researchers in sperm development and structural biology. Additionally, it may appeal to scientists studying motor proteins and the cytoskeleton.

      We thank the reviewers for appreciating the significance and novelty of our study.

      The reviewers possess extensive expertise in in situ cryo-electron tomography and single-particle microscopy, including work on dynein-based complexes. Collectively, they have significant experience in the field of cytoskeleton-based transport.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript the authors have done cryo-electron tomography of the manchette, a microtubule-based structure important for proper sperm head formation during spermatogenesis. They also did mass-spectrometry of the isolated structures. Vesicles, actin and their linkers to microtubules within the structure are shown.

      Major:

      The data the conclusions are based on seem very limited and sometimes overinterpreted. For example, only one connection between actin and microtubules was observed, and this is thought to be MACF1 simply based on its presence in the MS.

      Another, and larger concern, is that the authors do a structural study on something that has been purified out of the cell, a process which is extremely disruptive. Vesicles, actin and other cellular components could easily be trapped in this cytoskeletal sieve during the purification process and as such, not be bona fide manchette components. This could create both misleading proteomics and imaging. Therefore, an approach not requiring extraction such as high-pressure freezing, sectioning and room-temperature electron tomography and/or immunoEM on sections to set aside this concern is strongly recommended. As an additional bonus, it would show if the vesicles containing ATP synthase are deformed mitochondria.

      Minor:

      Line 99: "to study IMT with cryo-ET, manchettes were isolated ...(insert from which organism)..."

      Line 102 "...demonstrating that they can be used to study IMT".. can the authors please clarify?

      Line 111 "densities face towards the MT plus-end" How can a density "face" anywhere? For this, it needs to have a defined front and back.

      Line 137: is the "perinuclear ring" the same as the manchette?

      Figure 2B: How did the authors decide not to model the electron density found between the vesicle and the MT at 3 o'clock? Are there no other proteins with a lollipop structure similar to that of ATP synthase, so that this density can be assigned to this protein with such certainty?

      Line 189: "F-actin formed organized bundles running parallel to mMTs" - this observation needs confirming in a less disrupted sample.

      Line 242 remove first comma sign

      Line 363 "a total of 2 datasets" - is this manuscript based on only two tilt-series? Or two datasets from each of the 4 grids? In any case, this is very limited data.

      Significance

      The article is very interesting and, if presented together with the suggested controls, would be informative both to microtubule/motor protein researchers and to those studying spermatogenesis.

    1. eLife Assessment

      This useful study presents a virtual reality-based contextual fear conditioning paradigm for head-fixed mice. Solid evidence supports the claim that the reported methods provide a reliable paradigm for studying contextual fear conditioning in head-fixed mice. The approach provides a way to perform multiphoton imaging of neural circuits, and other techniques that are typically performed in head-fixed animals, during behaviors that have traditionally been studied in freely moving animals.

    2. Reviewer #1 (Public review):

      The authors have developed a contextual fear learning (CFC) paradigm in head-fixed mice that produces freezing as the conditioned response. Typically, lick suppression is the conditioned response in such designs, but this 1) introduces a potential confounding influence of reward learning on neural assessments of aversion learning and 2) does not easily allow comparison of head-fixed studies with extensive previous work in freely moving animals, which use freezing as the primary conditioned response. This report describes 3 versions of this virtual reality CFC paradigm, its validation using place-cell remapping, and provides suggestions for further refinement and application.

      The first part of this study is a report on the development and outcomes of 3 variations of the CFC paradigm in a virtual reality environment. The fundamental design is strong, with head-fixed mice required to run down a linear virtual track to obtain a water reward. Once trained, the water reward is no longer necessary and mice will navigate virtual reality environments. There are rigorous performance criteria to ensure that mice that make it to the experimental stage show very low levels of inactivity prior to fear conditioning. These criteria do result in only 40% of the mice making it to the experimental stage, but high rates of activity in the VR environment are crucial for detecting learning-related freezing. It is possible that further adjustments to the procedure could improve attrition rates.

      Paradigm versions 1 and 2 vary the familiarity of the control context while paradigm versions 2 and 3 vary the inter-shock interval. Version 1 is the most promising, showing the greatest increase in conditioned freezing (~40%) and good discrimination between contexts (delta ~15-20%). Version 2 showed no clear evidence of learning - average freezing at recall day 1 was not different than pre-shock freezing. First lap freezing showed a difference, but this single lap effect is not useful for many of the neural circuit questions that this paradigm is meant to address. Version 3 produces greater freezing and slower extinction than version 2. While the magnitude of the context discrimination is less than that in version 1, further optimization of the VR CFC is likely to produce robust learning and extinction. The authors discuss several options for further optimization.

      The second part of the study is a validation of the head-fixed CFC VR protocol through demonstration that fear conditioning leads to remapping of dorsal CA1 place fields, similar to that observed in freely moving subjects. The results support this aim and largely replicate previous findings in freely moving subjects. One difference from previous work of note is that VR CFC led to remapping of the control environment, not just the conditioning context. The authors present several possible explanations for this lack of specificity to the shock context. While this experiment examined place cell remapping after fear conditioning, it did not attempt to link neural activity to the learned association or freezing behavior.

      In summary, this is an important methodological innovation and this study sets the initial parameters and neuronal validation needed to further optimize a head-fixed CFC paradigm that produces freezing. In the discussion, the authors note the limitations of this study, suggest next steps in refinement, and point to several future directions using this protocol to significantly advance our understanding of the neural circuits of threat-related learning and behavior.

      Comments on revisions:

      The manuscript is much stronger with the additions and revisions the authors provided in their revised submission.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Krishnan et al devised three paradigms to perform contextual fear conditioning in head-fixed mice. Each of the paradigms relied on head-fixed mice running on a treadmill through virtual reality arenas. The authors tested the validity of three versions of the paradigms by using various parameters. The authors have addressed some of my initial concerns in their revised manuscript.

      Strengths:

      The authors have devised three new contextual fear conditioning paradigms in head-fixed mice. The authors tested a number of parameters towards optimization of this approach.

      Weaknesses:

      While some experimental parameters were tested in the manuscript, it appears that a large amount of additional testing and optimization will be required before reliable behavioral responses can be acquired and ultimately for the paradigm(s) to be useful for answering biological questions. One major factor will be optimizing parameters such that head-fixed mice in this paradigm can (largely) recapitulate what is observed in freely behaving mice. This may be challenging however, as they have previously published one of the three paradigms and the extensive additional testing they did in this current manuscript did not greatly improve the experimental setup. This may indicate limited immediate usefulness for the community as significant work likely remains for optimization.

      Achievement of Aims:

      The authors have put a significant amount of work into testing the paradigms, and as a result, progress has been made towards their usefulness in the field. However, a significant amount of optimization likely remains.

      Impact on the field:

      The development of a reliable paradigm for studying contextual fear in head-fixed animals would be a strong contribution to the field as it would enable sophisticated cell and circuit imaging analyses. This study is a good start towards this goal, but significant optimization is required for the paradigm(s) to fully benefit the field - especially to allow those who may have less experience in these approaches to use it in their own research.

    4. Reviewer #3 (Public review):

      Summary:

      Krishnan et al. present a novel contextual fear conditioning (CFC) paradigm using a virtual reality (VR) apparatus to evaluate whether conditioned context-induced freezing can be elicited in head-fixed mice. By combining this approach with two-photon imaging, the authors aim to provide high-resolution insights into the neural mechanisms underlying learning, memory, and fear. Their experiments demonstrate that head-fixed mice can discriminate between threat and non-threat contexts, exhibit fear-related behavior in VR, and show context-dependent variability during extinction. Supplemental analyses further explore alternative behaviors and the influence of experimental parameters, while hippocampal neuron remapping is tracked throughout the experiments, showcasing the paradigm's potential for studying memory formation and extinction processes.

      Strengths:

      Methodological Innovation: The integration of a VR-based CFC paradigm with real-time two-photon imaging offers a powerful, high-resolution tool for investigating the neural circuits underlying fear, learning, and memory.

      Versatility and Utility: The paradigm provides a controlled and reproducible environment for studying contextual fear learning, addressing challenges associated with freely moving paradigms.

      Potential for Broader Applications: By demonstrating hippocampal neuron remapping during fear learning and extinction, the study highlights the paradigm's utility for exploring memory dynamics, providing a strong foundation for future studies in behavioral neuroscience.

      Comprehensive Data Presentation: The inclusion of supplemental figures and behavioral analyses (e.g., licking behaviors and variability in extinction) strengthens the manuscript by addressing additional dimensions of the experimental outcomes.

      Weaknesses:

      Optimization: many parameters remain to be tested in the VR fear conditioning paradigm.

      Extended training and attrition rate: the paradigm requires weeks of training and only 40% of mice reach criteria.

    5. Author response:

      The following is the authors’ response to the original reviews

      We thank all the reviewers for their time and valuable feedback, which helped us improve our manuscript. Based on the comments, we have made several critical changes to the revised manuscript.

      (1) We have changed our threshold for detecting freezing epochs from 1 cm/s to 0 cm/s in this revised manuscript. This change allows us to capture periods when animals are completely still on the treadmill, better matching the "true freezing" behavior seen in freely moving set-ups. We have added a new supplementary video (Supplementary Video 2) that better demonstrates the freezing response we observe. All results and figures in the revised manuscript reflect this updated threshold (Figures 2-6, Supplementary Figures 1-6, Tables 1-6). Our main findings remain robust, demonstrating that freezing serves as a reliable conditioned response in our paradigms, comparable to freely moving animals. Specifically, freezing behavior increased reliably in the fear-conditioned environment following CFC across all paradigms. We have also added data from a no-shock control group (Supplementary Figure 2) which, when compared to the conditioned group, shows that freezing responses in the conditioned group result from fear conditioning rather than immobility. We do observe other avoidance behaviors unique to our treadmill-based task, such as hesitation, backward movement, and slow crawls. These conditioned behaviors are captured through a separate metric: the time taken to complete a lap.

      (2) As suggested by the reviewers, we have separately analyzed fear discrimination and extinction dynamics across recall days (Supplementary Figures 2, 5 and 6, Table 1-6). To assess fear discrimination, we use within-group comparisons to evaluate how well animals differentiate between the two VRs across days. For extinction, we use within-VR comparisons to examine freezing dynamics over time. Freezing across recall days is compared to baseline freezing (pre-conditioning) using a Linear Mixed Effects model (Tables 1-6), with recall days as fixed effects and mouse as a random effect, using baseline freezing as the reference.

      (3) We have expanded the behavioral dataset in Paradigm 1 to investigate the effect of shock amplitude on the conditioned fear response (Supplementary Figure 2 C-E). Consistent with findings in freely moving animals, our data show that increasing shock intensity from 0.6 mA to 1.0 mA leads to stronger freezing. For the revised manuscript, we specifically increased the sample size in the 0.6 mA group (n = 8) in Paradigm 1, as this intensity is used in Paradigm 3. These additional data demonstrate that combining a lower shock amplitude with shorter inter-shock intervals and retaining the tail-coat during recall can enhance freezing, suggesting that these parameters help compensate for lower shock intensity.

      (4) We have increased the sample size of the imaging dataset (now n = 8, Figures 7-8).
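
      The speed-threshold criterion described in point (1) can be illustrated with a short sketch. This is not the authors' analysis code: the function names, the sampling interval `dt_s`, and the minimum epoch duration `min_duration_s` are assumptions introduced here for illustration only. The response states only that the threshold was changed from 1 cm/s to 0 cm/s.

```python
from typing import List, Sequence, Tuple


def freezing_epochs(
    speed_cm_s: Sequence[float],
    dt_s: float,
    threshold_cm_s: float = 0.0,   # 0 cm/s: animal completely still
    min_duration_s: float = 1.0,   # hypothetical minimum epoch length
) -> List[Tuple[int, int]]:
    """Return (start, end_exclusive) sample indices of freezing epochs.

    A sample counts as 'still' when |speed| <= threshold_cm_s; a run of
    still samples is kept only if it lasts at least min_duration_s.
    """
    epochs: List[Tuple[int, int]] = []
    start = None
    for i, v in enumerate(speed_cm_s):
        still = abs(v) <= threshold_cm_s
        if still and start is None:
            start = i                      # a still run begins
        elif not still and start is not None:
            if (i - start) * dt_s >= min_duration_s:
                epochs.append((start, i))  # run was long enough
            start = None
    # Close a run that extends to the end of the recording.
    if start is not None and (len(speed_cm_s) - start) * dt_s >= min_duration_s:
        epochs.append((start, len(speed_cm_s)))
    return epochs


def percent_freezing(speed_cm_s: Sequence[float], dt_s: float, **kwargs) -> float:
    """Percentage of samples that fall inside freezing epochs."""
    if len(speed_cm_s) == 0:
        return 0.0
    frozen = sum(end - start for start, end in freezing_epochs(speed_cm_s, dt_s, **kwargs))
    return 100.0 * frozen / len(speed_cm_s)
```

      Under these assumptions, a slow crawl at 0.5 cm/s would count as freezing with a 1 cm/s threshold but not with the stricter 0 cm/s threshold, which is why behaviors such as hesitation and slow crawls need a separate metric.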

      Finally, we acknowledge that many aspects of this paradigm still require optimization. The head-fixed CFC paradigm is in its early stages compared to the decades of research dedicated to understanding fear learning parameters in freely moving CFC paradigms. While there are numerous parameters that could be tested (both those identified through our own discussions and those raised by the reviewers), it is not feasible for a single lab to conduct a full evaluation of all the possible factors that could influence CFC in the head-fixed prep. A key limitation is that our approach requires robust navigation behavior in the VR without rewards, which requires weeks of training per mouse. It also necessitates larger sample sizes at the outset, as not all animals will meet the behavioral criteria required for CFC. Another important consideration is scalability. Unlike freely moving CFC paradigms, which allow parallel testing of many animals with minimal pre-training, the VR-CFC setup requires several weeks of behavior training and involves a more complex integration of hardware and software to accurately track behavior in virtual space. The number of VR rigs that can be operated simultaneously in a single lab is often limited, making high-throughput testing more challenging. These factors mean that the testing of a single parameter in a group of animals requires approximately 3–4 months to complete. Despite these constraints, we are committed to continuing to refine this paradigm over time. With this manuscript, our main aim was to provide a detailed framework, initial parameters, and evidence for conditioned behavior in the head-fixed preparation. By doing so, we hope to facilitate the adoption of this paradigm by researchers interested in studying the neural correlates of learning and memory using multiphoton imaging and stimulation techniques. This approach enables investigations that are not possible in freely moving animals, while the presence of freezing as a conditioned response allows for direct comparisons to the extensive body of work done in freely moving paradigms. Moving forward, we anticipate that optimizing this paradigm and identifying the key parameters that drive learning will be a collaborative, community-led effort.

      Public Reviews:

      Reviewer #1 (Public review):

      The authors set out to develop a contextual fear learning (CFC) paradigm in head-fixed mice that would produce freezing as the conditioned response. Typically, lick suppression is the conditioned response in such designs, but this (1) introduces a potential confounding influence of reward learning on neural assessments of aversion learning and (2) does not easily allow comparison of head-fixed studies with extensive previous work in freely moving animals, which use freezing as the primary conditioned response.

      The first part of this study is a report on the development and outcomes of 3 variations of the CFC paradigm in a virtual reality environment. The fundamental design is strong, with head-fixed mice required to run down a linear virtual track to obtain a water reward. Once trained, the water reward is no longer necessary and mice will navigate virtual reality environments. There are rigorous performance criteria to ensure that mice that make it to the experimental stage show very low levels of inactivity prior to fear conditioning. These criteria do result in only 40% of the mice making it to the experimental stage, but high rates of activity in the VR environment are crucial for detecting learning-related freezing. It is possible that further adjustments to the procedure could improve attrition rates.

      We acknowledge that further adjustments to the procedure could improve attrition rates, and we will continue to work on improving the paradigm.

      Paradigm versions 1 and 2 vary the familiarity of the control context while paradigm versions 2 and 3 vary the inter-shock interval. Paradigm version 1 is the most promising, showing the greatest increase in conditioned freezing (~40%) and good discrimination between contexts (delta ~15-20%). Paradigm version 2 showed no clear evidence of learning - average freezing at recall day 1 was not different than pre-shock freezing. First-lap freezing showed a difference, but this single-lap effect is not useful for many of the neural circuit questions that this paradigm is meant to address. Also, the claim that mice extinguished first-lap freezing after 1 day is weak. Extinction is determined here by the loss of context discrimination, but this was not strong to begin with. First-lap freezing does not appear to be different between Recall Day 1 and 2, but this analysis was not done.

      This is an important point. Following reviewer suggestions, we have replotted our figures for all paradigms to show within-VR freezing (see Supplementary Figures 2, 5 and 6) as the appropriate method for quantifying fear extinction across days. Using an LME model (Tables 1-6), we quantify freezing during recall days against baseline freezing levels measured before fear conditioning within each VR. In Paradigm 2, while some fear discrimination persists across days, extinction does occur rapidly. After the first lap in the CFC VR, we observed no significant differences in freezing compared to the baseline. These results are shown in the revised Supplementary Figure 5, and the revised text is in lines 393-399.
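
      For readers less familiar with this kind of analysis, one common way to write a model of the type described (recall day as a fixed effect, mouse as a random effect, baseline as the reference level) is sketched below. The notation is introduced here for illustration, and the random-intercept form is an assumption; the response does not specify the random-effects structure or the software used.

```latex
% y_{ij}: freezing (%) of mouse i in session j, within one VR
% Baseline (pre-conditioning) is the reference level of the day factor.
\[
  y_{ij} = \beta_0
         + \sum_{d=1}^{D} \beta_d \,\mathbb{1}\!\left[\mathrm{day}_{ij} = d\right]
         + u_i + \varepsilon_{ij},
  \qquad
  u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
% \beta_0: mean baseline freezing
% \beta_d: change in freezing on recall day d relative to baseline
% u_i:     mouse-specific random intercept (assumed structure)
```

      In this form, a recall day whose coefficient \(\beta_d\) is not significantly different from zero corresponds to freezing that is indistinguishable from baseline, which is the sense in which extinction is assessed within each VR.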

      Paradigm version 3 has some promise, but the magnitude of the context discrimination is modest (~10% difference in freezing). Thus, further optimization of the VR CFC will be needed to achieve robust learning and extinction. This could include factors not thoroughly tested in this study, including context pre-exposure timing and duration and shock intensity and frequency.

      We acknowledge that many aspects of this paradigm still need optimization, as virtual reality CFC is in its early stages, and we have not explored all of the parameter space. We describe above the reasoning for this. However, for this revised version of the paper we have added new behavioral data (Supplementary Figure 2 C-E) showing that increasing shock intensity from 0.6 mA to 1 mA enhances freezing, both in the first lap and on average. There are of course many other parameters that are likely important, like the ones pointed out here by the reviewer, but exploring the entire parameter space will take many years and will likely require many labs. The purpose of this paper is to show that VR-CFC fundamentally works and to provide a starting point on which the field can build. We have now pointed out in the introduction (lines 54-58) and discussion (lines 730-737, 810-814) that there remains significant scope for improving this paradigm and optimizing parameters in the future.

      The second part of the study is a validation of the head-fixed CFC VR protocol through the demonstration that fear conditioning leads to the remapping of dorsal CA1 place fields, similar to that observed in freely moving subjects. The results support this aim and largely replicate previous findings in freely moving subjects. One difference from previous work of note is that VR CFC led to the remapping of the control environment, not just the conditioning context. The authors present several possible explanations for this lack of specificity to the shock context, further underscoring the need for further refinement of the CFC protocol before it can be widely applied. While this experiment examined place cell remapping after fear conditioning, it did not attempt to link neural activity to the learned association or freezing behavior.

      This is an interesting observation. We think that the remapping observed in the control context likely occurred due to the absence of reward in a previously rewarded environment. Our prior work has demonstrated that removal of reward causes increased remapping (Krishnan et al., 2022; Krishnan and Sheffield, 2023). In other words, the continued presence of reward within an environment stabilizes CA1 place fields. In the Moita et al. (2004) study, which showed remapping only in the fear-conditioned context and not in the control context, rats were given food pellets throughout the experimental session in both the control and conditioned contexts, likely to encourage the exploration needed to identify place cells. The presence of reward in the Moita et al. experiment could explain the minimal remapping observed in their control context compared to our control context, which lacked reward. Another possibility lies in the different intervals between place cell activity recordings in our study and that of Moita et al. While Moita et al. separated their recordings by just one hour, our recordings were separated by a full day, with a sleep period in between. The absence of sleep and the shorter time interval between conditioning and retrieval sessions in their study could explain the minimal remapping observed by Moita et al. compared to our findings. We have now addressed this discrepancy explicitly in lines 596-606.

      Although we agree with the reviewer that it would be informative to perform analysis of how neural activity correlates with freezing responses, we think this warrants its own stand-alone manuscript as the neural dynamics and methods to appropriately analyze them are complicated. We are in the midst of analyzing this data further and will present these findings in a separate publication.

      In summary, this is an important study that sets the initial parameters and neuronal validation needed to establish a head-fixed CFC paradigm that produces freezing behaviors. In the discussion, the authors note the limitations of this study, suggest the next steps in refinement, and point to several future directions using this protocol to significantly advance our understanding of the neural circuits of threat-related learning and behavior.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Krishnan et al devised three paradigms to perform contextual fear conditioning in head-fixed mice. Each of the paradigms relied on head-fixed mice running on a treadmill through virtual reality arenas. The authors tested the validity of three versions of the paradigms by using various parameters. As described below, I think there are several issues with the way the paradigms are designed and how the data are interpreted. Moreover, as Paradigm 3 was published previously in a study by the same group, it is unclear to me what this manuscript offers beyond the validations of parameters used for the previous publication. Below, I list my concerns point-by-point, which I believe need to be addressed to strengthen the manuscript.

      Major comments

      (1) In the analysis using the LME model (Tables 1 and 2), I am left wondering why the mice had increased freezing across recall days as well as increased generalization (increased freezing to the familiar context, where shock was never delivered). Would the authors expect freezing to decrease across recall days, since repeated exposure to the shock context should drive some extinction? This is complicated by the analysis showing that freezing was increased only on retrieval day 1 when analyzing data from the first lap only. Since reward (e.g., motivation to run) is removed during the conditioning and retrieval tests, I wonder if what the authors are observing is related to decreased motivation to perform the task (mice will just sit, immobile, not necessarily freezing per se). I think that these aspects need to be teased out.

      This is an important point and we agree teasing out a lack of motivation versus fearful freezing would be useful. To address the possibility that reduced motivation to run without reward could contribute to the observed freezing behavior, we have now included a no-shock control group in the revised manuscript (n = 7; Supplementary Figure 2A-B, H–I). These control mice experienced the same protocol, including the wearing of a tail coat, but did not receive any shocks. We observed no increases in freezing across days in these controls, confirming that the increased freezing in the Familiar context of our experimental group stems from fear conditioning rather than the removal of reward from a previously rewarded context. If reduced motivation from reward removal were the primary driver, similar freezing patterns would have emerged in the no-shock controls. We have added lines 248-261 in the revised manuscript, discussing this point, and we thank the reviewer for motivating us to do this experiment and analysis.

      That said, the precise mechanisms underlying the fear generalization observed in the nonconditioned context—particularly its emergence during later recall days—remain unclear. Studies in freely moving animals have shown that fear memories initially specific to the conditioned context can become generalized with repeated exposures, which may be occurring here (Biedenkapp & Rudy, 2007; Wiltgen & Silva, 2007). Alternatively, it is possible that the combination of fear conditioning and the removal of expected reward contributes to a delayed generalization effect. This may reflect a limitation of our approach, which relies on reward to motivate initial training. As noted by another reviewer, we have now addressed this potential drawback of reward-based training in the discussion (see lines 809-817). Clearly, unique factors specific to the head-fixed VR paradigm may contribute to this phenomenon. Understanding the mechanisms underlying fear generalization in the head-fixed VR CFC paradigm will be a valuable direction for future research.

      (2) Related to point 1, the authors actually point out that these changes could be due to the loss of the water reward. So, in line 304, is it appropriate to call this freezing? I think it will be very important for the authors to exactly define and delineate what they consider as freezing in this task, versus mice just simply sitting around, immobile, and taking a break from performing the task when they realize there is no reward at the end.

      As noted in point 1 above, we have added a no-shock control group (n = 7; Supplementary Figure 2A-B, H–I) to determine whether the observed freezing was driven by fear conditioning or by reduced motivation to run in the absence of reward. The absence of increased freezing in these controls supports the interpretation that the behavior in the conditioned group is fear-related. In future studies, incorporating additional physiological measures, such as heart rate monitoring, could further help distinguish fear-related freezing from other forms of immobility.

      (3) In the second paradigm, mice are exposed to both novel and (at the time before conditioning) neutral environments just before fear conditioning. There is a big chance that the mice are 'linking' the memories (Cai et al 2016) of the two contexts such that there is no difference in freezing in the shock context compared to the neutral context, which is what the authors observe (Lines 333-335). The experiment should be repeated such that exposure to the contexts does not occur on the conditioning day.

      This is an interesting idea. However, if memory linking were driving the observed freezing patterns, we would expect to see similarly reduced fear discrimination across all three paradigms, as mice experience both contexts sequentially in each case. Instead, this effect appears to be specific to Paradigm 2, suggesting that other factors are responsible. We agree it would be informative to eliminate pre-conditioning exposure to both environments to assess whether this improves fear discrimination and helps clarify the potential contribution of memory linking. This is something we plan to do in future studies that are beyond the scope of this initial paper on VR-CFC.

      (4) On lines 360-361, the authors conclude that extinction happens rapidly, within the first lap of the VR trial. To my understanding, that would mean that extinction would happen within the first 5-10 seconds of the test (according to Figure S1E). That seems far too fast for extinction to occur, as this never occurs in freely behaving mice this quickly.

      We agree with the reviewer that extinction in Paradigm 2 appears to occur relatively rapidly. However, the average time to complete the first lap in the fear-conditioned context in Paradigm 2 is 25.68 ± 5.55 seconds (as stated in line 384), indicating that extinction occurs within approximately the first 30 seconds of context exposure, not within 5–10 seconds. This is specific to Paradigm 2 and does not happen in either of the other paradigms, as shown in Supplementary Figure 4. For clarification, Figure S1E pertains to baseline running in Paradigm 1 and does not apply to Paradigm 2.

      As the reviewer points out, even at 30 seconds, extinction seems to be happening more quickly in Paradigm 2 than in freely moving setups. This may be due to a key structural difference in our setup. The VR-CFC task is organized into discrete trials, with mice being teleported back to the start after reaching the end of the virtual track. Completing a full lap without receiving a shock could serve as a clear signal that the threat is no longer present within the environment, because completing a lap means that the animal has surveyed every location within the environment. This structure could accelerate extinction compared to freely moving setups, where animals take longer to explore their complete environment due to the lack of discrete trials. Although this is true for all our paradigms, the accelerated extinction seen in Paradigm 2 versus Paradigms 1 and 3 may be driven by other factors. As noted by the reviewers, other task parameters, such as context pre-exposure timing, shock intensity, and conditioning duration, are likely to play a role in shaping extinction dynamics. These factors warrant further investigation, and we plan to explore them in future studies to better understand the conditions influencing extinction in the VR-CFC paradigm.

      (5) Throughout the different paradigms, the authors are using different shock intensities. This can lead to differences in fear memory encoding as well as in levels of fear memory generalization. I don't think that comparisons can be made across the different paradigms as too many variables (including shock intensity - 0.5/0.6mA can be very different from 1.0 mA) are different. How can the authors pinpoint which works best? Indeed, they find Paradigm 3 'works' better than Paradigm 2 because mice discriminate better between the neutral and shock contexts. This can definitely be driven by decreased generalization from using a 0.6mA shock in Paradigm 3 compared to 1.0 mA shock in Paradigm 2.

      The reviewer brings up important points here. We have now added new data evaluating 0.6 mA shocks in Paradigm 1 (Supplementary Figure 2A–E, n=8). These data show that 1.0 mA shocks produced stronger conditioned responses and greater fear discrimination compared to 0.6 mA. Our goal in Paradigm 3 was to begin with a lower shock intensity and assess whether additional modifications—specifically the shorter ISI and retention of the tail-coat during recall—could enhance fear conditioning. Surprisingly, despite the weaker shock intensity, Paradigm 3 resulted in improved discrimination and freezing behavior relative to Paradigm 2. We have now clarified this point in the manuscript (lines 466-470), and we interpret this outcome as evidence that the shorter ISIs and contextual cue continuity (tail-coat) likely play a more significant role in enhancing learning and recall. However, as noted in the text (lines 511-514), further testing is needed to determine the individual contributions of each parameter to successful VR-CFC. Fully optimizing the parameter settings will take additional time and resources, and we aim to continually refine the parameter space in the future, as has been done over the years for freely moving animals.

      (6) There are some differences in the calcium imaging dataset compared to other studies, and the authors should perform additional testing to determine why. This will be integral to validating their head-fixed paradigm(s) and showing they are useful for modeling circuit dynamics/behaviors observed in freely behaving mice. Moreover, the sample size (number of mice) seems low.

      The one notable difference between our imaging study and that done in freely moving animals is that we observed remapping of place cells in the control context. In contrast, Moita et al. (2004) reported more stable place fields in the control context. A key distinction is that their study included rewards in the control context, which may have contributed to the spatial stability. We now discuss this difference in the manuscript (lines 599-605).

      It should be noted that there are many key distinctions among paradigms that study neural activity during fear conditioning in freely moving animals. These include varying exposure times to environments (1–6 days), the time interval between neural activity recordings, and the use of food rewards during the experiment stages in freely moving animals to encourage exploration for place cell identification. Although freely moving paradigms that investigate fear conditioning and place cells are heterogeneous, we were encouraged by the replication of several key findings. This validates VR-based CFC as a viable tool for neural circuit investigations. While future work will include more thorough analyses, our current findings demonstrate the paradigm's effectiveness for modeling circuit dynamics and behavior. We have now expanded our dataset, which includes four additional mice, further corroborating these original findings.

      (7) It appears that the authors have already published a paper using Paradigm 3 (Ratigan et al 2023). If they already found a paradigm that is published and works, it is unclear to me what the current manuscript offers beyond that initial manuscript.

      The reviewer is correct that we have published a paper using Paradigm 3. However, this manuscript goes beyond that one and provides a much more comprehensive description and fundamental analysis of the behavior and experimental parameters regarding VR-CFC, allowing the research community to adapt our paradigm reproducibly. While Ratigan et al. (2023) offered only a minimal description of behavior and included just Paradigm 3, we present two additional paradigms along with neuronal validation using hippocampal place cells. We have now explicitly stated this in the introduction (lines 50-55).

      (8) As written, the manuscript is really difficult to follow with the averages and standard error reported throughout the text. This reporting in the text occurred heterogeneously throughout the text, as sometimes it was reported and other times it was not. Cleaning this reporting up throughout the paper would greatly improve the flow of the text and qualitative description of the results.

      We completely agree with this point and have now cleaned up the text, leaving details only in a few places we felt were important.

      Reviewer #3 (Public review):

      Summary:

      Krishnan et al. present a novel contextual fear conditioning (CFC) paradigm using a virtual reality (VR) apparatus to evaluate whether conditioned context-induced freezing can be elicited in head-fixed mice. By combining this approach with two-photon imaging, the authors aim to provide high-resolution insights into the neural mechanisms underlying learning, memory, and fear. Their experiments demonstrate that head-fixed mice can discriminate between threat and non-threat contexts, exhibit fear-related behavior in VR, and show context-dependent variability during extinction. Supplemental analyses further explore alternative behaviors and the influence of experimental parameters, while hippocampal neuron remapping is tracked throughout the experiments, showcasing the paradigm's potential for studying memory formation and extinction processes.

      Strengths:

      Methodological Innovation: The integration of a VR-based CFC paradigm with real-time two-photon imaging offers a powerful, high-resolution tool for investigating the neural circuits underlying fear, learning, and memory.

      Versatility and Utility: The paradigm provides a controlled and reproducible environment for studying contextual fear learning, addressing challenges associated with freely moving paradigms.

      Potential for Broader Applications: By demonstrating hippocampal neuron remapping during fear learning and extinction, the study highlights the paradigm's utility for exploring memory dynamics, providing a strong foundation for future studies in behavioral neuroscience.

      Comprehensive Data Presentation: The inclusion of supplemental figures and behavioral analyses (e.g., licking behaviors and variability in extinction) strengthens the manuscript by addressing additional dimensions of the experimental outcomes.

      Weaknesses:

      Characterization of Freezing Behavior: The evidence supporting freezing behavior as the primary defensive response in VR is unclear. Supplementary videos suggest the observed behaviors may include avoidance-like actions (e.g., backing away or stopping locomotion) rather than true freezing. Additional physiological measurements, such as EMG or heart rate, are necessary to substantiate the claim that freezing is elicited in the paradigm.

      To strengthen our claim that freezing is a conditioned response in this task, we have taken three key steps:

      (1) We adjusted our freezing detection threshold from 1 cm/s to near 0 cm/s to capture only periods where the animal is virtually motionless on the treadmill. We validated this approach in Figure 2, particularly in the zoomed-in track position trace in Figure 2A, which clearly shows that the identified freezing epochs correspond to no change in track position. All analyses and figures have been updated to reflect this more stringent threshold.

      (2) We have added a no-shock control group in the revised manuscript (n = 7; Supplementary Figure 2A-B, H–I) where mice experienced the same protocol, including wearing a tail-coat, but received no shocks. These mice showed no increases in freezing behavior, which further demonstrates that the increased freezing we observe is a result of fear conditioning.

      (3) We have added a new supplementary video (Supplementary Video 2) that better illustrates the freezing behavior in our task.

      That said, we fully agree with the reviewer that freezing is not the only defensive response observed. Other behaviors—such as hesitation, backward movement, and slowing down—also emerge that are unique to our treadmill-based paradigm. We chose to focus on freezing in this manuscript to align with convention in freely moving fear conditioning studies and to facilitate direct comparisons. We agree that additional physiological measurements (e.g., EMG or heart rate) would provide further validation and could help distinguish between different forms of defensive responses. We view this as an important future direction and plan to incorporate such measures in upcoming studies. We highlight this in the results section (lines 175-179, 262-268) and in the discussion (lines 739-750).

      Analysis of Extinction: Extinction dynamics are only analyzed through between-group comparisons within each Recall day, without addressing within-group changes in behavior across days. Statistical comparisons within groups would provide a more robust demonstration of extinction processes.

      This is an important distinction and we have now added figures (Supplementary Figures 2H-I, 5C-D, 6C-D) showing within-VR behavior across Recall days, along with statistical comparisons and a description of the extinction process based on these results.

      Low Sample Sizes: Paradigm 1 includes conditions with very low sample sizes (N=1-3), limiting the reliability of statistical comparisons regarding the effects of shock number and intensity. Increasing sample sizes or excluding data from mice that do not match the conditions used in Paradigms 2 and 3 would improve the rigor of the analysis.

      While we included all conditions in Figure 2 for completeness, we have separated these conditions in Supplementary Figure 2 to ensure clarity. This allows researchers interested in this paradigm to see the approximate range of conditioned responses observed across different parameters. When comparing Paradigm 1 with Paradigms 2 and 3, we used only data from the 1 mA, six-shock condition.

      Potential Confound of Water Reward: The authors critique the use of reward in conjunction with fear conditioning in prior studies but do not fully address the potential confound introduced by using water reward during the training phase in their own paradigm.

      We agree this is a point that needs discussion. We have now noted the limitation of using water rewards during training in the discussion section, particularly its effect on the animal’s motivation in the long term and on place cell activity (lines 814-820).

      Recommendations for the authors

      Reviewer #1 (Recommendations for the authors):

      I suggest changing "3 paradigms" to "3 versions of a CFC paradigm," as the paradigm is fundamentally the same, but parameters were adjusted towards finding an optimal protocol.

      We have changed this phrasing where applicable.

      Figure S2: There appear to be different sets of shock parameters for different mice, most with an n of 1 or 2. This is not reliable for making a decision for optimal shock parameters and should not be discussed in that way until a full-powered comparison is completed. Also, the N adds up to 19, yet only 18 are described as being included in the study.

      We thank the reviewer for this important point. We agree that the current study is not powered to definitively identify optimal parameter settings. We have been careful not to interpret it in that way in the text. Rather, we adopted a commonly used starting point from the freely moving literature—1 mA with six shocks—as our initial condition (lines 196-199). To provide context for others interested in pursuing this work, we have presented a range of conditioned responses from different parameter combinations to illustrate potential variability. In most cases, these data are intended for illustrative purposes only and are not meant to support firm conclusions. We agree that a systematic and fully powered investigation of each parameter would be highly valuable, and we plan to pursue this in future work (and hope other labs contribute to this goal, too), much like the iterative optimizations performed in freely moving paradigms over time.

      We thank the reviewer for catching the sample size discrepancy and have now corrected it.

      The number of animals for the no-shock condition should be included.

      Thank you. We have now included this.

      A possible explanation for the lower fear and poorer discrimination in versions 2 and 3 could be that 10 min pre-exposure to the CFC context on day -1 led to latent inhibition. Shorter (or eliminated) pre-exposure may improve outcomes.

      We agree that the exposure time is a parameter that we should explore. We have highlighted this in the discussion (lines 729-736) as a parameter that is worth testing in the future.

      For analysis of extinction, it is best to establish this within condition - is freezing to the CFC context significantly reduced compared with initial recall and similar to pre-training freezing? By using discrimination as your index of extinction, increases in control context freezing/inactivity can eliminate context discrimination without the conditioned response of freezing actually undergoing extinction.

      This is a good point, and we have now included analysis and conclusions based on a within-VR comparison for the analysis of fear extinction (Supplementary Figures 2H-I, 5C-D, 6C-D).

      Reviewer #3 (Recommendations for the authors):

      Clarification of Treadmill Shape: The manuscript describes the treadmill as "spherical" throughout. However, based on representative images and videos, the treadmill appears cylindrical. This discrepancy should be clarified to ensure consistency between the text and visuals.

      The reviewer is correct that the treadmill is cylindrical, and this was an error on our part. We have corrected it throughout.

      Figure and Legend Labeling: To improve clarity, all figures and their legends should be explicitly labeled with the corresponding paradigm (1, 2, or 3) to facilitate interpretation.

      We have now added a label on all figures that clarifies which Paradigm the figures are referring to. We have also explicitly added this to the figure legends.

      Objective Language: Subjective language, such as "since we wanted animals to" (Line 850), should be revised to reflect an objective tone (e.g., "to allow animals to"). Similarly, phrases like "We believe" (Line 896) should be avoided to maintain an unbiased presentation.

      We have removed subjective language from our text.

      Placement of Future Directions: Speculations on future experimental plans, such as the use of sex as a biological variable (Lines 895-903), should be included in the Discussion section rather than the Methods. Additionally, remarks about the responsiveness of female mice to tail shocks should be moved to the main text for proper contextualization.

      We have moved these lines as suggested by the reviewer.

    1. Whether the mean is the best summary depends on what you are using it for :-), i.e. your objective.
      • Whether the mean is the best summary metric also depends on the distribution of the variable.
      • If the variable is approximately normally distributed, the mean is a good summary metric; if it is skewed or has outliers, the median is usually better.
      • We can use the function shapiro.test() to test the normality of a variable.
      • Because shapiro.test() accepts at most 5000 values, we can run the test on the first 5000 elements of gss_cat$tvhours.

      library(tidyverse)  # loads forcats (gss_cat) and the %>% pipe
      gss_cat$tvhours %>% head(5000) %>% shapiro.test()

      Result: p-value < 0.01, so we reject the hypothesis that tvhours is normally distributed. The mean is therefore not a good summary metric for this variable, and the median is the better choice.
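      The point about mean versus median can be illustrated with a small simulation. The sketch below is written in Python (standard library only) rather than R, and it uses simulated data, not gss_cat: the names symmetric, skewed, and summarize are hypothetical, and the distributions and their parameters are assumptions chosen only to show the effect of skew.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data (not gss_cat): one symmetric and one right-skewed sample.
symmetric = [random.gauss(3.0, 1.0) for _ in range(5000)]
skewed = [random.expovariate(1 / 3.0) for _ in range(5000)]


def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return statistics.mean(values), statistics.median(values)


sym_mean, sym_median = summarize(symmetric)
skw_mean, skw_median = summarize(skewed)

# For the symmetric sample the two summaries nearly coincide;
# for the skewed sample the long right tail pulls the mean above the median.
print(f"symmetric: mean={sym_mean:.2f}, median={sym_median:.2f}")
print(f"skewed:    mean={skw_mean:.2f}, median={skw_median:.2f}")
```

      When the two summaries diverge like this, the median describes a typical value better than the mean, which is the same conclusion the Shapiro-Wilk result above points to for tvhours.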

    1. I used a saw to just make the one edge of that Phillips head a little bit deeper so that I could then use a flat top screwdriver in there to remove that screw the next time. That way, when I put that screw back in there, I could remove it later just using a flat head screwdriver instead of a Phillips. Now this was a little bit of a ratchet job, but it did the trick, and with that I had a working electric typewriter.

      Sarah Everett suggests using a saw to deepen one slot of a stripped Phillips head screw so that it can be driven like a flat head screw.

    1. Codecov Report
       All modified and coverable lines are covered by tests ✅
       Project coverage is 56.27%. Comparing base (768bc0b) to head (27435bf). Report is 1 commits behind head on main.
       Additional details and impacted files
       @@           Coverage Diff           @@
       ##             main    #6780   +/-   ##
       =======================================
         Coverage   56.27%   56.27%
       =======================================
         Files         509      509
         Lines       32580    32580
         Branches     3099     3099
       =======================================
         Hits        18336    18336
         Misses      13386    13386
         Partials      858      858

      CHECK_COMPATIBILITY

    1. The anatomy and physiology of this region are unique and complex. Function and appearance are critical to patients images quality‑of‑life, most patients with head and neck squamous cell carcinoma are middle‑aged, adult males in lower socioeconomic classes who are chronic tobacco chewer and alcohol consumers have advanced tumors. These patients tend to less conscious and to have less social support than most cancer patients.

      ① The anatomy and physiology of this region are unique and complex.

      ② Function and appearance are critically important to patients' perceived quality of life.

      ③ Most patients with head and neck squamous cell carcinoma are middle-aged adult men from lower socioeconomic classes; they are generally chronic tobacco chewers and alcohol consumers, and their tumors are at an advanced stage.

      ④ Compared with most cancer patients, these patients tend to be less aware and to have less social support.

    2. Head and neck cancer is a term used to define cancer that develops in the mouth, throat, nose, salivary glands, oral cancers or other areas of the head and neck. Most of these cancers are squamous cell carcinomas, or cancers that begin in the lining of the mouth, nose and throat. Eighty-five percent of head and neck cancers are linked to tobacco use, and 75 percent are associated with a combination of tobacco and alcohol use (tobacco and alcohol are strong synergistic effects of oral cancer). Human papilloma virus especially types 16 and 18 are known risk factors (there are over 100 variables) and independent causative factor for oral cancer. Because of their location, head and neck tumors and treatment-related side effects may impair patients’ ability to eat, swallow and breathe.

      ① Head and neck cancer is a term used to describe cancers that develop in the mouth, throat, nose, salivary glands, or other regions of the head and neck, including oral cancers.

      ② Most of these cancers are squamous cell carcinomas, which begin in the cells lining the surface of the mouth, nose, and throat.

      ③ Eighty-five percent of head and neck cancers are linked to tobacco use,

      ④ and 75 percent are associated with a combination of tobacco and alcohol use (tobacco and alcohol are strong synergistic effects of oral cancer). ④ ve %75’i tütün ve alkol kullanımının birleşimiyle ilişkilidir (tütün ve alkol, ağız kanseri üzerinde güçlü sinerjistik etkiye sahiptir).

      ⑤ Human papilloma virus especially types 16 and 18 are known risk factors (there are over 100 variables) and independent causative factor for oral cancer. ⑤ İnsan papilloma virüsü, özellikle 16 ve 18 tipleri, bilinen risk faktörleri olup (100’den fazla varyantı vardır), ağız kanseri için bağımsız bir neden olarak kabul edilmektedir.

      ⑥ Because of their location, head and neck tumors and treatment-related side effects may impair patients’ ability to eat, swallow and breathe. ⑥ Bulundukları konum nedeniyle, baş ve boyun tümörleri ile tedaviye bağlı yan etkiler, hastaların yeme, yutma ve nefes alma yetilerini bozabilir.

    1. And over the head of the master is always an image of felt, like a doll or statuette, which they call the brother of the master;

      This passage reveals the deeply spiritual and symbolic nature of Mongol domestic life, with each dwelling containing felt figures that represent both personal and household guardians. It’s notable that Rubruck focuses on these figures not as mere decorations, but as part of daily ritual life, emphasizing how the Mongols integrated their belief system into the structure of their homes. This challenges Western stereotypes of the Mongols as purely nomadic warriors, showing instead a rich and organized spiritual culture that informed their everyday routines.

  4. inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net
    1. Related to the fear of naming is the insistence of schools on "sanitizing" the curriculum, or what Jonathan Kozol many years ago called "tailoring" important men and women for school use. Kozol described how schools manage to take the most exciting and memorable heroes and bleed the life and spirit completely out of them because it can be dangerous, he wrote, to teach a history "studded with so many bold, and revolutionary, and subversive, and exhilarating men and women."

      This is another example of how schools steer us away from uncomfortable truths: by neglecting complex pieces of history, or by performing what they might call the “sanitary” action of removing them. The fear of naming goes a long way toward explaining why racism and other forms of injustice are so rarely tackled head-on in classrooms: they are too disruptive. But Kozol's book adds another layer: even when significant figures are included, they are stripped of their radical and inspirational qualities to become more “acceptable,” which effectively erases their true presence. Taking this detour deprives students of a deeper, more empowering understanding of history, and that is a tragedy.

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Ning et al. reported that Bcas2 played an indispensable role in zebrafish primitive hematopoiesis by sequestering β-catenin in the nucleus. The authors showed that loss of Bcas2 caused primitive hematopoietic defects in zebrafish. They found that Bcas2 deficiency promoted β-catenin nuclear export in a CRM1-dependent manner in vivo and in vitro. They further validated that BCAS2 directly interacted with β-catenin in the nucleus and enhanced β-catenin accumulation through its CC domains. They provide a novel insight into Bcas2, which is critical for zebrafish primitive hematopoiesis by regulating nuclear β-catenin stabilization rather than through its canonical pre-mRNA splicing functions. Overall, the study is impressive and well-performed, although there are also some issues to address.

      Strengths:

      The study unveils a novel function of Bcas2, which is critical for zebrafish primitive hematopoiesis by sequestering β-catenin. The authors validated the results in vivo and in vitro. Most of the figures are clear and convincing. This study nicely complements the function of Bcas2 in primitive hematopoiesis.

      Weaknesses:

      A portion of the figures were over-exposed.

      Thank you for the time reviewing our manuscript. We agree with your suggestion and the exposure of Figure 5C and Figure 7E has been reduced. We hope that the revisions will meet your expectation.

      Reviewer #2 (Public Review):

      Summary:

      Ning and colleagues present studies supporting a role for breast carcinoma amplified sequence 2 (Bcas2) in positively regulating primitive wave hematopoiesis through amplification of beta-catenin-dependent (canonical) Wnt signaling. The authors present compelling evidence that zebrafish bcas2 is expressed at the right time and place to be involved in primitive hematopoiesis, that there are primitive hematopoietic defects in hetero- and homozygous mutant and knockdown embryos, that Bcas2 mechanistically positively regulates canonical Wnt signaling, and that Bcas2 is required for nuclear retention of B-cat through physical interaction involving armadillo repeats 9-12 of B-cat and the coiled-coil domains of Bcas2. Overall, the data and writing are clean, clear, and compelling. This study is a first-rate analysis of a strong phenotype with highly supportive mechanistic data. The findings shed light on the controversial question of whether, when, and how canonical Wnt signaling may be involved in hematopoietic development. We detail some minor concerns and questions below, which if answered, we believe would strengthen the overall story and resolve some puzzling features of the phenotype. Notwithstanding these minor concerns, we believe this is an exceptionally well-executed and interesting manuscript that is likely suitable for publication with minor additional experimental detail and commentary.

      Strengths:

      (1) The study features clear and compelling phenotypes and results.

      (2) The manuscript narrative exposition and writing are clear and compelling.

      (3) The authors have attended to important technical nuances sometimes overlooked, for example, focusing on different pools of cytosolic or nuclear b-catenin.

      (4) The study sheds light on a controversial subject: regulation of hematopoietic development by canonical Wnt signaling and presents clear evidence of a role.

      (5) The authors present evidence of phylogenetic conservation of the pathway.

      Weaknesses:

      (1) The authors present compelling data that Bcas2 regulates nuclear retention of B-cat through physical association involving binding between the Bcas2 CC domains and B-cat arm repeats 9-12. Transcriptional activation of Wnt target genes by B-cat requires physical association between B-cat and Tcf/Lef family DNA binding factors involving key interactions in Arm repeats 2-9 (Graham et al., Cell 2000). Mutually exclusive binding by B-cat regulatory factors, such as ICAT that prevent Tcf-binding is a documented mechanism (e.g. Graham et al., Mol Cell 2002). It would appear - based on the arm repeat usage by Bcas2 (repeats 9-12)-that Bcas2 and Tcf binding might not be mutually exclusive, which would support their model that Bcas2 physical association with B-cat to retain it in the nucleus would be compatible with co-activation of genes by allowing association with Tcf. It might be nice to attempt a three-way co-IP of these factors showing that B-cat can still bind Tcf in the presence of Bcas2, or at least speculate on the plausibility of the three-way interaction.

      We appreciate your assessment and generous comments for the manuscript. As you mentioned, the binding sites for TCF on β-catenin almost do not overlap with those for BCAS2. It is likely that BCAS2-mediated nuclear sequestration of β-catenin would be compatible with the initiation of gene transcription by allowing TCF to associate with β-catenin. To test this possibility, we have taken your suggestion and performed co-IP assays. The results showed that β-catenin still bound with TCF4 in the presence of BCAS2 (Supplemental Figure 12), confirming that the binding of BCAS2 to β-catenin would not interfere with the formation of β-catenin/TCF complex.

      (2) A major way that canonical Wnt signaling regulates hematopoietic development is through regulation of the LPM hematopoietic competence territories by activating expression of cdx1a, cdx4, and their downstream targets hoxb5a and hoxa9a (Davidson et al., Nature 2003; Davidson et al., Dev Biol 2006; Pilon et al., Dev Biol 2006; Wang et al., PNAS 2008). Could the authors assess (in situ) the expression of cdx1a, cdx4, hoxb5a, and hoxa9a in the bcas2 mutants?

      We agree with your suggestion and have examined the expression of cdx4 and hoxa9a by performing WISH. Diminished expression of cdx4 and hoxa9a was detected in the lateral plate mesoderm of bcas2<sup>+/-</sup> embryos at the 6-somite stage (Supplemental Figure 7).

      (3) The authors show compellingly that even heterozygous loss of bcas2 has strong Wnt-inhibitory effects. If Bcas2 is required for canonical Wnt signaling and bcas2 is expressed ubiquitously from the 1-cell stage through at least the beginning of gastrulation, why do bcas2 KO embryos not have morphological axis specification defects consistent with loss of early Wnt signaling, like loss of head (early), or brain anteriorization (later)? Could the authors provide some comments on this puzzle? Or if they do see any canonical Wnt signaling patterning defects in het- or homozygous embryos, could they describe and/or present them?

      You have raised an interesting question. In fact, we did not observe ventralization or axis determination defects in the early embryos of bcas2<sup>+/-</sup> mutants. Even in the very small number of homozygous mutant embryos, we did not find such morphological defects. Given that the homozygous and heterozygous mutant embryos were derived from crossing bcas2<sup>+/-</sup> males with bcas2<sup>+/-</sup> females, maternal Bcas2 might still remain and function in these embryos during gastrulation when axis determination and neural patterning took place. Accordingly, we have expanded our discussion to incorporate these insights (Line 565-572).

      Reviewer #3 (Public Review):

      Summary:

      This manuscript utilized zebrafish bcas2 mutants to study the role of bcas2 in primitive hematopoiesis and further confirms that it has a similar function in mice. Moreover, they showed that bcas2 regulates the transition of hematopoietic differentiation from angioblasts via activating Wnt signaling. By performing a series of biochemical experiments, they also showed that bcas2 accomplishes this by sequestering b-catenin within the nucleus, rather than through its known function in pre-mRNA splicing.

      Strengths:

      The work is well-performed, and the manuscript is well-written.

      Weaknesses:

      Several issues need to be clarified.

      (1) Is wnt signaling also required during hematopoietic differentiation from angioblasts? Can the authors test angioblast and endothelial markers in embryos with wnt inhibition? Also, can the authors add export inhibitor LMB to the mouse mutants to test if sequestering of b-catenin by bcas2 is conserved during primitive hematopoiesis in mice?

      Thank you very much for your appreciation and detailed assessment. To test whether Wnt signaling is also required during hematopoietic differentiation from angioblasts, wild-type embryos were exposed to 10 µM CCT036477, a small molecule β-catenin antagonist, from 9 hpf and then collected for WISH experiments. As shown in Supplemental Figure 8, the expression of hemangioblast markers npas4l, scl, and gata2 and endothelial marker fli1a remained unchanged, but the expression of erythroid progenitor marker gata1 was significantly reduced. These results suggest that canonical Wnt pathway may not be required for the generation of hemangioblasts or their endothelial differentiation, but is pivotal for their hematopoietic differentiation.

      It is quite difficult to validate the conserved role of BCAS2 during primitive hematopoiesis in mice, because the toxicity of LMB may cause severe adverse effects in mice.[1,2]

      (2) Bcas2 is required for primitive myelopoiesis in ALM. Does bcas2 play a similar function in primitive myelopoiesis, or is bcas2/b-catenin interaction more important for hematopoietic differentiation in PLM?

      You have raised an important question. In our study, we have demonstrated that the expression of myeloid progenitor marker pu.1 was significantly decreased in bcas2 mutants, hinting that Bcas2 is pivotal for primitive myelopoiesis. To further clarify the function of Bcas2 in primitive myelopoiesis, we injected 8 ng of bcas2 morpholino into Tg(coro1a:GFP) embryos at the 1-cell stage and examined β-catenin distribution at 17 hpf via immunostaining. We observed a significant decline of nuclear β-catenin in primitive myeloid cells (Supplemental Figure 9), indicating that Bcas2 is highly likely to play a similar role in sequestering β-catenin within the nucleus during primitive myelopoiesis.

      (3) Is it possible that CC1-2 fragment sequester b-catenin? The different phenotypes between this manuscript and the previous article (Yu, 2019) may be due to different mutations in bcas2. Is it possible that the bcas2 mutation in Yu's article produces a complete CC1-2 fragment, which might sequester b-catenin?

      This is an interesting perspective. To test the possibility that CC1-2 sequesters β-catenin, mRNA encoding the CC domains of BCAS2 was co-injected with bcas2 morpholino into Tg(gata1:GFP) embryos at the one-cell stage. Increased nuclear β-catenin levels were detected in the GFP-positive hematopoietic progenitor cells at 16 hpf (Supplemental Figure 11). Our findings support the idea that the CC1-2 fragment of BCAS2 can sequester β-catenin within the nucleus.

      In the previous article (Yu, 2019), a 5-base deletion in the third exon of BCAS2 was produced by TALEN, so the CC domains of this mutant should be affected. It is therefore difficult to conclude that the mutant BCAS2 protein in Yu’s study still associates with β-catenin.

      (4) Can the author clarify what embryos the arrows point to in SI Figure 2D? In SI Figure 6B and B', can the author clarify how the nucleus and cytoplasm are bleached? In B, the nucleus also appears to be bleached.

      Thank you for your query and suggestion. In our revisions, the corresponding clarifications have been supplemented (Line 239-242; Line 978-979).

      We acknowledge that the nuclei in both the BCAS2 overexpression group and the control group were slightly bleached. However, because we performed real-time analysis of fluorescence recovery after photobleaching and observed a much slower recovery of cytoplasmic fluorescence in BCAS2-overexpressing cells, the conclusion that BCAS2 inhibits the nuclear export of β-catenin, but not its nuclear import, remains unchanged.

      Reviewer #1 (Recommendations For The Authors):

      Major concerns:

      (1) In this study, the authors detected β-catenin distribution in erythrocytes (gata1-GFP+ cells). Estimating the β-catenin distribution in the myeloid cells is recommended.

      Thank you for your assessment; we have taken your suggestion. Embryos of the Tg(coro1a:GFP) line, which is commonly used to track both macrophages and neutrophils,[3] were injected with 8 ng of bcas2 morpholino at the 1-cell stage and collected for immunostaining to examine the β-catenin distribution at 17 hpf. We observed a significant decline of nuclear β-catenin in primitive myeloid cells (Supplemental Figure 9). This result indicates that Bcas2 is highly likely to play a similar role in sequestering β-catenin within the nucleus during primitive myelopoiesis.

      (2) The reduced nuclear localization of β-catenin in Figure 3H required further evidence. It would be helpful if the authors quantified the fluorescence intensity in the cell nucleus and cytoplasm. Meanwhile, the figures (Figure 5C, Figure 7E) were over-exposed. Please validate these figures.

      Thank you for your suggestions. We agree with you that the fluorescence intensity of β-catenin in the nucleus and cytoplasm should be quantified. However, as the nucleus comprises a large part of the cell, we believe it would be more appropriate to quantify the relative fluorescence intensity by dividing the fluorescence intensity of nuclear β-catenin by the fluorescence intensity of DAPI.

      Such quantifications have been added for Figure 3G, 5C, 7E, S9A, and S13A. In addition, we have reduced the exposure of Figure 5C and Figure 7E. We hope that you will be satisfied with the revisions.
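
      For illustration only, and not as the authors' actual pipeline, the normalization described above could be sketched in Python as follows. The function name, the use of mean pixel intensities, and the precomputed nuclear mask are all assumptions; the response states only that the nuclear β-catenin intensity was divided by the DAPI intensity.

```python
def relative_nuclear_intensity(protein_img, dapi_img, nuclear_mask):
    """Mean protein signal inside the nucleus divided by mean DAPI signal there.

    All three arguments are 2D grids (lists of rows) of the same shape.
    nuclear_mask holds truthy values for pixels that belong to the nucleus.
    Using the mean over masked pixels is an assumption made for this sketch.
    """
    protein_vals = []
    dapi_vals = []
    for p_row, d_row, m_row in zip(protein_img, dapi_img, nuclear_mask):
        for p, d, m in zip(p_row, d_row, m_row):
            if m:
                protein_vals.append(float(p))
                dapi_vals.append(float(d))

    if not dapi_vals:
        raise ValueError("nuclear_mask selects no pixels")

    dapi_mean = sum(dapi_vals) / len(dapi_vals)
    if dapi_mean == 0:
        raise ValueError("DAPI signal is zero inside the nuclear mask")

    protein_mean = sum(protein_vals) / len(protein_vals)
    return protein_mean / dapi_mean
```

      Dividing by the DAPI signal in the same region gives a per-nucleus ratio, so cells can be compared even when their overall staining or imaging brightness differs.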

      (3) The authors used cKO mice to validate that the erythrocytes were eliminated. It would be interesting to detect β-catenin distribution by immunofluorescent staining in primitive hematopoietic cells in cKO mice. Addressing this issue can provide further evidence to support the conservation of Bcas2.

      We appreciate your suggestion. However, we found that red blood cells were almost eliminated in the yolk sac of Bcas2<sup>F/F</sup>;Flk1-Cre mice at E12.5. It is difficult to further detect β-catenin distribution in primitive erythroid cells in these mice.

      (4) The authors discovered that Bcas2 mediated β-catenin nuclear export in a CRM1-dependent manner. CRM1 is a key regulator involved in the majority of factors of nuclear export via recognizing specific nuclear export signals (NES). Validating the NES of Bcas2 is recommended. Furthermore, I wonder about the relationship between Bcas2 and CRM1 in regulating β-catenin nuclear export. One possibility is that Bcas2 covers the NES to inhibit the interaction between CRM1 and β-catenin, thus leading to β-catenin accumulation in the cell nucleus. The authors should discuss this possibility accordingly.

      Thank you for providing an interesting perspective. CRM1-mediated nuclear export of β-catenin usually requires CRM1 recognition and binding with the NES sequences in chaperone proteins, such as APC, Axin and Chibby.[4-6] Moreover, CRM1 can bind directly to and function as an efficient nuclear exporter for β-catenin.[7] Since BCAS2 has not been reported to contain any recognizable NES sequences, it will be interesting to investigate whether BCAS2 competitively inhibits β-catenin from associating with CRM1, or with the chaperone proteins. We have rewritten the discussion on CRM1-dependent nuclear export of β-catenin in line with your comments (Line 572-578).

      (5) It would be interesting if the authors could answer the specificity in Bcas2-mediated protein nuclear export pathway. The authors should detect other classical factors (CRM1 mediated) distribution when loss of Bcas2.

      Thank you for bringing up this point. To test whether BCAS2 specifically regulates CRM1-mediated nuclear export of β-catenin, we have investigated the nucleocytoplasmic distribution of other known CRM1 cargoes, such as ATG3 and CDC37L.[8] BCAS2 overexpression in HeLa cells slightly enhanced the nuclear localization of CDC37L, and had no significant impact on that of ATG3 (Supplemental Figure 11), indicating the specificity of BCAS2 in the regulation of CRM1-dependent nuclear export of β-catenin.

      Minor concerns:

      (1) The name "bcas2Δ7+/- and bcas2Δ14+/-" should be changed into "bcas2+/Δ7 and bcas2+/Δ14"(+/Δ7 or +/Δ14 should be superior on the right).

      Thank you for your suggestion. We have changed the names of the mutants throughout the manuscript.

      (2) The scale bar position in the figures should be unified.

      We agree with your suggestion and have unified the scale bar position in all figures.

      (3) In Figure 4E, "Nuclear" should be changed into "Nucleus".

      We apologize for the mistake and Figure 4E has been revised.

      (4) There are some unaesthetic issues in the figures. The figures need to be further edited. Figure 3H "β-catenin and Merge", Figure 4D "Merge". All these words should be centered in the figures.

      Thank you. We have edited all the figures to ensure that the text is centered.

      Reviewer #2 (Recommendations For The Authors):

      (1) It would be nice to have whole blot images for the Westerns in Supplementary Info.

      Thank you for your suggestion. Whole images for immunoblotting have been supplemented as Source data.

      (2) Line 292 change 5 hpf to 5 dpf.

      (3) Line 301 change "primary" to "primitive"?

      We apologize for the mistakes. We have incorporated these suggestions in the revised manuscript and reexamined spelling throughout the paper.

      (4) Figure S2C: is "Maker" a typographical error? Change to "ladder"?

      We apologize for this typographical error and we have revised it in Figure S2C.

      References

      (1) Ishizawa J, Kojima K, Hail N, Tabe Y, Andreeff M. Expression, function, and targeting of the nuclear exporter chromosome region maintenance 1 (CRM1) protein. Pharmacology & Therapeutics. 2015;153:25-35.

      (2) Li X, Feng Y, Yan MF, et al. Inhibition of Autism-Related Crm1 Disrupts Mitosis and Induces Apoptosis of the Cortical Neural Progenitors. Cerebral Cortex. 2020;30(7):3960-3976.

      (3) Li L, Yan B, Shi YQ, Zhang WQ, Wen ZL. Live Imaging Reveals Differing Roles of Macrophages and Neutrophils during Zebrafish Tail Fin Regeneration. Journal of Biological Chemistry. 2012;287(30):25353-25360.

      (4) Neufeld KL, Nix DA, Bogerd H, et al. Adenomatous polyposis coli protein contains two nuclear export signals and shuttles between the nucleus and cytoplasm. Proceedings of the National Academy of Sciences of the United States of America. 2000;97(22):12085-12090.

      (5) Li FQ, Mofunanya A, Harris K, Takemaru KI. Chibby cooperates with 14-3-3 to regulate β-catenin subcellular distribution and signaling activity. Journal of Cell Biology. 2008;181(7):1141-1154.

      (6) Cong F, Varmus H. Nuclear-cytoplasmic shuttling of Axin regulates subcellular localization of β-catenin. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(9):2882-2887.

      (7) Ki H, Oh M, Chung SW, Kim K. β-Catenin can bind directly to CRM1 independently of adenomatous polyposis coli, which affects its nuclear localization and LEF-1/β-catenin-dependent gene expression. Cell Biology International. 2008;32(4):394-400.

      (8) Kirli K, Karaca S, Dehne HJ, et al. A deep proteomics perspective on CRM1-mediated nuclear export and nucleocytoplasmic partitioning. eLife. 2015;4.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Using a cross-modal sensory selection task in head-fixed mice, the authors attempted to characterize how different rules reconfigured representations of sensory stimuli and behavioral reports in sensory (S1, S2) and premotor cortical areas (medial motor cortex or MM, and ALM). They used silicon probe recordings during behavior, a combination of single-cell and population-level analyses of neural data, and optogenetic inhibition during the task.

      Strengths:

      A major strength of the manuscript was the clarity of the writing and motivation for experiments and analyses. The behavioral paradigm is somewhat simple but well-designed and well-controlled. The neural analyses were sophisticated, clearly presented, and generally supported the authors' interpretations. The statistics are clearly reported and easy to interpret. In general, my view is that the authors achieved their aims. They found that different rules affected preparatory activity in premotor areas, but not sensory areas, consistent with dynamical systems perspectives in the field that hold that initial conditions are important for determining trial-based dynamics.

      Weaknesses:

      The manuscript was generally strong. The main weakness in my view was in interpreting the optogenetic results. While the simplicity of the task was helpful for analyzing the neural data, I think it limited the informativeness of the perturbation experiments. The behavioral read-out was low dimensional -a change in hit rate or false alarm rate- but it was unclear what perceptual or cognitive process was disrupted that led to changes in these read-outs. This is a challenge for the field, and not just this paper, but was the main weakness in my view. I have some minor technical comments in the recommendations for authors that might address other minor weaknesses.

      I think this is a well-performed, well-written, and interesting study that shows differences in rule representations in sensory and premotor areas and finds that rules reconfigure preparatory activity in the motor cortex to support flexible behavior.

      Reviewer #2 (Public Review):

      Summary:

      Chang et al. investigate neuronal activity firing patterns across various cortical regions in an interesting context-dependent tactile vs visual detection task, developed previously by the authors (Chevee et al., 2021; doi: 10.1016/j.neuron.2021.11.013). The authors report the important involvement of a medial frontal cortical region (MM, probably a similar location to wM2 as described in Esmaeili et al., 2021 & 2022; doi: 10.1016/j.neuron.2021.05.005; doi: 10.1371/journal.pbio.3001667) in mice for determining task rules.

      Strengths:

      The experiments appear to have been well carried out and the data well analysed. The manuscript clearly describes the motivation for the analyses and reaches clear and well-justified conclusions. I find the manuscript interesting and exciting!

      Weaknesses:

      I did not find any major weaknesses.

      Reviewer #3 (Public Review):

      This study examines context-dependent stimulus selection by recording neural activity from several sensory and motor cortical areas along a sensorimotor pathway, including S1, S2, MM, and ALM. Mice are trained to either withhold licking or perform directional licking in response to visual or tactile stimulus. Depending on the task rule, the mice have to respond to one stimulus modality while ignoring the other. Neural activity to the same tactile stimulus is modulated by task in all the areas recorded, with significant activity changes in a subset of neurons and population activity occupying distinct activity subspaces. Recordings further reveal a contextual signal in the pre-stimulus baseline activity that differentiates task context. This signal is correlated with subsequent task modulation of stimulus activity. Comparison across brain areas shows that this contextual signal is stronger in frontal cortical regions than in sensory regions. Analyses link this signal to behavior by showing that it tracks the behavioral performance switch during task rule transitions. Silencing activity in frontal cortical regions during the baseline period impairs behavioral performance.

      Overall, this is a superb study with solid results and thorough controls. The results are relevant for context-specific neural computation and provide a neural substrate that will surely inspire follow-up mechanistic investigations. We only have a couple of suggestions to help the authors further improve the paper.

      (1) We have a comment regarding the calculation of the choice CD in Fig S3. The text on page 7 concludes that "Choice coding dimensions change with task rule". However, the motor choice response is different across blocks, i.e. lick right vs. no lick for one task and lick left vs. no lick for the other task. Therefore, the differences in the choice CD may be simply due to the motor response being different across the tasks and not due to the task rule per se. The authors may consider adding this caveat in their interpretation. This should not affect their main conclusion.

      We thank the Reviewer for the suggestion. We have discussed this caveat and performed a new analysis to calculate the choice coding dimensions using right-lick and left-lick trials (Fig. S3h) on page 8. 

      “Choice coding dimensions were obtained from left-lick and no-lick trials in respond-to-touch blocks and right-lick and no-lick trials in respond-to-light blocks. Because the required lick directions differed between the block types, the difference in choice CDs across task rules (Fig. S4f) could have been affected by the different motor responses. To rule out this possibility, we did a new version of this analysis using right-lick and left-lick trials to calculate the choice coding dimensions for both task rules. We found that the orientation of the choice coding dimension in a respond-to-touch block was still not aligned well with that in a respond-to-light block (Fig. S4h;  magnitude of dot product between the respond-to-touch choice CD and the respond-to-light choice CD, mean ± 95% CI for true vs shuffled data: S1: 0.39 ± [0.23, 0.55] vs 0.2 ± [0.1, 0.31], 10 sessions; S2: 0.32 ± [0.18, 0.46] vs 0.2 ± [0.11, 0.3], 8 sessions; MM: 0.35 ± [0.21, 0.48] vs 0.18 ± [0.11, 0.26], 9 sessions; ALM: 0.28 ± [0.17, 0.39] vs 0.21 ± [0.12, 0.31], 13 sessions).”
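
      The alignment measure reported above can be illustrated with a short Python sketch. It assumes a common definition of a coding dimension (the unit-normalized difference between trial-averaged population activity in two trial types), which the quoted text does not spell out. The function names are hypothetical, and the shuffled control is omitted.

```python
import math


def mean_vector(trials):
    """Average a list of trials (each a list of per-neuron rates) across trials."""
    n_trials = len(trials)
    return [sum(column) / n_trials for column in zip(*trials)]


def coding_dimension(trials_a, trials_b):
    """Unit vector pointing from the mean of trials_b to the mean of trials_a."""
    diff = [a - b for a, b in zip(mean_vector(trials_a), mean_vector(trials_b))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("the two trial types have identical mean activity")
    return [d / norm for d in diff]


def alignment(cd_1, cd_2):
    """Magnitude of the dot product between two unit-length coding dimensions."""
    return abs(sum(x * y for x, y in zip(cd_1, cd_2)))
```

      Because both vectors have unit length, the result lies between 0 (orthogonal coding dimensions) and 1 (identical or exactly opposite ones), which is the scale on which the values quoted above are reported.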

      We also have included the caveats for using right-lick and left-lick trials to calculate choice coding dimensions on page 13.

      “However, we also calculated choice coding dimensions using only right- and left-lick trials. In S1, S2, MM and ALM, the choice CDs calculated this way were also not aligned well across task rules (Fig. S4h), consistent with the results calculated from lick and no-lick trials (Fig. S4f). Data were limited for this analysis, however, because mice rarely licked to the unrewarded water port (# of licks<sub>unrewarded port</sub> / # of licks<sub>total</sub>, respond-to-touch: 0.13, respond-to-light: 0.11). These trials usually came from rule transitions (Fig. 5a) and, in some cases, were potentially caused by exploratory behaviors. These factors could affect choice CDs.”

      (2) We have a couple of questions about the effect size on single neurons vs. population dynamics. From Fig 1, about 20% of neurons in frontal cortical regions show task rule modulation in their stimulus activity. This seems like a small effect in terms of population dynamics. There is somewhat of a disconnect from Figs 4 and S3 (for stimulus CD), which show remarkably low subspace overlap in population activity across tasks. Can the authors help bridge this disconnect? Is this because the neurons showing a difference in Fig 1 are disproportionally stimulus selective neurons?

      We thank the Reviewer for the insightful comment and agree that it is important to link the single-unit and population results. We have addressed these questions by (1) improving our analysis of task modulation of single neurons  (tHit-tCR selectivity) and (2) examining the relationship between tHit-tCR selective neurons and tHit-tCR subspace overlaps.  

      Previously, we averaged the AUC values of time bins within the stimulus window (0-150 ms, 10-ms bins). If the 95% CI on this averaged AUC value did not include 0.5, this unit was considered to show significant selectivity. This approach was highly conservative and may have underestimated the percentage of units showing significant selectivity, particularly any units showing transient selectivity. In the revised manuscript, we now define a unit as showing significant tHit-tCR selectivity when three consecutive time bins (>30 ms, 10-ms bins) of AUC values were significant. Using this new criterion, the percentage of tHit-tCR selective neurons increased compared with the previous analysis. We have updated Figure 1h and the results on page 4:

      “We found that 18-33% of neurons in these cortical areas had area under the receiver-operating curve (AUC) values significantly different from 0.5, and therefore discriminated between tHit and tCR trials (Fig. 1h; S1: 28.8%, 177 neurons; S2: 17.9%, 162 neurons; MM: 32.9%, 140 neurons; ALM: 23.4%, 256 neurons; criterion to be considered significant: Bonferroni corrected 95% CI on AUC did not include 0.5 for at least 3 consecutive 10-ms time bins).”
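
      As a rough sketch of the criterion quoted above, the check for consecutive significant bins might look like this in Python. The function name and the representation of each time bin as a lower and an upper confidence bound on AUC are assumptions; computing the Bonferroni-corrected intervals themselves is not shown.

```python
def is_selective(ci_low, ci_high, chance=0.5, min_consecutive=3):
    """Return True if the AUC confidence interval excludes chance in at
    least min_consecutive adjacent time bins.

    ci_low and ci_high are equal-length sequences holding, for each time
    bin, the lower and upper bound of the confidence interval on AUC.
    """
    if len(ci_low) != len(ci_high):
        raise ValueError("ci_low and ci_high must have the same length")

    run_length = 0
    for low, high in zip(ci_low, ci_high):
        # A bin is significant when the whole interval sits above or below chance.
        if low > chance or high < chance:
            run_length += 1
            if run_length >= min_consecutive:
                return True
        else:
            run_length = 0
    return False
```

      Requiring a run of adjacent bins, rather than averaging AUC over the whole stimulus window, lets a brief but reliable burst of selectivity count while isolated single-bin crossings are still rejected.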

      Next, we checked how tHit-tCR selective neurons were distributed across sessions. We found that the percentage of tHit-tCR selective neurons in each session varied (S1: 9-46%, S2: 0-36%, MM: 25-55%, ALM: 0-50%). We examined the relationship between the numbers of tHit-tCR selective neurons and tHit-tCR subspace overlaps. Sessions with more neurons showing task rule modulation tended to show lower subspace overlap, but this correlation was modest and only marginally significant (r = -0.32, p = 0.08, Pearson correlation, n = 31 sessions). While we report the percentage of neurons showing significant selectivity as a simple way to summarize single-neuron effects, this does neglect the magnitude of task rule modulation of individual neurons, which may also be relevant.

      In summary, the apparent disconnect between the effect sizes of task modulation of single neurons and of population dynamics could be explained by (1) the percentages of tHit-tCR selective neurons were underestimated in our old analysis, (2) tHit-tCR selective neurons were not uniformly distributed among sessions, and (3) the percentages of tHit-tCR selective neurons were weakly correlated with tHit-tCR subspace overlaps. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      For the analysis of choice coding dimensions, it seems that the authors are somewhat data limited in that they cannot compare lick-right/lick-left within a block. So instead, they compare lick/no lick trials. But given that the mice are unable to initiate trials, the interpretation of the no lick trials is a bit complicated. It is not clear that the no lick trials reflect a perceptual judgment about the stimulus (i.e., a choice), or that the mice are just zoning out and not paying attention. If it's the latter case, what the authors are calling choice coding is more of an attentional or task engagement signal, which may still be interesting, but has a somewhat different interpretation than a choice coding dimension. It might be worth clarifying this point somewhere, or if I'm totally off-base, then being more clear about why lick/no lick is more consistent with choice than task engagement.

      We thank the Reviewer for raising this point. We have added a new paragraph on page 13 to clarify why we used lick/no-lick trials to calculate choice coding dimensions, and we now discuss the caveat regarding task engagement.  

      “No-lick trials included misses, which could be caused by mice not being engaged in the task. While the majority of no-lick trials were correct rejections (respond-to-touch: 75%; respond-to-light: 76%), we treated no-licks as one of the available choices in our task and included them to calculate choice coding dimensions (Fig. S4c,d,f). To ensure stable and balanced task engagement across task rules, we removed the last 20 trials of each session and used stimulus parameters that achieved similar behavioral performance for both task rules (Fig. 1d; ~75% correct for both rules).”

      In addition, to address a point made by Reviewer 3 as well as this point, we performed a new analysis to calculate choice coding dimensions using right-lick vs left-lick trials. We report this new analysis on page 8:

      “Choice coding dimensions were obtained from left-lick and no-lick trials in respond-to-touch blocks and right-lick and no-lick trials in respond-to-light blocks. Because the required lick directions differed between the block types, the difference in choice CDs across task rules (Fig. S4f) could have been affected by the different motor responses. To rule out this possibility, we did a new version of this analysis using right-lick and left-lick trials to calculate the choice coding dimensions for both task rules. We found that the orientation of the choice coding dimension in a respond-to-touch block was still not aligned well with that in a respond-to-light block (Fig. S4h;  magnitude of dot product between the respond-to-touch choice CD and the respond-to-light choice CD, mean ± 95% CI for true vs shuffled data: S1: 0.39 ± [0.23, 0.55] vs 0.2 ± [0.1, 0.31], 10 sessions; S2: 0.32 ± [0.18, 0.46] vs 0.2 ± [0.11, 0.3], 8 sessions; MM: 0.35 ± [0.21, 0.48] vs 0.18 ± [0.11, 0.26], 9 sessions; ALM: 0.28 ± [0.17, 0.39] vs 0.21 ± [0.12, 0.31], 13 sessions).” 

      We added discussion of the limitations of this new analysis on page 13:

      “However, we also calculated choice coding dimensions using only right- and left-lick trials. In S1, S2, MM and ALM, the choice CDs calculated this way were also not aligned well across task rules (Fig. S4h), consistent with the results calculated from lick and no-lick trials (Fig. S4f). Data were limited for this analysis, however, because mice rarely licked to the unrewarded water port (number of licks to the unrewarded port / total number of licks, respond-to-touch: 0.13, respond-to-light: 0.11). These trials usually came from rule transitions (Fig. 5a) and, in some cases, were potentially caused by exploratory behaviors. These factors could affect choice CDs.”
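      The alignment measure reported above (magnitude of the dot product between two coding dimensions, compared with shuffled data) can be sketched as follows. This is a minimal illustration, assuming a coding dimension is defined as the unit vector along the difference in trial-averaged population activity between two trial types, and assuming the shuffle permutes trial labels; neither detail is specified in the excerpts quoted here, and all function names are hypothetical.

```python
import numpy as np

def coding_dimension(trials_a, trials_b):
    """Unit vector along the difference of trial-averaged activity.

    trials_a, trials_b: arrays of shape (n_trials, n_neurons).
    """
    diff = np.mean(trials_a, axis=0) - np.mean(trials_b, axis=0)
    return diff / np.linalg.norm(diff)

def cd_alignment(cd_1, cd_2):
    """Magnitude of the dot product between two unit-norm coding dimensions."""
    return float(abs(np.dot(cd_1, cd_2)))

def shuffled_alignment(trials_a, trials_b, cd_ref, n_shuffles=200, seed=0):
    """Mean alignment with cd_ref after permuting trial labels (assumed shuffle control)."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([trials_a, trials_b])
    n_a = len(trials_a)
    values = []
    for _ in range(n_shuffles):
        perm = rng.permutation(len(pooled))
        cd = coding_dimension(pooled[perm[:n_a]], pooled[perm[n_a:]])
        values.append(cd_alignment(cd, cd_ref))
    return float(np.mean(values))
```

      An alignment near 1 indicates that the two coding dimensions point along the same axis in neural state space, and a value near the shuffled level indicates that they are no more aligned than expected by chance.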

      The authors find that the stimulus coding direction in most areas (S1, S2, and MM) was significantly aligned between the block types. How do the authors interpret that finding? That there is no major change in stimulus coding dimension, despite the change in subspace? I think I'm missing the big picture interpretation of this result.

      That there is no significant change in stimulus coding dimensions but a change in subspace suggests that the subspace change largely reflects a change in the choice coding dimensions.

      As I mentioned in the public review, I thought there was a weakness with interpretation of the optogenetic experiments, which the authors generally interpret as reflecting rule sensitivity. However, given that they are inhibiting premotor areas including ALM, one might imagine that there might also be an effect on lick production or kinematics. To rule this out, the authors compare the change in lick rate relative to licks during the ITI. What is the ITI lick rate? I assume pretty low, once the animal is well-trained, in which case there may be a floor effect that could obscure meaningful effects on lick production. In addition, based on the reported CI on delta p(lick), it looks like MM and AM did suppress lick rate. I think in the future, a task with richer behavioral read-outs (or including other measurements of behavior like video), or perhaps something like a psychological process model with parameters that reflect different perceptual or cognitive processes could help resolve the effects of perturbations more precisely.

      Eighteen and ten percent of trials had at least one lick in the ITI in respond-to-touch and respond-to-light blocks, respectively. These relatively low rates of ITI licking could indeed make an effect of optogenetics on lick production harder to observe. We agree that future work would benefit from more complex tasks and measurements, and have added the following to make this point (page 14):

      “To more precisely dissect the effects of perturbations on different cognitive processes in rule-dependent sensory detection, more complex behavioral tasks and richer behavioral measurements are needed in the future.”

      Reviewer #2 (Recommendations For The Authors):

      I have the following minor suggestions that the authors might consider in revising this already excellent manuscript:

      (1) In addition to showing normalised z-score firing rates (e.g. Fig 1g), I think it is important to show the grand-average mean firing rates in Hz.

      We thank the Reviewer for the suggestion and have added the grand-average mean firing rates as a new supplementary figure (Fig. S2a). To provide more details about the firing rates of individual neurons, we have also added to this new figure the distribution of peak responses during the tactile stimulus period (Fig. S2b).

      (2) I think the authors could report more quantitative data in the main text. As a very basic example, I could not easily find how many neurons, sessions, and mice were used in various analyses.

      We have added relevant numbers at various points throughout the Results, including within the following examples:

      Page 3: “To examine how the task rules influenced the sensorimotor transformation occurring in the tactile processing stream, we performed single-unit recordings from sensory and motor cortical areas including S1, S2, MM and ALM (Fig. 1e-g, Fig. S1a-h, and Fig. S2a; S1: 6 mice, 10 sessions, 177 neurons, S2: 5 mice, 8 sessions, 162 neurons, MM: 7 mice, 9 sessions, 140 neurons, ALM: 8 mice, 13 sessions, 256 neurons).”

      Page 5: “As expected, single-unit activity before stimulus onset did not discriminate between tactile and visual trials (Fig. 2d; S1: 0%, 177 neurons; S2: 0%, 162 neurons; MM: 0%, 140 neurons; ALM: 0.8%, 256 neurons). After stimulus onset, more than 35% of neurons in the sensory cortical areas and approximately 15% of neurons in the motor cortical areas showed significant stimulus discriminability (Fig. 2e; S1: 37.3%, 177 neurons; S2: 35.2%, 162 neurons; MM: 15%, 140 neurons; ALM: 14.1%, 256 neurons).”

      Page 6: “Support vector machine (SVM) and Random Forest classifiers showed similar decoding abilities (Fig. S3a,b; medians of classification accuracy [true vs shuffled]; SVM: S1 [0.6 vs 0.53], 10 sessions, S2 [0.61 vs 0.51], 8 sessions, MM [0.71 vs 0.51], 9 sessions, ALM [0.65 vs 0.52], 13 sessions; Random Forests: S1 [0.59 vs 0.52], 10 sessions, S2 [0.6 vs 0.52], 8 sessions, MM [0.65 vs 0.49], 9 sessions, ALM [0.7 vs 0.5], 13 sessions).”

      Page 6: “To assess this for the four cortical areas, we quantified how the tHit and tCR trajectories diverged from each other by calculating the Euclidean distance between matching time points for all possible pairs of tHit and tCR trajectories for a given session and then averaging these for the session (Fig. 4a,b; S1: 10 sessions, S2: 8 sessions, MM: 9 sessions, ALM: 13 sessions, individual sessions in gray and averages across sessions in black; window of analysis: -100 to 150 ms relative to stimulus onset; 10 ms bins; using the top 3 PCs; Methods).” 

      Page 8: “In contrast, we found that S1, S2 and MM had stimulus CDs that were significantly aligned between the two block types (Fig. S4e; magnitude of dot product between the respond-to-touch stimulus CDs and the respond-to-light stimulus CDs, mean ± 95% CI for true vs shuffled data: S1: 0.5 ± [0.34, 0.66] vs 0.21 ± [0.12, 0.34], 10 sessions; S2: 0.62 ± [0.43, 0.78] vs 0.22 ± [0.13, 0.31], 8 sessions; MM: 0.48 ± [0.38, 0.59] vs 0.24 ± [0.16, 0.33], 9 sessions; ALM: 0.33 ± [0.2, 0.47] vs 0.21 ± [0.13, 0.31], 13 sessions).”

      Page 9: “For respond-to-touch to respond-to-light block transitions, the fractions of trials classified as respond-to-touch for MM and ALM decreased progressively over the course of the transition (Fig. 5d; rank correlation of the fractions calculated for each of the separate periods spanning the transition, Kendall’s tau, mean ± 95% CI: MM: -0.39 ± [-0.67, -0.11], 9 sessions, ALM: -0.29 ± [-0.54, -0.04], 13 sessions; criterion to be considered significant: 95% CI on Kendall’s tau did not include 0).”

      Page 11: “Lick probability was unaffected during S1, S2, MM and ALM experiments for both tasks, indicating that the behavioral effects were not due to an inability to lick (Fig. 6i, j; 95% CI on Δ lick probability for cross-modal selection task: S1/S2 [-0.18, 0.24], 4 mice, 10 sessions; MM [-0.31, 0.03], 4 mice, 11 sessions; ALM [-0.24, 0.16], 4 mice, 10 sessions; Δ lick probability for simple tactile detection task: S1/S2 [-0.13, 0.31], 3 mice, 3 sessions; MM [-0.06, 0.45], 3 mice, 5 sessions; ALM [-0.18, 0.34], 3 mice, 4 sessions).”

      (3) Please include a clearer description of trial timing. Perhaps a schematic timeline of when stimuli are delivered and when licking would be rewarded. I may have missed it, but I did not find explicit mention of the timing of the reward window or if there was any delay period.

      We have added the following (page 3): 

      “For each trial, the stimulus duration was 0.15 s and an answer period extended from 0.1 to 2 s from stimulus onset.”

      (4) Please include a clear description of statistical tests in each figure legend as needed (for example please check Fig 4e legend).

      We have added details about statistical tests in the figure legends:

      Fig. 2f: “Relationship between block-type discriminability before stimulus onset and tHit-tCR discriminability after stimulus onset for units showing significant block-type discriminability prior to the stimulus. Pearson correlation: S1: r = 0.69, p = 0.056, 8 neurons; S2: r = 0.91, p = 0.093, 4 neurons; MM: r = 0.93, p < 0.001, 30 neurons; ALM: r = 0.83, p < 0.001, 26 neurons.” 

      Fig. 4e: “Subspace overlap for control tHit (gray) and tCR (purple) trials in the somatosensory and motor cortical areas. Each circle is a subspace overlap of a session. Paired t-test, tCR – control tHit: S1: -0.23, 8 sessions, p = 0.0016; S2: -0.23, 7 sessions, p = 0.0086; MM: -0.36, 5 sessions, p < 0.001; ALM: -0.35, 11 sessions, p < 0.001; significance: ** for p<0.01, *** for p<0.001.”

      Fig. 5d,e: “Fraction of trials classified as coming from a respond-to-touch block based on the pre-stimulus population state, for trials occurring in different periods (see c) relative to respond-to-touch → respond-to-light transitions. For MM (top row) and ALM (bottom row), progressively fewer trials were classified as coming from the respond-to-touch block as analysis windows shifted later relative to the rule transition. Kendall’s tau (rank correlation): MM: -0.39, 9 sessions; ALM: -0.29, 13 sessions. Left panels: individual sessions, right panels: mean ± 95% CI. Dashed lines are chance levels (0.5). e, Same as d but for respond-to-light → respond-to-touch transitions. Kendall’s tau: MM: 0.37, 9 sessions; ALM: 0.27, 13 sessions.”

      Fig. 6: “Error bars show bootstrap 95% CI. Criterion to be considered significant: 95% CI did not include 0.”
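      The rank correlation reported in the Fig. 5d,e legend above can be illustrated with a small pure-Python sketch. It computes Kendall's tau-b between the period index and the fraction of trials classified as respond-to-touch for a single session. Which variant of Kendall's tau was used is not stated in the excerpts quoted here, so the tau-b form, the function name, and the example values are assumptions for illustration only.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    return (concordant - discordant) / denom if denom > 0 else float("nan")

# Hypothetical session: the fraction of trials classified as respond-to-touch
# falls monotonically across five periods spanning a rule transition.
periods = [1, 2, 3, 4, 5]
fractions = [0.9, 0.7, 0.6, 0.4, 0.2]
tau = kendall_tau_b(periods, fractions)
```

      A strictly decreasing sequence gives a tau of -1, and values closer to 0 indicate a weaker monotonic trend. A per-session tau computed this way could then be averaged across sessions and given a confidence interval.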

      (5) P. 3 - "To examine how the task rules influenced the sensorimotor transformation occurring in the tactile processing stream, we performed single-unit recordings from sensory and motor cortical areas including S1, S2, MM, and ALM using 64-channel silicon probes (Fig. 1e-g and Fig. S1a-h)." Please specify if these areas were recorded simultaneously or not.

      We have added “We recorded from one of these cortical areas per session, using 64-channel silicon probes.”  on page 3.  

      (6) Figure 4b - Please describe what gray and black lines show.

      The gray traces are the distance between tHit and tCR trajectories in individual sessions and the black traces are the averages across sessions in different cortical areas. We have added this information on page 6 and in the Figure 4b legend. 

      Page 6: “To assess this for the four cortical areas, we quantified how the tHit and tCR trajectories diverged from each other by calculating the Euclidean distance between matching time points for all possible pairs of tHit and tCR trajectories for a given session and then averaging these for the session (Fig. 4a,b; S1: 10 sessions, S2: 8 sessions, MM: 9 sessions, ALM: 13 sessions, individual sessions in gray and averages across sessions in black; window of analysis: -100 to 150 ms relative to stimulus onset; 10 ms bins; using the top 3 PCs; Methods).”

      Fig. 4b: “Distance between tHit and tCR trajectories in S1, S2, MM and ALM. Gray traces show the time-varying tHit-tCR distance in individual sessions and black traces are session-averaged tHit-tCR distance (S1: 10 sessions; S2: 8 sessions; MM: 9 sessions; ALM: 13 sessions).”
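      The distance measure described for Fig. 4b can be sketched in a few lines of NumPy. The sketch assumes that each trajectory has already been projected onto the top 3 PCs and binned at 10 ms, so that the trajectories of one trial type form an array of shape (n_trajectories, n_bins, n_dims). How individual trajectories are constructed is described in the Methods and is not reproduced here; the function name is hypothetical.

```python
import numpy as np

def mean_pairwise_distance(traj_a, traj_b):
    """Mean Euclidean distance between all a-b trajectory pairs at matched time bins.

    traj_a: array (n_a, n_bins, n_dims), e.g. tHit trajectories in PC space.
    traj_b: array (n_b, n_bins, n_dims), e.g. tCR trajectories in PC space.
    Returns an array of shape (n_bins,): one gray trace for one session.
    """
    traj_a = np.asarray(traj_a, dtype=float)
    traj_b = np.asarray(traj_b, dtype=float)
    # Broadcast to every (a, b) pair: shape (n_a, n_b, n_bins, n_dims).
    diff = traj_a[:, None, :, :] - traj_b[None, :, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist.mean(axis=(0, 1))
```

      Averaging the per-session outputs across sessions would then give the black trace for a cortical area.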

      (7) In addition to the analyses shown in Figure 5a, when investigating the timing of the rule switch, I think the authors should plot the left and right lick probabilities aligned to the timing of the rule switch time on a trial-by-trial basis averaged across mice.

      We thank the Reviewer for suggesting this addition. We have added a new figure panel to show the probabilities of right- and left-licks during rule transitions (Fig. 5a).

      Page 8: “The probabilities of right-licks and left-licks showed that the mice switched their motor responses during block transitions depending on task rules (Fig. 5a, mean ± 95% CI across 12 mice).” 

      (8) P. 12 - "Moreover, in a separate study using the same task (Finkel et al., unpublished), high-speed video analysis demonstrated no significant differences in whisker motion between respond-to-touch and respond-to-light blocks in most (12 of 14) behavioral sessions.". Such behavioral data is important and ideally would be included in the current analysis. Was high-speed videography carried out during electrophysiology in the current study?

      Finkel et al. has been accepted in principle for publication and will be available online shortly. Unfortunately we have not yet carried out simultaneous high-speed whisker video and electrophysiology in our cross-modal sensory selection task.

      Reviewer #3 (Recommendations For The Authors):

      (1) Minor point. For subspace overlap calculation of pre-stimulus activity in Fig 4e (light purple datapoints), please clarify whether the PCs for that condition were constructed in matched time windows. If the PCs are calculated from the stimulus period 0-150ms, the poor alignment could be due to mismatched time windows.

      We thank the Reviewer for the comment and clarify our analysis here. We previously used time-matched windows to calculate subspace overlaps. However, the pre-stimulus activity was much weaker than the activity during the stimulus period, so the subspaces of reference tHit were subject to noise and we were not able to obtain reliable PCs. This caused the subspace overlap values between the reference tHit and control tHit to be low and variable (mean ± SD, S1: 0.46 ± 0.26, n = 8 sessions, S2: 0.46 ± 0.18, n = 7 sessions, MM: 0.44 ± 0.16, n = 5 sessions, ALM: 0.38 ± 0.22, n = 11 sessions). Therefore, we used the tHit activity during the stimulus window to obtain PCs and projected pre-stimulus and stimulus activity in tCR trials onto these PCs. We have now added a more detailed description of this analysis in the Methods (page 32).

      “To calculate the separation of subspaces prior to stimulus delivery, pre-stimulus activity in tCR trials (-100 to 0 ms from stimulus onset) was projected to the PC space of the tHit reference group and the subspace overlap was calculated. In this analysis, we used tHit activity during stimulus delivery (0 to 150 ms from stimulus onset) to obtain reliable PCs.”

      We acknowledge this time alignment issue and have now removed the reported subspace overlap between tHit and tCR during the pre-stimulus period from Figure 4e (light purple). However, we think the correlation between pre- and post-stimulus-onset subspace overlaps should remain similar regardless of the time windows that we used for calculating the PCs. For the PCs calculated from the pre-stimulus period (-100 to 0 ms), the correlation coefficient was 0.55 (Pearson correlation, p < 0.01, n = 31 sessions). For the PCs calculated from the stimulus period (0-150 ms), the correlation coefficient was 0.68 (Figure 4f, Pearson correlation, p < 0.001, n = 31 sessions). Therefore, we keep Figure 4f.
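      To make the projection step concrete, the following NumPy sketch computes a subspace overlap between a reference condition and a test condition. The exact definition used in the manuscript is given in its Methods and is not reproduced in this response, so the sketch assumes one commonly used variance-ratio form: the variance of the test activity captured by the top k reference PCs, divided by the variance captured by the test activity's own top k PCs. All names are hypothetical.

```python
import numpy as np

def top_pcs(data, k):
    """Top-k principal axes of data (n_samples, n_neurons) as an (n_neurons, k) matrix."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def captured_variance(data, axes):
    """Sum of squared projections of mean-centered data onto the given axes."""
    centered = data - data.mean(axis=0)
    return float(np.sum((centered @ axes) ** 2))

def subspace_overlap(reference, test, k=3):
    """Assumed overlap: variance of `test` in the reference PC space, normalized
    by the variance of `test` in its own top-k PC space (1 = identical subspaces)."""
    ref_axes = top_pcs(reference, k)
    own_axes = top_pcs(test, k)
    return captured_variance(test, ref_axes) / captured_variance(test, own_axes)
```

      In this form, weak and noisy reference activity (as in the pre-stimulus window) yields unreliable `ref_axes`, which lowers the overlap even between two groups of trials of the same type. That is the problem described above and the reason for taking the PCs from the stimulus window.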

      (2) Minor point. To help the readers follow the logic of the experiments, please explain why PPC and AMM were added in the later optogenetic experiment since these are not part of the electrophysiology experiment.

      We have added the following rationale on page 9.

      “We recorded from AMM in our cross-modal sensory selection task and observed visually-evoked activity (Fig. S1i-k), suggesting that AMM may play an important role in rule-dependent visual processing. PPC contributes to multisensory processing [51–53] and sensory-motor integration [50, 54–58]. Therefore, we wanted to test the roles of these areas in our cross-modal sensory selection task.”

      (3) Minor point. We are somewhat confused about the timing of some of the example neurons shown in figure S1. For example, many neurons show visually evoked signals only after stimulus offset, unlike tactile evoked signals (e.g. Fig S1b and f). In addition, the reaction time for visual stimulus is systematically slower than tactile stimuli for many example neurons (e.g. Fig S1b) but somehow not other neurons (e.g. Fig S1g). Are these observations correct?

      These observations are all correct. We have a manuscript from a separate study using this same behavioral task (Finkel et al., accepted in principle) that examines and compares (1) the onsets of tactile- and visually-evoked activity and (2) the reaction times to tactile and visual stimuli. The reaction times to tactile stimuli were slightly but significantly shorter than the reaction times to visual stimuli (tactile vs visual, 397 ± 145 vs 521 ± 163 ms, median ± interquartile range [IQR], Tukey HSD test, p = 0.001, n = 155 sessions). We examined how well activity of individual neurons in S1 could be used to discriminate the presence of the stimulus or the response of the mouse. For discriminability for the presence of the stimulus, S1 neurons could signal the presence of the tactile stimulus but not the visual stimulus. For discriminability for the response of the mouse, the onsets for significant discriminability occurred earlier for tactile compared with visual trials (two-sided Kolmogorov-Smirnov test, p = 1 × 10^-16, n = 865 neurons with DP onset in tactile trials, n = 719 neurons with DP onset in visual trials).

    1. One may be tempted to assume that GenAI tools, like ChatGPT, have negated the need for many types of knowledge.

      While I agree that ChatGPT and other similar tools have provided users with a tremendous amount of information, is it not still up to the human counterpart to distill that information down to something specific? Something usable and actionable?

      Effective prompt engineering can assist with those efforts, of course, yet still the human counterpart must provide the parameter of the queries and distill the information down to a workable solution. One idea I am personally struggling with is recognizing that every thought that pops into my head may not be true. Here, too, AI generated ideas may or may not be true and further human interaction can help discard the informational flotsam.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this work, the authors present a careful study of the lattice of the indirect flight muscle (IFM) in Drosophila using data from a morphometric analysis. To this end, an automated tool is developed for precise, high-throughput measurements of sarcomere length and myofibril width, and various microscopy techniques are used to assess sub-sarcomeric structures. These methods are applied to analyze sarcomere structure at multiple stages in the process of myofibrillogenesis. In addition, the authors present various factors and experimental methods that may affect the accurate measurement of IFM structures. Although the comprehensive structural study is appreciated, there are major issues with the presentation/scope of the work that need to be addressed:

      Major Comments

      1. The main weakness of the paper is in its claim of presenting a model of the sarcomere. Indeed, the paper reports a structural study that is drawn onto a 3D schematic. There is no myofibrillogenesis model that would provide insights into mechanisms. Therefore, the use of the word model is grossly overstated.

      In biology, the term “model” is used in various contexts, but it generally refers to a simplified representation of a biological system, a structure or a process. Accordingly, we consider “model” the most fitting term for what we present in Figure 4 (Figure 7 in the revised manuscript). These are not arbitrary 3D schematics; they are scaled representations in which the length, the number and the relative three-dimensional arrangement of thin and thick filaments are based on measurements. These measurements are primarily based on our own data (presented in the main text and provided in the supplementary materials), as published data were either lacking or inconsistent. Moreover, we would like to highlight that we do not claim to present a conceptual or mechanistic model of myofibrillogenesis, but we do present structural reconstructions or models for four developmental time points. Therefore, we disagree with the remark that “the use of the word model is grossly overstated”, as our wording fully corresponds to common usage.

      In general, the major focus and contribution of the work is unclear. How does the comprehensive nature of the measurements contribute to existing literature?

      We significantly revised the text to highlight the main points more firmly, and added an additional section to help non-specialist readers to better understand our aims and findings.

      Figure labels are often rather confusing - for example it is unclear why there is a B, B', B' etc instead of B,C,D, etc.

      The figure labels have been revised in accordance with the reviewer’s recommendation.

      Some comments in the text are not clearly tied to the figures. For example, in lines 108-109, are the authors referring to the shadow along the edges of the myofibril when saying they are not clearly defined (Figure 1C)?

      The lines refer to the fact that identifying the boundary of an “object” in a fluorescence microscopy image is inherently challenging - even under ideal conditions where the object’s image is not affected by nearby signals or background noise. To improve clarity, we revised this section and now it reads: The other key parameter - myofibril diameter - is typically measured using phalloidin staining. However, accurately delineating their boundaries in micrographs is difficult - even under optimal conditions (high signal‑to‑noise ratio, no overlapping fibers, etc.; Fig. 1C). This limitation arises from the fundamental nature of light microscopy as the image produced is a blurred version of the actual structure, due to convolution with the microscope’s point spread function.

      In line 116, it is unclear what "surrounding structures" the authors are referring to if the myofibrils are isolated.

      We revised the text for clarity. It now states: Once isolated, myofibrils lie flat on the coverslip, aligning with the focal plane of the objective lens. This orientation allows for high-resolution, undistorted imaging and accurate two-dimensional measurements, free from interference by neighboring biological structures (e.g.: other myofibrils).

      In lines 141-142, there is no reference of data to back up the claim of validation.

      We addressed this mistake by including a reference to Fig. S1E (Fig. S1D in the revised manuscript).

      In line 170, the authors mention the mef2-Gal4/+ strain as a Gal4 driver line but do not clearly state how this strain is different from the wildtypes or how this impacts their results.

      Mef2-Gal4 is a muscle-specific Gal4 driver, often used in Drosophila muscle studies. It is a convention among Drosophila geneticists that the presence of a transgene (i.e. Mef2-Gal4) changes the genetic background, and although it does not necessarily cause any phenotypic effect, it is clearly distinguished from the wild type situation, and whenever relevant, Mef2-Gal4/+ is the preferred choice (if not the correct choice) as a control instead of wild type. As is clear from our data, the presence of the Mef2-Gal4 driver line does not affect the length or width of IFM sarcomeres as compared to wild type.

      In lines 182-185, the authors discuss the effects of tissue embedding on morphometrics. Were factors such as animal sex, age, fiber type, etc. conserved in these experiments? If not, any differences in results may be confounding.

      We fully agree with the reviewer that when testing the effect of a single variable, all other variables should remain constant. This is actually one of the main points emphasized in the results section. Additionally, this information is already provided in the Source Data files for each panel.

      In lines 199-201, the authors discuss results of myofibril diameter using different preparation methods, yet no data is cited to support the claims. In line 220, the phrase "6 independent experiments" is unclear. Is each independent experiment performed using a different animal? Furthermore, are 6 experiments performed for each time point?

      We substantially revised the relevant paragraphs and ensured that the corresponding data (Figure 2A in the revised manuscript) is cited each time when it is discussed. We conducted six independent experiments at each time point. This is consistently indicated in the figures and can be verified in the SourceData files (specifically, Fig3SourceData in this case). To clarify what we mean by "independent experiments," we added the following sentence to the Methods section: Experiments were considered independent when specimens came from different parental crosses, and each experiment included approximately six animals to capture individual variability.

      In line 254, the authors refer to "number of sarcomeres". It must be clearly stated if this refers to sarcomeres per myofibril, image area, etc.

      It is now clearly stated as: "number of sarcomeres per myofibril".

      In line 274, the authors refer to "myofilament number". It must be clearly stated if this refers to myofilaments per myofibril, image area, etc.

      We counted the number of myofilaments in developing myofibrils, and this is now clearly stated in the text and in the legend of Figure 3 (Figure 4 in the revised manuscript).

      In line 299, the authors mention that thin filaments measured less than 560 nm in length, yet no data is cited to support this.

      The previously missing reference to Figure 4 (Figure 7 in the revised manuscript) has now been added in addition to the revised Supplementary Figure 5.

      In the "Quantifying sarcomere growth dynamics" section of the summary (starting from line 402) the authors introduce data that would be more naturally placed in the results and discussion section.

      As suggested by the reviewer, we incorporated the key aspects of sarcomere growth dynamics into the Results and Discussion section.

      In lines 422-423, it is not mentioned what the controls are for.

      This was already explained in the main text between lines 167 and 173.

      In the caption of Figure 1C, it is not mentioned what the red dashed lines in the microscope images represent.

      The caption has been updated to include the following clarification: The red dashed lines border the ROI used for generating the intensity profiles.

      In the caption of Figure 1D, the difference between the lighter and darker grey points is not mentioned.

      This was already explained in each relevant figure legend. In this specific case, it is stated between lines 850 and 852: “Light gray dots represent individual measurements of sarcomere length and myofibril diameter, while the larger dots indicate the mean values from independent experiments.”

      In line 849, the stated p-value (0.003) does not match that mentioned in the figure (0.0003).

      We thank the reviewer for noticing this small mistake; correction was made to display the accurate p-value of 0.0003 at both places.

      In line 874, it is not clear what an "independent experiment" refers to (different animal, etc.?).

      We refer the reviewer to point 9, where this question has already been addressed.

      Figure 2A is hard to read. Using different colored dots for different time points might help.

      As suggested by the reviewer, we generated a plot with the individual points color-coded by time.

      The significant figures presented in Figure 4 give a completely inaccurate representation of the variability of the measurements achieved with these techniques.

      Certainly, each measured parameter exhibits inherent biological and technical variability. We have made all the raw data available to the reader through the SourceData files, and this variability is also evident in Figures 1, 2, 3, Supplementary Figure 1, 3, and 5 (Figure 1, 2, 3, 4, 6, and Supplementary Figure 1 in the revised manuscript). We have also included an additional plot (Supplementary Figure 5 in the revised manuscript) that presents the calculated thin and thick filament lengths and their uncertainty. However, in Figure 4 (Figure 7 in the revised manuscript), our goal was to present an easily understandable visual representation of the sarcomeric structures for each time point, based on the averages of the relevant measurements.

      In line 877, it should be mentioned that the number of filaments is counted per myofibril. The y-axes in the figure should also be adjusted to clarify this.

      As suggested by the reviewer, both the figure legend and the plot have been updated to clearly indicate that the filament count refers to the number per myofibril.

      In line 883, it is not clear what an "independent experiment" refers to (different animal, etc.?).

      We refer the reviewer to point 9, where this question has already been addressed.

      The statement of sample sizes in all figures is a little confusing.

      Following general guidelines, we used SuperPlots to effectively present the data, as nicely demonstrated in the JCB viewpoint article by Lord et al., 2020 (PMID: 32346721). Individual measurements are shown as pooled data points, allowing readers to appreciate the spread, distribution and number of measurements. Overlaid on these pooled dot plots are the mean values from each independent experiment, with error bars representing variability between independent experiments. Sample sizes are provided for both individual measurements and independent experiments. This is now clearly explained in the Materials and Methods section, and we corrected the legends to improve clarity (“n” indicates the number of independent experiments/individual measurements).
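The SuperPlot summary described above can be illustrated with a minimal Python sketch. The measurement values and experiment names are hypothetical, and using the standard deviation of the per-experiment means as the error bar is an assumption, since the reply does not say which statistic the error bars show.

```python
from statistics import mean, stdev

# Hypothetical sarcomere lengths (in µm), grouped by independent experiment.
measurements = {
    "exp1": [3.20, 3.25, 3.30],
    "exp2": [3.35, 3.40, 3.45],
    "exp3": [3.25, 3.30, 3.35],
}

def superplot_summary(groups):
    """Summarise pooled data the way a SuperPlot overlays it.

    Returns the mean of each independent experiment, the mean of those
    means, the SD between experiments (assumed error-bar statistic),
    and the total number of individual measurements.
    """
    exp_means = {name: mean(values) for name, values in groups.items()}
    means = list(exp_means.values())
    grand_mean = mean(means)
    between_sd = stdev(means)
    n_measurements = sum(len(values) for values in groups.values())
    return exp_means, grand_mean, between_sd, n_measurements

exp_means, grand_mean, between_sd, n_meas = superplot_summary(measurements)
# "n" reported as independent experiments / individual measurements.
print(f"n = {len(exp_means)}/{n_meas}, mean = {grand_mean:.2f} ± {between_sd:.2f} µm")
```

In this sketch the individual values would be drawn as the small pooled dots and the per-experiment means as the larger overlaid dots, so the error bar reflects variation between experiments rather than between single measurements.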

      In lines 1007-1008, the authors imply that the lattice model is needed for calculation of myofilament length. However, from the equations and previous data, it seems that this can be estimated using the confocal and dSTORM images.

      As the reviewer correctly noted, myofilament length can be estimated using measurements from confocal and dSTORM images, following the equations provided. However, constructing even a simplified model requires multiple constraints to be defined and applied in a specific order. In practice, one must first determine the number and arrangement of myofilaments in a cross-sectional view of an “average sarcomere” before attempting to build a longitudinal model, where length calculations become relevant. This is now clarified in the text.

      A more specific discussion of future directions is needed to put this paper in context. For example: Can anything from the overall process be used to better understand sarcomere dynamics in larger animals/humans? Can this be applied to disease modelling?

      To address these questions, we have added a section titled STUDY LIMITATIONS, which states: “Our study is focused on describing the growth of IFM sarcomeres during myofibrillogenesis at the level of individual myofilaments. Additionally, we developed a user-friendly software tool for precise sarcomere size measurements and demonstrate that these measurements are sensitive to varying conditions. Whereas, this tool can be used successfully on whole muscle fiber preparations as well, our pipeline was intentionally optimized for individual IFM myofibrils ensuring higher measurement precision in our hands than other type of preparations. Thus, we predict that future work will be required to extend it to sarcomeres from other muscle tissues or species. Nevertheless, our study exemplifies a workflow how to measure sarcomere dimensions precisely. With some variations, it should be possible to adopt it for other muscles, including vertebrate and human striated muscles. To facilitate this and to enhance the accessibility and usability of this dataset, we welcome any feedback and suggestions from researchers in the field.”

      One of the major claims of the paper is that there is a measurable variability with sex and other parameters. However, this data is never clearly summarized, presented (except for supplement), or discussed for its implications.

      We followed the suggestion of the reviewer, and we moved this supplementary data into a main figure, and thoroughly revised the corresponding paragraphs to present and discuss the findings more clearly.

      Minor Comments: 1. Lines 60-65 seem to break the flow of the introduction. As the authors discuss existing methods in literature for IFM analysis in the previous couple sentences, the following sentences should clearly state the limitations of existing methods/current gap in literature and a general idea of what the current work is contributing.

      We agree with this remark, and we substantially revised the Introduction to clearly define the existing gap in the literature and to articulate how our work addresses this gap.

      In line 104, the acronym for ZASPs is not spelled out.

      The acronym has now been spelled out for clarity.

      **Referee Cross-commenting**

      I agree as well.

      Reviewer #1 (Significance (Required)):

      In summary, this paper provides a multi-scale characterization of Drosophila flight muscle sarcomere structure under a variety of conditions, which is potentially a significant contribution for the field. However, the paper scope is overstated in that it does not provide an actual sarcomere model. Further, there are multiple issues with data presentation that impact the readability of the manuscript.

      Although it is somewhat unclear what the reviewer would consider “an actual sarcomere model”, we cannot accept that we made an overstatement by using the word “model”, because one of the main outcomes of our work is indeed the set of myofilament-level sarcomere models depicted in Figure 4 (Figure 7 in the revised manuscript). As stated above, we do not claim that these are molecular, mechanistic, or developmental models, but it makes no sense (even in common terms) to argue that our scaled graphical representations, which are based on a wealth of measurements, should not or cannot be called models.

      As to the comment with data presentation, we thank the reviewer for the numerous suggestions, and we substantially revised the manuscript to increase clarity and overall readability.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Summary: In this manuscript titled "A myofilament lattice model of Drosophila flight muscle sarcomeres based on multiscale morphometric analysis during development," Görög et al. perform a detailed analysis of morphological parameters of the indirect flight muscle (IFM) of D. melanogaster. The authors start by illustrating the range of measurements reported in the literature for mature IFM sarcomere length and width, showing a need to revisit and determine a standardized measurement. They develop a new Python-based tool, IMA, to analyze sarcomere lengths from confocal micrographs of isolated myofibrils stained with phalloidin and a z-disc marker. Using this tool, they demonstrate that sample preparation (especially mounting medium), as well as fiber type, sex, and age influence sarcomere measurements. Combining IMA, TEM, and STORM data, they measure sarcomere parameters across development, providing a comprehensive and up-to-date set of "standardized" sarcomere measurements. Using these data, they generate a model integrating all of the parameters to model sarcomeres at four discrete timepoints of development, recapitulating key phases of sarcomere formation and growth.

      Major comments: Line 200 & 901 - Figure S1B - The authors make a strong statement about the use of liquid versus hardening media, and it is clear from the image provided in Figure S1 that there is a difference in the apparent sarcomere width. The identity of the "liquid media" versus the "hardening media" should be clearly identified in the Results, in addition to the legend for Figure S1. The authors show that "glycerol-based solutions" increase sarcomere width, but the Materials only list 90% glycerol and PBS. However, a frequently used liquid mounting media is Vectashield. Based on the literature, measurements in liquid Vectashield show diameters significantly less than 2.2 microns observed here with presumably 90% glycerol or PBS. Can the authors qualify this statement, or provide data that all forms of liquid mounting media cause this effect? Does this also apply to hemi-thorax and sectioned preparations, or just isolated myofibrils?

      We used a PBS-based solution containing 90% glycerol as our liquid medium, as now stated in the main text. In response to the reviewer’s suggestion, we also tested a non-hardening version of Vectashield (H-1000). Myofibrils in Vectashield were significantly thicker than those in ProLong Gold but still thinner than those in the 90% glycerol–PBS solution, shown in Figure 2B. The mechanisms that could potentially explain these observations have been described in several studies (Miller et al., 2008; Tanner et al., 2011, 2012). Briefly, IFM is a densely packed macromolecular assembly. Upon removal of the cell membrane, myofibrillar proteins attract water, leading to overhydration of the myofilament lattice. This increases the spacing between filaments, resulting in an expansion of overall myofibril diameter. The extent of hydration depends on the osmolarity of the surrounding medium, as the system eventually reaches osmotic equilibrium. While both liquid media induced significant swelling, the observed differences likely reflect variations in their osmotic properties. In contrast, dehydration - an essential step in electron microscopy sample preparation - reduces the spacing between filaments, making myofibrils appear thinner. This explains why EM micrographs consistently show significantly smaller myofibril diameters (Chakravorty et al., 2017).

      Hardening media such as ProLong Gold introduce additional artifacts: during polymerization, these media shrink, exerting compressive forces on the tissue (Jonkman et al., 2020). We therefore propose that isolated myofibrils first expand due to overhydration in the dissection solution, and are then compressed back toward their *in vivo* dimensions during incubation in ProLong Gold. The average *in vivo* diameter of IFM myofibrils can be estimated without direct measurements, as it is determined by two key factors: (i) the number of myofilaments, which has been quantified in EM cross-sections in several studies (Fernandes & Schöck, 2014; Shwartz et al., 2016; Chakravorty et al., 2017) including our own, and (ii) the spacing between filaments, which can be measured by X-ray diffraction even in live *Drosophila* or under various experimental conditions (Irving & Maughan, 2000; Miller et al., 2008; Tanner et al., 2011, 2012). Our findings suggest that the effects of lattice overhydration and media-induced shrinkage are most pronounced in isolated myofibrils. In larger tissue preparations, the inter-myofibrillar space likely acts as a mechanical and osmotic buffer, reducing the extent of such distortions.
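The estimate of myofibril diameter from filament number and filament spacing can be sketched in Python as follows. This is an illustrative approximation, not the authors' calculation, which is not reproduced in this reply: it assumes a hexagonal thick-filament lattice, one unit cell per thick filament, a circular cross-section, and no edge effects. The function names and the input values are hypothetical.

```python
import math

def lattice_constant_from_d10(d10_nm):
    """Centre-to-centre thick-filament spacing of a hexagonal lattice,
    derived from the d1,0 spacing measured by X-ray diffraction."""
    return 2.0 * d10_nm / math.sqrt(3.0)

def expected_diameter_nm(n_thick_filaments, spacing_nm):
    """Diameter of a circle whose area equals N hexagonal unit cells.

    Each thick filament is assumed to occupy one unit cell of area
    (sqrt(3) / 2) * spacing**2; edge effects are ignored.
    """
    cell_area = (math.sqrt(3.0) / 2.0) * spacing_nm ** 2
    total_area = n_thick_filaments * cell_area
    return 2.0 * math.sqrt(total_area / math.pi)

# Hypothetical inputs chosen only to show the call pattern.
spacing = lattice_constant_from_d10(45.0)   # nm
diameter = expected_diameter_nm(800, spacing)
print(f"spacing = {spacing:.1f} nm, expected diameter = {diameter:.0f} nm")
```

Under these assumptions the diameter grows with the square root of the filament count and linearly with the spacing, which is why swelling or shrinkage of the lattice changes the apparent diameter even when the filament number is unchanged.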
      

      Can the authors comment on whether the length of fixation or fixation buffer solution, in addition to the mounting medium, make a difference on sarcomere length and diameter measurements? This is another source of variation in published protocols.

      The effect of fixation time on sarcomere morphometrics in whole-mount IFM preparations has been previously demonstrated by DeAguero et al. (2019), as briefly noted in our manuscript. To extend these findings, we performed a comparison using isolated myofibrils, assessing morphometric parameters after fixation for 10, 20 (standard) and 60 minutes. We found no difference between the 10- and 20-minute fixation conditions; however, fixation for 60 minutes resulted in significantly increased myofibril diameter (and these data are now shown in Supplementary Figure 1C). A comparable increase in thickness was also observed when using a glutaraldehyde-based fixative. These results suggest that more extensively fixed myofibrils may better resist the compressive forces exerted by hardening media.

      Line 237-238. The authors conclude that premyofibrils are much thinner than previously measured. The use of Airyscan to more accurately measure myofibril width at this timepoint is a good contribution, as indeed diffraction and light scatter likely contribute to increased width measured in light microscopy images. I also wonder, though, how well the IMA software performs in measuring width at 36h APF, given how irregular the isolated myofibrils at this stage look (wide z-lines but thinner and weaker H and I bands as shown in Fig. 2B)?

      The reviewer is correct that measurements during the early stages of myofibrillogenesis require additional effort. However, in addition to its automatic mode, IMA can also operate in semi-automatic or manual modes, ensuring complete control over the measurements. Myofibril width is determined from the phalloidin channel at the Z-line (as described in the software’s User Guide and Supplementary Figure 2), where it is at its thickest.

      Also, how much of the difference in sarcomere width arises due to effects of "stripping" components off of the sarcomere at the earliest timepoint (for example alpha-actinin or Zasp proteins)?

      A comparison between isolated myofibrils and those from microdissected muscles (Supplementary Figure 3B, Figure 3C in the revised manuscript) shows that the isolation process does not alter the morphometric measurements of sarcomeres. Moreover, the measured myofibril width aligns well with what we expect based on the number of myofilaments observed in TEM cross-sections of myofibrils at 36 hours APF (Figure 3A, now Figure 4A in the revised manuscript), supporting the consistency of our model.

      Myofibrils at early timepoints do contain more than 4-12 sarcomeres in a line (they extend the full length of the myofiber), so it is possible they are breaking due to the detergent and mechanical disruption induced by the isolation method.

      The reviewer is correct - myofibrils likely span the full length of the myofiber from the onset of myofibrillogenesis. However, during the isolation of individual myofibrils, they often break, and even mature myofibrils typically fragment into pieces of about 300 µm in length (illustrated in Figure 1E, now Figure 2A in the revised manuscript). Importantly, our measurements show that this fragmentation does not affect the assessed sarcomere length or width (as shown in Supplementary Figure 3B, now Figure 3C in the revised manuscript).

      Line 312 - What does "stable association" mean in this context? The authors mention early timepoints lack stable association of alpha-Actinin or Zasp52, and they reference Fig. S4C, but this figure only shows 72h and 24 AE, not 36h and 48 h APF. Previous reports have seen localization of both alpha-Actinin and Zasp52, so presumably the detergent or mechanical isolation is stripping these components off of the isolated myofibrils up until 72h.

      In agreement with previous reports, we also detected both α-Actinin (as shown in former Supplementary Figure 3B, now Figure 3C) and Zasp52 in microdissected IFM starting from 36 hours APF. However, these markers were largely absent from the isolated myofibrils of young pupae (36 to 60 hours APF). By 60 hours APF, strong α-Actinin and Zasp52 staining became evident in isolated myofibrils, whereas dTitin epitopes were clearly detectable from the earliest time point examined. This indicates that some proteins, such as α-Actinin and Zasp52, can be lost during the isolation process, whereas others like dTitin are retained and this differential sensitivity appears to depend on developmental stage. A likely explanation is that α-Actinin and Zasp52 are recruited early to Z-bodies but are only fully incorporated as more mature Z-disks form between 48 and 60 hours APF. This incomplete incorporation at the earlier stages could account for their loss during the isolation process. This interpretation is supported by our morphological analysis of the Z-discs, as shown in the dSTORM dataset (former Figure 3B, B’’, now Figure 4C, E) and in longitudinal TEM sections (former Supplementary Figure 5B, now in Figure 6B). Because α-Actinin and Zasp52 are not detected in isolated myofibrils at 36 and 48 hours APF, they are not included in Figure S4C (Figure 5C in the revised manuscript). This is explained in the updated figure legend.

      This same type of issue comes up again in Lines 325-334, where the authors talk about 3E8 and MAC147. They state that 3E8 signal significantly declines in later stages and that MAC147 is not suitable to label myofibrils in young pupae, but they only show data from 72 APF and 24 AE (which looks to have decent staining for both 3E8 and MAC147). A clearer explanation here would be helpful.

      To put it simply: we used one myosin antibody to label the A-band in the IFM of 36h APF and 48h APF animals, and a different antibody for the 72h APF and 24h AE stages. In more detail: Myosin 3E8 is a monoclonal antibody targeting the myosin heavy chain and labels the entire length of mature thick filaments except for the bare zone (former Supplementary Figure 4D, now in Figure 5D), suggesting its epitope is near the head domain. As a result, we expect a uniform A-band staining - excluding the bare zone - which is exactly what we observe in the IFM of young pupae (36h APF and 48h APF; formerly Figure 3B, now Figure 4C in the revised manuscript). However, at 72h APF and 24h AE, Myosin 3E8 produces a different staining pattern: two narrow stripes flanking the bare zone and two broader, more diffuse stripes near the A/I band junction (former Supplementary Figure 4D, now Figure 5D). This change is likely due to restricted antigen accessibility at these later developmental stages - a common issue in the densely packed IFM - making this antibody unsuitable for reliably measuring thick filament length in these stages.

      MAC147 is another monoclonal antibody against Mhc that recognizes an epitope near the head domain. However, it only works reliably in more mature myofibrils (72h APF and 24h AE; formerly Figure 3B, now Figure 4C in the revised manuscript), likely due to its specificity for a particular Mhc isoform. This is why we do not include images from earlier developmental stages using this antibody. We added a revised, concise explanation in the main text for general readers, and provided a more detailed description for specialist readers in the legend of Supplementary Figure 4D (updated as Figure 5D in the revised manuscript).

      Figure 3B. The authors show the H, Z, and I lengths in B', B'', and B''' and discuss these lengths in the text (lines 305-320). It would also be nice to actually have the plots showing the measured/calculated lengths for thin and thick filaments. These are mentioned in the results, but I cannot find the plots in the figures and there is no panel reference.

      A summary table of the measured and calculated parameters is provided in Fig4SourceData (Fig7Source Data in the revised manuscript). However, following the reviewer’s suggestion, we also generated an additional plot (Supplementary Figure 5 in the revised manuscript) that displays the calculated thin and thick filament lengths.

      Line 400. Does the model in Figure 4 actually have molecular resolution as the authors claim? From these views, thick and thin filaments appear to be represented by cylindrical objects. Localization of specific molecules would require further modeling with individual proteins. Or do the authors mean localization from STORM imaging relative to the ends of the thick and/or thin filaments? The model itself is a useful contribution, but based on Figure 4, resolution of individual molecules is not evident.

      The reviewer is correct; and we fully agree that we do not present a molecular model of sarcomeres in this study - nor do we claim to. Instead we present a myofilament level model. Nevertheless, the scaled myofilament lattice model we introduce could serve as a geometric constraint when constructing supramolecular models of sarcomeres. As the reviewer rightly notes, implementing such an approach would require additional effort.

      The main Results section of the text is condensed into 4 figures. However, I found myself flipping back and forth between the main figures and the supplement continuously, especially parts of Supplemental Figures 1, 3, 4, and 5. With such large amounts of detail in the Results relying on the supplement, it may be worth considering reorganizing the main and supplemental figures, and having 7 main figures, to include important panels that are currently in the supplement (esp. Fig S1B, S1C, S1D, S3B, S4, S5).

      We found it a very useful suggestion, and we substantially reorganized the figures in the revised manuscript according to the recommendations of the reviewer.

      Minor comments: On the plots in Fig. S1B, D, and F, it is hard to see the color of the dots because the red error bars are on top of them. Can the other distribution dots be tinted the correct color or the x-axis labels be added, so it is clear which dataset is which?

      We significantly enlarged the dots to enhance visual clarity.

      Line 142 needs a reference to Figure S1, Panel E, which shows the accuracy and precision measurements.

      The requested panel reference has now been included in the revised manuscript.

      Lines 198 - is this range from the above publications? Needs to be clearly cited.

      The range has indeed been estimated using measurements from the aforementioned publications, and this point is now further clarified in the revised text.

      Figure S3B is confusing - why do the blow-ups overlap both the top (presumably microdissected) and the bottom (presumably isolated) images? The identity of microdissected images should be labeled, as they are hard to see underneath of the blown-up images and the identity of individual image planes wasn't immediately obvious.

      We refined the panel structure of Figure S3B (Figure 3C in the revised manuscript) to enhance clarity as the reviewer suggested.

      Line 298. By "misaligned," do the authors mean the pointed ends are not uniformly anchored in the z-disc, leading to the wide z-disc measurements? At this early stage, I'm not sure "misaligned" is the right word - perhaps "were not yet aligned in register at the z-disc" or something similar.

      We revised the text for clarity. It now reads: At 36 hours APF, thin filaments had not yet aligned in perfect register at the Z-disc, with most measuring less than 560 nm in length - and exhibiting considerable variability.

      Figure S6 - spelling mistake in label of panel A, "sarcomer" should be "sarcomere"

      The typo is corrected.

      Line 487. Spelling "Zaps52" should be "Zasp52"

      The typo is corrected.

      Line 887. Spelling "Myofilement" should be "Myofilament"

      The typo is corrected.

      Line 946-947. In the legend for Supp. Fig. 3., the authors should specify which published datasets on sarcomere length are shown in the figure by including the references in the legend. Presumably the "isolated individual myofibrils" are the blue "this study" lines, leaving the "microdissected muscles" as the magenta "previous reports" on the figure. Without the reference, it is not clear if these are microdissected, isolated myofibrils, hemi-thorax sections, cryosections, or another preparation method for the "previous reports" data.

      The references have now been added to both the figure and its legend.

      **Referee Cross-commenting**

      I agree with the comments from the other reviewers. Many of the major themes are consistent across the reviews, including regarding the model, preparation methods, and the software tool.

      Reviewer #2 (Significance (Required)):

      Strengths: This manuscript is an important contribution to the field of sarcomere development. The authors use modern technologies to revisit variation in morphometric measurements in the literature, and they identify parameters that influence this variation. Notably, sex-specific differences, DLM versus DVM measurements, and mounting media are potential contributors to the variability. Combining TEM and STORM with a confocal timecourse of isolated myofibrils, they refine previously published values of sarcomere length and width, and add more comprehensive data for filament length, number and spacing. This highly accurate timecourse demonstrates continual growth of sarcomeres after 48 h APF, and corrects some inconsistencies from previous large-scale timecourse datasets. These data are very valuable to the field, especially Drosophila muscle biologists, and will serve as a comparative resource for future studies. Weaknesses: At early timepoints, loss of sarcomere components through mechanical or detergent-mediated artifacts may influence the authors' measurements. In addition, isolating myofibrils is not always the most ideal approach, as it loses information on myofiber structure as well as organization and structure of the myofibrils in vivo.

      We believe that the control experiments we presented here adequately demonstrate that sarcomere measurements are not affected by the myofibril isolation process at early timepoints (Figure 3C). Nevertheless, we certainly agree with the reviewer that isolated myofibrils alone cannot capture the entire complexity of muscle tissues, and additional approaches should also be applied in complex projects. Yet, we are confident that our approach offers the most reliable and efficient method for precise morphometric analysis of the sarcomeres, and although alone it is very unlikely to be sufficient to address all questions of a muscle development project, it can still be applied as a very useful and robust tool.

      The point regarding liquid versus hardening mounting media is valuable, but remains to be tested and validated with the diverse liquid and hardening media used by other labs.

      Although it would not be feasible for us to test all liquid and hardening media used by others under all possible conditions, we tested the effect of Vectashield (the most commonly used liquid medium), as the reviewer suggested, and the results are now included in the manuscript. We think that this is a valuable extension of the materials and conditions we tested. We note, however, that our primary goal was not to test as many conditions as possible (their number is virtually endless), but rather to raise awareness among colleagues that these variables can significantly impact the data obtained and affect their comparability.

      The IMA software seems to be designed specifically for analysis of isolated myofibrils, and it is unclear if it would work for other types of IFM preparations.

      As stated in the manuscript, IMA is a specialized tool designed for the analysis of individual myofibrils. While it can also process other types of IFM preparations in semi-automatic or manual modes, we believe these approaches compromise both efficiency and accuracy. This is further clarified in the revised manuscript.

      A last point is that TEM and STORM may not be available on a regular basis to many labs, hindering wide implementation of the approach used in this manuscript to generate very accurate and detailed measurements of sarcomere morphometrics.

      Regarding the availability of TEM and STORM, we acknowledge that these techniques are not universally accessible. However, a major value of our work is precisely that our open-source software tool now allows researchers to generate valuable data using only a confocal microscope in combination with our published datasets.

      Audience: Scientists who study sarcomerogenesis or Drosophila muscle biology.

      My expertise: I study muscle development in the Drosophila model.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): Summary: This manuscript presents a computational tool to quantify sarcomere length and myofibril width of the Drosophila indirect flight muscles, including developmental samples. This tool was applied to confocal and STORM super-resolution images of isolated myofibrils from adult and developing flight muscles. Thick filament numbers per myofibril were counted during development of flight muscles. A myofilament model of developing flight muscle myofibrils is presented that remains speculative for the early developmental stages.

      Major comments: 1. The title of the manuscript appears unclear. What is a lattice model? Lattice is an ordered array. The filament array parameters for mature flight muscles were already measured. It appears that the authors speculate how this order might be generated during sarcomere assembly, which is not studied in this manuscript as it is limited to periodic arrays after 36h APF.

      As the reviewer correctly points out, a lattice refers to an ordered array - in the case of IFM sarcomeres, this includes both thin and thick filaments. Therefore, the phrase "myofilament lattice model of Drosophila flight muscle sarcomeres" specifically describes a model representing the spatial organization of these filament arrays within the sarcomere. To provide additional clarity for readers, we have revised the title to include more context. It now reads: Developmental Remodeling of Drosophila Flight Muscle Sarcomeres: A Scaled Myofilament Lattice Model Based on Multiscale Morphometrics

      To create a model of these arrays, three essential pieces of information are required:

      1) The length of the filaments,

      2) The number of filaments, and

      3) The relative position of the filaments.

      While some direct measurements are available in the literature, and others can be used to calculate the necessary values, the available data are often contradictory or inconsistent with one another (as described in our manuscript), making them unsuitable for constructing scaled models of the myofilament arrays. In contrast, here we present a comprehensive and consistent set of measurements that enabled us to build models not only of mature sarcomeres but also of sarcomeres at three other significant developmental time points.

      Regarding the mention of "sarcomere assembly" in line 37, we intended it to refer to the growth of the sarcomeres, not their initial formation. We do not speculate about sarcomere assembly anywhere in the text. In fact, we have clearly stated multiple times that our focus is on the growth of the IFM myofilament array during myofibrillogenesis. Nevertheless, to avoid confusion, we revised the phrase in line 37 to "sarcomere growth".

      The authors review the flight muscle sarcomere length literature and conclude it is variable because of imprecise measurements. Likely this is partially true, however, more importantly is that the sarcomere length and width changes during isolation methods of the myofibrils, as well as by various embedding methods, as the authors show here as well in Figure 1B-E.

      We dedicated two sections of the Results - “An automated method to accurately measure sarcomeric parameters” and “IFM sarcomere morphometrics are affected by sex, age, fiber type, and sample preparation” - to exploring potential sources of variability in published IFM sarcomere measurements. Based on these analyses, we conclude that such variability stems from both measurement imprecision and biological or technical factors, including sex, age, fiber type and, foremost, sample preparation. Because it is difficult to quantify the relative impact of each variable across published studies, we have refrained from speculating about the relative contribution of the different factors in the revised manuscript.

      Hence, I find the strong claims the authors make here surprising, while they are isolating the myofibrils. Hence, these myofibrils are ruptured at the ends, relaxed or contracted, depending on buffer choice and passive tension is released. On page 8, the authors correctly state that the embedding medium causes shrinkage of the myofibrils. While isolation is state of the art for electron microscopy techniques, other methods including sectioning or even whole mount preparation have been developed for high resolution microscopy of IFMs that avoid these artifacts. Unfortunately, this manuscript only uses isolated myofibrils that were fixed and then mechanically dissociated by pipetting. This method likely induces variations as seen by the large spread of sarcomere length reported in Figure 1C (2.8-3.9µm?) and even bigger spreads for myofibril widths. Are these also seen in tissue without dissections? Unfortunately, no comparison to intact flight muscles is reported with the here presented quantification tool. The sarcomere length spread in the developmental samples is even larger.

      The major issue raised in this paragraph is the use of isolated myofibrils versus intact flight muscle preparations. The reviewer claims that the latter might be superior because the isolated myofibrils are ruptured at their ends. Clearly, intact IFMs cannot be imaged in vivo by light microscopy because the adult fly cuticle is opaque. To visualize these muscles, one must open the thorax, but neither microdissection nor sectioning preserves them perfectly: even the cleanest longitudinal cuts sever some myofibrils, and dissection itself can damage the tissue. Although published images often show only the most pristine regions, the practice of selective cropping cannot be taken as a scientific argument. Here, by comparing sarcomere lengths measured in isolated myofibrils with those from whole-mount longitudinal DLM sections and microdissected IFM myofibers, we demonstrate that isolation does not alter sarcomere length (Figure 1E, now Figure 2A in the revised manuscript). As to myofibril width, it is determined by two parameters: the number of myofilaments and the spacing between them. In vivo filament spacing has been measured directly, and filament counts can be obtained from EM cross-sections of DLM fibers. Combining these values gives an expected in vivo myofibril diameter. While isolated myofibrils measure thinner than those in whole-mount or microdissected samples (Figure 1E, now Figure 2A in the revised manuscript), their diameter closely matches this in vivo estimate (see manuscript, lines 187–198). Therefore, we conclude that isolated myofibrils (even if this seems counterintuitive to the reviewer) are better suited to sarcomere measurements than whole-mount preparations, and that is why we primarily rely on them here.

      Nevertheless, we certainly recognize that isolated myofibrils cannot recapitulate every aspect of an IFM fiber, and we do not question the need for whole-mount preparations in IFM studies.
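
      The width argument above (filament count combined with filament spacing) can be illustrated with a minimal geometric sketch. It assumes that the thick filaments sit on an ideal hexagonal lattice and that the myofibril cross-section is approximately circular; the function name and the input values are hypothetical placeholders, not the calculation or the measurements used in the manuscript. Note that the function expects the center-to-center filament spacing, which is not the same quantity as a lattice-plane spacing obtained from X-ray diffraction.

```python
import math


def expected_myofibril_diameter_nm(n_thick_filaments, center_spacing_nm):
    """Equivalent circular diameter of a hexagonal thick-filament lattice.

    Assumption (not from the manuscript): ideal hexagonal packing and a
    roughly circular myofibril cross-section.
    """
    if n_thick_filaments <= 0 or center_spacing_nm <= 0:
        raise ValueError("filament count and spacing must be positive")
    # Each point of a hexagonal lattice owns a cell of area (sqrt(3)/2) * a^2,
    # where a is the center-to-center spacing.
    area_per_filament = (math.sqrt(3) / 2.0) * center_spacing_nm ** 2
    total_area = n_thick_filaments * area_per_filament
    # Diameter of the circle that has the same cross-sectional area.
    return 2.0 * math.sqrt(total_area / math.pi)


# Placeholder inputs for illustration only; they are not values from the manuscript.
example_nm = expected_myofibril_diameter_nm(n_thick_filaments=800, center_spacing_nm=56.0)
print(f"expected diameter: {example_nm / 1000:.2f} µm")
```

      In this sketch the diameter scales linearly with the spacing and with the square root of the filament count, which is why both parameters are needed to predict the in vivo width.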

      In addition to this general answer to the issues raised in the reviewer’s paragraph above, we would like to respond specifically to some of the remarks:

      “Unfortunately, this manuscript only uses isolated myofibrils that were fixed and then mechanically dissociated by pipetting.”

      The statement that “this manuscript only uses isolated myofibrils” is false, as we used different preparation methods for the initial comparisons (see Figure 1E, now Figure 2A in the revised manuscript). Additionally, contrary to the reviewer’s assumption, the myofibrils were first dissociated and then fixed, not vice versa (as described in the Materials and Methods section).

      “This method likely induces variations as seen by the large spread of sarcomere length reported in Figure 1C (2.8-3.9µm?) and even bigger spreads for myofibril widths. Are these also seen in tissue without dissections?”

      This remark makes absolutely no sense, as we do not report sarcomere length values in Figure 1C at all. Even assuming that the reviewer meant to refer to Figure 1B, it remains a misunderstanding or a false statement, because that panel shows the variation found in published data (not in our current data), as is clearly explained in both the figure legend and the main text. Regardless, the stated spread does not appear unusual. In the article by Spletter et al. (2018), the authors report a similar spread (2.576–3.542 µm) for sarcomere length in mature IFM using whole-mount DLM cross-sections. As to the second question, we do observe a comparable spread in other preparations as well (see Figure 1E, now Figure 2A in the revised manuscript), which again contradicts the (clearly false) assumption of the reviewer.

      “Unfortunately, no comparison to intact flight muscles is reported with the here presented quantification tool.”

      This is also a false statement, as we do report a comparison to whole-mount cross-sections, which we believe the reviewer considers “intact”, in Figure 1E (Figure 2A in the revised manuscript).

      “The sarcomere length spread in the developmental samples is even larger.”

      The spread is not larger at all than in previous reports, as clearly shown in Supplementary Figure 3A.

      The authors suggest that there are sex differences in sarcomere length and pupal development duration. This is potentially interesting, unfortunately they then use mixed sex samples to analyse sarcomeres during flight muscle development.

      In the revised manuscript, we now provide a more detailed description of a subtle post-eclosion difference in IFM sarcomere metrics between male and female Drosophila. We attribute this variation to the well-established observation that female pupae develop slightly faster than males, a difference that may persist until shortly after eclosion. Confirming this experimentally would require considerable effort with limited scientific benefit. Nonetheless, the subtle nature of this sex-linked variation reinforced our decision to include IFM sarcomeres from both male and female flies in our comprehensive developmental analysis.

      The IMA software tool lacks critical assessment of its performance compared to other tools and the validation presented is too limited. IMA seems to generate systematic errors, based on Fig S1E, as it does not report the ground truth. These have to be discussed and compared to available tools. The principles of fitting used in IMA seem well adapted to IFM myofibrils in low noise conditions, but may not be usable in other situations. This should be assessed and discussed.

      IMA is a specialized software tool developed to address a specific need, notably, to accurately and efficiently measure sarcomere length and myofibril diameter in individual IFM myofibril images labeled with both phalloidin and Z-disc markers. For our purposes, it remains the most suitable and reliable option, and we are confident that IMA outperforms all other available tools. To demonstrate this, we have included a table comparing the few alternatives (MyofibrilJ, SarcGraph, and sarcApp) capable of both measurements, which further supports our conclusion. Given IMA's focused application, extensive validation under artificially low signal-to-noise conditions is unnecessary. While IMA may introduce minor systematic errors (~0.01 µm for sarcomere length and ~0.03 µm for myofibril diameter), these are negligible errors relative to the limitations of the simulated ground truth data used for benchmarking. This point is now addressed in the manuscript.

      It is claimed that validation was achieved on simulated IFM images: do the authors rather mean simulated isolated IFM myofibril images? This is not quite the same in terms of algorithm complexity and this should be corrected if this is the case.

      Indeed, we used simulated individual IFM myofibril images, where both phalloidin labeling and Z-disc labeling are present. This is clearly shown in Supplementary Figure 1A, and stated in the text when first introduced: “we generated artificial images of IFM myofibrils with known dimensions, simulating the image formation process”.
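
      The idea of validating a measurement against simulated data with known dimensions can be sketched in one dimension. The example below is an illustrative assumption and not IMA’s algorithm or its simulation pipeline: it builds a synthetic Z-disc intensity profile along the myofibril axis with a known sarcomere length, then recovers that length from the spacing between intensity-weighted peak centroids. All function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_z_disc_profile(sarcomere_length_um, n_sarcomeres, pixel_size_um, sigma_um):
    """Synthetic 1D profile: evenly spaced Gaussian Z-disc peaks (ground truth known)."""
    n_peaks = n_sarcomeres + 1
    x = np.arange(0.0, sarcomere_length_um * n_peaks, pixel_size_um)
    centers = sarcomere_length_um * (np.arange(n_peaks) + 0.5)
    profile = np.zeros_like(x)
    for c in centers:
        profile += np.exp(-0.5 * ((x - c) / sigma_um) ** 2)
    return x, profile


def estimate_sarcomere_length(x, profile, threshold=0.5):
    """Mean spacing between centroids of the above-threshold segments."""
    mask = profile > threshold * profile.max()
    # Locate the start (inclusive) and stop (exclusive) index of each segment.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, stops = edges[::2], edges[1::2]
    centroids = [np.average(x[a:b], weights=profile[a:b]) for a, b in zip(starts, stops)]
    return float(np.mean(np.diff(centroids)))


# Illustrative parameters only (noise-free profile, hypothetical pixel size).
true_length_um = 3.45
x, profile = simulate_z_disc_profile(true_length_um, n_sarcomeres=5,
                                     pixel_size_um=0.05, sigma_um=0.1)
estimate_um = estimate_sarcomere_length(x, profile)
print(f"ground truth {true_length_um} µm, estimate {estimate_um:.3f} µm")
```

      Comparing the estimate with the known input is what exposes any systematic error of a measurement method; real images would additionally require noise, blur and two-dimensional segmentation, which this sketch omits.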

      The authors need to revise their comparison to other tools. It is incomplete and seemingly incorrect. It should be clearly stated that IMA is limited to isolated myofibrils, which is a far easier segmentation task than what other tools can do, such as sarcApp (Neininger-Castro et al. 2023, PMID: 37921850). Defining the acronym would be valuable in that sense. The claim line 129-130 "none can adequately measure myofibril diameter from regular side view images" is unclear. What do the authors refer to as "side view images"? Sarc-Graph from Zhao et al 2021, PMID: 34613960, and sarcApp from Neininger-Castro et al. 2023 provide sarcomere width, in conditions that are very similar to what IMA does, e.g. on xy images based on the documentation provided on github. A performance comparison with these tools would be valuable. Does installation and use of IMA require computational skills?

      Motivated by the reviewer’s comments, we revised the section introducing IMA. However, we chose not to include an extensive comparison with other software tools, as this would divert the manuscript’s focus without impacting the main conclusions. Instead, we added a summary table highlighting the key requirements for analyzing IFM sarcomere morphometrics from Z-stacks of phalloidin- and Z-line-labeled individual myofibrils and compared the available tools accordingly. In our experience, most software tools are developed to address very specific problems, even those marketed as general-purpose solutions. Consequently, applying them beyond their intended scope often results in reduced efficiency and suboptimal performance.

      Although sarcApp was initially available as a free tool, one of its dependencies (PySimpleGUI 5) has since adopted a commercial license model. Using a trial version of PySimpleGUI 5, we evaluated sarcApp on our dataset. The software is limited to single-plane image input; hence, raw image stacks must be preprocessed into a suitable format, which is a time-consuming step. Furthermore, implementation requires basic programming proficiency, as parameter adjustments must be performed directly within the source code to accommodate dataset-specific configurations. Once appropriately configured, sarcApp reliably quantifies both sarcomere length and myofibril width with accuracy comparable to that of IMA. However, it lacks built-in diagnostic feedback or visualization tools to facilitate measurement verification or troubleshooting during batch processing.

      SarcGraph also supports only single-plane image inputs and requires prior image preprocessing. Additionally, images must be loaded manually one by one, which further reduces processing efficiency. Parameter optimization relies on direct code modification through a trial-and-error process, demanding a certain level of programming proficiency. Even with these adjustments, the software frequently introduces artifacts - such as Z-line splitting - when applied to our dataset. Even when segmentation is successful, sarcomere length is often overestimated, whereas myofibril diameter is consistently underestimated.

      In contrast, IMA was designed for ease of use and does not require any programming experience to install or operate. It can automatically handle raw microscopic image formats without the need for preprocessing. Segmentation is fully automated, with no requirement for parameter tuning. The tool provides visual feedback during both the segmentation and fitting steps, allowing users to confidently assess and validate the results. IMA produces accurate and precise measurements of sarcomere length and diameter. Batch processing is enabled by default, significantly improving efficiency when analyzing multiple images. Finally, contrary to the reviewer’s statement, IMA is not limited to isolated myofibrils. It is optimized for isolated myofibrils (i.e. full performance is achieved on these samples), but it can also work on whole-mount preparations in semi-automatic and manual mode, which still allow precise measurements (with some reduction in processing efficiency).

      As to the minor comments, the acronym IMA was already defined in lines 541 and 917–918 of the original submission, as well as on the software’s GitHub page. Additionally, we replaced the phrase "side view images" with "longitudinal myofibril projections" to improve clarity.

      How do the authors know that the bright phalloidin signal visible at the Z-disc at 36h and 48h APF is due to actin filament overlap, as suggested? An alternative explanation is a larger number of short actin filaments at the early Z-discs.

      It is widely accepted that the bright phalloidin signal at the Z-line in mature sarcomeres reflects actin filament overlap (e.g., Littlefield and Fowler, 2002; PMID: 11964243). Accordingly, in slightly stretched myofibrils, this bright signal diminishes, and in more significantly stretched myofibrils, a small gap appears (e.g., Kulke et al., 2001; PMID: 11535621). The width of this bright phalloidin signal corresponds to the electron-dense band seen in longitudinal EM sections (Figure 3B and Supplementary Figure 5B, now Figure 4B and Figure 6B in the revised manuscript) and matches the actin filament overlap observed in Z-disc cryo-EM reconstructions from other species (Yeganeh et al., 2023; Rusu et al., 2017), where individual thin filaments can be resolved. By extension, we interpret the bright phalloidin signals at the Z-discs observed at 36 h and 48 h APF as arising from similar actin filament overlaps, given their comparable width to the electron-dense Z-bodies described both in our study (Supplementary Figure 5B, now Figure 6B in the revised manuscript) and by Reedy and Beall (1993). While we cannot fully rule out the reviewer’s alternative interpretation, for the time being it remains a bold speculation without supporting evidence, and therefore we prefer to stay with the conventional view.

      The authors seem to doubt their own interpretation that actin filaments shrink when reading line 304 and following. This is obviously critical for the "model" presented.

      Contrary to what the reviewer implies, we certainly do not doubt our own interpretation. To avoid confusion, however, we revised the corresponding paragraph in the manuscript to explain it in more detail, and we provide a brief overview here. Between 36 h and 48 h APF we observe a pronounced structural transition in the IFM sarcomeres. In EM cross-sections, the previously irregular myofilament lattice becomes organized into a regular hexagonal pattern (Figure 3A, now Figure 4A in the revised manuscript) with filament spacing typical of mature myofibrils (Supplementary Figure 5A, now Figure 6A in the revised manuscript). In longitudinal EM sections, the elongated, amorphous Z-bodies condense along the myofibril axis to form well-defined, adult-like Z-discs (Supplementary Figure 5B, now Figure 6B in the revised manuscript). Similarly, dSTORM imaging shows that the Z-disc associated D-Titin epitopes become more compact and organized during this period (Supplementary Figure 4E, now Figure 5E in the revised manuscript). The edges of the thick filament arrays also become more sharply defined, and the appearance of a distinct bare zone indicates the establishment of a regular register (Figure 3B, now Figure 4B in the revised manuscript). If a similar reorganization occurs within the thin filament array, the apparent length of the thin filament array would decrease, not because individual filaments shorten but because their alignment improves. Although we cannot directly resolve single thin filaments, this reorganization offers the most plausible explanation for the observed change.

      Minor comments: 1. Figure S1B is not called out in the text.

      The reviewer might have missed this, but in fact, it is explicitly called out in line 181.

      Fig. 1: Please state whenever images are simulations?

      We appreciate the reviewer’s observation that the simulated IFM myofibril images are indistinguishable from the real ones, as this confirms the adequacy of these images for testing our software tool. However, this is already clearly indicated: Figure 1B features simulated images, as noted in the figure legend (line 824), and Supplementary Figure 1A similarly shows simulated images, as stated both in the legend (line 886) and in the figure.

      Fig. 2: Length-width correlation - please provide individual points color-coded by time point?

      As suggested by the reviewer, we generated a plot with the individual points color-coded by time.

      "newly eclosed males and females, we observed that males have slightly shorter sarcomeres and narrower myofibrils". Please provide a statistical test supporting the difference.

      In the revised manuscript, we compared sarcomere length and myofibril width between males and females from 0 to 96 hours AE using a two-way ANOVA with Sidak’s multiple comparisons test. We expanded our description of these observations in the main text, and details of the statistical analysis are now included in the revised figure legend (Figure 1E). Briefly, newly eclosed males showed slightly shorter sarcomeres than females - a consistent but non-significant trend (p = 0.9846) - which resolved by 12 h AE, with sarcomere lengths remaining similar thereafter (p = 0.1533; Figure 1E). In contrast, myofibril width was significantly narrower in the newly eclosed males (p = 0.0374), but this difference disappeared between 24 and 48 h AE as myofibrils expanded in diameter during post-eclosion development (p

      Were statistical tests performed using animals as sample numbers? Please clarify in the images what are animal and what are sarcomere numbers.

      Following standard guidelines, statistical tests were performed using the means of independent experiments, as noted in the figure legends. For each experiment, we used approximately 6 animals, and this information is now included in the Materials and Methods section.
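
      The practice described above, testing on the means of independent experiments rather than on individual sarcomeres, can be sketched as follows. This is a minimal illustration under assumed data structures; the function name and the numbers are hypothetical and do not come from the manuscript.

```python
from statistics import mean


def experiment_means(measurements_by_experiment):
    """Collapse per-sarcomere values to one mean per independent experiment.

    The statistical sample size is then the number of experiments,
    not the number of sarcomeres measured.
    """
    return {exp: mean(values) for exp, values in measurements_by_experiment.items()}


# Hypothetical sarcomere lengths (µm) from three independent experiments.
toy_data = {
    "experiment_1": [3.40, 3.50, 3.45],
    "experiment_2": [3.42, 3.48],
    "experiment_3": [3.46, 3.44, 3.47, 3.43],
}
means = experiment_means(toy_data)
print(f"n = {len(means)} experiments")
```

      Averaging within each experiment first avoids inflating the sample size with measurements that share the same animals and preparation.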

      mef2-Gal4 should be spelled Mef2-GAL4 according to Flybase.

      This has been corrected in the revised text and figures.

      Are the images shown in Figure 2B representative? 96h AE appears thicker than 24h AE but the graph reports no difference.

      We aimed to show representative images; however, in the case of 96h APF we may have selected an unsuitable example. We have now replaced the image with a more appropriate one.

      The authors only found Zasp52 and alpha-Actinin at the Z-discs from 72h APF onwards, which is different to what others have reported.

      Consistent with previous reports, we detected both α-Actinin (see Supplementary Figure 3B, now Figure 3C in the revised manuscript) and Zasp52 in microdissected IFMs as early as 36 hours APF. However, these markers were largely absent in isolated myofibrils from the early pupal stages (36–60 hours APF). By 60 hours APF, strong α-Actinin and Zasp52 signals were clearly visible in isolated myofibrils (the closest timepoint captured by dSTORM is 72h APF). As discussed in the manuscript, a likely explanation is that α-Actinin and Zasp52 are recruited to developing Z-bodies early on but are only fully incorporated into mature Z-discs between 48 and 60 hours APF. Their incomplete integration at earlier stages may lead to their loss during the isolation procedure.

      Thick filament length during development has also been estimated by Orfanos and Sparrow, which should be cited (PMID: 23178940)

      Contrary to the reviewer’s claim, the article 'Myosin isoform switching during assembly of the Drosophila flight muscle thick filament lattice' does not provide any measurements or estimates of thick filament length; it only includes a schematic illustration where the length of the thick filaments is not based on empirical data.

      **Referee Cross-commenting**

      I also agree with my colleagues’ comments, which are largely consistent.

      Reviewer #3 (Significance (Required)):

      This paper introduces a tool to measure sarcomere length. Easy to use tools that do this as well already exist. The tool can also measure sarcomere width, which it claims as unique point, which is not the case, see above comment.

      We are aware that other tools exist to measure sarcomere parameters (and we did not claim otherwise in our manuscript); nevertheless, we need to emphasize that, based on our comparisons, IMA is superior to all three alternatives. Three software tools could, in principle, be used to measure both sarcomere length and myofibril diameter: MyofibrilJ, SarcGraph, and sarcApp. However, two of them - MyofibrilJ and SarcGraph - consistently under- or overestimate these values. The only tool capable of performing these measurements reliably, sarcApp, is no longer freely available, requires programming expertise, and does not support raw image file formats, making it difficult to use in practice (see above comments for more details). In contrast, IMA is user-friendly and does not require any programming expertise to install or operate. It can automatically process raw microscopic image formats without the need for preprocessing. Segmentation is fully automated, and no parameter tuning is necessary. The tool offers visual feedback on both the segmentation and fitting processes, enabling users to validate results with confidence. IMA delivers accurate and precise measurements of sarcomere length and diameter. Additionally, batch processing is enabled by default, significantly enhancing workflow efficiency.

      This manuscript shows that, depending on the isolation and embedding media, sarcomere and myofibril width changes and hence artifacts can be introduced. While this is not surprising, it has not been well controlled in a number of previous publications.

      Furthermore, this paper measures sarcomere length and width during flight muscle development and consolidates what was already known from previous publications. Sarcomeres are added until 48 h APF, then they grow in diameter. Despite strong claims in the text, I do not see any significant novel findings how sarcomeres grow in length or width or any significant deviations from what has been published before. This is even documented in the supplementary graphs by comparing to published data. It is close to identical.

      The overall process has been quantitatively described in four previous studies (Reedy and Beall, 1993; Orfanos et al., 2015; Spletter et al., 2018; Nikonova et al., 2024). While there is general agreement on the pattern of sarcomere development, significant discrepancies exist among these datasets; differences that become particularly problematic when attempting to build structural models. More specifically:

      Reedy and Beall (1993) report substantially shorter sarcomeres compared to all other datasets, including ours. This discrepancy likely stems from two factors: (i) their use of longitudinal EM sections, where sample preparation is known to cause considerable tissue shrinkage; and (ii) the maintenance of their flies at 23 °C, a temperature that clearly delays development relative to the more commonly used 25 °C. Interestingly, Spletter et al. (2018) and Nikonova et al. (2024) conducted their experiments at 27 °C, which also deviates from standard conditions and may complicate comparisons. Orfanos et al. (2015) suggested that mature sarcomere length is reached by approximately 88 hours after puparium formation (APF). In contrast, our measurements show that sarcomeres continue to elongate beyond this point, reaching mature length between 12 and 24 hours post-eclosion.

      All four earlier studies report a mature sarcomere length around 3.2-3.3 µm, only slightly longer than the ~3.2 µm length of thick filaments (Katzemich et al., 2012; Gasek et al., 2016). This would imply an I-band length below ~100 nm, which is an implausibly short distance. In contrast, our data, along with several recent studies (González-Morales et al., 2019; Deng et al., 2021; Dhanyasi et al., 2020; DeAguero et al., 2019), support a mature sarcomere length of approximately 3.45 µm, placing the length of the I-band at around 250 nm. This estimate is more consistent with high-resolution structural observations from longitudinal EM sections and fluorescent nanoscopy (Szikora et al., 2020; Schueder et al., 2023).

      Although Reedy and Beall (1993) provide limited data on myofibril diameter during myofibrillogenesis, a more detailed quantitative analysis is presented by Spletter et al. (2018) and by Nikonova et al. (2024). Interestingly, Spletter et al. report two separate datasets - one based on longitudinal sections and another on cross-sections of DLM fibers. While the measurements are consistent during early pupal stages, they diverge significantly in mature IFMs (1.116 ± 0.1025 µm vs. 1.428 ± 0.0995 µm), a discrepancy that is not addressed in their publication. Nikonova et al. (2024) report even narrower myofibril widths (0.9887 ± 0.1273 µm). Moreover, the reported diameters of early myofibrils in all three datasets are nearly twice as large as those reported by Reedy and Beall (1993) and in our own measurements, directly contradicting the reviewer's claim that the values are “close to identical.”

      Finally, our data clearly demonstrate that both the length and diameter of IFM sarcomeres reach a plateau in young adults, which is a key developmental feature not examined in previous studies.
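
      The I-band argument in the response above is simple arithmetic and can be checked directly. The sketch below assumes that the I-band length is the sarcomere length minus the thick filament length (that is, the total gap not covered by thick filaments within one sarcomere); the function name is hypothetical and the input values are the ones quoted in the text.

```python
def i_band_length_nm(sarcomere_length_um, thick_filament_length_um):
    """Sarcomere length not covered by thick filaments, in nanometres."""
    return (sarcomere_length_um - thick_filament_length_um) * 1000.0


THICK_FILAMENT_UM = 3.2  # thick filament length quoted in the response

# Sarcomere lengths quoted in the response: earlier studies (3.2-3.3 µm)
# and the value supported by the authors' data (about 3.45 µm).
for sarcomere_um in (3.2, 3.3, 3.45):
    gap_nm = i_band_length_nm(sarcomere_um, THICK_FILAMENT_UM)
    print(f"sarcomere {sarcomere_um} µm -> I-band about {round(gap_nm)} nm")
```

      With a 3.2 µm thick filament, sarcomere lengths of 3.2-3.3 µm leave at most about 100 nm for the I-band, whereas 3.45 µm leaves about 250 nm, which matches the figures given in the response.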

      In summary, we did not and we do not intend to claim that our conclusions are novel as to the general mechanisms of myofibril and sarcomere growth. Rather, our contribution lies in providing a high-precision, robust analysis of the growth process using a state-of-the-art toolkit, resulting in a comprehensive description that aligns with structural data obtained from TEM and dSTORM. We therefore believe that expert readers will recognize numerous valuable aspects of our approaches that will advance research in the field.

      Counting the total number of thick filaments during myofibril development is nice, however, this also has been done (REEDY, M. C. & BEALL, C. 1993, PMID: 8253277). In this old study, the authors reported the amount of filament across one myofibril. How does this compare to the new data here counting all filaments? Unfortunately, this is not discussed.

      Indeed, the study by Reedy and Beall (1993) was primarily based on longitudinal DLM sections, which were used to estimate myofibril width and to count the number of thick filaments in these lateral-view images (e.g., ~15 thick filaments wide at 75 hours APF), but total thick filament numbers were not provided. While such data could theoretically be used to estimate the number of myofilaments per myofibril, these estimates would depend on the unverified assumption that the section includes the full width of the myofibril. Additionally, the study did not provide standard deviations or the number of measurements, limiting the interpretability and reproducibility of its findings. These points highlight the need for a more rigorous and quantitative approach. For these reasons, we chose to quantify myofilament number using cross-sections, providing more accurate and reliable assessments.

      Besides the difference between lateral and cross sections, a direct comparison of our studies is further complicated by differences in the developmental time points and experimental conditions used. Reedy and Beall (1993) report data from pupae aged 42, 60, 75 and 100 hours, as well as from adults, whereas we present data from 36, 48, and 72 hours APF, and from 24 hours after eclosion, which corresponds to approximately 124 hours APF. Moreover, their experiments were carried out at 23 °C, a temperature that somewhat slows down pupal development and results in adult eclosion at around 112 hours APF, as stated in their study. In contrast, our experiments were carried out at the more commonly used 25 °C, where adults typically emerge around 100 hours APF.

      Collectively, these differences prevented meaningful comparisons between the two datasets, and therefore we preferred to avoid lengthy discussions on this issue.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript is a focused investigation of the phospho-regulation of a C. elegans kinesin-2 motor protein, OSM-3. In C. elegans sensory cilia, the kinesin-2 motor proteins Kinesin-II complex and OSM-3 homodimer transport IFT trains anterogradely to the ciliary tip. Kinesin-II carries OSM-3 as an inactive passenger from the ciliary base to the middle segment, where kinesin-II dissociates from IFT trains and OSM-3 gets activated and transports IFT trains to the distal segment. Therefore, activation/inactivation of OSM-3 plays an essential role in its ciliary function.

      Strengths:

      In this study, using mass spectrometry, the authors have shown that the NEKL-3 kinase phosphorylates a serine/threonine patch at the hinge region between coiled coils 1 and 2 of an OSM-3 dimer, referred to as the elbow region in ubiquitous kinesin-1. Phosphomimic mutants of these sites inhibit OSM-3 motility both in vitro and in vivo, suggesting that this phosphorylation is critical for the autoinhibition of the motor. Conversely, phospho-dead mutants of these sites hyperactivate OSM-3 motility in vitro and affect the localization of OSM-3 in C. elegans. The authors also showed that an alanine-to-threonine mutation at one of the phosphorylation sites rescues OSM-3 function in live worms.

      Weaknesses:

      Collectively, this study presents evidence for the physiological role of OSM-3 elbow phosphorylation in its autoregulation, which affects ciliary localization and function of this motor. Overall, the work is well performed, and the results mostly support the conclusions of this manuscript. However, the work will benefit from additional experiments to further support conclusions and rule out alternative explanations, filling some logical gaps with new experimental evidence and in-text clarifications, and improving writing before I can recommend publication.

      We appreciate Reviewer #1’s comments and suggestions. We have now provided additional evidence and discussion to further support our conclusions and fill the logical gaps. We have also provided alternative explanations for our data and improved the writing.

      Reviewer #2 (Public review):

      Summary:

      The regulation of kinesin is fundamental to cellular morphogenesis. Previously, it has been shown that OSM-3, a kinesin required for intraflagellar transport (IFT), is regulated by autoinhibition. However, it remains totally elusive how the autoinhibition of OSM-3 is released. In this study, the authors have shown that NEKL-3 phosphorylates OSM-3 and releases its autoinhibition.

      The authors found that NEKL-3 directly phosphorylates OSM-3 (although the method is not described clearly) (Figure 1). The phosphorylated residues are in the "elbow" of OSM-3. The authors introduced phospho-dead (PD) and phospho-mimic (PM) mutations by genome editing and found that the OSM-3(PD) protein does not form cilia and instead accumulates at the axonal tips. The phenotype is similar to that of another constitutively active mutant of OSM-3, OSM-3(G444E) (Imanishi et al., 2006; Xie et al., 2024). osm-3(PM) has shorter cilia, resembling loss-of-function mutants of osm-3 (Figure 3). The authors performed structure prediction and showed that the G444E and PD mutations change the conformation of the OSM-3 protein (Figure 3). In single-molecule assays, the G444E and PD mutants exhibited increased landing rates (Figure 4). By unbiased genetic screening, the authors identified a suppressor mutant of osm-3(PD), in which A489T occurs. The result confirms the importance of this residue. Based on these results, the authors suggest that NEKL-3 induces phosphorylation of the elbow domain and inactivates the OSM-3 motor when the motor is synthesized in the cell body. This regulation is essential for proper cilia formation.

      Strengths:

      The finding is interesting and gives new insight into how the IFT motor is regulated.

      Weaknesses:

      The methods section has not presented sufficient information to reproduce this study.

      We appreciate that Reviewer #2 is also positive about our study. We have now provided sufficient information in the revised Methods section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Major Concerns

      (1) Why do the authors think that NEKL-3 phosphorylates OSM-3 in the first place? This seems to come out of nowhere and prior evidence indicating that NEKL-3 may be phosphorylating OSM-3 is not even mentioned in the Introduction.

      We thank the Reviewer for raising this important point. Our hypothesis that NEKL-3 phosphorylates OSM-3 stems from prior findings in our lab. In a previous study (Yi et al., Traffic, 2018, PMID: 29655266), we identified NEKL-4, a member of the NIMA kinase family, as a suppressor of the OSM-3(G444E) hyperactive mutation. This discovery prompted us to explore the broader role of NIMA kinases in regulating OSM-3. Subsequent genetic screens (Xie et al., EMBO J, 2024, PMID: 38806659) revealed that both NEKL-3 and NEKL-4 suppress multiple OSM-3 mutations, further supporting their functional interaction. Given the established role of NIMA kinases in phosphorylation-dependent processes (Fry et al., JCS, 2012, PMID: 23132929; Chivukula et al., Nat. Med., 2020, PMID: 31959991; Thiel, C. et al. Am. J. Hum. Genet. 2011, PMID: 21211617; Smith, L. A. et al., J. Am. Soc. Nephrol., 2006, PMID: 16928806), we hypothesized that NEKL-3/4 may directly phosphorylate OSM-3 to modulate its activity.

      To test this hypothesis, we expressed recombinant C. elegans NEKL-3 and OSM-3 proteins and conducted in vitro phosphorylation assays. While we were unable to obtain active recombinant NEKL-4 (limitations noted in the revised text), our experiments with NEKL-3 revealed phosphorylation at residues 487-490 (YSTT motif) in OSM-3’s tail region, as confirmed by mass spectrometry. These findings are now explicitly contextualized in the Introduction and Results sections of the revised manuscript.

      Page #4, Line #11:

      “...In our previous study (Yi et al., Traffic, 2018, PMID: 29655266), a genetic screen targeting the OSM-3(G444E) hyperactive mutation identified NEKL-4, a member of the NIMA kinase family, as a suppressor of this phenotype. This finding, combined with reports that NIMA kinases regulate ciliary processes independently of their canonical mitotic roles (Fry et al., JCS, 2012, PMID: 23132929; Chivukula et al., Nat. Med., 2020, PMID: 31959991; Thiel, C. et al. Am. J. Hum. Genet. 2011, PMID: 21211617; Smith, L. A. et al., J. Am. Soc. Nephrol., 2006, PMID: 16928806), prompted us to investigate whether NIMA kinases modulate OSM-3-driven intraflagellar transport. We hypothesized that NEKL-3/4, as paralogs within this family, might directly phosphorylate OSM-3 to regulate its motility...”

      Page #4, line #26:  

      “... To determine whether NIMA kinase family members could directly phosphorylate OSM-3, we purified prokaryotic recombinant C. elegans NEKL-3/NEKL-4 and OSM-3 protein in order to perform in vitro phosphorylation assays. We were able to obtain active recombinant NEKL-3 but not NEKL-4. The in vitro phosphorylation assays showed that NEKL-3 directly phosphorylates OSM-3 (Fig. 1A-B, Appendix Table S1). Subsequent mass spectrometric analysis revealed phosphorylation at residues 487-490, which localize to the conserved "YSTT" motif within OSM-3’s C-terminal tail region ...”

      (2) The authors need to characterize the proteins they expressed and purified for in vitro ATPase and motility assays. Are these proteins monomers or dimers?

      For our in vitro ATPase and motility assays, OSM-3 was expressed in E. coli BL21(DE3) and purified using established protocols (Xie et al., EMBO J, 2024, PMID: 38806659; Imanishi et al., JCB, 2006, PMID: 17000874). To confirm its oligomeric state, we analyzed recombinant OSM-3 by size-exclusion chromatography coupled with multiangle light scattering (SEC-MALS). As reported in Xie et al. (2024), OSM-3 (~80 kDa monomer) elutes with a molecular weight of 173–193 kDa under physiological buffer conditions, consistent with a homodimeric assembly. These findings confirm that the functional unit used in our assays is the biologically relevant dimer. This characterization has been added to the revised manuscript on Page #35, Line #7.

      “…OSM-3 was expressed in E. coli BL21(DE3) and purified for in vitro assays using established protocols (REFs). Size-exclusion chromatography coupled with multiangle light scattering (SEC-MALS) (Xie et al., EMBO J., 2024) confirmed that recombinant OSM-3 forms a homodimer (173–193 kDa) under physiological conditions, ensuring its dimeric state remained intact....” 
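The dimer assignment above comes down to simple arithmetic, which can be sketched as follows. This is an illustrative calculation only, not the authors' SEC-MALS analysis: it takes the approximate 80 kDa monomer mass and the 173 to 193 kDa measured range quoted in the response, and the function names are hypothetical.

```python
# Illustrative arithmetic only: express the SEC-MALS mass range quoted in the
# response in units of the approximate OSM-3 monomer mass.
MONOMER_KDA = 80.0                    # approximate monomer mass quoted in the response
MEASURED_RANGE_KDA = (173.0, 193.0)   # SEC-MALS range quoted from Xie et al. (2024)


def monomer_equivalents(measured_kda, monomer_kda=MONOMER_KDA):
    """Return the measured mass divided by the monomer mass."""
    return measured_kda / monomer_kda


def nearest_oligomer(measured_kda, monomer_kda=MONOMER_KDA):
    """Return the integer number of chains closest to the measured mass."""
    return round(monomer_equivalents(measured_kda, monomer_kda))


if __name__ == "__main__":
    for mass in MEASURED_RANGE_KDA:
        ratio = monomer_equivalents(mass)
        print(f"{mass:.0f} kDa = {ratio:.2f} monomer equivalents, "
              f"nearest oligomer: {nearest_oligomer(mass)}")
```

Both ends of the quoted range lie closer to two chains than to one or three, which is the sense in which the measurement is "consistent with a homodimeric assembly". Because the monomer mass is only approximate, the ratios should not be read more precisely than that.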

      (3) The authors primarily used PD and PM mutations, which affect all four amino acids in the region. This may or may not be physiologically relevant. Figure 5 indicates that T489 is a critical regulatory site. However, this conclusion is undermined by reliance on PD mutations, which affect all four amino acids. Creating PM (T489E) and PD (T489A) mutations based on WT OSM-3 would better reflect physiological relevance. In vitro assays with a single phosphomimic or phospho-dead mutation at residue 489 are missing at the end of this story. This would better link Figure 5 with the rest of the manuscript.

      We thank the reviewer for this constructive critique. Below, we address the concerns and integrate new data to strengthen the link between T489 and autoinhibition:

      To probe the regulatory role of T489 phosphorylation, we generated osm-3(T489E) (phosphomimetic, PM) and osm-3(T489A) (phospho-dead, PD) mutant animals. Strikingly, both mutants formed axonal puncta (Figure S7), recapitulating the hyperactive phenotype of the OSM-3(G444E) mutant. While the similar puncta formation in PM and PD mutants initially appeared paradoxical, this observation underscores the necessity of dynamic phosphorylation cycling at T489 for proper autoinhibition. Specifically, the PD mutant (T489A) likely disrupts phosphorylation-dependent stabilization of autoinhibition, leading to constitutive activation, whereas the PM mutant (T489E) may mimic a "locked" phosphorylated state, preventing dephosphorylation-dependent release of autoinhibition in cilia and trapping OSM-3 in an aggregation-prone conformation. These results highlight T489 as a structural linchpin whose post-translational modification dynamically regulates motor activity. While the precise molecular mechanism (for example, how phosphorylation modulates tail-motor domain interactions) remains to be elucidated, our data conclusively demonstrate that perturbing T489, even in isolation, destabilizes autoinhibition, driving puncta formation and constitutive activity.

      We have integrated the above paragraph in the revised manuscript on page #8, line #27.

      (4) There seems to be a disconnect between the MT gliding assays in Figure 4C and single molecule motility assays in Figure 4E. The gliding assays show that all constructs can glide microtubules at near WT speeds. Yet, the motility assays show that WT and PM cannot land or walk on MTs. The authors need to explain why this is the case. Is this because surface immobilization of kinesin from its tail disrupts autoinhibition? Alternatively, the protein preparation may include monomers that cannot be autoinhibited and cannot land and processively walk on surface-immobilized microtubules (because they only have one motor domain) but can glide microtubules when immobilized on the surface from their tail.

      The surface immobilization of OSM-3 via its tail domain disrupts autoinhibition, a phenomenon previously observed in other kinesins such as kinesin-1 (Nitzsche et al, Methods Cell Biol., 2010, PMID: 20466139). In our assays, OSM-3 was nonspecifically immobilized on glass surfaces, enabling microtubule gliding by motors whose autoinhibition was relieved through tail anchoring. Critically, the PD and PM mutations reside in the tail region and do not alter the intrinsic properties of the motor head domain. Consequently, once autoinhibition is released via immobilization, the gliding velocities reflect the conserved motor head activity, which is expected to remain comparable across all constructs. While we cannot entirely rule out the presence of monomeric OSM-3 in solution, several lines of evidence argue against this possibility. First, the mutations are located in the elbow region, which is dispensable for motor dimerization. Second, SEC-MALS analysis from prior studies confirms that purified OSM-3 exists predominantly as dimers in solution. 

      We have discussed these issues in the revised text on page #10, line #18: 

      “…In our gliding assays, OSM-3-PM has an increased gliding speed of 0.69 ± 0.07 μm/s (Fig. 4 C-D), similar to the PD mutant. PD and PM mutations are confined to the elbow region, leaving the motor head’s mechanochemical properties intact. Upon tail immobilization, which releases autoinhibition, the gliding speeds reflect motor head activity. Single-molecule assays, however, directly resolve their native regulatory states: PD mutants are constitutively active, whereas PM mutants persist in an autoinhibited state (Fig. 4E-G). Although monomeric OSM-3 could theoretically mediate single-motor gliding, the previous SEC-MALS data demonstrate that OSM-3 purifies as stable dimers (Xie et al., EMBO J, 2024, PMID: 38806659). Thus, dimeric OSM-3 is likely the predominant functional species in our assays…”

      (5) An alternative explanation for the data is that both PD and PM mutations result in loss-of-function effects, disrupting OSM-3 activity. For instance:

      a) In Figure 2C, both mutations cause shorter cilia than the wild type (WT).

      b) In Figure 4A, both mutations result in higher ATPase activity than WT.

      c) In Figure 4D, both mutations show increased gliding velocity compared to WT. These results suggest the observed effects could stem from loss of function rather than phosphorylation-specific regulation.

      Although PD and PM mutations exhibit superficially similar "loss-of-function" phenotypes in certain assays, they mechanistically disrupt motor regulation in distinct ways:

      a) Ciliary Length (Figure 2C) PD Mutants: Hyperactivation causes OSM-3-PD to prematurely aggregate into axonal puncta, preventing ciliary entry. Consequently, cilia are built solely by the weaker Kinesin-II motor, which only constructs shorter middle segments.

      PM Mutants: OSM-3-PM retains autoinhibition during transport (enabling ciliary entry) but cannot be dephosphorylated in cilia. This blocks activation, leaving OSM-3-PM partially functional and resulting in cilia intermediate in length between WT and PD.

      We have discussed this issue in the revised text on page #5, line #30:

      “…These findings indicate that OSM-3-PM is in an autoinhibited state capable of ciliary delivery, yet fails to achieve full activation due to defective dephosphorylation. This incomplete activation results in suboptimal motor function and intermediate ciliary length phenotypes (Fig.2 B-C). In contrast, OSM-3-PD exhibits constitutive activation leading to aggregation into axonal puncta, which completely abolishes its ciliary entry capacity (Fig.2 A-B)...”

      b) ATPase Activity (Figure 4A)

      PD Mutants: Fully autoinhibition-released (98.15% of KHC ATPase activity), consistent with constitutive activation.

      PM Mutants: Show partial ATPase activity (34.28% of KHC), reflecting imperfect phosphomimicry. While the DDEE substitution introduces negative charges, it fails to fully replicate the steric/kinetic effects of phosphorylated tyrosine (pY487; the phenyl ring is absent), resulting in incomplete autoinhibition stabilization. Despite this, the residual inhibition is sufficient to phenocopy shorter cilia in vivo.

      We have discussed this issue in the revised text on page #7, line#19:

      “…The PM mutant’s partial ATPase activity (34.28% of KHC) might arise from imperfect phosphomimicry: while the DDEE substitution introduces negative charges, it lacks the steric bulk of phosphorylated tyrosine (pY487). This incomplete mimicry allows residual autoinhibition, sufficient to limit ciliary construction in vivo...”

      c) Microtubule Gliding Velocity (Figure 4D)

      Gliding Assay Limitation: Tail immobilization artificially releases autoinhibition, masking regulatory differences. Thus, all constructs (PD, PM) exhibit similar velocities (~0.7 µm/s), reflecting conserved motor head activity.

      Single-Molecule Assay (Figure 4E): Directly resolves native autoinhibition states:

      PD mutants show robust motility (autoinhibition released).

      PM mutants remain largely inactive (autoinhibition retained).

      We have discussed this issue in the revised text on page #10, line#18:

      “…In our gliding assays, OSM-3-PM has an increased gliding speed of 0.69 ± 0.07 μm/s (Fig. 4 C-D), similar to the PD mutant. PD and PM mutations are confined to the elbow region, leaving the motor head’s mechanochemical properties intact. Upon tail immobilization, which releases autoinhibition, the gliding speeds reflect motor head activity. Single-molecule assays, however, directly resolve their native regulatory states: PD mutants are constitutively active, whereas PM mutants persist in an autoinhibited state (Fig. 4E-G)...”

      Minor Suggestions and Concerns

      (1) Lines 60-66: References that support these observations are missing from this section.

      We have added the relevant references.

      (2) Lines 66-67: I would revise this sentence as "It remains unclear how OSM-3 becomes enriched...".

      We have made the changes.

      (3) Line 85: The authors should describe how they perform these assays (i.e. recombinantly expressed NEKL-3 and OSM-3, are these C. elegans proteins, and which expression system was used...).

      We have described them in the main text and Methods.

      Page #4 line #26

      “...To determine whether NIMA kinase family members could directly phosphorylate OSM-3, we purified prokaryotic recombinant C. elegans NEKL-3/NEKL-4 and OSM-3 protein in order to perform in vitro phosphorylation assays...”

      Page #35 line#12

      “...Briefly, point mutations were introduced into the pET.M.3C OSM-3-eGFP-His6 plasmid for prokaryotic expression. Plasmid-transformed E. coli (BL21) was cultured at 37°C and induced overnight at 23°C with 0.2 mM IPTG. Cells were lysed in lysis buffer (50 mM NaPO4 pH8.0, 250 mM NaCl, 20 mM imidazole, 10 mM bME, 0.5 mM ATP, 1 mM MgCl2, Complete Protease Inhibitor Cocktail (Roche)) and Ni-NTA beads were applied for affinity purification. After incubation, beads were washed with wash buffer (50 mM NaPO4 pH6.0, 250 mM NaCl, 10 mM bME, 0.1 mM ATP, 1 mM MgCl2) and eluted with elution buffer (50 mM NaPO4 pH7.2, 250 mM NaCl, 500 mM imidazole, 10 mM bME, 0.1 mM ATP, 1 mM MgCl2). Protein concentration was determined by standard Bradford assay. C. elegans nekl-3 cDNA was cloned into the pGEX-6P GST vector, expressed in E. coli BL21 (DE3) and purified for in vitro phosphorylation assays. Plasmid-transformed E. coli (BL21) was cultured at 37°C and induced overnight at 18°C with 0.5 mM IPTG. Cells were lysed in lysis buffer (50 mM NaPO4 pH8.0, 250 mM NaCl, 1 mM DTT, Complete Protease Inhibitor Cocktail (Roche)) and GST beads were applied for affinity purification. After incubation, beads were washed with wash buffer (50 mM NaPO4 pH6.0, 250 mM NaCl, 1 mM DTT) and eluted with elution buffer (50 mM NaPO4 pH7.2, 150 mM NaCl, 10 mM GSH, 1 mM DTT). Purified proteins were dialyzed against storage buffer (50 mM Tris-HCl, pH 8.0, 150 mM NaCl). Protein concentration was determined by standard Bradford assay...”

      (4) Line 141: The first sentence of this paragraph lacks motivation. I would start this sentence with "To directly observe the effects of phosphor mutants in the elbow region in microtubule binding and motility of OSM-3, we...".

      We have made the change.

      (5) Figure 1B: The mass spectrometry data in Figure 1B lacks adequate explanation. The Methods section should detail the experimental protocol, data interpretation, and any databases used. Additionally, the manuscript should list all identified phosphorylation sites on OSM-3 to provide context, including whether Y487_T490 is the major site.

      We have provided the detailed experimental protocol, data interpretation, and databases used in methods. We have provided all identified sites as Appendix table S1.

      (6) Figure 1C: Is it possible to model the effect of PM and PD mutations using AlphaFold? The authors should also show PAE or pLDDT scores of their model.

      AlphaFold does not model the effects of point mutations well, so we used Rosetta relax to capture their possible conformational changes, as shown in the revised Figure 3. We have provided PAE and pLDDT scores in a new figure, Figure S2.

      (7) Figure 2D: The unit for speed should use a lowercase "s" for seconds.

      We have fixed it.

      (8) Figure 3: I am not sure whether this figure stands for a main text figure on its own, as it is only a Rosetta prediction and is not supported by any experimental data. In addition, it remains unclear what the labels on the x-axis mean.

      We have updated the figure and explained the labels on the x-axis in Figure S4 to make it more reader-friendly.

      (9) Figure 4: NEKL-3-treated OSM-1 should be included as a positive control in the in vitro experiments.

      We suspect that the Reviewer asked for NEKL-3-treated OSM-3. 

      In our other study, which has just been accepted by the Journal of Cell Biology, NEKL-3-treated OSM-3 showed significantly reduced affinity for microtubules and very low ATPase activity. We have cited and discussed this in the revised text on page #10, line #28:

      “…As demonstrated in our recent study (Huang et al., JCB, 2025, In press, attached), phosphorylation of OSM-3 by NEKL-3 at two distinct regions, Ser96 and the conserved "elbow" motif, differentially regulates its activity and localization. Phosphorylation at Ser96 reduces OSM-3’s ATPase activity and alters its ciliary distribution from the distal segment to a uniform localization, while elbow phosphorylation induces autoinhibition, retaining OSM-3 in the cell body. Strikingly, in vitro phosphorylation of OSM-3 by NEKL-3 significantly reduces its microtubule-binding affinity, likely arising from combined modifications at both sites. We propose a model wherein elbow phosphorylation ensures anterograde ciliary transport, while Ser96 phosphorylation fine-tunes distal segment targeting. This multistep regulation may involve distinct phosphatases to reverse phosphorylation at specific sites, a hypothesis warranting further investigation….”

      (10) Figure 4C, D, and F: The unit of velocity is wrong. The authors should use the same units they used in the table shown in Figure 4B.

      We have fixed these errors.

      (11) Figure 4F: The velocity of PD is a lot lower than G444E. Therefore, it would be more appropriate to refer to PD as partially active, rather than hyperactive.

      We have made the change. 

      (12) Figure 5: There is too much genetics jargon on this figure (EMF, F2, 100%Dyf,...). How are the alleles numbered? Is it OK to refer to them as Alleles 1 and 2 for simplicity?

      According to the established C. elegans allele nomenclature, each worm allele has a unique number named after the lab code for identification. We have simplified the labels and updated the figure to make it more reader-friendly.

      (13) Figure 5E: A plot would be more reader-friendly than a table. Additionally, the legend for Fig. 5E mistakenly refers to it as "D."

      We have changed the table to a plot and fixed the mistakes. We thank the Reviewer for pointing them out.

      Reviewer #2 (Recommendations for the authors):

      (1) The model appears as if NEKL-3 induces dephosphorylation of OSM-3 (Figure 6). This is not consistent with the conclusions described in the Discussion and is confusing.

      We have updated the model figure and fixed the error.

      (2) It should be described why the authors hypothesized that NEKL-3 phosphorylates OSM-3. Was there genetic evidence? Did the authors screen cilia-related kinases, or did they identify it incidentally? Providing this information would help readers to understand the context of the research.

      We appreciate both Reviewers for pointing out this issue. 

      Our hypothesis that NEKL-3 phosphorylates OSM-3 stems from prior findings in our lab. In a previous study (Yi et al., Traffic, 2018, PMID: 29655266), we identified NEKL-4, a member of the NIMA kinase family, as a suppressor of the OSM-3(G444E) hyperactive mutation. This discovery prompted us to explore the broader role of NIMA kinases in regulating OSM-3. Subsequent genetic screens (Xie et al., EMBO J, 2024, PMID: 38806659) revealed that both NEKL-3 and NEKL-4 suppress multiple OSM-3 mutations, further supporting their functional interaction. Given the established role of NIMA kinases in phosphorylation-dependent processes (Fry et al., JCS, 2012, PMID: 23132929; Chivukula et al., Nat. Med., 2020, PMID: 31959991; Thiel, C. et al. Am. J. Hum. Genet. 2011, PMID: 21211617; Smith, L. A. et al., J. Am. Soc. Nephrol., 2006, PMID: 16928806), we hypothesized that NEKL-3/4 may directly phosphorylate OSM-3 to modulate its activity.

      To test this hypothesis, we expressed recombinant C. elegans NEKL-3 and OSM-3 proteins and conducted in vitro phosphorylation assays. While we were unable to obtain active recombinant NEKL-4 (limitations noted in the revised text), our experiments with NEKL-3 revealed phosphorylation at residues 487-490 (YSTT motif) in OSM-3’s tail region, as confirmed by mass spectrometry. These findings are now explicitly contextualized in the Introduction and Results sections of the revised manuscript.

      Page #4, Line #11:

      “... In our previous study (Yi et al., Traffic, 2018, PMID: 29655266), a genetic screen targeting the OSM-3(G444E) hyperactive mutation identified NEKL-4, a member of the NIMA kinase family, as a suppressor of this phenotype. This finding, combined with reports that NIMA kinases regulate ciliary processes independently of their canonical mitotic roles (Fry et al., JCS, 2012, PMID: 23132929; Chivukula et al., Nat. Med., 2020, PMID: 31959991; Thiel, C. et al. Am. J. Hum. Genet. 2011, PMID: 21211617; Smith, L. A. et al., J. Am. Soc. Nephrol., 2006, PMID: 16928806), prompted us to investigate whether NIMA kinases modulate OSM-3-driven intraflagellar transport. We hypothesized that NEKL-3/4, as paralogs within this family, might directly phosphorylate OSM-3 to regulate its motility...”

      Page #4, line #26: 

      “... To determine whether NIMA kinase family members could directly phosphorylate OSM-3, we purified prokaryotic recombinant C. elegans NEKL-3/NEKL-4 and OSM-3 protein in order to perform in vitro phosphorylation assays. We were able to obtain active recombinant NEKL-3 but not NEKL-4. The in vitro phosphorylation assays showed that NEKL-3 directly phosphorylates OSM-3 (Fig. 1A-B, Appendix Table S1). Subsequent mass spectrometric analysis revealed phosphorylation at residues 487-490, which localize to the conserved "YSTT" motif within OSM-3’s C-terminal tail region...”

      (3) It is curious that the authors have not addressed the cilia phenotype and the localization of OSM-3 in nekl-3 mutants. Regardless of whether these observations agree with the proposed mechanism, it is essential for the authors to show and discuss the cilia phenotype and OSM-3 localization in nekl-3 mutants.

      We thank the Reviewer for highlighting this critical point. Indeed, nekl-3 null mutants are inviable due to essential mitotic roles (Barstead et al., 2012, PMID: 23173093), precluding direct analysis of ciliary phenotypes. To bypass this limitation, we recently generated nekl-3 conditional knockouts (cKOs) in ciliated neurons (Huang et al., JCB, 2025 in press, attached). In these mutants, OSM-3, which is normally enriched in the ciliary distal segment, becomes uniformly distributed along the cilium. This redistribution correlates with premature activation of OSM-3-driven anterograde motility in the ciliary middle region, consistent with our proposed model in which NEKL-3 phosphorylation suppresses OSM-3 activity. We have now integrated this result and discussion into the revised manuscript, reinforcing the physiological relevance of NEKL-3-mediated regulation in ciliary transport.

      Page #6 line #10

      “… While nekl-3 null mutants are inviable due to essential mitotic roles (Barstead et al., 2012, PMID: 23173093), conditional knockout (cKO) of nekl-3 in ciliated neurons (Huang et al., JCB, 2025 in press, attached) revealed its critical role in regulating OSM-3 dynamics. In nekl-3 cKO animals, OSM-3, normally enriched in the ciliary distal segment, redistributed uniformly along the cilium, concomitant with premature activation of anterograde motility in the middle ciliary region. This phenotype aligns with our model wherein NEKL-3 phosphorylation suppresses OSM-3 activity, ensuring spatiotemporal regulation of IFT.…”

      (4) The methods section lacks some information, which is critical to reproducing this study.

      We have now provided detailed information in the methods section in the revised manuscript.

      (a) It is not described how the authors determined phosphorylation of OSM-3 by NEKL-3. In methods, nothing is described about the assay.

      We performed in vitro phosphorylation assays using recombinant OSM-3 and NEKL-3 purified from bacteria. We then used LC-MS/MS to identify phosphorylation sites. We have now updated the Methods section to include all the information.

      Page #4 line #26

      “... To determine whether NIMA kinase family members could directly phosphorylate OSM-3, we purified prokaryotic recombinant C. elegans NEKL-3/NEKL-4 and OSM-3 protein in order to perform in vitro phosphorylation assays. We were able to obtain active recombinant NEKL-3 but not NEKL-4. The in vitro phosphorylation assays showed that NEKL-3 directly phosphorylates OSM-3 (Fig. 1A-B, Appendix Table S1). Subsequent mass spectrometric analysis revealed phosphorylation at residues 487-490, which localize to the conserved "YSTT" motif within OSM-3’s C-terminal tail region...”

      Page #36, line #19

      “In vitro phosphorylation assay

      20 μM purified OSM-3 was incubated with 1 μM GST-NEKL-3 at 30 °C in 100 μL reaction buffer (50 mM Tris-HCl pH 8.0, 10 mM MgCl2, 150 mM NaCl, and 2 mM ATP) for 30 min. The reaction was terminated by boiling for 5 min in SDS sample buffer.

      Mass spectrometry

      Following NEKL-3 treatment, OSM-3 proteins were resolved by SDS-PAGE and visualized with Coomassie Brilliant Blue staining. Protein bands corresponding to OSM-3 were excised and subjected to digestion using the following protocol: reduction with 5 mM TCEP at 56°C for 30 min; alkylation with 10 mM iodoacetamide in darkness for 45 min at room temperature, and tryptic digestion at 37°C overnight with a 1:20 enzyme-to-protein ratio. The resulting peptides were subjected to mass spectrometry analysis. Briefly, the peptides were analyzed using an UltiMate 3000 RSLCnano system coupled to an Orbitrap Fusion Lumos mass spectrometer (Thermo Fisher Scientific). We applied an in-house proteome discovery searching algorithm to search the MS/MS data against the C. elegans database. Phosphorylation sites were determined using PhosphoRS algorithm with manual validation of MS/MS spectra.”
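For readers setting up the reaction described in the quoted Methods, the concentrations translate into absolute amounts as sketched below. This is a worked-arithmetic illustration under the stated conditions only (20 μM OSM-3, 1 μM GST-NEKL-3 and 2 mM ATP in a 100 μL reaction); the helper name `nmol_in_reaction` is hypothetical and nothing here comes from the authors' own scripts.

```python
# Worked arithmetic for the quoted in vitro phosphorylation reaction.
# Concentrations and volume are taken from the quoted Methods text;
# the helper name is hypothetical.
REACTION_VOLUME_UL = 100.0
OSM3_UM = 20.0        # substrate concentration
NEKL3_UM = 1.0        # GST-NEKL-3 (kinase) concentration
ATP_UM = 2000.0       # 2 mM ATP expressed in micromolar


def nmol_in_reaction(conc_um, volume_ul=REACTION_VOLUME_UL):
    """Convert a micromolar concentration to nanomoles in the given volume.

    micromolar x microlitres gives picomoles, so divide by 1000 for nanomoles.
    """
    return conc_um * volume_ul / 1000.0


if __name__ == "__main__":
    osm3 = nmol_in_reaction(OSM3_UM)
    nekl3 = nmol_in_reaction(NEKL3_UM)
    atp = nmol_in_reaction(ATP_UM)
    print(f"OSM-3: {osm3:.1f} nmol, GST-NEKL-3: {nekl3:.1f} nmol, ATP: {atp:.0f} nmol")
    print(f"substrate:kinase molar ratio = {osm3 / nekl3:.0f}:1")
    print(f"ATP per OSM-3 molecule = {atp / osm3:.0f}")
```

Under these conditions the substrate is in 20-fold molar excess over the kinase and ATP is in 100-fold molar excess over the substrate.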

      (b) The method of structural prediction by AlphaFold2 and LocalColabFold needs clarification. In general, the prediction gives several candidates. How did the authors choose one of these candidates?

      We generated five candidate models, all of which showed similar conformations. We therefore chose the model with the highest confidence. We have provided PAE and pLDDT as additional data in Figure S2 and discussed them in the revised text on page #4, line #32:

      “...To gain structural insights from this motif, we employed LocalColabFold based on AlphaFold2 to predict the dimeric structure of OSM-3 (Evans et al., 2022; Jumper et al., 2021; Mirdita et al., 2022). The highest-confidence model was selected for further analysis (Fig. 1C, Fig. S2)...”
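The selection step described above (five candidates, keep the most confident) can be sketched as follows. This is a minimal illustration, assuming that "confidence" is summarised as the mean per-residue pLDDT; the response does not say which metric was used, and the model names and scores below are made-up placeholders rather than values from the study.

```python
# Minimal sketch: pick the candidate model with the highest mean pLDDT.
# ASSUMPTION: mean per-residue pLDDT is used as the confidence summary;
# the response does not state which metric the authors ranked by.
from statistics import mean


def pick_highest_confidence(plddt_by_model):
    """Return (model_name, mean_plddt) for the model with the highest mean pLDDT."""
    if not plddt_by_model:
        raise ValueError("no candidate models supplied")
    scores = {name: mean(values) for name, values in plddt_by_model.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Placeholder per-residue pLDDT values for five hypothetical candidates.
    candidates = {
        "model_1": [82.0, 79.5, 88.0],
        "model_2": [80.0, 78.0, 85.0],
        "model_3": [75.0, 81.0, 83.0],
        "model_4": [70.0, 72.5, 80.0],
        "model_5": [68.0, 74.0, 79.0],
    }
    name, score = pick_highest_confidence(candidates)
    print(f"selected {name} (mean pLDDT {score:.1f})")
```

Because all five candidates were reported to adopt similar conformations, the choice of ranking metric is unlikely to change the structural conclusions; the sketch only makes the selection rule explicit.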

      (c) The methods to predict conformational changes by introducing various point mutations are interesting (Figure 3). However, the methods require more detailed descriptions. In the current form, the manuscript only lists the tools used. The pipelines and parameters need to be described. This information is important because AlphaFold-based predictions often give folded conformations because the training data are mainly composed of folded proteins. It is surprising that the methods applied here give open conformations induced by point mutations.

      We have described the pipelines in the revised Methods section on page#34, line#25: 

      “…The OSM-3 model was predicted using LocalColabFold (Evans et al., 2022; Jumper et al., 2021; Mirdita et al., 2022). Mutated proteins were designed in PyMOL 2.6, choosing the rotamer of the mutated residues in the G444E, PM and PD models with the least clash as the initial conformation. To predict mutation-induced conformational changes, the initial models were subjected to PyRosetta (Chaudhury et al., 2010). The energies of pre-relaxed models were evaluated with the Rosetta Energy Function 2015 (Alford et al., 2017), and the relax procedure was then applied with default parameters to minimize the energy of these models. In detail, the classic relax mover was used with default settings, and the relaxed models were visualized in PyMOL. The relax script has been uploaded to GitHub: https://github.com/young55775/RosettaRelax_for_OSM3...”

      (5) The authors have purified proteins. Do they show different properties in gel filtration that are consistent with the structural prediction? It is anticipated that open-form mutants are eluted earlier than closed forms.

      We thank the reviewer for this insightful suggestion. Indeed, our recent study showed that the open form of the active OSM-3(G444E) mutant eluted earlier than the wild-type closed form (Xie et al., EMBO J., 2024). While the current study did not perform gel filtration chromatography (SEC) to directly compare the hydrodynamic properties of the OSM-3 mutants, our functional assays provide robust evidence for the conformational changes predicted by structural modeling. For example, ATPase activity assays revealed that the open-state mutants (e.g., the G444E and PD mutants) exhibited significantly enhanced enzymatic activity (Figure 4A), consistent with structural predictions of an active, destabilized autoinhibitory interface (Figure 3A). These functional readouts collectively validate the predicted structural states. While SEC could further corroborate these findings by distinguishing compact (closed) versus extended (open) conformations, we prioritized assays that directly link structural predictions to in vitro enzymatic activity and in vivo ciliary transport dynamics. Future studies incorporating SEC or cryo-EM will provide additional biophysical validation of these states.

      We have revised the text in the manuscript (Page #7, Lines #22): 

      “…Notably, the open-state OSM-3 mutants (e.g., G444E) displayed elevated ATPase activity, consistent with structural predictions of autoinhibition release (Fig. 3A, Fig. 4A) (Xie et al., 2024). While hydrodynamic profiling (e.g., SEC) could further resolve conformational states, our functional assays directly connect predicted structural changes to altered biochemical and cellular activity...”

      Minor point

      (1) Line 85 "MIMA kinase family" should be "NIMA kinase family".

      We have corrected the typo and thank the Reviewer for pointing it out.

      (2) M.S. and D.S. need to be defined in Figure 2D.

      We have updated the figures.

    1. The reception of Saiáwush by Afrásiyáb was warm and flattering. From the gates of the city to the palace, gold and incense were scattered over his head in the customary manner, and exclamations of welcome uttered on every side.
         "Thy presence gives joy to the land,
         Which awaits thy command;
         It is thine! it is thine!
         All the chiefs of the state have assembled to meet thee,
         All the flowers of the land are in blossom to greet thee!"

      This version of the epic leans into the celebration that Saiáwush received when arriving in Túrán. It generally seems to emphasize how beloved he was. Through this, differences between the Iranians and the Turanians are marked. As soon as Saiáwush arrives, the people essentially beg him to rule over them. This suggests that he is equipped to do what the locals cannot, therefore he should have power, putting Iranians in a favorable position and building a sense of national pride. At the same time, a distinction between them and the foreign group is established. CC BY-NC-SA

    1. 01:16:46:13 Surphanaka is the one with the really ugly nose.

      Be it through caricatures or descriptions, differences in physical features have always been an incredibly common way of othering groups. Surphanaka, Ravana's sister, having a so-called “ugly nose” exemplifies this phenomenon. A similar sentiment about Ravana is also expressed: “Your ugly yellow eyes should fall out of your head as you stare at me so lustfully, Ravana.” (Paley, 00:27:16 - 00:27:21). In his case, Ravana’s yellow eyes serve to dehumanize him. Where distinct features emphasize a difference, inhuman features—like yellow eyes—only widen that gap. CC BY-NC-SA

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for providing valuable comments and suggestions for improving the manuscript.

      Response to reviewer comments:

      Reviewer-1

      Comment 1: The major concern is that the study lacks rigor in several areas where n=2 and results are not quantified with statistics. They need to run a power analysis and increase their sample sizes. Please include statistics on all measurements. Filamentous actin and alpha-SMA staining is used to visualize mechanosensing, but these markers are also involved in other cell activities such as cell contractility for movement, cell-to-substrate adhesion, cell division, etc. They need to query more mechanosensing-related pathways (Piezo1/2, Yap/taz-Hippo, integrin-Focal Adhesion Kinase, etc.) to show that mechanosensing changed.

      Response: We have increased the sample size to a minimum of n=3 in most cases. However, a few experiments will require more time to increase sample size, as mentioned below.

      Our data emphasized the role of Rac1 and SRF. We understand that other molecular players may also be involved in sensing or responding to mechanical forces, but surveying multiple families of candidates without a specific hypothesis or functional experiment is beyond the scope of this study.

      Comment 2: Fig. 1: In panel E, the cranial bone area measurement is not normalized to mitigate the possible effect of individual differences.

      Response: We have re-quantified the data with normalization to the length of the skull.

      Comment 3: In Fig. 2 the authors mentioned many phenotypical changes (bone length changes, gap thickness change, apex thickness change, etc.) based on histology stain, but none of them are quantified to show a significant difference between Rac1-WT and Rac1-KO.

      Response: In Fig. 2A, we present the gross morphology of the Rac1-KO embryos and only discuss the tissue defects like edema, hematoma, and hypoplasia, confirmed through H&E as shown in Fig. 2C. We also show the apical limits of the intact calvaria in Fig. 2D, consistent with the calvaria defects observed at birth. In fact, we do not discuss any “bone length changes, gap thickness, or apex thickness change” in this section as suggested by the reviewer. To address the request for more quantification we have added measurement of the edematous area of the apical mesenchyme at E14.5 (Fig. 2C), and this is now shown in Suppl. Fig. 1E. We also added quantification of embryo genotypes and Chi-square tests, now shown in Suppl. Fig. 1D.

      Comment 4: Fig. 2: In panel D, only 2 embryos per group is not enough for quantitation.

      Response: We plan to increase the number of embryos during the revision period.

      Comment 5: Fig. 2 In panel D, the two arrows in the Rac1-KO mutants are not easy to catch.

      Response: We made the arrows bigger and bolder.

      Comment 6: Fig. 3 The thickness quantification is not performed.

      Response: We added quantification in Fig. 3D.

      Comment 7: Fig. 3 The images show an obvious curve change of the apex between the control and mutant. Such change is not discussed in the results. Is it due to histology issue?

      Response: We do not think it is due to technical issues but reflects a real change in the shape of the apex of the head. We modified the graphical representation in Figure 3E to reflect this change in curvature. We also added the following sentence to the results on page 7: “We also noted a loss of curvature in the apex of the Rac1-KO head at E13.5, which correlated with loss of aSMA+ mesenchymal cells and thinning of the EMM (Fig. 3E).”

      Comment 8: The merged layer did not show S100a6. While the authors are showing apical expansion of the mesenchyme toward the dermis and meninges, it is hard to track where they are without a merged image.

      Response: We added merged images.

      Comment 9: Fig. 4 In panel B, 2 biological replicates per genotype are very low.

      Response: The effect of Rac1-KO on cell cycle is already known (Moore et al. 1997; Nikolova et al. 2007; Gahankari et al. 2021), and our result is supported by in vivo quantification of Tom+Edu+ cells in different regions of the embryonic head shown in Fig. 4A. We prefer not to repeat this assay.

      Comment 10: Fig. 4 There is no cell death data.

      Response: We will generate data on cell death during the revision period.

      Comment 11: Fig. 5: In panel B, the GAPDH western blot bands in the mutants seem to be thinner than those of controls.

      Response: We verified equal loading with a Ponceau stain, so this minor change in the GAPDH level could be due to biological differences in the protein level. Nevertheless, by our estimation this minor difference does not explain away the major difference in Rac1 and Srf levels.

      Comment 12: Though the immunostain showed a decrease in signal intensity, it is hard to know whether the decrease is significant enough across all Rac1-KO mutants. They need to measure the fluorescence intensity and perform statistics.

      Response: We will generate better images of SRF staining and quantify the difference between Rac1-WT and Rac1-KO during the revision period.

      Comment 13: Fig. 6: Similar to Fig. 2, there is no quantification, and n=1 per genotype is not enough.

      Response: During the revision period we will increase the number of E12.5 Srf-KO and Srf-WT embryos to n=3 for Figure 6G. All other panels currently have n=7 or greater.

      Comment 14: Fig. 7: Need quantification between Srf-KO and Rac1-KO with statistics to show they are not different, but both significantly different from WTs

      Response: In Figure 7D we have added quantification of aSMA area in Srf-KO and Rac1-KO. These results show that both mutants have a similar phenotype with reduced aSMA expression compared to their respective WT littermates, which supports the conclusion that they work in the same pathway. We do not agree with the reviewer that the two mutants should show no statistical difference, because Rac1 and Srf are different genes with overlapping but also non-overlapping functions. During the revision period we will add more Srf-KO embryos and repeat the statistical analysis.

      Comment 15: Supplement Fig.2: No image showing the time point before E11.5.

      Response: We will add an E10.5 time point during the revision period.

      Comment 16: Supplement Fig. 3: The ventral view of Rac1-WT is not shown at the same angle as that of Rac1-KO. This makes it harder to see the difference between control and mutant.

      Response: We adjusted the brightness/contrast to make the difference clearer.

      Comment 17: Supplement Fig.4 &7: The alkaline phosphatase stained area needs to be normalized to some other metric because the embryos could be different size.

      Response: We normalized to the width of the eye, and this is now shown in Suppl. Fig. 4 and 7.

      Comment 18: Supplement Fig 6 A: The legend and figure don't match. Is it E13.5 or 14.5. Panel 6B needs better images without curling of the tissue.

      Response: This has been fixed. The immunostaining images in Suppl. Fig. 6A are from E14.5. Panel B is now replaced with better images in the revised manuscript.


      Reviewer-2

      Comment 1.1: In Fig. 5, links between Rac1, SRF, αSMA, and contractility in mesenchymal cells are shown. Molecular analyses (Western blot and qPCR) were performed using primary cultured mesenchymal cells (prepared after being freed from the epidermal population). Although use of cells prepared from E18.5 embryos may have been chosen by the authors for the safe isolation of the mesenchymal population without contamination of epidermal cells, this reviewer finds that anti-SRF immunoreactivity is weaker at E13.5 than at E12.5 (throughout the section including the mesencephalic wall) and therefore wonders whether SRF expression changes in a stage-dependent manner. So, simply borrowing results obtained from E18.5-derived cells to describe the scenario around E12.5 and E13.5 is the one slightly disappointing point in this study.

      Response: In fact, the reason we chose E18.5 was to get enough cells to do the experiments in Figure 5A-D without extensive passaging and/or immortalization, which would undoubtedly cause the cells to deviate from their in vivo character as they become adapted to growing on plastic with 10% serum. Therefore, we prefer not to change the cells as suggested by the reviewer.

      Comment 1.2: In Fig. 5F, it is difficult to clearly see "reduction" of SRF immunoreactivity in Rac1-KO. Therefore, quantification of %SRF+/totalTomato+ would be desired.

      Response: We will generate better images of SRF staining and quantify the difference between Rac1-WT and Rac1-KO during the revision period.

      Comment 1.3: Separately, direct comparison of spontaneous centripetal shrinkage of the apical/dorsal scalp tissues, which will occur in 30 min when prepared at E12.5 or E13.5 (Tsujikawa et al., 2022), between WT and Rac1-KO would strengthen the results in Fig. 5D. As KO is specific to the mesenchyme, the authors do not have to worry about removal of the epidermal layer (which would be much more difficult at E12.5-13.5 than E18.5). If the degree of centripetal shrinkage of the "epidermis plus mesenchyme" layers were smaller in Rac1-KO, it would be interpreted to be mainly due to poorer recoiling activity and contractility of the Rac1-KO mesenchymal tissue.

      Response: We will try to perform the centripetal shrinkage assays as shown by Tsujikawa et al. during the revision period.

      Comment 2: The authors favor "apical" vs. "basolateral" to tell the relative positions in the embryonic head, not only in the adult head. But "apical" vs. "basolateral" should be accompanied with dorsal vs. ventral at least at the first appearance. Apical-to-basal axis or apex vs. basolateral by itself can provide, in many contexts, impressions that epithelial layers/cells are being discussed. Please note that the authors also use "caudal" (in the embryonic head). Usually, a universally defined anatomical axis perpendicular to the rostral-to-caudal axis is the dorsal-to-ventral axis.

      Response: Apologies for confusing terminology. The terminology is now defined uniformly according to the anatomical axis.

      Comment 3: One of the authors' statements in ABSTRACT "In control embryos, α-smooth muscle actin (αSMA) expression was spatially restricted to the apical mesenchyme, suggesting a mechanical interaction between the growing brain and the overlying mesenchyme" and a similar one in RESULTS "αSMA was not detected in the basolateral mesenchyme of either genotype from E12.5-E14.5 (Suppl. Fig. 4A), suggesting restriction of the mechanosensitive cell state to the apical mesenchyme" need to be at least partly revised, taking previous publication about the normal αSMA pattern in the embryonic head into account more carefully. Tsujikawa et al. (2022) described "Low-magnification observations showed superficial immunoreactivity for alpha smooth muscle actin (αSMA), which has been suggested to function in cells playing force-generating and/or constricting roles; this immunoreactivity was continuously strong throughout the dorsal (calvarial) side of the head but not ventrally toward the face, producing a staining pattern similar to a cap (Figure 2A)" . Therefore, in this new paper, descriptions like "we observed ...., consistent with ....(2022)" or "we confirmed .... (2022)" would be more accurate and appropriate regarding this specific point. Such a minor change does not reduce this study's overall novelty at all.

      Response: Thank you for the correction. We have replaced the terminology and cited the article (Tsujikawa et al., 2022) appropriately, crediting their finding.

      Comment 4: It would be very helpful if the authors provide a schematic illustration in which physiological and pathological scenarios (at the molecular, cellular, and tissue levels found or suggested by this study) are shown.

      Response: We have added a schematic representation of the molecular changes in apical head development caused by Rac1- and Srf-KO; it is shown in Suppl. Fig. 7C.


      Comment 5: Despite being put in the title, "mechanosensing" by mesenchymal cells is not directly assessed in this study. If appropriate, something like "mechano-functioning" would be closer to what the authors demonstrated.

      Response: We changed the title to refer to “mechano-responsive mesenchyme”. We think this is appropriate because the cells of interest have reduced aSMA and reduced proliferation, both of which are known to occur, at least in part, as responses to mechanical inputs.

      Reviewer-3

      Comment 1: Prrx1-Cre targets calvarial mesenchyme, and Suzuki et al., 2009 showed that Prrx1-Cre mediated loss of Rac1 led to a calvarial bone phenotype due to incomplete fusion of the skull. While this phenotype was not studied in detail, the statement in the intro and discussion that the calvarial phenotype has not been recapitulated in mice is incorrect.

      Response: Suzuki et al. showed incomplete fusion of the skull. Although the skull is a tissue that is affected in AOS, incomplete fusion is not akin to the scalp and calvaria aplasia that typifies AOS. Our result stands apart from this. We clarified our position as such:

      Introduction (page 4): “Nevertheless, the calvaria phenotype seen in AOS individuals has not been explored in detail or fully recapitulated in mice.”

      Discussion (page 11): “Previous studies have demonstrated the role of Rac1 in mesenchyme-derived tissues, but they did not recapitulate AOS phenotypes.”

      Comment 2: The authors show that Pdgfra-Cre induced knockout of Rac1 leads to lower-than-expected numbers of Rac1-cKO embryos at E18.5 and P1. Phenotypic analysis shows that the earliest phenotype is blebbing and hematoma in the nasal region at E11.5/12.5. It is stated that this was resolved at E18.5. It is unclear if this is truly a resolution of the phenotype or that these embryos fail to survive until E18.5. Do 100% of the Rac1-cKO embryos exhibit the blebbing/hematoma at E11.5/12.5? What is the observed number/percentage of Rac1-cKO embryos at E11.5/12.5? If the observed percentage of Rac1-cKO is similar to that at E18.5 (lower than the expected 25%), this would support resolution. If the observed ratio is as expected at E11.5/12.5, then this would support embryonic loss before E18.5 rather than phenotypic resolution.

      Response: Please note that 100% (n=12) of E12.5 Rac1-KO embryos displayed nasal and mild caudal edema as exhibited in Fig. 2A, but none (n=16) had blebbing/hematoma by E18.5. We added tables for the number of embryos recovered at E12.5 and E18.5 to Supplemental Figure 1. These results show that the percentage of mutants at E12.5 was 21.42%, not significantly different from the expected frequency (p = 0.5371). At E18.5, the percentage dropped slightly to 18.3%, but still not significantly different from expected (p = 0.1545). The significant change in frequency of blebbing/hematoma from E12.5 to E18.5, without any significant change in the frequency of mutants, supports phenotypic resolution of the early blebbing/hematoma.
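      The frequency comparison above can be illustrated with a minimal sketch of a chi-square goodness-of-fit test against the expected 25% mutant frequency. The embryo counts used here (12 of 56 at E12.5, 16 of 87 at E18.5) are not stated in the response; they are assumptions back-calculated from the reported percentages and n values, and the tables in Supplemental Figure 1 remain the authoritative source. The function name is hypothetical, and the response does not say which software was used for the test.

```python
import math


def chi_square_gof_two_classes(observed_mutants, total, expected_fraction=0.25):
    """Chi-square goodness-of-fit test (1 degree of freedom) comparing
    mutant vs. non-mutant counts with an expected mutant fraction."""
    expected_mutants = total * expected_fraction
    expected_others = total - expected_mutants
    observed_others = total - observed_mutants
    chi2 = ((observed_mutants - expected_mutants) ** 2 / expected_mutants
            + (observed_others - expected_others) ** 2 / expected_others)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# ASSUMED counts, inferred from the reported 21.42% and 18.3% (not from the source tables).
chi2_e125, p_e125 = chi_square_gof_two_classes(observed_mutants=12, total=56)
chi2_e185, p_e185 = chi_square_gof_two_classes(observed_mutants=16, total=87)

print(f"E12.5: chi2 = {chi2_e125:.3f}, p = {p_e125:.4f}")
print(f"E18.5: chi2 = {chi2_e185:.3f}, p = {p_e185:.4f}")
```

      With these assumed counts, the sketch gives p-values of roughly 0.54 and 0.15, in line with the values reported above; neither stage departs significantly from the expected frequency.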

      Comment 3: It is stated that brain shape is altered in Rac1-cKO embryos at E14.5 and E18.5 and concluded that these shape differences are secondary to the cranial defects. Pdgfra+ cells give rise to the meninges, and if the Pdgfra-Cre line recapitulates this expression, then loss of the ubiquitously expressed Rac1 in the meninges could lead to a primary defect in the brain, which may lead to secondary defects in the calvarium and scalp. Their conclusion should recognize other possibilities.

      Response: We agree it is possible that there are meninges defects that secondarily change the shape of the brain, and we added a mention of this possibility. It is highly unlikely that scalp defects are only secondary to brain changes because the first observable phenotypes are in the EMM that gives rise to the scalp.

      Comment 4: The TdTom staining in wholemount at E13.5 (Supplemental Figure 2B) is difficult to appreciate in the image shown.

      Response: At E11.5 there is good contrast between labeled cranial structures and non-labeled body. At E13.5, Tomato appears in most of the mesenchymal cells in the embryo, so there is not as much contrast. The lack of contrast at E13.5 may cause the reviewer to think there is something wrong with the image.

      Comment 5: The idea that the EMM laminates into the meninges and scalp layers is not new and should be properly cited (Vu et al., 2021, Scientific Reports). The following paper should also be cited on the use of alpha-SMA (Acta2) as a marker of the anterior calvaria mesenchyme: Holms et al., 2020 Cell Reports.

      Response: Thank you. We are happy to add those citations.

      Comment 6: It is concluded that meningeal development is maintained in the cKO; however, this conclusion was based on a single marker (S100a6) that is both expressed in the presumptive meninges and dermis and greatly reduced overall in the cKO. This conclusion should be softened or other markers used to show that the meninges is indeed normal.

      Response: We softened the conclusion on the meninges in the revised manuscript, as this part of the phenotype was not our focus, but it would be a good thing to look at in the future.

      Comment 7: The overlap of S100a6 and alpha-SMA is difficult to appreciate in the images shown in Figure 3. Since this is important to the conclusion, co-staining should be done. If co-staining cannot be done due to the primary antibodies' origins, then ISH should be done.

      Response: We added merged images.

      Comment 8: It is concluded that reduced alpha-SMA suggests an early failure of Rac-cKO cells to respond to the mechanical environment. While this is one possibility, the reduction of alpha-SMA may simply be due to a reduction of these cells resulting from failed differentiation, decreased proliferation, or increased apoptosis.

      Response: We think the fact that aSMA is downregulated in cultured cells strongly argues against it being a trivial consequence of reduced proliferation, etc. Nevertheless, we softened our conclusion to allow for some of these things to also contribute to the reduced aSMA expression. We will check apoptosis during the revision period.

      Comment 9: The conclusion that alpha-SMA is a transient population only present in apical cranial mesenchyme between E12.5-14.5 is not consistent with prior studies: Holms et al., 2020 Cell Reports; Holms et al., 2021 Nature Communications; Farmer et al., 2021 Nature Communications; Takeshita et al., 2016 JBMR.

      Response: There is no contradiction. Our statements are based on antibody staining where it is very evident that a-SMA-expressing cells are detectable throughout the apical mesenchyme between E12.5 and E14.5. But at E18.5 we do not see this kind of broad aSMA expression in the apical head, suggesting a transient and spatially restricted population of cells in the apical mesenchyme. This is consistent with the studies from Tsujikawa et al., 2022 and Angelozzi et al., 2022. The papers mentioned by the reviewer are only focused on the suture mesenchyme. They do not claim there is broad aSMA/Acta2 expression in the apical head, but only in a spatially restricted subpopulation of suture mesenchymal cells.

      Comment 10: In the SRF immunostaining results in control and Rac1-cKO embryos, it is difficult to appreciate the nuclear localization at E12.5 in Figure 5E, as the DAPI is over saturated, and the image quality is poor. The image quality is also poor in Figure 5F.

      Response: We will generate better images of SRF staining and quantify the difference between Rac1-WT and Rac1-KO during the revision period.

      Comment 11: To what extent is the expression/localization of MRTF, the transcriptional co-activator of SRF, altered in the calvarial mesenchyme of Rac1-cKO embryos? Changes in MRTF would strengthen the link between Rac1 and SRF.

      Response: We do not know how MRTF expression/localization changes in the embryo tissue, but western blot data on Rac1-KO fibroblasts revealed a reduction in expression/nuclear localization of MRTF-A/B that mirrored the changes in SRF. We added these blots to Figure 5A. However, as noted at the end of the discussion, MRTF is not always required for SRF function in vivo (Dinsmore, eLife 2022). The MRTFA/B-KO is a possibility for future work.

      Comment 12: Hypoplasia of the apical mesenchyme (Figure 6G, inset 1) in Srf-cKO is difficult to see.

      Response: During the revision period we will increase the number of E12.5 Srf-KO and Srf-WT embryos to n=3 for Figure 6G and replace the picture with a better one.

      Comment 13: Generally, the organization of the data into many main and supplemental Figures makes the flow difficult to follow.

      Response: We understand the concern, but we have tried our best to organize the most important data into main figures and the relevant but less essential data into supplemental figures.

      Comment 14: Pdgfra interacts genetically with Srf in neural crest cells in craniofacial development, with Srf being a target of PDGFRa signaling (Vasudevan and Soriano, 2015, Dev Cell). Since the Pdgfra-Cre line used here is hemizygous, it is important that the control used to look at SRF expression in the Rac1-cKO is Pdgfra-Cre+.

      Response: It is standard practice to include some Cre+ mice in the control set to reveal whether Cre has toxic effects in the cells of interest. To the reviewer’s concern about genetic interactions between the Pdgfra gene and Srf, this should not be relevant here because the Pdgfra-Cre used in our study is a transgene and does not affect the endogenous Pdgfra gene.

      Comment 15: The text size in all figures is too small and varies throughout, making it difficult to read.

      Response: To fit the panel in the Word document, the figure is resized. This should not be an issue in the final manuscript.

      Comment 16: Details about the pulse-chase timing of the EdU experiments should be included in the results. Also, does n = 3 for each stage and each genotype? It would be helpful to include a representative section for a control and cKO littermate pair.

      Response: The details are now included in the methods section. Yes, n=3 in each stage and genotype (Fig. 4A). The representative images are also included.

      Comment 17: The relative sizing of the panels within and between figures is haphazard. Some are very large and others very small (Figure 2, 6, Supplemental Figure 1, 2, 6, 7).

      Response: The image panels are fixed in the revised manuscript.

      Comment 18: In Figure 5A and F, the titles "E12.5" and "E13.5" are in italics.

      Response: The fonts for the figures are fixed in the revised manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary: This manuscript by Rathnakar et al. examines the role of the small GTPase Rac1 in apical closure of the scalp and skull. Rac1 activity is regulated by the guanine nucleotide exchange factor DOCK6 and the GTPase-activating protein ARHGAP31. Loss of function variants in DOCK6 and gain of function variants in ARHGAP31 lead to sustained inactivation of Rac1 in Adams-Oliver syndrome (AOS), which is characterized by aplasia cutis congenita, underlying calvarial defects, and limb abnormalities. While Rac1 is thought to be key in the pathogenesis of AOS, how decreased Rac1 activity impacts development of the head is not well understood. The authors find that conditional loss of Rac1 in cranial mesenchyme (using Pdgfra-Cre) leads to AOS-like abnormalities in the scalp and skull. They go on to show that these abnormalities are linked to reduced alpha-SMA expression in the early migrating mesenchyme (EMM), decreased osteoprogenitor cells in the supraorbital mesenchyme (SOM), decreased proliferation, and impaired contractile function of fibroblasts. They also find that Rac1 cKO leads to reduced expression of the mechanosensitive transcription factor SRF. Finally, they show that loss of SRF in cranial mesenchyme (using Pdgfra-Cre) leads to an AOS-like scalp and skull phenotype that has mechanistic overlap with their findings in the Rac1 cKO.

      Major:

      1. Prrx1-Cre targets calvarial mesenchyme, and Suzuki et al., 2009 showed that Prrx1-Cre mediated loss of Rac1 led to a calvarial bone phenotype due to incomplete fusion of the skull. While this phenotype was not studied in detail, the statement in the intro and discussion that the calvarial phenotype has not been recapitulated in mice is incorrect.
      2. The authors show that Pdgfra-Cre induced knockout of Rac1 leads to lower-than-expected numbers of Rac1-cKO embryos at E18.5 and P1. Phenotypic analysis shows that the earliest phenotype is blebbing and hematoma in the nasal region at E11.5/12.5. It is stated that this was resolved at E18.5. It is unclear if this is truly a resolution of the phenotype or that these embryos fail to survive until E18.5. Do 100% of the Rac1-cKO embryos exhibit the blebbing/hematoma at E11.5/12.5? What is the observed number/percentage of Rac1-cKO embryos at E11.5/12.5? If the observed percentage of Rac1-cKO is similar to that at E18.5 (lower than the expected 25%), this would support resolution. If the observed ratio is as expected at E11.5/12.5, then this would support embryonic loss before E18.5 rather than phenotypic resolution.
      3. It is stated that brain shape is altered in Rac1-cKO embryos at E14.5 and E18.5 and concluded that these shape differences are secondary to the cranial defects. Pdgfra+ cells give rise to the meninges, and if the Pdgfra-Cre line recapitulates this expression, then loss of the ubiquitously expressed Rac1 in the meninges could lead to a primary defect in the brain, which may lead to secondary defects in the calvarium and scalp. Their conclusion should recognize other possibilities.
      4. The TdTom staining in wholemount at E13.5 (Supplemental Figure 2B) is difficult to appreciate in the image shown.
      5. The idea that the EMM laminates into the meninges and scalp layers is not new and should be properly cited (Vu et al., 2021, Scientific Reports). The following paper should also be cited on the use of alpha-SMA (Acta2) as a marker of the anterior calvaria mesenchyme: Holms et al., 2020 Cell Reports.
      6. It is concluded that meningeal development is maintained in the cKO; however, this conclusion was based on a single marker (S100a6) that is both expressed in the presumptive meninges and dermis and greatly reduced overall in the cKO. This conclusion should be softened or other markers used to show that the meninges is indeed normal.
      7. The overlap of S100a6 and alpha-SMA is difficult to appreciate in the images shown in Figure 3. Since this is important to the conclusion, co-staining should be done. If co-staining cannot be done due to the primary antibodies' origins, then ISH should be done.
      8. It is concluded that reduced alpha-SMA suggests an early failure of Rac-cKO cells to respond to the mechanical environment. While this is one possibility, the reduction of alpha-SMA may simply be due to a reduction of these cells resulting from failed differentiation, decreased proliferation, or increased apoptosis.
      9. The conclusion that alpha-SMA is a transient population only present in apical cranial mesenchyme between E12.5-14.5 is not consistent with prior studies: Holms et al., 2020 Cell Reports; Holms et al., 2021 Nature Communications; Farmer et al., 2021 Nature Communications; Takeshita et al., 2016 JBMR.
      10. In the SRF immunostaining results in control and Rac1-cKO embryos, it is difficult to appreciate the nuclear localization at E12.5 in Figure 5E, as the DAPI is over saturated, and the image quality is poor. The image quality is also poor in Figure 5F.
      11. To what extent is the expression/localization of MRTF, the transcriptional co-activator of SRF, altered in the calvarial mesenchyme of Rac1-cKO embryos? Changes in MRTF would strengthen the link between Rac1 and SRF.
      12. Hypoplasia of the apical mesenchyme (Figure 6G, inset 1) in Srf-cKO is difficult to see.
      13. Generally, the organization of the data into many main and supplemental Figures makes the flow difficult to follow.
      14. Pdgfra interacts genetically with Srf in neural crest cells in craniofacial development, with Srf being a target of PDGFRa signaling (Vasudevan and Soriano, 2015, Dev Cell). Since the Pdgfra-Cre line used here is hemizygous, it is important that the control used to look at SRF expression in the Rac1-cKO is Pdgfra-Cre+.

      Minor:

      1. The text size in all figures is too small and varies throughout, making it difficult to read.
      2. Details about the pulse-chase timing of the EdU experiments should be included in the results. Also, does n = 3 for each stage and each genotype? It would be helpful to include a representative section for a control and cKO littermate pair.
      3. The relative sizing of the panels within and between figures is haphazard. Some are very large and others very small (Figure 2, 6, Supplemental Figure 1, 2, 6, 7).
      4. In Figure 5A and F, the titles "E12.5" and "E13.5" are in italics.

      Significance

      Overall, this is an interesting study that shares mechanistic insight into the scalp and skull deformities in AOS. The overall presentation of the work, particularly the figures, should be improved and streamlined to enhance clarity and better emphasize the novelty of the study. In addition, the conclusions are not always well-supported by the results, and the interpretation of the results does not fully consider and cite previous studies.

      Audience: Developmental Biologists

      Expertise: Craniofacial development and disease

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In mice lacking Rac1 in the PDGFRa+ mesenchymal cell lineage, the authors found Adams-Oliver syndrome (AOS)-like defects of the apical/dorsal scalp and calvaria, which was accompanied by the secondary brain protrusion by E18.5. The primary phenotype emerged at E11.5 and worsened from E12.5 to E14.5 in the apical/dorsal region of the embryonic head, with limited lateral expansion as well as reduced thickening/stratification of the mesenchymal layer expressing α-smooth muscle actin (αSMA). Very similar in vivo abnormalities were obtained when serum response factor (SRF), known as a mechanotransducing factor, was removed in PDGFRα+ mesenchymal cells. Rac1-lacking mesenchymal cells proliferated poorly in vivo and contracted weakly in culture, with reduced expression of SRF and αSMA. Based on these results and previously obtained understanding that the developing apical/dorsal mesenchyme is mechanically stretched by the underlying brain, the authors conclude that the mechanosensing-triggered morphogenetic behaviors of the apical/dorsal mesenchymal cells (i.e., proliferation, stratification, and contraction, which all lead to physical stability or mechanical resilience of that layer) is mediated by Rac1 and SRF. The authors also suggest that this molecular mechanism for the physiological maturation of the apical/dorsal mesenchyme may underlie the ventral-to-dorsal progression of osteogenesis, absence of which explains AOS pathogenesis.

      Major comments:

      In Fig. 5, links between Rac1, SRF, αSMA, and contractility in mesenchymal cells are shown. Molecular analyses (Western blot and qPCR) were performed using primary cultured mesenchymal cells (prepared after being freed from the epidermal population). Although the authors may have chosen cells prepared from E18.5 embryos to isolate the mesenchymal population safely, without contamination by epidermal cells, this reviewer finds that anti-SRF immunoreactivity is weaker at E13.5 than at E12.5 (throughout the section, including the mesencephalic wall) and therefore wonders whether SRF expression changes in a stage-dependent manner. Simply borrowing results obtained from E18.5-derived cells to describe the scenario around E12.5 and E13.5 is therefore a little disappointing, although this is the only such point in the study. In Fig. 5F, it is difficult to clearly see "reduction" of SRF immunoreactivity in Rac1-KO. Therefore, quantification of %SRF+/totalTomato+ would be desirable. Separately, a direct comparison between WT and Rac1-KO of the spontaneous centripetal shrinkage of the apical/dorsal scalp tissues, which occurs within 30 min when the tissues are prepared at E12.5 or E13.5 (Tsujikawa et al., 2022), would strengthen the results in Fig. 5D. As the KO is specific to the mesenchyme, the authors do not have to worry about removal of the epidermal layer (which would be much more difficult at E12.5-13.5 than at E18.5). If the degree of centripetal shrinkage of the "epidermis plus mesenchyme" layers were smaller in Rac1-KO, this could be interpreted as mainly due to poorer recoiling activity and contractility of the Rac1-KO mesenchymal tissue.

      Minor comments:

      1. The authors favor "apical" vs. "basolateral" to describe relative positions in the embryonic head, not only in the adult head. But "apical" vs. "basolateral" should be accompanied by dorsal vs. ventral, at least at the first appearance. Apical-to-basal axis or apex vs. basolateral by itself can give the impression, in many contexts, that epithelial layers/cells are being discussed. Please note that the authors also use "caudal" (in the embryonic head). Usually, a universally defined anatomical axis perpendicular to the rostral-to-caudal axis is the dorsal-to-ventral axis.
      2. One of the authors' statements in ABSTRACT "In control embryos, α-smooth muscle actin (αSMA) expression was spatially restricted to the apical mesenchyme, suggesting a mechanical interaction between the growing brain and the overlying mesenchyme" and a similar one in RESULTS "αSMA was not detected in the basolateral mesenchyme of either genotype from E12.5-E14.5 (Suppl. Fig. 4A), suggesting restriction of the mechanosensitive cell state to the apical mesenchyme" need to be at least partly revised, taking the previous publication on the normal αSMA pattern in the embryonic head into account more carefully. Tsujikawa et al. (2022) described "Low-magnification observations showed superficial immunoreactivity for alpha smooth muscle actin (αSMA), which has been suggested to function in cells playing force-generating and/or constricting roles; this immunoreactivity was continuously strong throughout the dorsal (calvarial) side of the head but not ventrally toward the face, producing a staining pattern similar to a cap (Figure 2A)". Therefore, in this new paper, descriptions like "we observed ...., consistent with ....(2022)" or "we confirmed .... (2022)" would be more accurate and appropriate regarding this specific point. Such a minor change does not reduce this study's overall novelty at all.
      3. It would be very helpful if the authors provide a schematic illustration in which physiological and pathological scenarios (at the molecular, cellular, and tissue levels found or suggested by this study) are shown.
      4. Despite being put in the title, "mechanosensing" by mesenchymal cells is not directly assessed in this study. If appropriate, something like "mechano-functioning" would be closer to what the authors demonstrated.

      Significance

      This study advances understanding of a key aspect of the molecular mechanisms underlying the normal mammalian craniofacial development, unveiling the role of Rac1 and SRF in the apical/dorsal mesenchymal layer which has inter-tissue mechanical relationships with the embryonic brain underneath. This study also advances understanding of Adams-Oliver Syndrome pathogenesis, demonstrating the biological significance of the normal inter-tissue mechanical relationships in the developing mammalian head. This study may have opened a door for the genetic/molecular dissection toward the tissue-level mechano-engineering, which would stimulate development of next-generation organoids or assembloids. Broad audience including developmental biologists/neuroscientists, molecular/cellular biologists, pathologists, clinical geneticists, and pediatricians would be interested in this work.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this paper "Mouse scalp development requires Rac1 and SRF for the maintenance of mechanosensing mesenchyme", the authors demonstrated that deletion of Rac1 (Rac1-KO) with a PDGFRαCreTG mouse model led to absence of the skull apex and bleb formation, while the limbs were not impacted. Rac1-KO mice showed that Rac1 regulates expansion of the apical mesenchyme toward the very apex meningeal and dermis layer and the osteogenic differentiation of supra orbital arch mesenchyme. Rac1 also regulates the proliferation of apical mesenchyme, dermis differentiation, and mechanosensing of the cranial mesenchyme cells. The authors also indicated that Rac1 is a regulator of Srf by showing that deletion of Rac1 led to lower Srf mRNA levels and SRF protein expression. Deletion of Srf produced phenotypes similar to those of Rac1-KO mice.

      The major concern is that the study lacks rigor in several areas: in places n=2, and results are not quantified with statistics. The authors need to run a power analysis and increase their sample sizes. Please include statistics on all measurements. Filamentous actin staining and alpha-SMA are used to visualize mechanosensing, but both are also involved in other cell activities such as cell contractility for movement, cell-to-substrate adhesion, cell division, etc. The authors need to query more mechanosensing-related pathways (Piezo1/2, Yap/Taz-Hippo, integrin-Focal Adhesion Kinase, etc.) to show that mechanosensing changed.
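To make the power-analysis request concrete, the following is a minimal sketch of a sample-size estimate for comparing two group means. It assumes a two-sided test and uses the standard normal approximation; the function name `samples_per_group` and the effect sizes in the usage lines are illustrative assumptions, not values taken from the manuscript.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized effect size (Cohen's d). An exact t-test
    calculation gives a somewhat larger n when groups are small.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes, chosen only to illustrate the scaling.
for d in (2.0, 1.0, 0.5):
    print(f"d = {d}: about {samples_per_group(d)} embryos per genotype")
```

Under these assumptions, even a very large standardized effect (d = 2) calls for about four embryos per genotype, which is why n=2 is hard to defend for quantitative claims.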

      Comments by figure.

      Fig. 1: In panel E, the cranial bone area measurement is not normalized to mitigate possible effect of individual differences.

      Fig. 2:

      1. While the authors mention many phenotypic changes (bone length changes, gap thickness change, apex thickness change, etc.) based on histology staining, none of them are quantified to show a significant difference between Rac1-WT and Rac1-KO.
      2. In panel D, only 2 embryos per group is not enough for quantitation.
      3. In panel D, the two arrows in the Rac1-KO mutants are not easy to catch.

      Fig. 3:

      1. The thickness quantification is not performed.
      2. The images show an obvious change in the curvature of the apex between the control and mutant. This change is not discussed in the results. Is it due to a histology issue?
      3. The merged layer did not show S100a6. While the authors are showing apical expansion of the mesenchyme toward the dermis and meninges, it is hard to track where they are without a merged image.

      Fig. 4:

      1. In panel B, 2 biological replicates per genotype is very low.
      2. There is no cell death data.

      Fig. 5:

      1. In panel B, the GPDH western blot bands in the mutants seem to be thinner than those of controls.
      2. Though the immunostain showed a decrease in signal intensity, it is hard to know whether the decrease is significant enough across all Rac1-KO mutants. They need to measure the fluorescence intensity and perform statistics.

      Fig. 6: Similar to Fig. 2, there is no quantification, and n=1 per genotype is not enough.

      Fig. 7: Quantification with statistics is needed to show that Srf-KO and Rac1-KO are not different from each other but that both are significantly different from WTs.

      Supplement Fig. 2: No image showing the time point before E11.5.

      Supplement Fig. 3: The ventral view of Rac1-WT is not at the same angle as that of Rac1-KO. This makes it harder to see the difference between control and mutant.

      Supplement Fig. 4 & 7: The alkaline phosphatase stained area needs to be normalized to some other metric because the embryos could be of different sizes.

      Supplement Fig. 6A: The legend and figure don't match. Is it E13.5 or E14.5? Panel 6B needs better images without curling of the tissue.

      Significance

      Please see my comments above. This work is broadly of interest to the developmental biology, fracture healing, and human genetics fields.

      The paper is easy to understand and follow. The large number of histology and immunostaining images makes it easy to identify the point the authors want to show. All the figures are well-labeled and visually informative. The sequence of experiments is logical. The gene deletion models provide solid and direct evidence for the necessity of these genes during early head development. The discussion is thoughtfully written and clear. The authors discuss the connection of Rac1 and SRF with other signaling pathways, which makes them promising targets for Adams-Oliver syndrome.

    1. Reviewer #1 (Public review):

      This study uses structural and functional approaches to investigate regulation of the Na/Ca exchanger NCX1 by an activator, PIP2 and an inhibitor, SEA0400. Previous functional studies suggest both of these compounds interact with the Na-dependent inactivation process to mediate their effects.

      State of the art methods are employed here, and the data are of high quality and presented very clearly. While there is merit in combining structural studies on both compounds as they relate to Na-dependent activation, in the end it is somewhat disappointing that neither is explored in further depth.

      The novel aspect of this work is the study on PIP2. Unfortunately, technical limitations precluded structural data on binding of the native PIP2, and so an unnatural short-chained analog, di-C8 PIP2, was used instead. This raises the question of whether these two molecules, which have similar but clearly distinct profiles of activation, actually share the same binding pocket and mode of action. The authors conduct a "competition" experiment, arguing that the effect of di-C8-PIP2 addition subsequent to PIP2 suggests competition for a single binding site. In this scenario, PIP2 would need to vacate the binding site prior to di-C8-PIP2 occupying it. However, the lack of an effect of washout alone suggests PIP2 does not easily unbind. This raises the possibility (probability?) of a non-competitive effect of di-C8-PIP2 at a different site. An additionally informative experiment would be to determine if a saturating concentration of di-C8-PIP2 could prevent the full activation induced by subsequent PIP2 addition. However, the relative affinities of the two ligands might make such an experiment challenging in practice.

      In an effort to address the binding site directly, the authors mutate key residues predicted to be important in liganding the phosphorylated head group of PIP2. However, the only mutations that have a significant effect in PIP2 activation also influence the Na-dependent inactivation process independently of PIP2. While these data are consistent with altering PIP2 binding (which cannot be easily untangled from its functional effect on Na-dependent inactivation), a primary effect on Na-inactivation, rather than PIP2 binding, cannot be fully ruled out. A more extensive mutagenic study, based on other regions of the di-C8 PIP2 binding site, would have given more depth to this work and might have been more revealing mechanistically.

      The SEA0400 aspect of the work does not integrate particularly well with the rest of the manuscript. This study confirms the previously reported structure and binding site for SEA0400 but provides little further information. While interesting speculation is presented regarding the connection between SEA0400 inhibition and Na-dependent inactivation, further experiments to test this idea are not included here.

      Comments on revisions:

      (1) The competition assay data for di-C8-PIP2 and PIP2 is a nice addition, but in its description in the text, the authors should be a bit more circumspect about their conclusions, based on the possibility/probability that the effect observed is actually non-competitive (as detailed above).

      (2) The authors should acknowledge the formal possibility that the functional effects of the mutations studied are a consequence of a direct effect on Na-dependent inactivation, independent of PIP2 binding.

      (3) The authors might strengthen their arguments for combining studies on PIP2 and SEA0400.

      (4) The authors could be clearer where their work on SEA0400 extends beyond the previously published observations.

    1. We call these failures breakdowns, the idea being that someone can be following the correct sequence of steps to complete a task, but then fail to get past a crucial step.

      This idea is also important in day-to-day academic life: things make sense in your own head, but we must be able to translate them into something usable and understandable for others.

    1. Reviewer #1 (Public review):

      The study addresses how faces and bodies are integrated in two STS face areas revealed by fMRI in the primate brain. It builds upon recordings and analysis of the responses of large populations of neurons to three sets of images that vary face and body positions. These sets allowed the authors to thoroughly investigate invariance to position on the screen (MC HC), to pose (P1 P2), to rotation (0 45 90 135 180 225 270 315), to inversion, to possible and impossible postures (all vs straight), and to presentation of head and body together or in isolation. By analyzing neuronal responses, they find that different neurons showed preferences for body orientation, for head orientation, or for the interaction between the two. By using a linear support vector machine classifier, they show that the neuronal population can decode head-body angle presented across orientations in the anterior aSTS patch (but not the middle mSTS patch), except for mirror orientation. In contrast, mSTS neurons show less invariance for head-body angle and more specialization for head or body orientation.

      Strengths:

      These results expand prior work on the role of Anterior STS fundus face area in face-body integration and its invariance to mirror symmetry, with a rigorous set of stimuli revealing the workings of these neuronal populations in processing individuals as a whole, in an important series of carefully designed conditions.

      It also raises questions for future investigations comparing human and monkey expertise with upright and inverted configurations, in light of monkey-specific hanging upside-down behavior. Further, using two types of body postures (sitting, standing), they show a correlation in head-body angle between postures, suggesting that monkey orientation might be more meaningful to these neurons than precise posture.

    2. Reviewer #2 (Public review):

      Summary:

      This paper investigates the neuronal encoding of the relationship between head and body orientations in the brain. Specifically, the authors focus on the angular relationship between the head and body by employing virtual avatars. Neuronal responses were recorded electrophysiologically from two fMRI-defined areas in the superior temporal sulcus and analyzed using decoding methods. They found that: (1) anterior STS neurons encode head-body angle configurations; (2) these neurons distinguish aligned and opposite head-body configurations effectively, whereas mirror-symmetric configurations are more difficult to differentiate; and (3) an upside-down inversion diminishes the encoding of head-body angles. These findings advance our understanding of how visual perception of individuals is mediated, providing a fundamental clue as to how the primate brain processes the relationship between head and body, a process that is crucial for social communication.

      Strengths:

      The paper is clearly written, and the experimental design is thoughtfully constructed and detailed. The use of electrophysiological recordings from fMRI-defined areas elucidated the mechanism of head-body angle encoding at the level of local neuronal populations. Multiple experiments, control conditions, and detailed analyses thoroughly examined various factors that could affect the decoding results. The decoding methods effectively and consistently revealed the encoding of head-body angles in the anterior STS neurons. Consequently, this study offers valuable insights into the neuronal mechanisms underlying our capacity to integrate head and body cues for social cognition, a topic that is likely to captivate readers in this field.

      Weaknesses:

      I did not identify any major weaknesses in this paper.

    3. Reviewer #3 (Public review):

      Summary:

      Zafirova et al. investigated the interaction of head and body orientation in the macaque superior temporal sulcus (STS). Combining fMRI and electrophysiology, they recorded responses of visual neurons to a monkey avatar with varying head and body orientations. They found that STS neurons integrate head and body information in a nonlinear way, showing selectivity for specific combinations of head-body orientations. Head-body configuration angles can be reliably decoded, particularly for neurons in the anterior STS, suggesting a transformation of face/body orientation signals from the middle to the anterior STS. Furthermore, body inversion resulted in reduced decoding of head-body configuration angles. Compared to previous work that examined face or body alone, this study demonstrates how head and body information are integrated to compute a socially meaningful signal.

      Strengths:

      This work presents an elegant design of visual stimuli, with a monkey avatar of varying head and body orientations, making the analysis and interpretation straightforward. Together with several control experiments, the authors systematically investigated different aspects of head-body integration in the macaque STS. The results and analyses of the paper are convincing.

      Weakness:

      While this work has characterized the neural integration of head and body information in detail, it's unclear how the neural representation relates to the animal's perception. Behavioural experiments using the same set of stimuli could help address this question, but I agree that these additional experiments may be beyond the scope of the current paper.

    1. Reviewer #2 (Public review):

      A long-standing debate in the field of Pavlovian learning relates to the phenomenon of timescale invariance in learning, i.e. that the rate at which an animal learns about a Pavlovian CS is driven by the rate of reinforcement of the cue (CS) relative to the background rate of reinforcement. In practice, if a CS is reinforced on every trial, then the rate of acquisition is determined by the relative duration of the CS (T) and the ITI (C = inter-US-interval = duration of CS + ITI), specifically the ratio C/T. Therefore, the point of acquisition should be the same with a 10s CS and a 90s ITI (T = 10; C = 90 + 10 = 100, C/T = 100/10 = 10) and with a 100s CS and a 900s ITI (T = 100; C = 900 + 100 = 1000, C/T = 1000/100 = 10). That is to say, the rate of acquisition is invariant to the absolute timescale as long as this ratio is the same. This idea has many other consequences, but is also notably different from more popular prediction-error based associative learning models such as the Rescorla-Wagner model. The initial demonstrations that the ratio C/T predicts the point of acquisition across a wide range of parameters (both within and across multiple studies) were conducted in pigeons using a Pavlovian autoshaping procedure. What has remained under contention is whether or not this relationship holds across species, particularly in the standard appetitive Pavlovian conditioning paradigms used in rodents. The results from rodent studies aimed at testing this have been mixed, and often the debate around the source of these inconsistent results focuses on the different statistical methods used to identify the point of acquisition for the highly variable trial-by-trial responses at the level of individual animals.

      The authors successfully replicate the same effect found in pigeon autoshaping paradigms decades ago (with almost identical model parameters) in a standard Pavlovian appetitive paradigm in rats. They achieve this through a clever change to the experimental design, using a convincingly wide range of parameters across 14 groups of rats, and by a thorough and meticulous analysis of these data. It is also interesting to note that the two authors have published on opposing sides of this debate for many years, and as a result have developed and refined many of the ideas in this manuscript through this process.
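The timescale-invariance arithmetic above can be checked with a short sketch. The helper name `informativeness` is an assumption introduced here for illustration; it simply applies the definitions given in the text (T is the CS duration, C = T + ITI) to the two example protocols.

```python
def informativeness(cs_duration, iti):
    """Return (C, C/T), where T is the CS duration and C = T + ITI."""
    if cs_duration <= 0 or iti < 0:
        raise ValueError("cs_duration must be positive and iti non-negative")
    cycle = cs_duration + iti  # C: the interval between successive USs
    return cycle, cycle / cs_duration


# The two protocols from the text: durations differ tenfold, C/T does not.
for t, iti in [(10, 90), (100, 900)]:
    c, ratio = informativeness(t, iti)
    print(f"T = {t}s, ITI = {iti}s -> C = {c}s, C/T = {ratio}")
```

Both protocols give C/T = 10, so a timescale-invariant account predicts the same point of acquisition for each.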

      Main findings

      (1) The present findings demonstrate that the point of initial acquisition of responding is predicted by the C/T ratio.

      (2) The terminal rates of responding to the CS appear to be related to the reinforcement rate of the CS (T; specifically, 1/T) but not to its relation to the reinforcement rate of the context (i.e. C or C/T). In the present experiment, all CS trials were reinforced, so it is also the case that the terminal rate of responding was related to the duration of the CS.

      (3) An unexpected finding was that responding during the ITI was similarly related to the rate of contextual reinforcement (1/C). This novel finding suggests that the terminal rate of responding during the ITI and the CS are related to their corresponding rates of reinforcement. This finding is surprising as it suggests that responding during the ITI is not being driven by the probability of reinforcement during the ITI.

      (4) Finally, the authors characterised the nature of increased responding from the point of initial acquisition until responding peaks at a maximum. Their analyses suggest that the nature of this increase was best described as linear in the majority of rats, as opposed to the non-linear increase that might be predicted by prediction error learning models (e.g. Rescorla-Wagner). However, more detailed analyses revealed that these changes can be quite variable across rats, and more variable when the CS had lower informativeness (defined as C/T).

      Strengths and Weaknesses:

      There is an inherent paradox regarding the consistency of the acquisition data from Gibbon & Balsam's (1981) meta-analysis of autoshaping in pigeons, and the present results in magazine response frequency in rats. This consistency is remarkable and impressive, and is suggestive of a relatively conserved or similar underlying learning principle. However, the consistency is also surprising given some significant differences in how these experiments were run. Some of these differences might reasonably be expected to lead to differences in how these different species respond. For example:

      - In the autoshaping procedure commonly used for these data, the pigeons were pretrained to retrieve rewards from a grain hopper with an instrumental contingency between head entry into the hopper and grain availability. During Pavlovian training, pecking the key light also elicited an auditory click feedback stimulus, and when the grain hopper was made available the hopper was also illuminated.

      - In the present experimental procedure, the rats were not given contextual exposure to the pellet reinforcers in the magazine (e.g. a magazine training session is typically found in similar rodent procedures). The Pavlovian CS was a cue light within the magazine itself.

      These design features in the present rodent experiment are clearly intentional. Pretraining with the reinforcer in the testing chambers would reasonably alter the background rate of reinforcement (parameter), so it makes sense not to include this, but it differs from the paradigm used in pigeons. Having the CS inside the magazine where pellets are delivered provides an effective way to reduce any potential response competition between CS and US directed responding and combines these all into the same physical response. This makes the magazine approach response more like the pecking of the light stimulus in the pigeon autoshaping paradigm. However, the locations of the CS and US are separated in pigeon autoshaping, raising questions about why the findings across species are consistent despite these differences.

      Intriguingly, when the insertion of a lever is used as a Pavlovian cue in rodent studies, CS directed responding (sign-tracking) often develops over training such that eventually all animals bias their responding towards the lever rather than towards the US (goal-tracking at the magazine). However, the nature of this shift highlights the important point that these CS and US directed responses can be quite distinct physically as well as psychologically. Therefore, by conflating the development of these different forms of responding, it is not clear whether the relationship between C/T and the acquisition of responding describes the sum of all Pavlovian responding or predominantly CS or US directed responding.

      Another interesting aspect of these findings is that there is a large amount of variability that scales inversely with C/T. A potential account of the source of this variability is related to the absence of preexposure to the reward pellets. This is normally done within the animals' homecage as a form of preexposure to reduce neophobia. If some rats take longer to notice and then approach and finally consume the reward pellets in the magazine, the impact of this would systematically differ depending on the length of the ITI. For animals presented with relatively short CSs and ITIs, they may essentially miss the first couple of trials and/or attribute uneaten pellets accumulating in the magazine to the background/contextual rate of reinforcement. What is not currently clear is whether this was accounted for in some way by confirming when the rats first started retrieving and consuming the rewards from the magazine.

      While the generality of these findings across species is impressive, the very specific set of parameters employed to generate these data raises questions about the generality of these findings across other standard Pavlovian conditioning parameters. While this is obviously beyond the scope of the present experiment, it is important to consider that the present study explored a situation with 100% reinforcement on every trial, with a variable duration CS (drawn from a uniform distribution), and with a single relatively brief CS (maximum of 122s) and a single US. Again, the choice of these parameters in the present experiment is appropriate and very deliberately based on refinements from many previous studies from the authors. This includes a number of criteria used to define magazine response frequency that include discarding specific responses (discussed and reasonably justified clearly in the methods section). Similarly, the finding that terminal rates of responding are reliably related to 1/T is surprising, and it is not clear whether this might be a property specific to this form of variable duration CS, the use of a uniform sampling distribution, or the use of only a single CS. However, it is important to keep these limitations in mind when considering some of the claims made in the discussion section of this manuscript that go beyond what these data can support.

      The main finding demonstrating the consistent findings across species is presented in Figure 3. In the analysis of these data, it is not clear why the correlations between C, T, and C/T and the measure of acquisition in Figure 3A were presented as r values, whereas the r² values were presented in the discussion of Figure 3B, and no values were provided in discussing Figure 3C. The measure of acquisition in Figure 3A is based on a previously established metric, whereas the measure in Figure 3B employs the relatively novel nDKL measure that is argued to be a better and theoretically based metric. Surprisingly, when r and r² values are converted to the same metric across analyses, it appears that this new metric (Figure 3B) does well but not as well as the approach in Figure 3A. This raises questions about why a theoretically derived measure might not be performing as well on this analysis, and whether the more effective measure is either more reliable or tapping into some aspect of the processes that underlie acquisition that is not accounted for by the nDKL metric. Unfortunately, the new metric is discussed and defined at great length but its utility is not considered.

      An important analysis issue that is unclear in the present manuscript is exactly how the statistics were run (how the model was defined, whether individual subjects or group medians were used, what software was used, etc.). For example, it is not clear whether the analyses conducted in relation to Figure 3 used the data from individual rats or the group medians. Similarly, it appears that each rat contributes four separate data points, and a single regression line was fit to all these data despite the highly likely violation of the assumption of independent observations (or more precisely, the assumption of uncorrelated errors) in this analysis. Furthermore, it is claimed that the same regression line fit the ITI and CS period data in this figure, however this

      If the data in Figure 3 were analyzed with log(ITI) or log(C/ITI), i.e. log(C/(C-T)), would this be a better fit for these data? Is it the case that the ratio C/T is the best predictor of the trial/point of acquisition, or is it the case that another metric related to reinforcement rates provides a better fit?

      Based on the variables provided in Supplementary file 3, containing the acquisition data, I was unable to reproduce the values reported in the analysis of Figure 3.

      In relation to Figure 3: I am curious about whether the authors would be able to comment on whether the individual variability in trials to acquisition would be expected to scale differently based on C/T, or C, or (if a less restricted range was used) T?

      It is not clear why Figure 3C is presented but not analyzed, and why the data presented in Figure 4 to clarify the spread of the distribution of the data observed across the plots in Figure 3 uses the data from Figure 3C. This would seem like the least representative data to illustrate the point of Figure 4. It also appears to my eye that the data actually plotted in Figure 4 correspond to Figure 3A and 3B rather than the odds 10:1 data indicated in text.

      What were the decision criteria used to decide on averaging the final 5 conditioning sessions as terminal responding for the analyses in Figure 5? This is an oddly specific number. Was this based on consistency with previous work, or based on the greatest number of sessions where stable data for all animals could be extracted?

      In the analysis corresponding to Figures 7-8: If I understand the description of this analysis correctly, for each rat the data are the cumulative response data during the CS, starting from the trial on which responding to the CS > ITI (t = 1), and ending at the trial on which CS responding peaked (maximum over 3 session moving average window; t = end). This analysis does not seem to account for changes (decline) in the ITI response rates over this period of acquisition, and it is likely that responding during the ITI is still declining after t = 1. Are the 4 functions that were fit to these data to discriminate between different underlying generative processes still appropriate on total CS responding instead of conditional CS responding after accounting for changes in baseline response rates during ITI?

      Page 27, Procedure, final sentence: The magazine responding during the ITI is defined as the 20s period immediately before CS onset. The range of ITI values (Table 1) always starts as low as 15s in all 14 groups. Even in the case of an ITI on a trial that was exactly 20s, this would also mean that the start of this period overlaps with the termination of the CS from the previous trial and delivery (and presumably consumption) of a pellet. Please indicate if the definition of the ITI period was modified on trials where the preceding ITI was <20s, and if any other criteria were used to define the ITI.

      Were the rats exposed to the reinforcers/pellets in their home cage prior to acquisition? Please indicate whether rats were pre-exposed to the reward pellets in their home cages, e.g. as is often done to reduce neophobia. Given the deliberate absence of a magazine-training phase, this information is important when assessing the experienced contingency between the CS and the US.

      For all the analyses, please provide the exact models that were fit and the software used. For example, it is not necessarily clear to the reader (particularly in the absence of degrees of freedom) whether the model fits discussed in Figure 3 are fit on the individual subject data points or the group medians. Similarly, in Figure 6 there is no indication of whether a single regression model was fit to all the plotted data or whether tests of different slopes for each of the conditions were compared. With regards to the statistics in Figure 6, depending on how this was run, it is also a potential problem that the analysis does not correct for the potentially highly correlated multiple measurements from the same subjects, i.e. each rat provides 4 data points which are very likely not to be independent observations.

      A number of sections of the discussion are speculative or not directly supported by the present experimental data (but may well be supported by previous findings that are not the direct focus of the present experiment). For example, Page 19, Paragraph 2: this entire paragraph is not really clearly explained and is presenting an opinion rather than a strong conclusion that follows directly from the present findings. Evidence for an aspect of RET in the present paper (i.e. the prediction of time scale invariance on the initial point of acquisition, but not necessarily the findings regarding the rate of terminal acquisition) - while supportive - does not necessarily provide unconditional evidence for this theory over all the alternatives.

      Similarly, the Conclusion section (Page 23) makes the claim that "the equations have at most one free parameter", which may be an oversimplification that is conditionally true in the narrow context of the present experiment where many things were kept constant between groups and run in a particular way to ensure this is the case. While the equations do well in this narrow case, it is unlikely that additional parameters would not need to be added to account for more general learning situations. To clarify, I am not contending that this kind of statement is necessarily untrue, merely that it is being presented in a narrow context and may require a deeper discussion of much more of the literature to qualify/support properly - and the discussion section of the present experiment/manuscript may not be the appropriate place for this.

      - Consider taking advantage of an "Ideas and Speculation" subsection within the Discussion that is supported by eLife [ https://elifesciences.org/inside-elife/e3e52a93/elife-latest-including-ideas-and-speculation-in-elife-papers ]. This might be more appropriate to qualify the tone of much of the discussion from page 19 onwards.

      It seems like there are entire analyses and new figures being presented in the discussion e.g. Page 20: Information-Theoretic Contingency. These sections might be better placed in the methods section or a supplementary section/discussion.

    2. Author response:

      The following is the authors’ response to the original reviews

      ANALYTICAL

      (1) A key claim made here is that the same relationship (including the same parameter) describes data from pigeons by Gibbon and Balsam (1981; Figure 1) and the rats in this study (Figure 3). The evidence for this claim, as presented here, is not as strong as it could be. This is because the measure used for identifying trials to criterion in Figure 1 appears to differ from any of the criteria used in Figure 3, and the exact measure used for identifying trials to criterion influences the interpretation of Figure 3***. To make the claim that the quantitative relationship is one and the same in the Gibbon-Balsam and present datasets, one would need to use the same measure of learning on both datasets and show that the resultant plots are statistically indistinguishable, rather than simply plotting the dots from both data sets and spotlighting their visual similarity. In terms of their visual characteristics, it is worth noting that the plots are in log-log axis and, as such, slight visual changes can mean a big difference in actual numbers. For instance, between Figure 3B and 3C, the highest information group moves up only "slightly" on the y-axis but the difference is a factor of 5 in the real numbers. Thus, in order to support the strong claim that the quantitative relationships obtained in the Gibbon-Balsam and present datasets are identical, a more rigorous approach is needed for the comparisons.

      ***The measure of acquisition in Figure 3A is based on a previously established metric, whereas the measure in Figure 3B employs the relatively novel nDKL measure that is argued to be a better and theoretically based metric. Surprisingly, when r and r2 values are converted to the same metric across analyses, it appears that this new metric (Figure 3B) does well but not as well as the approach in Figure 3A. This raises questions about why a theoretically derived measure might not be performing as well on this analysis, and whether the more effective measure is either more reliable or tapping into some aspect of the processes that underlie acquisition that is not accounted for by the nDKL metric.

      Figure 3 shows that the relationship between learning rate and informativeness for our rats was very similar to that shown with pigeons by Gibbon and Balsam (1981). We have used multiple criteria to establish the number of trials to learn in our data, with the goal of demonstrating that the correspondence between the data sets was robust. In the revised Figure 3, specifically 3C and 3D, we have plotted trials to acquisition using decision criteria equivalent to those used by Gibbon and Balsam. The criterion they used (at least one peck at the response key on at least 3 out of 4 consecutive trials) cannot be directly applied to our magazine entry data because rats make magazine entries during the inter-trial interval (whereas pigeons do not peck at the response key in the inter-trial interval). Therefore, evidence for conditioning in our paradigm must involve comparison between the response rate during CS and the baseline response rate, rather than just counting responses during the CS. We have used two approaches to adapt the Gibbon and Balsam criterion to our data. One approach, plotted in Figure 3C, uses a non-parametric signed rank test for evidence that the CS response rate exceeds the pre-CS response rate, and adopts a statistical criterion equivalent to Gibbon and Balsam’s 3-out-of-4 consecutive trials (p<.3125). The second method (Figure 3D) estimates the nDkl for the criterion used by Gibbon and Balsam and then applies this criterion to the nDkl for our data. To estimate the nDkl of Gibbon and Balsam’s data, we have assumed there are no responses in the inter-trial interval and the response probability during the CS must be at least 0.75 (their criterion of at least 3 responses out of 4 trials). The nDkl for this difference is 2.2 (odds ratio 27:1). We have then applied this criterion to the nDkl obtained from our data to identify when the distribution of CS response rates has diverged by an equivalent amount from the distribution of pre-CS response rates. These two analyses have been added to the manuscript to replace those previously shown in Figures 3B and 3C.
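
      The response does not say how the threshold of .3125 was obtained. One plausible reading, offered here only as an assumption, is that it is the chance probability of at least 3 "successes" in 4 trials when each trial is a fair coin flip. The short sketch below checks that arithmetic; the function name `binomial_tail` is illustrative and not taken from the manuscript.

```python
from math import comb


def binomial_tail(k_min: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Chance probability of responding on at least 3 of 4 trials when each
# trial is equally likely to count as a success or a failure (assumed null).
alpha = binomial_tail(3, 4)
print(alpha)  # 0.3125
```

      Under that reading, a signed rank test with p < .3125 asks for about as much evidence as the 3-out-of-4 rule does under a coin-flip null.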

      (2) Another interesting claim here is that the rates of responding during ITI and the cue are proportional to the corresponding reward rates with the same proportionality constant. This too requires more quantification and conceptual explanation. For quantification, it would be more convincing to calculate the regression slope for the ITI data and the cue data separately and then show that the corresponding slopes are not statistically distinguishable from each other. Conceptually, it is not clear why the data used to test the ITI proportionality came from the last 5 conditioning sessions. What were the decision criteria used to decide on averaging the final 5 sessions as terminal responses for the analyses in Figure 5? Was this based on consistency with previous work, or based on the greatest number of sessions where stable data for all animals could be extracted?

      If the model is that animals produce response rates during the ITI (a period with no possible rewards) based on the overall rate of rewards in the context, wouldn't it be better to test this before the cue learning has occurred? Before cue learning, the animals would presumably only have attributed rewards in the context to the context and thus, produce overall response rates in proportion to the contextual reward rate. After cue learning, the animals could technically know that the rate of rewards during ITI is zero. Why wouldn't it be better to test the plotted relationship for ITI before cue learning has occurred? Further, based on Figure 1, it seems that the overall ITI response rate reduces considerably with cue learning. What is the expected ITI response rate prior to learning based on the authors' conceptual model? Why does this rate differ from pre and post-cue learning? Finally, if the authors' conceptual framework predicts that ITI response rate after cue learning should be proportional to contextual reward rate, why should the cue response rate be proportional to the cue reward rate instead of the cue reward rate plus the contextual reward rate?

      A single regression line, as shown in Figure 5, is the simplest possible model of the relationship between response rate and reinforcement rate and it explains approximately 80% of the variance in response rate. Fixing the log-log slope at 1 yields the maximally simple model. (This regression is done in the logarithmic domain to satisfy the homoscedasticity assumption.) When transformed into the linear domain, this model assumes a truly scalar relation (linear, intercept at the origin) and assumes the same scale factor and the same scalar variability in response rates for both sets of data (ITI and CS). Our plot supports such a model. Its simplicity is its own motivation (Occam’s razor).
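
      The one-parameter model described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name `fit_scalar_model`, the use of base-10 logarithms, and the synthetic data are all hypothetical. With the log-log slope fixed at 1, the only free parameter is the intercept, whose least-squares estimate is the mean of log(response rate) minus log(reinforcement rate).

```python
import numpy as np


def fit_scalar_model(reinforcement_rate, response_rate):
    """Fit log10(response) = log10(k) + 1 * log10(reinforcement).

    Returns the scale factor k and the R^2 of the fit in the log domain.
    """
    x = np.log10(np.asarray(reinforcement_rate, dtype=float))
    y = np.log10(np.asarray(response_rate, dtype=float))
    log_k = np.mean(y - x)  # least-squares intercept when the slope is fixed at 1
    residuals = y - (x + log_k)
    ss_res = np.sum(residuals**2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 10**log_k, 1.0 - ss_res / ss_tot


# Synthetic data (hypothetical): responses proportional to reinforcement
# rate with multiplicative noise, which is additive noise in the log domain.
rng = np.random.default_rng(0)
rates = np.array([1 / 10, 1 / 20, 1 / 60, 1 / 120, 1 / 300, 1 / 600])
responses = 3.0 * rates * 10 ** rng.normal(0.0, 0.05, size=rates.size)

k, r2 = fit_scalar_model(rates, responses)
print(f"scale factor k = {k:.2f}, R^2 = {r2:.3f}")
```

      In the linear domain this corresponds to response rate = k × reinforcement rate, a line through the origin.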

      If separate regression lines are fitted to the CS and ITI data, there is a small increase in explained variance (R² = 0.82). These regression lines have been added to the plot in the revised manuscript (Figure 5). We leave it to further research to determine whether such a complex model, with 4 parameters, is required. However, we do not think the present data warrant comparing the simplest possible model, with one parameter, to any more complex model for the following reasons:

      · When a brain—or any other machine—maps an observed (input) rate to a rate it produces (output rate), there is always an implicit scalar. In the special case where the produced rate equals the observed rate, the implicit scalar has value 1. Thus, there cannot be a simpler model than the one we propose, which is, in and of itself, interesting.

      · The present case is an intuitively accessible example of why the MDL (Minimum Description Length) approach to model complexity (Barron, Rissanen, & Yu, 1998; Grünwald, Myung, & Pitt, 2005; Rissanen, 1999) can yield a very different conclusion from the conclusion reached using the Bayesian Information Criterion (BIC) approach. The MDL approach measures the complexity of a model when given N data specified with precision of B bits per datum by computing (or approximating) the sum of the maximum-likelihoods of the model’s fits to all possible sets of N data with B precision per datum. The greater the sum over the maximum likelihoods, the more complex the model, that is, the greater its measured wiggle room, its capacity to fit data. Recall that von Neumann remarked to Fermi that with 4 parameters he could fit an elephant. His deeper point was that multi-parameter models bring neither insight nor predictive power; they explain only post-hoc, after one has adjusted their parameters in the light of the data. For realistic data sets like ours, the sums of maximum likelihoods are finite but astronomical. However, just as the Stirling approximation allows one to work with astronomical factorials, it has proved possible to develop readily computable approximations to these sums, which can be used to take model complexity into account when comparing models. Proponents of the MDL approach point out that the BIC is inadequate because models with the same number of parameters can have very different amounts of wiggle room. A standard illustration of this point is the contrast between a logarithmic model and a power-function model. Log regressions must be concave, whereas power function regressions can be concave, linear, or convex, yet they have the same number of parameters (one or two, depending on whether one counts the scale parameter that is always implicit). The MDL approach captures this difference in complexity because it measures wiggle room; the BIC approach does not, because it only counts parameters.

      · In the present case, one is comparing a model with no pivot and no vertical displacement at the boundary between the black dots and the red dots (the 1-parameter unilinear model) to a bilinear model that allows both a change in slope and a vertical displacement for both lines. The 4-parameter model is superior if we use the BIC to take model complexity into account. However, the 4-parameter model has ludicrously more wiggle room. It will provide excellent fits (high maximum likelihood) to data sets in which the red points have slope > 1, slope 0, or slope < 0 and in which it is also true that the intercept for the red points lies well below or well above the black points (non-overlap in the marginal distribution of the red and black data). The 1-parameter model, on the other hand, will provide terrible fits to all such data (very low maximum likelihoods). Thus, we believe the BIC does not properly capture the immense actual difference in complexity between the 1-parameter model (unilinear with slope 1) and the 4-parameter model (bilinear with neither the slope nor the intercept fixed in the linear domain).
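
      For readers who want to see what the BIC side of this comparison involves, the sketch below computes it for the two models on synthetic data. It is an illustration under stated assumptions and does not reproduce the authors' result: the data are simulated, the function names are hypothetical, and the BIC is the common least-squares form n·ln(RSS/n) + k·ln(n), which assumes Gaussian errors and omits an additive constant shared by both models.

```python
import numpy as np


def bic_gaussian(rss, n, n_params):
    """BIC (up to an additive constant) for a least-squares fit with Gaussian errors."""
    return n * np.log(rss / n) + n_params * np.log(n)


def rss_unilinear(x, y):
    """Residual sum of squares with slope fixed at 1 and one shared intercept."""
    intercept = np.mean(y - x)
    return float(np.sum((y - (x + intercept)) ** 2))


def rss_bilinear(x, y, is_cs):
    """Residual sum of squares with a separate slope and intercept per data set."""
    total = 0.0
    for mask in (is_cs, ~is_cs):
        slope, intercept = np.polyfit(x[mask], y[mask], 1)
        total += float(np.sum((y[mask] - (slope * x[mask] + intercept)) ** 2))
    return total


# Synthetic log10 data (hypothetical, for illustration only).
rng = np.random.default_rng(1)
x_iti = rng.uniform(-3.0, -2.0, size=40)  # log10 contextual reinforcement rate
x_cs = rng.uniform(-2.0, -1.0, size=40)   # log10 CS reinforcement rate
x = np.concatenate([x_iti, x_cs])
is_cs = np.concatenate([np.zeros(40, dtype=bool), np.ones(40, dtype=bool)])
y = x + 0.5 + rng.normal(0.0, 0.2, size=x.size)

n = x.size
rss_1 = rss_unilinear(x, y)
rss_4 = rss_bilinear(x, y, is_cs)
bic_1 = bic_gaussian(rss_1, n, 1)
bic_4 = bic_gaussian(rss_4, n, 4)
print(f"RSS: {rss_1:.3f} vs {rss_4:.3f}; BIC: {bic_1:.1f} vs {bic_4:.1f}")
```

      Because the unilinear model is nested in the bilinear one, the bilinear residual sum of squares can never be larger. The BIC penalty depends only on the parameter count, which is the limitation the response attributes to it.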

      · In any event, because the pivot (change in slope between black and red data sets), if any, is small and likewise for the displacement (vertical change), it suffices for now to know that the variance captured by the 1-parameter model is only marginally improved by adding three more parameters. Researchers using the properly corrected measured rate of head poking to measure the rate of reinforcement a subject expects can therefore assume that they have an approximately scalar measure of the subject’s expectation. Given our data, they won’t be far wrong even near the extremes of the values commonly used for rates of reinforcement. That is a major advance in current thinking, with strong implications for formal models of associative learning. It implies that the performance function that maps from the neurobiological realization of the subject’s expectation is not an unknown function. On the contrary, it’s the simplest possible function, the scalar function. That is a powerful constraint on brain-behavior linkage hypotheses, such as the many hypothesized relations between mesolimbic dopamine activity and the expectation that drives responding in Pavlovian conditioning (Berridge, 2012; Jeong et al., 2022; Y.  Niv, Daw, Joel, & Dayan, 2007; Y. Niv & Schoenbaum, 2008).

      The data in Figures 4 and 5 are taken from the last 5 sessions of training. The exact number of sessions was somewhat arbitrary but was chosen to meet two goals: (1) to capture asymptotic responding, which is why we restricted this to the end of the training, and (2) to obtain a sufficiently large sample of data to estimate reliably each rat’s response rate. We have checked what the data look like using the last 10 sessions, and can confirm it makes very little difference to the results. We now note this in the revised manuscript. The data for terminal responding by all rats, averaged over both the last 5 sessions and last 10 sessions, can be downloaded from https://osf.io/vmwzr/

      Finally, as noted by the reviewers, the relationship between the contextual rate of reinforcement and ITI responding should also be evident if we had measured context responding prior to introducing the CS. However, there was no period in our experiment when rats were given unsignalled reinforcement (such as is done during “magazine training” in some experiments). Therefore, we could not measure responding based on contextual conditioning prior to the introduction of the CS. This is a question for future experiments that use an extended period of magazine training or “poor positive” protocols in which there are reinforcements during the ITIs as well as during the CSs. The learning rate equation has been shown to predict reinforcements to acquisition in the poor-positive case (Balsam, Fairhurst, & Gallistel, 2006).

      (3) There is a disconnect between the gradual nature of learning shown in Figures 7 and 8 and the information-theoretic model proposed by the authors. To the extent that we understand the model, the animals should simply learn the association once the evidence crosses a threshold (nDKL > threshold) and then produce behavior in proportion to the expected reward rate. If so, why should there be a gradual component of learning as shown in these figures? In terms of the proportional response rule to the rate of rewards, why is it changing as animals go from 10% to 90% of peak response? The manuscript would be greatly strengthened if these results were explained within the authors' conceptual framework. If these results are not anticipated by the authors' conceptual framework, this should be explicitly stated in the manuscript.

      One of us (CRG) has earlier suggested that responding appears abruptly when the accumulated evidence that the CS reinforcement rate is greater than the contextual rate exceeds a decision threshold (C.R.  Gallistel, Balsam, & Fairhurst, 2004). The new more extensive data require a more nuanced view. Evidence about the manner in which responding changes over the course of training is to some extent dependent on the analytic method used to track those changes. We presented two different approaches. The approach shown in Figures 7 and 8 (now 6 and 7), extending on that developed by Harris (2022), assumes a monotonic increase in response rate and uses the slope of the cumulative response rate to identify when responding exceeds particular milestones (percentiles of the asymptotic response rate). This analysis suggests a steady rise in responding over trials. Within our theoretical model, this might reflect an increase in the animal’s certainty about the CS reinforcement rate with accumulated evidence from each trial. While this method should be able to distinguish between a gradual change and a single abrupt change in responding (Harris, 2022) it may not distinguish between a gradual change and multiple step-like changes in responding and cannot account for decreases in response rate.

      The other analytic method we used relies on the information theoretic measure of divergence, the nDkl (Gallistel & Latham, 2023), to identify each point of change (up or down) in the response record. With that method, we discern three trends. First, the onset tends to be abrupt in that the initial step up is often large (an increase in response rate by 50% or more of the difference between its initial value and its terminal value is common and there are instances where the initial step is to the terminal rate or higher). Second, there is marked within-subject variability in the response rate, characterized by large steps up and down in the parsed response rates following the initial step up, but this variability tends to decrease with further training (there tend to be fewer and smaller steps in both the ITI response rates and the CS response rate as training progresses). Third, the overall trend, seen most clearly when one averages across subjects within groups is to a moderately higher rate of responding later in training than after the initial rise. We think that the first tendency reflects an underlying decision process whose latency is controlled by diminishing uncertainty about the two reinforcement rates and hence about their ratio. We think that decreasing uncertainty about the true values of the estimated rates of reinforcement is also likely to be an important part of the explanation for the second tendency (decreasing within-subject variation in response rates). It is less clear whether diminishing uncertainty can explain the trend toward a somewhat greater difference in the two response rates as conditioning progresses. 
It is perhaps worth noting that the distribution of the estimates of the informativeness ratio is likely to be heavy tailed and have peculiar properties (as witness, for example, the distribution of the ratio of two gamma distributions with arbitrary shape and scale parameters) but we are unable at this time to propound an explanation of the third trend.

      (4) Page 27, Procedure, final sentence: The magazine responding during the ITI is defined as the 20 s period immediately before CS onset. The range of ITI values (Table 1) always starts as low as 15 s in all 14 groups. Even in the case of an ITI on a trial that was exactly 20 s, this would also mean that the start of this period overlaps with the termination of the CS from the previous trial and delivery (and presumably consumption) of a pellet. It should be indicated whether the definition of the ITI period was modified on trials where the preceding ITI was < 20 s, and if any other criteria were used to define the ITI. Were the rats exposed to the reinforcers/pellets in their home cage prior to acquisition?

      There was an error in the description provided in the original text. The pre-CS period used to measure the ITI responding was 10 s rather than 20 s. There was always at least a 5-s gap between the end of the previous trial and the start of the pre-CS period. The statement about the pre-CS measure has been corrected in the revised manuscript.

      (5) For all the analyses, the exact models that were fit and the software used should be provided. For example, it is not necessarily clear to the reader (particularly in the absence of degrees of freedom) that the model discussed in Figure 3 fits on the individual subject data points or the group medians. Similarly, in Figure 6 there is no indication of whether a single regression model was fit to all the plotted data or whether tests of different slopes for each of the conditions were compared. With regards to the statistics in Figure 6, depending on how this was run, it is also a potential problem that the analyses do not correct for the potentially highly correlated multiple measurements from the same subjects, i.e. each rat provides 4 data points which are very unlikely to be independent observations.

      Details about model fitting have been added to the revision. The question about fitting a single model or multiple models to the data in Figure 6 (now 5) is addressed in response 2 above. In Figure 5, each rat provides 2 behavioural data points (ITI response rate and CS response rate) and 2 values for reinforcement rate (1/C and 1/T). There is a weak but significant correlation between the ITI and CS response rates (r = 0.28, p < 0.01; log transformed to correct for heteroscedasticity). By design, there is no correlation between the log reinforcement rates (r = 0.06, p = .404).

      CONCEPTUAL

      (1) We take the point that where traditional theories (e.g., Rescorla-Wagner) and rate estimation theory (RET) both explain some phenomenon, the explanation in terms of RET may be preferred as it will be grounded in aspects of an animal's experience rather than a hypothetical construct. However, like traditional theories, RET does not explain a range of phenomena - notably, those that require some sort of expectancy/representation as part of their explanation. This being said, traditional theories have been incorporated within models that have the representational power to explain a broader array of phenomena, which makes me wonder: Can rate estimation be incorporated in models that have representational power; and, if so, what might this look like? Alternatively, do the authors intend to claim that expectancy and/or representation - which follow from probabilistic theories in the RW mould - are unnecessary for explanations of animal behaviour?***

      It is important for the field to realize that the RW model cannot be used to explain the results of Rescorla’s (Rescorla, 1966; Rescorla, 1968, 1969) contingency-not-pairing experiments, despite what was claimed by Rescorla and Wagner (Rescorla & Wagner, 1972; Wagner & Rescorla, 1972) and has subsequently been claimed in many modelling papers and in most textbooks and reviews (Dayan & Niv, 2008; Y. Niv & Montague, 2008). Rescorla programmed reinforcements with a Poisson process. The defining property of a Poisson process is its flat hazard function; the reinforcements were equally likely at every moment in time when the process was running. This makes it impossible to say when non-reinforcements occurred and, a fortiori, to count them. The non-reinforcements are causal events in the RW algorithm and subsequent versions of it. Their effects on associative strength are essential to the explanations proffered by these models. Non-reinforcements (failures to occur, updates when reinforcement is set to 0, hence also the lambda parameter) can have causal efficacy only when the successes may be predicted to occur at specified times (during “trials”). When reinforcements are programmed by a Poisson process, there are no such times. Attempts to apply the RW formula to reinforcement learning soon foundered on this problem (Gibbon, 1981; Gibbon, Berryman, & Thompson, 1974; Hallam, Grahame, & Miller, 1992; L.J. Hammond, 1980; L. J. Hammond & Paynter, 1983; Scott & Platt, 1985). The enduring popularity of the delta-rule updating equation in reinforcement learning depends on “big-concept” papers that don’t fit models to real data and discretize time into states while claiming to be real-time models (Y. Niv, 2009; Y. Niv, Daw, & Dayan, 2005).
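
      The flat hazard function mentioned above follows from a standard result, sketched here for reference. The symbols are introduced for illustration only: λ is the rate of the Poisson process and t is the time elapsed since the last reinforcement, so the wait to the next reinforcement is exponentially distributed.

```latex
\[
f(t) = \lambda e^{-\lambda t}, \qquad
S(t) = 1 - F(t) = e^{-\lambda t}, \qquad
h(t) = \frac{f(t)}{S(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda .
\]
```

      The hazard h(t) does not depend on t, so no moment is more likely than any other to contain a reinforcement, and no moment can be singled out as one where a reinforcement failed to occur.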

      The information-theoretic approach to associative learning, which sometimes historically travels as RET (rate estimation theory), is unabashedly and inescapably representational. It assumes a temporal map and arithmetic machinery capable in principle of implementing any implementable computation. In short, it assumes a Turing-complete brain. It assumes that whatever the material basis of memory may be, it must make sense to ask of it how many bits can be stored in a given volume of material. This question is seldom posed in associative models of learning, nor by neurobiologists committed to the hypothesis that the Hebbian synapse is the material basis of memory. Many, including the new Nobelist, Geoffrey Hinton, would agree that the question makes no sense. When you assume that brains learn by rewiring themselves rather than by acquiring and storing information, it makes no sense.

      When a subject learns a rate of reinforcement, it bases its behavior on that expectation, and it alters its behavior when that expectation is disappointed. Subjects also learn probabilities when they are defined. They base some aspects of their behavior on those expectations, making computationally sophisticated use of their representation of the uncertainties (Balci, Freestone, & Gallistel, 2009; Chan & Harris, 2019; J. A. Harris, 2019; J.A. Harris & Andrew, 2017; J. A. Harris & Bouton, 2020; J. A. Harris, Kwok, & Gottlieb, 2019; Kheifets, Freestone, & Gallistel, 2017; Kheifets & Gallistel, 2012; Mallea, Schulhof, Gallistel, & Balsam, 2024 in press).

      (2) The discussion of Rescorla's (1967) and Kamin's (1968) findings needs some elaboration. These findings are already taken to mean that the target CS in each design is not informative about the occurrence of the US - hence, learning about this CS fails. In the case of blocking, we also know that changes in the rate of reinforcement across the shift from stage 1 to stage 2 of the protocol can produce unblocking. Perhaps more interesting from a rate estimation perspective, unblocking can also be achieved in a protocol that maintains the rate of reinforcement while varying the sensory properties of the US (Wagner). How does rate estimation theory account for these findings and/or the demonstrations of trans-reinforcer blocking (Pearce-Ganesan)? Are there other ways that the rate estimation account can be distinguished from traditional explanations of blocking and contingency effects? If so, these would be worth citing in the discussion. More generally, if one is going to highlight seminal findings (such as those by Rescorla and Kamin) that can be explained by rate estimation, it would be appropriate to acknowledge findings that challenge the theory - even if only to note that the theory, in its present form, is not all-encompassing. For example, it appears to me that the theory should not predict one-trial overshadowing or the overtraining reversal effect - both of which are amenable to discussion in terms of rates.

      I assume that the signature characteristics of latent inhibition and extinction would also pose a challenge to rate estimation theory, just as they pose a challenge to Rescorla-Wagner and other probability-based theories. Is this correct?

      The seemingly contradictory evidence of unblocking and trans-reinforcer blocking by Wagner and by Pearce and Ganesan cited above will be hard for any theory to accommodate. It will likely depend on what features of the US are represented in the conditioned response.

      RET predicts one-trial overshadowing, as anyone may verify in a scientific programming language, because the theory has no free parameters; hence, no wiggle room. Overtraining reversal effects appear to depend on aspects of the subjects’ experience other than the rate of reinforcement. It seems unlikely that RET can proffer an explanation.

      Various information-theoretic calculations give pretty good quantitative fits to the relatively few parametric studies of extinction and the partial-reinforcement extinction effect (see Gallistel (2012, Figs 3 & 4); Wilkes & Gallistel (2016, Fig 6); and Gallistel (2025, under review, Fig 6)). RET has not been applied to latent inhibition, in part for want of parametric data. However, clearly one should not attribute a negative rate to a context in which the subject had never been reinforced. An explanation, if it exists, would have to turn on the effect of that long period on initial rate estimates AND on evidence of a change in rate, as of the first reinforcement.

      Recommendations for authors:

      MINOR POINTS

      (1) It is not clear why Figure 3C is presented but not analyzed, and why the data presented in Figure 4 to clarify the spread of the distribution of the data observed across the plots in Figure 3 uses the data from Figure 3C. This would seem like the least representative data to illustrate the point of Figure 4. It also appears that the data plotted in Figure 4 corresponds to Figure 3A and 3B rather than the odds 10:1 data indicated in the text.

      Figure 3 has changed as already described. The data previously plotted in Figure 4 are now shown in Figure 3B and correspond to those plotted in Figure 3A.

      (2) Log(T) was not correlated with trials to criterion. If trials to criterion is inversely proportional to log(C/T) and C is uncorrelated with T, shouldn't trials to criterion be correlated with log(T)? Is this merely a matter of low statistical power?

      Yes. There is a small, but statistically non-significant, correlation between log(T) and trials to criterion, r = 0.35, p = .22. That correlation drops to .08 (p = .8) after factoring out log(C/T), which demonstrates that the weak correlation between log(T) and trials to criterion is based on the correlation between log(T) and log(C/T).
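
The "factoring out" step in this reply is a partial correlation. The following is a minimal sketch of that calculation, not the authors' analysis code: the function names are hypothetical, and the data are synthetic values generated only so that the example runs. They are not the values reported in the manuscript.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def partial_r(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return float((r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2)))

# Synthetic illustration only (assumed data, not the study's):
# trials to criterion depends on log(C/T) plus noise, and C is independent of T.
rng = np.random.default_rng(0)
n = 14  # arbitrary number of groups for the illustration
log_C = rng.normal(size=n)
log_T = rng.normal(size=n)
log_CT = log_C - log_T
trials = -1.0 * log_CT + rng.normal(scale=0.3, size=n)

print("r(log T, trials):", pearson_r(log_T, trials))
print("partial r given log(C/T):", partial_r(log_T, trials, log_CT))
```

The partial correlation computed this way equals the correlation between the residuals of log(T) and of trials to criterion after each is regressed on log(C/T), which is the sense in which log(C/T) is "factored out".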

      (3) The rationale for the removal of the high information condition samples in the Fig 8 "Slope" plot appears to be weak. Can the authors justify this choice better? If all data are included, the relationship is clearly different from that shown in the plot.

      We have now reported correlations that include those 3 groups but noted that the correlations are largely driven by the much lower slope values of those 3 groups, which are likely an artefact of their smaller number of trials. We use this to justify a second set of correlations that excludes those 3 groups.

      (4) The discussion states that there is at most one free parameter constrained by the data - the constant of proportionality for response rate. However, there is also another free parameter constrained by data: the informativeness at which expected trials to acquisition is 1.

      I think this comment is referring to two different sets of data. The constant of proportionality of the response rate refers to the scalar relationship between reinforcement rate and terminal response rate shown in Figure 5. The other parameter, the informativeness when trials to acquisition equals 1, describes the intercept of the regression line in Figure 1 (and 3).

      (5) The authors state that the measurement of available information is not often clear. Given this, how is contingency measurable based on the authors' framework?

      (6) Based on the variables provided in Supplementary File 3, containing the acquisition data, we were unable to reproduce the values reported in the analysis of Figure 3.

      Figure 3 has changed, using new criteria for trials to acquisition that attempt to match the criterion used by Gibbon and Balsam. The data on which these figures are based has been uploaded into OSF.

      GRAPHICAL AND TYPOGRAPHICAL

      (1) Y-axis labels in Figure 1 are not appropriately placed. 0 is sitting next to 0.1. 0 should sit at the bottom of the y-axis.

      If this comment refers to the 0 sitting above an arrow in the top right corner of the plot, this is not misaligned. The arrow pointing to zero is used to indicate that this axis approaches zero in the upward direction. 0 should not be aligned to a value on the axis since a learning rate of zero would indicate an infinite number of learning trials. The caption has been edited to explain this more clearly.

      (2) Typo, Page 6, Final Paragraph, line 4. "Fourteen groups of rats were trained with for 42 session"

      Corrected. Thank you.

      (3) Figure 3 caption: Typo, should probably be "Number of trials to acquisition"?

      This change has now been made. The axis shows reinforcements to acquisition to be consistent with Gibbon and Balsam, but trials and number of reinforcements are identical in our 100% reinforcement schedule.

      (4) Typo Page 17 Line 1: "Important pieces evidence about".

      Corrected. Thank you.

      (5) Consider consistent usage of symbols/terms throughout the manuscript (e.g. Page 22, final paragraph: "iota = 2" is used instead of the corresponding symbol that has been used throughout).

      Changed.

      (6) Typo Page 28, Paragraph 1, Line 9: "We used a one-sample t-test using to identify when this".

      This section of text has been changed to reflect the new analysis used for the data in Figure 3.

      (7) Typo Page 29, Paragraph 1, Line 2: "problematic in cases where one of both rates are undefined" either typo or unclear phrasing.

      “of” has been corrected to “or”.

      (8) Typo Page 30: Equation 3 appears to have an error and is not consistent with the initial printing of Equation 3 in the manuscript.

      The typo in the initial expression of Eq 3 (page 23) has been corrected.

      (9) Typo Page 33, Line 5: "Figures 12".

      Corrected.

      (10) Typo Page 34, Line 10: "and the 5 the increasingly"? Should this be "the 5 points that"?

      Corrected.

      (11) Typo Page 35, Paragraph 2: "estimate of the onset of conditioned is the trial after which".

      Corrected.

      (12) Clarify: Page 35, final paragraph: it is stated that four-panel figures are included for each subject in the Supplementary files, but each subject has a six-panel figure in the Supplementary file.

      The text now clarifies that the 4-panel figures are included within the 6-panel figures in the Supplementary materials.

      (13) It is hard to identify the different groups in Figure 2 (Plot 15).

      The figure is simply intended to show that responding across seconds within the trial is relatively flat for each group. Individuation of specific groups is not particularly important.

      (14) It appears that the numbering on the y-axis is misaligned in Figure 2 relative to the corresponding points on the scale (unless I have misunderstood these values and the response rate measure to the ITI can drop below 0?).

      The numbers on the Y axes had become misaligned. That has now been corrected.

      (15) Please include the data from Figure 3A in the spreadsheet supplementary file 3. If it has already been included as one of the columns of data, please consider a clearer/consistent description of the relevant column variable in Supplementary File 1.

      The data from Figure 3 are now available from the linked OSF site, referenced in the manuscript.

      (16) Errors in supplementary data spreadsheets such that the C/T values are not consistent with those provided in Table 1 (C/T values of 4.5, 54, 180, and 300 are slightly different values in these spreadsheets). A similar error/mismatch appears to have occurred in the C/T labels for Figures (e.g. Figure 10) and the individual supplementary figures.

      The C/T values on the figures in the supplementary materials have been corrected and are now consistent with those in Table 1.

      (17) Currently the analysis and code provided at https://osf.io/vmwzr/ are not accessible without requesting access from the author. Please consider making these openly available without requiring a request for authorization. As such, a number of recommendations made here may already have been addressed by the data and code deposited on OSF. Apologies for any redundant recommendations.

      Data and code are now available at the OSF site, which has been made public and no longer requires an access request.

      (18) Please consider a clearer and more specific reference to supplementary materials. Currently, the reader is required to search through 4 separate supplementary files to identify what is being discussed/referenced in the text (e.g. Page 18, final line: "see Supplementary Materials" could simply be "see Figure S1").

      We have added specific page numbers in references to the Supplementary Materials.

    1. Sometimes content goes viral in a way that is against the intended purpose of the original content. For example, this TikTok started as a slightly awkward video of a TikToker introducing his girlfriend. Other TikTokers then used the duet feature to add an out-of-frame gun pointed at the girlfriend’s head, and her out-of-frame hands tied together, being held hostage. TikTokers continued to build on this with hostage negotiators, press conferences and news sources. All of this is almost certainly not the impression the original TikToker was trying to convey.

      I remember this TikTok going viral and how amusing I found it. A video like this is a testament to how creative the internet can be, for good or bad. It was fairly unhinged and a perfect example of how just about anything can go viral given the right circumstances, no matter the intention.

    1. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      This study provides useful findings about the effects of heterozygosity for Trio variants linked to neurodevelopmental and psychiatric disorders in mice. However, the strength of the evidence is limited and incomplete mainly because the experimental flow is difficult to follow, raising concerns about the conclusions' robustness. Clearer connections between variables, such as sex, age, behavior, brain regions, and synaptic measures, and more methodological detail on breeding strategies, test timelines, electrophysiology, and analysis, are needed to support their claims.

      We appreciate the opportunity to address the constructive feedback provided by eLife and the reviewers. Below, we respond to the overall assessment and individual reviewers' comments, clarifying our experimental approach, addressing concerns, and providing additional details where necessary.

      We thank the editors for highlighting the significance of our findings regarding the effects of Trio variant heterozygosity in mice. We acknowledge the feedback concerning the experimental flow and agree that clarity is paramount. To address these concerns:

      (1) Connections between variables: The word limit of the initial submission constrained our ability to provide adequate details and connections between variables. We have revised the manuscript to explicitly outline and extend explanations and the relationships between sex, age, behavior, brain regions, and synaptic measures, ensuring that the rationale for each experiment and its relevance to the overall conclusions are improved.

      (2) Methodological details: The Methods section of our initial submission was condensed, with key details provided in the Supplemental Methods section. We have merged all into an extended section to improve clarity. We have expanded our description of breeding strategies, test timelines, electrophysiological protocols, and data analysis methods in the revised Methods section. We believe the additions have enhanced the transparency and reproducibility of our study and ensured full support of our conclusions.

      (3) Experimental flow: We have revised and extended our results, methods, and discussion sections to clarify the rationale and experimental design to guide readers through the experimental sequence and rationale.

      We are confident these revisions address the concerns raised and enhance the robustness and coherence of our findings.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study explores how heterozygosity for specific neurodevelopmental disorder-associated Trio variants affects mouse behavior, brain structure, and synaptic function, revealing distinct impacts on motor, social, and cognitive behaviors linked to clinical phenotypes. Findings demonstrate that Trio variants yield unique changes in synaptic plasticity and glutamate release, highlighting Trio's critical role in presynaptic function and the importance of examining variant heterozygosity in vivo.

      Strengths:

      This study generated multiple mouse lines to model each Trio variant, reflecting point mutations observed in human patients with developmental disorders. The authors employed various approaches to evaluate the resulting behavioral, neuronal morphology, synaptic function, and proteomic phenotypes.

      Weaknesses:

      While the authors present extensive results, the flow of experiments is challenging to follow, raising concerns about the strength of the experimental conclusions. Additionally, the connection between sex, age, behavioral data, brain regions, synaptic transmission, and plasticity lacks clarity, making it difficult to understand the rationale behind each experiment. Clearer explanations of the purpose and connections between experiments are recommended. Furthermore, the methodology requires more detail, particularly regarding mouse breeding strategies, timelines for behavioral tests, electrophysiology conditions, and data analysis procedures.

      We appreciate the reviewer’s recognition of the novelty and comprehensiveness of our approach, particularly the generation of multiple mouse lines and our efforts to model Trio variant effects in vivo.

      Weaknesses

      (1) Experimental flow and rationale and connection between variables: We have expanded on the connections between behavioral data, neuronal morphology, synaptic function, and proteomics in the Results and Discussion sections to clarify how each experiment informs the reasoning and the conclusions and to highlight the relationships between sex, age, behavior, and synaptic measures.

      (2) Methodological details: Our initial Methods section was formatted to be short to fulfill word limits on the submitted version, with additional details provided in the Supplemental Methods section. We have merged our Methods and Supplemental Methods sections and expanded on our breeding strategies, test timelines, electrophysiological protocols, and data analysis. We believe these additions enhance the transparency and reproducibility of our study.

      (3) Recommendations for the authors: We thank Reviewer #1 for providing several recommendations to improve our manuscript. We have addressed their comments in the revision, as detailed below, adding key experiments that bolster our findings.

      Reviewer #2 (Public review):

      Summary:

      The authors generated three mouse lines harboring ASD, Schizophrenia, and Bipolar-associated variants in the TRIO gene. Anatomical, behavioral, physiological, and biochemical assays were deployed to compare and contrast the impact of these mutations in these animals. In this undertaking, the authors sought to identify and characterize the cellular and molecular mechanisms responsible for ASD, Schizophrenia, and Bipolar disorder development.

      Strengths:

      The establishment of TRIO dysfunction in the development of ASD, Schizophrenia, and Bipolar disorder is very recent and of great interest. Disorder-specific variants have been identified in the TRIO gene, and this study is the first to compare and contrast the impact of these variants in vivo in preclinical models. The impact of these mutations was carefully examined using an impressive host of methods. The authors achieved their goal of identifying behavioral, physiological, and molecular alterations that are disorder/variant specific. The impact of this work is extremely high given the growing appreciation of TRIO dysfunction in a large number of brain-related disorders. This work is very interesting in that it begins to identify the unique and subtle ways brain function is altered in ASD, Schizophrenia, and Bipolar disorder.

      Weaknesses:

      (1) Most assays were performed in older animals and perhaps only capture alterations that result from homeostatic changes resulting from prodromal pathology that may look very different.

      (2) Identification of upregulated (potentially compensating) genes in response to these disorder-specific Trio variants is extremely interesting. However, a functional demonstration of compensation is not provided.

      (3) There are instances where data is not shown in the manuscript. See "data not shown". All data collected should be provided even if significant differences are not observed.

      I consider weaknesses 1 and 2 minor. While they would be very interesting to explore, these experiments might be more appropriate for a follow-up study. I would recommend that the missing data in 3 should be provided in the supplemental material.

      We are grateful for the reviewer’s recognition of our study’s significance and methodological rigor. The acknowledgment of Trio dysfunction as a novel and impactful area of research is deeply appreciated.

      Weaknesses:

      We agree that focusing on older animals limits insights into early-stage pathophysiology. However, our goal in this study was to examine the functional impacts of Trio heterozygosity at an adolescent stage and to reveal the ultimate impact of these alleles on synaptic function. Our choice of age aligns with our objectives. Future studies of earlier developmental stages will be beneficial and complement these findings.

      Functional compensation:

      We tested functional compensation through rescue experiments in +/K1431M brain slices using a Rac1-specific inhibitor, NSC23766, which prevents Rac1 activation by Trio or Tiam1. Our finding that direct Rac1 inhibition normalizes deficient neurotransmitter release in +/K1431M mice strongly suggests that increased Rac1 activity drives this phenotype.

      Data not shown:

      We will incorporate all data previously marked as "not shown" into the Supplemental Materials, even when results are nonsignificant. We agree that this ensures full transparency and facilitates a more comprehensive evaluation of our findings.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) In Figure 1K-N, the lack of observed differences in +/M2145T mice across all tests raises questions about its validity as a BPD model. Furthermore, the differences in female behavior data compared to males, as shown in the Supplemental section, lack clarification; specifically, whether these variations are due to sex differences or sample size disparities is not discussed. Additionally, it's unclear if the same mice were used in tests K through L-N, as the reported numbers differ without explanation; if relevant, any mortality should be reported. Given the observed body weight differences, it is important to display locomotor data, despite the mention of no change in open field results. Lastly, a detailed breeding strategy and timeline for behavioral testing would enhance clarity.

      We thank Reviewer 1 for recognizing these confusing points in our behavioral data and seek to add clarification in our Revision as below:

      (a) We have revised the text to emphasize our goal to evaluate the impact of NDD-related Trio alleles that have discrete and measurable effects on brain development and function, and not to model specific NDDs (e.g. ASD, SCZ, or BPD). The three specific Trio mutations were chosen based on strong evidence of these mutations impairing the biochemical functions of Trio. We reasoned our approach would reveal how impairing Trio in different ways – i.e. altering protein level or GEF1/GEF2 function – and under genetic conditions (heterozygosity) that mimic those found in individuals with Trio-related disorders impacts brain development and function. The lack of behavioral phenotypes in +/M2145T mice is indeed intriguing, especially given the alterations in electrophysiology and biochemistry experiments. It remains possible that further behavioral analyses of these mice will reveal behavioral phenotypes.

      (b) Given that the prevalence and clinical presentation of individuals with various NDDs are influenced by sex, it is possible that the behavioral differences we see in male versus female Trio variant mice reflect human sex difference phenotypes. We have reorganized the Figure panels to clarify these sex differences in behaviors (new Fig. 2, Supp. Fig. 2). We focused on the most significant behavioral phenotypes shared by both sexes in the main text, or in males alone, as our anatomical and electrophysiological experiments were restricted to males to reduce variation due to estrus. The observed behavioral sex differences are not likely due to sample size disparities as power analyses were performed for all experimental results to ensure adequate sample size. A comprehensive study of the mechanisms underlying these behavioral findings merits examination but is outside the scope of this study.

      (c) All mice were subjected to all behavioral tests described. No sudden mortality was observed during the behavioral experiments. Outliers in post-hoc statistical analyses were removed, which explains the apparent sample size differences between behavioral tests. We have revised the Data analysis section in our Methods to include these details (Lines 216-289, 450-457).

      (d) Results of the open field test have been added to the Supplemental Data (new Supp. Fig. 2) and Results (Lines 532-537)

      (e) The Methods section was expanded to include more detail on the breeding strategy (Lines 98-106). A timeline for behavioral testing has also been included in the Figures to enhance clarity (new Fig. 2A).

      (2) In Figure 2A-E, head width and brain weight showed significant differences, but not body weight, how come the ratio does not change? Comparing with female results in Supplementary Figure 2A-E, it does show a difference between males and females. It is essential to clarify which sex authors use in all follow-up experiments, including synapse, transmission, and plasticity. Since the males and females have different phenotypes, why do the authors focus on males only? The E plot has no data points on the bar graph. In Figure 2I, it lacks example images for all four conditions.

      We greatly appreciate this Reviewer’s attention to details in our brain and body weight data and revised the manuscript to address these concerns.

      (a) The ratios of head width/body weight were calculated for each individual mouse. Hence the distribution of the ratio data (old Fig. 2D; new Fig. 3D) differs from the distribution of head width or body weight data alone (old Fig. 2A, 2C, resp.; now Fig. 3A, 3C), and therefore can affect the p-value for statistical significance. The body weight of +/M2145T males is 21.217 ±0.327 g, while for WT males is 21.745 ±0.224 g, a non-significant decrease of 0.528 g (adjusted p=0.3806). These values have been added to the Fig 3. figure legend (Lines 1020-1034) for clarity.

      (b) Similar to the behavioral experiments in comment (1), we observed sex differences in head width, brain weight, and body weight in Trio heterozygous variant mice compared to WT counterparts. The differences in the ratios of head width/body weight or brain weight/body weight were the same for both males and females (i.e. head width/body weight ratio is decreased in +/K1431M mice compared to WT regardless of sex, and brain weight/body weight ratio is decreased in both +/K1431M and +/K1918X mice compared to WT regardless of sex). These findings affirm the impact of Trio mutations on these phenotypes across both sexes. We have modified the text to draw more attention to this key point (Lines 554-566 and 777-801).

      (c) All experiments (excluding behavior and weight data) were performed in males only to minimize the variation in spine and synapse morphology and physiological activity that can occur due to estrus. We have clarified this in the ‘Animal Work’ section of the Methods (Lines 103-106) as well as in the Figure Legends.

      (d) We thank the Reviewer for pointing out Fig. 3E lacks individual data points on the bar graph. Fig. 3E has been modified to now include the brain weight/body weight ratio for each individual mouse rather than across the population, to be consistent with the calculation of head width/body weight ratio (see point 2a).

      On original submission, only a representative WT image was selected due to space constraints. The figure (new Fig. 3H and 3K) and figure legend have been revised to include representative traces for all genotypes examined.

      (3) In lines 315-320, "None of the Trio variant heterozygotes exhibited altered dendritic spine density on M1 L5 pyramidal neurons compared to WT mice on either apical or basal arbors (Supplementary Figure 3L, M). Electron microscopy of cortical area M1 L5 revealed that synapse density was significantly increased in +/K1918X mice compared to WT (Figure 3A, B), possibly due to a net reduction in neuropil resulting from smaller dendritic arbors." The proposed explanation does not adequately address the observed discrepancy between spine density and synapse density reported in these two experiments. A more thorough analysis is needed to reconcile these conflicting findings and clarify how these distinct measurements may relate to each other in the context of the study's conclusions.

      We acknowledge the apparent discrepancy between our dendritic spine density data, which is unchanged from WT for all three Trio variant heterozygotes, and our synapse density data, which showed an increase in +/K1918X M1 L5 compared to WT. We have expanded the explanation for this discrepancy below and added this to the Discussion (Lines 802-811):

      a) Because spine density can vary by dendritic branch order and distance from the soma, only protrusions from secondary dendritic arbors of M1 L5 pyramidal neurons were quantified for consistency in analyses. However, all synapses meeting criteria were quantified in EM images, regardless of where they were located along an individual neuron’s arbors. It is possible that the density and distribution of spines along other arbors differ between genotypes but were not captured in our current data.

      b) +/K1918X L5 pyramidal neurons are smaller and less complex than WT neurons, especially in the basal compartment corresponding to L5 where EM images were obtained, consistent with the smaller brain size and reduced cortical thickness of +/K1918X mice. We posit that due to their smaller dendritic field size, L5 neurons pack more densely contributing to the increased synapse density observed in +/K1918X M1 L5 cortex. Consistent with this hypothesis, we observed a trend toward increased DAPI+ cell density in M1 L5 of +/K1918X neurons (Supp. Fig. 3N).

      (4) In Figure 4, one potential rationale for measuring AMPAR mEPSC frequency is to infer synapse density changes. However, the findings show no frequency change in +/K1431M and +/K1918X, with an increase only in +/M2145T, which contradicts Figure 3 results indicating a trend toward increased density across variants.

      This inconsistency is confusing, especially since the authors claim to follow the methodology from the study "Trio Haploinsufficiency Causes Neurodevelopmental Disease-Associated Deficits"; yet, the observed mEPSC amplitude differs significantly from that study, while the frequency remains unaffected. Additionally, the NMDAR mEPSCs reflect combined AMPAR and NMDAR responses at positive holding potentials, with peak amplitude dominated by AMPAR. This inconsistency between holding potential results is unclear, as frequency should theoretically align across negative and positive potentials. For accurate NMDAR mEPSC measurement, it would be optimal to assess amplitude 50 ms post-initial peak and, if possible, increase the holding potential to enhance the driving force given the typically low signal of NMDAR response.

      We thank the Reviewer for highlighting these important points.

      a) Previous work from our lab and others demonstrates that Trio regulates synaptic AMPA receptor levels, which is why we chose to focus on AMPAR-mediated evoked and miniature EPSC frequencies and amplitudes in the current study. We acknowledge Reviewer 1’s comment on seemingly contradictory results regarding AMPAR mEPSC frequency and synapse density; however, the unchanged AMPAR mEPSC frequency in +/K1431M and +/K1918X mice is consistent with our finding of unaltered dendritic spine density in these mice compared to WT (Supp. Fig. 4L,M). The difference between dendritic spine counts and synapse density is addressed in Response (3) above.

      b) While synapse density changes can be inferred from AMPAR mEPSC frequency, mEPSC frequency also reflects changes in spontaneous neurotransmitter release, especially in the absence of changes in synapse number. Notably, the increased mEPSC frequency in the +/M2145T variant is linked to enhanced spontaneous release, not to spine or synapse density changes. These findings are reinforced by an increase in counts of synaptic vesicles, calculated PPR changes, and estimates of the Pr and RRP from HFS train analysis. We have included these points in the Discussion (Lines 861-863).

      c) While it is tempting to compare the current study to our previously published conditional Trio haploinsufficiency model, we highlight key distinctions that may underlie phenotypic differences between these two mouse models. First, our prior model used a NEX-Cre transgene to ablate one Trio allele from excitatory neurons only beginning at embryonic day 11. In contrast, our Trio variants are expressed in all cell types throughout development, akin to the genetic variants found in individuals with TRIO-related disorders. Second, the Trio variant mice in this study are on a C57BL/6 background, while the Trio haploinsufficient mice were on a mixed 129Sv/J X C57BL/6 background. These differences in the current study may explain why some measures, such as mEPSC amplitude, may not align with those from the Trio conditional haploinsufficiency model.

      d) Recordings were performed using specific inhibitors to isolate AMPA and NMDA mEPSCs; these missing methodological details have now been clarified in the updated Methods section (Lines 353-360).

      (5) In Supplementary Figure 4, the sample traces indicate a higher NMDA/AMPA ratio, raising the question of whether the AMPA EPSC amplitude changes, as this could reflect PSD length. In Figure 4B, the increased AMPAR mEPSC amplitude in the +/K1918X condition compared to WT suggests an enhanced postsynaptic response, yet the PSD length is reduced in Figure 3C. Can the authors provide a potential hypothesis to explain this?

      We appreciate the Reviewer’s feedback. Yes, both evoked and miniature recordings indicate increased AMPAR amplitudes in the +/K1918X variants compared to WT. While PSD length is often linked to synaptic strength, the reduction in PSD length observed by EM in +/K1918X synapses is small (~6% of WT) and clearly does not correlate with significant changes in synaptic strength. We also note that the whole-cell recordings of mEPSCs represent input from all active synapses on the neuron, while PSD length is measured only in L5 synapses.

      (6) In Figure 4, synaptic plasticity appears to decrease to around 50% of baseline; could this reduction be attributed to LTD, or might it result from changes in pipette resistance? Additionally, is the observed potentiation due to changes in presynaptic release probability? Measuring paired-pulse ratio (PPR) before and after induction would clarify this aspect.

      We thank the Reviewer for highlighting these important points.

      a) We used a well-established theta burst stimulation method for LTP induction in M1 L5 pyramidal neurons. This protocol reliably evokes LTP in WT neurons, as shown in Fig. 5J and K. Both +/K1431M and +/K1918X variants exhibit a slight but discernible increase in evoked excitatory postsynaptic currents (eEPSCs), indicative of the initiation of LTP. Although this increase is smaller compared to WT, the presence of potentiation indicates that long-term depression (LTD) is an unlikely explanation for the observed reduction.

      b) To rule out the influence of technical artifacts, pipette resistance was carefully monitored before and after LTP induction. Any cells exhibiting resistance changes exceeding 20% during electrophysiological recordings were excluded from the analysis, ensuring that fluctuations in pipette resistance did not confound LTP measurements. These technical details are denoted in the Methods (Lines 344-346 and 364-366).

      c) The potentiation in the +/M2145T variant may stem from increased release probability (Pr) and greater synaptic vesicle availability, but testing this is beyond the scope of this work. We agree this is an intriguing question, not only for +/M2145T but also for +/K1431M mice. Future studies should address this, ideally using models where the Trio variant is selectively introduced into the presynaptic neuron.
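
      The exclusion rule described in (b) can be sketched in a few lines. This is only an illustration of the stated criterion (cells whose resistance changed by more than 20% are dropped), not the authors' analysis code; the field names `r_before` and `r_after`, the function names, and the example values are hypothetical.

```python
def resistance_change(before_mohm, after_mohm):
    """Fractional change in resistance relative to the pre-induction value."""
    return abs(after_mohm - before_mohm) / before_mohm


def keep_stable_cells(cells, max_change=0.20):
    """Keep only cells whose resistance changed by no more than `max_change`.

    `cells` is a list of dicts with hypothetical keys "r_before" and
    "r_after" (resistance before and after LTP induction, in megaohms).
    """
    return [
        cell for cell in cells
        if resistance_change(cell["r_before"], cell["r_after"]) <= max_change
    ]


# Illustrative values only, not data from the study.
cells = [
    {"id": "cell_a", "r_before": 10.0, "r_after": 11.0},  # 10% change, kept
    {"id": "cell_b", "r_before": 10.0, "r_after": 13.0},  # 30% change, excluded
    {"id": "cell_c", "r_before": 20.0, "r_after": 15.0},  # 25% change, excluded
]
stable_ids = [cell["id"] for cell in keep_stable_cells(cells)]
```

      Applying the filter before any LTP quantification keeps a drift in recording quality from being mistaken for a change in synaptic strength.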

      (7) In lines 377-380, "The +/M2145T PPR curve was unusual, with significantly reduced PPF at short ISIs, yet clearly increased PPF at longer ISI (Figure 5A, B) compared to WT." The unusual PPR observed at the 100 ms ISI appears unexpected. Can the authors provide an explanation for this anomaly? This finding could suggest atypical presynaptic dynamics or modulation at this specific interval, which may differ from typical synaptic behavior. Further insights into possible mechanisms or experimental conditions affecting this result would be valuable.

      "The decreased PPF at initial ISI in +/M2145T mice correlated with increased mEPSC frequency (Fig. 4A-C), suggestive of a possible increase in spontaneous glutamate Pr." If this is the case, it raises the question of why the increased PPR at the initial ISI in +/K1431M does not correspond to the result shown in Figure 4C. This discrepancy suggests that factors beyond initial presynaptic release probability might be influencing the observed synaptic response, or that compensatory mechanisms could be affecting PPR and mEPSC frequency differently in this variant. Further clarification on the interplay between these measurements would help resolve this inconsistency.

      We appreciate the Reviewer’s critical reading and genuine interest in this phenotype in +/M2145T mice.

      a) The unusual shift of the PPR in +/M2145T at the 100 ms ISI is fascinating, and addressing it will require significant additional experimentation that lies beyond the scope of this report. We propose it results from altered presynaptic regulators, including increased Syt3 and reduced RhoA activity. Notably, Syt3 influences calcium-dependent SV replenishment, which can cause similar PPR defects (Weingarten DJ et al., 2022); this is now included in the Discussion (Lines 915-918).

      Weingarten DJ, Shrestha A, Juda-Nelson K, Kissiwaa SA, Spruston E, Jackman SL. Fast resupply of synaptic vesicles requires synaptotagmin-3. Nature. 2022 Nov;611(7935):320-325. doi: 10.1038/s41586-022-05337-1. Epub 2022 Oct 19. PMID: 36261524.

      b) Thank you for raising the concern about the clarity of this statement: "The decreased PPF at initial ISI in +/M2145T mice correlated with increased mEPSC frequency (Fig. 4A-C), suggestive of a possible increase in spontaneous glutamate Pr." We have edited the sentence to be clearer (Lines 701-703). First, the K1431M and M2145T variants impact different TRIO catalytic activities, disrupting distinct GTPase pathways and differentially affecting presynaptic regulators, which can lead to non-overlapping phenotypes. Second, we have expanded our discussion to note that the +/K1431M variant data suggest increased AMPAR numbers and fewer silent synapses (Lines 850-855), potentially increasing AMPAR mEPSC frequency and masking the expected decrease in spontaneous release (Lines 905-910). Further experiments are needed, ideally using mixed cultures with TRIO variants in presynaptic neurons that synapse onto WT neurons, as minimal stimulation variance analysis in slices would be inconclusive because, like mEPSC frequency, it reflects both Pr and silent synapse changes.

      (8) In Figure 5, there is no evidence demonstrating that the NSC inhibitor functions specifically in the +/K1431M condition without affecting other conditions. To verify its specificity, the authors should test the NSC inhibitor's effects across other conditions in parallel, including a control group. Additionally, cumulative RRP measurements should be provided for a more comprehensive assessment of the inhibitor's impact on synaptic function.

      We appreciate the Reviewer’s feedback.

      a) Previous studies have shown that Rac1 activity can bidirectionally regulate synchronous release probability (Pr). We used the Rac1-specific inhibitor NSC23766 (NSC) to test how Rac1 inhibition impacted the neurotransmitter release deficits observed in +/K1431M mice. We also added control experiments testing the impact of NSC on WT slices. These new experiments are now presented in new Fig. 8 of the revised manuscript, with expanded details in the Results (Lines 737-750) and Discussion (Lines 892-900).

      b) To estimate Pr and the RRP, we employed the Decay method as described by Ruiz et al. (2011), which does not rely on cumulative EPSC plots for RRP estimation. This approach was chosen to account for the initial facilitation in these synapses, and fits are done using EPSCs plotted against stimulus number. Additional details have been provided in the Methods section (Lines 367-373).

      Ruiz R, Cano R, Casañas JJ, Gaffield MA, Betz WJ, Tabares L. Active zones and the readily releasable pool of synaptic vesicles at the neuromuscular junction of the mouse. J Neurosci. 2011 Feb 9;31(6):2000-8. doi: 10.1523/JNEUROSCI.4663-10.2011. PMID: 21307238; PMCID: PMC6633039.

      (9) Given the relevance to NDD, specifying the age window of the mice used is crucial. It is confusing that the synaptic function studies were conducted at P42, while the proteomic analysis was performed at P21. Could the authors clarify the rationale behind using different age points for these analyses? Consistency in age selection, or an explanation for this variation, would help in interpreting the developmental relevance of the findings.

      P42 was chosen as the age as it represents young adulthood, by which time clinical features will have already presented in individuals with neurodevelopmental disorders. Our prior studies of NEX-Cre Trio<sup>-/-</sup> mice found significant measurable differences from WT at this age, after neuronal migration, differentiation, synaptogenesis and pruning have occurred. An earlier developmental timepoint, P21, which coincides with juvenile age in mice, was chosen for proteomics studies to identify earlier changes and potentially targetable and modifiable mechanisms that could influence the phenotypes we observed in older mice. The experiments in P42 versus P21 mice were originally two independent lines of investigation that converged in the current study.

    1. Reviewer #2 (Public review):

      Summary:

      The current study aims to shed light on why previous work on perceptual rhythmicity has led to inconsistent results. They propose that the differences may stem from conceptual and methodological issues. In a series of experiments, the current study reports perceptual rhythmicity in different frequency bands that differ between different ear stimulations and behavioral measures. The study suggests challenges regarding the idea of universal perceptual rhythmicity in hearing.

      Strengths:

      The study aims to address differences observed in previous studies about perceptual rhythmicity. This is important and timely because the existing literature provides quite inconsistent findings. Several experiments were conducted to assess perceptual rhythmicity in hearing from different angles. The authors use sophisticated approaches to address the research questions.

      Weaknesses:

      (1) Conceptual concerns:

      The authors place their research in the context of a rhythmic mode of perception. They also discuss continuous vs rhythmic mode processing. Their study further follows a design that seems to be based on paradigms that assume a recent phase in neural oscillations that subsequently influence perception (e.g., Fiebelkorn et al.; Landau & Fries). In my view, these are different facets in the neural oscillation research space that require a bit more nuanced separation. Continuous mode processing is associated with vigilance tasks (work by Schroeder and Lakatos; reduction of low frequency oscillations and sustained gamma activity), whereas the authors of this study seem to link it to hearing tasks specifically (e.g., line 694). Rhythmic mode processing is associated with rhythmic stimulation by which neural oscillations entrain and influence perception (also, Schroeder and Lakatos; greater low-frequency fluctuations and more rhythmic gamma activity). The current study mirrors the continuous rather than the rhythmic mode (i.e., there was no rhythmic stimulation), but even the former seems not fully fitting, because trials are 1.8 s short and do not really reflect a vigilance task. Finally, previous paradigms on phase-resetting reflect more closely the design of the current study (i.e., different times of a target stimulus relative to the reset of an oscillation). This is the work by Fiebelkorn et al., Landau & Fries, and others, which do not seem to be cited here, which I find surprising. Moreover, the authors would want to discuss the role of the background noise in resetting the phase of an oscillation, and the role of the fixation cross also possibly resetting the phase of an oscillation. Regardless, the conceptual mixture of all these facets makes interpretations really challenging. The phase-reset nature of the paradigm is not (or not well) explained, and the discussion mixes the different concepts and approaches. I recommend that the authors frame their work more clearly in the context of these different concepts (affecting large portions of the manuscript).

      (2) Methodological concerns:

      The authors use a relatively unorthodox approach to statistical testing. I understand that they try to capture and characterize the sensitivity of the different analysis approaches to rhythmic behavioral effects. However, it is a bit unclear what meaningful effects are in the study. For example, the bootstrapping approach that identifies the percentage of significant variations of sample selections is rather descriptive (Figures 5-7). The authors seem to suggest that 50% of the samples are meaningful (given the dashed line in the figure), even though this is rarely reached in any of the analyses. Perhaps >80% of samples should show a significant effect to be meaningful (at least to my subjective mind). To me, the low percentage rather suggests that there is not too much meaningful rhythmicity present. I suggest that the authors also present more traditional, perhaps multi-level, analyses: Calculation of spectra, binning, or single-trial analysis for each participant and condition, and the respective calculation of the surrogate data analysis, and then comparison of the surrogate data to the original data on the second (participant) level using t-tests. I also thought the statistical approach undertaken here could have been a bit more clearly/didactically described as well.
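
      The multi-level analysis suggested above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: performance is assumed to be already binned by target time, surrogates are built by shuffling the bins, and all function names, bin counts, the 4 Hz frequency, and the simulated data are hypothetical.

```python
import cmath
import math
import random
import statistics


def amplitude_at(freq_hz, times_s, values):
    """Amplitude of the Fourier component of `values` at `freq_hz`."""
    mean = statistics.fmean(values)
    total = sum((v - mean) * cmath.exp(-2j * math.pi * freq_hz * t)
                for t, v in zip(times_s, values))
    return 2 * abs(total) / len(values)


def participant_effect(times_s, accuracy, freq_hz, n_surrogates, rng):
    """First level: observed amplitude minus mean time-shuffled surrogate amplitude."""
    observed = amplitude_at(freq_hz, times_s, accuracy)
    shuffled = list(accuracy)
    surrogate_amps = []
    for _ in range(n_surrogates):
        rng.shuffle(shuffled)  # destroys temporal structure, keeps the values
        surrogate_amps.append(amplitude_at(freq_hz, times_s, shuffled))
    return observed - statistics.fmean(surrogate_amps)


def one_sample_t(effects):
    """Second level: t statistic and degrees of freedom for H0: mean effect = 0."""
    n = len(effects)
    standard_error = statistics.stdev(effects) / math.sqrt(n)
    return statistics.fmean(effects) / standard_error, n - 1


def simulate_participant(rng, freq_hz, depth, phase):
    """Simulated accuracy time course: 24 bins spaced 50 ms apart (illustrative)."""
    times = [i * 0.05 for i in range(24)]
    accuracy = [0.75 + depth * math.cos(2 * math.pi * freq_hz * t + phase)
                + rng.gauss(0, 0.02) for t in times]
    return times, accuracy


rng = random.Random(0)
effects = []
for _ in range(12):
    # Each simulated participant gets a random phase; the amplitude-based
    # first-level measure does not require phases to align across participants.
    times, accuracy = simulate_participant(rng, 4.0, 0.1, rng.uniform(0, 2 * math.pi))
    effects.append(participant_effect(times, accuracy, 4.0, 200, rng))
t_stat, dof = one_sample_t(effects)
```

      Because the first-level measure is an amplitude difference, the second-level test is insensitive to phase differences between participants, which is relevant to the point about across-participant averages raised in the Discussion comments.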

      The authors used an adaptive procedure during the experimental blocks such that the stimulus intensity was adjusted throughout. In practice, this can be a disadvantage relative to keeping the intensity constant throughout, because, on average, correct trials will be associated with a higher intensity than incorrect trials, potentially making observations of perceptual rhythmicity more challenging. The authors would want to discuss this potential issue. Intensity adjustments could perhaps contribute to the observed rhythmicity effects. Perhaps the rhythmicity of the stimulus intensity could be analyzed as well. In any case, the adaptive procedure may add variance to the data.

      Additional methodological concerns relate to Figure 8. Figures 8A and C seem to indicate that a baseline correction for a very short time window was calculated (I could not find anything about this in the methods section). The data seem very variable and artificially constrained in the baseline time window. It was unclear what the reader might take from Figure 8.

      Motivation and discussion of eye-movement/pupillometry and motor activity: The dual task paradigm of Experiment 4 and the reasons for assessing eye metrics in the current study could have been better motivated. The experiment somehow does not fit in very well. There is recent evidence that eye movements decrease during effortful tasks (e.g., Contadini-Wright et al. 2023 J Neurosci; Herrmann & Ryan 2024 J Cog Neurosci), which appears to contradict the results presented in the current study. Moreover, by appealing to active sensing frameworks, the authors suggest that active movements can facilitate listening outcomes (line 677; they should provide a reference for this claim), but it is unclear how this would relate to eye movements. Certainly, a person may move their head closer to a sound source in the presence of competing sound to increase the signal-to-noise ratio, but this is not really the active movements that are measured here. A more detailed discussion may be important. The authors further frame the difference between Experiments 1 and 2 as being related to participants' motor activity. However, there are other factors that could explain differences between experiments. Self-paced trials give participants the opportunity to rest more (inter-trial durations were likely longer in Experiment 2), perhaps affecting attentional engagement. I think a more nuanced discussion may be warranted.

      Discussion:

      The main data in Figure 3 showed little rhythmicity. The authors seem to gloss over this fact by simply stating that the same phase is not necessary for their statistical analysis. Previous work, however, showed rhythmicity in the across-participant average (e.g., Fiebelkorn's and similar work). Moreover, one would expect that some of the effects in the low-frequency band (e.g., 2-4 Hz) are somewhat similar across participants. Conduction delays in the auditory system are much smaller than the 0.25-0.5 s associated with 2-4 Hz. The authors would want to discuss why different participants would express so vastly different phases that the across-participant average does not show any rhythmicity, and what this would mean neurophysiologically.

      An additional point that may require more nuanced discussion is related to the rhythmicity of response bias versus sensitivity. The authors could discuss what the rhythmicity of these different measures in different frequency bands means, with respect to underlying neural oscillations.

      Figures:

      Much of the text in the figures seems really small. Perhaps the authors would want to ensure it is readable even for those with low vision. Moreover, Figure 1A is not as intuitive as it could be and may perhaps be made clearer. I also suggest the authors discuss a bit more the potential monaural vs binaural issues, because the perceptual rhythmicity is much slower than any conduction delays in the auditory system that could lead to interference.

    1. .

      SUMMARY: I really like this article. Defining something as undefinable as the self is really difficult and hard to wrap your head around, but I feel like the network self really allows for nuance and exceptions that containerizing one's self does not. We truly are made up of every single thought, experience, opinion, and event that has happened throughout a lifetime. Imagining the complexity of the web that makes me me is incredibly difficult, and will probably only get more difficult as I continue throughout my life.

    2. unique interrelatedness of its particular relational traits, psychobiological, social, political, cultural, linguistic and physical.

      This is a very complicated web to wrap my head around, but it totally makes sense, and that's the point: it is impossible to truly boil it down to a few things that make you you.

    1. oan, Van Tassel. 1997. "PIPELINE POTPOURRI: CABLERS AND COMPUTER MAKERS GO HEAD-TO-HEAD IN THE LIVING ROOM." The Hollywood Reporter (Archive: 1930-2015), Suppl.Trends and Forecasts 349 (43) (Oct 29): S-4. https://login.libproxy.furman.edu/login?auth=shib&url=https://www.proquest.com/trade-journals/pipeline-potpourri/docview/2469245487/se-2.Hettrick, Scott. 1995. "TCI Feeling @ Home on Internet." The Hollywood Reporter (Archive: 1930-2015) 337 (7) (May 05): 4-4, 79. https://login.libproxy.furman.edu/login?auth=shib&url=https://www.proquest.com/trade-journals/tci-feeling-home-on-internet/docview/2469278742/se-2.Sherman, Jay. 1999. "Microsoft Feels Home in Cable Internet Access." The Hollywood Reporter (Archive: 1930-2015) 357 (41) (May 14): 34. https://www.proquest.com/eima/docview/2469228430/AAEB3952309C47D2PQ/17?accountid=11012&sourcetype=Trade%20Journals

      Great job including additional primary sources! It's helpful to get the industry/trade press perspective

    1. Will users unfamiliar with the convention know that they can tap that switch toggle it? Maybe. It’s worth usability testing. They’ll probably try to tap the labels and nothing will happen and they’ll get confused.

      I was surprised by this statement because the functionality of the sliding toggle switch is so ingrained in my head that its function has become intuitive. However, a person who has never used one of these toggles would likely have a much, much harder time discerning its function. I wouldn't have even thought to consider usability testing for this feature, as its function is so apparent to me. I would have liked Ko to explain how to design with these specific biases in place, and how a designer can "step out of their own shoes" when it comes to interface functionality.

    2. typography

      Whenever I hear typography mentioned in the context of digital experiences, I always think of Apple. Steve Jobs happened to really like the typography class he took in college, and when he was building the first Macintosh, he was the first to put such an emphasis on typography and aesthetics, which changed the course of how computers look and feel.

    1. Forrester: No thinking - that comes later. You must write your first draft with your heart. You rewrite with your head. The first key to writing is... to write, not to think!

      https://www.imdb.com/title/tt0181536/quotes

      In this quote from Finding Forrester (Columbia Pictures, 2000) Forrester (portrayed by Sean Connery) turns the idea that writing is thinking on its head.

    1. But now ’tis done, repent with grief do I, Hang down my head with shame, blush, sigh, and cry.

      Now that the writing is published, does the author regret the 'hasty' decision to have it printed? She goes on to explain the different symptoms of distress that she is experiencing.

    1. It is perhaps the most hotly contested pronoun in the history of television. To whom does the "I" in I Love Lucy truly belong? Moreover, is the proper pronoun indeed an "I"? Jess Oppenheimer, the producer and head writer for the show, conceived the title and claims that part of his motivation in choosing I Love Lucy was to grant Desi Arnaz first billing. But ascribing the "I" to Arnaz, while it has the heterosexual simplicity that television and film have always valued, tells nowhere near the entire story. In the more than fifty years since I Love Lucy aired for the first time, many other claimants have imagined themselves part of television's most famous valentine.

      This immediate reflection on the "I" as the introduction quickly establishes the idea that the series was never only about Lucy Ricardo or Lucille Ball. It was always a collaborative project, influenced by various actors. The difference between the television industry then and now is that back then personal and public identities were usually indistinct. Earlier stars were forced to embody fictional ideals. I wouldn't say it's completely different today, but actors are more free to maintain those clear distinctions.

    1. "But my colleague, a Mexican national writer, will never finish reading your great book because she can't get past the errors,” Sandra said.

      I can understand this from a reader's perspective. I read anywhere from 75 to over 100 books a year. With so many new books coming out, I can't be caught up on the grammar or punctuation. I understand that this is significant for this story, but consumers expect the best of the best, making it hard to make everyone happy. She translated it herself, took a leap, and this is where it landed her. I'm happy that she got noticed, but you gotta keep rolling with the punches. Keep your head up, girl! Just adjust for the next time :)

    1. Elizabeth R. Gordon Interviewed by Lilia Bierman TranscriptElizabeth R. Gordon Interviewed by Lilia Bierman00:00:00:00 - 00:00:37:24LILIA: Okay. I'm recording. ERG: Okay. As I'm scratching my head. Please edit that out. (Laughs)LILIA: (laughs) I will. Okay, our topic is on the transition from VCR, VHS, and DVD rentals to online streaming. The first question is, how old were you when VCR, VHS, and DVD became a thing, and later, when digital became a big thing? 00:00:37:24 - 00:01:05:21ERGSo, VCR, I was 14. Okay. DVD, I think, is probably like college. So maybe 21, 22. So that would have been like in 1993, but they still weren't affordable. Yeah. And then streaming. We probably didn't start streaming anything till about five years ago. I was in my late forties. 00:01:05:21 - 00:01:31:15LILIAOkay. What was your experience adapting to the transition to digital away from VHS, DVD, and VCR? And what did you think about these social changes?00:01:32:15 - 00:01:58:12ERGLike, when you have DVDs, when they get scratched, you would have to deal with that. And that was problematic. A lot of my videos are still on videotape. So my wedding is on tape. Oh my son, all his first moments are also on videotape.So I've got to get those transitioned—and then streaming and digital stuff. I mean like I said, because I came in the generation where we did not have personal computers in college. Everything has had to be self-taught. Luckily, my husband is very good about this, and he helps me out. But now I feel very confident in streaming and doing things like that and having apps on my phone—stuff like that.00:01:58:28 - 00:02:19:10Unknown(LILIA) Okay. (ERG) And then what was the second part of that. (Lilia) And what did you think about these social changes. (ERG) What do you mean by that. (LILIA)I mean it's just like how it it kind of ties into the next question, how it kind of changed your everyday lifestyle, if at all. 
If you noticed any changes, was it more difficult to adapt to.00:02:19:12 - 00:02:36:24ERGI mean, you made it easier because you didn't have to carry all this technology around. You have this I can stream Netflix on my phone now. And you don't have to keep up with X, Y and Z. It, I thought it made it very, it made it much easier and I definitely would not want to go backwards.00:02:38:18 - 00:03:09:11ERGBut I like my parents who are in their 80s. There's no way that they, they like the idea of probably have a Netflix or Amazon Prime, but there's no way that my dad could handle that. Yeah. He has a smartphone that, you know, it's, tech support. Yeah. Smartphone. LILIA Yep. I get it. Were there any challenges that you or others that you know, faced while adapting to these new technologies, whether it was learning it or just kind of want to throw your computer at the wall?00:03:09:16 - 00:03:30:01ERGYou know, because we didn't have any computer classes in high school. Yeah. I think they had one section. But the computers that we had or what we did, especially when I was in college, like I wanted C plus programing, I had never it was never taught like word processing Microsoft Word I learned how to type on a typewriter.00:03:30:22 - 00:03:51:21ERGSo again everything was self-taught. It was very hard to begin with and made me kind of nervous. I know a lot of people, think that they can mess something up and can't get it back, and, and there was a lot of anxiety, with that transition. But I feel, you know, again, like, I don't know everything.00:03:51:23 - 00:04:11:10ERGAnd I have children that can help me out, but, you know, I've had to learn a lot. My generation has had to learn a lot. Yeah. And most of us have adapted well, I think. Yes. I'm in Gen X, so that's 1965 to about 1980. And and we've learned a lot and adapted. You know. Yeah. The generation before us.00:04:11:12 - 00:04:38:29ERGNo they're not going to do that. No they're not. 
In retrospect what were the pros and cons of these shifts in technology. You can get more data on things. So I remember when I was writing my thesis in graduate school, and I was still we we didn't have a lot of memory on computers and had to save it on disks, and it took like 6 or 7 deaths and it would be awful.00:04:38:29 - 00:05:01:07ERGAnd then I'd have to get another. So that was extremely frustrating. You know, being able to have things that are quicker and easier to access and knowing that I've got more space and understanding what a megabyte is, what a gigabyte is, and the storage, that is a lot, lot more helpful. But again, I, I, I've enjoyed the technology push.00:05:01:07 - 00:05:26:12ERGThe one thing I don't like about it is that, I'm glad that I raised my children before this. Because I think that kids that are now being raised, a lot of them, you know, this is, this is shoved in their direction in order to occupy them and they're missing out on reading books. They're missing out on dealing with time that you just have to entertain yourself.00:05:26:12 - 00:05:42:26ERGLike going to the doctor's office. We always read books, or we always did stories, or we always just talked about our day. And now I see, you know, like a two year old or one year old, the doctor's office and the parent says this. Yep, yep. And that is just. And then again, you know, my students, I say it's constant.00:05:42:28 - 00:06:08:01ERGYeah. They can't cut it all. No. Like you got to be professional and put it aside and make eye contact. So it's all like that. Yeah. No, I totally agree. Looking back, what are the biggest lasting impacts of this shift? I just like the fact that you have more information that's accessible. 
You do have to decipher what is true and what's not true.00:06:08:02 - 00:06:29:26ERG Yeah, but, you know, if I have a question, instead of having to go to a library and find the book or and I would have I mean, I've taken graduate classes since the shift and my papers, I can find so much more information to write about. Because it's more accessible than half in a way on interlibrary loan or going over there and looking something up.00:06:29:28 - 00:06:54:27ERGSo I do like that quick access to information. I do like the portability of it. And I think that has really changed. And then I mean, things like exposure, like medical records. And when I make a doctor's appointment, the reminder will shift through my cell phone, or I'll shift through the app and then I can find out my test, my blood test for that rather quickly, and have to rely on somebody to call me and tell.00:06:55:00 - 00:07:04:29LILIAYeah, I totally agree. So I love all that. Yeah, it is very helpful. How would you describe this shift in one word?00:07:05:24 - 00:07:10:15ERGOne word?00:07:11:18 - 00:07:35:04ERGI think it's exciting. Yeah, I think it really is. I mean, again, I've embraced it because I've been forced to embrace it as an educator. As a parent. So I've everything about I've like except for again that this is just steering people away from having relationships. Yeah. And learning how to deal with, you know just empty time.00:07:35:04 - 00:07:56:10ERGYou've, you've got to, I think, a lot of parents are missing out on that. They definitely are. LILIAYeah, I totally agree. Do you miss VCR, VHS or DVD? And if so, what aspects specifically do you miss?00:07:56:13 - 00:08:19:09ERGCan't miss it if it's never gone. And I still have all my children's Pixar stuff. We lived on it. They had portable DVD players that would hook into the car. Yeah. We had 13-hour (car) rides to go with it. 
LILIAI mean, you can't argue about that.00:08:19:15 - 00:08:40:27ERGNo, you cannot, but no, I don't miss this at all. You know, I need to get the one thing that I'm really concerned about, which is that I need to get all my son's videos transferred over, and I'm about to send them to somebody. Yeah. And then my wedding video. I need to get that transferred into something. So, no, I don't miss it.00:08:40:29 - 00:09:01:17ERGNo, I still have a bunch, and I still have a DVD player. We got rid of the VCR a couple of years ago. Oh, maybe we haven't. So I can't watch my wedding videos anymore. But now I don't miss this at all. Okay, well that's fair. I don't blame you, since it does, and there's nothing in your computer, so, like.00:09:01:23 - 00:09:37:29ERGNo, I can't know. And there used to be some laptops where you could plug in CD's. Yeah, I remember that. And then like, you know, in the cars when I was 16, you had just, you had a radio and then you had a tape. And then like if you're real fancy, you had a plug in DVD and you plug in a CD player, but like when you went over a bob it was and then came you know they installed and I think my car right now it's like a 2016 I think it has a cassette and a DVD player.00:09:38:12 - 00:09:54:03ERGMay not have the cassette probably then, but yeah, it's just and then all that trying to figure out your song that you want, I mean it's just so much easier. Yeah. Just to plug something in or auto-connect it. It's fantastic. LILIAYeah. Okay. Well, that was all of my questions.Steven Hawk Interviewed by Colby Hawk TranscriptDr. Steven Hawk Interviewed by Colby Hawk00:00:00:00 - 00:00:28:08 Steven: Okay. Go ahead. You can introduce yourself. Yes. My name is Doctor Steven Hawk and I am a licensed K through 12 English teacher. And I've been teaching in the public schools for eight years now. Colby:  Cool. So, about how old were you? 
When, you know, you grew up with the, you know, VHS, VCR and everything, what was it like with that being a big thing back in the day?  00:00:28:08 - 00:00:48:04 Colby: What was your experiences with everyday life and having it having this technology?  Steven: Yeah. From, from the age where I was able to really watch movies, I was watching VHS tapes. So, I had a very small collection of VHS tapes and pretty much just rewatched the same 2 or 3 movies again and again and again and again.  00:00:48:04 - 00:01:06:24 Steven: As my mom would tell you, she would say, I wore out Land Before Time on VHS and Home Alone. Those are my two movies that I pretty much would play ‘em rewind ‘em, play ‘em, rewind ‘em. So as a child, that was my experience was just VHS tapes. You could go to a blockbuster and rent a VHS tape at that point.  00:01:06:26 - 00:01:29:22 Steven: But you owned very few and you were able to rent very few. If you were able to rent, it was usually like once a week. So, you didn't watch a lot of movies. And when you did, hopefully it was something you really liked, and you just watched it again and again and again.  Colby: Cool. Yeah. And having the technology and everything and, you know, the, you know, VHS mainly for you.  00:01:29:24 - 00:01:53:16 Colby: what was it like transitioning, to this digital, you know, internet age when you have iPhones in your pocket, MacBooks and streaming and all of that?  Steven: Yeah. So, the, the, the chain for me, was we went from VHS to DVD probably when I was about 13 years old, around 13. We, we had DVDs and that was a big deal.  00:01:53:19 - 00:02:15:11 Steven: And then DVDs evolved into Blu rays. So, the quality of the DVD DVDs got better. I remember it was my sophomore year of high school when MP3's became a thing. So no longer do we have to carry Walkmans to listen to music, but which is like a DVD, right? we transitioned to MP3's, and so the digital age kind of came upon us.  
00:02:15:15 - 00:02:42:09 Steven: It wasn't until I was probably 22 that I had my first iPhone. So growing up, you know, we didn't have internet for the most part of my life. We didn't have any kind of apps or streaming until I was in my probably early 20s. And so that was a huge change because of the amount of things that you could be, I guess, exposed to through streaming.

00:02:42:12 - 00:03:07:12 Steven: It went from having to have a physical copy of a movie or a disc for music to being able to just choose from a vast digital library of different genres and different artists, to then seek out things which isn't something you were able to do. No more than just going to Blockbuster and looking through the shelves, could you really seek out different genres and different types of things.

00:03:07:12 - 00:03:29:03 Steven: So, it in a lot of ways it was very freeing because it introduced you to a lot of new things, and you were able to discover a lot of new tastes, genres, artists, things like that. So, yeah, I would say I was probably about 22 when streaming really caught on in the United States.

00:03:29:05 - 00:03:49:05 Colby: Now, if when you were 22, when you were 22, you would have just gotten out of college. So when you were still at UTK, what was that like, you know, going, you know, if you wanted to go watch something with your friends or, you know, catch up on the newest whatever, what what was that experience like before you had access to all this?

00:03:49:06 - 00:04:11:11 Steven: Yeah. So it was still DVDs were still the thing. You know, when I was in college, we hadn't moved to streaming quite yet. We had the internet age where you were streaming games online with friends and multiplayer and stuff like that. But not really movies. Movies and TV were not mainstream stream. They were not streamed to the mainstream yet.
00:04:11:14 - 00:04:33:23 Steven: And so for me, it was still going to the movies, you know, my friends and I, we would go to the movie theater if there was a movie coming out. You knew the release date and you would you would set a date and a time to go see the movie with your friends physically at a theater. So it wasn't like we stayed in our dorms or apartments and were able to stream the newest movie or TV show.

00:04:33:25 - 00:05:03:12 Steven: So, for me, that was it was still kind of what you would consider an old school experience. I know I've told you Facebook came out in 2005 when I first went to college. And, you know, so social media and the evolution of all streaming from internet, computer platforms, to digital media, for movies, and games, and music, that all really, you know, came mainstream after my college experience. Not during.

00:05:03:15 - 00:05:25:03 Colby: Now, the one big thing I think, and most everybody knows about right is Blockbuster.

Steven: Yeah.

Colby: So, can you tell me a little bit more about your experiences with Blockbuster? You know, was there like a membership program? Was there like certain deals that they had? What was it like going into one of these stores and renting and picking out your favorite flicks?

00:05:25:05 - 00:05:51:07 Steven: Yeah. If there was a membership program, I'm not aware. As a small child, I don't remember if there was a membership program. But what I do remember, and I tell people often, it was always like Christmas morning for me. I loved Blockbuster. I think everyone kind of had the same experience where it was 1 or 2 times a week that you might be fortunate enough to go to a Blockbuster and get to rent a new movie that you had never seen.
00:05:51:10 - 00:06:09:23 Steven: It was usually a Friday night, and you've been going to school all week and you're just looking forward to Friday night, because that's the one time your parents get to take you to Blockbuster and you walk in the store, and it was like Toys R Us. You have all these movies, and it was just the covers of the movies with a DVD behind it.

00:06:09:25 - 00:06:32:09 Steven: And if you wanted to watch that movie, you had to take the cover out of the way and see if the DVD was still left. And if there was no DVD, then someone had already rented that movie. And if there were enough left, then you got to take one home. But very often they'd already been rented, and so some, some nights you would go for a certain movie, a new release, and it wasn't there.

00:06:32:14 - 00:06:50:03 Steven: And you'd be a little bummed, but you would just go pick out another movie and you would be excited because you didn't get to watch movies, but maybe once or twice a week. Like, at all. You didn't get to watch any more than 1 or 2 movies a week. And so, it was a big deal to watch a movie back then, and it was a lot of fun.

00:06:50:04 - 00:07:15:08 Steven: It was something you really look forward to for Monday. You look forward to getting to Friday and Saturday so you could watch a movie and, and so yeah. It was really special back then. And, it had its. Looking back, you could say it had its difficulties. Like I said, you know, the movie may not be there for you to rent, but we dealt with that disappointment really well, I think, and just say, hey, maybe it'll be back by tomorrow.

00:07:15:08 - 00:07:36:02 Steven: Maybe we could rent it on Saturday night. If not, maybe next week. That'll be the movie. So, you know, we didn't get mad about it. It was part of the deal when you went to Blockbuster. So I feel like, you know, movies were so much more special back then because they were so much more rare, and they're not rare anymore.
00:07:36:05 - 00:07:56:08 Steven: And so, you know, I miss I miss Blockbuster, I miss the excitement of going into the store and the excitement of seeing if the DVD is still there and the excitement of taking it home and watching it. In the VHSs, you had to be kind and rewind is what you had to do. You know, you rewound the tape for the next person to use it.

00:07:56:15 - 00:08:14:18 Steven: When DVDs came along, it was special because you no longer had to rewind the movie. You could just return the disc. So that was a big deal for us. And then of course, as it moved to streaming, you could watch whatever you wanted whenever, you know, whatever day of the week. You didn't have to worry about rewinding or anything.

00:08:14:18 - 00:08:37:21 Steven: So, it was definitely an evolution. But, for me, Blockbuster was really special. And not just Blockbuster, but, you know, even Redbox later and, you know, any form of renting a movie during the week was really special.

Colby: Yeah. And, you're talking about how, you know, now it's not as you know, it's not special. You know, it's not, you know, you have easy access to everything.

00:08:37:21 - 00:09:10:19 Colby: And, kind of on that note, like looking back at your experiences having, you know, dealt with DVDs, VHS, all this stuff, and then having Disney+ and Netflix, and, whatever, Hulu, whatever. You know, how has that changed, like your lifestyle or, you know, just society today and, and like what what would you say or like in some of the pros and cons with having this easy access through, you know, the internet or whatever, you know.

00:09:10:24 - 00:09:35:04 Steven: Yeah. Definitely, it's a double-edged sword. To kind of go back to say, Netflix started as a DVD subscription process, and then that turned into a digital streaming process. I didn't jump into that process, probably for a couple of years into when Netflix became a digital subscription service. Netflix was the first one that I subscribed to.
00:09:35:06 - 00:09:54:08 Steven: It was fairly cheap, and I thought, hey, this seems pretty neat, and I gave it a try. And that was my first foray into the digital streaming world. And I enjoyed it. You know, my first experience was, or my first thought was this, this is nice. This is a lot better than having to, you know, get out of my house and drive to a store and it may or may not be there.

00:09:54:08 - 00:10:20:06 Steven: And so, there were some pros there. There were some benefits to that process. But I think as time went on, and this is a year's process, right? As more and more things started to become digital based, streaming based platforms, news, TV, movies, eventually, taking you out of the theater, even, and just leaving you in your living room.

00:10:20:08 - 00:10:50:07 Steven: Then the layers with Covid. You know, people not getting out of their house. They marketed streaming really heavily during the Covid years, and the years to follow Covid, as something to keep you safe. So it was a marketing ploy to really get you to binge watch and stream. So like I said, it became over time, I believe more of a negative thing had a negative impact on my life because it's so addictive.

00:10:50:09 - 00:11:27:02 Steven: Right? That word binge is probably not a positively connotated word in any other setting. If you binge on food per se, that would not be good. But to binge on Netflix has been marketed as a culturally positive thing. It's something that's good to do. And while it may seem good and may seem fun, and you may find a show or, you know, a series of shows that have five seasons, and you can watch all of them in a matter of two weeks, I’m not sure that that’s healthy.

00:11:27:10 - 00:11:53:13 Steven: And, in my own life, personally, I think, I think it has had a negative impact to be totally honest. It’s much easier after a hard day of work to go to my bedroom and shut the door away from my kids and silence the house and just consume right?
To not give anymore, but to just consume, to binge.

00:11:53:15 - 00:12:16:00 Steven: And that's not good. And I know that that's not good. And so, I feel like now I'm having to self-police. I'm having to say this much is okay, but this much is dangerous. This is not good, not healthy. And so, there's it's a fine line. I'm not exactly sure where the line is now because it's all an evolving process.

00:12:16:02 - 00:12:54:07 Steven: But for me personally, I know it's taking time from my kids, taking time from me reading books and things that I used to do more of, perhaps taking time away from, you know, talking to my wife and communicating. Giving myself a pass when things have been difficult to just sit there and binge and to stream. So, while there have been good things, I think you are, you're probably, kind of like the genres of music. You’re able to discover more through streaming, things that you didn't know existed or things that you didn't know perhaps you were interested in.

00:12:54:10 - 00:13:20:01 Steven: But the negative effect, I think, perhaps outweighs the positive. And that's just my experience. I know some people would disagree.

Colby: Yeah, there's a lot of differing opinions on, streaming and everything. And I think, I mean, I don't even have time to binge these days anymore, which is probably a good thing.

Steven: Yeah, I think so.

Colby: So we talked, you know, you touched on, like, the society and the shift and changes.

00:13:20:01 - 00:13:51:08 Colby: That was very good. With online and all that. Were there any, I guess, you kind of talked about this maybe a little bit, but like any challenges that you or any others that you observed or faced with this challenge of going away from, you know, more analog, whatever, to digital?

Steven: Yeah. I mean, nothing, nothing dramatic or drastic, but I think the first challenge was, of course, going from DVD to streaming because we were in an in-between stage there for a while.
00:13:51:13 - 00:14:07:23 Steven: You had streaming apps out there, and you had Netflix and things that you could, you know, sign up for and partake of, but it's like you kind of had a toe in that world, but you were still stuck to DVDs and you rented from, you know, once Blockbuster went out, it was Redbox or, you know, stuff like that.

00:14:07:23 - 00:14:30:20 Steven: And then when I went full into streaming, then, I guess the challenge is, you know, part of it's financial, to be totally honest. You’re, you're paying for things regularly that you didn't used to pay for, you know. Monthly, you're paying at a minimum, people are probably paying for one streaming app. Lots of people are paying for five or more streaming apps.

00:14:30:22 - 00:14:57:01 Steven: So what used to be free through cable is now charged through apps. So that's been a struggle. Just a financial struggle is like, where's the line between what's an appropriate amount to spend on this form of entertainment and what's not? What’s healthy, what's not? I know this was not for me, but for for some elderly people, there was a huge problem trying to transition to the digital streaming apps.

00:14:57:01 - 00:15:19:13 Steven: And, you know, they they had their TVs that they liked, but they weren't smart TVs. So, you know, they had to figure that they needed a new TV and how to work a new remote and how to download apps and work apps. And that wasn't a problem for me. But I did deal and try to help a lot of elderly people through that transition process to understand how to stream content.

00:15:19:16 - 00:15:40:17 Steven: But for me, you know, like I said, it was just kind of a. It was a learning phase then followed by a self-policing phase of what's. What do I need and what do I not need? Because everyone who develops a streaming app tells you that you need it. And it's kind of hard to select the right service, you know? Do you go with Hulu?
00:15:40:17 - 00:15:59:22 Steven: Do you go with, you know, Comcast? Which one do you go with? There are just so many to choose from that I had to do my research before I landed on the one that I would pay for. Yeah.

Colby: So I think we've already talked about, like, looking back, what were the big impacts on that.

00:15:59:22 - 00:16:29:29 Colby: I think we already touched that.

Steven: Yeah.

Colby: How would you describe that shift in one word? Or that shift or like actually three things. How do you describe the shift? The time before the like the VHS DVDs, all that. And then the time now after this shift. Like three, I know upped it but three.

Steven: Yeah. I would say for the time past, nostalgic. Nostalgic is my word because I miss it.

00:16:30:01 - 00:16:51:15 Steven: It's it's something you didn't know that you would miss when it when when it went away. There was sadness when Blockbuster went out of business, but there was also an acceptance that this is just the new way of things. And sometimes the more we get into the new way, the more I wish it could become the old way.

00:16:51:18 - 00:17:19:01 Steven: So nostalgic would be that one. For the transition, I would say exciting would be the word I would use for that. I can remember being the only high schooler, on the way to a baseball team with a new iPod that streamed. Or not streamed but you know, had the MP3 downloaded music that I could just select from a playlist, while all my friends had a Walkman disc that would skip if, you know, they didn't hold it right.

00:17:19:01 - 00:17:47:03 Steven: And so for me, it was exciting. It was a new frontier. It was a new challenge to learn the technology of it. What was for for the, what was the last question for now? I would say the word is dangerous. For the reasons I've stated already, you know, the, mainly the social reasons. What is marketed to us is that we, again, should binge these things.
00:17:47:09 - 00:18:15:27 Steven: We need these things. We can't live without these things. There's a lot of clever marketing that goes into it, and a lot of people that are persuaded by that marketing, including me to some extent. Right. Because I stream. I do watch shows and a lot of it, a lot more than I used to. What used to be one movie a week has turned into ten movies a week. And 20 episodes a week. And that's dangerous.

00:18:15:28 - 00:18:38:02 Steven: It’s dangerous because it's taking me from things that are more important. And it's giving me a pass when I'm tired to say I don't have to struggle with difficult things. I can just. I deserve this. To just sit quietly in my room, away from my children, away from my wife, away from whomever, and reward myself. I think that's a dangerous notion.

00:18:38:04 - 00:18:50:15 Steven: So dangerous, I think, would be the word.

Colby: Cool. Yeah. And then. Yeah my battery’s giving me the warning. I think I've got 1 or 2. One more question.

00:18:50:15 - 00:19:10:24 Colby: Okay, so that two part thing, I guess if you could give me one more comment, like do you miss it? You know, do you miss the VHS? You know, rewinding and you know, having, you know, all that the Blockbuster and what do you. What, if anything, would you change today? And then what were your favorite, you know, tapes? Or your.

00:19:10:28 - 00:19:34:01 Steven: Yeah. Yeah. Yeah. So I mentioned earlier, my two favorites when I was young was Land Before Time. The original Land Before Time. The first one. Petrie, Longneck, and all the, Sharptooth. That was, I've watched that on repeat, I think. And, and then later when I was a little older, it was, Home Alone, the original Home Alone with Macaulay Culkin. And I just thought that was hilarious.

00:19:34:04 - 00:19:53:05 Steven: It’s kind of slapstick humor, you know? And so those are the two that were my favorite. As far as, you know, do I miss it? Absolutely.
I miss the way things were, because I think I missed the way I was, and my family was, and other people were. That's what I missed. It's not that I miss Blockbuster itself.

00:19:53:07 - 00:20:21:08 Steven: I miss the type of world that we lived in when we still had a Blockbuster. When movies were still special. I didn't say earlier, but you know, as a, as a ninth-grade high school teacher, when we, when I was young and we had a special movie day that was like the best day ever. And so, as a teacher, I thought, hey, when they've really worked hard, I'm going to give them a special movie day occasionally, because I love that when I was young. And I tried that.

00:20:21:11 - 00:20:45:07 Steven: And I've learned that you can't get these kids to focus on a movie anymore. They're so desensitized. They're so overstimulated. They won't even watch a movie anymore. They don't care about movies anymore. I miss how much people cared about movies. So, yeah, I miss it. It's not that I miss VHS again. It's just I miss the way people were.

00:20:45:10 - 00:21:03:00 Steven: And I don't think we can ever get that back. I think we're too far away from that. I don't think we get back to that. So as far as the second part, you know, what could, I what would I change if I could change something? What would I want to change I don't think I have the power to change.

00:21:03:02 - 00:21:23:03 Steven: I want families to sit together on a couch on a Friday night, like I did with a couple pizzas and a show and watch it together, and laugh together, and have time together like family should. That's what I want to happen. But I can't make that happen for other people. I can try to make it happen in my home.

00:21:23:05 - 00:21:47:25 Steven: And, and I've been trying to do that more, you know? I've been consciously trying to do that more in my own home. But I can't do it for other peoples.
And so, what I'm seeing in our culture is a shift away from, from loving one another, from spending time, quality time together, and for giving ourselves, as parents, a pass for spending time with our kids.

00:21:47:25 - 00:22:08:07 Steven: And sometimes, even for parenting our kids. Because it's easier just to put them in front of an iPad or a TV screen and just let them watch a movie than it is to discipline, or to ask them how their day was, or to troubleshoot things in their lives, or to help them with their math homework.

00:22:08:09 - 00:22:28:24 Steven: It’s easier just to let them stream something. So I don't know how we fix that, Colby. That's that's something that I've thought about a lot lately. How do we, as a society, as a culture, get back to at least some part of what we used to be when Blockbuster still existed? I don't know, I don't know the answer to that.

00:22:28:24 - 00:22:52:17 Steven: I think it's a. It’s a question that people have to challenge themselves with personally. They have to know who they are, what they've become, what they want to be, and then find a way to, to find that middle ground between what's enough streaming and what's too much streaming for themselves as parents, as adults, and also for their children.

00:22:52:19 - 00:23:00:15 Steven: And I just don't have a good answer to that, even though I wish I could.

Colby: Sweet. That was a very good answer.

Paul Navis Interviewed by Cole Kennedy Transcript

      Good job running the interviews as conversations rather than spitting the questions out without any follow-up questions! I also appreciate that the transcripts were cleaned up and made easier to navigate.

    1. Reviewer #3 (Public review):

      Summary:

      Rayshubskiy et al. performed whole-cell recordings from descending neurons (DNs) of fruit flies to characterize their role in steering. Two DNs implicated in "walking control" and "steering control" by previous studies (Namiki et al., 2018, Cande et al., 2018, Chen et al., 2018) were chosen by the authors for further characterization. In-vivo whole-cell recordings from DNa01 and DNa02 showed that their activity predicts spontaneous ipsilateral turning events. The recordings also showed that while DNa02 predicts transient turns, DNa01 predicts slow sustained turns. However, optogenetic activation or inactivation showed relatively subtle phenotypes for both neurons (consistent with data in other recent preprints, Yang et al 2023 and Feng et al 2024). The authors also further characterized DNa02 with respect to its inputs and showed a functional connection with olfactory and thermosensory inputs as well as with the head-direction system. DNa01 is not characterized to this extent.

      Strengths:

      (1). In-vivo recordings and especially dual recordings are extremely challenging in Drosophila and provide a much higher resolution DN characterization than other recent studies which have relied on behavior or calcium imaging. Especially impressive are the simultaneous recordings from bilateral DNs (Fig. 3). These bilateral recordings show clearly that DNa02 cells not only fire more during ipsilateral turning events but that they get inhibited during contralateral turns. In line with this observation, the difference between left and right DNa02 neuronal activity is a much better predictor of turning events compared to individual DNa02 activity.

      (2). Another technical feat in this work is driving local excitation in the head-direction neuronal ensemble (PEN-1 neurons), while simultaneously imaging its activity and performing whole-cell recordings from DNa02 (Fig. 4). This impressive approach provided a way to causally relate changes in the head-direction system to DNa02 activity. Indeed, DNa02 activity could predict the rate at which an artificially triggered bump in the PEN-1 ring-attractor returns to its previous stable point.

      (3). The authors also support the above observations with connectomics analysis and provide circuit motifs that can explain how the head direction system (as well as external olfactory/thermal stimuli) communicated with DNa02. All these results unequivocally put DNa02 as an essential DN in steering control, both during exploratory navigation as well as stimulus-directed turns.

      Weaknesses:

      While this study makes a compelling case for the importance of DNa02 in steering control, the role of DNa01, on the other hand, seems unclear based on physiology, optogenetic perturbations, as well as connectome analysis. DNa01 still remains a bit mysterious regarding both its role in controlling steering maneuvers and the behavioral context in which it would be relevant.

    2. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Summary:

      The paper addresses the knowledge gap between the representation of goal direction in the central complex and how motor systems stabilize movement toward that goal. The authors focused on two descending neurons, DNa01 and 02, and showed that they play different roles in steering the fly toward a goal. They also explored the connectome data to propose a model to explain how these DNs could mediate response to lateralized sensory inputs. They finally used lateralized optogenetic activation/inactivation experiments to test the roles of these neurons in mediating turnings in freely walking flies.

      Strengths:

      The experiments are well-designed and controlled. The experiment in Figure 4 is elegant, and the authors put a lot of effort into ensuring that ATP puffs do not accidentally activate the DNs. They also have explained complex experiments well. I only have minor comments for the authors.

      We are grateful for this positive feedback.

      Weaknesses:

      (1) I do not fully understand how the authors extracted the correlation functions from the population data in Figure 1. Since the ipsilateral DNs are anti-correlated with the contralateral ones, I expected that the average will drop to zero when they are pooled together (e.g., 1E-G). Of course, this will not be the case if all the data in Figure 1 are collected from the same brain hemisphere. It would be helpful if the authors could explain this.

      We regret that this information was not easy to find in our initial submission. As noted in the Figure 1D legend, "Here and elsewhere, ipsi and contra are defined relative to the recorded DN(s)." We have now added a sentence to the Results (right after we introduce Figure 1D) that also makes this point.

      (2) What constitutes the goal directions in Figures 1-3 and 8, as the authors could not use EPG activity as a proxy for goal directions? If these experiments were done in the dark, without landmarks, one would expect the fly's heading to drift randomly at times, and they would not engage the DNa01/02 for turning. Do the walking trajectories in these experiments qualify as menotactic bouts?

      Published work (Green et al., 2019) has shown that, even in the dark, flies will often walk for extended periods while holding the bump of EPG activity at a fixed location. During these epochs, the brain is essentially estimating that the fly is walking in a straight line in a fixed direction. (The fact that the fly is actually rotating a bit on the spherical treadmill is not something the fly can know, in the dark.) Thus, epochs where the EPG bump is held fixed are treated as menotactic bouts, even in darkness.

      Our results provide additional support for this interpretation. We find that, when flies are walking in darkness and holding the bump of EPG activity at a fixed location, they will make a corrective behavioral turning maneuver in response to an imposed bump-jump. This result argues that the flies are actually engaging in goal-directed straight-line walking, i.e. menotaxis, and it reproduces the findings of Green et al. (2019).

      To clarify this point, we have adjusted the wording of the Results pertaining to Figure 4.

      (3) In Figure 2B, the authors mentioned that DNa02 overpredicts and 01 underpredicts rapid turning and provided single examples. It would be nice to see more population-level quantification to support this claim.

      In this revision, we have reorganized Figures 1 and 2 (and associated text) to improve clarity. As part of this reorganization, we have removed this passage from the text, as it was a minor point in any event.

      Reviewer #2 (Public review):

      The data is largely electrophysiological recordings coupled with behavioral measurements (technically impressive) and some gain-of-function experiments in freely walking flies. Loss-of-function was tested but had minimal effect, which is not surprising in a system with partially redundant control mechanisms. The data is also consistent with/complementary to subsequent manuscripts (Yang 2023, Feng 2024, and Ros 2024) showing additional descending neurons with contributions to steering in walking and flying.

      The experiments are well executed, the results interesting, and the description clear. Some hypotheses based on connectome anatomy are tested: the insights on the pre-synaptic side (how sensory and central complex heading circuits converge onto these DNs) are stronger than the suggestions about biomechanical mechanisms for how turning happens on the motor side.

      Of particular interest is the idea that different sensory cues can converge on a common motor program. The turn-toward or turn-away mechanism is initiated by valence rather than whether the stimulus was odor or temperature or memory of heading. The idea that animals choose a direction based on external sensory information and then maintain that direction as a heading through a more internal, goal-based memory mechanism, is interesting but it is hard to separate conclusively.

      To clarify, we mention the role of memory in connection with two places in the manuscript. First, we note that the EPG/head direction system relies on learning and memory to construct a map of directional cues in the environment. These cues are, in principle, inherently neutral, i.e. without valence. Second, we note that specific mushroom body output neurons rely on learning and memory to store the valence associated with an odor. This information is not necessarily associated with an allocentric direction: it is simply the association of odor with value. Both of these ideas are well-attested by previous work.

      The reviewer may be suggesting a sequential scheme whereby the brain initializes an allocentric goal direction based on valence, and then maintains that goal direction in memory, based on that initialization. In other words, memory is used to associate valence with some allocentric direction. This seems plausible, but it is not a claim we make in our manuscript.

      The "see-saw", where left-right symmetry is broken to allow a turn, presumably by excitation on one side and inhibition of the other leg motor modules, is interesting but not well explained here. How hyperpolarization affects motor outputs is not clear.

      We have added several sentences to the Discussion to clarify this point. According to this see-saw model, steering can emerge from right/left asymmetries in excitation, or inhibition, or both. It may be nonintuitive to think that inhibitory input to a DN can produce an action. However, this becomes more plausible given our finding that DNa02 has a relatively high basal firing rate (Fig. 1D), and DNa02 hyperpolarization is associated with contraversive turning (Fig. 5A). It is also relevant to note that there are many inhibitory cell types that form strong unilateral connections onto DNa02 (e.g., AOTU019).

      The statement near Figure 5B that "DNa02 activity was higher on the side ipsilateral to the attractive stimulus, but contralateral to the aversive stimulus" is really important - and only possible to see because of the dual recordings.

      We thank the reviewer for this positive feedback.

      Reviewer #3 (Public review):

      Summary:

      Rayshubskiy et al. performed whole-cell recordings from descending neurons (DNs) of fruit flies to characterize their role in steering. Two DNs implicated in "walking control" and "steering control" by previous studies (Namiki et al., 2018, Cande et al., 2018, Chen et al., 2018) were chosen by the authors for further characterization. In-vivo whole-cell recordings from DNa01 and DNa02 showed that their activity predicts spontaneous ipsilateral turning events. The recordings also showed that while DNa02 predicts transient turns, DNa01 predicts slow sustained turns. However, optogenetic activation or inactivation showed relatively subtle phenotypes for both neurons (consistent with data in other recent preprints, Yang et al 2023 and Feng et al 2024). The authors also further characterized DNa02 with respect to its inputs and showed a functional connection with olfactory and thermosensory inputs as well as with the head-direction system. DNa01 is not characterized to this extent.

      Strengths:

      (1) In-vivo recordings and especially dual recordings are extremely challenging in Drosophila and provide a much higher resolution DN characterization than other recent studies that have relied on behavior or calcium imaging. Especially impressive are the simultaneous recordings from bilateral DNs (Figure 3). These bilateral recordings show clearly that DNa02 cells not only fire more during ipsilateral turning events but that they get inhibited during contralateral turns. In line with this observation, the difference between left and right DNa02 neuronal activity is a much better predictor of turning events compared to individual DNa02 activity.

      (2) Another technical feat in this work is driving local excitation in the head-direction neuronal ensemble (PEN-1 neurons), while simultaneously imaging its activity and performing whole-cell recordings from DNa02 (Figure 4). This impressive approach provided a way to causally relate changes in the head-direction system to DNa02 activity. Indeed, DNa02 activity could predict the rate at which an artificially triggered bump in the PEN-1 ring attractor returns to its previous stable point.

      (3) The authors also support the above observations with connectomics analysis and provide circuit motifs that can explain how the head direction system (as well as external olfactory/thermal stimuli) communicates with DNa02. All these results unequivocally put DNa02 as an essential DN in steering control, both during exploratory navigation as well as stimulus-directed turns.

      We are grateful for this detailed positive feedback.

      Weaknesses:

      (1) I understand that the first version of this preprint was already on bioRxiv in 2020, and some of the "weaknesses" I list are likely a reflection of the fact that I'm tasked to review this manuscript in late 2024 (more than 4 years later). But given this is a 2024 updated version, it suffers from not laying out the results in contemporary terms. For instance, the manuscript lacks any reference to the DNp09 circuit implicated in object-directed turning and upstream of DNa02, even though the authors cite one of the papers where this was analyzed (Braun et al, 2024). More importantly, these studies (both Braun et al 2024 and Sapkal et al 2024) along with recent work from the authors' lab (Yang et al 2023) and other labs (Feng et al 2024) provide a view that the entire suite of leg kinematic changes required for turning is orchestrated by populations of heterogeneous interconnected DNs. Moreover, these studies also show that this DN-DN network has some degree of hierarchy, with some DNs being upstream of other DNs. In this contemporary view of steering control, DNa02 (like DNg13 from Yang et al 2023) is a downstream DN that is recruited by hierarchically upstream DNs like DNa03, DNp09, etc. In this view, DNa02 is likely to be involved in most turning events, but by itself unable to drive all the motor outputs required for the said events. This reasoning could be used while discussing the lack of major phenotypes with DNa02 activation or inactivation observed in the current study, which is in stark contrast to strong phenotypes observed in the case of hierarchically upstream DNs like DNp09 or DNa03. In the section "Contributions of single descending neuron types to steering behavior", the authors start off by asking if individual DNs can make measurable contributions to steering behavior. Once more, any citations to DNp09 or DNa03 - two DNs that are clearly shown to drive strong turning on activation (Bidaye et al, 2020, Feng et al 2024) - are lacking. Besides misleading the reader, such statements also move the results away from contemporary knowledge in the field. I appreciate that the brief discussion in the section titled "Ensemble codes for steering" tries to cover these recent updates. However, I think this would serve a better purpose in the introduction and help guide the results.

      We apologize for these omissions of relevant citations, which we have now fixed. Specifically, in our revised Discussion, we now point out that:

      - Braun et al. (2024) reported that bilateral optogenetic activation of either DNa02 or DNa01 can drive turning (in either direction). 

      - Braun et al. (2024) also identified DNb02 as a steering-related DN.

      - Bidaye et al. (2020), Sapkal et al. (2024), and Braun et al. (2024) all contributed to the identification of DNp09 as a broadcaster DN with the capacity to promote ipsiversive turning.

      We have also revised the beginning of the Results section titled “Contributions of single descending neuron types to steering behavior”, as suggested by the Reviewer.

      Finally, we agree with the Reviewer’s overall point that steering is influenced by multiple DNs. We have not claimed that any DN is solely responsible for steering. As we note in the Discussion: “We found that optogenetically inhibiting DNa01 produced only small defects in steering, and inhibiting DNa02 did not produce statistically significant effects on steering; these results make sense if DNa02 is just one of many steering DNs.”

      (2) The second major weakness is the lack of any immunohistochemistry (IHC) images quantifying the expression of the genetic tools used in these studies. Even though the main split-Gal4 tools for DNa01 and DNa02 were previously reported by Namiki et al, 2018, it is important to document the expression with the effectors used in this work and explicitly mention the expression in any ectopic neurons. Similarly, for any experiments where drivers were combined together (double recordings, functional connectivity) or modified for stochastic expression (Figure 8), IHC images are absolutely necessary. Without this evidence, it is difficult to trust many of the results (especially in the case of behavioral experiments in Figure 8). For example, the DNa01 genetic driver used by the authors is also expressed in some neurons in the nerve cord (as shown on the Flylight webpage of Janelia Research Campus). One wonders if all or part of the results described in Figure 8 are due to DNa01 manipulation or manipulation of the nerve cord neurons. The same applies for optic lobe neurons in the DNa02 driver.

      This is a reasonable request. We used DN split-Gal4 lines to express three types of UAS-linked transgenes:

      (1) GFP

      In these flies, we know that expression in DNs is restricted to the DN types in question, based on published work (Namiki et al., 2018), as well as the fact that we see one labeled DN soma per hemisphere. When we label both cells with GFP, we use the spike waveform to identify DNa02 and DNa01, as described in Figure S1.

      (2) ReaChR

      In these flies, expression patterns were different in different flies because ReaChR expression was stochastically sparsened using hs-FLP. Expression was validated in each fly after the experiment, as described in the Methods (“Stochastic ReaChR expression”). hs-FLP-mediated sparsening will necessarily produce stochastic patterns of expression in both DNa02 and off-target cells, and this is true of all the flies in this experiment. What makes the “unilateral” flies distinct from the “bilateral” flies is that unilateral flies express ReaChR in one copy of DNa02, whereas bilateral flies express ReaChR in both copies of DNa02. On average, off-target expression will be the same in both groups.

      (3) GtACR1

      In these flies, we initially assumed that GtACR1 expression was the same as GFP expression under control of the same driver. However, we agree with the reviewer’s point that these two expression patterns are not necessarily identical. Therefore, to address the reviewer’s question, we performed immunofluorescence microscopy to characterize GtACR1 patterns in the brain and VNC of both genotypes. These expression patterns are now shown in a new supplemental figure (Figure S8). This figure shows that, as it happens, expression of GtACR1 is indeed indistinguishable from the GFP expression patterns for the same lines (archived on the FlyLight website). Both DN split-Gal4 lines are largely selective for the DNs in question, with limited off-target labeling. We have now drawn attention to this off-target labeling in the last paragraph of the Results, where the GtACR1 results are discussed.

      (3) The paper starts off with a comparative analysis of the roles of DNa01 and DNa02 during steering. Unfortunately, after this initial analysis, DNa01 is largely ignored for further characterization (e.g. with respect to inputs, connectomics, etc.), only to return in the final figure for behavioral characterization where DNa01 seems to have a stronger silencing phenotype compared to DNa02. I couldn't find an explanation for this imbalance in the characterization of DNa01 versus DNa02. Is this due to technical reasons? Or was it an informed decision due to some results? In addition to being a biased characterization, this also results in the manuscript lacking a coherent thread, which in turn makes it a bit inaccessible to the non-specialist.

      Yes, the first portion of the manuscript focuses on DNa01 and DNa02. The latter part of the manuscript transitions to focus mainly on DNa02. 

      Our rationale is noted at the point in the manuscript where we make this transition, with the section titled “Steering toward internal goals”: “Having identified steering-related DNs, we proceeded to investigate the brain circuits that provide input to these DNs. Here we decided to focus on DNa02, as this cell’s activity is predictive of larger steering maneuvers.” When we say that DNa02 is predictive of larger steering maneuvers, we are referring to several specific results:

      - We obtain larger filter amplitudes for DNa02 versus DNa01 (Fig. 2A-C). This means that, just after a unit change in DN firing rate, we see on average a larger change in steering velocity for DNa02 versus DNa01.

      - The linear filter for DNa02 has a higher variance explained, as compared to DNa01 (Fig. 2D). This means that DNa02 is more predictive of steering.

      - The relationship between firing rate and rotational velocity (150 ms later) is steeper for DNa02 than for DNa01 (Fig. 2G). This means that, if we ignore dynamics and we just regress firing rate against subsequent rotational velocity, we see a higher-gain relationship for DNa02.
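      The static gain described in the last point can be illustrated with a lagged linear regression. The following is a minimal sketch, not the authors' analysis code: the function name `lagged_gain`, the toy data, and the 20 Hz sampling rate (so that 150 ms corresponds to 3 samples) are all assumptions made for illustration.

```python
def lagged_gain(rate, velocity, lag_samples):
    """Least-squares slope of velocity(t + lag) regressed on rate(t).

    A steeper slope corresponds to a higher-gain relationship between
    firing rate and subsequent rotational velocity.
    """
    # Pair each firing-rate sample with the velocity lag_samples later.
    x = rate[:len(rate) - lag_samples]
    y = velocity[lag_samples:]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Toy data (hypothetical): velocity follows firing rate with a delay of
# 3 samples (150 ms at an assumed 20 Hz) and a gain of 2.
rate = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
velocity = [0.0, 0.0, 0.0] + [2.0 * r for r in rate[:5]]

gain = lagged_gain(rate, velocity, lag_samples=3)
print(gain)
```

      Comparing the slope returned for two cell types, computed at the same lag, is one way to express the statement that one cell operates with higher gain than the other.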

      Our focus on DNa02 was also driven by connectivity considerations. In the same paragraph (the first paragraph in the section titled “Steering toward internal goals”), we note that “there are strong anatomical pathways from the central complex to DNa02”; the same is not true of DNa01. This point has also been noted by other investigators (Hulse et al. 2021).

      We don’t think this focus on DNa02 makes our work biased or inaccessible. Any study must balance breadth with depth. A useful general way to balance these constraints is to begin a study with a somewhat broader scope, and then narrow the study’s focus to obtain more in-depth information. Here, we began with comparative study of two cell types, and we progressed to the cell type that we found more compelling.

      (4) There seems to be a discrepancy with regard to what is emphasized in the main text and what is shown in Figures S3/S4 in relation to the role of these DNs in backward walking. There are only two sentences in the main text where these figures are cited.

      a) "DNa01 and DNa02 firing rate increases were not consistently followed by large changes in forward velocity (Figs. 1G and S3)."

      b) "We found that rotational velocity was consistently related to the difference in right-left firing rates (Fig. 3B). This relationship was essentially linear through its entire dynamic range, and was consistent across paired recordings (Fig. 3C). It was also consistent during backward walking, as well as forward walking (Fig. S4)." These main text sentences imply the role of the difference between left and right DNa02 in turning. However, the actual plots in the Figures S3 and S4 and their respective legends seem to imply a role in "backward walking". For instance, see this sentence from the legend of Figure S3 "When (ΔvoltageDNa02>>ΔvoltageDNa01), the fly is typically moving backward. When (firing rateDNa02>>firing rateDNa01), the fly is also often moving backward, but forward movement is still more common overall, and so the net effect is that forward velocity is small but still positive when (firing rateDNa02>>firing rateDNa01). Note that when we condition our analysis on behavior rather than neural activity, we do see that backward walking is associated with a large firing rate differential (Fig. S4)." This sort of discrepancy in what is emphasized in the text, versus what is emphasized in the figures, ends up confusing the reader. More importantly, I do not agree with any of these conclusions regarding the implication of backward walking. Both Figures S3 and S4 are riddled with caveats, misinterpretations, and small sample sizes. As a result, I actually support the authors' decision to not infer too much from these figures in the "main text". In fact, I would recommend going one step further and removing/modifying these figures to focus on the role of "rotational velocity". Please find my concerns about these two figures below:

      a) In Figures S3 and S4, every heat map has a different scale for the same parameter: forward velocity. S3A is -10 to +10 mm/s, S3B is -6 to +6, S4B (left) is -12 to +12, and S4B (right) is -4 to +4. Since the authors are trying to depict results based on the color-coding, this is highly problematic.

      b) Figure S3A legend "When (ΔvoltageDNa02>>ΔvoltageDNa01), the fly is typically moving backward." There are also several instances when ΔvoltageDNa02= ΔvoltageDNa01 and both are low (lower left quadrant) when the fly is typically moving backwards. So in my opinion, this figure in fact suggests DNa02 has no role in backward velocity control.

      c) Based on the example traces in S4A, every time the fly walks backwards it is also turning. Based on this it is important to show absolute rotational velocity in Figure S4C. It could be that the fly is turning around the backward peak which would change the interpretation from Figure S4C. Also, it is important to note that the backward velocities in S4A are unprecedentedly high. No previous reports show flies walking backwards at such high velocities (for example see Chen et al 2018, Nat Comm. for backward walking velocities on a similar setup).

      d) In my opinion, Figure S4D showing that right-left DNa02 correlates with rotational velocity, regardless of whether the fly is in a forward or backward walking state, is the only important and conclusive result in Figures S3/S4. These figures should be rearranged to only emphasize this panel.

      We agree that it is difficult to interpret some of the correlations between DN activity and forward velocity, given that forward velocity and rotational velocity are themselves correlated to some degree. This is why we did not make claims based on these results in the main text. In response to these comments, we have taken the Reviewer’s suggestion to preserve Figure S4D (now Figure S3). The other components of these supplemental figures have been removed.

      (5) Figure 3 shows a really nice analysis of the bilateral DNa02 recordings data. While Figure S5 [now Figure S4] shows that authors have a similar dataset for DNa01, a similar level analysis (Figures 3D, E) is not done for DNa01 data. Is there a reason why this is not done?

      The reason we did not do the same analysis for DNa01 is that we only have two paired DNa01-DNa01 recordings. It turned out to be substantially more difficult to perform DNa01-DNa01 recordings, as compared to DNa02-DNa02 recordings. For this reason, we were not able to get more than two of these recordings.

      (6) In Figure 4 since the authors have trials where bump-jump led to turning in the opposite direction to the DNa02 being recorded, I wonder if the authors could quantify hyperpolarization in DNa02 as is predicted from connectomics data in Figure 7.

      We agree this is an interesting question. However, DNa02 firing rate and membrane potential are variable, and stimulus-evoked hyperpolarizations in these DNs tend to be relatively small (on the order of 1 mV, in the case of a contralateral fictive olfactory stimulus, Figure 5A). In the case of our fictive olfactory stimuli, we could look carefully for these hyperpolarizations because we had a very large number of trials, and we could align these trials precisely to stimulus onset. By contrast, for the bump-jump experiments, we have a more limited number of trials, and turning onset is not so tightly time-locked to the chemogenetic stimuli; for these reasons, we are hesitant to make claims about any bump-jump-related hyperpolarization in these trials.

      (7) Figure 6 suggests that DNa02 contains information about latent steering drives. This is really interesting. However, in order to unequivocally claim this, a higher-resolution postural analysis might be needed. Especially given that DNa02 activation does not reliably evoke ipsilateral turning, these "latent" steering events could actually contain significant postural changes driven by DNa02 (making them "not latent"). Without this information, at least the authors need to explicitly mention this caveat.

      This is a good point. We cannot exclude the possibility that DNa02 is driving postural changes when the fly is stopped, and these postural changes are so small we cannot detect them. In this case, however, there would still be an interesting mismatch between the stimulus-evoked change in DNa02 firing rate (which is large) and the stimulus-evoked postural response (which would be very small). We have added language to the relevant Results section in order to make this explicit.

      (8) Figure 7 would really benefit from connectome data with synapse numbers (or weighted arrows) and a corresponding analysis of DNa01.

      In response to this comment, we have added synapse number information (represented by weighted arrows) to Figures 7C, E, and F. We also added information to the Methods to explain how cells were chosen for inclusion in this diagram. (In brief: we thresholded these connections so as to discard connections with small numbers of synapses.)

      We did perform an analogous connectome circuit analysis for DNa01, but if we use the same thresholds as we do for DNa02, we obtain a much sparser connectivity graph. We now show this in a new supplemental figure (Figure S9). MBON32 makes no monosynaptic connections onto DNa01, and it only forms one disynaptic connection, via LAL018, which is relatively weak. PFL3 and PFL2 make no mono- or disynaptic connections onto DNa01 comparable in strength to what we find for DNa02. 

      The sparser connectivity graph for DNa01 is partly due to the fact that fewer cell types converge onto DNa01 as compared to DNa02 (110 cell types, versus 287 cell types). Also, it seems that DNa01 is simply less closely connected to the central complex and mushroom body, as compared to DNa02.

      (9) In Figure 8E, the most obvious neuronal silencing phenotype is decreased sideways velocity in the case of DNa01 optogenetic silencing. In Figure S2, the inverse filter for sideways velocity for DNa01 had a higher amplitude than the rotational velocity filter. Taken together, does this point at some role for DNa01 in sideways velocity specifically?

      No. The forward filters describe the average velocity impulse response, given a brief step change in firing rate.

      Figure 1 and Figure S2 show that the sideways velocity forward filter is actually smaller for DNa01 than for DNa02. This means that a brief step change in DNa01 firing rate is followed by only a very small sideways velocity response. Conversely, the reverse filters describe the average firing rate impulse response, given a brief step change in sideways velocity. Figure S2 shows that the sideways velocity reverse filter is larger for DNa01 than for DNa02, but this means that the relationship between DNa01 activity and sideways velocity is so weak that we would need to see a very large neural response in order to get a brief step change in sideways velocity. In other words, the reverse filter says that DNa01 likely has very little role in determining sideways velocity.

      (10) In Figure 8G, the effect on inner hind leg stance prolongation is very weak, and given the huge sample size, hard to interpret. Also, it is not clear how this fits with the role of DNa01 in slow sustained turning based on recordings.

      Yes, this effect is small in magnitude, which is not too surprising, given that many DNs seem to be involved in the control of steering in walking. To clarify the interpretation of these phenotypes, we have added a paragraph to the end of the Results:

      “All these effects are weak, and so they should be interpreted with caution. Also, both DN split-Gal4 lines drive expression in a few off-target cell types, which is another reason for caution (Fig. S8). However, they suggest that both DNs can lengthen the stance phase of the ipsilateral back leg, which would cause ipsiversive turning. These results are also compatible with a scenario where both DNs decrease the step length in the ipsilateral legs, which would also cause ipsiversive turning. Step frequency does not normally change asymmetrically during turning, so the observed decrease in step frequency during optogenetic inhibition may just be a by-product of increasing step length when these DNs are inhibited.” We have also added caveats and clarifications in a new Discussion paragraph:

      “Our study does not fully answer the question of how these DNs affect leg kinematics, because we were not able to simultaneously measure DN activity and leg movement. However, our optogenetic experiments suggest that both DNs can lengthen the stance phase of the ipsilateral back leg (Fig. 8G), and/or decrease the step length in the ipsilateral legs (Fig. 8H), either of which would cause ipsiversive turning. If these DNs have similar qualitative effects on leg kinematics, then why does DNa02 precede larger and more rapid steering events? This may be due to the fact that DNa02 receives stronger and more direct input from key steering circuits in the brain (Fig. S9). It may also relate to the fact that DNa02 has more direct connections onto motor neurons (Fig. 1B).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) I found the sign conventions for rotational velocity particularly confusing. Figure 3 represents clockwise rotations as +ve values, but Figure 4H represents anticlockwise rotations as positive values. But for EPG bumps, anticlockwise rotations are given negative values. Please make them consistent unless I am missing something obvious.

      Different fields use different conventions for yaw velocity. In aeronautics, a clockwise turn is generally positive. In robotics and engineering of terrestrial vehicles, a counterclockwise turn is generally positive. Historically, most Drosophila studies that quantified rotational (yaw) velocity were focused on the behavior of flying flies, and these studies generally used the convention from aeronautics, where a clockwise turn is defined as a positive turn. When we began working in the field, we adopted this convention, in order to conform to previous literature. It might be argued that walking flies are more like robots than airplanes, but it seemed to us that it was confusing to have different conventions for different behaviors of the same animal. Thus, all of the published studies from our lab define clockwise rotation as having positive rotational velocity.

      Figure 4 focuses on the role of the central complex in steering. As the fly turns clockwise (rightward), the bump of activity in EPG neurons normally moves counterclockwise around the ellipsoid body, as viewed from the posterior side (Turner-Evans et al., 2017). The posterior view is the conventional way to represent these dynamics, because (1) we and others typically image the brain from the posterior side, not the anterior side, and (2) in a posterior view, the animal’s left is on the left side of the image, and vice versa. We have added a sentence to the Figure 4A legend to clarify these points.

      Previous work has shown that, when an experimenter artificially “jumps” the EPG bump, this causes the fly to make a compensatory turn that returns the bump to (approximately) its original location (Green et al., 2019). Our work supports this observation. Specifically, we find that clockwise bump jumps are generally followed by rightward turns (which drive the bump to return to its approximate original location via a counterclockwise path), and vice versa. This is noted in the Figure 4D legend. Note that Figure 4D plots the fly’s rotational velocity during the bump return, plotted against the initial bump jump. 

      Figure 4H shows that clockwise (blue) bump returns were typically preceded by leftward turning, and counterclockwise (green) bump returns were typically preceded by rightward turning, as expected. This is detailed in the Figure 4H legend, and it is consistent with the coordinate frame described above.

      (2) It would be helpful to have images of the DNa01 and DNa02 split lines used in this paper, considering this paper would most likely be used widely to describe the functions of these neurons. Similarly, images of their reconstructions would be a useful addition.

      High-quality three-dimensional confocal stacks of all the driver lines used in our study are publicly available. We have added this information to the Methods (under “Fly husbandry and genotypes”). Confocal images of the full morphologies of DNa01 and DNa02 have been previously published (Namiki et al., 2018). Figure 1A is a schematic that is intended to provide a quick visual summary of this information.

      EM reconstructions of DNa01 and DNa02 are publicly accessible in a whole-brain dataset (https://codex.flywire.ai/) and a whole-VNC dataset (https://neuprint.janelia.org/). Both datasets are referenced in our study. As these datasets are easy to search and browse via user-friendly web-based tools, we expect that interested readers will have no difficulty accessing the underlying datasets directly.

      Reviewer #2 (Recommendations for the authors):

      (1) The description of the activity of the DNs that they "PREDICT steering during walking". This is an interesting word choice. Not causes, not correlates with, not encodes... does that mean the activity always precedes the action? Does that mean when you see activity, you will get behavior? This is important for assessing whether the DN activity is a cause or an effect. It is good to be cautious but it might be worth expanding on exactly what kind of connection is implied to justify the use of the word 'predict'.

      Conventionally, “predict” means “to indicate in advance”. We write that DNs “predict” certain features of behavior. We use this term because (1) these DNs correlate with certain features of behavior, and (2) changes in DN activity precede changes in behavior.

      The notion that neurons can “predict” behavior is not original to our study. Whenever neuroscientists summarize the relationship between neural activity and behavior by fitting a mathematical model (which may be as simple as a linear regression), the fitted model can be said to represent a “prediction” of behavior. These models are evaluated by comparing their predictions with measured behaviors. A good model is predictive, but it also implies that the underlying neural signal is also predictive (Levenstein et al., 2023 Journal of Neuroscience 43: 1074-1088; DOI: 10.1523/JNEUROSCI.1179-22.2022). Here, prediction simply means correlation, without necessarily implying causation. We also use “prediction” to imply correlation.

      We do not think the term “prediction” implies determinism. Meteorologists are said to predict the weather, but it is understood that their predictions are probabilistic, not deterministic. Certainly, we would not claim that there is a deterministic relationship between DN activity and behavior. Figure 2D shows that neither DN type can explain all the variance in the fly’s rotational or sideways velocity. At the same time, both DNs have significant predictive power.

      We might equally say that these DNs “encode” behavior. We have chosen to use the word “predict” rather than “encode” because we do not think it is necessary to use the framework of symbolic communication in connection with these DNs.

      We agree with the Reviewer that it is helpful to test whether any neuron that “predicts” a behavior might also “cause” this behavior. In Figure 8, we show that directly perturbing these DNs can indeed alter locomotor behavior, which suggests a causal role. Connectome analyses also suggest a causal role for these DNs in locomotor behavior (Figure 1B; see especially Cheong et al., 2024).

      At the same time, it is clear from our results that these DNs are not “command neurons” for turning: they do not deterministically cause turning. Therefore, to avoid misunderstanding, we have generally been careful to summarize the results of our perturbation experiments by avoiding the statement that “this DN causes this behavior”. Rather, we have generally tried to say that “this DN influences this behavior”, or “this DN promotes this behavior”.

      (2) There is some concern about how the linear filter models were developed and then used to predict the relationship between firing rate and steering behavior: how exactly were the build and test data separated to avoid re-extracting the input? It reads like a self-fulfilling prophecy/tautology.

      We used conventional cross-validation for model fitting and evaluation. We apologize that this was not made explicit in our original submission; this was due to an oversight on our part. To be clear: linear filters were computed using the data from the first 20% of a given experiment. We then convolved each cell’s firing rate estimate with the computed Neuron→Behavior filter (the “forward filter”) using the data from the final 80% of the experiment, in order to generate behavioral predictions. Thus, when a model has high variance explained, this is not attributable to overfitting: rather, it quantifies the bona fide predictive power of the model. We have added this information to the Methods (under “Data analysis - Linear filter analysis”).
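      The split described above (fit on the first 20% of an experiment, evaluate on the final 80%) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the data are synthetic, the function names are hypothetical, and the filter is estimated by normalized cross-correlation, which is only a reasonable estimator when the firing-rate signal is roughly uncorrelated in time. The estimator actually used in the study is not specified in this response.

```python
import random


def fit_forward_filter(rate, velocity, n_lags):
    """Estimate a Neuron->Behavior filter by normalized cross-correlation.

    Assumption: the firing-rate signal is approximately white, so that
    cov(rate(t), velocity(t + lag)) / var(rate) approximates the filter.
    """
    n = len(rate)
    mean_r = sum(rate) / n
    mean_v = sum(velocity) / n
    var_r = sum((r - mean_r) ** 2 for r in rate) / n
    filt = []
    for lag in range(n_lags):
        pairs = zip(rate[:n - lag], velocity[lag:])
        cov = sum((r - mean_r) * (v - mean_v) for r, v in pairs) / (n - lag)
        filt.append(cov / var_r)
    return filt, mean_r, mean_v


def predict_velocity(rate, filt, mean_r, mean_v):
    """Convolve the mean-subtracted firing rate with the fitted filter."""
    predicted = []
    for t in range(len(rate)):
        acc = mean_v
        for lag, weight in enumerate(filt):
            if t - lag >= 0:
                acc += weight * (rate[t - lag] - mean_r)
        predicted.append(acc)
    return predicted


def variance_explained(actual, predicted):
    """Fraction of variance in `actual` captured by `predicted`."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Synthetic experiment (hypothetical numbers): velocity is a filtered,
# noisy copy of the firing rate.
rng = random.Random(0)
true_filter = [0.0, 0.5, 1.0, 0.5]
rate = [rng.gauss(20.0, 5.0) for _ in range(5000)]
velocity = []
for t in range(len(rate)):
    drive = sum(w * (rate[t - lag] - 20.0)
                for lag, w in enumerate(true_filter) if t - lag >= 0)
    velocity.append(drive + rng.gauss(0.0, 2.0))

# Fit on the first 20% of the data, evaluate on the held-out final 80%.
split = len(rate) // 5
filt, mean_r, mean_v = fit_forward_filter(rate[:split], velocity[:split], 4)
held_out_prediction = predict_velocity(rate[split:], filt, mean_r, mean_v)
r2 = variance_explained(velocity[split:], held_out_prediction)
print(filt, r2)
```

      Because the variance explained is computed only on samples that were not used to fit the filter, a high value cannot be produced simply by re-extracting the training data.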

      (3) Typo right above Figure 2 [now Figure 1E]: I assume spike rate fluctuations in DNa02 precede DNa01?

      Fixed. Thank you for reading the manuscript carefully.

      (4) The description of the other manuscripts about neural control of the steering as "follow-up" papers is a bit diminishing. They were likely independent works on a similar theme that happened afterwards, rather than deliberate extensions of this paper, so "subsequent" might be a more accurate description.

      We apologize, as we did not intend this to be diminishing. Given this request, we have revised “follow-up” to “subsequent”.

      (5) The idea that DNa02 is high-gain because it is more directly connected to motor neurons is a hypothesis and this should be made clear. We really don't know the functional consequences of the directness of a path or the number of synapses, and which circuits you compare to would change this. DNa02 may be a higher gain than DNa01, but what about relative to the other DNs that enter pre-motor regions? How do you handle a few synapses and several neurons in a common class? All of these connectivity-based deductions await functional tests - like yours! I think it is better to make this clear so readers don't assume a higher level of certainty than we have.

      The Reviewer asks how we handled few-synapse connections, and how we combined neurons in the same class. We apologize for not making this explicit in our original submission. We have now added this information to the Methods. Briefly, to select cell types for inclusion in Figure 7C, we identified all individual cells postsynaptic to PFL3 and presynaptic to DNa02, discarding any unitary connections with <5 synapses. We then grouped unitary connections by cell type and summed all synapse numbers within each connection group (e.g., summing all synapses in all PFL3→LAL126 connections). We then discarded connection groups having <200 synapses or <1% of a cell type’s pre- or postsynaptic total. Reported connection weights are per hemisphere, i.e. half of the total within each connection group. For Figure 7F we did the same, but now discarding connection groups having <70 synapses or <0.4% of a cell type’s pre- or postsynaptic total. In Figure S9, we used the same procedures for analyzing connections onto DNa01.
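The thresholding steps above can be sketched as follows. This is one possible reading of the procedure, not the authors' code: it assumes a connection group is discarded if its summed count is below the absolute threshold, or below the fractional threshold of either the presynaptic type's output total or the postsynaptic type's input total. All synapse counts, the totals, and the cell type name `TypeX` are made up for illustration.

```python
from collections import defaultdict

def filter_connection_groups(unitary, pre_totals, post_totals,
                             min_unitary=5, min_group=200, min_fraction=0.01):
    """Group unitary connections by cell type and apply the thresholds.

    unitary: iterable of (pre_type, post_type, n_synapses), one per cell pair.
    pre_totals / post_totals: total output / input synapses per cell type.
    Returns {(pre_type, post_type): per-hemisphere weight}.
    """
    groups = defaultdict(int)
    for pre_type, post_type, n in unitary:
        if n >= min_unitary:              # drop unitary connections with <5 synapses
            groups[(pre_type, post_type)] += n

    kept = {}
    for (pre_type, post_type), total in groups.items():
        too_few = total < min_group
        too_small = (total < min_fraction * pre_totals[pre_type]
                     or total < min_fraction * post_totals[post_type])
        if not (too_few or too_small):
            kept[(pre_type, post_type)] = total / 2   # per hemisphere
    return kept

# Hypothetical counts, for illustration only.
unitary = [
    ("PFL3", "LAL126", 120),
    ("PFL3", "LAL126", 100),
    ("PFL3", "LAL126", 3),    # below the unitary threshold, ignored
    ("PFL3", "TypeX", 40),
    ("PFL3", "TypeX", 30),    # group total 70, below the group threshold
]
pre_totals = {"PFL3": 5000}
post_totals = {"LAL126": 4000, "TypeX": 3000}

result = filter_connection_groups(unitary, pre_totals, post_totals)
print(result)
```

Calling the same function with `min_group=70` and `min_fraction=0.004` would correspond to the looser thresholds described for Figure 7F.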

      We agree that it is tricky to infer function from connectome data, and this applies to motor neuron connectivity. We bring up DN connectivity onto motor neurons in two places. First, in the Results, we note that “steering filters (i.e., rotational and sideways velocity filters) were larger for DNa02 (Fig. 2A,B). This means that an impulse change in firing rate predicts a larger change in steering for this neuron. In other words, this result suggests that DNa02 operates with higher gain. This may be related to the fact that DNa02 makes more direct output synapses onto motor neurons (Fig. 1B) [emphasis added].” We feel this is a relatively conservative statement.

      Subsequently, in the Discussion, we ask, “why does DNa02 precede larger and more rapid steering events? This may be due to the fact that DNa02 receives stronger and more direct input from key steering circuits in the brain (Fig. S9). It may also relate to the fact that DNa02 has more direct connections onto motor neurons (Fig. 1B) [emphasis added].” Again, we feel this is a relatively conservative statement.

      To be sure, none of the motor neurons postsynaptic to DNa02 actually receive most of their synaptic input from DNa02 (or indeed any DN), and this is typical of motor neurons controlling leg muscles. Instead, leg motor neurons tend to get most of their input from interneurons rather than from DNs (Cheong et al. 2024). Available data suggest that the walking rhythm originates with intrinsic VNC central pattern generators, and the DNs that influence walking do so, in large part, by acting on VNC interneurons. These points have been detailed in recent connectome analyses (see especially Cheong et al. 2024).

      We are reluctant to broaden the scope of our connectome analyses to include other DNs for comparison, because we think these analyses are most appropriate to full-central-nervous-system-(CNS)-connectomes (brain and VNC together), which are currently under construction. Without a full-CNS-connectome, many of the DN axons in the VNC cannot be identified. In the future, we expect that full-CNS-connectomes will allow a systematic comparison of the input and output connectivity of all DN types, and probably also the tentative identification of new steering DNs. Those future analyses should generate new hypotheses about the specializations of DNa02, DNa01, and other DNs. Our study aims to help lay a conceptual foundation for that future work.

      (6) Given the emphasis on the DNa02 to Motor Neuron connectivity shown (Figure 1B) and multiple text mentions, could you include more analyses of which motor neurons are downstream and how these might be expected to affect leg movements? I would like to see the synapse numbers (Figure 1B) as well as the fraction of total output synapses. These additions would help understand the evidence for the "see-saw" model.

      We agree this is interesting. In follow-up work from our lab (Yang et al., 2023), we describe the detailed VNC connectivity linking DNa02 to motor neurons. We refer the Reviewer specifically to Figure 7 of that study (https://www.cell.com/cell/fulltext/S0092-8674(24)00962-0).

      We regret that the see-saw model was perhaps not clear in our original submission. Briefly, this model proposes that an increase in excitatory synaptic input to one DN (and/or a disinhibition of that DN) is often accompanied by an increase in inhibitory synaptic input to the contralateral DN. This model is motivated by connectome data on the brain inputs to DNa02 (Figure 7), along with our observation that excitation of one DN is often accompanied by inhibition of the contralateral DN (Figure 5). We have now added text to the Results in several places in order to clarify these points. 

      Because this model specifically pertains to the brain inputs to DNs, comparing the downstream targets of these DNs in the VNC would not be a test of this hypothesis. The Reviewer may be asking to see whether there is any connectivity in the brain from one DN to its contralateral partner. We do not find connections of this sort, aside from multisynaptic connections that rely on very weak links (~10 synapses per connection). Figure 7 depicts a much stronger basis for this hypothesis, involving feedforward see-saw connections from PFL3 and MBON32.

      (7) The conclusions from the data in Figure 8 could be explained more clearly. These seem like small effect sizes on subtle differences in leg movements - maybe like what was seen in granular control by Moonwalker's circuits? Measuring joint angles or step parameters might help clarify, but a summary description would help the reader.

      We agree that these results were not explained very well in our original submission. 

      In our revised manuscript, we have added a new paragraph to the end of this Results section providing some summary and interpretation:

      “All these effects are weak, and so they should be interpreted with caution. However, they suggest that both DNs can lengthen the stance phase of the ipsilateral back leg, which would promote ipsiversive turning. These results are also compatible with a scenario where both DNs decrease the step length in the ipsilateral legs, which would also promote ipsiversive turning. Step frequency does not normally change asymmetrically during turning, so the observed decrease in step frequency during optogenetic inhibition may just be a by-product of increasing step length when these DNs are inhibited.”

      Moreover, in the Discussion, we have also added a new paragraph that synthesizes these results with other results in our study, while also noting the limitations of our study:

      “Our study does not fully answer the question of how these DNs affect leg kinematics, because we were not able to simultaneously measure DN activity and leg movement. However, our optogenetic experiments suggest that both DNs can lengthen the stance phase of the ipsilateral back leg (Fig. 8G), and/or decrease the step length in the ipsilateral legs (Fig. 8H), either of which would promote ipsiversive turning. If these DNs have similar qualitative effects on leg kinematics, then why does DNa02 precede larger and more rapid steering events? This may be due to the fact that DNa02 receives stronger and more direct input from key steering circuits in the brain (Fig. S9). It may also relate to the fact that DNa02 has more direct connections onto motor neurons (Fig. 1B).”

      In Figure 8D-H, we measure step parameters in freely walking flies during acute optogenetic inhibition of DNa01 and DNa02. In experiments measuring neural activity in flies walking on a spherical treadmill, we did not have a way to measure step parameters. Subsequently, this methodology was developed by Yang et al. (2023) and results for DNa02 are described in that study. 

      Reviewer #3 (Recommendations for the authors):

      Minor Points:

      (1) If space allows, actual membrane potential should be mentioned when raw recordings are shown (for example Figure 1D).

      We have now added absolute membrane potential information to Figure 1D.

      (2) Typo in the sentence "To address this issue directly, we looked closely at the timing of each cell's recruitment in our dual recordings, and found that spike rate fluctuations in DNa02 typically preceded the spike rate fluctuations in DNa02 (Fig. 2A)." The final word should be "DNa01".

      Fixed. Thank you for reading the manuscript carefully.

      (3) Figure 2A - although there aren't direct connections between a01 and a02 in the connectome, the authors never rule out functional connectivity between these two. Given a02 precedes a01, shouldn't this be addressed?

      In the full brain FAFB data set, there are two disynaptic connections from DNa02 onto the ipsilateral copy of DNa01. One connection is via CB0556 (which is GABAergic), and the other is via LAL018 (which is cholinergic). The relevant DNa02 output connections are very weak: each DNa02→CB0556 connection consists of 11 synapses, whereas each DNa02→LAL018 connection consists of 10 synapses (on average). Conversely, each CB0556→DNa01 connection consists of 29 synapses, whereas each LAL018→DNa01 connection consists of 64 synapses. In short, LAL018 is a nontrivial source of excitatory input to DNa01, but DNa02 is not positioned to exert much influence over LAL018, and the two disynaptic connections from DNa02 onto DNa01 also have the opposite sign. Thus, it seems unlikely that DNa02 is a major driver of DNa01 activity. At the same time, it is difficult to completely exclude this possibility, because we do not understand the logic of the very complicated premotor inputs to these DNs in the brain. Thus, we are hesitant to make a strong statement on this point.

    1. Reviewer #1 (Public review):

      Summary:

      This study utilises fNIRS to investigate the effects of undernutrition on functional connectivity patterns in infants from a rural population in Gambia. fNIRS resting-state data recording spanned ages 5 to 24 months, while growth measures were collected from birth to 24 months. Additionally, executive functioning tasks were administered at 3 or 5 years of age. The results show an increase in left and right frontal-middle and right frontal-posterior connections with age and, contrary to previous findings in high-income countries, a decrease in frontal interhemispheric connectivity. Restricted growth during the first months of life was associated with stronger frontal interhemispheric connectivity and weaker right frontal-posterior connectivity at 24 months of age. Additionally, the study describes some connectivity patterns, including stronger frontal interhemispheric connectivity, which is associated with better cognitive flexibility at preschool age.

      Strengths:

      - The study analyses longitudinal data from a large cohort (n = 204) of infants living in a rural area of Gambia. This already represents a large sample for most infant studies, and it is impressive, considering it was collected outside the lab in a population that is underrepresented in the literature. The research question regarding the effect of early nutritional deficiency on brain development is highly relevant and may highlight the importance of early interventions. The study may also encourage further research on different underrepresented infant populations (i.e., infants not residing in Western high-income countries) or in settings where fMRI is not feasible.

      - The preprocessing and analysis steps are carefully described, which is very welcome in the fNIRS field, where well-defined standards for preprocessing and analysis are still lacking.

      Weaknesses:

      - While the study provides a solid description of the functional connectivity changes in the first two years of life at the group level and investigates how restricted growth influences connectivity patterns at 24 months, it does not explore the links between adverse situations and developmental trajectories for functional connectivity. Considering the longitudinal nature of the dataset, it would have been interesting to apply more sophisticated analytical tools to link undernutrition to specific developmental trajectories in functional connectivity. The authors mention that they lack the statistical power to separate infants into groups according to their growing profiles. However, I wonder if this aspect could not have been better explored using other modelling strategies and dimensional reduction techniques. I can think about methods such as partial least squares correlation, with age included as a numerical variable and measures of undernutrition.

      - Connectivity was assessed in 6 large ROIs. While the authors justify this choice to reduce variability due to head size and optode placement, this also implies a significant reduction in spatial resolution. Individual digitalisation and co-registration of the optodes to the head model, followed by image reconstruction, could have provided better spatial resolution. This is not a weakness specific to this study but rather a limitation common to most fNIRS studies, which typically analyse data at the channel level since digitalisation and co-registration can be challenging, especially in complex setups like this. However, the BRIGHT project has demonstrated that it is possible and that differences in placement affect activation patterns, which become more localised when data is co-registered at the subject level (Collins-Jones et al., 2021). Could the co-registration of individual data have increased sensitivity, particularly given that longitudinal effects are being investigated?

      - I believe that a further discussion in the manuscript on the application of global signal regression and its effects could have been beneficial for future research and for readers to better understand the negative correlations described in the results. Since systemic physiological changes affect HbO/HbR concentrations, resulting in an overestimation of functional connectivity, regressing the global signal before connectivity computation is a common strategy in fNIRS and fMRI studies. However, the recommendation for this step remains controversial, likely depending on the case (Murphy & Fox, 2017). I understand that different reasons justify its application in the current study. In addition to systemic physiological changes originating from brain tissue, fNIRS recordings are contaminated by changes occurring in superficial layers (i.e., the scalp and skull). While having short-distance channels could have helped to quantify extracerebral changes, challenges exist in using them in infant populations, especially in a longitudinal study such as the one presented here. The optimal source-detector distance that minimises sensitivity to changes originating from the brain would increase with head size, and very young participants would require significantly shorter source-detector distances (Brigadoi & Cooper, 2015). Thus, having them would have been challenging. Under these circumstances (i.e., lack of short channels and external physiological measures), and considering that the amount the signal is affected by physiological noise (either coming from the brain or superficial tissue) might change through development, the choice of applying global signal regression is justified. Nevertheless, since the method introduces negative correlations in the data by forcing connectivity to average to zero, I believe a further discussion of these points would have enriched the interpretation of the results.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Cognitive and brain development during the first two years of life is extensive and decisive for later development. However, longitudinal infant studies are complicated and restricted to Western high-income countries. This study uses fNIRS to investigate the developmental trajectories of functional connectivity networks in infants from a rural community in Gambia. In addition to resting-state data collected from 5 to 24 months, the authors collected growth measures from birth until 24 months and administered an executive functioning task at 3 or 5 years old.

      The results show left and right frontal-middle and right frontal-posterior negative connections at 5 months that increase with age (i.e., become less negative). Interestingly, contrary to previous findings in high-income countries, there was a decrease in frontal interhemispheric connectivity. Restricted growth during the first months of life was associated with stronger frontal interhemispheric connectivity and weaker right frontal-posterior connectivity at 24 months. Additionally, the study reports that some connectivity patterns relate to better cognitive flexibility at pre-school age.

      Strengths:

      - The authors analyze data from 204 infants from a rural area of Gambia, already a big sample for most infant studies. The study might encourage more research on different underrepresented infant populations (i.e., infants not living in Western high-income countries).

      - The study shows that fNIRS is a feasible instrument to investigate cognitive development when access to fMRI is not possible or outside a lab setting.

      - The fNIRS data preprocessing and analysis are well-planned, implemented, and carefully described. For example, the authors report how the choices in the parameters for the motion artifacts detection algorithm affect data rejection and show how connectivity stability varies with the length of the data segment to justify the threshold of at least 250 seconds free of artifacts for inclusion.

      - The authors use proper statistical methods for analysis, considering the complexity of the dataset.

      We thank the reviewer for highlighting the strengths of this work.

      Weaknesses:

      - No co-registration of the optodes is implemented. The authors checked for correct placement by looking at pictures taken during the testing session. However, head shape and size differences might affect the results, especially considering that the study involves infants from 5 months to 24 months and that the same fNIRS array was used at all ages.

      The fNIRS array used in this work was co-registered onto age-appropriate MNI templates at every time point in previously published work (L. H. Collins-Jones et al., Longitudinal infant fNIRS channel-space analyses are robust to variability parameters at the group-level: An image reconstruction investigation. Neuroimage 237, 118068 (2021)). This is reference No. 68 in the manuscript.

      As we mentioned in the section fNIRS preprocessing and data-analysis: ‘The sections were established via the 17 channels of each hemisphere which were grouped into front, middle and back (for a total of six regions) based on a previous co-registration of the BRIGHT fNIRS arrays onto age-appropriate templates’. The procedure mentioned by the reviewer, involving the examination of pictures showing the placement of headbands on participants, aimed to exclude infants with excessive cap displacement from further analysis.

      - The authors regress the global signal to remove systemic physiological noise. While the authors also report the changes in connectivity without global signal regression, there are some critical differences. In particular, the apparent decrease in frontal inter-hemispheric connections is not present when global signal regression is omitted, even though it is present for deoxy-Hb. The authors use connectivity results obtained after applying global signal regression for further analysis. The choice of regressing the global signal is questionable since it has been shown to introduce anti-correlations in fMRI data (Murphy et al., 2009), and fNIRS in young infants does not seem to be highly affected by physiological noise (Emberson et al., 2016). Systemic physiological noise might change at different ages, which makes its removal critical to investigate functional network development. However, global signal regression might also affect the data differently. The study would have benefited from having short separation channels to measure the systemic physiological component in the data.

      The work of Emberson et al. (2016) mentioned by the reviewer indeed highlights the challenges of removing systemic changes from the infants’ haemodynamic signal with short-channel separation (SSC). In fact, even an SSC of 1 cm detected changes in the blood in the brain; therefore, by regressing this signal from the recorded one, the authors removed both systemic changes AND haemodynamic signal. This paper from Emberson et al. (2016) is taken as a reference in the field to suggest that SSC might not be an ideal tool to remove systemic changes when collecting fNIRS data on young infants, as we did in this work.

      We agree with the reviewer's observation that systemic physiological noise may vary with age and among infants. Therefore, for each infant at each age, we regressed the mean value calculated across all channels. This ensures that the regressed signal is not biased by averaged calculations at group levels.
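The per-recording regression described above, and the reviewer's concern that it pushes correlations toward negative values, can be illustrated with a minimal sketch on synthetic data. The channel count, noise levels, and function names below are assumptions for illustration and do not reproduce the authors' pipeline; ordinary least squares with an intercept is assumed for the regression step.

```python
import random
from itertools import combinations

def mean(v):
    return sum(v) / len(v)

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def regress_out(x, g):
    """Residual of x after ordinary least-squares regression on g (with intercept)."""
    mx, mg = mean(x), mean(g)
    beta = sum((a - mx) * (b - mg) for a, b in zip(x, g)) / sum((b - mg) ** 2 for b in g)
    return [(a - mx) - beta * (b - mg) for a, b in zip(x, g)]

def global_signal_regression(channels):
    """Regress the across-channel mean out of every channel of one recording."""
    global_signal = [mean(sample) for sample in zip(*channels)]
    return [regress_out(ch, global_signal) for ch in channels]

def mean_pairwise_corr(channels):
    return mean([pearson(a, b) for a, b in combinations(channels, 2)])

# One synthetic recording: a shared "systemic" component plus channel noise.
rng = random.Random(1)
n_samples, n_channels = 500, 4
systemic = [rng.gauss(0, 1) for _ in range(n_samples)]
channels = [[s + rng.gauss(0, 0.5) for s in systemic] for _ in range(n_channels)]

cleaned = global_signal_regression(channels)
before = mean_pairwise_corr(channels)
after = mean_pairwise_corr(cleaned)

# After regression, the residuals sum to (numerically) zero across channels
# at every time point, which is what shifts the correlation distribution.
max_channel_mean = max(abs(mean(sample)) for sample in zip(*cleaned))
print(before, after, max_channel_mean)
```

In this toy case the shared component inflates the average correlation before regression, and the zero-sum constraint makes the average correlation negative afterwards, which is why negative values after global signal regression need careful interpretation.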

      We are aware of the criticisms directed towards global signal regression in the fMRI literature, although some other works showed anticorrelations in functional connectivity networks both with and without global signal regression (Chai et al., 2012). Furthermore, Murphy himself revised his criticism of the use of global signal regression in functional connectivity analysis in one of his more recent works (Murphy & Fox, 2017). The fact that the decreased FC is significant in results from data pre-processed without global signal regression gives us confidence that this finding is statistically robust and not solely driven by this preprocessing choice in our pipeline.

      An interesting study by Abdalmalak et al. (2022) demonstrated that failing to correct for systemic changes using any method is inappropriate when estimating FC with fNIRS, as it can lead to a high risk of elevated connectivity across the whole brain (see Figure 4 of the mentioned paper). Consequently, we strongly advocate for the implementation of global signal regression in our analysis pipeline as a fundamental step for accurate functional connectivity estimations.

      References:

      Emberson, L. L., Crosswhite, S. L., Goodwin, J. R., Berger, A. J., & Aslin, R. N. (2016). Isolating the effects of surface vasculature in infant neuroimaging using short-distance optical channels: a combination of local and global effects. Neurophotonics, 3(3), 031406-031406.

      Chai, X. J., Castañón, A. N., Öngür, D., & Whitfield-Gabrieli, S. (2012). Anticorrelations in resting state networks without global signal regression. NeuroImage, 59(2), 1420–1428. https://doi.org/10.1515/9783050076010-014

      Murphy, K., & Fox, M. D. (2017). Towards a consensus regarding global signal regression for resting state functional connectivity MRI. NeuroImage, 154(November 2016), 169–173. https://doi.org/10.1016/j.neuroimage.2016.11.052

      Abdalmalak, A., Novi, S. L., Kazazian, K., Norton, L., Benaglia, T., Slessarev, M., ... & Owen, A. M. (2022). Effects of systemic physiology on mapping resting-state networks using functional near-infrared spectroscopy. Frontiers in neuroscience, 16, 803297.

      - I believe the authors bypass a fundamental point in their framing. When discussing the results, the authors compare the developmental trajectories of the infants tested in a rural area of Gambia with the trajectories reported in previous studies on infants growing in occidental high-income countries (likely in urban contexts) and attribute the differences to adverse effects (i.e., nutritional deficits). Differences in developmental trajectories might also derive from other environmental and cultural differences that do not necessarily lead to poor cognitive development.

      We agree with the reviewer that other factors differing between high- and low-resource settings might have an impact on FC trajectories. We therefore specified this in the discussion as follows: “We acknowledge that differences in FC could also be attributed to other environmental and cultural disparities between high-resource and low-resource settings, and future studies are needed to investigate this further” (line 238).

      - While the study provides a solid description of the functional connectivity changes in the first two years of life at the group level, the evidence regarding the links between adverse situations, developmental trajectories, and later cognitive capacities is weaker. The authors find that early restricted growth predicts specific connectivity patterns at 24 months and that certain connectivity patterns at specific ages predict cognitive flexibility. However, the link between development trajectories (individual changes in connectivity) with growth and later cognitive capacities is missing. To address this question adequately, the study should have compared infants with different growing profiles or those who suffered or did not from undernutrition. However, as the authors discussed, they lacked statistical power.

      We agree with the reviewer, and indeed we highlighted this as one of the main limitations of our work: “Even given the large sample in our study, we were underpowered to test for group comparisons between sets of infants with distinct undernutrition growth profiles, e.g., infants with early poor growth that later resolved and infants with standard growth early that had a poor growth later. We were also underpowered to test the associations between early growth and FC on clinically undernourished infants (defined as having DWLZ two standard deviations below the mean)” (line 311, discussion section).

      We believe this is an important point to consider for the field, as it addresses the sample size required for studies investigating brain development in clinically malnourished infants. We hope this will serve as a valuable reference for future studies in the field. For example, a new study led by Prof. Sophie Moore and other members of the BRIGHT team (INDiGO) is currently recruiting six hundred pregnant women with the aim of obtaining a broader distribution of infants’ growth measures (https://www.kcl.ac.uk/research/sophie-moore-research-group).

      Reviewer #2 (Public Review):

      Summary and strengths:

      The article pertains to a topic of importance, specifically early life growth faltering, a marker of undernutrition, and how it influences brain functional connectivity and cognitive development. In addition, the data collection was laborious, and data preprocessing was quite rigorous to ensure data quality, utilizing cutting-edge preprocessing methods.

      We thank the reviewer for highlighting the strengths of this work.

      Weaknesses:

      However, the subsequent analysis and explanations were not very thorough, which made some results and conclusions less convincing. For example, corrections for multiple tests need to be consistently maintained; if the results do not survive multiple corrections, they should not be discussed as significant results. Additionally, alternative plans for analysis strategies could be worth exploring, e.g., using ΔFC in addition to FC at a certain age. Lastly, some analysis plans lacked a strong theoretical foundation, such as the relationship between functional connectivity (FC) between certain ROIs and the development of cognitive flexibility.

      Thus, as much as I admire the advanced analysis of connectivity that was conducted and the uniqueness of longitudinal fNIRS data from these samples (even the sheer effort to collect fNIRS longitudinally in a low-income country at such a scale!), I have reservations about the importance of this paper's contribution to the field in its present form. Major revisions are needed, in my opinion, to enhance the paper's quality. 

      We acknowledge the reviewer’s concern regarding the reporting of results that do not survive multiple comparisons. However, considering the uniqueness of our dataset and the novelty of our work, we believe it is crucial to report all significant findings as well as hypothesis-generating findings that may not pass stringent significance thresholds. We have taken great care to transparently distinguish between results that survived multiple comparisons and those that did not in both the Results and Discussion sections, ensuring that readers are not misled. It is possible that future studies may replicate and further strengthen these associations. Therefore, by sharing these results with the research community, we provide valuable insights for future investigations.

      The relationship between FC and cognitive flexibility (as well as the relationship between growth and FC) has been explored focusing on those FC that showed a significant change with age, as specified in the results sections: ‘To investigate the impact of early nutritional status on FC at 24 months, we used multiple regression with the infant growth trajectory [...] and FC at 24 months [...]. To maximise power, we considered only those FC that showed a statistically significant change with age’ (line 183) and ‘To investigate whether FC early in life predicted cognitive flexibility at preschool age, we used multiple regression of FC across the first two years of life against later cognitive flexibility in preschoolers at three and five years. As per the analysis above, we focused on only those FC that showed a statistically significant change with age’ (line 198).

      We explored the possibility of investigating the relationship between changes in FC and changes in growth. However, the degrees of freedom in these analyses dropped dramatically (~25/30), thereby putting the significance and the meaning of the results at risk. We look forward to future longitudinal studies with less attrition across these time points to maintain the statistical power necessary to run such analyses.

      Reviewer #3 (Public Review):

      Summary:

      This study aimed to investigate whether the development of functional connectivity (FC) is modulated by early physical growth and whether these might impact cognitive development in childhood. This question was investigated by studying a large group of infants (N=204) assessed in Gambia with fNIRS at 5 visits between 5 and 24 months of age. Given the complexity of data acquisition at these ages and following data processing, data could be analyzed for 53 to 97 infants per age group. FC was analyzed considering 6 ensembles of brain regions and thus 21 types of connections. Results suggested that: i) compared to previously studied groups, this group of Gambian infants has a different FC trajectory, in particular with a change in frontal inter-hemispheric FC with age from positive to null values; ii) early physical growth, measured through weight-for-length z-scores from birth on, is associated with FC at 24 months. Some relationships were further observed between FC during the first two years and cognitive flexibility at 4-5 years of age, but results did not survive corrections for multiple comparisons.
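The count of 21 connection types follows from pairing six regions with each other and with themselves: 15 between-region pairs plus 6 within-region connections. A short sketch makes the enumeration explicit; the region labels are assumptions based on the front, middle, and back grouping per hemisphere mentioned in the author response.

```python
from itertools import combinations_with_replacement

# Hypothetical labels: three sections (front, middle, back) in each hemisphere.
regions = [f"{side}_{part}" for side in ("left", "right")
           for part in ("front", "middle", "back")]

# Unordered pairs, including each region paired with itself.
connections = list(combinations_with_replacement(regions, 2))
within = [c for c in connections if c[0] == c[1]]
between = [c for c in connections if c[0] != c[1]]
print(len(connections), len(within), len(between))
```

The number of connection types also sets the size of the family of tests that any multiple-comparison correction has to account for.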

      Strengths:

      The question investigated in this article is important for understanding the role of early growth and undernutrition on brain and behavioral development in infants and children. The longitudinal approach considered is highly relevant to investigate neurodevelopmental trajectories. Furthermore, this study targets a little-studied population from a low-/middle-income country, which was made possible by the use of fNIRS outside the lab environment. The collected dataset is thus impressive and it opens up a wide range of analytical possibilities.

      We thank the reviewer for highlighting the strengths of this work.

      Weaknesses:

      - Analyzing such a huge amount of collected data at several ages is not an easy task to test developmental relationships between growth, FC, and behavioral capacities. In its present form, this study and the performed analyses lack clarity, unity and perhaps modeling, as it suggests that all possible associations were tested in an exploratory way without clear mechanistic hypotheses. Would it be possible to specify some hypotheses to reduce the number of tests performed? In particular, considering metrics at specific ages or changes in the metrics with age might allow us to test different hypotheses: the authors might clarify what they expect specifically for growth-FC-behaviour associations. Since some FC measures and changes might be related to one another, would it be reasonable to consider a dimensionality reduction approach (e.g., ICA) to select a few components for further correlation analyses?

      We confirm that this work was motivated by a compelling theoretical question: whether neural mechanisms, specifically FC, can be influenced by early adversity, such as poor growth, and subsequently impact cognitive outcomes, such as cognitive flexibility. This aligns with the overarching goal of the BRIGHT project, established in 2015 (Lloyd-Fox, 2023). We believe this was evident throughout the manuscript in several instances, for example:

      - “The goal of the study was to investigate early physical growth in infancy, developmental trajectories of brain FC across the first two years of life, and cognitive outcome at school age in a longitudinal cohort of infants and children from rural Gambia, an environment with high rates of maternal and child undernutrition. Specifically, we aimed to: (i) investigate whether differences in physical growth through the first two years of life are related to FC at 24 months, and (ii) investigate if trajectories of early FC have an impact on cognitive outcome at pre-school age in these children.” (page 4, introduction)

      - “This study investigated how early adversity via undernutrition drives longitudinal changes in brain functional connectivity at five time points throughout the first two years of life and how these developmental trajectories are associated with cognitive flexibility at preschool age.” (page 6, discussion)

      - We had a clear hypothesis regarding short-range connectivity decreasing with age and long-range connectivity increasing with age, as stated at the end of the introduction: “We hypothesized that (i) long-range FC would increase and short-range FC would decrease throughout the first two years of life” (page 4, line 147). However, we were not able to formulate clear hypotheses about the localization of these connections due to the scarcity of previous studies conducted within this age range, particularly in low-resource settings. The ROI approach for analysis was chosen to mitigate this challenge by reducing the number of comparisons while still enabling us to estimate the developmental trajectories of all the connections from which we acquired data.

      Regarding the use of a dimensionality reduction approach, we did not consider ICA in our analysis. These methods require selecting a fixed number of components to remove from all participants. However, due to the high variability of infant fNIRS data across the five timepoints, we considered it untenable to precisely determine the number of components to remove at the group level. Such a procedure carries the risk of over-cleaning the data for some participants while leaving noise in for others (Di Lorenzo, 2019). We also felt that using PCA in this initial study would be beyond the scope of the brain-region-specific hypotheses and would be more appropriate in a follow-up analysis of these important data.

      References:

      Lloyd-Fox, S., McCann, S., Milosavljevic, B., Katus, L., Blasi, A., Bulgarelli, C., Crespo-Llado, M., Ghillia, G., Fadera, T., Mbye, E., Mason, L., Njai, F., Njie, O., Perapoch-Amado, M., Rozhko, M., Sosseh, F., Saidykhan, M., Touray, E., Moore, S. E., … & the BRIGHT Study Team (2023). The Brain Imaging for Global Health (BRIGHT) Study: Cohort Study Protocol. Gates Open Research, 7(126).

      Di Lorenzo, R., Pirazzoli, L., Blasi, A., Bulgarelli, C., Hakuno, Y., Minagawa, Y., & Brigadoi, S. (2019). Recommendations for motion correction of infant fNIRS data applicable to multiple data sets and acquisition systems. NeuroImage, 200, 511–527.

      - It seems that neurodevelopmental trajectories over the whole period (5-24 months) are little investigated, and considering more robust statistical analyses would be an important aspect to strengthen the results. The discussion mentions the potential use of structural equation modelling analyses, which would be a relevant way to better describe such complex data.

      We appreciate the complexity of the dataset we are working with, which includes multiple measures and time points. Currently, our focus within the outputs from the BRIGHT project is on examining the relationship between selected measures. While this may not involve statistically advanced modelling at the moment, it is worth noting that most of the results presented in this work have survived correction for multiple comparisons, indicating their statistical robustness. We believe that more advanced statistical analyses are beyond the scope of this rich initial study. In the next phase of the project, known as BRIGHT IMPACT, our team is collaborating with statisticians and experts in statistical modelling to apply more sophisticated and advanced statistical techniques to the data.

      - Given the number of analyses performed, only describing results that survive correction for multiple comparisons is required. Unifying the correction approach (FDR / Bonferroni) is also recommended. For the association between cognitive flexibility and FC, results are not significant, and one might wonder why FC at specific ages was considered rather than the change in FC with age. One of the relevant questions of such a study would be whether early growth and later cognitive flexibility are related through FC development, but testing this would require a mediation analysis that was not performed.

      We acknowledge the reviewer’s concern regarding the reporting of results that do not survive multiple comparisons. However, considering the uniqueness of our dataset and the novelty of our work, we believe it is crucial to report all significant findings. We have taken great care to transparently distinguish between results that survived multiple comparisons and those that did not in both the Results and Discussion sections, ensuring that readers are not misled. It is possible that future studies may replicate and further strengthen these associations. Therefore, by sharing these results with the research community, we provide valuable insights for future investigations.

      We did not perform a mediation analysis because i) ΔWLZ between birth and the subsequent time points positively predicted frontal interhemispheric FC at 24 months, whereas ii) it was frontal interhemispheric FC at 18 months (and right fronto-posterior connectivity at 24 months) that predicted cognitive flexibility at preschool age. Since the frontal interhemispheric FC at 24 months, which was positively predicted by growth, did not significantly predict cognitive outcome at preschool age, we did not fit mediation models.

      The reviewer raised concerns about using different methods to correct for multiple comparisons throughout the work. Results showing changes in FC with age were Bonferroni corrected, while we used FDR correction for the regression analyses investigating the relationship between growth and FC, as well as FC and cognitive flexibility. Both methods have good control over Type I errors (false positives), but Bonferroni is very conservative, increasing the likelihood of Type II errors (false negatives). We considered Bonferroni an appropriate method for correcting results showing changes in FC with age, where we had a large sample with strong statistical power (i.e. linear mixed models with 132 participants who had at least 250 seconds of good data for 2 out of 5 visits). However, Bonferroni was too conservative for the regression analyses, where N was between 57 and 78 (Acharya, 2014; Félix & Menezes, 2018; Narkevich et al., 2020; Narum, 2006; Olejnik et al., 1997).

      References:

      Acharya, A. (2014). A Complete Review of Controlling the FDR in a Multiple Comparison Problem Framework--The Benjamini-Hochberg Algorithm. ArXiv Preprint ArXiv:1406.7117.

      Félix, V. B., & Menezes, A. F. B. (2018). Comparisons of ten corrections methods for t-test in multiple comparisons via Monte Carlo study. Electronic Journal of Applied Statistical Analysis, 11(1), 74–91.

      Narkevich, A. N., Vinogradov, K. A., & Grjibovski, A. M. (2020). Multiple comparisons in biomedical research: the problem and its solutions. Ekologiya Cheloveka (Human Ecology), 27(10), 55–64.

      Narum, S. R. (2006). Beyond Bonferroni: less conservative analyses for conservation genetics. Conservation Genetics, 7, 783–787.

      Olejnik, S., Li, J., Supattathum, S., & Huberty, C. J. (1997). Multiple testing and statistical power with modified Bonferroni procedures. Journal of Educational and Behavioral Statistics, 22(4), 389–406.
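
      The trade-off between the two corrections described in the response above can be illustrated with a minimal sketch. The p-values below are invented for illustration and are not results from the study; the two functions are plain textbook implementations of Bonferroni and of the Benjamini-Hochberg step-up procedure, not the authors' analysis code.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for every p-value at or below alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= k / m * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Illustrative p-values only (hypothetical, not taken from the manuscript).
p = [0.001, 0.008, 0.012, 0.030, 0.200]

n_bonferroni = sum(bonferroni(p))
n_fdr = sum(benjamini_hochberg(p))
print(n_bonferroni, n_fdr)
```

      On this toy family Bonferroni retains fewer findings than FDR, which is the sense in which it is the more conservative of the two and the more prone to Type II errors in smaller samples.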

      - Growth is measured at different ages through different metrics. Justifying the use of weight-for-length z-scores would be welcome since weight-for-age z-scores might be a better marker of growth and possible undernutrition (this impacting potentially both weight and length). Showing the distributions of these z-scores at different ages would allow the reader to estimate the growth variability across infants.

      We consistently used WLZ as the metric to measure growth throughout. Our analysis investigating the relationship between WLZ and FC included HCZ at 7/14 days to correct for head size at birth. When selecting the best growth measure for this paper, we opted for WLZ over WAZ, given extant evidence that infants in our sample are smaller and shorter compared to the reference WHO standard for the same age group (Nabwera et al., 2017). Therefore, using WLZ allows us to adjust each infant's weight for its own length.

      References:

      Nabwera, H. M., Fulford, A. J., Moore, S. E., & Prentice, A. M. (2017). Growth faltering in rural Gambian children after four decades of interventions: a retrospective cohort study. The Lancet Global Health, 5(2), e208–e216.

      - Regarding FC, clarifications about the long-range vs short-range connections would be welcome, as well as drawing a summary of what is expected in terms of FC "typical" trajectory, for the different brain regions and connections, as a marker of typical development. For instance, the authors suggest that an increase in long-range connectivity vs a decrease in short-range is expected based on previous fNIRS studies. However anatomical studies of white matter growth and maturation would suggest the reverse pattern (short-range connections developing mostly after birth, contrarily to long-range connections prenatally).

      We expected an increase in long-range functional connectivity with age, as discussed in the introduction:

      - “Based on data from fMRI, current models hypothesize that FC patterns mature throughout early development (23–27), where in typically developing brains, adult-like networks emerge over the first years of life as long-range functional connections between pre-frontal, parietal, temporal, and occipital regions become stronger and more selective (28–31). This maturation in FC has been shown to be related to the cascading maturation of myelination and synaptogenesis (32, 33) - fundamental processes for healthy brain development (34)” (line 93, page 3, introduction);

      - “Importantly, normative developmental patterns may be disrupted and even reversed in clinical conditions that impact development; e.g., increased short-range and reduced long-range FC have been observed in preterm infants (36) and in children with autism spectrum disorder (37, 38)” (line 103, page 3, introduction);

      - “We hypothesized that (i) long-range FC would increase and short-range FC would decrease throughout the first two years of life” (line 147, page 4, introduction).

      Since inferences about FC patterns recorded with fNIRS are highly limited by the number and locations of the optodes, it is challenging to make strong inferences about specific brain regions. Moreover, infant FC fNIRS studies are still limited, which is why we focused our inferences on long-range versus short-range connectivity, without specifically pinpointing particular brain regions.

      Additionally, we were unable to locate the works mentioned by the reviewer regarding an increase in short-range white matter connectivity immediately after birth. On the contrary, we found several studies documenting an increase in white-matter long-range connectivity after birth, which is consistent with the hypothesised increase in FC long-range connectivity, such as:

      Yap, P. T., Fan, Y., Chen, Y., Gilmore, J. H., Lin, W., & Shen, D. (2011). Development trends of white matter connectivity in the first years of life. PloS one, 6(9), e24678.

      Dubois, J., Dehaene-Lambertz, G., Kulikova, S., Poupon, C., Hüppi, P. S., & Hertz-Pannier, L. (2014). The early development of brain white matter: a review of imaging studies in fetuses, newborns and infants. Neuroscience, 276, 48-71.

      Stephens, R. L., Langworthy, B. W., Short, S. J., Girault, J. B., Styner, M. A., & Gilmore, J. H. (2020). White matter development from birth to 6 years of age: a longitudinal study. Cerebral Cortex, 30(12), 6152-6168.

      Hagmann, P., Sporns, O., Madan, N., Cammoun, L., Pienaar, R., Wedeen, V. J., ... & Grant, P. E. (2010). White matter maturation reshapes structural connectivity in the late developing human brain. Proceedings of the National Academy of Sciences, 107(44), 19067-19072.

      Collin, G., & van den Heuvel, M. P. (2013). The ontogeny of the human connectome: development and dynamic changes of brain connectivity across the life span. The Neuroscientist, 19(6), 616-628. doi: 10.1177/1073858413503712.

      The authors test associations between FC and growth, but making sense of such modulation results is difficult without a clearer view of developmental changes per se (e.g., what does an early negative FC mean? Is it an increase in FC when the value gets close to 0? In particular, at 24m, it seems that most FC values are not significantly different from 0, Figure 2B). Observing positive vs negative association effects depending on age is quite puzzling. It is also questionable, for some correlation analyses with cognitive flexibility, to focus on FC that changes with age but to consider FC at a given age.

      We thank the reviewer for bringing up this important point and understand that it requires some additional consideration. The negative FC values decreasing with age indicate that these regions go from being anti-correlated to becoming increasingly correlated. Hence, FC of these ROIs increased with age. The trajectory seems to suggest that this will keep increasing with age but of course further data need to be collected to assess this.

      Unfortunately, when considering ΔFC to predict cognitive flexibility, the numbers of participants dropped significantly, with N=~15/20 infants per group of preschoolers, making it very challenging to interpret the results with meaningful statistical power.

      - The manuscript uses inappropriate terms "to predict", "prediction" whereas the conducted analyses are not prediction analyses but correlational.

      We thank the reviewer for giving us the opportunity to thoroughly revise the manuscript on this matter. In this work, we had clear hypotheses regarding which variables predicted which measures (such as growth predicting FC and FC predicting cognitive outcomes). Therefore, we performed regression analyses rather than correlational analyses to investigate these associations. Hence, we believe that using the terms ‘predict’ and ‘prediction’ is appropriate.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) In the introduction and discussion, the authors talk about the link between developmental trajectories and cognitive capacities, and undernutrition. However, they did not compare developmental trajectories but connectivity patterns at different ages with ΔWLZ and cognitive flexibility. I recommend that the authors rephrase the introduction and discussion.

      We thank the reviewer for pointing out places requiring better clarity in the text. We made edits through the introduction to better match our investigations. In particular we changed:

      - ‘our understanding of the relationships between early undernutrition, developmental trajectories of brain connectivity, and later cognitive outcomes is still very limited,’ to, ‘our understanding of the relationships between early undernutrition, brain connectivity, and later cognitive outcomes is still very limited’ (line 89, introduction);

      - ‘(ii) investigate if trajectories of early FC have an impact on cognitive outcome at pre-school age in these children,’ to, ‘(ii) investigate if early FC has an impact on cognitive outcome at pre-school age in these children’ (line 137, introduction);

      - ‘This study investigated how early adversity via undernutrition drives longitudinal changes in brain functional connectivity at five time points throughout the first two years of life and how these developmental trajectories are associated with cognitive flexibility at preschool age,’ to, ‘This study investigated how early adversity via undernutrition drives brain functional connectivity throughout the first two years of life and how these early functional connections are associated with cognitive flexibility at preschool age’ (line 215, discussion).

      (2) Considering most research is done in occidental high-income countries, and this work is one of the few presenting research in another context, I think the authors should discuss in the manuscript that differences with previous studies might also be due to environmental and cultural differences. Since the study lacks the statistical power to perform a statistical analysis that directly establishes a link between developmental trajectories and restricted growth and cognitive flexibility, the authors cannot disentangle which differences are related to undernutrition and which might result from growing up in a different environment. I recommend that the authors avoid phrases like (lines 57-58): "We observed that early physical growth before the fifth month of life drove optimal developmental trajectories of FC..." or (lines 223-224) "...our cohort of Gambian infants exhibit atypical developmental trajectories of functional connectivity...".

      We thank the reviewer for this observation, and we agree that other factors differing between high- and low-resource settings might have an impact on FC trajectories. We therefore specified this in the discussion as follows: “We acknowledge that differences in FC could also be attributed to other environmental and cultural disparities between high-resource and low-resource settings, and future studies are needed to explore this further” (line 238). We revised the whole manuscript to reflect similar statements.

      (3) To better interpret the results, it would be interesting to know if poor early growth predicts late cognitive flexibility in the tested sample and if the ΔWLZ distributions differ compared to a population in a high-income country where undernutrition is less frequent.

      We explored the relationship between changes in growth and cognitive flexibility in the two preschooler groups, but there were no significant associations.

      Mean and SD values of WLZ are reported in Table 3. The values at every age are negative, indicating that the infants' weight-for-length is below the expected norm at all ages. To our knowledge, no other studies have assessed changes in growth in an infant sample with similar closely spaced age time points in high-income countries, making comparisons on growth changes challenging.

      (4) It is unclear why WLZ at birth and HCZ at 7-14 days are included in the models. I imagine this is to ensure that differences are not due to growing restrictions before birth. It would be nice if the authors could explain this.

      As the reviewer pointed out, HCZ at 7-14 days was included to ensure associations between growth and FC are not due to physical differences at birth. This can be considered as a 'baseline' measure for cerebral development, in the same way that WLZ at birth was used as a baseline for physical development. Therefore, we can more confidently assume that the associations between growth and FC were specific to the impact of change in WLZ postnatally and not confounded by the size or maturity of the infant at birth. We specified this in the manuscript as follows: “These analyses were adjusted by WLZ at birth and HCZ at 7/14 days, to more confidently assume that the associations between growth and FC were specific to the impact of change in WLZ postnatally and not confounded by the size or maturity of the infant at birth” (line 520, statistical analysis section in the method section).
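
      As an illustration of what adjusting for baseline measures means here, the following is a minimal sketch of a multiple regression in which FC at 24 months is modelled from ΔWLZ together with the two baseline covariates. The model form, the variable names and all numbers are assumptions made for this example (the data are noise-free toy values), and the solver is a generic least-squares routine rather than the authors' analysis pipeline.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical toy data: (ΔWLZ, WLZ at birth, HCZ at 7-14 days) per infant.
predictors = [
    (0.5, -1.0, -0.3),
    (-0.2, -0.5, 0.1),
    (1.0, -1.5, -0.8),
    (0.3, 0.2, 0.4),
    (-0.6, -0.9, -0.1),
    (0.8, 0.1, -0.5),
]
# Made-up generating coefficients: intercept, ΔWLZ, WLZ at birth, HCZ.
true_b = [0.1, 0.5, -0.2, 0.3]

X = [[1.0, d, w, h] for d, w, h in predictors]
fc_24m = [sum(b * x for b, x in zip(true_b, row)) for row in X]

intercept, b_dwlz, b_wlz_birth, b_hcz = ols(X, fc_24m)
print(round(b_dwlz, 6))
```

      The coefficient on ΔWLZ is then read as the association between postnatal growth and FC with the two baseline measures held constant, which is the interpretation the response relies on.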

      (5) Right frontal-posterior connections at 24 months negatively correlate with ΔWLZ. Thus, restricted growth results in stronger frontal-posterior connections at 24 months. However, the same connections at 24 months positively correlate with cognitive flexibility (stronger connections predict better cognitive flexibility). Do the authors have any interpretation of this? How could this relate to previous findings of the authors (Bulgarelli et al. 2020), showing first an increase and then a decrease in functional connectivity between frontal and parietal regions?

      We acknowledge that interpreting the negative relationship between changes in growth and fronto-posterior FC at 24 months, alongside the positive association between the same connection and later cognitive flexibility, is challenging. We refrain from relating these findings to those published by Bulgarelli in 2020 due to differences in optode locations and because in that work the decrease in fronto-posterior FC was observed after 24 months (up to 36 months), whereas the endpoint in this study is right at 24 months.

      (6) With the growth of the head, the frontal channels move to more temporal areas, right? Could this determine the decrease in frontal inter-hemisphere connections?

      As shown in Nabwera (2017), head size in Gambian infants does not increase that much, or at least not as much as expected from the WHO standard measures. We have added HCZ mean and SD values per age in Table 3.

      Minor points

      - HCZ is used in line 184 but not defined.

      We thank the reviewer for spotting this; we have now specified HCZ at line 184 as follows: ‘head-circumference z-score (HCZ)’.

      - Table SI2: NIRS not undertaken = the participant was assessed but did want or could not perform... I imagine there is a missing "not".

      We thank the reviewer for spotting this; we have now modified the legend of Table SI2 as follows: ‘the participant was assessed but did not want or could not perform the NIRS assessments.’

      - The authors should explain what weight-for-length is for those who are not familiar with it.

      We have added an explanation of weight-for-length in the experimental design section, line 339 as follows: ‘We then tested for relationships between brain FC at age 24 months with measures of early growth, as indexed by changes in weight-for-length z-scores (reflecting body weight in proportion to attained growth in length) at one month of age, and at each of the four subsequent visits (details provided below).’

      Reviewer #2 (Recommendations For The Authors):

      (1) I am confused about the authors' interpretation that left and right front-middle and right front-back FC increased with age. It appears in Figure 2 that the negative FC among these ROIs should actually decrease with age. This means that as individuals grow older, the FC values between these regions and zero diminished, albeit starting with negative FC (anticorrelation values) in younger age groups.

      Yes, the reviewer is correct. The negative values of the left and right front-middle and right front-back FC decreasing with age indicate that these regions go from being anti-correlated to becoming increasingly correlated. Hence, FC of these ROIs increased with age.

      (2) Are these negative values mentioned above at 24 months still negative? Have t-tests been run to examine the differences from zero?

      As suggested, we performed t-tests against zero for the mentioned FC at 24 months, and only the left and right fronto-middle FC are significantly different than zero (left fronto-middle FC: t(94) = 1.8, p = 0.036; right fronto-middle FC t(94) = 2.7, p = 0.003).
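
      For readers less familiar with this test, a minimal sketch of a one-sample t-test against zero is given below. The FC values are invented for illustration. The reported degrees of freedom, t(94), correspond to 95 infants, since a one-sample test has n - 1 degrees of freedom. The sketch returns only the t statistic and the degrees of freedom; obtaining a p-value additionally requires the t distribution, for example via `scipy.stats.ttest_1samp`.

```python
import math
import statistics


def one_sample_t(values, mu=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical FC values for one connection at one age (not study data).
fc_values = [0.10, -0.05, 0.20, 0.15, 0.00, 0.05, 0.12, -0.02]

t_stat, df = one_sample_t(fc_values)
print(df)
```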

      (3) With so many correlation analyses, have multiple comparisons been consistently controlled for? While I assume this was done according to the Methods section, could the authors clarify whether FDR adjustment was applied to all the p-values at once or to a group of p-values each time? I found the following way of reporting FDR-adjusted p-values quite informative, such as PFDR, 24 pairs of ROIs < 0.05.

      We thank the reviewer for this insightful comment. P-values of regression analyses were FDR corrected per connection investigated, i.e. 21 possible ΔWLZ values per connection. We have specified this in the method section as follows: “To ensure statistical reliability, results from the regression analyses on each FC were corrected for multiple comparisons using false discovery rate (FDR)(Benjamini & Hochberg, 1995) per each connection investigated, i.e. 21 possible ΔWLZ values per each connection,” (page 12, Statistical Analyses section).
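
      The family size of 21 can be reproduced with a short counting sketch. It assumes seven measurement ages (birth, 1, 5, 8, 12, 18 and 24 months), which is inferred from the time points mentioned elsewhere in this response rather than stated in the quoted sentence, and one ΔWLZ per pair of ages. Each connection then forms its own family of 21 tests for the FDR correction.

```python
from itertools import combinations

# Assumed measurement ages in months (0 = birth); an inference, not a quoted list.
ages = [0, 1, 5, 8, 12, 18, 24]

# One ΔWLZ for every (earlier age, later age) pair.
delta_pairs = list(combinations(ages, 2))
family_size = len(delta_pairs)  # number of tests corrected together per connection

alpha = 0.05
# Benjamini-Hochberg threshold for the k-th smallest p-value in the family.
bh_thresholds = [k / family_size * alpha for k in range(1, family_size + 1)]

print(family_size)
```

      Under this scheme the smallest p-value for a given connection is compared against 0.05 / 21 and the largest against 0.05, independently of the p-values obtained for other connections.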

      (4) Can early growth trajectories predict changes in FC? Why not use ΔWLZ to predict ΔFC?

      Unfortunately, when considering ΔWLZ to predict ΔFC, the numbers of participants dropped significantly, with N=~30 infants, making it very challenging to interpret the results. We believe this emphasizes the importance of recruiting large samples when conducting longitudinal studies involving infants and employing multiple measures.

      (5) I might have missed the rationale, but why weren't the growth changes after 5 months studied?

      ΔWLZ between all time points were assessed as predictors of FC at 24 months. We have specified this at line 183 as follows: ‘we used multiple regression with the infant growth trajectory (delta weight for length z-score between all time points, DWLZ) and FC at 24 months’. As indicated in Table 2 and 3, the associations between ΔWLZ at all time points and FC at 24 months were tested, but only those with ΔWLZ calculated between birth and 1 month and the subsequent time points were significant. ΔWLZ between 5 months and the subsequent time points, between 8 months and the subsequent time points, between 12 months and the subsequent time points, and between 18 months and the subsequent time points did not significantly predict FC at 24 months. These are highlighted in Table 2 and Figure 3 in blue and marked as NS (non-significant).

      (6) Once more, the advantage of longitudinal data is that it allows us to tap into developmental changes. Analyzing and predicting cognitive development based solely on FC values at a single age stage (i.e., 24 months) would overlook the benefits of a longitudinal design, which is regrettable. I suggest that the authors attempt to use ΔFC for predictions and observe the outcomes.

      As mentioned in our response to point (4) raised by the reviewer, unfortunately, when considering ΔWLZ to predict ΔFC, the numbers of participants dropped significantly, with N=~30 infants, making it very challenging to interpret the results. We believe this emphasizes the importance of recruiting large samples when conducting longitudinal studies involving infants and employing various measures.

      (7) In the section "Early FC predicts cognitive flexibility at preschool age", the authors pointed out that "...,none of these survived FDR correction for multiple comparisons." However, the paper discussed the association between FC at 24 months of age and cognitive flexibility, as it was supported by the statistical analysis in the following sections. If FDR correction cannot be satisfied, I would rephrase the implication/conclusion of the results to suggest that early FC does not predict cognitive flexibility at preschool age.

      We acknowledge the reviewer’s concern regarding the reporting of results that do not survive multiple comparisons. However, considering the uniqueness of our dataset and the novelty of our work, we believe it is crucial to report all significant findings, even those not passing multiple comparisons corrections, as they may motivate hypothesis-generation for future studies. We have taken great care to transparently distinguish between results that survived multiple comparisons and those that did not in both the Results and Discussion sections, ensuring that readers are not misled. It is possible that future studies may replicate and further support these associations. Therefore, by sharing these results with the research community, we provide valuable insights for future investigations.

      Following the reviewer’s suggestion, we specified in the discussion that results from the regression analysis are significant but did not survive multiple comparisons, as follows: ‘While our results are consistent with previous studies, we acknowledge that the significant association between early FC and later cognitive flexibility does not withstand multiple comparisons. Therefore, we encourage future studies that may replicate these findings with a larger sample.’ (line 290, discussion section).

      (8) Have the authors assessed the impact of growth trajectories on cognitive flexibility?

      We explored the relationship between changes in growth and cognitive flexibility in the two preschooler groups, but there were no significant associations.

      (9) Are there no other cognitive or behavioural measures available? Cognitive flexibility is just one domain of cognitive development, and would the impact of undernutrition on cognitive development be domain-specific? There is a lack of theoretical support here. Why choose cognitive flexibility, and should the impact of undernutrition be domain-specific or domain-general?

      We agree with the reviewer that in this work, we chose to focus on one specific cognitive outcome. While this does not imply that the impact of undernutrition is domain-specific, cognitive flexibility, being a core executive function, has been extensively studied in terms of its neural underpinnings using other neuroimaging modalities, especially fMRI (for example see Dajani, 2015; Uddin, 2021).

      Moreover, other studies looking at the effect of adversity on cognitive outcomes focus on specific cognitive skills, such as working memory (Roberts, 2017), reading and arithmetic skills (Soni, 2021).

      We also assessed infants with the Mullen Scales of Early Learning (MSEL), although the cognitive flexibility task within the Early Years Toolbox has been specifically designed for preschoolers (Howard, 2015), and this set of tasks has recently been validated in our team in The Gambia (Milosavljevic, 2023). Future work from the BRIGHT team will investigate performance on the MSEL in relation to other variables of the project.

      References:

      D. R. Dajani, L. Q. Uddin, Demystifying cognitive flexibility: Implications for clinical and developmental neuroscience. Trends Neurosci. 38, 571–578 (2015).

      L. Q. Uddin, Cognitive and behavioural flexibility: neural mechanisms and clinical considerations. Nat. Rev. Neurosci. 22, 167–179 (2021).

      Roberts, S. B., Franceschini, M. A., Krauss, A., Lin, P. Y., de Sa, A. B., Có, R., ... & Muentener, P. (2017). A pilot randomized controlled trial of a new supplementary food designed to enhance cognitive performance during prevention and treatment of malnutrition in childhood. Current developments in nutrition, 1(11), e000885.

      Soni, A., Fahey, N., Bhutta, Z. A., Li, W., Frazier, J. A., Moore Simas, T., ... & Allison, J. J. (2021). Early childhood undernutrition, preadolescent physical growth, and cognitive achievement in India: A population-based cohort study. PLoS Medicine, 18(10), e1003838.

      Howard, S. J., & Melhuish, E. (2015). An Early Years Toolbox (EYT) for assessing early executive function, language, self-regulation, and social development: Validity, reliability, and preliminary norms. Journal of Psychoeducational Assessment, 35(3), 255-275.

      Milosavljevic, B., Cook, C. J., Fadera, T., Ghillia, G., Howard, S. J., Makaula, H., ... & Lloyd‐Fox, S. (2023). Executive functioning skills and their environmental predictors among pre‐school aged children in South Africa and The Gambia. Developmental Science, e13407.

      (10) I would review more previous fNIRS studies on infants if they exist (e.g., the work by S Lloyd-Fox, L Emberson, and others). These studies can help identify brain ROIs likely linked to undernutrition and cognitive flexibility. The current analysis methods lean towards exploratory research. This makes the paper more of a proof-of-concept report rather than a strongly theoretically-driven study.

      We thank the reviewer for this important point. While we have reviewed existing fNIRS infant studies, there are no extant works that showed whether specific brain regions are related to undernutrition. However, several fMRI studies assessed regions that do support cognitive flexibility, and we mentioned these in the manuscript (for example see Dajani, 2015; Uddin, 2021).

      Other than the BRIGHT project, we are aware of two other projects that assessed the effect of undernutrition on brain development, assessing cognitive outcomes in low-resource settings:

      - the BEAN project in Bangladesh in which fNIRS data were recorded from the bilateral temporal cortex (i.e. Pirazzoli, 2022);

      - a project in India in which fNIRS data were recorded from frontal, temporal and parietal cortex bilaterally (i.e. Delgado Reyes, 2020)

      The brain regions recorded in these studies largely overlap with the brain regions we recorded from in this study.

      Another aspect to consider is that infants underwent several fNIRS tasks as part of the BRIGHT project, focusing on social processing, deferred imitation, and habituation responses. Therefore, brain regions for data acquisition were chosen to maximize the likelihood of recording meaningful data for all tasks (Lloyd-Fox, 2023). To clarify the text, we specified this information in the methods section (line 383).

      References:

      D. R. Dajani, L. Q. Uddin, Demystifying cognitive flexibility: Implications for clinical and developmental neuroscience. Trends Neurosci. 38, 571–578 (2015).

      Pirazzoli, L., Sullivan, E., Xie, W., Richards, J. E., Bulgarelli, C., Lloyd-Fox, S., ... & Nelson III, C. A. (2022). Association of psychosocial adversity and social information processing in children raised in a low-resource setting: an fNIRS study. Developmental Cognitive Neuroscience, 56, 101125.

      Delgado Reyes, L., Wijeakumar, S., Magnotta, V. A., Forbes, S. H., & Spencer, J. P. (2020). The functional brain networks that underlie visual working memory in the first two years of life. NeuroImage, 219, Article 116971.

      Lloyd-Fox, S., McCann, S., Milosavljevic, B., Katus, L., Blasi, A., Bulgarelli, C., Crespo-Llado, M., Ghillia, G., Fadera, T., Mbye, E., Mason, L., Njai, F., Njie, O., Perapoch-Amado, M., Rozhko, M., Sosseh, F., Saidykhan, M., Touray, E., Moore, S. E., … Team, and the B. S. (2023). The Brain Imaging for Global Health (BRIGHT) Study: Cohort Study Protocol. Gates Open Research, 7(126).

      (11) Last but not least, in the paper, the authors mentioned that fNIRS offers better spatial resolution and anatomical specificity compared to EEG, thereby providing more precise and reliable localization of brain networks. While I partially agree with this perspective, it remains to be explored whether the current fNIRS analysis strategies indeed yield higher spatial resolution. It is hoped that the authors will delve deeper into this discussion in the paper.

      The brain regions of focus were selected based on coregistration work previously conducted at each time point on the array used in this project (Collins-Jones, 2021). We deliberately avoided making claims about small brain regions, considering that head size might increase slightly less with age in The Gambia compared to Western countries (Nabwera, 2017). However, we maintain that the conclusions drawn in this study offer higher brain-region specificity than could have been identified with current common EEG methods alone.

      References:

      L. H. Collins-Jones, et al., Longitudinal infant fNIRS channel-space analyses are robust to variability parameters at the group-level: An image reconstruction investigation. Neuroimage 237, 118068 (2021).

      Nabwera, H. M., Fulford, A. J., Moore, S. E., & Prentice, A. M. (2017). Growth faltering in rural Gambian children after four decades of interventions: a retrospective cohort study. The Lancet Global Health, 5(2), e208–e216.

      Reviewer #3 (Recommendations For The Authors):

      Introduction

      - Among important developmental mechanisms to mention are the development of exuberant connections and the further selection/stabilization of the relevant ones according to environmental stimulation, vs the pruning of others.

      We agree with the reviewer that the development of exuberant connections and subsequent pruning is a universal process of paramount importance during the first years of life. However, after revising our introduction, given the word limit of the journal, we maintained focus on neurodevelopment and early adversity.

      Results

      - Adding a few more information on the 6 sections and 21 connections would be welcome. In particular for within-section FC: how was this computed?

      The 6 sections were created based on the co-registration of the array used in this study at each age in a previously published work (L. H. Collins-Jones, et al., Longitudinal infant fNIRS channel-space analyses are robust to variability parameters at the group-level: An image reconstruction investigation. Neuroimage 237, 118068 (2021)). This is reference No. 68 in the manuscript.

      As we mentioned in the section fNIRS preprocessing and data-analysis: ‘The sections were established via the 17 channels of each hemisphere which were grouped into front, middle and back (for a total of six regions) based on a previous co-registration of the BRIGHT fNIRS arrays onto age-appropriate templates’.

      The 21 connections were defined as all the possible links between the 6 regions, specifically: the interhemispheric homotopic connections (in orange in Figure SI1), which connect the same regions between hemispheres (i.e., front left with front right); the intrahemispheric connections (in green in Figure SI1), which correlate channels belonging to the same region; the fronto-posterior connections (in blue in Figure SI1), which link front and middle, middle and back, and front and back regions of the same hemisphere; and the crossing interhemispheric connections (non-homotopic interhemispheric, in yellow in Figure SI1), which link the front, middle, and back areas between left and right hemispheres. We added these specifications also in the legend of Figure SI1 for clarity.
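
      The count of 21 follows from taking every unordered pair of the 6 regions, including each region paired with itself. The sketch below enumerates those pairs and labels them with the four connection types named in the response. The region names and the `classify` helper are illustrative assumptions, not taken from the manuscript.

```python
from itertools import combinations_with_replacement

# Six regions: front, middle and back in each hemisphere (labels are illustrative).
REGIONS = [(hemi, pos) for hemi in ("left", "right")
           for pos in ("front", "middle", "back")]


def classify(a, b):
    """Label a pair of regions with the connection types used in the response."""
    (hemi_a, pos_a), (hemi_b, pos_b) = a, b
    if a == b:
        # Channels of one region correlated with each other.
        return "intrahemispheric (within-region)"
    if hemi_a == hemi_b:
        # Different regions of the same hemisphere.
        return "fronto-posterior"
    if pos_a == pos_b:
        # Same region in the two hemispheres.
        return "interhemispheric homotopic"
    # Different regions in different hemispheres.
    return "crossing interhemispheric"


def enumerate_connections():
    """Return every unordered pair of regions, including a region with itself."""
    return [(a, b, classify(a, b))
            for a, b in combinations_with_replacement(REGIONS, 2)]


def count_by_type(connections):
    """Count how many connections fall under each label."""
    counts = {}
    for _, _, kind in connections:
        counts[kind] = counts.get(kind, 0) + 1
    return counts


connections = enumerate_connections()
counts = count_by_type(connections)
for kind, n in sorted(counts.items()):
    print(f"{kind}: {n}")
print("total:", len(connections))
```

      Under this reading, the total splits into 6 within-region, 6 fronto-posterior, 3 homotopic and 6 crossing connections.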

      - The denomination intrahemispheric vs fronto-posterior vs crossed connections is not clear. Maybe prefer intra-hemispheric vs inter-hemispheric homotopic vs inter-hemispheric non-homotopic (also in Figure SI1).

      We appreciate the reviewer's suggestion regarding terminology. However, we believe that the term 'inter-hemispheric non-homotopic' could potentially refer to both connections within the same brain hemisphere from front to back and connections crossing between hemispheres, leading to increased confusion. Therefore, we have chosen not to include the term 'non-homotopic' and instead added 'homotopic' to 'interhemispheric' throughout the manuscript to emphasize that these functional connections occur between corresponding regions of the two hemispheres.

      - with time -> with age.

      We replaced “with time” with “with age” throughout the manuscript, as suggested.

      - The description of both HbO2 and HHb results overloads the main text: would it be relevant to present one of the two in Supplementary Information if the results are coherent?

      We understand the reviewer’s concern regarding overloading the results section with reporting both chromophores. However, reporting results for both HbO and HHb is considered a crucial step for publications in the fNIRS field, as emphasized in recent formal guidance (Yücel et al., 2020). One of the strengths of fNIRS compared to fMRI is its ability to record from both chromophores, enabling a more precise characterization of brain activations and oscillations. Moreover, in FC studies like this one, ensuring that HbO and HHb results overlap is an important check that increases confidence in interpreting the findings.

      References:

      Yücel, M. A., von Lühmann, A., Scholkmann, F., Gervain, J., Dan, I., Ayaz, H., Boas, D., Cooper, R. J., Culver, J., Elwell, C. E., Eggebrecht, A. T., Franceschini, M. A., Grova, C., Homae, F., Lesage, F., Obrig, H., Tachtsidis, I., Tak, S., Tong, Y., … Wolf, M. (2020). Best Practices for fNIRS publications. Neurophotonics, 1–34. https://doi.org/10.1117/1.NPh.8.1.012101

      - HCZ is not defined when first used.

      We thank the reviewer for spotting this. We have now defined HCZ at line 184 as follows: ‘head-circumference z-score (HCZ)’.

      - Choosing the analyzed measures to "maximize power" could be criticised.

      We appreciate the reviewer’s concern. However, correlating all the FC values with all changes in growth would have raised an important issue for multiple comparisons. We therefore made an a priori decision to focus on investigating the relationship between changes in growth and those FC that showed a significant change with age, considering these as the most interesting ones from a developmental perspective in our sample.
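
      To illustrate why the number of tests matters here, the sketch below implements two standard corrections, Bonferroni and Benjamini-Hochberg. The response does not state which correction was applied in the manuscript, and the p-values and test counts below are invented for illustration only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking tests significant under BH FDR control."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    significant = [False] * m
    # Every test ranked at or below max_rank is declared significant.
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical comparison: testing all 21 connections versus a pre-selected few.
all_connections = bonferroni_threshold(0.05, 21)
preselected = bonferroni_threshold(0.05, 3)  # 3 is an invented number
print(f"threshold with 21 tests: {all_connections:.5f}")
print(f"threshold with 3 tests:  {preselected:.5f}")

example_p = [0.001, 0.02, 0.03, 0.5]  # invented p-values
print(benjamini_hochberg(example_p))
```

      Restricting the analysis to fewer, pre-selected tests raises the per-test threshold, which is the trade-off the response describes.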

      Discussion

      - I would recommend using the same order to synthesize results and further discuss them.

      We agree with the reviewer that the suggested structure is optimal for a clear discussion section. We have indeed followed it, with each paragraph covering specific aspects:

      - Recap of the study aims

      - Results summary and discussion of developmental changes

      - Results summary and discussion of the relationship between changes in growth and FC

      - Results summary and discussion of the relationship between FC and cognitive flexibility

      - Limitations

      - Conclusion

      Given the numerous results presented in this paper, we believe that readers will better digest them by first reading a summary of the results followed by their interpretations, rather than condensing all the interpretations together.

      - Highlighting how "atypical" developmental trajectories are in Gambian infants would be welcome in the Results section. Other interpretations can be found than "The observed decrease in frontal inter-hemispheric FC with increasing age may be due to the exposure to early life undernutrition adversity".

      We agree with the reviewer that other factors that differ between low- and high-resource settings might have an impact on FC trajectories. We therefore specified this in the discussion as follows: “We acknowledge that differences in FC could also be attributed to other environmental and cultural disparities between high-resource and low-resource settings, and future studies are needed to further investigate cultural, environmental, and genetic effects on brain FC” (line 238).

      - Focusing on FC at 24m for the relationship with growth is questionable.

      Correlating the FC values at 5 time points with all changes in growth would have raised an important issue for multiple comparisons. We therefore made an a priori decision to focus on investigating the relationship between changes in growth and FC at 24 months, our final time point of data collection. We added this information in the methods section as follows: “To investigate the impact of undernutrition on FC development, we used DWLZ as independent variables in regression analyses on HbO2 (as the chromophore with the highest signal-to-noise ratio) FC at 24 months, our final time point of data collection” (line 517, method section).

      - There is too much emphasis on the correlation between FC and cognitive flexibility, whereas results are not significant after correction for multiple comparisons.

      Following the reviewer’s suggestion, we specified in the discussion that the regression results are significant but did not survive correction for multiple comparisons, as follows: ‘While our results are consistent with previous studies, we acknowledge that the significant association between early FC and later cognitive flexibility does not withstand multiple comparisons. Therefore, we encourage future studies that may replicate these findings with a larger sample.’ (line 290, discussion section).

      Methods

      - I would recommend detailing how z-scores were computed in the paragraph "Anthropometric measures".

      We specified how z-scores were computed in the statistical analysis section as follows: “Anthropometric measures were converted to age and sex adjusted z‐scores that are based on World Health Organization Child Growth Standards (93). Weight‐for‐Length (WLZ) and Head Circumference (HCZ) z-scores were computed” (line 509, method section). As transforming data is the first step of statistical analysis and is not directly related to data collection, we believe it is more appropriate to retain this description in the statistical analysis section.
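
      For readers unfamiliar with growth-standard z-scores, the WHO Child Growth Standards are published as age- and sex-specific L, M and S parameters, and a measurement is converted with the LMS formula. The sketch below shows that formula only. The parameter values in the example are invented, the response does not say which software the authors used, and the extra adjustment WHO applies to extreme weight-based z-scores is omitted.

```python
import math


def lms_zscore(value, L, M, S):
    """Convert a measurement to a z-score with the LMS formula.

    L is the Box-Cox power, M the median and S the coefficient of variation
    for the child's age and sex, as tabulated in a growth standard.
    """
    if value <= 0 or M <= 0 or S <= 0:
        raise ValueError("value, M and S must be positive")
    if abs(L) < 1e-12:
        # Limit of the formula as L approaches 0.
        return math.log(value / M) / S
    return ((value / M) ** L - 1.0) / (L * S)


# Invented parameters, not taken from any real growth table.
z = lms_zscore(value=44.0, L=1.0, M=40.0, S=0.05)
print(round(z, 2))
```

      A measurement equal to the median M always maps to a z-score of 0, whatever the values of L and S.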

      - FC computation: the mention of "correlating the first and the last 250s" is not clear.

      We specified this more clearly in the text as follows: “We found that correlating the first and the last 250 seconds of valid data after pre-processing provided the highest percentage of infants with strong correlation between the first and the last portion of data” (line 467).

      - The manuscript mentions "age 3 years" for the younger preschoolers but ~48months rather corresponds to 4 years.

      We checked the entire manuscript and the supplementary materials, but we could not find any instance in which preschoolers are referred to by age in months rather than in years.

      - Specify the number of children evaluated at 4 and 5 years. Is the test of cognitive flexibility normalized for age? If not, how were the 2 groups considered in the analyses? (age as a confounding factor).

      We have added the number of children in the two preschooler groups as follows: younger preschoolers (age mean ± SD=47.96 ± 2.77 months, N=77) and older preschoolers (age mean ± SD=57.58 ± 2.11 months, N=84). (line 484).

      The cognitive flexibility test was not normalized for age, as this task was specifically developed for preschoolers (Howard, 2015). As mentioned in ‘Cognitive flexibility at preschool age’ of the methods section, “data were collected in two ranges of preschool ages”, which guided our decision to perform regression analysis on the impact of FC on cognitive flexibility separately within these two age groups, rather than treating them as a single group of preschoolers.

      References:

      Howard, S. J., & Melhuish, E. (2015). An Early Years Toolbox (EYT) for assessing early executive function, language, self-regulation, and social development: Validity, reliability, and preliminary norms. Journal of Psychoeducational Assessment, 35(3), 255-275.

      Figures and Tables

      - Table 1 could highlight the significant results. It is not clear what the "baseline" results correspond to.

      We have marked in bold the results that are statistically significant in Table 1. In the linear mixed model we performed, the first time point (i.e. 5 months) is chosen as ‘baseline’, i.e. the reference against which the other time points are compared, and its statistical values refer to its significance against 0 (as was done in Bulgarelli 2020).
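
      The role of a ‘baseline’ level can be seen in a much simpler setting than the authors' mixed model. In an ordinary linear model with one categorical predictor and reference (treatment) coding, the intercept is the mean at the reference level and every other coefficient is a difference from it. The sketch below shows only that simplified case. The FC values are invented, and a mixed model with random effects or unbalanced data would not reduce to raw group means like this.

```python
from statistics import mean


def treatment_coded_estimates(values_by_level, baseline):
    """Return (intercept, effects) for a one-factor model with reference coding.

    intercept: mean of the baseline level (this is what is tested against 0).
    effects:   mean of each other level minus the baseline mean.
    """
    intercept = mean(values_by_level[baseline])
    effects = {
        level: mean(vals) - intercept
        for level, vals in values_by_level.items()
        if level != baseline
    }
    return intercept, effects


# Invented FC values for two of the time points mentioned in the response.
toy_fc = {"5m": [0.2, 0.4], "24m": [0.5, 0.7]}
intercept, effects = treatment_coded_estimates(toy_fc, baseline="5m")
print(intercept, effects)
```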

      - Figures 2 B and C seem redundant? What is SE vs SD?

      We believe that both Figures 2B and 2C are useful for readers. While the first shows the mean FC values at the group level, the second highlights the individual variability of FC values (typical of infant neuroimaging data), which is also why it is interesting to relate these measures to other variables of our dataset (i.e. growth and cognitive flexibility). Figure 2C also reports mean FC values per age, but these might be less visible because one dot per infant is also plotted.

      SE stands for standard error, and in the legend of the figure we specified this as follows: ‘Mean and standard error of the mean (SE)’. SD stands for standard deviation, and we have now specified this as follows: ‘mean ± standard deviation (SD)’.
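
      The two quantities are related by the sample size: the standard error of the mean is the standard deviation divided by the square root of the number of observations. A minimal sketch with invented values:

```python
import math
from statistics import stdev


def sd_and_se(values):
    """Return the sample standard deviation and the standard error of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    sd = stdev(values)                 # sample SD, n - 1 in the denominator
    se = sd / math.sqrt(len(values))   # SE shrinks as the sample grows
    return sd, se


toy_values = [2, 4, 4, 4, 5, 5, 7, 9]  # invented data
sd, se = sd_and_se(toy_values)
print(f"SD = {sd:.3f}, SE = {se:.3f}")
```

      SD describes the spread of individual values, whereas SE describes the uncertainty of the group mean.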

      - Table 2: I would recommend removing results that don't survive corrections for multiple comparisons.

      We acknowledge the reviewer’s concern regarding the reporting of results that do not survive multiple comparisons. However, considering the uniqueness of our dataset and the novelty of our work, we believe it is crucial to report all significant findings. We have taken great care to transparently distinguish between results that survived multiple comparisons and those that did not in both the Results and Discussion sections, ensuring that readers are not misled. It is possible that future studies may replicate and further strengthen these associations. Therefore, by sharing these results with the research community, we provide valuable insights for future investigations.

      - Figure 3: the top is redundant with Table 2: to be merged? B: the statistical results might be shown in a Table.

      We agree with the reviewer that the top part of Figure 3 and Table 2 report the same results. However, given the richness of these findings, we believe that the top part of Figure 3 serves as a useful summary for readers. Additionally, examining both the top and bottom parts of Figure 3 provides a comprehensive overview of the regression analysis conducted in this study.

      - Figure SI6: Is it really a % in x-axis?

      We thank the reviewer for spotting this typo. The percentage is relevant for the y-axis only, so we removed the % symbol from the ticks of the x-axis.

      - Table SI1: the presented p-values don't seem to survive Bonferroni correction, contrary to what is written.

      We thank the reviewer for spotting this mistake. We removed the reference to the Bonferroni correction for the p-values.

      - Table SI2: For the proportion of children included in the analysis, maybe be precise that the proportion was computed based on the ones with acquired data. Maybe also add the proportion according to all children, to better show the high drop-out rate at certain ages?

      We thank the reviewer for these useful suggestions. We have specified in the legend of the table how we calculated the proportion of infants included as follows: ‘The proportion of children included in the analysis was computed based on the infants with FC data’. We have also added a column in the table called ‘Inclusion rate (from the 204 infants recruited)’, following the reviewer’s suggestion. This will be a useful reference for future studies.

      - A few typos should be corrected throughout the manuscript.

      We thoroughly revised the main manuscript and the supplementary materials for typos.

    1. What is the use of living? Come thunder, come lightning of the sky, come and crash upon my head! I cannot stand the pain! Hades, come! Come Hades and cut down this miserable life of mine!

      Seeing this quote from Medea is sad because most women were treated terribly in ancient times. The extraordinary women we discuss in this class all seem to have a sense of power even if they are female. If Medea was truly this unhappy, it makes me question if she regrets anything.

  5. Apr 2025
    1. CBS fired Owens’ predecessor, Jeff Fager, in 2018 for sexual misconduct. CBS investigators said that his misconduct was less severe than that of “60 Minutes” founder and original executive producer Don Hewitt. A draft report from those investigators found that “the physical, administrative and cultural separation between ’60 Minutes’ and the rest of CBS News permitted misconduct by some ’60 Minutes’ employees,” according to the Times. CBS has agreed to pay one woman who accused Hewitt of sexual assault over $5 million, the Times reported. The network’s powerful former head, Les Moonves, also resigned in 2018 after multiple women accused him of sexual assault.

      I'm really not sure how we got to this from Owens' resignation. These are two unrelated topics.

      They bring up CBS’s past misconduct to suggest a pattern, even though there’s no clear evidence of wrongdoing in the Kamala Harris interview. This appeals to suspicion—using history to make the current situation seem questionable. It’s more about persuasion than proof.

    1. He was the valedictorian of a prestigious Baltimore prep school who earned bachelor’s and master’s degrees at the University of Pennsylvania and served as a head counselor at a pre-college program at Stanford University.

      Introduces facts about Luigi Mangione.

    1. But watch out, Commander, I tell him in my head. I've got my eye on you. One false move and I'm dead.

      Almost comedic, satirical line, it is usually "one false move and you're dead" but this power of women comes from the fearing of their own safety being in the hands of a sole man.

    1. The student-led movement is resilient partly because it is leaderless by design - it is not so easy to ‘cut off the head’ of a regenerative organism that does not depend on any single person or even a small group of figureheads.

      for - example - decentralized movement - Serbia 2025 - example - decentralized movement - pros - no head to decapitate - Serbia protests 2025 - same applies to decentralized web - no central server to shut down - adjacency - decentralized movement - decentralized web - no single server to shut down - cannot decapitate

    1. eLife Assessment

      This valuable study introduces a self-supervised machine learning method to classify C. elegans postures and behaviors directly from video data, offering an alternative to the skeleton-based approaches that rely on often error-prone tracking. This novel approach holds promise for advancing ethology research. That said, the strength of evidence is currently incomplete, as key aspects - including measuring head-tail orientation, increased behavioral interpretability, and quantitative comparisons to established methods - are underdeveloped and would benefit from further validation.

    2. Reviewer #3 (Public review):

      Summary:

      In this paper, the authors present an unsupervised learning approach to represent C. elegans poses and temporal sequences of poses in low-dimensional spaces by directly using pixel values from video frames. The method does not rely on the exact identification of the worm's contour/midline, nor on the identification of the head and tail prior to analyzing behavioral parameters. In particular, using contrastive learning, the model represents worm poses in low-dimensional spaces, while a transformer encoder neural network embeds sequences of worm postures over short time scales. The study evaluates this newly developed method using a dataset of different C. elegans genetic strains and aging individuals. The authors compared the representations inferred by the unsupervised learning with features extracted by an established approach, which relies on direct identification of the worm's posture and its head-tail direction.

      Strengths:

      The newly developed method provides a coarse classification of C. elegans posture types in a low-dimensional space using a relatively simple approach that directly analyzes video frames. The authors demonstrate that representations of postures or movements of different genotypes, based on pixel values, can be distinguishable to some extent.

      Weaknesses:

      - A significant disadvantage of the presented method is that it does not include the direction of the worm's body (e.g., head/tail identification). This highly limits the detailed and comprehensive identification of the worm's behavioral repertoire (on- and off-food), which requires body directionality in order to infer behaviors (for example, classifying forward vs. reverse movements). In addition, including a mix of opposite postures as input to the new method may create significant classification artifacts in the low-dimensional representation, such that, for example, curvature at opposite parts of the body could cluster together. This concern applies both to the representation of individual postures and to the representation of sequences of postures.

      - The authors state that head-tail direction can be inferred during forward movement. This is true when individuals are measured off-food, where they are highly likely to move forward. However, when animals are grown on food, head-tail identification can also be based on quantifying the speed of the two ends of the worm (the head shows side-to-side movements). This does not require identifying morphological features. See, for example, Harel et al. (2024) or Yemini et al. (2013).

      - Another confounding parameter that cannot be distinguished using the presented method is the size of individuals. Size can differ between genotypes, as well as with aging. This can potentially lead to clustering of individuals based on their size rather than behavior.

      - There is no quantitative comparison between classification based on the presented method and methods that rely on identifying the skeleton.
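
      The second point, that the head can be identified from how much each end of the worm moves, can be sketched as follows. This is only an illustration of the idea as stated in the review, not the procedure of Harel et al. (2024) or Yemini et al. (2013). It assumes the two ends have already been tracked consistently across frames, and the function names and coordinates are hypothetical.

```python
import math


def path_length(track):
    """Sum of frame-to-frame displacements for a list of (x, y) positions."""
    return sum(math.dist(p, q) for p, q in zip(track, track[1:]))


def guess_head(end_a_track, end_b_track):
    """Return 'a' or 'b': the end that travelled further is taken to be the head.

    Rationale from the review: the head shows side-to-side movements, so over
    a short clip it should accumulate more displacement than the tail.
    """
    if len(end_a_track) != len(end_b_track) or len(end_a_track) < 2:
        raise ValueError("tracks must have equal length of at least 2 frames")
    if path_length(end_a_track) > path_length(end_b_track):
        return "a"
    return "b"


# Invented tracks: end 'a' swings from side to side, end 'b' drifts slowly.
end_a = [(0, 0), (1, 1), (0, 2), (1, 3)]
end_b = [(10, 0), (10, 0.5), (10, 1.0), (10, 1.5)]
print(guess_head(end_a, end_b))
```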

    3. Author response:

      We thank the editors and the reviewers for their valuable comments and for taking the time to evaluate our manuscript.

      Answers to Reviewer 1:

      (1) The core contribution of our method is that it learns meaningful spatiotemporal embeddings directly from image data without requiring pose estimation or eigenworm-based features as input. The learned embedding space can serve as a foundation for downstream tasks such as behavioral classification, clustering, or anomaly detection, further supporting its utility beyond visualization through eigenworm-derived features. Here we use the Tierpsy-derived features for latent space interpretation and for validation that our approach does indeed encode meaningful postural information. Additionally, without any Tierpsy-calculated features users can still color embeddings by known metadata like mutation or age and compare different strains to each other. 

      (2) The numbers shown in Fig. 2.3 are illustrative placeholders intended to conceptually represent a vector of behavioral features. They do not correspond to any specific measurements or carry intrinsic meaning. We agree that this may lead to confusion, and we will clarify this in the revised manuscript.

      (3) The visualizations in Figs. 4 (b) and (c) show the embeddings of sequences of behavior, rather than individual poses. Therefore, motion-related features such as speed are related to temporal patterns in those sequences rather than static postures. The color overlays reflect average motion characteristics (e.g., speed) of short behavior clips projected into the embedding space, rather than being directly linked to any single frame or pose.
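
      As background for answer (1), contrastive learning trains an encoder so that two views of the same sample end up close together in the embedding space, while views of different samples are pushed apart. The sketch below computes one common contrastive objective, the InfoNCE loss, for a single anchor. The response does not say which loss the authors used, so the choice of InfoNCE, the temperature and the toy vectors are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor, one positive and a list of negatives.

    The loss is low when the anchor is more similar to its positive than to
    any of the negatives, and high otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy 2-D embeddings (invented): a well-separated case and a confused case.
good = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
bad = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(f"aligned positive: {good:.5f}, misaligned positive: {bad:.5f}")
```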

      Answers to Reviewer 2:

      (1) In the abstract, our use of the term "unbiased" refers specifically to the avoidance of human-generated bias through feature engineering—i.e., the model does not rely on handcrafted features or predefined pose representations – the representations are based on data only. However, we agree that the model is still subject to dataset biases and will rectify this in the revised manuscript.

      (2) The worm images are rotated to a common vertical orientation to remove orientation as a source of variability in the input. This ensures that the model focuses on learning pose and behavioral dynamics rather than arbitrary head-tail or angular positioning. While data augmentation could in theory account for this variability, we found in our preliminary experiments that applying this preprocessing step led to more stable and interpretable embeddings.

      (3) We agree that simplifying the technical explanations would enhance the manuscript’s accessibility. In the revised version, we will briefly introduce contrastive learning in a less technical language.

      (4) The gray points in Fig. 3a represent frames that Tierpsy could not resolve, primarily due to coiled, self-intersecting, or overlapping worm postures, as Tierpsy uses skeletonization to estimate the centerline. This approach can fail when such challenging elements are part of the image.

      (5) We appreciate this suggestion and will consider it for a revised version of the manuscript.

      (6) Although it may seem intuitive for highly bent (red) poses to lie near coiled (gray) ones in the embedding space, the clustering pattern observed reflects how the network organizes pose information. The red/orange cluster consists of distinguishable bent poses that are visually distinct and consistently separable from other postures. In contrast, the greenish and blueish poses are less strongly bent and may share more visual overlap with the unresolved (gray) images.

      (7) The overlap occurs because some highly bent or coiled worms can still be (partially) resolved by Tierpsy, depending on specific pose conditions (e.g., head and tail not touching, not self-overlapping). However, Tierpsy fails to consistently resolve such frames. We will describe these cases in more detail in the revised manuscript.

      (8) Thank you, we agree this claim needs to be better supported and will develop it in the revision.

      (9) To support this statement we mainly visualized the respective sequences embedded in this area of the embedding space and found that it mostly consists of common behaviors such as forward locomotion. 

      (10) We agree that interpretability is important and plan to include additional figures and quantifications of the embedding space using more basic Tierpsy features.

      (11) Fig. 5a is indeed based solely on N2 animals. In the revised manuscript we will include quantitative measures of behavioral variability and its change with age.

      (12) We appreciate this suggestion and will consider it for a revised version.

      (13) We agree this would be a valuable analysis. However, our current dataset primarily includes aging data for N2 animals. We acknowledge this limitation and will consider adding more strains in future work.

      (14) We will include links to our source code in the revised manuscript.

      Answers to Reviewer 3:

      (1-2) Our current method is agnostic to head-tail orientation, which indeed restricts the ability to distinguish behaviors that rely on directional cues. We made this design choice as we believe that correctly identifying head/tail orientation can be a challenging task that may introduce additional biases or fail in difficult imaging conditions. However, we fully agree that integrating directional information would improve behavioral resolution, and this is a natural extension of our current framework. In future work, we aim to incorporate head-tail disambiguation.

      (3) We explicitly designed our preprocessing and training pipeline to encourage size invariance, for example by resizing individuals to a consistent scale, as the focus of our work is to encode posture and movement only. However, we acknowledge that absolute size information is lost in this process, which can be informative for distinguishing genotypes or age-related changes.

      (4) We agree that a direct quantitative comparison between our embedding-based representations and skeleton-based feature sets would strengthen the paper. Our current focus was to assess whether meaningful behavioral features could be learned from a skeleton-free representation.

    1. By legalizing murder, robbery, torture, and destruction, these instructions put the moral basis of martial law, and thereby of military discipline, on its head. The army did not simply pretend not to notice the criminal actions of the regime, it positively ordered its own troops to carry them out, and was distressed when breaches of discipline prevented their more efficient execution.

      ordering what would ordinarily be crimes

    1. Make the most out of all of that research and preparation by bringing notes. A nice notebook or paper and a pen are perfectly acceptable for you to have in the interview and they can help you feel more focused by getting some of the information out of your head and organized on paper.

      Bringing notes is a smart tip—it shows you're prepared and helps reduce nerves by keeping key points handy.

    1. Urban poverty looks different from rural poverty which looks different from Southern poverty.

      This particularly sparked an interest regarding what poverty and its effects look like throughout the regions of the US such as the Midwest, South, etc. If the education system is able to gather statistics on common occurrences within poverty that affect children, regions can curate school systems to directly address these issues more head-on rather than guessing a vague mission in programs or schools.

    1. Telling people what state you are from may give them a sense of “who you are.” Jimmy Emerson, DVM – Welcome to Texas – CC BY-NC-ND 2.0.

      I was walking down 42nd Street in NYC once and some tourists stopped me and asked me for directions on how to get to Broadway. Why was this remarkable? I was wearing my big puffy Green Bay Packer jacket and I did not look anything like the other New Yorkers I walked amongst. Something about being a Packer fan made me safe to ask for directions. So, in true Midwest-nice fashion, I pointed north and said, you need to turn right and head up one block then over another and you will start to see the lights of Broadway. You're headed in the right direction. Keep going. And I'm 100% sure they found Broadway.

    1. ATTENTION-BASED ARCHITECTURE FOR G-P MAPPING

      The model is a stack of attention layers, but I was surprised to see it omit all the typical components that brought attention into the limelight via transformers: multi-head attention, residual connections, layer norm, and position-wise FFNs. These have become standard and widely adopted, and largely for good reason, as they've shown to be very effective across many distinct domains.

      Was there a particular reason this specific custom architecture was preferred over implementing or at least comparing to a standard transformer encoder?

    1. and with a potentially damning congressional ethics report still hanging over his head.

      The words "potentially damning" paint a more serious picture than other articles. This suggests that Gaetz's career could be over if the report is released.

    2. Josh Christenson and Diana Glebova

      This article is published in the New York Post which received a "leans-right" rating by AllSides. Josh Christenson is the New York Post's head Washington DC columnist.

    3. Matt Gaetz withdraws from attorney general consideration with House ethics report hanging over his head

      This title adds upon the basic factual title and includes part of the reason why he withdrew. It leaves out some of the complexities of the situation and why this report is hanging over him.

    1. “We need you to be ‘Momala’ of the country.” Interviewers should question if they would ask President Donald Trump, former President Joe Biden or any prior president to be America’s “Daddy” and caretaker.

      Yes. While mothers are the reason that we are all here today, there is a deeply sexist notion that the mother is the caretaker but not the head of the house. They handle things at home, but when it comes to business or any authority, they have no part in that.

    1. Helena provides an example of how Asian Americans are often classed together by others. Some white classmates did not bother to find out that she was Korean. When discussing such events, Helena, like other respondents, is still in pain from them and has a difficult time making eye contact. She keeps her head down and speaks softly, crying a few times as she recounts painful memories. She was not accepted for being the smart, high-achieving youngster she was, but was ostracized for her intelligence and identity. Helena fit the “model” myth because she was a standout student. Frank Wu explains that the myth is important because it “is useful, even if it is not true. Its content assuages the conscience and assigns blame, a function that is psychologically needed and socially desired.”7 In this otherwise savvy comment Wu never clarifies whom the myth is useful for and does not specifically name whites as the central culprits.

      The section illustrates how Asian American students experience contradictory realities because white society views them as academically brilliant but socially strange and emotionally aloof. Asian students discover that institutions presenting themselves as neutral platforms of fairness typically contain hidden elements of discrimination, which manifest strongly in activities where white students dominate. Some Asian American students experience racial harassment while growing up but do not recognize it as racial discrimination because they lack awareness and understanding of systemic racism. They form exclusively Asian social groups to protect themselves because racism forces them to, not as a product of cultural choice.

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review): 

      Summary: 

      This fascinating manuscript studies the effect of education on brain structure through a natural experiment. Leveraging the UK BioBank, these authors study the causal effect of education using causal inference methodology that focuses on legislation for an additional mandatory year of education in a regression discontinuity design. 

      Strengths: 

      The methodological novelty and study design were viewed as strong, as was the import of the question under study. The evidence presented is solid. The work will be of broad interest to neuroscientists 

      Weaknesses: 

      There were several areas which might be strengthened by additional consideration from a methodological perspective. 

      We sincerely thank the reviewer for the useful input, in particular, their recommendation to clarify RD and for catching some minor errors in the methods (such as taking the log of the Bayes factors). 

      Reviewer #1 (Recommendations for the authors): 

      (1) The fuzzy local-linear regression discontinuity analysis would benefit from further description. 

      (2) In the description of the model, the terms "smoothness" and "continuity" appear to be used interchangeably. This should be adjusted to conform to mathematical definitions. 

      We have now expanded our explanation of continuity-based regression discontinuity. In particular, we now explain “fuzzy” and add emphasis on the two separate empirical approaches (continuity and local-randomization), along with fixing our use of “smoothness” and “continuity”.

      results:

      “Compliance with ROSLA was very high (near 100%; Sup. Figure 2). However, given the cultural and historical trends leading to an increase in school attendance before ROSLA, most adolescents were continuing with education past 15 years of age before the policy change (Sup Plot. 7b). Prior work has estimated 25 percent of children would have left school a year earlier if not for ROSLA 41. Using the UK Biobank, we estimate this proportion to be around 10%, as the sample is healthier and of higher SES than the general population (Sup. Figure 2; Sup. Table 2) 46–48.”

      methods:

      “RD designs, like ours, can be ‘fuzzy’ indicating when assignment only increases the probability of receiving it, in turn, treatment assigned and treatment received do not correspond for some units 33,53. For instance, due to cultural and historical trends, there was an increase in school attendance before ROSLA; most adolescents were continuing with education past 15 years of age (Sup Plot. 7b). Prior work has estimated that 25 percent of children would have left school a year earlier if not for ROSLA 41. Using the UK Biobank, we estimate this proportion to be around 10%, as the sample is healthier and of higher SES than the general population (Sup. Figure 2; Sup. Table 2) 46–48.”

      (3) The optimization of the smoother based on MSE would benefit from more explanation and consideration. How was the flexibility of the model taken into account in testing? Were there any concerns about post-selection inference? A sensitivity analysis across bandwidths is also necessary. Based on the model fit in Figure 1, results from a linear model should also be compared. 

      It is common in the RD literature to illustrate plots with higher-order polynomial fits while inference is based on linear (or at most quadratic) models (Cattaneo, Idrobo & Titiunik, 2019). We agree that this field-specific practice can be confusing to readers. Therefore, we have redone Figure 1 using local-linear fits better aligning with our analysis pipeline. Yet, it is still not a one-to-one alignment as point estimation and confidence are handled robustly while our plotting tools are simple linear fits. In addition, we updated Sup. Fig 3 and moved 3rd-order polynomial RD plots to Sup. Fig 4.

      Empirical RD has many branching analytical decisions (bandwidth, polynomial order, kernel) which can have large effects on the outcome. Fortunately, RD methodology is starting to become more standardized (Cattaneo & Titiunik, 2022, Annu. Rev. Econ.) as there have been indications of publication bias using these methods (Stommes, Aronow & Sävje, 2023, Research and Politics; this paper suggests the cause is not researcher degrees of freedom but rather inappropriate inferential methods). While not necessarily ill-intended, researcher degrees of freedom and analytic flexibility are major contributors to publication bias. We (self) limited our analytic flexibility by using pre-registration (https://osf.io/rv38z).

      One of the most consequential analytic decisions in RD is the bandwidth size: there is no established practice, bandwidths are context-specific, and they can be highly influential on the results. The choice of bandwidths can be framed as a ‘bias vs. variance trade-off’. As bandwidths increase, variance decreases since more subjects are added, yet bias (misspecification error/smoothing bias) also increases (as these subjects are further away and less similar). In our case, our assignment (running/forcing) variable is ‘date of birth in months’; therefore our smallest comparison would be individuals born in August 1957 (unaffected/no treatment) vs September 1957 (affected/treated). This comparison has the least bias (subjects are the most similar to each other), yet it comes at the expense of very few subjects (high variance in our estimate). 
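
      The trade-off can be illustrated with a small simulation. This is only a sketch, not the analysis pipeline described in this response: it assumes a sharp design, a uniform kernel, separate linear fits on each side of the cutoff, and a hypothetical cubic outcome with no true jump. The function names (`rd_estimate`, `summarize`) and all numbers are invented for illustration.

```python
import random
import statistics


def linear_intercept(xs, ys):
    """OLS intercept of y on x, i.e. the fitted value at the cutoff x = 0."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx


def rd_estimate(x, y, h):
    """Jump at cutoff 0 from separate linear fits within bandwidth h."""
    left = [(xi, yi) for xi, yi in zip(x, y) if -h <= xi < 0]
    right = [(xi, yi) for xi, yi in zip(x, y) if 0 <= xi <= h]
    return linear_intercept(*zip(*right)) - linear_intercept(*zip(*left))


def simulate(rng, n=400):
    """Hypothetical data: curved outcome with NO true jump at the cutoff."""
    x = [rng.uniform(-1, 1) for _ in range(n)]
    y = [xi ** 3 + rng.gauss(0, 0.5) for xi in x]
    return x, y


def summarize(h, reps=200, seed=0):
    """Mean (bias, since the true jump is 0) and SD of the estimate."""
    rng = random.Random(seed)
    estimates = [rd_estimate(*simulate(rng), h) for _ in range(reps)]
    return statistics.fmean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    for h in (0.2, 0.5, 1.0):
        bias, sd = summarize(h)
        print(f"bandwidth {h}: bias {bias:+.3f}, sd {sd:.3f}")
```

      In this toy setting the narrow bandwidth gives a nearly unbiased but noisy estimate, while the wide bandwidth gives a stable estimate that is pulled away from zero by the curvature the linear fit cannot capture.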

      MSE-derived bandwidths attempt to solve this issue by offering an automatic method to choose an analysis bandwidth in RD. Specifically, this aims to minimize the MSE of the local polynomial RD point estimator – effectively choosing a bandwidth by balancing the ‘bias vs. variance trade-off’ (explained in detail in Section 4.4.2 of Cattaneo et al., 2019, pp. 45-51, “A practical introduction to regression discontinuity designs: foundations”). Yet, you are quite right to highlight potential overfitting issues, as MSE-optimal bandwidths are “by construction invalid for inference” (Calonico, Cattaneo & Farrell, 2020, p. 192). Quoting from Cattaneo and Titiunik’s Annual Review of Economics from 2022: 

      “Ignoring the misspecification bias can lead to substantial overrejection of the null hypothesis of no treatment effect. For example, back-of-the-envelop calculations show that a nominal 95% confidence interval would have an empirical coverage of about 80%.”

      Fortunately, modern RD analysis packages (such as rdrobust or RDHonest) calculate robust confidence intervals - for more details see Armstrong and Kolesar (2020). For a summary on MSE-bandwidths see the section “Why is it hard to estimate RD effects?” in Stommes and colleagues 2023 (https://arxiv.org/abs/2109.14526). For more in-depth handling see the Cattaneo, Idrobo, and Titiunik primer (https://arxiv.org/abs/1911.09511).

      Lastly, with MSE-derived bandwidths sensitivity tests only make sense within a narrow window of the MSE-optimized bandwidth (5.5 Cattaneo et al., 2019 p 106 - 107). When a significant effect occurs, placebo cutoffs (artificially moving the cutoff) and donut-hole analysis are great sensitivity tests. Instead of testing our bandwidths, we decided to use an alternate RD framework (local randomization) in which we compare 1-month and 5-month windows. Across all analysis strategies, MRI modalities, and brain regions, we do not find any effects of the education policy change ROSLA on long-term neural outcomes.

      (4) In the Bayesian analysis, the authors deviated from their preregistered analytic plan. This whole section is a bit confusing in its current form - for example, point masses are not wide but rather narrow. Bayes factors are usually estimated; it is unclear how or why a prior was specified. What exactly is being modeled using a prior? Also, throughout - If the log was taken, as the methods seem to indicate for the Bayes factor, this should be mentioned in figures and reported estimates. 

      First, we would like to thank you for spotting that we incorrectly kept the log in the methods. We have fixed this and added the following sentence to the methods: 

      “Bayes factors are reported as BF<sub>10</sub> in support of the alternative hypothesis, we report Bayes factors under 1 as the multiplicative inverse (BF<sub>01</sub> = 1/BF)”

      All Bayesian analyses need to have a prior. In practice, this becomes an issue when you’re uncertain about 1) the location of the effect (directionality & center mass, defined by a location parameter), yet more importantly, the 2) confidence/certainty of the range-spread of possible effects (determined by a scale parameter). In normally distributed priors these two ‘beliefs’ are represented with a mean and a standard deviation (the latter impacts your confidence/certainty on the range of plausible parameter space). 

      Supplementary figure 6 illustrates several distributions (location = 0 for all) with varying scale parameters; when used as Bayesian priors this indicates differing levels of confidence in our certainty of the plausible parameter space. We illustrate our three reported, normally distributed priors centered at zero in blue with their differing scale parameters (sd = .5, 1 & 1.5).

      All of these five prior distributions have the same location parameter (i.e., 0) yet varying differences in the scale parameter – our confidence in the certainty of the plausible parameter space. At first glance it might seem like a flat/uniform prior (not represented) is a good idea – yet, this would put equal weight on the possibility of every estimate thereby giving the same probability mass to implausible values as plausible ones. A uniform prior would, for instance, encode the hypothesis that education causing a 1% increase in brain volume is just as plausible as it causing either a doubling or halving in brain volume. In human research, we roughly know a range of reasonable effect sizes and it is rare to see massive effects.

      A benefit of ‘weakly-informative’ priors is that they limit the range of plausible parameter values. The default prior in STAN (a popular Bayesian estimation program; https://mc-stan.org) is a normally distributed prior with a mean of zero and an SD of 2.5 (seen in orange in the figure; our initial preregistered prior). This large standard deviation easily permits positive and negative estimates putting minimal emphasis on zero. Contrast this to BayesFactor package’s (Morey R, Rouder J, 2023) default “wide” prior which is the Cauchy distribution (0, .7) illustrated in magenta (for more on the Cauchy see: https://distribution-explorer.github.io/continuous/cauchy.html). 

      These different defaults reflect differing Bayesian philosophical schools (‘estimate parameters’ vs ‘quantify evidence’ camps); if your goal is to accurately estimate a parameter it would be odd to have a strong null prior, yet (in our opinion) when estimating point-null BFs a wide default prior gives far too much evidence in support of the null. In point-null BF testing the Savage-Dickey density ratio is the ratio between the height of the prior at 0 and the height of the posterior at zero (see Figure under section “testing against point null 0”). This means BFs can be very prior sensitive (seen in SI tables 5 & 6). For this reason, we thought it made sense to do prior sensitivity testing. To ensure our conclusions in favor of the null were not caused solely by an overly wide prior (preregistered orange distribution), we decided to report the 3 narrower priors (blue ones).
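
      The prior sensitivity of a point-null Bayes factor can be sketched with the Savage-Dickey ratio in closed form. The example below is an illustration only: it assumes a normal likelihood summarised by a hypothetical effect estimate and standard error (0.05 and 0.1 are invented numbers), so that a Normal(0, sd) prior gives a normal posterior. It is not the model used in the manuscript, and the function names are made up.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def savage_dickey_bf01(estimate, se, prior_sd):
    """BF01 for H0: effect = 0 under a Normal(0, prior_sd) prior.

    Assumes a normal likelihood summarised by `estimate` and `se`, so the
    posterior is also normal (conjugate update).
    """
    post_var = 1.0 / (1.0 / prior_sd ** 2 + 1.0 / se ** 2)
    post_mean = post_var * estimate / se ** 2
    posterior_at_zero = normal_pdf(0.0, post_mean, math.sqrt(post_var))
    prior_at_zero = normal_pdf(0.0, 0.0, prior_sd)
    # Posterior height over prior height at zero gives evidence FOR the null.
    return posterior_at_zero / prior_at_zero


def report(bf01):
    """Report the Bayes factor in whichever direction is at least 1."""
    return ("BF01", bf01) if bf01 >= 1 else ("BF10", 1.0 / bf01)


if __name__ == "__main__":
    # Hypothetical estimate close to zero; prior scales as discussed above.
    for prior_sd in (0.5, 1.0, 1.5, 2.5):
        label, value = report(savage_dickey_bf01(0.05, 0.1, prior_sd))
        print(f"prior sd = {prior_sd}: {label} = {value:.1f}")
```

      With the same hypothetical data, the evidence for the null grows as the prior widens, which is why a wide default prior can overstate support for a null result.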

      Alternative Bayesian null hypotheses testing methods such as using Bayes Factors to test against a null region and ‘region of practical equivalence testing’ are less prior sensitive, yet both methods demand the researcher (e.g. ‘us’) to decide on a minimal effect size of practical interest. Once a minimal effect size of interest is determined any effect within this boundary is taken as evidence in support of the null hypothesis.
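
      As a rough sketch of the region-based alternative, the snippet below computes how much posterior mass falls inside a region of practical equivalence (ROPE). It assumes a normal posterior, and the ROPE half-width is a hypothetical choice standing in for the minimal effect size of interest that the researcher would have to justify.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of Normal(mean, sd) at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def posterior_mass_in_rope(post_mean, post_sd, rope_half_width):
    """Share of a normal posterior lying within [-half_width, +half_width]."""
    upper = normal_cdf(rope_half_width, post_mean, post_sd)
    lower = normal_cdf(-rope_half_width, post_mean, post_sd)
    return upper - lower


if __name__ == "__main__":
    # Hypothetical posterior (mean 0.02, sd 0.05) and ROPE of +/- 0.1.
    print(round(posterior_mass_in_rope(0.02, 0.05, 0.1), 3))
```

      The result depends directly on the chosen half-width, which is the decision this approach pushes onto the researcher.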

      (5) It is unclear why a different method was employed for the August / September data analysis compared to the full-time series. 

      We used a local-randomization RD framework, an entirely different empirical framework than continuity methods (resulting in a different estimate). For an overview see the primer by Cattaneo, Idrobo & Titiunik 2023 (“A Practical Introduction to Regression Discontinuity Designs: Extensions”; https://arxiv.org/abs/2301.08958).

      A local randomization framework is optimal when the running variable is discrete (as in our case with DOB in months) (Cattaneo, Idrobo & Titiunik 2023). It makes stronger assumptions on exchangeability therefore a very narrow window around the cutoff needs to be used. See Figure 2.1 and 2.2 (in the Cattaneo, Idrobo & Titiunik 2023) for graphical illustrations of 1) a randomized experiment, 2) a continuity RD design, and 3) local-randomization RD. Using the full-time series in a local randomization analysis is not recommended as there is no control for differences between individuals as we move further away from the cutoff – making the estimated parameter highly endogenous.
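
      The logic of local randomization can be sketched as a plain comparison of group means within a narrow window, paired with a permutation test. This is a frequentist stand-in chosen for brevity, not the Bayesian analysis reported in the manuscript. The coding of the running variable (months relative to the cutoff, with 0 as the first treated month) and the function name are assumptions made for the example.

```python
import random
import statistics


def local_randomization_test(running, outcome, window, n_perm=2000, seed=0):
    """Difference in means inside [-window, window) with a permutation p-value.

    `running` is the birth month relative to the cutoff: values >= 0 are
    treated, values < 0 are controls (an assumed coding).
    """
    inside = [(r, y) for r, y in zip(running, outcome) if -window <= r < window]
    treated = [y for r, y in inside if r >= 0]
    control = [y for r, y in inside if r < 0]
    if not treated or not control:
        raise ValueError("window must contain units on both sides of the cutoff")
    observed = statistics.fmean(treated) - statistics.fmean(control)

    # Under the null, treatment labels are exchangeable inside the window.
    pooled = treated + control
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (statistics.fmean(pooled[:len(treated)])
                - statistics.fmean(pooled[len(treated):]))
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)
```

      Calling the function with `window=1` compares the month before the cutoff with the month after it, and `window=5` compares five months on each side, mirroring the two windows mentioned above.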

      We understand how it is confusing to have both a new framework and Bayesian methods (we could have chosen a fully frequentist approach) but using a different framework allows us to weigh up the aforementioned ‘bias vs variance tradeoff’ while Bayesian methods allow us to say something about the weight of evidence (for or against) our hypothesis.

      (6) Figure 1 - why not use model fits from those employed for hypothesis testing? 

      This is a great suggestion (ties into #3), we have now redone Figure 1.

      (7) The section on "correlational effect" might also benefit from additional analyses and clarifications. Indeed, the data come from the same randomized experiment for which minimum education requirements were adjusted. Was the only difference that the number of years of education was studied as opposed to the cohort? If so, would the results of this analysis be similar in another subsample of the UK Biobank for which there was no change in policy?

      We have clarified the methods section for the correlational/associational effect. This was the same subset of individuals for the local randomization analysis; all we did was change the independent variable from an exogenous dummy-coded ROSLA term (where half of the sample had the natural experiment) to a continuous (endogenous) educational attainment IV. 

      In principle, the results from the associational analysis should be exactly the same if we use other UK Biobank cohorts. To see if the association of education attainment with the global neuroimaging cohorts was similar across sub-cohorts of new individuals, we conducted post hoc Bayesian analysis on eight more sub-cohorts of 10-month intervals, spaced 2 years apart from each other (Sup. Figure 7; each indicated by a different color). Four of these sub-cohorts predate ROSLA, while the other four are after ROSLA. Educational attainment is slowly increasing across the cohorts of individuals born from 1949 until 1965; intriguingly the effect of ROSLA is visually evident in the distributions of educational attainment (Sup. Figure 7). Also, as seen in the cohorts predating ROSLA more and more individuals were (already) choosing to stay in education past 15 years of age (see cohort 1949 vs 1955 in Sup. Figure 7).

      Sup. Figure 8 illustrates boxplots of the educational attainment posterior of the eight sub-cohorts in addition to our original analysis (s1957) using a normally distributed prior with a mean of 0 and an sd of 1. Total surface area shows a remarkably replicable association with education attainment. Yet, it is evident the “extremely strong” association we found for CSF was a statistical fluke – as the posterior of other cohorts (bar our initial test) crosses zero. The conclusions for the other global neuroimaging covariates where we concluded ‘no associational effect’ seem to hold across cohorts.

      We have now added methods, deviation from preregistration, and the following excerpt to the results:

      “A post hoc replication of this associational analysis in eight additional 10-month cohorts spaced two years apart (Sup. Figure 7) indicates our preregistered report on the associational effect of educational attainment on CSF to be most likely a false-positive (Sup. Figure 8). Yet, the positive association between surface area and educational attainment is robust across the additional eight replication cohorts.”

      Reviewer #2 (Public review): 

      Summary: 

      The authors conduct a causal analysis of years of secondary education on brain structure in late life. They use a regression discontinuity analysis to measure the impact of a UK law change in 1972 that increased the years of mandatory education by 1 year. Using brain imaging data from the UK Biobank, they find essentially no evidence for 1 additional year of education altering brain structure in adulthood. 

      Strengths: 

      The authors pre-registered the study and the regression discontinuity was very carefully described and conducted. They completed a large number of diagnostic and alternate analyses to allow for different possible features in the data. (Unlike a positive finding, a negative finding is only bolstered by additional alternative analyses). 

      Weaknesses: 

      While the work is of high quality for the precise question asked, ultimately the exposure (1 additional year of education) is a very modest manipulation and the outcome is measured long after the intervention. Thus a null finding here is completely consistent with educational attainment (EA) in fact having an impact on brain structure, where EA may reflect elements of training after secondary education (e.g. university, post-graduate qualifications, etc) and not just stopping education at 16 yrs yes/no. 

      The work also does not address the impact of the UK Biobank's well-known healthy volunteer bias (Fry et al., 2017) which is yet further magnified in the imaging extension study (Littlejohns et al., 2020). Under-representation of people with low EA will dilute the effects of EA and impact the interpretation of these results. 

      References: 

      Fry, A., Littlejohns, T. J., Sudlow, C., Doherty, N., Adamska, L., Sprosen, T., Collins, R., & Allen, N. E. (2017). Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population. American Journal of Epidemiology, 186(9), 1026-1034. https://doi.org/10.1093/aje/kwx246 

      Littlejohns, T. J., Holliday, J., Gibson, L. M., Garratt, S., Oesingmann, N., Alfaro-Almagro, F., Bell, J. D., Boultwood, C., Collins, R., Conroy, M. C., Crabtree, N., Doherty, N., Frangi, A. F., Harvey, N. C., Leeson, P., Miller, K. L., Neubauer, S., Petersen, S. E., Sellors, J., ... Allen, N. E. (2020). The UK Biobank imaging enhancement of 100,000 participants: rationale, data collection, management and future directions. Nature Communications, 11(1), 2624. https://doi.org/10.1038/s41467-020-15948-9 

      We thank the reviewer for the positive comments and constructive feedback, in particular, their emphasis on volunteer bias in UKB (similar points were mentioned by Reviewer 3). We have now addressed these limitations with the following passage in the discussion:

      “The UK Biobank is known to have ‘healthy volunteer bias’, as respondents tend to be healthier, more educated, and are more likely to own assets [71,72]. Various types of selection bias can occur in non-representative samples, impacting either internal (type 1) or external (type 2) validity. One benefit of a natural experimental design is that it protects against threats to internal validity from selection bias [43], design-based internal validity threats still exist, such as if volunteer bias differentially impacts individuals based on the cutoff for assignment. A more pressing limitation – in particular, for an education policy change – is our power to detect effects using a sample of higher-educated individuals. This is evident in our first stage analysis examining the percentage of 15-year-olds impacted by ROSLA, which we estimate to be 10% in neuro-UKB (Sup. Figure 2 & Sup. Table 2), yet has been reported to be 25% in the UK general population [41]. Our results should be interpreted for this subpopulation  (UK, 1973, from 15 to 16 years of age, compliers) as we estimate a ‘local’ average treatment effect [73]. Natural experimental designs such as ours offer the potential for high internal validity at the expense of external validity.”

      We also highlighted it both in the results and methods.

      We appreciate that one year of education may seem modest compared to the entire educational trajectory, but as an intervention, we disagree that one year of education is ‘a very modest manipulation’. It is arguably one of the largest positive manipulations in childhood development we can administer. If we were to translate a year of education into the language of a (cognitive) intervention, it is clear that the manipulation, at least in terms of hours, days, and weeks, is substantial. Prior work on structural plasticity (e.g., motor, spatial & cognitive training) has involved substantially more limited manipulations in time, intensity, and extent. There is even (limited) evidence of localized persistent long-term structural changes (Woollett & Maguire, 2011, Cur. Bio.).

      We have now also highlighted the limited generalizability of our findings since we estimate a ‘local’ average treatment effect. It is possible higher education (college, university, vocational schools, etc.) could impact brain structure, yet we see no theoretical reason why it would while secondary wouldn’t. Moreover, higher education is even trickier to research empirically due to heightened self and administrative selection pressures. While we cannot discount this possibility, the impacts of endogenous factors such as genetics and socioeconomic status are most likely heightened. That being said, higher education offers exciting possibilities to compare more domain-specific processes (e.g., by comparing a philosophy student to a mathematics student). Causality could be tested in European systems with point entry into field-specific programs – allowing comparison of students who just missed entry criteria into one topic and settled for another.

      Regarding the amount of time following the manipulation, as we highlight in our discussion this is both a weakness and a strength. Viewed from a developmental neuroplasticity lens it would have been nice to have imaging immediately following the manipulation. Yet, from an aging perspective, our design has increased power to detect an effect.  

      Reviewer #2 (Recommendations for the authors): 

      (1) The authors assert there is no strong causal evidence for EA on brain structure. This overlooks work from Mendelian Randomisation, e.g. this careful work: https://pubmed.ncbi.nlm.nih.gov/36310536/ ... evidence from (good quality) MR studies should be considered. 

      We thank the reviewer for highlighting this well-done Mendelian randomization study. We have now added this citation and removed previous claims on the “lack of causal evidence existing”. We refrain from discussing Mendelian randomization, as it would need to be accompanied by a nuanced discussion on the strong limitations regarding EduYears-PGS in Mendelian randomization designs.

      (2) Tukey/Boxplot is a good name for your identification of outliers but your treatment of outliers has a well-recognized name that is missing: Winsorisation. Please add this term to your description to help the reader more quickly understand what was done. 

      Thanks, we have now added the term winsorized.
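
      For readers unfamiliar with the term, the sketch below shows one common variant: values beyond the Tukey fences (1.5 times the interquartile range outside the quartiles) are pulled back to the fence rather than dropped. Whether the manuscript used exactly this rule, this multiplier, or this quantile method is an assumption, and the function name is hypothetical.

```python
import statistics


def winsorize_tukey(values, k=1.5):
    """Clip values lying outside the Tukey fences to the nearest fence."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Values inside the fences are returned unchanged.
    return [min(max(v, lower_fence), upper_fence) for v in values]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    print(winsorize_tukey(data))
```

      Unlike trimming, this keeps the sample size intact while limiting the leverage of extreme observations.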

      (3) Nowhere is it plainly stated that "fuzzy" means that you allow for imperfect compliance with the exposure, i.e. some children born before the cut-off stayed in school until 16, and some born after the cut-off left school before 16. For those unfamiliar with RD it would be very helpful to explain this at or near the first reference of the term "fuzzy". 

      We have now clarified the term ‘fuzzy’ in the results and methods:

      methods:

      “RD designs, like ours, can be ‘fuzzy’ indicating when assignment only increases the probability of receiving it, in turn, treatment assigned and treatment received do not correspond for some units 33,53. For instance, due to cultural and historical trends, there was an increase in school attendance before ROSLA; most adolescents were continuing with education past 15 years of age (Sup Plot. 7b). Prior work has estimated that 25 percent of children would have left school a year earlier if not for ROSLA 41. Using the UK Biobank, we estimate this proportion to be around 10%, as the sample is healthier and of higher SES than the general population (Sup. Figure 2; Sup. Table 2) 46–48.”
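
      One way to see what ‘fuzzy’ implies for estimation is the Wald ratio: the jump in the outcome at the cutoff divided by the jump in the share of units actually receiving treatment. The sketch below uses simple differences in means on either side of the cutoff, which is a simplification of the local-linear approach described in this response. The toy data and function name are invented for illustration.

```python
from statistics import fmean


def wald_fuzzy_rd(y_left, y_right, d_left, d_right):
    """Return (effect among compliers, intention-to-treat jump, first stage).

    y_* are outcomes and d_* are 0/1 indicators of treatment received
    (e.g. stayed in school until 16) just below and just above the cutoff.
    """
    itt = fmean(y_right) - fmean(y_left)          # jump in the outcome
    first_stage = fmean(d_right) - fmean(d_left)  # jump in treatment uptake
    if first_stage == 0:
        raise ValueError("no change in treatment uptake at the cutoff")
    return itt / first_stage, itt, first_stage


if __name__ == "__main__":
    # Toy data: 90% already stayed in school before the reform, 100% after.
    d_left, d_right = [1] * 9 + [0], [1] * 10
    y_left, y_right = [0.0] * 10, [0.5] * 10
    print(wald_fuzzy_rd(y_left, y_right, d_left, d_right))
```

      With a first stage of about 10%, any jump in the outcome is scaled up roughly tenfold to give the effect among compliers, which also means that noise in the outcome jump is amplified by the same factor.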

      (4) Supplementary Figure 2 never states what the percentage actually measures. What exactly does each dot represent? Is it based on UK Biobank subjects with a given birth month? If so clarify. 

      Fixed!

      Reviewer #3 (Public review): 

      Summary: 

      This study investigates evidence for a hypothesized, causal relationship between education, specifically the number of years spent in school, and brain structure as measured by common brain phenotypes such as surface area, cortical thickness, total volume, and diffusivity. 

      To test their hypothesis, the authors rely on a "natural" intervention, that is, the 1972 ROSLA act that mandated an extra year of education for all 15-year-olds. The study's aim is to determine potential discontinuities in the outcomes of interest at the time of the policy change, which would indicate a causal dependence. Naturalistic experiments of this kind are akin to randomised controlled trials, the gold standard for answering questions of causality. 

      Using two complementary, regression-based approaches, the authors find no discernible effect of spending an extra year in primary education on brain structure. The authors further demonstrate that observational studies showing an effect between education and brain structure may be confounded and thus unreliable when assessing causal relationships. 

      Strengths: 

      (1) A clear strength of this study is the large sample size totalling up to 30k participants from the UK Biobank. Although sample sizes for individual analyses are an order of magnitude smaller, most neuroimaging studies usually have to rely on much smaller samples. 

      (2) This study has been preregistered in advance, detailing the authors' scientific question, planned method of inquiry, and intended analyses, with only minor, justifiable changes in the final analysis. 

      (3) The analyses look at both global and local brain measures used as outcomes, thereby assessing a diverse range of brain phenotypes that could be implicated in a causal relationship with a person's level of education. 

      (4) The authors use multiple methodological approaches, including validation and sensitivity analyses, to investigate the robustness of their findings and, in the case of correlational analysis, highlight differences with related work by others. 

      (5) The extensive discussion of findings and how they relate to the existing, somewhat contradictory literature gives a comprehensive overview of the current state of research in this area. 

      Weaknesses: 

      (1) This study investigates a well-posed but necessarily narrow question in a specific setting: 15-year-old British students born around 1957 who also participated in the UKB imaging study roughly 60 years later. Thus conclusions about the existence or absence of any general effect of the number of years of education on the brain's structure are limited to this specific scenario. 

      (2) The authors address potential concerns about the validity of modelling assumptions and the sensitivity of the regression discontinuity design approach. However, the possibility of selection and cohort bias remains and is not discussed clearly in the paper. Other studies (e.g. Davies et al 2018, https://www.nature.com/articles/s41562-017-0279-y) have used the same policy intervention to study other health-related outcomes and have established ROSLA as a valid naturalistic experiment. Still, quoting Davies et al. (2018), "This assumes that the participants who reported leaving school at 15 years of age are a representative sample of the sub-population who left at 15 years of age. If this assumption does not hold, for example, if the sampled participants who left school at 15 years of age were healthier than those in the population, then the estimates could underestimate the differences between the groups.". Recent studies (Tyrrell 2021, Pirastu 2021) have shown that UK Biobank participants are on average healthier than the general population. Moreover, the imaging sub-group has an even stronger "healthy" bias (Lyall 2022). 

      (3) The modelling approach used in this study requires that all covariates of no interest are equal before and after the cut-off, something that is impossible to test. Mentioned only briefly, the inclusion and exclusion of covariates in the model are not discussed in detail. Standard imaging confounds such as head motion and scanning site have been included but other factors (e.g. physical exercise, smoking, socioeconomic status, genetics, alcohol consumption, etc.) may also play a role. 

      We thank the reviewer for their numerous positive comments and have now attempted to address the first two limitations (generalizability and UKB bias) with the following passage in the discussion:

      “The UK Biobank is known to have ‘healthy volunteer bias’, as respondents tend to be healthier, more educated, and are more likely to own assets [71,72]. Various types of selection bias can occur in non-representative samples, impacting either internal (type 1) or external (type 2) validity. One benefit of a natural experimental design is that it protects against threats to internal validity from selection bias [43], design-based internal validity threats still exist, such as if volunteer bias differentially impacts individuals based on the cutoff for assignment. A more pressing limitation – in particular, for an education policy change – is our power to detect effects using a sample of higher-educated individuals. This is evident in our first stage analysis examining the percentage of 15-year-olds impacted by ROSLA, which we estimate to be 10% in neuro-UKB (Sup. Figure 2 & Sup. Table 2), yet has been reported to be 25% in the UK general population [41]. Our results should be interpreted for this subpopulation  (UK, 1973, from 15 to 16 years of age, compliers) as we estimate a ‘local’ average treatment effect [73]. Natural experimental designs such as ours offer the potential for high internal validity at the expense of external validity.”

      We further highlight this in the results section:

      “Compliance with ROSLA was very high (near 100%; Sup. Figure 2). However, given the cultural and historical trends leading to an increase in school attendance before ROSLA, most adolescents were continuing with education past 15 years of age before the policy change (Sup Plot. 7b). Prior work has estimated 25 percent of children would have left school a year earlier if not for ROSLA 41. Using the UK Biobank, we estimate this proportion to be around 10%, as the sample is healthier and of higher SES than the general population (Sup. Figure 2; Sup. Table 2) 46–48.”

      Healthy volunteer bias can create two types of selection bias; crucially, participation itself can serve as a collider threatening internal validity (outlined in van Alten et al., 2024; https://academic.oup.com/ije/article/53/3/dyae054/7666749). Natural experimental designs are partially sheltered from this major limitation, as ‘volunteer bias’ would have to differentially impact individuals on one side of the cutoff and not the other – thereby breaking a primary design assumption of regression discontinuity. Substantial prior work (including this article) has not found any threats to the validity of the 1973 ROSLA (Clark & Royer 2010, 2013; Barcellos et al., 2018, 2023; Davies et al., 2018, 2023). While the Davies 2018 article did IP-weight with the UK Biobank sample, Barcellos and colleagues 2023 (and 2018) do not, highlighting the following: “Although the sample is not nationally representative, our estimates have internal validity because there is no differential selection on the two sides of the September 1, 1957 cutoff – see Appendix A.”.

      The second (more acknowledged & arguably less problematic) type of selection bias results in threats to external validity (aka generalizability). As highlighted in your first point; this is a large limitation with every natural experimental design, yet in our case, this is further amplified by the UK Biobank’s healthy volunteer bias. We have now attempted to highlight this limitation in the discussion passage above.

      Point 3 – the inability to fully confirm design validity – is again, another inherent limitation of a natural experimental approach. That being said, extensive prior work has tested different predetermined covariates in the 1973 ROSLA (cited within), and to our knowledge, no issues have been found. The 1973 ROSLA seems to be one of the better natural experiments around (there was also a concerted effort to have an ‘effective’ additional year; see Clark & Royer 2010). For these reasons, we stuck with only testing the variables we wanted to use to increase precision (also offering new neuroimaging covariates that didn’t exist in the literature base). One additional benefit of ROSLA was that the cutoff was decided years later on a variable that happened (date of birth) in the past – making it particularly hard for adolescents to alter their assignments.

      Reviewer #3 (Recommendations for the authors): 

      (1) FMRIB's preprocessing pipeline is mentioned. Does this include deconfounding of brain measures? Particularly, were measures deconfounded for age before the main analysis? 

      This is such a crucial point that we triple-checked: brain imaging phenotypes were not corrected for age (https://biobank.ctsu.ox.ac.uk/crystal/crystal/docs/brain_mri.pdf). Large effects of age can be seen in the global metrics; older individuals have less surface area, thinner cortices, less brain volume (corrected for head size), more CSF volume (corrected for head size), more white matter hyperintensities, and worse FA values. Figure 1 shows these large age effects, which are controlled for in our continuity-based RD analysis.

      One’s date of birth (DOB) of course does not map perfectly onto their age at scanning, which is why we included the covariate ‘visit date’. This interplay can now be seen in our updated SI Figure 1 (recommended in #3), which shows the distributions of visit date, DOB, and age of scan.

      In a valid RD design covariates should not be necessary (as they should be balanced on either side of the cutoff), yet the inclusion of covariates does increase precision to detect effects. We tested this assumption, finding ‘visit date’ and its quadratic term to be unrelated to ROSLA (Sup. Table 1). This adds further evidence (specific to the UK Biobank sample) to the existing body of work showing that the 1973 ROSLA policy change does not violate any design assumptions. Threats to internal validity would more than likely increase endogeneity and result in ‘false positive causal effects’ (which is not what we find).

      (2) Despite the large overall sample size, I am wondering whether the effective number of samples is sufficient to detect a potentially subtle effect that is further attenuated by the long time interval before scanning. As stated, for the optimised bandwidth window (DoB 20 to 35 months around cut-off), N is about 5000. Does this mean that effectively about 250 (10%) out of about 2500 participants born after the cut-off were leaving school at 16 rather than 15 because of ROSLA? For the local randomisation analysis, this becomes about N=10 (10% out of 100). Could a power analysis show that these cohort sizes are large enough to detect a reasonably large effect? 

      This is a very valid point, one which we were grappling with while the paper was out for review. We now draw attention to this in the results and highlight this as a limitation in the discussion. While UKB’s non-representativeness limits our power (10% affected rather than 25% in the general population), it is still a very large sample. Our sample size is more in line with standard neuroimaging studies than with large cohort studies. 
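
The reviewer's arithmetic, and the way a low first stage dilutes power, can be sketched as a back-of-envelope calculation. This is a crude illustration, not the authors' analysis: it uses a two-sample normal approximation, ignores the running variable, covariates, and the Bayesian framework used in the paper, and the function names, `alpha`, and `power` are assumptions. Only the sample sizes (about 2500 and 100 after the cut-off) and the first-stage proportions (10% and 25%) come from the text.

```python
from statistics import NormalDist


def complier_count(n_after_cutoff, first_stage):
    """Expected number of participants whose schooling changed because of the policy."""
    return n_after_cutoff * first_stage


def minimum_detectable_late(n_per_side, first_stage, alpha=0.05, power=0.80):
    """Approximate minimum detectable effect among compliers, in SD units.

    Step 1: minimum detectable intention-to-treat difference between the two
            sides of the cut-off (two-sample normal approximation).
    Step 2: rescale by the first-stage jump, since the complier effect is the
            intention-to-treat effect divided by the share of compliers.
    """
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    mde_itt = z_total * (2 / n_per_side) ** 0.5
    return mde_itt / first_stage


# Reviewer's examples: continuity-based window and local-randomisation window.
print(complier_count(2500, 0.10))
print(complier_count(100, 0.10))

# Same window, UK Biobank first stage (10%) versus general population (25%).
print(round(minimum_detectable_late(2500, 0.10), 2))
print(round(minimum_detectable_late(2500, 0.25), 2))
```

Under these assumptions the detectable complier effect scales with the inverse of the first stage, so a 10% first stage requires an effect 2.5 times larger than a 25% first stage at the same sample size.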

      The novelty of our study is its causal design. While we could very precisely measure an effect of some phenotype (variable X) in 40,000 individuals, this effect is probably not what we think we are measuring. Without IP-weighting it could even have a different sign. But more importantly, it is not variable X – it is the thousands of things (unmeasured confounders) that lead an individual to have more or less of variable X. The larger the sample, the easier it is for small unmeasured confounders to reach significance (Big data paradox). This in no way invalidates large samples; rather, our thinking and how we handle large samples will hopefully shift towards a more causal lens.

      (3) Supplementary Figure 1: A similar raincloud plot of date of birth would be instructive to visualise the distribution of subjects born before and after the 1957 cut-off. 

      Great idea! We have done this in Sup Fig. 1 for both visit date and DOB.

      (4) p.9: Not sure about "extreme evidence", very strong would probably be sufficient. 

      As preregistered, we interpreted Bayes Factors using Jeffreys’ criteria. ‘Extreme evidence’ is only used once and it is about finding an associational effect of educational attainment on CSF (BF10 > 100). Upon Reviewer 1’s recommendation 7, we analysed eight replication samples (Sup. Figure 7 & 8) and have now added the following passage to the results:

      “A post hoc replication of this associational analysis in eight additional 10-month cohorts spaced two years apart (Sup. Figure 7) indicates our preregistered report on the associational effect of educational attainment on CSF to be most likely a false-positive (Sup. Figure 8). Yet, the positive association between surface area and educational attainment is robust across the additional eight replication cohorts.”
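
The verbal labels under discussion in point (4) can be made explicit with a small lookup. This is a sketch under assumptions: the thresholds (3, 10, 30, 100) follow the commonly cited adaptation of Jeffreys' scale, and only the "BF10 > 100 is extreme" category is confirmed by the response above; the authors' preregistered cut-offs for the other categories may differ. The function name is hypothetical.

```python
def evidence_label(bf10):
    """Map a Bayes factor BF10 to a verbal evidence category.

    Thresholds are the commonly cited Jeffreys-style bands (assumed here):
    1-3 anecdotal, 3-10 moderate, 10-30 strong, 30-100 very strong, >100 extreme.
    """
    if bf10 <= 0:
        raise ValueError("Bayes factors must be positive")
    # Express the evidence in favour of whichever hypothesis is supported.
    favoured = "H1" if bf10 >= 1 else "H0"
    strength = bf10 if bf10 >= 1 else 1 / bf10
    if strength == 1:
        return "no evidence"
    bands = [(3, "anecdotal"), (10, "moderate"), (30, "strong"), (100, "very strong")]
    for upper_bound, label in bands:
        if strength <= upper_bound:
            return f"{label} evidence for {favoured}"
    return f"extreme evidence for {favoured}"


print(evidence_label(150))   # a BF10 above 100, as reported for CSF
print(evidence_label(0.25))  # a Bayes factor favouring the null
```

On this scale the reviewer's suggested wording ("very strong") corresponds to a different band than the one the reported Bayes factor falls into, which is why the authors retained the preregistered label.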

      (5) The code would benefit from a bit of clean-up and additional documentation. In its current state, it is not easy to use, e.g. in a replication study. 

      We have now further added documentation to our code; including a readme describing what each script does. The analysis pipeline used is not ideal for replications as the package used for continuity-based RD (RDHonest) initially could not handle covariates – therefore we manually corrected our variables after a discussion with Prof Kolesár (https://github.com/kolesarm/RDHonest/issues/7). 

      Prof Kolesár added this functionality recently and future work should use the latest version of the package as it can correct for covariates. We have a new preprint examining the effect of the 1972 ROSLA on telomere length in the UK Biobank using the latest package version of RDHonest (https://www.biorxiv.org/content/10.1101/2025.01.17.633604v1). To ensure maximum availability of such innovations, we will ensure the most up-to-date version of this script becomes available on this GitHub link (https://github.com/njudd/EduTelomere).

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript described a structure-guided approach to graft important antigenic loops of the neuraminidase to a homotypic but heterologous NA. This approach allows the generation of well-expressed and thermostable recombinant proteins with antigenic epitopes of choice to some extent. The loop-grafted NA was designated hybrid.

      Strengths:

      The hybrid NA appeared to be more structurally stable than the loop-donor protein while acquiring its antigenicity. This approach is of value when developing a subunit NA vaccine that is difficult to express, since antigenic loops could potentially be grafted onto a stable NA scaffold to transfer strain-specific antigenicity.

      Weaknesses:

      However, major revisions are needed to better organize the text and figures and to clarify a number of points. There are a few cases in which a later figure was described first, data in the figures were not sufficiently described, or references to figures were mismatched.

      More importantly, the hybrid proteins did not show any of the advantages over the loop-donor protein in the format of VLP vaccine in mouse studies, so it's not clear why such an approach is needed to begin with if the original protein is doing fine.

      We thank the reviewer for their helpful comments. We have incorporated feedback from the reviewers to improve the manuscript. Please see our point-by-point response.

      The purpose of loop-grafting between H5N1/2021 (a high-expressor) and the PR8 virus was not to improve the expression of PR8, which is already a good expressing NA. Instead, the loop-grafting and the in vivo experiments were done to show the loop-specific protection following a lethal PR8 virus challenge.

      Reviewer #2 (Public review):

      In their manuscript, Rijal and colleagues describe a 'loop grafting' strategy to enhance expression levels and stability of recombinant neuraminidase. The work is interesting and important, but there are several points that need the author's attention.

      Major points

      (1) The authors overstress the importance of the epitopes covered by the loops they use and play down the importance of antibodies binding to the side, the edges, or the underside of the NA. A number of papers describing those mAbs are also not included.

      We have discussed the distribution of epitopes on the NA molecule in the Discussion section "The distribution of epitopes in neuraminidase" (new line number 350). In Supplementary Figures 1 and 2, we have compiled the epitopes reported for polyclonal sera and mAbs via escape virus selection or crystal structural studies. There are 45 residues identified by escape virus selection, and we found that approximately 90% of the epitopes are located within the top loops (Loops 01 and Loops 23, which include the lateral sides and edges of NA). We have also included the epitopes of underside mAbs NDS.1 and NDS.3 in Supplementary Figure 2. Some of the interactions formed by these mAbs are also within the L01 and L23 loops. All relevant references are cited in Supplementary Figures 1 and 2.

      A new figure has been added [Figure 1b (ii)] to illustrate the surface mapping of epitopes on NA.

      (2) The rationale regarding the PR8 hybrid is not well described and should be described better.

      We described the rationale for the PR8 hybrid (new lines 247-250). For clarity, we have added the following sentence within the section "Loop transfer between two distant N1 NAs:...."

      (new lines 255-258):

      "mSN1 showed sufficient cross-reactivity to N1/09 to protect mice against virus challenge. Therefore, we performed loop transfer between mSN1 and PR8N1, which differ by 18 residues within the L01 and L23 loops and show no or minimal cross-reactivity, to assess the loop-specific protection."

      (3) Figure 3B and 6C: This should be given as numbers (quantified), not as '+'.

      We have included the numerical data in Supplementary Figure 6. The data are presented in a semi-quantitative manner for simplicity. To improve clarity, we have now added the following sentence to the Figure 3c legend: "Refer to Supplementary Figure 6 for binding titration data".

      (4) Figure 5A and 7A: Negative controls are missing.

      A pool of Empty VLP sera was included as a negative control, showing no inhibition at 1:40 dilution. In the figure legends, we have stated "Pooled sera to unconjugated mi3 VLP was negative control and showed no inhibition at 1:40 dilution (not included in the graphs)"

      (5) The authors claim that they generate stable tetramers. Judging from SDS-PAGE provided in Supplementary Figure 3B (BS3-crosslinked), many different species are present including monomers, dimers, tetramers, and degradation products of tetramers. In line 7 for example there are at least 5 bands.

      The tetrameric conformation of the soluble proteins is evidenced by the size-exclusion chromatograms shown in Figures 3a and 6b. The BS3-crosslinked SDS-PAGE is only suggestive, indicating that the protein is a tetramer if a band appears at ~250 kDa. However, depending on the reaction conditions, lower molecular weight bands may also be observed if crosslinking is incomplete.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Specific comments:

      - Description of Figure 2 on page 3 should go before Figure 3 lines 87-105 or swap the order of the two figures.

      We have moved lines 91-96, which refer to Figure 3, to appear after Figure 2.

      - Figure 3a, an EC50 should be calculated for both NA activity assay.

      Figure 3a has been updated to include the EC50 and AUC (Area under curve) values for both NA activity assays. The same update has also been made for Figure 6b.

      - Line 150, I'm not sure it's appropriate to cite a manuscript that was in preparation but not published. I'm referring to the two mAbs AG7C and AF9C that were claimed to bind to the L01 and L23 loops but not.

      We have changed the "manuscript in preparation" to "personal communication with Dr. Yan Wu, Capital Medical University".

      - The description in Figure 4a is lacking.

      We have added a detailed description for Figure 4a.

      - Figure 4c, sufficient description is needed. For example, the cavity should be outlined and annotated, what is the role of Val149? Why the first monomer is assigned a number of II and the second monomer with a number of I.

      We have added a detailed description for Figure 4c and amended the figure as per the reviewer’s suggestions.

      - Figure 5a, in addition to ELLA data to mSN1 and N1/09, ELLA data to N1/19 should also be measured and shown. Figure S7, please show IC50 instead of curves for better comparison.

      We included IC50 for mSN1 and N1/09 as we intended to associate the loops with protection. Graphs for N1/19 have not been reported, but the IC50 titres from pooled sera are shown in Supplementary Figure 7 as a representation. Due to the limited sera sample sourced from tail vein bleed, these assays were performed using pooled sera, which represent the total response (established in a number of experiments).

      - Line 234-238, the author made a statement about the data shown in Figure 7b "These results mirrored several studies in the literature which showed that immunization with the 2009 N1 could provide at least partial protection in mice and ferrets to the avian H5N1 challenge". The data did not reflect that. In Figure 5b, mSN1 protects as well as other proteins. In fact, there was no advantage of N1/09 and the N1/09 hybrid over mSN1 in protection against the homologous H1N1/09, although higher levels of NAI antibodies were induced with the homologous protein in Figure 5a. The protection could be contributed by non-NAI antibodies, so the authors should measure binding antibodies. The author may increase the challenge dose from 200 LD50 to 1000 LD50 to see a difference due to the strong immunogenicity of the nanoparticle vaccine plus AddaVax. Otherwise, it looks like loop grafting is not necessary as heterologous NA could broadly protect.

      We agree that mSN1, despite its low NAI titres, was as protective as the homologous NA or its hybrid NA against H1N1/09 virus challenge at 200 LD50. There may be additional protective components, including non-NAI antibodies, in the homologous groups that contributed to the protection.

      We assessed sera binding to H1N1/2009 and found that the binding antibody levels were also lower in the mSN1 group. The corresponding graph has now been added in Figure S7d. It was difficult to determine the NAI titre required to confer protection in this experiment. For this reason, we later chose PR8 as the challenge virus to demonstrate loop-specific protection.

      We are uncertain whether a 1000 LD50 challenge would have helped establish a correlation between protection and NAI IC50 titres, as the dose used is already lethal for DBA/2 mice.

      - Why would the authors separate work with N1/09 and N1/19 from PR8 N1? To this reviewer's understanding, they are all the same strategies with increasing numbers of dissimilar residues from N1/09 (12) to N1/19 (16) and to PR8 (18). They are all characterized by the same approaches in vitro and in vivo.

      We had two different goals for making hybrids with N1/09 and PR8 N1, therefore, we have presented these results separately.

      (1) For N1/09 and N1/19, we showed that loop-grafting improved protein yield and stability. Additionally, we showed that the N1/09 hybrid can be as protective as the homologous protein.

      (2) PR8 N1 is a high-yielding protein, so loop grafting did not significantly increase its yield. However, the PR8 virus challenge confirmed loop-specific protection.

      - For in vivo study testing the PR8 construct, although PR8 and PR8 hybrid protect better than the heterologous mSN1, the hybrid again did not show any advantages over the PR8 original proteins.

      That's correct - the PR8 hybrid was not advantageous over the original PR8 protein. However, the purpose of this experiment was to demonstrate loop specific protection. The PR8 hybrid (PR8 loops - mS scaffold) protected 6/6 mice, whereas mS hybrid (mS loops - PR8 scaffold) provided no protection.

      - Line 243-249, lack of reference to figures.

      References to Supplementary Figure 7b,c and Figure 2 have been added.

      - What was the reason that the challenge was done with 200 LD50 for 2009 H1N1 and 1000 LD50 for PR8?

      Viruses were titrated in the BALB/c strain for PR8 virus and the DBA/2 strain for X-179A (H1N1/2009) virus. These doses were selected based on their lethality and the time required to reach the endpoint (~20% weight loss) post-infection, which is 5-6 days. Most studies in the literature have used 10 LD50 or higher; thus the virus doses we used are relatively high.

      - Line 268, there is no Figure 5C.

      This was a mistake and has been corrected to Figure 6c.

      - Line 275 what are the readers supposed to see in supplementary Figure 5a? There is not enough description for the referred figures.

      A sentence has been added to Fig S5a description, to make a point about recognition of the NA scaffold by mAb CD6. "Binding by mAb CD6 is predominantly scaffold dependent and occurs across two protomers"

      - The discussion is very long and some of it is not relevant to the study. For example, the role of the tetramerization domain and the basis for structurally stable tetramer formation, were not the focuses of this study.

      We felt it was important to discuss the tetramerisation domain and the basis for stable tetramer formation. A previous study by Ellis et al.  used the VASP tetramerisation domain and introduced multiple NA interface mutations to achieve a more stable closed conformation. In contrast, NA proteins used in our study required the tetrabrachion tetramerisation domain to form a properly assembled tetramer.

      In lines 382-383, there is one unfinished sentence.

      This is corrected.

      The definition of the loops is also confusing. Line 381, the author stated that in the N1/19 hybrid design, residue N200S, could have been considered as part of the loop B2L23, and was it not?

      The designation of loop ends should not be rigid but rather based on multiple factors, such as their proximity to antigenic epitopes, charge, and hydrophobicity. This is discussed in the "Definition of loops" section.

      - Figure 1a and Figure S2, please provide sufficient descriptions, what do the blocks in different colors mean?

      We have updated the Figure 1a legend to indicate the colours.

      The descriptions for Figures S1 and S2 have also been revised for clarity.

      Reviewer #2 (Recommendations for the authors):

      Minor points

      (1) Line 37: Should be 'Influenza virus neuraminidase'.

      This is corrected.

      (2) Line 65: https://pubmed.ncbi.nlm.nih.gov/35446141/, https://pubmed.ncbi.nlm.nih.gov/33568453/ and https://pubmed.ncbi.nlm.nih.gov/28827718/ indicate that protective mAbs bind all over the NA head domain.

      We have discussed the epitopes on the NA head in detail in the section "The distribution of epitopes on Neuraminidase". In Supplementary Figures 1 and 2, we compiled several studies, including those on polyclonal sera and mAbs epitopes, emphasizing that loops 01 and 23 are the predominant antibody targets (~90%). Some antibodies also bind to the underside of NA. We have discussed and referenced these studies accordingly.

      A new figure has been added [Figure 1b (ii)] to illustrate the surface mapping of epitopes on NA.

      The first reference has been included in both our discussion and Supplementary figure 1.

      The NA epitopes discussed in the second reference have also been incorporated into our discussion and Supplementary figures 1 and 2. Note that the E258K mutation on the NA underside was not relevant to mAbs and arose randomly during passaging of the H3N2 A/New York/PV190/2017 virus.

      The third reference pertains to murine mAbs against influenza B virus NA.

      (3) Lines 71, 72, and throughout: 'et al.' should be in italics.

      All "et al." have been italicised.

      (4) Many abbreviations are not defined including CHO, SDS-PAGE, MUNANA, mi3, HEPES, BSA, TPCK, MWCO, HRP, PBS, TMB, TCID50, LD50, MES, PEG, PGA, MME, PGA-LM.

      The text has been amended to define these abbreviations.

      (5) Line 209: Shouldn't this be ID50 instead of IC50? Also, it is not defined.

      IC50 has been defined.

      (6) Line 210, line 346, line 581-582: No need to capitalize letters at the beginning of words mid-sentence.

      This is amended.

      (7) Line 227: Is 2009 H1N1 NA meant?

      This has been changed to "H1N1/2009 neuraminidase"

      (8) Line 310: Is this really quantitatively true? (see major comment 1).

      Based on the compilation of epitopes from published NA mAbs and polyclonal sera (via escape mutagenesis and NA-Fabs crystal structures), it is accurate to state that the protective epitopes are primarily located within loops 01 and 23.

      Please also refer to our response to minor point 2. 

      (9) Line 352 and throughout the manuscript: 'in vitro' should be in italics.

      This is amended.

      (10) Line 355: https://pubmed.ncbi.nlm.nih.gov/35446141/, https://pubmed.ncbi.nlm.nih.gov/33568453/ and https://pubmed.ncbi.nlm.nih.gov/28827718/ should be included here.

      Studies reporting epitopes on Influenza A neuraminidase have been compiled in Supplementary Figures 1 and 2 and cited appropriately.

      (11) Line 365: https://pubmed.ncbi.nlm.nih.gov/35446141/ and https://pubmed.ncbi.nlm.nih.gov/33568453/ also describe epitopes on the underside of the NA.

      Please refer to the above response to point 10.

      (12) Line 365: Reference https://pubmed.ncbi.nlm.nih.gov/37506693/ is missing here.

      The reference has been added.

      (13) Line 369-371: Is it really a minority?

      In terms of the protective response, the majority of the antibody response is directed towards loops 01 and 23, which form the top antigenic surface. The term 'lateral' is used in some literature to describe NA mAb epitopes; loops 01 and 23 also encompass the lateral regions.

      To clarify this, we have added the following sentence to the Discussion section - "The distribution of epitopes on neuraminidase"

      "It is important to note that loops 01 and 23 include a portion of epitopes that have been described in the literature as side, lateral, or underside (see mAbs NDS.1, NDS.3, and CD6 in Supplementary Fig. 2)"

      Additionally in our studies in mice, we showed that protection is mediated by antibodies targeting the loops (Figure 7). We are uncertain about the binding response to the NA underside, but the NA inhibiting and protective response to the underside appears to be minimal.

      Furthermore, Lederhof et al. showed that among the 'underside' mAbs, NDS.1 protected mice against virus challenge, whereas NDS.3 did not. In our analysis (Supplementary Figure 2), NDS.1 makes eight-residue contacts with B4L01 and B5L01, whereas NDS.3 makes five-residue contacts with B3L01 and B4L01.

      (14) Line 530: The A in ELLA already stands for assay.

      This is corrected.

    1. When Beyoncé’s and the Chicks’ voices meld in harmony with those lines about the father, gun and head held high, there is glory and dignity in this image

      Very powerful and great performance

    2. “Daddy Lessons” teaches his daughter to fight. In encouraging her to “be tough,” learning how to shoot his rifle, riding motorcycles duded up in classic vinyl and leather, the father encourages his daughter to both defend herself and to take care of her mother and sister; that is, to take the place as the head of the family, a place usually reserved for sons.

      This idea is prevalent throughout the lyrics of the song, which urge the daughter to "shoot" and to stand her ground when trouble comes to town.

    1. My father laughed—then shook his head. He looked a little sad. “Carlota was mad. Fucking crazy. Just like your mother.”

      Could this mean that the mother was impressed by Empress Carlota and wanted to be like her? Was there a direct correlation between the names, the mother, and the stories?

    1. You're ready to head out to work or an important appointment, and suddenly your car won't start. It can be frustrating and overwhelming, especially if you're not mechanically-inclined. But don't worry, there are a few things you can do to try and get your car running again. Here, we’ll go over some of the most common reasons why a car won't start, and what you can do to troubleshoot and fix the problem.

      You're ready to head out to work or an important appointment, and suddenly your Mazda won't start. Whether you have a Mazda3, Mazda CX-5, or any other model, it can be frustrating and overwhelming when your Mazda car won't start, especially if you're not mechanically-inclined.

      But don't worry, there are a few things you can do to try and get your car running again. Here, we'll go over some of the most common reasons why a Mazda won't start, and what you can do to troubleshoot and fix the problem.

    1. Welcome back, and in this lesson, I want to cover Aurora Serverless. Aurora Serverless is a service which is to Aurora what Fargate is to ECS. It provides a version of the Aurora database product where you don't need to statically provision database instances of a certain size or worry about managing those database instances. It's another step closer to a database as a service product. It removes one more piece of admin overhead, the admin overhead of managing individual database instances. From now on, when you're referring to the Aurora product that we've covered so far in the course, you should refer to it as Aurora provisioned versus Aurora Serverless, which is what we'll cover in this lesson.

      With Aurora Serverless, you don't need to provision resources in the same way as you did with Aurora provisioned. You still create a cluster, but Aurora Serverless uses the concept of ACUs or Aurora Capacity Units. Capacity units represent a certain amount of compute and a corresponding amount of memory. For a cluster, you can set minimum and maximum values, and Aurora Serverless will scale between those values, adding or removing capacity based on the load placed on the cluster. It can even go down to zero and be paused, meaning that you're only billed for the storage that the cluster consumes.
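      The scaling rule described above can be sketched as a small toy model. This is only an illustration, not how AWS implements scaling: the function name `target_capacity`, its parameters, and the idea of expressing load directly as "required ACUs" are all assumptions made for the example.

```python
def target_capacity(required_acus, min_acus, max_acus, pause_when_idle=True):
    """Toy model: how many ACUs a cluster would run for a given demand.

    required_acus   -- hypothetical measure of current load, in ACUs
    min_acus        -- configured minimum capacity for the cluster
    max_acus        -- configured maximum capacity for the cluster
    pause_when_idle -- if True, zero load pauses the cluster (storage-only billing)
    """
    if required_acus <= 0 and pause_when_idle:
        return 0  # paused: no compute allocated, only storage is billed
    # Otherwise capacity follows load but never leaves the [min, max] range.
    return max(min_acus, min(required_acus, max_acus))


if __name__ == "__main__":
    for load in (0, 1, 8, 40):
        print(load, "->", target_capacity(load, min_acus=2, max_acus=16))
```

      The only values you choose in this model are the minimum and maximum; everything in between is driven by load.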

      Now billing is based on the resources that you use on a per-second basis, and Aurora Serverless provides the same levels of resilience as you're used to with Aurora provisioned. So, you get cluster storage that's replicated across six storage nodes across multiple availability zones. Now, some of the high-level benefits of Aurora Serverless: it's much simpler, it removes much of the complexity of managing database instances and capacity, it's easier to scale, it seamlessly scales the compute and memory capacity in the form of ACUs as needed with no disruption to client connections, and you'll see how that works architecturally on the next screen. It's also cost-effective. When you use Aurora Serverless, you only pay for the database resources that you consume on a per-second basis, unlike with Aurora provisioned, where you have to provision database instances in advance, and you're charged for the resources that they consume, whether you're utilizing them or not.
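      To make the per-second billing comparison concrete, here is a rough sketch. The prices and the usage pattern are made-up placeholders, not real AWS pricing, and the helper names are hypothetical; storage charges are left out because they apply in both cases.

```python
def serverless_compute_cost(usage, price_per_acu_hour):
    """Cost of serverless compute billed per second.

    usage -- list of (acus, seconds) intervals; paused time is (0, seconds)
    """
    acu_seconds = sum(acus * seconds for acus, seconds in usage)
    return acu_seconds * price_per_acu_hour / 3600


def provisioned_compute_cost(hours, price_per_instance_hour):
    """Cost of a provisioned instance, charged whether it is busy or idle."""
    return hours * price_per_instance_hour


if __name__ == "__main__":
    # Hypothetical day: 10 minutes at 2 ACUs, 5 minutes at 8 ACUs, paused otherwise.
    day = [(2, 600), (8, 300), (0, 85500)]
    print(serverless_compute_cost(day, price_per_acu_hour=0.06))   # placeholder price
    print(provisioned_compute_cost(24, price_per_instance_hour=0.25))  # placeholder price
```

      With an infrequently used workload like this one, almost all of the day contributes nothing to the serverless compute bill, while the provisioned instance is charged for all 24 hours.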

      The architecture of Aurora Serverless has many similarities with Aurora provisioned, but it also has crucial differences, so let's review both of those, the similarities and the differences. The Aurora cluster architecture still exists, but it's in the form of an Aurora Serverless cluster. Now, this has the same cluster volume architecture which Aurora provisioned uses. In an Aurora Serverless cluster, though, instead of using provisioned servers, we have ACUs, which are Aurora Capacity Units. These capacity units are actually allocated from a warm pool of Aurora capacity units which are managed by AWS. The ACUs are stateless, they're shared across many AWS customers, and they have no local storage, so they can be allocated to your Aurora Serverless cluster rapidly when required. Now, once these ACUs are allocated to an Aurora Serverless cluster, they have access to the cluster storage in the same way that a provisioned Aurora instance would have access to the storage in a provisioned Aurora cluster. It's the same thing; it's just that these ACUs are allocated from a shared pool managed by AWS.

      Now, if the load on an Aurora Serverless cluster increases beyond the capacity units which are being used, and assuming the maximum capacity setting of the cluster allows it, then more ACUs will be allocated to the cluster. And once the compute resource, which represents this new, potentially bigger ACU, is active, then any old compute resources representing unused capacity can be deallocated from your Aurora Serverless cluster. Now, because of the ACU architecture, because the number of ACUs are dynamically increased and decreased based on load, the way that connections are managed within an Aurora Serverless cluster has to be slightly more complex versus a provisioned cluster. In an Aurora Serverless cluster, we have a shared proxy fleet which is managed by AWS. Now, this happens transparently to you as a user of an Aurora Serverless cluster, but if a user interacts with the cluster via an application, it actually goes via this proxy fleet. Any of the proxy fleet instances can be used, and they will broker a connection between the application and the Aurora Capacity Units.

      Now, this means that because the client application is never directly connecting to the compute resource that provides an ACU, the scaling can be fluid, and it can scale in or out without causing any disruptions to applications while it's occurring because you're not directly connecting with an ACU. You're connecting via an instance in this proxy fleet. So, the proxy fleet is managed by AWS on your behalf. The only thing you need to worry about for an Aurora Serverless cluster is picking the minimum and maximum values for the ACU, and you're only billed for the amount of ACU that you're using at a particular point in time as well as the cluster storage. So that makes Aurora Serverless really flexible for certain types of use cases.

      Now, a couple of examples of types of applications which really do suit Aurora Serverless. The first is infrequently used applications, maybe a low-volume blog site such as "The Best Cats," where connections are only attempted for a few minutes several times per day, or maybe on really popular days of the week. With Aurora Serverless, if you were using the product to run the "Best Cat Pics" blog, which you'll experience in the demo lesson, then you'd only pay for resources for the Aurora Serverless cluster as you consume them on a per-second basis. Another really good use case is new applications. If you're deploying an application where you're unsure about the levels of load that will be placed on it, you're going to be unsure about the size of the database instance that you'll need. With Aurora provisioned, you would still need to provision that in advance and potentially change it, which could cause disruption. If you use Aurora Serverless, you can create the Aurora Serverless cluster and have the database autoscale based on the incoming load.

      It's also really good for variable workloads. If you're running a normally lightly used application which has peaks, maybe 30 minutes out of an hour or on certain days of the week during sale periods, then you can use Aurora Serverless and have it scale in and out based on that demand. You don't need to provision static capacity based on the peak or average as you would do with Aurora provisioned. It's also really good for applications with unpredictable workloads, so if you're really not sure about the level of workload at a given time of day, you can't predict it, you don't have enough data, then you can provision an Aurora Serverless cluster and initially set a fairly large range of ACUs so the minimum is fairly low and the maximum is fairly high, and then over the initial period of using the application, you can monitor the workload. If it really does stay unpredictable, then potentially Aurora Serverless is the perfect database product to use because if you're using anything else, say an Aurora provisioned cluster, then you always have to have a certain amount of capacity statically provisioned. With Aurora Serverless, you can, in theory, leave an unpredictable application inside Aurora Serverless constantly and just allow the database to scale in and out based on that unpredictable workload.

      It's also great for development and test databases because Aurora Serverless can be configured to pause itself during periods of no load, and during the database pause, you're only billed for the storage. So, if you do have systems which are only used as part of your development and test processes, then they can scale back to zero and only incur storage charges during periods when they're not in use, and that's really cost-effective for this type of workload. It's also great for multi-tenant applications. If you've got an application where you're billing a user a set dollar amount per month per license to the application, if your incoming load is directly aligned to your incoming revenue, then it makes perfect sense to use Aurora Serverless. You don't mind if a database supporting your product scales up and costs you more if you also get more customer revenue, so it makes perfect sense to use Aurora Serverless for multi-tenant applications where the scaling is fairly aligned between infrastructure size and incoming revenue.

      So, these are some classic examples of when Aurora Serverless makes perfect sense. Now, this is a product I don't yet expect to feature extensively on the exam. It will feature more and more as time goes on, and so by learning the architecture at this point, you get a head start and you can answer any questions which might feature on the exam about Aurora Serverless and comparing it to the other RDS products, which is often just as important. But at this point, that's all of the theory that I wanted to cover, all of the architecture. So go ahead, finish up this video, and when you're ready, I look forward to joining you in the next lesson.

    1. Reviewer #2 (Public review):

      Summary:

      The authors describe a "beads-on-a-string" (BOAS) immunogen, where they link, using a non-flexible glycine linker, up to eight distinct hemagglutinin (HA) head domains from circulating and non-circulating influenzas and assess their immunogenicity. They also display some of their immunogens on ferritin NP and compare the immunogenicity. They conclude that this new platform can be useful to elicit robust immune responses to multiple influenza subtypes using one immunogen and that it can also be used for other viral proteins.

      Strengths:

      The paper is clearly written. While the use of flexible linkers has been used many times, this particular approach (linking different HA subtypes in the same construct resembling adding beads on a string, as the authors describe their display platform) is novel and could be of interest.

      Comments on revisions:

      The authors have addressed most comments. Some mistakes/issues remain:

      TI should be defined earlier on line 61 not on line 196

      No legend for Figure 3E - it looks like this is where the authors did the first immunization with the "mix" to compare to the BOAS but strangely they do not mention this in the response to reviewers letter and only mention fig 6G and 7. Maybe add "mix" to the title of Figure 3?

      In Figure 6G they do show the response to the mix but do not mention it in the immunizations for that figure. Also weird because obviously the mix is not a NP while this figure addresses NP format.

      Line 796 - pseudo viruses

      The authors should add some clarification in the paper as they did in response to reviewers.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript by Thornlow Lamson et al., the authors develop a "beads-on-a-string" or BOAS strategy to link diverse hemagglutinin head domains, to elicit broadly protective antibody responses. The authors are able to generate varying formulations and lengths of the BOAS and immunization of mice shows induction of antibodies against a broad range of influenza subtypes. However, several major concerns are raised, including the stability of the BOAS, that only 3 mice were used for most immunization experiments, and that important controls and analyses related to how the BOAS alone, and not the inclusion of diverse heads, impacts humoral immunity are missing.

      Strengths:

      Vaccine strategy is new and exciting.

      Analyses were performed to support conclusions and improve paper quality.

      Weaknesses:

      Controls for how different hemagglutinin heads impact immunity versus the multivalency of the BOAS.

      Only 3 mice were used for most experiments.

      There were limited details on size exclusion data.

      We appreciate the reviewer’s comments and have made the following changes to the manuscript.

      (1) We recognize that deconvoluting the effect of including a diverse set of HA heads and multivalency in the BOAS immunogens is necessary to understand the impact on antigenicity. Therefore, we now include a cocktail of the identical eight HA heads used in the 8-mer and BOAS nanoparticle (NP) as an additional control group. While we observed similar HA binding titers relative to the 8-mer and BOAS NP groups, sera elicited in the cocktail group were unable to neutralize any of the viruses tested; multivalency thus appears to be important for eliciting neutralizing responses.

      (2) We increased the sample size by repeated immunizations with n=5 mice, for a total of n=8 mice across two independent experiments.

      (3) We expanded the details on size exclusion data to include:

      a) extended chromatograms from Figure 2C as Supplemental Figure 3.

      b) additional details in the materials and methods section (lines 370-372):

      “Recovered proteins were then purified on a Superdex 200 (S200) Increase 10/300 GL (for trimeric HAs) or Superose 6 Increase 10/300 GL (for BOAS) size-exclusion column in Dulbecco’s Phosphate Buffered Saline (DPBS) within 48 hours of cobalt resin elution.”

      Reviewer #2 (Public Review):

      Summary:

      The authors describe a "beads-on-a-string" (BOAS) immunogen, where they link, using a non-flexible glycine linker, up to eight distinct hemagglutinin (HA) head domains from circulating and non-circulating influenzas and assess their immunogenicity. They also display some of their immunogens on ferritin NP and compare the immunogenicity. They conclude that this new platform can be useful to elicit robust immune responses to multiple influenza subtypes using one immunogen and that it can also be used for other viral proteins.

      Strengths:

      The paper is clearly written. While the use of flexible linkers has been used many times, this particular approach (linking different HA subtypes in the same construct resembling adding beads on a string, as the authors describe their display platform) is novel and could be of interest.

      Weaknesses:

      The authors did not compare to individual HAs immunized as cocktails and did not compare to other mosaic NP published earlier. It is thus difficult to assess how their BOAS compare. Other weaknesses include the rationale as to why these subtypes were chosen and also an explanation of why there are different sizes of the HA1 construct (apart from expression). Have the authors tried other lengths? Have they expressed all of them as FL HA1?

      We appreciate the reviewer’s comments. We responded to the concerns below and modified the manuscript accordingly.

      (1) We recognize that including a “cocktail” control is important to understand how the multivalency present in a single immunogen affects the immune response. We now include an additional control group comprised of a mixture of the same eight HA heads used in the 8-mer and the BOAS nanoparticle (NP). While this cocktail elicited similar HA binding titers relative to the 8-mer and BOAS NP immunogens (Fig. 6G), there was no detectable neutralization of any of the viruses tested (Fig. 7).

      (2) In the introduction we reference other multivalent display platforms but acknowledge that distinct differences in their immunogen design platforms make direct comparisons to ours difficult, which is ultimately why we did not use them as comparators for our in vivo studies. Perhaps most directly relevant to our BOAS platform is the mosaic HA NP from Kanekiyo et al. (PMID 30742080). Here, HA heads, with similar boundaries to ours, were selected from historical H1N1 strains. These NPs, however, were significantly less antigenically diverse relative to our BOAS NPs as they did not include any group 2 (e.g., H7, H9) or B influenza HAs; restricting their multivalent display to group 1 H1N1s likely was an important factor in how they were able to achieve broad, neutralizing H1N1 responses. Additionally, Cohen et al. (PMID 33661993) used similarly antigenically distinct HAs in their mosaic NP, though these included full-length HAs with the conserved stem region, which likely has a significant impact on the elicited cross-reactive responses observed. Lastly, we reference Hills et al. (PMID 38710880), where authors designed similar NPs with four tandemly-linked betacoronavirus receptor binding domains (RBDs) to make “quartets”. In contrast to our observations, the authors observed increased binding and neutralization titers following conjugation to protein-based NPs. We acknowledge potential differences between the studies, such as the antigen and larger VLP NP, that could lead to the different observed outcomes.

      (3) We intended to highlight the “plug-and-play” nature of the BOAS platform; theoretically any HA subtype could be interchanged into the BOAS. To that end, our rationale for selecting the HA subtypes in our proof-of-principle immunogen was to include an antigenically diverse set of circulating and non-circulating HAs that we could ultimately characterize with previously published subtype-specific antibodies that were also conformation-specific. In doing so, these diagnostic antibodies could confirm the presence and conformational integrity of each component. We intentionally did not include HA subtypes for which we did not have a conformation-specific antibody.

      The different sizes of the HA head domains were determined exclusively by expression of the recombinant protein. We have not attempted expression of full-length HA1 domains. Furthermore, we have not attempted to express the full-length HA (inclusive of HA1 and HA2) in our BOAS platform. The primary reason was to avoid including the conserved stem region of HA2 which may distract from the HA1 epitopes (e.g., receptor binding site, lateral patch) that can be engaged by broadly neutralizing antibodies. Additionally, the full-length HA is inherently trimeric and may not be as amenable to our BOAS platform as the monomeric HA1 head domain.

      Reviewer #3 (Public Review):

      This work describes the tandem linkage of influenza hemagglutinin (HA) receptor binding domains of diverse subtypes to create 'beads on a string' (BOAS) immunogens. They show that these immunogens elicit ELISA binding titers against full-length HA trimers in mice, as well as varying degrees of vaccine mismatched responses and neutralization titers. They also compare these to BOAS conjugated on ferritin nanoparticles and find that this did not largely improve immune responses. This work offers a new type of vaccine platform for influenza vaccines, and this could be useful for further studies on the effects of conformation and immunodominance on the resulting immune response.

      Overall, the central claims of immunogenicity in a murine model of the BOAS immunogens described here are supported by the data.

      Strengths included the adaptability of the approach to include several, diverse subtypes of HAs. The determination of the optimal composition of strains in the 5-BOAS that overall yielded the best immune responses was an interesting finding and one that could also be adapted to other vaccine platforms. Lastly, as the authors discuss, the ease of translation to an mRNA vaccine is indeed a strength of this platform.

      One interesting and counter-intuitive result is the high levels of neutralization titers seen in vaccine-mismatched, group 2 H7 in the 5-BOAS group that differs from the 4-BOAS with the addition of a group 1 H5 RBD. At the same time, no H5 neutralization titers were observed for any of the BOAS immunogens, yet they were seen for the BOAS-NP. Uncovering where these immune responses are being directed and why these discrepancies are being observed would constitute informative future work.

      There are a few caveats in the data that should be noted:

      (1) 20 ug is a pretty high dose for a mouse and the majority of the serology presented is after 3 doses at 20 ug. By comparison, 0.5-5 ug is a more typical range (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6380945/, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9980174/). Also, the authors state that 20 ug per immunogen was used, including for the BOAS-NP group, which would mean that the BOAS-NP group was given a lower gram dose of HA RBD relative to the BOAS groups.

      We agree that this is on the “upper end” of recombinant protein dose. While we did not do a dose-response, we now include serum analyses after a single prime. The overall trends and reactivity to matched and mismatched BOAS components remained similar across d28 and d42. However, the differences between the BOAS and BOAS NP groups and the mixture group were more pronounced at d28, which reinforces our observation that the multivalency of the HA heads is necessary for eliciting robust serum responses to each component. These data are included in Supplemental Figure 5, and we’ve modified the text (lines 185-187) to include:

      “Similar binding trends were also observed with d28 serum, though the difference between the 8mer and mix groups was more pronounced at d28 (Supplemental Figure 5).”

      Additionally, we acknowledge that there is a size discrepancy between the BOAS NP and the largest BOAS, leading to an approximately 15-fold difference on a per mole basis of the BOAS immunogen. The smallest and largest BOAS also differ by ~2.5-fold on a per mole basis; this could favor the overall amount of the smaller immunogens. However, because vaccine doses are typically calculated on a mg per kg basis, we did not calculate on a molar basis for this study. Any promising immunogens will be evaluated in a dose-response study to optimize elicited responses.
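      The mass-versus-molar point can be illustrated with a short calculation. The molecular weights below are hypothetical placeholders chosen only to reproduce a 2.5-fold ratio; they are not the measured sizes of any BOAS construct, and the function names are assumptions for this sketch.

```python
def nanomoles(mass_ug, mw_kda):
    """Convert a mass dose to nanomoles: ug / kDa = nmol (1 kDa = 1000 g/mol)."""
    return mass_ug / mw_kda


def molar_fold_difference(mass_ug, mw_small_kda, mw_large_kda):
    """How many more molecules of the smaller immunogen a fixed mass dose contains."""
    return nanomoles(mass_ug, mw_small_kda) / nanomoles(mass_ug, mw_large_kda)


if __name__ == "__main__":
    dose_ug = 20.0           # fixed mass dose, as stated in the response
    small, large = 80, 200   # hypothetical molecular weights in kDa
    print(nanomoles(dose_ug, small), "nmol of the smaller immunogen")
    print(nanomoles(dose_ug, large), "nmol of the larger immunogen")
    print(molar_fold_difference(dose_ug, small, large), "fold difference")
```

      At a fixed mass dose the fold difference reduces to the ratio of the two molecular weights, which is why larger constructs are delivered at fewer moles.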

      (2) Serum was pooled from all animals per group for neutralization assays, instead of testing individual animals. This could mean that a single animal with higher immune responses than the rest in the group could dominate the signal and potentially skew the interpretation of this data.

      We repeated the neutralization assays with data points for individual mice. There does appear to be variability in the immune response between mice. This is most noticeable for responses to the H5 component. We are currently assessing what properties of our BOAS immunogen might contribute to the variability across individual mice.

      (3) In Figure S2, it looks like an apparent increase in MW by changing the order of strains here, which may be due to differences in glycosylation. Further analysis would be needed to determine if there are discrepancies in glycosylation amongst the BOAS immunogens and how those differ from native HAs.

      There does appear to be a relatively small difference in MW between the two BOAS configurations shown in Figure S2. This could be due to differences in glycosylation, as the reviewer points out, and in future studies, we intend to assess the influence of native glycosylation on antibody responses elicited by our BOAS immunogens.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major Concerns

      (1) From Figure 2D-E, it looks like BOAS are forming clusters, rather than a straight line. Do these form aggregates over time? Both at 4 degrees over a few days or after freeze-thaw cycle(s)? It is unclear from the SEC methods how long after purification this was performed and stability should be considered.

      Due to the inherent flexibility of the Gly-Ser linker between each component, we do not anticipate that any rigidity would be imposed resulting in a “straight line”. Nevertheless, we appreciate the reviewer's concern about the long-term stability of the BOAS immunogens. To address this, we include 1) the extended chromatograms from Figure 2C as Supplemental Figure 3 to show any aggregates present, 2) traces from up to 48 hours post-IMAC, and 3) chromatograms following a freeze-thaw cycle. Following IMAC purification, there is a minor peak (<10% of total peak height) at ~9 mL corresponding to aggregation. Note that we excluded this aggregate fraction from immunizations. Following a freeze-thaw cycle, the BOAS maintain a homogeneous peak immediately after thawing (<24 hrs), with no significant (<10%) aggregation or degradation peak. However, after ~1 week at 4C following the freeze-thaw cycle, additional peaks within the chromatogram correspond to degradation of the BOAS.

      We modified the materials and methods section to state (lines 370-372)

      “Recovered proteins were then purified on a Superdex 200 (S200) Increase 10/300 GL (for trimeric HAs) or Superose 6 Increase 10/300 GL (for BOAS) size-exclusion column in Dulbecco’s Phosphate Buffered Saline (DPBS) within 48 hours of cobalt resin elution.”

      We commented on BOAS stability in the results section (lines 142-148)

      “Following SEC, affinity tags were removed with HRV-3C protease; cleaved tags, uncleaved BOAS, and His-tagged enzyme were removed using cobalt affinity resin and snap frozen in liquid nitrogen before immunizations. BOAS maintained monodispersity upon thawing, though over time, degradation was observed following longer term (>1 week) storage at 4C (Supplemental Figure 3). This degradation became more significant as BOAS increased in length (Supplemental Figure 3).”

      We also included in the discussion (lines 277-279):

      “Notably, for longer BOAS we observed degradation following longer term storage at 4C, which may reflect their overall stability.”

      (2) Figures 3-4 and 6-7, to make conclusions off of 3 mice per group is inappropriate. A sample size calculation should have been conducted and the appropriate number of mice tested. In addition, two independent mouse experiments should always be performed. Moreover, the reliability of the statistical tests performed seems unlikely, given the very small sample size.

      We agree that additional mice are necessary to make assessments regarding immunogenicity and cross-reactivity differences between the immunogens. To address this, we repeated the immunization with 5 additional mice, for a total of n=8 mice over two independent experiments. We incorporated these data into Figure 3B-D, as well as an additional Figure 3E (see below). We also now report the log-transformed endpoint titer (EPT) values rather than reciprocal EC50 values and added clarity to statistical analyses used. We have added the following lines to the methods section

      lines 427-431:

      “Serum endpoint titers (EPT) were determined using a non-linear regression (sigmoidal, four-parameter logistic (4PL) equation, where x is concentration) to determine the dilution at which the blank-subtracted 450 nm absorbance value intersects a 0.1 threshold. Serum titers for individual mice against respective antigens are reported as log-transformed values of the EPT dilution.”
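      As a rough illustration of the endpoint-titer calculation quoted above, the sketch below inverts a four-parameter logistic curve at the 0.1 absorbance threshold. It is not the authors' Prism workflow: the curve-fitting step is omitted, the 4PL parameterization is one common form chosen for this example, and the parameter values are hypothetical.

```python
import math


def four_pl(x, bottom, top, ec50, hill):
    """4PL curve: absorbance as a function of relative serum concentration x."""
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)


def endpoint_titer(threshold, bottom, top, ec50, hill):
    """Return the dilution factor at which the fitted curve crosses the threshold."""
    if not bottom < threshold < top:
        raise ValueError("threshold must lie between the curve's bottom and top")
    # Solve four_pl(x) = threshold for x, then express it as a dilution (1/x).
    ratio = (top - bottom) / (threshold - bottom) - 1.0
    x = ec50 / ratio ** (1.0 / hill)
    return 1.0 / x


if __name__ == "__main__":
    # Hypothetical fitted parameters; x = 1 means undiluted serum.
    params = dict(bottom=0.0, top=2.0, ec50=0.001, hill=1.0)
    ept = endpoint_titer(0.1, **params)
    print("endpoint dilution:", ept)
    print("log10 EPT:", math.log10(ept))
```

      Reporting the log-transformed dilution, as the methods text describes, compresses titers that span several orders of magnitude onto a scale suited to plotting and rank-based statistics.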

      lines 406-408:

      “C57BL/6 mice (Jackson Laboratory) (n=8 per group for 3-, 4-, 5-, 6-, 7-, and 8mer cohorts; n=5 for BOAS NP, NP, and mix cohorts) were immunized with 20µg of BOAS immunogens of varying length and adjuvanted with 50% Sigma Adjuvant for a total of 100µL of inoculum.”

      lines 482-490:

      “Statistical Analysis

      Significance for ELISAs and microneutralization assays was determined using Prism (GraphPad Prism v10.2.3). ELISAs and microneutralization assays comparing >2 samples were analyzed using a Kruskal-Wallis test with Dunn’s post-hoc test to correct for multiple comparisons. Multiple comparisons were made between each possible combination or relative to a control group, where indicated. ELISAs comparing two samples were analyzed using a Mann-Whitney test. Significance was assigned with the following: * = p<0.05, ** = p<0.01, *** = p<0.001, and **** = p<0.0001. Where conditions are compared and no significance is reported, the difference was non-significant.”
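      The significance thresholds listed above, and the rank-based statistic behind a Mann-Whitney comparison, can be sketched as follows. This is an illustration only: the authors used GraphPad Prism, the "ns" label is an assumed convention for non-significant results, and the function computes just the U statistic, not a p-value.

```python
def significance_label(p):
    """Map a p-value to the asterisk notation given in the methods text."""
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"  # assumed label for a non-significant difference


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs where a beats b, with ties counted as 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


if __name__ == "__main__":
    # Hypothetical log-transformed titers for two groups of mice.
    group_a = [4.1, 4.5, 4.8, 5.0, 5.2]
    group_b = [3.0, 3.4, 3.9, 4.2, 4.4]
    print("U =", mann_whitney_u(group_a, group_b))
    print(significance_label(0.03))
```

      Because U depends only on the ordering of values between groups, it makes no assumption that titers are normally distributed, which suits small group sizes.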

      (3) One critical control that is missing is a homogenous BOAS, for example, just linking one H1 on a BOAS. Does oligomerization and increasing avidity alone improve humoral immunity?

      We agree that this is an interesting point. However, to address the impact of oligomerization and avidity on humoral immunity, we now include an additional control with a cocktail of HA heads used in the 8mer. We have incorporated this into Figure 3A, 3D and 3E, Figure 6G, and Figure 7.

      Additionally, we have added the following lines in the manuscript:

      lines 38-40:

      “Finally, vaccination with a mixture of the same HA head domains is not sufficient to elicit the same neutralization profile as the BOAS immunogens or nanoparticles.”

      lines 105-106:

      “Additionally, we showed that a mixture of the same HA head components was not sufficient to recapitulate the neutralizing responses elicited by the BOAS or BOAS NP.”

      lines 169-172:

      “To determine immunogenicity of each BOAS immunogen, we performed a prime-boost-boost vaccination regimen in C57BL/6 mice at two-week intervals with 20µg of immunogen and adjuvanted with Sigma Adjuvant (Figure 3A). We compared these BOAS to a control group immunized with a mixture of the eight HA heads present in the 8mer.”

      lines 265-267:

      “There were qualitatively immunodominant HAs, notably H4 and H9, and these were relatively consistent across BOAS in which they were a component. This effect was reduced in the mix cohort.”

      (4) While some cross-reactivity is likely (Figure 6G), there is considerable loss of binding when there is a mismatch. Of the antibodies induced, how much of this is strain-specific? For example, how well do serum antibodies bind to a pre-2009 H1?

      We agree with the reviewer that there is a considerable loss of binding when there is a mismatched HA component. To better understand this and incorporate a mismatched strain into our analysis of the 8mer and BOAS NP, we looked at serum binding titers to a pre-2009 H1, H1/Solomon Islands/2006, and an antigenically distinct H3, H3/Hong Kong/1968. We have incorporated this data into Figures 3D, 3E, 6F and 6G. We observed relatively high titers against both a mismatched H1 and H3, indicating that the BOAS maintain high titers against subtype-specific strains that are conserved over considerable antigenic distance. However, this was similar in the mixture group, indicating that this may not be specific to oligomerization of BOAS immunogens.

      We added the following to the methods section:

      lines 357-361

      “Head subdomains from these HAs were used in the BOAS immunogens, and full-length soluble ectodomain (FLsE) trimers were used in ELISAs. Additional H1 (H1/A/Solomon Islands/3/2006) and H3 (H3/A/Hong Kong/1/1968) FLsEs were used in ELISAs as mismatched, antigenically distinct HAs for all BOAS.”

      Minor Concerns

      (1) Line 44-46, the deaths per year are almost exclusively due to seasonal influenza outbreaks caused by antigenically drifted viruses in humans, not those spilling over from avian sp. and swine. For accuracy, please adjust this sentence.

      We have adjusted lines 45-48 to say “This is largely a consequence of viral evolution and antigenic drift as it circulates seasonally within humans and ultimately impacts vaccine effectiveness. Additionally, the chance for spillover events from animal reservoirs (e.g., avian, swine) is increasing as population and connectivity also increase.”

      (2) Figure 4D-E, provide a legend for what the symbols indicate, or simply just put the symbol next to either the homology score and % serum competition labels on the y-axis.

      We have included a legend in Figures 4D,E to distinguish between homology score and % serum competition.

      (3) I am a bit confused by the data presented in Figure 7. The figure legend says the two symbols represent technical replicates. How? Is one technical replicate of all the mice in a group averaged and that's what's graphed? If so, this is not standard practice. I would encourage the authors to show the average technical replicates of each animal, which is standard.

      We thank the reviewer for their suggestion, and we have revised Figure 7 such that each symbol represents a single animal for n=5 animals. We have also adjusted the figure caption to the following:

      “Figure 7: Microneutralization titers to matched and mis-matched virus- Microneutralization of matched and mis-matched pseudoviruses: H1N1 (green, top left), H3N2 (orange, top right), H5N1 (yellow, bottom left), and H7N9 viruses (pink, bottom right) with d42 serum. Solid bars below each plot indicate a matched sub-type, and striped bars indicate a mis-matched subtype (i.e. not present in the BOAS). NP negative controls were used to determine threshold for neutralization. Upper and lower dashed lines represent the first dilution (1:32) (for H1N1, H3N2, and H5N1) or neutralization average with negative control NP serum (H7N9), and the last serum dilution (1:32,768), respectively, and points at the dashed lines indicate IC50s at or outside the limit of detection. Individual points indicate IC50 values from individual mice from each cohort (n=5). The mean is denoted by a bar and error bars are +/- 1 s.d., * = p<0.05 as determined by a Kruskal-Wallis test with Dunn’s multiple comparison post hoc test relative to the mix group.”

      (4) Paragraphs 298-313, multiple studies are referred to but not referenced.

      We have added the following references to this section:

      (38) Kanekiyo, M. et al. Self-assembling influenza nanoparticle vaccines elicit broadly neutralizing H1N1 antibodies. Nature 498, 102–106 (2013).

      (48) Hills, R. A. et al. Proactive vaccination using multiviral Quartet Nanocages to elicit broad anti-coronavirus responses. Nat. Nanotechnol. 1–8 (2024) doi:10.1038/s41565-024-01655-9.

      (65) Jardine, J. et al. Rational HIV immunogen design to target specific germline B cell receptors. Science 340, 711–716 (2013).

      (66) Tokatlian, T. et al. Innate immune recognition of glycans targets HIV nanoparticle immunogens to germinal centers. Science 363, 649–654 (2019).

      (67) Kato, Y. et al. Multifaceted Effects of Antigen Valency on B Cell Response Composition and Differentiation In Vivo. Immunity 53, 548-563.e8 (2020).

      (68) Marcandalli, J. et al. Induction of Potent Neutralizing Antibody Responses by a Designed Protein Nanoparticle Vaccine for Respiratory Syncytial Virus. Cell 176, 1420-1431.e17 (2019).

      (69) Bruun, T. U. J., Andersson, A.-M. C., Draper, S. J. & Howarth, M. Engineering a Rugged Nanoscaffold To Enhance Plug-and-Display Vaccination. ACS Nano 12, 8855–8866 (2018).

      (70) Kraft, J. C. et al. Antigen- and scaffold-specific antibody responses to protein nanoparticle immunogens. Cell Reports Medicine 100780 (2022) doi:10.1016/j.xcrm.2022.100780.

      Reviewer #2 (Recommendations For The Authors):

      Can the authors define "detectable titers"?

      Maybe add a threshold value of reciprocal EC on the figure for each plot.

      We recognize the reviewer's concern with reporting serum titers in this way, and we have adjusted our reported titers as endpoint titers (EPT) with a dotted line for the first detectable dilution (1:50). We have also adjusted the methods section to reflect this change:

      (lines 427-431)

      “Serum endpoint titers (EPT) were determined using a non-linear regression (sigmoidal, four-parameter logistic (4PL) equation, where x is concentration) to determine the dilution at which the blank-subtracted 450nm absorbance value intersects a 0.1 threshold. Serum titers for individual mice against respective antigens are reported as log transformed values of the EPT dilution.”
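      The endpoint-titer calculation described in this methods text can be sketched as follows, assuming the 4PL parameters have already been fitted. The parameterization below is one common form of the 4PL curve, not necessarily the exact one used by the authors, and treating x as relative serum concentration (1 / dilution factor) is an assumption. All function names and example parameter values are hypothetical.

```python
import math


def four_pl(x, bottom, top, ec50, hill):
    """One common 4PL form: absorbance rises from bottom to top as x increases."""
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)


def endpoint_concentration(threshold, bottom, top, ec50, hill):
    """Invert the 4PL curve: the x at which the fitted curve equals threshold."""
    if not min(bottom, top) < threshold < max(bottom, top):
        raise ValueError("threshold must lie strictly between bottom and top")
    ratio = (top - bottom) / (threshold - bottom) - 1.0
    return ec50 / ratio ** (1.0 / hill)


def log10_endpoint_titer(threshold, bottom, top, ec50, hill):
    """Log10 of the reciprocal dilution at which the curve crosses threshold."""
    x = endpoint_concentration(threshold, bottom, top, ec50, hill)
    return math.log10(1.0 / x)  # x is assumed to be 1 / dilution factor


# Hypothetical fitted parameters for one serum sample (illustrative values only).
params = dict(bottom=0.02, top=2.0, ec50=1e-3, hill=1.0)
titer = log10_endpoint_titer(0.1, **params)
```

      Solving the fitted curve analytically avoids interpolating between measured dilutions, but the result is only as reliable as the fit itself, especially when the threshold sits near the lower plateau.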

      It also appears that not all X-mer elicits an immune response against matched HA, e.g. for the 7 and 8 -mer. Not sure why the authors do not mention this. It could be due to too many HAs, not sure.

      We apologize for the confusion, and agree that our original method of reporting EC50 values does not reflect weak but present binding titers. Upon further analysis with additional mice as well as adjusting our method of reporting titers, it is easier to see in Figure 3D that all X-mer BOAS do indeed elicit detectable binding titers to matched HA components.

      It will be nice to add a conclusion to the cross-reactivity - again it appears that past 6-mer there has been a loss in cross-reactivity even though there are more subtypes on the BOAS.

      Also, the TI seemed to be the more conserved epitope targeted here.

      (Of note these two are mentioned in the discussion)

      We have updated the results section to include the following:

      (lines 281-294)

      “Based on the immunogenicity of the various BOAS and their ability to elicit neutralizing responses, it may not be necessary to maximize the number of HA heads into a single immunogen. Indeed, it qualitatively appears that the intermediate 4-, 5-, and 6mer BOAS were the most immunogenic and this length may be sufficient to effectively engage and crosslink BCR for potent stimulation. These BOAS also had similar or improved binding cross-reactivity to mis-matched HAs as compared to longer 7- or 8mer BOAS. Notably, the 3mer BOAS elicited detectable cross-reactive binding titers to H4 and H5 mismatched HAs in all mice. This observed cross-reactivity could be due to sequence conservation between the HAs, as H3 and H4 share ~51% sequence identity, and H1 and H2 share ~46% and ~62% overall sequence identity with H5, respectively (Supplemental Figure 6). Additionally, the degree of surface conservation decreased considerably beyond the 5mer as more antigenically distinct HAs were added to the BOAS. These data suggest that both antigenic distance between HA components and BOAS length play a key role in eliciting cross-reactive antibody responses, and further studies are necessary to optimize BOAS valency and antigenic distance for a desired response.”

      Figure 5E, the authors could indicate which subtype each mab is specific to for those who are not HA experts. (They have them color-coded but it is hard to see because very small).

      The authors also do not explain why 3E5 does not bind well to H1, H2, H3, H4 4-mer BOA, etc...

      We apologize for the lack of clarity in this figure. We updated Figure 5E to include the subtype it is specific for as well as listing the antibodies and their subtype and targeted epitope in the figure caption.

      Minor

      Figure 1B zoom looks like the line is hidden to the structure - should come in front

      We adjusted the figure accordingly.

      Line 127 - whether the order

      Corrected

      What is the rationale for thinking that a different order will lead to a different expression and antigenic results?

      We thank the reviewer for this question. We did not necessarily anticipate a difference in protein expression based on BOAS order. We did, however, want to verify that our platform was indeed a “plug-and-play” platform and that we could readily exchange components and order. We do hypothesize that a different order may in fact lead to different antigenic results. We think that the conformation of the BOAS as well as physical and antigenic distance of HA components may influence cross-linking efficiency of BCRs and lead to different antigenic results with different levels of cross-reactivity. For example, a BOAS design with a cluster of group 1 HAs followed by a cluster of group 2 HAs, rather than our roughly alternating pattern, could impact which HAs are in proximity to each other or could be potentially shielded in certain conformations, and thus could affect antigenic results. We expand on this rationale in the discussion in lines 310-314:

      “Further studies with different combinations of HAs could aid in understanding how length and composition influence epitope focusing. For example, a BOAS design with a cluster of group 1 HAs followed by a cluster of group 2 HAs, rather than our roughly alternating pattern could impact which HAs are in close proximity to one another or could be potentially shielded in certain conformations, and thus could affect antigenic results.”

      Maybe list HA#1 HA#2 HA#3 instead of HA1, HA2, HA3 to make sure it is not confounded with HA1 and HA2

      We agree that this may be confusing for readers, and have adjusted Figure 1C to show HA#1, HA#2, etc.

      For nsEM, do the authors have 2D classes and even 3D reconstructions? Line 148-149: maybe or just because there are more HAs.

      We did not obtain 2D classes or 3D reconstructions of these BOAS. However, we do agree with the reviewer that the collapsed/rosette structure of the 8mer BOAS may be a consequence of the additional HA heads as well as the flexible Gly-Ser linkers between the components. We have added clarity to our statement in the discussion to read:

      lines 154-156:

      “This is likely a consequence of the flexible GSS linker separating the individual HA head components as well as the addition of significantly more HA head components to the construct.”.

      Line 153 " interface-directed" - what does this mean?

      We apologize for any confusion; we intend for “interface-directed” to refer to antibodies that engage the trimer interface (TI) epitope between HA protomers. We have adjusted the manuscript to use the same terminology throughout, i.e. trimer interface or its abbreviation, TI.

      For Figure 2 F - do you have a negative control? Usually one does not determine an ELISA KD, it is not very accurate but shows binding in terms of OD value.

      We did include a negative control, MEDI8852, a stem-directed antibody, though it was not shown in the figure because we observed no binding, as expected. This negative control antibody was also used in Figure 5E for characterizing the BOAS NPs, and also shows no binding. We recognize that in an ELISA the KD is an equilibrium measurement and we do not report kinetic measurements as determined by a method such as bio-layer interferometry (BLI), and have thus adjusted the figure caption to denote the values as “apparent KD values”.

      Line 169 - reads strangely, "BOAS-elicited serum, regardless of its length, reacted". The length is that of the immunogen, not the serum.

      We agree that this statement is unclear, and we have modified the sentence to read:

      lines 177-178:

      “Each of the BOAS, regardless of its length, elicited binding titers to all matched full-length HAs representing individual components (Figure 3D).”

      What is the adjuvant used (add in results)?

      We used Sigma adjuvant for all immunizations, and have included this information in the results section:

      lines 169-171:

      “To determine immunogenicity of each BOAS, we performed a prime-boost-boost vaccination regimen in C57BL/6 mice at two-week intervals with 20µg of immunogen and adjuvanted with Sigma Adjuvant (Figure 3A).”

      This information is also included in the methods section in lines 406-412.

      Line 178 - remove " across"

      We have removed the word “across” in this sentence and replaced it with “on” (line 194).

      Trimer- interface, and interface epitopes are used exchangeably - maybe keep it as trimer interface to be more precise

      As stated above, we have adjusted the manuscript to use the same term throughout, i.e., trimer interface or its abbreviation, TI.

      Line 221 - no figure 6H (6G?)

      We apologize for this typo and have corrected it to Figure 6G (line 231).

      Reviewer #3 (Recommendations For The Authors):

      (1) Since 20 ug x3 doses is quite a high amount of vaccine, differences between immunogens may become blurred. Thus, it may be informative to compare post-prime serology for all immunogens or select immunogens to compare to the post-3rd dose data.

      We agree with the reviewer that this is on the upper end of vaccine dose and thus we explored the serum responses after a single boost. The overall trends and reactivity to matched and mis-matched BOAS components remained similar across days d28 and d42. However, the differences between the BOAS and BOAS NP groups and the mixture group were more pronounced at d28, which bolsters our claim that the presentation of the HA heads is important for eliciting strong serum responses to all components. We have included this data in Supplemental Figure 5, and have acknowledged this in the text:

      lines 185-187:

      “Similar binding trends were also observed with d28 serum, though the difference between the 8mer and mix groups was more pronounced at d28 (Supplemental Figure 5).”

      (2) Significance statistics for all immunogenicity data should be added and discussed; it is particularly absent in Figures 3D and 7.

      We have added statistical analyses to Figure 3 and Figure 7 to reflect changes in immunogenicity. We have also added the following to the methods section:

      lines 482-490:

      “Statistical Analysis

      Significance for ELISAs and microneutralization assays were determined using either a Mann-Whitney test or a Kruskal-Wallis test with Dunn’s post-hoc test in Prism (GraphPad Prism v10.2.3) to correct for multiple comparisons. Multiple comparisons were made between each possible combination or relative to a control group, where indicated. Significance was assigned with the following: * = p<0.05, ** = p<0.01, *** = p<0.001, and **** = p<0.0001. Where conditions are compared and no significance is reported, the difference was non-significant.”

      (3) Figure 2F: the figure has K03.12 listed for the H3-specific mAb and in the main text, but the caption says 3E5 - is the 3E5 in the caption a typo? 3E5 is listed for the competition ELISAs as an RBS mAb, but its binding site is distal to the RBS at residues 165-170 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9787348/), H7.167 binds in the RBS periphery and not directly within the RBS, and the epitope for P2-D9 is undetermined/not presented. This could mean that there is actually a higher proportion of RBS-directed antibodies than what is determined from this serum competition data. Also, reference to these as 'RBS-directed' in the serum competition methods section should be revised for accuracy.

      We sincerely apologize for this error and the resulting confusion. 3E5 in the caption is incorrect and should be K03.12 (https://www.rcsb.org/structure/5W08) and does engage the receptor binding site. We also apologize for the oversight that H7.167 is in the RBS periphery and not directly in the RBS. The additional P2-D9 in the panel of RBS-directed antibodies was also in error, as we do not believe it is RBS-directed, but is indeed H4 specific. We also included a reference to the paper and immunogen that elicited this antibody. We agree that this indicates that there could be a higher proportion of RBS-directed antibodies in the serum and have modified the text in the results and methods sections to read:

      lines 300-306:

      “Notably, this proportion is approximate, as at the time of reporting, antibodies that bind the receptor binding site of all components were not available. RBS-directed antibodies to the H4 and H9 component were not available, and the RBS-directed antibodies used targeting the other HA components have different footprints around the periphery of the RBS. Additionally, there are currently no reported influenza B TI-directed antibodies in the literature. Therefore, this may be an underestimate of the serum proportion focused to the conserved RBS and TI epitopes.”

      lines 435-439:

      “Following blocking with BSA in PBS-T, blocking solution was discarded and 40µL of either DPBS (no competition control), a cocktail of humanized antibodies targeting the RBS and periphery (5J8, 2G1, K03.12, H5.3, H7.167, H1209), a cocktail of humanized TI-directed antibodies (S5V2-29, D1 H1-17/H3-14, D2 H1-1/H3-1), or a negative control antibody (MEDI8852) were added at a concentration of 100µg/mL per antibody.”

      (4) Only nsEM data is shown for the 3-BOAS and 8-BOAS, where differences in morphology were seen between these longer and shorter proteins. Including nsEM images for all BOAS immunogens may show trends in morphology or organization that could correlate with immune responses, e.g. if the 5-BOAS also forms a higher proportion of rosette-like structures, while the 4-BOAS is still a mix between extended and rosette-like, this could be a factor in the better immune responses seen for 5-BOAS.

      We appreciate the reviewer’s suggestion for further analysis of morphology between the intermediate BOAS sizes. We agree that the relationship between BOAS length and morphology should be explored more in depth, and we intend to do so in future studies and to also vary linker length and rigidity.

    1. .7.1. Consider Different Use Cases

      I'm just gonna fire off ideas that come off the top of my head - find good places to eat - learn a new hobby - find places to visit - keep up with sports news - how to videos (instructional) learning a skill - how to fix something - what my friends are doing - browse a community that relates to you - reviews/opinions on products (decide what to buy)

    1. I saw also Samuqan, god of cattle, and there was Ereshkigal the Queen of the Underworld; and Belit-Sheri squatted in front of her, she who is recorder of the gods and keeps the book of death. She held a tablet from which she read. She raised her head; she saw me and spoke: "Who has brought this one here?" Then I awoke like a man drained of blood who wanders alone in a waste of rushes; like one whom the bailiff has seized and his heart pounds with terror

      The politics of translation matter: Gilgamesh’s cultural heroism often centers on his confrontation with mortality, yet that confrontation is dealt with directly by women, not warriors. Editors and translators from patriarchal eras could have shrunk the symbolic weight of these female figures to maintain the male hero’s dominance, which illustrates how whoever holds the pen can shape the gender beliefs conveyed by this epic and, through it, readers’ views of women of that time.

  6. pressbooks.library.torontomu.ca
    1. Great white bearskins lay about underfoot, and the only furniture was a lot of low beds covered with Indian rugs. Instead of pictures hung up on the walls, he had antlers and buffalo horns and a stuffed rabbit head. Lenny jutted a thumb at the meek little grey muzzle and stiff jackrabbit ears.

      predatory....

    1. Reviewer #1 (Public review):

      Summary:

      The present study addresses whether physiological signals influence aperiodic brain activity with a focus on age-related changes. The authors report age effects on aperiodic cardiac activity derived from ECG in low and high-frequency ranges in roughly 2300 participants from four different sites. Slopes of the ECGs were associated with common heart variability measures, which, according to the authors, shows that ECG, even at higher frequencies, conveys meaningful information. Using temporal response functions on concurrent ECG and M/EEG time series, the authors demonstrate that cardiac activity is instantaneously reflected in neural recordings, even after applying ICA analysis to remove cardiac activity. This was more strongly the case for EEG than MEG data. Finally, spectral parameterization was done in large-scale resting-state MEG and ECG data in individuals between 18 and 88 years, and age effects were tested. A steepening of spectral slopes with age was observed, particularly for ECG and, to a lesser extent, in cleaned MEG data in most frequency ranges and sensors investigated. The authors conclude that commonly observed age effects on neural aperiodic activity can mainly be explained by cardiac activity.
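      For readers unfamiliar with the "spectral slope" discussed in this summary, the following is a minimal sketch of the underlying idea: an aperiodic exponent estimated as the negated slope of a straight-line fit to the power spectrum in log-log space over a chosen frequency range. This plain least-squares fit is a deliberate simplification and is not either of the parameterization algorithms used in the study, which also account for oscillatory peaks. The function name and the synthetic spectrum are hypothetical.

```python
import math


def aperiodic_exponent(freqs, power, fmin, fmax):
    """Fit log10(power) = offset - exponent * log10(freq) within [fmin, fmax].

    Returns (exponent, offset). A larger exponent means a steeper spectrum.
    """
    points = [
        (math.log10(f), math.log10(p))
        for f, p in zip(freqs, power)
        if fmin <= f <= fmax
    ]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two frequency bins in the fitting range")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # ordinary least-squares slope in log-log space
    offset = mean_y - slope * mean_x
    return -slope, offset


# Synthetic 1/f-like spectrum (illustrative only): power = 10 * f ** -1.5
freqs = [float(f) for f in range(1, 101)]
power = [10.0 * f ** -1.5 for f in freqs]
exponent, offset = aperiodic_exponent(freqs, power, fmin=1.0, fmax=100.0)
```

      Because the estimate depends on the fitting range, the same recording can yield different exponents for different (fmin, fmax) choices, which is why the breadth of fitting ranges examined in the study matters.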

      Strengths:

      Compared to previous investigations, the authors demonstrate effects of aging on the spectral slope in the currently largest MEG dataset with equal age distribution available. Their efforts of replicating observed effects in another large MEG dataset and considering potential confounding by ocular activity, head movements, or preprocessing methods are commendable and highly valuable to the community. This study also employs a wide range of fitting ranges and two commonly used algorithms for spectral parameterization of neural and cardiac activity, hence providing a comprehensive overview of the impact of methodological choices. The authors discuss their findings in-depth and give recommendations for the separation of physiological and neural sources of aperiodic activity.

      Weaknesses:

      While the study's aim is well-motivated and analyses rigorously conducted, it remains vague what is reflected in the ECG at higher frequency ranges that contributed to the confounding of the age effects in the neural data. However, the authors address this issue in their discussion.

    2. Reviewer #3 (Public review):

      Summary:

      Schmidt et al. aimed to provide an extremely comprehensive demonstration of the influence cardiac electromagnetic fields have on the relationship between age and the aperiodic slope measured from electroencephalographic (EEG) and magnetoencephalographic (MEG) data.

      Strengths:

      Schmidt et al. used a multiverse approach to show that the cardiac influence on this relationship is considerable, by testing a wide range of different analysis parameters (including extensive testing of different frequency ranges assessed to determine the aperiodic fit), algorithms (including different artifact reduction approaches and different aperiodic fitting algorithms), and multiple large datasets to provide conclusions that are robust to the vast majority of potential experimental variations.

      The study showed that across these different analytical variations, the cardiac contribution to aperiodic activity measured using EEG and MEG is considerable, and likely influences the relationship between aperiodic activity and age to a greater extent than the influence of neural activity.

      Their findings have significant implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac fields is essential.

      Weaknesses:

      The authors have addressed the weaknesses of their study in their manuscript. Most alternative explanations for their results have been explored to ensure their conclusions are robust and are not explained by unexplored confounds. Minor potential weaknesses are:

      (1) The number of electrodes used in the EEG analyses was on the lower side, and as such, the results do not confirm that the influence of ECG on the 1/f activity in the EEG is high even for higher density EEG montages where ICA may provide better performance at removing cardiac components (as noted by the authors). Having noted this potential weakness, I doubt the effects of cardiac activity can be completely mitigated with current methods, even in higher-density EEG montages.

      (2) Head movements were used as a proxy for muscle activity. However, this may imperfectly address the potential influence of muscle activity on the slope in the EEG activity. As such, remaining muscle artifacts may have affected some of the results, particularly those that included high frequency ranges in the aperiodic estimate. Perhaps if muscle activity were left in the EEG data, it could have disrupted the ability to detect a relationship between age and 1/f slope in a way that didn't disrupt the same relationship in the cardiac data. However, I doubt this would reverse the overall conclusions given the number of converging results, including in lower frequency bands. The authors also note this potential weakness and suggest how future research might address it.

    3. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      Examination of (a)periodic brain activity has gained particular interest in the last few years in the neuroscience fields relating to cognition, disorders, and brain states. Using large EEG/MEG datasets from younger and older adults, the current study provides compelling evidence that age-related differences in aperiodic EEG/MEG signals can be driven by cardiac rather than brain activity. Their findings have important implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac signals is essential.

      We want to thank the editors for their assessment of our work and highlighting its importance for the understanding of aperiodic neural activity. Additionally, we want to thank the three present and four former reviewers (at a different journal) whose comments and ideas were critical in shaping this manuscript to its current form. We hope that this paper opens up many more questions that will guide us - as a field - to an improved understanding of how “cortical” and “cardiac” changes in aperiodic activity are linked and want to invite readers to engage with our work through eLife’s comment function.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The present study addresses whether physiological signals influence aperiodic brain activity with a focus on age-related changes. The authors report age effects on aperiodic cardiac activity derived from ECG in low and high-frequency ranges in roughly 2300 participants from four different sites. Slopes of the ECGs were associated with common heart variability measures, which, according to the authors, shows that ECG, even at higher frequencies, conveys meaningful information. Using temporal response functions on concurrent ECG and M/EEG time series, the authors demonstrate that cardiac activity is instantaneously reflected in neural recordings, even after applying ICA analysis to remove cardiac activity. This was more strongly the case for EEG than MEG data. Finally, spectral parameterization was done in large-scale resting-state MEG and ECG data in individuals between 18 and 88 years, and age effects were tested. A steepening of spectral slopes with age was observed particularly for ECG and, to a lesser extent, in cleaned MEG data in most frequency ranges and sensors investigated. The authors conclude that commonly observed age effects on neural aperiodic activity can mainly be explained by cardiac activity.

      Strengths:

      Compared to previous investigations, the authors demonstrate the effects of aging on the spectral slope in the currently largest MEG dataset with equal age distribution available. Their efforts of replicating observed effects in another large MEG dataset and considering potential confounding by ocular activity, head movements, or preprocessing methods are commendable and valuable to the community. This study also employs a wide range of fitting ranges and two commonly used algorithms for spectral parameterization of neural and cardiac activity, hence providing a comprehensive overview of the impact of methodological choices. Based on their findings, the authors give recommendations for the separation of physiological and neural sources of aperiodic activity.

      Weaknesses:

      While the aim of the study is well-motivated and analyses rigorously conducted, the overall structure of the manuscript, as it stands now, is partially misleading. Some of the described results are not well-embedded and lack discussion.

      We want to thank the reviewer for their comments focussed on improving the overall structure of the manuscript. We agree with their suggestions that some results could be more clearly contextualized and restructured the manuscript accordingly.

      Reviewer #2 (Public review):

      I previously reviewed this important and timely manuscript at a previous journal where, after two rounds of review, I recommended publication. Because eLife practices an open reviewing format, I will recapitulate some of my previous comments here, for the scientific record.

      In that previous review, I revealed my identity to help reassure the authors that I was doing my best to remain unbiased because I work in this area and some of the authors' results directly impact my prior research. I was genuinely excited to see the earlier preprint version of this paper when it first appeared. I get a lot of joy out of trying to - collectively, as a field - really understand the nature of our data, and I continue to commend the authors here for pushing at the sources of aperiodic activity!

      In their manuscript, Schmidt and colleagues provide a very compelling, convincing, thorough, and measured set of analyses. Previously I recommended that they push even further, and they added the current Figure 5 analysis of event-related changes in the ECG during working memory. In my opinion this result practically warrants a separate paper of its own!

      The literature analysis is very clever, and expanded upon from any other prior version I've seen.

      In my previous review, the broadest, most high-level comment I wanted to make was that the authors are correct. We (in my lab) have tried to be measured in our approach to talking about aperiodic analyses - including now measuring ECG whenever possible - because there are so many sources of aperiodic activity: neural, ECG, respiration, skin conductance, muscle activity, electrode impedances, room noise, electronics noise, etc. The authors discuss this all very clearly, and I commend them on that. We, as a field, should move more toward a model where we can account for all of those sources of noise together. (This was less of an action item, and more of an inclusion of a comment for the record.)

      I also very much appreciate the authors' excellent commentary regarding the physiological effects that pharmacological challenges such as propofol and ketamine also have on non-neural (autonomic) functions such as ECG. Previously I also asked them to discuss the possibility that, while their manuscript focuses on aperiodic activity, it is possible that the wealth of literature regarding age-related changes in "oscillatory" activity might be driven partly by age-related changes in neural (or non-neural, ECG-related) changes in aperiodic activity. They have included a nice discussion on this, and I'm excited about the possibilities for cognitive neuroscience as we move more in this direction.

      Finally, I previously asked for recommendations on how to proceed. The authors convinced me that we should care about how the ECG might impact our field potential measures, but how do I, as a relative novice, proceed? They now include three strong recommendations at the end of their manuscript that I find to be very helpful.

      As was obvious from my previous review, I consider this to be an important and impactful cautionary report that is incredibly well supported by multiple thorough analyses. The authors have done an excellent job responding to all my previous comments and concerns and, in my estimation, those of the previous reviewers as well.

      We want to thank the reviewer for agreeing to review our manuscript again and for recapitulating their previous comments and the progress the manuscript has made over the course of the last ~2 years. The reviewer's comments have been essential in shaping the manuscript into its current form. Their feedback has made the review process truly feel like a collaborative effort, focused on strengthening the manuscript and refining its conclusions and resulting recommendations.

      Reviewer #3 (Public review):

      Summary:

      Schmidt et al. aimed to provide an extremely comprehensive demonstration of the influence cardiac electromagnetic fields have on the relationship between age and the aperiodic slope measured from electroencephalographic (EEG) and magnetoencephalographic (MEG) data.

      Strengths:

      Schmidt et al. used a multiverse approach to show that the cardiac influence on this relationship is considerable, by testing a wide range of different analysis parameters (including extensive testing of different frequency ranges assessed to determine the aperiodic fit), algorithms (including different artifact reduction approaches and different aperiodic fitting algorithms), and multiple large datasets to provide conclusions that are robust to the vast majority of potential experimental variations.

      The study showed that across these different analytical variations, the cardiac contribution to aperiodic activity measured using EEG and MEG is considerable, and likely influences the relationship between aperiodic activity and age to a greater extent than the influence of neural activity.

      Their findings have significant implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac fields is essential.

      We want to thank the reviewer for their thorough engagement with our work and the resulting wealth of great ideas, mentioned both in the Weaknesses section and in the Recommendations for the authors below. Their suggestions have sparked many ideas on how to better separate peripheral from neurophysiological signals, and they are likely to greatly influence our future attempts to extract both cardiac and muscle activity from M/EEG recordings. So we want to thank them for their input, time and effort!

      Weaknesses:

      Figure 4I: The regressions explained here seem to contain a very large number of potential predictors. Based on the way it is currently written, I'm assuming it includes all sensors for both the ECG component and ECG rejected conditions?

      I'm not sure about the logic of taking a complete signal, decomposing it with ICA to separate out the ECG and non-ECG signals, then including these latent contributions to the full signal back into the same regression model. It seems that there could be some circularity or redundancy in doing so. Can the authors provide a justification for why this is a valid approach?

      After observing significant effects in both the MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> conditions in similar frequency bands, we wanted to understand whether or not these age-related changes are statistically independent. To test this, we added both variables as predictors in a regression model (thereby accounting for the influence of the other in relation to age). The regression models we fitted were therefore actually not very complex. They were built using only two predictors, namely the data (in a specific frequency range) averaged over the channels on which we noticed significant effects in the ECG rejected and ECG components data, respectively (Wilkinson notation: age ~ 1 + ECG rejected + ECG components). This was also described in the results section stating that: “To see if MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub> explain unique variance in aging at frequency ranges where we noticed shared effects, we averaged the spectral slope across significant channels and calculated a multiple regression model with MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> as predictors for age (to statistically control for the effect of MEG<sub>ECG component</sub>s and MEG<sub>ECG rejected</sub> on age). This analysis was performed to understand whether the observed shared age-related effects (MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub>) are in(dependent).”

      We hope this explanation solves the previous misunderstanding.
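
      As an illustration of the two-predictor model written above in Wilkinson notation (age ~ 1 + ECG rejected + ECG components), the following is a minimal sketch on simulated data. It uses ordinary least squares via NumPy rather than the Bayesian robust regression described in the manuscript, and the function name, variable names and simulated coefficients are all assumptions made for the example.

```python
import numpy as np

def fit_age_model(age, slope_ecg_rejected, slope_ecg_component):
    """Least-squares fit of: age ~ 1 + ECG rejected + ECG component.

    Returns [intercept, beta_ecg_rejected, beta_ecg_component].
    """
    age = np.asarray(age, dtype=float)
    X = np.column_stack([
        np.ones_like(age),        # intercept term (the "1" in the formula)
        slope_ecg_rejected,       # channel-averaged slope, ECG rejected data
        slope_ecg_component,      # channel-averaged slope, ECG component data
    ])
    beta, *_ = np.linalg.lstsq(X, age, rcond=None)
    return beta

# Simulated (hypothetical) data: two correlated predictors that each
# contribute unique variance to age.
rng = np.random.default_rng(0)
n = 500
slope_rej = rng.normal(size=n)
slope_comp = 0.5 * slope_rej + rng.normal(size=n)
age = 50.0 + 4.0 * slope_rej + 6.0 * slope_comp + rng.normal(scale=2.0, size=n)

beta = fit_age_model(age, slope_rej, slope_comp)
print(beta)  # intercept and the two partial regression weights
```

      Because both predictors enter the same model, each weight reflects the association with age after accounting for the other predictor, which is the sense in which the two effects are tested for independence.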

      I'm not sure whether there is good evidence or rationale to support the statement in the discussion that the presence of the ECG signal in reference electrodes makes it more difficult to isolate independent ECG components. The ICA algorithm will still function to detect common voltage shifts from the ECG as statistically independent from other voltage shifts, even if they're spread across all electrodes due to the referencing montage. I would suggest there are other reasons why the ICA might lead to imperfect separation of the ECG component (assumption of the same number of source components as sensors, non-Gaussian assumption, assumption of independence of source activities).

      The inclusion of only 32 channels in the EEG data might also have reduced the performance of ICA, increasing the chances of imperfect component separation and the mixing of cardiac artifacts into the neural components, whereas the higher number of sensors in the MEG data would enable better component separation. This could explain the difference between EEG and MEG in the ability to clean the ECG artifact (and perhaps higher-density EEG recordings would not show the same issue).

      The reviewer makes a good argument that our initial assumption, namely that the presence of cardiac activity on the reference electrode influences the performance of the ICA, may be wrong. After rereading and rethinking the matter, we think that the reviewer is correct and that their explanations for why the ECG signal was not so easily separable from our EEG recordings are more plausible and better grounded in the literature than our initial suggestion. We therefore now highlight their view as a main reason why the ECG rejection was more challenging in EEG data. However, we also note that understanding the exact reason is probably an empirical question that demands further research, stating that:

      “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources.”

      In addition to the inability to effectively clean the ECG artifact from EEG data, ICA and other component subtraction methods have also all been shown to distort neural activity in periods that aren't affected by the artifact due to the ubiquitous issue of imperfect component separation (https://doi.org/10.1101/2024.06.06.597688). As such, component subtraction-based (as well as regression-based) removal of the cardiac artifact might also distort the neural contributions to the aperiodic signal, so even methods to adequately address the cardiac artifact might not solve the problem explained in the study. This poses an additional potential confound to the "M/EEG without ECG" conditions.

      The reviewer is correct in stating that, if an “artifactual” signal is not always present but appears and disappears (e.g. eye blinks), neural activity may be distorted in periods where the “artifactual” signal is absent. However, while this plausibly presents a problem for ocular activity, there is no obvious reason to believe that this applies to cardiac activity. While the ECG signal is non-stationary in nature, it is remarkably more stable than eye movements in the healthy populations we analyzed (especially at rest). Accordingly, the cardiac “artifact” was consistently present across the entirety of the MEG recordings we visually inspected.

      Literature Analysis, Page 23: was there a method applied to address studies that report reducing artifacts in general, but are not specific to a single type of artifact? For example, there are automated methods for cleaning EEG data that use ICLabel (a machine learning algorithm) to delete "artifact" components. Within these studies, the cardiac artifact will not be mentioned specifically, but is included under "artifacts".

      The literature analysis was largely performed automatically and focussed solely on ECG-related activity, as described in the Methods section under Literature Analysis. If no ECG-related terms were used in the context of artifact rejection, a study was flagged as not having removed cardiac activity. We could indeed have highlighted this better and apologize for the oversight on our part. We now additionally link to these details stating that:

      “However, an analysis of openly accessible M/EEG articles (N<sub>Articles</sub>=279; see Methods - Literature Analysis for further details) that investigate aperiodic activity revealed that only 17.1% of EEG studies explicitly mention that cardiac activity was removed and only 16.5% measure ECG (45.9% of MEG studies removed cardiac activity and 31.1% of MEG studies mention that ECG was measured; see Figure 1EF).”

      The reviewer makes a fair point that there is some uncertainty here, and our results probably present a lower bound of ECG handling in M/EEG research: when I manually rechecked the studies that were not initially flagged, it was often solely mentioned that “artifacts” were rejected. This information seemed too ambiguous to assume that cardiac activity was in fact accounted for. Again, this could have been stated more clearly in writing and we apologize for this oversight. This is now included as part of the Methods section Literature Analysis stating that:

      “All valid word contexts were then manually inspected by scanning the respective word context to ensure that the removal of “artifacts” was related specifically to cardiac and not e.g. ocular activity or the rejection of artifacts in general (without specifying which “artifactual” source was rejected in which case the manuscript was marked as invalid). This means that the results of our literature analysis likely present a lower bound for the rejection of cardiac activity in the M/EEG literature investigating aperiodic activity.”
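
      The flag-then-inspect procedure described above can be sketched roughly as follows. This is only an illustration of the general idea: the search terms, the context window and the function name are hypothetical and are not the ones used in the manuscript's literature analysis.

```python
import re

# Hypothetical search terms, chosen for illustration only.
CARDIAC_TERMS = re.compile(
    r"\b(ecg|ekg|electrocardiogra\w*|cardiac|heart\w*)\b", re.IGNORECASE
)
REMOVAL_TERMS = re.compile(
    r"\b(remov\w*|reject\w*|clean\w*|correct\w*|artifact\w*|artefact\w*)\b",
    re.IGNORECASE,
)

def cardiac_contexts(text, window=10):
    """Return word contexts that mention a cardiac term near a removal term.

    Each returned snippet would still need manual inspection; generic
    statements about "artifacts" without a cardiac term yield no snippet.
    """
    words = text.split()
    hits = []
    for i, word in enumerate(words):
        if CARDIAC_TERMS.search(word):
            snippet = " ".join(words[max(0, i - window): i + window + 1])
            if REMOVAL_TERMS.search(snippet):
                hits.append(snippet)
    return hits

specific = cardiac_contexts("Ocular and cardiac artifacts were removed using ICA.")
generic = cardiac_contexts("Artifacts were rejected using ICLabel.")
recorded_only = cardiac_contexts("ECG was recorded with two electrodes.")
print(len(specific), len(generic), len(recorded_only))
```

      A study whose methods only mention generic "artifacts" produces no candidate context here, which mirrors why the reported percentages are best read as a lower bound.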

      Statistical inferences, page 23: as far as I can tell, no methods to control for multiple comparisons were implemented. Many of the statistical comparisons were not independent (or even overlapped with similar analyses in the full analysis space to a large extent), so I wouldn't expect strong multiple comparison controls. But addressing this point to some extent would be useful (or clarifying how it has already been addressed if I've missed something).

      In the present study we tried to minimize the risk of type 1 errors by several means, such as A) weakly informative priors, B) robust regression models and C) by specifying a region of practical equivalence (ROPE; see Methods - Statistical Inference for further information) to define meaningful effects.

      Weakly informative priors can lower the risk of type 1 errors arising from multiple testing by shrinking parameter estimates towards zero (see e.g. Lemoine, 2019). Robust regression models use a Student's t-distribution to describe the distribution of the data. This distribution features heavier tails, meaning it allocates more probability to extreme values, which in turn minimizes the influence of outliers. The ROPE criterion ensures that only effects exceeding a negligible size are considered meaningful, representing a strict and conservative approach to interpreting our findings (see Kruschke, 2018; Cohen, 1988).

      Furthermore, and more generally, we do not selectively report “significant” effects in situations in which multiple analyses were conducted on the same family of data (e.g. Figure 2 & 4). Instead, we provide joint inference across several plausible analysis options (akin to a specification curve analysis; Simonsohn, Simmons & Nelson, 2020) to provide other researchers with an overview of how different analysis choices impact the association between cardiac and neural aperiodic activity.
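
      To make the ROPE criterion mentioned above more concrete, here is a minimal sketch of an HDI-plus-ROPE decision rule applied to posterior samples of a standardized effect. The ROPE limits of ±0.1, the 95% interval mass, the simulated posteriors and all function names are assumptions for illustration; the limits actually used are given in the manuscript's Methods.

```python
import numpy as np

def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples."""
    s = np.sort(np.asarray(samples, dtype=float))
    n = s.size
    k = int(np.ceil(mass * n))            # number of samples inside the interval
    widths = s[k - 1:] - s[: n - k + 1]   # width of every candidate interval
    i = int(np.argmin(widths))
    return s[i], s[i + k - 1]

def rope_decision(samples, rope=(-0.1, 0.1), mass=0.95):
    """Classify an effect by comparing its HDI with the ROPE."""
    lo, hi = hdi(samples, mass)
    if hi < rope[0] or lo > rope[1]:
        return "meaningful"    # HDI lies entirely outside the ROPE
    if lo >= rope[0] and hi <= rope[1]:
        return "negligible"    # HDI lies entirely inside the ROPE
    return "undecided"         # HDI and ROPE overlap partially

# Simulated (hypothetical) posterior samples of a standardized slope.
rng = np.random.default_rng(0)
clear_effect = rope_decision(rng.normal(0.5, 0.05, size=4000))
null_effect = rope_decision(rng.normal(0.0, 0.02, size=4000))
borderline = rope_decision(rng.normal(0.1, 0.05, size=4000))
print(clear_effect, null_effect, borderline)
```

      Under this rule an effect is only called meaningful when essentially all of its credible mass lies beyond the negligible range, which is what makes the criterion conservative.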

      Lemoine, N. P. (2019). Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses. Oikos, 128(7), 912-928.

      Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2020). Specification curve analysis. Nature Human Behaviour, 4(11), 1208-1214.

      Methods:

      Applying ICA components from 1Hz high pass filtered data back to the 0.1Hz filtered data leads to worse artifact cleaning performance, as the contribution of the artifact in the 0.1Hz to 1Hz frequency band is not addressed (see Bailey, N. W., Hill, A. T., Biabani, M., Murphy, O. W., Rogasch, N. C., McQueen, B., ... & Fitzgerald, P. B. (2023). RELAX part 2: A fully automated EEG data cleaning algorithm that is applicable to Event-Related-Potentials. Clinical Neurophysiology, result reported in the supplementary materials). This might explain some of the lower frequency slope results (which include a lower frequency limit <1Hz) in the EEG data - the EEG cleaning method is just not addressing the cardiac artifact in that frequency range (although it certainly wouldn't explain all of the results).

      We want to thank the reviewer for suggesting this interesting paper, showing that lower high-pass filters may be preferable to the more commonly used >1Hz high-pass filters for detection of ICA components that largely contain peripheral physiological activity. However, the results presented by Bailey et al. contradict the more commonly reported findings by other researchers that a >1Hz high-pass filter is actually preferable (e.g. Winkler et al., 2015; Dimigen, 2020; Klug & Gramann, 2021) and recommendations in widely used packages for M/EEG analysis (e.g. https://mne.tools/1.8/generated/mne.preprocessing.ICA.html). Yet, the fact that there seems to be a discrepancy suggests that further research is needed to better understand which type of high-pass filtering is preferable in which situation. Furthermore, it is notable that all the findings for high-pass filtering in ICA component detection and removal that we are aware of relate to ocular activity. Given that ocular and cardiac activity have very different temporal and spectral patterns, it is probably worth further investigating whether the classic 1Hz high-pass filter is really also the best option for the detection and removal of cardiac activity. However, in our opinion this requires a dedicated investigation of its own.

      We therefore highlight this now in our manuscript stating that:

      “Additionally, it is worth noting that the effectiveness of an ICA crucially depends on the quality of the extracted components(63,64) and even widely suggested settings e.g. high-pass filtering at 1Hz before fitting an ICA may not be universally applicable (see supplementary material of (64)).”

      I. Winkler, S. Debener, K.-R. Müller and M. Tangermann, "On the influence of high-pass filtering on ICA-based artifact reduction in EEG-ERP," 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, 2015, pp. 4101-4105, doi: 10.1109/EMBC.2015.7319296.

      Dimigen, O. (2020). Optimizing the ICA-based removal of ocular EEG artifacts from free viewing experiments. NeuroImage, 207, 116117.

      Klug, M., & Gramann, K. (2021). Identifying key factors for improving ICA‐based decomposition of EEG data in mobile and stationary experiments. European Journal of Neuroscience, 54(12), 8406-8420.

      It looks like no methods were implemented to address muscle artifacts. These can affect the slope of EEG activity at higher frequencies. Perhaps the Riemannian Potato addressed these artifacts, but I suspect it wouldn't eliminate all muscle activity. As such, I would be concerned that remaining muscle artifacts affected some of the results, particularly those that included high frequency ranges in the aperiodic estimate. Perhaps if muscle activity were left in the EEG data, it could have disrupted the ability to detect a relationship between age and 1/f slope in a way that didn't disrupt the same relationship in the cardiac data (although I suspect it wouldn't reverse the overall conclusions given the number of converging results including in lower frequency bands). Is there a quick validity analysis the authors can implement to confirm muscle artifacts haven't negatively affected their results?

      I note that an analysis of head movement in the MEG is provided on page 32, but it would be more robust to show that removing ICA components reflecting muscle doesn't change the results. The results/conclusions of the following study might be useful for objectively detecting probable muscle artifact components: Fitzgibbon, S. P., DeLosAngeles, D., Lewis, T. W., Powers, D. M. W., Grummett, T. S., Whitham, E. M., ... & Pope, K. J. (2016). Automatic determination of EMG-contaminated components and validation of independent component analysis using EEG during pharmacologic paralysis. Clinical neurophysiology, 127(3), 1781-1793.

      We thank the reviewer for their suggestion. Muscle activity can indeed be a concern for the estimation of the spectral slope. This is precisely why we used head movements (as also noted by the reviewer) as a proxy for muscle activity. We also agree with the reviewer that this is not a perfect estimate. Additionally, the Riemannian Potato would probably only capture epochs that contain transient, but not persistent, patterns of muscle activity.

      The paper recommended by the reviewer contains a clever approach of using the steepness of the spectral slope (or lack thereof) as an indicator of whether or not an independent component (IC) is driven by muscle activity. In order to determine an optimal threshold, Fitzgibbon et al. compared paralyzed to temporarily non-paralyzed subjects. They determined an expected “EMG-free” threshold for their spectral slope on paralyzed subjects and used this as a benchmark to detect ICs that were contaminated by muscle activity in non-paralyzed subjects.

      This is a great idea, but unfortunately it would go way beyond what we are able to sensibly estimate with our data, for the following reasons. The authors estimated their optimal threshold on paralyzed subjects for EEG data and show that this is a feasible threshold to be applied across different recordings. So for EEG data it might be feasible, at least as a first shot, to use their threshold on our data. However, we are measuring MEG and, as alluded to in our discussion section under “Differences in aperiodic activity between magnetic and electric field recordings”, the spectral slope differs greatly between MEG and EEG recordings for non-trivial reasons. Furthermore, the spectral slope even seems to differ across different MEG devices. We noticed this when we initially tried to pool the data recorded in Salzburg with the Cambridge dataset. This means we would need to do a complete validation of this procedure for the MEG data recorded in Cambridge and in Salzburg, which is not feasible considering that we A) don’t have direct access to one of the recording sites and B) would, even if we had access, face substantial hurdles to get ethical approval for the experiment performed by Fitzgibbon et al.

      However, we think the approach brought forward by Fitzgibbon and colleagues is a clever way to remove muscle activity from EEG recordings whenever EMG was not directly recorded. We therefore suggested in the Discussion section that ideally EMG should also be recorded, stating that:

      “It is worth noting that, apart from cardiac activity, muscle activity can also be captured in (non-)invasive recordings and may drastically influence measures of the spectral slope(72). To ensure that persistent muscle activity does not bias our results we used changes in head movement velocity as a control analysis (see Supplementary Figure S9). However, it should be noted that this is only a proxy for the presence of persistent muscle activity. Ideally, studies investigating aperiodic activity should also be complemented by measurements of EMG. Whenever such measurements are not available creative approaches that use the steepness of the spectral slope (or the lack thereof) as an indicator to detect whether or not e.g. an independent component is driven by muscle activity are promising(72,73). However, these approaches may require further validation to determine how well myographic aperiodic thresholds are transferable across the wide variety of different M/EEG devices.”
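
      The slope-threshold idea discussed above can be sketched as follows: estimate each component's spectral slope with a straight-line fit in log-log space and flag components whose spectrum is flatter than a threshold. This is not the validated procedure of Fitzgibbon et al.; the threshold of -1.0, the 7-75 Hz fitting range, the simple averaged periodogram and the function names are hypothetical placeholders, and, as noted in the response, any real threshold would need validation per device.

```python
import numpy as np

def spectral_slope(x, sfreq, fmin, fmax, seg_len=1024):
    """Slope of a log10-log10 line fit to an averaged periodogram."""
    x = np.asarray(x, dtype=float)
    n_seg = x.size // seg_len
    segs = x[: n_seg * seg_len].reshape(n_seg, seg_len)
    segs = segs - segs.mean(axis=1, keepdims=True)   # remove per-segment offset
    window = np.hanning(seg_len)
    psd = (np.abs(np.fft.rfft(segs * window, axis=1)) ** 2).mean(axis=0)
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / sfreq)
    mask = (freqs >= fmin) & (freqs <= fmax)
    slope, _ = np.polyfit(np.log10(freqs[mask]), np.log10(psd[mask]), 1)
    return slope

def flag_flat_components(sources, sfreq, threshold, fmin=7.0, fmax=75.0):
    """Flag components whose slope is flatter (less negative) than threshold."""
    slopes = np.array([spectral_slope(s, sfreq, fmin, fmax) for s in sources])
    return slopes > threshold, slopes

# Toy "components": white noise (flat spectrum) and integrated white noise
# (steep, roughly 1/f^2 spectrum).
rng = np.random.default_rng(1)
sfreq = 500.0
white = rng.normal(size=30_000)
brown = np.cumsum(rng.normal(size=30_000))

flags, slopes = flag_flat_components([white, brown], sfreq, threshold=-1.0)
print(flags, slopes)
```

      In this toy case only the flat-spectrum signal is flagged; with real data the slope distribution depends on the recording modality and device, which is exactly the transferability concern raised above.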

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) As outlined above, I recommend rephrasing the last section of the introduction to briefly summarize/introduce all main analysis steps undertaken in the study and why these were done (for example, it is only mentioned that the Cam-CAN dataset was used to study the impact of cardiac activity on MEG activity although the authors used a variety of different datasets). Similarly, I am missing an overview of all main findings in the context of the study goals in the discussion. I believe clarifying the structure of the paper would not only provide a red thread to the reader but also highlight the efforts/strength of the study as described above.

      This is a good call! As suggested by the reviewer, we now try to give a clearer overview of what was investigated and why. We do that both at the end of the introduction stating that: “Using the publicly available Cam-CAN dataset(28,29), we find that the aperiodic signal measured using M/EEG originates from multiple physiological sources. In particular, significant portions of age-related changes in aperiodic activity –normally attributed to neural processes– can be better explained by cardiac activity. This observation holds across a wide range of processing options and control analyses (see Supplementary S1), and was replicable on a separate MEG dataset. However, the extent to which cardiac activity accounts for age-related changes in aperiodic activity varies with the investigated frequency range and recording site. Importantly, in some frequency ranges and sensor locations, age-related changes in neural aperiodic activity still prevail. But does the influence of cardiac activity on the aperiodic spectrum extend beyond age? In a preliminary analysis, we demonstrate that working memory load modulates the aperiodic spectrum of “pure” ECG recordings. The direction of this working memory effect mirrors previous findings on EEG data(5) suggesting that the impact of cardiac activity goes well beyond aging. In sum, our results highlight the complexity of aperiodic activity while cautioning against interpreting it as solely “neural” without considering physiological influences.”

      and at the beginning of the discussion section:

      “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources (see Figure 1EF). Additionally, it is worth noting that the effectiveness of an ICA crucially depends on the quality of the extracted components(63,64) and even widely suggested settings e.g. high-pass filtering at 1Hz before fitting an ICA may not be universally applicable (see supplementary material of (64)).”

      (2) I found it interesting that the spectral slopes of ECG activity at higher frequency ranges (> 10 Hz) seem mostly related to HRV measures such as fractal and time domain indices and less so with frequency-domain indices. Do the authors have an explanation for why this is the case? Also, the analysis of the HRV measures and their association with aperiodic ECG activity is not explained in any of the method sections.

      We apologize for the oversight in not mentioning the HRV analysis in more detail in our methods section. We added a subsection to the Methods section entitled ECG Processing - Heart rate variability analysis to further describe the HRV analyses.

      “ECG Processing - Heart rate variability analysis

      Heart rate variability (HRV) was computed using the NeuroKit2 toolbox, a high-level tool for the analysis of physiological signals. First, the raw electrocardiogram (ECG) data were preprocessed by high-pass filtering the signal at 0.5Hz using an infinite impulse response (IIR) Butterworth filter (order=5) and by smoothing the signal with a moving average kernel with the width of one period of 50Hz to remove the powerline noise (default settings of neurokit.ecg.ecg_clean). Afterwards, QRS complexes were detected based on the steepness of the absolute gradient of the ECG signal. Subsequently, R-Peaks were detected as local maxima in the QRS complexes (default settings of neurokit.ecg.ecg_peaks; see (98) for a validation of the algorithm). From the cleaned R-R intervals, 90 HRV indices were derived, encompassing time-domain, frequency-domain, and non-linear measures. Time-domain indices included standard metrics such as the mean and standard deviation of the normalized R-R intervals, the root mean square of successive differences, and other statistical descriptors of interbeat interval variability. Frequency-domain analyses were performed using power spectral density estimation, yielding for instance low frequency (0.04-0.15Hz) and high frequency (0.15-0.4Hz) power components. Additionally, non-linear dynamics were characterized through measures such as sample entropy, detrended fluctuation analysis and various Poincaré plot descriptors. All these measures were then related to the slopes of the low frequency (0.25 – 20 Hz) and high frequency (10 – 145 Hz) aperiodic spectrum of the raw ECG.”
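
      For readers unfamiliar with the time-domain indices named in the quoted Methods text, the following minimal sketch computes three of them (mean R-R interval, its standard deviation, and the root mean square of successive differences) directly from R-peak positions. It uses plain NumPy rather than NeuroKit2, the function and variable names are hypothetical, and using the sample standard deviation (ddof=1) is an assumption of this example.

```python
import numpy as np

def time_domain_hrv(r_peaks, sfreq):
    """Mean R-R interval, SDNN and RMSSD (all in ms) from R-peak sample indices."""
    rr_ms = np.diff(np.asarray(r_peaks, dtype=float)) / sfreq * 1000.0
    successive = np.diff(rr_ms)  # beat-to-beat changes of the R-R interval
    return {
        "mean_rr": float(rr_ms.mean()),
        "sdnn": float(rr_ms.std(ddof=1)),
        "rmssd": float(np.sqrt(np.mean(successive ** 2))),
    }

# Toy example: R-peaks sampled at 1000 Hz with R-R intervals of
# 800, 810, 790 and 800 ms.
r_peaks = np.cumsum([0, 800, 810, 790, 800])
indices = time_domain_hrv(r_peaks, sfreq=1000.0)
print(indices)
```

      All three indices summarize variability over many beats, i.e. on time scales of seconds and longer, which is consistent with the frequency-domain HRV bands quoted above lying below 0.4Hz.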

      Regarding the association between the ECG’s spectral slopes at high frequencies and frequency-domain indices of heart rate variability: common frequency-domain indices of heart rate variability fall in the range of 0.01-0.4Hz, which probably explains why we didn’t notice any association at higher frequency ranges (>10Hz).

      This is also stated in the related part of the results section:

      “In the higher frequency ranges (10 - 145 Hz) spectral slopes were most consistently related to fractal and time domain indices of heart rate variability, but not so much to frequency-domain indices assessing spectral power in frequency ranges < 0.4 Hz.”

      (3) Related to the previous point - what is being reflected in the ECG at higher frequency ranges, with regard to biological mechanisms? Results are being mentioned, but not further discussed. However, this point seems crucial because the age effects across the four datasets differ between low and high-frequency slope limits (Figure 2C).

      This is a great question that definitely also requires further attention and investigation in general (see also Tereshchenko & Josephson, 2015). We investigated the change of the slope across frequency ranges that are typically captured in common ECG setups for adults (0.05 - 150Hz; Tereshchenko & Josephson, 2015; Kusayama, Wong, Liu et al., 2020). While most of the physiologically significant spectral information of an ECG recording rests between 1-50Hz (Clifford & Azuaje, 2006), meaningful information can be extracted at much higher frequencies. For instance, ventricular late potentials have a broader frequency band (~40-250Hz) that falls squarely within our spectral analysis window. Further meaningful information can be extracted at even higher frequencies (>100Hz), although the exact physiological mechanisms underlying the so-called high-frequency QRS (HF-QRS) remain unclear (see Tereshchenko & Josephson, 2015; Qiu et al., 2024 for a review discussing possible mechanisms). At the same time, the HF-QRS seems to be highly informative for the early detection of myocardial ischemia and other cardiac abnormalities that may not yet be evident in the standard frequency range (Schlegel et al., 2004; Qiu et al., 2024). All optimism aside, it is also worth noting that ECG recordings at higher frequencies can capture skeletal muscle activity with an overlapping frequency range up to 400Hz (Kusayama, Wong, Liu et al., 2020). We now highlight all of this when introducing this analysis in the results section as an outstanding research question, stating that:

      “However, substantially less is known about aperiodic activity above 0.4Hz in the ECG. Yet, common ECG setups for adults capture activity at a broad bandwidth of 0.05 - 150Hz(33,34).

      Importantly, a lot of the physiological meaningful spectral information rests between 1-50Hz(35), similarly to M/EEG recordings. Furthermore, meaningful information can be extracted at much higher frequencies. For instance, ventricular late potentials have a broader frequency band (~40-250Hz(35)). However, that’s not all, as further meaningful information can be extracted at even higher frequencies (>100Hz). For instance, the so-called high-frequency QRS seems to be highly informative for the early detection of myocardial ischemia and other cardiac abnormalities that may not yet be evident in the standard frequency range(36,37). Yet, the exact physiological mechanisms underlying the high-frequency QRS remain unclear (see (37) for a review discussing possible mechanisms). ”

      Tereshchenko, L. G., & Josephson, M. E. (2015). Frequency content and characteristics of ventricular conduction. Journal of electrocardiology, 48(6), 933-937.

      Kusayama, T., Wong, J., Liu, X., et al. (2020). Simultaneous noninvasive recording of electrocardiogram and skin sympathetic nerve activity (neuECG). Nature Protocols, 15, 1853–1877. https://doi.org/10.1038/s41596-020-0316-6

      Clifford, G. D., & Azuaje, F. (2006). Advanced methods and tools for ECG data analysis (Vol. 10). P. McSharry (Ed.). Boston: Artech house.

      Qiu, S., Liu, T., Zhan, Z., Li, X., Liu, X., Xin, X., ... & Xiu, J. (2024). Revisiting the diagnostic and prognostic significance of high-frequency QRS analysis in cardiovascular diseases: a comprehensive review. Postgraduate Medical Journal, qgae064.

      Schlegel, T. T., Kulecz, W. B., DePalma, J. L., Feiveson, A. H., Wilson, J. S., Rahman, M. A., & Bungo, M. W. (2004, March). Real-time 12-lead high-frequency QRS electrocardiography for enhanced detection of myocardial ischemia and coronary artery disease. In Mayo Clinic Proceedings (Vol. 79, No. 3, pp. 339-350). Elsevier.

      (4) Page 10: At first glance, it is not quite clear what is meant by "processing option" in the text. Please clarify.

      Thank you for catching this! Upon re-reading, this is indeed a bit unclear. We have now replaced “processing options” with “slope fits” to make it clearer that we are talking about the percentage of effects based on the different slope fits.

      (5) The authors mention previous findings on age effects on neural 1/f activity (References Nr 5,8,27,39) that seem contrary to their own findings such as e.g., the mostly steepening of the slopes with age. Also, the authors discuss thoroughly why spectral slopes derived from MEG signals may differ from EEG signals. I encourage the authors to have a closer look at these studies and elaborate a bit more on why these studies differ in their conclusions on the age effects. For example, Tröndle et al. (2022, Ref. 39) investigated neural activity in children and young adults, hence, focused on brain maturation, whereas the CamCAN set only considers the adult lifespan. In a similar vein, others report age effects on 1/f activity in much smaller samples as reported here (e.g., Voytek et al., 2015).

      I believe taking these points into account by briefly discussing them, would strengthen the authors' claims and provide a more fine-grained perspective on aging effects on 1/f.

      The reviewer is making a very important point, as age-related differences in (neuro-)physiological activity are not necessarily strictly comparable and entirely linear across different age cohorts (e.g. age-related changes in alpha center frequency). We therefore added the suggested discussion points to the discussion section.

      “Differences in electric and magnetic field recordings aside, aperiodic activity may not change strictly linearly as we are ageing and studies looking at younger age groups (e.g. <22; (44) may capture different aspects of aging (e.g. brain maturation), than those looking at older subjects (>18 years; our sample). A recent report even shows some first evidence of an interesting putatively non-linear relationship with age in the sensorimotor cortex for resting recordings(59)”

      (6) The analysis of the working memory paradigm as described in the outlook-section of the discussion comes as a bit of a surprise as it has not been introduced before. If the authors want to convey with this study that, in general, aperiodic neural activity could be influenced by aperiodic cardiac activity, I recommend introducing this analysis and the results earlier in the manuscript than only in the discussion to strengthen their message.

      The reviewer is correct. This analysis really comes a bit out of the blue. However, this was also exactly the intention for placing this analysis in the discussion. As the reviewer correctly noted, the aim was to suggest “that, in general, aperiodic neural activity could be influenced by aperiodic cardiac activity”. We placed this outlook directly after the discussion of “(neuro-)physiological origins of aperiodic activity”, where we highlight the potential challenges of interpreting drug induced changes to M/EEG recordings. So the aim was to get the reader to think about whether age is the only feature affected by cardiac activity and then directly present some evidence that this might go beyond age.

      However, we have been rethinking this approach based on the reviewer's comments. We accordingly moved that paragraph to the end of the results section and now introduce it at the end of the introduction, stating that:

      “But does the influence of cardiac activity on the aperiodic spectrum extend beyond age? In a preliminary analysis, we demonstrate that working memory load modulates the aperiodic spectrum of “pure” ECG recordings. The direction of this working memory effect mirrors previous findings on EEG data(5) suggesting that the impact of cardiac activity goes well beyond aging.”

      (7) The font in Figure 2 is a bit hard to read (especially in D). I recommend increasing the font sizes where necessary for better readability.

      We agree with the Reviewer and increased the font sizes accordingly.

      (8) Text in the discussion: Figure 3B on page 10 => shouldn't it be Figure 4?

      Thank you for catching this oversight. We have now corrected this mistake.

      (9) In the third section on page 10, the Figure labels seem to be confused. For example, Figure 4 E is supposed to show "steepening effects", which should be Figure 4B I believe.

      Please check the figure labels in this section to avoid confusion.

      Thank you for catching this oversight. We have now corrected this mistake.

      (10) Figure Legend 4 I), please check the figure labels in the text

      Thank you for catching this oversight. We have now corrected this mistake.

      Reviewer #3 (Recommendations for the authors):

      I have a number of suggestions for improving the manuscript, which I have divided by section in the following:

      ABSTRACT:

      I would suggest re-writing the first sentences to make it easier to read for non-expert readers: "The power of electrophysiologically measured cortical activity decays with an approximately 1/fX function. The slope of this decay (i.e. the spectral exponent, X) is modulated..."

      Thank you for the suggestion. We adjusted the sentence as suggested to make it easier for less technical readers to understand that “X” refers to the exponent.

      Including the age range that was studied in the abstract could be informative.

      Done as suggested.

      As an optional recommendation, I think it would increase the impact of the article if the authors note in the abstract that the current most commonly applied cardiac artifact reduction approaches don't resolve the issue for EEG data, likely due to an imperfect ability to separate the cardiac artifact from the neural activity with independent component analysis. This would highlight to the reader that they can't just expect to address these concerns by cleaning their data with typical cleaning methods.

      I think it would also be useful to convey in the abstract just how comprehensive the included analyses were (in terms of artifact reduction methods tested, different aperiodic algorithms and frequency ranges, and both MEG and EEG). Doing so would let the reader know just how robust the conclusions are likely to be.

      This is a brilliant idea! As suggested we added a sentence highlighting that simply performing an ICA may not be sufficient to separate cardiac contributions to M/EEG recordings and refer to the comprehensiveness of the performed analyses.

      INTRODUCTION:

      I would suggest re-writing the following sentence for readability: "In the past, aperiodic neural activity, other than periodic neural activity (local peaks that rise above the "power-law" distribution), was often treated as noise and simply removed from the signal"

      To something like: "In the past, aperiodic neural activity was often treated as noise and simply removed from the signal e.g. via pre-whitening, so that analyses could focus on periodic neural activity (local peaks that rise above the "power-law" distribution, which are typically thought to reflect neural oscillations).

      We are happy to follow that suggestion.

      Page 3: please provide the number of articles that were included in the examination of the percentage that remove cardiac activity, and note whether the included articles could be considered a comprehensive or nearly comprehensive list, or just a representative sample.

      We stated the exact number of articles in the methods section under Literature Analysis. However, we added it to the Introduction on page 3 as suggested by the reviewer. The selection of articles was done automatically, dependent on a list of pre-specified terms and exclusively focussed on articles that had terms related to aperiodic activity in their title (see Literature Analysis). Therefore, I would personally be hesitant in calling it a comprehensive or nearly comprehensive list of the general M/EEG literature as the analysis of aperiodic activity is still relatively niche compared to the more commonly investigated evoked potentials or oscillations. I think whether or not a reader perceives our analysis as comprehensive should be up to them to decide and does not reflect something I want to impose on them. This is exacerbated by the fact that the analysis of neural aperiodic activity has rapidly gained traction over the last years (see Figure 1D orange) and the literature analysis was performed almost 2 years ago and therefore, in my eyes, only represents a glimpse in the rapidly evolving field related to the analysis of aperiodic activity.

      Figure 1E-F: It's not completely clear that the "Cleaning Methods" part of the figure indicates just methods to clean the cardiac artifact (rather than any artifact). It also seems that ~40% of EEG studies do not apply any cleaning methods even from within the studies that do clean the cardiac artifact (if I've read the details correctly). This seems unlikely. Perhaps there should be a bar for "other methods", or "unspecified"? Having said that, I'm quite familiar with the EEG artifact reduction literature, and I would be very surprised if ~40% of studies cleaned the cardiac artifact using a different method to the methods listed in the bar graph, so I'm wondering if I've misunderstood the figure, or whether the data capture is incomplete / inaccurate (even though the conclusion that ICA is the most common method is almost certainly accurate).

      The cleaning is indeed focussed only on cardiac activity. This was also mentioned in the caption of Figure 1: “We were further interested in determining which artifact rejection approaches were most commonly used to remove cardiac activity, such as independent component analysis (ICA(22)), singular value decomposition (SVD(23)), signal space separation (SSS(24)), signal space projections (SSP(25)) and denoising source separation (DSS(26)).” and in the methods section under Literature Analysis. However, we adjusted Figure 1E-F to make it more obvious that the described cleaning methods were only related to the ECG. Aside from using blind source separation techniques such as ICA, a good number of studies mentioned that they cleaned their data based on visual inspection (which was not further considered). Furthermore, it has to be noted that studies were only marked as having separated cardiac from neural activity when this was mentioned explicitly.

      RESULTS:

      Page 6: I would delete the "from a neurophysiological perspective" clause, which makes the sentence more difficult to read and isn't so accurate (frequencies 13-25Hz would probably more commonly be considered mid-range rather than low or high). Additionally, both frequency ranges include 15Hz, but the next sentence states that the ranges were selected to avoid the knee at 15Hz, which seems to be a contradiction. Could the authors explain in more detail how the split addresses the 15Hz knee?

      We removed the “from a neurophysiological perspective” clause as suggested. With regard to the “knee” at ~15Hz, I would like to refer the reviewer to Supplementary Figure S1. The knee frequency varies substantially across subjects, so splitting the data at only one exact frequency did not seem appropriate. Additionally, we found only spurious significant age-related variations in knee frequency (i.e. in only one out of the 4 datasets; not shown).

      Furthermore, we wanted to better connect our findings to our MEG results in Figure 4 and also give the readers a holistic overview of how different frequency ranges in the aperiodic ECG would be affected by age. So to fulfill all of these objectives we decided to fit slopes with respective upper/lower bounds around a range of 5Hz above and below the average 15Hz Knee Frequency across datasets.

      The later parts of this same paragraph refer to a vast amount of different frequency ranges, but only the "low" and "high" frequency ranges were previously mentioned. Perhaps the explanation could be expanded to note that multiple lower and upper bounds were tested within each of these low and high frequency windows?

      This is a good catch; we adjusted the sentence as suggested. We now write: “.. slopes were fitted individually to each subject's power spectrum in several lower (0.25 – 20 Hz) and higher (10-145 Hz) frequency ranges.”
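
      To illustrate what fitting slopes in several lower and higher frequency ranges amounts to, here is a minimal sketch. It is not the pipeline used in the manuscript (which relies on IRASA): it swaps in a plain least-squares line fit in log-log space on a synthetic spectrum. The function name `fit_loglog_slope`, the synthetic data and the specific range bounds are assumptions chosen for illustration only.

```python
import math


def fit_loglog_slope(freqs, power, f_min, f_max):
    """Least-squares slope of log10(power) against log10(frequency) in [f_min, f_max]."""
    pts = [(math.log10(f), math.log10(p))
           for f, p in zip(freqs, power) if f_min <= f <= f_max]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    return sxy / sxx


# Synthetic aperiodic spectrum with a known exponent of 2 (power ~ 1 / f**2).
freqs = [0.25 * k for k in range(1, 581)]  # 0.25 ... 145 Hz
power = [1.0 / f ** 2 for f in freqs]

# Hypothetical fitting ranges (lower, upper bound in Hz); illustrative only.
low_ranges = [(0.25, 10.0), (0.25, 15.0), (0.25, 20.0)]
high_ranges = [(10.0, 145.0), (15.0, 145.0), (20.0, 145.0)]

slopes = {rng: fit_loglog_slope(freqs, power, *rng)
          for rng in low_ranges + high_ranges}
for rng, slope in slopes.items():
    print(rng, round(slope, 3))
```

      Because the synthetic spectrum has no knee, every range returns the same slope. With real ECG spectra, ranges on either side of the knee would be expected to differ, which is why several bounds are tested.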

      The following two sentences seem to contradict each other: "Overall, spectral slopes in lower frequency ranges were more consistently related to heart rate variability indices(> 39.4% percent of all investigated indices)" and: "In the lower frequency range (0.25 - 20Hz), spectral slopes were consistently related to most measures of heart rate variability; i.e. significant effects were detected in all 4 datasets (see Figure 2D)." (39.4% is not "most").

      The reviewer is correct in stating that 39.4% is not most. However, the 39.4% is the lowest bound and only refers to 1 dataset. In the other 3 datasets the percentage of effects was above 64% which can be categorized as “most” i.e. above 50%. We agree that this was a bit ambiguous in the sentence so we added the other percentages as well as a reference to Figure 2D to make this point clearer.

      Figure 2D: it isn't clear what the percentages in the semi-circles reflect, nor why some semi-circles are more full circles while others are only quarter circles.

      The percentages in the semi-circles reflect the proportion of effects (marked in red) and null effects (marked in green) per dataset, averaged across the different measures of HRV. For some frequency ranges fewer effects were found, resulting in quarter circles instead of semi-circles.

      Page 8: I think the authors could make it more clear that one of the conditions they were testing was the ECG component of the EEG data (extracted by ICA then projected back into the scalp space for the temporal response function analysis).

      As suggested by the reviewer we adjusted our wording and replaced the arguably a bit ambiguous “... projected back separately” with “... projected back into the sensor space”. We thank the reviewer for this recommendation, as it does indeed make it easier to understand the procedure.

      “After pre-processing (see Methods) the data was split in three conditions using an ICA(22). Independent components that were correlated (at r > 0.4; see Methods: MEG/EEG Processing - pre-processing) with the ECG electrode were either not removed from the data (Figure 3ABCD - blue), removed from the data (Figure 3ABCD - orange) or projected back into the sensor space (Figure 3ABCD - green).”
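
      The selection step behind these three conditions can be sketched as follows. This is a toy illustration, not the MNE-Python code used for the manuscript: the helper names `pearson_r` and `split_components`, the synthetic signals, and the use of the absolute correlation (ICA components have arbitrary sign) are assumptions. Only the r > 0.4 threshold comes from the text above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def split_components(components, ecg, threshold=0.4):
    """Split component indices into ECG-related (|r| > threshold) and the rest."""
    cardiac, other = [], []
    for idx, comp in enumerate(components):
        target = cardiac if abs(pearson_r(comp, ecg)) > threshold else other
        target.append(idx)
    return cardiac, other


# Toy data: a crude "R-peak" train and two hypothetical independent components.
n = 200
ecg = [1.0 if t % 50 == 0 else 0.0 for t in range(n)]
components = [
    [0.9 * e + 0.01 * math.sin(0.3 * t) for t, e in enumerate(ecg)],  # mostly cardiac
    [math.sin(0.3 * t) for t in range(n)],                            # unrelated oscillation
]

cardiac_idx, other_idx = split_components(components, ecg)
print("ECG-related components:", cardiac_idx)
print("remaining components:", other_idx)
```

      Given such a split, the "ECG rejected" condition would reconstruct the sensor data from the remaining components, and the "ECG component" condition from the ECG-related components only.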

      Figure 4A: standardized beta coefficients for the relationship between age and spectral slope could be noted to provide improved clarity (if I'm correct in assuming that is what they reflect).

      This was indeed shown in Figure 4A and noted in the color bar as “average beta (standardized)”. We do not specifically highlight this in the text, because the exact coefficients would depend both on the analyzed frequency range and on the selected electrodes.

      Figure 4I: The regressions explained at this point seems to contain a very large number of potential predictors, as I'm assuming it includes all sensors for both the ECG component and ECG rejected conditions? (if that is not the case, it could be explained in greater detail). I'm also not sure about the logic of taking a complete signal, decomposing it with ICA to separate out the ECG and non-ECG signals, then including them back into the same regression model. It seems that there could be some circularity or redundancy in doing so. However, I'm not confident that this is an issue, so would appreciate the authors explaining why it this is a valid approach (if that is the case).

      After observing significant effects both in the MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> conditions in similar frequency bands we wanted to understand whether or not these age-related changes are statistically independent. To test this we added both variables as predictors in a regression model (thereby accounting for the influence of the other in relation to age). The regression models we performed were therefore actually not very complex. They were built using only two predictors, namely the data (in a specific frequency range) averaged over channels on which we noticed significant effects in the ECG rejected and ECG components data respectively (Wilkinson notation: age ~ 1 + ECG rejected + ECG components). This was also described in the results section stating that: “To see if MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub> explain unique variance in aging at frequency ranges where we noticed shared effects, we averaged the spectral slope across significant channels and calculated a multiple regression model with MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> as predictors for age (to statistically control for the effect of MEG<sub>ECG component</sub>s and MEG<sub>ECG rejected</sub> on age). This analysis was performed to understand whether the observed shared age-related effects (MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub>) are in(dependent).”  

      We hope this explanation solves the previous misunderstanding.
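
      For readers who prefer code to Wilkinson notation, the model `age ~ 1 + ECG rejected + ECG components` can be sketched as an ordinary least-squares fit with an intercept and two predictors. The numbers below are invented toy values, the helper names are hypothetical, and the sketch returns raw rather than standardized coefficients; it only illustrates the structure of the model, not our actual results.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(predictors, target):
    """Ordinary least squares with an intercept: target ~ 1 + predictors."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy per-subject slopes, averaged over significant channels (invented values).
slope_ecg_rejected = [1.0, 1.2, 0.9, 1.5, 1.1, 1.4, 1.3, 1.6]
slope_ecg_component = [2.0, 1.8, 2.3, 2.1, 2.6, 2.2, 2.7, 2.5]
# Noise-free toy "age" built from both predictors so the fit is easy to check.
age = [30 + 10 * a + 5 * b
       for a, b in zip(slope_ecg_rejected, slope_ecg_component)]

intercept, beta_rejected, beta_component = ols(
    [slope_ecg_rejected, slope_ecg_component], age)
print(intercept, beta_rejected, beta_component)
```

      Each coefficient describes the association of one predictor with age while the other is held constant, which is what allows the model to test whether the two conditions explain unique variance.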

      The explanation of results for relationships between spectral slopes and aging reported in Figure 4 refers to clusters of effects, but the statistical inference methods section doesn't explain how these clusters were determined.

      The wording of “cluster” was used to describe a “category” of effects e.g. null effects. We changed the wording from “cluster” to “category” to make this clearer stating now that: “This analysis, which is depicted in Figure 4, shows that over a broad amount of individual fitting ranges and sensors, aging resulted in a steepening of spectral slopes across conditions (see Figure 4E) with “steepening effects” observed in 25% of the processing options in MEG<sub>ECG not rejected</sub> , 0.5% in MEG<sub>ECG rejected</sub>, and 60% for MEG<sub>ECG components</sub>. The second largest category of effects were “null effects” in 13% of the options for MEG<sub>ECG not rejected</sub> , 30% in MEG<sub>ECG rejected</sub>, and 7% for MEG<sub>ECG components</sub>. ”

      Page 12: can the authors clarify whether these age related steepenings of the spectral slope in the MEG are when the data include the ECG contribution, or when the data exclude the ECG? (clarifying this seems critical to the message the authors are presenting).

      We apologize for not making this clearer. We now write: “This analysis also indicates that a vast majority of observed effects irrespective of condition (ECG components, ECG not rejected, ECG rejected) show a steepening of the spectral slope with age across sensors and frequency ranges.”

      Page 13: I think it would be useful to describe how much variance was explained by the MEG-ECG rejected vs MEG-ECG component conditions for a range of these analyses, so the reader also has an understanding of how much aperiodic neural activity might be influenced by age (vs if the effects are really driven mostly by changes in the ECG).

      With regard to the explained variance, I think that the very important question of how strongly age influences changes in aperiodic activity is a topic better suited for a meta-analysis, as the effect sizes seem to vary largely depending on the sample; e.g. for EEG, results in the literature were reported at r=-0.08 (Cesnaite et al. 2023), r=-0.26 (Cellier et al. 2021), r=-0.24/r=-0.28/r=-0.35 (Hill et al. 2022) and r=0.5/r=0.7 (Voytek et al. 2015). I would refer the reader/reviewer to the standardized beta coefficients depicted in Figure 4A as a measure of effect size in the current study.

      Cellier, D., Riddle, J., Petersen, I., & Hwang, K. (2021). The development of theta and alpha neural oscillations from ages 3 to 24 years. Developmental cognitive neuroscience, 50, 100969.

      Cesnaite, E., Steinfath, P., Idaji, M. J., Stephani, T., Kumral, D., Haufe, S., ... & Nikulin, V. V. (2023). Alterations in rhythmic and non‐rhythmic resting‐state EEG activity and their link to cognition in older age. NeuroImage, 268, 119810.

      Hill, A. T., Clark, G. M., Bigelow, F. J., Lum, J. A., & Enticott, P. G. (2022). Periodic and aperiodic neural activity displays age-dependent changes across early-to-middle childhood. Developmental Cognitive Neuroscience, 54, 101076.

      Voytek, B., Kramer, M. A., Case, J., Lepage, K. Q., Tempesta, Z. R., Knight, R. T., & Gazzaley, A. (2015). Age-related changes in 1/f neural electrophysiological noise. Journal of Neuroscience, 35(38), 13257-13265.

      Also, if there are specific M/EEG sensors where the 1/f activity does relate strongly to age, it would be worth noting these, so future research could explore those sensors in more detail.

      I think it is difficult to make a clear claim about this for MEG data, as the exact location or type of the sensor may differ across manufacturers. Such a statement could be made more easily for source-projected data or if EEG electrodes were available, where the location would be standardized, e.g. according to the 10-20 system.

      DISCUSSION:

      Page 15: Please change the wording of the following sentence, as the way it is currently worded seems to suggest that the authors of the current manuscript have demonstrated this point (which I think is not the case): "The authors demonstrate that EEG typically integrates activity over larger volumes than MEG, resulting in differently shaped spectra across both recording methods."

      Apologies for the oversight! The reviewer is correct: it was not us but the authors of the cited manuscript who showed this. We corrected the sentence as suggested, stating now that:

      “Bénar et al. demonstrate that EEG typically integrates activity over larger volumes than MEG, resulting in differently shaped spectra across both recording methods.”

      Page 16: The authors mention the results can be sensitive to the application of SSS to clean the MEG data, but not ICA. I think it would be sensitive to the application of either SSS or ICA?

      This is correct and actually also supported by Figure S7, as differences in ICA thresholds also affect the detection of age-related effects. We therefore adjusted the related sentences, stating now that:

      “ In case of the MEG signal this may include the application of Signal-Space-Separation algorithms (SSS(24,55)), different thresholds for ICA component detection (see Figure S7), high and low pass filtering, choices during spectral density estimation (window length/type etc.), different parametrization algorithms (e.g. IRASA vs FOOOF) and selection of frequency ranges for the aperiodic slope estimation.”

      It would be worth clarifying that the linked mastoid re-reference alone has been proposed to cancel out the ECG signal, rather than that a linked-mastoid re-reference improves the performance of the ICA separation (which could be inferred by the explanation as it's currently written).

      This is correct and we adjusted the sentence accordingly, stating now that:

      “ Previous work(12,56) has shown that a linked mastoid reference alone was particularly effective in reducing the impact of ECG related activity on aperiodic activity measured using EEG. “

      The issue of the number of EEG channels could probably just be noted as a potential limitation, as could the issue of neural activity being mixed into the ECG component (although this does pose a potential confound to the M/EEG without ECG condition, I suspect it wouldn't be critical).

      This is indeed a very fair point as a higher amount of electrodes would probably make it easier to better isolate ECG components in the EEG, which may be the reason why the separation did not work so well in our case. However, this is ultimately an empirical question so we highlighted it in the discussion section stating that: “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources. ”

      OUTLOOK:

      Page 19: Although there has been a recent trend to control for 1/f activity when examining oscillatory power, recent research suggests that this should only be implemented in specific circumstances, otherwise the correction causes more of a confound than the issue does. It might be worth considering this point with regards to the final recommendation in the Outlook section: Brake, N., Duc, F., Rokos, A., Arseneau, F., Shahiri, S., Khadra, A., & Plourde, G. (2024). A neurophysiological basis for aperiodic EEG and the background spectral trend. Nature Communications, 15(1), 1514.

      We want to thank the reviewer for recommending this very interesting paper! The authors of said paper present compelling evidence showing that, while peak detection above an aperiodic trend using methods like FOOOF or IRASA is a prerequisite to determine the presence of oscillatory activity, it’s not necessarily straightforward to determine which detrending approach should be applied to determine the actual power of an oscillation. Furthermore, the authors suggest that wrongfully detrending may cause larger errors than not detrending at all. We therefore added a sentence stating that: “However, whether or not periodic activity (after detection) should be detrended using approaches like FOOOF or IRASA still remains disputed, as incorrectly detrending the data may cause larger errors than not detrending at all(75).”

      RECOMMENDATIONS:

      Page 20: "measure and account for" seems like it's missing a word, can this be re-written so the meaning is more clear?

      Done as suggested. The sentence now states: “To better disentangle physiological and neural sources of aperiodic activity, we propose the following steps to (1) measure and (2) account for physiological influences.”

      I would re-phrase "doing an ICA" to "reducing cardiac artifacts using ICA" (this wording could be changed in other places also).

      I do not like to describe cardiac or ocular activity as artifactual per se. This is also why I used hyphens whenever I mention the word “artifact” in association with the ECG or EOG. However, I do understand that the wording of “doing an ICA” is a bit sloppy. We therefore reworded it accordingly throughout the manuscript to e.g. “separating cardiac from neural sources using an ICA” and “separating physiological from neural sources using an ICA”.

      I would additionally note that even if components are identified as unambiguously cardiac, it is still likely that neural activity is mixed in, and so either subtracting or leaving the component will both be an issue (https://doi.org/10.1101/2024.06.06.597688). As such, even perfect identification of whether components are cardiac or not would still mean the issue remains (and this issue is also consistent across a considerable range of component based methods). Furthermore, current methods including wavelet transforms on the ICA component still do not provide good separation of the artifact and neural activity.

      This is definitely a fair point and we also highlight this in our recommendations under 3 stating that:

      “However, separating physiological from neural sources using an ICA is no guarantee that peripheral physiological activity is fully removed from the cortical signal. Even more sophisticated ICA based methods that e.g. apply wavelet transforms on the ICA components may still not provide a good separation of peripheral physiological and neural activity76,77. This turns the process of deciding whether or not an ICA component is e.g. either reflective of cardiac or neural activity into a challenging problem. For instance, when we only extract cardiac components using relatively high detection thresholds (e.g. r > 0.8), we might end up misclassifying residual cardiac activity as neural. In turn, we can’t always be sure that using lower thresholds won’t result in misinterpreting parts of the neural effects as cardiac. Both ways of analyzing the data can potentially result in misconceptions.”

      Castellanos, N. P., & Makarov, V. A. (2006). Recovering EEG brain signals: Artifact suppression with wavelet enhanced independent component analysis. Journal of neuroscience methods, 158(2), 300-312.

      Bailey, N. W., Hill, A. T., Godfrey, K., Perera, M. P. N., Rogasch, N. C., Fitzgibbon, B. M., & Fitzgerald, P. B. (2024). EEG is better when cleaning effectively targets artifacts. bioRxiv, 2024-06.

      METHODS:

      Pre-processing, page 24: I assume the symmetric setting of fastica was used (rather than the deflation setting), but this should be specified.

      Indeed, the reviewer is correct. We used the standard setting of FastICA as implemented in MNE-Python, which calls the FastICA implementation in sklearn; by default this uses the “parallel” (symmetric) algorithm to compute the ICA. We added this information to the text accordingly, stating that:

      “For extracting physiological “artifacts” from the data, 50 independent components were calculated using the fastica algorithm(22) (implemented in MNE-Python version 1.2; with the parallel/symmetric setting; note: 50 components were selected for MEG for computational reasons for the analysis of EEG data no threshold was applied).”

      Temporal response functions, page 26: can the authors please clarify whether the TRF is computed against the ECG signal for each electrode or sensor independently, or if all electrodes/sensors are included in the analysis concurrently? I'm assuming it was computed for each electrode and sensor separately, since the TRF was computed in both the forward and backwards direction (perhaps the meaning of forwards and backwards could be explained in more detail also - i.e. using the ECG to predict the EEG signal, or using the EEG signal to predict the ECG signal?).

      A TRF can also be conceptualized as a multiple regression model over time lags. This means that we used all channels to compute the forward and backward models. In the case of the forward model we predicted the signal of the M/EEG channels in a multivariate regression model using the ECG electrode as predictor. In case of the backward model we predicted the ECG electrode based on the signal of all M/EEG channels. The forward model was used to depict the time window at which the ECG signal was encoded in the M/EEG recording, which appears at 0 time lags indicating volume conduction. The backward model was used to see how much information of the ECG was decodable by taking the information of all channels.

      We tried to further clarify this approach in the methods section stating that:

      “We calculated the same model in the forward direction (encoding model; i.e. predicting M/EEG data in a multivariate model from the ECG signal) and backward direction (decoding model; i.e. predicting the ECG signal using all M/EEG channels as predictors).”
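
      The forward and backward directions described above can be sketched as two time-lagged ridge regressions. This is a minimal NumPy illustration on synthetic data, assuming a simple closed-form ridge solution; the helper names (`lag_matrix`, `ridge_fit`), the lag range, and the regularization value are hypothetical and do not reproduce the authors' actual pipeline.

```python
import numpy as np

def lag_matrix(x, lags):
    """Stack time-shifted copies of x (n_times, n_features) column-wise, zero-padded."""
    n_times, n_feat = x.shape
    out = np.zeros((n_times, n_feat * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.zeros_like(x)
        if lag > 0:
            shifted[lag:] = x[:-lag]
        elif lag < 0:
            shifted[:lag] = x[-lag:]
        else:
            shifted[:] = x
        out[:, i * n_feat:(i + 1) * n_feat] = shifted
    return out

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ Y)

rng = np.random.default_rng(0)
n_times, n_channels = 2000, 8
ecg = rng.standard_normal((n_times, 1))        # synthetic stand-in for the ECG
mixing = rng.standard_normal((1, n_channels))  # instantaneous leakage into each channel
meeg = ecg @ mixing + 0.5 * rng.standard_normal((n_times, n_channels))

lags = list(range(-5, 6))  # lags in samples

# Forward (encoding) model: lagged ECG predicts all M/EEG channels jointly
X_fwd = lag_matrix(ecg, lags)
w_fwd = ridge_fit(X_fwd, meeg)                 # shape (n_lags, n_channels)
peak_lag = lags[int(np.argmax(np.abs(w_fwd).sum(axis=1)))]

# Backward (decoding) model: all lagged M/EEG channels predict the ECG
X_bwd = lag_matrix(meeg, lags)
w_bwd = ridge_fit(X_bwd, ecg)
r_decode = np.corrcoef((X_bwd @ w_bwd).ravel(), ecg.ravel())[0, 1]
```

      Because the synthetic leakage is instantaneous, the forward weights peak at lag 0, mirroring the volume-conduction signature described in the response. A real analysis would evaluate the decoding correlation on held-out data rather than in-sample.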

      Page 27: the ECG data was fit using a knee, but it seems the EEG and MEG data was not.

      Does this difference pose any potential confound to the conclusions drawn? (Having said this, Figure S4 suggests that a knee was perhaps tested in the M/EEG data, which should also be explained in the text.)

      This was indeed tested in a previous review round to ensure that our results do not depend on the presence or absence of a knee in the data. We therefore added Figure S4, but forgot to add a description in the text. We apologize for this oversight and have added a paragraph to Text S1 accordingly:

      “Using FOOOF(5), we also investigated the impact of different slope fitting options (fixed vs. knee model fits) on the aperiodic age relationship (see Supplementary Figure S4). The results that we obtained from these analyses using FOOOF offer converging evidence with our main analysis using IRASA.”
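
      For readers unfamiliar with the two fitting options, the following sketch writes out the aperiodic component in log10 power as FOOOF defines it: the fixed model is a straight line in log-log space, and the knee model adds a bend parameter. The parameter values below are arbitrary and purely illustrative.

```python
import numpy as np

def aperiodic_fixed(freqs, offset, exponent):
    """Fixed mode: log10 power = offset - exponent * log10(f)."""
    return offset - exponent * np.log10(freqs)

def aperiodic_knee(freqs, offset, knee, exponent):
    """Knee mode: log10 power = offset - log10(knee + f ** exponent)."""
    return offset - np.log10(knee + freqs ** exponent)

freqs = np.linspace(1.0, 100.0, 200)
fixed = aperiodic_fixed(freqs, offset=1.0, exponent=1.5)
knee_zero = aperiodic_knee(freqs, offset=1.0, knee=0.0, exponent=1.5)
knee_bent = aperiodic_knee(freqs, offset=1.0, knee=50.0, exponent=1.5)

# Gap between the two models: large at low frequencies, vanishing at high ones
gap = fixed - knee_bent
```

      With the knee parameter set to zero the two models coincide, which is why comparing fixed and knee fits is a natural robustness check for an estimated slope.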

      Page 32: my understanding of the result reported here is that cleaning with ICA provided better sensitivity to the effects of age on 1/f activity than cleaning with SSS. Is this accurate? I think this could also be reported in the main manuscript, as it will be useful to researchers considering how to clean their M/EEG data prior to analyzing 1/f activity.

      The reviewer is correct in stating that we detected slightly more “significant” effects overall when not additionally cleaning the data using SSS. However, I am a bit wary of recommending that the SSS maxfilter be omitted solely on this basis. It may well be that the larger number of effects (when not employing the SSS maxfilter) stems from other physiological sources (e.g. muscle activity) that are correlated with age and are removed when SSS maxfiltering is applied. I think that conditioning the decision to apply maxfilter on the number or size of effects may not be the best idea. Instead, I think that the applicability of maxfilter to research questions related to aperiodic activity should be the topic of additional methodological research. We therefore now write in Text S1:

      “Considering that we detected fewer and weaker aperiodic effects when using SSS maxfilter, is it advisable to omit maxfilter when analyzing aperiodic signals? We don’t think that we can make such a judgment based on our current results. This is because it is unclear whether the reduction of effects stems from an additional removal of peripheral information (e.g. muscle activity, which may be correlated with aging) or is induced by the SSS maxfiltering procedure itself. As the use of maxfilter in detecting changes of aperiodic activity has not, to our knowledge, been the subject of any analysis, we suggest that this should be the topic of additional methodological research.”

      Page 39, Figure S6 and Figure S8: Perhaps the caption could also briefly explain the difference between maxfilter set to false vs true? I might have missed it, but I didn't gain an understanding of what varying maxfilter would mean.

      Figure S6 shows the effect of ageing on the spectral slope averaged across all channels. Maxfilter set to false in panels A and B means that no maxfiltering using SSS was performed, whereas in panels C and D the data were additionally processed using the SSS maxfilter algorithm. We now describe this more clearly by writing in the caption:

      “Supplementary Figure S6: Age-related changes in aperiodic brain activity are most prominently explained by cardiac components irrespective of maxfiltering the data using signal space separation (SSS) or not. AC) Age was used to predict the spectral slope (fitted at 0.1-145 Hz) averaged across sensors at rest in three different conditions (ECG components not rejected [blue], ECG components rejected [orange], ECG components only [green]).”

    1. If a scholar writes as though a field of inquiry has sprung full blown from her own head, ignores an important text, or uses inferior sources uncritically, she invites skepticism about the quality of her work

      • If you don’t use good sources, or act as if the ideas are only yours, people won’t believe your work. This shows why using good sources is important. CH 6, page 59
  7. inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net
    1. According to economic theory, families with higher incomes are better able to purchase or produce important "inputs" into their young children's development - for example, nutritious meals, enriched home learning environments and child-care settings outside the home, and safe and stimulating neighborhood environments.4

      This line really shows how money doesn’t just change what you can buy—it changes your whole environment. From healthier food to better neighborhoods, wealth gives kids a head start before they even get to school. It’s kinda sad how these “inputs” aren’t available to every child. It makes you realize just how unfair the system can be from the very beginning.

    1. pg 1 - A man's conscience has become the most impressive character of our theatrical season - Although More's conscience opposed the King's divorce and his role as head of the church, he was careful to say or do nothing disloyal - King Henry might have got More's head but not his soul

  8. inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net
    1. [Figure 16.1 How Schools Structure Inequality (“The Great Equalizer?”). The figure stacks factors by stage: social structure (racialized classism, generational wealth, housing segregation, school funding); employment, prenatal care, and adequate nutrition pre- and post-birth; elementary school (early tracking, gifted and talented programs, Head Start/Pre-K/kindergarten, quality early education, access to school trajectory, lags in literacy/numeracy); middle school (staffing and curriculum, college-preparatory trajectory, PSAT/standardized test awareness); high school (AP/IB or vocational education, paid standardized test preparation, first-generation college capital, work or extracurriculars); and affordability, affirmative action/legacy admission, and postgraduation connections/capital.]

      The illustration powerfully demonstrates that racialized classism, housing segregation, and generational wealth shape everything from admission to early learning programs to legacy college placement. The main problem arises when policies fixate on isolated interventions, such as test prep and AP access, that fail to address the system as a whole. The figure shows how the American Dream myth of open access to success contrasts with the existing educational system, in which children are set on largely predetermined paths before they even start kindergarten. It drew my attention to all the changes that must be implemented at various levels before genuine equality becomes achievable.

    2. A Head Start for Whom

      This is the vast chasm between federally supported access and privately engineered advantage. Jackson’s comparison between Head Start and elite preschool prep is haunting—it’s not just about access, it’s about the quality and trajectory that follow.

    3. Thankfully, poor children may have access to the federally funded Head Start program, but children of the wealthy have a different kind of head start.

      I completely agree with this. Wealthy children frequently have access to services that low-income students do not, such as parental academic help and college guidance. Most schools now offer college guidance, but wealthy families can still get extra help for their kids. While government programs like Head Start seek to help low-income students, the advantages that wealthier children have, such as better-funded schools, private tutoring, and extracurricular activities, continue to create a huge gap. Because of these differences, I doubt there will ever be a genuinely fair chance to succeed in education for both low- and high-income students.

    1. Author response:

      (This author response relates to the first round of peer review by Biophysics Colab. Reviews and responses to both rounds of review are available here: https://sciety.org/articles/activity/10.1101/2023.10.23.563601.)

      General Assessment:

      Pannexin (Panx) hemichannels are a family of heptameric membrane proteins that form pores in the plasma membrane through which ions and relatively large organic molecules can permeate. ATP release through Panx channels during the process of apoptosis is one established biological role of these proteins in the immune system, but they are widely expressed in many cells throughout the body, including the nervous system, and likely play many interesting and important roles that are yet to be defined. Although several structures have now been solved of different Panx subtypes from different species, their biophysical mechanisms remain poorly understood, including what physiological signals control their activation. Electrophysiological measurements of ionic currents flowing in response to Panx channel activation have shown that some subtypes can be activated by strong membrane depolarization or caspase cleavage of the C-terminus. Here, Henze and colleagues set out to identify endogenous activators of Panx channels, focusing on the Panx1 and Panx2 subtypes, by fractionating mouse liver extracts and screening for activation of Panx channels expressed in mammalian cells using whole-cell patch clamp recordings. The authors present a comprehensive examination with robust methodologies and supporting data that demonstrate that lysophospholipids (LPCs) directly activate Panx1 and Panx2 channels. These methodologies include channel mutagenesis, electrophysiology, ATP release and fluorescence assays, molecular modelling, and cryogenic electron microscopy (cryo-EM). Mouse liver extracts were initially used to identify LPC activators, but the authors go on to individually evaluate many different types of LPCs to determine those that are more specific for Panx channel activation. Importantly, the enzymes that endogenously regulate the production of these LPCs were also assessed along with other by-products that were shown not to promote pannexin channel activation. In addition, the authors used synovial fluid from canine patients, which is enriched in LPCs, to highlight the importance of the findings in pathology. Overall, we think this is likely to be a landmark study because it provides strong evidence that LPCs can function as activators of Panx1 and Panx2 channels, linking two established mediators of inflammatory responses and opening an entirely new area for exploring the biological roles of Panx channels. Although the mechanism of LPC activation of Panx channels remains unresolved, this study provides an excellent foundation for future studies and importantly provides clinical relevance.

      We thank the reviewers for their time and effort in reviewing our manuscript. Based on their valuable comments and suggestions, we have made substantial revisions. The updated manuscript now includes two new experiments supporting that lysophospholipid-triggered channel activation promotes the release of signaling molecules critical for immune response and demonstrates that this novel class of agonist activates the inflammasome in human macrophages through endogenously expressed Panx1. To better highlight the significance of our findings, we have excluded the cryo-EM panel from this manuscript. We believe these changes address the main concerns raised by the reviewers and enhance the overall clarity and impact of our findings. Below, we provide a point-by-point response to each of the reviewers’ comments.

      Recommendations:

      (1) The authors present a tremendous amount of data using different approaches, cells and assays along with a written presentation that is quite abbreviated, which may make comprehension challenging for some readers. We would encourage the authors to expand the written presentation to more fully describe the experiments that were done and how the data were analysed so that the 2 key conclusions can be more fully appreciated by readers. A lot of data is also presented in supplemental figures that could be brought into the main figures and more thoroughly presented and discussed.

      We appreciate and agree with the reviewers’ observation. Our initial manuscript may have been challenging to follow due to our use of both wild-type and GS-tagged versions of Panx1 from human and frog origins, combined with different fluorescence techniques across cell types. In this revision, we used only human wild-type Panx1 expressed in HEK293S GnTI- cells, except for activity-guided fractionation experiments, where we used GS-tagged Panx1 expressed in HEK293 cells (Fig. 1). For functional reconstitution studies, we employed YO-PRO-1 uptake assays, as optimizing the Venus-based assay was challenging. We have clarified these exceptions in the main text. We think these adjustments simplify the narrative and ensure an appropriate balance between main and supplemental figures.

      (2) It would also be useful to present data on the ion selectivity of Panx channels activated by LPC. How does this compare to data obtained when the channel is activated by depolarization? If the two stimuli activate related open states then the ion selectivity may be quite similar, but perhaps not if the two stimuli activate different open states. The authors' earlier work in eLife shows interesting shifts in reversal potentials (Vrev) when substituting external chloride with gluconate but not when substituting external sodium with N-methyl-D-glucamine, and these changed with mutations within the external pore of Panx channels. Related measurements comparing channels activated by LPC with membrane depolarization would be valuable for assessing whether similar or distinct open states are activated by LPC and voltage. It would be ideal to make Vrev measurements using a fixed step depolarization to open the channel and then various steps to more negative voltages to measure tail currents in pinpointing Vrev (a so-called instantaneous I-V).

      We fully agree with the reviewer on the importance of ion selectivity experiments. However, comparing the properties of LPC-activated channels with those activated by membrane depolarization presented technical challenges, as LPC appears to stimulate Panx1 in synergy with voltage. Prolonged LPC exposure destabilizes patches, complicating G-V curve acquisition and kinetic analyses. While such experiments could provide mechanistic insights, we think they are beyond the scope of the current study.
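
      For context, the reversal-potential estimate the reviewers describe (an instantaneous I-V built from tail currents) amounts to finding the voltage at which the tail current changes sign. The sketch below uses linear interpolation between the two bracketing points; the function name and the numbers are made up for illustration and are not data from the study.

```python
import numpy as np

def reversal_potential(voltages_mv, tail_currents):
    """Linearly interpolate the voltage at which the tail current crosses zero."""
    v = np.asarray(voltages_mv, dtype=float)
    i = np.asarray(tail_currents, dtype=float)
    # Indices where consecutive tail currents have opposite signs
    crossings = np.where(np.sign(i[:-1]) * np.sign(i[1:]) < 0)[0]
    if crossings.size == 0:
        raise ValueError("tail currents do not cross zero in the tested range")
    k = crossings[0]
    # Straight line through the two points that bracket zero current
    return v[k] - i[k] * (v[k + 1] - v[k]) / (i[k + 1] - i[k])

# Hypothetical tail currents (pA) for an ohmic conductance reversing at -20 mV
test_voltages = np.array([-90.0, -70.0, -50.0, -30.0, -10.0, 10.0])
tail_currents = 0.5 * (test_voltages + 20.0)
vrev = reversal_potential(test_voltages, tail_currents)
```

      Comparing such estimates obtained with and without LPC, or across ion substitutions, is what would indicate whether the two stimuli open states with similar selectivity.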

      (3) Data is presented for expression of Panx channels in different cell types (HEK293 vs HEK293S GnTI-) and different constructs (Panx1 vs Panx1-GS vs other engineered constructs). The authors have tried to be clear about what was done in each experiment, but it can be challenging for the reader to keep everything straight. The labelling in Fig 1E helps a lot, and we encourage the authors to use that approach systematically throughout. It would also help to clearly identify the cell type and channel construct whenever showing traces, like those in Fig 1D. Doing this systematically throughout all the figures would also make it clear where a control is missing. For example, if labelling for the type of cell was included in Fig 1D it would be immediately clear that a GnTI- vector alone control for WT Panx1 is missing as the vector control shown is for HEK cells and formally that is only a control for Panx2 and 3. Can the authors explain why LPC activates Panx1 overexpressed in HEK293 GnTI- cells but not in HEK293 cells? Is this purely a function of expression levels? If so, it would be good to provide that supporting information.

      As mentioned above, we believe our revised version is more straightforward to digest. We have improved labeling and provided explanations where necessary to clarify the manuscript. While Panx1 expression levels are indeed higher in GnTI- than in HEK293 cells, we are uncertain whether the absence of detectable currents in HEK293 cells is solely due to expression levels. Some post-translational modifications that inhibit Panx1, such as lysine acetylation, may also impact activity. Future studies are needed to explore these mechanisms further.

      (4) The mVenus quenching experiments are somewhat confusing in the way data are presented. In Fig 2B the y axis is labelled fluorescence (%) but when the channel is closed at time = 0 the value of fluorescence is 0 rather than 100 %, and as the channel opens when LPC is added the values grow towards 100 instead of towards 0 as iodide permeates and quenches. It would be helpful if these types of data could be presented more intuitively. Also, how was the initial rate calculated that is plotted in Fig 2C? It would be helpful to show how this is done in a figure panel somewhere. Why was the initial rate expressed as a percent maximum, what is the maximum and why are the values so low? Why is the effect of CBX so weak in these quenching experiments with Panx1 compared to other assays? This assay is used in a lot of experiments so anything that could be done to bolster confidence in what it reports on would be valuable to readers. Bringing in as many control experiments that have been done, including any that are already published, would be helpful.

      We modified the Y-axis in Figure 2 to “Quench (%)” for clarity. The data reflects fluorescence reduction over time, starting from LPC addition, normalized to the maximal decrease observed after Triton-X100 addition (3 minutes), enabling consistent quenching value comparisons. Although the quenching value appears small, normalization against complete cell solubilization provides reproducible comparisons. We do not fully understand why CBX effects vary in Venus quenching experiments, but we speculate that its steroid-like pentacyclic structure may influence the lysophospholipid agonistic effects. As noted in prior studies (DOI: 10.1085/jgp.201511505; DOI: 10.7554/eLife.54670), CBX likely acts as an allosteric modulator rather than a simple pore blocker, potentially contributing to these variations.
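
      One way to read the normalization described above is as the fluorescence drop since LPC addition, expressed as a percentage of the maximal drop reached after Triton X-100 addition. The sketch below assumes exactly that formula; the function name and the readings are hypothetical and not taken from the manuscript.

```python
import numpy as np

def quench_percent(fluorescence, f_start, f_triton):
    """Fluorescence decrease since LPC addition as a percentage of the maximal decrease."""
    f = np.asarray(fluorescence, dtype=float)
    return 100.0 * (f_start - f) / (f_start - f_triton)

# Hypothetical plate-reader values (arbitrary units) following LPC addition
trace = np.array([1000.0, 980.0, 950.0, 930.0])
# f_triton: assumed reading after complete solubilization with Triton X-100
quench = quench_percent(trace, f_start=trace[0], f_triton=200.0)
```

      Under this reading, the trace starts at 0% and rises as iodide enters and quenches, and values stay small whenever channel-mediated quenching is only a fraction of what full solubilization produces.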

      (5) Could the authors provide more information to help rationalize how Yo-Pro-1, which has a charge of +2, can permeate what are thought to be anion-favouring Panx channels? We appreciate that the biophysical properties of Panx channels remain mysterious, but it would help to hear a bit more about the authors' thinking. It might also help to cite other papers that have measured Yo-Pro-1 uptake through Panx channels. Was the Strep-tagged construct of Panx1 expressed in GnTI- cells and shown to be functional using electrophysiology?

      Our recent study suggests that the electrostatic landscape along the permeation pathway may influence the channel's ion selectivity (DOI: 10.1101/2024.06.13.598903). However, we have not yet fully elucidated how Panx1 permeates both anions and cations. Based on our findings, ion selectivity may vary with activation stimulus intensity and duration. Cation permeation through Panx1 is often demonstrated with YO-PRO-1, which measures uptake over minutes, unlike electrophysiological measurements conducted over milliseconds to seconds. We referenced two representative studies employing YO-PRO-1 to assess Panx1 activity. Whole-cell current measurements from a similar construct with an intracellular loop insertion indicate that our STREP-tagged construct likely retains functional capacity.

      (6) In Fig 5 panel C, data is presented as the ratio of LPC induced current at -60 mV to that measured at +110 mV in the absence of LPC. What is the rationale for analysing the data this way? It would be helpful to also plot the two values separately for all of the constructs presented so the reader can see whether any of the mutants disproportionately alter LPC induced current relative to depolarization activated current. Also, for all currents shown in the figures, the authors should include a dashed coloured line at zero current, both for the LPC activated currents and the voltage steps.

      We used the ratio of LPC-induced current to the current measured at +110 mV to determine whether any of the mutants disproportionately affect LPC-induced current relative to depolarization-activated current. Since the mutants that did not respond to LPC also exhibited smaller voltage-stimulated currents than those that did respond, we reasoned that using this ratio would better capture the information the reviewer is suggesting to gauge. Showing the zero current level may be helpful if the goal was to compare basal currents, which in our experience vary significantly from patch to patch. However, since we are comparing LPC- and voltage-induced currents within the same patch, we believe that including basal current measurements would not add useful information to our study.

      Given the new experiments included to further highlight the significance of the discovery of Panx1 agonists, we opted to separate the structure-based mechanistic studies from this manuscript and removed this experiment along with the docking and cryo-EM studies.

      (7) The fragmented NTD density shown in Fig S8 panel A may resemble either lipid density or the average density of both NTD and lipid. For example, Class7 and Class8 in Fig.S8 panel D displayed split densities, which may resemble a phosphate head group and two tails of lipid. A protomer mask may not be the ideal approach to separate different classes of NTD because as shown in Fig S8 panel D, most high-resolution features are located on TM1-4, suggesting that the classification was focused on TM1-4. A more suitable approach would involve using a smaller mask including NTD, TM1, and the neighbouring TM2 region to separate different NTD classes.

      We agree with the reviewer and attempted 3D classification using multiple smaller masks including the suggested region. However, the maps remained poorly defined, and we were unable to confidently assign the NTD.

      (8) The authors don’t discuss whether the LPC-bound structures display changes in the external part of the pore, which is the anion-selective filter and the narrower part of the pore. If there are no conformational changes there, then the present structures cannot explain permeability to large molecules like ATP. In this context, a plot for the pore dimension will be helpful to see differences along the pore between their different structures. It would also be clearer if the authors overlaid maps of protomers to illustrate differences at the NTD and the "selectivity filter."

      Both maps show that the narrowest constriction, formed by W74, has a diameter of approximately 9 Å. Previous steered molecular dynamics simulations suggest that ATP can permeate through such a constriction, implying an ion selection mechanism distinct from a simple steric barrier.

      (9) The time between the addition of LPC to the nanodisc-reconstituted protein and grid preparation is not mentioned. Dynamic diffusion of LPC could result in equal probabilities for the bound and unbound forms. This raises the possibility of finding the Primed state in the LPC-bound state as well. Additionally, can the authors rationalize how LPC might reach the pore region when the channel is in the closed state before the application of LPC?

      We appreciate the reviewer’s insight. We incubated LPC and nanodisc-reconstituted protein for 30 minutes, speculating that LPC approaches the pore similarly to other lipids in prior structures. In separate studies, we are optimizing conditions to capture more defined conformations.

      (10) In the cryo-EM map of the “resting” state (EMDB-21150), a part of the density was interpreted as NTD flipped to the intracellular side. This density, however, is poorly defined, and not connected to the S1 helix, raising concerns about whether this density corresponds to the NTD as seen in the “resting” state structure (PDB-ID: 6VD7). In addition, some residues in the C-terminus (after K333 in frog PANX1) are missing from the atomic model. Some of these residues are predicted by AlphaFold2 to form a short alpha helix and are shown to form a short alpha helix in some published PANX1 structures. Interestingly, in both the AF2 model and 6WBF, this short alpha helix is located approximately in the weak density that the authors suggest represents the “flipped” NTD. We encourage the authors to be cautious in interpreting this part as the “flipped” NTD without further validation or justification.

      We agree that the density corresponding to the NTD extended into the cytoplasm is relatively weak. In our recent study, we compared two Panx1 structures with or without the mentioned C-terminal helix and found evidence suggesting the likelihood of NTD extension (DOI: 10.1101/2024.06.13.598903). Nevertheless, to prevent potential confusion, we have removed the cryo-EM panel from this manuscript.

      (11) Since the authors did not observe densities of bound LPC in the cryo-EM map, it is important to acknowledge in the text the inherent limitations of using docking and mutagenesis methods to locate where LPC binds.

      Thank you for the suggestion. We have removed this section to avoid potential confusion.

      Optional suggestions:

      (1) The authors used MeOH to extract mouse liver for reversed-phase chromatography. Was the study designed to focus on hydrophobic compounds that likely bind to the TMD? Panx1 has both ECD and ICD with substantial sizes that could interact with water soluble compounds? Also, the use of whole-cell recordings to screen fractions would not likely identify polar compounds that interact with the cytoplasmic part of the TMD? It would be useful for the authors to comment on these aspects of their screen and provide their rationale for fractionating liver rather than other tissues.

      We have added a rationale in line 90, stating: “The soluble fractions were excluded from this study, as the most polar fraction induced strong channel activities in the absence of exogenously expressed pannexins.” Additionally, we have included a figure to support this rationale (Fig. S1A).

      (2) The authors show that LPCs reversibly increase inward currents at a holding voltage of -60 mV (not always specified in legends) in cells expressing Panx1 and 2, and then show families of currents activated by depolarizing voltage steps in the absence of LPC without asking what happens when you depolarize the membrane after LPC activation. If LPCs can be applied for long enough without disrupting recordings, it would be valuable to obtain both I-V relations and G-V relations before and after LPC activation of Panx channels. Does LPC disproportionately increase current at some voltages compared to others? Is the outward rectification reduced by LPC? Does Vrev remain unchanged (see point above)? It's hard to predict what would be observed, but almost any outcome from these experiments would suggest additional experiments to explore the extent to which the open states activated by LPC and depolarization are similar or distinct.

      Unfortunately, in our hands, the prolonged application of lysolipids at concentrations necessary to achieve significant currents tends to destabilize the patch. This makes it challenging to obtain G-V curves or perform the previously mentioned kinetic analyses. We believe this destabilization may be due to lysolipids’ surfactant-like qualities, which can disrupt the giga seal. Additionally, prolonged exposure seems to cause channel desensitization, which could be another confounding factor.

      (3) From the results presented, the authors cannot rule out that mutagenesis-induced insensitivity of Panx channels to LPCs results from allosteric perturbations in the channels rather than direct binding/gating by LPCs. In Fig 5 panel A-C, the authors introduced double mutants on TM1 and TM2 to interfere with LPC binding; however, the double mutants may also disrupt the interaction network formed within NTD, TM1, and TM2. This disruption could potentially rearrange the conformation of NTD, favouring the resting closed state. Three double Asn mutants, which abolished LPC-induced current, also exhibited lower currents through voltage activation in Fig. S5, raising the possibility that the mutant channels fail to activate in response to LPC due to an increased energy barrier. One way to gain further insight would be to mutate residues in NTD that interact with those substituted by the three double Asn mutants and to measure currents from both voltage activation and LPC activation. Such results might help to elucidate whether the three double Asn mutants interfere with LPC binding. It would also be important to show that the voltage-activated currents in Fig. S5 are sensitive to CBX.

      Thank you for the comment, with which we agree. Our initial intention was to use the mutagenesis studies to experimentally support the docking study. Due to uncertainties associated with the presented cryo-EM maps, we have decided to remove this study from the current manuscript. We will consider the proposed experiments in a future study.

      (4) Could the authors elaborate on how LPC opens Panx1 by altering the conformation of the NTDs in an uncoordinated manner, going from the “primed” state to the “active” state? In the “primed” state, the NTDs seem to be ordered by forming interactions with the TMD, thus resulting in the largest (possible?) pore size around the NTDs. In contrast, in the “active” state, the authors suggest that the NTDs are fragmented as a result of uncoordinated rearrangement, which conceivably will lead to a reduction in pore size around the NTDs (is that correct?). It is therefore not intuitive why a conformation with a smaller pore size represents an “active” state.

      We believe the uncoordinated arrangement of NTDs is dynamic, allowing for potential variations in pore size during the activated conformation. Alternatively, NTD movement may be coupled with conformational changes in TM1 and the extracellular domain, which in turn could alter the electrostatic properties of the permeation pathway. We believe a functional study exploring this mechanism would be more appropriately presented as a separate study.

      (5) Can the authors provide a positive control for these negative results presented in Fig S1B and C?

      The positive results are presented in Fig. 1D and E.

      (6) Raw images in Fig S6 and Fig S7 should contain units of measurement.

      Thank you for pointing this out.

      (7) It may be beneficial to show the superposition between primed state and activated state in both protomer and overall structure. In addition, superposition between primed state and PDB 7F8J.

      We attempted to superimpose the cryo-EM maps; however, visually highlighting the differences in figure format proved challenging. Higher-resolution maps would allow for model building, which would more effectively convey these distinctions.

      (8) Including particles number in each class in Fig S8 panel C and D would help in evaluating the quality of classification.

      Noted.

      (9) A table for cryo-EM statistics should be included.

      Thanks, noted.

      (10) n values are often provided as a range within legends but it would be better to provide individual values for each dataset. In many figures you can see most of the data points, which is great, but it would be easy to add n values to the plots themselves, perhaps in parentheses above the data points.

      While we agree that transparency is essential, adding n-values to each graph would make some figures less clear and potentially harder to interpret in this case. We believe that the dot plots, n-value range, and statistical analysis provide adequate support for our claims.

      (11) The way caspase activation of Panx channels is presented in the introduction could be viewed as dismissive or inflammatory for those who have studied that mechanism. We think the caspase activation literature is quite convincing and there is no need to be dismissive when pointing out that there are good reasons to believe that other mechanisms of activation likely exist. We encourage you to revise the introduction accordingly.

      Thank you for this comment. Although we intended to support the caspase activation mechanism in our introduction, we understand that the reviewer’s interpretation indicates a need for clarification. We hope the revised introduction removes any perception of dismissiveness.

      (12) Why is the patient data in Fig 4F normalized differently than everything else? Once the above issues with mVenus quenching data are clarified, it would be good to be systematic and use the same approach here.

      For Fig. 4F, we used a distinct normalization method to account for substantial day-to-day variation in experiments involving body fluids. Notably, we did not apply this normalization to other experimental panels due to their considerably lower day-to-day variation.

      (13) What was the rationale for using the structure from ref 35 in the docking task?

      The docking task utilized the human orthologue with a flipped-up NTD. We believe that this flipped-up conformation is likely the active form that responds to lysolipids. As our functional experiments primarily use the human orthologue for biological relevance, this structure choice is consistent. Our docking data shows that LPC does not dock at this site when using a construct with the downward-flipped NTD.

      (14) Perhaps better to refer to double Asn ‘substitutions’ rather than as ‘mutations’ because that makes one think they are Asn in the wt protein.

      Done.

      (15) From Fig S1, we gather that Panx2 is much larger than Panx1 and 3. If that is the case, it's worth noting that to readers somewhere.

      We have added the molecular weight of each subtype in the figure legend.

      (16) Please provide holding voltages and zero current levels in all figures presenting currents.

      We provided holding voltages. However, the zero current levels vary among the examples presented, making direct comparisons difficult. Since we are comparing currents with and without LPC, we believe that indicating zero current levels is unnecessary for this study.

      (17) While the authors successfully establish lysophospholipid-gating of Panx1 and Panx2, Panx3 appears unaffected. It may be advisable to be more specific in the title of the article.

      We are uncertain whether Panx3 is unaffected by lysophospholipids, as we have not observed activation of this subtype under any tested conditions.

  9. inst-fs-iad-prod.inscloudgate.net
    1. Despite this consensus Americans disagree intensely about the education policies that will best help us achieve this dual goal. In recent years disputes over educational issues have involved all the branches and levels of government and have affected millions of students. The controversies - over matters like school funding, vouchers, bilingual education, high-stakes testing, desegregation, and creationism - seem, at first glance, to be separate problems. In important ways, however, they all reflect contention over the goals of the American dream. At the core of debates over one policy or another has often been a conflict between what is (or seems to be) good for the individual and what is good for the whole; sometimes the conflict revolves around an assault on the validity of the dream itself by certain groups of people. Because education is so important to the way the American dream works, people care about it intensely and can strongly disagree about definitions, methods, and priorities. Sustained and serious disagreements over education policy can never be completely resolved because they spring from a fundamental paradox at the heart of the American dream. Most Americans believe that everyone has the right to pursue success but that only some deserve to win, based on their talent, effort, or ambition. The American dream is egalitarian at the starting point in the "race of life," but not at the end. That is not the paradox; it is simply an ideological choice. The paradox stems from the fact that the success of one generation depends at least partly on the success of their parents or guardians. People who succeed get to keep the fruits of their labor and use them as they see fit; if they buy a home in a place where the schools are better, or use their superior resources to make the schools in their neighborhood better, their children will have a head start and other children will fall behind through no fault of their own. The paradox lies in the fact that schools are supposed to equalize opportunities across generations and to create democratic citizens out of each generation, but people naturally wish to give their own children an advantage in attaining wealth or power, and some can do it. When they do, everyone does not start equally, politically or economically. This circle cannot be squared. Many issues in education policy have therefore come down to an apparent choice between the individual success of comparatively privileged students and the collective good of all students or the nation as a whole. Efforts to promote the collective goals of the American dream through public schooling have run up against almost insurmountable barriers when enough people believe (rightly or wrongly, with evidence or without) that those efforts will endanger the comparative advantage of their children or children like them. At that point a gap

      This week's reading showed me that the relationship between American education and the American Dream generates many conflicts. The “paradox” concept resonated with me: Americans claim to support equal opportunity, yet parents with extra resources give their children advantages that undermine collective efforts to level the playing field. It made me consider how difficult it is for governments to design programs that promote individual growth while also serving the interests of society. The reading shows that the basic consensus about the value of education in America remains strong, while people disagree fiercely about education's ultimate objectives, and those disagreements stem from struggles over equality, social position, and national sentiment. The material connects directly to my own interest in how school systems promote or prevent fairness in education.

    2. Sustained and serious disagreements over education policy can never be completely resolved because they spring from a fundamental paradox at the heart of the American dream. Most Americans believe that everyone has the right to pursue success but that only some deserve to win, based on their talent, effort, or ambition. The American dream is egalitarian at the starting point in the "race of life," but not at the end. That is not the paradox; it is simply an ideological choice. The paradox stems from the fact that the success of one generation depends at least partly on the success of their parents or guardians. People who succeed get to keep the fruits of their labor and use them as they see fit; if they buy a home in a place where the schools are better, or use their superior resources to make the schools in their neighborhood better, their children will have a head start and other children will fall behind through no fault of their own. The paradox lies in the fact that schools are supposed to equalize opportunities across generations and to create democratic citizens out of each generation, but people naturally wish to give their own children an advantage in attaining wealth or power, and some can do it. When they do, everyone does not start equally, politically or economically. This circle cannot be squared

      This paragraph highlights the contradiction in the American dream as it relates to education and inequality. It explains that while Americans value equal opportunity, success is often inherited rather than earned. Wealth and privilege allow some families to give their kids educational advantages, like better schools and more resources than other kids have.

    3. The paradox stems from the fact that the success of one generation depends at least partly on the success of their parents or guardians. People who succeed get to keep the fruits of their labor and use them as they see fit; if they buy a home in a place where the schools are better, or use their superior resources to make the schools in their neighborhood better, their children will have a head start and other children will fall behind through no fault of their own. The paradox lies in the fact that schools are supposed to equalize opportunities across generations and to create democratic citizens out of each generation, but people naturally wish to give their own children an advantage in attaining wealth or power, and some can do it. When they do, everyone does not start equally, politically or economically. This circle cannot be squared.

      While the American dream colloquially seems to promise equal opportunity, reality shows that a large factor in success is inheritance. This undermines the idea that merit shapes outcomes and suggests that systemic advantages and disadvantages are prevalent.

    1. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      This valuable study uses consensus-independent component analysis to highlight transcriptional components (TC) in high-grade serous ovarian cancers (HGSOC). The study presents a convincing preliminary finding by identifying a TC linked to synaptic signaling that is associated with shorter overall survival in HGSOC patients, highlighting the potential role of neuronal interactions in the tumour microenvironment. This finding is corroborated by comparing spatially resolved transcriptomics in a small-scale study; a weakness is in being descriptive, non-mechanistic, and requiring experimental validation.

      We sincerely thank the editors for their valuable and constructive feedback. We are grateful for the recognition of our findings and the importance of identifying transcriptional components in high-grade serous ovarian cancers.

      We acknowledge the editors’ observation regarding the descriptive nature of our study and its limited mechanistic depth. We agree that additional experimental validation would further strengthen our conclusions. We are planning and executing the experiments for a future study to provide mechanistic insights into the associations found in this study. In addition, recent reviews focused on the emerging field of cancer neuroscience emphasize that the field is at an early stage, specifically in terms of a mechanistic understanding of the contributions of tumor-infiltrating nerves to tumor initiation and progression (Amit et al., 2024; Hwang et al., 2024). Nonetheless, we wish to emphasize that emerging mechanistic preclinical studies have demonstrated the influence of tumour-infiltrating nerves on disease progression (Allen et al., 2018; Balood et al., 2022; Darragh et al., 2024; Globig et al., 2023; Jin et al., 2022; Restaino et al., 2023; Zahalka et al., 2017). Several of these studies include contributions from our co-authors and feature in vitro and in vivo research on head and neck squamous cell carcinoma as well as high-grade serous ovarian carcinoma samples. This study further strengthens the preclinical work by showing, in patient data, the potential relevance of neuronal signaling to disease outcome.

      For instance, Restaino et al. (2023) demonstrated that substance P, released from tumour-infiltrating nociceptors, potentiates MAP kinase signaling in cancer cells, thereby driving disease progression. Crucially, this effect was shown to be reversible in vivo by blocking the substance P receptor (Restaino et al., 2023). These findings offer compelling evidence of the role of tumour innervation in cancer biology.

      Our current study in tumor samples of patients with high-grade serous ovarian cancer identifies a transcriptional component that is enriched for genes for which the protein is located in the synapse. We believe that the previously published mechanistic insights support our findings and suggest that this transcriptional component could serve as a valuable screening tool to identify innervated tumours based on bulk transcriptomes. Clinically, this information is highly relevant, as patients with innervated tumours may benefit from alternate therapeutic strategies targeting these innervations.

      Reviewer #1 (Public review)

      This manuscript explores the transcriptional landscape of high-grade serous ovarian cancer (HGSOC) using consensus-independent component analysis (c-ICA) to identify transcriptional components (TCs) associated with patient outcomes. The study analyzes 678 HGSOC transcriptomes, supplemented with 447 transcriptomes from other ovarian cancer types and noncancerous tissues. By identifying 374 TCs, the authors aim to uncover subtle transcriptional patterns that could serve as novel drug targets. Notably, a transcriptional component linked to synaptic signaling was associated with shorter overall survival (OS) in patients, suggesting a potential role for neuronal interactions in the tumour microenvironment. Given notable weaknesses like lack of validation cohort or validation using another platform (other than the 11 samples with ST), the data is considered highly descriptive and preliminary.

      Strengths:

      (1) Innovative Methodology:

      The use of c-ICA to dissect bulk transcriptomes into independent components is a novel approach that allows for the identification of subtle transcriptional patterns that may be overshadowed in traditional analyses.

      We thank the reviewer for recognizing the strengths and novelty of our study. We appreciate the positive feedback on using consensus-independent component analysis (c-ICA) to decompose bulk transcriptomes, which allowed us to detect subtle transcriptional signals often overlooked in traditional analyses.

      (2) Comprehensive Data Integration:

      The study integrates a large dataset from multiple public repositories, enhancing the robustness of the findings. The inclusion of spatially resolved transcriptomes adds a valuable dimension to the analysis.

      We thank the reviewer for recognizing the robustness of our study through comprehensive data integration. We appreciate the acknowledgment of our efforts to leverage a large, multi-source dataset, as well as the additional insights gained from spatially resolved transcriptomes. We consider that this integrative approach enhances the depth of our analysis and contributes to a more nuanced understanding of the tumour microenvironment.

      (3) Clinical Relevance:

      The identification of a synaptic signaling-related TC associated with poor prognosis highlights a potential new avenue for therapeutic intervention, emphasizing the role of the tumour microenvironment in cancer progression.

      We appreciate the recognition of the clinical implications of our findings. The identification of a synaptic signaling-related transcriptional component associated with poor prognosis underscores the potential for novel therapeutic targets within the tumour microenvironment. We agree that this insight could open new avenues for intervention and further highlights the role of neuronal interactions in cancer progression.

      Weaknesses:

      (1) Mechanistic Insights:

      While the study identifies TCs associated with survival, it provides limited mechanistic insights into how these components influence cancer progression. Further experimental validation is necessary to elucidate the underlying biological processes.

      We acknowledge the point regarding the limited mechanistic insights provided in our study. We agree that further experimental validation would significantly enhance our understanding of how the biological processes captured by these transcriptional components influence cancer progression. We are planning and executing the experiments for a future study to provide mechanistic insights into the associations found in this study.

      Our analyses were performed on publicly available bulk and spatially resolved expression profiles. To investigate the mechanistic insights in future studies, we plan to integrate spatial transcriptomic data with immunohistochemical analysis of the same tumour samples to validate our findings. Additionally, we have initiated efforts to set up in vitro co-cultures of neurons and ovarian cancer cells. These co-cultures will enable us to investigate how synaptic signaling impacts ovarian cancer cell behavior.

      (2) Generalizability:

      The findings are primarily based on transcriptomic data from HGSOC. It remains unclear how these results apply to other subtypes of ovarian cancer or different cancer types.

      To respond to this remark, we utilized survival data from Bolton et al. (2022) and TCGA to investigate associations between TC activity scores and overall survival of patients with ovarian clear cell carcinoma, the second most common subtype of epithelial ovarian cancer, and of patients with other cancer types, respectively. However, we acknowledge the limitations of TCGA survival data, as highlighted in the referenced article (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8726696/). Additionally, as shown in Figure 5, we provided evidence of TC121 activity across various cancer types, suggesting broader relevance. For the results of the analyses mentioned above, please refer to our response to remark 1.3 of the recommendation section (page 4).

      (3) Innovative Methodology:

      Requires more validation using different platforms (IHC) to validate the performance of this bulk-derived data. Also, the lack of control over data quality is a concern.

      We acknowledge the value of validating our results with alternative platforms such as IHC. We are planning and executing the experiments for a future study to provide mechanistic insights into the associations found in this study.

      Regarding data quality control, we implemented the following measures to ensure the reliability of our analysis:

      Bulk Transcriptional Profiles: To assess data quality, we conducted principal component analysis (PCA) on the sample Pearson product-moment correlation matrix. The first principal component (PCqc), which explains approximately 80-90% of the variance, was used to distinguish technical variability from biological signals (Bhattacharya et al., 2020). Samples with a correlation coefficient below 0.8 relative to PCqc were identified as outliers and excluded. Additionally, MD5 hash values were generated for each CEL file to identify and remove duplicate samples. Expression values were standardized to a mean of zero and a variance of one for each gene to minimize probeset- or gene-specific variability across datasets (GEO, CCLE, GDSC, and TCGA).
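
      The bulk quality-control steps described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the function names (`pcqc_keep_mask`, `drop_duplicates_by_md5`, `standardize_genes`) are hypothetical, in-memory byte strings stand in for CEL files, and the reading of "correlation relative to PCqc" as eigenvector times the square root of the eigenvalue is an assumption about how that correlation is computed.

```python
import hashlib
import numpy as np


def pcqc_keep_mask(expr, threshold=0.8):
    """Keep samples whose correlation with the first principal component
    of the sample-by-sample Pearson correlation matrix is >= threshold.

    expr: array of shape (n_genes, n_samples).
    Returns (keep_mask, fraction of variance explained by that component).
    """
    corr = np.corrcoef(expr, rowvar=False)       # samples x samples
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    top_val, top_vec = eigvals[-1], eigvecs[:, -1]
    # For PCA on a correlation matrix, eigenvector * sqrt(eigenvalue) is the
    # correlation of each sample with the component (assumed meaning of PCqc).
    sample_corr = top_vec * np.sqrt(top_val)
    if sample_corr.sum() < 0:                    # eigenvector sign is arbitrary
        sample_corr = -sample_corr
    return sample_corr >= threshold, top_val / eigvals.sum()


def drop_duplicates_by_md5(files):
    """files: dict of name -> raw bytes. Keeps the first file per MD5 digest."""
    seen, kept = set(), []
    for name, data in files.items():
        digest = hashlib.md5(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(name)
    return kept


def standardize_genes(expr):
    """Scale each gene (row) to mean 0 and variance 1 across samples."""
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    return (expr - mean) / std


# Synthetic example: nine samples share a common profile, one is pure noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 1))
good = shared + 0.3 * rng.normal(size=(500, 9))
outlier = rng.normal(size=(500, 1))
expr = np.hstack([good, outlier])

keep, explained = pcqc_keep_mask(expr)
kept_files = drop_duplicates_by_md5({"a.CEL": b"x", "b.CEL": b"y", "c.CEL": b"x"})
scaled = standardize_genes(expr[:, keep])
```

      In this toy setup the noise-only sample falls below the 0.8 cut-off and is dropped, while the byte-identical third file is removed as a duplicate. Real data would also need handling for genes with zero variance before standardization.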

      Spatial Transcriptional Profiles: PCA was also applied to spatial transcriptomic data for quality control. Only samples with consistent loading factor signs for the first principal component across all individual spot profiles were retained. Samples failing this criterion were excluded from further analyses.
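
      One possible reading of the sign-consistency criterion is sketched below, assuming the "loading factor" is each spot's entry in the first eigenvector of the spot-by-spot correlation matrix of one sample. The function name `pc1_signs_consistent` and the synthetic data are hypothetical; the response does not specify the exact computation.

```python
import numpy as np


def pc1_signs_consistent(spots):
    """Return True if every spot loads on the first principal component
    with the same sign.

    spots: array of shape (n_genes, n_spots) for a single sample.
    """
    corr = np.corrcoef(spots, rowvar=False)      # spots x spots
    _, eigvecs = np.linalg.eigh(corr)            # eigenvalues in ascending order
    pc1 = eigvecs[:, -1]                         # eigenvector of largest eigenvalue
    return bool(np.all(pc1 > 0) or np.all(pc1 < 0))


# Synthetic example: eight spots sharing one expression profile.
rng = np.random.default_rng(1)
shared = rng.normal(size=(300, 1))
coherent = shared + 0.3 * rng.normal(size=(300, 8))

# Same sample, but half of the spots are inverted (anti-correlated with the rest).
mixed = coherent.copy()
mixed[:, 4:] *= -1

coherent_ok = pc1_signs_consistent(coherent)
mixed_ok = pc1_signs_consistent(mixed)
```

      Under this reading, the coherent sample would be retained and the sample with inverted spots would be excluded.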

      (4) Clinical Application:

      Although the study suggests potential drug targets, the translation of these findings into clinical practice is not addressed. Probably given the lack of some QA/QC procedures it'll be hard to translate these results. Future studies should focus on validating these targets in clinical settings.

      Regarding clinical applications, we acknowledge the importance of further exploring strategies targeting synaptic signaling and neurotransmitter release in the tumour microenvironment (TME). As partially discussed in the first version of the manuscript, drugs such as ifenprodil and lamotrigine—commonly used to treat neuronal disorders—can block glutamate release, thereby inhibiting subsequent synaptic signaling. Additionally, the vesicular monoamine transporter (VMAT) inhibitor reserpine blocks the formation of synaptic vesicles (Reid et al., 2013; Williams et al., 2001). Previous in vitro studies with HGSOC cell lines demonstrated that ifenprodil significantly reduced cancer cell proliferation, while reserpine triggered apoptosis in cancer cells (North et al., 2015; Ramamoorthy et al., 2019). The findings highlight the potential of such approaches to disrupt synaptic neurotransmission in the TME.

      To address potential translation of our findings into clinical practice more comprehensively, we have included additional details in the manuscript:

      Section discussion, page 16, lines 338-341:

      “This interaction can be targeted with pan-TRK inhibitors such as entrectinib and larotrectinib. Both drugs are showing promising results in multiple phase II trials, including ovarian cancer and breast cancer patients. Furthermore, a TRKB-specific inhibitor was developed (ANA-12), but has not been subjected to any clinical trials in cancer so far (Ardini et al., 2016; Burris et al., 2015; Drilon et al., 2018, 2017).”

      On page 17, lines 361-374:

      “Strategies to disrupt neuronal signaling and neurotransmitter release in neurons target key elements of excitatory neurotransmission, such as calcium flux and vesicle formation. Drugs like ifenprodil and lamotrigine, commonly used to treat neuronal disorders, block glutamate release and subsequent neuronal signaling. Additionally, the vesicular monoamine transporter (VMAT) inhibitor reserpine prevents synaptic vesicle formation (Reid et al., 2013; Williams, 2001). In vitro studies with HGSOC cell lines have demonstrated that ifenprodil significantly inhibits tumour proliferation, while reserpine induces apoptosis in cancer cells (North et al., 2015; Ramamoorthy et al., 2019). These approaches hold promise for inhibiting neuronal signaling and interactions in the TME.”

      Reviewer #2 (Public review):

      Summary:

      Consensus-independent component analysis and closely related methods have previously been used to reveal components of transcriptomic data that are not captured by principal component or gene-gene coexpression analyses.

      Here, the authors asked whether applying consensus-independent component analysis (c-ICA) to published high-grade serous ovarian cancer (HGSOC) microarray-based transcriptomes would reveal subtle transcriptional patterns that are not captured by existing molecular omics classifications of HGSOC.

      Statistical associations of these (hitherto masked) transcriptional components with prognostic outcomes in HGSOC could lead to additional insights into underlying mechanisms and, coupled with corroborating evidence from spatial transcriptomics, are proposed for further investigation.

      This approach is complementary to existing transcriptomics classifications of HGSOC.

      The authors have previously applied the same approach in colorectal carcinoma (Knapen et al. (2024) Commun. Med).

      Strengths:

      (1) Overall, this study describes a solid data-driven description of c-ICA-derived transcriptional components that the authors identified in HGSOC microarray transcriptomics data, supported by detailed methods and supplementary documentation.

      We thank the reviewer for acknowledging the strength of our data-driven approach and the use of consensus-independent component analysis (c-ICA) to identify transcriptional components within HGSOC microarray data. We aimed to provide comprehensive methodological detail and supplementary documentation to support the reproducibility and robustness of our findings. We believe this approach allows for the identification of subtle transcriptional signals that might have been overlooked by traditional analysis methods.

      (2) The biological interpretation of transcriptional components is convincing based on (data-driven) permutation analysis and a suite of analyses of association with copy-number, gene sets, and prognostic outcomes.

      We appreciate the positive feedback on the biological interpretation of our transcriptional components. We are pleased that our approach, which includes data-driven permutation testing and analyses of associations with copy-number alterations, gene sets, and prognostic outcomes, was found to be convincing. These analyses were integral to enhancing our findings’ robustness and biological relevance.

      (3) The resulting annotated transcriptional components have been made available in a searchable online format.

      Thank you for this important positive remark.

      (4) For the highlighted transcriptional component which has been annotated as related to synaptic signalling, the detection of the transcriptional component among 11 published spatial transcriptomics samples from ovarian cancers appears to support this preliminary finding and requires further mechanistic follow-up.

      Thank you for acknowledging the accessibility of our annotated transcriptional components. We prioritized making these data available in a searchable online format to facilitate further research and enable the community to explore and validate our findings.

      Weaknesses:

      (1) This study has not explicitly compared the c-ICA transcriptional components to the existing reported transcriptional landscape and classifications for ovarian cancers (e.g. Smith et al Nat Comms 2023; TCGA Nature 2011; Engqvist et al Sci Rep 2020) which would enable a further assessment of the additional contribution of c-ICA - whether the cICA approach captured entirely complementary components, or whether some components are correlated with the existing reported ovarian transcriptomic classifications.

      We acknowledge the reviewer’s insightful suggestion to compare our c-ICA-derived transcriptional components with previously reported ovarian cancer classifications, such as those from Smith et al. (2023), TCGA (2011), and Engqvist et al. (2020). To address this, we incorporated analyses comparing the activity scores of our transcriptional components with these published landscapes and classifications, particularly focusing on any associations with overall survival. Additionally, we evaluated correlations between gene signatures from a subset of these studies and our identified TCs, enhancing our understanding of the unique contributions of the c-ICA approach. Please refer to our response to remark 10 for the results of these analyses.

      (2) Here, the authors primarily interpret the c-ICA transcriptional components as a deconvolution of bulk transcriptomics due to the presence of cells from tumour cells and the tumour microenvironment.

      However, c-ICA is not explicitly a deconvolution method with respect to cell types: the transcriptional components do not necessarily correspond to distinct cell types, and may reflect differential dysregulation within a cell type. This application of c-ICA for the purpose of data-driven deconvolution of cell populations is distinct from other deconvolution methods that explicitly use a prior cell signature matrix.

      We acknowledge that c-ICA, unlike traditional deconvolution methods, is not specifically designed for cell-type deconvolution and does not rely on a predefined cell signature matrix. While we explored the transcriptional components in the context of tumour and microenvironmental interactions, we agree that these components may not correspond directly to distinct cell types but rather reflect complex patterns of dysregulation, potentially within individual cell populations.

      Our goal with c-ICA was to uncover hidden transcriptional patterns possibly influenced by cellular heterogeneity. However, we recognize these patterns may also arise from regulatory processes within a single cell type. To investigate further, we used single-cell transcriptional data (~60,000 cell-type-annotated profiles from GSE158722). After removing the first principal component to minimize background effects, we projected our transcriptional components onto these profiles to obtain activity scores, allowing us to assess each TC’s behavior across diverse cellular contexts. Please refer to our response to remark 2.2 in the recommendations to the authors (page 14) for the results of this analysis.

      References

      Allen JK, Armaiz-Pena GN, Nagaraja AS, Sadaoui NC, Ortiz T, Dood R, Ozcan M, Herder DM, Haemerrle M, Gharpure KM, Rupaimoole R, Previs R, Wu SY, Pradeep S, Xu X, Han HD, Zand B, Dalton HJ, Taylor M, Hu W, Bottsford-Miller J, Moreno-Smith M, Kang Y, Mangala LS, Rodriguez-Aguayo C, Sehgal V, Spaeth EL, Ram PT, Wong ST, Marini FC, Lopez-Berestein G, Cole SW, Lutgendorf SK, diBiasi M, Sood AK. 2018. Sustained adrenergic signaling promotes intratumoral innervation through BDNF induction. Cancer Res 78 (12):3233-3242.

      Ardini E, Menichincheri M, Banfi P, Bosotti R, Ponti CD, Pulci R, Ballinari D, Ciomei M, Texido G, Degrassi A, Avanzi N, Amboldi N, Saccardo MB, Casero D, Orsini P, Bandiera T, Mologni L, Anderson D, Wei G, Harris J, Vernier J-M, Li G, Felder E, Donati D, Isacchi A, Pesenti E, Magnaghi P, Galvani A. 2016. Entrectinib, a Pan–TRK, ROS1, and ALK Inhibitor with activity in multiple molecularly defined cancer Indications. Mol Cancer Ther 15:628–639.

      Balood M, Ahmadi M, Eichwald T, Ahmadi A, Majdoubi A, Roversi Karine, Roversi Katiane, Lucido CT, Restaino AC, Huang S, Ji L, Huang K-C, Semerena E, Thomas SC, Trevino AE, Merrison H, Parrin A, Doyle B, Vermeer DW, Spanos WC, Williamson CS, Seehus CR, Foster SL, Dai H, Shu CJ, Rangachari M, Thibodeau J, Rincon SVD, Drapkin R, Rafei M, Ghasemlou N, Vermeer PD, Woolf CJ, Talbot S. 2022. Nociceptor neurons affect cancer immunosurveillance. Nature 611:405–412.

      Bhattacharya A, Bense RD, Urzúa-Traslaviña CG, Vries EGE de, Vugt MATM van, Fehrmann RSN. 2020. Transcriptional effects of copy number alterations in a large set of human cancers. Nat Commun 11:715.

      Burris HA, Shaw AT, Bauer TM, Farago AF, Doebele RC, Smith S, Nanda N, Cruickshank S, Low JA, Brose MS. 2015. Abstract 4529: Pharmacokinetics (PK) of LOXO-101 during the first-in-human Phase I study in patients with advanced solid tumors: Interim update. Cancer Res 75:4529–4529.

    1. Shaken Baby Syndrome/Abusive Head Trauma; wrongful conviction; psychological testimony; misinformation effects; forensic confirmation bias

      Great Keyword Ideas to place in the natural search bar on databases.

    1. eLife Assessment

      This important study provides solid evidence for new insights into the role of Type-1 nNOS interneurons in driving neuronal network activity and controlling vascular network dynamics in awake, head-fixed mice. The authors use an original strategy based on the ablation of Type-1 nNOS interneurons with local injection of saporin conjugated to a substance P analogue into the somatosensory cortex. They show that ablation of type I nNOS neurons has surprisingly little effect on neurovascular coupling, although it alters neural activity and vascular dynamics.

    2. Reviewer #1 (Public review):

      Turner et al. present an original approach to investigate the role of Type-1 nNOS interneurons in driving neuronal network activity and in controlling vascular network dynamics in awake head-fixed mice. Selective activation or suppression of Type-1 nNOS interneurons has previously been achieved using chemogenetics, optogenetics, or local pharmacology. Here, the authors took advantage of the fact that Type-1 nNOS interneurons are the only cortical cells that express the tachykinin receptor 1 to ablate them with a local injection of saporin conjugated to substance P (SP-SAP). SP-SAP causes cell death in 90% of Type-1 nNOS interneurons without affecting microglia, astrocytes, and other neurons. The authors report that the ablation has no major effects on sleep or behavior. Refining the analysis by scoring neural and hemodynamic signals with electrode recordings, calcium signal imaging, and wide-field optical imaging, the authors observe that Type-1 nNOS interneuron ablation does not change the various phases of the sleep/wake cycle. However, it does reduce low-frequency neural activity, irrespective of the classification of arousal state. Analyzing neurovascular coupling using multiple approaches, they report small changes in resting-state neural-hemodynamic correlations across arousal states, primarily mediated by changes in neural activity. Finally, they show that Type-1 nNOS interneurons play a role in controlling interhemispheric coherence and vasomotion.

      In conclusion, these results are interesting, use state-of-the-art methods, and are well supported by the data and their analysis. I have only a few comments on the stimulus-evoked haemodynamic responses, and these can be easily addressed.

    1. He is at this time transporting large Armies of foreign Mercenaries to compleat the works of death, desolation and tyranny, already begun with circumstances of Cruelty & perfidy [treachery] scarcely paralleled in the most barbarous ages, and totally unworthy the Head of a civilized nation

      this sentence is interesting because it can be both PRO and ANTI war. the invasion of the middle east by the US could be interpreted as desolation and tyranny so why would we do that? but then it only applies to US citizens? so then are they really inalienable?

    1. A person should not be allowed to reap the benefits of the violation of another’s rights by gaining a head start (as long as the person knows of the violation of the rights).

      period

    1. a process between man and nature, a process by which man, through his own actions, mediates, regulates, and controls the metabolism between himself and nature. [. . .] He sets in motion the natural forces which belong to his own body, his arms, legs, head, and hands, in order to appropriate the materials of nature in a form adapted to his own needs. Through this movement he acts upon external nature and changes it, and in this way he simultaneously changes his own nature

      making clay

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      The article emphasizes vocal social behavior but none of the experiments involve a social element. Marmosets are recorded in isolation, which could be sufficient for examining the development of vocal behavior in that particular context. However, the early-life maturation of vocal behavior is strongly influenced by social interactions with conspecifics. For example, the transition of cries and subharmonic phees, which are high-entropy calls, to more low-entropy mature phees is affected by social reinforcement from the parents. This effect extends across contexts, where differences in these interaction patterns extend to vocal behavior when the marmosets are alone. From the chord diagrams, cries still consist of a significant proportion of call types in lesioned animals. Additionally, though it is an intriguing finding that the infants' phee calls have acoustic differences being 'blunted of variation, less diverse and more regular,' the suggestion that the social message conveyed by these infants was 'deficient, limited, and/or indiscriminate' is not demonstrated but can be tested with, for example, playback experiments.

      We recognize that our definition of vocal social behavior is not within the normal realm of direct social interactions. We were particularly interested in marmoset vocalizations as a social signal, such as phees, cries and twitter, even when their family members or conspecifics are not visibly present. Generally speaking, in the laboratory, infant marmosets make few calls when in the presence of another conspecific, but when isolated they naturally make phee calls to reach out to their distantly located relatives. In this context, while we did not assess the animals interacting directly, we assessed what are normally referred to as ‘social contact calls,’ hence the term ‘social vocalizations.’ Playback recordings might provide potential evidence of antiphonal calling as a means of social interaction and might reveal the poor quality of the social message conveyed by the infant, but even here, the vocalizing marmoset would be calling to a non-visible conspecific. Thus, although our experiment lacked a direct social element, our data suggest that in the absence of a functioning ACC in early life, infant calls that convey social information, and which would elicit feedback from parents and other family members, may be compromised, and this could potentially influence how that infant develops its social interactive skills. We have now commented on the significance of social vocalizations in the introductory text (page 3) and discussion (page 15).

      The manuscript would benefit from the addition of more details to be able to better determine if the conclusions are well supported by the data. Understanding that this is very difficult data to get, the number of marmosets and some variability in the collection of the data would allow for the plotting of each individual across figures. For example, in the behavioral figures, which is the marmoset that is in the behavioral data that has a sparing of the ACC lesion in one hemisphere? Certain figures, described below in the recommendations for the authors, could also do with additional description.

      Thanks for these suggestions. We have plotted the individual animals in the relevant figures and addressed the comments and recommendations listed below.

      Reviewer #1 (Recommendations For The Authors):

      Given the number of marmosets, variability in the collected data, lesion extent, and different controls, I would like to see more plots with individuals indicated (perhaps with different symbols). More details could also be added for several plots.

      Figure 2D (new) and 2E now have plots that represent the individual animals, each represented by a different symbol.

      Figure 2A) Since lesions are bilateral, could you also show the extent of the lesions on the other side for completeness?

      Our intention was to process one hemisphere of each brain for Golgi staining to examine changes in cell morphology in the ACC and associated brain regions following the lesion. Unfortunately, the Golgi stain was unsuccessful. Consequently, we were unable to use the tissue to reconstruct the bilateral extent of the lesion. We did, however, first establish the bilateral nature of the lesion through coronal slices of the animals' MRI scans before processing the intact hemisphere to confirm the bilateral extent of the lesion. The MRI scans (every 5th section) for each control and lesioned animal are compiled in a figure in the supplementary materials (Fig. S1). These scans show that the ACC-lesioned animals have bilateral lesions with one animal (ACC1) showing some sparing in one hemisphere, as we noted in the text. We have now made reference to this supplemental figure in the text (page 5).

      Figure 2B/C) In Figure 2B, control and ACC lesions are in the columns while right next to it in 2C, ACC lesion and control are in the rows. Could these figures be adjusted so that they are consistent?

      We have now adjusted these figures and updated the figure legends accordingly.

      Figure 2C) Is there quantification of the 'loss of neurons and respective increase in glial cells at the lesioned site especially at the interface between gray and white matter'? There are multiple slices for each animal.

      Thanks for suggesting this. We have now quantified these data which are presented as a new graph as Fig. 2D. These data revealed a significant loss of neurons (NeuN) in the ACC group as well as an increase in glial cells (GFAP and Iba1) relative to the controls. The figure legend and results have also been updated.

      Figure 2C) It is difficult for me to distinguish between white and purple - could you show color channels independently since images were split into separate channels for each fluorophore?

      Fig. 2C has been revised to better visualize the neurons and glia at the gray and white matter interface. We found that grayscale images for each channel offered a better contrast than separating the channels for each fluorophore.

      Figure 2C/D) I like how there are individual dots here for the individual marmosets. Since there are four in each group, could they be represented throughout with symbols (with a key indicating the pair and also the control condition)? For example, were there changes in the histology for control animals that got saline injections as opposed to those that didn't get any surgery?

      We have highlighted the individual animals with different symbols in the figures. Although some animals were twin pairs, it was not possible to have twins in all cases. Only two sets were twins. We have indicated the symbols that represent the twin pairs in Fig. 2 as well as the MRI scans of the twin pairs in Fig. S1. There were no observed changes in histology for the sham animals relative to the other non-sham controls. The MRI scan for one sham animal (CON2) shows herniated tissue in the right hemisphere, which is a normal consequence of brain exposure caused by a craniotomy.

      Figure 3D-E) Here, individual data points could be informative especially given that some animals are missing data past the third week.

      To prevent cluttering the figure with too many data points, we have added the sample size for each group in the figure legend (page 33).

      Figure 3D/F) What exactly is the period that goes into this analysis? In the text, 'Further analysis showed that the ACC lesion had minimal effects on the rate of most call types during this period'. Is this period from weeks 3 to 6 relative to the proportions in week 2? I think I also don't quite understand the chord diagram. The legend says 'the numbers around each chord diagram represents relative probability value for each call type transition' so how does that relate to the proportion of these call types? It looks like there is a wider slice for cries for ACC-lesioned animals each week. I also don't see, in the week 4 chord diagram, the text description of "elevation in the rate of 'other' calls, which comprised tsik, egg, eck, chatter and seep calls. These calls were significantly elevated in animals after the ACC lesion."

      We apologize for the confusion. Fig 3D and Fig 3F are not directly related. Fig. 3D shows the different types of emitted calls. The figure shows the averaged data per group pooled from post-surgery weeks (week 3 – week 6). It represents the proportion of individual call types relative to the total number of calls during each recording period. The only major finding here was the increased rate of ‘other’ calls comprising tsik, egg, ock, chatter and seep calls. These calls were significantly elevated in animals after the ACC lesion.

      While Fig. 3D represents the differences in the proportion of calls, the chord diagrams in Fig. 3F represent the probability of call-to-call transition obtained from a probability matrix. At postnatal week 6, marmosets with ACC lesions showed a higher likelihood of transitions between all call types, but less frequent transitions between social contact calls relative to sham controls. The chord diagrams visualize the weighted probabilities and directionality of these transitions between the different call types. Weighted probabilities were used to account for variations in call counts. The thickness of the arrows or links indicates the probability of a call transition, while the numbers surrounding each chord diagram represent the relative probability value for each specific transition. We have now reworded the text and clarified these details in the figure legend (pages 32-33).
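
      As an illustration only, a call-to-call transition probability of this kind is conventionally estimated from counts of consecutive calls. The sketch below assumes first-order transitions and uses hypothetical notation: $c_{ab}$ is the number of times call type $a$ is immediately followed by call type $b$. The additional weighting for call counts mentioned in the response is not specified there, so it is not reproduced here.

```latex
% Row-normalised transition probability from call type a to call type b.
% c_{ab} is assumed notation: the count of a -> b transitions in a recording.
P(b \mid a) = \frac{c_{ab}}{\sum_{k} c_{ak}},
\qquad
\sum_{b} P(b \mid a) = 1
```

      Under this definition each row of the matrix sums to one, so a thick link in a chord diagram reflects a transition that is frequent relative to the other transitions leaving the same call type, not a call type that is frequent overall.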

      Figure 3E) How is the ratio on the y-axis calculated here?

      The y-axis represents the averaged value of the ratios of the number of social contact calls relative to non-social contact calls in each recording per subject per group (i.e., x̄(# social calls / # non-social calls)). This is now included in the figure legend and the axis is updated (page 32).
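
      Written out, the y-axis quantity might look like the following. This is a sketch under assumed notation, none of which appears in the response: $s_{i,r}$ and $n_{i,r}$ are the social and non-social contact call counts for subject $i$ in recording $r$, $R_i$ is that subject's number of recordings, and $N_g$ is the number of subjects in group $g$. Averaging within each subject first and then across subjects is also an assumption.

```latex
% Group mean of per-recording ratios (social / non-social contact calls).
% All symbols are assumed notation; the order of averaging is an assumption.
\bar{x}_g = \frac{1}{N_g} \sum_{i=1}^{N_g}
            \left( \frac{1}{R_i} \sum_{r=1}^{R_i} \frac{s_{i,r}}{n_{i,r}} \right)
```

      Note that a mean of ratios is generally not equal to the ratio of summed counts, so the two would give different values whenever call totals vary between recordings.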

      Also, cries could be considered a 'social contact call' since they are produced by infants to elicit responses from the parents. There is also the hypothesis in the literature that cries transition into phees.

      The reviewer is correct. Cries are often considered a social contact call because they elicit parental feedback. We decided to separate cry-calls from other social contact calls for two reasons. First, in our sample, we found cry behavior to be highly variable across the animals. For example, one control infant cried incessantly whereas another control infant cried less than normal. This extreme variability in animals of the same group masked the features that reliably differentiated between animals. Second, cry-calls elicit feedback from parents who are normally within the vicinity of the infant whereas phee calls elicit antiphonal phee calls from any distantly located conspecific. In other words, the contexts in which these calls are often elicited are very different.

      The use of 'syntactical' is a bit jarring to me because outside of linguistics, its use in animal communication generally refers to meaning-bearing units that can be combined into well-formed complexes such as pod-specific whale songs or predator alarm calls with concatenated syllable types in some species of monkeys. To my knowledge, individual phee syllables have not been currently shown to carry information on their own and may be better described as 'sequential' rather than 'syntactical'.

      We agree. We have made this change accordingly.

      Figure 4B) How many phee calls with differing numbers of syllables are present each week? How equal is the distribution given that later analyses go up to 5 syllables?

      The total number of phee calls with differing number of syllables ranged between 20-40 phees. This number varied between subjects, per week. The most common were 3- and 4-syllable phee calls which ranged from 7-15. Due to this variability, Fig. 4B presents the average syllable count. The axis is now updated.

      Figure 4C-E) How is the data combined here? Is there a 2nd syllable, the combined data from the 2nd syllable from phee calls of all lengths (1 - 5?). If so, are there differences based on how long the total sequence is?

      The combined data represent the specific syllable (e.g., the 1st syllable in a 2-syllable phee, in a 3-syllable phee and in a 4-syllable phee) irrespective of the length of the sequence. No differences were observed between the 2nd syllable in a 2-syllable phee and the 2nd syllable in a 3- or 4-syllable phee. We have included this detail in the figure legend (pages 33-34).

      Duration is a vocal parameter that is highly dependent on physical factors such as body size and lung volume. Were there differences in physical growth between the pairs of ACC-lesioned marmosets and their twins? Entropy is less closely tied to these physical factors but has previously been shown to decrease as phee calls mature, which we can also see in the negative relationship of the control animals. Do you know of experiments that show that lower entropy calls are more 'blunted'?

      Thank you for raising the important issue of physical growth factors. For twin pairs, it is not uncommon for one infant to be slightly bigger, heavier or stronger than the other, presumably because one gets more access to food. With increasing age, we did not observe significant differences in body weight between the groups. We examined grip strength in all infants as a means of assessing how well the infant was able to access food during nursing. Poor grip strength would indicate a lower propensity to ‘hang on’ to the mother for nursing, which could lead to lower weight gain and reduced physical growth. We found that both grip strength and body weight increased as the infants got older and both parameters were equivalent between groups. We have added a figure showing the normal increase in both weight and grip strength to the supplemental materials (Fig. S3) and have made reference to this in the text (page 8).

      As for entropy, its impact on the emotional quality of vocalizations has not been systematically explored. Generally speaking, high entropy relates to high randomness and distortion in the signal. Accordingly, one view posits that low-entropy phee calls represent mature sounding calls relative to noisy and immature high-entropy calls (e.g., Takahashi et al., 2017). In the current study, the reduction in syllable entropy observed for both groups of animals with increasing age is consistent with this view. At the same time, entropy can relate to vocal complexity; high entropy refers to complex and variable sound patterns whereas low-entropy sounds are predictable, less diverse and simple vocal sequences (Kershenbaum, A. 2013. Entropy rate as a measure of animal vocal complexity. Bioacoustics, 23(3), 195–208). One possibility is that call maturity does not equate directly to emotional quality. In other words, a low-entropy mature call can also be lacking in emotion, as observed in humans with ACC damage; these patients show mature speech, but they lack the variations in rhythms, patterns and intonation (i.e., prosody) that would normally convey emotional salience and meaning. Our observation of a reduction in phee syllable entropy in the ACC group, in the context of calls being short and loud with reduced peak frequency, is consistent with this view. Our use of the word ‘blunt’ was to convey how the calls exhibited by the ACC group were potentially lacking emotional meaning. Beyond this speculation, we are not aware of any papers that have examined the relationship between entropy and blunted calls directly. We have now included this speculation in the discussion (pages 12-13).

      Reviewer #2 (Public Review):

      The authors state that the integrity of white matter tracts at the injection site was impacted but do not show data.

      We have added representative micrographs of a control and ACC-lesioned animal in a new supplementary figure which shows the neurotoxin impacted the integrity of white matter tracts local to the site of the lesion (Fig. S2).

      The study only provides data up to the 6th week after birth. Given the plasticity of the cortex, it would be interesting to see if these impairments in vocal behavior persist throughout adulthood or if the lesioned marmosets will recover their social-vocal behavior compared to the control animals.

      We agree. Our original intention was to examine behavior into adulthood. Unfortunately, the COVID-19 pandemic compromised the continuation of the study. We were limited by the data that we were allowed to acquire due to imposed restrictions. Some non-vocalization data collected when the animals were young adults is currently being prepared for another paper.

      Even though this study focuses entirely on the development of social vocalizations, providing data about altered social non-vocal behaviors that accompany ACC lesions is missing. This data can provide further insights and generate new hypotheses about the exact role of ACC in social vocal development. For example, do these marmosets behave differently towards their conspecifics or family members and vice versa, and is this an alternate cause for the observed changes in social-vocal development?

      We agree. At the time, however, apparatus for assessing behavior between the infant’s family and non-family members was not available. Assessing such behaviors in the animals’ holding room posed some difficulty since marmosets are easily distracted by other animals as well as the presence of an experimenter, amongst other things. This is an area of investigation we are currently pursuing.

      Reviewer #3 (Public Review):

      It is striking to find that the vocal repertoire of infant marmosets was not significantly affected by ACC lesions. During development, the neural circuits are still maturing and the role of different brain regions may evolve over time. While the ACC likely contributes to vocalizations across the lifespan, its relative importance may vary depending on the developmental stage. In neonates, vocalizations may be more reflexive or driven by physiological needs. At this stage, the ACC may play a role in basic socioemotional regulation but may not be as critical for vocal production. Since the animals lived for two years, further analysis might be helpful to elucidate the precise role of ACC in the vocal behavior of marmosets.

      Figure 3D. According to the Introduction "...infant ACC lesions abolish the characteristic cries that infants normally issue when separated from its mother". Are the present results in marmosets showing the opposite effect? Please discuss.

      To date, the work of MacLean (1985) is the only publication that describes the effect of early cingulate ablation on the spontaneous production of ‘separation calls’ largely construed as cries, coos and whimpers in response to maternal separation. All of this work was largely performed in rhesus macaques or squirrel monkeys. In addition to ablating the cingulate cortex, MacLean found that it was necessary to ablate the subcallosal (area 25) and preseptal cingulate cortex (presumably referring to prelimbic area 32) to permanently eliminate the spontaneous production of separation cry calls. Our ablation of the ACC was more circumscribed to area 24 and is therefore consistent with MacLean’s earlier work that removal of ACC alone does not eliminate cry behavior. In adults, ACC ablation is insufficient to eliminate vocalization as well. We make reference to this on pages 13-14 of the discussion.

      Figure 3E and Discussion. Phees are mature contact calls and cries immature contact calls (Zhang et al, 2019, Nat Commun). Therefore, I would rather say that the proportion of immature (cries) contact calls increases vs the mature (phee, trill, twitters) contact calls in the ACC group. Cries are also "isolated-induced contact calls" to attract the attention of the caregivers.

      The reviewer is correct in that cries are directed towards caregivers, but in our sample, cry behavior was highly variable between the infants. Consequently, in Fig. 3E social contact calls include phee, twitter and trill calls but do not include cries, which were separated (see also response to reviewer #1). Many of the calls made during babbling were immature in their spectral pattern (compare phee calls between Fig. 3A and 3B). Cries typically transitioned into phees, twitters or trills before they fully matured. Fig. 3E shows that the controls made more isolation-induced social contact calls at postnatal week 6, which were presumably maturing at this time point. Thus, if anything, there was an increase in the proportion of mature contact calls vs immature contact calls with increasing age.

      Figure 4D. Animal location and head direction within the recording incubator can have significant effects on the perceived amplitude of a call. Were these factors taken into account?

      The reviewer makes an excellent observation. Unfortunately, we did not account for location and head direction because the infants were quite mobile in the incubator. The directional microphone was hidden from view because the infants were distracted by it, and positioned ~12 cm from the marmoset, and placed in the exact same location for every recording. In addition, calls with phantom frequencies were eliminated during visual inspection of spectrograms. Beyond these details, location and head direction were not taken into account.

      Figure 4E. When a phee call has a higher amplitude, as is the case for the ACC group (Figure 4D), the energy of the signal will be concentrated more strongly at the phee call frequency ~8KHz. This concentration of the energy reduces the variability in the frequency distribution, leading to lower entropy. The interpretation of the results should be reconsidered. A faint call (control group) can exhibit more variability in the frequency content since the energy is distributed across a wider range of frequencies contributing to higher entropy. It can still be "fixed, regular, and stereotyped" if the behavior is consistent or predictable with little variation. Also, to define ACC calls as "monotonic" I would rather search for the lack of frequency modulation, amplitude variation, or narrower bandwidth.

      We very much appreciate this explanation. We were able to identify the maximum frequency that closely matched the pitch of the sound for each syllable in a multisyllabic phee. New Fig. 4E shows that the peak frequency for each phee syllable was lower in the ACC-lesioned monkeys, which may directly translate to the low entropy observed in this group. The term “monotonic” was used to relate our data to the classical and long-standing evidence of human ACC lesions causing monotonous intonation of speech. When all factors are taken into account, it is evident that the vocal phee signature of the ACC-lesioned animals was structurally different from that of the controls, implying a less complex, more stereotyped ACC signal. Further studies are needed to systematically explore the relationship between entropy and emotional quality of vocalizations.
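
      The reviewer's point that concentrating energy at one frequency lowers entropy can be made concrete with one common definition, the Shannon entropy of a normalised power spectrum. This is a sketch under assumptions: the exchange does not state which entropy measure was computed, and the symbols $S_k$ (power in frequency bin $k$) and $K$ (number of bins) are introduced here for illustration only.

```latex
% Shannon spectral entropy of a power spectrum S_1, ..., S_K (assumed notation),
% using the convention 0 \log_2 0 = 0.
p_k = \frac{S_k}{\sum_{j=1}^{K} S_j},
\qquad
H = -\sum_{k=1}^{K} p_k \log_2 p_k,
\qquad
0 \le H \le \log_2 K

% Worked example with K = 4 bins:
%   all energy in one bin:        p = (1, 0, 0, 0)                  =>  H = 0 bits
%   energy spread evenly:         p = (1/4, 1/4, 1/4, 1/4)          =>  H = \log_2 4 = 2 bits
```

      Because the spectrum is normalised before the sum is taken, scaling every $S_k$ by the same factor leaves $H$ unchanged. Under this particular definition, a louder call has lower entropy only if its energy is also distributed over fewer bins.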

      Apart from the changes in the vocal behavior, did the ACC lesions manifest in any other observable cognitive, emotional, or social behavior? ACC plays a role in processing pain and modulating pain perception. Could that be the reason for the observed increase in the proportion of cries in the ACC group and the increase in the phee call amplitude? Did the cries in the ACC group also display a higher amplitude than the cries in the control group?

      It was our intention to acquire as much data as possible from these infants as they matured from a cognitive, social and emotional perspective. Unfortunately, our study was hampered by a variety of factors, including the COVID-19 pandemic, which imposed major restrictions on our ability to continue with the experiment in a time-sensitive manner. In addition, the development and construction of the custom apparatus to measure these behaviors was stalled during this period, further preventing us from collecting behavioral data at regular time intervals. As for the cry behavior, the number of cries in the ACC group was very low, especially at postnatal weeks 5 and 6. Consequently, there were very few data points to work with.

      Discussion. Louder calls have the potential to travel longer distances compared to fainter calls, possess higher energy levels, and can propagate through the environment more effectively. If the ACC group produced louder phee syllables, how could the message conveyed over long distances be "deficient, limited, and/or indiscriminate"?

      Thanks for raising this interesting concept. Not all calls emitted by the animals were loud. We specifically examined the long-distance phee call in this regard. The phee syllables emitted by the ACC group were high amplitude with low frequencies, short duration and low entropy. Taking these factors into account, it is conceivable that the phee calls produced by the ACC group could not effectively convey their message over long distances despite their propagation through the environment. We have made reference to this in the discussion, where we focus specifically on the phee calls (page 12).

      Abstract: Do marmosets have syntax? Consider replacing "syntactical" with a more appropriate term (maybe "syntax-like").

      Thanks for this suggestion. We have replaced the term syntactical with ‘sequential’ as per the recommendation of reviewer #1.

      Introduction: "...cries that infants normally issue when separated from its mother". Please replace "its" with "their".

      This has been corrected.

      Results: Is the reference to Fig 1B related to the text?

      We have included and referred to Fig. 1B in the text (results and methods) to show other researchers how they can use this technique as a reliable and safe means of monitoring tidal volume under anesthesia in small infant marmosets without intubation.

      I understand that both "spectrograph" and "spectrogram" are used to analyze the frequency content of a signal. Nevertheless, "spectrogram" refers to the visual representation of the frequency content of a signal over time, and this term is commonly used in audio signal processing and specifically in the vocal communication field. I would recommend replacing "spectrograph" with "spectrogram".

      Thanks for this suggestion. We have corrected this throughout the manuscript.

      (Concerning the previous comment in the public review). Cries are uttered to attract the attention of the caregivers. The increase in the proportion of cries in the ACC group does not match the sentence: "...these infants appeared to make little effort in using vocalizations to solicit social contact when socially isolated".

      We apologize for the confusion. It is not the case that the ACC animals make more cries. Cry calls were highly variable amongst the animals. Consequently, although Fig. 3D gives the impression that the proportion of cries is higher in ACC animals, they did not differ significantly from the controls. Due to their high variability, cries were removed in the measurement of social contact. Accordingly, Fig. 3E does not include cry behavior; it shows that the ACC animals engage less in social contact calls.

      Related to Figure 3. What is the difference between "egg" and "eck" calls? Do you mean "ock"?

      We apologize. This was a typo. It should be ock calls.

      Figure 4B. Is the sample size five animals per group and per week? Overlapping data points seem to be placed next to each other. Why are fewer than five dots visible in some groups (e.g. ACC 6 weeks)?

      The sample size differed per week because of the lack of recording during the COVID restrictions. In Fig. 4B, we have now separated the overlapping dots. We have also added the sample size of the groups in the figure legends.

      Would the authors expect to see stronger differences between the lesioned and the control groups when comparing a later developmental stage? The animals were euthanized at the age of

      This speculation is certainly feasible and yes, we were hoping to establish this level of detail with testing at later developmental stages. This is an aspect of development we are currently pursuing.

      Could these experiments be conducted?

      I’m afraid these animals are no longer available, but we are currently conducting experiments in other animals with early life neurochemical manipulations that show behavioral changes into early adulthood.

      ACC lesion: It is reported that the lesions extended past 24b into motor area 6M. Did the animal display any motor control disability?

      Surprisingly, despite the lesion encroaching into 6M, these animals showed no observable motor impairment. We assessed the animals’ grip strength and body weight and found normal strength and weight gain in both the controls and the lesioned group. We have added these data as supplemental information (Fig. S3).

    1. eLife Assessment

      In this useful study, the authors perform voltage imaging of CA1 pyramidal cells in head-fixed mice running on a track while local field potentials (LFPs) are recorded. The authors conclude that synchronous ensembles of neurons are differentially associated with different types of LFP patterns, namely theta and ripples. However, evidence for the claims in the paper remains incomplete, due to caveats of the experimental approach and claims that are based on a relatively sparse data set collected with a cutting-edge but still largely untested method.

    2. Reviewer #2 (Public review):

      Summary:

      This study employed voltage imaging in the CA1 region of the mouse hippocampus during the exploration of a novel environment. The authors report that synchronous activity, involving almost half of the imaged neurons, occurred during periods of immobility. These events did not correlate with SWRs, but instead occurred during theta oscillations and were phase-locked to the trough of theta. Moreover, pairs of neurons with high synchronization tended to display non-overlapping place fields, leading the authors to suggest these events may play a role in binding a distributed representation of the context.

      Strengths:

      Technically, this is an impressive study, using an emerging approach that allows single-cell-resolution voltage imaging in animals that, while head-fixed, can move through a real environment. The paper is written clearly and suggests novel observations about population-level activity in CA1.

      Comments on revisions:

      I have no further major requests and thank the authors for the additional data and analyses.

    3. Reviewer #3 (Public review):

      Summary:

      In the present manuscript, the authors use a few minutes of voltage imaging of CA1 pyramidal cells in head-fixed mice running on a track while local field potentials (LFPs) are recorded. The authors suggest that synchronous ensembles of neurons are differentially associated with different types of LFP patterns, theta and ripples. The experiments are flawed in that the LFP is not "local" but rather collected from the other side of the brain.

      Strengths:

      The authors use a cutting-edge technique.

      Weaknesses:

      Although the authors have toned down their claims, the statement in the title ("Synchronous Ensembles of Hippocampal CA1 Pyramidal Neurons Associated with Theta but not Ripple Oscillations During Novel Exploration") is still unsupported.

      One could write the same title while voltage imaging one mouse and recording LFP from another mouse.

      To properly convey the results, the title should be modified to read "Synchronous Ensembles of Hippocampal CA1 Pyramidal Neurons Associated with Contralateral Theta but not with Contralateral Ripple Oscillations During Novel Exploration"

      Without making this change, the title - and therefore the entire work - is misleading at best.

    1. With so many characters that you might not think should be special in fact being special, I just use the special characters anyway. This gets me into the good habit of using bash completion, which auto-escapes all the special characters in a filename. It also gets me into the good habit of escaping/quoting EVERYTHING in scripts and multi-part one-liners in bash. For example, in a simple one-liner: for file in *.txt; do something.sh "$file"; done That way, even if one of the filenames has a space or some other special character, the do part of the loop still acts on the whole name rather than splitting it into two or more filename parts, possibly causing unintended side effects. Since I cannot control the space/no-space naming of EVERY file I encounter, and trying to would probably break some symlinks somewhere, causing yet more unintended consequences, I just expect that any filename or directory name could have spaces in it, and I quote/escape all variables to compensate. So I use whatever characters I want (often spaces) in filenames. I even use spaces in ZFS dataset names, which I have to admit has caused a fair amount of head-scratching among the developers who write the software for the NAS I use. To sum up: spaces are not an invalid character, so there's no reason not to use them.
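       The quoting habit described above can be sketched as follows. This is a minimal illustration, not the commenter's own script: the helper name count_txt_lines and the scratch filenames are hypothetical, invented only for the demo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<line count> <name>" for every .txt file in a directory.
count_txt_lines() {
  local dir=$1 file
  for file in "$dir"/*.txt; do
    # If the glob matched nothing, the literal pattern is left in $file; skip it.
    [ -e "$file" ] || continue
    # Because "$file" is quoted, a name such as "my notes.txt" stays one argument.
    printf '%s %s\n' "$(wc -l < "$file" | tr -d ' ')" "${file##*/}"
  done
}

# Demo on a scratch directory that contains a filename with a space.
demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/my notes.txt"
printf 'c\n' > "$demo_dir/plain.txt"
count_txt_lines "$demo_dir"
rm -rf -- "$demo_dir"
```

       Without the quotes around "$file", the shell would split "my notes.txt" into two words, and the redirection would fail or read the wrong file, which is the side effect the comment warns about.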
  10. inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net
    1. for Bourdieu, schools act as institutional agents that reward the cultural capital of the dominant classes and devalue those of the working classes and the poor.

      This reminds me of a polisci class I took two quarters back, in which Pierre Bourdieu’s concept of habitus was introduced to me. Students from working-class or immigrant households carry cultural dispositions misaligned with institutional expectations, and this mismatch contributes to disengagement and internalized failure. At home, I was taught to listen, never challenge elders (because they always have more wisdom), and keep my head down. At school, success meant speaking up, asserting opinions, and networking. It took me years to unlearn silence; at least now I am comfortable voicing what I disagree with to my family elders.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02465

      Corresponding author(s): Saravanan, Palani

      1. General Statements

      We would like to thank the Review Commons Team for handling our manuscript and the Reviewers for their constructive feedback and suggestions. In our revised manuscript, we have addressed and incorporated all the major suggestions of the reviewers, and we have also added new significant data on the role of Tropomyosin in regulation of endocytosis through its control over actin monomer pool maintenance and actin network homeostasis. We believe that with all these additions, our study has significantly gained in quality, strength of conclusions made, and scope for future work.

      2. Point-by-point description of the revisions

      Reviewer #1

      Evidence, reproducibility and clarity

      There are 2 Major issues -

      Having an -ala-ser- linker between the GFP and tropomyosin mimics acetylation. This is not the case, and more likely this linker acts as a spacer that allows tropomyosin polymers to form on the actin, and without it there is steric hindrance. A similar result would be seen with a simple flexible uncharged linker. It has been shown in a number of labs that the GFP itself masks the effect of the charge on the amino-terminal methionine. This is consistent with NMR, crystallographic and cryo structural studies. Biochemical studies demonstrating the impact of the linker should be presented before the stated conclusions, which provide the basis of a major part of this study, can be made.

      Response: We would like to clarify that all mNG-Tpm constructs used in our study contain a 40 amino-acid (aa) flexible linker between the N-terminal mNG fluorescent protein and the Tpm protein as per our earlier published study (Hatano et al., 2022). During initial optimization, we also experimented with linker length, and the 40aa linker works optimally for clear visualization of Tpm on actin cable structures in budding yeast, fission yeast (both S. pombe and S. japonicus), and mammalian cells (Hatano et al., 2022). These constructs have since been used in other studies (Wirshing et al., 2023; Wirshing and Goode, 2024) and currently represent the best possible strategy to visualize Tpm isoforms in live cells. In our study, we characterized these proteins for functionality and found that both mNG-Tpm1 and mNG-Tpm2 were functional and can rescue the synthetic lethality observed in ∆tpm1∆tpm2 cells. During our study, we observed that mNG-Tpm1 expression from a single-copy integration vector did not restore full-length actin cables in ∆tpm1 cells (Fig. 1B, 1C). We hypothesized that this could be a result of reduced binding affinity of the tagged tropomyosin due to lack of the normal N-terminal acetylation, which stabilizes the N-terminus. The 40aa linker is unstructured and may not be able to neutralize the charge on the N-terminal methionine; thus, we inserted an -Ala-Ser- dipeptide, which has been routinely used in in vitro biochemical studies to stabilize the N-terminal helix and impart a similar effect as the N-terminal acetylation (Alioto et al., 2016; Palani et al., 2019; Christensen et al., 2017) by restoring normal binding affinity of Tpm to F-actin (Monteiro et al., 1994; Greenfield et al., 1994). We observed that addition of the -Ala-Ser- dipeptide to the mNG-Tpm fusion indeed restored full-length actin cables when expressed in ∆tpm1 cells, performing significantly better in our in vivo experiments (Fig. 1B, 1C).

      We agree with the reviewer that the -AS- dipeptide addition may not mimic N-terminal acetylation structurally, but as per previous studies, it may stabilize the N-terminus of Tpm and allow normal head-to-tail dimer formation (Greenfield et al., 1994; Monteiro et al., 1994; Frye et al., 2010). We have discussed this in our new Discussion section (Lines 350-372). Since the addition of the -AS- dipeptide was referred to as "acetyl-mimic (am)" in a previous study (Alioto et al., 2016), we continued to use the same nomenclature in our study. Now, as per your suggestions and to be more accurate, we have renamed "mNG-amTpm" constructs as "mNG-ASTpm" throughout the study so as not to confuse or claim that -AS- addition mimics acetylation. In any case, we have not seen any other ill effect of introducing the -AS- dipeptide in addition to our 40 amino acid linker, suggesting that it can also be considered part of the linker. Although we agree with the reviewer that biochemical characterization of the effect of the linker would be important, we strongly believe that it is currently outside the scope of this study and should be taken up in future work with these proteins. Our study mainly aimed to understand the functionality and utility of these mNG-Tpm fusion proteins for cell biological experiments in vivo, which had not been done earlier in any other model system.

      My major issue, however, is making the conclusions stated here using an amino-terminal fluorescent protein tag that is likely to impact any type of isoform selection at the end of the actin polymer. Carboxyl-terminal tagging may have a reduced effect, but modifying the ends of the tropomyosin, which are integral in stabilising end-to-end interactions with itself on the actin filament, never mind any selection systems that may or may not be present in the cell, is not appropriate.

      Response: We agree with the reviewer that N-terminal tagging of tropomyosin may have effects on its function, but these constructs represent the only fluorescently tagged functional tropomyosin constructs available currently while C-terminal fusions are either non-functional (we were unable to construct strains with endogenous Tpm1 gene fused C-terminally to GFP) or do not localize clearly to actin structures (See Figure R1 showing endogenous C-terminally tagged Tpm2-yeGFP that shows almost no localization to actin cables). To our knowledge, our study represents a first effort to understand the question of spatial sorting of Tpm isoforms, Tpm1 and Tpm2, in S. cerevisiae and any future developments with better visualization strategies for Tpm isoforms without compromising native N-terminal modifications and function will help improve our understanding of these proteins in vivo. We have also discussed these possibilities in our new Discussion section (Lines 391-396).

      Significance

      This paper explores the role of formin in determining the localisation of different tropomyosins to different actin polymers and cellular locations within budding yeast. Previous studies have indicated a role for the actin nucleating proteins in recruiting different forms of tropomyosin within fission yeast. In mammalian cells there is variation between cell types in the role of formins in affecting tropomyosin localisation. There is also evidence that other actin binding proteins, and tropomyosin abundance, play roles in regulating the tropomyosin-actin association according to cell type. Biochemical studies previously undertaken using budding yeast and fission yeast have shown that the core actin polymerisation domain of formins does not interact with tropomyosin directly. The significance of this study, given the above and the concerns raised, is not clear to this reviewer.

      Response: Our study explores multiple facets of Tropomyosin (Tpm) biology. The lack of functional tagged Tpm has been a major bottleneck in understanding Tpm isoform diversity and function across eukaryotes. In our study, we characterize the first functional tagged Tpm proteins (Fig. 1, Fig. S1) and use them to answer long-standing questions about localization and spatial sorting of Tpm isoforms in the model organism S. cerevisiae (Fig. 2, Fig. 3, Fig. S2, Fig. S3). We also discover that the dual Tpm isoforms, Tpm1 and Tpm2, are functionally redundant for actin cable organization and function, while having gained divergent functions in Retrograde Actin Cable Flow (RACF) (Fig. 4, Fig. 5A-D, Fig. S4, Fig. S5, Fig. S6). We have now added new data on the role of global Tpm levels in controlling endocytosis via maintenance of normal linear-to-branched actin network homeostasis in S. cerevisiae (Fig. 5E-G). We respectfully differ with the reviewer on their assessment of our study and request the reviewer to read our revised manuscript, which discusses the significance, limitations, and future perspectives of our study in detail.

      Reviewer #2

      Evidence, reproducibility and clarity

      This manuscript by Dhar, Bagyashree, Palani and colleagues examines the function of the two tropomyosins, Tpm1 and Tpm2, in the budding yeast S. cerevisiae. Previous work had shown that deletion of tpm1 and tpm2 causes synthetic lethality, indicating overlapping function, but also proposed that the two tropomyosins have distinct functions, based on the observation that strong overexpression of Tpm2 causes defects in bud placement and fails to rescue tpm1∆ phenotypes (Drees et al, JCB 1995). The manuscript first describes very functional mNeonGreen-tagged versions of Tpm1 and Tpm2, where an alanine-serine dipeptide is inserted before the first methionine to mimic acetylation. It then proposes that Tpm1 and Tpm2 exhibit indistinguishable localization and that low-level overexpression (?) of Tpm2 can replace Tpm1 for stabilization of actin cables and cell polarization, suggesting almost completely redundant functions. They also propose one specific function of Tpm2 in regulating retrograde actin cable flow.

      Overall, the data are very clean, well presented and quantified, but in several places are not fully convincing of the claims. Because the claims that Tpm1 and Tpm2 have largely overlapping function and localization are in contradiction to previous publication in S. cerevisiae and also different from data published in other organisms, it is important to consolidate them. There are fairly simple experiments that should be done to consolidate the claims of indistinguishable localization, and levels of expression, for which the authors have excellent reagents at their disposal.

      1. Functionality of the acetyl-mimic tagged tropomyosin constructs: The overall very good functionality of the tagged Tpm constructs is convincing, but the authors should be more accurate in their description, as their data show that they are not perfectly functional. For instance, the use of "completely functional" in the discussion is excessive. In the results, the statement that mNG-Tpm1 expression restores normal growth (page 3, line 69) is inaccurate. Fig S1C shows that tpm1∆ cells expressing mNG-Tpm1 grow more slowly than WT cells. (The next part of the same sentence, stating it only partially restores length of actin cables, should cite only Fig S1E, not S1F.) Similarly, the growth curve in Fig S1C suggests that mNG-amTpm1, while better than mNG-Tpm1, does not fully restore the growth defect observed in tpm1∆ (in contrast to what is stated on p. 4 line 81). A more stringent test of functionality would be to probe whether mNG-amTpm1 can rescue the synthetic lethality of the tpm1∆ tpm2∆ double mutant, which would also allow testing the functionality of mNG-amTpm2.

      Response: We would like to thank the reviewer for their feedback and suggestions. Based on the suggestions, we have now more accurately described the growth rescue observed by expression of mNG-ASTpm1 in ∆tpm1 cells in the revised text. We have also removed the use of "completely functional" to describe mNG-Tpm functionality and corrected any errors in Figure citations in the revised manuscript.

      As per the reviewer's suggestion, we have now tested rescue of synthetic lethality of ∆tpm1∆tpm2 cells by expression of all mNG-Tpm variants, and we find that all of them are capable of restoring the viability of ∆tpm1∆tpm2 cells when expressed under their native promoters via a high-copy plasmid (pRS425) (Fig. S1E), but only mNG-Tpm1 and mNG-ASTpm1 restored viability of ∆tpm1∆tpm2 cells when expressed under their native promoters via an integration plasmid (pRS305) (Fig. S1F). These results clearly suggest that while both mNG-Tpm1 and mNG-Tpm2 constructs are functional, Tpm1 tolerates the presence of the N-terminal fluorescent tag better than Tpm2. These observations enhance our understanding of the functionality of these mNG-Tpm fusion proteins and will be a useful resource for their usage and experimental design in future studies in vivo.

      It would also be nice to comment on whether the mNG-amTpm constructs are really mimicking acetylation. Given that the Ala-Ser peptide ahead of the starting Met is linked N-terminally to mNG, it is not immediately clear it will have the same effect as a free acetyl group decorating the N-terminal Met.

      Response: We agree with the reviewer's observation and for the sake of clarity and accuracy, we have now renamed "mNG-amTpm" with "mNG-ASTpm". The use of -AS- dipeptide is very routine in studies with Tpm (Alioto et al., 2016; Palani et al., 2019; Christensen et al., 2017) and its addition restores normal binding affinities to Tpm proteins purified from E. coli (Monteiro et al., 1994). We agree with the reviewer that the -AS- dipeptide addition may not mimic N-terminal acetylation structurally but as per previous studies, it may help neutralize the impact of a freely protonated Met on the alpha-helical structure and stabilize the N-terminus helix of Tpm and allow normal head-to-tail dimer formation (Monteiro et al., 1994; Frye et al., 2010; Greenfield et al., 1994). Consistent with this, we also observe a highly significant improvement in actin cable length when expressing mNG-ASTpm as compared to mNG-Tpm in ∆tpm1 cells, suggesting an improvement in function probably due to increased binding affinity (Fig. 1B, 1C). We have also discussed this in our answer to Question 1 of Reviewer 1 and the revised manuscript (Lines 350-372).

      Localization of Tpm1 and Tpm2: Given the claimed full functionality of mNG-amTpm constructs and the conclusion from this section of the paper that relative local concentrations may be the major factor in determining tropomyosin localization to actin filament networks, I am concerned that the analysis of localization was done in strains expressing the mNG-amTpm construct in addition to the endogenous untagged genes. (This is not expressly stated in the manuscript, but it is my understanding from reading the strain list.) This means that there is a roughly two-fold overexpression of either tropomyosin, which may affect localization. A comparison of localization in strains where the tagged copy is the sole Tpm1 (respectively Tpm2) source would be much more conclusive. This is important as the results are making a claim in opposition to previous work and observation in other organisms.

      Response: We thank the reviewer for this observation and their suggestions. We agree that relative concentrations of functional Tpm1 and Tpm2 in cells may influence the extent of their localizations. As per the reviewer's suggestion, we have now conducted our quantitative analysis in cells lacking endogenous Tpm1 and only expressing mNG-ASTpm1 from an integrated plasmid copy at the leu2 locus, and the data is presented in new Figure S3. We compared Tpm-bound cable length (Fig. S3A, S3B) and Tpm-bound cable number (Fig. S3A, S3C) along with actin cable length (Fig. S3D, S3E) and actin cable number (Fig. S3D, S3F) in wildtype, ∆bnr1, and ∆bni1 cells. Our analysis revealed that mNG-ASTpm1 localized to actin cable structures in wildtype, ∆bnr1, and ∆bni1 cells, and the decrease observed in Tpm-bound cable length and number upon loss of either Bnr1 or Bni1 was accompanied by a corresponding decrease in actin cable length and number. Thus, this analysis reached the same conclusion as our earlier analysis (Fig. 2): mNG-ASTpm1 does not show a preference between Bnr1-made and Bni1-made actin cables. mNG-ASTpm2 did not restore functionality when expressed as a single integrated copy in ∆tpm1∆tpm2 cells (new results in Fig. S1E, S1F, S5A); thus, we could not conduct a similar analysis for mNG-ASTpm2. This suggests that use of mNG-ASTpm2 would be more meaningful in the presence of endogenous Tpm2, as previously done in Fig. 2D-F.

      We have now also performed additional yeast mating experiments with cells lacking the bnr1 gene and expressing either mNG-ASTpm1 or mNG-ASTpm2, and the data is shown in new Figure 3. We observe that both mNG-ASTpm1 and mNG-ASTpm2 localize to the mating fusion focus in a Bnr1-independent manner (Fig. 3B, 3D), suggesting that they bind to Bni1-made actin cables that are involved in polarized growth of the mating projection. These results also add strength to our conclusion that Tpm1 and Tpm2 localize to actin cables irrespective of which formin nucleates them. Overall, these new results highlight and reiterate our model of formin-isoform independent binding of Tpm1 and Tpm2 in S. cerevisiae.

      In fact, although the authors conclude that the tropomyosins do not exhibit preference for certain actin structures, in the images shown in Fig 2A and 2D, there seems to be a clear bias for Tpm1 to decorate cables preferentially in the bud, while Tpm2 appears to decorate them more in the mother cell. Is that a bias of these chosen images, or does this reflect a more general trend? A quantification of relative fluorescence levels in bud/mother may be indicative.

      Response: We thank the reviewer for pointing this out. Our data and analysis do not suggest that Tpm1 and Tpm2 show any preference for decoration of cables in either the mother or bud compartment. As per the reviewer's suggestion, we have now quantified the ratio of mean mNG fluorescence in the bud to the mother (Bud/Mother) and the data is shown in Figure S2G. The bud-to-mother ratio was similar for mNG-ASTpm1 and mNG-ASTpm2 in wildtype cells, and the ratio increased in ∆bnr1 cells and decreased in ∆bni1 cells for both mNG-ASTpm1 and mNG-ASTpm2 (Fig. S2G). This is consistent with the decreased actin cable signal in the mother compartment in ∆bnr1 cells and decreased actin cable signal in the bud compartment in ∆bni1 cells (Fig. S2A-D). Thus, our new analysis shows that both mNG-ASTpm1 and mNG-ASTpm2 have similar changes in their concentration (mean fluorescence) upon loss of either formin, Bnr1 or Bni1, and show similar ratios in wildtype cells as well, suggesting no preference for binding to actin cables in either the bud or mother compartment. The preference inferred by the reviewer seems to be a bias of the current representative images and thus, we have replaced the images in Fig. 2A, 2D to more accurately represent the population.

      The difficulty in preserving mNG-amTpm after fixation means that authors could not quantify relative Tpm/actin cable directly in single fixed cells. Did they try to label actin cables with Lifeact instead of using phalloidin, and thus perform the analysis in live cells?

      Response: We did not use LifeAct for our analysis as LifeAct is known to cause expression-dependent artefacts in cells (Courtemanche et al., 2016; Flores et al., 2019; Xu and Du, 2021) and it also competes with proteins that regulate normal cable organization, like cofilin. Use of LifeAct would necessitate standardization of expression to avoid such artefacts in vivo. Also, phalloidin staining provides the best staining of actin cables and allows for better quantitative results in our experiments. The use of LifeAct along with mNG-Tpm would also require optimization with a red fluorescent protein, and these usually tend to have lower brightness and photostability. However, during the revision of our study, a new study from Prof. Goode's lab developed and optimized expression of new LifeAct-3xmNeonGreen constructs for use in S. cerevisiae (Wirshing and Goode, 2024). Thus, a similar strategy of using tandem copies of bright and photostable red fluorescent proteins can be explored for use in combination with mNG-Tpm in future studies.

      Complementation of tpm1∆ by Tpm2:

      I am confused about the quantification of Tpm2 expression by RT-PCR shown in Fig S3F. This figure shows that tpm2 mRNA expression levels are identical in cells with an empty plasmid or with a tpm2-encoding plasmid. In both strains (which lack tpm1), as well as in the WT control, one tpm2 copy is in the genome, but only one strain has a second tpm2 copy expressed from a centromeric plasmid, yet the results of the RT-PCR are not significantly different. (If anything, the levels are lower in the tpm2 plasmid-containing strain.) The methods state that the primers were chosen in the gene, so likely do not distinguish the genomic from the plasmid allele. However, the text claims a 1-fold increase in expression, and functional experiments show a near-complete rescue of the tpm1∆ phenotype. This is surprising and confusing and should be resolved to understand whether higher levels of Tpm2 are really the cause of the observed phenotypic rescue.

      The authors could for instance probe for protein levels. I believe they have specific nanobodies against tropomyosin. If not, they could use expression of functional mNG-amTpm2 to rescue tpm1∆. Here, the expression of the protein can be directly visualized.

      Response: We thank the reviewer for pointing this out. We would like to clarify that in our RT-qPCR experiments, the primers were chosen within the Tpm1 and Tpm2 gene and do not distinguish between transcripts from endogenous or plasmid copy. We have now mentioned this in the Materials and Methods section of the revised manuscript. So, they represent a relative estimate of the total mRNA of these genes present in cells. We were consistently able to detect ~19 fold increase in Tpm2 total mRNA levels as compared to wildtype and ∆tpm1 cells (Fig. S4D) when tpm2 was expressed from a high-copy plasmid (pRS425). This increase in Tpm2 mRNA levels was accompanied by a rescue in growth (Fig. S4A) and actin cable organization (Fig. S4B) of ∆tpm1 cells containing pRS425-ptpm2TPM2. When tpm2 was expressed from a low-copy number centromeric plasmid (pRS316), we detected a ~2 fold increase in Tpm2 transcript levels when using the tpm1 promoter and no significant change was detected when using tpm2 promoter (Fig. S4E). We have made sure that these results are accurately described in the revised manuscript.
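
      The response reports fold changes in total Tpm2 transcript but does not state how they were computed. As a hedged illustration only, assuming the widely used comparative-Ct calculation with a reference (housekeeping) gene and roughly 100% amplification efficiency (neither of which is confirmed by the text), the relative level would be:

```latex
\begin{align}
\Delta C_t &= C_t^{\text{TPM2}} - C_t^{\text{ref}} \\
\Delta\Delta C_t &= \Delta C_t^{\text{sample}} - \Delta C_t^{\text{wildtype}} \\
\text{fold change} &= 2^{-\Delta\Delta C_t}
\end{align}
```

      Under these assumptions, a ~19 fold increase corresponds to a ΔΔCt of about -4.25 (since 2^4.25 ≈ 19), and a ~2 fold increase to a ΔΔCt of -1. Because the primers sit within the gene, the Ct values reflect genomic and plasmid transcripts combined.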

      As per the reviewer's suggestion, we have now conducted a more extensive analysis to ascertain the expression levels of Tpm2 in our experiments and the data is now presented in new Figure S5. We used mNG-ASTpm1 and mNG-ASTpm2 to rescue growth of ∆tpm1 (Fig. S5A) and correlated growth rescue with protein levels using quantified fluorescence intensity (Fig. S5B, S5C) and western blotting (anti-mNG) (Fig. S5D, S5E). We find that ∆tpm1 cells containing pRS425-ptpm1mNG-ASTpm1 had the highest protein level, followed by pRS425-ptpm2 mNG-ASTpm2 and pRS305-ptpm1mNG-ASTpm1, and the lowest protein levels were found in ∆tpm1 cells containing pRS305-ptpm2 mNG-ASTpm2, in both fluorescence intensity and western blotting quantifications (Fig. S5C, S5E). Surprisingly, we were not able to detect any protein in ∆tpm1 cells containing pRS305-ptpm2 mNG-ASTpm2 with western blotting (Fig. S5D), which was also accompanied by a lack of growth rescue (Fig. S5A). This is most likely due to weak expression from the native Tpm2 promoter, which is consistent with previous literature (Drees et al., 1995). Taken together, this data clearly shows that the rescue observed in ∆tpm1 cells is due to increased expression of mNG-ASTpm2 in cells and supports our conclusion that an increase in Tpm2 expression leads to restoration of normal growth and actin cables in ∆tpm1 cells.

      Specific function of Tpm2:

      The data about the retrograde actin flow is interpreted as a specific function of Tpm2, but there is no evidence that Tpm1 does not also share this function. To reach this conclusion one would have to investigate retrograde actin flow in tpm1∆ (difficult as cables are weak) or for instance test whether Tpm1 expression restores normal retrograde flow to tpm2∆ cells.

      Response: We agree with the reviewer and, as per the reviewer's suggestion, we have performed another experiment which includes wildtype and ∆tpm2 cells containing empty pRS316 vector or pRS316-ptpm2TPM1 or pRS316-ptpm1TPM1. We find that RACF rate increased in ∆tpm2 cells as compared to wildtype and was restored to wildtype levels by exogenous expression of Tpm2 but not Tpm1 (Fig. S6E, S6F). Since actin cables were not detectable in ∆tpm1 cells, we measured RACF rates in ∆tpm1 cells expressing Tpm1 or Tpm2 from a plasmid copy, which restored actin cables as shown previously in Fig. 5A-C. We observed that RACF rates were similar to wildtype in ∆tpm1 cells expressing either Tpm1 or Tpm2 (Fig. S6E, S6F), suggesting that Tpm1 is not involved in RACF regulation. Taken together, these results suggest a specific role for Tpm2, but not Tpm1, in RACF regulation in S. cerevisiae, consistent with previous literature (Huckaba et al., 2006).

      Minor comments: 1. The growth of tpm1∆ with empty plasmid in Fig S3A is strangely strong (different from other figures).

      Response: We thank the reviewer for pointing this out. We have now repeated the drop test multiple times (Fig. R2), but we see similar growth rates as the drop test already presented in Fig. S4A. At this point, it would be difficult to ascertain the basis of this difference observed at 23°C and 30°C, but a recent study that links leucine levels to actin cable stability (Sing et al., 2022) might explain the faster growth of these ∆tpm1 cells containing a leu2 gene carrying high-copy plasmid. However, there is no effect on growth rate at 37°C which is consistent with other spot assays shown in Fig. S1D, S4F, S5A.

      Significance

      I am a cell biologist with expertise in both yeast and actin cytoskeleton.

      The question of how tropomyosin localizes to specific actin networks is still open and a current avenue of study. Studies in other organisms have shown that different tropomyosin isoforms, or their acetylated vs non-acetylated versions, localize to distinct actin structures. Proposed mechanisms include competition with other ABPs and preference imposed by the formin nucleator. The current study re-examines the function and localization of the two tropomyosin proteins from the budding yeast and reaches the conclusion that they co-decorate all formin-assembled structures and also share most functions, leading to the simple conclusion that the more important contribution of Tpm1 is simply linked to its higher expression. Once consolidated, the study will appeal to researchers working on the actin cytoskeleton.

      We thank the reviewer for their positive assessment of our work and the constructive feedback that has greatly improved the quality of our study. After addressing the points raised by the reviewer, we believe that our study has significantly gained in consolidating the major conclusions of our work.

      **Referees cross-commenting**

      Having read the other reviewers' comments, I do agree with reviewer 1 that it is not clear whether the Ala-Ser linker really mimics acetylation. I am less convinced than reviewer 3 that the key conclusions of the study are well supported, notably the issue of Tpm2 expression levels is not convincing to me.

      Response: We acknowledge the reviewer's point about the effect of Ala-Ser dipeptide and would request the reviewer to refer to our response to Reviewer 1 (Question 1) for a more detailed discussion on this. We have also extensively addressed the question of Tpm2 expression levels as suggested by the reviewer (new data in Figure S5) which has further strengthened the conclusions of our study.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The study presents the first fully functional fluorescently tagged Tpm proteins, enabling detailed probing of Tpm isoform localization and functions in live cells. The authors created a modified fusion protein, mNG-amTpm, which mimicked native N-terminal acetylation and restored both normal growth and full-length actin cables in yeast cells lacking native Tpm proteins, demonstrating the constructs' full functionality. They also show that Tpm1 and Tpm2 do not have a preference for actin cables nucleated by different formins (Bnr1 and Bni1). Contrary to previous reports, the study found that overexpressing Tpm2 in Δtpm1 cells could restore growth rates and actin cable formation. Furthermore, it is shown that despite its evolutionary divergence, Tpm2 retains actin-protective functions and can compensate for the loss of Tpm1, contributing to cellular robustness.

      Major and Minor Comments: 1. The key conclusions of this paper are convincing. However, I suggest that more detail be provided regarding the image analysis used in this study. Specifically, since threshold settings can impact the quality of the generated data and, therefore, its interpretation, it would be useful to see a representative example of the quantification methods used for actin cable length/number (as in refs. 80 and 81) and mitochondria morphology. These could be presented as Supplemental Figures. Additionally, it would help to interpret the results if the authors could be more specific about the statistical tests that were used.

      Response: We agree with the reviewer's suggestions and have now updated our Materials and Methods section to describe the image analysis pipelines in more detail. We have also added examples of the quantification procedure for actin cable length/number and mitochondrial morphology as an additional Supplementary Figure S7. Briefly, the following pipelines were used:

      • Actin cable length and number analysis: This was done exactly as described in McInally et al., 2021 and McInally et al., 2022. Actin cables were manually traced in Fiji as shown in Figure S7A, and the trace files for each cell were then run through a Python script (adapted from McInally et al., 2022) that outputs mean actin cable length and number per cell.
      • Mitochondria morphology: The Mitochondria Analyzer plug-in in Fiji was used to segment the mitochondrial fragments. The parameters for 2D segmentation of mitochondria were first optimized using "2D Threshold Optimize" to find the most accurate segmentation, and the same parameters were then applied to all images. After segmentation of the mitochondrial network, fragment number was measured using the "Analyze Particles" function in Fiji. An example of the overall process is shown in Figure S7B.

      As per the reviewer's suggestion, we have now included a description of the statistical test used in the legend of each figure in the revised manuscript. We have used One-Way ANOVA with Tukey's Multiple Comparison test, Kruskal-Wallis test with Dunn's Multiple Comparisons, and Unpaired Two-tailed t-test using the built-in functions in GraphPad Prism (v.6.04).

      **Referees cross-commenting**

      I agree with both reviewers 1 and 2 regarding the issues with the Ala-Ser acetylation mimic and Tpm2 expression levels, respectively. I think the authors should be more careful in how they frame the results, but I consider that these issues do not invalidate the main conclusions of this study.

      Response: We acknowledge the reviewer's concern about the Ala-Ser dipeptide and would request them to refer to our earlier discussion on this in response to Reviewer 1 (Question 1) and Reviewer 2 (Question 2). We would also request the reviewer to refer to our answer to Reviewer 2 (Question 6), where we have extensively addressed the question of Tpm2 expression levels and their effect on rescue of Δtpm1 cells. These data are now presented as new Figure S5 in our revised manuscript.

      Reviewer#3 (Significance (Required)):

      The finding that Tpm2 can compensate for the loss of Tpm1, restoring actin cable organization and normal growth rates, challenges previous assumptions about the non-redundant functions of these isoforms in Saccharomyces cerevisiae (ref. 16). It also supports a concentration-dependent and formin-independent localization of Tpm isoforms to actin cables in this species. The development of fully functional fluorescently tagged Tpm proteins is a significant methodological advancement. This advancement overcomes previous visualization challenges and allows for accurate in vivo studies of Tpm function and regulation in S. cerevisiae.

      The findings will be of particular interest to researchers in the field of cellular and molecular biology who study actin cytoskeleton dynamics. Additionally, it will be relevant for those utilizing advanced microscopy and live-cell imaging techniques.

      As a researcher, my experience lies in cytoskeleton dynamics and protein interactions, though I do not have specific experience related to tropomyosin. I use different yeast species as models and routinely employ live-cell imaging as a tool.

      We thank the reviewer for their positive outlook and assessment of our study. We have incorporated all their suggestions, and we are confident that the revised manuscript has significantly improved in quality due to these additions.

    1. In this article (a three-minute read), you’ll learn how to quickly embed a PDF in a web page using PDF.js, a popular open-source PDF viewer.

       1. Download and Extract the PDF.js Package
       2. Add the PDF viewer to your web page

       We will also use it as a full screen PDF viewer where we can pass in a PDF filename via URL query string. Try the full screen viewer now: Open Full Screen PDF.js Viewer

       Step 1 - Download and Extract the PDF.js Package

       Let’s head over to GitHub to download the latest stable release and then extract the contents inside our website folder.

       Here are the contents of the .zip:

           ├── build/
           │   ├── pdf.js
           │   └── ...
           ├── web/
           │   ├── viewer.css
           │   ├── viewer.html
           │   └── ...
           └── LICENSE

       After extracting the .zip contents, our website folder could look something like this:

           ├── index.html
           ├── subpage.html
           ├── assets/
           │   ├── pdf/
           │   │   ├── my-pdf-file.pdf
           │   │   ├── my-other-pdf-file.pdf
           │   │   ├── ...
           ├── build/ - PDF.js files
           │   ├── pdf.js
           │   ├── ...
           ├── web/ - PDF.js files
           │   ├── viewer.css
           │   ├── viewer.html
           │   ├── ...
           └── LICENSE - PDF.js license

       Note: Due to browser security restrictions, PDF.js cannot open local PDFs using a file:// URL. You will need to start a local web server or upload the files to your web server.

       Step 2 - Embed the PDF Viewer in Website

       Our last step will be to embed the viewer in our web page by using an <iframe>. We will use a query string to tell the PDF viewer which PDF file to open. Like this:

           <!DOCTYPE html>
           <html>
             <head>
               <title>Hello world!</title>
             </head>
             <body style="font-family: Arial, Helvetica, sans-serif">
               <div style="width: 760px">
                 <h2>About Us</h2>
                 <p>We help software developers do more with PDFs. PDF.js Express gives a flexible and modern UI to your PDF.js viewer while also adding out-of-the-box features like annotations, form filling and signatures.</p>
                 <!-- Place the following <div> element where you want the PDF to be displayed in your website.
                      You can adjust the size using the width and height attributes. -->
                 <div>
                   <iframe id="pdf-js-viewer" src="/web/viewer.html?file=%2Fassets%2Fpdf%2Fmy-pdf-file.pdf" title="webviewer" frameborder="0" width="500" height="600"></iframe>
                 </div>
               </div>
             </body>
           </html>

       You’re done!

       If you’d like to load PDF files from a different domain name, you will need to ensure the server hosting the PDFs has been set up for CORS.

       Full Screen PDF Viewer

       In addition to embedding the viewer in a page, we can also open it in a full screen: Open Full Screen PDF.js Viewer

       Here’s the code:

           <a href="/web/viewer.html?file=%2Fmy-pdf-file.pdf">Open Full Screen PDF.js Viewer</a>

       Just change the file query string parameter to open whatever PDF you wish to open.

       Customizing the PDF.js Toolbar

       We can also reorganize the toolbar by moving elements around, removing buttons, and changing the icons.

       Let’s open public/lib/web/viewer.html and add the following to the <head> section:

           <script src="customToolbar.js"></script>

       Next, we’ll create customToolbar.js inside the public/lib/web folder and add the following code:

           // Create an empty <style> element and keep a handle on its stylesheet.
           let sheet = (function() {
             let style = document.createElement("style");
             style.appendChild(document.createTextNode(""));
             document.head.appendChild(style);
             return style.sheet;
           })();

           function editToolBar() {
             // When the page is resized, the viewer hides and moves some buttons around.
             // This call forcibly shows all buttons so none of them disappear or re-appear on page resize.
             removeGrowRules();

             /* Reorganizing the UI:
                the 'addElemFromSecondaryToPrimary' function moves items from the secondary nav into the primary nav.
                There are 3 primary nav regions (toolbarViewerLeft, toolbarViewerMiddle, toolbarViewerRight). */

             // Adding elements to the left part of the toolbar
             addElemFromSecondaryToPrimary('pageRotateCcw', 'toolbarViewerLeft');
             addElemFromSecondaryToPrimary('pageRotateCw', 'toolbarViewerLeft');
             addElemFromSecondaryToPrimary('zoomIn', 'toolbarViewerLeft');
             addElemFromSecondaryToPrimary('zoomOut', 'toolbarViewerLeft');

             // Adding elements to the middle part of the toolbar
             addElemFromSecondaryToPrimary('previous', 'toolbarViewerMiddle');
             addElemFromSecondaryToPrimary('pageNumber', 'toolbarViewerMiddle');
             addElemFromSecondaryToPrimary('numPages', 'toolbarViewerMiddle');
             addElemFromSecondaryToPrimary('next', 'toolbarViewerMiddle');

             // Adding elements to the right part of the toolbar
             addElemFromSecondaryToPrimary('secondaryOpenFile', 'toolbarViewerRight');

             /* Changing icons */
             changeIcon('previous', 'icons/baseline-navigate_before-24px.svg');
             changeIcon('next', 'icons/baseline-navigate_next-24px.svg');
             changeIcon('pageRotateCcw', 'icons/baseline-rotate_left-24px.svg');
             changeIcon('pageRotateCw', 'icons/baseline-rotate_right-24px.svg');
             changeIcon('viewFind', 'icons/baseline-search-24px.svg');
             changeIcon('zoomOut', 'icons/baseline-zoom_out-24px.svg');
             changeIcon('zoomIn', 'icons/baseline-zoom_in-24px.svg');
             changeIcon('sidebarToggle', 'icons/baseline-toc-24px.svg');
             changeIcon('secondaryOpenFile', './icons/baseline-open_in_browser-24px.svg');

             /* Hiding elements */
             removeElement('secondaryToolbarToggle');
             removeElement('scaleSelectContainer');
             removeElement('presentationMode');
             removeElement('openFile');
             removeElement('print');
             removeElement('download');
             removeElement('viewBookmark');
           }

           // Build a CSS selector for the button and override its ::before icon.
           function changeIcon(elemID, iconUrl) {
             let element = document.getElementById(elemID);
             let classNames = element.className;
             classNames = elemID.includes('Toggle') ? 'toolbarButton#' + elemID : classNames.split(' ').join('.');
             classNames = elemID.includes('view') ? '#' + elemID + '.toolbarButton' : '.' + classNames;
             classNames += "::before";
             addCSSRule(sheet, classNames, `content: url(${iconUrl}) !important`, 0);
           }

           // Move an element into one of the primary toolbar regions.
           function addElemFromSecondaryToPrimary(elemID, parentID) {
             let element = document.getElementById(elemID);
             let parent = document.getElementById(parentID);
             element.style.minWidth = "0px";
             element.innerHTML = '';
             parent.append(element);
           }

           function removeElement(elemID) {
             let element = document.getElementById(elemID);
             element.parentNode.removeChild(element);
           }

           // Override the viewer's responsive show/hide classes.
           function removeGrowRules() {
             addCSSRule(sheet, '.hiddenSmallView *', 'display:block !important');
             addCSSRule(sheet, '.hiddenMediumView', 'display:block !important');
             addCSSRule(sheet, '.hiddenLargeView', 'display:block !important');
             addCSSRule(sheet, '.visibleSmallView', 'display:block !important');
             addCSSRule(sheet, '.visibleMediumView', 'display:block !important');
             addCSSRule(sheet, '.visibleLargeView', 'display:block !important');
           }

           function addCSSRule(sheet, selector, rules, index) {
             if ("insertRule" in sheet) {
               sheet.insertRule(selector + "{" + rules + "}", index);
             } else if ("addRule" in sheet) {
               sheet.addRule(selector, rules, index);
             }
           }

           window.onload = editToolBar;

       The PDF.js primary toolbar is broken down into 3 regions (left, middle, and right). The secondary toolbar is accessed via the chevron icon in the right region.

       We can move elements from the secondary toolbar into the left, middle, or right regions of the primary toolbar with the addElemFromSecondaryToPrimary function in customToolbar.js. For example, this line moves the counter-clockwise rotation tool to the left region of the primary toolbar:

           addElemFromSecondaryToPrimary('pageRotateCcw', 'toolbarViewerLeft')

       If you wanted to move pageRotateCcw to the middle region instead, you’d replace toolbarViewerLeft with toolbarViewerMiddle, or toolbarViewerRight for the right region. To move a different tool, replace the pageRotateCcw ID with the element ID you want to move. (See below for a full list of element IDs.)

       We can also hide elements like this:

           removeElement('print')
           removeElement('download')

       To hide different elements, replace print or download with the element ID.

       NOTE: Hiding the download and print buttons is not a bulletproof way to protect our PDF, because it’s still possible to look at the source code to find the file. It just makes it a bit harder.
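      The `file` query string above has to be percent-encoded, which is easy to get wrong by hand. A minimal sketch of building the viewer URL with the standard `encodeURIComponent` function follows. The helper name `buildViewerUrl` is hypothetical and is not part of PDF.js; the default viewer path is taken from the example above.

      ```javascript
      // Hypothetical helper (not part of PDF.js): builds the viewer URL for a PDF path.
      // encodeURIComponent turns "/" into "%2F", so the path survives as one query value.
      function buildViewerUrl(pdfPath, viewerPath = "/web/viewer.html") {
        return viewerPath + "?file=" + encodeURIComponent(pdfPath);
      }

      const viewerUrl = buildViewerUrl("/assets/pdf/my-pdf-file.pdf");
      console.log(viewerUrl);
      // → /web/viewer.html?file=%2Fassets%2Fpdf%2Fmy-pdf-file.pdf
      ```

      In a page, the result could be assigned to the iframe, for example `document.getElementById("pdf-js-viewer").src = viewerUrl;`.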

      for - prevent download of pdf

    1. One possibility to explain the negative effects is that young children have reduced interactions with adults while watching television. This point seems important, as interactions are known to be the core format for language development in young children

      this is where the issues i have experienced come in. kids cannot communicate healthily with others because they are so used to having their head buried in a device that they cannot make eye contact or hold normal conversations with normal vocabulary because they never have interacted with adults, or have had very limited interactions

    1. unlike the classical fairy tale, in which the wolf puts on the grandmother's clothes, here, the grandmother sees the wolf with Pussy's scarlet cloak on his head and mistakes the thief for her granddaughter

      Seems like a turn around with deception being within the grandmother rather than Little Red Riding Hood

    Annotators

  11. readium.firebaseapp.com readium.firebaseapp.com
    1. here is an artist. He desires to paint you the dreamiest, shadiest, quietest, most enchanting bit of romantic landscape in all the valley of the Saco. What is the chief element he employs? There stand his trees, each with a hollow trunk, as if a hermit and a crucifix were within; and here sleeps his meadow, and there sleep his cattle; and up from yonder cottage goes a sleepy smoke. Deep into distant woodlands winds a mazy way, reaching to overlapping spurs of mountains bathed in their hill-side blue. But though the picture lies thus tranced, and though this pine-tree shakes down its sighs like leaves upon this shepherd’s head, yet all were vain, unless the shepherd’s eye were fixed upon the magic stream before him. Go visit the Prairies in June, when for scores on score

      test

    1. Reviewer #2 (Public review):

      This study examined the role of CRF neurons in the BNST in both phasic and sustained fear in males and females. The authors first established a differential fear paradigm whereby shocks were consistently paired with tones (Full) or only paired with tones 50% of the time (Part), or controls who were exposed to only tones with no shocks. Recall tests established that both Full and Part conditioned male and female mice froze to the tones, with no difference between the paradigms. Additional studies using the NSF and startle tests established that neither fear paradigm produced behavioral changes in the NSF test, suggesting that these fear paradigms do not result in an increase in anxiety-like behavior. Part fear conditioning, but not Full, did enhance startle responses in males but not females, suggesting that this fear paradigm did produce sustained increases in hypervigilance in males exclusively. Photometry studies found that while undifferentiated BNST neurons all responded to shock itself, only Full conditioning in males led to a progressive enhancement of the magnitude of this response. BNST neurons in males, but not females, were also responsive to tone onset in both fear paradigms, but only in Full fear did the magnitude of this response increase across training. Knockdown of CRF from the BNST had no effect on fear learning in males or females, nor any effect in males on fear recall in either paradigm, but in females enhanced both baseline and tone-induced freezing only in the Part fear group. When looking at anxiety following fear training, it was found in males that CRF knockdown modulated anxiety in Part fear trained animals and amplified startle in Full trained males but had no effect in either test in females.
Using 1P imaging, it was found that CRF neurons in the BNST generally decline in activity across both conditioning and recall trials, with some subtle sex differences emerging in the Part fear trained animals in that in females BNST CRF neurons were inhibited after both shock and omission trials but in males this only occurred after shock and not omission trials. In recall trials, CRF BNST neuron activity remained higher in Part conditioned mice relative to Full conditioned mice.

      Overall, this is a very detailed and complex study that incorporates both differing fear training paradigms and males and females, as well as a suite of both state-of-the-art imaging techniques and gene knockdown approaches to isolate the role and contributions of CRF neurons in the BNST to these behavioral phenomena. The strengths of this study come from the thorough approach that the authors have taken, which in turn helped to elucidate nuanced and sex specific roles of these neurons in the BNST to differing aspects of phasic and sustained fear. More so, the methods employed provide a strong degree of cellular resolution for CRF neurons in the BNST. In general, the conclusions appropriately follow the data, although the authors do tend to minimize some of the inconsistencies across studies, although this has now been addressed to some degree. The discussion has also been improved to now address some of the inconsistencies in the data head on. Discussion of a few other points is below:

      - Given the focus on CRF neurons in the BNST, it was unclear why the photometry studies were performed in undifferentiated BNST neurons as opposed to CRF neurons specifically, although the authors have now explained this in better depth making this clearer to the reader.

      - The CRF KD studies are interesting, but it remains speculative as to whether these effects are mediated locally in the BNST or due to CRF signaling at downstream targets. As the literature on local pharmacological manipulation of CRF signaling within the BNST seems to be largely performed in males, the addition of pharmacological studies here would benefit this to help to resolve if these changes are indeed mediated by local impairments in CRF release within the BNST or not. While it is not essential to add these experiments, the authors have addressed this point in the discussion and highlighted studies like this as necessary in future work.

      - The authors have addressed the difference between arousal and anxiety by expanding the discussion to include more focus on the behavioral measures. The CRF KD data are still somewhat confusing but better contextualized now. Overall, the manuscript has been improved by the revisions and edits the authors have made.

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      Summary:

      In this study, Osiurak and colleagues investigate the neurocognitive basis of technical reasoning. They use multiple tasks from two neuroimaging studies and overlap analysis to show that the area PF is central for reasoning, and plays an essential role in tool-use and non-tool-use physical problem-solving, as well as both conditions of mentalizing task. They also demonstrate the specificity of the technical reasoning and find that the area PF is not involved in the fluid-cognition task or the mentalizing network (INT+PHYS vs. PHYS-only). This work suggests an understanding of the neurocognitive basis of technical reasoning that supports advanced technologies.

      Strengths:

      -The topic this study focuses on is intriguing and can help us understand the neurocognitive processes involved in technical reasoning and advanced technologies.

      -The researchers obtained fMRI data from multiple tasks. The data is rich and encompasses the mechanical problem-solving task, psychotechnical task, fluid-cognition task, and mentalizing task.

      -The article is well written.

      We sincerely thank Reviewer 1 for their positive and very helpful comments, which helped us improve the MS. Thank you.

      Weaknesses:

      - Limitations of the overlap analysis method: there are multiple reasons why two tasks might activate the same brain regions. For instance, the two tasks might share cognitive mechanisms, the activated regions of the two tasks might be adjacent but not overlapping at finer resolutions, or the tasks might recruit the same regions for different cognition functions.

      Thus, although overlap analysis can provide valuable information, it also has limitations.

      Further analyses that capture the common cognitive components of activation across different tasks are warranted, such as correlating the activation across different tasks within subjects for a region of interest (i.e. the PF).

      We thank Reviewer 1 for this comment. We added new analyses to address the two alternative interpretations stressed here by Reviewer 1, namely, the same-region-but-different-function interpretation and the adjacency interpretation. The new analyses ruled out both alternative interpretations, thereby reinforcing our interpretation.

      “The conjunction analysis reported was subject to at least two key limitations that needed to be overcome to assure a correct interpretation of our findings. The first was that the tasks could recruit the same regions for different cognition functions (same-region-but-different-function interpretation). The second was that the activated regions of the different tasks could be adjacent but did not overlap at finer resolutions (adjacency interpretation). We tested the same-region-but-different-function interpretation by conducting additional ROI analyses, which consisted of correlating the specific activation of the left area PF (i.e., difference in terms of mean Blood-Oxygen Level Dependent [BOLD] parameter estimates between the experimental condition minus the control condition) in the psychotechnical task, the fluid-cognition task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task. This analysis did not include the mechanical problem-solving task because the sample of participants was not the same for this task. As shown in Fig. 5, we found significant correlations between all the tasks that were hypothesized as recruiting technical reasoning, i.e., the psychotechnical task and the PHYS-Only and INT+PHYS conditions of the mentalizing task (all p < .05). By contrast, no significant correlation was obtained between these three tasks and the fluid-cognition task (all p > .15). This finding invalidates the same-region-but-different-function interpretation by revealing a coherent pattern in the activation of the left area PF in situations in which participants were supposed to reason technically. We examined the adjacency interpretation by analysing the specific locations of individual peak activations within the left area PF ROI for the mechanical problem-solving task, the psychotechnical task, the fluid-cognition task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task.
These peaks, which corresponded to the maximum value of activation obtained for each participant within the left area PF ROI, are reported in Fig. 6. As can be seen, the peaks of the fluid-cognition task were located more anteriorly, in the left area PFt (Parietal Ft) and the postcentral cortex, compared to the peaks of the other four tasks, which were more posterior, in the left area PF. Statistical analyses based on the y coordinates of the individual activation peaks confirmed this description (Fig. 6). Indeed, the y coordinates of the peaks of the mechanical problem-solving task, the psychotechnical task and the PHYS-Only and INT+PHYS conditions of the mentalizing task were posterior to the y coordinates of the peaks of the fluid-cognition task (all p < .05), whereas no significant differences were reported between the four tasks (all p > .05). These findings speak against the adjacency interpretation by revealing that participants recruited the same part of the left area PF to perform tasks involving technical reasoning.” (p. 11-13)
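      As a rough illustration of the ROI correlation analysis described in the quoted passage, the sketch below computes a correlation between per-participant activation differences (experimental minus control) in two tasks. The use of Pearson's r, the function name `pearson`, and the numbers are assumptions made purely for illustration; the response does not state which coefficient or software was used, and the values are not data from the study.

      ```javascript
      // Pearson correlation between two equally long arrays of per-participant values.
      // Assumption: Pearson's r is used here only as an example coefficient.
      function pearson(x, y) {
        if (x.length !== y.length || x.length < 2) {
          throw new Error("inputs must have the same length (at least 2)");
        }
        const n = x.length;
        const mean = (a) => a.reduce((sum, v) => sum + v, 0) / n;
        const mx = mean(x);
        const my = mean(y);
        let sxy = 0, sxx = 0, syy = 0;
        for (let i = 0; i < n; i++) {
          const dx = x[i] - mx;
          const dy = y[i] - my;
          sxy += dx * dy;
          sxx += dx * dx;
          syy += dy * dy;
        }
        if (sxx === 0 || syy === 0) return NaN; // undefined when one variable is constant
        return sxy / Math.sqrt(sxx * syy);
      }

      // Made-up activation differences for four hypothetical participants in two tasks.
      const taskA = [1, 2, 3, 4];
      const taskB = [2, 4, 6, 8];
      console.log(pearson(taskA, taskB));
      ```

      A significance test on the coefficient, as reported in the passage, would be a separate step that this sketch does not cover.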

      Control tasks may be inadequate: the tasks may involve other factors, such as motor/action-related information. For the psychotechnical task, fluid-cognition task, and mentalizing task, the experiment tasks need not only care about technical-cognition information but also motor-related information, whereas the control tasks do not need to consider motor-related information (mainly visual shape information). Additionally, there may be no difference in motor-related information between the conditions of the fluid-cognition task. Therefore, the regions of interest may be sensitive to motor-related information, affecting the research conclusion.

      We thank Reviewer 1 for this comment. We added a specific section in the discussion that addresses this limitation.

      “The second limitation concerns the alternative interpretation that the left area PF is not central to technical reasoning but to the storage of sensorimotor programs about the prototypical manipulation of common tools. Here we show that the left area PF is recruited even in situations in which participants do not have to process common manipulable tools. For instance, some items of the psychotechnical task consisted of pictures of tractor, boat, pulley, or cannon. The fact that we found a common activation of the left area PF in such tasks as well as in the mechanical problem-solving task, in which participants could nevertheless simulate the motor actions of manipulating novel tools, indicates that this brain area is not central to tool manipulation but to physical understanding. That being said, some may suggest that viewing a boat or a cannon is enough to incite the simulation of motor actions, so our tasks were not equipped to distinguish between the manipulation-based approach and the reasoning-based approach. We have already shown that the left area PF is more involved in tasks that focus on the mechanical dimension of the tool-use action (e.g., the mechanical interaction between a tool and an object) than its motor dimension (i.e., the interaction between the tool and the effector [e.g., 24, 40]). Nevertheless, we recognize that future research is still needed to test the predictions derived from these two approaches.” (p. 18-19)

      -Negative results require further validation: the cognitive results for the fluid-cognition task in the study may need more refinement. For instance, when performing ROI analysis, are there any differences between the conditions? Bayesian statistics might also be helpful to account for the negative results.

      We agree that our negative results required further validation. We conducted the ROI analyses suggested by Reviewer 1, which confirmed the initial whole-brain analyses.

      “Region of interest (ROI) results. We conducted additional analyses to test the robustness of our findings. One of our results was that we did not report any specific activation of the left area PF in the fluid-cognition task contrary to the mechanical problem-solving task, the psychotechnical task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task. However, this negative result needed exploration at the ROI level. Therefore, we created a spherical ROI of the left area PF with a radius of 12 mm in the MNI standard space (–59; –31; 40). This ROI was literature-defined to ensure the independence of its selection (40). ROI results are shown in Fig. 4. The analyses confirmed the results obtained with the whole-brain analyses by indicating a greater activation of the left area PF in the mechanical problem-solving task, the psychotechnical task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task (all p < .001), but not in the fluid-cognition task (p = .35).” (p. 10-11)
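      The spherical ROI described in the quoted passage amounts to a distance check around a centre coordinate. The sketch below uses the centre (–59, –31, 40) and the 12 mm radius given in the text; the function name `inSphericalRoi` and the choice to count points exactly on the boundary as inside are assumptions, and an actual analysis would build the mask on the voxel grid with a neuroimaging toolbox.

      ```javascript
      // Centre and radius as stated in the quoted passage (MNI space, millimetres).
      const ROI_CENTER = [-59, -31, 40];
      const ROI_RADIUS_MM = 12;

      // Returns true when an MNI coordinate [x, y, z] lies within the sphere.
      // Assumption: points exactly on the boundary count as inside.
      function inSphericalRoi(point, center = ROI_CENTER, radius = ROI_RADIUS_MM) {
        const squaredDistance = point.reduce(
          (sum, value, i) => sum + (value - center[i]) ** 2,
          0
        );
        return squaredDistance <= radius * radius;
      }

      console.log(inSphericalRoi([-59, -31, 40])); // the centre itself
      console.log(inSphericalRoi([-59, -31, 53])); // 13 mm above the centre
      ```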

      Reviewer #1 (Recommendations For The Authors):

      (1) I may not fully grasp some of the arguments. In the abstract, what does the term "intermediate-level" mean, and why is it an intermediate-level state? In the sentence "the existence of a specific cognitive module in the human brain dedicated to materiality", I cannot see a clear link between technical cognition and the word "materiality".

      We used the term materiality to refer to a potential human trait that allows us to shape the physical world according to our ends, by using and making tools and transmitting them to others. This is a reference to Allen et al. (2020; PNAS): “We hope this empirical domain and modeling framework can provide the foundations for future research on this quintessentially human trait: using, making, and reasoning about tools and more generally shaping the physical world to our ends” (p. 29309). Scientists (including archaeologists, economists, psychologists, neuroscientists) interested in human materiality have tended to focus on how we manipulate things according to our thought (motor cognition) or how we conceptualize our behaviour to transmit it to others (language, social cognition). However, little has been said on the intermediate level, that is, technical cognition. We added the term “technical cognition” here, which should help to make the connection more quickly.

      “Yet, little has been said about the intermediate-level cognitive processes that are directly involved in mastering this materiality, that is, technical cognition.” (p. 2)

      (2) The introduction could provide more details on why the issue of "generalizability and specificity" is important to address, to clarify the significance of the research question.

      We followed this comment and added a sentence to explain why it is important to address this research question. Again, we thank Reviewer 1 for their helpful comments.

      “Here we focus on two key aspects of the technical-reasoning hypothesis that remain to be addressed: Generalizability and specificity. If technical reasoning is a specific form of reasoning oriented towards the physical world, then it should be implicated in all (the generalizability question) and only (the specificity question) the situations in which we need to think about the physical properties of our world.” (p. 5)

      Reviewer #2 (Public Review):

      Summary:

      The goal of this project was to test the hypothesis that a common neuroanatomic substrate in the left inferior parietal lobule (area PF) underlies reasoning about the physical properties of actions and objects. Four functional MRI (fMRI) experiments were created to test this hypothesis. Group contrast maps were then obtained for each task, and overlap among the tasks was computed at the voxel level. The principal finding is that the left PF exhibited differentially greater BOLD response in tasks requiring participants to reason about the physical properties of actions and objects (referred to as technical reasoning). In contrast, there was no differential BOLD response in the left PF when participants engaged in an fMRI variant of the Raven's progressive matrices to assess fluid cognition.

      Strengths:

      This is a well-written manuscript that builds from extensive prior work from this group mapping the brain areas and cognitive mechanisms underlying object manipulation, technical reasoning, and problem-solving. Major strengths of this manuscript include the use of control conditions to demonstrate there are differentially greater BOLD responses in area PF over and above the baseline condition of each task. Another strength is the demonstration that area PF is not responsive in tasks assessing fluid cognition - e.g., it may just be that PF responds to a greater extent in a harder condition relative to an easy condition of a task. The analysis of data from Task 3 rules out this alternative interpretation. The methods and analysis are sufficiently written for others to replicate the study, and the materials and code for data analysis are publicly available.

      We sincerely thank Reviewer 2 for their valuable comments, which helped us improve the MS.

      Weaknesses:

      The first weakness is that the conclusions of the manuscript rely on there being overlap among group-level contrast maps presented in Figure 2. The problem with this conclusion is that different participants engaged in different tasks. Never is an analysis performed to demonstrate that the PF region identified in e.g., participant 1 in Task 2 is the same PF region identified in Participant 1 in Task 4.

      We added new analyses that demonstrated that “the PF region identified in e.g., participant 1 in Task 2 is the same PF region identified in Participant 1 in Task 4”. We thank Reviewer 2 for this comment, because these new analyses reinforced our interpretation.

      “The conjunction analysis reported was subject to at least two key limitations that needed to be overcome to assure a correct interpretation of our findings. The first was that the tasks could recruit the same regions for different cognition functions (same-region-but-different-function interpretation). The second was that the activated regions of the different tasks could be adjacent but did not overlap at finer resolutions (adjacency interpretation). We tested the same-region-but-different-function interpretation by conducting additional ROI analyses, which consisted of correlating the specific activation of the left area PF (i.e., difference in terms of mean Blood-Oxygen Level Dependent [BOLD] parameter estimates between the experimental condition minus the control condition) in the psychotechnical task, the fluid-cognition task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task. This analysis did not include the mechanical problem-solving task because the sample of participants was not the same for this task. As shown in Fig. 5, we found significant correlations between all the tasks that were hypothesized as recruiting technical reasoning, i.e., the psychotechnical task and the PHYS-Only and INT+PHYS conditions of the mentalizing task (all p < .05). By contrast, no significant correlation was obtained between these three tasks and the fluid-cognition task (all p > .15). This finding invalidates the same-region-but-different-function interpretation by revealing a coherent pattern in the activation of the left area PF in situations in which participants were supposed to reason technically. We examined the adjacency interpretation by analysing the specific locations of individual peak activations within the left area PF ROI for the mechanical problem-solving task, the psychotechnical task, the fluid-cognition task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task. These peaks, which corresponded to the maximum value of activation obtained for each participant within the left area PF ROI, are reported in Fig. 6. As can be seen, the peaks of the fluid-cognition task were located more anteriorly, in the left area PFt (Parietal Ft) and the postcentral cortex, compared to the peaks of the other four tasks, which were more posterior, in the left area PF. Statistical analyses based on the y coordinates of the individual activation peaks confirmed this description (Fig. 6). Indeed, the y coordinates of the peaks of the mechanical problem-solving task, the psychotechnical task and the PHYS-Only and INT+PHYS conditions of the mentalizing task were posterior to the y coordinates of the peaks of the fluid-cognition task (all p < .05), whereas no significant differences were reported between the four tasks (all p > .05). These findings speak against the adjacency interpretation by revealing that participants recruited the same part of the left area PF to perform tasks involving technical reasoning.” (p. 11-13)
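
      To make the logic of the ROI correlation analysis concrete, the sketch below computes a per-participant specific activation (experimental minus control parameter estimate) and correlates it across two tasks. It is an illustration only, not the analysis code used in the study: the function names `roi_contrast` and `pearson_r` are hypothetical, and the numbers in the usage lines are invented toy values.

```python
import math

def roi_contrast(experimental, control):
    """Per-participant specific activation: experimental minus control estimate."""
    return [e - c for e, c in zip(experimental, control)]

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy values (hypothetical): mean ROI estimates for four participants in two tasks.
task_a = roi_contrast([1.2, 0.9, 1.5, 0.4], [0.3, 0.2, 0.5, 0.1])
task_b = roi_contrast([0.8, 0.7, 1.1, 0.2], [0.1, 0.2, 0.2, 0.0])
r_ab = pearson_r(task_a, task_b)
```

      Under the same-region-but-different-function interpretation, one would expect no systematic relation between the two lists, whereas a shared process predicts a positive correlation across participants.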

      A second weakness is that there is variance in accuracy between tasks that is not addressed. It is clear from the plots in the supplemental materials that some participants score below chance (~ 50%). This means that half (or more) of the fMRI trials of some participants are incorrect. The methods section does not mention how inaccurate trials were handled. Moreover, if 50% is chance, it suggests that some participants did not understand task instructions and were systematically selecting the incorrect item.

      It is true that the experimental conditions were more difficult than the control conditions, with some participants who performed at or below 50% in the experimental conditions. We added a section in the MS to stress this aspect. To examine whether this potential difficulty effect biased our interpretation, we conducted new ROI analyses by removing all the participants who performed at or below the chance level. These analyses revealed the same results as when no participant was excluded, suggesting that this did not bias our interpretation.

      “As mentioned above, the experimental conditions of all the tasks were more difficult than their control conditions. As a result, the specific activation of the left area PF documented above could simply reflect that this area responds to a greater extent in a harder condition relative to an easy condition of a task. This interpretation is nevertheless ruled out by the results obtained with the fluid-cognition task. We did not report a specific activation of the left area PF in this task while its experimental condition was more difficult than its control condition. To test more directly this effect of difficulty, we conducted new ROI analyses by removing all the participants who performed at or below 50% (Fig. S2). These new analyses replicated the initial analyses by showing a greater activation of the left area PF in the mechanical problem-solving task, the psychotechnical task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task (all p < .001), but not in the fluid-cognition task (p = .48). In sum, the ROI analyses corroborated the whole-brain analyses and ruled out the potential effect of difficulty.” (p. 11)
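
      The exclusion step described in this passage can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the 0.5 chance level comes from the text, but the function names `keep_above_chance` and `one_sample_t` and the choice of a one-sample t statistic on the ROI contrast are hypothetical.

```python
import math

CHANCE_LEVEL = 0.5  # chance accuracy stated in the text (50%)

def keep_above_chance(accuracy, contrast, chance=CHANCE_LEVEL):
    """Keep the ROI contrast of participants scoring strictly above chance."""
    return [c for a, c in zip(accuracy, contrast) if a > chance]

def one_sample_t(values):
    """One-sample t statistic testing whether the mean differs from zero."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(variance / n)

# Toy values (hypothetical): accuracy and ROI contrast for five participants.
accuracy = [0.45, 0.50, 0.70, 0.85, 0.90]
contrast = [0.2, 0.1, 0.6, 0.9, 0.7]
retained = keep_above_chance(accuracy, contrast)  # drops the first two participants
t_retained = one_sample_t(retained)
```

      Comparing the statistic with and without exclusion is one simple way to check that participants at or below chance do not drive the effect.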

      A third weakness is related to the fluid cognition task. In the fMRI task developed here, the participant must press a left or right button to select between 2 rows of 3 stimuli while only one of the 3 stimuli is the correct target. This means that within a 10-second window, the participant must identify the pattern in the 3x3 grid and then separately discriminate among 6 possible shapes to find the matching stimulus. This is a hard task that is qualitatively different from the other tasks in terms of the content being manipulated and the time constraints.

      We acknowledge that the fluid-cognition task involved a design that differed from the other tasks. However, this was also true for the other tasks, as the design also differed between the mechanical problem-solving task, the psychotechnical task, and the mentalizing task. Nevertheless, despite these distinctions, we found a consistent activation of the left area PF in these tasks with different designs including in the psychotechnical task, which seemed as difficult as the fluid-cognition task.

      “Region of interest (ROI) results. We conducted additional analyses to test the robustness of our findings. One of our results was that we did not report any specific activation of the left area PF in the fluid-cognition task contrary to the mechanical problem-solving task, the psychotechnical task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task. However, this negative result needed exploration at the ROI level. Therefore, we created a spherical ROI of the left area PF with a radius of 12 mm in the MNI standard space (–59; –31; 40). This ROI was literature-defined to ensure the independence of its selection (40). ROI results are shown in Fig. 4. The analyses confirmed the results obtained with the whole-brain analyses by indicating a greater activation of the left area PF in the mechanical problem-solving task, the psychotechnical task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task (all p < .001), but not in the fluid-cognition task (p = .35).” (p. 10-11)
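
      For illustration, a spherical ROI of this kind can be sketched as the set of voxel centres lying within 12 mm of the centre coordinate. This is a minimal sketch, not the code used in the study: the 2 mm isotropic voxel grid, the assumption that the centre coordinate falls exactly on a voxel centre, and the function name `sphere_roi_mm` are all hypothetical.

```python
import itertools
import math

def sphere_roi_mm(center_mm, radius_mm=12.0, voxel_mm=2.0):
    """Return voxel centres (in mm) within radius_mm of center_mm.

    Assumes an isotropic grid whose voxel centres are offset from the
    centre coordinate by whole multiples of voxel_mm (hypothetical).
    """
    cx, cy, cz = center_mm
    n_steps = int(math.floor(radius_mm / voxel_mm))
    offsets = [i * voxel_mm for i in range(-n_steps, n_steps + 1)]
    roi = []
    for dx, dy, dz in itertools.product(offsets, repeat=3):
        # Keep the voxel if its centre lies inside (or on) the sphere.
        if dx * dx + dy * dy + dz * dz <= radius_mm ** 2:
            roi.append((cx + dx, cy + dy, cz + dz))
    return roi

# Centre coordinate and radius as reported in the text (MNI space, mm).
left_pf_roi = sphere_roi_mm((-59.0, -31.0, 40.0), radius_mm=12.0)
```

      Averaging the parameter estimates over the voxels in such a mask gives one value per participant and condition, which is the quantity compared in the ROI analyses.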

      In sum, this is an interesting study that tests a neuro-cognitive model whereby the left PF forms a key node in a network of brain regions supporting technical reasoning for tool and non-tool-based tasks. Localizing area PF at the level of single participants and managing variance in accuracy is critically important before testing the proposed hypotheses.

      We thank Reviewer 2 for this positive evaluation and their suggestions. As detailed in our response, our revision took into consideration both the localization of the left area PF at the level of single participants and the variance in accuracy. 

      Reviewer #2 (Recommendations For The Authors):

      Did the fMRI data undergo high-pass temporal filtering prior to modeling the effects of interest? Participants engaged in a long (17-24 minutes) run of fMRI data collection. High-pass filtering of the data is critically important when managing temporal autocorrelation in the fMRI response (e.g., see Shinn et al., 2023, Functional brain networks reflect spatial and temporal autocorrelation. Nature Neuroscience).

      Yes. We added this information.

      “Regressors of non-interest resulting from 3D head motion estimation (x, y, z translation and three axes of rotation) and a set of cosine regressors for high-pass filtering were added to the design matrix.” (p. 25-26)
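
      As an illustration of what a set of cosine regressors for high-pass filtering looks like, the sketch below builds a discrete-cosine drift basis. It is not the code used in the study: the repetition time, the number of scans, the 128 s cutoff, and the function name `cosine_drift_regressors` are assumptions chosen for the example.

```python
import math

def cosine_drift_regressors(n_scans, tr_s, cutoff_s=128.0):
    """Return a list of cosine drift regressors, each of length n_scans.

    Regressor k has frequency k / (2 * n_scans * tr_s) Hz; only those with a
    period longer than cutoff_s are kept, so that adding them to the design
    matrix removes slow drifts (i.e., acts as a high-pass filter).
    """
    order = int(math.floor(2.0 * n_scans * tr_s / cutoff_s))
    scale = math.sqrt(2.0 / n_scans)  # gives each regressor unit norm
    return [
        [scale * math.cos(math.pi * (2 * t + 1) * k / (2 * n_scans))
         for t in range(n_scans)]
        for k in range(1, order + 1)
    ]

# Hypothetical run: 600 volumes at TR = 2 s (a 20-minute run), 128 s cutoff.
drift = cosine_drift_regressors(n_scans=600, tr_s=2.0, cutoff_s=128.0)
```

      Each regressor sums to zero, so the set models slow fluctuations without duplicating the constant term of the design matrix.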

      Including scales in Figure 2 would help the reader interpret the magnitude of the BOLD effects.

      We added this information in Figure 3 (Figure 2 in the initial version of the MS).

      It was difficult to inspect the small thumbnail images of the task stimuli in Figure 1. Higher resolution versions of those stimuli would help facilitate understanding of the task design and trial structure.

      We changed both Figure 1 and Figure S1.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript reports two neuroimaging experiments assessing commonalities and differences in activation loci across mechanical problem-solving, technical reasoning, fluid cognition, and "mentalizing" tasks. Each task includes a control task. Conjunction analyses are performed to identify regions in common across tasks. As Area PF (a part of the supramarginal gyrus of the inferior parietal lobe) is involved across 3 of the 4 tasks, the investigators claim that it is the hub of technical cognition.

      Strengths:

      The aim of finding commonalities and differences across related problem-solving tasks is a useful and interesting one.

      The experimental tasks themselves appear relatively well-thought-out, aside from the concern that they are differentially difficult.

      The imaging pipeline appears appropriate.

      We thank Reviewer 3 for their constructive comments, which helped us improve the MS.

      Weaknesses:

      (1) Methodological

      As indicated in the supplementary tables and figures, the experimental tasks employed differ markedly in 1) difficulty and 2) experimental trial time. Response latencies are not reported (but are of additional concern given the variance in difficulty). There is concern that at least some of the differences in activation patterns across tasks are the result of these fundamental differences in how hard various brain regions have to work to solve the tasks and/or how much of the trial epoch is actually consumed by "on-task" behavior. These difficulty issues should be controlled for by 1) separating correct and incorrect trials, and 2) for correct trials, entering response latency as a regressor in the Generalized Linear Models, 3) entering trial duration in the GLMs.

      We thank Reviewer 3 for this comment. It is true that the experimental conditions were more difficult than the control conditions, with some participants who performed at or below 50% in the experimental conditions. We added a section in the MS to stress this aspect. We could not conduct new analyses by separating correct and incorrect trials because, for each task, participants had to respond only on the last item of the block. Therefore, we did not record a response for each event. Nevertheless, we could examine whether this potential difficulty effect biased our interpretation, by conducting new ROI analyses in which we removed all the participants who performed at or below the chance level. These analyses revealed the same results as when no participant was excluded, suggesting that this did not bias our interpretation. 

      “As mentioned above, the experimental conditions of all the tasks were more difficult than their control conditions. As a result, the specific activation of the left area PF documented above could simply reflect that this area responds to a greater extent in a harder condition relative to an easy condition of a task. This interpretation is nevertheless ruled out by the results obtained with the fluid-cognition task. We did not report a specific activation of the left area PF in this task while its experimental condition was more difficult than its control condition. To test more directly this effect of difficulty, we conducted new ROI analyses by removing all the participants who performed at or below 50% (Fig. S2). These new analyses replicated the initial analyses by showing a greater activation of the left area PF in the mechanical problem-solving task, the psychotechnical task, and the PHYS-Only and INT+PHYS conditions of the mentalizing task (all p < .001), but not in the fluid-cognition task (p = .48). In sum, the ROI analyses corroborated the whole-brain analyses and ruled out the potential effect of difficulty.” (p. 11)

      A related concern is that the control tasks also differ markedly in the degree to which they were easier and faster than their corresponding experimental task. Thus, some of the control tasks seem to control much better for difficulty and time on task than others. For example, the control task for the psychotechnical task simply requires the indication of which array contains a simple square shape (i.e., it is much easier than the psychotechnical task), whereas the control task for mechanical problem-solving requires mentally fitting a shape into a design, much like solving a jigsaw puzzle (i.e., it is only slightly easier than the experimental task).

      It is true that some control conditions could be easier than others. For us, these differences reinforce the common activation found in the left area PF in the tasks hypothesized as involving technical reasoning, because this activation survived irrespective of the differences in experimental design. The rationale is the same as for a meta-analysis, in which we try to find what is common to a great variety of tasks. The only detrimental consequence we identified is that this difference could explain why we did not report a specific activation of the left area PF in the fluid-cognition task, as if the left area PF were more responsive when the task was difficult. This possibility assumes that the experimental condition of the fluid-cognition task is much more difficult than its control condition compared to what can be seen in the other tasks. As Reviewer 2 stressed in Point 1, this interpretation is unlikely, because the differences between the experimental and control conditions in the mechanical problem-solving and psychotechnical tasks were similar to those in the fluid-cognition task. In addition, again, the new ROI analyses in which we removed all the participants who performed at or below the chance level in the experimental conditions reproduced our initial results.

      (2) Theoretical 

      The investigators seem to overlook prior research that does not support their perspective and their writing seems to lack scientific objectivity in places. At times they over-reach in the claims that can be made based on the present data. Some claims need to be revised/softened.

      As this comment is also mentioned below, please find our response to it below.

      Reviewer #3 (Recommendations For The Authors):

      (1) Because of the high level of detail, Figures 1 and S2 (particularly the mentalizing task and mechanical problem-solving task, and their controls) are very hard to parse, even when examined relatively closely. It is suggested that these figures be broken down into separate panels for Experiment 1 and Experiment 2 to facilitate understanding.

      We changed both Figure 1 and Figure S1.

      (2) The behavioral data (including response latencies) should be reported in the main results section of the paper and not in a supplement.

      The behavioural data are now reported in the main results. We did not report response latencies because participants were not prompted to respond as quickly as possible.

      “Behavioural results. All the behavioural results are given in Fig. 2. As shown, scores were lower in the experimental conditions than in the control conditions for all the tasks (all p < .05). In other words, the experimental conditions were more difficult than the control conditions. This difference in terms of difficulty can also be illustrated by the fact that some participants performed at or below the chance level in the experimental conditions whereas none did so in the control conditions.” (p. 8)

      (3) The investigators seem to overlook prior research that does not support their perspective and their writing seems to lack scientific objectivity in places. At times they over-reach in the claims that can be made based on the present data. For example, claims that need to be revised/softened include:

      Abstract: "Area PF... can work along with social-cognitive skills to resolve day-to-day interactions that combine social and physical constraints". This statement is overly speculative.

      This statement is based on the fact that we reported a combined activation of the technical-reasoning network and the mentalizing network in the INT+PHYS condition of the mentalizing task. This suggests that both networks need to work together to solve a day-to-day problem in which both the physical constraints of the situation and the intention of the individual must be integrated. Our findings replicated previous ones obtained with a similar task (e.g., Brunet et al. 2000; Völlm et al., 2006), whose authors gave an interpretation similar to ours in considering that this task requires understanding physical and social causes. Perhaps the reference to the results of the mentalizing task was not explicit enough. We added “day-to-day” before “problem” in the part of the discussion in which we discuss this possibility to make this aspect clearer.

      “In broad terms, the results of the mentalizing task indicate that causal reasoning has distinct forms and that it recruits distinct networks of the human brain (Social domain: Mentalizing; Physical domain: Technical reasoning), which can nevertheless interact together to solve day-to-day problems in which several domains are involved, such as in the INT+PHYS condition of the mentalizing task.” (p. 16)

      Introduction: "The manipulation-based approach... remains silent on the more general cognitive mechanisms...that must also encompass the use of unfamiliar or novel tools". This statement seems to be based on an overly selective literature review. There are a number of studies in which the relationship between a novel and familiar tool selection/use has been explored (e.g., Buchman & Randerath, 2017; Mizelle & Wheaton, 2010; Silveri & Ciccarelli, 2009; Stoll, Finkel et al., 2022; Foerster, 2023; Foerster, Borghi, & Goslin, 2020; Seidel, Rijntjes et al., 2023).

      We thank Reviewer 3 for this comment. Even if we accept the idea that we possess specific sensorimotor programs about tool manipulation, it remains that these programs cannot explain how an individual decides to bend a wire to make a hook or to pour water into a container to retrieve a target. As a matter of fact, such behaviour has been reported in nonhuman animals, such as crows (Weir et al., 2002, Nature) or orangutans (Mendes et al., 2007, Biology Letters). In these studies, the question is whether these nonhuman animals understand the physical causes or not, but the question of sensorimotor programs is never addressed (to our knowledge). This is also true in developmental studies on tool use (e.g., Beck et al., 2011, Cognition; Cutting et al., 2011, Journal of Experimental Child Psychology). This is what we meant here, that is, the manipulation-based approach is not equipped to explain how people solve physical problems by using or making tools – or any object – or by building constructions or producing technical innovations. However, we agree that some papers have explored the link between common and novel tool use and have suggested that both could recruit common sensorimotor programs. It is noteworthy that these studies do not test the predictions of the manipulation-based approach against those of the reasoning-based approach, so both interpretations are generally viable, as stressed by Seidel et al. (2023), one of the papers recommended by Reviewer 3.

      “Apparently, the presentation of a graspable object that is recognizable as a tool is sufficient to provoke SMG activation, whether one tends to see the function of SMG to be either “technical reasoning” (Osiurak and Badets 2016; Reynaud et al. 2016; Lesourd et al. 2018; Reynaud et al. 2019) or “manipulation knowledge” (Sakreida et al. 2016; Buxbaum 2017; Garcea et al. 2019b).” (Seidel et al., 2023; p. 9)

      Regardless, as suggested by Reviewer 3, these papers deserve to be cited and this part needed to be rewritten to insist on the “making, construction, and innovation” dimension more than on the “unfamiliar and novel tool use” dimension to avoid any ambiguity.

      “This manipulation-based approach has provided interesting insights (12–16) and even elegant attempts to explain how these sensorimotor programs could support the use of both unfamiliar or novel tools (17–20), but remains silent on the more general cognitive mechanisms behind human technology that include the use of common and unfamiliar or novel tools but must also encompass tool making, construction behaviour, technical innovations, and transmission of technical content.” (p. 3)

      Introduction: "Here we focus on two important questions... to promote the technical-reasoning hypothesis as a comprehensive cognitive framework..." (italics added). This and other similar statements should be rewritten as testable scientific hypotheses rather than implying that the point of the research is to promote the investigators' preferred view.

      We agree that our phrasing could seem inappropriate here. What we meant here is that the technical-reasoning hypothesis could become an interesting framework for the study of the cognitive bases of human technology only if we are able to verify some of its key facets. As suggested, we rewrote this part. We also rewrote the abstract and the first paragraph of the discussion.

      “Here we focus on two key aspects of the technical-reasoning hypothesis that remain to be addressed: Generalizability and specificity. If technical reasoning is a specific form of reasoning oriented towards the physical world, then it should be implicated in all (the generalizability question) and only (the specificity question) the situations in which we need to think about the physical properties of our world.” (p. 5)

      Introduction: The Goldenberg and Hagmann paper cited actually shows that familiar tool use may be based either on retrieval from semantic memory or by inferring function from structure (mechanical problem solving); in other words, the investigators saw a role for both kinds of information, and the relationship between mechanical problem solving and familiar tool use was actually relatively weak. This requires correction.

      We disagree with Reviewer 3 on this point. The whole sentence is as follows:

      “This silence has been initially broken by a series of studies initiated by Goldenberg and Hagmann (9), which has documented a behavioural link in left brain-damaged patients between common tool use and the ability to solve mechanical problems by using and even sometimes making novel tools (e.g., extracting a target out from a box by bending a wire to create a hook) (9, 17).” (p. 3-4)

      We did not mention the interpretations given by Goldenberg and Hagmann about the link with the pantomime task, but only focused on the link they reported between common tool use and novel tool use. This is factual. In addition, we also disagree that the link between common tool use and novel tool use was weak.

      “The hypothesis put forward in the introduction predicts that knowledge about prototypical tool use assessed by pantomime of tool use and the ability to infer function from structure assessed by novel tool selection can both contribute to the use of familiar tools. Indeed results of both tests correlated significantly with the use of familiar tools (pantomime of tool use: r = 0.77, novel tool selection: r = 0.62; both P < 0.001), but there was also a significant correlation between the two tests (r = 0.64, P < 0.001).” (Goldenberg & Hagmann, 1998; p. 585)

      As can be seen in this quote, they reported a significant correlation between novel tool selection and the use of familiar tools. It is also noteworthy that the novel tool selection test and the pantomime test correlated with each other. Georg Goldenberg told one of the authors (F. Osiurak; personal communication) that this result incited him to revise his idea that pantomime could assess “semantic knowledge”, which explains why he did not use it again as a measure of semantic knowledge. Instead, he preferred to use a classical semantic matching task in his 2009 Brain paper with Josef Spatt, in which they found a clearer dissociation between semantic knowledge and common/novel tool use not only at the behavioral level but also at the cerebral level.

      Introduction: Please expand and clarify this sentence: "However, this involvement seems to be task-dependent, contrary to the systematic involvement of left area PF." The IFG and LOTC activations observed in prior studies are of interest as well. Were they indeed all task-dependent in these studies?

      We agree that this sentence is confusing. We meant that, in the studies reported just above in the paragraph, these regions were not systematically reported contrary to the left area PF. As we think that this information was not crucial for the logic of the paper, we preferred to remove it. 

      Introduction: If implicit mechanical knowledge is acquired through interactions with objects, how is that implicit knowledge conveyed to pass on the material culture to others?

      We thank Reviewer 3 for this comment. Although mechanical knowledge is implicit, it can be indirectly transmitted to other individuals, as shown in two papers we published in Nature Human Behaviour (Osiurak et al., 2021) and Science Advances (Osiurak et al., 2022). Actually, verbal teaching is not the only way to transmit information. There are many other ways of transmitting information such as gestural teaching (e.g., pointing to the important aspects of a task to make them salient to the learner), observation without teaching (i.e., when we observe someone unbeknown to them) or reverse engineering (i.e., scrutinizing an artifact made by someone else). We have shown that even in reverse-engineering conditions, participants can benefit from what previous participants have done to increase their understanding of a physical system. In other words, all these forms of transmission allow the learners to understand new physical relationships without waiting for these relationships to occur randomly in the environment. There is a wide literature on social learning, which describes very well how knowledge can be transmitted without using explicit communication. In fact, it is very likely that such forms of transmission were already present in our ancestors, allowing them to start accumulating knowledge without using symbolic language. We did not add this information in the MS because we think that this was somewhat beyond the scope of the MS. Nevertheless, we cited relevant literature on the topic to help interested readers find it.

      “Yet, recent accounts have proposed that non-social cognitive skills such as causal understanding or technical reasoning might have played a crucial role in cumulative technological culture (6, 29, 66). Support for these accounts comes from micro-society experiments, which have demonstrated that the improvement of technology over generations is accompanied by an increase in its understanding (67, 68), or that learners’ technical-reasoning skills are a good predictor of cumulative performance in such micro-societies (33, 69).” (p. 19)

      What distinguishes this implicit mechanical knowledge from stored knowledge about object manipulation? Are these two conceptualizations really demonstrably (testably) different?

      We agree that it is complex to distinguish between these two hypotheses as suggested by Seidel et al. (2023) cited above (see Reviewer 3 Point 8). We have conducted several studies to test the opposite predictions derived from each hypothesis. The main distinction concerns the understanding of physical materials and forces, which is central to the technical-reasoning hypothesis but not to the manipulation-based approach. Indeed, sensorimotor programs about tool manipulation are not assumed to contain information about physical materials and forces. In the present study, the understanding of physical materials and forces was needed in the four tasks hypothesized as requiring technical reasoning, i.e., the mechanical problem-solving task, the psychotechnical task and the PHYS-Only and INT+PHYS conditions of the mentalizing task. We can illustrate this aspect with items of each of these tasks. Figure 1A shows an item of the mechanical problem-solving task.

      As explained in the MS, participants had memorized the five possible tools before the scanner session. Thus, for 4 seconds, they had to imagine which of these tools could be used to extract the target out from the box. We did so to incite them to reason about mechanical solutions based on the physical properties of the problem. Then, they had 3 seconds to select the tool with the appropriate shape, here the right one. In this case, the motor action remains the same (i.e., pulling). Another illustration can be given, with the psychotechnical task (Figure 1B).

      In this task, the participant had to reason as to whether the boat-tractor connection was better in the left picture or in the right picture. This requires reasoning about physical forces, but there is no need to recruit sensorimotor programs about tool manipulation. Finally, a last example can be given with the PHYS-Only condition of the mentalizing task (Figure 1D); the logic is the same for the INT+PHYS condition, except that the character’s intentions must also be taken into consideration.

      Here the participant must reason about which picture shows what is physically possible. In this task, there is no need to recruit sensorimotor programs about tool manipulation. In sum, what is common between these three tasks is the requirement to reason about physical materials and forces. We do not ignore that motor actions could be simulated in the mechanical problem-solving task, but no motor action needed to be simulated in the other three tasks. Therefore, what was common between all these tasks was the potential involvement of technical reasoning but not of sensorimotor programs about tool manipulation. Of course, an alternative is to consider that motor actions are always needed in all situations, including situations where no “manipulable tool” is presented, such as a tractor and a boat, a pulley, or a cannon. We cannot rule out this alternative, which is nevertheless, in our view, problematic because it implies that it becomes difficult to test the manipulation-based approach, as motor actions would be everywhere. We deliberately decided not to introduce a debate between the reasoning-based approach and the manipulation-based approach and preferred a more positive framing that stresses the insights from the present study. Note that we stressed the merits of the manipulation-based approach in the introduction because we sincerely think that this approach has provided interesting insights. However, we deliberately did not discuss the debate between the two approaches. Given Reviewer 3’s comment (see also Reviewer 1 Point 2), we understand and agree that some words must nevertheless be said about how the manipulation-based approach could interpret our results, thus stressing the potential limitations of our interpretations. Therefore, we added a specific section in the discussion in which we discuss this aspect in more detail.

      “The second limitation concerns the alternative interpretation that the left area PF is not central to technical reasoning but to the storage of sensorimotor programs about the prototypical manipulation of common tools. Here we show that the left area PF is recruited even in situations in which participants do not have to process common manipulable tools. For instance, some items of the psychotechnical task consisted of pictures of tractor, boat, pulley, or cannon. The fact that we found a common activation of the left area PF in such tasks as well as in the mechanical problem-solving task, in which participants could nevertheless simulate the motor actions of manipulating novel tools, indicates that this brain area is not central to tool manipulation but to physical understanding. That being said, some may suggest that viewing a boat or a cannon is enough to incite the simulation of motor actions, so our tasks were not equipped to distinguish between the manipulation-based approach and the reasoning-based approach. We have already shown that the left area PF is more involved in tasks that focus on the mechanical dimension of the tool-use action (e.g., the mechanical interaction between a tool and an object) than its motor dimension (i.e., the interaction between the tool and the effector [e.g., 24, 40]). Nevertheless, we recognize that future research is still needed to test the predictions derived from these two approaches.” (p. 18-19)

      Introduction and throughout: The framing of left Area PF as a special area for technical reasoning is overly reductionistic from a functional neuroanatomic perspective in that it ignores a large relevant literature showing that the region is involved with many other tasks that seem not to require anything like technical cognition. Indeed, entering the coordinates -56, -29, 36 (reported as the peak coordinates in common across the studied tasks) in Neurosynth reveals that 59 imaging studies report activations within 3 mm of those coordinates; few are action-related (a brief review indicated studies of verbal creativity, texture processing, reading, somatosensory processing, stress reactions, attentional selection, etc.). Please acknowledge the difficulty of claiming that a large brain region should be labeled the brain's technical reasoning area when it seems to also participate in so much else. The left IPL (including area PF) is densely connected to the ventral premotor cortex, and this network is activated in language and calculation tasks as well as tool use tasks (e.g., Matsumoto, Nair, et al., 2012). What other constructs might be able to unite this disparate literature, and are any of these alternative constructs ruled out by the present data? Lacking this objective discussion, the manuscript does read as a promotion of the investigators' preferred viewpoint.

      We thank Reviewer 3 for this comment. As stressed in the initial version of the MS, we did not write that the left area PF is sufficient but that it is central to the network that allows us to reason about the physical world. Regardless, we agree that an objective discussion was needed on this aspect to help the reader not misunderstand our purpose. We added a section on this aspect as suggested.

      “Before concluding, we would like to point out two potential limitations of the present study. The first limitation concerns the fact that the literature has documented the recruitment of the left area PF in many neuroimaging experiments in which there was no need to reason about physical events (e.g., language tasks). This can be easily illustrated by entering the left area PF coordinates in the Neurosynth database.

      This finding could be enough to refute the idea that this brain area is specific to technical reasoning. Although this limitation deserves to be recognized, it is also true for many other findings. For instance, sensory or motor brain regions such as the precentral or the postcentral cortex have been found activated in many non-motor tasks, the visual word form area in non-language tasks, or the Heschl’s gyrus in nonmusical tasks. This remains a major challenge for scientists, the question being how to solve these inconsistencies that can result from statistical errors or stress that considerable effort is needed to understand the very functional nature of these brain areas. Thus, understanding that the left area PF is central to physical understanding can be viewed as a first essential step before discovering its fundamental function, as suggested by the functional polyhedral approach (56).” (p. 18)

      Discussion: The discussion of a small cluster in the IFG (pars opercularis) that nearly survived statistical correction is noteworthy in light of the above point. This further underscores the importance of discussing networks and not just single brain regions (such as area PF) when examining complex processes. The investigators note, "a plausible hypothesis is that the left IFG integrates the multiple constraints posed by the physical situation to set the ground for a correct reasoning process, such as it could be involved in syntactic language processing". In fact, the hypothesis that the IFG and SMG are together related to resolving competition has been previously proposed, as has the more specific hypothesis that the SMG buffers actions and that the context-appropriate action is then selected by the IFG (e.g., Buxbaum & Randerath, 2018). The parallels with the way the SMG is engaged with competing lexical or phonological alternatives (e.g., Peramunage, Blumstein et al., 2011) have also been previously noted.

      We added the Buxbaum and Randerath (2018) reference in this section.

      “The functional role of the left IFG in the context of tool use has been previously discussed (24) and a plausible hypothesis is that the left IFG integrates the multiple constraints posed by the physical situation to set the ground for a correct reasoning process, such as it could be involved in syntactic language processing (for a somewhat similar view, see [51]).” (p. 16-17)

      Introduction and Discussion: Please clarify how the technical reasoning network overlaps with or is distinct from the tool-use network reported by many previous investigators.

      We added a couple of sentences in the discussion to clarify this point.

      “It should be clear here that we do not advocate the localizationist position simply stating that activation in the left area PF is the necessary and sufficient condition for technical reasoning. We rather defend the view according to which it requires a network of interacting brain areas, one of them – and of major importance – being the left area PF. This allows the engagement of different configurations of cerebral areas in different technical-reasoning tasks, but with a central process acting as a stable component: The left area PF. Thus, when people intend to use physical tools, it can work in concert with brain regions specific to object manipulation and motor control, thereby forming another network, the tool-use network. It can also interact with brain regions specific to intentional gestures to form a “social-learning” network that allows people to enhance their understanding about the physical aspects of a technical task (e.g., the making of a tool) through communicative gestures such as pointing gestures (42). The major challenge for future research is to specify the nature of the cognitive process supported by the left area PF and that might be involved in the broad understanding of the physical world.” (p. 14)

      Discussion: All of the experimental tasks require a response from a difficult choice in an array, and all of the tasks except for the fluid cognition task are likely to require prediction or simulation of a motion trajectory (whether an embodied or disembodied trajectory is unclear). The Discussion does mention the related (but distinct) idea of an "intuitive physics engine", a "kind of simulator". Please clarify how this study can rule out these alternative interpretations of the data. If the study cannot rule out these alternatives, the claims of the study (and the paper title which labels PF as a technical cognition area) should be scaled back considerably.

      We thank Reviewer 3 for this comment. The authors of the papers on intuitive physics engine or associative learning do not suggest that these processes are embodied. As discussed above, we clarified our perspective on the role of the left area PF and hope that these modifications help the reader better understand it. We warmly thank Reviewer 3 for their comments, which considerably helped us improve the MS.

    1. Instead of drafting a first version with pen and paper (my preferred writing tools), I spent an entire hour walking outside, talking to ChatGPT in Advanced Voice Mode. We went through all the fuzzy ideas in my head, clarified and organized them, explored some additional talking points, and eventually pulled everything together into a first outline.

      Need to try this out.

    1. first try to analyze the problem you are solving, then generate ideas, then test those ideas with the people who have the problem you are solving. Then, repeat this process of analyzing the problem, designing, and testing (which we call iteration) until you converge upon an understanding of the problem and an effective solution. The premise of this approach is that by modeling a problem, and verifying solutions to it, the design one arrives at will be a better solution than if a designer just uses the pre-existing knowledge in their head.

      I like how it emphasizes that better designs come from engaging with real users rather than just relying on a designer’s intuition. But I wonder—can modeling and testing alone truly capture the complexity of a problem? It feels like there’s a risk of overlooking deeper systemic issues or missing perspectives that aren’t immediately visible in user testing. Maybe a more participatory approach could help bridge that gap.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      I would like to thank the reviewers for their comments and interest in the manuscript and the study.

      Reviewer #1

      1. I would assume that there are RNA-seq and/or ChIP-seq data out there produced after knockdown of one or more of these DBPs that show directional positioning.

      The directional positioning of CTCF-binding sites at chromatin interaction sites was analyzed by a CRISPR experiment (Guo Y et al. Cell 2015). We found that the machine learning and statistical analysis showed the same directional bias of the CTCF-binding motif sequence and the RAD21-binding motif sequence at chromatin interaction sites as the experimental analysis of Guo Y et al. (lines 229-253, Figure 3b, c, d and Table 1). Since CTCF is involved in different biological functions (Braccioli L et al. Essays Biochem. 2019 ResearchGate webpage), the directional bias of binding sites may be reduced in all binding sites including those at chromatin interaction sites (lines 68-73). In our study, we investigated the DNA-binding sites of proteins using the ChIP-seq data of DNA-binding proteins and DNase-seq data. We also confirmed by computational analysis that the DNA-binding sites of SMC3 and RAD21, which tend to be found in chromatin loops with CTCF, showed the same directional bias as CTCF.

      2. Figure 6 should be expanded to incorporate analysis of DBPs not overlapping CTCF/cohesin in chromatin interaction data that is important and potentially more interesting than the simple DBPs enrichment reported in the present form of the figure.

      Following the reviewer's advice, I performed the same analysis with the DNA-binding sites that do not overlap with the DNA-binding sites of CTCF and cohesin (RAD21 and SMC3) (Fig. 6 and Supplementary Fig. 4). The result showed the same tendency in the distribution of DNA-binding sites. The height of a peak on the graph became lower for some DNA-binding proteins after removing the DNA-binding sites that overlapped with those of CTCF and cohesin. I have added the following sentence on lines 435 and 829: For the insulator-associated DBPs other than CTCF, RAD21, and SMC3, the DNA-binding sites that do not overlap with those of CTCF, RAD21, and SMC3 were used to examine their distribution around interaction sites.

      3. Critically, I would like to see use of Micro-C/Hi-C data and ChIP-seq from these factors, where insulation scores around their directionally-bound sites show some sort of an effect like that presumed by the authors - and many such datasets are publicly-available and can be put to good use here.

      As suggested by the reviewer, I have added the insulator scores and boundary sites from the 4D nucleome data portal as tracks in the UCSC genome browser. The insulator scores seem to correspond to some extent to the H3K27me3 histone marks from ChIP-seq (Fig. 4a and Supplementary Fig. 3). We found that the DNA-binding sites of the insulator-associated DBPs were statistically overrepresented in the 5 kb boundary sites more than those of other DBPs (Fig. 4d). The direction of DNA-binding sites on the genome can be shown with different colors (e.g. red and green), but the directionality of insulator-associated DNA-binding sites is an overall tendency, and it may be difficult to notice the directionality from each binding site because the directionality may be weaker than that of CTCF, RAD21, and SMC3, as shown in Table 1 and Supplementary Table 2. We also observed the directional biases of CTCF, RAD21, and SMC3 by using Micro-C chromatin interaction data, as we estimated, but the differences between the four directions of FR, RF, FF, and RR were more apparent when using CTCF-mediated ChIA-PET chromatin interaction data (lines 287 and 288).
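The four orientation classes named above (FR, RF, FF, RR) can be illustrated with a minimal sketch. It assumes each chromatin interaction has been reduced to the strand of the binding motif at its upstream and downstream anchor, with "+" written as F (forward) and "-" as R (reverse). The function names, the input format, and the toy data are hypothetical and are not taken from the manuscript.

```python
from collections import Counter

def orientation_class(upstream_strand: str, downstream_strand: str) -> str:
    """Return FR, RF, FF or RR for a pair of anchor motif strands ('+' or '-')."""
    code = {"+": "F", "-": "R"}
    return code[upstream_strand] + code[downstream_strand]

def orientation_counts(pairs):
    """Count orientation classes over (upstream_strand, downstream_strand) pairs."""
    return Counter(orientation_class(u, d) for u, d in pairs)

# Toy strand pairs, invented for illustration only.
pairs = [("+", "-"), ("+", "-"), ("-", "+"), ("+", "+"), ("-", "-"), ("+", "-")]
counts = orientation_counts(pairs)
for label in ("FR", "RF", "FF", "RR"):
    print(label, counts[label])
```

A directional bias would show up as one class (for CTCF, the convergent FR class) being clearly more frequent than the others; whether a difference is significant still needs a statistical test, which this sketch does not perform.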

       I found that the CTCF binding sites examined by a wet experiment in the previous study may not always overlap with the boundary sites of chromatin interactions from the Micro-C assay (Guo Y et al. *Cell* 2015). The chromatin interaction data do not include all interactions due to the high sequencing cost of the assay, and include fewer long-range interactions due to distance bias. The number of boundary sites may be smaller than that of CTCF binding sites acting as insulators, and/or some of the CTCF binding sites may not be located in the boundary sites. It may be difficult for the boundary location algorithm to identify a short boundary location. Due to the limitations of the chromatin interaction data, I planned to search for insulator-associated DNA-binding proteins without using chromatin interaction data in this study.
      
       I discussed other causes in lines 614-622: Another reason for the difference may be that boundary sites are more closely associated with topologically associated domains (TADs) of chromosome than are insulator sites. Boundary sites are regions identified based on the separation of numerous chromatin interactions. On the other hand, we found that the multiple DNA-binding sites of insulator-associated DNA-binding proteins were located close to each other at insulator sites and were associated with distinct nested and focal chromatin interactions, as reported by Micro-C assay. These interactions may be transient and relatively weak, such as tissue/cell type, conditional or lineage-specific interactions.
      
       Furthermore, I have added the statistical summary of the analysis in lines 372-395 as follows: Overall, among 20,837 DNA-binding sites of the 97 insulator-associated proteins found at insulator sites identified by H3K27me3 histone modification marks (type 1 insulator sites), 1,315 (6%) overlapped with 264 of 17,126 5kb long boundary sites, and 6,137 (29%) overlapped with 784 of 17,126 25kb long boundary sites in HFF cells. Among 5,205 DNA-binding sites of the 97 insulator-associated DNA-binding proteins found at insulator sites identified by H3K27me3 histone modification marks and transcribed regions (type 2 insulator sites), 383 (7%) overlapped with 74 of 17,126 5-kb long boundary sites, 1,901 (37%) overlapped with 306 of 17,126 25-kb long boundary sites. Although CTCF-binding sites separate active and repressive domains, the limited number of DNA-binding sites of insulator-associated proteins found at type 1 and 2 insulator sites overlapped boundary sites identified by chromatin interaction data. Furthermore, by analyzing the regulatory regions of genes, the DNA-binding sites of the 97 insulator-associated DNA-binding proteins were found (1) at the type 1 insulator sites (based on H3K27me3 marks) in the regulatory regions of 3,170 genes, (2) at the type 2 insulator sites (based on H3K27me3 marks and gene expression levels) in the regulatory regions of 1,044 genes, and (3) at insulator sites as boundary sites identified by chromatin interaction data in the regulatory regions of 6,275 genes. The boundary sites showed the highest number of overlaps with the DNA-binding sites. Comparing the insulator sites identified by (1) and (3), 1,212 (38%) genes have both types of insulator sites. Comparing the insulator sites between (2) and (3), 389 (37%) genes have both types of insulator sites. 
From the comparison of insulator and boundary sites, we found that (1) or (2) types of insulator sites overlapped or were close to boundary sites identified by chromatin interaction data.
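The percentages in the summary above can be re-derived from the reported counts. The sketch below only repeats that arithmetic. It assumes that the gene-level percentages use the type 1 and type 2 gene counts (3,170 and 1,044) as denominators, which reproduces the reported 38% and 37%; the labels and the helper name are illustrative.

```python
def overlap_percent(overlapping: int, total: int) -> float:
    """Percentage of `total` items that overlap."""
    return 100.0 * overlapping / total

# (overlapping, total) pairs copied from the reported summary.
reported = {
    "type 1 sites in 5-kb boundaries": (1315, 20837),
    "type 1 sites in 25-kb boundaries": (6137, 20837),
    "type 2 sites in 5-kb boundaries": (383, 5205),
    "type 2 sites in 25-kb boundaries": (1901, 5205),
    # Denominators below are an assumption (genes with type 1 / type 2 sites).
    "genes with type 1 and boundary sites": (1212, 3170),
    "genes with type 2 and boundary sites": (389, 1044),
}

for label, (n, total) in reported.items():
    print(f"{label}: {n}/{total} = {overlap_percent(n, total):.1f}%")
```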
      

      4. The suggested alternative transcripts function, also highlighted in the manuscript's abstract, is only supported by visual inspection of a few cases for several putative DBPs. I believe this is insufficient to support what looks like one of the major claims of the paper when reading the abstract, and a more quantitative and genome-wide analysis must be adopted, although the authors mention it as just an 'observation'.

      According to the reviewer's comment, I performed the genome-wide analysis of alternative transcripts where the DNA-binding sites of insulator-associated proteins are located near splicing sites. The DNA-binding sites of insulator-associated DNA-binding proteins were found within 200 bp centered on splice sites more significantly than the other DNA-binding proteins (Fig. 4e and Table 2). I have added the following sentences on lines 405 - 412: We performed the statistical test to estimate the enrichment of insulator-associated DNA-binding sites compared to the other DNA-binding proteins, and found that the insulator-associated DNA-binding sites were significantly more abundant at splice sites than the DNA-binding sites of the other proteins (Fig 4e and Table 2; Mann‒Whitney U test, p value

      5. Figure 1 serves no purpose in my opinion and can be removed, while figures can generally be improved (e.g., the browser screenshots in Figs 4 and 5) for interpretability from readers outside the immediate research field.

      I believe that Figure 1 would help researchers in other fields who are not familiar with the biological phenomena and functions to understand the study. More explanation has been included in the figures and legends of Figs. 4 and 5 to help readers outside the immediate research field understand the figures.

      6. Similarly, the text is rather convoluted at places and should be re-approached with more clarity for less specialized readers in mind.

      Reviewer #2's comments are related to this comment. I have introduced a more detailed explanation of the method in the Results section, as shown in the responses to Reviewer #2's comments.

      Reviewer #2

      1. Introduction, line 95: CTCF appears two times, it seems redundant.

      On lines 91-93, I deleted the latter CTCF from the sentence "We examine the directional bias of DNA-binding sites of CTCF and insulator-associated DBPs, including those of known DBPs such as RAD21 and SMC3".

      2. Introduction, lines 99-103: Please stress better the novelty of the work. What is the main focus? The newly identified DBPs or their binding sites? What are the "novel structural and functional roles of DBPs" mentioned?

      Although CTCF is known to be the main insulator protein in vertebrates, we found that 97 DNA-binding proteins including CTCF and cohesin are associated with insulator sites by modifying and developing a machine learning method to search for insulator-associated DNA-binding proteins. Most of the insulator-associated DNA-binding proteins showed the directional bias of DNA-binding motifs, suggesting that the directional bias is associated with the insulator.

       I have added the sentence in lines 96-99 as follows: Furthermore, statistical testing of the contribution scores between the directional and non-directional DNA-binding sites of insulator-associated DBPs revealed that the directional sites contributed more significantly to the prediction of gene expression levels than the non-directional sites. I have revised the statement in lines 101-110 as follows: To validate these findings, we demonstrate that the DNA-binding sites of the identified insulator-associated DBPs are located within potential insulator sites, and some of the DNA-binding sites in the insulator site are found without the nearby DNA-binding sites of CTCF and cohesin. Homologous and heterologous insulator-insulator pairing interactions are orientation-dependent, as suggested by the insulator-pairing model based on experimental analysis in flies. Our method and analyses contribute to the identification of insulator- and chromatin-associated DNA-binding sites that influence EPIs and reveal novel functional roles and molecular mechanisms of DBPs associated with transcriptional condensation, phase separation and transcriptional regulation.
      

      3. Results, line 111: How do the SNPs come into the procedure? From the figures it seems the input is ChIP-seq peaks of DNBPs around the TSS.

      On lines 121-124, to explain the procedure for the SNP of an eQTL, I have added the sentence in the Methods: "If a DNA-binding site was located within a 100-bp region around a single-nucleotide polymorphism (SNP) of an eQTL, we assumed that the DNA-binding proteins regulated the expression of the transcript corresponding to the eQTL".
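The linking rule quoted above can be sketched as an interval test. This is a minimal illustration, not the author's implementation: it assumes the "100-bp region around a SNP" is a window centred on the SNP (50 bp on each side) and that binding sites are half-open intervals `[start, end)`. The function names and the toy coordinates are hypothetical.

```python
def site_overlaps_snp_window(site_start: int, site_end: int,
                             snp_pos: int, window: int = 100) -> bool:
    """True if the half-open site [site_start, site_end) intersects a
    `window`-bp region centred on snp_pos (assumed interpretation)."""
    half = window // 2
    win_start, win_end = snp_pos - half, snp_pos + half
    return site_start < win_end and win_start < site_end

def linked_transcripts(site, eqtls, window: int = 100):
    """Transcripts whose eQTL SNP window overlaps the binding site.

    site  : (start, end) on one chromosome
    eqtls : iterable of (snp_position, transcript_id) on the same chromosome
    """
    start, end = site
    return sorted({tx for pos, tx in eqtls
                   if site_overlaps_snp_window(start, end, pos, window)})

# Toy coordinates, invented for illustration only.
eqtls = [(1060, "tx_A"), (5000, "tx_B")]
print(linked_transcripts((1000, 1020), eqtls))
```

If the intended window is 100 bp on each side of the SNP instead, only the `half` computation changes.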

      4. Again, are those SNPs coming from the different cell lines? Or are they from individuals w.r.t some reference genome? I suggest a general restructuring of this part to let the reader understand more easily. One option could be simplifying the details here or alternatively including all the necessary details.

      On line 119, I have included the explanation of the eQTL dataset of GTEx v8 as follows: " The eQTL data were derived from the GTEx v8 dataset, after quality control, consisting of 838 donors and 17,382 samples from 52 tissues and two cell lines". On lines 681 and 865, I have added the filename of the eQTL data "(GTEx_Analysis_v8_eQTL.tar)".

      5. Figure 1: panels a and b are misleading. Is the matrix in panel a equivalent to the matrix in panel b? If not, please clarify why. Maybe b includes the info about the SNPs? And if yes, again, what is then the difference with a?

      The reviewer presumably refers to Figure 2, not Figure 1. If so, the matrices in panels a and b in Figure 2 are equivalent. I have shown it in the figure: The same figure in panel a is rotated 90 degrees to the right. The green boxes in the matrix show the regions with the ChIP-seq peak of a DNA-binding protein overlapping with a SNP of an eQTL. I used eQTL data to associate a gene with a ChIP-seq peak that was more than 2 kb upstream and 1 kb downstream of a transcriptional start site of a gene. For each gene, the matrix was produced and the gene expression levels in cells were learned and predicted using the deep learning method. I have added the following sentences to explain the method in lines 133 - 139: Through the training, the tool learned to select the binding sites of DNA-binding proteins from ChIP-seq assays that were suitable for predicting gene expression levels in the cell types. The binding sites of a DNA-binding protein tend to be observed in common across multiple cell and tissue types. Therefore, ChIP-seq data and eQTL data in different cell and tissue types were used as input data for learning, and then the tool selected the data suitable for predicting gene expression levels in the cell types, even if the data were not obtained from the same cell types.
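The distance criterion mentioned in this response (peaks more than 2 kb upstream or 1 kb downstream of a transcriptional start site are associated with a gene through eQTL data) can be sketched as a strand-aware offset check. This is an assumed reading of the sentence, with hypothetical function names; it treats each peak as a single position and is not the author's code.

```python
def offset_from_tss(peak_pos: int, tss: int, strand: str) -> int:
    """Signed distance of a peak from the TSS in the gene's own orientation.
    Negative values are upstream of the gene, positive values downstream."""
    offset = peak_pos - tss
    return offset if strand == "+" else -offset

def needs_eqtl_link(peak_pos: int, tss: int, strand: str,
                    upstream: int = 2000, downstream: int = 1000) -> bool:
    """True if the peak lies outside the promoter-proximal window
    (more than `upstream` bp upstream or `downstream` bp downstream)."""
    offset = offset_from_tss(peak_pos, tss, strand)
    return offset < -upstream or offset > downstream

# Toy positions, invented for illustration only.
print(needs_eqtl_link(7000, 10000, "+"))   # 3 kb upstream on the + strand
print(needs_eqtl_link(9000, 10000, "+"))   # 1 kb upstream, inside the window
```

On the minus strand the genomic direction is reversed, so a peak at a higher coordinate than the TSS counts as upstream.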

      6. Line 386-388: could the author investigate this observation in more detail? Does it mean that loops are driven by other DBPs independent of the known CTCF/Cohesin? Could the author provide examples of chromatin structural data, e.g. MicroC?

      As suggested by the reviewer, to help readers understand the observation, I have added Supplementary Fig. S4c to show the distribution of DNA-binding sites of "CTCF, RAD21, and SMC3" and "BACH2, FOS, ATF3, NFE2, and MAFK" around chromatin interaction sites. I have modified the following sentence to indicate the figure on line 501: Although a DNA-binding-site distribution pattern around chromatin interaction sites similar to those of CTCF, RAD21, and SMC3 was observed for DBPs such as BACH2, FOS, ATF3, NFE2, and MAFK, less than 1% of the DNA-binding sites of the latter set of DBPs colocalized with CTCF, RAD21, or SMC3 in a single bin (Fig. S4c).
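The colocalization figure quoted above ("less than 1% ... in a single bin") is a fraction of binding sites that share a genomic bin with a reference set. A minimal sketch of that calculation follows. The bin size of 500 bp, the representation of each site as a (chromosome, position) pair, and the function names are assumptions made for illustration; the manuscript's actual bin definition is not given in this response.

```python
def bin_key(chrom: str, position: int, bin_size: int):
    """Identify the fixed-width genomic bin that contains a position."""
    return (chrom, position // bin_size)

def colocalized_fraction(query_sites, reference_sites, bin_size: int = 500) -> float:
    """Fraction of query sites that fall in a bin also occupied by a reference site."""
    if not query_sites:
        return 0.0
    reference_bins = {bin_key(c, p, bin_size) for c, p in reference_sites}
    hits = sum(1 for c, p in query_sites if bin_key(c, p, bin_size) in reference_bins)
    return hits / len(query_sites)

# Toy sites, invented for illustration only.
reference = [("chr1", 1200), ("chr1", 5300)]          # e.g. CTCF/RAD21/SMC3 sites
query = [("chr1", 1400), ("chr1", 2600), ("chr2", 1250), ("chr1", 5499)]
print(colocalized_fraction(query, reference))
```

Fixed bins can separate two nearby sites that straddle a bin edge, so a distance-based criterion would give slightly different numbers.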

       Aljahani A et al. (*Nature Communications* 2022) found that depletion of cohesin causes a subtle reduction in longer-range enhancer-promoter interactions and that CTCF depletion can cause rewiring of regulatory contacts. Together, their data show that loop extrusion is not essential for enhancer-promoter interactions, but contributes to their robustness and specificity and to precise regulation of gene expression. Goel VY et al. (*Nature Genetics* 2023) state in their abstract: Microcompartments frequently connect enhancers and promoters and though loss of loop extrusion and inhibition of transcription disrupts some microcompartments, most are largely unaffected. These results suggest that chromatin loops can be driven by other DBPs independent of the known CTCF/Cohesin.
      
      I added the following sentences on lines 569-577: The depletion of cohesin causes a subtle reduction in longer-range enhancer-promoter interactions, and CTCF depletion can cause rewiring of regulatory contacts. Another group reported that enhancer-promoter interactions and transcription are largely maintained upon depletion of CTCF, cohesin, WAPL or YY1. Instead, cohesin depletion decreased transcription factor binding to chromatin. Thus, cohesin may allow transcription factors to find and bind their targets more efficiently. Furthermore, loop extrusion is not essential for enhancer-promoter interactions, but contributes to their robustness and specificity and to precise regulation of gene expression.
      
       The FOXA1 pioneer factor functions as an initial chromatin-binding and chromatin-remodeling factor and has been reported to form biomolecular condensates (Ji D et al. *Molecular Cell* 2024). CTCF has also been found to form transcriptional condensates and undergo phase separation (Lee R et al. *Nucleic acids research* 2022). FOS was found to be an insulator-associated DNA-binding protein in this study and is potentially involved in chromatin remodeling, transcription condensation, and phase separation with other factors such as BACH2, ATF3, NFE2 and MAFK. I have added the following sentence on line 556: FOXA1 pioneer factor functions as an initial chromatin-binding and chromatin-remodeling factor and has been reported to form biomolecular condensates.
      

      7. In general, how are the presented results related to models of chromatin architecture, e.g. loop extrusion, which incorporates convergent CTCF binding sites?

      Goel VY et al. Nature Genetics 2023 identified highly nested and focal interactions through region capture Micro-C, which resemble fine-scale compartmental interactions and are termed microcompartments. In the section titled "Most microcompartments are robust to loss of loop extrusion," the researchers noted that a small proportion of interactions between CTCF and cohesin-bound sites exhibited significant reductions in strength when cohesin was depleted. In contrast, the majority of microcompartmental interactions remained largely unchanged under cohesin depletion. Their findings indicate that most P-P and E-P interactions, aside from a few CTCF and cohesin-bound enhancers and promoters, are likely facilitated by a compartmentalization mechanism that differs from loop extrusion. They suggest that nested, multiway, and focal microcompartments correspond to small, discrete A-compartments that arise through a compartmentalization process, potentially influenced by factors upstream of RNA Pol II initiation, such as transcription factors, co-factors, or active chromatin states. It follows that if active chromatin regions at microcompartment anchors exhibit selective "stickiness" with one another, they will tend to co-segregate, leading to the development of nested, focal interactions. This microphase separation, driven by preferential interactions among active loci within a block copolymer, may account for the striking interaction patterns they observe.

       The authors of the paper proposed several mechanisms potentially involved in microcompartments. These mechanisms may be involved in looping with insulator function. Another group reported that enhancer-promoter interactions and transcription are largely maintained upon depletion of CTCF, cohesin, WAPL or YY1. Instead, cohesin depletion decreased transcription factor binding to chromatin. Thus, cohesin may allow transcription factors to find and bind their targets more efficiently (Hsieh TS et al. *Nature Genetics* 2022). Among the identified insulator-associated DNA-binding proteins, Maz and MyoD1 form loops without CTCF (Xiao T et al. *Proc Natl Acad Sci USA* 2021 ; Ortabozkoyun H et al. *Nature genetics* 2022 ; Wang R et al. *Nature communications* 2022). I have added the following sentences on lines 571-575: Another group reported that enhancer-promoter interactions and transcription are largely maintained upon depletion of CTCF, cohesin, WAPL or YY1. Instead, cohesin depletion decreased transcription factor binding to chromatin. Thus, cohesin may allow transcription factors to find and bind their targets more efficiently. I have included the following explanation on lines 582-584: Maz and MyoD1 among the identified insulator-associated DNA-binding proteins form loops without CTCF.
      
       As for the directionality of CTCF, if chromatin loop anchors have some structural conformation, as shown in the paper entitled "The structural basis for cohesin-CTCF-anchored loops" (Li Y et al. *Nature* 2020), directional DNA binding would occur similarly to CTCF binding sites. Moreover, cohesin complexes that interact with convergent CTCF sites, that is, the N-terminus of CTCF, might be protected from WAPL, but those that interact with divergent CTCF sites, that is, the C-terminus of CTCF, might not be protected from WAPL, which could release cohesin from chromatin and thus disrupt cohesin-mediated chromatin loops (Davidson IF et al. *Nature Reviews Molecular Cell Biology* 2021). Regarding loop extrusion, the 'loop extrusion' hypothesis is motivated by in vitro observations. An experiment in yeast using cohesin variants that are unable to extrude DNA loops but retain the ability to topologically entrap DNA suggested that in vivo chromatin loops are formed independently of loop extrusion. Instead, transcription promotes loop formation and acts as an extrinsic motor that extends these loops and defines their final positions (Guerin TM et al. *EMBO Journal* 2024). I have added the following sentences on lines 543-547: Cohesin complexes that interact with convergent CTCF sites, that is, the N-terminus of CTCF, might be protected from WAPL, but those that interact with divergent CTCF sites, that is, the C-terminus of CTCF, might not be protected from WAPL, which could release cohesin from chromatin and thus disrupt cohesin-mediated chromatin loops. I have included the following sentences on lines 577-582: The 'loop extrusion' hypothesis is motivated by in vitro observations. An experiment in yeast using cohesin variants that are unable to extrude DNA loops but retain the ability to topologically entrap DNA suggested that in vivo chromatin loops are formed independently of loop extrusion. Instead, transcription promotes loop formation and acts as an extrinsic motor that extends these loops and defines their final positions.
      
       Another model for the regulation of gene expression by insulators is the boundary-pairing (insulator-pairing) model (Bing X et al. *Elife* 2024) (Ke W et al. *Elife* 2024) (Fujioka M et al. *PLoS Genetics* 2016). Molecules bound to insulators physically pair with their partners, either head-to-head or head-to-tail, with different degrees of specificity at the termini of TADs in flies. Although the experiments do not reveal how partners find each other, the mechanism is unlikely to require loop extrusion. Homologous and heterologous insulator-insulator pairing interactions are central to the architectural functions of insulators. The manner of insulator-insulator interactions is orientation-dependent. I have summarized the model on lines 559-567: Other types of chromatin regulation are also expected to be related to the structural interactions of molecules. In the boundary-pairing (insulator-pairing) model, molecules bound to insulators physically pair with their partners, either head-to-head or head-to-tail, with different degrees of specificity at the termini of TADs in flies (Fig. 7). Although the experiments do not reveal how partners find each other, the mechanism is unlikely to require loop extrusion. Homologous and heterologous insulator-insulator pairing interactions are central to the architectural functions of insulators. The manner of insulator-insulator interactions is orientation-dependent.
      

      8. Do the authors think that the identified DBPs could work in that way as well?

      The boundary-pairing (insulator-pairing) model would be applied to the insulator-associated DNA-binding proteins other than CTCF and cohesin that are involved in the loop extrusion mechanism (Bing X et al. Elife 2024) (Ke W et al. Elife 2024) (Fujioka M et al. PLoS Genetics 2016).

       Liquid-liquid phase separation was shown to occur through CTCF-mediated chromatin loops and to act as an insulator (Lee, R et al. *Nucleic Acids Research* 2022). Among the identified insulator-associated DNA-binding proteins, CEBPA has been found to form hubs that colocalize with transcriptional co-activators in a native cell context, which is associated with transcriptional condensate and phase separation (Christou-Kent M et al. *Cell Reports* 2023). The proposed microcompartment mechanisms are also associated with phase separation. Thus, the same or similar mechanisms are potentially associated with the insulator function of the identified DNA-binding proteins. I have included the following information on line 554: CEBPA in the identified insulator-associated DNA-binding proteins was also reported to be involved in transcriptional condensates and phase separation.
      

      9. Also, can the authors comment about the mechanisms those newly identified DBPs mediate contacts by active processes or equilibrium processes?

      Snead WT et al. Molecular Cell 2019 mentioned that protein post-translational modifications (PTMs) facilitate the control of molecular valency and strength of protein-protein interactions. O-GlcNAcylation as a PTM inhibits CTCF binding to chromatin (Tang X et al. Nature Communications 2024). I found that the identified insulator-associated DNA-binding proteins tend to form a cluster at potential insulator sites (Supplementary Fig. 2d). These proteins may interact and actively regulate chromatin interactions, transcriptional condensation, and phase separation by PTMs. I have added the following explanation on lines 584-590: Furthermore, protein post-translational modifications (PTMs) facilitate control over the molecular valency and strength of protein-protein interactions. O-GlcNAcylation as a PTM inhibits CTCF binding to chromatin. We found that the identified insulator-associated DNA-binding proteins tend to form a cluster at potential insulator sites (Fig. 4f and Supplementary Fig. 3c). These proteins may interact and actively regulate chromatin interactions, transcriptional condensation, and phase separation through PTMs.

      10. Can the author provide some real examples along with published structural data (e.g. the mentioned micro-C data) to show the link between protein co-presence, directional bias and contact formation?

      A structural molecular model of cohesin-CTCF-anchored loops has been published by Li Y et al. Nature 2020. The structural conformation of CTCF and cohesin in the loops would be the cause of the directional bias of CTCF binding sites, which I mentioned in lines 539-543 as follows: These results suggest that the directional bias of DNA-binding sites of insulator-associated DBPs may be involved in insulator function and chromatin regulation through structural interactions among DBPs, other proteins, DNAs, and RNAs. For example, the N-terminal amino acids of CTCF have been shown to interact with RAD21 in chromatin loops.

       To investigate the principles underlying the architectural functions of insulator-insulator pairing interactions, two insulators, Homie and Nhomie, flanking the *Drosophila even skipped* locus were analyzed. Pairing interactions between the transgene Homie and the eve locus are directional. The head-to-head pairing between the transgene and endogenous Homie matches the pattern of activation (Fujioka M et al. *PLoS Genetics* 2016).
      

      Reviewer #3

      Major Comments:

      1. Some of these TFs do not have specific direct binding to DNA (P300, Cohesin). Since the authors are using binding motifs in their analysis workflow, I would remove those from the analysis.

      When a protein complex binds to DNA, one protein of the complex binds to the DNA directly, and the other proteins may not bind to DNA. However, the DNA motif sequence bound by the protein may be registered as the DNA-binding motif of all the proteins in the complex. The molecular structure of the complex of CTCF and Cohesin showed that both CTCF and Cohesin bind to DNA (Li Y et al. Nature 2020). I think there is a possibility that if the molecular structure of a protein complex becomes available, the previous recognition of the DNA-binding ability of a protein may be changed. Therefore, I searched the Pfam database for 99 insulator-associated DNA-binding proteins identified in this study. I found that 97 are registered as DNA-binding proteins and/or have a known DNA-binding domain, and EP300 and SIN3A do not directly bind to DNA, which was also checked by Google search. I have added the following explanation in line 257 to indicate direct and indirect DNA-binding proteins: Among 99 insulator-associated DBPs, EP300 and SIN3A do not directly interact with DNA, and thus 97 insulator-associated DBPs directly bind to DNA. I have updated the sentence in line 20 of the Abstract as follows: We discovered 97 directional and minor nondirectional motifs in human fibroblast cells that corresponded to 23 DBPs related to insulator function, CTCF, and/or other types of chromosomal transcriptional regulation reported in previous studies.

      2. I am not sure if I understood correctly, but why do the authors consider enhancers spanning 2Mb (200 bins of 10Kb around eSNPs)? This seems wrong. Enhancers are relatively small regions (100bp to 1Kb) and only a very small subset form super enhancers.

      As the reviewer mentioned, I recognize enhancers are relatively small regions. In the paper, I intended to examine further upstream and downstream of promoter regions where enhancers are found. Therefore, I have modified the sentence in lines 929 - 931 of the Fig. 2 legend as follows: Enhancer-gene regulatory interaction regions consist of 200 bins of 10 kbp between -1 Mbp and 1 Mbp region from TSS, not including promoter.

      3. I think the H3K27me3 analysis was very good, but I would have liked to see also constitutive heterochromatin as well, so maybe repeat the analysis for H3K9me3.

      Following the reviewer's advice, I have added the ChIP-seq data of H3K9me3 as a track of the UCSC Genome Browser. The distribution of H3K9me3 signal was different from that of H3K27me3 in some regions. I also found the insulator-associated DNA-binding sites close to the edges of H3K9me3 regions and took some screenshots of the UCSC Genome Browser of the regions around the sites in Supplementary Fig. 3b. I have modified the following sentence on lines 974-976 in the legend of Fig. 4: a Distribution of histone modification marks H3K27me3 (green color) and H3K9me3 (turquoise color) and transcript levels (pink color) in upstream and downstream regions of a potential insulator site (light orange color). I have also added the following result on lines 356-360: The same analysis was performed using H3K9me3 marks, instead of H3K27me3 (Fig. S3b). We found that the distribution of H3K9me3 signal was different from that of H3K27me3 in some regions, and discovered the insulator-associated DNA-binding sites close to the edges of H3K9me3 regions (Fig. S3b).

      4. I was not sure I understood the analysis in Figure 6. The binding site is with 500bp of the interaction site, but micro-C interactions are at best at 1Kb resolution. They say they chose the centre of the interaction site, but we don't know exactly where there is the actual interaction. Also, it is not clear what they measure. Is it the number of binding sites of a specific or multiple DBP insulator proteins at a specific distance from this midpoint that they recover in all chromatin loops? Maybe I am missing something. This analysis was not very clear.

      The resolution of the Micro-C assay is considered to be 100 bp and above, as the nucleosome core particle contains 145 bp (and 193 bp with linker) of DNA. However, internucleosomal DNA is cleaved by endonuclease into fragments of multiples of 10 nucleotides (Pospelov VA et al. Nucleic Acids Research 1979). Highly nested focal interactions were observed (Goel VY et al. Nature Genetics 2023). Base pair resolution was reported using Micro Capture-C (Hua P et al. Nature 2021). Sub-kilobase (20 bp resolution) chromatin topology was reported using an MNase-based chromosome conformation capture (3C) approach (Aljahani A et al. Nature Communications 2022). On the other hand, Hi-C data was analyzed at 1 kb resolution (Gu H et al. bioRxiv 2021). If the resolution of Micro-C interactions were at best 1 kb, the binding sites of a DNA-binding protein would not show a peak around the center of the genomic locations of interaction edges. Each panel shows the number of binding sites of a specific DNA-binding protein at a specific distance from the midpoint of all chromatin interaction edges. I have modified and added the following sentences in lines 593-597: High-resolution chromatin interaction data from a Micro-C assay indicated that most of the predicted insulator-associated DBPs showed DNA-binding-site distribution peaks around chromatin interaction sites, suggesting that these DBPs are involved in chromatin interactions and that the chromatin interaction data has a high degree of resolution. Base pair resolution was reported using Micro Capture-C.
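
      The quantity plotted in each panel, as described above, can be sketched as a count of binding sites per signed-distance bin around interaction midpoints. This is only an illustrative sketch: the function name `distance_profile`, the 500 bp window, the 100 bp bin size, and the coordinates are all assumptions, not values or code from the manuscript.

```python
from collections import Counter


def distance_profile(midpoints, binding_sites, window=500, bin_size=100):
    """Count binding sites per signed-distance bin around interaction midpoints.

    All arguments are hypothetical: coordinates are base-pair positions on a
    single chromosome, and window/bin_size are illustrative defaults.
    """
    counts = Counter()
    for mid in midpoints:
        for site in binding_sites:
            offset = site - mid
            if -window <= offset < window:
                # Floor the offset to the left edge of its bin.
                counts[(offset // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))


# Made-up coordinates, not real data.
midpoints = [10_000, 20_000]
binding_sites = [9_950, 10_020, 10_480, 19_990, 25_000]
profile = distance_profile(midpoints, binding_sites)
print(profile)
```

      A peak in the bins nearest zero would correspond to the binding-site enrichment around interaction centers that the response describes; a flat profile would be expected if the interaction coordinates were too coarse to localize the sites.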

      Minor Comments:

      1. PIQ does not consider TF concentration. Other methods do that and show that TF concentration improves predictions (e.g., https://www.biorxiv.org/content/10.1101/2023.07.15.549134v2 or https://pubmed.ncbi.nlm.nih.gov/37486787/). The authors should discuss how that would impact their results.

      The directional bias of CTCF binding sites was identified by ChIA-PET interactions of CTCF binding sites. The analysis of the contribution scores of DNA-binding sites of proteins considering the binding sites of CTCF as an insulator showed the same tendency of directional bias of CTCF binding sites. In the analysis, to remove the false-positive prediction of DNA-binding sites, I used the binding sites that overlapped with a ChIP-seq peak of the DNA-binding protein. This result suggests that the DNA-binding sites of CTCF obtained by the current analysis have sufficient quality. Therefore, if the accuracy of prediction of DNA-binding sites is improved, although the number of DNA-binding sites may be different, the overall tendency of the directionality of DNA-binding sites will not change and the results of this study will not change significantly.

       As for the first reference in the reviewer's comment, chromatin interaction data from Micro-C assay does not include all chromatin interactions in a cell or tissue, because it is expensive to cover all interactions. Therefore, it would be difficult to predict all chromatin interactions based on machine learning. As for the second reference in the reviewer's comment, pioneer factors such as FOXA are known to bind to closed chromatin regions, but transcription factors and DNA-binding proteins involved in chromatin interactions and insulators generally bind to open chromatin regions. The search for the DNA-binding motifs is not required in closed chromatin regions.
      

      2. DeepLIFT is a good approach to interpret complex structures of CNN, but is not truly explainable AI. I think the authors should acknowledge this.

      In the DeepLIFT paper, the authors explain that DeepLIFT is a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input (Shrikumar A et al. ICML 2017). DeepLIFT compares the activation of each neuron to its 'reference activation' and assigns contribution scores according to the difference. DeepLIFT calculates a metric to measure the difference between an input and the reference of the input.

       Truly explainable AI would be able to find cause and reason, and to make choices and decisions like humans. DeepLIFT does not perform causal inferences. I did not use the term "Explainable AI" in our manuscript, but I briefly explained it in Discussion. I have added the following explanation in lines 623-628: AI (Artificial Intelligence) is considered as a black box, since the reason and cause of prediction are difficult to know. To solve this issue, tools and methods have been developed to know the reason and cause. These technologies are called Explainable AI. DeepLIFT is considered to be a tool for Explainable AI. However, DeepLIFT does not answer the reason and cause for a prediction. It calculates scores representing the contribution of the input data to the prediction.
      
       Furthermore, to improve the readability of the manuscript, I have included the following explanation in lines 159-165: we computed DeepLIFT scores of the input data (i.e., each binding site of the ChIP-seq data of DNA-binding proteins) in the deep learning analysis on gene expression levels. DeepLIFT compares the importance of each input for predicting gene expression levels to its 'reference or background level' and assigns contribution scores according to the difference. DeepLIFT calculates a metric to measure the difference between an input and the reference of the input.
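
       The "difference from reference" idea described above can be illustrated for the simplest case, a purely linear model, where each input's contribution is its weight times the difference between the input and its reference, and the contributions sum to the change in output. This is a minimal sketch under that assumption: the weights, inputs, and function names are hypothetical, and real DeepLIFT additionally applies dedicated rules to propagate contributions through nonlinear layers.

```python
def predict(weights, bias, x):
    """Output of a hypothetical linear model."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def linear_contributions(weights, x, reference):
    """Per-input contribution: weight times (input minus reference)."""
    return [w * (xi - ri) for w, xi, ri in zip(weights, x, reference)]


# Illustrative numbers only; they do not come from the manuscript.
weights = [0.5, -1.0, 2.0]
bias = 0.25
x = [1.0, 0.0, 3.0]
reference = [0.0, 0.0, 1.0]

scores = linear_contributions(weights, x, reference)
delta = predict(weights, bias, x) - predict(weights, bias, reference)

# The contributions account for the whole output difference.
print(scores, sum(scores), delta)
```

       The scores describe how much each input moved the prediction relative to the reference; as the response notes, they do not by themselves establish a causal reason for the prediction.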
      
    1. Reviewer #2 (Public review):

      Summary:

      In this paper, Andriani and colleagues are examining the potential role of Zn flux in sperm and its effect on Slo3 channels. This is an interesting question that is likely critical to how sperm function properly and Slo3 channels are a possible candidate for a downstream molecule that is impacted by Zn. In this paper, the authors use Zn imaging, sperm motility assays, and electrophysiology to show that Zn flux impacts sperm function. They then go on to look at the impact Zn has on Slo3 current and propose a binding site based on MD simulations. While the ideas are interesting, the experiments are not well described in many places making understanding the results very difficult. In addition, critical controls are missing throughout the paper.

      Strengths:

      The question of how Zn flux impacts membrane potential and sperm motility is an important one. Moreover, Slo3 presents an interesting candidate for the target of Zn regulation. The combination of methods used here also has the potential to uncover mechanisms of Zn regulation of Slo3.

      Weaknesses:

      Much of the paper lacks experimental description which makes interpretation quite difficult, or a detailed discussion is missing. Examples include:

      (1) Figure 1, particularly the Zn imaging, is not sufficiently described. How is the fluorescence intensity measured? A representative ROI? The whole tail and head? Are the sperm immobile? If not, there is evidence from calcium measurements in cilia that motion artifacts can significantly distort these sorts of measures. Were there controls done? Is the small amount of Zn seen in the tail above the background?

      (2) The second half of Figure 1 is also not well described. What is the extracellular solution in the recordings? When you apply the Zn ionophore, do you expect influx or efflux? I assume efflux is based on the conclusions but this should be discussed explicitly.

      (3) Figure 2H labels the Y axis, "normalized current". Normalized to what? Why do neither of the curves end at 1? A better description of what this figure represents is needed.

      (4) The AlphaFold simulations are not well described. How many Zn binding sites were found? Are all of the histidine mutations in Figure 4 Supplement 1 the ones that were found?

      (5) There is no discussion of physiological intracellular Zn concentration. How much Zn is inside the sperm? How much is likely free vs. buffered? Is 100uM a reasonable physiological concentration?

      There are a number of areas where the interpretation is not well supported by the data including:

      (6) You say in the Figure 4 supplement, that "we did not observe any significant decrease in the percentage of current inhibition." But that is a pretty misleading statement. There are large changes (increases) in the amount of zinc inhibition. These might be allosteric changes but I don't think you can safely eliminate these as relevant Zn binding sites. Also, some of these mutations appear to allow at least some unbinding of Zn.

      (7) Following up on the above point, it seems unfair to conclude that the D162S, E169A, and E205 mutants are part of the inhibitory binding site for Zn when the mutation has no effect on inhibition and only an effect on the washout. The mutations on the intracellular side also had an impact on the washout so it seems equally likely that they are the critical residues based on your data.

      (8) Nowhere in the paper do you make the specific link between Zn flux and membrane hyperpolarization via Slo3. You show that Zn flux changes the ability of the sperm to hyperpolarize and you show that Slo3 is inhibited by Zn but the connection between the two is not demonstrated. There appears to be a specific Slo3 blocker. If you use this in sperm, do you no longer see the Zn effect?

      (9) In the second half of Figure 1, the authors suggest that there is "no hyperpolarization" in 100uM Zn. That is not really true. It is reduced but not absent.

      (10) The claim that Lrcc52 with Slo3 shows a higher current inhibition at pH 7.5 than pH 8 is not well supported because there are only 3 replicates in the 7.5 case. In addition, the claim made in the text that 100uM ZnCl2 "already inhibited mSlo3+Lrcc52 at pH7.5", in contrast with mSlo3 alone, is not tested statistically.

      In a number of places, better controls are needed.

      (11) How specific is this effect for Zn? Mg2+, for instance, is also a divalent cation that is in the hundreds of uM range inside the cell. Does it exert the same effect? Each ion certainly has unique preferred coordination geometries; does your predicted binding with MD show what you might expect for tetrahedral coordination with Zn? Did you test other divalent cations functionally or in silico?

      (12) For the VCF experiments, a significantly higher concentration of Zn was used (10mM). What is the reason for this? There is no discussion of how much a "puff" is. Assuming you are using the RNA injector it is probably on the order of 50nL or less. Assuming the volume of an oocyte is 1uL that would argue that the final concentration is 500uM or higher. But this is also complicated by potential local effects of high Zn at the injection site, artifacts of injecting that much metal, and the fact that a great deal of the Zn will likely be bound to other things inside the cell. Better controls are needed for this experiment.
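
      The back-of-the-envelope estimate in point (12) can be checked with a simple dilution calculation. This sketch uses the reviewer's own assumed numbers (10 mM stock, a 50 nL puff, a 1 uL oocyte); the function name is hypothetical, and the calculation assumes instantaneous uniform mixing with no buffering, which the reviewer notes is unlikely to hold inside a cell.

```python
def final_concentration_uM(stock_mM, injected_nL, cell_volume_nL):
    """Concentration after diluting an injected volume into a cell volume.

    Assumes ideal, uniform mixing and no binding of the ion to cellular
    components, so the result is an upper-level estimate of total added
    ion rather than of free ion.
    """
    total_nL = injected_nL + cell_volume_nL
    return stock_mM * 1000.0 * injected_nL / total_nL


# Reviewer's assumed values: 10 mM Zn, 50 nL puff, 1 uL (1000 nL) oocyte.
estimate = final_concentration_uM(10.0, 50.0, 1000.0)
print(round(estimate, 1))
```

      Under these assumptions the total added Zn comes to a little under 500 uM; a smaller puff would give proportionally less, while incomplete mixing could give much higher local concentrations near the injection site.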

    1. Matthew Dietz

      Dietz became an editor at WLWT in October of 2022. In this article he is writing in a PR manner because he is using the press conference that Richard Pitino was at today.

    1. the game of cat-and-mouse gets turned on its head.

      I am not sure how I feel about using this metaphor in this article. The aspects of this situation are more than just a "game of cat and mouse."

    1. he parietal lobes are located posterior to the frontal lobes at the top of the head. The parietal lobes are involved in body sensations, including temperature, touch, and pain

      The parietal lobes play a crucial role in processing somatosensory information, which includes temperature, touch, and pain.

    1. Even when only the first layer (immediately after the embedding layer) is unfrozen, it can still influence the subsequent layers, enabling the model to produce informative embeddings for the regression head at the final layer

      This is fascinating! I wonder how the perplexity of the pre-training task is affected by which layer you choose to unfreeze.

    2. The ESM-Effect Architecture thus comprises the 35M ESM2 model with 10 of 12 layers frozen and the mutation position regression head (cf. Figure 2). The model’s performance is driven by two key inductive biases in the regression head:

      How would you extend this combined head architecture (mutation position embedding + mean pooled) if you were looking at the effect of a multi-mutation variant?

      One strategy I can think of would be to slice out all mutation positions and pool them. I'm wondering if you guys thought about generalizing the architecture to scenarios when the number of mutations in your DMS dataset varies.
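
      The pooling strategy suggested in this note could be sketched as follows: average the per-residue embeddings at all mutated positions, then concatenate that with the mean-pooled sequence embedding, so that single and multi-mutation variants yield a vector of the same size. This is a hypothetical illustration of the commenter's idea, not the ESM-Effect authors' architecture; the function names and the toy two-dimensional embeddings are assumptions.

```python
def mean_pool(rows):
    """Element-wise mean over a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def variant_representation(embeddings, mutated_positions):
    """Concatenate pooled mutation-site embeddings with the sequence mean.

    embeddings: one vector per residue (e.g. from a protein language model).
    mutated_positions: zero-based indices of the mutated residues.
    """
    if not mutated_positions:
        raise ValueError("at least one mutated position is required")
    site_vec = mean_pool([embeddings[p] for p in mutated_positions])
    seq_vec = mean_pool(embeddings)
    return site_vec + seq_vec  # list concatenation, fixed output size


# Toy per-residue embeddings for a 4-residue sequence (made-up values).
embeddings = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
single = variant_representation(embeddings, [1])
double = variant_representation(embeddings, [1, 3])
print(single, double)
```

      Mean pooling over sites discards which mutation contributed what, so an attention-weighted or sum-based pooling might be worth comparing if the number of mutations per variant varies widely.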

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript from Kaletsky et al is a response to a paper recently published by Craig Hunter's group (Gainey et al 2024). The Murphy lab has previously shown that learned avoidance of C. elegans to PA14 can be transmitted through four generations. In a series of detailed studies, they defined the mechanism of this transgenerational epigenetic inheritance (TEI), identifying both PA14 and C. elegans factors required for this effect (Moore et al., 2019, Kaletsky et al., 2020; Moore et al., 2021). PA14 produces a small RNA, P11, that is necessary and sufficient for transgenerational epigenetic inheritance of avoidance behaviour in C. elegans. In the worm, P11 decreases maco-1 expression, which in turn regulates daf-7.

      In the study by Gainey et al (eLife 2024), the authors report their attempt at replicating the original findings of the Murphy lab using a modified experimental setup. The Gainey study observed avoidance of PA14 and upregulation of daf-7::GFP in the F1 progeny of trained parents, but not in subsequent generations. Importantly, although they examined a number of different deviations of the protocol, they did not repeat the original experiment using the exact protocol outlined in the Moore or Kaletsky papers. Nevertheless, the authors concluded that "this example of TEI is insufficiently robust for experimental investigations".

      The manuscript by Kaletsky et al. attempts to provide an explanation as to why Gainey et al., were unable to observe transgenerational avoidance of PA14. They identify two discrepancies in the methodology used between the two studies and examine the possible impacts of these.

      One of the primary differences in protocols between the two papers is how avoidance is measured. The Murphy group uses the traditional method of adding azide to bacterial spots on the choice plates to trap worms once they have come close to the food spot. The animals are on the plate for 1 hour but most have likely been immobilized before this time point. Gainey et al. omit the azide and instead shift animals to 4C after 30-60 minutes of exposure to immobilize the worms for counting. Kaletsky et al show that the choice of assay has a significant impact on measuring attraction and avoidance.

      While Gainey et al. assert that the addition of azide had no discernible effect on the choice assay results, these data are not shown in their paper. Kaletsky et al. test these conditions head-to-head with the same 1 hour exposure time, showing that with azide, the initial response to PA14 in untrained worms is attraction. By contrast, in the absence of azide, when cold temperature is used to immobilize the worms, the response recorded is aversion to PA14. The choice assay generated by Kaletsky et al. without azide is consistent with the choice assays in untrained worms shown in the Gainey paper, demonstrating that this is likely one factor that contributed to the different outcomes reported in the Gainey paper.

      Kaletsky et al. propose that learned aversion to PA14 may be occurring within the 1-hour exposure time when worms are not trapped in their initial decision with the use of azide. This is consistent with previous findings from another group (Ooi and Prahlad 2017), showing that 45 minutes of exposure is sufficient to overcome the attraction to PA14 and shift to avoidance of PA14. Importantly, the Gainey paper notes exposure times between 30 and 60 minutes before shifting worms to 4C to count; this window may have generated additional variability between assays.

      The second possibility explored by Kaletsky et al. is that the expression of P11 differed between the studies. Because P11 is required for TEI, differences in P11 expression are a reasonable explanation for different observations between studies. Unfortunately, in the Gainey study, P11 levels were not measured; it is therefore not possible to know whether low or absent levels of P11 explain the inability to observe TEI. Nevertheless, Kaletsky et al. test the potential for changes in one growth condition, temperature, to influence the production of P11. Indeed, the expression of P11 differs in PA14 grown at different growth temperatures, providing an additional explanation for the discrepancies.

      While it is possible that temperature is the culprit, it may be another culture condition or media component suppressing P11 expression. Nevertheless, the fact that expression of P11 can so easily be modified demonstrates that P11 expression is not immune to differences in culture conditions. Given its role in nitrogen fixation, I would be surprised if it was not regulated by environmental conditions. Differences in iron content between media batches are notorious for altering bacterial phenotypes. Although outside the scope of this study, with the connection to biofilm formation, I would be curious if iron levels had an impact on P11 expression. All in all, the data highlight the fact that P11 levels should be measured if TEI is not seen.

      Strengths:

      Overall, this is an excellent study that has provided additional understanding of the difference between naïve preference and TEI and provides guidance for investigators in replicating TEI experiments. The manuscript is very well written and provides additional understanding regarding the replication of TEI in response to P. aeruginosa.

      The manuscript provides an important discussion about differences in methodology and how they might reflect specific biology. Many examples of experimental deviations that have large impacts have simple biological explanations. I believe the authors have done an excellent job making this point.

      Weaknesses:

      None noted.

    1. The Epicureans are among these; they deny that there is any Mind behind the universe at all. This view is contrary to all the facts of experience, their own existence included. For if all things had come into being in this automatic fashion, instead of being the outcome of Mind, though they existed, they would all be uniform and without distinction. In the universe everything would be sun or moon or whatever it was, and in the human body the whole would be hand or eye or foot. But in point of fact the sun and the moon and the earth are all different things, and even within the human body there are different members, such as foot and hand and head. This distinctness of things argues not a spontaneous generation but a prevenient Cause; and from that Cause we can apprehend God, the Designer and Maker of all.

      Simple but a great starting point against atheists. The world shows signs of intelligent design, and we are born as intelligent human beings (not in terms of literacy, but rather in our senses and free will).

    1. the showrunner is someone who gives a series—and just as importantly, those who work for the series—a sense of structure and direction. The showrunner is in charge of the production and the creative content of a television show. The job demands the skills of a visionary: someone who can hold the entire narrative of the series in their head; who is the gatekeeper of language, tone, and aesthetics on the set and behind the scenes;

      It explains how a showrunner combines artistic vision with managerial skills to maintain narrative and style consistency, ensuring that each episode strikes the desired tone. The passage demonstrates that even before the term "showrunner" became widely used, television relied on a similar role to guide its creative endeavors. It suggests an evolving understanding of creative authority, in which the person behind the scenes can influence not only the narrative but also the production process.

    2. Testimony that day related to Oppenheimer’s disparate roles: as a producer working for Desilu, as Vice President of the Television Writers of America, and as head writer of Lucy. The attorney for the SWG insisted that as a prestigious producer and a potential employer of writers, Oppenheimer was exercising an undue amount of influence recruiting writers to join the Television Writers Guild.

      This passage focuses on a turning point when labor unions questioned the hyphenate's innovative role. The story follows Oppenheimer's appearance before the National Labor Relations Board, which shows the conflict between evolving creative roles and traditional labor norms. The Screen Writers Guild's objection to Oppenheimer's dual role as producer and writer reflects long-held concerns about power imbalances and conflicts of interest in a rapidly industrializing television industry.

    1. Ultimately, the TWA dissolved in 1954, and all writers of scripted entertainment for film, television, and radio gathered under the umbrella of the Writers Guild of America. But it was on account of writers for shows like Lucy, who first claimed credit as writers and as producers, that conflicting notions of authorship and ownership came to a head for the guilds that represented these media workers.

      I never knew this was one of the reasons that television production moved primarily to LA. With the TWA dissolving, writers had to join the Writers Guild of America in order to continue production. This inevitably challenged the ownership of content, a dispute that picked up once projects represented by the guilds moved to LA.

  12. Mar 2025
    1. . John Harvey Kellogg was the head of this Adventist health center.

      When you see the background of people and inventors that originally thought of what are now popular items, you can see more of why they created what they did and how their everyday lives compelled them to create something to fix an issue or fill a gap, just like the Kellogg brothers were hoping to create a breakfast food that would satisfy their religious requirements and their nutritional needs.

    1. role as head writer-producer. The head writer is—at best—a benevolent dictator who provides a consistency of voice from episode to episode, runs the writers’ room, works on set with the director, actors, cinematographer, and designers ensuring that the words on the page translate to the screen, and often sits in the editing room helping the editor craft a story. One could easily assume that it was Lucille Ball and Desi Arnaz who were the show’s creators, but they were not. Lucy’s creator was a man whose name is barely remembered and rarely mentioned: Jess Oppenheimer.

      This is a very interesting concept, and it makes me think about the "secret" creators behind modern television shows. Throughout history, there have always been quiet hitmakers, who choose to sit in the background rather than make themselves known. After all, it makes more sense for a studio to put a celebrity's name on a show than an unrecognized name.

    2. Desi Arnaz, as the head of Desilu, was like the best of studio moguls of the Hollywood era, assembling the most talented workers available.

      This is not something that's very common nowadays. I feel like actors who become producers usually do so on different films, rather than opting to act in their own projects.

    1. “Malibu” by Hole is one of the greatest songs in America when I was younger I thought it was a sexy like summer story abt the sandy aesthetic wonder of a SoCal summer beach town How you listen to something completely in yr own head.

      Personal

    1. Reviewer #1 (Public review):

      Summary:

      Praegel et al. explore the differences in learning an auditory discrimination task between adolescent and adult mice. Using freely moving (Educage) and head-fixed paradigms, they compare behavioral performance and neuronal responses over the course of learning. The mice were initially trained for seven days on an easy pure frequency tone Go/No-go task (frequency difference of one octave), followed by seven days of a harder version (frequency difference of 0.25 octave). While adolescents and adults showed similar performances on the easy task, adults performed significantly better on the harder task. Quantifying the lick bias of both groups, the authors then argue that the difference in performance is not due to a difference in perception, but rather to a difference in cognitive control. The authors then used neuropixel recordings across 4 auditory cortical regions to quantify the neuronal activity related to the behavior. At the single-cell level, the data shows earlier stimulus-related discrimination for adults compared to adolescents in both the easy and hard tasks. At the neuronal population level, adults displayed a higher decoding accuracy and lower onset latency in the hard task as compared to adolescents. Such differences were not only due to learning, but also to age as concluded from recordings in novice mice. After learning, neuronal tuning properties had changed in adults but not in adolescents. Overall, the differences between adolescent and adult neuronal data correlate with the behavior results in showing that learning a difficult task is more challenging for younger mice.
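      The two difficulty levels in the summary are stated in octaves, which is a ratio scale: a separation of n octaves means the two frequencies differ by a factor of 2^n. The short sketch below shows that arithmetic. The 10 kHz reference tone and the function names are illustrative assumptions, not values or code from the manuscript.

```python
import math

def shift_by_octaves(freq_hz: float, octaves: float) -> float:
    """Return the frequency that lies `octaves` above `freq_hz`."""
    return freq_hz * 2.0 ** octaves

def octave_distance(f1_hz: float, f2_hz: float) -> float:
    """Return the absolute separation of two tones in octaves."""
    return abs(math.log2(f2_hz / f1_hz))

# Hypothetical reference tone; the actual task frequencies are not given here.
reference_hz = 10_000.0
easy_pair_hz = shift_by_octaves(reference_hz, 1.0)    # one octave apart
hard_pair_hz = shift_by_octaves(reference_hz, 0.25)   # a quarter octave apart

print(f"easy task: {reference_hz:.0f} Hz vs {easy_pair_hz:.0f} Hz")
print(f"hard task: {reference_hz:.0f} Hz vs {hard_pair_hz:.0f} Hz")
```

      Under this assumption the easy pair differs by a factor of 2, while the hard pair differs by a factor of about 1.19, which is why the second stage is the more demanding discrimination.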

      Strengths:

      (1) The behavioral task is well designed, with the comparison of easy and difficult tasks allowing for a refined conclusion regarding learning across ages. The experiments with optogenetics and novice mice complete the research question in a convincing way.

      (2) The analysis, including the systematic comparison of task performance across the two age groups, is most interesting and reveals differences in learning (or learning strategies?) that are compelling.

      (3) Neuronal recording during both behavioral training and passive sound exposure is particularly powerful and allows interesting conclusions.

      Weaknesses:

      (1) The presentation of the paper must be strengthened. Inconsistencies, mislabeling, duplicated text, typos, and inappropriate color code should be changed.

      (2) Some claims are not supported by the data. For example, the sentence that says that "adolescent mice showed lower discrimination performance than adults" (l.22) should be rewritten, as the data does not show that for the easy task (Figure 1F and Figure 1H).

      (3) The recording electrodes cover regions in the primary and secondary cortices. It is well known that these two regions process sounds quite differently (for example, one has tonotopy, the other does not), and separating recordings from both regions is important to conclude anything about sound representations. The authors show that the conclusions are the same across regions for Figure 4, but is it also the case for the subsequent analysis? In Figure 7 for example, are the quantified properties not distinct across primary and secondary areas? If this is not the case, how is it compatible with the published literature?

      (4) Some analysis interpretations should be more cautious. For example, I do not understand how the lick bias, defined (according to the methods) as the inverse normal distribution of the z-score (hit rate) + z-scored (false alarm rate; Figure 1j?, l.749-750), should reflect a cognitive difficulty (l. 161-162, l.171). A lower lick rate in general could reflect a weaker ability to withhold licking, as indicated on l.164, but also many other things, such as a lower frustration threshold, lower satiation, or more energy.
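      For readers following the lick-bias discussion, here is a minimal sketch of the textbook signal detection measures for a Go/No-go task: sensitivity d' = z(hit rate) - z(false alarm rate) and criterion c = -0.5 * (z(hit rate) + z(false alarm rate)). These are the standard definitions, not necessarily the formula used in the manuscript, whose wording the reviewer found unclear. The function names, the log-linear correction, and the trial counts are assumptions for illustration.

```python
from statistics import NormalDist

def z(p: float) -> float:
    """Inverse of the standard normal CDF (the z-transform of a rate)."""
    return NormalDist().inv_cdf(p)

def corrected_rate(count: int, n_trials: int) -> float:
    """Log-linear correction so rates of exactly 0 or 1 stay finite under z()."""
    return (count + 0.5) / (n_trials + 1)

def sdt_measures(hits: int, go_trials: int, false_alarms: int, nogo_trials: int):
    """Return (d_prime, criterion) from raw Go/No-go trial counts."""
    hit_rate = corrected_rate(hits, go_trials)
    fa_rate = corrected_rate(false_alarms, nogo_trials)
    d_prime = z(hit_rate) - z(fa_rate)             # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # negative = bias toward licking
    return d_prime, criterion

# Hypothetical session: 80 hits in 100 Go trials, 40 false alarms in 100 No-go trials.
d_prime, criterion = sdt_measures(80, 100, 40, 100)
print(f"d' = {d_prime:.2f}, criterion = {criterion:.2f}")
```

      With this sign convention, a negative criterion indicates a tendency to lick regardless of the stimulus. As the reviewer notes, such a bias does not by itself identify its cause, since motivation, thirst, or impulsivity would all shift it in the same direction.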

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to find out how, and how well, adult and adolescent mice discriminate tones of different frequencies and whether there are differences in processing at the level of the auditory cortex that might explain differences in behavior between the two groups. Adolescent mice were found to be worse at sound frequency discrimination than adult mice. The performance difference between the groups was most pronounced when the sounds were close in frequency and thus difficult to distinguish, and could, at least in part, be attributed to the younger mice's inability to withhold licking in no-go trials. By recording the activity of individual neurons in the auditory cortex when mice performed the task or were passively listening, as well as in untrained mice, the authors identified differences in the way that the adult and adolescent brains encode sounds and the animals' choice that could potentially contribute to the differences in behavior.

      Strengths:

      The study combines behavioural testing in freely-moving and head-fixed mice, optogenetic manipulation, and high-density electrophysiological recordings in behaving mice to address important open questions about age differences in sound-guided behavior and sound representation in the auditory cortex.

      Weaknesses:

      For some of the analyses that the authors conducted it is unclear what the rationale behind them is and, consequently, what conclusion we can draw from them. The results of the optogenetic manipulation, while very interesting, warrant a more in-depth discussion.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, Benedikt et al. sought to understand how adolescents and adult mice differ in auditory cortical processing, performance on a go/nogo sound-guided task, and learning. They report that behavioral performance is superior in adults. They also report that neuronal representations of both the acoustic stimulus and behavioral choice are weaker and sluggish in adolescents compared to adults and that these differences were larger in expert mice than in novices. The neural basis of adolescent auditory cognition is an important topic (both clinically and from a basic science perspective) and vastly understudied. However, many aspects of the study fell short, thereby undermining the primary conclusions drawn by the authors. My major concerns are as follows:

      (1) The authors report that "adolescent mice showed lower auditory discrimination performance compared to adults" and that this performance deficit was due to (among other things) "weaker cognitive control". I'm not fully convinced of this interpretation, for a few reasons. First, the adolescents may simply have been thirstier, and therefore more willing to lick indiscriminately. The high false alarm rates in that case would not reflect a "weaker cognitive control" but rather, an elevated homeostatic drive to obtain water. Second, even the adult animals had relatively high (~40%) false alarm rates on the freely moving version of the task, suggesting that their behavior was not particularly well controlled either. One fact that could help shed light on this would be to know how often the animals licked the spout in between trials. Finally, for the head-fixed version of the task, only d' values are reported. Without the corresponding hit and false alarm rates (and frequency of licking in the intertrial interval), it's hard to know what exactly the animals were doing.

      (2) There are some instances where the citations provided do not support the preceding claim. For example, in lines 64-66, the authors highlight the fact that the critical period for pure tone processing in the auditory cortex closes relatively early (by ~P15). However, one of the references cited (ref 14) used FM sweeps, not pure tones, and even provided evidence that the critical period for this more complex stimulus occurred later in development (P31-38). Similarly, on lines 72-74, the authors state that "ACx neurons in adolescents exhibit high neuronal variability and lower tone sensitivity as compared to adults." The reference cited here (ref 4) used AM noise with a broadband carrier, not tones.

      (3) Given that the authors report that neuronal firing properties differ across auditory cortical subregions (as many others have previously reported), why did the authors choose to pool neurons indiscriminately across so many different brain regions? And why did they focus on layers 5/6? (Is there some reason to think that age-related differences would be more pronounced in the output layers of the auditory cortex than in other layers?)

    4. Author response:

      Reviewer #1:

      A) The presentation of the paper must be strengthened. Inconsistencies, mislabelling, duplicated text, typos, and inappropriate colour code should be changed.

      We will revise the manuscript to correct the abovementioned issues.

      B) Some claims are not supported by the data. For example, the sentence that says that "adolescent mice showed lower discrimination performance than adults" (l.22) should be rewritten, as the data does not show that for the easy task (Figure 1F and Figure 1H).

      We will carefully review, verify claims, and correct conclusions where needed.

      C) In Figure 7 for example, are the quantified properties not distinct across primary and secondary areas?

      We will analyse the data in Figure 7 separately for AUDp and secondary auditory cortices to test regional differences. Additionally, we will provide a table summarizing key neuronal firing properties for each area during passive recordings to clarify how activity varies across cortical subregions and developmental stages.

      D) Some analysis interpretations should be more cautious. (..) A lower lick rate in general could reflect a weaker ability to withhold licking, as indicated on l.164, but also many other things, such as a lower frustration threshold, lower satiation, or more energy.

      We will address issues around lick bias including alternative explanations, such as differences in motivation or impulsivity.

      Reviewer #2:

      A) For some of the analyses that the authors conducted it is unclear what the rationale behind them is and, consequently, what conclusion we can draw from them.

      We will edit the discussion and clarify these points. In addition, we will adjust and extend the methodology section to clarify the rationale of our analysis.

      B) The results of the optogenetic manipulation, while very interesting, warrant a more in-depth discussion.

      We agree that the effects observed in our optogenetic manipulation warrant further discussion. We will extend the analysis and discussion of ACx silencing.

      Reviewer #3:

      A) One fact that could help shed light on this would be to know how often the animals licked the spout in between trials. Finally, for the head-fixed version of the task, only d' values are reported. Without the corresponding hit and false alarm rates (and frequency of licking in the intertrial interval), it's hard to know what exactly the animals were doing.

      We recognize the need for a more nuanced analysis for the head-fixed version of the task. We will extend the behavioral analysis and provide more details to clarify these points.

      B) There are some instances where the citations provided do not support the preceding claim. For example, in lines 64-66, the authors highlight the fact that the critical period for pure tone processing in the auditory cortex closes relatively early (by ~P15). However, one of the references cited (ref 14) used FM sweeps, not pure tones, and even provided evidence that the critical period for this more complex stimulus occurred later in development (P31-38). Similarly, on lines 72-74, the authors state that "ACx neurons in adolescents exhibit high neuronal variability and lower tone sensitivity as compared to adults." The reference cited here (ref 4) used AM noise with a broadband carrier, not tones.

      We appreciate the reviewer pointing out instances where our citations may not fully support our claims. We will carefully review the relevant citations and revise them to ensure they accurately reflect the findings of the cited studies. We will update references in lines 64–66 and 72–74 to better align with the specific stimulus types and developmental timelines discussed.

      C) Given that the authors report that neuronal firing properties differ across auditory cortical subregions (as many others have previously reported), why did the authors choose to pool neurons indiscriminately across so many different brain regions?

      We agree that pooling neurons from multiple auditory cortical regions could potentially obscure region-specific differences. However, we addressed this concern by analyzing regional differences in neuronal firing properties, as shown in Supplementary Figures S4-1 and S4-2, and Supplementary Tables 2 and 3. Additionally, we examined stimulus-related and choice-related activity across regions and found no significant differences, as presented in Supplementary Figure S4-3. Please see our response to Reviewer 1, where we further elaborate on this point.

      D) And why did they focus on layers 5/6? (Is there some reason to think that age-related differences would be more pronounced in the output layers of the auditory cortex than in other layers?)

      We acknowledge that other cortical layers are also of interest and may contribute differently to auditory processing across development. Our focus on layers 5/6 was motivated by both methodological considerations and biological relevance. These layers contain many of the principal output neurons of the auditory cortex, and are therefore well positioned to influence downstream decision-making circuits. We will clarify this rationale in the revised manuscript and note the limitations of our approach.

    1. Joseph goes on to explain that he values “downtime to clear my head or to meditate on an idea,

      I also strongly value downtime and taking breaks in between writing or doing any kind of assignment. It gives you the opportunity to let your brain rest and rejuvenate before starting again.

    1. veiling or covering might signal oppression

      Head coverings are interpreted differently in Western cultures than in Muslim cultures. In Western societies, head coverings are often viewed as symbols of oppression and a lack of independence. But in Muslim communities, they can symbolize religious values, cultural identity, or just personal preference. The practice of veiling is used as a justification for "saving" women, who are often seen as victims of an oppressive system, without considering the social and religious context. The author opposes this assumption and argues that this justification serves political purposes rather than serving the Muslim community. I mostly agree with the author’s stance. I think it’s important to respect Muslim women who wear head coverings no matter their reason, whether it’s personal choice, religion, or something else. A clarifying question I have is, “How can we more accurately represent Muslim women who choose to wear head coverings?”

    1. I see no point in bishops or preachers or Christian evangelists just recycling the kind of stuff that you can get from any soft-left liberal because everyone is giving that. If I want that, I’ll get it from a Liberal Democrat councilor. If you’re a Christian, you think that the entire fabric of the cosmos was ruptured by this strange singularity where someone who is a God and a man sets everything on its head. To say it’s supernatural is to downplay it. I mean this is a massive singularity at the heart of things. And if you don’t believe that, it seems to me you’re not really a confessional Christian. You may be a cultural Christian, but you’re not a confessional Christian. So if you believe that, it should be possible to dwell on all the other weird stuff that traditionally comes as part of the Christian package. It seems to me that there’s a deep anxiety about that, almost a sense of embarrassment… If it’s to be preached as something true, the strangeness of it, the way that it can’t be framed by what seems to be mere reality, has to be fundamental to it. I don’t want to hear what bishops think about Brexit; I know what they think about Brexit, and it’s not particularly interesting. – Tom Holland, “How Christianity Gained Dominion” (interview)

      juxtaposition of "soft-left liberal" and "Liberal Democrat" with "Christian" here....

      not Christian and non-Christian

      something telling in this dichotomy

    1. you can write your draft in one long, stream-of-thought rant. I call these “splat drafts” — you get everything in your head out on the page at once, without regard for form (hence the “splat”). Then you treat it as the raw material for a second, proper draft. Think of it as the act of dumping the puzzle pieces out on the table so you can sift through them and see what fits together.

      He uses the phrase "splat draft" in the sense that others might call a "vomit draft", but not in the sense of Mozart's peeing cow.

    1. "The second you blink, the second you stand up and see what's going on, he's cutting behind your head for a lob. He's getting underneath you to wedge you away from the basket for an offensive rebound. It's exhausting. It's exhausting playing against him."

      Flagg has a very big impact through the way he plays basketball. He is "verified".

    1. But, instead of cakes, he gave him with his whip such a rude lash overthwart the legs, that the marks of the whipcord knots were apparent in them, then would have fled away; but Forgier cried out as loud as he could, O, murder, murder, help, help, help! and in the meantime threw a great cudgel after him, which he carried under his arm, wherewith he hit him in the coronal joint of his head, upon the crotaphic artery of the right side thereof, so forcibly, that Marquet fell down from his mare more like a dead than living man.

      When I read this part of the text, I was quite shocked by the level of anatomical detail. My shock subsided once I remembered that Rabelais was a doctor. I've read that Rabelais used some of his writing to share medical knowledge that was otherwise a privilege of the elite. He takes the time to name the anatomy rather than simply describing it, for example "the top of the head" (coronal joint) or "right temple/side of the head" (crotaphic artery, with a little bit of research). This further reinforces his position as both a writer and someone with expertise in medicine and the human body.

      This, I think, also applies to the whole "vital urge" aspect of the writing because it is anatomical and gruesome towards the end of the passage. Also, this is simply the effect of panic and adrenaline, a human response.

      Anderson, J. “The Francois Rabelais school of medicine.” BMJ (Clinical research ed.) vol. 323,7327 (2001): 1456-7. doi:10.1136/bmj.323.7327.1456

    2. continueth in the children what was lost in the parents, and in the grandchildren that which perished in their fathers, and so successively until the day of the last judgment,

      This quote helps illustrate Rabelais's humanist views. Although the quote shows that one cannot escape generational mistakes, it highlights the idea that "His head is for the new learning, while his flesh and heart belong to the old" (Screech & Cohen). This is important because the letter shows how one must continue to learn and accept new knowledge even if the past cannot be completely forgotten.

      Source: Screech, M.A., Cohen, John Michael. "François Rabelais". Encyclopedia Britannica, 27 May. 2024, https://www.britannica.com/biography/Francois-Rabelais. Accessed 27 March 2025.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility, and clarity)

      The manuscript by Song et al presents evidence to show that the predicted cysteine protease type 6 secretion system (T6SS) effector Cpe1 inhibits target cell growth by cleaving type II DNA Topoisomerases GyrB and ParE. The authors determined the structure of the protein complex formed by Cpe1 and its immunity protein Cpi1, which allowed them to reveal the mechanism of inhibition. Moreover, the authors identified type II DNA topoisomerases GyrB and ParE as the targets of Cpe1. Overall, the major conclusions were well supported by experimental data of high quality. The findings have expanded our appreciation of the mechanism utilized by T6SS effectors to inhibit target cell growth.

      We thank the reviewer for their positive remarks and valuable suggestions to improve this manuscript.


      Major comments

      To better establish that GyrB and ParE are the sole targets of Cpe1, the authors should express the GG mutant in target cells and determine whether these cells become resistant to Cpe1-mediated killing (inhibition). They can also determine whether co-expression of the cleavage resistant mutants suppresses the toxicity of Cpe1.

      We appreciate the reviewer’s suggestion to investigate additional substrates of Cpe1 beyond GyrB and ParE, which may not have been fully captured in our crosslinking-mass spectrometry experiments due to technical limitations or low protein abundance. To address this topic, we generated target cells heterologously expressing cleavage-resistant GyrB and ParE variants (GyrBΔG102 and ParEΔG98) that are not susceptible to Cpe1, as described in our original manuscript (Figures 3h, i). We performed both a Cpe1 expression assay and a competition assay to assess whether expression of the cleavage-resistant variants suppresses Cpe1 toxicity (Author Response Figures 1a, b). However, we did not observe a substantial protective effect. While this outcome could suggest that GyrB and ParE are not the sole targets of Cpe1, alternative explanations are also plausible. In the Cpe1 expression assay, high levels of Cpe1 could still act on endogenous wild-type GyrB and ParE, and although we attempted to increase variant expression, precise quantification remains challenging. In the competition assay, highly active Cpe1 may have continued to target wild-type substrates throughout the experiment, potentially masking any protective effect. Additionally, reduced activity of the mutant proteins could contribute to the observed results. Finally, deletion of the global repressor H-NS in the Cpe1-producing E. coli strain may have induced other interbacterial competition mechanisms [1], leading to growth inhibition independently of Cpe1. Addressing these questions comprehensively would require a more systematic investigation under a wider range of conditions. We consider this an important avenue for future studies.

      Results in Figure 7 clearly show that Cpi1 is capable of displacing ParE from Cpe1 due to higher affinity. Yet, the "competitive inhibition model" described in the last result section does not completely match what is really happening in Cpe1-mediated interbacterial competition. If Cpi1 is in the target cell, it would more likely engage the incoming Cpe1 before it can interact with ParE or GyrB, so competition does not occur in this scenario. Similarly, in the predatory cells expressing Cpe1 and Cpi1, these two proteins will form a stable protein complex, and no competition with the target will occur. The authors should reconsider their model.

      We thank the reviewer for their comments and appreciate the opportunity to clarify this point. First, we believe the reviewer is referring to Figure 5 rather than Figure 7. In our model, the primary role of immunity proteins in interbacterial competition is to neutralize cognate toxins and prevent self- or kin-intoxication. These immunity proteins exhibit high specificity and strong binding affinity toward their associated toxins, ensuring effective protection [2]. In predatory cells, immunity proteins are typically co-expressed with their corresponding toxins, likely enabling immediate suppression upon translation. During kin competition, immunity proteins can protect cells even after foreign toxins engage their substrates.

      Our results demonstrate that Cpi1 binds Cpe1 with higher affinity than its substrates and can displace them from pre-formed Cpe1-substrate complexes (Figures 5b-f). This aligns with the established function of immunity proteins in interbacterial competition and provides a mechanistic basis for how they confer protection, even when toxins have initially engaged their targets [2]. We acknowledge the reviewer’s point that in both scenarios, whether in the recipient cell or the toxin-producing cell, Cpe1 may first encounter Cpi1. However, our model underscores that Cpi1 not only binds at the substrate site but also exhibits superior affinity for Cpe1, ensuring robust protection against Cpe1-mediated toxicity.

      Minor comments

      "Intoxication" was used throughout the text numerous times to describe the activity of Cpe1. Looking in the Merriam-Webster dictionary, "Intoxication" means "a condition of being drunk". This word should be replaced with "toxicity" or some other terms in this line.

      We thank the reviewer for this comment. We acknowledge that the term "intoxication" is commonly associated with alcohol consumption, yet the Merriam-Webster dictionary also defines it as "an abnormal state that is essentially a poisoning" (https://www.merriam-webster.com/dictionary/intoxication). This definition aligns with its well-established usage in the field of interbacterial competition to describe the effects of interbacterial toxins during antagonism [3-5], which we have adopted in our manuscript. However, we appreciate the reviewer’s concern and remain open to revising the terminology if deemed necessary for clarity.

      Lines 46-48, references on contact-dependent killings by these systems mentioned should be cited. Ref. 9 cited does NOT cover the information at all.

      We thank the reviewer for this comment. We have revised the citation and now reference studies that specifically describe contact-dependent killing systems in the relevant sentences (Lines 45–50).

      "characterizations" should be "characterization".

      We have now modified the sentence as requested (Line 69)

      Line 229 "Cpe1-Bpa monomers" should be "apo Cpe1-Bpa". The results cannot distinguish whether these bands are monomers or multimers.

      We appreciate the reviewer’s careful assessment of our manuscript. The results in Line 233 (Figure 3c) show the enrichment of His-tagged proteins, including crosslinked complexes and overproduced Cpe1-Bpa. Based on the molecular weight marker, the Cpe1-Bpa bands appear between 10–15 kDa, consistent with the molecular weight of Cpe1 monomers (Figure 3a). Therefore, we have labeled this band as “Cpe1-Bpa monomers” and maintained this terminology throughout the text. This designation aligns with previous studies utilizing site-specific crosslinking via Bpa incorporation [6,7].

      Line 283, was the mutation deletion? Substitution was used I think.

      We thank the reviewer for highlighting this point. The GyrB and ParE mutants used to confirm the cleavage sites were deletion mutants, with a single glycine removed from the predicted double-glycine motifs. We have now revised the text for clarity (Lines 285–290)

      Lines 439-444 the discussion should be extended to include other bacterial toxins that target type II DNA topoisomerases (e.g. PMID: 26299961 and PMID: 26814232).

      We appreciate the reviewer’s suggestion. The studies referenced (PMID: 26299961 and PMID: 26814232) describe FicT toxins with adenylyltransferase activity that target and post-translationally modify GyrB and ParE at their ATPase domains, highlighting a potential hotspot for topoisomerase inhibition. We have now incorporated an additional paragraph in the Discussion section to describe these findings (Lines 424–439).

      Reviewer #1 (Significance)

      The authors determined the structure of the protein complex formed by Cpe1 and its immunity protein Cpi1, which allowed them to reveal the mechanism of inhibition. Moreover, the authors identified type II DNA topoisomerases GyrB and ParE as the targets of Cpe1. Overall, the major conclusions were well supported by experimental data of high quality. The findings have expanded our appreciation of the mechanism utilized by T6SS effectors to inhibit target cell growth.

      We sincerely thank the reviewer for their positive comments and for the suggestions to improve our manuscript.

      Reviewer #2 (Evidence, reproducibility, and clarity)

      The manuscript, titled "An Interbacterial Cysteine Protease Toxin Inhibits Cell Growth by Targeting Type II DNA Topoisomerases GyrB and ParE", describes how an effector family was identified and characterized as a papain-like cysteine protease (PLCP) that negatively impacts bacterial growth in the absence of its co-encoded immunity protein. This thorough report includes (1) bioinformatic analysis of prevalence, finding this PLCP effector encoded in many gram-negative bacteria, (2) confirming conservation of catalytic active site via structural (crystallographic) analysis, as well as visualizing contacts with the immunity protein, (3) validation of results using growth studies combined with mutagenesis, (4) using a cell-based cross-linking method to pull out potential targets, which were subsequently identified via mass spectrometry, (5) validation of these results using in vitro protease assays with purified (potential) substrates, including verification of the motif recognized on the substrate(s), and cell-based phenotype analyses, and finally, (6) demonstrating competition between immunity protein and ParE substrate using an in vitro pull-down approach. Overall, this is a strong body of work with compelling conclusions that are well supported by multiple experimental approaches.

      We thank the reviewer for their positive comments regarding our original submission.

      Major comments

      The claims made based on the presented results are well supported, including that this PLCP effector toxin is widespread, is neutralized in a competitive mechanism by its immunity partner, and that it effectively cleaves both GyrB and ParE (subunits of bacterial type II topoisomerases) at a conserved motif, resulting in suppression of bacterial cell growth via mis-regulating chromosome segregation. No additional experiments are needed to further validate these results, and the authors are commended on the cell-based and in vitro studies to deduce very specific mechanisms and structural details.

      We appreciate the reviewer’s positive feedback.

      Minor comments

      While the writing and data presentation are extremely clear, in general I recommend the authors indicate the level(s) of replication for experiments. Figure legends generally note that mean values with standard deviations are shown, but I did not find where the number of replicates (and independent versus technical) were listed.

      We appreciate the reviewer’s suggestion. We have now revised the manuscript to specify the levels of replication (independent vs. technical) for each experiment in the figure legends, particularly in Figures 2 and 3.

      The figures are very clear, but in many instances the addition of PLCP toxin is indicated as "before" and "after"; while a modest change, I recommend altering this to some type of "-" and "+" type nomenclature rather than a time-based notation (especially as presumably both samples were treated identically, just with or without protease).

      We thank the reviewer for this helpful comment. In Figure 3 and Supplementary Figures 5 and 9, we used "before" and "after" to indicate the time points for in vitro cleavage assays verifying Cpe1 cleavage. To minimize variations between reactions, the catalytic mutant Cpe1tox (Cpe1toxC362A) was used as a comparison rather than a reaction without Cpe1tox. In these assays, duplicate reaction mixtures were prepared: one was denatured immediately after preparation ("before" reaction) to serve as a baseline, while the other was incubated to allow enzymatic activity ("after" reaction). This labeling clarifies the comparison between initial and processed samples. We believe this approach clearly distinguishes the effects of Cpe1 activity and provides a reliable basis for assessing proteolysis in our assays.

      I also suggest quantifying the intensities of the gel images presented in Figure 5c, d (for example, Cpe1 intensity as a ratio to that of the ParE ATPase domain), to make the interpretation even more evident.

      We thank the reviewer for the valuable suggestion to quantify the signal intensities of the gel images presented in Figures 5c, d. We have now included the quantification results in Supplementary Figures 9e, f and have updated the respective text in the manuscript (Lines 826-828 and 1066-1087).
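One simple way to express the kind of quantification the reviewer asks for is a per-lane ratio of band intensities. The sketch below is illustrative only: the function name `band_ratio` and all intensity values are hypothetical, and it does not reproduce the analysis behind Supplementary Figures 9e, f.

```python
def band_ratio(cpe1_intensity, pare_intensity):
    """Return the Cpe1 band intensity relative to the ParE ATPase domain band."""
    if pare_intensity <= 0:
        raise ValueError("reference band intensity must be positive")
    return cpe1_intensity / pare_intensity


# Hypothetical background-subtracted intensities (arbitrary units), one pair per lane.
lanes = {
    "lane_1": (500.0, 1000.0),
    "lane_2": (250.0, 1000.0),
}

for lane, (cpe1, pare) in lanes.items():
    print(lane, band_ratio(cpe1, pare))
```

Normalizing to the band in the same lane, as in this sketch, would reduce the influence of loading differences between lanes.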

      Crystallographic structure: the PDB report notes some higher-than-expected RZR (RSRZ) scores; I interpret this to mean that there was strain around the catalytic site of one of the two toxins in the asymmetric unit, or that this copy was less well ordered. The RZR outliers likely arise from non-optimal weighting for geometric restraints. While no figures of electron density are presented, these modest outliers are not expected to alter the conclusions reached in the current work. One point of interest that is not addressed, however, is whether any variance between the two complexes in the asymmetric unit was noted. A passage compares the current toxins to others in the larger subfamily and notes that a rotation of a side chain is needed to superpose them (Line 159). Can the authors please clarify around which bond this rotation is needed, and whether both copies in the asymmetric unit are in the same orientation at this site?

      We appreciate the reviewer’s insightful comments.

      1. We have provided the electron density map for the RSR-Z outlier residues along with the model (Author response Figure 2a). These outlier residues are located at the loop regions of a molecule within the asymmetric unit in the crystal (Chain B). As a result, the electron density for their side chains appears to be noisier compared to residues in the well-folded regions, leading to higher RSR-Z scores. Notably, when we superimposed the models of two complexes within the asymmetric unit, the calculated RMSD value was 0.402 Å (Author response Figure 2b), indicating that the two models are structurally very similar and that these residues are properly assigned. Therefore, the RSR-Z outliers do not significantly impact the overall structure.
      2. Here, we provide a zoomed-in view of Figure 2d, highlighting the superimposed crystal structures of Cpe1 and the closely related PLCPs, ComA and LahT (Author response Figure 2c). As shown, the side chain of the catalytic cysteine residue in ComA adopts a different orientation, positioning it slightly farther from the homologous residues in Cpe1 and LahT. However, since the backbone and catalytic pockets remain structurally intact, we believe that this deviation arises from crystal packing effects rather than an inherent functional distinction. We have now modified the main text (Lines 159-166) to clarify this and prevent any potential misinterpretation.

      Reviewer #2 (Significance)

      Bacteria encode numerous effectors to successfully compete in natural environments or to mediate virulence; these effectors are typically associated with type VI secretion system machinery or referred to as contact dependent inhibition systems. The current work has identified a sub-family of papain-like cysteine protease effectors that are unique by targeting type II topoisomerases. Among the actionable findings is the identification of both the specific site of interaction with the topo substrates, as well as the specific motif recognized for cleavage. This should enable the field to move forward probing for this activity with other toxins and substrates. The insights provided by the competitive neutralization mechanism also stand out as an important contribution that can be more broadly applied. Within the literature, few effector targets are identified, making the current study stand out as impactful by the well-executed experiments that directly support the conclusions.

      While the current study has strong elements of novelty and is complete, it also nicely sets up future studies for remaining open questions. For example, does the nucleotide-bound status of the ATPase domain, or other catalytic intermediate, impact the susceptibility of topoisomerases to cleavage? Is this identified motif found in other ATPase domains? Is the negative supercoiling activity unique to gyrase also impacted, or is the phenotypic mechanism of cell toxicity reliant only on chromosome segregation? What types of kinetic parameters do this class of toxins demonstrate, and does sequence variability alter this? These ideas are a testament to the intriguing study as presented, capturing the readers' curiosity for additional details that are clearly beyond the scope of the current work.

      I anticipate this work will be of interest to the broad field of microbiologists that study interbacterial communication as well as pathogenic mechanisms. While the research is largely fundamental in nature, it is wide in scope with applications to many gram-negative bacteria that inhabit a myriad of niches. The work will also be of interest to specialists in topoisomerases, as the list of toxins that target these essential enzymes is growing and the therapeutic utility of topoisomerase inhibition remains vital. My interest lies in the latter, in toxin-mediated inhibition of topoisomerase enzymes as a means to alter bacterial cell growth. While I have strong expertise in structural biology, I am lacking in expertise for mass spectrometry. I note this because this method was used for the identification of the target substrate.

      We appreciate the reviewer’s insightful discussion and interest in our study. We agree that further investigations are crucial to address the open questions posed, and we have initiated work on some of these avenues.

      For example, considering Cpe1's specificity for the ATPase domain of GyrB and ParE, we have begun examining whether Cpe1 targets other ATPase domains by searching for the consensus sequence or double glycine motifs in the sequences of ATPase domains beyond GyrB and ParE. Among the 42 E. coli ATPase domains identified by the PEC database [8], we found several with double glycine residues. However, none contained the exact LHAGGKF consensus sequence identified in GyrB and ParE, which are targeted by Cpe1 (Author Response Figure 3). These findings suggest that Cpe1 is less likely to target other ATPase domains. Nonetheless, due to Cpe1’s potential tolerance of certain variations within the consensus sequence, we cannot draw a definitive conclusion without further investigation into the cleavage sites.
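The motif search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names and the toy sequences are hypothetical, and only the LHAGGKF consensus is taken from the response. It reports every double-glycine position and whether the exact consensus is present.

```python
import re

CONSENSUS = "LHAGGKF"  # consensus sequence reported for GyrB and ParE


def find_double_glycine(seq):
    """Return 1-based start positions of every 'GG' pair, including overlapping ones."""
    return [m.start() + 1 for m in re.finditer(r"(?=GG)", seq)]


def classify(seq):
    """Summarize GG positions and the 1-based start of the exact consensus, if any."""
    idx = seq.find(CONSENSUS)
    return {
        "gg_positions": find_double_glycine(seq),
        "consensus_start": idx + 1 if idx >= 0 else None,
    }


# Hypothetical toy sequences for illustration only; these are NOT real protein sequences.
toy_domains = {
    "toy_with_consensus": "MSTALHAGGKFDDN",
    "toy_gg_only": "MKTAGGSLVPE",
    "toy_no_gg": "MKTAGSLVPE",
}

for name, seq in toy_domains.items():
    print(name, classify(seq))
```

Because the sketch matches only the exact consensus, it would miss near-matches that Cpe1 might tolerate; a position-specific scoring approach would be needed to explore such variants.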

      Another critical open question is the impact of Cpe1-mediated cleavage on the function of GyrB and ParE. To address this topic, we have begun investigating whether Cpe1 cleavage affects the ATPase activity of these proteins. As expected, our biochemical analysis has demonstrated a significant decrease in ATP hydrolysis in the presence of active Cpe1tox, but not in the presence of the catalytic mutant Cpe1toxC362A (Author response Figures 4a, b). These results confirm that the ATP-dependent activities of both GyrB and ParE are disrupted following Cpe1 cleavage [9]. Previous work on the FicT toxin, which inhibits GyrB and ParE ATPase activity through post-translational modification, found that ATP-dependent activities such as DNA supercoiling, relaxation, and decatenation were inhibited [10,11]. Interestingly, GyrB’s relaxation of negatively supercoiled DNA, which does not require ATP, was also affected to some extent. This outcome raises the question of whether Cpe1-cleaved GyrB results in similar downstream defects. Investigating this possibility would provide valuable insights into Cpe1’s mode of action, although we feel doing so is beyond the scope of the current study. Consequently, we view this as an important area for future research.

      Finally, regarding the potential applications of Cpe1, we are interested in further investigating its enzymatic specificity and properties. In this study, we analyzed the binding kinetics between Cpe1 and its substrate (Figure 5f) and currently we are endeavoring to characterize the kinetics of Cpe1-mediated proteolysis. To better probe hydrolytic dynamics, we plan to utilize a substrate with a reporting group (such as a chromogenic or fluorogenic leaving group) to monitor cleavage over time. We could achieve this by designing a recombinant substrate based on our knowledge of Cpe1’s native substrates (GyrB and ParE) and the target sequence (“LHAGGKF”). Alternatively, a secondary reaction leading to colorimetric changes could be employed for detection. We consider this an exciting research direction and an important next step for this study.

      Overall, we are grateful for the reviewer’s recognition of the novelty and importance of our work in advancing the understanding of interbacterial toxins and their inhibitory effects on topoisomerases. We plan to further investigate the consequences of Cpe1 cleavage on GyrB and ParE and to explore Cpe1 kinetics and its mechanistic actions in more detail. This will not only deepen our understanding of bacterial toxin-mediated inhibition but may also provide critical insights into strategies for targeting type II DNA topoisomerases. The reviewer’s insightful feedback has proven invaluable in shaping our ongoing and future research directions.

      Reviewer #3 (Evidence, reproducibility, and clarity)

      Bacterial warfare in microbial communities has become illuminated by recent discoveries on molecular weapons that allow contact-dependent injection of bacterial toxins between competitors. Among the best characterized systems are the type VI secretion system (T6SS) or the contact-dependent inhibition (CDI) system (i.e. some of the T5SSs). These systems are delivering a plethora of toxins with various biochemical activities and a broad range of targets. In recent years many such toxins have been characterized and their relevance in pointing at appropriate drug targets is increasing.

      In this study the authors built on a previously published association of a family of proteins, papain-like cysteine proteases (PLCPs), with their delivery by T6SS or CDI into target bacterial cells. Whereas this observation is not particularly novel, the finding that this set of proteins, which the authors now call Cpe1, can specifically target bacterial proteins such as ParE and GyrB, so that it affects chromosome partitioning and cell division, is groundbreaking. The authors clearly demonstrate that Cpe1 cleaves its target proteins at a double-glycine recognition site, which is in line with previous characterization of such proteases when fused to a particular category of ABC transporters. Even more remarkably, they show using biochemical approaches that Cpi1 is a cognate immunity protein for Cpe1, preventing its activity not by interfering with the catalytic site but instead with the substrate binding site. The mechanism of competitive inhibition between immunity and substrate is also substantiated by biochemical data.

      We sincerely appreciate the reviewer’s interest in and support of our study.

      Major comments

      • This is a very well conducted study which combines bacterial genetics and phenotypes with excellent biochemical evidence.

      We thank the reviewer for their positive comments.

      • There are 8 targets identified for Cpe1 and yet only two are cleaved by the enzyme. It is intriguing that FtsZ is one target identified by the pull-down but not confirmed for cleavage. The authors rule this out as a false positive, but the cell division defect associated with Cpe1 activity would be consistent here. Are there any double-glycine motifs in FtsZ that could be identified as a cleavage site? Is it possible that slightly different incubation conditions may promote degradation of FtsZ?

      We appreciate the reviewer’s thoughtful comment regarding FtsZ as a potential substrate of Cpe1. This was indeed an intriguing possibility, especially given the cell division defects observed following Cpe1 intoxication. Early on in the project, we also identified FtsZ as a Cpe1 interactor in our proteomic crosslinking assays, which further fueled the hypothesis that FtsZ might be a target.

      To explore this possibility, first we examined the FtsZ protein sequence for potential Cpe1 cleavage sites and identified several double glycine motifs (Author response Figure 5a). However, none of these motifs matched the consensus sequence identified in GyrB and ParE, which is LHAGGKF, a sequence that we have shown to be critical for Cpe1 cleavage activity. In an effort to better understand if FtsZ could still be cleaved by Cpe1, we conducted additional cleavage assays under various conditions (Author response Figure 5b). We tested different incubation temperatures, including increasing the temperature to 37 °C, and extended the reaction time to overnight. However, we did not observe any cleavage of FtsZ under these conditions. Given that FtsZ undergoes significant conformational changes upon binding to GTP [12], we also considered the possibility that the GTP-bound form of FtsZ might be cleaved by Cpe1. However, even under those conditions, no significant cleavage of FtsZ was detected (Author response Figure 5b). Based on these results, we do not have any evidence to support that FtsZ is a target of Cpe1. The observed cell division defects are more likely a secondary effect resulting from the cleavage of GyrB and ParE, direct targets of Cpe1 that are crucial for chromosome segregation.

      • Could it be structurally predicted whether the GG of ParE or GyrB fits into the catalytic site of Cpe1?

      We appreciate the reviewer’s insightful question regarding the structural prediction of the GG motif of ParE and GyrB fitting into the catalytic site of Cpe1. To address this possibility, we used AlphaFold 3 to predict the interaction structure between Cpe1 and its substrates [13]. The resulting model of Cpe1 interacting with the ATPase domain of GyrB (GyrBATPase) is shown in Supplementary Figure 9c. As illustrated, the loop of the GyrB ATPase domain containing the consensus targeting sequence (“LHAGGKF”) fits into the catalytic site of Cpe1, with the GG motif positioned closest to the catalytic cysteine residue, which likely facilitates hydrolysis. We also attempted to model the interaction between Cpe1 and the ATPase domain of ParE. However, confidence for this model was lower (ipTM = 0.74, pTM = 0.71), possibly due to AlphaFold’s preference for certain protein configurations. To gain a more accurate understanding of how Cpe1 binds and recognizes its substrates, we are currently working on co-crystallizing Cpe1tox with GyrB and ParE. This long-term project aims to provide precise structural insights into the Cpe1-substrate interaction and further elucidate the mechanism of cleavage.

      Minor comments

      • The authors described a family of proteases, PLCPs, and characterized one here called Cpe1. It is not clear whether this is a generic name or one specific protein from one particular bacterial species. Indeed, it is unclear from which bacterial strain the Cpe1 protein studied here originates.

      We thank the reviewer for this comment and apologize for the lack of clarity. To provide better context, we have now revised the manuscript (Lines 136-137 and 141-145) to clearly state that the Cpe1 protein characterized in this study originates from E. coli strain ATCC 11775.

      • It may be worth to emphasize that the Cpe1 domain is found in all possible configurations as T6SS cargo and that is to be linked to VgrG, PAAR or Rhs.

      Thank you for this suggestion. We have revised the manuscript accordingly to emphasize this point (Lines 106-109).

      • Line 49 the authors could indicate that the Esx system is also known as type VII secretion system (T7SS).

      Thank you for this suggestion. We have revised the manuscript accordingly (Lines 48–50).

      • Line 113 it may be better to use Proteobacteria instead of Pseudomonadota

      We have revised the manuscript (Lines 114-115) as suggested by the reviewer. It is important to note that following the recent decision by the International Committee on Systematics of Prokaryotes (ICSP) to amend the International Code of Nomenclature of Prokaryotes (ICNP) and formally recognize "phylum" under official nomenclature rules [14,15], the taxonomy database used in our analysis has adopted the updated nomenclature. To ensure consistency, we followed this updated nomenclature throughout the original manuscript.

      Reviewer #3 (Significance)

      This is an excellent piece of work. The characterization of Cpe1 might not look particularly novel at first when compared to previous studies. Yet the findings build in a crescendo by characterizing an original mechanism of action of the cognate immunity protein, and by identifying the molecular target of Cpe1. This provides a real conceptual advance in the T6SS field rather than just reporting yet another T6SS toxin.

      As a T6SS expert I genuinely feel that these findings are groundbreaking and could be targeted to a broad audience, since the possible implications of these observations for future antimicrobial drug discovery or therapeutic approaches are highly relevant.

      We sincerely appreciate the reviewer’s positive remarks and support of our study.

      References

      1. Ishihama, A., and Shimada, T. (2021). Hierarchy of transcription factor network in Escherichia coli K-12: H-NS-mediated silencing and Anti-silencing by global regulators. FEMS Microbiol Rev 45. 10.1093/femsre/fuab032.
      2. Hersch, S.J., Manera, K., and Dong, T.G. (2020). Defending against the Type Six Secretion System: beyond Immunity Genes. Cell Rep 33, 108259. 10.1016/j.celrep.2020.108259.
      3. Russell, A.B., Singh, P., Brittnacher, M., Bui, N.K., Hood, R.D., Carl, M.A., Agnello, D.M., Schwarz, S., Goodlett, D.R., Vollmer, W., and Mougous, J.D. (2012). A widespread bacterial type VI secretion effector superfamily identified using a heuristic approach. Cell Host Microbe 11, 538-549. 10.1016/j.chom.2012.04.007.
      4. Jana, B., Fridman, C.M., Bosis, E., and Salomon, D. (2019). A modular effector with a DNase domain and a marker for T6SS substrates. Nat Commun 10, 3595. 10.1038/s41467-019-11546-6.
      5. Halvorsen, T.M., Schroeder, K.A., Jones, A.M., Hammarlof, D., Low, D.A., Koskiniemi, S., and Hayes, C.S. (2024). Contact-dependent growth inhibition (CDI) systems deploy a large family of polymorphic ionophoric toxins for inter-bacterial competition. PLoS Genet 20, e1011494. 10.1371/journal.pgen.1011494.
      6. Nguyen, T.T., Sabat, G., and Sussman, M.R. (2018). In vivo cross-linking supports a head-to-tail mechanism for regulation of the plant plasma membrane P-type H(+)-ATPase. J Biol Chem 293, 17095-17106. 10.1074/jbc.RA118.003528.
      7. Liu, Y., Yu, J., Wang, M., Zeng, Q., Fu, X., and Chang, Z. (2021). A high-throughput genetically directed protein crosslinking analysis reveals the physiological relevance of the ATP synthase 'inserted' state. FEBS J 288, 2989-3009. 10.1111/febs.15616.
      8. Yamazaki, Y., Niki, H., and Kato, J. (2008). Profiling of Escherichia coli Chromosome database. Methods Mol Biol 416, 385-389. 10.1007/978-1-59745-321-9_26.
      9. Reece, R.J., and Maxwell, A. (1991). DNA gyrase: structure and function. Crit Rev Biochem Mol Biol 26, 335-375. 10.3109/10409239109114072.
      10. Harms, A., Stanger, F.V., Scheu, P.D., de Jong, I.G., Goepfert, A., Glatter, T., Gerdes, K., Schirmer, T., and Dehio, C. (2015). Adenylylation of Gyrase and Topo IV by FicT Toxins Disrupts Bacterial DNA Topology. Cell Rep 12, 1497-1507. 10.1016/j.celrep.2015.07.056.
      11. Lu, C., Nakayasu, E.S., Zhang, L.Q., and Luo, Z.Q. (2016). Identification of Fic-1 as an enzyme that inhibits bacterial DNA replication by AMPylating GyrB, promoting filament formation. Sci Signal 9, ra11. 10.1126/scisignal.aad0446.
      12. Matsui, T., Han, X., Yu, J., Yao, M., and Tanaka, I. (2014). Structural change in FtsZ Induced by intermolecular interactions between bound GTP and the T7 loop. J Biol Chem 289, 3501-3509. 10.1074/jbc.M113.514901.
      13. Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., Ronneberger, O., Willmore, L., Ballard, A.J., Bambrick, J., et al. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493-500. 10.1038/s41586-024-07487-w.
      14. Oren, A., Arahal, D.R., Rossello-Mora, R., Sutcliffe, I.C., and Moore, E.R.B. (2021). Emendation of Rules 5b, 8, 15 and 22 of the International Code of Nomenclature of Prokaryotes to include the rank of phylum. Int J Syst Evol Microbiol 71. 10.1099/ijsem.0.004851.
      15. Oren, A., and Garrity, G.M. (2021). Valid publication of the names of forty-two phyla of prokaryotes. Int J Syst Evol Microbiol 71. 10.1099/ijsem.0.005056.
    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Schmidt and colleagues address the effects of severe hypoxia on the proliferation and differentiation potential of (mouse) cranial neural crest, using a neural crest cell line subjected to hypoxic conditions and assessed by transcriptomic analysis (quantitative reverse transcription PCR, bulk RNA sequencing and bioinformatics). They report a mild effect on cell proliferation and an extensive inhibition of differentiation towards osteoblasts, chondrocytes and smooth muscle cells. They reveal affected biological processes shared between the three fate-biasing conditions, related to cytoskeleton organization and amino acid metabolism. Among the genes affected by hypoxic conditions in vitro, the authors identified risk genes linked to non-syndromic (non-genetic) orofacial clefts exclusively downregulated in osteoblasts and smooth muscle cells, namely Fgfr2, Gstt1 and Tbxa2. Similarly, hypoxia-driven downregulation of genes implicated in syndromic orofacial clefts was observed in all three chondrocyte, osteoblast and smooth muscle differentiation scenarios. Lastly, STRING analysis of downregulated genes cross-validated their findings related to affected differentiation.

      Major comments:

      The conclusions drawn from the experimental data are carefully formulated for the most part. One of the main concerns is that the cells were subjected to extreme hypoxic conditions, while it may be more biologically relevant to include a condition representing milder hypoxia (e.g. 10%). One of the opening claims, that severe hypoxia only mildly affects cell proliferation, is not shown clearly, since no mitotic markers have been analyzed (i.e. KI67 or PCNA staining or a simple EdU incorporation assay). Thus, the claim that they assessed cell proliferation is not very convincing, even though cell death was analyzed. Additionally, the morphology of the cells could be assessed (brightfield images), since previous studies observed that hypoxia can be an inductive factor in cranial neural crest and drive EMT (Scully et al. 2016; Barriga et al. 2013).

      Furthermore, in the RNA-seq analysis of chondrogenic fate-biased cells the authors draw a conclusion based on the proximity of the samples on the PCA plot, which is not very convincing. A more careful analysis of the bulk RNA-seq data sets they have generated for key marker genes would be more convincing (for example, a heatmap with selected genes would be a helpful representation). As mentioned above, a straightforward and not time-consuming experiment (given that it was assessed for a maximum of 72 hrs) would be to repeat the culture of NCCs and stain for mitotic markers, and quantify the number of positively stained cells over total cell numbers. Furthermore, it is not that demanding to add an experimental condition of less severe hypoxia in this assay.

      Without underestimating how time-consuming this would be, a major lack of experimental validation of the key genes they identify as important across all conditions may be the limitation of the study (this would be the difference between correlation and a probable underlying mechanism). This can be circumvented by more extensive reference to in situ data sets from mouse or existing data sets of single cell and spatial transcriptomics. A suggested targeted knock-down (for example with siRNA, shRNA or CRISPR) to validate a few of the key genes revealed as important could take a few months, with an estimated cost of up to 5,000 euros per targeted gene and replicate.

      On methods, replicates and statistics: The experimental methods and approach are described efficiently and seem reproducible. All biological and technical replicates are of a minimum of N=3 from independent experiments and statistical tests have been run in all cases.

      Minor comments:

      One of the key implications of NCCs in palate formation is interaction with orofacial epithelial cells, which the authors also mention. It may be interesting to check if any signaling pathways involved in this crosstalk are affected under hypoxic conditions in their existing bulk RNA-seq data sets. This can be done by using available algorithms such as CellChat (Jin et al. 2021; Jin, Plikus, and Nie 2023), which has been reported to work also in bulk RNA-seq data analysis (according to GitHub). The authors could mine the literature for existing RNA sequencing data that include osteoblasts, chondrocytes and epithelial cells (Ozekin, O'Rourke, and Bates 2023; Piña et al. 2023).

      Additionally, another process that may be affected is EMT (epithelial-to-mesenchymal transition), which can be assessed by re-analysis of the bulk RNA-seq data while focusing on key genes implicated in this process (i.e. E-cadherin, vimentin, EpCAM, Snail, Twist, PRRX1). Lastly, when the authors report on the significantly up- or down-regulated genes, it may be interesting to categorize them by ligands, receptors, intracellular molecules and transcription factors (and use separate plots to visualize them). While a big focus of the manuscript is down-regulated genes, less emphasis was given to upregulated genes (other than the response to hypoxia gene module).

      The authors reference existing studies in the field extensively and accurately, and the manuscript is exceptionally well-written, with only a few points of limited clarity or increased complexity. One such example is when the authors refer to OFC risk genes, because it is not clearly stated how the referenced studies reached their conclusions (for example, are they mouse studies, do they involve mutants, are any of these studies based on GWAS on human cohorts). Clarifying this would significantly improve the flow of the text and highlight the importance of the study and its findings. The figures could be redesigned to be more intuitive to interpret, for example by using violin plots and heatmaps, as discussed, and by including references to, or re-analysis/re-use of, existing spatial transcriptomics and in situs for marker genes.

      In all cases where there is a comparison of gene expression levels, violin plots would be a better representation of up- and down-regulated genes (i.e. selected genes from Fig1K, comparison of gene expression between normoxic and hypoxic NCCs, Fig 2G when analyzing chondrogenesis and the respective analysis for osteoblasts and smooth muscle cells, as well as when comparing the three fate-biasing conditions to identify common genes that are misregulated).

      References:

      Barriga, Elias H., Patrick H. Maxwell, Ariel E. Reyes, and Roberto Mayor. 2013. "The Hypoxia Factor Hif-1α Controls Neural Crest Chemotaxis and Epithelial to Mesenchymal Transition." The Journal of Cell Biology 201 (5): 759-76. https://doi.org/10.1083/jcb.201212100.

      Jin, Suoqin, Christian F. Guerrero-Juarez, Lihua Zhang, Ivan Chang, Raul Ramos, Chen-Hsiang Kuan, Peggy Myung, Maksim V. Plikus, and Qing Nie. 2021. "Inference and Analysis of Cell-Cell Communication Using CellChat." Nature Communications 12 (1). https://doi.org/10.1038/s41467-021-21246-9.

      Jin, Suoqin, Maksim V. Plikus, and Qing Nie. 2023. "CellChat for Systematic Analysis of Cell-Cell Communication from Single-Cell and Spatially Resolved Transcriptomics." bioRxiv. https://doi.org/10.1101/2023.11.05.565674.

      Ozekin, Yunus H., Rebecca O'Rourke, and Emily Anne Bates. 2023. "Single Cell Sequencing of the Mouse Anterior Palate Reveals Mesenchymal Heterogeneity." Developmental Dynamics: An Official Publication of the American Association of Anatomists 252 (6): 713-27. https://doi.org/10.1002/dvdy.573.

      Piña, Jeremie Oliver, Resmi Raju, Daniela M. Roth, Emma Wentworth Winchester, Parna Chattaraj, Fahad Kidwai, Fabio R. Faucz, et al. 2023. "Multimodal Spatiotemporal Transcriptomic Resolution of Embryonic Palate Osteogenesis." Nature Communications 14 (September):5687. https://doi.org/10.1038/s41467-023-41349-9.

      Scully, Deirdre, Eleanor Keane, Emily Batt, Priyadarssini Karunakaran, Debra F. Higgins, and Nobue Itasaki. 2016. "Hypoxia Promotes Production of Neural Crest Cells in the Embryonic Head." Development 143 (10): 1742-52. https://doi.org/10.1242/dev.131912.

      Significance

      Several pieces of evidence have pointed to hypoxia as an environmental factor contributing to congenital orofacial clefts, ranging from studies in mouse to observations in human. The authors do an excellent job of putting this information together, and the question they are trying to answer is of high importance, given the prevalence of such congenital syndromes. In terms of the methods and model employed, there are some limitations related to the choice of a mouse cell line over a human one, the severe hypoxia induced (rather than a milder one), and the conditions of directed differentiation, which do not allow simultaneous examination of more complex lineage transitions. The methods as a whole are not fully up to date, given the advances in single-cell and multiplexed transcriptomics over the last couple of decades and the advanced bioinformatics that could be used in combination with in vitro lineage tracing methods.

      The audience for this work includes neural crest experts, developmental biologists, and potentially clinicians. Such a paper is also relevant to the general public, as more focus and visibility are required for the individuals affected by these syndromes and their families.

      Reviewer's expertise: mouse neural crest lineage and multipotency, lineage tracing, single cell transcriptomics, NGS, immunofluorescence, molecular methods (RNA, DNA based). Limited expertise with in vitro studies.

    1. It became clear that Richard was the right fit for Xavier to take us to championship success in the Big East and NCAA Tournament.”

      This could be seen as an opinion statement because it can't be proven true or false. Rather, it expresses Greg Christopher's thoughts and feelings on why he chose Rick Pitino to be Xavier's next head basketball coach.

    2. Rick Pitino, just completed his fourth season as the head coach at New Mexico,

      Another fact statement because it is something that can be proven true.

  13. folger-main-site-assets.s3.amazonaws.com
    1. Till he unseamed him from the nave to th’ chops,And fixed his head upon our battlements

      Macbeth sliced his enemy open from the navel up to the jaw, then displayed the head on their battlements.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The study combines predictions from MD simulations with sophisticated experimental approaches including native mass spectrometry (nMS), cryo-EM, and thermal protein stability assays to investigate the molecular determinants of cardiolipin (CDL) binding and binding-induced protein stability/function of an engineered model protein (ROCKET), as well as of the native E. coli intramembrane rhomboid protease, GlpG.

      Strengths:

      State-of-the-art approaches and sharply focused experimental investigation lend credence to the conclusions drawn. Stable CDL binding is accommodated by a largely degenerate protein fold that combines interactions from distant basic residues with greater intercalation of the lipid within the protein structure. Surprisingly, there appears to be no direct correlation between binding affinity/occupancy and protein stability.

      Weaknesses:

      (i) While aromatic residues (in particular Trp) appear to be clearly involved in the CDL interaction, there is no investigation of their roles and contributions relative to the positively charged residues (R and K) investigated here. How do aromatics contribute to CDL binding and protein stability, and are they differential in nature (W vs Y vs F)?

      Based on the simulations in Corey et al (Sci Adv 2021), aromatic residues, especially tryptophan, appear to help provide a binding platform for the glycerol moiety of CDL which is quite flat. This interaction is likely why we generally see the tryptophan slightly further into the plane of the membrane than the basic residues, where it may help to orient the lipid. Unlike charge interactions with lipid head groups, such subtle contributions are likely distorted by the transfer to the gas phase, making it difficult to confidently assign changes in stability or lipid occupancy to interactions with tryptophan. We have added an explanation of these considerations to the Discussion section (page 13, last paragraph).

      (ii) In the case of GlpG, a WR pair (W136-R137) present at the lipid-water interface on the periplasmic face (adjacent to helices 2/3) may function akin to the W12-R13 of ROCKET in specifically binding CDL. Investigation of this site might prove to be interesting if it indeed does.

      Thank you for the suggestion. In our CG simulations, we don’t see significant CDL binding at this site, likely because there is just a single basic residue. We note that there is a nearby periplasmic site with two basic residues (K132+K191+W125) that has a higher occupancy, although still far lower than that of the identified cytoplasmic site. In general, periplasmic sites are less common and/or have lower affinity, which may be related to leaflet asymmetry (Corey et al, Sci Adv 2021). We added the CDL density plot for the periplasmic side to Figure S7 and noted this on page 9, next-to-last paragraph.

      (iii) Examples of other native proteins that utilize combinatorial aromatic and electrostatic interactions to bind CDL would provide a broader perspective of the general applicability of these findings to the reader (e.g. the adenine nucleotide translocase (ANT/AAC) of the mitochondria as well as the mechanoenzymatic GTPase Drp1 appear to bind CDL using the common "WRG" motif).

      Several confirmed examples are presented in Corey et al (Sci Adv 2021), the dataset which we used to identify the CDL site in GlpG. So essentially, our broader perspective is that we test the common features observed in native proteins in an artificial system. While it is not clear how a peripheral membrane protein like Drp1 fits into this framework, the CDL binding sites in ANTs indeed have the same hallmarks as the one in GlpG (Hedger et al, Biochemistry 2016). We recently contributed to a study demonstrating that the tertiary structure of ANT Aac2 is stabilized by co-purified CDL molecules, underscoring the general validity of our findings (Senoo et al, EMBO J 2024).  We have added this information to the discussion, pg 12, third paragraph, and added a figure (S8, see below) to highlight the architecture of the Aac2-CDL complex.

      Overall, using both model and native protein systems, this study convincingly underscores the molecular and structural requirements for CDL binding and binding-induced membrane protein stability. This work provides much-needed insight into the poorly understood nature of protein-CDL interactions.

      We thank the reviewer for the positive assessment!

      Reviewer #2 (Public review):

      Summary:

      The work in this paper discusses the use of CG-MD simulations and nMS to describe cardiolipin binding sites in a synthetically designed protein, which can be extrapolated to a naturally occurring membrane protein. While the authors acknowledge that their work illuminates the challenges in engineering lipid binding, they are able to describe some features that highlight residues within GlpG that may be involved in lipid regulation of protease activity, although further study of this site is required to confirm its role in protein activity.

      Comments

      Discrepancy between total CDL binding in CG simulations (Fig 1d) and nMS (Fig 2b,c) should be further discussed. Limitations in nMS methodology selecting for tightest bound lipids?

      We thank the reviewer for pointing out that this needs to be clarified. We analyze proteins in detergent, which is in itself delipidating, because detergent molecules compete with the lipids for binding to the protein, an effect that can be observed in MS (Bolla et al, Angew Chemie Int. Ed. 2020). Native MS of membrane proteins requires stripping of the surrounding lipid vesicle or detergent micelle in the vacuum region of the mass spectrometer, which is done through gentle thermal activation in the form of high-energy collisions with gas molecules. Detergent molecules and lipids not directly in contact with the protein generally dissociate more easily than bound lipids (Laganowsky et al, Nature 2014); however, loosely bound lipids can readily dissociate along with the detergent, artificially reducing occupancy. The nMS data are therefore likely biased towards tightly bound lipids (e.g. those held via electrostatic headgroup interactions). These are, however, the lipids we are interested in, meaning that the use of MS is suitable here. We have noted this in the Discussion, last paragraph on page 12.

      Mutation of helical residues to alanine not only results in loss of lipid binding residues but may also impact overall helix flexibility, is this observed by the authors in CG-MD simulations? Change in helix overall RMSD throughout simulation? The figures shown in Fig.1H show what appear to be quite significant differences in APO protein arrangement between ROCKET and ROCKET AAXWA.

      For most of the study, we use CG with fixed backbone bead properties as well as an elastic network to maintain tertiary structure. This means that a mutation to alanine will have essentially no impact on the stability of the helix or protein in general in the CG simulations in the bilayer. It should be noted that Figure 1H shows snapshots from atomistic gas-phase simulations with a pulling force applied (see schematic in Figure 1F, as well as Figure S1 for end-point structures), where we naturally expect large structural changes due to unfolding. We have analyzed the helix content in the gas-phase simulations and see that helix 1 in ROCKET unwinds within 10 ns but stays helical ca. 10 ns longer when bound to CDL. The AAXWA mutation stabilizes the helical conformation independently of CDL binding, but CDL tethers the folded helix closer to the core (see Figure 1 G and H). We have added this information to the results section and the plot below to Figure S2.

      CG-MD force experiments could be corroborated experimentally with magnetic tweezer unfolding assays, as has been performed for the unfolding of the artificial protein TMHC2. Alternatively, this work could benefit from referencing Wang et al 2019 "On the Interpretation of Force-Induced Unfolding Studies of Membrane Proteins Using Fast Simulations" to support MD vs experimental values.

      We apologize for the confusion here. The force experiments are gas-phase all-atom MD. The simulations show that the protein-lipid complex has a more stable tertiary structure in the gas phase. Since these are gas-phase simulations, they cannot be corroborated using in-solution measurements. Similarly, the paper by Wang et al is a great reference for solution simulations, however, to date the only validations for gas-phase unfolding come from native MS.

      Did the authors investigate if ROCKET or ROCKETAAXWA copurifies with endogenous lipids? Membrane proteins with stabilising CDL often copurify in detergent and can be detected by MS without the addition of CDL to the detergent solution. Differences in retention of endogenous lipid may also indicate differences in stability between the proteins and is worth investigation.

      We have investigated the co-purification of the ROCKET variants and did not observe any co-purified lipids (see Figure S4), which we have now clarified in the results section (page 5, third paragraph). We previously showed that long residence times in CG-MD are linked to the observation of co-purified lipids, because they are not easily outcompeted by the detergent (Bolla et al, Angew Chemie Int. Ed. 2020). In CG-MD of ROCKET, we see that although the CDL sites are nearly constantly occupied, the CDL molecules are in rapid exchange with free CDL from the bulk membrane. For MS, all ROCKET proteins were extracted from the E. coli membrane fraction with DDM, which likely outcompetes CDL. This interpretation would explain why we see significant CDL retention when the protein is released from liposomes, but not when the protein is first extracted into detergent. For GlpG, CDL residence times in CG-MD are longer, which agrees with CDL co-purification. Similarly, there is clearly an enrichment of CDL when the protein is extracted into nanodiscs (Sawczyc et al, Nature Commun 2024).

      Do the AAXWA and ROCKET have significantly similar intensities from nMS? The AAXWA appears to show slightly lower intensities than the ROCKET.

      We did not observe a significant difference; however, in most spectra, the AAXWA peaks have a lower intensity than those of the other variants (see e.g. Figure S5). While this could be due to batch-to-batch variation, there may be a small contribution from the lower number of basic residues (see Abramsson et al, JACS Au 2021). However, there is an excess of basic residues in the soluble domain of ROCKET, so this interpretation is speculative.

      Can the authors extend their comments on why densities are observed only around site 2 in the cryo-EM structures when site 1 is the apparent preferential site for ROCKET?

      We base the lipid preference of Site 1 > Site 2 on the CG MD data, where we see a higher occupancy for site 1. At the same time, as noted in the text, CDL at both sites have rather short residence times. When the protein is solubilized in detergent, these times can change, and lipids in less accessible sites (such as cavities and subunit interfaces) may be subject to a slower exchange than those that are fully exposed to the micelle (Bolla et al, Angew Chemie Int. Ed. 2020). We speculate that this effect may favor retaining a lipid at site 2. Furthermore, site 1 is flexible, with CDL attaching in various angles while site 2 has more uniform CDL orientations (see CDL density plot in Figure 1D). EM is likely biased towards the less flexible site. Notably, the density is still poorly defined, so it is possible that a more variable lipid position in site 1 would not yield a notable density at all. We have added this information to the Results section (page 5, second paragraph).

      The authors state that nMS is consistent with CDL binding preferentially to Site 1 in ROCKET and preferentially to Site 2 in the ROCKET AAXWA variant, yet it is unclear from the text exactly how these experiments demonstrate this.

      As outlined in the previous answer, we base our assessment of the sites on the CG MD simulations. There, we note that CDL binds predominantly to site 1 in ROCKET and predominantly to site 2 in AAXWA; however, the overall occupancy is lower in AAXWA than in ROCKET, meaning fewer lipids will be bound simultaneously in that variant. The nMS data show CDL retention by both variants when released from liposomes, but AAXWA has lower-intensity CDL adduct peaks (Figure 2B, C). We interpret this to mean that both have CDL sites, but that the sites in the AAXWA variant have lower occupancy. We agree that this observation does not demonstrate that the CG MD data are correct; however, it is the outcome one expects based on the simulations, so we described it as “consistent with the simulations”. We have rephrased the section to make this clear.

      As carried out for ROCKET AAXWA the total CDL binding to A61P and R66A would add to supporting information of characterisation of lipid stabilising mutations.

      We considered this possibility too. Unfortunately, the mass differences between A61P / R66A and AAXWA are slightly too high to unambiguously resolve CDL adducts of each variant, as the first CDL peak of AAXWA partially overlaps with the apo peak of A61P or R66A.

      Did the authors investigate a double mutation to Site 2 (e.g. R66A + M16A)?

      While designing mutants, we tested several double mutants involving the basic residues that bind the CDL headgroups (e.g. R66 + AAXWA) but found that they could not be purified, probably because a minimum number of positive residues at the N-terminus is required for proper membrane insertion and folding. M16 is an interesting suggestion, but wasn’t considered because the more subtle effects of non-charged amino acids on CDL binding may be lost during desolvation (see also our response to Comment (i) from reviewer 1).

      Was the stability of R66A ever compared to the WT or only to AAXWA?

      Some of the ROCKET mutants have very similar masses that cannot be resolved well enough on the ToF instrument. While the R66A-WT comparison is possible, we would not be able to compare WT to A61P or D7A/S8R. To avoid three-point comparisons, we selected AAXWA as the common point of reference for all variants.

      How many CDL sites in the database used are structurally verified?

      At the time, 1KQF was the only verified E. coli protein with a CDL resolved in a high-resolution structure. The complex was predicted accurately, see Figure 6A in Corey et al (Sci Adv 2021), as were several non-E. coli complexes.

      The work on GlpG could benefit from mutagenesis or discussion of mutagenesis to this site. The Y160F mutation has already been shown to have little impact on stability or activity (Baker and Urban Nat Chem Biol. 2012).

      We thank the referee for their excellent suggestion. While Y160F did not have a pronounced effect, the other three positions of the predicted CDL binding site in GlpG have not been covered by Baker and Urban. Looking at sequence conservation in GlpG orthologs, manually sampling down to 50% identity (~1300 sequences in UniProt) shows that Y160 and K167 are conserved, R92 varies between K/R/Q, whereas W98 is not conserved. The other (weak) site cited above (K132 and K191) is not conserved. A detailed investigation of how the conserved residues impact CDL binding and activity is already planned for a follow-up study focusing on GlpG biology.
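
      For readers who want to reproduce this kind of per-position conservation check, a minimal sketch is given below. It assumes pre-aligned ortholog sequences of equal length with "-" as the gap character; the function name, the toy alignment and the column indices are hypothetical and are not the alignment described above.

```python
from collections import Counter


def column_conservation(aligned_sequences, column):
    """Return (most common residue, its frequency) for one alignment column.

    Gap characters ("-") are ignored; returns (None, 0.0) if the column
    contains only gaps.
    """
    residues = [seq[column] for seq in aligned_sequences if seq[column] != "-"]
    if not residues:
        return None, 0.0
    residue, count = Counter(residues).most_common(1)[0]
    return residue, count / len(residues)


# Toy alignment, for illustration only (not real GlpG sequences).
toy_alignment = ["RKW", "KKY", "QK-", "RKW"]
variable_site = column_conservation(toy_alignment, 0)
conserved_site = column_conservation(toy_alignment, 1)
gapped_site = column_conservation(toy_alignment, 2)
```

      In this toy case the second column is fully conserved, whereas the first varies between R/K/Q, mirroring the qualitative distinction drawn in the response.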

      Reviewer #3 (Public review):

      Summary:

      The relationships of proteins and lipids: it's complicated. This paper illustrates how cardiolipins can stabilize membrane protein subunits - and not surprisingly, positively charged residues play an important role here. But more and stronger binding of such structural lipids does not necessarily translate to stabilization of oligomeric states, since many proteins have alternative binding sites for lipids which may be intra- rather than intermolecular. Mutations which abolish primary binding sites can cause redistribution to (weaker) secondary sites which nevertheless stabilize interactions between subunits. This may be at first sight counterintuitive but actually matches expectations from structural data and MD modelling. An analogous cardiolipin binding site between subunits is found in E. coli tetrameric GlpG, with cardiolipin (thermally) stabilizing the protein against aggregation.

      “It’s complicated” We could not have phrased the main conclusions of our study better.

      Strengths:

      The use of the artificial scaffold allows testing of hypotheses about the different roles of cardiolipin binding. It reveals effects which are at first sight counterintuitive and are explained by the existence of a weaker, secondary binding site which, unlike the primary one, allows easy lipid-mediated interaction between two subunits of the protein. Introducing different mutations either changes the balance between primary and secondary binding sites or introduces a kink in a helix - thus affecting subunit interactions which are experimentally verified by native mass spectrometry.

      Weaknesses:

      The artificial scaffold is not necessarily reflecting the conformational dynamics and local flexibility of real, functional membrane proteins. The example of GlpG, while also showing interesting cardiolipin dependency, illustrates the case of a binding site across helices further but does not add much to the main story. It should be evident that structural lipids can be stabilizing in more than one way depending on how they bind, leading to different and possibly opposite functional outcomes.

      We share the reviewer’s concern, as we clearly observe that TMHC4_R does not have the same type of flexibility as a natural protein. We find that by introducing flexibility, we start to see CDL-mediated effects. To test the validity of our findings from the artificial system, we apply them to GlpG. In response to a suggestion from Reviewer 1, we compared the findings to Aac2, and found that its stabilizing CDL site closely resembles that in GlpG (see new Figure S8).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Minor comments:

      There are a number of typos/uncorrected statements in the text.

      i) The last sentence of the Abstract appears to be an uncorrected mishmash of two.

      ii) Line 66: "protects" should be just "protect"

      iii) Line 75: Sentence appears to be incomplete. "...associated changes in protein stability." The word "stability" is missing.

      We have made these changes.

      iv) Fig. 2E. Are the magenta and blue colors inverted for variants 1 and 2?

      No, the color is correct. Greater stabilization of the blue tetramer (AAXWA) compared to WT (purple) will lead to fewer blue monomers than purple monomers in the mass spectrum.

      v) Line 274: the salt bridge should be between R8-E68.

      We have corrected this.

      vi) Lines 350-354 (final sentence of the paragraph): The sentence does not read well (especially with the double negative element). Please reconstruct the sentence and/or break it into two. 

      We have split the sentence in two.

      Suggestions:

      (i) While aromatic residues (in particular Trp) appear to be clearly involved in the CDL interaction, there is no investigation of their roles and contributions relative to the positively charged residues (R and K) investigated here. How do aromatics contribute to CDL binding and protein stability, and are they differential in nature (W vs Y vs F)?

      See our response to comment (i) from reviewer 1. In short, subtle contributions to lipid interactions (such as pi stacking with Trp or Tyr) will likely be lost during transfer to the gas phase. As noted in our response to the last comment from reviewer 2, we plan to use solution-phase activity assays to investigate the effect of Trp on CDL binding to GlpG; however, this is beyond the scope of the current study.

      (ii) In the case of GlpG, a WR pair (W136-R137) present at the lipid-water interface on the periplasmic face (adjacent to helices 2/3) may function akin to the W12-R13 of ROCKET in specifically binding CDL. Investigation of this site might prove to be interesting if it indeed does.

      We added the CDL density plot for the periplasmic side to Figure S7 and discuss further sites in GlpG in the Discussion section. See response to point (ii) above for details.

      Reviewer #2 (Recommendations for the authors):

      Minor comments

      - Typo in abstract line 39-40

      - Typo in figure legend of Fig 1 line 145

      - Typo in line 149, missing R66 in residues shown as sticks description

      - Lines 165-167 could benefit from describing what residues are represented as sticks

      We have made these changes.

      - Line 263 should refer to the figure where the tetrameric state was not affected by this mutation.

      The full spectrum of the A61P mutant is not included in the figure, hence there is no reference.

      - Addition of statistics to Fig. 4F ?

      We have added significance indicators to the graph and information about the statistics to the legend.

      Reviewer #3 (Recommendations for the authors):

      Minor issues

      l39: rewrite

      We have made these changes.

      l60: provide evidence for what is presented as a general statement - cardiolipins might also regulate function without affecting oligomeric state, e.g. MgtA

      This is a good point. We have added references to two examples where CDL acts without affecting oligomerization (MgtA, Weikum et al BBA 2024, and Aac2, Senoo et al, EMBO J 2024).

      l74: not every functional interaction comes with a thermal shift

      We use thermal shift as a proxy because it indicates tight interactions, even if they may not be functional. We have made this distinction clearer in the text.

      l78: this is true for electrostatic interactions such as are at play here, but not necessarily for hydrophobic ones

      l133: in what direction is the pulling force applied - the figure seems to suggest diagonally?

      The pull coordinate is defined as the distance between the centers of mass of the two helices. The direction of the pull coordinate in Cartesian coordinate space is thus not fixed.
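
      To make this definition concrete, the following minimal sketch computes a center-of-mass distance between two atom groups. The function names and the example coordinates are hypothetical; this illustrates the definition only and is not the authors' simulation setup.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms (3D coordinates)."""
    total_mass = sum(masses)
    return tuple(
        sum(m * xyz[axis] for xyz, m in zip(coords, masses)) / total_mass
        for axis in range(3)
    )


def pull_coordinate(helix_a, helix_b):
    """Distance between the centers of mass of two helices.

    Each helix is given as (coords, masses). The value is a scalar, so it
    does not depend on how the molecule is oriented in Cartesian space.
    """
    return math.dist(center_of_mass(*helix_a), center_of_mass(*helix_b))


# Hypothetical coordinates (arbitrary units), for illustration only.
helix_1 = ([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
helix_2 = ([(1.0, 3.0, 4.0)], [12.0])
distance = pull_coordinate(helix_1, helix_2)
```

      Because only the scalar distance is biased, the vector connecting the two centers of mass is free to reorient during the simulation, which is why the pull direction can appear diagonal in a schematic.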

      fig 1f, l159: "dissociating" meaning separation of subunits? the placement of the lipid within one subunit would not suggest that intermolecular interactions are properly represented here, please clarify

      The lipid placement in the schematic is not representative since the lipid occupies different spaces in WT and AAXWA; we have noted this in the legend. Regarding line 159, “dissociation” is not strictly correct, since we measure the force required to separate helices 1 and 2, i.e. unfolding. We have changed the wording to “unfolding”.

      l173: was there any evidence in EM data for monomers or smaller oligomers?

      No smaller particles were identified by visual inspection or in the particle classes. We have noted this in the methods section.

      l203: were tetramer peaks isolated separately for CID?

      C8E4 can cause some activation-dependent charge reduction, which could allow some tetramers to “sneak out” of the isolation window. We used global activation without precursor selection which subjects all ions to activation.

      fig 2c: can you indicate the 3rd lipid binding as it seems to be in the noise

      We can unambiguously assign the retention of three CDL molecules for the 17+ charge state only, and have clarified this in the legend to Figure 2.

      fig3: can you pls clarify what is meant by stabilization here - less monomer in case A means a more stable oligomer, but "A > B" should lead to ratios < 50%. This does not help with understanding what "stabilization" means in panels c-f, please define what the y axis means for these. Please also explain the bottom panels (side view) in each case, what do the dots represent?

      We apologize for the oversight of not explaining the side views; we have added a legend. The schematic in panel A is correct (compare the schematic in Figure 2 E). If tetramer A (blue) is stabilized by CDL more than tetramer B (“CDL stabilization A>B”), there will be fewer monomers ejected from A. If there is less A in the presence of CDL, then the ratio of B/(B+A) will go up.
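
      The arithmetic behind this ratio can be sketched as follows. The function name and the intensity values are hypothetical and chosen only to illustrate the direction of the change; they are not measured data.

```python
def monomer_ratio(intensity_a, intensity_b):
    """Fraction of released monomer signal that stems from tetramer B: B / (B + A)."""
    if intensity_a < 0 or intensity_b < 0:
        raise ValueError("intensities must be non-negative")
    total = intensity_a + intensity_b
    if total == 0:
        raise ValueError("at least one intensity must be positive")
    return intensity_b / total


# Hypothetical monomer intensities (arbitrary units), for illustration only.
ratio_without_cdl = monomer_ratio(intensity_a=100.0, intensity_b=100.0)
# If CDL stabilizes tetramer A more than B, fewer A monomers are ejected.
ratio_with_cdl = monomer_ratio(intensity_a=50.0, intensity_b=100.0)
```

      With equal monomer release the ratio is 50%; when CDL preferentially stabilizes tetramer A, the A term shrinks and the ratio rises above 50%.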

      It is not very clear what consequences the kink introduced by proline has for intra- vs. intermolecular interactions - the cartoons don't help much here

      We agree, the A61P impact on the structure is subtle. The small kink it introduces is not really visible in the top view, and hence, we tried to emphasize this in the side view. We have clarified the meaning of the side view schematics in the legend.

      l360: is that an assumption made here or is there evidence for displacement? native MS could potentially prove this.

      This is an assumption based on the fact that we see very little binding of POPG in the mixed bilayer CG-MD. We have clarified this in the text. Measuring this with MS is an interesting idea, but we have no direct measurement of displacement, since addition of CDL and POPG to the protein in detergent would result in binding to other sites as well.

      fig 4d: there is not much POPG density visible at all - why is that?

      Both plots use the same absolute scale. There is simply much less POPG binding compared to CDL.

      fig 4e: is this released protein already dissociated into monomers due to denaturation or excessive energy (CID product) - please comment.

      The CID energy for the spectrum in Figure 4E was selected to show partial dissociation and monomer release at higher voltages (220V in this case). At lower voltages (150V-170V) we do not observe dissociation in C8E4, see Figure S4A.

      l363: pls comment on the apparent discrepancy between single lipid binding and double density

      We added a clarifying sentence regarding the double lipids. The density seen in the published structure is of four lipid tails next to each other, which is what one would expect for a CDL. Since the CDL could not be resolved unambiguously, two phospholipids with two acyl chains each were modeled into the density instead. Our MS and MD data strongly suggest that the density stems from a single CDL.