Author Response:
Summary: This is an interesting topic, and these findings are potentially of theoretical significance for the field of sleep and memory consolidation, as well as potentially of practical importance. However, reviewers raised potential issues with the methods and interpretation. Specifically, reviewers were not confident that the paper reveals major new mechanistic insights that will make a major impact on a broad range of fields.
Reviewer #1:
This work claims to show that learning of word associations during sleep can impair learning of similar material during wakefulness. The effect of sleep on learning depended on whether slow-wave sleep peaks were present during the presentation of that material during sleep. This is an interesting finding, but I have a lot of questions about the methods that temper my enthusiasm.
We thank the reviewer for the careful reading of our manuscript and for the helpful comments. Most of the issues that were raised concern the clarity of writing. We will remove sleep-specific jargon where possible and will add relevant theoretical background and methodological details in the revised version of our manuscript.
1) The proposed mechanism doesn't make sense to me: "synaptic down-scaling of hippocampal and neocortical language-related neurons, which were then too saturated for further potentiation required for the wake-relearning of the same vocabulary". Also lines 105-122. What is 'synaptic down-scaling'? What are 'language related neurons'? How were they 'saturated'? What is 'deficient synaptic renormalization'? How can the authors be sure that there are 'neurons that generated the sleep- and ensuing wake-learning of ... semantic associations'? How can inferences about neuronal mechanisms (i.e. mechanisms within neurons) be drawn from what is a behavioural study?
We will improve the writing of our manuscript and will add formal definitions of the key concepts (synaptic down-scaling / renormalization, synaptic saturation, …) to clarify the proposed mechanism. We are also open to discussing alternative explanations.
We admit that there is no way of truly knowing whether there were specific neurons or neuronal networks that encoded the semantic associations for word pairs that were played during sleep or during ensuing wakefulness. However, the behavioural data of the implicit memory test and the recall test suggest that participants formed memories for the word pairs played during sleep and during learning in the waking state. These memories must be represented in the brain – most likely in the hippocampus and in cortical regions involved in the processing of language. Indeed, our previous report suggested that successful retrieval of sleep-played semantic associations recruited hippocampus and language sites (Züst et al., 2019, Curr. Biol.).
2) On line 54 the authors say "Here, we present additional data from a subset of participants of our previous study in whom we investigated how sleep-formed memories interact with wake-learning." It isn't clear what criteria were used to choose this 'subset of participants'. Were they chosen randomly? Why were only a subset chosen, anyway?
The dataset we reported in Current Biology (Züst et al., 2019) consisted of two samples. Participants of the first sample stayed in the sleep laboratory after waking and performed both the implicit memory test and the wake-learning task there. Participants of the second sample were escorted to the MR centre after waking to perform the implicit memory test in the MR scanner. These participants did not perform the wake-learning task, so we could not include them in the analysis of wake-learning. We do, however, include all data from the first sample. We will clarify this in the revised version of our manuscript.
3) The authors do not appear to have checked whether their nappers had explicit memory of the word pairs that had been presented. Why was this not checked, and couldn't explicit memory explain the implicit memory traces described in lines 66-70 (guessing would be above chance if the participants actually remembered the associations).
Previous work from our own group (Ruch et al., 2014) as well as from other groups (Andrillon & Kouider, 2016; Cox et al., 2014; Arzi et al., 2012) suggests that sleep-played sounds and words are not consciously remembered after waking. This is why we administered an implicit memory test following waking. Only at the end of the experiment, i.e. after participants had completed the wake-learning task, did we ask whether they had noticed or heard anything unusual or unexpected during sleep. We then asked whether they had heard words during sleep. All participants denied having heard anything during sleep. This suggests that participants had no explicit memory for the sleep-played vocabulary. We will mention this in the revised version of our manuscript.
Importantly, if participants had had explicit memory for the sleep-played vocabulary, the finding that these memories suppress conscious re-learning of the same or similar contents during subsequent wakefulness would contradict previous findings demonstrating that repeated learning improves retention.
Reviewer #2:
This paper reports on a very interesting and potentially highly important finding - that so-called "sleep learning" does not improve relearning of the same material during wake, but instead paradoxically hinders it. The effect of stimulus presentation during sleep on re-learning was modulated by sleep physiology, namely the number of slow wave peaks that coincide with presentation of the second word in a word pair over repeated presentations. These findings are of theoretical significance for the field of sleep and memory consolidation, as well as of practical importance.
We appreciate the reviewer’s enthusiasm for our work and are grateful for the detailed and helpful comments.
Concerns and recommendations:
1) The authors' results suggest that "sleep learning" leads to an impairment in subsequent wake learning. The authors suggest that this result is due to stimulus-driven interference in synaptic downscaling in hippocampal and language-related networks engaged in the learning of semantic associations, which then leads to saturation of the involved neurons and impairment of subsequent learning. Although at first the findings seem counter-intuitive, I find this explanation to be extremely interesting. Given this explanation, it would be interesting to look at the relationship between implicit learning (as measured on the size judgment task) and subsequent explicit wake-relearning. If this proposed mechanism is correct, then at the trial level one would expect that trials with better evidence of implicit learning (i.e. those that were judged "correctly" on the size judgment task) should show poorer explicit relearning and recall. This analysis would make an interesting addition to the paper, and could possibly strengthen the authors' interpretation.
The main findings did not change when we controlled for implicit memory performance. Most importantly, the reported interaction between re-learning Condition (congruent vs. incongruent translations) and the number of slow-wave peaks during sleep (0-1 vs. 2-4) remained significant when we included implicit memory retrieval as a predictor. Furthermore, this interaction was not mediated by implicit retrieval performance (no significant 3-way interaction).
We decided against reporting these analyses in the manuscript because including performance in the implicit memory test as an additional predictor reduced the trial count to critically low levels in some conditions, making significance testing unreliable.
2) In some cases, a null result is reported and a claim is based on the null result (for example, the finding that wake-learning of new semantic associations in the incongruent condition was not diminished). Where relevant, it would be a good idea to report Bayes factors to quantify evidence for the null.
We will include Bayes factors for our post-hoc analyses in the revised version of our manuscript.
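As an illustration of how evidence for a null result can be quantified, the following minimal sketch uses the common BIC approximation to the Bayes factor, BF01 ≈ exp((BIC_alt − BIC_null) / 2). This is an assumption made for illustration only: the response does not specify which method will be used to compute Bayes factors, and the function name and BIC values below are hypothetical, not taken from the study.

```python
import math


def bf01_from_bic(bic_null: float, bic_alt: float) -> float:
    """Approximate Bayes factor in favour of the null model (BF01).

    Uses the BIC approximation BF01 ~= exp((BIC_alt - BIC_null) / 2).
    Values above 1 favour the null model, values below 1 the alternative.
    """
    return math.exp((bic_alt - bic_null) / 2.0)


# Hypothetical BIC values for illustration only (not data from the study).
bf01 = bf01_from_bic(bic_null=100.0, bic_alt=104.0)
print(f"BF01 = {bf01:.2f}")
```

A BF01 well above 1 would indicate that the data are more likely under the null model than under the alternative, which is the kind of statement a non-significant p-value alone cannot support.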
3) The authors report that they "further identified and excluded from all data analyses the two most consistently small-rated and the two most consistently large-rated foreign words in each word lists based on participants' ratings of these words in the baseline condition in the implicit memory test." Although I realize that the same approach was applied in their original 2019 paper, this decision point seems a bit arbitrary, particularly in the context of the current study where the focus is on explicit relearning and recall, rather than implicit size judgments. As a reader, I wonder whether the results hold when all words are included in the analysis.
We wanted the analyses to be consistent with the original report in Current Biology (Züst et al., 2019). Nevertheless, for this revision, we will include all learning material in the analyses. Note that the overall pattern of results changes only minimally, and the message remains the same, whether stereotypical/biased words are included or excluded.
4) In the main analysis examining interactions between test run, condition (congruent/incongruent) and number of peak-associated stimulations during sleep (0-1 versus 3-4), baseline trials (i.e. new words that were not presented during sleep) are excluded. As such, the interactions shown in the main results figure (Figure D) are a bit misleading and confusing, as they appear to reflect comparisons relative to the baseline trials (rather than a direct comparison between congruent and incongruent trials, as was done in the analysis). It also looks like the data in the "new" condition is just replicated four times over the four panes of the figure. I recommend reconstructing the figure so that a direct visual comparison can be made between the number of peaks within the congruent and incongruent trials. This change would allow the figure to more accurately reflect the statistical analyses and results that are reported in the manuscript.
We will update the figure with a panel that presents the results for all conditions on the same axes. This will facilitate direct comparisons between all conditions.
5) In addition to the main analysis, the authors report that they also separately compared the conscious recall of congruent and incongruent pairs that were never or once vs. repeatedly associated with slow-wave peaks with the conscious recall in the baseline condition. Given that four separate analyses were carried out, some correction for multiple comparisons should be done. It is unclear whether this was done as it does not seem to be reported.
We will clarify where and how we corrected for multiple comparisons in the revised version of the manuscript.
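For readers unfamiliar with such corrections, the sketch below shows one standard option, the Holm-Bonferroni step-down procedure, applied to four p-values as would arise from four separate comparisons against baseline. This is an assumption for illustration: the response does not state which correction method was or will be used, and the function name and p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for four comparisons (not data from the study).
p_raw = [0.010, 0.040, 0.030, 0.200]
p_holm = holm_adjust(p_raw)
print(p_holm)
```

Each adjusted p-value can then be compared directly against the nominal alpha level (e.g. 0.05) to decide which comparisons survive correction.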