Reviewer #3 (Public Review):
This study investigated how reward-associated signals are represented in layer 2/3 neurons of the primary motor cortex. Water-restricted mice were trained to respond to a conditioned auditory stimulus in order to receive a water reward. Behavioral analysis showed that mice quickly learned the association between the sound stimulus and the reward, as indicated by an increased anticipatory lick rate. Using this behavioral paradigm, neuronal activity was monitored throughout the training (7 days). Two-photon calcium imaging was performed separately on four different types of neurons: pyramidal neurons and PV-, VIP-, and SOM-positive interneurons. Tuning of individual neurons to the tone and reward stimuli was analyzed using the Spearman correlation between the trial-averaged fluorescence and the timing of stimulus delivery. Results showed that PV-positive interneurons came to respond more reliably to the cue stimulus, whereas VIP-positive interneurons came to respond more reliably to the reward stimulus. Some SOM-INs that were not responsive to the tone before training became responsive by day 7 of training, and SOM-IN responses to the reward became more reliable after learning. The main findings are quite novel and may provide new insight into the specific roles of interneurons. More representative imaging data and control experiments would make the story more complete and convincing.
1) Imaging calcium responses from individual types of interneurons is important and challenging. Tracking activity changes in the same population of neurons is especially important because it shows how learning shapes the pattern of changes in each neuron. Despite this powerful approach, activity changes in individual neurons were not shown. Calcium transients measured on day 1 were re-sorted on day 7, so it is not clear whether the neurons that responded to the cue or reward stimulus on day 1 still respond to the same stimulus on day 7 and, if so, how their onset timing changed. Whether the cue- or reward-sensitive population remains the same across days could lead to a different conclusion, so it would be important to plot calcium signals across days without re-sorting.
In addition, representative calcium images from interneurons were not shown (as was done in Fig. 1A). It appears that about 80-90 PV-INs, VIP-INs, and SOM-INs were observed (Fig. 2D). Showing representative images for each cell type would help readers better understand the results.
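To illustrate the re-sorting concern raised in point 1, the following is a minimal sketch using toy data, not the authors' analysis. It assumes that row i of each array is the same tracked cell on both days and that cells are ordered by the time of their peak trial-averaged response; the function name `peak_order` and the arrays are hypothetical.

```python
import numpy as np

def peak_order(traces):
    """Return cell indices sorted by the time of each cell's peak response.

    traces: array of shape (n_cells, n_timepoints), trial-averaged.
    """
    return np.argsort(np.argmax(traces, axis=1), kind="stable")

# Toy trial-averaged responses (assumption: row i is the same cell on both days).
day1 = np.array([[0, 0, 1, 0],
                 [1, 0, 0, 0],
                 [0, 1, 0, 0]], dtype=float)
day7 = np.array([[1, 0, 0, 0],
                 [0, 0, 0, 1],
                 [0, 1, 0, 0]], dtype=float)

order_day1 = peak_order(day1)

# Apply the day-1 order to both days: each row still refers to the same cell.
day1_sorted = day1[order_day1]
day7_same_order = day7[order_day1]

# Re-sorting day 7 independently yields a tidy sequence but loses cell identity.
day7_resorted = day7[peak_order(day7)]

print("day-7 peak times, day-1 order:", np.argmax(day7_same_order, axis=1))
print("day-7 peak times, re-sorted:  ", np.argmax(day7_resorted, axis=1))
```

With the day-1 order held fixed, a shift in an individual cell's response timing shows up as a departure from the day-1 sequence, whereas independent re-sorting produces an ordered sequence on both days regardless of whether the same cells carry it.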
2) Identifying cells whose activity exceeds chance level was a good way to define the subset of neurons responsive during the cue or reward period. Quantifying the tuning of each cell's average response during the tone and reward response periods using the non-parametric Spearman correlation was also a powerful way to distinguish neurons with high and low trial-by-trial reliability. The results suggest that less reliable neurons become more reliable in the case of PV-INs and VIP-INs (Fig. 4 and 5). However, it is not clear whether these changes are specifically associated with learning. Control experiments without water restriction, or with random reward presentation independent of the cue, would provide a good comparison. These experiments would help to rule out naturally occurring, learning-independent changes from day 1 to day 7.
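As a concrete reference for what a Spearman-based reliability measure might look like, here is a minimal sketch. It is one possible reading and not necessarily the authors' procedure: it assumes reliability is defined as the mean Spearman correlation between each single-trial trace and the average of the remaining trials. The names `ranks`, `spearman`, and `reliability` and the simulated data are hypothetical. The same measure could be applied to a cue-independent reward control group to test whether reliability rises without learning.

```python
import numpy as np

def ranks(x):
    """Ordinal ranks of x (ties are not averaged; fine for continuous data)."""
    order = np.argsort(x, kind="stable")
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(len(x))
    return r

def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])

def reliability(trials):
    """Mean Spearman correlation of each trial with the mean of the other trials.

    trials: array of shape (n_trials, n_timepoints) for a single cell.
    """
    n = trials.shape[0]
    total = trials.sum(axis=0)
    rhos = []
    for i in range(n):
        leave_one_out_mean = (total - trials[i]) / (n - 1)
        rhos.append(spearman(trials[i], leave_one_out_mean))
    return float(np.mean(rhos))

# Simulated cells: one with a consistent transient, one with pure noise.
rng = np.random.default_rng(0)
template = np.exp(-0.5 * ((np.arange(30) - 10) / 3.0) ** 2)
reliable = template + 0.05 * rng.standard_normal((20, 30))
unreliable = rng.standard_normal((20, 30))

r_rel = reliability(reliable)
r_unrel = reliability(unreliable)
print(f"reliable cell: {r_rel:.2f}, unreliable cell: {r_unrel:.2f}")
```

Leaving each trial out of its own reference average avoids inflating the correlation, which matters when the number of trials is small.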