Reviewer #2 (Public review):
In this manuscript, Menegas et al. classify the "control" behavior of captive marmosets. They combine behavioral screening from video recordings with audio and neural recordings (from the striatum) to better define what can be considered a typical behavioral repertoire for captive marmoset monkeys. A range of analyses is presented, investigating various aspects of behavior, such as social interactions and the detection of atypical individuals.
The manuscript is compelling in many respects, especially due to the richness of the dataset and the breadth of analyses presented. However, a significant issue with the manuscript lies in its writing: the results are conveyed in an overly succinct and superficial manner, and the "Methods" section is nearly absent. Key concepts are often undefined, and the mathematical details underlying the figures are not explained, leaving readers to guess the authors' approach.
Another issue is the vague use of the term "natural behavior." All data presented here appear to have been collected in small cages with limited climbing opportunities and enrichment. Thus, the authors should refrain from using "natural" to describe these conditions.
Below, we elaborate further on the lack of methodological detail. Based on these issues, we believe the manuscript, in its current form, does not meet the scientific standards necessary for proper review. We strongly encourage the authors to undertake an extensive revision.
Major Revision Points:
The methods and results require significantly more detail. A scientific publication should provide readers with enough information to reproduce the study. Here, the level of detail is too low to understand or reproduce the study, and in many instances readers are left to guess how the figure panels were produced. Below is a non-exhaustive list of examples illustrating these issues:
(1) "we temporarily placed horizontal cage dividers to reduce the total cage size during data collection": What were the resulting (and initial) cage dimensions?
(2) "After training the network, we hierarchically clustered the latent space": What is the latent space? Based on Figure 2a, it appears related to the network's recurrent layer, but this is not clarified in the text.
(3) Alpha and perplexity parameters: Please define these terms. Since these concepts appear fundamental, readers should not have to consult external references.
(4) "We then traced cluster identities across hierarchical levels": What are hierarchical levels?
(5) "To understand how the input time series data was weighed in the bottleneck layer of the model": What is the bottleneck layer?
(6) "we measured the average attention allocation to previous time points": The authors should define "attention allocation."
(7) "we compared each neuron's firing rate distribution to shuffled data based on the overall frequency of each behavior during the session": This description is insufficient to understand the analysis.
(8) "we hierarchically clustered neurons according to their firing rate enrichment maps": No mathematical explanation is provided for neuron clustering, nor is the concept of a "firing rate enrichment map" clarified.
(9) "Cluster 4 showed higher activity when neurons were 'alone' or 'active'": This is vague and uses unclear jargon (e.g., "neurons alone"). Additionally, no mathematical explanation is provided for assigning neuronal activity to behavioral states.
(10) Figure 3f, right-side panels: The analysis seems to involve cage mate positioning, yet no description is provided.
(11) "we used motion watches to measure activity across all hours": Are these motion-sensitive watches physically attached to the animals? The methodology should be described, including data analysis details.
This list could continue, but we trust the authors understand the point. There is a wealth of analyses and information in this study, but the descriptions are too superficial. We understand that fully describing each analysis may require significant rewriting, including supplementary figures, and will likely make the manuscript longer. This is entirely acceptable, as the ideas presented here are worth the added rigor.
"Natural behavior": Typically, the term "natural" suggests that the dataset reflects the range of behaviors exhibited by animals in the wild. Here, however, recordings were made in a small cage with limited climbing opportunities and enrichment. Under these conditions, it is hard to justify describing the behavior as "natural." Because the project aims to classify the behavioral repertoire of marmoset monkeys and to make this dataset accessible to other laboratories, the manuscript should include more detailed information about the animals' housing conditions. This might include cage sizes, temperature, humidity, and details on food quantities, quality, and feeding times.
Correlation versus causation: In the section titled "Large-scale data collection reveals variability across days and correlation between cagemates," the authors conclude: "Overall, these results indicate that measurements of animals' behavioral traits depend heavily on their social environment." This conclusion does not follow from the correlation alone. Animal behavior varies throughout the day, with activity peaks typically occurring in the morning and afternoon. Such shared rhythms, or other external influences common to both animals, could induce correlations between cagemates that are not caused by social interactions.
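To illustrate the kind of control that could help separate these explanations, the following is a minimal sketch, not the authors' analysis. It assumes each animal's activity is available as an array of shape (days, time-of-day bins); the function names, the array layout, and the simulated data are all hypothetical. The idea is to compare the correlation between cagemates on the same day with the correlation between one animal's day and its cagemate's activity on a different day. A shared daily rhythm alone should give similar values for both, whereas coupling between the animals (or any other same-day shared influence) should raise only the same-day value.

```python
import numpy as np


def matched_vs_mismatched_correlation(a, b):
    """Compare same-day and different-day correlations between two cagemates.

    a, b: arrays of shape (n_days, n_bins) holding each animal's activity in
    the same time-of-day bins on each day (hypothetical layout).
    Returns (mean same-day correlation, mean different-day correlation).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_days, n_bins = a.shape
    # z-score each day so that the matrix product below gives Pearson r
    az = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    bz = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    r = az @ bz.T / n_bins  # r[i, j]: day i of animal A vs day j of animal B
    same_day = np.diag(r).mean()
    different_day = r[~np.eye(n_days, dtype=bool)].mean()
    return same_day, different_day


def simulate_pair(shared_daily_fluctuation, n_days=30, n_bins=96, seed=0):
    """Simulated cagemates (illustration only, not real data)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, n_bins)
    rhythm = np.sin(2.0 * t)  # two activity peaks per day, identical every day
    # Day-specific fluctuation shared by both animals (zero if no coupling)
    shared = shared_daily_fluctuation * rng.normal(size=(n_days, n_bins))
    a = rhythm + shared + rng.normal(size=(n_days, n_bins))
    b = rhythm + shared + rng.normal(size=(n_days, n_bins))
    return a, b


if __name__ == "__main__":
    for label, strength in [("rhythm only", 0.0), ("rhythm + coupling", 1.0)]:
        same, diff = matched_vs_mismatched_correlation(*simulate_pair(strength))
        print(f"{label}: same-day r = {same:.2f}, different-day r = {diff:.2f}")
```

In the rhythm-only simulation the two values should be close, showing that a nonzero correlation between cagemates can arise without any interaction. A same-day excess would still not prove a social cause, since room-level events on a given day could also produce it.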
Figure 4g: What are we intended to conclude from this analysis?
Figure 5: Please specify the type of calls analyzed. For example, did you analyze only long-distance calls (also known as "loud phees" or "shrills")? In "We split the audio data into 5-minute (non-continuous) segments and found that the average call rate in these segments varied from 0 calls per minute to 60 calls per minute (Fig. 5d-e)," does the call rate refer to individual animals or to the entire cage?
"This implies that a high rate of calls in a room can interrupt animals during social resting states and cause them to preferentially exhibit more active/attentive states." Does it? This could simply indicate that more active animals produce more calls.
"We recorded neural activity in the striatum because it is known to contain diverse signals related to movement and social interactions." While we understand that the authors intend to publish the neural data separately, a brief discussion of the striatum's role here would be helpful.