Reviewer #2 (Public review):
Strengths:
The authors have done a nice job providing additional data in response to reviewer feedback. I appreciate that accuracy plots are now included, as well as a separate analysis in which differences in parameter estimates are tested for participants whose accuracy was above chance. I also appreciate the new figure with the sphere ROIs for each participant, as it helps us appreciate anatomical variability in the peak response separately for each task.
I have four concerns related to the weaknesses of the study:
(1) Although the results still hold when removing participants whose accuracy was 50% or less, a major limitation of this study is that participants made a button press response only to the last trial in a block. This is problematic because a participant could get all trials in a block correct except for the last one, or a participant could get all trials in a block wrong, and performance would be considered equivalent. As a consequence, it is not possible to know whether participants who are at chance are performing differently from participants who are not at chance, and it is not possible to control for variance in reaction time (a concern also raised by Reviewer 3).
(2) My second concern relates to the way in which the data are interpreted based on thresholding. There is above-threshold activation in the left SMG for all tasks except the fluid cognition task. The z-scores associated with significant voxels in Figure 3 are very strong (minimum z is 6). If one were to relax the threshold of the group-level maps to, e.g., p < .001 uncorrected, FDR q < .05, or FWER of .10, there would be overlapping voxels outside the SMG. The discussion of the left SMG in the manuscript is prominent and narrowly construed; the left SMG is discussed as if it were 'the' region: "This confirms that the technical-reasoning network depends upon the recruitment of the left area PF, even if additional cognitive processes involving other peripheral brain areas can be engaged depending on the task" (p. 9). My intuition is that there will be numerous other areas of overlap when using a threshold that is still highly significant (e.g., z = 3 or 4). So, for proponents of the technical reasoning hypothesis, is there a counterfactual or alternative brain area/network/system not in the left SMG?
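To make the comparison between thresholds concrete, the following is a minimal sketch that converts between z-values and one-tailed, uncorrected p-values, assuming a standard normal null distribution. The function names are illustrative and do not come from the manuscript or from any neuroimaging package.

```python
import math
from statistics import NormalDist


def one_tailed_p(z):
    """Upper-tail probability of a standard normal at z (uncorrected)."""
    # erfc keeps precision for large z, where 1 - cdf(z) would round badly.
    return 0.5 * math.erfc(z / math.sqrt(2))


def z_for_one_tailed_p(p):
    """z-value whose upper-tail probability equals p."""
    return NormalDist().inv_cdf(1.0 - p)


# Compare the reported minimum (z = 6) with the more lenient values
# mentioned in this review (z = 3, z = 4, and p < .001 uncorrected).
for z in (3.0, 4.0, 6.0):
    print(f"z = {z:.1f} -> one-tailed p = {one_tailed_p(z):.2e}")
print(f"p = .001 -> z = {z_for_one_tailed_p(0.001):.2f}")
```

Under this assumption, a voxelwise threshold of p < .001 uncorrected corresponds to z of roughly 3.09, several orders of magnitude more lenient than a minimum z of 6. The sketch does not model FDR or FWER correction, which depend on the full statistical map.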
(3) I like the new Figure 6 because it shows variability in the location of the peak coordinate at the level of single participants. And, indeed, there's considerable variability that is typical when localizing ROIs in single participants. My concern is the level at which hypothesis testing is performed. An independent SMG ROI is used to extract parameter estimates and correlate responses between tasks to show a pattern of correlation that comports with a technical reasoning model of left SMG function. This is a fine approach, but it does not rule out the so-called 'same region different function' interpretation because it relies on correlation: one cannot reverse infer that the left SMG is carrying out the same function across different tasks because the response in that area is more strongly correlated between certain tasks. This finding points to that possibility and makes interesting predictions for future studies to pursue, but it cannot tell us whether common functions in the left SMG are involved in each task. E.g., one interesting prediction for future studies is to test whether patients with lesions to this site are disproportionately less accurate in the experimental condition of the mechanical problem solving task, the psychotechnical task, and the mentalizing task, but not the fluid cognition task.
(4) I appreciated the approach to testing the adjacency interpretation by showing the sphere and peak Y coordinate across the tasks. It is interesting that across the groups, there is no difference in the peak Y coordinate of the psychotechnical task and both conditions of the mentalizing task, whereas the peak Y coordinate in the fluid cognition task is more anterior in the postcentral gyrus across participants (why is that?). But why restrict the analysis to just the Y coordinate? A rigorous way to test the adjacency hypothesis is to compute the Euclidean distance among X, Y, and Z coordinates between any two tasks collected in the same participant. One can then test whether the Euclidean distance between, e.g., the psychotechnical task and one condition of the mentalizing task is smaller than the Euclidean distance between the psychotechnical task and the fluid cognition task. Similarly, one can test whether the Euclidean distance between the INT and PHY conditions of the mentalizing task is smaller than the Euclidean distance between the INT condition and the psychotechnical task or the PHY condition and the psychotechnical task. There is no justification for restricting this analysis to the anterior-posterior dimension only.
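The analysis suggested in point (4) could be sketched as follows. This is only an illustration under stated assumptions: each participant's peaks are stored as (x, y, z) coordinates in mm keyed by task label; the labels "PSY", "INT", "PHY", and "FLU" and all coordinates are hypothetical placeholders, not data from the manuscript; and an exact sign-flip permutation test stands in for whatever paired test the authors prefer.

```python
import math
from itertools import product
from statistics import mean


def peak_distance(a, b):
    """Euclidean distance between two (x, y, z) peak coordinates."""
    return math.dist(a, b)


def paired_distance_differences(peaks, ref, near, far):
    """Per participant: d(ref, near) - d(ref, far).

    Negative values mean the `near` peak lies closer to `ref` than
    the `far` peak does, in all three dimensions combined.
    """
    return [
        peak_distance(p[ref], p[near]) - peak_distance(p[ref], p[far])
        for p in peaks
    ]


def sign_flip_p_value(diffs):
    """Exact one-sided sign-flip p-value for the hypothesis mean(diffs) < 0.

    Enumerates all 2**n sign assignments, so it is only practical for
    small samples; a random subset would be needed for larger n.
    """
    observed = mean(diffs)
    flipped_means = [
        mean([s * d for s, d in zip(signs, diffs)])
        for signs in product((1, -1), repeat=len(diffs))
    ]
    # Small tolerance guards against floating-point ties.
    hits = sum(1 for m in flipped_means if m <= observed + 1e-12)
    return hits / len(flipped_means)


# Hypothetical peak coordinates for three participants (illustration only).
example_peaks = [
    {"PSY": (-58, -30, 36), "INT": (-56, -32, 38), "PHY": (-57, -31, 35), "FLU": (-50, -22, 44)},
    {"PSY": (-60, -28, 34), "INT": (-59, -30, 37), "PHY": (-61, -27, 36), "FLU": (-52, -20, 46)},
    {"PSY": (-57, -33, 38), "INT": (-55, -34, 36), "PHY": (-58, -32, 39), "FLU": (-49, -24, 42)},
]

# Is PSY closer to INT than to FLU within participants?
diffs = paired_distance_differences(example_peaks, ref="PSY", near="INT", far="FLU")
print(diffs, sign_flip_p_value(diffs))
```

The same two functions cover the second comparison by calling `paired_distance_differences(peaks, ref="INT", near="PHY", far="PSY")`. Working with within-participant differences keeps the test paired, which matters given the between-participant variability in peak location noted in point (3).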