Reviewer #3 (Public review):
Summary:
This manuscript investigates the conditions under which representational distances estimated from brain-activity measurements accurately mirror the true geometry of the underlying neural representations. Using a theoretical framework and simulations, the authors show that (i) random weighted sampling of individual neurons preserves representational distances; (ii) the non-negative pooling characteristic of fMRI stretches the geometry along the population-mean dimension; and (iii) subtracting the across-channel mean from each activity pattern removes this distortion, explaining the well-known success of correlation-based RSA. They further argue that a mean-centred, squared Euclidean (or Mahalanobis) distance retains this corrective benefit while avoiding some pitfalls of variance normalisation.
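The core claims summarised above can be illustrated with a small simulation (a hypothetical toy setup sketched by this reviewer, not the authors' actual code): a low-dimensional latent geometry is embedded into a neural population with a condition-specific population-mean component contributing roughly a quarter of the variance, loosely mirroring the setup described in the manuscript. Non-negative i.i.d. voxel weights then inflate distances along the mean dimension, while mean-centring each measured pattern across voxels largely restores the true distance geometry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 12 conditions from a 5-D latent geometry,
# embedded into 200 neurons, plus condition-specific population-mean
# variation (roughly a quarter of the per-neuron variance).
n_cond, n_lat, n_neur, n_vox = 12, 5, 200, 1000
Z = rng.standard_normal((n_cond, n_lat)) * np.array([3.0, 2.0, 1.5, 1.0, 0.5])
X = Z @ rng.standard_normal((n_lat, n_neur)) / np.sqrt(n_lat)
X += rng.standard_normal((n_cond, 1))  # population-mean component

# Non-negative i.i.d. weights model fMRI-like voxel pooling.
W = rng.random((n_neur, n_vox))
Y = X @ W

def sq_dists(A):
    """Pairwise squared Euclidean distances between rows (upper triangle)."""
    D = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
    return D[np.triu_indices(len(A), k=1)]

true_d = sq_dists(X)
raw_d = sq_dists(Y)                                   # distorted by pooling
cent_d = sq_dists(Y - Y.mean(axis=1, keepdims=True))  # mean-centred patterns

r_raw = np.corrcoef(true_d, raw_d)[0, 1]
r_cent = np.corrcoef(true_d, cent_d)[0, 1]
print(f"correlation with true distances: raw={r_raw:.2f}, centred={r_cent:.2f}")
```

With these (arbitrary) parameters, the raw pooled distances are dominated by the stretched mean dimension, whereas the centred distances track the true geometry closely, as the theory predicts.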
Strengths:
(1) Theoretical clarity and novelty: The paper offers an elegant and convincing proof of how linear measurement models affect representational geometry and pinpoints the specific condition (non-zero-mean sampling weights) under which voxel pooling introduces a systematic bias. This quantitative explanation of why mean removal is effective in RSA is new and valuable.
(2) Simulations: Experiments on both synthetic high-dimensional fMRI data and macaque-IT-inspired embeddings corroborate the mathematics, providing practical insight into the authors' theoretical reasoning.
(3) Actionable recommendations: The work distils its results into clear guidelines: random single-unit sampling is "safe" (the estimated geometry is undistorted); fMRI voxel data with unstructured or single-scale codes should be mean-centred; and multi-scale cortical maps require explicit forward modelling. These guidelines will be useful for future research.
Weaknesses:
(1) Simplistic assumptions: The assumption that measurement-channel weights are drawn independently and identically distributed (i.i.d.) from a univariate distribution is a significant idealisation for fMRI data. Voxels have spatially structured responses (and noise), meaning they do not sample neurons with i.i.d. weights. The extent to which the conclusions (especially the "exact recovery" with mean centring) hold when this assumption is violated needs more discussion. While the paper states that the non-negative IWLCS model is a best-case scenario, the implications of deviations from this best case could be elaborated.
(2) Random-subpopulation model for electrophysiology: Similarly, the "random subpopulation model" is presented as an idealisation of single-cell recordings. In reality, electrophysiological sampling is often biased (e.g., towards larger, more active neurons or neurons in accessible locations). The paper acknowledges biased sampling as a challenge that requires separate modelling, but the gap between this idealised model and actual practice should be highlighted more strongly when interpreting the optimistic results.
(3) Noise as an "orthogonal issue": The theoretical derivations largely ignore measurement noise, treating it as an orthogonal problem solvable by cross-validation. Although bias from noise is a well-known problem, interactions between noise and sampling-induced distortions (especially the down-scaling of orthogonal dimensions) could complicate the picture. For instance, if a dimension is already heavily down-scaled by averaging, it might become more susceptible to being obscured by noise. Addressing or highlighting these points more explicitly would make the limitations of this theoretical framework more transparent.
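This concern can be made concrete with a toy example (hypothetical parameters chosen by this reviewer, not taken from the manuscript): two condition pairs with identical true distances, one differing along the population mean and one along an orthogonal dimension. After non-negative pooling plus additive measurement noise, the mean-dimension pair is hugely inflated, while the orthogonal pair sits barely above the noise floor.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical example: conditions B and C are equally far from A in the
# true geometry, but B differs along the population mean and C orthogonally.
n_neur, n_vox, c = 200, 300, 0.1
A = np.zeros(n_neur)
B = A + c                              # pure population-mean shift
v = np.tile([1.0, -1.0], n_neur // 2)  # zero-mean direction, same norm as 1-vector
C = A + c * v                          # orthogonal shift, equal true distance

W = rng.random((n_neur, n_vox))        # non-negative i.i.d. voxel weights
noise_sd = 1.0

def measure(x):
    """Pool neurons into voxels and add i.i.d. measurement noise."""
    return x @ W + noise_sd * rng.standard_normal(n_vox)

d_mean  = np.sum((measure(A) - measure(B)) ** 2)  # mean-dimension pair
d_orth  = np.sum((measure(A) - measure(C)) ** 2)  # orthogonal pair
d_floor = np.sum((measure(A) - measure(A)) ** 2)  # noise floor (identical patterns)

print(f"mean-dim: {d_mean:.0f}, orthogonal: {d_orth:.0f}, noise floor: {d_floor:.0f}")
```

The down-scaled orthogonal difference is nearly indistinguishable from the distance between two noisy measurements of the same pattern, illustrating how sampling-induced scaling and noise can interact rather than being fully orthogonal problems.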
(4) Simulation parameters and generalizability: The random ground-truth geometries were generated from a Gaussian mixture in 5-D and then embedded into 1,024-D, with ≈25% of the variance coming from the mean dimension. The sensitivity of the findings to these specific parameters (initial dimensionality, geometry complexity, proportion of mean variance, and sample size) could be discussed. How would the results change if the true neural geometry had a much higher or lower intrinsic dimensionality, or if the population-mean component were substantially smaller or larger? If the authors' claims are to generalise, more scenarios should be considered.
(5) Mean addition to the neural-data simulation: In simulations based on neural data from Kiani et al., a random mean was added to each pattern to introduce variation along the mean dimension. This was necessary because the original patterns had identical mean activation. However, the procedure might oversimplify how population means vary naturally and could influence the conclusions, particularly regarding the impact of the population-mean dimension. While precisely modelling how the mean varies across conditions is beyond the manuscript's scope, this point should be stated and discussed more clearly.
(6) Effect of mean removal on representational geometry: As noted, the benefits of mean removal hold "under ideal conditions". Real data often violate these assumptions. A critical reader might ask: What if conditions differ in overall activation and in more complex ways (e.g., differing correlation structures across neurons)? Is it always desirable to remove population-mean differences? For example, if a stimulus truly causes a global increase in firing across the entire population (perhaps reflecting arousal or salience), subtracting the mean would treat this genuine effect as a nuisance and eliminate it from the geometry. Prior literature has cautioned that RSA results obtained after demeaning should be interpreted carefully. For instance, Ramírez (2017) dubbed this problem "representational confusion", showing that subtracting the mean pattern can change the relationships between conditions in non-intuitive ways. These potential issues and previous results should be discussed and properly referenced by the authors.
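The scenario described here is easy to demonstrate with a minimal two-condition example (hypothetical numbers, not from the manuscript): if one condition is simply a uniformly elevated copy of another, mean-pattern subtraction makes the two patterns identical, so mean-centred Euclidean and correlation-based distances both report zero dissimilarity despite a genuine global effect.

```python
import numpy as np

# Hypothetical example: condition b is condition a plus a uniform (global)
# activation increase, e.g. a putative arousal or salience effect.
rng = np.random.default_rng(2)
a = rng.random(100)
b = a + 0.5  # genuine global increase across the whole population

raw_dist = np.linalg.norm(a - b)            # sees the global effect
a_c, b_c = a - a.mean(), b - b.mean()
centred_dist = np.linalg.norm(a_c - b_c)    # effect removed entirely (~0)
corr_dist = 1.0 - np.corrcoef(a, b)[0, 1]   # correlation distance (~0)

print(raw_dist, centred_dist, corr_dist)
```

Whether such a global component is nuisance or signal depends on the scientific question, which is why the choice of distance measure deserves the explicit discussion requested above.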
Appraisal, Impact, and Utility:
The authors set out to identify principled conditions under which measured representational distances faithfully reflect the underlying neural geometry and to provide practical guidance for RSA across modalities. Overall, I believe they achieved their goals. Theoretical derivations identify the bias-inducing factors in linear measurement models, and the simulations verify the analytic claims, demonstrating that mean-pattern subtraction can indeed correct some mean-related geometric distortions. These conclusions strongly rely on idealised assumptions (e.g., i.i.d. sampling weights and negligible noise), but the manuscript is explicit about them, and the reasoning from evidence to claim is sound. A deeper exploration of how robust each conclusion is to violations of these assumptions, particularly correlated voxel weights and realistic noise, would make the argument even stronger.
Beyond their immediate aims, the authors offer contributions likely to shape both analysis decisions and the design of future studies exploring the geometry of brain representations. By clarifying why correlation-based RSA seems to work so robustly, they help demystify a practice that has so far been adopted heuristically. Their proposal to adopt mean-centred Euclidean or Mahalanobis distances promises a straightforward alternative that better aligns representational geometry with decoding-based interpretations.
In sum, I see this manuscript as a significant and insightful contribution to the field. The theoretical work clarifying the impact of sampling schemes and the role of mean removal is highly valuable. However, the identified concerns, primarily regarding the idealized nature of the models (especially for fMRI), the treatment of noise, and the need for more nuanced claims, suggest that some revisions are necessary. Addressing these points would substantially strengthen the paper's conclusions and enhance its impact on the neuroscience community by ensuring the proposed methods are robustly understood and appropriately applied in real-world research settings.