Reviewer #3 (Public Review):
This manuscript provides a helpful and transparent guide to applying Granger causality (GC) to calcium imaging datasets. It is a useful entry point toward understanding the suitability of GC for neural data and its limitations. However, it is not entirely convincing that the variants of GC analysis presented in this manuscript can be effectively applied to large-scale calcium datasets without prior knowledge of the underlying circuit, especially when such networks are likely to contain redundancy and recurrent links.
I would like to acknowledge that, at the outset, I held an unfavorable prior belief toward GC, for reasons that are well addressed in this manuscript, including the dangers of applying spectral GC to nonlinear networks, as well as a variety of pathologies that can undermine naive GC.
The manuscript has been helpful, both for its effective presentation of bivariate GC and its multivariate extension and for its treatment of the practical considerations that are essential to applying GC to real-life data. It was particularly helpful to see a treatment of the challenges and their possible resolutions. I commend the authors for their transparency, which should certainly be rewarded rather than punished.
Major
1. Redundant signals: throughout the brain, a population of neurons is expected to encode the same information. It is unclear how GC (both the original and the modified versions) can handle this redundancy. Given how pervasive redundant signals are in the brain, this should be addressed in both simulated and experimental data. For example, in one of the manuscript's simulated networks, replace one neuron with 10 copies of it, each with identical inputs and outputs but with the weights scaled by 1/10. Such a network is functionally equivalent to the original but may pose some challenges for the various versions of GC. I believe this issue also accounts for the MVGC results in the hindbrain dataset. It might be more appropriate to apply GC to groups of neurons (as in the work the authors cite), instead of applying it at the single-cell level with redundant signals.
2. Similarly, there is recurrent connectivity throughout the brain. The current manuscript appears to assume feedforward networks. Is the idea that GC cannot be applied to recurrent networks? If so, this needs to be clearly stated. If the authors believe that GC can recover causal links even in the presence of recurrent connectivity, this needs to be demonstrated.
3. Both BVGC and MVGC appear to be extremely sensitive to any outlier signals. The most worrying aspect is that the authors developed their corrections and pipelines with the benefit of knowing the structure of the underlying system, whereas in the cases where GC would be most useful, the user would be unable to rely on prior knowledge of the underlying structure. For instance, the motion artifact in Fig 3a-c was a helpful example of a vulnerability of naive GC, but one could easily imagine an unmeasured disturbance (e.g. the table is bumped) causing a similar artifact. If the experimenter is unaware of such a disturbance, it will not be included in Z and can therefore lead to the detection of widespread spurious links.
There is a circularity here that is concerning. It seems that one already needs to have the answer (e.g. circuit connectivity) in order to clean up the data sufficiently for BVGC or MVGC to work effectively. Perhaps the authors would be interested in incorporating ideas from the system identification literature, which can include the estimation of unmeasured disturbances, perhaps in conjunction with L1 regularization on the GC links. This is certainly out of scope for the present work, but it would be worth acknowledging the difficulties posed by unmeasured disturbances and deferring a general solution to future work. Similar considerations apply to a common unmeasured neuronal input (e.g. from a brain region not included in the imaging field of view).
4. Interpretation - would it be correct to state that BVGC identifies plausible causal links, while MVGC identifies a plausible system-level model? I think these interpretations, carefully stated, might provide a helpful way of thinking about the two GC approaches. Taking the results of the paper together, neither BVGC nor MVGC is definitive: BVGC may overestimate the true number of causal links, whereas MVGC is prone to a winner-take-all phenomenon and may return just one of many plausible system-level models that can account for the observed data. This should be more clearly stated in the manuscript.
5. "correlation completely misses the structure" - links are signed, so they should be shown with the "bwr" colormap, with zero mapped to white (i.e. v_min is blue, 0 is white, v_max is red, |v_min| = |v_max|; this is natively supported in PyPlot and can be trivially implemented or downloaded in MATLAB). It is misleading that correlation appears to miss certain links marked in black, until one realizes that these links are inhibitory. It would substantially aid clarity and consistency if all panels followed this signed "bwr" convention. I think the emphasis for the GC panels is on whether links are detected, rather than on the weight of the link, so I would suggest indicating detected inhibitory links as -1 (blue), detected excitatory links as +1 (red), and undetected links as 0 (white).
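To make the plotting suggestion in point 5 concrete, the following is a minimal sketch of the signed "bwr" convention in Python with Matplotlib. The function names `plot_signed_links` and `detected_sign_matrix` are hypothetical helpers introduced here for illustration, and the example assumes that the links are available as a square matrix of signed weights; it is not taken from the authors' code.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt


def plot_signed_links(weights, ax=None):
    """Show a signed link matrix with 'bwr', forcing 0 to map to white.

    Hypothetical helper: `weights` is assumed to be a 2D array of signed
    link weights (negative = inhibitory, positive = excitatory).
    """
    weights = np.asarray(weights, dtype=float)
    # Symmetric limits (|v_min| = |v_max|) put zero at the white midpoint.
    vmax = np.max(np.abs(weights))
    if vmax == 0:
        vmax = 1.0  # avoid a degenerate color range for an all-zero matrix
    if ax is None:
        _, ax = plt.subplots()
    im = ax.imshow(weights, cmap="bwr", vmin=-vmax, vmax=vmax)
    ax.figure.colorbar(im, ax=ax)
    return im


def detected_sign_matrix(weights, detected):
    """Return -1 / 0 / +1 per link: sign of the weight where a link was detected.

    Hypothetical helper: `detected` is assumed to be a boolean matrix of
    the same shape as `weights`, marking the links reported by GC.
    """
    return np.sign(weights) * np.asarray(detected, dtype=bool)
```

For the GC panels, passing the output of `detected_sign_matrix` to `plot_signed_links` would give blue for detected inhibitory links, red for detected excitatory links, and white for links that were not detected.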
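The redundancy test proposed in point 1 can be sketched in a similar way. The example below assumes a connectivity matrix `W` in which `W[i, j]` is the weight from neuron `j` to neuron `i`, and checks equivalence only for a noise-free linear update `x(t+1) = W x(t)`; both are assumptions made for illustration and may differ from the simulation model used in the manuscript. The function name `split_neuron` is hypothetical.

```python
import numpy as np


def split_neuron(W, k, n_copies=10):
    """Replace neuron k with n_copies redundant copies.

    Assumed convention (not from the manuscript): W[i, j] is the weight
    from neuron j to neuron i. Each copy receives the same inputs as the
    original neuron, and each copy's outgoing weights are scaled by
    1 / n_copies so that the summed drive onto every target is unchanged.
    Returns the expanded matrix and an index array mapping each new
    neuron to the original neuron it came from.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    # New-to-original index map, with k repeated n_copies times.
    idx = np.concatenate([np.arange(k), np.full(n_copies, k), np.arange(k + 1, n)])
    # Duplicate rows (inputs) and columns (outputs) of neuron k.
    W_new = W[np.ix_(idx, idx)]
    # Scale the outgoing weights of every copy by 1 / n_copies.
    is_copy = idx == k
    W_new[:, is_copy] /= n_copies
    return W_new, idx
```

Under these assumptions, if all copies start in the same state then `W_new @ x[idx]` equals `(W @ x)[idx]`, so the expanded network reproduces the original dynamics while presenting GC with ten perfectly redundant signals. With independent noise on each copy the signals would no longer be identical, which may be the more realistic test case.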