The bias arose because the respondent's height is a collider variable—a direct product of another covariate (SNPs of height) and an outcome (sex).
excellent example
this approach can bias the sample reducing the generalizability of the model, and runs the risk of misestimating the effect of interest
outlier exclusion
reassignment may introduce dependence between tests
Huh? Nested CV solves this.
70% accurate for cases (i.e., 70% sensitive) but 50% accurate overall
0.7^2 + 0.3^2 > 0.5
Overall accuracy is 50% to 100%, depending on proportion
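The dependence of overall accuracy on case proportion can be checked with a few lines of arithmetic. This is a minimal sketch, not the paper's calculation: the function name `overall_accuracy`, the specificity of 0.3, and the two prevalence values are assumptions chosen only to reproduce the numbers in the notes above.

```python
def overall_accuracy(sensitivity, specificity, prevalence):
    """Accuracy = P(case) * sensitivity + P(non-case) * specificity."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity

# Assumed values: 70% sensitivity, 30% specificity.
balanced = overall_accuracy(0.7, 0.3, prevalence=0.5)  # 0.35 + 0.15
skewed = overall_accuracy(0.7, 0.3, prevalence=0.7)    # 0.7^2 + 0.3^2

print(f"balanced classes: {balanced:.2f}")
print(f"70% cases:        {skewed:.2f}")
```

Under these assumed values, 70% sensitivity coincides with 50% overall accuracy only when classes are balanced; with 70% cases the same classifier scores 0.7^2 + 0.3^2 = 0.58.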
This definition of R2 is the equivalent to the squared correlation between the predicted and observed values and reflects the error between the predicted values and its fit to the regression line, not the error between the predicted and observed values
R2 = r^2 for linear regression within a single sample. R2 is different and can be negative when evaluated out of sample.
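The difference between the two quantities is easy to see on a toy case. The sketch below is illustrative only: the helper names and the four data points are made up, and the "predictions" are deliberately offset by a constant to mimic a model that ranks new observations perfectly but is badly calibrated out of sample.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical held-out data: predictions are perfectly correlated
# with the observations but shifted by a constant bias of 10.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [o + 10.0 for o in observed]

r2 = r_squared(observed, predicted)
r_sq = pearson_r(observed, predicted) ** 2
print(f"R2 = {r2:.1f}, r^2 = {r_sq:.1f}")
```

Here r^2 stays at 1 while R2 is strongly negative, because R2 penalizes the error between predicted and observed values and r^2 does not.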
These data demonstrate the bias-variance tradeoff of different cross-validation strategies.
No, Fig. 2B only demonstrates that smaller validation N increases variance (law of large numbers). There is no clear bias-variance tradeoff in choice of K (see variance of Fig. 2A).
increasing K will increase variance—the sensitivity of the model to changes caused by different training data—as the predictive model has less data for training in each sample selection
What? Increasing K raises the training fraction (K-1)/K, so each model is trained on more data, not less.
Although we used the 10 folds cross validation to tune parameters and verified the generalization ability of the models in the independent test-set, it may not completely represent the characteristics of different samples.
no nested cross validation
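For reference, nested cross-validation keeps hyperparameter tuning inside each outer training set, so the outer test fold never influences model selection. The following is a minimal sketch under stated assumptions: the toy "shrunken mean" model, the penalty grid, the fold counts, and all function names are hypothetical stand-ins, not the method of the annotated paper.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_shrunken_mean(ys, lam):
    """Toy model: the training mean shrunk toward zero by penalty lam."""
    return sum(ys) / (len(ys) + lam)

def mse(pred, ys):
    return sum((y - pred) ** 2 for y in ys) / len(ys)

def nested_cv(ys, lambdas, outer_k=5, inner_k=4):
    outer_scores = []
    for test_idx in k_fold_indices(len(ys), outer_k, seed=0):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(ys)) if i not in held_out]

        # Inner loop: pick lam using only the outer-training data.
        inner_folds = k_fold_indices(len(train_idx), inner_k, seed=1)

        def inner_error(lam):
            total = 0.0
            for val_pos in inner_folds:
                val_set = set(val_pos)
                fit_y = [ys[train_idx[p]] for p in range(len(train_idx))
                         if p not in val_set]
                val_y = [ys[train_idx[p]] for p in val_pos]
                total += mse(fit_shrunken_mean(fit_y, lam), val_y)
            return total / inner_k

        best_lam = min(lambdas, key=inner_error)

        # Refit on all outer-training data, score once on the held-out fold.
        model = fit_shrunken_mean([ys[i] for i in train_idx], best_lam)
        outer_scores.append(mse(model, [ys[i] for i in test_idx]))
    return outer_scores

rng = random.Random(42)
ys = [rng.gauss(2.0, 1.0) for _ in range(100)]  # synthetic data
scores = nested_cv(ys, lambdas=[0.0, 1.0, 10.0])
print(sum(scores) / len(scores))
```

The mean of `scores` estimates out-of-sample error for the whole procedure, tuning included, which is what a single tuning loop followed by one test set only approximates.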
studies of the associations between common inter-individual variability in human brain structure/function and cognition or psychiatric symptomatology
Table 2
Table 3
focus should be on specific domains within medicine and science
Semantic Web lives on in academia.
Low reliability of either Time 1 or Time 2 score lowers the reliability of the change or residual score, whereas a high correlation between Time 1 and Time 2 scores causes lower reliability of difference scores
Burt and Obradović (2013)
Consider using DS rather than RS if there is a strong association between baseline assessments and the outcome
the normalized trace of the reordered cross-correlation matrix
t
e quantifies the reproducibility of the subspace spanned by the maps
subcortical structures, which also undergo significant development, were not available from the official ABCD data release
longitudinal component of ABCD is essential to begin to delineate causal longitudinal pathways
screen media activity cannot be reduced to a unidimensional impact on brain structure
main point
crystalized intelligence
-GFA 1 (cortical thickness, gray matter), -GFA 4 (social media), GFA 2 (gaming)
fluid intelligence
GFA 2 (gaming), -GFA 4 (social media)
externalizing behaviors
GFA 1 (cortical thickness, gray matter), GFA 4 (social media)
There are few studies to date using these technologies, and only in adult populations (Place et al., 2017). As such, validation of these models and platforms in large studies are needed.
check in 2022
It is now well accepted that low-level chronic exposure to environmental chemicals may contribute to the growing epidemic of childhood neurodevelopmental disorders worldwide (Grandjean and Landrigan, 2014).
controversial review...
Epigenetic studies such as those involving DNA methylation and histone modification data will be pursued in the future
coming soon...
Culture is described as the knowledge, skills, values and behaviors shared among a group of people.
What kind of parameter space are we operating in? That is, is the language/personality space sparse, so that a relatively small number of language variables account for the bulk of the explainable effects of personality on language? Or is it dense, with hundreds, or even thousands, of distinct but very small contributions? And can the space be adequately modeled by considering just the (additive) main effects of the predictors, or must one consider potentially very complex higher order interactions in order to generate adequate predictions?
data exploration, model selection
Backward selection under Proposition 1 discards only variables associated with treatment
associated with only treatment
An important result that we will draw upon here is that if the causal diagram is such that all common causes of any two variables on the graph are also on the graph then Pearl’s backdoor path theorem applies
how to be sure graph has all causes?
percent variance explained
% partial R2
This effect occurs when we adjust two independent variables for a potential confound that was actually a consequence of the independent variables. In this case, a spurious association between the independent variables can be incorrectly induced.
Berkson's paradox
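A small simulation makes the induced association concrete. This is a sketch under stated assumptions: the two variables are independent standard normals, the common consequence is simply their sum, and conditioning is done by selecting on that sum rather than by regression adjustment. All names and the threshold of 1.0 are illustrative.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

rng = random.Random(0)
n = 20000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 1.0) for _ in range(n)]  # generated independently of x

# Condition on the common consequence (collider): keep only x + y > 1.
selected = [(a, b) for a, b in zip(x, y) if a + b > 1.0]

r_all = pearson_r(x, y)
r_selected = pearson_r([a for a, _ in selected], [b for _, b in selected])
print(f"full sample: r = {r_all:+.2f}")
print(f"selected:    r = {r_selected:+.2f}")
```

In the full sample the correlation is near zero; within the selected subset it is clearly negative, even though x and y were generated independently.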
sensitivity
should be specificity?
Separation by site, binarisation and normalisation of qualitative confounds
Separation by site, demedianing and outlier removal of quantitative confounds
As the ABCD study continues, it will be critical to test whether the separation between males and females changes with age and pubertal status.
TODO
strong effect of scanner that was most notable in posterior brain regions, as well as in the anterior temporal lobe and orbitofrontal cortex
visual, default
greater anticorrelation between the DMN and DAN was associated with higher general cognitive ability
replicates task-positive/negative anticorrelation
the first principal component of between-participant effects had high loadings on connections within the visual and default mode networks and those between default mode and dorsal attention networks
Seems more like {default, visual} ~ {dorsal, ventral} attention
regressed the NIH toolbox total scores onto RSFC for each ROI pair, while covarying scanner manufacturer, which had a more appreciable effect than data collection site, and sex, each coded as categorical varialbes
variables
See Fig. 8 for an overview of anatomical-based cortical parcellations.
another random figure...
See Fig. 5 for a summary of white-matter tracts.
why this figure? (used in Wikipedia)
ICA is then applied to the aggregated canonical mode expressions to recover the independent sources of the variation between observations expressed in the embedding space. While incurring additional computational load, this approach can be advantageous because CCA can only disentangle latent directions of variation in the data up to a random rotation
Miller et al, 2016 did ICA on variable-weights (p+q, m) not subject-weights (2N, m)
we extracted and combined the behavioral and MRI CCA scores for the three significant variates, correlated these with the original data matrix, transformed the correlations using a Fisher Z-transform, and submitted these to ICA
Incorrect. Miller et al. did ICA on variable weights (loadings). This ICA is on subject weights (scores).
They are also useful for detecting new trait associations by correlating observed phenotypes in a sample or cohort with the genetic prediction of another trait. This design is powerful, because if the discovery sample is fully independent of the new sample, an observed association between a complex trait and a genetic predictor from the discovery sample must be due to genetic factors, given that there are no shared environmental factors.
PRS ~ new traits
required sample sizes for sparse CCA were still many times the number of features: when r_true = 0.3, for example, 35–50 (depending on the number of features) samples per feature were required
using datasets generated for CCA?
high dimensionalities and low true correlations
when to choose sCCA
PLS has been compared to sparse CCA in a setting with more features than samples and it has been concluded that the former (latter) performs better when having fewer (more) than about 500 features per sample (54).
backward (Grellmann et al., 2015)
Projecting the gene expression and functional association matrices back onto the gene weights and term weights, respectively, reflects how well a brain area exhibits the gene and term pattern, which we refer to as gene scores and term scores
(# node,) vectors Xu, Yv
with wj held fixed for all j ≠ i, (15) is convex in wi.
multiconvex
The loading of each term was computed as the Pearson’s correlation between the term’s functional association across brain regions and the PLS analysis-estimated scores
corr with gene scores only: corr(X, Xu) and corr(Y, Xu)
Gene scores for 12 unique brain regions gradually increase with development and peak in adulthood
Perceptual (negative) regions are more positive?
Differentiation of regions is more interesting, not overall increase.
affective–attentive or evaluation–perception axis
internal-external
To ensure that the correlation between gene and term scores is not inflated due to spatial autocorrelation, we selected the 75% of brain regions closest in Euclidean distance to a randomly chosen source node as the training set, and the remaining 25% of brain regions as the testing set.
citation for this cross-validation method?
we normalize voxel time series by dividing by the mean across time of each voxel and then use linear regression to remove quadratic trends, signals correlated with estimated motion time courses, and the mean time courses of cerebral white matter, ventricles, and whole brain, as well as their first derivatives
36 regressors
Interestingly, despite the fact that these studies investigated different questions using different datasets and modalities, the reported canonical correlation could be well predicted simply by the number of samples per feature alone (R2 = 0.83).
Interest is in what variables are in the canonical variate, not the correlation.
Also consider publication bias.
extracting
extracted
SI Dataset 1
where?
we calculate the SD $\overline{\sigma^s}$
test: firefox
we calculate the SD $\overline{\sigma^s}$ as the squared mean of $\sigma_m^s$'s
test: chrome
that is, $\overline{\sigma^s} = \sqrt{\frac{\sum_{m=1}^{k_s} (\sigma_m^s)^2}{k_s}}$.
test2
Figure 7
again, significant but very weak
Figure 6
unclear in legend, but p < 0.005 dots are more sparse
average magnitude of the negative edge weights decreased significantly with age
barely (r = -0.15), also magnitude increased
average magnitude of the positive edge weights increased significantly with age
barely (r=0.11)
right temporo-parietal junction
attention
posterior cingulate and medial orbitofrontal areas
self reference, valuation
During adolescence, functional connectivity between resting-state networks decreases with age, whereas functional connectivity within cortical resting-state networks increases with age, except for several connections within the salience network that decrease with age. There is limited evidence for dynamics in genetic or common environmental influences, suggesting mostly stable influences across adolescence.
but interpret dynamics analysis with caution (limited power)
individualized functional topography accurately predicted executive function in matched split-half samples while controlling for age, sex, and motion
Is r = 0.42 accurate?
2F-CV, see Figure S7
wrong figure
low cortical myelin content
later development
$\Sigma_w = \sum_{m=1}^{M} \sum_{d=1}^{D} \left(x_{m,d} - \sum_{d'=1}^{D} x_{m,d'}\right)\left(x_{m,d} - \sum_{d'=1}^{D} x_{m,d'}\right)^T$
missing division by D for means (same problem in Σb below)
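With the division by D restored, the within-group scatter would presumably read as below. This is a reconstruction, not the paper's equation: the symbol $\bar{x}_m$ is introduced here for the mean over $d'$ at fixed $m$ and does not appear in the original.

```latex
\bar{x}_m = \frac{1}{D} \sum_{d'=1}^{D} x_{m,d'},
\qquad
\Sigma_w = \sum_{m=1}^{M} \sum_{d=1}^{D}
  \left(x_{m,d} - \bar{x}_m\right)\left(x_{m,d} - \bar{x}_m\right)^T
```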
below
above
defined as follows: $w = \arg\max_{w'} \operatorname{tr}\left(\frac{w'^T \Sigma_b w'}{w'^T \Sigma_w w'}\right)$
Why trace? Sloppy notations.
the projection of the direction of its movement relative to mouse B
The proper unambiguous term for this is vector rejection.
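The distinction between projection and rejection can be sketched as follows. The vectors and function names are made up for illustration and are not taken from the paper: `velocity` stands in for a movement direction and `toward_b` for the direction toward the other animal.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(a, b):
    """Component of a parallel to b (vector projection of a onto b)."""
    scale = dot(a, b) / dot(b, b)
    return [scale * c for c in b]

def reject(a, b):
    """Component of a perpendicular to b (vector rejection of a from b)."""
    parallel = project(a, b)
    return [x - p for x, p in zip(a, parallel)]

velocity = [3.0, 4.0]   # hypothetical movement direction
toward_b = [1.0, 0.0]   # hypothetical direction toward the other animal

parallel = project(velocity, toward_b)       # [3.0, 0.0]
perpendicular = reject(velocity, toward_b)   # [0.0, 4.0]
print(parallel, perpendicular)
```

The two components sum back to the original vector, and the rejection is orthogonal to the reference direction, which is why the terms are not interchangeable.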