On 2014 Aug 28, Niall Duncan commented:
Dr Bishop raises some interesting points. I can't add to those regarding reading ability per se but can perhaps add some information relevant to the MRS-related ones.
Glutamate concentrations appear to be stable over the relevant age range, as do choline concentrations (Dezortova M, 2008; Blüml S, 2013). [A longitudinal study in rats found a continuing increase in Glx over the lifespan, which, together with the higher field strength, may be worth taking into consideration (Morgan JJ, 2013).] Creatine concentrations appear to change over the relevant period, which could be an issue as this is taken as the reference substance in the paper. The authors do look at this possibility, though, and report that there is no relationship between reading ability and creatine. I still find it odd that they don't report the age- and tissue-proportion-corrected partial correlations as their main findings, as taking these into account is pretty standard in the MRS literature. [That it is standard does not of course mean that it is right, and the authors may be able to give a convincing argument as to why their approach is better.]
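For concreteness, the standard correction usually amounts to a partial correlation: regress both the metabolite and the behavioural measure on the covariates and correlate the residuals. A minimal sketch in Python (the column names glu, reading, age, and gm_fraction are hypothetical, not taken from the paper):

```python
# Sketch of an age- and tissue-proportion-corrected partial correlation.
import numpy as np
from scipy import stats

def residualise(y, covariates):
    """Residuals of y after an ordinary least-squares fit on the covariates."""
    X = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def partial_corr(df, x, y, covars):
    rx = residualise(df[x].to_numpy(float), df[covars].to_numpy(float))
    ry = residualise(df[y].to_numpy(float), df[covars].to_numpy(float))
    # Note: this p-value ignores the degrees of freedom used by the covariates,
    # so it is slightly liberal; dedicated partial-correlation routines adjust for this.
    return stats.pearsonr(rx, ry)

# usage (hypothetical data frame):
# r, p = partial_corr(df, "glu", "reading", ["age", "gm_fraction"])
```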
Also curious is that they don't report anything about glutamine concentrations. As far as I can tell from the methods, these were measured (fully accepting that I could be misreading what they have done). There has been a lot of conflation of glutamate with glutamate+glutamine (Glx) in the literature, as not all methods allow these to be reliably separated. The two are not equivalent, however. Some clarification along these lines in the case of this paper would be nice.
MRS is probably best described as a trait measure. There are some so-called "functional MRS" studies that find changes in metabolite (generally glutamate or lactate) concentrations when a task is performed. Normally, though, measurements are made "at rest" over a period of 10-20 minutes.
The test-retest reliability of MRS varies between different acquisition/analysis methods, regions, and metabolites. For Glx the coefficient of variation (CoV) seems to be around 7%, and for GABA around 10%. For creatine it seems to be closer to 5% (relevant where this is taken as the reference substance). Glutamine has poor reproducibility (CoV over 20%), which is worrying when trying to separate this signal from glutamate. These numbers are from adults, though, and might be different in children.
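To make those CoV figures concrete, the usual calculation is the within-subject standard deviation across repeated scans divided by the within-subject mean, averaged over subjects. A minimal sketch (the scans-by-session layout is an assumption for illustration):

```python
# Within-subject coefficient of variation (CoV) from repeated scans.
import numpy as np

def within_subject_cov(scans):
    """scans: (n_subjects, n_sessions) array of metabolite estimates."""
    scans = np.asarray(scans, dtype=float)
    per_subject = scans.std(axis=1, ddof=1) / scans.mean(axis=1)
    return per_subject.mean() * 100  # percent

# A CoV of ~7% for Glx means the scan-to-scan SD is roughly 7% of a subject's mean level.
```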
Related to the difference between children and adults, the issue of head movement is one that has not yet been properly addressed in the MRS literature. There is some evidence that choline concentrations are particularly affected by head motion (Andrews-Shigaki BC, 2011). It could therefore be the case that the children towards one end or the other of the reading score distribution tend to move more (differences in boredom thresholds or suchlike?).
Metabolite concentrations do differ between different parts of the brain, but the question is whether they are correlated with each other or not. I'm not aware of anybody specifically testing this yet. It is thus possible that the occipital cortex is acting as a correlated "proxy region" for a different region that is more directly linked to reading. This question of specificity is, I believe, an important one for MRS studies. Alternatively, my naive first take would be that it doesn't seem so unlikely that metabolite levels in the visual system are related to "higher-level" functions such as reading.
A few papers that interested people might find useful are:
- An excellent overview of the metabolites being measured with MRS (Rae CD, 2014).
- A review of MRS techniques for measuring Glx (Ramadan S, 2013).
- A review of MRS techniques for measuring GABA (Puts NA, 2012).
- A discussion of how to combine MRS with other measures (Duncan NW, 2014).
- A review of the multimodal imaging (i.e., combining MRS with other measures) literature to date (Duncan NW, 2014).
[Disclaimer - the last two are by colleagues and me.]
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2014 May 05, Dorothy V M Bishop commented:
The findings reported in this study suggest that the study of neurometabolites could help us understand the neural basis of individual differences in reading ability. It is good to see this novel approach to neurobiological bases of reading. There were, however, places where the paper was unclear on certain details, and I would be grateful if the authors could clarify these points – and/or make the whole dataset publicly available.
On p 4083, we are told that poor readers are identified on the basis of a composite score on TOWRE of 85 or less, and good readers on the basis of a score of 100 or more. Later, though, data are presented for a reading composite score which is based on the two TOWRE subtests and reading comprehension from WJ-III. Why was WJ-III LWID not included?
I had assumed that in forming a composite, age-scaled scores would be used, yet on p 4085, it is stated that the analyses used a principal component based on raw scores, unadjusted for age. Specifically, it is stated that "Standard scores were not used because our focus is on the relation between neurometabolite levels and raw skill differences, not skill differences standardized to an age or grade norm." (p. 4085). Yet the focus of the paper in the Abstract, Introduction and Discussion is on reading disability – which can only be identified if we have age-adjusted reading scores.
This is key to interpreting the scatterplots in Figure 2. It seemed odd that some good/intermediate readers had lower reading composite scores than poor readers. I initially thought these might be children with poor reading comprehension, who had done well on TOWRE but got very low scores on reading comprehension. But if these scores are not age-adjusted, then these could be children whose low scores are less abnormal because they are young. Is that the case? Given the focus on reading disability, we do need to see the data for age-adjusted reading ability relative to glutamate levels. An association between raw reading scores and glutamate levels could simply mean that both measures show maturational changes.
The legend of Table 2 is confusing: it states that p-values marked with * indicate significance after correction for multiple comparisons. With 4 neurometabolites and 4 dependent variables, a Bonferroni correction would treat as significant only those correlations with associated p-values below .05/16 ≈ .003. The p-values in Table 2 are uncorrected p-values for the correlations, and none met that criterion. It could be argued that a Bonferroni correction is too conservative here, but if so, further information is needed about the 'conservative correction for multiple comparisons' referred to on p 4085.
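To spell out the arithmetic, the family here is the 4 × 4 grid of correlations, so the Bonferroni-adjusted threshold is .05/16 ≈ .0031 (equivalently, each p-value can be multiplied by 16 and compared with .05). A minimal sketch:

```python
# Bonferroni threshold for the 4 metabolites x 4 reading measures = 16 correlations.
n_tests = 4 * 4
alpha = 0.05
threshold = alpha / n_tests  # 0.003125

def survives_bonferroni(p, n_tests=n_tests, alpha=alpha):
    # Equivalent formulations: compare p with alpha/n_tests,
    # or compare min(p * n_tests, 1.0) with alpha.
    return p < alpha / n_tests

print(round(threshold, 4))  # 0.0031
```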
The partial correlations after removing linear effects of age and grey matter volume are presented in brackets in Table 2 and are not interpreted statistically. (They are generally lower than the raw correlations and would not be regarded as statistically significant if a Bonferroni correction were applied). It thus seems that the main results that form the focus of the discussion and conclusions are not age-adjusted and hence could reflect the fact that both reading and neurometabolite levels change with age. Is that correct?
It would be good to see fuller data from time 2 testing. A correlation between Cho and RC that had been significant (using an uncorrected p-value) at time 1 was no longer significant. This could be because the reading composite was not very reliable, because there is a genuine change in the strength of association, and/or because of reduced power in the smaller sample, but we cannot tell on the basis of the data that is provided. What is the correlation between RC at time 1 and time 2? How many children who were RD at time 1 still met criteria for RD at time 2? If we focus just on the 45 children tested both times, what were the correlations with Glu and Cho at time 1?
On p 4086 ANOVA is used to compare TD and RD groups. It is argued that age and gray matter volume need not be controlled because they do not differ between these groups. However, insofar as both variables are correlated with metabolite levels, it would be good to include them as covariates, as this would improve the power of the analysis by removing extraneous sources of variance. It is possible that the authors are underestimating the effect size here. In addition, the performance of the 'middle' group, with scores intermediate between the reading-disabled and typically developing groups, is not shown, yet it is of interest. More generally, converting quantitative data to categories using cutoffs and then excluding a middle group is problematic, because it creates 'researcher degrees of freedom' if cutoffs are not pre-specified - see Simmons, J. P., et al. (2011). Psychological Science, 22(11), 1359-1366. doi: 10.1177/0956797611417632
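For readers who want to see what the covariate-adjusted comparison would look like, it is simply an ANCOVA-style linear model. A minimal sketch (the data frame and its column names glu, group, age, and gm_vol are hypothetical, not taken from the paper):

```python
# Sketch of a TD vs RD comparison with age and gray-matter volume as covariates.
import pandas as pd
import statsmodels.formula.api as smf

def group_difference_with_covariates(df: pd.DataFrame):
    """df: one row per child, with hypothetical columns glu, group ('TD'/'RD'), age, gm_vol."""
    model = smf.ols("glu ~ C(group) + age + gm_vol", data=df).fit()
    # The C(group) coefficient tests the TD/RD difference after removing
    # variance shared with age and gray-matter volume.
    return model

# usage: print(group_difference_with_covariates(df).summary())
```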
It is good to see the authors attempting to replicate the Cho result with data from the NIH MRI study of normal brain development, but they do not compare like with like. The NIH study used two reading measures from WJ-III, both of which were also used in the Pugh et al study. We need to see a parallel analysis of the two studies with the same measures treated the same way, as well as a fuller report of the NIH sample data that would make it possible to compare the means and ranges of scores between the two samples.
Finally, as someone unfamiliar with MRS, I would be very interested to know more about the nature of this measure, in particular, how consistent is it within individuals, and how consistent is it across brain voxels? Regarding the individual consistency, I found myself wondering if it should be seen more as a measure of trait or state: Is there data on test-retest reliability of MRS values that could give insights into this? Alternatively, can neurometabolite levels be manipulated by behavioural interventions? As regards consistency across the brain, is it the case that glutamate/choline levels tend to be similar across the whole brain, so that sampling any region would give similar findings? I had assumed not, but in that case, it is surprising that the midline occipital voxel should show correlations with reading, since this brain region is not among those usually regarded as forming part of the reading network.
I appreciate that length constraints in journals often preclude full reporting of results, and I am aware that the Journal of Neuroscience has compounded this problem by ceasing to accept supplemental material. In my view this is bad for science as it means that readers are left with many questions about details of results, such as those that I have noted here. It is likely that future studies will aim to replicate and extend this line of work and these findings could then be incorporated into a meta-analysis – but as it stands, the level of detail is insufficient for this purpose. There are, however, options open to authors who wish to report fuller data, including FigShare or Open Science Framework, and perhaps this can provide a solution for authors whose complex datasets cannot be easily condensed into a brief report.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.