This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Pardoe HR, Cutter G, Alter RA, Kucharsky Hiess R, Semmelroch M, Parker D, Farquharson S, Jackson G, Kuzniecky R. 2015. Pooling morphometric estimates: a statistical equivalence approach. PeerJ PrePrints 3:e808v2 https://doi.org/10.7287/peerj.preprints.808v2
Changes in hardware or image processing settings are a common issue for large multi-center studies. In order to pool MRI data acquired under these changed conditions, it is necessary to demonstrate that the changes do not affect MRI-based measurements. In these circumstances classical inference testing is inappropriate because it is designed to detect differences, not prove similarity. We used a method known as statistical equivalence testing to address this limitation.
Equivalence testing was carried out using the “two one-sided tests” (TOST) approach on three datasets: (i) cortical thickness and automated hippocampal volume estimates obtained from healthy individuals imaged using different multi-channel head coils; (ii) manual hippocampal volumetry obtained using two readers; and (iii) corpus callosum area estimates obtained using an automated method with manual cleanup carried out by two readers. Power analyses of the two one-sided tests were used to estimate the sample sizes required for well-powered equivalence testing analyses. Mean and standard deviation estimates from the automated hippocampal volume dataset were used to carry out an example power analysis.
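As an illustration of the TOST idea, the following minimal Python sketch tests whether the mean within-subject difference between two conditions lies inside a symmetric equivalence margin. It is not the authors' implementation: the paired design, the function names `t_cdf` and `tost_paired`, the numerically integrated Student t distribution, and the choice of margin are all assumptions made for this example.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """Student t CDF via Simpson integration of the density on [0, |t|]."""
    # Log of the normalising constant of the t density (assumed helper).
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = min(0.5, max(0.0, total * h / 3))
    return 0.5 + area if t >= 0 else 0.5 - area

def tost_paired(x, y, margin):
    """TOST p-value for paired measurements x and y.

    Null hypothesis: the mean difference lies outside (-margin, +margin).
    A small p-value supports equivalence within the margin.
    """
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    se = stdev(d) / math.sqrt(n)
    df = n - 1
    # One-sided test 1: is the mean difference greater than -margin?
    p_lower = 1.0 - t_cdf((mean(d) + margin) / se, df)
    # One-sided test 2: is the mean difference less than +margin?
    p_upper = t_cdf((mean(d) - margin) / se, df)
    # Equivalence is declared only if both one-sided tests reject.
    return max(p_lower, p_upper)

# Hypothetical measurements from the same subjects under two conditions.
coil_a = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
coil_b = [10.1, 10.9, 12.1, 12.9, 14.1, 14.9]
print(tost_paired(coil_a, coil_b, margin=1.0))
```

The reported TOST p-value is the larger of the two one-sided p-values, so equivalence is concluded only when the mean difference is shown to be both above the lower margin and below the upper margin.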
Cortical thickness values were found to be equivalent over 61% of the cortex when different head coils were used (q < 0.05, FDR correction). Automated hippocampal volume estimates obtained using the same two coils were statistically equivalent (TOST p = 4.28 × 10⁻¹⁵). Manual hippocampal volume estimates obtained using two readers were not statistically equivalent (TOST p = 0.97). The use of different readers to carry out limited correction of automated corpus callosum segmentations yielded equivalent area estimates (TOST p = 1.28 × 10⁻¹⁴). Power analysis of simulated and automated hippocampal volume data demonstrated that the equivalence margin affects the number of subjects required for well-powered equivalence tests.
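The dependence of sample size on the equivalence margin can be sketched with a simple calculation. The Python example below uses a normal approximation to the power of a paired TOST, which is a simplification and not necessarily the procedure used in this study. The function names `tost_power` and `required_n`, the standard deviation, the margins and the 80% power target are all hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def tost_power(n, sd_diff, margin, true_diff=0.0, alpha=0.05):
    """Approximate TOST power for n paired subjects (normal approximation)."""
    z = NormalDist()
    se = sd_diff / sqrt(n)           # standard error of the mean difference
    z_crit = z.inv_cdf(1 - alpha)    # one-sided critical value
    power = (z.cdf((margin - true_diff) / se - z_crit)
             + z.cdf((margin + true_diff) / se - z_crit) - 1)
    return max(0.0, power)

def required_n(sd_diff, margin, target=0.8, max_n=100000, **kwargs):
    """Smallest n whose approximate power reaches the target."""
    for n in range(2, max_n + 1):
        if tost_power(n, sd_diff, margin, **kwargs) >= target:
            return n
    raise ValueError("target power not reached; is true_diff inside the margin?")

# Hypothetical inputs: SD of paired differences = 1.0 (arbitrary units).
for margin in (1.0, 0.5, 0.25):
    print(margin, required_n(sd_diff=1.0, margin=margin))
```

Under this approximation, halving the equivalence margin roughly quadruples the number of subjects needed, which is why the margin should be fixed on scientific grounds before data collection.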
We have presented a statistical method for determining if morphometric measures obtained under variable conditions can be pooled. The equivalence testing technique is applicable for analyses in which experimental conditions vary over the course of the study.
This is a submission to the Journal of Neuroimaging (published there as http://onlinelibrary.wiley.com/doi/10.1111/jon.12265/abstract). Changes in this version include (i) the addition of power analyses for sample size calculation in equivalence testing analyses, (ii) use of the cross-sectional FreeSurfer processing stream rather than the longitudinal processing used in previous versions, and (iii) minor corrections.