Recovering data from summary statistics: Sample Parameter Reconstruction via Iterative TEchniques (SPRITE)

Bouve College of Health Sciences, Northeastern University, Boston, United States
Omnes Res, Charlottesville, Virginia, United States
Graduate School of Teaching (ICLON), Leiden University, Leiden, Netherlands
University Medical Center, University of Groningen, Groningen, Netherlands
DOI
10.7287/peerj.preprints.26968v1
Subject Areas
Ethical Issues, Science and Medical Education, Statistics
Keywords
Reproducibility, Statistics, Reanalysis, Replication
Copyright
© 2018 Heathers et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Heathers JA, Anaya J, van der Zee T, Brown NJ. 2018. Recovering data from summary statistics: Sample Parameter Reconstruction via Iterative TEchniques (SPRITE). PeerJ Preprints 6:e26968v1

Abstract

Scientific publications have not traditionally been accompanied by data, either during the peer review process or when published. Concern has arisen that the literature in many fields may contain inaccuracies or errors that cannot be detected without inspecting the original data. Here, we introduce SPRITE (Sample Parameter Reconstruction via Iterative TEchniques), a heuristic method for reconstructing plausible samples from descriptive statistics of granular data, allowing reviewers, editors, readers, and future researchers to gain insights into the possible distributions of item values in the original data set. This paper presents the principles of operation of SPRITE, as well as worked examples of its practical use for error detection in real published work. Full source code for three software implementations of SPRITE (in MATLAB, R, and Python) and two web-based implementations requiring no local installation (1, 2) are available for readers.
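
To make the idea of reconstructing a plausible sample from descriptive statistics concrete, the following is a minimal Python sketch. It is not the authors' implementation. The function name `sprite_sketch`, its parameters, and the specific pair-shifting rule are assumptions made for illustration. The sketch assumes integer responses on a bounded scale, a reported mean and sample standard deviation rounded to `dp` decimal places, and a simple search that moves two values in opposite directions so that the mean is preserved while the standard deviation drifts toward the target.

```python
import random
import statistics


def sprite_sketch(n, target_mean, target_sd, lo, hi, dp=2,
                  max_iter=100_000, seed=0):
    """Return one sorted integer sample on [lo, hi] whose rounded mean and
    sample SD match the targets, or None if none is found.

    Illustrative only: names and search rule are assumptions, not the
    published SPRITE code.
    """
    rng = random.Random(seed)

    # Step 1: the sample sum must be an integer that reproduces the mean.
    total = round(target_mean * n)
    if round(total / n, dp) != round(target_mean, dp):
        return None  # no integer sum gives the reported mean

    # Step 2: start from the flattest sample with that sum.
    base, extra = divmod(total, n)
    sample = [base + 1] * extra + [base] * (n - extra)
    if min(sample) < lo or max(sample) > hi:
        return None  # the mean lies outside the scale

    # Step 3: shift pairs of values in opposite directions (sum unchanged).
    for _ in range(max_iter):
        sd = statistics.stdev(sample)
        if round(sd, dp) == round(target_sd, dp):
            return sorted(sample)
        i, j = rng.sample(range(n), 2)
        if sample[i] < sample[j]:
            i, j = j, i  # ensure sample[i] >= sample[j]
        if sd < target_sd:
            # Widen: push the larger value up and the smaller value down.
            if sample[i] < hi and sample[j] > lo:
                sample[i] += 1
                sample[j] -= 1
        elif sample[i] - sample[j] >= 2:
            # Narrow: pull two values that are at least 2 apart together.
            sample[i] -= 1
            sample[j] += 1
    return None


# Hypothetical report: n = 20, mean = 3.20, SD = 1.20 on a 1-5 scale.
candidate = sprite_sketch(20, 3.20, 1.20, lo=1, hi=5)
print(candidate)
```

Running the search with different seeds would give different candidate samples, which is what lets a reader see the range of distributions compatible with the reported statistics. A return value of `None` means only that this sketch found no match; it does not prove that no matching sample exists.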

Author Comment

This preprint version of the manuscript (1.0) serves as a longer exposition of the technique; we currently expect that it will be reformatted and condensed for future publication.