Recovering data from summary statistics: Sample Parameter Reconstruction via Iterative TEchniques (SPRITE)
- Subject Areas
- Ethical Issues, Science and Medical Education, Statistics
- Reproducibility, Statistics, Reanalysis, Replication
- © 2018 Heathers et al.
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2018. Recovering data from summary statistics: Sample Parameter Reconstruction via Iterative TEchniques (SPRITE). PeerJ Preprints 6:e26968v1 https://doi.org/10.7287/peerj.preprints.26968v1
Scientific publications have not traditionally been accompanied by data, either during the peer review process or when published. Concern has arisen that the literature in many fields may contain inaccuracies or errors that cannot be detected without inspecting the original data. Here, we introduce SPRITE (Sample Parameter Reconstruction via Iterative TEchniques), a heuristic method for reconstructing plausible samples from descriptive statistics of granular data. It allows reviewers, editors, readers, and future researchers to gain insight into the possible distributions of item values in the original data set. This paper presents the principles of operation of SPRITE, together with worked examples of its practical use for error detection in real published work. Full source code for three software implementations of SPRITE (in MATLAB, R, and Python) and two web-based implementations requiring no local installation (1, 2) are available to readers.
This pre-print manuscript version (1.0) serves as a longer exposition of the technique; we currently expect that it will be reformatted and condensed for future publication.
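To make the idea in the abstract concrete, the following is a minimal sketch of how a plausible sample might be reconstructed from a reported mean, standard deviation, sample size, and scale range. It is an illustration written under stated assumptions, not the authors' published implementation: the function name `sprite_sketch`, its parameters, and the specific pair-adjustment strategy are all hypothetical, and the data are assumed to be integers on a bounded scale (for example, Likert items).

```python
import math
import random


def sprite_sketch(n, target_mean, target_sd, lo, hi, dp=2,
                  max_iter=100_000, seed=0):
    """Search for one integer sample in [lo, hi] of size n (n >= 2) whose
    mean and sample SD round (to dp places) to the reported values.

    Hypothetical illustration only; returns a sorted list, or None if no
    matching sample is found.
    """
    rng = random.Random(seed)

    # Assumption: take the integer total closest to n * mean. For large n,
    # other totals may also round to the same mean; they are ignored here.
    total = round(target_mean * n)
    if round(total / n, dp) != round(target_mean, dp):
        return None  # no integer total reproduces the reported mean
    if not lo * n <= total <= hi * n:
        return None  # reported mean lies outside the scale range

    # Start from the flattest sample that has exactly the required total.
    base, extra = divmod(total, n)
    sample = [base + 1] * extra + [base] * (n - extra)
    mean = total / n

    for _ in range(max_iter):
        ss = sum((x - mean) ** 2 for x in sample)
        sd = math.sqrt(ss / (n - 1))
        if round(sd, dp) == round(target_sd, dp):
            return sorted(sample)

        # Pick two distinct positions; adjust them in opposite directions
        # so the total (and hence the mean) is unchanged.
        i, j = rng.sample(range(n), 2)
        a, b = sample[i], sample[j]
        if sd < target_sd:
            # Spread values apart: raise the larger, lower the smaller.
            if a >= b and a < hi and b > lo:
                sample[i], sample[j] = a + 1, b - 1
        else:
            # Pull values together: only helps if they differ by >= 2.
            if a - b >= 2:
                sample[i], sample[j] = a - 1, b + 1

    return None  # no match found within the iteration budget
```

For example, `sprite_sketch(n=10, target_mean=3.00, target_sd=1.15, lo=1, hi=5)` searches for ten responses on a 1 to 5 scale that match those summary statistics. A matching sample is generally not unique, so running the search with different values of `seed` gives a sense of the range of distributions that are consistent with the reported numbers, while a `None` result suggests the reported statistics may deserve a closer look.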