Reanalyzing Head et al. (2015): Investigating the robustness of widespread p-hacking

Department of Methodology and Statistics, Tilburg University, Tilburg, The Netherlands
DOI: 10.7287/peerj.preprints.2439v1
Subject Areas: Science Policy, Statistics
Keywords: qrps, nhst, reanalysis, p-hacking
Copyright: © 2016 Hartgerink
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article: Hartgerink CH. 2016. Reanalyzing Head et al. (2015): Investigating the robustness of widespread p-hacking. PeerJ Preprints 4:e2439v1

Abstract

Head et al. (2015b) provided a large collection of p-values that, from their analytic perspective, indicates widespread statistical significance seeking (i.e., p-hacking). This paper inspects that result for robustness. Head et al. correctly argue that an aggregate p-value distribution could show a bump below .05 when left-skew p-hacking occurs frequently. Theoretically, the p-value distribution should be a smooth, decreasing function, but the distribution of reported p-values shows systematically more reported p-values at .01, .02, .03, .04, and .05. Moreover, the elimination of p = .045 and p = .05, as done in the original paper, is debatable. Given that systematically more p-values are reported to two decimal places, and given the disputable selection of the bins .04 < p < .045 versus .045 < p < .05, I did not exclude p = .045 and p = .05, and I adjusted the bin selection to .03875 < p ≤ .04 versus .04875 < p ≤ .05. Results of the reanalysis indicate that no evidence for left-skew p-hacking remains when the second-decimal reporting tendency is taken into account. Taking reporting tendencies into account is especially important because this dataset does not allow the p-values to be recalculated. Moreover, given the weight of the findings by Head et al. (2015b), these findings must be robust to debatable analytic choices if the conclusion is to be considered unequivocal. Although this reanalysis finds no evidence of widespread left-skew p-hacking, this does not mean that there is no p-hacking at all. These results qualify the conclusion by Head et al. (2015b), indicating that the results are not robust and that the evidence for widespread left-skew p-hacking is ambiguous at best.
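
The adjusted bin comparison described above can be illustrated with a minimal sketch. It assumes that the two bins are compared with a one-sided binomial test, in which left-skew p-hacking predicts more p-values in the bin just below .05 than in the bin just below .04; the abstract itself does not name the test, and the function names (count_in_bin, binomial_upper_tail, compare_bins) and the example p-values are hypothetical and used for illustration only.

```python
from math import comb


def count_in_bin(p_values, lower, upper):
    """Count p-values with lower < p <= upper (upper bound inclusive)."""
    return sum(1 for p in p_values if lower < p <= upper)


def binomial_upper_tail(k, n, prob=0.5):
    """Return P(X >= k) for X ~ Binomial(n, prob)."""
    return sum(
        comb(n, i) * prob**i * (1 - prob) ** (n - i) for i in range(k, n + 1)
    )


def compare_bins(p_values, low_bin=(0.03875, 0.04), high_bin=(0.04875, 0.05)):
    """Compare the adjusted bins from the abstract.

    Assumption: a one-sided binomial test with success probability 0.5,
    where a 'success' is a p-value falling in the higher bin.
    Returns (count_low, count_high, one_sided_p); the last element is None
    when both bins are empty.
    """
    low = count_in_bin(p_values, *low_bin)
    high = count_in_bin(p_values, *high_bin)
    total = low + high
    if total == 0:
        return low, high, None
    return low, high, binomial_upper_tail(high, total)


# Hypothetical reported p-values, not taken from the Head et al. dataset.
example = [0.039, 0.04, 0.049, 0.05, 0.05, 0.03]
low_count, high_count, one_sided_p = compare_bins(example)
print(low_count, high_count, one_sided_p)
```

Because both upper bounds are inclusive, values reported as exactly .04 or .05 are retained, which is the point of the adjusted bins: a surplus of p-values rounded to two decimals then contributes to both bins rather than being discarded from one side of the comparison.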

Author Comment

This is a submission to PeerJ for review.

Supplemental Information

Reanalysis results per discipline

DOI: 10.7287/peerj.preprints.2439v1/supp-1