Reanalyzing Head et al. (2015): Investigating the robustness of widespread p-hacking
- Subject Areas
- Science Policy, Statistics
- Keywords
- qrps, nhst, reanalysis, p-hacking
- Copyright
- © 2016 Hartgerink
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
Hartgerink. 2016. Reanalyzing Head et al. (2015): Investigating the robustness of widespread p-hacking. PeerJ Preprints 4:e2439v1 https://doi.org/10.7287/peerj.preprints.2439v1
Abstract
Head et al. (2015b) provided a large collection of p-values that, from their analytic perspective, indicates widespread statistical significance seeking (i.e., p-hacking). This paper inspects that result for robustness. Head et al. correctly argue that an aggregate p-value distribution could show a bump below .05 when left-skew p-hacking occurs frequently. Theoretically, the p-value distribution should be a smooth, decreasing function, but the distribution of reported p-values shows systematically more p-values reported at .01, .02, .03, .04, and .05. Moreover, the elimination of p = .045 and p = .05, as done in the original paper, is debatable. Given that systematically more p-values are reported to two decimal places, and given the disputable selection of the bins .04 < p < .045 versus .045 < p < .05, I did not exclude p = .045 and p = .05, and I adjusted the bin selection to .03875 < p ≤ .04 versus .04875 < p ≤ .05. The reanalysis indicates that no evidence for left-skew p-hacking remains once the second-decimal reporting tendency is taken into account. Accounting for reporting tendencies is especially important because this dataset does not allow the p-values to be recalculated. Moreover, given the weight of the findings by Head et al. (2015b), those findings must be robust to debatable analytic choices if the conclusion is to be considered unequivocal. Although this reanalysis finds no evidence of widespread left-skew p-hacking, this does not mean that there is no p-hacking at all. These results nuance the conclusion by Head et al. (2015b): the original results are not robust, and the evidence for widespread left-skew p-hacking is ambiguous at best.
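The bin comparison described in the abstract can be illustrated with a short sketch. This is not the analysis code of either paper. It assumes that the comparison is made with a one-sided binomial test of whether the upper bin holds more than half of the p-values falling in the two bins, which the abstract does not state. The function names and the sample p-values are hypothetical and serve only to show the mechanics.

```python
from math import comb


def count_in_bin(p_values, lower, upper):
    """Count p-values in the half-open bin lower < p <= upper."""
    return sum(1 for p in p_values if lower < p <= upper)


def binomial_upper_tail(k, n, prob=0.5):
    """Return P(X >= k) for X ~ Binomial(n, prob)."""
    return sum(
        comb(n, i) * prob**i * (1 - prob) ** (n - i) for i in range(k, n + 1)
    )


def compare_bins(p_values, low_bin=(0.03875, 0.04), high_bin=(0.04875, 0.05)):
    """Compare counts in two bins with a one-sided binomial test (assumed).

    The default bins are the adjusted bins from the abstract. A small
    returned probability would mean the upper bin holds more p-values
    than expected if both bins were equally likely.
    """
    n_low = count_in_bin(p_values, *low_bin)
    n_high = count_in_bin(p_values, *high_bin)
    tail = binomial_upper_tail(n_high, n_low + n_high)
    return n_low, n_high, tail


# Hypothetical reported p-values, for illustration only (not real data).
sample = [0.04] * 6 + [0.05] * 4 + [0.045, 0.039, 0.049, 0.2]
n_low, n_high, tail = compare_bins(sample)
print(f"lower bin: {n_low}, upper bin: {n_high}, one-sided p: {tail:.3f}")
```

With the default bins, values reported as exactly .04 and .05 are included, whereas passing the original bins (.04, .045) and (.045, .05) to the same hypothetical function would compare a different set of p-values. Note that this sketch uses inclusive upper bounds, so it does not reproduce the original strict inequalities exactly.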
Author Comment
This is a submission to PeerJ for review.