A peer-reviewed article of this Preprint also exists.
This is more a comment on your Hartgerink (2015), which you cite as evidence that, in our paper (Head et al. 2015, PLoS Biol), "the original results may have been confounded by publication bias and tendencies to round p-values".
Firstly, as you hint in the following line of the present preprint, publication bias (i.e. the file-drawer problem for p-values above 0.05) does not affect our study, because the method we used does not utilise p-values > 0.05. Head et al. also conducted a manual analysis suggesting that when one re-calculates p-values from the test statistics, they often erroneously end up on the 'good' side of p = 0.05. So it is misleading to ignore this and cite publication bias as something that confounds the analysis in Head et al.
Regarding the argument in Hartgerink (2015) that Head et al.'s conclusions are confounded by rounding error: I already addressed this when your paper was reviewed and rejected by PLoS Biology, and again in a comment on the non-peer-reviewed Hartgerink (2015). If you don't agree with my assessment, I'd be interested to hear why. Otherwise, perhaps you should not cite Hartgerink (2015) as evidence that Head et al.'s conclusions do not hold. Here are my comments, copied over from Hartgerink (2015):
Our original analyses removed all p-values reported to two decimal places (e.g. 0.04 and 0.05), because there is empirical evidence (and a priori, it seems likely) that these data are ‘tainted’ by inconsistent reporting or rounding practices.
For example, it seems probable that p values of 0.049 will very often be reported as p < 0.05 instead of p = 0.049 or p = 0.05, due to authors trying to hide the fact that 0.049 is ‘only just significant’. Conversely, authors are probably comparatively less likely to report 0.039 as p < X or p = 0.04, since 0.039 is a ‘good, significant looking number’, so there is no shame in reporting it exactly. Therefore, we expect biased reporting practices to cause the 0.05 peak to be smaller than the 0.04 peak, and indeed that is exactly what our data show (see Fig 1 in Hartgerink’s manuscript).
Because we are aware of this bias, we elected to do our analysis on the p-value bins 0.04 < p < 0.045 and 0.045 < p < 0.05 (note that we excluded the problematic values of 0.04 and 0.05). By contrast, Hartgerink’s analysis includes these tainted data, and unsurprisingly finds no evidence for p-hacking (since the 0.05 bin is clearly much lower than the 0.04 peak - ironically, probably because of p-hacking!). An additional (more minor) problem with Hartgerink’s analysis is that it uses p-value bins that are quite far apart (e.g. comparing 0.035-0.04 and 0.045-0.05). This makes his test less sensitive, because the overall p-curve displays right skew (because of evidential value), and we are trying to detect left skew, which should be most evident in the region close to 0.05, where the right skew is weaker.
Finally, in the PeerJ pre-print linked in the above comment [by Bishop and Thompson], the authors show that our conclusion of p-hacking holds even if one takes the comparatively drastic step of eliminating all papers in which any p-values are reported to fewer than three decimal places. This seems like a conservative approach, but it illustrates further that our results are not a spurious consequence of the primary studies’ propensity to round off their p-values.
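To make the binned comparison in the quoted comment concrete, here is a minimal sketch of a one-sided exact binomial test over the two bins. The function names and the example counts are hypothetical illustrations, not Head et al.'s actual code or data:

```python
from math import comb

def binom_sf(k, n, p=0.5):
    # P(X >= k) for X ~ Binomial(n, p): one-sided exact tail probability
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def p_hacking_test(lower_count, upper_count):
    """One-sided test for an excess of p-values just below 0.05.

    lower_count: p-values with 0.04 < p < 0.045 (endpoints excluded)
    upper_count: p-values with 0.045 < p < 0.05 (endpoints excluded)
    Evidential value alone produces right skew, so the upper bin should
    hold at most about half the values; a significant excess there
    (left skew near 0.05) is the signature of p-hacking.
    """
    n = lower_count + upper_count
    return binom_sf(upper_count, n)

# Illustrative counts only -- not the real dataset:
print(p_hacking_test(100, 150) < 0.05)
```

Because the two bins exclude the endpoints 0.04, 0.045, and 0.05, the rounded, 'tainted' values never enter the test, which is the point of the binning choice described above.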
Please correct me if I'm wrong, but it seems like you don't understand my criticism of Hartgerink (2015). It's possible that you instead disagree with it, but you have not yet offered any reasoning. I am arguing:
1. To paraphrase, I think that your previous paper says: "The p-value data mined by Head et al. has lots of spikes in it due to data rounding [Head et al. were well aware of this]. Therefore, one needs to be careful that the rounded data do not cause issues [Head et al. dealt with this by excluding all the rounded data]. Hartgerink (2015) did a new analysis which INCLUDES the tainted, rounded data, which is different to Head et al.'s analysis which EXCLUDES the tainted data [this confuses me, because you just argued that one should not include rounded data]. Hartgerink (2015) concludes that, because its results are different, Head et al.'s results are not robust". Is that about right?
2. I think Hartgerink (2015) is wrong: that paper fully acknowledges that the rounded data should not be used, then it uses them anyway, and then (without any justification that I can see) it concludes that its reanalysis is preferable to Head et al.'s original one. Please let me know if you think otherwise; I'm keen to know if my assessment is wrong.
3. To repeat myself: another reason to think that Head et al. is robust is that Bishop and Thompson (whom you cite) did another reanalysis of Head et al.'s dataset. They used an even more extreme data-cleaning approach than Head et al.: they threw out every single paper in which there was at least one rounded p-value. They found essentially identical results to Head et al., which suggests that improper treatment of rounded p-values did not lead Head et al. to a spurious result. How could it, if one gets the same result in a dataset that is totally free of rounded data?
4. Given points 2 and 3, I believe that Hartgerink (2015) does not call Head et al.'s results into question. Therefore, I think you should remove this reference from the present paper (and perhaps amend Hartgerink 2015, which is self-published and thus still open to edits). Alternatively, you could respond to my criticisms if you think I am wrong.
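For concreteness, the data-cleaning step behind point 3 (Bishop and Thompson's stricter version of Head et al.'s filter) could look roughly like the sketch below. The regex and function names are hypothetical, not the authors' actual code, and a real text-mining pipeline would need to handle far more reporting formats:

```python
import re

# Matches reports such as "p = 0.04", "p < .049", "P=0.0412" (illustrative pattern)
P_VALUE = re.compile(r"[pP]\s*[=<>]\s*0?\.(\d+)")

def decimal_places(paper_text):
    """Decimal places of every p-value reported in a paper's text."""
    return [len(m.group(1)) for m in P_VALUE.finditer(paper_text)]

def keep_paper(paper_text):
    """Bishop & Thompson-style filter: discard the whole paper if any
    reported p-value has fewer than three decimal places (i.e. could
    have been rounded onto a tainted value such as 0.04 or 0.05)."""
    places = decimal_places(paper_text)
    return bool(places) and all(d >= 3 for d in places)
```

The point of filtering at the paper level, rather than the p-value level, is that a single rounded value signals that the paper's authors round their p-values, so even their exactly-reported values cannot be fully trusted.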
For the third time, my point is: your new PeerJ pre-print relies on a citation to a self-published, non-peer-reviewed Authorea paper (or rather, a paper that was rejected following peer review for the very reason I am outlining here, plus some others). You did not respond to my comment on your Authorea paper (see here: https://www.authorea.com/users/2013/articles/31568/showarticle), and you are not responding here either. This leads me to think that you have no rebuttal, and if that is the case, don't you think you should remove the sentence from your PeerJ preprint where you argue that Head et al. is confounded? Alternatively, you should convincingly defend the claims in your Authorea paper, either here or on the Authorea website. Given that your Authorea manuscript is not peer-reviewed and seems to have a big error, I don't think you should reference it in this preprint.
Thanks for revising the MS. I was not aiming to start a long argument, but presumably you can see why I feel it is not appropriate to criticise the Head et al. paper via a reference to a non-peer-reviewed paper. This is particularly true because I believe Chris' paper is debunked by my comments on it (repeated above); he has never responded to these comments, making me think I may well be right, and that his re-analysis does not discredit Head et al. after all.
You wrote here: "Chris Hartgerink told me he replied to your concerns more than once in the response to the reviews of Hartgerink (2015)." I invite you to read the comments on Chris' article: you will see that there are only three: one from Bishop, one from Chris replying to Bishop, and my comment. There is no reply from Chris to me, let alone the multiple replies Chris appears to have told you about. Here's the link: https://www.authorea.com/users/2013/articles/31568
Any idea where Chris' comments ended up?