Examining publication bias – A simulation-based evaluation of statistical tests on publication bias

Department of Sociology, Ludwig-Maximilians-Universität München, Munich, Germany
DOI
10.7287/peerj.preprints.3059v1
Subject Areas
Ethical Issues, Science Policy, Statistics
Keywords
statistics, publication bias, test for excess significance, caliper test, Monte Carlo simulation, p-uniform, Egger's test, FAT
Copyright
© 2017 Schneck
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Schneck A. 2017. Examining publication bias – A simulation-based evaluation of statistical tests on publication bias. PeerJ Preprints 5:e3059v1

Abstract

Background

Publication bias is a form of scientific misconduct. It threatens the validity of research results and the credibility of science. Although several tests for publication bias exist, no in-depth evaluations are available that indicate which test to use for a given research problem.

Methods

In the study at hand, four tests for publication bias were evaluated in a Monte Carlo simulation: Egger’s test (FAT), p-uniform, the test of excess significance (TES), and the caliper test. Two different types of publication bias, as well as its degree (0%, 50%, 100%), were simulated. The type of publication bias was defined either as file-drawer bias, meaning the repeated analysis of new datasets in search of a significant result, or as p-hacking, meaning the inclusion of covariates in order to obtain a significant result. In addition, the underlying effect (β = 0, 0.5, 1, 1.5), effect heterogeneity, the number of observations in the simulated primary studies (N = 100, 500), and the number of primary studies available to the publication bias tests (K = 100, 1000) were varied.
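To make the two selection mechanisms concrete, the following Python sketch simulates a single primary study under file-drawer selection and under p-hacking. This is a minimal illustration, not the original simulation code; all function names, default values (e.g. max_tries, max_covariates), and the exact way the degree of bias is applied are assumptions.

```python
# Minimal sketch of the two publication-bias mechanisms described above.
# NOT the original simulation code; names and defaults are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2017)

def run_study(beta, n):
    """Simulate one primary study: OLS of y on x, return (estimate, SE, p-value)."""
    x = rng.normal(size=n)
    y = beta * x + rng.normal(size=n)
    fit = sm.OLS(y, sm.add_constant(x)).fit()
    return fit.params[1], fit.bse[1], fit.pvalues[1]

def file_drawer_study(beta, n, bias=1.0, max_tries=50):
    """File-drawer bias: a non-significant dataset is discarded (with
    probability `bias`) and a fresh dataset is drawn instead."""
    for _ in range(max_tries):
        est, se, p = run_study(beta, n)
        if p < 0.05 or rng.random() > bias:
            return est, se, p
    return est, se, p  # give up after max_tries draws

def p_hacked_study(beta, n, bias=1.0, max_covariates=10):
    """p-hacking: keep adding noise covariates to the same dataset until the
    focal coefficient becomes significant (only if the study is subject to bias)."""
    x = rng.normal(size=n)
    y = beta * x + rng.normal(size=n)
    X = sm.add_constant(x)
    fit = sm.OLS(y, X).fit()
    if rng.random() > bias:  # study not affected by publication bias
        return fit.params[1], fit.bse[1], fit.pvalues[1]
    for _ in range(max_covariates):
        if fit.pvalues[1] < 0.05:
            break
        X = np.column_stack([X, rng.normal(size=n)])  # add another covariate
        fit = sm.OLS(y, X).fit()
    return fit.params[1], fit.bse[1], fit.pvalues[1]

# A meta-analytic sample of K = 100 studies under 100% file-drawer bias, beta = 0:
sample = [file_drawer_study(beta=0.0, n=100, bias=1.0) for _ in range(100)]
```

Under file-drawer selection a non-significant dataset is discarded and redrawn, whereas under p-hacking the same dataset is re-analysed with additional covariates until the focal coefficient becomes significant.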

Results

All tests evaluated were able to identify publication bias in both the file-drawer and the p-hacking conditions. The false positive rates were unbiased, with the exception of the 15%- and 20%-caliper tests. The FAT had the largest statistical power in the file-drawer conditions, whereas under p-hacking the TES performed slightly better, except under effect heterogeneity. The caliper test, however, was inferior to the other tests under effect homogeneity and had decent statistical power only in conditions with 1000 primary studies.
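For reference, a hedged sketch of the FAT (Egger’s regression test) as it is commonly computed: the standardized effect (estimate divided by its standard error) is regressed on precision (the inverse standard error), and an intercept that differs from zero indicates funnel plot asymmetry. The usage lines reuse the `sample` list from the simulation sketch above; names are illustrative, not the paper’s implementation.

```python
# Hedged sketch of Egger's regression test (FAT) on study estimates and SEs.
import numpy as np
import statsmodels.api as sm

def egger_fat(estimates, std_errors):
    """Regress the standardized effect (estimate / SE) on precision (1 / SE);
    a non-zero intercept indicates funnel-plot asymmetry, i.e. small-study
    effects consistent with publication bias."""
    snd = np.asarray(estimates) / np.asarray(std_errors)  # standard normal deviates
    precision = 1.0 / np.asarray(std_errors)
    fit = sm.OLS(snd, sm.add_constant(precision)).fit()
    return fit.params[0], fit.pvalues[0]  # intercept and its p-value

# Apply to the simulated meta-analytic sample from the sketch above:
estimates, std_errors, _ = zip(*sample)
intercept, p_value = egger_fat(estimates, std_errors)
```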

Discussion

The FAT is recommended as a test for publication bias in standard meta-analyses with no or only small effect heterogeneity. If no clear direction of publication bias is suspected, the TES is the first alternative to the FAT. The 5%-caliper test is recommended under conditions of effect heterogeneity, which may arise if publication bias is examined in a discipline-wide setting in which primary studies cover different research problems.
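As an illustration of the caliper test recommended here, the sketch below counts z-values just above versus just below the 5% critical value and applies a one-sided binomial test. The exact caliper definition (a symmetric window of caliper × 1.96 around the critical value) is an assumption and may differ from the paper’s implementation.

```python
# Hedged sketch of a caliper test on a set of z-values (estimate / SE).
import numpy as np
from scipy.stats import binomtest

def caliper_test(z_values, caliper=0.05, crit=1.959964):
    """Compare how many |z| fall just above vs. just below the critical value
    within a narrow caliper; an excess of 'just significant' results suggests
    publication bias (binomial test, H0: proportion = 0.5)."""
    z = np.abs(np.asarray(z_values))
    width = caliper * crit
    over = int(np.sum((z >= crit) & (z < crit + width)))
    under = int(np.sum((z < crit) & (z >= crit - width)))
    if over + under == 0:
        return None  # no z-values inside the caliper
    return binomtest(over, over + under, 0.5, alternative="greater")

# Example: z-values from the simulated sample used above.
# z_values = [est / se for est, se, _ in sample]
# result = caliper_test(z_values, caliper=0.05)
```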

Author Comment

This is a submission to PeerJ for review.

Supplemental Information