This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Background: Methods for extracting summary statistics from published reports have been widely employed in literature-based meta-analyses of cancer prognostic studies. However, no previous study has assessed the magnitude of bias these methods produce or compared their influence on fixed versus random effects models. The purpose of this study is therefore to empirically assess the degree of bias introduced by these extraction methods and to examine their potential effects on fixed and random effects models.

Methods: Using published data from cancer prognostic studies, systematic differences between reported statistics and those obtained indirectly from log-rank test p-values and total numbers of events were tested using paired t tests and the log-rank test of survival-agreement plots. The degree of disagreement between estimates was quantified using an information-based disagreement measure, which was also used to examine levels of disagreement between estimates obtained from fixed and random effects models.

Results: Thirty-four studies provided a total of 65 estimates of lnHR and its variance. There was a significant difference between the means of the indirect lnHRs and the reported values (mean difference = -0.272, t = -4.652, p-value < 0.0001), as well as between the means of the two estimates of variance (mean difference = -0.115, t = -4.5556, p-value < 0.0001). Survival-agreement plots showed a bias towards underestimation by the indirect method for both lnHR (log-rank p-value = 0.031) and its variance (log-rank p-value = 0.0432). The magnitude of disagreement between estimates of lnHR based on the information-based measure was 0.298 (95% CI: 0.234–0.361), and for the variances it was 0.406 (95% CI: 0.339–0.470).
Because the disagreement between variances was higher than that between lnHR estimates, the level of disagreement between lnHRs weighted by the inverse of their variances in fixed effects models was amplified. In addition, the results indicated that random effects meta-analyses could be more prone to bias than fixed effects meta-analyses: beyond the bias in estimates of lnHRs and their variances, between-studies variance calculations produced levels of disagreement as high as 0.487 (95% CI: 0.416–0.552) and 0.568 (95% CI: 0.496–0.635).

Conclusions: Extracting summary statistics from published studies can introduce bias into literature-based meta-analyses and undermine the validity of the evidence. These findings emphasise the importance of reporting sufficient statistical information in research articles and warrant further research into the influence of such bias on random effects models.
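The indirect method tested above, recovering lnHR and its variance from a log-rank p-value and the total number of events, can be sketched as follows. This is a minimal illustration using the widely cited Parmar-style approximation (Var(lnHR) ≈ 4/O under roughly equal allocation), not necessarily the exact expressions used in this study; the helper name `indirect_lnhr` and its arguments are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def indirect_lnhr(p_two_sided, total_events, hr_above_one=True):
    """Approximate lnHR and its variance from a two-sided log-rank
    p-value and the total number of events O (Parmar-style indirect
    method; assumes roughly equal allocation between study arms)."""
    # |z| corresponding to the two-sided p-value
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    var_lnhr = 4.0 / total_events        # Var(lnHR) ~ 4 / O
    lnhr = z * sqrt(var_lnhr)            # |lnHR| = |z| * SE(lnHR)
    # The sign must be taken from the reported direction of effect
    return (lnhr if hr_above_one else -lnhr), var_lnhr
```

Because the p-value fixes only |z|, the direction of effect must still be read off the published report, which is one route by which extraction error can enter.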
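The propagation of variance errors described above can be illustrated with a standard inverse-variance fixed-effect pool and a DerSimonian–Laird random-effects estimate; this is a generic sketch of those common estimators (the function name is hypothetical), not the specific software or measure used in the study. Since the extracted variances enter both the weights and the between-studies variance tau², errors in them affect random effects results twice.

```python
def pool_fixed_random(lnhrs, variances):
    """Inverse-variance fixed-effect pooling plus a DerSimonian-Laird
    random-effects estimate, showing how the extracted variances feed
    both the weights and the between-studies variance tau^2."""
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, lnhrs)) / sw
    # Cochran's Q and the DL estimate of tau^2
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, lnhrs))
    df = len(lnhrs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    random_eff = sum(wi * y for wi, y in zip(w_re, lnhrs)) / sum(w_re)
    return fixed, random_eff, tau2
```

With biased study-level variances, the weights w, the heterogeneity statistic Q, and hence tau² all shift, which is consistent with the higher disagreement levels observed for the random effects quantities.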