A guide to Bayesian model checking for ecologists

Marine Mammal Laboratory, NOAA Alaska Fisheries Science Center, Seattle, Washington, United States
Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, Colorado, United States
Department of Statistics, Colorado State University, Fort Collins, Colorado, United States
Colorado Cooperative Fish & Wildlife Research Unit, U.S. Geological Survey, Fort Collins, Colorado, United States
DOI
10.7287/peerj.preprints.3390v1
Subject Areas
Ecology, Marine Biology, Statistics
Keywords
Bayesian p-value, goodness-of-fit, hierarchical model, model diagnostics, posterior checks
Copyright
© 2017 Conn et al.
License
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Conn P, Johnson D, Williams P, Melin S, Hooten M. 2017. A guide to Bayesian model checking for ecologists. PeerJ Preprints 5:e3390v1

Abstract

Checking that models adequately represent data is an essential component of applied statistical inference. Ecologists increasingly use hierarchical Bayesian statistical models in their research. The appeal of this modeling paradigm is undeniable, as researchers can build and fit models that embody complex ecological processes while simultaneously accounting for observation error. However, ecologists tend to be less focused on checking model assumptions and assessing potential lack-of-fit when applying Bayesian methods than when applying more traditional modes of inference such as maximum likelihood. There are also multiple ways of assessing the fit of Bayesian models, each of which has strengths and weaknesses. For instance, Bayesian p-values are relatively easy to compute, but are well known to be conservative, producing p-values biased toward 0.5. Alternative approaches to model checking, such as prior predictive checks, cross-validation probability integral transforms, and pivot discrepancy measures, may produce more accurate characterizations of goodness-of-fit but are not as well known to ecologists. In addition, a suite of visual and targeted diagnostics can be used to examine violations of different model assumptions and lack-of-fit at different levels of the modeling hierarchy, and to check for residual temporal or spatial autocorrelation. In this review, we synthesize existing literature to guide ecologists through the many available options for Bayesian model checking. We illustrate methods and procedures with several ecological case studies, including (i) analysis of simulated spatio-temporal count data, (ii) N-mixture models for estimating abundance and detection probability of sea otters from an aircraft, and (iii) hidden Markov modeling to describe attendance patterns of California sea lion mothers on a rookery.
We find that commonly used procedures based on posterior predictive p-values detect extreme model inadequacy, but often do not detect more subtle cases of lack of fit. Tests based on cross-validation and pivot discrepancy measures (including the "sampled predictive p-value") appear to be better suited to model checking and to have better overall statistical performance. We conclude that model checking is an essential component of scientific discovery and learning that should accompany most Bayesian analyses presented in the literature.
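As an illustration of the posterior predictive p-value idea discussed above, the following is a minimal, self-contained sketch (not code from the article): it fits a conjugate Poisson model to counts that are in fact overdispersed, then compares a chi-squared discrepancy computed on the observed data against the same discrepancy computed on replicated data drawn from the posterior predictive distribution. All names and the simulated data are hypothetical; in a real analysis the posterior draws would come from an MCMC sampler rather than a conjugate update.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: counts generated from an overdispersed (negative
# binomial) process, to be checked against a simple Poisson model.
y_obs = rng.negative_binomial(n=2, p=0.4, size=50)

# Stand-in for posterior draws of the Poisson mean: with a Gamma(1, 1)
# prior, the posterior is Gamma(sum(y) + 1, rate = n + 1).
n_draws = 2000
lambda_draws = rng.gamma(shape=y_obs.sum() + 1,
                         scale=1.0 / (len(y_obs) + 1),
                         size=n_draws)

def discrepancy(y, lam):
    """Chi-squared discrepancy between counts and their Poisson expectation."""
    return np.sum((y - lam) ** 2 / lam)

# Replicated data sets: one simulated data set per posterior draw.
y_rep = rng.poisson(lam=lambda_draws[:, None], size=(n_draws, len(y_obs)))

d_obs = np.array([discrepancy(y_obs, lam) for lam in lambda_draws])
d_rep = np.array([discrepancy(y_rep[i], lambda_draws[i])
                  for i in range(n_draws)])

# Posterior predictive (Bayesian) p-value: the proportion of draws where
# the replicated discrepancy exceeds the observed one. Values near 0 or 1
# suggest lack of fit; values near 0.5 suggest adequacy.
p_value = np.mean(d_rep >= d_obs)
print(round(p_value, 3))
```

Because the simulated counts are overdispersed relative to the Poisson model, the observed discrepancy tends to exceed the replicated discrepancies and the p-value falls near zero, flagging the misfit. As the abstract notes, this style of check reliably detects such gross inadequacy but can be conservative for subtler violations.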

Author Comment

The authors have submitted this manuscript to Ecological Monographs.