This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Checking that models adequately represent data is an essential component of applied statistical inference. Ecologists increasingly use hierarchical Bayesian statistical models in their research. The appeal of this modeling paradigm is undeniable, as researchers can build and fit models that embody complex ecological processes while simultaneously controlling observation error. However, ecologists tend to be less focused on checking model assumptions and assessing potential lack-of-fit when applying Bayesian methods than when applying more traditional modes of inference such as maximum likelihood. There are also multiple ways of assessing the fit of Bayesian models, each of which has strengths and weaknesses. For instance, Bayesian p-values are relatively easy to compute, but are well known to be conservative, producing p-values biased toward 0.5. Alternatively, lesser known approaches to model checking, such as prior predictive checks, cross-validation probability integral transforms, and pivot discrepancy measures may produce more accurate characterizations of goodness-of-fit but are not as well known to ecologists. In addition, a suite of visual and targeted diagnostics can be used to examine violations of different model assumptions and lack-of-fit at different levels of the modeling hierarchy, and to check for residual temporal or spatial autocorrelation. In this review, we synthesize existing literature to guide ecologists through the many available options for Bayesian model checking. We illustrate methods and procedures with several ecological case studies, including i) analysis of simulated spatio-temporal count data, (ii) N-mixture models for estimating abundance and detection probability of sea otters from an aircraft, and (iii) hidden Markov modeling to describe attendance patterns of California sea lion mothers on a rookery. We find that commonly used procedures based on posterior predictive p-values detect extreme model inadequacy, but often do not detect more subtle cases of lack of fit. Tests based on cross-validation and pivot discrepancy measures (including the ``sampled predictive p-value'') appear to be better suited to model checking and to have better overall statistical performance. We conclude that model checking is an essential component of scientific discovery and learning that should accompany most Bayesian analyses presented in the literature.
The authors have submitted this manuscript to Ecological Monographs.