Application of zero-inflated negative binomial mixed model to human microbiota sequence data

Colorado School of Public Health, University of Colorado Denver, Aurora, Colorado, USA
School of Medicine, University of Colorado Denver, Aurora, Colorado, USA
University of Colorado Microbiome Research Consortium (MiRC), Aurora, Colorado, USA
DOI
10.7287/peerj.preprints.215v1
Subject Areas
Bioinformatics, Microbiology, Statistics
Keywords
microbiota, negative binomial, zero-inflation
Copyright
© 2014 Fang et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Fang R, Wagner B, Harris JK, Fillon SA. 2014. Application of zero-inflated negative binomial mixed model to human microbiota sequence data. PeerJ PrePrints 2:e215v1

Abstract

High-throughput sequencing technology has made it feasible to identify the majority of organisms present in human-associated microbial communities. However, the resulting data consist of non-negative, highly skewed sequence counts with a large proportion of zeros. Zero-inflated models are useful for analyzing such data. Moreover, the non-zero observations may be over-dispersed relative to the Poisson distribution; if this over-dispersion is ignored, parameter estimates can be biased and standard errors underestimated. In such circumstances, a zero-inflated negative binomial (ZINB) model accounts for these characteristics better than a zero-inflated Poisson (ZIP) model. In addition, complex study designs may involve repeated measurements or multiple samples collected from the same subject, so random effects are introduced to account for within-subject variation. A zero-inflated negative binomial mixed model contains one component that models the probability of excess zeros and another that models the negative binomial parameters, and it accommodates repeated measures through random effects that are independent between these two components. The objective of this study is to examine the application of a zero-inflated negative binomial mixed model to human microbiota sequence data.
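To make the two-component structure described in the abstract concrete, the following is a minimal Python sketch of a ZINB probability mass function. It is an illustration only, not the authors' implementation. The function names (nb_pmf, zinb_pmf, subject_params), the mean/dispersion parameterization of the negative binomial, and the logit and log link functions with subject-level random intercepts u_i and v_i are all assumptions made for this example; the abstract does not specify them.

```python
import math


def nb_pmf(y, mu, size):
    """Negative binomial pmf with mean mu and dispersion parameter size.

    Assumed parameterization: variance = mu + mu**2 / size.
    """
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi_zero, mu, size):
    """Zero-inflated negative binomial pmf.

    pi_zero is the probability of an excess (structural) zero; with
    probability 1 - pi_zero the count comes from the negative binomial.
    """
    base = (1.0 - pi_zero) * nb_pmf(y, mu, size)
    return pi_zero + base if y == 0 else base


def subject_params(eta_zero, eta_count, u_i=0.0, v_i=0.0):
    """Map linear predictors to (pi_zero, mu) for one subject.

    Assumption for illustration: a logit link for the zero component and a
    log link for the count component, each with its own random intercept
    (u_i and v_i), taken to be independent of one another.
    """
    pi_zero = 1.0 / (1.0 + math.exp(-(eta_zero + u_i)))
    mu = math.exp(eta_count + v_i)
    return pi_zero, mu


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    pi_zero, mu = subject_params(eta_zero=-0.5, eta_count=1.5, u_i=0.2, v_i=-0.1)
    total = sum(zinb_pmf(y, pi_zero, mu, size=0.5) for y in range(5000))
    print(f"pi_zero={pi_zero:.3f}, mu={mu:.3f}, total probability={total:.6f}")
```

In this sketch the probability of observing a zero is always at least as large as under the plain negative binomial with the same mean and dispersion, which is the sense in which the model captures excess zeros. Fitting such a model to data would additionally require integrating over the random effects, which is not shown here.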