Application of zero-inflated negative binomial mixed model to human microbiota sequence data
- Subject Areas
- Bioinformatics, Microbiology, Statistics
- Keywords
- microbiota, negative binomial, zero-inflation
- Copyright
- © 2014 Fang et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article
- 2014. Application of zero-inflated negative binomial mixed model to human microbiota sequence data. PeerJ PrePrints 2:e215v1 https://doi.org/10.7287/peerj.preprints.215v1
Abstract
High-throughput sequencing technology has made it feasible to identify the majority of organisms present in human-associated microbial communities. However, the resulting data consist of non-negative, highly skewed sequence counts with a large proportion of zeros. Zero-inflated models are useful for analyzing such data. Moreover, the non-zero observations may be over-dispersed relative to the Poisson distribution; ignoring this over-dispersion biases parameter estimates and leads to underestimated standard errors. In such circumstances, a zero-inflated negative binomial (ZINB) model accounts for these characteristics better than a zero-inflated Poisson (ZIP) model. In addition, complex study designs may involve repeated measurements or multiple samples collected from the same subject, so random effects are introduced to account for within-subject variation. A zero-inflated negative binomial mixed model contains one component that models the probability of excess zeros and another that models the negative binomial parameters; it accommodates repeated measures through random effects in each component, with the random effects of the two components taken to be independent of each other. The objective of this study is to examine the application of a zero-inflated negative binomial mixed model to human microbiota sequence data.
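To make the two-component structure concrete, the following is a minimal sketch of a ZINB probability mass function, not the authors' implementation. It assumes the common mean/dispersion parameterization of the negative binomial (mean `mu`, dispersion `theta`), a log link for the count component, and a logit link for the excess-zero component; the abstract does not state these choices. All function and parameter names (`nb_pmf`, `zinb_pmf`, `subject_parameters`, `u_i`, `v_i`) are hypothetical and chosen for illustration only.

```python
import math


def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu and dispersion theta (assumed parameterization)."""
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, theta, pi):
    """ZINB pmf: with probability pi the count is an excess zero,
    otherwise it is drawn from the negative binomial component."""
    excess_zero = pi if y == 0 else 0.0
    return excess_zero + (1.0 - pi) * nb_pmf(y, mu, theta)


def zinb_mean_var(mu, theta, pi):
    """Marginal mean and variance implied by the mixture."""
    mean = (1.0 - pi) * mu
    var = (1.0 - pi) * mu * (1.0 + mu / theta + pi * mu)
    return mean, var


def subject_parameters(x_beta, z_gamma, v_i, u_i):
    """Map linear predictors plus subject-level random intercepts to (mu, pi).

    Assumed links: log for the count mean, logit for the excess-zero
    probability. v_i and u_i are the random effects of the two components,
    treated as independent of each other.
    """
    mu = math.exp(x_beta + v_i)
    pi = 1.0 / (1.0 + math.exp(-(z_gamma + u_i)))
    return mu, pi


if __name__ == "__main__":
    # Illustrative values only; they are not estimates from the study.
    mu, pi = subject_parameters(x_beta=1.6, z_gamma=-0.8, v_i=0.2, u_i=-0.1)
    theta = 0.5
    print("P(Y = 0) under NB alone:", nb_pmf(0, mu, theta))
    print("P(Y = 0) under ZINB:    ", zinb_pmf(0, mu, theta, pi))
    print("mean, variance:         ", zinb_mean_var(mu, theta, pi))
```

In this sketch the zero-inflation term raises the probability of a zero count above what the negative binomial alone would give, and the variance exceeds the mean whenever `mu` is positive, which is the over-dispersion the abstract refers to. Fitting such a model to data would additionally require integrating over the random effects, which is not shown here.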