Automatic definition of robust microbiome sub-states in longitudinal data
- Subject Areas
- Bioinformatics, Computational Biology, Microbiology, Data Mining and Machine Learning
- Keywords
- Microbiome, Sub-states, Clustering, Longitudinal dataset, Machine Learning, Metagenomics
- Copyright
- © 2018 García-Jiménez et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2018. Automatic definition of robust microbiome sub-states in longitudinal data. PeerJ Preprints 6:e26657v1 https://doi.org/10.7287/peerj.preprints.26657v1
Abstract
The analysis of microbiome dynamics would allow us to elucidate patterns within microbial community evolution; however, microbiome state-transition dynamics have been scarcely studied. This is in part because a necessary first step in such analyses has not been well defined: how to deterministically describe a microbiome's "state". Clustering into states has been widely studied, although no standard has yet been established. We propose a generic, domain-independent and automatic procedure to determine a reliable set of microbiome sub-states within a specific dataset, and with respect to the conditions of the study. The robustness of sub-state identification is established by combining diverse techniques for stable cluster verification. We reuse four distinct longitudinal microbiome datasets to demonstrate the broad applicability of our method. We analyse results with different taxa subsets, which allows the procedure to be adjusted to the application goal, and show that the methodology provides a set of robust sub-states to examine in downstream studies of microbiome dynamics.
Author Comment
This is a submission to PeerJ for review.
Supplemental Information
Robust clustering evaluation, with the HCLUST algorithm, in different datasets
From top to bottom: (1) Human gut microbiome (David et al., 2014), (2) Chick gut (Ballou et al., 2016), (3) Vagina (Gajer et al., 2012), (4) Preterm infant gut (La Rosa et al., 2014).
Clusters in Chick Gut with different numbers of taxa, represented as Principal Coordinates graphs
Top row: default taxonomic level (i.e. species); bottom row: genus aggregation. Columns from left to right: all, dominant and non-dominant taxa.