Ananke: Temporal clustering reveals ecological dynamics of microbial communities

Faculty of Graduate Studies, Dalhousie University, Halifax, Nova Scotia, Canada
Environmental Chemistry and Technology Program, University of Wisconsin-Madison, Madison, Wisconsin, United States
Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
Department of Civil and Environmental Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States
Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, United States
DOI
10.7287/peerj.preprints.2879v1
Subject Areas
Bioinformatics, Ecology, Microbiology, Computational Science
Keywords
time series, microbiota, clustering, marker gene, visualization
Copyright
© 2017 Hall et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Hall MW, Rohwer RR, Perrie J, McMahon KD, Beiko RG. 2017. Ananke: Temporal clustering reveals ecological dynamics of microbial communities. PeerJ Preprints 5:e2879v1

Abstract

Taxonomic markers such as the 16S ribosomal RNA gene are widely used in microbial community analysis. A common first step in marker-gene analysis is grouping genes into clusters to reduce data sets to a more manageable size and potentially mitigate the effects of sequencing error. Instead of clustering based on sequence identity, marker-gene data sets collected over time can be clustered based on temporal correlation to reveal ecologically meaningful associations. We present Ananke, a free and open-source algorithm and software package that clusters marker-gene data based on time-series profiles and provides interactive visualization of clusters. In simulations of multiple ecological dynamics, such as periodic seasonal cycles and organism appearances and disappearances, Ananke is able to separate the distinct temporal patterns into clusters. We apply our algorithm to two longitudinal marker-gene data sets: faecal communities from the human gut of an individual sampled over one year, and communities from a freshwater lake sampled over eleven years. Within the gut, the segregation of the bacterial community around a food-poisoning event was immediately clear. In the freshwater lake, we found that high sequence identity between marker genes does not guarantee similar temporal dynamics, and Ananke time-series clusters revealed patterns obscured by clustering based on sequence identity or taxonomy. Ananke is available at https://github.com/beiko-lab/ananke.
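
The general idea of clustering by temporal correlation rather than by sequence identity can be illustrated with a toy sketch. The code below is not Ananke's implementation, and the abstract does not specify the distance measure or clustering algorithm that Ananke uses. As an assumption for illustration only, it uses Pearson correlation between abundance profiles and groups any sequences linked by a correlation at or above a threshold. The sequence names, abundance values, and the `min_r` parameter are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no temporal signal
    return cov / sqrt(var_x * var_y)


def cluster_by_correlation(profiles, min_r=0.9):
    """Group sequences whose profiles are linked by correlation >= min_r.

    Illustrative single-linkage grouping via union-find; an assumption
    made for this sketch, not the method described in the paper.
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= min_r:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical abundance profiles over eight time points.
profiles = {
    "seq_A": [1, 5, 9, 5, 1, 5, 9, 5],       # periodic, seasonal-like
    "seq_B": [2, 10, 18, 10, 2, 10, 18, 10],  # same shape, higher abundance
    "seq_C": [8, 8, 8, 8, 0, 0, 0, 0],        # disappears halfway through
    "seq_D": [4, 4, 4, 4, 0, 0, 0, 0],        # disappears at the same time
}

clusters = cluster_by_correlation(profiles)
print(clusters)
```

In this toy data the two periodic profiles fall into one cluster and the two disappearing profiles into another, regardless of how similar the underlying sequences might be. Correlation ignores differences in absolute abundance, which is why `seq_A` and `seq_B` group together despite a twofold difference in counts.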

Author Comment

This is a submission to PeerJ for review.