Swarm: robust and fast clustering method for amplicon-based studies
1 Department of Ecology, Technische Universität Kaiserslautern, Kaiserslautern, Germany
2 CNRS, UMR 7144, EPEP -- Évolution des Protistes et des Écosystèmes Pélagiques, Station Biologique de Roscoff, Roscoff, France
3 Sorbonne Universités, UPMC Univ Paris 06, UMR 7144, Station Biologique de Roscoff, Roscoff, France
4 Department of Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
5 Department of Informatics, University of Oslo, Oslo, Norway
6 School of Engineering, University of Glasgow, Glasgow, United Kingdom
- Published
- Accepted
- Subject Areas: Biodiversity, Bioinformatics, Ecology, Microbiology, Molecular Biology
- Keywords: environmental diversity, barcoding, molecular operational taxonomic units
- Copyright: © 2014 Mahé et al.
- Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article: 2014. Swarm: robust and fast clustering method for amplicon-based studies. PeerJ PrePrints 2:e386v1 https://doi.org/10.7287/peerj.preprints.386v1
Abstract
Popular de novo amplicon clustering methods suffer from two fundamental flaws: arbitrary global clustering thresholds, and input-order dependency induced by centroid selection. Swarm was developed to address these issues by first clustering nearly identical amplicons iteratively using a local threshold, and then by using clusters' internal structure and amplicon abundances to refine its results. This fast, scalable, and input-order independent approach reduces the influence of clustering parameters and produces robust operational taxonomic units, increasing the amount of meaningful biological information that can be extracted from amplicon-based studies.
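
The first phase described above, iterative clustering with a local threshold, can be illustrated with a minimal Python sketch. It is not Swarm's implementation: the function names `edit_distance` and `local_threshold_clusters`, the parameter name `d`, the choice of Levenshtein distance, the abundance-ordered seeding, and the toy sequences are all assumptions made for illustration. The abundance-based refinement step is not shown, and the brute-force neighbour search is quadratic, so the sketch says nothing about speed.

```python
from collections import deque


def edit_distance(a, b):
    """Levenshtein distance: substitutions, insertions and deletions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def local_threshold_clusters(abundances, d=1):
    """Grow clusters by repeatedly linking amplicons at most d differences apart.

    `abundances` maps each unique sequence to its abundance (hypothetical input
    format). The threshold is local: it applies to each link in a chain, not to
    the distance between a member and the cluster seed.
    """
    # Assumed seeding rule: most abundant first, ties broken by sequence,
    # so the result does not depend on the order of the input.
    order = sorted(abundances, key=lambda s: (-abundances[s], s))
    unassigned = set(order)
    clusters = []
    for seed in order:
        if seed not in unassigned:
            continue
        unassigned.remove(seed)
        cluster = [seed]
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            # Brute-force neighbour search, sorted for a deterministic order.
            neighbours = sorted(s for s in unassigned
                                if edit_distance(current, s) <= d)
            for s in neighbours:
                unassigned.remove(s)
                cluster.append(s)
                queue.append(s)
        clusters.append(cluster)
    return clusters


# Toy data (made up): sequence -> abundance.
amplicons = {"ACGT": 100, "ACGA": 10, "ACAA": 3, "TTTT": 50, "TTTA": 2}
clusters = local_threshold_clusters(amplicons, d=1)
print(clusters)
```

In this toy input, `ACAA` is two differences away from the seed `ACGT` but joins the same cluster through the intermediate `ACGA`, which is what distinguishes a local threshold from a global one measured against a centroid. Shuffling the input leaves the clusters unchanged because seeds are chosen by abundance rather than by position in the input.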