Swarm: robust and fast clustering method for amplicon-based studies
- Subject Areas: Biodiversity, Bioinformatics, Ecology, Microbiology, Molecular Biology
- Keywords: environmental diversity, barcoding, molecular operational taxonomic units
- © 2014 Mahé et al.
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article: 2014. Swarm: robust and fast clustering method for amplicon-based studies. PeerJ PrePrints 2:e386v1 https://doi.org/10.7287/peerj.preprints.386v1
Popular de novo amplicon clustering methods suffer from two fundamental flaws: arbitrary global clustering thresholds and input-order dependency induced by centroid selection. Swarm was developed to address these issues: it first clusters nearly identical amplicons iteratively using a local threshold, then uses the clusters' internal structure and amplicon abundances to refine the results. This fast, scalable, and input-order independent approach reduces the influence of clustering parameters and produces robust operational taxonomic units, increasing the amount of meaningful biological information that can be extracted from amplicon-based studies.
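
The first phase described above (iterative clustering of nearly identical amplicons with a local threshold) can be illustrated with a minimal Python sketch. It is not Swarm's implementation: it assumes a local threshold of one difference (substitution, insertion, or deletion), uses a naive quadratic neighbour search, and omits the refinement step based on internal structure and abundances. The names `within_one_edit` and `cluster` are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from collections import deque


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one substitution, insertion, or deletion."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) sequence
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        # At most one substitution: everything after the first mismatch must match.
        return a[i + 1:] == b[i + 1:]
    # One extra character in b: skip it and the remainders must match.
    return a[i:] == b[i + 1:]


def cluster(abundances: dict[str, int]) -> list[list[str]]:
    """Group amplicons by chaining neighbours that are within one difference.

    Assumption for this sketch: seeds are visited by decreasing abundance,
    with ties broken by sequence, so the output does not depend on input order.
    """
    def rank(s: str) -> tuple[int, str]:
        return (-abundances[s], s)

    order = sorted(abundances, key=rank)
    unassigned = set(order)
    clusters = []
    for seed in order:
        if seed not in unassigned:
            continue  # already absorbed by an earlier cluster
        unassigned.remove(seed)
        members = [seed]
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            # The threshold is applied locally, to each member in turn,
            # so a cluster can grow beyond one difference from its seed.
            near = sorted(
                (s for s in unassigned if within_one_edit(current, s)), key=rank
            )
            for s in near:
                unassigned.remove(s)
                members.append(s)
                queue.append(s)
        clusters.append(members)
    return clusters


# Made-up example data: ACCA is two differences from ACGT but joins via ACGA.
example = {"ACGT": 10, "ACGA": 3, "ACCA": 1, "TTTT": 5, "TTTA": 1}
print(cluster(example))
# [['ACGT', 'ACGA', 'ACCA'], ['TTTT', 'TTTA']]
```

Because each cluster is grown from all of its members rather than from a single centroid, the grouping is the same whatever order the amplicons are supplied in. The pairwise search in this sketch scales quadratically and is only suitable for small inputs.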