Swarm: robust and fast clustering method for amplicon-based studies
A peer-reviewed article of this Preprint also exists.
Abstract
Popular de novo amplicon clustering methods suffer from two fundamental flaws: arbitrary global clustering thresholds, and input-order dependency induced by centroid selection. Swarm was developed to address these issues by first clustering nearly identical amplicons iteratively using a local threshold, and then by using clusters' internal structure and amplicon abundances to refine its results. This fast, scalable, and input-order independent approach reduces the influence of clustering parameters and produces robust operational taxonomic units, increasing the amount of meaningful biological information that can be extracted from amplicon-based studies.
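The first step described in the abstract, iterative clustering with a local threshold, can be illustrated with a minimal Python sketch. This is not the Swarm implementation: the function names `edit_distance` and `cluster_iteratively`, the parameter `d`, the use of Levenshtein distance, and the abundance-sorted seeding are all assumptions made for illustration, and the abundance-based refinement step is left out. The brute-force neighbour search is quadratic and only suitable for toy inputs.

```python
from collections import deque


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def cluster_iteratively(abundances, d=1):
    """Group amplicons by repeatedly linking sequences within d differences.

    `abundances` (hypothetical input format) maps each unique sequence
    to its read count.
    """
    # Sort by decreasing abundance, then by sequence, so that the result
    # does not depend on the order in which the input was supplied.
    pool = sorted(abundances, key=lambda s: (-abundances[s], s))
    unassigned = set(pool)
    clusters = []
    for seed in pool:
        if seed not in unassigned:
            continue  # already absorbed into an earlier cluster
        unassigned.remove(seed)
        cluster = [seed]
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            # The threshold is local: it is applied between the current
            # member and its neighbours, not between the seed and all members.
            neighbours = sorted(s for s in unassigned
                                if edit_distance(current, s) <= d)
            for s in neighbours:
                unassigned.remove(s)
                cluster.append(s)
                queue.append(s)
        clusters.append(cluster)
    return clusters


example = {"ACGT": 10, "ACGA": 3, "ACCA": 1, "TTTT": 5, "TTTA": 2}
print(cluster_iteratively(example, d=1))
```

In this toy input, `ACCA` differs from the seed `ACGT` by two positions, yet it joins the same cluster because it is one difference away from `ACGA`. That chaining is what distinguishes a local threshold from a global one applied around a centroid.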
Cite this as
2014. Swarm: robust and fast clustering method for amplicon-based studies. PeerJ PrePrints 2:e386v1 https://doi.org/10.7287/peerj.preprints.386v1
Supplemental Information
Supplementary File 1 (code and commands used to perform the analyses)
Additional Information
Competing Interests
The authors declare there are no competing interests.
Author Contributions
Frédéric Mahé conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Torbjørn Rognes conceived and designed the experiments, performed the experiments, reviewed drafts of the paper.
Christopher Quince analyzed the data, contributed reagents/materials/analysis tools, reviewed drafts of the paper.
Colomban de Vargas wrote the paper, reviewed drafts of the paper.
Micah Dunthorn conceived and designed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Grant Disclosures
The following grant information was disclosed by the authors:
EU EraNet BiodivErsA program BioMarKs (grant # 2008-6530)
French government "Investissements d’Avenir" project OCEANOMICS (grant ANR-11-BTBR-0008)
Deutsche Forschungsgemeinschaft (grant # DU1319/1-1)
EPSRC Career Acceleration Fellowship (grant # EP/H003851/1)
Funding
F.M. and C.deV. were supported by the EU EraNet BiodivErsA program BioMarKs (grant # 2008-6530), the French government "Investissements d’Avenir" project OCEANOMICS (grant ANR-11-BTBR-0008), and the EU FP7 program MicroB3 (contract number 287589). F.M. and M.D. were supported by the Deutsche Forschungsgemeinschaft (grant # DU1319/1-1). T.R. was supported by a Centre of Excellence grant from the Research Council of Norway to CMBN. C.Q. is funded by an EPSRC Career Acceleration Fellowship (grant # EP/H003851/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.