De novo species delimitation in metabarcoding datasets using ecology and phylogeny
Abstract
Background: Metabarcoding allows a wide variety of taxa to be analysed simultaneously in a fraction of the time taken by morphological identification, but it currently relies on sequence similarity-based methods to delimit operational taxonomic units (OTUs). Similarity-based OTU clustering can lead to inaccurate estimates of diversity, species’ distributions and responses to change, so there is a critical need for methods that delimit species in metabarcoding datasets.
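To illustrate why a static similarity threshold can be problematic, the following minimal sketch implements a toy greedy centroid clustering on pre-aligned, equal-length reads. It is not UCLUST or any published tool; the function names, the toy reads and the thresholds are all assumptions made for illustration. The same four reads collapse into a different number of OTUs depending only on the threshold chosen.

```python
def identity(a, b):
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs, threshold):
    """Toy greedy clustering (hypothetical helper, not a real tool's algorithm).

    Each read joins the first existing centroid it matches at >= threshold;
    otherwise it becomes the centroid of a new cluster.
    """
    centroids = []
    clusters = []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:
            # No centroid was similar enough: start a new OTU.
            centroids.append(s)
            clusters.append([s])
    return clusters


# Invented toy reads, 10 bp each, differing by 0, 1, 2 or 10 positions from the first.
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT", "CCCCCCCCCC"]

otus_80 = greedy_cluster(reads, 0.80)
otus_90 = greedy_cluster(reads, 0.90)
otus_99 = greedy_cluster(reads, 0.99)
print(len(otus_80), len(otus_90), len(otus_99))
```

If the first three reads belonged to a single species with high intraspecific divergence, only the loosest threshold would recover it as one OTU; a taxon with low divergence could need the strictest one.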
Methods: We introduce SNAPhy (Species delimitation using Niche And PHYlogeny), a novel approach which utilises ecological and phylogenetic information to delimit de novo OTUs in metabarcoding datasets and avoids the problems associated with current OTU clustering methods. Sequencing reads are first divided into ecological groups based on co-occurrence, thereby reducing data complexity and facilitating the use of evolutionary and phylogenetic models (e.g. BEAST and GMYC) to delimit species-level groupings within discrete ecologically informed phylogenies. The utility of SNAPhy is demonstrated using an 18S rDNA nuclear small subunit (nSSU) dataset representing replicated samples taken along the entire length of an estuarine salinity gradient, and SNAPhy is then compared to existing OTU clustering methods.
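The first step described above, dividing reads into ecological groups by co-occurrence, can be sketched as follows. This is only an illustrative assumption about how such grouping might work, not the SNAPhy implementation: the abstract does not specify the co-occurrence measure or grouping rule, so the Jaccard index, the single-linkage merging, the `min_jaccard` cut-off, and the read and sample names are all hypothetical. The later phylogenetic steps (e.g. BEAST and GMYC) are not reproduced here.

```python
def jaccard(a, b):
    """Jaccard index of two sets of sample IDs (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurrence_groups(occurrence, min_jaccard):
    """Group reads whose sample occupancy overlaps (hypothetical rule).

    occurrence: dict mapping read ID -> set of sample IDs where it was found.
    Reads are merged by single linkage whenever their Jaccard index is
    >= min_jaccard. Returns a sorted list of sorted groups.
    """
    ids = sorted(occurrence)
    parent = {i: i for i in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if jaccard(occurrence[a], occurrence[b]) >= min_jaccard:
                parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Invented example: S1-S3 stand for one end of a gradient, S8-S9 for the other.
occurrence = {
    "read_a": {"S1", "S2"},
    "read_b": {"S1", "S2", "S3"},
    "read_c": {"S8", "S9"},
    "read_d": {"S9"},
}

groups = cooccurrence_groups(occurrence, min_jaccard=0.5)
print(groups)
```

Each resulting group is much smaller than the full dataset, which is what would make computationally demanding tree-based delimitation tractable within a group.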
Results: Each of the OTU clustering methods compared yielded a different number of OTUs and a different taxonomic distribution of OTUs, which we suggest reflects known differences among taxa in the degree of intraspecific divergence. SNAPhy and UCLUST (with a 98% similarity threshold) gave the most plausible numbers of OTUs, especially within the Nematoda. Additionally, the degree of variation within nematode OTUs delimited by SNAPhy lies within the range of variation observed in deeply metabarcoded individuals.
Discussion: SNAPhy avoids the static clustering threshold problems associated with current OTU clustering methods and instead focuses on genuine biological diversity delimited according to a general lineage species concept. We suggest that the SNAPhy approach should play a crucial role in future sequencing-based biodiversity assessment by providing more accurate estimates of species diversity and distributions than current methods, thereby enabling more accurate impact assessments and better informing managerial decisions.
Cite this as
2017. De novo species delimitation in metabarcoding datasets using ecology and phylogeny. PeerJ Preprints 5:e3121v1 https://doi.org/10.7287/peerj.preprints.3121v1
Author comment
This version of the manuscript was previously submitted to PeerJ for review, and is currently undergoing major revisions.
Additional Information
Competing Interests
The authors declare that they have no competing interests.
Author Contributions
Caitlin Potter performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Cuong Q Tang performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Vera Fonseca performed the experiments, reviewed drafts of the paper.
Delphine Lallias performed the experiments, reviewed drafts of the paper.
John M Gaspar performed the experiments, reviewed drafts of the paper.
Kelley Thomas performed the experiments, reviewed drafts of the paper.
Simon Creer conceived and designed the experiments, wrote the paper, reviewed drafts of the paper.
Funding
This work was funded by an HPC Wales/Fujitsu PhD studentship to CP/SC; a NERC Post-Genomics and Proteomics Grant (Ref NE/F001266/1), New Investigator Grant NE/E001505/1472 and Molecular Genetics Facility Grant (MGF-167) to SC; and a Portuguese Foundation for Science and Technology (FCT) Grant (SFRH/BD/27413/2006) to V.G.F. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.