Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass - sequence relationships with an innovative metabarcoding protocol
- Subject Areas
- Biodiversity, Conservation Biology, Genetics, Molecular Biology, Zoology
- Keywords
- Next-generation sequencing, Biodiversity assessment, Community barcoding, Freshwater Ecology, Illumina sequencing, Water Framework Directive, MiSeq, Benthos, Ecosystem monitoring, Metabarcoding
- © 2015 Elbrecht et al.
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article
- 2015. Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass - sequence relationships with an innovative metabarcoding protocol. PeerJ PrePrints 3:e1023v1 https://doi.org/10.7287/peerj.preprints.1023v1
Metabarcoding is an emerging genetic tool to rapidly assess biodiversity in ecosystems. It involves high-throughput sequencing of a standard gene from an environmental sample and comparison to a reference database. However, no consensus has emerged regarding laboratory pipelines to screen species diversity and infer species abundances from environmental samples. In particular, the effect of primer bias and the detection limit for low-biomass specimens have not been systematically examined for samples processed in bulk. We developed and tested a DNA metabarcoding protocol that utilises the standard cytochrome c oxidase subunit I (COI) barcoding fragment to detect freshwater macroinvertebrate taxa. DNA was extracted in bulk, amplified in a single PCR step, and purified, and the libraries were directly sequenced in two independent MiSeq runs (300-bp paired-end reads). Specifically, we assessed the influence of specimen biomass on sequence read abundance by sequencing 31 specimens of a stonefly species with known haplotypes spanning three orders of magnitude in biomass (experiment I). Then, we tested the recovery of 52 different freshwater invertebrate taxa of similar biomass using the same standard barcoding primers (experiment II). Each experiment was replicated ten times to maximise statistical power. The results of both experiments were consistent across replicates. We found a distinct positive correlation between specimen biomass and the resulting number of MiSeq reads. Furthermore, we reliably recovered 83% of the 52 taxa used to test primer bias. However, sequence abundance varied by four orders of magnitude between taxa despite the use of similar amounts of biomass. Our metabarcoding approach yielded reliable results for high-throughput assessments. However, the results indicated that primer efficiency is highly species-specific, which would prevent straightforward assessments of species abundance and biomass in a sample. Thus, PCR-based metabarcoding assessments of biodiversity should rely on presence-absence metrics.
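The biomass-to-read relationship tested in experiment I can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names, the choice of a Pearson correlation on log10-transformed values, and the toy numbers are all assumptions made for illustration only. Read counts are first normalised to the replicate total, then correlated with specimen biomass on a log-log scale.

```python
import math


def normalise_reads(read_counts):
    """Convert raw read counts to proportions of the replicate total."""
    total = sum(read_counts)
    if total == 0:
        raise ValueError("replicate contains no reads")
    return [count / total for count in read_counts]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_log_correlation(biomass_mg, read_counts):
    """Correlate log10 biomass with log10 normalised read abundance.

    Specimens with zero biomass or zero reads are skipped because
    their logarithm is undefined (an assumption of this sketch).
    """
    shares = normalise_reads(read_counts)
    pairs = [
        (math.log10(b), math.log10(s))
        for b, s in zip(biomass_mg, shares)
        if b > 0 and s > 0
    ]
    log_biomass = [p[0] for p in pairs]
    log_share = [p[1] for p in pairs]
    return pearson_r(log_biomass, log_share)


# Hypothetical toy values spanning three orders of magnitude in biomass;
# these are NOT data from the study.
toy_biomass_mg = [1.0, 10.0, 100.0, 1000.0]
toy_reads = [20, 200, 2000, 20000]

r = log_log_correlation(toy_biomass_mg, toy_reads)
```

In the toy input the reads are exactly proportional to biomass, so the coefficient is at its maximum. In experiment II, where taxa of similar biomass differed in read abundance by orders of magnitude, such a relationship would not hold across species, which is why the abstract recommends presence-absence metrics.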
This is a revised version of the manuscript which was submitted to PLOS ONE in January 2015. It is currently in the second round of peer review. A short YouTube video summarising the findings of this paper is available: https://www.youtube.com/watch?v=VifqvI5JeDM
S1 Table. Information on Dinocras cephalotes specimen weights (in milligrams) for experiment I
S2 Table. Information on specimen weights (in milligrams) for experiment II
S3 Table. MOTU assignment to individual specimens in experiment II
S1 Figure. Fusion COI Primers developed in this study
Fusion primers can be loaded directly onto the MiSeq system, and the universal primers can be modified or replaced.
S2 Figure. Increase of diversity by parallel sequencing
By sequencing forward and reverse primers together, sequence diversity and thus read quality are increased.
S3 Figure. Number of reads excluded in data processing steps
Includes flow charts of the bioinformatics processing of experiment I (A) and experiment II (B).
S4 Figure. Reads in each replicate after demultiplexing
Data from experiment I (A) and experiment II (B).
S5 Figure. Experiment I: sequences per specimen
Normalised sequence abundance for each stonefly.
S6 Figure. Experiment I: sequencing artefacts
Sequence matches are shown for three individual specimens, including h28 and h13, which are affected by sequencing artefacts.
S7 Figure. Experiment I: Variability in sequence abundance
Variability in sequence abundance between the ten replicates as well as dependence on specimen biomass.
S8 Figure. Experiment I: Sequence abundance depended on specimen biomass
Mean normalised sequence abundance of all ten replicates, including standard errors.
S9 Figure. Experiment II: OTUs assigned to taxa
Detailed overview of all 213 OTUs and their taxonomic identification using the BOLD database.