Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass - sequence relationships with an innovative metabarcoding protocol

Department of Animal Ecology, Evolution and Biodiversity, Ruhr University Bochum, Bochum, Germany
DOI
10.7287/peerj.preprints.1023v1
Subject Areas
Biodiversity, Conservation Biology, Genetics, Molecular Biology, Zoology
Keywords
Next-generation sequencing, Biodiversity assessment, Community barcoding, Freshwater Ecology, Illumina sequencing, Water Framework Directive, MiSeq, Benthos, Ecosystem monitoring, Metabarcoding
Copyright
© 2015 Elbrecht et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Elbrecht V, Leese F. 2015. Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass - sequence relationships with an innovative metabarcoding protocol. PeerJ PrePrints 3:e1023v1

Abstract

Metabarcoding is an emerging genetic tool to rapidly assess biodiversity in ecosystems. It involves high-throughput sequencing of a standard gene from an environmental sample and comparison to a reference database. However, no consensus has emerged regarding laboratory pipelines to screen species diversity and infer species abundances from environmental samples. In particular, the effect of primer bias and the detection limit for low-biomass specimens have not been systematically examined when samples are processed in bulk. We developed and tested a DNA metabarcoding protocol that utilises the standard cytochrome c oxidase subunit I (COI) barcoding fragment to detect freshwater macroinvertebrate taxa. DNA was extracted in bulk, amplified in a single PCR step, and purified, and the libraries were directly sequenced in two independent MiSeq runs (300-bp paired-end reads). Specifically, we assessed the influence of specimen biomass on sequence read abundance by sequencing 31 specimens of a stonefly species with known haplotypes spanning three orders of magnitude in biomass (experiment I). Then, we tested the recovery of 52 different freshwater invertebrate taxa of similar biomass using the same standard barcoding primers (experiment II). Each experiment was replicated ten times to maximise statistical power. The results of both experiments were consistent across replicates. We found a distinct positive correlation between species biomass and the resulting number of MiSeq reads. Furthermore, we reliably recovered 83% of the 52 taxa used to test primer bias. However, sequence abundance varied by four orders of magnitude between taxa despite the use of similar amounts of biomass. Our metabarcoding approach yielded reliable results for high-throughput assessments. However, the results indicated that primer efficiency is highly species-specific, which would prevent straightforward assessments of species abundance and biomass in a sample. Thus, PCR-based metabarcoding assessments of biodiversity should rely on presence-absence metrics.
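
The two kinds of analysis named in the abstract, relating specimen biomass to read abundance and reducing read counts to presence-absence, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the `min_reads` detection threshold, and the numbers are hypothetical and made up purely for illustration; they are not data from the study.

```python
import math


def normalise(reads):
    """Convert raw read counts to proportions of the replicate total."""
    total = sum(reads)
    return [r / total for r in reads]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def presence_absence(reads, min_reads=1):
    """Mark a specimen or taxon as detected if it has at least min_reads reads.

    min_reads is an assumed, illustrative threshold, not a value from the paper.
    """
    return [r >= min_reads for r in reads]


# Made-up illustrative values, NOT data from the study.
biomass_mg = [0.5, 5.0, 50.0, 500.0]
reads = [120, 900, 11000, 95000]

proportions = normalise(reads)

# Biomass spans several orders of magnitude, so correlate on a log10 scale.
r = pearson(
    [math.log10(b) for b in biomass_mg],
    [math.log10(p) for p in proportions],
)

detected = presence_absence(reads, min_reads=10)
print(f"log-log Pearson r = {r:.3f}; detected = {detected}")
```

Working on a log scale is a design choice for this sketch, made because both biomass and read abundance span orders of magnitude; the presence-absence step discards abundance information entirely, which is the metric the abstract recommends for PCR-based assessments.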

Author Comment

This is a revised version of the manuscript which was submitted to PLOS ONE in January 2015. It is currently in the second round of peer review. A short YouTube video summarising the findings of this paper is available: https://www.youtube.com/watch?v=VifqvI5JeDM

Supplemental Information

S1 Table. Information on Dinocras cephalotes specimen weights (in milligrams) for experiment I

DOI: 10.7287/peerj.preprints.1023v1/supp-1

S2 Table. Information on specimen weights (in milligrams) for experiment II

DOI: 10.7287/peerj.preprints.1023v1/supp-2

S3 Table. MOTU assignment to individual specimens in experiment II

DOI: 10.7287/peerj.preprints.1023v1/supp-3

S1 Figure. Fusion COI Primers developed in this study

Fusion primers can be loaded directly onto the MiSeq system, and the universal primers can be modified or replaced.

DOI: 10.7287/peerj.preprints.1023v1/supp-4

S2 Figure. Increase of diversity by parallel sequencing

By sequencing forward and reverse primers together, sequence diversity and thus read quality are increased.

DOI: 10.7287/peerj.preprints.1023v1/supp-5

S3 Figure. Number of reads excluded in data processing steps

Includes flow charts of the bioinformatics processing of experiment I (A) and experiment II (B).

DOI: 10.7287/peerj.preprints.1023v1/supp-6

S4 Figure. Reads in each replicate after demultiplexing

Data from experiment I (A) and experiment II (B).

DOI: 10.7287/peerj.preprints.1023v1/supp-7

S5 Figure. Experiment I: sequences per specimen

Normalised sequence abundance for each stonefly.

DOI: 10.7287/peerj.preprints.1023v1/supp-8

S6 Figure. Experiment I: sequencing artefacts

Sequence matches are shown for three individual specimens, including h28 and h13, which are affected by sequencing artefacts.

DOI: 10.7287/peerj.preprints.1023v1/supp-9

S7 Figure. Experiment I: Variability in sequence abundance

Variability in sequence abundance among the ten replicates and its dependence on specimen biomass.

DOI: 10.7287/peerj.preprints.1023v1/supp-10

S8 Figure. Experiment I: Sequence abundance depended on specimen biomass

Mean normalised sequence abundance of all ten replicates, including standard errors.

DOI: 10.7287/peerj.preprints.1023v1/supp-11

S9 Figure. Experiment II: OTUs assigned to taxa

Detailed overview of all 213 OTUs and their taxonomic identification using the BOLD database.

DOI: 10.7287/peerj.preprints.1023v1/supp-12