Sorting things out - assessing effects of unequal specimen biomass on DNA metabarcoding

Aquatic Ecosystem Research, University of Duisburg-Essen, Essen, Germany
University of Duisburg-Essen, Centre for Water and Environmental Research (ZWU) Essen, Essen, Germany
DOI
10.7287/peerj.preprints.2561v2
Subject Areas
Biodiversity, Conservation Biology, Environmental Sciences, Molecular Biology, Zoology
Keywords
Biomass bias, specimen sorting, metabarcoding, next generation sequencing, ecosystem assessment, metagenomics, DNA barcoding
Copyright
© 2017 Elbrecht et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Elbrecht V, Peinert B, Leese F. 2017. Sorting things out - assessing effects of unequal specimen biomass on DNA metabarcoding. PeerJ Preprints 5:e2561v2

Abstract

1) Environmental bulk samples often contain many taxa with biomass differences of several orders of magnitude. This can be problematic in DNA metabarcoding and metagenomic high-throughput sequencing approaches, as large specimens contribute disproportionate amounts of DNA template. Thus, a few specimens of high biomass will dominate the dataset, potentially leaving smaller specimens undetected. Sorting samples and balancing the amount of tissue used per size fraction should improve detection rates, but this approach has not been systematically tested.

2) Here we tested the effects of size sorting on taxa detection using freshwater macroinvertebrates. Kick sampling was performed at two locations in a low-mountain stream in western Germany, and specimens were morphologically identified and sorted into small, medium and large size classes (< 2.5x5, 5x10 and up to 10x20 mm). Tissue from the three size categories was extracted individually and then pooled to simulate two kinds of samples: samples that were not sorted by biomass, and samples that were sorted and then pooled so that each specimen contributed approximately equal amounts of biomass. DNA from all five extractions of samples from both sites was amplified using four different DNA metabarcoding primer sets targeting the cytochrome c oxidase I (COI) gene. The library was sequenced on an Illumina HiSeq sequencer.

3) Sorting taxa by size and pooling them proportionately according to their abundance led to more even amplification than processing complete samples without sorting. The sorted samples recovered 30% more taxa than the unsorted samples at the same sequencing depth. Our results imply that sequencing depth can be decreased approximately five-fold when sorting the samples into three size classes.

4) Our results demonstrate that even coarse size sorting can substantially improve detection of taxa using DNA metabarcoding. While high-throughput sequencing will become more accessible and cheaper in the coming years, sorting bulk samples by specimen biomass is a simple yet efficient method to reduce current sequencing costs.
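
The pooling logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the specimen counts and mean masses below are hypothetical values chosen for illustration, and the sketch assumes that DNA template is proportional to the tissue mass that enters the pool. It compares the share of pooled tissue each size class receives when all tissue is pooled as collected (unsorted) and when every specimen contributes roughly equal tissue (sorted).

```python
# Hypothetical specimen counts and mean specimen masses per size class.
# These numbers are illustrative assumptions, not data from the study.
size_classes = {
    "S": {"specimens": 80, "mean_mass_mg": 1.0},
    "M": {"specimens": 15, "mean_mass_mg": 10.0},
    "L": {"specimens": 5, "mean_mass_mg": 100.0},
}


def template_shares(classes, balanced):
    """Return each size class's share of the pooled tissue.

    balanced=False: all tissue is pooled, so a class contributes
    specimens * mean mass (the unsorted case).
    balanced=True: each specimen contributes an equal amount, so a
    class's share follows its specimen count (the sorted case).
    """
    contributions = {}
    for name, c in classes.items():
        if balanced:
            contributions[name] = float(c["specimens"])
        else:
            contributions[name] = c["specimens"] * c["mean_mass_mg"]
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}


unsorted_shares = template_shares(size_classes, balanced=False)
sorted_shares = template_shares(size_classes, balanced=True)

for name in size_classes:
    print(f"{name}: unsorted {unsorted_shares[name]:.3f}, "
          f"sorted {sorted_shares[name]:.3f}")
```

With these assumed values, the small size class holds most of the specimens but only a minor share of the unsorted pool, whereas its share of the balanced pool matches its share of specimens. Real read proportions would additionally depend on primer bias and extraction efficiency, which this sketch ignores.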

Author Comment

The manuscript was rejected at MME: the study was found to be well done, but the results too obvious. Minor points in the manuscript and its figures were improved based on reviewer suggestions. Resubmission to "Ecology and Evolution" is planned.

Supplemental Information

Figure S1. Pictures of sorted specimens

Pictures of the specimens sorted into small, medium and large individuals. Also provides information on how S, M and L tissue was pooled to generate the proportionally sorted (So) and unsorted (Un) samples.

DOI: 10.7287/peerj.preprints.2561v2/supp-1

Figure S2. Flowchart detailing laboratory processing

Overview of the steps carried out for sample sorting and processing in the laboratory.

DOI: 10.7287/peerj.preprints.2561v2/supp-2

Figure S3. DNA extraction protocol

Shows the step where the digested buffers of S, M and L were pooled to generate unsorted (Un) and sorted (So) samples.

DOI: 10.7287/peerj.preprints.2561v2/supp-3

Figure S4. Sequencing depth and sequences discarded in bioinformatic processing

Barplot showing the number of total reads and proportion of sequences discarded in subsequent bioinformatic processing steps for all samples.

DOI: 10.7287/peerj.preprints.2561v2/supp-4

Figure S5. Flowchart detailing the bioinformatic pipeline

Figure giving an overview of the metabarcoding pipeline applied to this dataset.

DOI: 10.7287/peerj.preprints.2561v2/supp-5

Figure S6. Reproducibility between HiSeq lanes

Comparison of relative OTU abundances between both HiSeq lanes.

DOI: 10.7287/peerj.preprints.2561v2/supp-6

Figure S7. Plot of OTU table

Visualisation of taxa detected within the S, M, L, Un and So DNA extractions, with four different primer combinations. Data are also compared to morphological identifications and the number of specimens of each morphologically identified taxon.

DOI: 10.7287/peerj.preprints.2561v2/supp-7

Figure S8. Database completeness

Plot showing the percent match of each OTU to the reference database, taking read abundance into account.

DOI: 10.7287/peerj.preprints.2561v2/supp-8

Figure S9. Taxa identification with metabarcoding and morphology

Comparison of number of taxa identified with morphology and DNA metabarcoding on different taxonomic resolutions.

DOI: 10.7287/peerj.preprints.2561v2/supp-9

Figure S10. Taxa detection in sorted and unsorted samples

Comparison of the amount of diversity and taxa detected in sorted samples (So) and unsorted samples (Un).

DOI: 10.7287/peerj.preprints.2561v2/supp-10

Table S1. OTU table

Detailed OTU table giving the number of reads for each sample, including assigned taxonomy and OTU sequence.

DOI: 10.7287/peerj.preprints.2561v2/supp-11

Table S2. Morphologically identified taxa

Table giving an overview of morphologically identified taxa and abundance of specimens in S, M and L for both sample locations.

DOI: 10.7287/peerj.preprints.2561v2/supp-12

Dryad DOI: Metabarcoding pipeline scripts

DOI: 10.7287/peerj.preprints.2561v2/supp-13

Manuscript File

Please use this file for providing feedback (with track changes). Thank you.

DOI: 10.7287/peerj.preprints.2561v2/supp-14