Sorting things out - assessing effects of unequal specimen biomass on DNA metabarcoding

Aquatic Ecosystem Research, University of Duisburg-Essen, Essen, Germany
University of Duisburg-Essen, Centre for Water and Environmental Research (ZWU) Essen, Essen, Germany
DOI
10.7287/peerj.preprints.2561v1
Subject Areas
Biodiversity, Conservation Biology, Environmental Sciences, Molecular Biology, Zoology
Keywords
Biomass bias, specimen sorting, metabarcoding, next generation sequencing, ecosystem assessment, metagenomics, DNA barcoding
Copyright
© 2016 Elbrecht et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Elbrecht V, Peinert B, Leese F. 2016. Sorting things out - assessing effects of unequal specimen biomass on DNA metabarcoding. PeerJ Preprints 4:e2561v1

Abstract

1) Environmental bulk samples often contain many taxa with biomass differences of several orders of magnitude. This can be problematic in DNA metabarcoding and metagenomic high-throughput sequencing approaches, as large specimens contribute a disproportionately large amount of DNA template. A few specimens of high biomass will therefore dominate the dataset, and smaller specimens may remain undetected. Sorting samples and balancing the amount of tissue used per size fraction should improve detection rates, but this has not been systematically tested.

2) Here we tested the effects of size sorting on taxa detection using freshwater macroinvertebrates. Kick sampling was performed at two locations in a low-mountain stream in western Germany; specimens were morphologically identified and sorted into small, medium and large size classes (< 2.5x5, 5x10 and up to 10x20 mm). Tissue from the three size categories was extracted separately and then pooled to simulate unsorted bulk samples and samples that were sorted and then pooled proportionately by specimen size. DNA from all five extractions of both samples was amplified using four different freshwater primer sets for the COI gene and sequenced on an Illumina HiSeq sequencer.

3) Sorting taxa by size and pooling them proportionately according to their abundance led to more even amplification than processing complete samples without sorting. The sorted samples recovered 30% more taxa than the unsorted samples at the same sequencing depth. Our results imply that sequencing depth can be reduced approximately five-fold when samples are sorted into three size classes.

4) Our results demonstrate that even coarse size sorting can substantially improve detection rates. Although high-throughput sequencing will become cheaper and more accessible in the coming years, sorting bulk samples by specimen biomass is a simple yet efficient way to reduce current sequencing costs.
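
The biomass bias described in point 1 can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that reads are shared strictly in proportion to the tissue each specimen contributes (ignoring primer bias and other effects) and that reads are drawn independently. All specimen names, biomass values, the pooling scheme and the read depth are hypothetical.

```python
def read_shares(tissue_mg):
    """Expected share of reads per specimen, assuming reads scale with tissue input."""
    total = sum(tissue_mg.values())
    return {name: mass / total for name, mass in tissue_mg.items()}


def detection_probability(share, depth):
    """Chance of at least one read for a specimen, assuming independent read draws."""
    return 1.0 - (1.0 - share) ** depth


# Hypothetical bulk sample: one large specimen (1000 mg) and ten small ones (1 mg each).
unsorted_pool = {"large_1": 1000.0}
unsorted_pool.update({f"small_{i}": 1.0 for i in range(1, 11)})

# Hypothetical balanced pool: every specimen contributes the same amount of tissue.
balanced_pool = {name: 1.0 for name in unsorted_pool}

depth = 1000  # hypothetical number of reads

for label, pool in (("unsorted", unsorted_pool), ("balanced", balanced_pool)):
    share = read_shares(pool)["small_1"]
    print(label, share, detection_probability(share, depth))
```

Under these assumptions, a single small specimen receives a far smaller share of reads in the unsorted pool than in the balanced pool, so it needs a much greater sequencing depth to be detected with the same probability.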

Author Comment

Initial version of our "size sorting" manuscript, soon to be submitted to Methods in Ecology and Evolution. We would like some additional feedback on this draft! Feel free to download the Word document of this manuscript (see Supplemental Information) and make changes and comments with "track changes". We appreciate any comments and criticism! Thank you.

Supplemental Information

Figure S1. Pictures of sorted specimens

Pictures of the specimens sorted into small, medium and large individuals. Also provides information on how S, M and L tissue was pooled to generate the proportionally sorted (So) and unsorted (Un) samples.

DOI: 10.7287/peerj.preprints.2561v1/supp-1

Figure S2. Flowchart detailing laboratory processing

Overview of the steps carried out for sample sorting and processing in the laboratory.

DOI: 10.7287/peerj.preprints.2561v1/supp-2

Figure S3. DNA extraction protocol

Shows the step where the digested buffers of S, M and L were pooled to generate unsorted (Un) and sorted (So) samples.

DOI: 10.7287/peerj.preprints.2561v1/supp-3

Figure S4. Sequencing depth and sequences discarded in bioinformatic processing

Barplot showing the number of total reads and proportion of sequences discarded in subsequent bioinformatic processing steps for all samples.

DOI: 10.7287/peerj.preprints.2561v1/supp-4

Figure S5. Flowchart detailing the bioinformatic pipeline

Figure giving an overview of the metabarcoding pipeline applied to this dataset.

DOI: 10.7287/peerj.preprints.2561v1/supp-5

Figure S6. Reproducibility between HiSeq lanes

Comparison of relative OTU abundances between both HiSeq lanes.

DOI: 10.7287/peerj.preprints.2561v1/supp-6

Figure S7. Plot of OTU table

Visualisation of taxa detected within the S, M, L, Un and So DNA extractions, with 4 different primer combinations. Data are also compared to morphological identifications and the number of specimens of each morphologically identified taxon.

DOI: 10.7287/peerj.preprints.2561v1/supp-7

Figure S8. Database completeness

Plot showing the percent match of each OTU to the reference database, taking read abundance into account.

DOI: 10.7287/peerj.preprints.2561v1/supp-8

Figure S9. Taxa identification with metabarcoding and morphology

Comparison of the number of taxa identified with morphology and DNA metabarcoding at different taxonomic resolutions.

DOI: 10.7287/peerj.preprints.2561v1/supp-9

Figure S10. Taxa detection in sorted and unsorted samples

Comparison of the amount of diversity and taxa detected in sorted samples (So) and unsorted samples (Un).

DOI: 10.7287/peerj.preprints.2561v1/supp-10

Table S1. OTU table

Detailed OTU table giving the number of reads for each sample, including assigned taxonomy and OTU sequence.

DOI: 10.7287/peerj.preprints.2561v1/supp-11

Table S2. Morphologically identified taxa

Table giving an overview of morphologically identified taxa and abundance of specimens in S, M and L for both sample locations.

DOI: 10.7287/peerj.preprints.2561v1/supp-12

Dryad DOI: Metabarcoding pipeline scripts

DOI: 10.7287/peerj.preprints.2561v1/supp-13

Manuscript File

Please use this file to provide feedback (with track changes). Thank you.

DOI: 10.7287/peerj.preprints.2561v1/supp-14