Sorting things out - assessing effects of unequal specimen biomass on DNA metabarcoding

Vasco Elbrecht; Bianca Peinert; Florian Leese

doi:10.7287/peerj.preprints.2561v1

Vasco Elbrecht¹, Bianca Peinert¹, Florian Leese¹,²

¹ Aquatic Ecosystem Research, University of Duisburg-Essen, Essen, Germany

² University of Duisburg-Essen, Centre for Water and Environmental Research (ZWU) Essen, Essen, Germany

Published: 2016-10-28
Accepted: 2016-10-28

Subject Areas: Biodiversity, Conservation Biology, Environmental Sciences, Molecular Biology, Zoology
Keywords: Biomass bias, specimen sorting, metabarcoding, next generation sequencing, ecosystem assessment, metagenomics, DNA barcoding

Copyright: © 2016 Elbrecht et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Elbrecht V, Peinert B, Leese F. 2016. Sorting things out - assessing effects of unequal specimen biomass on DNA metabarcoding. PeerJ Preprints 4:e2561v1 https://doi.org/10.7287/peerj.preprints.2561v1

Abstract

1) Environmental bulk samples often contain many taxa with biomass differences of several orders of magnitude. This can be problematic in DNA metabarcoding and metagenomic high-throughput sequencing approaches, as large specimens contribute a disproportionately large amount of DNA template. Thus a few specimens of high biomass will dominate the dataset, potentially leaving smaller specimens undetected. Sorting samples by size and balancing the amount of tissue used per size fraction should improve detection rates, but this has not been systematically tested.

2) Here we tested the effects of size sorting on taxa detection using freshwater macroinvertebrates. Kick sampling was performed at two locations in a low-mountain stream in West Germany. Specimens were morphologically identified and sorted into small, medium and large size classes (< 2.5x5, 5x10 and up to 10x20 mm). Tissue from the three size categories was extracted individually and then pooled to simulate bulk samples that were not sorted and samples that were sorted and then pooled proportionately by specimen size. DNA from all five extractions of both samples was amplified using four different freshwater primer sets for the COI gene and sequenced on an Illumina HiSeq sequencer.

3) Sorting taxa by size and pooling them proportionately according to their abundance led to more even amplification than processing complete samples without sorting. The sorted samples recovered 30% more taxa than the unsorted samples at the same sequencing depth. Our results imply that sequencing depth can be decreased approximately five-fold when sorting the samples into three size classes.

4) Our results demonstrate that even coarse size sorting can substantially improve detection rates. While high-throughput sequencing will become cheaper and more accessible in the coming years, sorting bulk samples by specimen biomass is a simple yet efficient method to reduce current sequencing costs.
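The biomass bias described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that DNA template scales linearly with biomass, and the specimen counts and mean biomass values are hypothetical, chosen only to span two orders of magnitude. The function names `unsorted_shares` and `sorted_shares` are likewise illustrative.

```python
# Hypothetical specimen counts and mean biomass per specimen (mg).
# These are NOT data from the study; they only illustrate the principle.
size_classes = {
    "S": {"specimens": 300, "mean_biomass_mg": 0.5},
    "M": {"specimens": 60, "mean_biomass_mg": 5.0},
    "L": {"specimens": 10, "mean_biomass_mg": 50.0},
}


def unsorted_shares(classes):
    """Share of DNA template per size class when the whole sample is
    processed together, assuming template is proportional to biomass."""
    biomass = {
        name: c["specimens"] * c["mean_biomass_mg"] for name, c in classes.items()
    }
    total = sum(biomass.values())
    return {name: b / total for name, b in biomass.items()}


def sorted_shares(classes):
    """Share of DNA template per size class when the separately extracted
    fractions are pooled in proportion to specimen abundance."""
    total = sum(c["specimens"] for c in classes.values())
    return {name: c["specimens"] / total for name, c in classes.items()}


if __name__ == "__main__":
    un = unsorted_shares(size_classes)
    so = sorted_shares(size_classes)
    for name in size_classes:
        print(f"{name}: unsorted {un[name]:.3f}, sorted {so[name]:.3f}")
```

Under these assumptions the few large specimens supply most of the template in the unsorted pool, whereas abundance-proportional pooling gives every specimen the same expected share regardless of its size.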

Author Comment

Initial draft of our "size sorting" manuscript, soon to be submitted to Methods in Ecology and Evolution. We would like some additional feedback on this draft! Feel free to download the Word document of this manuscript (see Supplemental Information) and make changes and comments with "track changes". We appreciate any comments and criticism! Thank you.

Supplemental Information

Figure S1. Pictures of sorted specimens

Pictures of the specimens sorted into small, medium and large individuals. The figure also shows how S, M and L tissue was pooled to generate the proportionally sorted (So) and unsorted (Un) samples.

DOI: 10.7287/peerj.preprints.2561v1/supp-1

Figure S2. Flowchart detailing laboratory processing

Overview of the steps carried out for sample sorting and processing in the laboratory.

DOI: 10.7287/peerj.preprints.2561v1/supp-2

Figure S3. DNA extraction protocol

Shows the step where the digested buffers of S, M and L were pooled to generate unsorted (Un) and sorted (So) samples.

DOI: 10.7287/peerj.preprints.2561v1/supp-3

Figure S4. Sequencing depth and sequences discarded in bioinformatic processing

Barplot showing the number of total reads and proportion of sequences discarded in subsequent bioinformatic processing steps for all samples.

DOI: 10.7287/peerj.preprints.2561v1/supp-4

Figure S5. Flowchart detailing the bioinformatic pipeline

Figure giving an overview of the metabarcoding pipeline applied to this dataset.

DOI: 10.7287/peerj.preprints.2561v1/supp-5

Figure S6. Reproducibility between HiSeq lanes

Comparison of relative OTU abundances between both HiSeq lanes.

DOI: 10.7287/peerj.preprints.2561v1/supp-6

Figure S7. Plot of OTU table

Visualisation of taxa detected within S, M, L, Un, So DNA extractions, with 4 different primer combinations. Data is also compared to morphological identifications and number of specimens of each morphologically identified taxon.

DOI: 10.7287/peerj.preprints.2561v1/supp-7

Figure S8. Database completeness

Plot showing the percent match of each OTU to the reference database, taking read abundance into account.

DOI: 10.7287/peerj.preprints.2561v1/supp-8

Figure S9. Taxa identification with metabarcoding and morphology

Comparison of the number of taxa identified with morphology and DNA metabarcoding at different taxonomic resolutions.

DOI: 10.7287/peerj.preprints.2561v1/supp-9

Figure S10. Taxa detection in sorted and unsorted samples

Comparison of the amount of diversity and taxa detected in sorted samples (So) and unsorted samples (Un).

DOI: 10.7287/peerj.preprints.2561v1/supp-10

Table S1. OTU table

Detailed OTU table giving the number of reads for each sample, including assigned taxonomy and OTU sequence.

DOI: 10.7287/peerj.preprints.2561v1/supp-11

Table S2. Morphologically identified taxa

Table giving an overview of morphologically identified taxa and abundance of specimens in S, M and L for both sample locations.

DOI: 10.7287/peerj.preprints.2561v1/supp-12

Dryad DOI: Metabarcoding pipeline scripts

DOI: 10.7287/peerj.preprints.2561v1/supp-13

Manuscript File

Please use this file to provide feedback (with track changes). Thank you.

DOI: 10.7287/peerj.preprints.2561v1/supp-14

Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Vasco Elbrecht conceived and designed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Bianca Peinert conceived and designed the experiments, performed the experiments, reviewed drafts of the paper, identified specimens.

Florian Leese conceived and designed the experiments, reviewed drafts of the paper.

DNA Deposition

The following information was supplied regarding the deposition of DNA sequences:

Sequence data is available in the NCBI Sequence Read Archive (SRA): SRR3399056 and SRR3399057

Data Deposition

The following information was supplied regarding data availability:

The raw data has been supplied as a supplementary file and additional sequence data is available on the NCBI SRA archive.

Funding
FL and VE are supported by a grant from the Kurt Eberhard Bode foundation to FL (no grant number available). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.