Validation of COI metabarcoding primers for terrestrial arthropods

Centre for Biodiversity Genomics, University of Guelph, Guelph, ON, Canada
Department of Integrative Biology, University of Guelph, Guelph, ON, Canada
DOI
10.7287/peerj.preprints.27801v2
Subject Areas
Ecology, Ecosystem Science, Entomology, Molecular Biology, Forestry
Keywords
DNA metabarcoding, primer bias, degeneracy, insects, biodiversity
Copyright
© 2019 Elbrecht et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Elbrecht V, Braukmann TW, Ivanova NV, Prosser SW, Hajibabaei M, Wright M, Zakharov EV, Hebert PD, Steinke D. 2019. Validation of COI metabarcoding primers for terrestrial arthropods. PeerJ Preprints 7:e27801v2

Abstract

Metabarcoding can rapidly determine the species composition of bulk samples and thus aids ecosystem assessment. However, it is essential to use primer sets that minimize amplification bias among taxa to maximize species recovery. Despite this, the performance of primer sets employed for metabarcoding terrestrial arthropods has not been sufficiently evaluated. This study therefore tests the performance of 36 primer sets on a mock community containing 374 species. Amplification success was assessed with gradient PCRs, and the 21 most promising primer sets were selected for metabarcoding. These 21 primer sets were also tested by metabarcoding a Malaise trap sample. We identified eight primer sets, mainly those including inosine and/or high degeneracy, that recovered more than 95% of the species in the mock community. Results from the Malaise trap sample were congruent with the mock community, but primer sets generating short amplicons produced potential false positives. Taxon recovery from the 21 amplicon pools of the mock community and Malaise trap sample was used to select four primer sets for metabarcoding evaluation at different annealing temperatures (40-60 °C) using the mock community. Temperature had only a minor effect on taxon recovery, and the effect varied with the specific primer pair. This study reveals the weak performance of some primer sets employed in past studies. It also demonstrates that certain primer sets can recover most taxa in a diverse species assemblage. Thus there is no need to employ several primer sets targeting the same amplicon. While we identified several suitable primer sets for arthropod metabarcoding, primer selection depends on the targeted taxonomic groups, as well as DNA quality, desired taxonomic resolution, and the sequencing platform employed for analysis.
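The two quantities the abstract relies on, primer degeneracy and species recovery from a mock community, can be illustrated with a minimal Python sketch. It is not taken from the study's R scripts (Scripts S1). The function names, the toy primer sequence, and the toy species labels are hypothetical, and counting inosine as a single base (so that it does not multiply degeneracy) is an assumption of this sketch.

```python
# Number of standard bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 1,  # assumption: inosine is one synthesized base, so it adds no variants
}


def primer_degeneracy(primer: str) -> int:
    """Return the number of distinct oligonucleotides encoded by a primer."""
    total = 1
    for base in primer.upper():
        if base not in IUPAC_DEGENERACY:
            raise ValueError(f"Unknown nucleotide code: {base!r}")
        total *= IUPAC_DEGENERACY[base]
    return total


def recovery_rate(detected, expected) -> float:
    """Return the fraction of expected mock-community species that were detected."""
    expected = set(expected)
    if not expected:
        raise ValueError("The expected species list must not be empty")
    # Detected taxa that are not in the mock community are ignored here;
    # they would be candidate false positives and need separate inspection.
    return len(expected & set(detected)) / len(expected)


if __name__ == "__main__":
    toy_primer = "GGWACHGGNTGI"  # hypothetical sequence, not a primer from this study
    print(primer_degeneracy(toy_primer))  # 2 (W) * 3 (H) * 4 (N) = 24

    mock = {"sp1", "sp2", "sp3", "sp4"}   # hypothetical mock community
    found = {"sp1", "sp2", "spX"}         # hypothetical detections
    print(recovery_rate(found, mock))     # 2 of 4 expected species = 0.5
```

Under these assumptions, a primer set would pass the 95% threshold reported above when `recovery_rate` returns a value greater than 0.95 for the 374-species mock community.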

Author Comment

Added a figure showing taxon recovery at the order level; fixed minor spelling errors.

Supplemental Information

Figure S1: Mock community composition

DOI: 10.7287/peerj.preprints.27801v2/supp-1

Figure S2: Sequence alignment for 29 insect orders, including primer binding annotations. The alignment was used for primer development

DOI: 10.7287/peerj.preprints.27801v2/supp-2

Figure S3: Evaluation of Levenshtein distances for fusion primers used to metabarcode the 21 primer sets

DOI: 10.7287/peerj.preprints.27801v2/supp-3

Figure S4: Fusion primers used to metabarcode the 21 primer sets

DOI: 10.7287/peerj.preprints.27801v2/supp-4

Figure S5: Fusion primers used for gradient metabarcoding

DOI: 10.7287/peerj.preprints.27801v2/supp-5

Figure S6: Evaluation of Levenshtein distances for fusion primers used in gradient PCR

DOI: 10.7287/peerj.preprints.27801v2/supp-6

Figure S7: Gradient PCR gels for the initial 36 primer combinations

DOI: 10.7287/peerj.preprints.27801v2/supp-7

Figure S8: Amplicon concentration of the 36 primer sets after the first gradient PCR test

DOI: 10.7287/peerj.preprints.27801v2/supp-8

Figure S9: Sequencing depth for the mock community metabarcoding run

DOI: 10.7287/peerj.preprints.27801v2/supp-9

Figure S10: Distribution of read lengths after paired end merging for the mock community metabarcoding run

DOI: 10.7287/peerj.preprints.27801v2/supp-10

Figure S11: Rarefaction curves showing taxon recovery for the mock sample with different primer sets

DOI: 10.7287/peerj.preprints.27801v2/supp-11

Figure S12: Heat map showing taxon recovery for the mock sample with different primer sets

DOI: 10.7287/peerj.preprints.27801v2/supp-12

Figure S13: Principal component analysis of the metabarcoding OTU table for the mock community metabarcoding run

DOI: 10.7287/peerj.preprints.27801v2/supp-13

Figure S14: Jaccard similarity and Bray-Curtis distance based on taxa recovered from the mock community metabarcoding run

DOI: 10.7287/peerj.preprints.27801v2/supp-14

Figure S15: Plot showing the similarity between taxon recovery at 46 °C with primers of both the mock community metabarcoding and the final gradient metabarcoding run

DOI: 10.7287/peerj.preprints.27801v2/supp-15

Figure S16: Heat map showing taxon recovery with four primer sets at different annealing temperatures (40 - 56 °C)

DOI: 10.7287/peerj.preprints.27801v2/supp-16

Figure S17: Distribution of read length after paired end merging for the final gradient run

DOI: 10.7287/peerj.preprints.27801v2/supp-17

Figure S18: Heat map showing taxon recovery for the Malaise trap metabarcoding run with 21 primer sets

DOI: 10.7287/peerj.preprints.27801v2/supp-18

Figure S19: Rarefaction curves showing taxon recovery for the Malaise trap metabarcoding run with different primer sets

DOI: 10.7287/peerj.preprints.27801v2/supp-19

Scripts S1: R scripts used for bioinformatics processing, figure generation and statistical analysis

DOI: 10.7287/peerj.preprints.27801v2/supp-20

Table S1: Raw OTU table for both the 21 primer and the gradient metabarcoding run, as well as details on mock sample composition

DOI: 10.7287/peerj.preprints.27801v2/supp-21

Table S2: Primer sequences and primer combinations evaluated in this study

DOI: 10.7287/peerj.preprints.27801v2/supp-22

Table S3: NCBI SRA accession numbers for demultiplexed samples and raw MiSeq sequencing files

DOI: 10.7287/peerj.preprints.27801v2/supp-23

Manuscript file, for you to download and provide feedback using track changes!

DOI: 10.7287/peerj.preprints.27801v2/supp-24