Combining NCBI and BOLD databases for OTU assignment in metabarcoding and metagenomic data: The BOLD_NCBI_Merger

Aquatic Ecosystem Research, University Duisburg-Essen, Essen, Germany
DOI
10.7287/peerj.preprints.3133v1
Subject Areas
Biodiversity, Bioinformatics, Ecology, Molecular Biology, Data Mining and Machine Learning
Keywords
Metabarcoding, Metagenomics, Database, Tutorial, Taxonomy, Script, Mitogenomics, Biodiversity
Copyright
© 2017 Macher et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Macher J, Macher T, Leese F. 2017. Combining NCBI and BOLD databases for OTU assignment in metabarcoding and metagenomic data: The BOLD_NCBI_Merger. PeerJ Preprints 5:e3133v1

Abstract

Metabarcoding and metagenomic approaches are becoming routine techniques in biodiversity assessment and ecological studies. Assigning taxonomic information to sequences is challenging because many reference libraries lack information on certain taxonomic groups and can contain erroneous sequences. Combining different reference databases is therefore a promising approach for maximizing taxonomic coverage and the reliability of results. This tutorial shows how to use the “BOLD_NCBI_Merger” script to combine sequence data obtained from the National Center for Biotechnology Information (NCBI) GenBank and the Barcode of Life Database (BOLD) and to prepare them for taxonomic assignment with the software MEGAN.
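
The general idea of combining two reference sets can be illustrated with a minimal sketch. This is not the BOLD_NCBI_Merger script itself, whose implementation is given in the supplementary tutorial; it only assumes that both sources have already been exported as FASTA text. The function names, the `NCBI|` and `BOLD|` header prefixes, and the toy records are hypothetical, and dropping exact duplicate sequences is one possible design choice rather than the authors' documented behaviour.

```python
from typing import Dict, Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks).upper()
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks).upper()


def merge_references(sources: Dict[str, str]) -> List[Tuple[str, str]]:
    """Merge FASTA texts keyed by a source label.

    Headers are prefixed with the (hypothetical) source label, and only the
    first occurrence of each exact sequence is kept.
    """
    seen = set()
    merged: List[Tuple[str, str]] = []
    for label, text in sources.items():
        for header, seq in parse_fasta(text):
            if seq in seen:
                continue
            seen.add(seq)
            merged.append((f"{label}|{header}", seq))
    return merged


def to_fasta(records: List[Tuple[str, str]]) -> str:
    """Serialize (header, sequence) pairs back to FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


# Toy records invented for illustration only.
ncbi_text = ">acc1 Baetis rhodani\nACGTACGT\n>acc2 Gammarus pulex\nTTGGCCAA\n"
bold_text = ">id1 Baetis rhodani\nacgtacgt\n>id2 Ephemera danica\nGGGGAAAA\n"

merged = merge_references({"NCBI": ncbi_text, "BOLD": bold_text})
print(to_fasta(merged))
```

In this sketch the second Baetis rhodani record is skipped because its sequence matches one already seen after case normalization. A real workflow might instead keep both copies so that conflicting taxonomic labels between the two databases remain visible.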

Author Comment

This is a preprint submission to PeerJ Preprints.

Supplemental Information

Supplementary material 1: Tutorial and script

DOI: 10.7287/peerj.preprints.3133v1/supp-1