MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies

Department of Energy, Joint Genome Institute, Walnut Creek, CA, United States of America
School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui, China
Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, United States of America
School of Natural Sciences, University of California at Merced, Merced, United States of America
DOI
10.7287/peerj.preprints.27522v1
Subject Areas
Bioinformatics, Computational Biology, Genomics, Microbiology, Statistics
Keywords
metagenomics, metagenome binning, clustering
Copyright
© 2019 Kang et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Kang D, Li F, Kirton ES, Thomas A, Egan RS, An H, Wang Z. 2019. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ Preprints 7:e27522v1

Abstract

We previously reported MetaBAT, an automated metagenome binning software tool that reconstructs single genomes from microbial communities for subsequent analyses of uncultivated microbial species. MetaBAT has become one of the most popular binning tools, largely due to its computational efficiency and ease of use, especially in binning experiments with a large number of samples and a large assembly. However, MetaBAT requires users to choose parameters to fine-tune its sensitivity and specificity. If those parameters are not chosen properly, binning accuracy can suffer, especially on assemblies of poor quality. Here we developed MetaBAT 2 to overcome this problem. MetaBAT 2 uses a new adaptive binning algorithm to eliminate manual parameter tuning. We also performed extensive software engineering optimization to increase both computational and memory efficiency. In a comparison with alternative software tools on over 100 real-world metagenome assemblies, MetaBAT 2 showed superior accuracy and computing speed. Binning a typical metagenome assembly takes only a few minutes on a single commodity workstation. We therefore recommend that the community adopt MetaBAT 2 for their metagenome binning experiments. MetaBAT 2 is open-source software available at https://bitbucket.org/berkeleylab/metabat.

Author Comment

This is a submission to PeerJ for review.

Supplemental Information

A list of metagenome assemblies used to evaluate binning performance

IMG access IDs and sample names for the IMG-100 dataset

DOI: 10.7287/peerj.preprints.27522v1/supp-1

List of parameter sets used to evaluate MetaBAT 2 on 120 real metagenome assemblies

The parameter sets and their performance comparison to the default parameter set.

DOI: 10.7287/peerj.preprints.27522v1/supp-2