MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies
A peer-reviewed article of this Preprint also exists.
Abstract
We previously reported MetaBAT, an automated metagenome binning software tool to reconstruct single genomes from microbial communities for subsequent analyses of uncultivated microbial species. MetaBAT has become one of the most popular binning tools, largely due to its computational efficiency and ease of use, especially in binning experiments with a large number of samples and a large assembly. MetaBAT requires users to choose parameters to fine-tune its sensitivity and specificity. If those parameters are not chosen properly, binning accuracy can suffer, especially on assemblies of poor quality. Here we developed MetaBAT 2 to overcome this problem. MetaBAT 2 uses a new adaptive binning algorithm to eliminate manual parameter tuning. We also performed extensive software engineering optimization to increase both computational and memory efficiency. A comparison of MetaBAT 2 with alternative software tools on over 100 real-world metagenome assemblies shows superior accuracy and computing speed. Binning a typical metagenome assembly takes only a few minutes on a single commodity workstation. We therefore recommend that the community adopt MetaBAT 2 for metagenome binning experiments. MetaBAT 2 is open-source software, available at https://bitbucket.org/berkeleylab/metabat.
Cite this as
2019. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ Preprints 7:e27522v1 https://doi.org/10.7287/peerj.preprints.27522v1
Author comment
This is a submission to PeerJ for review.
Supplemental Information
A list of metagenome assemblies used to evaluate binning performance
IMG access IDs and sample names for the IMG-100 dataset
List of parameter sets used to evaluate MetaBAT 2 on 120 real metagenome assemblies
The parameter sets and their performance comparison to the default parameter set.
Additional Information
Competing Interests
The authors declare that they have no competing interests.
Author Contributions
Dongwan Kang conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
Feng Li performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
Edward S Kirton performed the experiments, analyzed the data, approved the final draft.
Ashleigh Thomas performed the experiments, prepared figures and/or tables, approved the final draft.
Rob S Egan conceived and designed the experiments, approved the final draft.
Hong An authored or reviewed drafts of the paper, approved the final draft.
Zhong Wang conceived and designed the experiments, authored or reviewed drafts of the paper, approved the final draft.
Data Deposition
The following information was supplied regarding data availability:
Funding
The work was conducted by the US Department of Energy Joint Genome Institute. The work of Dongwan Kang, Edward Kirton, Ashleigh Thomas, Rob Egan, and Zhong Wang was supported by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research under Contract No. DE-AC02-05CH11231. Feng Li was supported by an exchange student fellowship from the China Scholarship Council (CSC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.