mockrobiota: a public resource for microbiome bioinformatics benchmarking

Nicholas A Bokulich; Jai Ram Rideout; William G Mercurio; Benjamin Wolfe; Corinne F Maurice; Rachel J Dutton; Peter J Turnbaugh; Rob Knight; J. Gregory Caporaso

doi:10.7287/peerj.preprints.2065v1

Nicholas A Bokulich¹, Jai Ram Rideout¹, William G Mercurio¹, Benjamin Wolfe², Corinne F Maurice³, Rachel J Dutton⁴,⁵, Peter J Turnbaugh⁶, Rob Knight⁵,⁷,⁸, J. Gregory Caporaso¹,⁹

May 23, 2016

Author and article information

1 Center for Microbial Genetics and Genomics, Northern Arizona University, Flagstaff, AZ, USA

2 Department of Biology, Tufts University, Medford, MA, USA

3 Department of Microbiology & Immunology, Microbiome and Disease Tolerance Centre, McGill University, Montreal, Quebec, Canada

4 Division of Biological Sciences, University of California, San Diego, San Diego, CA, United States

5 Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, United States

6 Department of Microbiology and Immunology, G.W. Hooper Foundation, University of California, San Francisco, San Francisco, CA, United States

7 Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, United States

8 Department of Pediatrics, University of California, San Diego, La Jolla, CA, United States

9 Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, United States

DOI: 10.7287/peerj.preprints.2065v1

Published: 2016-05-23
Accepted: 2016-05-23

Subject Areas: Bioinformatics, Computational Biology, Ecology, Microbiology
Keywords: mock community, rRNA, ITS, marker-gene sequencing, metagenomics, microbial ecology, microbiome, bioinformatics, methods development

Copyright: © 2016 Bokulich et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Bokulich NA, Rideout JR, Mercurio WG, Wolfe B, Maurice CF, Dutton RJ, Turnbaugh PJ, Knight R, Caporaso JG. 2016. mockrobiota: a public resource for microbiome bioinformatics benchmarking. PeerJ Preprints 4:e2065v1 https://doi.org/10.7287/peerj.preprints.2065v1

Abstract

Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data, available at https://github.com/caporaso-lab/mockrobiota. The materials contained in mockrobiota include dataset and sample metadata; expected composition data, annotated based on one or more reference taxonomies; links to raw data (e.g., raw sequence data) for each mock community dataset; and optional reference sequences for mock community members. mockrobiota does not supply physical sample materials directly, but the dataset metadata included for each mock community indicate whether physical sample materials are available and provide associated contact information. At the time of this writing, mockrobiota contains 11 mock community datasets with known species compositions (including bacterial, archaeal, and eukaryotic mock communities), analyzed by high-throughput marker-gene sequencing. The availability of standard, public mock community data will facilitate ongoing method optimization, comparisons across studies that share source data, and greater transparency and access, and will eliminate redundancy. This dynamic resource is intended to expand and evolve to meet the changing needs of the ’omics community.

Author comment

This is a preprint submission to PeerJ.

Supplemental Information

Fig 1

Fig 1. Example usage of the mockrobiota mock community (MC) resource for marker-gene sequencing pipelines. MC datasets are selected based on multiple input criteria, including dataset metadata, sample metadata, and represented taxa. Raw data (e.g., fastq) are demultiplexed, sequences are dereplicated or clustered as OTUs, and taxonomy is assigned to representative sequences. Observed taxonomic assignments and abundances are then compared to the expected composition (expected taxonomic assignments and abundances) of that MC, e.g., to generate precision and recall scores or correlations between observed and expected values.

DOI: 10.7287/peerj.preprints.2065v1/supp-1
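
The final comparison step described in the Fig 1 caption can be sketched in Python as follows. This is a minimal illustration, not the authors' evaluation code: the function names (`precision_recall`, `pearson_r`), the taxon names, and the abundance values are all hypothetical, and each composition is assumed to be available as a mapping from taxon name to relative abundance.

```python
from math import sqrt


def precision_recall(expected, observed):
    """Taxon-level precision and recall.

    Both arguments are assumed to be {taxon: relative_abundance} dicts;
    a taxon counts as present when its abundance is greater than zero.
    """
    exp_taxa = {t for t, a in expected.items() if a > 0}
    obs_taxa = {t for t, a in observed.items() if a > 0}
    true_pos = len(exp_taxa & obs_taxa)
    precision = true_pos / len(obs_taxa) if obs_taxa else 0.0
    recall = true_pos / len(exp_taxa) if exp_taxa else 0.0
    return precision, recall


def pearson_r(expected, observed):
    """Pearson correlation of abundances over the union of taxa.

    Taxa missing from one composition are treated as abundance 0.
    Returns NaN when either vector has zero variance.
    """
    taxa = sorted(set(expected) | set(observed))
    x = [expected.get(t, 0.0) for t in taxa]
    y = [observed.get(t, 0.0) for t in taxa]
    n = len(taxa)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


# Hypothetical compositions, invented purely for illustration.
expected = {"Escherichia": 0.25, "Bacillus": 0.25,
            "Staphylococcus": 0.25, "Pseudomonas": 0.25}
observed = {"Escherichia": 0.30, "Bacillus": 0.20,
            "Staphylococcus": 0.40, "Clostridium": 0.10}

precision, recall = precision_recall(expected, observed)
r = pearson_r(expected, observed)
print(f"precision={precision:.2f} recall={recall:.2f} r={r:.2f}")
```

In this invented example, three of the four observed taxa are expected and three of the four expected taxa are observed, so precision and recall are both 0.75. Treating absent taxa as zero abundance is one possible design choice; it lets false positives and false negatives lower the correlation rather than being silently dropped.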

Additional Information

Competing Interests

J. Gregory Caporaso is an Academic Editor for PeerJ.

Author Contributions

Nicholas A Bokulich conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Jai Ram Rideout conceived and designed the experiments, performed the experiments, analyzed the data, reviewed drafts of the paper.

William G Mercurio analyzed the data, reviewed drafts of the paper.

Benjamin Wolfe contributed reagents/materials/analysis tools, reviewed drafts of the paper.

Corinne F Maurice contributed reagents/materials/analysis tools, reviewed drafts of the paper.

Rachel J Dutton contributed reagents/materials/analysis tools, reviewed drafts of the paper.

Peter J Turnbaugh contributed reagents/materials/analysis tools, reviewed drafts of the paper.

Rob Knight contributed reagents/materials/analysis tools, reviewed drafts of the paper.

J. Gregory Caporaso conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, reviewed drafts of the paper.

Data Deposition

The following information was supplied regarding data availability:

https://github.com/caporaso-lab/mockrobiota

Funding

The authors received no funding for this work.
