A heritability-based comparison of methods used to cluster 16S rRNA gene sequences into operational taxonomic units
- Subject Areas
- Bioinformatics, Ecology, Microbiology
- Keywords
- Ecology, Microbiology, Computational Biology
- Copyright
- © 2016 Jackson et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2016. A heritability-based comparison of methods used to cluster 16S rRNA gene sequences into operational taxonomic units. PeerJ Preprints 4:e2115v1 https://doi.org/10.7287/peerj.preprints.2115v1
Abstract
A variety of methods are available to collapse 16S rRNA gene sequencing reads into the operational taxonomic units (OTUs) used in microbiome analyses. A number of studies have aimed to compare the quality of the resulting OTUs. However, in the absence of a standard method to define and enumerate the different taxa within a microbial community, existing comparisons have been unable to assess the ability of clustering methods to generate units that accurately represent functional taxonomic segregation. We have previously demonstrated heritability of the microbiome, and we propose this as a measure of each method's ability to generate OTUs representing biologically relevant units. Our approach assumes that the OTUs that best represent the functional units interacting with the host's properties will produce the highest heritability estimates. Using 1750 unselected individuals from the TwinsUK cohort, we compared 11 approaches to OTU clustering in heritability analyses. We find that de novo clustering methods produce more heritable OTUs than reference-based approaches, with VSEARCH and SUMACLUST performing well. We also show that differences resulting from each clustering method are minimal once reads are collapsed by taxonomic assignment, although sample diversity estimates are clearly influenced by the OTU clustering approach. These results should help guide the selection of sequence clustering methods in future microbiome studies, particularly for studies of human host-microbiome interactions.
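As a rough illustration of the comparison described in the abstract, the sketch below ranks clustering methods by the mean heritability of their OTUs. It is not the authors' pipeline: it swaps in Falconer's classical formulas, which approximate the additive genetic (A), common environment (C) and unique environment (E) components from monozygotic and dizygotic twin correlations, rather than fitting a full twin model. The function names, the method labels and the correlation values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Approximate A, C and E from twin correlations (Falconer's formulas).

    r_mz and r_dz are the trait correlations (here, OTU abundance) within
    monozygotic and dizygotic twin pairs. Estimates are not clamped to
    [0, 1], so noisy inputs can give negative values.
    """
    a = 2.0 * (r_mz - r_dz)   # additive genetic component (heritability)
    c = 2.0 * r_dz - r_mz     # shared (common) environment
    e = 1.0 - r_mz            # unique environment plus measurement error
    return {"A": a, "C": c, "E": e}


def rank_methods_by_heritability(twin_correlations: dict) -> list:
    """Order clustering methods from highest to lowest mean A estimate.

    twin_correlations maps a method label to a list of (r_mz, r_dz)
    tuples, one tuple per OTU produced by that method.
    """
    mean_a = {
        method: mean(falconer_ace(r_mz, r_dz)["A"] for r_mz, r_dz in pairs)
        for method, pairs in twin_correlations.items()
    }
    return sorted(mean_a, key=mean_a.get, reverse=True)


# Hypothetical input: invented correlations for two placeholder methods.
example = {
    "method_x": [(0.50, 0.30), (0.40, 0.25)],
    "method_y": [(0.35, 0.30), (0.30, 0.28)],
}
ranking = rank_methods_by_heritability(example)
```

Under the abstract's assumption, a method whose OTUs track host-relevant biological units more closely would tend to appear earlier in `ranking`. A real analysis would also need to account for covariates and for the uncertainty of each estimate.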
Author Comment
This is a submission to PeerJ for review.
Supplemental Information
S1 Supplementary Methods
Details of the analysis pipeline used and an outline of each clustering method.
S2 Supplementary Table
Summary of OTU counts resulting from each method.
S3 Supplementary Table
Method-wise A, C and E estimates for each taxon found across all methods.