Minimizing spurious features in 16S rRNA gene amplicon sequencing

NOT PEER-REVIEWED
"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

Supplemental Information

Figure S1 OTUs obtained by the AOR approach in the Mock data

(a-c) The number of OTUs decreased to 22 at the thresholds; (d-f) the total proportion of sequences remapped to OTUs remained above 99%; (g-i) the MCC values increased to >0.95, indicating ideal OTU delineation quality. The secondary x-axis at the bottom indicates how many sequences did not take part in the initial OTU delineation at each threshold level. After OTU delineation, qualified unique sequences were remapped to OTUs at a 97% similarity threshold. Dots indicate the original results of the corresponding OTU delineation methods.

DOI: 10.7287/peerj.preprints.26872v1/supp-1
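
Figure S1 reports MCC values without spelling out the calculation. As a rough illustration, assuming MCC here means the Matthews correlation coefficient computed over pairs of sequences (a pair is a true positive when the two sequences lie within the distance cutoff and share an OTU), a minimal Python sketch might look like the following. The function name `clustering_mcc`, the input layout, and the 0.03 cutoff (taken as the complement of the 97% similarity threshold) are assumptions made for illustration and are not the authors' code.

```python
from itertools import combinations
from math import sqrt


def clustering_mcc(distances, otu_of, cutoff=0.03):
    """Matthews correlation coefficient over all sequence pairs.

    distances: dict mapping a sorted (seq_a, seq_b) tuple to a pairwise distance.
    otu_of:    dict mapping each sequence name to its OTU label.
    cutoff:    assumed distance cutoff (0.03 corresponds to 97% similarity).
    """
    tp = tn = fp = fn = 0
    for a, b in combinations(sorted(otu_of), 2):
        close = distances[(a, b)] <= cutoff
        same_otu = otu_of[a] == otu_of[b]
        if close and same_otu:
            tp += 1          # similar and clustered together
        elif close:
            fn += 1          # similar but split across OTUs
        elif same_otu:
            fp += 1          # dissimilar but clustered together
        else:
            tn += 1          # dissimilar and kept apart
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: two tight pairs that are far from each other.
toy_distances = {
    ("s1", "s2"): 0.01, ("s3", "s4"): 0.02,
    ("s1", "s3"): 0.10, ("s1", "s4"): 0.10,
    ("s2", "s3"): 0.10, ("s2", "s4"): 0.10,
}
good = clustering_mcc(toy_distances, {"s1": "A", "s2": "A", "s3": "B", "s4": "B"})
bad = clustering_mcc(toy_distances, {"s1": "A", "s2": "B", "s3": "B", "s4": "B"})
```

In this toy case the assignment that keeps each tight pair together scores higher than the one that splits a pair and merges distant sequences, which is the sense in which a higher MCC indicates better OTU delineation.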

Figure S2 Coefficient of variation (a-d) and the 99% confidence intervals of bootstrapped abundance (e-h) in the (a, e) PWS, (b, f) Ultra, (c, g) River and (d, h) Water data

The coefficient of variation decreased quickly as sequence abundance increased. The distribution of bootstrapped abundance included zero when the abundance was very low. Dashed vertical lines show the abundance thresholds for OTU delineation.

DOI: 10.7287/peerj.preprints.26872v1/supp-2
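
To make the idea behind Figure S2 concrete, the sketch below resamples a set of reads with replacement and summarizes the bootstrapped abundance of one unique sequence by its coefficient of variation and a 99% percentile interval. It is only an illustration in Python under stated assumptions: it is not the authors' R script (resample_uniques_ci.r), and the function name, the number of replicates, the percentile method, and the toy read counts are all hypothetical.

```python
import random
from statistics import mean, pstdev


def bootstrap_abundance(reads, target, replicates=500, level=0.99, seed=0):
    """Return (CV, (lower, upper)) for the bootstrapped count of `target`.

    reads:  list of sequence labels, one entry per read.
    target: label of the unique sequence of interest.
    """
    rng = random.Random(seed)
    n = len(reads)
    # Each replicate redraws n reads with replacement and counts the target.
    counts = sorted(rng.choices(reads, k=n).count(target) for _ in range(replicates))
    avg = mean(counts)
    cv = pstdev(counts) / avg if avg else float("inf")
    tail = (1 - level) / 2
    lower = counts[int(tail * (replicates - 1))]
    upper = counts[int(round((1 - tail) * (replicates - 1)))]
    return cv, (lower, upper)


# Hypothetical sample of 2000 reads: one singleton and one sequence seen 50 times.
reads = ["rare"] * 1 + ["common"] * 50 + ["other"] * 1949
cv_rare, ci_rare = bootstrap_abundance(reads, "rare")
cv_common, ci_common = bootstrap_abundance(reads, "common")
```

With data like this, the singleton has a much larger coefficient of variation and an interval that reaches zero, while the more abundant sequence does not, which mirrors the pattern described in the caption.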

Figure S3 OTUs obtained by the AOR approach in the (a) PWS, (b) Ultra, (c) River and (d) Water data sets

The vertical dashed lines indicate the thresholds set by bootstrap resampling. Different pipelines obtained similar numbers of OTUs at these thresholds. Dots indicate the original results of the corresponding OTU delineation methods.

DOI: 10.7287/peerj.preprints.26872v1/supp-3

Figure S4 The MCC values in the (a) PWS, (b) Ultra, (c) River and (d) Water data sets increased with the threshold

After OTU delineation, all “qualified sequences” were remapped to OTUs at 97% similarity. Dots indicate the original results of the corresponding OTU delineation methods.

DOI: 10.7287/peerj.preprints.26872v1/supp-4

Figure S5 AOR resulted in fewer OTUs but comparable alpha diversity in the PWS (a-d), Ultra (e-h), River (i-l) and Water (m-p) data

(a, e, i, m) Number of OTUs, (b, f, j, n) Chao1 indices, (c, g, k, o) Simpson indices and (d, h, l, p) Shannon indices were calculated per sample. Multiple comparisons were performed using the Wilcoxon test; p values were adjusted using the FDR method.

DOI: 10.7287/peerj.preprints.26872v1/supp-5
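
For readers unfamiliar with the indices named in Figure S5, a small Python sketch of one common set of definitions follows. The exact variants used by the authors are not stated, so the choices here are assumptions: the bias-corrected Chao1 estimator, Shannon entropy with the natural logarithm, and the Simpson index expressed as 1 minus the sum of squared proportions. The function name `alpha_diversity` and the example counts are hypothetical.

```python
from math import log


def alpha_diversity(counts):
    """Compute per-sample alpha diversity from a list of OTU read counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    props = [c / total for c in counts]
    observed = len(counts)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    # Bias-corrected Chao1: adds an estimate of unseen OTUs to the observed count.
    chao1 = observed + singletons * (singletons - 1) / (2 * (doubletons + 1))
    shannon = -sum(p * log(p) for p in props)
    simpson = 1 - sum(p * p for p in props)
    return {"observed": observed, "chao1": chao1,
            "shannon": shannon, "simpson": simpson}


# Hypothetical samples: a perfectly even one and one with rare OTUs.
even = alpha_diversity([10, 10, 10, 10])
skewed = alpha_diversity([5, 3, 1, 1, 2, 2])
```

Because Chao1 depends directly on singleton and doubleton counts, it is the index most sensitive to removing low-abundance sequences, whereas Shannon and Simpson are dominated by the abundant OTUs.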

Figure S6 AOR resulted in more consistent beta diversity among methods in the (a) PWS, (b) Ultra, (c) River and (d) Water data

Mantel r statistics were obtained by comparing beta diversity distance matrices between each pair of analysis methods, using either the original results (red) or the results with the AOR approach incorporated (blue).

DOI: 10.7287/peerj.preprints.26872v1/supp-6
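
The Mantel r statistic in Figure S6 measures how well two distance matrices agree. A minimal Python sketch of the general technique is given below, assuming a Pearson correlation over the upper-triangle entries and a simple permutation test; the function names, the number of permutations, and the toy matrix are hypothetical, and the authors' actual software and settings are not stated here.

```python
import random
from math import sqrt


def _upper(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p) for two symmetric distance matrices over the same samples."""
    rng = random.Random(seed)
    n = len(d1)
    x = _upper(d1)
    r_obs = _pearson(x, _upper(d2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Permute sample labels of the second matrix (rows and columns together).
        rng.shuffle(order)
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if _pearson(x, _upper(permuted)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical beta diversity matrix for four samples, compared with itself.
dist = [
    [0.0, 0.2, 0.5, 0.9],
    [0.2, 0.0, 0.4, 0.8],
    [0.5, 0.4, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
r_same, p_same = mantel(dist, dist)
```

Two methods that rank the between-sample distances identically give r close to 1, so higher r values between pairs of pipelines indicate more consistent beta diversity.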

Table S1 The construction of mock communities

DOI: 10.7287/peerj.preprints.26872v1/supp-7

Table S2 The 87 references used in simulated data

DOI: 10.7287/peerj.preprints.26872v1/supp-8

Table S3 The average error rates of the raw sequences reported by sequencing machine, QC sequences passing different quality control methods, the final qualified sequences for OTU delineation, and the qualified sequences pre-clustered with up to 1 differe

DOI: 10.7287/peerj.preprints.26872v1/supp-9

Table S4 The number of sequences that passed quality filtering using different methods

DOI: 10.7287/peerj.preprints.26872v1/supp-10

Table S5 The abundance threshold of unreliable sequences

DOI: 10.7287/peerj.preprints.26872v1/supp-11

Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Jing Wang conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.

Qianpeng Zhang performed the experiments.

Guojun Wu authored or reviewed drafts of the paper.

Chenhong Zhang performed the experiments.

Menghui Zhang conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, authored or reviewed drafts of the paper, approved the final draft.

Liping Zhao conceived and designed the experiments, authored or reviewed drafts of the paper, approved the final draft.

DNA Deposition

The following information was supplied regarding the deposition of DNA sequences:

The mock community and PWS datasets supporting the conclusions of this article are available in the NCBI Sequence Read Archive under BioProject PRJNA306596 (http://www.ncbi.nlm.nih.gov/bioproject/PRJNA306596).

Data Deposition

The following information was supplied regarding data availability:

The 16S rRNA gene clones used in the mock data are provided in Supplementary Table 1. The 16S rRNA gene reference sequences used in the simulated data are provided in Supplementary Table 2. The custom R script (resample_uniques_ci.r) used to perform the bootstrapping approach is available in the Supplemental Information.

Funding

This work was supported by grants from the National Science and Technology Major Project of China (2012ZX10005001-009), the National Natural Science Foundation of China (31330005, 30730005, 81401141 and 20875061), and the Science and Technology Commission of Shanghai Municipality (14YF1402200). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

