Figure S1 The OTUs obtained by the AOR approach in the Mock data
(a-c) The number of OTUs decreased to 22 at the thresholds; (d-f) the total ratio of sequences remapped back to OTUs remained at >99%; (g-i) the MCC values increased to >0.95, indicating ideal OTU delineation quality. The alternative x axis at the bottom indicates how many sequences did not participate in the initial OTU delineation at each threshold level. After OTU delineation, qualified unique sequences were remapped to OTUs at a 97% similarity threshold. Dots indicate the original results of the corresponding OTU delineation methods.
Figure S2 Coefficient of variation (a-d) and the 99% confidence intervals of bootstrapped abundance (e-h) in the (a, e) PWS, (b, f) Ultra, (c, g) River and (d, h) Water data
The coefficient of variation decreased rapidly as sequence abundance increased. The distribution of bootstrapped abundance included zero when the abundance was very low. Dashed vertical lines show the abundance thresholds for OTU delineation.
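The quantities plotted in Figure S2 can be illustrated with a minimal sketch. It assumes that bootstrapping means resampling all reads with replacement and that the 99% interval is taken from the 0.5th and 99.5th percentiles of the resampled abundances; the function name `bootstrap_abundance` and its parameters are hypothetical and are not taken from the original analysis.

```python
import random
import statistics


def bootstrap_abundance(counts, n_boot=1000, seed=0):
    """Return (cv, ci_low, ci_high) for each unique sequence.

    Assumption: reads are resampled with replacement in proportion to the
    observed abundances, keeping the total number of reads fixed.
    """
    rng = random.Random(seed)
    labels = list(range(len(counts)))
    total = sum(counts)
    # boot[i] collects the resampled abundance of sequence i per replicate
    boot = [[] for _ in counts]
    for _ in range(n_boot):
        draw = rng.choices(labels, weights=counts, k=total)
        tallies = [0] * len(counts)
        for i in draw:
            tallies[i] += 1
        for i, t in enumerate(tallies):
            boot[i].append(t)

    results = []
    for values in boot:
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        cv = sd / mean if mean > 0 else float("inf")
        ordered = sorted(values)
        # Percentile-based 99% interval (0.5th and 99.5th percentiles)
        low = ordered[int(0.005 * (n_boot - 1))]
        high = ordered[int(0.995 * (n_boot - 1))]
        results.append((cv, low, high))
    return results
```

Under these assumptions, a sequence observed only once has an interval whose lower bound is zero, whereas abundant sequences have a small coefficient of variation.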
Figure S3 The OTUs obtained by the AOR approach in the (a) PWS, (b) Ultra, (c) River and (d) Water data sets
The vertical dashed lines indicate the thresholds set by bootstrap resampling. Different pipelines obtained similar numbers of OTUs at these thresholds. Dots indicate the original results of the corresponding OTU delineation methods.
Figure S4 The MCC values in the (a) PWS, (b) Ultra, (c) River and (d) Water data sets increased with the threshold
After OTU delineation, all “qualified sequences” were remapped to OTUs at 97% similarity. Dots indicate the original results of the corresponding OTU delineation methods.
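For readers unfamiliar with MCC, the following sketch shows how a Matthews correlation coefficient could be computed for an OTU assignment. It assumes the pair-based convention often used for clustering quality: a pair of sequences is a true positive when the two sequences are within the similarity threshold and share an OTU, a false positive when they share an OTU but are not within the threshold, and so on. The legend does not state the exact definition used, and the helper names are hypothetical.

```python
import math
from itertools import combinations


def pair_confusion(otu_of, similar_pairs):
    """Count TP, TN, FP, FN over all sequence pairs.

    otu_of: dict mapping sequence id -> OTU id.
    similar_pairs: set of frozensets holding the pairs that are within
    the similarity threshold (assumed to be computed elsewhere).
    """
    tp = tn = fp = fn = 0
    for a, b in combinations(sorted(otu_of), 2):
        same_otu = otu_of[a] == otu_of[b]
        close = frozenset((a, b)) in similar_pairs
        if same_otu and close:
            tp += 1
        elif same_otu:
            fp += 1
        elif close:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With this convention, a value of 1 means that every pair of similar sequences shares an OTU and no dissimilar pair does.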
Figure S5 AOR resulted in fewer OTUs but comparable alpha diversity in the PWS (a-d), Ultra (e-h), River (i-l) and Water (m-p) data
(a, e, i, m) Number of OTUs, (b, f, j, n) Chao1 indices, (c, g, k, o) Simpson indices and (d, h, l, p) Shannon indices per sample were calculated. Multiple comparisons were performed using the Wilcoxon test, and p values were adjusted using the FDR method.
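The alpha diversity indices named in this legend can be sketched from a per-sample vector of OTU counts as follows. The legend does not say which variants were used, so this sketch assumes the bias-corrected Chao1 estimator, the Gini-Simpson form (one minus the sum of squared proportions) and the natural-log Shannon index; all function names are illustrative.

```python
import math


def observed_otus(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return observed_otus(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def simpson(counts):
    """Gini-Simpson index: 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)


def shannon(counts):
    """Shannon index with natural logarithm: -sum(p_i * ln p_i)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

For a perfectly even sample of four OTUs, these definitions give a Shannon index of ln 4 and a Simpson index of 0.75, and Chao1 equals the observed number of OTUs because there are no singletons.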
Figure S6 AOR resulted in more consistent beta diversity among methods in (a) PWS, (b) Ultra, (c) River and (d) Water data
Mantel r statistics were obtained by comparing beta diversity distance matrices between each pair of analysis methods, using (red) the original results or (blue) the results with the AOR approach incorporated.
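As an illustration of the statistic reported in Figure S6, the sketch below computes a Mantel r as the Pearson correlation between the corresponding off-diagonal entries of two symmetric distance matrices. The correlation type is an assumption, since the legend does not specify it, and the permutation test usually paired with the Mantel statistic is omitted. The function name `mantel_r` is hypothetical.

```python
import math


def mantel_r(d1, d2):
    """Pearson correlation between the upper triangles of two matrices.

    d1, d2: square, symmetric distance matrices (lists of lists) with
    samples in the same order.
    """
    n = len(d1)
    if n != len(d2):
        raise ValueError("matrices must have the same size")
    # Flatten the strict upper triangle of each matrix
    x = [d1[i][j] for i in range(n) for j in range(i + 1, n)]
    y = [d2[i][j] for i in range(n) for j in range(i + 1, n)]
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A value close to 1 indicates that two analysis methods rank the between-sample distances in nearly the same way.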