Simulated clustering accuracy if rarefying is not penalized for removing the lowest 15th percentile samples
The right axis represents the median library size (NL), while the x-axis ‘effect size’ gives the multinomial mixing proportions of the two classes of samples, ‘Ocean’ and ‘Feces’. See the caption of Fig. 2 for further details.
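As a rough illustration of the filtering step named in the title above, the following sketch drops samples whose library size falls below the 15th percentile and then rarefies the remaining samples to a common depth. It is a minimal sketch under stated assumptions, not the simulation code behind the figure: the helper names `drop_low_depth_samples` and `rarefy`, the toy proportions, and the choice to rarefy to the smallest remaining library size are all illustrative.

```python
import numpy as np

def drop_low_depth_samples(counts, percentile=15.0):
    """Keep samples (rows) whose library size is at or above the given percentile."""
    counts = np.asarray(counts)
    library_sizes = counts.sum(axis=1)
    threshold = np.percentile(library_sizes, percentile)
    keep = library_sizes >= threshold
    return counts[keep], keep

def rarefy(counts, depth, rng):
    """Subsample every row without replacement to exactly `depth` reads."""
    out = np.zeros_like(counts)
    for i, row in enumerate(counts):
        out[i] = rng.multivariate_hypergeometric(row, depth)
    return out

rng = np.random.default_rng(0)

# Hypothetical toy OTU table: 10 samples, 4 OTUs, uneven library sizes.
sizes = [50, 200, 400, 800, 1000, 1200, 1500, 2000, 2500, 3000]
pvals = [0.5, 0.3, 0.15, 0.05]
counts = np.array([rng.multinomial(n, pvals) for n in sizes])

# Remove the lowest-depth samples, then rarefy to the smallest remaining depth
# (the rarefying depth is an assumption made for this example).
filtered, keep = drop_low_depth_samples(counts, percentile=15.0)
depth = int(filtered.sum(axis=1).min())
rarefied = rarefy(filtered, depth, rng)
```

After filtering, the two shallowest samples are gone and every remaining sample has the same total count, so differences between samples no longer reflect sequencing depth.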
Low library size samples can diminish result quality, regardless of normalization technique
We show the inflammatory bowel disease (IBD) dataset of Gevers et al. (Gevers et al. 2014), which has an average library size of 375 sequences per sample. (a) With unweighted UniFrac, extremely low-depth samples cluster in the lower right-hand corner of the PCoA plots when no normalization or an alternative to rarefying is applied. (b) For the alternatives to rarefying, the original library size of the samples is a dominant effect when library sizes are low and biological clustering is subtle, influencing even weighted UniFrac. This effect diminishes if samples with low library sizes are removed from the analysis.
All normalization techniques on key microbiome datasets, Bray-Curtis distance
Rows of panels show (from top to bottom) data from 88 soils (Lauber et al. 2009), Body Sites (Costello et al. 2009), and Moving Pictures (Caporaso et al. 2011a). The 88 soils dataset is colored according to a gradient from low to high pH. The Costello et al. body sites dataset is colored according to body site: feces (blue) and oral cavity (purple); the remaining colors are external auditory canal, hair, nostril, skin, and urine. Moving Pictures dataset: left and right palm (red/blue), tongue (green), feces (orange). It is important to note that all the samples in these datasets have approximately the same depth, and there are very strong driving gradients.
All normalization techniques on key microbiome datasets, unweighted UniFrac distance
See Figure S3 caption for details.
Simple example of the reasoning behind differential abundance simulations
(a) In actual OTU tables generated from sequencing data, the counts (left column) are already compositional and therefore only relative (left column). (b) Application of the ‘effect size’ to the original ‘Multinomial’ template to create fold-change differences disturbs the distinction between true positive (TP) and true negative (TN) OTUs in the ‘Original’ simulation, but not in the ‘Balanced’ simulation. (c) Creation of a ‘Compositional’ OTU table from the ‘Multinomial’ template, where the counts/relative abundances are intentionally blurred for the TN OTUs.
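The compositional effect described in this caption can be illustrated with a small numerical sketch. It assumes a hypothetical four-OTU template with equal proportions and a fold change applied to a single TP OTU; the helper `apply_fold_change` is illustrative and does not reproduce the ‘Original’ or ‘Balanced’ simulation procedures themselves. It only shows why renormalizing after a fold change also shifts the OTUs that were left untouched.

```python
import numpy as np

def apply_fold_change(template, tp_idx, effect_size):
    """Multiply the chosen entries of a proportion vector by effect_size,
    then renormalize so the vector sums to 1 again."""
    p = np.asarray(template, dtype=float).copy()
    p[tp_idx] *= effect_size
    return p / p.sum()

# Hypothetical template: four OTUs with equal relative abundance.
template = np.array([0.25, 0.25, 0.25, 0.25])

# Apply a 5-fold change to OTU 0 only (the intended TP OTU).
shifted = apply_fold_change(template, tp_idx=[0], effect_size=5.0)

# Relative change of each OTU after renormalization.
ratio = shifted / template
tp_ratio = ratio[0]    # observed fold change of the TP OTU
tn_ratio = ratio[1:]   # observed fold change of the untouched (TN) OTUs
```

In this toy case the TP OTU appears only 2.5-fold higher rather than 5-fold, and each untouched OTU appears halved, even though only one OTU was changed. This is the sense in which a fold change applied to a compositional template can blur the distinction between TP and TN OTUs.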
Differential abundance detection performance where the average library size of one sample group is 3 times that of the other
Labels are the same as in Fig. 4.
Differential abundance detection performance when the dataset is compositional
25% of OTUs are differentially abundant. Labels are the same as in Fig. 4.