Abundanceweighted phylogenetic diversity measures distinguish microbial community states and are robust to sampling depth
 Published
 Accepted
 Received
 Academic Editor
 Ludmila Chistoserdova
 Subject Areas
 Biodiversity, Bioinformatics, Ecology, Microbiology, Statistics
 Keywords
 Alpha diversity, Diversity indices, Phylogenetic diversity, Microbial ecology, Human microbiome
 Copyright
 © 2013 McCoy et al.
 Licence
 This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
 Cite this article
 2013) Abundanceweighted phylogenetic diversity measures distinguish microbial community states and are robust to sampling depth. PeerJ 1:e157 https://doi.org/10.7717/peerj.157 (
Abstract
In microbial ecology studies, the most commonly used ways of investigating alpha (withinsample) diversity are either to apply nonphylogenetic measures such as Simpson’s index to Operational Taxonomic Unit (OTU) groupings, or to use classical phylogenetic diversity (PD), which is not abundanceweighted. Although alpha diversity measures that use abundance information in a phylogenetic framework do exist, they are not widely used within the microbial ecology community. The performance of abundanceweighted phylogenetic diversity measures compared to classical discrete measures has not been explored, and the behavior of these measures under rarefaction (subsampling) is not yet clear. In this paper we compare the ability of various alpha diversity measures to distinguish between different community states in the human microbiome for three different datasets. We also present and compare a novel oneparameter family of alpha diversity measures, BWPD_{θ}, that interpolates between classical phylogenetic diversity (PD) and an abundanceweighted extension of PD. Additionally, we examine the sensitivity of these phylogenetic diversity measures to sampling, via computational experiments and by deriving a closed form solution for the expectation of phylogenetic quadratic entropy under resampling. On the three datasets, a phylogenetic measure always performed best, and two abundanceweighted phylogenetic diversity measures were the only measures ranking in the top four across all datasets. OTUbased measures, on the other hand, are less effective in distinguishing community types. In addition, abundanceweighted phylogenetic diversity measures are less sensitive to differing sampling intensity than their unweighted counterparts. Based on these results we encourage the use of abundanceweighted phylogenetic diversity measures, especially for cases such as microbial ecology where species delimitation is difficult.
Introduction
It is now well accepted that incorporating phylogenetic information into alpha (singlesample) and beta (betweensample) diversity measures can be useful in a variety of ecological contexts. Phylogenetic equivalents of all of major alpha diversity measures have been developed (Table 1). Starting with Faith’s original definition of phylogenetic diversity (Faith, 1992), which generalizes species count, there are now phylogenetic generalizations of the Simpson index to Rao’s quadratic entropy (Rao, 1982; Warwick & Clarke, 1995), the Shannon index to phylogenetic entropy (Allen, Kon & BarYam, 2009), and the Hill numbers to ^{q}D(T) (Chao, Chiu & Jost, 2010). Phylogenetic diversity itself has been extended to incorporate taxon counts (Barker, 2002) and proportional abundance (Vellend et al., 2011). There have also been abundanceweighted measures that explicitly measure phylogenetic community structure (Fine & Kembel, 2011), or an “effective number of species” (Chao, Chiu & Jost, 2010). Many diversity measures can be tidily expressed in the framework of Leinster & Cobbold (2012), although the expression of phylogenetic diversity measures for nonultrametric trees is complex.
phylogenetic diversity (Faith, 1992) 
phylogenetic generalization of species count 
phylogenetic quadratic entropy (Rao, 1982; Warwick & Clarke, 1995) 
phylogenetic generalization of the Simpson index 
phylogenetic entropy (Allen, Kon & BarYam, 2009) 
phylogenetic generalization of the Shannon index 
^{q}D(T) (Chao, Chiu & Jost, 2010) 
phylogenetic generalization of Hill numbers 
BWPD_{1} (Barker, 2002; Vellend et al., 2011) 
abundanceweighted version of phylogenetic diversity 
BWPD_{θ} (this paper) 
oneparameter family interpolating between PD and BWPD_{1} 
In this paper we use three example human microbiome datasets to demonstrate the utility of abundanceweighted phylogenetic diversity measures. We also introduce a oneparameter family interpolating between classical PD and an abundanceweighted generalization. We call the parameter θ and denote the oneparameter family BWPD_{θ}; BWPD_{0} is classical PD, whereas BWPD_{1} is balanceweighted phylogenetic diversity, effectively PD_{aw} of Vellend et al. (2011). Intermediate values of θ allow a partiallyabundanceweighted compromise. Such a compromise has recently been shown to be useful for measuring beta diversity, with the introduction of a oneparameter family of “generalized UniFrac” measures (Chen et al., 2012). We use the name Balance Weighted Phylogenetic Diversity as described below because there are a variety of abundance weighted phylogenetic diversity measures. We compare the behavior of PD measures, including BWPD_{θ}, under various levels of sampling using theory and example datasets.
Materials and Methods
Datasets
We apply the methods described below to three previously described 16S rRNA surveys of the human microbiome. The first two datasets are composed of samples from “normal” and dysbiotic microbial communities, where previous studies have associated changes in diversity with changes in health. The third dataset investigates the changes of the skin microbiome through time.
Bacterial vaginosis
First, we reanalyze a pyrosequencing dataset describing bacterial communities from women being monitored in a sexually transmitted disease clinic for bacterial vaginosis (BV). BV has previously been shown to be associated with increased community diversity (Fredricks, Fiedler & Marrazzo, 2005). For this study, swabs were taken from 242 women from the Public Health, Seattle and King County Sexually Transmitted Diseases Clinic between September 2006 and June 2010 of which 220 samples resulted in enough material to analyze (Srinivasan et al., 2012).
Selection of reference sequences and sequence preprocessing were performed using the methods described in Srinivasan et al. (2012). 452,358 reads passed quality filtering, with a median of 1,779 reads per sample (range: 523–2,366).
Oral periodontitis
We also utilize sequence data from a study of subgingival communities in 29 subjects with periodontitis, along with an equal number of healthy controls (Griffen et al., 2011a). The publication analyzing this dataset showed increased community diversity in samples from dysbiotic patients compared to healthy controls. Raw sequences were filtered, retaining only those reads with: a mean quality score of at least 25, no ambiguous bases, at least 150 base pairs in length, and an exact match to the sequencing primer and barcode. A total of 759,423 reads passed quality filtering, with a median of 8,320 reads per sample (range: 4,096–14,319).
As the phylogenetic placement method used below to calculate our measures requires a reference tree and alignment, we created a tree with FastTree 2.1.4 (Price, Dehal & Arkin, 2010) using the alignment and accompanying taxonomic annotation from the curated CORE database of oral microbiota (Griffen et al., 2011b).
Skin microbiome through time
Our third dataset is a study of skin microbial diversity through adolescence (Oh et al., 2012). Aligned sequences were obtained courtesy of the authors, although sequence data is available under the accession numbers [GQ000001] to [GQ116391] and can be accessed through BioProject ID 46333. A total of 90,142 Sanger sequences were available, with a median of 693 sequences per sample (range: 317–2884).
Balanceweighted phylogenetic diversity
In this section we introduce BWPD_{θ}, our oneparameter family interpolating between classical PD and fully balanceweighted phylogenetic diversity. We will primarily consider socalled unrooted (Pardi & Goldman, 2007) phylogenetic diversity, which does not necessarily include the root. The case of rooted phylogenetic diversity can be calculated in a similar though simpler way as described below. Although we will primarily be working in an unrooted sense, it will be useful to use terminology that corresponds to the rooted case. For this reason, if the tree is not already rooted, assume an arbitrary root has been chosen; let the proximal side of a given edge be the side that contains the root and distal be the other.
We will describe BWPD_{θ} in terms of a phylogenetic tree T with leaves L, and a contingency table describing the number of observations of the organisms at the leaves in various samples. The contingency table has rows labeled with the leaves of T, and columns labeled by samples. In microbial ecology this is frequently known as an OTU table. The entry corresponding to a given leaf and a given sample is the number of times that leaf was observed in that sample.
The classical (unrooted) phylogenetic diversity of a given sample in this context is the total branch length of the tree subtended by the leaves in that sample.
The path to generalizing PD is to note that this can be expressed as a sum of branch lengths multiplied by a step function. Let f(x) be the function that is one for x > 0 and zero otherwise. Let g(x) = min(f(x), f(1−x)) and D_{s}(i) be the fraction of reads in sample s that are in leaves on the distal side of edge i. Phylogenetic diversity can be then expressed as (1)${\mathrm{PD}}_{\mathrm{u}}\left(s\right)=\sum _{i}{\ell}_{i}g\left({D}_{s}\right(i\left)\right).$ That is, the sum of edge lengths in T which have reads from s on both the distal and proximal side.
Note that the step function g is the limit of a oneparameter family of functions (Fig. 1). Indeed, defining (2)${g}_{\theta}\left(x\right)={\left[2min(x,1x)\right]}^{\theta},$g is the pointwise limit of the g_{θ} on the closed unit interval as θ goes to zero. Thus our oneparameter generalization is (3)${\mathrm{BWPD}}_{\theta}\left(s\right)=\sum _{i}{\ell}_{i}{g}_{\theta}\left({D}_{s}\right(i\left)\right).$ Note that when θ = 0 this is PD and when θ = 1 this is an abundanceweighted version of PD equivalent to executing the ΔnPD recipe of Barker (2002) up to a multiplicative factor.
The rooted equivalent of (3) is (4)${\mathrm{RBWPD}}_{\theta}\left(s\right)=\sum _{i}{\ell}_{i}{\left({D}_{s}\right(i\left)\right)}^{\theta},$ which interpolates between rooted PD and an abundanceweighted version. Vellend et al. (2011) describe a measure, PD_{aw}, which is equal to RBWPD_{1} multiplied by the total number of branches in T.
We call BWPD_{1} balanceweighted phylogenetic diversity because it weights edges according to the balance of read fractions on either side of an edge–edges with even amount of mass on either side are upweighted, while edges with an uneven balance of mass are downweighted. Indeed, if x−(1−x) is thought of as the imbalance of read fraction on either side of an edge, then 1−x−(1−x) is a measure of balance; note that on the unit interval, 2min(x, 1−x) = 1−x−(1−x). Because a small x or an x close to 1 gives a small coefficient in the summation, small collections of reads or small perturbations of the read distribution will not change the value of BWPD_{1} appreciably.
Calculation of PD measures in example applications
Reads from the vaginal and oral studies were placed on a tree created from a curated set of taxonomically annotated reference sequences. As phylogenetic entropy and ^{q}D(T) operate on a rooted phylogeny, reference trees were assigned a root taxonomically (Matsen & Gallagher, 2012) meaning that a root was found that best separated highlevel taxonomic groupings. pplacer was run in posterior probability mode (using the p and informativeprior flags), which defines an informative prior for pendant branch lengths with a mean derived from the average distances from the edge in question to the leaves of the tree. The resulting set of placements were classified at the family rank using a hybrid classifier implemented in the guppy tool from the pplacer suite. The hybrid classifier assigns taxonomic annotations to sequences using the combination of a naïve Bayes classifier (Wang et al., 2007) with a phylogenetic classifier (N Hoffman, A Gallagher and F Matsen, unpublished data). Any reads that could not be confidently classified to the family rank were not used in measures based on classification.
Fulllength 16S sequences were available for the skin data, and so a more traditional treebuilding approach was used. Representative OTUs were chosen for each site by clustering at 97% identity using USEARCH 5.1 (Edgar, 2010), with trees built on OTU centroids using FastTree (Price, Dehal & Arkin, 2010). To conform with methods used in Oh et al. (2012), the naïve Bayes classifier (Wang et al., 2007) was used to infer genuslevel classifications to taxonomically root the tree; in our case we used the RDP classifier v2.5. The contingency (OTU) tables generated by clustering were made available to our tools via the BIOM (McDonald et al., 2012) format.
PD_{u} (unrooted PD), phylogenetic quadratic entropy (Rao, 1982), phylogenetic entropy (Allen, Kon & BarYam, 2009), and ^{q}D(T) (Chao, Chiu & Jost, 2010) were implemented for phylogenetic placements in the freelyavailable pplacer suite of tools (Matsen, Kodner & Armbrust, 2010) (http://matsen.fhcrc.org/pplacer) in the subcommand guppy fpd. Prior to diversity estimation, either phylogenetic placements were rarefied to the read count of the specimen in the dataset with the fewest sequences using guppy rarefy, or the corresponding rarefaction on fulllength sequences was performed with the QIIME (Caporaso et al., 2010) rarefaction tool single_rarefaction.py. The mean value of each statistic over 100 such rarefactions was used for analysis.
Discrete measures of alpha diversity and richness were calculated on contingency tables obtained from clustering and taxonomic classification. Sequences were clustered into Operational Taxonomic Units (OTUs) at a 97% identity threshold using USEARCH 5.1 (Edgar, 2010). Similar results were observed when clustering at 95% identity (results not shown). OTU counts and familylevel taxon counts were then rarefied as above in R 3.0.1 (R Development Core Team, 2013) using the vegan package (Oksanen et al., 2012). We obtained values for the Simpson (1949) and Shannon (1948) diversity indices, as well as the Chao1 (Chao, 1984) and ACE (Chao & Lee, 1992) measures of species richness using vegan functions diversity and estimateR.
Comparative analysis of alpha diversity measures
To investigate the relation between various measures of alpha diversity, we calculated Pearson’s r between all pairs of measures using the function rcorr from the R package Hmisc (Harrell, 2012). We then performed hierarchical clustering with the R function hclust, using d = 1−r as the distance between two measures.
Association of each measure with clinical criteria for the first two datasets was evaluated by examining the accuracy of a logistic regression using the measure as the sole predictor of whether the sample came from a “normal” or dysbiotic subject. In the vaginal dataset, we assessed each measure’s ability to predict whether a sample was from a subject positive for BV by Amsel’s criteria, a clinical diagnostic method (Amsel et al., 1983). In the oral dataset, we assessed each measure’s ability to predict whether a sample was from a healthy control, or a subject with periodontitis. Accuracy in predicting sample community state was assessed by leaveoneout crossvalidation using the R package boot (Davison & Hinkley, 1997; Canty & Ripley, 2012).
For the vaginal dataset, we also calculated R^{2} values using each measure individually as a predictor for sample Nugent score in a linear regression. The Nugent score provides a diagnostic score for BV which ranges from 0 (BVnegative) to 10 (BVpositive) based on presence and absence of bacterial morphotypes as viewed under a microscope (Nugent, Krohn & Hillier, 1991).
We calculated pvalues to compare within and betweenstratification variability using R’s builtin t.test function for the vaginal data, which had a binary stratification, and the aov function for the oral and skin datasets. The vaginal dataset was stratified by Amsel’s criterion, the oral dataset by condition and sampling site, and the skin microbiome dataset by Tanner scale of physical development (Oh et al., 2012). Note that we are not presenting these uncorrected pvalues as evidence that there is an interesting relationship between the microbiome and a given stratification, but rather are using pvalues as a way of measuring withinstratum heterogeneity compared to betweenstratum heterogeneity for the various measures.
Results
Application to the human microbiome
Vaginal microbiome
Like Srinivasan et al. (2012) and many others in the field, we observe greater diversity in BV positive specimens using a variety of diversity and richness measures (Fig. S1). In particular, this is true for BWPD_{θ} for a variety of values of θ (Fig. S2).
Measure  Amsel accuracy  Nugent R^{2}  Amsel pvalue  Mean rank 

PD_{u}  0.834  0.737  1.84E−35  1.3 
BWPD_{0.25}  0.836  0.735  1.98E−35  2.0 
Simpson (Family)  0.817  0.735  4.11E−33  4.0 
BWPD_{0.5}  0.827  0.700  3.33E−33  4.3 
Shannon (Family)  0.813  0.724  2.28E−32  5.3 
Phylo. entropy  0.831  0.679  1.81E−31  5.7 
Chao1 (Family)  0.813  0.704  6.27E−31  7.0 
^{0.5}D(T)  0.818  0.658  7.47E−29  8.0 
^{0.25}D(T)  0.809  0.682  2.25E−30  8.3 
Phylo. quad. entropy  0.813  0.648  7.89E−30  9.0 
BWPD_{1}  0.795  0.611  5.38E−28  11.0 
Chao1 (OTU)  0.766  0.488  1.64E−23  12.7 
ACE (Family)  0.766  0.491  2.82E−11  13.0 
ACE (OTU)  0.764  0.469  6.82E−22  13.7 
Shannon (OTU)  0.758  0.380  5.27E−16  14.7 
Simpson (OTU)  0.697  0.191  1.42E−07  16.0 
In the vaginal data, phylogenetic measures of alpha diversity have better crossvalidation accuracy for the Amsel classification and better correlation with the Nugent score than discrete OTUbased measures (Table 2). All measures were somewhat accurate in identifying community state, with even the worst performers classifying almost 70% of samples correctly. BWPD_{0.25}, BWPD_{0.5}, PD_{u}, and phylogenetic entropy perform well predicting BV status. Correlation with Nugent score varies from 0.19 using Simpson (OTU) to 0.74 using PD_{u}. OTUbased measures rank in the bottom half of the measures tested, and below all phylogenetic measures.
In the hierarchical clustering of alpha measures on the vaginal dataset, phylogenetic methods are separated from OTUbased methods (Fig. 2). BWPD_{θ} is similar to different extant phylogenetic alpha diversity measures for different θ. The Simpson and Shannon diversity measures cluster together, as do the ACE and Chao1 richness measures.
Figure 3 shows values of BWPD_{θ} calculated before (xaxis) and after (yaxis) a single rarefaction to 523 sequences per sample. Samples for which the BWPD_{θ} value changes little lie close to the blue line, which shows the case of no difference between original and rarefied samples. Increasing θ, which corresponds to increased use of abundance information, reduces the change in BWPD_{θ} induced by rarefaction. Phylogenetic quadratic entropy and phylogenetic entropy both show behavior similar to BWPD_{1}, with rarefaction introducing little effect.
It might be possible to formalize a statement to this effect by computing the expectation of these alpha measures under rarefaction. However, computing the expectation for BWPD_{θ} under rarefaction does not appear to be straightforward: the methods of Dremin (1994) might be applicable in this setting, however, even the integer moments of the hypergeometric distribution are complicated and the noninteger moments are bound to be very complex. We have, however, shown in the Appendix that the expectation of phylogenetic quadratic entropy under rarefaction to k sequences assigned to the tips of a phylogenetic tree is $\mathbb{E}\left[{\mathrm{PQE}}_{k}\right]=\frac{k1}{kn(n1)}\sum _{i}{\ell}_{i}{d}_{i}(n{d}_{i})$ where d_{i} is the number of sequences falling below edge i and ℓ_{i} is the length of edge i. This is almost identical to the unrarefied value of phylogenetic quadratic entropy, i.e. $\mathrm{PQE}=\frac{1}{{n}^{2}}\sum _{i}{\ell}_{i}{d}_{i}(n{d}_{i}).$ Thus it is not surprising to see that the expectation of PQE under rarefaction is very close to the original value (Fig. S3) for reasonably large k and n.
Oral microbiome
As previously observed by Griffen et al. (2011a), we find generally higher diversity in samples from dysbiotic patients (Fig. 4). We evaluated the ability of each alpha diversity measure to predict whether a sample came from an individual with periodontitis, regardless of sample collection site, using the above methods.
In the oral dataset, phylogenetic alpha diversity measures incorporating abundance gave the best predictions of community state (Table 3, Fig. 4). In contrast, classical phylogenetic diversity performed less well. These results were almost identical in terms of rank order after applying additional quality filtering steps to correct sequencing errors and remove potentially chimeric sequences (Table S1).
Measure  Diseased status accuracy  ANOVA pvalue  Mean rank 

Phylo. entropy  0.791  4.97E−09  1.0 
BWPD_{0.5}  0.782  6.50E−09  2.0 
BWPD_{0.25}  0.755  7.16E−08  4.0 
Phylo. quad. entropy  0.770  2.47E−07  4.0 
Simpson (Family)  0.776  1.45E−06  4.0 
^{0.5}D(T)  0.734  4.74E−06  6.5 
^{0.25}D(T)  0.735  4.33E−05  7.0 
PD_{u}  0.691  6.37E−06  8.5 
Shannon (Family)  0.734  5.32E−05  8.5 
BWPD_{1}  0.698  3.57E−04  9.5 
Chao1 (OTU)  0.685  9.94E−04  11.0 
ACE (OTU)  0.682  1.30E−03  12.0 
Simpson (OTU)  0.676  2.39E−02  13.5 
Shannon (OTU)  0.672  1.31E−03  14.0 
Chao1 (Family)  0.674  2.64E−01  15.0 
ACE (Family)  0.663  1.82E−01  15.5 
OTUbased methods and phylogenetic methods are not as separated in a hierarchical clustering as for the vaginal dataset (Fig. S6). However, many of the same pairings are present in both clusterings: BWPD_{0.5} with phylogenetic entropy, BWPD_{1} with quadratic entropy, Simpson with Shannon, and ACE with Chao1. Interestingly, PD_{u}, BWPD_{0.25}, and the ^{q}D(T) measures all cluster with the discrete richness measures ACE and Chao1.
Like the vaginal dataset, incorporating abundance information decreases the effect of rarefaction on BWPD_{θ} values (Figs. S4 and S5).
Skin microbiome
To further assess resolution and robustness of phylogenetic diversity measures, we considered skin microbiome data from a study by Oh et al. (2012). This study tracked the changes of the skin microbiome through “Tanner” developmental stages of adolescence (Tanner & Whitehouse, 1976). Because there are five Tanner stages, and they do not have a monotonic relationship with skin microbiome diversity (Oh et al., 2012), we focused on ANOVA pvalues to see if the diversity measurements had small withinstage heterogeneity compared to betweenstage heterogeneity. To compare the ANOVA pvalues associated with the diversity measurements across the various datasets, we ranked the pvalue of the diversity measures from lowest to highest for each dataset individually. We averaged these ranks to gain an overall measure of performance. The results again show phylogenetic measures generally performing better than OTUbased measures (Table 4). This finding holds true even after removing potential chimeras (Table S1). In this case, a light weighting or no weighting of phylogenetic diversity by abundance performed better than full abundanceweighting.
Ac  N  Pc  Vf  Mean rank  

PD_{u}  3.34e−02  4.94e−03  3.03e−03  2.28e−04  4.50 
BWPD_{0.25}  4.22e−02  1.32e−03  6.37e−03  5.86e−04  5.25 
BWPD_{0.5}  8.54e−02  9.85e−05  3.65e−02  5.85e−03  6.00 
Shannon (OTU)  6.61e−02  9.48e−02  9.65e−02  1.05e−05  6.50 
Chao1 (OTU)  8.00e−02  6.46e−03  3.98e−03  3.18e−03  6.75 
Phylo. quad. entropy  2.52e−01  1.12e−05  4.99e−01  1.67e−01  7.50 
Phylo. entropy  1.37e−01  1.15e−03  1.55e−01  2.09e−02  7.75 
^{0.5}D(T)  8.91e−01  5.63e−04  3.84e−03  9.09e−01  8.25 
^{0.25}D(T)  7.00e−01  2.27e−03  2.35e−03  9.41e−01  8.50 
BWPD_{1}  3.09e−01  5.95e−05  6.65e−01  5.41e−01  8.50 
^{0}D(T)  4.42e−01  1.05e−02  1.38e−03  9.38e−01  8.75 
Simpson (OTU)  9.38e−02  4.01e−01  8.49e−01  1.01e−04  8.75 
Notes:
 Ac

antecubital fossa
 N

nares
 Pc

popliteal fossa
 Vf

volar forearm
Applications summary
In all three of the datasets investigated, abundanceweighted phylogenetic diversity measures showed good performance to distinguish between community states: between “normal” and dysbiotic samples in the oral and vaginal microbiomes, and between developmental stages in the skin microbiome. Notably, the best distinguishing measure in each dataset was phylogenetic; in addition BWPD_{0.25} and BWPD_{0.5} were the only measures that were in the top four for all datasets. The result that partial abundance weighting performs well corresponds to analogous results for beta diversity, where an intermediate exponent for “generalized UniFrac” was the most powerful (Chen et al., 2012).
Discussion
Phylogenetic alpha diversity measures were more closely related to community state than were discrete measures based on OTU clustering for the datasets investigated here. This result is especially interesting given that the Simpson index, the Shannon index, or counting applied to OTU tables are very common ways of characterizing microbial diversity (Fierer et al., 2007; Grice et al., 2009; Hill et al., 2003; Dethlefsen & Relman, 2011). As also noted by Aagaard et al. (2012), we find that measurements of diversity using taxonomic classification can be useful in describing communities, and in fact perform much better than the same measurements of diversity applied to OTU counts; however, this approach requires a taxonomically well characterized environment. Our results can be viewed as an experimental confirmation of the notion that incorporating similarity between species is important to get sensible measures of diversity, which has been advocated by many, including most recently by Leinster & Cobbold (2012).
We find that classical phylogenetic diversity is sensitive to sampling depth, underestimating the true value in small samples. Biases have also been described for diversity measures using OTU tables (Gihring, Green & Schadt, 2012). In contrast, we observe that some abundanceweighted phylogenetic measures are relatively robust to varying levels of sampling.
These results did not appear to be the result of sequencing issues. In principle, OTU methods could have performed badly because of errorprone and chimeric sequences inflating the number of OTUs. Although this is a real danger for OTU quantification, in this study its impact appears to be limited–similar results were obtained with the oral data after denoising and chimera removal and the skin data (which used Sanger sequencing) after chimera removal.
We note that on our data, nonphylogenetic measures applied to family level taxonomic groupings are generally more discriminating than the corresponding measures applied to OTUs. This may be because our sequences are from the human microbiome, and taxonomic classification is especially welldeveloped in that setting; in particular the taxonomic names may already be defined in a way that corresponds to dysbiosis. Thus this particular difference may not continue to be true in a taxonomically lesswellcharacterized environment.
As of the publication of this paper, no abundanceweighted phylogenetic alpha diversity measures are implemented in either mothur (Schloss et al., 2009) or QIIME (Caporaso et al., 2010), two of the most popular tools for analysis of microbial ecology data. Although the fact that abundanceweighted phylogenetic diversity measures performed very well for the three datasets investigated here does not imply that they are best in general, we suggest that abundanceweighted phylogenetic measures be given greater consideration for microbial ecology studies. For this to happen, implementations in commonly used microbial ecology software packages will be needed, in addition to our implementation and that of the picante R package (Kembel et al., 2010).
Supplemental Information
Supplementary mathematical results, Tables, and Figures
PDF containing a derivation and a more general description of weighted PD, as well as a number of supplementary figures and tables.