An empirical evaluation of four variants of a universal species–area relationship

View article


The species–area relationship (SAR) is a fundamental ecological pattern that characterizes the change in species richness as a function of spatial scale. The SAR plays a central role in predicting the diversity of unsampled areas (Palmer, 1990), reserve design (Whittaker et al., 2005), and estimating extinction rates due to habitat loss (Brooks, da Fonseca & Rodrigues, 2004). Applications involving the SAR depend strongly on the form of the relationship (Guilhaumon et al., 2008) which is known to change with spatial scale (Palmer & White, 1994; McGlinn & Hurlbert, 2012). Despite the scale-dependence of the SAR, a simple non-scale dependent model (the power-law) is still the most commonly used model for the SAR (Tjørve, 2003).

The Maximum Entropy Theory of Ecology (METE) is a unified theory that shows promise for characterizing a variety of macroecological patterns including the species-abundance distribution, a suite of relationships between body-size and abundance, and a number of spatial patterns including the species–area relationship (Harte et al., 2008; Harte, Smith & Storch, 2009; Harte, 2011). METE adopts the inferential machinery of Maximum Entropy (MaxEnt; Jaynes, 2003) to solve for the most likely state of an ecological community (Haegeman & Loreau, 2008; Haegeman & Loreau, 2009) using only information on the total number of species, the total number of individuals, the total metabolic rate of all the individuals, and the area of the community.

METE predicts that all SARs follow a universal relationship between the exponent of a power-law characterizing the SAR at a particular scale and the ratio of richness and community abundance. The exponent of the SAR is scale dependent, decreasing with increasing spatial scale. Empirical evaluation of the theory suggests that METE is a promising model for the SAR (Harte et al., 2008; Harte, Smith & Storch, 2009); however, there are currently four different approaches to applying METE to predict the SAR and it is unclear which approach should be used due to a lack of empirical comparison.

There are two distinct versions of METE, recursive (where richness at different scales is obtained by consecutively halving or doubling of area; Harte, Smith & Storch, 2009) and non-recursive (where richness at different scales is obtained directly; Harte et al., 2008). These two versions predict somewhat different SARs. It is not clear a priori which of these versions of METE should be more accurate, and it has been suggested that the best approach should be based on empirical comparisons (Harte, 2011). In addition, the METE-SAR is derived using the species-abundance distribution (SAD). The SAD can either be predicted from N and S or the empirical distribution can be used. The most general use of METE for predicting diversity across scales relies on the use of the theoretical abundance distribution, but there have been no comparisons of METE-SAR predictions using theoretical and empirical SADs.

To understand which approach to METE is best for characterizing diversity across scales we conducted a thorough empirical comparison of the four different variants of the METE-SAR prediction: (1) recursive with predicted SAD, (2) recursive with observed SAD, (3) non-recursive with predicted SAD, and (4) non-recursive with observed SAD. Using 16 spatially explicit plant datasets we compared the form and accuracy of the predicted SAR across the four variations of METE at a wide-range of spatial scales and across a diverse set of plant communities with over 1000 species and 300,000 individual trees and herbs.


Downscaling richness

The METE approach to predicting the SAR is a two-step application of the maximum entropy formalism (MaxEnt): (1) MaxEnt is first used to predict the SAD which represents the probability that a species has abundance n0 in a community of area A0 with S0 species and N0 individuals, Φ(n0|N0, S0, A0), and (2) MaxEnt is then used to predict the intra-specific, spatial-abundance distribution which represents the probability that n out of n0 individuals of a species are located in a random quadrat of area A drawn from a total area A0, Π(n|A, n0, A0). The Π distribution is spatially implicit and does not contain information on the spatial correlation between cells. If the observed species abundance distribution is used instead of the METE distribution, then only the Π distribution is solved for using MaxEnt.

There are no adjustable parameters in METE, and the solutions to Π and Φ only depend on the empirical constraints and possible system configurations (Haegeman & Loreau, 2008; Haegeman & Loreau, 2009; Haegeman & Etienne, 2010). If the observed SAD is not used then constraints on the average number of individuals per species (N0/S0) and on the upper bound of the number of individuals N0 can be used to yield a truncated log-series abundance distribution (Fig. 1, Harte et al., 2008; Harte, 2011). To predict Π, METE places constraints on the average number of individuals per unit area (n0A/A0) and on the upper bound of the total abundance of a species n0. Although METE requires total metabolic rate to derive its predictions, this variable can be ignored when solving for the METE SAD or SAR (Harte et al., 2008; Harte, Smith & Storch, 2009; Harte, 2011).

An illustration of the process for downscaling species richness from A0 to A0/4 across the four variants of METE.

Figure 1: An illustration of the process for downscaling species richness from A0 to A0/4 across the four variants of METE.

The recursive approach uses either (A) the theoretical SAD (inset curve) or (B) the observed SAD (inset points) to predict richness at A0/2 and then the process is repeated to generate a prediction at A0/4. In contrast, the non-recursive approach uses either (C) the theoretical SAD or (D) the observed SAD to predict richness at A0/4 directly. S0 is the total number of species, N0 is the total number of individuals, and n0 is the vector of species abundances at the community scale (A0) abundance.

There are two ways to downscale (and upscale) the Π distribution (Fig. 1). There is a recursive approach (Figs. 1A and 1B) in which the constraints at A0 are used to solve for Π at A0/2, which provides new constraints (i.e., predicted SA0/2 and NA0/2) for solving Π at A0/4 and so on until richness is computed at every bisection of the total area A0/2i where i is a positive integer (Harte, Smith & Storch, 2009; Harte, 2011, p 159). The recursive approach continually updates its prior information as it downscales richness. Alternatively we can use a non-recursive approach (Figs. 1C and 1D) in which we solve for Π at any area based only on the constraints at A0 (Harte et al., 2008; Harte, 2011, p 243). The recursive approach may be more accurate because it continually upgrades its prior information or less accurate due to error propagation thus only empirical comparisons can determine which approach is best used for prediction (Harte, 2011, p 160).

Harte (2011) provides the derivations for the Π and Φ distributions, so here we will only highlight the most relevant equations for differentiating the four METE variants. The MaxEnt solution to maximizing entropy for Π is: Π(n|A,n0,A0)=1ZΠeλΠn where λΠ is the Lagrange multiplier and ZΠ is the partition function (Harte, 2011, Eq. 7.48). The partition function ensures normalization and it is defined as: ZΠ=n=0n0eλΠn=1eλΠ(n0+1)1eλΠ. The Lagrange multiplier can be solved for by defining Π in terms of its constraints which yields: n̄=n0AA0=n=0n0nxnn=0n0xnx1x(n0+1)xn0+11xn0+1 where x = e−λΠ to simplify notation. Although the METE prediction for Π can be solved numerically for any area, it is only known analytically for a special case in which the area A is half the total area A0 (Harte, 2011, Eq. 7.51): ΠnA02,n0,A0=11+n0. Equation (4) shows that METE predicts that all possible arrangements of n0 individuals are equally likely across two equal area quadrats. The flat distribution characterized by Eq. (4) is identical to the prediction offered by the Hypothesis of Equal Allocation Probability (HEAP) model and therefore the recursive application of Eq. (4) to downscale Π generates the same set of Π distributions as the HEAP model (Harte et al., 2005; Harte, 2007; Harte, 2011): ΠnA02i,n0,A0=q=nn0ΠqA02i1,n0,A0(q+1),i=1,2,3,. Thus for a given bisection of the total area (i.e., A = A0/2i) we can either use the recursive approach (Eq. (5)) or the non-recursive approach (Eq. (1)) to compute the Π distribution.

Expected richness is simply the sum of the individual probabilities of species occupancy. Table 1 gives the expressions for expected richness at A0/2 given the four possible combinations of the choice of the downscaling approach and the choice of SAD to use. The equations in Table 1 will also hold for finer spatial scales except for the recursive, theoretical SAD approach which requires downscaling the SAD as well (Harte, 2011, Eq. 7.63).

Table 1:
The four variants of METE formulated for expected species richness at A0/2, given either the recursive or non-recursive method of downscaling and either the theoretical or observed SAD.
These equations also hold for finer spatial scales except for the recursive, METE-SAD approach which requires downscaling the SAD as well (see Harte, 2011, Eq. 7.63). n0 is the vector of empirical abundances and n0,j is the abundance of the jth species at the community scale (A0).
Method of downscaling
Recursive downscaling (Eq. (5)) Non-recursive downscaling (Eq. (1))
METE-SAD (truncated log-series distribution) S0n0=1N01q=0n0Π(q|A02,n0,A0)(q+1)Φ(n0|N0,S0) S0n0=1N01eλΠ0ZΠ(n=0|A02,n0,A0)Φ(n0|N0,S0)
Observed-SAD (n0) j=1S01q=0n0,jΠ(q|A02,n0,j,A0)(q+1) j=1S01eλΠ0ZΠ(n=0|A02,n0,j,A0)
DOI: 10.7717/peerj.212/table-1

Empirical comparison

Testing METE’s predictions requires spatially explicit, contiguous data from a single trophic level. We carried out an extensive search for data that met these requirements. This search resulted in a database of 16 communities (Table 2; see Table S1 for additional details). All of the datasets are terrestrial, woody plant communities with the exception of the serpentine grassland which is herbaceous. In the woody plant surveys, the minimum diameter at breast height (i.e., 1.4 m from the ground) that a tree must be to be included in the census was 10 mm with the exception of the Cross Timbers and Oosting sites where the minimum diameter was 25 and 20 mm respectively. Where datasets contained time-series information we selected a single census year from each dataset to analyze. Harte (2011) suggested that MaxEnt models will perform best when a single process such as the presence of a past disturbance is not dominating the system and rather a multitude of different interacting processes are operating. With this in mind, we attempted to choose the survey years that were the longest amount of time from known stand-scale disturbances (e.g., hurricane events).

Table 2:
Summary of the habitat type and state variables of the vegetation datasets analyzed.
The state variables are total area (A0), total abundance (N0) and total number of species (S0).
Site names Habitat type Ref. A0 (ha) N 0 S 0
BCI Tropical forest a,b,c 50 205096 301
Sherman Tropical forest d 2 7622.5 174.5
Cocoli Tropical forest d 2 4326 138.5
Luquillo Tropical forest e 12.5 32320 124
Bryan Oak-hickory forest f,g,h 1.7113 3394 48
Big Oak Oak-hickory forest f,g,h 2 5469 40
Oosting Oak-hickory forest i 6.5536 8892 39
Rocky Oak-hickory forest f,g,h 1.44 3383 37
Bormann Oak-hickory forest f,g,h 1.96 3879 30
Wood Bridge Oak-hickory forest f,g,h 0.5041 758 19
Bald Mtn. Oak-hickory forest f,g,h 0.5 669 17
Landsend Old field, pine forest f,g,h 0.845 2139 41
Graveyard Old field, pine forest f,g,h 1 2584 36
UCSC Mixed evergreen forest j 4.5 5885 31
Serpentine Serpentine grassland k 0.0064 37182 24
Cross Timbers Oak woodland l 4 7625 7
Ranges 0.0064–50 669–205096 7–301
DOI: 10.7717/peerj.212/table-2

For each dataset we constructed fully-nested, spatially-explicit SARs (Type IIA, Scheiner, 2003). Recursive METE only makes predictions for bisected areas so we restricted our datasets to areas that were square or rectangular with the dimensional ratio of 2:1. Due to the irregular shape of the Sherman and Cocoli sites we defined two separate 200 × 100 m subplots within each site (Fig. S1). We then calculated the results for each of the two subplots and reported the average.

To assess the accuracy of METE’s predictions for the SAD and the four downscaling algorithms of the SAR, we computed the coefficient of determination about the one-to-one line: R2=1i(obsipredi)2/i(obsiobs¯i)2 where obsi and predi are the ith log-transformed observed and METE-predicted values (abundance for the SAD, richness for the SAR) respectively. Log transformed richness was used to minimize the influence of the few very large richness values and because relative deviations are of greater interest in evaluating SARs than absolute differences. We used the python package METE (White et al., 2013), as well as a suite of project specific R and python scripts for our analysis. All R and Python code used to generate these analyses is archived in the supplemental materials and also available on GitHub (


The four versions of METE all produced reasonable estimates of downscaled richness (Fig. 1). The R2 values ranged from 0.944 for the recursive, observed-SAD model up to 0.997 for the non-recursive, observed-SAD (Fig. 2). Despite the high coefficient of determination, the recursive approach deviated systematically from the empirical data by underpredicting richness (Figs. 2 and 3). This deviation became larger at finer scales (Fig. 2). In contrast, the non-recursive approach showed no systemic deviations. The SAD was well characterized by the METE predictions (R2 = 0.95); however, METE did on average predict slightly more uneven communities (i.e., predicted abundance was too low for rare species and too high for abundant species, Fig. S2). Overall, the inclusion of the observed SAD did not strongly improve the prediction of the SAR. For the non-recursive approach including the observed SAD improved the overall R2 from 0.984 to 0.997 (Figs. 3C and 3D), but the accuracy of the recursive model actually decreased with the inclusion of the observed SAD (R2 from 0.976 to 0.944, Figs. 3A and 3B).

Empirical species–area relationships and the four versions of the METE model across the 16 sites.

Figure 2: Empirical species–area relationships and the four versions of the METE model across the 16 sites.

The habitat type of each site is given above each panel. The empirical averages are the open circles, the recursive approach is the red lines, the non-recursive approach is the blue lines, the curves using the observed SAD are dashed and those using the METE-SAD are solid.
Observed vs predicted richness across datasets and spatial scales for the four METE SAR models.

Figure 3: Observed vs predicted richness across datasets and spatial scales for the four METE SAR models.

The R2-value is computed with respect to the one-to-one line (diagonal).

Results were broadly consistent across datasets, with the exception of the serpentine grassland and Cross Timbers oak woodland. The serpentine community displayed a steeper non-saturating SAR in contrast to the other datasets, and was the only dataset where the recursive downscaling approach was more accurate (Fig. 2O). The oak community displayed a sigmoidal SAR, and in contrast to the other study sites the inclusion of the observed SAD for the oak community resulted in a large improvement in the predicted SAR (Fig. 2P).


All four variations of METE performed well at predicting species richness across scales (all R2 > 0.94); however, some versions performed consistently better than others. The non-recursive approach outperformed the recursive version of METE in all but one dataset (the serpentine grassland). The recursive approach also showed small, but consistent, under-predictions for species richness. This means that the recursive approach predicted stronger intra-specific spatial aggregation than observed in the data. This finding is consistent with Harte’s (2011) comparisons of the species-level spatial abundance distribution in which the recursive approach predicted greater aggregation than the non-recursive approach. Given that the recursive approach provides a poorer fit to empirical data and can only be applied at particular scales (i.e., A0/2, A0/4, …), we recommend the use of the non-recursive approach for downscaling the SAR. However, the recursive approach is currently the only means of providing a METE-based prediction for the distance decay relationship via the hypothesis of equal allocation probabilities approach in Harte (2007), and the universal relationship between S/N and the slope of the SAR is currently only known for the recursive approach (Harte, Smith & Storch, 2009).

The SAR predictions were generally robust to using the predicted rather than observed SAD. Including the observed SAD increases the amount of information used to constrain the predictions, but it did not substantially increase the overall accuracy of the SAR predictions. This was primarily because the empirical SAD was well characterized by the METE-SAD, consistent with several other studies (Harte et al., 2008; Harte, 2011; White, Thibault & Xiao, 2012). Models in general, and MaxEnt models in particular, typically match empirical data better as increasing numbers of parameters or constraints are included in the analysis (Haegeman & Loreau, 2008; Roxburgh & Mokany, 2010; Harte, 2011). Therefore the naïve expectation for using the observed SAD is that the accuracy of the prediction should increase. However, this was generally only true for the non-recursive approach. This occurred because rarity and intraspecific aggregation interact in subtle ways to determine the shape of the SAR (He & Legendre, 2002; McGlinn & Palmer, 2009), and simply fixing one of these pieces of information does not guarantee improved predictive power. While using the observed SAD does improve the R2 for the non-recursive form of METE, it only does so by ∼1%. Therefore N and S are generally sufficient to accurately downscale richness using METE across a wide range of habitat types. This is important because it should be possible to model geographic patterns of richness and abundance at a single scale to predict the SAD (White, Thibault & Xiao, 2012) and then use those modeled values to predict richness across scales.

Although METE yields accurate predictions for the SAR, its current form has limitations with respect to its extent of applicability and its ability to tie in more broadly with species–time and species–time–area relationships (Rosenzweig, 1995; White et al., 2010). Specifically, METE predictions are thought to be most relevant for single trophic level datasets that are spatially contiguous and relatively environmentally homogenous (Harte, 2011), thus constraining the applicability of METE. At the large spatial scales that are often of interest in conservation planning it is likely that a standard application of METE will fail once species ranges do not occupy all of A0. These are also the scales at which the third phase of the triphasic SAR is expected to occur (Allen & White, 2003; Storch, Keil & Jetz, 2012), and METE does not predict this accelerating phase. However, McGill (2010) suggested that METE’s local predictions could be connected with a broad-scale theory to predict a triphasic SAR. Additionally, METE does not currently make predictions through time; however, Harte (2011) suggests using Maximum Entropy Production (Dewar, 2005). It should be possible to extend METE to predict the species–time–area relationship (White et al., 2010) because this pattern, like the SAR, can be modeled in terms of the number of unique individuals sampled per unit area and time (McGlinn & Palmer, 2009).

Recently there have been two critiques of the METE spatial predictions. The universality of the relationship between the slope of the recursive METE-SAR and the ratio of N/S was questioned on the basis that the predicted METE-SAR for subsets of a community cannot be added to yield the community based prediction (Šizling et al., 2011; Šizling, Kunin & Storch, 2013). However, Harte et al. (2013) argue that it is not a flaw of METE or a strong argument against universality.

Additionally, Haegeman & Etienne (2010) argued that a multivariate, spatially implicit analog of the univariate Π distribution that is derived using the non-recursive METE approach makes different predictions at different spatial scales (i.e., it is not scale consistent); however, they recognize that a spatially-explicit, scale-consistent version of this distribution may still exist. This critique does not apply to the recursive approach (DJ McGlinn, X Xiao, J Kitzes & EP White, unpublished data), but it may apply in other contexts such as the scaling of the SAD. The lack of scale-consistency in some of METE’s predictions suggests that the choice of the anchor scale (A0) may influence the theory’s predictions; however, our results, which spanned a range of anchor scales (0.0064 to 50 ha), did not appear to change systematically with scale. Furthermore, White, Thibault & Xiao (2012) demonstrated that the METE-SAD accurately characterized empirical SADs across studies with a wide range of anchor scales. Although METE may not provide a universally applicable model of spatial structure in ecological systems and some of its predictions will depend on the anchor scale, our results as well as others suggest that METE can be used as a practical tool for inferring patterns of diversity and abundance from relatively little information.

We examined the down-scaling of richness; however, many conservation applications are interested in up-scaling richness or predicting diversity at a coarse unsampled scale using information at a fine scale. Harte, Smith & Storch (2009) demonstrated that recursive-METE accurately up-scaled tropical tree richness. Currently a formal examination of upscaling using the non-recursive approach is lacking. Thus, future investigations should examine the ability of different variants of METE to upscale richness across a range of spatial scales and ecological systems.

METE represents a useful practical tool for accurately predicting species richness across spatial scales. Among METE’s four different approaches to predict the SAR, our analysis demonstrates that the non-recursive approach outperforms the recursive approach, and that using the observed rather than predicted SAD does not substantially improve accuracy. Therefore the METE prediction derived using the non-recursive approach and the predicted SAD will likely be the most useful for future applications involving the SAR.

Supplemental Information

Complete dataset summary

A more complete description of the datasets used to test the spatial predictions of METE.

DOI: 10.7717/peerj.212/supp-1

Stem maps for subsampled sites

Stem maps for four of the study sites that required subsampling. The maps illustrate how the spatial data was partitioned for the analysis. Each dot represents a stem record, the colored stems were included in the analysis and the grey stems were not. Note that the dimensions of the plots appear visually distorted, but in reality they are all rectangles with an aspect ratio of 2:1.

DOI: 10.7717/peerj.212/supp-2

Observed-predicted plot for the species-abundance distribution

The observed-predicted plot for the species abundance distribution (SAD) across all 16 communities. The line is the 1:1 line. The points are color-coded to reflect the density of neighboring points, with warm (red) colors representing higher densities and cold (blue) colors representing lower densities.

DOI: 10.7717/peerj.212/supp-3

Code and files needed to reproduce results

Code for working with the spatial predictions of John Harte and colleagues’ Maximum Entropy Theory of Ecology.

DOI: 10.7717/peerj.212/supp-4
12 Citations   Views   Downloads