Temperature is a key factor that directly or indirectly governs the performance of biochemical reaction rates, physiological rates (e.g., respiration and photosynthesis), and even ecological rates (e.g., prey encounter rate). Understanding how biological rates respond to changes in environmental temperature (the thermal performance curve, TPC; Fig. 1) is important for ecological and comparative evolutionary analyses of thermal physiology, for better predicting how climate change will influence the dynamics of populations, communities, and ecosystems (Brown et al., 2004; Pörtner et al., 2006; Dell, Pawar & Savage, 2011; Hoffmann & Sgrò, 2011; Schulte, Healy & Fangue, 2011; Pawar, Dell & Savage, 2015). Another example of such analyses involves testing the hypothesis of metabolic cold adaptation (MCA; e.g., see Seibel, Dymowska & Rosenthal, 2007; White, Alton & Frappell, 2012; Clarke, 2017), according to which cold-adapted individuals exhibit higher metabolic rates at low temperatures (well below Tpk; see Fig. 1) than individuals adapted to higher temperatures.
The TPCs of fundamental biological rates (traits) are generally unimodal, and biological rate versus temperature relationships are typically well-described by mathematical models that quantify four key features of the response: the temperature where the performance peaks (Tpk), the rate performance at a reference temperature (B0), typically well below Tpk within its operational temperature range (Pawar et al., 2016), the rise of the rate up to Tpk (E), and the fall after Tpk (ED) (Fig. 1). The normalised rate value B0 is particularly important, as it allows rate performance to be standardised for comparison across individuals and species (Gillooly et al., 2001). In particular, the inference of normalised rate values at a reference temperature between species is key for studying MCA, or for comparisons of the performance of different biological rates (e.g., photosynthesis and respiration) at a common temperature (e.g., see Padfield et al., 2017).
Partly mechanistic models that explicitly link a cellular, organismal, or population rate’s value to the temperature-dependence of the underlying biochemical kinetics (e.g., Johnson & Lewin, 1946; Sharpe & DeMichele, 1977; Schoolfield, Sharpe & Magnuson, 1981; Ikemoto, 2005; Corkrey et al., 2012; Hobbs et al., 2013; DeLong et al., 2017) are becoming increasingly popular for quantifying empirically observed TPCs (Hochachka & Somero, 2002). Such models have occasionally received criticism on the grounds that they only constitute phenomenological statistical descriptions, as their assumptions are too simplistic and cannot be directly mapped onto physiological or ecological rates, which should be driven by a far more complex interplay of processes (e.g., Clarke, 2004; Clarke & Fraser, 2004; Clarke, 2006; Clarke, 2017; but see Gillooly et al., 2006). Nevertheless, these models continue to be used in the literature as they can adequately fit a large variety of experimentally determined TPCs, enabling the quantification of various aspects of the shape of the performance curve.
Among these models, the Sharpe-Schoolfield model (Schoolfield, Sharpe & Magnuson, 1981) has been frequently used in recent studies to address both ecological and evolutionary questions about the effects of temperature change on individuals, populations, and communities (Barmak et al., 2014; Barneche et al., 2014; Fand et al., 2014; Simoy, Simoy & Canziani, 2015; Barneche et al., 2016; Padfield et al., 2016; Vimercati et al., 2016). In particular, the B0 calculated from fitting this model to TPC data has been used to compare the rate performance of different species (e.g., Wohlfahrt et al., 1999), treatments (e.g., Padfield et al., 2016), or developmental stages (e.g., Hopp & Foley, 2001) at a reference temperature, Tref. However, the implicit assumption made by these studies, that B0 is exactly the normalised rate value at Tref, is only valid under certain conditions (see the Theoretical context section), and may in fact heavily overestimate the actual rate value at that temperature (Schoolfield, Sharpe & Magnuson, 1981) (Fig. 1). Such an overestimation could introduce unexpected biases not only in comparisons of temperature-normalised rates (among e.g., species, treatments, or developmental stages), but also in other analyses (e.g., exaggerating the rate performance of cold-adapted species could provide false support for MCA in its absence).
Here, we study the likely incidence of this overestimation of the normalised B0 obtained by fitting the Sharpe-Schoolfield model to data of biological rates measured at a range of temperatures. To this end, we investigate the conditions under which this overestimation becomes particularly pronounced by analysing 1,758 real thermal performance curves across diverse ectotherm species and rates. We then show how conclusions based upon biased B0 estimates can compromise the results of an important application of TPC models—detecting metabolic cold adaptation. Finally, we present alternative methods for obtaining realistic estimates of rate performance at a reference temperature under different scenarios of usage of the model.
The Sharpe-Schoolfield model proposes that the effect of temperature on the performance of a biological rate largely reflects the thermal sensitivity of a single rate-limiting enzyme that becomes deactivated at both extreme-high and low temperatures (Schoolfield, Sharpe & Magnuson, 1981). Nevertheless, low temperature inactivation is hard to detect, possibly because it requires multiple rate measurements at low temperatures for inferring accurate parameter estimates (see Pawar et al., 2016). Such resolution is typically lacking in currently available datasets of thermal performance. For this reason, it is usually more parsimonious to use a simpler version of the full model that ignores low-temperature enzyme inactivation (Fig. 1): (1) Here, B is the value of the rate at a given temperature T (K), E is the activation energy (eV), which controls the rise of the curve up to the peak, ED is the de-activation energy (eV), which sets the rate at which the rate falls after the peak, Th (K) is the temperature at which 50% of the enzyme units are inactive, and k is the Boltzmann constant (8.617 ⋅ 10−5 eV K−1). B0 is the value of the rate at a reference (normalisation) temperature Tref—i.e., B0 ≈ B(Tref)—assuming enzyme units are fully operational at that temperature. The model can also be reformulated without normalisation, but then B0 would lose any biological meaning (see Section A2.1 in Appendix S1). The assumption of this model variant is that, at low temperatures, the population of the key enzyme remains fully active, with low rate performance values being driven by the decreased amount of kinetic energy which causes biochemical reactions to proceed at a very low rate.
Schoolfield, Sharpe & Magnuson (1981) originally suggested using Tref = 25 °C, a choice they considered appropriate for most poikilotherm species. This suggestion has frequently been followed (see Table A1 and Fig. A1 in Appendix S1). However, when non-negligible loss of enzyme activity occurs at Tref—e.g., due to denaturation or inactivation of some other component of the metabolic pathway— B0 overestimates the real value of the rate at that temperature (B(Tref)) (Ikemoto, 2005). This is particularly problematic for comparisons of B0 across diverse species, as significant temperature-mediated inactivation may begin at very different temperatures, potentially leading to different degrees of inaccuracy in the B0 estimates.
The inflation of rate value at reference temperature (B0)
We first consider why B0 can be biased. For this, in addition to the parameters in Eq. (1) (B0, E, ED, Th, Tref), two extra parameters need to be defined to capture all aspects of the shape of the TPC: the temperature at which the TPC peaks (Tpk), and the performance at that peak (Ppk; see sections A2.2-3 in Appendix S1 for their derivations). Setting T = Tref in Eq. (1) shows that the amount by which B0 will deviate from B(Tref) is equal to the denominator of Eq. (1): (2)
When Tref is much lower than Th (the temperature at which 50% of the enzyme units become inactive), B0 ≈ B(Tref) because the denominator ≈1. On the other hand, as the chosen Tref approaches Th—or exceeds it—, B0 will increasingly deviate from B(Tref). In any case, B0 will always be greater than B(Tref) (at best, by a negligible amount) because of the denominator of Eq. (2). To explore this behaviour numerically across real TPCs of a single biological rate (for consistency reasons), we compiled a dataset of phytoplankton growth rates versus temperature (a combination of the López-Urrutia et al., 2006; Rose & Caron, 2007; Bissinger et al., 2008; Thomas et al., 2012 datasets), containing 672 species/strains with growth rate being measured at multiple temperatures per species/strain. To each TPC in this dataset, we fitted the Sharpe-Schoolfield model across a range of Tref values (−10 °C to 30 °C) using the nonlinear least-squares method (Levenberg–Marquardt algorithm). In order to eliminate less reliable fitted parameter estimates, we rejected fits with (i) an R2 below 0.5 (raising this cutoff to 0.9 yielded qualitatively identical results) or (ii) fewer than four data points either before or after Tpk. Based on these criteria, the number of accepted fits per Tref value ranged from 121 to 126 out of 672 starting TPCs (for an R2 cutoff of 0.5). The variation in the number of retained parameter estimates is due to the different Tref values that we used which can cause small changes in the quality of the fit, leading to the occasional exclusion of some fits with R2 values very close to the cutoff. The computer code—along with the names and versions of all modules or packages used—for the main analyses of this study (including fitting the Sharpe-Schoolfield model to TPCs) is available at https://github.com/dgkontopoulos/Kontopoulos_et_al_temperature_normalisation_2017.
Identification of conditions that lead to a severely overestimated B0
We next determine the characteristics of TPCs (parameter combinations of the Sharpe-Schoolfield model) that lead to a severely overestimated B0. This is a complex problem and not just a matter of determining the difference between Th and Tref, because the denominator of Eq. (2) also includes the ED parameter. As ED influences the relationship between Th and Tpk (see section A2.2 in Appendix S1), it is necessary to take into account the interplay of Th and Tref with Tpk. To address this, we use a conditional inference tree (a machine learning algorithm; Hothorn, Hornik & Zeileis, 2006) to determine the TPC model’s parameter combinations that lead to strong overestimation.
For maximising the power of the machine learning method we used a larger dataset—a subset of the Biotraits database (a substantial collection of performance measurements of ecological traits and physiological rates at multiple temperatures from a wide range of species; Dell, Pawar & Savage, 2013) combined with additional data extracted from the published literature (see section A5 in Appendix S1). We first fitted the Sharpe-Schoolfield model to each empirical TPC in this dataset. As the dataset is very diverse—including, among others, rates from bacteria, macroalgae, and terrestrial plants—we set Tref to 0 °C so that we could obtain reasonable estimates (i.e., at a temperature below Tpk) of B0 and B(Tref) even for cold-adapted species with low Tpk values. It is worth stressing that such a low Tref value is indeed appropriate because, as mentioned in the “Theoretical context” section, experimentally determined TPCs generally do not possess the required resolution for detecting low-temperature enzyme inactivation. Thus, it is safe to assume that rate estimates will be reasonable at low temperatures, even at 0 °C. In total, 1,758 species/individual curves were produced from this dataset. We did not filter the results based on goodness of fit metrics because we are interested in all the different parameter combinations regardless of how well they describe the data.
We then analysed this ensemble of fitted curves through the construction of a conditional inference tree from the data (see section A3.1 in Appendix S1 for details). More precisely, we specified a binary response variable: B0 is above or below Ppk. The choice of Ppk as the cutoff was due to the very high classification performance of the resulting model, especially when compared to other possible cutoffs (e.g., a three-fold increase from B(Tref)) which performed poorly. The predictor variables were the differences between (i) Tpk and Th, (ii) Tpk and Tref, and (iii) Th and Tref for each fit. The model was constrained by setting the maximum allowed p-value at each internal node below 10−10. Its performance was evaluated with the Matthews correlation coefficient (MCC; Matthews, 1975), a metric often used for machine learning models with a binary response. This metric takes values from −1 (complete disagreement with data) to 1 (complete agreement with data) and is considered reliable even when the different response states of the model (in this case B0 > Ppk and B0 < Ppk) are not evenly sampled. To further ensure that the model was accurate and generalisable, we also estimated its performance against a distinct dataset of 405 TPCs (testing dataset). The data for these curves were also part of the Biotraits database—similarly to the 1,758 curves—but were not used for training the model.
Implications of the inflation for investigations of thermal adaptation
Among other ecological and evolutionary questions, the effects of adaptation to different thermal environments on the shape of the TPC (e.g., see Huey & Kingsolver, 1989; Angilletta et al., 2003; Angilletta, 2009; Angilletta, Huey & Frazier, 2010; Clarke, 2017) can be investigated using estimates from the Sharpe-Schoolfield model. For example, a study may aim to uncover whether there are any trade-offs between performance at lower and higher temperatures by correlating B0 and Tpk (e.g., a negative correlation would suggest that high performance at warmer temperatures would come at the cost of lower performance at colder temperatures). Overestimating B0—especially for cold-adapted species with a Th value close to Tref—may potentially introduce such correlations where none existed, serving as false-positive evidence for the MCA hypothesis.
To explore this possible issue, we generated a synthetic dataset of 1,000 negatively skewed TPCs, in which MCA was absent. While a real-world dataset of a single rate could also be used for this purpose (e.g., the phytoplankton growth rates dataset in Fig. 2), we resorted to a simulation in order to obtain a bigger sample and, more importantly, to ensure that the input data were not the outcome of the process of MCA. To this end, each curve was obtained by sampling from a distinct realisation of the beta distribution, with shape parameters (α and β; see section A4 in Appendix S1) that were in turn sampled from normal distributions (Table 1). Skewness was assessed by examining the α and β parameters of each simulated curve. Curves that were not negatively skewed (i.e., those where α ≤ β) were removed and new ones were produced in their place. We also randomly varied the width and the height of the curves using two normally distributed parameters j and k. As the minimum Tpk in this simulation was at 8.23 °C, we arbitrarily set Tref to 7 °C, but any other Tref value below 8.23 °C could be used as well. Note that a different run of the simulation would most likely lead to a different minimum Tpk value, which would potentially require a change in the chosen value of Tref. To enforce the absence of MCA, we made sure that, in this population of curves, there was no significant association between the performance at a Tref of 7 °C, and the thermal optimum (r = − 0.03, 95% CI [−0.09 to 0.03], p = 0.35).
|β||α − i,|
|Final curve width||original width ⋅j,|
|Final curve height||original height +k,|
We then fitted the Sharpe-Schoolfield model to each synthetic curve and obtained parameter estimates where possible. Following this, we performed two different tests for MCA, and compared the results when using B0 versus B(Tref). For the first test, the estimates were split onto two groups: (i) those originating from curves with Tpk < 15 °C (colder-adapted species), and (ii) those with Tpk ≥ 15 °C (species adapted to warmer temperatures). We next tested whether the distributions of the normalised rates (B0 and B(Tref)) were significantly different using the two-sample Kolmogorov–Smirnov test (Corder & Foreman, 2014). The second test consisted of a simple correlation between the normalised rate values (B0 and B(Tref)) and the corresponding Tpk values.
Conditions that lead to different degrees of inflation of B0 estimates
Using the phytoplankton growth rates dataset, we show that, contingent on the difference between Th and Tref, B0 can be considerably greater than B(Tref) (Fig. 2). More precisely, the deviation of B0 from B(Tref) decreases nonlinearly with the difference between Tref and Th (A). In many circumstances, the deviation of B0 is extreme, becoming even greater than the rate value at or near optimum temperature, Ppk (B).
The search for thermal response parameter combinations that lead to B0 being above Ppk (highly overestimated) or below it (less overestimated) resulted in a conditional inference tree with four terminal nodes (Fig. 3). In each of those nodes, B0 was nearly exclusively below or above Ppk. This machine learning model exhibited high performance both on the training dataset (MCC = 0.954) and the testing dataset (MCC = 0.824; section A3.2 in Appendix S1). The sets of thermal response parameters in which B0 was greater than Ppk almost always had either a Th − Tref difference that was less than 0.6 (relatively narrow curves), or a Tpk − Tref difference of 49.1 or lower (relatively wide curves).
Impacts of the overestimation of B0 on tests for MCA
In total, we were able to obtain thermal response parameter estimates for 968 simulated curves, as the nonlinear least-squares algorithm failed to converge on solutions for the remaining 32. In the first test for MCA the distributions of B0 estimates differed between the two groups (D = 0.18, p = 1.7⋅10−6), with species adapted to colder temperatures having a higher median value of B0 (Fig. 4A, light blue violin plots). In contrast, the two distributions of B(Tref) estimates were statistically indistinguishable (D = 0.07, p = 0.21), as expected (Fig. 4A, green violin plots). The overestimation of B0 also affected the second MCA test, as a weak negative correlation between B0 and Tpk was detected, but not between B(Tref) and Tpk (Figs. 4B and 4C). These results indicate that the inflation of B0 can provide false support for the MCA hypothesis, even for datasets with complete absence of this pattern.
In this paper we have addressed the consequences of estimating the value of a rate at a reference temperature, B0, using the Sharpe-Schoolfield model, but without satisfying one of its fundamental assumptions: that the key enzyme—which is responsible for the temperature dependence of the rate—is fully functional at the reference temperature. When this assumption is not met, B0 will overestimate the real rate performance at the reference temperature, B(Tref) (Ikemoto, 2005).
We explain how the discrepancy between B0 and B(Tref) arises and determine the conditions under which it becomes particularly pronounced using a machine learning approach (Fig. 3). The resulting conditional inference tree shows that B0 estimates will generally exceed the rate performance at the peak of the curve (Ppk) as long as: (i) Tpk − Th is less than ∼37.58 °C and Th − Tref is less than ∼0.6 °C, or (ii) Tpk − Th is greater than ∼37.58 °C and Tpk − Tref is less than ∼49.11 °C. In any other case, B0 would most likely be smaller than Ppk, although its inflation may well still be of concern. Using a synthetic dataset, we then demonstrate that wrongly assuming B0 = B(Tref) can lead to erroneous conclusions in analyses of thermal adaptation, as the overestimation of B0 can mimic the effects of metabolic cold adaptation (Fig. 4) (a Type I error).
It is important to note that while we focus on the four-parameter version of the Sharpe-Schoolfield model in this study, the inflation of B0 estimates also mathematically occurs in the variant of the model that assumes enzyme inactivation at both high and low temperatures. Thus, caution is warranted regardless of the model variant that is chosen. Beyond this issue, fitting the simpler model instead of its full counterpart may potentially give rise to other inherent biases but, to our knowledge, a thorough comparison of the two model variants across different organismal groups and rates is not available.
As mentioned before, previous studies have tended to set the Tref—usually at a value of 25 °C—while fitting the Sharpe-Schoolfield model without considering the potential inflation of B0 (Table A1 and Fig. A1, Appendix S1). Whether results of these studies have been compromised by an inappropriate use of Tref is impossible to determine definitively because most of these studies report either Th or Tpk estimates, whereas the machine learning model depends on both (see the ‘Conditions leading to a severely overestimated B0’ section), along with the value of Tref. If these data were available, using the machine learning model that we generated would provide a straightforward procedure to identify cases where B0 is highly likely to be extremely overestimated (i.e, greater than Ppk). In fact, the only study where all necessary parameter estimates were reported for all fitted curves was that by Padfield et al. (2016). In that study, the maximum difference of Th from Tpk is 2.49 °C, and the minimum difference of Tref from Th is 5.79 °C, which, according to the machine learning model (see Fig. 3), are sufficient for the B0 estimates to be below those of Ppk. Having said that, as we showed in this paper, the fact that the overestimation of B0 is not extreme does not necessarily rid any drawn conclusions of bias (e.g., the possibility of falsely detecting the effect of MCA).
In any case, it is crucial to point out that choosing an appropriate reference temperature (i.e., one that is low enough but within the temperature range that the species can endure) is not—on its own—a sufficient strategy to avoid the overestimation of B0. As different species or individuals will most likely not share a common Th value, the difference between Th and Tref will vary across the dataset (see Fig. 2). This approach could again lead to an exaggeration (which may however be very small) of some B0 estimates and is therefore not an elegant solution to the problem.
Comparisons of temperature-normalised rates of diverse species
When data span the entire TPC
For studies in which the end goal is to compare the performance of different species at a common temperature, the simplest approach would be to fit the Sharpe-Schoolfield model—with or without normalising B0 at a reference temperature—and compare estimates of B(Tref), calculated a posteriori. The confidence intervals around B(Tref) can then be estimated by bootstrapping. Another option to avoid the issue of rate overestimation is to consider fitting other models, such as the macromolecular rates model (Hobbs et al., 2013) or the enzyme-assisted Arrhenius model (DeLong et al., 2017).
When data only cover the rising part of the TPC
While the previous solutions are applicable to thermal response datasets that capture either the rise of the curve or its entirety, few studies report temperature performance measurements after the unimodal peak of the response (Dell, Pawar & Savage, 2011). Therefore, to obtain an estimate of baseline performance from a dataset that only covers the exponential rise component, one could instead fit the Boltzmann-Arrhenius model (e.g., see Gillooly et al., 2001), (3) which does not suffer from the problems of the Sharpe-Schoolfield model, as B(Tref) indeed simplifies to B0.
A second alternative model is the one that includes the Q10 factor (see Gillooly et al., 2001), i.e., the rate of change in biological rate performance after a temperature rise of 10 °C: (4) In this case, one would first estimate the value of Q10 from known rate values at two temperatures, and use it to calculate the rate value at the reference temperature: (5)
Regardless of which of these two models is chosen, careful attention must be paid to ensure that the biological rate increases exponentially across the entire temperature range, without signs of a plateau being reached. Otherwise, the estimates may yet again be biased.
Using the ‘intrinsic optimum temperature’ instead of Tref
Alternatively, baseline performance could be defined as the height of the curve at the temperature where the population of the key enzyme is fully active, which should be characteristic for each individual or species. In the Sharpe-Schoolfield model, the denominator indicates the percentage of enzymes that are active. Therefore, in the four-parameter variant of the model, the intrinsic optimum temperature could be estimated as the highest temperature at which this percentage is sufficiently high (e.g., at 99%). If, instead, the model of choice is the Sharpe-Schoolfield variant that also accounts for enzyme inactivation at low temperatures, there will be a unique temperature at which the enzyme population is 100% active. Otherwise, the intrinsic optimum temperature can also be obtained from the Sharpe-Schoolfield-Ikemoto (SSI) model (Ikemoto, 2005). This model integrates the law of total effective temperature—often used in studies of arthropod or parasite development—within the Sharpe-Schoolfield model, replacing Tref with the intrinsic optimum temperature. However, this model introduces an extra parameter and is more challenging to fit compared to the original Sharpe-Schoolfield model. To mitigate this problem, software implementations have been developed that reduce the computation time from often more than 3 hours (Ikemoto, 2008) down to less than a second (Shi et al., 2011; Ikemoto, Kurahashi & Shi, 2013).
Obtaining accurate estimates of temperature-normalised rate performance is of crucial importance—especially in the face of climate change—for comparisons of the same rate across different organisms, or different rates within an individual. In this context, our study explains why temperature-normalised rate estimates obtained using the Sharpe-Schoolfield model can be strongly exaggerated—in comparison to the true rate values—when one of the assumptions of the model is violated, and gives an example of possible consequences of this exaggeration. The suggestions that we provide to address this issue should be useful to the burgeoning studies on ectotherm thermal performance and climate change, both for performing meta-analyses and for determining appropriate temperature ranges in laboratory experiments.