How has our knowledge of dinosaur diversity through geologic time changed through research history?

Jonathan P. Tennant; Alfio Alessandro Chiarenza; Matthew Baron

doi:10.7717/peerj.4417

Jonathan P. Tennant¹, Alfio Alessandro Chiarenza¹, Matthew Baron²,³

¹Department of Earth Science and Engineering, Imperial College London, London, UK

²Department of Earth Science, University of Cambridge, Cambridge, UK

³Earth Sciences Department, Natural History Museum, London, UK

Published: 2018-02-19
Accepted: 2018-02-06
Received: 2017-05-24

Academic Editor: Laura Wilson

Subject Areas: Evolutionary Studies, Paleontology
Keywords: Dinosaurs, Diversity, Mesozoic, Cretaceous, Jurassic, Macroevolution, Publication bias, Paleobiology Database, Triassic, Extinction

Copyright: © 2018 Tennant et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Cite this article: Tennant JP, Chiarenza AA, Baron M. 2018. How has our knowledge of dinosaur diversity through geologic time changed through research history? PeerJ 6:e4417 https://doi.org/10.7717/peerj.4417

The authors have chosen to make the review history of this article public.

Abstract

Assessments of dinosaur macroevolution at any given time can be biased by the historical publication record. Recent studies have analysed patterns in dinosaur diversity that are based on secular variations in the numbers of published taxa. Many of these have employed a range of approaches that account for changes in the shape of the taxonomic abundance curve, which are largely dependent on databases compiled from the primary published literature. However, how these ‘corrected’ diversity patterns are influenced by the history of publication remains largely unknown. Here, we investigate the influence of publication history between 1991 and 2015 on our understanding of dinosaur evolution using raw diversity estimates and shareholder quorum subsampling for the three major subgroups: Ornithischia, Sauropodomorpha, and Theropoda. We find that, while sampling generally improves through time, there remain periods and regions in dinosaur evolutionary history where diversity estimates are highly volatile (e.g. the latest Jurassic of Europe, the mid-Cretaceous of North America, and the Late Cretaceous of South America). Our results show that historical changes in database compilation can often substantially influence our interpretations of dinosaur diversity. ‘Global’ estimates of diversity based on the fossil record are often also based on incomplete, and distinct regional signals, each subject to their own sampling history. Changes in the record of taxon abundance distribution, either through discovery of new taxa or addition of existing taxa to improve sampling evenness, are important in improving the reliability of our interpretations of dinosaur diversity. Furthermore, the number of occurrences and newly identified dinosaurs is still rapidly increasing through time, suggesting that it is entirely possible for much of what we know about dinosaurs at the present to change within the next 20 years.

Introduction

In the latter half of the 20th century, palaeobiology underwent a renaissance by adopting a more quantitative analytical approach to understanding changes in the fossil record through time (Valentine & Moores, 1970; Raup, 1972; Gould & Eldredge, 1977; Sepkoski et al., 1981; Van Valen, 1984; Sepkoski, 1996). This research was largely focussed around estimating patterns of animal diversity, extinction and speciation through time, and what the external processes governing these were. To this day, reconstructing the diversity of life through geological time remains one of the most crucial aspects of palaeobiology, as it allows us to address broader questions about the evolution of life and what the mechanisms of extinction and recovery are. These pioneering analyses were largely based on an archive of range-through taxa of marine animals, known as the ‘Sepkoski Compendium’. More recently, analytical palaeobiology has had a second wave of innovation, in part due to development of large databases that catalogue fossil occurrences and associated data such as the Paleobiology Database (www.paleobiodb.org), and also due to development of increasingly sophisticated analytical subsampling (Alroy, 2000a, 2003, 2010a; Starrfelt & Liow, 2016) and modelling (Smith & McGowan, 2007; Lloyd, 2012) techniques. Together, these are helping to provide new insight into how we can use the fossil record to understand the large-scale evolutionary patterns and processes that have shaped the history of life.

All of these studies, both older and more recent, are underpinned by a single principle, in that they rely on the recorded number of identifiable fossiliferous occurrences present through geological time. Despite meticulous work to ensure that these databases and compendia represent the best possible records of historical trends, there has been continuous discussion as to the accuracy of the data, and the extent to which estimates of palaeodiversity might be confounded by bias. These biases include factors such as heterogeneous sampling intensity, fossiliferous rock availability, and variable depth of taxonomic research (Raup, 1972, 1976; Uhen & Pyenson, 2007; Benton, 2008a, 2008b; Marx & Uhen, 2010; Tarver, Donoghue & Benton, 2011; Smith, Lloyd & McGowan, 2012; Smith & Benson, 2013).

In 1993, Sepkoski added an additional dimension to these studies by assessing how database compilation history, through changes in taxonomy, stratigraphic resolution, and sampling, influences the shape of macroevolutionary patterns (Sepkoski, 1993). Based on comparison of the two compendia built in 1982 and 1992, Sepkoski (1993) found that in spite of numerous taxonomic changes over 10 years, the overall patterns of diversity for marine animals remained relatively constant, with the main notable change being that overall diversity was consistently higher in the 1992 compilation. Alroy (2000a) similarly showed that database age does not appear to have an influence on North American mammal diversity estimates, and Alroy (2010c) further demonstrated that diversity estimates based on data from the Paleobiology Database were proportionally similar to either the genus- or family-level results based on Sepkoski’s original compendium. At the present, there are three main arguments regarding the historical reliability of diversity curves (Sepkoski et al., 1981; Sepkoski, 1993; Alroy, 2000a): first, that because independent datasets produce similar diversity curves, they are converging on a common signal that reflects either a real evolutionary, fossil record structure, or taxonomic phenomenon; second, that the addition of new data to existing compilations should yield only minor changes to resulting diversity estimates; and third, that the addition of new data can potentially dramatically alter the shape of diversity curves (counter to the first and second arguments). At the present, the first argument appears to be the best supported by analytical evidence (Sepkoski, 1993; Alroy, 2000b).

However, besides Sepkoski and Alroy’s work, relatively little consideration has been given to how publication or database history can influence macroevolutionary patterns, despite an enormous reliance on their research utility (although see Benton (2008a, 2008b) and Tarver, Donoghue & Benton (2011) for examples using vertebrates). In particular, to our knowledge, no one has yet tested this potential influence using an occurrence-based tetrapod dataset, such as those available from the Paleobiology Database. This is important, given that a wealth of recent studies, and in particular on tetrapod groups, have focussed on estimating diversity patterns through geological time and interpreting what the potential drivers of these large-scale evolutionary patterns might be (Butler et al., 2009; Benson & Butler, 2011; Butler et al., 2011; Mannion et al., 2015; Nicholson et al., 2015; Benson et al., 2016; Grossnickle & Newham, 2016; Nicholson et al., 2016; Tennant, Mannion & Upchurch, 2016a; Brocklehurst et al., 2017). Many of these studies have employed subsampling methods that are sensitive to changes in the shape of the taxonomic abundance distribution, which we would expect to change in a non-random fashion based on new fossil discoveries through time as they are published (Benton et al., 2011, 2013; Benton, 2015) (e.g. due to the opening up of new discovery regions for geopolitical reasons, or the historical and macrostratigraphic availability of fossil-bearing rock formations). Furthermore, as sampling increases through time, we might also expect the relative proportion of singleton occurrences to decrease, improving the evenness of the underlying sampling pool (Alroy, 2010a; Chao & Jost, 2012), and therefore influencing calculated diversity estimates (see ‘Methods’ below). Assessing this influence in a historical context is therefore important for understanding how stable our interpretations of evolutionary patterns are.

While the data used in these analyses are typically based on a ‘mature’ dataset that has undergone rigorous taxonomic scrutiny and data addition or refinement, they often tend to neglect explicit consideration of the potential influence of temporal variations in the publication record (which these databases are explicitly based on). This has important implications for several reasons. First, we might expect the shape of both raw and subsampled diversity curves to change through time in concert with new discoveries and as sampling increases (Sepkoski, 1993; Alroy, 2000a), or that subsampled diversity estimates stabilise at some point. Second, this could therefore impact our interpretations of the relative magnitude, tempo and mode of apparent radiations and extinctions. Third, if the shape of estimated diversity curves changes (either based on raw or ‘corrected’ data), the strength of results from comparisons of diversity with extrinsic factors such as sea-level or palaeotemperature (Benson et al., 2010; Benson & Butler, 2011; Butler et al., 2011; Peters & Heim, 2011b; Mayhew et al., 2012; Martin et al., 2014; Mannion et al., 2015; Nicholson et al., 2015; Tennant, Mannion & Upchurch, 2016a, 2016b) could also change.

As our data become updated, capturing this influence of sampling variation becomes more important through longer periods of time. We might expect sampling error to be highest earlier on in sampling history, and to reduce through time, therefore improving the reliability of our correlation estimates. However, if our subsampled diversity estimates remain stable through historical time, then we can be more confident in these interpretations, as well as the effectiveness of subsampling methods in reliably estimating diversity. Recently, this potential issue was highlighted by Jouve et al. (2017) in a small study of Jurassic and Cretaceous thalattosuchian crocodylomorphs. Those authors tested the conclusions of Martin et al. (2014) and their assertion that sea-surface temperature was the primary factor driving marine crocodylomorph evolution, contra Mannion et al. (2015) and Tennant, Mannion & Upchurch (2016a). They found that the strength of the relationships reported by the first study, which also differed from those reported by Mannion et al. (2015) and Tennant, Mannion & Upchurch (2016a), was fairly unstable even based on very recent changes in taxonomy. This taxonomically constrained example provides an interesting case of how small changes in publication history can lead to potentially different or conflicting interpretations of macroevolutionary patterns.

In this study, we investigate the influence of publication history on our reading and understanding of diversity patterns through time. For this, we use the clade Dinosauria (excluding Aves) as a study group, as they have an intensely sampled fossil record and a rich history of taxonomic and macroevolutionary research. We note that this is just one of a whole suite of potential biases in palaeodiversity studies (e.g. appropriate time-binning methods, optimal analytical protocols, or the impact of variation in the rock record through space and time), and these factors are discussed in more detail elsewhere (Peters & Heim, 2010, 2011b; Benson & Butler, 2011; Heim & Peters, 2011; Benson & Upchurch, 2013; Benton et al., 2013; Dunhill, Hannisdal & Benton, 2014; Benton, 2015; Benson et al., 2016; Tennant, Mannion & Upchurch, 2016b).

Material and Methods

Dinosaur occurrences dataset

We used a primary dataset of dinosaur body fossil occurrences drawn from the Paleobiology Database (November, 2017) that spans the entirety of the Late Triassic to end-Cretaceous (235–66 Ma) (Supplemental Information 1). These data are based on a comprehensive compilation effort from multiple workers, and represent updated information on modern dinosaur taxonomy and palaeontology at this time. The records comprised only body fossil remains, and excluded ootaxa and ichnotaxa. This dataset was divided into the three major clades, Sauropodomorpha, Ornithischia, and Theropoda. We excluded Aves as they have a fossil record dominated by different and often exceptional modes of preservation. Having limited occurrences of exceptionally preserved fossils will bias our results, particularly in time periods characterised by the presence of avian-bearing Konservat–Lagerstätten (Brocklehurst et al., 2012; Dean, Mannion & Butler, 2016). We elected to use genera, as these are more readily identified and diagnosed, which means that we can integrate occurrences that are resolved only to the genus level (e.g. Allosaurus sp.), and therefore include a substantial volume of data that would be lost at any finer resolution (Robeck, Maley & Donoghue, 2000). A potential issue with a genus-level approach is that analysing palaeodiversity at different taxonomic levels can potentially lead to different interpretations about what the external factors mediating it are (Wiese, Renaudie & Lazarus, 2016). Despite the fact that some dinosaur genera are multispecific, it has been shown previously that both genus- and species-level dinosaur diversity curves are very similar (Barrett, McGowan & Page, 2009), and that there is more error in species level dinosaur taxonomy than for genera (Benton, 2008b). 
It has also been repeatedly demonstrated that the shape of species and genus curves are strongly correlated in spite of differential taxonomic treatment (Alroy, 2000a; Butler et al., 2011; Mannion et al., 2015), and therefore a genus level compilation should be sufficient for the scope of the present study. We elected to use a stage-level binning method based upon the Standard European Stages and absolute dates provided by Gradstein et al. (2012). Others have used an equal-length time binning approach (Mannion et al., 2015; Benson et al., 2016), but this has limitations in that it reduces the number of data points for statistical analyses, and can artificially group fossil occurrences from different stages that never temporally co-existed (Gibert & Escarguel, 2017), which would confound our analyses. Only body fossil occurrences that could be unambiguously assigned to a single stage bin were included, and those in which assignment to a single stage bin was either ambiguous or not possible were excluded. This procedure was implemented in order to avoid the over-counting of taxa or occurrences that have poorly constrained temporal durations or are contained within multiple time bins. Each dinosaurian sub-group was further sub-divided into approximately contiguous palaeocontinental regions: Africa, Asia, Europe, South America, and North America (Mannion et al., 2015). Unfortunately, sampling is too poor to analyse patterns in Antarctica, Australasia, or Indo-Madagascar, although these regions remain included in the global analyses. We also provide data on the number of newly identified occurrences (Supplemental Information 2) and newly named genera (Supplemental Information 3) based on publication date, as well as a list of dinosaur taxa that became invalidated between 1991 and 2015 (Supplemental Information 1).

Calculating diversity through time

To test how diversity changes through time, we reduced the primary dataset by successively deleting occurrences according to their publication date, working backwards at two-year intervals. Note that these dates are not the same as the date that the actual entries were made into the database, but the explicit date of publication of that occurrence record in the published version of record. We stopped at 1991, giving 13 sequential temporal datasets (1991–2015) for each dinosaurian clade. What each version represents is the maturity of the dataset with respect to its present state (and taxonomy as of 2015) based on publication history. Two methods were used to assess diversity patterns. First, we calculated empirical diversity based on raw in-bin counts of taxa. This method has been strongly suggested to be a ‘biased’ or poor estimator of true diversity as it is influenced by heterogeneous sampling (Benson et al., 2010; Benson & Butler, 2011; Benson & Upchurch, 2013; Butler, Benson & Barrett, 2013; Smith & Benson, 2013; Newham et al., 2014; Mannion et al., 2015; Tennant, Mannion & Upchurch, 2016b). Second, we employed the shareholder quorum subsampling (SQS) method, which was designed to account for differences in the shape of the taxon-abundance curve (Alroy, 2010a, 2010c), and implemented in Perl (Supplemental Information 4 and 5).
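The truncation procedure and the raw in-bin count can be illustrated with a minimal Python sketch. This is not the authors' code (their analyses used R and Perl); the record layout, the placeholder genus names, and the function names `snapshot_years` and `raw_diversity` are hypothetical, and the sketch assumes that an occurrence is retained in a snapshot when its publication year is less than or equal to the cut-off year.

```python
from collections import defaultdict

# Hypothetical toy records: (genus, stage, publication_year).
# These are placeholders, not data from the Paleobiology Database.
occurrences = [
    ("genus_a", "Kimmeridgian", 1980),
    ("genus_a", "Tithonian", 1995),
    ("genus_b", "Aptian", 2012),
    ("genus_c", "Maastrichtian", 1905),
    ("genus_c", "Maastrichtian", 2003),
    ("genus_d", "Maastrichtian", 2014),
]


def snapshot_years(first=1991, last=2015, step=2):
    """Cut-off years for the sequential temporal datasets."""
    return list(range(first, last + 1, step))


def raw_diversity(records, cutoff_year):
    """Count distinct genera per stage using records published up to cutoff_year."""
    genera_by_stage = defaultdict(set)
    for genus, stage, year in records:
        if year <= cutoff_year:  # assumption: the cut-off year is inclusive
            genera_by_stage[stage].add(genus)
    return {stage: len(genera) for stage, genera in genera_by_stage.items()}


# One raw diversity curve per publication-history snapshot.
years = snapshot_years()
curves = {year: raw_diversity(occurrences, year) for year in years}
```

Comparing `curves[1991]` with `curves[2015]` then shows which stage bins gained genera as the publication record matured.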

Shareholder quorum subsampling standardises taxonomic occurrence lists based on an estimate of coverage to determine the relative magnitude of taxonomic biodiversity trends (Alroy, 2010a, 2010c). In this method, each taxon within a sample pool (time bin) is treated as a ‘shareholder,’ whose ‘share’ is its relative occurrence frequency. Taxa are randomly drawn from compiled in-bin occurrence lists, and when a summed proportion of these ‘shares’ reaches a certain ‘quorum’, subsampling stops and the number of sampled taxa is summed. Coverage, as a measure of sampling quality, is defined as the proportion of the frequency distribution of taxa within a sample. It is estimated by using randomized subsampling to calculate the mean value of Good’s u, which is defined as 1 minus the number of singleton occurrences, divided by the total number of occurrences (Good, 1953). A coverage value of zero indicates that all taxa are singleton occurrences (i.e. that all occurrences of a taxon are restricted to a single collection within a time bin). Higher coverage values indicate more even sampling of taxa, and therefore provide a measure of sample completeness that is independent of the overall sample pool size. For each time bin, u is then divided into the quorum level (Alroy, 2010a), thereby providing an estimate of the coverage of the total occurrence pool. In all subsampling replicates, singletons were excluded to calculate diversity (but included to calculate Good’s u), as they can distort estimates of diversity. Dominant taxa (those with the highest frequency of occurrences per bin) were included, and where these taxa are drawn, one is added to the subsampled diversity estimate for that bin (Alroy, 2010c).
Finally, single large collections that can create the artificial appearance of poor coverage were accounted for by counting occurrences of taxa that only occur in single publications, as opposed to those which occur in single collections, and excluding taxa that are only ever found in the most diverse collection. A total of 1,000 subsampling trials were run for each dataset (Theropoda, Ornithischia, and Sauropodomorpha, for each region and two-year time interval), and the mean diversity was reported for each publication time interval. For each sequential subsampling iteration, whenever a collection from a new publication was drawn from the occurrence list, subsequent collections were sampled until exactly three collections from that publication had been selected (Alroy, 2010a). We set a baseline quorum of 0.4, as this has been widely used and demonstrated to be sufficient in accurately assessing changes in diversity (Alroy, 2010a, 2010c; Mannion et al., 2015; Nicholson et al., 2015; Tennant, Mannion & Upchurch, 2016a). Diversity estimates are not reported for any analyses in which this quorum could not be attained. This dual method of using raw and standardised data is important, as not all publications name new taxa; some add to our knowledge of existing taxa by publishing on new occurrences in different collections (or sites). Therefore, by applying a method that accounts for changes in taxonomic abundance across collections we can see how publication history influences diversity through subsampling methods.
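The core of the procedure can be sketched in Python under strong simplifying assumptions. The sketch computes Good's u as defined above and runs a basic quorum draw within one time bin; it deliberately omits the singleton and dominant-taxon adjustments, the single-publication correction, and the three-collections-per-publication rule, so it is an illustration of the idea rather than a reimplementation of the Perl scripts used here. The function names are hypothetical, and each input is assumed to be a non-empty list of genus names, one entry per occurrence.

```python
import random
from collections import Counter


def goods_u(occurrences):
    """Good's u: 1 minus (number of singleton taxa / total occurrences)."""
    counts = Counter(occurrences)
    singletons = sum(1 for n in counts.values() if n == 1)
    return 1.0 - singletons / len(occurrences)


def sqs_trial(occurrences, quorum, rng):
    """One simplified trial: number of taxa drawn when the quorum is reached."""
    u = goods_u(occurrences)
    if u == 0 or u < quorum:
        return None  # the quorum cannot be attained for this bin
    target = quorum / u  # quorum level adjusted by the coverage estimate
    counts = Counter(occurrences)
    total = len(occurrences)
    pool = list(occurrences)
    rng.shuffle(pool)  # random draw order, without replacement
    seen, share_sum = set(), 0.0
    for taxon in pool:
        if taxon not in seen:
            seen.add(taxon)
            share_sum += counts[taxon] / total  # add this taxon's 'share'
            if share_sum >= target:
                break
    return len(seen)


def sqs_mean(occurrences, quorum=0.4, trials=1000, seed=0):
    """Mean subsampled richness over repeated trials, or None if unattainable."""
    rng = random.Random(seed)
    results = [sqs_trial(occurrences, quorum, rng) for _ in range(trials)]
    if results[0] is None:
        return None
    return sum(results) / trials
```

For a bin in which every genus is a singleton, `goods_u` returns zero and `sqs_mean` returns `None`, mirroring the rule that no estimate is reported when the quorum cannot be attained.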

Correlation between diversity and extrinsic parameters

For our model-fitting protocol, we follow the procedure outlined in numerous recent analytical studies, by employing simple pairwise correlation tests to the residuals of detrended time series at the stage level (Benson & Butler, 2011; Butler et al., 2011; Butler, Benson & Barrett, 2013; Mannion et al., 2015; Tennant, Mannion & Upchurch, 2016a). Residuals for each of the two environmental extrinsic parameters were calculated using the arima() function in R, which uses maximum likelihood to fit a first-order autoregressive model to each time series (Gardner, Harvey & Phillips, 1980). This method removes the potential influence of any long-term background trend (i.e. a directed change in the mean value of the complete time series through time) within the time series, which has the potential to artificially inflate correlation coefficients in pairwise tests (Box & Jenkins, 1976), and also accounts for any potential serial autocorrelation (i.e. the correlation of a variable with itself through successive data points). This protocol has now become standard practice for palaeontological time series analysis following its recommendation by Alroy (2000a). For sea level, we used the curve of Miller et al. (2005), which has been widely applied in recent analyses of tetrapod diversification (Benson et al., 2010; Butler et al., 2011; Martin et al., 2014; Mannion et al., 2015; Tennant, Mannion & Upchurch, 2016a), and for palaeotemperature we used the data from Prokoph, Shields & Veizer (2008), available as stage level data from Hannisdal & Peters (2011) (Supplemental Information 6).
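To make the idea of first-order autoregressive residuals concrete, the following Python sketch fits x_t = c + φ·x_(t−1) by conditional least squares and returns the residuals. This is a swapped-in simplification: R's arima() estimates the model by maximum likelihood, so its coefficients and residuals will generally differ somewhat from these. The function name `ar1_residuals` is hypothetical.

```python
def ar1_residuals(series):
    """Fit x_t = c + phi * x_(t-1) by least squares; return (phi, residuals).

    Assumption: conditional least squares stands in for the maximum
    likelihood fit performed by R's arima(); the first value of the series
    has no predecessor, so one fewer residual than observations is returned.
    """
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0:
        raise ValueError("series is constant; phi cannot be estimated")
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = sxy / sxx
    intercept = mean_curr - phi * mean_prev
    residuals = [c - (intercept + phi * p) for p, c in zip(prev, curr)]
    return phi, residuals
```

The residuals of two series (for example, a diversity curve and a sea-level curve sampled at the same stages) would then be passed to the pairwise correlation test in place of the original values.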

We performed an assessment of normality for each time series prior to any correlation analyses, using the Shapiro–Wilk test (shapiro.test() function in R). If the p-value from this test is greater than the pre-defined alpha level (traditionally 0.05, and used here), the distribution of the data is not significantly different from a normal distribution, and therefore we can assume normality and use Pearson’s test (Pearson’s product moment correlation coefficient [r]). If p < 0.05, we performed a non-parametric Spearman’s rank correlation (ρ). For each test, both the raw and adjusted p-values are reported, the latter calculated using the p.adjust() function with the ‘BH’ method (Benjamini & Hochberg, 1995). This method controls the false-discovery rate when performing multiple hypothesis tests with the same data set, which can otherwise inflate type-1 error (i.e. falsely rejecting a true null hypothesis; a false positive). We avoided the more commonly used ‘Bonferroni correction’, due to the undesirable property it has of potentially increasing type-2 error to unacceptable levels (Nakagawa, 2004). This adjustment was performed on ‘families’ of analyses (i.e. non-independent tests), rather than on all correlation tests together, to avoid setting the pass rate for statistical significance too low.
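The decision rule and the p-value adjustment can be sketched in Python as follows. The function names are hypothetical, the Shapiro–Wilk p-values are taken as inputs rather than computed, and `bh_adjust` is written to follow the standard Benjamini–Hochberg step-up adjustment (each p-value multiplied by n/rank, then made monotone from the largest p-value downwards and capped at 1), which is intended to mirror what p.adjust() does with method = "BH".

```python
def choose_correlation(normality_p_values, alpha=0.05):
    """Pick Pearson if every series passes the normality check, else Spearman."""
    if all(p > alpha for p in normality_p_values):
        return "pearson"
    return "spearman"


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    # Visit p-values from largest to smallest so a running minimum can be kept.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0  # adjusted values are capped at 1
    for position, index in enumerate(order):
        rank = n - position  # rank 1 corresponds to the smallest p-value
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Applied per 'family' of non-independent tests, not to all tests at once.
family = [0.01, 0.04, 0.03, 0.005]
adjusted_family = bh_adjust(family)
```

Adjusting each family separately keeps n small for each call, which is why the correction is less severe than adjusting every correlation test together.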

We performed pairwise correlations for the detrended subsampled diversity estimates at each two-year iteration for each group to assess how the strength and direction of correlation changes through publication history. We do not use a maximum likelihood model fitting approach because rather than trying to distinguish between a set of candidate models, we are simply assessing how the strength of correlations changes through publication history.

All analyses were carried out in R version 3.0.2 (R Development Core Team, 2013) using the functions available in the default stats package.

Results

Occurrences and genera through time

From the first dinosaur discoveries until around 1950, the number of dinosaur occurrences published steadily increased through time (Fig. 1). From the middle to the end of the 20th century, the number of published occurrences increased substantially. This is mostly due to the publication of theropod and ornithischian occurrences, which reached a peak around the turn of the millennium, with occurrences of all three groups remaining high but declining in rate of publication after this. A very similar pattern is observed for genera, with the publication of newly named genera increasing exponentially since around 1990, and at an equal rate for all three groups (Fig. 2). The cumulative frequency of newly named genera shows that, although the rate of growth remains approximately similar and increasing for all three groups, there are times when the relative overall number of genera between groups changes through publication history. For example, while sauropodomorphs had more named genera than theropods until around 1935, this changed at around 1960 when new theropod genera became more frequently published than sauropodomorphs. The recent rate of growth of newly named theropod genera in the last 15 years means that they are now named as frequently as newly named ornithischian genera. This recent rate of growth in the naming of new taxa is distinct from the patterns of taxonomic invalidation (e.g. through synonymy) that have occurred since 1991 (Fig. 3). While we see an increase in the number of invalidated taxa between 2000 and 2010, this is variable for each group, with theropods peaking in 2007, sauropodomorphs peaking in 2002–2004, and ornithischians in 2007 and 2013.

Figure 1: Frequency (A) and cumulative frequency (B) of newly published dinosaur occurrences through publication time.
Please note that all raw figure files (PDF) and the R code for generating these are available in Supplemental Information 10.

DOI: 10.7717/peerj.4417/fig-1

Figure 2: Frequency (A) and cumulative frequency (B) of newly published dinosaur genera through publication time.

DOI: 10.7717/peerj.4417/fig-2

Figure 3: The number of invalidated or revised dinosaur taxa between 1991 and 2015.

DOI: 10.7717/peerj.4417/fig-3

‘Global’ patterns of total dinosaur diversity

Apparent ‘global’ empirical dinosaur diversity steadily rises until the end of the Jurassic (Fig. 4A). Diversity is low across the Jurassic/Cretaceous (J/K) interval until the Hauterivian, before recovering in the late Early Cretaceous. There is a second decline through the late Early to early Late Cretaceous interval, before diversity increases to its zenith in the latest Cretaceous. This general pattern remains constant throughout publication history, although diversity in the ‘middle’ Cretaceous and latest Cretaceous intervals shows the greatest increases. Subsampled global dinosaur diversity retains this overall pattern (Fig. 4B). The J/K interval decline is still visible, but the late Early Cretaceous apparent diversity increase supersedes Late Jurassic levels. The early Late Cretaceous decline is also still present, but the magnitude of the latest Cretaceous diversity increase is much lower than that recovered for the empirical data. The reason for this distinction between subsampled and raw diversity is that SQS estimates diversity by standardising coverage of the taxon-abundance distribution, and thereby reduces the impact of intensely sampled time intervals such as the latest Cretaceous.

Figure 4: Total dinosaur ‘global’ diversity patterns for (A) raw and (B) subsampled data.
Time stage abbreviations (in chronological order): N, Norian; R, Rhaetian; He, Hettangian; S, Sinemurian; P, Pliensbachian; T, Toarcian; A, Aalenian; Bj, Bajocian; B, Bathonian; C, Callovian; O, Oxfordian; K, Kimmeridgian; Ti, Tithonian; Be, Berriasian; V, Valanginian; Ha, Hauterivian; Ba, Barremian; Ap, Aptian; Al, Albian; Ce, Cenomanian; Tu, Turonian; Co, Coniacian; Sa, Santonian; Cam, Campanian; M, Maastrichtian. Vertical dashed red lines indicate boundaries between different periods (Triassic/Jurassic, Jurassic/Cretaceous, and Cretaceous/Paleogene).

DOI: 10.7717/peerj.4417/fig-4

Patterns of raw and subsampled diversity by group

Ornithischians

Raw ‘global’ ornithischian diversity (Fig. 5A) is constant and stable throughout publication history. The apparent magnitude of longer-term trends is obscured by the relative over-sampling of the Campanian and Maastrichtian, which are almost an order of magnitude higher than any other Jurassic or Cretaceous stage interval. Indeed, the Campanian shows no sign of slowing down in increasing diversity, and is the highest and most rapidly increasing of any time interval. In spite of this, the overall trends in raw diversity remain, with steadily increasing Middle to Late Jurassic diversity, a small earliest Cretaceous decline followed by a ‘middle’ Cretaceous peak in the Aptian, a shallow decline into the early Late Cretaceous, and an increase in the Campanian.

Figure 5: Raw ornithischian diversity at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-5

Raw diversity in Europe shows increasing diversity across the J/K transition before an earliest Cretaceous decline (Valanginian–Hauterivian), constant ‘middle’ Cretaceous diversity, and an increase from the Campanian to Maastrichtian (Fig. 5B). Raw African ornithischian diversity is too inconsistent to analyse any changes through geological time or publication time (Fig. 5C). Raw Asian diversity is fairly constant through the Cretaceous, until an apparent major Campanian peak and Maastrichtian decline (Fig. 5D). In North America, empirical diversity is flat and low throughout the Late Jurassic and most of the Cretaceous (Fig. 5E). There is a Campanian peak, an order of magnitude higher than any prior interval, which is rapidly increasing through publication time. Diversity decreases from this peak into the Maastrichtian, in which diversity has remained relatively stable through publication time. Sampling in South America is also relatively poor, with apparent diversity remaining low and flat where a signal is obtained (Fig. 5F).

Subsampled ‘global’ ornithischian diversity shows a distinctly different pattern from the raw curve, both in terms of overall trends, and in terms of the magnitude of the effect of publication history (Fig. 6A). The Jurassic is generally too poorly sampled to reveal a constant signal, but there is evidence of a decline through the J/K transition, which remains constant through publication time. This is followed by a middle-Cretaceous increase, in which ornithischian diversity is at its second highest level throughout their history. The magnitude of this Albian radiation has rapidly increased over publication time, the result being that originally what appeared to be increasing subsampled diversity over the Early/Late Cretaceous transition now shows a major decline from the Albian to Coniacian. Santonian subsampled diversity remains unknown, but when we see a signal emerge in the Campanian, diversity is higher than the Albian, reaching its highest level before declining by more than half into the Maastrichtian. This overall structure, besides the Albian, remains consistent throughout publication time with no major perturbations to the apparent ‘global’ curve.

Figure 6: Subsampled ornithischian diversity at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-6

Subsampled European diversity reveals increasing diversity across the Tithonian/Berriasian transition, followed by overall gradually decreasing diversity throughout the remainder of the Early Cretaceous (Fig. 6B). In Africa, the signal is too poor to reveal anything besides a Kimmeridgian/Tithonian subsampled diversity drop (Fig. 6C), and in Asia, there is evidence of a decline in subsampled diversity across the Albian/Cenomanian transition (Fig. 6D). In North America, subsampled diversity reveals a decline across the Early–Late Cretaceous transition, and a major decline from the Campanian to Maastrichtian, a pattern that remains stable through publication history (Fig. 6E). In South America, the subsampled signal is too poor to comment on ornithischian diversity (Fig. 6F).

If we look at how coverage has changed through publication history (based on Good’s u), we should expect that subsampled diversity patterns are reflective of this pattern. At a global level, coverage in the Cretaceous is much better than the Jurassic (Fig. 7A). Much of this, however, is based on patchy regional records. In Europe, we find that coverage increases across the J/K interval (Fig. 7B), and is the only place where a consistently reliable record here can be obtained. In Africa, coverage is generally poor, besides in the latest Jurassic (Fig. 7C). In Asia, coverage is poor up until the late Early Cretaceous (Fig. 7D). In North America, coverage is good in the latest Jurassic and ‘middle’ to Late Cretaceous, but non-existent in Early to Middle Jurassic and earliest Cretaceous (Fig. 7E). Coverage is generally poor for the entire South American ornithischian record (Fig. 7F), explaining why obtaining a subsampled diversity signal here is difficult.
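As a reminder of what the coverage values summarise, the following minimal sketch computes the basic form of Good's u, one minus the proportion of occurrences that belong to taxa known from a single occurrence. The function name and the occurrence lists are hypothetical and are not data from this study; the estimator used within a particular subsampling implementation may apply further corrections that are not shown here.

```python
from collections import Counter


def goods_u(occurrences):
    """Basic Good's u: 1 - (number of singleton taxa / total occurrences)."""
    counts = Counter(occurrences)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no occurrences in this interval")
    # A singleton is a taxon represented by exactly one occurrence.
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / total


# Hypothetical occurrence lists (one taxon label per occurrence).
well_sampled = ["A"] * 6 + ["B"] * 3 + ["C"] * 2 + ["D"]  # 12 occurrences, 1 singleton
poorly_sampled = ["A", "B", "C", "D", "E", "E"]           # 6 occurrences, 4 singletons

u_well = goods_u(well_sampled)
u_poor = goods_u(poorly_sampled)
```

An interval dominated by singletons, as in the second list, returns low coverage, which is the situation in which a subsampled diversity estimate is hard to obtain.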

Figure 7: Good’s u estimates for ornithischians at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-7

Theropods

The overall shape of the raw ‘global’ theropod diversity curve remains consistent through publication history for the Jurassic (Fig. 8A), similar to ornithischians, where we see steadily increasing Middle–Late Jurassic diversity. ‘Middle’ Cretaceous raw diversity fluctuates, followed by a major Campanian–Maastrichtian rise. The lowest apparent diversity is in the Coniacian, reaching earliest Cretaceous levels. Notable variations due to publication history are in the Barremian–Cenomanian, where diversity increases in magnitude through time, gradually exceeding that for Late Jurassic diversity. Raw European diversity is fairly constant through publication history (Fig. 8B), with a Middle Jurassic diversity peak in the Bathonian, followed by a Callovian–Oxfordian trough, a second larger Kimmeridgian peak, and then constant decline from the Tithonian to the Valanginian. Barremian diversity increases through publication time, and is as high as Kimmeridgian levels. Aptian and Albian diversity is relatively low through publication history. Campanian and Maastrichtian diversity levels are slowly increasing through publication history. As with ornithischians, African theropods are generally too poorly sampled at the stage level to recognise any consistent empirical patterns (Fig. 8C). There is a Cenomanian raw diversity spike, but how this compares with much of the rest of the Cretaceous is obscured by patchy sampling. In Asia, raw Late Jurassic diversity is generally lower than for the Cretaceous (Fig. 8D). The Cretaceous sees three peaks in apparent diversity during the Aptian, Turonian, and Campanian–Maastrichtian, with the latter being considerably higher than any previous one, and growing rapidly through publication history. In North America, raw diversity levels are dwarfed by the intensive sampling of latest Cretaceous theropods, with major gaps in the Middle–Late Jurassic and earliest Cretaceous records (Fig. 8E). Campanian and Maastrichtian raw diversity is constantly increasing at a faster rate than any other time interval, and consistently reveals a slight apparent diversity decline into the end-Cretaceous. Raw South American diversity estimates are changing rapidly through publication history, with almost every interval in which dinosaurs are available to be sampled doubling or tripling since 1991 (Fig. 8F). Of note is a recently emerging Late Jurassic theropod fossil record in South America, which at the present reveals an apparent low diversity.

Figure 8: Raw theropod diversity at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-8

When subsampling is applied, in the Late Jurassic we see a switch from steadily increasing subsampled diversity to a major Oxfordian peak and subsequent decline in diversity through the J/K transition, a pattern that is consistently recovered through publication time (Fig. 9A). Subsampled diversity is higher during the Aptian than at any other stage during theropod history, and has doubled in the last 20 years of publication history. Campanian and Maastrichtian diversity are as high as the Cenomanian, a pattern that remains consistent through publication time. We see the ‘global’ J/K transition decline reflected in Europe (Fig. 9B), and a strong Barremian peak, which is not captured on a ‘global’ scale. Latest Triassic subsampled diversity is higher than at any point in the Jurassic in Europe. Maastrichtian subsampled diversity remains high, reaching the same level as that for the Kimmeridgian. In Africa, as with ornithischians, the signal is very patchy after subsampling is applied (Fig. 9C), but captures an Albian–Cenomanian diversity increase, which remains constant throughout publication history, and flat diversity in the latest Cretaceous. The subsampled theropod diversity signal is also patchy in Asia, but does reveal a very high latest Cretaceous diversity level, which is not otherwise seen throughout theropod evolutionary history (Fig. 9D). In North America, the subsampled record is as patchy as that for ornithischians, but remains stable through publication history (Fig. 9E). Here, we see slightly increasing subsampled diversity in the latest Jurassic, a large decline from the Aptian to Albian, and a major diversification from the Santonian to Campanian. In South America, a subsampled diversity signal is almost entirely absent, although we do see a reduction of almost half from the Norian to Rhaetian, which remains stable through publication history (Fig. 9F).

Figure 9: Subsampled theropod diversity at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-9

Theropod coverage levels are quite patchy at the ‘global’ level, remaining constant in the Late Triassic, fluctuating in the Middle Jurassic to earliest Cretaceous, but remaining fairly stable in the ‘middle’ and latest Cretaceous through publication history (Fig. 10A). On a regional level, this apparent ‘global’ signal across the J/K transition is again emphasised in Europe, but in the Valanginian and Albian, coverage is getting notably worse through publication history (Fig. 10B). Coverage in Africa (Fig. 10C) and Asia (Fig. 10D) is very patchy, and does not appear to have changed in the last 20 years overall, besides the origin of moderate coverage levels in the Oxfordian and Aptian of Asia. In North America, coverage levels are moderately high in the latest Jurassic, Aptian, and Albian, and latest Cretaceous, only improving in the latest Jurassic through publication history (Fig. 10E). In South America, coverage is generally poor throughout the Jurassic and Cretaceous, but appears to be declining in the Norian and Rhaetian theropod records (Fig. 10F).

Figure 10: Good’s u estimates for theropods at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-10

Sauropodomorphs

Sauropodomorph empirical diversity emphasises some more changes in raw patterns through publication time, particularly in the ‘middle’ and Late Cretaceous (Fig. 11A). Late Jurassic patterns are fairly consistent, with a rising Kimmeridgian and Tithonian raw diversity emphasising an apparent major decline across the J/K interval. In Europe, sauropods show a consistent and major decline in raw diversity from the Kimmeridgian to the Berriasian (Fig. 11B). Much of the rest of the Cretaceous is too poorly sampled, but raw sauropod diversity never attains Kimmeridgian levels in Europe for the rest of their evolutionary history. Sauropodomorph dinosaurs are generally better sampled than theropods and ornithischians in Africa, showing an apparent decline through the Triassic/Jurassic transition, a latest Jurassic raw diversity peak, and low levels through the ‘middle’ to Late Cretaceous transition (Fig. 11C). In Asia, raw taxonomic diversity is generally low compared to the Maastrichtian, in which diversity is relatively high and still rapidly increasing through publication history (Fig. 11D). The North American sauropod record is very patchy, with the latest Jurassic showing a shift from rapidly increasing raw diversity from the Oxfordian to a slight drop from the Kimmeridgian to Tithonian (Fig. 11E). The South American Jurassic sauropod record is patchy, but raw diversity is increasing throughout the ‘middle’ to Late Cretaceous through publication history (Fig. 11F).

Figure 11: Raw sauropodomorph diversity at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-11

At a ‘global’ level, Jurassic sauropodomorph subsampled diversity remains consistent through publication history (Fig. 12A). Here, we see steadily increasing diversity levels through the Middle and Late Jurassic, before a decline through the J/K transition, which might have been initiated before the J/K boundary itself. The greatest change in subsampled diversity is in the Albian, which has almost doubled in the last 20 years, with implications for the ‘mid-Cretaceous sauropod hiatus’ (Mannion & Upchurch, 2011). Subsampling reduces the European diversity signal due to poor sampling of sauropods, although there is evidence for the sauropod decline beginning prior to the J/K transition (Fig. 12B). In Africa, when subsampling is applied, the few intervals in which a signal emerges reveal a fairly constant level of diversity through the Jurassic and Cretaceous, and through publication time, with the notable exception being an increase in subsampled diversity in the latest Jurassic (Fig. 12C). In Asia, the signal is also fairly poor after subsampling is applied (Fig. 12D). Here, we see an increase in subsampled diversity across the Triassic/Jurassic transition, and the highest diversity level is in the Maastrichtian, where subsampled estimates have more than doubled in the last 20 years. In North America, the subsampled signal is highly degraded, although of note is a near doubling of Albian diversity levels in the last 20 years (Fig. 12E). In South America, the signal is very inconsistent, but improving through publication history, with a patchy Late Cretaceous signal beginning to emerge (Fig. 12F). Full subsampling results are provided in Supplemental Informations 7 and 8.

Figure 12: Subsampled sauropodomorph diversity at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-12

Sauropodomorph coverage varies greatly at the ‘global’ level, with high levels in the Triassic-Jurassic transition, the Middle and Late Jurassic (with the exception of the Callovian), and the Maastrichtian (Fig. 13A). As with theropods and ornithischians, however, this is a composite of a very patchy regional record. In Europe, coverage is high during the latest Triassic, Middle Jurassic, and Late Cretaceous, and this does not seem to have varied with publication time (Fig. 13B). In Africa, moderate levels of coverage also have not changed substantially since 1991 (Fig. 13C). In Asia, coverage is generally high in the Late Jurassic, but the Cretaceous record is incredibly poor with just two data points (Aptian and Maastrichtian; Fig. 13D). In North America, the latest Jurassic has high coverage levels, which are increasing through publication history in the Kimmeridgian, and moderately high coverage in the Aptian and latest Cretaceous (Fig. 13E). In South America, coverage is very patchy and inconsistent, with the only noteworthy change through publication history being an increase for the Rhaetian interval (Fig. 13F).

Figure 13: Good’s u estimates for sauropodomorphs at (A) global and (B–F) regional levels (Europe, Africa, Asia, North America, and South America, respectively) based on our published knowledge in 1991 and 2015.
Abbreviations as Fig. 4.

DOI: 10.7717/peerj.4417/fig-13

Correlation results

Our results find varying strength of correlation between subsampled ‘global’ dinosaur diversity for each clade and both palaeotemperature and sea level, although the correlations are consistently weak (Supplemental Information 9). This lack of statistical strength occurs for subsampled diversity estimates at the two-year intervals for each of ornithischians (Table 1), sauropodomorphs (Table 2), theropods (Table 3), and dinosaurs overall (Table 4), meaning that we cannot confidently interpret anything here. The only time the results come close to alpha (0.05) is for the correlation between Ornithischia and sea level during 2007–2013 (p = 0.062–0.084, ρ = 0.481–0.516), but our correction methods reduce the strength of all our statistical results.
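The effect of the p-value correction can be illustrated with a short sketch. This section does not name the adjustment method; the code below assumes a Benjamini-Hochberg adjustment applied separately to the rows that share a correlation test, an assumption that reproduces the adjusted sea-level values reported for the Pearson rows of Table 3. The function name `bh_adjust` is hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    # Visit p-values from largest to smallest so a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for k, idx in enumerate(order):
        rank = n - k  # rank of this p-value in ascending order (1 = smallest)
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Sea-level p-values for the theropod rows tested with Pearson in Table 3
# (2013, 2005, 1997, 1995, 1993). Grouping by test type is an assumption.
pearson_sea_level_p = [0.464, 0.316, 0.100, 0.074, 0.409]
adjusted = [round(p, 3) for p in bh_adjust(pearson_sea_level_p)]
```

Under this assumption, the two smallest raw values (0.074 and 0.1) are both adjusted to 0.25, which shows how the correction moves results that were already marginal further from alpha.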

Table 1: Ornithischian correlation test results.

Ornithischians			Sea level			Palaeotemperature
	Shapiro–Wilk (p)	Correlation test	cor	p	Adjusted p	cor	p	Adjusted p
2015	0.003	Spearman	0.42	0.137	0.322	−0.432	0.109	0.235
2013	0.002	Spearman	0.481	0.084	0.273	−0.396	0.145	0.235
2011	0.002	Spearman	0.481	0.084	0.273	−0.396	0.145	0.235
2009	0.002	Spearman	0.516	0.062	0.273	−0.429	0.113	0.235
2007	0.001	Spearman	0.503	0.069	0.273	−0.471	0.078	0.235
2005	<0.001	Spearman	0.358	0.209	0.273	−0.346	0.206	0.237
2003	0.002	Spearman	0.314	0.274	0.322	−0.325	0.237	0.237
2001	0.001	Spearman	0.332	0.246	0.322	−0.329	0.232	0.237
1999	0.002	Spearman	0.327	0.253	0.322	−0.432	0.109	0.235
1997	0.001	Spearman	0.341	0.233	0.322	−0.429	0.113	0.235
1995	<0.001	Spearman	0.258	0.394	0.394	−0.367	0.197	0.237
1993	0.001	Spearman	0.413	0.185	0.322	−0.495	0.089	0.235
1991	0.002	Spearman	0.329	0.297	0.322	−0.412	0.163	0.235

DOI: 10.7717/peerj.4417/table-1

Table 2: Sauropodomorph correlation test results.

Sauropodomorphs			Sea level			Palaeotemperature
	Shapiro–Wilk (p)	Correlation test	cor	p	Adjusted p	cor	p	Adjusted p
2015	0.036	Spearman	−0.114	0.711	0.795	−0.171	0.527	0.609
2013	0.045	Spearman	−0.08	0.795	0.795	−0.138	0.609	0.609
2011	0.274	Pearson	0.399	0.201	0.877	0.095	0.736	0.81
2009	0.192	Pearson	0.399	0.201	0.877	0.067	0.813	0.813
2007	0.052	Pearson	0.161	0.619	0.877	−0.197	0.482	0.81
2005	0.477	Pearson	0.115	0.71	0.877	−0.221	0.41	0.81
2003	0.19	Pearson	0.168	0.614	0.877	−0.235	0.4	0.81
2001	0.385	Pearson	0.007	0.991	0.991	−0.199	0.477	0.81
1999	0.124	Pearson	0.105	0.75	0.877	−0.174	0.522	0.81
1997	0.887	Pearson	−0.145	0.673	0.877	−0.116	0.692	0.81
1995	0.485	Pearson	−0.091	0.797	0.877	−0.147	0.615	0.81
1993	0.763	Pearson	−0.155	0.654	0.877	−0.147	0.617	0.81
1991	0.295	Pearson	−0.145	0.673	0.877	−0.147	0.615	0.81

DOI: 10.7717/peerj.4417/table-2

Table 3: Theropod correlation test results.

Theropods			Sea level			Palaeotemperature
	Shapiro–Wilk (p)	Correlation test	cor	p	Adjusted p	cor	p	Adjusted p
2015	0.036	Spearman	0.175	0.588	0.672	0.115	0.71	0.868
2013	0.098	Pearson	0.234	0.464	0.464	0.334	0.264	0.362
2011	0.027	Spearman	0.099	0.751	0.751	0.059	0.844	0.868
2009	0.032	Spearman	0.17	0.579	0.672	0.055	0.856	0.868
2007	0.029	Spearman	0.17	0.579	0.672	0.055	0.856	0.868
2005	0.072	Pearson	0.289	0.316	0.464	0.363	0.184	0.362
2003	0.027	Spearman	0.407	0.151	0.659	−0.061	0.832	0.868
2001	0.006	Spearman	0.346	0.247	0.659	−0.086	0.773	0.868
1999	0.028	Spearman	0.379	0.202	0.659	−0.051	0.868	0.868
1997	0.193	Pearson	0.476	0.1	0.25	0.254	0.362	0.362
1995	0.107	Pearson	0.511	0.074	0.25	0.257	0.355	0.362
1993	0.101	Pearson	0.251	0.409	0.464	0.264	0.342	0.362
1991	0.013	Spearman	0.209	0.494	0.672	−0.071	0.803	0.868

DOI: 10.7717/peerj.4417/table-3

Table 4: Total dinosaur correlation test results.

All dinosaurs			Sea level			Palaeotemperature
	Shapiro–Wilk (p)	Correlation test	cor	p	Adjusted p	cor	p	Adjusted p
2015	0.327	Pearson	0.189	0.467	0.467	−0.051	0.832	0.984
2013	0.233	Pearson	0.226	0.385	0.467	−0.099	0.678	0.984
2011	0.059	Pearson	0.324	0.204	0.467	0.108	0.652	0.984
2009	0.021	Spearman	0.284	0.268	0.367	−0.072	0.763	0.876
2007	0.489	Pearson	0.233	0.367	0.467	0.01	0.966	0.984
2005	0.045	Spearman	0.207	0.407	0.367	−0.095	0.682	0.876
2003	0.053	Pearson	0.305	0.218	0.467	0.025	0.914	0.984
2001	0.043	Spearman	0.232	0.367	0.367	−0.089	0.71	0.876
1999	0.066	Pearson	0.342	0.179	0.467	0.005	0.984	0.984
1997	0.27	Pearson	0.358	0.159	0.467	−0.048	0.84	0.984
1995	0.13	Pearson	0.275	0.303	0.467	0.021	0.931	0.984
1993	0.119	Pearson	0.221	0.429	0.467	0.046	0.856	0.984
1991	0.049	Spearman	0.261	0.347	0.367	−0.04	0.876	0.876

DOI: 10.7717/peerj.4417/table-4

Discussion

The influence of sampling and publication history on dinosaur diversity estimates

The impact of publication history on estimates of both raw and subsampled dinosaur diversity has direct consequences for our interpretation of their evolutionary history and diversification (Benton, 2008a; Tarver, Donoghue & Benton, 2011). Using a small window of historical discovery, we show that dinosaur diversity remains highly volatile in specific geographical regions and geological time intervals, typically where sampling levels remain very uneven or the overall sampling pool is very small (Sepkoski, 1993; Alroy, 2000b). In poorly sampled areas, it is clear that even small changes to the data can yield substantial changes in diversity estimates, as we are often dealing with very small total sample sizes. This is reflected much less on an apparent ‘global’ scale, and much more so when we look at regional signals after subsampling is applied. As the rate of dinosaur discovery is increasing (both taxonomically and for occurrences) (Figs. 1 and 2), we expect this volatility to persist in the future.

As research on dinosaurs continues and new taxa are described and published from existing fossiliferous formations, raw diversity is expected to become more correlated with rock availability as a result of increasing sampling effort (Raup, 1977; Wang & Dodson, 2006; Benton, 2015), which represents a form of publication bias (Sepkoski, 1993; Alroy, 2000b; Jouve et al., 2017). Research has shown that new dinosaur discoveries, and changes in their taxonomy and phylogenetic relationships, can strongly influence our understanding and interpretation of their fossil record and diversification patterns (Weishampel, 1996; Tarver, Donoghue & Benton, 2011). In this study, we examined the historical trajectory of different dinosaur diversity estimates to observe whether sampling curves are beginning to stabilise or not. For raw diversity estimates, we find evidence for relatively stable patterns in spite of any ‘bonanza effect’ (i.e. fossil discoveries driving formation counts, especially prevalent in Lagerstätten) (Raup, 1977; Benton, 2015). The fact that the curves remain relatively consistent, despite the variable addition of new taxa, suggests we are seeing some form of the ‘redundancy’ hypothesis at play, in that fossils and sampling are non-independent from each other, when only raw data are considered (Benton et al., 2011, 2013; Dunhill, Hannisdal & Benton, 2014; Benton, 2015). Alternatively, a more appropriate interpretation might be that we are generally sampling fairly, or consistently, from an underlying occurrence pool through historical time, or that our application of subsampling based on a standardised estimate of coverage is sufficient to eliminate any such sampling biases.

However, what is the explanation for the diversity patterns we have obtained so far, and what does the variation in these patterns tell us? Generally, a dinosaur-bearing formation availability effect makes the Kimmeridgian, Barremian, Aptian, Albian, Campanian, and Maastrichtian the most productive stages (Barrett, McGowan & Page, 2009; Butler et al., 2011; Upchurch et al., 2011; Tennant, Mannion & Upchurch, 2016b). By counting genus density (number of genera per million years), three of these stages stand out: Kimmeridgian, Campanian, and Maastrichtian (Taylor, 2006), with Asia being the most productive continent followed closely by North America, then Europe, South America, Africa, Australasia, and finally Antarctica. However, what is clear from our analyses is that this is not historically consistent, and prone to change as new regions are opened up for exploration and discovery.

There is a well-recognised relationship between the amount of rock available for palaeontologists to search for dinosaur fossils, and how this influences our interpretations of their diversity patterns (Barrett, McGowan & Page, 2009; Butler et al., 2011; Mannion et al., 2011; Upchurch et al., 2011). This raises questions about the extent to which many aspects of diversity curves could be artefacts caused by changes in global sea levels, tectonics, and other geological processes related to preservational or geological megabiases (Peters & Foote, 2001; Smith, Gale & Monks, 2001; Smith & McGowan, 2007; Peters & Heim, 2010; Heim & Peters, 2011; Peters & Heim, 2011a; Smith, Lloyd & McGowan, 2012; Smith & Benson, 2013). As a way of exploring this, Barrett, McGowan & Page (2009) applied the ‘residuals’ method (originally designed by Smith & McGowan (2007) for marine fossil taxa) to account for these sorts of geological biases, and demonstrated that many features of dinosaur diversity curves are sampling artefacts that reflect changes in the amount of fossiliferous rocks and thus reflect geological rather than biological signals. This method has since received substantial criticism, however, and might not be appropriate for studies of palaeodiversity (Brocklehurst, 2015; Sakamoto, Venditti & Benton, 2017). Nevertheless, the influence of these geological biases appears to have been largely mitigated in recent studies by considering a historically accurate account of sampling and modelling variation through time (Alroy, 2010a, 2010b, 2010c; Newham et al., 2014; Mannion et al., 2015; Nicholson et al., 2015; Grossnickle & Newham, 2016; Tennant, Mannion & Upchurch, 2016b). Here, sampling heterogeneity in terms of both collection effort and rock availability can be accounted for through subsampling methods, which appear to capture and alleviate at least part of the geological signal. These relative changes in the amount of rock available for sampling, the number and abundance of different taxa, and the historical sampling intensity of different rock formations have implications for the patterns of palaeobiological change that we infer from them. An interesting extension of the present study, which explores historical publication bias, would be to test how the historical context of sampling (e.g. outcrop area variation or availability through time, sampling intensity through time) corresponds to our historical estimates of diversity.

We find that there are four main time periods when great caution should be applied to interpreting processes or patterns based on dinosaur diversity, based on volatility in subsampled diversity estimates and coverage levels. These are: (1) the Late Jurassic interval for theropods in Europe, North America, and Asia (Figs. 9 and 10); (2) the mid-Late Cretaceous interval for theropods in South America and Asia (Figs. 9 and 10); (3) the mid-Late Cretaceous interval for ornithischians in North and South America and Asia (Figs. 6 and 7); and (4) the mid-Late Cretaceous for sauropodomorphs in Africa, Asia, and South America (i.e. Gondwana) (Figs. 12 and 13). As well as this, the Late Triassic dinosaurian record is in a state of flux at the present (Baron, Norman & Barrett, 2017), and should be interpreted carefully (Figs. 6, 9 and 12). These represent the times when diversity estimates are changing most rapidly due to a combination of taxonomic revision and discovery-driven publication. While we cannot predict the future of dinosaur discovery, or the selective nature of publication, it seems prudent to suggest that we be cautious in our interpretation of events in dinosaur macroevolution in these intervals, similar to the conclusions reached by Tarver, Donoghue & Benton (2011).