A quantitative assessment of site-level factors in influencing Chukar (Alectoris chukar) introduction outcomes

Chukar partridges (Alectoris chukar) are popular game birds that have been introduced throughout the world. Propagules of varying magnitudes have been used to try and establish populations into novel locations, though the relationship between propagule size and species establishment remains speculative. Previous qualitative studies argue that site-level factors are of importance when determining where to release Chukar. We utilized machine learning ensembles to evaluate bioclimatic and topographic data from native and naturalized regions to produce predictive species distribution models (SDMs) and evaluate the relationship between establishment and site-level factors for the conterminous United States. Predictions were then compared to a distribution map based on recorded occurrences to determine model prediction performance. SDM predictions scored an average of 88% accuracy and suitability favored states where Chukars were successfully introduced and are present. Our study shows that the use of quantitative models in evaluating environmental variables and that site-level factors are strong indicators of habitat suitability and species establishment.


INTRODUCTION
A central goal in the study of introduced species involves determining the forces that might influence introduction success. Duncan, Blackburn & Sol (2003) argued that such forces fall in to three categories: species-level factors; site-level factors; and event-level factors. In a recent analysis, Moulton et al. (2018), examined a large database of game bird introductions to the United States (US), from the Foreign Game Investigation Program (FGIP). Based on the pattern of introduction successes, Moulton et al. (2018), argued that the best explanation for the pattern indicated that location-level factors must frequently be more important in deciding the fates of game bird introductions than the Alcorn & Richardson, 1951;Barnett, 1952;Galbreath & Moreland, 1953;Christensen, 1954Christensen, , 1970Christensen, , 1996Bohl, 1957;Harper, Harry & Bailey, 1958;Long, 1981;Lever, 1987). These qualitative assessments indicate that Chukars favored habitats that resembled their native range, but they might not persist in environments with too much snow (e.g., Gullion, 1965;Christensen, 1970), or hot, arid regions with limited access to free water (e.g., Galbreath & Moreland, 1953;Tomlinson, 1960;Christensen, 1970;Larsen et al., 2007).
Although Chukars have been studied extensively (e.g., Nagel, 1945;Alcorn & Richardson, 1951;Barnett, 1952;Galbreath & Moreland, 1953;Christensen, 1954Christensen, , 1970Christensen, , 1996Bohl, 1957;Harper, Harry & Bailey, 1958;Tomlinson, 1960;Moulton, Cropper & Broz, 2015;Moulton et al., 2018;Moulton & Cropper, 2016, little information exists for quantitative comparisons of locations deemed suitable versus not suitable. With this in mind, the goal of our study was to develop quantitative models using machine learning algorithms to assess environmental factors of sites in the Chukar's native range, and to determine/predict most suitable sites for introductions in the 48 contiguous states of the US. Such algorithms represent a standard practice for formulating a species distribution model or SDM, which identifies characteristics of a species niche or could be applied to predicting potential range expansions or contractions (Phillips et al., 2009;Elith et al., 2011;Hijmans, 2012;. Species distribution models (SDMs) use species occurrence records and compare them to areas where they are absent. In many cases, absences were not recorded, and models refer to 'pseudo-absent' or 'background' data, places assumed less suitable due to the lack of occurrence records, for comparative purposes (Phillips et al., 2009). These models score and rank each point in the area of interest to determine the level of suitability. Traditional methods rely on regression techniques such as generalized linear models and generalized additive models because of their seemingly straightforward interpretability (Elith, Leathwick & Hastie, 2008;Hastie, Tibshirani & Friedman, 2009). Machine learning algorithms are also able to produce such models but do not require any preconceived relationships between environmental covariates and are able to handle complex data distributions (Elith, Leathwick & Hastie, 2008;Hastie, Tibshirani & Friedman, 2009;Elith et al., 2011). A common problem when trying to quantify habitat quality is determining which variables to include in model building given that reducing the dimensionality of the problem could lead to missing subtle non-linear interactions. We chose algorithms with proven success when applied to large, multi-dimensional data sets to remove bias from the covariate selection. A common practice is to choose the 'best' model based on a set of statistics. To eliminate this bias and to ensure a collective analysis, we chose to construct and examine ensembles (e.g., model averages) to assess the importance of various site-level factors.

METHODS
All of our analysis, model construction, and graphics were done using R Ver. 3.5.2 statistical computing language (R Core Team, 2018). All data points and covariates were measured using geospatial packages and extracted from 2.5-min spatial scale raster and polygon layers.

Species occurrence data
We constructed two sets of models for our analysis. First, we collected observations submitted to eBird (Sullivan et al., 2009;eBird, 2021) that provided a media source for validation. We removed all duplicate occurrences that shared longitudinal-latitudinal coordinates leaving us with 1,302 occurrences. We compared these occurrences to where Chukars were successfully introduced and determined they were appropriate for our analysis since both records shared similar distributions (e.g., Christensen, 1970Christensen, , 1996Long, 1981;Lever, 1987). Note, 21 occurrences that met our requirements were from states where Chukar failed to establish (Christensen, 1970(Christensen, , 1996Long, 1981;Lever, 1987). While there are no records that indicate these are wild Chukar or that the locations are suitable, we retained these points to avoid sampling bias. We then randomly selected 10,000 background points from all terrestrial regions, excluding Antarctica and snow-covered Greenland using the 'spsample' function from the 'sp' package (Pebesma & Bivand, 2005). We chose to exclude these regions because the persistent, thick snow-covered areas are unsuitable habitat for Chukar, as noted above. Bohl (1957) documented several studies within the US regarding the distance travelled by released Chukars. We averaged these measurements (m ¼ 49 km) and created circular buffers around each point and merged these polygons to create an estimated range plot (Fig. 1). Mori et al. (2019) note that subspecies selection may account for variability in distribution models. Therefore, we chose to create similar models using the 97 points from the naturalized region in California, US ( Fig. 1) and predict the remaining area of the conterminous US. We chose this region because historical records show that California was one of the first places where Chukars were imported from their native range, reared, released, established/acclimated, and hunted in the US. Furthermore, several other state game commissions acquired Chukars from California game farms, with both failed and successful introductions of large propagules documented (Nagel, 1945;Galbreath & Moreland, 1953, Harper, Harry & Bailey, 1958. It should be noted that though Chukars in Nevada, the first state in the US with successfully established Chukars and the first to hold a hunting season, were not derived from California game stock, but were the same subspecies as those in California (Alcorn & Richardson, 1951;Harper, Harry & Bailey, 1958;Christensen, 1970).

Model covariates
Choosing appropriate model covariates often depends on spatial scale (Pearson & Dawson, 2003;Luoto, Virkkala & Heikkinen, 2006). Some studies (Pearson & Dawson, 2003;Thuiller, Araújo & Lavorel, 2004;Luoto, Virkkala & Heikkinen, 2006;Engler et al., 2017) note that while finer spatial scales may account for more specific habitat characteristics (e.g., biotic interactions, species dispersal limitations, soil and land cover types), Pearson & Dawson (2003) suggest a hierarchal consideration of factors, with importance dependent of spatial scale. Since our study was of the subcontinental scale and a coarser spatial resolution, model parameters pertained to climate and physiography. A favored set of measurements for SDMs are the 19 WorldClim bioclimatic variables (Hijmans et al., 2005;Fick & Hijmans, 2017;Schatz, Kramer & Drake, 2017) and were included in our model building procedure. Limiting studies to these variables alone has been criticized as this might overestimate habitat ranges and does not speak to local ecosystems (Hirzel & Le Lay, 2008). Because Chukars require steep, talus slopes and favor higher altitudes (e.g., Christensen, 1970), we also included elevation, and calculated the slope, aspect, and Terrain Roughness Index score for each cell using the NASA Shuttle Radar Topography Mission raster layer provided by WorldClim (Fick & Hijmans, 2017). Finally, to ensure equal weight to all model covariates, we normalized each raster value prior to point extraction, where values range from 0 to 1.

Algorithms
An extensive study by Norberg et al. (2019) showed SDM predictions may vary due to a variety of factors including preconceived assumptions, statistical inference, and algorithmic framework, and encourage the use of several models rather than a single 'best' model. We, therefore, chose to incorporate several of their suggestions into our modeling framework. We built our models using the 'dismo' package in R . We used the machine learning algorithms cited in  as these algorithms are widely used but have been shown to produce highly accurate models. These algorithms include artificial neural networks (ANN) as defined by Kulhanek, Leung & Ricciardi (2011), gradient boosting trees (GBM; Elith, Leathwick & Hastie, 2008), maximum entropy (MaxEnt; Elith et al., 2011), random forest (RF; Breiman, 2001), and support vector machines (SVM; Drake, Randin & Guisan, 2006).
A common statistical technique for model training and testing is to use a K-fold cross validation on the data to determine model performance (Hastie, Tibshirani & Friedman, 2009;Schatz, Kramer & Drake, 2017;Norberg et al., 2019). We used a 5-fold cross validation for all sets of models where all background points were used for each fold to enhance habitat variability.
To evaluate each iteration of model building, we chose to use the Area under the Receiver operating Curve (AUC) as a measure of model usefulness and testing performance for each fold. To test overall performance of each algorithm, we calculated the mean and standard deviation across all five folds, then created an ensemble of the model by averaging the prediction models. To evaluate model predictions, we transformed our ranked models to a binary map (i.e., suitable-not suitable) using the optimized specificity-sensitivity threshold measurement (Liu, Newell & White, 2016). We calculated the percent classification accuracy, sensitivity, and specificity of each model via confusion matrix comparing the predictive plots to the estimated naturalized range plot (Fig. 1) to determine model performance.
Finally, we created three ensembles for each algorithm, one average and two election models, and two collective election ensembles built using all 25 models. Here, an election is one where each model casts a 'vote' on the status of the raster cell and the classification in the final model is determined by a set proportion. For each algorithm, we calculated the average predictive value for each raster cell across the five folds and average the thresholds produce our binary classifications. For our election models we used a majority vote (MV, 3 of 5 folds) and a unanimous decision (UD), cases where all 5 folds agree, to determine suitability. We also used the majority (13 of 25 folds) and the UD frameworks for the two collective election ensembles. We then calculated our prediction statistics for comparative purposes.
A full analysis of the ensemble performances can be found in Table 1. Statistically, our ensemble models for ANN, GBM, and RF performed similarly to their individual folds and were slightly more accurate and specific (Figs. S6-S10). The performance of the mean and MV ensembles varied by algorithm; however, all five algorithms had improved accuracy via UD ensemble. As for our collective ensembles, collective MV performed well, but ANN MV was more accurate and sensitive. Our collective UD model produced the most convincing result with an accuracy of 0.8973 and the highest sensitivity (0.6918) scores amongst all the ensembles.
In comparison to the native range models sensitivity scores were higher for ANN l ¼ 0:2379; r ¼ 0:1701 ð Þ , GBM l ¼ 0:4498; r ¼ 0:1939 ð Þ , and MaxEnt l ¼ 0:2368; r ¼ 0:1248 ð Þ , roughly the same for SVM l ¼ 0:2576; r ¼ 0:0899 ð Þ , and slightly lower for RF l ¼ 0:3099; r ¼ 0:2511 ð Þ . Specificity scores were higher only for Þ . We found similar improvements in our ensemble models to that of the individual models (Table 1). Suitable habitat was further reduced to states roughly 103 W to 124 W. The majority of suitability fell in the Chukar naturalized range, and states where Chukars are present especially California, Idaho, Nevada, and Oregon (Fig. 3).  Pearson & Dawson (2003) state that suitability models favor a hierarchal framework, suggesting that measuring the bioclimatic envelope as a preliminary step in the modeling scheme may give insight to a broad domain of suitability. We limited our models to data derived from climate and topography, which do not speak to localized circumstances and are not exclusive to Chukar. Nonetheless, we were able to produce accurate SDMs using both native range data and data from just a small partition of the US naturalized range. The majority of the SDM plots show a strong suitability in the western half of the US, particularly in states where Chukars exist, confirming that site-level factors are indeed significant predictors of Chukar establishment success. Traditionally, species occurrences are recorded in field studies or pulled from published databases. Even so, these studies may only represent a portion of the suitable habitat range due to abiotic barriers or individuals simply were not reported in suitable areas. (Phillips et al., 2009;Hirzel & Le Lay, 2008;Robinson et al., 2011). With the growing interest and use of citizen science, SDMs studies now resort to databases such as the Global Biodiversity Information Facility (e.g., Beck et al., 2014;Mori et al., 2019), iNaturalist (e.g., Mori et al., 2019), and eBird (e.g., Steen, Elphick & Tingley, 2019). However, some of these records are unverified as individuals could be misidentified, data may be missing or inaccurate (e.g., longitudinal/latitudinal coordinates reversed), or the inability to differentiate between wild and recently released or escaped captive individuals. To avoid these pitfalls, we implemented a data filtering process on eBird data that limited our study to occurrences with proof of observation, and we removed all duplicate records. Because the distribution of occurrences was consistent with historic range maps (e.g., Christensen, 1970), we found these points suitable for our analysis.

DISCUSSION
We used algorithms that are widely chosen by ecologists (Drake, Randin & Guisan, 2006;Elith, Leathwick & Hastie, 2008;Elith et al., 2011;Schatz, Kramer & Drake, 2017;Norberg et al., 2019) and three classification statistics in our analysis to understand a model's predictive limitations. With respect to all 84 models produced, we calculated accuracy l ¼ 0:8823 ð Þ , sensitivity l ¼ 0:3525 ð Þ , specificity l ¼ 0:9397 ð Þ . We suspect the low sensitivity scores were due to spatial autocorrelation, which predicts a greater range of suitable location. Even then, our models still favored the states where Chukars are established. These statistics show that while our models perform well in classifying novel locations, the high specificities and lower sensitivities suggest our models are better at predicting locations that are not suitable over those that are. In spite of this, we argue that it is just as important to know this information, especially for gamebird introductions (Nagel, 1945, Bohl, 1957. Further inspection into species and site-level factors known to affect Chukar introduction success should be considered. Several field studies (Nagel, 1945;Alcorn & Richardson, 1951;Barnett, 1952;Galbreath & Moreland, 1953;Christensen, 1954Christensen, , 1970Christensen, , 1996Bohl, 1957;Harper, Harry & Bailey, 1958) suggest certain land types are associated with successful establishment, though modeling this should be done at a finer spatial scale (Pearson & Dawson, 2003;Thuiller, Araújo & Lavorel, 2004;Luoto, Virkkala & Heikkinen, 2006). It is also well documented that the availability of cheatgrass (Bromus tectorum) is a convincing indicator of establishment in the US (Galbreath & Moreland, 1953;Harper, Harry & Bailey, 1958;Christensen, 1970) though data regarding to cheatgrass occurrence and/or density is minimal, which is why it was excluded from our models. Artificial habitat or watering systems (e.g., 'guzzlers') have been integrated into game introductions, which may have been an adequate supplemental resource to facilitate establishment in some studies (Harper, Harry & Bailey, 1958;Christensen, 1970;Larsen et al., 2007). Sub-species environmental tolerance has been shown to impact SDM model predictions (Mori et al., 2019) and while this is implied by our spatial partitioning of the naturalized range and the use of the California data, further studies may be necessary to understand true range potential. Even then, domestication of reared Chukars may also affect introduction attempts (Alcorn & Richardson, 1951).
With this in mind, it is important to note that while our models do not account for specific site-level, species-level, or event level factors, gamebird introduction efforts would benefit from consideration of habitat variables. Modeling of the bioclimatic envelope allows us to not only determine which states consist of potential habitat, but which ones to avoid.

ADDITIONAL INFORMATION AND DECLARATIONS Funding
The authors received no funding for this work.