Study areas and GPS tracking American white pelicans
American white pelicans are divided into eastern and western metapopulations by the Continental Divide (Anderson and King 2005). The AWPE eastern metapopulation migrates between the Northern Great Plains (the breeding grounds) and the NGOM (the non-breeding grounds) along the Mississippi and Ohio River Valleys (King et al. 2017, Shannon et al. 2002). Our study areas included the entire geographic distribution range of the eastern metapopulation of AWPE in North America (Fig. 1). We followed King et al. (2017) to delineate the boundaries of the breeding and non-breeding ranges.
We used presence data from 36 AWPEs which had been tracked with the global positioning system (GPS) transmitters for one, two or a maximum of three years from 2002 to 2010 (King et al. 2016). We divided the recorded GPS locations into the AWPE breeding grounds (only migrants) and non-breeding grounds (including migrants and year-round residents). A GPS-tracked individual was considered in spring migration (departure from the non-breeding range) when it crossed 35oN latitudinal line northward until it reached the breeding grounds (see King et al. 2017), and vice versa in autumn. For the purposes of this study, we assumed that migration ended when the given individual entered the breeding (spring) or non-breeding (autumn) region (King et al. 2017). We excluded AWPE GPS locations during seasonal migration from our analysis.
The potential usefulness of satellite tracked data in avian Species Distribution Models (SDMs) has been increasingly addressed in the ecological literature (Engler et al. 2017). Several studies have demonstrated that even a limited number of tracked individuals (six or seven birds) can produce accurate and reliable predictive models (Jiguet et al. 2011, Gschweng et al. 2012, Williams et al. 2017). Aebischer et al. (1993) suggested six radio-tracked individuals as the minimum number sampled for habitat selection with relemetry relocation data. In our study, the number of tracked individuals and the number of presence locations were above the minimum suggested or used for avian SDMs and habitat selection in the literature (Aebischer et al. 1993, Proosdij et al. 2016, Williams et al. 2017).
Processing and filtering of GPS data for the breeding and non-breeding grounds
We filtered and thinned the presence locations to control for spatial bias and unbalanced numbers of GPS locations among individual (Phillips et al. 2006, Williams et al. 2017). First, we identified the GPS-tracked individual with the smallest number of GPS locations among the 36 AWPEs. We then reduced the number of GPS locations of the other individuals to the smallest number by randomly sampling from all available locations of an individual. Second, we resampled each of the 36 subsets of presence locations so that each location was separated from the nearest one by a minimum distance of 5 km. In a preliminary analysis, we found that average hourly movement distance of AWPE during the active hours (0900–1700 hr) on both the breeding and non-breeding grounds was about 5 km. Last, to ensure that we excluded those locations where AWPE were flying at higher altitude above ground, we also excluded from analyses those locations in the upper decile of flying speed (> 4.45 m/s) (Illan et al. 2017).
In a preliminary analysis, we built SDMs using data on the GPS locations from migrants and year-round residents on the non-breeding grounds during winter. The results of SDM fitting and niche analyses did not differ in either direction or significance between the migrants and residents. Thus, in further analyses, we combined the winter GPS locations of the migrants and the residents at the non-breeding grounds.
To test for the effects of the aquaculture decline in AWPE habitat use and ecological niches, we divided our AWPE GPS location data into two subsets: before and after the 2005 peak of aquaculture activities on the non-breeding grounds. All comparative analyses were based on SDMs developed with these two sets of data (hereafter, pre- and post-aquaculture). We expected that SDMs would show different land-cover niche breadth before and after the peak of aquaculture.
Environmental variables
We used two types of environmental predictors in raster format: (i) climate (14 variables) and (ii) land cover (five variables).
Climate: Climatic variables were obtained from the WorldClim (version 2) database (Hijmans et al. 2005; http://www.worldclim.org), which provides a variety of monthly climatic data averaged over the years 1970–2000. Three main reasons persuaded us to develop our models using the WorldClim data. First, the WorldClim data include part of the AWPE non-breeding range in the Southern GOM outside the USA, which is not encompassed by an alternative raster climatic data PRISM from Oregon State University (PRISM Climate Group 2012). Second, wind direction and strength are two main drivers of AWPE movements and behaviour (Illan et al. 2017). However, none of the alternative climate data sources included data on wind currents at the desired spatiotemporal scale. Last, we regressed data obtained from two alternative climate sources, Climate Research Unit (CRU) of University of East Anglia (Harris et al. 2014) and PRISM against the same variables from the WorldClim data. Correlations were highly significant (R2 > 0.8; p < 0.001) in all climatic variables considered.
The choice of climate predictors reflected energy and water constraints on the spatial distribution of birds. We selected July for the warmest month on the breeding grounds and January for the coldest month on the non-breeding grounds of AWPE (King et al. 2017). The arrival and departure dates of AWPE seasonal migration vary substantially from year to year. The two months correspond to the time of year when AWPEs are present on the breeding and non-breeding ranges (King et al. 2017). Initially selected climatic variables included minimum, maximum and average temperature, total precipitation, solar radiation, water vapour pressure and wind speed.
Land cover: We used North American Land Change Monitoring System (NALCMS, 2005; Homer et al., 2015) (https://landcover.usgs.gov) based on MODIS satellite imagery to obtain land cover data. We chose NALCMS also because the US Land Cover Data Base does not include the AWPE geographic range outside the USA. The classification of land cover types was designed with three hierarchical levels using the Food and Agriculture Organization (FAO) Land Classification System (Homer et al, 2015). Given the importance of the presence of water for AWPE foraging habitat, we used a layer called “wetlands” (which discriminates between different types of wetlands) from the Global Lakes and Wetlands Level 3 Database (GLWD-3) (Lehner and Döll, 2004). We also calculated the variable “distance to water” as the Euclidean distance from every tracked location to the nearest permanent water body. For each individual AWPE and for each GPS location, we extracted the selected land cover variables using ArcMap spatial analyst (ESRI, 2014).
To eliminate multicollinearity, we used a backwards stepwise process to calculate the variance inflation factor (VIF) starting from the initial pool of 19 predictor variables. We only selected predictors of VIF < 4 (Graham 2003, O´Brien 2007, Kock and Lynn 2012). With the resulting set of variables, we calculated a matrix of pairwise Pearson’s correlation r and discarded any remaining predictors of r > 0.7. All land cover raster files were resampled to the same spatial resolution of climatic layers at the 30 arc-second grid cell size (equivalent to 1 km resolution). The 1-km resolution was chosen for a balance between the computational burdens and the accuracy of SDMs, given the spatial extent of North America in this study. Our selected spatial resolution is less than the average hourly movement distance (5 km) of AWPE, sufficient for AWPE species distribution modelling, and is consistent with the spatial resolution of the climatic raster files (Table 1).
Table 1
List of the environmental variables included in the species distribution modelling of American white pelicans.
Environmental variable | Code | Mean (min-max) | Original resolution | Source | Units / classes |
a) Climatic Minimum Temperature (July) Total Precipitation (July) Total radiation (July) Wind speed (January) | tmin07 ppt07 rad07 wind01 | 11.3 (-21.7–40.8) 71.2 (0–967) 20305 (123–28340) 3.5 (0.5–19.1) | 30 arc-sec 30 arc-sec 30 arc-sec 30 arc-sec | Worldclim Worldclim Worldclim Worldclim | ºC mm kJ/m2 day m/s |
b) Land cover Distance to water 2005 land cover Land cover change (2005–2010) | hidro lc2005 lchange | 17.3 (0–174.5) categorical categorical | 250 m 250 m 250 m | USGS MODIS MODIS | km 16 NA |
Wetlands | wetland | categorical | 30 arc-sec | GLWD | 10 |
Species distribution modelling and ecological niche analysis
To estimate the potential spatial distribution, climatic niche, and land-cover niche of AWPE, we built SDMs using climate, land-cover and climate-land cover combined predictors, respectively. We used the Maximum Entropy Model (MAXENT) for AWPE SDMs (Phillips et al. 2006). MAXENT has become a benchmark SDM for presence-only data and has been shown to outperform other algorithms (Elith et al. 2006). Additionally, MAXENT software was designed to integrate with GIS software thus making data input and distribution map outputs easier to handle (Elith et al. 2010). MAXENT compares the environmental conditions measured by selected predictors at presence locations with those at randomly selected points (i.e., pseudo-absence), which quantify the available 'background' of environmental conditions. MAXENT estimates the spatial distribution (i.e., relative occurrence likelihood in each grid cell) by finding the largest spreading of occurrence conditions (i.e., maximum entropy), with the constraints of the observed mean and variance of environmental conditions at the presence locations (Warren et al. 2010). Maximum entropy is essentially the same as maximizing the log likelihood of the exponential family data associated with species presence (Renner and Warton 2013).
As proved by Renner and Warton (2013), MAXENT can be parameterized in a way equivalent to a Poisson Point Process Model (PPM) for spatial count data. Consequently, we changed the MAXENT default settings following recommendations from Renner et al. 2015 to develop our PPM to retain duplicate GPS records in the same grid cell. Then, the raw output can be interpreted as a relative habitat-use intensity for each grid cell, proportional to a Poisson intensity (not the default logistic output), to create habitat suitability maps across the entire AWPE geographic range. All other MAXENT settings were left at their default values. We used 10,000 randomly selected background locations for MAXENT. Thirty percent of occurrence records were withheld from each model to be used as independent testing data in a cross validation. For our models, background points were selected within the boundaries of the breeding and wintering grounds of the AWPE eastern population, respectively, following King et al. (2016). Then we projected the models to Canada, the United States, and the Southern GOM to cover all areas potentially reachable by AWPE.
The overall performance of the models was evaluated via the True Skill Statistic metric (TSS) (Allouche et al. 2006). The TSS takes into account both omission and commission errors, and its value ranges from − 1 to + 1, where < 0 indicates indiscriminative; 0 no better than random models; 0–0.2 poor; 0.2–0.6 fair to moderate; > 0.6 good to excellent; and + 1 perfect agreement (Jones et al. 2010). The TSS is not affected by either prevalence or the size of the validation set (Allouche et al. 2006). We calculated the TSS for each MAXENT using the R packages “rocr” (Sing et al. 2005) and “boot” (Canty and Ripley 2017) in the R environment (R Development Core Team 2017). We performed a 10-fold cross-validation procedure and calculated the average TSS values for each model.
We also used the Area Under the Receiver-Operating Characteristic (ROC) Curve (AUC) to assess and compare the performance of the three models. The AUC metric has been widely used to assess model performance in machine learning and species distribution modeling (e.g., Araujo and Guisan 2006, Marmion et al. 2009, Deblauwe et al. 2016), although potential limitations have also been identified (see Lobo et al. 2008). The AUC values range from 0.0 to 1.0, with 0.5 indicating fit no better than random; 0.6–0.9 good; >0.9 indicating excellent fitting to data; and 1.0 indicating perfect model performance (Pearce and Ferrier 2000).
We calculated niche overlap between the breeding and non-breeding AWPE populations using the I statistic and Schoener’s D implemented in ENMTOOLS (Schoener 1968, Warren et al. 2008, 2010). Both metrics were calculated using habitat suitability scores in all grid cells after suitability index values were standardized, Indices I and D range from 0 to 1, with 0 indicating no niche similarity between seasonal niches and 1 indicating complete similarity (Broennimann et al. 2012, Warren et al. 2008). We calculated the niche breadth of the populations using the Levin’s concentration metrics implemented in EMNTOOLS (Quintero and Wiens 2013). The Levin’s concentration metric also ranges from 0 to 1, indicating minimum to maximum niche breadth, respectively. We developed 10 MAXENT models by bootstrapping 10 sets of training data for each of the breeding and non-breeding AWPE populations and then calculated the niche breadth and niche similarity (overlap), respectively (Warren et al. 2010).
Maxent models project living environmental conditions to suitable geographic areas using relationships between occurrence probability or intensity and observed living conditions of species (Elith et al. 2010, Boucher et al. 2014). Such geographic projection approaches may bias the estimation of ecological niches between different geographic regions (such as breeding and nonbreeding regions) and between different periods (such as seasons). Estimation of niche breadth and overlap may be more problematic when comparing seasonal niches in migratory than non-migratory birds because of differences in geographic extend, sampling bias, and climatic regimes (Elith et al. 2010, Boucher et al. 2014, Gomez et al. 2016). To verify the estimation of niche overlaps with Maxent models, we also estimated the niche overlap between seasons using an ordination method based on principal component analysis (PCA, Broennimann et al. 2011). The ordination approach compares seasonal niches in the environmental space represented by the first two axes of PCA. Species occurrences are first mapped to a 100 × 100 grid on the environmental space; then, occurrence density is estimated by kernel smoothing methods for each cell and is rescaled to the range from 0 to 1 (Broennimann et al. 2011). Climatic condition or resource availability is smoothed and rescaled to the 0–1 range for each grid cell as well (Broennimann et al. 2011). Therefore, the ordination approach allows for direct niche comparisons between the regions and between seasons taking into account differences in resource availability, climatic condition, and geographic extent. The ordination method can also visualize ecological niche overlap between the breeding and non-breeding seasons in the first two PCA axes of the environmental space. We used the R package ecospat to conduct the ordination analysis of pelican seasonal niches (Di Cola et al. 2017). The niche space of a season or population is represented by a polygon in the environmental space with its limits of the two PCA scores being chosen to match with the geographic backgrounds for the two populations. As a result, the polygon may be truncated by the chosen limits of the two PCA axes (Bronnimann, personal communication). If estimates of ecological niche overlap with both Maxent-based and ordination-based methods supported our hypotheses concerning seasonal niche variation, we focused on interpreting the Maxent-based estimations because of the wide applications of Maxent-based models.