Occurrence records
A set of records of D. calderonii was compiled from a large part of the Mesoamerican region, specifically in southern Mexico in the States of Oaxaca and Chiapas and in Central America in Guatemala, El Salvador, and Honduras. Species occurrence data were obtained from the Global Biodiversity Information Facility (GBIF.Org User, 2022) The Tropicos database of the Missouri Botanical Garden, the herbaria of Mexico (MEXU and CHIP), the herbaria of El Salvador (LAGU and MHES), records of the project "Generación de Capacidades y Lineamientos Técnicos de Manejo para Elaborar Dictámenes de Extracción no Perjudicial Orientados a las Especies del Género Dalbergia en Guatemala, El Salvador y Honduras" (MARN, 2020) and records from fieldwork carried out from 2021 to 2022 (years).
Of the 301 records compiled, misidentified and duplicate records were eliminated. To reduce spatial autocorrelation and improve the performance of the models (Boria et al., 2014), spatial filtering was applied when the distance between two or more records was less than 1 km and only one of the records was retained.
Environmental variables
Environmental variables were obtained from the CHELSA V2.1 project. (Karger et al., 2017), and initially the 19 available bioclimatic variables were taken, which have spatial information of temperatures and precipitation in digital format at a resolution of 30 seconds (~1 km2 at the equator), and with a temporality of 29 years (1981-2010). Of these variables, four that combine temperature and precipitation information within the same layer (i.e., Bio8, Bio9, Bio18, and Bio19) were eliminated, as they potentially present spatial anomalies in the form of odd discontinuities between neighboring pixels for the study area (Booth, 2022). In addition, we included Potential Evapotranspiration (ET0) and Global Aridity Index (IA_ET0) variables (Trabucco & Zomer, 2019) and 17 variables related to soil physical and chemical properties obtained from the SoilGrids v2.0 database of ISRIC-World Soil Information (Hengl et al., 2017) at a resolution of 30 seconds. Soil is an important factor in the geographic distribution of plants (Linder et al., 2005; Giraldo-Kalil et al., 2023). Therefore, edaphic variables have been recommended for more adequate modeling (Velazco et al., 2017).
The selection of the variables to characterize the ecological niche of the species followed the performance of several tests, including a Jackknife analysis, a principal component analysis (PCA), a variance inflation factor analysis (VIF) and finally, a Spearman correlation analysis (Cobos et al., 2019). Those variables matching the criteria of the highest significance for Jackknife analysis and PCA, with a VIF value <10 and a correlation value <0.8 were selected. All these selection methods were applied as this allows the identification of an adequate set of variables that improves the performance of the models (Cobos et al., 2019). The final variables selected are summarized in Table 1.
Table 1. Climatic and edaphic variables used to create the models of potential distribution and ecological niche of D. calderonii.
Type of variables
|
Variable
|
Description
|
Climate
|
Bio4 *
|
Temperature seasonality (°C)
|
Bio6 ±
|
Mean daily minimum air temperature of the coldest moth (°C)
|
Bio7 *
|
Annual range of air temperature (°C)
|
Bio10 *
|
Mean daily mean air temperatures of the warmest quarter (°C)
|
Bio14 ±*
|
Precipitation amount of the driest month (kg m-2)
|
Bio15 ±*
|
Precipitation seasonality (Kg m-2)
|
Bio16 *
|
Mean monthly precipitation amount of the wettest quarter (Kg m-2)
|
Bio17 *
|
Mean monthly precipitation amount of the driest quarter (Kg m-2)
|
ET0 ±
|
Potential evapotranspiration (mm yr-1)
|
Edaphic
|
BDTICM ±
|
Absolute depth to bedrock (cm)
|
BDRICM ±
|
Depth to bedrock (R horizon) up to 200 cm
|
CECSOL ±
|
Cation exchange capacity of soil in cmolc/kg
|
ORCDRC ±
|
Soil organic carbon content (fine earth fraction) in g per kg
|
PHIKCL ±
|
Soil pH x 10 in KCl
|
SLTPPT ±
|
Silt content (2-50 micrometer) mass fraction in %
|
± Variables used in the general model and characterization of the niche; *variables used in the models projected into the past.
Calibration area
Maxent is an algorithm that requires a calibration area to model species distribution, since it requires a background to characterize the ecological niches from the presence records. The BAM diagram (Soberón & Peterson, 2005; Soberón, 2007) is a conceptual framework for interpreting the interactions between the three main factors that determine species distribution: biotic factors (B), abiotic factors (A), and the accessible area(M). In this context, the calibration area is represented by M and corresponds to those areas that the species has been able to access in a given historical period, which is compatible with the biogeographic history of the species (Soberón & Peterson, 2005; Barve et al., 2011). Based on the suggestions by Rojas‐Soto et al. (2024) the delimitation of the calibration area was defined by selecting the areas coinciding with the records of presence with biogeographic units, which in this case were the biogeographic regions of the Neotropics (Morrone et al., 2022). Once the regions were selected, the areas were shortened by applying a spatial restriction or buffer of 100 km to the occurrences (Figure 1). When the analysis of Mexico (Mmx) and Central America (Mca) populations were established (see below), two areas of accessibility were again created following the same procedure.
Species distribution models (SDM)
For the development of the distribution models of D. calderonii (general model) and its two population models (see below), we used the Maxent V3.4.1 algorithm (Phillips & Dudík, 2008), which performs well with small sample sizes (Pearson et al., 2007). The modeling process was carried out with the package kuenm (Cobos et al., 2019), which allows testing different regularization values, which in this case were 17 values tested (from 0.1 to 1 incrementing every 0.1, and from 2 to 8 incrementing every 1) and only three types of relationships: linear, quadratic and their combination (l, q and lq) to reduce the complexity of the model.
The overall model was calibrated with 70% of the presence data and 30% for validation; 10 replicates were performed with a limit of 1000 iterations, and the Bootstrap type resampling method was used. The candidate models were evaluated on the basis of the statistical significance of the partial ROC curve (Peterson et al., 2008), the omission rate (<0.05%) and the lowest value of the Akaike's information criterion (AIC) corrected for small sample sizes (Warren et al., 2010; Warren & Seifert, 2011; Morales et al., 2017).
The final overall model was constructed from the median of the ten replicates based on the best combination of parameters evaluated. The resulting map in continuous format was converted to a binary map (presence/absence) using the probability value from the tenth percentile of the calibration data as a threshold. Pixels with probability values below the threshold were classified as absences and pixels with probability values above the threshold were classified as presences.
Considering that the general model predicted suitable environments in two geographically separate regions, two independent models were developed for each of the regions. The first was for the Chiapas and Oaxaca occurrence records, defined as the Mexican population model (Mmx), and the second was for the El Salvador, Guatemala and Honduras records, which was called the Central American population model (Mca). For this, exactly the same steps of the process used for the general model and described previously were followed, varying only the calibration areas (Figure 1).
As part of the interest in the potential areas of the historical presence of the species (general model) and of each of the two populations (Mmx and Mca), transferences were made into the past. Due to the non-existence of edaphic and evapotranspiration (ET0) variables in the past, for this historical analysis, we used only the climatic variables from CHELSA (Karger et al., 2017) (Table 1). Each model's transferences were made to the mid-Holocene (MH ~8,000 yr; Fordham et al., 2017), Last Glacial Maximum (LGM ~21,000 years; Karger et al., 2021) and Last Interglacial (LIG ~130,000 yr; Otto-Bliesner et al., 2006) all variables with 2.5 minute resolution (~5km), this resolution was chosen as it is the finest available for the LGM.
Estimated current loss of area of D. calderonii
Areas with sparse or absent vegetation in the potential area of presence of the species were estimated according to the general model; for this purpose, the Global Forest Cover (GFC) was used at a resolution of 30 meters (Hansen et al., 2013). The GFC was cut using the map from the general model as a mask and then resampled with the bilinear interpolation method at a resolution of 250 meters. The GFC values were recategorized following the classification proposed by Moncada et al. (2022) as detailed in Table 2. Considering the extent and the resolution according to the general model, areas with <40% forest cover within the suitable area were considered sites where the species is unlikely to be found due to the sparse vegetation in these areas.
Table 2. Categories established for the reclassification of the CFG in the regions were predicted to be suitable by the general model.
Category
|
Range %
|
Description.
|
1
|
0 - 20
|
Dry bare soil area
|
2
|
20 - 40
|
Areas with sparse vegetation and bare soil.
|
3
|
40 - 60
|
Areas with sparse vegetation
|
4
|
60 - 100
|
Area with dense vegetation
|
Source: Adapted from (Moncada et al., 2022).
Comparison of niche between Mexican and Central American populations
To provide evidence from an environmental perspective on a potential differentiation between Mexican and Central American populations, we compared their use of environmental space using two different approaches. The first approach was a test of niche similarity (Broennimann et al., 2012; Di Cola et al., 2017) with Shoener's (D) and Hellinger's (H) indices. (Warren et al., 2008). Environmental variation was ordered using principal component analysis (PCA), and then the densities of presences of the two populations and the density of environmental conditions on the first two axes of the PCA were calculated with the kernel density function (Broennimann et al., 2012). The similarity between the niches of the two populations was assessed by assuming that the niche of the Mexican (Nmx) and the niche of the Central American (Nca) populations were more similar than a random model (Warren et al., 2008). For this process, 1000 permutations were made, and the statistical significance of the niche overlap was assessed (Di Cola et al., 2017).
The second approach consisted of creating minimum volume ellipsoids (Farber & Kadmon, 2003) to characterize the niches of the Mexican and the Central American populations. Environmental variables were converted to principal components (PC), and models were constructed based on the centroid and covariance matrix of the variables at a 95% significance level. The two niches were compared, and the null hypothesis that the two ellipsoids fitted to the actual observations overlapped at least as much as the random data type was tested. For this, we used an ellipsoid envelopes test, in which we measured the overlap between the ellipsoids. We performed 1,000 replicates with a 5% confidence limit. The analyses were performed using the routines implemented in the package ellipsenm (Cobos et al., 2022).