Study area
The Irano-Turanian (IT) region is one of the most important phytogeographic zones, covering a large swath of southwest and central Asia. IT is divided into four sub-regions (IT1, IT2, IT3, and IT4) (Fig. 1), with IT2 serving as the main center of speciation and endemism, with low annual precipitation (low in winter and high in summer, in some areas), low winter temperatures, and a high continentality index39,40. Nepeta glomerulosa,as a species of the IT region, is distributed in the Zagros and Alborz Mountain ranges, Khorassan-Kopet Dagh Floristic Province in Iran and the Paropamisus Mountains in Afghanistan (IT2). This species has also been seen as transitional between two regions the IT and the Sahara–Sindian in the southern and southwestern parts of the country (Fig.1)26.
Occurrence data
We obtained occurrence data from the following herbaria: FUMH, HUI, SFAHAN, HSHU, MIR, Natural Resources of Khuzestan, Animal & Natural Resources Research Center of Hormozgan, and Natural Resources of Kohgiluyeh and Boyer–Ahmad (Table S1). We also used Global Biodiversity Information Facility (GBIF; http://www.gbif.org/).
Occurrence data were reviewed and filtered in two steps. First, sites where the species was expected to be present were checked, and locations outside of the known range for the species were evaluated; we also removed records for which coordinates had high associated uncertainty41. Second, to reduce autocorrelation in the occurrence data, one pair of records less than ~1 km apart was removed using the spThin package42 in R (version4.1.1 ).
Climate data
For current and potential future situations, a set of bioclimatic variables was established. The 19 bioclimatic variables for current conditions were acquired from WorldClim version 1.4 (http://www.worldclim.org), at a spatial resolution of 30" (~1 km). Four of the layers (bio 8, bio 9, bio 18, and bio 19) were removed because they include known spatial artifacts12,43,44. To select the most important environmental factors across the calibration area, we used Pearson correlation coefficients45 in R; from each pair of variables with a correlation ≥0.8 we removed one.
Bioclimatic variables under future climate scenarios were downloaded from the CCAFS website (http://www.ccafs–climate.org)46 at a spatial resolution of 30". Nine general circulation models (GCMs) were selected under two RCP 4.5 and RCP 8.5 scenarios, including (1) BCC-CSM-1, (2) CCCMA-CANESM2, (3) CSIRO-ACCESS1, (4) GISS-E2–R, (5) IPSL-CM5A-IR, (6) LASG-FGOALS-G2, (7) MIROC-MIROC5, (8) MOHC-HADGEM2-CC, and (9) NCC-NORESM1-M. This variety of GCMs was used to illuminate the uncertainty in predictions of the distribution potential of the species in the future45. Finally, for assessing the model’s performance, records split ten times (cross-validation) for calibrating the model and estimating the precision of model predictive47.
Ecological niche modeling
Model selection was performed using the ENMeval package38,48 in R.A set of 22 regularization parameter values (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1; 1.25, 1.5, 1.75, 2; 3, 4, 5, 6, 7, 8, 9, and 10) and 32 combinations of model response types (linear, quadratic, product, threshold, and hinge responses) were explored. Out of the 704 candidate models, the best model was chosen based onthe lowest value of the Akaike information criterion (AICc) to identify the model of suitable area and conditions consistent with the data48.
The final model was created using maximum entropy algorithm (Maxent) (version 3.3), with 10 cross-validation replicates among occurrence data; it was transferred across southwestern and central Asia under the current conditions. We also transferred the final model to future conditions allowing extrapolation and clamping. Median values across the replicate versions of the final model were used to estimate suitability across the region for current conditions. The median of all of the medians for the 9 GCMs was calculated to interpret the future potential geographic distribution of N. glomerulosa under different climate change scenarios (i.e., the two RCPs). Based on the range of values across the 10 cross-validation replicates, we estimated uncertainty for current and future conditions. Suitability scores were converted to binary using maxSSS (maximizing the sum of sensitivity and specificity), one of the best threshold selection methods for presence and absence data and also large samples46.