Study sites
We chose kurgans as a model system. Kurgans are ancient burial mounds built in large numbers during the Late Copper, Bronze and Iron Ages throughout the steppe biome in Eurasia (Deák et al., 2020). Primarily due to the intensive agricultural practices, approximately 57% of pristine European steppe grasslands in Chernozem soils have been destroyed or abandoned (Török et al., 2016). Due to their morphology and cultural significance, many kurgans escaped cultivation, representing one of the last refuges for populations of steppe and dry grasslands species in intensively used landscapes (Deák et al., 2020). Being separated from each other and other grasslands by croplands, farmlands and roads, they represent a good example of habitats with severed connectivity, which may be challenging to overcome for species with modest dispersal abilities such as S. nemorosa (Novák & Konvička, 2006). Previous studies found kurgans to favour species with large seed mass, tall growth and autonomous reproduction (Deák et al., 2021a), and to have lower plant density of specialist species compared to large steppe enclaves (Dembicz et al., 2016).
The study was located in the Great Hungarian Plain in Eastern Hungary, Europe (47° 19' 32"N − 21° 6' 58"E, Fig. 1a,b). The climate is continental with mean annual temperatures of 10–11 ºC, average precipitations of 500–600 mm and a typical elevation of 90–120 m.a.s.l. (National Meteorological Service of Hungary, 2022). In general, soils are highly fertile chernozems, making them ideal for agriculture (Deák et al., 2021a). We worked on 13 sites (11 kurgans and two reference flat grasslands) where S. nemorosa was present, spanning across 135 km (E - W) and 79 km (N - S) (Fig. 1b). Kurgans were selected to represent different habitat area (basal area ranged between 528–4,321 m2) and connectivity (Hanski connectivity index ranged between 0–338.4). S. nemorosa grew in remnant grassland patches dominated by Festuca rupicola, and other typical grasses were Agropyron cristatus, Stipa capillata, Bromus inermis. Due to lack of management, on kurgans patches of weeds (Carduus spp., Cirsium spp., Rumex spp.), native woody species (Crataegus monogyna, Prunus spinosa), and invasive woody species (Gleditsia triacanthos, Robinia pseudoacacia, Lycium barbarum) were typically present. The reference grasslands were usually mown in June-July.
Study species
Salvia nemorosa L. or Wood sage (Lamiaceae) is a zoogamous, short-lived hemicryptophyte, without active seed dispersal mechanisms and without obvious clonality, typically growing 20–50 cm tall (PADAPT, 2024). It has simple leaves with opposite leaf arrangement and can grow a large number of stems, most of which are usually generative in mature individuals. The inflorescence is a pseudo-spike composed of cymose inflorescences arranged in a verticillaster. The main inflorescence generally develops side inflorescence pairs that can further produce secondary and tertiary side inflorescences, thus extending the flowering time. The species is disturbance tolerant, thermophile, halophobe, heliophile and xero-tolerant, and it is characteristic of the Salvio-Festuceum rupicolae loess grassland association (Borhidi, 1995). It is a European species with a Pontic – Mediterranean - Central European range, where it is normally exposed to severe fluctuations in rainfall distribution and high temperatures (PADAPT, 2024). Due to the severe decline and degradation of dry grasslands, S. nemorosa has often found refuge in small habitat islands such as kurgans and roadside verges over large parts of its distribution area (Deák et al., 2018; Moysiyenko & Sudnik-Wójcikowska, 2010; Valkó et al., 2018).
Data collection
Data collection followed the protocol of Plantpopnet, a spatially distributed model system for population ecology (Buckley et al., 2019). Data was collected at peak flowering of S. nemorosa in June-July, over three consecutive years between 2021–2023. Eight sites were censused each year throughout the length of the study, further four sites were censused only twice as new sites were added to the study between 2022–2023, while one site was only censused once in 2021 due to subsequent census difficulties. We established up to four permanent transects of varying lengths (4–10 m), depending on the area of the habitat patch, where the number of individuals was representative of the habitat (Fig. 1c). On kurgans, S. nemorosa individuals were often limited to only one side of the kurgan (typically on South-Southwestern slopes or on the top), but on three kurgans a larger number of individuals allowed setting up transects on two contrasting slopes. Along each transect, we laid down permanent contiguous plots of 0.5 x 0.5 m, within which we permanently marked all S. nemorosa individuals with a numbered linoleum tag, including new individuals each consecutive year, and if there were no individuals, the plot was marked as empty (Fig. 1d). We recorded site (GPS coordinates, date of visit, management, habitat type), and transect level (cardinal aspect and slope) information. For each plant, we recorded in-situ three vegetative traits: height of the tallest stem, number of stems, length and width of two biggest leaves forming a leaf pair, and two reproductive traits: inflorescence length and number of primary side inflorescence pairs. To calculate mean leaf area, we approximated the measured leaf shapes to a triangle, and we multiplied the leaf length and width, which we averaged across the two measured leaves.
Data selection
In this study, we narrowed our analyses only to mature plants, to capture maximum attainable growth responses that can be related to the reproductive effort of same individuals. We defined mature individuals as plants that had a flowering probability (i.e., the likelihood that a plant develops inflorescence) higher than 50%. For this, we modelled the flowering state (yes or no) as a function of stem height, by fitting a Generalized Linear Mixed-effects Model (GLMM) with binomial error distribution in the lme4 package in R (Bates et al., 2015). Fixed effects were the tallest stem height in interaction with study year, while study site, transect, plot and plant were nested random effects. We tested the model assumptions using the DHARMa package (Hartig, 2022). We then used the generic function predict to obtain the flowering probability values for each individual, and individuals predicted a flowering probability lower than 50% were removed from further analyses. The model structure and results are presented in Supporting material (Figure S1,Table S5).
To ensure data quality, we removed imperfect observations e.g., dry plants or inflorescences, extreme values due to unusual growing circumstances, etc. After this step, the final number of plants involved in the models was 569 mature individuals, ranging between 13–49 (2021), 12–67 (2022) and 19–71 (2023) plants per site and per year.
Landscape structure and microclimate
We used the connectivity index by Hanski et al., (2000) to quantify the level of habitat isolation within a 300 m buffer around the focal habitat patch (a lower value indicating higher isolation):
$$\:{CI}_{i}={\sum\:}_{ij=1}^{n}\text{e}\text{x}\text{p}(-{\propto\:d}_{ij}){A}_{j}^{\beta\:}$$
,
where Aj is the area of a neighbouring grassland within the 300 m buffer area, dij is the distance from the focal patch to the neighbouring grassland, α is a species-specific parameter related to the species’ dispersal ability, and β a parameter describing the scaling of immigration, both set to 0.5 based on the assumption of a relatively low dispersal of 200 m for S. nemorosa. The area of the neighbouring grassland was calculated from digitised present day (2014–2016) habitat maps (source: Unified National Projection System, 1:10,000 topographic map of Hungary (Institute and Museum of Military History, Budapest), Satellite images (Google Maps) and field surveys, as in Deák et al., 2021b).
We calculated the habitat area as the basal area of the kurgan or the area of the grassland patch using the same method as described above (Deák et al., 2021b). To ease calculations due to the disproportionately large area of one of the reference grasslands compared to the area of the kurgans, we unified the Hanski connectivity and habitat area values for the two reference grasslands to the value of the grassland with the smaller area (232,061 m2).
To characterise the microclimate of the different transects we used the heat load index, a direct measure of incident solar radiation on a land surface (Buttrick et al., 2015) that combines aspect (converted into degrees, slope, and latitude values (McCune & Keon, 2002), and which we calculated with a function by (Zelený & Lin, (2019) in R, using the first equation:
$$\:Heat\:load=\:{e}^{(-1.467+1.582*cos\left(L\right)*cos\left(S\right)-1.500*cos\left(A\right)*sin\left(S\right)*sin\left(L\right)-0.262*sin\left(L\right)*sin\left(S\right)+0.607*sin\left(A\right)*sin\left(S\right))}$$
,
where L is the site latitude, A is the aspect where the S. nemorosa populations where found, and S is the slope angle of the transect.
Data analysis
We fitted Generalized Mixed Effects models (GLMM) using the glmmTMB package in R (Brooks et al., 2017; R Core Team, 2021) to test the effect of isolation, habitat area, heat load and study year on five measured traits of S. nemorosa (tallest stem height, number of stems, mean leaf area, main inflorescence length, and number of primary side inflorescence pairs). The vegetative trait models had the Hanski index, habitat area, heat load and study year specified as fixed effects, while in the reproductive trait models the date of visit was an additional fixed effect to cover the variance due to the continuous growth of the open inflorescence. Study site, transect, plot, and plant were nested random effects in all models, to avoid pseudo replication due to repeated measurements. We fitted models of vegetative traits and inflorescence length with a Gaussian error distribution, while the number of primary side inflorescences model was fitted with a truncated Poisson error distribution with log link. Due to dealing with only two reference grasslands which had very large area compared to the kurgans, we fitted all models on two separate datasets: a) kurgans-only, b) kurgans and reference grasslands, referred to hereafter as “all sites”. This way we kept the structure of the models relatively simple and avoided model singularity due to the low number of reference grasslands. In the main text we highlight the significant relationships obtained using both datasets, and we present the model structure and results for the kurgans-only dataset in Table 2, and for the “all sites” dataset in Table S2.
Table 2
Details of linear mixed-effects models (LMMs) testing the effect of landscape structure, microclimate and study year on five Salvia nemorosa traits using data collected on 11 kurgans. The first two columns show the response variable and the model structure, and the explanatory variables respectively. The next columns show the coefficient means (β) and standard errors SE (β), the p value determined by anova tests, the conditional (R2c) and marginal (R2m) R squared values of the model and the p values corresponding to the Kolmogorov-Smirnov test (K.S.) of model fit. Significant effects are shown in bold letters.
Response variable and model structure | Explanatory variables | Estimate (β) | SE (β) | p | R2 | p (K.S. test) |
Stem height glmmTMB(log(stem_hei_noinf_allplants) ~ hload_sc + Hanski_sc + log_habitat_area_sc + c_year + (1 | site_id/trans_id/plot_id/plant_id), data = stemhe_noNA2, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), family = gaussian(link= "identity")) | (Intercept) | 5.952 | 0.039 | < 0.001 | R2c = 0.751 R2m = 0.578 | 0.165 |
Heat load | -0.175 | 0.027 | < 0.001 |
Hanski index | 0.030 | 0.044 | 0.508 |
Kurgan area | < 0.001 | 0.043 | 0.994 |
year 2022 | -0.364 | 0.019 | < 0.001 |
year 2023 | 0.093 | 0.018 |
Mean leaf area glmmTMB(sqrt(mean_leafarea) ~ hload_sc + Hanski_sc + log_habitat_area_sc + c_year + (1 | site_id/trans_id/plot_id/plant_id), data = leafarea_noNA2, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), family = gaussian(link= "identity")) | (Intercept) | 7.646 | 0.059 | < 0.001 | R2c = 0.459 R2m = 0.216 | 0.128 |
Heat load | -0.165 | 0.045 | 0.002 |
Hanski index | 0.096 | 0.068 | 0.169 |
Kurgan area | 0.001 | 0.058 | 0.996 |
year 2022 | -0.260 | 0.044 | < 0.001 |
year 2023 | 0.207 | 0.041 |
Number of stems glmmTMB(log(num_stems) ~ hload_sc + Hanski_sc + log_habitat_area_sc + c_year + (1 | site_id/trans_id/plot_id/plant_id), data = numstems_noNA2, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), family = gaussian(link= "identity")) | (Intercept) | 2.171 | 0.045 | < 0.001 | R2c = 0.851 R2m = 0.093 | 0.218 |
Heat load | -0.176 | 0.043 | < 0.001 |
Hanski index | 0.196 | 0.048 | < 0.001 |
Kurgan area | 0.008 | 0.041 | 0.883 |
year 2022 | -0.190 | 0.029 | < 0.001 |
year 2023 | -0.395 | 0.029 |
Inflorescence length glmmTMB(inf_len ~ hload_sc + Hanski_sc + habitat_area_sc + date_number_sc + c_year + (1 | site_id/trans_id/plot_id/plant_id), data = alldatjoin4a_large, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), family = gaussian(link= "identity")) | (Intercept) | 132.526 | 7.691 | < 0.001 | R2c = 0.459 R2m = 0.214 | 0.252 |
Heat load | -1.830 | 4.567 | 0.783 |
Hanski index | -3.713 | 7.663 | 0.496 |
Kurgan area | -1.864 | 6.945 | 0.746 |
Date of visit | -15.158 | 6.229 | 0.207 |
year 2022 | -3.124 | 2.668 | < 0.001 |
year 2023 | -62.959 | 5.519 |
Number of primary side inflorescence pairs glmmTMB(Prim_side_inf_pairs ~ hload_sc + Hanski_sc + log_habitat_area_sc + date_number_sc + c_year + (1 | site_id/trans_id/plot_id/plant_id), data = alldatjoin5a_large, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), ziformula = ~ 1, family = truncated_compois(link = "log")) | (Intercept) | -0.037 | 0.221 | < 0.001 | R2c = 0.366 R2m = 0.147 | 0.965 |
Heat load | -0.273 | 0.129 | 0.017 |
Hanski index | 0.145 | 0.169 | 0.302 |
Kurgan area | -0.005 | 0.159 | 0.974 |
Date of visit | 0.309 | 0.098 | < 0.001 |
year 2022 | -0.030 | 0.176 | 0.009 |
year 2023 | 0.478 | 0.196 |
Because habitat area and connectivity were correlated (Figure S2), we refitted the models with either habitat area or Hanski index excluded from both datasets (Table S3, Table S4). This approach allowed us to identify uncertain area or connectivity effects due to the presence of both terms in the main models.
Finally, to investigate whether microclimate effects were modulated by annual weather fluctuations, we refitted the models to test the effect of the interaction between heat load and study year on the measured traits. Because of increased model complexity causing non-convergence warnings, we omitted the Hanski index, habitat area and date of visit from this set of models, while keeping the rest of the model structure unchanged (except for the number of primary side inflorescences model which this time was fitted with a Poisson error distribution with log link). For this analysis we used the kurgans-only dataset, due to uniform habitat conditions producing more robust models (Table 3).
Table 3
Details of linear mixed-effects models (LMMs) testing the interaction between heat load (microclimate) and study year (weather proxy) on five Salvia nemorosa traits using data collected on 11 kurgans. The first two columns show the response variable and the model structure, and the explanatory variables respectively. The next columns show the coefficient means (β) and standard errors SE (β), the p value determined by anova tests, the conditional (R2c) and marginal (R2m) R squared values of the model and the p values corresponding to the Kolmogorov-Smirnov test (K.S.) of model fit. Significant effects are shown in bold letters.
Response variable and model structure | Explanatory variables | Estimate (β) | SE (β) | p | R2 | p (K.S. test) |
Stem height glmmTMB(log(stem_hei_noinf_allplants) ~ hload_sc * c_year + (1 | site_id/trans_id/plot_id/plant_id), data = stemhe_noNA2, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), family = gaussian(link= "identity")) | (Intercept) | 5.926 | 0.041 | < 0.001 | R2c = 0.765 R2m = 0.575 | 0.221 |
Heat load | -0.224 | 0.031 | < 0.001 |
year 2022 | -0.341 | 0.020 | < 0.001 |
year 2023 | 0.113 | 0.019 |
Heat load:year 2022 | 0.049 | 0.018 | < 0.001 |
Heat load:year 2023 | 0.075 | 0.019 |
Mean leaf area glmmTMB(sqrt(mean_leafarea) ~ hload_sc * c_year + (1 | site_id/trans_id/plot_id/plant_id), data = leafarea_noNA2, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), family = gaussian(link= "identity")) | (Intercept) | 47.443 | 1.593 | < 0.001 | R2c = 0.500 R2m = 0.211 | 0.156 |
Heat load | -2.939 | 1.429 | < 0.001 |
year 2022 | -5.799 | 0.993 | < 0.001 |
year 2023 | 4.118 | 0.945 |
Heat load:year 2022 | -1.963 | 0.928 | < 0.001 |
Heat load:year 2023 | 0.040 | 0.954 |
Number of stems glmmTMB(num_stems ~ hload_sc * c_year + (1 | site_id/trans_id/plot_id/plant_id), data = numstems_noNA2, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), family = poisson(link = "log"))^^ | (Intercept) | 2.105 | 0.071 | < 0.001 | R2c = 0.854 R2m = 0.066 | 0.081 |
Heat load | -0.241 | 0.068 | 0.0118 |
year 2022 | -0.158 | 0.030 | < 0.001 |
year 2023 | -0.380 | 0.030 |
Heat load:year 2022 | 0.031 | 0.028 | < 0.001 |
Heat load:year 2023 | 0.261 | 0.031 |
Inflorescence length glmmTMB(inf_len ~ hload_sc * c_year + (1 | site_id/trans_id/plot_id/plant_id), data = alldatjoin4a_large, REML = FALSE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), family = gaussian(link= "identity"))^^ | (Intercept) | 129.660 | 7.185 | < 0.001 | R2c = 0.478 R2m = 0.234 | 0.077 |
Heat load | 6.210 | 5.802 | 0.021 |
year 2022 | -61.188 | 5.545 | < 0.001 |
year 2023 | -10.906 | 5.204 |
Heat load:year 2022 | -16.508 | 5.454 | < 0.001 |
Heat load:year 2023 | 1.803 | 5.621 |
Number of primary side inflorescence pairs glmmTMB(Prim_side_inf_pairs ~ hload_sc * c_year + (1 | site_id/trans_id/plot_id/plant_id), data = alldatjoin5a_large, REML = TRUE, control = glmmTMBControl(optCtrl = list(iter.max = 1e3,eval.max = 1e3)), ziformula = ~ 1, family = poisson(link = log)) | (Intercept) | 0.181 | 0.175 | 0.302 | R2c = 0.533 R2m = 0.309 | 0.314 |
Heat load | -0.447 | 0.147 | 0.003 |
year 2022 | -0.984 | 0.163 | < 0.001 |
year 2023 | -0.392 | 0.134 |
Heat load:year 2022 | -0.191 | 0.135 | < 0.001 |
Heat load:year 2023 | 0.436 | 0.135 |
We tested if residuals followed the expected distribution using the Kolmogorov-Smirnov test, and the diagnostics of overdispersion and zero inflation using the DHARMa package in R (Hartig, 2022). We explored the model goodness-of-fit by calculating Nakagawa’s conditional and marginal R2 values, using the performance::r2 command in the performance package in R (Lüdecke et al., 2021). We examined whether the effect of explanatory variables was significant by comparing the full models encompassing all variables with reduced models from which the variable of interest was removed, using ANOVA tests. To make the effect size of variables comparable within each set of models, all continuous predictor variables were centred on 0 and scaled to have unit variance. In models fitted with Gaussian error distribution, the response variables were log- or square root transformed if the transformation improved the normality of residuals.