Herbarium specimens reveal a cryptic invasion of tetraploid Centaurea stoebe in Europe

doi:10.21203/rs.3.rs-4389565/v1

This work is licensed under a CC BY 4.0 License

Version 1

Numerous plant species are expanding their native ranges due to anthropogenic environmental change. Because cytotypes of polyploid complexes often show similar morphologies, there may be unnoticed range expansions (i.e., cryptic invasions) of one cytotype into regions where only the other cytotype is native. We critically revised 13,078 herbarium specimens of diploid and tetraploid Centaurea stoebe, collected across Europe between 1790 and 2023. Based on their distribution in relictual habitats, we suggest that diploids are native across their entire European range, whereas tetraploids are native only to South-Eastern Europe and have recently expanded their range toward Central Europe. The proportion of tetraploids increased exponentially over time in their expanded but not in their native range. This cryptic invasion took place predominantly in ruderal habitats and enlarged the climatic niche of tetraploids toward a more oceanic climate. Our differentiation between native and expanded ranges conflicts with dozens of previous studies on C. stoebe. Thus, herbarium specimens can prevent erroneous assumptions about the native ranges of species, which has fundamental implications for designing research studies and assessing biodiversity trends. Moreover, we demonstrate the value of spatio-temporally explicit data in formulating and testing hypotheses regarding the superior colonization abilities of polyploids in ruderal habitats.

Biological sciences/Ecology/Invasive species

Biological sciences/Ecology/Biogeography

Biological sciences/Plant sciences/Plant ecology

Many successful invaders spread not only through “classical” transcontinental invasions but may simultaneously expand their native range within continents^1–3. In Europe, for example, the number of naturalized, non-native plants with intracontinental origin exceeds that of plants with transcontinental origin⁴. However, our understanding of such gradual range expansions is limited, because the precise borders of the “true” native ranges of species are usually unknown or speculative^3,4. This lack of information may be particularly common in polyploid complexes where closely related diploid and polyploid cytotypes often show similar morphologies. Consequently, range expansions of a cytotype can go unnoticed in areas where the other cytotype is already present^5–8. These expansions can be regarded as “cryptic invasions” (sensu Novak⁹), a phenomenon believed to be much more widespread and ecologically significant than currently recognized^7,8,10.

Herbarium collections provide invaluable resources for tracking spatio-temporal dynamics of species occurrences^11,12. Moreover, herbarium labels often contain information on the habitat type in which the specimen was collected. This is important because during range expansions, species are much more likely to be found in ruderal than natural habitats^2,13–15. Thus, studying spatio-temporal shifts in the proportion of polyploid vs. diploid cytotypes, while differentiating between ruderal and natural habitats, may help uncover cryptic invasions in polyploid complexes. However, we are not aware of research that has addressed this issue so far.

Herbarium records can also be utilized for reconstructing temporal peaks of range expansions¹¹ and assessing how climatic niche breadths change across space and time¹⁴. Moreover, integrating spatio-temporally explicit data on climatic niches and human activities may identify key determinants of range expansions^15,16. The emerging spatio-temporally explicit databases on human settlements¹⁷ and transport systems¹⁸ offer hitherto unexplored tools to disentangle the relative roles of climate, space, dispersal corridors and urbanization in range expansions. Regarding polyploid vs. diploid distribution patterns, such integrative approaches could significantly contribute to understanding the circumstances under which cytotype shifts have occurred in the past and may continue to do so in the future.

Here we study the cryptic invasion of tetraploid Centaurea stoebe L. (spotted knapweed; Asteraceae). Diploids are native to large parts of Europe, including Central Europe. In contrast, tetraploids are thought to be native only to South-Eastern Europe, from where they might have recently expanded their range toward Central Europe^5,19,20. Tetraploid C. stoebe is among the most successful plant invaders of North America, whereas diploids have never been recorded outside their native range²¹. This distribution pattern makes C. stoebe a prominent model to study the conundrum that within polyploid complexes, polyploids are on average more likely to become invasive than their diploid counterparts²². However, while the North American invasion history of tetraploid C. stoebe is well-documented¹⁴, their cryptic invasion across Central Europe has largely remained speculative. The need to address this knowledge gap became evident to us when reviewing 52 recent papers that used “native” tetraploid populations from Europe to compare them with non-native tetraploid populations from North America (Supplementary Table 1). Intriguingly, all the reviewed studies involved Central European tetraploid populations (which might not be native) and treated them as native in their comparisons.

To unravel the spatio-temporal dimensions of the cryptic invasion of tetraploids, we critically revised 13,078 C. stoebe herbarium specimens from 167 herbaria (Supplementary Table 2 and Fig. 1). Moreover, we added 668 cytogeographical records (i.e., data from chromosome counts or flow cytometry), including 424 published records from 37 publications and 244 new records, collected for the present study (Supplementary Tables 3 and 4, respectively). After data cleaning and filtering (see Methods), our total dataset included 5,821 occurrences, recorded between 1790 and 2023. Our study was guided by two principal objectives:

(1) Fostering our understanding of how to distinguish between native and expanded species ranges: We examined the geographical ranges of diploid and tetraploid C. stoebe and employed a framework predominantly focusing on the distribution in relictual habitats to delineate the native and expanded ranges of tetraploids. As a proof of concept, we investigated how the proportion of tetraploids, relative to diploids, changed over time in the native and expanded ranges. More specifically, we asked whether these temporal dynamics have been similarly or differently pronounced in ruderal and natural habitats in either range. We then evaluated how strongly our native range assessment contrasts with the sampling designs of recent research on tetraploid C. stoebe.

(2) Facilitating our understanding of the characteristics and determinants of the spread of tetraploids: We investigated how the geographical range size and the climatic niche breadth of tetraploids evolved over time in ruderal and natural habitats of their expanded range. Finally, we examined the relative importance of climatic conditions versus anthropogenic factors, such as dispersal corridors and urbanization, in both the initial spread and the current occurrence of tetraploids.

The first step of our analyses was to validate our morphology-based cytotype determination. This was a critical step because diploid and tetraploid plants are morphologically similar, while the taxonomic acceptance of two cytotypes is rather recent²¹. Consequently, only 9.6% of the specimens were originally determined to the cytotype level on their specimen labels, and even from those few specimens, 18.2% were misidentified. We applied two molecular approaches to validate our cytotype determination: 1) We morphologically determined the cytotype of 463 individuals grown in a common garden and subsequently verified our determination using flow cytometry (morphological determination accuracy: 98.3%). 2) We genotyped the ITS1 locus of 181 herbarium specimens after morphological determination of their cytotypes (morphological determination accuracy: 97.2%). Overall, our complementary validation approaches demonstrated that the morphology-based determination was very reliable and consistent across the investigated spatial and temporal ranges of our herbarium collections. Details on the cytotype determination and its validation are provided in the Supplementary Note 1.

Tetraploids underwent a cryptic invasion into the range of diploids

Our estimation of the native ranges of both cytotypes was predominantly based on their distribution in habitats that were historically available (e.g., natural steppes and relictual habitats) and was supported by phylogeographic data and a review of recent floristic reports (Methods). The estimated native range of tetraploids encompasses large parts of the mountainous regions in the Balkan countries and Romania. In addition, it includes natural steppes and forest steppes stretching from Romania to Western Russia, where it is delineated by the Don River (Fig. 1). Tetraploids expanded their native range toward Central, Western and Northern Europe. This cryptic invasion resulted in a wide overlap in the distributions of both cytotypes. Diploids are native throughout the entire study range. While diploids show a rather continuous distribution across the study range, tetraploids occur in a spatially scattered pattern, particularly in their expanded range (Fig. 1).

Temporal patterns in the proportion of tetraploids confirmed our distinction between native and expanded ranges

Overall, we recorded more diploid than tetraploid specimens across our study range (3,954 diploid and 1,537 tetraploid records, not accounting for when the specimens were collected). To predict spatio-temporal dynamics in the proportion of tetraploid relative to all C. stoebe records, we fitted generalized additive logistic models (GAMs) using individual, binomial occurrence data (tetraploid = “1” vs. diploid = “0”). This binomial response variable allowed modeling the probability that a C. stoebe population within a given spatial, temporal and environmental context is tetraploid.

We first compared the temporal patterns of the proportion of tetraploids between the native and expanded ranges and found them to differ significantly from one another (ΔAIC = -38.1, Fig. 2A, Supplementary Table 5). In the native range, the proportion of tetraploids did not change significantly over time (χ² = 1.5, P = 0.69) but stayed at approximately 50%. By contrast, in the expanded range, the proportion of tetraploids increased significantly over time (χ² = 205.6, P < 0.001), rising from 0% of tetraploids in the 1850s to more than 50% of tetraploids presently. The contrasting dynamics in both ranges strongly support our distinction between the native and expanded ranges of tetraploids. The cryptic invasion of tetraploids was characterized by three main periods: an initial, modest increase in the proportion of tetraploids from the 1850s to the 1920s, followed by stagnation until the 1950s when a second, exponential increase started, persisting until the present (Fig. 2A). The latter spread coincides with the onset of the Anthropocene, which is an expected temporal pattern for range-expanding plants².
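The logic of this model comparison can be illustrated with a deliberately simplified sketch. It replaces the smooth temporal term of the GAMs with a plain linear year effect in a logistic regression, fits it by Newton-Raphson, and compares one pooled model against two range-specific models via AIC. The function names, the toy records and the sign convention (ΔAIC = AIC of the range-specific models minus AIC of the pooled model) are assumptions made for illustration; they are not the models or data of this study.

```python
import math

def fit_logistic(years, is_tetraploid, n_iter=25):
    """Fit P(tetraploid) = 1 / (1 + exp(-(a + b * t))) by Newton-Raphson.

    t is the collection year, centred and expressed in decades.
    Returns (a, b, log_likelihood).
    """
    mean_year = sum(years) / len(years)
    t = [(y - mean_year) / 10.0 for y in years]
    a = b = 0.0
    for _ in range(n_iter):
        # Score vector (g) and negative Hessian (h) of the log-likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for ti, yi in zip(t, is_tetraploid):
            p = 1.0 / (1.0 + math.exp(-(a + b * ti)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * ti
            h00 += w
            h01 += w * ti
            h11 += w * ti * ti
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    loglik = 0.0
    for ti, yi in zip(t, is_tetraploid):
        p = 1.0 / (1.0 + math.exp(-(a + b * ti)))
        loglik += math.log(p) if yi == 1 else math.log(1.0 - p)
    return a, b, loglik

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * loglik

def fit(records):
    years, cytotypes = zip(*records)
    return fit_logistic(list(years), list(cytotypes))

# Toy records (year, 1 = tetraploid / 0 = diploid); purely hypothetical.
decades = list(range(1850, 2030, 10))
native = [(year, cyt) for year in decades for cyt in (0, 1)]  # constant 50%
expanded = []
for year in decades:
    n_tetra = 0 if year < 1900 else 1 if year < 1960 else 2 if year < 2010 else 3
    expanded += [(year, 1)] * n_tetra + [(year, 0)] * (4 - n_tetra)

_, slope_native, ll_native = fit(native)
_, slope_expanded, ll_expanded = fit(expanded)
_, _, ll_pooled = fit(native + expanded)

# Range-specific models use 4 parameters in total, the pooled model 2.
delta_aic = aic(ll_native + ll_expanded, 4) - aic(ll_pooled, 2)
print(f"slope native: {slope_native:.3f}, slope expanded: {slope_expanded:.3f}")
print(f"delta AIC (range-specific minus pooled): {delta_aic:.1f}")
```

Under this convention, a negative ΔAIC would favour separate temporal trends for the two ranges. A real analysis would use smooth terms (as in the GAMs described above) rather than a linear year effect.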

Given that range expansions primarily occur in ruderal habitats^13,15, we then compared the temporal patterns in the proportion of tetraploids between ruderal and natural habitats, which we did separately for the native and expanded ranges. In the native range, we found no differences between the habitat types (ΔAIC = +1.8, Supplementary Table 5): The proportion of tetraploids did not change significantly over time in either ruderal (χ² = 1.8, P = 0.176) or natural habitats (χ² = 4.4, P = 0.257), but stayed at approximately 50% in both habitat types (Supplementary Fig. 5). In the expanded range, however, the temporal dynamics differed significantly between the habitat types (ΔAIC = -8.1, Supplementary Table 5): In natural habitats, the proportion of tetraploids increased only slightly, albeit significantly (χ² = 24.1, P < 0.001), rising from 5% of tetraploids in the 1850s to approximately 10% of tetraploids presently. In ruderal habitats, the proportion of tetraploids increased much more strongly over time (χ² = 126.8, P < 0.001), rising from 0% of tetraploids in the 1850s to over 75% of tetraploids currently (Fig. 2B). This result shows that the cryptic invasion of tetraploids proceeded primarily through the colonization of ruderal habitats, again confirming our concept of a recent range expansion^13,15.

Much previous research on tetraploid C. stoebe was based on unsuitable sampling

Our native range estimation conflicts with the sampling of 52 studies, which together have > 5,000 citations (Supplementary Table 1). These studies compared “native” tetraploids from Europe with non-native tetraploids from North America. On average, 77.2% (SD = ± 20.6%) of the tetraploid populations that these studies considered as native would not, according to our definition, fall within the native range of tetraploids. In other words, the vast majority of populations that served as a native reference to investigate whether range expansions have led to rapid post-introduction evolution in North America were collected from European regions that we believe not to be part of their native range either. Twenty-two of the reviewed studies additionally compared European tetraploid with European diploid populations. These comparisons aimed to identify pre-adaptive differences between the cytotypes that made native tetraploids more likely to become invasive in North America than native diploids. Again, most “native” tetraploid populations involved in these comparisons do not match our native range estimation (mean = 70.7%, SD = ± 17.4%).

Tetraploids more than doubled their range size

Explorative maps spanning 50-year intervals showed that there has been no apparent spatial expansion of diploids over time. In contrast, tetraploids were initially limited to their native range and adjacent regions. Subsequently, tetraploids enlarged their range in the course of their cryptic invasion toward Central Europe (Supplementary Fig. 7). To assess the spatial dimensions of this cryptic invasion, we quantified the range size dynamics of tetraploids in their expanded range. To do so, we plotted so-called invasion curves¹¹, illustrating the cumulative number of colonized 10 km × 10 km pixels over time. Compared to their native range size, the cryptic invasion more than doubled the range size of tetraploids (+ 137%). We plotted the invasion curves separately for ruderal and natural habitats and found that, in their expanded range, tetraploids were much more widespread in ruderal than in natural habitats (+ 193%, Fig. 3A). We similarly recorded these habitat-specific invasion curves for tetraploids in their native range, and also for diploids in the expanded range of tetraploids. These two scenarios illustrate range size patterns across the habitat types in native ranges and were diametrically opposed to the cryptic invasion of tetraploids (Supplementary Fig. 8): Within the expanded range of tetraploids, diploids were less frequent in ruderal than in natural habitats (-45%), and similarly, tetraploids in their native range were less frequent in ruderal than in natural habitats (-69%). These comparisons indicate that the habitat-specific patterns in range size development of tetraploids in their expanded range (Fig. 3A) are indicative of a recent colonization rather than merely reflecting habitat availability in the specific geographical range (see comparison with diploids) or solely reflecting an effect of the ecological niche of tetraploids (see comparison with native tetraploids).
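An invasion curve of this kind can be sketched in a few lines. The example below assumes that each record carries a collection year and projected coordinates in kilometres; the function name, the record layout and the toy coordinates are hypothetical and serve only to show how the cumulative number of occupied 10 km × 10 km pixels is counted.

```python
import math

def invasion_curve(records, cell_km=10.0):
    """Cumulative number of occupied grid cells over time.

    records: iterable of (year, x_km, y_km) in a projected coordinate system.
    Returns a list of (year, cumulative_cells), one entry per year in which
    at least one new cell was colonized.
    """
    first_year = {}
    for year, x_km, y_km in records:
        cell = (math.floor(x_km / cell_km), math.floor(y_km / cell_km))
        # Keep only the earliest record per cell
        if cell not in first_year or year < first_year[cell]:
            first_year[cell] = year

    curve = []
    total = 0
    for year in sorted(set(first_year.values())):
        total += sum(1 for y in first_year.values() if y == year)
        curve.append((year, total))
    return curve

# Hypothetical toy records: two fall into the same cell, one cell is revisited
toy_records = [
    (1900, 1.0, 1.0),
    (1905, 3.0, 4.0),    # same cell as the 1900 record
    (1950, 12.0, 1.0),
    (1950, 25.0, 25.0),
    (2000, 12.0, 5.0),   # cell already colonized in 1950
]
print(invasion_curve(toy_records))
```

Running the function separately on ruderal and natural records would give habitat-specific curves comparable in spirit to those described above.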

Tetraploids increased their climatic niche in ruderal habitats

To quantify the realized climatic niches of the cytotypes, we performed a principal component analysis (PCA) on 19 bioclimatic variables (Supplementary Fig. 9). The first axis was negatively correlated with several precipitation variables, thus representing a gradient from high to low precipitation. The second axis correlated positively with several temperature variables and negatively with seasonality in temperature, thus representing a temperature gradient from continental toward warmer climate. Overall, the realized climatic niches were comparable between diploids and tetraploids in our study range. A dynamic range box approach showed that only 6.5% of their niche spaces did not overlap. However, within tetraploids, there was a niche shift (25.3% niche non-overlap) from a more continental climate in the native range toward a more oceanic climate in the expanded range (Supplementary Fig. 9).

The site scores of the PCA were used to construct niche-over-time plots¹⁴, distinguishing between natural and ruderal habitats. For the cryptic invasion of tetraploids, we found that the temporal patterns of the realized climatic niches did not mirror the patterns of range size expansion because the climatic niche of tetraploids in their expanded range was largely filled by 1900, when less than 5% of the range size was occupied (Fig. 3). However, the niche space was occupied much faster in ruderal than in natural habitats. Presently, ruderal populations still show a substantially broader niche than natural populations (+ 53%), especially toward wetter climates (+ 125%). As we did for the range size dynamics, we explored whether these habitat-specific niche dynamics were unique to the cryptic invasion of tetraploids by comparing them with the niche dynamics of diploids in the same range and the niche dynamics of tetraploids in their native range. In both native range scenarios, there was no credible evidence for niche differences between the two habitat types (Supplementary Fig. 10).
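A niche-over-time curve can be sketched along a single PCA axis as the cumulative range of site scores recorded up to each year. Measuring niche breadth as this simple range is an assumption made for illustration; the breadth metric actually used for the plots is not specified in this section, and the function name and toy scores are hypothetical.

```python
def niche_over_time(records):
    """Cumulative niche breadth along one ordination axis.

    records: iterable of (year, pc_score).
    Returns a list of (year, breadth), where breadth is the difference between
    the largest and smallest score recorded up to and including that year.
    """
    curve = []
    lo = hi = None
    for year, score in sorted(records):
        lo = score if lo is None else min(lo, score)
        hi = score if hi is None else max(hi, score)
        if curve and curve[-1][0] == year:
            curve[-1] = (year, hi - lo)   # update the entry of the same year
        else:
            curve.append((year, hi - lo))
    return curve

# Hypothetical toy scores: most of the breadth is reached early
toy_scores = [(1900, 0.0), (1900, 1.0), (1950, -0.5), (2000, 0.5)]
print(niche_over_time(toy_scores))
```

In this toy example the breadth stops growing after 1950 although records keep accumulating, which mirrors the kind of decoupling between niche filling and range filling described above.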

The initial spread of tetraploids was mainly determined by the spatial distance to the native range

To reconstruct the initial spread routes of the cryptic invasion of tetraploids, we predicted most parsimonious dispersal routes using a minimum cost arborescence algorithm. This algorithm predicted many separate dispersal routes (Fig. 4). The oldest routes went through western Romania (1859), Hungary (1863), Slovakia (1874) and Germany (1876). A multivariate environmental similarity surface (MESS) analysis revealed no discernible relationship between the predicted spread routes and climatic dissimilarity. For example, many of the climatically most suitable regions are still not colonized by tetraploids (e.g., yellow-beige land north of the Ukrainian border in Fig. 4).
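The idea behind such a reconstruction can be sketched under strong simplifying assumptions. If dispersal is only allowed from older to younger records, the candidate graph contains no cycles, so choosing the cheapest incoming edge for every record already yields a minimum-cost arborescence (the general case requires the Chu-Liu/Edmonds algorithm). The sketch assumes that cost equals straight-line distance in projected kilometres; the edge set and cost function actually used in the study are not given in this section, and all names and toy coordinates are hypothetical.

```python
import math

def dispersal_tree(records):
    """Assign each record its most parsimonious source record.

    records: list of (site_id, first_year, x_km, y_km).
    Edges run only from strictly older to younger records; the cost of an
    edge is the Euclidean distance between the two sites.
    Returns {site_id: parent_site_id or None}; None marks a root.
    """
    parents = {}
    for site, year, x, y in records:
        best_parent, best_cost = None, math.inf
        for other, other_year, ox, oy in records:
            if other_year < year:
                cost = math.hypot(x - ox, y - oy)
                if cost < best_cost:
                    best_parent, best_cost = other, cost
        parents[site] = best_parent
    return parents

# Hypothetical toy sites
toy_sites = [
    ("A", 1860, 0.0, 0.0),
    ("B", 1865, 100.0, 0.0),
    ("C", 1875, 200.0, 0.0),
    ("D", 1880, 90.0, 50.0),
]
print(dispersal_tree(toy_sites))
```

Records that share the earliest year have no older candidate source and therefore become separate roots, which is one way in which several independent routes can arise.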

We then performed a boosted regression tree (BRT) analysis to evaluate the best predictors of the initial spread of tetraploids. As a response variable, we used the residence time, i.e., the time elapsed since tetraploids colonized distinct 10 km × 10 km pixels of the expanded range. As predictors, we used twelve spatio-temporally explicit variables, with three of them belonging to each of four categories (spatial, dispersal corridor, climatic and urbanization data). Among these categories, spatial data explained most variation (Supplementary Fig. 5A). More specifically, the best predictors of the initial spread were the distance to the native range and latitude, both being negatively correlated with the residence time (Supplementary Fig. 11). In other words, the farther a locality is from the native range, the later it was colonized by tetraploids.

The current occurrence of tetraploids is mainly associated with urbanization and road-assisted dispersal

To focus on the current spread of tetraploids, we first plotted the proportion of tetraploids in the expanded range over time since 1945, i.e., the onset of the Anthropocene era²³. The proportion of tetraploids increased in the late 1950s (Supplementary Fig. 12). This increase flattened in the 1980s. Since 1989, coinciding with the fall of the Iron Curtain and the resulting increased connectivity between the native and expanded ranges²⁴, there was another, exponential increase in the proportion of tetraploids, persisting until the present. We therefore designated the period since 1989 as representing the time frame of the current spread of tetraploids in their expanded range.

To examine the predictors of this current spread, we performed a second BRT. As a response variable, we used binomial occurrence data, recorded from the expanded range between 1989 and 2023. The predictors were the same as in the previous BRT. We found that the proportion of tetraploids was more strongly associated with data related to urbanization than with spatial data, dispersal corridor data or climatic data (Supplementary Fig. 5B). More specifically, the most important predictors of the current occurrence of tetraploids were the cover of impervious structures and the road density, both being positively correlated with the proportion of tetraploids (Supplementary Fig. 13). This finding aligns with a detailed analysis of different subtypes of ruderal habitats showing that roadsides are currently the most frequently occupied habitat type of tetraploids in their expanded range. In particular, the proportion of roadsides increased from 10% in the 1870s to approximately 35% at present (Supplementary Fig. 14). Our field surveys further support these findings, as we recorded 39.3% of the tetraploid populations in the expanded range along roadsides (see Supplementary Table 3 for a list of 244 field sites visited for this study). Some of these roadside populations appeared to naturalize into adjacent native plant communities (see Supplementary Figs. 15–17 for field impressions), including the recent establishment of mixed-ploidy populations with co-existing diploid and tetraploid individuals (Supplementary Table 6).

Tetraploid C. stoebe is native only to South-Eastern Europe and has recently expanded its native range toward Central Europe. Similar range expansions have been proposed for various plant species, possibly influenced by both climate change and pronounced human activities in Central Europe^4,13. However, there are no assessments of spatio-temporal range dynamics among these species, which hampers our ability to identify the overarching drivers behind their ongoing expansions^2–4. We therefore call for further research in other model species to improve our understanding of native range expansions in the Anthropocene. Our conceptual framework, considering historical habitat requirements, phylogeographic data and expert knowledge, represents a unique attempt to delineate the native range of a range-expanding species. The robustness of our estimation was reinforced by compelling results on the range- and habitat-specific dynamics in the proportion of tetraploids.

Importantly, our assessment contrasts with at least 52 studies that treated tetraploid C. stoebe populations from the estimated expanded range as native. A careful revision of the spatio-temporal range dynamics – as we did here – would have been beneficial before conducting the sampling for these studies. We consequently urge researchers to meticulously evaluate the native range of their model species, especially when studying its evolutionary ecology³. This task may be particularly challenging, though indispensable, when dealing with cryptic species such as cytotypes that show similar morphologies⁷. In our study, more than 90% of the specimen labels did not differentiate between diploids and tetraploids. Instead, the specimens were stored under various names and frequently misidentified by collectors. Unfortunately, most available information on taxa distribution across space and time remains unrevised, representing a pressing problem of biodiversity science²⁵. We thus stress that both accurate taxonomy and solid knowledge of the native ranges of species are essential for understanding invasion dynamics^16,26 and contemporary biodiversity trends^10,27,28.

The cryptic invasion of tetraploids across Central Europe corresponds to the invasion of North America, where exclusively tetraploids were able to establish²¹. Our research thus supports the observation that polyploids often show greater invasion success than closely related diploids^22,29. More specifically, our results suggest that tetraploids possess superior colonization abilities over diploids in ruderal habitats, a characteristic that may be particularly prevalent in allopolyploid species like tetraploid C. stoebe^22,30. These superior colonization abilities may relate to the increased longevity of tetraploids, enhancing their tolerance to environmental and demographic stochasticity in ruderal habitats^21,31. The polycarpy of tetraploids can lead to higher re-sprouting success after severe disturbance³² and greater lifetime seed production compared with the monocarpic diploids³³. In addition, genome duplication has been shown to augment adaptive capabilities²⁰ and diminish founder effects^20,34 in tetraploid compared to diploid C. stoebe.

The cryptic invasion of tetraploids involved a niche expansion toward a more oceanic climate. Notably, tetraploids also experienced a niche shift during their North American invasion, but in the opposite direction, toward a more continental climate¹⁴. These intercontinental differences could be mediated by different biotic interactions or the climate conditions prevailing on each continent^35,36. The dynamics of the niche expansion in Europe differed strongly between habitat types. The climatic niche was rapidly occupied in ruderal but not in natural habitats, representing a potentially common, yet understudied, phenomenon in plant invasions³⁷. In the most oceanic regions, natural sites remained uncolonized, probably because they are less suitable for C. stoebe (wet conditions with high interspecific competition) than ruderal sites that could mimic the conditions in the native range (dry with low interspecific competition³¹).

While the initial spread of tetraploids was determined by the distance to their native range, their current occurrence was predominantly associated with the impervious cover of the landscape and the road density. This result reiterates the crucial role of colonization abilities in ruderal habitats as a primary mechanism for the current success of tetraploid C. stoebe^20,32 and emphasizes the importance of roads as dispersal corridors facilitating range expansions³⁸. More generally, our results add to the growing body of research suggesting that polyploidy confers benefits in coping with stressors linked to urbanization^39,40. Climatic characteristics did not play a major role in either the initial spread or the current occurrence of tetraploids. This finding is consistent with current research concepts emphasizing the growing importance of anthropogenic factors relative to macroclimate as drivers of species distributions⁴¹.

In conclusion, we have presented what we believe to be the first robust empirical evidence of a cryptic invasion by a polyploid plant expanding into the range of its diploid relative. Given the increasing accessibility of herbarium collections online⁴², we hope to motivate more scientists to critically evaluate the native ranges of their study species. Such endeavors may improve the assessment of biodiversity trends and the design of research studies. Our study also sheds light on the superior colonization ability of tetraploids as a key driver of their cryptic invasion, particularly along roadsides and in habitats with high impervious cover. The capacity to occupy such ruderal habitats is crucial in contemporary landscapes because humans are continually increasing the loss and fragmentation of natural habitats while increasing the prevalence and connectivity of ruderal habitats⁴³. More studies on polyploid complexes are warranted to test the generality and limitations of our results and to explore the broader implications of our findings for global patterns in the distribution of diploid versus polyploid plants^39,44.

Total dataset

From the 13,078 specimens, we first removed duplicates (i.e., specimens collected at the same site and at the same time: 41.6%) and specimens that did not belong to the C. stoebe complex (1.5%). We further removed specimens for which it was not possible to identify the cytotype (10.7%) or to gather information on locality or collection date (6.8%). After these cleaning steps, we added the cytogeographical data, resulting in a total dataset of 5,821 Eurasian records (Fig. 6). We used this total dataset to explore the entire Eurasian range of the C. stoebe complex and to estimate which parts of the ranges are native for both cytotypes.
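The reported numbers can be cross-checked with a short calculation. The sketch assumes that all four percentages refer to the initial 13,078 specimens and that the 668 cytogeographical records were added afterwards; because the percentages are rounded to one decimal place, the result is approximate. The variable names are hypothetical.

```python
initial_specimens = 13_078

# Share of the initial specimens removed in each cleaning step (in percent)
removed_percent = {
    "duplicates": 41.6,
    "not part of the C. stoebe complex": 1.5,
    "cytotype not identifiable": 10.7,
    "locality or collection date missing": 6.8,
}
cytogeographical_records = 668  # 424 published + 244 new records

retained = initial_specimens * (1 - sum(removed_percent.values()) / 100)
total_dataset = round(retained + cytogeographical_records)

print(f"retained herbarium specimens: about {retained:.0f}")
print(f"total dataset: about {total_dataset}")
```

Under these assumptions, the calculation reproduces the total of 5,821 records stated above.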

Estimation of the native and expanded ranges of tetraploids

To delineate the native and expanded ranges of tetraploids, we developed a conceptual framework combining three approaches commonly employed to unveil native range expansions and cryptic invasions^3,10: 1) revision of occurrence data, 2) phylogeographic analyses, i.e., analyses on contemporary spatial patterns of molecular variation, and 3) review of floristic publications. Among these three approaches, our delineation primarily relied on the revision of occurrence data.

Previous studies that used occurrence data to delineate native and expanded ranges have typically focused on spatio-temporal distribution patterns. Specifically, regions where occurrences had been recorded before or after a certain temporal threshold were regarded as being part of the native and expanded range, respectively^3,10,45. However, this approach can be strongly biased by spatio-temporal patterns in collection efforts (reviewed by Lang et al.⁴⁶). Instead of relying on a spatio-temporal approach, we focused on distribution patterns in natural habitats, independently of collection time. Given that the range expansion of tetraploids is largely driven by human activities^5,19, it is crucial to recognize which natural habitats were already available before human activities completely transformed the European landscape². For the light-demanding C. stoebe, this definition encompasses naturally canopy-free habitats, including zonal steppes, as well as extrazonal habitats such as rock outcrops and treeless habitats at high altitudes (i.e., “relictual habitats”). The occurrences of tetraploid populations in these relictual habitats were particularly interesting to us because C. stoebe is a predominantly barochorous species, which limits its uphill dispersal³¹. Consequently, the colonization of such relictual sites is unlikely to have occurred recently without the presence of nearby source populations over an extended period.

After defining the historical habitat requirements, we used our total dataset to identify geographical regions where tetraploids regularly occur in natural steppes or relictual habitats. We conducted these assessments across eleven geographical regions, which aligns with previous research investigating range expansions in a region-focused manner^3,45. However, in contrast to previous efforts, our estimation accounted for the occurrence of a reference taxon, a recommended strategy for addressing spatial sampling bias in herbarium studies⁴⁶. In particular, in regions where tetraploids were not found in natural steppes or relictual habitats, we examined whether diploids occurred at such natural sites. This was crucial in ensuring that the absence of tetraploid records at natural sites did not result from insufficient sampling activity. Because both closely related cytotypes share similar ecological niches^14,19–21,31–34, it was anticipated that they would have had an equal opportunity to occupy relictual habitat types in regions where both cytotypes coexisted over an extended period. Thus, if only diploids were observed in the relictual habitats of a particular region, it is unlikely that this region was part of the native range of tetraploids. Diploids are considered to exhibit a historically widespread distribution across the entire study range^5,19. As such, diploids should have had sufficient time to colonize numerous relictual habitats at high elevations and rock outcrops throughout Central and Western Europe, making them an appropriate reference for evaluating sampling efforts in relictual habitats across the eleven geographical regions. Further details on the methods of our estimation of the native and expanded ranges, including the detailed assessment across the eleven geographical regions, are provided in the Supplementary Note 2.

As a second step, we investigated whether our estimated range delineation is concordant with the spatial patterns of molecular variation observed in five published and three unpublished phylogeographic datasets. Analyzing spatial patterns of contemporary genetic diversity is a conventional method for identifying signatures of recent or historic cryptic invasions (reviewed by Morais & Reichard¹⁰). Together, the geographical distribution of genetic diversity within both cytotypes (especially with regard to rare alleles) and the distribution of closely related taxa that share ribotypes with allotetraploid C. stoebe supported our occurrence data-based assessment of the native and expanded ranges of tetraploids (see Supplementary Note 2 for details). In addition, the phylogeographic data facilitated decision-making in regions where tetraploids may be native but currently have sparse distributions (e.g., Ukrainian steppes, which have undergone extensive conversion to agricultural land in recent decades).

Finally, we conducted an extensive literature survey of publications proposing a non-native origin of distinct tetraploid populations. Combining occurrence data from herbarium specimens with information gleaned from local floristic publications is a suitable approach to uncover range expansions^3,45. We found that our assessments were largely congruent with the expert knowledge of local floristic botanists (see Supplementary Note 2 for details). Note that the same three approaches and criteria were applied to assess the native range of diploids, which yielded no credible evidence of a recent range expansion in diploids (Supplementary Note 2).

We utilized ArcGIS 9.2 (ESRI, Redlands, U.S.A) to delineate the estimated border of the native range of tetraploids along the geographical barriers that we identified to separate the native and expanded ranges (e.g., rivers and mountains, see Supplementary Note 2). Occurrence data falling within this border were categorized as native, whereas those lying outside were classified as being part of the expanded range. Occurrence data within a 50-kilometer buffer around the border underwent individual confirmation of their range affiliation. We conducted this individual verification to avoid misassignments potentially arising from imprecise border delineation or inaccuracies in the georeferencing of respective study populations.
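The range assignment with a verification buffer can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a reproduction of the ArcGIS workflow: the native range is taken as a simple polygon in a projected coordinate system with units of kilometers, and all function names are hypothetical.

```python
import math

BUFFER_KM = 50.0  # buffer width stated in the text


def point_in_polygon(pt, poly):
    """Ray-casting test; poly is a list of (x, y) vertices in km."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does a horizontal ray from pt cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dist_to_segment(pt, a, b):
    """Shortest planar distance from pt to the segment a-b."""
    px, py = pt
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    # Project pt onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def classify(pt, poly, buffer_km=BUFFER_KM):
    """Return (range label, needs_individual_check)."""
    label = "native" if point_in_polygon(pt, poly) else "expanded"
    n = len(poly)
    d = min(dist_to_segment(pt, poly[i], poly[(i + 1) % n]) for i in range(n))
    return label, d <= buffer_km
```

Records flagged by the second return value correspond to those that would be checked individually, because small errors in the border or in the georeferencing could change their assignment.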

Comparison of our estimated native range with samplings from previous research

We conducted a comprehensive review of published studies comparing tetraploid C. stoebe populations from Europe and North America. We included only studies that explicitly investigated post-introduction evolution or ecological differences between the native and non-native ranges. Our literature survey was therefore rather conservative, because many more studies have used tetraploid populations from their expanded range without focusing on a native vs. non-native comparison. From the relevant publications, we extracted the reported GPS coordinates of the European tetraploid populations. We then assessed how many of the study populations fell inside or outside the border of our estimated native range. As above, occurrence data within a 50-kilometer buffer around the border underwent individual verification.

Focal dataset and study range

Our assessment identified two regions where tetraploids occur beyond their native range: 1) northwest of the native range, encompassing Central, Western and Northern Europe, and 2) east of the native range, specifically south and east of the Don River. For all analyses described below, we used our “focal dataset” focusing on our “study range”, i.e., the native range and the expansion toward Central, Western and Northern Europe (Fig. 6). Records east of the native range were not included because the estimated delineation of the native range may be less precise in European Russia due to lower sampling efforts. We had 4,417 records from Central, Western and Northern Europe but only 285 from the expanded range in European Russia, despite both ranges having similar sizes. The lower sampling density in European Russia can be attributed to the relatively sparse distribution of herbaria (Supplementary Fig. 1). Note that including the data east of the native range did not change the general pattern in any of the presented results (data not shown).

Ruderal vs. natural habitat type assignment

Habitat information was extracted from the labels of the herbarium specimens or from field notes of the cytogeographical collections (Supplementary Tables 3 and 4). With the available information, we classified the habitat types according to the European classification system of habitats (EUNIS). Following the approach outlined in Broennimann et al.¹⁴, habitat types classified as diluvial sediments (EUNIS category C), natural and semi-natural grasslands (E), and natural rock outcrops (H) were classified as “natural” habitats. In contrast, agricultural habitats (I) and transport networks, extractive sites, urban and industrial habitats (J) were classified as “ruderal” habitats. For 28.8% of our records, we could not retrieve sufficient habitat information to assign them to either of these categories.
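The mapping from EUNIS top-level categories to the two habitat classes can be written as a small lookup, sketched here in Python. The function name and the labels for records without usable information are assumptions. How categories other than C, E, H, I and J were handled is not stated in the text, so they are simply returned as "other" here.

```python
# EUNIS top-level letters as given in the text.
NATURAL = {"C", "E", "H"}
RUDERAL = {"I", "J"}


def classify_habitat(eunis_code):
    """Map a EUNIS code (e.g. 'E1.2') to 'natural', 'ruderal', 'other',
    or 'unclassified' when no habitat information is available."""
    if not eunis_code or not eunis_code.strip():
        return "unclassified"
    top_level = eunis_code.strip().upper()[0]
    if top_level in NATURAL:
        return "natural"
    if top_level in RUDERAL:
        return "ruderal"
    # Assumption: remaining EUNIS categories are kept apart, not forced
    # into either class.
    return "other"
```

For example, `classify_habitat("J4.2")` returns `"ruderal"`, and a specimen label without habitat notes maps to `"unclassified"`.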

Data analyses

To predict spatio-temporal patterns in the proportion of tetraploids, we fitted GAMs using the package mgcv 1.8-41⁴⁷ in R 4.3.3⁴⁸. We used a binomial response in our GAMs to accommodate the inherent inconsistencies of herbarium collection efforts across space and time⁴⁶. Specifically, spatio-temporal dynamics in species occurrences should not be predicted by the absolute numbers of specimens collected but rather in relation to a reference species, and this is particularly important in research on range expansions^11,12,46. Diploid C. stoebe is a suitable reference taxon because it shows a stable distribution across the sympatric ranges of both cytotypes over the last two centuries (Supplementary Note 2) and both cytotypes show comparable ecological niches^14,19–21,31–34. At the same time, using diploids as a reference directly addresses the conundrum that polyploid plants become more frequent than diploids in some, but not all, environmental contexts in the Anthropocene^22,39.

We first predicted the proportion of tetraploids as a function of time in the native vs. expanded ranges using a logistic thin plate spline-based smoother function on the predictors “year by range” + “range”. To study habitat-specific temporal patterns in the proportion of tetraploids, we ran another two GAMs, one separate GAM for each range, using the predictors “year by habitat type” + “habitat type”. To account for spatial autocorrelation, all GAMs included a spline-on-the-sphere smoothing term based on latitude and longitude⁴⁹. Concurvity among year, latitude and longitude was always below 0.15 indicating very low multi-collinearity⁴⁹. To identify significant predictors, model performances were compared based on the Akaike information criterion (AIC), using ΔAIC ≤ -2 as a threshold for significance. Model structures of the GAMs can be found in Supplementary Table 5.
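The AIC-based comparison can be made concrete with a short Python sketch. It assumes that ΔAIC is computed as the AIC of the model including a predictor minus the AIC of the model without it, which is one plausible reading of the threshold above. The function names are hypothetical, and for GAMs the parameter count would be the effective degrees of freedom reported by the fitting software.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def predictor_is_supported(aic_with, aic_without, threshold=-2.0):
    """Return (delta AIC, verdict) for adding one predictor.

    Assumption: delta = AIC(with predictor) - AIC(without predictor),
    so a value at or below the threshold favors keeping the predictor.
    """
    delta = aic_with - aic_without
    return delta, delta <= threshold
```

With log-likelihoods of -100 (five parameters) and -104 (three parameters), the two AIC values are 210 and 214, giving ΔAIC = -4 and hence support for the additional predictor under this convention.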

To assess the overall realized climatic niches, we used the 19 standard Bioclim variables bio1–bio19 from CHELSA 2.1⁵⁰ to perform a PCA using the vegan 2.6-4 package⁵¹. The resulting orthonormal system of principal components maximizes the present climatic variation, which was shown to appropriately quantify the realized climatic niches of diploid and tetraploid C. stoebe¹⁴. To estimate the percentage of overlap between the realized climatic niches of diploids and tetraploids across the study range, we used the 19 Bioclim variables to calculate dynamic range boxes, which quantify the size and overlap of n-dimensional hypervolumes⁵². We similarly compared the overlap between the climatic niches of tetraploids in their native and expanded ranges. We then produced niche-over-time plots according to Broennimann et al.¹⁴. To ensure conservative niche limits, we removed occurrence data outside the 5th and 95th percentiles. These outliers may reflect artifacts from the modeled climatic data or sites that are extreme in terms of macroclimate but are strongly influenced by favorable microclimatic conditions¹⁴.
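The percentile-based trimming can be sketched as follows in Python. The sketch assumes that trimming is applied to the scores on a single niche axis and uses linear interpolation between ranked values. Both choices, as well as the function names, are assumptions, because the text does not specify the percentile definition or the axes used.

```python
def percentile(sorted_vals, q):
    """Percentile q (0-100) by linear interpolation between ranks."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] * (1 - frac) + sorted_vals[upper] * frac


def trim_to_percentiles(values, lower_q=5, upper_q=95):
    """Keep only values between the lower and upper percentiles."""
    ranked = sorted(values)
    lo = percentile(ranked, lower_q)
    hi = percentile(ranked, upper_q)
    return [v for v in values if lo <= v <= hi]
```

Applied to the scores of all occurrences on one PCA axis, this keeps the central 90% of the records and drops the most extreme values at both ends.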

To study relationships between the dispersal routes and climatic dissimilarity, a MESS analysis was computed using the R-package dismo 1.3-14⁵³. The native climatic niche of tetraploids was estimated from climatic data of all grid cells occupied by tetraploids in their native range. We then compared the dissimilarity of this native niche with each grid cell in the expanded range of tetraploids^14,54. To visualize spatio-temporal patterns of niche dissimilarity of tetraploids in their expanded range, we plotted the spatial distribution of MESS values on a map with dispersal routes that were predicted by a minimum cost arborescence algorithm according to Hordijk & Broennimann⁵⁵.
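For readers unfamiliar with MESS scores, the following Python sketch shows how the similarity of one grid cell to a set of reference cells is commonly computed. It follows the usual published definition of the score rather than the dismo source code, and the function names and the dictionary-based data layout are assumptions for illustration.

```python
def variable_similarity(value, reference):
    """Similarity of one value to the reference values of one variable."""
    lo, hi = min(reference), max(reference)
    span = hi - lo
    if span == 0:
        raise ValueError("reference values must not be constant")
    # Percentage of reference cells with a smaller value.
    f = 100.0 * sum(1 for r in reference if r < value) / len(reference)
    if f == 0:
        return (value - lo) / span * 100.0   # at or below the reference minimum
    if f <= 50:
        return 2.0 * f
    if f < 100:
        return 2.0 * (100.0 - f)
    return (hi - value) / span * 100.0       # above the reference maximum


def mess(cell, reference):
    """MESS score: the minimum similarity across all variables.

    cell: dict variable -> value for the cell to be evaluated.
    reference: dict variable -> list of values from the native range.
    """
    return min(variable_similarity(cell[v], reference[v]) for v in reference)
```

Negative scores indicate that at least one climatic variable lies outside the range observed in the native range, which is why the score can serve as a measure of climatic distance to the native niche.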

To evaluate the determinants of the initial spread of tetraploids, we first estimated the residence time of tetraploids in each pixel of the expanded range (i.e., years elapsed since the first record in distinct 10 km × 10 km pixels). This residence time was used as a response variable in a BRT to estimate the relative importance of several predictors for how early or late individual pixels were colonized. BRTs are frequently used to identify predictors of species distribution⁵⁶. They are particularly suitable for analyzing large datasets with numerous different predictor variables and are relatively insensitive to collinearity and missing values in the predictor variables⁵⁷. We fitted our BRT using the R-package dismo, assuming a Gaussian error distribution, a bag fraction of training data of 0.5, a tree complexity of 1, a learning rate of 0.01, and a tolerance of 0.01. We then assessed the significance of predictor variables using model simplification based on model-internal cross-validations. Our predictor variables included four types of data: 1) spatial data: latitude, longitude and the spatial distance to the native range, 2) climatic data: a precipitation gradient (loadings of the first PCA axis), a temperature gradient (loadings of the second PCA axis) and the climatic distance to the native range (niche dissimilarity from the MESS analysis), 3) dispersal corridor data and 4) urbanization data. The latter two data categories encompassed spatio-temporally explicit data, extracted from 10 km × 10 km pixels. We used data from 1990 as it was the earliest year with high-quality data available across the data sources. Moreover, in 1990, approximately half of the pixels were colonized. The dispersal corridor data included the density of railways (log_e-transformed) and roads (log_e-transformed), and a connectivity index (log_e-transformed).
Railway and road density data (km lengths within the 10 km × 10 km pixels) were extracted from the dataset in Garcia-López et al.¹⁸. The connectivity index was calculated as the reciprocal of the estimated travel time to the nearest city with at least 50,000 inhabitants from the Agglomeration Index database⁵⁸. The urbanization data included human population density (log_e-transformed) and proportion data of urban vs. rural landscape (logit-transformed), both downloaded from the Global Human Settlement Layer¹⁷, and the percentage of landscape covered by impervious structures (log_e-transformed) from the annual maps of global artificial impervious area⁵⁹.
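The residence-time response used in the first BRT can be derived from the occurrence records as sketched below in Python. The sketch assumes coordinates in a projected system with units of kilometers and a hypothetical reference year of 2023 (the last collection year in the dataset). The function name and the record layout are likewise illustrative.

```python
import math

PIXEL_KM = 10.0        # pixel size stated in the text
REFERENCE_YEAR = 2023  # assumption: last collection year in the dataset


def residence_times(records, reference_year=REFERENCE_YEAR, pixel_km=PIXEL_KM):
    """Years elapsed since the first tetraploid record in each pixel.

    records: iterable of (x_km, y_km, year) for tetraploid occurrences
    in the expanded range.
    """
    first_year = {}
    for x_km, y_km, year in records:
        # Assign the record to its 10 km x 10 km pixel.
        pixel = (math.floor(x_km / pixel_km), math.floor(y_km / pixel_km))
        if pixel not in first_year or year < first_year[pixel]:
            first_year[pixel] = year
    return {pixel: reference_year - year for pixel, year in first_year.items()}
```

Pixels with long residence times are those colonized early, so predictors with high relative importance in the BRT point to conditions that favored early colonization.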

To analyze the determinants of the current occurrence, we performed a second BRT, using binomial occurrence data (i.e., diploid vs. tetraploid) from 1989–2023 as a response variable. This BRT was fitted to the same predictor variables and using the same settings as the previous BRT, except that we assumed a Bernoulli error distribution. Moreover, for the spatio-temporally explicit predictors related to urbanization and dispersal corridors (extracted from 10 km × 10 km pixels), we retrieved data from the distinct collection year or used interpolated data for years where no data was available.

Data availability

Source data that support the findings of this study will be available on GitHub⁶⁰ (https://xxxx) and will be mirrored on Zenodo (https://xxxx).

Code availability

The code for data analysis will be available on GitHub⁶⁰ (https://xxxx) and will be mirrored on Zenodo (https://xxxx).

Acknowledgments

We thank all curators from the included herbaria for access to their collections, and in addition, the curators from the herbaria HAL, SAAR, M and O for access to leaf material from the specimens. We are grateful to S. Schmidt, M. Schramm, G. Röhrborn and I. Vogler for their help with digitizing label information from the specimens. We also thank G. Seidler for his support with GIS analyses and J. Ochsmann for insightful discussions and access to his C. stoebe collections. We acknowledge S. Bancheva, A. Diaconu, R. Filep, I. Goia, Z. Gudzinskas, J. Hadinec, A. Hajdari, M. Hartmann, R. Ion, F. Kolář, G. Kreiger, A. Pieters, L. Priestman, D. Purger, P. Reger, G. Röhrborn, A. Röske, M. Schramm, M. Silantyeva, B. Šingliarová, M. Štefánek, A. Şuteu, A. Synowiec, U. Treier, T. Tyler and M. Wilcox for sampling seeds for flow cytometric analyses. This work was supported by a DAAD exchange scholarship (D|12|00534 to C.R.), institutional funding of the Charles University, Prague (to P.M.), a statutory fund of the W. Szafer Institute of Botany, Polish Academy of Sciences (to M.R. and T.S.) and the Ministry of Education, Singapore (Award No. MOE-T2EP20221-0007 to M.T.G.). C.R., R.R. and K.K. acknowledge the support of iDiv funded by the German Research Foundation (DFG–FZT 118, 202548816).

Author Contributions

C.R., O.B., A.G., H.M.-S. and P.M. designed research; C.R., A.N., V.M., J.D., K.K., D.U.N., N.M.S., M.R., T.S., A.E.T., P.Z. and P.M. performed research; C.R., O.B., M.T.G., R.R., M.R., T.S. and J.A.S. analyzed data; and C.R., O.B., and P.M. wrote the paper with substantial input from all co-authors.

Competing Interest Statement

The authors declare no competing interest.

References

van Kleunen M et al (2015) Global exchange and accumulation of non-native plants. Nature 525:100–103
Essl F et al (2019) A conceptual framework for range-expanding species that track human-induced environmental change. Bioscience 69:908–919
Lustenhouwer N, Parker IM (2022) Beyond tracking climate: Niche shifts during native range expansion and their implications for novel invasions. J Biogeogr 49:1481–1493
Zhang Z et al (2023) The poleward naturalization of intracontinental alien plants. Sci Adv 9
Ochsmann J (2000) Morphologische und molekularsystematische Untersuchungen an der Centaurea stoebe L.-Gruppe (Asteraceae-Cardueae) in Europa. Dissertationes Botanicae 324. Cramer in der Gebr.-Borntraeger-Verl.-Buchh., Berlin, Stuttgart
Rüegg S, Raeder U, Melzer A, Heubl G, Bräuchler C (2017) Hybridisation and cryptic invasion in Najas marina L. (Hydrocharitaceae)? Hydrobiologia 784:381–395
Mezhzherin SV, Tsyba AA, Kryvokhyzha D (2022) Cryptic expansion of hybrid polyploid spined loaches Cobitis in the rivers of Eastern Europe. Hydrobiologia 849:1689–1700
Kúr P et al (2023) Cryptic invasion suggested by a cytogeographic analysis of the halophytic Puccinellia distans complex (Poaceae) in Central Europe. Front Plant Sci 14
Novak SJ (2011) Geographic origins and introduction dynamics. In: Encyclopedia of Biological Invasions. University of California Press, pp 273–280
Morais P, Reichard M (2018) Cryptic invasions: A review. Sci Total Environ 613–614:1438–1448
Delisle F, Lavoie C, Jean M, Lachance D (2003) Reconstructing the spread of invasive plants: taking into account biases associated with herbarium specimens. J Biogeogr 30:1033–1042
Hyndman RJ, Mesgaran MB, Cousens RD (2015) Statistical issues with using herbarium data for the estimation of invasion lag-phases. Biol Invasions 17:3371–3381
Dainese M et al (2017) Human disturbance and upward expansion of plants in a warming climate. Nat Clim Change 7:577–580
Broennimann O, Mráz P, Petitpierre B, Guisan A, Müller-Schärer H (2014) Contrasting spatio-temporal climatic niche dynamics during the eastern and western invasions of spotted knapweed in North America. J Biogeogr 41:1126–1136
Theoharides KA, Dukes JS (2007) Plant invasion across space and time: factors affecting nonindigenous species success during four stages of invasion. New Phytol 176:256–273
Skokanová K et al (2023) Contrasting invasion patterns of two closely related Solidago alien species. J Biogeogr 1–14
Schiavina M et al (2023) GHSL data package 2023. Publications Office of the European Union, Luxembourg
Garcia-López M-À, Pasidis I, Viladecans-Marsal E (2022) Congestion in highways when tolls and railroads matter: evidence from European cities. J Econ Geogr 22:931–960
Mráz P et al (2012) Allopolyploid origin of highly invasive Centaurea stoebe s.l. (Asteraceae). Mol Phylogenet Evol 62:612–623
Rosche C et al (2016) The population genetics of the fundamental cytotype-shift in invasive Centaurea stoebe s.l.: genetic diversity, genetic differentiation and small-scale genetic structure differ between cytotypes but not between ranges. Biol Invasions 18:1895–1910
Mráz P, Bourchier RS, Treier UA, Schaffner U, Müller-Schärer H (2011) Polyploidy in phenotypic space and invasion context: A morphometric study of Centaurea stoebe s.l. Int J Plant Sci 172:386–402
te Beest M et al (2012) The more the better? The role of polyploidy in facilitating plant invasions. Ann Bot 109:19–45
Zalasiewicz J et al (2015) When did the Anthropocene begin? A mid-twentieth century boundary level is stratigraphically optimal. Quat Int 383:196–203
Kupková L, Bičík I, Najman J (2013) Land cover changes along the Iron Curtain 1990–2006. Geografie 118:95–115
Maldonado C et al (2015) Estimating species diversity and distribution in the era of Big Data: to what extent can we trust public databases? Glob Ecol Biogeogr 24:973–984
Pyšek P et al (2013) Hitting the right target: taxonomic challenges for, and of, plant invasions. AoB PLANTS 5:plt042–plt042
Hochkirch A et al (2021) A strategy for the next decade to address data deficiency in neglected biodiversity. Conserv Biol 35:502–509
Lehnert M, Monjau T, Rosche C (2023) Synopsis of Osmunda (royal ferns; Osmundaceae): towards reconciliation of genetic and biogeographic patterns with morphologic variation. Bot J Linn Soc boad071
Pandit MK, White SM, Pocock MJO (2014) The contrasting effects of genome size, chromosome number and ploidy level on plant invasiveness: a global analysis. New Phytol 203:697–703
Prentis PJ, Wilson JRU, Dormontt EE, Richardson DM, Lowe AJ (2008) Adaptive evolution in invasive species. Trends Plant Sci 13:288–294
Mráz P et al (2012) Anthropogenic disturbance as a driver of microspatial and microhabitat segregation of cytotypes of Centaurea stoebe and cytotype interactions in secondary contact zones. Ann Bot 110:615–627
Rosche C, Hensen I, Lachmuth S (2018) Local pre-adaptation to disturbance and inbreeding-environment interactions affect colonisation abilities of diploid and tetraploid Centaurea stoebe. Plant Biol 20:75–84
Hahn MA, Buckley YM, Müller-Schärer H (2012) Increased population growth rate in invasive polyploid Centaurea stoebe in a common garden. Ecol Lett 15:947–954
Rosche C et al (2017) Invasion success in polyploids: the role of inbreeding in the contrasting colonization abilities of diploid versus tetraploid populations of Centaurea stoebe s.l. J Ecol 105:425–435
Atwater DZ, Ervine C, Barney JN (2018) Climatic niche shifts are common in introduced plants. Nat Ecol Evol 2:34–43
Lee BR et al (2022) Wildflower phenological escape differs by continent and spring temperature. Nat Commun 13:7157
González-Moreno P, Diez JM, Richardson DM, Vilà M (2015) Beyond climate: disturbance niche shifts in invasive species. Glob Ecol Biogeogr 24:360–370
Follak S et al (2018) Invasive alien plants along roadsides in Europe. EPPO Bull 48:256–265
Van Drunen WE, Johnson MTJ (2022) Polyploidy in urban environments. Trends Ecol Evol 37:507–516
Turcotte MM, Kaufmann N, Wagner KL, Zallek TA, Ashman T-L (2024) Neopolyploidy increases stress tolerance and reduces fitness plasticity across multiple urban pollutants: support for the general-purpose genotype hypothesis. Evol Lett qrad072
McKeon CM, Kelly R, Börger L, De Palma A, Buckley YM (2023) Human land use is comparable to climate as a driver of global plant occurrence and abundance across life forms. Glob Ecol Biogeogr 32:1618–1631
Davis CC (2023) The herbarium of the future. Trends Ecol Evol 38:412–423
Otto SP (2018) Adaptation, speciation and extinction in the Anthropocene. Proc Biol Sci 285:20182047
Rice A et al (2019) The global biogeography of polyploid plants. Nat Ecol Evol 3:265–273
D’Andrea L, Broennimann O, Kozlowski G, Guisan A, Morin X, Keller-Senften J, Felber F (2009) Climate change, anthropogenic disturbance and the northward range expansion of Lactuca serriola (Asteraceae). J Biogeogr 36:1573–1587
Lang PLM, Willems FM, Scheepens JF, Burbano HA, Bossdorf O (2019) Using herbaria to study global environmental change. New Phytol 221:110–122
Wood SN (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc Ser B Stat Methodol 73:3–36
R Foundation for Statistical Computing. R: A language and environment for statistical computing (2023)
Wood SN, Pya N, Säfken B (2016) Smoothing parameter and model selection for general smooth models. J Am Stat Assoc 111:1548–1563
Karger DN et al (2017) Climatologies at high resolution for the earth’s land surface areas. Sci Data 4:170122
Oksanen J et al (2022) vegan: Community ecology package. R package version 2.6-4 https://CRAN.R-project.org/package=vegan
Junker RR, Kuppler J, Bathke AC, Schreyer ML, Trutschnig W (2016) Dynamic range boxes – a robust nonparametric approach to quantify size and overlap of n-dimensional hypervolumes. Methods Ecol Evol 7:1503–1513
Hijmans RJ, Phillips S, Leathwick J, Elith J (2023) dismo: Species distribution modeling. R package version 1.3–14 https://CRAN.R-project.org/package=dismo
Broennimann O et al (2021) Distance to native climatic niche margins explains establishment success of alien mammals. Nat Commun 12:2353
Hordijk W, Broennimann O (2012) Dispersal routes reconstruction and the minimum cost arborescence problem. J Theor Biol 308:115–122
Elith J, Leathwick JR, Hastie T (2008) A working guide to boosted regression trees. J Anim Ecol 77:802–813
Rauschkolb R et al (2024) Spatial variability in herbaceous plant phenology is mostly explained by variability in temperature but also by photoperiod and functional traits. Int J Biometeorol 68:761–775
Uchida H, Nelson A (2010) Agglomeration index: Towards a new measure of urban concentration. In: Beall J, Guha-Khasnobis B, Kanbur R (eds) Urbanization and Development: Multidisciplinary Perspectives. Oxford University Press, Oxford, pp 41–59
Gong P et al (2020) Annual maps of global artificial impervious area (GAIA) between 1985 and 2018. Remote Sens Environ 236:111510
Rosche C et al Code and data for: Herbarium specimens reveal a cryptic invasion of tetraploid Centaurea stoebe in Europe. https://github.com/xxxxxx. Deposited xxx xxxx 2024

SupplementaryInformationRosche.pdf
