Sampling
The European beech study plots in Romania are located in the vicinity of Brasov city along an elevational gradient, between 550 m and 1400 m above sea level (a.s.l.). To visualize the study plots, a map (Figure 1) was generated using QGIS 3.28.3 (Quantum GIS Development Team, 2023). Carpathian montane forests cover the southern alpine and continental region from 500 to 1500 m (a.s.l.) (Olson et al., 2001; EEA, 2021).
In five locations, bud samples from adult beech trees were collected. Beech regenerated naturally in all five stands. Over the last decades almost no silvicultural cuttings have been carried out in the sampled stands because two sites (Lempes and Tampa) are located in natural protected areas (Natura 2000) and the other three sites (Solomon, Lupului and Ruia) are located close to the city of Brasov or to a ski resort. In 1946, a fire partly affected the forest in Tampa, which regenerated naturally afterward. The highest elevation population is Ruia in ca. 1400 m a.s.l., followed by Lupului in ca. 1100 m a.s.l., Solomon in ca. 900 m a.s.l., Tampa in ca. 700 m a.s.l. and Lempes in ca. 550 m a.s.l. The populations have a minimum altitudinal distance of at least 100 m from each other (Table 1).
Table 1: Characteristics of the five European beech study populations located in the Romanian Carpathian Mountains. E, East; N, North and SW, Southwest exposition on the top of the hill; DBH, diameter at breast height.
|
Ruia
|
Lupului
|
Solomon
|
Tampa
|
Lempes
|
Position
|
45°34‘39‘‘N
|
45°34‘58‘‘N
|
45°36‘59‘‘N
|
45°38‘21‘‘N
|
45°43‘34‘‘N
|
25°33‘18‘‘E
|
25°32‘38‘‘E
|
25°33‘37‘‘E
|
25°35‘46‘‘E
|
25°39‘18‘‘E
|
Elevation above sea level (m)
|
1300-1400
|
1000-1150
|
800-900
|
650-700
|
550-600
|
Exposition
|
N
|
N
|
N-SW
|
N
|
N
|
Mean annual temperature (°C)
|
3.5
|
5.2
|
6.2
|
7.2
|
7.5
|
Mean annual precipitation (mm)
|
1023
|
957
|
855
|
791
|
712
|
Soil type
|
Rendzina
|
Rendzina
|
Rendzina
|
Rendzina
|
Luvisol
|
Proportion of European beech
|
~ 10-20%
|
~ 20-30%
|
~ 80 - 90%
|
~ 40 - 50%
|
~ 70%
|
Approximate age (years)
|
150
|
160
|
110-130
|
130-150
|
100
|
Mean DBH (cm)
|
36
|
53
|
30
|
41
|
47
|
The slope was very steep except for the low elevation population (Lempes). The stands represent mixed forests where European beech is dominant in lower elevations, but its density decreases with increasing elevation. Among others, common hornbeam, Norway maple, European ash, Norwayspruce, and silver fir are some of the tree species that co-exist with beech in the five stands. In Ruia, at highest elevation, the forest shows a patchy structure with gaps.
DNA isolation and genotyping
In September and October of 2020, bud samples were collected for DNA isolation from a total of 500 beech trees from the five populations (~100 trees per population). The DNA was extracted using the DNeasy 96 Plant Kit and DNeasy Plant Mini Kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. DNA concentration was measured using a Qubit 3.0 fluorometer (Invitrogen, NY, USA) and diluted to ca. 10 ng/µl. All samples were genotyped at six highly polymorphic genomic microsatellite markers (gSSRs, genomic simple sequence repeats) and six expressed sequence tag microsatellites (EST-SSRs) which are located in expressed regions of the genome and therefore are less polymorphic than those from non-coding regions of the genome (Ellis and Burke, 2007). The gSSR markers were sfc0018, sfc0161, sfc1063 and sfc1143 (Asuka et al., 2004), FS3-04 (Pastorelli et al., 2003), mfs11 (Vornam et al., 2004). The EST-SSR markers were FgSI0006 (Burger et al., 2018), FgSI0009, and FgSI0024 (Kubisiak et al., 2009), FS_C1968, FS_C2361 and FS_C7377 (Burger et al., 2018, Table S1 Supplementary Material). All 500 samples were amplified using three multiplexes containing the 12 microsatellite markers. Each PCR mix was prepared consisting of 1 μl diluted DNA (ca. 10 ng/µl), 10 x 1.5 μl PCR buffer B (Solis BioDyne, Tartu, Estonia), 1.5 μl MgCl2 (25 mM), 1 μl dNTPs (2.5 mM each), 1 μl forward primer (5 pmol/ μl), 1 μl reverse primer (5 pmol/ μl), and 0.2 μl Taq polymerase (5 units per μl, Hot Fire Pol® DNA Polymerase, Solis BioDyne, Tartu, Estonia). Water was added to reach the final reaction volume of 14 μl and the PCR was run in a Biometra T Professional thermal cycler (Analytik Jena, Jena, Germany) following a standard protocol (Table S2, Supplementary Material). PCR products were diluted with water (multiplex I 1:50, multiplex II 1:100 and multiplex III 1.4:100, Table S1, Supplementary Material). For the microsatellite fragment size determination, 1270 μl of HiDi formamide (Applied Biosystems, CA, USA) were mixed with 1.6 μl of the internal size standard GeneScan™ 500 ROX® (Applied Biosystems, Foster City, CA). For each well, 12 μl of the mix was added to 2 μl of diluted PCR product. Prior to genotyping on a 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA), the mixture was denatured for three minutes at 95 °C. Allele sizes were determined using GeneMapper 4.1 (Applied Biosystems, Foster City, CA, USA). Cumulative distributions of allele sizes were plotted for each marker using Excel 11 (Microsoft Office, 2021), clusters of allele sizes were identified and sorted into representative sizes through manual binning. Micro-Checker 2.2.3 (Van Oosterhout et al., 2004) was used to check all loci for the presence of null alleles.
In addition to microsatellite-based genotyping, a targeted genotyping method based on single primer enrichment technology (SPET) was applied (Scaglione et al., 2019). This method allows for the identification of genome-wide SNP markers via efficient and cost-effective genotyping-by-sequencing of target SNPs. TheThe sequencing included flanking regions of ±2 kb of the targets to allow for the identification of novel SNPs located within and close to genes. Based on whole-genome sequencing of a subset of 96 samples, ~100k high quality target regions were determined for SPET sequencing by IGATech (IGA Technology Services Srl, Udine, Italy). All 500 F. sylvatica samples were SPET sequenced and sequencing reads were trimmed and mapped to the chromosome level F. sylvatica genome v2 (Mishra et al., 2022). Quality trimming was performed using ERNE v1.4.6 (Del Fabbro et al., 2013) with default parameters. Alignment to the reference genome was accomplished using BWA-MEM v0.7.17 (Li and Durbin, 2009) with default parameters. Uniquely aligned reads, defined as reads with a mapping quality ≥10, were selected. Duplicated reads were removed with UMI -_tools v1.1.2 (Smith et al., 2017). Variant calling was conducted with Freebayes v1.3.6 (Garrison and Marth, 2012) using the parameters "--min-alternate-count 6 --min-alternate-fraction 0.2 -p 2 --use-best-n-alleles 2". Subsequently, normalization and filtering of variants were performed using bcftools (Danecek et al., 2011). Variants were filtered for a read depth of more than 6x but less than 257x. Loci were considered only if at least half of the samples provided a coverage of 6x or above (N_PASS(FORMAT/GT=='RA' || FORMAT/GT=='AA')>=1). After this a total of 838,522 SNPs were retained. In order to identify neutral variants for population genetic analysis, the following filtering steps were applied. First, SNPs located in intergenic regions were determined using SnpEff (version 4.3t, Cingolani et al., 2012). For this, an annotation database based on the F. sylvatica reference sequence and gene annotations (Mishra et al., 2022) was built, based on which the effects for each SNP, such as intergenic region variant, missense variant, or intron variant, were inferred. In a second filtering step, SNPs in linkage disequilibrium were filtered out using PLINK (v1.90b6.10), (Purcell et al., 2007) with an ld-window size of 100,000 kb (--ld-window-kb). Retaining unlinked SNPs and one representative SNP per haplotype block yielded a data set of 270,287 SNPs. After filtering out loci with > 10 % missing data the data set comprised 154,437 polymorphic SNPs. Lastly, in order to identify a unique catalog of SNPs that are polymorphic in all five populations, the populations program from the Stacks analysis tool set (version 2.2, Rochette et al., 2019) was applied. Based on its output sumstats, a list of SNPs with observed heterozygosity > 0 in all five populations was prepared, which was then used as a whitelist for vcftools (version 0.1.14, Danecek et al., 2011) to generate a vcf file comprising 51,485 SNPs.
Population genetic analyses
With the aim of characterizing and visualizing the population genetic structure a Discriminant analysis of principal components (DAPC) implemented in the “adegenet” package (Jombart and Ahmed, 2011), was used to characterize the population genetic structure based on SSR and SNP markers in R (‘R Core Team’, 2023).
Genetic diversity parameters and pair-wise genetic differentiation at SSR markers were calculated using SPAGeDi 1.5d (Hardy & Vekemans, 2002). We determined the number of alleles (NA), the expected heterozygosity (HE) and the allelic richness (AR) on a standardized sample of 196 gene copies. We tested if the inbreeding coefficient (FIS) deviated significantly from zero using 10,000 permutations. We also estimated global F statistics and pairwise FST values (Weir and Cockerham, 1984) between populations and tested for significance using 10,000 permutations. GenAlEx 6.5 (Peakall and Smouse, 2012) was used to assess the number of private alleles per population.
For the SNP markers, we used all 154,437 polymorphic SNPs to characterise the genetic diversity and pairwise population differentiation. VCFtools was used to report the number of polymorphic sites and the number of private alleles per population. Using the package hierfstat in R, we estimated HE and the pair-wise FST values (Weir and Cockerham, 1984) and 95% bootstrap confidence intervals to assess if pair-wise population differentiation based on SNPs was significantly different from zero.
The fine-scale spatial genetic structure (FSGS) was assessed in each of the stands using SPAGeDi. We defined seven distance classes (20 m, 40 m, 80 m, 160 m, 320 m, 640 m, 1400 m) and calculated the Loiselle kinship coefficient Fij (Loiselle et al., 1995) for all pairs of samples per distance class, using SSR and SNP markers (vcf file comprising 51,485 SNPs polymorphic in all stands). The average pair-wise kinship coefficients per distance class were regressed on the logarithm of pair-wise spatial distances. To determine the significance of the observed FSGS, the regression slope (b) was tested against a distribution after 1,000 permutations of individual locations. SPAGeDi’s jackknife confidence intervals (CIs) of b were used to compare the FSGS between populations. The Sp-value , indicating the strength of the FSGS, was estimated using the formula Sp=-b/(1-F1) (Vekemans and Hardy, 2004) where b is the regression slope and F1 refers to the average kinship coefficient in the first distance class (20 m) (Vekemans and Hardy, 2004).
Spring phenology
Bud burst data from Ciocîrlan et al. (2024) was reanalyzed with a specific focus on the duration and synchrony of spring phenology. Briefly, bud burst stages were scored using the "Biologische Bundesanstalt, Bundessortenamt und Chemische Industrie" (BBCH) scale to classify leaf development (Vitasse et al., 2009). The four stages were: 0: at least 50 % of the buds in dormancy, 1: at least 50 % of the buds swollen and the scales separated, 2: at least 50 % of the buds broken and leaves started emerging, 3: at least 50 % of the leaves fully developed and petioles extended (further information and the visualization of the stages can be found in Ciocîrlan et al., 2024). Bud burst was recorded twice per week, from March to June 2021 and 2022, and the visual assessment was done as described by Ciocîrlan et al. (2024). The mean values of the bud burst stage as well as the standard errors were calculated for each population and the corresponding days of the year (DOY) using summarySE from “Rmisc” package in R. The visualization of the bud burst progress in spring 2021 and 2022 was conducted in R using the “ggplot2” package. The bud burst duration for each individual was calculated as the number of days between the start of stage 1 and the end of stage 3 and plotted in box plots for each population. For this analysis, stage 1 was used as the initiation of the bud burst and stage 3 as the final stage, where all the leaves were fully developed, for each individual separately. Moreover, the overall duration of bud burst for each population was calculated by determining the DOY when the first individual started stage 1 and when the last individual concluded stage 3.