Genetic diversity and structure based on SSR
SSRs were amplified at five loci in 288 A. macrostemon individuals from 24 populations and used to estimate genetic diversity. The mean expected heterozygosity (He), observed heterozygosity (Ho), effective number of alleles (Ne), observed number of alleles (Na), Shannon information index (I) and percentage of polymorphic loci (PPL) were 0.498, 0.808, 2.357, 3.008, 0.871 and 80.83%, respectively (Table 1). The BXS, SYS, THS, JAS and QDS populations had high levels of genetic diversity, while the HDS, NXS and JGX populations had low levels (Table 1). All five SSR loci were highly polymorphic at the species level; genetic diversity was highest at locus ACE039 and lowest at locus ACM096 (Table S6). Analysis of molecular variance (AMOVA) based on the SSR markers showed that genetic variation in A. macrostemon occurred mainly within populations, which accounted for 76% of the total variation (Table 2).
Table 1 Genetic diversity of 24 A. macrostemon populations based on SSR markers
Population | He | Ho | Ne | Na | I | PPL (%)
BXS | 0.576 | 0.800 | 3.417 | 5.000 | 1.204 | 80.00
CZS | 0.459 | 0.800 | 1.928 | 2.800 | 0.762 | 80.00
DPS | 0.551 | 1.000 | 2.272 | 2.600 | 0.846 | 100.00
FZS | 0.493 | 0.800 | 2.148 | 3.000 | 0.858 | 80.00
GLS | 0.378 | 0.600 | 1.725 | 2.200 | 0.660 | 60.00
HDS | 0.339 | 0.600 | 1.398 | 1.600 | 0.530 | 60.00
HSS | 0.534 | 0.800 | 2.822 | 3.600 | 1.000 | 80.00
JAS | 0.716 | 1.000 | 3.598 | 4.800 | 1.389 | 100.00
JGX | 0.358 | 0.600 | 1.548 | 2.000 | 0.602 | 60.00
JLQ | 0.494 | 0.800 | 2.244 | 3.000 | 0.871 | 80.00
NJS | 0.549 | 1.000 | 2.389 | 2.600 | 0.846 | 100.00
NMG | 0.533 | 0.800 | 2.672 | 4.000 | 1.022 | 80.00
NXS | 0.200 | 0.400 | 0.800 | 0.800 | 0.277 | 40.00
QDS | 0.566 | 1.000 | 2.325 | 3.200 | 0.916 | 100.00
SHS | 0.536 | 0.800 | 2.850 | 3.600 | 1.013 | 80.00
SMX | 0.507 | 0.800 | 2.719 | 3.200 | 0.920 | 80.00
SYS | 0.695 | 1.000 | 4.052 | 6.000 | 1.422 | 100.00
THS | 0.604 | 1.000 | 2.811 | 3.000 | 1.003 | 100.00
TSS | 0.367 | 0.600 | 1.630 | 2.000 | 0.623 | 60.00
WSX | 0.522 | 0.800 | 2.546 | 3.400 | 0.955 | 80.00
XAS | 0.520 | 0.800 | 2.625 | 3.000 | 0.930 | 80.00
XWX | 0.569 | 1.000 | 2.375 | 2.800 | 0.910 | 100.00
YCS | 0.463 | 0.800 | 1.992 | 2.200 | 0.738 | 80.00
ZTS | 0.415 | 0.800 | 1.670 | 1.800 | 0.602 | 80.00
Mean | 0.498 | 0.808 | 2.357 | 3.008 | 0.871 | 80.83
Note: Pop: population name; He: expected heterozygosity; Ho: observed heterozygosity; Ne: effective number of alleles; Na: observed number of alleles; I: Shannon information index; PPL: percentage of polymorphic loci.
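The per-locus indices in Table 1 can be illustrated with a minimal sketch that uses the standard textbook definitions (He = 1 - Σp², Ne = 1/Σp², I = -Σp ln p). The function name and the allele counts below are hypothetical, and the software used for Table 1 may apply sample-size corrections that this sketch omits.

```python
import math


def diversity_indices(allele_counts):
    """Return (Na, Ne, He, I) for one locus from a dict of allele -> count.

    Uncorrected textbook definitions; an illustrative assumption, not
    necessarily the exact estimators behind Table 1.
    """
    total = sum(allele_counts.values())
    freqs = [c / total for c in allele_counts.values() if c > 0]
    homozygosity = sum(p * p for p in freqs)          # sum of squared frequencies
    na = len(freqs)                                   # observed number of alleles
    ne = 1.0 / homozygosity                           # effective number of alleles
    he = 1.0 - homozygosity                           # expected heterozygosity
    shannon = -sum(p * math.log(p) for p in freqs)    # Shannon information index
    return na, ne, he, shannon


# Hypothetical allele counts at a single SSR locus in one population.
na, ne, he, shannon = diversity_indices({"A1": 12, "A2": 8, "A3": 4})
print(na, round(ne, 3), round(he, 3), round(shannon, 3))
```

Population-level values such as those in Table 1 would then be averages of these per-locus values over the five loci.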
Table 2 AMOVA of A. macrostemon populations based on SSR markers, cpDNA and nrDNA sequences
Source of variation | d.f. | Sum of squares | Variance components | Percentage of variation (%) | Fixation index (Fst)
SSR | | | | |
Among populations | 23 | 311.696 | 13.552 | 24 |
Within populations | 552 | 880.667 | 1.595 | 76 | 0.238
Total | 575 | 1192.363 | 15.147 | 100 |
cpDNA | | | | |
Among populations | 49 | 1463.228 | 2.58579 | 93.45 |
Within populations | 525 | 95.044 | 0.18138 | 6.55 | 0.93445
Total | 574 | 1558.272 | 2.76717 | 100 |
nrDNA | | | | |
Among populations | 49 | 16389.947 | 28.63321 | 94.06 | 0.94058
Within populations | 532 | 960.464 | 1.80878 | 5.94 |
Total | 581 | 17350.411 | 30.442 | 100 |
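In a two-level AMOVA, the percentage of variation and the fixation index follow directly from the variance components: Fst is the among-population component divided by the total. The sketch below applies that relation to the cpDNA and nrDNA variance components listed in Table 2; the function name is hypothetical, and this is only an arithmetic illustration, not the AMOVA procedure itself.

```python
def amova_partition(var_among, var_within):
    """Split total variance into percentages and compute Fst (two-level AMOVA)."""
    total = var_among + var_within
    fst = var_among / total
    return {
        "pct_among": 100.0 * fst,
        "pct_within": 100.0 * var_within / total,
        "fst": fst,
    }


# Variance components taken from the cpDNA and nrDNA rows of Table 2.
cp = amova_partition(2.58579, 0.18138)
nr = amova_partition(28.63321, 1.80878)
print(round(cp["pct_among"], 2), round(cp["fst"], 5))
print(round(nr["pct_among"], 2), round(nr["fst"], 5))
```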
The Mantel test showed no significant correlation between geographic distance and genetic distance among A. macrostemon populations (r = 0.0714) (Fig. S1), indicating that geographic distance was not the main factor driving genetic differentiation in A. macrostemon. Principal coordinate analysis showed that individuals from the same population clustered together, and only a few individuals from the ZTS, SHS, QDS and SYS populations overlapped with individuals from other populations (Fig. S2). In the STRUCTURE analysis, at the optimal K value the 24 A. macrostemon populations were divided into three groups: Group A (northern group), Group B (central-southeastern group) and Group C (southwestern group). Some individuals in the 24 populations showed varying degrees of admixture, indicating some gene exchange between populations (Fig. 2A, 2B). Based on Nei's genetic distance and geographic distribution, the 24 populations could likewise be divided into three groups: northern, southwestern and central-southeastern (Fig. 2C, 2D). The groupings from the neighbor-joining cluster analysis were consistent with those from the STRUCTURE analysis.
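For readers unfamiliar with the Mantel test, the following is a minimal permutation-based sketch: it correlates the upper triangles of two distance matrices and estimates a one-sided p-value by jointly permuting the rows and columns of one matrix. The function names and the toy matrices are hypothetical, and the number of permutations and the one-sided test are assumptions, not the settings used in the study.

```python
import math
import random


def upper_triangle(matrix, order):
    """Flatten the upper triangle of a square matrix after reordering rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(geo, gen, permutations=999, seed=0):
    """Return (r, one-sided p) for the correlation between two distance matrices."""
    identity = list(range(len(geo)))
    geo_vec = upper_triangle(geo, identity)
    r_obs = pearson(geo_vec, upper_triangle(gen, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute population labels of the genetic matrix
        if pearson(geo_vec, upper_triangle(gen, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: five populations placed along a line.
points = [0, 1, 2, 3, 4]
geo = [[abs(a - b) for b in points] for a in points]
gen = [[abs(a - b) ** 0.5 for b in points] for a in points]
r, p = mantel(geo, gen)
print(round(r, 3), p)
```

With real data, `geo` would hold pairwise geographic distances and `gen` the pairwise genetic distances between populations.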
Genetic diversity and genetic structure based on cpDNA and nrDNA sequences
By concatenating alignments of three cpDNA regions (psbA-trnH, 539 bp; rps16, 739 bp; trnL-F, 652 bp), a total cpDNA alignment of 1930 bp was obtained from 574 individuals; it contained 66 variable sites and had a G+C content of 32.99%. A total of 42 chloroplast haplotypes (H1-H42) were identified (Fig. 3B; Table S7). Haplotype H1 was the most common (144 individuals), had the widest distribution (14 populations) and was the oldest haplotype (Table S7, Fig. 3C). Multiple chloroplast haplotypes were found in 12 populations, and the remaining 38 populations were monomorphic. At the species level, haplotype diversity and nucleotide diversity were high (Hd = 0.904, π = 2.08 × 10⁻³). The total genetic diversity (HT) of the chloroplast regions of A. macrostemon was 0.860, and the average within-population genetic diversity (HS) was 0.121. At the population level, some populations in the eastern and northeastern regions, such as DPS, JHS, HCS, BXS, SHS, SMX and XYS, had higher haplotype diversity and nucleotide diversity. Haplotype diversity (Hd) was highest in SHS (0.758), and nucleotide diversity (π) was highest in JHS (1.700 × 10⁻³). Based on the cpDNA haplotype data (Fig. S4), we estimated the divergence time between A. macrostemon and the outgroup. A. macrostemon separated from the outgroup in the late Pliocene (about 3.295 Ma), indicating that the species originated well before the Quaternary. Divergence times within A. macrostemon ranged from 0.100 Ma to 3.295 Ma (Fig. S4), a long span from the late Pliocene of the Tertiary to the late Pleistocene of the Quaternary, during which the Quaternary climate oscillations and the Last Glacial Maximum occurred.
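As an illustration of the two summary statistics reported above, the sketch below computes haplotype diversity (Nei's Hd = n/(n-1) × (1 - Σp²)) and nucleotide diversity (π, the mean number of pairwise differences per site) for a set of aligned sequences. The function names and the four toy sequences are hypothetical; gaps and missing data, which real alignments contain, are not handled. The region lengths are those given in the text.

```python
from collections import Counter
from itertools import combinations

# Region lengths from the text; their sum is the concatenated alignment length.
region_lengths = {"psbA-trnH": 539, "rps16": 739, "trnL-F": 652}
total_length = sum(region_lengths.values())


def haplotype_diversity(seqs):
    """Nei's haplotype diversity with the n/(n-1) sample-size correction."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site across all sequence pairs."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs) / length


# Hypothetical toy alignment of four sequences, four sites each.
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(total_length, round(hd, 4), round(pi, 4))
```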
The 633 bp nrDNA ITS alignment of A. macrostemon was obtained from 581 individuals; it contained 391 variable sites and had a G+C content of 50.43%. These polymorphic sites defined 65 haplotypes (H1-H65) (Fig. 3F). Among these, haplotype H7 had the highest frequency (96 individuals) and the widest distribution. H7 was also the core haplotype at the center of the ITS network and was presumed to be the oldest haplotype (Table S7, Fig. 3G). Seven populations (TSS, XAS, YCS, SMX, NXS, HCS and SYS) contained more than three haplotypes, and 34 populations had only one haplotype. The cpDNA analysis showed no haplotype sharing among the different groups (northern, southwestern and central-southeastern); haplotypes were shared only among populations of the same group (Table S7, Fig. 3D). In contrast, the nrDNA data showed haplotype sharing among different groups. For example, the haplotype distributions of populations BXS and HCS showed mixing between the northern and central-southeastern groups (Fig. 3F and Fig. 3H), indicating gene flow between groups. These findings suggest that different populations in the same area often experienced genetic exchange at their cpDNA and ITS loci. Compared with the chloroplast sequences, the ribosomal sequences showed higher haplotype diversity and nucleotide diversity at the species level (Hd = 0.957, π = 9.162 × 10⁻²) (Table S7). In contrast to the cpDNA results, populations in southwestern China showed high genetic diversity.
The analysis of molecular variance (AMOVA) based on cpDNA and ITS sequence data further revealed the genetic structure of A. macrostemon. For cpDNA, variation among populations (93.45%) was much higher than variation within populations (6.55%), and the FST value of 0.93445 was significant. The ITS results were similar, with variation occurring mainly among populations and an FST value of 0.94058 (Table 2). Nst was not significantly greater than Gst for either marker (cpDNA: Nst = 0.930, Gst = 0.859, p > 0.05; nrDNA: Nst = 0.937, Gst = 0.808, p > 0.05), indicating that A. macrostemon had no significant phylogeographic structure.
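The cpDNA Gst value can be related to the diversity estimates reported earlier through the usual definition Gst = (HT - HS) / HT. The short sketch below, with a hypothetical function name, plugs in the cpDNA values HT = 0.860 and HS = 0.121 from the text; it is an arithmetic check only and does not reproduce the permutation test that compares Nst with Gst.

```python
def gst(total_diversity, within_diversity):
    """Coefficient of differentiation: share of total diversity found among populations."""
    return (total_diversity - within_diversity) / total_diversity


# cpDNA values reported in the text: HT = 0.860, HS = 0.121.
cp_gst = gst(0.860, 0.121)
print(round(cp_gst, 3))
```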
Phylogenetic tree analysis based on GBS
GBS analysis was performed on 13 individuals from 13 populations representing the entire geographic distribution of A. macrostemon, with A. chinense as the outgroup. In the resulting phylogenetic trees, A. macrostemon individuals clustered into three groups: northern, southwestern and central-southeastern. These results were similar to those based on SSR, cpDNA and ITS sequences, further supporting the reliability of the results based on fragment sequencing (Fig. 4A, 4B).
Inference of demographic history
In the neutrality tests based on cpDNA and nrDNA sequences, Tajima's D values for the overall population were negative and nonsignificant: -1.42056 (p > 0.10) for the chloroplast sequences and -0.71303 (p > 0.10) for the nuclear sequences. Fu's Fs was -6.394 for the chloroplast sequences and 37.290 for the nuclear sequences. The mismatch distribution analysis produced multimodal curves, and the observed values deviated from the expected values (Fig. S3). This is inconsistent with the population expansion model, indicating that A. macrostemon did not experience significant population expansion but rather remained in dynamic equilibrium.
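To make the Tajima's D statistic concrete, the sketch below computes it from three summary values using Tajima's (1989) formula: the number of sequences n, the number of segregating sites S, and the mean number of pairwise differences. The function name and the input values are hypothetical (they are not the study's data), and significance testing is not included.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size, segregating sites and mean pairwise differences."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    numerator = mean_pairwise_diff - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return numerator / math.sqrt(variance)


# Hypothetical inputs: 10 sequences, 16 segregating sites, 3.888889 mean pairwise differences.
d = tajimas_d(10, 16, 3.888889)
print(round(d, 3))
```

A negative D, as in this toy case, reflects an excess of low-frequency variants relative to the neutral equilibrium expectation; whether it is significant has to be assessed separately.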
Analysis of suitable establishment areas for A. macrostemon
MaxEnt was used to predict the potential distribution of A. macrostemon in China. The model's predictive accuracy was very high (AUC = 0.983), so it can be used to characterize migration routes and distribution changes during the Quaternary glacial period. The results show that during the Last Glacial Maximum (LGM), global cooling led to an obvious contraction and southward shift of the highly suitable area for A. macrostemon. The Middle Holocene climate was warm and humid, similar to the modern climate; the distribution of A. macrostemon expanded markedly in this period, and the predicted range was similar to the present-day range. The distribution area is predicted to expand slightly in the future and reach its largest extent (Fig. 5). Among the ecological factors, warm-season precipitation had the highest contribution rate (41.2%). The mean temperature of the coldest season (16.1%) and the coefficient of variation of precipitation (13.7%) also contributed substantially, indicating that temperature, precipitation and seasonality strongly influence the distribution of A. macrostemon (Table S8).
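The AUC reported above can be read as the probability that a randomly chosen presence site receives a higher suitability score than a randomly chosen background site. The sketch below computes that rank-based quantity directly; the function name and the scores are hypothetical, and this is not MaxEnt's own implementation.

```python
def auc(presence_scores, background_scores):
    """Probability that a presence score exceeds a background score (ties count 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical suitability scores at presence and background sites.
example_auc = auc([0.9, 0.4], [0.5, 0.1])
print(example_auc)
```

A value of 0.5 corresponds to a model no better than random ranking, and values approaching 1 indicate that presence sites are almost always ranked above background sites.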
Across the periods examined, the suitable area of A. macrostemon first decreased and then increased. In the future, the total suitable area is predicted to reach its maximum, and the distribution center may shift northward.