Genetic diversity and structure of Spondias mombin populations
Population structure and genetic diversity were analyzed using 2,003 SNP loci from 41 individuals of yellow mombin. The DAPC was performed with 15 principal components, which explained 77% of the total variation, and two discriminant functions. K-means clustering indicated an optimal number of three groups (Fig. 2a and 2b). Group G1 (red) comprises 25 individuals: those from the populations of Paudalho-PE (PD), São Lourenço da Mata-PE (SM), Areia-PB (AR), and Mata de São João-BA (MS), plus one accession from Chapadinha-MA (CP) and one from Presidente Figueiredo-AM (PF). Group G2 (green) comprises 14 individuals collected in the State of Amazonas, from the Iranduba (IR), Silves (SV), Presidente Figueiredo (PF), and Novo Airão (NV) populations. Group G3 (blue) comprises two of the three individuals from the Chapadinha-MA population. With K = 2, the DAPC placed the Chapadinha-MA population with the other northeastern populations in group G1 (Suppl. Fig. S1).
Groups G1 and G2 are genetically closer to each other than either is to G3, which was isolated from both even though its individuals come from a Northeast Region population. With a single discriminant function, the grouping can be visualized in one dimension (Fig. 2b). Groups G1 and G2 overlap, and G2 has the wider distribution, indicating greater genetic variation within G2 than within the other groups. The membership probability analysis, based on the DAPC with the optimal K = 3, recovered the same three groups: G1 (red), represented mostly by populations from the Northeast; G2 (green), by populations from the North; and G3 (blue), by the population from Maranhão (Fig. 3). Individuals separated clearly by macroregion, with two exceptions, both assigned to G1: one accession from the State of Amazonas (PF) and one of the three individuals from the Chapadinha population of Maranhão. Group G2 contained only individuals from the North Region, all from populations collected in the State of Amazonas. Group G3 was isolated from the others and was formed by the remaining two individuals from Chapadinha.
Cluster analysis was performed using a genetic distance tree built with the Neighbor-Joining method (Fig. 4). The tree is unrooted: it was built not for evolutionary inference but to show how the groups defined by the DAPC relate to each other, and it revealed a pattern of grouping by collection site. Five distinct groups were formed. Groups I (populations of Bahia, BA), II (populations of Pernambuco, PE), and III (populations of Paraíba, PB), which together correspond to group G1 in the DAPC, were well defined and contained no individuals from other states. Group IV, which corresponds to group G3 in the DAPC, was formed by two of the three individuals from the Maranhão (MA) population and was placed between the Northeast groups (G1) and group V, which was formed by populations from Amazonas (AM) and corresponds to G2. The dendrogram, also constructed with the Neighbor-Joining method and based on Nei’s distance (Nei 1972) (Suppl. Fig. S2), shows where each individual falls relative to the groups identified by the DAPC. In this dendrogram, the third Chapadinha (MA) individual, assigned to group G1 in the DAPC, is genetically very close to the other two Chapadinha individuals, which form group G3.
The pairwise analysis of genetic divergence (Fst) showed high genetic differentiation between all groups: every value exceeded 0.25 on the Fst scale presented by Hartl and Clark (1998). The highest Fst values were found between groups G1 and G3 (0.37) and between G2 and G3 (0.36), and the lowest (0.33) between groups G1 and G2 (Table 2). AMOVA gave similar results for the DAPC groups and for the collected populations, both significant at the 5% level (Table 3). In both scenarios, most of the variation occurred within populations (61.4%) and within groups (59.3%).
Table 2
Pairwise Fst values between the groups identified by the DAPC for Spondias mombin. Group G1 is formed by the populations of Paudalho-Pernambuco, São Lourenço da Mata-Pernambuco, Mata de São João-Bahia, and Areia-Paraíba. Group G2 is composed of the populations of Iranduba, Novo Airão, Presidente Figueiredo, and Silves, all from the State of Amazonas. Group G3 is formed by two individuals from the population of Chapadinha-Maranhão.
Fst | G1 | G2 | G3 |
G1 | 0.00 | | |
G2 | 0.33 | 0.00 | |
G3 | 0.37 | 0.36 | 0.00 |
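The qualitative reading of the values in Table 2 can be sketched in a few lines of Python. The cut-offs below (0.05, 0.15, 0.25) are the bands usually attributed to Wright and reproduced in Hartl and Clark; they are an assumption here, since the text only states the 0.25 threshold, and the function name `classify_fst` is hypothetical.

```python
# Qualitative F_ST bands; the cut-offs are an assumption (the text above only
# states that values greater than 0.25 indicate high differentiation).
def classify_fst(fst):
    """Return a qualitative label for a pairwise F_ST value."""
    if fst <= 0.05:
        return "little"
    if fst <= 0.15:
        return "moderate"
    if fst <= 0.25:
        return "great"
    return "very great"

# Pairwise values transcribed from Table 2.
pairwise_fst = {
    ("G1", "G2"): 0.33,
    ("G1", "G3"): 0.37,
    ("G2", "G3"): 0.36,
}

labels = {pair: classify_fst(value) for pair, value in pairwise_fst.items()}

for (a, b), label in labels.items():
    print(f"{a} vs {b}: Fst = {pairwise_fst[(a, b)]:.2f} ({label})")
```

Under these assumed bands, all three group comparisons fall in the highest category, which matches the statement that every pairwise value exceeds 0.25.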
Table 3
Results generated by AMOVA to identify the sources of genetic variation among and within the groups identified in the DAPC and the collected populations of Spondias mombin.
Source of variation | Degrees of freedom | Sum of squares | Variance components | Percent variation | F statistics |
Among populations | 8 | 12,636.86 | 259.25 | 38.56 | 0.3856* |
Within populations | 32 | 13,215.64 | 412.99 | 61.44 | |
Among DAPC groups | 2 | 2,481.53 | 104.30 | 40.69 | 0.4070* |
Within DAPC groups | 38 | 5,775.89 | 151.99 | 59.31 | |
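As a rough check on how the percentages in Table 3 arise, the sketch below recomputes them from the third numeric column of the table, which is treated here as the variance component of each source (an assumption based on the arithmetic, not a statement from the authors). The function name `percent_variation` is hypothetical.

```python
def percent_variation(among, within):
    """Share of the total variance attributed to each source, in percent."""
    total = among + within
    return 100.0 * among / total, 100.0 * within / total

# (among, within) variance components transcribed from Table 3.
components = {
    "populations": (259.25, 412.99),
    "DAPC groups": (104.30, 151.99),
}

shares = {name: percent_variation(*pair) for name, pair in components.items()}

for name, (among_pct, within_pct) in shares.items():
    print(f"{name}: among = {among_pct:.2f}%, within = {within_pct:.2f}%")
```

The recomputed shares agree with the tabulated percentages to within rounding of the published components. Under the same assumption, the among-source share expressed as a proportion corresponds to the F statistic in the last column.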
The pairwise Fst matrix for all collected populations showed that the greatest genetic distances occur between populations from the Northeast region and those from the North region (Table 4; Fig. 5). Among the Northeastern populations, São Lourenço da Mata (SM) and Paudalho (PD), both from the State of Pernambuco, had the lowest Fst value (0.06). The highest Fst value within the Northeast region (0.21) was obtained between the populations of Areia (AR), Paraíba, and Mata de São João (MS), Bahia, a genetic divergence considered high (Hartl and Clark 1998). Among the North region populations, the lowest Fst value (0.05) was obtained between Silves (SV) and Presidente Figueiredo (PF), indicating low genetic divergence. The greatest divergence among the Amazonian populations was obtained between Iranduba (IR) and Novo Airão (NV), with an Fst value of 0.22, which also indicates high genetic divergence.
The geographic distances between populations, calculated from their geographic coordinates, ranged from 21.4 km to 2,934.4 km (Table 4). The Mantel test identified a positive and significant correlation (r = 0.78; p-value < 0.001) between the geographic distances among populations and their genetic divergence (Fst), indicating isolation by distance among the yellow mombin populations.
Table 4
Pairwise geographic distances between the populations, in kilometers (above the diagonal), and pairwise Fst values between the populations (below the diagonal). The populations are represented by the abbreviations AR (Areia-Paraíba), CP (Chapadinha-Maranhão), IR (Iranduba-Amazonas), MS (Mata de São João-Bahia), NV (Novo Airão-Amazonas), PD (Paudalho-Pernambuco), PF (Presidente Figueiredo-Amazonas), SM (São Lourenço da Mata-Pernambuco), and SV (Silves-Amazonas).
Populations | AR | CP | IR | MS | NV | PD | PF | SM | SV |
AR | | 919.04 | 2,741.17 | 656.84 | 2,837.75 | 117.73 | 2,749.53 | 137.37 | 2,553.47 |
CP | 0.46 | | 1,869.07 | 1,133.84 | 1,957.92 | 1,016.66 | 1,861.16 | 1,037.87 | 1,672.22 |
IR | 0.51 | 0.62 | | 2,647.72 | 112.19 | 2,816.34 | 137.16 | 2,835.82 | 205.18 |
MS | 0.21 | 0.44 | 0.50 | | 2,755.27 | 594.25 | 2,689.56 | 593.78 | 2,487.38 |
NV | 0.47 | 0.49 | 0.23 | 0.44 | | 2,914.74 | 120.44 | 2,934.39 | 285.75 |
PD | 0.16 | 0.43 | 0.49 | 0.15 | 0.45 | | 2,829.20 | 21.36 | 2,631.60 |
PF | 0.41 | 0.44 | 0.19 | 0.39 | 0.16 | 0.38 | | 2,849.09 | 202.19 |
SM | 0.17 | 0.43 | 0.49 | 0.15 | 0.45 | 0.06 | 0.39 | | 2,651.36 |
SV | 0.39 | 0.43 | 0.21 | 0.37 | 0.17 | 0.36 | 0.08 | 0.37 | |
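The logic of a Mantel test on the two halves of Table 4 can be illustrated with a short permutation sketch. It is a minimal pure-Python version, assuming a Pearson correlation over the 36 population pairs and a one-sided test with 999 label permutations; the software and settings behind the reported r and p-value are not given in this section, so the output is not expected to reproduce them exactly. The helper names (`to_symmetric`, `pearson`, `mantel`) are hypothetical.

```python
import random
from math import sqrt

POPS = ["AR", "CP", "IR", "MS", "NV", "PD", "PF", "SM", "SV"]

# Geographic distances (km) from above the diagonal of Table 4:
# row i holds the distances from POPS[i] to POPS[i+1:].
GEO_UPPER = [
    [919.04, 2741.17, 656.84, 2837.75, 117.73, 2749.53, 137.37, 2553.47],
    [1869.07, 1133.84, 1957.92, 1016.66, 1861.16, 1037.87, 1672.22],
    [2647.72, 112.19, 2816.34, 137.16, 2835.82, 205.18],
    [2755.27, 594.25, 2689.56, 593.78, 2487.38],
    [2914.74, 120.44, 2934.39, 285.75],
    [2829.20, 21.36, 2631.60],
    [2849.09, 202.19],
    [2651.36],
]

# F_ST values from below the diagonal of Table 4:
# row i holds the values between POPS[i+1] and POPS[:i+1].
FST_LOWER = [
    [0.46],
    [0.51, 0.62],
    [0.21, 0.44, 0.50],
    [0.47, 0.49, 0.23, 0.44],
    [0.16, 0.43, 0.49, 0.15, 0.45],
    [0.41, 0.44, 0.19, 0.39, 0.16, 0.38],
    [0.17, 0.43, 0.49, 0.15, 0.45, 0.06, 0.39],
    [0.39, 0.43, 0.21, 0.37, 0.17, 0.36, 0.08, 0.37],
]

def to_symmetric(rows, upper):
    """Expand a triangular listing into a full symmetric matrix."""
    n = len(rows) + 1
    matrix = [[0.0] * n for _ in range(n)]
    for r, row in enumerate(rows):
        for k, value in enumerate(row):
            i, j = (r, r + 1 + k) if upper else (r + 1, k)
            matrix[i][j] = matrix[j][i] = value
    return matrix

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(a, b, permutations=999, seed=0):
    """One-sided Mantel test: permute the population labels of matrix b."""
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(i)]
    x = [a[i][j] for i, j in pairs]
    r_obs = pearson(x, [b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        r_perm = pearson(x, [b[order[i]][order[j]] for i, j in pairs])
        if r_perm >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

geo = to_symmetric(GEO_UPPER, upper=True)
fst = to_symmetric(FST_LOWER, upper=False)
r_obs, p_value = mantel(geo, fst)
print(f"Mantel r = {r_obs:.2f}, one-sided p = {p_value:.3f}")
```

Permuting whole population labels, rather than shuffling individual matrix cells, keeps the dependence among the pairwise values that share a population, which is why the test is built this way.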
Two scenarios were considered to obtain the basic diversity parameters: the first used the groups identified by the DAPC, and the second used the collection sites to define populations (Table 5). Group G2 had the highest number of alleles (A = 3,547) and the greatest genetic diversity (He = 0.2031). Relative to G2, He was 37% lower in group G1 and 70% lower in group G3, and G2 had 649 more alleles than G1 and 1,307 more than G3. This result agrees with the DAPC density analysis, in which G2 showed the widest distribution, followed by G1. Group G3 had the lowest number of alleles (A = 2,240) and low genetic diversity (He = 0.0594). However, G2 also had the highest inbreeding coefficient (f = 0.2280), almost double that of G1. Group G3 had a negative value (f = -0.8529).
Table 5
Results of the genetic diversity analysis for the groups identified by the DAPC and for the populations of yellow mombin (Spondias mombin). The number of individuals (N), the total number of alleles (A), the observed heterozygosity (Ho), the expected heterozygosity (He), and the inbreeding coefficient (f) are given for the DAPC groups and for the populations PD (Paudalho-Pernambuco), SM (São Lourenço da Mata-Pernambuco), MS (Mata de São João-Bahia), AR (Areia-Paraíba), and CP (Chapadinha-Maranhão), from the Northeast region, and NV (Novo Airão-Amazonas), IR (Iranduba-Amazonas), PF (Presidente Figueiredo-Amazonas), and SV (Silves-Amazonas), from the North region.
Groups | N | A | Ho | He | f |
G1 | 25 | 2,898 | 0.1128 | 0.1279 | 0.1187 |
G2 | 14 | 3,547 | 0.1568 | 0.2031 | 0.2280 |
G3 | 2 | 2,240 | 0.1101 | 0.0594 | -0.8529 |
Average | - | 2,895 | 0.1266 | 0.1301 | -0.5062 |
Populations | N | A | Ho | He | f |
PD | 6 | 2,624 | 0.1205 | 0.1184 | -0.0177 |
SM | 6 | 2,624 | 0.1254 | 0.1178 | -0.0649 |
AR | 6 | 2,492 | 0.0833 | 0.0990 | 0.1581 |
MS | 5 | 2,562 | 0.1213 | 0.1170 | -0.0371 |
CP | 3 | 2,244 | 0.1100 | 0.0606 | -0.8146 |
Average (NE) | - | 2,509 | 0.1121 | 0.1059 | -0.0587 |
IR | 2 | 2,447 | 0.1790 | 0.1094 | -0.6362 |
SV | 3 | 2,778 | 0.1873 | 0.2092 | 0.1045 |
NV | 5 | 2,806 | 0.1509 | 0.1650 | 0.0855 |
PF | 5 | 2,856 | 0.1505 | 0.1986 | 0.2421 |
Average (N) | - | 2,722 | 0.1669 | 0.1860 | 0.1025 |
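The relationship between the heterozygosity columns and the inbreeding coefficient in Table 5 can be illustrated with a small sketch. It assumes the common estimator f = 1 - Ho/He, which is not stated in this section but approximately reproduces the tabulated group values; small differences are expected because the published Ho and He are rounded. It also shows that the diversity percentages quoted for the groups are obtained when the difference in He is expressed relative to G2. The function names are hypothetical.

```python
def inbreeding_coefficient(ho, he):
    """Assumed estimator: f = 1 - Ho / He (negative when Ho exceeds He)."""
    return 1.0 - ho / he

def percent_lower(reference, other):
    """How much lower `other` is than `reference`, as a percentage of `reference`."""
    return 100.0 * (reference - other) / reference

# (Ho, He) for the DAPC groups, transcribed from Table 5.
groups = {
    "G1": (0.1128, 0.1279),
    "G2": (0.1568, 0.2031),
    "G3": (0.1101, 0.0594),
}

f_values = {name: inbreeding_coefficient(ho, he) for name, (ho, he) in groups.items()}
he_g2 = groups["G2"][1]
he_gap = {name: percent_lower(he_g2, he) for name, (_, he) in groups.items() if name != "G2"}

for name, f in f_values.items():
    print(f"{name}: f = {f:.4f}")
for name, gap in he_gap.items():
    print(f"He in {name} is {gap:.1f}% lower than in G2")
```

A negative f, as in G3, simply reflects observed heterozygosity above the expected value; with only two individuals in that group, the estimate rests on very few genotypes.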
The genetic diversity analysis by population gave results close to those obtained with the DAPC groups. Populations from the North region had, on average, higher genetic diversity (A = 2,722; He = 0.1860) than populations from the Northeast region (A = 2,509; He = 0.1056) (Table 5). The Iranduba population diverged from the other populations of Amazonas State in number of alleles, genetic diversity, and inbreeding coefficient, with observed heterozygosity higher than expected heterozygosity. The Northeast Region populations had similar values for the diversity parameters, except for Chapadinha, which had the lowest number of alleles and the lowest He among all the evaluated populations. Among the Northeast populations, the inbreeding coefficient was positive only in Areia (0.1581) and negative in the others.