a) Population Genetics
Detected clusters: One microsatellite locus (GA-I26) was monomorphic in all analyzed specimens and was therefore excluded from further analyses. The remaining 13 loci were all polymorphic, with 4 (GT-I13B; GA-I5B) to 42 (GT-I34) alleles. All analyzed populations were in Hardy-Weinberg equilibrium at all analyzed loci (see Supplementary Table 3).
Applying the hypothesis-free STRUCTURE approach to delimit clusters of genetically related individuals revealed profound population structure and yielded the highest support (ΔK) for k = 6 clusters (Fig. 1). These clusters were (1) populations west of Alabama, (2) a population adjacent to Alabama in the east (WF1), (3) populations along the west coast of Florida, (4) populations along the east coast of Florida (including two Florida inland populations), (5) inland populations of northern Florida, and (6) South Carolina. Most individuals were assigned to their respective clusters with high probabilities; only the individuals of EF1 and EF2 were ambiguous in their assignment, being mostly intermediate between the two adjacent clusters (Green and light-Blue). In the westernmost sampled population (LPB), we also observed some ambiguity in the assignment, which may indicate an influx from further (unsampled) populations along the coast of Mexico. Comparing STRUCTURE assignment to geographic location (Fig. 1) not only visualizes shifts in genetic population structure but also indicates some (albeit limited) genetic mixing (1) from WF1 (Red cluster) to adjacent populations to the east and (2) from South Carolina (SC1; Green cluster) to the east coast of Florida. This interpretation is supported by the elevated genetic diversity in these populations (Supplementary Table 3). Running STRUCTURE with higher values of k (up to 10) tentatively sets apart four further clusters, i.e., LPB, MIS, CF2, and CF1/CF4.
When populations are assigned to groups according to their prevalent STRUCTURE assignment (k = 6), an analysis of molecular variance (AMOVA) attributes a significant 16.90% of the variation to divergence among these groups (FCT = 0.169; p < 0.001) and 9.08% to divergence among populations within groups (FSC = 0.109; p < 0.001), while 74.03% of the variation is found within populations (Table 1).
Table 1
AMOVA showing the partition of genetic variation. Groups consist of populations pooled according to STRUCTURE assignment. Significance level is based on 20,000 permutations.
Source of variation | df | Sum of squares | Variance components | % variation | Fixation indices | p |
Among groups | 5 | 268.889 | 0.8116 | 16.90 | FCT = 0.169 | < 0.001 |
Among populations within groups | 12 | 138.482 | 0.436 | 9.08 | FSC = 0.109 | < 0.001 |
Within populations | 318 | 1129.801 | 3.55283 | 74.03 | | |
Total | 335 | 1537.173 | 4.79948 | | FST = 0.259 | < 0.001 |
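The fixation indices in Table 1 can be cross-checked from the variance components. The sketch below assumes the standard three-level AMOVA definitions (FCT = Va/Vt, FSC = Vb/(Vb + Vc), FST = (Va + Vb)/Vt) and takes the within-population component as 3.55283, the scale implied by its reported 74.03% share. The function and variable names are illustrative and not taken from any AMOVA software.

```python
def amova_fixation_indices(var_among_groups, var_among_pops, var_within_pops):
    """Fixation indices and percentage shares from three-level AMOVA variance components."""
    total = var_among_groups + var_among_pops + var_within_pops
    return {
        # Share of total variance due to differences among groups
        "FCT": var_among_groups / total,
        # Differentiation among populations relative to variance inside groups
        "FSC": var_among_pops / (var_among_pops + var_within_pops),
        # Overall differentiation among populations
        "FST": (var_among_groups + var_among_pops) / total,
        "pct_among_groups": 100.0 * var_among_groups / total,
        "pct_among_pops_within_groups": 100.0 * var_among_pops / total,
        "pct_within_pops": 100.0 * var_within_pops / total,
    }


# Variance components as read from Table 1 (within-population value is an assumption, see above)
indices = amova_fixation_indices(0.8116, 0.436, 3.55283)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

With these inputs, FCT and FSC round to the tabulated 0.169 and 0.109, and FST comes out close to the tabulated 0.259; small differences are expected because the components themselves are rounded.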
Genetic diversity (both heterozygosity and allelic richness) was generally lower in populations from the Gulf of Mexico (Orange cluster west of Alabama, except for the diverse population in Brownsville, TX [LPB]) and on the Atlantic coast of South Carolina (Yellow cluster), while diversity was considerably higher around Florida, in both coastal and inland habitats (Fig. 1). At several loci, we found shifts in allele sizes between the more western populations (Orange cluster) and the rest. Two loci exhibited a specific allele (GA-I5B) or increased variability (GT-II33) in the populations directly east of Alabama (Red cluster, i.e., WF1 and WF2).
In pairwise comparisons, most populations were significantly diverged from one another, as indicated by significant fixation coefficients (G’st and Fst; Fig. 2). Particularly high divergence among geographically adjacent populations was found across Alabama (i.e., between MIS to the west and WF1 to the east, G’st = 0.628) and between South Carolina (Yellow cluster) and the geographically closest populations of Central Florida (Blue cluster, G’st = 0.623). There was little or no genetic differentiation among most populations west of Alabama (Orange cluster; NET, CAM, LA1, LA2). Furthermore, one Central Florida population (CF3) was not genetically differentiated from proximate populations on either the east or the west coast of Florida.
Comparing geographic and genetic (G’st) distance over all pairs of populations with a Mantel test revealed highly significant isolation by distance (IBD; r = 0.583; p < 0.001), accounting for 46.7% of the genetic variation found (Fig. 4). A similar IBD pattern was observed in the reduced sample that excluded populations from central and eastern Florida to account for continental barriers in IBD inference (see Methods; r = 0.581; p = 0.002). The sPCA revealed a significant global structure, suggesting a correlation between genetic variation and the geographic configuration of populations (p < 0.001). Accordingly, the mapping of scores of the first principal component illustrates a genetic cline from west to east (Fig. 3). In general, the isolation-by-distance results suggest a low capacity for long-distance dispersal in P. latipinna.
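For readers unfamiliar with the Mantel test, the following is a minimal sketch of a one-sided permutation version, assuming Pearson correlation over the upper triangles of two symmetric distance matrices. It is a generic illustration, not the software or settings used for the analysis above; the function name `mantel_test` and the synthetic distance matrices are hypothetical.

```python
import numpy as np


def mantel_test(dist_a, dist_b, n_permutations=999, seed=0):
    """One-sided Mantel test: correlation of two distance matrices with a permutation p-value."""
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("both inputs must be square matrices of the same size")
    n = a.shape[0]
    upper = np.triu_indices(n, k=1)  # each population pair counted once
    r_obs = np.corrcoef(a[upper], b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    n_as_extreme = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        # Permute rows and columns together so that matrix b stays a valid distance matrix
        b_perm = b[np.ix_(perm, perm)]
        if np.corrcoef(a[upper], b_perm[upper])[0, 1] >= r_obs:
            n_as_extreme += 1
    # The observed arrangement counts as one of the permutations
    p_value = (n_as_extreme + 1) / (n_permutations + 1)
    return r_obs, p_value


# Synthetic example (hypothetical data): 8 sites along a line, 100 km apart
data_rng = np.random.default_rng(1)
positions_km = np.arange(8, dtype=float) * 100.0
geo = np.abs(positions_km[:, None] - positions_km[None, :])
noise = np.triu(data_rng.normal(0.0, 0.01, size=geo.shape), k=1)
gen = 0.0005 * geo + noise + noise.T  # genetic distance rising with geographic distance

r_obs, p_value = mantel_test(geo, gen)
print(f"Mantel r = {r_obs:.3f}, p = {p_value:.3f}")
```

Permuting whole rows and columns, rather than individual matrix entries, respects the non-independence of pairwise distances that share a population.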
Geographical distribution of clusters and paleodrainages: Our paleodrainage reconstruction revealed 13 paleoriver systems in which our 18 sampling sites with genetic data were located (Fig. 1b). Plotting the average genetic clusters by sampling site revealed that most of the genetic clusters spanned multiple paleoriver systems. For instance, the Orange cluster (west of Alabama) was distributed across four different paleoriver systems. In West Florida, two major genetic clusters (Red and Green) spanned three different paleoriver systems. In East Florida and parts of Central Florida, the light-Blue cluster predominated, albeit with a substantial level of admixture with the Green cluster from East Florida, particularly in the more coastal sampling sites. Also in Central Florida, the dark-Blue cluster was dominant and shared between two populations (CF1 and CF4) that belong to two different paleodrainage systems, suggesting fish movement between sites. Finally, the Yellow cluster was dominant in a single paleodrainage system in coastal South Carolina. In general, the spatial distribution of genetic lineages did not appear to be limited by the boundaries of their paleorivers, with multiple instances of genetic clusters spanning several paleodrainages, as well as multiple paleorivers containing more than one genetic cluster.
Migration events: Twenty individuals (11.9%) were identified as first-generation migrants using the Lhome method in GENECLASS2 (Fig. 5). While most migration events occurred within the same genetic cluster, the majority of them (16/20; Fig. 5) occurred between sampling localities in different paleoriver systems. We detected four migration events across cluster borders involving locations on the west and east coasts of Florida (EF3, EF1, WF4). Overall, inferred recent dispersal occurred over medium to long distances, reaching several tens and, in rare cases, a few hundred kilometers.
Male size was normally distributed in 11 out of 14 populations (P ≥ 0.255 in all cases) but deviated from normality in CAM (W15 = 0.735, P < 0.001), CF4 (W20 = 0.824, P = 0.002), and EF2 (W20 = 0.854, P = 0.006). However, we did not detect bimodality in any of these three populations; rather, distributions were right-skewed with a few very large males (Fig. 6).
The environmental GLM on male SL revealed significant effects of both PCs (PC1: F1,187 = 19.306, P < 0.001, ηp2 = 0.094; PC2: F1,187 = 4.093, P = 0.044, ηp2 = 0.021) and of Genetic Cluster (F5,187 = 10.661, P < 0.001, ηp2 = 0.222). The relationship with PC1 was negative, so males were smaller in habitats with higher water temperatures, more dissolved oxygen, and greater salinity, while the relationship with PC2 was positive, indicating that males were larger in habitats with greater pH and turbidity. Regarding genetic clusters, males from Cluster 4 were the largest, Cluster 3 males were the smallest, and males from the remaining four clusters were intermediate. For female SL the results were similar (PC1: F1,131 = 9.339, P = 0.003, ηp2 = 0.067; PC2: F1,131 = 13.617, P < 0.001, ηp2 = 0.094; Genetic Cluster: F4,131 = 3.680, P = 0.007, ηp2 = 0.101). While Cluster 3 had both the smallest males and the smallest females, SL was largest for females from Cluster 5, with the remaining clusters intermediate. Visual examination revealed negative relationships between female SL and both PCs, so that females were smaller with increasing water temperature, dissolved oxygen, salinity, turbidity, and pH (Fig. 7, Supplementary Table 7).
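The effect sizes reported for these univariate models can be recovered from the F statistics and their degrees of freedom. The sketch below assumes the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error); the text does not state how the values were derived, so this is offered as a consistency check rather than a description of the original computation. The function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared for a univariate effect, from its F statistic and degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Effects on male SL as reported above: (label, F, df_effect, df_error)
male_sl_effects = [
    ("PC1", 19.306, 1, 187),
    ("PC2", 4.093, 1, 187),
    ("Genetic Cluster", 10.661, 5, 187),
]

for label, f_value, df_effect, df_error in male_sl_effects:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: partial eta squared = {eta:.3f}")
```

Applied to the male SL model, this reproduces the reported values of 0.094, 0.021, and 0.222.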
Males
The multivariate environmental GLM on the remaining male traits returned significant effects of the covariate SL (F3,184 = 3167.583, P < 0.001), both PCs (PC1: F3,184 = 24.618, P < 0.001; PC2: F3,184 = 12.962, P < 0.001) and Genetic Cluster (F15,508 = 6.419, P < 0.001). However, compared to SL (ηp2 = 0.981), the effects of PC1 (ηp2 = 0.286), PC2 (ηp2 = 0.174) and Genetic Cluster (ηp2 = 0.148) were much weaker. Post-hoc univariate comparisons (Bonferroni-corrected significance at α = 0.017) demonstrated that SL significantly affected male lean mass (F1,186 = 9427.448, P < 0.001, ηp2 = 0.981) and GSI (F1,186 = 38.029, P < 0.001, ηp2 = 0.170), but not fat content (F1,186 = 2.742, P = 0.099, ηp2 = 0.015). While male lean mass increased with increasing SL, GSI decreased with increasing SL, so that larger males had smaller relative testis mass. Genetic Clusters differed significantly in all three male traits (lean mass: F5,186 = 3.362, P = 0.006, ηp2 = 0.083; fat content: F5,186 = 10.299, P < 0.001, ηp2 = 0.217; GSI: F5,186 = 4.680, P < 0.001, ηp2 = 0.112; Supplementary Table 5). Male lean mass was greatest in Clusters 3 and 6 and by far smallest in Cluster 2, while fat content was highest in Cluster 2 and lowest in Cluster 5, and GSI was greatest in Cluster 6 and lowest in Clusters 2, 4 and 5. Both PCs significantly affected lean mass (PC1: F1,186 = 57.894, P < 0.001, ηp2 = 0.237; PC2: F1,186 = 9.051, P = 0.003, ηp2 = 0.046) and fat content (PC1: F1,186 = 8.453, P = 0.004, ηp2 = 0.043; PC2: F1,186 = 30.947, P < 0.001, ηp2 = 0.143) but not GSI (PC1: F1,186 = 1.437, P = 0.232, ηp2 = 0.008; PC2: F1,186 = 0.182, P = 0.670, ηp2 = 0.001). Visual examination revealed that lean mass increased with increasing temperature, oxygen content, and salinity (PC1), but also with increasing turbidity (PC2), while fat content decreased with increasing temperature, oxygen content, and salinity (PC1), but increased with increasing turbidity (PC2; Fig. 6a-d).
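The Bonferroni-corrected threshold used for the post-hoc comparisons can be illustrated with a short sketch. It assumes a family-wise significance level of 0.05 divided by the three male traits tested (lean mass, fat content, GSI); the 0.05 starting level is an assumption, as the text reports only the corrected value. Names in the code are illustrative.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return family_alpha / n_tests


FAMILY_ALPHA = 0.05  # assumed family-wise level, not stated in the text
N_MALE_TRAITS = 3    # lean mass, fat content, GSI

alpha_male = bonferroni_alpha(FAMILY_ALPHA, N_MALE_TRAITS)

# A few exact P values reported above for the male univariate tests
reported_p = {
    "SL -> fat content": 0.099,
    "Genetic Cluster -> lean mass": 0.006,
    "PC2 -> lean mass": 0.003,
    "PC1 -> fat content": 0.004,
}
significant = {effect: p < alpha_male for effect, p in reported_p.items()}

print(f"corrected alpha = {alpha_male:.3f}")
for effect, is_significant in significant.items():
    print(f"{effect}: {'significant' if is_significant else 'not significant'}")
```

Under the same assumption, six traits would give a threshold of 0.05 / 6, which rounds to 0.008.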
Females
The multivariate environmental GLM on the remaining female life-history traits indicated significant effects of the covariates SL (F6,124 = 375.528, P < 0.001) and Embryonic Stage of Development (F6,124 = 9.585, P < 0.001), both PCs (PC1: F6,124 = 47.889, P < 0.001; PC2: F6,124 = 12.941, P < 0.001) and Genetic Cluster (F24,434 = 8.904, P < 0.001). The strongest effects were due to SL (ηp2 = 0.948) and PC1 (ηp2 = 0.699), while the effects of Embryonic Stage of Development (ηp2 = 0.317), PC2 (ηp2 = 0.385) and Genetic Cluster (ηp2 = 0.295) were much weaker. Post-hoc univariate comparisons (Bonferroni-corrected significance at α = 0.008) demonstrated that SL significantly affected female lean mass (F1,129 = 1676.468, P < 0.001, ηp2 = 0.929), fecundity (F1,129 = 172.164, P < 0.001, ηp2 = 0.572), and RA (F1,129 = 11.007, P = 0.001, ηp2 = 0.079; P > 0.104 for all other traits), all three of which increased with increasing female SL (Supplementary Fig. 1, a and b). Embryonic Stage of Development, on the other hand, significantly affected embryo fat content (F1,129 = 39.057, P < 0.001, ηp2 = 0.232) and RA (F1,129 = 17.212, P < 0.001, ηp2 = 0.118; P > 0.032 for all other traits), with both traits decreasing with increasing developmental stage (Supplementary Fig. 1, c and d). Genetic Cluster significantly affected all traits (female lean mass: F4,129 = 4.716, P = 0.001, ηp2 = 0.128; female fat content: F4,129 = 19.521, P < 0.001, ηp2 = 0.377; fecundity: F4,129 = 9.300, P < 0.001, ηp2 = 0.224; embryo fat content: F4,129 = 3.758, P = 0.006, ηp2 = 0.104; embryo lean mass: F4,129 = 8.976, P < 0.001, ηp2 = 0.218; RA: F4,129 = 4.982, P < 0.001, ηp2 = 0.133). Female lean mass was greatest in Genetic Clusters 1 and 5, intermediate in Cluster 4 and smallest in Clusters 2 and 3, while female fat content was highest in Genetic Cluster 2 and lowest in Genetic Cluster 5 (Supplementary Table 6).
Fecundity was greatest in Cluster 1 and by far the lowest in Cluster 5, and the opposite pattern was found for embryo lean mass, which was by far the greatest in Cluster 5 and lowest in Cluster 1. Embryo fat content was greatest in Clusters 3 and 5 and lowest in Cluster 1, and RA was by far lowest in Cluster 5 and greatest in Clusters 2 and 3 (Supplementary Table 6). PC1 significantly affected all traits but embryo fat content (female lean mass: F1,129 = 78.952, P < 0.001, ηp2 = 0.380; female fat content: F1,129 = 12.311, P < 0.001, ηp2 = 0.087; fecundity: F1,129 = 7.629, P = 0.007, ηp2 = 0.056; embryo lean mass: F1,129 = 63.764, P < 0.001, ηp2 = 0.331; RA: F1,129 = 9.664, P = 0.002, ηp2 = 0.070; embryo fat content: F1,129 = 0.585, P = 0.446, ηp2 = 0.005) and PC2 significantly affected female fat content (F1,129 = 43.108, P < 0.001, ηp2 = 0.250), fecundity (F1,129 = 9.009, P = 0.003, ηp2 = 0.065) and embryo lean mass (F1,129 = 14.934, P < 0.001, ηp2 = 0.104; P > 0.084 for all other traits). Visual examination revealed positive relationships between PC1 and both female lean mass and fecundity, while the relationships between PC1 and female fat content, embryo lean mass and RA were negative (Fig. 6f-i). Similarly, female fat content and fecundity had a positive relationship with PC2, while embryo lean mass had a negative relationship with PC2 (Fig. 7f-i). This means that female lean mass and fecundity increased with increasing water temperature, DO and salinity, but that female fat content, embryo lean mass and RA decreased along the same axis. Moreover, female fat content and fecundity increased with increasing turbidity, while embryo lean mass decreased. Based on our Mantel tests, neither male life-history distance (r = 0.024, P = 0.442) nor female life-history distance (r = -0.234, P = 0.875) was correlated with genetic distance between population pairs.