1. Genome size diversity in Chondrichthyes and sharks
Average genome size estimates (C-values) were obtained for a total of 137 chondrichthyan species: 66 sharks, 68 batoids, and 3 chimaeras. Analysis of variance showed significant differences among major chondrichthyan lineages (ANOVA: F(2,134) = 19.01, p < 0.001***), with sharks having significantly larger genomes (mean ± SD = 6.74 pg ± 3.53) than both batoids (4.86 pg ± 1.79; t-test = 3.82, p < 0.001***) and chimaeras (1.69 pg ± 0.22; t-test = 5.35, p < 0.001***) and batoid genomes being significantly larger than those of chimaeras (t-test = 4.24, p < 0.001***) in post hoc (Tukey’s HSD) analysis (Figure 1a). Within sharks, interspecific C-value variation ranged about 6-fold, with the highest values assigned to the Squalomorphii clade in comparisons across superorders (ANOVA: F(1,63) = 63.58, p < 0.001***). Shark orders also differed significantly from one another in average DNA content (ANOVA: F(6,59) = 19.29, p < 0.001***), with post hoc analysis grouping Heterodontiformes, Squatiniformes and Squaliformes (accounting for the largest genomes) separated from Orectolobiformes, Carcharhiniformes and Hexanchiformes. With intermediate C-values, Lamniformes were not significantly different from any of these two groups (Table 1; Figure 1b).
Table 1
|
Genome size (pg/n)
|
Chromosome number (n)
|
Fundamental number (FN)
|
Lineage
Superorder
Order
|
n
|
Range
|
Mean ± SD
|
CV
|
n
|
Range
|
Mean ± SD
|
CV
|
n
|
Range
|
Mean ± SD
|
CV
|
Selachimorphii
(or Selachii)
|
66
|
2.86–17.05
|
6.74 ± 3.53
|
52.36
|
30
|
30.5a-52
|
39.62 ± 6.44
|
16.25
|
30
|
39-66
|
54.45 ± 7.61
|
13.97
|
Galeomorphii
|
44
|
2.86–14.80
|
5.12 ± 2.10
|
40.93
|
21
|
31–51
|
39.43 ± 5.71
|
14.49
|
21
|
45–66
|
56.50 ± 6.63
|
11.73
|
Heterodontiformes
|
2
|
7.60–14.80
|
11.20 ± 5.09
|
45.46
|
2
|
51
|
51 ± 0
|
0
|
2
|
56–64
|
60 ± 5.66
|
9.43
|
Orectolobiformes
|
5
|
3.90–5.51
|
4.65 ± 0.64
|
13.72
|
-
|
-
|
-
|
-
|
-
|
-
|
-
|
-
|
Lamniformes
|
3
|
5.86–6.55
|
6.29 ± 0.37
|
5.93
|
2
|
41–42
|
41.5 ± 0.71
|
1.70
|
2
|
65–66
|
65.5 ± 0.71
|
1.08
|
Carcharhiniformes
|
34
|
2.86–8.65
|
4.73 ± 1.53
|
32.28
|
17
|
31–45
|
37.82 ± 4.56
|
12.05
|
17
|
45–63
|
55.03 ± 6.24
|
11.34
|
Squalomorphii
|
22
|
4.40–17.05
|
9.99 ± 3.61
|
36.20
|
9
|
30.5a–52
|
40.06 ± 8.26
|
20.63
|
9
|
39–59
|
49.67 ± 7.95
|
16.01
|
Hexanchiformes
|
3
|
4.40–5.35
|
4.78 ± 0.50
|
10.52
|
3
|
36–52
|
46 ± 8.72
|
18.95
|
3
|
39–56
|
49.67 ± 9.29
|
18.71
|
Squatiniformes
|
3
|
9.30–16.41
|
11.84 ± 3.97
|
33.53
|
1
|
44
|
44
|
-
|
1
|
57
|
57
|
-
|
Squaliformes
|
16
|
6.55–17.05
|
10.62 ± 3.10
|
29.23
|
5
|
30.5a–43
|
35.70 ± 6.67
|
18.68
|
5
|
41–59
|
48.20 ± 820
|
17.01
|
CV: Coefficient of Variation.
a Decimal chromosome number due to averaging for species with multiple entries. See Supplementary Table S1 online for species-specific estimates.
2. Mode and rate of genome size evolution
From the three models of evolution fitted to the genome size data (66 shark and 4 outgroup species), the Early-Burst (EB) model was the best-fitting one (AICw = 0.71, with a better fit for the constant-rate Brownian Motion (BM) model in analysis lacking an outgroup, AICw = 0.56; more details in Supplementary Table S2 online). Reconstruction of ancestral values (Figure 2) showed labile changes in DNA content at different taxonomic levels, with a major rate of increase preceding shark diversification (see Supplementary Fig. S4 online for a projection of the phylogeny onto genome size phenotypic space). From this point, larger and smaller genomes were secondarily acquired from an ancestral state for all major extant shark clades (6.37 pg) similar to the present-day average (6.74 pg).
Pagel’s tree transformation parameters were estimated to further test for deviations from a constant-rate process of evolution. Pagel’s lambda (λ) was greater than 0.93, whether an outgroup was included or not (for both cases: pagainst λ=0 < 0.001***, pagainst λ=1 > 0.367), indicating that phylogeny strongly determined genome size evolution (i.e., strong phylogenetic signal, λ = 1). Pagel’s delta (δ) for all species was 0.23 (pagainst δ=1 = 0.031*), characteristic of an evolutionary scenario where character change is concentrated early in the species diversification (i.e., δ < 1). Exclusion of the outgroup species resulted in a higher estimate not significantly different from a situation where the rate of evolution remained constant through time (δ = 0.69, pagainst δ=1 = 0.489). The estimates obtained for Pagel’s kappa (κ) were 0.92 (with outgroup, pagainst κ=1 = 0.834) and 0.69 (sharks only, pagainst κ=1 = 0.413), supporting a pattern of gradual evolution (i.e., κ = 1). For this parameter, the null hypothesis of punctuated evolution could only be rejected when an outgroup was included (pagainst κ=0 = 0.020*; sharks only: pagainst κ=0 = 0.081).
Finally, the rates at which DNA content evolved across the two shark superorders were compared. Phylogenetic Independent Contrast (PIC) absolute values, which encapsulate the degree of trait change, were on average greater in Squalomorphii (mean ± SD = 0.28 ± 0.24) than in Galeomorphii (0.12 ± 0.16; Mann–Whitney U test: W = 246, p < 0.01**, n = 64). Results for all analyses described in Section 2 were consistent across the 100 alternative phylogenetic hypotheses considered, except for Pagel’s κ, where estimates widely ranged between 0 and 1 (Supplementary Figure S5 and Table S3 online).
3. Chromosome number evolution
Ancestral haploid chromosome number (n) reconstruction was based on chromosome count data for 30 shark (summarized by order in Table 1) and 3 outgroup species. From the eight models tested in ChromEvol, “Constant Rate with No Duplication” provided the best fit (AIC = 206.68). According to this result, only events of dysploidization (i.e., structural chromosome rearrangements) accounted for changes in chromosome numbers along the branches of the tree and thus, no single polyploidization or demi-duplication event was inferred from the analysis (Figure 3). Overall, inferred occurrences of ascending disploidy (i.e., individual chromosome gains through fission events) were more common than those involving descending disploidy (via fusion events): gain rate, λ = 51.66, total frequency = 979.43; loss rate, δ = 42.75, total frequency = 791.80 (more details in Supplementary Table S4 online). Empirical chromosome data and simulated values (see Methods) differed in two of the four test statistics analysed (Variance: p = 0.18; Entropy: p = 0.04*; Parsimony: p < 0.01**; ParsimonyTime: p = 0.17).
Haploid chromosome number (n) was positively correlated with fundamental number (FN) of chromosome arms (Phylogenetic Generalized Least Squares (PGLS): β = 0.26, p < 0.01**, df = 28), but not significantly associated with chromosome composition (i.e., the ratio between two-armed and one-armed chromosome morphotypes; PGLS: β = -1.69, p = 0.225, df = 27; Figures 4a-b). No significant associations were found between genome size and any of the karyotype parameters (see details in Table 2; Figures 4c-e).
4. Genome size variation across cytological, life-history, and ecological gradients
4.1 Simple regression analysis
Genome size variation was compared against a wide variety of phenotypic and ecological parameters via PGLS, summarized in Table 2.
Table 2
Category
|
Model predictor(s)
|
df
|
λ
|
Slope (β) ± SE
|
t-value
|
F-value
|
p-value
|
karyotypic and Cytological factors
|
n
|
26
|
1 (1)
|
-0.003 ± 0.012 (-0.003)
|
-0.248 (-0274)
|
0.062 (0.075)
|
0.8060 (100%)
|
FN
|
26
|
1 (1)
|
0.0008 ± 0.007 (0.0003)
|
0.113 (0.037)
|
0.013 (0.026)
|
0.9105 (100%)
|
Chr. comp.
|
25
|
1 (1)
|
0.031 ± 0.076 (0.029)
|
0.413 (0.375)
|
0.171 (0.140)
|
0.6830 (100%)
|
ln-Ca
|
28
|
1 (0.94)
|
0.547 ± 0.149 (0.594)
|
3.665 (3.943)
|
13.433 (15.550)
|
0.0010** (100%)
|
ln-Na
|
19
|
0.80 (0.74)
|
0.580 ± 0.146 (0.598)
|
3.959 (4.131)
|
15.673 (17.068)
|
0.0008*** (100%)
|
Morphological factors
|
LPCA1 + (LPCA1)2
|
61
|
0.85 (0.82)
|
L: -0.041 ± 0.029
(-0.049)
L2: 0.009 ± 0.009 (0.011)
|
L: -1.448 (-1.718)
L2: 1.018 (1.246)
|
L: 2.096 (2.951)
L2: 1.036 (1.554)
|
L: 0.1528 (100%)
L2: 0.3129 (100%)
|
ln-Wmax
|
34
|
0.86 (0.86)
|
-0.015 ± 0.027 (-0.018)
|
-0.541 (-0.672)
|
0.292 (0.452)
|
0.5923 (100%)
|
PBS
|
57
|
0.75 (0.73)
|
-
|
-
|
2.712 (2.626)
|
0.0388* (59%)
|
CFAR*§
|
54
|
1 (1)
|
-0.126 ± 0.022 (-0.129)
|
-5.748 (-5.806)
|
33.039 (33.711)
|
4.31e-07*** (84%)
|
Metabolic and physiological factors
|
ln-SMR§
|
14
|
0 (0)
|
-0.706 ± 0.112 (-0.706)
|
-6.319 (-6.319)
|
39.924 (39.924)
|
1.89e-05*** (100%)
|
LPCA1 + TS
(= Cruising Speed)
|
58
|
0.71 (0.65)
|
L: -0.003 ± 0.019
(-0.003)
TS: -
|
L: -0.144 (-0.176)
TS: -
|
L: 0.021 (0.031)
TS: 3.757 (4.039)
|
L: 0.8862 (100%)
TS: 0.0156* (100%)
|
Developmental and demographic factors
|
k*§
|
51
|
0.85 (0.82)
|
-0.229 ± 0.578 (-0.223)
|
-0.396 (-0.380)
|
0.157 (0.145)
|
0.6937 (100%)
|
TPCA1*§
|
50
|
0.95 (0.89)
|
-0.088 ± 0.072 (-0.074)
|
-1.221 (-0.922)
|
1.490 (0.851)
|
0.2279 (99%)
|
rmax§
|
29
|
0.89 (0.91)
|
0.285 ± 0.329 (0.396)
|
0.867 (1.224)
|
0.751 (1.499)
|
0.3933 (99%)
|
Reproductive factors
|
ln-Ls + RM§§
|
61
|
0.84 (0.82)
|
Ls: 0.012 ± 0.044 (0.011)
RM: -
|
Ls : 0.268 (0.251)
RM: -
|
Ls: 0.072 (0.063)
RM: 2.543 (2.606)
|
Ls: 0.7897 (100%)
RM: 0.0870 (97%)
|
Ecological factors
|
Temp
|
63
|
0.88 (0.84)
|
0.0007 ± 0.006 (-0.0003)
|
0.129 (-0.051)
|
0.017 (0.022)
|
0.8976 (100%)
|
ln-Depth
|
64
|
0.87 (0.85)
|
-0.004 ± 0.031 (-0.001)
|
-0.140 (-0.034)
|
0.020 (0.016)
|
0.8894 (100%)
|
ln-Depth range
|
64
|
0.87 (0.85)
|
-0.005 ± 0.034 (-0.001)
|
-0.156 (-0.028)
|
0.024 (0.003)
|
0.8768 (100%)
|
ln-Depth + Climate†
|
60
|
0.89 (0.86)
|
Dep: -0.018 ± 0.032
(-0.014)
Clim: -
|
Dep: -0.560
(-0.437)
Clim: -
|
Dep: 0.314 (0.191)
Clim: 0.839 (0.779)
|
Dep: 0.5773 (98%)
Clim: 0.5060 (98%)
|
ln-Depth + Occurrence†
|
62
|
0.87 (0.84)
|
Dep: -0.004 ± 0.034 (0.0008)
Oc: -
|
Dep: -0.125 (0.023)
Oc: -
|
Dep: 0.016 (0.019)
Oc: 0.007 (0.020)
|
Dep: 0.9010 (100%)
Oc: 0.9935 (100%)
|
Habitat
|
62
|
0.87 (0.85)
|
-
|
-
|
0.812 (0.712)
|
0.4922 (100%)
|
ln-Depth + Salinity†
|
61
|
0.95 (0.92)
|
Dep: -0.0009 ± 0.030
(-0.023)
Sal: -
|
Dep: -0.029 (-0.760)
Sal: -
|
Dep: 0.0008 (0.577)
Sal: 5.560 (6.215)
|
Dep: 0.9774 (100%)
Sal: 0.0061** (100%)
|
* Additional results for when poor-quality estimates were omitted (see Supplementary Methods online) are shown in the main text.
§ Body-size correction using regression residuals (i.e., partial correlation analysis).
§§ Additional body-size corrected results (via multiple regression) are shown in the main text.
† Depth correction performed by including ln-tranformed average depth (“ln-Depth”) as a covariate (i.e., ANCOVA).
Within the cell, a strong positive relationship was found between genome size and both cell (Ca) and nucleus (Na) areas. Regarding morphological factors, neither total body length (LPCA1) nor maximum body weight (Wmax) were significantly associated with genome content. Precaudal body shape (PBS), on the other hand, was a better predictor of genome size. Post hoc analysis revealed that species with body types 4 and 5 exhibit larger genomes than those embodying more active lifestyles (body types 2 and 3; the difference between 4 and 2 being nominally not significant, p = 0.053), while sharks with body type 1 were not statistically different from any of them (for body type definitions, see Supplementary Methods online). Similarly, (body-size corrected) caudal fin aspect ratio (CFAR, a general indicator of swimming activity) was negatively correlated with genome size, with almost identical results found when poor-quality values (i.e., estimated from illustrations) were omitted from the analysis (partial PGLS correlation (PCPGLS): β = -0.12, p < 0.001***, df = 35).
Among physiological predictors, (body-mass corrected) standard metabolic rate (SMR) was strongly negatively associated with DNA content. At the functional level, analysis of genome size covariance across (body-size corrected) tail shapes (TS), which relate to the species cruising speed of locomotion, revealed significant differences: tail type 4 species, characterized by the slowest cruising speeds, contained significantly higher C-values than those with faster locomotion mechanics (tail types 2 and 3). Tail type 1 species (with the second fastest cruising speeds) were not statistically different from any group (tail type definitions are given in Supplementary Methods online).
With respect to developmental and demographic traits, there were no significant effects of (body-size corrected) growth completion rate (k), age (TPCA1), or maximum intrinsic rate of population increase (rmax) on DNA content. Discarding k and TPCA1 poor-quality estimates (i.e., those calculated through the FishBase Life-history tool) led to very similar results (PCPGLS, k: β = -0.29, p = 0.583, df = 38; TPCA1: β = -0.05, p = 0.302, df = 37).
Among reproductive parameters, no significant relationships were found between genome size and reproduction mode or the covariate litter size (Ls). Factoring out the effects of body size rendered similar results (multiple PGLS regression (MLPGLS), reproduction mode: F(2,60) = 2.48, p = 0.092; Ls: β = 0.02, p = 0.636, df = 60 for both).
In relation to the ecological factors analysed, no significant associations were found between DNA content and preferred water temperature, average depth or depth range. Similarly, environmental differences in genome size according to (depth corrected) climate, species occurrence across the water column, and habitat (not corrected by water-depth) were not significant, unlike for water salinity: sharks inhabiting marine waters had genomes significantly larger than those living in marine-brackish waters, while those capable of entering rivers (amphidromous) were not statistically different from any other category.
4.2 Multivariate regression analysis
Based on AICc minimization, the best-supported combination of life-history and ecological predictors explaining shark genome size variation is given in Table 3.
Table 3
Full model predictors
|
Best models
|
df
|
log-lik
|
AICc
|
F-value
|
p-value
|
1) LPCA1 + PBS + CFAR + k + TPCA1 + RM + ln-Ls + (Temp or ln-Depth or ln-Depth range) + Clim + Oc + Sal
|
1.1) 1 + LPCA1 + Clim + Oc + Sal
|
10
|
16.094
|
-4.85
|
LPCA1: 0.319 (0.314)
Clim: 7.297 (7.318)
Oc: 5.602 (5.283)
Sal: 10.547 (10.304)
|
LPCA1: 0.5761
Clim: 0.0003***
Oc: 0.0084**
Sal: 0.0003***
|
1.2) 1 + LPCA1 + CFAR + Clim + Oc + Sal
|
11
|
17.879
|
-4.65
|
LPCA1: 0.017 (0.021)
CFAR: 5.570 (6.418)
Clim: 5.375 (5.297)
Oc: 7.496 (7.462)
Sal: 9.231 (8.995)
|
LPCA1: 0.8961
CFAR: 0.0250*
Clim: 0.0022**
Oc: 0.0023**
Sal: 0.0008***
|
2) LPCA1 + PBS + CFAR + k + TPCA1 + RM + ln-Ls + (Temp or ln-Depth or ln-Depth range) + Clim + Hab + Sal
|
2.1) 1 + LPCA1 + Clim + Hab + Sal
|
11
|
16.624
|
-2.14
|
LPCA1: 0.264 (0.297)
Clim: 6.836 (6.951)
Hab: 3.786 (3.611)
Sal: 7.229 (7.232)
|
LPCA1: 0.6114
Clim: 0.0005***
Hab: 0.0205*
Sal: 0.0027**
|
2.2) 1 + LPCA1 + PBS + RM + Clim + Sal
|
13
|
20.111
|
-0.74
|
LPCA1: 0.447 (0.447)
PBS: 4.601 (4.600)
RM: 5.445 (5.444)
Clim: 4.131 (4.131)
Sal: 4.209 (4.209)
|
LPCA1: 0.5092
PBS: 0.0097**
RM: 0.0101*
Clim: 0.0094**
Sal: 0.0252*
|
3) LPCA1 + TS + CFAR + k + TPCA1 + RM + ln-Ls + (Temp or ln-Depth or ln-Depth range) + Clim + Oc + Sal
|
3.1) 1 + LPCA1 + Clim + Oc + Sal
|
As in model 1.1
|
3.2) 1 + LPCA1 + CFAR + Clim + Oc + Sal
|
As in model 1.2
|
4) LPCA1 + TS + CFAR + k + TPCA1 + RM + ln-Ls + (Temp or ln-Depth or ln-Depth range) + Clim + Hab + Sal
|
4.1) 1 + LPCA1 + Clim + Hab + Sal
|
As in model 2.1
|
4.2) 1 + LPCA1 + TS + RM + Clim + Sal
|
13
|
20.396
|
-1.31
|
LPCA1: 1.224 (1.224)
TS: 4.796 (4.796)
RM: 5.101 (5.101)
Clim: 5.025 (5.025)
Sal: 5.111 (5.111)
|
LPCA1: 0.2781
TS: 0.0081**
RM: 0.0129*
Clim: 0.0035**
Sal: 0.0128*
|
Among the included predictors, results were consistent with Section 4.1 for LPCA1, PBS (with the difference between body types 4 and 2 now being strongly significant, p < 0.01**), CFAR, TS and salinity. In contrast, reproduction mode, climate, occurrence, and habitat had significant effects on genome size variation in all multivariate models. For these predictors, post hoc genome size comparisons revealed that 1) among reproduction modes, placental viviparous species have smaller genomes than both aplacental viviparous and oviparous ones; 2) across a climate gradient, sharks found at boreal latitudes harbour larger genomes than those inhabiting lower latitudes, i.e., temperate, subtropical and tropical waters [species found worldwide were grouped together with boreal ones in some models or were not statistically different from any category, except tropical sharks, in others]; 3) regarding species occurrence, benthopelagic sharks were associated with genomic contents greater than those found in pelagic sharks, while those pertaining to epibenthic species were not statistically different from any of them; and 4) among the habitats used, sharks inhabiting oceanic waters contain smaller genomes than those living in reefs, coastal and deep waters.
5. The interplay between genome size and diversity
A comparison was made between DNA content and diversity at the taxonomic and genetic levels. No significant associations were found with any of the taxonomic diversity parameters, namely the number of species and genera per family (PGLS: β = -0.006, p = 0.907 and β = -0.088, p = 0.159, df = 64 for both; respectively) and the number of species, genera, and families per order (PGLS: β = -0.013, p = 0.852; β = -0.097, p = 0.174 and β = -0.182, p = 0.136, df = 64 for all; respectively); nor with expected heterozygosity (He, PGLS: β = -0.73, p = 0.130, df = 16; with almost identical results when correcting for body size and age, MLPGLS: β = -0.70, p = 0.183, df = 14; Supplementary Figure S10 online). Results for all comparative analyses described in Sections 3-5 were robust to phylogenetic uncertainty (Tables 2, 3 and Supplementary Table S5 online).