Morphological characterization
Variability across the accessions
Analysis of variance revealed significant differences (P < 0.01) among accessions and among weeks for all traits (plant height, number of roots, root length, leaf width, leaf length, number of leaves, and number of leaf lobes). The accession × week interaction was significant (P < 0.01) only for root length and number of leaf lobes, with mean squares of 6.194*** and 0.871***, respectively (Table 3.1).
Table 3.1 Mean squares of morphological traits among 101 in vitro cassava accessions
SOV | DF | LL (cm) | LW (cm) | NL | NR | PH (cm) | RL (cm) | NLL
Accessions | 100 | 0.781*** | 0.392*** | 9.110*** | 13.401*** | 5.677*** | 24.721*** | 1.972***
Weeks | 2 | 33.637*** | 8.932*** | 566.141*** | 575.059*** | 543.116*** | 4343.86*** | 94.003***
Accessions X Weeks | 200 | 0.132 | 0.045 | 1.603 | 2.723 | 1.123 | 6.194*** | 0.871***
Residuals | 909 | 0.126 | 0.053 | 1.459 | 2.733 | 1.232 | 3.815 | 0.484
SOV: Source of variation, DF: Degrees of freedom, LL: Leaf length, LW: Leaf width, NL: Number of leaves, NR: Number of roots, PH: Plant height, RL: Root length, NLL: Number of leaf lobes, ***: significant at the 0.1% probability level.
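The interaction result can be checked directly from the mean squares in Table 3.1. The sketch below computes F ratios for the accession × week interaction, assuming that the residual mean square is the error term for every effect (the text does not state the model, so this is an assumption). The variable and function names are illustrative only.

```python
# Mean squares copied from Table 3.1 (trait order: LL, LW, NL, NR, PH, RL, NLL).
TRAITS = ["LL", "LW", "NL", "NR", "PH", "RL", "NLL"]
MS_INTERACTION = [0.132, 0.045, 1.603, 2.723, 1.123, 6.194, 0.871]
MS_RESIDUAL = [0.126, 0.053, 1.459, 2.733, 1.232, 3.815, 0.484]


def f_ratios(ms_effect, ms_error):
    """Return F = MS_effect / MS_error for each trait.

    Assumption: the residual mean square is the appropriate error term.
    """
    return [effect / error for effect, error in zip(ms_effect, ms_error)]


interaction_f = dict(zip(TRAITS, f_ratios(MS_INTERACTION, MS_RESIDUAL)))

# Degrees of freedom (200 and 909) are taken from the DF column of Table 3.1.
for trait, f_value in interaction_f.items():
    print(f"{trait}: F(200, 909) = {f_value:.2f}")
```

Under this assumption the two largest interaction F ratios belong to root length and number of leaf lobes, while the ratios for the other traits stay close to 1, which agrees with the significance pattern in the table.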
Cluster analysis
Cluster analysis using Ward's minimum variance method generated a dendrogram with four distinct clusters (I, II, III, and IV) at a dissimilarity level of 9.5 (Figure 3.1). At a dissimilarity level of 5.0, Cluster I contained 17 accessions in two sub-clusters, Cluster II contained 49 accessions in five sub-clusters, Cluster III contained 18 accessions in three sub-clusters, and Cluster IV contained 15 accessions in three sub-clusters. Cluster II was therefore the largest. Accessions TMe-3373 and TMe-4132 had the most similar phenotypes (Euclidean distance ≤ 1).
Figure 3.1 Dendrogram constructed by Ward's minimum variance cluster analysis to identify genetic relatedness among 101 in vitro cassava accessions using the standardized Euclidean distance matrix.
The first three principal components explained 67.26% of the total variation observed among the accessions, with eigenvalues ranging from 2.150 to 1.125 (Table 3.2). The first principal component (PC1) had an eigenvalue of 2.150 and accounted for 30.72% of the total variability; all traits loaded negatively on PC1. PC2 and PC3 accounted for 20.48% and 16.07% of the total variation, with eigenvalues of 1.433 and 1.125, respectively. A trait was considered to influence a PC axis when the absolute value of its loading (eigenvector coefficient) was ≥ 0.2. Leaf length (LL = -0.551), leaf width (LW = -0.531), number of roots (NR = -0.398), plant height (PH = -0.383), and root length (RL = -0.298) loaded on PC1. Root length (RL = -0.569), number of roots (NR = -0.515), leaf width (LW = 0.431), and leaf length (LL = 0.409) loaded on PC2, and number of leaf lobes (NLL = 0.686), number of leaves (NL = 0.681), and plant height (PH = -0.201) loaded on PC3. The remaining traits provided little or no discriminatory power on the respective axes.
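The selection of traits per axis and the percentage of variance can be retraced from the values in Table 3.2. The sketch below applies the 0.2 cutoff to the absolute loadings and converts each eigenvalue to a percentage, assuming the PCA was run on standardized traits so that the eigenvalues sum to the number of traits (seven); this is an assumption, not something the text states. Because the tabulated eigenvalues are rounded, the percentages only approximately match the table.

```python
# Loadings (PC1, PC2, PC3) and eigenvalues copied from Table 3.2.
LOADINGS = {
    "LL": (-0.551, 0.409, -0.023),
    "LW": (-0.531, 0.431, 0.009),
    "NL": (-0.142, -0.178, 0.681),
    "NLL": (-0.002, 0.080, 0.686),
    "NR": (-0.398, -0.515, 0.072),
    "PH": (-0.383, -0.142, -0.201),
    "RL": (-0.298, -0.569, -0.141),
}
EIGENVALUES = (2.150, 1.433, 1.125)
CUTOFF = 0.2  # threshold on the absolute loading, as used in the text


def traits_loading_on(pc_index, cutoff=CUTOFF):
    """Return traits whose absolute loading on the given PC meets the cutoff."""
    return [t for t, row in LOADINGS.items() if abs(row[pc_index]) >= cutoff]


def percent_variance(eigenvalue, n_traits=len(LOADINGS)):
    """Percentage of variance, assuming eigenvalues sum to the trait count."""
    return 100.0 * eigenvalue / n_traits


for i, eigenvalue in enumerate(EIGENVALUES):
    print(f"PC{i + 1}: {percent_variance(eigenvalue):.2f}% ->", traits_loading_on(i))
```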
Table 3.2 also lists the mean, standard deviation (SD), minimum and maximum values, coefficient of variation (CV), least significant difference (LSD), and heritability (h2b) of the traits studied. The widest ranges between minimum and maximum values were found for root length (RL; 1.908 to 9.159) and number of roots (NR; 0.917 to 5.917) (Table 3.2). Significant differences were noted for number of leaf lobes (NLL) and root length (RL), with LSD values of 0.569 and 1.595, respectively. The coefficients of variation ranged from 16.062% for NLL to 34.996% for NR. Heritability (h2b) was 0.830 for leaf length (LL), 0.867 for leaf width (LW), 0.824 for number of leaves (NL), 0.558 for number of leaf lobes (NLL), 0.796 for number of roots (NR), 0.786 for plant height (PH), and 0.749 for root length (RL). The relationships among traits are illustrated in Figure 3.2. The association between leaf width and leaf length was highly significant and positive (r = 0.81***).
In the principal component biplot (Figure 3.3), plant height, number of leaves, root length, and number of roots were positively correlated with one another but contributed negatively to the variability along the PC1 axis. Leaf width and leaf length were also positively correlated and loaded negatively on PC1. The number of leaf lobes and root length were negatively correlated. The number of leaf lobes strongly influenced PC2, whereas plant height strongly influenced PC1.
Table 3.2 Eigenvalues, percentage and cumulative percentage of variance, loadings on the first three principal components, and descriptive statistics of seven morphological traits.
TRAITS | PC1 | PC2 | PC3 | Mean | Min | Max | SD | CV | P-value | LSD | h2b
LL | -0.551 | 0.409 | -0.023 | 1.013 | 0.567 | 1.717 | 0.255 | 25.179 | 0.331 | 0.291 | 0.830
LW | -0.531 | 0.431 | 0.009 | 0.558 | 0.225 | 1.200 | 0.181 | 32.363 | 0.912 | 0.189 | 0.867
NL | -0.142 | -0.178 | 0.681 | 2.965 | 1.417 | 5.583 | 0.871 | 29.384 | 0.188 | 0.986 | 0.824
NLL | -0.002 | 0.080 | 0.686 | 2.524 | 1.333 | 3.250 | 0.405 | 16.062 | 0.000 | 0.569 | 0.558
NR | -0.398 | -0.515 | 0.072 | 3.020 | 0.917 | 5.917 | 1.057 | 34.996 | 0.503 | 1.350 | 0.796
PH | -0.383 | -0.142 | -0.201 | 3.974 | 2.683 | 5.508 | 0.688 | 17.308 | 0.790 | 0.907 | 0.786
RL | -0.298 | -0.569 | -0.141 | 4.948 | 1.908 | 9.159 | 1.435 | 29.008 | 0.00 | 1.595 | 0.749
Eigenvalue | 2.150 | 1.433 | 1.125
% variance | 30.72 | 20.478 | 16.066
Cumulative % variance | 30.72 | 51.198 | 67.264
CV: Coefficient of variation, SD: Standard deviation, LSD: Least significant difference, h2b: Heritability, LL: Leaf length, LW: Leaf width, NL: Number of leaves, NR: Number of roots, PH: Plant height, RL: Root length, and NLL: Number of leaf lobes.
Figure 3.2 Pearson’s correlation coefficient of the morphological traits of 101 in vitro cassava accessions
Figure 3.3 Principal component biplot showing the variability among accessions and contribution of traits
Molecular characterization
SNP calling, filtering, and heterozygosity
A total of 27,170 SNP markers were discovered genome-wide in the in vitro cassava population, and all SNPs were aligned to cassava reference genome v7. After filtering, 19,467 high-quality SNP markers were retained, a 28.35% reduction from the original dataset. This filtering step ensured that the remaining SNPs were informative and suitable for the genetic analyses. Polymorphic information content (PIC) ranged from 0.19 to 0.49, with an average of 0.23. The average minor allele frequency (MAF) and average observed heterozygosity were 0.25 and 0.28, respectively. The call rate ranged from 0.7053 to 1.00, with an average of 0.92987. Marker repeatability ranged from 0.95 to 1.0 after filtering.
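The retention figure and the per-marker statistics can be illustrated with a short sketch. It assumes biallelic SNPs coded as the count of the alternate allele (0, 1, or 2) with `None` for a missing call; this coding and the function names are illustrative assumptions and not the pipeline used in the study.

```python
def percent_removed(n_before, n_after):
    """Percentage of markers dropped by filtering."""
    return 100.0 * (n_before - n_after) / n_before


def snp_summary(calls):
    """Summarize one biallelic SNP.

    calls: alternate-allele counts (0, 1, 2) per sample, None when missing.
    Returns call rate, minor allele frequency, and observed heterozygosity.
    """
    called = [c for c in calls if c is not None]
    if not called:
        raise ValueError("no genotype calls for this marker")
    alt_freq = sum(called) / (2 * len(called))
    return {
        "call_rate": len(called) / len(calls),
        "maf": min(alt_freq, 1.0 - alt_freq),
        "obs_het": sum(1 for c in called if c == 1) / len(called),
    }


# Marker counts reported in the text: 27,170 before and 19,467 after filtering.
print(round(percent_removed(27170, 19467), 2))  # 28.35

# Hypothetical genotype vector for ten samples, one of them missing.
print(snp_summary([0, 1, 2, 1, None, 0, 0, 1, 0, 0]))
```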
Figure 3.4: Minor allele frequency (MAF) distribution
Figure 3.5: Distribution of the proportion of missing data
Figure 3.6: Distribution of the proportion of heterozygosity
Genetic Relationship
The genetic distance among the in vitro cassava accessions, measured as identity-by-state (IBS) distance, ranged from 0.121 to 0.382. This moderate level of variation suggests a diverse allelic pool in the germplasm collection. Accessions TMe-1437_T_NGA_Unk and TMe-1230_T_NGA_BR had the lowest genetic distance (0.121), indicating a high degree of molecular similarity and a close relationship. Conversely, the greatest genetic distance (0.382), observed between TMe-4424-NGR-BR and TMe-2906-CPV-LR, highlights substantial allelic divergence within the collection.
The neighbor-joining (NJ) phylogenetic tree revealed four distinct clusters (I, II, III, and IV) (Figure 3.7.a). The maximum distance to the root was 0.189, and branch lengths in the tree ranged from 0.00035 to 0.17. Cluster I contained 32 accessions, of which 24 were breeding or research materials, four were landraces, and four were of unknown status (not known to be either breeding materials or landraces), as shown in Figure 3.7.b.
Of the 18 accessions in Cluster II, 13 were breeding materials from Nigeria, four were landraces from Cameroon, and one was of unknown status from Nigeria. Of the 31 accessions in Cluster III, 18 were breeding materials from Nigeria, 11 were landraces (five from Nigeria, three from Ghana, and the remainder from Brazil and Gambia), and two were of unknown status from Nigeria. Of the eight accessions in Cluster IV, five were breeding materials from Nigeria, two were landraces (one each from Guinea and Togo), and one was of unknown status from Nigeria.
Figure 3.7.a. Unrooted neighbor-joining tree showing the genetic relationships among in vitro cassava accessions based on SNP markers, annotated by breeding status and country of origin (NG: Nigeria, GH: Ghana, CMR: Cameroon, SR: Sierra Leone, CPV: Cape Verde, BR: Brazil, GMB: Gambia, TG: Togo).
Figure 3.7.b: Clustering of cassava accessions based on SNP markers with respect to their countries of origin and breeding status (BR: Breeding material, LND: Landraces, UNK: Unknown).
Population structure analysis
Principal component analysis (PCA) was conducted to examine the structure of the cassava population. As in the phylogenetic analysis, PCA did not separate the accessions into distinct groups according to their biological status. The first three principal components (PC1, PC2, and PC3) explained 15%, 4.74%, and 3.7% of the genetic variance, respectively (Figure 3.8). The first ten PCs together accounted for 44.3% of the total genetic variance in the in vitro cassava population.
Figure 3.8 Principal component (PC) analysis of the population of 89 in vitro cassava accessions based on 19k SNPs. (a) PC analysis for PC1 vs PC3, and (b) genetic variance explained by the first 10 PCs.
Figure 3.9: Genetic variance explained by the cumulative contribution of the first 10 PCs in Principal Component (PC) analysis of the population of 87 in vitro cassava accessions using SNP markers
Duplicate Identification using Identity-by-State and Kinship G-Matrix Distance
As shown in Figure 3.10, technical DNA replicates were produced by genotyping the same plant DNA twice to assess the accuracy and reproducibility of the genotyping process. IBS distance was used as the measure of genetic similarity; it is based on the proportion of markers at which two samples share the same alleles.
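To make the measure concrete, the sketch below computes an IBS distance between two samples under one common definition: one minus the average proportion of alleles shared per marker. It assumes biallelic genotypes coded 0, 1, or 2 with `None` for missing calls. The text does not give the exact formula or software used, so the definition, coding, and function name here are assumptions.

```python
def ibs_distance(sample_a, sample_b):
    """IBS distance = 1 - mean proportion of alleles shared per marker.

    Genotypes are alternate-allele counts (0, 1, 2); None marks a missing
    call. Markers missing in either sample are skipped.
    """
    pairs = [
        (a, b)
        for a, b in zip(sample_a, sample_b)
        if a is not None and b is not None
    ]
    if not pairs:
        raise ValueError("no markers called in both samples")
    # Two diploid genotypes share 2 - |a - b| of their 2 alleles by state.
    shared = sum(2 - abs(a - b) for a, b in pairs)
    return 1.0 - shared / (2 * len(pairs))


# Hypothetical genotypes at four markers.
replicate_1 = [0, 1, 2, 2]
replicate_2 = [0, 1, 2, 0]
print(ibs_distance(replicate_1, replicate_1))  # 0.0 for identical calls
print(ibs_distance(replicate_1, replicate_2))  # 0.25
```

With this definition a distance of 0 means identical calls at every shared marker, so technical replicates are expected to sit close to, but not exactly at, zero because of genotyping error.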
The threshold for duplicate confirmation was set by averaging the IBS distance across the nine technical replicate pairs and then adding the standard error of that average. This accounts for slight variation between replicates and reduces the risk of excluding true duplicates.
The threshold was thus based on the average IBS distance of the technical replicates (Table 3.3.a). A low IBS distance (closer to 0) indicates high genetic similarity. Pairs of accessions with an IBS distance below 0.125 were considered identical (duplicates), while pairs above this cut-off were considered unique accessions.
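The threshold calculation described above can be sketched from the nine replicate-pair IBS distances listed in Table 3.3.a. The standard error is taken here as the sample standard deviation divided by the square root of the number of pairs, which is an assumption because the text does not define it. The variable names are illustrative.

```python
import math
import statistics

# IBS distances of the nine technical replicate pairs (Table 3.3.a).
REPLICATE_IBS = [0.120, 0.117, 0.104, 0.109, 0.139, 0.108, 0.154, 0.142, 0.128]

mean_ibs = statistics.mean(REPLICATE_IBS)
# Assumed definition: standard error of the mean = sample SD / sqrt(n).
standard_error = statistics.stdev(REPLICATE_IBS) / math.sqrt(len(REPLICATE_IBS))
mean_plus_se = mean_ibs + standard_error

print(f"mean IBS distance: {mean_ibs:.3f}")
print(f"standard error:    {standard_error:.4f}")
print(f"mean + SE:         {mean_plus_se:.3f}")
```

The mean computed this way rounds to the 0.125 shown in the Average row of Table 3.3.a.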
Table 3.3.a. IBS and IBD distance between technical control cassava accessions
Ind-1 | TechRep | IBS-distance | IBD_distance and Similarity
TMe_F-1 | TMe_F-1_2nd | 0.120 | 1.58 (0.98)
TMe_F-7 | TMe_F-7_2nd | 0.117 | 1.52 (0.98)
TMe_F-80 | TMe_F-80_2nd | 0.104 | 1.50 (0.98)
TMe_IV-84 | TMe_IV-84_2nd | 0.109 | 1.43 (0.94)
TMe_IV-86 | TMe_IV-86_2nd | 0.139 | 1.62 (0.95)
TMe_IV-928 | TMe_IV-928_2nd | 0.108 | 1.46 (0.98)
TMe_IV-929 | TMe_IV-929_2nd | 0.154 | 1.18 (0.87)
TMe_IV-3338 | TMe_IV-3338_2nd | 0.142 | 1.73 (0.98)
TMe_IV-3442 | TMe_IV-3442_2nd | 0.128 | 1.58 (0.98)
Average | | 0.125 | 1.51 (0.961)
Figure 3.10.a: Identity by state distance between controlled duplicates
Out of 89 samples, one cassava accession (TMe-1437) was identified as a duplicate of TMe-1230 (IBS distance of 0.122 in the matrix of 89 lines). Removing such duplicates ensures that the remaining cassava accessions represent unique genetic individuals, improving the overall data quality and accuracy of the study. Similarly, two pairs of cassava accessions (TMe.3398_T & TMe.70_T and TMe.3235_T & TMe.3252_T) were identified as duplicates using the average correlation value of the technical replicates (>0.961; Table 3.3.a) from the IBD distance. When a correlation value >0.95 was used as the cutoff (Table 3.3.b), 11 pairs of duplicate accessions were confirmed, including the TMe.1437_T & TMe.1230_T pair identified by IBS distance. The 0.95 cutoff therefore gave better duplicate identification than the 0.961 cutoff, which did not recover this pair.
Table 3.3.b List of duplicate cassava pairs, including technical control cassava accessions, identified using the kinship G-matrix (IBD distance).
S.No | Indiv.A | Indiv.B | IBD-Value | Corr*
1 | TMe_F.1 | TMe_F.1_2nd | 1.582674 | 0.9814242
2 | TMe_IV.3338_2nd | TMe_IV.3338 | 1.733535 | 0.9811235
3 | TMe_IV.3442_2nd | TMe_IV.3442 | 1.577154 | 0.980804
4 | TMe_F.80 | TMe_F.80_2nd | 1.499839 | 0.9795043
5 | TMe_F.7 | TMe_F.7_2nd | 1.524889 | 0.9780268
6 | TMe_IV.928 | TMe_IV.928_2nd | 1.462965 | 0.9764166
7 | TMe.3398_T | TMe.70_T | 1.233879 | 0.9642543
8 | TMe.3235_T | TMe.3252_T | 1.574684 | 0.9611765
9 | TMe.3398_T | TMe.4562_T | 1.229085 | 0.9599954
10 | TMe.4562_T | TMe.70_T | 1.232826 | 0.9594426
11 | TMe.1230_T | TMe.1437_T | 1.634677 | 0.9574284
12 | TMe.4593_T | TMe_IV.84_2nd | 1.445569 | 0.9557188
13 | TMe.3398_T | TMe.3314_T | 1.225516 | 0.9555749
14 | TMe.3398_T | TMe.1572_T | 1.223386 | 0.9546295
15 | TMe.3314_T | TMe.70_T | 1.226887 | 0.9531912
16 | TMe_IV.86_2nd | TMe_IV.86 | 1.619672 | 0.9514253
17 | TMe.1572_T | TMe.70_T | 1.222898 | 0.9508044
18 | TMe.452_T | TMe_F.1 | 1.544997 | 0.9505431
Corr*: Correlation value; 0.95 was used as the threshold to confirm duplicate pairs of cassava.
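The pair counts reported for the two cutoffs can be retraced from Table 3.3.b. The sketch below filters the listed pairs by correlation and excludes technical replicate pairs, which it recognizes by the `_2nd` suffix on otherwise identical labels. That suffix rule and the function names are assumptions made for illustration.

```python
# (Indiv.A, Indiv.B, Corr) copied from Table 3.3.b.
PAIRS = [
    ("TMe_F.1", "TMe_F.1_2nd", 0.9814242),
    ("TMe_IV.3338_2nd", "TMe_IV.3338", 0.9811235),
    ("TMe_IV.3442_2nd", "TMe_IV.3442", 0.980804),
    ("TMe_F.80", "TMe_F.80_2nd", 0.9795043),
    ("TMe_F.7", "TMe_F.7_2nd", 0.9780268),
    ("TMe_IV.928", "TMe_IV.928_2nd", 0.9764166),
    ("TMe.3398_T", "TMe.70_T", 0.9642543),
    ("TMe.3235_T", "TMe.3252_T", 0.9611765),
    ("TMe.3398_T", "TMe.4562_T", 0.9599954),
    ("TMe.4562_T", "TMe.70_T", 0.9594426),
    ("TMe.1230_T", "TMe.1437_T", 0.9574284),
    ("TMe.4593_T", "TMe_IV.84_2nd", 0.9557188),
    ("TMe.3398_T", "TMe.3314_T", 0.9555749),
    ("TMe.3398_T", "TMe.1572_T", 0.9546295),
    ("TMe.3314_T", "TMe.70_T", 0.9531912),
    ("TMe_IV.86_2nd", "TMe_IV.86", 0.9514253),
    ("TMe.1572_T", "TMe.70_T", 0.9508044),
    ("TMe.452_T", "TMe_F.1", 0.9505431),
]


def is_technical_replicate(a, b, suffix="_2nd"):
    """True when one label equals the other plus the replicate suffix."""
    return a == b + suffix or b == a + suffix


def candidate_duplicates(pairs, cutoff):
    """Pairs above the correlation cutoff, excluding technical replicates."""
    return [
        (a, b, corr)
        for a, b, corr in pairs
        if corr > cutoff and not is_technical_replicate(a, b)
    ]


for cutoff in (0.961, 0.95):
    found = candidate_duplicates(PAIRS, cutoff)
    print(f"cutoff > {cutoff}: {len(found)} candidate duplicate pairs")
```

Run on the table as listed, this gives two pairs above 0.961 and 11 pairs above 0.95, in line with the counts reported in the text.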
Figure 3.10.b: Kinship G-matrix reflecting the genetic relationships between cassava individuals. The red vertical dashed line indicates probable duplicate pairs of cassava individuals (based on the average IBD distance of the technical controls) in the cassava collection.