Phenotypic analysis of the TD and ABD
We analyzed the average TD and ABD of the F1 population for two consecutive years. The results (Table 1) showed that the average TD ranged from 4.68 cm to 27.67 cm, with an average of 13.80 cm. The average ABD range was 6.97 cm, with an average of 6.50 cm. The distribution of phenotypic data (Fig. 1) showed that the TD of most individuals in the F1 population were between 8-17 cm and the ABD of most individuals were concentrated between 5-8 cm. The phenotypic data for the TD and ABD showed a near normal distribution. Correlation analysis results showed a significant positive correlation (p=0.05) between the TD and ABD, with a Pearson correlation coefficient of 0.75.
Table 1 Descriptive statistical analysis of the average TD and ABD
Trait
|
Mini
|
Max
|
Average
|
Range
|
Standard deviation
|
Variance
|
Skew
|
Kurtosis
|
Pearson
|
TD (cm)
|
4.68
|
27.67
|
13.80
|
22.99
|
4.77
|
22.74
|
0.52
|
-0.23
|
0.75
|
ABD (cm)
|
3.57
|
10.54
|
6.50
|
6.97
|
1.46
|
2.14
|
0.66
|
0.27
|
Kurtosis and skewness were calculated by R software to fit the type of distribution.
RAD sequencing and SNP identification
We obtained a total of approximately 3.02 billion clean reads from the two parents and 173 F1 progeny, with an average of approximately 1.73 million reads per individual. After filtering, approximately 2.98 billion high-quality (HQ) reads were obtained for subsequent information analysis. Detailed filtering information for each sample sequencing data is summarized in Supplementary Data Table S1. The high-quality reads were aligned to the kiwifruit ‘V3’ reference genome showing an average mapping rate of 97.26% (Fig. 2a). The lowest mapping rate was 93.18%, while the highest mapping rate was 97.73%. Detailed information is shown in Supplementary Data Table S2. A total of 8252556 SNPs were called using GATK [29], of which 8023251 (97.22%) SNPs were located in LG0-LG28 (Fig. 2b), and the remaining of SNPs were in the unanchored contigs.
After genotyping, 1359886 of 8252556 SNPs were classified into four categories: ‘aa×bb’(326102 SNPs), ‘hk×hk’ (164520 SNPs), ‘lm×ll’ (388996 SNPs) and ‘nn×np’ (480268 SNPs). These SNPs including those that did not meet the requirements of sequencing depth, failed genotyping, or showed no polymorphism between the parents were filtered out. These SNPs were not involved in the subsequent analysis. SNP filtering results showed that the missing rate of 335421 SNPs was above 10%, the genotype of 326102 SNPs (‘aa×bb’) was not suitable for the F1 population, and 393596 SNPs were skewed (p<0.001) in the chi-square test. Finally, 304767 high-quality SNPs were retained for bin markers screening (Table 2).
Table 2 Statistical table of SNP filtering results
Filter
|
Number
|
Number of SNPs with missing rate above 10%
|
335421
|
Number of SNPs with heterozygosity exceeds 75%
|
0
|
Genotype
|
326102
|
Chis-q
|
393596
|
Number of SNPs retained
|
304767
|
Total
|
1359886
|
Genotype: number of SNPs for which the genotype is not suitable for the CP population. Chis-q: number of SNPs with skewness (p<0.001) in the chi-square test.
Construction and evaluation of a high-quality linkage map
Approximately three hundred thousand SNPs were mapped to 29 linkage groups, LG0-LG28 (Table 3), according to the physical locations of the markers in the kiwifruit ‘V3’ reference genome. The number of markers on LG6 was highest (14729 SNPs), while the number of markers on LG28 was the lowest (6956 SNPs). A high-density linkage map was constructed using Lep-MAP3 [30] (Fig. 3). SNPs with the same genetic distance were combined into bin markers, for a total of 6506 bin markers composed of 2179 (33.49%) ‘lm×ll’, 3989 (61.31%) ‘nn×np’ and 338 (5.20%) ‘hk×hk’. The number of bin markers in each linkage group was found to be between 158 (LG28) and 318 (LG3); the average number of markers is 224. This bin map spans 2548.97 cM, with an average interval of 0.39 cM between adjacent markers. The length of each linkage group was between 53.48 cM (LG28)-127.78 cM (LG7), with an average genetic distance between markers of 0.34 cM (LG22)-0.55 cM (LG7).
Table 3 Statistics of the linkage map of the diploid A. chinensis population
Linkage group
|
Number of bin markers
|
Number of SNPs
|
Length (cM)
|
Average interval
(cM)
|
LG0
|
290
|
13295
|
102.62
|
0.35
|
LG1
|
175
|
14603
|
74.74
|
0.43
|
LG2
|
240
|
12586
|
107.05
|
0.45
|
LG3
|
318
|
12772
|
111.29
|
0.35
|
LG4
|
164
|
9887
|
69.56
|
0.42
|
LG5
|
239
|
11650
|
99.85
|
0.42
|
LG6
|
232
|
14729
|
85.33
|
0.37
|
LG7
|
231
|
10012
|
127.78
|
0.55
|
LG8
|
259
|
10033
|
96.90
|
0.37
|
LG9
|
278
|
11104
|
109.8
|
0.40
|
LG10
|
226
|
11427
|
78.62
|
0.35
|
LG11
|
199
|
10252
|
79.53
|
0.40
|
LG12
|
247
|
9876
|
95.32
|
0.39
|
LG13
|
196
|
11244
|
68.25
|
0.35
|
LG14
|
196
|
12061
|
72.30
|
0.37
|
LG15
|
286
|
11878
|
110.54
|
0.39
|
LG16
|
188
|
8233
|
82.94
|
0.44
|
LG17
|
192
|
8883
|
85.61
|
0.45
|
LG18
|
246
|
7762
|
91.93
|
0.37
|
LG19
|
214
|
9895
|
84.60
|
0.40
|
LG20
|
247
|
10195
|
88.16
|
0.36
|
LG21
|
225
|
9121
|
95.65
|
0.43
|
LG22
|
244
|
9758
|
82.09
|
0.34
|
LG23
|
196
|
9518
|
78.24
|
0.40
|
LG24
|
197
|
7332
|
88.08
|
0.45
|
LG25
|
217
|
8603
|
84.81
|
0.39
|
LG26
|
206
|
8756
|
72.26
|
0.35
|
LG27
|
200
|
7264
|
71.69
|
0.36
|
LG28
|
158
|
6956
|
53.48
|
0.34
|
Total
|
6506
|
299685
|
2548.97
|
0.39
|
The statistical results for the bin map (Fig. 4a) showed a total of 20 for gap5. The maximum gap appeared in LG24 (11.43 cM). The largest number of gap5 appeared in LG7. From the genotype legends of bin markers for each linkage group (Fig. 4b), we found most of the genotypes of bin markers were ‘ac’ genotype, ‘ad’ genotype, ‘bc’ genotype and ‘bd’ genotype. The number of bin markers of unknown genotypes (gray) was small. The logarithm of odds (LOD) matrix heatmap of linkage strength between two markers (Fig. 4c) showed that the linkage strength between markers within the linkage group was very high (from blue to red, the LOD value gradually increased, and the linkage strength between markers increased). The above evaluation results showed that this linkage map of the diploid A. chinensis has high quality.
Quantitative trait locus analysis of growth traits
The frequency distribution of the diameter of trunk and annual branch was comparable for 2 years and showed a continuous variation, which is typical for quantitative traits (Fig. 1). Composition interval mapping (CIM) was used to map QTLs associated with the TD and ABD in the present study. A total of 5 TD QTLs (Fig. 5a) and 6 ABD QTLs (Fig. 5b) were identified at the LOD threshold of 3.0. The TD QTLs were identified on LG2, LG8, LG14, LG23 and LG28. The highest LOD score was calculated for LG23 (4.512), the phenotype variance explained (PVE) was 13.424%. ABD QTLs were identified on LG8, LG9, LG10 and LG28, with the highest LOD score for LG10 (4.323, PVE 5.954%). The TD QTLs explained 3.77%-13.42% of the phenotypic variance. There were 2 TD QTLs with PVE less than 5%. The ABD QTLs explained 1.58%-5.95% of the phenotypic variance. There were 4 ABD QTLs with PVE less than 5%. All TD QTLs range from 9.83 cM (LG28_2357790-LG28_3902177) to 20.23 cM (LG5_2686541-LG5_14984017). All ABD QTLs range from 6.07 cM (LG10_7542-LG10_1452090) to 20.24 cM (LG8_4711164-LG8_13738769) (Table 4). The QTLs are unevenly distributed across 8 linkage groups. Some linkage groups contain QTLs for TD and ABD traits. We observed that both traits mapped to LG8 and LG28.
Table 4 Summary of QTL mapping intervals for the TD and ABD
Trait
|
LG
|
LOD
|
PVE%
|
Peak.marker
|
Peak.cM
|
Low.cM-Up.cM
|
TD
|
LG2
|
3.54
|
4.22
|
LG2_11604226
|
78.44
|
68.33-88.56
|
LG8
|
3.88
|
5.56
|
LG8_2042468
|
11.27
|
7.80-18.21
|
LG14
|
3.34
|
3.77
|
LG14_7901868
|
43.65
|
33.53-50.01
|
LG23
|
4.51
|
13.42
|
LG23_2055047
|
8.96
|
3.76-14.16
|
LG28
|
3.58
|
7.43
|
LG28.loc20
|
20.00
|
15.32-25.15
|
ABD
|
LG8
|
3.54
|
1.58
|
LG8_7253715
|
45.67
|
35.56-55.79
|
LG9
|
3.30
|
5.13
|
LG9_16493357
|
76.24
|
70.45-80.57
|
LG10
|
4.32
|
5.95
|
LG10_584605
|
0.58
|
0-6.07
|
LG10
|
3.89
|
3.63
|
LG10_822211
|
2.02
|
0-6.07
|
LG28
|
3.34
|
4.04
|
LG28_3714611
|
23.42
|
18.21-28.62
|
LG28
|
3.00
|
4.64
|
LG28_7724040
|
49.43
|
43.36-53.48
|
PVE: the phenotype variance explained (%). Peak.marker: marker at the LOD peak (virtual marker may exist). Peak.cM: the genetic distance of Peak.marker. Low.cM: the genetic distance at the leftmost marker of the QTLs. Up.cM: the genetic distance at the rightmost marker of the QTLs.
Enrichment analysis of candidate genes
According to the QTL mapping results, we extracted genes with annotations from the QTLs identified with PVE more than 5%. The candidate genes associated with the growth traits were found by using the physical location of the kiwifruit reference genome corresponding to the bin markers (Supplementary Data Table S3). A total of 358 TD candidate genes were identified from the LG8 (91), LG23 (137) and LG28 (130). A total of 181 ABD candidate genes were identified from the LG9 (106) and LG10 (75).
Among the TD candidate genes, 169 (47.21%) were enriched in cellular components, 168 (46.93%) in molecular functions, and 202 (56.42%) in biological processes; 118 (32.96%) have no corresponding GO term. Among the ABD candidate genes, 88 (48.62%) were enriched in cellular components, 84 (46.41%) in molecular functions, and 91 (50.28%) in biological processes; 66 (36.46%) have no corresponding GO term. Since a gene often has multiple different functions, the same gene will appear under different classification entries. The level 2 GO term classification of the TD candidate genes showed most genes to be enriched in metabolic process (164), cellular process (157) and single-organism process (129). In the cellular component ontology, most candidate genes were enriched in cell (138), cell part (137) and organelle (120). In the molecular function category, most candidate genes have functions such as catalytic activity (114) and binding (102) (Fig. 6a). In the level 2 GO classification of the ABD candidate genes, the candidate genes enriched in biological processes are mainly involved in cellular processes (71) and metabolic processes (70). In the cellular component category, most of the ABD candidate genes are associated with the cell part (82), cell (82) and organelle (70). In the molecular function category, the ABD candidate genes were mainly enriched in binding (61) and catalytic activity (54) (Fig. 6b). Detailed summary information for the enrichment analysis of candidate genes for each trait is shown in Supplementary Data Tables S4 (TD) and S5 (ABD).
Through KEGG pathway annotation, the TD candidate genes were divided into 5 A-level pathways and 11 B-level pathways. These genes are involved in pathways of metabolism, genetic information processing, environmental information processing, cellular processes and organismal systems (Fig. 7a). The ABD candidate genes were divided into 2 A-level pathways (metabolism and genetic information processing) and 7 B-level pathways (Fig. 7b). A total of 31 TD candidate genes (8.66%) and 16 ABD candidate genes (8.84%) are annotated in the KEGG database. The number of candidate genes associated with genetic information processing was the largest. In the B-level pathways of genetic information processing, most TD candidate genes are related to translation, and most ABD candidate genes are related to folding, sorting and degradation. The number of candidate genes in other pathways is shown in Fig. 7. Detailed KEGG pathway information and KO_ID of each candidate gene are shown in Supplementary Data Tables S4 (TD) and S5 (ABD). From the KEGG pathway annotation, we conclude that the candidate genes related to growth traits are mainly involved in metabolic pathways and genetic information processing pathways.
Development of SNP markers for the TD and ABD
Thirty-one SNPs were extracted from among the TD QTL (LG23), and 16 SNPs were extracted from among the ABD QTL (LG10). There were 17 SNPs with significant differences related to the TD. We designed a pair of primers that amplified the fragment containing the targeted SNP of LG23_1971916 (Supplementary Data Tables S6). In the ‘Hongyang’בBiyu’♂ population, the average TD of the homozygous ‘AA’ genotype was 15.03 cm in the 80 F1 progeny. The average TD of the heterozygous ‘AG’ genotype was 12.21 cm in the 93 F1 progeny. The average TD between the phenotypes corresponding to the two genotypes was 2.82 cm, with significantly difference at P<0.001 (Fig. 8a). We also evaluated the marker in another diploid A. chinensis population. The average TD of the homozygous ‘AA’ genotype was 13.46 cm in the 33 F1 progeny, while the average TD of the heterozygous ‘AG’ genotype was 9.75 cm in the 28 F1 progeny. There was a significant (P<0.001) difference between genotypes (Fig. 8b). Individuals homozygous ‘AA’ had significantly higher TD phenotypes than those heterozygous ‘AG’. In addition, we found the TD marker (LG23_1971916) locating in the candidate gene Actinidia22292 (LG23_1968139-LG23_1975116) with annotation as protein polar localization during asymmetric division and redistribution like.
There were 3 SNPs with significant differences in the ABD QTLs. We designed a pair of primers that amplified the fragment containing the targeted SNP of LG10_584605 (Supplementary Data Tables S6). In the ‘Hongyang’בBiyu’♂ population, a significant difference (P<0.01) was observed between genotype ‘AA’ (mean ABD of 7.30 cm) and ‘AC’ (mean ABD of 5.70 cm) through Sanger sequencing (Fig. 8a). In another diploid A. chinensis population, a significant difference (P<0.01) was observed between genotype ‘AA’ (mean ABD of 6.65 cm) and ‘AC’ (mean ABD of 4.89 cm) (Fig. 8b). Individuals homozygous ‘AA’ had significantly higher ABD phenotypes than those heterozygous ‘AC’. The ABD marker (LG10_584605) was located in the candidate gene Actinidia40478 (LG10_583296-LG10_585806) with annotation as a EIN3-bingding F-box protein.