Phenotypic analysis of the quality-related traits
Nine quality-related traits, i.e., the contents of oil, protein, palmitic acid, stearic acid, arachidic acid, behenic acid, oleic acid, linoleic acid, and arachidonic acid, were assessed in four environments (Zhengzhou, 2018 and 2019; Shangqiu and Weifang, 2019) on the male parental line P1 (W1202), the female parental line P2 (Yuhua15), and 329 segregating RILs. ANOVA indicated that genotype had a significant effect on all the traits (Table 1). P1 exhibited higher contents of oleic, behenic, and arachidonic acids, whereas P2 exhibited higher contents of oil, protein, palmitic acid, stearic acid, linoleic acid and arachidic acid (Table 1). For all traits and environments, wide phenotypic variation and transgressive segregation were observed in the RIL population (Table 1 and Fig. 1). The coefficient of variation (CV) ranged from 4.16% for oil content to 20.65% for arachidonic acid content, while broad-sense heritability ranged from 0.74 for linoleic acid content to 0.91 for behenic acid content (Table 1).
Table 1
Basic statistics and genetic heritability of nine quality traits
Trait | P1 (%) | P2 (%) | Range (%) | Mean (%) | SD | CV (%) | Skewness | Kurtosis | Sig | H2 |
Oil | 51.88 ~ 54.29 | 53.80 ~ 55.43 | 45.15 ~ 59.80 | 52.97 | 2.20 | 4.16 | -0.08 | -0.20 | *** | 0.90 |
Protein | 22.97 ~ 23.59 | 22.70 ~ 24.13 | 18.15 ~ 28.96 | 23.78 | 1.56 | 6.57 | -0.06 | 0.02 | *** | 0.86 |
Palmitic acid | 11.62 ~ 12.85 | 12.10 ~ 12.83 | 8.3 ~ 14.75 | 12.42 | 0.73 | 5.86 | -0.35 | 2.13 | *** | 0.83 |
Stearic acid | 3.15 ~ 4.22 | 4.12 ~ 4.44 | 1.03 ~ 5.84 | 3.88 | 0.62 | 15.86 | -0.20 | 0.31 | *** | 0.89 |
Oleic acid | 37.94 ~ 44.54 | 37.03 ~ 40.85 | 27.87 ~ 57.77 | 40.64 | 4.06 | 9.99 | 0.32 | 1.22 | *** | 0.78 |
Linoleic acid | 34.55 ~ 39.40 | 36.62 ~ 41.17 | 21 ~ 51.15 | 37.35 | 3.68 | 9.86 | -0.20 | 0.83 | *** | 0.74 |
Arachidic acid | 1.38 ~ 1.60 | 1.50 ~ 1.67 | 0.96 ~ 1.93 | 1.48 | 0.14 | 9.40 | -0.08 | 0.13 | *** | 0.89 |
Behenic acid | 2.13 ~ 2.28 | 2.01 ~ 2.27 | 1.55 ~ 2.88 | 2.12 | 0.18 | 8.58 | 0.11 | -0.20 | *** | 0.91 |
Arachidonic acid | 0.65 ~ 0.93 | 0.52 ~ 0.69 | 0.19 ~ 1.41 | 0.72 | 0.15 | 20.65 | 0.32 | 1.11 | *** | 0.88 |
P1 male parent; P2 female parent; SD standard deviation; CV coefficient of variation; Sig significance; H2 heritability per mean
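The CV column in Table 1 is the standard deviation expressed as a percentage of the mean. The following minimal sketch recomputes it for three traits from the rounded means and SDs printed in the table; because those inputs are already rounded, the recomputed values can differ from the published ones in the last digits. The function and variable names are illustrative and are not taken from the study's analysis pipeline.

```python
def coefficient_of_variation(sd: float, mean: float) -> float:
    """Return the coefficient of variation as a percentage (SD / mean * 100)."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return sd / mean * 100.0


# (mean, SD) pairs copied from Table 1; both are rounded to two decimals there.
table1 = {
    "Oil": (52.97, 2.20),
    "Protein": (23.78, 1.56),
    "Arachidonic acid": (0.72, 0.15),
}

cv = {
    trait: round(coefficient_of_variation(sd, mean), 2)
    for trait, (mean, sd) in table1.items()
}

for trait, value in cv.items():
    print(f"{trait}: CV = {value}%")
```

The ranking of traits by CV is preserved: oil content is the least variable of the three and arachidonic acid content the most variable.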
Oil content was negatively correlated with protein (-0.79), oleic acid (-0.39) and arachidonic acid (-0.53) contents and positively correlated with palmitic acid (0.40), stearic acid (0.75), linoleic acid (0.12), arachidic acid (0.87), and behenic acid (0.85) contents (Table 2). In addition, negative correlations were observed between protein content and behenic acid (-0.83), arachidic acid (-0.62) and stearic acid (-0.33) contents. Among fatty acids, a strong negative correlation was observed between oleic and linoleic acid (-0.91), as well as between stearic and arachidonic acid (-0.87), whereas a strong positive correlation was observed between stearic and arachidic acid (0.86).
Table 2
Pairwise correlation among the contents of oil, protein, and seven fatty acids
| Oil | Protein | Palmitic acid | Stearic acid | Oleic acid | Linoleic acid | Arachidic acid | Behenic acid |
Protein | -0.79*** | | | | | | | |
Palmitic acid | 0.40*** | -0.06 | | | | | | |
Stearic acid | 0.75*** | -0.33*** | 0.78*** | | | | | |
Oleic acid | -0.39*** | 0.21*** | -0.91*** | -0.64*** | | | | |
Linoleic acid | 0.12* | 0.01 | 0.78*** | 0.41*** | -0.91*** | | | |
Arachidic acid | 0.87*** | -0.62*** | 0.53*** | 0.86*** | -0.50*** | 0.18** | | |
Behenic acid | 0.85*** | -0.83*** | 0.28*** | 0.53*** | -0.30*** | -0.02 | 0.77*** | |
Arachidonic acid | -0.53*** | 0.08 | -0.74*** | -0.87*** | 0.66*** | -0.50*** | -0.78*** | -0.27*** |
*, **, and *** denote significance levels of 0.05, 0.01 and 0.001, respectively.
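Each entry in Table 2 is a pairwise correlation coefficient annotated with significance stars. The sketch below shows how a Pearson coefficient is computed and how a p-value maps to the star notation of the table footnote. It assumes Pearson's r (the source does not name the coefficient), uses strict inequalities at the thresholds, and runs on made-up trait values rather than the RIL data.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("samples must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def stars(p_value: float) -> str:
    """Map a p-value to the star notation used in the Table 2 footnote."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical oil and protein contents (%) for four lines, for illustration only.
oil = [50.0, 52.0, 54.0, 56.0]
protein = [26.0, 25.0, 24.0, 23.0]
r = pearson_r(oil, protein)
print(f"r = {r:.2f}")
```

In this toy example the two traits are perfectly inversely related, so r is -1; the observed oil and protein correlation in the RIL population is weaker (-0.79).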
SNP and bin marker discovery through whole-genome sequencing
Whole-genome resequencing of the two parental lines and 329 RILs generated approximately 700 Gb of clean data (9.10 billion reads). For each sample, the rate of mapped reads and the rate of uniquely mapped reads were over 96% and 73%, respectively. The effective sequencing depths were 34.42× and 34.58× for P1 and P2, respectively, and ranged from 1.20× to 1.40× for the RIL population (Table S1). The coverage rate was 99.1% for P1 and 98.49% for P2 and ranged from 52.03% to 63.99% for the RIL population (Table S1). All the clean sequence data obtained in this study are available from the NCBI database under Sequence Read Archive (SRA) submission SUB8691701. Following alignment and application of the GATK protocol, 741,564 SNPs were obtained. Further filtering yielded 213,868 SNPs that were homozygous in, and polymorphic between, the two parents; these were used to identify bin markers.
Construction of a high-density genetic linkage map
As the RIL population was sequenced at low depth, the SNP dataset was converted into bin markers using a sliding window approach [23]. In total, 7595 bin markers were detected, and eleven lines with a heterozygosity rate exceeding 10% were removed from further analysis (Table S2). After redundant markers were filtered out, 4565 bin markers were used to construct the linkage map. Four bin markers remained unlinked, whereas the remaining 4561 were assigned to 20 linkage groups (LGs) (Fig. 2 and Table 3). The total map length was 2032.39 cM, giving an average marker density of 0.45 cM per marker (Table 3). The number of markers per LG ranged from 173 (LG11) to 323 (LG13), the LG length varied from 77.50 cM (LG20) to 170.15 cM (LG06), and the average marker density ranged from 0.37 cM (LG05 and LG15) to 0.59 cM (LG06) (Table 3). The maximum marker interval was 13.41 cM on LG06, while more than 90% of marker intervals genome-wide were no larger than 1 cM (Table 3).
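The idea behind sliding-window bin calling can be sketched as follows. At low depth, individual SNP calls are noisy, so the genotype is called from the majority allele within a window of consecutive SNPs, and runs of identical window calls are then collapsed into segments whose boundaries mark recombination breakpoints. The window size (5), the majority threshold (4 of 5) and the allele encoding are illustrative assumptions, not the parameters used in the study or in [23].

```python
from itertools import groupby


def window_calls(snps: str, window: int = 5, min_major: int = 4) -> list[str]:
    """Call one genotype per window position.

    snps: per-SNP parental alleles ordered by position,
          'A' for the P1 allele and 'B' for the P2 allele.
    Returns 'A' or 'B' when one allele reaches min_major within the window,
    otherwise 'H' (mixed window: heterozygous region or breakpoint).
    """
    calls = []
    for start in range(len(snps) - window + 1):
        chunk = snps[start:start + window]
        if chunk.count("A") >= min_major:
            calls.append("A")
        elif chunk.count("B") >= min_major:
            calls.append("B")
        else:
            calls.append("H")
    return calls


def segments(calls: list[str]) -> list[tuple[str, int]]:
    """Collapse consecutive identical calls into (genotype, run_length) pairs."""
    return [(genotype, len(list(run))) for genotype, run in groupby(calls)]


# Hypothetical RIL: a P1 block containing one miscalled SNP, then a P2 block.
ril = "AAAAABAAAABBBBBBBB"
result = segments(window_calls(ril))
print(result)
```

The isolated 'B' inside the P1 block is absorbed by the surrounding windows, while the real transition to the P2 block appears as a short mixed stretch between the two parental segments.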
Table 3
Summary of the high-density linkage groups (LGs) obtained for the RIL population
ID | Number of Markers | Map Length (cM) | Marker Density (cM) | Max Interval (cM) | Ratio of marker intervals ≤ 1 cM |
LG01 | 233 | 97.08 | 0.42 | 1.96 | 93.13% |
LG02 | 206 | 103.91 | 0.50 | 3.03 | 89.81% |
LG03 | 319 | 148.74 | 0.47 | 6.62 | 93.73% |
LG04 | 198 | 100.87 | 0.51 | 7.84 | 89.39% |
LG05 | 228 | 83.94 | 0.37 | 2.13 | 96.05% |
LG06 | 290 | 170.15 | 0.59 | 13.41 | 91.38% |
LG07 | 189 | 85.32 | 0.45 | 2.65 | 89.42% |
LG08 | 260 | 102.03 | 0.39 | 2.48 | 94.62% |
LG09 | 197 | 94.22 | 0.48 | 2.48 | 88.83% |
LG10 | 189 | 86.80 | 0.46 | 3.91 | 89.42% |
LG11 | 173 | 95.53 | 0.55 | 3.36 | 86.13% |
LG12 | 212 | 106.07 | 0.50 | 5.62 | 90.09% |
LG13 | 323 | 125.53 | 0.39 | 3.37 | 94.74% |
LG14 | 235 | 92.18 | 0.39 | 1.96 | 93.19% |
LG15 | 260 | 95.67 | 0.37 | 8.32 | 94.62% |
LG16 | 189 | 79.55 | 0.42 | 3.91 | 89.95% |
LG17 | 232 | 99.30 | 0.43 | 2.31 | 90.95% |
LG18 | 197 | 91.59 | 0.46 | 3.72 | 90.86% |
LG19 | 237 | 96.40 | 0.41 | 2.30 | 91.98% |
LG20 | 194 | 77.50 | 0.40 | 2.65 | 90.72% |
Whole Genome | 4561 | 2032.39 | 0.45 | | 91.45% |
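The marker density column in Table 3 is consistent with map length divided by the number of markers. The short check below recomputes it for a few rows copied from the table; the function name is illustrative.

```python
def marker_density(n_markers: int, length_cm: float) -> float:
    """Average map distance per marker (cM), as map length / marker count."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return length_cm / n_markers


# (number of markers, map length in cM) copied from Table 3.
rows = {
    "LG02": (206, 103.91),
    "LG05": (228, 83.94),
    "LG06": (290, 170.15),
    "Whole genome": (4561, 2032.39),
}

density = {name: round(marker_density(n, length), 2) for name, (n, length) in rows.items()}

for name, value in density.items():
    print(f"{name}: {value} cM per marker")
```

Dividing by the number of intervals (markers minus one) instead would give 0.51 cM for LG02 rather than the tabulated 0.50 cM, which suggests that the table divides by the marker count.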
Identification of QTLs for peanut quality-related traits
Using a LOD threshold of 3.3, 109 QTLs were identified for the nine quality-related traits under investigation. The QTLs were distributed across all LGs except LG15 and LG19 (Tables 4 and 5 and Table S3).
Table 4
QTLs identified on LG05 for oil, protein and fatty acid content in four environments
QTL | Position | Marker Interval | LOD | PVE (%) | Trait (Environment; Additive Effect) |
qA05.1 | 0.0 ~ 0.5 | bin1572 ~ bin1581 | 4.77 ~ 28.44 | 0.76 ~ 26.99 | Oil (2018ZZ, 2019ZZ, 2019SQ, 2019WF; -0.88~-0.65); Protein (2018ZZ, 2019ZZ, 2019WF; 0.25 ~ 0.28); Palmitic (2018ZZ, 2019ZZ, 2019SQ, 2019WF; -0.22~-0.13); Stearic (2018ZZ, 2019ZZ, 2019SQ, 2019WF; -0.25~-0.19); Oleic (2018ZZ, 2019ZZ; 0.84 ~ 0.95); Arachidic (2018ZZ, 2019ZZ, 2019SQ, 2019WF; -0.05~-0.04); Behenic (2018ZZ, 2019ZZ, 2019SQ, 2019WF; -0.07~-0.05); Arachidonic (2018ZZ, 2019ZZ, 2019SQ; 0.04) |
qA05.2 | 6.0 | bin1593 ~ bin1594 | 52.35 | 10.39 | Linoleic (2018ZZ; 1.832) |
qA05.3 | 6.7 | bin1598 ~ bin1600 | 70.92 | 16.43 | Linoleic (2018ZZ; -2.30) |
qA05.4 | 7.3 | bin1601 ~ bin1602 | 8.25 | 5.41 | Oleic (2019WF; 0.96) |
qA05.5 | 10.2 | bin1611 ~ bin1612 | 5.53 ~ 9.38 | 5.37 ~ 11.27 | Stearic (2019WF; -0.13); Arachidonic (2019WF; 0.05) |
qA05.6 | 42.9 | bin1790 ~ bin1791 | 3.60 | 2.72 | Protein (2019WF; 0.21) |
qA05.7 | 46.3 | bin1798 ~ bin1800 | 3.31 | 2.42 | Behenic (2019SQ; -0.03) |
qA05.8 | 48.2 ~ 48.6 | bin1802 ~ bin1803 | 7.64 ~ 9.77 | 5.23 ~ 9.84 | Oil (2019SQ; -0.48); Protein (2019SQ; 0.42) |
qA05.9 | 51.7 ~ 52.7 | bin1808 ~ bin1812 | 4.37 ~ 16.16 | 2.83 ~ 8.68 | Protein (2019ZZ; 0.22); Palmitic (2019SQ; -0.25); Stearic (2019SQ; -0.19); Oleic (2019SQ; 1.25); Arachidonic (2019SQ; 0.05) |
qA05.10 | 55.3 ~ 56.8 | bin1817 ~ bin1824 | 5.05 ~ 39.85 | 0.29 ~ 15.18 | Palmitic (2018ZZ, 2019ZZ, 2019SQ, 2019WF; 0.15 ~ 0.42); Stearic (2018ZZ, 2019SQ, 2019WF; 0.09 ~ 0.27); Oleic (2018ZZ, 2019SQ, 2019WF; -2.08~-0.73); Linoleic (2018ZZ, 2019SQ, 2019WF; 0.77 ~ 1.02); Arachidonic (2018ZZ, 2019SQ; -0.07~-0.03) |
qA05.11 | 66.4 | bin1852 ~ bin1853 | 4.92 | 3.14 | Oleic (2019WF; 0.73) |
qA05.12 | 69.2 | bin1860 ~ bin1861 | 7.24 | 1.01 | Linoleic (2018ZZ; -0.57) |
ZZ represents Zhengzhou, SQ represents Shangqiu and WF represents Weifang.
Table 5
QTLs identified on LG08, LG12 and LG14 for oil, protein and fatty acid content in four environments
QTL | Position | Marker Interval | LOD | PVE (%) | Trait (Environment; Additive Effect) |
qA08.1 | 28.6 | bin2648 ~ bin2649 | 5.76 | 0.79 | Linoleic (2018ZZ; 0.51) |
qA08.2 | 37.7 ~ 37.9 | bin2711 ~ bin2718 | 4.40 ~ 4.83 | 3.24 ~ 4.08 | Arachidic (2019SQ; -0.03); Behenic (2019SQ; -0.03) |
qA08.3 | 59.3 | bin2780 ~ bin2781 | 4.13 | 3.52 | Arachidic (2018ZZ; -0.02) |
qA08.4 | 59.8 ~ 60.5 | bin2782 ~ bin2787 | 6.30 ~ 14.67 | 3.88 ~ 12.58 | Oil (2018ZZ, 2019ZZ, 2019WF; -0.64~-0.42); Protein (2018ZZ, 2019WF; 0.33 ~ 0.37); Behenic (2018ZZ, 2019WF; -0.04~-0.03) |
qA08.5 | 62.0 ~ 62.4 | bin2788 ~ bin2789 | 5.70 ~ 11.27 | 5.59 ~ 9.59 | Oil (2019SQ; -0.52); Protein (2019ZZ, 2019SQ; 0.32 ~ 0.42); Arachidic (2019ZZ; -0.03); Behenic (2019ZZ; -0.05) |
qA08.6 | 76.8 | bin2831 ~ bin2832 | 4.04 | 3.35 | Arachidic (2019WF; -0.03) |
qA08.7 | 97.9 ~ 98.0 | bin2884 ~ bin2885 | 5.57 ~ 6.00 | 4.22 ~ 4.62 | Protein (2018ZZ, 2019WF; 0.26 ~ 0.28) |
qA08.8 | 101.9 | bin2896 ~ bin2898 | 6.10 | 4.00 | Protein (2019ZZ; 0.27) |
qA12.1 | 36.3 ~ 37.0 | bin4060 ~ bin4061 | 4.79 ~ 5.69 | 3.00 ~ 4.11 | Oil (2018ZZ; 0.37); Behenic (2018ZZ; 0.03) |
qA12.2 | 39.7 | bin4065 ~ bin4066 | 7.02 | 5.45 | Protein (2019WF; -0.30) |
qA12.3 | 40.6 | bin4067 ~ bin4068 | 4.60 | 3.49 | Protein (2018ZZ; -0.24) |
qA12.4 | 42.7 | bin4071 ~ bin4072 | 4.09 | 3.10 | Behenic (2019ZZ; 0.03) |
qA12.5 | 43.9 | bin4074 ~ bin4075 | 3.57 | 3.42 | Stearic (2019WF; 0.11) |
qA12.6 | 45.0 | bin4076 ~ bin4077 | 3.78 ~ 5.23 | 1.34 ~ 4.93 | Oleic (2019SQ; -0.70); Linoleic (2019SQ; 0.73) |
qA12.7 | 46.5 ~ 47.2 | bin4078 ~ bin4079 | 3.96 ~ 9.72 | 2.21 ~ 6.61 | Oil (2019ZZ, 2019SQ; 0.38 ~ 0.48); Protein (2019ZZ, 2019SQ; -0.33~-0.34); Stearic (2019SQ; 0.10); Arachidic (2019SQ; 0.02) |
qA12.8 | 51.4 | bin4092 ~ bin4096 | 7.20 | 5.41 | Behenic (2019SQ; 0.04) |
qA12.9 | 54.6 | bin4111 ~ bin4112 | 4.14 ~ 6.80 | 3.13 ~ 5.13 | Oil (2019WF; 0.46); Behenic (2019WF; 0.03) |
qA12.10 | 105.7 | bin4491 ~ bin4492 | 4.55 | 2.06 | Stearic (2019SQ; 0.10) |
qA14.1 | 28.0 | bin4971 ~ bin4972 | 3.53 | 2.24 | Oleic (2019WF; -0.62) |
qA14.2 | 32.3 | bin4985 ~ bin4987 | 4.67 | 4.50 | Stearic (2019WF; 0.12) |
qA14.3 | 38.5 | bin5019 ~ bin5020 | 4.23 | 4.62 | Linoleic (2019WF; 0.68) |
qA14.4 | 39.8 | bin5059 ~ bin5069 | 5.64 | 3.99 | Linoleic (2019ZZ; 0.69) |
qA14.5 | 40.3 | bin5139 ~ bin5149 | 4.26 ~ 5.98 | 3.76 ~ 3.82 | Stearic (2019ZZ; -0.10); Arachidic (2019ZZ; -0.03) |
qA14.6 | 41.2 ~ 41.5 | bin5338 ~ bin5404 | 4.21 ~ 13.74 | 2.00 ~ 4.91 | Oil (2019ZZ; -0.38); Oleic (2018ZZ; -0.65); Linoleic (2018ZZ; 0.82); Behenic (2018ZZ, 2019ZZ, 2019WF; -0.04~-0.03) |
qA14.7 | 42.2 | bin5413 ~ bin5416 | 10.72 | 7.41 | Oil (2019SQ; -0.58) |
qA14.8 | 42.5 ~ 42.6 | bin5417 ~ bin5420 | 4.08 ~ 7.66 | 1.84 ~ 6.61 | Stearic (2019SQ; -0.10); Arachidic (2019SQ; -0.03); Arachidonic (2019SQ; 0.03) |
qA14.9 | 43.1 ~ 43.4 | bin5428 ~ bin5438 | 4.17 ~ 8.30 | 3.28 ~ 8.21 | Oil (2018ZZ, 2019WF; -0.45~-0.39); Stearic (2018ZZ, 2019WF; -0.17~-0.09); Arachidic (2018ZZ, 2019WF; -0.03) |
qA14.10 | 51.2 | bin5511 ~ bin5512 | 7.34 | 5.52 | Behenic (2019SQ; -0.04) |
ZZ represents Zhengzhou, SQ represents Shangqiu and WF represents Weifang.
Twelve QTLs were mapped on LG05 (Table 4). Among these, qA05.1 covered a region of 0.5 cM and was associated with all traits except linoleic acid content (Table 4). qA05.1 showed a negative additive effect on five traits (oil, palmitic acid, stearic acid, arachidic acid, and behenic acid content) in all four environments, and a positive additive effect on protein, oleic acid and arachidonic acid content in two or three environments. The qA05.1 region flanked by the markers bin1572 and bin1573 had a considerable effect on the traits, with PVE values of approximately 10.44% to 26.99% and LOD scores of 10.42 to 28.44 for oil, stearic acid, arachidic acid and behenic acid content. Another major QTL on LG05, qA05.10, covered a region of 1.5 cM and had a pleiotropic effect on the content of five fatty acids (all except arachidic and behenic acid) (Table 4). The LOD score associated with qA05.10 varied from 5.05 to 39.85, and the PVE values ranged from approximately 0.29% to 15.18%.
Several QTLs were mapped on LG08, LG12 and LG14 (Table 5). On LG08, a region of 2.6 cM covered by qA08.4 and qA08.5 was associated with oil, protein and behenic acid content, with LOD scores of approximately 5.70 to 14.67 and PVE values of approximately 3.88% to 12.58%. The associations with oil and protein content were detected in all four environments, whereas the association with behenic acid content was detected in three environments (Table 5). A large genomic region containing several QTLs with minor phenotypic effects was identified on LG12 (Table 5). In particular, QTLs for oil, protein, and behenic acid content that together covered all four environments were detected in regions spanning 18.30 cM, 7.40 cM, and 17.60 cM, respectively. On LG14, the QTLs from qA14.5 to qA14.9, located in the interval between 40.3 cM and 43.4 cM, were detected in four environments for oil, stearic acid and arachidic acid content and in three environments for behenic acid content (Table 5).
Among the 69 QTLs mapped on LGs other than LG05, LG08, LG12 and LG14, some were pleiotropic for several traits and showed consistent effects in more than one environment (Table S3). In particular, a region of 3.4 cM covered by qA06.3 and qA06.4 was associated with protein content in three environments, with LOD scores of approximately 15.40 to 18.81 and PVE values of approximately 10.84% to 15.78%. qA06.4 was also associated with behenic acid content in all four environments, with LOD scores of approximately 13.92 to 24.79 and PVE values of approximately 11.01% to 17.55%. qA06.6 showed a major effect on arachidic acid content, with a PVE value of 10.03% and a LOD score of 11.73.
Annotation of genes and validation of the SNPs in the QTL intervals
The genes in the intervals of qA05.1, qA05.9 and qA05.10 on LG05, qA06.3 and qA06.4 on LG06, qA08.4 and qA08.5 on LG08, qA12.1 to qA12.7 on LG12, and qA14.5, qA14.6, qA14.7, and qA14.8 on LG14 were extracted and screened for SNPs polymorphic between the two parents. A total of 84 polymorphic SNPs in 71 genes were identified (Table S4). Among these SNPs, 17 resulted in missense mutations (Table 6), whereas the remaining 67 were located in introns or resulted in silent mutations. KASP (Kompetitive allele-specific PCR) markers were designed for the 17 SNPs associated with missense mutations (Table S5), and the markers were validated using the two parents and 44 lines of the RIL population displaying contrasting oil content. Two SNPs, at sites Arahy05:6599714 and Arahy05:6709559, were closely linked with oil content (Fig. 4). Specifically, the average oil content was 55.40% in RILs carrying G at Arahy05:6709559, whereas RILs carrying A at the same locus had an average oil content of 50.62% (Fig. 5). The two SNPs were located in the genes Arahy.T0P5W2 and Arahy.YR3A5K, which encode a scarecrow-like transcription factor PAT1-like and a galactosyl transferase GMA12/MNN10 family protein, respectively (Table S4).
Table 6
Mutation type of SNPs located in candidate genes
Gene name | Chromosome | Position | P1 | P2 | Mutation type |
Arahy.T0P5W2 | Arahy.05 | 6599714 | C | A | P-T |
Arahy.YR3A5K | Arahy.05 | 6709559 | A | G | Y-C |
Arahy.DH8F8S | Arahy.05 | 6917629 | G | A | V-I |
Arahy.USM880 | Arahy.05 | 7033745 | C | T | G-R |
Arahy.5YK3TE | Arahy.05 | 7297191 | T | A | L-M |
Arahy.S4GWG4 | Arahy.05 | 7514948 | T | G | N-T |
Arahy.925H3N | Arahy.06 | 6000283 | C | T | G-R |
Arahy.7E2TSQ | Arahy.06 | 6317756 | A | T | V-D |
Arahy.EYYU9K | Arahy.08 | 37430658 | T | A | W-R |
Arahy.LF06ED | Arahy.08 | 37441082 | T | C | W-R |
Arahy.W9QXGB | Arahy.08 | 37701095 | A | G | H-R |
Arahy.W26JNR | Arahy.12 | 3782327 | A | T | F-I |
Arahy.V6I7WA | Arahy.12 | 4236238 | G | C | K-N |
Arahy.0EHV1A | Arahy.12 | 4281118 | G | A | D-N |
Arahy.N0BKZ2 | Arahy.12 | 4908333 | T | G | K-Q |
Arahy.X5Q10C | Arahy.12 | 4995776 | C | T | S-L |
Arahy.C4F96H | Arahy.12 | 5159247 | T | C | H-R |