Variation and correlation analysis of photosynthetic pigment content in peanut leaves
Four photosynthetic pigment-related traits (Chl a, Chl b, Chl a + b, and Car contents) were determined for 241 peanut accessions at the seedling and flowering stages across the five experimental environments. Continuous and considerable variation was observed in each of the five environments and in the associated BLUE values. Across the five environments, the Chl a, Chl b, Chl a + b, and Car contents at the seedling stage ranged from 0.28–2.28 mg/g, 0.12–1.26 mg/g, 0.45–3.50 mg/g, and 0.02–0.38 mg/g, with mean values of 0.76 mg/g, 0.39 mg/g, 1.15 mg/g, and 0.14 mg/g, respectively. Similarly, at the flowering stage, the Chl a, Chl b, Chl a + b, and Car contents ranged from 0.26–1.61 mg/g, 0.10–0.92 mg/g, 0.36–2.53 mg/g, and 0.30–0.43 mg/g, with mean values of 0.69 mg/g, 0.37 mg/g, 1.06 mg/g, and 0.22 mg/g, respectively.
The coefficient of variation of the BLUE values across the five studied environments ranged from 14.00% for Chl b to 20.00% for Car at the seedling stage and from 18.32% for Chl a to 36.40% for Car at the flowering stage. The heritability (H2) of the photosynthetic pigment-related traits varied from 0.82 for Chl a to 0.98 for Car at the seedling stage and from 0.84 for Chl a + b to 0.97 for Chl b at the flowering stage, indicating that these traits were determined mainly by genetic effects. Summary statistics for the chlorophyll and Car contents are listed in Table S3. The absolute kurtosis and skewness values of the four traits were < 1, suggesting that the four photosynthetic pigment-related traits approximately followed a normal distribution (Fig. S2). There were highly significant correlations (P < 0.01) among the four photosynthetic pigment-related traits, indicating a stable correlation between chlorophyll and Car content (Fig. 1). Moreover, the correlation coefficients between the Chl a and Chl a + b contents were the highest at both the seedling and flowering stages.
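The summary statistics quoted above (coefficient of variation, skewness, and kurtosis) can be illustrated with a minimal sketch. It assumes simple moment-based estimators, which may differ from the estimators actually used for Table S3, and the input values are hypothetical rather than taken from the study.

```python
import math

def describe(values):
    """Return mean, CV (%), skewness, and excess kurtosis for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 denominator) for the CV
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Central moments (n denominator) for moment-based skewness and kurtosis
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "cv_percent": 100.0 * sd / mean,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # 0 for a normal distribution
    }

# Hypothetical BLUE values in mg/g (assumption, for illustration only)
toy_blue = [0.62, 0.70, 0.74, 0.76, 0.79, 0.83, 0.91]
stats = describe(toy_blue)

# The criterion described in the text: |skewness| < 1 and |kurtosis| < 1
roughly_normal = abs(stats["skewness"]) < 1 and abs(stats["excess_kurtosis"]) < 1
```

Under this convention a trait whose absolute skewness and excess kurtosis are both below 1 would be treated as approximately normal, which mirrors the criterion stated in the text.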
Genetic variation based on SNPs
After mapping to the peanut reference genome Tifrunner and SNP calling, 2,110,659 high-quality SNPs were retained for the subsequent GWAS analysis. These SNPs were unevenly distributed across the 20 chromosomes (Chr.) of the allotetraploid peanut genome, with 795,170 and 1,315,489 SNPs located in the A and B subgenomes, respectively (Fig. 2 and Table S4). The number of SNPs per chromosome ranged from 32,500 (Chr.8) to 152,216 (Chr.19), with an average of 105,533 SNPs. SNP density is reported here as the mean spacing between SNPs (kb/SNP), so a larger value indicates sparser coverage. Chr.15 was the longest chromosome (160,859.59 kb), with a spacing of 1.13 kb/SNP. Conversely, Chr.8 was the shortest chromosome (51,742.57 kb) and had the largest spacing (1.59 kb/SNP), that is, the sparsest SNP coverage. Chr.18 had the smallest spacing (1.02 kb/SNP), and the genome-wide average was 1.24 kb/SNP (Table S4). Furthermore, the polymorphism information content ranged from 0.28 for both Chr.3 and Chr.16 to 0.37 for both Chr.8 and Chr.9, with a mean value of 0.33. The LD decay distance for each chromosome ranged from 0.10 Mb for Chr.19 to 0.90 Mb for Chr.20, with an average of 0.41 Mb (Table S4).
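The SNP counts and spacing values above follow from simple arithmetic, sketched below using only figures quoted in this paragraph. The function name `kb_per_snp` is a hypothetical helper introduced for illustration. Because the unit is kb per SNP, a larger value corresponds to more widely spaced SNPs.

```python
# Figures quoted in the text (from Table S4 of the source)
total_snps = 2_110_659
snps_a_subgenome = 795_170
snps_b_subgenome = 1_315_489
n_chromosomes = 20

# The two subgenome counts add up to the genome-wide total
subgenome_sum = snps_a_subgenome + snps_b_subgenome

# Average number of SNPs per chromosome (about 105,533)
mean_snps_per_chr = total_snps / n_chromosomes

def kb_per_snp(length_kb, n_snps):
    """Mean spacing between SNPs on a chromosome, in kb per SNP."""
    return length_kb / n_snps

# Chr.8: 51,742.57 kb and 32,500 SNPs give about 1.59 kb/SNP
chr8_spacing = kb_per_snp(51_742.57, 32_500)
```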
Population genetic structure and linkage disequilibrium analysis
Population structure was inferred from the 2,110,659 SNPs. The cross-validation error (CVE) was calculated for K = 1–8 and reached its minimum at K = 5 (Fig. 3a), indicating that the optimal number of subpopulations was five and that the 241 peanut accessions could be divided into five groups (Fig. 3b). The phylogenetic tree and PCA supported a similar grouping of the 241 accessions (Figs. 3c and d). Groups 1, 2, 3, 4, and 5 comprised 75, 68, 12, 47, and 39 accessions, respectively. The accessions in Groups 1 and 4 were mostly modern varieties from China and other countries, whereas those in Group 2 were mostly Chinese landraces. Additionally, the LD (indicated by r2) decayed to half of its maximum value at 50 kb for the entire genome (Fig. S1).
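The half-decay distance of LD mentioned above can be estimated from binned r2 values. The following is a minimal sketch that assumes "decreased by half" means half of the maximum observed r2 and uses linear interpolation between bins; the source does not state the exact procedure, and the toy values are hypothetical.

```python
def half_decay_distance(distances_kb, r2_values):
    """Distance (kb) at which r2 first falls to half of its maximum value.

    Inputs must be sorted by increasing distance. Returns None if r2 never
    drops to the half-maximum threshold within the supplied range.
    """
    half = max(r2_values) / 2.0
    for i in range(1, len(distances_kb)):
        prev_r2, cur_r2 = r2_values[i - 1], r2_values[i]
        if prev_r2 > half >= cur_r2:
            # Linear interpolation between the two neighbouring bins
            frac = (prev_r2 - half) / (prev_r2 - cur_r2)
            return distances_kb[i - 1] + frac * (distances_kb[i] - distances_kb[i - 1])
    return None

# Hypothetical binned LD values (assumption, not data from the study)
toy_distances_kb = [0, 20, 40, 80]
toy_r2 = [0.8, 0.5, 0.3, 0.2]
toy_half_decay = half_decay_distance(toy_distances_kb, toy_r2)
```

With these toy values the threshold is 0.4, which lies halfway between the second and third bins, so the estimate falls between 20 and 40 kb.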
Identification of candidate genes using GWAS, GBA, and transcriptome analysis
Across the seedling and flowering stages, 575 SNPs were significantly associated with the four photosynthetic pigment-related traits according to the BLUE values (Figs. 4a and b and Table S5). Of the 149 SNPs associated with photosynthetic pigment content at the seedling stage, 23 were associated with Chl a, 73 with Chl b, 33 with Chl a + b, and 20 with Car. Of the 426 SNPs detected at the flowering stage, 172 were associated with Chl a, 47 with Chl b, 150 with Chl a + b, and 57 with Car.
Based on the LD decay, the ±50-kb region around each leading SNP was considered one QTL interval. On this basis, 335 QTLs were detected across the seedling and flowering stages. Of the 93 QTLs identified at the seedling stage, 14 were for Chl a, 47 for Chl b, 19 for Chl a + b, and 13 for Car. Of the 242 QTLs identified at the flowering stage, 98 were for Chl a, 28 for Chl b, 92 for Chl a + b, and 24 for Car.
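The grouping of significant SNPs into QTL intervals can be sketched as follows. This is an illustrative greedy procedure under stated assumptions, not necessarily the one used in the study: the most significant remaining SNP is taken as the leading SNP, and every remaining SNP on the same chromosome within ±50 kb of it is assigned to the same QTL. The function name and the SNP coordinates are hypothetical.

```python
def group_into_qtls(snps, window_bp=50_000):
    """Group significant SNPs into QTL intervals around leading SNPs.

    snps: list of (chromosome, position_bp, p_value) tuples.
    Returns a list of dicts, one per QTL, ordered by lead-SNP significance.
    """
    remaining = sorted(snps, key=lambda s: s[2])  # most significant first
    qtls = []
    while remaining:
        lead = remaining[0]
        # SNPs on the same chromosome within the window around the lead SNP
        members = [
            s for s in remaining
            if s[0] == lead[0] and abs(s[1] - lead[1]) <= window_bp
        ]
        qtls.append({
            "lead": lead,
            "start": max(0, lead[1] - window_bp),
            "end": lead[1] + window_bp,
            "members": members,
        })
        remaining = [s for s in remaining if s not in members]
    return qtls

# Hypothetical significant SNPs (assumption, not taken from Table S5)
toy_snps = [
    (4, 1_000_000, 1e-8),
    (4, 1_030_000, 1e-6),  # within 50 kb of the first SNP
    (4, 1_070_000, 2e-6),  # 70 kb from the first SNP, so a separate QTL
    (8, 500_000, 1e-7),
]
toy_qtls = group_into_qtls(toy_snps)
```

In this toy case the four SNPs collapse into three QTL intervals, which shows why the number of QTLs is smaller than the number of significant SNPs.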
Furthermore, we integrated GWAS with GBA and functional annotation to rapidly identify candidate genes associated with photosynthetic pigment content according to the BLUE values at the two growth stages. Twelve genes were significantly correlated with photosynthetic pigment content at the seedling and flowering stages (Table 1). Among these, four genes were each associated with at least two traits.
Table 1
Identification of candidate genes by GWAS, GBA, and transcriptome analysis. SS, seedling stage; FS, flowering stage; fc, fold change.
Trait | Period | Name | Start | End | P value | SNP | -log (P) | log2 (fc) | Function |
Chl a | SS | Arahy.8PDB05 | 66903158 | 66903897 | 4.25E-07 | 01-66876382 | 5.37 | -0.81 | senescence-associated protein |
Car | SS | Arahy.JLZP6T | 66879531 | 66879901 | 3.47E-06 | 01-66876382 | 5.81 | 0.72 | senescence-associated protein, putative |
Chl b | SS | Arahy.L8VLW8 | 121555458 | 121557140 | 2.91E-07 | 04-121558770 | 5.16 | 0.75 | General regulatory factor 8 |
Car | SS | Arahy.29UIE3 | 10811779 | 10812972 | 3.03E-06 | 08-10812027 | 5.61 | -0.73 | TLC domain-containing protein 2-like [Glycine max] |
Car | SS | Arahy.N7WRT5 | 10810478 | 10816533 | 5.12E-06 | 08-10812027 | 5.61 | -0.65 | filament-like plant protein-like isoform X3 [Glycine max] |
Chl b | SS | Arahy.ZVPY4M | 12913504 | 12915817 | 1.85E-07 | 09-12875515 | 5.62 | -0.68 | Disease resistance protein (TIR-NBS-LRR class) family |
Chl a + b | SS | Arahy.4LR7U0 | 107000013 | 107004368 | 3.02E-06 | 09-107019158 | 5.19 | 0.52 | DNA/RNA polymerase superfamily protein, putative |
Chl a | SS | Arahy.GZP73J | 68699378 | 68731144 | 2.77E-07 | 16-68731008 | 5.26 | 0.63 | GDSL esterase/lipase [Medicago truncatula] |
Chl b | SS | Arahy.33ZKBH | 2572282 | 2578016 | 1.47E-06 | 18-25719346 | 5.00 | -0.99 | Uncharacterized protein LOC100800099 isoform X2 [Glycine max] |
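Chl a + b | SS | Arahy.33ZKBH | 2572282 | 2578016 | 1.47E-06 | 18-25719346 | 5.05 | -0.99 | Uncharacterized protein LOC100800099 isoform X2 [Glycine max] |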
Chl a + b | SS | Arahy.F6GRBT | 2496042 | 2498863 | 2.16E-06 | 18-2549114 | 5.11 | -0.70 | Transcription factor bHLH52-like [Glycine max] |
Chl b | SS | Arahy.E7QFRQ | 144412830 | 144414347 | 2.91E-06 | 19-144413724 | 5.65 | 0.49 | Uncharacterized protein LOC100811064 isoform X3 |
Chl b | FS | Arahy.VMJ95M | 121558952 | 121561353 | 5.44E-08 | 04-121558770 | 7.56 | 4.90 | Photosystem I P700 chlorophyll A-binding protein |
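Car | FS | Arahy.VMJ95M | 121558952 | 121561353 | 5.12E-07 | 04-121558770 | 5.71 | 4.90 | Photosystem I P700 chlorophyll A-binding protein |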
Chl b | FS | Arahy.RY2FI6 | 27102092 | 27103238 | 2.18E-06 | 08-27152128 | 5.18 | 0.76 | Putative nuclease HARBI1-like [Glycine max] |
Car | FS | Arahy.FPVE10 | 11375565 | 11378624 | 6.72E-07 | 08-11334872 | 5.96 | 0.92 | Receptor serine/threonine kinase |
Chl a | FS | Arahy.12W76F | 53682506 | 53683774 | 4.81E-06 | 11-53683452 | 5.00 | 0.46 | Unknown protein |
Chl a | FS | Arahy.Z1U5RE | 57254550 | 57261399 | 2.16E-07 | 12-57257974 | 5.38 | -5.21 | Chloroplast Ycf protein family ATPase |
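Chl b | FS | Arahy.Z1U5RE | 57254550 | 57261399 | 2.91E-07 | 12-57257974 | 5.42 | -5.21 | Chloroplast Ycf protein family ATPase |
Chl a + b | FS | Arahy.Z1U5RE | 57254550 | 57261399 | 3.14E-07 | 12-57257974 | 6.35 | -5.21 | Chloroplast Ycf protein family ATPase |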
Car | FS | Arahy.S5DSK4 | 105850364 | 105853483 | 1.25E-07 | 12-105874362 | 5.10 | -0.71 | Putative Myb family transcription factor At1g14600-like isoform X2 |
Car | FS | Arahy.J6J42Y | 13000316 | 13003460 | 3.33E-07 | 18-12967061 | 5.42 | 0.58 | glucose-6-phosphate dehydrogenase 6 |
Chl a | FS | Arahy.YWY61J | 6412671 | 6416074 | 1.02E-08 | 20-6416174 | 6.35 | 12.78 | Terpene synthase 14 |
Chl b | FS | Arahy.YWY61J | 6412671 | 6416074 | 3.09E-08 | 20-6416174 | 5.91 | 12.78 | Terpene synthase 14 |
RNA-seq was conducted on accession Guihuahei 2, which has high pigment content, and accession Zhanhei 1, which has low pigment content, to screen for differentially expressed genes at the seedling and flowering stages. At the seedling stage, 27,736 upregulated and 22,079 downregulated genes were identified. At the flowering stage, 24,142 upregulated and 25,119 downregulated genes were identified. We focused on the expression of the 24 genes obtained from the combined GWAS and GBA analyses. Of these, Arahy.VMJ95M and Arahy.YWY61J were differentially expressed at the flowering stage. Therefore, these two genes were considered candidate genes for further analysis.
Candidate gene Arahy.VMJ95M for Chl b and Car content
The first candidate gene, Arahy.VMJ95M, was annotated as encoding the photosystem I P700 chlorophyll A-binding protein and was linked to the leading SNP 04-121558770, which was associated with Chl b and Car content (-log10 P > 5; Fig. 5a). Arahy.VMJ95M was located 182 bp downstream of the leading SNP 04-121558770 (Fig. 5b) and contained two introns and three exons. Interestingly, a non-synonymous SNP (C/T) at nucleotide (nt) 1,637 in exon 2 of Arahy.VMJ95M resulted in an amino acid change from Arginine to Valine (Fig. 5c). One hundred and ninety-three peanut accessions carried one of two haplotypes of this gene with distinct phenotypes: the Chl b and Car contents of accessions with the TT haplotype were significantly higher than those of accessions with the CC haplotype (P < 0.01; Fig. 5d). Peanut accessions 52, 55, 115, and 160, which had higher Chl b and Car contents than other accessions, and accessions 32, 71, 122, and 127, which had lower Chl b and Car contents, were selected to verify the expression of the candidate gene Arahy.VMJ95M. The qRT-PCR results indicated that Arahy.VMJ95M was highly expressed in the leaves of peanut accessions with high Chl b and Car contents at the flowering stage (Fig. 5e).
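A haplotype comparison of this kind can be sketched as a two-sample test on the phenotype values of the two haplotype groups. The source does not state which test produced P < 0.01, so the example below assumes Welch's unequal-variance t-test and uses hypothetical pigment values; it computes only the test statistic and degrees of freedom, and a P value would additionally require a t-distribution function from a statistics library.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of the two group means
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df

# Hypothetical Chl b contents in mg/g for two haplotype groups
# (assumption, for illustration only; not data from the study)
toy_tt = [0.44, 0.47, 0.41, 0.50, 0.46]
toy_cc = [0.35, 0.33, 0.38, 0.31, 0.36]
t_stat, dof = welch_t(toy_tt, toy_cc)
```

A positive statistic here means the first group (the hypothetical TT accessions) has the higher mean, which is the direction reported for Chl b and Car content in the text.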
Candidate gene Arahy.YWY61J for Chl a and Chl b content
The second candidate gene, Arahy.YWY61J, was annotated as encoding terpene synthase 14 and was linked to the leading SNP 20-6416174, which was associated with Chl a and Chl b content (-log10 P > 5; Fig. 6a). Arahy.YWY61J was located 99 bp upstream of the leading SNP 20-6416174 (Fig. 6b) and contained five introns and six exons. Four non-synonymous SNPs were identified in exons 2, 3, and 4; notably, the third SNP (C/T), at nt 1,167 in exon 3, changed a Valine codon to a stop codon (Fig. 6c). Two hundred and four peanut accessions carried one of two haplotypes of Arahy.YWY61J with distinct phenotypes: the Chl a and Chl b contents of accessions with the CCTA haplotype were significantly higher than those of accessions with the GACG haplotype (P < 0.01; Fig. 6d). Furthermore, the qRT-PCR results indicated that Arahy.YWY61J was highly expressed in the leaves of peanut accessions with high Chl a and Chl b contents at the flowering stage (Fig. 6e).