Analysis of the fatty acid composition and contents
The results showed that the main fatty acid components of sunflower were oleic acid, linoleic acid, stearic acid, linolenic acid, and palmitic acid, and that oleic acid and linoleic acid together accounted for more than 85% of sunflower fatty acids. Furthermore, the relative contents of oleic acid and linoleic acid in sunflower seeds were negatively correlated with the growth and development of the seeds (Fig. 1), in agreement with previous research [17]. The oleic acid content of the seeds first increased and then decreased, reaching a maximum of 30% at 20 DAFs. The linoleic acid content followed an S-shaped trend, with its minimum at 20 DAFs. Therefore, 20 DAFs is the key turning point in the synthesis of the main sunflower fatty acids (oleic acid and linoleic acid).
In the S-test, the relative content of oleic acid was measured in inbred lines J9 and P50 at 20 DAFs, and the average temperature during the sampling period (from pollination to 20 DAFs) was also calculated. The oleic acid content of J9 was not significantly affected by changes in daily average temperature and remained stable at about 85%. In contrast, the oleic acid content of P50 decreased significantly as the daily average temperature decreased: the highest relative content, 42%, was reached in the S1 sowing period, whereas it was only 20.49% in the S6 sowing period (Fig. 2). Thus, the fatty acid composition and content of the high oleic acid sunflower line were little affected by temperature, whereas those of the low oleic acid line were strongly affected.
Sequencing read filtering and de novo assembly
RNA quality testing of the 66 sunflower samples to be sequenced showed that all of them met the sequencing requirements (Table S1). In the present study, we performed transcriptome analysis of 36 S-test samples to exclude the effects of environmental factors. A total of 265.79 G of data was obtained from the two strains (J9 and P50), with an average of about 7.38 G per sample. The average Q20 was 96.59%, and the average Q30 was 92.93%. The distribution of base error rates conformed to the general rule. The average GC content was 46.50%, and there was no obvious separation phenomenon. The quality control results are shown in Table S2. We also performed transcriptome analysis of 30 A-test samples to explore the main genes for sunflower oleic acid synthesis. A total of 221.87 G of data was obtained, with an average of about 7.40 G per sample. The average Q20 was 95.62%, and the average Q30 was 91.45%. The distribution of base error rates conformed to the general rule. The average GC content was 45.93%, and there was no obvious separation phenomenon (Table S3). In total, 66 samples were sequenced and 119,895 predicted genes were identified, including 81,676 known protein-coding genes and 38,219 new protein-coding genes.
Based on the gene expression values in each sample, the square of the Pearson correlation coefficient (R²) was calculated to assess the correlation between replicate samples. The correlation among the biological replicates in this study was good (Figure S1).
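The replicate check described above can be illustrated with a minimal sketch. It assumes that each replicate is represented by a vector of expression values over the same genes; the function name pearson_r2 and the example values are hypothetical and are not taken from the study.

```python
def pearson_r2(x, y):
    """Return the squared Pearson correlation coefficient of two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical expression values of five genes in two biological replicates
rep1 = [1.0, 2.0, 3.0, 4.0, 5.0]
rep2 = [2.1, 3.9, 6.2, 8.1, 9.8]
r2 = pearson_r2(rep1, rep2)
print(f"R^2 between replicates: {r2:.4f}")
```

An R² close to 1 indicates that the two replicates have nearly proportional expression profiles; the threshold regarded as acceptable is a choice of the analyst and is not stated in the text.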
Gene expression analysis and functional annotation
The expression level of each gene was calculated using the RPKM (reads per kb per million reads) method [18]. More genes were detected in the low oleic acid strain P50 than in the high oleic acid strain J9 (Figure S2). In the S-test, J9 had the highest number of enriched genes in the S1 period, with 3910 genes, while P50 had more enriched genes in the S1 and S4 periods, both with more than 40,000 genes (Figure S2a). In the A-test, both strains had the most enriched genes at the CK stage, and P50 had 1757 more genes than J9 (Figure S2b). To gain insight into the functions of the genes related to sunflower fatty acid synthesis, all unigenes were annotated using the WEGO [19] and Blast2GO [20] software and classified into functional categories. A total of 119,895 genes were functionally annotated, of which 51,408 unigenes were annotated in the GO library. The top ten GO terms in each of the three GO ontologies are listed in Figure S3a. A total of 3286 unigenes were annotated in the KEGG library; among the top 20 pathways, cutin, suberin, and wax biosynthesis belonged to lipid metabolism (Figure S3b).
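For reference, the standard RPKM definition can be written as a short sketch. The function name rpkm and the input values are hypothetical, and it is an assumption that the total used for normalization is the number of mapped reads in the sample; the exact counting rules of the pipeline used in this study are not given in the text.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total reads must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (reads -> million reads)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2000 bp gene in a sample with 20 million mapped reads
example_rpkm = rpkm(500, 2000, 20_000_000)
print(example_rpkm)
```

Normalizing by both gene length and sequencing depth makes values comparable between genes of different lengths and between samples of different sizes.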
Principal Component Analysis
In the principal component analysis, the J9 and P50 samples clustered into two large, separate groups, showing a large difference between the two test materials. The samples of the high oleic acid line J9 clustered tightly, whereas those of the low oleic acid line P50 were more dispersed, indicating that the dominant factor affecting lipid metabolism in sunflower was genotype rather than environment, and that the critical period of fatty acid metabolism was between 0 and 20 DAFs (Figure S4).
Screening and analysis of differentially expressed genes
The sequencing results of the samples were compared pairwise within and between the two varieties, and the sets of differentially expressed genes obtained from these comparisons were defined as the S datasets and the A datasets (Table 1).
Within-variety analysis of the S-CK and A-CK datasets showed that the high oleic acid variety J9 had the most differentially expressed genes in the S5-vs-S1 comparison, with a total of 1022 genes, whereas the low oleic acid variety P50 had the most in the S3-vs-S1 comparison, with a total of 1712 genes. In the A-CK dataset, J9 had the most differentially expressed genes in the A4-vs-CK comparison, with a total of 16,133 genes, and P50 had the most in the A3-vs-CK comparison, with a total of 11,476 genes (Fig. 3a). Because the A-CK dataset contained more DEGs important for fatty acid metabolism than the S-CK dataset, genetics, rather than environmental factors, is likely the major determinant of the fatty acid composition and content of sunflower. Across the five comparisons in the S-CK dataset, the high oleic acid line J9 had 372, 279, 115, 322, and 147 specifically up-regulated DEGs and 664, 386, 718, 660, and 1,249 specifically down-regulated DEGs (Fig. 3b). Across the comparisons in the A-CK dataset, 2,252, 4,882, 2,398, and 3,857 DEGs were specifically up-regulated in J9, and 1,919, 1,565, 2,735, and 4,909 DEGs were specifically down-regulated (Fig. 3c). In the inter-trial specific expression analysis of the A-CK and S-CK datasets, J9 had 354 up-regulated DEGs and 730 down-regulated DEGs that were co-expressed in the S- and A-tests, whereas P50 had 500 co-expressed up-regulated DEGs and 200 co-expressed down-regulated DEGs (Fig. 3d).
After excluding the genes specifically expressed in the S-experiment, the inter-trial specific expression analysis yielded a total of 30,564 differentially expressed genes (DEGs) related to sunflower genotype and connected to fatty acid metabolism.
The inter-varietal specific expression analyses of the S and A datasets produced 15,885 and 18,220 DEGs related to sunflower fatty acid metabolism, respectively (Fig. 4).
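The specific and co-expressed DEG counts reported above are the kind of result that set operations on gene lists produce. The following is a minimal sketch under two assumptions that are not stated in the text: that a "specifically" regulated DEG is one found in a single comparison only, and that a "co-expressed" DEG is one shared by the S- and A-test lists. The function names and the gene identifiers g1 to g9 are hypothetical.

```python
def specific_degs(deg_sets):
    """Map each comparison to the DEGs found in that comparison and in no other."""
    specific = {}
    for name, genes in deg_sets.items():
        # Union of the DEGs of every other comparison
        others = set().union(*(g for n, g in deg_sets.items() if n != name))
        specific[name] = genes - others
    return specific


def shared_degs(set_a, set_b):
    """DEGs present in both lists (assumed meaning of 'co-expressed')."""
    return set_a & set_b


# Hypothetical up-regulated DEGs of one variety in three S-CK comparisons
s_up = {
    "S2-CK": {"g1", "g2", "g3"},
    "S3-CK": {"g2", "g4"},
    "S4-CK": {"g3", "g5"},
}
# Hypothetical up-regulated DEGs of the same variety in the A-test
a_up = {"g1", "g4", "g9"}

s_specific = specific_degs(s_up)
s_all = set().union(*s_up.values())
co_expressed = shared_degs(s_all, a_up)
print(s_specific, co_expressed)
```

Applying the same operations to the down-regulated lists and to each variety separately would give counts of the kind shown in the Venn-style panels of Fig. 3, provided the assumptions above match the definitions used in the study.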
Table 1
Data sets resulting from pairwise comparisons of J9 and P50
Comparison | Data set | Comparison | Data set | Comparison | Data set | Comparison | Data set |
S2vsS1(CK) | S2-CK | A1vsCK | A1-CK | S1(J9)vsS1(P50) | S1 | A1(J9)vsA1(P50) | A1 |
S3vsS1(CK) | S3-CK | A2vsCK | A2-CK | S2(J9)vsS2(P50) | S2 | A2(J9)vsA2(P50) | A2 |
S4vsS1(CK) | S4-CK | A3vsCK | A3-CK | S3(J9)vsS3(P50) | S3 | A3(J9)vsA3(P50) | A3 |
S5vsS1(CK) | S5-CK | A4vsCK | A4-CK | S4(J9)vsS4(P50) | S4 | A4(J9)vsA4(P50) | A4 |
S6vsS1(CK) | S6-CK | --- | --- | S5(J9)vsS5(P50) | S5 | A5(J9)vsA5(P50) | A5 |
--- | --- | --- | --- | S6(J9)vsS6(P50) | S6 | --- | --- |
Functional analysis of specifically expressed genes
In the GO annotation of the specifically expressed DEGs, 96 genes related to lipid metabolism were obtained in the S-test of J9 (Figure S5a), 43 in the S-test of P50 (Figure S5b), 330 in the A-test of J9 (Figure S5c), 299 in the A-test of P50 (Figure S5d), and 278 in the GO term mapping of the DEGs co-expressed in J9 and P50 in the A-test (Figure S5e). In summary, the within-variety specific expression analysis annotated a total of 403 DEGs under GO terms related to lipid metabolism (Figure S5f). Using the same method, 81 genes related to lipid metabolism were found in the inter-variety specific expression analysis.
In the within-variety specific expression analysis, the 403 DEGs directly related to lipid metabolism that were obtained through GO functional enrichment were mapped to KEGG metabolic pathways related to lipid metabolism, and a total of 29 DEGs participated directly in 19 related KEGG pathways (Figure S6a). In the A-test, 15 of these genes showed elevated expression and 14 were down-regulated in comparison to the control (Fig. 5a).
In the inter-variety specific expression analysis, 16 of the 81 annotated DEGs were mapped to KEGG metabolic pathways linked to lipid metabolism (Figure S6b). One of these genes (LOC110868833) was over-expressed in the high oleic acid line J9, whereas gene LOC110925594 was under-expressed in J9 (Fig. 5b).
Analysis of DEGs related to sunflower fatty acid metabolism
Most fatty acids in plants are stored in the form of triacylglycerol (TAG) [21]. We analyzed the 42 DEGs found in the pathways of sunflower fatty acid metabolism and examined how their expression varied in relation to 22 distinct enzymes (Fig. 6).
Genes 110889119 and 110925594 regulate acetyl-CoA carboxylase (ACCase), a crucial enzyme in the production of fatty acids. In the early stages of fatty acid accumulation, gene 110889119 showed comparatively high expression in both J9 and P50, with higher expression in J9. In contrast, gene 110925594 showed higher relative expression in P50 than in J9 in both the S- and A-tests, indicating its vulnerability to environmental influences.
The A-experiment clarified the expression characteristics of two genes controlling ketoacyl-ACP reductase (KAR). Gene 110864403 showed low expression in the early stages of accumulation followed by high expression in the later stages, whereas gene 110893771 showed the opposite trend. In the S-experiment, four genes were implicated in the control of several enzymes, including NADPH. The expression patterns of two of these genes, 1109382561 and 110868833, differed: gene 110868833 had higher expression in J9 throughout the planting period, whereas gene 1109382561 displayed the reverse trend. These genes showed similar characteristics in the A-experiment. Interestingly, gene 110868833 not only had substantially higher expression throughout the fatty acid accumulation phase in J9, but its expression also peaked during the critical period of oleic acid synthesis.
Gene 110893751 participates in the activation of MFP throughout this process and was most highly expressed in J9 at 28 and 35 DAFs. Genes 110941185 and 110893751 are involved in the production and transport of fatty acyl-CoA, whereas genes 110890752 and 110871748 contribute to the synthesis of the fatty acids C16:0 and C18:1. The rate-limiting enzyme KCS is regulated by four genes; however, in the S-experiment these genes did not exhibit any notable expression characteristics. In the A-experiment, genes 110889575 and 110904708 were strongly expressed in the early stages of fatty acid accumulation in both lines, with higher expression of gene 110889575 in J9. Gene 110929271, on the other hand, was strongly expressed in J9 throughout the early stages of fatty acid accumulation but weakly expressed in P50. Gene 110942343 was strongly expressed in the later stage of fatty acid accumulation, especially in P50.
Factors ACX and MFP2 are controlled by genes 110893751 and 110941185. Both genes were more strongly up-regulated in the high oleic acid cultivar J9 in the later stages of sunflower fatty acid accumulation. Gene 110890752 is involved in controlling the ability of the PC-Pool to directly catalyze the production of oleic acid from certain substrates. Notably, this gene appears to be negatively associated with oleic acid production, since it showed low expression in J9 and P50 during the early stage of fatty acid accumulation and extremely low expression in J9 at 35 DAFs during the later stage. Three genes are involved in the synthesis of glycerol-3-phosphate; in the A-experiment, gene 110884358 was highly expressed in J9 but weakly expressed in P50. The production of lysophosphatidic acid (LPA) involves two factors, one of which is the long non-coding RNA regulatory factor 110890321. Seven genes are active in the regulation of LPAT, including a novel gene, 9052, that controls the metabolism of sunflower fatty acids. Gene 110930343 is involved in the synthesis of diacylglycerol (DAG), while gene 110886458 is involved in TAG synthesis.