Morphological changes in developing P. ostii fruits and seeds
At the study location (south of the Yangtze River), P. ostii fruit and seed development lasted about 120 days. P. ostii fruits and seeds grew rapidly between 7 and 49 days after fertilization (DAF), and at 49 DAF were almost as large as the mature forms (Fig. 1a–c). The fruit peel was green during the early stages of development, but began to turn yellow at about 70 DAF (Fig. 1a). Examination of the seeds in longitudinal section indicated that embryogenesis and seed-filling occurred from 0 to 70 DAF; cotyledon embryos were obvious at 70 DAF. Between 70 and 119 DAF, the seed coat became thinner and browner as compared to the period from 49 to 70 DAF (Fig. 1d). At 119 DAF, the seeds were mature (Fig. 1b, d).
Dynamic changes in FA composition in the pericarp, kernel, and testa
All five major FAs [ALA, LA, oleic acid (OA, C18:1Δ9), stearic acid (SA, C18:0) and palmitic acid (PA, C16:0)] were detected in the pericarp, testa, and kernel samples taken between 21 and 119 DAF (Fig. 2). The three most abundant UFAs during kernel development were OA, LA, and ALA. In the kernel, OA, LA, and ALA concentrations were 2.4±0.08 mg/g, 6.35±0.34 mg/g, and 2.5±0.24 mg/g, respectively, at 35 DAF, increasing to 58.79±2.21 mg/g, 92.56±4.98 mg/g, and 134.29±2.23 mg/g, respectively, at 77 DAF (Fig. 2a). After 77 DAF, OA, LA, and ALA concentrations decreased gradually, to 52±0.73 mg/g, 77.36±1.39 mg/g, and 131.26±1.68 mg/g, respectively, at 84 DAF, and to 41.96±1.21 mg/g, 58.56±1.94 mg/g, and 94.96±3.64 mg/g, respectively, at 119 DAF (Fig. 2a). Although the concentrations of all five FAs were similar throughout most of testa development (21–63 DAF and 77–119 DAF; Fig. 2b), ALA, LA, and OA peaked sharply at 70 DAF, with ALA concentration showing the greatest increase (to 20.27 ± 3.08 mg/g). FA concentrations fluctuated but remained low throughout pericarp development; PA and LA were the most abundant FAs (Fig. 2c). In all three tissues, SA concentrations remained consistently low.
Unigene Assembly, Analysis, And Quantification Of Gene Expression
The transcriptomes of the kernel, testa, and pericarp samples taken at 35, 49, 63, 77, 91, and 119 DAF (T1–T6) from two separate trees (specimens CS0009 and CS0016) were sequenced. Across the 36 samples, we obtained approximately 49.69 M reads per sample (Table 1). A total 227,837 unigenes were detected, with an average length of 755 bp (Additional file 1: Table S1). The heatmap of the Pearson correlation coefficients of gene expression levels between pairs of genes showed that the gene expression levels were significantly correlated throughout pericarp development and during rapid oil accumulation in the kernel (Additional file 2: Figure S1). Across the six developmental stages, more genes were co-expressed in all three tissues than were expressed only in one tissue (Additional file 3: Figure S2). In addition, more genes were expressed specifically in the pericarp than in either the kernel or the testa. More genes were specifically expressed in the seed kernel at 49 DAF than at any other stage(Additional file 3: Figure S2).
Table 1
Sequencing statistics for the 36 samples taken during the development of Paeonia ostii seeds.
Sample | Total Raw Reads(Mb) | Total Clean Reads(Mb) | Total Clean Bases(Gb) | Clean Reads Q20(%) | Clean Reads Q30(%) | Clean Reads Ratio(%) |
CS0009_p_35 | 65.12 | 51.42 | 7.71 | 97.09 | 91.51 | 78.96 |
CS0009_p_49 | 53.48 | 41.89 | 6.28 | 97.08 | 91.53 | 78.32 |
CS0009_p_63 | 79.5 | 63.54 | 9.53 | 97.22 | 91.84 | 79.92 |
CS0009_p_77 | 85.27 | 63.95 | 9.59 | 96.95 | 91.22 | 74.99 |
CS0009_p_91 | 75.27 | 59.6 | 8.94 | 97.1 | 91.58 | 79.18 |
CS0009_p_119 | 79.37 | 58.11 | 8.72 | 97 | 91.2 | 73.22 |
CS0009_t_35 | 42.71 | 33.45 | 5.02 | 97.35 | 91.95 | 78.32 |
CS0009_t_49 | 68.7 | 51.46 | 7.72 | 96.62 | 90.23 | 74.9 |
CS0009_t_63 | 56.31 | 43.43 | 6.51 | 97.09 | 91.59 | 77.12 |
CS0009_t_77 | 56.61 | 41.72 | 6.26 | 96.93 | 91.16 | 73.69 |
CS0009_t_91 | 65.53 | 48.97 | 7.35 | 96.94 | 91.21 | 74.73 |
CS0009_t_119 | 49.76 | 41.65 | 6.25 | 98.13 | 94.01 | 83.71 |
CS0009_k_35 | 51.22 | 40.63 | 6.1 | 97.12 | 91.56 | 79.34 |
CS0009_k_49 | 55.96 | 44.77 | 6.72 | 97.22 | 91.84 | 80.01 |
CS0009_k_63 | 80.62 | 61.78 | 9.27 | 96.94 | 91.19 | 76.63 |
CS0009_k_77 | 56.87 | 42.97 | 6.45 | 96.99 | 91.24 | 75.55 |
CS0009_k_91 | 66.07 | 48.77 | 7.32 | 96.86 | 91.06 | 73.82 |
CS0009_k_119 | 58.84 | 43.22 | 6.48 | 96.76 | 90.8 | 73.45 |
CS0016_p_35 | 64.68 | 49.97 | 7.49 | 96.95 | 91.25 | 77.25 |
CS0016_p_49 | 70.79 | 55.9 | 8.38 | 97.05 | 91.41 | 78.95 |
CS0016_p_63 | 59.62 | 46.02 | 6.9 | 96.92 | 91.18 | 77.19 |
CS0016_p_77 | 72.86 | 56.46 | 8.47 | 96.99 | 91.3 | 77.48 |
CS0016_p_91 | 81.99 | 64.04 | 9.61 | 96.95 | 91.24 | 78.1 |
CS0016_p_119 | 42.01 | 31.77 | 4.77 | 96.89 | 90.75 | 75.62 |
CS0016_t_35 | 104.07 | 83.87 | 12.58 | 97.81 | 93.25 | 80.59 |
CS0016_t_49 | 66.34 | 50.45 | 7.57 | 97.06 | 91.51 | 76.04 |
CS0016_t_63 | 61.83 | 48.98 | 7.35 | 97.06 | 91.39 | 79.22 |
CS0016_t_77 | 73.54 | 57.14 | 8.57 | 97.15 | 91.65 | 77.71 |
CS0016_t_91 | 61.93 | 47.81 | 7.17 | 97.19 | 91.81 | 77.2 |
CS0016_t_119 | 109.95 | 81.97 | 12.3 | 97.34 | 92.47 | 74.56 |
CS0016_k_35 | 52.81 | 41.03 | 6.15 | 97 | 91.31 | 77.69 |
CS0016_k_49 | 61.03 | 48.59 | 7.29 | 97.2 | 91.78 | 79.61 |
CS0016_k_63 | 52.57 | 39.83 | 5.97 | 96.97 | 91.28 | 75.76 |
CS0016_k_77 | 57.55 | 48.62 | 7.29 | 98.16 | 94.06 | 84.47 |
CS0016_k_91 | 50.84 | 37.42 | 5.61 | 96.84 | 90.95 | 73.6 |
CS0016_k_119 | 47.16 | 34.85 | 5.23 | 96.75 | 90.75 | 73.9 |
Ontology and KEGG pathway enrichment of the differentially expressed genes
The up- and downregulated genes at each of the six time points in the kernel, testa, and pericarp are shown in Additional file 4: Table S2. We found that the proportion of genes up- and downregulated varied among tissues and time points. The heatmap of the hierarchical DEG clusters indicated that more genes were differently expressed during the first three stages of kernel development (Additional file 5: Figure S3). We then determined which gene ontology (GO) functions were enriched in the DEGs. GO functions are grouped in three categories: molecular function, cell component and biological process. At 49 DAF, twice as many genes in the kernel and testa were enriched in metabolic process as compared to the pericarp (Additional file 6: Figure S4). In the kernel, 49 DAF was a period of rapid oil accumulation (Fig. 2a). Consistent with this, the genes differentially expressed in the kernel at 49 DAF as compared to 35 DAF were primarily associated with the lipid metabolism (226 DEGs) and the carbohydrate metabolism (309 genes; Fig. 3a). Other metabolic pathways, such as fatty acid metabolism, fatty acid biosynthesis, linoleic acid metabolism, glycosphingolipid biosynthesis-ganglioseries, glycosphingolipid synthesis-globo and isoglobe series, starch and sucrose metabolism were also significantly enriched in these DEGs (Fig. 3b). Most of the metabolic pathways were overrepresented in the genes downregulated between 49 DAF and 35 DAF, including the linoleic acid metabolism pathway; linoleic acid is the substrate of ALA synthesis (Fig. 3c). However, the fatty acid metabolism, fatty acid biosynthesis, pyruvate metabolism, and plant hormone signal transduction pathways were overrepresented in the genes upregulated between 49 DAF and 35 DAF (Fig. 3c). With respect to the genes differently expressed between 63 DAF and 49 DAF, more pathways were significantly enriched in the downregulated genes than in the upregulated genes (Fig. 3d). Interestingly, the fatty acid metabolism and fatty acid biosynthesis pathways, which were significantly enriched at 49 DAF, were not significantly enriched at any other time point (Additional file 7: Figure S5).
In the testa, 187 genes associated with the lipid metabolism were differentially expressed between 49 DAF and 35 DAF. Pathways related to the lipid metabolism (i.e., glycerolipid metabolism, glycerophospholipid metabolism, glycosphingolipid biosynthesis-ganglio series, and cutin/suberine and wax biosynthesis) and to the carbohydrate metabolism (i.e., starch/sucrose metabolism and galactose metabolism) were significantly enriched in these DEGs (Fig. 4a). The glycerolipid metabolism pathway was overrepresented in similar numbers of up- and downregulated DEGs at 49 DAF (Fig. 4b). Several other lipid-associated pathways (i.e., glycerophospholipid metabolism, glycosphingolipid biosynthesis-ganglio series, and steroid biosynthesis) were more enriched in the upregulated DEGs than in the downregulated DEGs at 49 DAF (Fig. 4b). Similarly, pathways related to starch accumulation and sugar synthesis (i.e., fructose/mannose metabolism and galactose metabolism) were also enriched in more upregulated than downregulated DEGs at 49 DAF (Fig. 4b). This might be due to the intensive membrane lipid synthesis, seed coat thickening, and dry matter accumulation that occurs during this stage in the testa. Compared with 35 DAF, 287 genes associated with the lipid metabolism were differently expressed at 63 DAF in the testa; the metabolic pathways significantly enriched in these DEGs included fatty acid elongation, glycerolipid metabolism, sphingolipid metabolism, glycosphingolipid biosynthesis-ganglio series, and cutin/suberine and wax biosynthesis (Fig. 4c). At 63 DAF, most of the DEGs were associated with the alpha linolenic acid pathway; in this pathway, the number of upregulated genes was similar to the number of downregulated genes (Fig. 4d). No lipid metabolism pathways were enriched at any other stage of testa development (Additional file 8: Figure S6).
In the pericarp, certain lipid metabolism pathways, including glycerolipid metabolism, fatty acid elongation, sphingolipid metabolism, and cutin/suberine and wax biosynthesis, were similarly enriched in all DEGs in different stages (Additional file 9: Figure S7 and Additional file 10: Figure S8).
Expression Patterns Of Unigenes Associated With Oil Accumulation
Based on the functional annotations of the DEGs, we identified 1373 unigenes associated with lipid metabolism (Additional file 11: Table S3). These unigenes were associated with 10 metabolic pathways: de novo plastid FA synthesis, elongation, desaturation and export (138 unigenes); triacylglycerol biosynthesis and eukaryotic phospholipid synthesis & editing (176 unigenes); prokaryotic and eukaryotic galactolipid, sulfolipid, and phospholipid synthesis (25 unigenes); sphingolipid biosynthesis (58 unigenes); fatty acid elongation, wax biosynthesis, cutin synthesis, and suberin synthesis and transport (251 unigenes); triacylglycerol and fatty acid degradation (99 unigenes); phospholipid signaling (108 unigenes); lipid trafficking (27 unigenes); oxylipin metabolism (39 unigenes); and mitochondrial fatty acid synthesis (43 unigenes) (Additional file 12: Table S4).
Of these 1373 unigenes, 314 were associated with FA and TAG biosynthesis (Additional file 12: Table S4). Specifically, 13 unigenes were associated with the pyruvate dehydrogenase complex (PDHC), which catalyzes the oxidative decarboxylation of pyruvate to produce acetyl-CoA. The PDHC contains three subunits: pyruvate dehydrogenase (PDH, 10 unigenes), dihydrolipoyl acyltransferase (DHLAT, 2 unigenes), and dihydrolipoamide dehydrogenase (LPD, 1 unigene). These unigenes were highly expressed throughout the development of the pericarp, testa, and kernel. In addition, 25 unigenes encoded subunits of ACC, a multi-subunit enzyme that includes biotin carboxylase (BC), the biotin carboxyl carrier protein (BCCP), and carboxyltransferase (CT). Finally, seven unigenes were homologous to α-CT; one unigene was homologous to β-CT; 13 unigenes were homologous to BC; and four unigenes were homologous to BCCP (Additional file 12: Table S4). In particular, CL15202, homologous to α-CT, was highly expressed across all three tissues, but was most strongly upregulated in the kernel. Similarly, CL9207.Contig2, homologous to homomeric ACC (HmACC), was also highly expressed during the development of all three tissues, but was most highly expressed in the testa. CL16462 (homologous to α-CT) and unigene 719 (homologous to β-CT) were strongly upregulated in the pericarp, but expressed only at low levels in the kernel and the testa. In addition, homologs to BC (CL7345.contig8) and BCCP (CL18970.contig3 and unigene17464) were significantly upregulated in the kernel at 35 and 49 DAF, as compared to later developmental stages (63–119 DAF); these genes were expressed only at low levels throughout the development of the testa and pericarp (Additional file 13: Figure S9). Three unigenes were associated with the catalysis of the malonyl-ACP elongation cycle: CL8489.Contig2, associated with 3-ketoacyl-ACP synthase isoform I (KAS I); CL1611.Contig2, associated with KAS II; and unigene32463, associated with KAS III. These unigenes were strongly upregulated during pericarp and testa development, but were highly expressed in the kernel at 49 DAF only (Additional file 13: Figure S9). Eighteen unigenes encoding SAD were also identified as DEGs. Of these, four (CL2824.Contig13, CL2824.Contig14, CL2824.Contig25, and CL2824.Contig26) were more highly expressed in the kernel and the testa than in the pericarp (Additional file 13: Figure S9). In addition, in the kernel, the expression levels of CL2824.Contig25 steadily increased from 35–63 DAF, remained constant from 63–77 DAF, and decreased steadily from 77–119 DAF (Fig. 5b). One unigene was shown to encode fatty acyl-ACP thioesterase A (FATA) (unigene17393), and three were shown to encode FATB (unigene2022, unigene5608, and unigene6721). Unigene6721 was differently expressed among the different developmental stages of the pericarp, testa, and kernel, but unigene17393 was only highly expressed in the kernel (with an expression peak at 49 DAF; Additional file 13: Figure S9). We also identified 10 unigenes encoding long-chain acyl-CoA synthetase (LACS) (Additional file 12: Table S4).
Ten unigenes were associated with PUFA biosynthesis: three unigenes encoded FAD2 (CL15521.Contig1, CL15521.Contig2, and CL15521.Contig3), and two unigenes encoded FAD6 (CL5945.Contig3 and CL5945.Contig8). In the kernel, the expression level of one of the FAD2-encoding genes (CL15521.Contig3) increased from 35–77 DAF, and decreased from 77–119 DAF (Fig. 5b). This unigene was also sharply upregulated at 77 DAF in the testa (Additional file 13: Figure S9). An additional four unigenes encoded FAD3 and FAD8 (unigene2598, unigene43483, CL1023.Contig1, and CL1023.Contig4), and one unigenes encoded FAD7 (unigene21286) (Additional file 11: Table S3). Based on our phylogenetic analysis of ω-3 FADs from Arabidopsis thaliana and P. ostii (Additional file 14: Figure S10), we identified four microsomal FAD3 genes and one plastid FAD7/8 gene in P. ostii. In the kernel, the expression level of one of these FAD3-encoding genes (CL1023.Contig4) increased steadily from 39–91 DAF, followed by an abrupt decrease (Fig. 5b). Similarly, CL1023.Contig4 was sharply upregulated at 77 DAF in the testa (Additional file 13: Figure S9). In contrast, CL1023.Contig4 was expressed only at low levels throughout pericarp development.
Several unigenes encoding enzymes important for acyl editing and the Lands cycle were also identified. The three unigenes encoding CDP-choline:DAG choline phosphotransferase (CPT) (unigene7809, unigene16109, and unigene16131) were differentially expressed during the development of the kernel, pericarp, and testa. One unigene, encoding phosphatidylcholine:diacylglycerol cholinephosphotransferase (PDCT) (unigene32372), was significantly differentially expressed among these tissues. In the kernel, unigene32372 expression increased from 35–49 DAF, and then decreased fairly steadily until the end of the experiment (Fig. 5b). Three unigenes encoding lysophosphatidylcholineacyltransferase (LPCAT) (unigene39796, CL18110.Contig2, and CL18110.Contig24) and two unigenes encoding lysophosphatidylethanolamineacyltransferase (LPEAT) (CL10703.Contig3 and CL20149.Contig2) were identified as DEGs. We also identified 19 unigenes encoding PLD. In the kernel, the expression profile of one of these unigenes (CL1337.Contig6) resembled a bell-shaped curve, peaking at 63 DAF (Fig. 5b). Other phospholipase D (PLD) genes were also identified as DEGs, but did not exhibit a smooth peak in expression.
We identified 45 unigenes associated with TAG biosynthesis. Of these, 16 unigenes encoded glycerol-3-phosphate acyltransferase (GPAT), 10 unigenes encoded lysophosphatidic acid acyltransferase (LPAAT), 10 unigenes encoded phosphatidate phosphatase (PAP), and 4 unigenes encoded diacylglycerol acyltransferase (DGAT). Specifically, four unigenes were strongly upregulated in the kernel at 49 DAF as compared to 35 DAF: unigene42999 encoding GPAT; CL1842.Contig1 and CL1842.Contig2 encoding LPAAT; CL4004.Contig1 and CL18619.Contig1 encoding PAP; and CL5087.Contig1 encoding DGAT (CL5087.Contig1) (Additional file 13: Figure S9). This pattern was observed in neither the testa nor the pericarp. Five unigenes encoding phospholipid:diacylglycerol acyltransferase (PDAT) were identified as DEGs. One of these (CL7876.Contig4) was highly expressed throughout pericarp and testa development, but was downregulated at 49 DAF as compared to 35 DAF in the kernel. In the kernel, the expression of another PDAT-encoding unigene (unigene24505) increased dramatically from 35–49 DAF and then again from 77–91 DAF (Fig. 5b). This gene was highly expressed in neither the testa nor the pericarp (Additional file 13: Figure S9). We identified 29 unigenes associated with the formation of oil bodies in the cytoplasm that were also DEGs. Of these unigenes, six encoded oleosin (OLE), 19 encoded caleosin (CLE), and four encoded steroleosin (SLE) (Additional file 12: Table S4). All of these unigenes were stably expressed at very low levels during pericarp development, but were highly expressed throughout the development of the testa and very highly expressed throughout the development of the kernel (Additional file 13: Figure S9). In the kernel, bell-shaped expression profiles were observed for one CLE-encoding unigene (CL18131.Contig3), two OLE-encoding unigenes (unigene5123 and unigene17798), and one SLE-encoding unigene (CL19272.Contig1) (Fig. 5b).
TFs associated with FA and TAG biosynthesis
We predicted the DEGs that might encode TFs, and classified these into TF families (Fig. 6a). Of these TF DEGs, 301 belonged to the MYB family, 164 to the AP2-EREBP family, 161 to the FAR1 family, 132 to the NAC family, 115 to the MADS family, 114 to the bHLH family, 70 to the WRKY family, and 14 to the bZIP family. Within the TF DEGs, we also identified the unigenes related to the FA metabolism: nine in the WRI family, 20 in the bZIP family, one in the LEAFY family, six in the FUSCA (FUS) family, five in the ABSCISIC ACID-INSENSITIVE family, and 82 in the MYB family (Additional file 15: Table S5).
WRI1 (CL22311) was more strongly upregulated in the kernel than in the testa and the pericarp, especially during the early stages of FA accumulation (Fig. 6b). FUSCA (unigene25640) was also more strongly upregulated in the kernel than in the pericarp or the testa (Fig. 6b). The two unigenes encoding bZIP (unigeneCL3725 and unigeneCL15124) were constitutively expressed in all three tissues (Additional file 16: Fig. 11). The expression patters of the many TF DEGs in the MYB family differed among tissues. One unigenes (CL14921.Contig3) was highly expressed in the pericarp, but expressed only at low levels in the kernel and testa. Similarly, unigene25046 was also differentially expressed among tissues. In the kernel, unigene25046 expression increased from 35–49 DAF, and then decreased sharply until the end of the experiment (Additional file 16: Fig. 11). In the testa, unigene25046 was upregulated at 49 DAF, this high level of expression was maintained until the end of the experiment. In the pericarp, unigene25046 was expressed at low levels throughout the experiment. We also considered the unigenes encoding zinc finger CCCH domain-containing protein 54 (C3H54) and ethylene-responsive transcription factor (RA212). Unigene32844, encoding C3H54, and unigene43518, encodingRA212, were both upregulated during FA accumulation in the kernel, but were expressed only at low levels in the pericarp and testa (Fig. 6b).
Quantitative real-time PCR (qRT-PCR) validation of key DEGs
In general, the expression patterns of the 13 key genes chosen for validation were similar between the qRT-PCR and RNA-seq analyses (Fig. 5b and 7; Additional file Figure S12-15). For example, seven unigenes (encoding SAD, FAD2, FAD3, PDAT, OLE, CLE, and SLE), which exhibited bell-shaped patterns of expression during kernel development in the RNA-seq data, were observed to exhibit similar expression patterns in the qRT-PCR data. Importantly, both RNA-seq and qRT-PCR indicated that these seven unigenes were only expressed at low levels throughout the development of the testa and pericarp. However, the qRT-PCR data indicated that FAD3 expression in the testa increased sharply at 63 DAF (Fig. 7b). RNA-seq and qRT-PCR analyses showed that the remaining six unigenes were upregulated in the kernel, but expressed at low levels throughout the development of the pericarp and testa.