Effects of different light qualities on plant growth
At the final harvest (day 60), the leaf area was significantly lower in the RL and BL treatments than in the WL treatment (Fig. 1A). The plant height was significantly higher and lower in the RL and BL treatments, respectively, compared to the WL (Fig. 1B). Finally, on day 60, the stem diameter of samples in the WL and BL treatments was significantly larger than those in RL treatment (Fig. 1C). All statistical data were shown in Additional file 1: Table S1.
Illumina sequencing, de novo assembly, and annotation of the reference transcriptome
In total, 1,025 million raw reads were generated from the 15 cDNA libraries with a total of 1,009 million high-quality clean reads. To be specific, each library produced about 55.9-92.7 million raw reads (CRA002486). The relevant parameters of de novo assembly were shown in Additional file 2: Table S2. The results of the functional annotation from the seven public databases showed that 46.9 % of unigenes were annotated in at least one database, while only 3.0 % of unigenes were annotated in all the databases (Additional file 3: Table S3). A total of 33,626 unigenes were categorized into 56 functional groups among three categories: biological process, cellular component, and molecular function (Additional file 4: Fig. S1).
DEGs analysis
To identify DEGs in the 15 samples, a pairwise comparison was performed and filed with padj < 0.05 and |log2FoldChange| > 1. The results showed that 861 (BY vs RY), 378 (BY vs WY), 47 (RY vs WY), 10,033 (WG vs WY), 7917 (WJ vs WG), and 6379 (WJ vs WY) unigenes were significantly expressed.
The enrichment analysis was performed using the GO ontology analysis (Additional file 5: Fig. S2a-e). A large number of DEGs were predominant in five terms, namely: metabolic process, single-organism metabolic process, catalytic activity, ion binding, and transferase activity. Meanwhile, all DEGs were mapped in the KEGG database (Additional file 6: Fig. S3a-f). Interestingly, in almost every group (apart from RY vs WY) the DEGs were enriched in the biosynthesis of phenylpropanoids (including flavonoids). The phenylpropanoid-derived compounds are just originated from this biosynthetic pathway. Therefore, the follow-up experiment and analysis were performed to describe this regulatory mechanism based on transcriptomic and metabolomic profiling.
Analysis of metabolites
Based on the results obtained from the UPLC-MS/MS platform and self-compiled database, a total of 454 metabolites were detected, including various classes of lignans and coumarins, flavonoids, phenolic acids and other. Moreover, the majority of them are phenylpropanoid-derived compounds, including flavonoids (90), phenolic acids (66), lignans and coumarins (11), and tannins (13) (Table 1 and Additional file 7: Table S4). Among them, the major ingredients of fraxetin, scopolin, isofraxidin, fraxidin and esculin from the lignans and coumarins class and rosmarinic acid, rosmarinyl glucoside, chlorogenic acid, and esculetin from the phenolic acids class were isolated from S. glabra samples.
Table 1 Classes of the detected metabolites in S. glabra
Metabolite class
|
Number of metabolites
|
Metabolic class
|
Number of metabolites
|
Flavonoids
|
90
|
Organic acids
|
32
|
Alkaloids
|
39
|
Lipids
|
60
|
Phenolic acids
|
66
|
Tannins
|
13
|
Lignans and Coumarins
|
11
|
Terpenoids
|
10
|
Amino acids and derivatives
|
52
|
Others
|
53
|
Nucleotides and derivatives
|
28
|
|
|
The PCA (Fig. 2A) results showed that the samples tested were significantly different and that replication was generally consistent with that of others in the same group. The HCA (Fig. 2B) presented metabolic profiling and distributed them into 10 classes in all of the samples. Compared to WY and RY, most flavonoids contents (e.g. quercitrin and kaempferol) were higher in BY. Moreover, the chemicals distribution was significantly different between the root (WG) and leaves (WY, RY, and BY) as a whole. Leaf tissues contained a higher abundance of compounds such as the classes of flavonoids, phenolic acids, and tannins, while the classes of lipids, amino acids and derivative alkaloids, organic acids, nucleotides and derivatives, lignans and coumarins, terpenoids were much more abundant in the root tissue.
The results of the comparative analysis are shown in Table 2 and Fig. 3. There were 44, 87, and 296 compounds that were differentially produced in WY vs RY, WY vs BY, and WY vs WG, respectively (Fig. 3A, B, and C). Besides, compared with WY, 11 flavonoids (e.g., kaempferol) and 6 phenolic acids (e.g., cryptochlorogenic acid) were significantly down-regulated in RY (Table 2 and Fig. 3D). Among them, only 3 flavonols compounds (kaempferol, kaempferin, and cynaroside) were mapped in KEGG pathway related to flavonoids biosynthesis. In WY vs BY group, 40 flavonoids, 11 phenolic acids, and 2 lignans and coumarins had significant differences. Among them, 40 of these compounds were up-regulated and 13 compounds were down-regulated (Table 2 and Fig. 3D). In addition, only 8 out of 40 flavonoids were mapped in the flavonoid biosynthesis, and 6 of which increased by 3.2- to 6.2-fold in BY, such as quercitrin and kaempferol. However, the content of cynaroside and protocatechuic aldehyde were approximately reduced by 4.5-fold in BY. For phenolic acids, seven metabolites contents were down-regulated, including caffeic acid, esculetin, and sinapinaldehyde, whereas others increased by 2.5- to 3.3-fold. For lignans and coumarins compounds, in BY, daphnetin production decreased by 9.5-fold and oxypeucedanin increased by 2.1-fold. In terms of WY vs WG group, 129 phenylpropanoid-derived metabolites were significantly up- or down-regulated (Table 2 and Fig. 3D). Within the group of WY vs WG, compounds derived from flavonoids biosynthesis mainly accumulated in WY rather than WG (Fig. 2B). In contrast, some coumarins (esculin, isofraxidin, and fraxidin) were more abundant in WG, but the production of other coumarins (scopolin and oxypeucedanin) and most phenolic acids compounds were much higher in WY.
Furthermore, we selected the main active constituents and compared their accumulation patterns in WY, RY, BY, and WG tissues (Fig. 4). Relative to WY, the production of esculetin, caffeic acid, isofraxidin, fraxidin, fumaric acid, and maleic acid were significantly reduced (fold change ≤0.5), while the yields of quercitrin and kaempferol were significantly up-regulated in BY (fold change ≥2). Besides, the production of cryptochlorogenic acid, cinnamic acid, esculin, and rutin were increased by 1.5-fold in BY. Meanwhile, compared to WY, the production of sinapic acid, esculetin, esculin, astilbin, fumaric acid, and maleic acid were increased by 1.3- to 1.7-fold, whereas the contents of cryptochlorogenic acid, cinnamic acid, and kaempferol were significantly decreased in RY (fold change ≤0.5). Finally, compared to WY, the production of chlorogenic acid, cryptochlorogenic acid, cinnamic acid, coumaric acid, rosmarinyl glucoside, scopolin, quercitrin, kaempferol, astilbin, phloretin 2'-O-glucoside, quercetin, and rutin were significantly reduced (fold change ≤0.5), whereas the yields of sinapic acid, esculetin, fraxetin, esculin, isofraxidin, fraxidin, fumaric acid, and maleic acid were significantly promoted in WG (fold change ≥2).
Table 2 The quantitative statistics of identified Phenylpropanoid-derived metabolites in
different groups
Group name
|
Identified
Metabolites
|
Phenylpropanoid-derived Metabolites
|
Phenylpropanoid-derived Metabolites
(significant)
|
Up-
regulated
|
Down-
regulated
|
WY vs RY
|
450
|
180
|
17
|
0
|
17
|
WY vs BY
|
450
|
180
|
53
|
40
|
13
|
WY vs WG
|
454
|
180
|
129
|
30
|
99
|
Candidate genes involved in the phenylpropanoid biosynthesis
We have identified 46 candidate unigenes encoding most of the enzymes involved in the flavonoids, coumarins, and phenolic acid biosynthesis (Table 3 and Additional file 8: Table S5). Moreover, five unigenes relating to rosmarinic acid biosynthesis were also detected. Whatever flavonoids, phenolic acids, and most of the coumarins biosynthesis, they all shared the core phenylpropanoid pathway, comprising the three committed steps catalyzed by PAL, C4H, and 4CL enzymes. In this study, four PAL unigenes, four 4CL unigenes, and one C4H unigene were identified, of which the PAL unigene (Cluster-22883.50216), 4CL unigene (Cluster-22883.47381), C4H unigene (Cluster-22883.50271) exhibited the highest expression levels compared to other members and showed higher expression levels in BY and RY. In the chlorogenic acid pathway, five unigenes encoding hydroxycinnamoyltransferases were found, and the expression level of unigene (Cluster-22883.48487) was much higher than other members. Besides, the unigene (Cluster-22883.48487) was highly expressed in BY. Furthermore, the unigene (Cluster-22883.21708) was only highly expressed in WG compared to other samples. The scopoletion biosynthesis is derived from a key precursor, namely ferulic CoA, produced by the enzymatic reactions of 3-O-methyltransferase (COMT) and 4CL on caffeic acid or by Caffeoyl CoA 3-O-methyltransferase (CCoAOMT) reacting with caffeoyl CoA. In this pathway, only one COMT unigene (Cluster-22883.56077) and one CCoAOMT unigene (Cluster-22883.54879) were detected, which both of them have highest expression levels in WG and down-regulated in RY and BY compared to WY. Furthermore, five unigenes encoding 2-oxoglutarate and Fe(II)-dependent dioxygenases family proteins were characterized. Particularly, the F6´H and S8H proteins belong to 2-oxoglutarate and Fe(II)-dependent dioxygenases family. The unrooted phylogenetic tree showed that the proteins encoded by these five unigenes were highly similar to F6'H or S8H proteins (Additional file 9: Fig. S4). Among them, the protein encoded by unigene (Cluster-22883.51443) from S. glabra showed maximum similarity with S8H protein (XP_030962790.1) from Quercus lobata. In addition, among these unigenes, the unigene (Cluster-22883.51443) was highly expressed in WG, while another unigene (Cluster-22883.50772) had a significantly higher expression level in BY compared to other samples. The pathway to the product of rosmarinic acid consists of four consecutive enzyme reactions, starting from the precursor (tyrosine) and four enzymes tyrosine aminotransferase (TAT), hydroxyphenylpyruvate reductase (HPPR), rosmarinate synthase (RAS), and cytochrome P450 reductase (CYP), which encoded by unigenes of Cluster-22883.49301, Cluster-22883.50853, and Cluster-22883.50532, respectively. As for the rosmarinate synthase (RAS) protein, the unrooted phylogenetic tree showed that RAS protein encoded by the unigene (Cluster-22883.29589) from S. glabra had the maximum similarity with RAS proteins from Juglans regia (XP_018826314.1) and Arachis ipaensis (XP_016205435.1) Additional file 10: Fig. S5). The unigene of Cluster-22883.49301 showed no significantly differential expression level in WY, RY, and BY, but Cluster-22883.50853 was highly expressed in BY. The unigene of Cluster-22883.50532 had a significantly higher expression level in WG, whereas its expression level was less in BY, compared to WY and RY.
Table 3 Summary of identified unigenes in the transcriptome of S. glabra
Gene
|
Number of
unigenes
|
Gene
|
Number of
unigenes
|
Gene
|
Number of unigenes
|
PAL
|
4
|
CHS
|
2
|
ANR
|
1
|
4CL
|
4
|
CHI
|
2
|
ANS
|
1
|
C4H
|
1
|
F3H
|
2
|
TAT
|
1
|
C3H
|
2
|
F3'H
|
1
|
HPPR
|
1
|
COMT
|
1
|
FLS
|
1
|
HPPD
|
1
|
CCoAOMT
|
1
|
UFGT
|
1
|
RAS
|
1
|
F6'H
|
4
|
RT
|
2
|
CPR
|
1
|
S8H
|
1
|
DFR
|
1
|
Hydroxycinnamoyl-
transferase
|
5
|
SGTF
|
1
|
LAR
|
1
|
O-MT
|
2
|
The flavonoids biosynthesis shared the core pathway incorporating CHS, CHI, and F3H enzymes, and the expression levels of these unigenes encoding above-mentioned enzymes were higher in RY and BY. When the intermediate compound (dihydrokaempferol) was produced by the core enzymatic reactions, it would be flowed into several branches to generate anthocyanins, proanthocyanidins, flavonols, etc. The biosynthesis of kaempferol is catalyzed by flavonol synthase (FLS) enzyme on dihydrokaempferol, and FLS enzyme was encoded by unigene of Cluster-22883.47725, the expression level of Cluster-22883.47725 was significantly up-regulated in BY. Rutin also originates from the precursor of dihydrokaempferol through the enzymatic reactions by flavonoid 3'-hydroxylase (F3'H, encoded by Cluster-22883.55303), FLS, UDPG flavonoid O-glucosyltransferase (UFGT, encoded by Cluster-22883.49005) and rhamnosyl transferase (RT, encoded by Cluster-22883.85233 or Cluster-22883.49258). These unigenes (except Cluster-22883.85233) had similar expression patterns that BL had the potential to induce the higher expression levels. Interestingly, the unigene of Cluster-22883.85233 was mainly expressed in WG to synthesize rutin, but Cluster-22883.49258 showed a higher expression level in WY and WJ. A series of enzymes, including dihydroflavonol 4-reductase (DFR), leucoanthocyanidin reductase (LAR), anthocyanin synthase (ANS), and anthocyanidin reductase (ANR), are responsible for the production of catechin and epicatechin, which eventually synthesize the product of proanthocyanidins (PAs). The unigenes of DFR (Cluster-22883.42375) and LAR (Cluster-22883.41029) had higher expression levels in RY, while unigene of Cluster-22883.47730 encoding ANR had a higher expression level in BY. Additionally, the unigene of ANS (Cluster-22883.50399) was highly expressed in RY and BY. Overall, the BL and RL cultivation increased the expression levels in a majority of functional genes, such as some isoforms in the PAL gene family, 4CL gene family, CHS, CHI, F3H, C4H, and ANS. The phenylalanine metabolic pathways were shown in Fig. 5A and the heat map diagram of the aforementioned unigenes with the values of Log2(FPKM) were presented in Fig. 5B.
Candidate transcription factors of MYB and bHLH
A total of 142 raw MYB (MYB and MYB-related) unigenes and 61 raw bHLH unigenes were collected. For MYB proteins, after screening, 92 MYB unigenes had different numbers of highly conserved DNA-binding domains and were divided into 4 classes: 1R-MYB, R2R3-MYB, 3R-MYB, and 4R-MYB proteins including 35, 53, 3, and 1 unigenes, respectively. Since the Arabidopsis genome was published, it has provided the first insight into the description and classification of plant MYB TFs. In the plants, R2R3-MYB proteins are the largest class in MYB family. They are involved in plant-specific processes and have been divided into subgroups in Arabidopsis based on the conservation of the DNA-binding domain and of amino acid motifs in the C terminal domain [32]. Here, we have identified all the primary structures of 53 R2R3-MYB domains in S. glabra (Additional file 11: Fig. S6). We constructed the phylogenetic tree with these 53 R2R3-MYB amino acid sequences from S. glabra and 125 full-length R2R3-MYB amino acid sequences from Arabidopsis thaliana (Fig. 6A). Numerous studies have shown that some subgroups of R2R3-MYB proteins have been involved in the regulation of flavonoids biosynthesis in Arabidopsis thaliana, including subgroups of S4, S5, S6, and S7. In subgroup 7, AtMYB11, AtMYB12, and AtMYB111 were closely related to flavonols biosynthesis. SgMYB38 was clustered with AtMYB111. SgMYB53 showed a high degree of functional similarity within the subgroup of S6 (AtMYB75/90/113/114), regulating the anthocyanin pigments pathway. A subset of six R2R3-MYB proteins of SgMYB8/14/28/43/27/39 were in the same branch of the PAs biosynthetic pathway with AtMYB123 in subgroup 5. Moreover, we identified two R2R3-MYB proteins: SgMYB6 with a high amino acid sequence similarity to AtMYB38 (RAX2, S14), and SgMYB32 clustered with members of subgroup 16. The heat map showed the expression patterns of selected 10 unigenes mentioned earlier according to the Log2(FPKM) values (Fig. 6B). Based on the transcriptome results, the expression level of SgMYB38 (Cluster-22883.68401) was up-regulated in BY. In subgroup 5, the six unigenes had different expression patterns among different samples. The unigenes of SgMYB28 (Cluster-22883.46964), SgMYB8 (Cluster-22883.824), and SgMYB39 (Cluster-22883.68770) exhibited the highest expression levels in WY, compared to RY and BY. Meanwhile, the expression level of SgMYB43 (Cluster-22883.34211) displayed a significant increase in BY. Remarkably, the unigene of SgMYB27 (Cluster-22883.3887) presented a highest expression level in WJ, while the unigenes of SgMYB6 (Cluster-22883.63255) and SgMYB32 (Cluster-22883.11906) were highly expressed in WG.
The bHLH proteins are another ancient and functionally diverse superfamily of TFs, which have been widely investigated in the past. In this study, we have identified 53 unigenes encoding bHLH proteins that have the distinguishable signature domain. We constructed the phylogenetic tree with these 53 bHLH amino acid sequences from S. glabra and 182 full-length AtbHLH protein sequences from Arabidopsis thaliana (Fig. 7A). The results of multiple sequences alignment of SgbHLH domains were displayed in Additional file 12: Fig. S7. As shown in Fig. 7A, all the bHLH proteins were divided into 21 subclasses indicating that the node values between different subclasses were low, but values within each subclass were high, suggesting that the latter have strong evolutionary relationships. Specially, a previous study showed that the bHLH proteins were classified into 32 subclasses [43], with subclasses of S2, S5, and S24 (corresponding to subclasses of S8, S7, and S15 of AtbHLH proteins) involved in the regulation of flavonoid biosynthesis. In the subclass of S8, we have identified a group of SgbHLH proteins, namely SgbHLH22/24/25/43/45, which showed high sequence similarity with other AtbHLH protein members. Compared to the similar gene-expression change between the WY and RY, the relative expression levels of these five unigenes encoding SgbHLH proteins from S8 were significantly lower in BY. Particularly, the unigene of SgbHLH45 (Cluster-22883.52750) was significantly up-regulated in WJ and WG. Within the subclass of S7, SgbHLH13 and SgbHLH12 were clustered with the subset of At1g63650 (EGL), At5g41315 (GL3), At4g00480 (MYC1), and At4g09820 (TT8) that associated with some R2R3-MYB members to participate in the regulation of flavonoids metabolism, trichome formation, epidermal cell fate specification, and in the formation of hair and non-hair cells in root epidermis cells [43]. The unigene of SgbHLH12 (Cluster-22883.46083) showed no significantly differential expression pattern in WY, RY, BY, and WG, but it had a lower expression level in WJ. However, the unigene of SgbHLH13 (Cluster-22883.47507) had the highest expression level in WJ. Furthermore, three bHLH proteins of SgbHLH4/27/37 belonged to the subclass of S15. The expression level of SgbHLH4 (Cluster-22883.57401) was slightly reduced in BY and RY, compared to WY, but SgbHLH27 (Cluster-22883.76231) showed a lower level in BY, WJ, and WG. The unigene of SgbHLH37 (Cluster-22883.51340) had a much higher expression level in BY (Fig. 7B).
Validation of the transcript expression using qRT-PCR
To confirm the accuracy of the RNA-seq results, a total of 24 unigenes (five R2R3-MYB unigenes, five bHLH unigenes, and another 14 functional genes related to flavonoids biosynthesis) were selected (Additional file 8: Table S5). The qRT-PCR results revealed that the expression levels of 24 selected unigenes were mostly consistent with those obtained by RNA-seq (Fig. 8). Furthermore, for each selected unigene, the high correlation coefficient (r > 0.600) was found between qRT-PCR and RNA-seq. Therefore, the qRT-PCR results showed that RNA-seq data were accurate and could be used in future experiments.