Effects of different light qualities on plant growth
At the final harvest (day 60), the leaf area was significantly lower in the RL and BL treatments relative to the WL treatment (Fig. 1A). Compared to the WL, the plant height was higher in the RL treatment, whereas lower in BL treatments (Fig. 1B). On the 60th day, the stem diameters of plants under the WL and BL treatments were significantly greater than those in RL treatment (Fig. 1C). Details of statistical data have been presented in Additional file 1: Table S1.
Illumina sequencing, de novo assembly, and annotation of the reference transcriptome
In total, 1,025 million raw reads were generated from the 15 cDNA libraries with a total of 1,009 million high-quality clean reads. Approximately 55.9-92.7 million raw reads were produced in each library (PRJNA664220). The relevant parameters of de novo assembly have been depicted in Additional file 2: Table S2. The results of the functional annotation from the seven public databases showed that 46.9% of unigenes were annotated in at least one database, while only 3.0% of unigenes were annotated in all the databases (Additional file 3: Table S3). A total of 33,626 unigenes were categorized into 56 functional groups among three categories: biological process, cellular component, and molecular function (Additional file 4: Fig. S1).
DEGs analysis
Differential expression analysis of two groups was performed. The results showed that 861 (BY vs. RY), 378 (BY vs. WY), 47 (RY vs. WY), 10,033 (WG vs. WY), 7917 (WJ vs. WG), and 6379 (WJ vs. WY) genes were expressed significantly.
The enrichment analysis was performed using the GO ontology analysis (Additional file 5: Fig. S2). A large number of DEGs were predominant in five terms, namely metabolic process, single-organism metabolic process, catalytic activity, ion binding, and transferase activity. Meanwhile, all DEGs were mapped in the KEGG database (Additional file 6: Fig. S3). Except for RY vs. WY, in each group, the DEGs were enriched in the biosynthesis of phenylpropanoids (including flavonoids). The phenylpropanoid-derived compounds were just originated from this biosynthetic pathway. Therefore, the follow-up experiment and analysis were performed to describe this regulatory mechanism based on transcriptomic and metabolomic profiling.
Analysis of metabolites
Based on the results obtained from the UPLC-MS/MS platform and self-compiled database, a total of 454 metabolites were detected, including various classes of lignans and coumarins, flavonoids, phenolic acids, and other. Moreover, the majority of them are phenylpropanoid-derived compounds, including flavonoids (90), phenolic acids (66), lignans and coumarins (11), and tannins (13) (Table 1 and Additional file 7: Table S4). Among them, the major ingredients of fraxetin, scopolin, isofraxidin, fraxidin, and esculin from the lignans and coumarins class and rosmarinic acid, rosmarinyl glucoside, chlorogenic acid, and esculetin from the phenolic acids class were isolated from S. glabra samples.
Table 1 Classes of the detected metabolites in S. glabra
Metabolite class
|
Number of metabolites
|
Metabolic class
|
Number of metabolites
|
Flavonoids
|
90
|
Organic acids
|
32
|
Alkaloids
|
39
|
Lipids
|
60
|
Phenolic acids
|
66
|
Tannins
|
13
|
Lignans and Coumarins
|
11
|
Terpenoids
|
10
|
Amino acids and derivatives
|
52
|
Others
|
53
|
Nucleotides and derivatives
|
28
|
|
|
The PCA (Fig. 2A) results showed that the tested samples were significantly different and their replication was generally consistent with that of others in the same group. The HCA (Fig. 2B) presented metabolic profiling and distributed them into 10 classes in all of the samples. Compared to WY and RY, most flavonoids contents (e.g. quercitrin and kaempferol) were higher in BY. Moreover, the distribution of chemicals was significantly different between the root (WG) and leaves (WY, RY, and BY) as a whole. Leaf tissues contained a higher abundance of compounds such as the classes of flavonoids, phenolic acids, and tannins, while the classes of lipids, amino acids and derivative alkaloids, organic acids, nucleotides and derivatives, lignans and coumarins, and terpenoids were much more abundant in the root tissue.
The results of the comparative analysis are shown in Table 2 and Fig. 3. There were 44, 87, and 296 compounds that were differentially produced in WY vs. RY, WY vs. BY, and WY vs. WG, respectively (Fig. 3A, B, and C). Besides, compared to WY, 11 flavonoids (e.g., kaempferol) and 6 phenolic acids (e.g., cryptochlorogenic acid) were significantly down-regulated in RY (Table 2 and Fig. 3D). Among them, only 3 flavonols compounds (kaempferol, kaempferin, and cynaroside) were mapped in KEGG pathway related to flavonoids biosynthesis. In WY vs. BY group, 40 flavonoids, 11 phenolic acids, and 2 lignans and coumarins had significant differences. Among them, 40 of these compounds were up-regulated and 13 compounds were down-regulated (Table 2 and Fig. 3D). Moreover, in the flavonoid biosynthesis, only 8 out of 40 flavonoids were mapped, and 6 of these increased in BY by 3.2-to 6.2-fold, such as quercitrin and kaempferol. However, the contents of cynaroside and protocatechuic aldehyde were approximately reduced by 4.5-fold in BY. For phenolic acids, seven metabolites contents were down-regulated, including caffeic acid, esculetin, and sinapinaldehyde, whereas others increased by 2.5- to 3.3-fold. For lignans and coumarins compounds, in BY, daphnetin production decreased by 9.5-fold and oxypeucedanin increased by 2.1-fold. In terms of WY vs. WG group, 129 phenylpropanoid-derived metabolites were significantly up- or down-regulated (Table 2 and Fig. 3D). Within the group of WY vs. WG, compounds derived from the biosynthesis of flavonoids mainly accumulated in WY instead of WG (Fig. 2B). In contrast, some coumarins (esculin, isofraxidin, and fraxidin) were more abundant in WG, but the production of other coumarins (scopolin and oxypeucedanin) and most phenolic acids compounds were much higher in WY.
Furthermore, we selected the main active constituents and compared their accumulation patterns in WY, RY, BY, and WG tissues (Fig. 4). Relative to WY, the production of esculetin, caffeic acid, isofraxidin, fraxidin, fumaric acid, and maleic acid were significantly reduced (fold change ≤ 0.5), while the yields of quercitrin and kaempferol were significantly up-regulated in BY (fold change ≥ 2). Besides, the production of cryptochlorogenic acid, cinnamic acid, esculin, and rutin were increased by 1.5-fold in BY. Meanwhile, compared to WY, the production of sinapic acid, esculetin, esculin, astilbin, fumaric acid, and maleic acid were increased by 1.3- to 1.7-fold, whereas the contents of cryptochlorogenic acid, cinnamic acid, and kaempferol were significantly decreased in RY (fold change ≤ 0.5). Finally, compared to WY, the production of chlorogenic acid, cryptochlorogenic acid, cinnamic acid, coumaric acid, rosmarinyl glucoside, scopolin, quercitrin, kaempferol, astilbin, phloretin 2'-O-glucoside, quercetin, and rutin were significantly reduced (fold change ≤ 0.5), whereas the yields of sinapic acid, esculetin, fraxetin, esculin, isofraxidin, fraxidin, fumaric acid, and maleic acid were significantly promoted in WG (fold change ≥ 2).
Table 2 The quantitative statistics of identified Phenylpropanoid-derived metabolites in
different groups
Group name
|
Identified
Metabolites
|
Phenylpropanoid-derived Metabolites
|
Phenylpropanoid-derived Metabolites
(significant)
|
Up-
regulated
|
Down-
regulated
|
WY vs. RY
|
450
|
180
|
17
|
0
|
17
|
WY vs. BY
|
450
|
180
|
53
|
40
|
13
|
WY vs. WG
|
454
|
180
|
129
|
30
|
99
|
Candidate genes involved in the phenylpropanoid biosynthesis
We have identified 46 candidate genes encoding most of the enzymes involved in the flavonoids, coumarins, and phenolic acid biosynthesis (Table 3, Fig.5A, Fig.5B, Additional file 8: Table S5, and Additional file 9: Table S6). Moreover, five genes related to rosmarinic acid biosynthesis were also found. The biosynthesis of flavonoids, phenolic acids, and most of the coumarins is derived from the core phenylpropanoid pathway, which is composed of three committed steps catalyzed by PAL, C4H, and 4CL enzymes. In this study, four PAL genes, four 4CL genes, and one C4H gene were identified, of which the PAL gene (Cluster-22883.50216), 4CL gene (Cluster-22883.47381), C4H gene (Cluster-22883.50271) exhibited the highest expression levels compared to other members and showed higher expression levels in BY and RY. In the chlorogenic acid pathway, five genes encoding hydroxycinnamoyltransferases were found, and the expression level of gene (Cluster-22883.48487) was much higher than other members. Besides, the gene (Cluster-22883.48487) was highly expressed in BY. Furthermore, the gene (Cluster-22883.21708) was only highly expressed in WG compared to other samples. The scopoletion biosynthesis is derived from a key precursor, namely ferulic CoA, produced by the enzymatic reactions of 3-O-methyltransferase (COMT) and 4CL on caffeic acid or by Caffeoyl CoA 3-O-methyltransferase (CCoAOMT) reacting with caffeoyl CoA. Compared to WY, only one COMT gene (Cluster-22883.56077) and one CCoAOMT gene (Cluster-22883.54879) were identified in this pathway, both of which have the highest expression levels in WG and down-regulated in RY and BY. Furthermore, five genes encoding 2-oxoglutarate and Fe(II)-dependent dioxygenases family proteins were characterized. Particularly, the F6'H and S8H proteins belong to 2-oxoglutarate and Fe(II)-dependent dioxygenases family. The unrooted phylogenetic tree showed that the proteins encoded by these five genes were highly similar to F6'H or S8H proteins (Additional file 10: Fig. S4). Among them, the protein encoded by a gene (Cluster-22883.51443) from S. glabra showed maximum similarity with S8H protein (XP_030962790.1) from Quercus lobata. In addition, among these genes, the gene (Cluster-22883.51443) was highly expressed in WG, while another gene (Cluster-22883.50772) had a significantly higher expression level in BY compared to other samples. The pathway to the product of rosmarinic acid consists of four consecutive enzyme reactions, starting from the precursor (tyrosine) and four enzymes tyrosine aminotransferase (TAT), hydroxyphenylpyruvate reductase (HPPR), rosmarinate synthase (RAS), and cytochrome P450 reductase (CYP). These were encoded by genes of Cluster-22883.49301, Cluster-22883.50853, and Cluster-22883.50532, respectively. For the rosmarinate synthase (RAS) protein, the unrooted phylogenetic tree showed that RAS protein encoded by the gene (Cluster-22883.29589) from S. glabra had the maximum similarity with RAS proteins from Juglans regia (XP_018826314.1) and Arachis ipaensis (XP_016205435.1) Additional file 11: Fig. S5). The gene of Cluster-22883.49301 showed no significantly differential expression level in WY, RY, and BY, but Cluster-22883.50853 was highly expressed in BY. The gene of Cluster-22883.50532 had a significantly higher expression level in WG, whereas its expression level was less in BY, relative to WY and RY.
Table 3 Summary of identified genes in the transcriptome of S. glabra
Gene
|
Number of
genes
|
Gene
|
Number of
genes
|
Gene
|
Number of genes
|
PAL
|
4
|
CHS
|
2
|
ANR
|
1
|
4CL
|
4
|
CHI
|
2
|
ANS
|
1
|
C4H
|
1
|
F3H
|
2
|
TAT
|
1
|
C3H
|
2
|
F3'H
|
1
|
HPPR
|
1
|
COMT
|
1
|
FLS
|
1
|
HPPD
|
1
|
CCoAOMT
|
1
|
UFGT
|
1
|
RAS
|
1
|
F6'H
|
4
|
RT
|
2
|
CPR
|
1
|
S8H
|
1
|
DFR
|
1
|
Hydroxycinnamoyl-
transferase
|
5
|
SGTF
|
1
|
LAR
|
1
|
O-MT
|
2
|
The flavonoids biosynthesis shared the core pathway incorporating CHS, CHI, and F3H enzymes and the expression levels of these genes encoding the above-mentioned enzymes were higher both in RY and BY. As the intermediate compound (dihydrokaempferol) was produced by the core enzymatic reactions, it would be flowed into several branches to generate anthocyanins, proanthocyanidins, flavonols, etc. The biosynthesis of kaempferol is catalyzed by flavonol synthase (FLS) enzyme on dihydrokaempferol, and FLS enzyme was encoded by the gene of Cluster-22883.47725, the expression level of Cluster-22883.47725 was significantly up-regulated in BY. Rutin also originates from the precursor of dihydrokaempferol through the enzymatic reactions by flavonoid 3'-hydroxylase (F3'H, encoded by Cluster-22883.55303), FLS, UDPG flavonoid O-glucosyltransferase (UFGT, encoded by Cluster-22883.49005), and rhamnosyl transferase (RT, encoded by Cluster-22883.85233 or Cluster-22883.49258). These genes (except Cluster-22883.85233) had similar expression patterns that BL had the potential to induce higher expression levels. Interestingly, the gene of Cluster-22883.85233 was mainly expressed in WG to synthesize rutin, but Cluster-22883.49258 showed a higher expression level in WY and WJ. A series of enzymes, including dihydroflavonol 4-reductase (DFR), leucoanthocyanidin reductase (LAR), anthocyanin synthase (ANS), and anthocyanidin reductase (ANR), are responsible for the production of catechin and epicatechin, which eventually synthesize the product of proanthocyanidins (PAs). The genes of DFR (Cluster-22883.42375) and LAR (Cluster-22883.41029) had higher expression levels in RY, while the gene of Cluster-22883.47730 encoding ANR had a higher expression level in BY. Additionally, the gene of ANS (Cluster-22883.50399) was highly expressed in RY and BY. Overall, the BL and RL cultivation increased the expression levels of many functional genes, such as some isoforms in the PAL gene family, 4CL gene family, CHS, CHI, F3H, C4H, and ANS.
Candidate transcription factors of MYB and bHLH
A total of 142 raw MYB (MYB and MYB-related) genes and 61 raw bHLH genes were collected. For MYB proteins, after screening, 92 MYB genes had different numbers of highly conserved DNA-binding domains and were divided into 4 classes: 1R-MYB, R2R3-MYB, 3R-MYB, and 4R-MYB proteins including 35, 53, 3, and 1 genes, respectively. Since the Arabidopsis genome was published, it has provided the first insight into the description and classification of plant MYB TFs. In the plants, R2R3-MYB proteins are the largest class in MYB family. They are involved in plant-specific processes and have been divided into subgroups in Arabidopsis based on the conservation of the DNA-binding domain and of amino acid motifs in the C terminal domain [32]. Here, we have identified all the primary structures of 53 R2R3-MYB domains in S. glabra (Additional file 12: Fig. S6). We constructed the phylogenetic tree with these 53 R2R3-MYB amino acid sequences from S. glabra and 125 full-length R2R3-MYB amino acid sequences from Arabidopsis thaliana (Fig. 6A). Numerous studies have shown that some subgroups of R2R3-MYB proteins have been involved in the regulation of flavonoids biosynthesis in Arabidopsis thaliana, including subgroups of S4, S5, S6, and S7. In subgroup 7, AtMYB11, AtMYB12, and AtMYB111 were closely related to flavonols biosynthesis. SgMYB38 was clustered with AtMYB111. SgMYB53 showed a high degree of functional similarity within the subgroup of S6 (AtMYB75/90/113/114), regulating the anthocyanin pigments pathway. Six R2R3-MYB proteins of SgMYB8/14/28/43/27/39 were in the same branch of the PAs biosynthetic pathway with AtMYB123 in subgroup 5. Moreover, we identified two R2R3-MYB proteins: SgMYB6 with a high amino acid sequence similarity to AtMYB38 (RAX2, S14), and SgMYB32 clustered with members of subgroup 16. The heat map showed the expression patterns of selected 10 genes mentioned earlier according to the Log2(FPKM) values (Fig. 6B). Based on the transcriptome results, the expression level of SgMYB38 (Cluster-22883.68401) was up-regulated in BY. In subgroup 5, the six genes had different expression patterns among different samples. The genes of SgMYB28 (Cluster-22883.46964), SgMYB8 (Cluster-22883.824), and SgMYB39 (Cluster-22883.68770) exhibited the highest expression levels in WY, compared to RY and BY. Meanwhile, the expression level of SgMYB43 (Cluster-22883.34211) displayed a significant increase in BY. Remarkably, the gene of SgMYB27 (Cluster-22883.3887) presented the highest expression level in WJ, while the genes of SgMYB6 (Cluster-22883.63255) and SgMYB32 (Cluster-22883.11906) were highly expressed in WG.
The bHLH proteins are another ancient and functionally diverse superfamily of TFs, which have been widely investigated in the past. In this study, we identified 53 genes encoding bHLH proteins that had the distinguishable signature domain. We constructed the phylogenetic tree with these 53 bHLH amino acid sequences from S. glabra and 182 full-length AtbHLH protein sequences from Arabidopsis thaliana (Fig. 7A). The results of multiple sequences alignment of SgbHLH domains have been displayed in Additional file 13: Fig. S7. As shown in Fig. 7A, all the bHLH proteins were divided into 21 subclasses indicating that the node values between different subclasses were low, but values within each subclass were high, suggesting that the latter have strong evolutionary relationships. Particularly, a previous study showed that the bHLH proteins were classified into 32 subclasses [43], with subclasses of S2, S5, and S24 (corresponding to subclasses of S8, S7, and S15 of AtbHLH proteins) involved in the regulation of flavonoids biosynthesis. In the subclass of S8, we identified a group of SgbHLH proteins, namely SgbHLH22/24/25/43/45, which showed high sequence similarity with other AtbHLH protein members. Compared to the similar gene-expression change between the WY and RY, the relative expression levels of these five genes encoding SgbHLH proteins from S8 were significantly lower in BY. Particularly, the gene of SgbHLH45 (Cluster-22883.52750) was significantly up-regulated in WJ and WG. Within the subclass of S7, SgbHLH13 and SgbHLH12 were clustered with the subset of At1g63650 (EGL), At5g41315 (GL3), At4g00480 (MYC1), and At4g09820 (TT8) that associated with some R2R3-MYB members to participate in the regulation of flavonoids metabolism, trichome formation, epidermal cell fate specification, and in the formation of hair and non-hair cells in root epidermis cells [43]. The gene of SgbHLH12 (Cluster-22883.46083) showed no significantly differential expression pattern in WY, RY, BY, and WG, but it had a lower expression level in WJ. However, the gene of SgbHLH13 (Cluster-22883.47507) had the highest expression level in WJ. Furthermore, three bHLH proteins of SgbHLH4/27/37 belonged to the subclass of S15. The expression level of SgbHLH4 (Cluster-22883.57401) was slightly reduced in BY and RY, compared to WY, but SgbHLH27 (Cluster-22883.76231) showed a lower level in BY, WJ, and WG. The gene of SgbHLH37 (Cluster-22883.51340) had a much higher expression level in BY (Fig. 7B).
Validation of the transcript expression using qRT-PCR
To confirm the accuracy of the RNA-seq results, a total of 24 genes (five R2R3-MYB genes, five bHLH genes, and another 14 functional genes related to flavonoids biosynthesis) were selected (Additional file 8: Table S5). The qRT-PCR results revealed that the expression levels of 24 selected genes were mostly consistent with those obtained by RNA-seq (Fig. 8). Furthermore, for each selected gene, the high correlation coefficient (r > 0.600) was found between qRT-PCR and RNA-seq. Consequently, the findings of the qRT-PCR revealed that the RNA-seq data were reliable and could be used in future experiments.