TM1 is a suitable medium for TOL production
To select a culture medium for TOL production, KMLY1-2 was cultured in transformation medium (TM) containing different carbon and nitrogen sources (Table S1). No TOL was produced in TM1 and TM2, and the highest yield (only 1.66 mg/L) was detected in TM4 (Fig. S1a). After the addition of 1 g/L Trp, the yield of TOL in TM1 and TM2, two media without nitrogen, increased significantly, with TM1 giving the highest yield of 236.68 mg/L. In addition, TOL production decreased with increasing nitrogen content in TM3-1Trp−TM5-1Trp, with the yield of TOL in TM5-1Trp being approximately 4.93% of that in TM1-1Trp (Fig. S1b). The data indicated that, except for Trp, high nitrogen was not conducive to TOL production, which is highly consistent with a previous report by Chen and Fink [16]. Therefore, TM1, which lacks nitrogen and gave the highest TOL yield, was chosen for subsequent experiments.
TOL production is dependent on cell density and the expression of key genes
Yeast growth and TOL production were measured over time. As shown in Fig. 1a, 0−24 h and 24−42 h were identified as the exponential phase and stationary phase, respectively, according to the time−OD600 curve. The TOL content increased sharply from 0 to 24 h and remained constant from 24 to 42 h, reaching 211.46 mg/L at 24 h. The close agreement between TOL production and cell growth indicated that TOL production is closely related to cell density. To clarify whether this process requires TOL biosynthetic genes, four key genes (aro8, aro9, aro10 and aro80) were selected and their expression levels were analysed. The expression of aro8, aro10 and aro80 gradually increased from 0 h to 18 h, while that of aro9 peaked at 12 h and slightly decreased at 18 h. In addition, these genes showed stable expression levels after 24 h (Fig. 1b). The expression profile of the key genes is consistent, to a certain extent, with the pattern of TOL production and growth, indicating that the dependence of TOL yield on cell density requires the expression of aromatic aminotransferases (ARO8 and ARO9), a decarboxylase (ARO10) and a transcription factor (ARO80), which is partly congruent with the report by Chen and Fink [16]. Therefore, 24 h is a critical time point and was used as the fermentation time of KMLY1-2 in the subsequent experiments.
TOL production is dependent on Trp and Phe concentrations
To explore the effects of Trp and Phe on TOL biosynthesis, TOL production was monitored after KMLY1-2 was incubated in TM1 with different concentrations of Trp and Phe. As shown in Fig. 2a, TOL yield increased proportionally as the Trp concentration increased but was more or less flat (231.02−266.31 mg/L) when ≥0.6 g/L Trp was supplied to the medium. Meanwhile, increasing amounts of residual Trp accumulated accordingly. These data indicated that the ability of KMLY1-2 to convert Trp into TOL was saturated at 0.6 g/L Trp. However, Phe did not affect TOL production when it was the sole nitrogen source in the medium (data not shown), which was ascribed to Phe being a direct precursor for 2-phenylethanol biosynthesis [9]. In addition, TOL content was significantly reduced when KMLY1-2 was cultured in TM1 containing both Trp and Phe as nitrogen sources, and the reduction was strongly dependent on the Phe concentration (Fig. 2b). This may be attributed to nitrogen catabolite repression (NCR), in which high ammonia restricts TOL production by repressing the transcript levels of genes in the Ehrlich pathway [10, 16]. However, the mechanism by which Phe affects the conversion of Trp to TOL is still unclear. In contrast with the results for extracellular TOL, 0.6 and 1.5 g/L Trp had significantly different effects on intracellular TOL production (Fig. 2c). This discrepancy was mainly due to differences in biomass, as the intracellular TOL yield was normalized by weight.
Metabolomic profiles
The chemical profile of the KMLY1-2 endometabolome was generated by liquid chromatography/mass spectrometry, and a total of 4473 metabolites with definite names were identified. Principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) showed that the sample replicates were tightly clustered and the samples from different media were clearly separated (Fig. S2), suggesting that the metabolomic data were highly reproducible. Of these metabolites, 1011, 1201, 281 and 694 differential metabolites (DMs), including 30, 30, 14 and 11 compounds of Trp and its derivatives, were identified in TM vs. TM-06T, TM vs. TM-15T, TM-06T vs. TM-15T, and TM-06T vs. TM-TP, respectively (Table S2). The Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of DMs showed that biosynthesis of antibiotics (ko01130) was the only shared enriched pathway in TM vs. TM-06T, TM vs. TM-15T and TM-06T vs. TM-TP (Table S2), suggesting that the addition of Trp and Phe to the medium may affect the biosynthesis of some antibiotics [22]. In addition, the abundance of DMs in the Ehrlich pathway and its bypass of Trp metabolism was analysed. As shown in Fig. 3, intracellular L-Trp abundance, as expected, increased with the increase in exogenous Trp concentration, which was not different between TM-06T and TM-TP, indicating that Phe had no effect on Trp transportation. A similar abundance trend for intracellular Trp was observed in IPA, IAD, tryptamine and IAA, suggesting that both the Ehrlich pathway and the tryptamine-dependent pathway (Trp→tryptamine→IAD) are involved in TOL biosynthesis of S. cerevisiae, as reported for the fungus Neurospora crassa [23], although the functional genes in the tryptamine pathway have not been identified in S. cerevisiae. 
In addition, the increases in IPA and IAD abundance from TM to TM-06T were smaller than those of L-Trp and IPA, respectively, while the relative declines in IPA and IAD abundance from TM-06T to TM-TP were larger than that of L-Trp (Fig. 3a). The former indicates that transamination and decarboxylation are two rate-limiting steps in TOL biosynthesis, while the latter shows that these steps are susceptible to Phe, which may result in the NCR phenomenon of TOL biosynthesis due to Phe or phenylpyruvate (PPA) competing with Trp or IPA for the active centre of transaminase and decarboxylase, respectively. For TOL abundance, the peak value was identified in TM-06T, which is highly consistent with the intracellular high-performance liquid chromatography (HPLC) data in Fig. 2c. The relatively low abundance in TM-15T may be ascribed to the reversible conversion of TOL to IAD by alcohol dehydrogenase, followed by conversion to IAA by aldehyde dehydrogenase. In contrast to its effect on the abovementioned metabolites, Phe promoted indolelactate accumulation (Fig. 3b), and the reason remains to be studied.
General features of the KMLY1-2 genome
PacBio sequencing generated an assembled nuclear genome of 11.79 Mbp in 31 contigs (∼113 × coverage) with 38.1% GC content, similar to previously reported S. cerevisiae strains [24]. A total of 5539 protein-coding sequences (CDSs) with an average length of 1476 bp were predicted, representing 69.31% of the genome. In addition, 3 rRNA and 315 tRNA genes were identified in the genome. Among all the predicted proteins, 5537, 5527, 2733 and 3795 CDSs were assigned to the non-redundant protein (Nr), Swiss-Prot, eukaryotic orthologous groups of proteins (KOG), and KEGG databases, respectively, based on sequence homologies, yielding 2376 shared annotated CDSs in total. According to the KEGG annotation, the pathways for glycolysis, citrate cycle, pentose phosphate cycle, amino acid biosynthesis, and purine and pyrimidine metabolism were complete. Furthermore, the candidate genes involved in TOL biosynthesis in KMLY1-2 and S288C (a model S. cerevisiae strain) were subjected to a comparative analysis. As shown in Table 1, a complete Ehrlich pathway was identified, containing seven aminotransferases, four decarboxylases, and seven dehydrogenases, which showed high homology (98.66−100%, except for ADH3). Genomic BLAST analysis showed that the adh3 gene in KMLY1-2 was highly homologous to four consecutive genes in S288C (adh3, YMR084W, YMR085W and seg1), indicating that it encodes a tetrafunctional polypeptide. Additionally, nine proteins (ARO1−ARO4 and TRP1−TRP5) with 97.93−100% identity were responsible for Trp biosynthesis from E4P and PEP via chorismate, meaning that both KMLY1-2 and S288C have the complete de novo biosynthetic pathway of TOL. Similarly, genes in the auxiliary pathway showed high homology (83.8−99.89%) between KMLY1-2 and S288C, suggesting that these genes are also conserved in yeast.
Nevertheless, the TOL yield of KMLY1-2 was approximately 1.57 times that of S288C when both strains were cultivated in TM1 supplemented with 1.5 g/L Trp, which may be attributed to different regulatory mechanisms during TOL biosynthesis.
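As a rough consistency check on the genome statistics reported in this section, the coding fraction can be re-derived from the CDS count, the mean CDS length and the assembly size. The sketch below is illustrative only: the function name is hypothetical, and it assumes non-overlapping CDSs and uses the rounded values quoted in the text, so a small deviation from the reported 69.31% is expected.

```python
def coding_fraction(n_cds: int, mean_cds_len_bp: float, genome_size_bp: float) -> float:
    """Fraction of the genome covered by CDSs, assuming CDSs do not overlap."""
    return n_cds * mean_cds_len_bp / genome_size_bp


# Rounded values quoted in the text: 5539 CDSs, 1476 bp mean length, 11.79 Mbp assembly
fraction = coding_fraction(5539, 1476, 11.79e6)
print(f"Estimated coding fraction: {fraction:.2%}")
```

With these rounded inputs the estimate comes out at roughly 69.3%, which agrees with the reported value to within rounding.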
Transcriptome sequencing and analysis of differentially expressed genes (DEGs)
With the aim of dissecting the molecular mechanism of yeast TOL biosynthesis and regulation, transcriptome sequencing and analysis of TM, TM-06T, TM-15T and TM-TP were performed. Approximately 36623732−89449834 clean reads (99.82−99.93% of raw reads) were generated in 12 RNA-seq libraries, and 89.74−96.06% of the total reads mapped to the KMLY1-2 reference genome, among which 45.69−81.24% were CDS-mapped reads (Table S3). A correlation analysis of three replicates showed high biological reproducibility (average Pearson correlation coefficient of 0.98). The transcript levels determined by the average FPKM (fragments per kilobase of transcript per million mapped reads) values showed that 5495, 5492, 5383 and 5288 genes were expressed in TM, TM-06T, TM-15T and TM-TP, respectively. According to the criteria described in the methods, there were 2183, 2068, 43, 2297, 1018 and 527 DEGs in the comparisons TM vs. TM-06T, TM vs. TM-15T, TM-06T vs. TM-15T, TM vs. TM-TP, TM-06T vs. TM-TP, and TM-15T vs. TM-TP, respectively (Fig. S3a). Since no DEGs were shared across all of these comparisons, we divided them into group I (TM vs. TM-06T and TM vs. TM-15T) and group II (TM-06T vs. TM-TP and TM-15T vs. TM-TP), and identified 1447 shared DEGs in group I, 253 in group II, and 201 in both groups (Fig. S3b), which may be closely associated with the fact that Trp facilitated TOL accumulation and Phe reduced TOL production. Using the short time series expression miner (STEM) algorithm, these shared DEGs were clustered into 17 profiles (Fig. S3c−e). The transcript levels of DEGs in profile 1 followed an “increase-keep-decrease” pattern from TM to TM-06T and TM-15T to TM-TP, which closely matched the extracellular TOL yield data in Fig. 2a and b.
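For readers less familiar with the quantities used above, the following minimal sketch shows how an FPKM value is computed from its standard definition and how DEGs shared between comparisons can be found by set intersection. It is not the pipeline used in this study; the function names, fragment counts and gene identifiers are hypothetical placeholders.

```python
from typing import Iterable, Set


def fpkm(fragments: int, transcript_len_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_len_bp * total_mapped_fragments)


def shared_degs(comparisons: Iterable[Set[str]]) -> Set[str]:
    """Return the DEGs present in every comparison (set intersection)."""
    sets = list(comparisons)
    return set.intersection(*sets) if sets else set()


# Hypothetical example: 500 fragments on a 1000 bp transcript, 10 million mapped fragments
example_fpkm = fpkm(500, 1000, 10_000_000)

# Hypothetical gene IDs standing in for the DEG lists of two comparisons
group_i = shared_degs([{"g1", "g2", "g3"}, {"g2", "g3", "g4"}])
print(example_fpkm, sorted(group_i))
```

Applying the intersection first within each group and then across the two group-level results mirrors the grouping into group I and group II described above.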
Further analyses showed that one aminotransferase (HIS5), two decarboxylases (ARO10 and PDC5), a chorismate synthase (ARO2), and the transcriptional activator ARO80, which have been proven to participate in aromatic alcohol biosynthesis [8, 9, 20], were contained in profile 1 (Table S4). Additionally, eight and two genes speculated to participate in the Ehrlich, de novo, and auxiliary biosynthetic pathways of TOL were identified in profile 9 and profile 14, respectively (Table S4). Considering that genes with similar expression patterns might be functionally correlated [25], it is reasonable to assume that genes in profiles 1 and 9 and genes in profiles 1 and 14 may be highly related to TOL production and the NCR phenomenon, respectively. In fact, eight and five genes of TPP metabolism, TPP being an essential cofactor for PDC1, PDC5 and ARO10 [26], were identified in profiles 1 and 9, respectively. Additionally, an aldehyde dehydrogenase, which functions in the conversion of IAD to IAA [11], was identified in profile 14 (Table S4); its relatively low expression level in TM-TP gives an alternative reason why Phe decreased IAA abundance (Fig. 3b). These data further supported the view that genes in profiles 1 and 9 were closely related to aromatic alcohol biosynthesis and regulation, and that genes in profile 14 were responsible for NCR in S. cerevisiae.
Expression profile analyses of the DEGs involved in TOL biosynthesis
As shown in Fig. 4, when Trp was used as the sole nitrogen source, four of five aminotransferases, two of four PPA or pyruvate decarboxylases, and two of three alcohol dehydrogenases in the Ehrlich pathway showed an upward trend, indicating that these enzymes played important roles in the biosynthesis of TOL; while the transcript levels of these transaminase and decarboxylase genes decreased when Trp and Phe were present in the same medium, which was similar to the fact that abundant nitrogen downregulated aro9 and aro10 gene expression as reported by Chen and Fink [16]. An alternative, but non-exclusive, explanation for NCR can be given based on the expression patterns of aro9 and aro10. Phe, Trp, and TOL (a degradation metabolite of Trp) can upregulate the expression of aro9 and aro10 [16]; hence, the promoting effect of Trp on aro9 and aro10 transcripts was stronger than that of Phe. When Phe and Trp are present simultaneously, they will compete for metabolic enzymes in the Ehrlich pathway of S. cerevisiae, and Phe is a preferentially used amino acid [27], causing the pathway of Trp to IAD to be blocked, meaning that the expression of aro9 and aro10 is only regulated by Phe. Therefore, the gene expression level in TM-TP is lower than that in TM-06T and TM-15T. For the de novo biosynthesis pathway, the transcription levels of six DEGs first increased and then decreased sharply, with TM-06T as the turning point, which indicated that these genes tended to make cells synthesize more Trp at low concentrations of extracellular Trp, while the expression of these genes was suppressed by feedback in the presence of high concentrations of amino acids. In addition, the expression of gene 3726 (aro1) was more susceptible to exogenous Trp, and its expression level was inhibited once the exogenous environment contained Trp. 
For the auxiliary pathway, Trp promoted the expression of three amino acid transporters, but there was no difference between TM-06T and TM-TP, which closely matched the metabolomics data in Fig. 3a, indicating again that Phe and Trp do not compete for the cellular transport system. In addition, the expression levels of lpd1 (1892) and aro80 (4004) showed a trend consistent with the production of extracellular TOL, suggesting that they play an important role in the biosynthesis of TOL in S. cerevisiae. However, the transcript levels of mig1 (5390), cat8 (3322) and gln3 (4745) decreased with the increase in nitrogen concentration, indicating that they were negatively correlated with the biosynthesis of TOL. This was partly consistent with and partly contrary to the results reported by Wang et al. [19], and the reasons need to be further determined.
Integrated metabolomics and transcriptomics analyses
To fully understand the molecular mechanism of TOL overproduction in S. cerevisiae, the metabolism of amino acids, especially Trp, and the central carbon metabolism (glycolysis, pentose phosphate pathway, and citrate cycle) containing DMs and/or DEGs were summarized in Fig. 5. As expected, the abundance of metabolites and the expression levels of most genes in the Ehrlich pathway in TM-06T and TM-15T were significantly increased compared with TM. For instance, the contents of TOL, IAD and IPA in TM-06T and TM-15T were 3.97−253.69 times those in TM. Consistently, the transcript levels of aro9, pdc5, aro10, adh2 and adh5 increased 2.19−376.46 times (Table S5). The results indicated that the addition of Trp increased TOL biosynthesis by enhancing the Ehrlich pathway, and genes with large changes, such as aro9 (353.15- to 376.46-fold), pdc5 (30.2- to 49.16-fold), and aro10 (205.74- to 222.96-fold), may have made important contributions. However, some metabolites and genes in the Ehrlich pathway showed a decreasing trend after Phe was added to TM-06T (i.e., sample TM-TP): the content of TOL and the expression levels of his5, aat1, aro10, and pdc5 in TM-TP were 34.26% and 19.63−43.81% of those in TM-06T, respectively (Table S5). The results suggested that Phe addition weakened the Ehrlich pathway of TOL biosynthesis, which may be attributed to Phe competitively inhibiting the conversion of Trp to TOL, because Phe is a preferred nitrogen source in S. cerevisiae.
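The changes above are reported both as fold increases and as percentages of a reference sample. Because both are simple ratios, they can be placed on a common log2 scale for comparison. The sketch below illustrates the conversion; the helper names are hypothetical, and the use of log2 is an assumption made here for illustration, not a convention stated in this section.

```python
import math


def fold_to_log2fc(fold: float) -> float:
    """Convert a fold change (treatment / reference) to log2 fold change."""
    return math.log2(fold)


def percent_of_reference_to_log2fc(percent: float) -> float:
    """Convert 'X% of the reference value' to log2 fold change."""
    return math.log2(percent / 100.0)


# Values quoted in the text: aro9 up 353.15-fold in a Trp-supplemented sample vs. TM,
# and TOL content in TM-TP at 34.26% of that in TM-06T
aro9_log2fc = fold_to_log2fc(353.15)
tol_log2fc = percent_of_reference_to_log2fc(34.26)
print(f"aro9: {aro9_log2fc:.2f}, TOL (TM-TP vs. TM-06T): {tol_log2fc:.2f}")
```

On this scale, increases are positive and decreases are negative, so the Trp-induced upregulation and the Phe-associated reduction can be read off the same axis.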
In addition, the abundance of most metabolites in other branches of Trp metabolism in TM-06T or TM-15T was, as expected, significantly higher than that in TM. For example, a 2.74−34.76-fold increase was identified for indole-3-acetonitrile, 5-hydroxy-tryptophan and N-formylkynurenine (Table S5). Most strikingly, except for gene bna2 (0177), which showed 5.83 and 6.11 times higher transcript levels in TM-06T and TM-15T, respectively, compared with those in TM, the transcript levels of other related genes mostly decreased to different degrees (Table S5). Such partial inconsistency between metabolomic and transcriptomic data has often been reported and might be related to complex post-transcriptional regulation [28, 29].
For other amino acids, compared with TM, 10 amino acids, including serine and arginine, displayed 2.05–74.19-fold increases in TM-06T or TM-15T. Moreover, alanine, valine and tyrosine increased in abundance by 3.24-, 2.56- and 5.06-fold, respectively, in the comparison of TM-06T vs. TM-TP (Table S5). These increases may be due to the addition of Trp and Phe providing cells with more energy and precursors, which promotes the biosynthesis of other amino acids and consequently results in cell growth. The fact that KMLY1-2 biomass increased significantly after Trp and Phe were added to TM1 (Fig. S4) further supports this speculation. In this context, the rapid growth of yeast cells would be expected to consume more carbon sources. Indeed, compared with TM, the abundance of many intermediate metabolites in glycolysis, the pentose phosphate pathway and the citrate cycle was significantly reduced in TM-06T or TM-15T, and the expression levels of the corresponding genes also showed a decreasing trend (Fig. 5). The lack of significant changes in glucose content may be attributed to hexokinase (1820), an important regulatory enzyme in central carbon metabolism [30], whose transcript level decreased significantly in TM-06T and TM-15T (Table S5), thus limiting the efficient utilization of glucose.