Integration of catabolism and global regulation enables rapid growth on pentoses.
Previously, we demonstrated that coupling activation of the galactose (GAL)-responsive regulon to catabolism of and growth on a non-native substrate, xylose, enabled faster growth and more complete utilization than constitutive overexpression of the same genes. Here, we wanted to assess whether this benefit was unique to xylose or whether growth on other substrates, such as arabinose, could also benefit from activation of the GAL regulon. Since Gal3p-Syn4.1 was engineered to activate the GAL regulon in a xylose-dependent manner, we wondered whether it could also be activated by arabinose, given that the two sugars are structurally similar. We assessed activation of GAL1p-EGFP by wild-type Gal3p or engineered Gal3p-Syn4.1 in the presence of native Gal2p and/or the engineered Gal2p-2.1 permease (18). We observed that a strain co-expressing Gal3p-Syn4.1 and Gal2p-2.1 showed activation on arabinose with low background and high dynamic range (Fig. 2A). Interestingly, arabinose activated GAL1p-EGFP more strongly than xylose, even though both Gal3p-Syn4.1 and Gal2p-2.1 were engineered for activity on xylose (Fig. 2B). Given this robust activation, we constructed a semi-integrant strain by integrating accessory genes, including the sensor (GAL3-Syn4.1), transporter (GAL2-2.1), and transaldolase (TAL1), each expressed under GAL promoters, to generate a “REG” (regulon) strain background (Fig. 2C). As a control, and to compare with the traditional engineering approach, we integrated GAL2-2.1 and TAL1 under strong constitutive promoters to generate the “CONS” parental strain. We then assessed the growth of both strains transformed with araBAD or XYLA*3-XKS1 on arabinose and xylose, respectively. In ARA-REG and XYL-REG, all genes were expressed under GAL-responsive promoters, whereas in ARA-CONS and XYL-CONS, all genes were under constitutive promoters (Fig. 2C).
On both carbon sources, the strains that coordinated GAL regulon activation with substrate assimilation showed higher growth rates (µmax; ARA-REG = 0.14 ± 0.04 h⁻¹, XYL-REG = 0.17 ± 0.01 h⁻¹) than those that constitutively overexpressed the same genes (µmax; ARA-CONS = 0.06 ± 0.01 h⁻¹, XYL-CONS = 0.11 ± 0.01 h⁻¹) (Fig. 2D-E).
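This section does not state how µmax was computed. A common approach, shown here only as a minimal sketch, is to take the steepest slope of ln(OD) against time over a sliding window of consecutive readings. The function names, the window size, and the synthetic readings are illustrative assumptions, not the procedure used in this study.

```python
import math

def max_growth_rate(times_h, od_values, window=4):
    """Estimate mu_max (per hour) as the steepest least-squares slope of
    ln(OD) versus time over any window of consecutive readings.
    The window size is an assumed, tunable parameter."""
    if len(times_h) != len(od_values) or len(times_h) < window:
        raise ValueError("need matching series with at least `window` points")
    log_od = [math.log(v) for v in od_values]
    best = float("-inf")
    for start in range(len(times_h) - window + 1):
        t = times_h[start:start + window]
        y = log_od[start:start + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        sxx = sum((ti - t_mean) ** 2 for ti in t)
        sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        best = max(best, sxy / sxx)
    return best

def doubling_time(mu_per_h):
    """Convert a specific growth rate into a doubling time in hours."""
    return math.log(2) / mu_per_h

# Synthetic, purely illustrative readings for a culture growing at 0.17 per hour.
times = list(range(10))
ods = [0.05 * math.exp(0.17 * t) for t in times]
mu = max_growth_rate(times, ods)
```

Under this convention, the reported difference between XYL-REG (0.17 h⁻¹) and XYL-CONS (0.11 h⁻¹) corresponds to doubling times of roughly 4.1 h and 6.3 h, respectively.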
Perturbation of intrinsic factors results in modest improvements.
In traditional metabolic engineering, adaptive laboratory evolution (ALE) or rational data-driven approaches are often used to improve the growth rates of initial strain designs, since growth rates and biomass yields are usually suboptimal. We decided to use a data-driven systems biology approach to identify “intrinsic” genetic targets whose modification might improve the growth rates of our REG strains on xylose and arabinose (i.e., pentoses). To find these targets, we identified genes differentially expressed between pentoses (non-native substrates) and galactose (a native substrate), since growth on the latter has evolved to function harmoniously as part of the GAL regulon. Identifying the differences between the two conditions and closing the gap between them may help synchronize GAL regulation with pentose metabolism. We chose not to investigate previously identified genetic targets because the transcriptional phenotype of REG strains differs vastly from that of traditional CONS-type designs (18).
We performed RNA-seq on strains growing on arabinose (ARA-REG), xylose (XYL-REG), and galactose (WT). We observed that while the overall profiles on all three substrates were different, the two pentoses clustered closer to each other than to the hexose, galactose (Fig. 3A). Differential gene expression (DGE) analysis also revealed that 865 genes were differentially regulated between ARA-REG and GAL-REG strains and 1455 genes between XYL-REG and GAL-REG strains (Figure S1) (p-value < 0.05 after Benjamini–Hochberg correction). We reasoned that genes directly regulated by the GAL regulon (e.g., GAL2, GAL4, etc.) make up a small fraction of the transcriptome and would not be differentially expressed between pentoses and hexoses. Rather, differences would arise from metabolic differences between pentoses and hexoses (e.g., energetic content, cofactor balance, etc.) that impose an alternate and complex, albeit indirect, regulatory control. This would explain why the transcriptome profiles of the two pentoses were more similar to each other than to that of galactose. We therefore hypothesized that DGE analysis of pentose (combined xylose + arabinose) vs. galactose might identify “intrinsic” factors that sense and distinguish growth on pentoses vs. a hexose. We found 480 genes that were upregulated and 358 genes that were downregulated in pentoses vs. galactose (Fig. 3B).
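For readers unfamiliar with the Benjamini–Hochberg correction used for the DGE threshold above, the following is a minimal sketch of the standard step-up adjustment. The function name and the example p-values are illustrative; the study itself presumably relied on an established DGE package, which is not named in this section.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each raw p-value is scaled by m / rank (rank 1 = smallest), and a
    running minimum taken from the largest rank downward keeps the
    adjusted values monotone."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Illustrative raw p-values for four hypothetical genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [a < 0.05 for a in adj]
```

A gene is then called differentially expressed when its adjusted value falls below the chosen cutoff (0.05 in the text).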
Given the large number of differences between the two conditions (pentose vs. galactose), we wanted to identify the core, highly connected regulatory elements that are the major contributors to the observed differential phenotypes, which requires analysis of inferred gene regulatory networks (GRNs). We mined our data using three different GRNs, EGRIN (20), YEASTRACT (21), and CLR (22), to identify potential factors responsible for the divergent transcriptional phenotypes between pentoses and galactose. We then associated the significant genes (log2 fold change > 1.5, p-value < 0.05) within our DGE data with the corresponding nodes within the three networks. Using the “Betweenness Centrality” (BC) value of each of the remaining nodes, we collected the identities of the top nodes to compare against the list of potential targets generated by examining the DGE data alone. Exploring this network, we found that the enriched nodes broadly related to six gene ontology (GO) terms (Fig. 3C, Figure S2). We determined a list of 24 targets for knock-out to understand their effect on growth and phenotype (Fig. 3D). We generated barcoded deletion (KO) libraries of these “intrinsic” factors in wild-type and REG strain backgrounds and performed enrichment on glucose, galactose, xylose, and arabinose (Fig. 4A). Comparing population shifts using barcodes, we identified 8 genes that displayed positive fitness on both xylose and arabinose but not on galactose, indicating a role in controlling growth primarily on pentoses. We then tested their growth individually on galactose, xylose, and arabinose (Fig. 4B-D). Of these genes, only ΔGLN3 was neutral or beneficial on pentoses; all other deletions were detrimental to growth on xylose. Further, the growth rate of this mutant (and all others) was still lower than that of the parental strain on galactose, indicating additional limitations. Overall, this approach yielded only modest improvements.
While we could employ several strategies to further improve growth (e.g., combining deletions, ALE, etc.), we decided to focus on the upstream metabolic module.
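The betweenness centrality ranking used above to prioritize network nodes can be illustrated with a small sketch. It implements Brandes' algorithm for an unweighted, undirected graph in plain Python. The toy network and its node names are hypothetical placeholders and do not represent EGRIN, YEASTRACT, or CLR; the treatment of edge direction and weights in the actual analysis is not specified in this section.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    `adjacency` maps each node to an iterable of its neighbours; every
    neighbour must also appear as a key. Returns unnormalized scores."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        visit_order = []
        preds = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adjacency}
        while visit_order:  # accumulate dependencies, farthest nodes first
            w = visit_order.pop()
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}

def top_nodes(adjacency, k):
    """Return the k node names with the highest betweenness centrality."""
    bc = betweenness_centrality(adjacency)
    return sorted(bc, key=bc.get, reverse=True)[:k]

# Hypothetical toy network: one regulator linked to four target genes.
toy = {
    "regulator": ["t1", "t2", "t3", "t4"],
    "t1": ["regulator"], "t2": ["regulator"],
    "t3": ["regulator"], "t4": ["regulator"],
}
bc = betweenness_centrality(toy)
```

In this toy case the regulator lies on every shortest path between targets, so it receives the highest score, which is the property that makes BC a reasonable proxy for highly connected regulatory hubs.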
Pentose metabolism is largely extrinsically controlled.
Our studies had so far focused on identifying “intrinsic” factors that may control or bottleneck growth on pentoses. Given the inconsistent benefits, we wondered whether the limitations were, in fact, “extrinsic”. The former implies that the native metabolic and/or regulatory capacity of this yeast is inherently limited for effective pentose metabolism and that improving growth rate requires vast restructuring of the associated (intrinsic) networks. The latter implies that there are no inherent limitations in this yeast and that it is already poised for rapid growth on pentoses, but that the observed low growth rates are due to suboptimal design of the upstream metabolic module that includes the heterologous (extrinsic) genes. To assess whether the “extrinsic limitation” paradigm has any merit, we needed to optimize the design of the upstream metabolic module responsible for substrate uptake and flux into central carbon metabolism (i.e., glycolysis).
First, we looked at the effect of plasmid copy number. For arabinose, there was no difference in growth rate when the araA-araB-araD genes were expressed on high- or low-copy plasmids, whereas for xylose, we only observed growth when the XYLA*3-XKS1 gene dose was high (Figure S3). In either case, changing the plasmid backbone did not improve growth rate. Next, we hypothesized that balancing expression of these heterologous genes may be required to enhance growth rate. For arabinose, we created all six combinations of gene-promoter pairings and assessed their performance. Surprisingly, we found a marked improvement in growth rate, from 0.14 ± 0.04 h⁻¹ in the original design to 0.27 ± 0.03 h⁻¹ in the best re-design (Fig. 5A). To understand the cause of this change, we compared the relative expression levels of araA, araB, and araD using quantitative reverse transcription PCR (qPCR) (Fig. 5B). We observed that in the poorly performing combinations, expression of araB (ribulokinase) was the highest, whereas in the high-performing combinations, expression levels followed the pattern araA > araB > araD. In addition, we found a strong positive correlation between growth rate and the relative expression ratios araA:araB and araA:araD (Fig. 5C-E). The success of this approach encouraged us to attempt the same on xylose, where we found similar improvements, from 0.17 ± 0.01 h⁻¹ to 0.24 ± 0.01 h⁻¹ (Figure S4). These growth rates are comparable to that of yeast on galactose when GAL1-7-10 are expressed from a plasmid (0.24 ± 0.03 h⁻¹) (Figure S5).
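The count of six designs is consistent with assigning three promoters to three genes without reuse (3! = 6). The sketch below enumerates such designs, assuming the promoter set is GAL1p, GAL10p, and GAL7p, as suggested by the araA-B-D (GAL1-10-7p) design named later in the text. That each promoter is used exactly once per design is an assumption for illustration, not a detail confirmed here.

```python
from itertools import permutations

GENES = ("araA", "araB", "araD")
# Assumed promoter set, inferred from the GAL1-10-7p design named in the text.
PROMOTERS = ("GAL1p", "GAL10p", "GAL7p")

def promoter_gene_designs(genes=GENES, promoters=PROMOTERS):
    """Enumerate every one-to-one assignment of promoters to genes."""
    return [dict(zip(genes, order)) for order in permutations(promoters)]

designs = promoter_gene_designs()
for design in designs:
    # Print each design as, e.g., "GAL1p-araA GAL10p-araB GAL7p-araD".
    print(" ".join(f"{promoter}-{gene}" for gene, promoter in design.items()))
```

Enumerating the full set, rather than testing a few hand-picked pairings, is what allows growth rate to be related to expression ratios across the whole design space.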
Next, we used directed evolution to further improve the growth rate on arabinose. We randomly mutagenized the six arabinose pathway combinations, added barcodes to track the lineages, and enriched this library of 108 variants in minimal (SC + Ara) and complex (YPAA) arabinose media. The growth rate over subcultures increased from 0.12 h⁻¹ to 0.22 h⁻¹ and from 0.18 h⁻¹ to 0.26 h⁻¹ in SC + Ara and YPAA, respectively (Fig. 5G, Figure S6). Using barcodes, we tracked the performance of the promoter-gene combinations. We observed that all six plasmids started at similar abundance, but araA-B-D (under GAL1-10-7p, respectively) was the most abundant at the end of enrichment in SC + Ara. We picked single colonies from each condition, calculated their growth rates, and identified 4 variants from the SC + Ara enriched culture that showed the highest growth rates (0.35 ± 0.04 h⁻¹) (Figure S7). We sequenced the barcodes to identify the lineage and the whole cassette to identify the mutations, and found that three of the six initial designs were represented in the four best variants (named N-3, N-6, N-12, and N-16) (Table S3). Re-transformation into the parent background strain indicated that the four variants attained the same maximum growth rate as that on galactose (0.29 ± 0.01 h⁻¹). Since the mutations were distributed throughout the cassettes, we quantified the expression levels of araA, araB, and araD in the four strains. We found a strong positive correlation between growth rate and relative expression of araA:araB and a negative correlation with araB:araD, indicating that the directed evolution campaign likely altered both activity and expression differently in each strain (Figure S8). These results highlight a key insight about yeast: its ability to utilize pentoses is largely limited by “extrinsic” factors (the upstream pathway, especially heterologous enzyme activity) and only minimally by any “intrinsic” factor (i.e., native regulation or metabolic pathways).
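Barcode-based lineage tracking of the kind described above reduces to comparing barcode frequencies before and after enrichment. The following is a minimal sketch, assuming reads have already been assigned to barcodes; the function names, the pseudocount, and the example reads are hypothetical and are not taken from the study's pipeline.

```python
import math
from collections import Counter

def relative_abundance(barcode_reads):
    """Fraction of reads carrying each barcode."""
    counts = Counter(barcode_reads)
    total = sum(counts.values())
    return {barcode: n / total for barcode, n in counts.items()}

def log2_enrichment(start_reads, end_reads, pseudocount=1.0):
    """log2 change in barcode frequency between two time points.

    A pseudocount (an assumed choice) avoids division by zero for
    barcodes that are absent at one of the time points."""
    start, end = Counter(start_reads), Counter(end_reads)
    barcodes = set(start) | set(end)
    start_total = sum(start.values()) + pseudocount * len(barcodes)
    end_total = sum(end.values()) + pseudocount * len(barcodes)
    return {
        barcode: math.log2(
            ((end[barcode] + pseudocount) / end_total)
            / ((start[barcode] + pseudocount) / start_total)
        )
        for barcode in barcodes
    }

# Hypothetical reads: two designs start equal, then design_1 takes over.
before = ["design_1"] * 10 + ["design_2"] * 10
after = ["design_1"] * 30 + ["design_2"] * 10
enrichment = log2_enrichment(before, after)
```

Positive values mark lineages that expanded during enrichment and negative values mark those that were outcompeted.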
Engineering intrinsic factors leads to pleiotropic fitness trade-offs.
We next assessed whether the global regulatory elements identified through our network analysis would benefit our best designs (araA-B-D under GAL1-10-7p and XYLA*3-XKS1 under GAL1-7p). Deleting each of the 8 genes in these strains did not result in any improvement over the parental strains (Fig. 6A). Here too, we could explore a combinatorial deletion or ALE strategy to enhance growth rate. However, given that each strain had already attained an aerobic growth rate equivalent to the maximum described for this yeast on glucose and galactose (0.22–0.25 h⁻¹), with high biomass yields and a short lag, we expect that further improvements would be insignificant.
One consideration not yet accounted for is the suitability of these strains for bioprocessing and their resilience to growth inhibitors found in lignocellulosic hydrolysates. Given that all deletion targets are highly connected genes that control key cellular processes, we were concerned that dysregulation of major networks may lead to undesirable pleiotropic effects. Since resilience against stress is a complex phenotype, often requiring a concerted response from gene networks, we tested the fitness of all single KO strains on bioprocessing-relevant stressors, individually and in combination (Fig. 6B-E). Using the parental strain as reference, we observed that the performance of strains under stress was highly context dependent. For example, in sucrose medium, all deletions lost fitness in single or mixed inhibitor cultures (Fig. 6C, E). Conversely, on arabinose (with the GAL regulon activated), certain KO strains had improved tolerance to stressors (e.g., ΔTEC1, ΔMET28) (Fig. 6D, F). ΔGLN3 has previously been shown to improve fitness under isobutanol stress (23); however, in our study, it was less fit than the parental strain under all stress conditions. Interestingly, it did show an improved growth rate with a sub-optimal upstream metabolic design on arabinose (Fig. 4E) but lost that benefit with a more optimized design (Fig. 6C, E). Collectively, these results highlight that deleting genes to remodel expression profiles in order to enhance a single phenotype (e.g., growth rate on a non-native substrate) can lead to some improvements, but these are often accompanied by negative pleiotropic effects that make the strain less suitable for eventual bioprocessing applications.