In a previous study, we hypothesized that some of the non-essential genes that are up-regulated in post-induction cultures may act as signaling messengers that activate the CSR. Knocking them out would disrupt this signaling pathway, leading to a lowered CSR and hence higher protein yields. In a preliminary proof-of-principle study, we generated a panel of single and double gene knockouts that gave superior expression of GFP and L-asparaginase[23]. To validate this study and demonstrate that these knockouts indeed constituted a better expression platform, we decided to check their ability to express a “difficult to express” protein. For this task we chose a DKO combination ‘ΔelaAΔcysW’ which had shown improved performance and tested its ability to enhance the expression of ‘Rubella E1 glycoprotein’, which is otherwise expressed very poorly due to its toxic nature.
The maximum GFP expression obtained in the DKO was 16 AU (arbitrary units) which was about 2.5-fold higher compared to the control. Interestingly, the DKO showed a continuous increase in fluorescence for a significantly longer time, i.e. 16 hours, in comparison to control where the sfGFP fluorescence plateaued within 8 hours post induction (Fig. 1(c)). These results suggest that the DKO was able to counter the stress associated with toxic protein expression, leading to a 5.6-fold increase in product accumulation per unit biomass. Simultaneously the ability to sustain expression for longer periods, that we had observed with L-asparaginase as well, indicated that the global feedback controls which regulate protein expression are weaker in this DKO strain.
Transcriptomic studies
We conducted a comparative transcriptomic study of pre- and post-induction cultures expressing L-asparaginase in the DKO and control strains to check the impact of these gene knockouts on the CSR. We used complex rather than defined media since it provides much higher levels of recombinant protein expression and therefore possibly triggers a stronger CSR. Our hypothesis was that this CSR would be partially blocked in the DKO strain and hence a significantly lower proportion of genes would show differential expression post induction.
Analysis of down-regulated genes in the DKO and control
To obtain a global picture of the changes post induction in the DKO and control strains, we applied a cutoff of |log2(XIN/XUN)| > 1, i.e. a fold change magnitude of more than 2 in either direction (up- or down-regulation), to obtain the differentially expressed genes (DEGs). We found only 423 DEGs in the DKO strain in contrast to 1632 DEGs in control (Fig. 2(a)). This was a remarkable result, given that knocking out just two genes, both belonging to the bottom of the regulatory hierarchy of E. coli, led to such a large difference in the number of differentially regulated genes. It was also the first clear evidence that the reprogramming of the cellular machinery, which is the primary effect of the CSR, was significantly reduced in the DKO strain.
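The cutoff described above can be sketched in a few lines of Python. This is only an illustration of the thresholding step, not the pipeline used in the study: the function name `classify_deg`, the gene labels and the expression values are all hypothetical, and the strict inequality follows the formula as written, so a gene with exactly a 2-fold change is not counted.

```python
import math

def classify_deg(induced, uninduced, cutoff=1.0):
    """Classify a gene from the log2 ratio of induced to uninduced expression."""
    log2_ratio = math.log2(induced / uninduced)
    if log2_ratio > cutoff:
        return "up"
    if log2_ratio < -cutoff:
        return "down"
    return "unchanged"

# Hypothetical normalized expression values as (uninduced, induced) pairs.
# These are illustrative numbers, not data from this study.
expression = {
    "geneA": (100.0, 450.0),
    "geneB": (200.0, 40.0),
    "geneC": (150.0, 180.0),
}

calls = {
    gene: classify_deg(induced, uninduced)
    for gene, (uninduced, induced) in expression.items()
}

# Genes passing the cutoff in either direction are the DEGs.
degs = [gene for gene, call in calls.items() if call != "unchanged"]
```

Counting the entries of `degs` separately for each strain would give totals comparable to the 423 and 1632 DEGs reported here.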
However, before arriving at any firm conclusions it was important to analyze the nature of these DEGs. Of the 1632 DEGs in control, 736 were down-regulated (Fig. 2(b)), of which 667 genes were specific to the control. The down-regulated and up-regulated gene clusters were functionally categorized using the KEGG GENES database. The majority of down-regulated genes were associated with key cellular processes such as translation (90 DEGs), transcription (44 DEGs), RNA and ribosome biogenesis (50 DEGs), transport (67 DEGs), protein folding, sorting and degradation processes (34 DEGs), central carbon metabolism (33 DEGs), energy metabolism (34 DEGs), DNA replication, repair and recombination (27 DEGs), glycerol (substrate) metabolism (5 DEGs) and other catabolic processes (97 DEGs; carbohydrate, amino acid and nucleotide metabolism) (Fig. 2(c)). This pattern was similar to what we had observed in our previous transcriptomic studies conducted on high cell density fed-batch cultures expressing L-asparaginase and other recombinant proteins[4, 14, 28]. There too, the genes associated with carbon metabolic pathways, energy metabolism, transport and amino acid metabolism were down-regulated, and this is now considered to be a key feature of the CSR[12, 22]. In contrast, of the 423 DEGs in the DKO, only 133 were down-regulated, of which 52 genes were common to both control and DKO (Fig. 3(a)). The largest groups among these down-regulated genes were transporters (33 DEGs), energy metabolism (anaerobic) (15 DEGs) and cell motility (14 DEGs) (Fig. 2(c)). Unlike control, only a very limited number of down-regulated genes were associated with key cellular processes such as translation (9 DEGs), central carbon metabolism (5 DEGs), carbohydrate metabolism (7 DEGs), transcription (4 DEGs) and RNA and ribosome biogenesis (3 DEGs).
Clearly the DKO was able to prevent the down-regulation of critical pathways which is the hallmark of a strong CSR. We next looked in more detail at the specific pathways which directly impact on recombinant protein yields.
Respiratory metabolism
The respiratory metabolism of E. coli is efficient due to the fast kinetics of terminal oxidases[29]. The global regulators arcA and fnr, which regulate the expression of the two major E. coli terminal oxidases, cytochrome bd-I (Cyd) and cytochrome bo3 (Cyo)[30], were 3.8-fold and 3.1-fold down-regulated, respectively, in control. In the DKO strain, arcA was less down-regulated (1.42-fold) while fnr was up-regulated by 1.6-fold (Fig. 3(b)). Among terminal oxidases, Cyd is functional under micro-aerobic conditions due to its strong affinity for oxygen, whereas Cyo, which has a low affinity for oxygen, is dominant under fully aerobic conditions[31]. The cyo operon genes (cyoABCDE) were extensively down-regulated (3‒5-fold) in control, whereas the expression of the cydAB genes was not significantly affected (Additional file 1: Table S1). In the DKO, however, a much lower down-regulation of the cyoABCE genes (1.1‒1.8-fold) and an up-regulation of cyoD (1.45-fold) were observed (Fig. 4). Another important change in the DKO strain was that the transcript levels of the atp operon genes (encoding F0F1-ATP synthase) and the nuoA gene (encoding subunit A of NADH-quinone oxidoreductase) remained unchanged (Fig. 3(b)), whereas in control these were severely down-regulated (3‒7-fold). Several researchers have shown that this down-regulation of energy metabolism genes post induction is a key feature of the CSR and a crucial factor behind the lowering of protein expression rates[14, 32]. These results demonstrate the ability of the DKO strain to shield the cell from this stress, thereby maintaining better homeostasis and retaining its capacity to generate the ATP required to meet the increased energy demands of recombinant protein synthesis.
Transcription and translation
The expression levels of the RNA polymerase subunit genes (rpoB, rpoC, rpoZ and rpoA), which are essential for transcription initiation, were found to be severely down-regulated (7‒10-fold) in control. However, in the DKO strain, only rpoA was down-regulated (2-fold) while the expression of the remaining subunit genes stayed unaffected. Similarly, the transcript level of rpoD, which encodes the primary sigma factor 70, was unchanged in the DKO compared to a 3.7-fold down-regulation in control (Fig. 4). Since sigma factor 70 coordinates the transcription of house-keeping genes during exponential growth[33], its lack of impairment in the DKO ensured better cellular health and consequently improved expression capability of this strain.
The protein synthesis ability of E. coli is also determined by the number of functional 70S ribosome units inside the cell. Transcriptomic studies showed a significantly higher down-regulation of 30S and 50S ribosomal genes in control (8‒13-fold) compared to the DKO strain, where it was only 1.1‒3-fold (Fig. 3(b)) (Additional file 1: Table S1). Translation elongation factors, which play an important role in bringing the aminoacyl-tRNA to the ribosome and facilitate the translocation of the ribosome along the mRNA during protein synthesis[34], were also found to be less down-regulated in the DKO (1.2‒2.5-fold) compared to control (5.5‒10-fold) (Additional file 1: Table S1). These findings suggest that the effect of the CSR on the transcriptional and translational machinery was much less pronounced in the DKO strain.
Substrate uptake
Many transcriptomic studies have highlighted the negative effects of recombinant protein over-expression on the nutrient uptake systems of E. coli[4, 35]. We also observed down-regulation of genes of the glycerol catabolic regulon (up to 4-fold) in control. Interestingly, this down-regulation of glycerol metabolism genes was intensified in the DKO strain (up to 8-fold) (Additional file 1: Table S1). Clearly, this down-regulation of substrate metabolism genes in the DKO strain offset many of the gains obtained in terms of improved cellular health and expression capabilities, and it reflects the costs associated with tampering with the finely tuned process of cellular dynamics that would have evolved to optimize cell survival.
Cell motility
In E. coli, flagellar biosynthesis and motility are tightly regulated since they are energetically expensive[36], and are therefore advantageous only when flagellar motility is required. It was observed that the flagellar genes belonging to the flgDEFGHIJK, fliAZ, fliDM and fliFGHIJK operons were down-regulated up to 5.6-fold in the DKO strain, while they were up-regulated up to 6.2-fold in control (Fig. 3(b)) (Additional file 1: Table S1). This is possibly an associated evolutionary response to stress which is not only blocked but reversed in the DKO strain. The flagellar sigma factor fliA, which was 2.8-fold up-regulated in control, was also found to be 5.3-fold down-regulated in the DKO strain. This down-regulation of flagellar genes in the DKO would have the added advantage of conserving energy and redirecting it towards recombinant protein synthesis.
Analysis of up-regulated genes
Just as we had observed with the down-regulated genes, a much smaller subset of genes was found to be up-regulated in the DKO: 290 DEGs compared to 896 DEGs in the control (Fig. 2(b)), clearly signifying a diminished CSR. The major component of the up-regulated genes in control belonged to the following categories: transporters (158 DEGs), carbohydrate metabolism (82 DEGs), amino acid metabolism (38 DEGs), cell motility (28 DEGs), energy metabolism (24 DEGs) and nucleotide metabolism (20 DEGs) (Fig. 2(c)). In the DKO strain, this list contained genes that mostly belonged to the following categories: central carbon metabolism (23 DEGs), transcription factors (10 DEGs), energy metabolism (14 DEGs), carbohydrate metabolism (28 DEGs) and transport (45 DEGs) (Fig. 2(c)). Apart from these, many other genes involved in protection against various kinds of stress were found to be up-regulated. These gene categories were analyzed in order to gain a better insight into cellular dynamics and their impact on recombinant protein synthesis.
Central carbon metabolism
Transcriptomic analysis showed a differential up-regulation of several genes associated with central catabolic pathways. The data revealed a selective up-regulation of some TCA cycle genes (sucABCD operon, sdhCDAB operon, icd, mdh) only in the DKO strain (Fig. 3(b)) (Additional file 1: Table S1). The sdhCDAB operon of the TCA cycle and the nuo operon genes are known to be involved in the aerobic electron transport chain, which generates energy via oxidative phosphorylation[37, 38]. We also observed increased expression (2-fold) of the nuo operon genes, which encode the subunits of NADH dehydrogenase I, in the DKO strain in contrast to their unchanged levels in control. The increased expression of these genes in the DKO may be linked to higher rates of energy metabolism, in terms of both ATP and reducing equivalents (NADH, NADPH and FADH2), which together helped to meet the enhanced energy requirements imposed on these cells by recombinant protein synthesis.
Generalized stress response
The generalized stress response in E. coli is controlled by a global regulator ‘rpoS’, which is known to regulate the expression of 23% of E. coli genes under stress conditions[39, 40]. The DKO strain showed a 4-fold up-regulation of rpoS in contrast to a negligible change in control (Fig. 4). It was therefore no surprise to also observe the up-regulation of genes that are positively regulated by rpoS[41] (like bfr, dps, osmB, osmC, osmY, psiF, uspB) in the DKO strain compared to their down-regulated or unchanged expression in control (Additional file 1: Table S1). Some research groups have shown that rpoS also regulates the expression of gadE gene which is a transcriptional activator of glutamate-dependent acid resistance (GDAR) system[42]. In E. coli, GDAR plays an important role in maintaining cellular homeostasis under acidic conditions[43–45]. The transcriptomic data showed a much higher up-regulation of gadE regulated acid resistance genes i.e. gadA, gadB, gadC in the DKO strain (8‒10-fold) compared to control (3‒4-fold). This up-regulation could have boosted the general stress resistance of the DKO and allowed it to maintain homeostasis in spite of stress.
Starvation stress
In E. coli, the stress response DNA binding protein ‘dps’ is an indicator of starvation stress inside cells[46, 47]. We found a 6-fold up-regulation of the dps gene in the DKO compared to its unchanged expression in control. Increased carbon starvation initiates a cascade of events inside the cell that results in the release of carbon starvation proteins to prolong cell survival[48, 49]. The cstA gene, which encodes carbon starvation protein A and facilitates nutrient scavenging in terms of peptide transport and utilization[48], was 4-fold up-regulated in the DKO strain compared to 1.4-fold in control (Additional file 1: Table S1). The role of the cstA gene in activating glycolysis and acetate metabolism in a CsrA-dependent manner has also been studied[50]. A stronger up-regulation of the other carbon starvation inducible genes csiD (7.3-fold)[51, 52] and slp (starvation lipoprotein) (8.3-fold) was also observed in the DKO strain compared to control (csiD 1.76-fold, slp 3-fold). slp has been shown to promote cell survival during carbon starvation or stationary phase conditions[53]. It is quite remarkable that, unlike the control, the DKO strain appeared able not only to limit the stress but also to anticipate its onset and take remedial action by up-regulating global regulators like rpoS and dps.
Amino acid biosynthesis
Amino acids play a crucial role in maintaining cellular metabolism and mediating the stress response. It is well established that their concentrations inside the cell affect gene expression, enzyme activities and redox homeostasis[54]. Transcriptomic analysis provided us with an insight into the relative expression levels of genes associated with amino acid biosynthesis in both the control and the DKO strain. However, these differences were not so evident, since the use of complex media for cultivation ensured an exogenous supply of amino acids, and this would have had a major impact on the results. A majority of the amino acid biosynthesis genes were found to be down-regulated by more than 2-fold in control, including ilvNGAEDYC (valine, leucine and isoleucine biosynthesis), dapADF (lysine biosynthesis), aroFG and pheA (aromatic amino acid biosynthesis), aspC (aspartate biosynthesis) and thrCS (threonine biosynthesis) (Fig. 3(b) & Fig. 4). However, the expression levels of most of these genes remained unchanged in the DKO, except ilvN and aspC, which were up-regulated by more than 2-fold (Additional file 1: Table S1). The genes for tryptophan (trpE) and cysteine biosynthesis (cysH, cysI) were 2‒4-fold up-regulated in both the control and DKO strains. These findings suggest that the DKO strain is able to maintain a homeostatic environment by undergoing fewer changes in its amino acid biosynthetic pathways.
Proteomic analysis
To observe the differential impact of the CSR on cellular health at the protein level, we carried out a preliminary comparison of the protein abundance profiles of the control and DKO strains at the 4th and 10th hour post induction. These time points were chosen since we had observed that both cell growth and protein expression capability remain unimpaired till the 4th hour post induction, after which they decline sharply in the control. We hypothesized that the CSR would significantly reduce the concentration of proteins which are critically required for protein synthesis in the control, while its effect would be marginal in the DKO. We focused only on the top 100 most abundant proteins since their higher concentrations inside the cell allowed for a more precise quantitation by a label-free LC-MS/MS procedure. For normalization of protein content between samples, we used a multiplication factor so that the sum of the peak areas of the top 200 proteins obtained from LC-MS/MS analysis was equal between samples. These proteins were grouped into various categories like translation, central carbon metabolism, energy metabolism etc., similar to the categories used in our transcriptomic studies. We then calculated the log2 fold change for each protein between C4 and C10 and also between D4 and D10 (representing the 4th and 10th hour post induction samples of the control and DKO respectively) (Additional file 1: Table S2). This was done only for those proteins which were present in the list of top 100 proteins at both time points. Figure 5 shows the heat map of this fold change in both the control and DKO for each group of proteins, where it is clear that the central carbon metabolism and energy metabolism protein ratios for D10/D4 were far better than the C10/C4 ratios.
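The normalization and fold-change steps described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names, protein labels and peak areas are hypothetical, and the toy example uses a top-2 list instead of the top 100 so that it stays readable.

```python
import math

def scale_factor(peak_areas, reference_total, top_n=200):
    """Multiplier that makes the summed top-N peak areas equal reference_total."""
    top = sorted(peak_areas.values(), reverse=True)[:top_n]
    return reference_total / sum(top)

def normalize(peak_areas, factor):
    """Apply the multiplication factor to every protein in a sample."""
    return {protein: area * factor for protein, area in peak_areas.items()}

def top_proteins(peak_areas, n=100):
    """Names of the n most abundant proteins in a sample."""
    ranked = sorted(peak_areas, key=peak_areas.get, reverse=True)
    return set(ranked[:n])

def log2_fold_changes(early, late, n=100):
    """log2(late/early) for proteins found in the top-n list at both time points."""
    shared = top_proteins(early, n) & top_proteins(late, n)
    return {p: math.log2(late[p] / early[p]) for p in sorted(shared)}

# Hypothetical peak areas for a 4 h and a 10 h sample (not data from this study).
c4 = {"P1": 400.0, "P2": 200.0, "P3": 100.0}
c10 = {"P1": 100.0, "P2": 200.0, "P4": 50.0}

# Use the 4 h sample as the reference total and rescale the 10 h sample to match it.
reference_total = sum(sorted(c4.values(), reverse=True)[:200])
c10_scaled = normalize(c10, scale_factor(c10, reference_total))

# Toy top-2 comparison; the study used the top 100 proteins.
fold_changes = log2_fold_changes(c4, c10_scaled, n=2)
```

Restricting the comparison to proteins present in both top-n lists mirrors the filtering described in the text, at the cost of ignoring proteins that enter or leave the list between time points.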
To estimate whether these differences were statistically significant, we calculated the mean and variance of the log2(fold change) for each set of proteins belonging to the same category and performed a t-test (Additional file 1: Table S3). The results confirmed our previous observation that the proteins belonging to central carbon and energy metabolism were more abundant in the D10 sample, which explained its superior ability to sustain recombinant protein expression. This was remarkable given that the D10 sample had accumulated a far higher amount of L-asparaginase (15% of the total cellular proteins) and MBP (13% of the total cellular proteins), leaving less space for host cell proteins inside the cell. This could possibly explain why the translational proteins were not significantly different in their ratios between the DKO and control even though the transcriptomic analysis showed a difference. The carbohydrate metabolism proteins did show a slightly higher level, which was however not significant at the 95% confidence level. This global analysis helped us identify the lumped changes in the protein abundance profiles which have a direct impact on cellular health and also validated the results of transcriptomic profiling. The results confirmed that the DKO was able to effectively block the cellular reprogramming which took place in the control, which is why it was able to retain its expression abilities for a longer time period.
To conclude, the results of transcriptomic and proteomic analysis suggested that the DKO strain was able to substantially block the signaling pathways leading to the CSR and hence alleviate most of its negative impact on cellular metabolism. The absence of down-regulation of key pathway genes and their master regulators implied that the modified strain was able to maintain its energy pool, transcriptional and translational rates as well as carbon metabolism by preventing the reprogramming of its gene expression patterns which is otherwise triggered by recombinant protein mediated cellular stress. The only downside of this knockout strategy was that it exacerbated the down-regulation of substrate uptake genes, which now remained the only bottleneck that could adversely affect growth and protein expression capability.
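As an illustration of the category-level comparison, the sketch below computes a two-sample t statistic from the mean and variance of the log2 fold changes in one category. The text does not state which variant of the t-test was used, so Welch's unequal-variance form is an assumption here, as are the function name and the numbers, which are hypothetical and not taken from Table S3.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df

# Hypothetical log2 fold changes for one category (e.g. energy metabolism).
dko_log2fc = [0.2, 0.4, 0.1, 0.3]          # D10 vs D4
control_log2fc = [-0.8, -0.5, -1.0, -0.7]  # C10 vs C4

t_stat, dof = welch_t(dko_log2fc, control_log2fc)
```

The p-value would then be read from the t distribution with `dof` degrees of freedom, for example from a table or a statistics library; that step is left out to keep the sketch dependency-free.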
Growth and substrate utilization profiles of the DKO strain producing L-asparaginase
To evaluate the phenotypic effect of this higher down-regulation of glycerol uptake genes, we compared the glycerol consumption profiles of the control and DKO strains expressing L-asparaginase in shake flask culture. An uninduced culture of the control strain was used as a benchmark to measure the normal growth and glycerol uptake capability of cells in the absence of cellular stress. Both induced cultures showed a decline in growth post induction, with the DKO strain displaying a sharper drop in growth rate (Fig. 6(a)) and a poorer glycerol uptake rate compared to control (Fig. 6(b)). Thus, control cells completely consumed the residual glycerol within 10 hours post induction, while a significant amount of glycerol was left over in the DKO culture even after 14 hours post induction.
Since poor substrate uptake would become a rate limiting factor for all cellular processes, we decided to supplement the DKO strain with additional copies of glycerol metabolism genes, and see its effect on the expression levels of recombinant protein. We therefore co-expressed glycerol kinase (glpK) and sn-glycerol-3-phosphate dehydrogenase (glpD) genes using the pPROLar.A122 vector backbone, since it was compatible with the plasmid used to express L-asparaginase in the DKO strain. Many previous studies have shown that these two genes play a critical role in enhancing glycerol uptake rates [20, 55–58].
Co-expression of glpDK with L-asparaginase
The plasmid pPROLar.A122glpDK carrying the genes (glpK and glpD) of the substrate utilization pathway was co-transformed along with pMALS1Asp into the DKO strain, and the resulting strain was labeled as the test strain. The DKO strain transformed only with the pMALS1Asp plasmid was used as a control for this study.
We had earlier observed that optimal supplementation of pathway related genes can be accomplished even without induction, since the leaky expression associated with plasmid based genes is enough to ensure an adequate supply of protein[59]. To confirm this, test cultures were either induced or left uninduced for glpDK expression while being induced for L-asparaginase. Interestingly, we observed a greater decline in growth rates 7 hours post induction for the test cultures compared to the control (Fig. 7(a)). This happened in spite of a superior glycerol uptake rate for both test cultures, indicating that the higher substrate consumption by cells was utilized primarily to enhance the flux towards product formation rather than growth (Fig. 7(b)). Also, the uninduced test culture performed significantly better in terms of both growth and product concentrations (Fig. 7(c)), demonstrating that the ideal supplementation levels of glpDK were achieved by simply allowing basal level expression of these genes.
Since the residual glycerol was completely exhausted within 7 hours post induction, we decided to pulse the growing cultures with glycerol. Both control and test flasks were induced with IPTG at an O.D. of ~1.5‒2. Six hours post induction, a glycerol pulse was given to both control and test flasks and repeated every three hours till 12 hours post induction. A final pulse was given 21 hours post induction to check whether the cells retained their substrate uptake capability even after the onset of stationary phase. This higher availability of glycerol did not significantly improve the growth rate of the test culture, but unlike the previous case no sharp fall in O.D.600 was observed for the test flasks (Fig. 8(a)). Residual glycerol profiles showed that the substrate uptake capability of the test cultures remained high till 12 hours post induction and then declined gradually (Fig. 8(b)), with the cells not being able to consume glycerol after the onset of stationary phase. The product concentration relative to control increased from 1.72-fold to 2.3-fold (Fig. 8(c)), underscoring the fact that the potential of this strain is truly realized when glycerol is available in the medium. It should be noted that the actual product formation ability per unit biomass was considerably greater for the test strain. Also, the control used in this experiment was the DKO strain, which had been previously shown to give > 2-fold higher expression of L-asparaginase compared to the unmodified host[23], so the net improvement in yield over the unmodified host was much higher.
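The closing point about net improvement can be made concrete with a short calculation. It assumes that the two fold changes combine multiplicatively and that the earlier DKO-versus-host comparison[23] carries over to these culture conditions; neither assumption is tested here, and the helper `net_fold` is hypothetical.

```python
def net_fold(*fold_changes):
    """Combine successive fold improvements, assuming they multiply."""
    product = 1.0
    for fold in fold_changes:
        product *= fold
    return product

# 2.3-fold (test strain vs DKO, with glycerol pulsing) combined with the
# reported > 2-fold (DKO vs unmodified host); 2.0 is used as a lower bound.
lower_bound = net_fold(2.3, 2.0)
```

Because the DKO-versus-host figure is reported only as greater than 2-fold, the result should be read as an approximate lower bound rather than a measured value.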