Discovery of C-terminal Degron via Directed Evolution
For directed evolution of ADs, mutant variants of Prochlorococcus marinus AD (ADpm, Uniprot ID Q7V6D4) were generated using error-prone PCR, and E. coli cells were transferred to a chemostat cultivation system containing 2 g/L hexanal, a concentration found to inhibit the growth of E. coli cells without recombinant AD. We hypothesized that this system would select for AD variants with increased rates of hexanal decarbonylation to pentane, which is less toxic (Fig. 2A).
After seven days of chemostat cultivation, 33 mutants were sequenced and the distribution of mutations in the AD coding sequence was analyzed. The highest mutation frequency was observed in the α4 and α8 helices (Fig. 2B), and the mutants truncated in α8 by nonsense mutations were found to have the highest activity based on the amount of NADPH consumed (Supplementary Fig. 1). According to the crystal structure of ADpm, eight α-helices form the intact structure, among which α1, α2, α4 and α5 form the catalytic domain (Fig. 2C). Mutations in these helices might benefit the activity of AD. Since the α8 helix is far from the catalytic domain, we hypothesized that the α8-truncated mutants might have been selected through a novel mechanism other than enhanced activity.
This was further probed in vitro by expressing ADpm without the C-terminal helix, RMAAAALVS (ADpmC1-9); the truncated enzyme is herein termed ADpm-9. Enzymatic assays with purified ADpm-9 protein revealed a decrease in specific activity and thermostability, whereas size-exclusion chromatography showed no change in the oligomeric state compared to ADpm (Supplementary Fig. 2). These results led us to hypothesize that E. coli expressing ADpm-9 exhibited improved alkane production because the biostability of the enzyme was improved, thus increasing its abundance. Because ADpmC1-9 is rich in L, A, V and S, like E. coli degron motif 1 (Supplementary Fig. 3) [35], we speculated that ADpmC1-9 is a degron that shortens the half-life of ADpm. To test this hypothesis, ADpmC1-9 was fused to green fluorescent protein (GFP) to monitor degradation over time in vivo. The fluorescence of E. coli expressing ADpmC1-9-tagged GFP increased slightly during the first four hours but decreased during the remainder of the incubation. In contrast, the fluorescence of untagged GFP increased until 19 h after inoculation and remained stable thereafter (Fig. 2D). Overall, ADpmC1-9 reduced the GFP fluorescence to 8.6% of that of untagged GFP after 30 h of cultivation, which supports the hypothesis that the short C-terminal region of ADpm functions as a degron that reduces protein half-life in vivo.
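The compositional argument above can be made concrete with a small calculation. The following is a minimal sketch, not part of the original analysis: it counts how many residues of the ADpmC1-9 sequence given in the text belong to the set L, A, V and S. The function name `residue_fraction` is a hypothetical helper introduced here for illustration only.

```python
def residue_fraction(peptide: str, residues: str = "LAVS") -> float:
    """Fraction of positions in `peptide` occupied by any residue in `residues`."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    # Count positions whose one-letter code is in the residue set.
    return sum(aa in residues for aa in peptide) / len(peptide)


# C-terminal helix sequence of ADpm as given in the text (ADpmC1-9).
ADPM_C1_9 = "RMAAAALVS"

fraction = residue_fraction(ADPM_C1_9)
print(f"{fraction:.1%} of ADpmC1-9 residues are L, A, V or S")
```

Seven of the nine residues fall in this set. Composition alone does not establish degron function, which is why the GFP fusion experiment is the actual test of the hypothesis.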
Identification of the minimal AD degron sequence
To determine more precisely which part of the C-terminal region of ADpm is necessary for its degradation, GFP was tagged with C-terminal segments of ADpm of different lengths (3, 5, 10, 20, and 30 amino acids). Expression of each tagged GFP was driven by the constitutive synthetic promoter pJ23119. Compared to the untagged GFP, degradation was observed in all the tagged GFP strains (Fig. 3A). The degradation of GFP increased as the degron length increased from 3 (GFP+ADpmC1-3) to 10 (GFP+ADpmC1-10), but no further enhancement of degradation was observed beyond 10 residues (Fig. 3A). These results indicate that the short C-terminal region of ADpm (10 amino acids) is the minimal AD degron sequence.
Western blotting was used to analyze the degradation of GFP and ADpm triggered by the C-terminal degron. Fluorescence assays revealed a 90% decrease in fluorescence of GFP+ADpmC1-10 compared to the untagged GFP, which is in line with the approximately 90% reduction in the GFP protein level for the GFP+ADpmC1-10 construct (Fig. 3A, 3B). These results indicate that GFP fluorescence is proportional to GFP protein abundance. The protein concentration of untagged GFP increased with longer incubation, while the fluorescence of GFP+ADpmC1-10 decreased. Similarly, the protein concentration of wild-type ADpm decreased with longer incubation, while protein accumulation was observed for ADpm-10 (with 10 C-terminal amino acids missing) (Fig. 3C). These results indicate that a short C-terminal region of ADpm, ADpmC1-10, represents a degron whose presence leads to proteolytic degradation in vivo.
Recognition of the C-terminal degron in bacterial ADs via bioinformatic analyses
To expand our understanding of the biostability of bacterial ADs, over 600 protein sequences of bacterial AD homologs from the UniProtKB database were analyzed using multiple sequence alignments. This dataset was reduced to 371 sequences by removing redundant sequences. A phylogenetic tree was generated using the neighbor-joining (NJ) method in MEGA 5.10 with 100 bootstrap replications (Fig. 4A). For further analysis, eight representative sequences from different branches of the phylogenetic tree were selected. A high degree of conservation was observed in the C-terminal residues of the selected candidates (Fig. 4B), including a basic amino acid (Arg or Lys) at the 10th position from the C terminus and nonpolar Ala and Leu at the 7th and 4th positions, respectively. Ala-Ala dipeptides were also a common feature of the C termini of ADs (Fig. 4B). These results suggest that the C-terminal degron detected in ADpm is a conserved motif in the family of bacterial ADs. A statistical analysis was then conducted on the last 10 amino acids of the ADs in the reduced dataset. At half of the positions, the most common amino acid has a frequency larger than 50% (Supplementary Table 1). The most conserved position is Ala at the 7th position, with a frequency of 67.92% among the bacterial AD homologs (Fig. 4C). From this, RMSAYGLAAA appears to be a consensus sequence for a degron conserved in bacterial ADs, herein termed ADcon.
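The per-position statistics described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function `cterm_consensus` is a hypothetical helper, positions are numbered from the C terminus as in the text, and the toy input contains one tail quoted in this paper (Euhalothece) plus two made-up tails chosen so that the toy consensus happens to match ADcon. The resulting frequencies are therefore not the values reported in Supplementary Table 1.

```python
from collections import Counter


def cterm_consensus(sequences, n=10):
    """Return (consensus, freqs) for the last n residues of each sequence.

    Positions are numbered from the C terminus (position 1 = last residue).
    freqs[k] is the frequency of the most common residue at position k.
    Sequences shorter than n residues are skipped.
    """
    tails = [s[-n:].upper() for s in sequences if len(s) >= n]
    if not tails:
        raise ValueError("no sequence has at least n residues")
    consensus = []
    freqs = {}
    for i in range(n):
        # Tally the residues found in column i across all tails.
        counts = Counter(tail[i] for tail in tails)
        residue, count = counts.most_common(1)[0]
        consensus.append(residue)
        freqs[n - i] = count / len(tails)
    return "".join(consensus), freqs


# Toy input: the first tail is the Euhalothece degron quoted in the text;
# the other two are HYPOTHETICAL sequences invented for illustration.
toy_tails = ["RMSAYGLREV", "KLSAYGLAAA", "RMTAFGLAAA"]

consensus, freqs = cterm_consensus(toy_tails)
print(consensus)
for position in sorted(freqs, reverse=True):
    print(f"position {position:2d}: {freqs[position]:.2f}")
```

Taking the most frequent residue per column is the simplest way to define a consensus. It ignores sequence weighting and gaps, so applying it to a real dataset would require aligned, redundancy-filtered sequences such as the 371 described above.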
To test this hypothesis and evaluate the functionality of ADcon, the C-terminal sequences of three ADs from major branches of the phylogenetic tree, as well as ADcon, were fused to GFP to test whether they would trigger protein degradation. All four C-terminal sequences caused marked GFP degradation (Fig. 4D), indicating that the C termini of bacterial AD homologs serve as degrons. Furthermore, the absence of some of the conserved residues affected degron efficacy. For example, the degron from Euhalothece (RMSAYGLREV), which lacks the Ala-Ala dipeptides, caused only 62% GFP degradation after 25 h of incubation, compared to 94% for ADcon. In addition, it was reported that the AD from Gloeobacter violaceus PCC 7421 (7421ADO) reaches a higher protein level in E. coli than ADs from other cyanobacterial strains [15]. Based on sequence alignment, we observed substantial amino acid differences in the degron region of 7421ADO, suggesting that 7421ADO is less susceptible to proteolytic degradation, which would explain its higher protein level. These findings strongly support our hypothesis that the C terminus of bacterial ADs is a conserved degron.
Investigation of molecular mechanisms of AD degron-dependent protein degradation in E. coli
To pinpoint the mechanism of protein degradation triggered by the AD degron, ADcon was compared to previously reported C-terminal degrons in E. coli [35]. We found that ADcon shares sequence similarity with the ssrA degron (CAANDENYALAA) from E. coli, which targets proteins for degradation by the ClpAP and ClpXP protease complexes [36]. To test the hypothesis that the degradation mechanism of the AD degron is similar to that of ssrA, the degron-tagged GFP constructs were transformed into constructed ∆ClpA, ∆ClpX, and ∆ClpP strains. For all four degron-tagged GFP constructs, fluorescence recovery was observed in both the ∆ClpA and ∆ClpP strains, but no significant recovery was observed in the ∆ClpX strain, compared to WT E. coli (Fig. 5A). This suggests that the AD degron triggers protein degradation through the ClpAP protease complex, in which ClpA is responsible for recognizing the degron, unfolding the protein, and translocating it to ClpP for proteolysis (Fig. 5B).
However, GFP degradation was still observed in the ∆ClpP strain compared with the untagged GFP (Fig. 5A). Previous studies found that a single degron can be recognized by multiple protease complexes. Examples include the Umu degron, which is recognized by the Lon protease complex and ClpXP, and the ssrA degron, which can be recognized by both ClpXP and ClpAP [37-39]. Lon protease was suspected to degrade AD degron-tagged GFP since it is known to be an efficient protease for non-native protein degradation [40]. To test whether Lon protease can recognize the AD degron, we transformed the AD degron-tagged GFPs into a constructed ∆Lon E. coli strain. The fluorescence was partially recovered for all tagged candidates when expressed in the E. coli ∆Lon strain. Nonetheless, GFP degradation still occurred in all degron-tagged GFP samples even in the ∆ClpP∆Lon strain (Fig. 5A). These results suggest that other proteases in E. coli can also recognize the AD degron. The protease system in bacteria is complex and vital for all biological pathways [41]; consistent with this, the cell cultures bearing Lon and ClpP knock-outs had a stickier consistency than wild-type cells, which are normally pasty. Hence, the need to maintain cell viability prevents the deletion of all proteases responsible for AD degron recognition as a way to improve the biostability of AD, leaving only the option of managing the impact of the degradation tag.
Effects of AD degron engineering on alkane production
The three most commonly used ADs, from Prochlorococcus marinus (ADpm, Uniprot ID Q7V6D4), Nostoc punctiforme (ADnp, Uniprot ID B2J1M1) and Synechococcus elongatus (ADse, Uniprot ID Q54764), were selected to investigate the effects of degron modification on alkane production. First, the degrons were removed from all three ADs to create ADpm-9, ADnp-10, and ADse-10. Initial enzymatic screening revealed that the specific activities of the degron-free versions of ADpm, ADnp and ADse decreased from 14.9 to 9.6 (1.55-fold decrease), 90.9 to 16.6 (5.47-fold decrease), and 49.1 to 42.7 (1.14-fold decrease) mU/mg, respectively (Fig. 6A). In contrast, the relative enzyme abundance of the degron-free versions increased 2.2-, 2.65- and 3.3-fold in cells harboring ADpm, ADnp and ADse, respectively (Fig. 6B). Consequently, pentane production in E. coli increased from 3.1 to 3.9 mg/L with ADpm and from 5.9 to 9.7 mg/L with ADse (Fig. 6C). We attribute this improvement in pentane production to the increase in AD protein abundance, which more than compensates for the reduction in enzyme activity and thus increases overall pentane accumulation. By contrast, pentane production decreased from 7.9 to 5.2 mg/L in E. coli harboring ADnp-10 compared with ADnp (Fig. 6C), because the 2.65-fold increase in enzyme abundance could not compensate for the 5.47-fold decrease in activity. Our modelling results suggest that elimination of the C-terminal sequence has a negative effect on substrate binding in all three candidates (Supplementary Fig. 4), which is consistent with the notion that residues far from the active site and the substrate binding site still contribute to the enzyme activity of AD [15].
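The trade-off between abundance and specific activity can be checked with simple arithmetic. The sketch below uses only the values reported in this paragraph and makes one simplifying assumption that is ours, not the authors': that total cellular AD capacity scales as relative abundance multiplied by specific activity. The names `DATA` and `net_capacity_fold` are hypothetical and introduced purely for illustration.

```python
# Values reported in the text for wild-type vs. degron-free enzymes:
# specific activity in mU/mg, relative abundance fold increase,
# and pentane titer in mg/L.
DATA = {
    "ADpm": {"activity": (14.9, 9.6), "abundance_fold": 2.2, "pentane": (3.1, 3.9)},
    "ADnp": {"activity": (90.9, 16.6), "abundance_fold": 2.65, "pentane": (7.9, 5.2)},
    "ADse": {"activity": (49.1, 42.7), "abundance_fold": 3.3, "pentane": (5.9, 9.7)},
}


def net_capacity_fold(activity, abundance_fold):
    """Estimated fold change in total activity after degron removal.

    ASSUMPTION (not from the paper): total capacity is proportional to
    abundance times specific activity.
    """
    wild_type, degron_free = activity
    return abundance_fold * degron_free / wild_type


for name, values in DATA.items():
    estimated = net_capacity_fold(values["activity"], values["abundance_fold"])
    wt_titer, free_titer = values["pentane"]
    observed = free_titer / wt_titer
    print(f"{name}: estimated capacity x{estimated:.2f}, observed pentane x{observed:.2f}")
```

Under this assumption the estimate is above 1 for ADpm and ADse and below 1 for ADnp, matching the direction of the observed pentane changes. The magnitudes differ, as expected, because titer also depends on factors such as substrate supply and electron transfer that this product ignores.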
Since deleting protease complexes negatively impacts cell viability and deleting the entire AD degron decreases enzyme activity, an alternative method to increase the half-life of AD is needed. We hypothesized that adding amino acids after the degron would protect the AD from proteolytic degradation. This was tested by adding a 6xHis-tag after the native degron in wild-type ADse, the enzyme that had produced the most pentane (Fig. 6C). As shown in Fig. 6D, ADse with a 6xHis-tag shows 1.9-fold and 2.1-fold increases in protein abundance and pentane production, respectively, compared to wild-type ADse. Surprisingly, the 6xHis-tag also improved the specific activity, possibly because the His-tag affects the conformation of the C-terminal helix in a way that benefits the activity of ADse. Our docking results provide some support for this hypothesis. One of the two highest-scoring interaction models shows that ferredoxin can interact with the ADse C-terminal helix (Fig. 6E), indicating that modification of the C-terminal helix might improve the recruitment of ferredoxin (Fig. 6D). Turning to pentane production, the highest titer was obtained with overexpression of fdx and fpr (Fig. 6D), suggesting that ferredoxin (fdx) and ferredoxin reductase (fpr) are necessary for electron transfer to AD and for high enzyme activity. The protection principle was extended to GFP+ADcon in vivo to test the limit of the protective effect conferred by a 6xHis-tag. The 6xHis-tag protected GFP+ADcon from protein degradation during cultivation (Fig. 6F). However, in comparison to wild-type GFP, the presence of the 6xHis-tag did not completely prevent recognition of the degron by proteases (Supplementary Fig. 5). Overall, these results demonstrate that manipulating the AD degron can improve enzyme activity and alkane production.