Identification of IREDs. To identify suitable IREDs, the direct DKR-ARA of racemic 1-benzyl-4-methylpiperidin-3-one 2 (15 mM, 3 g L−1) with methylamine a (200 mM) to access (3R,4R)-2a, the pivotal chiral intermediate en route to tofacitinib, was chosen as a model reaction (Fig. 1c). A library of 125 IREDs from various sources, developed by our laboratory, was screened against this reaction. This yielded six enzymes that catalyzed the target reaction, although 2a was afforded with poor conversions (2−34%) even with excess amine donor (13 eq.) (Table 1). Among them, SvIRED, RedAm-13, and PIR358 afforded the undesired enantiomer (3S,4S)-2a with outstanding stereoselectivity (96−99% ee, 91:9−99:1 dr) (Table 1, entries 3−6). Only PocIRED, IRED-7, and IRED-18 afforded the desired enantiomer (3R,4R)-2a (Table 1, entries 1 and 2). However, these hits exhibited low conversion and insufficient diastereo- and enantioselectivity, which limits their utility in the efficient synthesis of the tofacitinib precursor. Therefore, reaction optimization and enzyme evolution were necessary to improve the process efficiency and the performance of the enzymes. We selected PocIRED as the candidate catalyst for subsequent studies because of its superior catalytic performance compared with IRED-7 and IRED-18.
Reaction optimization. Designing an effective DKR-ARA system is a complex task, necessitating the fulfillment of specific requirements: (i) the racemization rate (krac) should be high enough to match the enzyme-catalyzed reaction rate of the fast-reacting enantiomer (kfast);49 (ii) enzyme-catalyzed reductive amination of prochiral ketones requires outstanding stereoselectivity; (iii) catalysts must exhibit excellent discrimination between two enantiomers for the kinetic resolution of racemic substrates;50 (iv) enzymes and biologically active compounds in the reaction system must be compatible with the racemization conditions.51,52
One pivotal consideration among these requirements is the balance between krac and kfast. This becomes particularly critical as, in most cases, krac is lower than kfast in a neutral aqueous solution.48,51,53 This imbalance may lead to an undesirable process in which the enzyme is compelled to slowly convert (S)-2 via path 3 or 4 (Fig. 2a). Consequently, this will result in lower substrate conversion and poor optical purity of the product even when using an enzyme with superior activity and stereoselectivity.9 We sought to address this challenge in two ways. First, we tried to adjust the enzyme catalytic rate (kfast) by changing the ratio of substrate to enzyme (S/C) to match the racemization rate. As shown in Fig. 2b (Entries 1−4), increasing S/C from 2:1 to 200:1 raised the enantiomeric excess and diastereomeric ratio of the product from 74% ee and 90:10 dr to 88% ee and 92:8 dr, respectively, but the conversion dropped significantly from 95% to 22%. Second, we raised the reaction pH, since elevated pH potentially facilitates the formation of imine intermediates and expedites racemization of the substrate enantiomers.42,51 Raising the pH from 8.0 to 9.5 increased ee from 82% to 89% and enhanced the conversion from 84% to 91%, confirming our hypothesis (Fig. 2b, Entries 5−7). The favorable reaction outcomes (88% ee; 92:8 dr; 91% conversion) at pH 9.5 with S/C 10:1 strongly support the feasibility of DKR-ARA (fulfilling requirement i).
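The interplay between krac and kfast described above can be illustrated with a toy first-order kinetic model. This is a minimal sketch, not the kinetic scheme of the paper: the rate constants `k_fast`, `k_slow`, and `k_rac`, the time span, and the assumption of simple first-order steps are all hypothetical, and the function name `simulate_dkr` is introduced only for illustration.

```python
def simulate_dkr(k_rac, k_fast=1.0, k_slow=0.02, t_end=5.0, dt=1e-3):
    """Toy DKR model integrated with explicit Euler steps.

    fast: fast-reacting substrate enantiomer, consumed at rate k_fast
    slow: slow-reacting substrate enantiomer, consumed at rate k_slow
    The two enantiomers interconvert with rate constant k_rac.
    All rate constants are hypothetical, in arbitrary units.
    """
    fast, slow = 0.5, 0.5            # racemic starting material
    p_fast = p_slow = 0.0            # products derived from each enantiomer
    for _ in range(int(round(t_end / dt))):
        flux_fast = k_fast * fast
        flux_slow = k_slow * slow
        flux_rac = k_rac * (slow - fast)   # net slow -> fast interconversion
        fast += (flux_rac - flux_fast) * dt
        slow += (-flux_rac - flux_slow) * dt
        p_fast += flux_fast * dt
        p_slow += flux_slow * dt
    conversion = p_fast + p_slow
    desired_fraction = p_fast / conversion
    return conversion, desired_fraction


if __name__ == "__main__":
    for k_rac in (0.0, 0.1, 1.0, 10.0):
        conv, frac = simulate_dkr(k_rac)
        print(f"k_rac={k_rac:5.1f}  conversion={conv:.2f}  desired fraction={frac:.3f}")
```

In this sketch, slow racemization leaves the slow-reacting enantiomer to accumulate and be consumed through the undesired pathway, so both conversion and the fraction of desired product fall; faster racemization improves both, which is the qualitative trend the pH experiments probe.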
Protein engineering. Despite the promising results achieved through reaction optimization, further protein engineering was clearly needed to enhance the application potential of the biocatalytic process. Industrial-scale production of the tofacitinib intermediate requires meeting demanding process targets, including > 100 g L−1 substrate loading, > 90% conversion, > 99.5% ee, and > 95:5 dr.43,51,54 In addition, since high pH accelerates substrate racemization, it is necessary to improve the tolerance of PocIRED to the high pH conditions that benefit the DKR process. Given these targets, multiple properties of the catalyst, including enantioselectivity, diastereoselectivity, activity, and tolerance, had to be evolved concurrently. Therefore, a comprehensive strategy for protein engineering was devised. This included crystal structure-based and computational approaches for initial site selection, coupled with diverse mutant library design and screening methods.
Initially, the crystal structure of the PocIRED-WT, in complex with NADP+ and substrate 2, was determined at a resolution of 1.5 Å (PDB: 8YXQ). In the first round, a focused mutant library at 48 sites (the core layer, Fig. 3a) within 9 Å of the substrate-binding pocket was derived by altering amino acids to alanine (A), leucine (L), or phenylalanine (F), respectively in a probe-like fashion.55 Changes in stereoselectivity were then analyzed using chiral gas chromatography. Sites showing enhanced stereoselectivity underwent single-site saturation mutagenesis, streamlining the evolutionary process to enhance enzyme stereoselectivity. Several hits with improved stereoselectivity over WT were identified. The most beneficial mutant, Q239G (designated as M1), exhibited markedly improved stereoselectivity, with ee increasing from 88% to > 99.9%, and dr from 9:1 to 13:1. Furthermore, M1 displayed a 1.8-fold enhancement in conversion compared with its WT.
In the first round of stereoselectivity screening, multiple mutants exhibited an increased dr value coupled with decreased conversion or vice versa, implying a potential trade-off between diastereoselectivity and conversion. In the second round, these single-point mutations were incorporated into M1 in pursuit of improved mutants. However, the trade-off persisted even after combinatorial mutations such as M1 + S72D, M1 + S72N, and M1 + M179G (Fig. 3c). Moderate diastereoselectivity with the sole remaining undesired stereoisomer, (3R,4S)-2a, suggested that the mutant successfully accomplished the asymmetric reductive amination of the prochiral carbonyl of 2 (ee > 99.9%), yet its capability to discriminate between the enantiomers of the substrate (rac-2) remained constrained (only fulfilling requirement ii). Therefore, we sought to enhance the diastereoselectivity of the enzyme in the second round of protein engineering. A combinatorial library, combining beneficial single-site mutations from round 1 and a CAST library,56 was constructed and screened. A single-site mutant, M1 + P125T (designated as M2), was obtained that exhibited an excellent dr value (99:1) and maintained enantioselectivity (> 99.9% ee), albeit with a decrease in conversion compared with M1. The remarkable dr value indicated that productive catalysis with M2 takes place with only one enantiomer of the racemic substrate (fulfilling requirement iii).
Close proximity between the amino acid residues of the enzyme’s active pocket and the substrate increases the likelihood that a mutation will modify the enzyme’s catalytic properties. Therefore, in the third round, the 48 sites from the first round were chosen again for mutagenesis using the NNK codon set. A random subset of 4600 variants, with 96 mutants per site to cover 20 amino acids with 95% probability,57 was screened by HPLC to assess conversion under defined reaction conditions (30 g L−1 substrate 2 and 600 mM amine donor a in pH 9.0 Tris-HCl buffer). Meanwhile, 110 sites from the middle and outer layers of the enzyme active pocket were chosen for saturation mutagenesis (Fig. 3a). For these sites, however, a library-building approach based on triple-code saturation mutagenesis was employed, screening only 48 mutants per site to encompass the full range of amino acids with 85% probability.57,58 Thus, saturation mutagenesis at 110 sites required the screening of only 5280 mutants. By using in silico protein structure analysis to estimate the potential influence of different sites on catalytic performance, and then applying a saturation mutagenesis method suited to each group of sites, the efficiency of protein engineering can be significantly enhanced. In addition, various computer-assisted methods, including identity analysis and homology comparison with engineered imine reductases, were employed in this round to identify potentially beneficial hits.45 Numerous positive mutations, resulting in a 1.2- to 4-fold improvement in conversion compared with M2, were obtained.
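The screening numbers quoted above can be reproduced with a commonly used Poisson-type estimate of library coverage. This is a sketch under stated assumptions: it treats all codons as equally likely, takes the NNK set as 32 codons, and uses the helper names `clones_for_coverage` and `expected_coverage`, which are introduced here for illustration. Whether refs. 57 and 58 use exactly this estimate is an assumption.

```python
import math


def clones_for_coverage(n_codons, coverage):
    """Clones to screen so that a given codon is sampled with probability
    `coverage`, assuming equiprobable codons: T = -V * ln(1 - F)."""
    return math.ceil(-n_codons * math.log(1.0 - coverage))


def expected_coverage(n_codons, n_clones):
    """Expected fraction of codons sampled after screening n_clones."""
    return 1.0 - math.exp(-n_clones / n_codons)


NNK_CODONS = 32  # N = A/C/G/T, K = G/T, so 4 * 4 * 2 codons

per_site_core = clones_for_coverage(NNK_CODONS, 0.95)
core_total = 48 * per_site_core    # 48 core-layer sites
outer_total = 110 * 48             # 110 sites, 48 clones per site (from the text)

if __name__ == "__main__":
    print("clones per NNK site for 95% coverage:", per_site_core)
    print("core-layer screening effort:", core_total)
    print("middle/outer-layer screening effort:", outer_total)
```

Under these assumptions the estimate gives 96 clones per NNK site, so 48 sites correspond to 4608 clones (quoted as 4600 in the text), and 110 sites at 48 clones each give the 5280 mutants stated above.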
While beneficial mutants can be rapidly identified through single-point mutations, potential multi-point mutants with synergistic effects may be overlooked. Hence, in the fourth round, beneficial mutations from the third round were categorized into seven groups based on spatial distance, and mutant libraries were serially constructed and screened (Fig. 3b, from group 1 to group 7) within each group using multicodon combinatorial mutagenesis (see Fig. S7 and Table S6 for details).59 The screening pressure was raised incrementally to converge toward the desired process conditions (substrate loading from 30 to 80 g L−1, methylamine concentration from 600 mM to 1200 mM, pH from 9.0 to 9.5). Screening these libraries under process-like conditions provided variants that were incrementally better adapted to the desired process conditions.28,60 After several iterative rounds of combinatorial mutation library screening, the catalytic activity and tolerance of the IRED were substantially improved.
The results and screening pressure are summarized in Fig. 3d. The initial two rounds prioritized enhancing the enzyme’s stereoselectivity, with mutant M2 exhibiting flawless stereoselectivity and a two-fold rise in specific activity compared with WT. The subsequent two rounds focused on improving catalyst activity and tolerance (mc). Following thorough screening, a 20-point mutant, PocIRED-M6 (Q239G, P125T, S40E, V124T, I127E, L128Y, S72N, Y73Q, T76V, K195A, G180S, R220V, A47P, V50A, F210L, K212A, W217Q, I178V, H224R, L236I), was identified. M6 displayed outstanding enantio- and diastereoselectivity (> 99.9% ee and > 99:1 dr, compared with 88% ee and 9:1 dr for WT), high specific activity (6.4 U mg−1, compared with 0.34 U mg−1 for WT), and improved thermostability (Tm 63.5°C vs. 48.1°C for WT). Moreover, M6 exhibited enhanced robustness at high pH (fulfilling requirement iv), increased tolerance to methylamine and the cosolvent DMSO, and improved expression in the Escherichia coli host (Fig. S5, Fig. S6, and Fig. S16). The criteria for establishing a DKR-ARA system through reaction optimization and protein engineering were thus satisfactorily fulfilled, and the next step involved evaluating the performance of the system.
Substrate scope. With the best mutant M6 in hand, we subsequently explored the substrate scope of the tailored PocIRED and WT (Table 2). In general, M6 exhibited a broader substrate scope and enhanced catalytic performance compared with WT. We initially investigated the asymmetric synthesis of 2b−2i via IRED-catalyzed biotransformation of ketone 2 with various amine nucleophiles (b−i). The results showed that M6 exhibited outstanding enantioselectivity (95−99% ee), and diastereoselectivity (98:2 to 99:1 dr) toward all amine donors tested, even ammonia. By contrast, WT exhibited very low conversion and poor stereoselectivity toward almost all amine donors. Less than 10% conversion was obtained with amine donors b−d, f and g and no detectable product was formed for e and h. For ketones 1−4, conversion with both M6 and WT progressively decreased with increasing steric hindrance of the β-substituted group.
Finally, substrates bearing additional functional groups on their aromatic ring (5−20) were tested with both WT and M6. We found that various ketones were smoothly converted by M6 to generate the corresponding products with excellent enantioselectivity (> 99% ee) and high diastereoselectivity (93:7−99:1 dr). The results indicate that neither the position of the substituent (ortho, meta, para) nor its electron-withdrawing or electron-donating character had a significant effect on the performance of M6. Notably, we successfully improved the diastereoselectivity of 16a from 79:21 to 93:7 under optimized S/C conditions, suggesting the possibility of further improving stereoselectivity through reaction optimization. By contrast, WT displayed distinct catalytic activities toward different ketones. In general, ketones with a substituent at the meta position on the benzene ring were better tolerated than those substituted at the para and ortho positions. Among the seven para-substituted substrates (5, 9, 13, and 17−20), WT only exhibited favorable catalytic performance toward ketone 9, but showed poor or reversed stereoselectivity, or even no activity, toward the others. These results may be attributed to steric hindrance, with the smaller fluorine substituent in the para position (substrate 9) having less impact on WT catalytic performance. In addition, WT displayed a near loss of stereoselectivity for the ortho-substituted substrates 7, 11 and 15. In conclusion, the engineered M6 has a broader substrate scope and better selectivity than WT, making it a promising biocatalyst for the highly diastereo- and enantioselective synthesis of β-branched chiral amines with contiguous stereocenters.
Preparative synthesis of the tofacitinib intermediate. To demonstrate the utility of the developed biocatalytic method, a preparative synthesis of the tofacitinib intermediate was conducted at a substrate concentration of 110 g L−1 in a 500 mL reaction volume, using the best mutant M6 in the form of lyophilized cell-free extract. The reaction was maintained at pH 9.5 through titration with a 1.5 M aqueous methylamine solution.61 Under these conditions, PocIRED-M6 catalyzed the reductive coupling of rac-2 (110 g L−1, 550 mM) with methylamine (1000 mM, 1.8 eq.), affording 44.5 g of product in 74% yield with > 99.9% ee and 98:2 dr (Fig. 4). These results demonstrate superior stereoselectivity and substrate loading compared with previously reported enzyme- or chemical-catalyzed asymmetric syntheses of the tofacitinib intermediate.
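The yield and amine equivalents reported for the preparative run can be cross-checked with a short calculation. This is a sketch under one loudly stated assumption: the product is taken to be isolated as the free base C14H22N2, with a molar mass of about 218.34 g mol−1 computed here from the formula and not given in the text. All other numbers are taken from the paragraph above.

```python
# Values from the text
volume_l = 0.500            # 500 mL reaction volume
substrate_conc_m = 0.550    # rac-2, 550 mM
amine_conc_m = 1.000        # methylamine, 1000 mM
product_mass_g = 44.5       # isolated product

# Assumption (not from the text): product isolated as the free base C14H22N2
PRODUCT_MOLAR_MASS = 218.34  # g/mol, computed from the molecular formula

substrate_mol = substrate_conc_m * volume_l
product_mol = product_mass_g / PRODUCT_MOLAR_MASS

isolated_yield = product_mol / substrate_mol
amine_equivalents = amine_conc_m / substrate_conc_m

if __name__ == "__main__":
    print(f"isolated yield: {isolated_yield:.1%}")
    print(f"amine equivalents: {amine_equivalents:.2f}")
```

With these inputs the calculation agrees with the reported 74% yield and 1.8 eq. of methylamine, which supports the free-base assumption but does not prove it.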
Mechanistic analysis. To gain insight into the origin of the improved performance of the engineered catalyst, X-ray crystal structures of PocIRED-M6 with NADPH (PDB: 8YVH) and PocIRED-M6 with NADPH and substrate 2 (PDB: 8YXY) were solved at resolutions of 2.1 and 1.1 Å, respectively (Table S14). Furthermore, based on the X-ray structures, molecular dynamics simulations were performed to investigate the structural basis of these improvements. The final mutant, PocIRED-M6, contains 20 mutation sites, with 12 sites from the core layer of the mutant library and 8 from the middle or outer layers. Mutations V124T, P125T, I127E, L128Y, G180S, F210L, W217Q, R220V, H224R, L236I, and Q239G are situated within the active pocket and might be mainly involved in substrate and cofactor binding, as well as stereoselective catalysis. Mutations S40E, A47P, V50A, T76V, K195A, and K212A are located on the surface or distal side of the protein, potentially contributing to increased hydrophilicity or charge of the protein surface and decreased flexibility, thereby favoring enzyme expression and stability. Mutation A47P is located at the end of a loop, where proline probably aids in stabilizing the loop, and mutation T76V results in an elongation of the α-helix (Fig. S22), thereby enhancing structural stability. In addition, mutation I178V probably also increases rigidity by stabilizing the dimer interface. The remaining mutations, S72N and Y73Q, are close to the cofactor and might influence its positioning (Fig. 5a).
Mutations are distributed across various regions of the enzyme. They not only tailor the active pocket for the catalytic steps (chemical steps), but also modify the overall structure for the physical steps of the catalytic cycle, which involve substrate access and product release. From a crystallographic perspective, the active pocket of the ternary complex of PocIRED-M6 (binding NADPH and substrate) exhibited a more compact conformation compared with the binary complex that only binds NADPH (Fig. S21). However, PocIRED-WT maintained the closed form of its pocket regardless of the presence of bound substrate (Fig. S19c). Substrate inhibition is generally attributed to the formation of an unproductive enzyme–substrate complex after the simultaneous binding of two or more substrate molecules in the active pocket.62 The closed conformation of WT may result in the unproductive binding of substrates. Indeed, PocIRED-WT exhibited severe substrate inhibition (0.34 U mg−1 at 50 mM substrate concentration vs. 0.13 U mg−1 at 200 mM), and simultaneously accommodated two molecules of substrate 2 in its active pocket (Fig. S19b). In contrast to the permanently closed form of WT, M6 exhibited both open and closed forms of the active pocket, with closure triggered by substrate binding. In addition, the smaller amino acid side chains of T125 (A), Q217 (B), V220 (B), I236 (B), and G239 (B) also affected the substrate tunnel in M6. New substrate tunnels were observed in M6 (Fig. 5d and 5e; Fig. S14d and S14e). By improving the physical steps, this may also be partially responsible for the enhanced catalytic activity and reduced substrate inhibition (6.4 U mg−1 at 50 mM substrate concentration vs. 5.2 U mg−1 at 200 mM).
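The extent of substrate inhibition quoted above can be made explicit by expressing the specific activity at 200 mM as a fraction of that at 50 mM. This is a minimal sketch using only the four activity values given in the text; the dictionary layout and variable names are illustrative.

```python
# Specific activities (U/mg) from the text, at 50 mM and 200 mM substrate
activity = {
    "WT": {50: 0.34, 200: 0.13},
    "M6": {50: 6.4, 200: 5.2},
}

# Fraction of the 50 mM activity retained at 200 mM for each enzyme
retained = {name: vals[200] / vals[50] for name, vals in activity.items()}

# Fold improvement of M6 over WT at each substrate concentration
fold = {conc: activity["M6"][conc] / activity["WT"][conc] for conc in (50, 200)}

if __name__ == "__main__":
    for name, frac in retained.items():
        print(f"{name}: {frac:.0%} of activity retained at 200 mM")
    for conc, value in fold.items():
        print(f"M6/WT activity ratio at {conc} mM: {value:.1f}")
```

On these numbers WT retains roughly 38% of its activity at the higher substrate concentration, whereas M6 retains roughly 81%, so the advantage of M6 over WT widens as substrate loading increases.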
According to the molecular dynamics simulation results, the substrate remained stably bound in the binding pocket in both WT and M6 throughout the entire simulation. The substrate-binding affinity of PocIRED-M6, calculated using the MM/PBSA method, was slightly weaker (−20.9 ± 3.7 kcal/mol and −17.6 ± 3.9 kcal/mol in different parallel simulations) than that of PocIRED-WT (−23.8 ± 4.4 kcal/mol and −25.0 ± 3.5 kcal/mol) (Table S10). Although the substrate can bind in the pocket in both WT and M6, the binding mode differs between the two systems.
In WT, the side chain of W217 (B) engages in aromatic-proline interactions with P125 (A), and π-π stacking interactions with F210 (B). Since P125 (A) is located close to NADPH, the substrate is obstructed and prevented from binding closer to NADPH (Fig. 5b). The distances between the C4 atom of NADPH and the prochiral C atom of the substrate are relatively long and unstable (7.0 ± 1.3 Å and 10.7 ± 0.5 Å in different parallel simulations; Fig. S12). By contrast, in M6, the smaller side chains of T125 (A), L210 (B), and Q217 (B) alleviate the steric hindrance, allowing the substrate to bind closer to NADPH (Fig. 5c) and resulting in shorter and more stable distances between the C4 atom of NADPH and the prochiral C atom of the substrate (5.5 ± 0.7 Å and 5.6 ± 0.5 Å in different parallel simulations; Fig. S13). Therefore, the substrate in M6 can adopt a more catalytically competent conformation owing to the tailored active pocket, facilitating the chemical steps.
Finally, we found that, compared with WT, M6 formed a salt bridge between R224 (B) and E127 (A), and a stable hydrogen bond between Q217 (B) and T125 (A) (Fig. 5b and 5c). These new interactions stabilize the interface between the two monomers of the enzyme and are likely to contribute to the improved thermal stability.