A Ni2+-dependent metformin hydrolase
Examination of the genomes of several recently characterized metformin utilizers16–19 revealed a cluster of seven genes (designated metR2TR1ABCaCb) that was present in all of them (Extended Data Fig. 1). This gene cluster encodes two transcriptional regulators belonging to the MerR family (MetR2) and the AcrR family (MetR1), a nucleobase-cation-symport-1 (NCS1)-like importer (MetT), two proteins annotated as Ni/Fe hydrogenase nickel incorporation-associated proteins (MetA and MetB), and two arginase/agmatinase family proteins (MetCa and MetCb). Differential omics analyses revealed that expression of the genes metCaCb, which encode the arginase family proteins, is upregulated under metformin treatment17,18. Arginase family enzymes are metalloenzymes that usually prefer Mn2+ for catalyzing C-N bond cleavage20. The met gene cluster, which encodes metallohydrolases, metallochaperones, a putative uptake system, and transcriptional regulators, might therefore encode metformin catabolism (Fig. 1a).
When the two arginase family proteins MetCa and MetCb from strain NyZ550 were coexpressed in E. coli, low activity was detected in soluble protein extracts without the addition of transition metal salts (Fig. 1b). This activity was absent in extracts from E. coli cells carrying an empty vector. Adding Ni2+ to the cell extracts strongly increased the activity and adding Co2+ slightly increased it, whereas Zn2+ and Cu2+ inhibited it. Comparable activity was detected when MetCaCb was coexpressed with or without the two metal chaperones MetAB, in either the presence or absence of metal ions during the assay (Fig. 1b). Additionally, when MetCa or MetCb was expressed individually, in the absence or presence of MetAB, no metformin hydrolase activity was detected in the soluble cell extracts (Extended Data Fig. 2a). MetCa had low solubility when expressed alone in E. coli, whereas coexpression with MetCb improved its solubility. To rule out the possibility that the low solubility of MetCa in the cell extracts prevented detection of activity, recombinant MetCa and MetCb were purified individually for activity assays. As shown in Fig. 1c, neither protein exhibited activity in vitro, even in the presence of excess Ni2+. Likewise, no activity was observed when equal amounts of MetCa and MetCb were mixed and preincubated at 30 °C for 1 h before the addition of 10 mM metformin. However, activity appeared when MetCa and MetCb were coexpressed and purified together (Fig. 1c, d; and see below), suggesting that metCa and metCb together encode the metformin hydrolase.
Since six additional genes encoding arginase family proteins were present in the genome of strain NyZ550, gene deletions were performed to verify whether MetCaCb alone accounted for the metformin hydrolase activity of this strain. After the coding sequence of either metCa or metCb was deleted by a two-step homologous recombination strategy, the mutants could no longer utilize metformin as the sole carbon and nitrogen source for growth, whereas their capacity to grow on dimethylamine was unaffected (Fig. 1e). Moreover, a metAB mutant also lost the ability to grow on metformin, indicating that MetAB was required for metformin utilization in vivo (Fig. 1e). It was therefore concluded that MetCa and MetCb together function as a Ni2+-dependent metformin hydrolase that mediates the utilization of metformin.
MetCa and MetCb form a heterocomplex
To examine whether MetCa and MetCb bind each other to form a heteromultimeric metformin hydrolase, MetCaCb was purified from cell extracts containing maltose-binding protein (MBP)-tagged MetCa and tag-free MetCb. MetCb was co-purified with MBP-MetCa, as evidenced by two bands on SDS-PAGE, and metformin hydrolase activity was detected only in co-purified MetCaCb (Fig. 2a). The same result was obtained when the MBP tag was replaced with a His tag and the complex was purified on Ni-nitrilotriacetic acid (NTA) agarose (Fig. 2b). MetCa and MetCb comprise 357 (theoretically 40 kDa) and 348 (theoretically 38 kDa) amino acids, respectively. Size exclusion chromatography (SEC) coupled with multi-angle light scattering (MALS) showed that the molecular weight of the co-purified active MetCaCb complex in solution was 223.7 ± 0.6 kDa (Fig. 2c), consistent with a hexamer. These results confirmed that MetCa interacts with MetCb to form a heteromultimeric complex and that this interaction is essential for metformin hydrolase activity.
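As a rough consistency check on the SEC-MALS result, the sketch below enumerates assemblies built from the theoretical subunit masses quoted above (40 kDa for MetCa, 38 kDa for MetCb) and ranks them by their distance from the measured 223.7 kDa. The helper name and the cap of eight subunits are illustrative assumptions, and affinity tags and bound metal ions are ignored.

```python
# Values taken from the text: theoretical subunit masses and the SEC-MALS mass.
METCA_KDA = 40.0      # MetCa (alpha subunit), 357 amino acids
METCB_KDA = 38.0      # MetCb (beta subunit), 348 amino acids
MEASURED_KDA = 223.7  # co-purified MetCaCb complex in solution


def rank_assemblies(max_subunits=8):
    """Hypothetical helper: list every MetCa/MetCb combination up to
    max_subunits as (abs_error_kDa, n_MetCa, n_MetCb, mass_kDa),
    sorted so the best match to the measured mass comes first."""
    ranked = []
    for n_a in range(max_subunits + 1):
        for n_b in range(max_subunits + 1 - n_a):
            if n_a + n_b == 0:
                continue  # skip the empty assembly
            mass = n_a * METCA_KDA + n_b * METCB_KDA
            ranked.append((abs(mass - MEASURED_KDA), n_a, n_b, mass))
    return sorted(ranked)


# Show the seven closest candidates.
for error, n_a, n_b, mass in rank_assemblies()[:7]:
    print(f"MetCa x{n_a} + MetCb x{n_b}: {mass:.0f} kDa (off by {error:.1f} kDa)")
```

Under these assumptions the seven closest candidates all contain six subunits, but mass alone does not separate the possible subunit ratios (a 2:4 assembly is 232 kDa and a 3:3 assembly is 234 kDa), so the stoichiometry has to come from other evidence.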
Characterization of the metformin hydrolase
13C-NMR and HPLC-MS analyses of the reaction mixture containing 13C15N-labelled metformin as the substrate indicated that MetCaCb catalyzed the conversion of metformin into guanylurea (Fig. 1d; Extended Data Fig. 2b and c). The oxygen atom of guanylurea was derived from water, as evidenced by H218O labelling assays (Extended Data Fig. 2d). The other product of metformin hydrolysis by MetCaCb was determined to be dimethylamine, which was formed stoichiometrically with guanylurea (Extended Data Fig. 2e). Dimethylamine is a productive metabolite that strain NyZ550 uses as a carbon and nitrogen source (Fig. 1e).
The dependence of MetCaCb on Ni2+ was confirmed by the significant inhibition of its metformin hydrolase activity in the presence of the Ni-specific chelator dimethylglyoxime (DMG) (Fig. 1c). With the addition of Ni2+, the optimal reaction conditions of the metformin hydrolase MetCaCb were similar to those of Mn2+-dependent arginase family enzymes20. The optimum temperature for the reaction was ~50 °C (Fig. 2d). MetCaCb hydrolyzed metformin most efficiently at pH 9.0 (Fig. 2e) and exhibited a Michaelis constant (Km) of 6.84 mM for metformin with a catalytic efficiency (kcat/Km) of 1.88 mM−1 s−1 (Fig. 2f; Extended Data Table 1). The hydrolase activity was inhibited by the arginase inhibitor 2(S)-amino-6-boronohexanoic acid (ABH)21 (Fig. 1c), indicating the possible formation of a tetrahedral intermediate during MetCaCb-catalyzed metformin hydrolysis.
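To make the kinetic constants concrete, the following sketch derives the turnover number from the reported Km and catalytic efficiency and evaluates the rate per active site at a chosen metformin concentration. It assumes simple Michaelis-Menten behaviour and that the reported catalytic efficiency is kcat/Km; the function names are illustrative.

```python
# Kinetic constants reported in the text for metformin.
KM_MM = 6.84         # Michaelis constant, mM
KCAT_OVER_KM = 1.88  # catalytic efficiency, mM^-1 s^-1 (assumed to be kcat/Km)


def kcat_per_s():
    """Turnover number implied by the two reported constants (s^-1)."""
    return KCAT_OVER_KM * KM_MM


def turnover_rate(substrate_mm):
    """Michaelis-Menten rate per active site (s^-1) at a substrate
    concentration given in mM. Assumes no inhibition or cooperativity."""
    return kcat_per_s() * substrate_mm / (KM_MM + substrate_mm)


print(f"kcat = {kcat_per_s():.2f} s^-1")
print(f"rate at 10 mM metformin = {turnover_rate(10.0):.2f} s^-1")
```

With these values kcat is about 12.9 s−1, and at 10 mM metformin the enzyme would operate at roughly 59% of its maximal rate, since the Km is of the same order as that concentration.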
MetCaCb also catalyzed the hydrolysis of biguanide and its derivatives, including 1-butylbiguanide, 1-methylbiguanide, and phenformin, with the production of guanylurea, albeit with markedly reduced activities (Extended Data Fig. 2f). The lower affinity and efficiency of MetCaCb toward 1-methylbiguanide than toward metformin (Fig. 2g; Extended Data Table 1) were consistent with the previously observed weak growth of strain NyZ550 on 1-methylbiguanide17. No hydrolase activity of MetCaCb was observed toward other guanidinium moiety-containing substrates, including guanidine, dimethylguanidine, agmatine, L-arginine, guanidinobutyrate, and guanidinopropionate (Extended Data Fig. 2f).
Architecture of the MetCaCb complex
To gain insight into the structure of the metformin hydrolase and its recognition of metformin, the enzyme was analyzed by X-ray crystallography. The co-purified MetCaCb was subjected to SEC before crystallization, and the active fractions of the MetCaCb protein complex (Fig. 2b) were crystallized by the sitting-drop method. The atomic structure of the substrate-free MetCaCb complex was solved at a resolution of 1.84 Å (PDB ID 8X3G). The asymmetric unit (AU) of the P212121 crystal showed an uneven stoichiometry of the two subunits, containing two MetCa (α subunit) and four MetCb (β subunit) molecules (Fig. 3a and Extended Data Fig. 3a). This uneven composition of the MetCaCb complex was also roughly consistent with the different intensities of the SDS-PAGE bands (Fig. 2b). The hexamer was a centrosymmetric dimer of trimers, in each of which one MetCa and two MetCb molecules were arranged in a head-to-tail manner (Extended Data Fig. 3). To observe the oligomeric state of the MetCaCb complex under conditions approximating the physiological state, we performed single-particle cryoEM on the same sample used for crystallization. This analysis showed that the enzyme particles were highly uniform, and the resulting cryoEM map clearly revealed the hexameric configuration of the MetCaCb complex (Extended Data Fig. 4).
MetCa and MetCb share only 34% amino acid sequence identity, yet their overall architectures were highly similar (rms deviation of 1.94 Å for 321 Cα atoms), and both exhibited the canonical three-layer alpha-beta-alpha fold of the arginase protein superfamily (Fig. 3b). The major structural differences between MetCa and MetCb lay in their N- and C-termini. MetCa had an additional N-terminal loop (T8–G26) that contributed to the interaction with another MetCa subunit through a hydrophobic interface (Extended Data Fig. 5a and b). In contrast, an extended C-terminal loop of MetCb was located at the MetCa-MetCb interface, where it also stabilized the N-terminal structure of MetCb that capped the substrate-binding cavity of MetCa (Extended Data Fig. 5c).
Another difference between MetCa and MetCb was found in the region corresponding to the arginase family active center. Although this region was folded in a similar conformation in both MetCa and MetCb, only the active site of MetCa contained a bimetallic center, coordinated by four Asp (D183, D187, D276 and D278) and two His (H158 and H185) residues (Fig. 3c and Extended Data Fig. 5d). This metal ion coordination pattern is conserved among bacterial arginase family enzymes. A glycerol molecule from the crystallization buffer was bound in the active site and coordinated to the two metal ions. Variants in which the conserved coordinating residues of MetCa were substituted with Ala, disrupting the binuclear metal center, lost enzyme activity (Extended Data Table 1). In contrast, no electron density for a metal ion was observed in the corresponding region of the MetCb subunit, probably because five of the six conserved metal ion-coordinating residues are substituted in MetCb (Fig. 3c, d and Extended Data Fig. 5e). Analysis of the metal content of purified metformin hydrolase by inductively coupled plasma-mass spectrometry (ICP-MS) indicated 0.8 manganese and 0.05 nickel atoms per α1β2 trimer, and 1.9 manganese and 0.06 nickel atoms per trimer when the enzyme was incubated with excess Mn2+ before ICP-MS analysis. In contrast, no metal ions were detected in recombinant MetCb protein. These results are consistent with the observations from the structure of the MetCaCb complex described above. Taken together, these results indicated that MetCa harbors the catalytic center of this metalloenzyme.
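The ICP-MS figures can also be read as occupancy of the binuclear site. The sketch below assumes two metal-binding positions per α1β2 trimer (one bimetallic center in the single MetCa and none in MetCb), which is an interpretation of the structural description above rather than a value reported in the text; the names are illustrative.

```python
# Assumption: one bimetallic center per trimer, located in MetCa only.
SITES_PER_TRIMER = 2


def site_occupancy(metal_atoms_per_trimer):
    """Fraction of the assumed metal positions that are filled, given a
    mapping of metal name -> atoms per trimer (hypothetical helper)."""
    return sum(metal_atoms_per_trimer.values()) / SITES_PER_TRIMER


# Metal contents per trimer as quoted in the text.
as_purified = {"Mn": 0.8, "Ni": 0.05}
after_metal_incubation = {"Mn": 1.9, "Ni": 0.06}

print(f"as purified: {site_occupancy(as_purified):.1%}")
print(f"after incubation: {site_occupancy(after_metal_incubation):.1%}")
```

On this reading, fewer than half of the assumed positions are filled in the enzyme as purified, whereas nearly all of them are filled after incubation with excess metal.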
Active-site architecture of MetCaCb
An unexpected finding from the crystal structure of the holoenzyme was that the active site cavity within the head domain of MetCa is capped by the tail domain of a MetCb subunit (Extended Data Fig. 6a). This configuration is distinct from that of typical arginase family proteins, which feature an open, solvent-exposed substrate-binding pocket22. The N-terminal 23 amino acids of MetCb adopted a hollowed-out structure that is predicted to serve as the substrate entrance connecting to the active site cavity of MetCa (Fig. 3e). Notably, truncation of these 23 N-terminal amino acids of MetCb abolished soluble expression of the MetCaCb complex, suggesting that this segment contributes to stabilizing the protein complex.
Next, a metformin molecule was computationally docked into the metal ion-containing active center of MetCa. The docking model indicated that metformin was bound in a highly negatively charged pocket enclosed by the aromatic amino acids Y22 and F82 from MetCb and W192 from MetCa (Fig. 3f). The noncoordinating residues in the active site of MetCa were mostly hydrophilic, including Q80, N199, N288, S289, and E320 (Fig. 3e; Extended Data Fig. 6b). The N199 and S289 residues of MetCa appeared to be unique among arginase family enzymes (Extended Data Fig. 6c). In other arginase family proteins, the residue equivalent to N199 is a conserved noncoordinating histidine that is catalytically important during the reaction23. The N199A variant completely lost activity (Extended Data Table 1), implying that N199 serves as a proton shuttle like its equivalent histidine. Additionally, S289 was identified as a substrate-binding residue that is essential for metformin hydrolase activity (Extended Data Table 1).
MetCaCb and paired arginase family proteins
Arginase family proteins are commonly known as enzymes encoded by a single gene that function as monomers or homomultimers20. The heterohexameric metformin hydrolase encoded by an arginase gene pair therefore represents a unique member of this family (Extended Data Fig. 7). To investigate the possible distribution of MetCaCb-like arginase pairs, homologs with at least 25% amino acid sequence identity were searched for in the NCBI nr database24. As a result, 56 gene pairs were identified from four bacterial phyla: Proteobacteria (44), Actinomycetota (10), Acidobacteriota (1) and Planctomycetota (1) (Supplementary Data 1). The occurrence of these paired arginases in soil, aquatic, and plant-associated bacteria underscores their environmental ubiquity. Phylogenetic analysis showed that MetCa and MetCb were located in two well-separated clades that evolved from a common origin (Fig. 4a). Homologs within the MetCa clade retained all six conserved metal ion-coordinating residues, whereas the MetCb clade contained a group of unique arginase family proteins with substitutions at three to five of the six conserved coordinating residues (Fig. 4a). Genetic context analysis indicated that most genes within the MetCaCb clades were located in close physical proximity to genes for hydrogenase maturation nickel metallochaperones homologous to MetAB in this study, indicating that these enzymes are likely also dependent on Ni2+. Furthermore, in one set of arginase pairs, both members contained intact metal ion-coordinating residues. Their specific functions, their metal ion dependence, and whether the two members cooperate in enzymatic activity remain open questions.
To test whether these MetCaCb homologous pairs were functional, three codon-optimized DNA sequence pairs (AcMetCaCb, HmMetCaCb, and RsMetCaCb, as shown in Fig. 4a) from the MetCaCb clades were synthesized for the expression of recombinant proteins in E. coli. The α subunits of these three pairs shared amino acid sequence identities of 93.8%, 71.2%, and 61.7% with MetCa, respectively. All the protein pairs were successfully expressed and purified as heterohexamers, like the positive control MetCaCb (Fig. 4b). In the presence of Ni2+, conversion of metformin into guanylurea by AcMetCaCb and HmMetCaCb was confirmed by LC-MS. However, their hydrolase activities were approximately 20,000-fold lower than that of MetCaCb. In contrast, no detectable enzymatic activity against metformin was observed for RsMetCaCb (Fig. 4c).