Discovery of a signature cyclic immonium ion from lysine lactylated peptides
Driven by the need to discover unknown lactylation substrate proteins, we sought to identify unique features of lactyllysine-carrying peptides in LC-MS/MS. We first chemically introduced lactylation onto the lysines of the model peptides LVFFKA and NKGAII (Fig. 1a, Fig. S1a). Successful derivatization with both L- and D-lactylation1,21 was confirmed by a mass shift of 72.02 Da on the peptide precursors and by the collision-induced dissociation (CID)-generated MS/MS fragment ions carrying the modification, as acquired on a Q/TOF (Fig. 1b-c, Fig. S1b-c). Closer inspection of the low-mass region revealed a clear ion at m/z 173.129 corresponding to the linear immonium (linIm) ion of lactyllysine (Fig. 1c). Notably, this ion readily loses NH3 (Fig. 1d) and is thus accompanied by a cyclic immonium (cycIm) ion at m/z 156.103 that exhibited significantly higher intensity than the linIm ion (Fig. 1c, Fig. S1c). The consistently higher intensity of the cycIm ion over the linIm ion, both derived from lactyllysine, was confirmed by screening the MS/MS spectra of the two lactylated model peptides acquired at collision energies (CE) spanning a wide range (Fig. S1d-e).
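As a worked check of the masses quoted above, the following minimal sketch derives the lactylation mass shift and the m/z of both immonium ions from elemental compositions. It assumes standard monoisotopic masses, a net addition of C3H4O2 for lactylation, and the usual definition of an immonium ion as the residue minus CO plus a proton; the variable and function names are illustrative only.

```python
# Monoisotopic masses (Da) of the elements and of the proton.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647

def formula_mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MASS[element] * count for element, count in formula.items())

LYS_RESIDUE = {"C": 6, "H": 12, "N": 2, "O": 1}  # lysine residue within a peptide
LACTYL = {"C": 3, "H": 4, "O": 2}                # net addition on lactylation (assumed)
CO = {"C": 1, "O": 1}
NH3 = {"N": 1, "H": 3}

lactyl_shift = formula_mass(LACTYL)
# Linear immonium ion: modified residue minus CO, plus one proton.
lin_im = formula_mass(LYS_RESIDUE) + lactyl_shift - formula_mass(CO) + PROTON
# Cyclic immonium ion: linear immonium ion minus NH3.
cyc_im = lin_im - formula_mass(NH3)

print(f"lactylation mass shift: {lactyl_shift:.4f} Da")
print(f"linIm m/z: {lin_im:.4f}")
print(f"cycIm m/z: {cyc_im:.4f}")
```

Under these assumptions the computed values fall within roughly 1 mDa of the m/z values quoted in the text; small differences of this size can arise from how the charge carrier mass is handled.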
Cyclic immonium ion of lactyllysine signifies lactylation
We then investigated the diagnostic value of the cycIm ion for indicating lysine lactylation, using human recombinant ENO1 as a model protein. We lactylated ENO1 under mild, nondenaturing conditions by incubating it with LGSH21 (Fig. S1a) and performed bottom-up analysis of the lactylated protein on an Orbitrap. Subsequent mass shift-based database searching and manual inspection identified 28 lactylated lysines on ENO1, corresponding to 100 lactylated peptides and 357 PSMs (Fig. 1e). We also conducted the analysis on a Q/TOF (Fig. S2a). Interestingly, regardless of the instrument platforms and database search engines we tested, the MS/MS spectra of the in vitro lactylated ENO1 peptides showed that lactyllysine-carrying peptides, but not non-lactylated peptides, tend to produce the cycIm ion at a greater frequency and at far higher intensity than the linIm ion (Fig. 1f-g, Fig. S2b-d), suggesting that the cycIm ion is both sensitive and specific for confident lactylation assignment. In agreement, receiver operating characteristic (ROC) curves showed that the area under the curve (AUC) reached 0.91 when the presence of the cycIm ion was used as the marker of lactylation, whereas the AUC for the linIm ion was only 0.70 (Fig. 1h). Collectively, based on in vitro lactylated peptides, we propose that the cycIm ion is a promising marker for lactylation assignment.
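To illustrate how an AUC can be obtained when ion presence is the marker, the sketch below computes the AUC as the probability that a lactylated spectrum scores higher than an unmodified one, with ties counted as one half. The data are hypothetical toy values, not the ENO1 results, and the function name is illustrative; the scoring actually used for Fig. 1h may differ.

```python
def auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score), counting ties as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical toy data: 1 = diagnostic ion detected in the spectrum, 0 = not detected.
lactylated = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]   # detected in 8 of 10 spectra
unmodified = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]   # detected in 1 of 10 spectra

print(f"AUC = {auc(lactylated, unmodified):.2f}")
```

For a binary marker this quantity equals the mean of sensitivity and specificity, so the AUC rises only when the ion is both frequent in lactylated spectra and rare in unmodified ones.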
Meanwhile, we found that the poor specificity of the linIm ion for lactyllysine can be attributed to isobaric MS/MS fragment ions generated from peptides with the N-terminal sequences LS/SL, IS/SI and TV/VT (Fig. S3a). In comparison, interfering MS/MS ions for the cycIm ion are rare. Given that ion mobility is powerful in differentiating isobaric molecules based on their gas-phase conformations25,26,27, we tested whether ion mobility can accurately measure the collision cross section (CCS) values of the signature ions and thereby distinguish the linIm ion from interfering isomeric MS/MS ions. We used a travelling wave ion mobility mass spectrometer (TWIMS) to fragment three lactylated peptides and then analyzed the arrival time distributions (ATDs) of the resulting MS/MS ions (Fig. S3b). Indeed, the ATDs of the diagnostic cycIm ions produced from LVFFKLacA, NKLacGAII and FNKLac, and hence the corresponding CCS values, closely resembled one another (Fig. S3b). Moreover, TWIMS enabled us to unambiguously distinguish the linIm ion at m/z 173.1290 from an isobaric MS/MS ion, the a2 ion of the dipeptidyl Ser-Ile at m/z 173.1290, by their distinct ATDs (Fig. S3c). Although current MS proteomics workflows cannot yet record the CCS values of all MS/MS fragment ions produced from peptide precursors on the fly, and thus cannot resolve confounding peaks for the marker ions of lactyllysine via ion mobility, we anticipate that future advances in instrumentation will overcome this analytical challenge and endow the lactyllysine diagnostic ions with even greater specificity, and hence greater diagnostic value, for lactylation assignment.
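The interference described above can be checked from elemental compositions alone. The sketch below, which assumes standard residue formulas and treats both an a-type ion and an immonium ion as the summed residues minus CO plus a proton, compares the a2 ions of the listed dipeptide N-termini with the linIm ion of lactyllysine. The residue label "Klac" and the function names are illustrative.

```python
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647

# Elemental compositions of residues as they occur within a peptide chain.
RESIDUES = {
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
    "T": {"C": 4, "H": 7, "N": 1, "O": 2},
    "V": {"C": 5, "H": 9, "N": 1, "O": 1},
    "L": {"C": 6, "H": 11, "N": 1, "O": 1},
    "I": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Klac": {"C": 9, "H": 16, "N": 2, "O": 3},  # lysine residue plus C3H4O2 (assumed)
}

def composition(residues):
    """Summed elemental composition of a list of residue labels."""
    total = {}
    for residue in residues:
        for element, count in RESIDUES[residue].items():
            total[element] = total.get(element, 0) + count
    return total

def minus_co_plus_proton(comp):
    """m/z of a singly charged a-type or immonium ion: residues - CO + H+."""
    mass = sum(MASS[element] * count for element, count in comp.items())
    return mass - (MASS["C"] + MASS["O"]) + PROTON

lin_im = minus_co_plus_proton(composition(["Klac"]))
print(f"linIm (lactyllysine): {lin_im:.4f}")
for pair in (["L", "S"], ["I", "S"], ["T", "V"]):
    a2 = minus_co_plus_proton(composition(pair))
    isomeric = composition(pair) == composition(["Klac"])
    print("".join(pair), f"a2 m/z {a2:.4f}", "same composition" if isomeric else "different")
```

Because residue order does not change the summed composition, SL, SI and VT give the same a2 values. All three pairs share the elemental composition of the lactyllysine residue, which is why mass accuracy alone cannot separate these ions and a gas-phase separation such as ion mobility is needed.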
Lysine lactylation modifies peptide chromatographic behavior
Besides giving rise to the signature cycIm ion, we postulated that lactylation also affects the biochemical properties of the peptides carrying it. We therefore compiled the assigned lactylated peptides from ENO1 digests and the unmodified peptides of identical sequence from the shotgun proteomics data (Fig. S2a and Table S1). Pairwise analysis of these peptides showed a marked increase in peptide retention time conferred by lactylation (Fig. 2a-b, Fig. S4a and Table S1). We further analyzed the chromatographic behavior of lactylated vs. unmodified peak pairs in bottom-up proteomics data of lactylated ENO1 collected on different analytical instruments, and verified this finding with 122 peptide pairs detected on a timsTOF and 97 pairs on an Orbitrap (Fig. 2c, Fig. S4b-d). Consistently, the lactylation-induced retention time shift was substantiated by bottom-up analysis of two additional in vitro lactylated proteins, BSA and lysozyme (Fig. 2c). The distribution of the retention time shifts of lactylated peptides is summarized in Fig. 2d. Because we had analyzed the model protein digests by trapped ion mobility spectrometry (TIMS) using a parallel accumulation-serial fragmentation (PASEF) approach28, we also compared the recorded CCS values of lactylated vs. unmodified peptide precursor pairs. However, this pilot study identified no significant change in ion mobility behavior (1/K0) conferred by lactylation on the examined peptides, in stark contrast to the altered chromatographic behavior (Fig. 2e). Collectively, an increase in retention time on a reversed-phase column can serve as an additional indicator of lactylation and, in combination with the cycIm ion, help eliminate false positive matches.
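The pairwise retention time comparison can be sketched as follows. The peptide sequences and retention times are hypothetical placeholders, and the use of the median across repeated identifications is an assumption made for illustration rather than the procedure used for Fig. 2.

```python
from statistics import median

def rt_shifts(identifications):
    """Return {sequence: RT(lactylated) - RT(unmodified)} for sequences seen in both forms.

    Each identification is (sequence, is_lactylated, retention_time_min).
    Repeated identifications of the same form are summarized by their median.
    """
    unmodified, lactylated = {}, {}
    for sequence, is_lactylated, rt in identifications:
        target = lactylated if is_lactylated else unmodified
        target.setdefault(sequence, []).append(rt)
    return {
        sequence: median(lactylated[sequence]) - median(unmodified[sequence])
        for sequence in lactylated
        if sequence in unmodified
    }

# Hypothetical identifications (not taken from the ENO1 data).
ids = [
    ("PEPTKIDER", False, 20.0),
    ("PEPTKIDER", True, 26.5),
    ("SAMPLEKR", False, 31.0),
    ("SAMPLEKR", True, 35.0),
    ("SAMPLEKR", True, 36.0),
    ("TESTKPEPR", True, 18.2),   # no unmodified partner, so no pair is formed
]

shifts = rt_shifts(ids)
for sequence, delta in shifts.items():
    print(f"{sequence}: delta RT = {delta:+.1f} min")
```

A positive shift for a pair is the behavior expected of a lactylated peptide on a reversed-phase column, so pairs with a zero or negative shift would merit closer inspection.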
Benchmarking diagnostic cycIm ion with affinity enriched-lactylation proteome data
We next asked whether the signature cycIm and linIm ions of lactylated peptides discovered with model peptides and proteins also apply to the complex human cellular proteome. We first validated the origin of the cycIm and linIm ions by analyzing metabolically labeled proteome data collected from MCF-7 cells cultured with isotopically labeled glucose (13C6-glucose). In theory, 13C6-glucose is endogenously metabolized to 13C3-lactate and is expected to induce a mass shift of 75.0312 Da for isotopically lactylated peptides, which corresponds to Δm = 3.0101 Da relative to light lactylated peptides from cells cultured in 12C6-glucose (Fig. 3a). This mass shift was confirmed for the isotopically lactylated peptide precursors and also for the cycIm and linIm ions, both of which inherited the three-carbon unit from lactate and hence appeared at m/z 159.1121 and m/z 176.1381, respectively (Fig. 3a). Meanwhile, analysis of an open-access quantitative proteome dataset collected from MCF-7 cells subjected to stable isotope labeling by amino acids in cell culture (SILAC) and enriched with a pan-lactylation antibody showed, by comparison of the light and heavy SILAC groups, that the signature cycIm and linIm ions also inherited the heavy isotopes introduced by lysine. Specifically, in the heavy-labeled group the linIm ion of the lactylated peptide retained the 13C label and both 15N atoms of heavy lysine and shifted to m/z 180.1391, whereas the cycIm ion, having lost NH3, retained the 13C label and a single 15N, as evidenced by its shift to m/z 162.1154 (Fig. 3b). Together, these data substantiate that the cycIm and linIm ions of lactyllysine inherit the isotopically labeled C derived from lactate and lysine as well as N from lysine, supporting the proposed production pathway (Fig. 1d).
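The isotope shifts quoted above can be reproduced with a short calculation. The sketch assumes light ion m/z values of 173.1285 (linIm) and 156.1019 (cycIm) computed from elemental composition, and it assumes that the immonium ions retain five of the six lysine carbons because the carbonyl carbon leaves as CO when an immonium ion forms. Constant and function names are illustrative.

```python
C13_SHIFT = 1.0033548   # mass difference 13C - 12C (Da)
N15_SHIFT = 0.9970349   # mass difference 15N - 14N (Da)

LACTYL_SHIFT = 72.0211  # light lactylation mass shift (Da)
LIN_IM_LIGHT = 173.1285 # assumed theoretical m/z of the light linIm ion
CYC_IM_LIGHT = 156.1019 # assumed theoretical m/z of the light cycIm ion

def labeled(value, n_13c=0, n_15n=0):
    """Mass or m/z after replacing n_13c carbons and n_15n nitrogens by heavy isotopes."""
    return value + n_13c * C13_SHIFT + n_15n * N15_SHIFT

# 13C3-lactate derived from 13C6-glucose: three labeled carbons in the lactyl group.
heavy_lactyl_shift = labeled(LACTYL_SHIFT, n_13c=3)
cyc_13c3 = labeled(CYC_IM_LIGHT, n_13c=3)
lin_13c3 = labeled(LIN_IM_LIGHT, n_13c=3)

# Heavy lysine (SILAC): five retained 13C (assumption, see lead-in); linIm keeps two 15N,
# cycIm keeps one 15N after loss of NH3.
lin_silac = labeled(LIN_IM_LIGHT, n_13c=5, n_15n=2)
cyc_silac = labeled(CYC_IM_LIGHT, n_13c=5, n_15n=1)

print(f"13C3 lactylation shift: {heavy_lactyl_shift:.4f} Da")
print(f"13C3 cycIm: {cyc_13c3:.4f}, 13C3 linIm: {lin_13c3:.4f}")
print(f"SILAC linIm: {lin_silac:.4f}, SILAC cycIm: {cyc_silac:.4f}")
```

With these assumptions each computed value lies within about 0.5 mDa of the corresponding m/z reported in the text, which is consistent with both ions carrying the lactate-derived carbons and the lysine-derived carbons and nitrogens.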
Confirmation of lactyllysine as the origin of the detected cycIm and linIm ions in tandem MS spectra thus led us to appraise the production efficiency of both ions from lactylation proteome of cultured human cells. We analyzed two publicly accessible, SILAC-processed and affinity enriched proteome datasets collected from MCF-7 cells in response to rotenone and DCA intervention1 (Fig. 3c, Table S2). Excitingly, after database searching against the two datasets, we found the cycIm ion was detected in lactylated peptides at a percentage as high as 89% and 90% whereas the linIm ion displayed a lowered percentage at 76% and 79% (Fig. 3c). Thus, we corroborated the high production rate of cycIm ion from lactylated peptides using affinity-enriched lactylation cellular proteomics data (Fig. 3c). Then, we investigated whether the cycIm ion is produced at high intensity in the examined affinity-enriched lactylation proteome datasets, and found the relative ion abundance of the cycIm ion in MS/MS spectra of lactylated peptides reached as high as 39.74% and 44.17% from the two datasets (Fig. 3d, Table S2), supporting the cycIm ion as a sensitive marker in signifying lactylation from complex cellular proteome.
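A minimal sketch of how the detection and relative abundance of a diagnostic ion could be read from a single MS/MS peak list is given below. It assumes that relative abundance is expressed as a percentage of the base peak and that a 20 ppm matching window is appropriate; both are illustrative choices, as is the toy spectrum, and the definition used for Fig. 3d may differ.

```python
CYC_IM_MZ = 156.1019  # assumed theoretical m/z of the cycIm ion
LIN_IM_MZ = 173.1285  # assumed theoretical m/z of the linIm ion

def relative_intensity(peaks, target_mz, tol_ppm=20.0):
    """Intensity of the strongest peak within tol_ppm of target_mz, in % of the base peak.

    `peaks` is a list of (m/z, intensity) tuples. Returns 0.0 if no peak matches.
    """
    if not peaks:
        return 0.0
    base_peak = max(intensity for _, intensity in peaks)
    tolerance = target_mz * tol_ppm * 1e-6
    matched = [intensity for mz, intensity in peaks if abs(mz - target_mz) <= tolerance]
    return 100.0 * max(matched) / base_peak if matched else 0.0

# Hypothetical toy spectrum of a lactylated peptide.
spectrum = [(129.1022, 1200.0), (156.1021, 4000.0), (173.1287, 900.0), (245.1500, 10000.0)]

print(f"cycIm: {relative_intensity(spectrum, CYC_IM_MZ):.1f}% of base peak")
print(f"linIm: {relative_intensity(spectrum, LIN_IM_MZ):.1f}% of base peak")
```

Applied across all lactylated PSMs, the fraction of spectra with a non-zero value gives a detection rate, and the distribution of non-zero values gives the relative abundance of the ion.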
Uncover non-histone nuclear proteins as lactylation substrates from affinity-enriched human cell proteome
Having validated the usefulness of the cycIm ion for lactylation site assignment in the affinity proteomics data, we asked whether lactylation modifies human non-histone proteins, as this knowledge holds the key to elucidating unexplored regulatory functions of lactylation beyond its role as a histone mark. We used the presence of the cycIm ion to filter all lactylated peptides identified by database searching of the affinity proteomics data. As a result, we retrieved 97 lactylated peptides and 84 lactylation sites belonging to 35 human non-histone proteins and 16 histones, whereas 9 lactylation sites and 11 lactylated peptides lacking the cycIm ion in their MS/MS spectra were removed as false positive matches. Because lactylation does not exclusively modify human histones, we sought to infer its regulatory roles on non-histone proteins by conducting gene ontology (GO) analysis, including cellular component (CC), molecular function (MF) and biological process (BP), of the newly identified lactylated proteins. Since the revisited proteomics data were collected following a histone extraction protocol, CC analysis showed that the identified lactylated proteins were mostly located in chromatin and the nucleus (Fig. S5a). In agreement, the MFs of the lactylated proteins were mainly classified as RNA and DNA binding (Fig. S5a). Notably, we found that lactylation can be installed on non-histone transcriptional regulators including High mobility group protein HMG-I/HMG-Y (HMGA1), General transcription factor II-I (GTF2I) and Zinc finger protein 706 (ZNF706) (Fig. S5b, Table S2), warranting future exploration of the functional readouts of lactylation on proteins of this class.
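The cycIm-based filtering step applied at the start of this analysis can be sketched as a partition of peptide-spectrum matches into retained and removed sets. The record layout, the 20 ppm tolerance and the example PSMs are hypothetical; the sketch only illustrates the principle that a lactylation assignment is kept when its MS/MS spectrum contains a peak at the cycIm m/z.

```python
CYC_IM_MZ = 156.1019  # assumed theoretical m/z of the cycIm ion

def has_ion(fragment_mz, target_mz=CYC_IM_MZ, tol_ppm=20.0):
    """True if any fragment m/z lies within tol_ppm of target_mz."""
    return any(abs(mz - target_mz) / target_mz * 1e6 <= tol_ppm for mz in fragment_mz)

def split_by_cyc_im(psms):
    """Partition PSMs into (kept, removed) by presence of the cycIm ion."""
    kept, removed = [], []
    for psm in psms:
        (kept if has_ion(psm["fragment_mz"]) else removed).append(psm)
    return kept, removed

# Hypothetical PSMs reported as lactylated by a database search.
psms = [
    {"peptide": "PEPTK[lac]IDER", "fragment_mz": [129.1022, 156.1020, 173.1286]},
    {"peptide": "SAMPLEK[lac]R", "fragment_mz": [120.0808, 173.1284]},  # linIm-like peak only
]

kept, removed = split_by_cyc_im(psms)
print("kept:", [psm["peptide"] for psm in kept])
print("removed:", [psm["peptide"] for psm in removed])
```

The second toy PSM shows why the linIm peak alone is not used as the criterion: a peak near m/z 173.13 can also arise from unrelated fragments, whereas the cycIm peak is rarely mimicked.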
Next, since lysine is a prevalent residue that bears other regulatory PTMs including ubiquitination, methylation, acetylation and sumoylation, we performed PTM category analysis for all the identified lactylated lysines and found that, compared to all quantified lysine residues, they are more prone to carry PTMs such as acetylation and ubiquitination (Fig. S5c), suggesting a tendency for PTM crosstalk on the lactylated proteins.
Next, we asked whether the lactylation levels of the identified substrate proteins respond to glycolysis and hence phenocopy histone lactylation1. Specifically, glycolysis was activated by inhibiting oxidative phosphorylation with rotenone, whereas glycolytic flux was inhibited by sodium dichloroacetate (DCA) treatment. Quantitative analysis of the lactylated proteome confirmed that the abundance of lactylated peptides was mostly increased by rotenone and lowered by DCA, in concert with the changes in lactate abundance (Fig. 3e, Table S2). Although the lactylation sites that responded sensitively to glycolytic activation and to glycolytic inhibition did not fully overlap (Fig. 3f-g, Fig. S5d, Table S2), comparative analysis of the dynamic lactylation proteome showed that certain non-histone lactylation sites, such as K102 and K116 on human nucleolin (NCL, Fig. S5e), were tightly controlled by both rotenone and DCA treatment (relative abundance change > 1.5). Such dynamic responses are in concert with those of a previously reported epigenetic mark, K9 lactylation on Histone H4 (Fig. 3f-g). Notably, NCL is an abundant nucleolar protein that has been shown to be involved in modulating mRNA turnover and transcription, pre-rRNA transcription and processing, nucleolar chromatin remodeling and ribosomal assembly. Thus, our discovery of NCL lactylation bolsters the intriguing hypothesis that lactylation may control transcription of downstream target genes through diverse recipients beyond histones in human cells.
Diagnostic ion reveals widespread DHRS7 lactylation from the draft map of the Human Proteome
Next, we asked whether the signature cycIm ion, benchmarked on the affinity-enriched lactylation proteome, is useful for globally mapping lactylated proteins in large-scale, unenriched human proteome resources. We first searched the draft map of the Human Proteome22 for lysine lactylation, which delivered multiple hits. However, most of the identified lactylated peptides failed to produce the diagnostic cycIm ion and were thus regarded as false positive matches (Fig. 4a). For those that did yield cycIm ions in their MS/MS spectra, the assignment can be made with very high confidence. The increased hydrophobicity and hence delayed elution time further helped to confirm the assigned lactylation sites (Fig. S6a), recapitulating the observation we made with model proteins (Fig. 2a-d). Among the assigned true lactylation carriers (Fig. 4a), an understudied enzyme, DHRS7, was repeatedly identified to carry lactylated K321 in human tissues including liver, retina, spinal cord, testis, ovary and prostate (Fig. 4b). We confirmed that this lactylation site can be identified by database searching with three different search engines (Fig. S6b, Table S3). Moreover, the cycIm ion consistently showed substantial abundance in the MS/MS spectra of the lactylated DHRS7 peptide regardless of the tissue from which the proteomics data were collected (Fig. 4a, Fig. S6c). Of note, the lactylation sites and protein substrates that we identified from the draft map of the Human Proteome varied with the search engine used (Fig. S6d, Table S3), underscoring the importance of establishing a gold standard for confident lactylation assignment.
DHRS7 belongs to the short-chain dehydrogenase/reductase (SDR) superfamily, whose members usually catalyze NAD(P)(H)-dependent reactions with a large array of substrates, including steroid hormones, prostaglandins, retinoids, lipids and xenobiotics29,30. However, DHRS7 and its functional interplay remain understudied, let alone the newly discovered lactylation site K321. The predicted protein structure of DHRS7 revealed that K321 is located in an α-helix near the protein C-terminus (Fig. 4c). Its functional importance is implied by the previously reported methylation and ubiquitination of the same residue (Fig. 4c). The evolutionary conservation of K321 across eukaryotic phylogeny also supports the likelihood that this residue, and hence the installed lactylation, is functionally important for DHRS7 (Fig. 4d).
Exploring regulatory mechanism and biological readouts of DHRS7 lactylation
Before assessing how this modification may perturb the function of DHRS7, we first had to validate the lactylation on K321 of DHRS7. We therefore overexpressed His-tagged and FLAG-tagged DHRS7 in HEK293T cells (Fig. S6e). Indeed, we identified lactylated K321-containing peptides from the lysed cells by database searching, and further confirmed the assignment by the presence of the cycIm ion at m/z 156.1025 in the resulting MS/MS spectra (Fig. 5a-b), along with their significantly delayed elution on a C18 column compared to the unmodified counterparts (Fig. 5c). The substantiated lactylation on K321 of DHRS7 thus prioritized our investigation of this residue as a functional hotspot. Since DHRS7 is abundant in human prostate tissue and markedly upregulated in prostate adenocarcinoma patients (Fig. S6f), we overexpressed the DHRS7 plasmid in the human prostate cancer cell line PC3. Surprisingly, the K321 lactylation observed in HEK293T cells was diminished in DHRS7-overexpressing PC3 cells, where unmodified DHRS7 became the bulk proteoform, phenocopying cells transfected with the non-lactylatable K321A mutant DHRS7 (Fig. 5d). The distinct lactylation statuses of K321 in HEK293T and PC3 cells prompted us to interrogate the expression levels of potential lactylation “writers” and “erasers” in these two cell lines. Comparative proteomics analysis of HEK293T and PC3 cells showed that, among the currently verified delactylases, HDAC3 was detected in PC3 cells yet absent from the detected HEK293T proteome (Fig. 5e-f). Considering a recent biochemical study demonstrating that HDAC3 is an efficient lactylation eraser31, it is likely that lactylation is maintained on K321 in HEK293T cells but removed in PC3 cells owing to cell type-specific differences in delactylase levels.
Collectively, our in vitro findings laid a foundation for future systematic screening of lactylation writers and erasers responsible for DHRS7 and possibly for other substrate proteins, holding the potential to resolve the puzzle of the regulatory enzymatic systems for lactylation in mammalian cells.
Lastly, given that lactylation, together with the previously recorded methylation and ubiquitination, is installed on K321, we hypothesized that this residue is important for DHRS7 and may affect its structure, function, protein engagement or the transduced signaling network, and ultimately certain cellular phenotypes. We transiently overexpressed the wild-type (WT) and the nonmodifiable K321A mutant DHRS7 constructs in HEK293T cells and employed shotgun proteomics to globally evaluate the cellular responses (Fig. 5g). We found that the K321A point mutation left the overall abundance of the global proteome unchanged yet caused significant abundance changes for a small subset of proteins, including prosaposin (PSAP) (Fig. 5g), unveiling a previously unannotated link between DHRS7 and PSAP. A clue to this association is provided by a single-cell transcriptomics analysis defining both DHRS7 and PSAP as top signatures of metastasizing cancer cells32. Besides proteomics, we further employed metabolomics to interrogate whether lactylated K321 on DHRS7 influences cellular metabolic phenotypes. Untargeted metabolomics analysis of the WT and K321A DHRS7-transfected cells indicated that K321 mutagenesis is sufficient to significantly alter the cellular metabolic profiles, as shown by the PCA and OPLS-DA score plots (Fig. S6g) and the heatmap of metabolite abundances (Fig. S6h).
Glycolytic enzymes are heavily lactylated in human cells and carry lactylation that perturbs protein stability
Recent analytical advances have yielded valuable, unenriched deep human proteome data in which PTM information is recorded yet remains to be mined. With the signature cycIm ion enabling confident lactylation assignment, we sought to discover more lactylation substrate proteins from the Meltome Atlas23, a recently shared large-scale data resource that measures the thermal stability of the global proteome in multiple model organisms. Determining lactylation sites from the Meltome would reveal whether lactylation can influence the thermal profiles of its substrate proteins according to a Hotspot Thermal Profiling (HTP) approach24, and hence provide a plausible answer regarding the functionality of human lactylation from a biophysical perspective.
We considered lactylation sites identified by conventional mass shift-based database searching as true only if the signature cycIm ion could be retrieved from the MS/MS spectra (Fig. S7a-b); the systematic increase in retention time conferred by lactylation further supported the assignment (Fig. S8a). A full search for lactylation across the 14 human cell lines available in the Meltome Atlas led to the discovery of a total of 246 lactylated peptides and 231 lactylation sites, and enzymes involved in glycolysis, the well-known pathway that produces lactate as its end product, were the most heavily lactylated (Fig. 6a, Table S4). Since such enzymes are highly abundant in human cell lines, we had to rule out the possibility that the observed high frequency of lactylation on glycolytic enzymes is attributable to their high protein abundance. We therefore summarized the protein abundance ranking not only for the enzymes involved in glycolysis but also for those of the pentose phosphate pathway (PPP) and the tricarboxylic acid (TCA) cycle, the other two major metabolic pathways, in each examined cell line (Fig. 6b). We found that enzymes of the PPP and TCA cycle are of similar abundance to glycolytic enzymes, whereas their frequencies of lactylation are distinct, indicating that lactylation preferentially modifies glycolytic enzymes. This finding prompts us to infer the existence of a feedback loop through which glycolysis-derived lactate regulates glycolysis via lactylation when the pathway is hyperactivated and produces excess lactate. Another intriguing finding is that pathway enrichment analysis of the identified lactylation proteome revealed heavy lactylation of an alternative energy-producing pathway, fatty acid degradation (Fig. S8b), implying a second route by which lactylation controls energy production and governs cell metabolism.
Subsequent sequence motif analysis of the lactylation sites showed a mildly conserved presence of leucine and glutamate residues flanking the identified lactyllysine (Fig. S8c). Next, we analyzed the PTM distribution on the lactylated lysines and found that approximately 82.25% of them have been reported to bear classical functional PTMs including ubiquitination and acetylation (Fig. S8d). This is exemplified by lactylated K343 on ENO1, which has previously been found to carry acetylation and ubiquitination and is known to be located in the enzymatic active site33,34,35. We therefore conjecture that lactylation of this critical residue would interrupt the substrate-enzyme interaction and hamper catalytic activity (Fig. 6c). Notably, among all the identified lactylated human proteins across the examined cell lines, ENO1 showed the greatest lysine lactylation coverage, peaking at 23.68% in colon cancer spheroid cells (Fig. 6c, Fig. S7b). Another example is lactylated K147, shared by the highly homologous ALDOA and ALDOC, which revises the view of this residue as merely an acetylation and ubiquitination carrier33,34. Intriguingly, K147 represents the most frequently lactylated residue in the Meltome Atlas, based on its ubiquitous presence in cell lines including K562, Jurkat, A549, HL60, colon cancer spheroids, HaCaT, HAOEC and HEK293T cells (Fig. S7b). Considering the low cell line specificity of K147 lactylation, it is tempting to speculate that lactylation would uniformly disturb the enzymatic activity of ALDO and hence inhibit glycolysis, given that K147 is known to be important for substrate binding36. Together, the lactylation distribution on glycolytic enzymes suggests a regulatory mechanism in which glycolytic flux is controlled through lactylation-mediated, redundant modulation of glycolytic enzyme activity in human cells.
Besides mapping the lactylation proteome landscape, the Meltome Atlas also makes it possible to discern functional lactylation hotspots that alter the thermal stability of modified human protein substrates. Indeed, application of the HTP approach revealed that the ENO1 proteoform carrying lactylated K343 displayed increased thermal stability compared to the bulk, unmodified protein in colon cancer spheroids (Fig. 6d). We then asked whether this lactylation-induced change to the intrinsic biophysical property of ENO1 is systematic. Intriguingly, the thermal stability shift observed for lactylated K343 of ENO1 was absent for the ENO1 proteoform bearing lactylated K326 (Fig. 6d), consistent with the previously reported site-specific and unsystematic nature of phosphorylation-induced protein thermal stability changes37. Another finding is that the lactylation-induced protein thermal stability shift depends on the cellular milieu. For instance, the lactylated K11 proteoform of the glycolytic enzyme PGK1 showed decreased thermal stability in A549 cells, whereas this trend was dampened in Jurkat cells (Fig. S9a-b). This context-dependent difference is further supported by the thermal profiles of lactylated sepiapterin reductase (SPR) (Fig. S9c). As shown in Fig. S9d, the thermal stability curves of the lactylated K226 and unmodified proteoforms indicate that lactylation induced a subtle yet clear increase in stability in primary human hepatocytes, whereas this deviation was diminished in HepG2 cells. In agreement, the widespread lactylation on K321 of DHRS7 initially discovered in the draft map of the Human Proteome was verified in the Meltome Atlas datasets, and this modification was found to exert varying effects on DHRS7 thermal stability in K562 cells, Jurkat cells and colon cancer spheroids (Fig. S9e).
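The comparison of a lactylated proteoform with the bulk protein can be illustrated by estimating a melting temperature for each curve and taking the difference. The sketch below is a deliberate simplification: it uses linear interpolation at a soluble fraction of 0.5 instead of the sigmoidal curve fitting typically used for thermal profiling data, and the temperatures and fractions are hypothetical values, not data from the Meltome Atlas.

```python
def melting_temperature(temps, fractions, level=0.5):
    """Temperature at which the soluble fraction first drops to `level`.

    Uses linear interpolation between consecutive measurement points.
    Returns None if the curve never crosses `level`.
    """
    points = list(zip(temps, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= level > f1:
            return t0 + (f0 - level) * (t1 - t0) / (f0 - f1)
    return None

# Hypothetical melting curves (fraction of protein remaining soluble).
temps = [37, 41, 45, 49, 53, 57, 61]
bulk = [1.00, 0.95, 0.80, 0.50, 0.20, 0.05, 0.00]        # bulk, unmodified protein
lactylated = [1.00, 0.98, 0.90, 0.70, 0.30, 0.10, 0.00]  # lactylated proteoform

tm_bulk = melting_temperature(temps, bulk)
tm_lac = melting_temperature(temps, lactylated)
print(f"Tm bulk: {tm_bulk:.1f} C, Tm lactylated: {tm_lac:.1f} C, shift: {tm_lac - tm_bulk:+.1f} C")
```

A positive shift corresponds to stabilization of the modified proteoform and a negative shift to destabilization; site-specific and cell line-specific effects would appear as shifts of differing sign or size for different sites or cellular contexts.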