Protein Identification
For protein expression, we found 5149 quantitative proteins from total 6658 identified proteins, in which there were 967 differential expression proteins (DEPs), including 662 proteins upregulated (≥ 1.5-fold) and 305 downregulated (≤ 0.67-fold) between OSCC tissues and paired adjacent normal tissues.
For protein acetylation, we identified 8388 acetylation sites in 2706 proteins. A total of 4883 acetylation sites in 1738 proteins were quantified. 2027 Kac proteins (74.9%) had ≤ 3 Kac sites, whereas 114 Kac proteins (4.2%) had ≥ 10 Kac sites (table S2) which was more abundant than acetylation in hepatocellular carcinoma[17]. TTN, MYH2, FLNA, MYH9, MYH7, SPTAN1, AHNAK, FLNC, APOB, and MYBPC1 are the top 10 proteins with the most abundant acetylated sites. This study included 282 upregulated Kac sites (≥ 1.5-fold) in 234 proteins and 235 downregulated Kac sites (≤ 0.67-fold) in 162 proteins between OSCC tissues and paired adjacent normal tissues. Hierarchical clustering analysis of DEPs and DAPs is shown in Fig. 1.
For all DEPs and DAPs, we identified 132 proteins with expression difference and acetylation difference (Fig. 1d).
Functional Characterization of Differentially Lysine Acetylation Proteins.
The effects of lysine acetylation proteins were then further analyzed by Gene Ontology (GO) annotation and COG (Clusters of orthologous groups) analysis.
The function classification of differential acetylation proteins was then carried out with COG analysis. Among upregulated DAPs, we found that 27 proteins are associated with cytoskeleton (12.4%), 22 proteins with posttranslational modification, protein turnover, chaperones (10.1%), and 20 proteins with Translation, ribosomal structure, and biogenesis (9.2%). Among downregulated DAPs, 23 proteins are associated with cytoskeleton (15.2%), 21 proteins with Energy production and conversion (13.9%), and 17 proteins with Signal transduction mechanisms (11.3%) (Fig. 2a,2b).
The subcellular localizations of the upregulated DAPs were mainly distributed in the cytoplasm (45.49%), nucleus (19.74%), mitochondria (14.59%), extracellular (6.87%), cytoplasm, nucleus (6.44%) and plasma membrane (3.86%) (Fig. 2c). Among the downregulated DAPs, the subcellular localizations were mainly distributed in cytoplasm (37.89%), mitochondria (22.36%), extracellular (14.29%), nucleus (11.18%) and plasma membrane (4.97%) (Fig. 2d).
In GO annotation, the related biological functions were established for upregulated DAPs and further separated into three classifications: cellular component (33.3%), molecular function (22.2%), and biological process (44.4%) (Fig. 2e). Among downregulated DAPs, three classifications included cellular component (29.6%), molecular function (25.9%), and biological process (44.4%). The number of DAPs involved in the three biological process groups were shown in Fig. 2e,2f.
Functional Enrichment Analysis of Differentially Lysine Acetylation Proteins.
GO enrichment analysis was performed to further understand the functions of differential lysine acetylation proteins. Among 234 upregulated DAPs, the cellular component in GO enrichment analysis significantly enriched in actin filament bundle, actomyosin, and actin cytoskeleton (Fig. 3a). Biological process was significantly enriched in cellular response to type I interferon, type I interferon signaling pathway, response to type I interferon, negative regulation of viral process, cellular response to cytokine stimulus, regulation of apoptotic process, and regulation of programmed cell death (Fig. 3b). According to molecular function enrichment classification, double-stranded RNA binding, cytoskeletal protein binding, actin filament binding, and nucleoside-triphosphatase activity were significantly enriched (Fig. 3c). Among 162 downregulated DAPs, the cellular component significantly enriched in actin cytoskeleton, contractile fiber part, and contractile fiber part (Fig. 4a). According to biological process, striatedd muscle contraction, cardiac muscle contraction, heart contraction, and positive regulation of blood circulation were highly enriched (Fig. 4b). Molecular function highly enriched in actin binding, structural constituent of muscle, ankyrin binding, and calcium ion binding (Fig. 4c).
KEGG analysis of upregulated DAPs enriched in the top 3 pathways including hepatitis C, biosynthesis of unsaturated fatty acids, fatty acid metabolism. As shown in Fig. 3d, HIF-1 signaling pathway, ferroptosis, JAK-STAT signaling pathway were also highly enriched. Among downregulated DAPs, we found that most of these proteins are involved in tyrosine metabolism, phenylalanine metabolism in KEGG analysis.
Domain enrichment of upregulated DAPs was involved in Guanylate-binding protein, C-terminal and N-terminal domain as well as Ku70/Ku80 N-terminal alpha/beta and beta-barrel domain (Fig. 3e). We found that the downregulated DAPs were significantly enriched in myosin N-terminal SH3-like domain, EF-hand domain pair, and fibronectin type III domain in domain enrichment (Fig. 4e).
The Exploration of Differentially Lysine Acetylated Motifs.
A total of 8,267 Kac peptides from all identified peptides with amino acids around the acetylated lysine from 10 amino acids upstream to 10 amino acids downstream are subjected to the Motif-X program. We identified 50 conserved motifs based on acetylated lysine. Especially, the motifs xxxxxxxxxx_K_HxxxxxxKxx, xxxxxxxxxN_K_Axxxxxxxxx (Motif Score > 30.00) were strikingly conserved. The top 3 motifs and their abundances are as follows: xxxxxxxxxx_K_Sxxxxxxxxx (748 peptides), xxxxxxxxxx_K_Txxxxxxxxx (652 peptides), xxxxxxxxxG_K_xxxxxxxxxx (509 peptides). Overall, the present results suggested that these motifs may reveal characteristics regarding acetylation in patients with OCSS (Table S3).
PPI Network Construction and Survival Analysis of Hub Genes
PPI network of the upregulated DAPs was illustrated in Fig. 5(a). A total of 228 nodes with 530 edges were reflected in this network system. The top 30 hub genes were identified by CytoHubba (Table S4). Of these, several proteins had abundant acetylation sites, including 15 acetylation sites on HSP90AA1, 23 acetylation sites on MYH9, 16 acetylation sites on PKM, and 12 acetylation sites on EEF2. Among the top 30 hub genes, survival analysis from the TCGA database showed that lower expression of RPS3, RPL24, RPL19, CCT8, HSP90AA1, EGFR, and ARPC2 were associated with poor prognosis (Fig. 6a-d and Figure S1 a-d). 13 functional subnet cluster were selected from the PPI network by MCODE. All the 17 proteins in cluster 1 (MCODE score 9.2) were contained in the top 30 hub genes (Fig. 5b). Biological process of GO analysis of these proteins was associated with mRNA splicing and SRP-dependent cotranslational protein targeting to membrane (Fig. 5c). KEGG enrichment analysis of cluster 1 significantly related to spliceosome and ribosome pathway (Fig. 5d). Remarkably, RPS3, RPL24, and RPL19 in cluster 1 were closely associated with OSCC prognosis.
Among the 162 downregulated DAPs, PPI network contained 159 nodes with 94 edges. The top 30 hub genes were listed in table S5. Of these, 11 proteins had more than 10 acetylation sites, especially 130 acetylation sites on TTN, 45 acetylation sites on MYH2, 27 acetylation sites on MYH7 and 24 acetylation sites on SPTAN1. Among the top 30 hub genes, the survival analysis from the TCGA database showed that low expression of EIF4A2, RPL12, MYBPC1, RPS6, ARCN1, and TMEM9 was related to poor prognosis, particularly EIF4A2 and RPL12 with a better survival curve (Fig. 7a-c and Figure S2 a-d).