Identification of lipid-immune-related Multiomics symbols in OSCC from public datasets
As shown in the flow chart (Figure 1A), Multiomics symbols were selected to establish the LASSO risk model. Firstly, based on the consensus clustering analysis, 373 OSCC samples were divided into high- and low-active lipid metabolism subgroups (Figure 1B), or high- and low-immunity subgroups (Figure 1C). Next, 21 high-immunity-lipid metabolism specimens and 172 low-immunity-lipid metabolism specimens were selected for further differential expression analysis. R package “limma” was used to identified differential expression of mRNA (Figure 1D), microRNA (Figure 1E) and DNA Methylation (Figure 1F), and |log2FC| >0.585 and p.value < 0.05 was consider as DESs. Then, according to the Univariate Cox analysis of Multiomics characteristics in the TCGA cohort, symbols with p.value <0.05 was obtained to establish the LASSO model.
Construction of the Multiomics characteristics Signature
After Univariate Cox analysis, 150 multiomics symbols were selected for established the risk model. The “glmnet” package was used for constructing a risk model for predicting OSCC early recurrence and returned a sequence of models, and 10-fold cross-validations were performed to select the best one. Then, the Lambda of the risk score model was shown in Figure 2A. After 10-fold cross-validation was run up to 1000 times, and the cross-validated errors were averaged, the 8 symbols were selected for constructing the prognostic risk model (Figure 2B). Based on the expression of the 8 symbols, a risk score formula for RFS was calculated as follows: Risk score = (3.876779×expression of cg00229245) +(1.965364× expression of cg00678539) + (-2.370920×expression of cg11079619) + (0.001135× expression of hsa-mir-338) + (0.258650×expression of EIF4E3) + (-1.072688× expression of APOBEC3D) + (0.299636×expression of RPS10) + (-0.005202× expression of PLVAP).
Evaluation of the Risk Score Formula for recurrence in the TCGA Cohort
The expression of 8-Multiomics characteristics was closely correlated with relapse in OSCC patients as shown in Figure 3. Kaplan-Meier survival analysis showed that the high-risk subgroup had higher recurrence compared to the low-risk subgroup (p < 0.001, Figure 3A). As shown in the heatmap (Figure 3B), cg00229245, cg00678539, hsa-mir-338, EIF4E3, and RPS10 were upregulated in the high-risk subgroup, however, cg11079619, APOBEC3D, and PLVAP was over-expression in the low-risk group. Also, with the median risk score as the cutoff value, all OSCC patients were assigned to either a high-risk or a low-risk group (Figure 3C), as well as, an overview of the survival state and risk score was shown in Figure 3D. Univariate (Figure 3E) and multivariate (Figure 3F) Cox analysis demonstrated that risk score was a dependent risk factor compared to other clinical characteristics. Compared to clinical factors (Figure 3G) and single symbols (Figure 3H), the risk model had a more accurate recurrence predictive value at 1-year. Meanwhile, the 8-Multiomics signature and DFS showed that the 1-year, 3-year, and 5-year AUC were 0.721, 0.787, and 0.729, respectively (Figure 3I).
All the samples in the TCGA database were divided into two series randomly, consisting of TCGA-train and TCGA-test subgroups. In the TCGA test, according to the median risk score, patients were further divided into high-risk (n = 84) or low-risk (n = 88) groups. As shown in the heatmap, cg00229245, cg00678539, hsa-mir-338, EIF4E3, and RPS10 were upregulated in the high-risk subgroup, however, cg11079619, APOBEC3D, and PLVAP was over-expression in the low-risk group (Supplementary Figures 1A). Survival analysis showed that patients in the high-risk group had shorter RFS time than those in the low-risk group (p < 0.001, Supplementary Figures 1B). Then, Risk score curves (Supplementary Figures 1C), and Dot diagrams (Supplementary Figures 1D) indicated that the 8-Multiomics signature had predicted glioma patients’ prognosis after surgery perfectly. As well as in the TCGA-train group (Supplementary Figures 1E-H). Besides, we found that eight symbols in the risk model also had better prediction values in the DFS of OSCC patients (Supplementary Figures 2). Compared to clinical characteristics and single symbols, the 8-Multiomics signature had a more accurate recurrence predictive value at 3 years and 5 years, respectively (Supplementary Figures 3).
The expression of eight symbols in different clinical characteristics subgroups
In Figure 4, the expression of 8 symbols was shown in different TNM subgroups. The expression of cg00229245 was significantly different between stages III and IV (p = 0.022, Figure 4A). The expression of cg11079619 was higher in stage I compared to stage III (p = 0.046) and stage IV (p = 0.011), respectively (Figure 4C). Besides, RPS10 was upregulated in stage IV compared to stage I (p = 0.015) and stage II (p = 0.017), respectively (Figure 4F). Unfortunately, other symbols had not significantly different in TNM stages. As shown in Figure 4I, the comprehensive nomogram for predicting probabilities of OSCC patients with 1-, 3- and 5-year DFS in the TCGA dataset, and the calibration plots for predicting OSCC patients with 1-year DFS in the TCGA cohort (Figure 4J). In addition, the expression of 8 symbols between different grade subgroups was performed (Supplementary Figure 4). The expression of cg00229245 was significantly different between Grade 2 and Grade 4 (p = 0.0064, Supplementary Figures 4A). In Supplementary Figure 4B, with the increase in Grade, the expression of cg00678539 decreased gradually. Compared to Grade 1 (p < 2.22e-16), Grade 2 (p < 2.22e-16), and Grade 3 (p < 2.22e-16), the expression of cg11079619 was upregulated in Grade 4. In the Grade 3, PLVAP upregulated significantly compared to Grade 1 (p = 0.0019) and Grade 2 (p = 0.032). The expression of RPS10 decreased in Grade 4 compared to Grade 1 (p = 0.012), Grade 2 (p = 0.011), and Grade 3 (p = 0.0004). The expression of EIF4E3 significantly different between Grade 3 and Grade 1 (p = 0.0012) or Grade 2 (p = 0.00096), respectively. Finally. The expression of APOBEC3D was also different between Grade 3 and Grade 1 (p = 0.00026) or Grade 2 (p = 0.016), respectively, at the same time, APOBEC3D upregulated in Grade 2 compared to Grade 1 (p = 0.012). However, the expression of has-mir-338 was not significantly different between different Grades.
To further investigate the applicable OSCC population of this 8-Multiomics signature, the 8-Multiomics signature-based DFS analyses were performed in subgroups of patients with different clinical variables (Supplementary Figures 5). The 8-Multiomics prognostic risk index consistently stratified patient survival regardless of age (p <0.001 in less than or equal to 65 years; p = 0.009 in over 65 years), gender (p <0.001 in male patients; p = 0.021 in female patients), grade (p <0.001 in grade II; p <0.001 in grade III), tumor size (p < 0.001 in T2; p = 0.043 in T3; p < 0.001 in T4), Lymph node metastasis (p <0.001 in N0 patients; p < 0.001 in N2 patients), metastasis (p <0.001 in M1 patients), patients with Grade ( p < 0.001 in Grade 1 patients; p < 0.001 in Grade 2 patients; p < 0.001 in Grade 3 patients) and Stages (p = 0.001 in Stage II; p = 0.012 in Stage III; p <0.001 in Stage IV). Unfortunately, patients with T1 (p = 0.092), N1 (p = 0.159), M1 (p = 0.317) and stage I (p = 0.717) were a little powerless for relapse prediction.
Immune cells of different 8-Multiomics signature subgroups
To analyze the composition of immune cells in different 8-Multiomics signature subgroups, the Wilcoxon test was used to compare the distribution of immune cells in different subgroups. B cells memory, Eosinophils, Mast cells activated, Monocytes, NK cells activated, and T cells CD8 were more abundant in the high-risk subgroup (Figure 5A-F), while B cells naïve, Mast cells resting, NK cells resting, T cells CD4 memory activated, T cells CD4 memory resting, and Tregs were more abundant in the low-risk subgroup (Figure 5G-F).
Furthermore, K-M survival analysis showed that abundant Macrophage M0, Mast cells resting, T cells CD4 memory resting, and Tregs had longer disease-free survival time (Figure 6A-D). while, abundant Macrophage M1, Mast cells activated, plasma cells, T cells CD4 memory activated, T cells CD8, and T cells follicular helper had more tendency to recurrence (Figure 6E-J).
Immune functions of different 8-Multiomics signature subgroups
27 immunity functions such as aDCs, APC-co-stimulation, B-cells, chimeric costimulatory receptor (CCR), check-point, Dendritic cells (DCs), human leukocyte antigen (HLA), Macrophages, Mast-cells, Neutrophils, Natural Kill-cells, para-inflammation, Plasmacytoid dendritic cells (pDCs), T-cell co-inhibition, T-cell co-stimulation, T-helper cells, T follicular helper (Tfh), T helper (Th) cell subsets 2 (Th2), tumor-infiltrating lymphocytes (TIL), Regulatory T (Treg), and Type-I-IFN-response were significantly suppressed in the high-risk subgroup (p <0.05) (Figure 7A).
KM survival curves exhibited that the activity of T cells related-immunity function could predict longer DFS of OSCC patients. As shown in Figure 7B-H, The more active the immune function, the lower the tumor recurrence (p <0.05). However, the more dynamic Th2 cells predicted a worse prognosis. Besides, active other immune functions contributed to prolonging the disease-free survival time of OSCC (p <0.05, Supplementary Figures 6).
The benefit of ICI therapy in different 8-Multiomics signature subgroups
As shown in Figure 8A, the Score of IPS was higher in the high-risk subgroup (p = 0.02). While, there was no difference in the score of IPS-PD1-CTLA4, IPS-PD1, and IPS-CTLA4 between the high- and low-risk groups (Figure 8B-D). TIDE was used to assess the potential clinical efficacy of immunotherapy in different 8-Multiomics signature subgroups. The patients with higher TIDE scores indicated a higher potential for immune evasion and were less likely to benefit from ICI therapy. In our results, patients in the high-risk subgroup had a lower TIDE score compared to the patients in the low-risk subgroup, which suggests that ICI therapy was more beneficial for high-risk subgroup patients (Figure 8E). Also, exclusion, microsatellite instability (MSI) score, and T cell dysfunction were low activity in the high-risk subgroup (Figure 8F-H). Furthermore, we found that the 8-multiomics signature had better prediction than TIDE and TIS at 1 year, 3 years, and 5 years, which might be a prognostic signature for immune therapy (Supplementary Figures 7).
Pan-cancer analysis of 8-Multiomics signature
The eight symbols' expression levels were analyzed in various cancers and ranked from high to low (Supplementary Figure 8). All cancers expressed cg00229245, cg00678539, cg11079619, hsa-mir-338, EIF4E3, APOBEC3D, RPS10, and PLVAP with the highest levels in Lymphoid Neoplasm Diffuse Large B-cell Lymphoma (DLBC), Lung adenocarcinoma (LUAD), Acute Myeloid Leukemia (LAML), Brain Lower Grade Glioma (LGG), DLBC, LAML, Kidney renal clear cell carcinoma (KIRC), and Ovarian serous cystadenocarcinoma (OV), respectively, and the lowest in KIRC, OV, Prostate adenocarcinoma (PRAD), OV, LGG, Liver hepatocellular carcinoma (LIHC), LAML, and Kidney Chromophobe (KICH), respectively. The eight symbols expression levels were compared between cancer and matched normal samples from 33 cancers, based on TCGA data (Figure 9). cg00229245 was highly expressed in the cancer group of Esophageal carcinoma (ESCA), Head and Neck squamous cell carcinoma (HNSC), KIRC, LIHC, Lung squamous cell carcinoma (LUSC), and Pancreatic adenocarcinoma (PAAD). However, cg00229245 was downregulated in the cancer group of Bladder Urothelial Carcinoma (BLCA), Rectum adenocarcinoma (READ), Thyroid carcinoma (THCA), and Uterine Corpus Endometrial Carcinoma (UCEC) (Figure 9A). The expression of cg00678539 was highly upregulated in the cancer group of Breast invasive carcinoma (BRCA). In contrast, cg00678539 was downregulated in the cancer group of BLCA, LIHC, LUSC, PRAD, Sarcoma (SARC), THCA, and UCEC (Figure 9B). The expression of cg11079619 was significantly downregulated in the cancer group of BLCA, BRCA, Colon adenocarcinoma (COAD), ESCA, HNSC, KIRC, LIHC, LUSC, PAAD, PRAD, READ, THCA, and UCEC (Figure 9C). The expression of hsa-mir-338 was significantly downregulated in the normal group of COAD, KIRC, LIHC, Pheochromocytoma, Paraganglioma (PCPG), and READ. In contrast, the expression of hsa-mir-338 was significantly upregulated in the normal group of ESCA, HNSC, KICH, Kidney renal papillary cell carcinoma (KIRP), LUSC, and THCA (Figure 9D). APOBEC3D was highly expressed in the cancer group of Cholangiocarcinoma (CHOL), Glioblastoma multiforme (GBM), HNSC, KIRC, KIRP, LIHC, Lung adenocarcinoma (LUAD), and LUSC. However, APOBEC3D was downregulated in the cancer group of COAD, PRAD, READ, THCA, and UCEC (Figure 9E). The expression of cg00678539 was highly upregulated in the cancer group of CHOL, KIRP, and PCPG. In contrast, cg00678539 was downregulated in the cancer group of BRCA, Cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), COAD, ESCA, GBM, HNSC, KICH, LIHC, LUAD, TCGA-LUSC, PRAD, READ, Stomach adenocarcinoma (STAD), and UCEC (Figure 9F). The expression of PLVAP was significantly downregulated in the cancer group of BRCA, CESC, COAD, KICH, and KIRP, on the other hand, PLVAP was highly expressed in the cancer group of CHOL, GBM, HNSC, KIRC, LIHC, LUAD, PCPG, and THCA (Figure 9G). As shown in Figure 9H, RPS10 was highly expressed in the cancer group of CHOL, COAD, ESCA, GBM, KIRC, KIRP, LIHC, LUAD, LUSC, and PRAD, contrarily, the expression of RPS10 was significantly downregulated in cancer group of KICH, and PCPG.