3.1 Data Processing
The workflow of our study is shown in Fig. 1. A total of 910 samples with clinical and RNA sequencing data were included in the study: 348 from the TCGA database and 562 from the GEO database. The median OS time of these samples was 22 months. Of the 896 samples with stage information, 108 (12.1%) were staged as Stage I, 248 (27.7%) as Stage II, 357 (39.8%) as Stage III, and 183 (20.4%) as Stage IV. Detailed patient baseline data are presented in Table 1.
Table 1
Clinical information of the 910 samples used in this study.
Characteristics | Number of cases (%) | TCGA | GEO |
n | 910 | 348 | 562 |
Age, median (IQR) | 65 (56.76, 72) | 67.35 (58.48, 73.15) | 64.35 (56, 70.12) |
Gender | | | |
Male | 601 (66%) | 225 | 376 |
Female | 309 (34%) | 123 | 186 |
Stage | | | |
Stage I | 108 (12.1%) | 46 | 62 |
Stage II | 248 (27.7%) | 110 | 138 |
Stage III | 357 (39.8%) | 144 | 213 |
Stage IV | 183 (20.4%) | 34 | 149 |
Status | | | |
Alive | 482 (53%) | 203 | 279 |
Dead | 428 (47%) | 145 | 283 |
OS time (months), median (IQR) | 22 (11.128, 58) | 15.48 (9.2, 26.16) | 32.67 (12.82, 70.57) |
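The stage percentages in Table 1 depend on which denominator is used. The following minimal Python sketch, which uses only the counts in Table 1, makes the denominator explicit; the variable names are illustrative and not part of the original analysis.

```python
# Stage counts copied from Table 1 (TCGA and GEO combined).
stage_counts = {"Stage I": 108, "Stage II": 248, "Stage III": 357, "Stage IV": 183}

all_samples = 910                          # n from Table 1
staged_total = sum(stage_counts.values())  # samples with a recorded stage

# Percentage among staged samples, rounded to one decimal as in Table 1.
stage_pct = {s: round(100 * c / staged_total, 1) for s, c in stage_counts.items()}

print("staged:", staged_total, "unstaged:", all_samples - staged_total)
print(stage_pct)
```

Under this reading, the percentages in Table 1 are taken over the samples with a recorded stage rather than over all 910 samples.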
3.2 Characterization of lactate metabolism subgroups
Based on LMRG expression, we conducted unsupervised clustering analysis; we chose the clustering results at K = 4 and separated TCGA and GEO samples into four lactate metabolism subgroups (Fig. 2A–C).
The ESTIMATE algorithm was used to further explore immune and stromal infiltration among these subgroups. Counterintuitively, we observed no significant differences between clusters 1 and 2, but both differed significantly from clusters 3 and 4; overall immune cell infiltration was higher in cluster 3 than in the other three clusters (Fig. 2D–F). Because clusters 1 and 2 did not differ significantly in the immune infiltration analysis, we combined them for subsequent analyses.
Next, to further explore the function of the three subgroups, we performed GSVA. As shown in Fig. 2G, the EMT pathway was significantly enriched in cluster 3, whereas the “HALLMARK_OXIDATIVE_PHOSPHORYLATION” pathway was significantly enriched in cluster 4. Thus, we defined clusters 3 and 4 as the EMT and metabolic subtypes, respectively. As shown in Fig. 2H, the proliferation-related “HALLMARK_MTORC1_SIGNALING,” “HALLMARK_E2F_TARGETS,” and “HALLMARK_G2M_CHECKPOINT” pathways were enriched in clusters 1 and 2. Therefore, we defined cluster 1 + 2 as the proliferation subtype.
Survival analysis was conducted among the three subgroups. As shown in Fig. 3A, OS of patients with GC differed significantly among the three groups; patients in cluster 3 had the worst prognosis, whereas patients in cluster 4 had a significantly better prognosis than patients in the other two groups. We calculated the oxidative phosphorylation (OXPHOS) score of each group using the ssGSEA algorithm; cluster 3 had the lowest score, whereas cluster 4 had the highest (Fig. 3B). Because cluster 3 also showed the highest immune cell infiltration, these results suggested that a higher OXPHOS score might be associated with a better prognosis and lower immune cell infiltration, pointing to a potential relationship between LMRGs and immune cell infiltration.
To assess the transcriptomic differences in the regulatory patterns of cellular lactate metabolism, we investigated the DEGs among different subgroups. We identified 317 upregulated and 452 downregulated DEGs between clusters 3 and 4, 53 upregulated and 138 downregulated DEGs between clusters 3 and 1 + 2, and 10 upregulated and 32 downregulated DEGs between clusters 1 + 2 + 3 and 4. The volcano plots are displayed in Fig. 3C–E. The Venn diagram (Fig. 3F) shows that 30 DEGs overlapped among the comparisons.
In the TCGA dataset, we divided samples into high and low groups by the median expression of each of these DEGs and conducted survival analysis to screen for prognostic genes (p < 0.05). CCDC80 and three other genes (PPP1R14A, APOD, and OGN) were identified. We selected CCDC80 for further investigation because it had not been reported in GC in previous studies.
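The screening step above (a median split on expression followed by a survival comparison) can be illustrated with a small Python sketch. It assumes a two-group log-rank test, which the text does not name explicitly, and it runs on hypothetical toy data rather than on TCGA samples; the function and variable names are illustrative only.

```python
import math
from statistics import median

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for u, _, _ in data if u >= t)               # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)         # deaths at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n                     # observed - expected in A
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df

# Hypothetical toy data: expression of one gene, OS in months, death indicator.
expression = [2.1, 5.4, 3.3, 7.8, 6.1, 1.9, 4.7, 8.2, 2.8, 6.9]
os_months = [60, 14, 48, 9, 20, 72, 30, 7, 55, 12]
died = [0, 1, 0, 1, 1, 0, 1, 1, 1, 1]

# Median split: samples above the median form the high-expression group.
cutoff = median(expression)
high = [i for i, x in enumerate(expression) if x > cutoff]
low = [i for i, x in enumerate(expression) if x <= cutoff]

chi2, p_value = logrank_test(
    [os_months[i] for i in high], [died[i] for i in high],
    [os_months[i] for i in low], [died[i] for i in low],
)
print(f"cutoff={cutoff:.2f}, chi2={chi2:.2f}, p={p_value:.4f}")
```

A gene would pass the screen in this sketch when `p_value` is below 0.05. In practice an established survival package would normally be preferred over a hand-written test.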
3.3 CCDC80 expression in TCGA stomach adenocarcinoma (TCGA-STAD) and Pan-Cancers
We conducted a pan-cancer analysis of CCDC80 expression using data from TCGA and the Genotype-Tissue Expression (GTEx) portal. Figure 4A indicates that CCDC80 was highly expressed in GC samples (p < 0.001).
In the TCGA-STAD cohort, survival analysis showed that patients with high CCDC80 expression had shorter OS than those with low expression (p < 0.05, Fig. 4B). We further explored the relationship between CCDC80 expression and clinicopathological variables. Figure 4C shows the relationship between CCDC80 expression and histological grade, indicating that CCDC80 expression increased with a higher degree of tumor differentiation (p < 0.001). Similarly, CCDC80 expression was lower in the T1 (Fig. 4D, p < 0.001) and stage I (Fig. 4F, p < 0.05) groups. However, no significant differences were observed among the N stages (Fig. 4E).
Additionally, Fig. 4G shows the prognostic value of CCDC80 expression in the combined TCGA and GEO cohorts, which was consistent with the result in the TCGA-STAD cohort. These results indicated that high CCDC80 expression is a predictor of poor prognosis in patients with GC.
We then analyzed the differences in immune checkpoint (ICP) expression between the CCDC80 expression groups. The expression of most ICPs was significantly increased in the high CCDC80 expression group, which suggests that CCDC80 expression may have an impact on immunotherapy.
The results of univariate and multivariate Cox regression analyses are presented in Table 2. Multivariate Cox regression analysis revealed that CCDC80 expression was an independent prognostic factor (Fig. 5A). Figure 5B depicts the risk score based on multivariate Cox regression, patient survival outcomes, and CCDC80 expression. We used the timeROC function to evaluate the prognostic performance of the risk score at 1, 3, and 5 years. As shown in Fig. 5C, the areas under the curve (AUCs) were 0.726 at 1 year, 0.740 at 3 years, and 0.747 at 5 years. We constructed a nomogram to evaluate the prognosis of patients with GC. In addition to CCDC80 expression, the model included age and pathological stage (Fig. 5D). Figures 5E–G show the calibration curves for these time points.
Table 2
Results of univariate and multivariate Cox regression analyses.
Characteristics | Total (N) | Univariate analysis: hazard ratio (95% CI) | Univariate analysis: P value | | Multivariate analysis: hazard ratio (95% CI) | Multivariate analysis: P value |
CCDC80 | 910 | 1.244 (1.132–1.368) | < 0.001 | | 1.261 (1.139–1.397) | < 0.001 |
Age | 902 | 1.014 (1.005–1.022) | 0.003 | | 1.025 (1.016–1.035) | < 0.001 |
Gender | 910 | | | | | |
Male | 601 | Reference | | | | |
Female | 309 | 0.844 (0.688–1.037) | 0.106 | | | |
Stage | 896 | | | | | |
Stage I | 108 | Reference | | | | |
Stage II | 248 | 1.670 (1.041–2.679) | 0.034 | | 1.618 (1.008–2.598) | 0.046 |
Stage III | 357 | 3.291 (2.113–5.126) | < 0.001 | | 3.164 (2.027–4.938) | < 0.001 |
Stage IV | 183 | 6.766 (4.304–10.637) | < 0.001 | | 7.343 (4.657–11.579) | < 0.001 |
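The hazard ratios, confidence intervals, and P values in Table 2 are linked by the usual Wald relationship on the log scale. The sketch below assumes that the reported 95% CIs are Wald intervals (symmetric around the log hazard ratio with a critical value of 1.96), which is the common convention for Cox models but is not stated in the text; the function name is illustrative.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Recover log-HR, its standard error, Wald z, and two-sided p from a 95% CI."""
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return log_hr, se, z, p

# Multivariate rows copied from Table 2.
rows = {
    "CCDC80": (1.261, 1.139, 1.397),
    "Stage II vs. Stage I": (1.618, 1.008, 2.598),
}
results = {name: wald_from_ci(*vals) for name, vals in rows.items()}
for name, (log_hr, se, z, p) in results.items():
    print(f"{name}: log(HR)={log_hr:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.3g}")
```

Under this assumption, the recovered P values agree with those reported in Table 2 (< 0.001 for CCDC80 and approximately 0.046 for Stage II), which serves as a quick consistency check on the table.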
3.4 Pathway enrichment analysis
GO, KEGG, and GSEA analyses were performed to explore the potential mechanisms of CCDC80, using the DEGs between the low and high CCDC80 expression groups, which included 387 upregulated and 13 downregulated genes. The top enriched GO terms and KEGG pathways are shown in Figs. 6A and 6B. The GSEA results suggested that three functional gene sets were enriched in the high CCDC80 expression group: the EMT, myogenesis, and angiogenesis pathways (Fig. 6C–E). In contrast, the OXPHOS, G2M checkpoint, and E2F targets pathways were significantly enriched in the low CCDC80 expression group (Fig. 6F–H).
3.5 Immune cell infiltration and drug sensitivity analyses
Immune infiltration analysis was performed in the GC tumor microenvironment, as shown in Fig. 7A. Figure 7B shows the correlation between CCDC80 expression and various immune cells. This result was consistent with the TIMER database, in which the expression of CCDC80 was significantly positively correlated with macrophage infiltration (r = 0.687, p < 0.001) (Fig. 7D). We also compared immune cell infiltration between the low and high CCDC80 expression groups. As shown in Fig. 7C, many immune cells were highly infiltrated in the high CCDC80 expression group, including CD4+ T cells, CD8+ T cells, B cells, and macrophages. Further analysis using the quanTIseq algorithm yielded similar results (Fig. 8A). These results suggest that CCDC80 may play an important role in regulating immune cell infiltration in GC. Signature scores were then used to explore functional differences between the high and low CCDC80 expression groups. The high-expression group had higher ICP scores (Fig. 8B) and lower OXPHOS scores (Fig. 8C).
The results of the drug sensitivity analysis are shown in Fig. 8D. Sensitivity to paclitaxel was lower in the high CCDC80 expression group (p < 0.05). Similar results were observed for other anti-cancer drugs, such as gefitinib, lapatinib, rapamycin, and sorafenib. However, we observed no significant differences in sensitivity to cisplatin, docetaxel, or doxorubicin between the low and high CCDC80 expression groups.
3.6 Mutation characteristics
The overall mutational landscape of TCGA-STAD is shown in Fig. 9A; missense mutations occurred most frequently, and the top two mutated genes were TTN and MUC16. Subsequently, we analyzed the TMB. We observed significant differences between the groups, with the high CCDC80 expression group exhibiting lower TMB than the low expression group (Figs. 9B and 9C), which may have implications for the response to immunotherapy.
We analyzed the mutation characteristics of the high and low CCDC80 expression groups. The top 30 mutated genes in the low and high CCDC80 expression groups were mapped (Figs. 9D and 9E). TTN and MUC16 were more frequently mutated in the low CCDC80 expression group, with mutation frequencies of 53% and 34%, respectively. Figures 9F and 9G show the correlations among the top 20 mutated genes. These results provide novel insights into the intrinsic connection between immunotherapy and somatic variation.
3.7 CCDC80 expression in our GC samples and its relation with clinicopathological characteristics
Patients in the independent cohort were aged 28–80 years (≥ 60 years, n = 37, 46.2%; < 60 years, n = 43, 53.8%), and the cohort included male (n = 50, 62.5%) and female (n = 30, 37.5%) patients (Table 3). Patients with clinical stages I (n = 29, 36.3%), II (n = 24, 30%), and III (n = 27, 33.7%) were present. Both GC and adjacent non-tumor tissue specimens exhibited cytoplasmic CCDC80 expression. CCDC80 staining was significantly more intense in GC tissue specimens (score ≥ 4, 73.8% [59/80]) than in adjacent non-tumor tissue specimens (score ≥ 4, 47.5% [38/80]) (p = 0.001) (Fig. 10A–C). High CCDC80 protein expression was positively correlated with the clinical stage (p < 0.001), T stage (p < 0.001), N stage (p = 0.004), and pathologic differentiation (p = 0.009). Regarding the ability of CCDC80 expression to discriminate between patients with GC and healthy individuals, the ROC area under the curve was 0.737 (Fig. 10D).
Table 3
Clinical characteristics of patients from The First Hospital of China Medical University (CMU).
Characteristics | Number of cases (%) | Low expression of CCDC80 | High expression of CCDC80 | P value |
Tumor | 80 | 21(26.2%) | 59(73.8%) | 0.001 |
Adjacent non-tumor | 80 | 42(52.5%) | 38(47.5%) |
Age(y) | |
≥ 60 | 37(46.2) | 9 | 28 | 0.717 |
< 60 | 43(53.8) | 12 | 31 |
Gender | |
Male | 50(62.5) | 12 | 38 | 0.555 |
Female | 30(37.5) | 9 | 21 |
Clinical stage | |
Stage I | 29(36.3) | 17 | 12 | < 0.001 |
Stage II | 24(30.0) | 2 | 22 |
Stage III | 27(33.7) | 2 | 25 |
T stage | |
T1 | 20(25) | 11 | 9 | < 0.001 |
T2 | 16(20) | 6 | 10 |
T3 | 14(17.5) | 3 | 11 |
T4 | 30(37.5) | 1 | 29 |
N stage | |
N0 | 16(20) | 9 | 7 | 0.004 |
N1 | 14(17.5) | 5 | 9 |
N2 | 13(16.3) | 1 | 12 |
N3 | 37(46.2) | 6 | 31 |
Pathologic differentiation | |
G1 | 16(20.0) | 9 | 7 | 0.009 |
G2 | 18(22.5) | 4 | 14 |
G3 | 46(57.5) | 8 | 38 |
Histological type | |
Papillary type | 4(5.0) | 1 | 3 | 0.995 |
Tubular type | 27(33.7) | 7 | 20 |
Poorly differentiated type | 24(30.0) | 7 | 17 |
Signet Ring type | 9(11.3) | 2 | 7 |
Mucinous type | 16(20.0) | 4 | 12 |
Venous invasion | |
No | 70(87.5) | 17 | 53 | 0.501 |
Yes | 10(12.5) | 4 | 6 |
Lymphatic invasion | |
No | 56(70.0) | 16 | 40 | 0.471 |
Yes | 24(30.0) | 5 | 19 |
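The comparison of high CCDC80 staining in tumor versus adjacent non-tumor tissue (first two rows of Table 3) can be checked with a test on the 2 × 2 table. The sketch below assumes a Pearson chi-square test, with and without Yates continuity correction; the text does not state which test was used, so this is only a plausibility check, and the function name is illustrative.

```python
import math

def chi2_2x2(a, b, c, d, yates=False):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]] (1 df)."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)  # Yates continuity correction
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p

# Counts from Table 3: (low, high) CCDC80 expression.
tumor = (21, 59)
adjacent = (42, 38)

chi2_plain, p_plain = chi2_2x2(*tumor, *adjacent)
chi2_yates, p_yates = chi2_2x2(*tumor, *adjacent, yates=True)
print(f"uncorrected: chi2={chi2_plain:.3f}, p={p_plain:.4f}")
print(f"Yates:       chi2={chi2_yates:.3f}, p={p_yates:.4f}")
```

Both variants give a P value that rounds to 0.001, in line with the value reported in Table 3.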
Subsequently, we analyzed the distribution of clinical characteristics in the high and low CCDC80 expression groups. As shown in Fig. 10E–H, fewer than 50% of patients with Stage I disease had high CCDC80 expression, whereas approximately 90% of patients with Stage II or III disease did. Furthermore, we found that CCDC80 expression was higher in patients with a higher degree of malignancy. Survival analysis based on the histoscore of CCDC80 indicated that patients with relatively high CCDC80 expression had poorer OS than those with low CCDC80 expression (p < 0.001) (Fig. 10I). Multivariate Cox regression indicated that clinical stage (p < 0.001) and CCDC80 expression level (HR = 3.136; 95% CI, 1.309–7.513; p = 0.010) were independent prognostic factors associated with poor OS (Fig. 10J; Table 4).
Table 4
Univariate and multivariate Cox regression analyses of patients from The First Hospital of CMU.
Characteristics | Total (N) | Univariate analysis: hazard ratio (95% CI) | Univariate analysis: P value | | Multivariate analysis: hazard ratio (95% CI) | Multivariate analysis: P value |
CCDC80 | 80 | | | | | |
Low | 21 | Reference | | | | |
High | 59 | 4.910 (2.194–10.989) | < 0.001 | | 3.136 (1.309–7.513) | 0.010 |
Age | 80 | | | | | |
<60 | 43 | Reference | | | | |
≥ 60 | 37 | 0.910 (0.530–1.564) | 0.733 | | | |
Clinical stage | 80 | | | | | |
Stage I | 29 | Reference | | | | |
Stage II | 24 | 23.788 (7.942–71.253) | < 0.001 | | 22.251 (7.186–68.898) | < 0.001 |
Stage III | 27 | 25.072 (8.448–74.408) | < 0.001 | | 22.337 (7.387–67.537) | < 0.001 |
Pathologic differentiation | 80 | | | | | |
G1 | 16 | Reference | | | | |
G2 | 18 | 1.679 (0.716–3.940) | 0.234 | | | |
G3 | 46 | 1.553 (0.738–3.271) | 0.246 | | | |
Venous invasion | 80 | | | | | |
No | 70 | Reference | | | | |
Yes | 10 | 1.236 (0.583–2.623) | 0.581 | | | |
Lymphatic invasion | 80 | | | | | |
No | 56 | Reference | | | | |
Yes | 24 | 0.737 (0.400–1.359) | 0.328 | | | |