1. Determination the PRDEGs in GC
A total of 134 DEGs associated with ferroptosis and iron metabolism were obtained in this study (Figure1A). The volcano plot displayed that 97 genes were the high expression in tumor tissues, and the remaining 37 were the opposite (Figure1B). Univariate Cox regression analysis was applied to filter out 13 ferroptosis- and iron metabolism-related genes with prognostic values (PRGs) in GC patients (Figure1C).
Combined with the previous analysis of PRGs expression and the data in ImmPort database, we screened a total of 70 differential transcription factors based on GC (GCTFs, Figure1D). The volcano plot displayed 52 GCTFs were up-regulated and 18 GCTFs were down-regulated (Figure1E). Moreover, correlation analysis revealed the regulatory network between GCTFs-PRGs. As presented in Figure1F, there were 37 pairs of GCTFs-PRGs regulatory relationships, of which 26 pairs were positive and 11 pairs were negative.
We intersected 134 DEGs and 13 PRGs to determine 6 PRDEGs (GCTFs, Figure1G). The differential expression of six PRDEGs between normal and tumor tissues was represented in Figure 1 H. Univariate Cox regression analysis clarified that the six PRDEGs were linked to the prognosis of GC patients, among which NOX4 and ZFP36 were high risk genes, while GLS2, MTF1, SLC1A5 and SP1 were low risk genes (Figure1I). The correlation network showed the co-expression relationship among these PRDEGs (Figure1J).
2. Creation and verification of the prognostic risk model of GC
Based on the PRDEGs expression profile and LASSO-penalized Cox regression analysis, our study finally identified the 6-gene signature in accordance with the optimal value of λ (Figure2A-B). Six PRDEGs (GLS2, MTF1, SLC1A5, SP1, NOX4, and ZFP36) were further confirmed to be closely correlated to prognosis of GC patients. The risk score= (-0.341* expression level of GLS2) + (-0.359* expression level of MTF1) + (-0.104* expression level of SLC1A5) + (-0.158* expression level of SP1) + (0.489* expression level of NOX4) + (0.154* expression level of ZFP36).
For the TCGA cohort, 350 GC patients were separated into the high-risk group (n=175) and the low-risk group (n=175) according to the risk score median value (Figure2C). PCA and t-SNE analysis confirm that preciseness and distinction of grouping in the prognostic risk signature (Figure2D-E). Figure 2F revealed that with the risk score raised, the distribution of living patients decreased, indicating that there was a correlation between this risk score and the OS of patients. Next,K-M survival curve revealed the low-risk group patients had a higher survival probability compared with the high-risk group (Figure2G, P< 0.001). The area under the curves (AUCs) of the time-dependent ROC reached 0.636 at 1 year, 0.639 at 2 years, and 0.654 at 3 years, suggesting the prognostic risk signature having good sensitivity in predicting survival of GC patients (Figure2H).
The above analysis results were verified in the GEO cohort. We divided 431 patients with GC from the GEO cohort into two groups in accordance with the median value calculated from the TCGA cohort: 13 patients in the high-risk group and 418 patients in the low-risk group (Figure2I). Likewise, PCA and t-SNE analysis confirm the two groups’ patients were disseminated in opposite directions (Figure2J-K). Consistent with the TCGA cohort results, the high-risk group’s patients have the higher death rate and shorter OS compared with the low-risk group’s patients (Figure2L-M, P< 0.001). Last, the AUCs for the GC patients’ OS was 0.579 at 1 year, 0.593 at 2 years, and 0.600 at 3 (Figure2N).
3. Independent prognostic performance of the prognostic risk model
Univariate and multivariate Cox regression analyses were employed to examine whether the prognostication effect of the risk score for GC patient's prognosis was independent of other available clinical indicators (including age, gender, histological grade and TNM stage). Univariate Cox regression analyses demonstrated that the risk scores of the prognostic risk signature and prognosis of patients were highly correlation in both the two cohorts (TCGA cohort: P<0.001, HR =2.932, 95% CI=1.847-4.655; GEO cohort: P<0.001, HR=1.609, 95% CI =1.271-2.035) (Figure3A-B). Furthermore, multivariate analysis once again verifies that risk score was an independent parameter for predicting the patients’ prognosis (TCGA cohort: P<0.001, HR =3.195, 95% CI=2.012-5.072; GEO cohort: P<0.001, HR=1.592, 95% CI =1.250-2.028) (Figure3C-D).
To appraise the accuracy of the risk score compared with other clinical indicators, we calculated the AUCs of the ROC curves in two cohorts. The results disclosed that the risk score’s AUCs in TCGA and GEO cohorts were 0.642 and 0.579, respectively (Figure3E-F). From this, we concluded that this prognostic risk model could precisely predict the GC patients’ OS. Subsequently, the predictive nomograms were built to estimate possible survival time of individual GC patient at 1, 2 and 3 years (Figure3G-H). what’s more, the calibration curves of the nomograms based on the prognostic risk signature exhibited that the predicted survival probability and actual OS had good consistency at 3-year (Figure3I-J).
Finally, our study investigated the correlation between the prognostic risk model (including the risk score and the genes) and clinical indicators in TCGA cohort. With the enhance of the risk score and NOX4, the histological grade of patients with GC enhanced, but the result of SLC1A5 was on the contrary (Figure3K-L). In terms of pathological stage, with the increased expression of NOX4 and SP1, the stage also increased (Figure3M-N). These results demonstrated that the abnormal expression of genes in the prognostic risk signature were intimately connected with the GC’s progression.
4. Analysis of association between risk score and TMB of GC
The occurrence of tumors is caused by the accumulation of somatic mutations in the genes of diseased cells[22].Here, we first analyzed the difference in tumor mutation burden (TMB) of GC patients in the high-risk group (Figure 4A) and the low-risk group (Figure 4B). Figure4 C showed that the number of TMB was higher in the low-risk group than in the high-risk group. At the same time, patients in the low-risk group with the high TMB had a higher 5-year survival rate than those in the high-risk group with the low TMB (Figure 4D). TMB has become a biomarker to identify cancer patients who can benefit from immunotherapy and to predict the efficacy of immunoassay inhibitors [22].
Moreover, the fractions of T cells CD4 memory activated, NK cells resting, macrophages M1, and neutrophils were higher in the high TMB group, while the T cells regulatory (Tregs) and mast cells resting were just the opposite (P < 0.05, Figure 4E).
To illuminate the underlying mechanism of the protective effects of the prognostic risk model and Immune microenvironment of GC, we further investigated the effect of somatic cell copy number (CNA) of the 6-gene signature on different kinds of immune cell infiltration. CNA of the 6-gene signature significantly affected the fraction of B cells, CD4 + T cells, CD8 + T cells, neutrophils, macrophages, and dendritic cells (DCs) in GC (Figure 4F-K). Above the results manifested that the prognostic risk model plays a critical role in regulating the tumor immune microenvironment of GC patients.
5. Functional enrichment analysis
We performed GO and KEGG analysis on the DEGRGs. In terms of GO, DEGRGs were mainly correlated to extracellular matrix organization, extracellular structure organization, and glycosaminoglycan binding in both the TCGA and GEO cohort (P < 0.05, Figure 5A, E). KEGG analyses elucidated that DEGRGs were enriched in ECM-receptor interaction, PI3K-AKT signaling pathways and protein digestion and absorption in both cohorts (P < 0.05, Figure 5B, F). Subsequently, we performed GSEA analysis on GO and KEGG in the two cohorts to find more special biological processes and molecular signaling pathways. The results demonstrated the DEGRGs from the two cohorts were primarily associated with metabolic- related biological processes (Figure 5C, G). Related enrichment pathways analyses elucidated that DEGRGs were mainly concentrated on lysine degradation, cysteine and methionine metabolism, glutathione metabolism, and TGF beta signaling pathway (Figure 5D, H). These pathways involved in ferroptosis-related biological processes and immune-related molecular functions.
6. The connection between the different risk groups and TIME of GC
Considering the relationship between prognosis and the TIME in patients with GC, we further evaluated the enrichment scores of diverse kinds of immune cell subsets and immune-associated effects in two risk groups from the TCGA and GEO cohorts using ssGSEA.
For the TCGA cohort, B cells, macrophages, and plasmacytoid dendritic cells (pDCs) had higher infiltration in high-risk group. In particularly, immune-related functions, such as type II IFN response, APC co-stimulation,and Chemokine Receptor (CCR) were also more active in the high-risk group (Figure 6A, B). In terms of the GEO cohort, the scores of T follicular helper cells (Tfh), T helper type 2 (Th2) cells, inflammation-promoting, MHC class I, and type II IFN response had statistical difference between the high-and low-risk groups (Figure 6C, D)
Figure 6E-J exhibited the relevance among the prognostic risk score and diverse immune cells. The association between the six prognostic risk genes’ expression and the immune infiltrate was shown in Figure 6K-P.
7. Validation the expression of PRDEGs in cell lines
The QPCR was used to detect the expression level of PRDEGs in different GC cell lines. The QPCR results showed that the expression of GLS2, MTF1, SLC1A5, SP1, NOX4, and ZFP36 was upregulated in GC cell lines (HGC-27) vs human gastric mucosal epithelial cell line (GES-1). The expression of MTF1 was upregulated while SP1, NOX4, ZFP36 was downregulated in GC cell lines (MKN-45) vs GES-1(Figure 7A-F). The QPCR data results were roughly the same as our bioinformatics analysis results (Figure 7G).