Identification of 4 ferroptosis-related candidate genes in Glioma Patients
A total of 55 ferroptosis-related genes were inclued from previous studies[16, 34-42] (Supplement table 1). 20 ferroptosis-related genes (20/55, 36.4%, TP53, SLC1A5, CD44, HSPA5, PROCR, SLC40A1, LPCAT3, HMOX1, PANX1, IGFBP7, ALOX5AP, FANCD2, STEAP3, SERPINE1, CDKN1A, IGFBP3, BDNF, HIC1, MT1G, CHAC1) were found in 4352 DEGs from TCGA-GTEx datasets. (Figure 2A and Table 2). Ferroptosis-related prognostic DEGs (FANCD2, TP53, IGFBP3, LPCAT3, SLC40A1, HMOX1) was filtrated through the univariate Cox regression analysis (Figure 2B). TP53 and SLC40A1 were not significant to be excluded by multivariate Cox regression analysis (Figure 2C). The co-linearity was checked between these genes to measure the connection among them and the result showed that the genes were independent of each other and the construction of model is available (Figure 2D). Finally, 4 genes as prognostic model factors were determined (FANCD2, IGFBP3, LPCAT3, HMOX1, Figure 2E).
Prognostic model construction in TCGA cohort
The risk score was calculated according to the expression of these ferroptosis-related genes in TCGA dataset, as formula: (0.504*expression level of FANCD2 + 0.264*expression level of HMOX1 + 0.235*expression level of IGFBP3 + 0.267*expression level of LPCAT3) (Supplementary table 2). Patients were ranked by the score, and they were stratified into a low-risk group (n=436) and a high-risk group (n=251) based on the optimal cut-off expression value. Patients in the low-risk group had prolonged OS than high-risk group (Figures 3A-C). The predictive performance of the risk model for OS was estimated by time-dependent ROC curves. The results showed that the area under the curve (AUC) reached 0.843 at 1 year, 0.846 at 2 years, and 0.849 at 3 years respectively (Figure 3D).
External validation of 4-gene prognosis model
From the analysis above, the model was able to effectively predict glioma patients’ prognosis in TCGA dataset. Subsequently, CGGA, Gravendeel and REMBRANDT datasets were performed to verify the efficacy and availability of the model. The patients in CGGA, Gravendeel and REMBRANDT datasets were also grouped into low-risk and high-risk respectively. The results in three datasets showed that high-risk group had a markedly poor prognosis than low-risk group (Figures 3A-C), suggesting that the 4-gene signature model can greatly predict the prognosis of patients. Time-dependent ROC curves were plotted to illustrate the sensitivity and specificity of the model. The 1-, 3- and 5-year AUC values of CGGA are 0.654, 0.718 and 0.738; of Gravendeel are 0.741, 0.729 and 0.770 respectively; and the 1-, 2-, and 3-year AUC values of REMBRANDT are 0.649, 0.744 and 0.740 (Figure 3D). According to the results of AUC curve, the prognostic model was significant.
Clinical characteristics of the patients based on risk score
The heatmaps (Figure 4A) showing the differential expression of the 4 selected genes, and baseline characteristics of the patients in different risk groups (Table 3) suggested that clinical and molecular features, such as WHO grade, age, classical subtypes, mesenchymal subtypes, and IDH wild types were enriched in high-risk group. Furthermore, with an increase in glioma grade, the risk score increased. The highest increase in risk score was found in the WHO grade IV patients, whereas the lowest increase in risk score was observed in the WHO grade II patients in the TCGA, CGGA, Gravendeel and REMBRANDT datasets (Figure 4B). Meanwhile, Kaplan-Meier plots of overall survival based on different clinical characters presented apparent differences between the high-risk group and low risk group for survival (Figures 5A-H). The survival-predictor scores from these models were highly predictive of survival in datasets (p < 0.001). The 4-gene mod was able to distinguish patients with obviously distinct outcomes across subsets, demonstrating the risk score based on four gene was greater prognostic power as compared with only used of clinical subgroups (WHO grade, IDH1, MGMT, 1p/19q).
Independent Prognostic Value of the 4-gene Signature
The independence of the clinical prognostic significance of the signature in glioma was verified. The risk score was significantly associated with OS in univariate Cox regression analysis in TCGA, CGGA, Gravendeel and REMBRANDT datasets, and the Hazard ratio (HR) of four datasets were: 9.57, 3.64, 2.80, 3.18 (95%CI, Under threshold of p<0.001) respectively (Table 4). Multivariate regression analysis still showed a significant correlation of the risk score with OS after adjusting for other clinical factors. The HR of multivariate analysis in four databases are 2.50, 1.87, 1.80, 1.87 (95%CI, Under threshold of p<0.001) (Figure 6). Consequently 4-gene signature model can be determined to be an independent prognostic factor of survival.
The Clinical Predictive Performance of the Nomogram
Based on the four ferroptosis-related gene signature (risk score) and clinical factor, a nomogram to predict patients' prognosis in the TCGA database was created (Figure 7A). In the study, we used a bootstrap method to verify the developed nomogram with the C-index of 0.869(95%CI), which suggested that the predictive model had good predictive performance. Furthermore, the calibration curves in 1-, 2-, 3-year survival of patients also showed good consistency compared with the ideal model, further indicating that the nomogram was stable in predicting the prognosis of glioma patients (Figures 7B-D). This suggests that basing therapy strategy on our nomogram will improve clinical outcome (Figure 7E).
Functional Annotation of the 4-gene Signature
To clarify the potential functional characteristics associated with risk score, the DEGs between the low-risk and high-risk groups were used to perform GO enrichment and KEGG pathway analyses. The results of GO enrichment were concentrated in three areas: binding with skeleton-associated, changing of receptor channel and extracellular matrix constituent, extracellular matrix could be a pro-ferroptosis stress, which may mediate cell adhesion, vesicular bodies and exosomes resulted in poor prognosis for high-risk groups. The results indicated that the biological processes associated with extracellular matrix and cell adhesion were enriched in high-risk group (P. adjust < 0.05, Figure 8A). KEGG pathway analysis also showed that the ECM−receptor interaction pathway and the Focal adhesion pathway were significantly enriched in high-risk group (P. adjust < 0.05, Figure 8B). Meanwhile, the results of GeneMANIA also confirmed that the high-risk group was obviously gathered in the structure, organization and disassembly of extracellular matrix (Figure 8C). These enriched biologic functions thus provided a deeper understanding of the patients’ poor prognosis.
Relationship between ferroptosis and Immune infiltration
Analysis of hallmark pathway gene signatures indicated that signaling pathways gathering at various biological progress were significantly different between high- and low-risk patients (Figure 9A-D). Different database has also different enrich result of signaling pathways. Immune related signal pathways are enriched in four different databases. (Figure 9E-F) (Interferon Gamma Respnse, TNFA Signaling Via NFĸB and Interferon Alpha response 3/7). Recent study has been reported that due to the multifaceted nature and the involvement of numerous metabolic pathways directly impinging on ferroptosis, it is highly likely that certain key nodes of the ferroptosis process are targeted by the immune system[43]. To further explore the correlation between the model and immune status, enrichment scores of the diverse immune cell subpopulations in TCGA and CGGA were quantified by Cibersort. Various immune cells (CD4+T cell, CD8+T cell, Tregs, monocytes, macrophages M0/M2, NK cells) were obviously different between the high risk and low risk (Figure 9G-H). The immune infiltrations of the two databases perfectly shows that the four ferroptosis-gene model are closely relevant to the immune.