Figure 1 shows the flow diagram of this research. In this study, 365 cases of TCGA-LIHC and 231 cases of ICGC (LIRI-Japan) HCC were included.
Identification of intersected genes of the ICGC and TCGA cohort
Differential analysis in the TCGA cohort showed that 123 out of 408 (30.07%) FRGs were differentially expressed between HCC tissues and their adjacent non-tumor tissues (Fig.2a-b). In addition, 128 FRGs were associated with OS. Next, differential and prognostic genes associated with ferroptosis were intersected. Results indicated that 62 differential prognostic genes in the TCGA cohort were associated with ferroptosis. Subsequent univariate analysis in the ICGC cohort found that 35 genes were associated with OS (supplementary table 3).
Furthermore, we intersected prognostic genes in the ICGC cohort with differentially prognostic genes associated with iron death in the TCGA cohort. Thirty five genes were reserved as the intersection genes for further analysis (Fig.2c). The network of interactions among these genes showed that MAPK3, NQO1, CDKN2A, TXN, and other genes (supplementary table 4) were hub genes (Fig.2d).
Construction of a prognostic model using the ICGC cohort
The expression profiles of 35 genes were used to establish the prediction model after Lasso-Cox regression analysis, from which, eight genes were identified based on the optimum value of lambda (Supplementary Fig.1a). According to the best truncated value of each gene, the survival analysis revealed that high expression of these genes was associated with unfavourable prognosis (Supplementary Fig.1d). Notably, the adjusted p value was less than 0.05. The risk score was calculated as follows: e (0.073 * expression level (EL) of G6PD+ 0.104 * EL of MAPT+ 0.061 * EL of NRAS+ 0.007 * EL of RRM2+ 0.165 * EL of SLC2A1+ 0.093 * EL of SLC7A11+ 0.164 * EL of SRXN1+ 0.046 * EL of STMN1). Next, the patients were split into a high-risk group (n=115) and a low-risk group (n=116) based on the median critical value (Fig.3a). PCA and t-SNE analyses revealed that there was always a differential distribution of cases among the different risk groups (Fig.3b-c). Results showed that patients in the high-risk group were more likely to die compared to patients in the low-risk group (Fig.3d). Moreover, the Kaplan–Meier curve indicated that patients in the low-risk group had a higher OS than patients in the high-risk group (Fig.3e, P< 0.001). The outcomes revealed that the area under the curve (AUC) of one year, two years, and three years were 0.706, 0.730, and 0.719, respectively (Fig.3f). Furthermore, survival analysis of the eight identified genes showed that, with the exception of MAPT, the other seven genes were associated with lower OS in the ICGC cohort (P<0.05, Fig.S2).
Validation of the 8-gene signature in the TCGA cohort
Survival analysis in the TCGA cohort showed that high expression of the eight genes was associated with poor OS (P<0.05, Fig.S3). To test the reliability of the 8-gene signature prognostic model (established using data obtained from the ICGC cohort), HCC patients in the TCGA cohort were further divided into high-risk and low-risk groups based on the median risk score of the ICGC cohort (Fig.4a). It was found that the validation results were similar to the results of the test model establishedconstructed using the ICGC cohort. Moreover, PCA and t-SNE analyses indicated that the distribution direction of high- and low-risk groups was different (Fig.4b-c). The OS of patients in the high-risk group was worse than for patients in the low-risk group (Fig. 4d-e, P< 0.001). The results showed that the AUC of one year, two years, and three years were 0.788, 0.706, and 0.668, respectively (Fig.4e).
Independent prognostic value of the 8-gene signature
Univariate and multivariate analyses were used to determine whether the risk score can be an independent predictor of OS. Univariate analysis results showed that the risk scores were associated with OS in both cohorts (HR= 4.594, 95% CI = 3.009-7.013), P< 0.001; HR= 3.780, 95% CI = 2.195-6.510, P< 0.001, respectively) (Fig.5a, b). Multivariate analysis further showed that risk scores were independent predictors of OS (TCGA cohort: HR =4.101, 95% CI = 2.662-6.316, P<0.001; ICGC cohort: HR=3.208, 95% CI = 1.833-5.613, P<0.001; Fig.5c, d).
Functional enrichment analysis (FEA)
To determine the biological mechanism associated with the gene signatures, differential gene expression was analyzed between groups using GO and KEGG analyses. Results showed that most of the DEGs were enriched in many molecular functions associated with ferroptosis, such as cell growth and redox ability (P< 0.05, Fig.6a, c). There were a lot of immune-related biological processes in the ICGC cohort (P<0.05, Fig.6a). The mainly enriched biological processes or molecular functions in the TCGA cohort were immune functions, including T-helper 1 type immune response, mast cell activation, immune effector process, humoral immune response, and neutrophil activation (P<0.05, Fig.6c). KEGG analysis results indicated that HIF−1 and PPAR signal channels were mainly enriched in both cohorts (P < 0.05, Fig.6b, d). Moreover, the relationship between risk score and immune status was explored. SsGSEA was used to quantitatively analyze the enrichment parts, related functions, or pathways of different immune cell subsets.
In the ICGC cohort, there were significant differences between different risk score groups with regard to the process of antigen presentation, including dendritic cells (DCs) score. There were also significant differences in macrophage and Th2_cells scores between different risk score groups (P< 0.05, Fig.7a-b). In addition, type II IFN response scores were lower, and MHC _ I scores was higher in the high-risk group (P< 0.05, Fig.7a-b). With regard to the macrophage score, significant differences were found in the TCGA cohort (P< 0.05, Fig7c-d).