3.1 Identification of overlapped ERSRGs in TCGA and ICGC cohorts
As calculated by univariable cox regressions with an adjust P value <0.05, 330 ERSRGs in TCGA and 225 ERSRGs in ICGC had significant prognostic relevance, respectively. Then 111 genes were identified as overlapped ERSRGs for further analysis (Figure 1A). The expression levels of the 111 overlapping ERSRGs in normal and tumor tissues in TCGA-LIHC dataset were shown in Figure S1A. Results of Go and KEGG analysis indicated that these overlapped genes were majorly associated with ER stress-related pathway (Figure S1B).
3.2 Establishment of an ER stress-related signature in TCGA
Overlapped ERSRGs were selected by performing the LASSO-Cox regression model based on the minimum value of λ and 19 genes were screened as shown in Figure 1B. These 19 genes were then placed into a stepwise Cox proportional model and finally a prognostic seven-gene signature was identified. Risk score = (0.17243378×MAPT) + (0.09459892×NQO1) – (0. 12755413×PON1) – (0. 14029120×PPARGC1A) + (0. 65200737×NGLY1) + (0. 37755837×CDK5) + (0. 19417209×RNF186). Risk scores for HCC patients were calculated with the above formula, and patients were stratified into high- or low-risk subgroups with an optimal risk score threshold (Figure 2C). The association between risk score and clinical characteristics including age, gender, grade, stage, vascular invasion, value of AFP, cirrhosis, HBV infection status and tumor status were evaluated. The results revealed that higher risk scores were linked to advanced TNM stage, later grade, later T stage, HBV infection and recurrence (Supplementary Figure S2). Kaplan-Meier survival analysis revealed that patients with higher risk score were significantly relevant to poorer survival outcomes (Figure 2D). In addition, further stratified survival analysis was applied for different clinical characteristics, and the results demonstrated that this prognostic model could further differentiated patients with different clinical characteristics including age, vascular invasion, grade, recurrence, TNM stage, gender, HBV infection status and AFP value (Supplementary Figure S3). Finally, ROC analysis revealed that this signature had a good prognostic performance with AUCs at 1-, 2-, 3-year of 0.800, 0.732, 0.745 (Figure 2E). To explore whether the seven-gene signature could be acted as an independent prognostic model for HCC patients, univariable and multivariate Cox analysis were performed. Univariable Cox regression analysis revealed that this seven -gene signature was statistically associated with survival outcomes for HCC patients (HR=3.088, 95%CI 2.200-4.335, P < 0.001, Figure 2F Left). Then statistically significant variables obtained above were entered into multivariate Cox regression analysis, which revealed that this signature could be served as an independent prognostic factor for HCC patients (HR=2.203, 95%CI 1.313-3.694, P < 0.001, Figure 2F Right) after adjusting for other clinical features.
3.3 Verification of the signature in ICGC cohort
To validate the signature, ICGC dataset was applied as a validation cohort. Risk scores of patients were calculated with the same formula, and patients were stratified into high- or low-risk subgroups in ICGC cohort (Figure 3A). Kaplan-Meier survival analysis revealed that patients with higher risk score were prominently relevant to poorer OS rate in ICGC cohort (Figure 3B). ROC analysis revealed that this signature had a good prognostic performance with AUCs at 1-, 2-, 3-year of 0.723, 0.740, 0.736 in ICGC cohort (Figure 3C). To explore whether the seven-gene signature could be acted as an independent prognostic model for HCC patients, univariable and multivariate Cox analysis were performed. Univariable Cox regression analysis revealed that this seven -gene signature was statistically associated with survival outcomes for HCC patients (HR=3.054, 95%CI 1.882-4.958, P < 0.001, Figure 3D Left). Then statistically significant variables obtained above were entered into multivariate Cox regression analysis, which revealed that this signature could be served as an independent prognostic factor for HCC patients (HR=2.463, 95%CI 1.487-4.078, P < 0.001, Figure 3D Right) after adjusting for other clinical features.
3.4 Functional analysis and Immune infiltrates analysis
Five Go items with FDR < 0.05 were enriched in this signature, including adaptive immune response, defense response to virus, immune effector process, response to virus and T cell differentiation (Figure 3A). According to the results of ESTIMATE algorithm, risk scores was significantly associated with stromal scores, but not immune scores (Figure 3B), and patients in low-risk subgroup had higher stromal scores when compared with patients in high-risk subgroup (Figure 3C), indicating that this signature was closely related to tumor immune status. Furthermore, based on CIBERSORT algorithm, the differences of 22 types of TIICs in two subgroups in TCGA were assessed by Wilcoxon signed-rank test analysis, and the results demonstrated that HCC patients in low-risk score subgroup had modestly increased ratios of resting mast cells, while patients in high-risk score subgroup had significantly elevated ratios of Macrophages.M0 cells (Figure 3D). Furthermore, Kaplan-Meier survival analysis revealed that only Macrophages.M0 cells were prominently relevant to poor survival outcomes in HCC patients (Figure 3E).
3.5 Genetic alterations and TMB analysis
The results of genetic alterations analysis indicated that the mutation rates of the top 10 most significantly mutated genes were remarkably different in the two subgroups (Figure 4A&4B). Subsequently, the TMB of each patient was assessed. We found that the risk score was closely related with TMB and patients in high-risk scores subgroup had significantly increased TMB (Figure 4C).
3.6 Immune checkpoint genes analysis
In the following, the expression levels of 15 potentially targetable immune checkpoint genes were compared between the two subgroups in TCGA database, and results showed that patients in high-risk subgroup had significantly increased PD1, PD-L1, CCL2, LAG3, CD276, CTLA4, CXCR4, IL1A, PD-L2, TGFB1, OX40 and CD137 (Figure 5), indicating that immune checkpoint inhibitors (ICIs) treatment was more effective for patients in high-risk subgroup.
3.7 Establishment of a nomogram model in TCGA
To investigate the coefficient prediction efficiency of this signature, a nomogram model was established in TCGA dataset, and the result revealed that the nomogram with a C-index of 0.723 could help us provide a quantitative method for predicting the 1-, 2-, 3-year survival rate accurately (Figure 6A). The overlap between the forecasted and actual probabilities of 1-, 2-, 3-year survival rate in the calibration curves indicated good agreement (Figure 6B).
3.8 Drug susceptibility analysis
Among the 574 in advanced clinical trials and 216 Food and Drug Administration (FDA)-approved drugs, 103 were considered tumor-sensitive drugs (Table S3) and the top 16 most significant tumor-sensitive drugs were shown in Figure 7.
3.9 Expression levels of genes in risk model
The mRNA expression of the seven genes in HCC samples were explored in the independent clinical cohort by qRT-PCR (Figure 8A). Risk scores of patients were calculated with the same formula, and patients were stratified into high- or low-risk subgroups (Figure 8A). Kaplan-Meier survival analysis revealed that patients with higher risk score were prominently relevant to poorer OS rate (Figure 8B). ROC analysis revealed that this signature had a good prognostic performance with AUCs at 1-, 2-, 3-year of 0.846, 0.824, 0.686 (Figure 8C).