1. Identification and selection of hypoxia-related lncRNAs
The entire work framework is shown in Fig. 1. We extracted 15036 lncRNAs from the TCGA database. At the same time, 199 hypoxia-related genes were available in HCC samples (Table S1). We performed Pearson analysis to calculate coefficients between hypoxic genes and lncRNAs. We defined 486 hypoxia-related lncRNAs with the criteria that coefficient |r| > 0.4 and P < 0.001. Significant DElncRNAs were identified among HCC compared with adjacent noncancerous tissues. Next, we identified 300 DElncRNAs (11 downregulated and 289 upregulated lncRNAs) using the limma package (absolute log Fold Change value |LogFC| > 1, P < 0.05). These DElncRNAs were displayed through the volcano plot (Fig. 2A). In addition, univariate analysis screened prognostic lncRNAs from 486 hypoxia-related lncRNAs, and 77 prognostic lncRNAs were obtained (P < 0.01, Table S2). The Venn diagram showed the DElncRNAs and their intersection with hypoxia-related prognostic lncRNAs, the 62 candidate lncRNAs were inputted into further analysis (Fig. 2B).
2. Construction and validation of hypoxia-related lncRNAs signature
The model was constructed from training datasets and verified using validation datasets. As 62 candidate lncRNAs might display similar biological roles, we reduced the dimensionality by Lasso-Cox regression analysis. When its lowest value at a log(λ)=-2.45, the risk score model generated corresponding coefficients. Finally, eight hypoxia-related lncRNAs were included to develop the model (Fig. 3A, B). Therefore, the formula was: Risk score=SNHG3*0.0046+NRAV*0.0645+AL031985.3*0.0616+AL049840.6*0.0217+ZFPM2-AS1*0.0546+AC074117.1*0.0681+AC073611.1*0.0676+MAFG-DT*0.0005. In GSE76427, only 5 out of 8 lncRNAs were extracted, containing AC074117.1, ZFPM2-AS1, NRAV, SNHG3, MAFG-DT, and then calculate risk scores. Meanwhile, we obtained the best cut-off value using X-tile software, which was confirmed to be 1.601.
Later, in the training dataset, the survival curve indicated that high-risk patients had significantly inferior prognosis (P < 0.001) (Fig. 3C). High-risk HCC patients tended to die earlier (Fig S1). The predicted AUC value for 1‐, 3‐ and 5‐year OS were 0.767, 0.710, and 0.710, respectively, indicating the good predictive performance of HRLS (Fig. 3D). Consistent with the training dataset, these trends were observed in the test dataset (Fig. 3E, F) and GSE76427 (Fig. 3G, H).
3. Prognostic value of the hypoxia-related lncRNA signature
In the univariate and multivariate Cox regression model, we analyzed five variables that include age (continuous value), gender (male and female), grade (low and high), stage (I/ II and III/ IV), HRLS (low and high) in three datasets. As shown in Fig. 4A and Fig. S2A, C, the HRLS was an independent predictive biomarker in three datasets (training dataset: hazard ratio (HR) = 4.017, 95% confidence interval (CI) = 2.389-6.755, P < 0.001; test dataset: HR = 3.319, 95% CI = 1.619-6.808, P = 0.001; GSE76427: HR = 3.477, 95% CI = 1.467-8.242, P = 0.005). Further, multivariate analysis indicated that stage and HRLS were still independent risk factors for OS, and HRLS exhibited highest significance in addition to the other existing clinical parameters (training dataset: HR = 4.164, 95%CI = 2.364-7.334, P < 0.001; test dataset: HR = 2.871, 95%CI = 1.361-6.055, P = 0.006; GSE76427: HR = 3.113, 95%CI = 1.284-7.550, P = 0.012. Fig. 4B, Fig. S2B, D). Then, we constructed a nomogram consisting of two meaningful variates in the entire TCGA dataset (Fig. S3A) and GSE76427 (Fig. S3B), which could predict mortality in HCC patients by quantitative scoring methods. The quantified total score is functionally transformed to obtain the survival probability at 1, 3 and 5 years. Furthermore, the calibration plot showed model has better consistency between predicted OS and actual observations, which consistent with the results of the validation dataset (Fig. S3B, D).
As illustrated in Fig.4C, riskScore had a significantly higher C-index (consistency index) than the stage in TCGA datasets. Importantly, the combination of risk score and stage can significantly promote C-index in the TCGA dataset (stage+riskScore C-index:0.705, 95% CI 0.653-0.757). However, the difference in the C statistics was not significant in GSE76427 datasets (stage+riskScore C-index:0.655, 95% CI 0.546-0.765). Finally, the DCA analysis indicated that HRLS and immune cells models added more net benefit than did the clinical characteristics. It also indicated that prediction with all or non-patient schemes is more beneficial when the decision probability based on the nomogram is >0.25 and <0.4 (Fig. 4D). Therefore, this nomogram showed superior clinical usefulness.
Next, to further evaluate the accuracy of HRLS, patients were stratified based on age, gender, pathological grade, and stage. The survival curve showed that patients in the low-risk subgroup had better OS outcomes compared with ones in the high-risk subgroup in the TCGA studies (Table S3). The classifier also identifies significantly different distribution of risk scores among different subgroups. We found that patients with high grades (T3-T4) and advanced stages (III-IV) had a higher risk score (Fig. S4). Nevertheless, there were no survival benefits in the age and gender subgroups. As such, risk scores may be correlated with pathologic parameters for HCC patients.
4. The landscape of immune cells infiltration
To explore whether HRLS assess changes of immune cells in the TME of liver cancer, we depicted their relationships. Hence, the activation level of 29 immune signatures (16 immune cell types and 13 functions) was calculated using the ssGSEA algorithm according to the reference gene set. As shown in Fig. 5A and Fig. S5, the low-risk subgroup had a significantly greater proportion of NK cells and Mast cells. Macrophages and regulatory T cells (Tregs) were more likely to be distributed in the high-risk subgroup. In particular, we also found that hypoxia risk scores were positively associated with the infiltration of two immune cells, including Macrophages and Treg (Macrophages: r = 0.31, P < 0.001; Fig. 5B; Treg: r = 0.21, P < 0.001; Fig. 6C). It suggested that tumor hypoxic signature is associated with immune cells phenotypes.
5. HRLS predicts therapeutic benefits
To further identify HCC patients who can benefit from chemotherapy, we evaluated 6 drugs response in two risk subgroups. The low-risk patients were highly sensitive to the chemotherapeutics Methotrexate (P-value < 0.001, Fig. 6A), Gefitinib (P-value < 0.0025, Fig. 6B), and Docetaxel (P-value < 0.001, Fig.6 C). Our analysis indicated that their chemotherapy response rates were higher. On the contrary, the high-risk patients were highly sensitive to the chemotherapeutics Cisplatin (P-value < 0.001, Fig. 6D), Bexarotene (P-value < 0.001, Fig. 6E), and Gemcitabine (P-value < 0.001, Fig. 6F), which meant that HRLS may be a potential biomarker for chemosensitivity. The results could also explain that the application of these three drugs could result in a poor prognosis for low-risk patients who may have a chemoresistant environment. We retrieved different drug treatments gene sets from MSigDB to perform GSEA. This supports the above results that three-drug resistance pathways were significantly correlated with HRLS (Fig. 6G, Table S4).
The TIDE algorithm has been applied to the clinical efficacy of immunotherapy. Studies showed that a higher TIDE score means less benefit from immune-checkpoint-inhibitor (ICI) treatment due to immune evasion. Our results showed that the TIDE score in tumors with high-HRLS was significantly lower than those in the low-HRLS subgroup, implying a better response to ICI immunotherapy. (Fig. 6H). Besides, a higher TIDE score may predict worse outcomes for ICI treatment. Also, analysis shows that patients in the low-HRLS subgroup had a higher T cell dysfunction (Fig. 6J) and a lower T cell exclusion score (Fig. 6I). Collectively, these data suggested that risk stratification may be useful in assessing the response of HCC patients to immunotherapy.
6. Gene Set and Function Enrichment Analysis
To understand the underlying mechanism about how hypoxia-related lncRNAs affect liver cancer, GSEA was performed and the results were compared in the two groups in TCGA the dataset. Gene sets upregulated in the high-risk subgroup were tumor proliferation, including MTOR signaling pathway, P53 signaling pathway, VEGF signaling pathway, nod like receptor signaling pathway, and pathway in cancer. In contrast, the immune-related pathways were significantly enriched in the low-risk subgroup, which included the PPAR signaling pathway, fatty acid metabolism pathway, drug metabolism cytochrome 450 pathway, complement, and coagulation cascades pathway, glycine serine, and threonine metabolism pathway (Fig. 7A, Table S4). Meantime, to further investigate the BPs associated with 816 differentially expressed genes corresponding to the HRLS (|LogFC| > 1 and P < 0.05), ClueGO analysis was carried out and suggested that the hypoxia environment was related to four significant GO pathways (P < 0.05), such as cytokine-mediated signaling pathway, icosanoid metabolic process, mitotic spindle organization, and regulation of cell migration (Fig. 7B). These differential genes suggested that hypoxia correlates with oncogenic pathways and enhanced immune function.
7. Validation of the expression level of hypoxia-related lncRNAs and construction of the ceRNA network
We detected the expression level of hub-lncRNAs of 3 hypoxia-treated and 3 normoxia-treated HCC cell lines in the GSE155505 database. The expression of two lncRNAs were significantly decreased in Hep3B cells after hypoxia exposure (48 hours) as compared to the control group, such as AC073611.1 (P = 0.026), AL031985.3 (P = 0.029) (Fig. 8A, B). Therefore, hub-lncRNAs are hypoxia-responsive in HCC cells. Moreover, by calculating the risk score for 67 HCC patients with sorafenib treatment downloaded from the GSE109211 dataset, we found that the expression levels of MAFG-DT (P = 0.00021) were significantly higher in 21 sorafenib treatment responders than in 46 non-responders (Fig. 8C). Taken together, our analyses indicated that the expression level of a single lncRNA may be different in treatment response.
To investigate how hypoxia-associated lncRNAs regulate mRNA expression through sponging miRNAs, we firstly extracted three of 62 lncRNAs from the Starbase and identified 10 pairs of interaction between the three lncRNAs and eight miRNAs. Then, the targets genes of eight candidate miRNAs were obtained via the mirDIP database with the predicted score as ‘very high’, and total, of 21 mRNAs were identified. Ultimately, based on the above evidence (Table S5, 6), we next constructed a ceRNA regulatory network, which contained 32 nodes and 43 edges (Fig. 8D). Since all three lncRNAs inhibit miR-139 expression, we predicted its binding sites by mircode and TargetScan (Table S7 and S8). In addition, functional analysis of these 21 target genes was implemented by the Metascape online tool. And these genes were found to be associated with the pathways, such as blood vessel development, response to hypoxia, PI3K-Akt-mTOR-signaling pathway, etc. (Fig. 8E-G).