DEGs and DGIA lncRNAs in LCCs and RCCs
We firstly selected, separated and bundled TCGA samples in furtherance to segregate DGIA IncRNAs from DEGs of LCCs and RCCs. Soon after, the corresponding gene expression data were standardized and analyzed. We obtained 1724 DEGs (Fig. 1a), including 1325 mRNAs and 399 lncRNAs. According to the CNSMs, the top 25% (n = 75) and last 25% (n = 62) patients were labeled as GU group and GS group, respectively. By analyzing in contrast in lncRNAs between these two groups, 123 DGIA lncRNAs were attained. Among them, 63 lncRNAs showed upregulation whereas 60 exhibited downregulation in the GU group (Fig. 1b, 1c). Based on these DGIA lncRNAs, we carried out unsupervised HCA on TCGA specimens and distributed them into GU group and GS group (Fig. 1d). The CNSMs in both groups were significantly different with its median higher in the GU group comparing with GS group (Fig. 1e). These findings conclusively depicted that the selected DGIA lncRNAs had a marvelous classification effect.
Functional enrichment analysis for DGIA lncRNAs
To explore the functionalities and pathways concerned with 123 DGIA lncRNAs, we operated functional enrichment analysis on protein-coding genes (PCGs) co-expressed with DGIA lncRNAs. The first method’s procedure included a correlation analysis between the selected DGIA lncRNAs and 1,325 differential mRNAs from LCCs and RCCs. The PCGs of DGIA lncRNAs, that is, the top 10 mRNAs with the strongest correlation with each lncRNAs, were achieved.
The second method disclosed, after constructing the co-expression network (Additional file 1: Figure S1), the cognizance of blue module with highest positive correlation, and the turquoise module with highest negative correlation (Fig. 2a). We intersected the selective mRNAs from the blue and turquoise modules with PCGs chosen by the first method. Thereupon, these genes were used to construct a lncRNAs-mRNA co-expression network (Fig. 1f).
The results of functional enrichment analysis with the intersection PCGs comprised DNA-binding transcription activator activity, RNA polymerase II-specific and various phospholipase-related enzyme activities. These molecular functions are closely associated with the formation and development of genomic instability. More importantly, the enrichments of biological processes are mainly related to immune processes, such as T cell activation, lymphocyte differentiation, regulation of T cell activation, etc. (Fig. 2b). KEGG enrichment analysis displayed that both regulating pluripotency of stem cells signaling pathways, as well as immune-related pathways, including Th17 cell differentiation, Th1, and Th2 cell differentiation, PD − L1 expression, PD − 1 checkpoint pathway in cancer, etc., were significantly enriched (Fig. 2c). These results indicated that 123 DGIA lncRNAs not only cause genomic instability but also influence the regulation of immune system. The variation in the expression of these 123 DGIA IncRNAs potentially disturbs the balance of co-expressed PCGs regulatory network, consequently causing instability in the cell genome, and also affecting the killing of tumors by immune cells, mostly by the proliferation, differentiation, activation, and receptor recognition of T cells. Thus, these DGIA lncRNAs possess astounding potential in immune regulation while affecting gene instability.
Construction of DRPM using DGIA lncRNAs
The samples from TCGA were randomly and uniformly arranged into model group (n = 162) and validation group (n = 160). The clinical factors were not statistically significantly different in both groups (all p > 0.05) (Additional file 2: Table S1). In the model group, we accomplished univariate and multivariate COX to assort and construct DRPM with 123 DGIA lncRNAs, and 6 prognostic-related DGIA lncRNAs with corresponding risk coefficients were determined (Table 1). All patients in TCGA were divided into HRG and LRG on the basis of median of RS (0.851) measured by DRPM in the model group (Additional file 2: Table S2).
Table 1
DRPM information including 6 DGIA lncRNAs
LncRNAs | Coefficients | HR | 95% CI lower | 95% CI upper | P value |
NKILA | 0.198 | 1.219 | 1.027 | 1.447 | 0.024 |
AC004009.1 | 0.316 | 1.371 | 1.128 | 1.668 | 0.002 |
AP003555.2 | 0.377 | 1.457 | 1.227 | 1.731 | ༜0.001 |
BOLA3-AS1 | 0.329 | 1.390 | 1.038 | 1.860 | 0.027 |
LINC00543 | 0.074 | 1.077 | 1.007 | 1.152 | 0.030 |
UCA1 | 0.013 | 1.014 | 1.004 | 1.023 | 0.004 |
DGIA: Differential genomic instability-associated; LncRNAs: Long non-coding RNAs; DRPM: DGIA lncRNAs related prognostic model; HR: Hazard ratio; CI: Confidence interval. |
Validation of the DRPM
To confirm the anticipated effects of DRPM, we conducted Kaplan-Meier test to plot a survival curve. The results demonstrated that the survival outcomes of HRG were worse than those in LRG (all p༜0.05) (Fig. 3a). The ROC curves plotted for patients in different groups confirmed the consistency and satisfying categorization effects of DRPM, the AUC were shown in the figures respectively (Fig. 3b). Using RS, we organized the patients in different groups and detected changes in expression level of the prognostic DGIA lncRNAs. The heat map presented the increment in the expression levels of 6 lncRNAs in HRG (Fig. 3c).
To verify the independent predictive effects of RS, we combined RS with clinical factors for univariate and multivariate COX analysis. These clinical factors were age, gender, and TNM stage. The results indicated that RS was an independent prognostic factor (Table 2). Besides, to assess the risk clustering ability of DRPM in different strata, we separately stratified age (< 65 years and ≥ 65 years), gender (male and female), and clinical stage (Stage I-II and Stage III-IV). The survival curves of HRG and LRG were plotted through stratification of different clinical factors. HRG and LRG exhibited significant difference in overall survival up to the stratification of age, gender, and Stage I-II (all p < 0.05) (Additional file 3: Figure S2). In stratification of Stage III-IV, the difference was very close to reaching statistical significance (p = 0.077) (Additional file 3: Figure S2). In summary, the DRPM revealed a consistent and promising prognostic evaluation ability in different stratification.
Table 2
Univariate and multivariate COX of prognostic factors in different groups
Factors | Univariate COX | Multivariate COX |
HR | 95% CI lower | 95% CI upper | P value | HR | 95% CI lower | 95% CI upper | P value |
All patients in TCGA (n = 322) | | | | | | | | |
Age | 1.031 | 1.010 | 1.053 | 0.004 | 1.043 | 1.021 | 1.066 | ༜0.001 |
Gender | 1.330 | 0.828 | 2.134 | 0.238 | 1.149 | 0.698 | 1.891 | 0.584 |
T | 3.555 | 2.155 | 5.864 | ༜0.001 | 2.897 | 1.601 | 5.240 | ༜0.001 |
N | 1.898 | 1.438 | 2.506 | ༜0.001 | 1.132 | 0.804 | 1.595 | 0.477 |
M | 4.484 | 2.740 | 7.337 | ༜0.001 | 2.985 | 1.603 | 5.559 | 0.001 |
Risk score | 1.228 | 1.161 | 1.300 | ༜0.001 | 1.196 | 1.127 | 1.269 | ༜0.001 |
Model group (n = 162) | | | | | | | | |
Age | 1.024 | 0.996 | 1.053 | 0.094 | 1.033 | 1.002 | 1.065 | 0.038 |
Gender | 0.734 | 0.393 | 1.369 | 0.331 | 0.758 | 0.388 | 1.480 | 0.417 |
T | 2.359 | 1.235 | 4.507 | 0.009 | 1.465 | 0.702 | 3.060 | 0.309 |
N | 1.904 | 1.315 | 2.756 | 0.001 | 1.216 | 0.763 | 1.940 | 0.411 |
M | 4.654 | 2.451 | 8.838 | ༜0.001 | 3.405 | 1.500 | 7.726 | 0.003 |
Risk score | 1.235 | 1.158 | 1.317 | ༜0.001 | 1.160 | 1.078 | 1.249 | ༜0.001 |
Internal validation group (n = 160) | | | | | | | | |
Age | 1.040 | 1.006 | 1.076 | 0.021 | 1.063 | 1.026 | 1.102 | 0.001 |
Gender | 3.060 | 1.361 | 6.878 | 0.007 | 2.179 | 0.904 | 5.248 | 0.083 |
T | 6.649 | 3.078 | 14.362 | ༜0.001 | 9.612 | 3.662 | 25.230 | ༜0.001 |
N | 2.074 | 1.313 | 3.276 | 0.002 | 1.378 | 0.744 | 2.555 | 0.308 |
M | 4.349 | 1.992 | 9.493 | ༜0.001 | 2.299 | 0.767 | 6.891 | 0.137 |
Risk score | 1.178 | 1.033 | 1.343 | 0.015 | 1.250 | 1.088 | 1.436 | 0.002 |
COX: Cox proportional hazard regression analysis. |
Validation of the prognostic DGIA lncRNAs
To verify the accuracy of prognostic DGIA lncRNAs, we plotted survival curves for these lncRNAs in TCGA samples. In AC004009.1, AP003555.2, BOLA3-AS1, NKILA, LINC00543 and UCA1, the prognosis of the high expression group was worse as compared to the low expression group (all p༜0.05) (Fig. 4a). Also, we investigated the correlation between these lncRNAs and stages, and the results indicated that the expression levels of AP003555.2, BOLA3-AS1, NKILA, LINC00543, and UCA1 were significantly different between at least two stages (Fig. 4b).
Meanwhile, in the external validation group from GEO, we constructed a model and grouped patients with the four prognostic DGIA lncRNAs, comprising BOLA3-AS1, NKILA, LINC00543, and UCA1. The prognosis of HRG was also worse than that of LRG (p༜0.001) (Fig. 5a). The timeROC of 1, 3, and 5 years proved that the model had a promising classification effect, and that of 3 years displayed optimum effects (AUC = 0.83) (Fig. 5b).
Immune infiltration and GSEA within different risk groups
The above-mentioned enrichment investigation demonstrated that DGIA lncRNAs also influences immune regulation. Hence, we evaluated the differences in the infiltration of 22 immune cells in HRG and LRG according to the results of CIBERSORT. The expression of CD8+ T cells, out of all TCGA samples and model group, was lower in HRG than in LRG (all p༜0.05) (Fig. 6). CD8+ T cells are the cytotoxic immune cells that are capable of directly killing tumor cells, and their abundance differences indicated the immune-related reasons for the prognostic difference in HRG and LRG.
To explore the significantly altered MF, BP, and pathways in HRG and LRG, we performed GO- and KEGG- related GSEA. Mainly immune and genomic instability related pathways in LRG were significantly enriched. In GO enrichment terms, immune-related pathways encompassed response to type I interferon (IFN-Ⅰ), natural killer cell activation, T cell activation involved in immune response, etc. Simultaneously, some genomic instability-related pathways were also significantly enriched, including structural constituent of ribosome, transcription elongation from RNA polymerase II promoter, response to dsRNA, and some energy-related pathways in glucose metabolism (Fig. 7a, 7b). In KEGG enrichment terms, besides the regulation of autophagy and cytosolic DNA sensing pathway, which are related to genomic instability, there were also immune-related pathways, including antigen processing and presentation, and cytokine receptor interaction enriched (Fig. 7c). Finally, we also noticed that the CNSMs of LRG was significantly higher comparing with HRG (all p < 0.05) (Additional file 4: Figure S3).