Identification of prognostic m7G-related DEGs in the TCGA cohort
We compared the expression levels of 29 m7G-related genes between 539 tumor samples and 72 normal samples in the TCGA-KIRC data and identified 21 differentially expressed genes (DEGs) in R (P < 0.05). Of these, 9 genes (NUDT10, NUDT4, EIF4E, NUDT16, NUDT3, DCPS, NCBP1, WDR4, LSM1) were down-regulated and 12 genes (SNUPN, NCBP3, EIF3D, DCP2, METTL1, IFIT5, NSUN2, AGO2, GEMIN5, NCBP2L, NUDT11, EIF4A1) were up-regulated in tumor tissues. We next obtained 18 prognosis-related genes (DCP2, NCBP2, LSM1, LARP1, NUDT3, WDR4, NUDT10, EIF4E, EIF4E1B, NUDT16, NUDT4, GEMIN5, IFIT5, CYFIP1, EIF4A1, METTL1, EIF4E3, NUDT11) at adjusted P < 0.1. Intersecting the two gene sets in a Venn diagram yielded 13 genes (Fig. 2A, Fig. 2C); the correlation between these genes is shown in Fig. 2B. The 13 prognostic m7G-related DEGs are DCP2, EIF4A1, EIF4E, GEMIN5, IFIT5, LSM1, METTL1, NUDT10, NUDT11, NUDT16, NUDT3, NUDT4 and WDR4 (Fig. 3A). The interaction network between these genes was constructed with the online database STRING (https://cn.string-db.org/) (Fig. 3B).
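The Venn-diagram step is a plain set intersection and can be reproduced from the gene lists reported above. The following is a minimal Python sketch for checking the counts; the original analysis was performed in R, and the variable names are illustrative only.

```python
# Gene lists copied from the text above; variable names are illustrative.
down_regulated = {
    "NUDT10", "NUDT4", "EIF4E", "NUDT16", "NUDT3",
    "DCPS", "NCBP1", "WDR4", "LSM1",
}
up_regulated = {
    "SNUPN", "NCBP3", "EIF3D", "DCP2", "METTL1", "IFIT5",
    "NSUN2", "AGO2", "GEMIN5", "NCBP2L", "NUDT11", "EIF4A1",
}
prognostic = {
    "DCP2", "NCBP2", "LSM1", "LARP1", "NUDT3", "WDR4", "NUDT10",
    "EIF4E", "EIF4E1B", "NUDT16", "NUDT4", "GEMIN5", "IFIT5",
    "CYFIP1", "EIF4A1", "METTL1", "EIF4E3", "NUDT11",
}

# All differentially expressed genes, regardless of direction.
degs = down_regulated | up_regulated

# Genes that are both differentially expressed and prognosis-related.
intersection = sorted(degs & prognostic)

print(len(degs), len(prognostic), len(intersection))  # 21 18 13
print(", ".join(intersection))
```

The sorted intersection reproduces the 13 genes listed in the text.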
Development of a Prognostic Gene Model in the TCGA Cohort
LASSO Cox regression analysis of the expression profiles of the above 13 genes was used to establish a prognostic model, yielding a 9-gene signature (Fig. 4A, 4B, 4C). After standardizing the TCGA expression data with the "scale" function in R, we calculated a risk score for each patient and divided the patients into two subgroups at the median risk score: 265 patients in the high-risk group and 269 patients in the low-risk group. Kaplan-Meier analysis with the survival package showed that OS was significantly lower in the high-risk group than in the low-risk group (P < 0.001, Fig. 5A), indicating that high-risk patients have a higher probability of death than low-risk patients. To evaluate the sensitivity and specificity of the prognostic model, we performed time-dependent ROC analysis for 1-year, 2-year and 3-year survival; the AUCs were 0.672, 0.636 and 0.642, respectively (Fig. 5B). Patients in the high-risk group had more deaths and shorter survival times than those in the low-risk group. Consistently, PCA and t-SNE analysis showed that patients in the two risk groups were distributed in different directions (Fig. 5C-F).
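The risk-score calculation and median split can be sketched as follows. This is an illustrative Python example under stated assumptions, not the original R code. The gene names, coefficients and expression values are hypothetical, since the fitted LASSO coefficients are not reported here. The score is assumed to be the coefficient-weighted sum of standardized expression values, with standardization by the sample standard deviation as in R's "scale". Assigning scores above the median to the high-risk group is also an assumption.

```python
from statistics import mean, median, stdev

# Hypothetical coefficients and toy expression data (not from the study).
coefficients = {"GENE_A": 0.30, "GENE_B": -0.20}
expression = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],  # one value per patient
    "GENE_B": [5.0, 3.0, 4.0, 1.0],
}

def standardize(values):
    """Center and scale by the sample standard deviation (n - 1)."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]

def risk_scores(expression, coefficients):
    """Coefficient-weighted sum of standardized expression per patient."""
    scaled = {g: standardize(expression[g]) for g in coefficients}
    n_patients = len(next(iter(scaled.values())))
    return [
        sum(coefficients[g] * scaled[g][i] for g in coefficients)
        for i in range(n_patients)
    ]

def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the median."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]

scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
print(groups)
```

With real data, scores tied at the median fall into the low-risk group under this rule, which is one way a median split can give unequal group sizes.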
Independent Prognostic Value of the Risk Model
Univariate and multivariate Cox regression analyses were used to identify independent prognostic factors. In the univariate Cox analysis, risk score, risk level, tumor stage, tumor grade, T stage and M stage were significantly associated with prognosis, whereas age, gender and N stage were not (Fig. 6A). In the multivariate Cox regression analysis, however, only tumor grade (HR = 1.6658, 95% CI = 1.0674-2.5996, P = 0.0246) and risk level (HR = 2.4149, 95% CI = 1.5176-3.8428, P < 0.001) remained independent prognostic factors (Fig. 6B). In addition, a heatmap of the clinical characteristics of the TCGA cohort showed significant differences between the two subgroups (Fig. 6C).
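As a consistency check on the multivariate results, the reported P values can be approximately recovered from the hazard ratios and their confidence intervals. The Python sketch below assumes the intervals are Wald intervals, symmetric on the log scale. That is a common convention for Cox models, but it is an assumption here and is not stated in the text.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal

def wald_from_hr(hr, lower, upper):
    """Recover log-HR, its standard error, z and two-sided P from HR and CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p

# Values reported for the multivariate Cox model.
grade = wald_from_hr(1.6658, 1.0674, 2.5996)
risk = wald_from_hr(2.4149, 1.5176, 3.8428)

print(f"tumor grade: z = {grade[2]:.3f}, P = {grade[3]:.4f}")
print(f"risk level:  z = {risk[2]:.3f}, P = {risk[3]:.5f}")
```

Under this assumption the recovered P for tumor grade is approximately 0.025 and the P for risk level is below 0.001, in line with the reported values.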
Functional analysis based on the risk model
Using the limma package in R, we identified 87 DEGs between the high- and low-risk groups in the TCGA cohort, including 79 down-regulated genes and 8 up-regulated genes. In GO analysis, only cellular component (CC) terms were enriched, mainly the synaptic membrane (Fig. 7A). We therefore performed KEGG analysis with KOBAS and found that the genes were mainly enriched in human immunodeficiency virus 1 infection (Fig. 7B). These results suggest that m7G-related DEGs play an important role in immune function in KIRC. To further explore the correlation between risk score and immune status, we used ssGSEA in R to compare the enrichment scores of 16 immune cell types and the activity of 13 immune-related pathways between the risk groups in the TCGA cohort. Interestingly, the scores of several immune cell types, including CD8 T cells, pDCs and Tfh cells, differed significantly between the low-risk and high-risk groups (all adjusted P < 0.05, Fig. 7C). Moreover, pathways involved in cytokine-cytokine receptor interaction, such as APC co-stimulation, CCR, inflammation-promoting and T cell co-stimulation, scored higher in the high-risk group, and the check-point score also differed significantly between the groups (P < 0.05, Fig. 7D).