Expression of pyroptosis-related genes was upregulated in ccRCC
With the aim of elucidating the role of pyroptosis-related genes in the course of ccRCC, we first identified 51 such genes by reviewing published literatureKarki and Kanneganti 2019, Xia, Wang 2019, Man and Kanneganti 2015, Ruhl, Shkarina 2018, Hu, Chen 2020, Zhou, Zhang 2018, Tan, Huang 2020), overlapped them with differentially expressed genes (DEGs) obtained from gene profiles of 71 sets of paired samples in TCGA database, and acquired an intersection set of 43 genes (p < 0.01; Figure 1A). Further, we measured their expression in 71 sets of paired samples and found that 37 genes (VPS4A, CASP5, CHMP6, CASP1, IL18, GPX4, CHMP2A, IRF2, CASP4, HMGB1, NLRC4, IRF1, PLCG1, NLRP6, NLRP7, GSDMC, CASP3, NOD1, AIM2, GSDMD, CASP6, GSDMA, IL1A, GZMB, GSDMB, NLRP1, TIRAP, PRKACA, IL1B, NOD2, PYCARD, CHMP4A, NLRP3, CASP8, CHMP4B, SCAF11, and BAX) were upregulated in tumour tissues, and six other genes (CHMP2B, DFNB59, PLCG2, CYCS, NLRP2, and TP63) were downregulated in tumour tissues. Further, we intended to clarify the interplay of selected genes through protein–protein interaction (PPI) analysis, as presented in Figure 1B. We set the highest confidence at 0.9 of interaction score for PPI analysis. Figure 1C depicts the correlation network containing all pyroptosis-associated genes. The results reveal the biological role of pyroptosis-related genes in ccRCC development.
Consensus clustering for pyroptosis-related genes with the characteristics and survival of patients with ccRCC
Further, to investigate the relationship between the expression level of pyroptosis-related genes and clinical characteristics, we conducted a consensus clustering analysis. After testing the clustering potency according to the similarity displayed by the expression levels of DEGs and the proportion of ambiguous clustering measures by setting k from 2 to 9, k = 2 was identified as optimal clustering stability (Figure 2A- B). Thereafter, patients were divided into two clusters; cluster 1 (n = 142) and cluster 2 (n = 352). The gene expression profiles and clinical characteristics of patients are presented in a heatmap (Figure 2C). The individual gene expression profiles and the clinical characteristics including survival status (alive or dead), degree of tumour differentiation (G1–G4), tumour-node-metastasis (TNM) stage, M stage, T stage, age (<60 or ≥60 years), and sex are shown in the heatmap; differences in clinical features between clusters are insignificant. Overall survival (OS) between clusters was not significant (p = 0.394; Figure 2D).
Development and validation of a risk-score prognostic model in TCGA cohort
Further, we attempted to determine the prognostic value of pyroptosis-related genes in ccRCC patients. The 494 patients were randomly distributed in the TCGA training cohort with a validation cohort in the ratio of 7:3. All clinical characteristics including age, sex, T stage, M stage, grade, and TNM stage were found to have no statistical significance between the two cohorts (all p > 0.05, Table S1). To construct a more accurate prediction model of clinical outcome correlated with pyroptosis-related genes, we conducted the least absolute shrinkage and selection operator regression analysis on 43 DEGs based on the TCGA training cohort and selected six genes (AIM2, BAX, GSDMB, NLRP6, SCAF11, TIRAP; Figure 3A-B). Cox univariate analysis was performed on these six genes, and five variates (those with p < 0.2) were subjected to the Cox multivariate analysis, giving a coefficient to each variate. AIM2 and GSDMB were linked to an increased risk (HRs > 1), whereas NLRP6, SCAF11, and TIRAP may be protective genes (HRs < 1). The results of all Cox univariate and multivariate analyses are shown in Table S3. Using the coefficient of the final two variables, we constructed a two-gene risk score model. The risk score of each patient was calculated using the following equation: risk score = (1.035 ´ AIM exp) + (1.091 × GSDMB exp). We then split the TCGA training and validation cohorts into two groups (high and low) from the median risk score of the TCGA training cohort. Risk score, immunoscore, grade, TNM stage, M stage, N stage, age, and sex are presented in a heatmap (Figure 3C), which displays the upregulation of AIM2 and GSDMB in the high-risk score group. The distribution of the risk score, OS, OS status, and expression profiles of the two-gene risk-score model in TCGA training and validation cohorts are displayed in Figures 3D and 3E. Low-risk groups in both the training and validation cohorts were associated with a better OS (p < 0.001, Figure 3F–G). In an attempt to test the prognosis accuracy of this two-gene model, we carried out 1-, 3-, and 5-year receiver operating characteristic curve analyses by comparing the respective area under curve (AUC) values. In the TCGA training cohort, the 1-, 3- and 5-year AUC values for the two-gene risk-score model were 0.680, 0.626, and 0.675, respectively (Figure 3H). In the TCGA validation cohort, they were 0.661, 0.644, and 0.657, respectively (Figure 3I). The AUC values showed that the two-gene risk score model had a satisfactory discrimination potential for ccRCC patients. These results demonstrate that the two-gene risk-score model is a potent predictor of ccRCC prognosis.
Prognostic risk scores correlated with immunoscore and PBRM1 mutation in ccRCC
The correlation between risk score and clinical characteristics was further studied. We identified DEGs between high and low-risk subgroups in TCGA training and validation cohorts separately using Wilcox test with a threshold of p value<0.05 and |log2FoldChange|≥1, and obtained an intersection set of 1126 genes. Among them, 1095 were upregulated in the high-risk score subgroup and 31 were downregulated (Figure 4A; Additional file 4: Table S4). DEGs were then subjected to GO and KEGG analysis, and were found to be enriched and regulates T cell activation and primary immunodeficiency (Figure 4B). This led us to further examine the immunoreactive differences between the high-and low-risk subgroups. Hence, we compared the immunoscore of the two groups and found that the immunoscore of the low-risk subgroup was distinctly lower than that of the high-risk subgroup (p < 0.0001, Figure 4D). Contrary to other cancers, a higher immunoscore represents more infiltration of immune cells in the tumor microenvironment and a better prognosis. However, in the case of ccRCC, higher immunoscores are associated with a worse prognosis. To better explain this phenomenon, we compared PBRMI mutation differences between high- and low-risk subgroups based on the available literature(Braun, Hou 2020). We compared the mutations in TCGA training cohorts and found that the percentage of patients with PBRM1 mutation in the low-risk subgroup was distinctly higher than that in the high-risk subgroup (Chi-square p < 0.001, Figure 4C, 4E)
Relationships between risk scores and infiltration abundances of nine immune cell types
We then determined the differences in the immune microenvironment between the two groups. We used the CIBERSORT algorithms to calculate the infiltrating levels of 22 immune cell types in TCGA training cohort, with p < 0.05, as the threshold. Moreover, the proportion of the eight immune cell types was significantly different in different subgroups (Figure 5A). The results show that risk scores are negatively related to proportions of CD4+ T memory resting cell, macrophages M0, macrophages M2, mast resting, and naive B cells, and positively related to CD8+ T, T follicular helper, gamma delta T, and regulatory T cells (Figure 5B-J). Such evidence suggests that our risk-score model based on pyroptosis-related genes is related to the immune microenvironment of ccRCC. Utilising the Tumour Immune Estimation Resource database, we demonstrated the correlation between two hub genes (AIM2 and GSDMB), tumour purity, and five infiltrating immune cells. Expression of AIM2 and GSDMB was negatively correlated with tumour purity (Spearman rho = -0.284 and -0.04, respectively; all p < 0.001), positively related to infiltrating regulatory T cells, T follicular helper cells, and CD8+ T cells, and negatively related to CD4+ T cells and B naive cells (all p < 0.05). The above results demonstrate that immune infiltration plays a major role in ccRCC development.
Risk score-based treatment strategy for ccRCC
We analysed the expression levels of four types of immune checkpoint molecules, including CTLA4, PD1, PD-L1, and PD-L2, between two subgroups of different risk scores. The high-risk subgroup was proved to possess higher expression of immune checkpoint molecules (Figure 6A), and the risk scores were positively correlated with the expression of immune checkpoint molecules (Figure 6B). In addition, we used the Cancer Immunome Atlas (TCIA)-recommended immunophenoscore (IPS) scheme to evaluate the immunogenicity of the two groups. Ips_ctla4_neg_pd1_pos, Ips_ctla4_pos_pd1_neg, and Ips_ctla4_pos_pd1_pos reached higher scores in the high-risk group (Figure 6C). These results might indicate a potentially better response to immunotherapy among patients in the high-risk group.
Expression validation of the two hub genes by quantitative real-time PCR
To better describe the expression levels of two hub genes in cancerous and normal renal tissues, we collected 12 normal renal samples and 12 ccRCC samples. Compared to normal tissues, the expression levels of AIM2 and GSDMB were increased in tumour samples (Figure 7 A-B, all p < 0.05). Moreover, a higher expression of GSDMB and AIM2 was associated with a poor prognosis (HR = 1.6 and 1.7, respectively; all p < 0.01)
Independent prognostic value of the risk score model and development of a predictive nomogram in TCGA cohort
We set out to test the qualification of our two-gene risk-score model in the prediction of ccRCC patient prognosis by univariate and multivariate Cox regression analysis in TCGA training cohorts, judging from the hazard ratios (HR) and the 95% confidence intervals (CI) of the risk score, both analyses demonstrated our model’s decent prognostic efficiency independent of clinical characteristics including sex, age, TNM stage, and grade. The HRs of the two analyses were 2.074 95% CI: 1.413–3.044 (p < 0.001) and 1.752 95% CI: 1.018–2.599 (p < 0.01; Figure 8A and 8B). All Cox univariate and multivariate analysis results of TCGA validation cohorts are shown in Figure S1(Additional File 5). To complement the prognostic model, we constructed a nomogram to predict ccRCC prognosis based on Cox regression analysis of TCGA training cohort. Our nomogram considered two predictive factors: pathologic grade and prognostic model (Figure 8C). Each variate was graded based on a multivariate Cox regression analysis. The overall nomogram score was the sum of the individual scores of all the predictive variables. Higher nomogram scores indicate worse 1-, 3-, and 5-year survival rates and vice versa (Figure 8D–F). The C-index of the nomogram for prediction was 0.68 (95% CI: 0.656–0.704) and a calibration plot in fine consistency with the ideal curve proved our nomogram to be stable in predicting the clinical prognosis of ccRCC patients.