3.1 Possible impact of the different expression clusters of RBPs on glioma immune infiltration
Through univariate regression analysis of 1120 Immune-related RBPs screened from TCGA database, a total of 749 prognostic RBPs were obtained. Considering the impact of different expression clusters of RBP on its associated genes, we performed unpredicted clustering of expression clusters of 749 differential RBPs using the “ConsensusClusterPlus” package, which finally led to the identification of two different modification clusters (see Supplementary Fig. S1A). Cluster A comprised of 187 patients and Cluster B comprised of 211 patients (Figure 2A). Indeed, Cluster A was found to enrich more patients whose tumors were of grade IV exhibited significantly higher mortality compared with Cluster B. On the contrary, Cluster B enriched greater number of patients with tumors classified as grade III, and less frequent mortality. Kaplan-Meier curves indicated that Cluster A conferred a worse prognosis among the two (Figure 2E).
To explore the potential impact of different expression clusters of RBPs on glioma immunity, we did GSVA enrichment analysis. The analysis results showed that the various immune related pathways such as that of antigen processing and presentation, natural killer cell-mediated cytotoxicity, immune deficiency, allograft rejection, graft versus host were found to be enriched in the Cluster A. However, Cluster B was enriched for wingless-type MMTV integration site family (WNT) signaling, Hedgehog (Hh) signaling, calcium signaling, and phosphatidylinositol signaling (Figure 2B). We hypothesized that the different pathways and regulators involved in immunosuppression and immune rejection were enriched in glioma by Cluster A leading to poor prognosis of patients.
Thereafter, heatmap of immune infiltration by CIBERSORT showed a significantly elevated expression of immunosuppressive related cell include regulated T cells (Tregs), M2 macrophages, naive CD4 T cells, non-activated NK cells, and Cells with immune function such as CD8 T cells, activated CD4 T memory cells, gamma delta cells, M0, M1 macrophages and neutrophils, stromal score, and immune score that could be associated with immunosuppression in Cluster A. This cluster was consistent with the previously reported immune phenotype characterized by immunosuppression and immune rejection 23. However, the immune cells that infiltrated in Cluster B included activated NK cells, monocytes, activated dendritic cells, activated mast cells, eosinophils, stromal score, and exhibited a markedly decreased immune score. This cluster showed an innate immune infiltration (Figure 2C). This was consistent with the previously reported immune inflamed phenotype and was primarily characterized by adaptive immune cell infiltration and immune activation(23). These results suggest that immunoinvasive phenotypes of immunosuppression and immunorejection are present in the RBP clustering pattern with poor prognosis. In addition, immune related pathway enhancement was also accompanied by an increased expression of immune checkpoint molecules PD-L1 (Figure 2F). This clustering cluster could effectively distinguish the two samples with different prognostic effects on patients (Figure 2G).
To verify the potential impact of RBPs expression clusters on glioma immune infiltration, we extracted 308 different immune genes associated with 749 RBPs as described above (see Supplementary Table S4). A total of 244 immune genes were found to be associated with prognosis in subsequent univariate regression analyses. Among these, 66 genes were observed to be protective factors with HR <1, while 178 genes were classified as risk factors with HR >1 (see Supplementary Table S5). Moreover, two distinct clusters of modification were identified by clustering on 244 genes (see Supplementary Fig. S1B). Cluster A included 213 patients, and Cluster B contained 185 patients. The 178 risk factors were thereafter enriched in Cluster A, corresponding to patients with tumor grade IV and a higher proportion of deaths. On the contrary, Cluster B enriched 66 different protective factors, corresponding to patients with tumor grade III and a significantly lower mortality rate (Figure 2D). Kaplan-Meier curves indicated that Cluster A conferred a worse prognosis, and there was a better survival advantage observed for Cluster B (Figure 2H).
KEGG enrichment analysis of genes enriched in Cluster A, Cluster B was then performed using the "BiocManager" package of the R software. The results showed that the 143 genes in 178 risk factors of Cluster A constituted background genes in KEGG. Pathways of highly expressed genes enriched in immune deficiency pathways such as EPstein-Barr virus infection(42/143), human T cell leukemia virus 1 infection༈37/143༉, human immunodeficiency virus 1 infection ༈26/143༉, and immune rejection related pathways include allograft rejection༈17/143༉, graft versus host disease ༈17/143༉and autoimmune thyroid disease༈17/143༉ (Figure 2I). However, the pathways associated with highly expressed genes in Cluster B were predominantly enriched in T cell receptor signaling pathway, B cell receptor signaling pathway, Vascular endothelial growth factors (VEGFs) signaling pathway, mitogen-activated protein kinase (MAPK) signaling pathway and natural killer cell cytotoxic activity (Figure 2J).
The subsequent heatmap of immune infiltration by CIBERSORT showed that Cluster A displayed significantly higher stromal score and immune score. Infiltrated immunosuppressive related cell include naive CD4 T cells, non-activated NK cells, M2 macrophages, and Cells with immune function such as CD8 T cells, activated CD4 T memory cells, gamma delta T cells, M0, M1 macrophages and neutrophils exhibited an elevated expression in Cluster A. This cluster was consistent with the previously reported immune phenotype characterized by immunosuppression and immune rejection(23). However, Cluster B possessed lower stromal score and immune score. The plasma cells, activated NK cells, monocytes, activated mast cells and eosinophils were highly expressed in Cluster B (Figure 3A), thus presenting with a phenotype characterized by adaptive immune cell infiltration and immune activation. This expression cluster of immune genes was almost consistent with that observed with RBPs.
To further clarify the correspondence between clusters of RBPs expression and those of relevant immune genes, we did enrichment heatmap of expression clusters of RBPs and immune gene expression clusters. It was noted from the correspondence of RBPs and immune gene expression clusters, immune gene expression Cluster A corresponded directly with the expression Cluster A of RBPs, which enriched more patients with tumor grade IV, and caused higher patient mortality. However, the Cluster B of immune gene expression corresponded to that of RBPs expression, which was primarily enriched in patients with tumor grade III and exhibited relatively low mortality (Figure 3B). Additionally, the alluvial diagram also illustrated this interesting relationship (see Supplementary Fig. S1C). The relationship of protein interaction between RBPs and related immune genes revealed a complex regulatory relationship between them upon analysis (Figure 3C). There was harmonious relationship between them (such as the link between RBP protective factor and immune gene protective factor, RBP risk factor and immune gene risk factor) and antagonistic relationships between them also (such as the link between RBP risk factors and RBP protective factors, and the link between immune gene risk factors and immune gene protective factors).
3.2 Regulation mechanism of different expression patterns of RBPS on immune invasion
To observe whether RBPs can activate inflammation-related factors to construct tumor microenvironment, we investigated the expression of various chemokines and cytokines characterizing three different gene clusters. Cytokines and chemokines selected from the published literature were extracted, among which TGRB1, Smad9, Twist1, CLDN3, TGFBR2, Acta2, COL4A1, Zeb1, and Vim were considered to associate with transcripts of the transforming growth factor (TGF) B/EMT pathway. PD-L1, CTLA-4, IDO1, LAG3, HAVCR2, PD-1, PD-L2, CD80, CD86, TIGIT, and TNFRSF9 have been reported to be involved in transcripts of immune checkpoints. TNF, IFNG, TBX2, GZMB, CD8A, PRF1, GZMA, CXCL9, and CXCL10 were associated with transcripts related to immune activation(24, 25). The results clearly showed that these pathways that were involved primarily in immunosuppression were enriched in RBPs Cluster A and were differentially expressed compared with Cluster B (see Supplementary Fig. S1E). The diagram of protein interaction between RBPs and inflammatory pathway-related genes shows that glioma related RBPs can directly regulate inflammatory pathway-related genes (Figure 3D). This result illustrated that RBPs could activate immune-related pathways to create tumor microenvironment in gliomas.
To verify the possible distribution profile of immunosuppressive cells in immune gene expression clusters, we investigated the expression of different cell surface markers associated with immune regulation. Immune regulatory cells and their markers were obtained from the published literature, including regulatory T cells whose major surface markers include CD25, CCR7, FOXP3, CD4, CD127, CD152, and GITR; Regulatory B cells whose critical surface markers comprise CD19, CD20, CD24, CD27, and CD38; Tumor associated macrophages whose surface markers contain CD14, CD16, CD64, CD86, and CD163; and myeloid derived suppressor cells which can encompass several surface markers including that of CD11B, CD33, CD34, and HLA-DR(26). These factors were all significantly enriched in the expression Cluster A (Figure 3G), and there were marked differences in the expression of these factors between clusters A and B (Figure 3F), thereby illustrating that there was a distinct immunosuppressive phenomenon established in the Cluster A of immune genes.
It has been suggested that stromal activation mediates immunosuppression in tumors(23). To confirm this phenomenon, markers of stromal activation were extracted from previous studies, comprising 141 factors(27) (see Supplementary Table S6). According to the distribution of these factors in the immune gene cluster pattern, stromal activation-related factors were all enriched in the immune gene cluster A pattern, suggest that immunosuppression in high-grade gliomas is associated with tumor stromal activation (Figure 3G). From the protein interaction between RBPs, immunosuppressive cell markers and stromal activation markers, we believe that RBP regulates both immunosuppressive genes and stromal activation-related factors to constitute the immunosuppressive microenvironment in gliomas. The protein interaction between immunosuppressive cell markers and stromal activation markers was stronger than the regulation of immunosuppressive genes by RBPs, that implies the bridging role of stromal activation in tumor immunosuppressive regulation (Figure 3E).
3.3 The specific causes for the regularity of RBP clustering pattern
In order to explore the specific reasons of RBP clustering rules, the lasso regression analysis was applied on 749 genes associated with the prognosis (Figure 3I-J), which resulted in 32 factors with a significant prognostic impact. Among them, 14 factors were found to be protective with HR <1 and 18 were risk factors with HR >1 (Figure 3K). Based on the degree of enrichment of the 32 factors in both the clusters, the risk factors were enriched in Cluster A, while the protective factors were enriched in Cluster B and depicted a low risk (Figure 3H). thereby indicating that the enrichment status of protective and risk factors could potentially determine the expression of clusters and the survival prognosis of patients.
In addition, we further investigated that whether the status of the enrichment of risk and protective factors could significantly change as a result of changes in tumor grade. We included grade II-III glioma samples to contrast with grade III-IV glioma samples. The results showed that the factors of RBPs with prognostic impact were markedly decreased in grade II-III glioma samples compared with grade III-IV glioma samples, and the factors of RBPs were only found to be partially the same between the two groups (see Supplementary Fig. S1F). Among the 32 factors that were found to be significantly prognostic in grade III-IV gliomas, only 24 of them remained prognostic in grade II-III glioma samples, and the remaining eight factors were found to be statistically insignificant (see Supplementary Fig. S1G). There were no significant changes in the conversion of the various protective factors to risk factors or vice versa for any of the 24 factors with consistent effects in grade II-III, and in grade III-IV samples (see Supplementary Fig. S1H). However, for the enrichment of 24 factors in the RBPs clustering of grade II-III gliomas (see Supplementary Fig. S1D), the risk factors were further enriched in the Cluster B of poor OS prognosis, which directly corresponded to the samples with glioma grade III. However, the different protective factors were enriched in Cluster A with a good prognosis, which was consistent to the samples with glioma grade II (see Supplementary Fig. S1H). The results clearly demonstrated that the status of the enrichment of risk factors and protective factors could significantly change with the tumor grade, and the protective factors could always be enriched in the tumor grade with better prognosis.
3.4 Prognostic model
We next constructed a novel prognostic model and to validate it, we proceeded to download the relevant glioma transcriptomic data from CGGA and extracted only a total of 650 expression matrices of the expression data of grades 3-4 with complete clinical information. After extracting the RDB details from transcriptome data, the differential genes were obtained by setting the parameters such that P-value was below 0.01 in both KM and COX methods, and the difference in 5-year survival between the high and low risk groups was > 20%. A total of 395 differential genes were screened and a total of 390 prognosis related RBPs were identified by univariate regression analysis. A total of 190 genes were thus identified by intersecting the prognostic genes in the TCGA and those in the CGGA databases respectively. We screened 17 different genes with the best potential prognostic significance among 190 genes using the lasso regression algorithm. Among them, ARPP21, CHD3, CTIF, SNRPN, and EIF3L acted as protective factors with HR < 1, while ASPM, BRCA1, CLIC1, DDX60L, ERI1, KIF4A, MBD2, MLEC, NCAPD2, PXN, REXO2, and TOP2A were found to be risk factors with HR >1 (Figure 4A). The expression profiles of these 17 RBPs in the tumors have been shown in Figure 4B. The forest plots of the prognostic impact of these 17 factors in the CGGA database and their expression profiles have been shown in Supplementary Fig. S1I-J.
We further analyzed the relationship between the expression levels of 17 factors in the prognostic model and immune subtypes. Among the risk factors, REXO2, MBD2, ERI1, KIF4A and TOP2A were all highly expressed in the C5 immune subtype, which corresponded to the immune silent type, and was characterized by M2-type macrophages as the dominant infiltrating cells. CLIC1 was highly expressed in the C4 immune subtype, which corresponded to the lymphocyte depletion type. DDX60L was highly expressed in the C6 immune subtype, which was classified as TGF- B dominant type. Both C4 and C5 immune subtypes are common in gliomas. Among the protective factors, ARPP21 and CTIF were highly expressed in C1 immune subtype, which corresponded to tissue healing type and had Th2-biased acquired immune invasion. CHD3 is highly expressed in the C2 immune subtype, which corresponds to IFN-γ dominant type (Figure 4C). The rest of the factors including ASPM, BRCA1, MLEC, NCAPD2, PXN, EIF3L and SNRPN in immune express no obvious difference between different subtypes. The immune subtypes belonging to these 17 factors were consistent with the immune types corresponding to our previous RBP clustering model.
We identified five distinct prognostic factors by using multiple stepwise Cox regression analysis. CTIF and EIF3L were protective factors with HR <1 in the TCGA (Figure 4D) and CGGA (Figure 4E) databases, while CLIC1, PXN, and TOP2A were considered as potential risk factors with HR >1. These were consistent with their attributes in the model. Kaplan-Meier survival curves in the TCGA and CGGA datasets further indicated that high expressions of CTIF and EIF3L were favorable for patient survival, while high expressions of CLIC1, PXN, and TOP2A resulted in poor outcomes (Figure 5J-K). We calculated the risk score for each patient according to the following formula:
Risk score=(0.0060*ExpCLIC1) + (-0.0679* Exp CTIF) + (-0.0139*Exp EIF3L) + (0.0381* Exp PXN) + (0.0192* Exp TOP2A)
After dividing the patients into different high- and low-risk groups according to the median value of the risk score in the cohort, the Kaplan-Meier survival curve indicated that patients in the high-risk group displayed a worse prognosis (Figure 4F). In the analysis of the relationship between the high-low risk group and the immune subtype, we found that the high-risk group in the line-up was highly expressed in the C5 type of the immune subtype, while the low-risk group had no obvious enrichment characteristics in the C1-C6 immune subtype (Figure 5D). In addition, the high-risk group had higher immune and matrix scores (Figure 5F-G). This result also confirms our previous inference that RBP can construct an immunosuppressive microenvironment through matrix activation, leading to poor prognosis of patients. The AUC of ROC curve in the RBP score system was observed to be 0.893 (Figure 4G), thereby indicating a nice diagnostic performance. In addition, the risk curves and gene expression heatmaps of the high-risk and low-risk groups were clearly distinguished by the risk scores derived from the five RBPs as shown in Figure 4H.
To further validate the predictive ability of prognostic mode for clinical predictive power, we used the CGGA dataset. The risk scores for patients in the CGGA cohort were calculated using the same formula, and patients were then segregated into high- and low-risk groups based on the obtained median value. The results showed that glioma patients in the high-risk group displayed a worse OS and survival rates compared with those in the low-risk group (Figure 5A). The AUC of ROC curve was 0.771 (Figure 5B), thus clearly illustrating that this RBP signature possessed a stable predictive ability in glioma patients. Furthermore, the risk curves and gene expression heatmaps of the high-risk and low-risk groups were markedly distinguished based on the risk scores constructed using the five RBPs as shown in Figure 5C.
To assess the independent prognostic value of the RBP signature, we next performed univariate and multivariate Cox analyses in the TCGA cohort. TCGA dataset results indicated that the RBP risk stratification method acted not only an OS-related prognostic factor in glioma, which was validated as an independent prognostic factor for glioma patients in each independent and combined cohort (Figure 5H-I). Among them, findings of univariate analysis indicated that HR was 1.2949 (1.2380-1.3544), P= 1.86E-29, and multivariate analysis showed that HR was 1.1564 (1.0754-1.2434), P= 8.71E-05. The similar results were obtained during the validation of the CGGA dataset (see Supplementary Fig. S2A-B). A nomogram model was built for clinically predicting the prognosis of high-grade LGG patients using the 5-signature RBPs (Figure 5E).
3.5 Clinical validation and drug response characteristics of RBP-related prognostic models
Immunohistochemical findings for CLIC1, CTIF, EIF3L, PXN, and TOP2A were obtained from the Human Protein Atlas database (https://www.proteinatlas.org/). Moreover, compared with the normal brain glial cells, CLIC1 and EIF3L exhibited slightly elevated expression, whereas PXN and TOP2A showed strong high expression, and CTIF showed decreased expression in high-grade glioma (Figure 6A). Additionally, clinical high-grade glioma samples (one was grade III and two were grade IV) further confirmed the mRNA (see Supplementary Fig. S2C-G) and protein expression level of the above factors in high-grade glioma (Figure 6B-F). The mRNA and protein expression level of CLIC1, PXN and TOP2A showed high expression, EIF3L showed decreased expression in high-grade glioma. CTIF only detected high expression of mRNA in tumor, but not protein expression was detected. Finally, we predicted the drug sensitivity of 5 factors in the line-up, PXN was more sensitive to Dimethylaminoparthenolide, Carmustine, Oxaliplatin, Imexon and Ifosfamide (COR <-0.45, P <0.001) (Figure 6G-K), whereas EIF3L was more sensitive to Chelerythrine (COR =0.481, P <0.001) (Figure 6L). No significant correlation was found between CLIC1,CTIF and TOP2A and the included drugs (COR >0.45 or COR <-0.45).