Data Preparation
We collected the information of 511 patients (529 samples) was from the TCGA-LGG dataset. Due to the limited mutation data, 495 patients were finally enrolled. Moreover, we recorded another 1,018 patients for external validation from the CGGA database (mRNAseq_693 and mRNAseq_325). All transcriptome expression data among datasets accepted the following transition before data analysis: .
We acquired the immune infiltration status of 529 samples through 'CIBERSORT' analysis. 314 out of 529 samples (305 out of 511 patients) passed the verification process (p<0.05). Furthermore, we calculated the stromal, immune, and ESTIMATE scores. The Pearson correlation plot of immune cell types among all patients was shown in Figure 1C.
Immune Cell Infiltration Cluster
Four ICI clusters were built based on the types of infiltrated immune cells among all 305 patients. The consensus matrix plot was shown in additional files [see Additional file 1]. Heatmap of four immune cell infiltration (ICI) clusters were plotted in Figure 1A. Survival curves of each ICI cluster were plotted and shown in Figure 1D. Significant differences were observed between ICI cluster A-B (log-rank test; p=0.003; Figure 1D), B-C (log-rank test; p<0.001), and C-D (log-rank test; p=0.034).
The differences in prognosis among different ICI clusters might result from the differences in specific genes. Moreover, we compared these clusters in immune-checkpoint gene expression and immune cell types. Cluster B showed significantly higher gene expression levels in PD-1, PD-L1, CTLA-4, IDO1, and CD161 expression than the rest three clusters (Kruskal-Wallis test; p<0.01; Figure 1E-I). Cluster C had a better prognosis than cluster D. However, the differences between the two clusters were not significant enough in PD-1, CTLA-4, and IDO1 expression (Kruskal-Wallis test; p>0.05). We further compared the ICI clusters in immune cell types. ICI cluster B had significantly more infiltrated immune cells in the following types: naïve B cells (Kruskal-Wallis test; p<0.01; Figure 1B), CD8 T cells (Kruskal-Wallis test; p<0.001), resting memory CD4 T cells (Kruskal-Wallis test; p<0.001), M0 macrophage (Kruskal-Wallis test; p<0.001), M1 macrophage (Kruskal-Wallis test; p<0.001), and neutrophile (Kruskal-Wallis test; p<0.001). And cluster B also got the highest stromal and immune score (Kruskal-Wallis test; p<0.001). Cluster C had more activated mast cells and eosinophils (Kruskal-Wallis test; p<0.001) with a better prognosis than other clusters. For cluster D, which received a relatively worse prognosis, had the most M2 macrophage than other clusters (Kruskal-Wallis test; p<0.001).
Gene Expression Cluster
To identify the critical prognostic factor among the ICI clusters, we selected the genes that differed in expression levels for further analysis. We reclassified all 511 patients into two consensus clusters according to listed genes, including gene cluster A (397 patients) and B (114 patients). The consensus matrix plots for gene clusters were also shown in additional files [see Additional file 1]. The heatmap of gene clusters was shown in Figure 2A. We further classified genetypes according to the correlation coefficients between expression level and gene cluster. Genetype A included data of positive coefficients, and genetype B consisted of negative-coefficient data. We conducted intersection (495 patients) of the gene cluster dataset (511 patients) and clinical dataset (495 patients) for Kaplan-Meier analysis. And gene cluster A (383 patients) acquired significantly higher survival probabilities than gene cluster B (112 patients) (log-rank test; p<0.001; Figure 2B). Gene ontology enrichment analysis showed the high-level expression pathways concerning biological process, cellular component, and molecular function in both gene cluster A and B groups (Figure 2C). All pathways listed in the bubble plot were significantly highly expressed in tumor tissues.
Comparison results of immune cell types and expression of immune-checkpoint genes were shown in Figure 2D-H. Gene cluster B showed worse survival outcome with a higher level of immune cells in the following types: naïve B cells (Kruskal-Wallis test; p<0.01; Figure 2D), CD8 T cells (Kruskal-Wallis test; p<0.001), resting memory CD4 T cells (Kruskal-Wallis test; p<0.001), activated memory CD4 T cells (Kruskal-Wallis test; p<0.05), regulatory T cells (Kruskal-Wallis test; p<0.05), M0 macrophage (Kruskal-Wallis test; p<0.001), M1 macrophage (Kruskal-Wallis test; p<0.001), resting mast cells (Kruskal-Wallis test; p<0.01), eosinophils (Kruskal-Wallis test; p<0.05), and neutrophils (Kruskal-Wallis test; p<0.001). Stromal and immune scores of gene cluster B were significantly higher than cluster A (Kruskal-Wallis test; p<0.001; Figure 2D). As for immune-checkpoint genes, gene cluster B also had relatively higher expression levels. The relevant genes included PD-L1 (Kruskal-Wallis test; p<0.001; Figure 2E), CTLA-4 (Kruskal-Wallis test; p<0.001; Figure 2F), IDO1 (Kruskal-Wallis test; p<0.001; Figure 2G), and CD161 (Kruskal-Wallis test; p<0.001; Figure 2H).
Immune Cell Infiltration Score
To better describe the correlations between gene clusters and expression levels, we divided the patients by the correlation coefficients into two genetypes through Boruta analysis (Boruta analysis showed significantly better performance in large number gene weighting than traditional regression models[20]). Further reduction of dimensions was conducted through PCA. The ICI scores (based on results of Boruta and PCA) contained the information of prognostic genes among LGG patients. The relationship among ICI score, gene cluster, and the prognosis was shown in the alluvial plot (Figure 3A). The high ICI score group had significantly lower survival probabilities than the low ICI score group (log-rank test; p<0.001; Figure 3B). Moreover, the high ICI score group showed higher gene expression in CXCL10, PD1, IDO1, LAG3, GZMB, GZMA, CD8A, HAVCR2, CXCL9, CTLA4, PRF1, and PD-L1 (Kruskal-Wallis test; all p values<0.001; Figure 3C). Enrichment analysis was conducted at the pathway level, and the immune-related highly-enriched pathways for the high ICI score group were shown in Figure 3D.
Tumor Mutation Burden
Mutation information for both the high ICI score group and low ICI score group was firstly analyzed through the 'maftool' R package, and the oncoplots for each group were plotted in Figure 4A and 4B. Mutations in IDH1, TP53, ATRX, and CIC frequently happened in both groups, and non-missense mutation occurred most frequently in ATRX. Further, we calculated the tumor mutation burden value, and the high ICI score group got significantly higher TMB values (Kruskal-Wallis test; p<0.001; Figure 4C). The correlation coefficient between TMB values and ICI scores was 0.17 (Spearman test; p<0.001; Figure 4D). Patients with high TMB values acquired significantly lower survival probabilities (log-rank test; p<0.001; Figure 4E). To better describe the prognostic relationship of both ICI scores and TMB values, patients were divided into four groups. High TMB values with high ICI score subgroup acquired the minuscule survival profits (log-rank test; p<0.001; Figure 4F), and low values for both TMB and ICI scores received the most survival benefits (log-rank test; p<0.001; Figure 4F).
Clinical Features and ICI Score Group
The differences in prognosis between high and low ICI score groups were significant. Nevertheless, there still might be selection bias during the enrolling process. The limited number of patients prevented us from conducting propensity score matching (PSM) to minimize the effect of selection bias. Therefore, we further compared patients' survival outcomes with different ICI scores in multiple clinical feature subgroups. Significant differences were observed between high and low ICI score subgroups in different groups divided by age (log-rank test; p≤0.001; Figure 5B and C), gender (log-rank test; p<0.001; Figure 5E and F), and histology (log-rank test; p≤0.01; Figure 5H-J). And we also observed that the ICI score distribution was not relatively even among patients with different clinical features. Patients in the following groups had significantly lower ICI scores: age < 56 years (vs. ≥ 56 years; Wilcox test; p<0.01; Figure 5A) and oligodendroglioma (vs. astrocytoma; Wilcox test; p<0.01; Figure 5G).
Nomogram Model
Clinical features and ICI score group were analyzed through the LASSO regression, univariate Cox regression, and multivariate Cox regression. Age, grade, and ICI score were identified as independent prognostic factors (Wald test; all p values<0.05; Table 1). We randomly divided all 495 available patients into the training group (248 patients) and test group (247 patients) based on the three risk factors. The nomogram model was built based on the training group (Figure 6A). Further internal validation was conducted among 247 patients in the test group. The 1-year, 3-year, and 5-year ROC curves in the training and test groups were shown in Figure 6B and C. The AUC values were 0.895, 0.754, and 0.73 for the training set. And the AUC values of internal validation were 0.851, 0.86, and 0.768 for 1-year, 3-year, and 5-year truncation time. We further draw the calibration curve for internal validation to correct the model's possible bias (Figure 6E). Figure 6G showed the 3-year time-dependent DCA curve in internal validation, and the rest DCA curves were shown in additional files [see Additional file 1]. External validation was conducted based on the 591 patients from the CGGA database. The ROC was plotted in Figure 6D, and the AUC value was 0.725, 0.744, and 0.735 for 1-year, 3-year, and 5-year truncation time. The calibration curve and time-dependent DCA of external validation were shown in Figure 6F and H.