1p/19q co-deletion and differentially expressed immune genes
Analysis of survival in the CGGA datasets, revealed significantly lower OS in cases with 1p/19q co-deletion relative to those lacking the codeletion. Interestingly, outcomes were markedly in grade II and III tumors with 1p/19q co-deletion relative to grade IV tumors (Fig 1a). Univariate and multivariate Cox analyses revealed 1p/19q co-deletion as an independent prognostic factor for LGGs (Fig 1b-c). Differential gene expression analysis revealed that 551 DEGs between 1p/19q co-deletion samples (n=191) and non-codeletion samples (n=393). Of the 551, 56 are immune-related genes. No statistically significant differences were noted in co-deletion vs non-codeletion samples with regards to sex, age, primary or recurrent type, chemotherapy or radiotherapy status (Table 1).
Elucidation and internal validation of the immune-related prognostic signature
Univariate Cox proportional hazards regression and LASSO regression analyses of candidate genes revealed 23 genes that were screened in multivariate Cox proportional hazards regression. Thirteen immune-related genes associated with coefficients were included in the prognostic signature (Table 2). KM analysis of the low and high-risk LGG samples using log-rank test revealed that 1, 3 and 5-year survival rates in the low-risk group were 97%, 89%, and 79%, respectively, while in the high risk groups survival rates were 81%, 50% and 34%, respectively (Fig 2a). The prognosis was significantly better in cases with lower risk scores, indicating that the risk score negatively correlated with OS. AUC values for the signature’s prediction of 1, 3, and 5-year survival were 0.818, 0.793, and 0.750, respectively (Fig 2b). Indicating high risk score correlated with decreased survival (Fig 2c). The heatmap depicted the visual difference trends of transcript expression of genes incorporated in the signature between the high- and low-risk categories (Fig 2d). The violin plot presented a statistically differential expression between the two categories (Fig 2e).
External validation of the prognostic signature
The 1, 3 and 5-year survival in the TCGA database was 99%, 89%, and 76% respectively, while in the testing cohort, the corresponding survival rates were only 84%, 51%, and 36%, respectively (Fig 3a). The capacity of the testing cohort to predict survival was very similar to that of the training cohort. ROC curve analysis was used to validate prediction accuracy. The AUC values for 1, 3 and 5-years survival were 0.896, 0.785 and 0.708, respectively (Fig 3b). A similar trend was observed in the testing cohort (Fig 3c-e).
Evaluation of the independent prognostic value
Correlation between risk score and clinical parameters, including age, sex, radiotherapy and chemotherapy status, tumor grade, primary or recurrent types and IDH mutation status, was assessed. The value of risk score was lower in chemotherapy, 1p/19 codeletion, tumor grade II, primary and IDH mutation categories (p < 0.05, Fig 4a). Univariate and multivariate Cox proportional hazards regression indicated that the risk score phenotype had independent prognostic value in the training testing cohort (Fig 4b-e).
Nomogram analysis
Nomogram analysis, using 5 prognostic markers (age, tumor grade, primary or recurrent type, risk score, and 1p/19 codeletion status), was used to predict survival in the training set. Among these prognostic factors, risk score ranked a vital proportion in the total points (Fig 5a). To validate the accuracy of the individual assessment, concordance index (C-index) and calibration curve of the nomogram were evaluated for internal validation. The C-index of the nomogram was 0.794. The visualized calibration curve for probabilities for 1, 3 and 5-year OS revealed good agreement between the predicted nomogram and actual survival (Fig 5b-d). Additionally, internal validation was done by randomly sampling 50% of the CGGA samples. The C-index was 0.797, and the calibration curves had goodness-off-fit (Fig 5e-g).
GO term analysis
503 genes were differentially expressed between low and high-risk groups. Of these, 255 had a log2FC >1.2, and 166 of them were included in the term GO analysis. The GO term analysis produced 33 terms that had gene counts >10, 12 of which (including 37 DEGs) were associated with immune-related terms, including B cell receptor signaling pathway, lymphocyte-mediated immunity and humoral immune response (Fig 6).
Elucidation of immune checkpoints and risk score
Next, the relationship between 7 established immune checkpoints and risk score was evaluated (Fig 7). Expression of immune checkpoint genes, (except TIGIT), in the low-risk vs high-risk samples was statistically significant in the training and testing sets (Fig 7a-b). TIM3, MIR155, and CD48 exhibited the highest correlation (>0.4) in the training set (Fig 7c-i). In the testing set, TIM3, MIR155, PD1, and PDL1 expression positively correlated with risk score (Fig 7g-p). Analysis of the correlation between DEGs (in low vs high-risk samples) and survival indicated that 279 and 199 genes in the CGGA and TCGA dataset, respectively, are significantly associated with OS. Further analysis revealed 42 and 73 genes in the CGGA and TCGA datasets (AUC >0.7), respectively, that were independent of age, gender, tumor grade and IDH mutation status. Of these, 20 were common between the 2 datasets. Analysis of correlation between expression of the 20 genes and the risk score revealed 6 (colorectal neoplasia differentially expressed (CRNDE), transmembrane protein 71 (TMEM71), growth arrest specific 2 like 3 (GAS2L3), insulin like growth factor 2 mRNA binding protein 3 (IGF2BP3), vav guanine nucleotide exchange factor 3 (VAV3), TNF receptor superfamily member 11b (TNFRSF11B)) with a correlation coefficient >0.6 in the training and he validation groups (Fig 8a-d). VAV3 and TNFRSF11B are immune-related genes. Sankey diagram analysis revealed co-expression between the 7 established immune checkpoint genes and the 6 immune checkpoint genes we identified. CD48, MIR155HG, PDL1 showed a strong relationship (Cor>0.4, p <0.05) with other immune checkpoints in the training and testing set (Fig 8e-f). VAV3 exhibited a close relationship with MIR155HG, while TNFRSF11B correlated with MIR155HG and PD1 in the training and testing sets.