Patients and analytic pipeline
RNA profiles and clinical data of 3810 brain tumor patients were collected from 11 public studies. Baseline characteristics are shown in Supplementary Table 5. Glioma and medulloblastoma accounted for 83% (3146) and 16% (624) of patients, respectively. Among 2951 patients with cancer type information, primary, recurrent, secondary and post-treatment tumors accounted for 71% (2105), 11% (332), 1% (40) and 16% (474), respectively. In the glioma cohort (3146), patients with grade II, III and IV tumors accounted for 24% (743), 28% (872) and 39% (1226), respectively; 10% (305) of patients had no tumor grade information. Pathological information was available for 2960 glioma patients. This glioma cohort comprised diverse pathological subtypes: astrocytoma (27%), oligodendroglioma (21%), oligoastrocytoma (8%) and glioblastoma (44%). Among 2289 glioma patients tested for IDH mutation, 55% (1254) carried an IDH mutation. Among the 1098 of these 1254 patients who were tested for 1p/19q co-deletion, 42% (459) carried co-deletion of 1p and 19q. Among 1731 patients tested for MGMT methylation, 59% (1028) were positive for hypermethylation of the MGMT promoter. Among 1226 patients with radio-chemotherapy treatment information, the proportions of patients treated with chemotherapy, radiotherapy and a combination of both were 15% (184), 23% (281) and 62% (761), respectively.
A flowchart of the overall study procedure is shown in Figure 1. We collected 93,293 single-cell RNA profiles from 16 published datasets and manually curated 2616 genes associated with the tumor microenvironment, immune cells, response to immune checkpoint blockade therapy and prognosis. We developed a self-supervised deep learning model on the single-cell RNA profiles of these 2616 genes to decipher gene expression signatures from transcriptomes. Subsequently, we applied the resulting feature encoder to extract expression signatures from the transcriptomes of bulk brain tumor samples (see Methods and Supplementary Figure 3). We then examined the association of these expression signatures with immune signatures, genomic alterations and prognosis.
Differences in immune infiltration signatures between C1 and C2 subtypes
CIBERSORT (18) showed that the infiltration of 18 of 22 immune cell types differed significantly between the C1 and C2 subtypes (Figure 2A). All types of TAMs (i.e. M0, M1, M2), CD4+ follicular helper T cells and neutrophils showed higher infiltration in the C2 than in the C1 subtype (Figure 2A). In contrast, the C1 subtype had higher infiltration of CD8+ T cells, plasma cells and dendritic cells than the C2 subtype (Figure 2A). The infiltration of the other cell types is provided in Supplementary Table 6.
We observed that 24 immunomodulatory genes were differentially expressed between the C1 and C2 subtypes (Figure 2B). Specifically, C10orf54, CX3CL1 and EDNRB were more highly expressed in C1 than in C2 (Figure 2B), whereas CD276, CCL5, CXCL10, HMGB1 and 17 other immunomodulatory genes were significantly upregulated in C2 relative to C1 (Figure 2B). The detailed expression of all immunomodulatory genes is provided in Supplementary Table 7. Enrichment analysis of 50 cancer hallmarks and 132 immune signaling modules showed that CSF-1, MYC, TGF-β, JAK/STAT3, IFN-α and 28 other signaling pathways were enriched in C2 relative to C1 (Supplementary Table 4).
C1/2 subtypes were significantly associated with genomic alterations
In the TCGA low-grade glioma cohort, non-silent mutation burden, intratumor heterogeneity, aneuploidy and 6 other types of genomic variation were significantly higher in the C2 than in the C1 subtype (Figure 2C and Supplementary Table 8). In the TCGA glioblastoma cohort, none of these variations differed between subtypes except for segments of copy number variation (Supplementary Figure 4).
We also examined the association between C1/2 subtypes and driver gene alterations of brain tumors that are linked to prognosis and therapeutic resistance (Supplementary Table 6). Four driver events were significantly more frequent in the C1 than in the C2 subtype: IDH mutation, hypermethylation of the MGMT promoter, high CpG island methylation phenotype (G-CIMP) and co-deletion of 1p and 19q (Figure 2D). Four driver events were significantly more frequent in the C2 than in the C1 subtype, including EGFR amplification, deletion of CDKN2A/CDKN2B and PTEN, and gain of chromosome 7 and/or loss of chromosome 10 (Figure 2D).
In addition, we found that C1/2 subtypes were linked to the TCGA molecular subtypes, namely the classical, neural, proneural and mesenchymal subtypes (17) (Figure 2E). The neural (168 (37%) versus 104 (11%); P-value = 6.2e-29) and proneural subtypes (186 (41%) versus 243 (26%); P-value = 6.2e-29) were significantly enriched in the C1 relative to the C2 subtype. The C2 subtype had higher proportions of the classical (307 (33%) versus 56 (12%); P-value = 1.4e-16) and mesenchymal subtypes (265 (29%) versus 44 (10%); P-value = 2.3e-15) than the C1 subtype.
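As an illustration of how such enrichment comparisons can be checked, the following minimal sketch recomputes the reported percentages from the counts above and tests each TCGA molecular subtype against the remaining three in a 2x2 table. It assumes that the four subtypes are exhaustive, so that the C1 and C2 totals are the sums of the listed counts, and it uses a Pearson chi-squared test without continuity correction. The test used for these comparisons is not stated in the text, so the P-values it yields need not match those reported.

```python
import math

# Counts reported in the text: (C1, C2) patients per TCGA molecular subtype.
counts = {
    "neural": (168, 104),
    "proneural": (186, 243),
    "classical": (56, 307),
    "mesenchymal": (44, 265),
}

# Assumption: the four subtypes are exhaustive, so the totals are the sums.
n_c1 = sum(c1 for c1, _ in counts.values())
n_c2 = sum(c2 for _, c2 in counts.values())


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and
    P-value for the 2x2 table [[a, b], [c, d]], 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 df.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


results = {}
for name, (c1, c2) in counts.items():
    # Rows: C1, C2. Columns: this subtype, all other subtypes.
    stat, p = chi2_2x2(c1, n_c1 - c1, c2, n_c2 - c2)
    results[name] = {
        "pct_c1": round(100 * c1 / n_c1),
        "pct_c2": round(100 * c2 / n_c2),
        "chi2": stat,
        "p": p,
    }
    print(f"{name}: C1 {results[name]['pct_c1']}% vs C2 "
          f"{results[name]['pct_c2']}%, chi2 = {stat:.1f}, P = {p:.1e}")
```

Testing one subtype against the rest mirrors the per-subtype P-values quoted in the text; a single test on the full 2x4 table would instead answer whether the overall subtype distribution differs between C1 and C2.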
C1/2 subtypes were significantly associated with clinical characteristics
Clinical characteristics of brain tumor patients are provided in Supplementary Table 9. Compared with the C1 subtype, the C2 subtype had lower Karnofsky scores (median: 80 vs. 90; Wilcoxon rank sum test, P-value = 3.4e-6) and a higher rate of tumor microvascular infiltration (61/76, 80% vs. 31/65, 48%; OR: 4.2, 95% CI: 2.0 – 8.7; Chi-squared test, P-value = 1.8e-4). Among patients with recurrence, the C1 subtype showed a trend toward a lower distant recurrence rate (4/23, 17% vs. 19/48, 40%; OR: 0.3, 95% CI: 0.1 – 1.1) and a higher local recurrence rate (19/23, 83% vs. 29/48, 60%; OR: 3.1, 95% CI: 0.9 – 10.6) than the C2 subtype, although the difference did not reach statistical significance (Chi-squared test, P-value = 0.1). There were no significant differences in family history of cancer, prediagnostic symptoms or tumor location between C1/2 subtypes (Chi-squared test, all P-values > 0.5).
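For readers who wish to see how an odds ratio and its confidence interval follow from such counts, the sketch below computes the sample odds ratio with a Wald 95% confidence interval on the log scale for the two recurrence comparisons. The estimator used in the analysis is not stated in the text; the Wald form is an assumption here, chosen because it reproduces the rounded values quoted for recurrence. Other comparisons may have been computed with a different estimator.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Sample odds ratio (a*d)/(b*c) for the 2x2 table [[a, b], [c, d]]
    with a Wald confidence interval computed on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Rows: C1, C2. Columns: event, no event (counts taken from the text).
local = odds_ratio_wald(19, 23 - 19, 29, 48 - 29)    # local recurrence
distant = odds_ratio_wald(4, 23 - 4, 19, 48 - 19)    # distant recurrence

for label, (or_, lo, hi) in (("local", local), ("distant", distant)):
    print(f"{label} recurrence: OR = {or_:.1f}, 95% CI: {lo:.1f} - {hi:.1f}")
```

Both intervals include 1, which is consistent with the non-significant Chi-squared test reported for this comparison.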
Kaplan-Meier survival analysis showed that the C1 subtype had better survival than the C2 subtype in the combined cohort of 3810 patients (Figure 3A; Log-rank test, P-value = 8.2e-78). This result was also observed in each of the 11 individual datasets (Figure 3A; Log-rank test, all P-values < 0.05). Moreover, the difference remained significant in the combined cohort after controlling for confounding factors such as age, gender, tumor, histology, radio-chemotherapy, recurrent/secondary status, IDH mutation status, MGMT methylation status and co-deletion of 1p and 19q (Figure 3B and Supplementary Figure 6; Multivariate Cox model, HR: 2.2, 95% CI: 1.7 – 2.9; P = 3.7e-10). The independent association of C1/2 subtypes with prognosis in the multivariate model remained significant in 6 individual datasets and showed the same trend in the other 4 datasets (Figure 3B and Supplementary Figure 6). In the TCGA glioma cohort, surgery type was additionally taken into consideration. In the medulloblastoma cohort (i.e. GSE85217), clinically relevant confounding factors such as age, gender and molecular subtype were included. In addition, we observed that the association with prognosis was more generalizable for expression signatures derived from deep learning than for those derived from PCA (Supplementary Table 10).
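To make the two-group survival comparison concrete, the following is a minimal sketch of a log-rank test in plain Python. It is not the pipeline used in this study: the function name `logrank_two_groups` and the four-patient toy data are hypothetical and serve only to show how observed and expected event counts are accumulated over the distinct event times.

```python
import math


def logrank_two_groups(times, events, groups):
    """Two-group log-rank test.

    times  : follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    groups : 0 or 1 per patient (e.g. 1 = C2, 0 = C1)
    Returns the chi-squared statistic (1 df) and its P-value.
    """
    data = list(zip(times, events, groups))
    observed = expected = variance = 0.0
    # Iterate over the distinct times at which at least one event occurred.
    for t in sorted({time for time, event, _ in data if event == 1}):
        n = sum(1 for time, _, _ in data if time >= t)            # at risk
        n1 = sum(1 for time, _, g in data if time >= t and g == 1)
        d = sum(1 for time, event, _ in data if time == t and event == 1)
        d1 = sum(1 for time, event, g in data
                 if time == t and event == 1 and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank variance is zero; test is undefined")
    stat = (observed - expected) ** 2 / variance
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-squared, 1 df
    return stat, p_value


# Hypothetical toy data: group 1 fails early, group 0 fails late.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 1, 1]
toy_groups = [1, 1, 0, 0]
stat, p_value = logrank_two_groups(toy_times, toy_events, toy_groups)
print(f"chi2 = {stat:.3f}, P = {p_value:.3f}")
```

In practice a survival analysis library would normally be used for both the log-rank test and the multivariate Cox model; the hand-written version is shown only because it exposes the arithmetic behind the test.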
We also examined the association between C1/2 subtypes and prognosis of glioma patients with respect to histology, genomic alteration and grade. The glioma patients were divided into 9 subgroups: astrocytoma, oligodendroglioma, glioma with or without IDH mutation, IDH-mutant glioma with or without co-deletion of 1p and 19q, and tumor grade II, III and IV (Figure 4A). The C2 subtype had significantly poorer survival than the C1 subtype in all subgroups (Figure 4A; Log-rank test, P-values < 0.05). In addition, the difference remained significant in 8 of these 9 subgroups and was marginally significant in grade IV glioma after taking into account age, gender, histology, IDH mutation status, MGMT methylation status and co-deletion of 1p and 19q (Figure 4B and Supplementary Figure 6). Dataset was used as a stratification variable in the multivariate Cox model. IDH-mutant glioblastoma of the C2 subtype had a poor survival outcome similar to that of glioblastoma without IDH mutation (Figure 5A; Log-rank test, adjusted P-value = 0.8). In contrast, IDH-mutant glioblastoma of the C1 subtype had a favorable survival outcome compared with IDH-mutant glioblastoma of the C2 subtype (Log-rank test, adjusted P-value = 1.2e-3) and with glioblastoma without IDH mutation (Log-rank test, adjusted P-value = 1.3e-6). The result remained significant after adjusting for the confounding effects of age, gender and co-deletion of 1p and 19q (Figure 5B).
Kaplan-Meier survival analysis showed that the C2 subtype had worse progression-free survival than the C1 subtype in the TCGA glioma cohort (Supplementary Figure 5; Log-rank test, adjusted P-value = 6.1e-4). The difference remained significant in patients treated with radio-chemotherapy (Supplementary Figure 5; Log-rank test, adjusted P-value = 5.3e-3) and showed the same trend in patients treated with radiotherapy alone (Supplementary Figure 5; Log-rank test, adjusted P-value = 0.4). Progression-free survival was not analyzed for the chemotherapy group because of the limited sample size (Supplementary Table 11).