Integrating alternative splicing events in GBM
Our research conducts data analysis in sequence according to the flowchart(Figure 1). A total of 45 610 AS events detected in 21136 gene symbols were collected from 154 GBM samples and five normal control samples. These results comprised 3827 AAs in 2684 genes, 3269 in 2270 genes, 8686 in 3476 genes, 8456 in 3695 genes, 18 360 in 6934 genes, 184 MEs in 180 genes, and 2828 RIs in 1897 genes (Figure 2A). The exon skip was the most dominant splicing type, followed by the AP splicing type. The number of AS events was larger than the total number of genes, implying that a single gene may have undergone multiple splicing. Some genes can be expressed in up to six AS types. We have described the detailed information of genes of a specific AS type in the Upset diagram (Figure 2A). Compared with a traditional Venn diagram, it can more effectively prove the quantitative results of multiple interaction sets.
Identification of differentially expressed AS (DEAS) events
By comparing the differences in AS events between GBM and normal tissues, we identified key DEAS events involved in tumorigenesis. A total of 2697 DEAS events detected in 2206 gene symbols were collected, including 1054 differentially upregulated AS events and 1643 differentially downregulated AS events. These results comprised 107 AAs in 101 genes, 116 in 110 genes, 695 in 568, 634 in 552, 1008 in 748, 23 in 23, and 114 in 104, as shown in Figure 2B. These data indicated that in GBMs, ES was the most frequent DEAS event, and ME was the rarest. Furthermore, there were 6170 upregulated genes and 4638 downregulated genes in TCGA GBM dataset. We also found that the DEAS events could compensate for the shortcomings of differential expression analysis; for example, we compared the genes involved in the differential AS event with the differentially expressed genes, and found that 234 genes were recognized, as shown in Figure 3A. Through observation, we found that there were no common transcription factors in these 234 genes. However, we used FUNRICH software to analyze and found that 234 genes have some common transcription factors, such as GATA1, POU3F4, MSX1, LHX3, NHLH1, and HENMT1 (p<0.05) (Figure 3B).
DEAS events mainly are enriched in neuron signaling and metabolism-related processes in GBM
Subsequently, 2697 DEAS events detected in the 2206 gene were profiled using GO functional enrichment analysis. In biological processes, the top three enriched terms were axonogenesis, modulation of chemical synaptic transmission, and regulation of trans-synaptic signaling. In the cellular component, the top three enriched terms were cell leading edge, asymmetric synapse, and neuron to neuron synapse. Regarding molecular function, the top three enriched terms were actin binding, cadherin binding, and tubulin binding. as shown in Figure 3C. Taken together, the DEAS events were mainly enriched in synaptic genesis and transmission.
A Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed several important pathways. The top five enriched pathways were the HIF-1a signaling pathway, Rap1 signaling pathway, axon guidance, regulation of actin cytoskeleton, and insulin signaling pathway (Figure 3D). Taken together, the DEAS genes regulated by DEAS events were mainly involved in metabolism-related processes. These results indicate that splicing some core genes involved in neuronal signaling and metabolism-related processes may be important in GBM.
Identification of DEAS events associated with survival of GBM patients
Using the DEAS event profiles in the GBM cohort, we identified 135 DEAS events that were significantly associated with the OS of GBM patients (p<0.05) by univariate Cox regression analysis (Table 1). In particular, we found genes with potentially more than one DEAS event that was significantly associated with patient survival. To better visualize the intersecting sets, an UpSet plot was created, as shown in Figure 3E. Interestingly, our analysis revealed that genes can exhibit up to seven types of DEAS events that were all found to be significantly associated with GBM patient survival. In addition, via FUNRICH software analysis, we found that 135 DEAS events were associated with the OS of GBM patients are not only related to central nervous system development, but are also closely related to other kidney, stomach, breast, skin, hematopoietic, and lymphoid tissues in cosmic analysis. (Figure 3F).
We drew a volcano map based on the obtained data, which clearly showed that DEAS events related to prognosis accounted for the majority (Figures 4A). The bubble plots illustrate the top 20 OS-DEASs in seven types of splicing patterns (Figures 4B–H). In summary, among all the OS-DEAS, SETD6-36663-AA, CSPG5-64534-AD, NKIRAS2-40977-AP, CCDC40-44016-AT, UBXN11-101231-ES, GRIA1-125279-ME, and CNIH2-17000-RI were the most significant events for AA, AD, AP, AT, ES, ME, and RI, respectively. OS-DEAS genes are closely related to the development of multiple organs.
Risk score as a low-risk factor can be an independent prognostic indicator of GBM.
Next, we performed a LASSO regression analysis of all types of DEAS events related to prognosis (Figure 5A, B). The number of the other seven AS events is too small, and it is not suitable for LASSO regression analyses. The left mode of each group is the LASSO coefficient curve, and the right mode of each group is the choice of the adjustment parameter (λ) in the LASSO model. The y-axis represents partial likelihood deviations. The lower x-axis indicates the logarithm (λ), and the upper x-axis represents the average number of predictors. The red dot represents the average deviation value for each model with a given lambda, where the location indicates the best data. As a result, the prognostic-relevant AS events were selected in all types of AS events, these AS events included 8 AP events, 5 AT events, 4 ES events, and there was NKIRAS2|40977|AP, TMEM63B|76352|AP, GSG1L|35696|AP, RPL39L|68070|AP, GSG1L|35698|AP, ROBO2|65640|AP, SPATS2L|56734|AP, PAIP1|71958|AP, CCDC40|44016|AT, FAM13A|69905|AT, CDKL3|73367|AT, OBSL1|57730|AT, HNF1A|24784|AT, UBXN11|101231|ES, CPEB4|74594|ES, IP6K2|64757|ES, MTIF2|53607|ES.
To determine the independent prognostic factors associated with GBM patients with DEAS events, we screened the most important DEAS events. A multiple Cox regression model with independent prognostic factors was performed for 17 DEAS events. In our data analysis, we found that the above types of DEAS events had a strong ability to predict the prognosis of GBM patients (Figure 5C). The red line represents the high-risk subgroup, the blue line represents the low-risk subgroup. The AUC for all types of AS events was 0.826 (Figure 5D).
Figure 5E shows the distributions of survival time, status, and risk scores among the low-risk and high-risk groups. The image is divided into the upper, middle, and lower sections. The top portion indicates the distribution of risk scores, the middle part shows the survival time and status of GBM patients (green indicates alive, red indicates dead), and the bottom part is the heatmap associated with it. The black vertical dashed line in the middle indicates the best dividing line between the low-risk and high-risk groups. DEAS events, including the AP of GSG1L and AT of CDKL3, were associated with poor prognosis. DEAS events including AP of TMEM63, ES of CPEB4, AT of FAM13A, AT of OBSL1, AT of HNF1A, and AT of CCDC40 were associated with good prognosis.
As shown in Figure 5F, we performed univariate and multivariate Cox regression analyses of all groups of prognosis-related AS events in the forest plot. We analyzed the data using Univariate Cox regression analysis showed that age (reference low), IDH mutation (reference high), and risk score (reference low) of the four groups were used as independent prognostic indicators of GBM (P<0.05). Multivariate Cox regression analyses also demonstrated that the risk score (reference low) of the four groups was an independent prognostic indicator of GBM (P<0.05).
Construction of Potential SF-AS Regulatory Network
To explore the upstream mechanism of AS regulation, we compared the identified SF genes with the genes obtained from TCGA and extracted the expression data of SF genes from the latter. We screened a total of 49 SF-related genes and constructed a regulatory network of DEAS and SF genes (IcorI> 0.8, P<0.001) using Cytoscape, and integrated the analyses with the risk factors (Figure 6A). Our analysis identified nine key SFs, including ELAVL4, CELF3, CELF5, RBFOX1, YWHAH, STXBP1, CELF4, ELAVL2, and DNM1. Interestingly, we found that 49 SF-related genes were mainly enriched in the regulation of nucleobase, nucleoside, nucleic acid metabolism, and protein metabolism in biological processes. Regarding molecular function, 49 SF-related genes were mainly enriched in the structural constituents of ribosomes and RNA binding. In the biological pathway, 49 SF-related genes were mainly enriched in gene expression, RNA metabolism, and insulin synthesis and processing(Figure 6B). Meanwhile, we use FUNRICH software to find Biological pathway enrichment analysis of SF (Figure 6C)and analysis SF Site of expression(Figure 6D).
Immune-related hub genes
By intersecting 135 OS-DEAS with the lists of immune-related genes obtained from ImmPort and InnateDB, two hub immune-related genes, NR1H3|15695|AP and PIK3R2|48396|AT, are related to prognosis(Figure 6D). We found that NR1H3 was highly expressed in GBM(Figure 6E). The cg07298473 methylation site was related to NR1H3 expression(Figure 6F). Among the 25 methylation sites, 18 methylation sites showed low expression and seven methylation sites showed high expression(Figure 6G).