Overview of AS events in TCGA CESC cohort
A total of 246 patients with CESC were identified. The mRNA splicing data enrolled in this study includes 17,069 AS events in 9,608 genes. As shown in Fig. 1A, a single gene could have up to 6 different splicing modes, and most genes had more than 1 AS event. Exon skipping (ES) was the most frequent splice type in the 7 AS types, followed by an alternate terminator (AT) and alternate promoter (AP). The top 20 significantly survival-related AS events for 7 AS types were presented in Fig. 1D-K.
Prognostic index models featured by AS events for CC
To explore the prognostic utility of each AS signature, AS events associated with OS were identified by fitting univariate Cox proportional hazard regression models after merging the clinical data in the CC cohort. As a result, 2,269 AS events were determined with p < 0.05, containing 1,153 high-risk survival-associated AS events (hazard ratio [HR] >1) and 1,116 low-risk survival-associated AS events (HR < 1). The intersecting sets between different genes and AS events were visualized by the UpSet plot (Fig. 1B), indicating that 1 gene might have more than 1 survival-associated AS event. The most frequent splice type in the survival-associated AS events was AP, followed by ES and AT.
After conducting univariate regression analysis, LASSO regression was performed. The prediction models were constructed by the optimal survival-related AS events to avoid the overfitting of OS based models (Fig. 1C). In the meantime, a 7-AS event signature was identified as a survival predictor of CC through the Cox proportional hazards regression model (Table 1). A distribution diagram of survival risk-score, survival status of CC patients, and clustering heatmap of the PSI levels of survival-associated AS events are shown from top to bottom in Fig. 1L. The horizontal axis represents the patients in risk-score order from low to high (Fig. 1L).
Kaplan-Meier curves and log-rank tests were gererated to analyze the relationship between risk-score and survival status. The survival rates of low-risk patients were higher than that of high-risk patients, exactly as illustrated in Fig. 2F (P < 0.001). The ROC analysis was then applied to compare the predictive power of prognostic models, and the ROC curve (area under the curve [AUC]) in 1, 2, and 3 years were all greater than 0.850. Moreover, the AUC of the risk score model predicting a 1-year survival rate was larger than that of the age, grade, and FIGO stage (Fig. 2A, 2B).
Table 1
The 7 AS events associated with the OS of patients with CC
ID
|
Coefficient
|
HR
|
HR.95L
|
HR.95H
|
P-value
|
FCF1|28425|AD
|
3.8372
|
46.3961
|
1.7893
|
1203.0625
|
0.0209
|
FOXRED2|62052|AP
|
2.4601
|
11.7062
|
0.8097
|
169.2506
|
0.0711
|
HNRNPA1|301521|ES
|
3.4369
|
31.0901
|
0.2586
|
3738.2195
|
0.1596
|
NDUFA3|51782|ES
|
5.1283
|
168.7360
|
12.9262
|
2202.6492
|
0.0001
|
SHF|30409|AP
|
1.7134
|
5.5477
|
1.4991
|
20.5309
|
0.0103
|
NHLRC3|25701|ES
|
-4.3616
|
0.0128
|
0.0005
|
0.3207
|
0.0080
|
MCC|73005|AP
|
1.0190
|
2.7703
|
0.7634
|
10.0533
|
0.1213
|
HR, hazard ratio |
Construction and evaluation of the nomogram
Univariate and multivariate Cox regression methods were used and combined patient clinical characteristics (age, grade, and FIGO stage) to analyze whether the 7-AS event signatures could be an independent predictor of CC patients survival As shown in Fig. 2, it is shown that the risk-score could still be used as a reliable and stable independent risk predictor in the CC cohort (P < 0.001; Fig. 2C). We constructed a predictive nomogram based on the multivariate analysis (Fig. 2D) that included risk scores and clinical characteristics. As a result, the risk-score showed satisfactory diagnostic ability together with clinical characteristics.
Risk-score and AS events are associated with the infiltration of immune cells in the CESC microenvironment
First, immunescore in 29 types of infiltrating immune cells and immune function was assessed by the ssGSEA method (He et al., 2018). The immune score difference of each immune cell in the low risk-score and high risk-score groups is shown in Fig. 3B and Fig. 3C. We further explored the impact of risk-score on the infiltration of 22 types of immune cells in the tumor microenvironment (TME) by the CIBERSORT algorithm. The landscape of 22 types of immune cells infiltrating in the low risk-score group and the high risk-score group is shown in Fig. 3A. Differential analysis results showed that 7 types of immune cells (T cells CD8, T cells regulatory [Tregs], T cells CD4 memory activated, neutrophils, mast cells resting, mast cells activated, and macrophages M0) had a significant difference between the 2 groups (P < 0.05). The correlations with risk-score and the number of those 7 types of immune cells were shown in Fig.3D-J. Positive correlation could be found between risk-score and neutrophils (R=0.24, P=0.001, Fig.3G), mast cells activated (R=0.33, P=4.3e-0.6, Fig.3I) as well as macrophages M0 (R=0.3, P=3.2e-0.5, Fig.3J).
Finally, the violin plot was used to assess the difference in tumor purity, ESTIMATE score, immune score, and stromal score calculated using the ESTIMATE algorithm between the 2 groups (Fig. 8A). Tumor purity, ESTIMATE score, stromal score, and the immune score had no differences in the low- and high risk-score groups (P ≥ 0.05).
The risk-score is associated with the key immune checkpoint genes in the immune microenvironment of CESC
The difference in the expression level of 47 immune checkpoint genes in the low risk-score and high risk-score groups was evaluated, and 13 genes showed the significant differences (Fig. 4A). Next, R packages (limma, corrplot, ggpubr, ggExtra) were used to screen the risk-score associated with the 6 most important checkpoint genes (CD274, PDCD1, PDCD1LG2, CTLA4, HAVCR2, and IDO1). With the absolute threshold value of P less than 0.001, 3 immune checkpoint genes, IDO1, PDCD1, and HAVCR2, were identified (Fig. 4B). Scatter plots displaying the correlation of those 3 genes and risk-score were plotted separately. Though 3 of the correlation coefficients did not reach 0.3, the scatter plot showed a negative correlation (Fig. 4C-4E).
Extraction of IRGs depending on AS events and their correlation with clinical parameters
The expression of 7 genes (Table 1) was identified for difference analysis between CC and normal cohort by "limma" package with the threshold of |log2FC|>1 and P<0.05, and the 2 genes, SHF (logFC= -1.5396, P<0.05) and FOXRED2 (logFC= 1.3212, P < 0.05) were extracted for further analysis. Next, we divided tumor patients into high expression and low expression groups according to the optimal cut-off in SHF and FOXRED2 for clinical prognostic analysis.
The correlations between 2 hub genes expression and clinicopathological parameters were evaluated using R packages (limma, survival, survminer). The expression of SHF was higher in normal group (P < 0.01, Fig. 5B) while the expression of FOXRED2 was lower in normal group (P < 0.05, Fig. 5G). However, no notable association between 2 genes and age, grade, or stage were found (P ≥ 0.05) (Fig. 5C-E, H-J). Survival analysis revealed that high expression of SHF (P < 0.002) and low expression of FOXRED2 (P < 0.001) were associated with poor prognosis (Fig. 5A, 5F).
Relationship between SHF expression and immune cell infiltration
The landscape of 22 types of immune cells infiltrating in the low expression group and high expression group of SHF is shown in Fig. 6A. The B cell naive, plasma cells, T cells CD8+, T cells CD4 memory activated, T cells CD4 memory resting, natural killer (NK) cells resting, macrophages M1, had differences in 2 groups. The immune score difference of each immune cell in the 2 groups is shown in Fig. 6B. The expression of SHF had positive correlations with T cells CD4 memory resting, plasma cells, mast cells resting, and B cells naive, while being negatively correlated with T cells CD4 memory activated, T cells CD8, NK cells resting, and macrophages M1 (P < 0.05) (Fig. 6D-6K). However, we found no strong correlation between immune cells infiltration and SHF expression. Given that the risk score was related to tumor immunity, we finally appraised the correlation between the gene signature and the expression of immune checkpoints. The immune checkpoints with differential expression in the low expression and high expression groups of SHF are shown in Fig. 6C.
Correlations between FOXRED2 expression and immune cell infiltration
Fig. 7A showed the landscape of 22 types of immune cells infiltrating in the low expression and high expression groups of FOXRED2. The number of B cell naive was different in 2 groups. The immune score difference of each immune cell in the 2 groups is shown in Fig. 7B. In additon, the association between FOXRED2 expression and the number of tumor-infiltrating immune cells was investigated. The results exhibited that FOXRED2 expression had positive correlations with T cells CD4 memory resting, mast cells resting, B cells naive, while being negatively correlated with NK cells resting (P<0.05) (Fig. 7D-7G). However, we also found no strong correlation between immune cells infiltration and FOXRED2 expression. The immune checkpoints with differential expression in the low and high expression group of FOXRED2 are shown in Fig. 7.
The tumor purity, ESTIMATE score, immune score, and stromal score between the low and high expression groups of SHF and FOXRED2
In order to verify the effectiveness of the grouping strategy between the low and high expression groups of SHF and FOXRED2, the ESTIMATE method was applied to evaluate tumor purity, ESTIMATE score, immune score, and stromal score. Compared with the high expression group in SHF, the low expression group had a higher immune score (P < 0.001) (Fig. 8B). Other parameters had no differences between 2 groups in SHF (P ≥ 0.05) (Figure 8B). Compared with the low expression group in FOXRED2, the high expression group had a lower ESTIMATE score, immune score, and stromal score (P < 0.001), but had higher tumor purity (P < 0.001) (Fig. 8C).
Potential Regulatory Network Between SFs and AS Events
A total of 31 SFs (blue) were significantly associated with 227 survival-associated AS events consisting of 135 low-risk AS events (green) and 92 high-risk AS events (red). Most of low-risk AS events showed negatively correlation with the SF expression (green lines), and most of the high-risk AS events were positively correlated with SF expression (red lines) (Fig. 9).