Differentially Expressed LRGs in EC
The process for this study is shown in Figure 1. A total of 532 patients were involved in the development and validation of the prognostic signature, including 35 normal tissues. First, we performed the Wilcoxon test with a log2FC > 1 and P < 0.05 to detect the differentially expressed genes (DEGs). Then, 5,155 DEGs were found between 35 normal samples and 532 tumor samples. Next, we downloaded the list of LRGs from the MSigDB. These IRGs intersected with the DEGs, and 89 differentially expressed IRGs were obtained (Figure 2A), including 32 down-regulated and 57 up-regulated genes (Figures 2B). The DE-LRGs list, including log2FC and the adjusted P-values of each gene was provided in Supplementary Table S1. Afterwards, we performed GO and KEGG pathway analysis for the DE-LRGs and the top 10 GO and KEGG pathway enrichment terms shown in Figures 2C-D.
GO analysis revealed that the DE-LRGs were mainly enriched in mitotic nuclear division, nuclear division, chromosome segregation and so on. The KEGG analysis indicated that the genes were mainly involved in positive regulation of cell cycle, progesterone-mediated oocyte maturation, p53 signaling pathway. et al. Thus, through the combined analysis, we revealed 89 DE-LRGs that were significantly associated with EC. Most of the DEGs were associated with the cell cycle and p53 pathway.
Identification of prognostic immune-related DEGs
To further screened out the LRGs with potentially prognostic value for EC patients, LASSO regression model was used to select key prognosis-associated genes. In the LASSO-penalized Cox regression, as log λ (a tuning parameter) changed, the corresponding coefficients of certain genes were reduced to zero, suggesting that their effects on the model could be omitted because they were shrinking parameters (Figure 3A). Following cross-validation, 11 genes (C14orf28, HNRNPA3P1, DSP, ACACB, TPX2, HMGB3, ATP8B4, PPP1R14C, CIRBP, CDC6, DTWD1) achieved the minimum partial likelihood deviance and were identified as key prognostic immune-related genes in OS model (Figure 3B). As described in the methods. the calculation formula = (ACACB *0.1335)- (ATP8B4*0.0144) -(C14orf28*0.0007) -(CDC6*0.0727) -(CIRBP*0.1112) +(DSP*0.0196) -(DTWD1*0.0318) +HMGB3*0.1201) -(HNRNPA3P1*0.1167) +(PPP1R14C*0.0185) +(TPX2*0.1833). The correlations between these genes were calculated in EC using the Spearman correlation analysis. We found they were significantly relevant. For instance, expression levels of DTWD1 and ABACB, ATP8B4 and CDC6, ATP8B4 and HMGB3 genes were closely correlated with each other (Figure 3C). The expression of these 11 genes in tumor and adjacent normal tissues were shown in Figure 3D. The results showed that the expression of ACACB, ATP8B4, C14orf28, CIRBP, and DTWD1 were higher in normal tissues, and the content of CDC6, DSP, HMGB3, HNRNPA3P1, PPP1R14C, and TPX2 were higher in tumor tissues. Furthermore, the transcript message of patients stratified by risk score into high- and low- risk subgroups were analyzed by GSEA. In the prognostic model, representative hallmark in high-risk patients were “androgen response”, “DNA repair”, “estrogen response early”, “fatty acid metabolism”, “glycolysis”. What’s more, “MTORC1 signaling”, “notch signaling”, “PI3K-AKT-MTOR signaling”, “TNFA signaling via NFKB”, “WNT-β-catenin- signaling” were enriched in low-risk group patients (Figure 3E).
Validation and the efficacy of the 11-LRGs prognostic signature
Based on the mean risk score from risk signature, the patients were divided into high-risk and low-risk groups. Then, the expression of the 11 genes in low- and high-risk patients in the TCGA dataset was also demonstrated in the heatmap (Figure 4A). We found significant differences between the high- and low-risk groups associated with LNM, cancer status, peritoneal cytology, recurrence, grade, menopausal status, stage, and stage (all p<0.05). Then risk score of each individual and survival status were ranked and displayed on the dot plot, which showed significant differences in OS between the groups (Figure 4B-C). Likewise, the Kaplan-Meier curve analysis demonstrated that the OS of the high-risk group was significantly shorter than that of the low-risk group (p= 3.583e-08) (Figure 4D). ROC curve analysis revealed that the area under the ROC curve (AUC) of 1-, 3-, 5-year survival of the prognostic LRG model was 0.669, 0.702, and 0.718 (Figure 4E). Furthermore, we analyzed the 11 genes for different status of LNM, respectively. We found that expression levels of C14orf28, CIRBP, DTWD1, HMGB3, and TPX2 were significantly different (Figure S1). Above all, these results revealed that the LRGs risk model could be served as an effective and accurate prognostic signature in EC.
Nomogram building and validation
For supplying the clinicians a practical formula to estimate EC patients’ survival probability, a comprehensive prognostic nomogram based on the patients’ risk scores and clinical features was built. First of all, univariate and multivariate Cox regression analyses of survival were performed by the risk signature and related clinicopathological factors (age, menopausal status, stage, histology, grade, peritoneal cytology, LNM, and 11-gene risk model) in the TCGA dataset to determine whether the 11-gene risk score can be used as an independent prognostic factor. Both univariate and multivariate analyses showed that the risk model could be used as a prognostic indicator (P<0.001, Figure 5A-B). Afterwards, five independent prognostic parameters, including age, grade, stage, peritoneal cytology, and 11-gene risk model were integrated into the nomogram (Figure 5C). The specific score of each factor was shown in Table 3. The C-index of the model was 0.769 (C-index=0.712 for clinical prognostic factors alone). The calibration plots showed excellent consistency between the nomogram predictions and actual observations in terms of the 1-, 3-, and 5-year survival rates in the TCGA cohort (Figure 5D). Furthermore, we divided the cohort into 3 subgroups (low-score, moderate-score, and high-score groups) evenly according to their risk scores from the nomogram. The survival curve of high-score group had a worse OS than the moderate- and low-score groups (Figure 5E). In order to find out whether the risk signature was an effective prognostic indicator, tdROC was plotted. Similar to the performance in the cohort, the AUCs were 0.784, 0.814, and 0.787 for 1-, 3-, and 5‐year survival time, respectively (Figure 5F). These results found that the 11-gene LRGs model improved the predictive accuracy of OS in EC patients.
Clinical experimental validation-qPCR
First of all, the differential expression of 11 genes was validated based on GEPIA website (Figure S2), including HMGB3 (Figure 6A).
We knocked down the expression of HMGB3 in ishikawa cell line and conducted the following functional experiments. Then western blot was performed to examine the efficiency of knockdown. The results suggested that the expression of MMP12 in the siRNA group was significantly down-regulated compared with the negative control group (Ctrl) and the blank control group (siCtrl) in ishikawa cells (p<0.01, Figure 6B). CCK-8 assay suggested that the proliferation rate of ishikawa was significantly decreased in HMGB3 knockdown group (Figure 6C). Transwell and gap closure assay were performed to investigate the effects of HMGB3 on the invasion and metastatic behaviors of EC cells in vitro. The results demonstrated that ishikawa in HMGB3 knockdown group exhibited significant declines in invasion capabilities (Figure 6D-E) and migration (Figure 6F-G) compared with respective control groups. These results suggested that HMGB3 played a crucial role in the progression of EC cells.
The immune related risk signature and mutation profile
Gene mutations are an important cause of tumorigenesis and development. Hence, we evaluated tumor mutation burden (TMB) of patients in low- and high-risk groups with somatic mutation data. In the risk model, the low-risk group had somatic mutations in the following order: PTEN> ARID1A> PIK3CA> TTN> CTNNB1> PIK3R1> CTCF> MUC16> KMT2D> ZFHX3 (Figure 7A). Meanwhile, in the high-risk group, somatic mutations were listed in the following order: TP53> PIK3CA> PTEN> TTN> ARID1A> PIK3R1> KMT2D> MUC16> PPP2R1A> CHD4 (Figure 7B). The TMB scores of patients stratified by the LRGs prognostic model was therefore investigated. The t test demonstrated a significant difference between the low-risk and high-risk groups (p<0.01, Figure S3). These results indicated that different risk groups in LRGs risk model had different TMB features.