Preprocessing of microarray datasets and identification of DEGs
The four gene datasets contained a total of 48 normal ovarian tissues and 125 ovarian cancer tissues (Table S1), and the expression data were normalized before differential expression analysis (Figure S1). We initially identified 5,870, 4,040, 4,532 and 18,520 DEGs from GSE27651, GSE38666, GSE14407 and GSE18520, respectively, using the limma package in R with the criteria |logFC| > 1 and adjusted P value < 0.01 (Figure 1A-D). All DEGs were then divided into up- and downregulated genes according to whether logFC > 0 or logFC < 0, respectively. A Venn diagram tool was applied to identify the DEGs shared among the four datasets within the up- and downregulated sets. A total of 636 DEGs, including 269 upregulated and 367 downregulated DEGs, were identified in the OC samples compared to the normal ovarian tissue samples (Figure 1E&F). All of these DEGs are listed in Table S2.
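The filtering and intersection steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (which relied on limma and a Venn diagram tool); the gene names, the toy values and the function names `split_degs` and `common_degs` are hypothetical and serve only to show the logic of the thresholds and of the set intersection.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input format: gene -> (logFC, adjusted P value).
Dataset = Dict[str, Tuple[float, float]]


def split_degs(table: Dataset, lfc_cut: float = 1.0,
               p_cut: float = 0.01) -> Tuple[Set[str], Set[str]]:
    """Return (upregulated, downregulated) genes passing both thresholds."""
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, (log_fc, adj_p) in table.items():
        if abs(log_fc) > lfc_cut and adj_p < p_cut:
            # The sign of logFC (not its absolute value) gives the direction.
            (up if log_fc > 0 else down).add(gene)
    return up, down


def common_degs(tables: List[Dataset]) -> Tuple[Set[str], Set[str]]:
    """Intersect the up- and downregulated sets across all datasets."""
    ups, downs = zip(*(split_degs(t) for t in tables))
    return set.intersection(*ups), set.intersection(*downs)


# Toy values only; these are not results from the GEO datasets.
ds1 = {"GENE_A": (2.3, 0.001), "GENE_B": (-1.8, 0.004),
       "GENE_C": (0.6, 0.0001), "GENE_D": (1.5, 0.2)}
ds2 = {"GENE_A": (1.4, 0.005), "GENE_B": (-2.1, 0.0002),
       "GENE_C": (1.9, 0.003), "GENE_D": (-1.2, 0.001)}

common_up, common_down = common_degs([ds1, ds2])
print(sorted(common_up), sorted(common_down))
```

In this toy example only GENE_A is upregulated and only GENE_B is downregulated in both datasets, so they are the only genes retained.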
PPI network construction and MCODE analysis
Using the MCODE plugin of Cytoscape software, the PPI network of the 269 upregulated DEGs, which contained 256 nodes and 773 edges, yielded 6 gene clusters (Figure S2A, Table S3). Likewise, the PPI network of the downregulated DEGs contained 330 nodes and 94 edges and yielded 7 gene clusters (Figure S2B, Table S3). Next, two gene clusters with network scores of >10 were selected from the upregulated DEGs (Figure 2A&B), and two gene clusters with scores of >5 were selected from the downregulated DEGs (Figure 2C&D). These clusters were regarded as the most important molecular modules in the respective networks and were further analyzed for function and prognosis.
GO enrichment analysis showed that the two gene clusters of upregulated DEGs were primarily related to the biological processes (BP) of nuclear division and sister chromatid segregation in the cell cycle. The primary molecular functions included microtubule binding and various kinase regulator activities (Figure 3A&B). KEGG pathway analysis indicated that these genes were enriched mainly in the oocyte maturation, cell cycle, p53 signaling and cellular senescence pathways (Figure 4A-F).
GO enrichment analysis of the two gene clusters of downregulated DEGs indicated that these genes were involved mainly in the endoplasmic reticulum lumen and mitochondrial outer membrane as cellular components, protein modification and regulation of various signaling pathways as biological processes, and hormone activity and alcohol dehydrogenase [NAD(P)+] activity as molecular functions (Figure 5A&B). No signaling pathways were significantly enriched for the first gene cluster at P < 0.05. The second gene cluster was mainly involved in metabolic pathways, such as tyrosine metabolism, fatty acid degradation and retinol metabolism (Figure 6A-C).
Survival analysis for identification of prognostic genes
The OS and PFS of patients were stratified by the expression of genes in the gene clusters of up- and downregulated DEGs using the Kaplan–Meier Plotter database. Twenty-six of the upregulated DEGs were found to correlate with OS and PFS in ovarian cancer. The results were visualized with the forestplot package in R (Figure 7A-D). In contrast, only the CHGB and MAOB genes among the downregulated DEGs were relevant to prognosis in OC, and higher expression of these genes predicted better OS and PFS (Table S4).
To further examine the prognostic potential of the twenty-eight genes from the up- and downregulated DEGs, we reanalyzed the survival differences using the PrognoScan database. Interestingly, two cohorts (GSE9891 and GSE17260) [8, 9], which included 278 and 110 samples, respectively, at different stages of OC, showed that higher expression of AURKA, CDCA5, CEP55 and UBE2C was significantly associated with poorer prognosis (Figure 8A-H). Therefore, it is conceivable that high expression of AURKA, CDCA5, CEP55 and UBE2C is an independent risk factor associated with poor prognosis in OC patients.
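The expression-stratified survival comparison used in this section can be illustrated with a minimal Kaplan–Meier sketch in Python. It is not the implementation behind Kaplan–Meier Plotter or PrognoScan; the median cutoff, the toy follow-up data and the function names `kaplan_meier` and `split_by_median` are assumptions made for illustration only.

```python
from statistics import median
from typing import List, Tuple


def kaplan_meier(times: List[float],
                 events: List[int]) -> List[Tuple[float, float]]:
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def split_by_median(expression: List[float]) -> List[bool]:
    """Flag samples whose expression is above the median (assumed cutoff)."""
    cut = median(expression)
    return [value > cut for value in expression]


# Toy data only: expression, follow-up time (months), event indicator.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
times = [30, 24, 36, 10, 14, 20]
events = [0, 1, 0, 1, 1, 0]

is_high = split_by_median(expression)
high_curve = kaplan_meier([t for t, h in zip(times, is_high) if h],
                          [e for e, h in zip(events, is_high) if h])
low_curve = kaplan_meier([t for t, h in zip(times, is_high) if not h],
                         [e for e, h in zip(events, is_high) if not h])
print(high_curve)
print(low_curve)
```

In a real analysis the two curves would then be compared with a log-rank test, which this sketch omits.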
Analysis of the differential mRNA and protein expression of prognostic genes
We compared the transcriptional levels of these four genes in cancers with those in normal tissue samples using the ONCOMINE database (Figure 9). ONCOMINE analysis revealed that the mRNA expression of AURKA, CDCA5, CEP55 and UBE2C was upregulated in patients with OC (Table 1). The transcription levels of AURKA were significantly higher in patients with OC in four datasets [10-12]. In Yoshihara’s dataset [10], AURKA was overexpressed in ovarian serous adenocarcinoma compared to normal samples, with a fold change of 14.459 and P-value of 5.17E-10. In TCGA, Adib’s dataset and the Lu Ovarian dataset [11, 12], AURKA was overexpressed in ovarian serous cystadenocarcinoma compared to normal samples, with fold changes of 6.504 (P-value = 6.53E-8), 2.287 (P-value = 4.20E-4) and 1.506 (P-value = 0.002), respectively. The transcription levels of CDCA5 were significantly higher in serous cystadenocarcinoma than in normal tissue [10]. The Lu Ovarian dataset showed that CEP55 expression was significantly higher in different subtypes of OC [12], including serous adenocarcinoma, endometrioid adenocarcinoma and clear cell adenocarcinoma, with fold changes of 2.457 (P-value = 9.44E-7), 2.033 (P-value = 1.45E-4) and 1.787 (P-value = 0.007), respectively. In Yoshihara’s dataset [10], Lu’s dataset [12] and TCGA, UBE2C was significantly overexpressed in ovarian serous adenocarcinoma, with fold changes of 12.955 (P-value = 5.73E-13), 2.358 (P-value = 1.39E-7) and 10.184 (P-value = 2.24E-7), respectively. UBE2C expression in ovarian carcinoma was also significantly increased compared with that in normal samples in Bonome’s dataset [13].
Using the GEPIA database, we further confirmed the differential mRNA expression of the four genes between OC and normal tissues. The results showed that the mRNA expression levels of AURKA, CDCA5, CEP55 and UBE2C were significantly increased in OC tissues compared with normal ovarian tissues (tumor samples: n = 426 vs. normal samples: n = 88; Figure 10). Additionally, expression of the four genes was analyzed by ovarian cancer stage. In contrast to CDCA5, the mRNA expression of AURKA, CEP55 and UBE2C was not significantly associated with the FIGO stage of OC (Figure S3). IHC staining images were downloaded from the Human Protein Atlas; the protein levels of these four prognostic genes were significantly elevated in tumor tissues compared with normal tissues (Figure 11).
Gene alteration and correlation analyses of prognostic genes
Genetic alterations in AURKA, CDCA5, CEP55 and UBE2C were analyzed using the cBioPortal database in 1,680 cases from three TCGA datasets of ovarian serous carcinoma (606 cases from TCGA, Firehose Legacy; 585 cases from TCGA, PanCancer Atlas; and 489 cases from TCGA, Nature 2011). Among the 3 OC datasets analyzed, the alterations of the four genes were mainly amplifications and deep deletions, and the alteration frequency for the gene set submitted for analysis ranged from 7.77% of 489 cases to 17.67% of 583 cases (Figure 12A). The percentages of AURKA, CDCA5, CEP55 and UBE2C gene alterations in OC were 6%, 2.1%, 1.3% and 4%, respectively (Figure 12B). Kaplan–Meier plots and log-rank test results indicated no significant difference in OS or PFS between the cases with alterations in one of the queried genes and those without alterations in any of the queried genes (P-values of 0.0851 and 0.213, respectively; Figure 12C&D).
The GeneMANIA database was used for correlation analysis of AURKA, CDCA5, CEP55 and UBE2C at the gene level (Figure 13A). The 4 central nodes were surrounded by 20 nodes representing genes that were closely related to these four genes in terms of physical interactions, co-expression, predictions, co-localization, and genetic interactions. The results revealed relationships in physical interactions, co-expression and co-localization between AURKA, CDCA5, CEP55 and UBE2C. AURKA and UBE2C shared the same pathway. Predicted protein domains were shared among AURKA, CEP55 and UBE2C. However, no relationship in shared protein domains or genetic interactions was noted among the four genes. Further functional analysis revealed that these four genes mainly correlated with the regulation of cell division, including mitotic cell cycle, mitotic cytokinesis, mitotic sister chromatid segregation, mitosis, nuclear division and cell cycle G2/M phase transition (Table S5).
The Spearman correlations between AURKA, CDCA5, CEP55 and UBE2C in OC were determined by online analyses using the cBioPortal database (TCGA, PanCancer Atlas; Figure 13B). A Spearman’s correlation coefficient exceeding 0.30 was considered to indicate a good correlation. The results indicated significant positive correlations between AURKA and CDCA5 (R2 = 0.674, P = 4.86e-41), CEP55 (R2 = 0.564, P = 1.32e-26), and UBE2C (R2 = 0.78, P = 7.36e-62); between CDCA5 and CEP55 (R2 = 0.616, P = 1.13e-32) and UBE2C (R2 = 0.658, P = 1.30e-38); and between UBE2C and CEP55 (R2 = 0.512, P = 2.08e-21) (Figure S4).
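For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks. The following Python sketch illustrates this on toy values; it is not the cBioPortal implementation, and the function names `average_ranks`, `pearson` and `spearman` as well as the example vectors are hypothetical. Only the 0.30 cutoff is taken from the text.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy expression values for two hypothetical genes across five samples.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_y = [5.0, 6.0, 7.0, 8.0, 7.0]

rho = spearman(gene_x, gene_y)
is_good = rho > 0.30  # cutoff for a "good" correlation, as stated in the text
print(round(rho, 3), is_good)
```

Because the coefficient depends only on ranks, it is insensitive to monotonic transformations of the expression values, such as a log transformation.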
Immune infiltrate analysis of prognostic genes
Tumor inflammatory cell infiltration levels play an important role in cancer progression and patient survival. We analyzed whether AURKA, CDCA5, CEP55 and UBE2C expression correlated with inflammatory cell infiltration levels in OC. The results showed that the mRNA levels of the four genes correlated weakly with inflammatory cell infiltration (Figure 14). AURKA expression was related to macrophages, CD4+ T cells and neutrophils, but no significant correlation with tumor purity or with the infiltration levels of B cells, CD8+ T cells and dendritic cells was noted. Apart from a negative correlation with CD8+ T cells, CDCA5 expression was not significantly correlated with other immune inflammatory cells or with tumor purity. CEP55 was negatively related to CD8+ T cells and positively related to CD4+ T cells. UBE2C was associated only with tumor purity and CD4+ T cells. These results suggest that the four genes do not correlate closely with the tumor immune inflammatory cell infiltration and tumor purity associated with the occurrence and development of OC. The detailed mechanisms require further study.