Correlation of ESTIMATE scores with OV prognosis
The gene expression profiles and clinical date of 469 OV female patients diagnosed pathologically between 1992 and 2009 were downloaded from the TCGA database. Among them, 63 (13.4%) cases were of differentiated subtype, 80 (17.1%) cases of immunoreactive subtype, 64 (13.6%) cases of mesenchymal subtype, 74 (15.8%) cases of proliferative subtype, and 188 (40.1%) cases of unknown molecular subtype. ESTIMATE algorithm was used to evaluate infiltration of immune and stromal cells and tumor purity. According to ESTIMATE algorithm, immune scores ranged from -1498.58 to 2774.16, and stromal scores were distributed between -1988.05 to 1837.43, respectively. In short, immune scores of patients were higher than stromal scores in the entire OV cohort. Immunoreactive subtype showed the highest average immune scores of 4 subtypes, followed by mesenchymal subtype, differentiated subtype, and proliferative subtype (Fig. 1A, p < 0.0001). Similarly, mesenchymal subtype ranked the highest average stromal scores, followed by immunoreactive subtype, differentiated subtype, and proliferative subtype (Fig. 1B, p < 0.0001), suggesting that both immune and stromal scores were obviously correlated with the classification of molecular subtypes in OV patients. The proliferative subtype patients had the lowest immune scores and stromal scores.
In this study, there were 10 BRCA1 mutant patients with OV, 287 BRCA1 wild-type patients, and 172 patients of unknown status. In addition, there were 9 BRCA2 mutant patients, 288 BRCA2 wild-type patients, and 172 patients of unknown status. We drew the distribution of ESTIMATE scores between BRCA1/2 mutant and wild-type patients with OV. However, no significant difference was found in ESTIMATE scores between BRCA1/2 mutant and wild-type patients (Fig. 1C-F).
To better understand the correlation of ESTIMATE scores with OV prognosis, we categorized the 469 OV patients into high and low scores groups according to the top half of 235 samples with higher ESTIMATE scores and the bottom half of 234 samples with lower ESTIMATE scores. We found that patients with low immune scores showed no difference in median survival compared with patients with high scores (p = 0.5707, Fig. 1H). Median survival of patients with high stromal scores was lower than the patients with low scores based on Kaplan-Meier curve, although it showed no statistical significance (p = 0.1145). And the trend was mainly reflected in the shorter median survival (OS < 5 years) (p = 0.0376, Fig. 1G).
Comparison of DEGs with ESTIMATE scores
To explore the association of gene expression signatures with ESTIMATE scores, we compared gene expression profiles of 469 OV samples downloaded from TCGA database. Heatmaps and PCA plots demonstrated visualized distribution of gene expression profiles between high and low ESTIMATE scores groups in Fig. 2A-D. A total of 487 genes were identified to be DEGs in immune scores group. Among them, there were 442 upregulated DEGs and 45 downregulated DEGs in the high group compared with the low group of immune scores based on p < 0.05 and FC > 1.5. On the other hand, 449 genes were identified to be DEGs in stromal scores group. Of these, there were 428 upregulated DEGs and 21 downregulated DEGs in the high group compared with the low group of stromal scores based on p < 0.05 and FC > 1.5. Similarly, volcano plots (Fig. 2E, F) showed the distribution of upregulated DEGs and downregulated DEGs in immune scores and stromal scores groups. Furthermore, Venn diagrams displayed 287 upregulated DEGs shared in the immune scores and stromal scores groups, and 13 downregulated DEGs shared (Fig. 2G, H). Since only stromal scores showed distinct association with OV prognosis, we determined to mainly explore the DEGs in stromal scores group for the subsequent analysis in this study.
To predict the potential function of 428 upregulated DEGs (Table 1) in stromal scores group, we performed the GO function and KEGG pathway enrichment analysis of these genes. There were 355 GO terms of BP, 64 GO terms of CC, and 76 GO terms of MF indicated significant difference based on FDR < 0.05 and -log(FDR) > 1.301. Top 10 GO terms were identified, such as immune response, extracellular space, and extracellular matrix (ECM) structural constituent (Fig. 3A-C). In addition, a total of 54 KEGG pathway categories showed to be significant based on FDR < 0.05 and -log(FDR) > 1.301. Top 10 KEGG pathway categories were identified, such as staphylococcus aureus infection, phagosome, ECM-receptor interaction, which were related with immune response (Fig. 3D).
Correlation of expression of DEGs with OV prognosis
To estimate the possible role of DEGs in OS, we ploted Kaplan-Meier survival curves with log-rank (Mantel-Cox) test. Among the 428 upregulated DEGs in the stromal scores group, a total of 44 DEGs were screen out to be significantly associated with OV prognosis. Of these 44 DEGs, 26 DEGs (hazard ratio (HR) > 1, Table 2) obviously predicted poor outcomes for OV (p < 0.05, selected genes are shown in Fig. 4A-F).
PPI network of prognostic DEGs
To functionally explore the interactions between these prognostic DEGs, we constructed PPI network by the STRING database and Cytoscape software, which consisted of 41 nodes. We defined the color of node continuously by log (FC) value of prognostic DEGs in the network. Similarly, we identified the size of node continuously based on degree value which was the number of connections the node and other nodes. In this PPI network, MMP9, CXCL10, GZMB, CXCL9, CXCL11, CXCL13, C5AR1, GBP1, CD2, GBP2 and CX3CR1 were the notable nodes with higher degree values, because they showed strong association with other nodes (Fig. 5).
Validation of prognostic DEGs in a external cohort
To determine whether the prognostic DEGs identified from the TCGA database also have significant prognostic value in a different OV cohort, we obtained and analyzed gene expression signatures of a dataset with 40 OV samples from the GEO database (Series GSE32063). Eventually, 6 genes, including CH25H, CX3CR1, GFPT2, NBL1, TFPI2, and ZFP36 were verified to be significantly correlated with poor outcomes (p < 0.05, Fig. 6A-F).