Cell segmentation and annotation by deep learning-based IMC data analysis
We employed a deep learning method, mask-R-CNN (MRCNN)14 for IMC cell segmentation based on the computerized outputs of watershed segmentation (Fig. 2A). The mean dice similarity of MRCNN was significantly higher than that of watershed (mean of mean dice similarity of MRCNN = 0.59, mean of mean dice similarity of watershed = 0.56, p = 0.0023, n = 10 samples, paired t test; Fig. S1B), and the standard deviation of dice similarity of MRCNN was significantly lower than that of watershed (mean standard deviation of dice similarity of MRCNN = 0.20, mean standard deviation of dice similarity of watershed = 0.24, p = 1.33*10− 5, n = 10 samples, paired t test; Fig. S1C). These results suggest that the segmentation by MRCNN is more similar to the manual segmentation than the watershed segmentation and that it generates less oversegmentation than watershed (Fig. 2A).
After three iterations of phenograph clustering to identify cell subtypes, we identified 40 cell subtypes of T and B cell, macrophage, endothelial, and fibroblastic cell populations, as well as tumor cell subtypes from 162,869 cells in 41 images (Fig. 2B) and quantified the normalized expression of all markers across various cell subtypes (Fig. 2C). Specifically, based on the normalized expression of cell subtype–specific markers, we identified nine macrophage/monocyte subtypes (Fig. 3A), four CD8+ T cell subtypes and 6 CD4+ T cell subtypes (Fig. 3B), nine tumor subtypes (Fig. 3C), and many other fibroblastic and immune cell subtypes (Fig. 3D). Because the normalization step converts the expression of all cells in a sample to a z-score, the normalized expression reflects the relative expression across all cells. For example, although tu_1, tu_2, tu_3, tu_5, tu_6, tu_8, and tu_9 have median normalized Keratin 8/18 expression around zero (Fig. 3C), they exhibit significantly higher Keratin 8_18 expression levels than non-tumor cells (Fig. S3). Based on the results of Fig. 2C and Fig. 3, the phenotypes of all cell subtypes are summarized in Table S2.
Spatially resolved cell density and cell-cell nearest-neighbor interactions analyses of the ovarian tumor microenvironment
We automatically calculated the tumor-enriched region for each image (Methods). By computing the cell density as the cell count in the tumor-enriched region per total tumor cells, we found several cell subtypes exhibiting significant differences between long-term survivors (LTS; overall survival ≥ 60 months, n = 21) and short-term survivors (STS; overall survival ≤ 20 months, n = 20; Fig. 4A). Among different T cell subtypes, granzyme B+ CD8+ cytotoxic T cell (CD8_4) density was significantly higher in LTS (p = 0.019, Fig. 4A) and CD45RO+ CD44+ CD8+ memory T cell (CD8_3) density had a trend of declining in in LTS (p = 0.084, Fig. S4) than in STS. In addition, CD45RO+ CD4+ memory T cell (CD4_4) density was significantly higher in LTS than in STS (p = 0.024, Fig. 4A). Among different CD73+ cell subtypes, CD73+ cell (CD73_1) density and CD73mid cell (CD73_2) density were significantly lower in the LTS than in STS (p = 0.015 and p = 0.002, respectively; Fig. 4A). CD31+ CD73mid endothelial cell (CD31) density was significantly lower in LTS than in STS (p = 0.007, Fig. 4A). Among tumor cell subtypes, B7H4+ Keratin+ tumor cell (tu_9) density was significantly lower in LTS than in STS (p = 0.018, Fig. 4A). A comparison of cell densities of all cell subtypes between LTS and STS is shown in Fig. S4.
To determine whether the appearance of one cell subtype is associated with the appearance of another cell subtype, a Spearman correlation matrix between various cell subtype densities was generated. The results demonstrated that the granzyme B+ CD8+ cytotoxic T cell (CD8_4) density was negatively correlated with cell densities of B7H4+ tumor cells (tu_7; r = -0.47, p = 0.0006) and tu_9 (r = -0.38, p = 0.006; Fig. 4B), suggesting that these CD8+ cytotoxic cells are infiltrating the tumor mass and actively depleting B7H4+ ovarian cancer cells in LTS. CD163+ CD68+ CD14+ macrophage (ma_3) density was positively correlated with CD45RO+ CD44+ CD4+ memory T cell (CD4_1) density (r = 0.56, p = 2.8*10− 5), CD45ROmid CD44+ CD4+ memory T cell (CD4_3) density (r = 0.56, p = 2.4*10− 5), and CD45ROmid CD44mid CD4+ memory T cell (CD4_5) density (r = 0.45, p = 0.001) (Fig. 4B). CD163, as an M2 marker, and CD68 as a pan-macrophage or M1 marker, have been widely used to classify macrophages15,16. In addition, previous studies demonstrated that tissue-associated macrophages are very diverse and heterogeneous, are subjected to microenvironmental factors and do not have restricted M1 (CD68+) or M2 (CD163+) phenotypes17,18. Based on these findings, we postulated that a positive correlation between CD163+ CD68+ CD14+ cell type (ma_3), which represents atypical M1-macrophage population and CD4+ memory T cells (CD4_1, CD_3, CD_5) could suppress the tumor progression in LTS.
Next, we examined the prognostic significance of cell-cell nearest-neighbor interactions by computing the average cell count of cell subtype X in the nearest neighborhood of cell subtype Y (distance between the center of two cells less than 20 µm) in the tumor-enriched region of every LTS and STS patient sample. We computed the cell-cell nearest-neighbor interactions that were significantly higher or lower (Benjamini-Hochberg adjusted p value < 0.05) in LTS (n = 21) than in STS (n = 20). We identified 120 cell-cell nearest-neighbor interactions that were significantly different between LTS and STS (Fig. 5A). Among them, granzyme B+ CD8+ cytotoxic T cells (CD8_4) had significantly more interactions with multiple tumor cell subtypes (tu_1, tu_2, tu_3, and tu_5) in LTS than in STS (Fig. 5A), suggesting increased interactions between CD8 cytotoxic cells and multiple subtypes of tumor cells in LTS. An example of the interaction between CD8_4 and tu_1 is shown in Fig. S5A.
In contrast to more CD8_4–tumor cell interactions in LTS than in STS, CD73mid cell (CD73_2) had significantly fewer interactions with 17 cell subtypes, including CD73_1, CD31, macrophages and monocytes (ma_1, ma_2, ma_4, ma_5, ma_8, ma_9), stromal cells (s_1, s_2), T cells (CD4_5, CD8_2), and tumor cells (tu_1, tu_3, tu_4, tu_5, tu_6), than in STS, suggesting that when the number of CD73_2 cells is high, they are mainly surrounded by macrophages and tumor cells in STS. The interaction between CD73_2 and CD163+ CD68+ CD14+ macrophages (ma_9) is shown in Fig. S5B. CD4_4 cells had significantly more interactions with CD163+ CD68+ Vistamid CD14+ macrophages (ma_1; Fig. S5C) and CD14+ monocytes (ma_2) in LTS than in STS (Fig. 5A), suggesting increased interactions between CD4+ memory T cells and certain subtypes of macrophages in LTS. CD45RO+ CD44+ CD8+ memory T cells (CD8_3) had significantly fewer interaction with CD163+ CD14mid macrophages (ma_8; Fig. S5D, Fig. 5A) in LTS than in STS, suggesting decreased interactions between CD8+ memory T cells and this subtype of macrophages in LTS.
Feature selection for overall survival prediction by logistic regression
We used a machine learning method, logistic regression to predict survival, with recursive elimination of features after filtering out highly correlated ones (see Methods). Our first approach included using only the cell density detected in tumor-enriched regions and patient age as features (Fig. 5B). The optimal number of selected features was seven, because both training accuracy (0.947) and validation accuracy (0.938) were high (Fig. 5B.a). The test accuracy was 0.78 (test sensitivity = 0.6, test specificity = 1) and the area under the curve (AUC) was 0.8 (Fig. 5B.b). Among the seven prognostic features selected by the model, CD73mid cells (CD73_2), CD31+ CD73mid endothelial cells (CD31), CD163+ CD68+ Vistamid CD14+ macrophages (ma_1), CD45RO+ CD44+ CD8+ memory T cells (CD8_3), and age had negative coefficients, suggesting that they were inversely correlated with patient survival. Granzyme B+ CD8+ cytotoxic T cells (CD8_4) and CD45RO+ CD4+ memory T cells (CD4_4) had positive coefficients, suggesting that they were positively correlated with patient survival (Fig. 5B.c, Fig. 4A).
Next, we used the cell-cell nearest-neighbor interactions, which are related to the seven prognostic cell density features in the tumor-enriched region, and patient age as features (Fig. 5C). The optimal number of selected features was 11, because both training accuracy (1) and validation accuracy (1) were the highest for that number of features (Fig. 5C.a). The test accuracy was 0.89 (test sensitivity = 1, test specificity = 0.75) and the AUC was 1 (Fig. 5C.b). Among the 11 features that were selected by the logistic regression model, CD73_2 neighboring CD73_1, ma_9, or s_2; CD31 neighboring tu_4; CD8_3 neighboring ma_8; and age were negatively correlated with patient survival. In contrast, CD4_4 neighboring s_2, s_1, ma_1, or CD44 and CD8_4 neighboring tu_1 were positively correlated with survival (Fig. 5C.c). Some of the prognostic nearest-neighbor interaction features, such as CD8_4 neighboring tu_1, CD73_2 neighboring ma_9, CD4_4 neighboring ma_1, and CD8_3 neighboring ma_8, can be visualized in Fig. S5. These results indicate that multiple neighboring interactions may have similar predictive power for survival prediction. The complex cell-cell interaction patterns of ovarian cancer with various immune cell and other stromal subtypes led to divergence in the tumor microenvironment between LTS and STS.
Taken together, our results indicated that using cell-cell nearest-neighbor interactions and age as features allowed a more accurate prediction of patient survival than using cell densities and age as features. The Spearman correlation of any two neighbor interaction features that both had relatively high correlation with patient survival (Spearman correlation coefficient > 0.2) and were related to the seven prognostic cell density features is visualized in Fig. 5D. The Spearman correlation study demonstrated that certain features were highly correlated. For example, we found that the average cell count of CD8_4 neighboring with a tu_1 cell was highly correlated with the average cell count of CD8_4 neighboring with a tu_2 cell (r = 0.77, p = 5*10− 9) and the average cell count of CD8_4 neighboring with a tu_3 cell (r = 0.72, p = 1*10− 7). The feature elimination process of our logistic regression model first filtered out the highly correlated features and kept one feature of each highly correlated feature pair (see Methods). Owing to this elimination process, features that are highly correlated with any feature in the prognostic list identified by the logistic regression model may also have prognostic value. For example, because CD8_4 neighboring tu_1 was a prognostic feature, CD8_4 neighboring tu_2 or CD8_4 neighboring tu_3 may have similar prognostic values.
Correlations between cell subtype density and transcriptomic profiles from microdissected fibroblastic and epithelial compartments of HGSC
To provide mechanistic insights by which certain cell phenotypes identified by IMC modulate survival in HGSC patients, we performed correlation studies to identify genes in the epithelial compartment (Fig. 6A) or fibroblastic compartment (Fig. 6B) of HGSCs that had a significant positive or negative correlation with the IMC cell density in the tumor tissue (Table S3 and 4) and that were significantly different between LTS and STS (Fig. S6 and 7). A total of 26 samples with whole transcriptome data available were used. We focused our attention on the cell densities of the six IMC features that were significantly different between LTS and STS (Fig. 4A), four of which were also selected by our machine learning model for survival prediction (Fig. 5B.c). We discovered relationships consistent with known cancer biology, and we also made several unexpected observations.
In the epithelial compartment of HGSC tumors, we identified expression levels of several genes involved in the migration and invasion of cancer cells correlated with CD73_1, CD73_2, and CD31 densities and with poor survival, suggesting that these genes regulate the metastatic potential of cancer cells in a paracrine manner; these genes included PARD6B, S100A10, SLURP1, and SPINT1. In addition, we demonstrated that ovarian cancer cell–derived ITIH5, TACSTD2, and WFDC2 expression levels were negatively correlated with granzyme B+ CD8+ cytotoxic T cell (CD8_4) densities. In contrast, BPIFB1 and SLURP1 expression levels were positively correlated, and CD9 and ITIH5 were negatively correlated with CD45RO+ CD4+ memory T cell (CD4_4) densities (Fig. 6A, Table S3, Supp. Figure 6). These findings suggest that these ovarian cancer cell–derived genes, which are known to code for extracellular matrix and metabolism modulators19–25, could facilitate or block T cell infiltration in the tumor microenvironment as well as interfere with T cell activation and facilitate immune system evasion. Moreover, cancer cell–derived FST and PARD6B were positively correlated with CD31+ CD73mid endothelial cell (CD31) density, indicating that these genes modulate endothelial cell activity and subsequently angiogenesis. The prognostic significance of these genes was determined by analyzing 530 samples of optimally debulked advanced HGSC in the KM Plotter, and the results are summarized in Fig. S6.
To identify cancer-associated fibroblast (CAF)-derived mediators that confer the prognostic phenotypes identified by IMC, we analyzed correlations of the expression levels of genes in the transcriptomes generated from the microdissected fibroblastic compartment of HGSC with the prognostic phenotypes identified. Survival analyses of the significantly correlated genes were also performed. The list of CAF-derived genes that were positively correlated with CD73+ cell (CD73_1) and CD73mid cell (CD73_2) densities and also associated with STS is shown in Table S4, Fig. 6B, and Fig. S7. These genes are likely expressed by CD73_1 and CD73_2, two different CAF subtypes. Among these genes, CCDC85B, DDAH1, EFEMP2, F2RL1, ITGB1, LOX, LDLR, MFGE8, MICAL2, MKL1, MSRB3, NCAM1, NPTX, PLAT, SLC2A3, SPSB1, and VASN have been described as promotors of tumor cell growth, invasion, and migration, as well as angiogenesis26–41 (Table S4, Fig. S7). HMOX2, ICMT, MICU1, and TSPAN9 also positively correlated with CD73_1 and CD73_2 densities and short survival and have been shown to be involved in conferring chemoresistance in ovarian cancer34,42−44 (Table S4, Fig. S7). CAF-derived ANGPTL2, WWTR1, and PLOD2 promote tumor progression and cell invasion, and MYO10 regulates CAF rigidity45–50. These findings indicate that these mediators produced by CD73_1 and CD73_2 CAF subtypes modulate the malignant phenotypes of HGSC.
CAF-derived genes involved in the modulation of the immune response and also associated with patient survival were also examined, and the results are summarized in Table S4 and Fig. S7. Among these genes, BMPR1B, a gene encoding the receptor of the bone morphogenetic protein (BMP), was negatively correlated with CD4_4 cells and associated with STS. This finding suggests that CAFs expressing high levels of BMPR1B may be more responsive to BMP signaling, which subsequently modulates the rigidity of CAFs, stiffness of the tumor microenvironment, and CD4_4 T cell trafficking. Besides BMPR1B, VSTM4, a secreted protein that can reduce IFN-γ, IL-2, and IL-17 cytokine production by human T cells and cause a profound decrease in T cell activation51, was negatively correlated with CD4_4 density but positively associated with STS, suggesting that CAF-derived VSTM4 modulates CD4_4 activity and subsequently leads to poor survival rates in patients with HGSC.
In addition to these correlations between expression of CAF-derived genes and prognostic immune cell phenotypes, expression of several genes with prognostic significance was also associated with the two CAF phenotypes CD73_1 and CD73_2. FN1, TGFBI, TNC, and LRRC32 were positively correlated with both CD73_1 and CD73_2 and negatively correlated with survival. LRRC32 is a key regulator of TGF-β activation52. Together with increased TGFB1 secreted by CD73_1 and CD73_2 CAFs, LRRC32 may generate an immune-suppressive and pro-tumorigenic microenvironment to support the malignant phenotype of ovarian cancer cells, as we previously described53,54. Both FN1 and TNC encode extracellular proteins that have previously been shown to be associated with short progression-free survival and increased migration-inducing potential in HGSC55,56.
Several fibroblastic genes were associated with density of endothelial cells (CD31). Among them, MFGE8, which was also positively correlated with CD73_1 and CD73_2 density, demonstrated the strongest correlation with increased CD31 density and poor patient survival (Table S4 and Fig. S7). Our findings are supported by published studies reporting that MFGE8 increases tumor angiogenesis by increasing VEGF and ET-1 expression in stromal cells and by enhancing M2 polarization of macrophages32. Moreover, MFGE8 proteins accumulated around CD31+ blood vessels have been shown to promote angiogenesis by enhancing PDGF-PDGFRβ signaling mediated by integrin-growth factor receptor crosstalk33,57.
Taken together, the transcriptome analysis revealed multiple key genes in both the cancer cell and fibroblastic compartments of HGSC that correlate with IMC features positively or negatively, confer either a tumor-promoting or an immune-suppressive microenvironment, and subsequently lead to short-term survival in HGSC patients (Fig. 6C).