A single cell map of human endometrium
To analyze the scRNA-seq profiles of human endometrium across four menstrual cycle stages, we performed principal-component analysis (PCA) on the 2,000 most variably expressed genes across 59,397 cells. Cells were grouped into transcriptionally distinct clusters using the top 20 principal components (PCs). Visualization with a UMAP plot revealed eight clusters that could be annotated to known cell types (Fig. 1A). Moreover, cells clustered by cell type rather than by patient identity (Fig. 1B). We then used well-known marker genes to define the identity of each cell cluster. For example, epithelial cells expressed KRT8 and PAEP; endothelial cells expressed VWF; fibroblasts expressed APOD, DCN and COL3A1; perivascular (PV) cells expressed RGS5; smooth muscle cells expressed ACTG2 and MYH11; multipotent stromal cells (MSCs) expressed TOP2A and UBE2C; and lymphoid and myeloid cells expressed CD74, NKG7 and GNLY (Fig. 1A, E and Additional file 1: Fig. S1).
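As an illustration of marker-based annotation, the following minimal sketch assigns each cluster the cell type whose markers show the highest average expression. The marker lists are those named above; the function name `annotate_cluster`, the scoring rule (a plain mean over markers) and all expression values are assumptions made for illustration and are not the procedure used in this study.

```python
# Marker genes as listed in the text. Lymphoid and myeloid cells are grouped
# under one "immune" label here because the text lists their markers together.
MARKERS = {
    "epithelial": ["KRT8", "PAEP"],
    "endothelial": ["VWF"],
    "fibroblast": ["APOD", "DCN", "COL3A1"],
    "perivascular": ["RGS5"],
    "smooth muscle": ["ACTG2", "MYH11"],
    "MSC": ["TOP2A", "UBE2C"],
    "immune": ["CD74", "NKG7", "GNLY"],
}


def annotate_cluster(mean_expr, markers=MARKERS):
    """Return the cell type whose markers have the highest mean expression.

    mean_expr maps gene name -> cluster-level mean expression; genes that
    are missing from the mapping are treated as not expressed (0.0).
    """
    def score(genes):
        return sum(mean_expr.get(g, 0.0) for g in genes) / len(genes)

    return max(markers, key=lambda cell_type: score(markers[cell_type]))


# Hypothetical cluster-level mean expression values (not real data).
cluster_means = {
    "c0": {"KRT8": 3.1, "PAEP": 2.4, "VWF": 0.1, "DCN": 0.2},
    "c1": {"VWF": 2.9, "RGS5": 0.4, "KRT8": 0.1},
    "c2": {"TOP2A": 2.2, "UBE2C": 1.9, "DCN": 0.8, "APOD": 0.3},
}
labels = {c: annotate_cluster(expr) for c, expr in cluster_means.items()}
```

Averaging over markers rather than summing keeps cell types with longer marker lists from being favored; in practice, annotation is usually checked against the full differential expression results rather than a handful of genes.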
Next, we calculated the proportion of each cell type across the four menstrual cycle stages and across patients. The proportions of cell types differed significantly among stages and among patients (Fig. 1C-D, p-values < 2.2E-16). PV cells predominated in the early-secretory stage and in patient A30. Fibroblasts decreased in the early-secretory stage, and immune cells were enriched in the proliferative and late-secretory stages (Fig. 1C). MSCs decreased over the course of the menstrual cycle. Moreover, we identified the highly expressed marker genes of each cell type. C1QA, C1QB and C1QC were highly expressed in myeloid cells (Fig. 1E), indicating their phagocytic ability [26]. Together, our analysis provided a comprehensive catalog of the major cell types together with their cellular position in the endometrium.
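A comparison of cell-type proportions across stages can be sketched as a test of independence on a stage-by-cell-type contingency table. The text does not name the test used, so the chi-square test below is an assumption; the function names and the counts are hypothetical.

```python
def row_proportions(table):
    """Convert each row of counts into proportions that sum to 1."""
    return [[count / sum(row) for count in row] for row in table]


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table (rows = stages, columns = cell types)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of stage and cell type.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: rows are two stages, columns are
# PV cells, fibroblasts and immune cells.
counts = [
    [10, 60, 30],
    [40, 40, 20],
]
props = row_proportions(counts)
stat, dof = chi_square_independence(counts)
```

The statistic is then compared with a chi-square distribution with `dof` degrees of freedom to obtain a p-value. With tens of thousands of cells, even modest shifts in proportion yield very small p-values, so the proportions themselves remain the more informative quantity.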
Systematic discovery of cell type-specific RNAs in human endometrium
We next explored the cell type-specific RNAs that could help explain the distinct biological states of these cell types. Functional enrichment analysis revealed that genes highly expressed in fibroblasts were significantly enriched in wound healing, regulation of vasculature development and regulation of angiogenesis (Fig. 2A). Genes highly expressed in smooth muscle cells were enriched in wound healing, extracellular matrix organization and extracellular structure organization, whereas PV cell-specific genes were enriched in response to corticosteroid, steroid hormone and glucocorticoid (Fig. 2A). Epithelial cell-specific genes were significantly enriched in epithelial cell proliferation and tissue migration, and endothelial cell-specific genes were enriched in regulation of vasculature development and angiogenesis (Fig. 2A). MSCs are a population of self-renewing multipotent cells in the perivascular regions of the endometrium in both the basalis and functionalis [27, 28]. We found that genes highly expressed in MSCs were enriched in sister chromatid segregation and nuclear division (Fig. 2A). A large proportion of MSCs were in the G2M phase of the cell cycle (Fig. 2B). In particular, two proliferation marker genes, MKI67 and TOP2A, were highly expressed in MSCs (Fig. 2C-D). These results suggest that the functional enrichment of each cluster is consistent with its role in human endometrium.
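Functional enrichment of a cell type-specific gene list is commonly assessed by over-representation analysis. The sketch below computes a one-sided hypergeometric p-value for a single term; the text does not state which enrichment method was used, so this is an assumed approach, and the function name and all numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_term, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, n_in_term of which are annotated to the term."""
    upper = min(n_in_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 200 cell type-specific genes, 12 of which fall in a
# 150-gene term, against a background of 20,000 genes.
p_value = hypergeom_enrichment_p(20000, 150, 200, 12)
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow in the tail sum. When many terms are tested, the resulting p-values would normally be corrected for multiple testing.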
Two newly discovered subtypes of perivascular cells
We analyzed the PV population (n = 8,120) based on known markers and re-clustered it into two distinct populations, as indicated in the UMAP (Fig. 3A and Additional file 1: Fig. S2A). We analyzed the relative proportion of cells in each cluster and noted that the two clusters exhibited similar proportions across menstrual cycle stages and patients (Fig. 3B, p = 1.862E-8 and Additional file 1: Fig. S2B). These observations were consistent with the previous finding that PV-MYH11+ cells are characteristic of the myometrium, whereas PV-STEAP4+ cells are present only in the endometrium [6]. We next analyzed the gene expression profiles and identified the top differentially expressed genes in the two PV populations (Fig. 3C). In particular, STEAP4 and MYH11 were each expressed in only one of the two PV populations (Fig. 3D-E). Several collagen-related genes (e.g., COL3A1, COL1A2, COL4A1 and COL1A1) were highly expressed in the PV-STEAP4+ population (Fig. 3C), which might be related to its roles in the repair of endometrial damage and in induced angiogenesis [29]. Moreover, IGFBP5 was highly expressed in the PV-STEAP4+ subtype, consistent with its roles in promoting angiogenic and neurogenic differentiation [30, 31]. In contrast, the PV-MYH11+ population was characteristic of the myometrium and highly expressed MUSTN1 and MYH11.
To further determine the specific roles that the two PV populations might play in human endometrium, we performed functional enrichment analysis based on the differentially expressed genes. Genes highly expressed in each of the two PV populations were significantly enriched in wound healing, response to oxidative stress and extracellular matrix organization (Additional file 1: Fig. S2C). However, the proportions of enriched genes were much higher in PV-STEAP4+ cells. In particular, genes highly expressed in PV-STEAP4+ cells were significantly enriched in extracellular matrix organization and extracellular structure organization, while genes highly expressed in PV-MYH11+ cells were significantly enriched in muscle development-related functions (Fig. 3F). Cancer hallmark-related pathways also exhibited distinct activities in the two PV populations. PV-STEAP4+ cells exhibited higher activities in immune- and metabolism-related functions, and PV-MYH11+ cells exhibited higher activities in signaling and proliferative pathways, such as PI3K-AKT-mTOR (Fig. 3G). It has been demonstrated that glycolysis is beneficial to angiogenesis and plays an important role in endometrial decidualization [32–34]. In addition, the PI3K-AKT-mTOR pathway plays an important role in the decidualization of the endometrium.
Subpopulations of epithelial cells across human menstrual cycle
The epithelial cell cluster (n = 5,758) was then assigned to four epithelial subtypes based on known markers from the published literature [35, 36]. Three secretory glandular cell subtypes (secretory LGALS1+, PAEP+ and MT+) and ciliated cells were identified (Fig. 4A). We calculated the proportions of the four subpopulations across the menstrual cycle and found that ciliated epithelial cells were enriched in the proliferative phase, whereas secretory LGALS1+ cells were enriched in the early- and mid-secretory phases (Fig. 4B, p < 2.2E-16). Examination of the gene expression patterns of the four epithelial subpopulations revealed that ciliated cells expressed high levels of ciliated-cell markers such as TPPP3, PIFO and FAM183A (Fig. 4C-D).
Gene set enrichment analysis showed that genes up-regulated in ciliated cells were mainly enriched in cilium organization, cilium assembly and cilium movement (Fig. 4E and Additional file 1: Fig. S3). This result is similar to that of a previous study [7]. The secretory MT+ cell population highly expressed genes of the metallothionein (MT) family, such as MT2A, MT1G, MT1E, MT1X and MT1H (Fig. 4E), which might contribute to impaired endometrial receptivity [37]. Moreover, a higher proportion of secretory MT+ cells was observed in the late-secretory phase (Fig. 4B). This is consistent with previous observations that MT genes are highly expressed in the late-secretory phase [38]. The secretory PAEP+ cell population was enriched during the menstrual cycle and highly expressed CLDN3, CLDN4 and CXCL family genes (Fig. 4E). Functional analysis revealed that genes up-regulated in this cell population were enriched in regulation of cell-cell adhesion, cell migration and reproductive development (Additional file 1: Fig. S3).
We next performed functional analysis based on cancer hallmark-related pathways. Examination of the pathway activities showed that the secretory populations exhibited higher activities in numerous pathways (Fig. 4F). The ciliated epithelial cells exhibited higher pathway activities in bile acid metabolism and mitotic spindle. The Wnt signaling, DNA repair and G2M checkpoint pathways were enriched in the secretory LGALS1+ cell population (Fig. 4F). In particular, epithelial-mesenchymal transition was enriched in secretory LGALS1+ cells, consistent with their roles in regeneration or differentiation of the endometrium [27]. Inflammation-related pathways, such as the IL6-JAK-STAT3 and IL2-STAT5 signaling pathways, were significantly enriched in the secretory PAEP+ cell population (Fig. 4F). These results extend the transcriptional signatures and potential pathways underlying human endometrial epithelial cells.
Fibroblast cells separate into four distinct cell types
Fibroblasts (n = 21,865) were further separated into four clusters and annotated based on their most highly expressed genes (Fig. 5A). SPARCL1, ID4, MMP11 and EGR1 were highly expressed in the corresponding clusters. Fibroblast ID4+ and SPARCL1+ cells were primarily observed in the early- and late-secretory phases (Fig. 5B). Fibroblast MMP11+ cells were primarily found in the proliferative phase and EGR1+ cells in the mid-secretory phase (Fig. 5B, p < 2.2E-16). The proportions varied among patients (Additional file 1: Fig. S4A), suggesting high heterogeneity among patients. Gene expression analysis revealed that fibroblast SPARCL1+ cells highly expressed fibroblast-related genes, such as DCN, APOD, COL1A2 and COL15A1 (Fig. 5C and Additional file 1: Fig. S4B). In particular, each of the four clusters highly expressed its corresponding marker gene (Fig. 5D). We next performed functional enrichment analysis and found that genes up-regulated in fibroblast SPARCL1+ cells were significantly enriched in extracellular structure organization and negative regulation of locomotion (Fig. 5E). Genes highly expressed in fibroblast ID4+ cells were enriched in epithelial cell proliferation, positive regulation of cytokine production and regulation of angiogenesis (Fig. 5E). Numerous transcription factors (TFs), such as JUNB, EGR1, FOS and JUN, were highly expressed in the fibroblast EGR1+ subpopulation. We also observed high expression of GADD45B in this population, which plays important roles in DNA repair [39]. We also investigated the cancer hallmark pathway activities in the different fibroblast populations and found that the Wnt, MYC and peroxisome pathways exhibited higher activities in the ID4+ cell population (Additional file 1: Fig. S4C). The Notch and PI3K-AKT signaling pathways were active in the SPARCL1+ cell population (Additional file 1: Fig. S4C). Together, these results uncovered the potential pathways underlying different fibroblast populations in human endometrium.
TF regulators of cell types in human endometrium
It has been demonstrated that genes usually interact with each other to form a complex interaction network [40]. We thus analyzed the correlations between gene expression levels and found that genes highly expressed in the same cell type were co-expressed with each other (Fig. 6A), indicating modular programs associated with the basic cellular functions of each cell type. TFs are important regulators of gene expression [41]. We calculated the activity of each module as the average expression of its genes and found that the expression of several TFs (such as IRF8, SOX17, FOS and KLF2) was significantly correlated with the module activities (Additional file 1: Fig. S5). We thus identified cell type-specific TFs using the pySCENIC pipeline. In total, we identified 336 TFs that exhibited high activities in different cell types, 62 of which also exhibited differential expression (Fig. 6B). TFs exhibited high cell type- and phase-specificity. For example, SOX4 exhibited higher activity in the proliferative phase (Fig. 6B). SOX4 is closely associated with the development and progression of many malignant tumors and plays an important role in cell growth and proliferation [42]. SOX4 expression was also significantly higher in the proliferative phase than in the secretory phases. In addition, MAF, KLF4, JUN, FOS and EGR1 exhibited higher activities in the secretory phases (Fig. 6C).
Moreover, we found that several TFs exhibited cell type-specific activities in human endometrium. CD59, which is highly expressed in endothelial cells of the human endometrium, exhibited higher activity in endothelial cells (Additional file 1: Fig. S6). It has been demonstrated that CD59 may be important in protecting endothelial cells against C-mediated damage at local sites of inflammation, thereby maintaining vascular integrity [43]. E2F1 and EZH2 exhibited higher activities in the MSC population (Fig. 6B). E2F1 plays a pivotal role in driving cells out of a quiescent state and into the S phase of the cell cycle [44], and the activity of EZH2 influences cell fate regulation [45]. Functional analysis of the targets of EZH2 revealed that they were significantly enriched in DNA replication and cell cycle, which is consistent with the proliferative state of MSCs. EGR1, which has been demonstrated to contribute to inflammatory factors and fibrosis reduction [46], exhibited higher activity in fibroblasts (Fig. 6B). The potential targets of EGR1 were significantly enriched in the Wnt signaling pathway and cell growth (Fig. 6D). In addition, FOS exhibited higher activities in the fibroblast and smooth muscle cell populations (Fig. 6B), and its targets were significantly enriched in muscle tissue development, Wnt signaling and cell differentiation (Fig. 6D). It has been demonstrated that FOS plays an important role in the control of fibroblast senescence and the activation of programmed cell death [47]. Several TFs, such as STAT4, BATF, RUNX3 and EOMES, also exhibited higher activities in immune cells (Fig. 6B). These results suggest that the dynamic transcriptome during the menstrual cycle is tightly regulated by TFs with dynamic activities.
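The module-activity calculation described in this section (the average expression of a module's genes, correlated with TF expression) can be sketched as follows. The function names, the gene names and the expression values are hypothetical, and Pearson correlation is an assumption because the text does not name the correlation measure.

```python
from math import sqrt


def module_activity(expr_by_gene, module_genes):
    """Per-cell module activity: mean expression over the module's genes.

    expr_by_gene maps gene name -> list of expression values, one per cell.
    Module genes absent from the mapping are skipped.
    """
    present = [g for g in module_genes if g in expr_by_gene]
    if not present:
        raise ValueError("none of the module genes were measured")
    n_cells = len(expr_by_gene[present[0]])
    return [
        sum(expr_by_gene[g][i] for g in present) / len(present)
        for i in range(n_cells)
    ]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for four cells (not real data).
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [3.0, 4.0, 5.0, 6.0],
    "TF_X": [2.0, 4.0, 6.0, 8.0],
}
activity = module_activity(expr, ["GENE_A", "GENE_B"])
r = pearson(activity, expr["TF_X"])
```

This simple average is distinct from the regulon activity scores produced by pySCENIC, which rank genes within each cell; the sketch covers only the module-level correlation step.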
Cell-cell interactions in human endometrium
Tissue development and the progression of complex diseases rely on a complex network of cell-cell interactions [48]. We thus inferred intercellular communication in human endometrium using CellChat [24]. Myeloid cells frequently interacted with MSCs and epithelial cells, while fibroblasts frequently interacted with the other cell types (Fig. 7A). Immune cells mainly functioned as signal input cells. As signal output cells, epithelial and endothelial cells mainly interacted with immune cells as targets (Fig. 7A). Next, the 15 signaling pathways associated with the inferred networks were mapped onto a two-dimensional manifold and clustered into four groups (Fig. 7B-C). The MIF, IGF, PTN and MK pathways were grouped into cluster 1, while cluster 4 comprised the VEGF, CXCL and VISFATIN pathways (Fig. 7C).
We specifically examined how the macrophage migration inhibitory factor (MIF) pathway mediates communication among cell populations (Fig. 7D). All cell types functioned as signal output cells, although the interaction intensity differed. When MSCs and epithelial cells acted as signal input cells, the output intensity was low. In this pathway, immune cells mainly served as signal input cells, and the strength of the intercellular interactions was high (Fig. 7D). MIF has been identified as a potential biomarker of endometriosis [49, 50]. We also examined the ligand-receptor pairs that play a major role in this pathway and identified two pairs, MIF-(CD74 + CD44) and MIF-(CD74 + CXCR4) (Fig. 7E). MIF was highly expressed in all cell types, while CXCR4, CD74 and CD44 exhibited higher expression in immune cells (Fig. 7F). We also found that, in the SPP1 signaling pathway, only myeloid cells acted as signal output cells; they sent signals to fibroblasts, PV cells, smooth muscle cells and lymphocytes in a paracrine manner and also signaled to a large extent in an autocrine manner (Fig. 7G). The SPP1-(ITGA4 + ITGB1) and SPP1-CD44 pairs play important roles in these cell-cell communications (Fig. 7H). SPP1 and CD44 exhibited higher expression in immune cells, while ITGB1 was highly expressed in all cell types (Fig. 7I).
Next, we compared the information flow of each signaling pathway. For signal input, several pathways functioned in only one cell type, such as EGF and EDN in fibroblasts, VEGF and CXCL in endothelial cells, and ncWNT in PV cells (Fig. 7J). The MIF signaling pathway participated only in the signal input of immune cells, and most of the pathways were involved in the signal input patterns of myeloid cells, fibroblasts and endothelial cells (Fig. 7J). For signal output, IL6, EGF, SPP1 and CCL played a role only in the signal output of immune cells, and EDN played a role only in the signal output of endothelial cells. As with the signal input patterns, most of the pathways were involved in signal output by fibroblasts and myeloid cells (Fig. 7K). Together, cell-cell communication analysis enabled a multifaceted assessment of intercellular communication patterns in human endometrium.
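To make the ligand-receptor reasoning above concrete, the sketch below scores a sender-receiver pair as the ligand expression in the sender multiplied by the geometric mean of the receptor subunits in the receiver, using the MIF-(CD74 + CD44) pair as the example. This is a deliberately simplified stand-in and not CellChat's own communication probability model; the function name `lr_score` and all expression values are hypothetical.

```python
from math import prod


def lr_score(mean_expr, sender, receiver, ligand, receptor_subunits):
    """Simplified interaction score for one ligand-receptor pair.

    mean_expr maps cell type -> {gene: mean expression}. A multi-subunit
    receptor is summarized by the geometric mean of its subunits, so the
    score is zero whenever any subunit is absent in the receiver.
    """
    ligand_level = mean_expr[sender].get(ligand, 0.0)
    subunit_levels = [mean_expr[receiver].get(s, 0.0) for s in receptor_subunits]
    receptor_level = prod(subunit_levels) ** (1.0 / len(subunit_levels))
    return ligand_level * receptor_level


# Hypothetical mean expression: MIF is broadly expressed, while CD74 and
# CD44 are higher in myeloid cells, mirroring the pattern described above.
mean_expr = {
    "fibroblast": {"MIF": 2.0, "CD74": 0.5, "CD44": 0.5},
    "myeloid": {"MIF": 2.0, "CD74": 4.0, "CD44": 1.0},
}
to_myeloid = lr_score(mean_expr, "fibroblast", "myeloid", "MIF", ["CD74", "CD44"])
to_fibroblast = lr_score(mean_expr, "myeloid", "fibroblast", "MIF", ["CD74", "CD44"])
```

With a ligand that is uniformly expressed, the score is driven entirely by receptor expression in the receiving cell type, which is one way to read the observation that immune cells mainly act as signal input cells in the MIF pathway.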
RNA-based molecular subtypes of human endometrial cancers
Endometrial cancer (EC) is the most common gynecologic malignancy. We next explored the expression patterns of the ligand-receptor genes in EC. Patients could be grouped into five clusters based on the ligand-receptor genes of the 15 pathways identified in the cell-cell communication analysis (Fig. 8A). Survival differed significantly among the five clusters, and patients in cluster-5 had poor survival (Fig. 8B, log-rank test p = 0.00025). Patients in cluster-5 showed distinct expression of 17 genes, which were involved in multiple cytokine-related pathways, including MIF, CXCL, SPP1 and VEGF. We next calculated the activities of these pathways and found that the majority of pathway activities in cluster-5 were significantly higher than in the other clusters. For example, epidermal growth factor (EGF) is a conventional mitogenic factor that can stimulate the proliferation of various types of cells, including epithelial cells and fibroblasts [51]. The activity of EGF in cluster-5 was significantly higher than that in clusters 1, 3 and 4 (Fig. 8C). The EGFR family also plays an important role in maintaining epithelial homeostasis [52] and is overexpressed in EC and ovarian cancers [53, 54]. Compared with the other clusters, ERBB2 was highly expressed in patients of cluster-5 (Fig. 8D).
In addition, overexpression of IGF2, a major ligand of the IGF pathway, has been shown to play a role in many cancers [55]. However, there seem to be few reports about its role in endometrial cancer. We found that the activity of the IGF pathway and the expression of IGF2 in patients of cluster-5 were significantly higher than in cluster-2 and cluster-4 (Fig. 8E-F). In contrast, several pathways (e.g., CXCL) had lower activities in cluster-5, and the expression of CXCR4 was also lower in patients of cluster-5. These results suggest that the genes identified here can be used to predict the survival of patients with endometrial cancer.
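The survival comparison among patient clusters rests on per-cluster survival curves, which are then compared with the log-rank test. As a minimal sketch of the first step only, the Kaplan-Meier estimator below computes the survival function for one group; the function name and the follow-up data are hypothetical, and the log-rank comparison itself is not implemented here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one group of patients.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up for four patients in one cluster (not real data).
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the second death in this toy example reduces survival by one half rather than one third.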