3.1 Identification of dysregulated miRNAs in human CRC.
As shown in the volcano plot and heatmap, differentially expressed miRNAs in GEO and TCGA datasets were identified separately with a threshold predefined (Fig. 1A–D). In the GSE108153 dataset, a total of 49 dysregulated miRNAs were identified. Meanwhile, a total of 504 dysregulated miRNAs were also screened out in the TCGA dataset. A total of 28 dysregulated miRNAs were selected for subsequent analyses by using the intersect function of Venn Diagram (Fig. 1E).
3.2 Identification of key miRNAs.
In order to evaluate the prognostic values of dysregulated miRNAs for overall survival in human CRC, clinical information including survival status and survival time of 547 patients with CRC were obtained from the TCGA database. The results showed that five of the dysregulated miRNAs (miR-195-5p, miR-10b-5p, miR-21-3p, miR-125b-5p, and miR-145-5p) were significantly associated with prognosis of patients with CRC (P < 0.05; Fig. 2A-E). In the meanwhile, we compared the expression levels of these survival related miRNAs between cancer samples and normal samples in TCGA dataset. After combination of the prognostic value and expression level, we found that only two of them (miR-195-5p and miR-10b-5p) showed both high expression and poor prognosis in human CRC (P < 0.05; Fig. 2F-J). To further enhance the reliability of our screening results, we verified the two dysregulated miRNAs in the GEDS database. As shown in Fig. 2K and L, miR-195-5p and miR-10b-5p were dysregulated in multiple cancer types including CRC. Subsequently, those two miRNAs were chosen for next analyses.
3.3 Construction of ceRNA regulatory network.
The upstream lncRNAs that could potentially bind to the key miRNAs were predicted through starBase and LncBase V2 databases, a total of three upstream lncRNAs were predicted to regulate the two key miRNAs simultaneously (Fig. 3A). Subsequently, in order to identify downstream mRNAs that could potentially be targeted by those two key miRNAs, we applied miRTarBase, targetScan and miRDB to improve the credibility of the predicted results. As a result, a total of 287 downstream mRNAs were predicted to be targeted by miR-195-5p, while there were 22 downstream mRNAs that could potentially be targeted by miR-10b-5p (Fig. 3B & C). Based on the prediction above, a ceRNA regulatory network comprised of six lncRNA-miRNA pairs and 310 miRNA-mRNA pairs was established and visualized by Cytoscape software (Fig. 3D).
3.4 Functional analysis for the ceRNA regulatory network
We used DAVID database to further explore the biological functions of the ceRNA regulatory network. A total of 222 GO terms and 45 KEGG pathways were enriched, and the top 10 enriched GO terms of each group and the top 20 enriched KEGG pathways according to P value were displayed (Tables 1 & 2). In the biological processes (BP) group, the ceRNA regulatory network mainly enriched in the process of regulation of transcription from RNA polymerase II promoter, regulation of cell migration, regulation of cell adhesion, and regulation of apoptotic process. In the cellular component (CC) group, the ceRNA regulatory network mainly enriched in nucleoplasm, cytoplasm, and nucleus. In the molecular function (MF) group, the ceRNA regulatory network mainly enriched in protein binding, beta-catenin binding, and ubiquitin protein ligase activity. These result above indicated that the ceRNA network was significantly enriched in terms correlated with cell proliferation, cell migration, and cell apoptosis. The KEGG enrichment analysis revealed that the ceRNA regulatory network was significantly enriched in multiple cancer related pathways such as PI3K-Akt signaling pathway, FoxO signaling pathway, Signaling pathways regulating pluripotency of stem cells, and AMPK signaling pathway.
3.5 Identification and validation of key lncRNA and mRNA.
Based on the ceRNA hypothesis, the miRNA could be negatively correlated with its upstream lncRNA and downstream mRNA simultaneously. Therefore, we assessed the expression levels and prognostic values of the predicted lncRNAs for overall survival in patients with CRC by using GEPIA and PrognoScan databases, respectively. Only two lncRNAs (NEAT1 and XIST) showed both low expression and improved prognosis in patients with CRC (Fig. 4A & B). As for downstream mRNA, we collected gene expression profiles in human CRC from TCGA database to assess the expression levels of the predicted mRNA. In the meanwhile, the prognostic values of those mRNAs for overall survival in patients with CRC were also assessed using the PrognoScan database. A total of 1964 downregulated genes were identified in human CRC from TCGA database, we then integrated the 1964 downregulated genes with the downstream mRNAs in the ceRNA regulatory network. As a result, no specific gene was identified to be targeted by miR-10b-5p, while there were 16 specific genes that could potentially be targeted by miR-195-5p (Fig. 4C & D). Subsequently, survival analysis found that eight of the 16 specific genes (ACOX1, CYP26B1, IRF4, ITPR1, LITAF, PHLPP2, RECK, and TPM2) were significantly associated with prognosis of CRC patients. The expression boxplot and survival curve of those mRNAs were displayed in Fig. 4E-L. Those two lncRNAs and eight mRNAs were chosen for further analysis.
3.6 Integration of survival related ceRNA regulatory network and GSEA enrichment analysis.
Based on the validation above, a survival related ceRNA regulatory network including two lncRNA-miRNA pairs (NEATI-miR-195-5p and XIST-miR-195-5p) and eight miRNA-mRNA pairs (miR-195-5p-ACOX1/CYP26B1/IRF4/ITPR1/LITAF/PHLPP2/RECK/TPM2) was established (Fig. 5A). Each component in the ceRNA regulatory network had significant prognostic values in human CRC, and fully compliance with the rule of ceRNA hypothesis. After that, we used GSEA software package to further explore the biological function of the survival related ceRNA regulatory network. As shown in Fig. 5B, GSEA enrichment analysis indicated that the survival related ceRNA regulatory network was significantly enriched in multiple cancer related pathways, such as JAK-STAT signaling pathway, mTOR signaling pathway, TGF-β signaling pathway, Wnt signaling pathway, ERBB signaling pathway, and VEGF signaling pathway.
3.7 Correlation of ceRNA network with immune infiltration level in human CRC.
The correlation between the survival related ceRNA regulatory network and tumor-infiltrating immune cells in the CRC microenvironment was assessed by using TIMER. As shown in Fig. 6, We found that genes expression in the ceRNA regulatory network was significantly associated with infiltrating immune cells, including B cells, CD4+/CD8+ T cells, Macrophages, Neutrophils, and Dendritic cells, in human CRC microenvironment. Subsequently, the correlation between the ceRNA regulatory network and abundance of T cells infiltration in human CRC were further evaluated by using the GEPIA database, which indicated that the survival related ceRNA regulatory network is closely associated with abundance of Central memory T cells (r = 0.75, P = 0), Effector T cells (r = 0.73, P = 0), Effector Treg T-cells (r = 0.74, P = 0), Exhausted T cells (r = 0.68, P = 0), Naïve T cells (r = 0.72, P = 0), Resident memory T cells (r = 0.69, P = 0), Resting Treg T-cells (r = 0.76, P = 3.9e− 69), and Th1-like (r = 0.72, P = 0) (Fig. 7A-H). We also found that the ceRNA regulatory network is closely associated with the expression of multiple well-known immune checkpoints, such as PDL1 (r = 0.48, P = 1.6 e− 16), CTLA-4 (r = 0.60, P = 1.3 e− 37), TIM-3(r = 0.71, P = 1.1 e− 56), and LAG-3(r = 0.46, P = 2 e− 20), suggesting that the ceRNA regulatory network participates in the process of immune checkpoint mediated T cell failure (Fig. 7I-L). Therefore, these findings further support that this survival related ceRNA regulatory network was closely correlated with immune infiltration level in human CRC, suggesting that the ceRNA regulatory network might be important participant in immune escape in the CRC microenvironment.