Identification of prognostic ARGs
Figure 1 shows the flow chart of our research process. After analyzing the expression profiles of 232 ARGs in the COAD tumor tissues and comparing them with the normal tissues, 16 upregulated and 20 downregulated ARGs were obtained, according to the following criteria: false discovery rate (FDR) < 0.05 and |log2 (fold change) |>1(Figs. 2A,B). Then, analysis of Kaplan-Meier was performed to determine the ability of the ARGs for OS to predict the prognosis of COAD patients, and 15 ARGs were screened. The forest map of the hazard ratio indicates that most of these genes are oncogene (HR > 1), except for SERPINA1 (Fig. 2C). Furthermore, the PPI analysis showed that these genes are closely related to autophagy, pre-autophagosomal structure, macroautophagy, autophagic vacuole assembly, cellular response to starvation, response to starvation, and cellular response to nutrient levels (Fig. 2D). Finally, given the important clinical implications of these ARGs, the genetic alterations of these genes were examined, and deep deletion and amplification were classified as two common types of mutations. A total of four genes have a mutation rate ≥ 4%, in which GRID1 is the most frequently mutated gene (6%) (Fig. 2E).
Go and KEGG enrichment analysis of the differentially expressed ARGs
Then, the potential biological processes and pathways of the differentially expressed ARGs between COAD and non-tumor tissues were investigated. The GO term functional enrichment (P < 0.05) and the KEGG pathway enrichment analyses of these genes are summarized (P < 0.01) in Additional file 1: Fig. S1. The top 5 GO terms for biological processes include “autophagy,” “process utilizing autophagic mechanism,” “response to oxygen level,” “intrinsic apoptotic signaling pathway,” and “macroautophagy” (in decreasing order of p value). For the cellular components of the enrichment analysis, the top 5 terms were notably associated with “autophagosome,” “vacuolar membrane,” “autophagosome membrane,” “outer mitochondrial membrane,” and “membrane organelle outer membrane.” On the basis of molecular function, among the top 5 terms, the genes were mostly enriched in “ubiquitin protein ligase binding,” “ubiquitin-like protein ligase binding,” “protein kinase regulator activity,” “protein kinase regulator activity,” and “protein phosphatase 2A binding.” In addition, 30 KEGG pathways were considered statistically significant (P < 0.01), and the top 3 significant pathways were “p53 signaling pathway,” “apoptosis,” and “human cytomegalovirus.” Overall, autophagy played an important role in the pathogenesis of COAD.
Identification of prognostic autophagy-related LncRNA
lncRNAs dominate the upstream portion of the RNA network and function as primary effectors of the mRNAs. A gene co-expression network analysis was performed using the Pearson correlation with |R|>0.3 and P < 0.01 as the cutoff point. A total of 943 lncRNAs were obtained in proportion to the 36 ARGs (Additional file 5 : Table S1). Only 369 of 943 autophagy-related lncRNAs were significantly differentially expressed between patients with COAD and normal tissues (∣log2FC∣>1; P < 0.05; Figs. 3A,B; Additional file 6 : Table S2). Subsequently, 36 of 369 different expressed lncRNAs were found to be associated with OS of COAD patients, of which five lncRNAs (AP001554.1, LINC00513, SNHG16, AL137782.1, and AL590483.1) were protective genes with an HR < 1 and the remaining 31 lncRNAs were risky genes with an HR > 1 (Fig. 3C).
Construction of the ARGs-based prognostic signature
Based on the 15 ARGs that were significantly correlated with OS in COAD patients, an autophagy-related risk signature was considered in predicting the prognosis. After multivariate Cox regression analysis, 6 ARGs which consisted of GRID1, DAPK1, RAB7A, PELP1, ULK3, and WIPI2 were identified to construct a prognostic signature for OS. Subsequently, with the expression coefficient of each independent risk gene, our six-mRNA models were formed using the following formula: prognosis score = (0.524 × expression level of UKL3) + (1.022 × expression level of GRID1) + (0.254 × expression level of DAPK1) + (0.410 × expression level of PELP1)+ (0.608 × expression level of WIP2) + (1.097 × expression level of RAB7A). Using this formula, we calculated the risk score of each patient. By using the median risk score as the threshold, we divided the patients into two groups: high-risk groups and low-risk groups (Figs. 4A,B). The heatmap of these six ARGs and the Kaplan–Meier analysis of the different OS between the two groups are displayed in Fig. 4C,D. Notably, our data showed that the high-risk group (risk scores ≥ 0.954; n = 213) had a worse prognosis (shorter OS) than the low-risk group (risk scores < 0.954; n = 213). The area under the curve (AUC) of the corresponding ROC curve for the 5-year OS is 0.728 in the training group (Fig. 4E). A similar trend was observed with a 5-year OS AUC of 0.733 in the validation group (GSE72970) from the GEO database. (P < 0.001; Figs. 4F,G). This indicated that the 6 mRNA-prognostic scoring system for COAD based on ARGs has a certain potential in survival prediction.
Construction of autophagy related lncRNA-based prognostic signature
Based on the 36 differentially expressed autophagy-related lncRNAs between the tumor and normal tissues that were significantly correlated with the OS of COAD patients, multivariate Cox regression analyses were performed to select the potential prognosis-related lncRNAs. As a result, 14 lncRNAs, which consisted of CASC9, PCAT6, AP006621.2, GS1-124K5.4, MIR4435-2HG, AL354993.2, AC048344.4, AC010973.2, AL590483.1, AL137782.1, STAG3L5P-PVRIG2P-PILRB, LINC00513, SNHG16, and AP001554.1, were determined as independent prognostic indicators for OS. Furthermore, our 14-lncRNA prognosis score was calculated using the following formula: prognosis score = (0.183 × expression level of CASC9) +(0.334 × expression level of PCAT6) + (0.348 × expression level of AP006621.2) +(0.409 × expression level of GS1-124K5.4) + (0.651 × expression level of MIR4435-2HG) +(0.838 × expression level of AL354993.2) + (0.881 × expression level of AC048344.4) +(1.145 × expression level of AC010973.2) + (− 0.882 × expression level of AL590483.1) +(− 0.875 × expression level of AL137782.1) + (− 0.688 × expression level of STAG3L5P-PVRIG2P-PILRB) +(− 0.665 × expression level of LINC00513) +(− 0.663 × expression level of SNHG16) +(− 0.573 × expression level of AP001554.1). Using this formula, we calculated the risk score of each patient. By using the median risk score as the threshold, the patients were divided into two groups: high-risk and low-risk groups. The heatmap of the 14 lncRNAs and the Kaplan–Meier analysis of the different survival durations between the two groups are shown in Fig. 5C,D. The results show that the high-risk patients (risk scores ≥ 1.009;n = 213) had shorter OS, indicating that they require more clinical attention and better clinical management, while the low-risk patients (risk scores < 1.009; n = 213) have better survival, in which a milder treatment plan is required to avoid over-treatment. The prognostic value of the risk score was also validated based on the data of the primary site rectosigmoid junction of the patient samples (n = 67) from the TCGA database. The AUC of the corresponding ROC curve of the autophagy related lncRNA-based prognostic signature prognostic index for the 5-year OS is 0.726 (P < 0.001; Figs. 5F,G). As a consequence, autophagy related lncRNA-based prognostic signature also provides a robust prediction for the prognosis of COAD patients.
The univariate analysis showed that the two autophagy-related prognostic scores were significantly associated with OS in COAD patients (Fig. 6A). After adjusting for the clinicopathological features, such as age, gender, pathology_T_stage, pathology_N_stage, pathology_M _stage, and pathologic_stage. The two prognostic signatures remained an independent prognostic indicator for OS based on the multivariate analysis (mRNA-based signatures (HR = 1.381, 95% CI = 1.208–1.578; P < 0.001; Fig. 6B); LncRNA-based signatures (HR = 1.204, 95% CI = 1.134–1.278; P < 0.001; Fig. 6B). Subsequently, we also determined the clinical utility of the 6-mRNA prognostic signature and 14-lncRNA prognostic signature regarding the pathology_T_stage, pathology_N_stage, pathology_M_stage, and pathologic_stage. As shown in Fig. 6C, the mRNA-based risk score tends to increase in the higher pathology_T_stage (T1–2 vs T3–4; P = 0.034), lymph node metastasis ( N0 vs N1–3; P < 0.001), and pathologic_stage (stage I & II vs stage III & IV, P < 0.001). Moreover, as shown in Fig. 6D, higher 14-lncRNA-based risk scores were observed among those with higher T stage (T1–2 vs T3–4, P = 0.002), lymph node metastasis (N0 vs N1–3, P = 0.002), and tumor stage (stage I-II vs stage III-IV, P = 0.001). Specifically, the identified two prognostic signatures are of clinical relevance to pathologic_stage and TNM staging of COAD. These data suggest that two prognostic signatures have the potential in diagnosis of COAD. While, according to the ROC curve(Additional file 2: Fig. S2), the prognostic model of 14 autophagy-related lncRNA(1,3,5 year AUC = 0.768, 0.76, 0.804, respectively) was superior to the 6 autophagy-related mRNA(1,3,5 year AUC = 0.694, 0.697, 0.728, respectively) and other independent prognostic factors, which was then selected for further analyses.
Establishment of the predictive nomogram for OS based on prognostic clinical factors and the 14-lncRNA signature
To further improve the predictive accuracy of our autophagy-related lncRNA signature by examining their performance in combination with other independent prognostic factors (e.g., age, tumor stage; Figs. 6A,B), we established an easy-to-use and clinically adaptable, risk nomogram for predicting the 3- and 5-year survival probability of COAD patients (Fig. 7A). Using the nomogram, “point” was scaled to estimate the points for each variable by drawing a vertical line. Then, the “total points” were scaled to estimate the corresponding 3- and 5-year OS rates of the COAD patients. The calibration curves of nomogram for predicting the 3- and 5-year COAD patients in the training set and validation set are shown in Figs. 7B, C. The C-indices (0.826 for the training set, 0.895 for the validation set) further strengthen the good discriminative ability of our models with regard to the age and tumor stage in predicting the OS of patients with COAD.
14-autophagy-related lncRNAs co-expressed network
An interaction network analysis was performed on the 14 risk score-related lncRNAs and the 9 associated ARGs (BID, CAPN10, CCR2, CD46, FAS, ITPR1, TNFSF10, VEGFA, PINK1). Then, the lncRNA–mRNA risk axes were integrated into two module maps (Additional file 3: Figs. S3A,B). Positive correlations were obtained for 17 lncRNA-mRNA pairs, and negative correlations were obtained for 3 lncRNA‐mRNA pairs (Additional file 3: Fig. S3C). These results suggested that the coexpressed lncRNAs-mRNA networks may play important roles in regulating autophagy and in modulating COAD patients’ survival.
Gene set enrichment analysis of 14 lncRNAs in the model
Moreover, to gain insights into the functional roles of the 14 lncRNAs in the molecular mechanisms of COAD, GSEA enrichment analyses for the lncRNAs between the predicted high-risk and low-risk groups were performed. The results revealed that the identified 14 lncRNAs were enriched in several autophagy-related, metastasis-related, and COAD-related pathways, such as MAPK signaling pathway, VEGF signaling pathway, Hedgehog signaling pathway, Notch signaling pathway, pathway in cancer; natural killer cell-mediated cytotoxicity, GAP junction, and JAK_STAT signaling pathway (Additional file 4 : Fig. S4). Therefore, we predicted that the 14-autophagy-related lncRNAs might be potential therapeutic targets for COAD.