Analysis using public database
The expression status of GPC family genes in tissues derived from various normal human organs was analyzed using GTEx (Fig. 1). The expression level of GPC family genes was lower in human pancreas than in other organs. The results of GEPIA analysis showed that expression of GPC1, GPC3, GPC4, and GPC6 was significantly higher in PDAC tumor tissues than in normal tissues (P < 0.05) (Fig. 2). GO functional enrichment analysis indicated that GPC family genes were mainly involved in composition of cell membrane, organelles and anchored components of the membrane, heparan sulfate proteoglycan binding, and glycosaminoglycan metabolic process (Fig. 3, Supplementary Table 1). The results of BiNGO analysis (Fig. 4) confirmed those of GO analysis.
Survival analysis
The Kaplan-Meier method and log-rank test were used to investigate the association between basic clinical characteristics and OS in the TCGA database. Supplementary Table 2 shows that histologic grade, extent of surgery, radiation therapy, and targeted molecular therapy were significantly associated with OS. Patients were divided into two groups based on the expression level of each GPC family gene, and survival was compared between the two groups. The results (Fig. 5A–F) demonstrated that expression of GPC2, GPC3, and GPC5 was significantly associated with survival. The median survival time (MST) was significantly longer in patients with high expression of GPC2, GPC3, and GPC5 than in those with low expression (log-rank P = 0.031, 0.021, and 0.028, respectively; MST, 634 days vs. 481 days, 614 days vs. 473 days, and 593 days vs. 485 days, respectively; Fig. 5B, 5C, 5E and Fig. 6). After adjustment for survival-significant clinical parameters in a multivariate Cox proportional hazards regression model, GPC2, GPC3, and GPC5 remained significantly associated with OS (Table 1) (adjusted P = 0.005, adjusted HR = 0.449, 95% CI = 0.258–0.782; adjusted P = 0.022, adjusted HR = 0.531, 95% CI = 0.309–0.914; and adjusted P = 0.020, adjusted HR = 0.525, 95% CI = 0.306–0.902, respectively). Results of stratified analysis for GPC2, GPC3, and GPC5 are shown in Table 2. High expression of GPC2 was significantly associated with better OS in patients who were male, were >60 years old, had histologic grade G1 or G2, or had R1 or Rx resection, regardless of whether they received radiation therapy. GPC3 expression was related to OS in patients who were female, were >60 years old, had histologic grade G1 or G2, or did not receive radiation or targeted molecular therapy. Moreover, GPC5 expression was associated with prognosis in patients who were ≤60 years old, had histologic grade G3 or G4, had R1 or Rx resection, or did not receive radiation or targeted molecular therapy.
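The grouping and MST estimate described above can be illustrated with a minimal sketch. It assumes a median cutoff for the high/low split (stated in this paper only for the validation dataset) and uses the standard Kaplan-Meier product-limit estimator; the function names and any input values are hypothetical and this is not the software used in this study.

```python
from fractions import Fraction
from statistics import median


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: values equal to the median are placed in the 'low' group.
    """
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def km_median_survival(times, events):
    """Return the Kaplan-Meier median survival time.

    times  -- follow-up time in days for each patient
    events -- 1 if death was observed, 0 if the patient was censored
    The median is the first event time at which the estimated survival
    probability falls to 0.5 or below; None if that never happens.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = Fraction(1)  # exact arithmetic avoids rounding at S(t) = 0.5
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = 0
        removed = 0
        # Process all patients who share the same follow-up time together.
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            removed += 1
            i += 1
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            if survival <= Fraction(1, 2):
                return t
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return None
```

Applying `km_median_survival` separately to the follow-up data of the 'high' and 'low' groups gives the two MST values that are compared for each gene; the log-rank test and Cox regression are not covered by this sketch.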
Joint effects analysis
Based on the prognostic significance of each GPC family gene, we combined GPC2, GPC3, and GPC5 in pairs to investigate their joint significance in PDAC prognosis. The combination of GPC2 and GPC3 was associated with the worst survival outcome in group 1 (MST = 278 days, adjusted P value < 0.001). The combination of GPC2 and GPC5 was associated with the highest risk of death in group Ⅰ (MST = 278 days, adjusted P value < 0.001), and the combination of GPC3 and GPC5 showed the poorest prognosis in group ⅰ (MST = 278 days, adjusted P value < 0.001).
We also analyzed survival associated with the three genes simultaneously. Group A showed the worst survival (MST = 219 days, adjusted P value = 0.018), whereas the best survival was observed in group D (MST = 702 days, adjusted P value < 0.001). These data are shown in Table 3, and the survival curves are shown in Fig. 7A–D.
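As an illustration of how patients can be partitioned for a joint effects analysis, the following sketch groups sample indices by their combined high/low status across two or three genes. The group definitions actually used are those given in Table 3; the function name and the input layout here are assumptions made for the example only.

```python
from collections import defaultdict


def joint_groups(status_by_gene, genes):
    """Group sample indices by combined expression status across `genes`.

    status_by_gene -- dict mapping gene name to a list of 'high'/'low'
                      labels, one per sample, in the same sample order
    genes          -- the genes to combine, e.g. two or three gene names
    Returns a dict mapping each status combination (a tuple) to the list
    of sample indices that carry it.
    """
    groups = defaultdict(list)
    n_samples = len(status_by_gene[genes[0]])
    for i in range(n_samples):
        key = tuple(status_by_gene[gene][i] for gene in genes)
        groups[key].append(i)
    return dict(groups)
```

Each resulting subset can then be analyzed with the same survival methods used for the single genes.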
Prognosis nomogram
Based on the status of each clinical parameter and the expression levels of GPC2, GPC3, and GPC5, a score for each variable was calculated. The total score was then used to predict 1-, 2-, and 3-year survival probabilities. The nomogram (Fig. 8) indicated that GPC2, GPC3, and GPC5 affected the prognosis of PDAC to different degrees.
Validation dataset to demonstrate the prognostic value of survival-related genes
To further assess the prognostic value of GPC2, GPC3, and GPC5, we acquired the GSE62452 dataset from the GEO database. As shown in Supplementary Table 3, histologic grade was significantly associated with OS. Patients were again divided into two groups by the median expression level of each GPC family gene, and survival was compared between the two groups. Table 4 and Fig. 9A–F show that higher expression of GPC3 was significantly related to better survival (log-rank P = 0.038), and higher expression of GPC2 and GPC5 was also related to better survival, though not significantly (log-rank P = 0.337 and 0.090, respectively). Multivariate Cox proportional hazards regression analysis adjusted for prognosis-related clinical characteristics showed that none of these genes was significantly correlated with overall survival (all adjusted P > 0.05).
Genome-wide co-expression analysis of GPC2, GPC3 and GPC5 in PDAC
Genome-wide co-expression analysis was performed for each of these genes in the TCGA database to investigate their related functional pathways. For GPC2 and its co-expressed genes, a correlation network was established as shown in Fig. 10A (Supplementary Table 4). GO analysis indicated that GPC2 and its co-expressed genes functioned mainly in sequence-specific DNA binding, protein transport, cell differentiation, and anterior/posterior pattern specification (Fig. 10B, Supplementary Table 5).
The correlation network for GPC3 and its co-expressed genes (Fig. 11A, Supplementary Table 6) identified 511 positively co-expressed genes and 25 negatively co-expressed genes. GO analysis indicated that these genes were enriched in cell adhesion, angiogenesis, and inflammatory response (Fig. 11B, Supplementary Table 7). KEGG analysis indicated that these genes were mainly related to the Ras, Rap1, PI3K-Akt, and chemokine signaling pathways (Fig. 11C, Supplementary Table 8).
The correlation network for GPC5 and its co-expressed genes is shown in Fig. 12A and Supplementary Table 9. GO analysis showed that these genes were associated with transcription factor complex and phospholipid metabolic process (Fig. 12B, Supplementary Table 10). KEGG analysis showed that these genes were involved in pancreatic secretion and glycerophospholipid metabolism (Fig. 12C, Supplementary Table 11).
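The screening for positively and negatively co-expressed genes can be sketched as follows. The sketch assumes Pearson correlation and a hypothetical absolute cutoff of 0.4; neither the correlation measure nor the threshold is stated in this section, and the function names are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def coexpressed_genes(target, profiles, cutoff=0.4):
    """Split genes into positively and negatively co-expressed lists.

    target   -- expression values of the gene of interest across samples
    profiles -- dict mapping other gene names to their expression values
    cutoff   -- hypothetical threshold on the absolute correlation
    """
    positive, negative = [], []
    for gene, values in profiles.items():
        r = pearson(target, values)
        if r >= cutoff:
            positive.append(gene)
        elif r <= -cutoff:
            negative.append(gene)
    return positive, negative
```

The two returned lists correspond to the positively and negatively co-expressed gene sets that are then passed to GO and KEGG enrichment analysis.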
Gene set enrichment analysis
GSEA was carried out in the TCGA database to explore possible mechanisms by which GPC family genes affect the prognosis of PDAC patients. The results for the c6 reference gene set indicated that low GPC2 expression was closely related to oncogenic signatures such as KRAS, RAF1, STK33, and VEGFA (Fig. 13A–F; Supplementary Table 12). For GPC3, c2 enrichment showed that high expression was associated with neuroactive ligand receptor interaction and GPCR ligand binding (Fig. 14A–C; Supplementary Table 13), and c6 enrichment suggested that high expression was correlated with cyclin D1, p53, and PTEN (Fig. 13D–F; Supplementary Table 14). For GPC5, the c2 reference gene set indicated that low expression was related to the EGFR pathway, gene methylation status, TFRC1, and the cell cycle (Fig. 15A–D; Supplementary Table 15), and the c6 reference gene set indicated that low expression was related to HOXA9 and BMI1 (Fig. 15E–F; Supplementary Table 16).