Construction of the prognostic HRGPs signature
A total of 146 HRGs were measured on all platforms and met the variation criterion in the training cohort (MAD > 0.5). From these 146 HRGs, 1645 HRGPs were constructed. After removing HRGPs with little variation (MAD = 0), 677 HRGPs were retained as candidates. We then developed the HRGPs index using Lasso Cox regression on the training cohort and selected 13 HRGPs for the final risk model (Figure 1A-B). This HRGPs signature consisted of 24 unique HRGs (Table 2). Next, we used the HRGPs index to calculate a risk score for each patient in the training cohort. Patients were stratified into high- and low-risk groups at the optimal cut-off point of 0.101, determined by time-dependent ROC curve analysis (Figure 1C). The survival curves showed that the high-risk group had significantly poorer OS than the low-risk group in the training cohort (P < 0.001, Figure 2A). We further evaluated the prognostic power of the HRGPs signature alongside other clinical factors in the TCGA-LUAD dataset using univariate and multivariate Cox proportional hazards regression analyses. The univariate analysis showed that clinical stage and the risk model were prognostic for the OS of patients with LUAD (Figure 3A). After adjustment for other clinical variables, the risk model remained an independent prognostic factor in the multivariate analysis (HR = 3.3178, 95% CI 2.351-4.296, P < 0.001, Figure 3B).
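As an illustration of how a gene-pair index of this kind can be turned into risk groups, the following minimal Python sketch scores each pair per sample, sums the weighted pair scores, and applies the 0.101 cut-off. It rests on several assumptions that are not stated in the text: a pair is scored 1 when the first gene is expressed at a lower level than the second and 0 otherwise, the risk score is the sum of coefficient times pair score, and scores strictly above the cut-off are labelled high-risk. The gene names, expression values, and coefficients are hypothetical, and the MAD filtering and Lasso selection steps are omitted.

```python
from itertools import combinations


def build_gene_pairs(expr):
    """Build pairwise scores from per-gene expression.

    expr maps a gene name to a list of per-sample expression values.
    Assumed scoring rule: 1 if gene_a < gene_b in that sample, else 0.
    """
    pairs = {}
    for gene_a, gene_b in combinations(sorted(expr), 2):
        pairs[(gene_a, gene_b)] = [
            1 if a < b else 0 for a, b in zip(expr[gene_a], expr[gene_b])
        ]
    return pairs


def risk_scores(pairs, coefs):
    """Per-sample risk score: sum of coefficient * pair score over selected pairs."""
    n_samples = len(next(iter(pairs.values())))
    return [
        sum(coef * pairs[pair][i] for pair, coef in coefs.items())
        for i in range(n_samples)
    ]


def stratify(scores, cutoff=0.101):
    """Label each sample; 'high' when the score exceeds the cut-off (assumed rule)."""
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical expression values for three genes across four samples.
expr = {
    "G1": [1, 5, 2, 7],
    "G2": [3, 4, 6, 1],
    "G3": [2, 2, 2, 2],
}
# Hypothetical coefficients standing in for the Lasso Cox output.
coefs = {("G1", "G2"): 0.4, ("G2", "G3"): -0.2}

pairs = build_gene_pairs(expr)
scores = risk_scores(pairs, coefs)
groups = stratify(scores)
print(groups)
```

Because each pair score depends only on the relative ordering of two genes within one sample, the same coefficients and cut-off can be applied to a cohort measured on a different platform without cross-platform normalization.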
Validation of the HRGPs signature for prognostic prediction
To validate the prognostic value of the HRGPs signature, we applied the same formula to other independent populations. GSE31210 (n = 226) served as an external validation cohort and was divided into high- and low-risk groups based on the optimal cut-off value (HRGPs index = 0.101). Patients in the high-risk group had poorer OS than those in the low-risk group (P = 0.04, Figure 2B). The risk model also remained independent of other clinical variables in univariate and multivariate Cox analyses (Figure 3C-D). In addition, the HRGPs signature achieved a higher C-index than other clinically applicable biomarkers, including an RNA-binding protein signature, an immune-related signature, and an autophagy-related signature (Figure 4) [13, 18, 19].
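For readers unfamiliar with the C-index used in this comparison, the sketch below computes it on toy data, assuming Harrell's definition (the text does not specify which estimator was used). A pair of patients is comparable when the one with the shorter follow-up had an observed event, and it is concordant when that patient also has the higher risk score. Tied survival times are ignored for simplicity, and all values are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    scores : risk score per patient (higher means higher predicted risk)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort of four patients; the third is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
scores = [0.9, 0.2, 0.4, 0.1]

c_index = concordance_index(times, events, scores)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so comparing signatures by C-index amounts to asking which one orders patients by survival more consistently.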
Correlation of the HRGPs signature and immune cell infiltration
Increasing evidence has shown that tumor-infiltrating immune cells affect tumor prognosis [20, 21]. We applied the CIBERSORT algorithm to calculate the abundance of 22 immune cell types in each sample of the different risk groups. Figure 5A shows that the abundance of several immune cell types, including plasma cells, activated memory CD4 T cells, follicular helper T cells, regulatory T cells (Tregs), resting NK cells, activated NK cells, M0 macrophages, resting dendritic cells, resting mast cells, activated mast cells, and neutrophils, differed significantly between the risk groups (P < 0.05). Next, we plotted survival curves based on these immune cells. Plasma cells and resting NK cells were significantly correlated with the OS of patients with LUAD (P < 0.05, Figure 5B-C).
Functional analysis of the HRGPs signature
To characterize the biological processes associated with the HRGPs signature, we performed GO analysis and GSEA. GO analysis showed that the HRGPs signature was mainly enriched in terms related to cornification, glycolmetabolism, and chromosome-related processes (Figure 6). GSEA demonstrated that DNA replication, the lymphocyte apoptotic process, and cell cycle-related pathways were significantly altered in the high-risk group (FDR < 0.05, Figure 7).