3.1 Identification of Differentially Expressed Autophagy-Related Genes(ARGs)
Based on the edgeR algorithm and according to the screening criteria of DEGs, 48 of the 224 ARGs (Table S1) showed significant alterations of expression levels in LUSC compared with normal control, including 20 up-regulated and 28 down-regulated genes, respectively (p < 0.05 and log2 |fold change| > 1). The results are shown in a heatmap (Figure 1A). These differentially expressed ARGs (DE-ARGs) were then selected for KEGG pathway enrichment analysis with a threshold of p < 0.05. As shown in Figures 1B, the differentially expressed ARGs were mainly enriched in regulation of autophagy, apoptosis and mitophagy. It is also worth noting that “PD-L1 expression and PD-1 checkpoint pathway in cancer” were enriched, implicating the correlation between these ARGs and immunosuppression pathway.
3.2 Construction of the Autophagy-related Prognostic Signature
The Kaplan–Meier method and log-rank test were applied to examine the relationship between overall survival (OS) and all ARGs (rather than DE-ARGs) in 326 LUSC samples from TCGA. As shown in Table S2, 54 genes were significantly related with OS of LUSC patients (p < 0.05) and these genes were defined as survival‑related ARGs (sARGs). Meanwhile, we noticed that not all sARGs are DE-ARGs. Next, Random Forest was performed to select sARGs with the best prognostic value and to build an autophagy-related risk score model in the TCGA cohort. Four sARGs including CFLAR, RGS19, PINK1 and CTSD were selected as most important survival relevant variables. The Kaplan-Meier overall survival (OS) curves of these four genes were displayed in Figure 2 (The cutoff value of gene expression level was 50%). The random survival forest -based score was derived for each sample and defined as risk score. Patients in the TCGA cohort (training dataset) then were assigned to a high-risk or low-risk group using the optimal cut-off value obtained with the “survminer” R package. Risk score distribution and corresponding four -gene expression patterns were shown in Figure 3A. With the increase of risk score, the expression levels of the four genes were elevated. The overall survival time of the high-risk group was shorter than that of the low-risk group.
3.3 Validation of the Autophagy-related Prognostic Signature
For further validation, 78 samples of LUSC subtype from GSE41271 (GEO dataset) were derived as an independent testing dataset. Then, Kaplan-Meier analysis were applied to the two cohorts and demonstrated that patients with a high-risk score were correlated with worse outcomes in the two cohorts (Figure 3B-C). Besides, the time-dependent ROC curve analysis of the risk score model in the TCGA cohort indicated a promising prognostic ability for OS (1-year AUC = 0.948, 3-year AUC = 0.965, 5-year AUC = 0.97, Figure 3D). Meanwhile, the ROC curve of OS prediction was drawn in GEO cohort (1-year AUC = 0.649, 3-year AUC = 0.559, 5-year AUC = 0.655, Figure 3E). The autophagy-related model showed a less promising prognostic ability in GEO cohort than the TCGA cohort, possibly due to the small sample size of GEO dataset (78 in GEO vs. 326 in TCGA), which deserved further large samples to validate. Furthermore, in TCGA cohort, we took age, gender, TNM stage (M stage excluded) as candidate risk factors for Cox regression analyses. As shown in Table 1, univariate Cox regression analysis indicated that N stage and risk score are risk factors for poor prognosis, and multivariate Cox regression demonstrated the independence of risk score of this signature in prognosis prediction from other clinical factors.
3.4 High Risk Group Indicated an Immune-suppression status
To explore the immune phenotype and the potential immune-related mechanisms underlying our constructed prognostic gene signature, GSEA was performed to find enriched immune-related gene sets annotated by the gene ontology (GO) term. Results unveiled that seven significantly altered immune-related pathways were enriched in high-risk group, and no immune-related pathways were enriched in the low-risk group (Figure 4). High risk score was significantly associated with the macrophage activation involved in immune response (P < 0.05), negative regulation of adaptive immune response (P < 0.01), negative regulation of immune effector process (P < 0.01), negative regulation of immune system process (P < 0.01), negative regulation of innate immune response (P < 0.01), negative regulation of adaptive immune response (P < 0.01), and negative response of immune response (P < 0.01). Consistently, high-risk group exhibited an immunosuppressive phenotype.
3.5 High Risk Group showed an increased Infiltration of suppressive immune cell populations
To further investigate whether there is an association between autophagy-related score and the regulation of suppressive immune cell populations, normalized ssGSEA scores of 28 infiltrating immune cell populations in each sample were calculated and represented their infiltration level. We drew a heatmap to visualize the relative abundance of 28 infiltrating immune cell populations (Figure 5). And the boxplot of these immune cells labeled with the P value of Wilcoxon rank test between low-risk and high- risk score group was shown in Supplementary Figure 1. We observed a positive correlation between risk score and the infiltration levels of most immune cell subtypes, including cell types executing anti-tumor immunity (Activated CD4 T cell, Activated CD8 T cell, Central memory CD4 T cell, Central memory CD8 T cell, Effector memory CD4 T cell, Effector memory CD8 T cell, Type 1 T helper cell, Type 17 T helper cell, CD56bright natural killer cell) and cell types executing pro-tumor and immune suppressive functions (Regulatory T cell, Type 2 T helper cell, CD56dim natural killer cell, Immature dendritic cell, Macrophage, MDSC, Neutrophil, Plasmacytoid dendritic cell). For further investigation, Pearson's correlation analysis was applied to the abundances of these two categories of cells. Results showed that the abundances of anti-tumor immune cells were positively associated with the abundances of immunosuppression cells in tumor microenvironment (Figure 6A). And It seemed that immune suppression is stronger in high-risk group than low-risk group, which is consistent with GSEA results.
3.6 Macrophage and Regulatory T cell (Treg) correlated with worse outcome
Then, univariate Cox regression and Kaplan-Meier analysis plus the log-rank test were employed to select the prognostic immune cells among these 28 infiltrating immune cell subtypes. Results revealed that only macrophage and regulatory T cell have a significant correlation with overall survival. As shown in Figure 6D-E, higher infiltration of macrophage and regulatory T cell correlated with shorter overall survival time (P < 0.05). As above, high- risk group was associated with higher infiltration of macrophage and regulatory T cell (P < 0.01) (Figure 6B-C). This observation suggested macrophage and regulatory T cell may accounted for the poor prognosis induced by immunosuppression in high-risk group.
3.7 Correlation between genes expression and immunocyte infiltration
The correlations of the expression of four genes in this autophagy-related signature with the infiltration level of macrophage and regulatory T cell as well as risk score were investigated. Figure7A demonstrated overall correlations among these arguments. The four genes were strongly interrelated and exhibited significantly positive association with the infiltration of two immunocyte. Besides, the genes expression and immunocyte infiltration were increasing along with the growth of risk score.
3.8 Immune Checkpoints Analysis
Immune checkpoints inhibitors are becoming the promising strategies in the treatments of lung cancer. Therefore, we investigated the association of the risk score and the main immune checkpoints, including PD-1/PD-L1 and CTLA4. As Figure 7B-D shown, risk score showed a significantly positive correlation with PD-1 and CTLA4 (P < 0.01). While, there is no significant correlation between risk score and PD-L1 expression. In summary, high-risk group had higher levels of PD-1 and CTLA4. Then, we also examined the correlations between the expression of the four ARGs (CFLAR, RGS19, PINK1, CTSD) and the expression of immune checkpoints (PD-1, PD-L1, CTLA4). Results showed that the expressions of four ARGs had significant associations with the immune checkpoints consistently (only the correlation between PINK1 and PD-L1 was not significant) (Supplementary Figure 2).
3.9 Estimate score, immune score and stromal score
ESTIMATE (Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data) is a newly developed algorithm that uses the transcriptional profiles of cancer tissues to infer the level of infiltrating stromal and immune cells based on specific gene expression signatures of stromal and immune cells in the specific cancer type. In the present study, the stromal score, immune score, and estimate score of each included sample of was calculated by applying Estimate R package. Next, the association of autophagy-related risk scores with the estimate/immune/stromal scores was examined. As shown in Figure 8A-C, estimate score and immune score were found to have a positive correlation with the risk scores of our established model (P < 0.05). Stromal score was also rising with the increase of risk score, but not significantly (P=0.072). The prognostic value of the three types of score were also investigated. Survival analysis showed that the patients with higher estimate/immune/stromal score had a poorer overall survival, although the statistical significance were only observed in stromal score (P < 0.05).