Construction of the predictive model of anti-PD1 immunotherapy response
Differentially expressed genes play an important role in the occurrence and development of tumors, and hence the identification of these genes can help deepen our understanding of tumors. First, we obtained RNA sequencing datasets (GSE78220 and GSE91061) containing anti-PD1 treatment response information from the GEO database (Table 1). The aforementioned datasets were merged into a new dataset for subsequent analysis through batch correction. We obtained 1874 differentially expressed genes through differential expression analysis, including 1097 upregulated genes and 777 downregulated genes (Fig. 1A). Next, we randomly divided the new dataset into training and validation datasets (the ratio of sample size was 6:4). We obtained five key genetic variables using the "glmnet" package to perform Lasso regression and cross-validation, and constructed a binary logistic regression model for the training dataset (the response to immunotherapy was recorded as 0, and the nonresponse was recorded as 1) (Fig. 1B). Therefore, the lower the score, the more sensitive the immunotherapy response. This model mainly predicted the anti-PD1 immunotherapy response of patients with melanoma. The key genes involved in this model were protein tyrosine phosphatase domain containing 1 (PTPDC1), serologically defined colon cancer antigen 8 (SDCCAG8), Peter Pan Homolog (PPAN), serine/threonine kinase 40 (STK40), and small nucleolar RNA, C/D box 16 (SNORD16) (Table 2). Finally, the anti-PD1 immunotherapy response of each tumor sample was predicted using the predict() function. We revealed the relationship between the actual immunotherapy response of tumor samples and the immunotherapy response score using differential expression analysis. The results showed that the actual immunotherapy nonresponse group had a significantly higher score than the response group (Fig. 1C). This finding illustrated that the score clearly distinguished the immunotherapy response group from the nonresponse group, which was of great significance for predicting the immunotherapy response.
Evaluation and verification of the anti-PD1 immunotherapy response prediction model
As mentioned earlier, we constructed an anti-PD1 immunotherapy response prediction model, but the accuracy of this model needed further evaluation. Based on the constructed model, we calculated the immunotherapy response score of the validation dataset. This score clearly distinguished the immunotherapy response group from the nonresponse group in the validation dataset (Fig. 1D). The ROC curve analysis showed that the area under the curve (AUC) of the immunotherapy response score prediction model based on the training dataset was 0.972 (Fig. 1E). In this model, sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), and accuracy were 1, 0.708, 0.889, 1, and 0.913, respectively (Table 3). Subsequently, we validated the model with the validation dataset. The AUC of the immunotherapy response score prediction model based on the validation dataset was 0.797 (Fig. 1E). The SE, SP, PPV, NPV, and accuracy of this model was 0.923, 0.571, 0.857, 0.727, and 0.830, respectively (Table 3). This suggested that the predictive effect of the immunotherapy response prediction model for patients with melanoma was good.
Comparison of the anti-PD1 immunotherapy response prediction model with other indicators
At present, the common predictors of immunotherapy efficacy include PDL1 expression, tumor mutation burden, deficient mismatch repair (dMMR), and so forth[12]. We evaluated the role of the aforementioned indicators in predicting the response to anti-PD1 immunotherapy to compare the effectiveness of the model and other common indicators. DNA mismatch repair is a highly conservative process involving four key genes mutL homologue 1 (MLH1), postmeiotic segregation increased 2 (PMS2), mutS homologue 2 (MSH2), and mutS homologue 6 (MSH6)[13]. We used PCA to obtain an algorithm that reflected the dMMR process based on the expression values of the aforementioned four genes (Fig. 2A and 2B). Subsequently, the algorithm's role in predicting the response of patients with melanoma to anti-PD1 immunotherapy was evaluated. The results showed that the AUC of the dMMR algorithm in the training and validation datasets was 0.615 and 0.473, respectively (Fig. 2C and 2D). The ROC curve analysis showed that the AUC of the PDL1 expression in the training and validation datasets was 0.606 and 0.538, respectively (Fig. 2E and 2F). The accuracy of the aforementioned common predictive indicators was lower than that of the constructed model, which showed that the model had certain advantages and value in predicting the response of immunotherapy.
Correlation between the anti-PD1 immunotherapy response prediction model and tumor microenvironment
The tumor microenvironment is mainly composed of stromal cells and recruited immune cells. We used the "estimate" algorithm to estimate the number of immune and matrix components in the melanoma samples. We used the Wilcoxon test to perform differential expression analysis so as to reveal the correlation between the ratio of matrix and immune components in tumor samples and the score of immunotherapy response. The results showed that samples with a higher immunotherapy response score had a lower matrix score (P = 0.019) and a lower immune score (P < 0.001) (Fig. 3A and 3B). This indicated that tumor samples that were more sensitive to anti-PD1 immunotherapy had higher levels of stromal and immune components.
Subsequently, we performed GSEA on melanoma samples from the groups with high (n = 66) and low (n = 67) immunotherapy response scores (Fig. 3C, 3D, 3E and 3F). The results showed that melanomas in the low-score group were significantly enriched in the biological processes of allograft rejection, antigen binding, B-cell receptor signaling pathway, interleukin 4 production, T-cell selection, dendritic cell pathway, Th1/Th2 pathway, and natural killer T cells pathway. Obviously, the lower the anti-PD1 immunotherapy score, the better the immunotherapy effect. This phenomenon might be attributed to the activation of part of the tumor's immune function.
To reveal the correlation between immunotherapy response score and TICs, we used the CIBERSORT algorithm to analyze the proportion of infiltrating immune subpopulations in melanoma tissues and constructed the expression profiles of 22 immune cells in these samples (Fig. 4A). We explored the relationship between immune cells using Pearson correlation analysis (Fig. 4B). Next, we assessed whether a difference existed in the expression of immune cells between the high and low scores of immunotherapy responses. The analysis showed differences in the naive B cells, CD8 + T cells, gamma/delta T cells, monocytes, and plasma cells in the samples of patients with melanoma between the high- and low-score groups of anti-PD1 immunotherapy response (P < 0.05) (Fig. 4C). As shown in the previous correlation analysis of immune cells, CD8 + T cells and gamma/delta T cells were positively correlated. Subsequently, we discussed the expression characteristics of common immune checkpoint genes based on the high and low scores of immunotherapy responses. We found that most immune checkpoint genes were highly expressed in patients with melanoma in the low-score group (P < 0.05) (Fig. 4D). This indicated that the more sensitive the patient's anti-PD1 immunotherapy, the higher the expression of immune checkpoint genes.
Value of immunotherapy response scores in other tumors based on the model
Although our model played a significant role in predicting the response of patients with melanoma to anti-PD1 immunotherapy, whether this model was applicable to other tumors was not clear. We obtained the renal cell carcinoma dataset GSE67501 containing anti-PD1 immunotherapy response information from the GEO database to disclose the role of this model in other tumors. We scored the immunotherapy response to renal cell carcinoma based on the established model algorithm. The ROC curve analysis showed that the AUC of the immunotherapy response score in this dataset was 0.679 (Fig. 4E). Despite no significant difference in the immunotherapy scores between the actual immunotherapy response–sensitive and insensitive groups, the immunotherapy response scores in the immunotherapy-sensitive groups showed a lower trend, which was consistent with our previous results (Fig. 4F). This suggested that our model played a role in predicting renal cell carcinoma immunotherapy response, which provided a reference for selecting immunotherapy options for patients with renal cell carcinoma.