1. Inflammation-related genes differ significantly in breast cancer and normal tissues.
To explore the genes that are involved in breast cancer genesis and inflammation-related genes(IRGs) in breast cancer, we performed differential and principal component analyses based on the mRNA levels of 1,217 breast cancer patients from the TCGA database, we found that there are significant differences between breast cancer and normal tissues (Figs. 1A and 1B). We found that 689 genes were significantly up-regulated, and 1217 genes were significantly down-regulated in breast cancers compared to normal tissues (Fig. 1C). Meanwhile, 427 of 1308 genes related to inflammation were shown significant differences between breast cancer and normal mammary tissues. (Figs. 1D and 1E). Among them, 213 and 214 genes related to inflammatory response were up-regulated and down-regulated in breast cancer, respectively when compared to normal tissue (Fig. 1F). We next performed a survival prognosis assessment based on different tumor stages and ages and found that younger patients with breast cancer had longer survival. (Figure S1A). At the same time, the survival time of patients with stage I or II tumors was longer than that of stage III and IV tumors (Figure S1B). In tumor node metastasis (TNM) staging, the overall survival of patients also decreased with the degree of stage progression (Figure S1C-D).
2. A prognostic model composed of 12 inflammatory response-related genes
We conducted a Log rank test and Univariate Cox analyses on 1,308 inflammation-related genes (IRGs). Among them, 118 genes (p < 0.05) were identified by Log rank test and 178 genes (p < 0.05) were detected by Univariate Cox. Then we got 59 genes from the intersection of the Log rank test, Univariate Cox, and inflammation-related differential genes (IRDGs; Fig. 2A).
Through consensus clustering analysis, we found that IRDGs can be clustered in 2 groups (Fig. 2B). When K is 2 (Fig. 2C) in a total of 9 clustering analyses, the slope of CDF decline was the smallest (Fig. 2D), and the prognosis of breast cancer patients was significantly different (Fig. 2E). According to the results of clustering and grouping, we further showed that the survival time is quite different between different groups, which led us to construct a risk prognosis model. We found that 16 genes were positively associated with whereas 43 genes were negatively related to overall survival in breast cancer patients (Fig. 2H). After removing the effect of collinearity by Lasso analysis, 22 genes could be used as inflammatory response-dependent prognostic features (Figs. 2F and 2G). In order to further optimize the model, we used the stepwise regression method to eliminate interference factors, and got the best model containing 12 genes to estimate the survival risk of each patient, the model is as follows: Risk score = 0.35369 × expression of EDA2R + (-0.25247) × expression of WLS + (-0.15276) × expression of RASGRP1 + 0.33080 × expression of EMP1 + (-0.12058) × expression of PTGER3 + (-0.09659) × expression of DAPL1 + 0.38817 × expression of EIF2AK1 + (-0.09533) × expression of TNFRSF18 + (-0.17530) × expression of IRS2 + (-0.24169) × expression of PSME2 + (-0.07606) × expression of NPY5R + (-0.11625) × expression of CD6 (Fig. 2I).
3. Inflammation-related gene signature enables assessment of patient survival.
To investigate whether the 12 inflammation-related gene signatures could be used to evaluate the survival rate of breast cancer patients, we grouped the patients into high-risk and low-risk groups according to the model. After excluding the effect of gene collinearity, we found that the expression levels of EDA2R, EMP1 and EIF2AK1 were abnormally elevated in the high-risk group (Figs. 3A-C). Previous studies showed that EDA2R acted as a risk factor in different cancers to promote occurrence [16–18]. EMP1, which negatively regulates autophagy-dependent apoptosis[19], plays an important role in the immune infiltration and tumor cell invasion and metastasis[20–28]. EIF2AK1 was reported to be involved in drug resistance in breast cancer[29]. We further showed that overexpression of these genes was significantly associated with the poor over the survival of patients in the high-risk group (Figs. 3D-L). Meanwhile, we found that DAPL1, PSME2, CD6, WLS, PTGER3, RASGRP1, IRS2, NPY5R, and TNFRSF18 were significantly decreased in the high-risk group (Figs. 3F-K). DAPL1 was reported to negatively regulate CD8 T cell response to tumorigenesis and the depletion of DAPL1 leading to enhancing the functionality of TEX cells [30]. The differential expression of PTGER3 was shown to relate to drug resistance in breast cancer[31–33]. CD6 and PSEM2 were also reported to be associated with immune infiltration and tumor microenvironment[34–38]. It is well recognized that tumor-associated macrophages (TAMs) include two major populations: tumor-supportive M2 macrophages and tumor-suppressive M1 macrophages, although each population may contain subpopulations[39]. In tumor immunity, the activation and phosphorylation of IRS2 by IL4 inhibits the activation of M2 macrophages. Similarly, WLS has the same function, and its deletion upregulates the expression of inflammation-related genes[40] and activates M2 macrophage activity [41]. RASGRP1 is associated with congenital immunodeficiency, and its deletion leads to severe recurrent infections, autoimmunity, autoinflammation, and early onset malignancies[42]. NPY5R limited the activation of the STAT3 signaling pathway through interaction with IL6, simultaneously inhibited the growth of breast tumor cells, induced apoptosis and G2/M arrest, and increased the sensitivity of BC cells to doxorubicin[43]. TNFRSF18 (Glucocorticoid-induced tumor necrosis factor receptor- superfamily Member 18) is highly expressed in low-risk breast cancer and has a good prognosis. We demonstrated that low expression of these immune-related genes was linked to the poor prognosis of patients in the high-risk group (Figs. 3M-X). Furthermore, we found significant differences between high- and low-risk groups in the TNM stage and tumor stage (Figures S3A-D).
4. Inflammation-dependent prognostic model can predict patient survival
In order to ensure the accuracy of the model, we used the training set (TCGA-BRCA, n = 1,104) to predict the life and death of the testing set (GEO, n = 2,969). We found that the number of patient death in the high-risk group was significantly higher than that of the low-risk group (Fig. 4A-C) and testing sets (Figs. 4D-F). According to the KM survival curve, this model worked well in both the training (Fig. 4G) and testing sets (Fig. 4I), indicating that our model is reliable. We also calculated the AUC values for the 5-, 10- and 15-year survival in training sets (Fig. 4H) and the AUC values for the 1-, 2- and 3-year in testing sets (Fig. 4J). The results showed that the AUC value in the training set were all above 0.7 and AUC values in test set were also almost above 0.6 according to the model, indicating that the inflammatory-related risk signature could be used to predict the survival of breast cancer patients.
5. Risk Score and Nomogram can predict the prognosis of breast cancer patients
Univariate and multivariate cox regression analyses revealed that tumor stage, TNM stage, age, and Risk Score were the factors that influence the survival in breast cancer patients and that Risk Score and tumor stage were independent factors (Figs. 5A-D). We further constructed Nomograms in the training set (Figure. 5E) and testing set (Figure. 5G) according to the Risk Score. We found that the higher the Risk Score was associated with the lower the survival rate of patients. Calibration showed that Nomogram is more accurate (Fig. 5F and 5H).
6. High-risk group has inflammation or chronic inflammation, while the low-risk group inactivates the MAPK and calcium ion signaling pathways.
To analyze the relationship between the high-risk group and inflammation, we performed KEGG and GO enrichment analysis of GSEA in high-risk and low-risk patients. We found that nine KEGG signaling pathways, including inflammation and immune deficiency, were enriched in the high-risk group (Fig. 6A). Similarly, GO analysis revealed that the signaling pathways such as chronic inflammation, inflammatory response, interferon, and chemokines were also activated in the high-risk group (Fig. 6B). This suggested that inflammatory reactions are occurring in the high-risk group. However, the MAPK signaling pathway and calcium ion signaling pathway were inhibited in the low-risk group (Fig. 6C). Endogenous stimuli from the TME or cancer cells were also inhibited in the GO signaling pathway (Fig. 6D). These findings further support our observation that the patients in the high-risk group had lower survival rates. Hallmarks co-expression network and GENEMANIA analyses showed that inflammatory response-related signaling pathways were enriched and that 12 inflammation-related genes were co-expressed (Figs. 6E-F).
7. High-risk group has higher macrophage M2 infiltration and lower memory T cells
In order to study the immune cells in mammary tumorigenesis, we analyzed the overall proportion of different immune cells. Among the 22 types of immune cells, memory CD4 T cells and M2-macrophage had the highest abundance (Fig. 7A). Many studies have shown that the accumulation of M2-macrophage promotes tumor blood vessel formation and tumor cell proliferation, while M1-macrophage inhibited tumor growth.[44–47]. We next analyzed the levels of 22 immune cells in patients with different risk groups and found that macrophages M0, M2, and mast cells were higher in the high-risk group, while macrophages M1, T cells, B cells, and Monocytes were higher in the low-risk group (Fig. 7B). According to the previous report[48], we divided 22 types of immune cells into 4 groups: lymphocytes, dendritic cell, macrophage and mast cell, and then analyzed them in different risk patients individually. The results showed that there were more macrophages and mast cells in the high-risk group patients, while there were more lymphocytes in the low-risk group (Fig. 7C). Previous report has shown that excessive or uncontrolled macrophage accumulation contributes to chronic inflammation and occurrence of different diseases [49]. Therefore, we analyzed the abundance of 22 immune cells based on risk score and clinical information. The results strongly indicated that the risk score could classify immune cells, and the abundance of immune cells in different risk groups (Fig. 7D).
To further investigate the relationship between immune cells and 12 inflammation-related genes, we performed the analysis using Cibersort software (https://cibersortx.stanford.edu/). The result showed that the genes, which are associated with poor prognosis, were positively correlated with macrophage M2 and negatively correlated with macrophage M1(Fig. 7E). Our results suggest that M2-macrophages are important factor for the poor survival rate in the high-risk group patients.
8. High-risk group has higher tumor mutational burden (TMB) and higher drug sensitivity to Bicalutamide and Imatinib.
High tumor mutational burden (TMB) and neoantigen load in tumors have been associated with an enhanced response to immune checkpoint blockade therapy[50]. In order to confirm the somatic mutation differences in our defined high- and low-risk groups, we downloaded the somatic mutation data and performed tumor mutation burden analysis. We also analyzed amino acid mutation site, drug-gene interaction, oncogenic signal pathway, and drug sensitivity. The results showed that SNP deletion mutations are the most common in breast cancer (Figs. 8A, B, S3A and S3B). The mutation frequency of the TTN and GATA3 genes was higher in the high-risk group, while the mutation frequency of TP53, PIK3CA, and CDH1 was higher in the low-risk group (Figs. 8C and S3C-H). Inflammation-related hub gene mutation frequency in the high-risk group was higher than in the low-risk group (Figure S3I). And there was a higher TMB in the high-risk group (Figs. 8D and 8E). We further predicted potential drug signaling pathways based on gene mutation frequencies and biological functions (Figures S3G-M). It is well known that the sensitivity to chemotherapeutic drugs significantly affects the treatment outcome of patients. Through drug screening, we found that PD.0332991, ROSCOVITINE, and risk score were significantly positively correlated (Figs. 8F-I), and the treatment effect was better for low-risk patients. Bicalutamide, Imatinib, and RISK Score were negatively correlated (Figs. 8J-M), and the treatment effect was better in high-risk patients.