3.1 Identification of DEGs
Gene expression profiles and the corresponding clinical information for the GSE110224 and GSE112366 datasets were obtained from the GEO database. After normalization, the differentially expressed genes (DEGs) in each dataset were identified and examined by cluster analysis (Figure 1). Taking the intersection of the two DEG sets in a Venn diagram (Figure 2A, B) yielded 61 DEGs with the same direction of change in both datasets, comprising 44 up-regulated and 17 down-regulated genes.
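As an illustration only, the intersection step can be sketched as below. The sketch assumes each dataset's DEG list is available as a mapping from gene symbol to log2 fold change; the function name `common_degs` and the toy gene names and values are hypothetical and are not taken from GSE110224 or GSE112366.

```python
def common_degs(deg_a, deg_b):
    """Return (up, down) lists of genes that are DEGs in both datasets
    and change in the same direction.

    deg_a, deg_b: dict mapping gene symbol -> log2 fold change (DEGs only).
    """
    shared = deg_a.keys() & deg_b.keys()  # genes in the Venn intersection
    up = sorted(g for g in shared if deg_a[g] > 0 and deg_b[g] > 0)
    down = sorted(g for g in shared if deg_a[g] < 0 and deg_b[g] < 0)
    return up, down


# Hypothetical toy input, not real results.
toy_a = {"GENE_A": 2.1, "GENE_B": -1.4, "GENE_C": 1.2, "GENE_D": 1.8}
toy_b = {"GENE_A": 1.7, "GENE_B": -2.0, "GENE_C": -1.1, "GENE_E": 2.5}
up, down = common_degs(toy_a, toy_b)
print(up, down)
```

In this toy input, GENE_C is shared by both lists but changes in opposite directions, so it is excluded from both output lists.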
3.2 GO and KEGG pathway enrichment analyses
The functions of the shared DEGs were analyzed by GO and KEGG enrichment analysis. In the KEGG analysis, the four most prominent enriched terms were humoral immune response, lipopolysaccharide response, response to bacterial-derived molecules, and cytokine-mediated signaling pathways (Figure 2C). The GO analysis showed that these genes were mainly enriched in humoral immune response, lipopolysaccharide response, and response to bacterial-derived molecules (Figure 2D). These findings suggest that the humoral immune response is closely related to the occurrence and progression of these two diseases.
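The text does not state which tool or statistic was used for enrichment. Over-representation analyses of this kind are commonly based on a hypergeometric test, and the following minimal sketch assumes that approach; the function name `hypergeom_pvalue` and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: number of background genes
    K: background genes annotated to the term
    n: genes in the query list (e.g. the shared DEGs)
    k: query genes annotated to the term
    """
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical example: 3 of 3 query genes fall in a 3-gene term
# drawn from a 10-gene background.
p = hypergeom_pvalue(k=3, n=3, K=3, N=10)
print(p)
```

In practice the resulting p-values would also be corrected for multiple testing across all terms, which this sketch omits.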
3.3 PPI network and module analysis
A PPI network with an interaction score greater than 0.4 was constructed and visualized in Cytoscape; it consisted of 39 nodes and 150 interaction pairs (Figure 3A). The MCODE plug-in was then used to extract the most densely connected gene module as a sub-network. One module (score = 10) contained 12 common DEGs (Figure 3C), the vast majority of which were chemokine family members.
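The thresholding step that defines the network can be sketched as follows. This is a simplified illustration that assumes the interactions are available as (gene, gene, score) tuples; it does not reproduce Cytoscape or MCODE, and the function name `build_network` and the toy edges are hypothetical.

```python
def build_network(edges, threshold=0.4):
    """Keep undirected interactions whose score exceeds the threshold.

    edges: iterable of (gene_a, gene_b, score) tuples.
    Returns (nodes, kept), where kept maps frozenset({a, b}) -> score,
    so that A-B and B-A are counted as one interaction pair.
    """
    kept = {}
    for a, b, score in edges:
        if a != b and score > threshold:
            kept[frozenset((a, b))] = score
    nodes = set().union(*kept) if kept else set()
    return nodes, kept


# Hypothetical toy edges, not the network reported in Figure 3A.
toy_edges = [
    ("G1", "G2", 0.90),
    ("G2", "G1", 0.90),  # duplicate of the pair above, reversed
    ("G2", "G3", 0.35),  # below threshold, dropped
    ("G3", "G4", 0.41),
]
nodes, kept = build_network(toy_edges)
print(len(nodes), len(kept))
```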
3.4 Selection and analysis of hub genes
The top 15 hub genes were identified using the cytoHubba plug-in (Figure 4A): CXCL11, MMP3, CXCL3, MMP1, CXCL5, CXCL2, LCN2, CXCL1, CXCL6, IL1B, FPR1, IL1RN, CCL23, FCGR3B, and S100A9. The GeneMANIA database was used to investigate the interaction networks and related functions of these genes (Figure 4B). The hub genes formed a complex network comprising 38.13% co-expression, 29.49% physical interactions, and 18.71% shared protein domains (Figure 4B). Functionally, they were mainly associated with chemokine-mediated signaling pathways, responses to chemokines, and cellular responses to chemokines, among others (Figure 4D). Furthermore, KEGG pathway analysis showed that they were primarily enriched in lipid and atherosclerosis, neutrophil extracellular trap formation, and leukocyte transendothelial migration, among others (Figure 4E, F).
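The text does not specify which cytoHubba scoring method was used. As a simplified stand-in, the sketch below ranks genes by node degree (the number of interaction partners), which is only one of several possible criteria; the function name `top_hubs` and the toy edges are hypothetical.

```python
from collections import Counter


def top_hubs(edges, n=15):
    """Rank genes by degree in an undirected network and return the top n.

    edges: iterable of (gene_a, gene_b) pairs, each pair listed once.
    Ties are broken alphabetically so the result is deterministic.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree, key=lambda g: (-degree[g], g))[:n]


# Hypothetical toy network: G1 has 3 partners, G2 and G3 have 2, G4 has 1.
toy_pairs = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
hubs = top_hubs(toy_pairs, n=2)
print(hubs)
```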
3.5 Relationship between CXCL1 and immunity
Analysis with the TISIDB database showed that CXCL1 expression was positively associated with immune cell infiltration in colon cancer, including neutrophils, CD4 effector T cells, and CD8 effector T cells, among others (Figure 5B-H). We also analyzed its correlation with immune checkpoints and found that CXCL1 was positively correlated with checkpoint molecules that promote immune escape, such as PDCD1 and CD274 (Figure 5J-M).
To verify the expression levels of the hub genes, the TCGA-COAD dataset and an external CD dataset were used. All hub genes except CCL23 were significantly upregulated in colon cancer tissues in the TCGA-COAD dataset compared with normal gut tissue (Figure 6). Similarly, in the other dataset (GSE102133), the expression of all genes except CCL23 was higher than in normal colon tissue (Figure 7). The TCGA-COAD dataset was also used to assess the prognostic value of all 15 hub genes in colon cancer. Only CXCL1 was associated with prognosis, and its low expression was associated with poor prognosis (Figure 8). Furthermore, to clarify the role of CXCL1 in colon cancer, analysis of the TCGA-COAD dataset showed that low CXCL1 expression was associated with higher pathological stage, N stage, and M stage (Figure 8G, H).
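The correlation analyses in this section relate CXCL1 expression to infiltration scores or checkpoint expression across samples. The text does not state which coefficient was used; the sketch below assumes a rank-based (Spearman) correlation and uses hypothetical toy values, with `ranks`, `pearson`, and `spearman` as illustrative helper names.

```python
def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-sample values, not data from TISIDB or TCGA-COAD.
expression = [1.0, 2.5, 3.1, 4.8]
infiltration = [10.0, 20.0, 15.0, 40.0]
rho = spearman(expression, infiltration)
print(round(rho, 3))
```

A positive coefficient, as in this toy example, corresponds to the positive associations described above; significance testing is omitted from the sketch.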