Expression of DEGs
GSE13353 and GSE36791 were combined as the analysis data set, which included 54 ruptured IA samples (experimental group) and 26 unruptured IA samples (control group). Differential analysis identified 27 DEGs, comprising 26 upregulated genes and 1 downregulated gene, which are displayed in a heat map and a volcano plot (Fig. 1).
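As a rough illustration of how genes are usually labeled as upregulated or downregulated for a volcano plot, the sketch below classifies genes from a log2 fold change and an adjusted P value. The cut-offs (`LOG2_FC_CUTOFF`, `ADJ_P_CUTOFF`), the `GeneStat` type, and the gene names and statistics are all hypothetical placeholders; this section does not state the thresholds used in the study.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log2_fc: float  # log2 fold change, experimental vs. control group
    adj_p: float    # multiple-testing-adjusted P value


# Assumed cut-offs for illustration only; the section does not report them.
LOG2_FC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def classify(stat: GeneStat) -> str:
    """Label a gene as 'up', 'down', or 'ns' (not significant)."""
    if stat.adj_p >= ADJ_P_CUTOFF or abs(stat.log2_fc) < LOG2_FC_CUTOFF:
        return "ns"
    return "up" if stat.log2_fc > 0 else "down"


# Hypothetical placeholder statistics, not values from the study.
stats = [
    GeneStat("GENE_A", 1.8, 0.001),
    GeneStat("GENE_B", -1.3, 0.020),
    GeneStat("GENE_C", 0.4, 0.001),   # significant but small effect
    GeneStat("GENE_D", 2.1, 0.300),   # large effect but not significant
]

labels = {s.gene: classify(s) for s in stats}
n_up = sum(1 for v in labels.values() if v == "up")
n_down = sum(1 for v in labels.values() if v == "down")
print(labels, n_up, n_down)
```

A gene must pass both cut-offs to count as a DEG, which is why `GENE_C` and `GENE_D` are labeled `ns` in this toy input.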
Enrichment analysis of DEGs
GO and KEGG enrichment analyses of the DEGs were performed, and enrichment results with P < 0.05 were retained (Fig. 2A, 2B). The GO results suggested that the biological processes associated with the DEGs mainly included myeloid leukocyte activation, regulation of immune effector processes, positive regulation of DNA-binding transcription factor activity, and myeloid leukocyte-mediated immunity. The KEGG results showed that DEG-associated pathways included neutrophil extracellular trap formation, fructose and mannose metabolism, transcriptional misregulation in cancer, fluid shear stress and atherosclerosis, amebiasis, and leukocyte transendothelial migration. Enrichment analysis of the DEGs in the Metascape database (P < 0.01) showed that the DEGs were related to the interleukin (IL) signaling pathway, neutrophil degranulation, positive regulation of DNA-binding transcription factor activity, inflammatory response, and the IL-18 signaling pathway (Fig. 2C).
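Over-representation analyses of this kind are commonly based on a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation for a single term, under the assumption that such a test underlies the reported P values; the section does not name the test. The function name `hypergeom_pvalue`, the background size, the term size, and the overlap count are hypothetical; only the DEG count of 27 comes from the text.

```python
from math import comb


def hypergeom_pvalue(n_background: int, n_in_term: int,
                     n_degs: int, n_overlap: int) -> float:
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_background genes, n_in_term of which belong to the term."""
    upper = min(n_in_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 27 DEGs, a term with 150 annotated genes in a
# background of 20,000 genes, and 5 DEGs falling in the term.
p = hypergeom_pvalue(n_background=20_000, n_in_term=150, n_degs=27, n_overlap=5)
print(f"enrichment P = {p:.3g}; retained at P < 0.05: {p < 0.05}")
```

In practice the P values for many terms would also be adjusted for multiple testing, which this sketch leaves out.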
Random forest trees and gene importance analysis
The DEGs were filtered with a random forest. In the plot of error against the number of trees (Fig. 3A), which shows the cross-validation error for the experimental group, the control group, and all samples, the lowest error occurred at 153 trees. Gene importance scores were then calculated at this point (Fig. 3B). Genes with importance scores greater than 2 were selected as disease signature genes: hexokinase 3 (HK3), matrix metalloproteinase 9 (MMP9), CST7, NCF2, and uridine phosphorylase 1 (UPP1). Heat map analysis of these five genes (Fig. 3C) showed that samples from the experimental and control groups clustered largely separately, indicating that expression of the signature genes could distinguish the experimental group from the control group. This supported the five genes as signature genes of RIAs.
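The two selection steps described here, picking the tree count with the lowest error and keeping genes whose importance exceeds 2, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names, the error curve, and the gene names and scores are hypothetical placeholders, and only the threshold of 2 is taken from the text.

```python
from typing import Dict, List, Sequence


def best_tree_count(errors: Sequence[float]) -> int:
    """Return the 1-based number of trees with the lowest error.

    errors[i] is assumed to be the error of a forest with i + 1 trees;
    on ties, the smallest tree count is returned.
    """
    return min(range(len(errors)), key=errors.__getitem__) + 1


def select_signature_genes(importance: Dict[str, float],
                           threshold: float = 2.0) -> List[str]:
    """Keep genes scoring strictly above the threshold, highest first."""
    kept = [(gene, score) for gene, score in importance.items()
            if score > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [gene for gene, _ in kept]


# Hypothetical error curve and importance scores, not values from the study.
errors = [0.30, 0.22, 0.18, 0.19, 0.18]
importance = {"GENE_A": 3.1, "GENE_B": 1.4, "GENE_C": 2.6, "GENE_D": 2.0}

n_trees = best_tree_count(errors)
signature = select_signature_genes(importance)
print(n_trees, signature)
```

The strict comparison mirrors the wording "greater than 2", so a gene scoring exactly 2.0 is excluded.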
ANN model analysis and diagnostic value
The ANN model (Fig. 4) comprised three layers. The input layer contained the five signature genes of RIAs, entered as gene scores; the hidden layer contained five nodes, computed from the gene scores and their corresponding weights; and the output layer, computed from the hidden nodes and their weights, gave the sample attributes. The weights were greatest along the paths from MMP9 and UPP1 to the hidden layer and the output layer, suggesting that these two genes contributed most to predicting the sample attributes. ROC curve analysis showed that the ANN model predicted the sample attributes with an AUC of 0.947 (95% CI 0.900–0.983) in the analysis data set (Fig. 5A) and an AUC of 0.825 (95% CI 0.475–1.000) in the validation data set (Fig. 5B), suggesting that the model discriminated well between sample attributes.
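To make the three-layer structure and the AUC concrete, the sketch below runs one forward pass through a network with five inputs and five hidden nodes, then computes an AUC from predicted scores. It is a simplified illustration under several assumptions not stated in the text: sigmoid activations, a single output node, and hypothetical weights, scores, and labels. The functions `forward` and `auc` are illustrative names, not part of any package used in the study.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs: Sequence[float],
            w_hidden: Sequence[Sequence[float]], b_hidden: Sequence[float],
            w_out: Sequence[float], b_out: float) -> float:
    """One forward pass: gene scores -> hidden nodes -> one output score."""
    hidden: List[float] = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    with tied pairs counted as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical all-zero weights: every node then outputs sigmoid(0) = 0.5.
zero_score = forward([1, 0, 1, 1, 0], [[0.0] * 5] * 5, [0.0] * 5, [0.0] * 5, 0.0)

# Hypothetical predicted scores and true labels (1 = ruptured, 0 = unruptured).
example_auc = auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
print(zero_score, example_auc)
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve, and it shows why a small validation set can yield a wide confidence interval: few positive-negative pairs are available.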
Immune cell infiltration and immune cell differential analysis
A total of 22 immune cell types were analyzed. These comprised 12 types of innate immune cells (activated dendritic cells, resting dendritic cells, activated mast cells, resting mast cells, activated natural killer (NK) cells, resting NK cells, macrophages (M0, M1, and M2), monocytes, neutrophils, and eosinophils) and 10 types of adaptive immune cells (plasma cells, naive B cells, CD8+ T cells, naive CD4+ T cells, memory B cells, activated CD4+ memory T cells, resting CD4+ memory T cells, follicular helper T cells, regulatory T (Treg) cells, and γδ T cells). The samples from the UIA and RIA groups were analyzed for immune cell infiltration and immune cell differences (Fig. 6). The numbers of CD8+ T cells (P < 0.001), activated CD4+ memory T cells (P = 0.01), resting CD4+ memory T cells (P = 0.039), Treg cells (P < 0.001), activated NK cells (P = 0.002), resting NK cells (P = 0.007), M0 macrophages (P < 0.001), and neutrophils (P < 0.001) differed significantly between the RIA and UIA groups. Among these, the numbers of CD8+ T cells, activated CD4+ memory T cells, resting CD4+ memory T cells, activated NK cells, M0 macrophages, and neutrophils were higher in the RIA group. In contrast, the numbers of Treg cells and resting NK cells were higher in the UIA group. Immune cell correlation analysis showed positive correlations between the numbers of eosinophils and M1 macrophages (r = 0.65), eosinophils and activated dendritic cells (r = 0.43), and resting CD4+ memory T cells and CD8+ T cells (r = 0.43). Negative correlations were found between the numbers of neutrophils and CD8+ T cells (r = –0.69), Treg cells and activated CD4+ memory T cells (r = –0.51), and neutrophils and resting CD4+ memory T cells (r = –0.47).
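The correlation coefficients reported above relate the estimated levels of two cell types across samples. The sketch below computes a Pearson coefficient for one such pair, assuming Pearson correlation was used; the section reports r values but does not name the method. The function name `pearson_r` and the per-sample values are hypothetical placeholders, not data from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-sample infiltration estimates for two cell types.
neutrophils = [0.10, 0.20, 0.30, 0.40]
cd8_t_cells = [0.30, 0.25, 0.15, 0.05]

r = pearson_r(neutrophils, cd8_t_cells)
print(f"r = {r:.2f}")
```

A negative r, as in this toy input, means that samples with higher levels of one cell type tend to have lower levels of the other.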