Identification of DEGs and module genes.
In this study, the "Limma" R package was used to identify differentially expressed genes (DEGs) in the dataset GSE37069. The selection criteria were set as |log2 FC| > 0.575. Heatmaps and volcano plots were generated to visualize the differential gene expression(Fig. 1A-B). A total of 2899 DEGs were identified between the Major burn and control samples, and the top 10 genes with the largest differential expression were SLC2A3, CCDC84, MAPK14, KIAA1377, CR1, MMP9, BMX, B3GALT2, PYGL, and NR3C2. In the weighted gene co-expression network analysis, a tree diagram was constructed by incorporating the gene sets of all samples without outliers. The soft threshold was set to 11 (R^2 = 0.95) to construct a scale-free network, resulting in the identification of 18 modules(Fig. 2A-D). Based on the criterion (|Cor| > 0.4), the darkturquoise, red, and pink modules were selected, which included a total of 4936 genes for further analysis(Fig. 2E-G).
Acquisition and functional enrichment of intersection genes
The intersection of the obtained 2899 DEGs, 4936 module genes, and 437 OSRGs resulted in 44 intersection genes. A Venn diagram and Manhattan plot were created to visualize these intersection genes and their genomic positions(Fig. 3A). We also investigated the chromosomal locations of intersecting genes and visualized them(Fig. 3B,C). GO and KEGG analyses were applied to the cluster Profiler package to explore the potential molecular functions and mechanisms of the intersection genes(Fig. 3D-F). Bubble plots and bar charts were generated for GO and KEGG enrichment, revealing significant enrichment in 10 biological process (BP) terms, including regulation of oxidative stress − induced cell death, regulation of response to oxidative stress, cell death in response to oxidative stress, cellular response to hydrogen peroxide, cellular response to reactive oxygen species, response to hydrogen peroxide, response to reactive oxygen species, cellular response to chemical stress, cellular response to oxidative stress, response to oxidative stress, 10 cellular component (CC) termfs, including astrocyte projection, perinuclear endoplasmic reticulum, organelle envelope lumen, mitochondrial intermembrane space, glial cell projection, outer membrane, organelle outer membrane, mitochondrial outer membrane, membrane microdomain, membrane raf, and 10 molecular function (MF) terms, including DNA−(apurinic or apyrimidinic site) endonuclease activity, carbon − oxygen lyase activity, damaged DNA binding, scaffold protein binding, oxidoreductase activity, acting on peroxide as acceptor, deoxyribonuclease activity, DNA endonuclease activity, antioxidant activity, oxidoreductase activity, acting on a sulfur group of donors, DNA − binding transcription factor binding. The top five KEGG pathways were most enriched, including Glutathione metabolism, Chemical carcinogenesis − reactive oxygen species, Focal adhesion, Pathways of neurodegeneration − multiple diseases, MicroRNAs in cancer.
Identification of hub genes
In order to identify hub genes, we constructed a protein-protein interaction (PPI) network using the STRING website and performed analysis using Cytoscape(Fig. 4). We selected the top 10 hub genes with the highest number of connections, which are GCLM, MMP9, BCL2, IL10, SIRT1, NME8, PINK1, GCLC, EPAS1, and PARK7. Among these, SIRT1 and BCL2 have the closest connections to other parts of the network, followed by PKRK7 and MMP9.
Expression of hub genes and ROC curve analysis
All 10 hub genes have different levels of expression in the major burn and normal samples in the GSE37069 dataset (Fig. 5). Among them, EPAS1, PINK1, NME8, MMP9, IL10, and GCLM are highly expressed in the major burn samples, while the other genes are lowly expressed in the major burn samples. Then, ROC curve analysis was used to test the ability of each hub gene to differentiate major burn samples from control samples based on expression levels (Fig. 6A), and candidate genes with the best discriminatory performance (ROC value > 0.8) were selected. It was found that the AUC of ROC for all 10 hub genes exceeded 0.8. Among them, MMP9 had the highest AUC, which was 0.966, and GCLC had the lowest AUC, which was 0.844. Co-expression analysis revealed the most significant positive correlation between SIRT1 and GCLC, IL10 and GCLM(Fig. 6B-D).
Immunoinfiltration analysis of hub genes
This study analyzed the effects of hub genes on the immune system after severe burns and conducted an immunoinfiltration scoring. Among the 28 types of immune cells, MMP9 and BCL2 were significantly correlated with multiple immune cells(Fig. 7). MMP9 was positively correlated with plasmacytoid dendritic cell, macrophage, regulatory T cell, and activated dendritic cell, but negatively correlated with activated CD8 T cell, effector memory CD8 T cell, MDSC, CD56bright natural killer cell, natural killer T cell, natural killer cell, eosinophil, type 2 T helper cell, type 1 T helper cell, central memory CD4 T cell, T follicular helper cell, effector memory CD4 T cell, memory B cell, CD56dim natural killer cell, neutrophil, immature B cell, and activated CD4 T cell. BCL2 was significantly positively correlated with natural killer T cell, activated B cell, activated CD4 T cell, MDSC, T follicular helper cell, central memory CD4 T cell, CD56bright natural killer cell, CD56dim natural killer cell, natural killer cell, type 1 T helper cell, effector memory CD8 T cell, and activated CD8 T cell, but significantly negatively correlated with activated dendritic cell, macrophage, regulatory T cell, mast cell, and neutrophil.
Gene set enrichment analysis
According to the GSEA results, the MMP9 high-expression group was enriched in amylopectin and sucrose metabolism, lysosome and regulation of cytoskeleton by actin, while the BCL2 high-expression group was highly enriched in lysosome, graft versus host disease, and primary immune deficiency. The GCLM high-expression group was enriched in amylopectin and sucrose metabolism, amino sugar and nucleotide sugar metabolism, and cell cycle. The enrichment status of other high-expression gene groups is shown in the figure (Fig. 8).
Prediction of drugs, miRNAs, and transcription factors based on hub genes
Based on the hub genes, we predicted 37 target drugs. For example, the target drug for GCLM is cisplatin, and the target drugs predicted by MMP9 include curcumin, marimastat, and bevacizumab. BCL2 predicted target drugs include paclitaxel, vinorelbine, cisplatin, and isosorbide dinitrate, with a total of 19 drugs. The target drugs predicted for IL-10 include retinoic acid, sirolimus, tacrolimus, and letrozole. SIRT1 predicted target drugs include sulforaphane, rapamycin, and resveratrol, and GCLC predicted target drugs include cisplatin and cisplatin (Fig. 9A).
Furthermore, utilizing the miRNet database, a hub gene-miRNA regulatory network and a hub gene-transcription factor (TF) network were generated, identifying 458 targeted miRNAs and 19 TFs (Fig. 9B,C).
Multi-gene diagnostic model and independent analysis validation
Based on the dataset GSE37069, we established a multi-gene diagnostic model using LASSO regression and ROC analysis, focusing on 10 hub genes (Fig. 10). The AUC of the ROC curve is 0.988, indicating excellent discriminatory performance. Furthermore, the confusion matrix shows an accuracy of 0.96, precision of 0.96, and recall of 0.99 for this model (Fig. 11A, B). In the validation dataset, the accuracy is 0.99, precision is 0.99, and recall is 0.99. Additionally, the AUC of the ROC curve is 0.988, indicating good practicality of the model (Fig. 11C, D).