Part I Results
1. Mendelian randomization analysis
Exposure | Method | Nsnp | B | SE | P | OR (95% CI)
BMI | IVW | 432 | 0.40609 | 0.04292 | 3.05E-21 | 1.50093 (1.37983-1.63271)
BMI | MR Egger | 432 | 0.38617 | 0.11651 | 0.00099 | 1.47134 (1.17097-1.84875)
BMI | WM | 432 | 0.38231 | 0.07932 | 1.44E-06 | 1.46565 (1.25463-1.71216)
HC | IVW | 399 | 0.31202 | 0.04537 | 6.09E-12 | 1.36618 (1.24994-1.49322)
HC | MR Egger | 399 | 0.27233 | 0.12574 | 0.03092 | 1.31303 (1.02621-1.68001)
HC | WM | 399 | 0.37041 | 0.07136 | 2.09E-07 | 1.44832 (1.25928-1.66574)
WC | IVW | 357 | 0.53017 | 0.05401 | 9.44E-23 | 1.69923 (1.52857-1.88894)
WC | MR Egger | 357 | 0.41554 | 0.15475 | 0.00759 | 1.51521 (1.11877-2.05209)
WC | WM | 357 | 0.49593 | 0.09516 | 1.87E-07 | 1.64202 (1.36264-1.97869)
First, MR was used to assess the causal association between obesity-related indexes and the development of sepsis. Applying the screening conditions described above yielded 432, 399, and 357 SNPs as instrumental variables (IVs) for body mass index (BMI), hip circumference (HC), and waist circumference (WC), respectively. Each set of IVs was then analyzed against the sepsis outcome. All three methods, inverse variance weighted (IVW), MR Egger, and weighted median (WM), showed that BMI, HC, and WC were positively associated with sepsis (P<0.05, OR>1), suggesting that BMI, HC, and WC are risk factors for sepsis (Table 1).
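As an illustration of how an IVW estimate combines per-SNP effects, the following is a minimal sketch of a fixed-effect IVW estimator. It is not the implementation used in this study (the software settings, such as fixed versus random effects, are not stated here), and the per-SNP values are hypothetical.

```python
import math

def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW: weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1/se_out^2."""
    weights = [1.0 / s ** 2 for s in se_out]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exp, beta_out))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_exp))
    beta = num / den
    se = math.sqrt(1.0 / den)
    return beta, se

# Hypothetical summary statistics for three SNPs (not data from this study).
beta_exp = [0.10, 0.20, 0.15]   # SNP-exposure effects
beta_out = [0.04, 0.08, 0.06]   # SNP-outcome effects (log odds)
se_out = [0.01, 0.02, 0.015]    # standard errors of the SNP-outcome effects

beta_ivw, se_ivw = ivw_estimate(beta_exp, beta_out, se_out)
print(f"IVW beta = {beta_ivw:.4f}, SE = {se_ivw:.4f}")
```

In this toy input every SNP has the same Wald ratio (outcome effect divided by exposure effect), so the pooled estimate equals that common ratio.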
Table 1. Results of causal analysis between BMI/HC/WC and sepsis
Notes: Inverse variance weighted(IVW); Weighted median(WM); Body mass index(BMI); Hip circumference(HC); Waist circumference(WC).
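The OR column in Table 1 can be related to the B and SE columns by exponentiation. The sketch below assumes the usual normal-approximation interval, exp(B ± 1.96 × SE); the function name is illustrative, and small differences in the last decimals can arise from rounding of B and SE.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds estimate and its SE to an OR with a 95% CI."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))

# BMI, IVW row of Table 1: B = 0.40609, SE = 0.04292
or_bmi, lower_bmi, upper_bmi = odds_ratio_ci(0.40609, 0.04292)
print(f"OR = {or_bmi:.5f} ({lower_bmi:.5f}-{upper_bmi:.5f})")
```

For the BMI IVW row this gives values close to the reported 1.50093 (1.37983-1.63271).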
Exposure | Method | Q | Q_df | P
BMI | MR Egger | 454.04163 | 430 | 0.20394
BMI | IVW | 454.07735 | 431 | 0.21330
HC | MR Egger | 446.94904 | 397 | 0.04211
HC | IVW | 447.07797 | 398 | 0.04493
WC | MR Egger | 367.33016 | 355 | 0.31475
WC | IVW | 367.97674 | 356 | 0.31960
Table 2. Heterogeneity tests
Notes: Inverse variance weighted(IVW); Weighted median(WM); Body mass index(BMI); Hip circumference(HC); Waist circumference(WC).
2. Heterogeneity test and pleiotropy test
To verify the reliability of the results, heterogeneity was assessed using Cochran's Q test under both the IVW and MR Egger models. No significant heterogeneity among SNPs was detected for BMI or WC (P>0.05), whereas HC showed nominally significant heterogeneity under both models (MR Egger P=0.04211; IVW P=0.04493), as shown in Table 2. The funnel plots showed a symmetrical distribution of SNPs, suggesting no directional pleiotropy (Figure S1). Consistent with this, the MR Egger intercept test showed no evidence of horizontal pleiotropy (intercepts: BMI 0.00039, P=0.85416; HC 0.00078, P=0.73523; WC 0.00189, P=0.42977; Table 3).
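The P values of the two tests can be approximately recovered from the reported statistics. The sketch below is an assumption-laden check rather than the software used in this study: it approximates the chi-square tail for Cochran's Q with the Wilson-Hilferty cube-root transform and uses a normal approximation for the Egger intercept (MR software typically uses a t distribution, so small differences are expected). All function names are illustrative.

```python
import math

def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def chi2_sf_wilson_hilferty(q, df):
    """Approximate chi-square upper-tail probability for large df."""
    v = 2.0 / (9.0 * df)
    z = ((q / df) ** (1.0 / 3.0) - (1.0 - v)) / math.sqrt(v)
    return normal_sf(z)

def two_sided_p(estimate, se):
    """Two-sided P value for estimate/se under a normal approximation."""
    return 2.0 * normal_sf(abs(estimate / se))

# IVW rows of the heterogeneity table: (Q, Q_df)
ivw_q = {"BMI": (454.07735, 431), "HC": (447.07797, 398), "WC": (367.97674, 356)}
# Egger intercept table: (intercept, SE)
egger = {"BMI": (0.00039, 0.00211), "HC": (0.00078, 0.00231), "WC": (0.00189, 0.00239)}

q_p = {k: chi2_sf_wilson_hilferty(q, df) for k, (q, df) in ivw_q.items()}
egger_p = {k: two_sided_p(b, se) for k, (b, se) in egger.items()}

for k in ivw_q:
    print(f"{k}: Cochran Q P ~ {q_p[k]:.4f}, Egger intercept P ~ {egger_p[k]:.4f}")
```

Under these approximations the Q test P values for BMI and WC stay above 0.05 while the HC value falls just below it, and all three intercept P values are well above 0.05.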
Table 3. Egger_intercept
Notes: Inverse variance weighted(IVW); Weighted median(WM); Body mass index(BMI); Hip circumference(HC); Waist circumference(WC).
Exposure | Egger_intercept | SE | P
BMI | 0.00039 | 0.00211 | 0.85416
HC | 0.00078 | 0.00231 | 0.73523
WC | 0.00189 | 0.00239 | 0.42977
3. Sensitivity analysis
At the end of the MR analysis, a "leave-one-out" sensitivity analysis was performed to assess whether the results were driven by any individual SNP. Removing each SNP in turn did not materially change the overall estimate, and no bias was caused by a single SNP, suggesting that the results of the MR analysis were reliable (Figure S2).
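The leave-one-out procedure can be sketched as follows: the pooled estimate is recomputed once per SNP with that SNP removed, and the resulting estimates are compared with the full-data estimate. This is a minimal illustration using a fixed-effect IVW estimator and hypothetical SNP identifiers and effect sizes, not the analysis code or data of this study.

```python
def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect IVW point estimate (regression through the origin)."""
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx * bx / s ** 2 for bx, s in zip(beta_exp, se_out))
    return num / den

def leave_one_out(snps, beta_exp, beta_out, se_out):
    """Return {snp: IVW estimate computed without that SNP}."""
    results = {}
    for i, snp in enumerate(snps):
        keep = [j for j in range(len(snps)) if j != i]
        results[snp] = ivw([beta_exp[j] for j in keep],
                           [beta_out[j] for j in keep],
                           [se_out[j] for j in keep])
    return results

# Hypothetical instruments (identifiers and values are placeholders).
snps = ["rs_a", "rs_b", "rs_c", "rs_d"]
beta_exp = [0.10, 0.20, 0.15, 0.12]
beta_out = [0.04, 0.09, 0.06, 0.05]
se_out = [0.01, 0.02, 0.015, 0.012]

full = ivw(beta_exp, beta_out, se_out)
loo = leave_one_out(snps, beta_exp, beta_out, se_out)
for snp, est in loo.items():
    print(f"without {snp}: beta = {est:.4f} (full = {full:.4f})")
```

If no single estimate departs markedly from the full estimate, no individual SNP is driving the result.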
Figure 1. Differential expression analysis. A. Heatmap of the results of differential expression analysis of GSE49756 using GEO2R; B. Heatmap of the results of differential expression analysis of GSE57065 using GEO2R; C. Heatmap of the results of differential expression analysis of GSE236713 using GEO2R; D. Volcano plot of the results of differential expression analysis of GSE49756; E. Volcano plot of the results of differential expression analysis of GSE57065; F. Volcano plot of the results of differential expression analysis of GSE236713; G. Venn diagram of the intersection of up-regulated genes from the differential expression results of the three datasets; H. Venn diagram of the intersection of down-regulated genes from the differential expression results of the three datasets; I. Heatmap of the results of differential expression analysis using the "limma" package after data merging and normalization.
Part II Results
1. Differential expression analysis
Three sepsis datasets, GSE49756, GSE57065 and GSE236713, were downloaded from the GEO database. Differential expression analysis (|log(FC)|>0.585) was first performed on each dataset using GEO2R, the analysis tool provided with the GEO database; the resulting heatmaps are shown in Figure 1.A-C and the volcano plots in Figure 1.D-F. The up-regulated and down-regulated differentially expressed genes from the three datasets were then intersected separately, and genes appearing in two different datasets were defined as the differentially expressed genes filtered by "GEO2R" (up-regulated, Figure 1.G; down-regulated, Figure 1.H), giving a total of 580 differentially expressed genes.
Next, the three datasets were normalized and merged, and differential expression analysis was performed using the "limma" package (|log(FC)|>0.585), yielding 2144 differentially expressed genes. A heatmap of some of these genes is shown in Figure 1.I. Finally, the merged dataset was analyzed using WGCNA to identify gene modules associated with sepsis versus normal samples. Based on the module independence test (Figure 2.A), the gene dendrogram and module colors (Figure 2.B), and the module-trait relationship analysis (Figure 2.C), the purple module correlated well with sepsis (cor=0.57, p=5e-45), and its module membership vs. gene significance correlation was also high (cor=0.77, p=5.1e-64). The purple module was therefore selected for subsequent analysis, giving 320 sepsis-related genes.
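The thresholding and intersection step can be sketched as below. The gene names and log fold changes are hypothetical, and two points are assumptions rather than statements from the text: that "appearing in two different datasets" means present in at least two of the three datasets, and that the logarithm is base 2 (in which case 0.585 corresponds to a fold change of about 1.5).

```python
from collections import Counter

THRESHOLD = 0.585  # |log(FC)| cutoff quoted in the text

def split_degs(log_fc):
    """Split one dataset's {gene: logFC} into up- and down-regulated sets."""
    up = {g for g, v in log_fc.items() if v > THRESHOLD}
    down = {g for g, v in log_fc.items() if v < -THRESHOLD}
    return up, down

def shared_by_at_least(gene_sets, k=2):
    """Genes that occur in at least k of the given sets."""
    counts = Counter(g for s in gene_sets for g in s)
    return {g for g, c in counts.items() if c >= k}

# Hypothetical logFC values for three datasets (placeholders only).
datasets = {
    "dataset_1": {"GENE_A": 1.2, "GENE_B": -0.9, "GENE_C": 0.3},
    "dataset_2": {"GENE_A": 0.7, "GENE_B": -0.2, "GENE_C": 0.8},
    "dataset_3": {"GENE_A": 0.1, "GENE_B": -1.1, "GENE_C": 0.9},
}

ups, downs = zip(*(split_degs(fc) for fc in datasets.values()))
up_shared = shared_by_at_least(ups, k=2)
down_shared = shared_by_at_least(downs, k=2)
print(sorted(up_shared), sorted(down_shared))
```

Up- and down-regulated genes are intersected separately so that a gene changing in opposite directions across datasets is not counted as shared.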
Figure 2. WGCNA analysis. A. Module independence test in the results. B. Gene dendrogram and module colors; C. Heatmap of Module-trait relationships analysis; D. Correlation scatterplot of Module membership vs. gene significance for normal samples in the purple module; E. Correlation scatterplot of Module membership vs. gene significance for sepsis samples in the purple module.
2. Screening and enrichment analysis of anoikis core genes
The differentially expressed genes obtained by the three methods were intersected with anoikis genes to determine the final anoikis-related differentially expressed genes (ARDEGs) in sepsis, and 6 ARDEGs were obtained (Figure 3.A). GO function and KEGG pathway enrichment analysis showed that, at the biological process (BP) level, these ARDEGs were mainly involved in negative regulation of cytokine production, negative regulation of interleukin-1 production and negative regulation of myeloid leukocyte differentiation. At the cellular component (CC) level, they were mainly associated with granule-related functions. At the molecular function (MF) level, they were mainly involved in peptidase inhibitor activity, endopeptidase inhibitor activity and protein phosphatase binding. In terms of pathways, they were mainly involved in pathways related to diseases such as arrhythmogenic right ventricular cardiomyopathy and acute myeloid leukemia (Figure 3.B).
Because the 6 ARDEGs did not show good disease correlation, they were further screened using support vector machine recursive feature elimination (SVM-RFE), random forest (RF), and lasso-Cox regression analysis. SVM-RFE reached its minimum cross-validation error with 6 features, retaining all 6 ARDEGs (Figure 3.C). RF reached its minimum error at trees=157; taking genes with an importance score greater than 20 gave 3 ARDEGs (Figure 3.D,E). Lasso-Cox regression reached its minimum cross-validation error at 4 variables, yielding 4 ARDEGs (Figure 3.F). The intersection of the results of the three algorithms gave 3 core ARDEGs (Figure 3.G).
Figure 3. Screening and enrichment analysis of anoikis core genes. A. Venn diagram of the intersection of the results of the three differential expression analyses with anoikis genes; B. Bar graph of GO and KEGG enrichment analysis of ARDEGs; C. Screening of core genes using support vector machine (SVM-RFE); D. Screening of core genes using the random forest (RF) algorithm; E. RF-based gene importance circle map; F. Screening of core genes using lasso-Cox regression analysis; G. Venn diagram of the intersection of the genes screened by the three methods.
3. Risk nomogram construction and validation
Next, the expression of the 3 core ARDEGs was validated using the merged and normalized sepsis mRNA-Seq data. SERPINB1, MERTK and CEACAM8 were all significantly up-regulated in sepsis (Figure 4.A-C, P<0.001). The ROC-AUC of each of the three genes was greater than 0.8, suggesting good diagnostic efficacy for sepsis (Figure 4.D-F). On this basis, a risk prediction nomogram was constructed from the three core ARDEGs (Figure 4.G). The calibration curves showed a good fit (Figure 4.H). Decision curve analysis (DCA) indicated that the predictive efficacy of the model was high and stable (Figure 4.I). In the clinical impact curve, the model's predictions were very close to the true positives, suggesting high predictive efficacy and patient benefit (Figure 4.J).
Figure 4. Risk model construction and validation. A. The validation box plot for the SERPINB1 expression analysis; B. The validation box plot for the MERTK expression analysis; C. The validation box plot for the CEACAM8 expression analysis; D. The diagnostic ROC curve for SERPINB1; E. The diagnostic ROC curve for MERTK; F. The diagnostic ROC curve for CEACAM8; G. The nomogram of the risk prediction model based on SERPINB1, MERTK and CEACAM8; H. Calibration curves; I. Decision curves; J. Clinical impact curves.
4. Immune cell profile in sepsis
The distribution of immune cells in patients with sepsis was also explored using CIBERSORT; the proportions of the various immune cells are shown in Figure 5.A. Correlation analysis between the immune cell types showed that their relationships were predominantly negative (Figure 5.B). To further clarify how immune cells change in sepsis, the levels of each immune cell type were compared between the normal and sepsis groups. Plasma cells, T cells CD4 naive, T cells gamma delta, Macrophages M0, Macrophages M2, Mast cells activated and Neutrophils were elevated in the sepsis group, whereas T cells CD8, T cells CD4 memory resting, T cells CD4 memory activated, T cells follicular helper, T cells regulatory (Tregs), NK cells resting, Dendritic cells resting and others were decreased (Figure 5.C).
Figure 5. Overview of the immune environment in sepsis. A. Bar graph of the composition of various immune cells in the normal and sepsis groups; B. Heatmap of correlation analysis between various immune cells in sepsis; C. Comparative violin plots of the levels of various immune cells in the normal and sepsis groups.
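For readers less familiar with the ROC-AUC, it equals the probability that a randomly chosen sepsis sample has a higher marker value than a randomly chosen control sample. The sketch below computes it by direct pairwise comparison on hypothetical expression values; it is an illustration of the metric, not the software or data used in this study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties contribute one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical expression values of one marker gene (placeholders only).
sepsis = [8.1, 7.4, 9.0, 6.9, 8.5]
control = [6.2, 7.0, 5.8, 7.4, 6.5]

auc = roc_auc(sepsis, control)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is why values above 0.8 are read as good diagnostic efficacy.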
5. Functions and signaling pathways of core ARDEGs
Based on the GeneMANIA online database, we investigated the functions and pathways of the 3 core ARDEGs. MERTK was mainly involved in phagocytosis, regulation of cytokine-mediated signaling pathway, leukocyte apoptotic process, regulation of leukocyte apoptotic process, dendritic cell apoptotic process, regulation of dendritic cell apoptotic process, and positive regulation of natural killer cell activation (Figure 6.A). CEACAM8 was mainly involved in cell killing, plasma membrane signaling receptor complex, macrophage activation, negative regulation of cytokine production, protein complex involved in cell adhesion, humoral immune response, and defense response to bacterium (Figure 6.B). SERPINB1 was mainly involved in inflammasome complex and cysteine-type endopeptidase activity involved in apoptotic process (Figure 6.C). Thus, the three core ARDEGs are mainly involved in the apoptotic process of various immune cells. Notably, SERPINB1 is mainly involved in the inflammasome complex. Given the close association between sepsis and inflammation, this study further explored the correlation between SERPINB1 and classical inflammasomes and their related ligands; SERPINB1 showed significant positive correlations with NLRP3, NLRC4, NLRP12, AIM2, CASP1, CASP4 and CASP5 (Figure 6.D, P<0.001). The regulatory mechanism linking SERPINB1 and inflammation was therefore explored further based on KEGG2, with reference to the Necroptosis process[17] and the NOD-like receptor signaling pathway[18], and it was hypothesized that SERPINB1 may be involved in regulating the inflammatory response mainly through the NLRC4/CASP1-inflammatory effects pathway (Figure 7).