Study design
The data utilized in our analysis were obtained from publicly available repositories and were approved by the respective institutional review committees. Therefore, no additional ethical approvals were required. In our study, single nucleotide polymorphisms (SNPs) were designated as instrumental variables (IVs) [14]. Our investigation consisted of three main components, as illustrated (Fig.1): 1) Assessing the causal effects of 731 immunophenotypes on epilepsy; 2) Investigating the causal effects of 1400 metabolomic phenotypes (1,091 plasma metabolites and 309 metabolite ratios) on epilepsy; and 3) Conducting mediation analysis to explore the role of metabolites in the pathway from immune cells to epilepsy. MR was based on three fundamental assumptions: 1) robust prediction of the exposures, 2) association exclusively with the outcome through the exposures, and 3) absence of association with any confounders of the exposure-outcome relationship [15].
Data source
To mitigate sample overlap and ensure the inclusion of exclusively European samples, summary statistics for epilepsy were acquired from the FinnGenR10 database (https://r10.finngen.fi/). This dataset comprised up to 173,450 participants, with 12,891 cases and 312,803 control samples.
Summary statistics for each immune trait were accessed from the GWAS Catalog (https://www.ebi.ac.uk/gwas/), identified with accession numbers ranging from GCST0001391 to GCST0002121 [16]. In total, 731 immunophenotypes were included, covering diverse categories such as absolute cell counts (AC) (n=118), median fluorescence intensities indicating surface antigen levels (n=389), morphological parameters (n=32), and relative cell counts (n=192). These features represented a broad spectrum of immune cells, including B cells, circulating dendritic cells, T cells, monocytes, myeloid cells, and others.
Summary statistics of plasma metabolomics were acquired on the GWAS Catalog (https://www.ebi.ac.uk/gwas/) under the study accession numbers GCST90199621–GCST90201020, which included 1,091 plasma metabolites and 309 metabolite ratios from 8,299 European individuals [17]. In that study, there were 850 known metabolites among 1,091 plasma metabolites, which could be divided into 8 broad metabolic groups: lipid (395), amino acid (n=210), xenobiotics (n=130), nucleotides (n=33), cofactors and vitamins (n=31), carbohydrates (n=22), peptides (n=21), and energy (n=8); the remaining metabolites were partially characterized molecules (n=21) and unknown (n=220).
The data used for bioinformatics analysis were obtained from the GEO database (GSE143272), which included 34 drug-naive patients with epilepsy and 50 healthy subjects as control group[18].
IVs selection and data harmonization
To address the challenge posed by the limited number of SNPs attaining genome-wide significance, we adjusted the threshold to a more relaxed level (p< 1E-05). Additionally, we ensured that the chosen SNPs were sufficiently distant (at least 10,000 kbp apart) and displayed minimal linkage disequilibrium (R2 ≤ 0.001) [19]. In the process of MR analysis, it is imperative to verify that the effects of SNPs on exposure align with those on outcomes. Consequently, we excluded palindromic and ambiguous SNPs from IVs for MR analysis [20].
Essential data, including chromosome, effect allele (EA), other allele (OA), effect allele frequency (EAF), effect sizes (β), standard error (SE), and P-value, were extracted. Subsequently, we computed the explained variance (R2) and F-statistic parameters to ascertain the strength of association between the identified IVs and exposure. In this study, we employed the formulas R2=2×β2×EAF×(1-EAF)/(2×EAF× (1-EAF)×β2+2×EAF×(1-EAF)×N×SE2) and F-statistic=R2×(N-2)/(1-R2) [21]. An F-statistic >10 indicates a robust correlation between the IVs and exposure [22]. SNPs with an F-statistic <10 were deemed insufficiently robust and thus discarded.
Primary analysis
In our study, we aimed to evaluate the causal impact of immune cells and metabolites on epilepsy through a two-sample MR analysis conducted, as illustrated in Fig.1 (Step 1 and Step 2). We employed various MR methodologies to elucidate causal relationships across related to immune cells and metabolites on epilepsy. These methodologies encompassed IVW, weighted median, MR-Egger, Simple mode and weighted mode. IVW was selected as the primary method for estimating causal effects due to its widespread utilization in MR analysis and its robustness in assessing causality [23]. During MR analysis result selection, we adhered to the following criteria: 1). IVW method with a p-value < 0.05; 2). Consistent direction of OR values across the five study methods; 3). Horizontal pleiotropy analysis using MR-Egger method with a p-value > 0.05. During reverse MR analysis result selection, we adhered to the following criteria: 1). IVW method with a p-value < 0.05; 2). Horizontal pleiotropy analysis using MR-Egger method with a p-value > 0.05. The results of analysis were presented as odds ratios (ORs) accompanied by their respective 95% confidence intervals (CIs).
Mediation analysis
we conducted mediation analysis using a two-step MR design to investigate whether metabolites act as mediators in the causal pathway from immune cells to epilepsy outcomes in Fig.1 (Step 3). We employed five MR methods, computed ORs, performed tests for heterogeneity, pleiotropy, and leave-one-out sensitivity analysis. Subsequently, we screened the experimental results based on the following criteria: 1. IVW method yielding a p-value < 0.05; 2. Consistent direction of OR values across the five research methods; 3. Horizontal pleiotropy analysis results indicating a p-value < 0.05. The total effect was decomposed into its indirect effect (mediated by the metabolite) and direct effect (not mediated by the metabolite) [24]. Specifically, the total effect of immune cells on epilepsy was decomposed into: 1) the direct effect of immune cells on epilepsy as shown in Figure 1 (E, Step 3); and 2) the indirect effect of immune cells mediated by the metabolite as illustrated in Figure 1 (C, D, Step 3). The indirect effect is calculated as β1 times β2. The direct effect is calculated as the total effect minus the indirect effect.
Sensitivity Analysis
In our sensitivity analysis, we utilized Q-tests for data heterogeneity assessment employing both the MR-Egger and IVW methods. A smaller p-value indicates increased heterogeneity, suggesting a higher likelihood of directional pleiotropy. Therefore, we chose data with p-values > 0.05. Additionally, scatter plots depicting SNP–exposure associations and SNP–outcome associations were generated to visualize MR results. Leave-one-out analysis was conducted to assess the impact of each SNP on the results, systematically excluding individual SNPs to gauge the potential influence of specific variants on the estimates [25]. Moreover, MR-Egger regression was utilized to investigate potential horizontal pleiotropic effects. Any detected outliers were subsequently removed, and MR causal estimates were re-evaluated. To further scrutinize the assumption of instrument validity, we conducted MR-PRESSO testing with default parameters to identify biased SNPs [26].
All MR analyses were executed using R (version 4.1.2), leveraging the R-based package "TwoSampleMR". The "MR_PRESSO" package was employed for multiplicity tests [27].
Bioinformatics analysis
Differentially expressed genes (DEGs) were identified between healthy individuals and epilepsy patients in the GEO database. Up- and down-regulated DEGs were selected based on adjusted P-values < 0.05 and absolute log2 fold change > 0.585 for further analysis. Statistical analysis was conducted using the "limma" package.
To analyze the biological functions associated with the DEGs, Gene Ontology (GO) enrichment analyses were conducted using Metascape (https://metascape.org), a web-based portal offering comprehensive gene list annotation and analysis resources for experimental biologists. Screening conditions included a minimum overlap of 3 and minimum enrichment of 1.5. A significance threshold of P < 0.01 was applied to identify enriched terms.
We conducted further analysis to identify immune cells closely associated with epilepsy. Utilizing the Cell type Identification by Estimating Relative Subsets of RNA Transcripts (CIBERSORT) algorithm, we identified 22 types of immune cells. Subsequently, only samples with a CIBERSORT P-value < 0.05 were included for analysis. The Wilcoxon rank-sum test was employed to evaluate significant differences in the proportion of immune cell infiltration between healthy individuals and epilepsy patients.
To further explore the relationship between epilepsy and CD8 T cells, we computed the functional enrichment score for epilepsy samples using the "GSVA" package with default settings. Gene sets related to immune functions were sourced from the Gene Set Enrichment Analysis (GSEA) database (https://www.gsea-msigdb.org/gsea/index.jsp, https://download.baderlab.org/EM_Genesets/current_release/Human/symbol/). Specifically, we analyzed all immune-related gene sets from the GSEA database and all gene sets related to CD8 T cells from the baderlab database. The enrichment results were visually depicted using a heatmap generated with the "heatmap" package.