Figure 1 shows the overall design of the study. The current MR analysis was based on publicly available datasets, including the GWAS catalog database [17], the UK Biobank study [19], and the eQTLGen Consortium [20] (Table S1). Instrumental variables for mitochondrial genes were extracted at the methylation, gene expression, and protein abundance levels. MR analyses of sepsis were then conducted separately at each biological level, and colocalization analyses were applied to strengthen the causal inference. The GWAS catalog was the primary discovery dataset, with the UK Biobank and eQTLGen Consortium datasets used for validation.
Specifically, these datasets were used to investigate gene methylation, gene expression, and protein abundance levels, respectively. By integrating the results of the MR analyses at these three levels, we identified candidate causal genes. There was no sample overlap between the exposure and outcome populations.
Figure 1 Study design. SMR, summary-data-based Mendelian randomization; QTL, quantitative trait loci; SNP, single nucleotide polymorphism; PPH4, posterior probability of H4.
Data sources of methylation, expression, and protein quantitative trait loci
Integrating multi-omics data allowed us to examine the molecular networks underlying mitochondrial dysfunction. QTLs capture the associations of single nucleotide polymorphisms (SNPs) with levels of DNA methylation, gene expression, and protein abundance. SNP-CpG associations in blood were obtained from the methylation quantitative trait loci (mQTL) data of McRae et al., which comprised 1,980 individuals of European ancestry [18]. Individual methylation probes were normalized using a generalized linear model with a logistic link function, with corrections for chip, sex, age, age², sex × age, and sex × age² [18]. Blood expression quantitative trait loci (eQTL) data were extracted from the eQTLGen Consortium, which included 31,684 individuals [20]. Summary statistics of genetic associations with circulating protein levels were extracted from a protein quantitative trait loci (pQTL) study by Ferkingstad et al. comprising 35,559 Icelanders [21]. For each protein tested, rank-inverse normal transformed levels were adjusted for age, sex, and sample age [21].
Genetic associations with sepsis were extracted from the GWAS catalog database and comprised 1573 sepsis cases and 454,775 controls. The GWAS catalog compiles genetic variants associated with a wide range of diseases. For sepsis, these associations can shed light on the genetic factors that contribute to susceptibility or resilience to this severe condition, and may point to biomarkers, pathways, or therapeutic targets that could aid the development of more effective diagnostic tools and targeted interventions. The large number of controls supported statistical power and allowed meaningful comparisons between cases and controls.
Mitochondrial-related genes were identified using MitoCarta3.0, an updated inventory of 1136 human mitochondrial genes [22]. MitoCarta identifies protein components resident in the mitochondrion through Bayesian integration of seven experimental and sequence features, and in the updated MitoCarta3.0 each mitochondrial gene product was additionally subjected to an independent literature-guided review [22]. Using this inventory, we identified the mitochondrial genes present in each QTL dataset separately.
Summary-data-based MR analysis
Summary-data-based Mendelian randomization (SMR) was used to estimate the associations of mitochondrial gene methylation, expression, and protein abundance with the risk of sepsis [17]. By using the top associated cis-QTL as the instrument, SMR can achieve much higher statistical power than conventional MR analysis when the exposure and outcome data come from two independent samples with large sample sizes [17]. The top associated cis-QTL was selected within a window of ± 1000 kb centered on the corresponding gene, subject to a P-value threshold of 5.0 × 10⁻⁸. SNPs whose allele frequencies differed by more than a specified threshold (0.2 in the current study) between any pair of datasets, including the LD reference sample, the QTL summary data, and the outcome summary data, were excluded. The heterogeneity in dependent instruments (HEIDI) test was applied to distinguish pleiotropy from linkage; associations with P-HEIDI < 0.01 were considered likely to be due to linkage and were discarded from the analysis. SMR and HEIDI tests were implemented using the SMR software tool (SMR v1.3.1). P-values were adjusted to control the false discovery rate (FDR) at α = 0.05 using the Benjamini-Hochberg method. Associations with an FDR-corrected P-value < 0.05 and P-HEIDI > 0.01 were carried forward to colocalization analysis.
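The selection logic in this section can be illustrated with a minimal Python sketch. It is not the SMR software itself: it assumes the standard SMR point estimate (the ratio of the SNP-outcome effect to the SNP-exposure effect) with its approximate chi-square test on one degree of freedom, and a plain Benjamini-Hochberg adjustment. The function names, input layout, and toy numbers are hypothetical, and HEIDI P-values are taken as given rather than computed.

```python
import math

def smr_test(b_zx, se_zx, b_zy, se_zy):
    """SMR estimate for one top cis-QTL (z = SNP, x = exposure, y = outcome)."""
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    # Approximate chi-square statistic with 1 degree of freedom.
    t_smr = (z_zx ** 2 * z_zy ** 2) / (z_zx ** 2 + z_zy ** 2)
    b_xy = b_zy / b_zx
    se_xy = abs(b_xy) / math.sqrt(t_smr) if t_smr > 0 else math.inf
    # Upper tail of a 1-df chi-square: P(Z^2 > t) = erfc(sqrt(t / 2)).
    p_smr = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, se_xy, p_smr

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_for_coloc(records, alpha=0.05, heidi_cutoff=0.01):
    """Keep probes with FDR-corrected P < alpha and P-HEIDI > heidi_cutoff.

    Each record is a hypothetical dict with keys
    'probe', 'b_zx', 'se_zx', 'b_zy', 'se_zy', 'p_heidi'.
    """
    pvals = [smr_test(r["b_zx"], r["se_zx"], r["b_zy"], r["se_zy"])[2]
             for r in records]
    fdr = bh_adjust(pvals)
    return [r["probe"] for r, q in zip(records, fdr)
            if q < alpha and r["p_heidi"] > heidi_cutoff]

# Toy input: made-up effect sizes, for illustration only.
toy = [
    {"probe": "geneA", "b_zx": 0.50, "se_zx": 0.05, "b_zy": 0.20, "se_zy": 0.04, "p_heidi": 0.30},
    {"probe": "geneB", "b_zx": 0.40, "se_zx": 0.05, "b_zy": 0.18, "se_zy": 0.04, "p_heidi": 0.001},
    {"probe": "geneC", "b_zx": 0.30, "se_zx": 0.05, "b_zy": 0.01, "se_zy": 0.04, "p_heidi": 0.50},
]
selected = select_for_coloc(toy)
```

In this toy input, geneB has a small SMR P-value but is dropped because its P-HEIDI is below 0.01, while geneC passes HEIDI but has no SMR signal.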
Colocalization analysis
We conducted colocalization analyses with the coloc R package to detect shared causal variants between sepsis and the identified mitochondrial-related mQTLs, eQTLs, or pQTLs [23]. Colocalization analysis reports posterior probabilities for five mutually exclusive hypotheses: 1) no causal variant for either trait (H0); 2) a causal variant for the QTL trait (e.g., gene expression) only (H1); 3) a causal variant for disease risk only (H2); 4) distinct causal variants for the two traits (H3); and 5) the same shared causal variant for both traits (H4). Following published articles, the colocalization windows were ± 1000 kb for pQTL-GWAS [24], ± 1000 kb for eQTL-GWAS [25], and ± 500 kb for mQTL-GWAS [26]. The prior probabilities that a variant was associated with trait 1 only (e.g., mQTL), with trait 2 only (i.e., sepsis), or with both traits were set at 10⁻⁴, 10⁻⁴, and 10⁻⁵, respectively. A posterior probability of H4 (PPH4) > 0.70 was considered supporting evidence of colocalization; this cutoff corresponds to a false discovery rate of < 5% and strengthened the evidence for a causal relationship [27].
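To make the five hypotheses and the three priors concrete, the following Python sketch shows how posterior probabilities of this kind can be computed from per-SNP approximate Bayes factors under a single-causal-variant assumption. It is an illustration and not the coloc package: the function names are hypothetical, the prior standard deviations (0.15 for the QTL trait, 0.20 for the case-control trait) are assumed values, and the z-scores and variances are made up.

```python
import math

def log_sum(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_diff(a, b):
    """Stable log(exp(a) - exp(b)); requires a > b."""
    return a + math.log(-math.expm1(b - a))

def wakefield_labf(z, var_beta, prior_sd):
    """Log approximate Bayes factor for one SNP (Wakefield approximation)."""
    r = prior_sd ** 2 / (prior_sd ** 2 + var_beta)
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 for one region.

    labf1 and labf2 are per-SNP log Bayes factors for trait 1 (QTL) and
    trait 2 (sepsis) over the same SNPs; at least two SNPs are required.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(both)
    log_h = {
        "H0": 0.0,
        "H1": math.log(p1) + s1,
        "H2": math.log(p2) + s2,
        # All SNP pairs minus the same-SNP pairs: two distinct causal variants.
        "H3": math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),
        "H4": math.log(p12) + s12,
    }
    total = log_sum(list(log_h.values()))
    return {h: math.exp(v - total) for h, v in log_h.items()}

# Toy region with three SNPs; the second SNP is strong for both traits.
z_qtl, z_gwas = [0.5, 6.0, 0.3], [0.2, 5.5, 0.1]
var_beta = 0.01  # assumed variance of each effect estimate
labf_qtl = [wakefield_labf(z, var_beta, prior_sd=0.15) for z in z_qtl]
labf_gwas = [wakefield_labf(z, var_beta, prior_sd=0.20) for z in z_gwas]
pp = coloc_posteriors(labf_qtl, labf_gwas)
supports_colocalization = pp["H4"] > 0.70
```

When the strongest signals for the two traits sit on different SNPs, the same function shifts the posterior mass from H4 to H3, which is the distinction the PPH4 cutoff relies on.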
Integrating evidence across multi-omics levels
To obtain a full picture of the associations between the regulation of mitochondrial-related genes and sepsis, we integrated results from the three tiers of gene regulation. Because proteins are the final products of gene expression, we treated evidence at the protein level as a fundamental requirement: candidate causal genes in all three tiers had to be associated with sepsis at the protein level. On this basis, we divided the candidate causal genes into three tiers using the following criteria: 1) tier 1 genes had gene-sepsis associations at the protein abundance level (FDR-corrected P-value < 0.05), PPH4 of colocalization > 0.7, and associations with sepsis at both the methylation and expression levels (FDR-corrected P-value < 0.05); 2) tier 2 genes had gene-sepsis associations at the protein abundance level (FDR-corrected P-value < 0.05), PPH4 of colocalization > 0.7, and associations with sepsis at the methylation or expression level (FDR-corrected P-value < 0.05); 3) tier 3 genes had gene-sepsis associations at the protein abundance level (FDR-corrected P-value < 0.05), PPH4 of colocalization ≥ 0.5 and < 0.7, and associations with sepsis at both the methylation and expression levels (original P-value < 0.05). To further explore the potential regulation among gene methylation, expression, and protein abundance, we conducted MR analyses of the causal associations between mitochondrial-related gene methylation and expression, and between gene expression and protein abundance. We then performed colocalization analysis for the identified associations to rule out the possibility that they were caused by linkage disequilibrium.
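The tier criteria above can be written as a small decision function. The sketch below is a direct transcription of the three rules into Python; the function name and argument names are hypothetical, and each argument stands for the per-gene summary value named in the text.

```python
def classify_tier(protein_fdr, pph4, methyl_fdr, expr_fdr,
                  methyl_p, expr_p, alpha=0.05):
    """Return 1, 2, or 3 for the tier a gene falls into, or None.

    protein_fdr, methyl_fdr, expr_fdr: FDR-corrected SMR P-values at the
    protein, methylation, and expression levels.
    methyl_p, expr_p: original (uncorrected) SMR P-values.
    pph4: posterior probability of H4 from colocalization.
    """
    # Every tier requires an FDR-significant association at the protein level.
    if protein_fdr >= alpha:
        return None
    methyl_sig = methyl_fdr < alpha
    expr_sig = expr_fdr < alpha
    if pph4 > 0.7:
        if methyl_sig and expr_sig:
            return 1  # tier 1: both upstream levels FDR-significant
        if methyl_sig or expr_sig:
            return 2  # tier 2: exactly one upstream level FDR-significant
        return None
    # Tier 3: weaker colocalization, nominal significance at both levels.
    if 0.5 <= pph4 < 0.7 and methyl_p < alpha and expr_p < alpha:
        return 3
    return None

# Made-up values for illustration only.
example_tier = classify_tier(protein_fdr=0.01, pph4=0.82,
                             methyl_fdr=0.03, expr_fdr=0.20,
                             methyl_p=0.001, expr_p=0.04)
```

Because the criteria are stated with strict inequalities, a gene with PPH4 exactly equal to 0.7 falls into no tier under this transcription.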
Ethics
The included studies had been approved by the corresponding ethical review committees, and all participants signed consent forms.
Role of funders
The funders had no role in the study design, data collection, analysis, interpretation, manuscript preparation, or the decision to submit the manuscript for publication.