2.1 Study design
The workflow of this MR study is illustrated in Figure 1. First, we obtained publicly available summary-level GWAS data for gut microbiota, circulating inflammatory proteins, and IgAN. Two-sample MR was then used to evaluate the causal relationships between gut microbiota and IgAN and between circulating inflammatory proteins and IgAN. Finally, for the gut microbiota found to have a causal association with IgAN, we identified the circulating inflammatory proteins that mediate this association.
2.2 Data sources
GWAS summary statistics for IgAN were obtained from the FinnGen consortium (R10), comprising 653 IgAN cases and 411,528 controls. The data can be accessed directly at https://storage.googleapis.com/finngen-public-data-r10/summary_stats/finngen_R10_N14_IGA_NEPHROPATHY.gz. GWAS summary data for gut microbiota were derived from a genome-wide association study conducted by the Dutch Microbiome Project, which involved 7,738 participants and assessed 412 features (207 gut microbial taxa and 205 functional pathways)[17]. The GWAS data for circulating inflammatory proteins were obtained from the study by Zhao et al., which included 14,824 participants and measured 91 circulating inflammatory proteins with the Olink Target Inflammation immunoassay panel, combining genome-wide genetic data with plasma proteomic data[18]. These summary statistics are available in the EBI GWAS Catalog (accession numbers GCST90274758-GCST90274848)[18].
2.3 Instrumental variable selection
Valid instrumental variables (IVs) must satisfy three core assumptions: (1) the selected IVs are strongly associated with the exposure; (2) the IVs are unrelated to any confounders of the exposure-outcome relationship; (3) the selected IVs influence the outcome solely through the exposure[19]. Strict screening criteria are necessary to ensure the credibility of MR findings. When screening single nucleotide polymorphisms (SNPs) associated with gut microbiota and circulating inflammatory proteins, we initially applied the genome-wide significance threshold (P < 5×10⁻⁸), which yielded too few SNPs. To increase the number of SNPs, we relaxed the threshold to P < 1×10⁻⁵, in line with most previous studies[20-22], and then clumped the SNPs with a window of 10,000 kb and r² = 0.001 to reduce interference from linkage disequilibrium[23]. For the reverse MR analysis, IgAN-related SNPs were selected at P < 5×10⁻⁶ with the same clumping parameters (10,000 kb, r² = 0.001). Weak instruments are only feebly associated with the exposure and therefore compromise the accuracy of the results. The strength of each SNP was calculated using the F statistic, and IVs with F < 10 were excluded as weak instruments[24]. Palindromic SNPs were removed when harmonizing the exposure and outcome datasets[24]. The Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) method was used to remove outlier SNPs[25]. The SNPs remaining after this screening were used for the subsequent MR analysis.
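The screening steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the `Snp` record, its field names, and the toy values are hypothetical, the F statistic uses the common single-SNP approximation F = (beta/se)², which the text does not specify, and every palindromic SNP is dropped, which is stricter than typical harmonization. Linkage-disequilibrium clumping is omitted because it requires a reference panel.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical record for one SNP-exposure association."""
    rsid: str
    effect_allele: str
    other_allele: str
    beta: float  # SNP-exposure effect estimate
    se: float    # standard error of beta
    pval: float  # SNP-exposure association p-value


def f_statistic(snp):
    # Assumed approximation: F = (beta / se)^2 for a single SNP.
    return (snp.beta / snp.se) ** 2


def is_palindromic(snp):
    # A/T and C/G pairs read the same on both strands.
    pair = {snp.effect_allele.upper(), snp.other_allele.upper()}
    return pair == {"A", "T"} or pair == {"C", "G"}


def select_instruments(snps, p_threshold=1e-5, f_min=10.0):
    """Keep SNPs that pass the p-value, F-statistic, and palindrome filters."""
    kept = []
    for snp in snps:
        if snp.pval >= p_threshold:
            continue  # not associated strongly enough with the exposure
        if f_statistic(snp) < f_min:
            continue  # weak instrument
        if is_palindromic(snp):
            continue  # strand-ambiguous; dropped in this simplified sketch
        kept.append(snp)
    return kept


# Toy values chosen only to exercise each filter; they are not
# internally consistent GWAS estimates.
toy_snps = [
    Snp("rs_a", "A", "G", 0.30, 0.05, 2e-9),  # passes every filter
    Snp("rs_b", "A", "T", 0.30, 0.05, 2e-9),  # palindromic
    Snp("rs_c", "C", "T", 0.10, 0.04, 1e-2),  # fails the p-value threshold
    Snp("rs_d", "G", "T", 0.12, 0.04, 5e-6),  # F is about 9, below 10
]
selected = select_instruments(toy_snps)
print([s.rsid for s in selected])
```

For the reverse direction, the same function would be called with `p_threshold=5e-6`.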
2.4 Statistical analysis
2.4.1 Two-sample Mendelian randomization
Five analytical methods were employed to assess causality: inverse variance weighted (IVW), MR-Egger regression, weighted median (WME), simple mode (SM), and weighted mode (WM)[26]. When all SNPs are valid instruments and no horizontal pleiotropy exists, the IVW method combines the estimates from individual IVs using inverse variance weights to yield a consistent and unbiased estimate of the causal effect[27]. MR-Egger regression and WME are useful when horizontal pleiotropy is present or the IV assumptions are violated: they allow the causal effect to be estimated while accounting for pleiotropic effects and offer additional insight into the relationship between exposure and outcome[25]. Nevertheless, the non-parametric nature of WME may reduce estimation accuracy, while MR-Egger, which relies on regression modeling, may have lower statistical power[28]. SM groups SNPs with similar causal estimates and bases the causal estimate on the group containing the most SNPs[29]. WM applies the same principle but weights each SNP by the precision of its estimate, and it is valid when the largest weighted group of SNPs identifying the same causal effect consists of valid instruments. Because IVW has the highest statistical power, it was used as the primary method for MR analysis[30]. Additionally, Bayesian Weighted Mendelian Randomization (BWMR) was employed to validate positive results: BWMR served as the reference for validation, and associations that were not supported by BWMR were discarded. BWMR accounts for the uncertainty of weak instrument effects arising from polygenicity and addresses violations of the IV core assumptions caused by horizontal pleiotropy through Bayesian weighted outlier detection[31].
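As an illustration of the IVW estimator described above, the following sketch computes per-SNP Wald ratios and combines them with first-order inverse variance weights under a fixed-effects model. It is a simplified stand-in, not the TwoSampleMR implementation; the function names and the toy effect sizes are hypothetical.

```python
import math


def wald_ratios(beta_exp, beta_out):
    """Per-SNP causal estimates: SNP-outcome effect / SNP-exposure effect."""
    return [by / bx for bx, by in zip(beta_exp, beta_out)]


def ivw_fixed(beta_exp, beta_out, se_out):
    """Fixed-effects IVW estimate and its standard error.

    Each Wald ratio is weighted by beta_exp^2 / se_out^2, the inverse of its
    first-order variance (uncertainty in beta_exp is ignored).
    """
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_exp, se_out)]
    ratios = wald_ratios(beta_exp, beta_out)
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    return estimate, se


# Hypothetical SNP effects in which every ratio equals 0.5.
toy_beta_exp = [0.10, 0.20, 0.40]
toy_beta_out = [0.05, 0.10, 0.20]
toy_se_out = [0.01, 0.01, 0.02]

ivw_estimate, ivw_se = ivw_fixed(toy_beta_exp, toy_beta_out, toy_se_out)
print(f"IVW estimate = {ivw_estimate:.3f}, SE = {ivw_se:.4f}")
```

Because all three toy ratios agree, the pooled estimate equals the common ratio; with real data the SNPs with larger exposure effects and smaller outcome standard errors dominate the weighted mean.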
Sensitivity analyses were performed to evaluate the robustness of the results. Cochran's Q test was used to assess heterogeneity among IVs: P < 0.05 indicated substantial heterogeneity among SNPs and warranted a random-effects model; otherwise, a fixed-effects model was used[32]. The MR-Egger intercept was used to evaluate horizontal pleiotropy, with a non-significant intercept (P > 0.05) taken to indicate no evidence of horizontal pleiotropy[33]. Leave-one-out analysis was conducted by excluding each IV in turn to determine whether any single SNP dominated the causal association; a marked change in the MR estimate after removal of a specific SNP suggests that the result is driven by that IV. Results for which MR-PRESSO identified outliers or horizontal pleiotropy were omitted from subsequent analyses.
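The three sensitivity checks can be sketched as follows. This is a minimal illustration with hypothetical function names and toy data, not the TwoSampleMR or MR-PRESSO code. It returns the test statistics only; the P values would come from a chi-square distribution with the reported degrees of freedom (Cochran's Q) and from a t-test on the intercept (MR-Egger), which are left out to keep the sketch dependency-free.

```python
def ivw(beta_exp, beta_out, se_out):
    """Fixed-effects IVW estimate from per-SNP Wald ratios."""
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q statistic around the IVW estimate, with its degrees of freedom."""
    pooled = ivw(beta_exp, beta_out, se_out)
    q = sum(
        (bx ** 2 / sy ** 2) * (by / bx - pooled) ** 2
        for bx, by, sy in zip(beta_exp, beta_out, se_out)
    )
    return q, len(beta_exp) - 1


def egger_intercept(beta_exp, beta_out, se_out):
    """Intercept and slope of a weighted regression of beta_out on beta_exp."""
    xs, ys = [], []
    for bx, by in zip(beta_exp, beta_out):
        sign = -1.0 if bx < 0 else 1.0  # orient SNPs to a positive exposure effect
        xs.append(sign * bx)
        ys.append(sign * by)
    w = [1.0 / sy ** 2 for sy in se_out]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, xs)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, xs))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def leave_one_out(beta_exp, beta_out, se_out):
    """IVW estimate recomputed with each SNP removed in turn."""
    estimates = []
    for i in range(len(beta_exp)):
        keep = [j for j in range(len(beta_exp)) if j != i]
        estimates.append(ivw(
            [beta_exp[j] for j in keep],
            [beta_out[j] for j in keep],
            [se_out[j] for j in keep],
        ))
    return estimates


# Hypothetical data: the fourth SNP has a ratio of 1.0, the others 0.5.
bx = [0.10, 0.20, 0.40, 0.30]
by = [0.05, 0.10, 0.20, 0.30]
sy = [0.01, 0.01, 0.02, 0.03]

q_stat, q_df = cochran_q(bx, by, sy)
loo = leave_one_out(bx, by, sy)
print(f"IVW = {ivw(bx, by, sy):.3f}, Q = {q_stat:.2f} on {q_df} df")
print("leave-one-out:", [round(e, 3) for e in loo])
```

In the toy data the deviating fourth SNP contributes most of Q, and the estimate returns to 0.5 once that SNP is left out, which is the pattern leave-one-out analysis is designed to expose.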
Because this study is exploratory and correction could obscure potential causal and mediating signals, we did not apply multiple testing correction.
All analyses were performed in R (version 4.3.2) using the TwoSampleMR package (version 0.5.10) and the MR-PRESSO package.
2.4.2 Reverse Mendelian randomization analysis
To test for reverse causation, we also performed reverse MR analysis for the gut microbiota and circulating inflammatory proteins that showed a significant association with IgAN in the forward analysis (P IVW < 0.05). In this analysis, IgAN was treated as the exposure, with IgAN-associated SNPs as IVs, while the gut microbiota and circulating inflammatory proteins were treated as outcomes. The steps of the reverse MR analysis mirror those of the forward MR analysis.
2.4.3 Mediation analysis
Mediation analysis evaluates the pathways through which an exposure influences an outcome and thereby helps to explore potential mechanisms. It comprised four steps. In Step 1, the total effect (beta_all) of gut microbiota on IgAN was obtained from the two-sample MR analysis. In Step 2, reverse MR analysis was performed on the positive results of Step 1 to assess the causal effect of IgAN on gut microbiota; mediation analysis proceeded only when there was no evidence of reverse causation[16]. In Step 3, two-sample MR analysis of circulating inflammatory proteins and IgAN was conducted to determine the effect (beta2) of circulating inflammatory proteins on IgAN. In Step 4, two-sample MR analysis was conducted between the gut microbiota identified in Step 1 and the circulating inflammatory proteins identified in Step 3 to obtain the effect (beta1) of gut microbiota on circulating inflammatory proteins. The mediation effect was calculated as beta1 × beta2, the direct effect (beta_direct) as the total effect minus the mediation effect (beta_all - beta1 × beta2), and the mediation ratio as (mediation effect / total effect) × 100%. The delta method was used to estimate the 95% confidence interval (CI) for the mediation effect and the mediation ratio[16].
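The calculation in the final step can be sketched as below. This is an illustrative sketch under stated assumptions: the first-order delta method is applied to the product beta1 × beta2 with the two estimates treated as independent, and the CI of the mediation ratio is obtained by rescaling the CI of the mediation effect with beta_all held fixed, which may differ from the exact procedure in [16]. The function name and the input values are hypothetical.

```python
import math


def mediation_summary(beta_all, beta1, se1, beta2, se2, z=1.96):
    """Two-step MR mediation estimates with delta-method confidence intervals.

    beta_all: total effect of exposure on outcome
    beta1, se1: effect of exposure on mediator and its standard error
    beta2, se2: effect of mediator on outcome and its standard error
    """
    mediated = beta1 * beta2
    # First-order delta method for a product of two independent estimates.
    se_mediated = math.sqrt(beta1 ** 2 * se2 ** 2 + beta2 ** 2 * se1 ** 2)
    ci_low = mediated - z * se_mediated
    ci_high = mediated + z * se_mediated
    direct = beta_all - mediated
    ratio = mediated / beta_all * 100.0
    # Simplifying assumption: beta_all is treated as a fixed constant.
    ratio_ci = tuple(sorted((ci_low / beta_all * 100.0, ci_high / beta_all * 100.0)))
    return {
        "mediated": mediated,
        "se_mediated": se_mediated,
        "mediated_ci": (ci_low, ci_high),
        "direct": direct,
        "ratio_percent": ratio,
        "ratio_ci_percent": ratio_ci,
    }


# Hypothetical effect sizes, for illustration only.
summary = mediation_summary(beta_all=0.5, beta1=0.4, se1=0.1, beta2=0.25, se2=0.05)
for key, value in summary.items():
    print(key, value)
```

With these toy inputs the mediation effect is 0.4 × 0.25 = 0.1, the direct effect is 0.4, and the mediation ratio is 20%.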