Study design
The research methodology is outlined in Figure 1. Initially, the MR approach was employed to explore the causal relationship between druggable genes in the blood and constipation, with the robustness of the MR analysis ensured through comprehensive sensitivity evaluations. For druggable genes with significant MR findings, we performed a Bayesian co-localization analysis to determine whether the cis-eQTL and constipation share the same causal variant [17]. After identifying pertinent genes, we further validated the credibility of the MR results through a summary data-based MR (SMR) analysis and the heterogeneity in dependent instruments (HEIDI) test [14, 18].
Data sources
The term "druggable genes" refers to genes that encode proteins with sequence and structural similarities to established drug targets. In this study, the set of druggable genes was derived from the work of Finan et al., which identified a total of 4,479 genes. Of these, 1,427 genes encode proteins that serve as approved or clinical-stage drug targets; 682 genes encode proteins that either interact with established drug molecules or share similarities with recognized drug targets; and 2,370 genes belong to essential druggable gene families or encode proteins with distant resemblance to known drug targets [19].
We acquired cis-eQTL data for 2,888 druggable genes expressed in blood from the eQTLGen Consortium. This dataset, derived from a cohort of 31,684 participants, predominantly of European descent, consolidates data from multiple studies. The eQTLGen Consortium provides high-quality, large-scale genetic information, enabling the investigation of gene expression regulation in blood [20].
Constipation data were extracted from the IEU GWAS database. We used summary-level statistics from a cohort of 411,623 individuals of European ancestry (15,902 cases and 395,721 controls), encompassing 24,176,599 SNPs [21]. For the replication MR analysis, we accessed the most recent (R11) GWAS summary data on constipation from the FinnGen database, which include 44,590 cases and 409,143 controls (Table 1) [22].
Table 1. Details of the constipation GWAS data.
Disease | Source | Cases | Controls | Population | No. of SNPs
Constipation | IEU | 15,902 | 395,721 | European | 24,176,599
Constipation | FinnGen (R11) | 44,590 | 409,143 | European | 21,306,794
Abbreviations: IEU, Integrative Epidemiology Unit; FinnGen (R11), release 11 of the FinnGen study.
IV selection
We selected IVs for the MR analysis from the cis-eQTL data for the druggable genes [13]. SNPs were required to reach genome-wide significance (P < 5 × 10⁻⁸). To minimize the effects of linkage disequilibrium, we applied a clumping algorithm with an r² threshold of 0.1 and a 10,000 kb window [23, 24]. We then queried the PhenoScanner database (https://www.phenoscanner.medschl.cam.ac.uk) to exclude SNPs associated with confounding factors or directly with constipation, thereby reducing the influence of potential confounders [25]. The PhenoScanner website is no longer accessible at the time of writing.
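The selection steps above can be sketched as a greedy clumping routine. This is only an illustration under stated assumptions, not the authors' pipeline: the `Snp` record, the `select_instruments` function, the toy SNP identifiers, and the pairwise r² lookup are all hypothetical, and real clumping would draw LD estimates from a reference panel.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str       # hypothetical identifier
    pos_kb: float   # position in kb (all SNPs assumed on one chromosome)
    pval: float     # cis-eQTL association P value


def select_instruments(snps, r2, p_threshold=5e-8, r2_threshold=0.1, window_kb=10_000):
    """Keep genome-wide significant SNPs, then clump greedily by P value."""
    candidates = sorted((s for s in snps if s.pval < p_threshold), key=lambda s: s.pval)
    kept = []
    for snp in candidates:
        # Drop the SNP if it is in LD with an already retained, more significant SNP.
        in_ld = any(
            abs(snp.pos_kb - k.pos_kb) <= window_kb
            and r2.get(frozenset((snp.rsid, k.rsid)), 0.0) >= r2_threshold
            for k in kept
        )
        if not in_ld:
            kept.append(snp)
    return kept


# Toy data (invented for illustration only).
snps = [
    Snp("rsA", 100.0, 1e-12),
    Snp("rsB", 150.0, 3e-10),
    Snp("rsC", 900.0, 2e-9),
    Snp("rsD", 950.0, 1e-4),   # fails the significance filter
]
r2 = {
    frozenset(("rsA", "rsB")): 0.6,   # rsB is in LD with rsA
    frozenset(("rsA", "rsC")): 0.02,
    frozenset(("rsB", "rsC")): 0.01,
}
instruments = select_instruments(snps, r2)
print([s.rsid for s in instruments])
```

In this toy example, rsD is removed by the significance filter and rsB is removed because it is correlated with the more significant rsA.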
MR analysis
We conducted the MR analysis using the "TwoSampleMR" package (version 0.5.6) in the R software environment (version 4.3.2) [26]. The primary method for estimating the causal effect of cis-eQTL-predicted gene expression on constipation was the inverse variance weighted (IVW) method, which combines the SNP-specific Wald ratio estimates in a weighted regression, with each estimate weighted by the inverse of its variance [27, 28]. To assess heterogeneity among the SNPs, we used Cochran's Q statistic [29, 30]. Because horizontal pleiotropy can undermine the reliability of MR results, we also applied the MR-PRESSO method to detect outliers and reduce the resulting bias [31].
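The IVW estimate and Cochran's Q can be illustrated numerically. The sketch below is a fixed-effect version with first-order weights, written from the standard formulas; it is not the TwoSampleMR implementation, and the function name `ivw` and the toy summary statistics are assumptions made for illustration.

```python
import math


def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate, its standard error, and Cochran's Q."""
    # Wald ratio for each SNP: effect on outcome divided by effect on exposure.
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    # First-order inverse-variance weights: 1 / var(ratio) = (bx / se_y)^2.
    weights = [(bx / sy) ** 2 for bx, sy in zip(beta_exp, se_out)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations of the ratios from the pooled estimate.
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    return estimate, se, q, df


# Toy summary statistics (invented): every SNP implies the same ratio of 0.5.
beta_exp = [0.2, 0.4, 0.5]      # SNP effects on gene expression
beta_out = [0.1, 0.2, 0.25]     # SNP effects on the outcome
se_out = [0.05, 0.05, 0.05]     # standard errors of the outcome effects

estimate, se, q, df = ivw(beta_exp, beta_out, se_out)
print(estimate, se, q, df)
```

Under the null hypothesis of no heterogeneity, Q follows approximately a chi-square distribution with `df` degrees of freedom; a large Q relative to `df` suggests that the SNPs do not all point to the same causal effect.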
When a gene exhibited significance in both the primary and replication MR analyses (P < 0.05), we proceeded with SMR analysis to further validate the MR findings. SMR integrates GWAS summary statistics with eQTLs to investigate potential causal relationships between genes and specific phenotypes, employing the HEIDI test to assess the reliability of the outcomes [14]. We conducted the SMR analysis and the HEIDI test using the SMR software (available at https://yanglab.westlake.edu.cn/software/smr/). Gene expression data were obtained from the Genotype-Tissue Expression (GTEx) database.
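For intuition, the SMR test statistic for a single top cis-eQTL can be computed directly from summary statistics. The sketch below follows the commonly cited approximation T_SMR = z_zx² z_zy² / (z_zx² + z_zy²), compared against a chi-square distribution with one degree of freedom. It does not implement the HEIDI test, it is not the SMR software, and the function name and input values are hypothetical.

```python
import math


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """Approximate SMR effect estimate, test statistic, and P value.

    b_zx, se_zx: effect of the top cis-eQTL on gene expression (and its SE).
    b_zy, se_zy: effect of the same SNP on the outcome (and its SE).
    """
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    b_xy = b_zy / b_zx  # estimated effect of expression on the outcome
    t_smr = (z_zx ** 2 * z_zy ** 2) / (z_zx ** 2 + z_zy ** 2)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value


# Toy inputs (invented for illustration).
b_xy, t_smr, p_smr = smr_test(b_zx=0.5, se_zx=0.05, b_zy=0.04, se_zy=0.01)
print(b_xy, t_smr, p_smr)
```

Because the statistic is bounded by the smaller of the two squared z-scores, a gene reaches significance only when the SNP is strongly associated with both expression and the outcome.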
Co-localization analysis
For druggable genes that met the significance thresholds in both the MR and SMR analyses, we conducted co-localization analysis using the "coloc" R package (version 5.1.0.1) [17]. This Bayesian method uses the association statistics of all SNPs in a genomic region to assess whether gene expression and the outcome share a causal variant in that region. We applied the default prior probabilities: P1 = 1.0 × 10⁻⁴ for the probability that a SNP is associated with druggable gene expression, P2 = 1.0 × 10⁻⁴ for the probability that it is associated with the outcome, and P12 = 1.0 × 10⁻⁵ for the probability that it is associated with both. The analysis produced posterior probabilities for five hypotheses: PP.H0, no association with either gene expression or the outcome; PP.H1, association with gene expression only; PP.H2, association with the outcome only; PP.H3, association with both gene expression and the outcome, but through different causal variants; and PP.H4, association with both through a shared causal variant. A PP.H4 value greater than 0.85 was considered strong evidence for co-localization, identifying the gene as a potential therapeutic target for constipation. SNPs most relevant to the outcome were included in the co-localization analysis.
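How the five posterior probabilities arise from per-SNP evidence can be sketched as follows. This is a minimal reimplementation of the approximate Bayes factor approach under a single-causal-variant assumption, written from the published formulas rather than taken from the coloc package. The z-scores, effect variances, and prior standard deviations (0.15 for expression, 0.2 for the case-control outcome) are illustrative assumptions, and the region must contain at least two SNPs.

```python
import math


def logsum(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def logdiff(a, b):
    """log(exp(a) - exp(b)), requires a > b."""
    return a + math.log1p(-math.exp(b - a))


def log_abf(z, var_beta, prior_sd):
    """Wakefield log approximate Bayes factor for one SNP."""
    r = prior_sd ** 2 / (prior_sd ** 2 + var_beta)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [PP.H0, PP.H1, PP.H2, PP.H3, PP.H4]."""
    s1, s2 = logsum(labf1), logsum(labf2)
    s12 = logsum([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                   # H0: no association
        math.log(p1) + s1,                                     # H1: trait 1 only
        math.log(p2) + s2,                                     # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiff(s1 + s2, s12),   # H3: distinct variants
        math.log(p12) + s12,                                   # H4: shared variant
    ]
    denom = logsum(log_h)
    return [math.exp(h - denom) for h in log_h]


# Toy region with three SNPs (values invented); the first SNP drives both signals.
z_expr = [8.0, 1.0, 0.5]
z_outcome = [7.0, 0.8, 0.3]
labf_expr = [log_abf(z, var_beta=0.01, prior_sd=0.15) for z in z_expr]
labf_outcome = [log_abf(z, var_beta=0.01, prior_sd=0.2) for z in z_outcome]

pp = coloc_posteriors(labf_expr, labf_outcome)
print([round(p, 4) for p in pp])
```

In this toy region the same SNP carries the strongest evidence for both traits, so almost all posterior mass falls on PP.H4, which would exceed the 0.85 threshold used here.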