Tissue enrichment and pathway analyses
Summary-level data of meta-analysis from UK Biobank (UKB) and CARDIoGRAMplusC4D including 123,733 CAD cases and 424,528 controls28 was leveraged to assess tissue and pathway enrichment. We implemented Data-driven Expression Prioritized Integration for Complex Traits (DEPICT) to document the enrichment of tissue and pathways for genetic association data in CAD. By using GWAS summary statistics, DEPICT provides an analysis in which risk loci mapped genes are integrated to a vast collection of human tissue gene expression to prioritize highly expressed genes and their relevant pathways. According to DEPICT, genetic association for CAD was enriched in arteries (P=7.38 x 10−9) and smooth muscle (P=4.33 x 10−6) (Figure 1A) (Suppl. Table 1). We next evaluated the enrichment of genetic association in pathways. DEPICT showed an enrichment for abnormal cell adhesion (P=5.04 x 10−14), abnormal vitelline vasculature morphology (P=5.64 x 10−13) and abnormal cell migration (P=3.67 x10−11) (Figure 1B) (Suppl. Table 2). These data thus indicated that the vasculature and smooth muscle cells as well as their related functions such as adhesion and migration are key features associated with CAD-gene variants.
Heritability for CAD explained by the vascular enhancer-promoter connectome
Considering the high enrichment of genetic association data for CAD in smooth muscle and arteries, we analyzed publicly available HiChIP for H3K27ac obtained in human coronary artery smooth muscle cells (HCASMC) (GSE101498). H3K27ac-HiChIP provides a high-definition map of chromatin conformation between enhancers and promoters29. After a stringent loop call with FitHiChIP (FDR < 1 x 10−6), we identified 224,209 confident loops in HCASMC. The anchor loops were significantly enriched in open chromatin detected by assay for transposase-accessible chromatin and sequencing (ATAC-seq) in HCASMC (fold enrichment=2.9, P<2.2 x 10−16, binomial test) (Figure 1C). By using HOMER, a pathway enrichment of anchor loops in HCASMC using the Kyoto Encyclopedia of Genes and Genomes (KEGG) showed an enrichment for endocytosis (P=1.36 x 10−10), regulation of actin cytoskeleton (P=1.70 x 10−8) and autophagy (P=5.54 x 10−8) (Suppl. Table 3). We next wondered whether enhancer-promoter contacts in HCASMC explained a significant part of the heritability in CAD risk. For that purpose, we partitioned the heritability of cis-anchor loops in HCASMC by using the full baseline model of 53 annotations in the stratified LD score regression framework (Methods)30. This analysis revealed that chromosomal interactions in HCASMC were enriched for regions that explained the heritability of CAD (enrichment=2.38, P=5.90 x 10−5) (Suppl. Table 4). Despite representing only 9% of CAD-single nucleotide polymorphisms (SNPs), variants within anchor loops in HCASMC explained 22% of the heritability for CAD (Suppl. Table 4).
Identification of gene promoters connected to CAD associated variants within distant enhancers
To identify gene promoters mapped by chromatin looping in H3K27ac-HiChIP in HCASMC, we identified lead and individual significant SNPs (PGWAS<5 x 10−8, r2<0.6) that overlapped with enhancer loops (Methods). In total, 353 individual significant SNP-enhancer-promoter loop pairs tagging 228 gene promoters were identified (Suppl. Table 5). The mean number of loops per gene promoter was 1.5 and the average distance between enhancers and gene promoters was ~ 168kb. For instance, at 11q13.1, rs3741380, a CAD individual significant SNP intronic to EHBP1L1, is localized within an interacting loop with the promoter of MAP3K11 located ~32kb downstream (Figure 1D). We next hypothesized that CAD gene variants may be enriched in hub enhancers having a high level of chromatin contacts (in network with a degree ≥ 90th percentile) (Methods). In HCASMC, we identified 5,455 highly connected hub enhancers involved in chromatin looping. This analysis showed a significant enrichment of CAD individual significant SNPs associated with hub enhancers (observed/expected=3.4, P=1.9 x10−6, binomial test). Among the different genes mapped by CAD variants and enhancer-promoter looping, SH2B3 is a gene connected to a hub enhancer. SH2B3 has been previously identified as a causal candidate gene for CAD31. On the other hand, some genes identified by chromatin looping were not previously mapped in CAD. For instance, at 11q13.4, rs590121 is located in a distant hub enhancer intronic to SERPINH1 and having contacts with multiple enhancers-promoters including with the promoter of ARRB1 located ~298kb upstream (Figure 1E).
Identification of vascular eQTLs
We next examined if CAD-associated genes mapped by enhancer-promoter chromatin looping were also significant vascular expression quantitative trait loci (eQTL). CAD-individual significant SNPs were mapped to eQTLs of the aorta in GTEx v8. In total, 15,516 CAD SNP-eQTL gene pairs were significant at FDR 5% in the aorta and tagged 202 genes (Suppl. Table 6). Among the 228 genes mapped by enhancer-promoter chromatin looping there were 41 genes (18%), which were also CAD-associated eQTL genes (fold enrichment observed/expected=21.4, P < 2.2x10−16, binomial test). Hence, these findings highlight that gene mapping of CAD genetic association data with enhancer-promoter chromatin conformation in HCASMC is enriched in vascular eQTL genes.
Genetic colocalization analyses
By combining enhancer-promoter chromatin looping and eQTL data, there were 383 genes mapped by CAD-associated SNPs. We performed Bayesian colocalization analyses between the eQTL signal (GTEx v8) of the 383 mapped genes in the aorta with the GWAS signal for CAD. We identified 35 genes with a high posterior probability (PP) (PP > 0.8) of shared signal between eQTLs in the aorta and CAD-GWAS (CDH13, PHACTR1, TCF21, N4BP2L2, SYPL2, TWIST1, PDGFD, IPO9, FHL3, UTP11, GGCX, SEMA5A, DMPK, MIA3, TMEM133, CAMK1D, ARHGAP42, DAGLB, DMWD, CETP, MORF4L1, JCAD, MFGE8, HAPLN3, HHIPL1, DHX36, B3GNT8, BMP1, LMOD1, FAM117B, MAT2A, ATP2B1, EXOSC5, EIF2B2, ZEB2) (Suppl. Table 7). Genes with a shared signal were enriched in gene ontology (GO) for negative regulation of cell adhesion (P=1.99 x 10−4) and positive regulation of endothelial cell proliferation (P=2.18 x 10−4) (Suppl. Table 8). Figure 2A shows a LocusCompare plot for the GWAS associations at 6p24.1 where there is a shared signal (PP=1) with the expression of PHACTR1, a gene previously identified as a causal candidate at this locus32. Rs9349379 is the strongest SNP for both GWAS and eQTL in the aorta for the expression of PHACTR1, a gene encoding for a regulator of actin polymerization33. Allele G-rs9349379 increases the risk of CAD (OR: 1.11, 95%CI: 1.10-1.12, PGWAS=2.71 X 10−76) and is associated with a decreased expression of PHACTR1 in the aorta (Figure 2B). This analysis underlined some targets identified in previous screen and functional assays such as CDH13, TCF21, LMOD1 and JCAD34–37, but also identified novel potential causal candidates such as EXOSC5 and B3GNT8, which encode for a RNA exosome component and a galactosyltransferase respectively.
Mendelian Randomization
In order to further evaluate causal associations, we implemented a 2-sample Mendelian randomization (MR) between gene expression in the aorta (GTExV8) by using a minimum of three independent (r2<0.1) cis-variants as IVs and genetic associations for CAD as the outcome. For that purpose 3D and eQTL mapped genes were evaluated in MR. In inverse variance weighted (IVW) MR, we identified 74 vascular genes expressed in the aorta that were significantly associated with CAD (FDR < 0.05). Among these genes, 33 did not show heterogeneity (Cochran’s Q test > 0.05) and were considered as causally associated with CAD (MRAS, HHIPL1, CDH13, JCAD, MFGE8, BMP1, FGD6, CTSK, MAP3K11, TMEM133, CAMK1D, DMPK, ZEB2, EIF2B2, HSD17B12, CDC25A, ARRB1, SFMBT1, TRIP4, KCNH2, NME7, ATP1B1, MRPL35, CCDC181, AGPAT4, RNF123, ANKDD1A, BEND6, CTSH, NPHP3, PIF1, ALKBH5, MEAF6) (Figure 3) (Suppl. Table 9). These 33 vascular genes were enriched for GO in cardiac muscle cell repolarization (P=1.44 x10−4) and proteolysis involved in cellular catabolic process (P=4.80 x 10−4) (Suppl. Table 10). Among the causally associated vascular genes, there were 10 genes also identified in colocalization analyses (PP > 0.8) (HHIPL1, CDH13, JCAD, MFGE8, BMP1, TMEM133, CAMK1D, DMPK, ZEB2, EIF2B2). In total, by using colocalization and MR approaches, we identified 58 potential causal vascular genes for CAD (Suppl. Table 11). Among the causal candidate genes, 9 were identified by both eQTL and enhancer-promoter conformation mapping (FHL3, MIA3, MAT2A, GGCX, DAGLB, MAP3K11, ATP2B1, EXOSC5, B3GNT8), whereas 6 genes (MEAF6, AGPAT4, KCNH2, ARRB1, PIF1, UTP11) were mapped only by chromatin conformation. For instance, AGPAT4, a gene encoding for a lysophosphatidic acid acyltransferase, is negatively associated with the CAD risk in MR (OR: 0.94, 95%CI: 0.91-0.97, Pcausal=1.5 x 10−3). At this locus, two CAD independent significant SNPs (rs9295142, rs142697177) located in an intronic enhancer of AGPAT4 are within a chromatin loop with the promoter of the same gene (Suppl. Figure 1).
In sensitivity analysis, we assessed causal associations by using weighted median MR, which provides a robust assessment with up to 50% of invalid instrumental variables (variants with horizontal pleiotropy)38. Among the 33 vascular genes associated with CAD in IVW MR, we found that 29 genes (MFGE8, CAMK1D, CTSK, JCAD, CDH13, EIF2B2, MAP3K11, FGD6, RNF123, ATP1B1, SFMBT1, MRPL35, MRAS, KCNH2, NPHP3, ANKDD1A, BEND6, CTSH, NME7, CCDC181, HSD17B12, TRIP4, CDC25A, TMEM133, DMPK, ZEB2, ARRB1, BMP1, HHIPL1) were also significant in weighted median MR analysis (Suppl. Table 12). The directional effects were concordant between IVW and weighted median MR analyses.
Network and prioritization
We aimed to characterize putative causal vascular genes singled out by the colocalization and MR screens by using a network approach with the objectives to identify: 1) key driver genes and 2) pathways of connected genes with functional relevance. A network was constructed from the DifferentialNet dataset, which includes 134,223 protein interactions, and inferred to the aorta from the gene expression profile (Methods)39. Candidate causally associated vascular genes identified by MR and colocalization analyses were used as seeds to generate the network, which encompassed 681 nodes (genes) and 763 edges (connections) (Figure 4A). A pathway enrichment analysis showed that genes within the network were enriched in TGF-beta signaling pathway (P=1.87 x 10−20), signaling events mediated by HDAC class I (P=3.46 x 10−17), and MAPK signaling pathway (P=1.62 x 10−16) (Suppl. Table 13). Causally associated genes were enriched in nodes with a high degree (degree >90th percentile) (fold enrichment 3.83, P=2.70 x 10−25, hypergeometric test). Genes with a high degree and acting as hub in networks are enriched in drug targets and are involved in key biological functions. Predicted CAD causally associated vascular genes in MR and acting as hub gene (high degree in PPI network) include among others ARRB1 (OR: 0.84, 95%CI: 0.78-0.91, Pcausal =9.46 x 10−6), CDC25A (OR: 0.96, 95%CI: 0.95-0.98, Pcausal =5.33 x 10−6), KCNH2 (OR: 1.03, 95%CI: 1.01-1.04, Pcausal =7.14 x 10−5) and MAP3K11 (OR: 1.07, 95%CI: 1.05-1.10, Pcausal =8.46 x 10−9).
Druggability of candidate causal genes
We next assessed whether causally associated vascular genes in CAD were potentially druggable by using The Drug Gene Interaction Database (DGIdb)40. DGIdb includes an exhaustive list of drug-gene pairs, which are collated from different resources. In total 383 compounds targeting 13 predicted causally associated vascular genes were identified by DGIdb (CTSK, MAP3K11, CDC25A, ARRB1, KCNH2, ATP1B1, PDGFD, IPO9, GGCX, DAGLB, CETP, DHX36, CAMK1D) (Suppl. Table 14). Considering the directional effects in MR, 5 vascular genes are potential targets for drug inhibition (CTSK, MAP3K11, KCNH2, ATP1B1, CAMK1D). For instance, the vascular expression of CAMK1D which encodes for a calcium calmodulin dependent protein kinase, is positively associated with the risk of CAD in MR (OR: 1.11, 95%CI: 1.07-1.16, Pcausal=5.12 x 10−8) (Figure 4B). Several drugs under investigation are kinase inhibitors, which are reported to target CAMK1D. By using colocalization analyses, the directional effects of the candidate causal SNP between vascular gene expression and CAD-risk suggest that the inhibition of 4 genes may lower disease-associated risk (PDGFD, IPO9, GGCX, CETP). As an example, among the potential drug targets, PDGFD encodes for platelet derived growth factor D. In the aorta, colocalization between eQTL for PDGFD and CAD-GWAS (PP=0.99) suggests a causal association and prioritizes gene variant rs974819 (Figure 4C). In the aorta, T-rs974819 is associated with a higher expression of PDGFD (Figure 4D) and an elevated CAD-risk (OR: 1.06, 95%CI: 1.05-1.07, PGWAS=1.11 x 10−28). According to the directional effects in MR and colocalization data, 4 genes (CDC25A, ARRB1, DAGLB, DHX36) could be targeted by agonist-based therapy. However, for these targets there is no agonist compound reported in DGIdb.
As some potential disease-associated targets may be linked with adverse side-effects, we undertook a phenome-wide analysis (PheWAS) by using Gene ATLAS, which includes 778 diseases-traits from the UK Biobank41. Genes identified as candidates for drug inhibitors were included in this analysis (CTSK, MAP3K11, KCNH2, ATP1B1, CAMK1D, PDGFD, IPO9, GGCX, CETP). The strongest instrumental variable (lowest P-value) in MR or the variant prioritized in colocalization analyses for each of the potential drug target gene was assessed in the cross-phenotype association analysis. Diseases-traits were deemed significantly associated with the variant at FDR 5% and data are presented in Suppl. Table 15. In GTEx, gene variant A-rs12042263 (strongest IV in MR for the expression) is associated with a lower vascular expression of CTSK (PeQTL=1.69 x 10−10) and a decreased risk for CAD (OR: 0.96, 95%CI: 0.95-0.98, PGWAS=1.18 x 10−3). In PheWAS, A-rs12042263 is positively associated with the risk of cerebrovascular disease (PPheWAS=3.15 x 10−4) (I67 other cerebrovascular disease) (Suppl. Table 15). These data suggest that the inhibition of CTSK may lower the risk of CAD (data in MR are concordant as the vascular expression of CTSK is positively associated with CAD-risk), but at the expense of increasing cerebrovascular events. Several drugs under investigation are reported in DGIdb to inhibit CTSK (Suppl. Table 14). A recent randomized clinical trial evaluating odanacatib, a CTSK inhibitor developed for postmenopausal osteoporosis, has found that inhibition of the target lowered primary endpoints, but increased the risk of stroke (HR 1.32, p=0.03)42. Also, inhibition of ATP1B1, a gene positively associated with the CAD-risk in MR, is predicted to increase the risk of venous thromboembolic disease (Suppl. Table 15). Prediction based on colocalization analysis suggests that inhibition of GGCX may lower CAD-risk, but according to the PheWAS it is associated with an increase risk of malignant neoplasm of prostate (PPheWAS=3.06 x 10−9) (Suppl. Table 15). We next interrogated the Human Phenotype Ontology database43 to identify disease-trait associations for the genes deemed druggable (Suppl. Table 16). In Human Phenotype Ontology, CTSK, GGCX and KCNH2, have been linked to bone-related conditions, coagulation defects and ventricular arrhythmia respectively. KCNH2 encodes for the Ether-A-Go-Go-Related Protein 1, a potassium voltage-gated channel, involved in arrhythmia. Consistently, in the Side Effect Resource (SIDER)44, several drugs targeting KCNH2 in DGIdb are reported to induce ventricular arrhythmia. After a comprehensive filtering based on multiple resources including a PheWAS analysis and interrogation of Human Phenotype Ontology as well as SIDER databases, potential vascular drug targets such as MAP3K11, CAMK1D, PDGFD, IPO9 and CETP were not predicted to be associated with major adverse side effects (cardiovascular, neurologic, metabolic, cancer) that would result from drug inhibition. Thus, some of these genes may represent suitable potential drug targets for follow-up studies.
Impact of targeting MAP3K11 in vascular smooth muscle cells
By using the present framework, we showed that some causally associated vascular genes were central in a network and were potentially druggable. Among those genes, MAP3K11 (also known as MLK3) is positively associated with the CAD risk (OR: 1.07, 95%CI: 1.05-1.10, Pcausal =8.46 x 10−9) and is a target of the experimental compound URMC-09945. Community network analysis using random walks showed that a module including MAP3K11 (P=1.94 x 10−6) (Suppl. Figure 2) was enriched in GO for the regulation of JNK cascade (P=1.81 x 10−10) (Suppl. Table 17), which is involved in inflammation and cell migration46. URMC-099 is an experimental compound that inhibits MAP3K11. We hypothesized that URMC-099 may modulate the expression of key cytokines and transcription factors involved in the development of atherosclerosis. We observed that vascular smooth muscle cells (VSMCs) treated with URMC-099 (100nM) for 6 hours had lower expression of transcripts encoding for interleukin1 beta (IL1B), C-C motif chemokine ligand 2 (CCL2 and also known as MCP1) and plasminogen activator urokinase (PLAU) (Figure 5A-C). IL1B, CCL2 and PLAU are key mediators involved in plaque activity47–49. Early growth response 1 (EGR1) is a transcription factor known to enhance the expression of IL1B and chemokines as well as to promote the development of atherosclerosis in mice50. In isolated VSMCs, URMC-099 reduced the transcript level encoding for EGR1 by 30% (Figure 5D). Considering the pathway enrichment of CAD gene variants in cell migration (Suppl. Table 2), we assessed the impact of URMC-099 on VSMC transmigration in a Boyden chamber. Consistent with the impact of URMC-099 on genes having a role on cell migration such as CCL2 and EGR151, we observed a reduction of cell migration by 40% in VSMC treated with the inhibitor for 6 hours (Figure 5E). As a proof of concept, these data provide further evidence that the pharmacological inhibition of predicted causally associated genes such as MAP3K11 in VSMC impacts athero-relevant gene expression and cell phenotype.