Genetic correlation between ADs and CVDs
We used cross-trait linkage disequilibrium (LD) score regression (LDSC) to calculate SNP-based heritability (h2SNP) and to estimate the genome-wide genetic correlation (rg) between six major ADs and six major CVDs. Univariate LDSC analysis revealed that the estimated h2SNP values for ADs were substantially higher than those for CVDs, approximately 16 times greater on average. Among the ADs, SLE exhibited the highest heritability (h2SNP = 0.576, SE = 0.081), while T1D displayed the lowest (h2SNP = 0.033, SE = 0.004). Among the six major CVDs analyzed, CAD exhibited relatively high heritability (h2SNP = 0.034, SE = 0.002), approximately six times greater than that of PAD, which had the lowest heritability (h2SNP = 0.006, SE = 0.001) (Fig. 1a and Supplementary Table 2a). Bivariate LDSC analysis revealed nominally significant positive genetic correlations in 9 of the 36 AD-CVD trait pairs (P < 0.05), with coefficients ranging from 0.087 to 0.207. Of these, only the UC-VTE pair surpassed the stringent Bonferroni correction threshold (rg = 0.117, SE = 0.036, P = 1.10×10−3) (Fig. 1b and Supplementary Table 2b).
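As a rough illustration of the multiple-testing step, the sketch below recomputes a Bonferroni threshold for 36 trait pairs and a two-sided Wald P value for the UC-VTE estimate. It assumes a family-wise alpha of 0.05 and a normal approximation for rg/SE; neither is stated explicitly in the text, and because the inputs are rounded the recomputed P value is only expected to be close to the reported one.

```python
import math

# Values quoted in the text for the UC-VTE pair.
rg, se = 0.117, 0.036
n_pairs = 36  # six ADs x six CVDs

# Bonferroni-corrected threshold (assumption: family-wise alpha = 0.05).
bonferroni_threshold = 0.05 / n_pairs

# Two-sided P value from a Wald z-test on rg (assumption: normal approximation).
z = rg / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"Bonferroni threshold = {bonferroni_threshold:.2e}")
print(f"z = {z:.2f}, two-sided P = {p_two_sided:.2e}")
print("passes Bonferroni:", p_two_sided < bonferroni_threshold)
```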
Although rg quantifies the genome-wide genetic correlation between two traits, it cannot distinguish genetic overlap involving a mixture of concordant and discordant effects from a complete absence of genetic overlap, because both situations can yield an rg close to zero. Consequently, LDSC alone is insufficient to capture the full complexity of genetic overlap. To address this, we employed recently established statistical methods, including the causal mixture modeling approach (MiXeR) and Local Analysis of [co]Variant Association (LAVA), to comprehensively characterize the genetic overlap between ADs and CVDs beyond mere genetic correlation.
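The point that mixed effect directions and no overlap can both give a correlation near zero can be seen in a toy example. The effect sizes below are hypothetical integers in arbitrary units, and the uncentered correlation of per-variant effects is used only as a simplified stand-in for rg; it is not the LDSC estimator.

```python
import math

def effect_correlation(b1, b2):
    """Uncentered correlation of per-variant effects (toy stand-in for rg)."""
    num = sum(x * y for x, y in zip(b1, b2))
    den = math.sqrt(sum(x * x for x in b1) * sum(y * y for y in b2))
    return num / den

def shared_variants(b1, b2):
    """Number of variants with a non-zero effect on both traits."""
    return sum(1 for x, y in zip(b1, b2) if x != 0 and y != 0)

# Hypothetical effects on eight variants (arbitrary units).
trait_a    = [1, 2, 3, 4, 0, 0, 0, 0]
mixed      = [4, 3, -2, -1, 0, 0, 0, 0]  # same variants, mixed directions
no_overlap = [0, 0, 0, 0, 1, 2, 3, 4]    # entirely different variants

for name, other in [("mixed directions", mixed), ("no overlap", no_overlap)]:
    r = effect_correlation(trait_a, other)
    n = shared_variants(trait_a, other)
    print(f"{name}: correlation = {r:.2f}, shared variants = {n}")
```

Both comparisons give a correlation of zero, yet only the first involves shared variants, which is the distinction that overlap-based methods are designed to recover.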
Genetic overlap between ADs and CVDs
MiXeR identified genetic overlap regardless of the direction of effect, which complements genetic correlation to provide a more thorough understanding of the genetic relationships among phenotypes. MiXeR considers differences in polygenicity to determine which phenotypes may have shared genetic variants. Univariate MiXeR analyses revealed that CAD (N = 1.795K, SD = 0.101K) and HF (N = 2.231K, SD = 0.317K) exhibited higher polygenicity, while SLE and PSC displayed lower polygenicity, suggesting a polygenicity pattern distinct from h2SNP estimates in ADs and CVDs. Bivariate MiXeR analyses showed weak to moderate but distinct patterns of genetic overlap between ADs and CVDs, with the Dice coefficients ranging from 0.021 to 0.307 (Supplementary Table 3a). For example, consistent with the strongest positive genome-wide genetic correlation (rg = 0.211, SE = 0.027) and a positive genetic correlation of shared variants (rgs = 0.890, SE = 0.087), a pronounced genetic overlap was observed between RA and PAD. This was reflected by a Dice coefficient of 0.238 (SD = 0.038), with 0.117K shared variants (SD = 0.019K) accounting for 22.1% of the variants affecting RA and 25.7% of the variants affecting PAD, respectively. Despite the lack of significant rg in the LDSC analyses, RA and CAD demonstrated extensive genetic overlap, evidenced by a Dice coefficient of 0.208 (SD = 0.033). This suggests a mixed direction of effect between RA and CAD, further validated by a significant proportion (59.5%) of shared variants exhibiting consistent effects. When shared variants have both concordant and discordant effect directions, they nullify each other, masking genetic correlation at the genome-wide level (Fig. 2c). Given the low polygenicity observed in ADs such as SLE and PSC and the high polygenicity in CVDs like CAD and HF, substantial disparities were noted in the number of shared and unique "causal" variants. 
For example, SLE and CAD shared a relatively low number of variants (N = 0.019K, SD = 0.009K), while there were substantially more unique variants for CAD (N = 1.775K, SD = 0.104K) than for SLE (N = 0.023K, SD = 0.010K). The shared variants accounted for 45.7% of the variants influencing SLE but only 1.07% of those influencing CAD. Consequently, SLE and CAD exhibited minimal genetic overlap (Dice coefficient = 0.021, SD = 0.011) and an rg approaching zero (rg = 0.036, SE = 0.031) (Fig. 2b, Supplementary Fig. 1, Supplementary Table 3b). Finally, the MiXeR model fits for SLE-AF, UC-PAD, and PSC-Stroke were suboptimal, as evidenced by negative Akaike Information Criterion (AIC) scores when comparing the best-fit model with the minimum possible overlap (minima). These scores suggest that the shared genetic component between these trait pairs may be smaller than estimated.
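The SLE-CAD figures can be checked with a short calculation. The sketch assumes the usual definition of the Dice coefficient, twice the number of shared variants divided by the sum of the two traits' total variant counts; the function and variable names are illustrative. Because the inputs are rounded point estimates (in thousands of variants), the recomputed fractions only approximate the reported 45.7% and 1.07%.

```python
def dice_coefficient(shared, unique_1, unique_2):
    """Dice coefficient: 2 * shared / (total trait 1 + total trait 2)."""
    total_1 = shared + unique_1
    total_2 = shared + unique_2
    return 2 * shared / (total_1 + total_2)

# Point estimates quoted in the text, in thousands of variants.
shared, unique_sle, unique_cad = 0.019, 0.023, 1.775

dice = dice_coefficient(shared, unique_sle, unique_cad)
frac_sle = shared / (shared + unique_sle)  # share of SLE variants that are shared
frac_cad = shared / (shared + unique_cad)  # share of CAD variants that are shared

print(f"Dice = {dice:.3f}")
print(f"shared fraction of SLE variants = {frac_sle:.1%}")
print(f"shared fraction of CAD variants = {frac_cad:.2%}")
```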
Local genetic correlation between ADs and CVDs
Genetic variance in small genomic regions may be shared by pairs of ADs and CVDs even without a significant genome-wide genetic correlation. We conducted LAVA analyses to estimate local genetic correlations between ADs and CVDs in 2,495 unique genomic regions, further elucidating the direction of the mixed effects observed. Only 58.7% of the nominally significant local rgs across the AD-CVD trait pairs were in the positive direction (Supplementary Tables 4–5). A mixture of negative and positive local rgs was observed for each pair, potentially leading to minimal genetic correlations at the genome-wide level. Supporting the MiXeR findings, further evidence of mixed effect directions was found for RA-CAD (18 positively and 17 negatively correlated loci) and SLE-CAD (19 positively and 29 negatively correlated loci). Interestingly, many loci showed negative local genetic correlations between RA and PAD, somewhat at odds with the positive genome-wide genetic correlation for this pair (8/10, 80%). After Bonferroni correction for multiple testing, we also identified 23 loci that exhibited a significant local genetic correlation without a significant global correlation (Fig. 2c, Fig. 5, Supplementary Table 5). Our investigation also identified three regions (LD block 1,841, chr12: 111,592,382–113,947,983; LD block 100, chr1: 113,418,038–114,664,387; LD block 2,048, chr15: 37,962,916–39,238,840) displaying significant correlations for more than one trait pair. Overall, these findings indicate that global genetic correlations cannot fully represent the heterogeneity in genetic associations between phenotypes.
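To make the notion of mixed effect directions concrete, the sketch below tallies the positive and negative local correlations quoted for RA-CAD and SLE-CAD and applies an exact two-sided sign test against an even 50:50 split. The sign test is an illustrative addition, not an analysis described in the text, and it treats loci as independent, which is a simplifying assumption.

```python
from math import comb

def sign_test_p(n_pos, n_neg):
    """Exact two-sided binomial sign test against a 50:50 split."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Counts of nominally significant loci quoted in the text: (positive, negative).
pairs = {"RA-CAD": (18, 17), "SLE-CAD": (19, 29)}

for name, (n_pos, n_neg) in pairs.items():
    frac_pos = n_pos / (n_pos + n_neg)
    p = sign_test_p(n_pos, n_neg)
    print(f"{name}: {frac_pos:.1%} positive, sign-test P = {p:.2f}")
```

Under these assumptions neither pair departs clearly from an even split, which is in line with a near-zero genome-wide correlation arising from opposing local signals.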
Shared genetic loci and functional annotation for ADs and CVDs
Despite these advances, the shared genetic mechanisms between ADs and CVDs remain unclear. In particular, it is uncertain whether the observed genetic overlap predominantly reflects horizontal pleiotropy, whereby the same genetic variant affects both traits. At the most fine-grained level of analysis, PLACO estimates SNP-level effects on phenotypes; it identified 233,00 SNPs with potentially pleiotropic effects across the 36 AD-CVD trait pairs. FUMA annotation subsequently clustered these pleiotropic SNPs into 815 lead SNPs and 679 independent genomic risk loci across 208 unique chromosomal regions. Notably, 131 pleiotropic loci were identified across multiple trait pairs, with six of these loci exhibiting genetic signals in more than one-third of the trait pairs, suggesting a potentially broad functional impact of specific genomic regions (Supplementary Fig. 2, Supplementary Fig. 3, Supplementary Tables 6–9). For example, the locus spanning 12q24.1-q24.12 on chromosome 12 overlapped with 30 trait pairs, that is, all pairs except SLE-CAD and the AF-related pairs other than T1D-AF. Notably, rs4766578 at 12q24.12 showed a remarkably consistent degree of pleiotropy across most trait pairs and was located in the binding sequence of the transcription factor HNF4A, a crucial regulatory element of ALDH2. This transcription factor has previously been linked to significant health outcomes, including blood pressure, cardiovascular disease, and autoimmune disease. Interestingly, the 17q12 locus on chromosome 17 was jointly associated with AF and every AD except PSC. This locus surrounds SNP rs1008723, located in an intronic region of the gasdermin B gene [GSDMB]. GSDMB encodes a member of the gasdermin family, a group of structurally related proteins that play crucial roles, particularly in pyroptosis, a process implicated in the pathogenesis of ADs such as IBD and of CVDs owing to its involvement in severe cytokine release and inflammation.
Overall, 345 SNPs (50.8%) exhibited novel associations with ADs, while 354 SNPs (52.1%) displayed novel associations with CVDs. Notably, 79 of these SNPs are reported here for the first time in relation to ADs and CVDs, suggesting potential implications for the immune and cardiovascular systems that warrant further investigation. More than half (51.3%) of the significant SNPs identified by PLACO had opposing genetic effects on the two diseases, suggesting different underlying causes for ADs and CVDs and potentially explaining the weak genetic correlations in the above analyses.
ANNOVAR annotation revealed that, of the 679 top lead SNPs, 176 (25.9%) were intergenic variants, 352 (51.8%) were intronic variants, and 39 (5.7%) were exonic variants. Among the exonic variants, the SNP rs10781542, located at the 9q34.3 locus on chromosome 9 (PPLACO = 1.04×10−9 for CD-Stroke), had the highest RegulomeDB (RDB) score of 1a, indicating strong evidence of functionality. Additionally, 49 SNPs (7.2%) had CADD scores above 12.37, with rs601338 at the 19q13.33 locus on chromosome 19 (PPLACO = 3.77×10−10 for T1D-CAD) presenting the highest CADD score of 52, suggesting potential deleterious effects. Further colocalization analysis revealed that 112 (16.5%) of the 679 pleiotropic loci had a PP.H4 greater than 0.7, identifying 11 unique SNPs as candidate shared causal variants. Additionally, 93 (13.7%) pleiotropic loci had a PP.H3 greater than 0.7, suggesting the presence of different causal variants within these loci (Supplementary Fig. 4, Supplementary Table 6).
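The colocalization rule described above (PP.H4 > 0.7 for a shared causal variant, PP.H3 > 0.7 for distinct causal variants) amounts to a simple classification of each locus. The sketch below applies that rule to made-up posterior probabilities; the locus names and values are hypothetical and are not taken from the supplementary tables.

```python
def classify_locus(pp_h3, pp_h4, threshold=0.7):
    """Label a locus from its colocalization posterior probabilities."""
    if pp_h4 > threshold:
        return "shared causal variant"
    if pp_h3 > threshold:
        return "distinct causal variants"
    return "inconclusive"

# Hypothetical loci with (PP.H3, PP.H4) values, for illustration only.
example_loci = {
    "locus_A": (0.05, 0.92),
    "locus_B": (0.81, 0.10),
    "locus_C": (0.40, 0.35),
}

labels = {name: classify_locus(h3, h4) for name, (h3, h4) in example_loci.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```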
Candidate pleiotropic genes between ADs and CVDs
Instead of focusing on single SNPs, we conducted a gene-centered pleiotropy analysis by jointly analyzing sets of SNPs located within genes. MAGMA analysis identified 662 pleiotropic genes (191 unique) located within or overlapping the 679 pleiotropic loci. Notably, 590 genes (89.1%, 119 unique) were detected in at least two trait pairs (Fig. 3, Supplementary Tables 10–12). Furthermore, four unique pleiotropic genes were detected in over one-third of the trait pairs: ATXN2, BRAP, ALDH2, and SH2B3, all located at the 12q24.1-q24.12 locus. Ataxin 2 [ATXN2] is a polyglutamine protein involved in various biological processes, including RNA translation and cytoskeletal reorganization. Recent studies have suggested that ataxin-2 deficiency is associated with dyslipidemia, potentially impacting the normal metabolism of the cardiovascular system. Rare variants in ATXN2 have been proposed to be related to obesity, insulin resistance, and diabetes mellitus. Obesity may trigger and maintain a chronic low-grade inflammatory state that can worsen autoimmune conditions and their related complications. Inflammatory stimuli increase the expression of BRCA1-associated protein [BRAP], which in turn promotes the release of inflammatory cytokines, thereby elevating the likelihood of atherosclerosis, a key contributor to cardiovascular disease development. Accumulated evidence demonstrates that BRCA is rapidly recruited to DNA lesions and plays a crucial role in the DNA damage response, potentially mediating autoimmune and systemic immune-mediated diseases. In addition, these results suggest that 225 pleiotropic genes (34.0%) are novel candidate genes for ADs, while 312 genes (47.1%) are associated with CVDs. Notably, ATXN2, BRAP, and SH2B3 were not previously reported to be associated with both traits. A total of 644 genes (97.3%) identified by MAGMA were confirmed using FUMA positional mapping (Supplementary Table 8).
Tissue-specific pleiotropic genes between ADs and CVDs
We applied stratified LDSC to specifically expressed genes (LDSC-SEG) to connect genetic discoveries to pertinent tissues and cell types, offering an understanding of the role of particular tissue or cellular functions in the genetic basis of a trait. Some of our findings from analyzing gene expression data align with established biological knowledge: immunological traits exhibit immune cell-type enrichments, while cardiovascular traits are strongly enriched in tissues such as the heart's left ventricle and arterial tissues, including the aorta, coronary artery, and tibial artery. Chromatin data from the Roadmap Epigenomics and ENCODE projects confirmed the multiple-tissue gene expression analysis described above (Fig. 4, Supplementary Table 13).
Although the nearest gene is commonly assumed to be the causal gene, this is not always the case. MAGMA mainly considers variants near gene boundaries, potentially overlooking important distal variant-gene associations. e-MAGMA, which builds on MAGMA, integrates genetic and transcriptomic data (e.g., eQTLs) to identify risk genes, thus making better use of distal variant-gene associations. Additionally, e-MAGMA can help pinpoint the actual susceptibility genes in a tissue context by leveraging eQTLs from potentially phenotype-associated tissues. A total of 5,483 pleiotropic tissue-specific genes (550 unique) were identified in at least one tissue (Supplementary Table 14). Ten genes, namely RBM6, UBA7, MST1R, and RNF123 (all located at 3p21.31), GSDMB, ORMDL3, and PGAP3 (all located at 17q12-q21.1), and ALDH2, TMEM116, and SH2B3 (all located at 12q24.11-12), were identified in at least one-third of the trait pairs. The four genes at 3p21.31 were identified as candidate risk genes specific to three diseases: UC, CD, and PSC. For example, evidence supports a role for RNA binding motif protein 6 [RBM6] in IBD through participation in the intestinal immune network for IgA production. Dysregulation of RNA-binding proteins such as RBM6 can lead to aberrant immune responses, significantly contributing to hypertension and thereby increasing the risk of CVDs. Ubiquitin-like modifier activating enzyme 7 [UBA7] has been identified as a target gene for IBD-associated variants that influence immune responses. Given the association between IBD and PSC, UBA7 may also play a role in PSC pathogenesis. UBA7 is also known to activate ISG15, a ubiquitin-like protein that contributes to heart failure development by regulating cardiac amino acid metabolism and altering cardiomyocyte protein turnover. A transcriptome-wide association study (TWAS) validated the e-MAGMA analyses using single-trait GWAS results (Supplementary Table 15).
A total of 70.1% of tissue-specific pleiotropic genes were identified as novel for ADs and 70.6% for CVDs. A total of 30.4% of genes identified by e-MAGMA were confirmed through FUMA eQTL mapping (Supplementary Table 8).
In conclusion, 388 pleiotropic genes (130 unique) were identified through the combined use of MAGMA and e-MAGMA, of which ALDH2 and SH2B3 were detected in at least one-third of the trait pairs (Supplementary Table 10). Except in SLE-CAD, where ALDH2 and SH2B3 map to 12q24.12, these two genes are located at the 12q24.11-12 locus for all other trait pairs. Aldehyde dehydrogenase 2 family member [ALDH2] significantly inhibits mitophagy during reperfusion, attenuates hypoxia/reoxygenation-induced cardiomyocyte contractile dysfunction, and may serve as a primary target for cardioprotection. Additionally, overexpression of ALDH2 protects against oxidative stress-induced inflammatory events that lead to cellular or tissue injury and may therefore protect against ADs. SH2B adaptor protein 3 [SH2B3] negatively regulates cytokine signaling and cell proliferation. This function contributes to an increased risk of various autoimmune diseases, potentially because altered SH2B3 activity impairs the negative selection of immature or transitional self-reactive B cells. Furthermore, SH2B3 has been implicated in heart injury through promotion of a proinflammatory response and impairment of insulin signaling. Remarkably, the disease-specific RNF123 and MST1R genes at locus 3p21.31 identified in the e-MAGMA analysis were still present. RING finger protein 123 [RNF123] plays a significant role in the immune response, mainly through the TLR3/IRF7-mediated pathway that promotes type I interferon (IFN) expression, thereby exacerbating chronic inflammation in IBD. Research indicates that RNF123 can influence the stability of critical proteins involved in the inflammatory response, such as those in the NF-κB signaling pathway, which is also implicated in atherosclerosis and other cardiovascular conditions. Macrophage stimulating 1 receptor [MST1R], also known as RON receptor tyrosine kinase, is critical in regulating inflammatory responses and tissue repair.
Research has shown that MST1R is involved in several key signaling pathways that mediate immune responses and fibrotic processes, which are central to the pathogenesis of PSC. Moreover, MST1R signaling pathways could intersect with those involved in lipid metabolism and oxidative stress, which are critical in the pathogenesis of CVDs.
Shared biological mechanisms between ADs and CVDs
Pathway and gene set approaches, by aggregating and analyzing signals at the gene level within functional pathways, reveal the functional and biological characteristics of genes that confer risk for a particular phenotype. Here, we note that the associated genes collectively perturb various nodes in T cell activation and signaling pathways, yet different disease clusters show distinct patterns of genetic associations (Supplementary Table 16a, Supplementary Table 16b). Notably, the gene SH2B3, associated with most trait pairs, was significantly enriched in multiple gene sets within the lower layers of ‘intracellular signal transduction,’ including ‘regulation of the MAPK cascade’ and ‘regulation of phosphatidylinositol 3-kinase/protein kinase B signal transduction’. In autoimmune diseases such as RA and SLE, dysregulation of the MAPK cascade can lead to aberrant T-cell activation and inflammatory cytokine production, perpetuating the autoimmune response. Excessive activation of the MAPK cascade can also contribute to the development of atherosclerosis by promoting endothelial cell dysfunction and inflammatory processes within the vascular wall. The PI3K/AKT pathway is integral to various cellular functions, including growth, survival, metabolism, and immune responses. In the lower layers of the ‘innate immune system,’ CD, UC, and PSC were significantly associated with ‘antigen processing and presentation of peptide antigen via MHC class I,’ whose dysregulation can lead to immune responses against self-antigens, contributing to autoimmune pathology. For instance, studies have implicated aberrant MHC class I antigen presentation in CD, where T cells recognize self-peptides as foreign, triggering chronic inflammation in the gastrointestinal tract. Similarly, in UC, defective antigen processing mechanisms may result in the immune system attacking the intestinal lining, exacerbating symptoms.
Drug-potential target network
Using STRING V.11.5, we identified ten biologically related genes for each of SH2B3, ATXN2, BRAP, and ALDH2. Subsequent queries in the Drug Gene Interaction Database (DGIdb) revealed that SH2B3, ATXN2, ALDH2, and their associated genes, such as JAK2 (linked to SH2B3), are targeted in various treatments for ADs and CVDs. This finding is consistent with the known role of JAK2 in mediating immune responses and regulating T-cell differentiation. We also identified 444 FDA-approved drugs and 858 potential candidates targeting these 28 unique genes (Supplementary Table 17). Notably, paclitaxel, an anti-inflammatory drug targeting JAK2, demonstrates potential as a therapeutic option for ADs and is already approved for various CVDs, including PAD. Furthermore, the potential for repurposing antitumor drugs such as fedratinib to prevent or treat ADs and CVDs merits further exploration in clinical trials.
Causal relationships between ADs and CVDs
Mendelian randomization (MR) analysis can detect causal trait pairs and partially reflects vertical pleiotropy. In the Latent Heritable Confounder MR (LHC-MR) analysis, after Bonferroni correction for multiple comparisons, we found convincing evidence (P < 2.25×10−7) for causal effects in five trait pairs: RA-VTE, T1D-AF, T1D-Stroke, CD-Stroke, and UC-VTE (Supplementary Table 18). For example, for a one-unit increase in the log odds of UC (equalling a one-unit increase in the prevalence of UC), the odds ratio for VTE was 1.046 (95% CI, 1.029–1.063). In the reverse analysis, convincing evidence of genetic causality emerged for three pairs (VTE-T1D, Stroke-RA, and Stroke-PSC). For example, genetic liability to VTE showed a positive association with T1D (OR 1.981; 95% CI, 1.560–2.515). Furthermore, two trait pairs (CD-AF and CD-CAD) hinted at bidirectional causality. Overall, MR analysis demonstrated strong evidence of genetic causality in 10 trait pairs analyzed in either direction, suggesting that vertical pleiotropy may mediate their relationship.
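The reported odds ratios and confidence intervals can be converted back to the log-odds scale to see roughly how strong the evidence is. The sketch below assumes symmetric 95% Wald intervals on the log scale (SE = CI width / (2 × 1.96)) and a normal approximation for the test statistic; these are assumptions about how the intervals were constructed, and the recomputed P values are approximate because the inputs are rounded.

```python
import math

def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-odds estimate, SE, z, and two-sided P from an OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p

# Estimates quoted in the text: (OR, lower 95% CI, upper 95% CI).
estimates = {
    "UC -> VTE": (1.046, 1.029, 1.063),
    "VTE -> T1D": (1.981, 1.560, 2.515),
}

results = {name: wald_from_or(*vals) for name, vals in estimates.items()}
for name, (beta, se, z, p) in results.items():
    print(f"{name}: beta = {beta:.3f}, SE = {se:.4f}, z = {z:.2f}, P = {p:.1e}")
```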