3.1 Genetic susceptibility to AF and DCM risk
A total of 818,314 individuals of European ancestry (4962 patients and 813,352 controls) were analyzed in this MR study. IVs significantly associated with AF and DCM were extracted from GWASs (p<5×10⁻⁸), and SNPs in LD were removed (r²<0.001, window size 10,000 kb). In the European population, we observed evidence of a causal association between genetic susceptibility to AF and DCM (IVW: β=20.44, 95%CI: 15.00-25.88, p=0.0002; weighted median: β=18.28, 95%CI: 11.37-25.19, p=0.0082) (Table 2 and Figure 2).
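For readers unfamiliar with the IVW estimator, the following is a minimal sketch of a fixed-effect inverse-variance weighted estimate computed from per-SNP summary statistics. The function name and the input values are hypothetical and chosen only for illustration; they are not the study's data, and the study's own software may use a different (e.g. random-effects) variant.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    beta_exposure: SNP-exposure effect sizes
    beta_outcome:  SNP-outcome effect sizes
    se_outcome:    standard errors of the SNP-outcome effects
    """
    num = sum(bx * by / s ** 2
              for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / s ** 2
              for bx, s in zip(beta_exposure, se_outcome))
    beta = num / den
    se = math.sqrt(1.0 / den)
    ci = (beta - 1.96 * se, beta + 1.96 * se)  # normal-approximation 95% CI
    return beta, se, ci

# Hypothetical inputs (NOT the study's data): each outcome effect is
# exactly twice the exposure effect, so the estimate should equal 2.
bx = [0.10, 0.20, 0.15]
by = [0.20, 0.40, 0.30]
se_y = [0.05, 0.05, 0.05]

beta_ivw, se_ivw, ci_ivw = ivw_estimate(bx, by, se_y)
print(f"IVW beta={beta_ivw:.3f}, 95% CI=({ci_ivw[0]:.3f}, {ci_ivw[1]:.3f})")
```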
3.2 Sensitivity analyses for MR estimates
Thirteen SNPs were analyzed using leave-one-out analysis and heterogeneity tests. Leave-one-out analyses showed that no single SNP drove the causal estimate between AF and DCM (Figure 2). Cochran's Q test for heterogeneity yielded p-values greater than 0.05 (IVW: Q=10.56, p=0.5667; MR-Egger: Q=10.56, p=0.4806), suggesting no significant heterogeneity among the SNPs. The MR-PRESSO test for horizontal pleiotropy indicated that pleiotropy was unlikely to bias the causal estimate (p>0.05) (Table S1).
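The heterogeneity p-values can be related to the Q statistics through the chi-square distribution. The sketch below assumes the conventional degrees of freedom (number of SNPs minus 1 for IVW, minus 2 for MR-Egger), which the text does not state explicitly; `chi2_sf` is a hypothetical helper written with the standard library only.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, integer df >= 1.

    Uses the recurrence Q(x; df+2) = Q(x; df) + (x/2)^(df/2) e^(-x/2) / Gamma(df/2 + 1),
    starting from the closed forms for df = 1 (erfc) or df = 2 (exponential).
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        p, nu = math.exp(-half), 2
    else:
        p, nu = math.erfc(math.sqrt(half)), 1
    while nu < df:
        p += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return p

n_snps = 13                               # from the text
p_ivw = chi2_sf(10.56, n_snps - 1)        # assumed df = 12 for IVW
p_egger = chi2_sf(10.56, n_snps - 2)      # assumed df = 11 for MR-Egger
print(f"IVW p={p_ivw:.4f}, MR-Egger p={p_egger:.4f}")
```

Under these assumptions the computed values agree with the reported p-values to within rounding of the Q statistic.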
3.3 Identification of differentially expressed genes
A total of 6463 DEGs were identified in the DCM dataset using the thresholds p-value <0.05 and |log2FC| >1.2; the volcano plot and heatmap in Figure 3 show their differential expression. Similarly, 1850 DEGs were identified in the AF dataset using the same thresholds, as depicted in Figures 4A and B.
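The thresholding step can be sketched as a simple filter over a differential-expression results table. The records and gene names below are hypothetical placeholders, not values from the DCM or AF datasets; only the two cutoffs come from the text.

```python
P_CUTOFF = 0.05       # p-value threshold stated in the text
LOG2FC_CUTOFF = 1.2   # absolute log2 fold-change threshold stated in the text

def select_degs(results, p_cutoff=P_CUTOFF, lfc_cutoff=LOG2FC_CUTOFF):
    """Return genes passing both the p-value and |log2FC| thresholds."""
    return [
        row["gene"]
        for row in results
        if row["p_value"] < p_cutoff and abs(row["log2fc"]) > lfc_cutoff
    ]

# Hypothetical results table (illustrative values only).
example_results = [
    {"gene": "GENE_A", "log2fc": 2.10, "p_value": 0.001},   # up-regulated, kept
    {"gene": "GENE_B", "log2fc": -1.50, "p_value": 0.020},  # down-regulated, kept
    {"gene": "GENE_C", "log2fc": 0.80, "p_value": 0.001},   # fold change too small
    {"gene": "GENE_D", "log2fc": 3.00, "p_value": 0.200},   # not significant
]

degs = select_degs(example_results)
print(degs)
```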
3.4 Weighted gene co-expression network analysis and critical module identification
A scale-free co-expression network was constructed using WGCNA to identify the modules most relevant to AF. A soft-threshold power β of 5 was chosen based on scale independence and mean connectivity (Figures 4C, D). With a module merging threshold of 0.25 and a minimum module size of 30, the cluster dendrogram of AF and control samples yielded 38 color-coded gene co-expression modules, as shown in Figures 4E-G. Module-trait correlation analysis showed that the brown module had the strongest association with AF (r=0.72, p<0.001), as shown in Figure 4H. Therefore, we selected the brown module, consisting of 572 genes, for further analysis. Within this module, module membership and gene significance were significantly positively correlated (correlation coefficient=0.70, p<0.001), as shown in Figure 4I. These results suggest that genes in the brown module are the most closely associated with AF.
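To illustrate what the soft-threshold power does, the sketch below raises the absolute Pearson correlation between two expression profiles to the power β=5. It assumes an unsigned network (adjacency = |cor|^β), which the text does not specify; the expression vectors are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def soft_threshold_adjacency(x, y, beta=5):
    """Unsigned adjacency |cor(x, y)|^beta (assumed network type)."""
    return abs(pearson(x, y)) ** beta

# Hypothetical expression profiles across four samples.
gene1 = [1.0, 2.0, 3.0, 4.0]
gene2 = [2.0, 4.0, 6.0, 8.0]   # perfectly correlated with gene1
gene3 = [1.0, 3.0, 2.0, 4.0]   # correlation 0.8 with gene1

a12 = soft_threshold_adjacency(gene1, gene2)
a13 = soft_threshold_adjacency(gene1, gene3)
print(a12, a13)
```

Raising correlations to a power suppresses weak correlations much more strongly than strong ones, which is why a moderate correlation of 0.8 drops to roughly a third while a perfect correlation stays at 1.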
3.5 Functional enrichment analysis of AF
Intersecting the AF DEGs identified with limma and the brown-module genes yielded 172 genes (Figure 5A), which were subjected to enrichment analysis. KEGG analysis showed that the intersecting genes were involved in “ubiquitin-regulated protein degradation” and “PPAR signaling pathway,” as shown in Figure 5B. GO analysis showed that the intersecting genes were involved in biological processes (BPs) such as “chromosome organization”, “organelle localization”, and “monovalent inorganic cation homeostasis”, as shown in Figure 5C. For cellular component (CC), the intersecting genes were associated with “COPII-coated ER to Golgi transport vesicles”, “autophagosome”, and “LSM1-7-PAT1 complex”, as shown in Figure 5D. For molecular function (MF), “nucleotide binding” and “guanosine binding” were the most common terms, as shown in Figure 5E.
3.6 Enrichment analysis of AF with DCM and screening node genes via the protein–protein interaction network
The intersection of the DEGs for DCM and the module genes for AF comprised 209 genes, as shown in Figure 6A. KEGG analysis showed that these 209 genes were mainly enriched in “various infections” and “phagocytic vesicles,” as shown in Figure 6B. GO analysis revealed genes involved in “vascular phylogeny” and “organelle localization” (BP), “anchoring junctions” and “cellular substrate connectivity” (CC), and “binding enzymes” (MF), as shown in Figures 6C-E. The interactions among the 209 genes were determined by constructing a PPI network, as shown in Figure 6F. Genes were ranked according to their number of connected nodes (Figure 6G).
3.7 Identification of candidate hub genes via machine learning
LASSO regression, RF, and SVM machine-learning algorithms were used to identify potential candidate genes associated with the diagnosis of AF combined with DCM. LASSO regression identified 10 genes that were closely associated with the diagnosis, and the corresponding ROC curve demonstrated the high accuracy of the LASSO model (AUC 1.00, 95%CI 0.99-1.00) (Figures 7A, B, C). The top 15 significant genes from RF and SVM are shown in Figures 7D and E, respectively. Two genes (VSNL1 and ETNPPL) identified by all three methods served as the key diagnostic genes for the final validation, as shown in Figure 7F.
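The final selection step is a three-way set intersection. In the sketch below, only VSNL1 and ETNPPL come from the text; the other gene names are hypothetical placeholders standing in for the remaining LASSO, RF, and SVM candidates, which are not listed in this section.

```python
# Candidate genes per method. Apart from VSNL1 and ETNPPL, all names
# are hypothetical placeholders (the full lists are not given in the text).
lasso_genes = {"VSNL1", "ETNPPL", "GENE_L1", "GENE_L2"}
rf_genes = {"VSNL1", "ETNPPL", "GENE_R1", "GENE_L1"}
svm_genes = {"VSNL1", "ETNPPL", "GENE_S1", "GENE_R1"}

# Genes selected by all three methods.
shared_genes = lasso_genes & rf_genes & svm_genes
print(sorted(shared_genes))
```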
3.8 Diagnostic value evaluation
A nomogram containing the two key diagnostic genes was constructed, and the model was validated using GSE17800 (Figure 8B). ROC curves were constructed to assess diagnostic efficacy, and the AUC and 95%CI were calculated for each gene and for the nomogram, as shown in Figure 8A. The results were as follows: VSNL1 (AUC 0.87, 95%CI 0.74-1.00), ETNPPL (AUC 0.81, 95%CI 0.64-0.99), and nomogram (AUC 0.89, 95%CI 0.74-1.00).
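As a reminder of what the AUC measures, the sketch below computes it as the probability that a randomly chosen case has a higher score than a randomly chosen control, counting ties as one half (the Mann-Whitney formulation). The scores are hypothetical and unrelated to GSE17800; confidence intervals are not computed here.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly.

    Ties contribute 0.5, matching the Mann-Whitney U formulation.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical marker scores (illustrative only).
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]

auc = auc_from_scores(cases, controls)
print(f"AUC = {auc:.3f}")
```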
3.9 Immune infiltration analysis
Figure 9 shows the correlation between the DCM gene set and immune cells; only activated CD4+ memory T cells differed between the disease and control groups.
3.10 Target miRNA prediction and integrated miRNA-target network construction
As shown in Figure 10, VSNL1 interacted with 37 target miRNAs, whereas no target miRNAs were found to interact with ETNPPL. VSNL1 interacted with ARMC1 in the PPI network, whereas no proteins were found to be associated with ETNPPL. The intracellular signaling and regulatory pathways of VSNL1 are yet to be experimentally verified.