Despite the large volume of genetic information available, the pathogenesis and aetiology of MD are not yet fully understood; probably because most GV lie in non-coding regions with no obvious direct effect on a gene or function. In this context, leveraging multiple omics data is key for moving forward in the understanding of the influence of genomic variants in MD disease development. On top of that, full-genome summary statistics are not readily available due to study sharing policies (specially for private-public research partnerships) hampering the usage of most post-GWAS data analysis tools. This study aims to unravel the role of MD GVs in genetic regulation by focusing on regulatory variation following two complementary approaches: cis-eQTLs and TF binding alterations. Both are key to identifying potentially causal genes and understanding gene expression regulation [6, 8]; as reported by supporting evidence for its association with other mental disorders [40–42] and with MD in particular [43–45]. The regulatory variation analysis pipelines we have designed and implemented involve fine-mapping, cis-eQTL colocalization, transcription factor binding analysis and chromatin accessibility data; specially designed to perform well when full-genome summary statistics are not available [46].
Multiple GVs have been associated with MD, most of them characterized as not potentially pathogenic in addition to being common and mostly in non-coding regions of the genome according to CADD and VEP, respectively (Figure 1). The fine-mapping of MD GVs identified 172 causal GVs and 95 pGenes (Supplementary Table 1). The functional enrichment analysis of pGenes stands along with hypotheses of MD pathogenesis such as alterations in neurogenesis and neuroplasticity or the circadian rhythm theory [13]. Additionally, these are also enriched for other phenotypes frequently co-occurring with MD like alterations of body mass index or smoking [14]. While most pGenes (63%) and GVs (58%) have previous evidence for association with MD, our study pinpoints novel pGenes and GVs (Supplementary Table 2 and 3). Additionally, existing literature supports the role of pGenes in processes related to MD pathogenesis such as immune response, nervous system development, response to stress or behavioural despair.
MD causally associated GVs are those most likely to be causal and functioning as eQTLs and indeed, proved to be significantly enriched in cis-eQTLs from GTEx; in line with previous findings on MD and other psychiatric disorders [40, 47]. The colocalizing eGenes are involved in processes relevant to MD, such as neuron projection [48] and have been associated with MD and related phenotypes according to DISGENET plus [30, 31]. BAG5 is constitutively expressed in all tissues; while MYRF and SLC12A5 show higher levels in brain tissues (Supplementary Figure 3). BAG5 has been previously identified as associated with MD [49]. We characterize SLC12A5, involved in chloride homeostasis in neurons, as a pGene too, and its downregulation has been described as an effect from stress leading to the activation of the hypothalamic-pituitary-adrenal axis which ultimately can lead to MD-like symptoms [50, 51]. However, rs12624433 is an eQTL in Brain Putamen basal ganglia associated with the upregulation of SLC12A5. Thus, more research is needed to unravel the exact mechanism by which rs12624433 exerts its role on regulation of expression of SLC12A5. This eGene has been described as a potential drug target for mental disorders but considerations should be taken given its important role in brain development; besides, it is highly influenced by exercise and environmental factors [50]. rs198457 mediates the downregulation of MYRF expression, which plays a role in myelination and oligodendrocyte differentiation [29]. These, in turn, require thyroid hormones for their differentiation and maturation [52]. Furthermore, oligodendrocytes have been stated as crucial for psychological functions likely involved in mental disorders such as MD [53].
The analysis of TF regulation with RSAT enabled a precise prediction of TF binding alterations. Although TF expression is not highly tissue-specific [7, 54], for this type of analysis it is key to pick meaningful sets of TFs and GVs [55]. We focused on TF expressed in brain-related tissues as it has been previously reported that genes involved in depression are highly expressed in brain regions [4, 14, 19, 30]. Our analysis resulted in the prediction of 270 GVs lying in active regulatory regions of the genome of brain-related tissues based on chromatin accessibility data. These GVs alter the binding of 101 TFs, roughly equally distributed as disrupting or creating a binding site. The activating or repressing role of these TFs is hard to interpret since it will always depend on the sequence context and the cofactors involved [54]. Thus, further analysis is required to elucidate the impact of these GVs in gene expression regulation. Our pipeline enabled us to filter and prioritize the large number of candidate GVs by combining different omics data and ultimately propose mechanistic hypotheses.
By using eQTL data, we were able to identify the GV rs12624433, which regulates the expression of SLC12A5. This GV, previously referred to as colocalizing, is predicted to alter the binding of the TFs USF1, USF2 and MYC; these belong to the bHLH family involved in neural development [56]. USF1 and USF2 generally exert activating effects [57], with USF1 being a risk gene for Alzheimer’s disease and relevant for brain cholesterol metabolism involving its transport from astrocytes to neurons [58].
Besides, we found mechanistic evidence for 2 GV-TF-eGene/target associations (rs11227217: RAD21 → ZNRD2-DT; rs62259947: YY1 → P4HTM) when combining pattern matching results, chromatin accessibility data, GTEx eQTLs PICS and hTFtarget data. ZNRD2-DT is a lncRNA and interestingly our findings include several IncRNAs in the set of pGenes as well as related with regulatory variations, either colocalizing with cis-eQTLS (RP11-894P9.2 and RP5-1115A15.1) or with mechanistic evidence for its association with gene expression regulation (ZNRD2-DT). Although not being clear their exact role in MD pathophysiology, ncRNAs have been described as promising biomarkers and drug targets for MD [59, 60].
Regarding the association rs62259947: YY1 → P4HTM, P4HTM has been related with neurological disorders and social behaviour [38, 39]. It is involved in Ca2+ signalling mediated by the hypoxia-inducible factor HIF1α altering astrocytes gliotransmission [38]. Indeed, hypoxia has been associated with mental disorders in general and MD in particular [61–64]. In addition, P4HTM null mutation results in a reduction in fear and depression [39]. In turn, rs62259947 downregulates the expression of P4HTM and changes the binding affinity of YY1 in the Brain Cerebellar Hemisphere. Besides, YY1 regulates transcription by forming loops although its specific role as activator or repressor is not yet fully understood [37]. Furthermore, P4HTM and HIF1α have been reported as potential drug targets for MD [39, 65]. rs11227217 and RAD21 are associated with red blood cell and reticulocyte count respectively by PheWAS [26]. Indeed, red blood cells parameters have been described as altered in patients with mental disorders [66].