Experimental design to access phenotype-specific engram nuclei activated in the social interaction (SI) test following chronic social defeat stressor exposure
Though bulk studies have provided valuable insights into the intricate molecular alterations following stressor exposure, the inherent heterogeneity of molecular signatures in different cell types (70, 71) could potentially mask critical signals from active cells. Therefore, to increase the precision of identifying specific molecular alterations associated with resilience and susceptibility following stressor exposure, we used the ArcCreERT2 mouse line (72) coupled with the reporter Sun1sfGFP (73) mouse line (Arc-GFP). In the presence of TAM, this experimental system allows for the expression of sfGFP in the nuclear membrane of the stimulus-activated cells. This facilitates the meticulous tracing of spatiotemporally active nuclei or phenotype-specific engrams in response to our predefined stimulus, i.e., SI in our experiment (Fig. 1a).
A detailed description of the behavior paradigm and corresponding result are available as previously reported (31). Briefly, 7–8-week-old male mice (Arc-GFP) were examined for their baseline behavior using an open field/eagle test and social interaction test with conspecifics (SI/CO, Fig. 1b, see (31) and Methods for more details) for counter-balance and segregation into stress and control groups. Following the segregation, the stressed groups were subjected to CSD stress for 3×15s attack daily, with a new bully CD1 mouse for each attack session, followed by 24 hours of sensory stimulation without physical attacks (scheme in Fig. 1b). This procedure was repeated for ten days. The non-stressed control groups were also handled daily in a separate room. After the last day of CSD, mice (both stressed and non-stressed controls) were allowed to rest for seven days before performing the SI test with a novel CD1 male mouse (SI/CD1). TAM (150 mg/kg) was delivered intraperitoneally 5 hours before the SI test for spatiotemporal tracing of SI-activated nuclei. Interestingly, the SI/CD1 test (n = 43 controls, n = 73 stressed) showed a significant group stress effect after the CSD (Fig. 1b, p < 0.05, Kolmogorov-Smirnov test), along with a significant increase in weight of the stressed group (31). Following the confirmation of effective stress application, mice that showed a strong interaction with the CD1 mice similar to the control non-stress group (SI-index > 100) were termed as resilient (Res) mice. In contrast, those individuals that exhibited a deficit in social interaction (SI-index < 100) were labelled as susceptible (Sus) mice (Fig. 1b). For stringency, the behaviors of these selected mice were also studied carefully by an experienced researcher and substantiated in the recorded videos. After the validations, representatives of the control (n = 6), resilient (n = 6), and susceptible (n = 9) mice (see Supplementary Fig. 1) were sacrificed 72 hours after TAM injection/SI/CD1. We focused on the activated nuclei (51, 52) from the ventral hippocampus (vHipp), given its pivotal role in social and fear memory (74–77), goal-directed behavior (8) and decision-making (78, 79), which are consistently altered in stress disorders. SI-activated sfGFP positive (sfGFP+) nuclei from the vHipp were collected using FANS (52). Thereafter, the methylome was accessed using reduced representation bisulfite sequencing (RRBS-seq (53)), and RNA-seq data (31) was integrated to generate an increased resolution of the molecular mechanisms underlying susceptibility and resilience. For capturing the distinct engram populations, we employed a tamoxifen (TAM)-inducible ArcCreERT2 mouse line (72) coupled with a reporter line expressing sfGFP on the nuclear membrane (73) (Arc-GFP mice) in stimulus-activated cells.
Accessing the DNA methylation landscapes of the control, resilient and susceptible groups using activated engram nuclei
To investigate the methylome landscape using RRBS-seq, sfGFP + nuclei were collected from a pool of 2–3 mice each from the selected control, resilient and susceptible groups (Supplementary Fig. 1) to yield 3, 3 and 4 biological replicates, respectively. We obtained approximately 30 million reads per sample, detecting 3.4×108 cytosines in resilient, 3.8×108 cytosines in susceptible and 3.0×108 cytosines in control groups with > 10× coverage (average detection per sample: 3.5×108). From the detected cytosines, considering only the methylated cytosines, we noted that all three groups showed similar global CpG methylation levels (Supplementary Fig. 2a). The CHG and CHH global methylation levels were also similar between the groups but were less than 5%. As the CHG and CHH methylation levels were comparatively lower, we focused our further analyses for this manuscript on the methylation at CpG dinucleotide sites (> 10× coverage).
Landscape and functional characterization of the differentially methylated cytosines (dmCs) and genes (dmGs) harbouring the dmCs
For a detailed investigation of the methylation landscape, we first looked at the DNA methylation differences at single CpG sites. This approach is essential because individual CpG modifications at genomic locations of importance, such as transcription factor binding sites (80) (TFBSs), can alter gene expression. Using a 25% difference in methylation levels as a threshold in all pairwise group comparisons, we identified 1421 unique dmCs in resilient compared to control (Res_c), 911 unique dmCs in susceptible compared to control (Sus_c) and 1316 dmCs in the direct comparison between the stressed phenotypes, susceptible compared to resilient (Sus_r) (Supplementary Table 1). These stress-induced modifications were prevalent in genic as well as intergenic regions (Supplementary Fig. 2b) in both the susceptible (Sus) and resilient (Res) groups. Tracing the genomic locations of these modifications using log odds ratio, we noted significant enrichments (either hypo-/hypermethylated or both) in CpG shelves, introns and 3’UTRs in all pairwise comparisons (Supplementary Fig. 2c). The findings are similar to studies investigating methylome in the non-CSD context (46, 61, 81, 82), where enrichment of methylation alterations in intronic regions indicates induced neuronal modifications (81), following stressor exposure. Next, focusing on the genic regions, we traced the dmC alterations to 884 genes in Res_c, 598 genes in Sus_c and 920 genes in Sus_r (Supplementary Fig. 2d). Of note, most genes (> 70% dmGs) harbored only one dmC (Supplementary Fig. 2e). Interestingly, some alterations detected in Sus_r were specific to only one of the two stress phenotypes (124 genes specific to Res_c and 72 genes specific to Sus_c (Supplementary Fig. 2f, Supplementary Table 1). However, we did not detect phenotype-specific genomic loci, as we observed an equivalent distribution of dmCs over the chromosomes (Supplementary Fig. 2g).
Next, to obtain a comprehensive understanding of the functionalities of the altering molecular landscape following stressor exposure, we performed gene ontology (GO) analyses of the dmGs from each dual comparison (Supplementary Table 1) as well as a combination of the genes from all dual comparisons. From the combined analysis, we identified an enrichment of biological processes (BP), including ‘pattern specification’, ‘neuron cell fate commitment’, and ‘axonogenesis’. Enriched molecular functions (MF) included activities related to ‘transcription activator/repressor and ‘ion channels’. On the other hand, enriched GO cellular components (CC) included `transcription regulator complex’, `synaptic membrane’ and ‘actin-based cell projection’ (Supplementary Fig. 3a, Supplementary Table 1). Additionally, we noted a higher fraction of differentially methylated transcription factors (TFs, orange) than synaptic genes (SGs (60), blue, Supplementary Fig. 3b).
Landscape and functional characterization of the differentially methylated regions (DMRs) and genes (DMGs) harbouring the DMRs
Since regional methylation modifications have been considered better indicators of gene expression alterations and more robust to statistical power (83, 84), we, next,determined the differential methylation averaged over blocks of genomic DNA (1000 bps). Similar to the dmCs, the DMRs were equivalently distributed in the genic and intergenic regions and showed alterations in both the susceptible and resilient phenotypes (Supplementary Fig. 4a, Supplementary Table 2) as opposed to the controls. Additionally, the DMRs were also enriched in the genomic locations (Fig. 2a) of CpG shelves, 3’ UTRs and intronic regions (mostly first introns, Supplementary Fig. 4b). While DMRs were similarly distributed along the genome in all dual comparisons (Supplementary Fig. 4c), noteworthy is that chromosome 2 possessed the highest number of DMRs in both Res_c (n = 152) and Sus_c (n = 116). Next, focusing on the genic regions, the DMRs were assigned to 790 DMGs in Res_c (Fig. 2b), 578 DMGs in Sus_c (Fig. 2b) and 728 DMGs in Sus_r (Fig. 2b). Amongst the DMGs in Sus_r, 130 were specific to Res_c, 69 to Sus_c, 41 genes appeared in both Res_c and Sus_c (Supplementary Fig. 4d), while 488 genes were not detected in the respective comparisons with the controls. Nevertheless, the majority (74%) of the DMGs belonged to protein-coding genes (Supplementary Fig. 4d), with most genes harbouring only one DMR (Supplementary Fig. 4d, Supplementary Table 2).
To estimate the probable functional roles of the DMGs in the dual comparisons, we performed gene ontology (GO) enrichment analysis. Across all pairwise comparisons, we observed a significant enrichment of the BP ‘synapse organization’ (Fig. 2c), indicating extensive neuronal remodelling after stressor exposure. This observation was supported by an increased prevalence of synaptic genes (SynGo genes (60), Supplementary Fig. 4e) in the DMR analysis as compared to the dmCs. Indeed, there was little overlap between the dmGs and DMGs (Supplementary Fig. 4f). Other top GO BPs included ‘small-GTPase mediated signal transduction’ in Res_c, ‘regulation of Wnt signaling pathway’ and ‘regulation of endocytosis’ in Sus_c, whereas ‘regulation of neurogenesis’ and ‘regulation of neuron differentiation’ were significantly enriched in Sus_r (Fig. 2c, Supplementary Table 2). A deeper analysis of the remaining BPs revealed that the terms ‘neuronal differentiation’, ‘gliogenesis’ and ‘axonogenesis’ were specific to Res_c. At the same time, ‘endocytosis’, ‘cell-matrix adhesion’ and ‘Wnt signaling’ were unique to Sus_c (Supplementary Fig. 4g, Supplementary Table 2). Next, a UMAP dimensional reduction (Fig. 2d, Enrichr (57–59) ) was employed to visualize the collective stress-induced alterations in MFs using the DMGs from all pairwise comparisons. We noted significant enrichments of gene clusters related to ‘guanine nucleotide exchange factors (GEF)/GTPase activator activity’ (cluster 6), ‘Wnt signaling’ (cluster 9), ‘tyrosine kinase activity’ (clusters 2,5,6,9), ‘ion channel/receptor binding activity’ (cluster 3) and ‘actin/tropomyosin binding’ (cluster 12) (Supplementary Table 2, ‘GO terms_combined modifications’). Thereafter, clustering of the common methylated regions based on absolute methylation values (Fig. 2e) identified that MF ‘tyrosine kinase activity’ was the most representative for the gene sets that lost methylation in both phenotypes upon stressor exposure (p < 0.05, Enrichr (57–59)). Meanwhile, alterations in the ‘guanine nucleotide exchange factors (GEFs)’ and ‘cell-cell adhesion’ activities were the most representative for the gene sets that gained methylation in the resilient group. Interestingly, GO MFs of ‘actin/microtubule binding’ were the most representative of gene sets that gained methylation in the susceptible group. On the other hand, the GO CCs showed that both stressed phenotypes lost methylation for genes associated with GO terms of ‘synaptic membrane’ and ‘extracellular matrix’ (ECM) (Fig. 2e, Supplementary Table 2). CCs `glutamate receptor complex’ and `cell-cell contact zone’ (Fig. 2e, Supplementary Table 2) were enriched in gene sets that gained methylation in the resilient group. In contrast, the susceptible phenotype gained methylation in the domains of ‘actin filament/membrane rafts’ (Fig. 2e, Supplementary Table 2). Cumulatively, these findings revealed that despite some common stress-invoked pathways in the methylome of the stressed group, there were clear divergent molecular processes active in the resilient and susceptible engrams.
Regulatory functions of the DMRs suggest a major involvement of GTPase signaling in stress-related alterations
Having determined the probable functional roles (GO terms) of the genes (DMGs) that contain DMRs, we investigated whether these DMRs could possess potential regulatory roles. For this purpose, we matched the genomic coordinates of the DMRs with publicly available data of cis-regulatory elements such as enhancers, insulators and inhibitors (brain-specific data from cATLAS (62), and non-brain-specific data from cENCODE (61). Interestingly, we noted a substantial overlap of the DMRs (68.59%, cATLAS and 50.69%, cENCODE) with the regulatory regions (hence termed ‘rrDMRs’, Fig. 3a). This finding indicates a potential regulatory role of the methylation alterations following stressor exposure. Further, enrichment analyses revealed that the genes harbouring rrDMRs were top enriched (p-adjusted < 0.05, Enrichr (57–59)) for the GO BPs of ‘regulation of intracellular/small GTPase mediated signal transduction’, ‘enzyme-linked receptor protein signaling pathway’ and ‘peptidyl-tyrosine modification/phosphorylation’ (Fig. 3a). The top MFs of these rrDMR genes included ‘protein tyrosine kinase’, ‘GTPase regulator’, ‘GEF factor’ and ‘Wnt signaling co-receptor’ activities (Supplementary Table 3). These findings provide evidence that the alterations in the methylome landscape could potentially influence key signaling pathways. To increase stringency, we overlapped our data with age-matched brain enhancer data available from EnhancerAtlas2.0(63) (Mus musculus, brain). Supporting the previous findings, the majority of the genes with DMRs in enhancer regions (termed ‘eDMGs’) were associated with GTPase signaling (Arhgap9, Cdk14, Rasgef1a, Gabbr2, Arhgap35, Gnas, Tbc19db), tyrosine kinase signaling (Tex14, Aatk), synaptic vesicle functions (Ap1m1, Sorl1, Coro1c) and synaptic plasticity (Map7, Camk2b, Fmnl1, Npas4, Arpc3, Prph, Cdh22) (Fig. 3b, Supplementary Table 3). Noteworthy is that apart from the genic regions, a considerable number of intergenic regions also overlapped with the regulatory regions (Supplementary Table 1), indicating potential regulatory roles for non-coding regions in stress-related alterations.
Motifs for E26 Transformation Specific (ETS) family of TFs enriched within the DMRs
Following the substantial overlap of the DMRs with the regulatory regions (Fig. 3a, b), we hypothesized that there could be multiple TFBSs within the DMRs. Methylation of TFBSs plays an important role in TF binding(80, 85, 86) and chromatin accessibility, thereby affecting gene expression (36, 87–90). Therefore, we performed TF binding DNA sequence motif enrichment analyses using HOMER (64), similar to Zhang and colleagues (91). Interestingly, we identified members of the ETS family of TFs(92) to be substantially enriched, with the top position occupied by ETS-like 4 (Elk4), p = e− 126 (Fig. 3c), followed by ETS-like 1 (Elk1), p = e− 65, with very similar seed motifs (TTCCGG) (92, 93). Further, separate HOMER analyses of the pairwise comparisons of Res_c, Sus_c and Sus_r identified a similar top enrichment of the ETS family of TFs (Fig. 3d, Supplementary Table 3). Interestingly, the TFBS enrichment analyses using only rrDMRs yielded a similar result (Supplementary Table 3). Notably, the ETS family of TFs are nuclear targets of the Ras (GTPase)-extracellular signal-regulated kinase (ERK) signaling (94, 95), crucial for activity-dependent transcription (95), synaptic plasticity, learning and memory, among other important neuronal functions (96–98). These findings implicate an important role for the ETS family of TFs and GTPase signaling in regulating the effects of stress via the methylome. Though one could argue that the observed enrichment of the ETS TF might be because of the enrichment of the CG-rich genomic regions in the RRBS-seq, this is unlikely because RRBS-seq data for non-stress research in the hippocampus showed enrichment of other TFs (61).
Regulatory influence of the differential methylation in gene expression at the sample collection timepoint
Having determined the potential regulatory role of the DMRs, we next aimed at identifying the impact of differential methylation on gene expression following stressor exposure. For a preliminary screen, we compared the DMGs with the differentially expressed genes (DEGs (31) obtained within the same experiment). Interestingly, we identified only about 10% (190) of the total DEGs overlapping with DMGs (Fig. 3e). This is consistent with existing literature (99), where epigenetic modifications do not alter transcript levels shortly after an environmental challenge but instead act as a priming module for future alterations. Surprisingly, despite the low overlap, the genes shared between the two omics datasets showed GO MF enrichment of ‘GTPase signaling’ activities (Fig. 3e, GO terms) and ‘GEFs’ (highlighted in the rectangle in Fig. 3e). Of note, this enrichment in MFs was accompanied by enrichment of BPs related to ‘cell adhesion’ and ‘ion channel’ activities (Supplementary Table 4). However, these genes revealed a complex, non-conventional relationship between DNA methylation and gene expression (Supplementary Fig. 5a). Indeed, such features have been observed wherein, methylation increases gene expression (100) or facilitates permissive environments for alternative splicing mechanisms (82, 101) instead of gene downregulation. In fact, only eight of the eDMGs (black arrows, Fig. 3b) showed alterations at the transcript levels (Fig. 3e, right panel) following stressor exposure.
Divergent patterns of GTPase signaling and related mechanisms in the susceptible and resilient phenotypes at multiple omics levels
As the previous sections revealed an important contribution of GTPase signaling in regulating stress effects, we next sought to clarify the existence of a phenotypic-specific divergence in this signaling pathway via a multi-omics molecular profiling. For this, we adopted a networks approach by integrating the genes that exhibited altered methylation or expression patterns into plausible protein-protein interactions (PPI) in the STRING platform (see Methods)(65–67).
Following a preliminary analysis, we identified an enrichment of the domains of the GTPase regulators (GEFs) in both the methylome (combined DMGs) and transcriptome (combined DEGs) datasets (Pfam(102) and SMART (103), STRING) (Supplementary Fig. 5b, Supplementary Table 4). Therefore, this family was investigated in-depth. Intriguingly, we noted a divergent pattern of the RhoGEFs between the resilient and susceptible phenotypes. While a higher fraction of hypermethylated GEFs (intronic) was prevalent in the resilient animals (Fig. 4a, Res_c_s, resilient compared to either control or susceptible), a higher fraction of hypomethylation was prevalent in the susceptible group (Fig. 4a, Sus_c_r, susceptible compared to either control or resilient). Interestingly, this trend was accompanied by a higher fraction of upregulated GEFs in the resilient (Fig. 4a, Res_c_r) as opposed to downregulated GEFs in the susceptible phenotype (Sus_c_r). This phenotype-specific divergence in the RhoGEF family, surpassing different omics layers, suggests a central and novel role of this GTPase regulator in directing the stress phenotypes. Next, analyzing hub genes (CytoHubba (68)) within the PPI networks, we ascertained that GTPases functioned as hubs in our network, albeit with a unique and distinct pattern in the Res_c compared to Sus_c in both omics’ levels. Specifically, in the methylome Res_c, GTPases (eg., Rhoa, Gnas, Fras1) and GTPase regulators (Kalrn, Syngap1) formed the majority of the top 10 hub genes (Fig. 4b), while the hubs in Sus_c showed a low representation of the GTPases (Gnas). In contrast, in the transcriptome, the GTPases formed the top 3 hubs in Sus_c (e.g., Rhoa, Rac1) while the spliceosome family of heterogeneous nuclear ribonucleoprotein particles (Hnrnps) and serine/arginine-rich splicing factors (Srsfs) dominated the top hub ranks in Res_c (Fig. 4b). The hub genes for Sus_r for the two omics levels are illustrated in Supplementary Fig. 5c.
While the crucial involvement of GTPase signaling at the different omics layers is clear, given the presence of hub genes involved in other BPs, we wanted to determine if we could obtain other phenotype-specific signaling networks through an integrated multi-omics readout.
For this integrated approach, DMGs and DEGs of each phenotype comparisons were combined, and plausible PPI networks were generated in STRING (65–67). We identified network clusters specific to the integrated resilient comparisons (RES_C, example, spliceosome assembly, cadherin clusters) and integrated susceptible comparisons (SUS_C, example, actin nucleation, oligodendrocyte development). Interestingly, GTPase families remained as critical hubs (Supplementary Fig. 5d, Supplementary Table 4) in the integrated data sets. On the other hand, BPs related to ‘exocytosis/endocytosis’, ‘Wnt signaling’, and ‘myelination’ were specific to SUS_C, whereas, amongst the limited BPs specific to RES_C, we noted ‘mRNA processing’, and ‘Rho protein signal transduction’ to be of importance. We validated these findings using Enrichr (Supplementary Table 4). Of note, specificity of BP ‘endocytosis’ to the susceptible comparisons persisted both across individual omics comparisons and upon data integration (Supplementary Fig. 5e, Supplementary Table 4), as opposed to other BPs that changed phenotype-specificity at different omics levels (Supplementary Fig. 5e, Supplementary Table 4). With this gain of information on the complex yet convergent molecular alterations, the next step was to screen for potential targetable pathways/genes within networks.
Screening of potential targetable systems in the integrated networks of the stress-altered pathways
We reasoned that an ideal targetable system should be within a group of molecular functions prevalent at the multi-omics levels in all comparisons. For a smooth visualization, we gathered the collective stress-related alterations in molecular functions for each omics level and generated integrated enrichment maps using Cytoscape(69) (Fig. 5a,). Here, we identified alterations in targetable molecular functions of ion channels and receptors (Cluster 3) prevalent at all omics levels. Additionally, the alterations in the DMGs and DEGs overlapped at the targetable MFs of GTPase activities (Cluster 5) and actin/cytoskeletal binding (Cluster 6). As these molecular functions are critical for synaptic functions and stress-related disorders, we focused our search for potential therapeutic targets within the synaptic genes.
As networks provide more information on the importance of a signaling molecule, we generated a PPI network of the synaptic genes detected as significantly different in any pairwise group comparisons for any omics level. Following the generation of the network, we screened for novel genes, which have not been reported as prominently implicated in depression or stress research and assessed the degree of connections with every other gene. Our screening list (Supplementary Table 5) consisted of data from the ‘major depression’ gene list from Harmonizome 3.0 (104) (links in Supplementary Table 5), Bagot and colleagues (105) (vHipp only), Fan and colleagues (106) Scarpa and colleagues (107), Gammie and colleagues (108) (Top 1000 gene list); and Kuehner and colleagues (109) (using only the shared genes between three and six months old animals) (Supplementary Table 5).
Supporting previous findings, we noted several genes previously identified as relevant to stress-related alterations in our network (blue nodes, green nodes, Fig. 5b), including GTPases, Rho GTPase activating proteins (GAPs), and RhoGEFs, On the other hand, we identified a few genes that have not yet been implicated as significantly relevant in stress pathways in previous studies (red nodes, Fig. 5b, Supplementary Table 5). Amongst these novel genes, we noted five genes (black, Fig. 5b) belonging to the GTPase family and Rho GTPase regulators. Interestingly, four of these genes (Rab5a, Rab6b, Ophn1and Arf6) are tightly linked to endocytic/exocytic processes (110–113). Amongst these genes, ADP-ribosylation factor 6 (Arf6) stands prominent because of its distinct nature (114) and critical roles in synaptic vesicle functions, including endocytosis (110, 115) and actin polymerization (114), which are consistently enriched in susceptible group comparisons. Since Arf6 expression is upregulated in the susceptible (log2 fold change ~ 0.4) compared to the resilient group, given its connection with hub genes, decreasing its expression could be beneficial. Of note, decreasing the activity of Arf6 has been shown to be essential for the regenerative capacity of the axons in the central nervous system (CNS) (116) as well as increasing the readily releasable pools at the synapses (115). Last but not the least, it is interesting to note that gm9887, partially overlapping with Arf6 (117) showed 3 CpG modifications at the 1–5 kb positions in the Res_c comparison (Supplementary Table 1).