SETDB1 selectively inhibited the chromatin accessibility on SINE_B2 elements
To investigate the regulatory mechanisms of TEs in the developing brain by SETDB1, we dissected out ganglionic eminences (GEs) from both wildtype (WT) and SETDB1-knockout (KO) mouse brains at embryonic day 15.5 (E15.5), and conducted neurosphere culture following an established protocol19. The purified NPCs were then harvested and subjected to various epigenomic profiling assays, including the evaluation of genome-wide chromatin accessibility (ATAC-seq), H3K9me3 and CTCF occupancy (ChIP-seq), DNA methylation (WGBS), 3D genome organization (in situ Hi-C), and gene transcription (RNA-seq) (Fig. 1A). It is worth noting that ATAC-seq and RNA-seq data were derived from our previously published work19. However, in this study, we undertook a novel analysis of these datasets from a completely different perspective, shifting our emphasis onto the investigation of TEs instead of individual gene loci.
Differential analysis (KO/WT) of ATAC-seq peaks revealed genome-wide alterations in chromatin accessibility, resulting in 15,868 up-regulated and 9,868 down-regulated peaks (Fig. 1B). Notably, a significant finding was that more than 65% (10,713) of the up-regulated ATAC-seq peaks (“ATAC_up”) were located on TEs. Furthermore, nearly half of these peaks (4,556) were specifically annotated to the B2 family of SINE class (SINE_B2). In contrast, less than 25% of the down-regulated ATAC peaks (2,442) were associated with TEs, and only a limited number of 88 peaks were annotated as SINE_B2 (Fig. 1B, S1A). Further subtype analysis of the overlapping TEs revealed that only SINE and LTR showed significant enrichment within the ATAC_up peaks (Fig. 1C-D). The enrichment on LTR was expected, as previous studies have reported that SETDB1 plays critical roles in repressing LTRs, particularly ERVs, in the brain and other organs19,25,29–31. In our study, within the LTR class, ERVKs were notably enriched in ATAC_up peaks (Fig. 1C). However, the robust enrichment within the SINE class was unexpected. Among the SINE class, we found that B2, as opposed to Alu and other subclasses, exhibited highly enrichment in the ATAC_up peaks. Moreover, among the five B2 subtypes (B3, B3A, B2_Mm2, B2_Mm1t, B2_Mm1a), B3 and B3A were the main contributors to the observed effects (Fig. 1C-D). Considering the relatively short length of SINE (average length of 159 bp) compared to other types of TEs, we recalculated the enrichment by accumulative length and obtained consistent results (Fig. S1B-C).
We defined the group of B2 elements that overlapped ATAC_up peaks as “ATAC_up_B2”. When compared to all B2 elements in the genome (“All_B2”), and the group of B2 that did not overlap with ATAC_up peaks (“ATAC_not_up_B2”), we observed a notable increase in chromatin accessibility for ATAC_up_B2, even in the WT, which was further augmented after SETDB1-KO (Fig. 1E). This suggested that this specific set of B2 elements exhibits a baseline level of activity and potential functionality. Genome annotation analysis revealed that ATAC_up_B2, as well as ATAC_up_B3 and ATAC_up_B3A, were significantly enriched in proximal intergenic regions and gene bodies (Fig. 1F). Interestingly, although these B2 elements were not enriched in promoters (Fig. 1F), they were significantly closer to transcription start sites (TSS) when compared to the distribution of B2 elements throughout genome (Fig. S1D). In contrast, ATAC_not_up_B2 elements were enriched in distal intergenic regions (Fig. 1F), and located significantly farther from TSS (Fig. S1D). Visualizations on the map tracks clearly demonstrated that ATAC_up_B2 elements were highly concentrated in gene-rich regions (Fig. 1G). These all suggest a high potential of ATAC_up_B2 elements to influence gene transcription. Furthermore, we conducted Homer motif analysis and revealed that only ATAC_up_B2 elements were highly enriched with motifs for CTCF, in contrast to ATAC_not_up_B2 or All_B2 (Fig. S1E), underscoring the distinct functional role of ATAC_up_B2 elements. In summary, these findings indicate that the chromatin accessibility of a specific population of SINE_B2 elements is controlled by SETDB1 in NPCs.
SETDB1 suppressed chromatin accessibility of SINE_B2 via H3K9me3 deposition
SETDB1 is one of the most important histone methyltransferases with specificity towards H3K9me3. We, therefore, investigated whether SETDB1 represses the chromatin accessibility on SINE_B2 elements in NPC through direct H3K9me3 deposition. To explore this, we conducted H3K9me3 ChIP-seq in NPCs from both WT and SETDB1-KO. The correlation analysis showed a clear distinction between WT and KO samples (Fig. 2A). Subsequently, we called H3K9me3 narrow peaks and performed a differential analysis between KO and WT, which disclosed a substantial downregulation of H3K9me3 narrow peaks (8,063) (“K9_down”) in KO, with only 6 peaks showing upregulation (Fig. 2B). It is worth noting that H3K9me3 signals predominantly manifest as broad domains in heterochromatic regions, whereas the identified narrow peaks typically represent localized H3K9me3 signals in open chromatin regions. Indeed, a majority (62.2%) of these K9_down peaks were located in genic regions (Fig. S2A). Consistent with the ATAC-seq data (Fig. 1), we observed enrichments of SINE (by TE class), B2 (by SINE family) and B3/B3A (by B2 subtype) elements within the K9_down peaks (Fig. 2C-D, S2B). While LTR elements constituted a substantial portion of the K9_down peaks, they did not reach statistical significance in the enrichment tests (Fig. 2C-D). In addition, when considering enrichment by cumulative length, SINE_B2 remained highly enriched in K9_down peaks, while LTR displayed a milder level of enrichment (Fig. S2C). Furthermore, we detected enrichment of the CTCF motif in the K9_down peaks (Fig. S2D).
Next, we assessed the signal correlation between differential peaks from the ATAC-seq and H3K9me3 ChIP-seq datasets. The H3K9me3 signal was almost completely erased on ATAC_up_B2 elements in KO compared to WT (Fig. 2E), and vice versa, the ATAC-seq signal was increased on K9_down_B2 elements (Fig. 2F). These observations were visually evident in the H3K9me3 track maps, particularly on the ATAC_up_B2 elements (Fig. 2G). In summary, these findings strongly suggested that genetic ablation of SETDB1 in NPCs results in the loss of H3K9me3 signal in the euchromatin regions, which likely contributes to the increased chromatin accessibility observed on the SINE_B2 elements.
DNA methylation participated in SETDB1-mediated repression of SINE_B2
Our previous work demonstrated the interplay between SETDB1-mediated H3K9me3 and DNA methylation, another well-established repressive epigenetic modification28. Given this, in the current study, we examined DNA methylation by conducting WGBS in NPCs derived from both WT and KO. The high efficiency of bisulfite conversion was confirmed using spiked-in lambda DNA (Fig. S3A). While the global DNA methylation levels showed only mild changes (Fig. S3B), differential analysis (KO/WT) revealed over 30,000 significant differentially methylated regions (DMRs) across the genome (Fig. 3A), evenly distributed on promoters, gene bodies and distal intergenic regions (Fig. S3C). As expected, most TEs exhibited high levels of DNA methylation, while non-TE regions, especially gene promoters, displayed lower levels of methylation (Fig. 3B, left panel). Meanwhile, the average methylation level on SINE and its subfamilies including B2 remained largely unchanged (KO/WT) at a global level (Fig. 3B, left panel). However, when examining the significant DMRs, we observed massive downregulation (KO/WT) of DNA methylation on TEs, with the exception of low-complexity regions. Among all TEs, the significant DMRs in SINE were the most hypomethylated, with B2 being the most hypomethylated within the SINE class (Fig. 3B, right panel). Interestingly, gene promoters, but not those active promoters, exhibited increased DNA methylation in KO compared to WT (Fig. 3B, right panel). Moreover, enrichment analysis, both by TE number and accumulative length, revealed that significantly hypomethylated DMRs (“DMR_hypo”) were enriched for SINE (by TE class) and B2 (by SINE family), but not for LTR (Fig. 3C-D, S3D-E). Interestingly, within the subtypes of B2, DMR_hypo were more enriched for Mm1t and Mm1a (Fig. S3D-E), contrasting with B3/B3A that were enriched for the up-regulated ATAC-seq peaks (ATAC_up) (Fig. 1C-D), and down-regulated H3K9me3 narrow peaks (K9_down) (Fig. 2C-D). Nevertheless, the CTCF motif was also enriched in DMR_hypo regions (Fig. S3G).
Next, we evaluated the intersection between DMRs and ATAC-seq peaks (ATAC). We found that for significant DMRs and ATAC peaks overlapping B2 elements (sig_B2), a substantial fraction (514/2497) was situated in the 1st quadrant (DMR_hypo and ATAC_up), suggesting an increase in chromatin accessibility in the DNA hypomethylation regions on B2 elements in KO (Fig. 3E). Notably, there was a near complete depletion of the H3K9me3 signal in this set of B2 elements (Fig. 3F-G). Altogether, these findings imply that DNA methylation and H3K9me3 act synergistically on SETDB1-mediated suppression of B2 elements in NPCs.
SETDB1 constrained excessive CTCF binding at SINE_B2
SINE_B2 elements are well known for their potential to harbor CTCF binding sites21,22. In our current study, we discovered an enrichment of CTCF motifs in B2 elements with ATAC_up, K9_down, and DMR_hypo events (Fig. S1E, S2D & S3F). Therefore, we performed anti-CTCF ChIP-seq to characterize the CTCF binding landscape in both WT and KO. Correlation analysis showed a clear distinction between WT & KO samples (Fig. S4A). Moreover, differential analysis (KO/WT) revealed a substantial global increase in CTCF binding, with 8,597 significantly up-regulated and 615 down-regulated CTCF peaks in KO (Fig. 4A). Remarkably, more than 70% of these upregulated CTCF peaks (“CTCF_up”) overlapped with SINE_B2 elements (Fig. 4B), and showed exclusive enrichment (both in terms of TE number and cumulative length) for SINE (by TE class), B2 (by SINE family), and B3A/ B3 (by B2 subtype) elements (Fig. 4B, S4B-D). Notably, despite multiple evidence indicating the activation of LTR elements in KO (Fig. 1C-D, S2B-C), LTR was not enriched in the CTCF_up peaks when calculated by TE number (Fig. S4B) and showed only very mild enrichment when calculated by cumulative length (Fig. S4D). Furthermore, CTCF motifs were only enriched in the CTCF_up peaks overlapped with B2 elements (“CTCF_up_B2”), not in those that did not overlap with B2 (Fig. S4E). These findings highlight the specificity of the increased CTCF binding towards SINE_B2 elements following the loss of SETDB1 in NPCs.
We proceeded to conduct an integrated analysis of the CTCF ChIP-seq and ATAC-seq datasets. Permutation tests revealed a significantly higher number of overlaps and a shorter average distance between CTCF_up peaks and ATAC_up_B2, relative to the background of all B2 loci (Fig. S4F). Approximately half of the CTCF_up peaks overlapped with ATAC_up peaks, and over 70% (3,429/4,683) of ATAC_up_B2 peaks were associated with CTCF_up peaks (Fig. S4G). Furthermore, the distribution plot showed that almost all the B2 elements (4,005/4,007) with significant changes in ATAC-seq and CTCF ChIP-seq exhibited increases of both signals in KO compared to WT (Fig. S4H). This suggests that the increase in CTCF bindings in KO may be a result of the alleviation of chromatin accessibility repression on B2 elements. Following these observations, we looked deeper into the accumulative signal of CTCF, ATAC-seq, H3K9me3, and DNA methylation on CTCF_up_B2 elements in both WT and KO (Fig. 4C-G). These B2 elements exhibited moderate enrichment of CTCF occupancy in WT, which increased in KO as anticipated (Fig. 4C). In parallel, chromatin accessibility mirrored this trend (Fig. 4D). Interestingly, the repressive epigenetic marks showed a similar trend but exhibited distinct patterns. For H3K9me3, a notable enrichment (“peak”) was observed on CTCF_up_B2 in WT, which was almost completely diminished in KO (Fig. 4E). In contrast, DNA methylation was lower than the background (“valley”) on CTCF_up_B2 in WT, and this was further reduced in KO (Fig. 4F). These findings suggest that H3K9me3 and DNA methylation play together, but with unique ways in controlling the activity of this population of SINE_B2 elements.
CTCF is a highly conserved zinc finger protein and plays an important role in the regulation of gene transcription32–34. CTCF occupancy is broadly distributed across the genome. Density plot illustrated that a considerable proportion of CTCF peaks were located on gene promoters (around TSS), while the remaining peaks were concentrated around 15 Kb away from the TSS (Fig. 4H). Interestingly, upregulated CTCF peaks (CTCF_up, “Up” for short), as well as those on the B2 elements (CTCF_up_B2, “Up_B2” for short), were almost completely absent from the promoter regions (Fig. 4H). Further genome annotation revealed that these CTCF_up peaks were primarily enriched in genic regions. Particularly, CTCF_up_B2 showed a significant enrichment in gene body regions (Fig. S4I), suggesting an indirect regulatory role of these peaks in gene expression.
Chromatin loop formation due to increased CTCF occupancy at SINE_B2 elements
In recent years, CTCF has been extensively studied for its pivotal role in maintaining 3D genome architecture through the mediation of chromatin folding32–34. Therefore, we explored the consequences of a genome-wide increase in CTCF binding on chromatin organization in both WT and KO NPCs. We performed in situ Hi-C and generated Hi-C contact maps with a resolution of < 15 Kb for individual samples, and < 5 Kb for merged samples by genotype (Fig. S5A). There were no significant alterations in normalized contact maps (Fig. S5B) or genome-wide relative contact probability between WT and KO (Fig. S5C). Moreover, although CTCF has been recognized as an insulator in the formation of boundaries for TADs35,36, our data did not reveal any significant changes in the number of TADs, the width, the insulation score of TAD boundaries, or the interaction frequency between neighboring TADs (Fig. 5A-B & S5D).
We then turned our focus to chromatin loops. While the number and span of loops remained largely unchanged (Fig. S5E-F), we observed significant variations in both the strength and position of chromatin loops (Fig. 5C-D). The differential (KO/WT) contact intensity decreased noticeably in WT loops, while it increased in KO loops (Fig. 5C). Additionally, we detected a significant redistribution of loop anchors, resulting in the emergence of 6,444 “new loop anchors” (NLA), the disappearance of 5,515 “lost loop anchors” (LLA) in KO, and the conservation of 10,240 “unchanged loop anchors” (ULA) between WT and KO (Fig. 5D). Notably, a significant portion of NLAs (p < 0.0001) overlapped with CTCF_up_B2 elements (Fig. S5G). This overlap exceeded expectations (Fig. 5E, left), while the overlaps between LLA (Fig. 5E, middle) and CTCF_up_B2, or ULA and CTCF_up_B2 (Fig. 5E, right), significantly fell below anticipated levels. Visualization of the chromatin loops from WT and KO showed the emergence of new loop anchors corresponding to CTCF_up_B2 elements (Fig. 5F).
In conclusion, these findings suggest that while increased CTCF binding did not cause major structural alterations in the 3D genome, it led to a reorganization of chromatin interactions in the SETDB1-KO NPCs.
SETDB1 maintained cell cycle in NPCs by restraining aberrant loop formation
Chromatin loops, which bring cis-regulatory elements into close proximity with gene promoters, are known to regulate gene transcription. To understand the transcriptional implications of reorganized chromatin loop formation following SETDB1 loss in NPCs, we analyzed RNA-seq data. Interestingly, we found that the promoters of differentially expressed genes (DEGs) significantly overlapped with the anchors of new loops (those with at least one NLA), but not with lost loops (those with at least one LLA) or unchanged loops (Fig. 6A), indicating that these novel chromatin loops might influence gene transcription. Despite a genome-wide increase in chromatin accessibility in the SETDB1-knockout NPCs, particularly on TEs (Fig. 1), we observed a decrease in the ATAC-seq signal on all gene promoters in the KO (Fig. S6A). This finding was unexpected, given SETDB1's known role as a transcriptional repressor. Detailed analysis revealed that the decrease in the ATAC-seq signal primarily occurred on the promoters of downregulated genes, but not on those of upregulated genes (Fig. 6B). In addition, protein-protein interaction analysis of the DEGs specifically overlapping with NLAs on CTCF_up_B2 elements identified enriched functional clusters associated with cell division (Fig. 6C). Indeed, we observed CTCF_up_B2 mediated NLAs on these genes (Fig. S6B). Meanwhile, RNA-seq data indicated a significant downregulation of their transcription levels in KO NPCs (Fig. 6D). This suggests that the newly formed chromatin loops in KO mediated downregulation of gene transcription in NPCs, especially for those genes implicated in the cell cycle.
We subsequently assessed cell proliferation capabilities of both WT and KO NPCs. Our in vitro analysis of primarily cultured neurospheres showed a significant reduction in diameter for the KO group compared to the WT (Fig. 6E). Cell cycle analysis further revealed fewer cells at the G2/M phase in the KO group compared to the WT (Fig. 6F, S6C). For in vivo studies, we administered intraperitoneal injections of BrdU to pregnant mice at E15.5, and then sacrificed them at E16.5. We examined the fetal brains for BrdU labeling and phospho-histone H3 (PH3) expression using immunofluorescence staining. Upon genotyping, we observed a significant decrease in the number of PH3 + and PH3+/BrdU + cells within the ganglionic eminence (GE) region of KO brains (Fig. 6G-I). Therefore, our findings from both in vitro and in vivo studies suggest impaired cell proliferation and dysregulated cell cycle in the NPCs due to the loss of SETDB1.