NSN to SN transition is associated with transcriptome changes
We used scM&T-seq to generate scRNA-seq and scBS-seq data from nine NSN and 18 SN GV oocytes collected from two mice and classified by Hoechst staining of their nuclei. After quality control filtering, we were left with RNA-seq data for nine NSN and 16 SN oocytes, with an average read count of 3.6 million (range: 1.9–5.3 million reads; Supplementary Table S1). Principal component analysis (PCA) resulted in distinct clustering of NSN and SN oocytes on PC1 (36% of variance), demonstrating distinct transcriptome profiles of the two GV maturation stages (Fig. 1A). As the NSN-to-SN transition coincides with transcriptional silencing (10), we analysed the total number of transcripts detected in our dataset. We observed that SN oocytes contained on average 1,362 fewer transcript species than NSN oocytes (Fig. 1B), likely reflecting the transcript degradation associated with oocyte maturation (38). To explore whether the absence of a specific transcript was a shared event between SN oocytes, we estimated the proportion of cells expressing each gene and observed that these events were shared across most SN oocytes (Supplementary Figure S1A). This observation suggested that heterogeneity in the transcriptome of GV oocytes is not increased after transcriptional silencing even though a reduction in expressed genes was observed. To confirm this, we conducted a differential over-dispersion test between NSN and SN oocytes. A similar number of over-dispersed transcripts was observed in both categories (NSN = 224, SN = 185), supporting the idea that both groups share similar levels of transcriptional heterogeneity (Supplementary Figure S1B). To further explore the genes no longer consistently detected in SN oocytes, we selected 382 transcripts that we characterized as “SN-missing”, based on 1) having at least 1 count in at least 7 out of 9 NSN oocytes and 2) 0 counts in at least 10 out of 16 SN oocytes (Supplementary Table S2). Gene ontology analysis of genes that annotate to SN-missing transcripts did not render any interesting results. We noted that out of 382 transcripts, 114 (29.8%) were oocyte-specific transcripts, according to our previous annotation of the mouse oocyte transcriptome (39), and a further 61 (16.0%) had no assigned gene name. Furthermore, when analysing the expression of SN-missing genes in NSN oocytes, we saw a significant bias for SN-missing transcripts to be expressed in the lowest 2 expression quartiles at the NSN stage compared to random genes (Chi square P < 0.0001; Fig. 1C).
Therefore, transcripts that were detected in NSN but not SN oocytes tend to be already lowly expressed in NSN oocytes. Taken together with the lack of meaningful gene ontology results and the finding that around half of these SN-missing genes do not have an assigned gene function, these results may suggest that the majority of transcript species missing in SN oocytes are not of critical functional importance for the oocyte.
Next, we used DESeq2 analysis to identify 2,347 differentially expressed transcripts (DEGs), of which 579 had a Log2 fold change (LFC) > |1|. Strikingly, 576 out of these 579 transcripts were downregulated (Fig. 1D, Supplementary Table S2). The vast overrepresentation of downregulated transcripts in SN oocytes is again likely to reflect the absence of active transcription in combination with transcript degradation at this stage. Gene ontology analysis of downregulated DEGs with a LFC < -1 showed enrichment for processes related to mRNA processing, including ribonucleoprotein complex assembly, RNA splicing and ncRNA metabolic process as well as protein dephosphorylation (Fig. 1E).
Using RNA-seq data to infer chromatin configuration
Given the strong effect that chromatin configuration has on gene expression, we attempted to generate an NSN-SN classifier to infer chromatin configuration in other RNA-seq datasets. First, we annotated the list of 2,347 differentially expressed transcripts to genes and compared them with the list of DEGs reported by Ma et al., 2013 (28) from bulk RNA-seq analysis. This resulted in 199 genes for which we extracted the associated transcripts. Next, the transcripts were divided as upregulated or downregulated in SN. For transcripts which were downregulated in SN, we choose the top 65 transcripts on the basis of FDR. We used a similar strategy for the transcripts upregulated in SN and chose the top 35 transcripts on the basis of FDR, thereby generating the 100 gene classifier (Fig. 2A, Supplementary Table S2). Application of the 100 gene classifier split our samples cleanly into NSN and SN oocytes (Fig. 2A). Surprisingly, using Uniform Manifold Approximation and Projection (UMAP) clustering analysis the Ma et al. NSN sample clustered with our SN oocytes and the Ma et al. SN sample with our NSN oocytes, independently of whether we used our 100 NSN-SN classifier genes or PCA of all 22,869 transcripts (Supplementary Figure S2A, B). As Ma et al. (28) only had one replicate sample for each group (NSN and SN), we conclude that the samples in Ma et al. have likely been mis-assigned.
To analyse whether we could use our classifier to distinguish between NSN and SN oocytes in other scRNA-seq datasets, we selected a variety of mouse models, including ageing and gene knockout (KO) models, with a known skew in their NSN-SN ratio (Table 1).
Table 1
scRNA-seq datasets from mouse models used to test our NSN-SN classifier.
Mouse model | NSN:SN transition | Transcription | Method | GSE | Reference |
Oocyte ageing | Predominant SN | Silenced in SN | scRNA-seq | GSE154370 | (40) |
Exosc10 KO | Impaired | | scRNA-seq | GSE141190 | (41) |
Sall4 KO | Complete block | Continuous transcription | scRNA-seq | GSE84924 | (42) |
Zfp36l2 KO | Normal | Continuous transcription | scRNA-seq | GSE96638 | (27) |
Zcchc8 KO | Impaired | Not assessed | Bulk RNA-seq | GSE127790 | (43) |
Setd2 KO | Normal | Normal | Bulk RNA-seq | GSE112835 | (35) |
Based on the described phenotypes for these models, we expected to see a shift in the proportion of NSN and SN oocytes compared to control oocytes. We first looked at scRNA-seq data from oocytes from young (12 weeks) and aged (> 40 weeks) mice from our previous study using the same sequencing protocol (40). As the SN ratio increases with ageing (3), we expected to see more SN oocytes in aged oocytes compared to young oocytes. Clustering analysis based on the 100 NSN-SN classifier genes, allowed us to categorize 55 out of 87 oocytes as having a clear NSN or SN signature, respectively (Fig. 2B, Supplementary Figure S2C, Supplementary Table S3). This classification indeed showed a skew in aged oocytes towards SN oocytes compared to young oocytes (Fig. 2B,C; Supplementary Figure S2C).
Next, we analysed Exosc10 KO oocytes, which have been described to have impaired NSN to SN transition (41). We predicted that the transcription profile of the KO oocytes would be skewed towards the NSN signature, while the corresponding wild-type (WT) oocytes would be predominantly SN. Indeed, we saw a clear difference of NSN-SN composition between Exosc10 KO and WT oocytes (Fig. 2C,D; Supplementary Figure S3A). While all WT oocytes clustered together with our SN oocytes, the Exosc10 KO oocytes were a mixture of NSN and SN, recapitulating the impaired, but not completely blocked, NSN to SN transition, described by Wu et al. (41). We also tested Sall4 KO oocytes, which are described to have a complete block of the NSN to SN transition (Supplementary Figure S3B,C) (42). Again, we saw clustering of WT oocytes with our SN oocytes, while KO oocytes cluster in proximity with our NSN oocytes.
As NSN to SN transition and transcriptional silencing can be uncoupled (2, 26), the classifier has a caveat, in that it will incorrectly identify NSN and SN state if the two processes are uncoupled. As an example, we used our classifier in Zfp36l2 KO oocytes, in which transcriptional silencing is impaired, even though they have seemingly normal NSN to SN transition (27). The classifier predicts Zfp36l2 KO oocytes to be of the NSN type, while WT oocytes are of the SN type (Supplementary figure S4A,B), which reflects their impaired transcriptional state.
We also tested whether our classifier could determine NSN or SN status in bulk RNA-seq datasets. For Zcchc8 KO oocytes, which have impaired NSN to SN transition (43), the classifier correctly identified the KO oocytes to be of the NSN type (Supplementary Figure S5A). When applied to Set2d KOs (Supplementary Figure S5B), which are reported to have normal NSN to SN transition and normal transcriptional silencing (35), we detected a strong divergence between the two Setd2 KO replicates. We suspect this indicates variable proportions of NSN and SN oocytes collected for the two Setd2 KO replicates. Overall, the classifier showed a higher accuracy/robustness when a dataset had more samples, both in single-cell and bulk datasets.
Taken together, we believe that the NSN-SN classifier based on the abundance of 100 transcripts can be a useful tool to assess a possible effect on NSN to SN transition in a variety of mouse models. However, we do recommend verifying the results by other methods, such as imaging, as transcription and chromatin conformation may be uncoupled.
GV oocytes gain DNA methylation during the NSN to SN transition
During oocyte growth, de novo DNA methylation is predominantly linked to gene transcription and the deposition of H3K36me3 over expressed gene bodies. How DNA methylation levels differ between NSN and SN oocytes is unknown. To assess this, we generated scBS-seq libraries from the same oocytes used for used the scRNA-seq analysis above. scBS-seq libraries were successfully generated for 9 NSN and 23 SN oocytes, of which only 5 NSN and 18 SN oocytes were taken ahead after QC filtering. To increase the number of oocytes for a more robust analysis, we included the scBS-seq data from young oocytes from Castillo-Fernandez et al. (40), which were made using the same protocol, and inferred their NSN and SN state with the classifier described above (Supplementary Table S3). This resulted in 12 NSN and 28 SN oocytes for downstream analysis. As expected, we observed higher global levels of CpG methylation in SN oocytes (mean 32.3%) compared to NSN oocytes (mean 30.3%; Fig. 3A). We then extended this comparison to the next stage in oocyte development by using previously generated MII oocyte scBS-seq data (44). Global CpG methylation in MII oocytes had similar levels to those observed in SN oocytes (Fig. 3A). Non-CpG methylation increased from a mean of 4.02% in NSN to 4.68% SN oocytes (Fig. 3B). The oocyte methylome is known to be highly bimodal, with large domains exhibiting either high or low methylation, with smaller regions in between, which may be termed “intermediately” or “partially” methylated (39, 45, 46). We classified the genome into domains with low (< 30% methylation) and highly methylated domains (> 70%) as defined by Veselovska et al. (39). Domains with intermediate methylation were defined as regions in between low and highly methylated domains (40). The largest differences between NSN and SN oocytes were observed in domains with intermediate levels of CpG methylation (Fig. 3C).
Next, we sought to identify differentially methylated regions (DMRs) between NSN and SN oocytes. Because the coverage of scBS-seq libraries is low it is difficult to find domains with sufficient coverage in all samples for differential methylation analysis. Single oocytes samples were therefore pseudo-bulked, by randomly grouping four oocytes together and comparing three groups of four NSN and SN oocytes each in 100 iterations, as previously described (40). By selecting only those DMRs that are shared in at least 50% of tested combinations, we were left with a robust selection of 1,146 DMRs (Fig. 3D, Supplementary Table S4). The majority of identified DMRs (1,064, 91.3%) were hypermethylated in SN oocytes. Hypermethylated DMRs were enriched for oocyte transcripts, which we used as an indicator for genes (Fig. 3E). In contrast, no significant enrichment was observed for intergenic regions or promoters. As we saw the largest changes in intermediately methylated domains, we also assessed the overlap with regions that are hypomethylated in Uhrf1 KO oocytes from Maenohara et al. (47), because it was shown that loss of Uhrf1 in growing oocytes results in hypomethylation, with intermediately methylated regions most affected (47). We saw indeed an odds ratio of 5.62 of SN hyper DMRs for regions that lost DNA methylation in the Uhrf1 KO oocytes (Fig. 3E).
As DNA methylation in growing oocytes is driven predominantly by active transcription, we assessed whether the expression changes observed between NSN and SN oocytes correlated with DNA methylation changes. However, there was no correlation when comparing expression changes and methylation difference, neither for the identified DEGs, nor for the DMRs (Supplementary Figure S6A,B). This is perhaps not surprising, as the transcript abundance changes observed are likely due to downstream processing of RNA and transcriptional decay, rather than changes in active transcription.
DNA methylation changes in SN oocytes enriched for histone H3K27me3 and H3K36me3 marks
Loss of DNA methylation in Uhrf1 KO oocytes was described to be associated with untranscribed regions (47), regions expected to lack the histone posttranslational modification H3K36me3. Considering the overlap of SN hyper DMRs and Uhrf1 KO hypo DMRs, we decided to explore the relationship between the SN DMRs and the genomic histone modification patterns. For this we performed H3K36me3 ultra-low input native ChIP-sequencing of three replicate sets containing ~ 280 GV oocytes and compared these datasets with previously published H3K4me3 and H3K27me3 datasets (48). Replicate samples of the different histone marks showed high correlations, while correlation between different marks was much lower (Supplementary Figure S7A,B). H3K4me3, which is usually a mark of active promoters, additionally shows a non-canonical pattern of broad domains that are enriched in unmethylated, untranscribed regions in the oocyte (Fig. 4A) (46, 48, 49). H3K27me3 is a repressive mark and usually found broadly throughout unmethylated regions (Fig. 4A) (46, 48). H3K36me3 has been associated with actively transcribed gene bodies and overlaps regions with DNA methylation (50, 51). Indeed, H3K36me3 enrichment aligned well with methylated domains in GV oocytes (Fig. 4A). Interestingly, we noticed regions where several histone marks were overlapping (Fig. 4B, Supplementary Figure S7C). While regions with overlapping H3K4me3 and H3K27me3 (N = 106,063) are well described as so-called “bivalent” domains and known to be of crucial importance for normal embryo development (52), overlap between the other marks observed here is less common. We observed 54,094 windows with H3K4me3 + H3K36me3, 24,404 windows with H3K36me3 + H3K27me3 and 6,132 windows with all three marks (H3K4me3 + H3K36me3 + H3K27me3; Fig. 4B). Supporting that these regions are co-marked, in contrast to regions with H3K36me3 alone that are almost exclusively fully methylated, these regions despite also having H3K36me3 were only partially methylated (Fig. 4C). These findings suggest that the co-occurrence of H3K4me3 and/or H3K27me3 with H3K36me3 may disrupt recruitment or activity of de novo DNMTs or represent regions that are heterogeneously marked between individual oocytes.
When analysing the overlap of the different histone domains with our SN hyper DMRs, the DMRs appear to be especially enriched for regions with overlapping H3K36me3 + H3K27me3 and regions with all three marks (Fig. 4D). The latter, however, comprises such a small proportion of all analysed windows that the significance is unclear. In contrast, Uhrf1 hypo DMRs are equally enriched in H3K4me3 + H3K36me3 and H3K36me3 + H3K27me3 domains, indicating that the overlap between SN hyper and Uhrf1 KO hypo DMRs may not be fully accounted for by the same mechanism (Fig. 4D).