A single-cell atlas of the human cerebellum
We used split-pool ligation-based transcriptome sequencing (SPLiT-seq)30 to sequence a total of 1.08 million nuclei from the cerebellar cortex of 109 individuals (Fig. 1A). Our cohort was comprised of 16 ET patients as well as 93 healthy controls (Fig. 1A-C). We were able to retrieve, on average, 9,250 nuclei per sample, 3,486 unique molecular identifiers (UMIs) and 1,361 genes per nuclei.
We used Scanpy31 and Harmony32 to integrate nuclei across donors, batches, sex and case-control status. Using previously known markers and mammalian single-cell atlases33–35, we annotated a total of 14 clusters representing all major cell clusters of the cerebellum (Fig. 1D). All identified clusters expressed canonical markers of their respective cell-types: Granule cells (RBOFX3), Purkinje (CALB1, ITPR1), molecular layer interneurons (MLI) with two distinct populations (MLI_1; PTPRK, MLI_2, NXPH1), Purkinje layer interneurons (PLI; NPXH2, SNTG1), Golgi cells (DLGAP2), unipolar brush cells (UBC; PTPRO, LMX1A), Bergmann glia (SCL1A3), astrocytes (GFAP), microglia (PLXDC2), endocytes (FLT1, ATP10A), pericytes (CEMIP, DLC1), oligodendrocytes (MBP) and oligodendrocytes progenitor cells (OPC; SOX6, PDGFRA) (Fig. 1D and 1F). As expected, granule cells were overrepresented in most samples as they represent around 50–70% of extracted cells in the cerebellar cortex (Fig. 1E). Bergmann glia were the most common glial cells, followed by astrocytes. Proportion of Purkinje cells and interneurons remained somewhat stable across all samples (Fig. 1E).
Cell-type specific effects of cerebellar scQTLs
All 109 individuals subject to snRNA-seq were simultaneously genotyped. Following imputation, variant QC, and removal of non-European donors, 103 individuals and 5.3 million SNPs were kept for scQTL analysis (Supplementary Fig. 1). We restricted our analysis to SNPs with a minor allele frequency (MAF) greater than 0.05 and to those found within 1 megabase pair (Mb) of the transcription start site (TSS). We used TensorQTL 36 to map eQTLs in all of the 14 cell-types identified. Using both nominal and permutation mapping, we found a total of 3,120 eGenes, i.e. genes that were significantly regulated by at least 1 genetic variant (Fig. 2A; Supplementary Fig. 2). A majority of eGenes were found in granule cells, the most abundant cell type in our dataset. As previously reported, the number of scQTLs per cell-type was correlated with the number of cells (Fig. 2C). This is in line with the previous observation that increasing the number of sequenced cells could increase the discovery of scQTLs2. Most scQTLs were found upstream of the TSS, as expected, given their potential effects around transcription factor binding sites (Fig. 2D). Similar distributions around the TSS were also previously observed in brain scQTLs2.
We investigated shared effect sizes between cell-types in our dataset. For SNP-gene pairs at p-value < 1e-5, we studied the pairwise replication p-values for all cell-types (Fig. 2I). Overall, cell-types shared a significant number of scQTLs between each other. (Pi1 0.73–0.99). Only pericytes, which had significantly lower number of scQTLs, had lower replication p-values (Pi1 0.46–0.89), possibly due to a small number of sequenced nuclei. Cell-types with the greatest number of scQTLs (and greatest number of cells) shared the most scQTLs with each other. For example, at p-value < 1e-5, granule cells displayed Pi1 = 0.99 with Bergmann glia. Glial cells shared the highest similarity with other glial cells. Interestingly, neuronal populations did not share higher Pi1 statistics with other neuronal populations. Of note, Purkinje scQTLs shared a large number of scQTLs with both UBCs and oligodendrocytes indicating some cross glial-neuron correlation between scQTLs. These results indicate that a majority of scQTLs are shared across different populations of cells and only a small fraction (< 10%) represent cell-type specific eQTLs.
Next, we sought to study whether our single-cell results could replicate bulk-RNAseq eQTLs from the Cerebellar MetaBrain eQTL study6 (Supplementary Fig. 3). We used a similar approach as described above to calculate Pi1 statistic. Although most MetaBrain eQTLs were replicated in our dataset, scQTLs replicated at a lower rate when querying against MetaBrain eQTLs, possibly due to lower power in our analysis compared to the MetaBrain study. We find that granule cell scQTLs displayed the highest Pi1 statistic with MetaBrain eQTLs with over 50% of bulk eQTLs replicating in that population. Given the large number of granule cells in the cerebellum, this might indicate that most observed bulk eQTLs probably originate from these cells. Bergmann glia scQTLs replicated 41% of MetaBrain eQTLs, with other frequent cerebellar populations such as astrocytes, oligodendrocytes and OPCs replicating over 30% of bulk eQTLs.
We investigated whether our identified scQTLs had cell-type specific effects in the cerebellum. We defined cell-type specific eGenes as genes with significant genome-wide permutation tests in only 1/14 cell-types. We found a total of 803 cell-type specific eGenes in the cerebellum (Fig. 2B). As with scQTLs, cell-type specific eGenes were correlated with the number of nuclei, with granule cells displaying over half of all cell-type specific effects. This further demonstrates the importance of large-scale sequencing in the discovery of scQTLs. We illustrate a few examples of cell-type specific eGenes (Fig. 2E-H). For example, KNDC1, a guanine exchange factor known to regulate granule cell growth37, is specifically regulated in these cells. We also find that BACE2, a gene previously associated with ET and AD24,38–42, is specifically regulated in oligodendrocytes. BACE2 was previously associated with oligodendrocyte- and endocyte-specific genetic regulation in an scQTL study of the human cerebral cortex2.
Mendelian randomization of disease traits with cerebellar scQTLs identifies putative molecular mechanisms in neurological and psychiatric diseases
Disease associated SNPs often occur in non-coding regions and, therefore, understanding their molecular effects remains challenging. We used our newly constructed cerebellar scQTL dataset to investigate the causal effects between disease variants and expression events. We used summary-based mendelian randomization (SMR) to estimate the causal or pleiotropic relationship between GWAS SNPs and scQTLs. We also used the HEIDI test to distinguish pleiotropic effects from associations due to linkage disequilibrium (LD). We performed SMR for 5 traits: PD, AD, SCZ, Cerebellar volume (CBV) and ET (Fig. 3A-D; 4B). We defined significant causal or pleiotropic associations as SNPs with FDR < 5% for the SMR test and SNPs with p-value > 0.05 for the HEIDI-test to eliminate LD associations.
For PD, a majority of SMR-causally associated scQTLs were found in the 17q21.31 locus, a region associated with multiple neurological disorders43,44,45. This locus includes genes such as ARL17B, LRRC37, KANSL1, CRHR1 and MAPT. Such associations in this locus were observed in most cerebellar cell-types and were not cell-type specific, in concordance with previously observed colocalization in cortical scQTLs across glial and neuronal populations2. Previous association of MAPT regulation by astrocyte-specific scQTLs were not replicated in the cerebellum, correlating with the scarcity of tau accumulations in this brain region. Although we found few cell-type specific SMR associations for PD, we found significant association for Bergmann glia specific HIP1R expression with increased risk for PD. HIP1R encodes a regulator of clathrin-mediated endocytosis and a known interactor of Huntingtin 46.
In AD, we also found significant associations for multiple genes in the 17q21.31 locus. As with PD, associations within this locus were not cell-type specific. We, however, found that a majority of significant associations outside of this locus were cell-type specific. Notably, AD had multiple significant SMR associations in oligodendrocytes and OPCs. Additionally, we found significant association with cell-type specific KNOP1 expression in oligodendrocytes, ERCC2 in OPCs, SIRBP1 in Bergmann glia and ZNF225 in MLI_2. A majority of these SMR associations had variants with a decreasing effect on disease risk, consistent with previous findings of late cerebellar involvement in AD19. Moreover, we failed to find significant microglia-specific SMR associations, including BIN1 colocalization, hinting at potential lack of AD risk factors in cerebellar microglia. Overall, for both AD and PD, a majority of causal SMR associations were correlated with decreased effects on disease risk, concordant with pathological observations of late cerebellar involvement in these disorders47 18,48.
For CBV, we found 2 loci that had causal SMR associations with cerebellar scQTLs: PTK2 and SLC44A5 that were significant in granule cells and Bergmann glia, the most abundant cell-types in the cerebellum. PTK2 encodes FAK, a cell adhesion tyrosine kinase implicated in synaptic branching known to be associated with cerebellar hypoplasia49,50.
SCZ was associated with a large number of granule and Bergmann glia eGenes. This is consistent with previous associations of SCZ with excitatory neurons2,51. SCZ risk genes such as ANKK1, FLMN1 and DMTF1 were all associated with granule-specific scQTLs. Interestingly, we found multiple interneuron-specific eGenes associated with SCZ: calcium-voltage channel CACNA1C in MLI_1, proton transporter ATP6V1G2 in UBCs and the cholinergic receptor CHRNA2 in Golgi cells.
ET risk variants drive BACE2 downregulation in oligodendrocytes
ET remains one of the most common neurological disorders, yet its pathophysiology is poorly understood. In our ET SMR analysis, we found that top ET risk variant rs9982271 was significantly associated with BACE2 downregulation in oligodendrocytes (Fig. 4B). We found that this scQTL was colocalized with ET risk SNPs at that locus (Fig. 4C). Since ET genetic risk variants seem to be specifically enriched in the cerebellum, we tested whether scQTLs from cerebral oligodendrocytes also colocalized with ET. We found significant colocalization with rs9980363 and oligodendrocyte-specific scQTLs in the human cerebral cortex, hinting that BACE2 regulation by ET risk variants might not be limited to the cerebellum (Fig. 4D). To validate our SMR results, and further prioritize other ET risk genes, we performed cell-type specific transcriptome-wide association study (TWAS) using an Omnibus Transcriptome Test using Expression Reference Summary (OTTERS)52 (Fig. 4A). We found that BACE2 was prioritized as the most significant oligodendrocyte locus (ACAT-O p-value = 1.24e-12). Our results indicate that ET-associated variants in the upstream region of BACE2 control its expression in cerebellar oligodendrocytes.
Expression of ET-associated loci is significantly enriched in adult cerebellar oligodendrocytes and is independent from PD
Etiological mechanisms for ET pathophysiology are mostly unknown. Given this novel oligodendrocyte-ET association found through our scQTL analyses, we sought to validate these findings using our cohort of cerebellar snRNA-seq from ET patients. We first calculated the genetic enrichment for ET in our cerebellar dataset given that, other than BACE2, few GWAS loci colocalized with cerebellar scQTLs. We postulated that these loci might nonetheless display oligodendrocytes specific expression. We therefore used MAGMA cell-typing51 to calculate genetic enrichment within cerebellar cell-types. We found significant enrichment of ET loci in oligodendrocytes, Purkinje cells, PLIs and MLI_1 populations (Fig. 5A). To control for correlated expression between cell-types, we ran MAGMA using conditional analyses to remove co-expression signal between cell-types. After conditioning on significant cell-types, we only found significant enrichment in oligodendrocytes. This enrichment was not replicated in fetal cerebellum or in adult cerebral cortex (Supplementary Fig. 4). These findings point to ET genetic risk driving disease progression specifically in adult cerebellar oligodendrocytes (Supplementary Fig. 4).
We thought this enrichment could be explained in part by genetic correlation between ET and PD. This is due to the fact that 1) PD loci were shown to be enriched in cortical and spinal oligodendrocytes53 and 2) ET and PD share a genetic correlation of 30%24. To test whether the enrichment of ET loci in oligodendrocytes is independent from PD, we conditioned the ET GWAS based on the most recent PD GWAS 54 to remove correlated signals (Supplementary Fig. 4). We found that ET loci were still significantly enriched in oligodendrocytes after conditioning on the PD GWAS. Furthermore, PD loci were not significantly enriched in cerebellar oligodendrocytes.
A subset of immature oligodendrocytes drives genetic enrichment of ET loci
We hypothesised that a subset of cerebellar oligodendrocytes might display significant enrichment of ET risk genes. Using Harmony, we re-clustered cerebellar oligodendrocytes and identified 4 oligodendrocytes subclusters (Oligo_1; FGF14; Oligo_2: KNC1, Oligo_3: NRG3, Oligo_4: PCDH9) (Fig. 5B & 5D). To study subtype enrichment, we first pulled the top 100 genes driving ET enrichment in oligodendrocytes. Surprisingly, BACE2 was found at rank 79. Among the top 15 driving genes, we found oligodendrocyte markers such as CA2, LRRC7 and OLIG157–59(Fig. 5B). These genes are concomitantly found in loci indicative of genome-wide significance on chromosomes 8, 1, and 21, respectively24. Most of these ET-associated genes, including BACE2, had enriched expression in Oligo_4 (Fig. 5H). We also found that Oligo_4 displayed high expression of immature oligodendrocyte markers such as OPALIN, and low expression of mature markers such as RBFOX17(Fig. 5B & Fig. 5I-J). Moreover, this cluster had high expression of MBP, indicative of active re-myelination (Fig. 5I). Although both ET and controls had similar numbers of RBFOX1 + Oligo_1 (adjusted p-value = 1.56e-1), we found that Oligo_4 cells were significantly more represented in ET samples than in control samples (permutation Z-score = 5.81, adjusted p-value = 2.57e-8; Supplementary Fig. 5).
We performed differential expression analysis across all 4 oligodendrocyte subclusters to identify ET driven changes in mRNA expression (Fig. 5E & 5F). In line with ET loci enrichment in Oligo_4, this population displayed the strongest effect across the 4 subtypes (Oligo_4: 339 DEGs; Oligo_1: 19 DEGs; Oligo_3: 12 DEGs; Oligo_2: 9 DEGs) (Fig. 5C). We performed pathway enrichment analysis of Oligo_4 specific DEGs and found significant enrichment of pathways related to synaptic signalling and ion transport across neuronal membranes (Fig. 5F). These processes are known to be crucial to myelin and axonal homeostasis 60,61 62 and could be important compensatory mechanisms in ET oligodendrocytes. Overall, these results indicate that a subset of genetically vulnerable oligodendrocytes display dysfunctional synaptic and ionic changes in ET.
Neuronal dysfunction and Bergmann gliosis in the ET cerebellar cortex
Oligodendrocyte vulnerability alone is unlikely to drive disease progression in ET. We therefore sought to assess other affected neuronal and glial populations in the cerebellar cortex. Using pseudobulk differential expression, we observed the strongest effects in glial cells, mostly Bergmann glia, OPC, and astrocytes (Fig. 6A). Concordant with the finding of Bergmann gliosis in the ET cerebellar cortex 63 64 65, Bergmann glia displayed the largest number of DEGs. Bergmann DEGs were mostly related to gliogenesis and axon projection (Fig. 6D). Multiple genes related to gliosis such as ID4, GFAP, VIM and NES66, were all found to be significantly upregulated in Bergmann glia (adjusted pval < 0.05, log2FC > 0.5) (Fig. 6C). Given the lack of genetic enrichment of both common and rare variants in this population, we postulated that the effects observed in this population are secondary to neuronal insult as is observed in spinocerebellar ataxia67.
Bergmann gliosis in ET is thought to arise around sites of Purkinje cells degeneration or loss. Interestingly, we found few DEGs in Purkinje cells when comparing ET to control samples (Fig. 6A) despite sequencing a large number of Purkinje nuclei (37087 total Purkinje cells; 31,241 control Purkinje cells; 5,846 ET Purkinje cells (15.76%)). DEGs were mostly related to the post-synaptic membrane (adjusted p-value = 8.20e-03; Supplementary Fig. 6) including genes such as LRRTM4, GLRA3 and TENM2. This is in line with previous transcriptomic characterization of ET Purkinje cells, where only 36 genes were found to be dysregulated compared to healthy Purkinje cells68. Golgi cells were found to be the most affected neuronal population in our cohort (Fig. 6A; 575 DEGs with adjusted p-value < 0.05 & log2FC > or < 0.5). Golgi cells are the most abundant inhibitory interneurons and are crucial for the inhibition of granule cells69. We found that Golgi cell DEGs were related to synaptic transmission, axonal development, dendritic tree, and glutamate receptor activity (Fig. 6E & 6F). Given these findings we sought to investigate the differential cell-cell connections between Golgi cells and other cerebellar neurons using CellPhoneDB70. We found that ET Golgi cells had differential cell-cell communications with most cerebellar neurons, including granule cells, mainly through Teneurin- and Ephrin-signalling proteins (Supplementary Fig. 7). We found the strongest effect in ET differential cell communications to be between Golgi cells and PLIs via TENM2/3/4-ADGRL2 adhesions. Concomitantly, studying the cell-cell communications between oligodendrocytes subclusters and neurons highlighted numerus specific Oligo_4 and PLI communications through leucin-rich repeat (LRR) proteins and neurexins. These transmembrane proteins are known to be involved in myelination and axonal differentiation. In-depth functional characterization of these populations in both the healthy and ET cerebellum are lacking, but our results indicate that Golgi cells might display dysregulated synaptic communications with PLIs in ET.