Although the genetic etiology of RTT is largely associated with MeCP2 mutations, linking clinical symptoms to specific molecular mechanisms remains an unsolved issue. This may be because the molecular functions of MeCP2 are still only partially understood. MeCP2 was initially recognized as a methylated DNA binder [4]. It interacts with a wide spectrum of chromatin components and regulators and shows RNA binding activity [3, 5, 6, 82]. Recent experimental evidence strongly indicates that MeCP2 contributes to RNA maturation [21, 30, 31, 83–86]. The current view of MeCP2 is that of a multifaceted epigenetic driver involved in multiple layers of co-transcriptional regulation, some of which remain elusive. A deeper understanding of its functions is therefore essential for a better comprehension of RTT molecular mechanisms and for the development of new clinical interventions.
As recently stated, transcriptomic studies represent an excellent unbiased readout of RTT molecular mechanisms [87]. However, to date no genes or molecular pathways have been consistently associated with pathological MeCP2 mutations. Methodological differences in computational data analysis across studies are an important obstacle in comparative transcriptomics. Therefore, to perform a retrospective comparative analysis of published human and mouse RTT RNA-seq studies, we re-analyzed the raw input data by applying uniform pipelines for gene expression. Despite the known implication of MeCP2 in RNA processing [21, 30, 31, 83–86], much less attention has been devoted to AS variation in RTT, and AS profiling in RTT human biosamples is still missing. AS is an important source of variation in protein sequence and function in the mammalian nervous system [25], and it might be relevant for the pathophysiology of RTT [21, 30], as demonstrated in many other neurological disorders [88]. Our data mining for AS revealed significant alterations in the splicing profiles of RTT datasets. The lack of overlap between human and mouse DAS gene lists could be ascribed to the rapid evolution of RNA processing mechanisms and to the low conservation, from rodents to primates, of the splicing cis-regulatory sequences of orthologous genes [24, 25, 89]. By detecting AS dysregulation of human genes relevant to the disease, our analysis highlights the importance of investigating splicing to understand the pathobiology of RTT. Interestingly, our gene expression and AS analyses revealed that DEGs and DAS genes converge onto common cellular pathways. Functional analysis of human DEGs and DAS genes indicates a significant dysregulation of genes and pathways related to cell-extracellular matrix communication and to synapse structure and function. Mouse DEGs and DAS genes functionally converge on synaptic structure and function.
Overall, synaptic dysregulation was observed in both the human and the mouse transcriptomic analyses. Accordingly, in 50% of human datasets we observed a dysregulation of CD44 and CD47, at their expression and AS levels. Both genes encode transmembrane glycoproteins that mediate the cellular response to the microenvironment. CD44 is involved in cell-cell interactions, cell adhesion, and migration. In the brain, CD44 regulates the structural and functional plasticity of dendritic spines and coordinates neural circuit development [80, 91]. CD47, also known as a "don't eat me" signal, promotes dendritic and axonal development in hippocampal neurons and prevents excess pruning during postnatal development [92, 93]. Both CD44 and CD47 are autism-associated genes [64, 92]. MeCP2 has been shown to bind CD44 precursor RNA and regulate its splicing [30]. However, no altered CD44 splicing was observed in our analysis. Moreover, among the most frequent DEGs and DAS genes in human datasets, we identified ion channel regulators and subunits, such as SHISA6 and ANKRD36C. CACNA1G, which encodes a voltage-sensitive calcium channel subunit, is the only common DEG identified in 3 and 7 mouse datasets, respectively. CACNA1G is associated with various neurological disorders: autosomal-dominant cerebellar ataxia [95], idiopathic generalized epilepsy [96], spinocerebellar and cerebellar ataxia [97], and autism [98]. Recently, a pathological CACNA1G variant has been reported in RTT patients [99]. Changes in the expression of Irak1 were observed in the majority of MeCP2 KO mouse datasets, in agreement with previous studies [22, 100]. It should be pointed out that the Irak1 gene is located ~3 kb downstream of Mecp2, in the same transcriptional orientation.
As previously discussed, changes in Irak1 expression might be a direct consequence of the large deletion in the MeCP2 KO mouse line [53] (Mecp2tm1.Bird carries a large deletion of exons 3 and 4), which could alter the local chromatin conformation or remove a negative regulator of Irak1 [100]. No changes in Irak1 expression were observed in R306C MeCP2 mutant mice.
Our finding of altered AS of the Snhg14 and Firre ncRNAs in many mouse datasets is puzzling. Snhg14 is a host gene for the small nucleolar RNA genes Snord115 and Snord116 [78]; however, their expression was not altered in our mouse gene expression analysis. No changes in SNHG14 and FIRRE AS or expression were observed in human datasets. Both the mouse and human SNHG14/Snhg14 loci are imprinted and expressed only from the paternally inherited chromosome [78]. In humans, large deletions of the SNHG14 locus, as well as microdeletions of the SNORD116 locus, lead to the neurodevelopmental genetic disorder Prader–Willi syndrome (PWS) [78]. The relatively poor conservation of the IPW gene between human and mouse [101], and the fact that a paternally inherited deletion that includes the Ipw gene appears asymptomatic in mice [102], argue against any biological relevance of this ncRNA gene in PWS [103]. The syntenically conserved lncRNA Firre displays different expression and localization patterns in human and mouse [104]. In general, lncRNAs evolve rapidly and are therefore typically poorly conserved in sequence, and this rapid evolution might affect their potential functions [104]. This is likely the case for Snhg14 and Firre. We hypothesize that the observed altered splicing of Snhg14 and Firre might depend on a local chromatin conformation resulting from MeCP2 loss of function.
Limitations
The major limitation of this study is that our large-scale analysis includes RNA-seq datasets generated from highly heterogeneous biosamples. Transcriptomic data were obtained from biosamples carrying different MeCP2 mutations, and in some cases RNA-seq data were generated from multiple patients with different MeCP2 mutations. Although the MeCP2 mutant mice analyzed have more similar genetic backgrounds, RNA-seq data were obtained from various brain regions, primary neurons, and blood. This genetic and biological heterogeneity might explain the low number of shared DEGs and DAS genes identified. Nonetheless, we were able to detect common dysregulated genes relevant to the disease. Future studies will be required to clarify their role in the pathophysiology of RTT. In particular, single-cell transcriptomic analysis will help elucidate gene dysregulation specific to different types of neurons and brain regions. Additionally, our AS analysis is exploratory, aimed at investigating the overall dysregulation of AS in RTT. Detailed AS analysis of RTT-relevant genes will be required to characterize, at the molecular level, the functional impact of the altered processing of primary transcripts and its potential implications for the disease.