APA site usage is an understudied aspect of gene regulation. Although APA sequencing can reveal changes in overall gene expression, it is designed to focus on changes in APA usage and cannot reveal differences in splicing or transcription start sites (TSSs). Conversely, bulk RNA-seq analysis often ignores APA, TSS and splice isoforms and simply assesses reads per gene. At present, it would be very difficult to enumerate copies of all the mRNA isoforms for each gene. Yet appreciation is growing for the importance of APA sites in regulating mRNA stability17,42, mRNA/protein localization20,43,44, and human disease31,45.
Rhythmic APA site usage has been uncovered in the mouse liver22,23,46, and in temperature-entrained cultured cells, circadian APA usage occurs in many genes and can regulate expression of specific central clock genes24. Still, alternative poly(A) site usage has received little attention in the sleep and circadian field. We therefore initiated this investigation into the conjunction of APA with sleep and diurnal expression. As far as we are aware, the current study is the first to examine APA sites related to circadian rhythms and sleep pressure in any mammalian brain. Data from this study can translate into biological relevance in several ways, as described in the examples below.
Here, we observed that 6% of all PASs cycled with a 24 h period. One of the top pathways identified for the diurnal APA gene set was 'circadian entrainment' (Table 2). Since transcription-translation feedback loops are central to circadian regulation, this may not be surprising, but APA site usage suggests a more complex role24,46. For example, we find that one Sin3b APA follows a diurnal rhythm (Fig. 3a, b). Sin3b encodes short and long variants conserved in mammals. The short variant binds to CRY1 but cannot bind HDAC147. The long isoform is implicated in regulation of Per1/Per2 transcription48, along with many other genes49. In our data, long Sin3b APA reads constitute the predominant isoform at ZT6 and ZT22, while the short, diurnal isoform is the most abundant one at ZT10, ZT14 and perhaps ZT2 (Fig. 3b). Sin3b transcript levels in mouse hippocampus have previously been reported to be affected by sleep deprivation50, although this effect was not observed using TRAP-seq51, suggesting post-transcriptional processing can lead to changes in sleep-dependent differential expression. Together with our work, this example highlights the importance of utilizing various "-omic" approaches to properly decipher the complexity of molecular processing tied to changes in behavioral state in the brain.
Additional significant pathways emerged from the diurnal APAs, such as Oxytocin, Ephrin, and MAPK signaling, which have demonstrated links to the circadian clock52–54. In the GO analysis of the diurnal genes with multiple PASs, terms related to the synapse (12), protein localization (6), and vesicles (7) were enriched (Table 2 and Supplementary Table S3), suggesting that APAs are poised to affect neural communication.
A large proportion of diurnal APAs had expression peaks around ZT20 (Supplementary Fig. S1). Considering that rats are nocturnal, this is similar to what has been seen for bulk transcripts in several human tissues, including brain55. Interestingly, among the identified diurnal APA sites, 3 were in genes for RNA-binding proteins (Celf2, Elavl3, and Rbfox1) whose expression correlates with more distal APA usage47. Peak expression of these three genes occurs from ZT21 to ZT1, so it would be interesting to see whether transcripts of predicted targets tend to be longer at these times.
In addition to the 24 h circadian rhythm, recent studies have also demonstrated the existence of cell-autonomous ultradian clocks that run independently of the circadian clock to regulate 12 h oscillations in gene expression and metabolism35–39. Here we found that 5% of all PASs cycle with a 12 h period. Further analysis of these genes showed enrichment of gene ontology terms and pathways such as "regulation of trans-synaptic signaling" and "protein-protein interactions at synapses" (Supplementary Table S6), indicating that APAs could function to regulate cyclic actions of cell signaling and communication.
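To make the distinction between 24 h and 12 h cycling concrete, the sketch below fits a single-component cosinor model to one PAS profile at each candidate period and compares the fits. This is an illustration only: the cosinor model is an assumption chosen for simplicity and is not stated in this section to be the rhythm-detection method used in the study. The function name `cosinor_fit`, the sampling grid (ZT18 in particular is assumed) and the toy expression values are all hypothetical.

```python
import numpy as np

def cosinor_fit(t_hours, y, period_hours):
    """Least-squares fit of y = mesor + amplitude * cos(2*pi*(t - peak_time)/period)."""
    t = np.asarray(t_hours, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 2.0 * np.pi / period_hours
    # Linearized design matrix: intercept, cosine and sine terms.
    X = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    mesor, b_cos, b_sin = coef
    amplitude = np.hypot(b_cos, b_sin)
    # Time of the fitted peak, wrapped into [0, period).
    peak_time = (np.arctan2(b_sin, b_cos) / w) % period_hours
    resid = y - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return {
        "mesor": float(mesor),
        "amplitude": float(amplitude),
        "peak_time": float(peak_time),
        "r2": float(r2),
    }

# Hypothetical sampling times in ZT hours (assumed 4 h spacing).
zt = np.array([2, 6, 10, 14, 18, 22], dtype=float)
# Toy, noise-free profile peaking at ZT20; these are not real data.
toy = 100.0 + 30.0 * np.cos(2.0 * np.pi * (zt - 20.0) / 24.0)

fit24 = cosinor_fit(zt, toy, 24.0)  # diurnal model
fit12 = cosinor_fit(zt, toy, 12.0)  # 12 h model
print(fit24)
print(fit12)
```

With real read counts, a comparison of this kind would also need replicate-aware significance testing and multiple-testing correction across all PASs, which this sketch omits.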
Gene expression studies following changes in sleep homeostasis have largely ignored alternative polyadenylation. Of the 31,795 total PASs characterized in rat forebrain in our study, we determined that 2.5% were differentially expressed with sleep deprivation and recovery sleep. We also observed 6 GO terms significantly enriched following 6 hours of sleep loss and 26 following 4 hours of recovery sleep (Table 3).
Human APA isoforms have been linked to many neurological disorders31. Among the genes that we identified as having rhythmic expression of APA sites or APA sites affected by sleep pressure, we found that 46 have also been correlated with brain disorder susceptibility (Table 4). For example, the human MAPT/TAU gene produces transcripts containing short or long 3' UTRs, and a 3' single-nucleotide polymorphism (SNP) is associated with both 3' UTR length and risks for 8 neurological disorders, including Alzheimer's and Parkinson's diseases31. Homozygosity of the more common SNP variant is associated with short MAPT 3' UTRs, homozygosity of the less common SNP variant is associated with long 3' UTRs, and heterozygosity is associated with 3' UTRs of intermediate lengths. In our rat APA data, we identified both short and long 3' UTR forms (5 in total) of the Mapt gene (Fig. 3c, d). Only two are currently annotated in the rat genome, and one of the newly discovered APAs was observed to cycle with time-of-day. In mouse, binding of the ALS-associated protein TDP-43 to two sites in the 3' UTR of Mapt has been shown to destabilize the mRNA56. In Alzheimer's disease, the expression level of TDP-43 protein is often low, and TAU is overexpressed and eventually forms neurofibrillary tangles. The two TDP-43 binding sites that were experimentally determined in mouse are conserved in sequence and position in the rat gene, implying that transcripts with shorter 3' UTRs would not be affected by TDP-43, while longer ones could be destabilized56,57. The presence of at least one putative TDP-43 binding site in the human MAPT 3' UTR suggests that this mechanism may contribute to the neurological disorder risk.
Ntrk2 is among the APA TWAS genes linked to anxiety31 and has been associated with autism in other studies58. We found strong time-of-day oscillations of the 2 most abundant APA sites of the short, tyrosine kinase deficient (TK-) Ntrk2 isoform. The TK- isoform of Ntrk2 has several known functions, including a dominant negative effect on the full-length TK+ isoform during neuronal proliferation, differentiation, and survival. In addition, the TK- version promotes filopodia and neurite outgrowth; sequesters, translocates, and presents BDNF; and affects calcium signaling and cytoskeletal modifications in glia59. Our WTTS-seq data revealed short, medium, and long 3' UTRs in the rat Ntrk2 TK- isoform (Fig. 3e). In mice, the longer Ntrk2 TK- transcripts are preferentially targeted to apical dendrites60. Since the sequence of the rat 3' UTR is highly conserved with the mouse sequence, it is plausible that an analogous dendritic localization mechanism also operates in the rat (Fig. 3e). Interestingly, 'Ntrk signaling' was one of the pathways over-represented in the diurnal APA genes (Supplementary Table S3). APA sites in Src, Frs2, Atf1, Nras, Sh3gl2, Ntrk3, Mapk1, Grb2, Pik3r1, and Mapk14 contributed to this enrichment.
Four different APAs from the Sorl1 gene exhibited significant changes in our analyses: two were diurnal, one cycled with a 12 h period, and one was reduced during recovery from sleep deprivation (Fig. 4). In total, there were seven APAs in the Sorl1 3' UTR, three short, one medium and two long. The longest and most abundant isoform cycles with a 12 h period, the second longest and medium ones are diurnal, and the shortest isoform is differentially expressed after SD (Fig. 4). SORL1 encodes an endosomal recycling receptor61, and a deficiency of SORL1 as well as many polymorphisms are strong risk factors for AD62,63. The mouse and human 3' UTRs share extensive similarities, including 5 APAs in mouse and 3 in human based on the PolyA_DB v3 (https://exon.apps.wistar.org/polya_db/v2/) and UCSC database64. Four microRNA binding sites with high probability of preferential conservation are in good alignment (TargetScanHuman v8.0)65. The first motif can be bound by five miRNAs (miR-25-3p, miR-32-5p, miR-92-3p, miR-363-3p, and miR-367-3p), while the second contains overlapping 7mer and 8mer motifs bound by miR-128-3p and miR-27-3p, respectively. The final two more distal sites are recognized by miR-153-3p and miR-137 (Fig. 4a). Sequences matching the consensus binding site for CPEB are present in the 3' UTRs of all three species, with 2 in very good alignment. Cytoplasmic polyadenylation element binding protein (CPEB) facilitates mRNA trafficking to synapses and local translation66,67, and we have previously shown that the core clock-controlled Fabp7 mRNA68,69 contains functional CPE sites in its 3' UTR to regulate translation70.
Because APOE4, an apolipoprotein E variant with increased risk of AD71, disrupts FABP7 interaction with sortilin (an APOE receptor similar to Sorl1) to interfere with neuroprotective lipid signaling72, circadian variation in local translation of CPEB-mediated polyadenylation of target mRNAs may be a generalizable mechanism that modulates AD susceptibility through downstream lipid pathways. Any one or more of these conserved features could lead to conserved functional consequences dependent on APA choice.
One caveat to our approach is that WTTS-seq generates Ion Torrent PGM sequences, which may retain more noise than Illumina platform reads, and since only Illumina has the option of paired-end reads, there can be more uncertainty in mapping Ion Torrent reads. Our strategy was to capture the maximum number of PASs, including novel PASs. Because the rat genome is not as thoroughly annotated as those of some other vertebrate species, we included potentially intergenic reads. In our analysis, we found 5,122 PASs and 318 diurnal PASs that mapped outside of known genes, and many APAs within genes mapped to regions in which 3' ends have yet to be annotated. Based on prior WTTS-seq data sets and other PAS mapping approaches, some portion of our PASs could be method-based artifacts27,73 (see Zhou et al.27 Figs. 3, 4 and 5). In this initial PAS survey, we assayed a large portion of the brain. Therefore, future studies in restricted brain structures or cell types will be required to uncover APAs that cycle or are differentially expressed at a finer scale. Overall, the newly discovered PASs should add valuable insights into regulation of the rat transcriptome and help characterize PAS usage in the mammalian brain.
Here we used an unbiased discovery-based approach to uncover novel APA usage following time-of-day or changes in sleep pressure in mammalian brain. These data support a call for additional work to elucidate the core mechanisms of PAS usage in the brain and to examine the capacity of APA to affect the transcriptomes and proteomes that regulate central brain processes known to be altered by time-of-day and sleep/wake homeostasis. Moreover, it is known that PAS usage varies across brain region and cell type21 (i.e., substructure-, circuit-, laminar- or nucleus-specific)74. These hypothesis-generating data provide an impetus for continued research aimed at delineating how sleep and circadian rhythms impact mental health and neurodegenerative disease.