Single-cell multi-omics technologies offered the direct means to determine the relation between different layers of gene regulation1-4. However, spatial information in the tissue context, which is critical to understanding cellular function, is missing in single cell data. Furthermore, tissue dissociation leads to perturbation in cell state or introduces technical noise. Recently, spatial epigenomics, transcriptomics, and proteomics emerged to address this challenge5,6. Currently, most spatial technologies can capture only one layer of genomics information. Although it is possible to computationally integrate the data from multiple modalities7, the mechanistic link between different layers of omics can not be readily uncovered. Thus, it is highly desirable to realize joint profiling of multiple omics. Previously, we developed DBiT-seq for spatially resolved co-measurement of transcriptomics and a panel of proteins to link select proteins to genome-wide gene expression. To further investigate the epigenetic mechanisms underlying gene expression regulation, we aim to develop joint profiling of epigenome and transcriptome pixel by pixel in tissue. Herein, we present two of such technologies for spatially resolved genome-wide co-mapping of epigenome and transcriptome by simultaneously profiling chromatin accessibility and gene expression (spatial-ATAC-RNA-seq) or histone modification and gene expression (spatial-CUT&Tag-RNA-seq) on the same tissue section at cellular level via a microfluidic deterministic co-barcoding strategy to implement the chemistries for spatial ATAC-seq8 or CUT&Tag6 and spatial transcriptome. We applied them to the mapping of embryonic and neonatal mouse brain as well as adult human brain hippocampus to dissect epigenetic states or modifications in regulating cell type and dynamics in tissue. In general, distinct tissue features could be identified by either spatial epigenome or spatial transcriptome alone with high concordance, they do have differential roles in defining cell states. This work presents a versatile technology for spatial co-profiling of epigenome and transcriptome pixel by pixel in tissue at cellular level for identifying the mechanistic link between epigenetic state (chromatin accessibility or histone modification) and transcriptional phenotype directly in the tissue context. It opens new frontiers in spatial omics and may bring unprecedented opportunities to biological and biomedical research.
Technology design, workflow, and data quality
Spatial-ATAC-RNA-seq was realized by deterministic barcoding of transposase-accessible chromatin and transcriptome on the same tissue section5 as shown in Fig. 1a and Fig. S1. It started with a fresh frozen tissue section fixed with formaldehyde and then processed with the DNA adapter loaded Tn5 transposition complex. This adapter sequence containing a universal ligation linker was then inserted into the transposase accessible genomic DNA loci. Subsequently, a biotinylated DNA adapter containing a poly-T sequence and a universal ligation linker was added to the tissue surface to bind to mRNAs followed by reverse transcription (RT) in tissue. Next, we placed a microfluidic chip on the tissue section to introduce a set of spatial barcodes Ai (i = 1-50/100), each flowing through a different microchannel and then covalently linked to the DNA adapters through the universal ligation linker via templated ligation. Then, this microfluidic chip was replaced by another one that has microchannels perpendicular to the first flow direction to introduce another set of spatial barcodes Bj (j = 1-50/100) that were ligated to barcodes Ai (j = 1-50/100) using a similar ligation chemistry, resulting in a 2D grid of spatially barcoded tissue pixels, each of which had a unique combination of barcodes Ai and Bj (i = 1-50/100, j = 1-50/100, n of barcoded pixels = 2,500/10,000). Finally, barcoded cDNA and genomic DNA (gDNA) fragments were collected by reserve crosslinking. cDNAs were enriched with streptavidin beads and gDNA fragments were retained in the supernatant. Afterwards, gDNA and cDNA libraries were constructed separately during PCR amplification. We also developed spatial-CUT&Tag-RNA-seq for spatially resolved co-sequencing of histone modification and transcriptome on the same tissue section at cellular level. Similarly, this was implemented by conducting co-assay of cleavage under targets and tagmentation (CUT&Tag) and transcriptome on the same tissue area with the aforementioned microfluidic deterministic barcoding strategy5, as depicted in Fig. 1b and Fig. S2. First, the antibody against the target histone modification was incubated with the paraformaldehyde fixed tissue section on a glass slide. Then, a secondary antibody was added to improve the tethering of transposome (pA-Tn5). After transposome activation, the DNA adapter with the universal ligation linker was inserted into the histone mark-specific genomic DNA loci. The subsequent steps were similar to spatial-ATAC-RNA-seq for constructing gDNA and cDNA libraries for next generation sequencing.
We performed spatial-ATAC-RNA-seq experiments on an embryonic day 13 mouse embryo (E13) and human hippocampus (Fig. 1c, 50 μm pixel size). For the ATAC portion in spatial-ATAC-RNA-seq, we obtained a median of 17,960 (E13) and 9,898 (hippocampus) unique fragments per pixel, of which 16% (E13) and 15% (hippocampus) of fragments overlapped with TSS regions, and 11% (E13) and 11% (hippocampus) were located in peaks. The proportion of mitochondrial fragments is 1% for E13 and 20% for human hippocampus. For the RNA portion of spatial-ATAC-RNA-seq, the UMIs per pixel and detected genes were found to be an average of 1,038 genes (E13) and 1,200 genes (hippocampus) per pixel (Fig. 1d) and the total number of genes detected was 19,887 (E13) or 29,293 (hippocampus). We performed both spatial-ATAC-RNA-seq and spatial-CUT&Tag-RNA-seq experiments on mouse postnatal day 21 (P21) brain (Fig. 1c, 20 μm pixel size). For the ATAC (spatial-ATAC-RNA-seq) or CUT&Tag (spatial-CUT&Tag-RNA-seq) data, we obtained a median of 10,857 (ATAC) and 4,756 (CUT&Tag) unique fragments per pixel, of which 20% (ATAC) and 19% (CUT&Tag) of fragments overlapped with TSS regions, 24% (ATAC) and 19% (spatial-CUT&Tag-RNA-seq) located in peaks. The proportion of mitochondrial fragments was 0.1% for spatial-CUT&Tag-RNA-seq and 9% for spatial-ATAC-RNA-seq. For the RNA potion of spatial-ATAC-RNA-seq and spatial-CUT&Tag-RNA-seq, totally 19,859 genes (spatial-ATAC-RNA-seq) or 19,831 genes (spatial-CUT&Tag-RNA-seq) were detected with an average of 1,005 genes (spatial-ATAC-RNA-seq) and 1,145 genes (spatial-CUT&Tag-RNA-seq) per pixel (Fig. 1d) in the mouse brain tissues.
To further increase the mapping area, we also developed a new device to perform in tissue barcoding of 100x100 pixels and demonstrated it for spatial-ATAC-RNA-seq on an E13 mouse embryo sample (Fig. 1c, 25 μm pixel size). For the ATAC portion of the data, we obtained a median of 15,633 unique fragments per pixel, of which 15% of fragments overlapped with TSS regions, and 13% were in peaks. The proportion of mitochondrial fragments was 0.5%. For the RNA portion of the data, it detected 21,478 genes in total with an average of 550 genes per pixel (Fig. 1d) due to a lower sequencing depth. Moreover, the insert size distributions of chromatin accessibility (spatial-ATAC-RNA-seq) and histone modification (spatial-CUT&Tag-RNA-seq) fragments were consistent with the captured nucleosomal fragments in all tissues (Fig. 1e and Fig. S3a-c).
Mouse embryo: spatial co-mapping of chromatin accessibility and transcriptome to identify tissue feature, cell state and developmental dynamics
Spatial-ATAC-RNA-seq of a E13 mouse embryo identified eight major clusters using ATAC data and 15 clusters using RNA data. The distinct patterns of these clusters on the spatial map are in agreement with the tissue histology (see H&E staining of an adjacent tissue section, Fig. 2a and Fig. S5a). In the ATAC data, cluster 2 represents the eye in the mouse embryo with open chromatin accessibility for Six6 (Fig. 2d). Cluster 3 to 5 are associated with several developing internal organs. Cluster 6 to cluster 7 cover the peripheral and central nervous system (PNS and CNS). In order to benchmark the ATAC data, we projected the ENCODE reference ATAC-seq from different embryonic organs onto our ATAC UMAP embedding. In general, the clusters are in good concordance with the ENCODE ATAC-seq projection (Fig. S4a-d). Cell type-specific marker genes were identified for individual clusters and their expressions were inferred from chromatin accessibility (Fig. 2d-e and Fig. S5b-d). Sox2, which is involved in the development of nervous tissue especially for the optic nerve formation, showed high chromatin accessibility in the embryonic eye field but also the ventricular layer with enriched neural stem/progenitor cells. Myt1l, which encodes myelin transcription factor 1 like protein, was activated in the embryonic brain and neural tube. Nrxn2, which encodes Neurexin 2, a key gene in the vertebrate nervous system, was observed extensively in most neural cells. Six6, a key gene involved in eye development, showed highest activity and expression in the embryonic eye region. Rbfox3, a slicing factor well known as the nuclear biomarker NeuN, was observed extensively in all neural cells while Sox1 appeared to be enriched in the ventricular layer (Fig. S5). Cell type-specific enrichment of transcription factor (TF) regulators were also examined using deviations of TF motifs (Fig. S6). We observed that Sox2 transcription factor was enriched in cluster 8, consistent with its function in the embryonic brain development. Nfib motif was also enriched in the embryonic brain, which is necessary for proper brain development (Fig. S6a). The GREAT analysis further verified the strong concordance between gene regulatory pathways and anatomical annotation (Fig. S6c). We integrated spatial ATAC data with scRNA-seq data for cell type assignment in each cluster9 (Fig. 2b and Fig. S5e). For instance, radial glia (neural stem/progenitor cells) were observed predominantly in the ventricular layer and the differentiated cell types such as granule neurons and inhibitory interneurons were enriched in distinct regions of the embryonic brain (Fig. S5e-f). For the RNA data, 15 distinct clusters were identified and characterized by specific marker genes (Fig. 2a and Fig. S6d). Their spatial patterns also agreed with the tissue histology shown in the H&E staining from an adjacent tissue section (Fig. 2a and Fig. S5a). For example, cluster 11 correlated with the embryonic eye, and also has the same maker Six6 as identified in ATAC data (Fig. 2d). Clusters 2, 5, 7, 9, and 10 are related to PNS and CNS. Cluster 8 is associated with limb and bone development (Fig. 2d and Fig. S5b-d). The pathway analysis10 results are consistent with anatomical annotation (Fig. S7).
What is unique in this work is co-profiling of epigenome and transcriptome, which allows for investigating the correlation between accessibility peaks and expressed genes pixel by pixel and predicting the interactions between different regulatory regions. Some enhancers (Sox2, Mytl1, Sox1, Rbfox3, and Nrxn2) were found to have dynamically regulated promoter interactions (Fig. 2e). For example, enhancers for Sox2 and Mytl1, had higher chromatin accessibility specially in cluster 6 and 7, suggesting their roles in these regions to regulate Sox2 and Mytl1 expression. Moreover, we found that the some of the marker genes identified from ATAC data were enriched in part of the embryonic brain (for example, Nrxn2, and Myt1l) but not highly expressed in RNA data (Fig. 2d and Fig. S5b), which may indicate the lineage priming of these genes in embryonic brain7. The discordance between chromatin accessibility and gene expression proves the importance of spatial multi-omics co-profiling, which can help decipher the epigenetic mechanisms regulating gene expression. The integration of the RNA data with the scRNA-seq data9 was also performed to assign the cell identities to each cluster (Fig. 2b and Fig. S5f). We observed that the radial glia, granule neurons, and inhibitory interneurons were present in the same major clusters as shown in the ATAC analysis, which verified the power to define cell identities from multiple omics information. It was also attempted to use joint clustering of the ATAC and RNA data to refine the spatial patterns. For example, we identified cluster 11 (granule cells) in the joint clustering analysis, which was not readily resolved by single modalities alone (Fig. 2a, right). This result highlights the unique value to use joint multi-omics profiles for improving the cell-type-specific spatial mapping11.
To investigate the dynamic relationship between chromatin accessibility and gene expression during embryonic development, we analyzed the differentiation trajectory from radial glia cells to various types of neurons12 such as inhibitory interneurons. The pseudotime analysis13 was conducted on both ATAC and RNA data in spatial-ATAC-RNA-seq, and the results identified interneuron heterogeneity and the developmental trajectories, which were directly visualized in the spatial tissue map (Fig. 2f, g). The gene activity scores for chromatin accessibility and gene expression along this developmental trajectory were computed and the dynamic changes in select marker genes were presented (Fig. 2h, i). Overall, the gene expression exhibited a similar temporal tendency as the chromatin accessibility, but showed a slower pace during the developmental process, such as Sox11 and Nfix, in agreement with epigenetic lineage-priming of gene expression7. Thus, our spatial-ATAC-RNA-seq can be used to decipher the gene regulation mechanism and dynamics during tissue development.
Postnatal mouse brain: spatial co-mapping of chromatin accessibility or chromatin modification with transcriptome at the cellular level
We conducted spatial mapping of P21 mouse brain coronal sections (at Bregma 1) for both chromatin accessibility and histone modification jointly with transcriptome. Either ATAC or RNA data identified 9 clusters with distinct spatial distribution in agreement with anatomical annotation defined by the Nissl staining (Fig. 3a and Fig. S10a). Spatial clusters between ATAC and RNA showed strong concordance in cluster assignment (Fig. S15b). For ATAC data, unique chromatin accessibility resolved clusters within specific regions. For example, medium spiny neurons (Pde10a, cluster 1), corpus callosum (Sox10, Mobp, and Tspan, cluster 3), and the ventricular zone (Notch1, cluster 7) (Fig. 3c and Fig. S8a). Cell type-specific chromatin regulatory elements were analyzed (Fig. S11) that may act as cell type-specific reporters. The GREAT analysis confirmed that the pathways correlated with the tissue functions in different anatomical regions (Fig. S11b). For the RNA data, the clusters also showed specific gene expression within individual regions. For example, medium spiny neurons (Pde10a, cluster 1), and corpus callosum (Mobp, and Tspan2, cluster 4) (Fig. 3c and Fig. S8b).
Integration of single-cell ATAC-seq of the mouse brain atlas data14 with our spatial ATAC data was carried out (Fig. 3d) to identify all major cell types15 and then used the label transfer to assign cell types to spatial locations where epigenetic state may control specific cell type formation (Fig. S10b). For example, an enrichment of mature oligodendrocytes (MOLs) was observed within corpus callosum. A thin layer of ependymal cells (EPENs) was identified in the ventricular zone, and medium spiny neurons (MSNs) were found to be enriched in striatum (Fig. S10b). We then integrated our spatial RNA data with scRNA-seq15 and again used label transfer to identify the dominant cell types in each cluster (Fig. 3e, j and Fig. S10b). We observed a high degree of concordance in cell type identification between integrated ATAC and integrated RNA data analyses, suggesting a general congruence between chromatin accessibility and transcriptome to define cell identities in tissue. For example, the ependymal cells (EPEN), medium spiny neuron (MSN), and mature oligodendrocytes (MOL) revealed by the RNA data were enriched in the same regions as identified by the ATAC data in spatial-ATAC-RNA-seq (Fig. S10b). Furthermore, the detection of a thin layer of EPEN cells in the ventricular zone demonstrated a high spatial resolution for our technology to identify low abundance cells at near-single cell resolution. The result was further improved by joint clustering of ATAC and RNA profiles (Fig. 3a, bottom).
Joint profiling of ATAC and RNA allows for searching the correlated peak accessibility and gene expression to predict putative enhancers. We detected 21,791 significant peak-to-gene linkages between regulatory elements and target genes (Fig. S10d and Fig. S15a). Some potential enhancers with dynamically regulated promoter interactions were found to be cluster-specifically enriched, such as Sox10 (cluster 3, and cluster 8), Notch1 (cluster 3, cluster 6, cluster 7, and cluster 8), Tspan2 (cluster 3), and Mobp (cluster 3) (Fig. 3k and Fig. S8c), indicating the ability for spatial-ATAC-RNA-seq to identify the key regulatory regions. The annotation of cell type-specific chromatin regulatory elements and TF regulators (Sox10, Atoh7, Nifx, Nfic, and Neurod6) in relation to the spatial pattern provided further insights to the regulatory factors in the tissue context (Fig. S11a and Fig. S12). For example, Sox10 was examined for chromatin accessibility, gene expression, putative enhancers, and TF activities simultaneously, permitting more comprehensive understanding of gene regulation dynamics. We found that some highly active marker genes with open chromatin accessibility (i.e., Notch1) were not highly expressed for transcription (Fig. 3c), suggesting the possibility of linage-priming of these genes in the brain tissue development7.
In addition to chromatin accessibility, histone modifications are also important aspects in epigenetic regulation. Spatial-CUT&Tag-RNA-seq was performed to profile transcriptome and H3K27ac (activating enhancers and/or promoters) in P21 mouse brain with 20 μm pixel size, which identified 7 and 9 specific clusters for CUT&Tag and RNA, respectively (Fig. 3b). Overall, these clusters agreed with the anatomical regions defined by the Nissl staining from an adjacent tissue section (Fig. 3b and Fig. S10a) and also good concordance between CUT&Tag and RNA in terms of spatial patterns (Fig. S16b). For the CUT&Tag data, the cluster specific active genes were shown in gene activity score (GAS) (Fig. 3c and Fig. S9a). For example, Sox10, Mal, and Cldn11 showed high gene activity score in cluster 3 (corpus callosum), Notch1 activity enriched in cluster 2 (the ventricular zone), and Pde10a highly active in cluster 4 (medium spiny neuron). We also identified TF motifs in H3K27ac modification loci, which could help better understand the most active regulatory factors across all clusters (Fig. S13a and Fig. S14) and the pathway enrichment by GREAT agreed with the tissue function in different anatomical regions (Fig. S13b). The RNA data clusters showed unique signature within specific marker gene regions (Fig. 3c and Fig. S9b) as exemplified by medium spiny neurons (Pde10a, cluster 4) and corpus callosum (Sox10, Cldn1, and Mal, cluster 2) (Fig. 3c).
Integration of our CUT&Tag data with the scCUT&Tag data16 (Fig. 3f, i) also allowed for label transfer15 to assign epigenetic cell identities/states to spatial location (Fig. S10c). We also observed an enrichment of MOL1 within the corpus callosum, a thin layer of EPEN in ventricular zone, and MSN1 in the striatum, in agreement with spatial-ATAC-RNA-seq data (Fig. S10c), suggesting the important role for H3K27ac to control the development of all major tissue types in this brain region. We also integrated the RNA part of the spatial-CUT&Tag-RNA-seq data with scRNA-seq15 to identify dominant cell types in each cluster via label transfer (Fig. 3g, j and Fig. S10c). MOL1, a thin layer of EPEN, and MSN1 were enriched in the same spatial regions recognized by CUT&Tag (Fig. S10c). Joint clustering of CUT&Tag and RNA could further enhance the resolution of spatial clusters (Fig. 3b)11. To directly interrogate the interactions between specific gene expression and the corresponding enhancers across all clusters, we identified a total of 16,973 significant peak-to-gene linkages (Fig. S16a) such as Sox10 (cluster 3), Notch1 (cluster 2, cluster 3, cluster 5, and cluster 6), Mal (cluster 3), and Cldn11 (cluster 3) (Fig. 3l and Fig. S9c). We also found that some of the highly activated marker genes identified by CUT&Tag (i.e., Notch1) were not highly expressed (Fig. 3c), which may also suggest the lineage-priming7, in this case, mediated by H3K27ac.
Human brain hippocampus: co-mapping of accessible chromatin and transcriptome to dissect cell type, dynamics, and gene regulation
We performed spatially resolved co-sequencing of epigenome and transcriptome (50 μm pixel size) on adult human brain hippocampal formation, which is a highly complex brain region involved in high level cognitive function and neurological diseases such as major depression disease (MDD) and Alzheimer’s disease (AD). Neuronal cell types, fate, and dynamics in this region remains poorly understood. We identified 7 and 8 major clusters for ATAC and RNA, respectively, and the spatial distribution corresponds well to the major anatomical regions by the Nissl staining (Fig.4a). For the ATAC data, cluster 4 is the granular cell layer (GCL) (THY1, BCL11B) and cluster 6 is the choroid plexus (Fig. 4a, d). The TF motifs (ID4, and MESP1) and their spatial patterns were analyzed to identify the regulatory factors in different tissue regions (Fig. S17b). For the RNA data, we also detected distinct clusters with unique marker genes (Fig. 4a, d) such as PROX1 and BCL11B enriched in cluster 4 (GCL).
To directly interrogate the gene regulatory interactions between regulated promoters and specific enhancers, the correlation of peak accessibility and gene expression was analyzed (Fig. 4e) and identified some enhancers with dynamically regulated promoter interactions, such as SOX2 (cluster 0, cluster1, and cluster 3). This also suggests that spatial-ATAC-RNA-seq allows for mapping chromatin accessibility dynamics at different regulatory regions directly in the spatial tissue context such as the human hippocampus.
We also conducted data integration with single-cell ATAC or RNA sequencing. Combining the ATAC part of our data with scATAC-seq17 revealed the dominant cell identities in individual clusters (Fig. 4b). In addition, the RNA part of our data was integrated with scRNA-seq data18 to assign transcriptional cell types to each cluster via label transfer (Fig. 4c and Fig. S17a). The granule cells were detected clearly in the GCL, the cornu ammonis (CA) neurons were enriched in CA3-4 regions, and the vascular and leptomeningeal cells (VLMC) were strongly distinguished in choroid plexus (ChPx) as opposed to other regions.
In general, both ATAC and RNA can readily resolve all major tissue features in this region but spatial co-sequencing of epigenome and transcriptome can provide new insights into the dynamic gene regulation mechanism which cannot be realized by single modalities. For example, PROX1, a signature gene defining the granule neuron identity during pyramidal neuron fate selection19, is indeed highly expressed in GCL but showed low chromatin accessibility (Fig. 4d). This could be attributed to a minimal demand to synthesize new PROX1 transcripts in postmitotic mature granule neurons and thus not needed to maintain an active open chromatin state for PROX1.
To verify this hypothesis, we conducted RNA velocity analysis to examine cell dynamics by quantifying tissue region specific RNA turnover rate (Fig. 4f). Overall, the RNA velocity map identified the major branches corresponding to, for example, GCL of the dentate gyrus (DG) and CA region, which is consistent with the neuronal developmental lineage relationship as well as the anatomical and functional subdivisions of hippocampus19. Using RNA velocity to examine the dynamics of selected genes (Fig. 4g) confirmed that PROX1 has a low RNA turnover rate despite highest expression level in GCL. Together with the discordance between chromatin accessibility and gene expression (Fig. 4d), these results suggested that PROX1 activity is differentially required in regions during adult human brain hippocampus in agreement with the findings in murine studies20. PLP1, which plays an important role in oligodendrocyte development and myelin sheath maintenance, showed high expression in oligodendrocytes enriched white matter region but relatively low chromatin accessibility, which also correlated with a low level of RNA turnover (Fig. 4g). BCAS1, a key gene required for myelination, had high chromatin accessibility in the white matter region but the gene expression was low, which corresponded to a high level of RNA turnover despite low gene expression (Fig. 4g). Cell dynamics revealed by RNA velocity for PROX1, PLP1, and BCAS1 validated our findings from the discordance between chromatin accessibility and gene expression. In general, high RNA velocity is associated with more accessible chromatin while low RNA turnover was detected in closed chromatin. These results further highlighted the importance of spatial multi-omics mapping to fully dissect not only cell state but also developmental dynamics in a highly heterogeneous tissue.