Human postmortem brain tissues.
Human fresh frozen brain blocks were provided by the Arizona Study of Aging and Neurodegenerative Disorders/Brain and Body Donation Program at Banner Sun Health Research Institute65 and the New York Brain Bank at Columbia University Medical Center66. The demographics of human cases used in this study are listed in Supplementary Table 1. These specimens were obtained by consent at autopsy and have been de-identified and are IRB exempt so as to protect the identity of each patient. Frozen sections (10 μm) were cut from frozen blocks under RNase-free conditions.
Reagents
Human/murine phospho-tau pSer202/Thr205 (AT8, Cat# MN1020), Tau46 (Cat# 13-6400), Alexa Fluor dye-labeled cross-absorbed donkey secondary antibodies, and TRIzol RNA isolation reagent (Cat# 15596026) were purchased from ThermoFisher Scientific. Goat anti-Olig2 (Cat# AF2418) and GAD1 (Cat# AF2086) polyclonal antibody were purchased from R&D Systems. Rabbit anti-GFAP (Cat# G9269), WFS1 (Cat# 1158-1-AP), Aβ (Cat# Ab2539), and TauC (Cat # A0024) polyclonal antibodies were purchased from Sigma-Aldrich, Proteintech, Abcam, and DAKO, respectively. Rat anti-P2RY12 (Cat# 848002) was purchased from BioLegend. RNAscope HiPlex Ancillary Kit (Cat# 324120) and human-specific RNA probes were purchased from Advanced Cell Diagnostics. TrueBlack lipofuscin autofluorescence quencher (Cat# 23007) was purchased from Biotium. Fluoromount-G Mounting Medium (Cat# 0100-01) was purchased from SouthernBiotech.
Sample selection and preparation for ST
Thirty milligram brain tissue was homogenized in 500 µl TRIzol RNA isolation reagent. RNA was extracted from each homogenate by following the TRIzol RNA extraction procedure. The RNA quality of each sample was assessed by RNA integrity number (RIN) via Agilent 2200 TapeStation system. Samples with a RIN higher than 6.0 were selected and cut into 10 µm sections, which were subjected to IF staining of total tau (TauC antibody) and phosphorylated tau level (AT8, pS202/T205 tau antibody). Regions with the highest TauC+ and/or AT8+ staining were scored within 6.5 mm × 6.5 mm selected square area.
The time course of the tissue permeabilization for each sample was determined by 10x Genomics Visium Spatial Tissue Optimization (STO) Kit (Slide Kit, Part# 1000191. Reagent Kit, Part# 1000192). Briefly, sections from selected square areas of the brain tissue were mounted on the capture areas of STO slides. Sections were fixed in the -20°C prechilled methanol for 30 min and subjected to H&E staining. The 20x tiles image for each sample was obtained by Zeiss Axio Observer microscope. After imaging, sections were permeabilized with permeabilization enzyme for varying amounts of time, and the reverse transcription (RT) was performed directly on the slide using Cy3-nucleotides. After RT, sections were imaged and aligned with H&E images. The best time course was chosen for ST if its image shows the strongest Cy3 signal and the minimum signal diffusion.
ST processing
Each optimized human postmortem brain tissue was cryosectioned at 10 µm continuously for seven sections and was labeled sequentially as No.1 to No.7. The middle section (No.4) was mounted on the Visium Gene Expression (GE) slide for ST profiling, sections No. 2, 3, 5, 6 were used for IF staining of cell-type markers and AD pathology. The outmost sections (No.1 and No.7) were used for the RNAscope HiPlex assay. ST were performed using the 10x Genomics Visium Spatial Gene Expression Kit (Slide Kit, Part# 1000188. Reagent Kit, Par# 1000189). The procedures prior to RT are the same as described for STO. For RT, cDNA was synthesized using nucleotides without the label of Cy3. By using Template Switch Oligo, the second strand cDNA (ss cDNA) was synthesized according to cDNA templates captured on the poly T probes. The ss cDNA was then denatured by 0.08 M KOH, washed off, and then amplified using PCR. The cDNA quality control was performed using an Agilent Bioanalyzer high sensitivity chip. After the cDNA concentration was determined, the sequencing library of each sample was constructed using the 10x Library Construction kit (Part# 1000196). Briefly, optimized cDNA was obtained by enzymatic fragmentation and size selection. P5, P7, i7 and i5 sample indexes and TruSeq Read 2 were added via End Repair, A-tailing, Adaptor Ligation, and PCR. The cDNA library with correct sizes was then selected using the SPRIselect reagent (Beckman Coulter, Part# B23318). In order to meet the required sequencing depth, cDNA libraries from four samples were pulled in a NovaSeq6000 SP v1.0 flowcell and paired-end sequencing were performed on an Illumina NovaSeq6000 sequencer at the Genomics Services Laboratory at Nationwide Children’s Hospital.
IF staining
As described above, sections No.3&5, 2&6, and 1&7 were 0 µm, 10 µm, and 20 µm apart from the GE section, respectively. Section No.3 was sequentially stained with P2RY12/GFAP/AT8, section No.5 with Olig2/Tau46/Aβ, section No.2 with WFS1/AT8, and section No.6 with TauC/GAD1. Sections No.1&7 were used for the RNAscope HiPlex assay. IF staining was performed as previously described11. All sections were air dried at 60°C for 10 min and then fixed and permeabilized by prechilled acetone at -20 °C for 15 min. For sections No.3&5, slides were immersed in 1x PBS for 3 hours at 37°C for antigen retrieval. For sections No.2&6, antigen retrieval was performed by incubating slides in 10 mM sodium citrate (pH6.0, 95°C) for 12 min. Antibodies were incubated sequentially to avoid false-positive results coming from co-immunostaining. On day 1, after 1-h blocking by 10% donkey serum (in 1x PBS), P2RY12 (1:1000) and Olig2 (1:500) primary antibodies were applied on section No.3 and 5, respectively, overnight at 4°C, while GAD1 (1:100 in PBS with 10% donkey serum) antibody was applied to sections No.2 and 6 overnight at 4°C. On day 2, after three washes with 1x PBS, secondary antibodies were incubated with corresponding sections for 2 hours at 37°C. The nuclei were stained with Hoechst33342. Autofluorescence was quenched with 0.5x TrueBlack solution in 70% ethanol for 10 min. The coverslips were mounted with Fluoromount-G Mounting Medium, and the slides were then imaged by Zeiss Axio Observer microscope. Following 1-h re-blocking with 10% donkey serum in 0.3% PBST, primary antibody combinations for each section were incubated at 4°C in the same way as described above for regular IF staining.
H&E and IF staining image alignment
The 20x tiles images taken by Zeiss Axio Observer microscope were exported into merged and individual channel images by Zen software (v3.2 blue edition). The merged channel images were landmarked according to hematoxylin staining in corresponding H&E images using Fiji “Multi-points” tool. Multi-points landmarks in each merged channel image were then saved as selection and transferred to individual channel images. Image alignment for each channel to H&E was performed using the “Transform/Landmark correspondences” plugin in Fiji. H&E image and transformed channel images were stacked into .tiff image and imported into Loupe Browser (v5.0.1) and Space Ranger (v1.2.2) for further alignment with ST spots.
Unsupervised and supervised annotation of cortical layers and the white matter of the human MTG
To assign ST spots to their corresponding layers, we first combined all 17,203 spots from our four samples and generated a cross-sample unsupervised Uniform Manifold Approximation and Projection (UMAP) via Seurat (v.3.2.2)67 to visualize two dimension-reduced data. All spots were clustered into eight groups and visualized in the original sample using eight colors assigned to each cluster. Although UMAP can roughly reflect the cortical layers’ information and separate the white matter from the gray matter well, it is not powerful to segregate six layers in the gray matter of cortex. Since certain layer(s) may be missing during the brain dissection or mounting brain sections on the GE slide, arbitrarily assigning clusters from unsupervised UMAP into a sample may not be accurate without manual annotations. Moreover, the pathology development in AD brains may also change the gene expression profile and introduce more confounders in clustering. To solve this problem, we used both UMAP and IF staining as references to manually label each cortical layer and the white matter.
First, a UMAP plot for each sample was generated by Loupe Browser to avoid batch effects across samples. In order to assign UMAP spots to corresponding layers, previously identified layer-specific marker genes were chosen to highlight regions of certain layers on the UMAP. A group of marker genes for each layer was shown as a feature combination using a method of Feature Sum. The outstanding spots with higher expression of feature combination genes were selected and temporally labeled with that layer. The boundary of each layer would then be verified in spatial images with different IF staining, which were used as the ground truth. For example, three previously defined layer 2 markers, C1QL2, RASGRF2, and WFS1, were shown on the UMAP17,36. The highlighted spots with higher expression of those three genes were assigned to layer 2 in the category of Layer. Those spots would then be visualized on the image with WFS1 staining to define a clear boundary between layer II and layer I/III. Using the same strategy, the gene expression of RELN and the IF staining of GFAP (astrocytes) were used to define layer 1. Similarly, the gene expression of MAPT, MBP and MOBP (oligodendrocytes), and the IF staining of total tau via TauC and Tau46 as well as the IF staining of Olig2 (oligodendrocytes) were used to define the gray and the white matter, respectively. Since layer IV is an internal granular layer and a distinct layer with dense nuclei that could be identified in both H&E and DAPI staining, we also include the nucleus staining from H&E and DAPI as ground truth when using the canonical layer 4 marker genes (e.g., RORB and PDYN) to label layer IV. In addition, PCP4 was used as a marker gene for layer V, while PCDH17, TLE4 and FOXP2 were used as marker genes for layer VI. Although no IF staining is available in layer V and layer VI, the boundary between those 2 layers are already clear enough by only using layer-specific marker genes. All the spots across four samples were allocated to a certain layer or assigned as noise if their definition was too vague based on UMAP or IF staining. A total of 250 spots were dropped as the noise across four samples.
Selection of AD pathology-associated spots and adjacent 3-level spots.
After the alignment of all 11 channels, the signal from each channel was adjusted to clearly present the staining. ST spots with AD pathology (Aβ+, AT8+, Aβ+/AT8+, GFAP+/Aβ+/AT8+, or P2RY12+/Aβ+/AT8+) in AD cases were selected based on Aβ, AT8, GFAP and P2RY12 staining. Since Aβ and AT8 were not co-stained on the same section, and they are 10 µm away from each other (Aβ on No.5 slide, and AT8 on No. 3 slide), Tau46 (on No.5 slide) staining was used as a reference to ensure Aβ+/AT8+ within pathology-associated spots are accurate. After each pathology-associated spots were labeled, the immediately surrounding spots were labeled as level 1 spots, spots surrounding level 1 were labeled as level 2 spots, and the spots surrounding level 2 spots as level 3 spots. The selected spots labeled as noise in manual annotation step were automatically dropped and would not be included in the following data analysis.
FASTQ generation, alignment, and quantification
Raw sequencing data of four human MTG samples were obtained from Visium (BCL files) and processed with the SpaceRanger (v.1.2.2) software to generate FASTQ files via the SpaceRanger mkfastq function. FASTQ files were then aligned to and quantified to the expression matrix by using GRCh38 Reference-2020-A and the spaceranger count function. The two functions spaceranger mkfastq and count were used for demultiplexing sample and transcriptome alignment using default parameters.
Data processing
Seurat (v.3.2.2) was used for the following analysis. For a sample, the spatial file and HDF5 file obtained from spaceranger were loaded into Seurat by the function Load10X_spatial. Then we used the recommended sctransform method68 to normalize counts by the function SCTransform in its default settings. Multiple manual annotations (meta information) were added to each Seurat object via the function AddMetaData. All the spots labeled as noise were removed, and the remaining spots were used for the following analyses.
Data Integration, dimension reduction, spot clustering, and visualization
The Seurat integration framework was used for identifying clusters and reducing confounders from the four samples (two AD samples and two normal tissues). First, 3,000 features were selected for integrating four datasets via setting nfeature as 3,000 from the function SelectIntegrationFeatures. Then the function PrepSCTIntegration and FindIntegrationAnchors were used for preparing anchor features and generating anchors for the following analyses. Principal Component Analysis (PCA) was utilized for dimension reduction and denoising via runPCA. The top ten principal components (PCs) were select to perform UMAP via runUMAP in the default settings. To better match the six layers and white matter of human cortex structure, we set the resolution as 0.3 to generate spot clustering results by the functions FindNeighbors and FindClusters. The UMAP plots and spatial maps were generated by DimPlot and SpatialPlot.
Validation of identified layer-specific marker on public datasets
Four datasets were used for validating and visualizing layer-specific gene modules identified in this study. Filtered expression matrices and spatial information of two samples (Adult Human Brain 1 and Adult Human Brain 2 under Space Ranger 1.1.0) were downloaded from https://www.10xgenomics.com/products/spatial-gene-expression. Filtered expression matrices and spatial information of two samples (151673, 151674) from Maynard’s dataset17 were downloaded from Globus data transfer platform via the accessible endpoint “jhpce#HumanPilot10x” for http://research.libd.org/globus/.
WGCNA of DEGs between AD and the control
The R package WGCNA (v1.69)69 was used to identify gene modules and build unsigned co-expression networks, which include negative and positive correlations. The identified AD upregulated and downregulated DEGs at the pseudo-bulk level was used to identify gene modules. Soft power 3 was chosen by the WGCNA function pickSoftThreshold. Then, the function TOMsimilarityFromExpr was used to calculate the TOM similarity matrix via setting power = “3”, networkType = "unsigned" and TOMType = "unsigned". The distance matrix was generated by substrating the values from similarity adjacency matrix by one. The function flashClust (v.1.01)70 was used to cluster genes based on the distance matrix, and the function cutreeDynamic was utilized to identify four gene modules (yellow: M1, blue: M2, brown: M3, and turquoise: M4) by setting deepSplit =3. To determine key genes with the highest degree, the function TOMsimilartyFromExpr was used to generate the similarity matrix based on all genes of each module. The matrix was then used for calculating the edges’ closeness centrality score via function graph_from_adjacency_matrix and closeness from the R package igraph (v.1.2.6) for four modules. To fairly compare closeness scores among different modules, the min-max normalization method was used to normalize closeness score. Based on the normalized closeness score, genes were sorted, and the Top-15 genes were selected to generate four modules circos plot via the circlize package (v.0.4.13)71.
DEG and GO enrichment analysis
The DEG analysis was conducted by the Seurat function FindMarkers via grouping AD and control samples (min.pct was set as 0.01 and logfc.threshold was set as 0.1). The DEGs were selected if the adjusted p-value was less than 0.05 and the absolute value of log-fold change was higher than 0.1. Based on the identified DEGs, the enrichment analyses of GO terms (Biological Process), KEGG, and Reactome were performed via the R package clusterProfiler (v.3.18.0)72 and ReactomePA (v.1.9.4)73 using the functions of enrichGO, enrichKEGG, and enrichPathway. The enrichment analysis results were filtered out if the adjusted p-value was greater than 0.05. For KEGG analysis, gene database Org.Hs.eg.Db was used for transferring SYMBOL to ENREZID via function bitr. R package ggplot2 (v.3.3.2) was used for the visualizations. The sidebar legend was colored by the gene ratio defined as hit genes over the query genes via enrichGO function. For each identified gene module, the component genes were used for pathway enrichment analysis.
To determine the layer-specific marker genes (Supplementary Table 3) in both AD and CT samples, DEG analysis was first performed among different layers and the white matter within each individual sample. We calculated the number of shared DEGs across four samples in the same layer. Then layer-specific marker genes were determined by their uniqueness (i.e., unique to one layer in at least two samples compared to other layers’ DEGs) and their consistency (i.e., one DEG being identified from at least 2 samples simultaneously). Enrichment analyses of GO terms (Biological Process), KEGG, and Reactome were performed for DEGs with consistency numbers greater than 2. Similarly, AD-associated DEGs from different layers and the white matter (Supplementary Table 6) were identified from 4 comparisons (AD-1 vs CT-1, AD-2 vs CT-1, AD-1 vs CT-2, and AD-2 vs CT-2). We calculated the number of shared DEGs across four comparisons in the same layer.
RNAscope HiPlex assay in combination with post-IF staining of Aβ/AT8/GFAP
In order to validate and quantitate the gene expression changes around AD pathology-related regions (Aβ+, AT8+, and/or GFAP+), we performed the smFISH assay using the RNAscope HiPlex12 Ancillary Kit to simultaneously detect 10 different RNA targets selected from DEGs associated with Aβ plaques and/or AT8+-tangles. The process was performed according to the manufacturer’s user manual with a few modifications. Fresh frozen sections from another set of human AD and control brain tissues, as well as sections No.1 & 7 preserved from 10x Visium ST experiment, were fixed in 4% PFA at room temperature for 1 h. The sections were washed twice by 1x PBS and dehydrated sequentially in 50%, 70% and 100% ethanol for 5 min. After permeabilization by protease IV, the sections were incubated with the premixed 10 probes in the ACD HybEZ II oven for 2 h at 40°C, and three amplification steps in 40°C were extended to 45 min instead of 30 min. Since the Alexa 750 channel is not available on our microscope, we changed the original three rounds reaction (four probes per round) to four rounds with only three probes in each round. After each round, the nucleus were counterstained with DAPI for 30 s, and the lipofuscin autofluorescence was quenched by 1 min of 0.5x True Black. The slides were mounted with Fluoromount-G Mounting Medium, and 40x Tiles image were taken by Zeiss Axio Observer microscope after each round. Before the next round, signals were quenched using cleaving buffer and washed twice with 1x PBS containing 0.5% of Tween20. To define the regions with Aβ or AT8, IF staining was performed after RNAscope HiPlex12 assay. Briefly, the signal from round 4 was quenched using the cleaving buffer. After that, the antigen retrieval and staining process were performed as described above. Images from each round and the IF staining were aligned using ImageJ Software.
Spatial integration of snRNA-Seq and ST data from human MTG
The snRNA-Seq datasets from human entorhinal cortex (EC) of AD and controls9 were used for performing single-cell registration for our four samples based on R package SPOTlight (v.0.1.6)25. The EC datasets were downloaded from https://www.synapse.org/, and the access number was syn22722817. To fairly and precisely integrate the EC datasets and our samples, single-nucleus data labeled as disease stage 0 was selected for the following analysis. The function FindAllMarkers was utilized to identify the EC dataset’s DEGs based on benchmarking labels. An in-house database named scREAD74 was used for acquiring cell-type-specific markers. Final cell-type-specific markers (CTS, Supplementary Table 10) were identified by comparing the overlapped DEGs based on the EC dataset and the markers from scREAD. Next, the proportion of each cell type was estimated by the function spotlight_deconvolution via inputting the defined new CTS markers. The mean proportion of each cell type in each layer and the five pathological regions was estimated and visualized using the R package ggplot2 suite.
Statistical analysis
No statistical methods were used to predetermine sample sizes, since (i) a rigorous statistical framework for design of ST experiments is missing in the literature and (ii) existing tools for the design of single-cell genomic experiments are not appropriate for ST experiments and do not consider key factors of ST experiments. In our study, each brain sample has over 4,000 spots and ~3500 genes per spot, which provides sufficient power for DEG analysis of Visium ST datasets. Prism 5 software was used to analyze the data. All data are expressed as mean ± s.e.m. We performed the D’Agostino–Pearson omnibus normality test to determine whether the data were normally distributed, or the F test to determine whether the data assumed equal variances. We then chose the following statistical tests. Nonparametric Mann-Whitney tests were used to compare the number of single RNA dots in each cell human non-AD and AD. All results represent two-sided tests comparing groups of biological replicates. P < 0.05 was considered statistically significant for all measures. The n values represent the number of spots or cells in each group; exact values are indicated in figure legends.
Code availability
All the codes used in this study are temporarily available for reviewers only at https://bmbls.bmi.osumc.edu/scread/stofad, and will be public available after publication.