Study Cohort Characteristics
We obtained four publicly available RNA-seq datasets from the brain (Allen Brain Institute Aging, Dementia and TBI study, Mayo Clinic RNA-seq, MSBB, and ROSMAP), covering the temporal cortex, parietal cortex, prefrontal cortex, and hippocampus, and three microarray datasets from whole blood (AddNeuroMed cohort 1, AddNeuroMed cohort 2, and ADNI). After outlier removal, we included a total of 1,084 brain samples (58% female; 26% apoE4 carriers) and 645 blood samples (58% female; 38% apoE4 carriers) in our analysis. Table 1 summarizes the sample annotations for the brain and blood datasets, including the number of cases and controls, apoE4 carrier status, and the number of males and females.
In the brain datasets, compared to controls, AD patients were significantly older (mean ± SD for AD: 86.5 ± 6.0 years and controls: 84.8 ± 7.4 years; two sample t-test, P < 0.001), more likely to be apoE4 carriers (AD: 38% carriers vs controls: 15% carriers; Chi-squared test, P < 0.001), and more likely to be female (AD: 65% female vs controls: 51% female; Chi-squared test, P < 0.001). The proportion of male and female samples did not differ across brain regions (Chi-squared test, P < 0.01).
In the blood datasets, compared to controls, AD patients were significantly older (mean ± SD for AD: 77.0 ± 7.1 years and controls: 74.7 ± 5.7 years; two sample t-test, P < 0.001), more likely to be apoE4 carriers (AD: 60% carriers vs controls: 27% carriers; Chi-squared test, P < 0.001), more likely to be female (AD: 64% female vs controls: 55% female; Chi-squared test, P < 0.001), and had fewer years of education (mean ± SD for AD: 9.4 ± 4.8 years and controls: 13.9 ± 4.7 years; two sample t-test, P < 0.001).
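The carrier-status comparisons above can be illustrated with a minimal sketch of a Pearson Chi-squared test on a 2×2 table, using the pooled brain apoE4 counts from the "Sum" row of Table 1 (AD: 210 carriers, 345 non-carriers; controls: 77 carriers, 452 non-carriers). The function name is hypothetical, and the absence of a continuity correction is an assumption; the text does not state which variant of the test was used.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    No continuity correction is applied (an assumption for this sketch).
    Returns the test statistic and its P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# apoE4 carriers / non-carriers in the pooled brain cohorts (Table 1, "Sum" row).
chi2, p = chi_squared_2x2(210, 345, 77, 452)
print(f"chi2 = {chi2:.1f}, P = {p:.2e}")
```

With these counts the P value falls well below 0.001, in line with the reported comparison.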
Studies were merged and batch corrected using ComBat, resulting in 13,500 common genes across 1,084 samples for brain studies and 3,371 common genes across 645 samples for blood studies. Supplementary Figures S1 and S2 show PCA plots before and after batch correction, demonstrating successful data merging and batch effect removal.
Table 1
Meta-analysis Study Characteristics
| Study | Accession | Total participants | AD, no. (%) | AD: Female/Male (% Female) | AD: apoE4 Yes/No (% Yes) | CN: Female/Male (% Female) | CN: apoE4 Yes/No (% Yes) |
| Brain Transcriptomic Studies |
| Allen | https://aging.brain-map.org/ | 212 | 72 (34) | 29/43 (40) | 22/50 (31) | 54/86 (39) | 19/121 (14) |
| Mayo Clinic RNA-Seq | syn5550404 | 154 | 80 (52) | 49/31 (61) | 42/38 (53) | 36/38 (49) | 9/65 (12) |
| MSBB | GSE52564 | 301 | 185 (62) | 131/54 (71) | 63/122 (34) | 57/59 (49) | 16/100 (13) |
| ROSMAP | syn3219045 | 417 | 218 (52) | 151/67 (70) | 83/135 (38) | 122/77 (61) | 33/166 (17) |
| Sum | | 1084 | 555 (52) | 360/195 (65) | 210/345 (38) | 269/260 (51) | 77/452 (15) |
| Whole Blood Transcriptomic Studies |
| ADNI | http://adni.loni.usc.edu/ | 301 | 43 (14) | 17/26 (40) | 32/11 (74) | 135/125 (52) | 71/189 (27) |
| AddNeuroMed1 | GSE63060 | 182 | 91 (50) | 65/26 (71) | 52/39 (57) | 55/36 (60) | 30/61 (33) |
| AddNeuroMed2 | GSE63061 | 160 | 86 (43) | 59/27 (69) | 47/39 (55) | 45/29 (61) | 15/59 (20) |
| Sum | | 645 | 220 (34) | 141/79 (64) | 131/89 (60) | 235/190 (55) | 116/309 (27) |
Differential Gene Expression in the Brain Identifies a Distinct Sex-Specific Signature of AD
We observed distinct AD-associated transcriptomic signatures in the brain in males and females. A total of 477 genes were differentially expressed in females, including 308 upregulated genes and 167 downregulated genes (FC > 1.2, q < 0.05; Figs. 2A-B; Supplementary Table 1). In males, 366 genes were differentially expressed, including 320 upregulated genes and 46 downregulated genes (FC > 1.2, q < 0.05; Figs. 2A-B; Supplementary Table 1). Altogether, 262 genes were uniquely dysregulated in females, including 130 upregulated genes and 136 downregulated genes. In males, 151 genes were uniquely dysregulated, including 142 upregulated genes and 10 downregulated genes. There was a significant overlap of dysregulated genes across males and females (P < 0.05; hypergeometric test).
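The overlap test mentioned above can be sketched as an upper-tail hypergeometric probability. This is an illustration only: the function name is hypothetical, the universe of 13,500 genes is assumed to be the set of common brain genes after batch correction, and the overlap of 215 genes is inferred here as the female total minus the female-unique count (477 − 262), not a figure stated in the text.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """Upper-tail hypergeometric P value for the overlap of two gene sets.

    Gives P(X >= overlap) when size_b genes are drawn without replacement
    from a universe in which size_a genes are marked.
    """
    max_overlap = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    # Exact integer arithmetic; the final division yields a float.
    return favourable / comb(universe, size_b)


# Assumed inputs: 13,500-gene universe, 477 female and 366 male genes,
# and an inferred overlap of 215 genes.
p = overlap_p_value(13500, 477, 366, 215)
print(f"P = {p:.3g}")
```

Under these assumptions the expected overlap by chance is about 13 genes, so an overlap of this size yields a P value far below 0.05.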
Next, we characterized the transcriptomic signatures observed in the brains of male and female AD patients. In females, among upregulated AD genes, we found 46 enriched pathways, many of them relating to components of the innate and adaptive immune system (Table 2; Supplementary Table 2). Several upregulated HLA system genes including HLA-DRA and HLA-DPA1 contributed to enrichment of a number of pathways relating to response to infection (Table 2). Components of the complement system including C3AR1, C4B, and C4A were also uniquely dysregulated in females (Table 2; Supplementary Table 2). We also observed an enrichment of genes in the MAPK signaling pathway including MRAS and MK2. Downregulated AD genes in females were enriched for a number of neurological signaling pathways including GABAergic signaling, neuroactive ligand-receptor activation, and cAMP signaling (Table 2; Supplementary Table 3).
Strikingly, we observed an enrichment of fewer immune-related pathways in males with AD. Among upregulated genes in male AD patients, we found 12 enriched pathways, including amoebiasis and cytokine-cytokine receptor interaction, suggestive of adaptive and innate immune activation (Table 2; Supplementary Table 4). Similar to females, we also observed an enrichment of the MAPK signaling pathway, including MAP4K4 and MK2, in males. Among downregulated genes in male AD patients, we observed an enrichment of neuropeptide signaling and glutamate signaling related pathways (Table 2; Supplementary Table 5). For a full list of enriched pathways, refer to Supplementary Tables S2-S5.
Lastly, we performed a non-stratified analysis comparing gene expression between AD and control samples irrespective of sex. Statistical models were adjusted for sex, apoE4 carrier status, and age. A total of 662 genes were upregulated and 430 genes were downregulated in patients with AD compared to controls (Figure S3; Table 2; Supplementary Table 1). Upregulated genes were enriched for several pathways previously implicated in AD, including PI3K-Akt signaling and MAPK signaling, as well as a number of immune related pathways including Staphylococcus aureus infection, human papillomavirus infection, and malaria (Supplementary Table S6). Several components of the complement system, including C4B, C4A, C1R, C3AR1, and C5AR1, also contributed to this enrichment (Supplementary Table S7). In our analysis of downregulated genes, we found that several pathways related to neuroreceptor signaling and GABAergic transmission were enriched, including the genes GABRA1, GNG3, GNG2, SLC32A1, GABRD, and GABRG2 (Supplementary Table S7).
Table 2
Enriched Pathways in the Brain
| Term | Adjusted P | Genes |
| Female Upregulated Genes (n = 583) |
| Malaria | < 0.001 | TGFB2;TGFB1;GYPC;HGF;ITGB2;PECAM1;CCL2;TLR4;ICAM1 |
| Hippo signaling pathway | < 0.001 | YAP1;CRB2;WWTR1;TGFB2;TGFB1;FZD7;SERPINE1;ITGB2;BMP6;GLI2;TGFBR2;PARD3;CCN2;AJUBA;TEAD2 |
| PI3K-Akt signaling pathway | < 0.001 | NGFR;CDKN1A;ANGPT2;CSF1;ITGB5;ITGB4;LAMB2;HGF;IGF2;GNG12;OSMR;PGF;PIK3R5;COL1A2;ITGA10;COL6A2;DDIT4;CDK2;SPP1;ITGA5;TLR4 |
| Proteoglycans in cancer | < 0.001 | CDKN1A;TGFB2;TGFB1;HPSE2;ITGB5;FZD7;HGF;IGF2;DCN;MRAS;SMO;ITGA5;EZR;TLR4;CD44 |
| Human T-cell leukemia virus 1 infection | < 0.001 | CDKN1A;TGFB2;TGFB1;ITGB2;NFATC2;FOS;ICAM1;NFATC4;TGFBR2;NFKBIA;ZFP36;CDK2;HLA-DRA;MSX1;HLA-DPA1 |
| Rheumatoid arthritis | < 0.001 | TGFB2;TGFB1;CSF1;ITGB2;CCL2;HLA-DRA;FOS;TLR4;ICAM1;HLA-DPA1 |
| ECM-receptor interaction | < 0.001 | COL1A2;ITGB5;ITGB4;LAMB2;COL6A2;ITGA10;SPP1;ITGA5;CD44 |
| Osteoclast differentiation | < 0.001 | NFKBIA;SOCS3;TGFB2;TYROBP;TGFB1;CSF1;NFATC2;TNFRSF11B;TREM2;FOS;TGFBR2 |
| TGF-beta signaling pathway | < 0.001 | TGIF1;TGFB2;TGIF2;TGFB1;ID4;ID3;DCN;BMP6;TGFBR2 |
| Staphylococcus aureus infection | < 0.001 | C4B;C4A;ITGB2;CFI;C3AR1;HLA-DRA;ICAM1;HLA-DPA1 |
| 36 more.. | | |
| Female Downregulated Genes (n = 398) |
| Neuroactive ligand-receptor interaction | < 0.001 | GABRA1;CHRM4;SSTR1;TACR1;HTR5A;RXFP1;GABRG2;MCHR2;ADCYAP1;MAS1;GLRA3;CCKBR;SST;GALR1;TAC3;TAC1;VIP |
| GABAergic synapse | 0.002 | PRKCG;GABRA1;GNG3;SLC32A1;GAD1;GAD2;GABRG2 |
| cAMP signaling pathway | 0.009 | ADCYAP1;PAK1;BDNF;SST;CAMK4;CALM3;SSTR1;VIP;CNGB1 |
| African trypanosomiasis | 0.02 | PRKCG;HBB;HBA2;HBA1 |
| Male Upregulated Genes (n = 415) |
| Focal adhesion | < 0.001 | VAV3;PDGFRB;FLT1;ITGB5;LAMB2;HGF;CAV1;FN1;ELK1;PGF;COL1A2;ITGA10;COL6A2;SPP1;ITGB8;ITGA5;TLN1 |
| PI3K-Akt signaling pathway | < 0.001 | PDGFRB;NGFR;CDKN1A;FLT1;ANGPT2;CSF1;ITGB5;LAMB2;HGF;FN1;IGF2;PGF;PIK3R5;COL1A2;ITGA10;COL6A2;DDIT4;SPP1;ITGB8;ITGA5;TLR4;EPHA2 |
| Proteoglycans in cancer | < 0.001 | CDKN1A;TGFB2;ITGB5;HGF;CAV1;MMP2;IGF2;FN1;IQGAP1;ELK1;DCN;SMO;ITGA5;EZR;TLR4;CD44 |
| ECM-receptor interaction | < 0.001 | COL1A2;ITGB5;LAMB2;COL6A2;ITGA10;SPP1;FN1;ITGB8;ITGA5;CD44 |
| MAPK signaling pathway | 0.001 | PDGFRB;NGFR;TGFB2;FLT1;ANGPT2;CSF1;DUSP1;HGF;IGF2;HSPB1;ELK1;PGF;TGFBR2;GNA12;EPHA2;HSPA1A |
| Hippo signaling pathway | 0.01 | YAP1;CRB2;WWTR1;TGFB2;LATS2;CCN2;BMP6;TEAD2;GLI2;TGFBR2 |
| Pathways in cancer | 0.02 | PDGFRB;NOTCH2;CDKN1A;CDKN2B;TGFB2;LAMB2;HGF;MMP2;FN1;IGF2;LRP5;CXCR4;ELK1;PGF;GLI2;TGFBR2;NFKBIA;CASP7;SMO;GNA12 |
| Ras signaling pathway | 0.02 | PDGFRB;NGFR;FLT1;ANGPT2;CSF1;HGF;IGF2;FOXO4;ELK1;PGF;EPHA2;PLA1A |
| TGF-beta signaling pathway | 0.02 | TGFB2;CDKN2B;ID3;DCN;BMP6;RGMA;TGFBR2 |
| Regulation of actin cytoskeleton | 0.02 | VAV3;PDGFRB;ITGB5;ITGA10;GNA12;FN1;CXCR4;ITGB8;IQGAP1;ITGA5;EZR |
| 4 more.. | | |
| Male Downregulated Genes (n = 98) |
| Malaria | 0.02 | HBB;HBA2;HBA1 |
| Neuroactive ligand-receptor interaction | 0.02 | MAS1;ADCYAP1;SST;TAC3;TAC1;VIP |
| Taurine and hypotaurine metabolism | 0.02 | GAD1;GAD2 |
| African trypanosomiasis | 0.03 | HBB;HBA2;HBA1 |
|
Network Analysis in the Brain Identifies a Stronger Disease Signature in Females
To assess transcriptomic changes on a gene network level, we utilized WGCNA. Gene networks were derived separately for male and female samples and compared using network preservation methods, as previously described49. We identified two AD-associated modules in males and 11 AD-associated modules in females (Fig. 3A) that met the significance threshold (FDR < 0.05) and were either positively or negatively correlated with case/control status. Among the male modules, a 463-gene module (termed black) was upregulated in AD, and a 151-gene module (termed tan) was downregulated in AD. The black module in males had significant overlap with two modules in females (termed yellow and pink) (P < 0.001; hypergeometric test), as indicated by asterisks in Fig. 3B. The black module also had strong preservation in the female network (Z-summary score > 10). Similarly, the tan module had strong preservation in the female network (Z-summary score > 10). Among the female-specific disease associated modules, four modules (termed green, red, black and turquoise) were downregulated in AD, while seven were upregulated (Fig. 3A).
Enrichment analysis of disease-associated modules using the 2019 KEGG Human pathway database revealed pathways relevant to AD that were consistent with those identified in the single gene analysis (Fig. 3A). For example, in both males and females, an upregulated module was enriched for Akt signaling related pathways and downregulated modules were enriched for oxidative phosphorylation and thermogenesis related pathways, consistent with single gene level analyses.
Notably, several additional pathways not seen through single gene analysis were observed in the network analyses. An upregulated module in both males and females was highly enriched for zinc finger nuclease genes related to Herpes simplex viral infection, consistent with recent work demonstrating Herpes virus infection in AD brains54.
Consistent with the single gene analysis, we observed a greater number of disease associated modules in females with AD than in males. For example, an upregulated female module was enriched for cell structural processes related to adherens junctions, actin cytoskeleton and axonal guidance. An additional downregulated female module was enriched for neurological signaling pathways including synaptic vesicle exocytosis, aldosterone synthesis and secretion, and morphine addiction. Interestingly, an additional female downregulated module was enriched for autophagy and proteolysis pathways, consistent with molecular studies demonstrating decreased autophagy in AD, particularly in females55 (Fig. 3A).
We also conducted an analysis identifying modules with apoE4:disease interactive effect to understand differential penetrance of the apoE ε4 allele in males and females. In the male gene network, we were unable to identify modules with significant apoE4:disease interactive effect. Interestingly, in the female network, we identified one module that was downregulated (2211 genes) in AD, and two modules (329 genes and 439 genes) that were upregulated in AD and exhibited a significant apoE4:disease interactive effect (Fig. 3A). The two upregulated modules (termed pink and purple) were significantly enriched for several zinc finger nuclease genes related to Herpes simplex viral infection. The downregulated module was enriched for metabolic pathways including oxidative phosphorylation and the TCA cycle. Together these results suggest a female-specific network dysregulation involving zinc finger nucleases and metabolic alteration supporting differential apoE4 penetrance in males and females.
There were 102 hub genes among disease associated modules in the female network, defined as genes with module membership greater than 0.8 and gene significance greater than 0.2 that were differentially expressed between AD and controls (Fig. 3C; Supplementary Table S8). In contrast, zero hub genes were identified in the male gene network. Protein-protein interaction maps generated by STRING v11 suggest several Ca2+- and G protein-dependent interconnected genes including ITPKB, PDGFRB, GNG12, and GNA12 among the female disease associated modules (Fig. 3C). Among modules with apoE4:disease interactive effect in females, 35 hub genes were identified, including ITPKB as a highly connected regulator (Fig. 3D). For a full list of genes in each module, including hub genes, please refer to Supplementary Table S8.
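The hub gene criteria above amount to a three-way filter, sketched below. The record type, function name, and example values are hypothetical and for illustration only; the thresholds are applied literally as stated, whereas some WGCNA workflows apply them to absolute values.

```python
from dataclasses import dataclass


@dataclass
class GeneStats:
    symbol: str
    module_membership: float       # correlation of the gene with its module eigengene
    gene_significance: float       # correlation of the gene with case/control status
    differentially_expressed: bool


def select_hub_genes(genes, mm_cutoff=0.8, gs_cutoff=0.2):
    """Return symbols of genes that pass all three hub gene criteria."""
    return [
        g.symbol
        for g in genes
        if g.module_membership > mm_cutoff
        and g.gene_significance > gs_cutoff
        and g.differentially_expressed
    ]


# Hypothetical example records (not values from the study).
example = [
    GeneStats("GENE_A", 0.91, 0.35, True),   # passes all criteria
    GeneStats("GENE_B", 0.85, 0.10, True),   # gene significance too low
    GeneStats("GENE_C", 0.95, 0.40, False),  # not differentially expressed
]
print(select_hub_genes(example))
```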
Differential Gene Expression in Whole Blood Identifies Stronger Disease Signatures in Females with AD in Comparison to Males
Similar to the brain, we observed distinct AD-associated transcriptomic signatures between males and females with AD in whole blood. We observed a total of 599 differentially expressed genes in females with AD, including 294 upregulated genes and 305 downregulated genes (q < 0.05; Figs. 2C-D; Supplementary Table 9). In males, 88 genes were differentially expressed in AD, including 38 upregulated genes and 50 downregulated genes (q < 0.05; Figs. 2C-D; Supplementary Table 9). Altogether, 542 genes were uniquely dysregulated in females, including 271 upregulated genes and 271 downregulated genes. In males, 31 genes were uniquely dysregulated, including 15 upregulated genes and 16 downregulated genes. There was a significant overlap of dysregulated genes across males and females with AD (P < 0.05; hypergeometric test).
Next, we characterized the transcriptomic signatures observed in the blood of male and female AD patients. Among upregulated genes in female AD patients, we found 14 enriched pathways, many of them relating to components of the innate and adaptive immune system (Table 3; Supplementary Table S10). Several cytokine response elements including STAT5B, STAT6, and IL10RB contributed to enrichment of a number of pathways relating to response to infection (Table 3). Similar to the brain, components of actin cytoskeleton regulation were also dysregulated in females (Table 3; Supplementary Table S10). Downregulated genes in female AD patients were enriched for a number of metabolism related processes including oxidative phosphorylation and thermogenesis, consistent with the single-gene and network analysis in the brain (Supplementary Table S11).
Similar to the brain analysis, we observed dramatically fewer enriched pathways in males with AD. Among upregulated genes in male AD patients, we did not identify any enriched pathways. Among downregulated genes in male AD patients, components of the proteasome were enriched including PSMD4 and PSMC3 (Table 3; Supplementary Table S12). For a full list of enriched pathways, refer to Supplementary Tables S10-S12.
Lastly, we performed a non-stratified analysis comparing gene expression between AD and control samples irrespective of sex in whole blood. Analyses were adjusted for sex, apoE4 carrier status, age, and education. A total of 339 genes were upregulated and 360 genes were downregulated in patients with AD compared to controls (Figure S3B, Supplementary Table S8). Upregulated genes were enriched for several pathways previously implicated in AD, including MAPK signaling, autophagy, and NFkB signaling (Supplementary Table S13). In addition, a number of immune related pathways were enriched, including tuberculosis, Escherichia coli infection, salmonella infection, and inflammatory bowel disease. Several components of the NFkB cascade and antigen presentation system, including NFKBIA, ITGAM, STAT5B, TLR5, TLR4, CD14, and C4A, contributed to this enrichment (Supplementary Table S13). Among downregulated genes, pathways related to protein synthesis and metabolism, including ribosome, proteasome, protein export, thermogenesis, and oxidative phosphorylation, were enriched. Included in these pathways were several oxidative phosphorylation related genes, including NDUFA9, NDUFA8, and COX4I2 (Supplementary Table S14).
Table 3
Enriched Pathways in Blood
| Term | Adjusted P | Genes |
| Female Upregulated Genes (n = 294) |
| Tuberculosis | < 0.001 | ATP6V0B;CEBPB;ITGAM;IL10RB;IFNGR2;TCIRG1;CTSS;CREB1;IRAK1;LAMP2;ITGAX;RAF1;CAMK2G |
| Necroptosis | 0.004 | PYCARD;STAT5B;MLKL;H2AFJ;IFNGR2;STAT6;TYK2;CFLAR;CAMK2G;HIST1H2AC;HIST2H2AC |
| Fc gamma R-mediated phagocytosis | 0.006 | HCK;PTPRC;ARPC1A;PRKCD;RAC2;ASAP1;ARPC5;RAF1 |
| Pathogenic Escherichia coli infection | 0.01 | ARPC1A;NCK2;ARHGEF2;ARPC5;TLR5;TUBA4A |
| TNF signaling pathway | 0.01 | CEBPB;RPS6KA5;CREB1;MLKL;MAP3K8;FOS;CFLAR;CREB5 |
| Regulation of actin cytoskeleton | 0.02 | FGD3;ITGAM;SPATA13;ARPC1A;RAC2;ITGAX;IQGAP1;ARPC5;RAF1;SSH2;PAK2 |
| Lysosome | 0.02 | GNPTG;CD63;ATP6V0B;LAMP2;IDS;TCIRG1;GNS;CTSS |
| Phagosome | 0.02 | ATP6V0B;ITGAM;LAMP2;CANX;TAP1;TCIRG1;TUBA4A;CTSS;ATP6V1F |
| JAK-STAT signaling pathway | 0.02 | STAT5B;CCND3;CSF3R;IL10RB;IFNGR2;STAT6;TYK2;RAF1;MCL1 |
| Estrogen signaling pathway | 0.03 | CREB1;PRKCD;FOS;KRT10;RAF1;ADCY7;FKBP5;CREB5 |
| 4 more.. | | |
| Female Downregulated Genes (n = 305) |
| Ribosome | < 0.001 | RPL4;RPL5;RPL30;RPL41;RPL32;RPL12;RPL22;RPL11;RPL35A;MRPL36;MRPL24;RPL6;MRPL33;RPS25;RPL36AL;RPL35;RPL24;RPS20;RPL26;RPS27A;RPL39;RPS24;RPS12 |
| Proteasome | < 0.001 | PSMB6;PSMA5;PSMB7;PSMA3;PSMD4;PSMC3;PSMC1;POMP;PSMB1;PSMC2;PSMD1;PSMF1 |
| Spliceosome | < 0.001 | ISY1;HSPA8;SF3B5;CCDC12;BUD31;DDX42;PLRG1;PQBP1;SNRPD2;ZMAT2;SYF2;SNRPG;PPIH;SNRPA1;SNRPB2;SLU7;CTNNBL1 |
| Protein export | < 0.001 | SRP19;SEC61G;SRPRB;SRP68;SRP14;SEC11A |
| Oxidative phosphorylation | < 0.001 | NDUFA9;NDUFA8;NDUFS5;COX17;NDUFB2;NDUFA1;COX6A1;ATP6V1E1;NDUFV2;COX6C;ATP6V1D;UQCRH |
| Huntington disease | < 0.001 | NDUFA9;NDUFA8;NDUFB2;NDUFA1;CLTA;COX6C;COX6A1;UQCRH;SOD1;SIN3A;NDUFS5;VDAC3;BAX;NDUFV2 |
| Non-alcoholic fatty liver disease (NAFLD) | < 0.001 | NDUFA9;NDUFA8;NDUFS5;NDUFB2;NDUFA1;BAX;PIK3R1;COX6A1;NDUFV2;COX6C;ADIPOR2;UQCRH |
| Protein processing in endoplasmic reticulum | 0.002 | DNAJA1;ATXN3;HSPA8;HSP90AA1;HSPH1;HSP90AB1;EIF2AK1;SEC61G;ERP29;BAX;UBXN6 |
| Parkinson disease | 0.002 | NDUFA9;NDUFA8;NDUFS5;VDAC3;NDUFB2;NDUFA1;COX6A1;NDUFV2;COX6C;UQCRH |
| Thermogenesis | 0.007 | NDUFA9;COA3;NDUFA8;SMARCC1;NDUFS5;COX17;NDUFB2;NDUFA1;COX6C;COX6A1;NDUFV2;UQCRH |
| 3 more.. | | |
| Male Upregulated Genes (n = 38) |
| No enriched pathways | | |
| Male Downregulated Genes (n = 50) |
| Proteasome | 0.06 | PSMD4;PSMC3;POMP |
|
Network Analysis in Whole Blood Identifies a Stronger Disease Signature in Females
We identified five AD-associated modules in females and zero AD-associated modules in males (Fig. 4) that met the significance threshold (FDR < 0.05) and were either positively or negatively correlated with case/control status. Among the modules in female samples, three modules including a 483-gene module (termed turquoise), a 129-gene module (termed pink) and 153-gene module (termed black) were upregulated in AD. Two modules including a 270-gene module (termed blue) and 119-gene module (termed magenta) were downregulated in AD (Fig. 4A). No modules with significant apoE4:disease interaction effect were found in female or male network analyses from the blood datasets.
Enrichment analysis of disease-associated modules using the 2019 KEGG Human pathway database revealed pathways relevant to AD that were consistent with those identified in the single gene analysis (Figs. 4A and 3A). For example, upregulated modules in females were strongly enriched for innate immune system activity, neutrophil degranulation, CSF signaling, IL2 signaling, and cytokine signaling. Consistent with single gene analyses, downregulated modules in females were enriched for metabolic processes including metabolism of RNA and metabolism of amino acids (Fig. 4A).
There were 35 hub genes among disease associated modules in the female-specific network identified as module membership greater than 0.8, gene significance greater than 0.2 and differentially expressed between AD and controls (Fig. 4B). In contrast, zero hub genes were identified in the male-specific gene network. Protein-protein interaction maps generated by STRING v11 suggest several interconnected genes including the B cell development related protein, IGLL1, and ribosomal proteins RPS20, RPS25, RPL4, and RPL35A (Fig. 4B).
For a full list of genes in each module, including hub genes, please refer to Supplementary Tables S15-S16.
Comparison of Brain and Blood Transcriptomic Signatures Reveals Common Immune Related Signals in Females
We next identified genes that were commonly dysregulated in both blood and brain (Fig. 2E). In females, a total of 12 genes were dysregulated in the brain and blood in the same direction (all upregulated). Several of the commonly upregulated genes are known to be highly expressed in lymphoid tissue and play roles in immune cell recruitment, including SERTAD1, ITGAX, and TYROBP. In contrast, in males we found one upregulated gene, VCAN (encoding versican), dysregulated in both the blood and brain (Fig. 2E).
Cell-type Deconvolution Identifies Sex-specific Immune Cell Dysregulation in Females with AD
Differences in 22 immune blood cell types (Figs. 5A-B) were evaluated by deconvolving the transcriptomic signature obtained via meta-analysis of blood studies. Analysis of cell type proportions adjusting for age, sex, and apoE4 status revealed an increase in neutrophils and naïve B cells, and a decrease in M2 macrophages and CD8+ T cells in AD patients compared to controls in pooled male and female samples (Fig. 5C, FDR P < 0.05). Among females with AD, relative to controls, we observed an increase in neutrophils and naïve B cells and a decrease in M2 macrophages, memory B cells, and CD8+ T cells (Fig. 5C, FDR P < 0.05). Interestingly, among males with AD, we did not find any significant differences in immune cell proportions compared to controls.
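Because 22 cell types are tested at once, the reported P values are FDR adjusted. The text does not name the adjustment procedure; the sketch below assumes the Benjamini-Hochberg method, and the function name and input P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four cell types (not values from the study).
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A cell type would then be reported as significant when its adjusted value is below 0.05.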
Sex-specific Transcriptomic Data Improves AD Classification Accuracy
To assess the value of sex-specific transcriptomic data in developing a blood-based classifier in AD, we trained a linear SVM model to classify AD patients and controls using the transcriptomic signature obtained via meta-analysis of blood studies. We trained a ‘clinical model’ with age, sex, education, and apoE4 status and a ‘clinical + molecular model’ with age, sex, education, apoE4 status, and blood transcriptomic data. Using pooled male and female samples, the ‘clinical + molecular model’ achieved a higher AUROC compared to the ‘clinical model’ (AUROC = 0.88 for ‘clinical + molecular model’; AUROC = 0.77 for ‘clinical model’) on a test set composed of 25% of samples (Figs. 6A and S4A).
Interestingly, a model trained with only female data achieved a higher AUROC (‘clinical + molecular model’: 0.90 and ‘clinical model’: 0.86; Figs. 6B and S4B) than the pooled male and female model. In contrast, a model trained with only male data obtained a lower AUROC (‘clinical + molecular’ model 0.81 and ‘clinical model’ 0.83; Figs. 6C and S4C) than the pooled male and female model.
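For readers less familiar with the metric, AUROC is the probability that a randomly chosen AD sample receives a higher classifier score than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name, labels, and scores are hypothetical and are not outputs of the models described here.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for AD, 0 for control. scores: classifier decision values.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one case and one control")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out labels and decision scores.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```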
Figures 6G-H summarize shared features between models. In all simple models (pooled male and female, female only, and male only), age and apoE4 status had a positive feature importance while education had a negative feature importance. A positive feature importance means that the feature increases the likelihood of being classified as AD (termed risk factor). A negative feature importance means that the feature reduces the likelihood of being classified as AD (termed protective factor). In the female ‘clinical + molecular model’, 57 features, including the known risk factors apoE4 and age, had a positive feature importance (Supplementary Table S17). In addition, 50 features had negative feature importance. Among these were education and previously implicated AD risk genes including CETN2 (Supplementary Table S17). In the male ‘clinical + molecular model’, 103 features, including apoE4, had positive feature importance (Supplementary Table S18). In addition, 105 features, including education, had negative feature importance (Supplementary Table S18).
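The risk and protective labels follow directly from the sign of each feature's weight in the linear model. A minimal sketch, with a hypothetical function name and made-up weights that are not taken from the fitted models:

```python
def split_by_sign(weights):
    """Partition features of a linear model by the sign of their weight.

    Returns (risk, protective, unused): features with positive, negative,
    and exactly zero weights, each sorted by name.
    """
    risk = sorted(name for name, w in weights.items() if w > 0)
    protective = sorted(name for name, w in weights.items() if w < 0)
    unused = sorted(name for name, w in weights.items() if w == 0)
    return risk, protective, unused


# Hypothetical weights for illustration only.
weights = {"age": 0.42, "apoE4": 0.61, "education": -0.35, "GENE_X": 0.0}
risk, protective, unused = split_by_sign(weights)
print("risk:", risk)
print("protective:", protective)
print("zero weight:", unused)
```

Features with a weight of exactly zero are those dropped by a sparsity-inducing penalty and are excluded from both groups.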
Altogether, we observed a significant overlap (P < 0.001, hypergeometric test) in features with non-zero feature importance between the pooled male and female ‘clinical + molecular model’ and female ‘clinical + molecular model’; female ‘clinical + molecular model’ and male ‘clinical + molecular model’; and pooled male and female ‘clinical + molecular model’ and male ‘clinical + molecular model’ (Fig. 6G).
Functional annotation of features with a non-zero feature importance was performed via enrichment analysis using the 2019 KEGG database of human pathways. Among features with non-zero feature importance, we did not identify any enriched biological pathways in the male only and female only complex models. In the male and female pooled complex model, features with positive feature importance (risk factors) were enriched for the Staphylococcus aureus infection, graft-vs-host disease, and antigen presentation and processing KEGG pathways (adjusted P < 0.05; Fig. 6H). The HLA genes HLA-DRB4 and HLA-DQA1 contributed to this enrichment. In addition, the P-selectin glycoprotein ligand-1 gene (SELPLG) and a killer cell immunoglobulin-like receptor gene (KIR2DL3) also contributed to enrichment, suggesting a role for leukocyte recruitment and natural killer cell activity in AD pathology.
Down Sampling Sensitivity Analysis
Figure S5 describes the analytical approach for down sampling our blood and brain datasets. Specifically, we performed 100 iterations in which we down sampled the female dataset such that the total number of female AD cases and controls was equal to the number of male AD cases and controls. In each iteration, we performed sex-stratified differential expression and computed the number of AD-associated genes, and across iterations we derived a 95% confidence interval. We randomly selected one iteration to replicate the functional analyses, network analyses, cell-type deconvolution, and machine learning analyses described in the original manuscript.
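The down sampling loop described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, `count_de_genes` stands in for the differential expression step (not reproduced here), and a percentile interval is assumed because the text does not specify how the 95% confidence interval was derived. The sample sizes come from the blood "Sum" row of Table 1 (females: 141 AD, 235 controls; males: 79 AD, 190 controls).

```python
import random


def downsample_counts(female_cases, female_controls, n_male_cases,
                      n_male_controls, count_de_genes, n_iter=100, seed=0):
    """Repeatedly subsample female cases/controls to the male group sizes.

    count_de_genes is a caller-supplied function (hypothetical here) that
    takes the sampled cases and controls and returns a gene count.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_iter):
        cases = rng.sample(list(female_cases), n_male_cases)
        controls = rng.sample(list(female_controls), n_male_controls)
        counts.append(count_de_genes(cases, controls))
    return counts


def percentile_interval(values, lower=2.5, upper=97.5):
    """Nearest-rank percentile interval (an assumed choice of 95% CI)."""
    ordered = sorted(values)

    def pick(q):
        return ordered[round(q / 100 * (len(ordered) - 1))]

    return pick(lower), pick(upper)


# Placeholder analysis step: returns the subsample size instead of a real
# differential expression result.
def placeholder_count(cases, controls):
    return len(cases) + len(controls)


female_cases = range(141)          # stand-in sample IDs
female_controls = range(141, 376)  # stand-in sample IDs
counts = downsample_counts(female_cases, female_controls, 79, 190,
                           placeholder_count)
print(len(counts), percentile_interval(counts))
```

Fixing the seed makes the selected subsamples reproducible, which matters when one iteration is later reused for follow-up analyses.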
Differential expression results in the brain revealed a significantly higher mean number of differentially expressed AD genes in females compared to males (p < 0.01), consistent with our original findings (Supplementary Table 20). Similarly, differential expression results in blood revealed a greater than 3-fold increase in the number of differentially expressed AD genes in females compared to males (p < 0.001), consistent with our original claims (Supplementary Table 21).
We next selected one down sampled iteration for follow up evaluation of enriched pathways in genes differentially expressed between AD cases and controls. In the randomly selected iterations from both blood and brain, we were able to replicate nearly every enriched pathway observed in the entire dataset. Unless otherwise indicated, Figure S6 displays the top 5 enriched pathways (adjusted P < 0.05) in each group of genes (i.e., male upregulated in AD, female upregulated in AD, etc.).
To assess whether network changes observed in the entire dataset are preserved in the down sampled dataset, we selected the same iteration described previously to perform weighted gene co-expression network analysis (WGCNA). Consistent with the original analysis, we created a WGCNA network separately in males and females to derive modules (or groups of genes) within sex-stratified data. Because only the female dataset was down sampled, module preservation between the down sampled dataset and the entire dataset was computed only for the female dataset. In the brain, we found 10 modules, each with a Z-summary score > 10, suggesting strong preservation in the entire dataset (Figure S7A). Similarly, in blood, we found 11 modules, each with a Z-summary score > 10, suggesting strong preservation in the entire dataset (Figure S7B). Overall, this analysis suggests that network effects in the down sampled dataset are strongly preserved in the entire dataset.
To assess whether the cell type deconvolution results are replicated in the down sampled dataset, we selected the same iteration described previously and computed cell type proportions using CIBERSORT. Figure S8 presents results for both the entire dataset (B) and the down sampled dataset (C). In the down sampled dataset, we observed that levels of M2 macrophages, neutrophils, naïve B cells, CD8 T cells, and memory B cells were significantly different between AD cases and controls among females (p < 0.05, C). Upon pooling both male and female samples, we similarly observed dysregulation in M2 macrophages, neutrophils, naïve B cells, CD8 T cells, and memory B cells. We did not observe dysregulation in any of the CIBERSORT cell types (A) among male samples (C). These results in pooled male + female samples, female samples only, and male samples only are consistent with the cell type changes we observed in the entire dataset (B).
To assess whether the performance of a linear support vector machine (SVM) model with l1 regularization, used to classify AD cases and controls based on blood gene expression data, differed between the down sampled data and the entire dataset, we created receiver operating characteristic (ROC) curves depicting the performance of each linear SVM model on a test set composed of 25% of samples. Features include gene expression data obtained via meta-analysis, age, sex, education, and apoE4 status. Models were fit for female samples only (Figure S9A, S9C) and male samples only (Figure S9B, S9D). While we did not down sample the male dataset, the performance in the male dataset was slightly different compared to the original manuscript (AUROC = 0.80 vs AUROC = 0.81 in the original dataset). This difference can be ascribed to the random seed used when training the SVM. In the down sampled dataset, consistent with our original claims (Figure S9A, S9B), we observed a higher AUROC in a model trained on female samples (AUROC = 0.85; Figure S9C) compared to a model trained on male samples (AUROC = 0.80; Figure S9D). Overall, these results suggest that performance differences in male and female samples are not strongly driven by sample size differences.