Background:
Aberrant deposition of neurofibrillary tangles (NFTs) is a main characteristic of Alzheimer's disease (AD). Considerable evidence shows that the regional distribution of NFTs in the brain differs across the stages of AD. For example, the entorhinal cortex (EC) is the first area where NFT deposits occur in AD. Here, we use machine learning and weighted gene co-expression network analysis (WGCNA) to explore the relationship between the cortical regions affected by NFT deposition and Braak stages, and to reveal potential biomarkers and therapeutic targets for AD.
Objective: To explore differences in gene expression patterns across multiple brain regions at distinct Braak stages and to identify hub genes using WGCNA and machine learning.
Methods: Transcriptional profiling data of the human entorhinal cortex, temporal cortex (TC) and frontal cortex (FC), derived from individuals spanning Braak stages 0 to VI, were obtained from the NCBI GEO database (GSE131617). For the WGCNA, we first detected consensus modules across the different brain regions and illustrated the relationships among these modules. Second, we conducted single-sample gene set enrichment analysis (ssGSEA) to obtain the enrichment scores of the samples and screened the enriched genes using best subset regression and the random forest algorithm. Third, we analyzed the relationships between the consensus modules (EC-FC, EC-TC, and FC-TC) and the ssGSEA enrichment scores. Next, we identified the genes shared between the differentially expressed genes (DEGs; Braak stage 0 versus Braak stages I-VI) and the genes of interest in the module. Metascape analysis was conducted to determine the functions of the overlapping genes, and a random forest classifier was applied to obtain the most significant genes among them. Finally, the resulting significant genes were identified through network analysis.
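The overlap-and-rank filtering step described in the Methods can be illustrated with a minimal sketch. It assumes that per-gene importance scores have already been produced by a separately fitted random forest; the function name, the `top_k` cutoff, and the placeholder gene names and scores are hypothetical and are not taken from the study.

```python
def filter_candidate_genes(deg_genes, module_genes, importance, top_k):
    """Keep genes that are both DEGs and module members, ranked by importance.

    importance: dict mapping gene -> importance score (assumed to come from a
    random forest fitted elsewhere; genes without a score are treated as 0.0).
    """
    # Step 1: overlap between differentially expressed genes and module genes.
    overlap = set(deg_genes) & set(module_genes)
    # Step 2: rank by descending importance; ties are broken alphabetically.
    ranked = sorted(overlap, key=lambda g: (-importance.get(g, 0.0), g))
    # Step 3: keep the top_k highest-ranked genes.
    return ranked[:top_k]


# Toy input with placeholder names and made-up scores (illustration only).
toy_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
toy_module = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]
toy_importance = {"GENE_B": 0.12, "GENE_C": 0.40, "GENE_D": 0.05, "GENE_E": 0.90}

top_genes = filter_candidate_genes(toy_degs, toy_module, toy_importance, top_k=2)
print(top_genes)
```

In this toy input, GENE_E has the highest score but is excluded because it is not a DEG, which is the intended effect of intersecting before ranking.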
Results: The preservation of consensus modules and connectivity between brain regions (Preserv(EC, FC) = 0.91, Preserv(EC, TC) = 0.95, Preserv(TC, FC) = 0.9) indicated very high preservation of the correlations between all pairs of eigengenes across each pair of networks. Using ssGSEA, random forest and the best subset algorithm, we found that the oxidative damage pathway plays a vital role in classifying the Braak stages, with an importance (imp) of 0.57. Through WGCNA, the black module was found to be highly positively correlated with oxidative damage and to be mainly involved in the immune response. By filtering the module genes step by step, first by overlap with the DEGs and then by random forest classifier analysis, we found that LYN, CD68, LAPTM5, IFI30, PIK3AP1, HCK and ARHGDIB were co-expressed and highly correlated with oxidative damage and the immune response.
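For readers unfamiliar with the preservation measure, the following is a minimal sketch of how a preservation density between two eigengene networks is commonly defined. It assumes the standard eigengene-network formulation: adjacency (1 + cor)/2 between module eigengenes, and preservation equal to 1 minus the absolute adjacency difference, averaged over all module pairs. Whether the study used exactly this formulation is an assumption; the function names and toy eigengene values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def eigengene_adjacency(eigengenes):
    """Map each module pair to the adjacency (1 + cor) / 2, which lies in [0, 1].

    eigengenes: dict mapping module name -> eigengene values across samples.
    """
    return {
        (m1, m2): (1 + pearson(eigengenes[m1], eigengenes[m2])) / 2
        for m1, m2 in combinations(sorted(eigengenes), 2)
    }


def preservation_density(eig_a, eig_b):
    """Mean of 1 - |adj_a - adj_b| over all pairs of shared consensus modules."""
    adj_a = eigengene_adjacency(eig_a)
    adj_b = eigengene_adjacency(eig_b)
    pres = [1 - abs(adj_a[pair] - adj_b[pair]) for pair in adj_a]
    return sum(pres) / len(pres)


# Toy eigengenes for two regions (made-up values; sample counts may differ).
region_a = {"black": [0.1, 0.4, -0.2, -0.3], "blue": [0.2, 0.3, -0.1, -0.4],
            "red": [-0.3, 0.1, 0.4, -0.2]}
region_b = {"black": [0.3, -0.1, -0.4, 0.2, 0.0], "blue": [0.2, 0.0, -0.3, 0.3, -0.2],
            "red": [-0.2, 0.4, 0.1, -0.3, 0.0]}

density = preservation_density(region_a, region_b)
print(round(density, 2))
```

Under this definition a value of 1 means the eigengene correlation structure is identical in the two regions, so values of 0.9 and above indicate closely matching module relationships.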
Conclusion: The co-expression networks of different brain regions affected by AD are strongly similar. Molecules such as IFI30, LYN, CD68, PTPRC, HCK, LAPTM5, FCER1G, and ARHGDIB play an important role in the early stages of AD through the microglia-mediated inflammatory response.