Identification of DEGs in the EAC patients
To explore the variation of gene expression during BE to EAC, we downloaded datasets GSE26886 and GSE37200, and the DEGs between BE and EAC were analyzed, respectively (Figure 1A-D). Then, we sought for the overlapping DEGs between the two datasets, and found that 27 up- and 104 down-regulated genes were observed (Figure 1E-F).
GO and KEGG Pathway Enrichment Analysis
To explore the potential roles of DEGs between BE and EAC, GO and KEGG pathway enrichment analyses were performed. GO analysis showed that the up-regulated genes were mainly involved in biological processes (BP) associated with negative regulation of cell population proliferation, skeletal system development, blood vessel development, positive regulation of programmed cell death, and ossification (Figure 2A). In contrast, the down-regulated genes were mainly involved in BP associated with monocarboxylic acid metabolic process, digestion, thrombin-activated receptor signaling pathway, response to zinc ion, and cellular response to fluid shear stress (Figure 2B). These results indicated that the DEGs were highly associated with epithelial-mesenchymal transition (EMT) and nutrition. KEGG analysis indicated that the up-regulated DEGs were primarily enriched in Pertussis, IL-17 signaling pathway, cytokine-cytokine receptor interaction, ECM-receptor interaction, protein digestion and absorption, and Amoebiasis (Figure 2C); while the down-regulated genes were enriched in chemical carcinogenesis, Amoebiasis, drug metabolism-other enzymes, steroid hormone biosynthesis, bile secretion, glycerophospholipid metabolism, and inflammatory mediator regulation of trp channels (Figure 2D). These results demonstrated that the DEGs were highly involved in tumorigenesis.
Identification of Key Modules by WGCNA
WGCNA analysis provides an overview of the transcriptomic organization, and the relationships between sets of genes with external, biological traits. To identify key modules related to clinical traits, WGCNA was performed by using GSE26886 dataset (Figure 3A). The power of β = 8 (scale-free R2=0.90) was selected as the soft thresholding parameter to construct a scale-free network (Figure 3B). Similar module clustering was constructed by using dynamic hybrid cutting (threshold = 0.25). A total of 25 modules were identified (Figure 3C). The results in Figure 3D showed that the grey module was the highest positive module correlated to EAC (R2 = 0.86,P = 2e-12) and was highly negative correlated to BE (R2 = 0.86,P = 2e-12). Figure 3E showed gene significance for BE and EAC in grey module.
In the module-trait analysis, we intersected the trait-related genes in grey module highly associated with EAC and 131 DEGs generated from expression difference analysis, and finally extracted 27 trait-expression-related genes for the following analysis (Figure 3F, Table S1-2). These results showed that the 27 trait-expression-related genes were significantly correlated with the pathogenesis of EAC.
Exploration of Trait-expression-related Genes in TCGA database
Next, further validation and exploration were conducted among the 27 trait-expression-related genes in TCGA database. MYO1A, ACE2, COL1A1, LGALS4, and ADRA2A were significantly up-regulated in EAC tissue; while AADAC, RAB27A, and P2RY14were abnormally down-expressed in EAC tissue, which indicated that these genes were repeatable in EAC (Figure 4A). Subsequently, ROC curves were performed to estimate the diagnostic value in EAC, and the result showed that the genes mentioned above had good diagnostic properties (Figure 4B). Later, survival analysis was performed to explore the prognostic value of the 8 genes. Low- AADAC and ACE2 expressionwere significantly correlated with poor progress-free interval (PFI); while low- ADRA2A expression was associated with poor overall survival (OS) and disease-specific survival (DSS). These results further illustrated that AADAC, ACE2,and ADRA2A contribute to EAC pathogenesis and progression.