NSCLC- and COPD-associated genes
A search of the OMIM database for ‘COPD’ and ‘NSCLC’ (performed on May 24, 2019) returned 1615 NSCLC-associated and 417 COPD-associated genes (Supplementary Table I). The two diseases shared 125 genes, accounting for 7.74% (125/1615) of the NSCLC-associated genes and 29.98% (125/417) of the COPD-associated genes.
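The overlap figures above can be reproduced with a simple set intersection. The following is a minimal sketch, not the authors' pipeline: the function name `overlap_summary` is hypothetical, and the gene identifiers are placeholders sized to match the reported counts rather than real OMIM entries.

```python
def overlap_summary(genes_a, genes_b):
    """Return the shared genes and the percentage of each list they represent."""
    a, b = set(genes_a), set(genes_b)
    shared = a & b
    return {
        "n_shared": len(shared),
        "pct_of_a": round(100 * len(shared) / len(a), 2),
        "pct_of_b": round(100 * len(shared) / len(b), 2),
        "shared": shared,
    }


# Placeholder identifiers sized to match the reported counts (not real gene symbols)
shared_ids = {f"shared_{i}" for i in range(125)}
nsclc = shared_ids | {f"nsclc_{i}" for i in range(1615 - 125)}
copd = shared_ids | {f"copd_{i}" for i in range(417 - 125)}

summary = overlap_summary(nsclc, copd)
print(summary["n_shared"], summary["pct_of_a"], summary["pct_of_b"])
# prints: 125 7.74 29.98
```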
Topological analysis of the disease networks
The NSCLC network (Fig. 1A) and the COPD network (Fig. 1B) both spread outward from the center. The topological parameters of the two disease networks are compared in Table 1. The NSCLC network included 1615 associated genes and 4,405 edges (interactions), making it more complex than the COPD network (417 genes and 907 edges). The node degree (the number of edges connected to a node) in the NSCLC (Fig. 1C) and COPD (Fig. 1D) networks followed a power-law distribution.
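One rough way to check for power-law behavior is to tabulate the degree distribution and fit a straight line on log-log axes. The sketch below assumes the network is available as an undirected edge list; the function names are hypothetical, and an ordinary least-squares fit is used here only for illustration, which may differ from the fitting procedure behind Fig. 1C and 1D.

```python
import math
from collections import Counter


def degree_counts(edges):
    """Count the number of edges incident to each node of an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def degree_distribution(degree):
    """Map each observed degree k to the fraction of nodes having that degree."""
    histogram = Counter(degree.values())
    n_nodes = len(degree)
    return {k: count / n_nodes for k, count in sorted(histogram.items())}


def loglog_slope(distribution):
    """Least-squares slope of log10 P(k) against log10 k (needs >= 2 distinct degrees)."""
    xs = [math.log10(k) for k in distribution]
    ys = [math.log10(p) for p in distribution.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy star network with placeholder node names: one hub linked to four leaves
toy_edges = [("hub", f"leaf{i}") for i in range(4)]
toy_dist = degree_distribution(degree_counts(toy_edges))
print(toy_dist)                         # {1: 0.8, 4: 0.2}
print(round(loglog_slope(toy_dist), 3))  # -1.0
```

A negative slope on log-log axes corresponds to the exponent of P(k) ∝ k^(-γ); a straight-line fit alone is only suggestive of a power law, not a formal test.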
Functional enrichment analysis of NSCLC
DAVID functional annotation was used for enrichment analysis of the genes associated with the two diseases. Gene Ontology (GO) analysis yielded 6327 GO terms and 1702 pathways, with 317051 connections among GO terms (Fig. 2). The most significant functions were regulation of apoptotic process, cellular protein metabolic process, and positive regulation of protein metabolic process. Cellular protein modification process, regulation of cellular macromolecule biosynthetic process, and RNA metabolic process contained the largest numbers of genes (569, 561 and 546, respectively). The most significant pathways were IRS-mediated signaling, SOS-mediated signaling, and downstream signaling events of B Cell Receptor (BCR) (Table 2).
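Enrichment significance of this kind is commonly assessed with an over-representation test. The sketch below shows a plain hypergeometric upper-tail p-value; it is a generic illustration and not DAVID's exact calculation (DAVID reports a modified Fisher's exact p-value). The function name and all numbers in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from a background of N genes,
    K of which carry the annotation of interest."""
    total = comb(N, n)
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical numbers for illustration only: 10 background genes, 5 annotated,
# a list of 5 genes that are all annotated
p_value = hypergeom_upper_tail(k=5, n=5, K=5, N=10)
print(p_value)  # 1/252, about 0.00397
```

In practice the p-values for thousands of terms would also need multiple-testing correction before terms are ranked as "most significant".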
Functional enrichment analysis of COPD
The COPD network comprised 1515 GO terms and 716 pathways, with 107254 connections among GO terms (Fig. 3). The most significant functions were inflammatory response, regulation of apoptotic process, and cellular response to chemical stimulus. Cellular protein modification process, regulation of cellular macromolecule biosynthetic process, and RNA metabolic process contained the largest numbers of genes. The most significant pathway was cytokines and inflammatory response (Table 3).
Modules of NSCLC and their functional enrichment analysis
A total of 124 modules were identified from the NSCLC network using MCODE. The top 3 modules of the NSCLC-associated network involved 315 biological processes (Fig. 4), including positive regulation of transcription from RNA polymerase II promoter, negative regulation of apoptotic process and inflammatory response. The top 3 modules contained 12 hub genes (genes with more than 10 connections in the network).
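Selecting hub genes by a degree cutoff can be sketched as follows. This assumes the cutoff means a degree strictly greater than 10 and that the input is an undirected edge list; the function name `hub_genes` and the toy gene labels are hypothetical, and the MCODE clustering step itself is not reproduced here.

```python
from collections import Counter


def hub_genes(edges, min_degree=10):
    """Return genes whose degree is strictly greater than min_degree, sorted by name."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(gene for gene, d in degree.items() if d > min_degree)


# Toy network with placeholder labels: HUB has 11 partners, every other node has 1 or 2
toy_edges = [("HUB", f"g{i}") for i in range(11)] + [("g0", "g1")]
print(hub_genes(toy_edges))  # ['HUB']
```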
Modules of COPD and their functional enrichment analysis
A total of 35 modules were identified from the COPD network (Fig. 5). The top 3 modules of the COPD-associated network involved 121 biological processes, including negative regulation of apoptotic process, cell proliferation and inflammatory response. They shared 81 pathways, including the AMPK signaling pathway, pathways in cancer and the non-small cell lung cancer pathway.
Validation of hub genes
We examined the expression profiles of the overlapping hub genes in the top three modules of NSCLC and COPD. MMP9 and BCL2 were expressed at higher levels in the blood of NSCLC patients with COPD than in healthy donors, whereas BAX and TP53 were expressed at lower levels (Fig. 6).