Differentially expressed genes between IPF patients and normal controls
We performed differential expression analysis on gene sequencing data from three IPF patient cohorts. Using our screening criteria (|log2FC| > 1 and p value < 0.05), we identified 964 up-regulated and 1123 down-regulated genes in GSE53945, 1698 up-regulated and 1309 down-regulated genes in GSE92592, and 617 up-regulated and 395 down-regulated genes in GSE124685. Volcano plots show the up-regulated and down-regulated genes in each data set (Figure 1A), and heatmaps show the overall expression levels of the differentially expressed genes in IPF patients and normal controls (Figure 1B, Supplementary Table S1). To obtain differentially expressed genes that are shared across cohorts and therefore more representative of IPF, we took the intersection of the three data sets and obtained 143 co-up-regulated genes and 104 co-down-regulated genes (Figure 1C).
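The screening and intersection steps described above can be sketched as follows. This is a minimal illustration, assuming each data set has already been reduced to a mapping from gene symbol to (log2 fold change, p value). The function names, the placeholder gene names, and the toy values are hypothetical and are not taken from the study.

```python
def split_degs(stats, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (up, down) gene sets passing |log2FC| > lfc_cutoff and p < p_cutoff."""
    up, down = set(), set()
    for gene, (log2fc, pval) in stats.items():
        if pval >= p_cutoff or abs(log2fc) <= lfc_cutoff:
            continue  # fails at least one screening criterion
        (up if log2fc > 0 else down).add(gene)
    return up, down


def common_degs(datasets, **cutoffs):
    """Intersect the up- and down-regulated sets across all data sets."""
    ups, downs = zip(*(split_degs(d, **cutoffs) for d in datasets))
    return set.intersection(*ups), set.intersection(*downs)


# Toy input (made-up numbers, for illustration only).
toy = [
    {"GENE_A": (2.1, 0.001), "GENE_B": (-1.6, 0.01), "GENE_C": (0.4, 0.2)},
    {"GENE_A": (1.4, 0.020), "GENE_B": (-2.2, 0.03), "GENE_C": (1.8, 0.01)},
    {"GENE_A": (3.0, 0.004), "GENE_B": (-1.1, 0.04), "GENE_C": (1.2, 0.30)},
]
co_up, co_down = common_degs(toy)
```

In this toy input, GENE_C passes the criteria in only one data set, so it is excluded from the shared up-regulated set.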
GO function enrichment analysis of differentially expressed genes
To further explore the functions of these differentially expressed genes, we performed GO function enrichment analysis on the up-regulated genes and the down-regulated genes separately. The up-regulated genes were enriched in response to vitamin, response to nutrient, regeneration, osteoblast differentiation, ossification, hemidesmosome assembly, extracellular matrix structural constituent, extracellular structure organization, extracellular matrix disassembly, collagen trimer, collagen metabolic process, collagen fibril organization, collagen-containing extracellular matrix, chondrocyte development, cartilage development, bone development, and basement membrane (Figure 2A). Notably, these up-regulated genes were enriched in multiple collagen-related GO terms. The genes involved include SFRP2, COL14A1, COL15A1, MMP11, SPP1, LOXL1, MMP1, TNC, ITGA11, COL17A1, COL10A1, MFAP2, VCAM1, FBLN2, SULF1, FAP, COL1A1, MMP13, CCDC80, GREM1, TTR, POSTN, MMP7, MMP10, and COMP. The down-regulated genes were enriched in regulation of calcium-mediated signaling, positive regulation of integrin-mediated signaling pathway, positive regulation of endothelial cell migration, positive regulation of cold-induced thermogenesis, PDZ domain binding, negative regulation of apoptotic process, N-methyltransferase activity, integral component of membrane, G protein-coupled peptide receptor activity, extracellular region, calcium ion binding, activation of adenylate cyclase activity, and so on (Figure 2B, Supplementary Table S2). The GO terms enriched among the down-regulated genes may be involved in the occurrence and progression of IPF.
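Enrichment analyses of this kind are commonly based on a hypergeometric (one-sided Fisher) test. The text does not state which tool or statistic was used, so the following is only a sketch under that assumption. The function name and all counts except the query size of 143 (the co-up-regulated set) are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_query, n_overlap):
    """P(X >= n_overlap) when n_query genes are drawn from n_background genes,
    of which n_term are annotated to the GO term of interest."""
    upper = min(n_term, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 100 annotated to a term,
# 143 query genes, 10 of which carry the annotation.
p = hypergeom_enrichment_p(20000, 100, 143, 10)
```

In practice the resulting p values would also be corrected for multiple testing across all GO terms, a step not shown here.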
KEGG function enrichment analysis of differentially expressed genes
To explore the altered signaling pathways in IPF, we further performed KEGG enrichment analysis on the up-regulated genes and the down-regulated genes separately. The results suggested that the up-regulated genes were enriched in Protein digestion and absorption, PI3K-Akt signaling pathway, Phagosome, Pathogenic Escherichia coli infection, Focal adhesion, ECM-receptor interaction, Cytokine-cytokine receptor interaction, and Cell adhesion molecules (CAMs) (Figure 3A). The down-regulated genes were enriched in Pathways in cancer, Neuroactive ligand-receptor interaction, Metabolic pathways, Dilated cardiomyopathy (DCM), cGMP-PKG signaling pathway, and cAMP signaling pathway (Figure 3B, Supplementary Table S3).
Protein interaction network of differentially expressed genes
To further investigate the molecular mechanism of IPF development, we used the STRING tool to describe the interaction network of these differentially expressed genes. The result indicated that these genes form an intertwined network (Figure 4A). We then used MCODE to screen out two clusters of core genes from these differentially expressed genes: cluster 1 included COL17A1, COL10A1, COL14A1, COL15A1, MMP13, SPP1, and MMP1, and cluster 2 included P2RY6, NTS, ADRB1, VIPR1, RXFP1, EDNRE, and BDKRB2 (Figure 4B). Some of these genes are known to play a key role in the progression of pulmonary fibrosis; the others may also be involved in its formation.
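The idea of extracting densely connected core genes from an interaction network can be illustrated with a k-core decomposition. This is a simplified stand-in and not the MCODE algorithm itself, which additionally weights nodes by local density and grows clusters from seed nodes. The function name, protein labels, and edges below are hypothetical.

```python
def k_core(edges, k):
    """Return the nodes of the k-core: the maximal subgraph in which every
    node has at least k neighbours inside the subgraph."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    changed = True
    while changed:
        # Repeatedly peel off nodes with fewer than k remaining neighbours.
        weak = [n for n, nbrs in adj.items() if len(nbrs) < k]
        changed = bool(weak)
        for n in weak:
            for m in adj.pop(n):
                if m in adj:
                    adj[m].discard(n)
    return set(adj)


# Toy network: P1-P4 are fully connected; P5 and P6 hang off the side.
toy_edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P2", "P3"), ("P2", "P4"), ("P3", "P4"),
    ("P1", "P5"), ("P5", "P6"),
]
core = k_core(toy_edges, k=3)
```

Here the peripheral nodes P5 and P6 are peeled away, leaving the fully connected group as the core.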
GSEA analysis of the mRNA expression profile
We carried out GSEA on the expression profiles of IPF and normal tissues to explore the signaling pathways altered during the formation and development of IPF. The results showed that Asthma, Type I diabetes mellitus, Lupus erythematosus, Intestinal immune network for IgA production, and p53 signaling pathway were activated in IPF tissues, while Aldosterone-regulated sodium reabsorption was activated in normal tissues (Figure 5).
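The core quantity behind GSEA is a running-sum enrichment score computed over a ranked gene list. The sketch below uses the unweighted form of that statistic. Standard GSEA typically weights hits by their ranking metric and assesses significance by permutation, neither of which is shown, and the gene labels are placeholders.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score: walk down the ranked list,
    step up at set members, step down otherwise, and return the signed
    maximum deviation from zero."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Placeholder ranking, from most up-regulated in IPF to most down-regulated.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]
es_top = enrichment_score(ranked, {"G1", "G2"})
es_bottom = enrichment_score(ranked, {"G5", "G6"})
```

A positive score indicates a gene set concentrated at the top of the ranking (here, the IPF side), and a negative score indicates concentration at the bottom.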
The role of gene methylation modification in pulmonary fibrosis
To further explore the molecular mechanism of lung fibrosis, we analyzed two methylation array data sets from IPF and normal samples. As shown in the heatmap, gene methylation patterns differ between IPF tissue and normal lung tissue. The differentially methylated genes distinguished IPF tissue from normal lung tissue well and may serve as new methylation markers for IPF (Figure 6A, Supplementary Table S4). By taking the intersection of the differentially expressed gene set and the differentially methylated gene set, we obtained 8 genes with both low methylation and high expression: CXCL14, DAPL1, DOK5, FNDC4, MMP7, MMP10, MMP11, and SPP1 (Figure 6B).
GO function enrichment analysis of methylated genes
We carried out GO enrichment analysis of hypermethylated and hypomethylated genes, respectively. The results indicated that the hypermethylated genes were enriched in transforming growth factor-beta receptor signaling pathway, transcription regulator complex, SMAD protein signal transduction, response to endoplasmic reticulum stress, negative regulation of the apoptotic process, cell migration, and so on (Figure 7A). The hypomethylated genes were enriched in regulation of small GTPase mediated signal transduction, positive regulation of MAPK cascade, positive regulation of adenylate cyclase activity, negative regulation of cell population proliferation, inflammatory response, extracellular matrix organization, extracellular exosome, DNA-binding transcription factor activity, collagen catabolic process (Figure 7B, Supplementary Table S5).
KEGG function enrichment analysis of methylated genes
KEGG enrichment analysis was performed to investigate the signaling pathways that these differentially methylated genes might be involved in. The results indicated that the hypermethylated genes were enriched in the TGF-beta signaling pathway, Ras signaling pathway, Rap1 signaling pathway, PI3K-Akt signaling pathway, HIF-1 signaling pathway, Focal adhesion, Chemokine signaling pathway, and so on (Figure 8A). The hypomethylated genes were enriched in Toll-like receptor signaling pathway, p53 signaling pathway, Hippo signaling pathway, Cytokine-cytokine receptor interaction, Cell adhesion molecules (CAMs), cAMP signaling pathway, C-type lectin receptor signaling pathway (Figure 8B, Supplementary Table S6).
Validation in a mouse model
To verify whether these hypomethylated genes were highly expressed in pulmonary fibrosis, mouse models of pulmonary fibrosis were constructed by intraperitoneal injection of bleomycin. Masson's trichrome staining indicated that the bleomycin treatment group showed apparent collagen deposition and fibrotic characteristics (Figure 9A). RT-PCR was performed to detect the mRNA levels of these genes in pulmonary fibrosis tissues. The results indicated that the expression of CXCL14, DAPL1, DOK5, FNDC4, MMP7, MMP10, MMP11, and SPP1 was significantly increased in the pulmonary fibrosis group compared with the control group (Figure 9B).
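Relative mRNA levels from RT-PCR are often quantified with the 2^-ΔΔCt method. The text does not state how expression was quantified, so the following is a sketch under that assumption. The function name, the use of a single reference gene, and the Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (treated vs control) by 2^-ddCt,
    using mean Ct values and one reference (housekeeping) gene."""
    d_ct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    return 2.0 ** -(d_ct_treated - d_ct_control)


# Made-up Ct values: the target amplifies two cycles earlier in treated
# samples, while the reference gene is unchanged.
fold = relative_expression(
    ct_target_treated=[24.0, 24.0], ct_ref_treated=[18.0, 18.0],
    ct_target_control=[26.0, 26.0], ct_ref_control=[18.0, 18.0],
)
```

A two-cycle decrease in normalized Ct corresponds to a four-fold increase in relative expression, assuming close to 100% amplification efficiency.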