Identification of DE lncRNAs and mRNAs in lung adenocarcinoma and adjacent tissues
In our study, microarray profiling performed with 4 paired LUAD tissue samples (tumor and paracarcinoma tissues) to identify DE lncRNAs and DE mRNAs. An overview of this study is shown in Fig. 1. In general, we identified 2819 lncRNAs and 2396 mRNAs with significant differential expression (P-value < 0.05 and fold change (FC) ≥ 2 or ≤ 0.5): 859 upregulated lncRNAs and 1960 downregulated lncRNAs; 757 upregulated and 1639 downregulated mRNAs. The hierarchical clustering heatmap showed the expression levels of the DE lncRNAs (Fig. 2A) and mRNAs (Fig. 2B) and distinguished cancer from adjacent tissues based on the molecular signature of these DE lncRNAs and mRNAs. By volcano plot and scatter plot analyses evaluating the overall distribution of the two sets of data, these DE RNAs were divided into up- and downregulated lncRNAs (Fig. 2C) and up- and downregulated mRNAs (Fig. 2D). The number of downregulated genes was significantly greater than that of upregulated genes. In addition, the top 20 significantly DE lncRNAs and mRNAs were identified according to FC values and are shown in Tables 1 and 2.
Table 1
Top-20 differentially expressed lncRNAs of lung adenocarcinoma and adjacent tissue samples
Accession | pvalues | FC(abs) | Regulation | Chr | GeneSymbol |
ENST00000423781 | 0.045138791 | 154.273527 | up | chr7 | AC004870.4 |
lnc-SYT16-1:1 | 0.008110801 | 62.1498876 | up | chr14 | --- |
lnc-TSPAN13-2:1 | 0.000027973 | 34.927184 | up | chr7 | --- |
ENST00000431027 | 0.007829215 | 26.7528941 | up | chr1 | RP3-340N1.2 |
NR_046533 | 0.000809301 | 25.2378211 | up | chr13 | CLDN10-AS1 |
lnc-BCKDHB-4:1 | 0.002525077 | 18.4268214 | up | chr6 | --- |
ENST00000605886 | 0.04382729 | 16.6142073 | up | chr1 | RP11-284F21.10 |
NR_125404 | 0.023786704 | 15.962481 | up | chr3 | LOC100505920 |
lnc-USP26-3:1 | 0.003316762 | 15.7760352 | up | chrX | --- |
lnc-BCKDHB-6:1 | 0.006763185 | 13.5382121 | up | chr6 | --- |
lnc-NSRP1-2:2 | 0.000161724 | 27.71072711 | down | chr17 | --- |
ENST00000480831 | 0.001737636 | 15.74807358 | down | chr3 | ADAMTS9-AS1 |
ENST00000432452 | 0.001497719 | 14.60031376 | down | chr10 | RP11-464C19.3 |
lnc-GTDC1-15:1 | 0.000686192 | 14.54798755 | down | chr2 | --- |
ENST00000443224 | 0.000787963 | 14.36781879 | down | chr10 | RP11-371A19.2 |
lnc-ZPLD1-2:2 | 0.004414129 | 14.01720412 | down | chr3 | --- |
ENST00000507525 | 0.000368778 | 13.3686819 | down | chr4 | RP13-577H12.2 |
lnc-TRAPPC5-1:1 | 0.001126285 | 12.74950409 | down | chr19 | --- |
NR_003928 | 0.012408513 | 12.17016938 | down | chr1 | CHIAP2 |
ENST00000624132 | 0.015263542 | 11.88176057 | down | chr9 | RP11-205K6.3 |
Table 2
Top-20 differentially expressed mRNA of lung adenocarcinoma and adjacent tissue samples
Accession | pvalues | FC(abs) | Regulation | Chr | GeneSymbol |
NM_173076 | 0.00025142 | 112.2153919 | up | chr2 | ABCA12 |
NM_003695 | 0.02604819 | 93.23154863 | up | chr8 | LY6D |
NM_001032280 | 0.01004802 | 63.89111709 | up | chr6 | TFAP2A |
NM_032899 | 0.00605935 | 41.28293568 | up | chr8 | FAM83A |
NM_001199042 | 0.03422378 | 40.11209326 | up | chr15 | STRA6 |
NM_001164431 | 0.01571357 | 38.51455003 | up | chr20 | ARHGAP40 |
NM_025153 | 0.00026241 | 35.86487577 | up | chr5 | ATP10B |
NM_001080407 | 0.02410958 | 30.83360008 | up | chr11 | GLB1L3 |
NM_001251830 | 0.00046426 | 30.81333114 | up | chr4 | SPP1 |
NM_001077188 | 0.00251909 | 30.572041 | up | chrX | HS6ST2 |
NM_012391 | 0.00602412 | 26.86301237 | up | chr6 | SPDEF |
NM_001045 | 0.00033657 | 45.9038257 | down | chr17 | SLC6A4 |
NM_000261 | 0.000161133 | 30.0889688 | down | chr1 | MYOC |
NM_001114133 | 0.00024818 | 20.6970509 | down | chr10 | SYNPO2L |
NM_203451 | 0.000202297 | 19.3511029 | down | chr13 | SERTM1 |
NM_001332 | 0.000243926 | 16.3150335 | down | chr5 | CTNND2 |
NM_153370 | 0.000374892 | 15.9720912 | down | chr6 | PI16 |
NM_021146 | 0.005748437 | 15.6408757 | down | chr1 | ANGPTL7 |
NM_000575 | 0.000118471 | 14.9645892 | down | chr2 | IL1A |
NM_001278236 | 0.003914533 | 14.8213111 | down | chr11 | PTPN5 |
NM_032961 | 0.000080895 | 14.6704759 | down | chr4 | PCDH10 |
Validation of the DE lncRNAs via The Cancer Genome Atlas (TCGA) database
To verify the microarray data in a large cohort of clinical samples, we downloaded the TCGA LUAD database, which contains both gene expression and patient survival data for the screened cohort, and obtained 573 samples (including 514 LUAD tissue samples and 59 adjacent tissue samples). The clinical information of the patients is shown in Table 3. As shown in the hierarchical clustering heatmap (Fig. 2E) and the volcano plot (Fig. 2F), 1916 DE lncRNAs (| log2FC| > 1, P < 0.05), namely, 1271 upregulated and 645 downregulated lncRNAs, were identified. As shown in the Venn diagram (Fig. 2G), the intersection of the 2819 DE lncRNAs identified by microarray analysis with the 1916 DE lncRNAs identified by TCGA database analysis contained 255 overlapping DE lncRNAs.
Table 3
The Clinicopathological characteristics of LUAD samples downloaded from TCGA database.
Clinicopathological characteristics | Patients (N = 514) |
N | % |
Age | | |
< 68 | 280 | 54.4 |
≥ 68 | 224 | 43.6 |
Gender | | |
Male | 240 | 46.7 |
Female | 274 | 53.3 |
Pathologic stage | | |
Stage l | 279 | 54.3 |
Stage ll | 122 | 23.7 |
Stage lII | 78 | 15.2 |
Stage V | 26 | 2.5 |
Pathologic T | | |
T1 | 172 | 33.5 |
T2 | 277 | 53.9 |
T3 | 46 | 4.4 |
T4 | 18 | 1.7 |
Tx | 4 | < 0.3 |
Pathologic N | | |
NO | 335 | 32.5 |
N1 | 95 | 9.2 |
N2 | 68 | 6.6 |
N3 | 2 | < 0.1 |
Pathologic M | | |
MO | 340 | 33.0 |
M1 | 25 | 4.86 |
Mx | 144 | 14.0 |
Vital status | | |
Alive | 333 | 32.3 |
Dead | 181 | 17.6 |
Annotation Analyses Of The De Lncrnas And Mrnas
GO analysis was used to annotate gene functions and standardize the descriptions of the DE genes according to the biological process (BP), cellular component (CC), and molecular function (MF) categories. We analyzed the results of cis-regulated lncRNAs and found that most of the top 30 GO terms enriched with the upregulated and downregulated genes (i.e., DE lncRNAs and DE mRNAs) were in the BP and CC categories (Fig. 3A, 3B). The top 3 descriptive terms enriched with the DE lncRNAs were atomic septum development, structural molecule activity conferring elasticity, and embryonic digestive tract morphogenesis (Fig. 3C). However, condensed chromosome outer kinetochore, cell migration involved in heart development, and regulation of vasculogenesis were the top 3 descriptive terms enriched with the DE mRNAs (Fig. 3D). Moreover, all DE lncRNAs and DE mRNAs were involved in angiogenesis and cell proliferation.
In our study, KEGG, a database for pathway analysis of DE genes to identify their biological functions, was divided into the following six classifications: cellular processing, environmental information processing, genetic information processing, human diseases, metabolism, and organismal systems. Comprehensive analysis of the KEGG classification results for the DE lncRNAs (Fig. 3G) and DE mRNAs (Fig. 3H) showed that the DE genes were enriched mainly in the signal transduction, immune system, and cancers: overview pathway terms. Moreover, KEGG pathway enrichment analysis suggested that the DE genes were enriched mainly in vascular smooth muscle contraction, focal adhesion and TGF beta signaling pathway (Fig. 3E, F). Further analysis of the KEGG pathway term human diseases showed that these genes were closely related to small cell lung cancer, NSCLC, melanoma, glioma, prostate cancer, thyroid cancer, colorectal cancer (CRC) and other tumors (Fig. 3G, H).
Analysis Of Lncrna Target Genes
To further clarify the functional annotations of the DE genes, we determined the intersection of the 1302 DE lncRNA target genes and the 2396 DE mRNAs. Then, 523 common DE genes were selected via Venn diagram software (Fig. 4A). GO analysis showed that these genes were also enriched in the CC and BP categories. Moreover, the main enriched terms were extracellular matrix, myosin complex, and cytoskeleton in the CC category (Fig. 4B); signal transduction in the BP category (Fig. 4C); and peptidase activity in the MF category (Fig. 4D). The KEGG analysis results showed that these genes were enriched mainly in the pathways focal adhesion, axon guidance, differentiated cardiomyopathy, and melanoma (Fig. 4E). These results suggested that these genes may play an important role in cell morphology, adhesion, intercellular connections, and signal transduction.
Candidate DE lncRNA validation in LUAD cell lines and overall survival analysis
We selected 4 candidate DE lncRNAs from the 255 overlapping genes: 2 downregulated genes (ENST00000609697 and ENST00000443224) and 2 upregulated genes (ENST00000602992 and NR_024321). To confirm the screening results, the expression of the 4 DE lncRNAs was validated in 7 LUAD cell lines and compared with that in the BEAS-2B cell line using qRT-PCR (Fig. 5A). The expression of ENST00000609697 and ENST00000443224 showed a significant decreasing trend in almost all 7 LUAD cell lines, consistent with the microarray data (P < 0.05), while ENST00000443224 was upregulated in H1993 cells (P < 0.05) (Fig. 5A). The significant increasing trend (P < 0.05) in ENST00000602992 and NR_024321 expression was also consistent with the microarray data, but the increasing trend in NR_024321 expression was not obvious in H2228 cells (Fig. 5B). Furthermore, we downloaded gene expression data and patient follow-up data from the TCGA dataset to further elucidate whether these candidate genes are potential prognostic markers for LUAD. Through TCGA dataset analysis, we found that ENST00000609697 was downregulated (P < 0.001) (Fig. 5C) and was the only candidate gene related to the prognosis of LUAD (log-rank P = 0.029) (Fig. 5D). ENST00000602992 and NR_024321 were upregulated in the TCGA dataset (P < 0.001) (Figure S1A, S1B). However, ENST00000602992 was not associated with the prognosis of LUAD (P = 0.24) (Figure S1C), and upregulation of NR_024321 was not positively correlated with good prognosis in LUAD (P = 0.018) (Figure S1D). Moreover, downregulation of ENST00000609697 was positively correlated with good prognosis in LUAD. It was considered the candidate biomarker and may act as a tumor suppressor.
Cerna Regulatory Network Of The De Lncrnas
To further illustrate the potential interactions among the DE lncRNAs, DE miRNAs and DE mRNAs involved in LUAD, lncRNA-miRNA-mRNA ceRNA networks were constructed, and a total of 188 DE lncRNAs, 444 DE miRNAs and 410 DE mRNAs were selected (Supplement S1E). Moreover, we found that most DE lncRNAs in the ceRNA regulatory network were downregulated. We singled out the ceRNA network of the candidate gene ENST00000609697 and found that 7 miRNAs (hsa-miR-3191-3p, hsa-miR-4731-5p, hsa-miR-598-5p, hsa-miR-6791-5p, hsa-miR-4292, hsa-miR-4446-3p and hsa-miR-1827) and 20 DE mRNAs (COLGALT2, MYOCD, TNS1, RASL12, CNN1, and so on) are involved in this network (Fig. 6A). Hsa-miR-4731-5p was enriched and targeted most DE mRNAs in the ENST00000609697 ceRNA network, indicating that it may play a critical role in LUAD.
Functional and survival analyses considering the target DE mRNAs in the ENST00000609697 ceRNA network
Furthermore, we conducted GO enrichment analysis of the 20 targeted DE mRNAs in three ontologies: BP, CC and MF. The 30 GO terms most enriched with the 20 targeted DE mRNAs are shown in Fig. 6B. The most enriched GO terms in the BP, CC, and MF categories were smooth muscle cell differentiation, focal adhesion, and actin binding, respectively (Fig. 6B). Most DE mRNAs were mapped to the BP category; thus, we generated a BP cnetplot that showed the DE mRNAs associated with the top 10 BP terms, in which 4 DE mRNAs (CNN1, FLNC, FOXF1, and MYOCD) were enriched (Fig. 6C). FOXF1 and MYOCD are related to multiple biological processes, suggesting that they may be the critical genes in LUAD. The BP emaplot showed the overlapping relationship between each pair of terms (Fig. 6D) and suggested that smooth muscle cell differentiation was a very important biological process. For screening ceRNA networks related to the prognosis of LUAD, we downloaded the expression and survival data related to the 20 target DE mRNAs in the ENST00000609697 ceRNA network in the UCSC Xena database. The expression of RASL12 was downregulated in LUAD (P < 0.0001) (Fig. 6E), and downregulation of RASL12 was positively correlated with good prognosis (P = 0.034) (Fig. 6F). These results suggested that the ENST00000609697–hsa-miR-6791-5p–RASL12 axis may play a tumor-suppressive role in LUAD.