1. Differential expression of NUF2 in NSCLC
The overall workflow is summarized in a flow diagram (Fig. 1). We first compared NUF2 transcription levels between tumor and normal tissues across cancer types using Oncomine and TIMER (Fig. 2A), and then analyzed NUF2 expression in NSCLC using UALCAN, TCGA and GEO. As shown in Fig. 2A, data from these databases showed that NUF2 mRNA expression was significantly higher in NSCLC. We then examined the transcription levels of NUF2 in LUSC and LUAD separately: NUF2 mRNA was significantly over-expressed in both TCGA-FPKM (LUSC) and GSE32863 (LUAD) (P<0.01) (Fig. 2B).
2. The association of NUF2 expression with prognosis and clinicopathological factors in NSCLC
We next explored the correlation of the NUF2 transcription level with prognosis and clinicopathological factors. Using the Kaplan-Meier plotter, we evaluated whether NUF2 expression was related to the prognosis of NSCLC. High NUF2 expression was significantly associated with poorer overall survival (OS) in NSCLC (OS HR=1.91, 95% CI=1.38-2.65, log-rank P=7e-05) and in the LUAD subgroup (OS HR=1.54, 95% CI=1.21-1.97, log-rank P=0.00047) (Fig. 3Ba-b). However, there was no significant association between NUF2 expression and OS in LUSC (Fig. 3Bc). We further explored the association between NUF2 expression and clinicopathological features. Subgroup analysis of multiple clinicopathological features of NSCLC in UALCAN showed that the transcription level of NUF2 was significantly higher in NSCLC (both LUSC and LUAD) than in the normal group across stage and smoking-habit subgroups. In the TCGA database, the transcription level of NUF2 also differed significantly by T and N stage of the AJCC TNM staging system in NSCLC (Fig. 3A).
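The log-rank P-values reported above compare survival between high- and low-expression groups. The following is a minimal sketch of a two-group log-rank test, written only to illustrate the idea; it is not the Kaplan-Meier plotter's implementation, and the function name and the toy follow-up times are hypothetical.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value).

    times_*  : follow-up times for each patient in the group
    events_* : 1 if the event (death) was observed, 0 if censored
    """
    # Pool observations as (time, event, group); group 0 = A, group 1 = B.
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_a = 0.0   # observed events in group A
    expected_a = 0.0   # events expected in group A under the null
    variance = 0.0

    # Walk through each distinct time at which at least one event occurred.
    for t in sorted({u for u, e, _ in data if e}):
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk in A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk in B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical toy data (months), not taken from the study.
high_expr = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
low_expr = ([10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
chi2, p = logrank_test(*high_expr, *low_expr)
```

A small P-value indicates that the two survival curves differ more than would be expected by chance. The hazard ratios quoted in the text come from a separate estimate and are not produced by this test.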
3. Functional annotation and protein-protein interaction
DEG analysis was performed according to the expression of NUF2 in GSE77803. The volcano plot showed 262 genes (red dots) with significant positive correlations and 164 genes (green dots) with significant negative correlations with NUF2, and the top 20 genes most significantly positively and negatively correlated with NUF2 are shown in the heat map (Fig. 2Ca). We then assessed the associations among the top 20 positively and negatively correlated genes (Fig. 2Cc). We next determined the functional annotation and protein-protein interaction (PPI) network of these differentially expressed genes (including NUF2) in NSCLC. As shown in Fig. 4Aa, the DEGs were mainly enriched in organelle fission and nuclear division among Biological Process (BP) GO terms, in spindle among Cellular Component (CC) GO terms, and in ATPase activity among Molecular Function (MF) GO terms.
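Enrichment results of this kind are usually obtained by over-representation analysis. The text does not state which test or tool was used, so the sketch below is only an illustration: it computes a one-sided hypergeometric P-value, a test commonly used for this purpose. The function name and all numbers are hypothetical and are not taken from the study.

```python
from math import comb

def hypergeom_enrichment_p(universe, term_size, list_size, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    universe  : number of annotated background genes
    term_size : background genes annotated to the GO term or pathway
    list_size : number of genes in the DEG list
    overlap   : DEGs that are annotated to the term
    """
    upper = min(term_size, list_size)
    total = comb(universe, list_size)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(term_size, k) * comb(universe - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 1000 background genes, 50 in the term,
# 100 DEGs, 15 of which fall in the term (5 would be expected by chance).
p_enrich = hypergeom_enrichment_p(1000, 50, 100, 15)
```

In practice the P-values for all tested terms would also be corrected for multiple testing before the most enriched terms are ranked.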
KEGG pathway enrichment of NUF2-interacting genes showed that cell cycle was the most enriched pathway (Fig. 4Ab): it had the smallest P-value (P=2.45e-10) and the largest number of consensus genes involved (count=18). As shown in Fig. 4Ba, the PPI network constructed with STRING contained 22 nodes and 120 edges. The vast majority of the nodes in the network were upregulated DEGs. TTK, RRM2, RAD51AP1, PBK, NDC80, MELK, KIF4A, DLGAP5, CEP55, CCNB1 and CDC20 had the most edges in the network (Fig. 4Bc). In addition, we identified a significant module using MCODE in Cytoscape, and the most significant pathway in this module was cell cycle (Fig. 4Bd).
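Ranking nodes by their number of edges, as done for the PPI network here, amounts to computing node degree. The sketch below shows that calculation on a made-up edge list; the helper names and the placeholder gene labels are hypothetical, and the edges are not STRING output. Only the node and edge counts (22 and 120) come from the text.

```python
from collections import Counter

def node_degrees(edges):
    """Count how many edges touch each node of an undirected network."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def mean_degree(n_nodes, n_edges):
    """Average degree of an undirected network: each edge adds 2 to the total."""
    return 2 * n_edges / n_nodes

# Hypothetical toy network with placeholder labels.
toy_edges = [
    ("gene_A", "gene_B"),
    ("gene_A", "gene_C"),
    ("gene_B", "gene_D"),
    ("gene_C", "gene_D"),
    ("gene_A", "gene_D"),
]
degrees = node_degrees(toy_edges)
hubs = [gene for gene, _ in degrees.most_common(2)]  # two best-connected nodes

# Using the counts reported in the text: 22 nodes and 120 edges.
avg = mean_degree(22, 120)
```

With 22 nodes and 120 edges, each node has roughly 11 partners on average, which is consistent with a densely connected network.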
We further used GSEA with the MSigDB database to confirm these results. In the curated gene sets (C2.CP.KEGG), NUF2-related genes were enriched in cell cycle, pyrimidine metabolism, purine metabolism and DNA replication, while in the hallmark gene sets, NUF2-related genes were enriched in G2M checkpoint, E2F targets, mitotic spindle and PI3K-AKT-mTOR signaling (Fig. 4C).
4. Basic characteristics of NUF2 and the correlation of NUF2 with selected genes
To characterize NUF2, we explored its basic information and mechanism in NSCLC using GeneCards and COMPARTMENTS. As shown in Fig. 4Aa, NUF2 is a protein-coding gene located at q23.3 on chromosome 1. In COMPARTMENTS, Fig. 4Ab shows the subcellular locations of NUF2; the locations with the highest confidence were the nucleus and cytosol. These results indicate where NUF2 functions within the cell. To examine the association of NUF2 with selected genes such as EGFR, KRAS and ROS1 and with cell cycle-related genes, we used GEPIA. NUF2 was associated with KRAS, EGFR, ROS1 and PIK3CA, and was also related to CDK1/2/4/6 and E2F1, which are enriched in the cell cycle. NUF2 was positively correlated with KRAS, EGFR, PIK3CA, CDK1/2/4/6 and E2F1, and negatively correlated with ROS1 (Fig. 5B). These relationships with cell cycle and tumor-related genes suggest that NUF2 is involved in tumorigenesis and tumor development.
5. NUF2 DNA methylation status in NSCLC
Using the UALCAN website for differential methylation analysis, we found that the promoter methylation level of NUF2 in LUSC was higher than in normal tissues. In contrast, the promoter methylation level of NUF2 in LUAD was lower than in normal tissues (Fig. 6A). We then examined whether the promoter methylation level of NUF2 was correlated with clinical characteristics. The subgroup analysis showed that promoter methylation of NUF2 may be affected by stage, smoking status and N stage of the AJCC TNM staging system (Fig. 6B).
6. NUF2 alteration in NSCLC
We then used cBioPortal to determine the types and frequency of NUF2 alterations based on whole-exome sequencing data from NSCLC in TCGA (42.3% LUSC and 57.7% LUAD). To highlight the role of NUF2 in NSCLC, we also compared NUF2 with NDC80, SPC24 and SPC25, the other components of the NDC80 complex. As shown in Fig. 7A, the NDC80 complex was altered in 148 of 1144 (12.9%) NSCLC patients, while NUF2 was altered in 92 of 1144 (8%), and most of these cases were amplifications. NUF2 alterations thus accounted for 62% of the total alterations. We then examined the specific alterations of each gene and found that NUF2 had more alteration sites than the other genes (Fig. 7Ac-f). We next explored the correlation between alteration and the prognosis of NSCLC in cBioPortal. As shown in Fig. 7B, with somatic mutations selected as the genomic profile, the NUF2-altered group was significantly associated with a poorer prognosis (P=0.0407), whereas no obvious difference was found for the other genes.
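The alteration percentages in this paragraph follow directly from the reported case counts. As a quick arithmetic check, the sketch below recomputes them; the counts are those given in the text, and the helper name is hypothetical.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

total_patients = 1144    # NSCLC patients with sequencing data
complex_altered = 148    # patients with any NDC80 complex alteration
nuf2_altered = 92        # patients with a NUF2 alteration

complex_rate = percent(complex_altered, total_patients)  # rounds to 12.9
nuf2_rate = percent(nuf2_altered, total_patients)        # rounds to 8.0
nuf2_share = percent(nuf2_altered, complex_altered)      # rounds to 62.2
```

The last figure assumes that every NUF2-altered case is counted among the 148 complex-altered cases, which holds because NUF2 is itself a component of the complex.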
7. Association of NUF2 transcription level with immune cell infiltration and immune cell type markers in NSCLC
We next assessed whether the transcription level of NUF2 in NSCLC was related to immune cell infiltration, using the TIMER database for correlation analysis. In LUSC, NUF2 was positively correlated with tumor purity (cor=0.388, P=1.25e-18), negatively correlated with CD4+ T cells (partial.cor=-0.213, P=2.96e-06) and macrophages (partial.cor=-0.254, P=1.88e-08), and weakly negatively correlated with dendritic cells (partial.cor=-0.132, P=4.01e-03) and neutrophils (partial.cor=-0.137, P=2.83e-03). In LUAD, NUF2 was negatively correlated with B cells (partial.cor=-0.19, P=2.67e-05), CD4+ T cells (partial.cor=-0.178, P=8.61e-05), macrophages (partial.cor=-0.107, P=1.89e-02) and dendritic cells (partial.cor=-0.133, P=3.23e-03) (Fig. 8A).
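The partial.cor values above are correlations adjusted for a covariate, which in this kind of analysis is typically tumor purity. The text does not describe the computation, so the following is a sketch under stated assumptions: a Spearman-type partial correlation with a single covariate, implemented with the standard first-order partial correlation formula. All function names and the toy vectors are hypothetical.

```python
import math

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def partial_spearman(x, y, z):
    """Spearman correlation of x and y after adjusting for covariate z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy values: gene expression, immune infiltration estimate, purity.
expression = [1, 2, 3, 4, 5, 6]
infiltration = [6, 5, 4, 3, 2, 1]
purity = [2, 1, 4, 3, 6, 5]
adjusted = partial_spearman(expression, infiltration, purity)
```

Adjusting for purity matters because most immune cell estimates fall as tumor purity rises, so an unadjusted correlation can reflect purity rather than a direct relationship between the gene and the immune cell type.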
We next examined the relationships between NUF2 expression and the type markers of various immune cells in NSCLC. Markers of B cells, CD8+ T cells, neutrophils, macrophages, dendritic cells (DCs), NK cells, Th1 cells, Tregs and monocytes were tested in the TIMER database. Regardless of whether the correlation was adjusted, NUF2 in LUSC was negatively correlated with markers of several immune cell types, such as FCRL2, CD19 and MS4A1 in B cells; FCGR3B, CEACAM3, SIGLEC5, FPR1, CSF3R and S100A12 in neutrophils; CD68, CD84, CD163 and MS4A4A in macrophages; CD209 in dendritic cells; FOXP3 and CCR8 in Tregs; and C3AR1, CD86 and CSF1R in monocytes (Table 1). Meanwhile, NUF2 in LUAD was negatively correlated with MS4A1 in B cells; CD8A and CD8B in CD8+ T cells; CSF3R and S100A12 in neutrophils; KIR3DL3 and NCR1 in NK cells; and CSF1R in monocytes. We then used GEPIA to verify these results. In LUSC, the correlations between NUF2 and markers of macrophages, neutrophils and dendritic cells in GEPIA were similar to those in TIMER. In LUAD, the correlations between NUF2 and markers of B cells, CD8+ T cells and NK cells in GEPIA were similar to those in TIMER (Table 2).
8. Prognosis analysis of NUF2 expression in NSCLC based on immune cell infiltration
We have shown that NUF2 expression is related to immune cell infiltration in NSCLC and that NUF2 expression is also related to the prognosis of NSCLC. We therefore asked whether the prognostic effect of NUF2 expression in NSCLC is partly affected by immune cell infiltration. We analyzed the correlation between NUF2 expression and prognosis according to the enrichment of the related immune cells in NSCLC. In LUAD, higher expression of NUF2 was associated with a poor prognosis in the B cell-enriched, CD4+ T cell-enriched and macrophage-enriched (HR=2.01) groups (Fig. 8B). In LUSC, high expression of NUF2 in the macrophage-enriched group was associated with a better prognosis (Fig. 8B).