Malignant tumors from the mouth, nasopharynx, oropharynx, hypopharynx, and larynx are collectively referred to as head and neck cancer (HNC). Among them, head and neck squamous cell carcinoma (HNSCC) is the most common[1]. HNC is the sixth most common type of cancer worldwide, with 550,000 people diagnosed and about 380,000 dying from the disease annually[2].
Tobacco smoking and alcohol consumption are the main risk factors for the development of head and neck malignancies[3]. In addition, the Epstein-Barr virus and human papillomavirus are associated with nasopharyngeal carcinoma and oropharyngeal carcinoma, respectively[4, 5]. The choice of treatment for HNSCC depends on the anatomical location of the tumor, the tumor stage, and the patient's age, health status, and preexisting comorbidities.
Surgery, including open and minimally invasive approaches, is considered the standard treatment for most oropharyngeal cancers and early laryngeal cancers. Locally advanced HNSCC is treated with surgery and definitive radiotherapy, usually accompanied by platinum-based chemotherapy[5].
Overexpression of the epidermal growth factor receptor (EGFR) in tumor tissue has led to the use of cetuximab, an anti-EGFR monoclonal antibody, in the first-line treatment of locally advanced HNSCC and recurrent/metastatic disease[6]. Immunotherapy has also shown efficacy against HNSCC[7]. Both nivolumab and pembrolizumab have been approved as second-line therapy for recurrent/metastatic HNSCC[8], although the response rate to anti-programmed death-1 monotherapy remains only 15%[9, 10]. Accurate early detection remains difficult, which may be the most important reason for the high mortality of patients with HNSCC. Therefore, effective means of early detection, diagnosis, and treatment are urgently needed to improve outcomes in HNSCC.
Microarray and sequencing technologies overcome many limitations of traditional experimentation and provide an excellent platform for cancer research, and the application of big-data bioinformatics is developing rapidly[11–14]. The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases provide a large amount of relevant data for cancer research[15, 16]. Moreover, the R package for weighted gene co-expression network analysis (WGCNA) can be used as a data exploration tool or gene screening (ranking) method to find clusters (modules) of highly correlated genes[17]. This algorithm has been widely used to find transcription-level biomarkers[18–21]. For instance, Shi et al. analyzed the GEO data matrix to understand the pathogenesis of lung squamous cell carcinoma and revealed that CCNB1, CEP55, FOXM1, MKI67, and TYMS were potential biomarkers or therapeutic targets[22]. Zhou et al. used the GSE62452 and TCGA data sets for survival and regression analyses and identified 10 hub genes closely related to the progression of pancreatic cancer[23].
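As a rough illustration of how co-expression analysis groups highly correlated genes, the sketch below computes the unsigned soft-threshold adjacency used in WGCNA, a_ij = |cor(x_i, x_j)|^β, for a toy data set. It is written in Python rather than with the R package itself. The gene names, expression values, helper function names, and the choice β = 6 are illustrative assumptions and are not taken from any study cited here. A full WGCNA run additionally derives a topological overlap measure and clusters genes into modules, which this sketch omits.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for every gene pair.

    `profiles` maps a gene name to its expression values across samples.
    `beta` is the soft-thresholding power (6 is an assumed example value).
    """
    genes = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Toy expression profiles (hypothetical genes, four samples each).
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],  # perfectly correlated with g1
    "g3": [1, 3, 2, 4],  # correlation of 0.8 with g1
}
adjacency = soft_threshold_adjacency(profiles, beta=6)
```

Raising the correlation to a power suppresses weak correlations while preserving strong ones, so the g1/g2 pair keeps an adjacency near 1 whereas the g1/g3 pair drops well below its raw correlation of 0.8.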
In the present study, we downloaded HNSCC clinical and gene expression data from the TCGA database, comprising 31 normal tissue samples and 415 HNSCC tissue samples, together with gene expression profiles from the GEO database (GSE23036). R software was used to screen differentially expressed genes (DEGs) between HNSCC and normal tissue samples. After gene modules were identified with WGCNA, the gene sets were intersected to obtain 15 overlapping genes. Subsequently, enrichment analysis, protein-protein interaction (PPI) network construction, and visualization in Cytoscape version 3.8.0 were used to illustrate significant correlations between DEGs. Finally, the identification, verification, and analysis of hub genes among the DEGs revealed the prognostic and clinical value of the thioredoxin reductase 1 (TXNRD1) regulatory network involved in HNSCC.
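The screening and intersection steps in the workflow above can be sketched as follows. This is a minimal Python illustration on toy data, not the R pipeline used in the study. The helper names, the gene labels, the log2 fold-change cutoff of 1, and the pseudocount are all assumptions made for the example. A real DEG screen would also apply a statistical test with multiple-testing correction.

```python
from math import log2
from statistics import mean


def screen_degs(expr, tumor_ids, normal_ids, lfc_cutoff=1.0):
    """Return genes whose absolute log2 fold change meets the cutoff.

    `expr` maps gene -> {sample_id: expression value}. A pseudocount of 1
    avoids log2(0). Hypothetical helper; no significance test is applied.
    """
    degs = set()
    for gene, values in expr.items():
        tumor_mean = mean(values[s] for s in tumor_ids)
        normal_mean = mean(values[s] for s in normal_ids)
        lfc = log2(tumor_mean + 1) - log2(normal_mean + 1)
        if abs(lfc) >= lfc_cutoff:
            degs.add(gene)
    return degs


def intersect_gene_sets(*gene_sets):
    """Genes present in every supplied set (e.g. DEGs and module genes)."""
    return set.intersection(*(set(s) for s in gene_sets))


# Toy expression table: two tumor samples (T1, T2), two normal (N1, N2).
expr = {
    "GENE_A": {"T1": 15, "T2": 15, "N1": 3, "N2": 3},  # up in tumor
    "GENE_B": {"T1": 7, "T2": 7, "N1": 7, "N2": 7},    # unchanged
    "GENE_C": {"T1": 1, "T2": 1, "N1": 7, "N2": 7},    # down in tumor
}
degs = screen_degs(expr, tumor_ids=["T1", "T2"], normal_ids=["N1", "N2"])

# Hypothetical genes from a co-expression module associated with the trait.
module_genes = {"GENE_A", "GENE_B"}
candidates = intersect_gene_sets(degs, module_genes)
```

Intersecting independently derived gene sets keeps only genes supported by both lines of evidence, here differential expression and module membership, which narrows the list before enrichment and network analysis.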