3.1. Transcription levels of TOPs in patients with sarcoma
The development of microarray and RNA sequencing technology has made RNA analysis an integral part of biomedical research. In this study, we analyzed the expression of TOP family members across several databases, examined their correlations with clinicopathological features, prognosis, and local tumor recurrence, and explored possible regulatory mechanisms in patients with sarcoma. Six TOP family members have been identified in mammalian cells. Using the Oncomine database, we compared the transcription levels of TOPs in 20 cancer types with those in normal samples (Figure 1). Differences in transcriptional expression were assessed with Student's t-test, with a P-value cutoff of 0.01 and a fold-change cutoff of 1.5.
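The two cutoffs above can be expressed as a simple filter. The following is a minimal sketch, assuming linear-scale expression values and inclusive thresholds; the function names are illustrative and are not part of Oncomine.

```python
from statistics import mean

P_CUTOFF = 0.01   # P-value cutoff stated in the text
FC_CUTOFF = 1.5   # fold-change cutoff stated in the text


def fold_change(tumor, normal):
    """Ratio of mean tumor expression to mean normal expression.

    Assumes linear-scale values; log-scale data would need a difference instead.
    """
    return mean(tumor) / mean(normal)


def is_overexpressed(fc, p_value):
    """Apply both cutoffs (treated as inclusive here, which is an assumption)."""
    return fc >= FC_CUTOFF and p_value <= P_CUTOFF


# TOP1 in the Detwiller Sarcoma dataset, using the values reported in Section 3.2.
print(is_overexpressed(2.164, 4.5e-4))
```

A gene is retained only when both conditions hold, so a large fold change with a weak P-value is excluded.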
3.2. Significant changes in TOP expression at the transcription level between sarcoma and normal tissues (Oncomine)
The Oncomine database was used to compare the mRNA levels of TOP1, TOP1MT, TOP2Α, TOP2Β, TOP3Α, and TOP3Β between sarcoma and normal tissues (Table 1). The mRNA levels of TOP1, TOP2Α, TOP2Β, TOP3Α, and TOP3Β were upregulated in patients with sarcoma. Further analysis showed that TOP1 was overexpressed in sarcoma patients in the Detwiller Sarcoma dataset, with a fold change of 2.164 and a P-value of 4.5E-4. In addition, TOP2Α mRNA was overexpressed in fibrosarcoma (fold change = 27.421), pleomorphic liposarcoma (fold change = 17.927), LMS (fold change = 12.112), malignant fibrous histiocytoma (fold change = 21.095), round cell liposarcoma (fold change = 13.321), and SS (fold change = 26.646). TOP2Β was highly expressed in DDLPS (fold change = 2.002), fibrosarcoma (fold change = 2.938), round cell liposarcoma (fold change = 2.491), and SS (fold change = 3.108). In the Barretina Sarcoma dataset, TOP2Α showed higher expression in myxofibrosarcoma (fold change = 12.264), pleomorphic liposarcoma (fold change = 9.788), DDLPS (fold change = 7.746), and myxoid/round cell liposarcoma (fold change = 4.104). TOP2Β was overexpressed in myxoid/round cell liposarcoma (fold change = 2.318), DDLPS (fold change = 1.65), and pleomorphic liposarcoma (fold change = 1.505). TOP3Α was overexpressed in myxofibrosarcoma (fold change = 1.758), pleomorphic liposarcoma (fold change = 1.676), and LMS (fold change = 1.757) in the Quade Uterus dataset, and high TOP2Α expression was found in uterine corpus leiomyosarcoma (fold change = 3.894) compared to normal samples.
3.3. Relationship between the mRNA levels of the TOPs and the clinicopathological parameters of patients with sarcoma
Using GEPIA (http://gepia.cancer-pku.cn/), we compared the mRNA expression of TOP family members between sarcoma and adjacent normal tissues. The expression levels of TOP1, TOP1MT, TOP2Α, TOP2Β, TOP3Α, and TOP3Β were higher in sarcoma tissues than in normal tissues (Figure 2A, 2B, and 2C).
3.4. Relationship between TOP mRNA expression and sarcoma grade
The expression of TOPs differs between sarcoma grades. We analyzed TOP expression using the sarcoma grading system of the Fédération Nationale des Centres de Lutte Contre le Cancer (FNCLCC) (Figure 3A). The American Joint Committee on Cancer (AJCC) Cancer Staging Manual, 8th Edition (2017) is built on TNM stage and tumor grade. The AJCC system follows the FNCLCC grading system, a three-level system based on tumor cell differentiation, mitotic activity, and the extent of necrosis. The panel recommends determining histological grade using the FNCLCC or AJCC/NCI system or an appropriate diagnosis-specific scoring system. The FNCLCC grading system better reflects the patient's prognosis and predicts distant metastasis and death in cancer patients. Postoperative chemotherapy can improve metastasis-free survival and OS in patients graded with the FNCLCC system(18). TOP mRNA expression correlated significantly with the individual FNCLCC tumor grade (P-value < 0.05). In sarcoma, TOP1, TOP2Α, and TOP3Β differed considerably between FNCLCC grades 2 and 3, while TOP1 and TOP2Α differed between grades 1 and 3. TOP1, TOP2Α, and TOP3Β may therefore help to distinguish soft-tissue sarcoma patients of different grades, so that local treatment options can be discussed for patients with low expression.
Distant recurrence harms long-term survival and quality of life in several malignancies(19). Despite improvements in local tumor control through surgery, radiation therapy, and chemotherapy, distant metastases and high tumor-related mortality remain problems with current treatment strategies(20). After surgical resection, distant recurrence plays a major role in reducing the survival and quality of life of patients with sarcoma. The patterns of and risk factors for distant recurrence have been described(21). Important prognostic factors for predicting distant recurrence are patient age, tumor size, completeness of the resection, grade, tumor rupture, multiplicity, histological subtype, and previous radiation therapy(22). We divided the data of 206 patients from TCGA into two groups based on the presence or absence of TOP mutations; the mutation group comprised patients with at least one TOP gene mutation. We then compared the time to distant recurrence (in months) between the two groups to assess the effect of TOP mutations on tumor recurrence (P-value = 5.238e-3; q-value = 0.0489). The results suggest that the expression level of TOPs can be used as a risk factor for assessing the distant recurrence of sarcoma (Figure 3B).
3.5. The relationship between the expression of TOPs in sarcoma and tumor subtypes
Soft-tissue sarcomas are malignant mesenchymal tumors, with an estimated 12,390 new cases and 4,990 deaths in the United States in 2017 and a 5-year OS rate of 64%(23). The completeness of resection, histologic subtype, and tumor grade are the main determinants of survival(24)(25). We examined the expression of TOPs in different sarcoma subtypes. A 100% stacked bar graph (Figure 4A) was drawn, in which each bar segment represents the percentage of gene expression in a given subtype. The chi-square test was used to assess the relationship between TOP expression and the six major sarcoma subtypes. TOP1 has the highest expression ratio in DDLPS, with a ratio of 31.82%. TOP1MT and TOP3Β have the highest expression ratios in UPS, with ratios of 30% and 36.84%, respectively. TOP2Α and TOP3Α have the highest expression ratios in STLMS, with ratios of 30% and 36.84%, respectively. TOP2Β has the highest expression in ULMS, with a ratio of 36.84%. The expression proportions differed significantly among the six sarcoma subtypes (P-value = 8.540e-3, q-value = 0.0368). These results suggest that TOPs may be useful for distinguishing sarcoma subtypes.
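The chi-square test of independence and the percentages behind a 100% stacked bar graph can be sketched as follows. This is an illustrative example only: the counts are hypothetical and do not come from the study, and the P-value lookup against the chi-square distribution is omitted.

```python
def row_percentages(counts):
    """Convert one row of counts into percentages that sum to 100."""
    total = sum(counts)
    return [100 * c / total for c in counts]


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows = two genes, columns = three subtypes.
example = [[12, 7, 3], [5, 9, 8]]
print(chi_square_statistic(example))
print(row_percentages(example[0]))
```

The statistic is compared with a chi-square distribution with the returned degrees of freedom to obtain the P-value.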
3.6. The relationships between the expression of TOPs and DNA methylation and miRNA clusters
Because the binary definition of hypermethylation is based entirely on the DNA methylation signal and DNA methylation profiles, sarcoma subtype information could not be separated at the DNA methylation level. We therefore used the Partitioning Around Medoids (PAM) clustering method. The dendrogram from divisive hierarchical clustering revealed five main subsets of sarcomas. We integrated the methylation data and the mRNA expression data to identify DNA methylation changes in different sarcoma subtypes and function-related genes (including potential tumor suppressor candidates). Previous genome-level studies of DNA methylation in soft-tissue sarcoma have shown different DNA methylation patterns in pediatric embryonal and alveolar rhabdomyosarcoma. We found that Ewing's sarcoma epigenetically inactivates potential target genes. We further explored the correlation between TOPs and methylation by dividing the TCGA data into two groups based on the presence or absence of TOP mutations and comparing DNA methylation between the mutation group and the non-mutation group. The results suggested that there are group differences between genes in separate clusters (P-value < 0.05).
Sarcomas were divided into distinct miRNA clusters, and different miRNA clusters were associated with DFS. The Wilcoxon signed-rank test showed significant differences in TOP1, TOP2Α, TOP2Β, TOP3Α, and TOP3Β expression between cluster 1 and cluster 2.
3.7. Gene mutations of the TOP family and their relationship with OS and DFS in sarcoma (cBioPortal)
We next proceeded from the genome to disease intervention rather than judging by tumor subtype. To better understand the relationship between TOP mutations and sarcoma, we processed and standardized RNA-Seq V2 data from the TCGA database. Expression values were the RSEM gene-level estimates from the TCGA RNA-Seq V2 data. A z-score threshold of ±2.0 was applied, and missense mutations were considered. The results show that the TOP gene mutation rates were high in sarcoma patients: 11% for TOP1, 10% for TOP1MT, 10% for TOP2Α, 9% for TOP2Β, 28% for TOP3Α, and 16% for TOP3Β (Figure 5A and 5B). High expression of TOPs in sarcoma was associated with shorter OS and DFS. We believe this finding may be related to TOP gene mutations.
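One common reading of a ±2.0 z-score threshold is that a sample counts as altered when its expression lies more than two standard deviations from a reference mean. The sketch below illustrates that reading only. The choice of reference samples and the function names are assumptions, and the exact procedure used by cBioPortal is not reproduced here.

```python
from statistics import mean, stdev


def z_scores(values, reference):
    """Standardize values against the mean and sample SD of a reference set."""
    mu = mean(reference)
    sd = stdev(reference)
    return [(v - mu) / sd for v in values]


def is_altered(values, reference, threshold=2.0):
    """Flag samples whose z-score exceeds the threshold in either direction."""
    return [abs(z) > threshold for z in z_scores(values, reference)]


# Hypothetical expression values, not taken from TCGA.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
print(is_altered([3.0, 7.0, -1.0], reference))
```

The alteration rate for a gene is then the fraction of samples flagged, combined with any samples carrying a mutation.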
3.8. Mutation and copy number variation analysis
We examined somatic copy number variations of the TOP genes in sarcoma. Molecular genetic testing has been suggested as an efficient additional test because various types of sarcoma have specific genetic aberrations, including substitutions, deletions, amplifications, and single base pair translocations. Moreover, the accumulation of somatic mutations is one of the main reasons for tumor development and contributes to the expression of neoantigens(26). We sought to determine whether mutations in TOP genes are associated with sarcoma. We used a multivariate Cox proportional hazards model to adjust for clinical factors, including age, sex, race, and tumor stage, and to show the clinical correlation between sarcoma-related gene expression and immune subgroups. We identified somatic mutations in sarcoma using the International Cancer Genome Consortium (ICGC) dataset and the TCGA dataset. We further investigated the association of gene mutations with prognosis. Transcription signatures associated with selected mutations were identified and analyzed in terms of immune cell infiltration and outcome. Finally, we examined whether gene mutations affect immunity. The results of this study may reveal novel biomarkers and suggest potential immunotherapy for patients with sarcoma.
We used the somatic copy number alteration (SCNA) module to obtain the copy number variations. The SCNA module makes it possible to compare tumor infiltration levels among tumors with different somatic copy number alterations of a given gene. SCNAs are defined by GISTIC 2.0 and include deep deletion (-2), arm-level deletion (-1), diploid/normal (0), arm-level gain (1), and high amplification (2). Box plots were used to display the distribution of each immune subset at each copy number status in the selected cancer types. The two-sided Wilcoxon rank-sum test was used to compare the infiltration level of each SCNA category with the normal level. We found that TOP1 showed high-amplification somatic copy number variations associated with neutrophils and dendritic cells and appears to act as an immune promoter; TOP1MT showed high-amplification somatic copy number variations associated with the CD4+ T cell and neutrophil immune subsets; TOP2Β showed high amplification in the B cell immune subset, arm-level gain in the CD4+ T cell immune subset, and arm-level deletion in dendritic cells; TOP3Α showed arm-level gain and high amplification, arm-level variations in neutrophils, and somatic copy number variations associated with the CD4+ T cell and neutrophil immune subsets; and TOP3Β showed high amplification in the B cell, CD4+ T cell, CD8+ T cell, and dendritic cell immune subsets, as well as arm-level somatic copy number variations in CD4+ T cells (Figure 5C).
3.9. Immune cell infiltration of TOPs in patients with sarcoma
Immunotherapy has become a promising option for the treatment of solid tumors(27), but there is no research on the immune correlation between TOPs and sarcoma development. Our report describes the outcome and recruitment of TOPs and immune cells in sarcoma for the first time. We used the TIMER database to conduct a comprehensive study of the correlation between TOPs and immune cell infiltration. Since GBM/OV microarray data contain more samples than RNA-seq data, we used GBM/OV microarray expression values to determine whether genes were available. By studying their relationship with immune infiltration, we observed that TOP-expressing tumors have low purity in the entire sinusoidal and basic subtypes and upper and lower cancer populations (as described in the Materials and Methods section, tumor tissue immune population). We found that the high expression of TOP1 in sarcoma was negatively correlated with neutrophils (Cor = 0.1996, P = 0.001) and B cells (Cor = 0.1611, P = 0.012). TOP1MT expression was negatively correlated with macrophages (Cor = -0.236, P = 0.0002) and neutrophils (Cor = -0.144, P = 0.024). TOP2Α expression was negatively correlated with CD4+ T cell infiltration (Cor = -0.2322, P = 0.0002) and macrophages (Cor = -0.1713, P = 0.0084). Similarly, TOP2Β expression was negatively correlated with CD4+ T cell infiltration (Cor = -0.2554, P = 6.48E-05), macrophages (Cor = -0.226594498, P = 0.00046395), and dendritic cells (Cor = -0.294179571, P = 3.38E-06). TOP3Α expression was negatively correlated with CD4+ T cell infiltration (Cor = -0.2522, P = 8.07E-05). TOP3Β expression was negatively correlated with macrophage infiltration (Cor = -0.14316, P = 0.028219) (Figure 6). We describe TOP and immune cell recruitment as well as changes in the tumor microenvironment in sarcoma. The results showed that the transcriptional characteristics associated with TOP mutations are related to the infiltration of immune cell populations into tumors.
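Correlations between gene expression and immune infiltration estimates of this kind are typically rank based. The following is a minimal sketch of a plain Spearman correlation with average ranks for ties. It applies no tumor-purity adjustment and is not the TIMER implementation; the input values are hypothetical.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression and infiltration scores for five samples.
print(spearman([2.1, 3.4, 1.0, 5.2, 4.4], [0.30, 0.22, 0.41, 0.10, 0.15]))
```

A negative coefficient indicates that samples with higher expression tend to have lower estimated infiltration.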
The high expression of TOPs in CD4+ T cells and neutrophils is positively correlated with sarcoma prognosis. In contrast, the high expression of TOPs in other immune cells is negatively correlated with sarcoma prognosis. The elevated expression of TOP1, TOP1MT, TOP2Α, TOP2Β, TOP3Α, and TOP3Β in CD4+ T cells and neutrophils increased the OS rate and improved the immune response (Figure 7). TOPs are associated with inflammatory responses and immune cell infiltration, influencing the clinical outcome of SARC patients. TOP mutations are a risk factor for prognosis and have a predictive effect on sarcoma prognosis. After age, sex, and race were considered (Table 2), TOP mutations differed significantly by age.
We explored the clinical relevance of one or more tumor immune subsets, with the flexibility to adjust for multiple covariates in a multivariable Cox proportional hazards model. The covariates included clinical factors (age, sex, ethnicity, and tumor stage) and gene expression. For the Cox model outputs, survival (cancer type) ~ variables is the formula of the user-defined Cox regression model, which was fitted with the coxph function in the R package 'survival'. Coef is the regression coefficient, and HR is the hazard ratio. The lower and upper bounds of its 95% confidence interval are denoted as 95%CI_l and 95%CI_u.
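The relationship between Coef, HR, and the confidence bounds can be made explicit. The sketch below assumes a Wald-type interval on the log scale, which is the usual default for Cox models but is an assumption here; the function names are illustrative.

```python
from math import exp, log

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio(coef, se):
    """Return (HR, 95%CI_l, 95%CI_u) from a Cox coefficient and its standard error."""
    return exp(coef), exp(coef - Z_95 * se), exp(coef + Z_95 * se)


def coef_and_se_from_ci(lower, upper):
    """Recover the coefficient and standard error from a reported 95% CI."""
    coef = (log(lower) + log(upper)) / 2
    se = (log(upper) - log(lower)) / (2 * Z_95)
    return coef, se


# TOP1 and OS: the interval 1.41-3.17 reported in Section 3.10.
coef, se = coef_and_se_from_ci(1.41, 3.17)
hr, ci_l, ci_u = hazard_ratio(coef, se)
print(round(hr, 2), round(ci_l, 2), round(ci_u, 2))
```

Because the interval is symmetric on the log scale, the HR is the geometric mean of the two bounds, which gives a quick consistency check on reported values.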
3.10. Association of the increased mRNA expression of TOPs with the prognosis of patients with sarcoma
The existing evidence suggests that TOPs would improve DFS in selected patients at high risk of recurrence. We used public datasets (2015 version; http://kmplot.com/analysis/index.php?) to analyze the correlation between TOP mRNA levels and the survival time of patients with sarcoma. The Kaplan-Meier curves and log-rank tests showed that elevated TOP mRNA levels were significantly related to the OS and DFS times of all patients with sarcoma (P < 0.05). Compared to normal tissues, high TOP expression in sarcoma tissues was negatively correlated with the OS and DFS of patients with sarcoma. High expression of TOP1 (hazard ratio (HR) 2.11, confidence interval (CI) 1.41-3.17; P = 2.3e-4), TOP1MT (HR 1.83, CI 1.2-2.79; P = 4.1e-3), TOP2Α (HR 2.2, CI 1.39-3.47; P = 5.2e-4), TOP2Β (HR 2.11, CI 1.36-3.27; P = 6.4e-4), TOP3Α (HR 1.62, CI 1.07-2.46; P = 2.2e-2), and TOP3Β (HR 1.57, CI 1.05-2.33; P = 2.5e-2) was associated with lower OS. High expression of TOP2Α (HR 2.41, CI 1.38-4.2; P = 1.5e-3), TOP2Β (HR 2.03, CI 1.2-3.44; P = 6.9e-3), and TOP3Α (HR 1.76, CI 1.06-2.93; P = 2.7e-2) was correlated with lower DFS. We also divided the sarcoma patients into a TOP mutation group and a TOP non-mutation group. The survival curve analysis of OS and DFS using cBioPortal showed that, when the TOP family is regarded as a whole, increased expression of TOPs in sarcoma is related to lower OS, and the result is statistically significant. In contrast, the medium-to-high expression of the TOP family in sarcoma is not completely correlated with lower DFS, and the result is statistically significant. This indicates that each TOP member plays a different role in the prediction of DFS (additional file shows this in more detail [see Additional file 3]).
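The Kaplan-Meier curves referred to above are product-limit estimates of the survival function. The following is a minimal sketch of that estimator with hypothetical follow-up data; it is not the implementation used by the Kaplan-Meier plotter or cBioPortal.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= removed
    return curve


# Hypothetical follow-up in months; the second patient is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients contribute to the risk set up to their last follow-up but do not lower the curve themselves. Comparing the curves of high- and low-expression groups is what the log-rank test formalizes.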
3.11. Functional enrichment analysis of TOP co-expression modules
GSEA is a tool for analyzing genome-wide expression profile data from microarrays; it tests whether predefined sets of genes show coordinated changes across the entire transcriptome(28). The functions of TOPs and of the genes significantly related to TOP expression changes were predicted by performing GO and KEGG analyses in the database. We analyzed the enriched biological processes (BPs), cellular components (CCs), and molecular functions (MFs) of TOPs as well as the main BPs regulated by the TOP signature, and combined the results. We found that GO:0006265 (DNA topological change), GO:0007059 (chromosome segregation), and GO:0006260 (DNA replication) (Figure 9A, 9B) were significantly enriched, and TOP alterations in sarcoma significantly regulated these processes (additional file shows this in more detail [see Additional file 1]).
3.12. Co-expressed genes of TOPs (Metascape)
The co-expressed genes of TOP1, TOP1MT, TOP2Α, TOP2Β, TOP3Α, and TOP3Β were analyzed with Metascape (Figure 10A and Table 3) and by network topology-based analysis (Figure 10B). We set the number of top-ranking neighbors to 10 to identify the adjacent, differentially expressed genes associated with TOPs (additional file shows this in more detail [see Additional file 2]).
3.13. Predicted cellular functions and pathways of TOP genes and TOP-related neighboring genes in sarcoma
A network of TOP mutations and their 120 most frequently altered neighboring genes was created. The results showed that BUB1B, CENPF, SPC25, NUF2, KNL1, SGO1, RPS20, SNRPD3, and MED9 were significantly associated with TOP mutations (Figure 11A and 11B). GO functional enrichment analysis predicted the three main categories of functions, BPs, CCs, and MFs, for the TOPs and their 120 most frequently altered neighboring genes. KEGG pathway analysis revealed the pathways of the TOPs and their 120 most frequently altered neighboring genes (Figure 11C).
3.14. Protein interaction network of TOPs (Metascape)
For a given list of TOP genes, an enrichment analysis of protein-protein interactions was performed using the following databases: BioGrid, InWeb_IM, and OmniPath. The resulting network contains the subset of proteins that physically interact with at least one other member of the list(29)(30)(31)(32)(33)(34). The GO and KEGG results showed that, in sarcoma, the neighboring proteins of TOPs are mainly enriched in GO:0000775 chromosome, centromeric region (log10P: -12.7); GO:0005819 spindle (log10P: -12.0); and GO:0000819 sister chromatid segregation (log10P: -11.2). There were four protein clusters with apparent interaction relationships (Table 4)(35)(36) (additional file shows this in more detail [see Additional file 4]).
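Enrichment P-values for GO terms, such as the log10P values above, are commonly obtained from a hypergeometric over-representation test. The sketch below illustrates that test under this assumption. It is not the rank-based GSEA statistic, it may differ from the exact procedure used by Metascape, and all counts are hypothetical.

```python
from math import comb, log10


def enrichment_p_value(hits, list_size, term_size, background_size):
    """P(X >= hits) when list_size genes are drawn from the background
    and term_size background genes are annotated with the GO term."""
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 8 of 120 input genes carry a term that annotates
# 200 of 20,000 background genes.
p = enrichment_p_value(hits=8, list_size=120, term_size=200, background_size=20000)
print(log10(p))
```

A more negative log10P means that the overlap between the gene list and the term is less likely to arise by chance. In practice, the P-values are also corrected for multiple testing across terms.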