An abundance of important genes are differentially expressed between SS and tumor-adjacent tissues
To gain insights into the transcriptomic changes of SS patients, we deeply sequenced the tumor and tumor-adjacent tissues of 10 SS patients with total RNA sequencing (including both poly (A+) and poly (A-) RNAs). The detailed information and corresponding pathological diagnosis reports of these 10 SS patients were shown in Table 1, respectively. We first aligned the RNA-seq reads of each sample to the human reference genome GRCh38 using HISAT2[16] and then conducted differential expression calling by employing DEseq2[18]. A total of 4,286 differentially expressed genes (DEGs) were detected using the threshold of |fold change| >2 and adjusted P-value < 0.01, of which 2,309 (including 432 lncRNA genes) and 1,977 (including 333 lncRNA genes) genes were separately upregulated and downregulated in SS compared to tumor-adjacent tissues (Fig. 1A, Supplementary Table S1). Interestingly, we found that 340, 185, 124, and 7 of those DEGs are oncogenes, tumor suppressor genes (TSGs), transcription factors (TFs), and splicing factors (Supplementary Figure S1), respectively (Fig. 1B). Specifically, 52 TFs (such as AES and BCL6) were down-regulated and 72 TFs (e.g. ARID3A and BRCA2) were up-regulated, suggesting that the expression changes of these TFs could influence the expression of their downstream target genes including related oncogenes and TSGs. Since oncogenes and TSGs are closely correlated with cancer, their expression changes may play an important role in the development of SS. Specifically, in consideration of the crucial function of splicing factors in AS regulation[28], we further conducted a qPCR experiment to validate the expression profiles of the seven splicing factors (ELAVL2, HNRNPA1, HNRNPH2, MBNL1, PCBP1, QKI, and TIA1) in DEGs (Supplementary Figure S2). As expected, the experimental results were consistent with the RNA-seq data, indicating the robustness of our analysis. Therefore, the differential expression of these splicing factors could result in the AS deregulation of corresponding genes in SS.
Table 1
Detailed information of 10 synovial sarcoma patients.
Patient
ID
|
Age
|
Sex
|
Tumor Location
|
Tumor Size (cm)
|
Tumor status
|
Outcome
|
1
|
26
|
Female
|
Thigh
|
12*10.5*10
|
Primary
|
Alive
|
2
|
18
|
Male
|
Foot
|
6.0*3.5*2.0
|
Local recurrence
|
Died
|
3
|
37
|
Male
|
Groin
|
9.5*6*6
|
Local recurrence
|
Alive
|
4
|
29
|
Female
|
Lung
|
2.0*2.0*1.3
|
Primary
|
Alive
|
5
|
59
|
Female
|
Iliac Bone
|
5.5*4.5*3.5
|
Local recurrence
|
Alive
|
6
|
28
|
Female
|
Foot
|
2*1.7*0.7
|
Primary
|
Alive
|
7
|
20
|
Female
|
Neck
|
6*5*4
|
Primary
|
Alive
|
8
|
27
|
Male
|
Shank
|
7*6*2
|
Primary
|
Alive
|
9
|
41
|
Male
|
Shank
|
8*6*4
|
Local recurrence
|
Alive
|
10
|
71
|
Female
|
Thigh
|
9*6*3
|
Primary
|
Alive
|
Gene ontology (GO) and KEGG pathway enrichment analyses showed that these upregulated and downregulated DEGs were mainly involved in the fundamental and tumor-related biological processes and pathways (Fig. 1C, D, and E FDR < 0.05). For example, the up-regulated DEGs were primarily enriched in the cell-cycle-related biological processes (e.g. chromosome organization, chromatin organization, and DNA conformation change) and pathways of systemic lupus erythematosus, cell cycle, DNA replication, and P53 signaling (Fig. 1C). Several previous studies also identified the cell-cycle-related genes in sarcoma as a major category of up-regulated genes[29, 30], which was in line with our finding. By contrast, the down-regulated DEGs were mainly involved in the metabolic-related biological processes (such as energy derivation by oxidation of organic compounds, muscle system process, and glucan metabolic process) and the pathways of oxidative phosphorylation, insulin signaling pathway, and vascular smooth muscle contraction (Fig. 1D). Thus, the result suggests that a multitude of genes prominently altered their expression levels in SS, which could be one of the main factors accounting for tumorigenesis through up-regulating and down-regulating corresponding pathways.
Deregulation of alternative splicing could contribute to the tumorigenesis of SS
Considering that the misregulation of AS can lead to the production of aberrant proteins that contribute to tumorigenesis[8], we further compared the AS profile between SS and tumor-adjacent tissues by employing rMATS[19]. Five classical splicing categories of exon skipping (ES), alternative 5’ donor sites (A5DS), alternative 3’ acceptor sites (A3AS), mutually exclusive exons (ME), and intron retention (IR) were analyzed. In total, we identified 2511 (including 41 lncRNA genes) significantly differential AS genes (DASGs), of which 2018, 223, 242, 486, and 159 belong to the splicing mode changes of ES, A5DS, A3AS, ME, and IR, respectively (Fig. 2A, FDR < 0.05, Supplementary Table S2). As expected, ES was the most common differential splicing mode (80.37%, 2018 out of 2511 DASGs), whereas IR was the least (6.33%, 159 out of 2511 DASGs). Notably, the majority of those DASGs among the five classical splicing categories were largely different, only a small portion of them simultaneously exhibited three or more distinct splicing types (Fig. 2A).
Gene functional enrichment analysis indicated that those 2511 DASGs were mainly involved in the RNA splicing and cancer-related biological processes and KEGG pathways (Fig. 2B and C, adjusted P-value < 0.05), which was highly correlated with the AS process. For instance, the top enriched biological processes of those DASGs were mRNA processing, microtubule cytoskeleton organization, and RNA splicing, while the enriched pathways are endocytosis, RNA transport, proteoglycans in cancer, and spliceosome. Moreover, we observed that 21 splicing factor genes showed significantly differential AS between SS and tumor-adjacent tissues, such as HNRNPA1, PTBP2, QKI, RBFOX2, and TRA2A. It is well known that the splicing factors are crucial for AS regulation[28], the deregulation of those splicing factors could drastically disrupt the splicing process of many corresponding genes and contribute to the tumorigenesis of SS[31]. Furthermore, we found that 346, 204, and 122 oncogenes, TSGs, and TFs were also differentially spliced (Fig. 2D). The abnormal splicing of these TFs could influence the expression of their downstream target genes and contribute to the development and progression of SS. Only 368 genes shared between DASGs and DEGs, leaving most of them were distinct (Fig. 2E). These common 368 genes were enriched in the biological process of actomyosin structure organization and pathway of regulation of actin cytoskeleton (Fig. 2E, adjusted P-value < 0.05). Thus, the genes that showed differential expression were quite distinct from those that exhibited differential splicing, suggesting that AS is complementary with expression level in revealing the transcriptomic changes. These results indicate that the abnormal AS changes of genes could be another important factor responsible for the tumorigenesis of SS.
Dissection of the gene fusions in SS
We further explored the gene fusion events in SS patients using TopHat-Fusion[20] A total of 14 and 11 unique gene fusion pairs were separately identified in SS and tumor-adjacent tissues, and no fusion was shared between them. The 14 tumor-specific gene fusion pairs were from seven SS patients, most of which (11 out of 14) were resulted from the rearrangements within the same chromosome, while 3 of them were generated by breaking and rejoining of two disparate chromosomes (Fig. 3A). In total, 27 genes were involved in these tumor-specific gene fusions. SS18 was fused with SSX1 and SSX2, which was in line with previous studies[9]. In constrast, other genes were mainly fused with one partner (Fig. 3B).
As shown in Fig. 3C, the maximum number of gene fusion pairs detected in individual patients was four and the gene fusion events in those SS patients were quite distinct. Intriguingly, these tumor-specific gene fusion events contain one TF of SSX2 and seven oncogenes of SS18, SSX1, SSX2, BCOR, CNOT1, HIST2H2AC, and TOP1 (Fig. 3D). Oncogene SS18 was fused with the TF and oncogene of SSX2 as well as the oncogene SSX1, which is consistent with the known findings[32]. Besides, other oncogenes of BCOR, CNOT1, HIST2H2AC, and TOP1 formed the fusion events of BCOR-CCNB3, CNOT1-SETD6, HIST2H2AC-HIST2H2AB, and TOP1-PLCG1-AS1, respectively. Previous studies have shown that BCOR-CCNB3 fusion tends to occur in the undifferentiated small round-cell sarcomas like Ewing sarcoma and has the potential to drive sarcoma progression[33–35]. Other gene fusions could be novel for SS, and the involved genes could be functionally important. For example, CNOT1 encodes the CCR4-NOT transcription complex subunit 1, which mainly participates in deadenylating mRNAs[36]. HIST2H2AC and HIST2H2AB can generate the replication-dependent histones that are basic nuclear proteins responsible for the nucleosome structure of the chromosomal fiber. TOP1 encodes the enzyme of DNA topoisomerase for controlling and altering the topologic states of DNA during transcription[37]. Since TF could regulate the expression of many downstream target genes and oncogenes are closely associated with cancer, the fusion events of those TFs and oncogenes may contribute to the tumorigenesis/progression of SS. Interestingly, lncRNA genes of LINC00970, LOC105375787, and PLCG1-AS1 were also involved in the gene fusion events, but their functions were still unknown. Gene functional enrichment analysis showed that those fusion genes were significantly enriched in the KEGG pathway of transcriptional misregulation in cancer (Fig. 3E, adjusted P-value < 0.05).
Moreover, we further explored the expression profile of these fusion genes using synovial sarcoma data of The Cancer Genome Atlas (TCGA)[38]. As expected, these fusion genes showed similar expression patterns between the synovial sarcoma samples of us (n = 10) and TCGA (n = 7) (Supplementary Figure S3A). Additionally, we also found that these fusion genes exhibited slightly different expression profiles across distinct types of TCGA sarcomas (Supplementary Figure S3B). Considering that the number of synovial sarcoma samples is limited, we used all the TCGA sarcoma samples to do the survival analysis based on our identified fusion genes. Interestingly, the expression levels of KLRB1 and TOP1 were significantly associated with the survival of sarcoma patients (Fig. 3F, P < 0.05), indicating that they could be potential prognostic markers.
Circular RNAs may play a role in SS formation
Emerging evidence shows that circRNAs can involve in various aspects of tumor biology[15, 39], thus we further investigated the expression profile of circRNAs in SS and tumor-adjacent tissues. We detected 49 differentially expressed circular RNAs by employing CIRI[21] with the threshold of |fold change| >2 and adjusted P-value < 0.01. As shown in Fig. 4A, 21 of them were significantly up-regulated in SS, whereas the other 28 were down-regulated. Furthermore, we found that the great majority (46 out of 49, 93.88%) of those differentially expressed circRNAs were formed by the circulation of exons of their parental genes, only two circRNAs of 10:24380869|24384423 (parental gene: KIAA1217) and 17:35168061|35168685 (parental gene: UNC45B) were produced from the intronic region and another one (5:137757867|137759020) was generated in the intergenic region (Fig. 4B).
Intriguingly, 7, 5, and 3 of the parental genes for those differentially expressed circRNAs were oncogenes, TSGs, and TFs (Fig. 4C). Previous studies have shown that circRNAs can form posttranscriptional regulators to regulate the expression of their parental genes[40, 41]. Thus, these circRNAs have the potential to affect the expression of their parental oncogenes, TSGs, and TFs. Gene functional enrichment analysis showed that the parental genes of those differentially expressed circRNAs were mainly enriched in the muscle system process (such as MAP2K4, HDAC4, TMEM38A, MYH1, MYH2, CAMK2G, TRDN, and SULF2) and muscle contraction (e.g. HDAC4, TMEM38A, MYH1, MYH2, TRDN, and SULF2) (Fig. 4D, adjusted P-value < 0.05).
The genes involved in different types of transcriptomic changes are largely distinct
We further compared the four gene types of DEGs, DASGs, the fusion genes, and the parental genes of differentially expressed circRNAs. As shown in Fig. 5, the genes in one type were largely distinct from that of other types, and no genes were common among the four categories. Only a fraction of them was involved in two or three types of changes (Fig. 5). Intriguingly, the DEGs of BCOR, HIST2H2AB, and MEG8, and the DASGs of AKR1E2 and DCAF8 were overlapped with the fusion genes, suggesting that the fusion events may influence the expression and/or AS profile of these genes. BCOR is an oncogene, while MEG8 is an imprinted gene. Moreover, 18 DEGs (e.g. DNM3OS, ZNF730, DNAH14, and AFF2) shared with the parental genes of differentially expressed circRNAs, implying that expression changes of these genes could affect the expression of circRNAs as well. In addition, 17 DASGs (such as SUCO, VWA8, MTUS1, and USP53) were common to the parental genes of differentially expressed circRNAs. Since circRNAs are mainly formed by AS of pre-mRNAs through backsplicing[42], the AS changes of these DASGs have the potential to influence the expression of corresponding circRNAs. Collectively, our results show that all the four transcriptomic aspects of expression changes, AS, gene fusions, and circRNAs could be closely correlated with the tumorigenesis/progression of SS.
CircRNAs could potentially regulate the expression of a multitude of genes by acting as miRNA sponges
An increasing number of studies suggested that endogenous circRNAs can act as miRNA sponges to regulate corresponding gene expression[43, 44]. We further constructed the interaction network among differentially expressed circRNAs, miRNAs, and the miRNA target genes of DEGs, DASGs, and fusion genes to elucidate the functional roles of those differentially expressed circRNAs. Based on the known miRNA-circRNA regulations, and the miRNA-targets relationships in the starBase database[24] as well as the protein-protein interactions (PPIs) in the String database[23], the resulting interaction network involved in 5 circRNAs (hsa_circ_0001699, hsa_circ_0000247, hsa_circ_0000246, hsa_circ_0000095, and hsa_circ_0000118), 44 miRNAs, 293 protein-coding genes, containing 57 miRNA-circRNA interactions, 789 miRNA-mRNA interactions and 350 PPIs (Fig. 6).
It is well known that circRNAs can regulate gene expression by influencing transcription, mRNA turnover, and translation by sponging RNA-binding proteins (RBPs) and miRNAs[44]. Our resulting network showed that circRNAs hsa_circ_0001699, hsa_circ_0000247, hsa_circ_0000246, hsa_circ_0000095, and hsa_circ_0000118 could act as the sponges of 14, 13, 13, 12, and 5 miRNAs, respectively. Moreover, these miRNAs have the potential to regulate the expression of 119, 202, and 3 genes of DEGs, DASGs, and/or fusion genes. Base on the findings in previous studies[43, 44], the expression of these miRNA target genes could be indirectly influenced by corresponding circRNAs through competing for the interaction with miRNAs. Consequently, our result suggests that circRNAs could potentially function as miRNA sponges to regulate the expression of an abundance of corresponding genes.