CIN70 is significantly elevated in CRPC metastases. The CIN70 signature was previously derived by identifying differentially expressed genes in tumors displaying high versus low levels of chromosomal imbalance. CIN70 was applied to five CRPC transcriptome datasets [22, 35–38]. The signature activation scores derived from primary tumors were compared to CRPC metastases (Fig. 1). CIN70 scores were significantly higher in CRPC metastases compared to primary tumors across all datasets.
CIN70 score strongly correlates with various genomic alterations in the PC cohort in TCGA. Since the high CIN70 signature activation score was strongly associated with CRPC metastases, we sought to determine if it correlated with genomic evidence of CIN (i.e., broad CNAs) as well as unfavorable tumor features and outcome in untreated PC tissue samples. To assess the correlations between specific types of genomic aberrations and CIN70 activation scores measured in primary PC, we employed the PC cohort (n = 473) found in TCGA (Fig. 2A). We considered five different types of genomic alterations for each case: focal CNAs, broad CNAs, tumor mutational burden (TMB), gene fusions, and microsatellite instability (MSI). All genomic alterations were quantified using the individual scoring method (see Methods). Briefly, CNA events were identified in each sample and assigned broad or focal status based exclusively upon length using GISTIC 2.0 [26]. The TMB and fusion scores were computed using the number of events identified in each case. The MSI score was based upon the MANTIS scoring method, and pre-computed scores for the PC cohort in TCGA were obtained from a previously published report [27]. Cases were displayed in a heat map ordered by CIN70 score (high to low) in order to visualize recurrent genomic alterations associated with CIN (Fig. 2A). Pathology Gleason score and biochemical recurrence (BCR) status were also included on the map for each case. Comparing CIN scores to specific classes of genomic alterations allowed determination of correlation coefficients (Fig. 2B). The CIN70 score most strongly correlated with broad CNAs (r = 0.52, Fig. 2C). There was a positive correlation with focal CNAs (r = 0.46), which was weaker with mutational burden (r = 0.34) and fusions (r = 0.31). No significant correlation between CIN70 score and MSI score was observed (r = 0.09, Fig. 2C), and anti-correlation with CIN23 score (r = -0.27) was noted.
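As a minimal sketch of how per-case scores can be compared, the snippet below computes a correlation coefficient between a signature score and each alteration score. It assumes Pearson's r, which the text above does not specify, and the function name, variable names, and values are hypothetical toy data rather than TCGA results.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data (NOT study data): one value per case.
cin70_score = [0.1, 0.4, 0.5, 0.9, 1.3]
alteration_scores = {
    "broad_cna": [2, 5, 4, 9, 12],
    "msi": [0.3, 0.1, 0.4, 0.2, 0.3],
}

# One coefficient per alteration class, analogous in spirit to Fig. 2B.
correlations = {
    name: pearson_r(cin70_score, values)
    for name, values in alteration_scores.items()
}
```

A rank-based coefficient such as Spearman's could be substituted by ranking both inputs first; the choice here is an assumption made for illustration.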
High CIN70 scores were associated with PC displaying Gleason score of 8 or higher (Supplemental Fig. 1). PC cases displaying BCR were also more likely to display high CIN70 scores (Supplemental Fig. 1). These findings confirm that high CIN70 score is reflective of broad CNA frequency in PC, which, in turn, is associated with aggressive disease and poor outcome in TCGA cases. To assess whether specific, recurrent CNAs differ between cases with high versus low CIN scores, TCGA cases classified as CIN70-high versus CIN70-low were compared (Fig. 2D). An increase in recurrent CNAs was identified throughout the genome, but no specific chromosomal locations were preferentially affected by increases in CNAs, confirming the global genomic impact of CIN.
PNBX Cohort. Having established that CIN70 scores are highest in mCRPC and high-risk primary CSPCs contained in TCGA, we sought to evaluate transcriptomic profiles derived from untreated primary tumors of men diagnosed with de novo metastases (clinical stage M1). High-grade PC areas in PNBX were procured for RNA sequencing (RNAseq) from formalin-fixed and paraffin-embedded diagnostic PNBX of 99 patients (Fig. 3A). We accessed selected archival diagnostic PNBX from a racially and ethnically diverse cohort of 1927 men who were diagnosed and treated exclusively within a single Veterans Affairs (VA) healthcare system (Supplemental Fig. 2). Sub-stratification of this cohort was performed based upon metastatic tumor burden (oligo versus poly) at diagnosis and follow-up (Supplemental Fig. 2A). Kaplan-Meier curves demonstrated overall survival (Supplemental Fig. 2B), with M1-poly and M0-poly cases displaying significantly shorter survival than M1-oligo and M0-oligo cases. For comparison, we selected high-grade PC cases without evidence of metastatic progression (M0-NM) over a median follow-up of 56 months. Clinical characteristics of the sequenced PNBX cases are provided in Supplemental Table 1.
mCRPC biology embedded in PNBX M1 cases. We aimed to identify the transcriptomic footprint of metastatic disease in primary tumors by comparing mCRPC and primary tumors collected from PNBX of men diagnosed with de novo metastatic disease (M1-oligo and M1-poly). We also questioned the amount of CIN in these cancers. Towards this goal, we identified 1,234 DEGs by comparing RNA sequencing datasets of PCs from men in the VA cohort diagnosed with de novo metastatic disease (M1-poly or M1-oligo) versus non-metastatic (M0-NM) cases. We also derived DEGs through comparison of gene expression between primary tumors from RPs and metastases collected at rapid autopsy in two published mCRPC cohorts (Taylor and Grasso data sets) [22, 35]. Strong correlations were revealed between DEGs from the VA cohort and the DEGs derived from the Taylor (r = 0.64) or Grasso (r = 0.45) cohorts (Fig. 3B). Slightly weaker correlations were evident for M1-oligo cases (r = 0.34 and 0.27, respectively).
Next, the overlap amongst DEGs across datasets (PNBX, Taylor, and Grasso) was used to identify 157 shared DEGs (Fig. 3C). Functional enrichment analysis of the 157 DEGs demonstrated the greatest activity of pathways associated with mitotic nuclear division, cell proliferation and cell-cell signaling (Fig. 3D). A large portion of DEGs (89/157; 57%) had the same directionality in gene expression between primary tumors associated with de novo metastases and CRPC sampled at metastatic sites. In multidimensional scaling diagrams supervised by the 157 DEG set, cases without metastases (M0-NM) and M1 cases (both M1-oligo and M1-poly) form distinct clusters (Fig. 3E). Similarly, when applied to the Taylor and Grasso cohorts, primary tumors were separated from metastases (Fig. 3E). Collectively, these results demonstrate that untreated primary tumors from men with de novo metastases possess gene expression profiles that correlate with those of heavily treated mCRPC. These results suggest mCRPC biology is embedded in the primary tumors of M1 patients and has the potential to reveal biological mediators of metastases, castration-resistance and lethal PC at the time of diagnosis via standard-of-care PNBX.
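The overlap and directionality steps described above can be sketched as simple set operations. This is an illustrative sketch only: the gene names, fold-change values, and helper names are hypothetical, and it assumes each dataset is summarized as a mapping from gene to log2 fold change (metastasis-associated versus comparison group).

```python
def shared_degs(*deg_tables):
    """Genes present in every DEG table (each maps gene -> log2 fold change)."""
    gene_sets = [set(table) for table in deg_tables]
    return set.intersection(*gene_sets)

def same_direction(gene, *deg_tables):
    """True if the gene is up in all tables or down in all tables."""
    signs = {table[gene] > 0 for table in deg_tables}
    return len(signs) == 1

# Hypothetical log2 fold changes (NOT study data).
pnbx = {"GENE_A": 1.8, "GENE_B": -1.2, "GENE_C": 0.9, "GENE_D": 2.1}
taylor = {"GENE_A": 2.3, "GENE_B": -0.8, "GENE_C": -1.1, "GENE_E": 1.5}
grasso = {"GENE_A": 1.1, "GENE_B": -2.0, "GENE_C": -0.7, "GENE_F": -1.4}

shared = shared_degs(pnbx, taylor, grasso)
concordant = sorted(g for g in shared if same_direction(g, pnbx, taylor, grasso))

# Arithmetic check of the proportion reported in the text: 89 of 157.
reported_percent = round(100 * 89 / 157)
```

In this toy example three genes are shared and two of them change in the same direction in all three tables; the reported 89/157 rounds to 57%.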
PNBX M1 cases represent high CIN without deregulated DDR. We hypothesized that CIN70, which is highly elevated in mCRPC, may demonstrate a similar expression profile in M1 PNBX. Consequently, we evaluated enrichment of CIN70 gene expression via gene set enrichment analysis (GSEA) of M1 versus M0-NM cases. This analysis revealed significant up-regulation of CIN70 genes in M1 tumors (Supplemental Fig. 3) [16]. In order for tumor cells to tolerate CIN, inactivation of the TP53 gene or its associated pathway is often required [39]. Interrogation of genes linked to the hallmark p53 pathway activation signature demonstrated down-regulation in M1 relative to M0-NM tumors (p = 0.031). Interestingly, there was no enrichment in the DDR gene signature, indicating that this potential mechanism of genomic instability may not be prevalent in de novo metastatic CSPC (Supplemental Fig. 3).
Derivation of PC-CIN. Since M1-oligo and M1-poly cases displayed significant differences in PC-specific survival (Supplemental Fig. 1B), we hypothesized that distinct biological triggers may influence tumor burden. Consistent with this idea, out of a total of 1,234 DEGs in M1 versus M0-NM, a relatively small portion of the DEGs (105 genes; 9%) were common to both M1-oligo and M1-poly, while most of the genes (696/801 DEGs in M1-oligo and 433/538 DEGs in M1-poly) were exclusively regulated in M1-oligo or M1-poly cases (Supplemental Fig. 4A). These DEGs were grouped into 3 clusters (shared, oligo-dominant, and poly-dominant) based on their differential expression patterns (Supplemental Fig. 4B). To identify cellular processes within each group, functional enrichment analysis was performed using DAVID software (Supplemental Fig. 4C) [40]. While the oligo-dominant cluster was enriched in inflammatory response, steroid metabolic processing, cell-cell signaling, and cell differentiation, the poly-dominant cluster displayed the strongest enrichment in cell proliferation and mitotic cell division. Consistent with this, GSEA of M1-poly versus M0-NM revealed a leading-edge subset of genes that significantly contributed to the enrichment of CIN70 and demonstrated significant up-regulation in M1 tumors (Fig. 4A and Supplemental Fig. 3A). Notably, seven out of the top 19 leading-edge genes (PBK, CEP55, UBE2C, MELK, TPX2, PTTG1, and CDCA3) regulate mechanisms during mitosis [41]. We will refer to these seven genes as PC-CIN (prostate cancer-CIN). In order to determine whether the CIN70 signature genes or the simplified PC-CIN were associated with metastasis (M)-stage (M1 versus M0-NM) at diagnosis, we developed a prediction model using a support vector machine (SVM) algorithm (see Methods and Supplemental Fig. 5) and tested its accuracy using PC-CIN or CIN70 genes.
The model displayed a high level of accuracy in predicting metastasis stage with area under the curve (AUC) value of 0.90 for PC-CIN and 0.96 for CIN70 (Fig. 4B). Both CIN70 and PC-CIN appeared significantly enriched in mCRPC relative to primary tumors across five datasets (Fig. 4C and Supplemental Fig. 5A, all p < 0.00001). PC-CIN activation score was also highest in Gleason 8 and higher PC samples in TCGA, as well as in cases of BCR (Fig. 4D-F).
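For readers less familiar with the AUC metric used here, the following sketch computes it directly from classifier decision scores as the probability that a randomly chosen M1 case scores higher than a randomly chosen M0-NM case. It does not reproduce the SVM model, which is described in Methods; the function name, scores, and labels are hypothetical toy values.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison: P(positive score > negative score).

    Ties between a positive and a negative case count as half a win.
    """
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical decision scores (NOT study data); label 1 = M1, 0 = M0-NM.
toy_scores = [0.9, 0.8, 0.35, 0.7, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.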
PNBX analysis revealed heterogeneity of CIN in M1 cases. To better understand the distribution of CIN70 and PC-CIN scores in the context of the PNBX cohort (n = 99), we created an integrative heat map of CIN70 genes split into functional groups, as well as the 7 PC-CIN genes (Fig. 5A). Embedded in this heat map is the CIN70 score, with cases arranged from CIN70-low to CIN70-high, the disease stage (M0-NM, M0-oligo, M0-poly, M1-oligo, M1-poly), and the Gleason Sum (6–10). The heat map allows observation of the pattern of distribution of de novo metastatic (M1) cases along the spectrum of CIN scores. Interestingly, a bimodal distribution of M1 cases is observed, with 23/63 displaying CIN70 scores in the lowest third and 25/63 displaying CIN70 scores in the highest third. PC-CIN gene expression variability appeared to mirror the expression pattern of CIN70 genes. A volcano plot of PC-CIN genes in M0-NM versus M1-poly cases in the PNBX cohort demonstrates differential expression (Fig. 5B). PC-CIN scores are significantly higher in M1-poly cases compared to M0-NM cases (p = 0.0426); however, the wide range of PC-CIN scores is evident in the box plot in Fig. 5C, which reflects the bimodal distribution of M1 CIN scores observed in the heat map (Fig. 5A).
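The tertile tabulation behind the bimodal observation can be sketched as follows. This is an illustrative sketch under the assumption that thirds are defined by rank within the whole cohort; the function name and the nine toy cases are hypothetical.

```python
from collections import Counter

def tertile_labels(scores):
    """Assign each case to the 'low', 'mid', or 'high' third by score rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, idx in enumerate(order):
        # rank * 3 // n is 0, 1, or 2 because rank < n
        labels[idx] = ("low", "mid", "high")[rank * 3 // n]
    return labels

# Hypothetical toy cohort (NOT study data): score and M1 status per case.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
toy_is_m1 = [True, True, False, False, True, False, True, True, True]

toy_tertiles = tertile_labels(toy_scores)
m1_per_tertile = Counter(t for t, m1 in zip(toy_tertiles, toy_is_m1) if m1)

# By subtraction from the counts in the text: 63 - 23 - 25 M1 cases remain
# in the middle third.
middle_third_m1 = 63 - 23 - 25
```

With the reported counts, 23 and 25 of the 63 M1 cases fall in the outer thirds, leaving 15 in the middle third, which is the pattern the text describes as bimodal.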
To identify genomic evidence of CIN in M1 cases, we sampled the same tumor regions for DNA extraction that were previously selected to generate transcriptome data in M0-poly and M0-NM cases. Since there is limited tissue in PNBX, only 24 cases yielded sufficient quality/quantity of DNA for CNA evaluation. However, a significant increase (p = 0.038) in copy number alterations in M1-poly versus M0-NM cases was observed in this small sample (Fig. 5D), consistent with heightened frequencies of amplifications and deletions associated with CIN in TCGA (Fig. 2). Gain of MYC and loss of RB1 and SIAH3 were also identified, consistent with previous studies of genetic alterations associated with poor PC prognosis (Supplemental Fig. 6A and B) [42–44].
Differentially expressed genes in CIN-High versus CIN-Low Cases. The bimodal distribution of M1 cases when organized by CIN70 score suggests both CIN-dependent and CIN-independent gene associations and processes linked to PC lethality. Consequently, we evaluated DEGs and biological processes associated with CIN70-Low versus CIN70-High, as well as PC-CIN-Low versus PC-CIN-High cases from the PNBX cohort. Heat maps of the DEGs based upon these different gene expression signatures are displayed in Supplemental Fig. 7A and B. Enriched biological processes associated with CIN70 and PC-CIN scores (low versus high) are also displayed (Supplemental Fig. 7C and D). Distinct biological processes appear to be active in CIN-high versus CIN-low tumors. As expected, CIN-high tumors involve processes associated with cell cycle, mitosis, and chromosome segregation. In contrast, the top processes associated with CIN-low tumors involve developmental signatures, specifically those related to vascular and urogenital system development, as well as muscle contraction.
Additional analysis of previously described CIN genes and drivers was also performed. During chromosome segregation, sister chromatids are separated by a kinetochore-mediated attachment to spindle microtubules [45]. The microtubules are nucleated from centromeres, which require the highly evolutionarily conserved OIP5/MIS18β for proper assembly [46]. Disruption of kinetochore and centrosome dynamics is a component of neoplastic transformation, and, similar to aneuploidy, centrosome amplification is another hallmark of cancer [47]. Although there was no clear evidence of specific functional group dysregulation among nine mechanistic subgroups of CIN70 genes that we annotated (chromosomal separation, condensation, cyclins/cell cycle, DNA damage repair (DDR), kinetochores, spindle-related, spindle-related/centrosomes, and spindle-related/cyclins), we did find interesting expression differences that connect CIN and metastatic progression. KIF20A is one of the leading-edge genes found on GSEA of M1-poly versus M0-NM DEGs (Fig. 4A) and is homologous to KIF2B, a protein that directly promotes tumor metastasis in cell line models of CIN [15]. Both KIF20A and KIF2B are significantly overexpressed in PC-CIN-high cases relative to PC-CIN-low cases (Supplemental Fig. 8A).
A recent analysis of highly aneuploid breast cancers in TCGA found overexpression of three transcriptional regulators, E2F1, MYBL2, and FOXM1 [13]. Overexpression of these genes in non-transformed Xenopus embryos was sufficient to significantly increase the rate of chromosomal missegregation and initiate aneuploidy. Evaluation of expression of these transcription factors in CIN70-high versus CIN70-low PNBX demonstrated significantly elevated expression in CIN70-high cases (Supplemental Fig. 8B). In addition, 6/7 PC-CIN genes (CEP55, UBE2C, MELK, TPX2, PTTG1, and CDCA3) were also found to be among the top DEGs identified through comparison of high-aneuploidy versus low-aneuploidy breast tumors in TCGA. These results suggest that the same drivers and effectors are involved across tumor types.
Staging and prognostic value of PC-CIN in independent cohorts. Next, we questioned whether the PC-CIN signature genes were associated with disease progression in presumed localized (M0) cases. PC-CIN score separated high- and low-risk BCR groups from two independent PC cohorts (Fig. 6A) [22, 48]. We also tested whether PC-CIN separated cases by M-stage equally well in subcohorts of African-American (AA) and European-American (EA) PNBX (n = 121). Clinical characteristics of patients included in this expanded PNBX cohort are shown in Supplemental Table 2. PC-CIN was associated with metastatic progression in cases stratified by race, with AUC of 0.78 for AA men and 0.80 for EA men (Fig. 6B). Both AA and EA men displayed significantly higher PC-CIN in M1 PNBX compared to M0 PNBX (P = 1.06e-04 and 3.11e-04, respectively, Fig. 6C). In both racial groups, univariate analyses demonstrated that PC-CIN-high cases were significantly more likely to be classified as clinical stage M1, progress to CRPC, and die from PC (Supplemental Table 3). In order to evaluate the impact of PC-CIN score in the context of other clinicopathological variables, multivariate logistic regression analysis was performed. After controlling for age, race, PSA and Gleason sum, PC-CIN-high was significantly associated with higher odds of M1 stage, CRPC, PC-death and all-cause mortality (OR 10.84, 16.13, 6.26 and 6.00, respectively; all p < 0.001, Table 1).
Table 1
Multivariate logistic regression analysis of variables associated with PC progression
Variable | CRPC | PC-Death | All-Cause Mortality |
PC-CIN High Versus Low | 16.13 (3.23, 80.55) | 6.26 (2.44, 16.04) | 6.00 (2.45, 14.71) |
Age at Diagnosis | 0.93 (0.86, 1.01) | 1.00 (0.95, 1.05) | 1.01 (0.97, 1.06) |
Race (AA v. White) | 0.99 (0.22, 4.96) | 0.31 (0.11, 0.86) | 0.43 (0.16, 1.11) |
PSA (< 20 v. > 20) | 1.04 (0.22, 4.96) | 8.23 (3.06, 22.15) | 5.71 (2.29, 14.25) |
Gleason (> 8 v. < 8) | 2.29 (0.52, 10.03) | 0.62 (0.23, 1.64) | 0.87 (0.35, 2.18) |
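As a reading aid for Table 1, the sketch below checks that each PC-CIN odds ratio sits at the geometric midpoint of its interval, as expected when the interval is built symmetrically on the log-odds scale. It assumes the values in parentheses are 95% Wald-type confidence intervals (z = 1.96), which the table does not state explicitly; the function name is hypothetical.

```python
import math

def log_scale_summary(ci_low, ci_high, z=1.96):
    """Recover the point estimate and log-odds standard error from a CI.

    Assumes a Wald-type interval: exp(beta +/- z * se). The z value of 1.96
    (a 95% interval) is an assumption, not something stated in the table.
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    midpoint_or = math.exp((log_low + log_high) / 2)
    standard_error = (log_high - log_low) / (2 * z)
    return midpoint_or, standard_error

# PC-CIN High Versus Low row of Table 1: (OR, lower bound, upper bound).
table1_pc_cin = {
    "CRPC": (16.13, 3.23, 80.55),
    "PC-Death": (6.26, 2.44, 16.04),
    "All-Cause Mortality": (6.00, 2.45, 14.71),
}

recovered = {
    outcome: log_scale_summary(low, high)
    for outcome, (_, low, high) in table1_pc_cin.items()
}
```

The wide CRPC interval corresponds to a larger standard error on the log-odds scale than the other two outcomes, which is why its odds ratio is estimated less precisely despite being the largest.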