In the current study, we constructed a user-friendly nomogram with five routine clinicopathological predictors: histologic subtype, tumor grade, ER expression, PR expression, and Ki-67 index. Validation of the nomogram demonstrated good predictive ability in terms of discrimination and calibration. Subgroup analysis showed that the nomogram performed better in patients with tumors larger than 2 cm. Comparison with other published models showed consistent results, indicating the value of our nomogram for clinical practice.
The 21-gene RS testing can provide more precise prognostic and predictive information than classical clinicopathological parameters. However, this multigene assay is not available in some countries, and its high price prevents many patients from receiving the test. Only a quarter to a third of eligible patients in the United States have been reported to undergo this assay (7, 12), and in developing countries the applicability of this testing remains controversial (13). On the other hand, cost-effectiveness analyses demonstrated that the 21-gene RS testing was associated with lower cost only in clinically high-risk patients (14, 15). Thus, a surrogate for the 21-gene RS testing is needed for those who have no access to this assay, as well as to relieve the heavy financial burden on patients for whom the testing is not cost-efficient (16).
The NSABP B-20 trial retrospectively validated that patients with RS > 30 could have better distant recurrence-free survival if they received chemotherapy (17). In the current study, we set RS > 25 as the prediction target because this cutoff was used to define high-risk RS in the prospective TAILORx trial. We postulate that chemotherapy should be included in the treatment of patients with an RS of 26–30, since in the TAILORx trial those patients were assigned to chemotherapy and had better clinical outcomes than expected with endocrine monotherapy (18). Moreover, we included only patients > 50 years, because in patients 50 years of age or younger some chemotherapy benefit was found in those with an RS of 16–25 (6). To the best of our knowledge, no published model has taken age stratification into consideration. We further validated our nomogram in patients ≤ 50 years, and the AUC was 0.739, which differed significantly from that in the older cohort. The inapplicability of the current nomogram to the younger population may be due to biological differences between tumors in young and old breast cancer patients (19).
For the development of the nomogram, we used five variables: tumor grade, histologic subtype, ER expression, PR expression, and Ki-67 index. Ki-67 index was the most significant predictor of high-risk RS, which was consistent with the findings reported by Lee et al (20). Indeed, as a proliferation index, Ki-67 has been universally recognized and endorsed for discriminating Luminal A-like from Luminal B-like breast cancer (9, 21). However, further efforts are imperative to improve the poor interlaboratory reproducibility of this biomarker and to resolve the disagreement over cutoff selection (22). The relationships of RS with tumor grade and with PR status have been frequently reported, and these two parameters have been consistently incorporated into models predicting high-risk RS (23–26). Tumor grade is associated with the biologic aggressiveness of tumors and was also the only factor that showed a significant effect on prognosis beyond RS in the TAILORx trial. Both PR negativity and semi-quantitative measurements of PR, such as Allred scoring, were correlated with RS stratification (20, 27). Like Kim et al, we scored ER and PR by the percentage of positively stained cells, because quantitative estrogen and progesterone receptor expression has been validated to be associated with the risk of relapse (25, 28).
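A nomogram of this kind is a graphical form of a logistic regression model: each predictor contributes to a linear score, which is then mapped to a probability of high-risk RS. The following is a minimal sketch of that mapping only. The intercept, the coefficients, and the binary coding of the five predictors are hypothetical values chosen for illustration; they are not the fitted values from this study and must not be used for clinical estimation.

```python
import math

# HYPOTHETICAL intercept and coefficients, for illustration only.
# They are NOT the fitted values of the nomogram described in the text.
INTERCEPT = -3.0
COEFS = {
    "grade_3": 1.0,      # assumed coding: 1 if tumor grade III, else 0
    "lobular": -0.5,     # assumed coding: 1 if lobular histology, else 0
    "er_low": 0.8,       # assumed coding: 1 if low ER expression, else 0
    "pr_low": 1.2,       # assumed coding: 1 if low PR expression, else 0
    "ki67_high": 1.5,    # assumed coding: 1 if high Ki-67 index, else 0
}


def linear_predictor(patient):
    """Sum the intercept and each coefficient times the patient's coded value."""
    return INTERCEPT + sum(COEFS[name] * patient.get(name, 0) for name in COEFS)


def predicted_probability(patient):
    """Map the linear predictor to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(patient)))


if __name__ == "__main__":
    example = {"grade_3": 1, "pr_low": 1, "ki67_high": 1}  # hypothetical patient
    print(f"Predicted probability of RS > 25: {predicted_probability(example):.3f}")
```

In a fitted model, the coefficients would come from the multivariate regression, and the point scale of the nomogram would be a rescaling of the same linear predictor.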
Our nomogram had an AUC of 0.798, indicating a strong model with good discrimination. Subgroup analysis demonstrated that the model performed better in patients with large tumors. Tumor size was significantly associated with RS in the univariate analysis but did not reach statistical significance when entered into the model together with the other variables, although it was verified as a predictor in the model constructed by Orucevic et al (26). When patients were stratified by luminal subtype, the AUC values of the two cohorts were both lower than that in the overall population. A possible explanation is that the categorization of luminal subtype depends on PR expression and Ki-67 index, the two major predictors of high-risk RS. When patients are grouped by these two parameters, the variation in these predictors within each group is reduced, and the predictive value of the model may narrow accordingly.
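The AUC used here as the measure of discrimination can be read as the probability that a randomly chosen high-risk patient receives a higher predicted score than a randomly chosen patient who is not high-risk. The sketch below computes it by direct pairwise comparison, overall and within subgroups. The function names, scores, labels, and subgroup codes are hypothetical and serve only to illustrate the calculation; they are not data from this study.

```python
def auc(scores, labels):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


def auc_by_subgroup(scores, labels, groups):
    """Compute the AUC separately within each subgroup label."""
    result = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        result[group] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return result


if __name__ == "__main__":
    # Hypothetical predicted probabilities, outcomes (1 = RS > 25), and tumor size groups.
    scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.6, 0.5]
    labels = [1, 0, 1, 0, 1, 0, 0, 1]
    groups = ["le_2cm"] * 4 + ["gt_2cm"] * 4
    print("overall:", auc(scores, labels))
    print("by subgroup:", auc_by_subgroup(scores, labels, groups))
```

Because the AUC depends on how widely the scores separate the two outcome groups, restricting a cohort on variables that drive the score tends to reduce it, which is in line with the explanation given for the luminal subtype cohorts.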
To date, several models have used clinical parameters to estimate the RS with TAILORx cutoffs (20, 25, 26, 29, 30). Our nomogram has a discriminative ability similar to that of the models developed by Lee et al and Kim et al (20, 25). Kim et al used the random forest method to develop a model that allows for online implementation (25). We used four predictors identical to theirs but did not include HER2 status in the establishment of the nomogram. Although the 21-gene RS testing is only applicable in HR+/HER2- patients, it has been reported that using quantitative RT-PCR to discriminate HER2 status could further elucidate the benefit of chemotherapy. Hence, other measurements of HER2 status, such as fluorescence in situ hybridization, may contribute to a better model in further research (31). Orucevic et al built a nomogram using a large cohort from the National Cancer Data Base (NCDB) with a C-index of 0.81, whereas the AUC was only 0.695 when the nomogram was validated in our patients (26). Racial disparity may be one reasonable explanation. In addition, Ki-67 was not regularly recorded in the NCDB and thus was not incorporated into their model, whereas our study demonstrated the importance of this biomarker in predicting high-risk RS. Recently, Zhang et al developed a model using the Ki-67 index, PR expression, tumor grade, and tumor size with a predictive accuracy of 86.5%, which also had prognostic value (30). However, a direct comparison between the two models is difficult because of the decision tree method they used.
A strength of our study is that we confined the scope of application to patients > 50 years, for whom the efficacy of chemotherapy is undoubted. There were also several limitations. First, as this was a retrospective study, selection bias may make the results less convincing, although all patients who met the criteria for 21-gene RS testing consecutively received this multigene assay in our center. Second, the nomogram was constructed using single-center data and would benefit from validation in an external population. Last but not least, although we developed a strong model with good fit using multivariate logistic regression, other methods such as decision tree and random forest models may also work and warrant further consideration.