The cell populations of origin for specific BrC subtypes are generally unknown. Identifying these populations would allow us to determine the cellular pathways that contribute to subtype-specific carcinogenesis and, potentially, to screen patients for malignant mutations in these populations that predispose them to BrC. In this study, we compared normal breast cell populations with human BrC subtypes and with normal mouse mammary gland populations. We identified shared gene expression patterns and predicted two cell populations that may transform into Basal-like and HER2-overexpressing BrCs, respectively. Because Basal-like BrC is a subtype of TN BrC, we went on to examine TN BrC scRNAseq datasets [3]. From this analysis, we identified two potential CSC populations within a subset of TN BrCs. These CSCs highly expressed many of the SC genes.
The normal SC population and the corresponding CSC populations that we identified have known mammary gland stemness properties. Both the SC population and the CSCs express luminal, basal, epithelial, and mesenchymal markers. Mammary stem cells are known to express both luminal and basal markers and even vimentin, a mesenchymal marker [27, 37]. In addition, partial EMT, which is characterized by co-expression of epithelial and mesenchymal markers, identifies BrC cells with stem cell properties [38, 39].
The SC population we identified expressed twelve marker genes: BIRC5, CDK6, CENPF, CENPW, HIST1H4C, HMGB2, STMN1, TOP2A, TPX2, TYMS, UBE2C, and UBE2S. The scientific literature corroborates not only that many of these genes are stem cell related, but also that they are specifically associated with the basal-like BrC subtype. BIRC5, CENPF, FDCSP, HIST1H4C, HMGB2, STMN1, TYMS, UBE2C, and UBE2S are upregulated in BrCs, and BIRC5, CENPF, STMN1, and TYMS are upregulated specifically in basal-like BrCs [4, 40–53]. In several papers that did not analyze the basal-like subtype separately from other TN BrCs, the TN subtype showed high expression of SC marker genes. CENPW, TPX2, and UBE2C are overexpressed in TN BrCs [54–56]. UBE2C expression is upregulated in HER2-expressing and TN BrCs [47, 48].
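As an illustration of how expression of the twelve SC marker genes listed above could be summarized per cell, the following is a minimal Python sketch. It is not the scoring method used in this study: the function names, the simple mean-based score, and the toy expression values are all hypothetical, and in practice a dedicated scRNAseq toolkit with background-corrected scoring would likely be preferable.

```python
from statistics import mean

# The twelve SC marker genes named in the text.
SC_MARKERS = [
    "BIRC5", "CDK6", "CENPF", "CENPW", "HIST1H4C", "HMGB2",
    "STMN1", "TOP2A", "TPX2", "TYMS", "UBE2C", "UBE2S",
]


def signature_score(cell_expression, markers=SC_MARKERS):
    """Mean expression of the marker genes in one cell.

    `cell_expression` maps gene symbol -> normalized expression value.
    Genes absent from the mapping are treated as 0 (an assumption).
    """
    return mean(cell_expression.get(gene, 0.0) for gene in markers)


def fraction_detected(cell_expression, markers=SC_MARKERS, threshold=0.0):
    """Fraction of marker genes whose expression exceeds `threshold`."""
    detected = sum(
        1 for gene in markers if cell_expression.get(gene, 0.0) > threshold
    )
    return detected / len(markers)


# Hypothetical toy data, for illustration only (not values from the study).
toy_cells = {
    "cell_A": {"BIRC5": 2.0, "TOP2A": 3.0, "UBE2C": 1.0, "KRT18": 4.0},
    "cell_B": {"KRT18": 5.0, "ESR1": 2.0},
}

scores = {name: signature_score(expr) for name, expr in toy_cells.items()}
```

Cells with a high score and a high detected fraction would be candidates for membership in an SC-like population, subject to whatever thresholds and normalization the analysis adopts.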
Many of the SC marker genes are correlated with poor BrC patient survival in the literature. BrCs expressing BIRC5, CENPF, or UBE2C have reduced disease-free, metastasis-free, and overall survival [41, 48, 50, 57, 58]. HIST1H4C expression is associated with worse overall and metastasis-free survival in BrCs [44]. HMGB2, STMN1, and TPX2 expression is correlated with worse disease-free and overall survival [45, 52, 56, 59, 60]. TYMS-expressing BrCs have low overall survival [46].
Several SC marker genes have known stem cell-related properties in BrCs. STMN1 expression is associated with the CD44+/CD24- BrC stem cell phenotype [59]. TYMS maintains BrC spheroid formation efficiency and CD24- status [46]. TPX2 and UBE2C knockdown reduce colony formation efficiency in TN BrCs [55]. Lastly, UBE2S knockdown suppresses anchorage-independent growth in BrCs [49].
Interestingly, several BrC papers report co-expression of the SC marker genes. CENPF expression correlates with BIRC5 expression [4, 58]. TPX2 and UBE2C were highly expressed in the same TN BrC cell populations and cell lines [55]. Overall, the scientific literature corroborates the association of the SC markers with basal-like BrC.
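Co-expression of two marker genes, such as the reported CENPF and BIRC5 correlation, is commonly quantified with a correlation coefficient across cells or samples. The sketch below computes a Pearson correlation in plain Python. It is illustrative only: the cited studies may have used other statistics, and the expression values shown are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-sample expression values (not data from the cited papers).
cenpf = [0.0, 1.0, 2.0, 3.0]
birc5 = [0.5, 1.5, 2.5, 3.5]

r = pearson(cenpf, birc5)  # perfectly linear toy data, so r is close to 1
```

A coefficient near 1 indicates that the two genes rise and fall together across samples; sparse scRNAseq data may call for a rank-based alternative such as Spearman correlation.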
Most SC marker genes are specifically upregulated in basal-like or TN BrC subtypes, and we hypothesize that malignant SC cells give rise to basal-like BrCs. These marker genes were expressed almost exclusively in the SC population in the normal human breast epithelium; therefore, they could be promising targets for treatment of basal-like BrC with minimal damage to local normal tissue. For instance, the SC gene BIRC5 encodes an onco-fetal protein that is rarely expressed in adult tissues, and BIRC5 inhibitors have been shown to be effective against BrC in vitro [61, 62]. A CENPF inhibitor also shows promise against BrC in vitro [63].
Besides the SC population, SC marker genes, and potential TN CSC populations, we also identified an S100A7-, S100A8-, and S100A9-expressing normal mammary luminal progenitor cell population. We determined that these three genes are strongly associated with the HER2 BrC subtype and hypothesized that transformed cells from this population become HER2-overexpressing BrCs. Many scientific papers have shown that these three genes are associated with HER2 BrCs and are often correlated with poor BrC patient outcomes. S100A7 expression is negatively correlated with ESR1 and PGR in human BrCs and correlated with decreased disease-free and overall survival [64, 65]. High S100A8 expression is positively correlated with HER2 expression and negatively correlated with ESR1 and PGR expression [66–68]. Further, high S100A8 expression is associated with increased cancer relapse and lower overall and disease-free survival [66]. S100A9 has also been correlated with HER2 BrCs and poor overall survival [67–69]. Moreover, a Luminal A cell line treated with S100A8/A9 had a marked decrease in ESR1 expression, suggesting that S100A8/A9 may have a causal role in the HER2 BrC phenotype [67]. Together, there is strong evidence in the literature that S100A7, S100A8, and S100A9 are negative BrC prognostic markers and are associated with HER2 BrCs, as we suggest in this research.
Interestingly, S100A7, S100A8, and S100A9 are also associated with stem cell properties in BrC. Expression of these three genes is associated with effective mammosphere formation, and inhibition of these genes stunted mammosphere growth and xenograft tumor growth [70].
As with the SC marker genes, S100A7, S100A8, and S100A9 are promising targets for treatment. We found that these genes are expressed almost exclusively in one luminal progenitor cell population, suggesting that treatment targeting these genes in HER2 BrC would not cause significant off-target damage to most breast cells, which could contribute to better patient outcomes.