LTED Cells are the Cell-Based Model of Endocrine resistance Closest to Patients Data among Breast Cancer Cell Models
First, we identified the best cell-based model of endocrine resistance based on their similarities to patients’ tumor data. For this purpose, we compared the gene expression patterns from cell models of endocrine resistance before and after establishment, utilizing datasets GSE20361 and GSE111151, and gene expression patterns of resistant and sensitive tumors from patients using the GSE87411 dataset (16). The GSE20361 dataset is characterized by acquiring endocrine resistance. By culturing MCF7 cells in the estrogen-deprived medium for a long period, cells developed resistance to endocrine therapies 15. On the other hand, GSE111151 contains multiple models of tamoxifen resistance developed using several breast cancer cell lines (MCF7, T-47D, ZR-75-1, and BT-474) and the gene expression data is from before and after developing the resistance 17. We used the whole gene expression patterns of these datasets to generate a heatmap and utilized hierarchical clustering to find the distance or similarity between each of these datasets. Amongst the cell line models, we found that the gene expression patterns of the long-term estrogen-deprived (LTED) cells are the closest to the patient data, visualized by the shortest distance between these two datasets in the column’s dendrogram of the heatmap (Figure 1). ZR-75-1 tamoxifen-resistant cell lines are next based on their closeness to patient data, followed by BT-474, MCF7, and T-47D tamoxifen-resistant cell lines. Therefore, we decided to use LTED cells as the most suitable cell line model for studying endocrine resistance in breast cancer.
Estrogen Deprived MCF7 Cells Show Four Dominant Differential Expression Patterns with Unique Cell Functions
To better understand the process of acquiring endocrine resistance and its underlying gene expression patterns, we used the GSE20361 dataset 15. In this work, researchers collected the RNA samples from the MCF7 cells with estrogen depletion after 0, 3, 15, 30, 90, 120, 150, and 180d. We applied our recently developed statistical pipeline to the dataset to find dynamically regulated genes employed in the process of endocrine-resistance development and progression 18. The pipeline provides three main functions: First, statistical hypothesis testing determines a set of dynamic response genes (DRGs) that exhibit significant changes over time. Next, these DRGs are clustered into gene response modules (GRMs), subsets of DRGs with similar time-course expression patterns. Finally, the GRMs associations and regulatory effect are analyzed through a gene regulatory network built using ordinary differential equations.
Among 54 675 probes in the GSE 20361 microarray data, 14 693 genes were characterized as DRGs. The 14 693 DRGs were further ranked based on F-ratio, and the top 3 000 DRGs were selected for correlation-based iterative hierarchical clustering 18,19. The top 3 000 DRGs were clustered into 20 distinct GRMs with behavioral profiles corresponding to high degrees of correlation. Response module sizes ranged from 1 to 1 095 genes. Time-resolved expression of the top 3 000 DRGs and the module behavior for the 10 largest modules in the top 3 000 DRGs show a significant difference from one module to the other which emphasizes the resolution of our clustering method (Figure 2a, b)
Exploring the modules showed that the top four GRMs account for 2 748 (92% of the top 3 000) DRGs (Figure 2c). The 3 000 DRGs comprised 2 305 unique genes. The first GRM was the largest model, containing 1 095 genes characterized by a gradual downregulation in expression from day 0 to 30, at which point the pattern shifts, revealing sharp upregulation from day 30 to 150. The second module, comprised of 812 DRGs, showed nearly opposite behavior, demonstrating gradual upregulation until day 30, after which sharp downregulation was observed. Genes from individual GRMs were submitted to the Metascape gene annotation tool for gene enrichment analysis. The first module was significantly enriched for cell division, DNA replication, DNA-dependent DNA replication, and cell cycle pathways and ontologies. The interaction between these different gene ontologies is also visualized in a tree cluster in Figure 2e. The second GRM gene association included membrane trafficking, cellular protein catabolic processes, and macro-autophagy, while the third GRM enriched mostly for positive regulation of protein autophosphorylation and regulation of cellular ketone metabolic processes. The fourth GRM is mostly related to cell junction organization (Figure 2d, e). These results indicate that we identified the genes significantly changed during the development of endocrine resistance and found that each GRM encodes a unique cell function.
Dynamic Gene Expression Analysis of LTED Data Overlaps with Endocrine-Resistant Patient Data and also with Genes Regulated in TNBC
To further narrow down our search for identifying the candidate genes with active roles associated with endocrine resistance, we also analyzed GSE87411, an endocrine-resistant/sensitive patient dataset; the only patient dataset publicly available to the best of our knowledge 16. A group of 109 breast cancer patients was treated with aromatase inhibitors for 2-4 weeks and their response to treatment was evaluated 16. Following the evaluation of patient response to treatment, they were categorized into endocrine-resistant and sensitive patients. We used the gene expression profile of these patients to find the genes significantly regulated in endocrine-resistant breast cancer tumors as compared to endocrine-sensitive breast cancer tumors. We compiled patient data, analyzed the microarrays as two groups, and compared the results to find the significantly differentially expressed genes 20. Among these genes, 984 genes were significantly upregulated in endocrine-resistant patients while 621 genes were significantly downregulated. We then compared these significantly regulated genes in the patient dataset to the DRGs found from our dynamic gene expression analysis and selected those with highly correlated expression patterns between the two datasets for further analysis, which returned 319 genes (Figure 3a).
TNBC accounts for around 15% of breast cancer cases, characterized as one of the most aggressive types of breast cancer with no established therapy options yet 2. Similar to endocrine-resistant breast cancer patients, TNBC patients are also resistant to conventional endocrine therapies due to its nature of the lack of hormone receptors. To examine whether these two groups share certain gene expression patterns or not, we explored the gene expression similarities between the development of endocrine resistance and TNBC. First, we identified the 1 497 genes significantly and differentially expressed in TNBC compared to luminal A breast cancer. This list was compared to the 2 305 DRGs including 319 commonly shared genes between the DRG pipeline analysis and endocrine-resistant patient data. Surprisingly, 97.8 % of genes significantly expressed in the TNBC vs. Luminal A gene set were included in the DRGs, and more than 80% of the genes commonly shared among the DRGs and endocrine-resistant patient data were also significant in the TNBC vs. Luminal A gene set. This common set between the three sets of analyses is comprised of 254 genes. We posit these genes are important to the development of endocrine resistance, estrogen deprivation, and TNBC (Figure 3a).
Gene ontology analysis on the shared 254 genes showed significant enrichment for cell cycle, cell division, and DNA repair pathways, all signatures of genes affected in cancer cells (Figure 3b). Genes related to DNA repair pathways were included in both TNBC and endocrine-resistant breast cancer, indicating the importance of the DNA repair system to account for mutations in the DNA in their cancers 21. Other important genes include ribonuclease H2 subunit A (RNASEH2A), a gene that is known to be upregulated in cancers. RNASEH2A is a mediator of the removal of lagging-strand Okazaki fragment RNA primers, thus it can be integral in the proliferation of both triple-negative and endocrine-resistant breast cancers 22. Moreover, most of the common genes are regulated by transcription factors such as E2F1, MYCN, and TFDP1, which are all important transcription factors in TNBC (figure 3c). This finding confirms the notion that TNBC tumors share significant similarities with endocrine-resistant breast cancer.
MCM family, RAD51, CAV1, and CCNE1 are Among the Top Candidate Genes Responsible for Endocrine-Resistant Development
To further identify the genes important for the endocrine resistance development and progression, we utilized several bioinformatic approaches designated to ranking and prioritizing genes. We used several tools to generate gene-gene and protein-protein interaction networks based on the candidate genes in each of the modules 23-26. These tools rank or prioritize genes according to the functional similarities of the genes, direct and indirect interactions amongst them, and their roles in the same pathway to cluster the genes or proteins together. For tools that require a training gene set, we gathered known genes for endocrine-resistance development from literature and submitted them as the training set. Using generated networks, we selected the genes with the highest number of neighbors in the network as the master regulators of genes, which will be our main candidates (Figure 4). These core genes may have the most essential roles in the development of endocrine resistance in LTED-MCF7 cells. Thereby, we were able to narrow the list of candidates to 34 genes from major four modules, mostly from modules 1 and 2 (n = 31) (Table S1).
Among the genes in module 1, we found PARP1 and E2F1, recently discovered important genes for endocrine resistance in breast cancer, indicating the validity of our approach. We also found minichromosome maintenance (MCM) family genes as important genes, namely MCM2 and MCM7, RAD51, and TCF3 (Figure 4), which have not been studied thoroughly 27-29.MCM family genes (MCM2-7) form an MCM complex protein that functions as a DNA replication licensing factor and plays a central role in eukaryotic DNA replication 30. Recent evidence suggests that blocking the expression of these genes can lead to the inhibition of the growth of tamoxifen-resistant cancer cells. This evidence perfectly matches the expression pattern of these genes in our dataset, as these genes are significantly upregulated during the endocrine-resistance process in breast cancer cells 31. RAD51 overexpression, a key protein of homologous recombination, is also linked to overall poor survival and endocrine resistance in breast cancer, although the exact underlying signaling pathways are not well understood yet 32. As for TCF3, while there is no study on its role in endocrine resistance, it is important for breast cancer differentiation, development, and prognosis 33. All the genes in module two were slightly upregulated by day 30 and significantly downregulated afterward (Figure 5). In line with our data, there are reports showing downregulation of CAV1 as an important step in breast cancer development and resistance to endocrine therapies 34, and further studies are warranted. Some of the other important candidate genes were selected from modules 3 and 4, including ATG3, CCNE1, and MFAP4.
Long-term Estrogen Deprived MCF7 Cells Gained Resistance at Day 90
To validate the bioinformatic findings, we established the LTED MCF7 cell model by culturing human breast cancer MCF7 cells in estrogen-depleted growth media for an extended period. This cell model recapitulated acquired resistance to aromatase inhibitors in postmenopausal women and the same experimental settings as the study on which we used for our gene expression analysis with an extended sample collection period 15. After the initial quiescence state of around 30 days when the cell's growth rate was minimal, they started to adapt with estrogen depletion and regain their growth ability while losing their responsiveness to 17b-estradiol (E2). At around day 90, LTED cells showed complete loss of response to E2, characterized by their loss of ability to respond to estrogen stimulation to increase their growth rate, while their basal growth rate reached similar levels as those of parental MCF7 cells cultured in estrogen-rich media (Figure 5a). We then continued to maintain the LTED cells for another 6 months. After a period of super-sensitivity to estrogen at low doses (10-11 to 10-13 M), which happened around day 150, they became completely unresponsive to estrogen treatments at all concentrations by day 250 (Figure 5b). These results are consistent with the previous reports and follow a similar timeline 15,35-38.
To validate the gene expression profiles of the candidate genes, we collected samples of LTED cells at different time points (from day 0 to day 135) and measured the expression of some of the candidate genes using qRT-PCR (Figure 5c). The expression profile of these genes closely mimicked their gene expression patterns from the microarray studies, which further confirms microarray results and our downstream bioinformatics analysis. Furthermore, as our qRT-PCR analysis contains much more data points compared to the previous study, the expression patterns of the genes are more detailed than the microarray results.