Generation of an LCM-ILC dataset
LCM was performed on 23 fresh frozen human ILC samples: 17 were Grade 2 (74%), five were Grade 1 and one was Grade 3. RNA was isolated from tumor epithelium (TE) and tumor stroma (TS) compartments (TS defined here as primarily CAFs and matrix proteins, with the majority of immune cells excluded). Gene expression data were generated for a total of 22 TE and 18 TS samples (Figure 1A; Supplementary Figure 1), including 17 matched TE and TS pairs, each pair from the same patient. Two-class paired rank product analysis (percent false positive (pfp) < 0.01) identified 1,082 genes consistently more highly expressed in the TS and 837 in the TE. These genes clustered the samples by compartment type (epithelium/stroma), showing successful microdissection of TE and TS compartments. Biomolecular pathway annotation revealed up-regulation of genes involved in extracellular matrix (ECM) remodeling, collagen degradation and integrin cell surface interactions in TS compared to TE, while genes related to cell cycle, DNA replication and methylation were up-regulated in TE compared to TS compartments (Figure 1B).
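As a rough illustration of the paired rank product statistic mentioned above, the following Python sketch ranks genes within each matched patient by their TS minus TE log2 difference and takes the geometric mean of those ranks across patients. The function name and the toy expression values are hypothetical, and the permutation step that would be needed to estimate pfp is omitted; this is a minimal sketch, not the implementation used in the study.

```python
import math

def paired_rank_product(te, ts):
    """Rank product for up-regulation in TS relative to TE.

    te, ts: dict mapping gene -> list of log2 expression values,
            one value per matched patient (same patient order in both).
    Returns dict gene -> rank product; smaller values mean the gene is
    more consistently ranked near the top (higher in TS) across patients.
    """
    genes = sorted(te)
    n_pairs = len(next(iter(te.values())))
    log_rank_sum = {g: 0.0 for g in genes}
    for k in range(n_pairs):
        # Within-patient log2 fold change (TS minus TE).
        diffs = {g: ts[g][k] - te[g][k] for g in genes}
        # Rank 1 = largest TS-over-TE difference; ties fall back to
        # alphabetical gene order because sorted() is stable.
        ordered = sorted(genes, key=lambda g: diffs[g], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_rank_sum[g] += math.log(rank)
    # Geometric mean of the ranks across all matched pairs.
    return {g: math.exp(log_rank_sum[g] / n_pairs) for g in genes}

# Toy data (hypothetical values): three genes, three matched patients.
te_toy = {"COL6A3": [5, 5, 5], "MKI67": [8, 8, 8], "ACTB": [10, 10, 10]}
ts_toy = {"COL6A3": [8, 7, 9], "MKI67": [5, 6, 5], "ACTB": [10, 10.1, 9.9]}
rp_toy = paired_rank_product(te_toy, ts_toy)
```

In practice, significance would be assessed by comparing each observed rank product against rank products obtained from permuted data, which yields the pfp estimate.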
Identification of TS-ILC enriched genes
An analysis pipeline was set up to identify genes up-regulated in the TS compared to TE in ILC, but not in IDC or normal breast (Figure 1C). First, the list of 1,082 genes differentially expressed in our TS LCM-ILC dataset was applied to previously reported LCM-IDC (GSE68744) [17] and LCM-normal (GSE4823) [14] datasets. This identified 261 genes increased in the TS compared to TE only in ILC. Pathway enrichment analysis (https://toppgene.cchmc.org/) showed that 45 of these genes were involved in significantly over-represented pathways (Benjamini–Hochberg adjusted p-value < 0.05), all related to the ECM (Figure 1D; Supplementary Figure 2). Network analysis revealed 30 interconnected genes, including matrix proteins (COL6A3, COL18A1, TNC, EFEMP2), proteoglycans (SPOCK2, PAPLN, HAPLN3, GPC6), proteinases and their regulators (MMP2, TIMP2, MMP14, CAPN3, ADAMTS18, SERPINH1, PAPPA) and integrin subunits (ITGA7, ITGA10). A number of growth factors, including those of the TGF-β superfamily (PGF, GDF11, HGF, TGFB3, BMP2), were also identified. The analysis highlighted physical interactions of MMP2 with TIMP2, MMP14, COL6A3, COL18A1 and CAPN3 gene products, all involved in ECM organization. In addition, BMP2 and PAPPA were the two main hubs of genetic interactions [18] (Figure 1D). The expression of these 45 stromal genes was examined in ILC and IDC ER+ samples from the METABRIC [19] and The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/) bulk mRNA datasets (Table 1). PAPPA, PRKCA, TGFB3, ITGA10, ITGA7, CLEC1A, CLEC10A and PAPLN were up-regulated in ILC compared to IDC in all three datasets analyzed (METABRIC, TCGA RNA-Seq, TCGA microarray), highlighting the importance of these genes in the stroma of lobular carcinoma (Supplementary Figure 3).
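The filtering and multiple-testing steps described above can be sketched in Python as follows. The exact exclusion criterion is not spelled out in the text, so this sketch assumes that a gene is dropped if it is also TS-up-regulated in either the IDC or the normal dataset. The function names, the placeholder gene names and the p-values are hypothetical and serve only to illustrate the set logic and the Benjamini–Hochberg adjustment.

```python
def ilc_specific_stromal_genes(ilc_ts_up, idc_ts_up, normal_ts_up):
    """Genes up in TS vs TE in ILC that are not TS-up in IDC or normal.

    Assumption: exclusion is by simple membership in the other lists.
    """
    return set(ilc_ts_up) - set(idc_ts_up) - set(normal_ts_up)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy gene lists (GENE_X and GENE_Y are placeholders, not real results).
ilc_up = ["PAPPA", "COL6A3", "GENE_X", "GENE_Y"]
idc_up = ["GENE_X"]
normal_up = ["GENE_Y"]
ilc_only = ilc_specific_stromal_genes(ilc_up, idc_up, normal_up)

# Toy pathway p-values (hypothetical) and the pathways kept at 0.05.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, p in enumerate(adj_p) if p < 0.05]
```

Using plain set difference keeps the sketch short; a real pipeline would also need to map probes to gene symbols consistently across array platforms before comparing lists.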
ILC-specific expression of primary cancer-associated fibroblasts
To enable functional studies, gene expression profiling of primary CAFs from ILC and IDC tumors was performed (Figure 2A and B; Supplementary Figure 4). Rank product analysis identified 1,027 genes differentially expressed between ILC- and IDC-derived CAFs (pfp < 0.05); approximately half (485) were up-regulated in ILC CAFs compared to IDC CAFs (Figure 2C) and were enriched for ECM-associated genes, along with genes involved in glycolysis, focal adhesions and members of the TGF-β signaling pathway (Figure 2D).
Of the 45 ILC-specific stromal genes identified, 28 were expressed in the CAF dataset (Table 1; Figure 2E). Functional network analysis (http://genemania.org/) showed that 24 of these 28 genes were in the same pathway or were linked by known genetic or physical interactions (Supplementary Figure 5). The majority (14/24) were significantly up-regulated in ILC compared to IDC (p < 0.05) in at least one of the published bulk datasets (Table 1). Clustering the 10 genes with a TS/TE fold change >2 in the LCM-ILC dataset clearly showed increased expression in the TS of ILC, but not in IDC or normal breast (Figure 2F).
PAPPA is predominantly expressed in the stroma of ILC
PAPPA showed the greatest fold change in expression between the stromal and epithelial compartments in ILC (log2(FC(TS/TE)) = 2.6) (Table 1), so it was selected for further analysis. PAPPA encodes PAPP-A, a secreted metalloproteinase that cleaves IGFBP-4, releasing bioactive IGF-1 [20]. PAPP-A activity can be inhibited by non-covalent or covalent complex formation with the endogenous inhibitors stanniocalcin-1 (STC1) or -2 (STC2), respectively [21, 22] (Figure 3A). To verify that PAPPA was expressed predominantly by CAFs, we analyzed PAPPA transcripts by RNAScope. Results confirmed higher levels of PAPPA expression in CAFs compared to tumor cells, although epithelial PAPPA transcripts were also seen in some tumors (Figure 3B). We then examined expression of PAPPA and functionally related genes in the LCM-ILC, -IDC and -normal datasets (Figure 3C). In ILC, both PAPPA and IGF1 were significantly more highly expressed in the stroma compared to the epithelium (p < 0.0001), while IGF1R was found predominantly in the tumor epithelium (p < 0.05), suggesting the presence of a potential paracrine activation loop. In IDC and normal breast tissue, IGF1 was also predominantly expressed in the stroma (p < 0.0001 and p < 0.01, respectively), whereas PAPPA was expressed in both stromal and epithelial compartments (Figure 3C). Interestingly, we found a clear positive correlation between PAPPA and IGF1 in ILC (r = 0.64, p < 0.0001), which was not observed in IDC or normal tissue (Figure 3D). As there are few cell lines that represent ILC, we first examined PAPPA across three integrated breast cancer cell line datasets [23]. PAPPA was low or undetectable in all luminal cell lines, including two reported ILC lines, SUM44-PE and MDA-MB-134VI (Figure 4A). qPCR confirmed that PAPPA was not expressed in SUM44-PE and MDA-MB-134VI cells or in the T47D and MCF-7 ER+ IDC lines.
Analysis of 11 primary patient-derived ILC CAFs, 5 IDC CAFs, and HCI 013, an ILC patient-derived xenograft, showed that PAPPA and IGF1 were expressed exclusively in CAFs (Figure 4B; Supplementary Figure 6). In contrast, IGF1R was mostly expressed by tumor cells (Figure 4B). We also separated tumor epithelial cells from CAFs in tumors derived from a mouse model of ILC driven by loss of Trp53 and Cdh1 (Supplementary Figure 7A) [24]. qPCR results showed that Pappa, Igf1 and Stc1 were only expressed in the CAFs, while Igf1r, Stc2 and Igfbp4 were expressed in both tumor cells and CAFs (Supplementary Figure 7B). Together, these data support a wider paracrine signaling role for PAPP-A in luminal tumors.
PAPP-A secreted by CAFs is active
We analyzed conditioned media (CM) and confirmed that PAPP-A was secreted by CAFs but not by the tumor cells (Figure 5A). PAPP-A must be active to cleave IGFBP-4 and liberate IGF-1. CM from the CAFs was able to cleave recombinant IGFBP-4, indicating that non-complexed, active PAPP-A was present in the media (Figure 5B). To confirm that the IGFBP-4 fragments generated by the CAF CM were a result of PAPP-A activity, the CM was treated with a PAPP-A inhibitory antibody (mAb 1/41) [25]. Pre-incubation with mAb 1/41 reduced levels of the cleaved IGFBP-4 fragment to those in the control lane, showing that the observed cleavage of IGFBP-4 was due to PAPP-A present in the CM (Figure 5C).
PAPPA expression is positively correlated with IGF1 and negatively with IGF1R and is elevated in CDH1−, claudin-low tumors
Investigation of PAPPA and related genes in large cohorts of breast cancers confirmed positive correlations between PAPPA and IGF1 and negative correlations with IGF1R and CDH1 (Figure 6A and 6B). An integrated compendium of 17 Affymetrix datasets representing 2,999 breast cancers [23] revealed that PAPPA was detectably expressed in 36% of tumors, the highest proportion of which were of the claudin-low subtype, which also had significantly higher levels of PAPPA expression (Figure 6A). Somewhat surprisingly, 7% of ILCs in the METABRIC dataset were classified as claudin-low, compared to only 4% of IDCs, and PAPPA expression was significantly higher in tumors where CDH1 was undetectable (Figure 6B).
Low PAPPA expression in ILC, but not IDC, is associated with poor outcomes
To examine whether PAPPA expression relates to breast cancer prognosis, we performed comprehensive survival analysis using the survivALL R package [16]. Low PAPPA expression was significantly associated with worse overall survival for a large proportion (807/1,903) of all possible (n − 1) cut-points for 1,904 breast cancers in the METABRIC cohort (p < 0.05). However, none of the cut-points for PAPPA expression were significantly associated with overall survival across the 1,098 breast cancers of the TCGA dataset (Supplementary Figure 8A). Restricting analysis to the 142 and 206 ILCs of the METABRIC and TCGA cohorts, respectively, identified a number of cut-points that were significantly associated with overall survival in both cohorts (Figure 6C; Supplementary Figure 8B). To compare this with IDCs from the two cohorts, analysis was limited to the same numbers (142 and 206) of ER+ IDC tumors with 10-fold random sampling, but on no occasion were any significant cut-points identified (Figure 6D; Supplementary Figure 8C). Taken together, these results suggest that reduced or absent PAPPA is an ILC-specific prognostic marker.
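The idea of testing every possible cut-point, rather than a single median split, can be sketched in Python as below. This is a simplified illustration under stated assumptions: patients are ordered by expression, each of the n − 1 splits divides them into a low and a high group, and the two groups are compared with a two-sample log-rank test. The log-rank test is a stand-in chosen for brevity and is not claimed to be what survivALL computes internally; the function names and the toy data are hypothetical.

```python
import math

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns the chi-square (1 df) p-value.

    events are 1 for an observed death and 0 for censoring.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Numbers at risk and numbers of events at time t in each group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def all_cutpoint_pvalues(expression, times, events):
    """Log-rank p-value for each of the n - 1 low/high expression splits."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    pvals = []
    for cut in range(1, len(order)):
        low, high = order[:cut], order[cut:]
        pvals.append(logrank_p(
            [times[i] for i in low], [events[i] for i in low],
            [times[i] for i in high], [events[i] for i in high],
        ))
    return pvals

# Toy cohort (hypothetical): low expressers die earlier.
expr_toy = [1, 2, 3, 4, 5, 6, 7, 8]
time_toy = [1, 2, 3, 4, 10, 11, 12, 13]
event_toy = [1] * 8
p_toy = all_cutpoint_pvalues(expr_toy, time_toy, event_toy)
n_significant = sum(p < 0.05 for p in p_toy)
```

Counting how many of the n − 1 splits fall below the significance threshold gives a figure analogous to the 807/1,903 reported for METABRIC, although unadjusted p-values across many correlated cut-points should be interpreted with caution.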